D4.1 Enriched Semantic Models of Emergency Events


www.comrades-project.eu


Project acronym: COMRADES
Project full title: Collective Platform for Community Resilience and Social Innovation during Crises
Grant agreement no.: 687847
Responsible: Grégoire Burel (OU)
Contributors:
Reviewer: Diana Maynard (USFD)
Document Reference: D4.1
Dissemination Level: <PU>
Version: 1.6
Date: 20/09/16

Disclaimer: This document reflects only the author's view and the Commission is not responsible for any use that may be made of the information it contains.

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 687847



History

Version   Date         Modification reason              Modified by
0.1       20-09-2016   Initial draft                    Grégoire Burel
0.2       07-10-2016   Requirements and Specification   Grégoire Burel
0.3       20-12-2016   Model Implementation             Grégoire Burel
0.4       22-12-2016   Comments Merging                 Grégoire Burel
0.5       26-12-2016   Evaluation and Full Draft        Grégoire Burel

Copyright 2016 Grégoire Burel ©


Table of contents

History
Table of contents
List of tables
List of figures
Executive summary
1 Introduction
  1.1 Objectives and Modelling Principles
  1.2 Design Approach and Methodology
    1.2.1 The NeOn Modelling Approach
    1.2.2 The Qualitative and Structural Design Methodology
    1.2.3 Ontology Evaluation
2 Structure of this document
Part I: Requirements Analysis and Model Specifications
3 Introduction
4 Requirements Information Sources
  4.1 COMRADES General Requirements
  4.2 Stakeholder Interviews
  4.3 Ushahidi Data Structures
  4.4 Crisis Related Datasets
5 The Ontology Requirement Specification Document (ORSD)
  5.1 COMRADES Aims and Model Purpose
  5.2 Intended Use and Users
  5.3 Competency Questions
    5.3.1 Work Package Requirements
    5.3.2 Interviews and Qualitative Requirements
    5.3.3 Ushahidi Data Structures
    5.3.4 Crisis Related Datasets
6 Term Glossary
7 Summary
Part II: COMRADES Ontology Model
8 Introduction
9 Model Principles
10 Ontology Components
  10.1 Classes and Relations
    10.1.1 Information Sources, Reports and Situations
    10.1.2 Collections, Categories and Topics
    10.1.3 Actors, Organizations and Accounts
    10.1.4 Tasks, Roles and Permissions
  10.2 Properties
11 Integration with Existing Ontologies
  11.1 Crisis Related Ontologies
  11.2 Other Ontologies
12 Multilingual Support
13 Domain Knowledge
14 Summary
Part III: Model Evaluation
15 Introduction
16 Ontology Evaluation
  16.1 Competency Questions Mapping
  16.2 Results
17 Conclusions
Appendix
18 References

List of tables

Table 1 Ushahidi data structures
Table 2 Crisis related datasets
Table 3 Data Structure of Crisis related datasets
Table 4 Top Terms Extracted from the Competency Question, Crisis Related Dataset and the Ushahidi Data Structures
Table 5 Properties of the COMRADES Ontology
Table 6 Competency questions ontology mappings and evaluation

List of figures

Figure 1: COMRADES Communities
Figure 2: Relationships of resilience capabilities (Comes, Unpublished Manuscript)
Figure 3: The relationship among enactment, organizing and sensemaking (Jennings, Greenwood 2003)
Figure 4: Institutional Framework for Disaster Management in India (R.K Dave 2012)
Figure 5: Stills from the World Disaster Report 2013 promotional Video
Figure 6: The COBACORE Concept and challenges
Figure 7: emBRACE Framework
Figure 8: COMRADES evaluation and communities’ interaction


Executive summary

COMRADES (Collective Platform for Community Resilience and Social Innovation during Crises, www.comrades-project.eu) aims to empower communities with intelligent socio-technical solutions to help them reconnect, respond to, and recover from crisis situations. This deliverable analyses the COMRADES requirements from different project perspectives in order to design and implement a common semantic model that represents micro emergency events and related metadata. In particular, we analyse: 1) the data structures used by the Ushahidi platform, since it is the underlying platform of the COMRADES system; 2) the requirements of the tools that need to be integrated into the COMRADES platform; 3) stakeholder interviews; and 4) the structure of crisis-related datasets.

Based on the NeOn methodology [1] and a qualitative and structural design approach [2], we created an Ontology Requirement Specification Document (ORSD) [3] that highlights the needs and specifies the competency questions that the model must address in order to comply with the COMRADES model requirements. Following the development of the ORSD, we implemented the COMRADES model as an ontology using RDF/OWL. To allow the ontology to be used in multilingual scenarios, we translated the class, property and relation names into different languages. Finally, to improve the interoperability of the model with existing ontological models, we aligned some parts of the COMRADES ontology with well-known ontologies such as SIOC and FOAF. Although we cannot completely evaluate the ontological model, since some data is not yet available (i.e. the COMRADES platform is not yet fully developed), we show that the model can successfully represent 102 different competency questions.


1 Introduction

The representation of crisis events and micro-events is a key aspect of COMRADES, a European project that aims to create an open-source community resilience platform to support the management of emergency crises. The model needs to integrate easily with the Ushahidi¹ platform, which will serve as the backbone of the resilience platform. The focus of this deliverable is to provide a common semantic model that can be used across all aspects of the platform. In particular, the proposed model should allow user reports and related information to be collected and organised. In the rest of this document, we refer to this model as the COMRADES model.

To achieve this aim, we develop an ontological model based on standard semantic web technologies (RDF/OWL). The model is designed by studying the functionalities needed for the COMRADES crisis platform, existing platforms (in particular the Ushahidi platform), existing datasets and stakeholder interviews. Our development approach follows a qualitative and structural design methodology [2], in which requirements and modelling needs are extracted from stakeholder interviews, existing platform data structures (e.g. Ushahidi) and datasets (e.g. Twitter² and ACLED³).

Using datasets as input while developing the COMRADES model is motivated by the need to represent a large variety of input sources in the model. This differs from other ontology development methods, in which existing models are first reviewed by matching them to existing datasets and then extended. The COMRADES development approach first identifies requirements from the datasets and other sources, then creates an ontology, and finally aligns it with existing ontologies where possible. This approach integrates better with requirements that are not captured in existing data, such as those obtained from interviews, and yields a simpler ontology model. In addition to this methodology, the development of the COMRADES model partially follows the NeOn ontology guidelines [1], which outline an approach for specifying and implementing ontological models.

1.1 Objectives and Modelling Principles

A difference between COMRADES and existing emergency platforms is the need to model and analyse small-scale events within larger crises, and the importance of coordinating actions between parties. For instance, during an emergency, an individual may need transportation while another party may be willing to provide it. In this context, it is necessary to coordinate the needed services with the available resources. Similarly, it is also important to distinguish relevant and trustworthy information. As with action coordination, this remains mostly a manual task.

¹ Ushahidi, https://www.ushahidi.com/.
² Twitter, https://twitter.com/.
³ ACLED, http://www.acleddata.com/.

In order to develop the COMRADES platform, a model that allows for such analysis is required. Although a few models have been designed in the past, most of them focus on particular aspects of crises and tend to be overly complex. The aim of the proposed model is to provide a flexible and minimal model that addresses the representation of events and the associated evidence and resources, and that provides a means for coordinating actions automatically.

The COMRADES model is directly linked to the other WP4 tasks as well as to the needs of the other work packages. In particular, it needs to allow event and micro-event modelling (T4.2) and action coordination (T4.3), as well as multilingual processing (T3.1), content informativeness representation (T3.2) and content validity assessment (T3.3).

Although there are different technologies for representing ontologies, we use RDF/OWL because the COMRADES project needs to deal with data obtained from social media and online data sources, and RDF/OWL is a semantic technology particularly suited to such a setting. Moreover, since the COMRADES platform will be web based, this choice eases the integration of the model, as web frameworks can manipulate RDF/OWL data easily.

Even though different methods exist for building ontologies and data models, we rely on two approaches for building the COMRADES model. First, we partially follow the NeOn methodology [1], a comprehensive approach for specifying, developing and evaluating ontologies. Second, we apply the qualitative and structural design approach [2] to include non-ontological resources and user studies during the specification phase of the COMRADES model.
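To make the RDF/OWL choice discussed above more concrete, the following minimal sketch shows how a single user report might be written as RDF triples in N-Triples syntax using plain Python. The namespace `http://example.org/comrades#`, the class name `Report` and the helper functions are assumptions made for illustration only; they are not the identifiers of the actual COMRADES ontology.

```python
from typing import Dict, List, Optional

# Hypothetical namespace and class name, used only for illustration.
EX = "http://example.org/comrades#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def iri(value: str) -> str:
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text: str, lang: Optional[str] = None) -> str:
    """Serialise a string literal, optionally with a language tag."""
    escaped = (
        text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def report_triples(report_id: str, labels: Dict[str, str]) -> List[str]:
    """Return one type triple plus one language-tagged label per language."""
    subject = iri(EX + report_id)
    lines = [f"{subject} {iri(RDF_TYPE)} {iri(EX + 'Report')} ."]
    for lang, text in sorted(labels.items()):
        lines.append(f"{subject} {iri(RDFS_LABEL)} {literal(text, lang)} .")
    return lines


triples = report_triples(
    "report-1", {"en": "Road blocked", "fr": "Route bloquée"}
)
print("\n".join(triples))
```

Because each triple is an independent line of text, data of this kind can be produced or consumed by ordinary web code without a dedicated toolkit, which is one way to read the integration argument made above. A real deployment would more likely rely on an RDF library.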
Although the COMRADES model may be applicable to crises and scenarios not covered by the COMRADES project, its goal is limited to fulfilling the project requirements, so that it remains a relatively simple model that can be easily reused within the project. The COMRADES model therefore follows the project requirements rather than attempting to serve all existing and future crisis platforms. Nevertheless, we aim to provide a model that can be extended easily, so that it may be integrated into additional scenarios in the future.

1.2 Design Approach and Methodology

The development of the COMRADES model needs to follow a methodology in order to ensure that the model properly captures the design requirements.

1.2.1 The NeOn Modelling Approach

The NeOn methodology [1] is an approach for developing ontologies that identifies nine scenarios (i.e. design steps) that may arise when creating a new ontological model (Figure 1). When creating a new ontology, it is necessary to identify which scenarios apply to the particular development, and to create an Ontology Requirement Specification Document (ORSD) [3].

Several other methods exist for designing ontological models, such as Methontology [4], On-To-Knowledge [5] and DILIGENT [6]. However, we decided to focus on the NeOn approach since it supports the integration of existing models and the reuse of non-ontological models.

Figure 1 Scenarios for Building Ontology Networks (Image source [7])

As displayed in Figure 1, the NeOn methodology is divided into the following 9 scenarios:

− Scenario 1: Specification to Implementation.
− Scenario 2: Reusing and re-engineering non-ontological resources (NORs).
− Scenario 3: Reusing ontological resources.
− Scenario 4: Reusing and re-engineering ontological resources.
− Scenario 5: Reusing and merging ontological resources.
− Scenario 6: Reusing, merging and re-engineering ontological resources.
− Scenario 7: Reusing ontology design patterns.
− Scenario 8: Restructuring ontological resources.
− Scenario 9: Localizing ontological resources.

To create the COMRADES model, we follow scenarios 1 and 9. To some extent, we also reuse some commonly used ontologies, as outlined in other scenarios. However, we do not strictly follow the ontology reuse scenarios, as reuse is not the focus of the COMRADES data model.


The main scenario for developing the COMRADES model is Scenario 1, as the COMRADES ontological model needs to be developed from scratch. An important task in this scenario is the creation of an Ontology Requirement Specification Document (ORSD) [3] that describes the purpose, scope, implementation language, target group and intended uses of the specified model. In particular, this document needs to express the requirements as a set of Competency Questions (CQs). To create the ontology requirement specification, we need to collect knowledge from different sources and design the competency questions.

The COMRADES model needs to integrate with the existing Ushahidi platform and to map existing non-ontological resources (i.e. existing datasets). Although Scenario 2 is in principle designed to help with this task by proposing a reuse process for non-ontological resources, it focuses on glossaries, dictionaries, lexicons, classification schemes, taxonomies and thesauri. These resources are very different from the ones we integrate when designing the COMRADES model: the data structures of the Ushahidi platform and third-party datasets, which are more complex models than dictionaries and thesauri. As a result, Scenario 2 is not suitable for our task.

The COMRADES model also needs to be usable in different languages. Consequently, we use Scenario 9, which mostly consists of translating ontological labels and descriptions into multiple languages.

Although Scenarios 3, 4, 5, 6, 7 and 8 could also be applied to the design of the COMRADES model, the main focus is to provide modelling support for the different project tasks, the integrated datasets and the Ushahidi platform⁴. Existing crisis ontologies are not completely relevant for COMRADES, since they either focus on very specific use cases or are not designed to integrate with a large variety of data sources. As a consequence, the COMRADES modelling task does not concentrate on these scenarios. Nevertheless, where possible, we map some of the key COMRADES concepts to existing ontologies that are not necessarily designed for representing crises (e.g. SIOC, FOAF, DC Terms).

1.2.2 The Qualitative and Structural Design Methodology

One of the main shortcomings of the NeOn methodology is that it does not consider existing data structures and datasets as part of the development process. Even though Scenario 2 proposes the integration of non-ontological resources, its focus is not on existing data structures but on non-practical knowledge (e.g. thesauri, dictionaries). Therefore, the NeOn approach is mostly suitable when: 1) the new ontology needs to represent or integrate completely new datasets; or 2) the new ontology needs to integrate with existing ontologies.

⁴ As discussed previously, the Ushahidi platform will be the backbone of the COMRADES resilience platform. As a result, the model needs to support all the data structures used in the Ushahidi software.


Although integrating existing ontologies can be seen as a type of structural analysis, since it examines how existing ontological models can be used for a particular modelling task, it does not cover data formats that are not already formally represented. In this context, we propose to integrate elements from the qualitative and structural design methodology [2], in which the design of a model is derived from qualitative studies (e.g. interviews, surveys) and from the structural analysis of datasets and of existing software platforms or processes (e.g. thread structure of the data, user social interactions).

To use this methodology, we need to: 1) obtain the requirements and needs perceived by stakeholders using interviews or surveys (qualitative phase); and 2) collect the data that needs to be represented (e.g. Ushahidi data format, Twitter posts) (structural phase). After these phases, the interviews are used to obtain functional requirements and to identify the features that are necessary for designing the new model. Similarly, the data structures are analysed to create a common representation that feeds into the competency questions and the model implementation.

1.2.3 Ontology Evaluation

Although the NeOn methodology proposes an approach for developing a requirements document, it does not offer a clear process for evaluating whether the developed model fulfils the ontology requirements. As part of the specification of the COMRADES model, we create competency questions that identify what types of query should be satisfied by the proposed model. A complete evaluation of the COMRADES model requires the deployment of the model as part of the COMRADES resilience platform and the integration of the inputs and outputs of the different work packages. Since the project has not yet produced any data, we focus on a theoretical evaluation.

We evaluate the COMRADES model by mapping competency questions to the classes, properties and relations of the developed ontological model, and by verifying that a query can connect the ontological resources associated with a given competency question. The evaluation is therefore a three-step process: 1) we map the classes, relations and properties of the COMRADES model to the competency questions; 2) we determine whether there is a path in the COMRADES ontology that connects the classes, relations and properties extracted from each competency question; 3) if every competency question can be mapped and connected successfully, we conclude that the ontology successfully represents the competency questions.
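The three steps above can be sketched as a small graph check. The sketch below is only an illustration under simplifying assumptions: the ontology is reduced to an undirected graph whose nodes are classes and whose edges are properties, each competency question is reduced to a list of class names, and all class, property and question names are hypothetical rather than taken from the COMRADES ontology.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(
    relations: Iterable[Tuple[str, str, str]]
) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from (class, property, class) triples."""
    graph: Dict[str, Set[str]] = {}
    for subject, _prop, obj in relations:
        graph.setdefault(subject, set()).add(obj)
        graph.setdefault(obj, set()).add(subject)
    return graph


def reachable(graph: Dict[str, Set[str]], start: str) -> Set[str]:
    """Breadth-first search returning every class connected to `start`."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def evaluate_question(graph: Dict[str, Set[str]], terms: List[str]) -> bool:
    # Step 1: every term of the question must map to the ontology.
    if not terms or any(term not in graph for term in terms):
        return False
    # Step 2: a path must connect all mapped terms.
    component = reachable(graph, terms[0])
    return all(term in component for term in terms)


# Hypothetical ontology fragment and competency questions.
relations = [
    ("Report", "hasSource", "InformationSource"),
    ("Report", "hasLocation", "Location"),
    ("Task", "assignedTo", "Actor"),
]
graph = build_graph(relations)
questions = {
    "Where was a report made and by which source?": [
        "Report", "Location", "InformationSource"
    ],
    "Which actor handles a report?": ["Report", "Actor"],
    "What is the weather?": ["Weather"],
}
results = {q: evaluate_question(graph, t) for q, t in questions.items()}
# Step 3: the ontology represents the questions only if all of them succeed.
all_represented = all(results.values())
print(results, all_represented)
```

In this toy fragment only the first question succeeds: the second fails at step 2 because no property links `Report` to `Task` or `Actor`, and the third fails at step 1 because `Weather` has no counterpart in the fragment. Treating edges as undirected is a design choice of the sketch, based on the observation that a query can traverse a property in either direction.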


2 Structure of this document

This document is divided into three parts:

Part I: Requirements Analysis and Model Specifications
In the first part of this document, we define the requirements of the COMRADES model by specifying an Ontology Requirement Specification Document (ORSD) [3]. The approach follows the aforementioned NeOn methodology and the qualitative and structural design approach by analysing stakeholder interviews, the Ushahidi platform data structures, and the crisis-related datasets that are analysed by the project.

Part II: COMRADES Model
After creating the ORSD, we introduce the COMRADES model. The model is based on the modelling and requirement principles highlighted in the introduction and is implemented as an ontological model. Where possible, the model is aligned with existing ontologies so that it is interoperable with existing technologies. We also provide multilingual labels for the model classes, properties and relations, so that international communities can use the model more easily.

Part III: Model Evaluation
The model cannot be completely validated until later in the project, as it needs to be integrated with the COMRADES resilience platform and with the inputs and outputs of the tools developed in the project. It can, however, be evaluated against the competency questions defined during the requirement analysis phase. In this part, we match the competency questions to the model in order to confirm whether the model satisfies the existing requirements.


Part I: Requirements Analysis and Model Specifications

Platform Requirement Analysis and Specifications Document

3 Introduction

According to the first scenario of the NeOn methodology, the first step in creating a new ontological model from scratch is to write an Ontology Requirement Specification Document (ORSD) [1]. To do so, we need to collect different types of information. As outlined in Figure 2, the ORSD is divided into seven parts. To fill in each of these parts, we use stakeholder interviews and the COMRADES project description (i.e. work package needs and project aims), and we analyse the structure of the Ushahidi platform and of different crisis-related datasets.

Figure 2 Template for creating an Ontology Requirement Specification Document (ORSD) (Image source [1])


We mostly follow the structure outlined in Figure 2. First, we identify the purpose of the model, its scope and its level of formality. Then, we use both the COMRADES project description and the interviews to frame the intended use and users of the model. Finally, we create competency questions and determine the requirements of the COMRADES model using the qualitative and structural design methodology discussed in the introduction.

4 Requirements Information Sources

As part of the ORSD design, we analyse four different types of data: 1) the COMRADES project development requirements (i.e. the tools that are being developed in the various work packages); 2) stakeholder interviews; 3) the Ushahidi platform data structures; and 4) the structure of the crisis datasets. In this section, we present the data sources that are investigated for designing the COMRADES model and identify the model requirements that can be derived from each of them.

4.1 COMRADES General Requirements

Some requirements for the COMRADES model are clearly outlined by the tasks of each work package (WP). In particular, the work on content informativeness and validity assessment (WP3) and on emergency event detection, modelling and matchmaking (WP4) defines the following five tasks:

− Multilingual Content Processing (Task 3.1)
− Content Informativeness Classification (Task 3.2)
− Content and Source Validity Assessment (Task 3.3)
− Emergency Event Identification and Clustering (Task 4.2)
− Semantic Matchmaking of Emergency Events (Task 4.3)
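As a rough illustration of what these five tasks imply for the model, the sketch below shows a record that could accumulate the output of each task for a single message. The class name, field names and value types (for instance a validity score between 0 and 1) are assumptions chosen for this example; the tasks themselves do not prescribe them.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AnnotatedMessage:
    """Hypothetical record of what the five tasks could attach to a message."""

    text: str
    language: Optional[str] = None      # Task 3.1: detected language
    informative: Optional[bool] = None  # Task 3.2: informativeness class
    validity: Optional[float] = None    # Task 3.3: assumed score in [0, 1]
    event_id: Optional[str] = None      # Task 4.2: identified event or cluster
    matched_event_ids: List[str] = field(default_factory=list)  # Task 4.3

    def pending_tasks(self) -> List[str]:
        """List the tasks whose output has not been attached yet."""
        checks = [
            ("T3.1", self.language),
            ("T3.2", self.informative),
            ("T3.3", self.validity),
            ("T4.2", self.event_id),
        ]
        return [task for task, value in checks if value is None]


message = AnnotatedMessage(text="Bridge closed near the station", language="en")
pending = message.pending_tasks()
print(pending)
```

Keeping every annotation optional reflects the fact that the tasks are separate processing steps, so a message may be only partially annotated at any given time.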

Looking at each task, we can observe that for T3.1 the model needs to be able to represent different types of social media data and to attach different pieces of information to them, such as language, topics and named entities. For T3.2, we need to represent the informativeness of individual messages. For T3.3, we need to represent user profiles and the trustworthiness of particular pieces of information. For the WP4 tasks, documents need to be categorised and events need to be represented (T4.2). For T4.3, events need to be clustered in order to match related events.

4.2 Stakeholder Interviews

Many requirements come directly from analysing the needs of existing communities dealing with emergency situations. These needs were gathered by WP2 and will be delivered in D2.1 in March 2017. In this context, 8 interviews were conducted as part of the work on community requirements and evaluation of the resilience platform (WP2) and on the sociotechnical requirements of the COMRADES resilience platform. The interviews, which will be fully described in D2.2 (due March 2017), involved a specialist in ICT for disaster management and 7 community leaders. Each interviewee was asked how they currently use technologies when dealing with crises, and specifically: “What sociotechnical requirements should be considered to design a social platform to boost communities’ resilience in a disaster situation?”

The interviewed ICT specialist was Dr Marc van den Homberg (MA), a senior disaster management expert at CORDAID⁵ (a development aid organisation in the Netherlands). He is currently working in Bangladesh with local communities to help them deal with floods. During the interview, he shared experiences from past and current projects.

The interviews with the community leaders focused on their perception of how they currently deal with disaster situations and how technology supports them. They shared insights into how a new technology could improve crisis management. The interviewed community leaders were the following:

- Adin (AD), Director at Hysteria⁶, a community laboratory focused on youth empowerment, art and city issues in Semarang, Indonesia. A current user of the Ushahidi platform.
- Milan Mukhia (MI) from CORDAID. Milan has worked on humanitarian services in disaster zones in different countries for 12 years. Coordinating collaborations among stakeholders is his main focus.
- Salina Shakya (SA) from CORDAID. Works for the Parivartan⁷ project, which helps society return to normal life after an earthquake disaster.
- Lumanti Joshi (LU), from Lumanti⁸, a support group for shelter in Nepal. An architect who has worked for 13 years on organising communities for building reconstruction plans after disasters, with a focus on bridging community and government by creating structured plans.
- Chuks (CH), Deputy Director at Reclaim Naija⁹ in Nigeria. The Reclaim Naija project monitors elections in real time. Citizens use Ushahidi to report incidents such as fraud or violence. The aim is to change the paradigm of elections in Nigeria by empowering grass-root communities towards civic participation.
- Elsa Marie D’Silva (EL), founder of the Safecity¹⁰ project, which aims to make the problem of sexual harassment more evident to society as a whole and to policy makers. It promotes campaigns and workshops within communities on sharing stories, and maps cases of sexual harassment or abuse in public.
- Olodotum Fadeyiye (OL), programmes officer, and Babatunde Adegoke, designer. They work for Connected Development¹¹ in Nigeria on projects connecting communities and policy makers. Their projects relate to transparency, raising environmental awareness, mapping road conditions and traffic, human rights (monitoring abuse between police and citizens, and women's rights), monitoring elections and emergency response. Most of these projects use crowd-mapping tools.

⁵ CORDAID, https://www.cordaid.org/.
⁶ Hysteria, http://grobakhysteria.or.id.
⁷ Parivartan, http://parivartannepal.org.np.
⁸ Lumanti, http://lumanti.org.np.
⁹ Reclaim Naija, http://reclaimnaija.net.
¹⁰ Safecity, http://safecity.np.

From the different interviews we can observe that there is a strong need for a platform that allows anonymous reports, privacy management, the collection and visualisation of event locations and reports, as well as methods for searching particular events and the ability to assign tasks to reports.

The interviews identify requirements for the COMRADES platform, and by extension the COMRADES model, in terms of functionalities, usability, data needs, performance and external data sources. Users need to create reports of incidents with geolocation, time and date, and the source of the information (e.g. data source, person reporting the incident), while the platform must also allow anonymous reports and feedback. An indication of the reliability of information needs to be available, and reports need to be approved. It should also be possible to assign actions to reports and check their status. Reports should be available in different languages if possible. Information should also be categorised (e.g. needs, resources…). In terms of data sources, the model should support means for adding multiple data sources such as social media (e.g. Flickr, Instagram, Twitter…), SMS and WhatsApp.

Besides the need for representing reports of events and external data, the interviews showed a strong need for identifying the reliability of information, for privacy management, and for assigning tasks for solving particular issues. Therefore, the COMRADES model needs to provide a simple representation of external data and of tasks, as well as support for managing access to information and its trustworthiness.

4.3 Ushahidi Data Structures

As part of the development of the COMRADES platform, an audit of the different data structures used in the Ushahidi platform was performed (D5.1). The Ushahidi data structures cover a wide range of key COMRADES needs such as the representation of users, posts and categories. Since such data structures are all formatted in JSON, they need to be translated into an ontological model so they can be integrated into the COMRADES model.

11 Connected Development, http://connecteddevelopment.org.


| DATA STRUCTURE NAME | DESCRIPTION | PROPERTIES | RELATIONS |
|---|---|---|---|
| User | A user in the Ushahidi platform. | Id, url, created, updated, email, real name, allowed_privileges | role |
| Contact | A contact represents a social media account, SMS number or email address that a message came from. | Id, url, data_provider, type, contact, created, updated, allowed_privileges | User (creator) |
| Post (Survey) | A survey is the core unit of the Ushahidi platform. All social media data is transformed into a survey. | Id, url, title, content, created, updated, source, location, type, allowed_privileges | Parent (Post), form, user (creator), tags |
| Message | Messages store the raw data ingested from social media sources. Each Message is turned into a Post that can then be modified, but the original message is always retained. Message also stores outgoing messages sent in response to social media sources. | Id, url, data_provider, data_provider_message_id, title, message, type, created, allowed_privileges | Post, contact |
| Form | Forms define the data structure of surveys. Each form consists of a number of Stages, and each Stage has a number of Attributes. | Id, url, name, description, type, created, updated, allowed_privileges | Parent (Form) |
| Form Stage | Form stages describe groups of form attributes. | Id, url, label, allowed_privileges | Form |
| Form Attribute | Form Attributes define the data type and input method of individual data points on a Post. | Id, url, label, input, type, required, default, priority, cardinality, created, updated, allowed_privileges | Form_stage |
| Media | Media represents file uploads, usually to be attached to a post. | Id, url, caption, created, updated, allowed_privileges | User (creator), collection |
| Tag | Tags (or categories) can be applied across all posts, regardless of the Post’s Form. | Id, url, tag, slug, type, description, created, color, icon, role, allowed_privileges | Parent (Tag) |
| Collection | A collection is a group of Posts. Posts are manually added to a collection. | Id, url, name, description, created, updated, allowed_privileges | User (creator), posts, visible_to |
| Role (User Groups) | User roles used for determining administration privileges. | Id, url, name, display_name, description, permissions, allowed_privileges | |

Table 1 Ushahidi data structures

By analysing the different APIs of the Ushahidi platform, we obtain the data structures listed in Table 1. The information associated with the data structures consists of either properties or relations. Properties are textual fields that are not shared across data structures, whereas relations are used for linking different data structures together.
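The distinction between properties and relations can be illustrated with a minimal Python sketch that turns one JSON post into subject/predicate/object triples. It is only an illustration under stated assumptions: the namespace `BASE`, the split of fields into `PROPERTIES` and `RELATIONS`, and the use of plain integer identifiers for related structures are all hypothetical, and the actual Ushahidi API payloads and COMRADES vocabulary may differ.

```python
import json

# Hypothetical namespace and field split; the actual COMRADES vocabulary may differ.
BASE = "http://example.org/comrades/"
PROPERTIES = {"title", "content", "created", "updated"}  # literal-valued fields
RELATIONS = {"user", "form", "tags", "parent"}           # links to other data structures


def post_to_triples(post):
    """Translate one post (a dict) into (subject, predicate, object) triples."""
    subject = f"{BASE}post/{post['id']}"
    triples = []
    for key, value in post.items():
        if value is None:
            continue  # nothing to assert for missing values
        if key in PROPERTIES:
            # Properties stay literals attached to the post itself.
            triples.append((subject, BASE + key, str(value)))
        elif key in RELATIONS:
            # Relations point at the identifiers of other data structures.
            targets = value if isinstance(value, list) else [value]
            for target in targets:
                triples.append((subject, BASE + key, f"{BASE}{key}/{target}"))
    return triples


# Assumed, simplified payload: related structures are given as bare identifiers.
raw = '{"id": 12, "title": "Flooded road", "content": "Bridge closed", "user": 3, "tags": [5, 8], "parent": null}'
for triple in post_to_triples(json.loads(raw)):
    print(triple)
```

Keeping the two field sets explicit makes it easy to move a field from one group to the other when a property turns out to be better modelled as a relation.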


It is important to note that some of the features derived from the APIs would benefit from being modelled as relations rather than properties. For instance, the icon used for representing a Tag should be converted to a relation that links to a Media resource, so that any media can be used for representing a particular category.

In general, it can be observed that different input sources (Message) need to be integrated into the COMRADES model, then converted into a standardised unit of information (Post). These posts are then categorised (Tag) or grouped (Collection). Users (User) need to be associated with documents as creators and input sources. Finally, users need roles (Role) that can be used for granting access permissions to the platform data.

An important aspect of the Ushahidi data model is the concept of forms. Forms are associated with particular posts and are used for representing arbitrary textual input using customisable fields. This is particularly challenging in an ontological context as it can add a lot of complexity to the model.

4.4 Crisis Related Datasets

Many of the crisis-related datasets and data sources that can be used for data analysis purposes by the COMRADES project come from social media, particularly Twitter. Crisis-related datasets are generally divided into high-level and low-level datasets. High-level datasets contain citizen reports or social media reports about discrete events that occur in large-scale crises, whereas low-level datasets focus on the general description of events. Compared to high-level datasets, low-level datasets have more information about the specifics of particular events and are typically created manually by experts or organizations who verify reports. Unfortunately, such data tends to be created after events occur, and contains aggregated information. By contrast, the high-level datasets tend to consist of unfiltered and unverified reports of discrete events that lack clear context. In COMRADES, we are more interested in high-level data, for two reasons: 1) it tends to contain more real-time information than the low-level datasets; 2) its size is much larger than that of its low-level counterpart.

The following table (Table 2) lists the different datasets that have been investigated so far. The available data can be divided depending on the data source that was used for building a particular dataset. We distinguish three types of data source: social media data (i.e. Twitter posts), user reports (e.g. Ushahidi, ACLED) and news agency data (e.g. news websites). Each data type has advantages and disadvantages. Social media data is widely available; however, its reliability is unclear and its format is highly unstructured, so it requires complex analysis in order to be converted into usable data. Citizen reports are scarcer but potentially more useful as they are formatted specifically for describing events. Finally, news data has the advantage of being more reliable and can contain information about disaster relief.
However, such data is more likely to become available only after an event occurs, and it is low-level as it summarises a situation.

| DATASET | DESCRIPTION | MEDIA TYPE | COVERAGE | DATASET SIZE |
|---|---|---|---|---|
| Crisis Lex T26 | 26 crises partially annotated with informativeness, information type and source. | Twitter (Social Media) | 2012-2013 | ~250k Tweets |
| Incident Tweets | Data collected from multiple cities in the USA and UK. Partially annotated with event types. | Twitter (Social Media) | 2012-2014 | ~15M Tweets |
| Crisis Lex T6 | 6 crises, annotated by relatedness. | Twitter (Social Media) | 2012-2013 | ~60k Tweets |
| Crisis NLP | Multiple event datasets with some computed features. Multiple languages (English/French/Spanish). | Twitter (Social Media) | 2014-2016 | ~40M Tweets |
| Crisis Map (Ushahidi) | Many event reports from Ushahidi deployments. | Citizen Reports (Ushahidi) | 2011-2013 | 33 Events |
| Phoenix Data Project | Near real-time event dataset created by scraping 400 news sources. | Event Summaries (News Agency Data Source). Uses the CAMEO event taxonomy. | 2014-Now | Monthly datasets (expanding) |
| GDELT | Multiple databases created in near real-time from multiple data sources in different languages. | Event Summaries (News Agency Data Source). Uses the CAMEO event taxonomy. | 1979-Now / 2013-Now (with data source) | Collected every 15 minutes (expanding) |
| ACLED | Event summaries created weekly about events occurring in Africa and Asia. | Event Summaries (Created and verified manually). Uses the CAMEO event taxonomy. | 1997-Now (Africa) / 2010-Now (Asia) | Weekly datasets (expanding) |
| Crisis Net | Data on crises such as diseases, political conflicts, and health (e.g. Ebola), all freely accessible via REST API. | Reports automatically generated from different data sources. | 2014 | ~1.6M Items |
| Relief Web | Real-time API access to reports since 1996. Provides low-level information about global events. | Citizen Reports (Unformatted data). | 1996-Now | ~54K Reports |
| HDX | The Humanitarian Data Exchange is a dataset repository that contains multiple datasets about different crises and related resources in different formats. | Citizen Reports / Event Summaries / Social Media | 2014-Now | 4163 Datasets / 244 Locations / 804 Sources (expanding) |

Table 2 Crisis related datasets

In terms of data formats, existing social media datasets tend to be based on Twitter data. They therefore directly follow the Twitter message format and contain short texts with user information and, sometimes, user GPS coordinates that can be used for identifying the location of particular events. Report data is platform-specific but generally contains a title, a date, a location, a description, and a type (e.g. fire, earthquake…). Sometimes there can be additional information depending on the type of report. For example, the Ushahidi instance

created for monitoring the USA presidential elections of 2016¹² has custom fields about candidates in each report. Finally, many of the news agency based datasets such as the GDELT¹³, ACLED¹⁴ and Phoenix Data Project¹⁵ datasets follow the CAMEO [8] model, which provides a taxonomy to identify the type of event mentioned as well as the actors involved.

12 Ushahidi USA Elections, https://usaelectionmonitor.ushahidi.io.
13 GDELT, http://www.gdeltproject.org/data.html#rawdatafiles.
14 ACLED, http://www.acleddata.com/data/.
15 Phoenix Data Project, http://phoenixdata.org/data.

Since there are many similarities between the different data models listed in Table 2, and since each dataset uses different terminology for describing similar types of data, we decided to translate the data structures found in each dataset into the same format (Table 3).

| Data Structure | Feature | Description | Dataset |
|---|---|---|---|
| Report/Event/Post | ID | Unique identifier. | ACLED, GDELT, Phoenix, Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Creator | Actor that created the document. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Creation Date | Date when a document was created. | Twitter Message, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Update Date | Update date of a document. | CrisisNet |
| Report/Event/Post | Title | Summary/Title of a document. | CrisisNet |
| Report/Event/Post | Content | Content of a document. | ACLED, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Media | A media associated with a document (e.g. image, video). | Twitter, CrisisNet |
| Report/Event/Post | Actor Source | The actor source. | ACLED, GDELT, Phoenix |
| Report/Event/Post | Number of sources | Number of information sources for the document. | GDELT |
| Report/Event/Post | Category | Tags or categories that classify a document. | Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Location Precision | Certainty of an event location. | ACLED |
| Report/Event/Post | Information type | Identify the type of information contained in a document (e.g. caution, advice, donation...). | ACLED, GDELT, Phoenix, Incident Tweets, CrisisLex T26, CrisisNLP |
| Report/Event/Post | Information sub-type | Identify the sub-type of the information contained in a document (e.g. donation-shelter). | GDELT, CrisisNLP |
| Report/Event/Post | Information sub-sub-type | Identify the sub-sub-type of the information contained in a document. | GDELT |
| Report/Event/Post | Language | Language of a document. | Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | English translation | The English translation of the content of a document. | CrisisNet |
| Report/Event/Post | Relevance | Identify if a document is about a crisis event. | CrisisLex T26 |
| Report/Event/Post | Informativeness | Identify if a document is informative (i.e. gives useful information). | CrisisLex T26, CrisisNLP |
| Report/Event/Post | Actor Relation | The relation between the actors involved in an event. | ACLED, Phoenix |
| Report/Event/Post | Target Actor | The actors targeted by an event (recipient actor). | ACLED, Phoenix |
| Report/Event/Post | Fatalities count | Number of fatalities. | ACLED |
| Report/Event/Post | Goldstein Code | Numeric score capturing the theoretical potential impact that the type of event will have on the stability of a country. | GDELT, Phoenix |
| Report/Event/Post | Average Tone | Scale defining the positiveness / negativeness of an event. | GDELT |
| Report/Event/Post | Reference | An external information cited by a document. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Source Actor | The actor that initiated an event. | ACLED, Phoenix |
| Report/Event/Post | Favourite count | Number of times a document has been bookmarked. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Messaged Actor | The identifier of the actor that a document content is targeted at. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Parent Document | The identifier of a parent document (e.g. reply to) or related event. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Shares count | Number of times a document has been shared. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Event Location | The location of the reported object. | ACLED, GDELT, Phoenix, Twitter Message, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Report/Event/Post | Event Date | Date when the object referred to in the document or report was created (e.g. event, resource...). | ACLED, GDELT, Phoenix |
| Report/Event/Post | Event Lifespan | Whether the document describes something temporary (e.g. event) or permanent (e.g. resource). | CrisisNet |
| Report/Event/Post | Event Date precision | Certainty of an event date. | ACLED |
| Geolocation | Geolocation Description | Name of a geolocation. | ACLED, GDELT, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Geolocation | Coordinates | GPS coordinates of a geolocation. | ACLED, GDELT, Phoenix, Twitter, CrisisNet, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| Geolocation | Country | Country of a geolocation. | ACLED, GDELT, CrisisNet |
| Geolocation | Admin Region 1 | Largest administrative region. | ACLED, GDELT, Phoenix |
| Geolocation | Admin Region 2 | Second largest administrative region. | ACLED, GDELT, Phoenix |
| Geolocation | Admin Region 3 | Third largest administrative region. | ACLED, GDELT, Phoenix |
| User/Account | ID | Unique actor identifier. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Name | Name of the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP, ACLED |
| User/Account | Agent Type | The type of an actor (e.g. media, government...). | CrisisLex T26, CrisisNLP, ACLED |
| User/Account | Creation date | Date when an actor account has been created. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Description | Text description of an author. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Favourite count | Number of times an actor account has been bookmarked. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Subscribers | List of actors subscribing to that particular actor account. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Subscribers count | Number of actor accounts subscribing to that particular actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Subscribed count | Number of actor accounts subscribed to. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Language | Language of the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Geolocation | Geolocation of actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Documents | List of created documents. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Documents count | Number of documents created. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | URL | URL associated with the actor. | Twitter, CrisisLex T26, CrisisLex T6, Incident Tweets, CrisisNLP |
| User/Account | Country | Country of the actor. | GDELT, ACLED, Phoenix |
| User/Account | Organisation | The actor group (e.g. United Nations, al-Qaeda). | GDELT, ACLED, Phoenix |
| User/Account | Ethnic group | The ethnic group of an actor. | GDELT |
| User/Account | Religion | The actor religion. | GDELT |
| Data source | Creation date | Date time at which the source was created. | CrisisNet |
| Data source | Description | Description of the source. | CrisisNet |
| Data source | Start date | When the data is initially available. | CrisisNet |
| Data source | End date | When the data is not accessible anymore. | CrisisNet |
| Data source | Frequency | How often the data gets updated. | CrisisNet |
| Data source | License | The license of the data source. | CrisisNet |
| Data source | Type | The resource type (e.g. social network, news...). | CrisisNet |

Table 3 Data Structure of Crisis related datasets
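The translation of dataset-specific records into the shared feature names of Table 3 can be sketched as a simple field mapping. The Python below is a minimal illustration rather than the project's implementation: the source field names in `FIELD_MAPS` and the snake_case target names are hypothetical placeholders, and the real dataset schemas may use different keys.

```python
# Hypothetical source field names, chosen for illustration only; the real
# dataset schemas may use different keys.
FIELD_MAPS = {
    "twitter": {
        "id_str": "id",
        "text": "content",
        "created_at": "creation_date",
        "lang": "language",
    },
    "acled": {
        "event_id": "id",
        "notes": "content",
        "event_date": "event_date",
        "fatalities": "fatalities_count",
    },
}


def to_common_format(record, dataset):
    """Rename the fields of one dataset record to the shared feature names."""
    mapping = FIELD_MAPS[dataset]
    # Fields without a mapping are dropped; missing source fields are skipped.
    common = {target: record[source] for source, target in mapping.items() if source in record}
    common["dataset"] = dataset  # keep track of where the record came from
    return common


tweet = {"id_str": "1", "text": "Bridge closed", "created_at": "2016-09-20", "lang": "en", "retweet_count": 4}
report = {"event_id": "A7", "notes": "Clash reported", "fatalities": 2}
print(to_common_format(tweet, "twitter"))
print(to_common_format(report, "acled"))
```

Holding the mapping in data rather than in code means a new dataset only needs a new entry in `FIELD_MAPS`.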

As with the Ushahidi data structures, there are some features that may not be useful for the COMRADES model. For instance, the CrisisNet dataset provides data source information that is not necessary for the COMRADES model as this is not used by the different tools developed by the COMRADES platform and the Ushahidi platform.

By analysing the different properties of each dataset, we distinguish four different data structures: 1) reports, events and posts; 2) geolocation information; 3) user and account information; and 4) data sources. The reports, events and posts hold the main documents of the datasets, whereas user and account information represents document creators, or the organisations or information sources involved in events. Geolocation data structures hold location information related to events and users. Finally, the data source structure is only used by the CrisisNet dataset, and stores information about how data is accessed.

In general, it appears that the crisis related datasets hold more information about events than the Ushahidi data structures, even though Ushahidi can use forms for modelling this type of data. In particular, the datasets that follow the CAMEO taxonomy support different types of events and actors, with many different properties such as the actors involved in particular events as well as the organisations they belong to.

In summary, the analysis of the crisis related datasets shows that the Ushahidi data structures already support many of the requirements of the existing datasets, except for the representation of domain-specific information (e.g. Twitter posts) and the rich user and event models that are mostly provided by the CAMEO taxonomy and the Twitter data.

5 The Ontology Requirement Specification Document (ORSD)

By analysing the stakeholder interviews, the Ushahidi platform and the different datasets considered for integration into the COMRADES model, we can create the ORSD that can be used for specifying what the COMRADES model should look like.

5.1 COMRADES Aims and Model Purpose

As part of the first step for designing the ORSD, we need to define the purpose, scope and level of formality of the model. The aim of the COMRADES project is to create a community resilience platform, i.e. software that helps communities to reconnect, respond to, and recover from crisis situations, by providing a representation that allows communities or individuals to do so. In other words, the model needs to enable the representation of individuals and groups of individuals, allow communication between individuals, enable communities to respond to crises by gathering critical information, and support recovery by allowing the organization of resources and aid. In order to do so, the COMRADES project aims to build on top of the Ushahidi platform by providing new intelligent algorithms aimed at helping communities, citizens, and humanitarian services with analysing, verifying, monitoring, and responding to emergency events.

The model needs to be general enough to cover a wide variety of scenarios, and therefore flexible. In the case of ontological development, a flexible model needs to offer relatively loose semantics (i.e. avoid overspecialisation) so that new types of users or resources do not require major ontological modifications.

In terms of scope, the model aims to support the modelling of crises and their recovery through social media analysis and manual data input.

In conclusion, the COMRADES model’s purpose can be defined as the following:

The COMRADES model is a flexible model designed to organize individual or community communication of events, allow the gathering of information and resources about crises, and organize them for recovery purposes.

5.2 Intended Use and Users

The COMRADES model needs to satisfy different user groups such as governmental organizations and non-governmental groups, as well as individuals. Such individuals may have many different aims and goals. The model also needs to support algorithmic needs by allowing software to assert new information itself (e.g. trustworthiness, extracted entities…). Following the interviews with stakeholders, we distinguish four different types of users for the COMRADES model:

(1) Platform stakeholders: The individuals or organisations that supply community platforms such as Ushahidi.
(2) Local community groups: Community members of local activist groups.
(3) Responders: Organisations and individuals that use information gathered by platforms in order to organise the response to and recovery from a particular crisis.
(4) Individuals and small citizen groups: Individuals or small communities that are affected by a particular crisis.

Each of these user groups has different needs that define how the COMRADES model will be used. For instance, platform stakeholders need to make sure that the platform can be deployed easily. Community groups need to be able to assess a given crisis situation. Responders require the ability to analyse a given situation and organise recovery. Finally, individuals and citizen groups need to understand the situation and to be able to ask for assistance by reporting crisis events.

In summary, the COMRADES model needs to cater for the four different user types mentioned above, and to be useful both in situations where users are looking for or are willing to provide information about crises and in situations where responders are organising resources in order to resolve a particular situation.

5.3 Competency Questions

A key part of the ORSD is to define competency questions that specify what types of queries the model should be able to support. As previously discussed, we perform three types of analysis: 1) study the stakeholder interviews to better understand their needs; 2) analyse the work package requirements of the COMRADES project; and 3) study the structure of the Ushahidi community. We define competency questions as task-oriented questions that need to be satisfied by the COMRADES model. These competency questions are used in the second part of this document for making sure that the model supports all the question-based requirements and, in the last part, for validating the model against the competency questions.
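One way to picture validating a model against competency questions is a coverage check: each question is linked to the model terms it needs, and any question with missing terms is flagged. The Python sketch below is only an illustration of that idea. The vocabulary in `MODEL_TERMS` and the question-to-term mapping in `REQUIRED_TERMS` are hypothetical and are not taken from the COMRADES model.

```python
# Hypothetical model vocabulary and question-to-term mapping, for illustration only.
MODEL_TERMS = {"Message", "User", "language", "topic", "reliability"}

REQUIRED_TERMS = {
    "What is the language of a message?": {"Message", "language"},
    "How reliable is a user?": {"User", "reliability"},
    "Who is assigned to a task?": {"Task", "assignee"},
}


def uncovered_questions(model_terms, required_terms):
    """Map each question the model cannot answer to the terms it is missing."""
    gaps = {}
    for question, terms in required_terms.items():
        missing = terms - model_terms  # set difference: terms absent from the model
        if missing:
            gaps[question] = missing
    return gaps


gaps = uncovered_questions(MODEL_TERMS, REQUIRED_TERMS)
for question, missing in gaps.items():
    print(question, "is missing", sorted(missing))
```

In this toy setup only the task question is reported, because the assumed vocabulary has no task-related terms.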


In the following sections we convert the knowledge sources discussed in the previous section into competency questions. If a particular question is already covered by another knowledge source, we do not add an additional question.

5.3.1 Work Package Requirements

As previously discussed, the COMRADES tasks specify directly what type of information is needed for the algorithms used by the COMRADES resilience platform. For T3.2, T3.2 and T3.3, we obtain the following competency questions:

CQ1: How many messages are submitted to the platform?
CQ3: What is the language of a message?
CQ5: What are the topics of a message?
CQ6: What are the named entities of a message?
CQ7: What are the properties of a message?
CQ8: How reliable is a message?
CQ9: How reliable is a user?
CQ10: How many users have submitted information to the platform?

For T4.2 and T4.3, we have the following competency questions:

CQ11: How many events are present in the platform?
CQ12: What are the types of events in the platform?
CQ13: How are two events related?
CQ14: How many resources are needed?
CQ15: How many resources are available?
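Competency questions of this kind translate naturally into queries over the model's data. The following Python sketch answers three of them over a toy list of triples using a small pattern-matching helper. It is an illustration under assumptions: the predicate names (`type`, `creator`, `event_type`) and the sample data are hypothetical, and a real deployment would more likely use a triple store and a query language such as SPARQL.

```python
# Toy data with hypothetical predicate names; the real model may name them differently.
TRIPLES = [
    ("msg1", "type", "Message"), ("msg1", "creator", "alice"),
    ("msg2", "type", "Message"), ("msg2", "creator", "bob"),
    ("msg3", "type", "Message"), ("msg3", "creator", "alice"),
    ("ev1", "type", "Event"), ("ev1", "event_type", "flood"),
    ("ev2", "type", "Event"), ("ev2", "event_type", "fire"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every non-None component of the pattern."""
    return [
        (s, p, o)
        for s, p, o in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# "How many messages are submitted to the platform?"
message_count = len(match(TRIPLES, predicate="type", obj="Message"))

# "How many users have submitted information to the platform?"
user_count = len({o for _, _, o in match(TRIPLES, predicate="creator")})

# Which types of events are in the platform?
event_types = sorted({o for _, _, o in match(TRIPLES, predicate="event_type")})

print(message_count, user_count, event_types)
```

Counting distinct creators with a set keeps a user who submitted several messages from being counted more than once.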

5.3.2 Interviews and Qualitative Requirements

As previously discussed, the interviews conducted with the stakeholders showed the need for a strong reporting model with methods for managing the access to information, multiple data sources and task management. The competency questions derived from the interviews are listed below:

CQ16: What is the location of an event?
CQ17: When did an event occur?
CQ18: What is the information source of an event?
CQ19: What information is related to an event?
CQ20: Is the event information source anonymous?
CQ21: What are the pictures associated with an event?
CQ22: What is the language of an event?
CQ23: Who has access to the report?
CQ24: Is the report reliable?
CQ25: Is the report published (i.e. approved)?
CQ26: What are the tasks associated with the report?
CQ28: Who is assigned to a task?
CQ29: What are the types of information sources?


5.3.3 Ushahidi Data Structures

Although the Ushahidi platform supports 11 different data structures, it is possible to simplify the structure of the data model as long as the same information can be retrieved. The competency questions resulting from the analysis of the Ushahidi platform data format are:

CQ30: What are the types of posted documents (Forms) available in the platform?
CQ31: What are the types of media available?
CQ32: What are the categories of documents that are in the platform?
CQ33: What are the document collections in the platform?
CQ34: What are the different user roles?

It must also be possible to query the COMRADES model to retrieve the different properties of each class stored in the model:

CQ35: When was a user created?
CQ36: When was a user's information updated?
CQ37: When was a document created?
CQ38: When was a document updated?
CQ39: When was a message created?
CQ40: When was a message updated?
CQ41: When was a document type created?
CQ42: When was a document type updated?
CQ43: When was a media created?
CQ44: When was a media updated?
CQ45: When was a topic created?
CQ46: When was a document collection created?
CQ47: When was a document collection updated?
CQ48: What is the email of a user?
CQ49: What is the real name of a user?
CQ50: What is the role of a user?
CQ51: What are the privileges of a user?
CQ52: What is the title of a document?
CQ53: What is the content of a document?
CQ54: What is the source of a document?
CQ55: What is the location of a document?
CQ56: What is the type of a document?
CQ57: Who is the user that created a document?
CQ58: What is the category of a document?
CQ59: What is the caption of a media?
CQ60: What are the collections associated with a media?
CQ61: What is the type of a category?
CQ62: What is the description of a category?
CQ63: Who is the user that created a category?
CQ64: What is the parent category of a category?
CQ65: What is the name of a collection?
CQ66: What is the description of a collection?
CQ67: Who is the user that created a collection?
CQ68: What are the documents in a collection?
CQ69: What are the users that can access a collection?
CQ70: What is the name of a role?
CQ71: What is the description of a role?
CQ72: What are the permissions of a role?

5.3.4 Crisis Related Datasets

As previously highlighted, many sources of information found in the crisis related datasets can be modelled by the Ushahidi data structures, and are therefore already covered by the Ushahidi data structure competency questions presented in the previous section. The competency questions retained when analysing the crisis related datasets are listed below:

CQ73: What is the title of a document?
CQ74: What are the media associated with a document?
CQ75: What is the actor at the origin of the information referred to in a document?
CQ76: How many data sources are referenced by a document?
CQ77: How precise is an event or resource geolocation?
CQ78: What is the type of information reported in a document?
CQ79: What is the language of a document?
CQ80: Is there an English translation for a document?
CQ81: What are the documents related to crises?
CQ82: How informative is a document?
CQ83: What is the relation between the actors involved in an event?
CQ84: What are the actors involved in an event?
CQ85: How many people died during an event?
CQ86: What is the impact of an event on the stability of a country?
CQ87: What are the documents related to a document?
CQ88: How positive is an event?
CQ89: How many times was a document favorited?
CQ90: Who was targeted by a message?
CQ91: Is there a parent event to an event?
CQ92: How many times was a document shared?
CQ93: Where did an event take place?
CQ94: What is the lifespan of an event or resource?
CQ95: What is the type of an actor?
CQ96: How is an actor described?
CQ97: How many times has a user account been favorited?
CQ98: Who is following a user account?
CQ99: What are the users followed by a user account?
CQ100: What is the religion of an actor?
CQ101: What is the ethnic group of an actor?
CQ102: What is the organisation associated with an actor?

6 Term Glossary

Now that we have a set of competency questions, we can extract the terms that are used most often in the questions in order to guide the development of the model. The idea is that the most frequent terms are key aspects of the model and need to be modelled prominently (e.g. as classes), whereas infrequent terms may not need to be represented as prominently in the final model.

The typical NeOn approach requires competency questions that are linked to actual data in order to extract each of those terms. Our competency questions differ in that they are conceptualised from data structures and general interviews. As a result, they are not data specific and are more conceptual. The NeOn methodology [9] distinguishes three different types of terms: 1) competency question terms; 2) competency question answer terms; and 3) object terms. Competency question terms are the top words that appear in the competency questions, whereas answer terms are the ones that appear in the answers to the competency questions. Object terms are the named entities extracted from the competency questions and answers.

Since we do not have instantiated competency questions that contain both data specific questions and answers, we generate the glossary terms as follows: 1) we extract the most frequent terms appearing in our competency questions; 2) we extract the most frequent terms appearing in the data structures that we used for creating our competency questions. The idea is that, besides the terms extracted from the competency questions, the property descriptions and names found in the different datasets and in the Ushahidi platform can help identify the key concepts and attributes of the COMRADES model. The top terms extracted from the competency questions are listed in Table 4.

TYPE | TERM (FREQUENCY)
Competency Question | Document (27), Event (17), User (13), created (10), Type (9), Information (8), Message (8), Collection (7), Category (6), Platform (6), Actor (6), Media (6), updated (6), associated (5), related (4), Role (4), Report (4), name (3), description (3), Source (3), language (3), Account (3), Events (3), reliable (3).
Data Structures | created (13), allowed_privileges (11), id (10), URL (10), Form (9), data (9), User (8), Type (8), updated (7), Post (6), Message (6), Media (6), Multiple (5), Event (5), Creator (4), Posts (4), description (4), sources (4), Collection (4), Crises (4), social (4), contact (4), name (3), annotated (3).

Table 4 Top Terms Extracted from the Competency Questions, Crisis Related Datasets and the Ushahidi Data Structures.
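The frequency-based extraction behind the glossary can be sketched as follows. This is a minimal illustration and not the procedure actually used to produce Table 4: the tokenisation rule, the stop-word list and the helper name `top_terms` are assumptions, and only three of the competency questions are used as input.

```python
import re
from collections import Counter

# Small illustrative stop-word list (an assumption, not the list used for Table 4).
STOP_WORDS = {"what", "is", "are", "the", "of", "a", "an", "in", "to", "was",
              "when", "who", "how", "that", "with", "by", "for", "did", "has"}


def top_terms(questions, n=5):
    """Return the n most frequent non-stop-word terms found in the questions."""
    counts = Counter()
    for question in questions:
        # Lower-case the question and keep alphabetic tokens only.
        for token in re.findall(r"[a-z_]+", question.lower()):
            if token not in STOP_WORDS:
                counts[token] += 1
    return counts.most_common(n)


questions = [
    "What is the title of a document?",
    "What is the content of a document?",
    "Who is the user that created a document?",
]
print(top_terms(questions, n=1))
```

In this toy input the term "document" dominates, which mirrors its position at the top of the competency question row of Table 4.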

7 Summary

In order to specify the COMRADES model, we analysed the COMRADES project requirements, the Ushahidi platform and related crisis datasets to extract the data requirements for the model. We also analysed user requirements from stakeholder interviews in order to derive the model requirements from the perspective of its future users. This approach was based on the structural and qualitative design approach discussed in the introduction. The analysis helped us to better understand the aims of the model, its future usage and its users. We also produced a set of competency questions that form the basis of the model implementation. The next part of this document reuses those findings for fully specifying and implementing the COMRADES ontology. In particular, we reuse the competency questions, as well as the common data structures observed in the Ushahidi platform and crisis related datasets, for guiding the development of the ontology.


Part II: COMRADES Ontology Model
COMRADES Model Implementation

8 Introduction

In the previous sections, we created the Ontology Requirement Specification Document (ORSD) [1] for the COMRADES model based on multiple analyses, and extracted a set of competency questions as well as a glossary of key terms. In the following sections, we create the COMRADES ontology¹⁶ based on the ORSD by identifying the key components of the model, and then the relations between the components as well as the properties of the ontology. We also integrate the ontological model with existing ontologies in order to improve its interoperability and usability. To simplify the usage of the model across different communities, we also translate the ontology classes, properties and relations into different languages. Finally, we discuss how domain knowledge can be added to the COMRADES ontology.

9 Model Principles

Before introducing the COMRADES model, we discuss the main approach used for gathering and organising information and resources about crises. Many of the datasets and data structures analysed when creating the ORSD are centred on reports and the ingestion of external documents rather than on the direct modelling of events. We therefore centre the COMRADES model on reports rather than events: reports are clustered together to describe events, events result in real-world situations, and external documents (or other information sources such as other reports or an informant) document what is discussed in a report. Reports can be used in different ways for documenting events, needs, resources and so on, and form the base of the COMRADES model. The advantage of a report-centred approach is that it allows a more organic gathering of information related to events without needing rigid data structures. This is particularly suitable for resilience platforms that are deployed in a large variety of situations where the types of reports are context specific. We use a situation model for documenting how events affect their environment. Typically, a situation involves different entities (e.g. local population, building, political situation) and defines the state induced by the situation. For example, a building explosion (situation) would induce a particular building (entity) to be collapsed (state).
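To make the report-centred principle more concrete, the sketch below mirrors the building explosion example with plain Python data classes. The class names, field names and instance values are illustrative assumptions and do not reproduce the ontology vocabulary; for brevity, a report only describes events here.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    name: str
    state: str = "unknown"  # state induced by a situation


@dataclass
class Situation:
    title: str
    entities: list = field(default_factory=list)


@dataclass
class Event:
    title: str
    situations: list = field(default_factory=list)  # situations the event results in


@dataclass
class Report:
    title: str
    sources: list = field(default_factory=list)    # external documents or informants
    describes: list = field(default_factory=list)  # events described by the report


def entity_states(report):
    """Follow report -> events -> situations -> entities and collect (name, state) pairs."""
    return [(entity.name, entity.state)
            for event in report.describes
            for situation in event.situations
            for entity in situation.entities]


# Hypothetical instances following the building explosion example.
building = Entity(name="Building A", state="collapsed")
situation = Situation(title="Building explosion", entities=[building])
event = Event(title="Reported explosion", situations=[situation])
report = Report(title="Explosion seen downtown",
                sources=["news article"], describes=[event])

print(entity_states(report))
```

Starting from the report rather than from the event keeps the provenance of the information (the sources) attached to every state that is eventually derived.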

16 COMRADES Ontology, http://socsem.open.ac.uk/ontologies/comrades.


Another important component of the model is the representation of categories and collections. We distinguish the two as follows: collections are groups of documents and reports that are manually curated by users, whereas categories are hierarchical organisations used for classifying reports and events. For representing users and the permissions associated with documents, reports and other model classes, we use the concepts of roles and accounts, where users hold roles that are associated with user permissions. User accounts hold platform specific user information such as the reliability of a user's contributions. Finally, we add a simple model for representing tasks that can be attached to reports and assigned to users.

10 Ontology Components

Based on the previous model principles, we discuss the classes, relations and properties of the COMRADES model. We refer to the COMRADES namespace as com in the following sections.

10.1 Classes and Relations

As previously discussed, the COMRADES model is divided into different classes that separate crisis related data into four different types of information (reports, documents, events and situations) and associate them with tasks as well as users. In the following sections, we discuss how the different classes of the model are named and linked. Figure 3 shows the different classes and relations of the COMRADES model.

Figure 3 The COMRADES Ontology classes and relations.


10.1.1 Information Sources, Reports and Situations

The competency questions show that many properties and relations are focused on different types of documents, and that both the Ushahidi platform and the crisis related datasets prefer modelling events indirectly using user submitted reports or automatically generated documents. As a consequence, we centre the representation of crisis related information around the concepts of com:Report, com:Situation, com:Event, com:Document and com:Informant.

In order to connect these components, we associate a com:Report with the com:Document instances that represent the information sources used for creating the report, such as an external media (com:Media) or message (com:Message). These classes can be subclassed as needed if a new com:Document representation is necessary. A com:Report can also be linked to a com:Informant, which can be a com:Agent or com:Organisation, for representing the person or organisation that gave the information used in the report. We define messages (com:Message) as an extension of documents (com:Document): contrary to documents, which are standalone pieces of information (e.g. news articles, blog posts), messages occur in conversations (e.g. Twitter messages, forum posts).

Besides being associated with documents and informants, reports are also connected with the events (com:Event) and situations (com:Situation) that they describe or update. Events are things that happen or take place, whereas situations represent the states (com:State) of entities (com:Entity). The separation of com:Document and com:Informant from com:Event and com:Situation is designed for identifying how a piece of information obtained from an external source is integrated and processed through a report (com:Report) into a piece of usable knowledge in the form of a situation (com:Situation) or event (com:Event).

This allows the model to be queried from different perspectives. For instance, com:Document can be used for understanding where a piece of information comes from, com:Report can be used for understanding how a com:Document was brought into the COMRADES platform and, finally, com:Event and com:Situation can be used for analysing current emergency situations by observing the com:State of a com:Entity. Besides com:Document and com:Message, a com:Report can also be associated with different types of media (com:Media) such as pictures (com:Picture) and videos (com:Video).

10.1.2 Collections, Categories and Topics

The different types of information collected by the COMRADES model can be grouped and categorised in different ways. For instance, com:Report and com:Document can be grouped into a com:Collection whereas com:Report and com:Event can be grouped into a com:Category. Collections (com:Collection) are designed to emulate the Ushahidi document collections by allowing different types of information to be grouped together as a list according to arbitrary criteria, while categories (com:Category) are used as a public hierarchical classification model for information retrieval purposes.

10.1.3 Actors, Organizations and Accounts

Another important part of the model is the representation of the actors, organisations and accounts that are used for representing the creator of a com:Document and the person that posted a com:Report, as well as the people and organisations that created a com:Situation or com:Event. We distinguish different types of users. In particular, we define com:Agent as a generic type of user and the com:Organisation class for defining the different types of organisations or groups a user can belong to. For instance, a user could belong to a particular NGO or religious group. com:Agent is defined as a subclass of com:Informant, which can be used as the information source of a com:Report when no document source (com:Document) is available but the information comes from a known person or organisation. For contributions within the COMRADES model, the com:Account class is used for abstracting contributor specific information that only exists within the COMRADES model, such as the number of documents created by a com:Agent.

10.1.4 Tasks, Roles and Permissions

As highlighted by the ORSD, the COMRADES model needs to support access permissions for the different content represented by the model. In this context, we define the classes com:Role and com:Permission, which are used together for associating permissions with multiple model classes. The Ushahidi platform also supports the assignment of tasks to platform users. We support tasks by adding the com:Task class to the model and linking it to com:Account and com:Report so that reports can be used for assigning tasks.

10.2 Properties

Contrary to relations, properties do not link a class to other classes of the COMRADES ontology. The different properties required for each class can be directly extracted from the competency questions as well as from the previously analysed data structures. The properties of the classes displayed in Figure 3 are listed in the following table (Table 5):

Class | Property | Description
Document | updated | Date when a class was updated.
Document | created | Date when a class was instantiated.
Document | title | The document title.
Document | content | The content of a document.
Document | informativeness | How informative is a document (i.e. useful for crisis analysis).
Document | favourites | How many times a document was bookmarked.
Document | shares | The number of times the document was shared.
Document | language | The language of the document.
Document | polarity | Indicates the sentiment of a document.
Document | englishTranslation | The English translation of the document content.
Report | updated | Date when a class was updated.
Report | created | Date when a class was instantiated.
Report | title | The title of the report.
Report | informativeness | How informative is the report (i.e. useful for crisis analysis).
Report | language | The language of the report.
Report | approvalStatus | The report status (e.g. draft, published, deleted...).
Report | polarity | Indicates the sentiment of a report.
Report | englishTranslation | The English translation of the report.
Situation | created | Date when a class was instantiated.
Situation | updated | Date when a class was updated.
Situation | description | The description of the situation.
Situation | title | The title of the situation.
Situation | startTime | When the situation started.
Situation | endTime | When the situation stopped.
Situation | informativeness | How informative is the situation (i.e. useful for crisis analysis).
Situation | polarity | Indicates the sentiment of a situation.
Event | updated | Date when a class was updated.
Event | created | Date when a class was instantiated.
Event | title | The title of the event.
Event | startTime | When the event started.
Event | endTime | When the event stopped.
Event | informativeness | How informative is the event (i.e. useful for crisis analysis).
Event | polarity | Indicates the sentiment of an event.
Entity | created | Date when a class was instantiated.
Entity | updated | Date when a class was updated.
Entity | description | The description of the entity.
Entity | name | The name of the entity.
Entity | lifespan | The entity lifespan (e.g. permanent, temporary, consumable...).
Category | title | The title of the category.
Category | description | The description of the category.
Collection | created | Date when a class was instantiated.
Collection | updated | Date when a class was updated.
Collection | title | The title of the collection.
Collection | description | The description of the collection.
Agent | created | Date when a class was instantiated.
Agent | updated | Date when a class was updated.
Agent | realName | The real name or full name of the agent.
Agent | email | The email associated with the agent.
Agent | description | The description of the agent.
Account | favourite | The number of times the account was bookmarked.
Account | created | Date when a class was instantiated.
Account | updated | Date when a class was updated.
Task | created | Date when a class was instantiated.
Task | updated | Date when a class was updated.
Task | title | The task title.
Task | description | The task description.
Task | status | The status of the task (e.g. accepted, pending, assigned...).
Geolocation | precision | The accuracy of the geolocation.

Table 5 Properties of the COMRADES Ontology

It is important to note that the reliability of the different elements of the ontology is not represented as a property. Instead, the reliability and trustworthiness of resources are represented using the Veracity ontology¹⁷ [10].

11 Integration with Existing Ontologies

The COMRADES model reuses multiple ontologies for modelling the different classes, properties and relations discussed in the previous section.

11.1 Crisis Related Ontologies

Although many ontologies have been designed for representing crises or related information, most of them do not focus on the concepts of report and document. Instead, existing models focus on the event representation of emergency crises and ignore the collection of evidence and user submitted reports as a means of representing event related information. Task representation is also generally absent from crisis related ontologies.

A few ontologies have been designed for modelling events in crisis situations, such as MOAC¹⁸ (Management of a Crisis) and HXL (Humanitarian eXchange Language) [11]. However, despite modelling resources, processes, damages and disasters (fire, people trapped, medical emergency), these models do not provide representations for documents and reports. The need for more complete models was highlighted by Liu et al. [12]. Moreover, existing semantic models were mostly designed for providing a static view of emergency situations, where elements are captured but not their temporal evolution.

In terms of document representation, the CURIO¹⁹ ontology (Collaborative User Resource Interaction Ontology) provides means for representing the collection of documents in an emergency context. However, it only provides a simple model of events without the concept of event situations. The CURIO ontology nevertheless shares some similarities with the COMRADES model as it reuses many concepts from the SIOC²⁰ ontology [13].

11.2 Other Ontologies

Most of the ontologies reused in the COMRADES ontology are widely used. Reusing such ontologies improves the reusability of the model by allowing it to be used similarly to existing ontologies. The COMRADES ontology reuses five different ontologies for modelling its components and properties. The main ontology reused for representing the different elements of the COMRADES model is the SIOC ontology [13], which provides constructs for representing online communities. We reuse the SIOC ontology for representing documents, reports, collections, permissions and roles as well as different properties and relations of the model. We also reuse the FOAF²¹ (Friend Of A Friend) ontology for representing users in the model as it integrates well with the SIOC ontology and provides ways of representing agents and organisations. For modelling geolocation, we use the Geonames²² and WGS84²³ ontologies as they provide basic representations of geolocation coordinates that can be used for identifying the location of events and other resources. The Dublin Core²⁴ model is also used as it provides many properties, relations and classes specifically designed for modelling documents. Finally, for representing the trustworthiness of the different content of the platform, we use the Veracity ontology [10] as it provides methods for asserting the reliability of different resources. The different mappings are described in Figure 3.

17 Veracity Ontology, http://purl.org/net/veracity/ns.
18 MOAC, http://www.observedchange.com/moac/ns/.
19 CURIO, http://purl.org/net/curio/.
20 SIOC, http://rdfs.org/sioc/spec/.
21 FOAF, http://xmlns.com/foaf/spec/.
22 Geonames Ontology, http://www.geonames.org/ontology#.
23 WGS84 Ontology, https://www.w3.org/2003/01/geo/#vocabulary.
24 Dublin Core, http://dublincore.org/documents/dcmi-terms/.


12 Multilingual Support

One of the aims of the COMRADES model is to support multiple languages so that the model can be used by different communities around the world. To do so, we translate the names of the classes, properties and relations of the ontology into different languages using the language tagging features of RDF [14]. For example, we translate the com:Organisation class as follows:

    com:Organisation rdfs:isDefinedBy com: ;
        a rdfs:Class, owl:Class ;
        rdfs:comment "An organisation"@en ;
        rdfs:label "Organisation"@en ;
        rdfs:comment "Une organisation"@fr ;
        rdfs:label "Organisation"@fr ;
        rdfs:comment "Una organización"@es ;
        rdfs:label "Organización"@es ;
        rdfs:subClassOf foaf:Group .

At the moment, we only translate labels into Spanish and French and do not translate the description of the ontology classes, properties and relations. Nevertheless, such translations can be added later if necessary. This does not affect the usage of the COMRADES model as the ontological concepts are translation independent.
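An application consuming the ontology can use the language tags to pick the label shown to a user. The sketch below is a minimal illustration that stores the labels of com:Organisation in a plain dictionary instead of reading them from an RDF store; the helper name `label_for` and the English fallback rule are assumptions.

```python
# Labels of com:Organisation keyed by language tag, as given in the example above.
LABELS = {
    "com:Organisation": {
        "en": "Organisation",
        "fr": "Organisation",
        "es": "Organización",
    },
}


def label_for(term, language, fallback="en"):
    """Return the label of a term in the requested language, or the fallback label."""
    labels = LABELS.get(term, {})
    return labels.get(language, labels.get(fallback))


print(label_for("com:Organisation", "es"))  # Organización
print(label_for("com:Organisation", "de"))  # Organisation (no German label, falls back to English)
```

Because the class identifier stays the same whatever label is displayed, adding a language only requires new label entries and leaves existing data untouched.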

13 Domain Knowledge

The specification of domain knowledge in the COMRADES ontology is mostly centred on: 1) the definition of user organisations, religious groups and ethnic groups; 2) the specification of report types, document types and event types; and 3) the definition of categories, entity types and entity statuses. Although different methods can be used for creating such resources, such as creating domain specific gazetteers, we decided not to enforce any specific domain knowledge in order to simplify the integration of the COMRADES model into existing datasets and tools. Each tool and dataset can specify its own domain knowledge depending on the specifics of the model usage. If interoperability between different datasets or models is required, resources can be linked to external entity resources such as DBpedia²⁵ so that similar entities or resources can be identified more easily even if the COMRADES ontology is used in different contexts.
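The linking idea can be sketched as follows: two datasets keep their own local identifiers for organisations, and entries that point to the same external resource are recognised as the same entity. The dataset names, local identifiers and links are hypothetical, and the DBpedia URI is used purely as an illustrative shared identifier.

```python
from collections import defaultdict

# Hypothetical links from local identifiers to an external resource URI.
LINKS = [
    ("datasetA:org/12", "http://dbpedia.org/resource/Oxfam"),
    ("datasetB:ngo/oxfam", "http://dbpedia.org/resource/Oxfam"),
    ("datasetB:ngo/local-shelter", None),  # no external link available
]


def group_by_external(links):
    """Group local identifiers that are linked to the same external URI."""
    groups = defaultdict(list)
    for local_id, external_uri in links:
        if external_uri is not None:
            groups[external_uri].append(local_id)
    return dict(groups)


print(group_by_external(LINKS))
```

Resources without an external link simply remain local to their dataset, which is consistent with the choice of not enforcing any default domain knowledge.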

14 Summary

In the previous sections we introduced the COMRADES ontology based on the ORSD. First, we analysed the competency questions and the ORSD glossary in order to create a high level version of the COMRADES model. Second, we implemented the ontology using RDF/OWL and aligned the implemented ontology with existing ontological models. We also translated the ontological classes, properties and relations into different languages to simplify the usage of the ontology in different communities. During the model development, we decided not to implement any specific domain knowledge, since enforcing default domain knowledge can complicate the integration of the model into existing tools. Rather than proposing default domain knowledge, the COMRADES model provides classes that can be extended depending on the model usage or the integrated datasets. This allows for a more targeted usage of the model and a simpler integration of the model into existing applications or tools.

25 DBpedia, http://dbpedia.org.


Part III: Model Evaluation
Theoretical Model Evaluation

15 Introduction

Although different methods can be used for evaluating ontologies, many of them rely on mapping existing data and then evaluating whether the competency questions can be verified on real data. Since we do not have datasets that cover all the parts of the COMRADES ontology, we decided to perform a theoretical evaluation by checking whether the classes and properties of the COMRADES ontology can be mapped to the competency questions. In the following section, we discuss the evaluation approach and how competency questions are mapped to the properties, relations and classes of the COMRADES model. We also show how the current model represents the current competency questions.
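The mapping check can be pictured as a search for a path between ontology classes. The sketch below is an assumption-laden illustration rather than the procedure used by the authors: it takes a handful of relations that appear in the paths of Table 6, treats them as directed edges, and uses a breadth-first search to decide whether two classes are connected.

```python
from collections import deque

# A few relations as (subject class, relation, object class) triples.
# Treating them as directed edges is an assumption of this sketch.
EDGES = [
    ("com:Report", "com:scope", "com:Role"),
    ("com:Role", "com:role", "com:Account"),
    ("com:Account", "foaf:account", "com:Agent"),
    ("com:Report", "com:task", "com:Task"),
    ("com:Task", "com:assigned_to", "com:Account"),
]


def find_path(start, goal, edges=EDGES):
    """Breadth-first search returning the relations linking start to goal, or None."""
    adjacency = {}
    for subject, relation, obj in edges:
        adjacency.setdefault(subject, []).append((relation, obj))
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for relation, neighbour in adjacency.get(node, []):
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, path + [relation]))
    return None


# "Who has access to the report?" needs a path from a report to an agent.
print(find_path("com:Report", "com:Agent"))
# No path exists in the opposite direction with these directed edges.
print(find_path("com:Agent", "com:Report"))
```

A competency question would be counted as validated when such a path exists between the classes and properties it involves.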

16 Ontology Evaluation In order to evaluate the COMRADES ontology, we first extract the key classes, properties and relations associated with each competency questions. Then, we check if a path exists between each element of the extracted properties, relations and classes. Finally, we assert if a competency question is validated based on the path existence. 16.1 Competency Questions Mapping For each competency question, we list the classes and relations that needs to be connected and evaluate if the competency question is validated (i.e. if there is a path between the classes, relations and properties associated with the competency question). The mapping and results for each competency question is listed below (Table 6): CQ

PATH

CQ VALID?

CQ1

com:Message

Yes (COUNT)

CQ3

com:Message → com:language

Yes

CQ5

com:Message →, sioc:topic

Yes (with SIOC)

CQ6

com:Message → com:DocumentEntity

Yes (LIST)

CQ7

com:Message

Yes (LIST)

CQ8

com:Message / vo:Proposition → vo:has_trustworthiness → vo:Trustworthiness → ( vo:trusted; vo:is_asserting → vo:TrustworthinesssAssertion → vo:confidence)

Yes

CQ9

Similar to CQ8

Yes

CQ10

com:Agent → foaf:account → com:Account

Yes (COUNT)

CQ11

com:Event

Yes (COUNT)

Copyright 2016 Grégoire Burel ©

40 | P a g e


D4.1 Enriched Semantic Models of Emergency Events CQ12

com:Event

Yes (LIST)

CQ13

com:Event → com:describes → com:Report → com:describes → com:Report

Yes (Report)

CQ14

com:Entity → com:state → com:State

Yes (COUNT)

CQ15

com:Entity → com:state → com:State

Yes (COUNT)

CQ16

com:Event → com:geolocation → com:Geolocation

Yes

CQ17

com:Event → com:startTime

Yes

CQ18

com:Event → com:describes → com:Report → com:informant → com:Informant

Yes

CQ19

com:Event → com:describes → com:Report → com:source → com:Document

Yes (LIST)

CQ20

com:Event → com:describes → com:Report → com:informant → com:Informant

Yes (Agent properties)

CQ21

com:Picture → com:source → com:Report → com:describes → com:Event

Yes (LIST)

CQ22

com:Report → ( com:language; com:describes → com:Event)

Yes

CQ23

com:Report → com:scope → com:Role → com:role → com:Account → foaf:account → com:Agent

Yes

CQ24

Similar to CQ8

Yes

CQ25

com:Report → com:approvalStatus

Yes

CQ26

com:Report → com:task → com:Task

Yes (LIST)

CQ28

com:Task → com:assigned_to → com:Account

Yes

CQ29

com:Document; com:Informant

Yes (LIST)

CQ30

com:Report

Yes (LIST)

CQ31

com:Media

Yes (LIST)

CQ32

com:Category

Yes (LIST)

CQ33

com:Collection

Yes (LIST)

CQ34

com:Role

Yes (LIST)

CQ35

com:Agent → foaf:account → com:Account → com:created

Yes

CQ36

com:Agent → foaf:account → com:Account → com:updated

Yes

CQ37

com:Document → com:created

Yes

CQ38

com:Document → com:updated

Yes

CQ39

com:Message → com:created

Yes

CQ40

com:Message → com:updated

Yes

CQ41

com:Document → com:created

Yes

CQ42

com:Document → com:updated

Yes

CQ43

com:Media → com:created

Yes

CQ44

com:Media→ com:created

Yes

CQ45

com:Category → com:created

Yes

CQ46

com:Collection → com:created

Yes

CQ47

com:Collection → com:updated

Yes

CQ48

com:Agent → com:email

Yes

CQ49

com:Agent → com:realName

Yes

CQ50

com:Agent → foaf:account → com:Account → com:role → com:Role

Yes

CQ51

com:Agent → foaf:account → com:Account → com:role → com:Role → com:permission → com:Permission

Yes

CQ52

com:Document → com:title

Yes

Copyright 2016 Grégoire Burel ©

41 | P a g e


D4.1 Enriched Semantic Models of Emergency Events CQ53

com:Document → com:content

Yes

CQ54

com:Report → com:source → com:Document

Yes (Report)

CQ55

com:Document → com:geolocation → com:Geolocation

Yes

CQ56

com:Document

Yes (LIST)

CQ57

com:Document → com:created → com:Account → foaf:account → com:Agent

Yes

CQ58

com:Document → sioc:topic

Yes (with SIOC)

CQ59

com:Media → com:description

Yes

CQ60

com:Media → com:collection → com:Collection

Yes (LIST)

CQ61

com:Category

Yes (LIST)

CQ62

com:Category → com:description

Yes

CQ63

com:Category → com:created→ com:Account → foaf:account → com:Agent

Yes

CQ64

com:Category → com:parent_category → com:Category

Yes

CQ65

com:Collection → com:title

Yes

CQ66

com:Collection → com:description

Yes

CQ67

com:Collection → com:created → com:Account → foaf:account → com:Agent

Yes

CQ68

com:Collection → com:collection → com:Document

Yes (LIST)

CQ69

com:Collection → com:scope → com:Role → com:role → com:Account → foaf:account → com:Agent

Yes (LIST)

CQ70

com:Role → dc:title

Yes (with dcterms)

CQ71

com:Role → dc:description

Yes (with dcterms)

CQ72

com:Role → com:permission → com:Permission

Yes (LIST)

CQ73

com:Document → com:title

Yes

CQ74

com:Report → com:source → com:Media

Yes (Report)

CQ75

com:Report → com:informant → com:Informant

Yes (Report)

CQ76

com:Report → com:source → com:Document

Yes (Report/LIST)

CQ77

com:Event → com:geolocation → com:Geolocation → com:precision

Yes

CQ78

com:Report

Yes (Report)

CQ79

com:Document → com:language

Yes

CQ80

com:Document → com:englishTranslation

Yes

CQ81

com:Document → com:informativeness

Yes

CQ82

com:Document → com:informativeness

Yes

CQ83

com:Event → com:results_in → com:Situation → com:involves → com:Entity

Yes (Entity)

CQ84

com:Event → com:results_in → com:Situation → com:involves → com:Entity

Yes (Entity)

CQ85

com:Event → com:results_in → com:Situation → com:involves → com:Entity

Yes (Entity)

CQ86

com:Event → com:impact

Yes

CQ87

com:Report → com:describes → (com:Event; com:Situation) → com:describes → com:Report

Yes (Report/LIST)

CQ88

com:Event → com:polarity

Yes

CQ89

com:Document → com:favourites

Yes

CQ90

com:Message → com:addressed_to → com:Account

Yes

CQ91

com:Event → com:results_in → com:Event

Yes (ASK)

CQ92

com:Document → com:shares

Yes

CQ93

com:Event → com:startDate

Yes

CQ94

com:Entity → com:lifespan

Yes (Entity)

CQ95

com:Agent

Yes (LIST)

CQ96

com:Agent → com:description

Yes

CQ97

com:Account → com:favourites

Yes

CQ98

com:Account → com:followed_by → com:Account

Yes (LIST)

CQ99

com:Account → com:follows → com:Account

Yes (LIST)

CQ100

com:Agent → foaf:member → com:Organisation

Yes

CQ101

com:Agent → foaf:member → com:Organisation

Yes

CQ102

com:Agent → foaf:member → com:Organisation

Yes

Table 6 Competency questions ontology mappings and evaluation.
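
To illustrate how a mapping from Table 6 could be checked once data is available, the following minimal sketch walks a chain of properties over a small in-memory list of triples. It is not part of the COMRADES implementation: the `ex:` instances, the `TRIPLES` data and the helper names (`instances_of`, `follow`, `evaluate_path`) are hypothetical and introduced only for illustration. The path used is the CQ50 mapping, com:Agent → foaf:account → com:Account → com:role → com:Role.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]

# Toy data: hypothetical instances invented for this illustration only.
TRIPLES: List[Triple] = [
    ("ex:alice", "rdf:type", "com:Agent"),
    ("ex:alice", "foaf:account", "ex:alice_account"),
    ("ex:alice_account", "rdf:type", "com:Account"),
    ("ex:alice_account", "com:role", "ex:admin"),
    ("ex:admin", "rdf:type", "com:Role"),
    ("ex:bob", "rdf:type", "com:Agent"),  # an agent without any account
]


def instances_of(triples: Iterable[Triple], cls: str) -> Set[str]:
    """Return every subject explicitly typed with the given class."""
    return {s for s, p, o in triples if p == "rdf:type" and o == cls}


def follow(triples: Iterable[Triple], starts: Set[str], predicate: str) -> Set[str]:
    """Return the objects reachable from any start node via one predicate."""
    return {o for s, p, o in triples if p == predicate and s in starts}


def evaluate_path(triples: List[Triple], start_class: str, predicates: List[str]) -> Set[str]:
    """Walk a chain of predicates starting from all instances of start_class."""
    nodes = instances_of(triples, start_class)
    for predicate in predicates:
        nodes = follow(triples, nodes, predicate)
    return nodes


# CQ50: which roles are held by agents (through their accounts)?
roles = evaluate_path(TRIPLES, "com:Agent", ["foaf:account", "com:role"])
print(sorted(roles))  # ['ex:admin']
```

In a real deployment the same check would more naturally be written as a query against the populated triple store; the sketch only shows that each row of the table reads as an alternating sequence of classes and properties that can be traversed step by step.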

16.2 Results

As Table 6 shows, the model can represent all the competency questions. However, some mappings are not provided directly by the COMRADES ontology but rely on the ontologies merged with it. For instance, the topic of a com:Message is not modelled directly by the COMRADES model but can be represented through the sioc:topic relation. In addition, some competency questions are ambiguous because they use the term “document” loosely: in the implementation of the COMRADES model, some of those “documents” are actually reports, and we corrected the corresponding mappings when validating the competency questions. Finally, some competency questions are not represented directly but can be represented by adding subclasses to the existing model. For instance, the com:Agent involved in a com:Event can be represented through a com:Situation and a new type of com:Entity.
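
The extension by subclassing mentioned above can be sketched as follows. This is only an assumption about how such an extension might look: the class `ex:AffectedAgent`, the `ex:` instances and the helper functions are hypothetical and do not appear in the COMRADES ontology. The sketch follows the com:Event → com:results_in → com:Situation → com:involves → com:Entity path from Table 6 and keeps the entities whose type is the requested class or one of its subclasses.

```python
# Toy triples: the subclass ex:AffectedAgent and all ex: instances are
# hypothetical and only illustrate extending com:Entity with a new type.
TRIPLES = {
    ("ex:AffectedAgent", "rdfs:subClassOf", "com:Entity"),
    ("ex:flood", "rdf:type", "com:Event"),
    ("ex:flood", "com:results_in", "ex:evacuation"),
    ("ex:evacuation", "rdf:type", "com:Situation"),
    ("ex:evacuation", "com:involves", "ex:resident_1"),
    ("ex:resident_1", "rdf:type", "ex:AffectedAgent"),
    ("ex:evacuation", "com:involves", "ex:bridge"),
    ("ex:bridge", "rdf:type", "com:Entity"),
}


def objects(subject, predicate):
    """Return all objects of the triples matching (subject, predicate, ?)."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def is_instance_of(node, cls):
    """True if node is typed with cls or with a (transitive) subclass of cls."""
    pending = list(objects(node, "rdf:type"))
    seen = set()
    while pending:
        current = pending.pop()
        if current == cls:
            return True
        if current not in seen:
            seen.add(current)
            pending.extend(objects(current, "rdfs:subClassOf"))
    return False


def involved(event, cls):
    """Entities of type cls involved in the situations an event results in."""
    return {
        entity
        for situation in objects(event, "com:results_in")
        for entity in objects(situation, "com:involves")
        if is_instance_of(entity, cls)
    }


agents = involved("ex:flood", "ex:AffectedAgent")  # only the new subclass
entities = involved("ex:flood", "com:Entity")      # the subclass is included
print(sorted(agents), sorted(entities))
```

Because the new class is declared as a subclass of com:Entity, queries written against com:Entity keep working unchanged, while queries against the subclass return only the agents.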

17 Conclusions

We introduced the COMRADES ontology, a model that supports the representation of events and related information during emergency crises. We based the development of the model on the NeOn methodology [1] and on a qualitative and structural design approach [2]. After creating an Ontology Requirements Specification Document (ORSD) [3], we implemented the model using semantic web technologies (RDF/OWL) and linked the newly developed data structures to existing ontologies such as FOAF and SIOC.

The model is not yet populated with the input and output data of the different components of the COMRADES platform, since these components are still under development. We therefore provided a partial evaluation of the COMRADES ontology by mapping a list of competency questions to its properties, relations and classes. Competency questions are commonly used in ontology evaluation to test whether a model can answer all the required queries.

Since the needs and requirements of the COMRADES resilience platform are likely to evolve during the project, we designed the model to be easily extensible. For instance, additional types of data and reports can be added to the model, and new types of events or resources can be specified. Further evaluations will be performed on the model in the COMRADES platform when more data becomes available.

Appendix

18 References

[1] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The NeOn methodology for ontology engineering, in: Ontol. Eng. a Networked World, 2012: pp. 9–34. doi:10.1007/978-3-642-24794-1_2.
[2] G. Burel, Community and Thread Methods for Identifying Best Answers in Online Question Answering Communities, (2016). http://oro.open.ac.uk/46144/ (accessed November 30, 2016).
[3] M.C. Suárez-Figueroa, A. Gómez-Pérez, B. Villazón-Terrazas, How to write and use the ontology requirements specification document, in: Lect. Notes Comput. Sci. (Including Subser. Lect. Notes Artif. Intell. Lect. Notes Bioinformatics), 2009: pp. 966–982. doi:10.1007/978-3-642-05151-7_16.
[4] M.F. Lopez, A. Gomez-Perez, J.P. Sierra, A.P. Sierra, Building a chemical ontology using Methontology and the Ontology Design Environment, IEEE Intell. Syst. 14 (1999) 37–46. doi:10.1109/5254.747904.
[5] S. Staab, R. Studer, H.P. Schnurr, Y. Sure, Knowledge processes and ontologies, IEEE Intell. Syst. Their Appl. 16 (2001) 26–34. doi:10.1109/5254.912382.
[6] D. Vrandecic, S. Pinto, C. Tempich, Y. Sure, The DILIGENT knowledge processes, J. Knowl. Manag. 9 (2005) 85–96. doi:10.1108/13673270510622474.
[7] M.C. Suárez-Figueroa, A. Gómez-Pérez, M. Fernández-López, The NeOn methodology framework: a scenario-based methodology for ontology development, Appl. Ontol. 10 (2015) 107–145. doi:10.1007/978-3-642-24794-1.
[8] P. Schrodt, Ö. Yilmaz, The CAMEO (conflict and mediation event observations) actor coding framework, Annu. Meet. …. (2008). http://eventdata.parusanalytics.com/papers.dir/APSA.2005.pdf (accessed December 13, 2016).
[9] A. Pérez, M.D.F. Baonza, B. Villazón, NeOn methodology for building ontology networks: Ontology specification, Methodology. (2008) 1–18. doi:10.1016/j.landurbplan.2011.04.007.
[10] G. Burel, A.E.C. Basave, M. Rowe, A. Sosa, Representing, proving and sharing trustworthiness of web resources using Veracity, Knowl. Eng. Manag. by Masses. (2010) 421–430. http://ekaw2010.inesc-id.pt/accepted_short_papers.html.
[11] C. Keßler, C. Hendrix, The Humanitarian eXchange Language: Coordinating disaster response with semantic web technologies, Semant. Web. 6 (2015) 5–21. doi:10.3233/SW-130130.
[12] S. Liu, D. Shaw, C. Brewster, Ontologies for crisis management: a review of state of the art in ontology design and usability, ISCRAM 2013 - 10th Int. Conf. Inf. Syst. Cris. Response Manag. (2013) 349–359. http://windermere.aston.ac.uk/~kiffer/papers/Liu_ISCRAM13.pdf.
[13] J.G. Breslin, S. Decker, SIOC: an approach to connect web-based communities, Int. J. Web Based Communities. 2 (2006) 133–142. doi:10.1504/IJWBC.2006.010305.

[14] E. Montiel-Ponsoda, D. Vila-Suero, B. Villazón-Terrazas, G. Dunsire, E.E. Rodríguez, A. Gómez-Pérez, Style guidelines for naming and labeling ontologies in the multilingual web, Int. Conf. Dublin Core Metadata Appl. (2011) 105–115. http://dcpapers.dublincore.org/pubs/article/view/3626; http://oa.upm.es/12469/.
