SmartSociety
Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Grant Agreement No. 600854
Deliverable D4.2, Work Package 4
Peer Search in Smart Societies
Dissemination Level (Confidentiality)¹: PU
Delivery Date in Annex I: December 31, 2014
Actual Delivery Date: January 16, 2015
Status²: F
Total Number of pages: 71
Keywords: peers, profiles, privacy protection, search, ranking
¹ PU: Public; RE: Restricted to Group; PP: Restricted to Programme; CO: Consortium Confidential as specified in the Grant Agreement
² F: Final; D: Draft; RD: Revised Draft
c SmartSociety Consortium 2013-2017
Deliverable D4.2
http://www.smart-society-project.eu
Disclaimer
This document contains material, which is the copyright of SmartSociety Consortium parties, and no copying or distributing, in any form or by any means, is allowed without the prior written agreement of the owner of the property rights. The commercial use of any information contained in this document may require a license from the proprietor of that information. Neither the SmartSociety Consortium as a whole, nor a certain party of the SmartSociety Consortium warrant that the information contained in this document is suitable for use, nor that the use of the information is free from risk, and accepts no liability for loss or damage suffered by any person using this information. This document reflects only the authors’ view. The European Community is not liable for any use that may be made of the information contained herein.
Full project title: SmartSociety: Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Project Acronym: SmartSociety
Grant Agreement Number: 600854
Number and title of workpackage: 4 Peer Modeling and Search
Document title: Peer Search in Smart Societies
Work-package leader: Alethia Hume, UNITN
Deliverable owner: Alethia Hume, UNITN
Quality Assessor: Hong-Linh Truong, TU WIEN
List of Contributors
Partner Acronym    Contributor
UNITN              Ronald Chenu-Abente
UNITN              Vincenzo Maltese
UNITN              Alethia Hume
UNITN              Uladzimir Kharkevich
KU                 Simone Fischer-Hübner
KU                 Leonardo Martucci
Executive Summary
This deliverable reports the evolution of the work done in WP4 during the second year of the project. In particular, it reports the work done on tasks T4.1 (privacy-friendly profiling of peers and representation of atomic tasks), T4.2 (matching atomic tasks with peers), and T4.3 (ranking matching results). Furthermore, the appendix and the year-two SmartSociety Platform integrated demo also contain the first results from T4.4 (implementation and application). In general, the main focus of this year of work was on defining models and mechanisms to accomplish these tasks, building on and complying with the requirements identified during the first year.
The main WP4-related outcome during the second year of the SmartSociety project is the definition and early prototype implementation of the Peer Manager component, which is projected to become the main storage for peer information in the SmartSociety platform and a fundamental building block for managing the information of peers in a privacy-preserving manner. More specifically, the definition of the Peer Manager presented in this document includes:
1. An attribute-based model for data and knowledge representation, which provides an underlying model for the representation of information related to peers (i.e., humans/machines and collectives) and the tasks that they can perform.
2. A model for privacy-aware storage and sharing that introduces the main elements for the implementation of a privacy-preserving Peer Manager.
3. The mechanisms for searching and ranking peers based on the attributes that allow them to solve a task. These mechanisms are designed upon a privacy-preserving framework and therefore pay close attention to privacy issues.
4. The definition of a privacy-by-design architecture for the Peer Manager that is able to provide the previous services effectively while complying with current European laws on privacy and enforcing the privacy policies defined by each user.
While an initial prototype of the Peer Manager was developed during this year, a more complete implementation and integration of models and mechanisms proposed during this year is planned for the next year of the project (year three of SmartSociety).
Table of Contents
1 Introduction
2 Attribute-based Peer and Task Modeling
  2.1 Ground Knowledge
  2.2 The Peer Manager Core Ontology
    2.2.1 Guidelines followed for the development of the ontology
    2.2.2 Requirements
    2.2.3 Entity classes, relations and attributes
  2.3 Core Entity Types
3 Attribute-based Privacy-enhanced Model
  3.1 Requirements
  3.2 The Peer Manager Model Privacy-enhancing Structures
    3.2.1 Peers as distributed storage providers
    3.2.2 Users structures as subject pseudonyms
    3.2.3 Profiles as object indirections
4 Privacy-aware Search and Ranking
  4.1 Attribute-based Search: Defining Search Constraints
  4.2 Attribute-based Ranking: Defining Order or Distance Function
  4.3 Privacy-aware Search
5 The Peer Manager Architecture
  5.1 Peer Manager Internal Architecture
  5.2 The Peer Base Structures and Hybridity Support
  5.3 The Peer Manager as Part of the SmartSociety Platform
  5.4 Second Year Peer Manager Integrated Proof of Concept
6 Conclusion and Final Remarks
A The Peer Manager Entity Types
  A.1 Entity
  A.2 Agent
  A.3 Person
  A.4 Software agent
  A.5 Collective
  A.6 Facility
  A.7 Organization
  A.8 Location
  A.9 Artifact
  A.10 Mind Product
  A.11 Document
  A.12 File
  A.13 Event
  A.14 Task
  A.15 Role
  A.16 Address
B Study on Personal Data
C Aligning WP1 and WP2 Formal Models
D Ethical and Privacy issues
  D.1 Social and Ethical Issues of Profiles
  D.2 Privacy Issues of Profiles
E Policy Example
F Implementation
  F.1 Implementation Details and Choices
  F.2 Year Two Peer Manager Integrated Proof-of-Concept
  F.3 Proof-of-Concept Integration Calls Specification
1 Introduction
The overall objective of WP4 in the SmartSociety project is to provide mechanisms for the representation of peers and atomic tasks, and to enable finding peers based on the characteristics that allow them to fulfil the requirements of a given atomic task. Because the project deals with peers that can be human, and therefore with possibly sensitive data (for instance, personal information), WP4 must also account for the privacy concerns that may arise in relation to these objectives.
During the first year of the project we focused on the analysis and study of requirements derived from the above-mentioned objectives within the context of hybrid and diversity-aware collective adaptive systems (HDA-CAS). The results of this initial work were reported in deliverable D4.1, which included early ideas related to the representation of peers (using profiles) as well as the introduction of instruments and mechanisms that can be used to construct profiles of peers. Deliverable D4.1 also included the identification and discussion of legal privacy requirements for such a system.
Finally, in order to ground the discussions in this deliverable in concrete examples, we introduce a few reference scenarios that are used in the rest of the deliverable to explain how the proposed mechanisms can be concretely applied.
• Example Scenario 1. Maria lives in Milan and, for work reasons, has to commute to Trento on a periodic basis. She has a car but usually travels alone, leaving her with 4 empty seats in her car. In order to share some of the expenses of the trips, and maybe also to have someone to talk to, she wishes to share rides with other commuters (offer rides). However, she is concerned about sharing her private information openly in social networks or other platforms that may make this information public and easily accessible to everyone. As such, she would like to reveal only the minimal information required for the ride sharing and, above all, she wishes to maintain full control over her personal data should she decide to withdraw from the platform.
• Example Scenario 2. Marco, who has lived in Milan his whole life, is passionate about his city and considers himself quite an expert at recommending restaurants of different types. Given that he has some spare time every now and then, he wishes to use this time and his knowledge to help tourists find good places to eat. He does not want this activity to become a full-time job, but rather something that he can do in a very simple way and without strict commitments, using different communication channels to respond to the recommendation request tasks.
• Example Scenario 3. Carlo Rossi is a doctor and a board member at the CAS city hospital, which makes organizing his agenda very complicated. To manage his always full agenda he relies on his assistant, who has to coordinate meetings with other board members or colleagues and schedule appointments for his patients, among other activities. To make requesting appointments easier, Dr. Rossi also wishes to share his personal agenda directly with his colleagues, patients and even family members, in such a way that they can directly see when there are available slots. Nevertheless, he does not want to share every little detail of his agenda with everyone. Instead, he wants to choose which details to share with each group of people and to grant permission to directly edit his agenda only to his assistant.
The scenarios described here can be considered within macro scenarios in the areas of transportation, tourism, and health care respectively. These are areas of interest for the project that, in turn, are related to scenarios defined in WP9. In particular, the first scenario maps to the SmartShare pilot developed mainly in WPs 2, 5 and 6, and the second corresponds to the Ask SmartSociety application that integrates the work of WPs 4, 6, 7 and 8.
The remainder of the deliverable is organized as follows. Section 2 introduces the model for data and knowledge representation used in the Peer Manager, while the storage and privacy protection model is presented in Section 3. The mechanisms for privacy-aware search and ranking of search results are introduced in Section 4. Section 5 presents the internal architecture of the Peer Manager and Section 6 concludes the deliverable.
2 Attribute-based Peer and Task Modeling
The main purpose of this section is to provide an underlying model for representing knowledge and data related to peers participating in collectives and the tasks that they can perform (i.e., a data and knowledge representation model). This model is provided by the Peer Manager (WP4) and is meant to give a “minimum” common ground of specifications for the representation of data and knowledge across SmartSociety applications, favouring semantic interoperability among them. It should have at least the following properties:
• In line with WP4, it needs to be able to describe peers, as well as tasks and the resources employed to perform them.
• In line with WP1, it needs to account for the foundational notions of hybridity (peers are of different nature, i.e. humans/machines and collectives) and diversity (peers have different capabilities, skills and descriptive features).
• To support interoperability, entities defined with the model should be dereferenceable, i.e. they should be associated with an identifier (e.g., a URI) such that any application can refer to them. This is also required for the interaction with the other components of the SmartSociety platform, in particular provenance (WP2) and orchestration (WP6).
• The model should be minimally defined and then extended to accommodate application-specific requirements. This is what we mean by good enough actionable semantics, i.e. the semantics that is strictly required for the tasks at hand.
2.1 Ground Knowledge
In the Peer Manager, the representation of information related to peers builds upon the notion of a semantic schema that follows an entity-centric approach, using the notion of entity to refer to a “thing” that exists in the real world. Within the Peer Manager we formalize this notion of entity and use it as the basic element representing information about peers. In this subsection, we briefly present the evolution of the data and knowledge representation model that was initially discussed in deliverables D1.1 and D4.1. We distinguish between a Knowledge Base (or schema level) containing the “format” to represent information and an Entity Base (or instance level) that instantiates the schema into actual information (i.e., data) [1, 2].
A Knowledge Base (KB) stores the definition of templates for each type of entity used in the system³. These templates establish restrictions on the set of attributes that can be used to describe a given type of entity. Their meaning is further specified by mapping single elements (i.e., types of entities, the names of attributes and their values) to concepts from the underlying ontology, which is also part of the same knowledge base.
• A concept is “an idea of what something is or how it works.”⁴ In the area of knowledge representation, concepts are used to formalize and represent the meaning of words in a language-independent manner. Concepts can be mapped to an underlying ontology, which greatly helps in managing limitations on the shared information (for interoperability) and in identifying purposes (for purpose binding), while also providing the basis for more accurate access control methodologies as introduced in [3, 4].
• An entity type (ET) provides a template for the creation of entities by establishing a set of constraints on the metadata (i.e., attributes) that entities of that type can instantiate. The templates for attributes are defined by means of so-called attribute definitions. An attribute definition (AD) imposes an explicit constraint on the name and the quantitative or qualitative values of a certain attribute that can be associated with an entity.
An Entity Base (EB) stores concrete information about abstract and physical entities that exist in the real world and is represented by the following elements.
• An entity (E) is defined as an abstract or physical object. It can be of different types defined at the knowledge level (e.g., person, location, event, etc.) and is described by attributes (e.g., name, birth date, latitude-longitude, size, duration, etc.), which can be different for different types of entities. It is defined by means of an entity type (ET) and a non-empty set of attributes ({A}) describing the characteristics of the entity.
• An attribute (A) instantiates an attribute definition (AD) to represent a particular characteristic of the entity. An attribute may have multiple values. Its values may be mapped to a meaning in some knowledge base (i.e., semantic values) or may represent a relation to another entity when the value is a reference to another E (i.e., relational attribute). It is formally defined by means of an attribute definition (AD) and a set {V} of attribute values of the type of the corresponding AD.
³ Note that this approach is aligned with the Schema.org (http://schema.org/) initiative.
⁴ Merriam-Webster (http://www.merriam-webster.com/dictionary/concept).
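The split between schema level and instance level can be illustrated with a minimal Python sketch. It is not the Peer Manager implementation: the class names mirror the AD, ET and E notions of this subsection, while the field names (`value_type`, `multivalued`) and the example `Person` type with its attributes are assumptions made only for illustration.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class AttributeDefinition:
    """AD (schema level): constrains the name and value type of an attribute."""
    name: str
    value_type: type
    multivalued: bool = False  # hypothetical flag, not taken from the deliverable


@dataclass(frozen=True)
class EntityType:
    """ET (schema level): a template listing the ADs an entity may instantiate."""
    name: str
    attribute_definitions: tuple[AttributeDefinition, ...]

    def definition(self, attr_name: str) -> AttributeDefinition:
        for ad in self.attribute_definitions:
            if ad.name == attr_name:
                return ad
        raise KeyError(f"{self.name} has no attribute definition '{attr_name}'")


@dataclass
class Entity:
    """E (instance level): an ET plus a non-empty set of attributes {A}."""
    entity_type: EntityType
    attributes: dict[str, list[Any]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.attributes:
            raise ValueError("an entity needs at least one attribute")
        # Every attribute must instantiate an AD of the entity type.
        for name, values in self.attributes.items():
            ad = self.entity_type.definition(name)
            if len(values) > 1 and not ad.multivalued:
                raise ValueError(f"'{name}' accepts a single value")
            for value in values:
                if not isinstance(value, ad.value_type):
                    raise TypeError(f"'{name}' expects {ad.value_type.__name__}")


# Hypothetical entity type and instance, for illustration only.
PERSON = EntityType("Person", (
    AttributeDefinition("name", str),
    AttributeDefinition("language", str, multivalued=True),
))
maria = Entity(PERSON, {"name": ["Maria"], "language": ["Italian", "English"]})
```

In this sketch the entity type acts as the only gatekeeper: an attribute that has no matching attribute definition, or whose values violate it, is rejected when the entity is created. A relational attribute could be expressed in the same way by using `Entity` as the `value_type`.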
2.2 The Peer Manager Core Ontology
2.2.1 Guidelines followed for the development of the ontology
Gruber [5] defines an ontology as a formal, explicit specification of a shared conceptualization. The notion of conceptualization refers to an abstract model of how people theorize (the relevant part of) the world in terms of basic cognitive units called concepts. Explicit specification means that the abstract model is made explicit, for instance by providing terms and definitions for the concepts. In other words, the terms and the definition of a concept provide a specification of its meaning in relation to other concepts. The specification is said to be formal when it is written in a language with formal syntax and formal semantics, i.e. in a logic-based language. The conceptualization is shared in the sense that it captures knowledge that is common to a community of people and therefore concretely represents the level of agreement reached in that community.
Accordingly, we attach to each concept in the ontology a set of terms and a definition. For instance, the concept of “the date on which a person was born” can be associated in English with the terms “date of birth” and “birthday”. In developing the ontology we also comply with the principles of ontological analysis [6] and apply the DERA methodology [2] developed in Trento. The latter draws a clear distinction between ontologies of entity classes, relations and attributes, and gives precise principles, borrowed and adapted from the faceted approach [7], to be followed in order to develop robust and easy-to-extend ontologies. For instance, it is essential to make explicit the criteria followed to arrange concepts that appear at the same level of the ontology (siblings) and the semantic relations between them. We also frequently follow the guidelines provided by ISO for thesauri construction [8], which provide (among others) concrete suggestions about how to appropriately select terms to lexicalize the concepts. Examples of pitfalls to avoid include:
• The usage of labels that indicate mixed notions, e.g. UserAndMachine (two entity classes), ProvideVote (an activity and an attribute name) or PerformMachineNegotiation (an activity and an entity class). Such labels are attempts to make the ontology enumerative, i.e. attempts to list all possible combinations of concepts (instead of leveraging the compositional nature of semantics).
• Forgetting to provide definitions for the concepts, thus leaving them ambiguous. For instance, the term bank may denote a financial institution or a river slope.
• Forgetting to provide explicit semantic relations between concepts. For instance, we can say that person is-a agent and that person member-of collective.
• Forgetting to discriminate between entity classes, relations and attributes. Such concepts need to belong to different ontologies. For instance, it would be wrong to treat price as a utility (an entity); it should rather be an attribute of a utility.
2.2.2 Requirements
The main things that we need to model in the Peer Manager are:
• Agents: an agent can be defined as any entity capable of acting independently and of bringing changes to the world. It can correspond to a single person, a software agent or a collective. In turn, a collective is a group whose members include any number of people, software agents or other collectives.
• Users: users are associated with agents registered in the platform. Users are typically of three kinds: standard, administrator and guest users.
• Tasks: a task corresponds to any piece of work that is undertaken or attempted by one or more agents. It typically requires some pre- and post-requisites, such as specific resources, capabilities and skills. For instance, in the Ride Sharing application a ride requires an available driver who possesses both a vehicle and a driving license.
• Roles: a role represents the parametric profile of the resources, capabilities and skills that an agent is required or expected to own in order to perform or execute a certain task. For instance, roles that can be played in the Ride Sharing scenario are drivers, passengers and ride planners. At any moment in time the same agent can play any of the roles (the first two not at the same time). Notice that, at the current state of technology, drivers and passengers are roles that can only be played by people, but it is expected that this will change quite soon. On the other hand, while in principle a person or a software agent can play the role of ride planner, in the current implementation of the Ride Sharing application it is played by a software agent.
• Entities: they refer to any real-world object, such as locations (e.g. the origin and the destination of a ride), organizations (e.g. a transportation company), artifacts (e.g. vehicles necessary to perform rides, files necessary to store information), or any resource required by, or targeted by, the activities performed by agents. We should distinguish objects (e.g. cars) from their attributes (e.g. color and model), given that the latter are means to describe the former. Entities can be of any kind and their nature is highly domain dependent.
Accordingly, we define the following core ontology specifying concepts (with terms and definitions in English) and the semantic relations between them. In line with the good enough principle, the ontology will be extended further according to the specific applications we may need to serve in the future.
2.2.3 Entity classes, relations and attributes
Here we provide the core ontology of entities.
• Entity :: anything which is perceived or known or inferred to have its own distinct existence
  – [is-a] Agent :: an entity capable to act independently and to bring changes in the world
    ∗ [is-a] Individual :: a single agent
      · [is-a] Person :: an individual human being
      · [is-a] Software agent :: an individual computer program that acts on behalf of a user or another program
    ∗ [is-a] Collective :: any set of entities acting collectively as a single agent
      · [is-a] Organization :: a collective constituted by individuals with common goals and formal rules
      · [is-a] Facility :: anything providing a particular service or used for a particular industry
      · [member-of] Individual
      · [member-of] Collective
  – [is-a] Location :: an entity occupying fixed regions of space
  – [is-a] Artifact :: a physical entity created by a human
  – [is-a] Data store :: a repository of a set of integrated information objects
    ∗ [is-a] File :: a single block of arbitrary information or resource for storing information
  – [is-a] Mind Product :: a product of the human intellect
    ∗ [is-a] Document :: a conventionally text-based work that is used to transfer information
  – [is-a] Event :: anything that happens at a given place and time
    ∗ [is-a] Task :: any piece of work that is undertaken or attempted
• Role :: the specific frame of activities played by an agent that takes part to a specific event
Relations and attributes are directly defined as part of the entity types they belong to (see Appendix A). Here we provide only some examples of attributes.
• Attribute :: any qualitative, quantitative or descriptive property of an entity
  – [is-a] Address :: a place where an entity can be found or communicated with
  – [is-a] Gender :: the properties that distinguish organisms on the basis of their reproductive roles
    ∗ [value-of] male
    ∗ [value-of] female
  – [is-a] Date of birth :: the date on which a person was born
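The [is-a] relations of the core ontology can be queried mechanically. The following Python sketch encodes each concept's parent as we read it from the listing and checks whether one concept is subsumed by another. The function names `ancestors` and `is_a` are illustrative and not part of the Peer Manager, and the [member-of] relations are deliberately left out.

```python
# Parent of each concept along the [is-a] relation, as read from the listing.
IS_A = {
    "Agent": "Entity",
    "Individual": "Agent",
    "Person": "Individual",
    "Software agent": "Individual",
    "Collective": "Agent",
    "Organization": "Collective",
    "Facility": "Collective",
    "Location": "Entity",
    "Artifact": "Entity",
    "Data store": "Entity",
    "File": "Data store",
    "Mind Product": "Entity",
    "Document": "Mind Product",
    "Event": "Entity",
    "Task": "Event",
}


def ancestors(concept: str) -> list[str]:
    """Return the chain of more general concepts, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def is_a(concept: str, other: str) -> bool:
    """True if `concept` equals `other` or is a specialization of it."""
    return concept == other or other in ancestors(concept)


print(ancestors("Person"))       # ['Individual', 'Agent', 'Entity']
print(is_a("Task", "Event"))     # True
print(is_a("Location", "Agent")) # False
```

A single parent per concept is enough here because the core ontology, as listed, forms a tree. An extension with multiple inheritance would need a set of parents per concept instead.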
2.3 Core Entity Types
Following the notation given in D1.1, in this subsection we provide the entity types for the core semantic schema of the Peer Manager. The resulting hierarchy of entity types (as derived from the core ontology) is shown in Figure 1. As we can see, the entity types that are lower in the hierarchy extend their parent classes and therefore (similarly to object orientation) inherit their attribute definitions. More details on the concrete attribute definitions for each entity type are reported in Appendix A. Notice that, in line with the good enough principle, entity types are defined only for those concepts for which it is worth defining at least one attribute, given the applications for which they are foreseen to be used.
Figure 1: Entity classes hierarchy
Additionally, in Appendix B we report the outcome of a study on personal data, since individual human peers are of high relevance in the context of the project. The study is carried out from the perspective of knowledge representation, taking into account philosophical points of view as well as standards that are currently being used for the representation of person attributes. Finally, we need to show the relation between the data and knowledge representation model presented in this section (grounded on WP1’s model) and WP2’s model. Appendix C discusses how the WP1 and WP2 models can be aligned, allowing interoperability with other SmartSociety components (in particular provenance, WP2).
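The inheritance of attribute definitions along the entity type hierarchy can be sketched as follows. This is only an illustration of the principle: the attribute names assigned to each type below are hypothetical placeholders (the actual definitions are those of Appendix A), and the `parent` and `all_attributes` names are our own.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EntityType:
    name: str
    own_attributes: tuple[str, ...]
    parent: Optional["EntityType"] = None

    def all_attributes(self) -> list[str]:
        """Attribute definitions of this type, inherited ones first."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + [a for a in self.own_attributes if a not in inherited]


# Hypothetical attribute assignments, for illustration only.
ENTITY = EntityType("Entity", ("name",))
AGENT = EntityType("Agent", ("address",), parent=ENTITY)
PERSON = EntityType("Person", ("gender", "date of birth"), parent=AGENT)

print(PERSON.all_attributes())
# ['name', 'address', 'gender', 'date of birth']
```

Because a child type only lists what it adds, extending the schema for a new application means declaring a new subtype rather than editing the core types, which fits the good enough principle.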
3 Attribute-based Privacy-enhanced Model
This section extends the model defined in the previous section by adding privacy regulations and considerations, in order to propose a privacy-aware storage and sharing model for entities that we call the Peer Manager Storage and Privacy Protection Model. The entities in this model can act as both subjects and objects of different actions and, for this reason, they can be used to represent either peers or tasks. To meet the privacy and knowledge management requirements defined during the first year of the project, the Storage and Privacy Protection Model is based on the following three core guiding principles:
1. Well-defined separation between information owned by different subjects. To guarantee that subjects control their own data, the Peer Manager creates distributed environments that host the knowledge container of each subject. As the knowledge containers can be physically and logically distributed, the personal information of a subject is isolated from the personal information of other subjects.
2. User-centred identity management. The Peer Manager gives each subject control over the flow of personally identifying information. Furthermore, the identity management system issues pseudonyms and partial identifiers.
3. A deconstruction and re-imagining of information profiling. While the high-level objective of the Peer Manager’s Profiles is in general the same as that of traditional profiles (i.e. an information-holding structure that is maintained and updated separately from the subject to which it refers), we have focused on turning around the regular profiling practices by making them transparent and controllable by the subjects that they refer to.
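To make the second principle more concrete, the sketch below shows one possible way to issue pseudonyms that differ per context and cannot be linked without a secret key. The deliverable does not prescribe how pseudonyms are generated, so the keyed-hash (HMAC) construction, the `PseudonymIssuer` class and the context labels are all assumptions used purely for illustration.

```python
import hashlib
import hmac
import secrets


class PseudonymIssuer:
    """Hypothetical issuer: derives one stable pseudonym per (subject, context)."""

    def __init__(self) -> None:
        # Secret key kept by the issuer; without it pseudonyms cannot be linked.
        self._key = secrets.token_bytes(32)

    def pseudonym(self, subject_id: str, context: str) -> str:
        message = f"{context}\x00{subject_id}".encode("utf-8")
        digest = hmac.new(self._key, message, hashlib.sha256).hexdigest()
        return digest[:16]


issuer = PseudonymIssuer()
ride_alias = issuer.pseudonym("maria", "ride-sharing")
food_alias = issuer.pseudonym("maria", "restaurant-recommendation")
```

With this construction the same subject always obtains the same pseudonym within one context, while pseudonyms used in different contexts look unrelated to anyone who does not hold the key.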
3.1 Requirements
We define our requirements for a privacy-enhanced model taking into account our previous work in D4.1, where we discussed basic privacy principles that apply to peer profiling. As a quick summary of what was explained in that deliverable, the main requirements that the SmartSociety Peer Manager needs to comply with are:
• Legitimacy & informed consent: The collection and processing of personal data in profiles needs to be legitimate, which usually implies that the data subjects have given their informed consent.
• Purpose specification & binding: Personal data used in the context of profiling must be collected for specified and legitimate purposes and may later be used only for those purposes.
• Data minimization: The amount of personal data and the extent to which they are collected and processed in profiles should be minimized, which implies that the data in profiles should be anonymised or pseudonymised whenever possible.
• No sensitive data: The collection and processing of so-called special categories of data in the context of profiling should in principle be prohibited.
• Transparency & data subject rights: Data subjects that are being profiled have the right to access (i.e. to obtain information about) their personal data as well as the right to be informed by the data controller about the logic underpinning the processing of their profile data. Furthermore, data subjects have rights to correction, deletion and blocking of their data.
• Security: The data controller has to implement proper technical and organizational security measures for the protection of personal profile data.
During the second year, the previous requirements were extended by more specific privacy questions that, although considered for the Peer Manager privacy-aware model and platform described in this deliverable, still largely remain challenges to be addressed within SmartSociety in the following years. The three most important of these questions are the following:
• How can the privacy interests of “collectives” (consisting of several individuals and/or machines) be protected? How can collectives be formed in an anonymous manner, i.e. in a way that does not relate to any identified or identifiable person?
• In hybrid systems, peer profiles of machines could include personal data of one or even several data subjects. For instance, in the Care House scenario, sensors capture data about when and for how long health care professionals and patients have met.
• Are anonymous credentials suitable PETs (privacy-enhancing technologies) for enhancing the privacy of passengers and drivers in the SmartSociety car ride sharing pilot? Both drivers and passengers could be pseudonymously registered by the platform, but will the use of anonymous credentials in this context be practically feasible and socially accepted?
These privacy challenges are equally important for a privacy-preserving CAS computing platform. There is related work on privacy in sensor data collection, trust and reputation, and provenance (see e.g., [9, 10, 11]). Furthermore, the ethical implications of data collection, profiling and processing (in a big data manner) are elaborated, together with the previous three questions, in Appendix D. The rest of this deliverable refers to these requirements and challenges when discussing privacy enhancements, but some of them (especially the ones arising from questions raised during this year) will be addressed in future work on the model of the Peer Manager and its implementation in the SmartSociety platform.
3.2 The Peer Manager Model Privacy-enhancing Structures
For the Peer Manager Storage and Privacy Protection Model, we use the concept of Information Peer not in its traditional network management sense but in the more literal sense of “equally privileged participants in an information exchange”. Following this definition, the different agents (i.e. human or machine-based actors, parties or participating stakeholders), represented as peer structures, interact under the same set of rules and are able to become both providers and subscribers of different exchanges or services at different times. To address the previously stated requirements and challenges related to personal data protection, the Peer Manager Model introduces three main structures: the peer, the user, and the profile structures.
3.2.1 Peers as distributed storage providers.
Peer structures are units of storage under the control of the Peer Manager and of the subjects that participate in it. The Peer Manager keeps an entity’s data and knowledge base. Every entity has a Peer Manager and defines the access control policies related to their data. The default policy is the most restrictive one, i.e., the data is available only to the entity that owns it. An entity’s data is kept isolated from the rest. Therefore, the Peer Manager helps to promote the privacy principle of informational self-determination. When a subject registers to the SmartSociety platform, it is assigned a peer structure defined as the tuple P = ⟨ID, KB, EB, ME, {U}⟩ where:
• ID is a unique identifier and a reference number to a peer;
• KB is the id of the Knowledge Base owned by the peer that will be used to store all of the concepts and Etypes that belong to the peer;
• EB is the id of the Entity Base owned by the peer that will be used to store all of the Entities that belong to the peer;
• ME is the id of the Main Entity of the peer, which is stored in EB. This Entity contains the information belonging to the subject that the peer represents; so in the case of a human, the main entity is a person entity, but it could also be a software process or a collective of peers (how the peer structure also enables hybridity in the SmartSociety Platform is shown in Section 5); and
• {U} is a non-empty set of user structures, which is defined below.
The peer structure thus allows each subject in the system to have its own dedicated storages (to which they can then apply their own policies and AC directives) as shown in Figure 2.
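The peer tuple and its default policy can be sketched as a small Python data type. This is only an illustration under stated assumptions: the field names, the string identifiers, the `granted` set and the `can_read` helper are hypothetical and are not prescribed by the model.

```python
from dataclasses import dataclass, field


@dataclass
class Peer:
    """Peer structure P = <ID, KB, EB, ME, {U}> (field names are illustrative)."""
    id: str
    kb: str      # id of the peer's Knowledge Base
    eb: str      # id of the peer's Entity Base
    me: str      # id of the Main Entity, stored in EB
    users: set   # non-empty set of user-structure ids
    # Hypothetical access-control list: ids of peers granted read access.
    granted: set = field(default_factory=set)

    def __post_init__(self):
        if not self.users:
            raise ValueError("a peer needs at least one user structure")

    def can_read(self, requester_id: str) -> bool:
        # Default policy is the most restrictive one: only the owner may read.
        return requester_id == self.id or requester_id in self.granted


maria = Peer(id="peer-1", kb="kb-1", eb="eb-1", me="entity-maria", users={"u-7f3a"})
print(maria.can_read("peer-1"))  # True: the owner always has access
print(maria.can_read("peer-2"))  # False: nobody else has access by default
maria.granted.add("peer-2")      # the owner explicitly grants access
print(maria.can_read("peer-2"))  # True
```

Keeping the grant list empty by default mirrors the "most restrictive" default described above; any wider access has to be an explicit decision of the owning subject.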
Figure 2: A data storage handled by the Peer Manager. Each subject is assigned its own peer storage, while the platform itself offers a shared storage (for Knowledge and Entities) for different interactions.
Figure 2 shows that each Peer_i has its own KB_i (Knowledge Base) and EB_i (Entity Base) assigned and clearly separated from the other peers and the platform. The design of the Peer Manager infrastructure also states that, by default, only the subject in control of a Peer structure has access to it, and allows these Peers to be stored either on the same server or on different machines altogether. Through this, the platform guarantees that each subject will always be in control of the information stored in their assigned peer and that nobody (not even the platform holders) will be able to access this information unless given access by that same subject.
Figure 3: User structures are used instead of the Main Entity when it acts as a subject and profile structures are used for when the Main Entity is read as an object.
3.2.2 User structures as subject pseudonyms.
When interacting with other peers registered in the platform, entities have the option to control the amount of personal data they reveal. User structures (corresponding to pseudonyms that a subject can act under) are introduced to enhance the privacy of all the subjects that participate in actions/interactions in the Smart Society platform. Entities are able to define N user structures (corresponding to N different pseudonyms) defined as tuples U = ⟨UN, AUTH, P, MPD, {PD}⟩, where:
• UN is an alphanumeric string used as the unique identifier of the user structure. This string is a pseudonym for the entity that controls the peer;
• AUTH is an authentication token that is issued by the platform as a proof of the peer’s identity;
• P is the id of the peer to which the user structure corresponds;
• MPD is the id of the Main Profile Definition structure that is applied to the peer’s ME. It is used to obfuscate (by pseudonymization or anonymization) the link between the user structure and the peer that owns it. The resulting Entity Profile is associated with the user structure and (depending on its configuration) may provide anything from no identification to full identification of the entity that controls it;
• {PD} is a possibly empty set of profile definitions that subjects create for their entities (e.g. their events, physical and logical objects, and other partial identities) and that are linked to the current user structure.
The left side of Figure 3 shows a user structure being used instead of the ME as the subject of the action “posted a comment”. As this example illustrates, the user structure corresponds to a pseudonym for the peer. To achieve a high degree of privacy/unlinkability, different user structures (i.e. different transaction pseudonyms) could be used for different actions.
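To illustrate the idea of transaction pseudonyms, the following sketch issues a fresh user structure for every action. It is a minimal illustration under stated assumptions: the field names, the `new_transaction_pseudonym` and `post_comment` helpers and the use of random hexadecimal identifiers are hypothetical choices, not the platform's actual mechanism.

```python
import secrets
from dataclasses import dataclass, field


@dataclass(frozen=True)
class User:
    """User structure U = <UN, AUTH, P, MPD, {PD}> (illustrative field names)."""
    un: str        # pseudonym shown to other peers
    auth: str      # authentication token issued by the platform
    peer_id: str   # id of the owning peer, never exposed in actions
    mpd: str       # id of the main profile definition
    pds: frozenset = field(default_factory=frozenset)


def new_transaction_pseudonym(peer_id: str, mpd: str) -> User:
    # Random identifiers: nothing in UN is derived from the peer id.
    return User(un="u-" + secrets.token_hex(8),
                auth=secrets.token_hex(16),
                peer_id=peer_id,
                mpd=mpd)


def post_comment(user: User, text: str) -> dict:
    # Only the pseudonym appears as the subject of the action.
    return {"subject": user.un, "action": "posted a comment", "text": text}


a = post_comment(new_transaction_pseudonym("peer-1", "pd-main"), "Nice ride!")
b = post_comment(new_transaction_pseudonym("peer-1", "pd-main"), "Thanks.")
print(a["subject"] != b["subject"])  # the two actions carry different pseudonyms
```

Because each action is recorded under a different `un`, an observer of the two comments cannot link them to each other or to the peer without the platform's mapping.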
More generally, Figure 3 shows how user structures and profiles enhance privacy by providing indirect and partial access to the information from the Peer and its Main Entity, which may contain identifying information about the subject and additional personal data.
3.2.3 Profiles as object indirections.
Instead of directly allowing access to the information contained in the peer’s entities, Profile structures are created to reply to queries that are sent to the peer (normally only revealing partial or obfuscated information about these entities). The right side of Figure 3 shows an example where, upon receiving a read query, the peer allows the requester to access the Profile named “Profile2”, which contains partial and obfuscated information from ME, but not ME itself, which may contain personal data that the subject may not want to disclose. Profile structures, when they refer to ME, may represent partial identities of the subject controlling the Peer. The profile definition structure PD is used to define the subset of information from the ME to be included in the profile. A profile definition is defined as the tuple PD = ⟨ID, PE, {PP}, {GP}, {NR}⟩ where:
• ID is a numeric unique identifier. It is a reference to the PD;
• PE is the id of the Profiled Entity, the entity to which this profile refers;
• {PP} is a non-empty set that specifies the different parameters that feed the algorithms to be applied to the profiled entity to obtain the profile. Both this set of parameters and the algorithms are entity type-dependent (although future versions may consider their generalization);
• {GP} is a possibly empty set that contains the ids of all Generated Profiles obtained from the current definition; and
• {NR} represents the Negotiation Requirements that must be met by the parties wanting to have access to the information that this profile definition will generate.
Applying a profile definition to the Entity it refers to materializes the information into an Entity Profile structure.
An Entity Profile is defined as the tuple EP = ⟨ID, U, S, {A}, {AR}⟩ where:
• ID is a numeric unique identifier;
• U is the id of the user structure that was the source of this profile;
• {A} is a set of attributes defined as before, but the specific attribute definitions and values may be different from the ones in the entity;
• {AR} is the possibly empty set of Agreed Requirements set between the original controller of the information contained in this profile and the owner of the entity base where it is now stored. This property can be checked to make sure that the terms or agreements are not breached.
The profile definition structure (i.e., the filter to apply to the original information before sharing it) is stored at the source peer’s storage, while the materialized Entity Profile structure (containing the shared partial and/or obfuscated information) is stored at the destination peer’s storage, as shown in the example illustrated in Figure 4.
Figure 4: An example containing Entities, simplified profile definitions and profile materializations for sharing an Event Entity.
Recalling the example scenarios from the introduction, the left part of Figure 4 shows an Event (family lunch) that belongs to a doctor’s Peer. The doctor has created two profile definition structures, which are presented here in a simplified manner, to define how this event is shared with the doctor’s assistant and patients. The rest of the figure shows the peer structures that belong to these subjects, with the materialized profiles created by applying the restrictions from the profile definitions. These materializations include examples of omitted pieces of information (e.g. the ‘food’ attribute is not shared in either profile) and partial/obfuscated information (e.g. the time of the event becomes ‘Midday’). The use of the information contained in the materialized Profiles is restricted by the Agreed Requirements attribute, which is an agreed-upon privacy policy based on PPL [12]. As shown in Figure 5, policy enforcement at the platform level guarantees that the shared information is only used for the stated purposes, which are specified at the attribute level. For example, if profiles are used to share contact information, e.g., the doctor gives a telephone number to the assistant, the doctor may restrict its use to “call over phone only”. Therefore, any other operation over the data, such as reading or copying the telephone number, is not allowed, i.e., the assistant may call the doctor but the actual phone number is not revealed to the assistant.
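The materialization step from the family lunch example can be sketched as follows. This is an illustration only: the attribute names and values, the `coarse_time` obfuscation and the two simplified definitions are assumptions made for the example (only the omitted ‘food’ attribute and the ‘Midday’ value come from the text), and real profile definitions are entity type-dependent.

```python
from typing import Callable, Dict, Optional

# An event entity owned by the doctor's peer (values are assumptions).
family_lunch = {"name": "Family lunch", "time": "12:30",
                "place": "Trento", "food": "Pasta"}


def coarse_time(value: str) -> str:
    """Obfuscate an HH:MM time into a coarse label."""
    hour = int(value.split(":")[0])
    if 11 <= hour < 14:
        return "Midday"
    return "Morning" if hour < 11 else "Afternoon"


# A simplified profile definition maps attribute -> transformation.
# Attributes that are not listed are omitted; None means "share unchanged".
Rule = Optional[Callable[[str], str]]
pd_assistant: Dict[str, Rule] = {"name": None, "time": coarse_time, "place": None}
pd_patients: Dict[str, Rule] = {"name": lambda _: "Busy", "time": coarse_time}


def materialize(entity: Dict[str, str], definition: Dict[str, Rule]) -> Dict[str, str]:
    """Apply a profile definition to an entity, yielding the profile's attributes."""
    profile = {}
    for attr, rule in definition.items():
        if attr in entity:
            profile[attr] = entity[attr] if rule is None else rule(entity[attr])
    return profile


print(materialize(family_lunch, pd_assistant))
# {'name': 'Family lunch', 'time': 'Midday', 'place': 'Trento'}
print(materialize(family_lunch, pd_patients))
# {'name': 'Busy', 'time': 'Midday'}
```

In this sketch the definitions stay with the doctor's peer, and only the dictionaries returned by `materialize` would be copied to the destination peers, which matches the storage split described above.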
Figure 5: Even outside of the information controller’s storage, Agreed Requirements apply to the materialized Entity Profile to restrict the possible uses of the information stored in it.
It is important to note that the model in this section does not yet specify how the Negotiation Requirements (NR) and Agreed Requirements (AR) from the Profile Definition and Entity Profile structures will be represented. This will be done in the next iteration of this deliverable (D4.3) by using the PPL policy language introduced last year in D4.1. A small PPL policy example is given in the Appendix to show interested readers what a policy may look like. Furthermore, the definition of Collectives in this model, their privacy considerations and an exploration of their uses and roles within the SmartSociety Platform will also be developed further in future deliverables.
4 Privacy-aware Search and Ranking
The main objective of the search mechanisms in the Peer Manager (PM) is to find and rank peers (i.e., individuals and/or collectives of people and machines) that are most capable, interested and/or efficient in solving a given task. The result of such a search needs to be based on the attributes of the peers and the characteristics that may allow them to solve the task. Thus, having full access to all the attributes of all peers stored in peer profiles would allow the search mechanism to produce better results. However, such unlimited access may expose personal or sensitive information, especially in the case of peers that represent people. The PM therefore has to define a search method in such a way that the privacy of peers is preserved. Having defined the core data model (Section 2) and the privacy model (Section 3), we introduce a search mechanism that uses both of these models and is implemented in the Peer Manager. In our entity-centric approach, a search request can specify (i) constraints on search results and optionally (ii) an ideal search result. We first discuss the specification of search constraints, which define what should be included in the search result, through the notion of attribute-based search. Next, we show how an ideal search result is used to define the ranking order by specifying which results are preferred over others. These two points focus mainly on showing how attribute-based search and ranking can exploit the semantic schema to enhance search capabilities. We then present our approach to privacy-aware search, which integrates these notions within the PM’s storage and privacy protection model.
4.1 Attribute-based Search: Defining Search Constraints
Consider example scenario 1 from Section 1 (i.e., the ride-sharing example). For recruiting commuters to share a ride with Maria, the application coordinating rides needs to find a group of peers (individuals and/or collectives) with profile attributes that implicitly or explicitly indicate an interest in travelling between Milan and Trento. The diversity in the way in which different peers are described in their profiles challenges the PM to support the specification of search constraints that can be matched with the attributes from the profiles of relevant peers even if they are not all described with exactly the same words. As presented in Section 2, the Peer Manager uses the notion of entity to represent information about peers. Thus, in the remainder of this section we refer to searching entities as the core mechanism used for implementing search in the PM. In our data model, the characteristics and relations of entities are described not only by using words; those words are also extended with concepts from the underlying knowledge base (see Subsection 2.1). Entity types and entity attributes/relations are represented as concepts. For string attributes, words are preserved to allow a pure syntactic search on attribute values and are also extended with concepts to allow semantic search. As a consequence, the entity search approach we implement is capable of semantically matching a search query to:
• Entity types. For example, based on the etype hierarchy introduced in Subsection 2.3, entities of types Person and Software Agent will be considered relevant results for a query that is trying to find entities describing Individual agents.
• Attributes/Relations. For example, an entity with an attribute Location set to Trento will be considered a relevant result for a query that is trying to find entities with attribute Place equal to Trento.
• Attribute/Relation values. For example, a person with a background in artificial intelligence will also be considered a relevant result for a query that is trying to find people with a computer science background (regardless of whether the person’s bio attribute contains the words computer science or not).
In short, attribute-based search is provided by modifying/extending the notion of semantic search; in particular, we use concept search [13] to support entity search in semantic Entity Bases (EBs). The logic it follows is comparable to Lucene (http://lucene.apache.org/core/index.html), and its novelty is given by the fact that concept search is used to specify the constraints on search results. Concept search leverages the core semantic data model of the Peer Manager that is stored in Knowledge Bases (KBs) to allow specifying search constraints on entity types, entity attributes and entity relations. In order to support the specification of arbitrarily complex search constraints, the following elements are defined:
• QueryClause: contains the root node of the input query and allows specifying the search scope, for example, by providing the entity type of the desired entities.
• QueryNode: defines the list of the atomic queries (see the next item) and query nodes joined with an “AND” or “OR” operator. It allows composing atomic queries into complex query specifications.
• QueryAttribute: represents an atomic clause in the query, which defines a single constraint on the desired entities in the form “attribute ⟨operator⟩ value”, where the attribute can be specified by providing its name (e.g., age, location, etc.), its concept or both; the operator is the relation between the attribute and the value (e.g., EQUAL, MORE, LESS, etc.); and the value is given by a data type and the desired value.
Figure 6: Complex search query. (a) Schematic representation. (b) Example.
Figure 6 shows a schematic representation of search constraints (in 6a) and an example of what a concrete specification of constraints may look like (in 6b). As can be seen, a complex query can be visually understood as a tree-like structure with QueryNodes as internal nodes and QueryAttributes as leaves. Some relevant features provided by the approach described in this subsection are:
1. Semantic search on entity types, attributes and relations (based on the terminology of the PM’s core ontology) allows finding relevant peers even when they are described using different terminology in their profiles.
2. Complex search query specification is supported, allowing arbitrary descriptions of relevant peers (i.e., complex search constraints).
3. The meaning of words in a query can be explicitly selected.
4. We leverage Concept Search, which allows us to provide a wide range of (semantic) operations on attributes.
5. The approach is scalable with respect to possible future extensions to the PM’s core ontology.
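The tree structure of a complex query can be sketched with three small classes named after the elements above. This is a simplified illustration, not the Peer Manager's implementation: the `matches` methods, the `OPERATORS` table and the sample entities are assumptions, and matching here is purely syntactic (attribute names and values are compared literally, without the concept-based matching that the actual approach adds).

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Union

OPERATORS = {
    "EQUAL": lambda a, b: a == b,
    "MORE": lambda a, b: a > b,
    "LESS": lambda a, b: a < b,
}


@dataclass
class QueryAttribute:
    """Atomic clause of the form: attribute <operator> value."""
    attribute: str
    operator: str
    value: Any

    def matches(self, entity: Dict[str, Any]) -> bool:
        if self.attribute not in entity:
            return False
        return OPERATORS[self.operator](entity[self.attribute], self.value)


@dataclass
class QueryNode:
    """Internal node joining atomic queries and other nodes with AND or OR."""
    connective: str
    children: List[Union["QueryNode", QueryAttribute]]

    def matches(self, entity: Dict[str, Any]) -> bool:
        results = (child.matches(entity) for child in self.children)
        return all(results) if self.connective == "AND" else any(results)


@dataclass
class QueryClause:
    """Root of the query; the entity type restricts the search scope."""
    etype: str
    root: QueryNode

    def matches(self, entity: Dict[str, Any]) -> bool:
        return entity.get("etype") == self.etype and self.root.matches(entity)


people = [
    {"etype": "Person", "name": "Maria", "location": "Milan", "age": 34},
    {"etype": "Person", "name": "Luca", "location": "Trento", "age": 52},
    {"etype": "Person", "name": "Anna", "location": "Trento", "age": 25},
    {"etype": "Software Agent", "name": "RideBot", "location": "Trento"},
]
query = QueryClause(
    etype="Person",
    root=QueryNode("AND", [
        QueryAttribute("location", "EQUAL", "Trento"),
        QueryNode("OR", [
            QueryAttribute("age", "LESS", 30),
            QueryAttribute("age", "MORE", 50),
        ]),
    ]),
)
print([p["name"] for p in people if query.matches(p)])  # ['Luca', 'Anna']
```

Evaluating the tree recursively keeps each element independent, so a semantic matcher could replace the literal comparisons inside `QueryAttribute.matches` without changing how nodes are composed.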
4.2 Attribute-based Ranking: Defining Order or Distance Function
Ranking of results is a popular and universal approach to ordering (or structuring) an otherwise disorganized set of search results. The results of a query are usually ordered by employing some notion of distance between a given result and the ideal search result. The goal of computing such a distance is to measure the relevance of the result for the corresponding query. This notion of distance, which might be intuitive for humans, is harder to encode in a way that can be processed by computers. For this reason, different types of functions have to be defined for different types of values. Let us first introduce some query examples with different types of values:
• Q1: “Find people living in Milan near the city center”
• Q2: “Find people with a computer science background”
• Q3: “Find people with a computer science background living in Milan near the city center”
• Q4: “Find people living in Milan ordered by names”
• Q5: “Find people living in Milan ordered by age”
From these examples, it can be seen that Q1 imposes a single constraint: a person of interest must be living within the borders of Milan. The ideal search result for this query would be a person living exactly in the center of the city. The result for this query is a list of people living in Milan, ordered by increasing distance between the locations where they live and the city center. Notice that geographical distance is just an example of a distance function used for spatial entity attributes. For textual entity attributes and keyword-based search queries, the Vector Space Model [14] can be used instead. Distance in this case can be derived from the cosine similarity between query and document vectors, with terms weighted according to some variant of the term frequency-inverse document frequency (tf-idf) model [15].
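For textual attributes, ranking by cosine similarity over tf-idf vectors can be sketched in a few lines. The example below is an illustration under stated assumptions: the bios are invented, tokenization is a plain whitespace split, and the smoothed idf formula is one common variant rather than the weighting used by the Peer Manager. A distance can be obtained as one minus the similarity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document, plus the idf table."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf (one common variant), so weights never become zero.
    idf = {t: math.log((1 + n) / (1 + d)) + 1 for t, d in df.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = (math.sqrt(sum(w * w for w in u.values()))
            * math.sqrt(sum(w * w for w in v.values())))
    return dot / norm if norm else 0.0


# Hypothetical bio attributes of three peers.
bios = {
    "alice": "computer science graduate working on computer vision".split(),
    "bob": "science teacher".split(),
    "carol": "professional chef".split(),
}
vectors, idf = tfidf_vectors(list(bios.values()))
# The query is weighted with the same idf table; unknown terms get weight 0.
query = {t: idf.get(t, 0.0) for t in "computer science".split()}
ranking = sorted(zip(bios, vectors), key=lambda p: cosine(query, p[1]), reverse=True)
print([name for name, _ in ranking])  # ['alice', 'bob', 'carol']
```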
In the case of Q2 (mentioned in the previous subsection), the results considered ‘closer’ to the ideal search result are those in which the words computer and science appear more often in the bio attribute. As previously discussed, our model represents entity types, attributes and relations as concepts. With regard to ranking, this also means that results of the query (in addition to those matching exactly) can be semantically similar to the ideal search result. In this case, some notion of semantic similarity can be used to define distance. An example of such a distance function is the distance between concepts in the knowledge graph (stored in the KBs). Assuming that the concept ‘artificial intelligence’ is a direct successor of the concept ‘computer science’ and the weight of this relation is one, the distance between those concepts will also be one. Multiple entity attributes can be used for ranking results in a single query. For instance, Q1 and Q2 can be combined in Q3, where the final ranking is computed by aggregating all the independent rankings, taking into account their weights. Finally, if a query specifies only constraints on results and does not specify the ideal search result, a natural order of results can be used, provided that the types of values included in the search constraints can be associated with some natural order function. For instance, Q4 and Q5 will return results ordered alphabetically and numerically, respectively.
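The concept distance and the aggregation used for Q3 can be sketched as follows. Everything beyond the ‘computer science’ / ‘artificial intelligence’ pair is an assumption made for illustration: the graph fragment, the symmetric edges, the candidates, the weights and the weighted-average aggregation are hypothetical, and the two distances are assumed to be on comparable scales.

```python
import heapq

# Hypothetical fragment of a knowledge graph: concept -> {related concept: weight}.
GRAPH = {
    "computer science": {"artificial intelligence": 1, "databases": 1},
    "artificial intelligence": {"computer science": 1, "machine learning": 1},
    "databases": {"computer science": 1},
    "machine learning": {"artificial intelligence": 1},
}


def concept_distance(source, target):
    """Weighted shortest-path distance (Dijkstra); infinity if unreachable."""
    best = {source: 0}
    queue = [(0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, weight in GRAPH.get(node, {}).items():
            candidate = dist + weight
            if candidate < best.get(neighbour, float("inf")):
                best[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")


def combined_distance(distances, weights):
    """Weighted average of per-attribute distances (one possible aggregation)."""
    total = sum(weights.values())
    return sum(weights[k] * distances[k] for k in weights) / total


# Candidates for Q3: (background concept, distance from the city center).
candidates = {
    "dana": ("artificial intelligence", 2.0),
    "eli": ("computer science", 5.0),
    "fay": ("machine learning", 0.5),
}
weights = {"background": 0.7, "location": 0.3}
scores = {
    name: combined_distance(
        {"background": concept_distance("computer science", concept), "location": km},
        weights,
    )
    for name, (concept, km) in candidates.items()
}
print(concept_distance("computer science", "artificial intelligence"))  # 1
print(sorted(scores, key=scores.get))  # ['dana', 'eli', 'fay']
```

Changing the weights changes the order: with a higher weight on location, the candidate living closest to the center would move up, which is the effect the per-attribute weights are meant to control.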
4.3 Privacy-aware Search
Having presented the basic mechanisms for searching and ranking that are used in the Peer Manager, in this subsection we discuss in more detail how the previously defined privacy model (Section 3) integrates with search. Figure 7 shows a three-layered search schema that is defined and will be implemented in the Peer Manager in order to enable a privacy-aware search service.
Figure 7: Peer Manager Search Schema (diagram elements: Query, Purpose Specification, Search Computation, Filtering Profiles, Search Result Integration, Results).
In most search systems, the items to be included in the result set depend only on the search query. One exception is the case of personalized search, where the result set also depends on the user for whom the search is personalized. In a similar way, the Peer Manager takes into account the privacy requirements previously discussed in Section 3.1 and identifies the need to consider WHO is issuing the search request and for WHAT PURPOSE the returned information will be used when computing the search result. To account for this, a Query structure is defined in the Peer Manager as the tuple Q = ⟨U_I, PR, QC, IR, U_T⟩, where:
• U_I represents the user issuing the search request;
• PR represents the context/purpose for which the retrieved information will be used;
• QC is the query clause specifying the constraints on the search results;
• IR represents the ideal search result that is used for ranking;
• U_T represents the possibly null target user. A target user is given to explicitly specify the scope of the search (i.e., to ask another user, to search globally, etc.).
Having defined the elements of the query, let us now discuss in more detail the search schema from Figure 7. The search layer understands the logic of the search service. On receiving a search request (i.e., a query), it parses its different elements and interacts with the other layers to compute and build an ordered list of results. It represents a service layer offering a personalized service based on the privacy preferences of information owners. The privacy layer, which contains information about users, what data stores they can search and for what purposes, is used to resolve the data indirections. The data stores that can be searched (by a given user and for a given purpose) are identified at this level. It also filters out the profiles (from searchable data stores) that cannot be revealed, again in accordance with the corresponding privacy preferences of their owners. The data layer, which contains the data storages to be searched (i.e., the different KBs and EBs under the control of the platform and of users), allows finding semantically relevant information. Although these layers can be conceptually separated into different modules with different functions (as in Figure 7), it is important to note that, when used together in the Peer Manager, they actually work embedded one into the other. As a result, within this schema, search is performed in a privacy-friendly manner on Entity Profiles instead of on the entities.
This allows performing matching only on partial or obfuscated information about the peer (i.e., the data minimization principle) and also revealing only minimal information as part of the search result. Notice that the decision on which profiles can be included in the search and how they can be used is regulated by their corresponding privacy policies (i.e., based on the Agreed Requirements attribute of the profile). Finally, the approach presented in this section is designed upon a privacy-preserving framework offering a number of features that can be summarized as follows:
1. The peer performing the search acts indirectly through its user, while the peer being searched is also accessed indirectly through its profile. This is particularly important in helping to achieve the data minimization requirement.
2. Unlike other approaches, we do not focus on dataset anonymization; our focus is on users as owners of their information and on enforcing their privacy preferences during search (i.e., informed consent).
3. Privacy is embedded into the design of the search mechanism in different forms. First, we do not need to search the entire repository; we constrain the search to data storages where the search issuer is authorized to search. Second, we return only profiles that can be revealed for a given purpose.
4. Only authorized users, and only for the authorized purposes, will be able to find profiles that match their query constraints. Informed consent and purpose binding are two important requirements that this feature can help to achieve.
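The interplay of the three layers can be sketched as a single search function. This is a simplified illustration, not the Peer Manager's service: the `grants` table, the `allowed_purposes` set (a stand-in for the Agreed Requirements), the store and user identifiers and the reduction of QC to a predicate are all assumptions, and ranking against the ideal result IR is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Set

Attributes = Dict[str, str]


@dataclass
class EntityProfile:
    id: str
    attributes: Attributes
    allowed_purposes: Set[str]  # simplified stand-in for the Agreed Requirements


@dataclass
class Query:
    issuer: str                               # U_I
    purpose: str                              # PR
    constraint: Callable[[Attributes], bool]  # QC, reduced to a predicate
    target: Optional[str] = None              # U_T; None searches every permitted store


def search(query: Query, grants, stores) -> List[str]:
    hits = []
    for store_id, profiles in stores.items():
        if query.target is not None and store_id != query.target:
            continue
        # Privacy layer: may this user search this store for this purpose?
        if (query.issuer, query.purpose) not in grants.get(store_id, set()):
            continue
        for profile in profiles:
            # Privacy layer: may this profile be revealed for this purpose?
            if query.purpose not in profile.allowed_purposes:
                continue
            # Data layer: match against the profile, never the underlying entity.
            if query.constraint(profile.attributes):
                hits.append(profile.id)
    return hits


stores = {
    "platform": [
        EntityProfile("pr-1", {"route": "Milan-Trento"}, {"ride-sharing"}),
        EntityProfile("pr-2", {"route": "Milan-Trento"}, {"research"}),
    ],
    "peer-9": [EntityProfile("pr-3", {"route": "Milan-Trento"}, {"ride-sharing"})],
}
# Hypothetical grants: store -> set of (user, purpose) pairs allowed to search it.
grants = {"platform": {("u-maria", "ride-sharing"), ("u-maria", "research")}}

q = Query("u-maria", "ride-sharing", lambda a: a.get("route") == "Milan-Trento")
print(search(q, grants, stores))  # ['pr-1']
```

All three profiles match the constraint, yet only one is returned: `pr-2` may not be revealed for the stated purpose and `pr-3` lives in a store the issuer is not authorized to search.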
5 The Peer Manager Architecture
The Peer Manager (PM) can be viewed as the main storage of personal information in the SmartSociety Platform. As such, the PM has the following main objectives: 1. Store metadata and semantics by implementing the Data and Knowledge Representation Model from Section 2; 2. Protect privacy through the use of the Storage and Privacy Protection model from Section 3; 3. Provide matching and ranking between peers and tasks, as defined in Section 4; and 4. Support hybridity by allowing the defined peers to represent persons, software agents or collectives as shown below. The rest of the section will explain the architectural decisions made when designing the Peer Manager and will also present the first implementation and integration results within the SmartSociety Platform.
5.1 Peer Manager Internal Architecture
The different internal components of the Peer Manager, along with their interactions, are shown in Figure 8. More specifically, these components are:
• The Peer Base, shown at the bottom, is the core of the Peer Manager where all the represented knowledge, information and structures are stored and managed.
• The Peer Manager Complements, shown towards the top as red dotted boxes, represent additional algorithms or functionalities that work directly on the information of the Peer Base (and as such run on the same server) and were developed in collaboration with other project partners. Examples of possible PM Complements include:
– The Peer Profiler, used to bootstrap peer information from other existing platforms (e.g., social networks);
– The Privacy component, implementing the privacy policy language that encodes the requirements to be enforced platform-wide (mentioned in Section 3);
• The Peer Base Web APIs, shown between the Peer Base and the Peer Manager Complements, are provided by WP4 to allow low-level access to the functionalities of the Peer Base for PM Complement developers.
• The Peer Manager Web APIs, shown at the very top, are provided by WP4 to allow high-level access to all the functionalities of the Peer Base and the PM Complements.
Figure 8: Full internal architecture of the Peer Manager.
This particular architecture was chosen to make a clear separation between the SmartSociety components that are meant to interact directly/frequently with the information in the Peer Base (e.g., Privacy, Peer Profiler) and those components that only need high-level interactions with, or results from, the Peer Manager (e.g. Orchestration, Programming Framework). This separation enables better performance for those components that operate from within the Peer Manager (which implies running on the same server), while still providing good flexibility in offering services to components that contact the Peer Manager through a network connection.
5.2 The Peer Base Structures and Hybridity Support
Of the previously explained components, here we focus on the Peer Base to further clarify its subcomponents. The Peer Base is responsible for the storage of peers’ data in the Peer Manager and, as shown at the bottom of Figure 8, it is internally separated into:
• A Platform-Wide Knowledge Base (KB_PW) that will store the core entity types and ontology, and can also be used by other components of the SmartSociety platform (i.e., not only by the Peer Manager). This can potentially include general SmartSociety entity types (i.e., agent, person, software agent, collective, task, role, resource, location, event, etc.) and allow all SmartSociety components to interoperate.
• A Platform-Wide Entity Base (EB_PW) that will store entity instances of general interest as well as public profiles of peers (i.e., information that each peer decides to publish in the platform about itself).
• A Knowledge Base (KB_i) and an Entity Base (EB_i) for each peer, which will store the peer’s information. The peer maintains control over its data space and defines the privacy policies that apply to the data stored in it. The peer’s KB is bootstrapped with the content of the platform-wide KB, which can then be extended/specialized by the peer. The peer’s EB stores entity instances that are relevant to the peer (i.e., the peer’s personal information, its resources, locations, roles, tasks, etc.).
• The Peer structures and User structures, which are also stored and managed within the Peer Base.
This storage separation for each of the peers and the platform is one of the Privacy-by-design decisions made for the Peer Manager in accordance with the theory and model defined in Section 3. It is also important to notice that the Knowledge Base component described in the deliverables from the first year of the project has now been integrated inside the Peer Manager. To help understand all these structures and their relations, Figure 9 gives a quick overview of all the data structures that the Peer Base manages.
As shown in Figure 9, each Peer structure includes one Knowledge Base and one Entity Base⁶ for the peer. The Main Entity is one specific entity known to the peer (i.e., contained in EB_i) that stores all the information about (and often identifying) the main subject represented in the SmartSociety Platform by that Peer structure. If this subject is a human, then the Main Entity is a person entity, but the Main Entity could also represent a software process or a collective of peers. Figure 9 also shows the multiple Users that the Peer uses as pseudonyms, meaning that one of these Users⁷ is revealed as the subject participating in a given interaction instead of the actual Peer, and the multiple Profiles that are used to reveal partial and obfuscated information (instead of revealing it all) about the Peer's entities.

6. For a full explanation of the structures inside a Knowledge Base and Entity Base, please refer to Section 2 and last year's D4.1.

[Figure 9 content: Peer i has one Main Entity ME_i (where ME_i ∈ PEB_i), N Users U1 … Un, a Knowledge Base KB_i with n Etypes Et1 … Etn, and an Entity Base EB_i with m Entities E1 … Em. A Main Entity has M Profiles Pr1 … Prm and can be either a Person Entity, a SWAgent Entity or a Collective Entity. Users interact with the outside world in the name of the Peer; Profiles are read from the outside world to give access to the Peer's information.]
Figure 9: A quick explanation of the Peer Manager structures. The arity and relation of all the structures are shown.

It is also important to note that Figure 9 shows how the Peer Manager supports hybridity within the SmartSociety Platform: the Main Entity of the Peer (shown towards the right side of the figure) can be of type Person (for human users), of type SWAgent (for software agents providing services) or of type Collective (for a group of peers acting as a unit). Furthermore, this choice does not affect any of the other structures of the peer or their behaviour, which (in theory) makes the true identity of the subject behind a peer transparent to the platform.
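The relations described for Figure 9 can be sketched as a small data model. This is only an illustrative sketch, not the Peer Manager implementation: the class and field names (`Entity`, `Profile`, `Peer`, `visible_attributes`, and so on) are assumptions introduced here, and a profile is simplified to a list of attributes that may be revealed (obfuscation is not modelled).

```python
from dataclasses import dataclass, field

# Entity types allowed for a Main Entity (the three options named for Figure 9).
MAIN_ENTITY_TYPES = {"Person", "SWAgent", "Collective"}


@dataclass
class Entity:
    entity_id: str
    etype: str
    attributes: dict = field(default_factory=dict)


@dataclass
class Profile:
    """A partial view of an entity: only the listed attributes are revealed."""
    name: str
    visible_attributes: frozenset

    def view(self, entity: Entity) -> dict:
        return {key: value for key, value in entity.attributes.items()
                if key in self.visible_attributes}


@dataclass
class Peer:
    knowledge_base: dict   # KB_i: etype name -> etype definition (hypothetical layout)
    entity_base: dict      # EB_i: entity id -> Entity
    main_entity_id: str    # must identify an entity stored in EB_i
    users: list = field(default_factory=list)     # pseudonyms acting for the Peer
    profiles: list = field(default_factory=list)  # views over the Main Entity

    def __post_init__(self):
        if self.main_entity_id not in self.entity_base:
            raise ValueError("the Main Entity must be contained in the peer's Entity Base")
        if self.main_entity.etype not in MAIN_ENTITY_TYPES:
            raise ValueError("the Main Entity must be a Person, SWAgent or Collective")

    @property
    def main_entity(self) -> Entity:
        return self.entity_base[self.main_entity_id]
```

Because the type of the Main Entity is checked only in one place, the rest of the structure (Users, Profiles, KB, EB) is identical for human, software and collective peers, which is the hybridity property discussed above.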
5.3 The Peer Manager as Part of the SmartSociety Platform
As shown in Figure 10, through the use of the Peer Manager Web APIs, the services provided by the Peer Manager can be invoked directly from a certified client application or internally by other components of the SmartSociety Platform. The left part of Figure 10 shows two end-user clients interacting directly with the Peer Manager (through the PM Web API). The first example (at the top left) uses a web-based client to register new users to the SmartSociety Platform, while the second example (at the bottom left) shows a mobile device-based control dashboard where the person controlling the peer may choose to modify its privacy settings or check different types of information related to its participation in the platform.

7. For a full explanation of how Users and Profiles protect the privacy of the Peer's Main Entity, please refer to Section 4.
Figure 10: Example of Connection of the Peer Manager to external clients and other SmartSociety Platform components.
On the other hand, the right part of Figure 10 shows an incoming call (to the same PM Web API) with a Peer Search query that is answered with a list of Peers complying with the conditions of the search request. This kind of interaction can be used by other internal components of the SmartSociety Platform (e.g., Orchestration) to match a task (expressed as a Peer Search query) with a set of Peers (expressed as the list of Peers that comply with the original query). More details about how the Peer Manager is integrated into the SmartSociety Platform, together with more detailed examples, may be found in deliverable D8.2 - Platform Prototype: Early Results and System Design.
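The query/answer interaction described above can be illustrated with a minimal sketch. It is not the PM Web API: the function names and the dictionary layout of queries and profiles are assumptions made for illustration, and matching is done by exact value comparison, whereas the actual search schema also deals with diversity in terminology.

```python
def matches(profile, constraints):
    """True if the profile satisfies every constraint.

    `constraints` maps an attribute name to the set of accepted values;
    a profile attribute may hold a single value or a list of values.
    """
    for attribute, accepted in constraints.items():
        values = profile.get(attribute, [])
        if not isinstance(values, (list, set, tuple)):
            values = [values]
        if not any(value in accepted for value in values):
            return False
    return True


def search_peers(profiles_by_peer, constraints):
    """Return the identifiers of the peers whose profile complies with the query."""
    return [peer_id for peer_id, profile in profiles_by_peer.items()
            if matches(profile, constraints)]
```

A component such as Orchestration would, under these assumptions, express a task as `constraints` (for example `{"Skill": {"driving"}}`) and receive the list of complying peer identifiers.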
5.4 Second Year Peer Manager Integrated Proof of Concept
A proof of concept for the Peer Manager is under development, and an integrated demo (together with similar efforts from other work packages) has been produced for the end of the second year. The functionalities and specific calls that the Peer Manager contributes to this integrated proof of concept can be found in Appendix F; for a more global view and operation examples, please refer to deliverable D8.2. Future development plans include better integration with the software developed by other work packages, implementation of the privacy policies, and implementation of revisions and improvements based on feedback at all levels. While the current code of the PM Web API has not yet been released, we aim to release more complete versions of the Peer Manager Web API as open source. It is important to stress, however, that there are currently no plans to release the actual back-end code of the Peer Manager or of the Knowledge/Entity Base (as they are heavily based on knowledge and work that pre-dates the start of the SmartSociety project).
6 Conclusion and Final Remarks
This deliverable introduces the concept of a privacy-enhanced SmartSociety Peer Manager, which is a fundamental building block for the implementation of a privacy-preserving CAS computing platform. The Peer Manager allows people and other actors, such as sensors and actuators, to store their information in a secure and privacy-preserving framework. The design and development of the Peer Manager followed three core guiding principles for enforcing legal privacy principles, including data minimisation, purpose binding, transparency and user control:
1. Separation of information among peers.
2. A user-centered identity management.
3. A novel approach to peer profiling.
The representation of the information stored by the Peer Manager follows a semantic schema that defines an attribute-based representation of a peer's characteristics. The Peer Manager allows people (i.e., users) to define profiles that contain and reveal only partial or obfuscated information; these profiles are used for replying to information requests, which enforces data minimisation. In addition, during year three we plan to specify and limit the purposes and use of personal data by using a PPL privacy policy.
This deliverable presented a search schema that allows the Peer Manager to provide a privacy-preserving search service. This search schema includes the mechanisms used for: (i) the specification of search constraints that can be matched with the attributes from the profiles of relevant peers, overcoming the diversity in the terminology used for their representation; and (ii) the definition of distance functions for different types of values, which can be used to rank results (i.e., peers matching the search constraints) based on their relevance with respect to an ideal search result. The privacy-preserving part of the search service is provided by integrating the search and ranking mechanisms with the Peer Manager's privacy-aware storage and sharing model.
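The ranking idea in point (ii) can be sketched as follows. The concrete distance functions are not restated in this section, so the ones used here (a capped normalised difference for numbers, Jaccard distance for sets, exact match otherwise, and an unweighted mean as aggregation) are assumptions chosen only to illustrate the principle of ranking by distance from an ideal result; all names are hypothetical.

```python
def numeric_distance(value, ideal, scale):
    """Absolute difference, normalised by `scale` and capped at 1."""
    return min(abs(value - ideal) / scale, 1.0)


def set_distance(values, ideal):
    """Jaccard distance between two sets of values."""
    values, ideal = set(values), set(ideal)
    union = values | ideal
    return 1.0 - len(values & ideal) / len(union) if union else 0.0


def profile_distance(profile, ideal, scales):
    """Mean per-attribute distance between a profile and the ideal result."""
    distances = []
    for attribute, target in ideal.items():
        if attribute not in profile:
            distances.append(1.0)  # a missing attribute counts as maximally distant
        elif isinstance(target, (int, float)):
            distances.append(numeric_distance(profile[attribute], target, scales[attribute]))
        elif isinstance(target, (set, frozenset)):
            distances.append(set_distance(profile[attribute], target))
        else:
            distances.append(0.0 if profile[attribute] == target else 1.0)
    return sum(distances) / len(distances) if distances else 0.0


def rank(profiles, ideal, scales):
    """Return (peer_id, profile) pairs ordered from most to least relevant."""
    return sorted(profiles, key=lambda item: profile_distance(item[1], ideal, scales))
```

Choosing one distance function per value type keeps each attribute's contribution in the range [0, 1], so heterogeneous attributes can be combined into a single relevance score.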
The resulting search schema helps address the challenge of binding purpose to the management of peers' information by allowing only authorized users, and only for authorized purposes, to find profiles that match their query constraints. Privacy is thus embedded into the design of the search mechanism in two different ways: first, the search does not operate on the entire data repository but only on the parts for which the search issuer is authorized; and second, the search results are constrained to include only profiles that can be revealed for a given purpose. Regarding the targeted challenges related to information profiling (such as the lack of information and feedback about how profile data is collected and traded), the Peer
Manager proposes to address them by giving transparency and control of this process to the actual subject that the profile refers to. Hence, subjects can create their own profiles with the minimal amount of information needed and share this information with the promise that their requirements will be enforced by the platform and that the shared data will not be misused or traded against their will. It is also worth noting that this approach introduces potential new issues, such as:
1. Management complexity for users: giving full control of the information could become overwhelming to its owners (especially at the scale at which personal information is produced and managed in current platforms). For this reason, it is key for the future adoption and practicability of this approach that it presents itself as an extension of existing identity management systems. While some people may not be interested in fine-grained privacy settings, we plan to give individuals the possibility to review and change their privacy settings in a usable way.
2. Lack of a concrete business model: many companies and platforms that base their revenue on the harvesting, processing and reselling of customer information may initially be reluctant to give their customers more control over their information. There are, however, other organizations (e.g., public utilities, health organizations) that value the trust of their customers much more highly. Moreover, this higher degree of customer trust and sense of being in control may also make setting partial limits and transparency settings beneficial to some of the applications that do use this information as a source of revenue. There is already some literature on the subject, but this will be studied in more depth in future deliverables.
The close attention we pay to privacy and social values for profiles within SmartSociety does not mean that all difficulties are circumvented - there are a host of future challenges too.
We have to be continually vigilant over whether autonomy and self-determination are indeed preserved by the technical measures envisaged to protect profiles within SmartSociety. For example, despite having the tools to control the circulation of their data, participants may still feel compelled to disclose an item to the system if refusing to do so leaves them materially disadvantaged. This means that we have to keep sight of the wider governance of SmartSociety at the same time as we focus on specific technical means that preserve privacy locally. Furthermore, there are still technical challenges to be tackled in providing a high-granularity, privacy-friendly platform that operates within response times acceptable to human users while also providing other resource-intensive functionalities such as provenance and search.
References
[1] V. Maltese, F. Giunchiglia, and B. Dutta, “Domains and context: first steps towards managing diversity in knowledge,” Web Semantics: Science, Services and Agents on the World Wide Web, vol. 12, no. 0, 2012. [Online]. Available: http://www.websemanticsjournal.org/index.php/ps/article/view/229
[2] F. Giunchiglia, B. Dutta, and V. Maltese, “From knowledge organization to knowledge representation,” in ISKO UK Conference, 2013. [Online]. Available: http://www.iskouk.org/conf2013/papers/GiunchigliaPaper.pdf
[3] R. Chenu-Abente, I. Zaihrayeu, and F. Giunchiglia, “A semantic-enabled engine for mobile social networks,” in The Semantic Web: ESWC 2013 Satellite Events, ser. Lecture Notes in Computer Science, P. Cimiano, M. Fernández, V. Lopez, S. Schlobach, and J. Völker, Eds. Springer Berlin Heidelberg, 2013, vol. 7955, pp. 298–299. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-41242-4_50
[4] F. Giunchiglia, B. Crispo, and R. Zhang, “Access control via lightweight ontologies,” in Semantic Computing (ICSC), 2011 Fifth IEEE International Conference on, Sept 2011, pp. 352–355.
[5] T. R. Gruber, “A translation approach to portable ontology specifications,” Knowledge Acquisition, vol. 5, no. 2, pp. 199–220, 1993. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S1042814383710083
[6] N. Guarino and C. Welty, “Evaluating ontological decisions with ontoclean,” Commun. ACM, vol. 45, no. 2, pp. 61–65, Feb. 2002. [Online]. Available: http://doi.acm.org/10.1145/503124.503150
[7] S. R. Ranganathan, “Prolegomena to library classification,” The Five Laws of Library Science, 1967.
[8] ISO, “25964-1:2011,” Information and documentation - Thesauri and interoperability with other vocabularies, Part 1.
[9] D. Christin, C. Roßkopf, M. Hollick, L. A. Martucci, and S. S. Kanhere, “Incognisense: An anonymity-preserving reputation framework for participatory sensing applications,” Pervasive and Mobile Computing, vol. 9, no. 3, pp. 353–371, 2013. [Online]. Available: http://dx.doi.org/10.1016/j.pmcj.2013.01.003
[10] L. A. Martucci, S. Ries, and M. Mühlhäuser, “Sybil-Free Pseudonyms, Privacy and Trust: Identity Management in the Internet of Services,” Journal of Information Processing, vol. 19, pp. 317–331, Jul 2011.
[11] S. B. Davidson, S. Khanna, S. Roy, J. Stoyanovich, V. Tannen, and Y. Chen, “On provenance and privacy,” in Proc. of the 14th Int. Conf. on Database Theory - ICDT, 2011, pp. 3–10. [Online]. Available: http://doi.acm.org/10.1145/1938551.1938554
[12] S. Trabelsi, G. Neven, and D. Raggett, Eds., PrimeLife Public Deliverable D5.3.4 – Report on design and implementation, 20 May 2011.
[13] F. Giunchiglia, U. Kharkevich, and I. Zaihrayeu, “Concept search,” in The Semantic Web: Research and Applications, ser. Lecture Notes in Computer Science, L. Aroyo, P. Traverso, F. Ciravegna, P. Cimiano, T. Heath, E. Hyvönen, R. Mizoguchi, E. Oren, M. Sabou, and E. Simperl, Eds. Springer Berlin Heidelberg, 2009, vol. 5554, pp. 429–444. [Online]. Available: http://dx.doi.org/10.1007/978-3-642-02121-3_33
[14] G. Salton, A. Wong, and C. S. Yang, “A vector space model for automatic indexing,” Commun. ACM, vol. 18, no. 11, pp. 613–620, 1975. [Online]. Available: http://doi.acm.org/10.1145/361219.361220
[15] G. Salton and M. J. McGill, Introduction to Modern Information Retrieval. New York, NY, USA: McGraw-Hill, Inc., 1986.
[16] J. Kang, “Information privacy in cyberspace transactions,” Stanford Law Review, pp. 1193–1294, 1998.
[17] S. Perreault, “vcard format specification,” 2011.
[18] E. Goodman, “Design and ethics in the era of big data,” interactions, vol. 21, no. 3, pp. 22–24, May 2014. [Online]. Available: http://doi.acm.org/10.1145/2598902
[19] H. Nissenbaum, “Privacy as contextual integrity,” Wash. L. Rev., vol. 79, p. 119, 2004.
[20] T. Monahan, “Surveillance and inequality,” Surveillance & Society, vol. 5, no. 3, 2002.
[21] S. van der Hof and C. Prins, “Personalisation and its influence on identities, behaviour and social values,” M. Hildebrandt and S. Gutwirth, Eds. New York: Springer, 2008, ch. 6, pp. 111–127. [Online]. Available: http://opac.inria.fr/record=b1126046
[22] Council of Europe, “Recommendation cm/rec(2010)13 of the committee of ministers to member states on the protection of individuals with regard to automatic processing of personal data in the context of profiling,” Available from https://wcd.coe.int/ViewDoc.jsp?id=1710949, 11 2010.
[23] M. Kosinski, D. Stillwell, and T. Graepel, “Private traits and attributes are predictable from digital records of human behavior,” Proceedings of the National Academy of Sciences, vol. 110, no. 15, pp. 5802–5805, Mar. 2013.
A The Peer Manager Entity Types
In this appendix we present a more detailed definition for each entity type, including concrete attribute specifications.
A.1 Entity
ID: 100 | EC: Entity | PT: -
Description: Anything which is perceived or known or inferred to have its own distinct existence (living or nonliving). An entity is any object so important to be denoted with a proper name.
Attributes:
ID | Name | Data Type | Description
1000000 | Name | NLString[] | The name of the entity, e.g., “Rome”.
1000001 | Class | Concept | The class of the entity, e.g. “city”.
1000002 | Description | SString[] | The entity description, e.g. “A sunny place in Italy”.
1000003 | Part (of) | <Entity> | It connects the part to the whole, e.g. locations to their administrative division.
1000010 | Start | Moment | The moment in time an entity started to exist (e.g. the date of birth of a Person).
1000011 | End | Moment | The moment in time an entity ceased to exist (e.g. the date of death of a Person).
1000012 | Duration | Duration | The duration of existence of an entity (e.g. the life length of a Person).
A.2 Agent
ID: 102 | EC: Agent | PT: Physical Entity
Description: An entity capable of acting independently and of bringing about changes in the world.
Attributes:
ID | Name | Data Type | Description
1020000 | Skill | Concept[] | Any ability possessed by the agent, typically acquired by training
1020001 | Activity | Concept[] | The actions that the agent performs
A.3 Person
ID: 103 | EC: Person | PT: Agent
Description: A person is an individual human being.
Attribute IDs: 1030001, 1030002, 1030006, 1000010, 1000011, 1000012, 1030003, 1030004, 1030005
Attributes:
Name | Data Type | Description
Gender | Concept | The sex of the person (male, female)
Date of birth | Moment | The date on which the person was born
Place of birth | <Location> | The place where the person was born
Date of death | Moment | The date on which the person died
Place of death | <Location> | The place where the person died
Language | Concept[] | The language the person speaks and/or understands
Hometown | <Location> | The town where the person grew up
Home | <Facility> | Where the person currently lives in
Nationality | Concept[] | The country the person is citizen of

A.4 Software agent
ID: 104 | EC: Software agent | PT: Agent
Description: A software agent is any individual computer program that acts on behalf of a user or another program.
Attributes:
ID | Name | Data Type | Description
1040000 | Producer | <Agent>[] | The entity engaged in financing and manufacturing the software

A.5 Collective
ID: 105 | EC: Collective | PT: Agent
Description: A collective is any set of entities acting collectively as a single agent.
Attributes:
ID | Name | Data Type | Description
1050000 | Member | <Agent>[] | The entities composing the collective

A.6 Facility
ID: 106 | EC: Facility | PT: Collective
Description: Anything providing a particular service or used for a particular industry. Examples of facilities include shops, restaurants, stadiums, theaters, playgrounds and parks. A facility is primarily a service provider.
Attributes:
ID | Name | Data Type | Description
1060000 | Date of establishment | Moment | The date on which the construction of the facility started. In case of buildings such as monuments this corresponds to the date at which the first stone was laid
1060002 | Opening hours | String | The times when the facility is open for people to use it and services are offered
1060010 | Address | Address | The postal address of the facility
1060020 | Staff | <Person>[] | The persons (e.g. employees) charged with carrying out the work
1060021 | Administrator | <Agent>[] | The person or organization that administers the facility
A.7 Organization
ID: 107 | EC: Organization | PT: Collective
Description: An organization is a collective constituted by individuals with common goals and formal rules. Organizations include corporations, companies, agencies, political parties and other groups of people defined by an established organizational structure.
Attributes:
ID | Name | Data Type | Description
1070001 | Date of establishment | Moment | The starting date of the organization
1070002 | Date of disestablishment | Moment | The ending date of the organization
1070003 | Seat | <Facility>[] | The main seat of operations
1070004 | Founder | <Person>[] | The person who established or founded the organization
1070005 | Leader | <Person> | The person having the leadership, thus occupying the top position in the organization
1070006 | Affiliate | <Agent>[] | The person or other organizations that are affiliated with the organization

A.8 Location
ID: 108 | EC: Location | PT: Physical Entity
Description: It refers to spatial objects, i.e. entities occupying fixed regions of space (e.g., regions, cities, boundaries, parcels of land, water bodies, roads, buildings, bridges, etc.).
Attributes:
ID | Name | Data Type | Description
1080000 | Latitude | Float | The constant coordinate for the latitude (in WGS84 decimal format)
1080001 | Longitude | Float | The constant coordinate for the longitude (in WGS84 decimal format)
1080002 | Elevation | Float | The distance above a reference point (such as sea level)

A.9 Artifact
ID: 120 | EC: Artifact | PT: Physical Entity
Description: An artifact represents a physical entity created by a human.
Attributes:
ID | Name | Data Type | Description
1200000 | Creator | <Agent>[] | The entities (persons or organizations) that participated to the creation of the artifact

A.10 Mind Product
ID: 109 | EC: Mind Product | PT: Entity
Description: A mind product is any abstract product of the human intellect.
Attributes:
ID | Name | Data Type | Description
1090000 | Creator | <Agent>[] | The persons or organizations that participated in the creation
1090001 | Contributor | <Agent>[] | The persons or organizations that contributed to the work with less importance than the creator
1090002 | Copyright | SString | A text containing a statement about various property rights associated with the mind product
1090003 | Audience | Concept | The target part of the general public to whom the product is directed
1090012 | Identifier | String[] | A string or number that identifies it

A.11 Document
ID: 110 | EC: Document | PT: Mind Product
Description: A document is a conventionally text-based work that is used as the basis, proof or support of something, or simply to transfer information. Instead of or in addition to text, a document may also contain media and other resources (e.g. pictures, videos). Examples of documents include books, articles, emails, etc.
Attributes:
ID | Name | Data Type | Description
1100000 | Title | SString | A short sentence describing the subject of the document
1100001 | Abstract | SString | A short text summarizing the document
1100002 | Keyword | SString[] | A text encoding a set of words that are related to the main topics of the paper, used to index it
1100003 | Coverage | SString | The spatial or temporal topic of the resource, the spatial applicability of the resource, or the jurisdiction under which the resource is relevant, e.g.
1100004 | Reference | <Document>[] | One or more related resources (but external to the document) that are referenced, cited or pointed to by the document
1100005 | Language | Concept | A language of the resource, e.g. English, Italian
1100006 | Page count | Integer | The number of pages of the document

A.12 File
ID: 121 | EC: File | PT: Physical Entity
Description: A computer file is a block of arbitrary information or resource for storing information, which is available to a computer program and is usually based on some kind of durable storage. A file is durable in the sense that it remains available for programs to use after the current program has finished.
Attributes:
ID | Name | Data Type | Description
1210000 | Creator | <Agent>[] | Creator of the file
1210001 | URL | String | The URL pointing to the physical location where the file is stored
1210002 | Format | String | The format of the file, denoting a particular way to encode information
1210003 | Size | Long | Measures the actual amount of disk space consumed by the file (in bytes)
1210004 | Tag | SString[] | Keywords or terms associated with or assigned to a file
1210005 | Hyperlink | <File>[] | A link from a file to another file
1210010 | Creation Time | Moment | The time at which the file was created
1210011 | Modification Time | Moment | The time at which the file was changed

A.13 Event
ID: 130 | EC: Event | PT: Entity
Description: Something that happens at a given place and time.
Attributes:
ID | Name | Data Type | Description
1300000 | Venue | <Location>[] | The place where the event occurs (e.g., city, city quarter, building, set of locations, etc.)
1300001 | Event Status | Concept | The current status of the event (e.g. expected, canceled, confirmed)

A.14 Task
ID: 131 | EC: Task | PT: Event
Description: Any piece of work that is undertaken or attempted.
Attributes:
ID | Name | Data Type | Description
1310000 | Participant | <Role>[] | The list of participants to the event

A.15 Role
ID: 500 | EC: Role | PT: -
Description: The specific frame of activities played by an agent that takes part to a specific event.
Attributes:
ID | Name | Data Type | Description
5000000 | Agent | <Agent> | The agent playing the role

A.16 Address
ID: 600 | EC: Address | PT: -
Description: A collection of information used for describing the location of a building, apartment, or other structure or a plot of land, generally using political boundaries and street names as references, along with other identifiers such as house or apartment numbers.
Attributes:
ID | Name | Data Type | Description
6000000 | Address | NLString | The full specification of the address
6000001 | Postcode | String | A code of letters and digits added to a postal address to aid in the sorting of mail
6000002 | Country | <Location> | The territory occupied by a nation
6000003 | Administrative division | <Location> | A district defined for administrative purposes
6000004 | Place | <Location> | The locality within the administrative division (a city, town, village, or other agglomeration of buildings where people live and work)
6000005 | Way | <Location> | The road, the square or any thoroughfare of the locality where the facility is located
6000006 | House number | String | Unique number given to each building in a street area
6000007 | Apartment number | String | Number for further sub-dwellings internal to a building (e.g., door, floor, etc)
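The ID / EC / PT / Attributes layout used throughout this appendix can be read as a simple type hierarchy. The sketch below illustrates this with the Entity, Event and Task definitions; the class names are hypothetical, only two of the Entity attributes are reproduced, and the assumption that an entity type inherits the attributes of its parent type (PT) is ours rather than something stated in this appendix.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Attribute:
    attr_id: int
    name: str
    data_type: str


@dataclass
class EntityType:
    etype_id: int
    name: str                       # the EC column
    parent: Optional["EntityType"]  # the PT column (None when no parent is given)
    attributes: list = field(default_factory=list)

    def all_attributes(self):
        """Own attributes preceded by those of every ancestor type (assumed inheritance)."""
        inherited = self.parent.all_attributes() if self.parent else []
        return inherited + self.attributes


ENTITY = EntityType(100, "Entity", None, [
    Attribute(1000000, "Name", "NLString[]"),
    Attribute(1000001, "Class", "Concept"),
])
EVENT = EntityType(130, "Event", ENTITY, [
    Attribute(1300000, "Venue", "<Location>[]"),
    Attribute(1300001, "Event Status", "Concept"),
])
TASK = EntityType(131, "Task", EVENT, [
    Attribute(1310000, "Participant", "<Role>[]"),
])
```

Under this reading, `TASK.all_attributes()` would list the Entity attributes first, then those of Event, then Participant, which is one way a peer's KB could specialize the platform-wide entity types.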
B Study on Personal Data
Personal information can be generally defined as “data authored by an individual, that describe an individual, or that can be mapped to an individual” [16]. The complexity of this kind of data is shown by the plethora of different standards that emphasize various aspects of a person, e.g., business domain standards (vCARD [17] and hCARD⁸), health care domain standards (NIH Organizational Person Schema⁹), and authority records (ISAAR¹⁰). In the context of the Semantic Web, there have been attempts to capture personal data, e.g., schema.org¹¹ and the FOAF ontology,¹² striving for a sufficiently general representation. Our work on defining personal data starts from a philosophical point of view and divides them into essential and accidental properties. An essential property of an entity is a property that it must have, while an accidental property of an entity is one that it happens to have but that it could lack. Note also that essential and accidental properties, or attributes in the context of entity types, have an equivalent representation in time. In fact, essential attributes tend to be static, whereas accidental properties tend to be dynamic. Admittedly, static and dynamic allow for degrees of time variance and several other differences. Intuitively, static data encompass all identifiers, e.g., IDs and SSNs, but there are static data that do not serve identity purposes, e.g., biological features such as hair colour, or languages known, which require many years of practice. As for dynamic data, while they usually rely on sensor inputs, not all of them represent aspects of users that are directly quantifiable by sensors. For instance, clothing, interests and mental states are clearly dynamic; however, they cannot be inferred from sensors. These issues cannot be attributed to the data per se, but rather reflect technological limitations; foreseeable technological advancements may change that in the near future.
These philosophical assumptions are combined with a review of the available standards for representing the Person etype attributes, in addition to the core attributes already present in MS8. By core attributes we mean the most essential attributes of a person, which we obtained by comparing all attributes of the different standards shown in Table 17 and keeping only those shared by the majority of them, e.g., name, gender, nationality, and so on. Our approach is based on the assumption that if standards of different domains share attributes, then these attributes are context independent. W3C has a standard with a similar scope, called Person Core Vocabulary, that provides a minimum set of classes and properties for describing a person, i.e., essential demographics. A complete comparison of the standards and their representation of Person attributes can be found in Table 18.

8. http://microformats.org/wiki/hcard
9. http://nedinfo.nih.gov/amgtech/docs/schema/current.html
10. http://www.ica.org/10206/standards/standards-list.html
11. schema.org/Person
12. http://xmlns.com/foaf/spec/

Following our initial analysis reported in D4.1, we identify different dimensions of person attributes by taking into account the assumptions described above and the review of existing standards.
Essential attributes can be divided into:
1. Identifying attributes: these attributes are used to uniquely refer to an entity. The strength of their reference depends on their scope, e.g., a name could be used as an identifier in many contexts but it is not as strong as an ID issued by a country.
2. Dates and Places of Birth and Death: these attributes define unchangeable points in time and space and are always true.
3. Languages: languages are a particular kind of quality of a person that requires a long process to obtain, so, while not completely static, they tend to remain the same throughout a person's life.
Accidental attributes can be divided into:
1. Sensor-based attributes: as the name suggests, these attributes originate from sensors; they cover the quantifiable dimensions of a person, e.g., position and biometrics.
2. Descriptive attributes: these are dynamic attributes that cannot be defined through sensors but do not represent essential aspects of a person, e.g., hair and eye colour.
3. Domain-dependent attributes: these attributes are tied to a specific context, which means they are often true within said context. Similarly to sensor-based attributes, they show a high degree of time variance, but cannot be understood via sensors. The contexts are generally social, therefore these attributes cover dimensions such as:
(a) Skills, e.g., swimming
(b) Preferences and interests, e.g., hobbies
(c) Education and occupation, e.g., degrees
(d) Social relations, e.g., kinship
(e) Contacts, e.g., phone number(s)
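The selection of core attributes described in this appendix (keeping only the attributes shared by the majority of the reviewed standards) can be sketched as a simple counting step. The function name is hypothetical, and the sketch assumes that attribute names have already been normalised across standards (e.g., that `familyName` and `sn` were both mapped to a common label), which is the harder part of the comparison and is not shown here.

```python
from collections import Counter


def core_attributes(attributes_by_standard):
    """Return the attributes that appear in more than half of the standards.

    `attributes_by_standard` maps a standard name to the collection of
    (already normalised) attribute names it defines.
    """
    counts = Counter()
    for attributes in attributes_by_standard.values():
        counts.update(set(attributes))  # count each attribute once per standard
    threshold = len(attributes_by_standard) / 2
    return {attribute for attribute, count in counts.items() if count > threshold}
```

As a purely illustrative input (not data from the study), three standards defining `{"name", "gender", "nationality"}`, `{"name", "gender", "blood type"}` and `{"name", "nationality"}` would yield `name`, `gender` and `nationality` as core attributes, since each of them appears in at least two of the three standards.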
Table 17: Personal data standards
Standard | Reference | Description
vCARD | [17] | A file format standard for electronic business cards, and an IETF standards-track specification
hCARD | http://microformats.org/wiki/hcard | A format for publishing people on the web, using a 1:1 representation of vCard
CIQ TC | http://docs.oasis-open.org/ciq/v3.0/cs/specs/ciq-xprl-specs-cs.pdf | A set of XML specifications for representing personal data independent of any culture, geographical location, application or industry
ISAAR | http://www.ica.org/10206/standards/standards-list.html | Provides guidance for preparing archival authority records for describing entities associated with the creation and maintenance of archives
FOAF | http://xmlns.com/foaf/spec/ | A computer language defining a dictionary of people-related terms that can be used in structured data
NIH ORGANIZATIONAL PERSON SCHEMA | http://nedinfo.nih.gov/amgtech/docs/schema/current.html | A standard representing people of the health care domain
SCHEMA.ORG | schema.org/Person | A collection of schema used to mark up webpages
THE PERSON CORE VOCABULARY | http://www.w3.org/ns/person | It provides a minimum set of classes and properties for describing a natural person
Table 18: Comparison of standards for personal data
Standard | Name attributes
ISAAR | Names
CIQ TC | Name
SCHEMA.ORG | additionalName; familyName; givenName; honorificPrefix; honorificSuffix
FOAF | familyName; lastName; family name; firstName; surname
THE PERSON CORE VOCABULARY | patronymicName; birthName
vCARD | Name; Formatted Name; Nickname; Title
NIH ORGANIZATIONAL PERSON SCHEMA | Common name; personalTitle; givenName; middleName; Title; sn; generationQualifier; nihCommonGivenName; nihCommonMiddleName; nihCommonSn; nihSuffixQualifier; initials
hCARD | fn; family-name; given-name; additional-name; honorific-prefix; honorific-suffix; nickname
Places and/or geographical areas of residence
Address
Identification details;
Contact numbers
Electronic address
ISAAR
CIQ TC
Telephone
Address; contactPoint;
SCHEMA.ORG
FOAF
Unique Identifier;
Official Website
THE PERSON CORE VOCABULARY residency
serialNumber; nihDupUID;
TEL
Email; URL; IMcontact;
Delivery Address; Label Address;
vCARD
mail; Email forwarding address; homePhone; homeFax; Personal email address;
NIH ORGANIZATIONAL PERSON SCHEMA homePostalAddress; nihDeliveryAddress; postalAddress; postalCode;
tel
post-office-box; extended-address; street-address; locality; region; postal-code; country-name; type; value; Email; url;
hCARD
ISAAR
Dates and places of existence
Occupation; sphere of activity;
Relationships
Nationality
CIQ TC
Birth
Qualification
Relationships
Nationality Status Language
Spouse; Sibling; relatedTo; parent; knows; colleague; children; alumniOf; Nationality
workLocation; worksFor; affiliation; jobTitle; memberOf
birthDate; deathDate;
SCHEMA.ORG
workInfoHomepage; pastProject; schoolHomepage; workplaceHomepage; knows
FOAF
citizenship
THE PERSON CORE VOCABULARY placeOfBirth; placeOfDeath; countryOfBirth; countryOfDeath; Occupation
SPOUSE
ROLE; ORG; MANAGER; ASSISTANT;
BDAY; ANNIVERSARY;
vCARD
NIH ORGANIZATIONAL PERSON SCHEMA
role
bday
hCARD
CIQ TC
ISAAR
SCHEMA.ORG
FOAF
THE PERSON CORE VOCABULARY
PHOTO; Logo; CARD PICTURE;
vCARD
NIH ORGANIZATIONAL PERSON SCHEMA photo
hCARD
C Aligning WP1 and WP2 Formal Models
In this appendix we analyze the relation between the different models proposed as part of the foundational notions during the first year of the SmartSociety project. In particular, we focus on the models of WP1 and WP2. We aim to clarify the scope of these models and how they can be aligned in order to allow interoperability with other components of the SmartSociety platform. For instance, we are thinking of the integration of the Peer Manager with the provenance service developed by WP2.

During the first year of the project, WP1 and WP2 each came up with their own models. WP1 focused on the way hybridity and diversity can be captured via a semantic schema and its instantiation for the representation of real world entities (e.g. persons, locations, events), with an underlying ontology specifying the meaning of the terms used. WP2 focused on provenance as a fundamental tool to support traceability. In some sense, WP1 models capture the present (the current properties of the entities) and WP2 models capture the past (who did what). To clarify the relation between such models, we need to distinguish among the following levels (Figure 11):
• Data schema level: D1.1 and D4.1 describe how to define data schemas that we call entity types (also briefly summarized in subsection 2.1). Similarly to databases, an entity type specifies constraints on the attributes and values that entities of that kind can instantiate. Differently from databases, this is done with explicit semantics (i.e. we use ontologies to specify the meaning of the terms used). For instance, we can define persons as entities having a name, a gender and a date of birth. The ontology specifies that the term date of birth means “the date on which a person was born”. Similarly to object-oriented databases, entity types form a hierarchy where child entity types specialize parent entity types. For instance, we can define the entity type person as more specific than agent.
• Data level: Entity types are instantiated in terms of what we call entities. Each entity instantiates exactly one entity type, and its attributes and their values are consistent with what is defined by the entity type. For instance, we can define the person John Doe as having a gender attribute whose value is male.
• Metadata provenance schema level: the PROV ontology provides the provenance schema in terms of Agents, Activities and Entities.
• Metadata provenance level: Provenance metadata instantiates the PROV ontology schema by labeling data (i.e. it is an instance of metadata and not of data). For example, it can specify that the person John Doe published an offer of a ride in the Ride Sharing application. Notice that both the description of John Doe and of the offer, as entities, are at the data level and are referred to via URIs.

[Figure 11 shows the four levels. DATA SCHEMA (entity types): Person (Name: string; Gender: concept; Date of birth: date) and Offer (Origin: location; Destination: location; Time: date), with the ONTOLOGY defining Date of birth as (81257) “the date on which a person was born”. DATA: Person #1 (Name: John Doe; Gender: male; Date of birth: 1970-01-05) and Offer #2 (Origin: Trento; Destination: Rome; Time: 2015-02-25). METADATA PROVENANCE SCHEMA and METADATA PROVENANCE: Person #1, Publish, Offer #2.]
Figure 11: Alignment of WP1 and WP2 formal models
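The four levels can be illustrated with a small data-structure sketch that reuses the John Doe and ride offer values from Figure 11. This is only an illustration under stated assumptions and not the SmartSociety implementation: the classes EntityType, Entity and ProvenanceRecord, the urn:example URIs, and the choice to place Name on the Agent type are hypothetical. The provenance record is a simplified stand-in for a PROV statement rather than a use of the PROV ontology itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass(eq=False)
class EntityType:
    """Data schema level: attribute names mapped to data types."""
    name: str
    attributes: Dict[str, str]
    parent: Optional["EntityType"] = None

    def all_attributes(self) -> Dict[str, str]:
        # Child entity types specialize (and inherit from) their parents.
        inherited = self.parent.all_attributes() if self.parent else {}
        return {**inherited, **self.attributes}

    def is_a(self, other: "EntityType") -> bool:
        current: Optional[EntityType] = self
        while current is not None:
            if current is other:
                return True
            current = current.parent
        return False


@dataclass
class Entity:
    """Data level: an instance of exactly one entity type."""
    uri: str
    etype: EntityType
    values: Dict[str, object] = field(default_factory=dict)

    def validate(self) -> None:
        unknown = set(self.values) - set(self.etype.all_attributes())
        if unknown:
            raise ValueError(f"{self.uri}: attributes not in type: {sorted(unknown)}")


@dataclass
class ProvenanceRecord:
    """Metadata provenance level: refers to data-level entities by URI only."""
    agent_uri: str
    activity: str
    entity_uri: str


# Hypothetical schema: Person specializes Agent, as in the text.
agent = EntityType("Agent", {"Name": "string"})
person = EntityType("Person", {"Gender": "concept", "Date of birth": "date"}, parent=agent)
offer = EntityType("Offer", {"Origin": "location", "Destination": "location", "Time": "date"})

john = Entity("urn:example:person/1", person,
              {"Name": "John Doe", "Gender": "male", "Date of birth": "1970-01-05"})
ride = Entity("urn:example:offer/2", offer,
              {"Origin": "Trento", "Destination": "Rome", "Time": "2015-02-25"})
published = ProvenanceRecord(john.uri, "Publish", ride.uri)
```

Keeping only URIs in ProvenanceRecord mirrors the point made above: the descriptions of John Doe and of the offer stay at the data level, while the provenance metadata merely labels them.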
D Ethical and Privacy Issues
In this appendix we explore how the Smart Society project pays attention to issues of privacy, ethics and social values, and expand upon issues associated more generally with ‘big data’ and profiling-driven approaches. In particular, we draw attention to the extensive scope of the Smart Society vision and its capacity to transform our lives, in order to highlight the importance of paying close attention to these issues. To do this we draw on existing literature detailing the ethical challenges that now confront us from increasing levels of digital mediation within our everyday lives.

The reference to ‘Society’ in the Smart Society name underlines the extensive ambition of the project. Examples and scenarios generated within the project encompass Tourism, Care, Health and Policing, and span from grand aims of solving problems of sustainability to assisting with the mundane practicalities of finding somewhere to eat in an unfamiliar town. This breadth and depth underlines the vast scope and everyday pervasiveness implied by the ambition of a ‘Smart Society’ that aims to address ‘societal challenges’ and operate at ‘internet scale’ whilst at the same time penetrating into the mundane aspects of many of our everyday routines and activities. The aim is not to leave these activities unaltered, but rather to supercharge them by linking individuals into collectives to access collectives’ problem solving and self-organizing abilities, and to draw upon portable devices, sensors, data and algorithms to assist in ‘orchestrating’ these newly collectivized activities.

Part of the motivation for Smart Society stems from the perception that the existing accumulations of digital mediation for everyday activities have until now been technically untidy, due to a lack of appropriate engineering principles, and ethically haphazard, as a consequence of being unplanned and undirected. Hence Smart Society’s twin foci on engineering and ethics. Improved engineering is seen to address the problem of ethics by providing a structured and therefore more considered process for creating such systems. Ethics helps solve engineering problems by guiding the engineers towards solutions that preserve certain important social values, such as privacy.

In the context of increasingly digitally augmented lives, the performance of everyday activities now involves numerous data streams leading to vast accumulations of data. This data is viewed as a resource for solving a wide array of social problems, but its misuse is also seen as threatening our privacy and autonomy. Goodman, analyzing ethical issues in the era of personal ‘big data’, draws attention to the following attributes that carry these types of risk [18]:
• “The sensor infused world”. The sheer array of sensors, devices and, increasingly, everyday objects interconnected via online infrastructures passively generates increasing quantities of data from an ever wider range of activities.
• “Data as commodity”. Data has become valuable in its own right, beyond the services which generated it, leading to important questions as to who gets to realize value from data and for what purposes. Nissenbaum’s concept of ‘data integrity’ maintains that use of data should be consistent with the values attached to the activities producing the data [19].
• “Opacity of back-end information exchange”. The many ways by which data circulate and are traded are hidden from view, as are the ultimate purposes to which those data may be put. So the ways that such data are subsequently used to filter or shape our experience of the world are often concealed.
• “Mass scale”. This is happening on an unprecedented scale and in ways that do not differentiate between diverse cultural expectations about privacy and data use.

Smart Society could be prone to the hazards described by Goodman for social platforms if data is similarly centralized and its stewardship remains in the hands of platform operators. Providing tools and policies that extend control of data to users and that bind operators to principles of transparency are important challenges for Smart Society, particularly where this creates technical and operational inconveniences.

In the era of personal ‘big data’ it is common to use data to profile individuals and stratify populations as a means of tailoring or individualizing experience, e.g. by targeting advertising, creating recommendations or tailoring services. Profiles are used within Smart Society to involve peers in collectives, perhaps to solve problems, based upon their experience, skills and reputation. We describe below the importance of profiles for Smart Society, but first we enumerate some of the hazards of profiling already identified in the literature.
D.1 Social and Ethical Issues of Profiles
• Social sorting. Social sorting refers to how profiling technologies sort individuals into categories in order to affect their experiences and opportunities. Examples include banks routing calls from wealthier customers to deliver speedier service at the expense of less well-off customers, or internet service providers giving priority to certain traffic or favored customers. Negative effects of sorting include reinforcing existing social divisions and creating new yet invisible hierarchies of access and privilege [20].
• Autonomy and self-determination. Often profiles are created and used without our knowledge or consent, and the ways that our experience of services is modified by profiling are typically invisible. Where profiles are computed from data about us, we are subject to the values embedded in the algorithms used to sift that information, but not given a voice in the creation of those algorithms. When we create profiles ourselves (e.g. on LinkedIn or Facebook), we are still constrained in what we can express and have little control over how much a flat, partial identity may be read by others as a literal depiction of who we are.
• Diminishing diversity. “With commercial personalization services, the myriad of individual differences is reduced to one or a few consuming categories, on the basis of which their preferences, character, life-style and so forth are determined for a specific context. Because of its tendency to generalize, personalization may lead to diminishing preferences, differences and values...” [21]. A question raised during the Patras workshop underlines this point. The questioner characterized our experience of city life as in turns vivid, serendipitous, frustrating and pleasurable, and questioned how city life mediated through Smart Society ‘apps’ may lead to a dulling, standardisation and impoverishment of these sorts of experience. It is important for Smart Society to retain elements of fun, chance and discovery, to provide an experience that is enriching and complementary to existing beneficial forms of city life, and to avoid the tendency of computer science to frame problems narrowly in terms of optimization.

Profiles are a crucial component within the Smart Society platform, and it is the Peer Profile which gives participants their identity within the system and thereby governs the relationship between individuals and the collectives in which they may become involved.
The Smart Society Peer Profile codifies the participant’s reputation, interests, expertise and actions. The system uses this information to work out if it can recruit the ‘peer’ to contribute to solving a problem. In this way the peer profile plays a role in determining participants’ opportunities within the system. Given the extent of the Smart Society vision, this could imply significant advantages or deficits in life-chances where profiles govern participants’ access to culture, education and healthcare, and their ability to engage in economic activity. In the context of Smart Society, people’s very participation in civil society may be at stake.

Some privacy risks of profiling are being addressed within Smart Society, as presented in the main part of this deliverable. In particular, risks to autonomy are addressed by creating mechanisms that give the participant ownership of their profile by hosting it on their device; by allowing the user to edit or amend any aspect; and by specifying policies describing in what circumstances the data may be shared across the platform.
D.2 Privacy Issues of Profiles
Peer profiling may affect privacy in different respects. As the Council of Europe has discussed in its recommendation CM/Rec(2010)13 on profiling [22], the collection, linking, calculation, comparison and statistical correction of data with the objective of creating profiles may have significant privacy impacts, as profiling enables a person’s personality, behavior, interests and habits to be determined, analyzed and/or predicted. Often such profiling even happens without the knowledge of the individuals concerned. While profiling may offer benefits for users and society at large, e.g. by providing users with targeted and better services addressing personal and societal interests or by permitting an analysis of risks and fraud, profiling techniques can also have an impact on the individuals concerned by placing them in predetermined categories, and may unjustifiably deprive them of access to certain services and thereby discriminate against individuals, as we also pointed out above [22]. Moreover, profiling techniques not only allow the analysis of data that are actually recorded, but also allow information to be statistically predicted or implicitly derived from such records. For instance, sensitive data, including political opinions, religious beliefs, intelligence or sexual orientation, can be automatically predicted from Facebook Likes (see e.g., [23]).

During the workshop at Patras, the following more specific privacy questions, mostly related to peer profiling, were raised and discussed, but not finally answered, which implies that they still largely remain challenges to be addressed within Smart Society:
• How can privacy interests of “collectives” (consisting of several individuals and/or machines) be protected? How can collectives be formed in an anonymous manner, i.e. in a way that does not relate to any identified or identifiable person? Can privacy policy languages be extended to define privacy preferences of Collectives and negotiate privacy policies for Collectives? Is it a challenge to define or jointly agree on privacy preferences for Collectives in regard to personal data that they have in common or share, or can group decisions and crowdsourcing on privacy preference settings enable or motivate users to spend more effort on privacy preference management?
• In hybrid systems, peer profiles of machines could include personal data of one or even several data subjects. For instance, in the application of Smart Society to a care scenario, sensors may capture data about when and for how long health care professionals and patients have met. This implies that the sensor readings may reveal personal information about both the health care professionals and the patients. Under which conditions can data subjects exercise their data subject rights over data that also relate to other data subjects? For example, if the data is only intended for the health care professionals to organise their work, the patient (or their relatives) may still have the right to access data items that relate to them (e.g., to check whether the patient gets the right treatment).
• Are anonymous credentials suitable PETs (privacy-enhancing technologies) for enhancing the privacy of passengers and drivers participating in a Smart Society enabled ‘ride sharing’ platform that is currently under implementation? Both drivers and passengers could be registered pseudonymously by the platform and prove only certain properties (e.g., possession of a driving license for more than five years, reputation scores). Will the use of anonymous credentials in this context be practically feasible and socially accepted?

Privacy-related questions concerning sensor data collection [9], trust and reputation [10], and provenance [11] were also raised. These topics are being addressed by Smart Society and are not discussed in this deliverable.
E Policy Example
In this appendix, we present a small example PPL policy. This example has purposely been kept small for presentation purposes and is only meant to give the reader an idea of how such a policy might look. The policy is not meant to be read by people, but rather by machines. In the Peer Manager such policies are meant to be used to regulate the access and release of personal data based on the user’s preferences. The final version of PPL used in Smart Society might offer a reduced set of PPL’s capabilities.

The policy in this example covers only the data items nickname, location and food preferences, and describes how they can be used in the smart food application. The policy consists of three rules, one for each of the data items. Each rule consists of two parts. The first part describes the data type and the access permissions to that data item. The second part describes the data handling policy for the data item. The data handling policy contains an authorization part that describes the purposes for which the data item can be used and an obligation part that stipulates how the data item is to be handled. In the current example:
• nickname can be used for contact, food recommendation and search purposes,
• location can be used for search and food recommendation purposes, with the obligation that access is logged when it is used for search purposes,
• and food preferences can be used for search and food recommendation purposes.
In all of the cases the data value is only allowed to be read. For a full description of the PPL language we refer the reader to (S. Trabelsi, G. Neven, D. Raggett (eds.), D5.3.4 Report on design and implementation, Deliverable, PrimeLife, 2011).

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ppl:PolicySet [
  <!ENTITY pplschemapath "schema/">
]>
<ppl:PolicySet
    xsi:schemaLocation="http://www.primelife.eu/ppl schema/PrimeLifeSchema.xsd
        http://www.primelife.eu/ppl/credential schema/PrimeLifeCredential.xsd
        http://www.primelife.eu/ppl/obligation schema/PrimeLifeObligation.xsd
        urn:oasis:names:tc:xacml:2.0:policy:schema:os http://docs.oasis-open.org/xacml/access_control-xacml-2.0-policy-schema-os.xsd"
    PolicySetId="#topLevel"
    PolicyCombiningAlgId="urn:oasis:names:tc:xacml:1.0:policy-combining-algorithm:permit-overrides"
    xmlns:ppl="http://www.primelife.eu/ppl"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:cr="http://www.primelife.eu/ppl/credential"
    xmlns:ob="http://www.primelife.eu/ppl/obligation"
    xmlns:xacml="urn:oasis:names:tc:xacml:2.0:policy:schema:os">
  <xacml:Target/>
  <ppl:PolicySet PolicySetId="#SmartFood"
      PolicyCombiningAlgId="urn:oasis:names:tc:xacml:1.0:policy-combining-algorithm:permit-overrides">
    <xacml:Target/>
    <ppl:Rule Effect="Permit" RuleId="SmartFood#Nickname">
      <xacml:Target>
        <xacml:Resources>
          <xacml:Resource>
            <xacml:ResourceMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:anyURI-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#anyURI">http://www.w3.org/2006/vcard/ns#nickname</xacml:AttributeValue>
              <xacml:ResourceAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#anyURI"
                  AttributeId="http://www.primelife.eu/ppl/UncertifiedAttributeType"/>
            </xacml:ResourceMatch>
          </xacml:Resource>
        </xacml:Resources>
        <xacml:Actions>
          <xacml:Action>
            <xacml:ActionMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">read</xacml:AttributeValue>
              <xacml:ActionAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#string"
                  AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"/>
            </xacml:ActionMatch>
          </xacml:Action>
        </xacml:Actions>
      </xacml:Target>
      <ppl:DataHandlingPreferences>
        <ppl:AuthorizationsSet>
          <ppl:AuthzUseForPurpose>
            <ppl:Purpose>http://www.w3.org/2002/01/P3Pv1/contact</ppl:Purpose>
            <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
            <ppl:Purpose>http://www.smartSociety.eu/2014/PPLv11/food-recomendation</ppl:Purpose>
          </ppl:AuthzUseForPurpose>
          <ppl:AuthzDownstreamUsage allowed="false"/>
        </ppl:AuthorizationsSet>
        <ob:ObligationsSet/>
      </ppl:DataHandlingPreferences>
    </ppl:Rule>
    <ppl:Rule Effect="Permit" RuleId="SmartFood#Location">
      <xacml:Target>
        <xacml:Resources>
          <xacml:Resource>
            <xacml:ResourceMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:anyURI-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#anyURI">http://www.w3.org/2006/vcard/ns#Location</xacml:AttributeValue>
              <xacml:ResourceAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#anyURI"
                  AttributeId="http://www.primelife.eu/ppl/UncertifiedAttributeType"/>
            </xacml:ResourceMatch>
          </xacml:Resource>
        </xacml:Resources>
        <xacml:Actions>
          <xacml:Action>
            <xacml:ActionMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">read</xacml:AttributeValue>
              <xacml:ActionAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#string"
                  AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"/>
            </xacml:ActionMatch>
          </xacml:Action>
        </xacml:Actions>
      </xacml:Target>
      <ppl:DataHandlingPreferences>
        <ppl:AuthorizationsSet>
          <ppl:AuthzUseForPurpose>
            <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
            <ppl:Purpose>http://www.smartSociety.eu/2014/PPLv11/food-recomendation</ppl:Purpose>
          </ppl:AuthzUseForPurpose>
          <ppl:AuthzDownstreamUsage allowed="false"/>
        </ppl:AuthorizationsSet>
        <ob:ObligationsSet>
          <ob:Obligation>
            <ob:TriggersSet>
              <ob:TriggerPersonalDataAccessedForPurpose>
                <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
                <ob:MaxDelay>
                  <ob:Duration>P0Y0M0DT0H2M0.000S</ob:Duration>
                </ob:MaxDelay>
              </ob:TriggerPersonalDataAccessedForPurpose>
            </ob:TriggersSet>
            <ob:ActionLog/>
          </ob:Obligation>
        </ob:ObligationsSet>
      </ppl:DataHandlingPreferences>
    </ppl:Rule>
    <ppl:Rule Effect="Permit" RuleId="SmartFood#Food Preferences">
      <xacml:Target>
        <xacml:Resources>
          <xacml:Resource>
            <xacml:ResourceMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:anyURI-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#anyURI">http://www.smartsociety/2014/PPLv11/ns#food-preferenses</xacml:AttributeValue>
              <xacml:ResourceAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#anyURI"
                  AttributeId="http://www.primelife.eu/ppl/UncertifiedAttributeType"/>
            </xacml:ResourceMatch>
          </xacml:Resource>
        </xacml:Resources>
        <xacml:Actions>
          <xacml:Action>
            <xacml:ActionMatch MatchId="urn:oasis:names:tc:xacml:1.0:function:string-equal">
              <xacml:AttributeValue DataType="http://www.w3.org/2001/XMLSchema#string">read</xacml:AttributeValue>
              <xacml:ActionAttributeDesignator DataType="http://www.w3.org/2001/XMLSchema#string"
                  AttributeId="urn:oasis:names:tc:xacml:1.0:action:action-id"/>
            </xacml:ActionMatch>
          </xacml:Action>
        </xacml:Actions>
      </xacml:Target>
      <ppl:DataHandlingPreferences>
        <ppl:AuthorizationsSet>
          <ppl:AuthzUseForPurpose>
            <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
            <ppl:Purpose>http://www.smartSociety.eu/2014/PPLv11/food-recomendation</ppl:Purpose>
          </ppl:AuthzUseForPurpose>
          <ppl:AuthzDownstreamUsage allowed="false"/>
        </ppl:AuthorizationsSet>
        <ob:ObligationsSet/>
      </ppl:DataHandlingPreferences>
    </ppl:Rule>
  </ppl:PolicySet>
</ppl:PolicySet>
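To make the structure of such a policy more concrete, the following sketch shows how a machine could read the authorization and obligation parts of a rule. It is an illustration only and not a PPL engine: the function decide, the deny-by-default behaviour and the shortened embedded policy (which keeps only the Location rule and omits the schema references and action matching) are assumptions made for this example. It uses only the Python standard library.

```python
import xml.etree.ElementTree as ET

NS = {
    "ppl": "http://www.primelife.eu/ppl",
    "ob": "http://www.primelife.eu/ppl/obligation",
    "xacml": "urn:oasis:names:tc:xacml:2.0:policy:schema:os",
}

LOCATION = "http://www.w3.org/2006/vcard/ns#Location"
SEARCH = "http://www.w3.org/2006/01/P3Pv11/search"
CONTACT = "http://www.w3.org/2002/01/P3Pv1/contact"

# Shortened policy: only the Location rule, with one permitted purpose.
POLICY = """<ppl:PolicySet xmlns:ppl="http://www.primelife.eu/ppl"
    xmlns:ob="http://www.primelife.eu/ppl/obligation"
    xmlns:xacml="urn:oasis:names:tc:xacml:2.0:policy:schema:os">
  <ppl:Rule Effect="Permit" RuleId="SmartFood#Location">
    <xacml:Target><xacml:Resources><xacml:Resource><xacml:ResourceMatch>
      <xacml:AttributeValue>http://www.w3.org/2006/vcard/ns#Location</xacml:AttributeValue>
    </xacml:ResourceMatch></xacml:Resource></xacml:Resources></xacml:Target>
    <ppl:DataHandlingPreferences>
      <ppl:AuthorizationsSet><ppl:AuthzUseForPurpose>
        <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
      </ppl:AuthzUseForPurpose></ppl:AuthorizationsSet>
      <ob:ObligationsSet><ob:Obligation>
        <ob:TriggersSet><ob:TriggerPersonalDataAccessedForPurpose>
          <ppl:Purpose>http://www.w3.org/2006/01/P3Pv11/search</ppl:Purpose>
        </ob:TriggerPersonalDataAccessedForPurpose></ob:TriggersSet>
        <ob:ActionLog/>
      </ob:Obligation></ob:ObligationsSet>
    </ppl:DataHandlingPreferences>
  </ppl:Rule>
</ppl:PolicySet>"""


def decide(policy_xml, resource, purpose):
    """Return (permitted, must_log) for using `resource` for `purpose`."""
    root = ET.fromstring(policy_xml)
    for rule in root.iter("{%s}Rule" % NS["ppl"]):
        value = rule.find(".//xacml:ResourceMatch/xacml:AttributeValue", NS)
        if value is None or (value.text or "").strip() != resource:
            continue  # this rule is about a different data item
        allowed = {p.text.strip()
                   for p in rule.findall(".//ppl:AuthzUseForPurpose/ppl:Purpose", NS)}
        if rule.get("Effect") != "Permit" or purpose not in allowed:
            return False, False
        must_log = False
        for obligation in rule.findall(".//ob:Obligation", NS):
            triggers = {p.text.strip() for p in obligation.findall(
                ".//ob:TriggerPersonalDataAccessedForPurpose/ppl:Purpose", NS)}
            if purpose in triggers and obligation.find("ob:ActionLog", NS) is not None:
                must_log = True
        return True, must_log
    return False, False  # assumption: no matching rule means no access
```

With this shortened policy, a request to use the location for search is permitted and carries the logging obligation, while a request to use it for contact purposes is refused.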
F Implementation
The main focus of this year’s deliverable has been on establishing the Peer Manager’s underlying theory/model and defining its architecture within the SmartSociety Platform. Nevertheless, implementation work has also started this year, and a simple Proof of Concept was finished and integrated with similar results from other work packages (namely WP6, WP7 and WP8). This appendix contains a very high-level explanation of the currently implemented version, along with the details of the main API endpoints and JSON structures that were used for interacting and integrating with the current SmartSociety demo/pilot.
F.1 Implementation Details and Choices
As shown in Section 5 (more specifically in Fig. 8), the Peer Manager implementation is separated into two broad groups:
• The Peer Base Back-end is written in Java and requires commonly used frameworks like Hibernate (footnote 13) and Spring (footnote 14). From Fig. 8, everything inside the Peer Base (including the management of Knowledge Bases, Entity Bases, Users and Peers) is implemented in the Peer Base Back-end.
• The Peer Manager Front-end is also written in Java and implements the two web APIs from Fig. 8 (namely the Peer Base Web API and the Peer Manager Web API). These APIs, along with some minimal data processing, mainly de-serialize the JSON-based HTTP calls they receive and call the methods of the Peer Base Back-end; the responses received from the Peer Base Back-end are then serialized by these APIs and sent back to the caller.
Due to the early nature of the code in this year’s Proof of Concept, no Peer Manager related code has yet been released. Nevertheless, we expect to release the much more complete and integrated version of the Peer Manager Front-End (due in M30 with D4.3) as open source.
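The division of labour described above (the front-end de-serializes a JSON call, invokes a back-end method and serializes the reply) can be sketched in a few lines. The sketch below is written in Python for brevity, whereas the actual components are written in Java, and it does not reproduce the real API. The class PeerBaseBackend, the function handle_request, the operation names createPeer and readPeer, and the JSON fields are all hypothetical.

```python
import json


class PeerBaseBackend:
    """Hypothetical in-memory stand-in for the Peer Base Back-end."""

    def __init__(self):
        self._peers = {}
        self._next_id = 1

    def create_peer(self, name):
        peer = {"id": self._next_id, "name": name}
        self._peers[peer["id"]] = peer
        self._next_id += 1
        return peer

    def read_peer(self, peer_id):
        return self._peers[peer_id]


def handle_request(backend, body):
    """De-serialize a JSON call, invoke the back-end, serialize the reply."""
    try:
        request = json.loads(body)
        operation = request["operation"]
        if operation == "createPeer":
            result = backend.create_peer(request["name"])
        elif operation == "readPeer":
            result = backend.read_peer(request["id"])
        else:
            return json.dumps({"status": "error", "message": "unknown operation"})
    except (ValueError, KeyError, TypeError) as exc:
        # Malformed JSON, missing fields or unknown identifiers.
        return json.dumps({"status": "error", "message": str(exc)})
    return json.dumps({"status": "ok", "result": result})
```

Keeping the front-end this thin means that all state and business logic stay in the back-end, so the web APIs can change their wire format without touching the management of Knowledge Bases, Entity Bases, Users and Peers.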
F.2 Year Two Peer Manager Integrated Proof-of-Concept
The implemented functionalities were aimed primarily at this year’s (M24) integrated demo, and as such many important features of the Peer Manager (e.g. privacy protection) have not yet been addressed. The following is a quick list of the functionalities currently supported by the Proof of Concept version of the Peer Manager.

13 http://hibernate.org/
14 http://projects.spring.io/spring-framework/
• Knowledge Base Object Management: includes CRUD operations on all KB objects (e.g. Concepts and Etypes (footnote 15)).
• Entity Base Object Management: includes CRUD operations on all EB objects (e.g. attribute values, Entity Instances).
• Peer, User, Collective and Profile Management: includes the creation of Peers (and the generation of all of their internal structures), Users and Profiles (footnote 16) based on existing Entities. An early version of Collective creation is also present in this version due to partner requests.
• Search and Peer Matching: the general search (footnote 17) functionality used to match tasks (represented as query parameters) to a list of peers (represented as the result of the search operation).
It is also worth noting that the implementation of these functionalities is not final, so their details may change according to feedback from partners and other factors.
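The last item, matching a task expressed as query parameters against a list of peers, can be sketched as a simple filter. This is a minimal illustration under stated assumptions and not the search and ranking approach of Section 4: the function search_peers, the profile fields and the rule that every query parameter must be satisfied are hypothetical, and no ranking is applied.

```python
def _satisfies(actual, wanted):
    """A multi-valued attribute matches if it contains the wanted value."""
    if isinstance(actual, (list, set, tuple)):
        return wanted in actual
    return actual == wanted


def search_peers(peers, query):
    """Return the ids of peers whose profile satisfies every query parameter."""
    matches = []
    for peer in peers:
        profile = peer["profile"]
        if all(_satisfies(profile.get(key), value) for key, value in query.items()):
            matches.append(peer["id"])
    return matches


# Hypothetical peers and task for a ride sharing scenario.
PEERS = [
    {"id": "peer-1", "profile": {"role": "driver", "languages": ["it", "en"]}},
    {"id": "peer-2", "profile": {"role": "passenger", "languages": ["en"]}},
    {"id": "peer-3", "profile": {"role": "driver", "languages": ["de"]}},
]
TASK = {"role": "driver", "languages": "en"}
```

Calling search_peers(PEERS, TASK) selects only the first peer, since it is the only driver whose profile lists English.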
F.3 Proof-of-Concept Integration Calls Specification
This section specifies examples of the calls that the Proof of Concept version of the Peer Manager accepts and the replies (example and format) that it produces. These calls were used as the basis for integration in the production of this year’s SmartSociety integrated demo.
15 Please refer to Section 2 for more detailed information on the structures of the Knowledge Base and the Entity Base.
16 Please refer to Section 3 for more information about these structures.
17 Please refer to Section 4 for general information on Search and Ranking.
Figure 12: Call and response for searching all available peers for a generic task. This call is used by WP6 (Orchestration) to obtain a list of users (that represent peers) that matches their set requirement.
Figure 13: Call and response for creating a new collective from a set of passed users. This call is used by the WP8 developed client to convert the list of users given by WP6 into a single Collective.
Figure 14: Call and response for reading all the users from a given collective. WP7 (Middleware) uses this call to get the individual Users from a Collective with the intention of later contacting them.
Figure 15: Call and response for reading the contact information of a given peer. WP7 uses this call to get the specific contact preferences of a user in order to follow them when contacting that user.