D7.1 - Virtualization Techniques and Prototypes by Smart Society Project

SmartSociety Hybrid and Diversity-Aware Collective Adaptive Systems When People Meet Machines to Build a Smarter Society

Grant Agreement No. 600854

Deliverable 7.1, Work Package 7

Virtualization Techniques and Prototypes

Dissemination Level¹ (Confidentiality): PU
Delivery Date in Annex I: 31/12/2014
Actual Delivery Date: January 8, 2015
Status²: F
Total Number of pages: 138
Keywords: hybrid compute units, software-based services, human-based services, virtualization, communication, SmartCom

¹ PU: Public; RE: Restricted to Group; PP: Restricted to Programme; CO: Consortium Confidential as specified in the Grant Agreement
² F: Final; D: Draft; RD: Revised Draft

© SmartSociety Consortium 2013-2017

http://www.smart-society-project.eu

Disclaimer

This document contains material, which is the copyright of SmartSociety Consortium parties, and no copying or distributing, in any form or by any means, is allowed without the prior written agreement of the owner of the property rights. The commercial use of any information contained in this document may require a license from the proprietor of that information. Neither the SmartSociety Consortium as a whole, nor a certain party of the SmartSociety Consortium warrant that the information contained in this document is suitable for use, nor that the use of the information is free from risk, and accepts no liability for loss or damage suffered by any person using this information. This document reflects only the authors' view. The European Community is not liable for any use that may be made of the information contained herein.

Full project title: SmartSociety: Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Project Acronym: SmartSociety
Grant Agreement Number: 600854
Number and title of workpackage: 7 Programming Models and Frameworks
Document title: Virtualization Techniques and Prototypes
Work-package leader: Hong-Linh Truong, TUW
Deliverable owner: Hong-Linh Truong
Quality Assessor: Daniele Miorandi, UH

List of Contributors

Partner Acronym: Contributor
TUW: Philipp Zeppezauer, Ognjen Scekic, Hong-Linh Truong
UEDIN: Dimitrios Diochnos, Michael Rovatsos
UH: Tommaso Schiavinotto, Iacopo Carreras, Daniele Miorandi

Executive Summary

This report presents the outcome of the activities carried out within task T7.1, Virtualization Techniques and Prototypes, during the first year of the WP7 effort (M13–M24). The task investigated virtualization and communication techniques for abstracting the capabilities and communications of human/machine units (peers) and their collectives, so that these units and collectives can be programmed.

The document first describes the research problems of virtualizing human computing components together with software components for Hybrid and Diversity-aware Collective Adaptive Systems (HDA-CAS). A review of existing literature is presented, covering different types of systems in which human computing elements are involved, such as sociotechnical systems, human-based services, crowdsourcing systems, and workflow systems. These approaches are compared with approaches that virtualize purely machine-based elements, such as service-oriented computing.

The document then describes the key abstract concepts enabling the virtualization of both machine and human peers by leveraging the service-oriented computing and cloud computing models. Peers and their collectives are abstracted under the concept of the service unit, with clear interfaces and capabilities to enable virtualization and communication. The concept was devised to meet some of the elicited requirements and to facilitate the design of the architecture and functioning principles of the communication and virtualization middleware.

The document goes on to present a high-level architecture of the middleware component of the SmartSociety platform. The architecture design implements in practice the abstract concept of the service unit. Concretely, the architecture is described in terms of the structure of exchanged messages, the inner components, their functionality and their interconnectedness. The different components enable the conventional middleware communication functionalities in an HDA-CAS environment. Finally, the document presents the outcomes of the integration process between WP7 and other components in SmartSociety.

Table of Contents

1 Introduction
2 State of the Art
2.1 Social Computing Platforms
2.2 Cognate HDA-CAS Systems
2.3 Enterprise Service Busses (ESBs)
3 Software and Human Virtualization
3.1 Virtualizing Communication
3.2 Provisioning of Human- and Software-based Services
3.2.1 Capabilities and Connectedness
3.2.2 Non-functional Properties
4 Communication Middleware for HDA-CAS
4.1 Requirements Specification
4.1.1 Functional Requirements
4.1.2 Relations to other SmartSociety Platform Components
4.2 Service Units
4.3 SmartCom Design & Architecture
4.3.1 Messaging and Routing
4.3.2 Message Transformation
4.3.3 Message Persistence and Other Functionalities
4.3.4 External APIs
4.3.5 Evaluation
5 Programming Communications within SmartSociety Platform
5.1 SmartSociety Platform Integration
5.2 Usage and Application Scenarios
5.2.1 Orchestration
5.2.2 Ask SmartSociety!
6 Conclusions

1 Introduction

The term ‘virtualization’ as used in SmartSociety refers to the process of modeling hybrid (human and software) peers in a uniform way to allow their programmatic lifecycle management from SmartSociety’s programming model. The lifecycle management includes, among others, the phases of locating peers, provisioning and adapting collectives, executing tasks, monitoring, and subsequent rewarding. Therefore, the virtual (digital) entities representing the physical peers must be able to support these virtualization aspects. In this deliverable we look into the different virtualization techniques required to support these aspects. As interaction with peers is inherent to all of the listed lifecycle phases, virtualization of communication becomes an additional fundamental aspect that needs to be researched. For this reason, virtualization of communication is considered in much more detail throughout the document.

From the communication perspective, contemporary approaches to virtualizing human computing elements for integration into hybrid collective systems rely either on the service-oriented approach or on natural language communication (Section 2). The advantage of the former approach is the reuse of existing standards and technologies for matching, invoking and describing possible messages and quality of service. This allows integration of human-provided services into business processes that are known and formalized in advance. However, these approaches are not suitable for executing ad-hoc tasks, where the actual peer collective and/or the exact steps of the collaboration are not known in advance. In the latter approach, informal task descriptions are provided by task submitters and passed on to the human computing elements for interpretation. This approach is currently used by commercial crowdsourcing platforms, where the described problems with ad-hoc tasks are solved by reverting to mostly unstructured collaboration patterns, sacrificing task complexity in favor of simple tasks and coordination patterns managed through natural language communication with human peers.

From the programming model perspective, the virtualization depends on the labour model it needs to support. The two most common labour models are: (i) passively proposing tasks and waiting for human input; and (ii) actively finding and binding human capabilities into applications. The first model allows more flexible collaboration patterns, but because it relies on the peers to drive the collaborative effort, forming and provisioning a suitable collective to perform more complex or intellectually challenging tasks may prove difficult. The second model is better suited to executing complex tasks, but only if the execution steps are known at design time.

Labour model (i) usually exploits individual capabilities and is platform-specific. This is the typical approach used in today’s conventional crowdsourcing platforms, such as Amazon Mechanical Turk, CrowdFlower or Clickworker. The virtualization in this case is very loose (often relying on unconstrained human communication) and often highly system-specific (defined by the concrete crowdsourcing platform), and thus not appropriate for a general HDA-CAS platform like SmartSociety. The virtualization approaches for labour model (ii) are typical of workflow-based solutions in which the workflow engine can find suitable human or software services and assign tasks to them. In this case, the virtualization must be codified more formally, as the locating and provisioning of peers is driven by the system and not by the peers. Since the maturity of the existing Web Service models is proven in practice, in this deliverable we show how a similar service model approach can be used for virtualizing human peers.

From the WP7 perspective, the goal of the SmartSociety platform is to converge the above-described virtualization approaches into a system that manages provisioning, collaboration (composition and orchestration) and communication in a programmable way, allowing a profitable trade-off between the complexity offered by programmable collaboration patterns and the flexibility offered by collectives involving free interaction among human and machine-based peers. To support this requirement, we virtualize human capabilities under the service model, so that they can be programmatically discovered, matched, composed with other software and human services into hybrid collectives, and proactively invoked. This is the primary scientific novelty and contribution of this deliverable. At the same time, the proposed virtualization concepts do not prevent unconstrained and informal communication and collaboration with and among human and software peers. This dual approach will allow the programming model to support task execution in such a way that parts of the execution flow can be predetermined (workflow-driven), while other parts can be human-driven, and thus unconstrained.

In line with the DOW, part of the introduced virtualization concepts (in particular those relating to the virtualization of communication) have been implemented as SmartCom, the middleware for virtualization of communication with hybrid collectives consisting of human and software peers (T7.1). The remaining virtualization concepts introduced in this deliverable will rely on SmartCom’s communication functionalities and form the foundation of the programming model (T7.2). Currently, they have been implemented as research prototypes, described in the referenced papers.

The remainder of this deliverable is organized as follows: Section 2 presents the existing state of the art in related research areas. Section 3 presents the key virtualization concepts related to communication and provisioning. Section 4 presents the SmartCom communication middleware, including its design requirements, architecture, implementation and evaluation. Section 5 describes its integration into the SmartSociety platform and its usage scenarios. Section 6 reviews the achieved results and concludes the main part of the document. The Appendix includes the published papers, co-authored by the project members, containing the research background for the presented virtualization concepts. Along with the deliverable, a technical report is provided that presents in more detail the internal design and algorithms used in the SmartCom implementation, as well as the exposed APIs. This technical report is also meant as reference material for project partners wishing to integrate SmartCom. The report also describes the performance evaluation setup and results.

2 State of the Art

This section reviews the virtualization approaches in existing systems and research involving human and software-based peers engaged in collaborative efforts. To this end, we review the state of the art in social computing platforms, which involve virtualized human peers performing given tasks, and in cognate HDA-CAS research efforts. Since a significant portion of the T7.1 effort was dedicated to virtualization of communication and to building a prototype of the SmartCom communication middleware for the SmartSociety platform, the final part of this section reviews the current versions of the leading Enterprise Service Busses (ESBs). Although ESBs work with conventional Web Service technologies, many technical aspects are shared with SmartCom. In this respect, the review serves to elicit shared design requirements (Section 4.1) and to identify novel ones, needed specifically for HDA-CAS systems.

2.1 Social Computing Platforms

In D1.1 (Appendix III.14) and D6.1 (Sec. 7.2 and 7.3) we described a number of sociotechnical systems involving human and machine peers that are virtualized as entities executing activities in workflows, and/or invoked as web services. However, in those deliverables the focus is on the execution and orchestration aspects of these systems, while here we look at some of the same systems from the communication and virtualization perspective.

Jabberwocky [1] is a platform that utilizes human capabilities to solve problems. Jabberwocky’s computing stack consists of Dormouse, ManReduce and Dog. Dormouse provides the functionality to interact with humans and machines using a platform-independent programming environment that sits on top of other crowdsourcing platforms. ManReduce follows the MapReduce paradigm but lets the programmer decide whether to use a human or a machine in both the map and the reduce phase. Finally, Dog is a high-level scripting language that makes use of ManReduce and was created to increase the flexibility of the programming environment [1].

Amazon provides a web service called Amazon Mechanical Turk (http://www.mturk.com/) that lets clients issue small human tasks, so-called Human Intelligence Tasks (HITs). Human workers solve those tasks and receive a monetary reward after finishing successfully. Usually those tasks are independent of each other and can be created and solved in parallel. TurKit [2] provides a programming environment on top of Amazon Mechanical Turk that enables the programmer to define a workflow of tasks. It is able to collect the results of finished tasks and create further tasks based on those results [2].

CrowdLang [3] brings in a number of novelties in comparison with the other systems, primarily with respect to collaboration synthesis and synchronisation. It enables users to (visually) specify a hybrid machine-human workflow by combining a number of generic collaborative patterns (e.g., iterative, contest, collection, divide-and-conquer), and to generate a number of similar workflows by recombining the constituent patterns differently, in order to obtain a more efficient workflow. The use of human workflows also enables indirect encoding of inter-task dependencies.

All the described social computing systems, while differing in the ways they enact the collaborative effort of peers, share a similar approach to virtualization. All of them are frameworks, components or libraries layered on top of existing commercial human-computation platforms (such as Amazon Mechanical Turk, Clickworker, CrowdFlower). Fundamentally, therefore, virtualization and communication at the lowest level are determined by the functionalities offered by the underlying platform. For each workflow action, the underlying platform’s API is used to offer the corresponding human task to the crowd. In this respect, to a platform user, the virtualized concept of the peer in these systems is a programming language construct describing the peer’s capabilities, constraints, and promised rewards, which are passed on to the underlying platform, which ultimately provisions the peers to perform the task.

Unlike the SmartSociety platform, these systems do not allow explicit modeling of the relationships between people and machines, or uniform virtualization of the two concepts. Neither the concept of a collective nor unconstrained peer-to-peer communication is supported. Furthermore, these systems do not consider other virtualization aspects, such as elasticity.
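To make the last point concrete, the following is a minimal sketch of a peer virtualized as a programming construct whose fields are handed to an underlying platform. All names here (`PeerTaskSpec`, `FakeCrowdPlatform`, `post_task` and the payload keys) are hypothetical and chosen for illustration only; they do not correspond to the API of Jabberwocky, TurKit, CrowdLang or any commercial platform.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PeerTaskSpec:
    """Hypothetical construct describing what is asked of a human peer."""
    description: str
    required_skills: List[str]   # capabilities
    max_duration_minutes: int    # constraint
    reward_usd: float            # promised reward


class FakeCrowdPlatform:
    """Stand-in for an underlying crowdsourcing platform (not a real API)."""

    def __init__(self) -> None:
        self.posted: List[Dict[str, object]] = []

    def post_task(self, payload: Dict[str, object]) -> int:
        # Store the task and return a 1-based task identifier.
        self.posted.append(payload)
        return len(self.posted)


def submit(spec: PeerTaskSpec, platform: FakeCrowdPlatform) -> int:
    """Translate the construct into the platform's own vocabulary."""
    payload = {
        "title": spec.description,
        "qualifications": list(spec.required_skills),
        "time_limit_s": spec.max_duration_minutes * 60,
        "reward": spec.reward_usd,
    }
    return platform.post_task(payload)


if __name__ == "__main__":
    platform = FakeCrowdPlatform()
    spec = PeerTaskSpec("Translate a paragraph", ["english", "german"], 10, 0.5)
    print(submit(spec, platform), platform.posted[0])
```

In this sketch the framework only reshapes the description; who actually performs the task, and how the peer is reached, is decided entirely by the platform behind `post_task`.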

2.2 Cognate HDA-CAS Systems

The ALLOW Ensembles project deals with the concept of cell ensembles [4]. These ensembles consist of cells that are given a declaratively defined behaviour, but the actual workflows that they execute adapt at runtime depending on the collective (ensemble) goal. The focus here is on adaptation of workflows to achieve the adaptability of CAS, whereas the virtualization of and communication with human and machine elements is performed through standardized Web Service (WS-*) technologies. They use BPEL4Chor [5] for inter-cellular choreography. Using WS-BPEL4People [6] or Adaptive Pervasive Flows [7], one can also incorporate human activities.

The ASCENS project focuses on the peer-to-peer approach in machine-only ensembles/collectives (e.g., robots, vehicles, storage nodes) [8]. In order to deal with collectivity, the notion of ensemble compositionality is introduced and expressed through the notion of Service Components (SC), where SCs are described as “nodes that can cooperate, with different roles, in possibly open and nondeterministic environments. These basic properties, already satisfied by, e.g., contemporary service-oriented architectures, will be enriched by new properties of awareness” [9]. The concept of awareness here refers to the node’s ability to “determine that there is a (...) difference between the expected result of their action and the actual result; e.g., spinning the wheels should lead to movement in a certain direction, but did not influence their position at all.” [9] Collectivity is achieved by each node separately evaluating whether its action is helpful for reaching the ensemble’s overall goals. Similarly to SmartSociety, ASCENS models the fundamental constructs as service-based components; however, due to the specificity of the coordination languages it uses, at the communication layer it uses Pastry [10], extended with the SCRIBE protocol [11], to support message routing and delivery in a peer-to-peer fashion via any- and multi-casts [12]. This behaviour differs from our approach, since we provide a centralized middleware for message exchange while relying on an adaptive concept of a centrally managed collective.

2.3 Enterprise Service Busses (ESBs)

An Enterprise Service Bus (ESB) [13] is a software architecture that aims to handle the communication and integration between various software applications and components of a system (e.g., in an enterprise) in a service-oriented architecture. The following discussion of popular open-source ESBs is based on the 2011 comparison in [14], updated with respect to the currently available versions. Given the requirements for virtualization of communication in an HDA-CAS, this section only looks at messaging, integration with adapters, privacy and delivery policies, access control, and multi-tenancy of Mule ESB (3.5.0) [15], Apache ServiceMix (5.1.1) [16], JBoss ESB (4.12) [17], and WSO2 ESB (4.8.1) [18].

Mule ESB provides patterns for message routing but does not provide message normalization within the ESB; messages are translated only as needed. Like many other ESBs, Mule ESB supports custom adapters and ships with many predefined adapters. It fully supports different security protocols for access control. The ESB also provides some form of multi-tenancy but does not enforce any policies at runtime [15, 14].

Apache ServiceMix provides various patterns for message routing and uses normalized messages to integrate components. Through Apache Camel (http://camel.apache.org/), Apache ServiceMix can use various adapters to integrate components; furthermore, it is possible to define custom adapters to adapt the ESB to new technology. Apache ServiceMix does not support any privacy or delivery policies that can be applied to specific components, and it does not appear to provide multi-tenancy either. The ESB provides different security methods using JAAS to provide access control to the system [16, 14].

JBoss ESB also provides various patterns for message routing and uses normalized messages too. It also supports the creation of custom adapters, but these are less flexible than those of other ESB providers due to some restrictions. JBoss ESB provides access control for different components and also (limited) support for policies. The ESB also provides some form of multi-tenancy [17, 14].

WSO2 ESB supports all the functionality mentioned above except the dynamic enforcement of policies and is, considering only the discussed requirements, the most advanced ESB available [18, 14].

In general, all discussed middlewares provide adapters to communicate with components, and custom adapters to adapt to new technology. All ESBs, except Mule ESB, use normalized messages internally for messaging and routing. None of the presented ESBs supports addressing of collectives. Furthermore, support for humans interacting with the system is generally not considered.

3 Software and Human Virtualization

In deliverable D1.1 (Sec. 2.6.3 and 4.4) we introduced the principal idea of human and machine elements virtualized under the concept of service units. In this section we further develop these concepts and summarize the virtualization aspects needed to expose human capabilities under the service model, allowing seamless integration of human and machine-based computing elements.⁵ The two key virtualization aspects are: (a) virtualization of communication with/among peers; and (b) virtualization of human capabilities in such a way as to allow provisioning of human-based computing elements under a service model. In the remainder of the section we use the terms ‘human-based service’ (HBS) and ‘software-based service’ (SBS) to denote virtualized entities corresponding to individual human and machine peers, respectively. Figure 1 shows a conceptual architecture of a system providing virtualization of hybrid SBS/HBS’s, allowing layering of a programming model on top of it.

Figure 1: Conceptual architecture for virtualizing hybrid software-based (SBS) and human-based (HBS) services.

⁵ The virtualization elements are presented here in a summarized form. They are described in greater detail in the published papers [19, 20].

3.1 Virtualizing Communication

Conceptually, we can assume that an HBS abstracting human capabilities can provide different communication interfaces to handle tasks based on a request and response model. Requests can be used to describe tasks/messages that an HBS should perform or receive. In SBS, specific request representations (e.g., based on XML) are designed for specific software layers (e.g., application layer, middleware layer, or hardware layer). In HBS we can assume that a single representation can be used, as an HBS does not have the layered structure seen in SBS (in the end only humans will understand and process the messages), while the message content can be defined based on application needs. Requests in HBS can, therefore, be composed and decomposed into different (sub)requests. The use of the request/response model facilitates the integration between SBS and HBS via similar service APIs.

Unlike SBS, where communication can be synchronous or asynchronous, with HBS all communication is asynchronous. The reason is that the semantics of human-level communication messages, the way humans take in the messages, and the high latency of communication between a requester (whether an HBS or an SBS) and an HBS prevent synchronous communication, in which an HBS would be expected to process the messages and send responses at the same time.

HBS intercommunication can be modeled using the well-known message passing and shared memory models:

• Message passing: two HBS’s can directly exchange requests: $hbs_i \xrightarrow{\text{request}} hbs_j$. One example is that $hbs_i$ sends a request via SMS to $hbs_j$. Similarly, an SBS can also send a request directly to an HBS.
• Shared memory: two HBS’s can exchange requests via an SBS. For example, $hbs_i$ stores a request into a Dropbox directory and $hbs_j$ obtains the request from the Dropbox directory. Similarly, an SBS and an HBS can also exchange requests/responses via an SBS or an HBS (e.g., software can be built atop Dropbox to trigger actions when a file is stored into a Dropbox directory; see, e.g., http://www.wappwolf.com).
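The two models can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions, not SmartCom code: the class names `Request`, `MessagePassingChannel` and `SharedMemoryChannel` are invented for this example, and in-memory data structures stand in for SMS delivery and a Dropbox directory. In both cases the sender returns immediately and the receiver picks up the request whenever it becomes available, which reflects the asynchronous nature of HBS communication.

```python
from dataclasses import dataclass
from queue import Empty, Queue
from typing import Dict, List, Optional


@dataclass
class Request:
    sender: str
    receiver: str
    content: str  # free-form text, ultimately read by a human


class MessagePassingChannel:
    """Direct exchange: every peer has its own inbox (e.g., SMS)."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, Queue] = {}

    def send(self, request: Request) -> None:
        # The sender returns immediately; nothing waits for the receiver.
        self._inboxes.setdefault(request.receiver, Queue()).put(request)

    def poll(self, peer_id: str) -> Optional[Request]:
        # The receiver checks its inbox whenever it becomes available.
        try:
            return self._inboxes.setdefault(peer_id, Queue()).get_nowait()
        except Empty:
            return None


class SharedMemoryChannel:
    """Indirect exchange through a shared store (e.g., a shared directory)."""

    def __init__(self) -> None:
        self._store: List[Request] = []

    def put(self, request: Request) -> None:
        self._store.append(request)

    def take(self, peer_id: str) -> List[Request]:
        # Remove and return every stored request addressed to peer_id.
        mine = [r for r in self._store if r.receiver == peer_id]
        self._store = [r for r in self._store if r.receiver != peer_id]
        return mine


if __name__ == "__main__":
    sms = MessagePassingChannel()
    sms.send(Request("hbs_i", "hbs_j", "Please review the draft."))
    print(sms.poll("hbs_j"))

    folder = SharedMemoryChannel()
    folder.put(Request("hbs_i", "hbs_j", "Draft uploaded for review."))
    print(folder.take("hbs_j"))
```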

Figure 2: Message passing and shared memory communication models for SBS/HBS collectives.

Figure 2 describes possible message-passing and shared-memory communication models for services in a collective of HBS and SBS. Message-passing middleware can be further divided into different channels implemented by different services, e.g., Short Message Service (SMS) and Instant Messaging (IM) are dedicated to humans, while Message-oriented Middleware (MOM) can be used for both SBS and HBS. Similarly, the shared-memory middleware can be built on different services offering different types of “shared memory”, such as file-based shared memory, like Dropbox and Amazon S3, and database-based shared memory, like MongoDB and Amazon DynamoDB. Both message-passing and shared-memory communication middleware can be used internally for services within a collective, or externally for the communication between service consumers and the HBS/SBS peers. Similarly to machine instances, which offer facilities for remote job deployment and execution, an HBS communication interface can be used to run requests/jobs on an HBS. In both communication models, the structure of messages sent to an HBS is designed for human comprehension. The communication middleware is supposed to transform messages to/from human- and machine-comprehensible formats transparently to the consumers and the peers themselves. In Section 4 we present the model and implementation of a communication middleware supporting both shared-memory and message-passing paradigms, allowing asynchronous communication with collectives (clouds) of virtualized HBS/SBS peers.
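The transparent transformation described above can be illustrated with a small sketch. It assumes a hypothetical `Middleware` class that knows, for each registered peer, whether it is an HBS or an SBS and which channel adapter delivers to it; none of these names are taken from SmartCom. The sender always submits the same structured `Message`, and the middleware decides whether to render it as human-readable text or as a machine-readable payload.

```python
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict


@dataclass
class Message:
    sender: str
    receiver: str
    task: str
    deadline: str


def to_human_text(msg: Message) -> str:
    """Render a message as plain text, e.g., for SMS or instant messaging."""
    return (f"Hello {msg.receiver}, {msg.sender} asks you to: "
            f"{msg.task} (due {msg.deadline}).")


def to_machine_payload(msg: Message) -> Dict[str, Any]:
    """Render a message as a dictionary, e.g., for a message queue."""
    return asdict(msg)


class Middleware:
    """Hypothetical middleware: picks format and channel per receiver."""

    def __init__(self) -> None:
        self._kinds: Dict[str, str] = {}
        self._adapters: Dict[str, Callable[[Any], None]] = {}

    def register(self, peer_id: str, kind: str,
                 adapter: Callable[[Any], None]) -> None:
        if kind not in ("HBS", "SBS"):
            raise ValueError("kind must be 'HBS' or 'SBS'")
        self._kinds[peer_id] = kind
        self._adapters[peer_id] = adapter

    def send(self, msg: Message) -> None:
        # The sender does not need to know what kind of peer receives it.
        if self._kinds[msg.receiver] == "HBS":
            payload: Any = to_human_text(msg)
        else:
            payload = to_machine_payload(msg)
        self._adapters[msg.receiver](payload)


if __name__ == "__main__":
    sms_outbox, queue_outbox = [], []
    mw = Middleware()
    mw.register("alice", "HBS", sms_outbox.append)      # stands in for SMS
    mw.register("indexer", "SBS", queue_outbox.append)  # stands in for MOM
    mw.send(Message("planner", "alice", "label 20 images", "Friday"))
    mw.send(Message("planner", "indexer", "rebuild index", "Friday"))
    print(sms_outbox, queue_outbox)
```

Keeping the format decision inside the middleware is what lets a consumer address an HBS and an SBS through the same call.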

3.2 Provisioning of Human- and Software-based Services

The term ‘provisioning’ refers to the process of automated locating and (de-)instantiating of HBS/SBS’s, scaling based on current consumer needs.⁷,⁸ It is performed not only during the initial collective assembly, but also for subsequent adaptations of the collective, allowing the collective to scale to the consumer’s needs. The process of provisioning needs to consider the elasticity capabilities as well as different non-functional properties of the peers and the collective as a whole (cf. D1.1, Sec. 4.4).

⁷ Provisioning should not be confused with the process of ‘composition’, as defined in WP6. Composition refers to the process where peers actively participate in describing a future workflow, by posting task requests, negotiating on them, and finally executing them. Therefore, composition happens in the context of the approach (i) described in Section 1, where human peers are proposed tasks and expected to act on them. Provisioning, on the other hand, refers to the described approach (ii), where peers are actively located and engaged. The two approaches can co-exist as complementary on a single platform.

⁸ Discussion of locating and provisioning of SBS’s is out of the scope of this deliverable, as it is a well-known concept, used in practice in commercial cloud-based systems on different levels (IaaS, PaaS, SaaS). However, provisioning of HBS’s is a novel concept, as the provisioning models for humans differ from those for software services.

3.2.1 Capabilities and Connectedness

Human Power Unit (HPU)

In order for provisioning algorithms to locate appropriate HBS’s, human peers offering the HBS need to be described by a set of ‘elastic capabilities’. For example, in [20] we defined the notion of the Human Power Unit (HPU) as an aggregate metric for expressing the notion of ‘computing power’ (one possible elastic capability) based on the skills of peer members of an HBS-based collective. Although there is no standard way to compare skills and skill levels described and/or verified by different people and organizations, it is feasible to establish a common set of comparable skills for a particular collective of HBS. Different evaluation techniques can be enforced upon collective members to ensure that any HBS in it declares skills and skill levels in a collective-wide consistent manner. This is similar to some crowdsourcing systems, which require entry tests to verify claimed skills. Different benchmarks can also be used to test and validate skills and skill levels. This approach is similar to that of Amazon, which uses benchmarks to define its elastic compute unit. Finally, different skills from different sources can be mapped into a common view consistent within the scope of the collective.

Connectedness

In order for humans to collaborate on performing a complex task, it is necessary to impose different communication topologies and collaboration patterns [21]. Communication topologies specify which communication channels, tools and APIs are used for communication between HBS’s, but also the times at which communication can happen (due to the irregular availability of HBS’s). An example of a communication topology is a star topology, where each peripheral node is allowed to communicate only with the central node, using a predefined communication technology. In the case of the shared-memory communication model, the central node can play the role of the shared repository (e.g., Dropbox). Collaboration patterns specify which roles an HBS takes. For example, a tree topology of managerial relations can be imposed to mimic the command and responsibility chains within a collective. Custom delegation patterns can be defined to obtain replacement peers for an unavailable HBS. In [22] we investigate provisioning of collectives with optimized social connectedness (e.g., based on friendship or past collaboration relationships).

Non-functional Properties

Monitoring metrics In addition to skills and skill levels, other metrics can also be taken into consideration when provisioning HBS’s, such as effort, productivity, reputation and willingness. Defining a generally applicable set of metrics is not feasible. However, as the provisioning is managed by a dedicated middleware, a set of metrics can be defined on collective level or spanning multiple similar metrics. In [23] we investigate a number of different metrics suitable for a distributed software development scenario, and show how the introduced metrics can be used in algorithms for provisioning and adapting the composition of HBS collectives.

Pricing model SBS’s are generally provisioned under specific cloud pricing models [24], although some of them are also provided for free, in particular in volunteering, crowdsourcing and peer-to-peer computing scenarios. In the case of HBS’s, different pricing models, specific to humans, need to be identified. In [20] we proposed a number of possible pricing factors that can be taken into account for building pricing models appropriate for HBS’s.

Incentives Unlike an SBS, an HBS must be (additionally) motivated to perform a task well by suitable (also non-monetary) incentives. We investigated incentive models suitable for inclusion into the programming model in [25]. In order to implement these models, techniques must be developed to support billing and incentive enforcement. Such billing and incentive enforcement can be decoupled and offered as a service. For example, when an HBS is utilized and we need to apply a reward, the consumer or the provider can send its results to an incentive system, which calculates and performs payments. In [26] we presented one such decoupled incentive management system for HDA-CAS.

Communication Middleware for HDA-CAS

In this section we showcase the results of WP7 research in virtualizing communication for HBS’s and SBS’s in HDA-CAS, as described in Section 3.1, and present the design, implementation and evaluation of SmartCom, a communication middleware acting as the basic layer necessary for implementing other virtualization services described in Section 3.2, as well as a general communication facility for other SmartSociety platform components.

4.1 Requirements Specification

4.1.1 Functional Requirements

SmartCom needs to support the following functionalities, typical of service buses:

• Heterogeneity – supporting various types of communication channels (protocols) between the platform and (among) HBS/SBS units. This also includes supporting indirect communication through third-party tools (e.g., Dropbox).

• Communication – providing primitives for message transformation, routing and delivery, both on individual and collective level.

• Persistence – message persistence and querying.

• Security – handling authentication and encryption, as well as preventing message flooding.

• Scalability – ability to handle a large number of intermittently available service units.

In addition to these features, the distinguishing novelty of SmartCom is its requirement of native support for virtualizing peers and collectives thereof, expressed through the following sub-requirements:

• Adaptivity – hiding the communication complexity with/within a collective of unknown composition.

• Programmability – handling the passing of instructions to collectives from the SmartSociety platform, in order to manage the collective as a first-class, programmable entity.

• Hybridity – communicating with collective members (peers) transparently to the SmartSociety platform, regardless of whether they are human (HBS) or machine-based (SBS).

• Context-awareness – allowing a single peer endpoint (human, sensor or software service) to participate in different collectives concurrently, acting as a different service unit with a different context (e.g., SLA, delivery and privacy policies).

The formulated requirements are in line with the overall SmartSociety platform requirements, defined in D8.1. Concretely, the requirements presented here map to the following overall requirements: CR–1-2, HT–1-4, SEC–4, and PR–1.

4.1.2 Relations to other SmartSociety Platform Components

Using SmartCom, the orchestration components (WP6) and the future execution engine built to support the programming model (WP7 – T7.2 and T7.3) are able to send messages/instructions to collectives and individual peers identified by a unique identifier (peer/collective ID). The mapping between the IDs and the actual addresses/communication channels is handled by SmartCom with the assistance of the Peer Manager (WP4). SmartCom does not provide an environment for task execution to the peers, nor does it aim to be the only communication means for peers participating in the task execution. However, because it supports both REST-based communication and third-party, human-centric communication and collaboration tools, it lends itself well as a communication platform for other WPs as well.

Relying on the Peer Manager as the only authority for managing collective composition and individual peer profiles (including the personal information relating to their communication channels) decouples SmartCom from the responsibility of managing personal user information. The peers can specify in their peer profiles the preferred communication channels through which they want to be contacted in different contexts, fulfilling an important privacy aspect.

The API for sending messages through the middleware is general enough to be used by other SmartSociety platform components, e.g., for supporting incentives (WP5). In order to support provenance management (WP2), SmartCom offers a message persistence functionality, and exposes access to this database internally to other SmartSociety components through the API. Through this API the interested components can access the exchanged messages and perform further analysis.


Unless a specific implementation is provided (cf. Adapters, Sec. 4.3) SmartCom itself does not read the contents of the passed message and is agnostic to its internal structure, wrapping it into an internal representation for routing and delivery purposes. This allows the other components to deliver their message structures independently of SmartCom, e.g., different task requests.
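The payload-agnostic wrapping described above can be illustrated with a minimal sketch. It is not taken from the SmartCom code base: the names `Envelope`, `wrap` and `routing_key` are hypothetical, and the only assumption carried over from this deliverable is that routing relies on the receiver ID, message type and message subtype while the payload stays opaque.

```python
from dataclasses import dataclass, field
from typing import Any, Tuple
import uuid


@dataclass(frozen=True)
class Envelope:
    """Hypothetical internal representation: routing metadata plus an opaque payload."""
    receiver_id: str   # peer or collective ID
    msg_type: str
    msg_subtype: str
    payload: Any       # never inspected by the middleware in this sketch
    envelope_id: str = field(default_factory=lambda: uuid.uuid4().hex)


def wrap(receiver_id: str, msg_type: str, msg_subtype: str, payload: Any) -> Envelope:
    """Wrap an application-defined payload without reading or validating it."""
    return Envelope(receiver_id, msg_type, msg_subtype, payload)


def routing_key(envelope: Envelope) -> Tuple[str, str, str]:
    """Routing and delivery use only the envelope metadata, not the payload."""
    return (envelope.receiver_id, envelope.msg_type, envelope.msg_subtype)
```

Because the payload is carried as-is, two components could exchange, for example, differently structured task requests through the same envelope type.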

4.2 Service Units

[Figure 3 diagram not reproduced. Labels recoverable from the original: SmartSociety Platform (HDA-CAS) with SmartSociety Applications (App1 to AppN modules, User App A, User App B) and SmartSociety APIs; platform components Orchestration, Provenance, Incentives, Elasticity, QoS and Compiler; SMARTCOM Middleware; Service Units (SU) wrapping human peers and machine peers (e.g., a Web Service), each with a Communication Adapter and an Execution Context; actors: developers and users.]

Figure 3: Virtualizing communication with HBS/SBS peers to the rest of the SmartSociety platform through SmartCom middleware. SmartCom components are outlined in blue. A conceptual representation of a Service Unit (SU) is shown enlarged (top right). Remaining SUs are displayed collapsed.

The Service Unit (SU) is a conceptual entity that virtualizes communication with a single human or machine computing element (peer) participating in task executions managed by the SmartSociety platform. Figure 3 shows the operating context of SmartCom and how the SU abstraction is used to represent hybrid peers for different SmartSociety application execution contexts. The concept of Service Unit as used in this section is based on the concepts of HBS or SBS-based Individual Compute Units (ICU) from [20]. While the referenced paper discusses the concept primarily from the programming model perspective, here we look at the same concept from the communication perspective.

An SU consists of: i) Peer – a human, a machine, a thing (e.g., a sensor), or a collective thereof, performing the computation or executing a task; ii) Context – a set of parameters describing the execution context of the particular SmartSociety application in which the SU is participating.

The peer is represented by its physical addresses. A physical address is any set of technology-specific parameters required for SmartCom to communicate with the peer, possibly indirectly via third-party tools. For example, in the case of a REST-based software peer the physical address is represented by the URL and the HTTP action; in the case of a human peer with a dedicated Android application contacted via Google Cloud Messaging⁹, the physical address is the API key which allows SmartCom to push notifications to the peer’s device; in the case of a collective with a collaboration workspace provided on Dropbox, the physical address is a combination of a username and an API key, granting access to a specific Dropbox folder mapped to the shared workspace.

In order for SmartCom to know which physical address to use for a particular interaction with the peer/collective, a communication context (embodied in the SmartCom component Communication Adapter, Sec. 4.3.2) is associated with the current execution context. The execution context is a subset of peer preferences related to the peer’s participation in the given SmartSociety application, stored in the peer profile within the PeerManager (WP4) component. A Communication Adapter instance is created based on the preferences specified in the execution context.
The preferences currently include delivery and privacy policies, effectively allowing peers to control over which physical channels they can be contacted by a specific SmartSociety application, transparently to the application itself. However, the design of the adapters is extensible, which should allow inclusion and honoring of additional peer preferences, such as those related to privacy, once the privacy management design for the SmartSociety platform has been finalized. The execution and communication context are parts of the more general notion of the SU’s context (Fig. 3), which can additionally include a set of monitoring metrics as well as pricing and incentive mechanisms relating to a particular execution, to support on-demand provisioning of service units and elastic adaptations of collectives as described in Section 3.

The lifecycle of communication adapters is fully managed by SmartCom, while the communication respects the peer’s preferences, hiding the communication complexity, elasticity and privacy handling from the SmartSociety application. This effectively virtualizes communication with peers and collectives for the SmartSociety platform components, which only need to use the API (see Sec. 4.3.4) that SmartCom exposes to communicate with peers/collectives and vice versa. This is in line with the PU–1 requirement from D8.1. The functionality offered by SmartCom corresponds to the basic communication primitives necessary to support programmable hybrid HBS/SBS collectives, described in [20].

9 https://developer.android.com/google/gcm/index.html
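The relation between a peer, its physical addresses and the execution context can be sketched as a small data structure. This is an illustrative sketch only: the class and field names (`PhysicalAddress`, `ExecutionContext`, `allowed_channels`, `resolve_address`) are assumptions made for the example and do not reflect the actual SmartCom or PeerManager data model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PhysicalAddress:
    channel: str             # e.g. "rest", "gcm", "dropbox", "email"
    params: Dict[str, str]   # technology-specific parameters (URL, API key, ...)


@dataclass
class Peer:
    peer_id: str
    addresses: List[PhysicalAddress]


@dataclass
class ExecutionContext:
    application: str
    allowed_channels: List[str]   # peer preferences, most preferred first (assumed)


@dataclass
class ServiceUnit:
    """A peer paired with the context of one application execution."""
    peer: Peer
    context: ExecutionContext

    def resolve_address(self) -> PhysicalAddress:
        """Pick the first preferred channel for which the peer has an address."""
        by_channel = {a.channel: a for a in self.peer.addresses}
        for channel in self.context.allowed_channels:
            if channel in by_channel:
                return by_channel[channel]
        raise LookupError(
            f"peer {self.peer.peer_id!r} has no address for application "
            f"{self.context.application!r}"
        )
```

Under these assumptions, the same `Peer` object wrapped in two `ServiceUnit` instances with different contexts resolves to different physical addresses, which mirrors the idea of one peer acting as different SUs in different applications.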

Both the ‘message passing based communication’ and the ‘shared memory communication’ described in the referenced paper are supported. ‘Push’ and ‘pull’ primitives are implemented through corresponding SmartCom adapter types, while the communication medium itself can belong to a third party. For example, SmartCom-provided push and pull adapters can be used with Dropbox to emulate shared-memory communication between peers. Depending on the peer’s preferences on which communication channel(s) to use, specified for the execution context, SmartCom will use the proper primitives for the given type of channel.

For example, consider a use-case within an imaginary SmartSociety application ‘PhotoContest’ where peers are asked to provide a photo of a specific object. Peer Alice sets in her communication context that her only communication channel for providing the results is Dropbox, while peer Bob sets his communication context to use the dedicated Android/Web application for sending photos. During the execution of the ‘PhotoContest’ application SmartCom manages a pull-based input adapter instance which regularly polls Alice’s Dropbox folder for new photos, while a push-based input adapter exposing a REST service is used to receive photos sent from Bob’s application. In either case, SmartCom notifies the ‘PhotoContest’ application that a new photo is available, hiding from it how the message was obtained, while allowing peers to choose preferred communication channels via their SU contexts.

Since peers can alter their execution and communication contexts at any time, different privacy and convenience patterns can be obtained. For example, a peer can switch off certain communication channels during the weekend or outside working hours. Similarly, a single peer can define different communication contexts for different execution contexts, thus posing as a different SU in a different SmartSociety application. In this way, the distinction between professional and leisure participation can be honored.
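The ‘PhotoContest’ scenario can be mimicked with a short sketch in which a pull-based and a push-based input adapter feed the same application callback. Everything here is hypothetical: the class names only echo the adapter types named in this section, the Dropbox folder is simulated by a plain function returning file names, and the REST endpoint is reduced to a direct method call.

```python
from typing import Callable, List, Set

# Callback through which the application is notified: (peer_id, photo_name).
OnMessage = Callable[[str, str], None]


class InputPullAdapter:
    """Polls a (simulated) shared folder and reports files not seen before."""

    def __init__(self, peer_id: str, list_folder: Callable[[], List[str]],
                 on_message: OnMessage) -> None:
        self._peer_id = peer_id
        self._list_folder = list_folder
        self._on_message = on_message
        self._seen: Set[str] = set()

    def poll(self) -> None:
        # Called at regular intervals by the middleware in a real deployment.
        for name in self._list_folder():
            if name not in self._seen:
                self._seen.add(name)
                self._on_message(self._peer_id, name)


class InputPushAdapter:
    """Stands in for a REST endpoint that the peer's application calls."""

    def __init__(self, on_message: OnMessage) -> None:
        self._on_message = on_message

    def receive(self, peer_id: str, photo_name: str) -> None:
        self._on_message(peer_id, photo_name)


# The application only sees uniform notifications, not the channel used.
received: List[tuple] = []


def notify_photo_contest(peer_id: str, photo_name: str) -> None:
    received.append((peer_id, photo_name))
```

In this sketch Alice would be served by an `InputPullAdapter` bound to her folder listing and Bob by an `InputPushAdapter`; both end up calling `notify_photo_contest`, so the application code never branches on the channel.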


[Figure 4 diagram not reproduced. Labels recoverable from the original: HDA-CAS Platform (e.g., SmartSociety) with applications App1 to App3; SMARTCOM Communication Engine with the Messaging and Routing Manager (Message Handler, Routing Rule Engine, Feedback Handler), the Adapter Manager (Adapter Handler, Adapter Execution Engine, Address Resolver) and the Message Broker; Peer Adapters and Feedback Adapters (Mail, REST, MobileApp, Dropbox, FTP, Mailinglist); Message Query Service and Message Info Service (REST); peers: humans, software, Collective #1.]

Figure 4: Simplified architecture of the SmartCom middleware.

4.3 SmartCom Design & Architecture

Figure 4 shows the conceptual architecture of the SmartCom middleware. The HDA-CAS Platform components (e.g., executing application modules) pass the messages intended for collectives to the Communication Engine through a public API. The task of the Communication Engine is to effectively virtualize the notions of ‘collective’ and ‘service unit (SU)’ for the HDA-CAS platform. This means that the communication with different service units and collectives has to be handled transparently to the HDA-CAS platform, independent of the unit’s actual type and availability. In the following sections, for brevity, when referring to the communication aspect of the SU’s functionality we will use the short term “peer” to denote the computing human/machine element within the SU that is the sender or receiver of information/data, and the term “adapter” to denote the middleware component in charge of handling the communication. Due to space constraints, a detailed description of the architectural components and their implementation, as well as the full API specification, is provided in the SmartCom Technical Report.

4.3.1 Messaging and Routing

All communication between the peers and the platform is handled asynchronously using normalized messages. A queue-based Message Broker is used to decouple the execution of SmartCom’s components and the communication with peers. SmartCom supports unicast as well as multicast messages. Therefore, multiple peers can also be addressed as collectives, and SmartCom will take care of sending the message to every single member of the collective.

The Messaging and Routing Manager (MRM) is SmartCom’s principal entry point for HDA-CASs. It consists of the following components:

1) The Message Handler takes incoming messages from the HDA-CAS and transforms them into an internal representation, sending it to the receiver via a determined peer output adapter. If the receiver of the message is a collective, it resolves the current member peers and their preferred communication channels, determining a set of output adapters to use;

2) The Routing Rule Engine then determines the proper route to the peers, invoking the Adapter Manager to instantiate appropriate adapters in order to complete the route, if needed (see below);

3) The Feedback Handler waits for feedback messages received through feedback (input) adapters and passes them to the Message Handler. Afterwards they are handled like normal messages again, and re-routed where needed, e.g., back to the HDA-CAS.

A route may include different communication channels as delivery start-/endpoints. Figure 5 shows the conceptual overview of SmartCom’s routing. For each message the route is determined by the Routing Rule Engine using the pipes-and-filters pattern, based on the message properties: receiver ID, message type and message subtype, with decreasing priority. Note that there may be multiple routes per message (e.g., a single peer can be contacted using a mobile app and email concurrently).

[Figure 5 diagram not reproduced. Labels recoverable from the original: HDA-CAS, Routing Engine, peers P1 to P5, Sensor, Dropbox.]

Figure 5: Messages are routed to Output Adapters (Pa) which forward the messages to the corresponding Peers (P1 to P5). Feedback is sent back by human peers, software peers (e.g., Dropbox) and sensors using Input Adapters (Fa). The HDA-CAS Platform can also send and receive messages.
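One possible reading of the routing step is sketched below: rules are narrowed by a chain of filters, one per message property, applied in the stated priority order (receiver ID, then message type, then message subtype). The `Rule` structure, the wildcard convention (`None` matches anything) and the "exact match beats wildcard" tie-break are assumptions made for this illustration; the actual Routing Rule Engine may resolve routes differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Message:
    receiver_id: str
    msg_type: str
    msg_subtype: str


@dataclass(frozen=True)
class Rule:
    adapter: str                        # name of the output adapter to route to
    receiver_id: Optional[str] = None   # None acts as a wildcard (assumption)
    msg_type: Optional[str] = None
    msg_subtype: Optional[str] = None


def route(rules: List[Rule], message: Message) -> List[str]:
    """Narrow the rule set with one filter per property, highest priority first."""
    candidates = list(rules)
    for prop in ("receiver_id", "msg_type", "msg_subtype"):
        value = getattr(message, prop)
        exact = [r for r in candidates if getattr(r, prop) == value]
        wildcard = [r for r in candidates if getattr(r, prop) is None]
        # Prefer rules that name this property explicitly; else keep wildcards.
        candidates = exact or wildcard
    # Several rules may survive, giving multiple routes for one message.
    return [r.adapter for r in candidates]
```

With two receiver-specific rules for the same peer, a message to that peer yields two routes (for instance a mobile app and email), while a message to any other receiver falls through to type-specific or default rules.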

4.3.2 Message Transformation

In order to use a specific communication channel, an associated Communication Adapter (or simply adapter) needs to be instantiated. The communication between peers and the adapters is unidirectional: output adapters (also: peer adapters) are used to send messages to the peers; input adapters (also: feedback adapters) are used to receive messages from peers. SmartCom originally provides some common input/output adapters (e.g., SMTP/POP, REST, Dropbox). The role of adapters should be considered from the following two perspectives: 1) functional; and 2) technical.

Functionally, the adapters allow for:

a) Hybridity – by enabling different communication channels to and from peers;

b) Scalability – by enabling SmartCom to cater to the dynamically changing number of peers;

c) Extensibility – new types of communication and collaboration channels can easily be added at a later stage transparently to the rest of the HDA-CAS platform;

d) Usability – human peers are not forced to use dedicated applications for collaboration, but rather freely communicate and (self-)organize among themselves by relying on familiar third-party tools;

e) Load Reduction and Resilience – by requiring that all the feedback goes exclusively and unidirectionally through external tools first, only to be channelled/filtered later through a dedicated input adapter, SmartCom is effectively shielded from unwanted traffic load, delegating the initial traffic impact to the infrastructure of the external tools. At the same time, failure of a single adapter will not affect the overall functioning of the middleware.

Technically, the primary role of adapters is to perform the message format transformation. Optional functionalities include: message filtering, aggregation, encryption, acknowledging and delayed delivery. Similarly, the adapters are used to interface SmartCom with external software services, allowing the virtualization of third-party tools as common software peers.

The Adapter Manager is the component responsible for managing the adapter lifecycle (i.e., creation, execution and deletion of instances), elastically adjusting the number of active instances from a pool of available adapters. This allows scaling out the number of active adapter instances as needed. This is especially important when dealing with human peers, due to their inherent periodicity, frequent instability and unavailability, as well as for managing a large number of connected devices, such as sensors. The Adapter Manager consists of the following subcomponents:

• Adapter Handler: managing adapter instance lifecycle. It handles the following adapter types: i) Stateful output adapters – peer adapters that maintain conversation state (e.g., login information). For each peer a new instance of the adapter will be created; ii) Stateless output adapters – peer adapters that maintain no state. An instance of an adapter can send messages to multiple peers; iii) Input pull adapters – adapters that actively poll software peers for feedback. They are created on demand by applications running on the HDA-CAS platform and will check regularly for feedback on a given communication channel (e.g., check if a file is present on an FTP server); iv) Input push adapters – adapters that wait for feedback from peers.

• Adapter Execution Engine: executing the active adapters.

• Address Resolver: mapping adapter instances to peers’ external identifiers (e.g., Google/Dropbox username) in order to initiate the communication.
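The difference between stateful and stateless output adapters listed above can be made concrete with a small sketch of an instance registry. The names (`AdapterHandler.get`, the `stateful` class attribute, `EmailAdapter`, `DropboxAdapter`) are hypothetical and chosen for the example; which real adapters are stateful is also an assumption here.

```python
from typing import Dict, List, Tuple, Type


class OutputAdapter:
    """Base class for the sketch; real adapters would talk to a channel."""
    stateful = False

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str]] = []

    def push(self, peer_id: str, text: str) -> None:
        self.sent.append((peer_id, text))


class EmailAdapter(OutputAdapter):
    stateful = False   # assumed stateless: one instance can serve many peers


class DropboxAdapter(OutputAdapter):
    stateful = True    # assumed stateful: keeps per-peer login information


class AdapterHandler:
    """Creates and reuses adapter instances according to their type."""

    def __init__(self) -> None:
        self._shared: Dict[Type[OutputAdapter], OutputAdapter] = {}
        self._per_peer: Dict[Tuple[Type[OutputAdapter], str], OutputAdapter] = {}

    def get(self, adapter_cls: Type[OutputAdapter], peer_id: str) -> OutputAdapter:
        if adapter_cls.stateful:
            # One instance per (adapter type, peer) pair.
            key = (adapter_cls, peer_id)
            if key not in self._per_peer:
                self._per_peer[key] = adapter_cls()
            return self._per_peer[key]
        # A single shared instance per stateless adapter type.
        if adapter_cls not in self._shared:
            self._shared[adapter_cls] = adapter_cls()
        return self._shared[adapter_cls]

    def remove(self, adapter_cls: Type[OutputAdapter], peer_id: str) -> None:
        """Drop a stateful instance, e.g. when the peer becomes unavailable."""
        self._per_peer.pop((adapter_cls, peer_id), None)
```

Keeping stateless adapters shared limits the number of live instances, while per-peer stateful instances can be created and removed as peers come and go, which is the kind of elasticity the Adapter Manager is described as providing.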

Feedback messages from peers (e.g., subtask results) or external tools (e.g., Dropbox file added, email received on a mailing list) are consumed by the adapters either through a push notification or by polling at regular intervals.

4.3.3 Message Persistence and Other Functionalities

All sent and received messages, as well as internal messages, are persisted in a NoSQL database. Stored messages can be queried and analyzed through the MessageQuery public API (e.g., to derive metrics or identify conditions for applying incentives).

Since messages can be of arbitrary subtype and contain an arbitrary payload, human peers (and their local third-party applications) might not know how to interpret the message. The MessageInfoService provides: a) the semantic meaning/description of the message type and contents in a human-readable way; b) dependencies on other messages; c) timing constraints (e.g., expiry, priority). This is especially useful in application contexts (e.g., task acceptance negotiations) where human peers are required to fully understand the message meaning and send back valid answers. Currently, the service annotates the message field types, provides a natural-language description of the expected field contents and provides a state-machine description of the allowed message exchange sequence with respect to dependency and timing constraints.

SmartCom supports specifying and observing delivery and privacy policies on message, peer and collective level. Delivery policies stipulate how to interpret and react to possible communication exceptions, such as failed, timed out, unacknowledged or repeated delivery. Privacy policies restrict sending or receiving messages or private data to/from other peers, collectives or HDA-CAS applications under different circumstances. Apart from offering predefined policies, SmartCom also allows the users to import custom, application- or peer-specific policies. As noted, both types of policies can be specified at different levels. For example, a peer may specify that they can be reached only by peer ‘manager’ via communication channel ‘email’, from 9am to 5pm, in collective ‘Work’. The same person can choose to be reachable via ‘SMS’ at any time by all collective members except ‘manager’ in collective ‘Bowling’. Similarly, an HDA-CAS platform application could specify a collective delivery policy stating that, when sending instructions to a collective, it suffices that the delivery to a single member succeeds to consider the overall delivery successful on the collective level. SmartCom takes care of combining and enforcing these policies transparently to the HDA-CAS user in different collective contexts.

Peer authentication is handled externally. Before instantiating the corresponding adapter, SmartCom requires the peers to authenticate with the external tool and obtains from the tool the token that is used to authenticate messages from/to the peer. More information is provided in the supplementary materials.

4.3.4 External APIs

This section briefly describes the interfaces exposed by SmartCom. Clients in this case are either internal SmartSociety components that want to use SmartCom to communicate with collectives or other components, or the peers themselves, communicating with platform components or other peers/collectives. The APIs are presented in more detail in the SmartCom Technical Report.

• Communication – Acts as a single point of access for internal SmartSociety platform components wishing to use SmartCom’s communication functionality. The API is used to pass messages that need to be delivered to collectives/components. The same API is used to register callbacks in order to receive reply messages. Additionally, the API allows registering new adapters and routing rules.

• InputPushAdapter – Interface exposed by input adapters that receive messages from peers via a push-based mechanism. This API can be used by peers or peer applications to actively send messages to the SmartSociety platform components or collectives, subject to prior authentication.

• InputPullAdapter – Interface exposed by input adapters that receive messages from peers via pull communication, i.e., by querying the corresponding endpoint at regular intervals. This API can be used by peers or peer applications that produce results of a computation to register a callback that will be used by the adapter implementing the API to occasionally poll for new/updated computation results.


• MessageInfoService – A service that can provide information on a specific message type, i.e., how to interpret the message and its relationship to other messages. The service allows the applications to store a human-readable interpretation and/or machine-interpretable description of the fields of the message (which is application-specific), and related semantic constraints. This, in turn, allows the end recipient of the message to understand the message properly and to know what the expected responses are. For example, when a human peer receives a message during the task negotiation phase, by querying this service the peer can find out how to respond to the message in order to accept the task, reject it, ask for more information, ask for another offer, or suggest a replacement. The provided choices are in this case dependent on the specific negotiation pattern used. However, the message interpretation can be used generally in any message exchange where message ordering or particular field values have associated semantics. In addition, the service could facilitate building of peer applications. For example, upon each message received from the SmartSociety platform, the peer application could query the service to obtain the information on expected responses, automatically offering an appropriate GUI to compose responses, and integrating into the GUI forms the descriptions of message fields retrieved from the MessageInfoService, possibly translated into the user’s native tongue.

• MessageQueryService – Service that allows querying persisted messages. By default, all messages transported by SmartCom get persisted. However, depending on the privacy guidelines, the service can also filter the messages or particular fields that get persisted. The service is accessible only internally to the SmartSociety components. The primary motivation for the introduction of this service was to allow access to the data necessary for provenance (WP2) and incentive (WP5) management. The messages are stored internally in a MongoDB database. The API offers querying functionality analogous to the JPA Criteria API¹⁰, returning a collection of matching messages.

4.3.5 Evaluation

The planned contribution of T7.1 is not to build a performance-oriented communication middleware, but to design a middleware capable of functionally responding to the SmartSociety platform requirements of hybridity, diversity and collective-centrality. Therefore, the performance comparison with commercial middleware systems, such as those described in Section 2, is out of scope of this document, not least because SmartCom’s performance is effectively limited by the properties of the commercial de-facto standard technology at its core (Apache ActiveMQ), which does the heavy queuing work. In this respect, we formulated the requirements in Section 4.1, and in Section 4.3 justified how the presented design fulfils those requirements. Each of the stated functionalities is covered by unit tests and realistic integration tests simulating multiple remote peers using all the technologies for which we provide adapters to connect to the middleware (mail, Dropbox, dedicated Android app), provided in the code repository¹¹.

Furthermore, SmartCom is used in the WP8-led “Ask SmartSociety!” M24 demo (Section 5.2.2), showcasing a real-world example scenario integrating SmartCom with other WP components (WP4 PeerManager and WP6 Orchestration Manager).

In addition, we ran simulation-based scalability experiments with SmartCom showing that it can handle up to 5000 messages per second on average (subject to the limitations of ActiveMQ). We tested it with up to 1000 concurrent peers and a load of 1 · 10^6 messages uniformly distributed to peers, on a very resource-limited laptop, with satisfactory results. Scaling the number of peers has not proven problematic, while the throughput of 5000 msg/sec is expected to cover typical peak loads we will encounter within the project’s scope, considering that most of the peers are humans. The throughput limitation applies to a single SmartCom instance, and multiple SmartCom instances can be deployed to balance the load if needed, sharing the database and PeerManager access. The evaluation setup and results are presented in the SmartCom Technical Report.

10 http://docs.oracle.com/javaee/6/tutorial/doc/gjitv.html

11 https://github.com/tuwiendsg/SmartCom


Programming Communications within the SmartSociety Platform

5.1 SmartSociety Platform Integration

SmartCom represents the primary interface towards peers and application end-users. It supports dynamic selection of the most appropriate communication channel to interact with a single peer and/or a collective, including e.g., mobile application, Dropbox, emails, social media, etc.. The following dynamic view details how an application will be using the SmartSociety platform in order to perform a hybrid computation and shows how SmartCom is interacting with the other platform components and with (in this case: human) peers. In particular, SmartCom is used during the orchestration phase for (i) connecting to peers for recruitment (for the execution of a given task/composition); and (ii) interacting with the peers during the actual execution of the task.

Figure 6: Dynamic view of interactions among SmartSociety platform components: focus on SmartCom.

SmartCom then interfaces with the PeerManager to retrieve the information necessary for interacting with peers. This includes the preferred interaction channel of any given peer, when available. Examples of preferred interaction channels are email, Dropbox, mobile application, and social media. The choice is typically based on the preferences expressed explicitly by the peer, on previous interactions with that peer, or on the actual context the user is in. It is worth remarking that a peer can be a human, a machine, or a combination thereof. In addition, SmartCom handles the scalability of the platform by maintaining different conversations, where each conversation is initiated by the composition manager.
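The channel choice described above can be sketched as a simple fallback chain. The Python fragment below is only an illustration: the PeerProfile fields, the select_channel function, and the priority order (explicit preference, then current context, then previous interactions) are assumptions made for this sketch and do not reflect the actual PeerManager or SmartCom API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PeerProfile:
    """Hypothetical view of the channel-related data kept for one peer."""
    peer_id: str
    explicit_preference: Optional[str] = None      # stated by the peer
    context_hint: Optional[str] = None             # derived from current context
    last_successful_channel: Optional[str] = None  # from previous interactions
    available_channels: List[str] = field(default_factory=list)


def select_channel(profile: PeerProfile, default: str = "email") -> str:
    """Pick the first usable candidate; the priority order is an assumption."""
    candidates = (
        profile.explicit_preference,
        profile.context_hint,
        profile.last_successful_channel,
    )
    for candidate in candidates:
        if candidate is not None and candidate in profile.available_channels:
            return candidate
    # No usable hint: fall back to any available channel, then to the default.
    if profile.available_channels:
        return profile.available_channels[0]
    return default


if __name__ == "__main__":
    peer = PeerProfile("peer-1", explicit_preference="dropbox",
                       available_channels=["email", "dropbox"])
    print(select_channel(peer))
```

A real deployment would likely weigh these signals rather than rank them strictly; the strict ranking is used here only to keep the sketch short.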

5.2 Usage and Application Scenarios

5.2.1 Orchestration

Orchestration of SmartSociety applications follows a RESTful document-driven paradigm as described in D6.1 and D6.2. In the context of T7.1, the primary roles of orchestration are described below:

1. Trigger composition events as a consequence of new task requests that arrive in the platform, potentially communicating with the peer manager, the reputation service, and the incentives manager, depending on the composition policy followed by a specific application. These composition events give rise to new tasks, where different collectives (teams) are formed per task.
2. Synchronise the composed tasks with the associated task requests.
3. Enforce negotiation policies and allow negotiation between the peers in the collective of each task.
4. Synchronise the tasks and the associated task requests, as well as the tasks among themselves (potentially affecting the negotiation of unrelated peers in other collectives), while negotiation is taking place.
5. Allow and synchronise the information that is generated during the execution of a task and is stored in the appropriate task record.

Based on the above, we present four examples of how SmartCom can be used in orchestration to deliver messages to various users or peers of a SmartSociety application:

1. Upon completion of a composition event, the set of associated negotiable tasks turns from empty to non-empty for a set of task requests. SmartCom notifies the appropriate collectives that there are tasks where negotiation can take place.
2. Depending on the negotiation policy applied in a specific application, certain peers may need to negotiate before others. For example, in a Ridesharing scenario, the negotiation policy may require that the drivers first select the task that they find most attractive, and only then allow commuters to continue the negotiation on the same task. In such cases, SmartCom is used to provide guidance to the various peers, indicating that there are tasks where they may be able to agree.
3. When an agreement is reached on a task (given the negotiation policy at hand), SmartCom notifies the participants of the task that execution may now start.
4. Reaching agreements and finalising the formation of collectives on tasks is of utmost importance for SmartSociety applications. In certain applications, the agreement on a task may invalidate ongoing negotiations on other tasks. In this case, the participants of such tasks need to be notified that they can no longer negotiate on the invalidated tasks and perhaps need to start the negotiation process over on a different task that is available. For example, in a Ridesharing application, consider a scenario with two drivers d1 and d2 and two commuters c1 and c2. Assume that d1 initiates negotiation with c1 and c2, and d2 with c2 alone. Then, c1 agrees to the trip (d1, c1, c2) and c2 agrees to the trip (d2, c2). Since the latter trip has been finalised, the trip (d1, c1, c2) is rendered invalid, and SmartCom can be used to send a notification to the driver d1 that a new trip should be selected (e.g. the one involving c1 alone).

A positive effect of using SmartCom in orchestration activities such as those described above is the reduction of polling requests from the client applications.
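The Ridesharing example in item 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Orchestration Manager's logic: the function names are hypothetical, trips are modelled as plain tuples of peer identifiers, a trip is treated as invalid as soon as it shares any participant with a finalised trip, and every remaining participant is notified (the text above only mentions the driver).

```python
def invalidated_by(finalised, open_negotiations):
    """Return the open trips that share a participant with the finalised trip."""
    taken = set(finalised)
    return [
        trip for trip in open_negotiations
        if trip != finalised and taken & set(trip)
    ]


def peers_to_notify(finalised, invalidated):
    """Peers of invalidated trips who are not part of the finalised trip."""
    affected = set()
    for trip in invalidated:
        affected |= set(trip) - set(finalised)
    return sorted(affected)


if __name__ == "__main__":
    finalised = ("d2", "c2")
    negotiations = [("d1", "c1", "c2"), ("d1", "c1"), ("d2", "c2")]
    invalid = invalidated_by(finalised, negotiations)
    print(invalid)
    print(peers_to_notify(finalised, invalid))
```

In this sketch the trip (d1, c1) remains open, which matches the alternative suggested in the example; the resulting list of peers is what a component would hand to SmartCom for notification instead of waiting for clients to poll.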

5.2.2 Ask SmartSociety!

Ask SmartSociety! is a Q&A service enabled by the SmartSociety platform, which will be used as a basis for developing demonstration scenarios, with a particular focus on tourism, the reference domain for validating the SmartSociety vision. Ask SmartSociety! will be a service where users can post questions in natural language and peers can provide answers. Peers providing answers can be humans (individuals or collectives) as well as machines (intelligent software agents). Peers can compose (forming collectives, hybrid or not) to provide answers. Answers can be ranked based on the reputation of the peers or on community ranking. In some instances, the user issuing the question can select an answer and provide feedback on the peer providing it.

An example (grounded in the tourism application scenarios) that helps in understanding the features of the Ask SmartSociety! service is the following:

Next week Peter will fly to Venice. He will be busy in meetings during the day but wants to explore some 'hidden' places at night. He could well explore various online tourist sites, but he prefers to ask experts and local people. He could also google for relevant content, but he does not actually need an answer right away; he just needs to get it within one week. And having one system which allows him to query local experts, web-based recommendation services and incoming tourism institutions definitely looks appealing to him!

The Ask SmartSociety! service has been developed in the form of a mobile application, where users can send a question to the SmartSociety platform through a simple user interface. After the question is submitted, the composition phase takes place: peers that have registered with the application and agreed to answer questions on a given topic are selected. In this scenario there is no 'real' negotiation for each single question, because agreement is considered implicit at the moment of registration (the application should make this clear in its terms of service). Once the list of peers is available, SmartCom is used to send them the actual question. A conversation is established with each peer, while waiting for an answer from each of them. When the peer answers (or produces an answer), the mobile application returns a response directly to the application (see footnote 12). In this specific case, the following SmartCom adapters have been used:

• REST: for sending the question to two machine peers: one posting the question on Twitter, the other looking for an answer on Google.
• GCM: for sending the question through Google Cloud Messaging to Android users that have installed the Ask SmartSociety! application for replying.
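The dispatch step described above (one conversation per selected peer, each reached through the adapter that suits it) might look as follows. This Python sketch is purely illustrative and does not use SmartCom's real interfaces: RecordingAdapter, Conversation, ask and on_answer are hypothetical names, and the adapter merely records outgoing messages instead of calling a REST endpoint or Google Cloud Messaging.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Conversation:
    """One open exchange with a single peer (hypothetical structure)."""
    peer_id: str
    question: str
    answer: Optional[str] = None


class RecordingAdapter:
    """Stand-in adapter: records messages instead of sending them."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sent: List[Tuple[str, str]] = []

    def send(self, peer_id: str, text: str) -> None:
        self.sent.append((peer_id, text))


def ask(question: str, peers: Dict[str, str],
        adapters: Dict[str, RecordingAdapter]) -> Dict[str, Conversation]:
    """Send the question to each peer and open one conversation per peer.

    `peers` maps a peer id to the name of the adapter used to reach it.
    """
    conversations = {}
    for peer_id, adapter_name in peers.items():
        adapters[adapter_name].send(peer_id, question)
        conversations[peer_id] = Conversation(peer_id, question)
    return conversations


def on_answer(conversations: Dict[str, Conversation],
              peer_id: str, answer: str) -> List[str]:
    """Store an answer and return the ids of peers still expected to reply."""
    conversations[peer_id].answer = answer
    return sorted(p for p, c in conversations.items() if c.answer is None)


if __name__ == "__main__":
    adapters = {"rest": RecordingAdapter("rest"), "gcm": RecordingAdapter("gcm")}
    peers = {"twitter-peer": "rest", "google-peer": "rest", "android-user": "gcm"}
    open_conversations = ask("Hidden places in Venice at night?", peers, adapters)
    print(on_answer(open_conversations, "google-peer", "Try the Cannaregio area."))
```

Keeping one conversation object per peer makes it straightforward to tell which peers have not yet replied, which matters here because answers may arrive days apart.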

Footnote 12: For this specific service, no instance of the task executor is actually employed.

Conclusions

In this deliverable we have introduced the founding principles of virtualization of human or machine-based peers necessary for supporting uniform communication between the SmartSociety (or indeed any similar HDA-CAS) platform and the virtualized peers (T7.1), and the subsequent integration of virtualized peers into the programming model (T7.2). The virtualization of individual peers of different types is a necessary step in achieving the hybridity requirement of the SmartSociety platform. The theoretical virtualization concepts presented in Section 3 allow the provisioning of human or machine-based peers under the service model, and their incorporation into elastic, scalable collectives which can be characterized by specific metrics, pricing and incentive characteristics. Particular attention was paid to communication virtualization. The introduced theoretical concepts have been implemented within the prototype of the communication middleware SmartCom, whose internal design and exposed functionalities have been described in Section 4.3. SmartCom natively supports working with collectives of peers, thus taking over from the middleware's clients the burden of keeping track of the collective's composition and of single-peer communication preferences. SmartCom's exposed functionality allows it to be used by other SmartSociety platform components. Personal peer preferences and collective information are not managed by SmartCom directly, but obtained from the WP4 PeerManager. This means that other platform components which already make use of the centralized peer profile management and communicate directly with peers can efficiently transition to SmartCom, while continuing to use the peer profile IDs, if they need more advanced communication capabilities. SmartCom exposes the low-level communication primitives necessary for enacting the various negotiation, collaboration and communication patterns that will be described in the programming model (T7.2). As such, this document also serves to observe the enabling and limiting factors in defining the expressiveness of the programming model patterns. It also serves as a guidance document for other WPs which may want to integrate SmartCom's functionality.

Appendix

International Journal of Cooperative Information Systems, Vol. 22, No. 4 (2013) 1341003 (28 pages)
© World Scientific Publishing Company
DOI: 10.1142/S0218843013410037

CONCEPTUALIZING AND PROGRAMMING HYBRID SERVICES IN THE CLOUD

HONG-LINH TRUONG∗ and SCHAHRAM DUSTDAR†
Distributed Systems Group, Vienna University of Technology
Argentinierstrasse 8/184-1, A-1040 Vienna, Austria
∗truong@dsg.tuwien.ac.at
†dustdar@dsg.tuwien.ac.at

KAMAL BHATTACHARYA
IBM Research, Nairobi, Kenya, Africa
kamal@ke.ibm.com

Received 10 March 2013
Accepted 28 October 2013
Published 24 December 2013

For solving complex problems, in many cases, software alone might not be sufficient and we need hybrid systems of software and humans in which humans not only direct the software performance but also perform computing, and vice versa. Therefore, we advocate constructing "social computers" which combine software and human services. However, to date, human capabilities cannot be programmed into complex applications as easily as software capabilities. There is a lack of techniques to conceptualize and program human and software capabilities in a unified way. In this paper, we explore a new way to virtualize, provision and program human capabilities using cloud computing concepts and service delivery models. We propose novel methods for conceptualizing and modeling clouds of human-based services (HBS) and combine HBS with software-based services (SBS) to establish clouds of hybrid services. In our model, we present common APIs, similar to well-developed APIs for software services, to access individual and team-based compute units in clouds of HBS. Based on that, we propose a framework for utilizing SBS and HBS to solve complex problems. We present several programming primitives for hybrid services, also covering the formation of hybrid solutions consisting of software and humans. We illustrate our concepts via some examples of using our cloud APIs and existing cloud APIs for software.

Keywords: Hybrid services; cloud computing; human-based computation; service computing.

1. Introduction

Recently the concept of building "social computers" has emerged, in which the main principle is to combine human capabilities and software capabilities into composite applications solving complex problems [1] (footnote a). In such types of computers, both software and humans play equal roles: they direct/coordinate as well as perform tasks, depending on their capabilities and the specific context. For example, (i) software can analyze an image and direct scientists to carry out a quality assessment of the analysis result before sending the result to other software, or (ii) a scientist can direct software to analyze images of high quality while asking another scientist to judge whether it makes sense to continue the analysis of images of low quality. While such a combination in complex systems is not new, building and programming systems capable of supporting pro-active, highly interactive, team-based human computation under different elastic, pay-per-use models together with on-demand (cloud) software services in a unified way remains challenging.

∗ Corresponding author.
Footnote a: The Social Computer: Internet-Scale Human Problem Solving. socialcomputer.eu. Last access: 3 May 2012.

To date, concrete technologies have been employed to provide human capabilities via standard, easy-to-use interfaces, such as Web services and Web platforms [2, 3] (footnote b), and some efforts have been devoted to modeling and coordinating flows of human work at the process/workflow level [4, 5]. In all these works, a fundamental issue is how to utilize human capabilities. We observed two main approaches to utilizing human capabilities: (i) passively proposing tasks and waiting for human input, such as in crowd platforms [3], and (ii) actively finding and binding human capabilities into applications. While the first one is quite popular and has many successful applications [3, 6–9], it mainly exploits individual capabilities and is platform-specific. In the second approach, it is difficult to proactively invoke human capabilities at Internet scale due to the lack of techniques and systems supporting proactive utilization of human capabilities [1]. From a programming perspective, most current techniques concentrate on workflow-based solutions in which the workflow engine can find suitable humans and assign tasks to them.
However, many complex problems requiring both humans and software cannot be solved by using current workflow-based solutions, as these problems demand flexible interactions among humans and software services.

In this paper, we conceptualize human capabilities under the service model and combine them with software to establish clouds of hybrid human- and software-based services (HBS and SBS). This enables the provisioning of a large number of HBS together with SBS. Our approach aims at exploring novel ways to actively program and utilize human capabilities in a similar way to software services, and to provision human capabilities using cloud service and deployment models for high-level frameworks and programming languages to build "social computers".

Footnote b: WS-BPEL Extension for People (BPEL4People) Specification Version 1.1, November 2009. http://docs.oasis-open.org/bpel4people/bpel4people-1.1-spec-cd-06.pdf.

1.1. Motivation

Several works have shown that we need to further integrate humans and software under the service model [1, 10]. Hybrid services, in our notion, include SBS and HBS. We argue that we could provide a cloud of HBS working in a similar manner to contemporary clouds of SBS (such as Amazon services and Microsoft Azure services), so that HBS can be invoked and utilized in a proactive manner, rather than in a passive way as in existing crowdsourcing platforms. Furthermore, HBS can be programmed together with SBS in a composite application, enabling complex, dynamic interactions between SBS and HBS, instead of being used separately from SBS as in contemporary crowdsourcing platforms and workflows with little or no interaction.

Our goal is to program HBS and SBS together in an easier way, because several complex applications need to utilize SBS and HBS in a similar way. For example, several Information Technology (IT) problems, such as incident management for IT systems, software component development, and collaborative data analytics, can be described as a dependency graph of tasks in which a task represents a unit of work that should be solved by a human or by software. Solving a task may require concurrently considering other relevant tasks in the same graph, as well as introducing new tasks (which, in turn, expands the task graph). Utilizing team and hybrid services is important here because tasks are interdependent, unlike crowdsourcing scenarios, in which different humans solve different tasks without the context of teamwork and without the connectedness to SBS. Teamwork is crucial, as it allows team members to delegate tasks when they cannot deal with them, and allows new tasks that need to be solved to be identified and created. SBS for teamwork is crucial for team working platforms in terms of communication, coordination, and analytics. Therefore, it is crucial to have solutions to provision individual- and team-based human capabilities under clouds of human capabilities, in parallel with the provisioning of SBS. These clouds require novel service models and infrastructures to provide and support on-demand and elastic HBS provisioning.
We need solutions allowing us to buy and provision human capabilities via simple interfaces in a similar way to buying and provisioning virtual machines in contemporary clouds of Infrastructure-as-a-Service (IaaS) and Software-as-a-Service (SaaS). By doing so, we could incorporate human capabilities in programming paradigms for "social" computers. However, so far, to the best of our knowledge, there is no proposed solution towards a cloud model for human capabilities that enables us to acquire, program, and utilize HBS in a similar way to that of IaaS, Platform-as-a-Service (PaaS) and SaaS.

Existing technologies are not adequate; for example, workflow and language extensions and social computing platforms are focused too much on crowdsourcing models. The way to program human capabilities in contemporary workflows and crowdsourcing platforms is that either tasks are published for humans to bid on (e.g. in most crowdsourcing platforms) or tasks are directly mapped to humans by the workflow engines (e.g. in human-centric workflows). While the first mechanism allows people to select the task, it does not encourage people to interact with each other to solve the task. In the second mechanism, it is possible that the workflow engine actively matches suitable people to tasks (although currently such workflow engines are not popular and they tend to search for people only within local organizations). However, human capabilities are still utilized in a passive way, e.g. humans are assigned to tasks or perform simple control activities (e.g. approving a task). Furthermore, with current workflows, it is difficult to utilize large-scale human capabilities in a dynamic, selective way due to the lack of APIs to invoke human services. Moreover, workflow-based solutions exploiting human capabilities focus on defining how the tasks should be done, but not on how we can provision humans and software in a unified manner so that both humans and software can act as computing units of a single "social computer". Overall, programming flexible and dynamic relationships among software and humans is not well supported, thus hindering the incorporation of human services into programming paradigms.

1.2. Approach

To incorporate humans into programming paradigms, our approach aims at tackling the following issues from different aspects:

• Programming languages: First of all, we need to abstract human capabilities under compute units and provision them under service units [10] so that they can be easily incorporated into high-level program elements and constructs. Second, we can extend existing programming languages to support human compute units. Finally, we need to enable different data and control flows among software and human service units via extensible APIs, and develop new APIs to support other types of flows among software and human service units.
• Multiple programming models: By utilizing human service units as programming elements/constructs, we could support different programming models, such as shared memory, message passing, and artifact-centric models, via APIs working atop the service unit abstraction. Furthermore, contemporary workflow languages can be extended to exploit large-scale HBS.
• Execution environments: We will need to develop several components for managing the provisioning of HBS and the interaction of humans with the HBS abstracting them.
First, computing capability/profile management will allow us to conceptualize and define computing power, pricing and incentive models. We need to monitor and enforce incentives/rewards, quality of results, and availability, to name just a few. Several ways of communication between different structures of HBS and SBS must be supported.

In this paper, we will focus on abstracting and conceptualizing HBS and clouds of HBS, providing APIs and presenting basic programming techniques for hybrid services.

1.3. Contributions and paper structure

We concentrate on conceptualizing the cloud of HBS and on how clouds of HBS and SBS can be programmed for solving complex problems. Our main contributions are:

• a novel model for clouds of HBS and hybrid services provisioning,
• a framework for solving complex problems using clouds of hybrid services,
• programming primitives for hybrid services.

To illustrate our work, we present several examples of how to program software-based services and HBS in a unified way. This paper substantially extends the work described in Ref. 11. We have added our approach (Sec. 1.2), and have substantially detailed the concepts of hybrid services and extended them to cover other aspects such as archetypes and incentives (in Sec. 2). We have added a conceptual architecture of IaaS of hybrid services (in Sec. 2.5). We also extend the programming primitives by detailing our techniques, discuss how our framework can be used to implement HBS provisioning platforms, and elaborate on the related work.

The rest of this paper is organized as follows. Section 2 discusses our model of clouds of hybrid services. Section 3 describes a generic framework utilizing hybrid services. Section 4 presents programming primitives utilizing clouds of hybrid services. Section 5 discusses related work. Section 6 concludes the paper and outlines our future work.

2. Conceptualizing Clouds of Hybrid Services

2.1. Clouds of hybrid services

In our work, we consider two types of computing elements: software-based computing elements and human-based computing elements. In software-based computing elements, different types of services can be provided to exploit machine capabilities, and we consider these types of services under the SBS category. Similarly, human-based computing elements can also offer different types of services under the HBS category.

Definition 2.1 (Cloud of HBS). A cloud of HBS includes HBS that can be provisioned, deployed and utilized on-demand based on different pricing and incentive models.

Models for SBS and their clouds are well defined [12]. By combining SBS with our model for HBS, we consider a cloud of hybrid services as follows:

Definition 2.2 (Cloud of hybrid services). A cloud of hybrid services includes SBS and HBS that can be provisioned, deployed and utilized on-demand based on different pricing and incentive models.

In principle, a cloud of hybrid services can also be built atop clouds of SBS and clouds of HBS (by employing concepts for federated clouds). As SBS and clouds of SBS are well researched, in the following we will discuss models for clouds of HBS and of hybrid services.
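Definitions 2.1 and 2.2 can be read as a small data model. The Python sketch below is one possible reading and not the API proposed in the paper: the class names, the fields, and the string labels for pricing and incentive models are all assumptions introduced for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List


class Kind(Enum):
    SBS = "software-based service"
    HBS = "human-based service"


@dataclass
class Service:
    """A provisionable unit; field names are illustrative assumptions."""
    name: str
    kind: Kind
    pricing_model: str             # e.g. "pay-per-use"
    incentive_model: str = "none"  # mainly meaningful for HBS
    provisioned: bool = False


@dataclass
class HybridCloud:
    """A cloud holding both SBS and HBS, as in Definition 2.2."""
    services: Dict[str, Service] = field(default_factory=dict)

    def register(self, service: Service) -> None:
        self.services[service.name] = service

    def provision(self, name: str) -> Service:
        """Mark a registered service as provisioned on demand."""
        service = self.services[name]
        service.provisioned = True
        return service

    def by_kind(self, kind: Kind) -> List[Service]:
        return [s for s in self.services.values() if s.kind is kind]


if __name__ == "__main__":
    cloud = HybridCloud()
    cloud.register(Service("image-analysis", Kind.SBS, "pay-per-use"))
    cloud.register(Service("quality-review", Kind.HBS, "pay-per-use", "bonus"))
    print(cloud.provision("quality-review"))
```

Restricting such a cloud to services of kind HBS gives the cloud of HBS of Definition 2.1.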

2.2. Models for HBS

Human capabilities can be provisioned under the service model, e.g. our previous work introduced techniques for offering individual human capabilities under Web services [2]. However, at the moment, there exists no cloud system in which the consumer can program HBS in a way similar to IaaS (e.g. Amazon EC2) or data (e.g. Microsoft Azure Data Marketplace). Before discussing how clouds of hybrid services can be designed and used, we propose a conceptual model for clouds of HBS that covers the following aspects: (i) communication interfaces, (ii) human power unit (HPU), (iii) HBS archetypes, and (iv) pricing and incentive models.

2.2.1. HBS communication interfaces

Humans have different ways to interact with other humans and ICT systems. Conceptually, we can assume that HBS (and corresponding HBS clouds) abstracting human capabilities can provide different communication interfaces to handle tasks based on a request and response model. Requests can be used to describe tasks/messages that an HBS should perform or receive. In SBS, specific request representations (e.g. based on XML) are designed for specific software layers (e.g. application layer, middleware layer, or hardware layer). In HBS we can assume that a single representation can be used, as HBS do not have the layer structures seen in SBS (in the end, only humans will understand and process the messages), while the message content can be defined based on application needs. Requests in HBS can, therefore, be composed and decomposed into different (sub)requests. The use of the request/response model will facilitate the integration between SBS and HBS via similar service APIs. Unlike SBS, in which communication can be synchronous or asynchronous, in HBS all communication is asynchronous.
Clearly, the reason is that the semantics of human-level communication messages, the way in which humans take in the messages, and the high latency of communication between a requester (whether an HBS or an SBS) and an HBS prevent synchronous communication, in which an HBS would be expected to process the messages and send responses at the same time. In general, the upper bound of the communication delay and the internal request processing mechanism in HBS are unknown (and these issues are not the focus of this paper). However, HBS intercommunication can be modeled using the well-known message passing and shared memory models:
• Message passing: two HBS can directly exchange requests: hbs_i → hbs_j. One example is that hbs_i sends a request via SMS

to hbs_j. Similarly, an SBS can also send a request directly to an HBS.
• Shared memory: two HBS can exchange requests via an SBS. For example, hbs_i stores a request into a Dropbox (www.dropbox.com) directory and hbs_j obtains the request from the Dropbox directory. Similarly, an SBS and an HBS can also exchange requests/responses via an SBS or an HBS (e.g. software can be built atop Dropbox to trigger actions when a file is stored into a Dropbox directory (see http://www.wappwolf.com)).

Conceptualizing and Programming Hybrid Services in the Cloud

Fig. 1. Message passing and shared-memory communication models for services in clouds of HBS and SBS.

Figure 1 describes possible message passing and shared-memory communication models for services in clouds of HBS and SBS. Message-passing middleware can be further divided into different channels implemented by different services, e.g. Short Message Service (SMS) and Instant Messaging (IM) are dedicated to humans, while Message-oriented Middleware (MOM) can be used for both SBS and HBS. In particular, the implementation of communication channels for HBS (among humans and cloud middleware) can benefit from well-researched collaboration services [13]. Similarly, the shared-memory middleware can be built based on different services, offering different types of “shared memory”, such as file-based shared memory, like Dropbox and Amazon S3 (http://aws.amazon.com/s3/), and database-based shared memory, like MongoDB (http://www.mongodb.org/) and Amazon DynamoDB (http://aws.amazon.com/dynamodb/). Both message-passing and shared-memory communication middleware can be used internally for services within a cloud or externally for the communication between service consumers and services within clouds. Conceptually, we could have all of these middleware under the same set of APIs. In both communication models, the structures of messages sent to HBS are designed for human comprehension. Similar to machine instances, which offer facilities for remote job deployment and execution, an HBS communication interface can be used to run requests/jobs on HBS.

2.2.2. Human power unit (HPU)

The first issue is to define a basic model for describing the notion of “computing power” of HBS. Usually, the computing capability of a human-based computing


element is described via human skills and skill levels. Although there is no standard way to compare skills and skill levels described and/or verified by different people and organizations, we think that it is feasible to establish a common, comparable set of skills for a particular cloud of HBS.
• The cloud can enforce different evaluation techniques to ensure that any HBS in its system declares skills and skill levels in a cloud-wide consistent manner. This is, for example, similar to some crowdsourcing systems which have rigorous tests to verify claimed skills.
• The cloud can use different benchmarks to test humans to validate skills and skill levels. Each benchmark can be used to test a skill and skill level. This is, e.g. similar to Amazon, which uses benchmarks to define its elastic compute unit.
• The cloud can map different skills from different sources into a common view which is consistent in the whole cloud.

We define HPU for an HBS as follows:

Definition 2.3 (Human Power Unit). HPU is a value describing the computing power of an HBS measured in an abstract unit. A cloud of HBS has a pre-defined basic power unit, $hpu_\theta$, corresponding to the baseline skill $bs_\theta$ of the cloud.

Without loss of generality, we assume $hpu_\theta = f(bs_\theta)$. A cloud C provisioning HBS can support a set of n skills $SK = \{sk_1, \ldots, sk_n\}$ and a set of m cloud skill levels $SL = \{1, \ldots, m\}$. C can define the HPU with respect to $sk_i$ for $sl_j$ as follows:

\[ hpu(sk_i, sl_j) = hpu_\theta \times f\left(\frac{sk_i}{bs_\theta}\right) \times sl_j. \tag{1} \]

Here $f(sk_i/bs_\theta)$ indicates a way to determine a weighted factor when comparing the skill $sk_i$ against the baseline $bs_\theta$. For the cloud C, $f(sk_i/bs_\theta)$ is known and pre-defined (based on the definition of SK). For example, let $bs_\theta$ be the basic Testing skill, $sk_i$ be the basic UnitTesting skill and $sk_j$ be the basic IntegrationTesting skill; $f(UnitTesting/Testing) = 2$ and $f(IntegrationTesting/Testing) = 8$ could be very simple examples.

Given the capability of an hbs, $CS(hbs) = \{(sk_1, sl_1), \ldots, (sk_u, sl_u)\}$, the corresponding HPU can be calculated as follows:

\[ hpu(CS(hbs)) = \sum_{i=1}^{u} hpu(sk_i, sl_i). \tag{2} \]
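The HPU calculation in Eqs. (1) and (2) can be sketched in Java as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name HpuCalculator, its method names, the baseline value of 1.0 for the basic power unit, and the skill levels 3 and 2 are hypothetical; only the weights 2 and 8 come from the example in the text.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of Eqs. (1) and (2); all names and sample values are illustrative assumptions. */
class HpuCalculator {
    private final double baseHpu;                   // hpu_theta, the basic power unit of the cloud
    private final Map<String, Double> skillWeight;  // f(sk_i / bs_theta), pre-defined per skill

    HpuCalculator(double baseHpu, Map<String, Double> skillWeight) {
        this.baseHpu = baseHpu;
        this.skillWeight = skillWeight;
    }

    /** Eq. (1): hpu(sk_i, sl_j) = hpu_theta * f(sk_i / bs_theta) * sl_j. */
    double hpu(String skill, int skillLevel) {
        Double weight = skillWeight.get(skill);
        if (weight == null) {
            throw new IllegalArgumentException("Skill not supported by this cloud: " + skill);
        }
        return baseHpu * weight * skillLevel;
    }

    /** Eq. (2): sum of hpu(sk_i, sl_i) over the capability set CS(hbs). */
    double hpu(Map<String, Integer> capabilitySet) {
        double total = 0.0;
        for (Map.Entry<String, Integer> entry : capabilitySet.entrySet()) {
            total += hpu(entry.getKey(), entry.getValue());
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("UnitTesting", 2.0);          // example weight from the text
        weights.put("IntegrationTesting", 8.0);   // example weight from the text
        HpuCalculator cloud = new HpuCalculator(1.0, weights); // assumed hpu_theta = 1.0

        Map<String, Integer> capabilitySet = new LinkedHashMap<>();
        capabilitySet.put("UnitTesting", 3);        // assumed skill level
        capabilitySet.put("IntegrationTesting", 2); // assumed skill level
        System.out.println(cloud.hpu(capabilitySet)); // 1.0*2*3 + 1.0*8*2
    }
}
```

With these assumed inputs, the UnitTesting entry contributes 6 units and the IntegrationTesting entry contributes 16 units, so the capability set is rated at 22 units.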

Note that two HBS can have the same hpu value, even if their skills are different. To distinguish them, we propose to use a set of “architecture” types (see Sec. 2.2.3). Given a human offering her capabilities to C, she can be used exclusively or shared among different consumers. In case an hbs is provisioned exclusively for a particular consumer, the hbs can be associated with a theoretical utilization u, describing the utilization of a human, and CS(hbs); its theoretical HPU would be $u \times hpu(CS(hbs))$. In case an hbs is provisioned for multiple consumers, the hbs can be described as a set of multiple instances, each having a theoretical power of $u_i \times hpu(CS_i(hbs))$, where $u = \sum_{i=1}^{q} u_i \le 1$ and $CS(hbs) = CS_1(hbs) \cup CS_2(hbs) \cup \cdots \cup CS_q(hbs)$. Using this model, we can determine theoretical power for individual HBS as well as for a set of individual HBS. Note that the power of a set of HBS may be more than the sum of power units of its individual HBS, due to teamwork. However, we can assume that, similar to individual machines and clusters of machines, theoretical power units are different from the real ones and are mainly useful for selecting HBS and defining prices.

2.2.3. HBS archetype

As an HBS can potentially offer different capabilities, similar to SBS, an HBS can be considered to offer a set of types of solutions for a set of domains. For example, an HBS can offer a set of solutions as SO = {({WebDataAnalytics, TwitterAnalytics}, DataAnalytics), ({DataCleansing, DataEnrichment}, DataQualityImprovement)}, where {WebDataAnalytics, TwitterAnalytics, DataCleansing, DataEnrichment} are types of solutions and {DataAnalytics, DataQualityImprovement} are domains. Therefore, an HPU of an HBS can be associated with types of solutions and domains to indicate the processing capability of the HBS for a specific solution in a specific domain. To allow this association, we propose to use a set of common “architecture” types, called Archetype, to indicate the type of solutions in a particular domain for which the HPU is determined. This is similar to, e.g. different types of instruction set architectures (such as x86, SPARC, and ARM).

2.2.4. Pricing and incentive models

As we have observed, SBS often come with pricing models, but many SBS are also offered for free, in particular in volunteer and peer-to-peer computing systems [14], due to different incentives. Similarly, in crowdsourcing, free human efforts are quite popular [15–18]. In the case of HBS, it is obvious that pricing models will need to be identified for HBS, e.g. by the HBS cloud provider in agreement with the HBS provider or by other methods [19]. However, an HBS can also declare itself as a free service while requiring rewards, via some incentive models, when it is used (see footnote g).

Footnote g. In the literature, many incentive models give rewards of monetary value. Furthermore, in volunteer computing, there exists the concept of “pay for participation”. For the sake of simplicity, we consider all monetary values under pricing models because these services, although in principle unlike services with commercial intention, impose some monetary pricing models that one has to pay when using the services. In other words, incentive models in our cloud of HBS reward non-monetary values, such as reputation.


Therefore, in our model, each HBS will be associated with a set of pricing models and incentive models. This is different from SBS, which are associated with pricing models but not with incentive models. Note that for pricing and incentive models, there must be techniques to support billing and incentive enforcement. However, similar to software systems, such billing and incentive enforcement can be decoupled. For example, when an HBS is utilized and we need to reward it, the consumer or the provider can send its results to an incentive system, which calculates and performs payments.

2.3. HBS instances provisioning

2.3.1. Types of HBS instances

For HBS we will consider two types of instances:

Definition 2.4 (Individual Compute Unit instances (iICU)). iICU describe instances of HBS built atop capabilities of individuals. An individual can provide different iICU. Analogous to SBS, an iICU is similar to an instance of a virtual machine or a software.

Definition 2.5 (Social Compute Unit instances (iSCU)). iSCU describe instances of HBS built atop capabilities of multiple individuals and SBS. Analogous to SBS, an iSCU is similar to a virtual cluster of machines or a complex set of software services.

In our approach, iICU is built based on the concept that an individual can offer her capabilities via services [2] and iSCU is built based on the concept of Social Compute Units [20], which represent a team of individuals.

2.3.2. HBS instance description

Let C be a cloud of hybrid services. All services in C can be described as follows: C = HBS ∪ SBS, where HBS is the set of HBS instances and SBS is the set of SBS instances. The model for SBS is well known in contemporary clouds and can be characterized as SBS(capability, price). The provisioning description models for HBS instances are proposed as follows:
• For an iICU, its provisioning description includes (CS, HPU, price, incentive, utilization, location, APIs).
• For an iSCU, its provisioning description includes (CS, HPU, price, incentive, utilization, connectedness, location, APIs).

From the consumer perspective, an iSCU can be offered by the cloud provider or the consumer can build its own iSCU. In principle, in order to build an SCU, the provider or the consumer can follow these steps: first, selecting suitable iICU for an iSCU and, second, combining and configuring SBS to have a working platform for the iSCU. The connectedness reflects the intercommunication topology connecting members of the iSCU, such as ring, star, and master-slave, typically configured via SBS. APIs describe how to communicate with and execute requests on HBS. Moreover, similar to SBS, HBS can also be linked to user rating information, often managed by third parties.

2.3.3. Pricing factors

Similar to existing SBS clouds, we propose that clouds of HBS define different pricing models for different types of HBS instances. The baseline for the prices can be based on hpuθ. We propose to consider the following specific pricing factors:
• Utilization: Unlike individual machines, whose theoretical utilization when sold is 100%, an ICU has a much lower theoretical utilization, e.g. a person working full time has a utilization of 33.33% (8 h per day). However, an SCU can theoretically have 100% utilization. The real utilization of an HBS is controlled by the HBS rather than by the consumer, as is the case for machine/software instances.
• Offering communication APIs: Different communication capabilities will foster the utilization of HBS. Therefore, the provider can also bill consumers based on communication APIs (e.g. charge more when SMS is enabled).
• Connectedness: Similar to capabilities of (virtual) networks between machines in a (virtual) cluster, the connectedness of an iSCU will have a strong impact on the performance of the iSCU. Similar to pricing models in existing collaboration services (see footnote h), the pricing factor for connectedness can be built based on which SBS and collaboration features are used for the iSCU.

Furthermore, other conventional factors used in SBS, such as usage duration and location, are considered.

2.3.4. Incentive factors

Incentive factors for ICU are determined by the ICU and/or the HBS cloud provider. This can be done when the ICU is registered and provisioned under the cloud.
When enforcing the incentive models of ICU, obviously all incentives must be attributed to the ICU. For SCU, this depends on how the SCU is structured and how the incentive strategies for the SCU are implemented. Thus, when building an SCU, its incentive strategies can also be programmed, e.g. using incentive programming frameworks [21], to allow the rewards for the whole SCU to be distributed to members of the SCU in the right way. Overall, the enforcement of incentive models will be carried out by the provider of HBS clouds.

Footnote h. Such as in Google Apps for Business (http://www.google.com/enterprise/apps/business/pricing.html).


2.4. Cloud APIs for provisioning hybrid services

Services in a cloud of hybrid services can be requested and provisioned on-demand. As APIs for provisioning SBS are well developed, we will focus on APIs for provisioning HBS. Table 1 describes some abstract APIs that we develop for HBS in our Vienna Elastic Computing Model (dsg.tuwien.ac.at/research/viecom). These abstract APIs are designed in a similar manner to common APIs for SBS. Figure 2 shows the main Java-based classes for the APIs. HPU, HBS, ICU and SCU are described by the HPU, HBS, ICU and SCU classes, respectively. Requests and messages for HBS are described by HBSRequest and HBSMessage, while skills and skill levels are described by Skill and SkillLevel. The cloud skills, described in CloudSkill, are built from Skill and SkillLevel. HBS and SBS are subclasses of Unit, which represents generic service units. Unit is associated with Cost, describing cost models, and Benefit, describing incentive models and other types of benefits. The VieCOMHBSImpl class describes the collection of APIs that can be used to discover and invoke HBS, as described in Table 1.

Table 1. Main (abstract) APIs for provisioning HBS.

APIs for service information and management
  listSkills( ); listSkillLevels( ): List all pre-defined skills and skill levels of clouds.
  listICU( ); listSCU( ): List all iICU and iSCU instances that can be used. Different filters, e.g. based on pricing/incentive, location, and skills, can be applied to the listing.
  negotiateHBS( ): Allow a consumer to send and negotiate a service contract with an iICU or an iSCU. In many cases, the cloud can just give the service contract and the consumer has to accept it (e.g. similar to SBS clouds) if the consumer wants to use the HBS.
  startHBS( ): Allow a consumer to start an iICU or an iSCU. Via this API, the consumer sends a message to the HBS cloud which, among other activities, passes a notification to the HBS that the HBS is being used from the consumer perspective. Depending on the provisioning contract, the usage can be time-based (subscription model) or task-based (pay-per-use model).
  suspendHBS( ): Allow a consumer to suspend the operation of an iICU or iSCU. Note that in suspended mode, the HBS is not yet released for other consumers.
  resumeHBS( ): Allow a consumer to resume the work of an iICU or iSCU.
  stopHBS( ): Allow a consumer to stop the operation of an iICU or iSCU. After stopping, the HBS is no longer available for the consumer.
  reduceHBS( ): Reduce the capabilities of an iICU or iSCU, for example, reduce the power unit and remove some specific communication APIs.
  expandHBS( ): Expand the capabilities of an iICU or iSCU, for example, increase the power unit and add some specific communication APIs.

APIs for service execution and communication
  runRequestOnHBS( ): Execute a request on an iICU or iSCU. By execution, the HBS will receive requests from the consumers and perform them.
  receiveResultFromHBS( ): Receive the result from an iICU or iSCU.
  sendMessageToHBS( ): Send (support) messages to HBS.
  receiveMessageFromHBS( ): Receive messages from HBS.

Fig. 2. Example of some Java-based APIs for clouds of HBS.

Currently, we simulate our cloud of HBS. The HBS can be accessed via APIs described in VieCOMHBSImpl. For SBS, we use existing APIs provided by cloud providers and common client API libraries, such as JClouds (www.jclouds.org) and boto (http://docs.pythonboto.org/en/latest/index.html).

These APIs provide different ways to acquire and interact with HBS. An open question is how the performance management of HBS can be supported, given that these APIs cannot tell whether an HBS really continues to work even after being suspended via the APIs. From the consumer perspective, the HBS will receive corresponding requests (based on APIs) and s/he will understand what the messages mean. In principle the HBS should follow the messages requested by the consumer, but whether the HBS really follows the request is a different matter, as humans may not necessarily follow requests strictly, even if they are supposed to. What happens inside an HBS cannot be controlled by the cloud. However, two possibilities could be supported. First, the cloud of HBS can control the quality of HBS and decide how to utilize the HBS based on the quality of results (e.g. time, cost and quality of data) delivered by the HBS. Second, the consumer can also control the quality of the HBS the consumer pays for. These control mechanisms are complex and beyond the scope of this paper; we have developed some solutions in Ref. 22. Still, due to the nature of human work, not all human activities within the HBS can be measured and controlled.
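The life-cycle semantics of the management APIs in Table 1 can be illustrated with a small state machine. The sketch below is hypothetical: the class HbsLifecycle, its State enum and its method names are assumptions made for illustration and are not part of VieCOMHBSImpl. It encodes only what Table 1 states, namely that a suspended HBS is not yet released for other consumers and that a stopped HBS is no longer available to the consumer. It models the consumer's view only; as noted above, the cloud cannot control what the human actually does.

```java
/** Hypothetical consumer-side view of an HBS instance life cycle (start, suspend, resume, stop). */
class HbsLifecycle {
    enum State { PROVISIONED, RUNNING, SUSPENDED, STOPPED }

    private State state = State.PROVISIONED;

    State state() { return state; }

    void start()   { transition(State.PROVISIONED, State.RUNNING); }
    void suspend() { transition(State.RUNNING, State.SUSPENDED); }
    void resume()  { transition(State.SUSPENDED, State.RUNNING); }

    /** Stopping is allowed from RUNNING or SUSPENDED; afterwards the HBS is released. */
    void stop() {
        if (state != State.RUNNING && state != State.SUSPENDED) {
            throw new IllegalStateException("Cannot stop from " + state);
        }
        state = State.STOPPED;
    }

    /** A suspended HBS is still held by the consumer; only STOPPED releases it. */
    boolean isReservedForConsumer() {
        return state == State.RUNNING || state == State.SUSPENDED;
    }

    private void transition(State expected, State next) {
        if (state != expected) {
            throw new IllegalStateException("Expected " + expected + " but was " + state);
        }
        state = next;
    }

    public static void main(String[] args) {
        HbsLifecycle hbs = new HbsLifecycle();
        hbs.start();
        hbs.suspend();
        System.out.println(hbs.isReservedForConsumer()); // true: suspended, not released
        hbs.stop();
        System.out.println(hbs.isReservedForConsumer()); // false: stopped, released
    }
}
```

Rejecting invalid transitions with an exception is one possible design choice; a real provisioning platform might instead ignore or queue such calls, depending on the provisioning contract.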

2.5. Conceptual architecture for hybrid unit as a service

Based on the conceptual models of clouds of hybrid services, we describe a conceptual architecture for establishing a cloud of hybrid services. Figure 3 outlines our conceptual architecture for provisioning and programming hybrid service units. At the lowest level, software, people and things can be provisioned by interfacing and integrating them with the Service-based Middleware. Using this middleware, we enable different types of integration for software, people and things due to their different interaction models. The Service-based Middleware basically provisions HBS and SBS to the consumer via programmable, extensible APIs, e.g. based on the list of APIs that we present in Sec. 2.4. The Service-based Middleware utilizes our concepts by abstracting software, things and people using the SBS and HBS unit models. This allows consumers to access software, people and things in a uniform way in which SBS and HBS will be mapped to underlying software, things, and people. To ensure the proper operation of this cloud, we need to implement Runtime Monitoring and Enforcement (e.g. for monitoring and enforcing costs, incentives and quality), Communication (e.g. for supporting the communication among HBS and SBS), Service Life-cycle Management (e.g. for supporting HBS selection and formation), and Capability/Profile Management (e.g. for managing service capabilities and HPU).

Fig. 3. Conceptual architecture for provisioning hybrid SBS/HBS services.

The Service-based Middleware will be the core of a hybrid service provisioning platform. Atop this, one can program HBS and SBS by using Provisioning/Negotiation/Communication Cloud APIs. In the next sections, we describe some utilization possibilities and how to program hybrid services.

3. Framework for Utilizing Hybrid Services

By utilizing hybrid services in clouds, we could potentially solve several complex problems that need both SBS and HBS. In our work, we consider complex problems that can be described by dependency graphs. Let DG be a dependency graph of tasks to be solved. It can be provided or extracted automatically. In order to solve a task t ∈ DG, we need to determine whether t will be solved by SBS, HBS or their combination. For example, let t be a virtual machine failure, where the virtual machine is provisioned by Amazon EC2. Two possibilities can be performed: (i) request a new virtual machine from Amazon EC2 and configure the new virtual machine so that it is suitable for the work or (ii) request an HBS to fix the virtual machine. In case (i) SBS can be invoked, while for case (ii) we need to invoke an HBS which might need to be provisioned with extra SBS for supporting the failure analysis.


Our approach for utilizing hybrid services includes the following points:
• Link tasks with their required HPUs via skills and skill levels, before programming how to utilize HBS and SBS.
• Form or select suitable iSCU or iICU for solving tasks. Different strategies will be developed for forming or selecting suitable iSCU or iICU, such as utilizing different ways to traverse the dependency graph and to optimize the formation objective.
• Program different strategies of utilizing iSCU and iICU, such as considering the elasticity of HBS due to changes of tasks and HBS. This is achieved by using programming primitives and constructs atop APIs for hybrid services.

Figure 4 describes the conceptual architecture of our framework for solving complex problems. Given a task dependency graph, we can detect changes in required human computing power by using Task Change Management. Detected changes in required power will be sent to Change Adaptation, which in turn triggers different operations on HBS usage, such as creating a new HBS or adapting an existing HBS. These operations are carried out by the HBS Formation service, which implements and integrates different algorithms for handling requests for HBS, each suitable for specific situations. Change Adaptation also decides whether a change should be applied to SBS by sending a change request to the SBS Adaptation service, which will perform the change and modify the task graph accordingly. When an HBS deals with a task graph, the HBS can change the task graph and its required HPUs (this will trigger HBS operations again). During the solving process, HBS can change and this can be detected by HBS Change Management. The HBS change will be sent to Change Adaptation.

Fig. 4. Conceptual architecture. (Components shown: task dependency graph, Task Change Management, HBS Change Management, Change Adaptation, HBS Formation with its algorithms, SBS Adaptation, and the cloud of hybrid services providing iICU and iSCU.)


At the time of writing, we have developed an SCU provisioning platform for forming, managing and controlling the quality of SCUs for independent tasks [22]. This platform utilizes HBS from simulated ICU clouds based on our concepts and APIs to form quality-aware SCUs, and we have also developed elasticity rules for adapting ICU and SBS [23]. The SCU expansion and reduction for dependent and evolving tasks are currently being prototyped. In the next section, we explain some concepts of programming hybrid services that are the key elements of the architecture depicted in Fig. 4.

4. Programming Hybrid Services

In this section, we discuss some programming primitives for hybrid services that can be applied to the framework that we described before. Such primitives can be used in different components, such as HBSFormation and ChangeAdaptation, in our framework described in Fig. 4. In illustrating programming examples, we consider a virtualized cloud of hybrid services that is built on top of our cloud of HBS and real-world clouds of SBS. Consequently, we will combine our APIs, described in Sec. 2.4, with existing client cloud API libraries. Our goal in this section is not to present specific algorithms, e.g. for HBSFormation, adaptation strategies, e.g. for ChangeAdaptation, or specific applications to solve specific tasks. Instead, we present how different algorithms, strategies or applications could be developed and integrated into our framework.

4.1. Modeling HPU-aware task dependency graphs

4.1.1. Task dependency graphs

Our main idea in modeling HPU-aware task dependencies is to link tasks with the management skills and compliance constraints that they require:
• human resource skills: Represent skill sets that are required for dealing with problems/management activities;
• constraints: Represent constraints, such as resource locations, governance compliance, time, cost, etc. that are associated with management activities and humans dealing with these activities.

Given a dependency graph of tasks, these types of information can be provided manually or automatically (e.g. using knowledge extraction). Generally, we model dependencies among tasks and required skills and compliance constraints as a directed graph G(N, E), where N is a set of nodes and E is a set of edges. A node n ∈ N represents a task or required skills/compliance constraints, whereas an edge e(n_i, n_j) ∈ E means that n_j is dependent on n_i (n_i can cause some effect on n_j or n_i can manage n_j).
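The directed graph G(N, E) defined above can be sketched with plain Java collections. This is an illustrative assumption, not the authors' implementation: the class DependencyGraph, its method names, the node type names and the sample nodes and weights are hypothetical. An edge e(n_i, n_j) is stored so that n_j is listed as a dependent of n_i, following the definition above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of G(N, E): nodes are tasks or skills/constraints, edges carry weights. */
class DependencyGraph {
    enum NodeType { TASK, SKILL_OR_CONSTRAINT }

    private final Map<String, NodeType> nodes = new LinkedHashMap<>();
    // dependents.get(ni) lists every nj with an edge e(ni, nj), i.e. nj depends on ni
    private final Map<String, List<String>> dependents = new LinkedHashMap<>();
    private final Map<String, Double> weights = new LinkedHashMap<>();

    void addNode(String name, NodeType type) {
        nodes.put(name, type);
        dependents.putIfAbsent(name, new ArrayList<>());
    }

    /** Adds e(ni, nj) with a weight indicating the importance of the edge. */
    void addEdge(String ni, String nj, double weight) {
        if (!nodes.containsKey(ni) || !nodes.containsKey(nj)) {
            throw new IllegalArgumentException("Both nodes must be added before the edge");
        }
        dependents.get(ni).add(nj);
        weights.put(ni + "->" + nj, weight);
    }

    NodeType typeOf(String name) { return nodes.get(name); }

    List<String> dependentsOf(String ni) {
        return new ArrayList<>(dependents.getOrDefault(ni, Collections.emptyList()));
    }

    double weightOf(String ni, String nj) {
        return weights.getOrDefault(ni + "->" + nj, 0.0);
    }

    public static void main(String[] args) {
        DependencyGraph g = new DependencyGraph();
        g.addNode("db2", NodeType.TASK);                                // sample task node
        g.addNode("was", NodeType.TASK);                                // sample task node
        g.addNode("DatabaseManagement", NodeType.SKILL_OR_CONSTRAINT);  // sample skill node
        g.addEdge("db2", "was", 1.0);                 // was depends on db2
        g.addEdge("DatabaseManagement", "db2", 0.5);  // the skill manages db2 (assumed weight)
        System.out.println(g.dependentsOf("db2"));
    }
}
```

Keeping tasks and skills in one node set mirrors the single-graph model of the text; a typed graph library could equally be used.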
Edges may be associated with weighted factors to indicate the importance of edges. The required skills, compliance constraints and weighted factors will be used to determine the required HPU for a task, to select iICU and members for iSCU, and to build the connectedness for SCUs.

4.1.2. Examples and implementation

Figure 5 presents an example of a dependency graph of an IT system linked to management skills. In this system, we have a LotusDomino system (described by lotusdomino) deployed in a Web Application Server (described by was). The Web Application Server is deployed on an AIX server (described by aix) and depends on a DB2 server (described by db2). The DB2 server depends on a NAS box (described by nasbox) and a network (described by network). The AIX server is dependent on an EMC backup system (described by emcbackup), which depends on network. Each software node in the IT system has different requirements for HBS in order to solve IT problems that arise.

Fig. 5. An example of HPU-aware dependency graph. A component box describes a software and its problems (ITProblem node). An ellipse describes management skills (Management node). (ITProblem nodes shown: lotusdomino, was, aix, db2, emcbackup, nasbox, network. Management nodes shown: Web Middleware, Email and Collaboration Services, Business Applications Services, Platform Support Unix, Storage DASD Backup Restore, Database Management, Network Service. Edge labels: isDeployedOn, dependsOn, supportedBy.)

In our implementation of the dependency graph, we use JGraphT (http://jgrapht.org/). We define two main types of Node: ITProblem and Management. All relationships are dependencies. It is also possible to use TOSCA [24] to link people skills and map a TOSCA-based description to JGraphT.

4.2. Combining HBS and SBS

Combining HBS and SBS is a common need in solving complex problems (e.g. in evaluating quality of data in simulation workflows). In our framework, this feature can be used for preparing inputs managed by SBS for an HBS work or managing outputs from HBS work. Furthermore, it can be used to provision SBS as utilities


for HBS work (e.g. requiring HBS to utilize specific SBS in order to produce the result where SBS is provisioned by the consumer). Example: Listing 1 shows an example of programming a combination of HBS and SBS for a task using our cloud APIs and JClouds. In this example, we want to invoke Amazon S3 to store a log file of a Web application server and invoke an HBS to find problems. In this way, we can also combine HBS with HBS and, of course, SBS with SBS from different clouds.

    // using JClouds APIs to store log file of web application server
    BlobStoreContext context = new BlobStoreContextFactory().createContext("aws-s3", "REMOVED", "REMOVED");
    BlobStore blobStore = context.getBlobStore();
    // .... and add file into Amazon S3
    Blob blob = blobStore.blobBuilder("hbstest").build();
    blob.setPayload(new File("was.log"));
    blobStore.putBlob("hbstest", blob);
    String uri = blob.getMetadata().getPublicUri().toString();
    VieCOMHBS vieCOMHBS = new VieCOMHBSImpl();
    // assume that WM6 is the HBS that can analyze the Web Middleware problem
    vieCOMHBS.startHBS("WM6");
    HBSRequest request = new HBSRequest();
    request.setDescription("Find possible problems from " + uri);
    vieCOMHBS.runRequestOnHBS("WM6", request);

Listing 1. Example of HBS combined with SBS.

4.3. Forming and configuring iSCUs

A cloud provider can form an iSCU and provide it to the consumer, and a consumer can also select iICU and SBS to form an iSCU. An iSCU not only includes HBS (iICU or other sub-iSCU) but also consists of possible SBS for ensuring the connectedness within the iSCU and for supporting the work and interaction within the iSCU. There are different ways to form SCUs. In the following, we will describe some approaches for forming SCUs to solve a dependency graph of tasks.

4.3.1. Selecting resources for iSCU

Figure 6 describes a general concept of how iSCU forming algorithms work. To form an iSCU, we need to consider both Business-as-Usual (BAU) and corrective action (CA) cases. Given a task t ∈ DG, our approach in dealing with t is that we do not simply take the required management resources suitable for t; we also need to consider possible impacts of other tasks when solving t and the chain of dependencies. To this end, we utilize DG to determine a set of suitable human resources to deal with t and the possible impact of t. Such human resources establish HBS capabilities in an iSCU. Overall, the following steps are carried out to determine the required SCU:
• Step 1: Determine DG_BAU ⊆ DG, where DG_BAU includes all t_j such that there exists a walk (t_j, t); t_j is a task that must be dealt with together with t in typical BAU cases.
• Step 2: Determine DG_CA ⊆ DG that includes tasks that should be taken into account under CA cases: DG_CA = {t_r} such that there exists a walk (t_r, t_j) with t_j ∈ DG_BAU.
• Step 3: Merge DG_SCU = DG_BAU ∪ DG_CA by (re)assigning weighted factors to links between (t_k, t_l) ∈ DG_SCU based on (i) whether t_k and t_l belong to DG_BAU or DG_CA, (ii) the reaction chain from t to t_k or to t_l and (iii) the original weighted factor of links consisting of t_k or t_l.
• Step 4: Traverse DG_SCU and, ∀ t_i ∈ DG_SCU, consider all (t_i, r_i), where r_i is a management resource node linking to t_i, in order to determine human resources.

Fig. 6. General model for forming iSCU.

Based on the above-mentioned description, different SCU formation strategies can be developed. Note that the principles mentioned above aim at forming an iSCU that is sufficient for solving the main tasks and at letting the iSCU evolve during its runtime. There could be several possible ways to obtain DG_BAU and DG_CA, dependent on specific configurations and graphs for specific problems. Therefore, potentially the cloud of HBS

Table 2. Examples of SCU formation strategies.

Algorithm: Description

SkillWithNPath: Select iICU for iSCU based only on skills, with a pre-defined network path length starting from the task to be solved.
SkillMinCostWithNPath: Select iICU for iSCU based only on skills with minimum cost, considering a pre-defined network path length starting from the task to be solved.
SkillMinCostMaxLevelWithNPath: Select iICU for iSCU based on skills with minimum cost and maximum skill levels, considering a pre-defined network path length starting from the task to be solved.
SkillWithNPathUnDirected: Similar to SkillWithNPath but considering undirected dependency.
MinCostWithNPathUnDirected: Similar to MinCostWithNPath but considering undirected dependency.
MinCostWithAvailNPathUnDirected: Select iICU for iSCU based on skills with minimum cost, considering availability and a pre-defined network path length starting from the task to be solved. Undirected dependencies are considered.

can provide several algorithms for selecting HBS to form SCUs. As we aim at presenting a generic framework, we do not describe specific algorithms here; however, Table 2 lists some selection strategies that we implement in our framework. Listing 2 shows an example of forming an SCU.

DefaultDirectedGraph<Node, Relationship> dg; // graph of problems
// ...
double hpu = HPU.hpu(dg); // determine
SCUFormation app = new SCUFormation(dg);
ManagementRequest request = new ManagementRequest();
// define request specifying only main problems to be solved
// ....
// call algorithms to find suitable HBS.
// Path length = 2 and availability from 04:00 to 19:00 in the GMT zone
ResourcePool scu = app.MinCostWithAvailabilityNPathUnDirectedFormation(request, 2, 4, 19);
if (scu == null) {
    return;
}
ArrayList<HumanResource> scuMembers = scu.getResources();
SCU iSCU = new SCU();
iSCU.setScuMembers(scuMembers);
// setting up SBS for scuMember ...

Listing 2. Example of forming iSCU by minimizing cost and considering no direction.
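The bounded traversal behind Step 1 and the "NPath" strategies of Table 2 can be illustrated with a small, self-contained sketch. It is not the framework's SCUFormation implementation: the class NPathSelection, the use of plain strings as task identifiers and the adjacency-map representation of the graph are all assumptions made for illustration. The method collects every task $t_j$ that has a walk $(t_j, t)$ of at most maxLen edges.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper, not part of the framework described in the paper. */
final class NPathSelection {

    /**
     * Returns all tasks tj (other than t) that have a walk (tj, t) of at most
     * maxLen edges. deps maps each task to the tasks it points to (edge tj -> tk).
     */
    static Set<String> tasksWithWalkTo(Map<String, Set<String>> deps, String t, int maxLen) {
        // Reverse the edges so that the search can run backwards from t.
        Map<String, Set<String>> reversed = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : deps.entrySet()) {
            for (String target : e.getValue()) {
                reversed.computeIfAbsent(target, k -> new LinkedHashSet<>()).add(e.getKey());
            }
        }
        // Breadth-first search, bounded by maxLen hops.
        Map<String, Integer> depth = new HashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        Set<String> result = new LinkedHashSet<>();
        depth.put(t, 0);
        queue.add(t);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            int d = depth.get(current);
            if (d == maxLen) {
                continue; // do not expand beyond the path-length limit
            }
            for (String pred : reversed.getOrDefault(current, Collections.<String>emptySet())) {
                if (!depth.containsKey(pred)) {
                    depth.put(pred, d + 1);
                    queue.add(pred);
                    result.add(pred);
                }
            }
        }
        return result;
    }
}
```

For a graph with the edges c → a, a → b, b → t and d → t, a path length of 2 selects a, b and d but not c, which is three edges away from t. Skill, cost and availability filters, as named in Table 2, would then be applied to the resources linked to the selected tasks.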


4.3.2. Setting up iSCU connectedness

After selecting the members of an iSCU, we can also program SBS and HBS for the iSCU to obtain a complete working environment. An iSCU can have different connectedness configurations, such as:

• Ring-based iSCU: The topology of the iSCU is based on a ring. In this case, for each $(hbs_i, hbs_j) \in iSCU$ we program the request link $hbs_i \rightarrow hbs_j$ based on message-passing or shared-memory models. For example, a common Dropbox directory can be created for $hbs_i$ and $hbs_j$ to exchange requests/responses.
• Star-based iSCU: A common SBS can be programmed as a shared memory for the iSCU. Let $sbs$ be the SBS for the iSCU; then $\forall hbs_i \in iSCU$, give $hbs_i$ access to $sbs$. For example, a common Dropbox directory can be created and shared among all $hbs_i \in iSCU$.

SCU iSCU;
// ... find members for SCU
DropboxAPI<WebAuthSession> scuDropbox; // using Dropbox APIs
// ...
AppKeyPair appKeys = new AppKeyPair(APP_KEY, APP_SECRET);
WebAuthSession session = new WebAuthSession(appKeys, WebAuthSession.AccessType.DROPBOX);
// ...
session.setAccessTokenPair(accessToken);
scuDropbox = new DropboxAPI<WebAuthSession>(session);
// share the Dropbox directory with all SCU members
// first create a share
DropboxAPI.DropboxLink link = scuDropbox.share("/hbscloud");
// then send the link to all members
VieCOMHBS vieCOMHBS = new VieCOMHBSImpl();
for (HBS hbs : iSCU.getScuMembers()) {
    vieCOMHBS.startHBS(hbs);
    HBSMessage msg = new HBSMessage();
    msg.setMsg("pls. use shared Dropbox for communication " + link.url);
    vieCOMHBS.sendMessageToHBS(hbs, msg);
    // ...
}

Listing 3. Example of star-based iSCU using Dropbox as a communication hub.


• Master-slave iSCU: An $hbs \in iSCU$ can play the role of a shared memory and scheduler for all other $hbs_i \in iSCU$.

Listing 3 presents an example of establishing the connectedness for an iSCU using Dropbox. Note that finding suitable configurations by using HBS information and compliance constraints is a complex problem that is out of the scope of this paper.

4.4. Change model for task graph's human power unit

When a member of an iSCU receives a task, she might revise the task into a set of sub-tasks. She might then specify the human compute units required for the sub-tasks and revise the task graph by adding these sub-tasks. As the task graph changes, its required HPU changes as well. By capturing the change of the task graph, we can decide to scale the iSCU in or out. Listing 4 describes some primitives for scaling the iSCU in or out based on the change of HPU.

SCU iSCU;
// ...
iSCU.setScuMembers(scuMembers);
// setting up SBS for scuMember
// ...
double hpu = HPU.hpu(dg); // determine current hpu
// SCU solves/adds tasks in DG
// ....
// graph change - elasticity based on human power unit
double dHPU = HPU.delta(dg, hpu);
DefaultDirectedGraph<Node, Relationship> changegraph;
// obtain changes
Set<CloudSkill> changeCS = HPU.determineCloudSkill(changegraph);
if (dHPU > SCALEOUT_LIMIT) {
    iSCU.scaleout(changeCS); // expand iSCU
} else if (dHPU < SCALEIN_LIMIT) {
    iSCU.scalein(changeCS); // reduce iSCU
    // ...
}

Listing 4. Example of elasticity for SCU based on task graph change.
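The threshold test at the end of Listing 4 can be isolated in a small runnable sketch. How HPU is computed from the task graph is defined elsewhere in the paper and is not reproduced here. The class ElasticityDecision, the Action enum, the reading of the HPU change as a simple difference and the numeric limits are assumptions for illustration only.

```java
/** Hypothetical sketch of the threshold test in Listing 4; not the framework API. */
final class ElasticityDecision {

    enum Action { SCALE_OUT, SCALE_IN, KEEP }

    /** Change of required HPU between two measurements (assumed: a plain difference). */
    static double delta(double currentHpu, double previousHpu) {
        return currentHpu - previousHpu;
    }

    /**
     * Maps the HPU change to a scaling action. The two limits are assumed,
     * application-specific thresholds with scaleInLimit <= scaleOutLimit.
     */
    static Action decide(double dHpu, double scaleOutLimit, double scaleInLimit) {
        if (scaleInLimit > scaleOutLimit) {
            throw new IllegalArgumentException("scaleInLimit must not exceed scaleOutLimit");
        }
        if (dHpu > scaleOutLimit) {
            return Action.SCALE_OUT; // task graph grew: expand the iSCU
        }
        if (dHpu < scaleInLimit) {
            return Action.SCALE_IN;  // task graph shrank: reduce the iSCU
        }
        return Action.KEEP;          // change is within the tolerated band
    }
}
```

Keeping a band between the two limits in which nothing happens is a design choice of this sketch: it avoids recruiting and releasing members on every small revision of the task graph.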

5. Related Work

Although both humans and software can perform similar work, and several complex problems require both of them in the same system, currently there is a


lack of programming models and languages for hybrid services of SBS and HBS. Most clouds of SBS offer different possibilities to acquire SBS on demand; however, similar efforts for HBS are missing today.

Cloud models and APIs for HBS: Tai et al. [25] outlined several research questions in cloud service engineering to support "everything is a service", in which services can be provided and integrated from different providers and charged based on different costs and values. However, contemporary systems focus only on SBS. Several frameworks for engineering cloud applications based on different IaaS, PaaS and SaaS offerings, such as Aneka [26] and BOOM [27], have been introduced. Generally, they utilize software-based cloud resources via different sets of APIs, such as JClouds, Boto^k and OpenStack^l, to develop applications under different programming models, such as MapReduce and dataflows. These frameworks do not consider hybrid services consisting of SBS and HBS, whereas our work conceptualizes and provides programming techniques for both SBS and HBS. To the best of our knowledge, there is no other work that proposes HBS cloud models.

Programming HBS and SBS in a unified way: Most clouds of SBS offer different possibilities to acquire SBS on demand. However, researchers have not devoted similar efforts to HBS. A common way to utilize human capabilities is to exploit human computation programming frameworks, e.g. Crowdforge [28], TurKit [29] and the Jabberwocky framework [30], for utilizing crowds to solve complex problems [3,31]. However, these works do not consider how to integrate and virtualize software in a manner similar to that for humans.
As we have analyzed, current support can be divided into three approaches [1]: (i) using plugins to interface to humans, such as BPEL4People^b or tasks integrated into SQL processing systems [9]; (ii) using separate crowdsourcing platforms, such as MTurk^m; and (iii) using workflows, such as Turkomatic [6]. A drawback is that all of them consider humans individually, and human capabilities have not been provisioned in a manner similar to software capabilities. As a result, an application must split tasks into sub-tasks that are suitable for individual humans, who do not collaborate with each other, before the application can invoke humans to solve these sub-tasks. Furthermore, the application must join the results from several sub-tasks, and it is difficult to integrate work performed by software with work performed by humans. This is not trivial for the application when dealing with complex problems that require human capabilities. In terms of communication and coordination models, existing models also support message push/pull/mediator patterns, but these are built into platforms or middleware rather than offered as reusable programming primitives of programming models. Our work in this paper does not focus on managing and coordinating tasks, but by proposing high-level APIs for HBS in a manner similar to those for SBS, our work could

k http://boto.s3.amazonaws.com/index.html.
l http://www.openstack.org/.
m Amazon mechanical turk, 2011. Last access: 27 Nov 2011.


foster the utilization of several HBS and SBS from different clouds, based on cloud business models, for different task management and coordination strategies.

Software tools for HBS: Some recent efforts have been devoted to software engineering tools for human services, such as Ref. 32, and to general-purpose programming languages for human computation, such as CrowdLang [33]. While they call for better software engineering and programming language support for human-centric systems, they do not address issues related to the provisioning of human services, e.g. using cloud and service models. Although we do not develop new general-purpose programming languages, we believe that if these works need to utilize human capabilities and software services in a large-scale, on-demand, pay-per-use fashion, then our models and techniques can be integrated into these software tools and languages.

Overall, compared with related work, we develop models for clouds of HBS. Our techniques for virtualizing HBS and programming HBS in a way similar to SBS distinguish our approach from related work. Such techniques can be used by high-level programming primitives and languages for social computers.

6. Conclusions and Future Work

In this paper, we have proposed novel methods for modeling clouds of HBS and described how we can combine them with clouds of SBS to create hybrid services. We believe that clouds of hybrid services are crucial for complex applications that need to proactively invoke SBS and HBS in similar ways. We have described the main concepts for establishing clouds of hybrid services, covering several aspects such as conceptual models and provisioning architectures for communication, pricing and incentive models, and programming APIs. Based on these, we have presented general frameworks and programming APIs that describe where and how hybrid services can be programmed.

In this paper, we focus on designing models, frameworks and APIs, and on illustrating programming examples. We have presented a broad view on conceptualizing and programming hybrid services but have not addressed detailed activities in provisioning and managing the operation and interaction within HBS clouds and within ICUs/SCUs. These will be subjects of future research. Further real-world experiments should be conducted to demonstrate the benefits of programming HBS and SBS in the same system.

With respect to software development for our concepts, we are currently working on programming elements, constructs and patterns for hybrid services that treat different relationships and cost/quality as first-class entities in our programming models. Another direction is hybrid service life-cycle management. This is strongly related to how pricing and incentive strategies are monitored and enforced within a cloud infrastructure of hybrid services. Furthermore, we are working on the integration with programming languages for social collaboration processes [5] using hybrid services. Other related aspects, such as pricing models and contract negotiation protocols, will also be investigated.


Acknowledgments

This paper is an extended version of the paper published in Ref. 11. The work mentioned in this paper is partially supported by the EU FP7 SmartSociety^n.

n http://www.smart-society-project.eu/.

References

1. S. Dustdar and H. L. Truong, Virtualizing software and humans for elastic processes in multiple clouds - a service management perspective, IJNGC 3(2) 2012.
2. D. Schall, H. L. Truong and S. Dustdar, Unifying human and software services in web-scale collaborations, IEEE Internet Comput. 12(3) (2008) 62–68.
3. A. Doan, R. Ramakrishnan and A. Y. Halevy, Crowdsourcing systems on the world-wide web, Commun. ACM 54(4) (2011) 86–96.
4. D. Oppenheim, L. R. Varshney and Y.-M. Chee, Work as a service, in ICSOC, eds. G. Kappel, Z. Maamar and H. R. Motahari Nezhad, Lecture Notes in Computer Science, Vol. 7084 (Springer, 2011), pp. 669–678.
5. V. Liptchinsky, R. Khazankin, H.-L. Truong and S. Dustdar, Statelets: Coordination of social collaboration processes, in 14th Int. Conf. Coordination Models and Languages (Coordination 2012), Stockholm, Sweden, June 2012.
6. A. P. Kulkarni, M. Can and B. Hartmann, Turkomatic: Automatic recursive task and workflow design for mechanical turk, in Proc. 2011 Annual Conference Extended Abstracts on Human Factors in Computing Systems, CHI EA '11 (ACM, New York, NY, USA, 2011), pp. 2053–2058.
7. D. W. Barowy, E. D. Berger and A. McGregor, Automan: A platform for integrating human-based and digital computation, Technical Report UMass CS TR 2011-44, University of Massachusetts, Amherst, 2011. http://www.cs.umass.edu/emery/pubs/AutoMan-UMass-CS-TR2011-44.pdf.
8. H. S. Baird and K. Popat, Human interactive proofs and document image analysis, in Proc. 5th Int. Workshop on Document Analysis Systems V, DAS '02 (Springer-Verlag, London, UK, 2002), pp. 507–518.
9. A. Marcus, E. Wu, D. Karger, S. Madden and R. Miller, Human-powered sorts and joins, Proc. VLDB Endow. 5 (2011) 13–24.
10. S. Tai, P. Leitner and S. Dustdar, Design by units: Abstractions for human and compute resources for elastic systems, IEEE Internet Comput. 16(4) (2012) 84–88.
11. H. L. Truong, S. Dustdar and K. Bhattacharya, Programming hybrid services in the cloud, in ICSOC, eds. Chengfei Liu, Heiko Ludwig, Farouk Toumani and Qi Yu, Lecture Notes in Computer Science, Vol. 7636 (Springer, 2012), pp. 96–110.
12. P. Mell and T. Grance, The NIST definition of cloud computing, NIST Special Publication 800-145 (September 2011).
13. H. L. Truong, S. Dustdar, D. Baggio, S. Corlosquet, C. Dorn, G. Giuliani, R. Gombotz, Y. Hong, P. Kendal, C. Melchiorre, S. Moretzky, S. Peray, A. Polleres, S. Reiff-Marganiec, D. Schall, S. Stringa, M. Tilly and H. Q. Yu, Incontext: A pervasive and collaborative working environment for emerging team forms, in SAINT (IEEE Computer Society, 2008), pp. 118–125.
14. O. Nov, D. Anderson and O. Arazy, Volunteer computing: A model of the factors determining contribution to community-based scientific research, in Proc. 19th Int. Conf. World Wide Web, WWW'10 (ACM, New York, NY, USA, 2010), pp. 741–750.


15. A. J. Quinn and B. B. Bederson, Human computation: A survey and taxonomy of a growing field, in CHI, eds. D. S. Tan, S. Amershi, B. Begole, W. A. Kellogg and M. Tungare (ACM, 2011), pp. 1403–1412.
16. W. Mason and D. J. Watts, Financial incentives and the "performance of crowds", in Proc. ACM SIGKDD Workshop on Human Computation, HCOMP'09 (ACM, New York, NY, USA, 2009), pp. 77–85.
17. O. Tokarchuk, R. Cuel and M. Zamarian, Analyzing crowd labor and designing incentives for humans in the loop, IEEE Internet Comput. 16(5) (2012) 45–51.
18. O. Scekic, H.-L. Truong and S. Dustdar, Incentives and rewarding in social computing, Commun. ACM 56(6) (2013) 72–82.
19. J. J. Horton and L. B. Chilton, The labor economics of paid crowdsourcing, in Proc. 11th ACM Conf. Electronic Commerce, EC'10 (ACM, New York, NY, USA, 2010), pp. 209–218.
20. S. Dustdar and K. Bhattacharya, The social compute unit, IEEE Internet Comput. 15(3) (2011) 64–69.
21. O. Scekic, H.-L. Truong and S. Dustdar, Programming incentives in information systems, in 25th Int. Conf. Advanced Information Systems Engineering (CAISE 2013), Valencia, Spain, 17–21 June 2013.
22. S. Dustdar, M. Z. C. Candra and H.-L. Truong, Provisioning quality-aware social compute units in the cloud, in Service-Oriented Computing - Proc. 9th Int. Conf. ICSOC 2013 (Springer, Berlin, Germany, 2–5 December 2013).
23. M. Z. C. Candra, H. L. Truong and S. Dustdar, Modeling elasticity trade-offs in adaptive mixed systems, in WETICE, eds. S. Reddy and M. Jmaiel (IEEE, Hammamet, Tunisia, 17–20 June 2013), pp. 21–26.
24. T. Binz, G. Breiter, F. Leymann and T. Spatzier, Portable cloud services using tosca, IEEE Internet Comput. 16(3) (2012) 80–85.
25. S. Tai, J. Nimis, A. Lenk and M. Klems, Cloud service engineering, in Proc. 32nd ACM/IEEE Int. Conf. Software Engineering - Volume 2, ICSE'10 (ACM, New York, NY, USA, 2010), pp. 475–476.
26. R. N. Calheiros, C. Vecchiola, D. Karunamoorthy and R. Buyya, The aneka platform and qos-driven resource provisioning for elastic applications on hybrid clouds, Future Generation Comp. Syst. 28(6) (2012) 861–870.
27. P. Alvaro, W. R. Marczak, N. Conway, J. M. Hellerstein, D. Maier and R. Sears, Dedalus: Datalog in time and space, in Datalog, eds. O. de Moor, G. Gottlob, T. Furche and A. J. Sellers, Lecture Notes in Computer Science, Vol. 6702 (Springer, Oxford, UK, 16–19 March 2010), pp. 262–281.
28. A. Kittur, B. Smus, S. Khamkar and R. E. Kraut, Crowdforge: Crowdsourcing complex work, in Proc. 24th Annual ACM Symp. User Interface Software and Technology, UIST'11 (ACM, New York, NY, USA, 2011), pp. 43–52.
29. G. Little, L. B. Chilton, M. Goldman and R. C. Miller, Turkit: Tools for iterative tasks on mechanical turk, in Proc. ACM SIGKDD Workshop on Human Computation, HCOMP'09 (ACM, New York, NY, USA, 2009), pp. 29–30.
30. S. Ahmad, A. Battle, Z. Malkani and S. Kamvar, The jabberwocky programming environment for structured social computing, in Proc. 24th Annual ACM Symp. User Interface Software and Technology, UIST'11 (ACM, New York, NY, USA, 2011), pp. 53–64.
31. A. Brew, D. Greene and P. Cunningham, Using crowdsourcing and active learning to track sentiment in online media, in Proc. 2010 Conf. ECAI 2010: 19th European Conference on Artificial Intelligence (IOS Press, Amsterdam, The Netherlands, 2010), pp. 145–150.


32. C. Dorn and R. N. Taylor, Co-adapting human collaborations and software architectures, in ICSE, eds. M. Glinz, G. C. Murphy and M. Pezzè (IEEE, 2012), pp. 1277–1280.
33. P. Minder and A. Bernstein, Crowdlang: A programming language for the systematic exploration of human computation systems, in SocInfo, eds. K. Aberer, A. Flache, W. Jager, L. Liu, J. Tang and C. Guéret, Lecture Notes in Computer Science, Vol. 7710 (Springer, Lausanne, Switzerland, 5–7 December 2012), pp. 124–137.


On the Elasticity of Social Compute Units

Mirela Riveni, Hong-Linh Truong, and Schahram Dustdar

Distributed Systems Group, Vienna University of Technology
{m.riveni,truong,dustdar}@infosys.tuwien.ac.at

Abstract. Advances in human computation make it feasible to utilize human capabilities as services. At the same time, we have witnessed emerging collective adaptive systems that are formed from heterogeneous types of compute units to solve complex problems. The recently introduced Social Compute Units (SCUs) are one type of such systems, with human-based services as their fundamental compute units. While there is related work on forming SCUs and optimizing their performance with adaptation techniques, most of it focuses on static SCU structures. To provide better runtime performance and flexibility management for SCUs, we present an elasticity model for SCUs and mechanisms for their elastic management, which allow for certain fluctuations in size, structure, performance and quality. We model states of elastic SCUs and present APIs for managing SCUs as well as metrics for controlling their elasticity, with which it is possible to tailor their performance parameters at runtime within the customer-set constraints. We illustrate our contribution with an example algorithm.

Keywords: Social Compute Units, Elasticity, Adaptation, Collective Adaptive Systems.

M. Jarke et al. (Eds.): CAiSE 2014, LNCS 8484, pp. 364–378, 2014. © Springer International Publishing Switzerland 2014

1 Introduction

In recent years, new forms of collective adaptive systems (CASs) that consider heterogeneous types of compute units/resources (e.g., software services, human-based services and smart devices) have emerged [20]. These systems allow compute units to be flexibly added and/or removed, and different collectives can overlap with each other by utilizing each other's resources. Compute units within collectives are collaborative and manageable, and may be given decision-making responsibilities. With the advance of human computation [17], it has become possible to form CASs that include human-based services [21] as compute units. Social Compute Units (SCUs), introduced in [6], can be considered one type of these collective adaptive systems. They are virtual compositions of individual human compute units, performing human computation tasks with cloud-like behavior. SCUs are possible today because of the human resource pools provided by human computation platforms (e.g., crowdsourcing platforms, social networking platforms and expert networks), which have made it possible to investigate ways of utilizing human computation under the service-oriented computing paradigm. However, due to the unpredictability of


human behavior, human-based services bring considerable management challenges. This is especially the case with collective adaptive systems such as SCUs, where managing resources is different from, and more complex than, managing crowd workers who work individually or collaborations with a fixed number of resources. In this context, traditional platforms that support virtual fixed-size collaborations might not be as efficient as those that support SCUs with elastic capabilities, which allow for a variable number of resources with scalable capabilities. There are several reasons for this. First, unexpected tasks might be generated at runtime, which may require new types of elements with new types of capabilities. In fixed-resource collaborations, existing members usually need to learn how to perform these tasks, and thus the work might be delayed and/or executed with lower quality. Next, a human compute unit might temporarily misbehave, or its performance might degrade. Excluding it would degrade the collaboration and the performance of the collective if another appropriate unit is not employed in its place. Furthermore, due to badly planned delegations, it is often the case that some resources are overloaded while others are underutilized. The latter is a consequence of the reliance on human resource availability, one of the fundamental problems in social computing. In this context, the willingness of a human resource to execute a particular task at a specific point in time is often overlooked. However, this is crucial for platforms supporting work that includes human computation because, even if we assume that human resources can use "unlimited" software-based resources, e.g., using the cloud, human behavior is dynamic and highly unpredictable.
The aforementioned problems show that there is a need for management mechanisms that support elasticity by scaling the size and the computing capabilities of SCUs. The authors of [8],[21] identify the underlying challenge in provisioning SCU elasticity as the lack of techniques that enable proactive provisioning of human capabilities in a uniform way at large scale. Nevertheless, assuming that human-based services can be utilized in a cloud-like way, systems should support runtime elastic coordination of collectives. To address these issues, in this paper we investigate and provide runtime mechanisms with the notion of elasticity in mind, so that platforms can provide elastic capabilities of human-based compute units/SCUs that can be managed flexibly in terms of the number of resources as well as their parameters, such as cost, quality and performance time. Hence, our key contributions are:

– conceptualizing and modeling the SCU execution phase and states,
– defining SCU-elasticity properties and APIs,
– designing an SCU provisioning platform model with elastic capabilities.

The rest of this paper is organized as follows. In Section 2 we present a motivating example and discuss challenges in elasticity provisioning. In Section 3 we describe the SCU concept, model the execution mode of an SCU and present our platform for managing elastic SCUs. Section 4 illustrates the feasibility of our approach. We present related work in Section 5 and conclude the paper in Section 6.


2 Motivation, Background and Research Statement

Scenario. Let us consider a concrete scenario of a software development project, e.g., for a health-care-specific system, and assume that a software start-up company is engaged for its execution and completion. To deliver the end artifact, this type of project requires a diverse set of skills. Hence, in addition to the company employees, some specific parts of the project might need to be outsourced, e.g., to experts with experience in health care but also to IT professionals with skills that the start-up lacks. To solve this problem of skill deficiency, an SCU is formed that includes human-based resources/services both from the software development company and from "outside" experts. The SCU utilizes software services for collaboration and task execution. In turn, the human-based services and the software services that they utilize are supported by an SCU provisioning and management platform that coordinates their performance. The challenges that arise in this scenario come from the importance of performance and quality of results in paid expert units. A software solution needs to be delivered on time and in accordance with customer requirements and budget limitations. Fixed composite units with a known number of resources, including outsourced ones, often have problems with overloaded resources and may result either in delayed projects of good quality or in on-time delivery of solutions of lower quality than desired. Problems such as those mentioned in the introduction also appear. However, with the availability of online resource pools from human clouds [10], human-based services can be acquired and released from SCUs on demand, so as to best meet the customer's performance and quality requirements. Hence, we assume that the "outside" experts for our software development SCU can be recruited from human clouds on demand.
Under these assumptions, and if the SCU supporting platform incorporates mechanisms that allow elasticity, an initial SCU will be able to adapt at runtime with respect to certain parameters, such as the number of its compute units, unit types, structure and performance. This can be particularly important in agile software development, where both the customer requirements and the development process evolve iteratively, and teams collaborate closely with the customer and are more responsive to change. Our hypothesis is that, as a consequence of these elastic capabilities, SCUs will provide higher efficiency at runtime. Thus, platforms that include mechanisms and techniques for runtime support of the coordination of SCUs with elastic capabilities are crucial.

Background and Challenges. As mentioned above, our approach is based on the concept of Social Compute Units [6], which fundamentally represent virtual collective systems with human-based resources as compute units that are brought together to work on a common goal with a deadline. These compute units can belong to an enterprise, or they can be invoked from a crowdsourcing platform, an expert network or any platform that hosts pools of available resources for human (including social) computation. In relation to the work of the coauthors in [21], in this paper we use the term Individual Compute Units (ICUs) for SCU members, which represent human-based services that can be programmed in a manner that


they can execute tasks on demand. Thus, an SCU is composed on request from a customer who defines requirements and sets constraints (e.g., budget, deadline). It has its own compute (performance) power, it can be programmed (managed), and it is intended to work using a collaboration platform hosted in the cloud. The behavior of an SCU is cloud-like, in the sense that its duration and performance depend on its goal, on customer constraints and on events generated during its execution. Considerable related research has focused on formation algorithms [1],[13] and on performance optimization within fixed teams. However, SCUs have a different nature than teams, as the SCU structures and capabilities can be programmed and SCU members can be elastically managed at runtime. Thus, even if some work on teams can be utilized, there is a research gap concerning SCU elasticity during the execution phase, in terms of resource numbers but also in terms of non-functional parameters (NFPs) such as cost, reliability and performance time. Human cloud platforms have been classified into categories, one of which focuses on project governance and complex coordination between resources [10], as opposed to crowdsourcing platforms, where the responsibility for project governance does not lie entirely with the platform. Examples of this type of platform are TopCoder^1 and workio^2. Even though such platforms can manage the lifecycle of collaborations, we argue that they lack adaptation techniques and flexibility of resource management in terms of elasticity at runtime. For example, the pricing in these cases is not set by the customer as in crowdsourcing; rather, the human-based services set their own prices. Thus, with these models a collective of human resources could be automatically "programmed" so that, if the number and type of resources change, the cost does not exceed the customer's total budget.
This is one example of NFP elasticity in terms of cost. Consequently, identifying possible elastic operations that can be triggered at critical points in time is an important challenge for optimizing an SCU's performance. In the context of the aforementioned scenario and of what current platforms lack, some of the research questions that we confront are:

– Given an initially formed SCU and a set of monitored team performance metrics, what is the set of actions that can enable SCU elastic capabilities in situations where performance is degraded and violates a threshold value for a customer-set constraint?
– When is optimization (e.g., load balancing) within an SCU not enough, so that a reorganization is needed? Which tasks need to be reassigned, when, and to whom (a resource within or outside the SCU)?

To sum up, this paper investigates the following fundamental challenge: What are the mechanisms that a human computation system needs to deploy so as to provision SCUs with elastic capabilities, both in terms of resource scaling and in terms of variable properties?

1 http://www.topcoder.com/
2 https://www.workio.com/


M. Riveni, H.-L. Truong, and S. Dustdar

3 Social Compute Units and Elasticity

3.1 Elastic Social Compute Units

Elastic SCUs have elastic capabilities that can be triggered at runtime to tailor their performance to best fit client requirements. Because human-based resources are unpredictable and dynamic, their skills, price, interest, and availability can change over time and within a specific context. However, as stated in [6], the concept of the SCU does not have a notion of elasticity in itself. Thus, an SCU provisioning platform, which creates, deploys, and supports the execution of SCUs, needs to include mechanisms for scaling an SCU up or down as needed; as mentioned above, an SCU's performance parameters vary with this scaling as well. These mechanisms should ensure that at each point in time these parameters are within desired levels and comply with customer constraints. For our purposes, we conceptually define the elasticity of SCUs as follows:

Definition 1. The Elasticity of Social Compute Units is the ability of SCUs to adapt at runtime in an automatic or semi-automatic manner, by scaling in size and/or reorganizing and rescheduling, such that the variations in the overall performance indicators, such as capability, availability, effort, productivity, and cost, are at each point in time optimal within the boundaries of the customer-set constraints.

To support elasticity for SCUs, we identify an execution model for an SCU as a prerequisite, as previous work identifies SCU phases but does not go into detail on the execution phase. An SCU lifecycle consists of the following stages: request, create, assimilate, virtualize, deploy, and dissolve [6]. The elasticity mechanisms are needed after the virtualization stage, in the execution phase, which we model next.

3.2 SCU Execution Model

We denote a cloud of ICUs (e.g., from online platforms and/or an enterprise-internal pool) as the universal set $R = \{r_1, r_2, r_3, \ldots, r_n\}$, and the set of ICUs that are members of a particular SCU as $S = \{s_1, s_2, s_3, \ldots, s_n\}$, where $S \subset R$. Let the set of tasks to be executed by a specific SCU be $T = \{t_1, t_2, t_3, \ldots, t_n\}$. For each task $t_i \in T$, we denote the set of matching, appropriate, and possible ICUs that can perform the task $t_i$ as $P = \{p_1, p_2, p_3, \ldots, p_n\}$, where $P \subset R$. Depending on constraints, the following can be valid in different situations: $P \subset S^c$, $P \subset S$, or $P = S$. To provide elasticity, ICUs from $S$ can be released and new ICUs from $P$ can be added to $S$; therefore, $|S|$ might change at runtime. We model an ICU belonging to the cloud of ICUs $R$ with the following set of global properties: $ICU_{prop}^{gl} = \{Id_{icu}, skillset, reputation, price, state_{global}\}$. Moreover, an ICU, from the perspective of the specific SCU of which it is a member, is modeled with its local properties as $ICU_{prop}^{lscu} = \{Id_{scu}, ICU_{prop}^{gl}, state_{local}, productivity, trust\}$, where reputation, state, productivity, and trust are aggregate metrics that we discuss further in this section.
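As a minimal sketch of this model, the Java fragment below represents the global and local ICU property sets as small immutable classes and keeps S as a mutable subset of R. All type and method names (IcuGlobal, IcuLocal, ScuMembership, add, release) and the initial local state "running" are assumptions made for illustration; the text above does not prescribe a concrete data structure.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical class for the global property set of an ICU in the cloud R.
final class IcuGlobal {
    final String idIcu;
    final Set<String> skillset;
    final double reputation;
    final double price;
    final String stateGlobal;

    IcuGlobal(String idIcu, Set<String> skillset, double reputation,
              double price, String stateGlobal) {
        this.idIcu = idIcu;
        this.skillset = skillset;
        this.reputation = reputation;
        this.price = price;
        this.stateGlobal = stateGlobal;
    }
}

// Hypothetical class for the local property set: the view of one SCU on a member ICU.
final class IcuLocal {
    final String idScu;
    final IcuGlobal global;
    final String stateLocal;
    final double productivity;
    final double trust;

    IcuLocal(String idScu, IcuGlobal global, String stateLocal,
             double productivity, double trust) {
        this.idScu = idScu;
        this.global = global;
        this.stateLocal = stateLocal;
        this.productivity = productivity;
        this.trust = trust;
    }
}

// Keeps S as a subset of R; |S| changes as ICUs are added or released.
final class ScuMembership {
    private final String idScu;
    private final Map<String, IcuGlobal> cloud;                           // R, keyed by ICU id
    private final Map<String, IcuLocal> members = new LinkedHashMap<>();  // S

    ScuMembership(String idScu, Map<String, IcuGlobal> cloud) {
        this.idScu = idScu;
        this.cloud = new HashMap<>(cloud);
    }

    // Adds an ICU from R to S; returns false if it is unknown or already a member.
    boolean add(String idIcu) {
        IcuGlobal g = cloud.get(idIcu);
        if (g == null || members.containsKey(idIcu)) {
            return false;
        }
        members.put(idIcu, new IcuLocal(idScu, g, "running", 0.0, 0.0));
        return true;
    }

    // Releases an ICU from S; the ICU stays available in R.
    boolean release(String idIcu) {
        return members.remove(idIcu) != null;
    }

    int size() {
        return members.size();
    }
}
```

Because membership is checked against R on every add, the invariant S ⊂ R holds by construction in this sketch.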

On the Elasticity of Social Compute Units


States. An SCU in execution mode, at a specific time point τ, can be in one of the following action-states: SCUstate(τ) = {running, suspending, resuming, expanding, reducing, substituting, stopped}. These states are listed in Table 1. The mentioned states are basic (atomic) ones, and a combination of them makes a complex SCU execution state. For example, an SCU might be running but, due to an adaptation action, multiple ICUs (a cluster of ICUs) within the SCU might be suspended at the same time, while a new ICU is being added in the expanding state. In this case, because running, suspending, and expanding are all execution states of an SCU, running ∧ suspending ∧ expanding is also an SCU state. However, some states are mutually exclusive if they refer to the whole SCU and cannot be aggregated, i.e., an SCU cannot be in a running ∧ stopping state. If one of the atomic states refers to (a change in) an individual ICU or a cluster of ICUs, an SCU can be in a running ∧ expanding state or, for example, in a running ∧ reducing state. Thus, the aggregate states are valid in the context of the scope in which a state-changing action takes place. Table 1 also shows the scope for which the state-changing actions are valid, in terms of the whole SCU, a cluster of ICUs, or ICUs only. The state of an SCU as a whole is tightly coupled with ICU states and is crucial when applying elastic strategies in two ways: 1) the state of the SCU can be a trigger for elastic operations on the SCU, and 2) it can be a desired result after applying these operations.

Table 1. Fundamental state alternatives of the SCU Execution phase

Trigger action   | State        | Scope              | Triggering role check marks (Platform, Customer, ICU)
Run              | Running      | SCU                | √ √
Suspend          | Suspending   | SCU/ICUcluster/ICU | √ √ √
Activate         | Resuming     | SCU/ICUcluster/ICU | √ √ √
Add              | Expanding    | ICUcluster/ICU     | √ √ √
Exclude          | Reducing     | ICUcluster/ICU     | √ √ √
Stop/Exclude/Add | Substituting | ICUcluster/ICU     | √ √ √
Stop             | Stopping     | SCU                | √ √
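The scope rule for aggregate states can be sketched in Java as follows. The enum mirrors the State and Scope columns of Table 1; the validity rule (at most one state that applies only to the whole SCU) is an assumption distilled from the running ∧ stopping example, and the names ScuState, ScuStates, and isValidCombination are hypothetical.

```java
import java.util.Set;

// Hypothetical encoding of Table 1: each state records the scopes it can apply to.
enum ScuState {
    RUNNING(true, false),
    SUSPENDING(true, true),
    RESUMING(true, true),
    EXPANDING(false, true),
    REDUCING(false, true),
    SUBSTITUTING(false, true),
    STOPPING(true, false);

    final boolean wholeScu;  // state can apply to the SCU as a whole
    final boolean icuLevel;  // state can apply to an ICU or a cluster of ICUs

    ScuState(boolean wholeScu, boolean icuLevel) {
        this.wholeScu = wholeScu;
        this.icuLevel = icuLevel;
    }
}

final class ScuStates {
    private ScuStates() {}

    // Assumed rule: a combination is valid when at most one of its states can
    // only be read at whole-SCU scope; all other states must be readable at
    // ICU or ICU-cluster scope.
    static boolean isValidCombination(Set<ScuState> states) {
        if (states.isEmpty()) {
            return false;
        }
        int scuOnly = 0;
        for (ScuState s : states) {
            if (s.wholeScu && !s.icuLevel) {
                scuOnly++;
            }
        }
        return scuOnly <= 1;
    }
}
```

Under this rule, running ∧ suspending ∧ expanding is accepted because suspending and expanding can refer to individual ICUs, while running ∧ stopping is rejected because both refer to the whole SCU.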

SCU Elasticity Management. Table 1 shows three ways of triggering adaptation: platform-based, customer-based, and ICU-based. To clarify, a platform that supports an SCU should have the mechanisms to support all of its execution states elastically. Thus, all state-changing actions can be triggered in an automated way, as shown in Table 1. Referring to our motivational scenario, in rare cases the customer could suspend the whole software-development SCU until he has consulted and decided on crucial changes. There are other state-changing actions that the customer can also trigger (shown with light gray check signs). Table 1 also shows which state-changing actions can be most affected by communication and ICU feedback, which we illustrate in Section 4.

We show an example of a software-development SCU in execution mode in Fig. 1. At a specific time point, ICUs with developer skills are in the running state while designers are suspended. Next, due to an event when expert information is needed (e.g., health-care information in our scenario), the SCU is expanded by including an ICU with specific expertise and consultancy skills, while a designer ICU is resumed. At another time point each ICU is running, while before dissolving, the SCU is reduced, as ICUs with designer and consultancy skills have finished their tasks. Adaptation actions on an SCU can change its execution model not only in terms of the state but also in terms of its execution structure. These changes are interdependent with task structure changes and ICU state changes.

Fig. 1. An illustrative example of an SCU in execution: expanding and reducing states

Basic ICU and SCU Metrics. The decision to apply an elastic adaptation action depends on events that are triggered by two-level monitoring of global and local metrics, namely to detect: 1) a violation of preset threshold values for overall SCU performance, and 2) which ICUs have contributed to the SCU's performance degradation. The focus of this paper is not to investigate extensive metrics, as many are context dependent. Thus, in this section we list and define some basic ones that we identify as useful for SCUs at runtime. Project Effort and Productivity have been listed as performance measures for software projects [12]. Modified versions of these metrics can be reused for SCUs with software-related and other goals. Thus, we define the SCU Effort as the sum of the average time spent by each ICU on each assigned task. The SCU task completion ratio gives the fraction of completed tasks among those assigned. However, this does not always mean that the results of all completed tasks are also approved. Thus, we also consider the number of valid or approved tasks, which we use for calculating the productivity of an SCU. We define SCU Productivity as the ratio of approved tasks to SCU Effort, giving an average tasks-per-time-unit value. The SCU Reputation is a weighted sum of the reputation score of each ICU regarding its expertise in the skill for which it is included in the SCU. We model the SCU Reputation in this way because some ICUs in a specific SCU are more crucial than others, as they execute more critical tasks. We define reputation(si) as a function of (Success Rate, Approved Tasks, Timeliness, Reliability, SocialTrust). The SCU Cost is an aggregate sum of the cost of each ICU for each task according to its type and skill-type requirements. The metrics are given in Table 3, where si ∈ S and tx ∈ T. See Table 2 for notation on individual metrics, some of which we use in calculating those in Table 3. The described metrics are dynamic, and a platform supporting elastic SCUs should be able to monitor and utilize them in runtime adaptation strategies. From all that was discussed, we can now characterize the elastic profile of an SCU within time τ as SCUexec(τ) = {SCUsize(τ), SCUstructure(τ), SCUstate(τ), SCUeffort(τ), SCUproductivity(τ), SCUcost(τ), SCUreputation(τ)}.

Table 2. Notation and description of basic ICU metrics and parameters

Metric               | Description
$n_{req}$            | Number of willingness requests sent from the scheduler to an ICU
$n_{ack}$            | Number of willingness acknowledgments sent to the scheduler by an ICU
$n_{reasgn}$         | Number of tasks reassigned to an ICU
$n_{sucreasgn}$      | Number of successfully executed reassigned tasks by an ICU
$n_{approved}(s_i)$  | Total number of successfully executed/approved tasks for an ICU
$\tau(s_i, t_x)$     | Processing time for task x executed by an ICU
$c(s_i, t_x)$        | Cost for task x when executed by an ICU
$c(s_i^{nw}, t_x)$   | Cost for task x when reassigned to a new ICU

Table 3. Example metrics of SCU performance

SCU metric                | Definition
SCU Total Completed Tasks | $CT(scu_i) = \sum_{i=1}^{|S|} n_{completed}(s_i)$
SCU Approved Tasks        | $AT(scu_i) = \sum_{i=1}^{|S|} n_{approved}(s_i)$
SCU Success Rate          | $ST(scu_i) = AT(scu_i) / CT(scu_i)$
SCU Effort                | $Effort(scu_i) = \frac{1}{CT(scu_i)} \sum_{i=1}^{|S|} \sum_{x=1}^{m} \tau(s_i, t_x)$
SCU Productivity          | $Productivity(scu_i) = AT(scu_i) / Effort(scu_i)$
SCU Reputation            | $Reputation(scu_i) = \sum_{i=1}^{|S|} w_{expertise} \ast reputation(s_i)$
SCU Cost                  | $Cost(scu_i) = \sum_{i=1}^{|S|} \sum_{x=1}^{m} c(s_i, t_x)$
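The Table 3 definitions translate directly into code. The following Java sketch computes the completed and approved task counts, success rate, effort, productivity, and cost from per-ICU counters; SCU Reputation is left out because its weights are domain specific. The classes IcuStats and ScuMetrics are hypothetical, and the sketch does not guard against an SCU with zero completed tasks.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical per-ICU input for the Table 3 formulas.
final class IcuStats {
    final int completed;       // n_completed(s_i)
    final int approved;        // n_approved(s_i)
    final double[] taskTimes;  // tau(s_i, t_x), one entry per task
    final double[] taskCosts;  // c(s_i, t_x), one entry per task

    IcuStats(int completed, int approved, double[] taskTimes, double[] taskCosts) {
        this.completed = completed;
        this.approved = approved;
        this.taskTimes = taskTimes.clone();
        this.taskCosts = taskCosts.clone();
    }
}

final class ScuMetrics {
    private ScuMetrics() {}

    // CT: sum of completed tasks over all ICUs in S.
    static int completedTasks(List<IcuStats> scu) {
        int sum = 0;
        for (IcuStats s : scu) {
            sum += s.completed;
        }
        return sum;
    }

    // AT: sum of approved tasks over all ICUs in S.
    static int approvedTasks(List<IcuStats> scu) {
        int sum = 0;
        for (IcuStats s : scu) {
            sum += s.approved;
        }
        return sum;
    }

    // ST = AT / CT.
    static double successRate(List<IcuStats> scu) {
        return (double) approvedTasks(scu) / completedTasks(scu);
    }

    // Effort = (1 / CT) * sum of processing times over all ICUs and tasks.
    static double effort(List<IcuStats> scu) {
        double total = 0.0;
        for (IcuStats s : scu) {
            total += Arrays.stream(s.taskTimes).sum();
        }
        return total / completedTasks(scu);
    }

    // Productivity = AT / Effort, i.e. approved tasks per time unit.
    static double productivity(List<IcuStats> scu) {
        return approvedTasks(scu) / effort(scu);
    }

    // Cost = sum of per-task costs over all ICUs and tasks.
    static double cost(List<IcuStats> scu) {
        double total = 0.0;
        for (IcuStats s : scu) {
            total += Arrays.stream(s.taskCosts).sum();
        }
        return total;
    }
}
```

For example, an SCU with one ICU that completed three tasks (two approved, 12 time units in total) and another that completed and had approved one task (8 time units) has CT = 4, AT = 3, an effort of 5 time units per completed task, and a productivity of 0.6 approved tasks per time unit.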

Elasticity APIs. To be able to provide SCU elasticity capabilities, which involve ICUs having the aforementioned (and other domain-dependent) properties, we need common APIs for their description and management. We are currently developing APIs, which we categorize into ICU-description APIs for manipulating ICU profiles, ICU-scheduling APIs for ICU management and elastic operations, and communication operations. Table 4 describes some specific methods that we develop to be utilized in strategies providing SCU elastic capabilities.

Table 4. Example API, abstract methods for ICU manipulation

Scheduling method                          | Description
abstract AddICU()                          | adds an ICU to the SCU
abstract void SuspendICU(SCU scu)          | brings an ICU to idle state, still included in the SCU
abstract void ExcludeICU(SCU scu)          | excludes an ICU from the SCU
abstract void ResumeICU(SCU scu)           | restarts an ICU and its associated tasks
abstract void ReserveICU(Task t)           | reserves an alternative ICU for an already assigned task
abstract void SubstituteICU()              | substitutes an ICU with a reserved one
public List<ICU> getAllICUinSCU(SCU scu)   | returns ICUs within the SCU
public List<ICU> getSuspendedICUs(SCU scu) | returns suspended ICUs within an SCU
public List<ICU> getIdleICUs(SCU scu)      | returns idle ICUs in an SCU
public List<ICU> getReservedICUs(Task t)   | maintains an ordered list of top appropriate ICUs for a certain task (ICUs might be in/out of the specific SCU)

3.3 Elastic SCU Provisioning Platform

Figure 2 shows a model of our concept of an elastic SCU provisioning platform that, by utilizing our SCU execution model, metrics, and API, is able to support elastic SCU management. The platform supports the following behavior: a customer (SCU consumer) submits a project/request with multiple tasks to it. When submitting tasks and a request for SCU formation, the client specifies functional and non-functional ICU requirements such as skill, reputation, and cost. In addition, the client specifies overall SCU constraints, such as total budget and deadline. The platform integrates an SCU formation component with ICU selection/ranking algorithms. Resource selection and initial task assignment are not in our focus. The output of the SCU creation/formation component is an initial SCU created by selecting ICUs from human cloud providers. This SCU is "fed" to a controller, a component that hosts monitoring and adaptation algorithms utilizing APIs for elasticity control, which provide SCU runtime management. The challenge for this component is to monitor and adapt the SCU in accordance with customer-set constraints, such that the SCU gives the maximum performance and quality within the preset boundaries for time-related, cost-related, and quality-related indicators. Different scheduling and ICU management algorithms can be plugged into the platform, which would support the SCU during its lifecycle.


Fig. 2. Conceptual platform model supporting elastic SCUs

4 Illustrating Example

In this section we show the benefit of having explicit state management, metrics, and elasticity for supporting elastic SCUs. We present the way our framework can reduce the complexity of developing elasticity strategies for SCUs. Typically, an elasticity strategy for an SCU is a domain-specific problem. In the following, we illustrate how an ICU feedback-based elastic SCU management strategy can be implemented. As ICUs within an SCU are inherently dynamic and unpredictable, we cannot always fully rely on system-based availability information concerning an ICU, and fully automated task assignment and scheduling might not always be the most suitable approach, especially when tasks may be generated unexpectedly at runtime. Hence, we propose an SCU adaptation strategy that uses ICU acknowledgments of their willingness to work on specific tasks. More specifically, these acknowledgments are sent in response to system requests for availability guarantees for the execution of tasks that need reassignment. This strategy supports elasticity in the sense that it departs from the idea that a customer knows in advance which and how many ICUs will contribute and the final cost of the "project". However, the customer budget is kept within its limits, as the cost may vary within these limits, just as the size and structure of the assembled SCU may vary over time until the final result is returned. Our example of an elastic SCU mechanism is a semi-automatic task scheduling strategy where part of the coordination for task reassignment is delegated to ICUs. With this approach a task is reassigned to a more available ICU, upon the ICU's own approval and when certain conditions apply (e.g., when a threshold is reached). Thus, the task reassignment decisions are partly based on feedback from ICUs, and in this way the elastic SCU management is influenced by "human in the loop" decentralized coordination. With this example, we show how new SCU metrics can be derived and how APIs for elastic capabilities can be used.

Deriving New SCU Metrics. By utilizing APIs for obtaining SCU metrics at runtime, one can calculate the willingness of an ICU and the willingness confidence score as follows (see Table 2 for notation):

$$Willingness = \frac{n_{ack}}{n_{req}}, \quad Success_{reasgn} = \frac{n_{sucreasgn}}{n_{reasgn}}, \quad WCnf = \frac{n_{ack}}{n_{req}} \times \frac{n_{sucreasgn}}{n_{reasgn}}.$$
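A minimal Java sketch of these three ratios is given below. The class and method names are hypothetical, and returning 0 when a denominator is zero (an ICU that has never been asked or never been reassigned a task) is an assumption, since that case is not specified.

```java
// Hypothetical helper computing the willingness metrics from the Table 2 counters.
final class WillingnessMetrics {
    private WillingnessMetrics() {}

    // Willingness = n_ack / n_req
    static double willingness(int nAck, int nReq) {
        return ratio(nAck, nReq);
    }

    // Success_reasgn = n_sucreasgn / n_reasgn
    static double successReasgn(int nSucReasgn, int nReasgn) {
        return ratio(nSucReasgn, nReasgn);
    }

    // WCnf = Willingness * Success_reasgn
    static double confidence(int nAck, int nReq, int nSucReasgn, int nReasgn) {
        return willingness(nAck, nReq) * successReasgn(nSucReasgn, nReasgn);
    }

    // Assumption: an undefined ratio (zero denominator) is reported as 0.
    private static double ratio(int numerator, int denominator) {
        return denominator == 0 ? 0.0 : (double) numerator / denominator;
    }
}
```

An ICU that acknowledged 8 of 10 willingness requests and successfully executed 3 of 4 reassigned tasks has a willingness of 0.8, a reassignment success rate of 0.75, and a WCnf of about 0.6.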

We derive the willingness confidence value from two basic indicators: ICU willingness and the rate of success in executing reassigned tasks. The willingness confidence score WCnf is computed from the number of acknowledgments that an ICU has sent to the scheduler in response to its requests for willingness, and the number of successfully completed tasks that were assigned to it in response to these acknowledgments. Thus, it is an indicator of the reliability of an alternative ICU's guarantee of its willingness to work.

Programming an Elasticity Strategy Using Elasticity APIs. Considering worker willingness provides a way to measure and control the unpredictability/reliability of ICUs by asking them for task-execution guarantees, because it provides a way to compare their "statements" with their actual behavior. This is what the value of Willingness Confidence indicates. In this strategy we assume that each incoming task is assigned to the ICU at the top of a ranked list returned by a ranking algorithm, and that references to the first x most appropriate ICUs from the ranked list are stored as reserves/alternatives for each task. The algorithm can be summarized with the following steps:

1. When a preset threshold related to a task that is already assigned to the most appropriate ICU matching the requirements is reached (e.g., the task's waiting time in an ICU's task queue), the scheduler sends a request for execution willingness to the next top x ICUs that it has references to (reserves from the initial ranked list) and that are idle or have task queues smaller than that of the ICU to which the task was initially assigned. With this request for willingness, it notifies them that there is a task that they can work on. This request is a resource availability check; it is a request for a resource's willingness to work on a specific task, as a form of worker-side commitment or guarantee that the task will be executed by it.
2. Each ICU that receives this request and is ready and wishes to work on the task sends the scheduler a willingness acknowledgment (Ack) as feedback to this request.
3. The scheduling component reassigns the task that reached the threshold to the alternative resource that has sent a willingness acknowledgment and that is idle or has the smallest task queue. Priority is given to less loaded ICUs that are already members of the SCU.


Algorithm 1. Task-reassignment with ICU-side assurance

Require: scuTasks for SCU
Require: customer constraints on NFP
for all tasks in T do                          // rank matching ICUs and return the first 10 appropriate
    P ← getReservedICUList(Task t)             // store reserve ICUs
    assign task t to top-ranked ICU r
    if r is not an element in SCU then
        SCU ← addICU()                         // add ICU r to SCU x and update its profile
    if task.taskQueueTime == task.timeThreshold then
        if r == idle then
            SCU ← removeICU()                  // reduction: remove ICU r from SCU
        for all ICU in P do
            getICUState(ICU ICUid)
            if ICU STATE == idle AND icuReserve.tQueue() < r.tQueueSize()/2 then
                willingnessReqMessage()
        for all icuReserve.sentAck == true in ascending order of icuResource.taskQueue do
            if resource belongs in SCU then
                substituteICU()                // re-assign task to SCU member and update its profile
                break
            substituteICU()                    // re-assign task to external ICU and update its profile
            SCU ← addICU()                     // expansion: include resource y in SCU
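The selection step of this strategy can be sketched in Java as below. The sketch follows the three prose steps: it filters the reserves that would receive a willingness request (idle, or less loaded than the current assignee), keeps those that acknowledged, and prefers SCU members before ordering by queue size. The Reserve and Reassignment types are hypothetical, an ICU is assumed to be idle when its queue is empty, and the human acknowledgment is simulated by a boolean field rather than a real message exchange.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical view of a reserve ICU at the moment the threshold is reached.
final class Reserve {
    final String id;
    final int queueSize;    // number of tasks waiting in the ICU's queue
    final boolean inScu;    // already a member of the SCU
    final boolean willing;  // simulated willingness acknowledgment (Ack)

    Reserve(String id, int queueSize, boolean inScu, boolean willing) {
        this.id = id;
        this.queueSize = queueSize;
        this.inScu = inScu;
        this.willing = willing;
    }

    // Assumption: an ICU with an empty queue counts as idle.
    boolean idle() {
        return queueSize == 0;
    }
}

final class Reassignment {
    private Reassignment() {}

    // Step 1: reserves that are idle or less loaded than the assignee get a request.
    static List<Reserve> requestTargets(List<Reserve> reserves, int assigneeQueue) {
        List<Reserve> targets = new ArrayList<>();
        for (Reserve r : reserves) {
            if (r.idle() || r.queueSize < assigneeQueue) {
                targets.add(r);
            }
        }
        return targets;
    }

    // Steps 2 and 3: among reserves that acknowledged, prefer SCU members,
    // then the smallest task queue. Empty if nobody acknowledged.
    static Optional<Reserve> choose(List<Reserve> reserves, int assigneeQueue) {
        List<Reserve> acked = new ArrayList<>();
        for (Reserve r : requestTargets(reserves, assigneeQueue)) {
            if (r.willing) {
                acked.add(r);
            }
        }
        acked.sort(Comparator.comparing((Reserve r) -> !r.inScu)
                .thenComparingInt(r -> r.queueSize));
        return acked.isEmpty() ? Optional.empty() : Optional.of(acked.get(0));
    }
}
```

If the chosen reserve is not yet an SCU member, the caller would expand the SCU after the substitution, which corresponds to the last two lines of Algorithm 1.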

When multiple reserve ICUs send acknowledgments that they are ready to execute the task, the reassignment decision is made based on the information from the Acks combined with monitoring information about their task queues and logged information about the WCnf score. This type of scheduling combines the freedom of choosing tasks that workers have in crowdsourcing environments with policy-based assignment of tasks. It is these ICU-side guarantees, combined with task queue analysis, that can avoid problems such as delegation sinks. We outline the steps of this strategy in Algorithm 1. Alternatively, the request for willingness can be sent immediately after the task's initial assignment, so that when a threshold is reached the scheduler only checks the task queues of ICUs.

Executing an Elasticity Strategy. We implemented the algorithm using methods described in the API section. We created tasks with different skill requirements, modeled each ICU with a single skill for simplicity, and assigned each of them a different cost. When a decision is made about which tasks are going to be reassigned to which ICU, the new cost calculation includes the prices of each of the new ICUs, as follows:

$$Cost_{adapt}(scu_i) = Cost_{previous}(scu_i) - \sum_{i=1}^{j}\sum_{x=1}^{m} c(s_i, t_x) + \sum_{i=1}^{j}\sum_{x=1}^{m} c(s_i^{nw}, t_x),$$


where $Cost_{adapt}(scu_i) \leq$ Allowed Budget. Due to space limitations, and because it is not our goal to show how good this strategy is, we provide supplementary material (footnote 3). Generally, the results show that SCU productivity rises with the number of ICUs for the same effort, while it declines if the effort is high for a low number of tasks and a small number of ICUs.
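The cost update and the budget constraint can be checked with a few lines of Java. In this sketch the double sums over reassigned ICUs and tasks are flattened into two arrays, one with the costs under the previous ICUs and one with the costs under the new ICUs; the class and method names are hypothetical.

```java
// Hypothetical helper for the adapted-cost formula and the budget constraint.
final class CostAdaptation {
    private CostAdaptation() {}

    // Cost_adapt = Cost_previous - (costs of reassigned tasks at the old ICUs)
    //                            + (costs of the same tasks at the new ICUs)
    static double adaptedCost(double previousCost, double[] oldCosts, double[] newCosts) {
        double removed = 0.0;
        for (double c : oldCosts) {
            removed += c;
        }
        double added = 0.0;
        for (double c : newCosts) {
            added += c;
        }
        return previousCost - removed + added;
    }

    // The reassignment is admissible only if Cost_adapt <= Allowed Budget.
    static boolean withinBudget(double adaptedCost, double allowedBudget) {
        return adaptedCost <= allowedBudget;
    }
}
```

With a previous cost of 100, two reassigned tasks that cost 10 and 20 at the old ICUs and 15 and 30 at the new ones, the adapted cost is 115, which is admissible under a budget of 120 but not under a budget of 110.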

5 Related Work

Resource Management and Adaptation. Work on a retainer model for crowdsourcing environments and examples of its application are presented in [4], [3]. The model is designed for recruiting guaranteed workers by paying them a small additional amount, in this way keeping them in reserve and in a ready state for handling real-time tasks. Our ICU feedback-based strategy is similar in that our scheduler keeps references to the top x resources that were previously ranked as most suitable for a specific task. Hence, these resources are the reserve resources in our approach. However, the difference in our approach is that no prior payment is made for the reservation of these resources; rather, the scheduler sends them a notification asking for feedback on their willingness to execute a task that is already assigned to another resource but for which a threshold is reached. Our model is not concerned with initial task assignment and it is not intended for crowdsourcing tasks, although ICUs may be invoked from a crowdsourcing platform. The authors of [15] present a programming language and framework called CrowdLang for systems that incorporate human computation; what is of interest to us is that they provide cross-platform integration of resources, in this way making a human cloud possible. There is a considerable amount of work on adaptation and, more interestingly, on self-adaptation strategies. For example, the authors of [16] have presented an architecture that includes a self-adaptation framework for service-oriented collaboration systems. The part that our work relates to is their approach to identifying worker misbehavior patterns (e.g., as a result of uncontrolled task delegations) and providing a solution of reassigning tasks to other alternative resources by taking into account their task-queue size. Our strategy differs from theirs in that tasks are not delegated if ICUs are not willing to accept them. Rather, the task reassignment is managed with consent from alternative ICUs. The work in [9] describes a delegation model and related algorithms that concern trust updates. The authors mention adoption as a process where the delegation is initiated by the "delegatee". Our algorithm stands in between delegation and adoption.

Collaborative Communities and Teams. The concept of the SCU that we utilize in our work is presented by one of the coauthors in [6]. However, while this is the fundamental work introducing the SCU, it describes its lifecycle and does not go into detail on the SCU execution phase, as this was not its aim. This is tackled in [19], where researchers have looked into a specific case of incident management to investigate how SCUs and their evolution (adaptation) perform better than traditional process management. Resource discovery in crowdsourcing and team formation strategies and algorithms have been the subject of investigation in many works, such as [1], [2], [14], [13], [5]. The algorithms in these works can be utilized for SCU formation, and some also for ICU selection when an SCU needs to be extended. Task-executing collaboration models and runtime collaborations are also investigated in works such as [18]. However, the mentioned works focus on fixed teams without elasticity assumptions.

Elasticity. The notion of elasticity is treated in several domains and contexts and has especially gained importance with the advance of cloud computing. In [8], the authors discuss the reasons, challenges, and their approach toward virtualizing humans and software under the same service-based model, which will enable elastic computing in terms of scaling both software and human resources. The concept of elasticity in cloud computing is being extended to concepts like application [22] and process [7] elasticity; e.g., in [7], the authors identify resource, cost, and quality elasticity as being crucial in modeling processes in service-oriented computing. Mechanisms and a middleware to support scaling services in and out from applications utilizing SaaS are presented in [11].

Footnote 3: dsg.tuwien.ac.at/research/viecom/prototypes/viecas

6 Conclusion

Our research focus in this work was to provide mechanisms for effective provisioning of SCUs with elastic capabilities and their efficient runtime management. We have modeled an SCU at runtime and provided an exemplary algorithm that utilizes operations for provisioning elastic capabilities. We have shown that platforms supporting human computation in collective collaborations are more reliable when they work on the basis of the elasticity concept of scalability in terms of both resources and their parameters. Our future work includes further development of an SCU execution framework, which will include the presented model, metrics, API, and algorithms, so as to be able to deploy our approach in real environments.

Acknowledgments. This work is supported by the Vienna PhD School of Informatics (http://www.informatik.tuwien.ac.at/teaching/phdschool) and by the EU FP7 FET SmartSociety project (http://www.smart-society-project.eu/) under Grant agreement n. 600854.

References

1. Anagnostopoulos, A., Becchetti, L., Castillo, C., Gionis, A., Leonardi, S.: Power in unity: forming teams in large-scale community systems. In: CIKM, pp. 599–608 (2010)
2. Anagnostopoulos, A., Becchetti, L., Castillo, C., Gionis, A., Leonardi, S.: Online team formation in social networks. In: Proceedings of the 21st International Conference on World Wide Web, WWW 2012, pp. 839–848. ACM, New York (2012)
3. Bernstein, M.S., Brandt, J., Miller, R.C., Karger, D.R.: Crowds in two seconds: enabling realtime crowd-powered interfaces. In: Proceedings of the 24th Annual ACM Symposium on User Interface Software and Technology, UIST 2011, pp. 33–42. ACM, New York (2011)
4. Bernstein, M.S., Karger, D.R., Miller, R.C., Brandt, J.: Analytic methods for optimizing realtime crowdsourcing. CoRR abs/1204.2995 (2012)
5. Dorn, C., Dustdar, S.: Composing near-optimal expert teams: A trade-off between skills and connectivity. In: Meersman, R., Dillon, T.S., Herrero, P. (eds.) OTM 2010. LNCS, vol. 6426, pp. 472–489. Springer, Heidelberg (2010)
6. Dustdar, S., Bhattacharya, K.: The social compute unit. IEEE Internet Computing 15, 64–69 (2011)
7. Dustdar, S., Guo, Y., Satzger, B., Truong, H.L.: Principles of elastic processes. IEEE Internet Computing 15(5), 66–71 (2011)
8. Dustdar, S., Truong, H.L.: Virtualizing software and humans for elastic processes in multiple clouds - a service management perspective. IJNGC 3(2) (2012)
9. Hexmoor, H., Chandran, R.: Delegations and Trust. International Journal of Computational Intelligence, Theory and Practice 3(2), 95–108 (2008)
10. Kaganer, E., Carmel, E., Hirschheim, R., Olsen, T.: Managing the human cloud. MIT Sloan Management Review 54(2), 23–32 (2013)
11. Kapuruge, M., Han, J., Colman, A., Kumara, I.: ROAD4SaaS: Scalable business service-based SaaS applications. In: Salinesi, C., Norrie, M.C., Pastor, Ó. (eds.) CAiSE 2013. LNCS, vol. 7908, pp. 338–352. Springer, Heidelberg (2013)
12. Kasunic, M.: A Data Specification for Software Project Performance Measures: Results of a Collaboration on Performance Measurement. Technical report. Carnegie Mellon University, Software Engineering Institute (2008)
13. Lappas, T., Liu, K., Terzi, E.: Finding a team of experts in social networks. In: Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2009, pp. 467–476. ACM, New York (2009)
14. Lopez, M., Vukovic, M., Laredo, J.: Peoplecloud service for enterprise crowdsourcing. In: 2010 IEEE International Conference on Services Computing, pp. 538–545 (2010)
15. Minder, P., Bernstein, A.: Crowdlang: programming human computation systems. Technical report (Jan. 2012)
16. Psaier, H., Juszczyk, L., Skopik, F., Schall, D., Dustdar, S.: Runtime behavior monitoring and self-adaptation in service-oriented systems. In: Proceedings of the 2010 Fourth IEEE International Conference on Self-Adaptive and Self-Organizing Systems, SASO 2010, pp. 164–173. IEEE Computer Society, Washington, DC (2010)
17. Quinn, A.J., Bederson, B.B.: A taxonomy of distributed human computation
18. Sagar, A.B.: Modeling collaborative task execution in social networks. In: Potdar, V., Mukhopadhyay, D. (eds.) CUBE, pp. 664–669. ACM (2012)
19. Sengupta, B., Jain, A., Bhattacharya, K., Truong, H.-L., Dustdar, S.: Who do you call? Problem resolution through social compute units. In: Liu, C., Ludwig, H., Toumani, F., Yu, Q. (eds.) Service Oriented Computing. LNCS, vol. 7636, pp. 48–62. Springer, Heidelberg (2012)
20. SmartSociety: Hybrid and diversity-aware collective adaptive systems: When people meet machines to build a smarter society, http://www.smart-society-project.eu/ FP7 FET, EU Funded Project (accessed: December 20, 2013)
21. Truong, H.-L., Dustdar, S., Bhattacharya, K.: Programming hybrid services in the cloud. In: Liu, C., Ludwig, H., Toumani, F., Yu, Q. (eds.) Service Oriented Computing. LNCS, vol. 7636, pp. 96–110. Springer, Heidelberg (2012), http://dblp.uni-trier.de/db/conf/icsoc/icsoc2012.html#TruongDB12
22. Zhang, X., Kunjithapatham, A., Jeong, S., Gibbs, S.: Towards an elastic application model for augmenting the computing capabilities of mobile devices with cloud computing. Mob. Netw. Appl. 16(3), 270–284 (2011)

SmartSociety Hybrid and Diversity-Aware Collective Adaptive Systems When People Meet Machines to Build a Smarter Society

Grant Agreement No. 600584

Deliverable 7.1 Working Package 7

Technical Report - SmartCom Design

Dissemination Level (Confidentiality) [1]: PU
Delivery Date in Annex I: 31/12/2014
Actual Delivery Date: 31/12/2014
Status [2]: F
Total Number of pages: 57
Keywords: virtualization, communication, middleware, SmartCom

[1] PU: Public; RE: Restricted to Group; PP: Restricted to Programme; CO: Consortium Confidential as specified in the Grant Agreement
[2] F: Final; D: Draft; RD: Revised Draft


Full project title: SmartSociety: Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Project Acronym: SmartSociety
Grant Agreement Number: 600854
Number and title of workpackage: 7 Programming Models and Frameworks
Document title: Technical Report - SmartCom Design
Work-package leader: Hong-Linh Truong, TUW
Deliverable owner: Ognjen Scekic
Quality Assessor: Daniele Miorandi, UH


List of Contributors

Partner Acronym | Contributor
TUW             | Philipp Zeppezauer, Ognjen Scekic, Hong-Linh Truong


Executive Summary

This document presents the SmartCom architecture, components, and application programming interfaces. Section 1 looks at the general architecture of the system and at how its functionality is mapped to and handled by its various components. Section 2 then examines how SmartCom handles messages between the platform, the applications, the middleware, collectives, and the peers, and what messages look like. Section 2.4 presents the APIs of the components that have been described in Section 1. Finally, Section 2.5 outlines some more complex algorithms that are used by the system.


Table of Contents

1 Architecture
1.1 Adapters
1.1.1 Output Adapters
1.1.2 Input Adapters
1.2 Communication Engine
1.2.1 Adapter Manager
1.2.2 Authentication Manager
1.2.3 Messaging and Routing Manager
1.2.4 Handling of Policies
1.3 Message Broker
1.4 Services
1.4.1 Message Query Service
1.4.2 Message Info Service
2 Messages
2.1 Message Structure
2.2 Routing of Messages
2.3 Predefined Messages
2.3.1 Control Messages
2.3.2 Message Info Messages
2.3.3 Authentication Messages
2.4 Application Programming Interfaces (APIs)
2.4.1 Data Structures
2.4.2 Public Entities
2.4.3 Callback Entities
2.5 Algorithms
2.5.1 Creation of Output Adapters
2.5.2 Handling of Messages
3 Experimental Performance Evaluation

Figure 1: Overview of the communication middleware. Platform Components are part of the HDA-CAS platform and peers/tools are external to the system. [Diagram not reproduced. Labels: Platform Components; Communication Middleware; Communication Engine (Messaging and Routing Manager, Adapter Manager, Authentication Manager); Message Broker; Input Adapters; Output Adapters; Services (Message Query Service, Message Info Service); Peers; Tools; <<REST>>.]

1 Architecture

Figure 1 presents a conceptual overview of SmartCom's internal architecture. In a typical use case, the SmartSociety platform components on the left-hand side initiate the communication by sending messages to peers and collectives on the right-hand side via SmartCom. We denote the messages flowing from the SmartSociety platform in the direction of peers as Output Messages. Peers can reply to received messages by sending messages via SmartCom. Additionally, they can use external, third-party tools (e.g., uploading a file to a file server), which are monitored by SmartCom (i.e., the system checks regularly whether there are updates or changes). Messages that originate from the peers and are passed on to SmartCom for delivery are referred to as Input Messages.

SmartCom consists of four groups of components:

• Adapters handle the actual communication with peers and tools over communication channels (e.g., sending an email or making a REST call) and virtualize it for the rest of SmartCom and the platform. The technology that adapters use to send and receive messages depends on the actual implementation and is hidden from the rest of the system behind a common interface for all adapters. This is important for the virtualization of peers. Furthermore, it allows new communication technologies to be adopted as they emerge.

• Communication Engine consists of components that provide the core functionality of the system. They are responsible for handling the messages that are sent to or received from peers and tools. They resolve the current members of a collective and initiate the communication. Additional functionality, such as message authentication, management and execution of adapters, and routing of messages, is also provided by the components of the Communication Engine.

• Services provide additional information on communication aspects to the peers and platform components. Furthermore, they can be used, e.g., to derive metrics (such as the average response time) and profiles for peers.

• Message Broker is used to decouple the execution of the various middleware components in order to achieve scalability (i.e., multiple components listen for messages on a single queue). Furthermore, the queues of the broker are used for the routing of messages, which is determined by the Messaging and Routing Manager of the Communication Engine.

The following sections describe the internal structure and further details of the components mentioned above. A detailed diagram of all the components is presented in Figure 2.

1.1

Adapters

This section discusses the technical aspects of adapters within SmartCom. In general, there are two types of adapters: Output Adapters and Input Adapters. An Output Adapter is only allowed to send output messages to peers, and an Input Adapter is only allowed to receive input messages from peers. The reason behind this distinction is that the two types are handled differently. Whereas Output Adapters are shared among all applications running on the platform, Input Adapters are usually created by applications and are dedicated to receiving input messages for the application that created them. Their behavior is application specific, whereas the behavior of Output Adapters is considered peer specific.

This difference in behavior is also reflected in the way they are created. Output Adapters are only registered in the system as adapter types (e.g., an email adapter that sends emails to peers), and their lifecycle (i.e., registration, creation, execution, and removal) is handled by the Adapter Manager (discussed in Section 1.2.1). Input Adapters, on the other hand, are created by applications running on the platform and are passed to SmartCom because they might require special configuration. Consider an Input Adapter that monitors a folder on an FTP server: such an adapter requires a path to be specified as well as additional information such as a username and a password. This data has to be provided by the application because it is application specific. Therefore, Input Adapters are not shared among different applications.

Figure 2: Detailed view of the communication middleware. [Diagram not reproduced. Labels: Platform Component(s); Communication API; Messaging and Routing Manager (Message Handler, Input Handler, Routing Rule Engine, Routing Rules, Message Logging Service, Messages); Peer Info API; Collective Info API; Notification Callback API; Authentication Manager (Authentication Request Handler, Authentication Provider, Sessions); PeerAuthentication Callback API; Message Broker (Control Queue, Input Queue, Output Queues, Request Queues, AUTH Queue, MIS Queue, Log Queue); Adapter Manager (Adapter Execution Engine, Adapter Handler, Address Resolver, Addresses as cache for contact data of peers); Adapters (Input Pull Adapter, Input Push Adapter, Output Adapter, Adapter Implementation); Message Info Service (Message Info Service API, Message Information); Message Query Service (Message Query Service API, Query Handler, Request Handler); Peers; Tools; <<REST>>.]

1.1.1

Output Adapters

Output Adapters are responsible for sending messages from SmartCom to peers. There are two categories of Output Adapters: Stateful Output Adapters and Stateless Output Adapters. Stateful Output Adapter instances are created per peer, whereas a single Stateless Output Adapter instance is used to send messages to different peers. Both categories have to be provided with peer-specific contact data, such as an email address or a URL. The data necessary to contact a peer is provided to an Output Adapter by the Adapter Manager at each invocation. Stateful Output Adapters additionally receive this data at the beginning of their lifecycle, i.e., during the creation of the adapter, because they might need to maintain additional conversational data based on this peer information.

Figure 3 presents the internal structure of both categories of Output Adapters. The boxes labeled 'Adapter Implementation' indicate the actual implementation of the corresponding adapter (e.g., code that issues a REST call). Both adapters have their own Output Queue, but they share one Control Queue. Output Queues are adapter-specific queues that hold outgoing messages that still have to be handled. The Control Queue is used to notify SmartCom that the sending was successful or that an error occurred. Instances of both categories have to implement the Output Adapter API (see Section 2.4.2).

Figure 3: Concept of the Output Adapter. [Diagram not reproduced. Labels: Middleware; Output Adapter Execution with Stateless Output Adapter, Output Queue, Output Adapter API, and Adapter Implementation serving several Peers; Output Adapter Execution with Stateful Output Adapter, Output Queue, Output Adapter API, and Adapter Implementation serving one Peer; shared Control Queue.]

The lifecycle of an Output Adapter depends on whether it is stateful or stateless. Both Stateful Output Adapters and Stateless Output Adapters have to be registered in the system by calling the Adapter Manager (see Section 1.2.1).

• In the case of a Stateless Output Adapter, the adapter is instantiated and executed immediately because it is shared among peers. This approach has the advantage that there is no need to instantiate an adapter of a specific type at any point in the future, which eliminates the need for synchronization and locking to ensure that there is only a single instance of this adapter. Note that further scaled-out instances and the primary instance of the same adapter are considered a single instance from the conceptual point of view. The disadvantage of this approach is that resources, such as computation time and memory, are assigned to the adapter even if it is not used to communicate with peers. Nevertheless, because Stateless Output Adapters usually consume few resources, this disadvantage is acceptable.

• In the case of a Stateful Output Adapter, the adapter is only instantiated on demand because there is one instance per peer that uses this adapter for communication. If many peers use an adapter type (e.g., an email adapter), there are also many adapter instances, which causes a higher resource consumption compared to Stateless Output Adapters. Hence, creating them on demand and removing them if they have not been used for some time reduces this resource consumption.

Adapters of both categories are usually removed only when the appropriate method of the Adapter Manager is called. However, instances of Stateful Output Adapters may be discarded earlier to save resources if they have not been used for some time. Removing an Output Adapter means that the corresponding communication channel can no longer be used by SmartCom to interact with peers. For example, removing an Output Adapter that sends emails means that no peer can be contacted by email unless another Output Adapter handling emails is registered.
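The two lifecycles described above can be illustrated with a minimal Python sketch. All names in it (`AdapterHandler`, `register_stateless`, `register_stateful`, `adapter_for`, `remove`, and the two stand-in adapter classes) are hypothetical and do not reproduce the Output Adapter API of Section 2.4.2. The sketch only assumes that stateless adapters are instantiated at registration time, whereas stateful adapters are instantiated on demand, at most once per peer.

```python
class EmailAdapter:
    """Stand-in stateless adapter: one shared instance serves every peer."""

    def push(self, message, address):
        return f"email to {address}: {message}"


class ChatAdapter:
    """Stand-in stateful adapter: one instance per peer, created with its address."""

    def __init__(self, address):
        self.address = address
        self.sent = 0  # conversational state kept for this peer

    def push(self, message, address):
        self.sent += 1
        return f"chat #{self.sent} to {address}: {message}"


class AdapterHandler:
    """Hypothetical registry mirroring the two lifecycles described above."""

    def __init__(self):
        self._stateless = {}       # adapter type -> shared instance
        self._stateful_types = {}  # adapter type -> class, instantiated later
        self._stateful = {}        # (adapter type, peer id) -> instance

    def register_stateless(self, name, cls):
        self._stateless[name] = cls()  # instantiated immediately

    def register_stateful(self, name, cls):
        self._stateful_types[name] = cls  # only the type is registered

    def adapter_for(self, name, peer_id, address):
        if name in self._stateless:
            return self._stateless[name]
        key = (name, peer_id)
        if key not in self._stateful:  # on demand, at most one per peer
            self._stateful[key] = self._stateful_types[name](address)
        return self._stateful[key]

    def remove(self, name):
        """Remove an adapter type; its channel is no longer available afterwards."""
        self._stateless.pop(name, None)
        self._stateful_types.pop(name, None)
        for key in [k for k in self._stateful if k[0] == name]:
            del self._stateful[key]
```

In this sketch, requesting an adapter of a removed type raises a `KeyError`, which corresponds to the statement that a removed channel cannot be used until another adapter for it is registered.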


1.1.2


Input Adapters

Input Adapters are responsible for either waiting for input or actively checking for input from peers or tools. They can be implemented using a push or a pull mechanism. Adapters using push are called Input Push Adapters. They are notified of new developments (e.g., a new mail in a mailing list) by the external tool or communication channel via a push notification. Input Pull Adapters, on the other hand, are handled and executed by the Adapter Execution Engine. A pull is triggered at a certain interval or by a programmed request (e.g., a peer has only one hour to send a file to an FTP server; after the time runs out, the pull adapter checks whether a file is available). Such a request is expressed by putting a corresponding message in the Request Queue of the pull adapter. This message instructs the adapter to execute a pull.

Figure 4 presents the internal structure of both Input Adapter categories. Instances of both categories push the received messages to the Input Queue. Input Push Adapters have to implement the Input Push Adapter API, and Input Pull Adapters have to implement the Input Pull Adapter API (see Section 2.4.2 for both).

The lifecycle of Input Adapters is managed entirely by SmartSociety platform applications and components. Input Adapters are created and removed by applications because their configuration is application specific. The lifecycle of Input Push Adapters is special: after being added to the Adapter Manager, they have to register a technology-specific handler that is responsible for the reception of push notifications. This handler has to be destroyed again when the adapter is removed. For example, an adapter using a server socket has to register the socket at the beginning of its lifecycle and destroy it at the end.
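The request-driven pull mechanism can be sketched as follows. This is an illustration under stated assumptions, not the Input Pull Adapter API of Section 2.4.2: `FolderPullAdapter`, `run_pull_execution`, and the use of a Python list as a stand-in for a remote folder are all hypothetical. The sketch shows that a pull happens only when a request message is waiting in the Request Queue and that every result is placed in the Input Queue.

```python
import queue


class FolderPullAdapter:
    """Stand-in Input Pull Adapter that checks a (simulated) folder for new files."""

    def __init__(self, folder):
        self.folder = folder  # a list standing in for a remote folder
        self._seen = 0        # number of entries already reported

    def pull(self):
        new_files = self.folder[self._seen:]
        self._seen = len(self.folder)
        return new_files


def run_pull_execution(adapter, request_queue, input_queue):
    """Execute one pull for every request currently waiting in the Request Queue."""
    while True:
        try:
            request_queue.get_nowait()
        except queue.Empty:
            return
        for item in adapter.pull():
            input_queue.put(item)  # results go to the Input Queue
```

A periodic pull could be obtained in the same way by letting a timer place request messages in the Request Queue at a fixed interval.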

1.2

Communication Engine

The Communication Engine is the core of the execution system of SmartCom and is responsible for the communication between the platform, the applications, and the peers using messages and adapters. It consists of the Adapter Manager, the Authentication Manager, and the Messaging and Routing Manager. The interactions of these subcomponents are shown in Figure 2.

Messages that should be sent to peers are passed to the Messaging and Routing Manager, which decides based on internal routing rules how to forward them (i.e., which component or adapter handles the message). Messages are sent to and received from the peers using the corresponding Input and Output Adapters (described in Section 1.1), which are created, managed, and executed by the Adapter Manager. The Authentication Manager is responsible for verifying the authenticity of a peer and for providing peers with a security token that allows SmartCom to verify the sender of a message. These components are described in the following sections.

Figure 4: Concept of Input Push Adapters and Input Pull Adapters. [Diagram not reproduced. Labels: Middleware; Input Adapter Execution with Input Push Adapter, Input Push Adapter API, and Adapter Implementation; Input Adapter Execution with Input Pull Adapter, Request Queue, Input Pull Adapter API, and Adapter Implementation; Input Queue; Peer/Tool.]

1.2.1

Adapter Manager

The Adapter Manager is responsible for the lifecycle management (i.e., registration, creation, initialization, execution, and removal) of adapters. It consists of the following subcomponents: Adapter Execution Engine, Adapter Handler, Address Resolver, and multiple Adapter Executions. Figure 5 shows the internal structure of the Adapter Manager and how the subcomponents interact with each other.

Figure 5: Subcomponents of the Adapter Manager and interaction with other internal and external components. [Diagram not reproduced. Labels: System Component (e.g., Task Execution Engine); Communication API; Messaging and Routing Manager; Adapter Manager (Adapter Execution Engine, Adapter Handler, Address Resolver, Adapter Executions); Addresses (cache for contact data of peers).]

The Adapter Handler manages the lifecycle of Output Adapters. Both categories of Output Adapters are registered with the Adapter Handler. Stateless Output Adapters are instantiated immediately, while Stateful Output Adapters just remain registered. When a message has to be sent to a peer using a stateful adapter, the Adapter Handler creates an instance of the adapter for the recipient of the message and passes its reference to the Messaging and Routing Manager. The reference of the adapter is the address of the adapter-specific Output Queue of the Message Broker from which the adapter pulls messages. The Adapter Handler prevents the instantiation of multiple stateful adapters for a single peer. The adapters required to communicate with a peer are selected based on the Peer Channel Addresses. A Peer Channel Address is the internal representation of a communication channel used by a peer; it consists of a unique name (e.g., Email) and a list of parameters (e.g., an email address).

Input Adapters are not registered with the Adapter Handler. Platform components and applications have to create instances of Input Adapters themselves and pass these instances to SmartCom using the Communication API (see Section 2.4). Input Adapters either receive messages from peers or tools directly via push notifications, or they regularly check an external tool (e.g., a folder on an FTP server or a mailing list) for a new message, which is represented by a new resource (e.g., a new file).

To support scaling out for big workloads, each Output Adapter initially listens for new messages on a single, adapter-specific Output Queue of the Message Broker (see Section 1.3). Scaled-out instances of an adapter pull messages from the same queue, which allows multiple instances to handle the messages of the queue concurrently. Conceptually, the initial instance of the adapter and the scaled-out instances are considered a single adapter instance because there is no difference in semantics, just an increase in performance. Therefore, we do not differentiate between these instances in this description. Note that Stateful Output Adapters and Stateless Output Adapters differ in their behavior regarding scalability. Since each Stateful Output Adapter instance is associated with one peer, there is hardly any need to scale out these instances, because a higher workload occurs only in rare cases. Stateless Output Adapters, being shared, have to be scaled out much more often, especially if many peers use the communication channel handled by the adapter.
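The scale-out behavior described above, with several instances pulling from one adapter-specific queue, can be sketched with threads and an in-process queue. This is an assumption-laden illustration: a real deployment would use the queues of the Message Broker, and `adapter_instance` and `run_scaled_out` are hypothetical names. The sketch shows that each message is taken from the shared queue by exactly one instance.

```python
import queue
import threading


def adapter_instance(output_queue, delivered, lock):
    """One instance of an adapter: take messages until a None sentinel arrives."""
    while True:
        message = output_queue.get()
        if message is None:
            break
        with lock:
            delivered.append(message)  # stands in for sending to a peer


def run_scaled_out(messages, instances):
    """Start several instances on one shared Output Queue and wait for completion."""
    output_queue = queue.Queue()
    delivered, lock = [], threading.Lock()
    workers = [
        threading.Thread(target=adapter_instance, args=(output_queue, delivered, lock))
        for _ in range(instances)
    ]
    for worker in workers:
        worker.start()
    for message in messages:
        output_queue.put(message)
    for _ in workers:
        output_queue.put(None)  # one stop sentinel per instance
    for worker in workers:
        worker.join()
    return delivered
```

The order in which messages are delivered is not fixed in this sketch, which matches the description that the instances handle the messages of the queue concurrently.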

Instances of both types of adapters are executed by the Adapter Execution Engine. For this purpose, every adapter is assigned to an Adapter Execution, which handles its execution. The Adapter Execution retrieves messages from queues, determines the address information of the communication channels of peers, calls the appropriate methods of the Adapter APIs (see Section 2.4), and publishes messages to queues. The behavior of the Adapter Execution depends on whether it handles an Input Adapter or an Output Adapter. Executions of Output Adapters retrieve messages from the Output Queue and initiate the communication. Executions of Input Pull Adapters wait for pull requests in the Request Queue and initiate a pull upon reception of a request message. Input Push Adapters handle their execution on their own and are therefore not assigned to Adapter Executions. The algorithms for adapter lifecycle management are detailed in Section 2.5.1.

Peer Channel Addresses for instantiated adapters are stored in the Addresses Data Storage. Adapters need these addresses to be able to contact a peer. The Address Resolver is responsible for resolving the address requests of Adapter Executions. When an adapter sends a message to a peer, the Adapter Execution obtains the address of that peer by querying the Address Resolver. The data storage acts as a cache for Peer Channel Addresses to speed up the execution of adapters: the Peer Channel Addresses are usually managed by platform components, and calling these components for every message might limit performance and throughput.
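The caching role of the Addresses Data Storage can be sketched as a small lookup cache. The names `AddressResolver`, `resolve`, and `invalidate` are hypothetical, the `lookup` callable stands in for the platform component that manages Peer Channel Addresses, and the explicit invalidation step is an assumption of this sketch rather than something the text above specifies.

```python
class AddressResolver:
    """Hypothetical resolver that caches Peer Channel Addresses."""

    def __init__(self, lookup):
        self._lookup = lookup  # callable standing in for the platform component
        self._cache = {}       # (peer id, channel name) -> address

    def resolve(self, peer_id, channel):
        key = (peer_id, channel)
        if key not in self._cache:
            # Only a cache miss triggers a call to the platform component.
            self._cache[key] = self._lookup(peer_id, channel)
        return self._cache[key]

    def invalidate(self, peer_id, channel):
        """Drop a cached entry, e.g. after the peer's contact data changed."""
        self._cache.pop((peer_id, channel), None)
```

With such a cache, repeated messages to the same peer over the same channel require only one call to the platform component.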

Figure 6: Concept of the Authentication Manager. [Diagram not reproduced. Labels: System Component (e.g., Peer Manager); PeerAuthentication Callback API; Authentication Manager (Authentication Request Handler, Authentication Provider, Sessions); Messaging and Routing Manager; AUTH Queue; Control Queue.]

1.2.2

Authentication Manager

The Authentication Manager is used to authenticate peers and to verify the authenticity of their messages in the system. Authentication request messages (see Section 2.3) are placed in the AUTH Queue by the Messaging and Routing Manager and are collected by the Authentication Request Handler. This handler interacts with the Peer Authentication Callback API (see Section 2.4.3) to get information on the peer and to authenticate the peer (using the credentials provided in the message). After a successful authentication, the manager creates a security token that can be used by peers and SmartCom to provide security features (e.g., message authentication or message encryption). This token is only valid for a certain period of time. The period between the creation of the token and its invalidation is called a session. The result of the authentication is passed to the Control Queue in the form of a response message. If required, the Messaging and Routing Manager can use the Authentication Provider to verify the authenticity of a message. Figure 6 presents the internal structure of the Authentication Manager.

The Authentication Manager uses a Session Data Storage to handle the sessions of peers. A session consists of a timestamp and a session token that peers can use to authenticate messages. If a message arrives with a token of an invalid session, the peer has to be informed that it must renew its token. Such messages should be discarded or at least retained until the peer authenticates itself again.
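The session handling described above can be sketched as a token store with a fixed lifetime. Everything in this sketch is an assumption made for illustration: the class name `SessionStore`, the random hexadecimal token, the lifetime in seconds, and the injectable `clock` are hypothetical and not taken from the Authentication Manager's actual design.

```python
import secrets
import time


class SessionStore:
    """Hypothetical session storage: one token plus creation timestamp per peer."""

    def __init__(self, lifetime_seconds, clock=time.time):
        self._lifetime = lifetime_seconds
        self._clock = clock    # injectable so that expiry can be tested
        self._sessions = {}    # peer id -> (token, created_at)

    def create(self, peer_id):
        """Called after a successful authentication of the peer."""
        token = secrets.token_hex(16)
        self._sessions[peer_id] = (token, self._clock())
        return token

    def is_valid(self, peer_id, token):
        """Check that the token belongs to a session that has not expired."""
        session = self._sessions.get(peer_id)
        if session is None:
            return False
        stored_token, created_at = session
        expired = self._clock() - created_at > self._lifetime
        return not expired and secrets.compare_digest(stored_token, token)
```

A message whose token fails this check would, following the text above, be discarded or retained until the peer has authenticated itself again.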

Figure 7: Concept of the Messaging and Routing Manager. [Diagram not reproduced. Labels: System Component (e.g., Task Execution Engine); Communication API; Peer Info API; Collective Info API; Notification Callback API; Messaging and Routing Manager (Message Handler, Input Handler, Routing Rule Engine, Routing Rules, Message Logging Service, Messages); Authentication Manager; Adapter Manager; AUTH Queue; MIS Queue; Log Queue; Output Queues; Request Queues; Control Queue; Input Queue.]

1.2.3

Messaging and Routing Manager

The Messaging and Routing Manager is responsible for handling the internal and external messages of the system. Figure 7 presents its internal structure and its communication with external components. This component sends messages to peers, collectives, or components. Upon reception of a message, the Message Handler processes it according to the type of its receiver.

If the receiver of the message is a peer, the Message Handler determines the adapter(s) that should be used for the communication. If no adapters are available, new ones have to be created. For this purpose, the Message Handler queries the Peer Info API (see Section 2.4.3) to retrieve the peer profile information, which contains the communication channels used by the peer. Since this information does not change often, it can be cached to improve the performance of further requests. Subsequently, the Adapter Manager is instructed to create new adapters according to the peer's preferred communication channels and delivery policies. After the successful creation of the adapters, the message is put in their Output Queues.

If the receiver of the message is a collective, the current members of the collective have to be determined in order to initiate the communication. For this purpose, the Message Handler queries the Collective Info API (see Section 2.4.3). After the members of the collective have been received, a new message is created and sent to every member using the procedure described in the previous paragraph.

If the receiver of a message is an internal or external component, the Message Handler forwards it directly to that component, either by putting it into a special queue or by notifying the Notification Callback API (see Section 2.4.3) of the component. The Messaging and Routing Manager uses routing (described later in this section) to determine the corresponding component.

If no receiver is set for a message, the Message Handler notifies a platform component using the Notification Callback API, because the stateless nature of SmartCom makes it impossible to determine a receiver. Platform components can register themselves with the Messaging and Routing Manager to receive such notifications through the Notification Callback API. All registered callbacks implementing this API are invoked whenever a message arrives that cannot be handled by SmartCom.

Besides the primary recipient of a message, further recipients can be determined based on Routing Rules, which are handled by the Routing Rule Engine. Rules can be added by platform components and applications to implement special communication patterns, or to simplify the communication and reduce overhead. For example, if an application always transfers messages of a specific subtype to a software service, such messages can be forwarded directly to the service instead of being sent to the application first.
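The receiver-based handling performed by the Message Handler can be summarized in a short sketch. The function name `expand_recipients`, the dictionary representation of a message, and the plain dictionaries and sets that stand in for the Collective Info API and the known components are hypothetical simplifications; adapter selection and routing rules are left out.

```python
def expand_recipients(message, collective_members, components):
    """Return (kind, target) pairs describing where a message is forwarded.

    message: dict with an optional "receiver" entry
    collective_members: dict mapping collective id -> list of peer ids
    components: set of known component ids
    """
    receiver = message.get("receiver")
    if receiver is None:
        # No receiver set: hand the message to the registered callbacks.
        return [("notify", None)]
    if receiver in collective_members:
        # Collective: one message per current member, each handled like a peer.
        return [("peer", member) for member in collective_members[receiver]]
    if receiver in components:
        # Component: forwarded directly via a queue or a callback.
        return [("component", receiver)]
    # Otherwise the receiver is treated as a single peer.
    return [("peer", receiver)]
```

For example, a message addressed to a collective with two members yields two peer deliveries, each of which then follows the peer procedure described above.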
The Input Handler pulls incoming input messages (e.g., a response from a peer) from the Input Queue and incoming control messages (e.g., a communication error message) from the Control Queue. Both kinds of messages are forwarded to the Message Handler, which determines their destination. The Input Handler can be scaled out to improve the performance of the handling of input and control messages.

The Message Logging Service is responsible for persisting all sent and received messages to a database. This information can be used to debug SmartCom, or it can be analyzed to determine incentives or to construct provenance graphs. Stored messages can be retrieved using the Message Query Service (see Section 1.4.1).

Routing

There are two types of routing in SmartCom. The first type is purely internal and consists of determining the adapter(s) that have to be used for the communication with a peer. No routing rules are involved in this process. This activity includes instructing the Adapter Manager to instantiate new adapters if none are available for a specific peer. This type of routing also has to keep track of changes in the peer information of a peer, because such changes might require previously instantiated adapters to be recreated or removed. Additionally, the delivery policy of a peer (described later on) has to be checked for changes, because these might also trigger instantiations or removals of adapters by the Adapter Manager.

The second type of routing determines further recipients of a message based on the following properties of the message: type and subtype, receiver, and sender. This routing information is added by providing Routing Rules, which are stored in the Routing Rule Engine. The resulting routing defines additional recipients of the message, which can be peers, collectives, or internal and external components. Routing rules with an empty recipient are not allowed. Routes are determined by matching the properties of the message against the properties of the routing rules; a property of a routing rule that is set to null matches every value of the corresponding message property. The properties are evaluated in a fixed order: first the type, then the subtype, then the receiver, and finally the sender of the message. Because null matches anything, it is not allowed to provide a routing rule in which the type, subtype, receiver, and sender are all null. This restriction prevents all messages of the system from being forwarded to a single peer. The second type of routing adds flexibility to the system in terms of communication.
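The matching of routing rules against message properties can be sketched as follows. The class `RoutingRule`, its field names, and the function `additional_recipients` are hypothetical and only illustrate the behavior described above: `None` plays the role of null and acts as a wildcard, a rule without a recipient is rejected, and a rule whose four properties are all unset is rejected as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RoutingRule:
    route_to: str                   # additional recipient; must not be empty
    type: Optional[str] = None      # None acts as a wildcard
    subtype: Optional[str] = None
    receiver: Optional[str] = None
    sender: Optional[str] = None

    def __post_init__(self):
        if not self.route_to:
            raise ValueError("a routing rule needs a recipient")
        if (self.type, self.subtype, self.receiver, self.sender) == (None,) * 4:
            raise ValueError("at least one property must be set")

    def matches(self, message):
        # Properties are compared in the order type, subtype, receiver, sender.
        for name in ("type", "subtype", "receiver", "sender"):
            expected = getattr(self, name)
            if expected is not None and expected != message.get(name):
                return False
        return True


def additional_recipients(message, rules):
    """Collect the recipients added by every rule that matches the message."""
    return [rule.route_to for rule in rules if rule.matches(message)]
```

A rule such as `RoutingRule(route_to="monitor", receiver="collective-1")` would add a monitoring recipient to every message addressed to that collective, regardless of type, subtype, and sender.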
Rule-based routing allows applications to implement special communication patterns, e.g., a monitoring peer that logs all the messages sent to a specific collective without being part of the collective.

1.2.4

Handling of Policies

SmartCom handles two types of policies. The first type are peer-specific privacy policies which have to be considered when sending a message to a peer. Privacy policies might restrict the sending of messages based on their properties or at a certain time (e.g., during the night) which means that the sending has to be aborted. Privacy policies of peers c SmartSociety Consortium 2013-2017

19 of 57

c SmartSociety Consortium 2013-2017

Deliverable 7.1

are managed outside of SmartCom and are retrieved by calling the Peer Info API (see Section 2.4.3).

The second type of policies are the delivery policies. They have to be considered and enforced on multiple levels, which are described in the following:

1. The first level of delivery policy enforcement is the message level. At this level it is possible to specify how messages are delivered. Initially, there is just one option, which determines whether the successful sending of a message is acknowledged or not. In general, the delivery policies on this level and on any other level are not restricted to this behavior and can easily be extended in future versions of SmartCom.

2. The second level is concerned with the peer delivery policies. Besides specifying the preferred communication channels in the peer’s profile (in an external component), these policies can also specify how a peer is to be contacted using these addresses. The options TO ALL (all addresses are used), TO ANY (any address is used) and PREFERRED (preference is expressed by the order of addresses) are currently provided. The enforcement of these policies is handled by the Messaging and Routing Manager. When a message is sent to a peer, the manager registers a handler that listens for acknowledgement messages from the corresponding adapters, which indicate successful sending. Unsuccessful sending is indicated by a communication error message of the adapter. The TO ALL policy requires that all adapters send the message successfully, whereas TO ANY requires at least one adapter to be successful. The PREFERRED policy fails if the message could not be sent via the preferred adapter of the peer. If the delivery is unsuccessful according to the chosen delivery policy, a failure message is forwarded to the sender of the initial message.

3. The third level is concerned with the sending of messages to collectives. The delivery policy is provided by the Collective Info API (see Section 2.4.3) and defines how messages should be sent to the members of the collective. Options include TO ALL MEMBERS and TO ANY. Upon sending a message to a collective, a handler is likewise registered to enforce the policy. The behavior on this level is similar to the one on the peer level, but successful sending is indicated by a successful policy enforcement on the peer level. TO ALL MEMBERS means that sending has to be successful for each peer based on the peers’ delivery policies; if one of these policies fails, the enforcement on the collective level fails too. The TO ANY policy, on the other hand, only requires the sending to one peer to be successful.

Note that the failure of a policy enforcement is always reported to the sender of the initial message, whereas success is only reported if required by the message-level policy of the initial message.
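The three enforcement levels can be illustrated with a small Java sketch. It is not taken from the SmartCom code base: the class `DeliveryPolicySketch`, its enums and its method names are hypothetical, the policy names carry underscores only to form valid identifiers, and the sketch assumes that the first entry of the result list belongs to the preferred address.

```java
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of delivery policy evaluation; not SmartCom's actual code. */
final class DeliveryPolicySketch {

    /** Peer-level options named in the text (underscores added for valid identifiers). */
    enum PeerPolicy { TO_ALL, TO_ANY, PREFERRED }

    /** Collective-level options named in the text. */
    enum CollectivePolicy { TO_ALL_MEMBERS, TO_ANY }

    /**
     * Each entry is true if the adapter for one address acknowledged the sending and
     * false if it reported a communication error. Assumption: index 0 is the preferred address.
     */
    static boolean peerDeliverySucceeded(PeerPolicy policy, List<Boolean> adapterResults) {
        if (adapterResults.isEmpty()) {
            return false; // assumption: a peer without any address cannot be reached
        }
        switch (policy) {
            case TO_ALL:
                return !adapterResults.contains(Boolean.FALSE);
            case TO_ANY:
                return adapterResults.contains(Boolean.TRUE);
            case PREFERRED:
                return adapterResults.get(0);
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    /** Each entry is the outcome of the peer-level enforcement for one member. */
    static boolean collectiveDeliverySucceeded(CollectivePolicy policy, List<Boolean> peerResults) {
        if (peerResults.isEmpty()) {
            return false; // assumption: an empty collective cannot receive a message
        }
        switch (policy) {
            case TO_ALL_MEMBERS:
                return !peerResults.contains(Boolean.FALSE);
            case TO_ANY:
                return peerResults.contains(Boolean.TRUE);
            default:
                throw new IllegalArgumentException("Unknown policy: " + policy);
        }
    }

    /** Failures are always reported; success only if the message-level policy asks for it. */
    static boolean shouldNotifySender(boolean succeeded, boolean acknowledgementRequested) {
        return !succeeded || acknowledgementRequested;
    }

    public static void main(String[] args) {
        List<Boolean> adapters = Arrays.asList(false, true); // preferred address failed
        System.out.println(peerDeliverySucceeded(PeerPolicy.TO_ANY, adapters));    // true
        System.out.println(peerDeliverySucceeded(PeerPolicy.PREFERRED, adapters)); // false
    }
}
```

The collective-level check takes the peer-level outcomes as its input, which mirrors the statement that success on the collective level is defined by successful enforcement on the peer level.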

1.3 Message Broker

The purpose of the Message Broker is to decouple the execution of the various components of SmartCom. Furthermore, it is used to implement the first type of routing (see Section 1.2.3), which sends messages to the correct adapters that have to forward them to the peers. Some of the queues of the Message Broker have already been mentioned in previous chapters. In the following, we briefly describe all queues that are used within the system:

• Control Queue: contains messages that have been sent by internal components and are needed to control the internal flow of messages, to forward results of internal service invocations (e.g., the answer to an authentication request), to indicate (communication) errors, or to enforce delivery policies.

• Input Queue: a single queue that is filled with input messages of peers by all Input Adapters. These messages are handled by the Messaging and Routing Manager according to the specified receivers and additional routing rules.

• Output Queues: contain the output messages that should be handled by adapters to send a message to a peer over a communication channel. There is exactly one output queue for each Output Adapter.

• Request Queues: used to force Input Pull Adapters to perform a pull. There is one queue for each of these adapters so that they can be notified separately to pull for new input.

• AUTH Queue: a special queue for messages that are intended for the Authentication Manager. Messages in this queue are authentication request messages (see Section 2.3), which consist of the username and password so that peers can be authenticated.

• MIS Queue: a special queue for messages that are intended for the Message Info Service. These messages are usually request messages (see Section 2.3) for information about a certain message indicated by a certain type and subtype.

• Log Queue: intended for all messages that are handled within SmartCom and that have to be logged. Messages in this queue are consumed by the Message Logging Service of the Messaging and Routing Manager, which saves the messages to a database.

[Figure 8: Concept of the Message Query Service. Elements shown: Message Query Service API, Message Query Service, Query Handler, Messages.]
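The queue layout of the Message Broker can be sketched in Java as follows. This is a simplified illustration under stated assumptions, not the broker used by SmartCom: the class `MessageBrokerSketch` and all of its method names are hypothetical, messages are reduced to plain strings, and in-memory queues from `java.util.concurrent` stand in for whatever messaging technology the prototype uses.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Illustrative in-memory model of the broker queues; not SmartCom's actual broker. */
final class MessageBrokerSketch {

    /** Queues that exist exactly once in the system. */
    enum SharedQueue { CONTROL, INPUT, AUTH, MIS, LOG }

    // Filled once in the constructor and only read afterwards.
    private final Map<SharedQueue, BlockingQueue<String>> shared = new EnumMap<>(SharedQueue.class);
    // One output queue per Output Adapter and one request queue per Input Pull Adapter.
    private final Map<String, BlockingQueue<String>> outputQueues = new ConcurrentHashMap<>();
    private final Map<String, BlockingQueue<String>> requestQueues = new ConcurrentHashMap<>();

    MessageBrokerSketch() {
        for (SharedQueue queue : SharedQueue.values()) {
            shared.put(queue, new LinkedBlockingQueue<>());
        }
    }

    void publish(SharedQueue queue, String message) {
        shared.get(queue).add(message);
    }

    /** Returns the next message of the queue or null if the queue is empty. */
    String poll(SharedQueue queue) {
        return shared.get(queue).poll();
    }

    void publishOutput(String adapterId, String message) {
        queueFor(outputQueues, adapterId).add(message);
    }

    String pollOutput(String adapterId) {
        return queueFor(outputQueues, adapterId).poll();
    }

    /** Asks a single pull adapter to perform a pull. */
    void requestPull(String adapterId) {
        queueFor(requestQueues, adapterId).add("PULL");
    }

    /** Consumes one pending pull request of the adapter, if there is one. */
    boolean pullRequested(String adapterId) {
        return queueFor(requestQueues, adapterId).poll() != null;
    }

    private static BlockingQueue<String> queueFor(Map<String, BlockingQueue<String>> queues,
                                                  String adapterId) {
        return queues.computeIfAbsent(adapterId, id -> new LinkedBlockingQueue<>());
    }
}
```

Keeping the per-adapter queues in separate maps reflects the description above: each Output Adapter consumes only its own output queue, and each Input Pull Adapter can be triggered independently of the others.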

1.4 Services

1.4.1 Message Query Service

The Message Query Service provides an interface to query sent and received messages. Figure 8 presents the internal structure of the Message Query Service. The Query Handler is responsible for handling queries and executing them in the database. This service can be used to query all internal and external messages that have been handled by SmartCom.

1.4.2 Message Info Service

The Message Info Service provides information about messages based on their type and subtype. It is used by peers to get information on how to interpret a message and how to respond. Furthermore, it provides a human-readable description of the message’s structure and contents, as well as its semantic meaning and relation to other messages. This service could also be extended to return a machine-readable explanation of how to interpret the message. The prototype provides a simple textual description that the worker can fetch to interpret the message semantics, especially with respect to related messages. The service maintains a database to store the message information, which is updated through platform components. Figure 9 shows the internal structure of the Message Info Service and how it is connected to the queues. The service can be used by a peer either by sending a message info request (see Section 2.3) to an adapter or by invoking a REST service that provides the corresponding data. Since this data is application specific, it has to be provided by the application using the Communication API (see Section 2.4.2).

[Figure 9: Concept of the Message Info Service. Elements shown: System Component (e.g., Task Execution Engine), Message Info Service API (REST), Request Handler, Control Queue, MIS Queue, Message Information.]

2 Messages

SmartCom exchanges messages with platform components, with the adapters, and with some internal components (i.e., the Authentication Manager and the Message Info Service). The following sections describe what these messages look like and how they are routed, and discuss some predefined message types. Note that further message types and subtypes can be defined by programmers of applications for a HDACAS. The semantics of such message types and subtypes depend on the application that created them.

2.1 Message Structure

The following section presents the structure of messages that are used within SmartCom. The structure is quite similar to the FIPA ACL Message Structure [1], but some properties have been removed and others added to fit the requirements of SmartCom. Each message consists of several mandatory and optional fields. The most important fields of a message are the Id of the message, the sender, the type and subtype. These and further fields are discussed and described in Table 1. Listing 1 outlines a simple message containing instructions for a task in the JSON format.

    {
      "id": "2837",
      "type": "TASK",
      "subtype": "REQUEST",
      "sender": "peer291",
      "receiver": "peer2734",
      "conversation-id": "18475",
      "content": "Check the status of system 32"
    }

Listing 1: Example message with instructions for a task

After receiving a message, Output Adapters are responsible for transforming it into the appropriate technology-related and peer-understandable representation and for sending it over a communication channel. Input Adapters, on the other hand, are responsible for transforming received messages from a technology-related format (e.g., email) into an internal message. Messages that are related to a specific execution of an application are required to have an execution-dependent conversation-Id; otherwise it is not possible to associate a message with the corresponding execution. Note that SmartCom does not use the conversation-Id internally; this functionality has to be provided by a platform component.

2.2 Routing of Messages

The routing of messages is handled by the Messaging and Routing Manager according to rules based on the message’s type, subtype, receiver and sender. The order (type, subtype, receiver, sender) also defines the priority, which means that the type has the highest priority and the sender the lowest. Further information on routing can be found in Section 1.2.3.
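One possible reading of this priority order is sketched below in Java. The details of rule matching are given in Section 1.2.3 and are not reproduced here, so the sketch rests on assumptions: a rule field that is null acts as a wildcard, and among all matching rules the one that constrains the higher-priority fields wins. The class `RoutingSketch` and its members are hypothetical names.

```java
import java.util.List;

/** Illustrative rule selection by field priority; an assumption, not SmartCom's algorithm. */
final class RoutingSketch {

    /** A routing rule; null fields are treated as wildcards (assumption). */
    static final class Rule {
        final String type;
        final String subtype;
        final String receiver;
        final String sender;
        final String route; // destination the message is forwarded to

        Rule(String type, String subtype, String receiver, String sender, String route) {
            this.type = type;
            this.subtype = subtype;
            this.receiver = receiver;
            this.sender = sender;
            this.route = route;
        }

        boolean matches(String msgType, String msgSubtype, String msgReceiver, String msgSender) {
            return fieldMatches(type, msgType)
                    && fieldMatches(subtype, msgSubtype)
                    && fieldMatches(receiver, msgReceiver)
                    && fieldMatches(sender, msgSender);
        }

        /** Type weighs most, sender least, following the order given in the text. */
        int priority() {
            return (type != null ? 8 : 0)
                    + (subtype != null ? 4 : 0)
                    + (receiver != null ? 2 : 0)
                    + (sender != null ? 1 : 0);
        }
    }

    private static boolean fieldMatches(String ruleValue, String actual) {
        return ruleValue == null || ruleValue.equals(actual);
    }

    /** Returns the route of the best matching rule or null if no rule matches. */
    static String route(List<Rule> rules, String type, String subtype,
                        String receiver, String sender) {
        Rule best = null;
        for (Rule rule : rules) {
            if (rule.matches(type, subtype, receiver, sender)
                    && (best == null || rule.priority() > best.priority())) {
                best = rule;
            }
        }
        return best == null ? null : best.route;
    }
}
```

With the weights 8, 4, 2 and 1, a rule that constrains the type always outranks any combination of rules that constrain only lower-priority fields, which is one way to realise "type has the highest priority and the sender the lowest".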


| Field | Description |
|---|---|
| Type | This field defines the high-level purpose of the message (e.g., control message, input message, metrics message, etc.). This field is especially important for the routing of messages within the system. |
| Subtype | This field is defined by the component that is in charge of the message (i.e., it is component specific). The subtype combined with the type of the message defines the purpose of the message. The subtype can also be used by programmers of applications to define custom message types for their application. |
| Message-Id | A globally unique identifier is assigned to every message within the system by the Messaging and Routing Manager. |
| Sender-Id | The sender-Id specifies the sender of the message (can be a component, peer, etc.). Sender-Ids are unique within the system. They are either predefined in the case of an internal component or assigned by a platform component. |
| Receiver-Id (o) | The receiver-Id specifies the receiver of the message (can be a component, peer, collective). Can also be empty if the receiver is not clear. |
| Conversation-Identifier (o) | Denotes the system identifier for the conversation. This identifier can be used by platform components to map the message to the actual execution instance of an application. For example: application A is executed twice at the same time, as A1 and A2. The conversation-Id is used to associate the messages with the right execution, A1 or A2. If there is no conversation (e.g., for internal messages), the conversation-Id can also be empty. |
| Content (o) | Defines the content of the message, including instructions and data that are needed to execute the message. This can be empty in the case of simple messages (e.g., acknowledge messages). |
| TTL (o) | Time to live. Defines a time interval in which a message is valid. For example: a peer has one hour to post pictures in a folder of an FTP server; after this time SmartCom stops looking for pictures in the folder and creates an error message if there are no pictures. |
| Language (o) | Denotes the language of the message. This can be a natural language, like English or German, as well as a computer format like binary. This field is initially intended for logging and debugging purposes. In future versions a translation service could be introduced that makes use of this field. |
| Security-Token (o) | The security token can be used to guarantee the authenticity of messages or to encrypt the content of the message. |
| Delivery-Policy (o) | Specifies the delivery policy of the message. This field can be used to specify whether the sender wants an acknowledgement in case of a successful sending of the message. |
| RefersTo (o) | This field can be used to specify that this message refers to another message. |

Table 1: Structure of messages. Optional fields are marked with (o).

2.3 Predefined Messages

These messages are needed for special purposes, like authentication, or to indicate specific behavior (e.g., an acknowledged message) or exceptional cases and errors. The following sections describe these predefined messages and define their intended usage in the system. The subtypes of the messages are given in the corresponding table rows in brackets and in capital letters.

2.3.1 Control Messages

Control messages are exchanged within SmartCom and are exposed to the application. Control messages are always indicated by the message type CONTROL. Their intention is to indicate specific control behavior (e.g., acknowledgement of a message) or exceptions during the communication. Table 2 presents the various subtypes.

| Message | Description |
|---|---|
| Acknowledge (ACK) | This message is sent by the output adapter if the message has been successfully sent to the peer. Note that this does not imply the peer’s acceptance of the contents of the message, but is used to implement functionalities such as read receipts. This message is not sent if the programmer requires a fire-and-forget sending behavior (i.e., she does not care whether the message has actually been delivered). |
| Error (ERROR) | An error message that indicates a generic error. This message is handled based on the routing rules. |
| Communication Error (COMERROR) | This error message indicates an error during the communication. This is reported to the sender of the initial message. |
| Timeout (TIMEOUT) | This message indicates that a timeout has occurred in the system and that the message could not be delivered in time or there was no response within a certain time. |

Table 2: Predefined subtypes of Control Messages.

2.3.2 Message Info Messages

These messages are handled by the Message Info Service (see Section 1.4.2) and are intended for requests of message information by peers over dedicated input adapters and for the replies to such requests. All such messages are required to have the message type MESSAGEINFO. Table 3 presents the two subtypes of message info messages.

| Message | Description |
|---|---|
| Message Info Request (REQUEST) | Request by a peer to the Message Info Service for information on how to interpret and handle a given message based on its type and subtype. |
| Message Info Response (REPLY) | Response of the Message Info Service to a peer that contains information on how to interpret and handle a given message. |

Table 3: Message Info Request and Reply Messages.

2.3.3 Authentication Messages

Authentication messages are used to authenticate a peer in the system and to provide it with a security token that is valid for a specific time period (internally called session). Such messages are handled by the Authentication Manager (see Section 1.2.2), which interacts with a platform component to verify the identity of a peer. Authentication messages always have the type AUTH. AuthenticationRequest messages are sent by peers to the system, whereas the three other messages (AuthenticationResponse, AuthenticationFailed, AuthenticationError) are sent back from SmartCom to the peer. Table 4 describes the subtypes.

| Message | Description |
|---|---|
| Authentication Request (REQUEST) | Authentication request message of a peer that contains its credentials. The Authentication Manager queries a platform component to verify the peer’s credentials. After successful verification, a security token is created and sent to the peer. |
| Authentication Response (REPLY) | Response message from SmartCom to the peer for an authentication request message. It contains a security token that can be used in further requests to verify the identity of a peer. |
| Authentication Failed (FAILED) | Special response message from SmartCom to the peer for an authentication request message, indicating that the authentication failed. The purpose of this message is to distinguish a failed authentication from an authentication error on the basis of the message’s subtype. |
| Authentication Error (ERROR) | Special response message from SmartCom to the peer for an authentication request message, indicating that there was an error during the authentication of the peer. Such an error might be, for example, that no external platform component is available that can verify the credentials. |

Table 4: Authentication Messages.
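The distinction between the three response subtypes can be sketched as follows in Java. The sketch is illustrative only: `AuthSketch`, `CredentialVerifier` and `Response` are hypothetical names, the verifier stands in for the external platform component, a random UUID stands in for the security token, and the session lifetime of the token is left out.

```java
import java.util.UUID;

/** Illustrative mapping of authentication outcomes to message subtypes; not SmartCom's code. */
final class AuthSketch {

    /** Subtypes of messages with type AUTH, as listed in Table 4. */
    enum Subtype { REQUEST, REPLY, FAILED, ERROR }

    /** Hypothetical stand-in for the platform component that verifies credentials. */
    interface CredentialVerifier {
        boolean verify(String username, String password) throws Exception;
    }

    /** Reduced response message: the subtype plus a token that is only set for REPLY. */
    static final class Response {
        final Subtype subtype;
        final String securityToken;

        Response(Subtype subtype, String securityToken) {
            this.subtype = subtype;
            this.securityToken = securityToken;
        }
    }

    static Response handleRequest(String username, String password, CredentialVerifier verifier) {
        if (verifier == null) {
            // No platform component is available that could verify the credentials.
            return new Response(Subtype.ERROR, null);
        }
        try {
            if (verifier.verify(username, password)) {
                return new Response(Subtype.REPLY, UUID.randomUUID().toString());
            }
            // The credentials were checked and rejected.
            return new Response(Subtype.FAILED, null);
        } catch (Exception e) {
            // The verification itself could not be carried out.
            return new Response(Subtype.ERROR, null);
        }
    }
}
```

A peer can thus tell from the subtype alone whether retrying with other credentials makes sense (FAILED) or whether the problem lies on the platform side (ERROR).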

2.4 Application Programming Interfaces (APIs)

The following section takes a look at the API of SmartCom. First, we examine the public entities that are needed to interact with the system. Afterwards, we take a look at the callback entities that SmartCom needs in order to obtain the information required for the communication. Finally, the interfaces and their methods are described in detail to give an understanding of how to interact with the system.

2.4.1 Data Structures

Table 5 presents the data structures that are exchanged between SmartCom and the platform components. They are mainly used by the public entities described in Section 2.4.2 and the callback entities described in Section 2.4.3.

2.4.2 Public Entities

Table 6 describes the interfaces that are exposed by SmartCom to clients. These entities are required to interact with the system and to receive responses. The interfaces and their methods are described in detail in the following sections.

Communication API

This section discusses the main API for the interaction with peers, collectives, and SmartCom for the purpose of communication. It provides methods to start the interaction with collectives and peers, and also defines methods to extend and manipulate the behavior of SmartCom. Figure 10 presents the Communication API in UML notation.

| Entity | Description |
|---|---|
| Identifier | Defines an identifier object that distinguishes between different types (peer, collective, component, message) and Id combinations. |
| Message | Message that is exchanged between applications, SmartCom and peers. There are also internal messages that are just handled between SmartCom components, or between applications and SmartCom components. See Section 2 for details. |
| RoutingRule | Defines a rule of how messages should be handled within SmartCom. This feature can be used to improve the handling of messages and increase the performance. A common use case is that a response message of a specific type from peer A is always sent to peer B. See Section 2.2 for details. |
| PeerChannelAddress | Defines an address for a communication channel of a peer that can be handled by a specific adapter. It contains a list of parameters that can be used by an adapter to contact the peer (e.g., an email address). The number of parameters, their syntax and their semantic meaning depend on the adapter. See Section 1.1 for details. |
| QueryCriteria | An entity that can be used to specify the criteria of a query. It is created using the Message Query Service. After specifying the criteria, a call can be made to query the database. |
| PeerInfo | Provides communication-related information about a specific peer, such as the used communication channels (PeerChannelAddresses), delivery policies defined by the peer, as well as privacy policies that restrict the communication behavior. A peer is identified by an Identifier object. |
| CollectiveInfo | Provides the members of a specific collective as well as the collective’s delivery policy. A collective is identified by an Identifier object. |

Table 5: Domain model and data structures of SmartCom.

| Entity | Description |
|---|---|
| Communication | Main entity that is used for the communication with SmartCom. New messages are sent using this interface, and it also allows new adapters and routing rules to be registered. |
| OutputAdapter | Adapter that is responsible for sending messages to peers. There are two types of OutputAdapters: stateless and stateful adapters. |
| InputPushAdapter | Adapters that receive messages from peers via push communication. |
| InputPullAdapter | Adapters that receive messages from peers via pull communication, i.e., they query the corresponding endpoint at regular intervals. |
| MessageInfoService | Provides information on a specific message, i.e., how to interpret the message and its relationship to other messages. |
| MessageQueryService | Service that allows persisted messages to be queried. |

Table 6: Public entities of SmartCom that are used to interact with the system.

    <<Interface>> Communication
    + send(Message): Identifier
    + addRouting(RoutingRule): Identifier
    + removeRouting(RoutingRule): Identifier
    + addPushAdapter(InputPushAdapter): Identifier
    + addPullAdapter(InputPullAdapter, long): Identifier
    + addPullAdapter(InputPullAdapter, long, boolean): Identifier
    + removeInputAdapter(Identifier): InputAdapter
    + registerOutputAdapter(Class<? extends OutputAdapter>): Identifier
    + removeOutputAdapter(Identifier): void
    + registerNotificationCallback(NotificationCallback): Identifier
    + unregisterNotificationCallback(Identifier): boolean

Figure 10: Communication API

public Identifier send(Message message) throws CommunicationException

Send a message to a collective or a single peer. The method assigns an Id to the message and handles the sending asynchronously, i.e., it returns immediately and does not wait for the sending to succeed or fail. Errors and exceptions that occur thereafter are sent to the Notification Callback API (see Section 2.4.3). Optionally, received acknowledgments are communicated back through the Notification Callback API. The receiver of the message is defined by the message; it can be a peer, a collective, or a component. If the receiver is not set, the message will be sent back to the Notification Callback API immediately.

Parameters
    message - Specifies the message that should be handled by SmartCom. The receiver of the message is defined by the message.
Returns
    The internal Id of SmartCom to track the message within the system.
Throws
    CommunicationException - A generic exception that is thrown if something went wrong in the initial handling of the message.

public Identifier addRouting(RoutingRule rule) throws InvalidRuleException

Add a route to the routing rules (e.g., route input from peer A always to peer B). Returns the Id of the routing rule (can be used to delete it). SmartCom checks whether the rule is valid and throws an exception otherwise.

Parameters
    rule - Specifies the routing rule that should be added to the routing rules of SmartCom.
Returns
    The SmartCom-internal Id of the rule.
Throws
    InvalidRuleException - If the routing rule is not valid (e.g., all fields are null).


public RoutingRule removeRouting(Identifier routeId)

Remove a previously defined routing rule identified by an Id. As soon as the method returns, the routing rule is not applied any more. If there is no rule with the given Id, null is returned.

Parameters
    routeId - The Id of the routing rule that should be removed.
Returns
    The removed routing rule or null if there is no such rule in the system.

public Identifier addPushAdapter(InputPushAdapter adapter)

Adds an input push adapter that waits for push notifications. Returns the Id of the adapter.

Parameters
    adapter - Specifies the input push adapter.
Returns
    The SmartCom-internal Id of the adapter.

public Identifier addPullAdapter(InputPullAdapter adapter, long interval)

Adds an input pull adapter that pulls for updates in a certain time interval. Returns the Id of the adapter. The pull requests are issued in the specified interval until the adapter is explicitly removed from the system.

Parameters
    adapter - Specifies the input pull adapter.
    interval - Interval in milliseconds that specifies when to issue pull requests. Cannot be zero or negative.
Returns
    The SmartCom-internal Id of the adapter.

public Identifier addPullAdapter(InputPullAdapter adapter, long interval, boolean deleteIfSuccessful)

Adds an input pull adapter that pulls for updates in a certain time interval. Returns the Id of the adapter. The pull requests are issued in the specified interval. If deleteIfSuccessful is set to true, the adapter is removed after a successful execution (i.e., a message has been received); it continues to run after an unsuccessful execution.

Parameters
    adapter - Specifies the input pull adapter.
    interval - Interval in milliseconds that specifies when to issue pull requests. Cannot be zero or negative.
    deleteIfSuccessful - Delete this adapter after a successful execution.
Returns
    The SmartCom-internal Id of the adapter.
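The interplay of the interval and the deleteIfSuccessful flag can be sketched as follows. This is a simplified illustration, not the Adapter Manager of SmartCom: `PullSchedulerSketch` is a hypothetical class, the adapter is reduced to a `Supplier<String>` that returns null when nothing is available, the input queue is reduced to a `Consumer<String>`, and the sketch waits one interval after each execution instead of firing at a fixed rate.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Supplier;

/** Illustrative scheduling of pull adapters; not SmartCom's Adapter Manager. */
final class PullSchedulerSketch {

    // Daemon thread so that a forgotten scheduler does not keep the JVM alive.
    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "pull-adapter");
                thread.setDaemon(true);
                return thread;
            });

    /**
     * Executes one pull and forwards a received message to the input queue.
     * Returns true if the adapter should stay scheduled, false if it should be removed.
     */
    static boolean pullOnce(Supplier<String> adapter, boolean deleteIfSuccessful,
                            Consumer<String> inputQueue) {
        String message;
        try {
            message = adapter.get();
        } catch (RuntimeException e) {
            message = null; // assumption: a failed pull counts as an unsuccessful execution
        }
        if (message == null) {
            return true; // nothing received: keep pulling
        }
        inputQueue.accept(message);
        return !deleteIfSuccessful;
    }

    /** Schedules the adapter; the interval must be positive, as required by the API. */
    void addPullAdapter(Supplier<String> adapter, long intervalMillis,
                        boolean deleteIfSuccessful, Consumer<String> inputQueue) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        executor.schedule(() -> {
            boolean keep = pullOnce(adapter, deleteIfSuccessful, inputQueue);
            if (keep && !executor.isShutdown()) {
                // Re-schedule the next pull after another interval.
                addPullAdapter(adapter, intervalMillis, deleteIfSuccessful, inputQueue);
            }
        }, intervalMillis, TimeUnit.MILLISECONDS);
    }

    /** Stops all scheduled pulls. */
    void shutdown() {
        executor.shutdownNow();
    }
}
```

Re-scheduling from inside the task avoids having to cancel a periodic task from within itself, so an adapter that is removed after its first successful pull is simply not scheduled again.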

public InputAdapter removeInputAdapter(Identifier adapterId)

Removes an input adapter from the execution. As soon as this method returns, the adapter with the given Id is not executed any more. It returns the requested input adapter or null if there is no adapter with such an Id in the system.

Parameters
    adapterId - The Id of the adapter that should be removed.
Returns
    The input adapter that has been removed or null if there is no such adapter.

public Identifier registerOutputAdapter(Class<? extends OutputAdapter> adapter) throws CommunicationException

Registers a new type of output adapter that can be used by SmartCom to get in contact with a peer. The output adapters are instantiated by SmartCom on demand. Note that these adapters are required to have an @Adapter annotation, which describes the name and the type of the adapter (stateful or stateless); otherwise an exception is thrown. In the case of a stateless adapter, it is possible that the adapter is instantiated immediately. If any error occurs during the instantiation, an exception is thrown.

Parameters
    adapter - The output adapter that can be used to contact peers.
Returns
    The SmartCom-internal Id of the registered adapter.
Throws
    CommunicationException - If the adapter could not be handled; the specific reason is embedded in the exception.

public void removeOutputAdapter(Identifier adapterId)

Removes a type of output adapter. Adapters that are currently in use are removed as soon as possible (i.e., current executions of communication will not be aborted and waiting messages in the adapter queue are still transmitted).

Parameters
    adapterId - Specifies the adapter that should be removed.

public Identifier registerNotificationCallback(NotificationCallback callback)

Register a notification callback that is called if there are new input messages available.

Parameters
    callback - Callback for notification.
Returns
    The SmartCom-internal Id of the registered notification callback (can be used to remove it).

public boolean unregisterNotificationCallback(Identifier callback)

Unregister a previously registered notification callback.

Parameters
    callback - Id of the notification callback that should be removed.
Returns
    true if the callback could be removed, false otherwise.

Output Adapter API

The Output Adapter API is used to implement an adapter that can send (push) messages to a peer. Therefore, the push method has to be implemented. Output Adapters receive a message from SmartCom, transform this message into the adapter-specific format (e.g., email) and push it to the peer over an external communication channel (e.g., send the message to a web platform or a mobile application). As described in Section 1.1, there are Stateless Output Adapters and Stateful Output Adapters. Stateless adapters are required to have a default constructor (no parameters), whereas stateful adapters can have a default constructor or a constructor with a single parameter of type PeerChannelAddress. Stateful Output Adapters are created on demand by SmartCom. Figure 11 presents the Output Adapter API in UML notation.

    <<Interface>> OutputAdapter
    + push(Message, PeerChannelAddress): void

Figure 11: Output Adapter API

public void push(Message message, PeerChannelAddress address) throws AdapterException

Push a message to the peer. This method defines the handling of the actual communication between the platform and the peer.

Parameters
    message - Message that should be sent to the peer.
    address - The address of the peer and adapter-specific contact parameters.
Throws
    AdapterException - If an exception occurred during the sending of a message.
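A stateless output adapter could look roughly like the following Java sketch. The nested types `Message`, `PeerChannelAddress`, `AdapterException` and `OutputAdapter` are reduced, hypothetical stand-ins for the SmartCom types of the same name, the `@Adapter` annotation is omitted, and the static `OUTBOX` list stands in for an external mail system, so that the sketch stays self-contained.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative stateless output adapter with stand-in types; not SmartCom's actual API. */
final class OutputAdapterSketch {

    /** Reduced stand-in for a SmartCom message. */
    static final class Message {
        final String type;
        final String subtype;
        final String content;

        Message(String type, String subtype, String content) {
            this.type = type;
            this.subtype = subtype;
            this.content = content;
        }
    }

    /** Reduced stand-in: a list of adapter-specific contact parameters. */
    static final class PeerChannelAddress {
        final List<String> contactParameters;

        PeerChannelAddress(List<String> contactParameters) {
            this.contactParameters = contactParameters;
        }
    }

    static final class AdapterException extends Exception {
        AdapterException(String reason) {
            super(reason);
        }
    }

    interface OutputAdapter {
        void push(Message message, PeerChannelAddress address) throws AdapterException;
    }

    /** Stand-in for the external channel (e.g., a mail server). */
    static final List<String> OUTBOX = Collections.synchronizedList(new ArrayList<>());

    /** Stateless adapter: default constructor and no per-peer state. */
    static final class PlainTextMailAdapter implements OutputAdapter {

        PlainTextMailAdapter() {
        }

        @Override
        public void push(Message message, PeerChannelAddress address) throws AdapterException {
            if (address.contactParameters.isEmpty()) {
                throw new AdapterException("no email address given");
            }
            // Assumption: the first contact parameter is the recipient's email address.
            String recipient = address.contactParameters.get(0);
            // Transform the internal message into the channel-specific representation.
            String mail = "To: " + recipient + "\n"
                    + "Subject: " + message.type + "/" + message.subtype + "\n\n"
                    + message.content;
            OUTBOX.add(mail);
        }
    }
}
```

Because the adapter receives the PeerChannelAddress with every call, a single instance can serve all peers, which is what distinguishes it from a stateful adapter that is created per address.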

Input Push Adapter API

The Input Push Adapter API is used to implement an adapter for a communication channel that uses push to get notified of new messages. The concrete implementation has to extend the InputPushAdapter class, which provides methods that support the implementation of the adapter. The external tool or peer pushes the message to the adapter, which transforms the message into the internal format and calls the publishMessage method of the InputPushAdapter class. This method delegates the message to the corresponding queue and subsequently to the correct component of the system that handles input messages. The adapter has to start a handler for the push notification (e.g., a server socket or a handler that uses long polling) in its init method and remove this handler in the cleanUp method.

    <<Abstract>> InputPushAdapter
    + publishMessage(Message): void
    # schedule(PushTask): void
    # cleanUp(): void
    # init(): void

Figure 12: Input Push Adapter API

public void init()

Method that can be used to initialize the adapter and other handlers like a push notification handler (if needed). For example, it can create a server socket that listens for connections on a specific port.

public void cleanUp()

Clean up resources that have been used by the adapter. Tasks scheduled using the schedule(PushTask) method have already been marked for cancellation when this method is called.

protected void publishMessage(Message message)

Publish a message that has been received. This method has to be called when implementing a push service to notify SmartCom that there was a new message.

Parameters
    message - Message that has been received.

protected void schedule(PushTask task)

Schedule a push task that is executed in the context of the adapter. This method should be used to reduce the resource consumption of push adapters by using an executor service. Using this method also guarantees the clean removal of adapters from the execution.

Parameters
    task - Task that should be scheduled.
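The life cycle of a push adapter (init, publishMessage, cleanUp) can be sketched as follows in Java. The base class `PushAdapterBase` is a reduced, hypothetical stand-in for InputPushAdapter that leaves out schedule(PushTask); `ChatAdapter`, its `onChatLine` method and the assumed "sender: text" line format are invented for illustration only.

```java
import java.util.concurrent.BlockingQueue;

/** Illustrative push adapter with a reduced stand-in base class; not SmartCom's actual API. */
final class InputPushSketch {

    /** Reduced stand-in for an internal message. */
    static final class Message {
        final String sender;
        final String content;

        Message(String sender, String content) {
            this.sender = sender;
            this.content = content;
        }
    }

    /** Stand-in for InputPushAdapter: forwards published messages to the input queue. */
    abstract static class PushAdapterBase {
        private final BlockingQueue<Message> inputQueue;

        PushAdapterBase(BlockingQueue<Message> inputQueue) {
            this.inputQueue = inputQueue;
        }

        protected final void publishMessage(Message message) {
            inputQueue.add(message);
        }

        public abstract void init();

        public abstract void cleanUp();
    }

    /** Hypothetical adapter: an external chat tool pushes lines such as "peer291: done". */
    static final class ChatAdapter extends PushAdapterBase {
        private volatile boolean listening;

        ChatAdapter(BlockingQueue<Message> inputQueue) {
            super(inputQueue);
        }

        @Override
        public void init() {
            listening = true; // a real adapter would open its socket or subscription here
        }

        @Override
        public void cleanUp() {
            listening = false; // a real adapter would release its resources here
        }

        /** Called by the external tool; transforms the raw line into the internal format. */
        void onChatLine(String rawLine) {
            if (!listening) {
                return;
            }
            int separator = rawLine.indexOf(':');
            if (separator < 0) {
                return; // not in the assumed "sender: text" format
            }
            String sender = rawLine.substring(0, separator).trim();
            String content = rawLine.substring(separator + 1).trim();
            publishMessage(new Message(sender, content));
        }
    }
}
```

The adapter never touches the routing itself: it only converts the channel-specific input and hands the result to publishMessage, which corresponds to the delegation to the input queue described above.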

Input Pull Adapter API The Input Pull Adapter is dedicated to pull messages from external tools or peers. For example, it can query a FTP server if there is a new file available. Instances of input adapters are always related to a single application and therefore in the context of the application, because their semantics depend on the application. Each Input Pull Adapter is executed by a single Adapter Execution of the Adapter Manager (see Section 1.2.1), which is responsible to call the pull method in certain intervals. Input Pull Adapters are created by applications and therefore provided with the initialization parameters by the application itself, implying a stateful adapter. Having a stateful pull adapter has some advantages: â&#x20AC;˘ The state of the communication (e.g., the corresponding execution Id of input messages) is always saved in the adapter and there is no need to save it in the Adapter Manager. â&#x20AC;˘ race conditions due to the parallel execution of a single adapter are not possible be-

cause each adapter is only executed by a single thread. Therefore, no synchronization has to be applied to the adapter.

38 of 57

http://www.smart-society-project.eu

c SmartSociety Consortium 2013-2017

Deliverable 7.1

• The pull method does not require any parameters. Specific settings for adapters (e.g., an URL) can be set during the instantiation of the adapter and there is no

need for a dirty parameter passing to a stateless adapter (e.g., a map or list of objects/strings). This approach also has some disadvantages: • Input Pull Adapters have to be created by a platform component or on higher levels (e.g., at the programming level).

• There might be a problem if too many adapters are running at the same time due to the amount of resources (i.e., memory) or required execution time. Due to the design of the Adapter Manager the Adapter Execution Engine could run on multiple machines which would eliminate or at least reduce this problem. • Adapters have to be cleaned up properly by the creator of the adapter

<<Interface>> InputPullAdapter
+ pull(): Message

Figure 13: Input Pull Adapter API

public Message pull() throws AdapterException

Pull data from a predefined location. If there is no data available, null is returned.

Returns
    A new message, or null if there is no new information.
Throws
    AdapterException - If an exception occurred during the pull operation.
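The following is a minimal sketch of a stateful pull adapter, not the SmartCom implementation. Only the pull() signature is taken from the text; Message, AdapterException, QueuePullAdapter and PullAdapterDemo are simplified, hypothetical stand-ins, and an in-memory queue replaces the external location (such as an FTP server). The demo assumes that a single-threaded scheduler plays the role of the Adapter Execution.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Simplified stand-ins for SmartCom types (assumptions, not the real classes).
class Message {
    final String content;
    final String executionId;

    Message(String content, String executionId) {
        this.content = content;
        this.executionId = executionId;
    }
}

class AdapterException extends Exception {
    AdapterException(String message) {
        super(message);
    }
}

// Signature as documented in the text.
interface InputPullAdapter {
    Message pull() throws AdapterException;
}

// Hypothetical stateful adapter: settings and conversation state are fixed
// at construction time, so pull() needs no parameters.
class QueuePullAdapter implements InputPullAdapter {
    private final Queue<String> source;   // stands in for an external location
    private final String executionId;     // state kept inside the adapter

    QueuePullAdapter(Queue<String> source, String executionId) {
        this.source = source;
        this.executionId = executionId;
    }

    @Override
    public Message pull() throws AdapterException {
        String item = source.poll();      // null if nothing new is available
        return item == null ? null : new Message(item, executionId);
    }
}

class PullAdapterDemo {
    public static void main(String[] args) throws InterruptedException {
        Queue<String> files = new ConcurrentLinkedQueue<>();
        files.add("report.csv");
        InputPullAdapter adapter = new QueuePullAdapter(files, "exec-1");

        // One thread per adapter: no synchronization is needed inside the adapter.
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleWithFixedDelay(() -> {
            try {
                Message m = adapter.pull();
                if (m != null) {
                    System.out.println(m.executionId + ": " + m.content);
                }
            } catch (AdapterException e) {
                System.err.println("pull failed: " + e.getMessage());
            }
        }, 0, 100, TimeUnit.MILLISECONDS);

        Thread.sleep(350);
        executor.shutdownNow();           // the creator cleans up the adapter
    }
}
```

Because the execution Id is stored in the adapter, the scheduler only has to call pull() and does not need to track any per-adapter state.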

Message Info Service API
The Message Info Service provides information about the semantics of messages, how to interpret them in a human-readable way and which messages are related to a message. Therefore, it provides methods to query message information and to add additional information to messages.

<<Interface>> MessageInfoService
+ getInfoForMessage(Message): MessageInformation
+ addMessageInfo(Message, MessageInformation): void

Figure 14: Message Info Service API

public MessageInformation getInfoForMessage(Message message) throws UnknownMessageException

Returns information about a given message to the caller. This includes how the message has to be interpreted, how it is related to other messages and which messages are expected in response to this message.

Parameters
    message - Instance of a message. It must contain at least the message Id or the message type; other parameters are optional and are used as a template.
Returns
    The information about the given message.
Throws
    UnknownMessageException - If no message of that type is found or the Id of the message is not valid.


public void addMessageInfo(Message message, MessageInformation info)

Add information on a given message. If there already exists information for a message, it is replaced by this one.

Parameters
    message - Specifies the message.
    info - Information for messages of the type of parameter message.
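A minimal in-memory sketch of this service is shown below. The two method signatures follow the text; everything else is an assumption made for illustration: the fields of Message and MessageInformation, the class InMemoryMessageInfoService, and the lookup order (message Id first, then message type).

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins; the field names are hypothetical.
class Message {
    final String id;    // may be null
    final String type;  // may be null

    Message(String id, String type) {
        this.id = id;
        this.type = type;
    }
}

class MessageInformation {
    final String purpose; // human-readable interpretation of the message

    MessageInformation(String purpose) {
        this.purpose = purpose;
    }
}

class UnknownMessageException extends Exception {
    UnknownMessageException(String message) {
        super(message);
    }
}

// Signatures as documented in the text.
interface MessageInfoService {
    MessageInformation getInfoForMessage(Message message) throws UnknownMessageException;

    void addMessageInfo(Message message, MessageInformation info);
}

class InMemoryMessageInfoService implements MessageInfoService {
    private final Map<String, MessageInformation> byType = new HashMap<>();
    private final Map<String, MessageInformation> byId = new HashMap<>();

    @Override
    public synchronized void addMessageInfo(Message message, MessageInformation info) {
        // put() overwrites, so existing information is replaced as documented.
        if (message.type != null) {
            byType.put(message.type, info);
        }
        if (message.id != null) {
            byId.put(message.id, info);
        }
    }

    @Override
    public synchronized MessageInformation getInfoForMessage(Message message)
            throws UnknownMessageException {
        MessageInformation info = null;
        if (message.id != null) {
            info = byId.get(message.id);
        }
        if (info == null && message.type != null) {
            info = byType.get(message.type);
        }
        if (info == null) {
            throw new UnknownMessageException("no information for this message");
        }
        return info;
    }
}
```

The message passed to getInfoForMessage acts as a template: it is enough to fill in either the Id or the type.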

Message Query Service API
This service can be used to query the logged messages that have been handled by the system. All internal and external messages are logged by the Messaging and Routing Manager. To query the service, a QueryCriteria object has to be used, which both specifies and executes the query.

<<Interface>> MessageQueryService
+ createQuery(): QueryCriteria

Figure 15: Message Query Service API

public QueryCriteria createQuery()

Creates a query object that can be used to specify the criteria for the query.

Returns
    A query criteria object that can be used to specify parameters and execute the query.

2.4.3 Callback Entities

Callback entities are used by the system to interact with platform components that are not part of SmartCom, but that SmartCom communicates with and depends on for specific features. The corresponding components have to implement the callbacks so that SmartCom can communicate with them. Table 7 presents an overview of the available callback entities. These entities are described in detail in the following sections.

Entity                       Description
PeerAuthenticationCallback   The Peer Authentication Callback is used by the system to verify the identity of a peer (used for authentication) and to provide security functionalities.
PeerInfoCallback             The Peer Info Callback is used to resolve information about a peer. This information does not change very often but is queried quite frequently; retrieved data should therefore be cached whenever the callback cannot provide the required throughput.
CollectiveInfoCallback       The Collective Info Callback is used by SmartCom to resolve the peers that are in a collective. This information cannot be stored in SmartCom because it changes frequently; two consecutive calls might not result in the same response.
NotificationCallback         The Notification Callback is used by SmartCom to notify a platform component about messages that are not intended to be handled by SmartCom. This includes messages like task results or task-related information like communication errors.

Table 7: Callback entities of SmartCom.

Peer Authentication Callback API
This callback is used to authenticate a peer within SmartCom, because such information is not stored within the system but is provided by a platform component that implements this interface. After a successful authentication, a session should be created to avoid calling this callback too often, since its performance impact cannot be foreseen.

<<Interface>> PeerAuthenticationCallback
+ authenticate(Identifier, String): boolean

Figure 16: Peer Authentication Callback API

public boolean authenticate(Identifier peerId, String password) throws PeerAuthenticationException

Authenticates a peer, i.e., checks if the provided credentials match the peer's credentials in the system.

Parameters
    peerId - Id of the peer.
    password - Password of the peer.
Returns
    true if the credentials are valid, false otherwise.
Throws
    PeerAuthenticationException - If an error occurs during the authentication.
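The recommendation to create a session after a successful authentication could look roughly like the sketch below. The authenticate signature is taken from the text; SessionAuthenticator, its token format, the time-to-live and the injectable clock are assumptions made for illustration, and Identifier is reduced to a simple wrapper.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Simplified stand-in for the SmartCom Identifier type.
final class Identifier {
    final String value;

    Identifier(String value) {
        this.value = value;
    }
}

class PeerAuthenticationException extends Exception {
    PeerAuthenticationException(String message) {
        super(message);
    }
}

// Signature as documented in the text.
interface PeerAuthenticationCallback {
    boolean authenticate(Identifier peerId, String password) throws PeerAuthenticationException;
}

// Hypothetical helper: calls the callback once per login and answers
// later checks from a local session table.
class SessionAuthenticator {
    private final PeerAuthenticationCallback callback;
    private final long ttlMillis;
    private final LongSupplier clock;
    private final Map<String, Long> expiryByToken = new ConcurrentHashMap<>();

    SessionAuthenticator(PeerAuthenticationCallback callback, long ttlMillis, LongSupplier clock) {
        this.callback = callback;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns a session token, or null if the credentials are rejected. */
    String login(Identifier peerId, String password) throws PeerAuthenticationException {
        if (!callback.authenticate(peerId, password)) {
            return null;
        }
        String token = UUID.randomUUID().toString();
        expiryByToken.put(token, clock.getAsLong() + ttlMillis);
        return token;
    }

    /** Checks a session without invoking the callback. */
    boolean isSessionValid(String token) {
        if (token == null) {
            return false;
        }
        Long expiry = expiryByToken.get(token);
        if (expiry == null) {
            return false;
        }
        if (clock.getAsLong() >= expiry) {
            expiryByToken.remove(token);   // expired sessions are dropped lazily
            return false;
        }
        return true;
    }
}
```

In production code the clock would typically be System::currentTimeMillis; passing it in keeps the expiry logic testable.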

Peer Info Callback API
This callback is used to resolve information about a peer, the so-called PeerInfo. This information does not change very often but is queried quite frequently; retrieved data should therefore be cached whenever the callback cannot provide the required throughput.

<<Interface>> PeerInfoCallback
+ getPeerInfo(Identifier): PeerInfo

Figure 17: Peer Info Callback API

public PeerInfo getPeerInfo(Identifier id) throws NoSuchPeerException

Resolves the information about a given peer (e.g., provides the address and the adapter that should be used).

Parameters
    id - Id of the requested peer.
Returns
    Information about a peer, such as the communication channel addresses and the preferred delivery policy.
Throws
    NoSuchPeerException - If there exists no such peer.
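The caching advice can be sketched as a decorator around the callback. The getPeerInfo signature comes from the text; CachingPeerInfoCallback, the invalidate method and the reduced PeerInfo, Identifier and DeliveryPolicy types (written here with underscores) are assumptions for illustration. The sketch has no expiry, so a real deployment would still need some invalidation strategy.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

enum DeliveryPolicy { TO_ALL_CHANNELS, AT_LEAST_ONE, PREFERRED }

// Simplified stand-ins for SmartCom types.
final class Identifier {
    final String value;

    Identifier(String value) {
        this.value = value;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Identifier && ((Identifier) o).value.equals(value);
    }

    @Override
    public int hashCode() {
        return value.hashCode();
    }
}

class PeerInfo {
    final List<String> addresses;
    final DeliveryPolicy deliveryPolicy;

    PeerInfo(List<String> addresses, DeliveryPolicy deliveryPolicy) {
        this.addresses = addresses;
        this.deliveryPolicy = deliveryPolicy;
    }
}

class NoSuchPeerException extends Exception {
    NoSuchPeerException(String message) {
        super(message);
    }
}

// Signature as documented in the text.
interface PeerInfoCallback {
    PeerInfo getPeerInfo(Identifier id) throws NoSuchPeerException;
}

// Hypothetical decorator that answers repeated lookups from memory.
class CachingPeerInfoCallback implements PeerInfoCallback {
    private final PeerInfoCallback delegate;
    private final Map<Identifier, PeerInfo> cache = new ConcurrentHashMap<>();

    CachingPeerInfoCallback(PeerInfoCallback delegate) {
        this.delegate = delegate;
    }

    @Override
    public PeerInfo getPeerInfo(Identifier id) throws NoSuchPeerException {
        PeerInfo cached = cache.get(id);
        if (cached != null) {
            return cached;
        }
        PeerInfo fresh = delegate.getPeerInfo(id); // failures are not cached
        if (fresh != null) {
            cache.put(id, fresh);
        }
        return fresh;
    }

    /** Drops a cached entry, e.g. after the peer changed its addresses. */
    void invalidate(Identifier id) {
        cache.remove(id);
    }
}
```

The same pattern would be a poor fit for the CollectiveInfoCallback, because collective membership changes frequently.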

Collective Info Callback API
This API provides information about the composition and the state of the collectives to SmartCom, so that SmartCom can let platform components address their messages at the collective level.

<<Interface>> CollectiveInfoCallback
+ getCollectiveInfo(Identifier): CollectiveInfo

Figure 18: Collective Info Callback API

public CollectiveInfo getCollectiveInfo(Identifier collective) throws NoSuchCollectiveException

Resolves and returns the members of the collective with the given Id.

Parameters
    collective - The Id of the collective.
Returns
    A list of peer Ids that are part of the collective and other collective-related information.
Throws
    NoSuchCollectiveException - If there exists no such collective.

Notification Callback API
The Notification Callback is used to inform the platform components about messages that arrived for them (e.g., task results or other task-related information like an error) or about messages whose receiver could not be determined. Since SmartCom does not save any conversational state, it cannot determine the right recipient if multiple platform components implement the Notification Callback API. Therefore, these components are required to be capable of handling (filtering) unexpected messages.

<<Interface>> NotificationCallback
+ notify(Message): void

Figure 19: Notification Callback API

public void notify(Message message)

Notifies the corresponding callback about task results or task-related information like an error.

Parameters
    message - The received message.
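Since every registered callback may receive messages meant for another component, an implementation has to filter. The sketch below shows one possible way to do this; only the notify signature is taken from the text. The conversationId field, the TaskResultCollector class and the idea of filtering by conversation are assumptions for illustration.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Simplified stand-in; conversationId is a hypothetical field.
class Message {
    final String conversationId;
    final String content;

    Message(String conversationId, String content) {
        this.conversationId = conversationId;
        this.content = content;
    }
}

// Signature as documented in the text.
interface NotificationCallback {
    void notify(Message message);
}

// Hypothetical platform component that only accepts messages belonging
// to conversations it started itself.
class TaskResultCollector implements NotificationCallback {
    private final Set<String> expectedConversations = ConcurrentHashMap.newKeySet();
    private final List<Message> accepted = new CopyOnWriteArrayList<>();

    /** Registers a conversation whose results this component wants to see. */
    void expect(String conversationId) {
        expectedConversations.add(conversationId);
    }

    @Override
    public void notify(Message message) {
        if (message == null || message.conversationId == null
                || !expectedConversations.contains(message.conversationId)) {
            return; // unexpected message: another component may be the recipient
        }
        accepted.add(message);
    }

    List<Message> acceptedMessages() {
        return accepted;
    }
}
```

Silently ignoring unknown messages keeps several components registered on the same callback from interfering with each other.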

2.5 Algorithms

This section presents some important algorithms of SmartCom in pseudocode. They are needed for the creation and handling of adapters, as well as for the handling and routing of messages in SmartCom.

2.5.1 Creation of Output Adapters

Algorithm 2.1 describes how Output Adapters are created/instantiated in the Adapter Manager (see Section 1.2.1) based on a peer's delivery policy and the provided addresses of communication channels. The algorithm prefers Stateless Output Adapters over Stateful Output Adapters because they have a smaller impact on the performance of the system: all, or at least multiple, peers share one Stateless Output Adapter, which lowers resource usage. In contrast to Stateful Output Adapters, Stateless Output Adapters are instantiated immediately after their registration in the system and are therefore not instantiated by this algorithm. Stateful Output Adapters, on the other hand, are instantiated per peer and on demand, because they have a higher resource usage than Stateless Output Adapters.

Depending on the chosen delivery policy, the algorithm either instantiates a single adapter (delivery policy PREFERRED) or multiple adapters (delivery policies TO ALL CHANNELS and AT LEAST ONE). Note that the ordering of addresses defines the preference of communication channels (and therefore adapters) of a peer. Also note that TO ALL CHANNELS means all available channels; it is therefore not an error if no adapter registered in the system can handle a specific address. The intended meaning is that sending to all available communication channels has to succeed. The same applies to adapters that could not be instantiated due to an error.

Finally, the algorithm returns the internal Ids of adapters, which are required by the Messaging and Routing Manager and the Message Broker to send messages to peers. If no adapters have been found, the returned list is empty.

Function createAdapterInstances is
    input : Peer information (peerInfo)
    output: Identifiers of the created adapters

    addresses = peerInfo.addresses;
    policy = peerInfo.deliveryPolicy;

    for all address in addresses do
        if there is a stateless adapter instance available for this address then
            add the address of the stateless instance to the result list;
            if policy == PREFERRED then
                return result list;
            else
                continue with next address;
            end
        else if there is a stateful adapter implementation for this address then
            if there is already an instance of that adapter for this peer then
                add the address to the result list;
                if policy == PREFERRED then
                    return result list;
                else
                    continue with next address;
                end
            end

            instantiate new stateful adapter with a unique ID;
            add new instance to the instances of stateful adapters;
            add new instance to the result list;

            if policy == PREFERRED then
                return result list;
            else
                continue with next address;
            end
        else
            log that there was an unknown adapter;
        end
    end
    return result list;
end

Algorithm 2.1: Creation of adapter instances for a peer based on the peer's delivery policy.
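Algorithm 2.1 can be approximated in Java as follows. This is a sketch under simplifying assumptions, not the Adapter Manager code: adapters are represented only by their Ids, an address is matched to an adapter by a hypothetical type string, and AdapterRegistry, Address and the Id format are invented for the example. The policy names are written with underscores to form valid Java identifiers.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

enum DeliveryPolicy { TO_ALL_CHANNELS, AT_LEAST_ONE, PREFERRED }

class Address {
    final String type;   // e.g. "email"; decides which adapter can handle it
    final String value;

    Address(String type, String value) {
        this.type = type;
        this.value = value;
    }
}

class PeerInfo {
    final String peerId;
    final List<Address> addresses;   // ordered by preference
    final DeliveryPolicy deliveryPolicy;

    PeerInfo(String peerId, List<Address> addresses, DeliveryPolicy deliveryPolicy) {
        this.peerId = peerId;
        this.addresses = addresses;
        this.deliveryPolicy = deliveryPolicy;
    }
}

class AdapterRegistry {
    private final Map<String, String> statelessInstanceByType = new HashMap<>();
    private final Set<String> statefulTypes = new HashSet<>();
    private final Map<String, String> statefulInstanceByPeerAndType = new HashMap<>();
    private int nextId = 0;

    // Stateless adapters are instantiated at registration time and shared.
    void registerStateless(String type) {
        statelessInstanceByType.put(type, "stateless-" + type);
    }

    // Stateful adapters are only registered here and instantiated on demand.
    void registerStateful(String type) {
        statefulTypes.add(type);
    }

    List<String> createAdapterInstances(PeerInfo peerInfo) {
        List<String> result = new ArrayList<>();
        for (Address address : peerInfo.addresses) {
            // Prefer the shared stateless instance if one exists.
            String adapterId = statelessInstanceByType.get(address.type);
            if (adapterId == null && statefulTypes.contains(address.type)) {
                String key = peerInfo.peerId + "/" + address.type;
                adapterId = statefulInstanceByPeerAndType.get(key);
                if (adapterId == null) {
                    adapterId = "stateful-" + address.type + "-" + (nextId++);
                    statefulInstanceByPeerAndType.put(key, adapterId);
                }
            }
            if (adapterId == null) {
                System.err.println("unknown adapter for address type " + address.type);
                continue;
            }
            result.add(adapterId);
            if (peerInfo.deliveryPolicy == DeliveryPolicy.PREFERRED) {
                return result;   // the first usable channel is enough
            }
        }
        return result;           // empty if no adapter was found
    }
}
```

The three branches of the pseudocode that end with the PREFERRED check are folded into one check after an adapter Id has been determined, which keeps the behavior but avoids the repetition.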

2.5.2 Handling of Messages

Messages are handled by the Messaging and Routing Manager (see Section 1.2.3). Every incoming message, regardless of whether it comes from an internal component, an application or a peer, is handled by the handleMessage function, which is depicted in Algorithm 2.2. The step of the algorithm that retrieves further receivers from the routing engine applies the routing rules described in Section 2.2.

First, the message is assigned a unique message Id, which is used to track the message within SmartCom. In addition to the receiver of the message (which can also be empty), further receivers, if there are any, are determined by the Routing Rule Engine based on routing rules (see Section 1.2.3). Note that the delivery policy handler is only created if the receiver of the message has been set, and it is only created for the first receiver. This prevents receiving both an acknowledgement and a communication error message for messages that have been sent to multiple receivers. Further details on the enforcement of delivery policies can be found in the following section.

Finally, the presented algorithm forwards the message to the corresponding receivers. Algorithm 2.3 describes how messages are forwarded to a collective and Algorithm 2.4 describes how messages are forwarded to single peers. The function calls registerCollectiveMessageDeliveryAttempt and registerPeerMessageDeliveryAttempt register a policy handler that observes whether a delivery policy has been enforced for an outgoing message or whether there was an error during communication (see the corresponding paragraph in Section 1.2.4).

Algorithm 2.3 first retrieves the collective info directly from the CollectiveInfoCallback. This object contains information about the delivery policy of the collective as well as the peers that are currently part of the collective. If required (indicated by the variable createHandlers), a collective message delivery attempt is registered. Thereafter, the message is delivered to every peer that is currently part of the collective. Note that this membership is subject to constant change.

Similar to sending messages to a collective, Algorithm 2.4 retrieves the peer info first. It consists of the delivery policy, privacy policies and contact addresses for adapters (which are not used in this algorithm). First, the algorithm checks if a message is allowed to be sent to a peer at the moment based on its privacy policies.

Function handleMessage is
    input : Message (msg)

    if message Id is empty then
        create unique message ID;
    end

    createPolicyHandlers = false;

    if message receiver is not null then
        add the message receiver to the receiver list;
        createPolicyHandlers = true;
    end

    /* check if there are further receivers */
    get further receivers from the routing engine (based on routing rules);
    add them to the receiver list;

    if receiver list is empty then
        send error message to NotificationCallback;
        return;
    end

    for each receiver in receiver list do
        if receiver is component then
            forward message to component;
            continue;
        else if receiver is collective then
            deliverToCollective(msg, receiver, createPolicyHandlers);   // Alg. 2.3
        else
            try
                deliverToPeer(msg, receiver, createPolicyHandlers);   // Alg. 2.4
            catch
                send error message to the sender of the message;
            end
        end
        createPolicyHandlers = false;   // handlers are only created for the first receiver
    end
end

Algorithm 2.2: Handling of messages in the Messaging and Routing Manager.


Function deliverToCollective is
    input : Receiver (collective)
            Message (msg)
            Boolean value whether to create delivery policy handler (createHandlers)

    retrieve collective info (collInfo) from CollectiveInfoCallback;

    if createHandlers then
        // to trace the enforcement of delivery policies
        registerCollectiveMessageDeliveryAttempt(msg, collInfo.deliveryPolicy);
    end

    for each peer in collInfo.peers do
        try
            deliverToPeer(msg, peer, createHandlers);   // see Alg. 2.4
        catch
            enforceCollectiveDeliveryPolicy(new error message);

            if collInfo.deliveryPolicy is TO ALL MEMBERS then
                /* delivery failed because the message could not be sent to everyone */
                break;
            end
        end
    end
end

Algorithm 2.3: Sending messages to a collective.


Function deliverToPeer is
    input : Receiver (peer)
            Message (msg)
            Boolean value whether to create delivery policy handler (createHandlers)

    retrieve peer info (peerInfo) from PeerInfoCallback;

    for each policy in peerInfo.privacyPolicies do
        // check if policy allows sending messages
        if !policy.condition(msg) then
            throw an exception;
        end
    end

    if createHandlers then
        // to trace the enforcement of delivery policies
        registerPeerMessageDeliveryAttempt(msg, peerInfo.deliveryPolicy);
    end

    determine list of adapters (adapterList) from routing engine;

    if adapterList is empty then
        throw exception;
    end

    for each adapter in adapterList do
        send output message to adapter using the message broker;
    end
end

Algorithm 2.4: Sending messages to peers.
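The steps of Algorithm 2.4 map onto Java roughly as shown below. This is an illustrative sketch: the collaborating components (PeerInfoCallback, routing engine, Message Broker and the registry of delivery attempts) are modelled as plain functional interfaces, and PeerDelivery, DeliveryException and the field names are hypothetical.

```java
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.Function;
import java.util.function.Predicate;

// Simplified stand-ins for SmartCom types.
class Message {
    final String id;
    final String content;

    Message(String id, String content) {
        this.id = id;
        this.content = content;
    }
}

class PeerInfo {
    final String deliveryPolicy;
    final List<Predicate<Message>> privacyPolicies;

    PeerInfo(String deliveryPolicy, List<Predicate<Message>> privacyPolicies) {
        this.deliveryPolicy = deliveryPolicy;
        this.privacyPolicies = privacyPolicies;
    }
}

class DeliveryException extends RuntimeException {
    DeliveryException(String message) {
        super(message);
    }
}

class PeerDelivery {
    private final Function<String, PeerInfo> peerInfoCallback;       // peer Id -> peer info
    private final Function<String, List<String>> routingEngine;      // peer Id -> adapter Ids
    private final BiConsumer<String, Message> messageBroker;         // (adapter Id, message)
    private final BiConsumer<Message, String> deliveryAttempts;      // (message, delivery policy)

    PeerDelivery(Function<String, PeerInfo> peerInfoCallback,
                 Function<String, List<String>> routingEngine,
                 BiConsumer<String, Message> messageBroker,
                 BiConsumer<Message, String> deliveryAttempts) {
        this.peerInfoCallback = peerInfoCallback;
        this.routingEngine = routingEngine;
        this.messageBroker = messageBroker;
        this.deliveryAttempts = deliveryAttempts;
    }

    void deliverToPeer(Message msg, String peerId, boolean createHandlers) {
        PeerInfo peerInfo = peerInfoCallback.apply(peerId);

        // Every privacy policy has to allow sending the message.
        for (Predicate<Message> policy : peerInfo.privacyPolicies) {
            if (!policy.test(msg)) {
                throw new DeliveryException("privacy policy forbids sending to " + peerId);
            }
        }

        // Register a handler that traces the enforcement of the delivery policy.
        if (createHandlers) {
            deliveryAttempts.accept(msg, peerInfo.deliveryPolicy);
        }

        List<String> adapterIds = routingEngine.apply(peerId);
        if (adapterIds.isEmpty()) {
            throw new DeliveryException("no adapter available for " + peerId);
        }
        for (String adapterId : adapterIds) {
            messageBroker.accept(adapterId, msg);
        }
    }
}
```

As in the pseudocode, the delivery attempt is registered before the adapters are determined, so a missing adapter surfaces as an exception after the handler already exists.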


Algorithm 2.4 only sends a message if the privacy policies of the peer allow it at the moment. If required (indicated by the variable createHandlers), a peer message delivery attempt is registered. The adapter list determined by the routing engine consists of the Ids of adapters that can send the message to this peer. Finally, using the Message Broker, the message is sent to the peer over each adapter (indicated by its Id) that has been returned previously by the routing engine.

Enforcing delivery policies
As described in Section 1.2.4, there are multiple delivery policies on three different levels (collective, peer and message level) which have to be enforced. Handlers for these policies are registered during the sending of messages to peers and collectives (see Algorithms 2.3 and 2.4), but the enforcement of policies is handled upon reception of acknowledgement and communication error messages, which are sent by adapters.

Table 8 describes what the data structure to enforce collective delivery policies might look like. The MessageID and the SenderID form the composed key that identifies an entry. There is a policy handler for every entry that keeps track of the policy enforcement for a specific message and sender, and decides whether a policy has been enforced, whether results are still missing or whether it failed. The acronym CollPolEDS is used for "collective delivery policy enforcement data structure" in the following.

MessageID   SenderID   Policy           Policy Handler
msg1        sender1    TO ALL MEMBERS   policyHandlerInstance1
msg2        sender2    TO ALL MEMBERS   policyHandlerInstance2
msg3        sender1    TO ANY           policyHandlerInstance3

Table 8: Data structure to enforce collective delivery policies (CollPolEDS). The composed key of each entry consists of MessageID and SenderID.

Table 9 describes the proposed data structure to enforce peer delivery policies. It looks almost the same as the CollPolEDS, except that the ID of the receiver is added to the composed key. Multiple entries of this data structure might correspond to a single entry in the CollPolEDS. If a message is sent only to a peer, there is no corresponding entry in the CollPolEDS.

MessageID   SenderID   ReceiverID   Policy            Policy Handler
msg1        sender1    receiver2    TO ALL CHANNELS   policyHandlerInstance1
msg2        sender2    receiver3    AT LEAST ONE      policyHandlerInstance2
msg3        sender1    receiver1    PREFERRED         policyHandlerInstance3

Table 9: Data structure to enforce peer delivery policies. The composed key of each entry consists of MessageID, SenderID and ReceiverID.

If a message is sent to a collective, a corresponding entry is created in the CollPolEDS. For every peer in the collective, an additional entry is created in the peer delivery policy enforcement data structure. Incoming acknowledgement and communication error messages from adapters are handled on the peer level first; the collective level is only enforced once the peer level indicates a successful or erroneous enforcement of the delivery policy. This behavior can be observed in Algorithm 2.5, which handles the enforcement on the peer level. If there is a corresponding entry in the CollPolEDS, the enforcement is redirected to the collective level because the peer delivery policy has been successfully enforced (Line 9) or there was an error during enforcement (Line 19). Algorithm 2.6 describes the delivery policy enforcement on the collective level.
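The two data structures and their composed keys could be represented in Java as two maps with value-based key classes. The sketch below is one possible layout, not the SmartCom code: CollectiveKey, PeerKey and DeliveryPolicyTables are hypothetical names, and the policy handler is reduced to the policy name to keep the example short.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

// Composed key of the collective-level structure: (MessageID, SenderID).
final class CollectiveKey {
    final String messageId;
    final String senderId;

    CollectiveKey(String messageId, String senderId) {
        this.messageId = messageId;
        this.senderId = senderId;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof CollectiveKey)) {
            return false;
        }
        CollectiveKey k = (CollectiveKey) o;
        return messageId.equals(k.messageId) && senderId.equals(k.senderId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(messageId, senderId);
    }
}

// Composed key of the peer-level structure: (MessageID, SenderID, ReceiverID).
final class PeerKey {
    final String messageId;
    final String senderId;
    final String receiverId;

    PeerKey(String messageId, String senderId, String receiverId) {
        this.messageId = messageId;
        this.senderId = senderId;
        this.receiverId = receiverId;
    }

    /** Key of the collective entry this peer entry may belong to. */
    CollectiveKey collectiveKey() {
        return new CollectiveKey(messageId, senderId);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof PeerKey)) {
            return false;
        }
        PeerKey k = (PeerKey) o;
        return messageId.equals(k.messageId) && senderId.equals(k.senderId)
                && receiverId.equals(k.receiverId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(messageId, senderId, receiverId);
    }
}

class DeliveryPolicyTables {
    final Map<CollectiveKey, String> collectivePolicies = new ConcurrentHashMap<>();
    final Map<PeerKey, String> peerPolicies = new ConcurrentHashMap<>();

    // A message sent only to a peer creates a peer entry and nothing else.
    void registerPeerMessage(String messageId, String senderId, String receiverId, String policy) {
        peerPolicies.put(new PeerKey(messageId, senderId, receiverId), policy);
    }

    // A message sent to a collective creates one collective entry
    // plus one peer entry per member.
    void registerCollectiveMessage(String messageId, String senderId, String collectivePolicy,
                                   Map<String, String> peerPolicyByMember) {
        collectivePolicies.put(new CollectiveKey(messageId, senderId), collectivePolicy);
        for (Map.Entry<String, String> member : peerPolicyByMember.entrySet()) {
            registerPeerMessage(messageId, senderId, member.getKey(), member.getValue());
        }
    }

    boolean hasCollectiveEntry(PeerKey key) {
        return collectivePolicies.containsKey(key.collectiveKey());
    }
}
```

Dropping the receiver from a peer key yields the key of the related collective entry, which is how the peer level can decide whether enforcement has to be redirected to the collective level.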


 1  Function enforcePeerDeliveryPolicy is
        input : Acknowledge or communication error Message (msg)
 2      try
 3          if checkPeerDeliveryPolicy(msg) then   // might throw an exception
 4              entry = discardPeerPolicyEntry(msg);
 5              if entry == null then   // Policy has already been enforced
 6                  return;
 7              end
 8              if collectiveDeliveryPolicyHasEntry(msg) then
 9                  enforceCollectiveDeliveryPolicy(msg);   // see Alg. 2.6
10              else if entry.messagePolicy == ACKNOWLEDGE then
11                  send acknowledgement to entry.sender;
12              end
13      catch
            /* msg can only be a communication error message */
14          entry = discardPeerPolicyEntry(msg);
15          if entry == null then   // Policy has already been enforced
16              return;
17          end
18          if collectiveDeliveryPolicyHasEntry(msg) then
19              enforceCollectiveDeliveryPolicy(msg);   // see Alg. 2.6
20          else
21              send communication error message to entry.sender;
22          end

23  Function checkPeerDeliveryPolicy is
        input : Acknowledge or communication error Message (msg)
24      policy = getPeerDeliveryPolicy(msg.id, msg.sender, msg.receiver);
25      if policy == null then
26          return false;   // entry has already been evicted
27      end
28      if msg.subtyp == ACKNOWLEDGE then
29          return policy.check();
30      else
31          throw exception;   // indicates that this is an error message
32      end

Algorithm 2.5: Enforcing a delivery policy on the peer level.


Function enforceCollectiveDeliveryPolicy is
    input : Acknowledge or communication error Message (msg)

    try
        if checkCollectiveDeliveryPolicy(msg) then   // might throw an exception
            entry = deleteCollectivePolicyEntry(msg);

            if entry.policy == ACKNOWLEDGE then
                send acknowledgement to the entry.sender;
            end
        end
    catch
        entry = deleteCollectivePolicyEntry(msg);
        send error message to the entry.sender;
    end
end

Function deleteCollectivePolicyEntry is
    input : Message (msg)
    output: collective delivery policy entry

    lock(collectiveDiscardCondition);   // prohibits race conditions

    // delete entries because policy has been enforced
    entry = discardCollectivePolicyEntry(msg);
    for every corresponding entry in the peer delivery policy data structure do
        discardPeerPolicyEntry(entry);
    end

    unlock(collectiveDiscardCondition);
    return entry;
end

Function checkCollectiveDeliveryPolicy is
    input : Message (msg)
    output: true if delivery policy has been enforced, false otherwise

    policy = getCollectiveDeliveryPolicy(msg.content, msg.sender);
    if policy == null then
        return false;   /* policy has already been enforced */
    end

    if msg.subtyp == ACKNOWLEDGE then
        return policy.checkAcknowledge();   /* returns true if this message enforced the policy */
    else
        return policy.checkError();   /* can throw an exception or just return false */
    end
end

Algorithm 2.6: Enforcing a delivery policy on the collective level.
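The locking in deleteCollectivePolicyEntry can be illustrated with a ReentrantLock. The sketch below is an assumption-laden simplification: DeliveryPolicyStore and CollectiveEntry are hypothetical, composed keys are represented as lists of strings (which compare by value), and peer entries carry no handler.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

class CollectiveEntry {
    final String senderId;
    final String policy;
    final List<String> members;   // receivers that have a peer-level entry

    CollectiveEntry(String senderId, String policy, List<String> members) {
        this.senderId = senderId;
        this.policy = policy;
        this.members = members;
    }
}

class DeliveryPolicyStore {
    private final ReentrantLock collectiveDiscardLock = new ReentrantLock();
    // Key: [messageId, senderId]
    private final Map<List<String>, CollectiveEntry> collectiveEntries = new HashMap<>();
    // Key: [messageId, senderId, receiverId]
    private final Set<List<String>> peerEntries = new HashSet<>();

    void register(String messageId, CollectiveEntry entry) {
        collectiveDiscardLock.lock();
        try {
            collectiveEntries.put(Arrays.asList(messageId, entry.senderId), entry);
            for (String member : entry.members) {
                peerEntries.add(Arrays.asList(messageId, entry.senderId, member));
            }
        } finally {
            collectiveDiscardLock.unlock();
        }
    }

    /**
     * Removes the collective entry and all related peer entries in one
     * critical section. Returns null if the entry was already removed,
     * so that the policy is enforced at most once.
     */
    CollectiveEntry deleteCollectivePolicyEntry(String messageId, String senderId) {
        collectiveDiscardLock.lock();
        try {
            CollectiveEntry entry = collectiveEntries.remove(Arrays.asList(messageId, senderId));
            if (entry != null) {
                for (String member : entry.members) {
                    peerEntries.remove(Arrays.asList(messageId, senderId, member));
                }
            }
            return entry;
        } finally {
            collectiveDiscardLock.unlock();
        }
    }

    int peerEntryCount() {
        collectiveDiscardLock.lock();
        try {
            return peerEntries.size();
        } finally {
            collectiveDiscardLock.unlock();
        }
    }
}
```

Unlocking in a finally block releases the lock even if removing an entry throws, which the lock/unlock pair of the pseudocode leaves implicit.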


Experimental Performance Evaluation

The following performance evaluation was made on a machine with the following specifications: Windows 7 64-bit, Intel Core2 Duo with 2x 2.53 GHz, 4.00 GB DDR2-RAM. The simulation configuration is as follows:

• One implementation of a Stateless Output Adapter (one instance shared by all peers).

• 10 Input Push Adapters to receive input from peers.

• Output and Input Adapters communicate directly using an in-memory queue to simulate a peer with a response time of zero.

• Worker threads (workers) simulate the number of applications/users that send messages to the system.

• One million messages are sent for each evaluation test run to get a meaningful average number of messages sent/received.

• Only sent and received messages are considered as 'handled', no internal messages.

Figure 20 depicts the setup for the performance evaluation as described above.

[Figure 20: diagram of the evaluation setup. Labels: Workers; 1 mio messages; Communication Middleware; Stateless Output Adapter; 1 instance (scales); simulated Peers; Input Adapter, 10 instances.]

Figure 20: Setup for the performance evaluations.

The performance has been evaluated for every combination of 1, 5, 10, 20, 50 and 100 worker threads (Worker) simulating SmartSociety platform applications sending 1 · 10^6 messages concurrently, uniformly distributed to 1, 10, 100, and 1000 peers waiting for messages and replying to them. Each test run has been executed 10 times to obtain average throughput results. Figure 21 presents the results of the test runs. The test runs can be reproduced using the stated setup data to configure the Java application located at GitHub (see footnote 4).

As one can see, the average throughput remains between 3000 and 5000 messages per second. The performance decrease with higher numbers of peers is a result of increased memory requirements rather than computational complexity. The limiting factor here is the ActiveMQ message broker used, which only allows a maximum of approximately 20000 messages per second. The system has an upper bound of 5000 messages per second since each message is handled multiple times by the message broker and SmartCom. This limitation applies to a single SmartCom instance, but multiple SmartCom instances can be deployed to balance the load if needed, sharing the database and PeerManager access. The chosen numbers of worker threads and peers cover the maximum number of concurrent SmartSociety platform applications and collective members, respectively, expected to use a single SmartCom instance. Performance is not expected to become a primary concern of SmartCom, because human peers have a higher latency and a higher variance of response times than machine peers.

[Figure 21: plot of the throughput in messages/second (axis from 0 to 6000) over the number of peers, with one series each for 1, 5, 10, 20, 50 and 100 workers.]

Figure 21: Measured message throughput. Workers simulate SmartSociety platform applications sending out messages to the peers.


4 https://github.com/tuwiendsg/SmartCom/blob/master/smartcom-demo/src/main/java/at/ac/tuwien/dsg/smartcom/demo/PerformanceDemo.java
