SmartSociety - Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Grant Agreement No. 600854
Deliverable D5.3 Work package WP5
Specification of advanced incentive design and decision-assisting algorithms for CAS
Dissemination level (Confidentiality)1:
PU
Delivery date in Annex I:
31/12/2014
Actual delivery date:
31/1/2015
Status2:
F
Total number of pages:
24
Keywords:
Incentive design, decision-making, online mechanism
1 PU: Public; RE: Restricted to Group; PP: Restricted to Programme; CO: Consortium Confidential as specified in the grant agreement
2 F: Final; D: Draft; RD: Revised Draft
© SmartSociety Consortium 2013 - 2017
Disclaimer This document contains material, which is the copyright of SmartSociety Consortium parties, and no copying or distributing, in any form or by any means, is allowed without the prior written agreement of the owner of the property rights. The commercial use of any information contained in this document may require a license from the proprietor of that information. Neither the SmartSociety Consortium as a whole, nor a certain party of the SmartSociety Consortium warrant that the information contained in this document is suitable for use, nor that the use of the information is free from risk, and accepts no liability for loss or damage suffered by any person using this information. This document reflects only the authors’ view. The European Community is not liable for any use that may be made of the information contained herein.
Full project title:
SmartSociety - Hybrid and Diversity-Aware Collective Adaptive Systems: When People Meet Machines to Build a Smarter Society
Project Acronym:
SmartSociety
Grant Agreement Number:
600854
Number and title of work package:
5 Incentive Design and Decision-Making Strategies
Document title:
Specification of advanced incentive design and decision-assisting algorithms for CAS
Work-‐package leader:
Kobi Gal, BGU
Deliverable owner:
Kobi Gal
Quality Assessor:
Mark Hartswood, OXF
List of contributors
Partner Acronym   Contributor
BGU               Moshe Mash
BGU               Avi Segal
BGU               Kobi Gal
Executive summary

Deliverable 5.3 of WP5 focuses on the deployment and evaluation of incentive mechanisms for CAS in the real world. Building on the results of deliverable 5.1 (an initial design for an empirical ride sharing application) and deliverable 5.2 (lab studies for determining which incentive mechanisms to use for CAS), we conducted two studies focusing on the design and evaluation of incentive mechanisms in real-world CAS systems. Following the reviews from Y1, we focused our efforts on the design and evaluation of incentive mechanisms that take account of the social and technological context of interaction problems and, in particular, are robust to scale.

Our first study focused on incentives for enhancing engagement in large-scale CAS systems that are loosely coupled, in that participants work individually and their contributions are subsequently aggregated using computational methods. We based this study on citizen science systems, in which volunteers collectively create or analyze data at a scale that professional researchers cannot accomplish on their own. Citizen science projects are mass-participation platforms in which computer systems can play a key role in task allocation and data aggregation, making them a natural candidate for smart society research. Although such systems are highly successful in drawing large numbers of committed volunteers, the vast majority of participants exhibit a fast turnaround in the system, becoming active shortly after registering and leaving after only a few days. Even a small increase in the contribution rates of these participants can lead to a significant improvement in the productivity of citizen science. We designed and evaluated a general methodology for reducing disengagement in citizen science through a controlled intervention. We analyzed two years of user participation data from 16 different citizen science projects, which revealed two significant cohorts of participants leaving the system shortly after their initial enrolment. We designed and targeted an intervention strategy for these groups in the form of an e-mail which directly addressed the factors identified in the survey as contributing to disengagement. Participants receiving the email were significantly more likely to return to activity in the system and did not decrease their level of contribution and persistence when compared to a control group of users who did not receive the mail. The contribution of this work is a general methodology for identifying and alleviating disengagement in citizen science projects through a controlled intervention strategy that is shown to be highly effective in several different projects.

Our second study focused on the question of how to design incentive mechanisms for large-scale CAS systems in which the interactions between participants are tightly coupled, and members of the group need to actively cooperate and coordinate in order for the group to succeed. In this study we focused on the design of incentive schemes for a ride sharing application that matches riders and drivers according to various criteria. The application was developed as part of a consortium-wide collaboration. We designed two incentive schemes for the system: one consisted of a reputation mechanism in which participants were able to rate each other's performance in the system; the other consisted of community messages conveying motivating social and ecological content to selected users.

We hypothesized that using both these incentive mechanisms would increase the motivation of participants to use the system, resulting in higher efficiency and performance. Our results were as follows. First, we were able to deploy the ride sharing system at Ben-Gurion University (BGU) and recruit 150 users, despite intense competition from three existing ride sharing applications. Second, the reputation mechanism was heavily used by all users in the system. Third, there was no significant effect of community messages on participant behaviour in the ride share system (although this could be partly explained by the fact that the messages were not made visible enough).

These two studies demonstrate that reasoning about human factors, rather than just about optimization, is key to the design of successful CAS systems, even when these systems differ widely in terms of interaction and scale. Our studies provided a general methodology for motivating disengaged users in citizen science projects and increasing their contribution, and showed the importance of reputation systems in sustaining a successful ride sharing application.
Table of Contents
2 Improving productivity through controlled intervention
  2.1 Background
  2.2 Related work
  2.3 Understanding participation and disengagement in citizen science
  2.4 Profiling volunteer populations
  2.5 Intervention design and delivery
  2.6 Assessment of effectiveness of intervention
  2.7 Discussion
3 Community messages and reputation systems as incentive mechanisms in CAS
  3.1 Related work
  3.2 Experimental design
    3.2.1 Background and technical specification for the RideShare system
  3.3 BGU deployment and study results
  3.4 Adoption results
  3.5 Reputation and community messages
2. Improving productivity through controlled intervention

This study was conducted in partnership with WP1 and in collaboration with the SOCIAM consortium. It is based on a paper that is currently under review for the Web Science track of the International Conference on the World Wide Web, co-authored with Avi Segal (BGU), Robert Simpson (Oxford), Victoria Homsey (Oxford) and Kevin Page (Oxford).

2.1 Background

Volunteers have been involved in scientific research for over 100 years. More recently, technological developments have transformed the role of these non-professional scientists into active participants in large-scale endeavors, termed citizen science, in which volunteers collectively create or analyze data at a scale that professional researchers cannot accomplish on their own [1]. Zooniverse is the largest citizen science platform that exists today, including over a million volunteers and 25 live projects spanning astrophysics, zoology, biology, medicine, climate science, and the humanities [18]. In all of these projects the volunteers identify, classify, mark, and label data, which is subsequently aggregated and analyzed in order to reach scientific conclusions. The number of active projects is steadily growing, from 8 live projects at the beginning of 2012 to 25 live projects in 2014, and the user base includes volunteers from a wide range of occupations, age groups, levels of education and geographical locations [2].

Participants in Zooniverse projects differ widely in contribution rates and motivation [3]. A small minority of participants are highly committed and contribute tens of thousands of tasks, also becoming involved in higher-order participation, such as forum moderation. Whilst the platform could not function without these committed, high-volume contributors, they exist within a long tail of user participation in Zooniverse. The vast majority of participants in Zooniverse projects undertake only a few classifications each, and participate for just a few days. Despite their casual participation, these users contribute a substantial fraction of the overall effort going into Zooniverse. This is demonstrated in Figure 1, which shows the fraction of total contributions as a function of the number of contributions per user. We note the tall spike in the total contribution rate for users with a small number of contributions (left-hand side of the figure), who form the vast majority of Zooniverse volunteers, and the long tail of decreasing contributions as the number of contributions grows. If volunteer disengagement (the point at which users stop participating in the system) can be delayed by just a few tasks, then overall productivity in the Zooniverse could improve significantly, and this effect will only grow as the Zooniverse continues to expand.

This work provides a comprehensive study of participation patterns in Zooniverse, identifying disengaged populations and bringing them back to productivity in the system using controlled intervention. Prior work has shown that citizen science volunteers are driven by a diverse range of motivations with varying degrees of commitment and engagement [3,4,5]. These studies were limited to isolated citizen science projects, and were not used to implement and test intervention policies. Our work bridges this gap by moving towards a general methodology for reducing disengagement in citizen science that is based on the analysis of two years of participation data in 16 Zooniverse projects.
This methodology includes: (1) surveys to reveal the motivations that drive users' participation in Zooniverse; (2) identifying cohorts based on the survey results and the participation data; (3) designing an intervention strategy that targets specific cohorts and aims to increase their engagement with the system; (4) analyzing the efficacy of this strategy over time, according to performance and persistence measures.
We designed and administered a survey to 3,000 randomly selected users in Zooniverse who participate in a wide variety of projects. The survey identified "classification anxiety" (overestimating the effects of individual mistakes [5]), competition from other life demands and leisure activities, and boredom with specific Zooniverse projects as prominent causes of disengagement among volunteers. For many users the cause of classification anxiety was revealed to be a misunderstanding of the collective nature of citizen science projects, in which aggregation of data diminishes the effects of individual mistakes. To identify target communities for the intervention we combined our analysis of the survey with findings revealed by clustering two years of user participation data from 16 different projects. We focused our intervention on two cohorts who quickly left the system after an initial burst of activity. Volunteers in the first cohort spent less than a day making contributions, and those in the second spent between one and ten days as active volunteers. These cohorts are significant as they capture the vast majority of user participation in the system for all projects. We designed interventions in the form of emails that directly addressed the causes of disengagement revealed in the survey, and sent them to users in the two cohorts described above. We compared the effectiveness of this intervention method with a control group that included participants with similar participation patterns who did not receive any email notification. The results showed that participants from the intervention group were significantly more likely to return to activity in Zooniverse than participants from the control group, without experiencing a drop in contribution rates and activity in the system relative to the control group. In addition, returning participants from the intervention group resumed activity at least as fast, and remained active in the system for at least as long as, returning participants from the control group. Our work has insights for the designers of citizen science platforms in general, providing an example of a methodology for reducing disengagement in citizen science projects that identifies meaningful cohorts in the population, uncovers the factors that reduce the participation of different groups in the system, and delivers targeted group interventions that stimulate re-engagement.
Figure 1: Fraction of total contributions to Zooniverse per user in the Galaxy Zoo project. Note the sharp spike for users with very small contribution rates on the left-hand side of the figure.
2.2 Related work
This work relates to several bodies of work on identifying participation patterns in citizen science and designing environments for improving user engagement and productivity. We relate to each of these in turn. Existing work has identified different classes of populations in peer production sites. Preece and Shneiderman's Reader-to-Leader framework defined categories of users that are distinguished by their depth of social engagement within the community: 'readers' who lurk in the background, 'contributors' who create content and contribute to the community, 'collaborators' who work together and regularly contribute, and 'leaders' who participate in the governance of the site [6]. The majority of the labor in general peer production sites is often apportioned to 1% of the users of the website [7]. In contrast, the ratio of contributors is significantly higher for citizen science projects and contributors exhibit a variety of contribution styles. Eveleigh, who studied user participation patterns in the 'Old Weather' Zooniverse project, identified 'dabblers' and 'drop-outs' as important classes of volunteers [5]. Dabblers exhibit a low-commitment attitude, a weak tie to projects, and an intermittent approach to participation, with occasional short bursts of activity. These casual contribution styles form the majority of user contributions to Zooniverse, and collaborators and leaders represent a small minority. Several works have specifically studied the engagement patterns of citizen science volunteers. In an ecological fieldwork project, Rotman describes a 'circuit of engagement' whereby volunteers, motivated initially out of curiosity, may subsequently leave the system if they are not made to feel part of the wider scientific community [9]. Eveleigh cites competition with other life activities, anxiety over making mistakes, and boredom as the main reasons driving disengagement in the Old Weather citizen science project [5]. Kittur also cites boredom, in addition to low work quality and inappropriate task assignment, as major reasons for early disengagement [10]. Mao et al. used machine learning to predict disengagement in the Galaxy Zoo project, identifying two cohorts of user groups spending 5 and 30 minutes in the system, respectively [8]. Many studies have focused on environment design for facilitating user engagement in citizen science. The FoldIt project is framed in the context of a game in order to enhance user engagement. Some Zooniverse projects exhibit badge and leaderboard functionality, although there is evidence that competitive game elements may be counterproductive, working to de-motivate casual contributors and reduce the quality of the work [11, 12, 13]. Although recommendations towards an improved environment are often derived from these types of study, few papers also include evidence of successful intervention. At the same time, Zooniverse itself deploys a variety of mechanisms designed to enhance volunteers' experience and engagement, including: chat and discussion forums, use of narrative or storytelling (e.g. being the captain on a ship, or engaging with the fight against cancer), active links and participation with scientists via blog posts and from within discussion forums, and words of encouragement (exhortations to 'keep going' as in the Cell Slider project). However, the effects of these mechanisms were not studied in a principled way.
Building on these findings, our approach has been to formulate a four-stage multidisciplinary process of: (1) contacting volunteer populations to understand reasons for participation and disengagement; (2) profiling volunteer populations to reveal distinct cohorts that may be targeted by interventions; (3) intervention design and delivery based upon the prior stages; (4) evaluation and follow-up to determine effectiveness. In the following sections we walk through this process for our intervention, and in the discussion offer suggestions for how this approach may be made more general.
2.3 Understanding participation and disengagement in citizen science

In this aspect of the methodology we implemented a questionnaire to uncover reasons for patterns of engagement and disengagement within Zooniverse. It was carried out by WP1 and is detailed in their deliverable. To summarize, a survey was sent on July 7th, 2014 to 3,000 randomly selected participants. The three most common classes of reasons people reported for disengaging from Zooniverse were (roughly in order of significance):
1. Competition from other demands on the participant's time, sometimes expressed as forgetting.
2. Concern about making mistakes, termed "classification anxiety".
3. Boredom or disinterest.
The survey shows that there is a significant 're-engagement potential' for participants who disengage, which might be activated by a suitable reminder. Such a reminder might usefully provide reassurance about classification accuracy, as well as directing 'bored' participants towards other projects they could try.

2.4 Profiling volunteer populations
To understand which Zooniverse sub-population we might usefully target with an intervention (i.e., which may be prone to distraction, anxiety or boredom), we analyzed the engagement patterns of Zooniverse volunteers in 16 of the 25 Zooniverse projects. This sample is representative of the gamut of different topics (e.g., biology, nature, astronomy) and popularity with volunteers, as measured by the number of registered users as of July 2014. Table 1 provides a general description of these projects. Data was collected beginning September 2012 for all projects, with the exception of the Planet Hunters project, for which data was already available from December 2010.

We measure users' activity in the system by the number of days elapsed between their first and most recent logins. Let t_k be the current timestamp, t_i the timestamp of the user's first login to the system, and t_j the timestamp of the user's most recent login. We measure a user's activity in Zooniverse as the difference between t_j and t_i. Figure 2 describes users' activity in the system for all of the supplied projects. The X-axis shows range groups of participation time spans, and the Y-axis shows the ratio of users that fall into each group. The figure clearly identifies two distinct groups that make up the vast majority of activity in the system. The largest cohort consists of users who spent less than a day as active users, which we denote the "1-day" cohort. This cohort included 56 to 87 percent of Zooniverse volunteers. Another large cohort consists of volunteers who spent between one and nine days as active users in the system, which we denote the "10-day" cohort. This cohort included 4 to 14 percent of volunteers. Together these cohorts make up at least 60% of the user population in Zooniverse. We thus decided to focus our intervention strategy on these two cohorts. Even a small increase in the contributions of these populations can lead to significant benefits to citizen science.
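As an illustration, a minimal Python sketch of how the activity spans and cohort assignment described above could be computed from login timestamps; the field names and data layout are illustrative assumptions, not the actual Zooniverse schema:

from datetime import datetime

def activity_days(logins):
    # Activity span in days: difference between first (t_i) and most recent (t_j) login.
    t_i, t_j = min(logins), max(logins)
    return (t_j - t_i).total_seconds() / 86400.0

def cohort(logins):
    # Assign a user to the "1-day", "10-day" or remaining cohort based on the activity span.
    days = activity_days(logins)
    if days < 1:
        return "1-day"
    if days < 10:
        return "10-day"
    return "other"

# Example: three logins spread over two days place this volunteer in the "10-day" cohort.
logins = [datetime(2014, 7, 1, 9), datetime(2014, 7, 1, 18), datetime(2014, 7, 3, 10)]
print(cohort(logins))  # -> "10-day"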
Table 1: Zooniverse projects used in study
Figure 2: Activity patterns in Zooniverse

2.5 Intervention design and delivery

The goal of the intervention was to bring disengaged users back to being productive users in the system, as measured by whether (and how quickly) these users return to being active users in Zooniverse following the intervention, and by the difference in contribution rates (i.e., persistence) after returning to the system. We randomly assigned the users in the 1-day and 10-day cohorts to a control and an intervention (test) group. The intervention group received a reminder email that was designed to encourage them to return to the Zooniverse system and to make contributions. The email directly addressed the motivational issues that were uncovered in our survey, emphasizing the collective nature of the Zooniverse projects, the tolerance to individual mistakes by volunteers, and the availability of other projects on the system. The control group received no such email. The email for the 1-day cohort was sent a week after the user's last login to the system, and read as follows:
"Thanks for trying PROJECTNAME, we appreciate your clicks! You're not alone on PROJECTNAME - thousands of people take part every month. You can discuss the images you see on PROJECTNAME with the community, and the project's research team, by visiting Talk at PROJECTTALKURL. Get involved again at PROJECTURL.
We know that some people worry that they aren't very good at PROJECTNAME - but this isn't the case. We can use all volunteers' clicks to learn about the data, and multiple people will see each image. We use statistical techniques to get the most from everyone's answers, and the occasional error does not affect the results. If PROJECTNAME didn't suit you, then check out all of the other Zooniverse citizen science projects at www.zooniverse.org, or if you would rather not receive these emails you can unsubscribe at www.zooniverse.org/account/newsletters. You can see your contributions to all Zooniverse projects by visiting http://zooniverse.org/me. We look forward to seeing you again, Rob and the Zooniverse Team."

The email to the 10-day cohort varied slightly, addressing the volunteers as regular contributors rather than newcomers and providing them with a link to a service which tracks their contributions to Zooniverse. It was sent two weeks after the user's last login to the system. The intervention was conducted between August 15th, 2014 and September 24th, 2014. On each day, we sent out the relevant email to the volunteers in the intervention groups. In total, the intervention group consisted of 306 randomly selected volunteers from the 1-day cohort and 541 volunteers randomly selected from the 10-day cohort. The control group consisted of 292 randomly selected volunteers from the 1-day cohort and 540 volunteers from the 10-day cohort. Note that volunteers who were assigned to both cohorts and received both mails were removed from the analysis. To measure the intervention's impact, Zooniverse supplied us with contribution data for the two groups for the aforementioned dates, including user ID, task ID and the timestamp of each task contribution. We wished to examine the following hypotheses:
1. Sending emails to the intervention group will have a significant and positive effect on the return of volunteers to activity as compared to the control group.
2. Returning volunteers from the intervention group will resume activity at least as fast as returning volunteers from the control group.
3. Returning volunteers from the intervention group will be at least as persistent (remain active in the system) as returning volunteers from the control group.
4. Returning volunteers from the intervention group will provide at least as many contributions to Zooniverse as returning volunteers from the control group.

2.6 Assessment of effectiveness of intervention
We first compared the number of volunteers from the intervention and control groups that returned to activity in the system. As shown by Figure 3, the ratio of volunteers who returned to the system following the intervention was significantly higher for the intervention group than for the control group (chi-square test, p<0.03). The bars in the figure represent 95% confidence intervals.
Figure 3: Return ratio for intervention and control groups

Group          Contributions before (num. of tasks)   Contributions after (num. of tasks)   Days active after
Intervention   20                                     18                                    1.5
Control        21                                     23.5                                  1.2
Table 2: Persistence for intervention and control groups

We next compared the speed at which volunteers returned to the system in both groups, as measured by the number of days from sending out the email to their first login back into the system. We found that the average return time for volunteers in the intervention group (4.1 days) was less than that of the control group (5.7 days), although this result was not statistically significant (one-tailed unpaired t-test, p=0.052). One may suspect that although email interventions are able to bring back more volunteers, their persistence (as measured by their activity time in the system after they return and the number of classifications they perform) is lower than that of volunteers in the control group, who return to the system of their own accord and may be more highly motivated to contribute. To check this, we looked at the median number of classifications before and after the reminder for both groups, as shown in Table 2. The results show that there was no statistically significant difference between the two groups in the number of classifications before and after the intervention. We chose to present the median rather than the average contribution rate to offset the effect of "outlier" volunteers whose contribution rates are exceptionally high. When looking at average contribution rates, we see a decrease for both groups in the number of classifications before and after the intervention (not shown in the table). However, this decrease was significantly more pronounced for the control group than for the intervention group (unpaired t-test, p<0.03). Lastly, there was no statistically significant difference in the number of days active in the system after the intervention between the intervention group (1.5 days) and the control group (1.2 days; one-tailed unpaired t-test, p=0.32). Thus, we conclude that our reminder intervention ensures persistence that does not fall below the persistence of those returning without a reminder.
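For illustration, a minimal Python sketch of the kinds of significance tests used above; the per-group counts and return times below are hypothetical placeholders, as only the group sizes and p-values are reported in this deliverable:

from scipy.stats import chi2_contingency, ttest_ind

# Hypothetical counts of returning vs. non-returning volunteers in each group.
#                [returned, did not return]
intervention = [120, 727]   # 306 + 541 volunteers in total
control = [90, 742]         # 292 + 540 volunteers in total

chi2, p, dof, expected = chi2_contingency([intervention, control])
print("chi-square = %.2f, p = %.3f" % (chi2, p))   # return ratios differ significantly if p < 0.05

# Comparing return times (days) between groups; scipy's ttest_ind is two-tailed,
# so the p-value is halved for the one-tailed comparison used in the text.
intervention_days = [3, 5, 4, 2, 6]   # illustrative samples only
control_days = [6, 7, 5, 4, 8]
t, p_two = ttest_ind(intervention_days, control_days, equal_var=False)
print("one-tailed p = %.3f" % (p_two / 2))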
2.7 Discussion
In this section we wish to reflect upon some of the strengths and weaknesses of our approach that reveal important principles and trade-offs inherent to the design of interventions for citizen science platforms.
Applying the methodology revealed that disengagement is triggered by life distractions, classification anxiety, and boredom. We identified target communities for the intervention that capture the vast majority of user participation in the system for all projects. We designed interventions in the form of emails that directly addressed the underlying issues uncovered by the survey. The methodology was shown to successfully promote re-engagement of users across 16 different citizen science projects. Returning participants from the intervention group resumed activity at least as fast, and remained active in the system for at least as long as, returning participants from the control group. Our methodology is an example of the new engineering approach that combines social and computational elements [14,15], and follows the work by Burke et al. [17] in targeting interventions at specific users to increase their social contribution.

We now mention three issues with our approach and explain how each corresponds to a type of trade-off inherent in designing interventions for "non-uniform" populations in which volunteers vary widely in the extent of contribution. First, we identified two cohorts: those who disengage after a day, and those who remain in the system for up to 10 days before disengaging. But as far as our intervention is concerned, we treat these as a single population. On the one hand this is sensible because, combined, they represent the larger population of contributors who rapidly disengage (corresponding to Eveleigh et al.'s 'drop-outs' [5]). On the other hand, better tailored interventions may be more effective for each cohort, as presumably those disengaging after a day have a different shared experience to those disengaging after a few days. Moreover, it may be possible to disaggregate these populations even further based on finer differentiations of engagement patterns and underlying motivational issues, enabling increasingly more focused and efficient interventions. That said, we have been successful with a relatively simple (yet crude) instrument, and ever more refined approaches would incur correspondingly greater overheads in terms of cost and complexity.

A second issue relates to the presumption that our survey findings map onto the experience of the 1-day and 10-day cohorts identified in the participation profile. We are assuming that distracting life events, anxiety and boredom count as significant reasons for disengagement within these cohorts, without being able to precisely identify what the actual reasons are for any individual who disengages, nor denying that there may well be a mix of other reasons that we have yet to encounter. This imprecision is related to methodological limits of qualitative research, particularly surveys, where generalizations need to be made in order to map from the survey sample to the overall population. Again, there is a trade-off here, since greater precision attracts overheads, not least the risk of annoying or alienating volunteers.

Finally, the e-mail intervention works much less like a hunting spear and much more like a net in the way that it ensnares several (presumed) sub-populations simultaneously (those who have been distracted from their project, are anxious, or are bored). These messages may also act in concert on those occasions where both reassurance and a reminder are needed, but they may also miss the mark where disengagement occurs for some other reason.
On the plus side, the e-mail message has a degree of generality: it can speak to multiple audiences simultaneously, but this increases the challenge of assessing its effectiveness.

To summarize, we have presented a general methodology for incentivizing participants in a large-scale CAS system (citizen science) that is based on the analysis of two years of participation data in 16 projects. This methodology included: (1) surveys to reveal the motivations that drive users' participation in citizen science; (2) identifying cohorts based on the survey results and the participation data; (3) designing an intervention strategy that targets specific cohorts and aims to increase their engagement with the system; (4) analyzing the efficacy of this strategy over time, according to performance and persistence measures. While the work described here has produced a significant improvement in productivity from a specific intervention, we believe further cyclic iterations of the 4-step methodology will uncover additional insights into the motivations of other
citizen science sub-populations; future work will design interventions to address these needs. As discussed above, we wish to further explore the effectiveness of interventions when targeting large heterogeneous populations, and to gather further qualitative and empirical evidence to better understand these trade-offs.

3. Community messages and reputation systems as incentive mechanisms in CAS

In this section we describe a study on designing a CAS for an online community in which participants are incentivized to coordinate and collaborate. Specifically, we examined the influence of two incentive mechanisms, community messages and reputation mechanisms, on coordination and collaboration in CAS.

3.1 Related work

This study relates to the use of incentives to enhance collaboration and coordination in groups. We mention prior work using community messages, reputation systems, and gamification elements to attempt to steer user behavior. There is ample evidence showing that community messages and gamification elements can increase the contributions of participants in online communities [19,20,21]. At the same time, there is increasing evidence (see our study on citizen science above) that some populations can be deterred or put off by such measures. We present examples of both kinds of work below.

Kim and Keller [20] focused on motivational and volitional messages that adhere to the ARCS model: attention (create interest and curiosity), relevance (use concrete and familiar language and examples), confidence (demonstrate likelihood of success) and satisfaction (during and post participation). They showed that messages encoded with such components are able to significantly increase group activities as compared to placebo messages in a technology adoption scenario. Zhou et al. [19] showed that perceived similarity and trust can increase participation in online communities. Van de Velde et al. [21] found that messages focusing on opportunities and possible solutions, such as lower energy use or more environmentally friendly energy sources, are more effective at persuading people to contribute to the prevention and reduction of energy and environmental problems than messages that stress the gravity of the problem and its detrimental effects. McKenzie-Mohr et al. [22] claim that messages should emphasize the adoption of collaborative systems by peers, set collective goals for participants, and provide prompts and memory aids, feedback on performance, and a sense of the perceived convenience experienced by other participants.

Badges and other gamification elements such as leaderboards have been shown to positively influence users' participation in citizen science projects, question forums and education. Antin and Churchill [23] show that badges serve social functions for participants in social media contexts: goal setting, instruction, reputation, status/affirmation, and group identification. Anderson et al. [24] modeled user behavior in the presence of badges on the question-answering site Stack Overflow. They showed that the model was able to predict changes in user contribution as a function of badge allocation strategy. All of these works were constrained to loosely coupled interactions in which the group effort was aggregated from the individual contributions of each participant.
Our goal was to extend incentive mechanisms to settings in which the success of the group depends strongly on the ability of the individuals to form coalitions (rides) by matching their preferences.

The effects of reputation systems on decision making in lab experiments have been well documented. For example, Keser [25] utilized an "investment game" where one player's trust increases the total
payoffs but leaves her vulnerable to the other player taking an unfair portion. When subjects who had not previously interacted with each other were informed of each other's past play, both trust (investment) and trustworthiness (return of profits to the trustor) were higher. However, the nature of interaction in online groups is significantly different from the lab, consisting of large-scale interactions of thousands of users with little prior information about each other's trustworthiness. Thus reputation systems in online settings are an active area of research. Luca [26] has shown that online consumer review websites substitute for more traditional forms of reputation. In particular, a one-star increase in online reviews of restaurants on the Yelp website led to a 5-9% increase in the revenue of the restaurant. In addition, participants respond more strongly to ratings that contain additional information and opinions. Jøsang et al. [27] provide an overview of existing and proposed systems that can be used to derive measures of trust and reputation for Internet transactions, and mention current trends in Amazon and eBay. They show that the installation of a reputation system makes users more reliable. De Alfaro et al. [28] discuss some basic design principles for content-driven reputation systems, which rely on an analysis of the content and the collaboration process rather than on explicit user feedback. This approach was implemented in the WikiTrust reputation system for Wikipedia and the Crowdsensus reputation system for Google Maps editors.

3.2 Experimental design

Our hypothesis was twofold. First, that including ratings and community messages will contribute to the successful adoption of a CAS system in a designated scenario. Second, that community messages will have a positive effect on the performance of the CAS system. We focused our study on the use of ride sharing systems in organizations to increase productivity and efficiency in commuting times. We based our study on the RideShare system (aka SmartShare beta), developed jointly with WP2, WP5, WP6 and WP1. The system supports community messages and ratings and has been fully deployed since mid-December 2014 at Ben-Gurion University, available for use by the student community at http://www.rideshares.info. In the subsections that follow we provide background for the system, then proceed to describe our empirical study.

3.2.1 Background and technical specification for the RideShare system

The RideShare system allows users to offer and request rides (as drivers or riders). Users publish rides that include departure and destination points and a time window. In addition, drivers have to enter the number of seats offered to riders. There are several features in SmartShare that were developed with a CAS approach in mind and that distinguish it from traditional ride sharing services currently available. The first is an algorithm that searches for matches between rides offered by drivers and requested by riders. The algorithm looks for agreement between departure and destination points, as well as for an overlap between the departure time windows. The output of each matching process for a participant is the list of all possible rides that are available to the participant at a given point in time. Each such match is referred to as a "plan" and shown to the relevant participants. Figure 4 describes examples of rides that were created by a driver and two riders in the system.
Figure 5 shows the plans that will be created according to the rides of Figure 4 (the matches between these rides). The rides include the following information: (a) participants of the ride; (b) departure and destination points; (c) time window; (d) whether smoking is allowed. The lower bound of a plan's time window is the maximum of the lower bounds of all the rides that were matched; the upper bound is the minimum of their upper bounds. The smoking field is determined by the profile of the driver.
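A minimal Python sketch of the time-window intersection rule described above; the data structures and names are illustrative and do not reflect the actual RideShare implementation:

from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Ride:
    user: str
    role: str                  # "driver" or "rider"
    departure: str
    destination: str
    window: Tuple[int, int]    # (earliest, latest) departure time, in hours

def match_window(rides) -> Optional[Tuple[int, int]]:
    # Plan window: maximum of the lower bounds and minimum of the upper bounds of all matched rides.
    lo = max(r.window[0] for r in rides)
    hi = min(r.window[1] for r in rides)
    return (lo, hi) if lo <= hi else None   # None means the rides cannot be matched

# Example mirroring Figure 4: Moshik (driver, 12:00-14:00), Kobi (13:00-15:00), Ido (11:00-18:00).
rides = [Ride("Moshik", "driver", "Beer-Sheva", "Tel-Aviv", (12, 14)),
         Ride("Kobi", "rider", "Beer-Sheva", "Tel-Aviv", (13, 15)),
         Ride("Ido", "rider", "Beer-Sheva", "Tel-Aviv", (11, 18))]
print(match_window(rides))   # -> (13, 14), as in the first plan of Figure 5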
username   role     departure point   destination point   date and time
Moshik     Driver   Beer-Sheva        Tel-Aviv            9/12/14 12:00-14:00
Kobi       Rider    Beer-Sheva        Tel-Aviv            9/12/14 13:00-15:00
Ido        Rider    Beer-Sheva        Tel-Aviv            9/12/14 11:00-18:00
Figure 4: Examples of rides created by a driver (Moshik) and two riders (Kobi and Ido)

The plans that will be created for these rides are shown in the figure below:

participants                                 departure point   destination point   date and time
Moshik (driver), Kobi (rider), Ido (rider)   Beer-Sheva        Tel-Aviv            9/12/14 13:00-14:00
Moshik (driver), Kobi (rider)                Beer-Sheva        Tel-Aviv            9/12/14 13:00-14:00
Moshik (driver), Ido (rider)                 Beer-Sheva        Tel-Aviv            9/12/14 12:00-14:00
Figure 5: 'Matched rides' (plans) for the offered and requested rides in Figure 4

Each participant has to choose between 'accept' and 'reject' for each plan in their list. This two-tier process is conducted as follows: first, the driver accepts one of the existing plans, and then each rider is prompted to accept the same plan. Figure 6 shows the ride request GUI from the perspective of one of the riders, showing two ride plans, one of which was accepted by all of the participants (in this case the driver and a single rider).

Lastly, we note the technical specification of the application (developed in collaboration with WP2, WP5 and WP6), which includes four services: (a) Reputation; (b) Matching; (c) Peer manager; and (d) Provenance. All the services are connected to the Orchestrator server, designed and built by WP6, which synchronizes the processes and validates that users are allowed to use the service. Whenever the client sends a request to the Orchestrator to activate a service, the user name and password are attached to the request. The Orchestrator validates that the user name and password are correct and then forwards the request to the appropriate service. In addition, the provenance service, designed and built by WP2, is connected to all the components (Reputation, Orchestrator, Peer manager and client) in order to log users' activations and the past states of the system. This allows all the interactions in the system (and in particular, the reputation reports from the entire system history) to be logged transparently and made available for future analysis.
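A minimal Python sketch of the request flow described above; the service names follow the text, but the interfaces and credential check are illustrative assumptions rather than the actual WP6 Orchestrator API:

class Orchestrator:
    def __init__(self, users, services, provenance_log):
        self.users = users                    # username -> password (stand-in for the peer manager)
        self.services = services              # service name -> callable handling the request
        self.provenance_log = provenance_log  # stand-in for the WP2 provenance service

    def handle(self, username, password, service, payload):
        # The client attaches its credentials to every request; the Orchestrator validates
        # them, forwards the call to the requested service, and the activation is logged.
        if self.users.get(username) != password:
            raise PermissionError("invalid credentials")
        if service not in self.services:
            raise ValueError("unknown service: " + service)
        result = self.services[service](payload)
        self.provenance_log.append((username, service, payload))
        return result

# Example usage with a dummy matching service.
log = []
orch = Orchestrator({"moshik": "secret"}, {"matching": lambda p: {"plans": []}}, log)
print(orch.handle("moshik", "secret", "matching", {"from": "Beer-Sheva", "to": "Tel-Aviv"}))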
A ride plan that was accepted by the driver, and is still pending acceptance by the other riders.
A ride request that was accepted by all participants
Figure 6: Ride request GUI

We implemented two types of incentive mechanisms that are supported by the SmartShare system: community messages and a reputation system. We decided to focus on these two incentive schemes following a detailed survey of BGU commuters, administered a few months earlier, which identified the lack of reliability measures and the lack of motivation as primary causes of the failure to adopt existing ride sharing systems at BGU.

The first incentive mechanism consisted of motivational messages displayed to the user to promote awareness of the social and ecological benefits of the rideshare system to the community and the individual participant. The figure below shows the list of community messages we chose for the study, as well as an example of their visualization to the users in the GUI.

Motivational messages:
Ride sharing significantly reduces air pollution
Why ride alone when you can ride together
Ride sharing will reduce your monthly expenses
Ride sharing contributes to creating a better society
The system is restricted to BGU students only!

Figure 7: List of community messages, and the GUI representation on the right

The second incentive mechanism was the inclusion of a reputation system which allowed both riders and drivers to rate each other's performance in the system. The driver can rate each rider in the ride, and the riders can rate only the driver. The rating score is separated into three categories: (a) overall; (b) on time; (c) friendliness. When a plan is created, the average rating of the driver is presented to the riders, and the average ratings of the riders are presented to the driver (see Figure 8). The added transparency of system transactions and the additional information provided to participants through
the reputation system was hypothesized to induce all participants to be more reliable and to increase their confidence in choosing rides in the system.
Figure 8: GUI for reputation system
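A minimal Python sketch of how the per-category rating averages shown to participants could be computed; the three categories follow the text, while the data layout is an illustrative assumption:

from collections import defaultdict

CATEGORIES = ("overall", "on_time", "friendliness")

def average_ratings(feedback):
    # Average the ratings a user has received in each category; feedback is a list of
    # dicts such as {"overall": 5, "on_time": 4, "friendliness": 5}.
    sums, counts = defaultdict(float), defaultdict(int)
    for entry in feedback:
        for cat in CATEGORIES:
            if cat in entry:
                sums[cat] += entry[cat]
                counts[cat] += 1
    return {cat: round(sums[cat] / counts[cat], 2) for cat in CATEGORIES if counts[cat]}

# Example: two riders rate the same driver; the averages are shown when a plan is created.
driver_feedback = [{"overall": 5, "on_time": 5, "friendliness": 4},
                   {"overall": 4, "on_time": 5, "friendliness": 5}]
print(average_ratings(driver_feedback))   # -> {'overall': 4.5, 'on_time': 5.0, 'friendliness': 4.5}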
3.3 BGU deployment and study results

The system was released at Ben-Gurion University on 17 December 2014 to the general student body. We report results that were analyzed following a month of usage of the system. Currently, students at BGU primarily use designated Facebook pages for posting and searching for rides. Rides are posted as lists of offers by potential riders; commuters scan the lists by hand and contact the drivers. There is no active involvement in the negotiation process between drivers and commuters. On the one hand, using FB allows all students to access ride requests freely and quickly. On the other hand, searching the lists and finding the best offers is an unwieldy process. Initially, we allowed students to be compensated with $8 for using the system. In practice, only half of the students showed up to be compensated. This result strengthens the conclusion that the system was successfully adopted by BGU students.

3.4 Adoption results
The table below lists general statistics about the deployment and usage of the system at BGU.

Registered users                          149
Individual users active in the system      84
Ride requests (driver)                    193
Ride requests (commuter)                  276
Agreed plans                               48
Table 3: General statistics about the deployment of the SmartShare system at BGU

As we expected, the overwhelming majority of requests in the system were made by commuters. In all, 70% of ride requests were made by commuters, while 30% were posted by drivers.
The figure below shows the adoption rates of the system at BGU over time. As shown by the figure, there was a steady increase in the number of users signing up to the system, the number of users posting rides, and the number of users posting plans since the inception of the system. The "spikes" in contribution and enrollment are due to weekends, around which there is an influx of commuting activity to and from BGU.
Figure 9: Adoption rates by timeline
The distribution over gender for drivers and commuters is shown below. Interestingly, there were significantly more ride requests posted by female drivers than by male drivers, and significantly more ride requests posted by male commuters than by female commuters. Possible explanations for this phenomenon are that more female students own a car than male students, and that male students are less wary of using public transportation.
Figure 10: Distribution of male/female contributions

The figure below (left) shows the distribution of agreed plans across users. As shown by the figure, the majority of registered users in the system did not agree on a plan (and in fact did not post a ride). The median number of plans per user is 2. As we can see, the number of agreed plans exhibits a long-tail distribution that is common to other collective online enterprises such as citizen science and Stack Overflow. The figure below (right) shows the number of ride plans as a function of the number of passengers. As can be seen from the figure, most rides had one passenger accompanying the driver, while a minority of the rides had two passengers.
Figure 11: Number of agreed plans for different numbers of users (left); number of ride plans for different numbers of passengers in a plan (right)

Next, we measure the efficiency of the rides taken using the system. We measure the efficiency of a ride by dividing the number of commuters who actually agreed on the ride by the maximal number of commuters in the search result of the matching algorithm. For example, if a ride could potentially accommodate 4 commuters but the agreed ride plan only included 2 commuters, then its efficiency would be 1/2. As shown in the table below, the ride share system enjoyed a high efficiency rating on average. This means that when rides included only one commuter, there was no other ride plan that met the necessary matching criteria. For ride plans that include more than one commuter, the efficiency rate was lower.

Number of rides   Efficiency
42                1
3                 1/2
3                 1/3
Total             0.93
Table 4: Average efficiency of rides
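A minimal Python sketch of the efficiency measure defined above; the per-ride pairs below are an illustrative reconstruction consistent with Table 4, not the raw study data:

def ride_efficiency(agreed_commuters, max_commuters):
    # Efficiency of a ride: commuters who agreed on the plan divided by the maximal
    # number of commuters found for it by the matching algorithm.
    return agreed_commuters / max_commuters

# Example from the text: a ride that could accommodate 4 commuters but whose agreed
# plan included only 2 has efficiency 0.5.
print(ride_efficiency(2, 4))   # -> 0.5

# (agreed, possible) pairs matching the counts in Table 4: 42 rides at 1, 3 at 1/2, 3 at 1/3.
rides = [(1, 1)] * 42 + [(1, 2)] * 3 + [(1, 3)] * 3
average = sum(ride_efficiency(a, m) for a, m in rides) / len(rides)
print(round(average, 2))   # -> 0.93, the average efficiency reported in Table 4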
Table 4: Average Efficiency of rides 3.5 Reputation and community messages We used two conditions in the study. Students in the Reputation/Community messages conditions were able to use the reputation system. The system allowed these students to rate each other when they share a ride, see the ratings of potential users when searching for rides, and also see motivational messages. Students in the reputation messages condition were able to use the reputation system but did not observe the community messages during their interaction with the system. Students were allocated to the conditions in a random fashion upon registering to the system. Page 20 of (24)
Feedbacks posted          137
Users with feedbacks       36
Avg total rating         4.91
Avg total "on-time"      4.88
Total                    4.89
Table 5: Reputation information

As shown by the table, a large number of participants posted feedback to the system, indicating the efficacy and usefulness of the reputation system. Unsurprisingly, most of the ratings were high; this corresponds to rating patterns in other collaborative online systems. We believe (although we cannot prove it statistically) that the inclusion of the rating system had the positive effect of making participants behave more reliably and more efficiently as compared to the existing FB system.

Lastly, we turn to measuring the effects of the community messages on the behavior of participants in the system. The figure below shows the distribution of ride requests (left) and ride plans (right) for both information conditions. Although the number of ride requests and ride plans was larger for the condition in which subjects did not see the community messages, this difference was not significant. Thus we conclude that the community messages did not have the desired effect of making participants more efficient in the system. A possible reason for this is that the messages were not visible and not emphasized enough. This is shown in the table below, which lists the number of participants in both conditions who reported seeing the community messages during a post-study survey filled in by a sample of the participants. One would expect participants who were in the community messages condition to report seeing the messages, and participants who were in the reputation-only condition to report that they did not see the messages. Interestingly, only two out of 10 participants in the sample acknowledged the messages. We believe that making the messages more visible would have changed the result above.
Figure 12: Distribution of ride requests (left) and ride plans (right) for both information conditions
                          Acknowledged messages   Did not acknowledge messages
Reputation condition                2                          5
Community msg condition             2                          8
Table 6: Visibility of messages as reported by subjects in both conditions
We can summarize the main achievement of the ride sharing study as the full-fledged deployment of an active CAS designed and implemented by the consortium. The study demonstrated that a CAS system which specifically includes mechanisms that reason about the human participants can be adopted by participants in the system. At BGU, the ride share system successfully overcame the "unbearable lightness of FB" curse, by which applications providing services are created as FB pages and compete with designated, tailored applications that were designed for the service at hand.

References

[1] Silvertown, J. 2009. A new dawn for citizen science. Trends in Ecology & Evolution 24, 9, 467-471.
[2] Raddick, M. J., Bracey, G., Gay, P. L., Lintott, C. J., Murray, P., Schawinski, K., Szalay, A. S., and Vandenberg, J. 2009. Galaxy Zoo: exploring the motivations of citizen science volunteers. arXiv:0909.2925.
[3] Reed, J., Raddick, M. J., Lardner, A., and Carney, K. 2013. An exploratory factor analysis of motivations for participating in Zooniverse, a collection of virtual citizen science projects. 46th Hawaii International Conference on System Sciences (HICSS), 610-619 (Hawaii, 7-10 Jan. 2013).
[4] Nov, O., Arazy, O., and Anderson, D. 2011. Dusting for science: motivation and participation of digital citizen science volunteers. In Proceedings of the 2011 iConference (iConference '11). ACM, New York, NY, USA, 68-74. DOI=http://doi.acm.org/10.1145/1940761.1940771
[5] Eveleigh, A., Jennett, C., Blandford, A., Brohan, P., and Cox, A. L. 2014. Designing for dabblers and deterring drop-outs in citizen science. In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (CHI '14). ACM, New York, NY, USA, 2985-2994. DOI=http://doi.acm.org/10.1145/2556288.2557262
[6] Preece, J., and Shneiderman, B. 2009. The reader-to-leader framework: Motivating technology-mediated social participation. AIS Transactions on Human-Computer Interaction 1, 1, 13-32.
[7] Ortega, F., Gonzalez-Barahona, J. M., and Robles, G. 2008. On the inequality of contributions to Wikipedia. Proceedings of the 41st Annual Hawaii International Conference on System Sciences, 304 (7-10 Jan. 2008). DOI=10.1109/HICSS.2008.333
[8] Mao, A., Kamar, E., and Horvitz, E. 2013. Why stop now? Predicting worker engagement in online crowdsourcing. In First AAAI Conference on Human Computation and Crowdsourcing (November 7-9, 2013, Palm Springs, California). http://www.aaai.org/ocs/index.php/HCOMP/HCOMP13/paper/view/7498
[9] Rotman, D., Preece, J., Hammock, J., Procita, K., Hansen, D., Parr, C., Lewis, D., and Jacobs, D. 2012. Dynamic changes in motivation in collaborative citizen-science projects. In Proceedings of the ACM 2012 Conference on Computer Supported Cooperative Work (CSCW '12). ACM, New York, NY, USA, 217-226. DOI=http://doi.acm.org/10.1145/2145204.2145238
[10] Kittur, A., Nickerson, J. V., Bernstein, M. S., Gerber, E. M., Shaw, A., Zimmerman, J., Lease, M., and Horton, J. J. 2013. The future of crowd work. 16th ACM Conference on Computer Supported Cooperative Work (CSCW 2013). Available at SSRN: http://ssrn.com/abstract=2190946
[11] Darch, P. 2014. Managing the public to manage data: Citizen science and astronomy. International Journal of Digital Curation 9, 1, 25-40.
http://www.smart-society-project.eu/
Deliverable D5.3
© SmartSociety Consortium 2013 - 2017
[12] Eveleigh, A., Jennett, C., Lynn, S., and Cox, A. L. 2013. "I want to be a Captain! I want to be a Captain!": Gamification in the Old Weather citizen science project. Short paper presented at Gamification 2013 (2-4 October 2013, Stratford, Ontario). https://uwaterloo.ca/gamification/sites/ca.gamification/files/uploads/files/gamification2013-proceedings.pdf
[13] Preist, C., Massung, E., and Coyle, D. 2014. Competing or aiming to be average?: Normification as a means of engaging digital volunteers. In Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing (CSCW '14). ACM, New York, NY, USA, 1222-1233. DOI=http://doi.acm.org/10.1145/2531602.2531615
[14] Guest, G. 2012. Applied Thematic Analysis. California: SAGE Publications, Inc.
[15] Pruitt, J., and Adlin, T. 2010. The Persona Lifecycle: Keeping People in Mind Throughout Product Design. Morgan Kaufmann.
[16] Bernstein, A., Klein, M., and Malone, T. W. 2012. Programming the global brain. Communications of the ACM 55, 5 (May 2012), 41-43. DOI=http://doi.acm.org/10.1145/2160718.2160731
[17] Miorandi, D., and Maggi, L. 2014. "Programming" social collective intelligence. To appear in IEEE Technology and Society, special issue on Technology for Collective Action (2014).
[18] Simpson, R., Page, K. R., and De Roure, D. 2014. Zooniverse: observing the world's largest citizen science platform. In Proceedings of the 23rd International Conference on World Wide Web.
[19] Zhou, Z., Wu, J. P., Zhang, Q., and Xu, S. 2013. Transforming visitors into members in online brand communities: Evidence from China. Journal of Business Research.
[20] Kim, C., and Keller, J. M. 2011. Towards technology integration: The impact of motivational and volitional email messages. Educational Technology Research and Development 59, 1, 91-111.
[21] Van de Velde, L., et al. 2010. The importance of message framing for providing information about sustainability and environmental aspects of energy. Energy Policy 38, 10, 5541-5549.
[22] McKenzie-Mohr, D. 2013. Fostering Sustainable Behavior: An Introduction to Community-Based Social Marketing. New Society Publishers.
[23] Antin, J., and Churchill, E. F. 2011. Badges in social media: A social psychological perspective. CHI 2011 Gamification Workshop Proceedings (Vancouver, BC, Canada, 2011).
[24] Anderson, A., et al. 2013. Steering user behavior with badges. In Proceedings of the 22nd International Conference on World Wide Web. International World Wide Web Conferences Steering Committee.
[25] Keser, C. 2003. Trust in a networked world: Experimental games for the design of reputation management systems.
[26] Luca, M. 2011. Reviews, reputation, and revenue: The case of Yelp.com. Working Paper 12-016, Harvard Business School.
[27] Jøsang, A., Ismail, R., and Boyd, C. 2007. A survey of trust and reputation systems for online service provision. Decision Support Systems 43, 2, 618-644.
[28] De Alfaro, L., et al. 2011. Reputation systems for open collaboration. Communications of the ACM 54, 8, 81-87.