Mathematical Modelling of Social Spreading Processes
Hans De Sterck Department of Applied Mathematics University of Waterloo, Canada
Waterloo Institute for Complexity and Innovation – Jan 2015 hdesterck@uwaterloo.ca 1/52
mathematical modeling of social spreading processes
1. introduction
2. spread of political revolutions on social networks
3. spread of cigarette smoking in a population
4. some general thoughts on future work
collaborators • John Lang, PhD student, Applied Mathematics, Waterloo (for both topics: political revolutions, and smoking) (all the (real) work has been done by John, really)
• Danny Abrams, Engineering Sciences and Applied Mathematics, Northwestern University (for the smoking topic)
1. introduction
“new types of connectivity” between people (online social media, ...) appear to have an important influence on social processes
• protest movements: Quebec student protests (2012), BC HST referendum (2011), Occupy (2011), Stuttgart 21 (2010), Pegida (2014), ... (note: ‘progressive’, ‘conservative’, ‘reactionary’, ... causes!) (the traditional gatekeepers of public opinion formation (radio, TV, newspapers) are largely gone!)
• riots: London (2011), Vancouver Stanley Cup (2011), Ferguson/New York unrest (2014), ... (note: both rioters and law enforcement use new media!)
• how democracies work: fundraising, elections (Obama 2.0), timescales of public opinion formation and election cycles, ...
introduction
“new types of connectivity” between people (online social media, ...) appear to have an important influence on social processes
• advertising and business: Gmail, Google AdWords; newspaper industry (Google News); kijiji; music industry (iTunes, YouTube, filesharing); hotel and restaurant industry (Yelp, TripAdvisor); taxi industry (Uber); ...
• societal norms and morality: ‘internet morality police’ in China, punishment of plagiarizing professors plays out in media, Dalhousie gentlemen, internet mobs (influences legal and disciplinary outcomes!); Mohammed cartoons and free speech (in international context); IS recruiting; ...
introduction “new types of connectivity” between people (online social media, ...) appear to have an important influence on social processes • political revolutions: are the dynamics of revolutions tied to the underlying social network connectivity and processes?
e.g., Arab Spring (Tahrir Square, January 2011)
mathematical modeling of social spreading processes
• so ... there are many interesting problems to consider!
• there is more and more interesting data! online social networks are digital (sources of ‘big data’)
• two approaches of interest:
– simple population-level models: the goal is to identify and model the main causal factors, then analyze the dynamic behaviour mathematically (classical mathematical modeling) (new data!)
– network-based models on empirical social networks: more realistic and computational, but harder to analyze completely
2. spread of political revolutions on social networks (Lang and De Sterck, Mathematical Social Sciences 2014; submitted 2015)
• we consider political revolutions in regimes that:
– are highly unpopular
– employ censorship
– employ police repression
(e.g., Arab Spring in Tunisia, Egypt, 2010-2011)
• we first consider an agent-based model (ABM) for the temporal spread of revolutions on empirical networks
a node v can be in two states:
s_v = 1 : active in revolution
s_v = 0 : inactive
ABM for political revolutions on social networks • assume: individuals get information only through network connections (other information is censored)
• police repression: the decision to join a revolution is a collective action problem (e.g., Kuran, 1991) “if individuals act unilaterally they are subject to retaliation by the regime, whereas if they act collectively then the regime loses the ability to punish due to a lack of resources”
ABM for political revolutions on social networks • modeling choice: linear threshold model “node v decides to participate in the revolution if at least a fraction θ of its neighbours are active”
• consistent with collective action principle: “I will participate if at least a fraction θ of my neighbours are active (and I deem the revolution large enough that I can likely avoid repression)”
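The threshold rule quoted above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the adjacency-dictionary representation, the function name `is_available`, and the toy network are all assumptions made for the example.

```python
def is_available(node, neighbours, state, theta):
    """Linear threshold rule: True if at least a fraction theta
    of the node's neighbours are active (state 1)."""
    nbrs = neighbours[node]
    if not nbrs:
        return False  # assumption: an isolated node never sees the revolution
    active = sum(state[u] for u in nbrs)
    return active / len(nbrs) >= theta

# Hypothetical toy network: node 0 is connected to nodes 1, 2 and 3.
neighbours = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
state = {0: 0, 1: 1, 2: 1, 3: 0}  # s_v = 1 active, s_v = 0 inactive

print(is_available(0, neighbours, state, theta=0.5))  # 2 of 3 neighbours active
print(is_available(3, neighbours, state, theta=0.5))  # its only neighbour is inactive
```

Node 0 meets a threshold of 0.5 because two of its three neighbours are active, while node 3 does not because its single neighbour is inactive.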
ABM for political revolutions on social networks 1. ABM growth process: linear threshold model “node v becomes available to participate in the revolution if at least a fraction θ of its neighbours are active”: participation happens (in continuous time) at first arrival time of Poisson random process with rate c1
3. ABM decay process: police capacity β “node v will return to inactive state if total fraction of active nodes r < β”: happens (in continuous time) at first arrival time of Poisson process with rate c2
ABM for political revolutions on social networks (we simulate the ABM on empirical networks with Gillespie algorithm, implemented in Matlab, starting from fraction r0 active nodes (random))
(this simulation on small Facebook network)
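A Gillespie-style simulation of the two processes (growth at rate c1 for nodes whose threshold is met, decay at rate c2 for active nodes while r < β) might look as follows. The talk's implementation was in Matlab and is not shown, so this Python sketch is only an assumed reading of the rules; the function name, the adjacency-dictionary input, and the choice to recompute candidate lists at every step (simple rather than fast) are illustrative.

```python
import random

def gillespie_abm(neighbours, theta, beta, c1, c2, r0, t_max, seed=0):
    """Simulate the two-process ABM; returns a list of (time, active fraction)."""
    rng = random.Random(seed)
    nodes = list(neighbours)
    n = len(nodes)
    active = set(rng.sample(nodes, round(r0 * n)))  # random initial activists
    t = 0.0
    history = [(t, len(active) / n)]
    while t < t_max:
        r = len(active) / n
        # Growth candidates: inactive nodes whose active-neighbour fraction >= theta.
        available = [
            v for v in nodes
            if v not in active and neighbours[v]
            and sum(u in active for u in neighbours[v]) / len(neighbours[v]) >= theta
        ]
        # Decay candidates: every active node, but only while r < beta.
        leaving = [v for v in nodes if v in active] if r < beta else []
        rate_grow = c1 * len(available)
        rate_decay = c2 * len(leaving)
        total = rate_grow + rate_decay
        if total == 0:
            break  # no event can fire: absorbing state
        t += rng.expovariate(total)  # exponential waiting time to the next event
        if t > t_max:
            break
        # Pick the event type with probability proportional to its total rate.
        if not leaving or rng.random() * total < rate_grow:
            active.add(rng.choice(available))
        else:
            active.remove(rng.choice(leaving))
        history.append((t, len(active) / n))
    return history
```

Because every candidate node of a given type has the same rate, choosing the event type first and then a node uniformly at random is equivalent to sampling one event from the full list of per-node Poisson processes.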
simplify ABM for insight: population-level differential equation model • expected fraction of active nodes: (N total nodes)
• population-level ordinary differential equation (ODE) consistent with ABM:
• visibility and policing functions:
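The formulas on this slide did not survive in this text. One form consistent with the ABM rules stated earlier (inactive nodes that can see the revolution join at rate c1, active nodes leave at rate c2 while the active fraction is below the police capacity β) is sketched here. This is an assumption for orientation, not a transcription of the slide; the symbol r(t) for the expected active fraction and the names v and p for the visibility and policing functions are chosen for the sketch.

```latex
r(t) = \frac{1}{N}\,\mathbb{E}\Big[\sum_{v} s_v(t)\Big],
\qquad
\frac{dr}{dt} = c_1\, v(r)\,(1 - r) \;-\; c_2\, p(r)\, r,
\qquad
p(r) = \begin{cases} 1, & r < \beta \\ 0, & \text{otherwise} \end{cases}
```

Here v(r) is the fraction of the population that can see the revolution when a fraction r is active; how to express it in terms of r is the closure question.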
population-level differential equation model • population-level ODE consistent with ABM:
• visibility function: the fraction of the total population that can ‘see’ the revolution
• “closure” of the model: how to express visibility function as a function of ra(t)?
ODE model with simple closure
• assume node v has k neighbours
assume each of v’s neighbours is active with probability r, independently
then the number j of active neighbours out of k follows a binomial distribution with parameters k and r
• the probability that the fraction of v’s neighbours exceeds the linear threshold θ is the corresponding binomial tail probability
• assume k is the average degree of the network
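Under the independence assumption above, the two probabilities are standard binomial expressions. They are written here with the "at least a fraction θ" convention of the threshold rule; whether the original slide rounds θk up or uses a strict inequality is not recoverable from this text, so that detail is an assumption.

```latex
P(j \text{ of } k \text{ neighbours active}) = \binom{k}{j}\, r^{j} (1-r)^{k-j},
\qquad
P\Big(\tfrac{j}{k} \ge \theta\Big)
  = \sum_{j=\lceil \theta k \rceil}^{k} \binom{k}{j}\, r^{j}(1-r)^{k-j}
  = 1 - \mathrm{BinCDF}\big(\lceil \theta k \rceil - 1;\; k,\; r\big)
```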
ODE model with simple closure
• the probability that the fraction of v’s neighbours exceeds the linear threshold θ gives the visibility function: the fraction of the total population that can see the revolution
• BinCDF has steep sigmoidal shape, and small k requires large θ
ODE model with simple closure
• BinCDF has steep sigmoidal shape, and small k requires large θ
• use step function for visibility function in ODE model, with visibility parameter α instead of θ and k
• Step Visibility Function (SVF) model
• intuitive interpretation: revolution can only grow when it is visible to the population (and censorship, or network structure, may influence α)
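A step-visibility ODE can be explored numerically with a few lines of forward Euler. The sketch assumes the right-hand side c1 v(r)(1 − r) − c2 p(r) r, with p(r) = 1 only while r < β and v(r) = 1 only once r exceeds a visibility threshold. The threshold is written as a hypothetical parameter `r_vis` because its exact relation to α is not stated in this text; the function names are likewise illustrative.

```python
def svf_rhs(r, c1, c2, r_vis, beta):
    """Right-hand side of the assumed step-visibility ODE."""
    v = 1.0 if r > r_vis else 0.0  # revolution visible only above r_vis (assumed threshold)
    p = 1.0 if r < beta else 0.0   # policing effective only below capacity beta
    return c1 * v * (1.0 - r) - c2 * p * r

def integrate(r0, t_end, dt=1e-3, **params):
    """Forward-Euler integration from r(0) = r0 up to time t_end."""
    r = r0
    for _ in range(int(round(t_end / dt))):
        r += dt * svf_rhs(r, **params)
        r = min(1.0, max(0.0, r))  # keep the fraction inside [0, 1]
    return r

params = dict(c1=1.0, c2=1.0, r_vis=0.3, beta=0.2)
for r0 in (0.1, 0.25, 0.5):
    print(r0, integrate(r0, 20.0, **params))
```

With these illustrative parameters, a start below β decays towards 0, a start above the visibility threshold grows towards 1, and a start between the two thresholds stays where it is, since neither process is switched on.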
correspondence ABM – SVF model • good agreement for many parameters
• sometimes agreement is not good (these simulations on small Facebook network)
major attraction of simple ODE model (SVF): complete analysis is possible
III0: stable police state
IIIe: meta-stable police state
III1: unstable police state
r = 0 : equilibrium of total state control
r = 1 : equilibrium of realized revolution
r = c* : equilibrium of civil unrest
II: failed state
gives direct insight into parameter transitions of ABM!
application of simple model: Arab Spring
• How can a small number of active social media users and relatively low Internet penetration have a dramatic effect on the stability of a regime? (threshold phenomena)
• Why is it that some regimes fall in a matter of weeks, others fight to a stalemate, and still others survive relatively unscathed?
next step: improved population-level ODE models • sometimes agreement between ABM and SVF model is not good, even with visibility α that is optimal for specific network
improved population-level ODE models • improve ODE model: include specific network information into visibility function
• Binomial Visibility Function (BVF) incorporates degree distribution of empirical network
(instead of average degree k, and then step function)
improved population-level ODE models • Binomial Visibility Function (BVF):
• alternative: Empirical Visibility Function (EVF) (via sampling) • parameter regimes equivalent to SVF • much closer to ABM than SVF! (takes network structure into account explicitly)
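One plausible reading of a visibility function that "incorporates the degree distribution" is to average the binomial threshold probability over the empirical degrees instead of evaluating it at the average degree. The sketch below follows that reading; the slide's own BVF formula is not reproduced in this text, so the function names and the exact form are assumptions.

```python
from math import ceil, comb

def binomial_tail(k, r, theta):
    """P(at least a fraction theta of k neighbours are active),
    when each neighbour is independently active with probability r."""
    # Small tolerance guards against float round-up, e.g. 0.3 * 10 > 3.
    j_min = max(0, ceil(theta * k - 1e-12))
    return sum(comb(k, j) * r**j * (1 - r)**(k - j) for j in range(j_min, k + 1))

def bvf(r, degree_dist, theta):
    """Average the binomial tail over a degree distribution {k: P(k)}, k >= 1."""
    return sum(p_k * binomial_tail(k, r, theta) for k, p_k in degree_dist.items())

# Hypothetical degree distribution: half the nodes have degree 2, half degree 10.
degree_dist = {2: 0.5, 10: 0.5}
for r in (0.1, 0.3, 0.5):
    print(r, bvf(r, degree_dist, theta=0.3))
```

Low-degree nodes reach a given fractional threshold with fewer active neighbours, so mixing degrees smooths the sharp transition that a single step function imposes.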
hierarchy of models • ABM is faithful to linear threshold process on empirical networks • simple ODE models are cheap to simulate and offer insight into ABM parameter regimes
• BVF/EVF take network structure into account, and offer approximations that are mostly as accurate as expensive many-compartment “Degree Approximation” ODE models (Nekovee et al., 2007)
application: spread of revolutions on online versus offline social networks • ‘new types of connectivity’ (online social media) between people appear to have an important influence on social processes (Arab Spring, ...) • modern online social networks (Facebook, Twitter, ...) digitally stored = available at scale (for research, in principle) we use a simple Facebook network in this application
• compare with pre-Internet, offline social networks not available at scale! reasonable proxy: use (modern) physical contact network (wireless sensors in school; or GPS on phone, ...)
application: spread of revolutions on online versus offline social networks
application: spread of revolutions on online versus offline social networks • define basic reproduction number: (generalized from epidemiology for single-infection processes)
= the average number of individuals that become active in the revolution, due directly to the introduction of a single active individual into a population that is otherwise completely inactive • R0 < 1 : we expect the spreading process to die out R0 > 1 : we expect the process to spread • Facebook network: R0 = 1.12 physical contact network: R0 = 0.35
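To make the definition concrete, a static, threshold-only count can be sketched: seed one active node, count the neighbours whose threshold is already met by that single active neighbour, and average over all possible seeds. This deliberately ignores the decay process and the Poisson timing, so it is not necessarily how the values quoted on this slide were computed; the function name and the toy network are hypothetical.

```python
def threshold_only_r0(neighbours, theta):
    """Average number of neighbours that a single active seed pushes over
    the linear threshold in an otherwise inactive, undirected network."""
    total = 0
    for seed in neighbours:
        # Neighbour u has exactly one active neighbour (the seed),
        # so its active fraction is 1 / degree(u).
        total += sum(1 for u in neighbours[seed] if 1 / len(neighbours[u]) >= theta)
    return total / len(neighbours)

# Hypothetical star network: hub 0 with three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
print(threshold_only_r0(star, theta=0.5))
```

On the star, seeding the hub activates all three leaves while seeding a leaf activates nobody, which illustrates how strongly this count depends on the degrees of the seed's neighbours.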
application: spread of revolutions on online versus offline social networks (Arab Spring) • Facebook: R0 = 1.12 physical contact: R0 = 0.35
• initial indication that revolutions governed by the linear threshold ABM may spread more readily on online than on offline social networks
application: spread of revolutions on online versus offline social networks
• next steps:
– apply ABM to more diverse and larger networks
– network structure is likely important, but there are other important factors:
influence of new propagation mechanisms: broadcast versus 1-to-1
modern online networks rewire and self-organize during revolution
3. spread of cigarette smoking in a population
• modelling assumptions:
x = fraction of population that smokes (prevalence, in [0,1])
individuals derive utility from smoking via two mechanisms:
– utility directly from the act of smoking: individual utility u_x
– utility from interaction with other smokers: social utility x^a
(where relative conformity parameter a weighs the two utilities)
competition between individual and social utilities (Abrams et al., 2003, 2011 (language death, religious affiliation))
• population-level mathematical model:
(where b determines the timescale of the dynamics)
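The model equation itself is missing from this text. In the Abrams-style competition models cited above, the dynamics take the form dx/dt = b[(1 − x) x^a u_x − x (1 − x)^a (1 − u_x)]; the sketch below assumes that form carries over here, which should be checked against the paper. The function names and parameter values are illustrative only.

```python
def smoking_rhs(x, a, b, u_x):
    """Assumed Abrams-Strogatz-type right-hand side:
    non-smokers start at a rate proportional to x**a * u_x,
    smokers quit at a rate proportional to (1 - x)**a * (1 - u_x)."""
    return b * ((1 - x) * x**a * u_x - x * (1 - x)**a * (1 - u_x))

def simulate(x0, a, b, u_x, t_end, dt=0.01):
    """Forward-Euler integration of prevalence x with constant u_x."""
    x = x0
    for _ in range(int(round(t_end / dt))):
        x += dt * smoking_rhs(x, a, b, u_x)
        x = min(1.0, max(0.0, x))  # prevalence stays in [0, 1]
    return x

print(simulate(0.1, a=1.0, b=1.0, u_x=0.8, t_end=100))  # high individual utility
print(simulate(0.9, a=1.0, b=1.0, u_x=0.2, t_end=100))  # low individual utility
```

In this assumed form, u_x above 1/2 pushes prevalence up and u_x below 1/2 pushes it down, with b only rescaling time.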
spread of cigarette smoking in a population
– utility directly from the act of smoking: individual utility u_x
– utility from interaction with other smokers: social utility x^a
– (where relative conformity parameter a weighs the two utilities)
• population-level mathematical model:
(a, b, x_0 to be fitted to measured temporal data on prevalence x)
• interpretation / hypothesis / prediction:
– larger a (>1) weighs social utility more over individual utility, and corresponds to more collectivistic country (less individualistic)
– smaller a (<1) gives faster growth/decay when u_x changes (less social inertia)
population-level model for dynamics of smoking
• individual utility u_x: a combination of factors, including advances in our understanding of the health effects of smoking and public policy initiatives designed to curb smoking, has likely reduced u_x over the past century
• modeling assumption:
n(t) is cumulative number of scholarly articles on the health effects of smoking
δ is discount factor (in [0,1])
(u_0, u_∞, δ to be fitted to measured data on prevalence x)
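The formula for the modeling assumption is not preserved in this text. One functional form consistent with the definitions given (an initial utility u_0, a limiting utility u_∞, and a discount factor δ applied once per published article) is shown below. It is an illustrative assumption, not necessarily the expression used on the slide.

```latex
u_x(t) = u_\infty + (u_0 - u_\infty)\,\delta^{\,n(t)},
\qquad 0 \le \delta \le 1
```

With this form, u_x equals u_0 when no articles have been published (n = 0) and approaches u_∞ as n(t) grows, provided δ < 1.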
fit the model to temporal data for prevalence
• prevalence data only available since ~1960
• consumption data available since ~1920!
fit the model to temporal data for prevalence
• use consumption data to estimate prevalence over ~100 years
• then fit estimated prevalence to model
fit the model to temporal data for prevalence
• a good fit is obtained (e.g., better than with step u_x(t))
fit the model to temporal data for prevalence
• prediction / interpretation / hypothesis:
1. larger a (>1) weighs social utility more over individual utility, and corresponds to more collectivistic country (less individualistic)
2. smaller a (<1) gives faster growth/decay when u_x changes (less inertia)
• is our ‘measured’ a linked to societal individualism?
• idea: consider Hofstede’s Individualism Index IDV
interpretation: confirm prediction / hypothesis
1. larger a (>1) weighs social utility more over individual utility, and corresponds to more collectivistic country (less individualistic)
• yes, ‘measured’ a is correlated with IDV!
• confirms prediction, supports hypothesis
interpretation: confirm prediction / hypothesis
2. smaller a (<1) gives faster growth/decay when u_x changes (less inertia)
• yes, ‘measured’ a is correlated with slope!
• confirms prediction, supports hypothesis
interpretation: confirm prediction / hypothesis • hypothesis of faster reaction to sudden change in personal utility in individualistic societies predicts earlier peak time (less social inertia when personal utility changes suddenly) • USA (IDV 91) • Sweden (IDV 71) • USA shows faster change and earlier peak than Sweden
interpretation: confirm prediction / hypothesis • hypothesis of faster reaction to sudden change in personal utility in individualistic societies predicts earlier peak time • confirmed in correlation between peak year and IDV; supports hypothesis
population-level model for dynamics of smoking
conclusions: our model predicts:
– the level of individualism or collectivism of a society may significantly affect the temporal dynamics of smoking prevalence
– the strong influence of the personal utility of smoking (and its decrease due to increased awareness of adverse health effects) leads to faster adoption and cessation of smoking in individualistic societies than in more collectivistic societies (less social inertia when individual utility changes)
all predictions of the model appear to be supported by the data
population-level model for dynamics of smoking
conclusion:
– limitations: we did not separately model effects of gender, income, age, government regulation, ...
– nevertheless, our results support the hypothesis that, averaged over the whole population, differences in societal individualism may have a strong influence on smoking dynamics
– future work: explicitly include social network in models of the spread of smoking
4. some general thoughts on future work • in the past decades, biology and medicine have become increasingly quantitative (revolution!) • “The social sciences are in the midst of an historic change, with large parts moving from the humanities to the sciences in terms of research style, infrastructural needs, data availability, empirical methods, substantive understanding, and the ability to make swift and dramatic progress” (King, 2014): enormous potential for quantitative insight! • e.g., interactions on online social networks in principle provide a wealth of data to analyze and understand many aspects of human behaviour (‘big data’)
example: try to match models to real events
• test hypotheses • falsify models • need real data!
try to match models to real events
• social network data (e.g., Twitter) is in principle available to test models (Howard et al., 2011)
social science and big data
• requirements for progress (in social sciences) using big data:
1. computational infrastructure to store and query/process big data
2. scalable mathematical algorithms to analyze big data
3. scalable software implementation to handle big data
(I am doing research on these aspects, using a dedicated 27-node cluster to develop and implement scalable algorithms for big data using Spark)
social science and big data
• all information related to online social networks is digitally stored, so available in principle (a real treasure trove of relevant data ...)
• perhaps for the first time in history, this allows researchers to quantify human social behaviour ‘at scale’ and in real-time
• major roadblock: it is very hard to obtain online social network data for research! (relevant, timely, at scale, ...)
roadblock: open data for scientific research • my take: to make real progress, online social network data should be freely available for research • unfortunately, this runs counter to commercial interests of social media companies • interesting situation: the ‘roadways’ of online social communication are owned by commercial companies (compare public roads, rivers, market squares, public mail, ...)
roadblock: open data for scientific research
interesting situation: the ‘roadways of online social communication’ are owned by commercial companies
• pro: great technological innovation (ongoing)
• cons:
– these are toll roads (‘free’, but we pay by providing our private information and our eyeballs for advertising)
– these economic transactions are mostly untaxed
– no taxes are paid on company profits (by ‘clever’ international tax avoidance techniques that you and I would call blatant fraud)
– huge privacy issues, digital rights issues, accepted behaviour issues (moral, commercial) (and many of these are decided by American companies and courts)
– parasitic business models
some things seem to be out of balance ...
roadblock: open data for scientific research
online social networks: some things seem to be out of balance ...
possible solutions:
– proper regulation and taxation: unfortunately, there is no international mechanism for this
– “Facebook should be nationalized to protect user rights” (Howard, 2012)
public infrastructure (compare roads, bridges, market squares)
privacy and acceptable use policed by ... police
not unprecedented:
• commercial postal monopoly of 1500-1800 in Europe was replaced by state postal services
• UK turnpike system of toll roads (1700-1900) was replaced by public roads
this could solve many of the existing problems!
in any case, online social network data should be freely available for research to enable real progress
• thank you • questions?
• John Lang and Hans De Sterck, “The Arab Spring: A simple compartmental model for the dynamics of a revolution”, Mathematical Social Sciences 69, 12-21, 2014.
• John Lang and Hans De Sterck, “A Hierarchy of Linear Threshold Models for the Spread of Political Revolutions on Social Networks”, submitted, 2015. [arXiv 1501.04091]
• John Lang, Daniel Abrams, and Hans De Sterck, “The influence of societal individualism on a century of tobacco use: modelling the prevalence of smoking”, submitted, 2014. [arXiv 1407.2188]
population-level differential equation model • population-level ordinary differential equation (ODE) consistent with ABM:
• two compartments: