Shreyas Sundaram Talk by Waterloo Institute for Complexity and Innovation

Diffusing Information and Reaching Agreement in Networks: Convergence and Resilience

Shreyas Sundaram Electrical and Computer Engineering University of Waterloo

Outline

Introduction and Network Models
Information Cascades
Linear Iterative Strategies for Distributed Consensus
Resilience to Malicious Nodes
Ongoing Research



Complex Systems and Networks are Everywhere…

Mouse Gene Network

Social Network Synchronized Fireflies

Credits: Nature, National Geographic, Sentinel Visualizer

Complex Systems and Networks are Everywhere…

Sensor Networks for Airplane Health Monitoring

Electrical Power Grid

Robotic Swarms

Credits: Sampigethaya et al. (Digital Avionics Conference 2007), Urban Ecoist, Swarmrobots.org

Information Diffusion

Networks arise from interactions between various nodes
Key function: dissemination of information
Social networks: ideas, opinions, knowledge, etc.
Engineered networks: measurements, computations, control signals, etc.

Questions:
How effective is the network at diffusing information quickly, efficiently, reliably, …?
What effect can a few nodes have on the behavior of the entire network?

Modeling the Network: Topology

Network can be modeled as a graph
N nodes {x1, x2, …, xN}: people, sensors, computers, robots, “agents”, …
Edge from xi to xj indicates that xi can influence xj

Modeling the Network: Information

Some (or all) nodes have some personal “information”: opinion, position, sensor measurement, etc.
This information gets updated over time, based on interactions with other nodes
Model this information as a real number
Denote node xi’s initial information as xi[0]

  

[Figure: example network with a real-valued initial value at each node]

Dynamics on Networks 

Nodes update their values (information) based on the values of their neighbors



This produces dynamics on the network 

The exact nature of the dynamics depends on how nodes use their neighbors’ values

Have to consider both network topology (who talks to whom) and dynamics (what is done with the information) when studying diffusion of information



Diffusion of Information in Networks

Studied extensively by various communities:
Sociology
Epidemiology
Physics, Biology and Ecology
Economics
Communications
Computer Science and Engineering
…

Many excellent books on this topic



    



Diffusion of Innovations, Rogers, 1962
Dissemination of Information in Communication Networks, Hromkovic et al., 2005
Communication Complexity, Kushilevitz and Nisan, 1997
Distributed Algorithms, Lynch, 1997
Networks, Crowds, and Markets, Easley and Kleinberg, 2010
…

Cascades in Networks

Example: Cascade of a New Idea ([Morris, ’00], [Easley and Kleinberg, ’10]) 

Each node in network can be in state A or state B 



Two neighboring nodes get a benefit if they are in the same state, and no benefit if they are in different states 



e.g., Two competing technologies

Total benefit to each node is sum of benefits from each neighbor

Rule: Each node periodically looks at states of neighbors  

Chooses A if at least a fraction q of its neighbors are A
Chooses B otherwise

q depends on relative benefit from A versus B

Cascades 

 

Suppose all nodes start out in state B
Small subset S of nodes change their state to A, and keep it that way
For the specified “threshold” dynamics, under what topological conditions will all nodes eventually adopt A?
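The threshold rule can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `run_cascade`, the dictionary-based graph encoding, and the four-node path graph are all assumptions made for the example. It also assumes that a node which adopts A never switches back, which is how the seed set S is treated on the slide.

```python
from fractions import Fraction

def run_cascade(neighbors, seeds, q):
    """Return the set of nodes that end up in state A.

    neighbors: dict mapping each node to a list of its neighbors.
    seeds: the initial set S, assumed to stay in state A forever.
    q: adoption threshold (fraction of neighbors that must be in state A).
    """
    adopted = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in adopted or not nbrs:
                continue
            # Exact fraction of this node's neighbors currently in state A.
            frac_a = Fraction(sum(n in adopted for n in nbrs), len(nbrs))
            if frac_a >= q:
                adopted.add(node)
                changed = True
    return adopted

# Hypothetical path graph 1 - 2 - 3 - 4 (not the network from the slides).
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
full = run_cascade(path, {1}, Fraction(1, 2))     # A reaches every node
stalled = run_cascade(path, {1}, Fraction(2, 3))  # node 2 never adopts
```

With q = 1/2 the single seed is enough to convert the whole path, while raising q to 2/3 stops the cascade at the first hop.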

[Figure: example network]

Example ([Easley and Kleinberg, ’10]) 

q = ½, S = {x3, x5}

[Figure: sequence of frames showing state A spreading step by step from S through the example network]

A spreads throughout network!

Example: Different Initial Set 

q = ½, S = {x1, x3}

[Figure: sequence of frames showing state A spreading from S and then stalling]

Spread of A stops!
Problem: there is a close-knit cluster, where every node has most of its neighbors inside the cluster

Result ([Morris, ’00]) 

Definition: Set of nodes is a cluster of density p if every node in set has at least a fraction p of its neighbors inside the set



Suppose set S starts with state A, everybody else in state B
Cascade stops ⇔ Rest of graph contains a cluster of density greater than 1-q



Intuition: No node in cluster has enough neighbors outside to allow new information to penetrate



We’ll come back to this concept later
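The density of a candidate cluster is easy to compute directly from the definition. The sketch below is illustrative only: `cluster_density` and the small graph (a triangle a, b, c with one extra node d attached to a) are hypothetical and do not come from the slides.

```python
from fractions import Fraction

def cluster_density(neighbors, cluster):
    """Largest p such that every node in `cluster` has at least a
    fraction p of its neighbors inside `cluster`."""
    cluster = set(cluster)
    return min(
        Fraction(sum(n in cluster for n in neighbors[v]), len(neighbors[v]))
        for v in cluster
    )

# Hypothetical graph: triangle a-b-c, plus node d attached to a.
graph = {"a": ["b", "c", "d"], "b": ["a", "c"], "c": ["a", "b"], "d": ["a"]}
density = cluster_density(graph, {"a", "b", "c"})  # limited by node a
```

Here the triangle has density 2/3. With q = 1/2 and S = {d}, node a sees only one third of its neighbors in state A, so the triangle never adopts A.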

Reaching Agreement in Networks

Reaching Agreement in Networks 

Previous example studied propagation of a single value



What if there are multiple different values in the network? 



e.g., opinions on a topic, sensor measurements, etc.

Objective: get all nodes to reach agreement on some function of these values 

Synchronization in biological networks, clocks, multi-agent flocking, distributed optimization, …

Dynamics of Consensus 

Many mechanisms proposed for emergence of consensus



Here: look at a very simple linear iterative strategy 

Studied extensively by the control systems community



At each time-step k, each node xi updates its value as a weighted average of its own value and its neighbors’ values



Mathematically:

$$x_i[k+1] = w_{ii}\, x_i[k] + \sum_{j \in \mathrm{nbr}(i)} w_{ij}\, x_j[k]$$

Special case: $w_{ii} = w_{ij} = 1/(\deg_i + 1)$ (simple averaging)
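A minimal sketch of the simple-averaging special case, written from the point of view of each node. The function name `averaging_step`, the three-node path graph and the initial values are assumptions chosen for illustration.

```python
def averaging_step(values, neighbors):
    """One synchronous step of simple averaging, w_ii = w_ij = 1/(deg_i + 1)."""
    return {
        i: (values[i] + sum(values[j] for j in neighbors[i])) / (len(neighbors[i]) + 1)
        for i in values
    }

# Hypothetical path x1 - x2 - x3 with made-up initial values.
neighbors = {1: [2], 2: [1, 3], 3: [2]}
values = {1: 0.0, 2: 7.0, 3: 0.0}
for _ in range(200):
    values = averaging_step(values, neighbors)
```

All three values settle at 3.0, which is a weighted average of the initial values (with weight 3/7 on x2) rather than their plain mean.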

Example

[Figure: node values in an example network at k = 0, k = 1, k = 2, and as k → ∞]

System-Wide Model 

 

Local dynamics based on averaging
Need to model system-wide behavior
Evolution of values of all nodes:

$$\underbrace{\begin{bmatrix} x_1[k+1] \\ \vdots \\ x_N[k+1] \end{bmatrix}}_{x[k+1]} = \underbrace{\begin{bmatrix} w_{11} & \cdots & w_{1N} \\ \vdots & \ddots & \vdots \\ w_{N1} & \cdots & w_{NN} \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} x_1[k] \\ \vdots \\ x_N[k] \end{bmatrix}}_{x[k]}$$

Constraints:
$w_{ij} = 0$ if node xj is not a neighbor of node xi
W is a stochastic matrix: all elements of W are nonnegative, and each row sums to 1

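Previous Example: System-Wide Model

$$\underbrace{\begin{bmatrix} x_1[k+1] \\ x_2[k+1] \\ x_3[k+1] \end{bmatrix}}_{x[k+1]} = \underbrace{\begin{bmatrix} 1/2 & 1/2 & 0 \\ 1/3 & 1/3 & 1/3 \\ 0 & 1/2 & 1/2 \end{bmatrix}}_{W} \underbrace{\begin{bmatrix} x_1[k] \\ x_2[k] \\ x_3[k] \end{bmatrix}}_{x[k]}$$

Evolution of values: $x[k] = W x[k-1] = W^2 x[k-2] = \cdots = W^k x[0]$
Thus: $\lim_{k \to \infty} x[k] = \lim_{k \to \infty} W^k x[0]$
In this example:

$$\lim_{k \to \infty} W^k = \begin{bmatrix} 2/7 & 3/7 & 2/7 \\ 2/7 & 3/7 & 2/7 \\ 2/7 & 3/7 & 2/7 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \underbrace{\begin{bmatrix} 2/7 & 3/7 & 2/7 \end{bmatrix}}_{d^T}, \qquad \lim_{k \to \infty} x[k] = \begin{bmatrix} d^T x[0] \\ d^T x[0] \\ d^T x[0] \end{bmatrix}$$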
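The limit can be checked numerically. The sketch below assumes the example's W is the simple-averaging matrix of a three-node path (rows 1/2, 1/2, 0; 1/3, 1/3, 1/3; 0, 1/2, 1/2); the initial vector `x0` is made up for illustration.

```python
import numpy as np

# Simple-averaging weights for the path x1 - x2 - x3 (assumed form of the example).
W = np.array([[1/2, 1/2, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/2, 1/2]])

W_limit = np.linalg.matrix_power(W, 100)   # numerically indistinguishable from the limit
d = np.array([2/7, 3/7, 2/7])

x0 = np.array([1.0, 2.0, 3.0])             # hypothetical initial values
consensus = W_limit @ x0                   # every entry equals d @ x0
```

Every row of `W_limit` equals `d`, so each node ends at the same weighted average `d @ x0` of the initial values.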

Convergence Based on Properties of Markov Chains 

Network-wide dynamics: x[k+1] = Wx[k]



Topology of network is captured by sparsity pattern of W



Since W is a stochastic matrix, can view this system as a Markov Chain [deGroot, 1974]



Standard property of Markov Chain: 



If network is connected and each node has a positive self-weight, there exists a unique vector $d^T$ such that

$$\lim_{k \to \infty} W^k = \begin{bmatrix} d^T \\ \vdots \\ d^T \end{bmatrix} = \mathbf{1} d^T$$

$d^T$ is the left eigenvector of W for eigenvalue 1: $d^T W = d^T$

Proof of Convergence (cont’d) 

Thus: $\lim_{k \to \infty} x[k] = \lim_{k \to \infty} W^k x[0] = \mathbf{1} d^T x[0]$

Result: Network connected and all self weights are positive ⇒ All nodes reach consensus on $d^T x[0]$, where $d^T W = d^T$

Rate of convergence is geometric, governed by the second largest eigenvalue magnitude of W (the mixing time of the Markov chain)
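Both quantities in this result can be read off an eigendecomposition. The helper below is a sketch under assumptions: the name `consensus_weights_and_rate` is hypothetical, and the test matrix is the three-node simple-averaging matrix assumed for the earlier example.

```python
import numpy as np

def consensus_weights_and_rate(W):
    """Return (d, rate): the left eigenvector of W for eigenvalue 1,
    normalized to sum to 1, and the second largest eigenvalue magnitude."""
    vals, vecs = np.linalg.eig(W.T)        # eigenvectors of W.T are left eigenvectors of W
    order = np.argsort(-np.abs(vals))      # sort by magnitude, largest first
    d = np.real(vecs[:, order[0]])
    d = d / d.sum()                        # fixes both scale and sign
    rate = float(np.abs(vals[order[1]]))
    return d, rate

W = np.array([[1/2, 1/2, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/2, 1/2]])
d, rate = consensus_weights_and_rate(W)
```

For this W the weights are (2/7, 3/7, 2/7) and the rate is 1/2, so the disagreement between nodes roughly halves at every step.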

Time-Varying Networks 

What happens if the network is changing over time? 



Neighbors of each node change over time

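Model for linear strategy:

$$\underbrace{\begin{bmatrix} x_1[k+1] \\ \vdots \\ x_N[k+1] \end{bmatrix}}_{x[k+1]} = \underbrace{\begin{bmatrix} w_{11}[k] & \cdots & w_{1N}[k] \\ \vdots & \ddots & \vdots \\ w_{N1}[k] & \cdots & w_{NN}[k] \end{bmatrix}}_{W[k]} \underbrace{\begin{bmatrix} x_1[k] \\ \vdots \\ x_N[k] \end{bmatrix}}_{x[k]}$$

Weight matrix is now time-varying, but still stochastic at each time-step
View this as a non-homogeneous Markov chain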

Result 

As long as the network is connected over time, and there is a lower bound on the weights,

$$\lim_{k \to \infty} x[k] = \mathbf{1} d^T x[0]$$



$d^T$ is some vector, no longer the left eigenvector of a single matrix
It depends on how the network changes over time
Still an open question to characterize final consensus value in time-varying networks (except for some special cases)

References: Tsitsiklis 1984, Jadbabaie et al., 2003, Ren et al., 2005, Moreau, 2005, Xiao et al., 2004, Olfati-Saber et al., 2007, …

Potential for Incorrect Behavior 

So far: assumed that all nodes in the network behave as expected
“What happens if some nodes don’t follow the specified strategy?”



Simplest case: suppose some node keeps its value constant



Result [Tsitsiklis ’84, Jadbabaie et al. ’03, Gupta et al. ’05, …]: As long as the network is connected over time, if some node keeps its value constant, all nodes will converge to the value of that node



Linear strategy is easily influenced by stubborn/malicious agents (?)
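The effect of a single stubborn node is easy to reproduce. In this sketch the function `step_with_stubborn`, the three-node path and the values are hypothetical; normal nodes use simple averaging and the stubborn node simply ignores the update rule.

```python
def step_with_stubborn(values, neighbors, stubborn):
    """One averaging step in which nodes in `stubborn` keep their value fixed."""
    new = {}
    for i in values:
        if i in stubborn:
            new[i] = values[i]
        else:
            total = values[i] + sum(values[j] for j in neighbors[i])
            new[i] = total / (len(neighbors[i]) + 1)
    return new

# Hypothetical path x1 - x2 - x3; node 1 holds its value at 10 forever.
neighbors = {1: [2], 2: [1, 3], 3: [2]}
values = {1: 10.0, 2: 0.0, 3: 0.0}
for _ in range(500):
    values = step_with_stubborn(values, neighbors, stubborn={1})
```

After enough steps every node sits at 10, the stubborn node's value, regardless of what the other nodes started with.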

Robustness of Linear Iterative Strategies to Malicious Nodes

Exchanging Information in Arbitrary Networks: Building Intuition

[Figure: network in which x1 receives x3’s value via x2 and x4; x3[0] = 5]

Node x1 wants to obtain x3’s value
Node x2 is malicious and pretends x3[0] = 9
Node x4 behaves correctly and uses x3[0] = 5
Node x1 doesn’t know who to believe, i.e., is node x3’s value equal to 9 or 5?
Node x1 needs another node to act as tie-breaker

Graph Connectivity 

The connectivity of a graph is the largest k such that every pair of nodes is joined by at least k node-disjoint paths

[Figure: three example graphs on nodes x1, …, x4 with connectivity 1, 2, and 3]

Menger’s Theorem: If a graph has connectivity k, there is a set of k nodes that disconnects the graph
This set of nodes is called a vertex cut
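For small graphs, connectivity can be found by brute force using the vertex-cut characterization from Menger's Theorem: look for the smallest set of nodes whose removal disconnects the graph. The sketch below is illustrative; the function names and the three four-node graphs (a path, a cycle and a complete graph) are assumptions and need not match the graphs drawn on the slide. A complete graph has no vertex cut, so the code uses the usual convention that its connectivity is N - 1.

```python
from itertools import combinations

def is_connected(neighbors, removed=frozenset()):
    """Depth-first search over the graph with the nodes in `removed` deleted."""
    nodes = [v for v in neighbors if v not in removed]
    if not nodes:
        return True
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        v = stack.pop()
        for w in neighbors[v]:
            if w not in removed and w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

def vertex_connectivity(neighbors):
    """Size of the smallest vertex cut (N - 1 for a complete graph)."""
    n = len(neighbors)
    for k in range(n - 1):
        for cut in combinations(neighbors, k):
            if not is_connected(neighbors, set(cut)):
                return k
    return n - 1

# Hypothetical four-node graphs.
path = {1: [2], 2: [1, 3], 3: [2, 4], 4: [3]}
cycle = {1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1]}
complete = {i: [j for j in range(1, 5) if j != i] for i in range(1, 5)}
```

The search is exponential in the number of nodes, so it is only meant for small illustrative graphs.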


Main Results 

In fixed networks with up to f malicious nodes, we show:

“Easy”: Node xi has 2f or fewer node-disjoint paths from some node xj ⇒ f malicious nodes can update their values in such a way that xi cannot calculate any function of xj’s initial value

“Tricky”: Node xi has 2f+1 or more node-disjoint paths from every other node ⇒ xi can obtain all initial values after running linear strategy for at most N time-steps with almost any weights

Modeling Faulty/Malicious Behavior in Linear Iterative Strategies 

Linear iterative strategy for information dissemination: 

Correct update equation for node xi:

$$x_i[k+1] = w_{ii}\, x_i[k] + \sum_{j \in \mathrm{nbr}(i)} w_{ij}\, x_j[k]$$

Faulty or malicious update by node xi:

$$x_i[k+1] = w_{ii}\, x_i[k] + \sum_{j \in \mathrm{nbr}(i)} w_{ij}\, x_j[k] + u_i[k]$$

$u_i[k]$ is an additive error at time-step k
Note: this model allows node xi to update its value in a completely arbitrary manner

Modeling the Values Seen by Each Node 



Each node obtains neighbors’ values at each time-step
Let $y_i[k] = C_i x[k]$ denote values seen by xi at time-step k
Rows of $C_i$ index portions of x[k] available to xi

For node x3 [Figure: four-node network in which x3 is adjacent to x1 and x4]:

$$y_3[k] = \begin{bmatrix} x_1[k] \\ x_3[k] \\ x_4[k] \end{bmatrix} = \underbrace{\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}}_{C_3} x[k]$$

Linear Iteration with Faulty/Malicious Nodes 

Let $S = \{x_{i_1}, x_{i_2}, \ldots, x_{i_f}\}$ be set of faulty/malicious nodes
Unknown a priori, but its size is bounded by f

Update equation for entire system:

$$x[k+1] = W x[k] + \underbrace{\begin{bmatrix} e_{i_1} & e_{i_2} & \cdots & e_{i_f} \end{bmatrix}}_{B_S} \underbrace{\begin{bmatrix} u_{i_1}[k] \\ u_{i_2}[k] \\ \vdots \\ u_{i_f}[k] \end{bmatrix}}_{u_S[k]}$$

$$y_i[k] = C_i x[k]$$

Here $e_i$ is the i-th standard basis vector
Constraint: weight $w_{ij} = 0$ if node xj is not a neighbor of node xi

Recovering the Initial State 

System model for linear iteration with malicious nodes:

$$x[k+1] = W x[k] + B_S u_S[k], \qquad y_i[k] = C_i x[k]$$



This is a linear dynamical system 

Use tools from linear system theory to analyze behavior
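The state-space model can be simulated directly. This is a sketch under assumptions: the function `simulate`, the zero-based node indices, the three-node weight matrix and the particular error signal are all hypothetical choices for illustration.

```python
import numpy as np

def simulate(W, x0, malicious, inputs, C, steps):
    """Run x[k+1] = W x[k] + B_S u_S[k] and record y[k] = C x[k].

    malicious: list of (zero-based) indices of the nodes in S.
    inputs: function k -> array u_S[k] with one entry per malicious node.
    """
    n = len(x0)
    B = np.eye(n)[:, malicious]            # columns e_i for the malicious nodes
    x = np.array(x0, dtype=float)
    outputs = []
    for k in range(steps):
        outputs.append(C @ x)
        x = W @ x + B @ inputs(k)
    return x, outputs

W = np.array([[1/2, 1/2, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/2, 1/2]])
x0 = np.array([1.0, 2.0, 3.0])
C1 = np.eye(3)[[0, 1]]                     # the first node sees itself and the middle node

x_clean, _ = simulate(W, x0, [1], lambda k: np.zeros(1), C1, 3)
x_bad, ys = simulate(W, x0, [1], lambda k: np.array([0.5]), C1, 1)
```

With a zero error signal the model reduces to the ordinary linear iteration; a nonzero `u_S[k]` shifts only the malicious node's own update, and the effect then spreads through W.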

Using Structured System Theory to Prove Resilience of Linear Iterations 

To prove resilience, we use the following approach:

Network has connectivity 2f+1 ⇒ (Structured System Theory) A linear system has some property “P” ⇒ Linear strategy is resilient to f malicious nodes in a given network

Background on Linear and Structured System Theory

Properties of Linear Systems

$$x[k+1] = A x[k] + B u[k], \qquad y[k] = C x[k]$$

State: $x \in \mathbb{R}^n$, Output: $y \in \mathbb{R}^p$, Input: $u \in \mathbb{R}^m$



Controllability: can the state be driven to any desired value using some sequence of inputs?



Observability: does the output trajectory uniquely specify the state of the system, when the inputs are known (or zero)?



Strong Observability: does the output trajectory uniquely specify the state of the system, when the inputs are unknown?



Invertibility: does the output trajectory uniquely specify the input of the system, when the state is known?

Standard Approach: Use algebraic tests to determine if properties hold
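As one concrete instance of an algebraic test, observability can be checked with the classical rank condition on the observability matrix. The sketch below uses a three-node simple-averaging matrix as A; the function names and both output matrices are assumptions made for illustration.

```python
import numpy as np

def observability_matrix(A, C):
    """Stack C, CA, ..., CA^(n-1)."""
    n = A.shape[0]
    blocks, M = [], C.copy()
    for _ in range(n):
        blocks.append(M)
        M = M @ A
    return np.vstack(blocks)

def is_observable(A, C):
    """Rank test: observable iff the observability matrix has rank n."""
    return np.linalg.matrix_rank(observability_matrix(A, C)) == A.shape[0]

A = np.array([[1/2, 1/2, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/2, 1/2]])
C_end = np.eye(3)[[0, 1]]                  # outputs: first and middle node
C_mid = np.array([[0.0, 1.0, 0.0]])        # output: middle node only

end_ok = is_observable(A, C_end)
mid_ok = is_observable(A, C_mid)
```

Watching only the middle node fails the test here: with these symmetric weights the two end nodes influence it identically, so their initial values cannot be told apart.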

Linear Structured Systems

$$x[k+1] = A x[k] + B u[k], \qquad y[k] = C x[k]$$

$x \in \mathbb{R}^n$, $y \in \mathbb{R}^p$, $u \in \mathbb{R}^m$



System is structured if every entry of the matrices (A,B,C) is either zero, or an independent free parameter



Used to represent and analyze dynamical systems with unknown/uncertain parameters [Lin ’74, Dion et al., ’03]



Structured system theory: determines properties of systems based on the zero/nonzero structure of matrices

Structural Properties

“Structured system has property P”: Property P holds for at least one choice of free parameters in the matrices (A, B, C)
Structural properties are generic! Structured system has property P ⇒ Structured system will have property P for almost any choice of free parameters
Use graph-based techniques to determine if structural properties hold
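Genericity can be seen numerically by drawing the free parameters at random and repeating an algebraic test. The sketch below does this for observability; the function `generic_observable`, the zero/nonzero patterns and the sampling range are all assumptions chosen for illustration.

```python
import numpy as np

def generic_observable(A_pattern, C_pattern, trials=5, seed=0):
    """Fill the nonzero positions with random values and run the
    observability rank test once per trial."""
    rng = np.random.default_rng(seed)
    n = A_pattern.shape[0]
    results = []
    for _ in range(trials):
        A = A_pattern * rng.uniform(0.5, 1.5, A_pattern.shape)
        C = C_pattern * rng.uniform(0.5, 1.5, C_pattern.shape)
        O = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])
        results.append(bool(np.linalg.matrix_rank(O) == n))
    return results

# Zero/nonzero structure of a three-node path with self-loops,
# observed through the middle node only (hypothetical example).
A_pattern = np.array([[1, 1, 0], [1, 1, 1], [0, 1, 1]], dtype=float)
C_pattern = np.array([[0, 1, 0]], dtype=float)
results = generic_observable(A_pattern, C_pattern)
```

Every random draw passes the test, even though a perfectly symmetric choice of weights on the same pattern would fail it. Such failing choices form a set of measure zero, which is what "for almost any choice" means.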

Example of Structured System and Associated Graph  

Structured system can be represented as a graph
Structured system: $x[k+1] = A x[k] + B u[k]$, $y[k] = C x[k]$, where each nonzero entry of (A, B, C) is an independent free parameter (eleven free parameters in this example)

[Figure: associated graph H with input vertices U, state vertices X (numbered 1 to 4), and output vertices Y]

Example: Test for Structural Invertibility

Theorem [van der Woude, ’91]: Graph H has m node-disjoint paths from inputs to outputs ⇔ System is structurally invertible

[Figure: example graph H with input vertices U, state vertices X, and output vertices Y]

J. W. van der Woude, Mathematics of Control, Signals, and Systems, 1991

Result for the example graph H: Structurally Invertible

References on Structured Systems   

 

C. T. Lin, “Structural Controllability”, IEEE TAC, 1974
K. J. Reinschke, Multivariable Control: A Graph-Theoretic Approach, 1988
J-M. Dion, C. Commault and J. van der Woude, “Generic Properties and Control of Linear Structured Systems: A Survey”, Automatica, 2003
D. D. Siljak, Decentralized Control of Complex Systems, 1991
Sundaram & Hadjicostis, CDC 2009, ACC 2010 (Structural properties over finite fields, upper bound on generic controllability/observability indices)

Application to Resilient Information Dissemination

Recovering the Initial State 

System model for linear iteration with malicious nodes:

$$x[k+1] = W x[k] + B_S u_S[k], \qquad y_i[k] = C_i x[k]$$



Objective: Recover initial state x[0] from outputs of the system, without knowing uS[k]



Almost equivalent to strong observability of system
The set S is also unknown here
Only know it has at most f elements

Recovering the Initial State 

Want to ensure that the output trajectory uniquely specifies the initial state 



Same output trajectory must not be generated by two different initial states and two (possibly) different sets of f malicious nodes

By linearity, can show:

Can recover initial state in system

$$x[k+1] = W x[k] + B_S u_S[k], \qquad y_i[k] = C_i x[k]$$

for any unknown set S of f nodes

⇔

Linear system

$$x[k+1] = W x[k] + B_Q u_Q[k], \qquad y_i[k] = C_i x[k]$$

is strongly observable for any known set Q of 2f nodes

Structural Strong Observability 

For any set Q of 2f nodes, strong observability of

$$x[k+1] = W x[k] + B_Q u_Q[k], \qquad y_i[k] = C_i x[k]$$

is a structural property
Graph of system is given by graph of original network, with additional inputs and outputs

Using tests for structural strong observability, we show: xi has 2f+1 node-disjoint paths from every other node ⇒ Above system will be strongly observable for any set Q of 2f nodes

Robustness of the Linear Iterative Scheme 

By generic nature of structural properties: Network is (2f+1)-connected ⇒ For almost any W, any node can recover all initial values despite actions of f malicious nodes

How long will each node need to wait before the values that it receives uniquely specify the initial state?
If a linear system is strongly observable, outputs of system over N time-steps are sufficient to determine initial state ⇒ Any node can obtain x[0] after at most N time-steps
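For a small network the claim can be checked numerically. The sketch below stacks the outputs over a finite horizon as Y = O x[0] + M U and uses the rank condition rank([O M]) = n + rank(M), a standard finite-horizon test for whether x[0] is uniquely determined when the inputs U are unknown. Everything concrete here is an assumption: the function name, the five-node graph (a complete graph with the edge between nodes 0 and 4 removed, which has connectivity 3 and so should tolerate f = 1), the random weights, and the horizon L = 5.

```python
import numpy as np
from itertools import combinations

def strongly_observable(W, B, C, L):
    """Finite-horizon test: is x[0] uniquely determined by y[0..L]
    when the inputs u[0..L-1] are unknown?"""
    n, m = B.shape
    p = C.shape[0]
    powers = [np.linalg.matrix_power(W, k) for k in range(L + 1)]
    O = np.vstack([C @ powers[k] for k in range(L + 1)])
    M = np.zeros((p * (L + 1), m * L))
    for r in range(1, L + 1):              # output time
        for c in range(r):                 # input time
            M[r*p:(r+1)*p, c*m:(c+1)*m] = C @ powers[r - 1 - c] @ B
    rank = np.linalg.matrix_rank
    return bool(rank(np.hstack([O, M])) == n + rank(M))

# Hypothetical 5-node network: complete graph minus the edge 0-4.
rng = np.random.default_rng(0)
adj = np.ones((5, 5))
adj[0, 4] = adj[4, 0] = 0
W = adj * rng.uniform(0.1, 1.0, (5, 5))    # random weights on the allowed entries
W /= W.sum(axis=1, keepdims=True)          # make each row sum to 1
C0 = np.eye(5)[[0, 1, 2, 3]]               # node 0 sees itself and nodes 1, 2, 3

def B_for(Q):
    return np.eye(5)[:, list(Q)]

all_pairs_ok = all(strongly_observable(W, B_for(Q), C0, 5)
                   for Q in combinations(range(5), 2))
cut_ok = strongly_observable(W, B_for((1, 2, 3)), C0, 5)
```

Every set Q of 2f = 2 nodes passes the test, so node 0 can recover x[0] despite one malicious node. If all three nodes of the vertex cut {1, 2, 3} were allowed to inject errors, the test fails, because node 4's value can be hidden from node 0 completely.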

Summary 

We know linear iterative strategies are quite powerful if:
Network is fixed
All nodes know the entire network
All nodes know the linear combinations used by other normal nodes
Each node can store a lot of data and do extensive computations

Would like to relax these conditions  

Study more “natural” mechanisms to produce robustness
Nodes should require no knowledge of network

Ongoing Research 

Tradeoff between knowledge of network and ability to overcome malicious behavior



New (relaxed) objective: “All normal nodes should reach consensus on some value that is between the smallest and largest initial values of the normal nodes”



Malicious nodes should not be able to bias the consensus value excessively

f-Local Model 

For large networks, also want to allow the possibility of a large number of malicious nodes 



f-local malicious model: allow up to f malicious nodes in every neighborhood

Natural strategy: have each normal node be “suspicious” of extreme values in its neighborhood 

Remove the f highest and lowest values in its neighborhood, and take weighted average of remaining values

$$x_i[k+1] = w_{ii}[k]\, x_i[k] + \sum_{j \in \mathrm{nbr}(i)} w_{ij}[k]\, x_j[k]$$

where the sum runs over the neighbors that remain after removing extreme values
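A single node's filtered update might look as follows. This is a sketch of the rule as stated on the slide, with two assumptions: the function name `filtered_update` is hypothetical, and the remaining values are averaged with equal weights, whereas the slide allows any weighted average.

```python
def filtered_update(own, neighbor_values, f):
    """Drop the f highest and f lowest neighbor values, then average
    what is left together with the node's own value (equal weights)."""
    ordered = sorted(neighbor_values)
    kept = ordered[f:len(ordered) - f]     # empty if there are 2f or fewer neighbors
    return (own + sum(kept)) / (len(kept) + 1)

# Hypothetical neighborhood with two outliers, tolerating f = 1.
new_value = filtered_update(5.0, [4.0, 6.0, 100.0, -50.0], f=1)
```

The extreme values 100 and -50 are discarded, so the update stays at 5.0 no matter how far the outliers are pushed.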

Convergence 

Under what conditions will this strategy work?



Fact: 2f+1 connectivity is no longer sufficient

[Figure: two fully-connected graphs with n/2 nodes each, one with all initial values 0 and the other with all initial values 1, joined by one-to-one edges between the sets]

Connectivity of graph is n/2, but no node ever uses a value from opposite set
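The counterexample can be run directly. The sketch below builds two cliques of four nodes joined by one-to-one edges (so n = 8 and the connectivity is 4, which exceeds 2f+1 = 3 for f = 1) and applies the filtering rule with equal weights. The function name, the node numbering and the equal weights are assumptions.

```python
def trimmed_step(values, neighbors, f):
    """Each node drops its f highest and f lowest neighbor values and
    averages the rest with its own value (equal weights assumed)."""
    new = {}
    for i, nbrs in neighbors.items():
        ordered = sorted(values[j] for j in nbrs)
        kept = ordered[f:len(ordered) - f]
        new[i] = (values[i] + sum(kept)) / (len(kept) + 1)
    return new

half = 4
neighbors = {}
for i in range(half):
    # Node i sits in the first clique, node i + half in the second;
    # the two are joined by a single cross edge.
    neighbors[i] = [j for j in range(half) if j != i] + [i + half]
    neighbors[i + half] = [j + half for j in range(half) if j != i] + [i]

values = {i: 0.0 if i < half else 1.0 for i in range(2 * half)}
for _ in range(50):
    values = trimmed_step(values, neighbors, f=1)
```

Each node's single cross-clique neighbor is always the extreme value in its neighborhood, so it is always discarded: the two cliques stay at 0 and 1 forever, even with no malicious node present.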

Similarities with Cascade Problem 

Similar phenomenon to “clusters” seen earlier in the cascade problem 



Graph contains sets where no node in the set has enough neighbors outside

Key differences:
Every neighbor of every node might have a different value in our setting: “Filtering” rule versus “Threshold” rule
No malicious nodes in the cascade problem

Robust Graphs 

We introduce the following definitions  



A set S is r-reachable if it has a node that has at least r neighbors outside the set
A graph is r-robust if for any two disjoint nonempty subsets of nodes, at least one of the sets is r-reachable

Preliminary results:

Graph is (3f+1)-robust ⇒ Normal nodes will reach consensus despite actions of any f-local set of malicious nodes

Graph is less than (f+1)-robust ⇒ Normal nodes will not reach consensus for some initial values
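Both definitions can be checked by brute force on small graphs. The sketch below is illustrative: the function names are hypothetical, the search is exponential in the number of nodes, and the test graph (two triangles joined by one-to-one edges) is an assumption modeled on the earlier two-clique counterexample.

```python
from itertools import combinations

def is_r_reachable(neighbors, subset, r):
    """True if some node in `subset` has at least r neighbors outside it."""
    return any(sum(n not in subset for n in neighbors[v]) >= r for v in subset)

def is_r_robust(neighbors, r):
    """True if, for every pair of disjoint nonempty subsets, at least one
    of the two subsets is r-reachable."""
    nodes = list(neighbors)
    subsets = [frozenset(c)
               for k in range(1, len(nodes))
               for c in combinations(nodes, k)]
    return all(
        is_r_reachable(neighbors, s1, r) or is_r_reachable(neighbors, s2, r)
        for s1, s2 in combinations(subsets, 2)
        if not (s1 & s2)
    )

# Hypothetical graph: triangles {0, 1, 2} and {3, 4, 5} with edges 0-3, 1-4, 2-5.
prism = {0: [1, 2, 3], 1: [0, 2, 4], 2: [0, 1, 5],
         3: [4, 5, 0], 4: [3, 5, 1], 5: [3, 4, 2]}
one_robust = is_r_robust(prism, 1)
two_robust = is_r_robust(prism, 2)
```

This graph has connectivity 3, yet it is only 1-robust: taking the two triangles as the disjoint subsets, every node has exactly one neighbor outside its own set. For f = 1 it is therefore less than (f+1)-robust.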

Ongoing Research 

Try to narrow the gap between sufficient and necessary robustness conditions



Characterize robustness of typical complex networks  



Erdős–Rényi networks
Scale-free networks

Use “r-robust” networks to characterize behavior of other information diffusion dynamics

Thanks!