Proc. of Int. Conf. on Recent Trends in Information, Telecommunication and Computing, ITC
On the Linear Time Construction of Minimum Spanning Tree
Awadhesh Kumar Singh1, Ashish Negi2, Manish Kumar3, Vivek Rathee4 and Bharti Sharma5
1,2,3,4,5 National Institute of Technology Kurukshetra, Haryana, INDIA
1aksinreck@rediffmail.com, 2ashishnegi33@gmail.com, 3manishgl08@gmail.com, 4rockonvivek@gmail.com, 5bharti_kanhiya@yahoo.co.in
Abstract— The article presents a simple algorithm to construct a minimum spanning tree and to find the shortest path between a pair of vertices in a graph. Our illustration includes the proof of termination. The complexity analysis and simulation results are also included.
Index Terms— graph, spanning tree, ad hoc network
I. INTRODUCTION
Graph structures are widely used to model computer networks, and the spanning tree is one such structure. A spanning tree of a graph is a subgraph that contains all the vertices and is a tree. The sum of its edge weights is called the weight of the spanning tree. The spanning tree of smallest weight, among all possible spanning trees of a graph, is called the minimum spanning tree (henceforth, MST).
Ad hoc networks undergo topological changes due to node movement or failure. Hence, the MST modeling such a network often needs to be reorganized; this is called the dynamic maintenance of the MST. Since nodes in an ad hoc network are energy constrained and topological changes may be frequent, the MST computation method should be lightweight and fast.
The popular MST algorithms in the literature fall into two broad categories, namely, message efficient and time efficient. The message efficient algorithms, e.g. the GHS algorithm [1], Chin-Ting [2], Gafni [3], and the Awerbuch algorithm [4], exhibit linear or superlinear time complexity; nevertheless, they are message optimal. On the other hand, the time efficient algorithms, e.g. the Garay-Kutten-Peleg algorithm [5], the Kutten-Peleg algorithm [6], and Elkin [7], exhibit sublinear time complexity; however, they are not message optimal.
We present an algorithm to compute the MST. Though the algorithm is centralized, it exhibits message complexity better than algorithms [1–4] while keeping the time complexity linear.
II. THE ALGORITHM CONCEPT
We consider a mobile ad hoc network (MANET) modeled as an undirected connected graph G = (V, E), where V is the set of vertices (nodes) and E is the set of edges (communication links) between them. Each edge e ∈ E has a non-zero weight w. Each node has a unique Id. Two nodes are called neighbors if they are one hop away from each other and can communicate directly.
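To make the MST definition and the graph model concrete, the following minimal Python sketch computes an MST with Kruskal's algorithm over a weighted edge list. It is an illustration under stated assumptions rather than the authors' implementation: nodes are assumed to carry the Ids 0 to n-1, and the names kruskal_mst, tree_weight, and EXAMPLE_EDGES (a made-up four-node graph) are hypothetical.

```python
from typing import List, Tuple

Edge = Tuple[int, int, float]  # (u, v, weight) of an undirected edge


def kruskal_mst(num_nodes: int, edges: List[Edge]) -> List[Edge]:
    """Return the edges of a minimum spanning tree of a connected graph.

    Assumption: node Ids are the integers 0 .. num_nodes - 1.
    """
    parent = list(range(num_nodes))  # union-find forest, one set per node

    def find(x: int) -> int:
        # Follow parent links to the set representative (with path halving).
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree: List[Edge] = []
    # Scan edges by increasing weight; keep an edge only if it joins two sets.
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            tree.append((u, v, w))
    return tree


def tree_weight(tree: List[Edge]) -> float:
    """Weight of a spanning tree: the sum of its edge weights."""
    return sum(w for _, _, w in tree)


# Hypothetical four-node example graph (not taken from the paper).
EXAMPLE_EDGES: List[Edge] = [
    (0, 1, 1.0), (1, 2, 2.0), (2, 3, 1.5), (3, 0, 4.0), (0, 2, 3.0),
]
mst = kruskal_mst(4, EXAMPLE_EDGES)
print(mst, tree_weight(mst))
```

A spanning tree of an N-node connected graph always has N-1 edges, so the sketch keeps three of the five example edges.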
DOI: 02.ITC.2014.5.15 © Association of Computer Electronics and Electrical Engineers, 2014
We also assume that, despite multipath effects and varying channel conditions, message propagation between neighbor nodes is FIFO. We aim to collect the entire graph information at one or more nodes and use this information to create the MST and to find the shortest paths between nodes. We present two methods for collecting the entire graph information. In the first method, all the graph information converges at a single node, which is not fixed a priori, called the central
node. In the second method, however, we adopt a distributed approach so that each node gets the complete graph information. Hence, we name our methods ‘centralized’ and ‘decentralized’, respectively. Though both approaches can be used for constructing the MST and for finding shortest paths, it has been shown in the complexity analysis that the centralized approach performs better for constructing the MST, whereas the decentralized approach does well at finding shortest paths.
A. Message Types and Data Structures
We assume an initiator node v ∈ V, which performs breadth first search (BFS) in the beginning. The initiator can be any arbitrary node in the system. The BFS procedure outputs the BFS tree with v as root. In the BFS tree, there are two types of nodes, namely, leaf and non-leaf nodes. A leaf node has a single edge, connecting it to its parent, whereas a non-leaf node has two or more edges that connect it to its neighbors. Assume that there are N vertices and E edges in total in the graph. The value of E is upper bounded by N^2; a simple graph in which each vertex is connected to every other vertex has N(N-1)/2 edges. Each node is aware of its neighbors and of the weights of the edges connecting it to them.
Messages: The following messages are used in the algorithm.
1. Make_Me_Parent: It is used by a node to request some other node to become its child.
2. ACCEPT: It is sent by a node that agrees to become a child, to the sender of the Make_Me_Parent message.
3. REJECT: It is sent by a node that has already become the child of some other node, in response to a Make_Me_Parent message.
4. E_Msg: It is sent by node i to its parent. It is the edge information message, containing the Ids of all neighbors of node i and the weights of the edges connecting them. Along with it, node i forwards all other E_Msg messages received from its BFS neighbors. It also contains the Id of its source node.
5. MST_Info_Msg: It is the message containing the MST information.
6. Graph_Info_Msg: It is the message containing the complete graph information.
Data Structures: Each node i maintains the following data structures:
1. has_parent_i: A boolean variable indicating whether node i has a parent.
2. allNeighborList_i: The list containing the Ids of all neighbors (BFS and non-BFS) of node i.
3. countInfoEdge_i: The number of edges through which node i has received E_Msg. It is initialized to zero and incremented on reception of each E_Msg.
4. countBFSedge_i: The number of BFS neighbors of node i.
5. BFS_NeighborList_i: The list containing the Ids of all BFS neighbors of node i.
6. Array_list_i[]: A 1-d array with countBFSedge_i elements; Array_list_i[j] = 1 if an E_Msg has been received from node j, and Array_list_i[j] = 0 otherwise. Array_list_i[j] is initialized to 0 for all j.
7. Reject_Count_i: The number of REJECT messages received by node i.
B. Algorithm I
We present the algorithm in event driven form. First, the initiator starts the BFS protocol by sending a Make_Me_Parent message to its neighbors.
At Ordinary Node:
Event 1: on receiving Make_Me_Parent message:
    if has_parent = 1 then
        reply REJECT
    else {
        has_parent ← 1;   /* the sender becomes the parent */
        reply ACCEPT;
        if (|allNeighborList| == 1) then
            send E_Msg to the sender of the Make_Me_Parent message
        else
            forward Make_Me_Parent message to other neighbors
    }
Event 2: on receiving REJECT message:
    increment Reject_Count;
    if Reject_Count = |allNeighborList| - 1 then
        send E_Msg to parent
Event 3: on receiving ACCEPT message:
    put the sender Id in BFS_NeighborList
Event 4: on receiving E_Msg from node j:
    Array_list[j] ← 1;
    countInfoEdge ← countInfoEdge + 1;
    if (countInfoEdge == countBFSedge - 1) then
        scan Array_list[] to find node k for which Array_list[k] = 0;
        append its own E_Msg message to the other received E_Msg messages and forward this message to node k
    else if (countInfoEdge < countBFSedge - 1) then
        store the message
    else if (countInfoEdge == countBFSedge) then
        if (node_i.id < E_Msg.sender.id) {
            /* the entire graph information has converged at node i */
            calculate the exact MST using Kruskal’s or Prim’s algorithm;
            send MST_Info_Msg on the newly computed MST edges
        }
The Working of Algorithm I: The nodes that have received some ACCEPT messages are non-leaf nodes; the nodes that do not receive any ACCEPT message become leaf nodes. Thus, each node is inherently aware of its status as a leaf or non-leaf node. The leaf nodes send their edge information messages to their respective parent nodes. If a non-leaf node has degree e in the BFS tree, it waits for the arrival of edge information messages on e-1 of its BFS tree edges. Once it has received them, the node appends its own edge information to the received messages and forwards the combined message on the remaining e-th edge. Finally, there is a single node in the system that receives edge information messages on all of its BFS edges. As this node holds the entire graph information, we call it the ‘central’ node henceforth, and we call Algorithm I centralized. The central node then computes the MST using Kruskal’s or Prim’s algorithm and disseminates the MST information on the newly computed MST edges.
C. Algorithm II
Events 1, 2, and 3 are the same as in Algorithm I.
Event 4: on receiving E_Msg from node j:
    Array_list[j] ← 1;
    countInfoEdge ← countInfoEdge + 1;
    if (countInfoEdge == countBFSedge - 1) then
        scan Array_list[] to find node k for which Array_list[k] = 0;
        append its own E_Msg message to the other received E_Msg messages and forward this message to node k
    else if (countInfoEdge < countBFSedge - 1) then
        store the message
    else if (countInfoEdge == countBFSedge) then
        if (node_i.id < E_Msg.sender.id) {
            /* the entire graph information has converged at node i */
            send Graph_Info_Msg on BFS edges
        }
The Working of Algorithm II: Unlike in Algorithm I, after receiving the entire graph information the central node does not compute the MST; rather, it sends a Graph_Info_Msg wave on its BFS edges. The receiver nodes, in turn, forward this wave to their BFS neighbors. In this way, all the nodes in the network receive the complete graph information. Each node can then apply Dijkstra’s single source shortest path (SSSP) algorithm to compute the shortest path to any vertex in the graph.
D. The Proof of Termination
Theorem: Only one node would receive the complete graph information.
Proof: We use the BFS tree for collecting graph information; thus, there is no cycle in the collection structure. Moreover, every node sends graph information on only one edge. Now, assume the contrary: two arbitrary nodes i and j both receive the whole graph information, at times t1 and t2 respectively. Without loss of generality, assume t1 < t2. There are two possibilities:
(i) Nodes i and j have a common ancestor node, say k. Since nodes i and j have both received the whole graph information, each has received graph information on all of its BFS edges. Therefore, the common ancestor k has sent data on two edges, which is a contradiction.
(ii) Nodes i and j do not have a common ancestor node. This is possible only if one of them is the parent of the other, i.e. either node i is the parent of node j or vice versa. Thus, nodes i and j are neighbors. In this situation, both of them can receive the graph information if they have different send and receive channels; otherwise, a collision may occur. Hence, two neighbor nodes can receive the entire graph information. However, in this situation, the node Id is used for tie breaking, so only the node with the lower Id acts as the central node. Therefore, the theorem holds.
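The E_Msg collection phase and the Id-based tie break of case (ii) can be sketched as a small single-process simulation. This is a minimal sketch under assumptions that go beyond the text: the BFS tree is given as an adjacency map, every tree node of degree one (the root included) starts as a leaf, message delivery is modeled by one global FIFO queue, and only sender Ids are tracked rather than the edge lists carried in E_Msg. The names collect_graph_info and BFS_TREE are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Optional, Set, Tuple


def collect_graph_info(tree: Dict[int, List[int]]) -> Tuple[Optional[int], int]:
    """Simulate E_Msg collection on a tree with at least two nodes.

    Returns (Id of the central node, number of E_Msg messages sent).
    """
    received: Dict[int, Set[int]] = {v: set() for v in tree}
    queue: Deque[Tuple[int, int]] = deque()  # (sender, receiver), FIFO order
    sent = 0

    # Assumption: every degree-one node acts as a leaf and sends first.
    for v in sorted(tree):
        if len(tree[v]) == 1:
            queue.append((v, tree[v][0]))
            sent += 1

    central: Optional[int] = None
    while queue:
        sender, node = queue.popleft()
        received[node].add(sender)       # plays the role of Array_list[j] <- 1
        degree = len(tree[node])         # countBFSedge
        count = len(received[node])      # countInfoEdge
        if count == degree - 1:
            # Forward on the single tree edge that has not delivered an E_Msg.
            (k,) = [n for n in tree[node] if n not in received[node]]
            queue.append((node, k))
            sent += 1
        elif count == degree:
            # All tree edges have delivered: tie break on the node Id.
            if node < sender:
                central = node
        # Otherwise (count < degree - 1) the message is simply stored.
    return central, sent


# Hypothetical five-node BFS tree rooted at node 0.
BFS_TREE = {0: [1, 2], 1: [0, 3, 4], 2: [0], 3: [1], 4: [1]}
central, e_msgs = collect_graph_info(BFS_TREE)
print(central, e_msgs)
```

In this example nodes 0 and 1 both end up with messages on all of their tree edges, and the lower Id (node 0) is selected. The sketch sends N = 5 E_Msg messages, i.e. (N-1) plus the one extra message that arises when two neighbors both complete.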
E. Message Complexity
Theorem: The number of messages exchanged in Algorithm I as well as in Algorithm II is 4|E|.
Proof: We consider a network of N nodes modeled as an undirected connected graph G = (V, E), where |E| is the total number of edges in the graph. From the operational view of both algorithms, when a node receives a Make_Me_Parent message, it sends a Make_Me_Parent message on every edge except the one to its parent. Also, every edge is shared between two neighboring nodes. Thus, every edge can be used to send a Make_Me_Parent message by two nodes, so a maximum of 2|E| Make_Me_Parent messages can be sent. However, on receiving the first Make_Me_Parent message, each of the (N-1) nodes other than the initiator gets a parent and hence does not send a Make_Me_Parent message to its parent. Therefore, the number of Make_Me_Parent messages actually propagated is (N-1) less than 2|E|, i.e. 2|E| - (N-1). In response to each Make_Me_Parent message, a node receives either a REJECT or an ACCEPT message. Hence, the total number of ACCEPT and REJECT messages also amounts to 2|E| - (N-1). Thus, the total number of messages propagated to construct the BFS tree is 2{2|E| - (N-1)}. Once the BFS tree is in place, all nodes except the one that eventually becomes the central node send graph information to their respective parents. Thus, the total number of E_Msg messages needed to collect the whole graph information at one node is (N-1). The computation of the MST does not involve any additional message propagation because it is a local computation at the central node. Finally, (N-1) additional messages are required to distribute the MST information across all the nodes. Therefore, the total message overhead amounts to 2{2|E| - (N-1)} + (N-1) + (N-1), i.e. 4|E|. The result is the same for Algorithm II. It is interesting to note that the message efficient algorithms [1–4] for constructing an MST have message complexity O(|E| + N log N).
F. Time Complexity
The time complexity of a distributed algorithm is the maximum time taken by a computation of the algorithm under the following assumptions [8]: (i) a process can execute any finite number of events in zero time, i.e., the local computations performed by nodes do not affect the time complexity (alternatively, they are “free” [7]); (ii) the time between the sending and the receipt of a message is at most one time unit. In other words, the running time of an algorithm equals the number of sequential message propagations. Thus, the algorithm takes its worst case running time when all messages are propagated in sequence. Hence, for computing the worst case time efficiency, we may assume that all N nodes are arranged in a straight line and that an extreme node is the initiator. The total number of edges is then (N-1). The BFS tree construction procedure propagates (N-1) Make_Me_Parent messages and (N-1) ACCEPT messages. There is no REJECT message in this case. Once the BFS tree is constructed, (N-1) E_Msg messages flow to collect the complete edge information at the initiator, and the initiator becomes the central node. The MST computation is performed locally by the central node; hence, it does not incur any running time overhead. Afterwards, (N-1) additional messages are exchanged to disseminate the MST information in the entire network. Hence, a total of 4(N-1) sequential control messages are needed to construct the MST. Therefore, the algorithm requires O(N) rounds of distributed communication, and its worst case running time is O(N).
G. Simulation of Algorithm I
We have simulated an example ad hoc network using ns2. The nodes are randomly distributed in an area of 350 × 350 units. We vary the number of nodes and edges in the network arbitrarily. Accordingly, the number of edges per node also varies.
The total number of messages required to construct the BFS tree and disseminate the MST information in the graph comes out to be 4|E| + 1, where |E| is the number of edges in the graph. Note that the message count in the simulation contains one extra message compared to that in Section II-E, because the static analysis assumes a collision-free scenario. In the simulation, however, one additional E_Msg (terminate) message is generated because the complete graph information is finally received by two neighbor nodes, of which the one with the lower Id becomes the central node. Table I summarizes the results related to MST construction. All eleven cases from Table I are plotted in Figure 1. In Figure 1 and Figure 3, the X-axis represents the simulation serial number (Sr. No.) as shown in Table I and Table II. In both figures, the Y-axis represents the counts of nodes, edges, and messages, each shown in a different color. Also, from Figure 2, we infer that the message count increases linearly with the edge count in the system.
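As a worked check of the two counts, the static total from the message complexity proof and the simulated total can be written side by side. The symbols M_static and M_sim are introduced here only for illustration; the numerical instance uses the first simulated case (16 nodes, 19 edges).

```latex
\begin{align*}
M_{\text{static}} &= 2\bigl(2|E| - (N-1)\bigr) + (N-1) + (N-1) = 4|E|, \\
M_{\text{sim}}    &= M_{\text{static}} + 1 = 4|E| + 1. \\[4pt]
\text{For } N = 16,\ |E| = 19:\quad
M_{\text{static}} &= 2(38 - 15) + 15 + 15 = 46 + 30 = 76, \\
M_{\text{sim}}    &= 76 + 1 = 77.
\end{align*}
```

The terms in N cancel, which is why the totals depend only on the edge count.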
TABLE I. MST CONSTRUCTION RESULTS
Sr. No.  No. of nodes  No. of edges  Messages  edges/node   4*E + 1
1.       16            19            77        1.1875       77
2.       16            29            117       1.8125       117
3.       24            29            117       1.208333333  117
4.       16            41            165       2.5625       165
5.       24            47            189       1.958333333  189
6.       64            87            349       1.359375     349
7.       30            99            397       3.3          397
8.       100           139           557       1.39         557
9.       200           279           1117      1.395        1117
10.      200           325           1301      1.625        1301
11.      400           655           2621      1.6375       2621
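The relationship between the columns of Table I can be checked mechanically. The short Python sketch below re-enters the table rows by hand and verifies that every reported message count equals 4*E + 1 and that the edges/node column is the ratio of edges to nodes. The names TABLE_I, all_match, and ratios are hypothetical.

```python
# Rows of Table I as (nodes, edges, messages), transcribed from the table.
TABLE_I = [
    (16, 19, 77), (16, 29, 117), (24, 29, 117), (16, 41, 165),
    (24, 47, 189), (64, 87, 349), (30, 99, 397), (100, 139, 557),
    (200, 279, 1117), (200, 325, 1301), (400, 655, 2621),
]

# Every simulated message count should equal 4*E + 1.
all_match = all(messages == 4 * edges + 1 for _, edges, messages in TABLE_I)

# The edges/node column is the number of edges divided by the number of nodes.
ratios = [edges / nodes for nodes, edges, _ in TABLE_I]

print(all_match)
print([round(r, 4) for r in ratios])
```

Because the message count depends only on the edge count, cases 2 and 3 (16 and 24 nodes, 29 edges each) report the same 117 messages.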
Figure 1. Message count on varying node and edge count simultaneously in MST construction (X-axis: Sr. No. 1 to 11; Y-axis: count, 0 to 3000; series: Messages, Edges, Nodes)
Figure 2. Edge count Vs message count in MST construction (X-axis: number of edges; Y-axis: number of messages, 0 to 3000; series: Messages)
H. Simulation of Algorithm II
We use the same simulation setup as for Algorithm I. The total number of messages required to construct the BFS tree and disseminate the entire graph information to each vertex comes out to be 4*E + 1, where E is the number of edges in the graph. The value is the same as in Algorithm I. Also, the computation of the single source shortest path (SSSP) does not involve any message propagation because each node computes it locally using the complete graph information available. Table II summarizes the results related to Algorithm II. All ten cases in Table II are plotted in Figure 3. We observe that the plot for Algorithm II shows a trend similar to that of Algorithm I.
TABLE II. ALGORITHM II RESULTS
Sr. No.  No. of nodes  No. of edges  Messages  edges/node   4*E + 1
1.       16            19            77        1.1875       77
2.       16            29            117       1.8125       117
3.       24            29            117       1.208333333  117
4.       16            41            165       2.5625       165
5.       24            47            189       1.958333333  189
6.       64            87            349       1.359375     349
7.       30            99            397       3.3          397
8.       100           139           557       1.39         557
9.       200           279           1117      1.395        1117
10.      200           325           1301      1.625        1301
Figure 3. Message count on varying node and edge count simultaneously in Algorithm II (X-axis: Sr. No. 1 to 10; Y-axis: count, 0 to 1400; series: Messages, Edges, Nodes)
III. CONCLUSIONS
A linear time algorithm to construct the minimum spanning tree and single source shortest paths was presented. The message efficiency of our algorithms has been confirmed by static analysis as well as simulation results. The message overhead is under strict control because the major computation work is local in both algorithms. Furthermore, the computation of a spanning tree is predominantly a sequential task; nevertheless, our algorithm manifests linear convergence time.
REFERENCES
[1] R. Gallager, P. Humblet, and P. Spira, “A distributed algorithm for minimum weight spanning trees,” ACM Trans. Programming Languages and Systems, vol. 5, no. 1, pp. 66–77, 1983.
[2] F. Chin and H. Ting, “An almost linear time and O(n log n+e) messages distributed algorithm for minimum weight spanning trees,” IEEE Symp. Foundations of Computer Science, pp. 257–266, 1985.
[3] E. Gafni, “Improvements in the time complexity of two message-optimal election algorithms,” ACM Symp. Principles of Distributed Computing, pp. 175–185, 1985.
[4] B. Awerbuch, “Optimal distributed algorithms for minimum weight spanning tree: counting, leader election, and related problems,” ACM Symp. Theory of Computing, pp. 230–240, 1987.
[5] J. Garay, S. Kutten, and D. Peleg, “A sublinear time distributed algorithm for minimum weight spanning trees,” SIAM J. Comput., vol. 27, pp. 302–316, 1998.
[6] S. Kutten and D. Peleg, “Fast distributed construction of k-dominating sets and applications,” J. Algorithms, vol. 28, pp. 40–66, 1998.
[7] M. Elkin, “A faster distributed protocol for constructing minimum spanning tree,” ACM-SIAM Symp. Discrete Algorithms, pp. 352–361, 2004.
[8] G. Tel, Introduction to Distributed Algorithms, 2nd ed., Cambridge Univ. Press, New York, 2000, pp. 209–210.