
Bloom-BIRD: A Scalable Open Source Router Based on Bloom Filter

Journal: IEEE Network Magazine
Manuscript ID: Draft
Topic or Series: Special Issue-Open Source for Networking: Development and Experimentation
Date Submitted by the Author: n/a
Complete List of Authors:
Bahrambeigy, Bahram; Islamic Azad University, Science and Research branch, Information Technology Department
Ahmadi, Mahmood; Razi University, Computer Engineering Department
Fazlali, Mahmood; Shahid Beheshti University (SBU), Computer Science Department
Key Words: BIRD, Bloom Filter, Open Source Routers, Quagga, XORP

Bahram Bahrambeigy, Mahmood Ahmadi and Mahmood Fazlali

Abstract: The flexibility and configurability of open-source routers have extended their use across networks. At the same time, the need for high-performance, high-speed routers has become a fundamental issue because of the steady growth of information exchange over the Internet and intranets. In this paper we therefore employ a Bloom filter to accelerate the BIRD routing daemon, which performed best in a resource usage comparison (i.e., CPU and memory usage) of three open-source routers carried out in the paper. To the best of our knowledge, this is the first application of a Bloom filter to the BIRD software router. Bloom-BIRD (our modified version of BIRD) can scale its Bloom filter capacity, so the false positive rate is kept at an acceptable level. It shows up to 93% speedup for IP lookups over standard BIRD when the number of nodes inserted into its internal FIB (Forwarding Information Base) becomes huge.

Index Terms: BIRD, Bloom Filter, Open Source Routers, Quagga, XORP.

Bahram Bahrambeigy is with the Information Technology Department, Science and Research branch, Islamic Azad University, Kermanshah, Iran (e-mail: bahramwhh@gmail.com). Mahmood Ahmadi is with the Computer Engineering Department, Razi University, Kermanshah, Iran (e-mail: m.ahmadi@razi.ac.ir). Mahmood Fazlali is with the Computer Science Department, Shahid Beheshti University (SBU), Tehran, Iran (e-mail: fazlali@sbu.ac.ir).

I. INTRODUCTION

The need for high-performance, high-speed routers has become a fundamental issue because of the steady growth of information exchange over the Internet and intranets. With the adoption of CIDR (Classless Inter-Domain Routing), routers must find the best match among prefixes whose lengths range from 1 to 128 bits, depending on the IP version and the prefix in use. Finding matching prefixes is time consuming, and many hardware (e.g., TCAM and SRAM) and algorithmic (e.g., binary search) approaches have been proposed in the literature, as discussed in the related works section.

Modern IP router solutions can be classified into three main categories: hardware routers, software routers, and programmable network processors (NPs) [1]. PC-based software routers offer a reasonable computational platform that is easy to develop for and to program, which is their most important advantage over hardware routers. Current software routers can forward up to 40 Gbit/s of traffic on a single commodity personal computer [1]. In addition, open-source software routers give researchers the opportunity to study and change the code and to build better routers for their needs. BIRD [2], Quagga [3], and XORP [4] are examples of such open-source routers and are the ones selected in this paper. The three routers are compared in terms of CPU and memory usage in a simple scenario in which all of them run the OSPF protocol. Experimental results show that BIRD performs best: Quagga uses more than twice, and XORP more than 49 times, the CPU that BIRD uses, while for memory Quagga needs about twice and XORP about 44 times as much as BIRD. Therefore, BIRD is selected for implementing a Bloom filter (BF) [5] on its internal FIB (Forwarding Information Base), the data structure on which all of its routing tables are based. Results show that the BF makes BIRD up to 93% faster than its internal hashing mechanism alone when big FIBs are searched and the result of the search is negative. There is also some speedup when the searched prefix is present in the FIB.

The main concern of this paper is to show how a Bloom filter can help an open-source router speed up searches when the number of inserted nodes becomes huge. To keep the comparison between the two versions of BIRD (i.e., standard BIRD and Bloom-BIRD) fair, the basic rules of standard BIRD are left unchanged. For example, the maximum order of BIRD's main hash is 16 bits, and the same holds for Bloom-BIRD. Bloom-BIRD adds a Bloom filter array (a space overhead) to speed up both simple searches for a given IP and length and Longest Prefix Matching (LPM) searches. The array can scale its capacity, so the false positive rate is kept at an acceptable level.

The main contributions of the paper are as follows:
- A comparison of the memory and CPU usage of three open-source routers: BIRD, Quagga and XORP.
- The proposal of the Bloom-BIRD router, which uses a Bloom filter in its architecture to enhance the performance of the BIRD open-source router.

The rest of the paper is organized as follows. Section II reviews related work on Bloom filters and their applications in network processing. Section III first presents a scenario for comparing the three open-source routers and then the results of the comparison. Section IV contains three parts: the first is a brief introduction to BIRD's FIB data structure, the second

part introduces the scenario used to evaluate Bloom-BIRD and standard BIRD, and the last part presents the results of the Bloom-BIRD evaluation. Section V concludes the paper.

II. RELATED WORKS

A Bloom filter (BF) is a randomized, probabilistic data structure proposed by Burton Bloom in 1970 [5]. A BF normally consists of a bit array representing the presence of inserted elements. If checking the k hash positions of an element yields a negative answer, the element has certainly not been inserted. However, false positives (FPs) may occur: the BF may wrongly report that an element exists, so the actual hash table must be checked to confirm positive answers. In a hardware implementation, the BF's bit array can reside in on-chip memory so that the k hash checks are done in parallel. The main advantage of the BF is its space and time efficiency: it consumes much less space than ordinary data structures, at the cost of possible collisions, and it needs less, and more predictable, time to query a member.

More than 20 variants of the BF are presented in the literature [6], each designed for a particular purpose. For example, the standard Bloom filter (SBF) checks whether a specific element is present. An important drawback of the SBF is that insertions cannot be undone. The Counting Bloom filter (CBF) and, later, the Deletable BF (DlBF) were proposed to make removal possible [6]. In a CBF, each bit of the Bloom array is replaced by a counter. Each insertion increments the counters selected by the k hash functions, and each deletion decrements them. The DlBF splits the BF array into multiple regions and tracks the regions in which collisions occur; a small fraction of the bit array is used to record whether each region is collision-free. If a bit lies in a collision-free region, it can be reset safely; otherwise it is not safe to reset, so some bits may remain set after a deletion. The CBF is selected in this paper instead of the DlBF because of its simplicity and its consistency under deletions. The DlBF would be a good option if the BF were implemented in on-chip memory.

Other useful variants are the Scalable Bloom filter (SlBF) and the Dynamic Bloom filter (DBF), which adapt their capacity as the number of inputs increases [6]. The SlBF does not support counting and deletion, while the DBF does. Both address a drawback of the SBF: the number of input entries may be unpredictable, so the error rate grows if the BF array does not scale with the number of inputs.

BFs are used in various applications, including network processing. As discussed in [7], these applications can be classified into four categories: resource routing, packet routing, measurement, and collaboration in overlay and peer-to-peer networks. IP route lookup and packet classification are also important applications of BFs in network processing (e.g., [8]). Longest Prefix Matching (LPM), or Best Matching Prefix (BMP), which falls under IP route lookup, is another interesting area of BF application. Many algorithms have been proposed in the literature to speed up BMP. In [9], the authors classify BMP algorithms into trie-based algorithms, binary search on prefix values, and binary search on prefix lengths. A trie is a tree-based data structure that organizes prefixes on a digital basis, using the bits of the prefixes to direct the branching [9]. Trie-based schemes perform a linear search on the prefix length because they compare one bit at a time, so the worst case is W memory accesses for a prefix length of W. Binary search on prefix values takes time proportional to $\log_2 N$, where N is the number of prefixes, and binary search on prefix lengths takes time proportional to $\log_2 W$ [9]. A BF application for LPM [6] uses parallel checks in on-chip memory to accelerate lookups before the slower off-chip memory is accessed.

A previous work that implemented a BF on a router for acceleration is BUFFALO [10]. Its authors implemented a complete BF, even on hardware, to show how a BF can help forwarding and shortest-path routing, and the design is implemented in the Click router. Their implementation adjusts the BF size and the false positive rate dynamically without affecting the forwarding rate. Although BUFFALO is a complete BF implementation on a real router, the main concern of this paper is to show a BF application in the BIRD routing daemon (an open-source router), which to the best of our knowledge is done here for the first time. Bloom-BIRD can scale its Bloom filter capacity, so the false positive rate is kept at an acceptable level. The basic rules and chain ordering of BIRD are not changed, and the modifications are kept modest to allow a fair comparison.
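The counting Bloom filter described above can be illustrated with a minimal Python sketch. The class name, the method names and the SHA-256 based double hashing are assumptions made for this illustration only; they are not the hash functions used in Bloom-BIRD.

```python
import hashlib


class CountingBloomFilter:
    """Counting Bloom filter: an array of counters instead of bits."""

    def __init__(self, m, k):
        self.m = m                    # number of counters
        self.k = k                    # number of hash functions
        self.counters = [0] * m

    def _positions(self, item):
        # Assumption for this sketch: derive k indexes from one SHA-256
        # digest by double hashing (h1 + i * h2) mod m.
        digest = hashlib.sha256(item.encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.counters[pos] += 1

    def remove(self, item):
        # Only valid for items that were added before.
        for pos in self._positions(item):
            self.counters[pos] -= 1

    def might_contain(self, item):
        # False means "certainly absent"; True may be a false positive.
        return all(self.counters[pos] > 0 for pos in self._positions(item))


cbf = CountingBloomFilter(m=1024, k=3)
cbf.add("10.0.0.0/8")
cbf.add("192.168.0.0/16")
print(cbf.might_contain("10.0.0.0/8"))   # an inserted element is never missed
cbf.remove("10.0.0.0/8")
cbf.remove("192.168.0.0/16")
print(sum(cbf.counters))                 # all counters are back to zero
```

Because every insertion is mirrored by counter increments, a deletion can undo it exactly, which is the property that a plain bit array lacks.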

III. OPEN SOURCE ROUTERS PERFORMANCE

In this section, three open-source routers, i.e. BIRD, Quagga and XORP are compared in terms of CPU and memory usage while implementing OSPF protocol in a simple scenario. In the first subsection, the scenario is presented and in the latter part, results of comparison are discussed.


Fig. 1. Simple scenario to test routers; packet injection from ubuntu1 to ubuntu2

A. Scenario and Test-bed
The scenario used to compare the three routers is depicted in Fig. 1. It contains two routers (router1, router2) and two clients (ubuntu1, ubuntu2), and packets are injected from ubuntu1 towards ubuntu2. The scenario is implemented in Oracle VirtualBox v4.1, and all virtual machines run Ubuntu 10.04 with an unmodified Linux kernel 3.2.0. The two routers have the same hardware configuration: 403 MB of RAM and two network interfaces. The two clients also share one configuration: one network interface and 256 MB of RAM. The host of the virtual machine manager is a home PC with a 2.88 GHz dual-core CPU.

Packets are injected from the ubuntu1 client to the ubuntu2 destination with Ostinato version 0.5 [11]. For the evaluation, a burst stream of 30 TCP, 30 UDP and 30 ICMP packets, each 64 bytes long, is used; the injection rate is set to 2 bursts/s and the run takes 1:30 minutes. To gather the CPU and memory usage of the routers, the Valgrind profiler [12] is run on the two routers simultaneously. Valgrind is a DBI (Dynamic Binary Instrumentation) framework designed for building heavyweight DBA (Dynamic Binary Analysis) tools. It consists of several tools, of which only two, Callgrind and Massif, are used in this comparison. Callgrind uses the number of function calls as the metric for CPU usage. Massif, which is used for memory usage, is a heap profiler that takes a snapshot at each memory allocation or deallocation to measure the amount of memory allocated or freed.

BIRD (1.3.8), Quagga (0.99.22) and XORP (1.8.5) are each executed separately in the scenario three times. Both routers (router1, router2) run the same software router, and resource usage is measured by Valgrind. The reported results are the average over the two routers.

Fig. 2. Three open-source routing daemons CPU usage. XORP is worst and BIRD is best.


Fig. 3. Three open-source routing daemons memory usage (MB). XORP is worst and BIRD is best.

B. Comparison results
The results of the comparison of the three open-source routers are presented in Figures 2 and 3. All routers run the OSPF protocol in the simple scenario depicted in Figure 1. In Figure 2, the y axis shows the total number of function calls of each open-source router (i.e., CPU usage) in millions ($10^6$), gathered by the Callgrind tool of the Valgrind profiler, and the x axis shows the router names. In Figure 3, the y axis shows the total memory usage of the routers in MB, gathered by the Massif tool of the Valgrind profiler, and the x axis again shows the router names. As the two figures show, BIRD achieves the best performance for the OSPF implementation in terms of both CPU and memory usage. Quagga uses more than twice, and XORP more than 49 times, the CPU that BIRD uses. For memory, Quagga needs about twice (1.80 times) and XORP about 44 times as much as BIRD. Since BIRD is the best of the three routers in terms of CPU and memory usage, it is selected for the BF implementation, to make it even faster when the network is huge.

IV. BLOOM-BIRD: A BETTER BIRD

The BIRD project [2] aims to develop a fully functional dynamic IP routing daemon primarily targeted at (but not limited to) Linux, FreeBSD and other UNIX-like systems. It was developed at the Faculty of Math and Physics, Charles University, Prague, and is currently developed and supported by CZ.NIC Labs (http://www.nic.cz/). It supports the latest versions of routing protocols such as BGP, RIP and OSPF, both IPv4 and IPv6, and a simple command-line interface for configuring the router.

BIRD has a fundamental data structure named the FIB (Forwarding Information Base), on which its routing tables are based. This data structure stores IP prefixes and their lengths. Searching a FIB that stores a huge number of prefixes can become a speed bottleneck, which a CBF (Counting Bloom Filter) can relieve, as presented in this section. Storing a prefix in BIRD's FIB is a two-stage mechanism. In the first stage, an order-bit hash of the prefix value is calculated to find the bucket index in the main hash table (the order varies from 10 to 16). In the second stage, the bucket holds a chain of nodes in a linked list, which may become long when the number of nodes is huge. A BF can therefore avoid traversing these long chains for missing nodes, which accelerates the whole router. Nodes in each chain are allocated by BIRD's slab allocator, whose implementation is based on Bonwick's proposal [13] and makes linked-list traversal and node allocation/deallocation fast.

Three important functions operate on FIBs in BIRD: fib_get(), fib_find() and fib_route(), which are responsible for adding, searching and longest prefix matching, respectively. Each FIB in BIRD starts with a default 10-bit

hash table and increases the table size as the number of prefixes grows. The expansion stops at 16 bits, after which the chains start to grow longer. When this happens, the implemented BF helps the main hash table by avoiding searches in these long FIB chains for IP prefixes that cannot be found. The following subsection describes how Bloom-BIRD is implemented, the next one discusses the scenario used to test it, and the third presents the results of the comparison between standard BIRD and Bloom-BIRD.

A. Implementation
BIRD stores prefixes in a hash table of dynamic size, which grows when the number of inserted prefixes becomes huge. The order starts at 10 bits, grows up to 16 bits and never grows afterwards. To keep the comparison between standard BIRD and Bloom-BIRD fair, these rules are not changed. Instead, Bloom-BIRD has a BF array in each FIB that lets it respond faster whenever possible. The hashing mechanism of the BF in Bloom-BIRD is inspired by the main hash table of standard BIRD: if the number of entries increases, Bloom-BIRD changes the size and order of the BF array in the same way the main hash table does. Bloom-BIRD starts with an 18-bit BF and increases it to 20 bits when the capacity limit is reached. This expansion continues up to 32 bits and stops there. The capacity of the BF array for which an acceptable FP error rate can be expected is therefore limited to 32 bits. For simplicity, the expansion of the BF array is not included in the pseudo-code in this subsection.

Figure 4 presents BIRD's FIB hash table in simplified form. As mentioned before, the order varies between 10 and 16. The figure shows the basic fib_find() function, which searches a FIB for a given prefix and length and uses the ipa_hash() function to determine which bucket of the main hash table to use. The main hash array is an order-bit array of fib_node type. On insertion, the node is then placed in a new free location in the bucket's linked-list chain.
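The two-stage storage and the capped growth of the main hash table can be sketched in Python as follows. This is a toy model, not BIRD's code: the class `Fib`, the hash folding in `_index()` and the growth rule (add one bit when entries outnumber buckets) are assumptions chosen for illustration; only the idea of an order that stops growing at a maximum comes from the text.

```python
class Fib:
    """Toy FIB: a hash table of 2**order buckets, each holding a chain."""

    def __init__(self, min_order=10, max_order=16):
        self.order = min_order
        self.max_order = max_order
        self.buckets = [[] for _ in range(1 << self.order)]
        self.count = 0

    def _index(self, prefix, order):
        # Hypothetical stand-in for ipa_hash(): fold a 32-bit prefix
        # and keep the low `order` bits.
        h = prefix ^ (prefix >> 16)
        h ^= h >> 8
        return h & ((1 << order) - 1)

    def insert(self, prefix, length):
        self.buckets[self._index(prefix, self.order)].append((prefix, length))
        self.count += 1
        # Assumed growth rule: grow while entries outnumber buckets,
        # but never beyond max_order; after that only the chains grow.
        if self.count > len(self.buckets) and self.order < self.max_order:
            self._grow()

    def _grow(self):
        self.order += 1
        old = self.buckets
        self.buckets = [[] for _ in range(1 << self.order)]
        for chain in old:
            for prefix, length in chain:
                idx = self._index(prefix, self.order)
                self.buckets[idx].append((prefix, length))

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)


fib = Fib(min_order=2, max_order=4)      # tiny orders so the effect is visible
for i in range(200):
    fib.insert(i << 8, 24)               # 200 distinct /24 prefixes
print(fib.order, len(fib.buckets), fib.longest_chain())
```

With the order capped at 4, the 200 entries share 16 buckets, so at least one chain holds 13 or more nodes. This is the situation in which a negative lookup has to walk a long chain.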

Fig. 4. FIB hashing architecture of standard BIRD. fib_find() uses the ipa_hash() function to determine which hash bucket must be selected. Nodes are then chained in each bucket.

Figure 5 depicts how the BF is implemented in the FIB. The fib_find() function first checks the BF for the given prefix. If the BF confirms that the prefix exists, the main hash table is checked to obtain the pointer to the found node, or to discover that an FP error occurred. On the other hand, and more importantly, if the BF returns a negative answer, the main hash table is not checked at all. The main advantage of the BF is therefore this latter case, in which the check of the main hash table, and possibly the traversal of a long chain, is avoided.

Fig. 5. FIB hashing architecture of Bloom-BIRD. Before the main hash table is checked, k hash functions are applied to the IP prefix to check the Bloom filter array. If the Bloom filter confirms that the prefix exists, the search continues in the main hash table. Otherwise fib_find() immediately returns null.

The pseudo-code of fib_find() in standard BIRD, as discussed, is shown as the SB_fib_find() function (to distinguish the two versions, standard BIRD functions carry an SB prefix and Bloom-BIRD functions a BB prefix). In the first line, the variable e points to the selected bucket, which is traversed to find the given prefix. In the second and third lines, the bucket chain is traversed to find the requested node. Two situations may arise after this loop: either the loop has found the node, and the fourth line returns it, or the traversal has ended with a null pointer, and the fourth line returns that null pointer to indicate that the node cannot be found.

SB_fib_find(fib, prefix, length)
1. e = fib_table[ipa_hash(prefix)]
2. while((not empty e) AND (not found e))
3.     e = e->next
4. return e
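The SB_fib_find() pseudo-code can be turned into a small runnable Python sketch that also counts how many chain nodes are examined. The `FibNode` class, the `ipa_hash()` stand-in (low bits of the prefix) and the use of small integers as prefixes are assumptions for illustration, not BIRD's definitions.

```python
class FibNode:
    """One node of a bucket chain."""

    def __init__(self, prefix, length, next_node=None):
        self.prefix = prefix
        self.length = length
        self.next = next_node


def ipa_hash(prefix, order):
    # Hypothetical stand-in for BIRD's ipa_hash(): keep the low `order` bits.
    return prefix & ((1 << order) - 1)


def sb_fib_find(fib_table, order, prefix, length):
    """Return (node or None, number of chain nodes examined)."""
    e = fib_table[ipa_hash(prefix, order)]                  # line 1
    examined = 0
    while e is not None:                                    # lines 2-3
        examined += 1
        if e.prefix == prefix and e.length == length:
            break
        e = e.next
    return e, examined                                      # line 4


ORDER = 2                                  # 4 buckets, to force long chains
table = [None] * (1 << ORDER)
for p in range(20):                        # integers stand in for prefixes
    b = ipa_hash(p, ORDER)
    table[b] = FibNode(p, 24, table[b])    # prepend to the bucket chain

hit, cost_hit = sb_fib_find(table, ORDER, 1, 24)
miss, cost_miss = sb_fib_find(table, ORDER, 21, 24)
print(hit.prefix, cost_hit, miss, cost_miss)
```

In this toy table the lookup for the missing prefix 21 walks the whole five-node chain of its bucket before returning None, which is exactly the cost that a Bloom filter check in front of the table is meant to avoid.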

In the Bloom-BIRD version of this function, the BF array and its hashing mechanism are used first, in order to skip prefixes that do not exist, as discussed earlier. The modified function is shown as BB_fib_find(). Its first three lines perform the BF search for the given prefix, using k hash functions. In each iteration, if the probed location of the BF array is empty, the search returns a negative answer (i.e., a NULL pointer). As the second line shows, k independent hash functions are used to select the array locations to check. These hash functions are also depicted in Figure 5 as the bloom_hashi() functions.

BB_fib_find(fib, prefix, length)
1. for(i=1 to k)
2.     if(filter[bloom_hashi(prefix)] is empty)
3.         return NULL
4. e = fib_table[hash(prefix)]
5. while((not empty e) AND (not found e))
6.     e = e->next
7. return e
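A minimal Python sketch of the BB_fib_find() idea is given below. It is not the Bloom-BIRD source: the class `BloomFib`, the SHA-256 based `bloom_positions()` helper, the modulo bucket function and the choice to hash prefix and length together are all assumptions made for illustration. Only k = 3, the 18-bit starting BF order, the 16-bit table order and the 8-bit counters are taken from the paper. Counter saturation is not handled in this sketch.

```python
import hashlib

K = 3            # number of Bloom hash functions, as in the paper
BF_ORDER = 18    # 2**18 counters, the paper's starting BF order
FIB_ORDER = 16   # capped order of the main hash table


def bloom_positions(prefix, length):
    # Hypothetical stand-in for the bloom_hashi() functions:
    # K indexes derived from one digest by double hashing.
    digest = hashlib.sha256(f"{prefix}/{length}".encode()).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % (1 << BF_ORDER) for i in range(K)]


class BloomFib:
    def __init__(self):
        self.filter = bytearray(1 << BF_ORDER)   # 8-bit counters
        self.table = {}                          # bucket index -> chain
        self.chain_walks = 0                     # chain traversals performed

    def _bucket(self, prefix):
        return prefix % (1 << FIB_ORDER)         # stand-in for BIRD's hash

    def insert(self, prefix, length):
        self.table.setdefault(self._bucket(prefix), []).append((prefix, length))
        for pos in bloom_positions(prefix, length):
            self.filter[pos] += 1

    def find(self, prefix, length):
        # Lines 1-3 of BB_fib_find: any zero counter proves absence.
        for pos in bloom_positions(prefix, length):
            if self.filter[pos] == 0:
                return None
        # Lines 4-7: confirm in the main table (could be a false positive).
        self.chain_walks += 1
        for node in self.table.get(self._bucket(prefix), []):
            if node == (prefix, length):
                return node
        return None


fib = BloomFib()
for i in range(1000):
    fib.insert(i << 8, 24)                       # 1000 inserted /24 prefixes
found = sum(fib.find(i << 8, 24) is not None for i in range(1000))
walks_for_hits = fib.chain_walks
missing = sum(fib.find((i + 5000) << 8, 24) is None for i in range(1000))
print(found, missing, fib.chain_walks - walks_for_hits)
```

The last printed number is how many of the 1000 negative lookups still had to walk a chain, i.e., the false positives. Every inserted prefix is found, because the filter has no false negatives.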

BIRD uses a very simple longest prefix matching (LPM) mechanism that starts from a given length and decrements it until the longest matching prefix is found; if none is found, it returns a NULL pointer. The pseudo-code is shown as the SB_fib_route() function. Since fib_route() uses fib_find() as its main function to determine the existence of a prefix, the BF helps fib_route() very effectively: the BF has no false negatives, and it removes the need to walk the main hash chains for lengths that cannot be found. Therefore, nothing needs to change in the fib_route() function itself. Although there are better solutions, such as the binary searches on prefix values and lengths mentioned in the related works, BIRD's LPM algorithm is left unchanged in order to show the performance of the BF over standard BIRD.

SB_fib_route(fib, prefix, length)
1. while (length ≥ 0)
2.     if(fib_find(fib, prefix, length))
3.         return found node
4.     else
5.         length = length - 1
6. return NULL
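The length-decrementing LPM loop can be sketched in Python as follows. The FIB is modelled as a plain set of (prefix, length) pairs, and the masking of the address to the current length before each lookup is an assumption of this sketch, since the pseudo-code above does not show that step.

```python
import ipaddress


def to_int(dotted):
    """Convert a dotted IPv4 string to a 32-bit integer."""
    return int(ipaddress.IPv4Address(dotted))


def mask(address, length):
    """Keep the top `length` bits of a 32-bit IPv4 address."""
    if length == 0:
        return 0
    return address & (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF


def fib_find(fib, prefix, length):
    # Toy exact-match lookup on a set of (prefix, length) pairs.
    return (prefix, length) if (prefix, length) in fib else None


def fib_route(fib, address, length=32):
    # Try the given length first, then ever shorter prefixes.
    while length >= 0:
        node = fib_find(fib, mask(address, length), length)
        if node is not None:
            return node
        length -= 1
    return None


fib = {(to_int("10.0.0.0"), 8), (to_int("10.1.0.0"), 16)}
print(fib_route(fib, to_int("10.1.2.3")))    # matches 10.1.0.0/16
print(fib_route(fib, to_int("10.9.9.9")))    # falls back to 10.0.0.0/8
print(fib_route(fib, to_int("192.0.2.1")))   # no match: None
```

For the second query, lengths 32 down to 9 all miss before the /8 entry matches. Each of those misses is a negative fib_find() call, which is where a Bloom filter in front of the hash table saves chain traversals.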

The last important function is fib_get(), which searches for a given prefix and length and, if the prefix does not exist, adds it to the FIB. The pseudo-code of this function is presented as the SB_fib_get() function. As the first line shows, the function uses fib_find() as its search mechanism. If the node is found, its pointer is returned. Otherwise the node is inserted into the selected bucket of the hash table.

SB_fib_get(fib, prefix, length)
1. if(fib_find(fib, prefix, length))
2.     return found node
3. else
4.     Go down through hash chains
5.     And add new node

The BF can also help this function through fib_find() when a node does not exist. In the first line, the BF is therefore checked before the main hash table, and when a node has not been inserted before, the BF counters must be incremented. Only one change to the function is needed. The modified function is shown as BB_fib_get().

BB_fib_get(fib, prefix, length)
1. if(fib_find(fib, prefix, length))
2.     return found node
3. else
4.     Go down through hash chains
5.     And add new node
6.     for(i=1 to k)
7.         filter[bloom_hashi(prefix)] += 1

The extra lines 6 and 7 of BB_fib_get() update the BF array for the newly added node. This update adds negligible overhead because the hash functions are optimized using bit-wise shifts. Two types of hash functions are used in Bloom-BIRD. The first type is much like BIRD's original hash function, which returns a 16-bit hash of the prefix value, but is optimized using bit-wise shifts. The second type is used for the BF and has far fewer collisions than BIRD's original hash function. Hash functions of the second type return a hash whose bit size depends on the size of the BF array. The number of BF hash functions is given by the k parameter of the Bloom filter. The optimal value of k can be calculated using the following equation:

$$k = \frac{m}{n} \ln 2 \qquad (1)$$

where m is the number of bits in the BF array and n is the number of elements that can be inserted. In Bloom-BIRD, k is constant and is set to 3, because the loop that checks the k hash functions becomes a bottleneck for larger k. The equation is therefore used to calculate the size of the BF array in Bloom-BIRD.

B. Scenario
To evaluate standard BIRD and Bloom-BIRD, three real IPv4 prefix sets are gathered from [14] for the years 2008, 2010 and 2013. They are sorted by date, and the latest one contains more than 482 thousand unique IPv4 prefixes, as Table 1 shows. The two versions of BIRD, i.e., standard BIRD and Bloom-BIRD, are evaluated by inserting these real prefix sets and querying them. Prefix sets 4 and 5 are generated manually and contain all possible prefixes of length 24 whose first octet is in the range 1-19 and 20-39, respectively. Prefix set 6 is the concatenation of prefix sets 4 and 5; it is used to test searching FIBs with an even bigger prefix set and to confirm the results. Compared with the three real prefix sets (i.e., prefix sets 1-3), about 99% of the prefixes in these last three sets are missing (not existing) in the FIBs, which shows the performance of the BF when most queries return a negative answer. These last three prefix sets are not real prefixes, so they are used only for searching and never for insertion into FIBs. The percentage of missing nodes when each prefix set is searched is presented in Table 2. For example, when all prefixes in prefix set 2 are inserted into a FIB and all prefixes in prefix set 1 are queried afterwards, 24.53% of the searches return a negative answer. The results of the evaluation are discussed in the following subsection.

Table 1. Prefix sets used to test the two versions of BIRD. Prefix sets 1-3 are real dumps from routers and prefix sets 4-6 are manually generated.

Prefix set alias    # of nodes
Prefix1             262,039
Prefix2             351,645
Prefix3             482,500
Prefix4             1,179,648
Prefix5             1,310,720
Prefix6             2,490,368

Table 2. Percentage of missing nodes when searching for prefix sets. Zero (0) values occur where the searched prefix set is the inserted one (no missing nodes).

                       Inserted prefix set
Searched prefix set    Prefix1    Prefix2    Prefix3
Prefix1                0          24.53%     36.27%
Prefix2                43.65%     0          22.92%
Prefix3                65.34%     43.82%     0
Prefix4                99.8%      99.77%     99.56%
Prefix5                99.86%     99.77%     99.39%
Prefix6                99.83%     99.77%     99.47%

C. Results of Bloom-BIRD and Discussion
As discussed in the previous subsection, the three prefix sets 1-3 are each inserted into a FIB, in three separate runs, in both standard BIRD and Bloom-BIRD, and all prefix sets 1-6 are queried afterwards. The percentage speedups of Bloom-BIRD over standard BIRD and the FP error rates are presented in Table 3 in three main columns. The first main column gives the speedups of the fib_find() function, which performs a simple search for a given prefix and length. The second main column gives the speedups of the fib_route() function, which performs Longest Prefix Matching (LPM); the starting length for LPM is set to 32 for all searches. The last main column gives the FP error rate as a percentage. The six rows correspond to the prefix set that is searched. The smallest speedup is 4% and the biggest is 93%. Smaller speedups are obtained when most prefixes are found by the search (i.e., existing nodes outnumber missing nodes), and bigger speedups are obtained when most prefixes are not found. The FP error rate is at most about 20%, which is an acceptable rate.

Table 3. Results of the Bloom-BIRD (1.3.10) and standard BIRD (1.3.10) comparison. (*) means the searched prefix set is the inserted one; these are also the worst cases, since all searched prefixes exist.

fib_find() - simple search: speedup of Bloom-BIRD over BIRD
                       Inserted prefix set
Searched prefix set    Prefix1    Prefix2    Prefix3
Prefix1                (*) 11%    37%        50%
Prefix2                54%        (*) 19%    42%
Prefix3                56%        59%        (*) 22%
Prefix4                77%        93%        91%
Prefix5                77%        93%        90%
Prefix6                77%        92%        91%

fib_route() - LPM search: speedup of Bloom-BIRD over BIRD
                       Inserted prefix set
Searched prefix set    Prefix1    Prefix2    Prefix3
Prefix1                (*) 4%     20%        32%
Prefix2                28%        (*) 7%     29%
Prefix3                38%        36%        (*) 14%
Prefix4                55%        74%        76%
Prefix5                52%        71%        74%
Prefix6                54%        72%        75%

False positive percentage of Bloom-BIRD
                       Inserted prefix set
Searched prefix set    Prefix1    Prefix2    Prefix3
Prefix1                (*) 0      17.1%      18.54%
Prefix2                16.44%     (*) 0      20.83%
Prefix3                14.4%      9.88%      (*) 0
Prefix4                8.86%      0.46%      1.11%
Prefix5                8.63%      0.6%       1.67%
Prefix6                8.74%      0.54%      1.4%

Based on the experiments, for a small number of prefixes the BF is only a memory overhead for BIRD, i.e., no valuable speedup is gained. The BF feature of Bloom-BIRD therefore remains deactivated until the main hash table reaches the 16-bit order. The BF array is then allocated and initialized to zero. As described in the implementation subsection, the BF follows the main hash table in how it changes its size and order: it starts at 18 bits, grows to 20 bits when the capacity limit is reached, and continues to expand up to 32 bits. The results in Table 3 show how this scaling feature accelerates Bloom-BIRD when prefix set 2 is inserted, compared with when prefix set 1 is inserted. Since the number of prefixes in prefix set 2 exceeds the capacity of the default BF order of Bloom-BIRD (i.e., 18 bits), the order of the BF array is scaled up to 20 bits, and consequently the FP rate is lower than when prefix set 1 is inserted. This also shows how important the FP error rate is: a lower rate makes the searches faster.

Table 3 also shows that the speedups of fib_route() are lower than those of fib_find() in the same situation. The reason is that fib_find() tries just once for the given prefix and length, whereas fib_route() tries up to W times in the worst case, where W is 32 for IPv4 and 128 for IPv6. Therefore, fib_route() finds a best match in most cases.

For memory usage, the number of counters in the BF array can be calculated using Equation (1). Although 4-bit counters are generally used in a CBF, the Bloom-BIRD FIBs use 8-bit counters because of their simplicity and the lower overhead of increment operations on the byte data type on a PC. Since k is constant and set to 3 and the default number of inputs is 18-bit (i.e., $n = 2^{18}$), the memory requirement of Bloom-BIRD at the 18-bit capacity is 1.08 MB. When the capacity limit is reached, the order is incremented by 2 to a 20-bit capacity, which requires 4.33 MB of memory. This expansion continues up to 32 bits, and the memory requirement can be calculated using Equation (1) in each case.
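The memory figures quoted above can be reproduced with a short Python calculation that solves Equation (1) for m. The function name `cbf_memory_bytes` is hypothetical, and reading "MB" as $2^{20}$ bytes is an assumption; under that reading the 1.08 MB and 4.33 MB values come out as stated.

```python
import math


def cbf_memory_bytes(order, k=3, counter_bits=8):
    """Bytes needed for a CBF sized for n = 2**order elements via Eq. (1)."""
    n = 1 << order
    m = math.ceil(k * n / math.log(2))    # Eq. (1) solved for m = k*n / ln 2
    return m * counter_bits // 8          # one 8-bit counter per position


for order in (18, 20):
    print(order, round(cbf_memory_bytes(order) / 2**20, 2))
```

With k = 3 this gives roughly 4.33 counters per element, so each 2-bit increase of the order multiplies the memory requirement by four.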


V. CONCLUSION

This paper presented another application of the BF, this time on a practical open-source router. The BF implementation on BIRD's FIB data structure showed that it helps BIRD search and route faster when the number of prefixes inserted into a FIB becomes huge. Bloom-BIRD, the version with the Bloom filter implementation, was evaluated using prefix sets gathered from real router dumps as well as manually generated prefix sets, to make the tests more accurate. Bloom-BIRD employs a BF in BIRD's FIB data structure to accelerate the router when the FIB's linked-list chains become long. The comparison using different prefix sets showed that up to 93% speedup is gained when a search returns a negative answer. It was also shown how Bloom-BIRD keeps its FP error rate under control as the number of inserted prefixes increases. Because of its capability to adapt its scale, the solution can also be employed for IPv6 addresses.

REFERENCES

[1] S. Han, K. Jang, K. Park, and S. Moon, "PacketShader: a GPU-accelerated software router," ACM SIGCOMM Computer Communication Review, vol. 41, pp. 195-206, 2010.
[2] BIRD Internet Routing Daemon. Available: http://bird.network.cz
[3] Quagga Routing Suite. Available: http://www.nongnu.org/quagga/
[4] M. Handley, O. Hodson, and E. Kohler, "XORP: an open platform for network research," ACM SIGCOMM Computer Communication Review, vol. 33, pp. 53-57, 2003.
[5] B. H. Bloom, "Space/time trade-offs in hash coding with allowable errors," Communications of the ACM, vol. 13, pp. 422-426, 1970.
[6] S. Tarkoma, C. E. Rothenberg, and E. Lagerspetz, "Theory and practice of bloom filters for distributed systems," IEEE Communications Surveys & Tutorials, vol. 14, pp. 131-155, 2012.
[7] A. Broder and M. Mitzenmacher, "Network Applications of Bloom Filters: A Survey," Internet Mathematics, vol. 1, pp. 636-646, 2002.
[8] M. Ahmadi and S. Wong, "Modified collision packet classification using counting Bloom filter in tuple space," presented at the Proceedings of the 25th IASTED International Multi-Conference: Parallel and Distributed Computing and Networks, Innsbruck, Austria, 2007.
[9] H. Lim and N. Lee, "Survey and proposal on binary search algorithms for longest prefix match," IEEE Communications Surveys & Tutorials, vol. 14, pp. 681-697, 2012.
[10] M. Yu, A. Fabrikant, and J. Rexford, "BUFFALO: bloom filter forwarding architecture for large organizations," presented at the Proceedings of the 5th International Conference on Emerging Networking Experiments and Technologies, Rome, Italy, 2009.
[11] Ostinato, Packet/Traffic Generator and Analyzer. Available: http://code.google.com/p/ostinato
[12] N. Nethercote and J. Seward, "Valgrind: a framework for heavyweight dynamic binary instrumentation," ACM SIGPLAN Notices, vol. 42, pp. 89-100, 2007.
[13] J. Bonwick, "The slab allocator: an object-caching kernel memory allocator," presented at the Proceedings of the USENIX Summer 1994 Technical Conference - Volume 1, Boston, Massachusetts, 1994.
[14] RouteViews BGP RIBs. December 2008, December 2010, July 2013. Available: http://routeviews.org/

[Fig. 1 artwork. Labels: ubuntu1, router1, router2, ubuntu2. Interface addresses shown: 192.168.1.1/24 and 192.168.1.2/24 on the ubuntu1 side, 20.20.20.1/30 and 20.20.20.2/30 between router1 and router2, 10.10.10.1/8 and 10.10.10.2/8 on the ubuntu2 side.]

[Fig. 4 artwork. fib_find() calls ipa_hash() to index the fib_node array of 2^order buckets (the FIB hash table); nodes node1 ... node n are chained in the buckets.]

[Fig. 5 artwork. fib_find() first applies the bloom_hashi() functions to the Bloom filter array; a False result ends the search, and a True result continues through ipa_hash() into the fib_node array of 2^order buckets (the FIB hash table) and its node chains.]