IMTS MCA (Data Structures & Algorithms Using C, C++)


Institute of Management & Technical Studies

DATA STRUCTURES & ALGORITHMS USING C, C++

500

Master in Computer Application www.imtsinstitute.com


IMTS (ISO 9001-2008 Internationally Certified)

DATA STRUCTURES & ALGORITHMS USING C, C++

FOR MORE DETAILS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621


Block 1 Algorithm and Data Structure Fundamentals

Unit 1- Overview of Data Structure and Algorithm: Overview of Data Structure, Data Structures & C++, Types of Data Structures, Static vs. Dynamic Data Structures, Introduction to Algorithm, The Running Time of a Program, Measuring the Running Time of a Program, Big-Oh and Big-Omega Notation, The domination of Growth Rate, Calculating the Running Time of a Program.

Unit 2- Data handling in C++: Data Types, Data Structures and Abstract Data Types, Structured Types, Data Abstraction, and Classes, Basic Principles, Abstract Data Type, Categories of ADT Operations, C++ Classes, Information Hiding.

Unit 3- Data Structure introduction: Abstract Data Type Implementation, An Example: Collections, Constructors and destructors, Data Structure, Methods, Pre- and post-conditions, Error Handling, Defining Errors, Processing errors, Data Structures Examples, Arrays, Linked lists, List variants, Stacks.

Unit 4- Searching Techniques: Searching Basics, Sequential Searches, Binary Search, Improvements in Searching, Hashing.



Block 2 Linear Data Structure

Unit 1- Sorting Techniques and Recursion: Introduction to Sorting, Insertion Sort, Bubble Sort, Selection Sort, Shell Sort, QuickSort, Merge Sort, Heap Sort, Recursion, Recursive functions, Example: Factorial, Fibonacci Numbers, Recursively Defined Lists, Analysis of Sorting.

Unit 2- Arrays and Pointer Handling: Introduction, Arrays, Multidimensional Arrays, Pointers, Function Pointers, References, Typedefs.

Unit 3- Linked List I: Linked List Basics, Array Review, Pointer Review, Implementation: Template, Other Types of Lists, Implementation of Singly linked list, Implementation of Doubly linked list.

Unit 4- Linked List II: Introduction, The Empty List NULL, Linked List Types: Node and Pointer, Memory Drawings, List Building, About C++, Code Techniques, Examples, Representation of Polynomial addition using linked list.



Block 3 Advanced Data Structures

Unit 1- Stacks: Introduction to Stacks, Programmatic representation of a stack, Implementation of Stacks, Application of Stacks, Postfix Expression, Convert to Polish Notation (Infix Expression), Converting infix expression to postfix notation using Stack.

Unit 2- Queues: Introduction, Programmatic representation of a Queue, Deque, Application of Queues, Priority Queues, C++ Implementation of Queues, Implementation of Double ended queue (Dequeue).

Unit 3- Binary Trees: Introduction, Binary Trees, Analysis Complete Trees, General binary trees, Unbalanced Trees, C++ implementation of Binary Tree.

Unit 4- Binary Search Trees: Introduction, Implementation of Binary Search Tree, Implementation of an Expression tree to perform tree traversals.



Block 4 Special Trees and Graphs

Unit 1- Heaps: Heaps Introduction, Heaps, Implementation, Heap Sort Algorithm Analysis.

Unit 2- Height Balance Trees: Red-Black Trees, Red-Black Tree Operation, AVL Trees, Implementation of insertion operation in AVL tree.

Unit 3- Multi way Trees: Multi-way Trees, B-Trees, Insertion into a B-Tree, Deletion from a B-Tree, B+tree and its Algorithm, B+-Tree structure, Search operation of B+-Tree, Update on B+-Tree, Insertion on B+-Tree, Deletion on B+-Tree, Adding Records to a B+ Tree, Rotation, Deleting Keys from a B+ tree.


Unit 1 Overview of Data Structure and Algorithm

Structure

1.0 Unit Objective
1.1 Overview of Data Structure
1.1.1 Data Structures & C++
1.1.2 Types of Data Structures
1.1.3 Static vs. Dynamic Data Structures
1.2 Introduction to Algorithm
1.3 The Running Time of a Program
1.3.1 Measuring the Running Time of a Program
1.3.2 Big-Oh and Big-Omega Notation
1.4 The domination of Growth Rate
1.5 Calculating the Running Time of a Program
1.6 Check your progress



1.0 Unit Objective

After going through this unit, you should be able to understand:

Data structure definition

Static and Dynamic data structure

Basics of Algorithm

Running Time of Algorithm

Asymptotic notations for algorithm

Calculation of Running Time of Algorithm

1.1 Overview of Data Structure

The subject of data structures is the study of data and algorithms. It encompasses the following areas:

Machines that hold data and execute algorithms

Languages for describing data manipulation and algorithms

Foundations of algorithms

Structures for representing data

Analysis of algorithms

A data type is a well-defined collection of data with a well-defined set of operations on it. A data structure is an actual implementation of a particular abstract data type. Abstraction can be thought of as a mechanism for suppressing irrelevant details while at the same time emphasizing relevant ones.

1.1.1 Data Structures & C++

Data structures are not specific to any language. The subject is more about writing efficient algorithms, which can be implemented in any language. For our purposes, we have chosen to implement all the algorithms in C++.

1.1.2 Types of Data Structures

There are several data structures that are available and the choice of an appropriate one depends on the requirement. Some of the data structures that are commonly used include: 

Arrays

Linked Lists

Stacks

Queues

Trees

Graphs


1.1.3 Static vs. Dynamic Data Structures

A static data structure has a fixed size. This meaning is different from the meaning of the static modifier. Arrays are static: once you define the number of elements an array can hold, that number doesn't change. A dynamic data structure grows and shrinks at execution time as required by its contents. Dynamic data structures are commonly implemented using linked lists.
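The contrast can be sketched in C++ as follows. This is a minimal illustration, not part of the course text: the names `kCapacity`, `staticSize` and `dynamicSizeAfter` are hypothetical, and the standard `std::array` and `std::forward_list` containers stand in for a fixed-size array and a singly linked list.

```cpp
#include <array>
#include <cstddef>
#include <forward_list>
#include <iterator>

// Static structure: the capacity is part of the type and never changes.
constexpr std::size_t kCapacity = 5;   // illustrative size, not from the text

std::size_t staticSize() {
    std::array<int, kCapacity> fixed{};  // always holds exactly kCapacity ints
    return fixed.size();
}

// Dynamic structure: a singly linked list that grows and shrinks at run time.
std::size_t dynamicSizeAfter(int pushes, int pops) {
    std::forward_list<int> list;
    for (int i = 0; i < pushes; ++i) {
        list.push_front(i);                       // grows one node at a time
    }
    for (int i = 0; i < pops && !list.empty(); ++i) {
        list.pop_front();                         // shrinks one node at a time
    }
    return static_cast<std::size_t>(std::distance(list.begin(), list.end()));
}
```

Calling `dynamicSizeAfter(3, 1)` leaves two nodes in the list, whereas `staticSize()` reports the same capacity no matter how the array is used.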

1.2 Introduction to Algorithm

An algorithm is a finite set of instructions that accomplishes a particular task. Every algorithm must satisfy the following criteria:

Input: zero or more quantities are externally supplied.

Output: at least one quantity is produced.

Definiteness: each instruction must be clear and unambiguous.

Finiteness: if we trace out the instructions of an algorithm, then for all cases the algorithm will terminate after a finite number of steps.

Effectiveness: every instruction must be sufficiently basic that a person using only pencil and paper can in principle carry it out. It is not enough that each operation be definite; it must also be feasible.

The logical or mathematical model of organization of data is called a data structure. A data structure describes not just a set of data but also how they are related. Data Structures and algorithms should be thought of as a unit, neither one makes sense without the other.

Once we have a suitable mathematical model for our problem, we can attempt to find a solution in terms of that model. Our initial goal is to find a solution in the form of an algorithm, which is a finite sequence of instructions, each of which has a clear meaning and can be performed with a finite amount of effort in a finite length of time. An integer assignment statement such as x := y + z is an example of an instruction that can be executed in a finite amount of effort. In an algorithm instructions can be executed any number of times, provided the instructions themselves indicate the repetition. However, we require that, no matter what the input values may be, an algorithm terminate after executing a finite number of instructions. Thus, a program is an algorithm as long as it never enters an infinite loop on any input.

There is one aspect of this definition of an algorithm that needs some clarification. We said each instruction of an algorithm must have a "clear meaning" and must be executable with a "finite amount of effort." Now what is clear to one person may not be clear to another, and it is often difficult to prove rigorously that an instruction can be carried out in a finite amount of time. It is often difficult as well to prove that on any input, a sequence of instructions terminates, even if we understand clearly what each instruction means. By argument and counterargument, however, agreement can usually be reached as to whether a sequence of instructions constitutes an algorithm. The burden of proof lies with the person claiming to have an algorithm. In Section 1.5 we discuss how to estimate the running time of common programming language constructs that can be shown to require a finite amount of time for their execution.

Example 1.1. A mathematical model can be used to help design a traffic light for a complicated intersection of roads. To construct the pattern of lights, we shall create a program that takes as input a set of permitted turns at an intersection (continuing straight on a road is a "turn") and partitions this set into as few groups as possible such that all turns in a group are simultaneously permissible without collisions. We shall then associate a phase of the traffic light with each group in the partition. By finding a partition with the smallest number of groups, we can construct a traffic light with the smallest number of phases. We can model this problem with a mathematical structure known as a graph. A graph consists of a set of points called vertices, and lines connecting the points, called edges. For the traffic intersection problem we can draw a graph whose vertices represent turns and whose edges connect pairs of vertices whose turns cannot be performed simultaneously. For the intersection of Fig. 1.1 we can have another representation of this graph as a table with a 1 in row i and column j whenever there is an edge between vertices i and j.

The graph can aid us in solving the traffic light design problem. A coloring of a graph is an assignment of a color to each vertex of the graph so that no two vertices connected by an edge have the same color. It is not hard to see that our problem is one of coloring the graph of incompatible turns using as few colors as possible.

The problem of coloring graphs has been studied for many decades, and the theory of algorithms tells us a lot about this problem. Unfortunately, coloring an arbitrary graph with as few colors as possible is one of a large class of problems called "NP-complete problems," for which all known solutions are essentially of the type "try all possibilities." In the case of the coloring problem, "try all possibilities" means to try all assignments of colors to vertices using at first one color, then two colors, then three, and so on, until a legal coloring is found. With care, we can be a little speedier than this, but it is generally believed that no algorithm to solve this problem can be substantially more efficient than this most obvious approach.

One reasonable heuristic for graph coloring is the following "greedy" algorithm. Initially we try to color as many vertices as possible with the first color, then as many as possible of the uncolored vertices with the second color, and so on. To color vertices with a new color, we perform the following steps.

1. Select some uncolored vertex and color it with the new color.

2. Scan the list of uncolored vertices. For each uncolored vertex, determine whether it has an edge to any vertex already colored with the new color. If there is no such edge, color the present vertex with the new color.

This approach is called "greedy" because it colors a vertex whenever it can, without considering the potential drawbacks inherent in making such a move. There are situations where we could color more vertices with one color if we were less "greedy" and skipped some vertex we could legally color.
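The two steps of the greedy heuristic can be sketched in C++ as below. This is only one possible rendering under stated assumptions: the graph is given as an adjacency matrix, vertices are scanned in index order, and the names `Graph` and `greedyColoring` are hypothetical. Step 1 falls out of the scan, because the first uncolored vertex never clashes with a color that has not been used yet.

```cpp
#include <cstddef>
#include <vector>

// Adjacency matrix: adj[i][j] is true when vertices i and j share an edge.
using Graph = std::vector<std::vector<bool>>;

// Returns color[v] for every vertex v (colors are numbered 0, 1, 2, ...).
std::vector<int> greedyColoring(const Graph& adj) {
    const std::size_t n = adj.size();
    std::vector<int> color(n, -1);            // -1 means "not yet colored"
    std::size_t remaining = n;
    for (int c = 0; remaining > 0; ++c) {     // one pass per new color
        for (std::size_t v = 0; v < n; ++v) {
            if (color[v] != -1) continue;     // skip vertices already colored
            bool clash = false;               // edge to a vertex of color c?
            for (std::size_t w = 0; w < n; ++w) {
                if (adj[v][w] && color[w] == c) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {                     // no such edge: use the new color
                color[v] = c;
                --remaining;
            }
        }
    }
    return color;
}
```

On a path of four vertices 0-1-2-3 this sketch assigns the first color to vertices 0 and 2 and the second color to vertices 1 and 3. As the text notes, the result is not guaranteed to use the fewest colors on every graph.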

1.3 The Running Time of a Program

When solving a problem we are frequently faced with a choice among algorithms. On what basis should we choose? There are two often contradictory goals.

1. We would like an algorithm that is easy to understand, code, and debug.

2. We would like an algorithm that makes efficient use of the computer's resources, especially one that runs as fast as possible.

When we are writing a program to be used once or a few times, goal (1) is most important. The cost of the programmer's time will most likely exceed by far the cost of running the program, so the cost to optimize is the cost of writing the program. When presented with a problem whose solution is to be used many times, the cost of running the program may far exceed the cost of writing it, especially, if many of the program runs are given large amounts of input. Then it is financially sound to implement a fairly complicated algorithm, provided that the resulting program will run significantly faster than a more obvious program. Even in these situations it may be wise first to implement a simple algorithm, to determine the actual benefit to be had by writing a more complicated program. In building a complex system it is often desirable to implement a simple prototype on which measurements and simulations can be performed, before committing oneself to the final design. It follows that programmers must not only be aware of ways of making programs run fast, but must know when to apply these techniques and when not to bother.

1.3.1 Measuring the Running Time of a Program

The running time of a program depends on factors such as:

1. the input to the program,

2. the quality of code generated by the compiler used to create the object program,

3. the nature and speed of the instructions on the machine used to execute the program, and

4. the time complexity of the algorithm underlying the program.

The fact that running time depends on the input tells us that the running time of a program should be defined as a function of the input. Often, the running time depends not on the exact input but only on the "size" of the input. A good example is the process known as sorting. In a sorting problem, we are given as input a list of items to be sorted, and we are to produce as output the same items, but smallest (or largest) first. For example, given 2, 1, 3, 1, 5, 8 as input we might wish to produce 1, 1, 2, 3, 5, 8 as output. The latter list is said to be sorted smallest first. The natural size measure for inputs to a sorting program is the number of items to be sorted, or in other words, the length of the input list. In general, the length of the input is an appropriate size measure, and we shall assume that measure of size unless we specifically state otherwise.

It is customary, then, to talk of $T(n)$, the running time of a program on inputs of size $n$. For example, some program may have a running time $T(n) = cn^2$, where $c$ is a constant. The units of $T(n)$ will be left unspecified, but we can think of $T(n)$ as being the number of instructions executed on an idealized computer. For many programs, the running time is really a function of the particular input, and not just of the input size. In that case we define $T(n)$ to be the worst case running time, that is, the maximum, over all inputs of size $n$, of the running time on that input. We also consider $T_{avg}(n)$, the average, over all inputs of size $n$, of the running time on that input. While $T_{avg}(n)$ appears a fairer measure, it is often fallacious to assume that all inputs are equally likely. In practice, the average running time is often much harder to determine than the worst-case running time, both because the analysis becomes mathematically intractable and because the notion of "average" input frequently has no obvious meaning. Thus, we shall use worst-case running time as the principal measure of time complexity, although we shall mention average-case complexity wherever we can do so meaningfully.

Now let us consider remarks (2) and (3) above: that the running time of a program depends on the compiler used to compile the program and the machine used to execute it. These facts imply that we cannot express the running time $T(n)$ in standard time units such as seconds. Rather, we can only make remarks like "the running time of such-and-such an algorithm is proportional to $n^2$." The constant of proportionality will remain unspecified since it depends so heavily on the compiler, the machine, and other factors.
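To make the difference between "running time on a particular input" and "worst-case running time" concrete, the sketch below counts comparisons in a sequential search. The function name `sequentialSearch` and the choice of comparisons as the unit of cost are assumptions made for illustration; the text itself leaves the units of $T(n)$ unspecified.

```cpp
#include <cstddef>
#include <vector>

// Sequential search that also reports how many comparisons it made.
// Returns the index of key, or items.size() if key is absent.
std::size_t sequentialSearch(const std::vector<int>& items, int key,
                             std::size_t& comparisons) {
    comparisons = 0;
    for (std::size_t i = 0; i < items.size(); ++i) {
        ++comparisons;                 // one comparison per element examined
        if (items[i] == key) {
            return i;                  // best case: found early
        }
    }
    return items.size();               // worst case: every element examined
}
```

On the list 2, 1, 3, 1, 5, 8 a search for 2 costs one comparison, while a search for 8 or for an absent key costs six. Under this cost model the worst case over all inputs of size $n$ is $n$ comparisons, so $T(n) = n$.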



1.3.2 Big-Oh and Big-Omega Notation

To talk about growth rates of functions we use what is known as "big-oh" notation. For example, when we say the running time $T(n)$ of some program is $O(n^2)$, read "big oh of n squared" or just "oh of n squared," we mean that there are positive constants $c$ and $n_0$ such that for $n$ equal to or greater than $n_0$, we have $T(n) \le cn^2$.

Example 1.2. Suppose $T(0) = 1$, $T(1) = 4$, and in general $T(n) = (n+1)^2$. Then we see that $T(n)$ is $O(n^2)$, as we may let $n_0 = 1$ and $c = 4$. That is, for $n \ge 1$, we have $(n+1)^2 \le 4n^2$, as the reader may prove easily. Note that we cannot let $n_0 = 0$, because $T(0) = 1$ is not less than $c \cdot 0^2 = 0$ for any constant $c$.
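One way to carry out the proof left to the reader in Example 1.2 is sketched below. It relies only on the facts that $n \le n^2$ and $1 \le n^2$ whenever $n \ge 1$.

```latex
\begin{align*}
(n+1)^2 &= n^2 + 2n + 1 \\
        &\le n^2 + 2n^2 + n^2
           && \text{since } n \le n^2 \text{ and } 1 \le n^2 \text{ for } n \ge 1 \\
        &= 4n^2 .
\end{align*}
```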

In what follows, we assume all running-time functions are defined on the nonnegative integers, and their values are always nonnegative, although not necessarily integers. We say that $T(n)$ is $O(f(n))$ if there are constants $c$ and $n_0$ such that $T(n) \le cf(n)$ whenever $n \ge n_0$. A program whose running time is $O(f(n))$ is said to have growth rate $f(n)$.

Example 1.3. The function $T(n) = 3n^3 + 2n^2$ is $O(n^3)$. To see this, let $n_0 = 0$ and $c = 5$. Then, the reader may show that for $n \ge 0$, $3n^3 + 2n^2 \le 5n^3$. We could also say that this $T(n)$ is $O(n^4)$, but this would be a weaker statement than saying it is $O(n^3)$.

As another example, let us prove that the function $3^n$ is not $O(2^n)$. Suppose that there were constants $n_0$ and $c$ such that for all $n \ge n_0$, we had $3^n \le c2^n$. Then $c \ge (3/2)^n$ for any $n \ge n_0$. But $(3/2)^n$ gets arbitrarily large as $n$ gets large, so no constant $c$ can exceed $(3/2)^n$ for all $n$.

When we say $T(n)$ is $O(f(n))$, we know that $f(n)$ is an upper bound on the growth rate of $T(n)$. To specify a lower bound on the growth rate of $T(n)$ we can use the notation $T(n)$ is $\Omega(g(n))$, read "big omega of g(n)" or just "omega of g(n)," to mean that there exists a positive constant $c$ such that $T(n) \ge cg(n)$ infinitely often (for an infinite number of values of $n$).

Example 1.4. To verify that the function $T(n) = n^3 + 2n^2$ is $\Omega(n^3)$, let $c = 1$. Then $T(n) \ge cn^3$ for $n = 0, 1, \ldots$.

For another example, let $T(n) = n$ for odd $n \ge 1$ and $T(n) = n^2/100$ for even $n \ge 0$. To verify that $T(n)$ is $\Omega(n^2)$, let $c = 1/100$ and consider the infinite set of $n$'s: $n = 0, 2, 4, 6, \ldots$.

1.4 The domination of Growth Rate

We shall assume that programs can be evaluated by comparing their running-time functions, with constants of proportionality neglected. Under this assumption a program with running time $O(n^2)$ is better than one with running time $O(n^3)$, for example. Besides constant factors due to the compiler and machine, however, there is a constant factor due to the nature of the program itself. It is possible, for example, that with a particular compiler-machine combination, the first program takes $100n^2$ milliseconds, while the second takes $5n^3$ milliseconds. Might not the $5n^3$ program be better than the $100n^2$ program?

The answer to this question depends on the sizes of inputs the programs are expected to process. For inputs of size $n < 20$, the program with running time $5n^3$ will be faster than the one with running time $100n^2$. Therefore, if the program is to be run mainly on inputs of small size, we would indeed prefer the program whose running time was $O(n^3)$. However, as $n$ gets large, the ratio of the running times, which is $5n^3/100n^2 = n/20$, gets arbitrarily large. Thus, as the size of the input increases, the $O(n^3)$ program will take significantly more time than the $O(n^2)$ program. If there are even a few large inputs in the mix of problems these two programs are designed to solve, we can be much better off with the program whose running time has the lower growth rate.
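The crossover point between the two running times can be checked with a few lines of C++. The functions below simply evaluate the two formulas from the text; the names `quadraticMs`, `cubicMs` and `crossover` are hypothetical.

```cpp
// Running times from the text, in milliseconds, for an input of size n.
long long quadraticMs(long long n) { return 100 * n * n; }     // 100n^2
long long cubicMs(long long n)     { return 5 * n * n * n; }   // 5n^3

// Smallest n >= 1 at which the cubic program stops being strictly faster.
long long crossover() {
    long long n = 1;
    while (cubicMs(n) < quadraticMs(n)) {
        ++n;
    }
    return n;
}
```

Here `crossover()` returns 20, matching the bound $n < 20$ given above. At $n = 100$ the cubic program is already $100/20 = 5$ times slower than the quadratic one.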

Another reason for at least considering programs whose growth rates are as low as possible is that the growth rate ultimately determines how big a problem we can solve on a computer. Put another way, as computers get faster, our desire to solve larger problems on them continues to increase. However, unless a program has a low growth rate such as O(n) or O(nlogn), a modest increase in computer speed makes very little difference in the size of the largest problem we can solve in a fixed amount of time.

We wish to re-emphasize that the growth rate of the worst case running time is not the sole, or necessarily even the most important, criterion for evaluating an algorithm or program. Let us review some conditions under which the running time of a program can be overlooked in favor of other issues.

1. If a program is to be used only a few times, then the cost of writing and debugging dominates the overall cost, so the actual running time rarely affects the total cost. In this case, choose the algorithm that is easiest to implement correctly.

2. If a program is to be run only on "small" inputs, the growth rate of the running time may be less important than the constant factor in the formula for running time. What is a "small" input depends on the exact running times of the competing algorithms. There are some algorithms, such as certain integer multiplication algorithms, that are asymptotically the most efficient known for their problem but are rarely used in practice, even on the largest problems, because the constant of proportionality is so large in comparison to other simpler, less "efficient" algorithms.

3. A complicated but efficient algorithm may not be desirable because a person other than the writer may have to maintain the program later. It is hoped that by making the principal techniques of efficient algorithm design widely known, more complex algorithms may be used freely, but we must consider the possibility of an entire program becoming useless because no one can understand its subtle but efficient algorithms.

4. There are a few examples where efficient algorithms use too much space to be implemented without using slow secondary storage, which may more than negate the efficiency.

5. In numerical algorithms, accuracy and stability are just as important as efficiency.

1.5 Calculating the Running Time of a Program

Determining, even to within a constant factor, the running time of an arbitrary program can be a complex mathematical problem. In practice, however, determining the running time of a program to within a constant factor is usually not that difficult; a few basic principles suffice. Before presenting these principles, it is important that we learn how to add and multiply in "big oh" notation.

Suppose that $T_1(n)$ and $T_2(n)$ are the running times of two program fragments $P_1$ and $P_2$, and that $T_1(n)$ is $O(f(n))$ and $T_2(n)$ is $O(g(n))$. Then $T_1(n) + T_2(n)$, the running time of $P_1$ followed by $P_2$, is $O(\max(f(n), g(n)))$. To see why, observe that for some constants $c_1$, $c_2$, $n_1$, and $n_2$, if $n \ge n_1$ then $T_1(n) \le c_1 f(n)$, and if $n \ge n_2$ then $T_2(n) \le c_2 g(n)$. Let $n_0 = \max(n_1, n_2)$. If $n \ge n_0$, then $T_1(n) + T_2(n) \le c_1 f(n) + c_2 g(n)$. From this we conclude that if $n \ge n_0$, then $T_1(n) + T_2(n) \le (c_1 + c_2)\max(f(n), g(n))$. Therefore, the combined running time $T_1(n) + T_2(n)$ is $O(\max(f(n), g(n)))$.

Example 1.5. The rule for sums given above can be used to calculate the running time of a sequence of program steps, where each step may be an arbitrary program fragment with loops and branches. Suppose that we have three steps whose running times are, respectively, $O(n^2)$, $O(n^3)$ and $O(n \log n)$. Then the running time of the first two steps executed sequentially is $O(\max(n^2, n^3))$ which is $O(n^3)$. The running time of all three together is $O(\max(n^3, n \log n))$ which is $O(n^3)$.

In general, the running time of a fixed sequence of steps is, to within a constant factor, the running time of the step with the largest running time. In rare circumstances there will be two or more steps whose running times are incommensurate (neither is larger than the other, nor are they equal). For example, we could have steps of running times $O(f(n))$ and $O(g(n))$, where $f(n) = n^4$ if $n$ is even and $n^2$ if $n$ is odd, while $g(n) = n^2$ if $n$ is even and $n^3$ if $n$ is odd. In such cases the sum rule must be applied directly; the running time is $O(\max(f(n), g(n)))$, that is, $n^4$ if $n$ is even and $n^3$ if $n$ is odd.

Another useful observation about the sum rule is that if $g(n) \le f(n)$ for all $n$ above some constant $n_0$, then $O(f(n) + g(n))$ is the same as $O(f(n))$. For example, $O(n^2 + n)$ is the same as $O(n^2)$.

The rule for products is the following. If $T_1(n)$ and $T_2(n)$ are $O(f(n))$ and $O(g(n))$, respectively, then $T_1(n)T_2(n)$ is $O(f(n)g(n))$. The reader should prove this fact using the same ideas as in the proof of the sum rule. It follows from the product rule that $O(cf(n))$ means the same thing as $O(f(n))$ if $c$ is any positive constant. For example, $O(n^2/2)$ is the same as $O(n^2)$.
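A possible proof of the product rule, following the same pattern as the sum rule, is sketched below. It assumes, as stated earlier, that running-time functions take nonnegative values, and it reuses the constants $c_1$, $c_2$, $n_1$, $n_2$ from the sum-rule argument.

```latex
\begin{align*}
&\text{Given: } T_1(n) \le c_1 f(n) \text{ for } n \ge n_1, \qquad
               T_2(n) \le c_2 g(n) \text{ for } n \ge n_2 . \\
&\text{Let } n_0 = \max(n_1, n_2). \text{ Then for all } n \ge n_0: \\
&\qquad T_1(n)\,T_2(n) \;\le\; c_1 f(n) \cdot c_2 g(n) \;=\; (c_1 c_2)\, f(n)\, g(n), \\
&\text{so } T_1(n)\,T_2(n) \text{ is } O(f(n)\,g(n)) \text{ with constant } c = c_1 c_2 .
\end{align*}
```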

Before proceeding to the general rules for analyzing the running times of programs, let us take a simple example to get an overview of the process.

Example 1.6. Consider the sorting program bubble shown below, which sorts an array of integers into increasing order. The net effect of each pass of the inner loop of statements (3)-(6) is to "bubble" the smallest element toward the front of the array.

    procedure bubble ( var A: array [1..n] of integer );
        { bubble sorts array A into increasing order }
        var
            i, j, temp: integer;
        begin
    (1)     for i := 1 to n-1 do
    (2)         for j := n downto i+1 do
    (3)             if A[j-1] > A[j] then begin
                        { swap A[j-1] and A[j] }
    (4)                 temp := A[j-1];
    (5)                 A[j-1] := A[j];
    (6)                 A[j] := temp
                    end
        end; { bubble }

The number n of elements to be sorted is the appropriate measure of input size. The first observation we make is that each assignment statement takes some constant amount of time, independent of the input size. That is to say, statements (4), (5) and (6) each take O(1) time. Note that O(1) is "big oh" notation for "some constant amount." By the sum rule, the combined running time of this group of statements is O(max(1, 1, 1)) = O(1).


Now we must take into account the conditional and looping statements. The if- and for-statements are nested within one another, so we may work from the inside out to get the running time of the conditional group and each loop. For the if-statement, testing the condition requires O(1) time. We don't know whether the body of the if-statement (lines (4)-(6)) will be executed. Since we are looking for the worst-case running time, we assume the worst and suppose that it will. Thus, the if-group of statements (3)-(6) takes O(1) time.

Proceeding outward, we come to the for-loop of lines (2)-(6). The general rule for a loop is that the running time is the sum, over each iteration of the loop, of the time spent executing the loop body for that iteration. We must, however, charge at least O(1) for each iteration to account for incrementing the index, for testing to see whether the limit has been reached, and for jumping back to the beginning of the loop. For lines (2)-(6) the loop body takes O(1) time for each iteration. The number of iterations of the loop is n-i, so by the product rule, the time spent in the loop of lines (2)-(6) is O((n-i) × 1), which is O(n-i).

Now let us progress to the outer loop, which contains all the executable statements of the program. Statement (1) is executed n - 1 times, so the total running time of the program is bounded above by some constant times

$$\sum_{i=1}^{n-1} (n - i) = \frac{n(n-1)}{2} = \frac{n^2}{2} - \frac{n}{2},$$

which is $O(n^2)$. The program, therefore, takes time proportional to the square of the number of items to be sorted. Other sorting algorithms achieve a running time of $O(n \log n)$, which is considerably smaller, since for large $n$, $\log n$ is very much smaller than $n$.
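Since the course implements its algorithms in C++, the Pascal procedure of Example 1.6 might be translated as in the sketch below. The translation uses 0-based indices and `std::vector`, and it returns the number of times the test of statement (3) is evaluated so that the count $n(n-1)/2$ can be checked directly. The function name `bubbleSort` and the comparison counter are additions made for illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort following the structure of the Pascal procedure, with 0-based
// indices. Returns how many times the test a[j-1] > a[j] was evaluated.
std::size_t bubbleSort(std::vector<int>& a) {
    std::size_t comparisons = 0;
    const std::size_t n = a.size();
    for (std::size_t i = 0; i + 1 < n; ++i) {         // statement (1)
        for (std::size_t j = n - 1; j > i; --j) {     // statement (2)
            ++comparisons;
            if (a[j - 1] > a[j]) {                    // statement (3)
                std::swap(a[j - 1], a[j]);            // statements (4)-(6)
            }
        }
    }
    return comparisons;
}
```

For the six-element list 2, 1, 3, 1, 5, 8 the function performs $6 \cdot 5 / 2 = 15$ comparisons regardless of the initial order, which is the quadratic growth derived above.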

Before proceeding to some general analysis rules, let us remember that determining a precise upper bound on the running time of programs is sometimes simple, but at other times it can be a deep intellectual challenge. There are no complete sets of rules for analyzing programs.

Now let us enumerate some general rules for the analysis of programs. In general, the running time of a statement or group of statements may be parameterized by the input size and/or by one or more variables. The only permissible parameter for the running time of the whole program is n, the input size.

1. The running time of each assignment, read, and write statement can usually be taken to be O(1). There are a few exceptions, such as in PL/I, where assignments can involve arbitrarily large arrays, and in any language that allows function calls in assignment statements.

2. The running time of a sequence of statements is determined by the sum rule. That is, the running time of the sequence is, to within a constant factor, the largest running time of any statement in the sequence.

3. The running time of an if-statement is the cost of the conditionally executed statements, plus the time for evaluating the condition. The time to evaluate the condition is normally O(1). The time for an if-then-else construct is the time to evaluate the condition plus the larger of the time needed for the statements executed when the condition is true and the time for the statements executed when the condition is false.

4. The time to execute a loop is the sum, over all times around the loop, of the time to execute the body and the time to evaluate the condition for termination (usually the latter is O(1)). Often this time is, neglecting constant factors, the product of the number of times around the loop and the largest possible time for one execution of the body, but we must consider each loop separately to make sure. The number of iterations around a loop is usually clear, but there are times when the number of iterations cannot be computed precisely. It could even be that the program is not an algorithm, and there is no limit to the number of times we go around certain loops.
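The if-then-else and loop rules can be tried out on a small C++ fragment. The function below is purely illustrative (its name `countSteps` and its step accounting are assumptions, not part of the text): it counts one step per condition test and one per innermost statement, so the totals can be compared with what the rules predict.

```cpp
#include <cstddef>
#include <vector>

// Counts basic steps so the bound given by the rules can be checked by hand.
std::size_t countSteps(const std::vector<int>& v) {
    std::size_t steps = 0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) {          // loop rule: n iterations
        ++steps;                                   // O(1) condition test
        if (v[i] % 2 == 0) {                       // if rule: charge larger branch
            for (std::size_t j = 0; j < n; ++j) {  // then-branch costs O(n)
                ++steps;
            }
        } else {
            ++steps;                               // else-branch costs O(1)
        }
    }
    return steps;
}
```

The if-then-else rule charges the larger branch, O(n), and the loop rule multiplies by the n iterations, giving an O(n^2) worst-case bound. An all-even input of size n reaches that bound with n + n^2 steps, while an all-odd input takes only 2n steps, which shows why the rules yield an upper bound rather than an exact count.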

1.6 Check your progress

1. Describe the term Data Structure. Differentiate between static and dynamic data structures.

2. Define algorithm. What are the criteria that an algorithm must satisfy?

3. What is the running time of an algorithm? How do we calculate it?

4. What are the different asymptotic notations?


Unit 2 Data handling in C++

Structure

2.0 Unit Objective
2.1 Data Types, Data Structures and Abstract Data Types
2.2 Structured Types, Data Abstraction, and Classes
2.3 Basic Principles
2.4 Abstract Data Type
2.4.1 Categories of ADT Operations
2.4.2 C++ Classes
2.5 Information Hiding
2.6 Check your progress


2.0 Unit Objective

After going through this unit, you should be able to understand:

Data types and Abstract data types

C++ data types : scalar and structured

Aggregate operations possible

Principles of OO paradigm

2.1 Data Types, Data Structures and Abstract Data Types

Although the terms "data type" (or just "type"), "data structure" and "abstract data type" sound alike, they have different meanings. In a programming language, the data type of a variable is the set of values that the variable may assume. For example, a variable of type boolean can assume either the value true or the value false, but no other value.

An abstract data type is a mathematical model, together with various operations defined on the model. As we have indicated, we shall design algorithms in terms of ADT's, but to implement an algorithm in a given programming language we must find some way of representing the ADT's in terms of the data types and operators supported by the programming language itself. To represent the mathematical model underlying an ADT we use data structures, which are collections of variables, possibly of several different data types, connected in various ways.

The cell is the basic building block of data structures. We can picture a cell as a box that is capable of holding a value drawn from some basic or composite data type. Data structures are created by giving names to aggregates of cells and (optionally) interpreting the values of some cells as representing connections (e.g., pointers) among cells.

Another common mechanism for grouping cells in programming languages is the record structure. A record is a cell that is made up of a collection of cells, called fields, of possibly dissimilar types. Records are often grouped into arrays; the type defined by the aggregation of the fields of a record becomes the "celltype" of the array.
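As a small illustration of cells, records and connections, the sketch below groups records into an array and interprets one field of each record as a connection to another cell. The names Cell, next and sumChain are assumptions made for this example only, as is the choice of an array index (with -1 meaning "no connection") rather than a pointer.

```cpp
// A record ("cell type") with two fields of the array below.
struct Cell {
    int value;   // the value held in the cell
    int next;    // index of the connected cell, or -1 for none (assumed convention)
};

// Follows the connections starting at cells[start] and adds up the values.
int sumChain(const Cell cells[], int start)
{
    int total = 0;
    for (int i = start; i != -1; i = cells[i].next) {
        total += cells[i].value;
    }
    return total;
}
```

With an array of three cells in which cell 0 is connected to cell 2, sumChain(cells, 0) visits cells 0 and 2 and skips cell 1.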

2.2 Structured Types, Data Abstraction, and Classes

Scalar Data Types - int, float

An atomic data item

One memory location for a symbolic name

Cannot be decomposed

Structured Data Types - arrays, structs, union, class

A collection of components, data items, with an imposed organization

Entire collection is given a single name

Each component is accessed individually

C++ Data Types
    simple: char, int, float
    structured: array, struct, union, class

Structs (Record)

Block of memory locations (fields) given one name

Heterogeneous - fields can be different data types

Each component, field, is called a member

A member name is given to each field for accessing

    struct StructName        // StructName is optional
    {
        DataType MemberName; // MemberList
        DataType MemberName;
        ...
    } Variables;             // Variables is optional

Example:

    struct ItemRecType {     // Type declaration
        int qty;
        float cost;
        bool taxable;
    } item1;                 // Variable declaration

    ItemRecType item2;       // Variable declaration

Without the type name, the variables are of an anonymous type and no other variables can be declared with the same type.

Without the variable declarations, it is a type declaration only.


Access

A member is accessed by the member's name and the member selector (dot).

    StructVariable.MemberName;

Example:

    item1.cost = 57.98;

Aggregate Operations 

Assignment is permitted - two struct variables of the same type can be assigned to each other. Copies the contents of the struct variable to the other struct variable, member by member.

Example:

item2 = item1; // Assigns all field values in item1 to item2

Passing Struct as Parameters

Can be call by value or call by reference

Can be a return value of a function

Examples:

    SomeFunction(item1);                      // Valid call

    void SomeFunction(ItemRecType item)       // Call by value
    void SomeFunction(ItemRecType & item)     // Call by reference
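The points above on member access, aggregate assignment and parameter passing can be sketched together as follows. The struct ItemRecType is the one from the text; the functions TotalCost, Restock and MakeItem are hypothetical helpers introduced only for illustration.

```cpp
struct ItemRecType {
    int   qty;
    float cost;
    bool  taxable;
};

// Call by value: the function receives its own copy of the struct.
float TotalCost(ItemRecType item)
{
    return item.qty * item.cost;
}

// Call by reference: the function changes the caller's struct.
void Restock(ItemRecType& item, int extra)
{
    item.qty += extra;
}

// A struct can also be the return value of a function.
ItemRecType MakeItem(int qty, float cost, bool taxable)
{
    ItemRecType item;
    item.qty = qty;          // member access with the dot selector
    item.cost = cost;
    item.taxable = taxable;
    return item;
}
```

After item2 = item1, the two variables hold equal but independent copies, so calling Restock(item2, 2) leaves item1.qty unchanged.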

Hierarchical Records - Records in which at least one of the components is a record.

    struct DateType {
        int month;
        int day;
        int year;
    };

    enum GradeType {A, B, C, D, F};

    struct StudentRec {
        string first;
        string last;
        DateType birthday;
        float gpa;
        int programs;
        int quizzes;
        int final;
        GradeType grade;
    };

    StudentRec s1, s2;

    s1.grade = C;
    s2.birthday.month = 10;

Unions

A structure that holds only one of its members at a time during program execution.

Purpose is to conserve memory by forcing several members to use the same memory space.

Example:

    union byteType {
        long bytes;
        int KB;
        float MB;
        float GB;
    };

    byteType disk1;
    disk1.bytes = 345678;
    disk1.KB = 23;           // replaces the 345678
    disk1.MB = 1.44;         // replaces the 23

2.3 Basic Principles

Data Abstraction - Separate a data type's logical properties from its implementation.

Example:

Version 1:

    struct DateType {
        int month;
        int day;
        int year;
    };

    enum GradeType {A, B, C, D, F};

    struct StudentRec
    {
        string first;
        string last;
        DateType birthday;
        float gpa;
        int programs;
        int quizzes;
        int final;
        GradeType grade;
    };

Version 2:

    enum GradeType {A, B, C, D, F};

    struct StudentRec
    {
        string first;
        string last;
        int birthMo;
        int birthDay;
        int birthYr;
        float gpa;
        int programs;
        int quizzes;
        int final;
        GradeType grade;
    };

Version 1 is more abstract - the DateType can be used for many other types. Manipulation of a date (comparison and assignment etc) is universal. When it is embedded in the record that abstraction cannot be made. It would be specific to the StudentRec.
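To see why a separate DateType pays off, consider a date comparison written once and reused by unrelated records. This is a minimal sketch: the function Earlier and the record InvoiceRec are hypothetical, and StudentRec is cut down to two fields to keep the example short.

```cpp
struct DateType {
    int month;
    int day;
    int year;
};

// Written once for DateType; returns true when a falls before b.
bool Earlier(const DateType& a, const DateType& b)
{
    if (a.year != b.year)   return a.year < b.year;
    if (a.month != b.month) return a.month < b.month;
    return a.day < b.day;
}

struct StudentRec {          // shortened version of the record in the text
    DateType birthday;
    float gpa;
};

struct InvoiceRec {          // hypothetical second record that also holds a date
    DateType issued;
    float amount;
};
```

Earlier(s.birthday, inv.issued) works for both record types without change. With the Version 2 layout (birthMo, birthDay, birthYr as separate fields), the same comparison would have to be rewritten against the fields of each record.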

Control Abstraction - Separation of the logical properties of an action from the implementation. (Example: calling a library function: It is not necessary to know how it does what it does - just that the action happens. Users depend on the specification - description, not the implementation details.)

Data Abstraction - separation of the logical properties of a data type from the implementation details.

Data abstraction comes into play when a programmer needs a data type that is not built into the language. The new data type is called an ADT (Abstract Data Type).

2.4 Abstract Data Type

A data type whose logical properties are specified independently of any particular implementation. Users of the ADT need to know the specification (logical description). Implementers of the ADT determine the underlying implementation details by defining data structures and operations.

Data Structure - (The implementation of the data in an ADT) A collection of data elements whose organization is characterized by operations that access these elements.

2.4.1 Categories of ADT Operations

Constructor - An operation that creates a new instance of an ADT. Invoked most often by a declaration.

Transformer - An operation that builds a new value of the ADT from an existing value. Invoked most often to modify a data value in the ADT.

Observer - An operation that allows us to observe the state of an instance of an ADT without changing it. Observers include:

Predicates - check if a certain property is true or false.

Selector - returns a copy of a value in the object.

Summary - returns information about the object as a whole.

Iterator - An operation that allows us to process - one at a time - all the components in an instance of an ADT. Invoked most often to print the items of an object in a specified order. Only implemented for structured data.
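The categories above can be seen side by side in one small class. The sketch below is only an illustration: the class IntList, its fixed capacity of 8 and all of its member names are assumptions, not part of the text.

```cpp
class IntList {
public:
    // Constructor: creates a new, empty instance.
    IntList() : count(0) {}

    // Transformer: changes the value of the instance.
    void Append(int v)
    {
        if (count < MAX) data[count++] = v;   // silently ignores overflow (sketch only)
    }

    // Observer, predicate: is a property true or false?
    bool IsEmpty() const { return count == 0; }

    // Observer, selector: returns a copy of one stored value.
    int At(int i) const { return data[i]; }

    // Observer, summary: information about the object as a whole.
    int Length() const { return count; }

    // Iterator: processes all components, one at a time, in order.
    template <typename F>
    void ForEach(F visit) const
    {
        for (int i = 0; i < count; ++i) visit(data[i]);
    }

private:
    static const int MAX = 8;   // assumed capacity
    int data[MAX];
    int count;
};
```

A client might append a few values and then pass a function or lambda to ForEach to print or total them; none of the observers changes the list.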

2.4.2 C++ Classes

Class - A data type that is used to implement an ADT.

Class Member - A component of a class, either data or a function.

Class Object/Instance - A variable of class type.

Client - A program that declares and manipulates objects (instances) of a particular class.

Public - Those members that are available to the client (usually functions).

Private - Those members (data and functions) that are not available to the client and are used only within the ADT. They can still be seen in the specification.

Class Specification - A written description of what the class is and the functions/operations that can be performed on it. (The user/client of the ADT needs to understand only the specification.) Most often stored in a header (.h) file.

Example:

The specification for an ADT class to represent the date. (date.h)

    class DateClass {
    public:
        // Member function declarations
        void Set(int newMo, int newDay, int newYr);
        void Print() const;
        bool Equal(DateClass otherDate) const;

    private:
        // Member data declarations
        int mo;
        int day;
        int yr;
    };

const at the end of a function declaration indicates that the function doesn't change the data in the object.

By default, members of a class are private; public: must be used to open up members to a client program.

Example piece of a client program:

    #include "date.h"

    // variable declarations
    DateClass date1;
    DateClass date2;

Visual Image of the DateClass Class objects: date1 and date2 

Each has its own copies of all members

Technically the functions are not duplicated.

Class Implementation - The hidden code that determines how the class is represented and how its operations work. It is stored in a source file (.cpp).

Writing the code for a class (implementation file):

1. Choose a concrete data representation of the abstract data, using data types that already exist (int, float, struct, arrays).

2. Implement each of the allowable operations in terms of program instructions (functions).

Some Notes to remember: 

Client and implementation file must begin with the preprocessor directive to include the specification file.


Implementation file often has comments indicating the private data portion to remind reader of these identifiers.

Every function definition in the implementation file must be prefaced by the class name and the C++ scope resolution operator (::) . There could be other classes that use the same function names.

Implementation file can refer to the members without using the dot operator; field names are used like "global" variables. An exception to the rule arises when a member function manipulates two or more class objects, one or more of which are passed as parameters. An example can be seen in Equal().

Print() and Equal() are observer functions and do not modify the private data. The const prevents the member functions from accidentally modifying the data - it is an aid to the client and to the class implementers. The const must appear in both the specification file and implementation file.

Example:

The implementation for an ADT class to represent the date. (date.cpp)

    #include "date.h"
    #include <iostream>
    using namespace std;

    void DateClass::Set(int newMo, int newDay, int newYr)
    {
        mo = newMo;
        day = newDay;
        yr = newYr;
    }

    void DateClass::Print() const
    {
        cout << mo << "/" << day << "/" << yr;
    }

    bool DateClass::Equal(DateClass otherDate) const
    {
        return ((mo == otherDate.mo) && (day == otherDate.day) && (yr == otherDate.yr));
    }


Built-in operations on Classes: 

Pass as a parameter to a function

fun1(date1, date2) 

Return as a function return value

date1 = fun1(); 

Declare arrays of class objects

DateClass dates[15]; 

Assign one class object to another (only assigns data actually declared in object)

date1 = date2; 

Select a member of a class object

    date1.Set(5, 2, 98);
    date1.Print();
    if (date1.Equal(date2))
        cout << "The dates are the same!";

Example: streams

    istream cin;
    ostream cout;
    ifstream infile;

istream, ostream, ifstream and fstream are classes; each has many members, e.g. get, ignore, put, open.
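The built-in operations listed above can be exercised in a few lines. The sketch below repeats a cut-down DateClass (without Print) so that it stands alone; the helpers MakeDate and CountMatches are hypothetical and exist only to show returning a class object, passing class objects as parameters and working with an array of them.

```cpp
class DateClass {
public:
    void Set(int newMo, int newDay, int newYr)
    {
        mo = newMo;
        day = newDay;
        yr = newYr;
    }
    bool Equal(DateClass otherDate) const
    {
        return mo == otherDate.mo && day == otherDate.day && yr == otherDate.yr;
    }
private:
    int mo;
    int day;
    int yr;
};

// Return a class object as a function return value.
DateClass MakeDate(int mo, int day, int yr)
{
    DateClass d;
    d.Set(mo, day, yr);
    return d;
}

// Pass class objects (an array and a single object) as parameters.
int CountMatches(const DateClass dates[], int n, DateClass target)
{
    int matches = 0;
    for (int i = 0; i < n; ++i) {
        if (dates[i].Equal(target)) ++matches;
    }
    return matches;
}
```

A client could declare DateClass dates[3], fill the elements by assignment (dates[2] = dates[0] copies all three data members) and then call CountMatches.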

2.5 Information Hiding

Abstract Barrier - The invisible wall around a class object that encapsulates implementation details. The wall can be breached only through the public interface.

Black box - A device or program whose inner workings are hidden from view (e.g. a cable box, which has an interface that you plug into while the rest is hidden).

Information hiding - The encapsulation and hiding of implementation details to keep the user of an ADT from depending on or incorrectly manipulating the details.

Guaranteed Initialization using Class Constructors

Class Constructor - A special member function of a class that is implicitly invoked when a class object is declared. (Used for automatic initialization of data in a class.) Two types: default and parameterized.

The member function uses the class name.

No return value - the return data type is omitted.

If a class has multiple constructors, the appropriate constructor is chosen by the actual parameters (or absence of parameters) passed when the class object is declared.

Default Constructor - parameter-less. Once a class declares any constructor, the compiler no longer supplies a default one, so it must be written explicitly if objects are to be declared without arguments. (This is the one used in the declaration of arrays of a class.)

Other Constructors - have parameters; matched by the data type and number of parameters passed in the declaration.

Example: add to date.h the specification of the public member functions for construction:

    DateClass();                 // declaration of the default constructor
    DateClass(int, int, int);    // declaration of a parameterized constructor

Add to date.cpp the implementation of the constructors:

    DateClass::DateClass()
    {
        mo = 1;
        day = 1;
        yr = 1900;
    }

    DateClass::DateClass(int initMo, int initDay, int initYr)
    {
        mo = initMo;
        day = initDay;
        yr = initYr;
    }
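Putting the two constructors into a single compilable unit shows how the declaration selects the constructor. The sketch below keeps only the members needed for the demonstration; the default date 1/1/1900 is the one used in the text.

```cpp
class DateClass {
public:
    DateClass();                                      // default constructor
    DateClass(int initMo, int initDay, int initYr);   // parameterized constructor
    bool Equal(DateClass otherDate) const;
private:
    int mo;
    int day;
    int yr;
};

DateClass::DateClass()
{
    mo = 1;
    day = 1;
    yr = 1900;
}

DateClass::DateClass(int initMo, int initDay, int initYr)
{
    mo = initMo;
    day = initDay;
    yr = initYr;
}

bool DateClass::Equal(DateClass otherDate) const
{
    return mo == otherDate.mo && day == otherDate.day && yr == otherDate.yr;
}
```

Declaring DateClass today(7, 4, 1976); invokes the parameterized constructor, while DateClass d; and every element of DateClass dates[15]; are initialized by the default constructor, so no object starts with unset data.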

2.6 Check your progress

1. What are the various data types in C++?

2. Define structures. What are the various ways of accessing structures?

3. What is a hierarchical record? How can we create one?

4. What are the different categories of ADT operations?


Unit 3 Data Structure introduction

Structure

3.0 Unit Objective
3.1 Abstract Data Type Implementation
3.2 An Example: Collections
3.2.1 Constructors and destructors
3.2.2 Data Structure
3.2.3 Methods
3.2.4 Pre- and post-conditions
3.3 Error Handling
3.3.1 Defining Errors
3.3.2 Processing errors
3.4 Data Structures Examples
3.4.1 Arrays
3.4.2 Linked lists
3.4.3 List variants
3.4.4 Stacks
3.5 Check your progress


3.0 Unit Objective

After going through this unit, you should be able to understand:

ADT implementation

Basic collection concept

Error handling and processing

Collection examples

Arrays, Linked lists and Stacks

3.1 Abstract Data Type Implementation

We can think of an abstract data type (ADT) as a mathematical model with a collection of operations defined on that model. Sets of integers, together with the operations of union, intersection, and set difference, form a simple example of an ADT. In an ADT, the operations can take as operands not only instances of the ADT being defined but other types of operands, e.g., integers or instances of another ADT, and the result of an operation can be other than an instance of that ADT. However, we assume that at least one operand, or the result, of any operation is of the ADT in question.

The two properties of procedures mentioned above -- generalization and encapsulation --apply equally well to abstract data types. ADT's are generalizations of primitive data types (integer, real, and so on), just as procedures are generalizations of primitive operations (+, *, and so on). The ADT encapsulates a data type in the sense that the definition of the type and all operations on that type can be localized to one section of the program. If we wish to change the implementation of an ADT, we know where to look, and by revising one small section we can be sure that there is no subtlety elsewhere in the program that will cause errors concerning this data type. Moreover, outside the section in which the ADT's operations are defined, we can treat the ADT as a primitive type; we have no concern with the underlying implementation. One pitfall is that certain operations may involve more than one ADT, and references to these operations must appear in the sections for both ADT's. Turning to the abstract data type GRAPH we see need for the following operations: 1. get the first uncolored vertex, 2. test whether there is an edge between two vertices, 3. mark a vertex colored, and 4. get the next uncolored vertex.

There are clearly other operations needed outside the procedure greedy, such as inserting vertices and edges into the graph and making all vertices uncolored. It should be emphasized that there is no limit to the number of operations that can be applied to instances of a given mathematical model. Each set of operations defines a distinct ADT. Some examples of operations that might be defined on an abstract data type SET are:

1. MAKENULL(A). This procedure makes the null set be the value for set A.

2. UNION(A, B, C). This procedure takes two set-valued arguments A and B, and assigns the union of A and B to be the value of set C.

3. SIZE(A). This function takes a set-valued argument A and returns an object of type integer whose value is the number of elements in the set A.

An implementation of an ADT is a translation, into statements of a programming language, of the declaration that defines a variable to be of that abstract data type, plus a procedure in that language for each operation of the ADT. An implementation chooses a data structure to represent the ADT; each data structure is built up from the basic data types of the underlying programming language using the available data structuring facilities. Arrays and record structures are two important data structuring facilities that are available in Pascal. For example, one possible implementation for variable S of type SET would be an array that contained the members of S. One important reason for defining two ADT's to be different if they have the same underlying model but different operations is that the appropriateness of an implementation depends very much on the operations to be performed. Much of this book is devoted to examining some basic mathematical models such as sets and graphs, and developing the preferred implementations for various collections of operations. Ideally, we would like to write our programs in languages whose primitive data types and operations are much closer to the models and operations of our ADT's.
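As a concrete illustration of the array representation mentioned above, here is a minimal C++ sketch of a SET of integers with the three operations MAKENULL, UNION and SIZE. The capacity MAX_MEMBERS, the struct layout and the extra helpers Member and Insert are assumptions made for this example; the text names only the three operations.

```cpp
const int MAX_MEMBERS = 16;          // assumed fixed capacity

struct Set {
    int members[MAX_MEMBERS];        // the array that contains the members
    int count;                       // how many array slots are in use
};

// MAKENULL(A): make the null set be the value of A.
void MakeNull(Set& a)
{
    a.count = 0;
}

// Helper (not in the text): is x a member of a?
bool Member(const Set& a, int x)
{
    for (int i = 0; i < a.count; ++i) {
        if (a.members[i] == x) return true;
    }
    return false;
}

// Helper (not in the text): add x unless present or the array is full.
void Insert(Set& a, int x)
{
    if (!Member(a, x) && a.count < MAX_MEMBERS) {
        a.members[a.count++] = x;
    }
}

// UNION(A, B, C): assign the union of A and B to C.
void Union(const Set& a, const Set& b, Set& c)
{
    Set result = {};                 // built separately so c may be a or b
    for (int i = 0; i < a.count; ++i) Insert(result, a.members[i]);
    for (int i = 0; i < b.count; ++i) Insert(result, b.members[i]);
    c = result;
}

// SIZE(A): the number of elements in A.
int Size(const Set& a)
{
    return a.count;
}
```

With this unsorted array, Member and Insert take time proportional to the set size and Union is quadratic. A different choice of operations would favour a different representation, which is the point the paragraph above makes about implementations depending on the operations to be performed.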

Objects and ADTs

In this unit, we'll concentrate on the pre-cursor of OO design: abstract data types (ADTs). A theory for the full object oriented approach is readily built on the ideas for abstract data types.

An abstract data type is a data structure and a collection of functions or procedures which operate on the data structure.

To align ourselves with OO theory, we'll call the functions and procedures methods and the data structure and its methods a class, i.e. we'll call our ADTs classes. However our classes do not have the full capabilities associated with classes in OO theory. An instance of the class is called an object . Objects represent objects in the real world and appear in programs as variables of a type defined by the class. These terms have exactly the same meaning in OO design methodologies, but they have additional properties such as inheritance that we will not discuss here.


It is important to note that object orientation is a design methodology. As a consequence, it is possible to write OO programs using languages such as C, Ada and Pascal. The so-called OO languages such as C++ and Eiffel simply provide some compiler support for OO design: this support must be provided by the programmer in non-OO languages.

3.2 An Example: Collections

Programs often deal with collections of items. These collections may be organised in many ways and use many different program structures to represent them, yet, from an abstract point of view, there will be a few common operations on any collection. These might include:

create    Create a new collection
add       Add an item to a collection
delete    Delete an item from a collection
find      Find an item matching some criterion in the collection
destroy   Destroy the collection

3.2.1 Constructors and destructors

The create and destroy methods - often called constructors and destructors - are usually implemented for any abstract data type. Occasionally, the data type's use or semantics are such that there is only ever one object of that type in a program. In that case, it is possible to hide even the object's `handle' from the user. However, even in these cases, constructor and destructor methods are often provided. Of course, specific applications may call for additional methods, e.g. we may need to join two collections (form a union in set terminology) - or may not need all of these. One of the aims of good program design would be to ensure that additional requirements are easily handled.

3.2.2 Data Structure

To construct an abstract software model of a collection, we start by building the formal specification. The first component of this is the name of a data type - this is the type of objects that belong to the collection class. In C/C++, we use typedef to define a new type which is a pointer to a structure:

typedef struct collection_struct *collection;

Note that we are defining a pointer to a structure only; we have not specified details of the attributes of the structure. We are deliberately deferring this - the details of the implementation are irrelevant at this stage. We are only concerned with the abstract behaviour of the collection. In fact, as we will see later, we want to be able to substitute different data structures for the actual implementation of the collection, depending on our needs.


The typedef declaration provides us with a C type (class in OO design parlance), collection. We can declare objects of type collection wherever needed. Although C forces us to reveal that the handle for objects of the class is a pointer, it is better to take an abstract view: we regard variables of type collection simply as handles to objects of the class and forget that the variables are actually C pointers.

3.2.3 Methods

Next, we need to define the methods:

    collection ConsCollection( int max_items, int item_size );
    void AddToCollection( collection c, void *item );
    void DeleteFromCollection( collection c, void *item );
    void *FindInCollection( collection c, void *key );

Just as we defined our collection object as a pointer to a structure, we assume that the objects which belong in this collection are themselves represented by pointers to data structures. Hence in AddToCollection, item is typed void *. In ANSI C, void * will match any pointer - thus AddToCollection may be used to add any object to our collection. Similarly, key in FindInCollection is typed void *, as the key which is used to find any item in the collection may itself be some object. FindInCollection returns a pointer to the item which matches key, so it also has the type void *.

3.2.4 Pre- and post-conditions

No formal specification is complete without pre- and post-conditions. A useful way to view these is as forming a contract between the object and its client. The pre-conditions define a state of the program which the client guarantees will be true before calling any method, whereas the postconditions define the state of the program that the object's method will guarantee to create for you when it returns.

C provides no language construct for stating pre- and post-conditions, so they should be expressed as comments accompanying the method definition. However, the standard does define an assert function (strictly, a macro declared in <assert.h>) which can be used to verify pre- and post-conditions at run time. We will see how this is used when we examine an implementation of our collection object.
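A minimal sketch of this contract style follows, with the conditions written as comments and also verified with assert. The function IntegerSqrt is a hypothetical example chosen because its post-condition is easy to state; it is not part of the collection specification.

```cpp
#include <cassert>

// Pre-condition:  n >= 0
// Post-condition: returns r such that r*r <= n < (r+1)*(r+1)
int IntegerSqrt(int n)
{
    assert(n >= 0);                              // client's side of the contract

    long long r = 0;                             // long long avoids overflow in (r+1)*(r+1)
    while ((r + 1) * (r + 1) <= n) {
        ++r;
    }

    assert(r * r <= n && n < (r + 1) * (r + 1)); // method's side of the contract
    return static_cast<int>(r);
}
```

If a client breaks the pre-condition, for example by calling IntegerSqrt(-1), the first assertion stops the program at the point of the mistake. Compiling with NDEBUG defined removes the checks, so assertions should not be the only defence against bad input in released code.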

3.3 Error Handling

No program or program fragment can be considered complete until appropriate error handling has been added. Unexpected program failures are a disaster - at best, they cause frustration because the program user must repeat minutes or hours of work, but in life-critical applications, even the most trivial program error, if not processed correctly, has the potential to kill someone. If an error is fatal, in the sense that a program cannot sensibly continue, then the program must be able to "die gracefully". This means that it must

inform its user(s) why it died, and

save as much of the program state as possible.

3.3.1 Defining Errors

The first step in determining how to handle errors is to define precisely what is considered to be an error. Careful specification of each software component is part of this process. The pre-conditions of an ADT's methods will specify the states of a system (the input states) which a method is able to process. The post-conditions of each method should clearly specify the result of processing each acceptable input state. Thus, if we have a method:

    int f( some_class a, int i )
    /* PRE-CONDITION: i >= 0 */
    /* POST-CONDITION:
         if ( i == 0 )
             return 0 and a is unaltered
         else
             return 1 and update a's i-th element by .... */

This specification tells us that i == 0 is a meaningless input that f should flag by returning 0 but otherwise ignore. f is expected to handle correctly all positive values of i. The behaviour of f is not specified for negative values of i, i.e. it also tells us that it is an error for a client to call f with a negative value of i. Thus, a complete specification will specify

all the acceptable input states, and

the action of a method when presented with each acceptable input state.

By specifying the acceptable input states in pre-conditions, it will also divide responsibility for errors unambiguously:

The client is responsible for the pre-conditions: it is an error for the client to call the method with an unacceptable input state, and

The method is responsible for establishing the post-conditions and for reporting errors which occur in doing so.

3.3.2 Processing errors

Let's look at an error which must be handled by the constructor for any dynamically allocated object: the system may not be able to allocate enough memory for the object. A good way to create a disaster is to do this:

    X ConsX( .... )
    {
        X x = malloc( sizeof(struct t_X) );
        if ( x == NULL ) {
            printf("Insuff mem\n");
            exit( 1 );
        }
        else
            .....
    }

Not only is the error message so cryptic that it is likely to be little help in locating the cause of the error (the message should at least be "Insuff mem for X"!), but the program will simply exit, possibly leaving the system in some unstable, partially updated, state. This approach has other potential problems:

What if we've built this code into some elaborate GUI program with no provision for "standard output"? We may not even see the message as the program exits!

We may have used this code in a system, such as an embedded processor (a control computer), which has no way of processing an output stream of characters at all.

The use of exit assumes the presence of some higher level program, e.g. a Unix shell, which will capture and process the error code 1.
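One common way to avoid these problems is for the constructor to report the failure to its caller and leave the decision about what to do to a higher level of the program. The sketch below illustrates that idea in C++ and is not necessarily the approach this text goes on to adopt: the contents of t_X, the status type ConsStatus and the destructor DeleteX are all assumptions made for the example.

```cpp
#include <cstddef>
#include <new>

struct t_X {                 // hypothetical contents for the object
    int*        data;
    std::size_t size;
};
typedef t_X* X;

enum ConsStatus { CONS_OK, CONS_NO_MEMORY };   // hypothetical status codes

// Constructs an X holding `size` integers.
// On failure returns a null handle and sets *status; it neither prints nor exits.
X ConsX(std::size_t size, ConsStatus* status)
{
    X x = new (std::nothrow) t_X;              // nothrow form yields null on failure
    if (x == nullptr) {
        *status = CONS_NO_MEMORY;
        return nullptr;
    }
    x->data = new (std::nothrow) int[size];
    if (x->data == nullptr) {
        delete x;                              // do not leak the partly built object
        *status = CONS_NO_MEMORY;
        return nullptr;
    }
    x->size = size;
    *status = CONS_OK;
    return x;
}

// Matching destructor for objects built by ConsX.
void DeleteX(X x)
{
    if (x != nullptr) {
        delete[] x->data;
        delete x;
    }
}
```

The caller checks the status and can then save its state, report the problem through whatever channel the application has (a dialog box, a log, an LED on an embedded board) or retry with a smaller request.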

3.4 Data Structures Examples

In this section, we will examine some fundamental data structures: arrays, lists, stacks and trees.

3.4.1 Arrays

The simplest way to implement our collection is to use an array to hold the items. Thus the implementation of the collection object becomes:

    /* Array implementation of a collection */
    #include <assert.h>      /* Needed for assertions */
    #include "collection.h"  /* import the specification */

    struct t_collection {
        int item_cnt;
        int max_cnt;         /* Not strictly necessary */
        int item_size;       /* Needed by FindInCollection */
        void *items[];
    };

/* Array implementation of a Collection */ #include <stdio.h> /* Definition of NULL */ #include <assert.h> /* Needed for assertions */ #include "Collection.h" /* import the specification */ extern void *ItemKey( void * ); struct t_Collection { int item_cnt; int max_items;

/* Not strictly necessary */

int size;

/* Needed by FindInCollection */

void **items; }; Collection ConsCollection(int max_items, int item_size ) /* Construct a new Collection Pre-condition: (max_items > 0) && (item_size > 0) Post-condition: returns a pointer to an empty Collection*/ { Collection c; assert( max_items > 0 ); assert( item_size > 0 ); c = (Collection)calloc( 1, sizeof(struct t_Collection) ); c->items = (void **)calloc(max_items,sizeof(void *)); c->size = item_size; c->max_items = max_items; return c; } void DeleteCollection( Collection c ) { assert( c != NULL ); assert( c->items != NULL ); free( c->items ); free( c ); } void AddToCollection( Collection c, void *item ) /* Add an item to a Collection Pre-condition: (c is a Collection created by a call to ConsCollection) && (existing item count < max_items) && (item != NULL)

FOR MORE DETAILS VISIT US ON WWW.IMTSINSTITUTE.COM OR CALL ON +91-9999554621


DATA STRUCTURES & ALOGRITHEMS USING C, C++

33

Post-condition: item has been added to c*/ { assert( c != NULL); assert( c->item_cnt < c->max_items ); assert( item != NULL); c->items[c->item_cnt++] = item; /* Post-condition */ assert( FindInCollection( c, ItemKey( item ) ) != NULL ); } void DeleteFromCollection( Collection c, void *item ) /* Delete an item from a Collection Pre-condition: (c is a Collection created by a call to ConsCollection) && (existing item count >= 1) && (item != NULL) Post-condition: item has been deleted from c*/ { int i; assert( c != NULL ); assert( c->item_cnt >= 1 ); assert( item != NULL ); for(i=0;i<c->item_cnt;i++) { if ( item == c->items[i] ) { /* Found the item to be deleted, shuffle all the rest down */ while( i < c->item_cnt ) { c->items[i] = c->items[i+1]; i++; } c->item_cnt--; break; } } }

void *FindInCollection( Collection c, void *key )
/* Find an item in a Collection
   Pre-condition: c is a Collection created by a call to ConsCollection
                  key != NULL
   Post-condition: returns an item identified by key if one exists,
                   otherwise returns NULL */
{
    int i;
    assert( c != NULL );
    assert( key != NULL );
    for ( i = 0; i < c->item_cnt; i++ ) {
        if ( memcmp( ItemKey(c->items[i]), key, c->size ) == 0 )
            return c->items[i];
    }
    return NULL;
}

The array implementation of our collection has one serious drawback: you must know the maximum number of items in your collection when you create it. This presents problems in programs in which this maximum number cannot be predicted accurately when the program starts up. Fortunately, we can use a structure called a linked list to overcome this limitation.

3.4.2 Linked lists

The linked list is a very flexible dynamic data structure: items may be added to it or deleted from it at will. A programmer need not worry about how many items a program will have to accommodate: this allows us to write robust programs which require much less maintenance. A very common source of problems in program maintenance is the need to increase the capacity of a program to handle larger collections: even the most generous allowance for growth tends to prove inadequate over time! In a linked list, each item is allocated space as it is added to the list. A link is kept with each item to the next item in the list. Each node of the list has two elements

1. the item being stored in the list and
2. a pointer to the next item in the list

The last node in the list contains a NULL pointer to indicate that it is the end or tail of the list. As items are added to a list, memory for a node is dynamically allocated. Thus the number of items that may be added to a list is limited only by the amount of memory available.

Handle for the list

The variable (or handle) which represents the list is simply a pointer to the node at the head of the list.

Adding to a list

The simplest strategy for adding an item to a list is to:

a. allocate space for a new node,
b. copy the item into it,
c. make the new node's next pointer point to the current head of the list and
d. make the head of the list point to the newly allocated node.

This strategy is fast and efficient, but each item is added to the head of the list. An alternative is to create a structure for the list which contains both head and tail pointers:

struct fifo_list {
    struct node *head;
    struct node *tail;
};
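The two strategies above can be sketched as follows. This is a minimal illustration, not the Collection implementation itself: the node layout with an int payload and the names add_to_head, add_to_tail and clear_list are assumptions made for the example.

```cpp
#include <cassert>

struct node {
    int item;            // the item stored in this node (an int, for simplicity)
    struct node *next;   // link to the next node, or nullptr at the tail
};

struct fifo_list {
    struct node *head;
    struct node *tail;
};

// Add an item at the head of the list (steps a to d above).
void add_to_head(fifo_list &list, int item) {
    node *n = new node;      // a. allocate space for a new node
    n->item = item;          // b. copy the item into it
    n->next = list.head;     // c. point the new node at the current head
    list.head = n;           // d. make the head point to the new node
    if (list.tail == nullptr)
        list.tail = n;       // the first node is also the tail
}

// Add an item at the tail; the tail pointer makes this a constant-time step.
void add_to_tail(fifo_list &list, int item) {
    // Invariant: head and tail are either both null or both set.
    assert((list.head == nullptr) == (list.tail == nullptr));
    node *n = new node;
    n->item = item;
    n->next = nullptr;
    if (list.tail == nullptr)
        list.head = n;           // empty list: new node is also the head
    else
        list.tail->next = n;     // link the old tail to the new node
    list.tail = n;
}

// Release every node and reset the list to empty.
void clear_list(fifo_list &list) {
    while (list.head != nullptr) {
        node *next = list.head->next;
        delete list.head;
        list.head = next;
    }
    list.tail = nullptr;
}
```

Adding at the head gives LIFO order, while adding at the tail and removing from the head gives FIFO order.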

3.4.3 List variants

Circularly Linked Lists

By ensuring that the tail of the list is always pointing to the head, we can build a circularly linked list. If the external pointer (the one in struct t_node in our implementation) points to the current "tail" of the list, then the "head" is found trivially via tail->next, permitting us to have either LIFO or FIFO lists with only one external pointer. In modern processors, the few bytes of memory saved in this way would probably not be regarded as significant. A circularly linked list would more likely be used in an application which required "round-robin" scheduling or processing.

Doubly Linked Lists

Doubly linked lists permit scanning or searching of the list in both directions. (To go backwards in a simple list, it is necessary to go back to the start and scan forwards.) Many applications require searching backwards and forwards through sections of a list: for example, searching for a common name like "Kim" in a Korean telephone directory would probably need much scanning backwards and forwards through a small region of the whole list, so the backward links become very useful. In this case, the node structure is altered to have two links:

struct t_node {
    void *item;
    struct t_node *previous;
    struct t_node *next;
} node;
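To make the two-link structure concrete, here is a small sketch of inserting and unlinking nodes in a doubly linked list. The type dnode (with an int payload) and the functions insert_after and unlink_node are hypothetical names chosen for this example.

```cpp
#include <cassert>

struct dnode {
    int item;
    dnode *previous;   // link to the node before this one, or nullptr
    dnode *next;       // link to the node after this one, or nullptr
};

// Insert new_node immediately after pos, fixing the links in both directions.
void insert_after(dnode *pos, dnode *new_node) {
    assert(pos != nullptr && new_node != nullptr);
    new_node->previous = pos;
    new_node->next = pos->next;
    if (pos->next != nullptr)
        pos->next->previous = new_node;
    pos->next = new_node;
}

// Unlink n from its neighbours; the caller still owns n's memory.
void unlink_node(dnode *n) {
    assert(n != nullptr);
    if (n->previous != nullptr)
        n->previous->next = n->next;
    if (n->next != nullptr)
        n->next->previous = n->previous;
    n->previous = nullptr;
    n->next = nullptr;
}
```

Because every node knows its predecessor, a node can be removed without scanning from the head, at the cost of one extra pointer per node.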

Lists in arrays

Although this might seem pointless (Why impose a structure which has the overhead of the "next" pointers on an array?), this is just what memory allocators do to manage available space.

Memory is just an array of words. After a series of memory allocations and de-allocations, there are blocks of free memory scattered throughout the available heap space. In order to be able to re-use this memory, memory allocators will usually link freed blocks together in a free list by writing pointers to the next free block in the block itself. An external free list pointer points to the first block in the free list. When a new block of memory is requested, the allocator will generally scan the free list looking for a freed block of suitable size and delete it from the free list (re-linking the free list around the deleted block). Many variations of memory allocators have been proposed: refer to a text on operating systems or implementation of functional languages for more details. The entry in the index under garbage collection will probably lead to a discussion of this topic.
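The idea of threading a free list through the free blocks themselves can be shown with a deliberately simplified sketch. It assumes fixed-size blocks, so no search for a "suitable size" is needed, and the names pool, block, pool_init, pool_alloc and pool_free are invented for the illustration; real allocators are considerably more elaborate.

```cpp
#include <cassert>

const int POOL_BLOCKS = 4;

// A block is either on the free list (next_free is valid)
// or handed out to the program (payload is valid), never both.
union block {
    block *next_free;
    char payload[32];
};

struct pool {
    block blocks[POOL_BLOCKS];   // the "array of words" being managed
    block *free_list;            // external pointer to the first free block
};

// Link every block into the free list.
void pool_init(pool &p) {
    for (int i = 0; i < POOL_BLOCKS - 1; i++)
        p.blocks[i].next_free = &p.blocks[i + 1];
    p.blocks[POOL_BLOCKS - 1].next_free = nullptr;
    p.free_list = &p.blocks[0];
}

// Take the first free block off the list; returns nullptr when none is left.
void *pool_alloc(pool &p) {
    if (p.free_list == nullptr)
        return nullptr;
    block *b = p.free_list;
    p.free_list = b->next_free;
    return b;
}

// Return a block to the pool by writing the link into the block itself.
void pool_free(pool &p, void *ptr) {
    assert(ptr != nullptr);
    block *b = static_cast<block *>(ptr);
    b->next_free = p.free_list;
    p.free_list = b;
}
```

No memory beyond the blocks themselves and one external pointer is needed, because a free block has no other use for its own storage.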

3.4.4 Stacks

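Another way of storing data is in a stack. A stack is generally implemented with only two principal operations (apart from constructor and destructor methods):

push: adds an item to a stack
pop: extracts the most recently pushed item from the stack

Other methods such as

top: returns the item at the top without removing it
isempty: determines whether the stack has anything in it

are sometimes added.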

A common model of a stack is a plate or coin stacker. Plates are "pushed" onto the top and "popped" off the top. Stacks form Last-In-First-Out (LIFO) queues and have many applications from the parsing of algebraic expressions to ...

A formal specification of a stack class would look like:

typedef struct t_stack *stack;

stack ConsStack( int max_items, int item_size );
/* Construct a new stack
   Pre-condition: (max_items > 0) && (item_size > 0)
   Post-condition: returns a pointer to an empty stack */

void Push( stack s, void *item );
/* Push an item onto a stack
   Pre-condition: (s is a stack created by a call to ConsStack) &&
                  (existing item count < max_items) && (item != NULL)
   Post-condition: item has been added to the top of s */

void *Pop( stack s );
/* Pop an item off a stack
   Pre-condition: (s is a stack created by a call to ConsStack) &&
                  (existing item count >= 1)
   Post-condition: top item has been removed from s */

Important Points

a. A stack is simply another collection of data items and thus it would be possible to use exactly the same specification as the one used for our general collection. However, collections with the LIFO semantics of stacks are so important in computer science that it is appropriate to set up a limited specification appropriate to stacks only.

b. Although a linked list implementation of a stack is possible (adding and deleting from the head of a linked list produces exactly the LIFO semantics of a stack), the most common applications for stacks have a space constraint, so an array implementation is a natural and efficient choice. (In most operating systems, allocation and de-allocation of memory is a relatively expensive operation, so there is a penalty for the flexibility of linked list implementations.)
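One way the specification above might be realised with an array is sketched below. It follows the style of the array-based Collection; the structure fields and the DeleteStack destructor are assumptions, since the specification does not show them.

```cpp
#include <cassert>
#include <cstdlib>

struct t_stack {
    int item_cnt;     // number of items currently on the stack
    int max_items;    // capacity fixed at construction time
    void **items;     // array of pointers to the stored items
};
typedef struct t_stack *stack;

stack ConsStack(int max_items, int item_size) {
    assert(max_items > 0);
    assert(item_size > 0);   // required by the specification; not used by this sketch
    stack s = static_cast<stack>(std::calloc(1, sizeof(struct t_stack)));
    assert(s != nullptr);
    s->items = static_cast<void **>(std::calloc(max_items, sizeof(void *)));
    assert(s->items != nullptr);
    s->max_items = max_items;
    return s;
}

void Push(stack s, void *item) {
    assert(s != nullptr);
    assert(s->item_cnt < s->max_items);
    assert(item != nullptr);
    s->items[s->item_cnt++] = item;     // the top of the stack is the end of the array
}

void *Pop(stack s) {
    assert(s != nullptr);
    assert(s->item_cnt >= 1);
    return s->items[--s->item_cnt];     // remove and return the most recent item
}

// Hypothetical destructor, mirroring DeleteCollection.
void DeleteStack(stack s) {
    assert(s != nullptr);
    std::free(s->items);
    std::free(s);
}
```

Both Push and Pop touch only the end of the array, so neither needs to move any other item.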

Stack Frames

Almost invariably, programs compiled from modern high level languages (even C++) make use of a stack frame for the working memory of each procedure or function invocation. When any procedure or function is called, a number of words - the stack frame - is pushed onto a program stack. When the procedure or function returns, this frame of data is popped off the stack.

As a function calls another function, first its arguments, then the return address and finally space for local variables are pushed onto the stack. Since each function runs in its own "environment" or context, it becomes possible for a function to call itself - a technique known as recursion. This capability is extremely useful and extensively used - because many problems are elegantly specified or solved in a recursive way.

Program stack after executing a pair of mutually recursive functions:

function f(int x, int y) {
    int a;
    if ( term_cond ) return ...;
    a = .....;
    return g(a);
}

function g(int z) {
    int p,q;
    p = ...;
    q = ...;
    return f(p,q);
}

Note how the whole environment of functions f and g (their parameters and local variables) is found in the stack frame. When f is called a second time from g, a new frame for the second invocation of f is created.
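The functions f and g above are only outlined. As a concrete, runnable pair of mutually recursive functions, consider the following sketch; is_even and is_odd are illustrative names and are not taken from the text.

```cpp
#include <cassert>

bool is_odd(int n);   // forward declaration, needed because the two functions call each other

// Each invocation gets its own stack frame holding its own copy of n.
bool is_even(int n) {
    assert(n >= 0);
    if (n == 0)
        return true;          // terminating condition
    return is_odd(n - 1);     // pushes a new frame for is_odd
}

bool is_odd(int n) {
    assert(n >= 0);
    if (n == 0)
        return false;         // terminating condition
    return is_even(n - 1);    // pushes a new frame for is_even
}
```

A call such as is_even(4) creates frames for is_even(4), is_odd(3), is_even(2), is_odd(1) and is_even(0) before the results are passed back and the frames are popped in reverse order.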

3.5 Check your progress

1. What are the basic operations that can be performed on any collection?
2. Which conditions in a program can be called errors? How can we handle them in C++?
3. Give a brief introduction to the array as a collection.
4. Give a brief introduction to the stack as a collection.


Unit 4 Searching Techniques

Structure

4.0 Unit Objective
4.1 Searching Basics
4.2 Sequential Searches
4.3 Binary Search
4.4 Improvements in Searching
4.5 Hashing
4.6 Check your progress

4.0 Unit Objective

After going through this unit, you should be able to understand:

Searching basics
Sequential search
Binary search
Analysis of searching techniques
Source code of searching in C++

4.1 Searching Basics

Computer systems are often used to store large amounts of data from which individual records must be retrieved according to some search criterion. Thus the efficient storage of data to facilitate fast searching is an important issue. In this section, we shall investigate the performance of some searching algorithms and the data structures which they use.

4.2 Sequential Searches

Let's examine how long it will take to find an item matching a key in the collections we have discussed so far. We're interested in: a. the average time b. the worst-case time and c. the best possible time.

However, we will generally be most concerned with the worst-case time as calculations based on worst-case times can lead to guaranteed performance predictions. Conveniently, the worst-case times are generally easier to calculate than average times.

If there are n items in our collection - whether it is stored as an array or as a linked list - then it is obvious that in the worst case, when there is no item in the collection with the desired key, n comparisons of the key with keys of the items in the collection will have to be made.

To simplify analysis and comparison of algorithms, we look for a dominant operation and count the number of times that dominant operation has to be performed. In the case of searching, the dominant operation is the comparison. Since the search requires n comparisons in the worst case, we say this is an O(n) (pronounce this "big-Oh-n" or "Oh-n") algorithm. The best case - in which the first comparison returns a match - requires a single comparison and is O(1). The average time depends on the probability that the key will be found in the collection - this is something that we would not expect to know in the majority of cases. Thus in this case, as in most others, estimation of the average time is of little utility. If the performance of the system is vital, i.e. it's part of a life-critical system, then we must use the worst case in our design calculations as it represents the best guaranteed performance.
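The best and worst cases can be observed directly by counting the dominant operation. The sketch below searches a plain int array rather than a Collection; the function name sequential_search and its comparisons parameter are assumptions made for the illustration.

```cpp
#include <cassert>

// Returns the index of key in items[0..n-1], or -1 if it is absent.
// *comparisons receives the number of key comparisons that were made.
int sequential_search(const int items[], int n, int key, int *comparisons) {
    assert(n >= 0 && comparisons != nullptr);
    *comparisons = 0;
    for (int i = 0; i < n; i++) {
        (*comparisons)++;         // the dominant operation
        if (items[i] == key)
            return i;
    }
    return -1;                    // every one of the n items was compared
}
```

With five items, a key in the first position costs one comparison, while a key in the last position or a missing key costs five.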

4.3 Binary Search

However, if we place our items in an array and sort them in either ascending or descending order on the key first, then we can obtain much better performance with an algorithm called binary search.

In binary search, we first compare the key with the item in the middle position of the array. If there's a match, we can return immediately. If the key is less than the middle key, then the item sought must lie in the lower half of the array; if it's greater, then the item sought must lie in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the array.

Our FindInCollection function can now be implemented:

static void *bin_search( Collection c, int low, int high, void *key ) {
    int mid, cmp;
    /* Termination check */
    if (low > high) return NULL;
    mid = (high+low)/2;
    /* memcmp guarantees only the sign of its result, so test the sign
       rather than comparing against -1 and 1 */
    cmp = memcmp( key, ItemKey(c->items[mid]), c->size );
    if ( cmp == 0 )
        /* Match, return item found */
        return c->items[mid];
    else if ( cmp < 0 )
        /* key is less than mid, search lower half */
        return bin_search( c, low, mid-1, key );
    else
        /* key is greater than mid, search upper half */
        return bin_search( c, mid+1, high, key );
}

void *FindInCollection( Collection c, void *key ) {
/* Find an item in a collection
   Pre-condition: c is a collection created by ConsCollection
                  c is sorted in ascending order of the key
                  key != NULL
   Post-condition: returns an item identified by key if one exists,
                   otherwise returns NULL */
    int low, high;
    low = 0;
    high = c->item_cnt-1;
    return bin_search( c, low, high, key );
}

Important points:

a. bin_search is recursive: it determines whether the search key lies in the lower or upper half of the array, then calls itself on the appropriate half.

b. There is a termination condition (two of them in fact!)
   i. If low > high then the partition to be searched has no elements in it and
   ii. If there is a match with the element in the middle of the current partition, then we can return immediately.

c. AddToCollection will need to be modified to ensure that each item added is placed in its correct place in the array. The procedure is simple:
   i. Search the array until the correct spot to insert the new item is found,
   ii. Move all the following items up one position and
   iii. Insert the new item into the empty position thus created.

d. bin_search is declared static. It is a local function and is not used outside this class: if it were not declared static, it would be exported and be available to all parts of the program. The static declaration also allows other classes to use the same name internally.
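The three-step insertion procedure in point c can be sketched on a plain int array as follows. This is not the modified AddToCollection itself; the function name insert_sorted and its parameters are assumptions made for the example.

```cpp
#include <cassert>

// Insert value into the sorted array items[0..*count-1], keeping it sorted.
// capacity is the total number of slots available in items.
void insert_sorted(int items[], int *count, int capacity, int value) {
    assert(items != nullptr && count != nullptr);
    assert(*count >= 0 && *count < capacity);

    // i. Search the array until the correct spot is found.
    int pos = 0;
    while (pos < *count && items[pos] <= value)
        pos++;

    // ii. Move all the following items up one position, starting from the end.
    for (int i = *count; i > pos; i--)
        items[i] = items[i - 1];

    // iii. Insert the new item into the empty position thus created.
    items[pos] = value;
    (*count)++;
}
```

The search in step i is written as a simple scan here; it could use binary search instead, but step ii still has to move up to n items.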

Analysis

Each step of the algorithm divides the block of items being searched in half. We can divide a set of n items in half at most log2 n times. Thus the running time of a binary search is proportional to log n and we say this is an O(log n) algorithm.

Binary search requires a more complex program than our original search and thus for small n it may run slower than the simple linear search. However, for large n, the ratio (log n)/n tends to 0.

Thus at large n, log n is much smaller than n, consequently an O(log n) algorithm is much faster than an O(n) one.
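The gap between n and log n is easy to check numerically. The helper below, with the hypothetical name halvings, counts how many times a block of n items can be cut in half before a single item remains.

```cpp
#include <cassert>

// Number of times n can be halved (integer division) before reaching 1,
// which equals floor(log2 n) for n >= 1.
int halvings(long n) {
    assert(n >= 1);
    int steps = 0;
    while (n > 1) {
        n /= 2;
        steps++;
    }
    return steps;
}
```

For n = 1000 the function returns 9 and for n = 1,000,000 it returns 19, so a binary search over a million sorted items needs about 20 comparisons where a linear search may need a million.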

Figure: plot of n and log n vs n.

In the worst case, insertion may require n operations to insert into a sorted list.

1. We can find the place in the list where the new item belongs using binary search in O(log n) operations.
2. However, we have to shuffle all the following items up one place to make way for the new one. In the worst case, the new item is the first in the list, requiring n move operations for the shuffle!

A similar analysis will show that deletion is also an O(n) operation. If our collection is static, i.e. it doesn't change very often - if at all - then we may not be concerned with the time required to change its contents: we may be prepared for the initial build of the collection and the occasional insertion and deletion to take some time. In return, we will be able to use a simple data structure (an array) which has little memory overhead.

However, if our collection is large and dynamic, i.e. items are being added and deleted continually, then we can obtain considerably better performance using a data structure called a tree.

In summary, binary search is a technique for searching an ordered list in which we first check the middle item and - based on that comparison - "discard" half the data. The same procedure is then applied to the remaining half until a match is found or there are no more items left.

Program to perform binary search in an array.

/* Binary search in an array */
#include <stdio.h>

int main(void)
{
    int arr[] = {1,2,3,9,12,13,15,17};   /* Array declaration */
    int mid, lower = 0, upper = 7;       /* upper is the index of the last element */
    int flag = 1, num;

    printf("Enter the number to be searched : ");
    if (scanf("%d", &num) != 1)
        return 1;                        /* No valid number was entered */
    for (mid = (lower+upper)/2; lower <= upper; mid = (lower+upper)/2)
    {
        if (arr[mid] == num)   /* If the element is found print its location */
        {
            printf("The number is at location %d\n", mid);
            flag = 0;
            break;             /* Exit the loop */
        }
        if (arr[mid] < num)    /* Continue until the element is found */
        {
            lower = mid+1;
        }
        else
        {
            upper = mid-1;
        }
    }
    if (flag == 1)
        printf("Element not in the list\n");   /* Print not found */
    return 0;
}

Output :

Enter the number to be searched : 3
The number is at location 2

Program to perform binary search in an array using recursion.

/* Binary search using recursion */
#include <stdio.h>

int bin_search(int a[], int, int, int);

int main(void)
{
    int arr[7] = {1,3,5,7,8,9,11};   /* Declare array */
    int lower = 0, upper = 6, num, p;

    printf("Enter the number to be searched : ");
    if (scanf("%d", &num) != 1)
        return 1;                    /* No valid number was entered */
    p = bin_search(arr, lower, upper, num);   /* Function call to bin_search */
    if (p == -1)                     /* bin_search returns -1 when the number is absent */
        printf("Element not in the list\n");
    else
        printf("Element is at position %d\n", p);   /* Print location in array */
    return 0;
}

int bin_search(int a[], int lb, int ub, int number)
{
    int middle = (lb+ub)/2;
    if (lb > ub)
        return -1;
    if (a[middle] == number)   /* If number is found */
        return middle;         /* Return location of no. */
    if (a[middle] < number)
        return bin_search(a, middle+1, ub, number);   /* Recursive call to bin_search */
    else
        return bin_search(a, lb, middle-1, number);   /* Recursive call to bin_search */
}

Output :

Enter the number to be searched : 3
Element is at position 1

Program to perform linear search in a 2D array

#include <stdio.h>

void create(int a[3][3]);   /* Function Declarations */
void display(int a[3][3]);
void search(int a[3][3], int n);

int main(void)
{
    int arr[3][3], num;
    printf("Enter the array :\n");
    create(arr);    /* Function call to create an array */
    display(arr);   /* Function call to display array */
    printf("Enter the number to be searched : ");
    scanf("%d", &num);
    search(arr, num);   /* Perform search */
    return 0;
}

void create(int a[3][3])   /* Function to create an array */
{
    for (int i = 0; i <= 2; i++)
    {
        for (int j = 0; j <= 2; j++)
        {
            printf("Enter the element : ");
            scanf("%d", &a[i][j]);
        }
    }
}

void display(int a[3][3])   /* Function to display the array */
{
    for (int i = 0; i <= 2; i++)
    {
        for (int j = 0; j <= 2; j++)
            printf("%d ", a[i][j]);
        printf("\n");
    }
}

void search(int a[3][3], int n)   /* Function to perform linear search */
{
    for (int i = 0; i <= 2; i++)
    {
        for (int j = 0; j <= 2; j++)
        {
            if (a[i][j] == n)
            {
                printf("Element found at location %d %d\n", i+1, j+1);
                return;   /* Stop at the first match; break would leave only the inner loop */
            }
        }
    }
    printf("Element not in the list\n");
}

Output :

Enter the array :
Enter the element : 1
Enter the element : 2
Enter the element : 3
Enter the element : 4
Enter the element : 5
Enter the element : 6
Enter the element : 7
Enter the element : 8
Enter the element : 9
1 2 3
4 5 6
7 8 9
Enter the number to be searched : 2
Element found at location 1 2

4.4 Improvements in Searching

The function below demonstrates a sequential search. The function accepts a pointer to an array of integers, the number of integers in the array, and a number to be looked for in the array. A boolean value is returned specifying whether that number exists in the array.

bool InArray(int *array, int size, int num)
{
    for (int j = 0; j < size; j++)
        if (array[j] == num)
            return true;
    return false;   //reached only after every element has been compared
}

The algorithm is very simple to implement, but also very inefficient. In the average and worst case, it takes O(N) time to find if the item exists. In very large arrays, this is a very slow operation.

A more efficient search can be performed if the array is already sorted. Note that sorting an array first, and then using one of the more efficient search algorithms, may be inefficient in the long run. It is recommended that such algorithms be used only if the array is sorted to begin with.

The binary search algorithm is one of the simplest searching algorithms on a sorted array. For demonstration purposes, we will assume the array is sorted in ascending order. However, the algorithm can easily be modified to work with a descending list by changing the relational operators. The binary search algorithm compares the number being looked for to the value of the middle element in the array. Depending on whether it is less or greater, the same process is then done on the left or right part of the array respectively. One possible implementation of a binary search is shown below.

bool InArray(int *array, int left, int right, int num)
//left and right are the left and right index values of the array. These are both necessary as parameters since the function is recursive.
{
    if (left > right)   //all possibilities have been searched. num is not in the array
        return false;
    int mid = (left+right)/2;   //index of the middle element
    if (num == array[mid])   //if a match is found, return true
        return true;
    if (num < array[mid])
        return InArray(array, left, mid - 1, num);   //perform the same operation. New right position is the element before the middle
    return InArray(array, mid + 1, right, num);   //perform the same operation. New left position is the element directly after the middle.
}

4.5 Hashing

Hashing is a somewhat different approach to searching, as it attempts to make searching an O(1) operation. A hash function is applied, which uses some type of algorithm or formula to determine which index of an array (also called the hash table) the object should be stored at. The algorithm or formula depends on the type of data in each situation.
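As an illustration of what such a formula might look like, the sketch below shows one simple hash function for integer keys and one for strings. Both are common textbook choices picked for this example (the names hash_int and hash_string and the multiplier 31 are assumptions); the text does not prescribe a particular function.

```cpp
#include <cassert>

// Map an integer key to an index in a table with table_size slots.
unsigned hash_int(unsigned key, unsigned table_size) {
    assert(table_size > 0);
    return key % table_size;
}

// Map a string key to a table index by combining its characters.
unsigned hash_string(const char *key, unsigned table_size) {
    assert(key != nullptr && table_size > 0);
    unsigned h = 0;
    for (const char *p = key; *p != '\0'; p++)
        h = h * 31 + (unsigned char)*p;   // unsigned arithmetic wraps around safely
    return h % table_size;
}
```

With a table of 10 slots, hash_int maps both 13 and 23 to index 3, so two different keys can compete for the same position.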

In most tables, at least a few collisions occur. A collision occurs when the hash function returns an index that already holds a value. This could mean either that the value is a duplicate, or that the hash function maps more than one key to the same index, which is often the case. There are two approaches to resolving a collision.

The first is known as open hashing. In this method, collision values are stored "outside" of the array. An example of open hashing is having each array index point to another array, or linked list. Thus, all values that hash to that location are stored in the list. The items in the list can be sorted by their value, access frequency, or the order in which they were put into the table.
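A minimal sketch of open hashing with a linked list per table index is shown below. The table size, the int keys, the modulo hash in bucket_of and all the names are assumptions made for the example; new items are placed at the front of each list, i.e. in reverse order of insertion.

```cpp
#include <cassert>

const int TABLE_SIZE = 10;

struct chain_node {
    int key;
    chain_node *next;
};

struct chained_table {
    chain_node *buckets[TABLE_SIZE];   // each index heads a linked list
};

void chained_init(chained_table &t) {
    for (int i = 0; i < TABLE_SIZE; i++)
        t.buckets[i] = nullptr;
}

// Hash function: non-negative keys are mapped by remainder.
int bucket_of(int key) {
    assert(key >= 0);
    return key % TABLE_SIZE;
}

// Store key at the front of the list for its index.
void chained_insert(chained_table &t, int key) {
    int b = bucket_of(key);
    chain_node *n = new chain_node;
    n->key = key;
    n->next = t.buckets[b];
    t.buckets[b] = n;
}

// Only the list at the key's own index has to be searched.
bool chained_contains(const chained_table &t, int key) {
    for (const chain_node *n = t.buckets[bucket_of(key)]; n != nullptr; n = n->next)
        if (n->key == key)
            return true;
    return false;
}

// Release all nodes.
void chained_clear(chained_table &t) {
    for (int i = 0; i < TABLE_SIZE; i++) {
        while (t.buckets[i] != nullptr) {
            chain_node *next = t.buckets[i]->next;
            delete t.buckets[i];
            t.buckets[i] = next;
        }
    }
}
```

Keys 13 and 23 both hash to index 3 and simply share that list, so a collision costs a short list scan rather than a failed insertion.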

Closed hashing is another collision resolution technique. In this approach, the collision values are stored at a different location in the hash table itself. The hashing scheme defines how that alternative location is chosen, and the choice depends on the data and the programming task. Whatever algorithm is used to resolve a collision during insertion must then be used when searching to find the item required.
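One widely used closed hashing scheme is linear probing, in which a colliding value is stored in the next free slot after its home position. The text does not name a specific scheme, so the sketch below is only one possible choice; the table size, the EMPTY marker and the function names are assumptions, and deletion is left out.

```cpp
#include <cassert>

const int SLOTS = 8;
const int EMPTY = -1;   // marks an unused slot, so keys must be non-negative

struct probe_table {
    int slots[SLOTS];
};

void probe_init(probe_table &t) {
    for (int i = 0; i < SLOTS; i++)
        t.slots[i] = EMPTY;
}

// Store key and return the slot used, or -1 if the table is full.
int probe_insert(probe_table &t, int key) {
    assert(key >= 0);
    int start = key % SLOTS;                  // home position from the hash function
    for (int i = 0; i < SLOTS; i++) {
        int idx = (start + i) % SLOTS;        // try the following slots, wrapping around
        if (t.slots[idx] == EMPTY || t.slots[idx] == key) {
            t.slots[idx] = key;
            return idx;
        }
    }
    return -1;
}

// Search follows exactly the same probe sequence as insertion.
int probe_find(const probe_table &t, int key) {
    assert(key >= 0);
    int start = key % SLOTS;
    for (int i = 0; i < SLOTS; i++) {
        int idx = (start + i) % SLOTS;
        if (t.slots[idx] == EMPTY)
            return -1;                        // key would have been stored here
        if (t.slots[idx] == key)
            return idx;
    }
    return -1;
}
```

Keys 3, 11 and 19 all have home position 3 in a table of 8 slots, so they end up in slots 3, 4 and 5, and a search for any of them walks the same sequence.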

4.6 Check your progress

1. Explain the importance of searching in a computing environment.
2. Write a program for sequential search in arrays using classes.
3. Write a program for binary search in arrays using classes.
4. Calculate the running time of the sequential and binary search algorithms.


BLOCK 2

Unit 1 Sorting Techniques and Recursion

Structure

1.0 Unit Objective
1.1 Introduction to Sorting
    1.1.1 Insertion Sort
    1.1.2 Bubble Sort
    1.1.3 Selection Sort
    1.1.4 Shell Sort
    1.1.5 QuickSort
    1.1.6 Merge Sort
    1.1.7 Heap Sort
1.2 Recursion
    1.2.1 Recursive functions
    1.2.2 Example: Factorial
    1.2.3 Fibonacci Numbers
    1.2.4 Recursively Defined Lists
1.3 Analysis of Sorting
1.4 Check your progress

1.0 Unit Objective

After going through this unit, you should be able to understand:

Sorting basics
Working of various sorting techniques
Bubble, Insertion, Selection sort etc.
Divide and conquer technique
Quick sort and merge sort algorithm
Introduction to recursion
Analysis of sorting technique

1.1 Introduction to Sorting

Finding better algorithms to sort a given set of data is an ongoing problem in the field of computer science. Sorting is placing a given set of data in a particular order. Simple sorts place data in ascending or descending order. For discussion purposes, we will look at sorting data in ascending order. However, you may modify the code to sort the data in descending order by reversing the relational operators (i.e. change 'nums[j] < nums[j-1]' to 'nums[j] > nums[j-1]').

In this unit we will analyze sorts of different efficiency, and discuss when and where they can be used. In order to simplify the explanation of certain algorithms, we will assume a swap() function exists that switches the values of two variables. An example of such a function for int variables is displayed below.

void swap(int &item1, int &item2)
//reference parameters, point directly to the storage location of the variables passed. Local copies are not made, and these values are saved after the function life span ends.
{
    int tmp;
    tmp = item1;
    item1 = item2;
    item2 = tmp;
}

We will first analyze sorts that are O(N^2). These sorts are very easy to understand; however, they are very slow when there are a lot of elements to be sorted.

1.1.1 Insertion Sort

The first sort we will look at is called the insertion sort. The algorithm processes each element in turn, and compares it to the elements before it. The first element has no elements before it for comparison, so it is left alone. In the next iteration, the second element is evaluated. It is compared to the element directly before it, which is the first element in the structure. If the second element has a value less than the first, their positions are switched. If the second element is more than the first, then they are left as they are, and the third element is processed. The third element is then compared to the second element in the new list ('new' list here since the first two items may have been swapped). If it is less than the second, then they are swapped, and it is then compared to the first element. If it is more than the second element, it is left in place and the process continues to the next element.

In short, each element is moved toward the front of the list by switching positions with the previous elements as long as it is smaller than the element before it.

The algorithm is programmed using two nested for loops. The first loop creates n-1 iterations, where n is the number of elements in the list. Since element[0] does not have any elements before it to compare to, we start with the second element. The nested loop statement starts at the element that is being processed by the first loop, and works backwards, comparing the element to the one before it. If it is smaller, a swap is made and the loop continues. If it is larger, the loop ends, and the next iteration begins in the outer loop. The code for the insertion sort algorithm is shown below. A standard array of int variables is used for simplicity. However, you can modify the code to work for any linear structure.

void InsertionSort(int *nums, int n) //array called nums with n elements to be sorted
{
    for (int i=1; i<n; i++)
        for (int j=i; (j>0) && (nums[j]<nums[j-1]); j--)
            swap(nums[j], nums[j-1]);   //swap the elements themselves, not their indices
}

1.1.2 Bubble Sort

The next O(N^2) algorithm that we will analyze is the bubble sort. The bubble sort works from the bottom up (back to front), and compares each element to the one before it. If the element on the bottom has a smaller value than the one above it, the two are swapped; if not, they remain in their original position. The algorithm then compares the next two elements from the bottom up, no matter what the outcome of the previous comparison was. In this fashion, the smallest value "bubbles up" to the top in each iteration. In subsequent comparisons, the values that were bubbled up in previous iterations are no longer compared, since they are in place.

The code for the bubble sort is shown below, using a standard array of int variables. Two nested for loops are used. The outer loop has n-1 iterations, where n is the number of elements. In each iteration, at least one element is set into its proper sorted position. The inner for loop runs from the bottom up, comparing adjacent values, and stops at the group of values that have already been set in place, this position being one more in each iteration of the outer loop.

void BubbleSort(int *nums, int n)
{
    for (int i=0; i<n-1; i++)
        for (int j=n-1; j>i; j--)
            if (nums[j] < nums[j-1])
                swap(nums[j], nums[j-1]);   //swap the elements themselves, not their indices
}

1.1.3 Selection Sort

The selection sort will be the final O(N^2) sorting algorithm that we will look at. In a selection sort, the entire list is searched to find the smallest value. That is, we compare every element in the structure to find the smallest value, and then swap it with the first item. Then, every element but the first is searched to find the smallest value out of that group, and it is then swapped with the item in the second position. This continues until all items are in the correct order.

This technique is similar to what would be done if a person were sorting a list of items by hand. The list is searched for the smallest value, which is then crossed out and written as the first item in a new list. The computer algorithm is the same, however, in order to preserve memory and not have to make two lists, we use a swap operation.

The selection sort uses two nested for loops, the outer having n-1 iterations. When there is only one item left, it will appear in its correct position, last in the structure. The inner for loop searches the unsorted portion of structure (from bottom-top) by assuming the first element in the unsorted section is the smallest, and then comparing it to each element in turn. If a smaller element is found, it is considered to be the smallest, and compared to the rest of the elements. The code for selection sort is shown below.

void SelectionSort(int *nums, int n)
{
    int low; //holds the index of the smallest element in the unsorted portion
    for (int i=0; i<n-1; i++)
    {
        low = i; //assume the first item in the unsorted section is the lowest, unless a smaller value is found
        for (int j=n-1; j>i; j--)
        {
            if (nums[j] < nums[low]) //if element has smaller value than low
                low = j; //it then becomes the new low
        }
        swap(nums[i], nums[low]); //switch the current position item with the smallest in the unsorted portion
    }
}

The algorithms that follow are generally faster on large lists; several of them run in O(N log N) time.

3.5.1

Shell Sort

The next sorting algorithm that will be covered is the Shell sort, named after its creator D.L. Shell. It compares and swaps elements that are far apart, rather than only nearby ones, and takes an approach similar to "divide and conquer": the list is divided into many sublists, each sublist is sorted, and the sublists are then recombined. The Shell sort takes advantage of the insertion sort, which is very efficient in its best case (that is, when the list being sorted is already 'near sorted').

The Shell sort divides the list into n/2 sublists, the elements of each being n/2 apart. For instance, in a list of ten elements, the first iteration would consist of five lists of two elements each. The first list would be list[0] and list[5], the second would be list[1] and list[6], and so on. Each of these lists is then sorted using an insertion sort.

During the next iteration, we divide the list into fewer, bigger sublists, with the elements being closer together. In the second iteration, there are n/4 lists, with the elements n/4 apart. These lists are then sorted using insertion sort, and so on. The process continues with half as many lists in each iteration as in the one before. With each iteration, the list becomes closer to being sorted. The last sort is done on the entire list, using a standard insertion sort. Since the list should be 'near sorted' by then, that final pass is very efficient.

Note that during some iterations the sublists will contain unequal numbers of elements, since the number of sublists does not evenly divide the total number of elements. Remember that in integer division the fractional part of the answer is dropped, e.g. 5/2 = 2 and 7/2 = 3. Therefore, if we have seven elements, there are three lists during the first iteration. One contains three elements, and the other two each contain two elements.

{6,3,1,5,2,4,9} -> {6,5,9}, {3,2}, {1,4}

In the example, list one starts at list[0] (6) and includes every item three positions apart. The next list starts at list[1] (3) and also includes every item three positions apart from its position. The third list starts at list[2] (1); since there is no item three locations after the '4', that list has only two items.
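The way the sublists are picked out can be sketched in a few lines. This is only an illustration: the helper name sublist is hypothetical and does not appear in the text, and a std::vector is assumed in place of a raw array.

```cpp
#include <vector>

// Hypothetical helper (not from the text): collects the elements of the
// sublist that starts at index `start` and takes every `incr`-th item.
std::vector<int> sublist(const std::vector<int>& list, int start, int incr)
{
    std::vector<int> out;
    for (int k = start; k < static_cast<int>(list.size()); k += incr)
        out.push_back(list[k]);
    return out;
}
```

With the seven-element list {6,3,1,5,2,4,9} and an increment of 7/2 = 3, the calls sublist(list,0,3), sublist(list,1,3) and sublist(list,2,3) give {6,5,9}, {3,2} and {1,4}.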

The code for Shell sort is simple and straightforward. It uses a slightly modified version of the insertion sort. Since the elements of the sublists to be sorted are not adjacent, an increment parameter is added, which specifies how far apart the elements of the sublists are. The value '1' in the original insertion sort code is replaced by the increment, since the code now compares values that are increment apart.

void InSortShell(int *nums, int n, int incr) //array called nums with n elements to be sorted, incr apart
{
    for(int i=incr; i<n; i+=incr)
        for(int j=i; (j>=incr) && (nums[j] < nums[j-incr]); j-=incr)
            swap(nums[j],nums[j-incr]);
}

void ShellSort(int *nums, int n)
{
    for(int i=n/2; i>2; i/=2) //each iteration halves the distance between elements, so there are half as many sublists; increments of 2 or less are left to the final pass
        for(int j=0; j<i; j++) //sort each sublist
            InSortShell(&nums[j],n-j,i); //the first element of each sublist begins at j, and therefore the entire list is j items shorter (n-j). The elements of the sublist are i apart.
    InSortShell(nums,n,1); //do a standard insertion sort on the now nearly sorted list.
}

3.5.1

QuickSort

QuickSort is the quickest of these algorithms in the average case; however, it has a very bad running time, O(N^2), in the worst case, which occurs when the chosen pivots repeatedly split the list very unevenly. QuickSort takes a "divide and conquer" approach. A value called a pivot is first selected; usually it is the value of the middle element. A "partition" of the array is then performed. Any elements that are less than the pivot value are moved to the beginning of the list, followed by the pivot element, and then all values that are bigger appear at the end. The elements within each 'sublist' do not need to be sorted in any way with respect to each other, but the arrangement of smaller values, pivot, larger values must be maintained. The QuickSort algorithm is then applied to each sublist, through recursion. This continues until the structure has been sorted.
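To make the idea of a partition concrete before looking at a hand-written version, here is a small sketch that leans on std::partition from the standard <algorithm> header. The helper name partitionAround is hypothetical, and the pivot is passed as a plain value rather than chosen from the middle of the list.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper: moves every value smaller than pivotValue to the
// front of v and returns how many such values there are.
int partitionAround(std::vector<int>& v, int pivotValue)
{
    auto firstOfRight = std::partition(v.begin(), v.end(),
                                       [pivotValue](int x) { return x < pivotValue; });
    return static_cast<int>(firstOfRight - v.begin());
}
```

After the call, the elements before the returned position are all smaller than the pivot value and the rest are not; neither group is sorted internally, which is exactly the state QuickSort then recurses on.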

Let's take a look at how the partitioning algorithm is implemented. The algorithm starts at each end of the sublist being analyzed and moves inward from each end. First, it uses a while loop to find the first value from the left that is not less than the pivot. Then, it uses another while loop to find the first value from the right that is not greater than the pivot. It swaps these two values, and continues the process until the left position and right position meet or cross somewhere in the center. When the algorithm is finished, every element at the left position and after is greater than or equal to the pivot, and the elements before it are less than or equal to the pivot. The code for the partition algorithm is shown below.

int part(int *nums, int left, int right, int pivot)
{
    while(left <= right)
    {
        while(left <= right && nums[left] < pivot) //find the next position from the left that is not less than the pivot
            left++;
        while(left <= right && nums[right] > pivot) //find the next position from the right that is not greater than the pivot
            right--;
        if(left >= right) //the positions have met or crossed, so nothing is left to swap
            break;
        swap(nums[left],nums[right]); //swap these two values
        left++; //step past the swapped pair so that values equal to the pivot cannot stall the loop
        right--;
    }
    return left; //first position of the right sublist
}

The pivot of each list is simply the middle element.

int getpivot(int left, int right) //the left and right indices of the sublist
{
    return (left+right)/2;
}

Now let's take a look at how QuickSort brings everything together. The left and right indices of the sublist must be passed into the QuickSort() function. Recursion is used by the algorithm to run QuickSort on each partition, which is why these values are required as parameters. On the initial call to sort an array, 0 would be used for the left index, and n-1 for the right index. Once a sublist has been partitioned, its pivot value is locked in its correct final position. Note that the pivot itself must be kept out of the partition step for the partition algorithm to work correctly.
To accomplish this, we swap the pivot with the last element, and run the partition only up to right-1. After the partition, the pivot is swapped with the element at the first position of the right sublist. Now, all the elements to its left are less than or equal to the pivot, and all those to its right are greater than or equal to the pivot.

void QuickSort(int *nums, int left, int right)
{
    if(left >= right) //a sublist with fewer than two elements is already sorted
        return;
    int pivot = getpivot(left,right); //find the pivot
    swap(nums[pivot],nums[right]); //move the pivot to the last position
    int r_sublist = part(nums,left,right-1,nums[right]); //partition left->right-1, excluding the pivot
    swap(nums[r_sublist],nums[right]); //move the pivot into its proper position
    if((r_sublist - left) > 1) //if a left sublist with more than one element exists, sort it
        QuickSort(nums,left,r_sublist-1);
    if((right - r_sublist) > 1) //if a right sublist with more than one element exists, sort it
        QuickSort(nums,r_sublist+1,right);
}

3.5.1

Merge Sort

The merge sort is another algorithm which takes a "divide and conquer" approach. It begins by dividing a list into two sublists, and then recursively divides each of those sublists until there are sublists with one element each. These sublists are then combined using a simple merging technique.

In order to combine two lists, the first value of each is compared, and the smaller value is added to the output list. This process continues until one of the lists has been exhausted, at which point the remainder of the other list is simply appended to the output list. Adjacent sublists are combined in this way at each level, until all the elements are merged back into a single list.
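The merging step on its own can be sketched as below. This is an illustration only: mergeSorted is a hypothetical name, and it works on two separate std::vector objects instead of two halves of one array.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: merges two already-sorted vectors into one sorted vector.
std::vector<int> mergeSorted(const std::vector<int>& a, const std::vector<int>& b)
{
    std::vector<int> out;
    out.reserve(a.size() + b.size());
    std::size_t i = 0, j = 0;
    while (i < a.size() && j < b.size()) {
        // compare the first remaining value of each list, take the smaller
        if (a[i] <= b[j]) out.push_back(a[i++]);
        else              out.push_back(b[j++]);
    }
    // one list is exhausted: append the remainder of the other
    while (i < a.size()) out.push_back(a[i++]);
    while (j < b.size()) out.push_back(b[j++]);
    return out;
}
```

Merging {1,4,9} with {2,3,10} this way yields {1,2,3,4,9,10}.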

The code for merge sort is straightforward. The algorithm uses two arrays to accomplish the task. After the two halves of a sublist have been sorted recursively, the items are copied to a temporary array, from which they are merged. The original array acts as the output array, and will contain the sorted list at the end. The parameters of the MergeSort() function consist of these two arrays, as well as the left and right boundaries of the list to be sorted. These are required since the MergeSort() function is recursive, and applies the algorithm to sublists within the array. On the initial call, left will have a value of 0, and right will have a value of n-1.

void MergeSort(int *nums, int *tmp, int left, int right)
{
    if (left >= right) return; //the sublist has at most one element, and cannot be further split
    int mid = (left+right)/2;
    MergeSort(nums,tmp,left,mid); //sort the first half
    MergeSort(nums,tmp,mid+1,right); //sort the second half
    //copy the sublist into the temporary array
    for(int i = left; i<=right; i++)
        tmp[i] = nums[i];
    //merge the lists
    int l = left; //the first element in the first sublist
    int r = mid + 1; //the first element in the second sublist
    for(int j=left; j<=right; j++)
    {
        if(l == mid+1) //if the index of the left list is equal to the first element of the right list, the left list has ended. Insert next item from right list.
            nums[j] = tmp[r++];
        else if (r > right) //if the index of the right sublist has exceeded its right boundary, the right list has ended. Insert next item from the left list.
            nums[j] = tmp[l++];
        else if(tmp[l] < tmp[r]) //if two lists exist, and the current element in the left is smaller than the right, insert it into the next position in the output array and move to the next element in the left list.
            nums[j] = tmp[l++];
        else //the current element in the right sublist is not larger than the one in the left sublist, insert it into the output array and move to the next element in the right list.
            nums[j] = tmp[r++];
    }
}

3.5.1

Heap Sort

The next algorithm, HeapSort, is simple to implement. The array to be sorted is first arranged as a heap structure. Then, a loop is used to remove each element of the heap. Remember from the lesson on heaps that when an element is removed, it is still part of the physical array: it is swapped with the last item of the heap. The process continues, and each time the largest remaining item is moved to the end of the heap (which is directly before the item discarded in the previous iteration, since the heap becomes smaller). This approach requires a slightly modified version of the heap class. The changes are the constructor that takes a pointer and a size, the array pointer member, and the BuildHeap() function.

const int MAX_SIZE = 100;

template <class ItemType>
class Heap
{
public:
    Heap(ItemType*,int);
    int left(int) const;
    int right(int) const;
    int parent(int) const;
    void insert(const ItemType&);
    ItemType remove_max();
    bool IsEmpty() const;
    bool IsFull() const;
    int count() const;
    ItemType value(int) const;
private:
    ItemType *array;
    int elements; //how many elements are in the heap
    void ReHeap(int);
    void BuildHeap();
};

This version of the heap class accepts a pointer to the array to be sorted in the class constructor. The second parameter of the constructor is the size of the array. The array member is declared as a pointer (which will point to the array passed to the constructor), and the function BuildHeap() has been added, which will make the array into a heap.

template <class ItemType>
Heap<ItemType>::Heap(ItemType *array_ptr, int size)
{
    array = array_ptr;
    elements = size;
    BuildHeap();
}

The BuildHeap() function begins at the last non-leaf node (the first one met when moving up from the bottom of the array) and works up the array, making each subtree into a heap with the ReHeap() function. Since leaves can't travel down any further, they do not need to be processed. Instead, they will fall into their proper place by being exchanged with a node on a higher level if necessary, when that node comes down. By working up, the subtrees of a node are made into heaps first, which allows the ReHeap() function to be used. The ReHeap() function relies on the fact that the node's subtrees are heaps. This is because it compares the node to its children, and makes a switch if necessary. If the subtrees did not follow the heap property, larger items on lower levels would never be brought to the top, since a comparison is made with the node's children and not the parent.

template <class ItemType>
void Heap<ItemType>::BuildHeap()
{
    for(int j = elements/2 - 1; j >= 0; j--)
        ReHeap(j);
}
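ReHeap() itself is not listed in this section. As a rough sketch of what such a routine might do, the code below works on a plain int array organised as a max-heap. The names siftDown and heapSortSketch are hypothetical stand-ins, not members of the Heap class above, and the real ReHeap() may differ in detail.

```cpp
#include <utility>

// Hypothetical stand-in for ReHeap(): sifts the value at index `root` down
// until neither child within the first `count` items is larger (max-heap).
void siftDown(int* a, int count, int root)
{
    while (2 * root + 1 < count) {
        int child = 2 * root + 1;                 // left child
        if (child + 1 < count && a[child + 1] > a[child])
            child++;                              // right child is larger
        if (a[root] >= a[child])
            return;                               // heap property holds
        std::swap(a[root], a[child]);
        root = child;
    }
}

// Sketch of the whole sort: build the heap bottom-up, then repeatedly move
// the maximum to the end of the shrinking heap.
void heapSortSketch(int* a, int n)
{
    for (int j = n / 2 - 1; j >= 0; j--)          // same loop shape as BuildHeap()
        siftDown(a, n, j);
    for (int last = n - 1; last > 0; last--) {
        std::swap(a[0], a[last]);                 // largest item goes to the end
        siftDown(a, last, 0);                     // restore the heap on the rest
    }
}
```

Because the removed maximum is parked directly behind the shrinking heap, the array ends up in ascending order without any extra storage.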


The heap sort algorithm makes the array to be sorted into a heap, and then follows the above procedure.

template <class ItemType>
void HeapSort(ItemType *array, int size)
{
    Heap<ItemType> sort(array, size);
    for(int j = 0; j < size; j++)
        sort.remove_max();
}

Note that although algorithms such as QuickSort may seem to be the only ones that should be used, this is not always the case. If you know that the array to be sorted is very small, an O(N^2) algorithm is much simpler to implement, and the difference in running time will not be significant.

1.2

Recursion

Many examples of the use of recursion may be found: the technique is useful both for the definition of mathematical functions and for the definition of data structures. Naturally, if a data structure may be defined recursively, it may be processed by a recursive function!

1.2.1

Recursive functions

Many mathematical functions can be defined recursively: 

factorial

Fibonacci

Euclid's GCD (greatest common divisor)

Fourier Transform
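As an illustration of one item from the list above, Euclid's algorithm can be written as a short recursive function. The name gcdRecursive is hypothetical, and unsigned arguments are assumed so that negative inputs need no special handling.

```cpp
// Hypothetical helper: Euclid's algorithm written recursively.
// gcd(a, 0) = a; otherwise gcd(a, b) = gcd(b, a mod b).
unsigned gcdRecursive(unsigned a, unsigned b)
{
    if (b == 0)
        return a;                   // termination condition
    return gcdRecursive(b, a % b);  // recursive step on a smaller pair
}
```

For example, gcdRecursive(48, 18) works through the pairs (18, 12), (12, 6) and (6, 0) before returning 6.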

Many problems can be solved recursively, e.g. games of all types, from simple ones like the Towers of Hanoi problem to complex ones like chess. In games, recursive solutions are particularly convenient because, having solved the problem by a series of recursive calls, you want to find out how you got to the solution. If you keep track of the move chosen at each point, the program call stack does this housekeeping for you! This is explained in more detail later.

1.2.2

Example: Factorial

One of the simplest examples of a recursive definition is that for the factorial function: factorial( n ) = if ( n = 0 ) then 1 else n * factorial( n- 1 )

A natural way to calculate factorials is to write a recursive function which matches this definition:

int fact( int n )
{
    if ( n == 0 ) return 1;
    else return n*fact(n-1);
}

Note how this function calls itself to evaluate the next term. Eventually it will reach the termination condition and exit. However, before it reaches the termination condition, it will have pushed n stack frames onto the program's run-time stack.

The termination condition is obviously extremely important when dealing with recursive functions. If it is omitted, then the function will continue to call itself until the program runs out of stack space, usually with moderately unpleasant results! Failure to include a correct termination condition in a recursive function is a recipe for disaster!

1.2.3

Fibonacci Numbers

Another commonly used example of a recursive function is the calculation of Fibonacci numbers. Following the definition:

fib( n ) = if ( n = 0 ) then 1
           if ( n = 1 ) then 1
           else fib( n-1 ) + fib( n-2 )

one can write:

int fib( int n )
{
    if ( (n == 0) || (n == 1) ) return 1;
    else return fib(n-1) + fib(n-2);
}

Short and elegant, it uses recursion to provide a neat solution - that is actually a disaster! We shall revisit this and show why it is such a disaster later.
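One way to get a feel for the problem is simply to count calls. The sketch below assumes the same definition, with fib(0) = fib(1) = 1; the counter callCount, the instrumented fibCounted and the iterative fibIter are hypothetical additions for illustration.

```cpp
long callCount = 0; // hypothetical counter, incremented on every call to fibCounted

// Same recursion as fib() above, instrumented to count its own calls.
long fibCounted(int n)
{
    ++callCount;
    if (n == 0 || n == 1) return 1;
    return fibCounted(n - 1) + fibCounted(n - 2);
}

// Iterative version: one addition per step, no recursion.
long fibIter(int n)
{
    long prev = 1, cur = 1; // fib(0) and fib(1)
    for (int i = 2; i <= n; ++i) {
        long next = prev + cur;
        prev = cur;
        cur = next;
    }
    return cur;
}
```

With callCount reset to 0, fibCounted(20) returns 10946 but needs 21891 calls to get there, because the same sub-results are recomputed over and over; fibIter(20) reaches the same value with 19 additions.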

Data structures may also be recursively defined. One of the most important classes of structure, trees, allows recursive definitions which lead to simple (and efficient) recursive functions for manipulating them.

1.2.4

Recursively Defined Lists

We can define a list as:
a. empty or
b. containing a node and a link to a list.

A list can be scanned using a recursive function, e.g. to count the number of items in a list:

int ListCount( List l )
{
    if ( l == NULL ) return 0;
    else return 1 + ListCount( l->next );
}

However, it turns out to be much faster to write this function without the recursive call:

int ListCount( List l )
{
    int cnt = 0;
    while ( l != NULL ) {
        cnt++;
        l = l->next;
    }
    return cnt;
}

The overhead of calling a function is significant on most machines, so the second, iterative version executes faster. (Another factor is that modern machines rely heavily on the cache for performance: the iterative code of the second version doesn't use so much memory for the call stack and so makes much better use of the cache.)

1.3

Analysis of Sorting

Bubble, Selection, Insertion Sorts

There are a large number of variations of one basic strategy for sorting. It's the same strategy that you use for sorting your bridge hand. You pick up a card, start at the beginning of your hand and find the place to insert the new card, insert it and move all the others up one place.

/* Insertion sort for integers */
void insertion( int a[], int n ) {
/* Pre-condition: a contains n items to be sorted */
    int i, j, v;
    /* Initially, the first item is considered 'sorted' */
    /* i divides a into a sorted region, x<i, and an unsorted one, x >= i */
    for(i=1;i<n;i++) {
        /* Select the item at the beginning of the as yet unsorted section */
        v = a[i];
        /* Work backwards through the array, finding where v should go */
        j = i;
        /* If this element is greater than v, move it up one */
        while ( a[j-1] > v ) {
            a[j] = a[j-1]; j = j-1;
            if ( j <= 0 ) break;
        }
        /* Stopped when a[j-1] <= v or the start was reached, so put v at position j */
        a[j] = v;
    }
}

Bubble Sort

Another variant of this procedure, called bubble sort, is commonly taught:

/* Bubble sort for integers */
#define SWAP(a,b) { int t; t=a; a=b; b=t; }

void bubble( int a[], int n )
/* Pre-condition: a contains n items to be sorted */
{
    int i, j;
    /* Make n passes through the array (the last pass has nothing left to compare) */
    for(i=0;i<n;i++) {
        /* From the first element to the end of the unsorted section */
        for(j=1;j<(n-i);j++) {
            /* If adjacent items are out of order, swap them */
            if( a[j-1]>a[j] ) SWAP(a[j-1],a[j]);
        }
    }
}

Analysis

Each of these algorithms requires n-1 passes: each pass places one item in its correct place. (The nth item is then in the correct place also.) The ith pass makes either i or n - i comparisons and moves. So:

$$T(n) = \sum_{i=1}^{n-1} i = \frac{n(n-1)}{2}$$

or O(n^2) - but we already know we can use heaps to get an O(n log n) algorithm. Thus these algorithms are only suitable for small problems where their simple code makes them faster than the more complex code of the O(n log n) algorithm. As a rule of thumb, expect to find an O(n log n) algorithm faster for n > 10, but the exact value depends very much on individual machines!

Quick Sort performance The recursive calls in quick sort are generally expensive on most architectures - the overhead of any procedure call is significant and reasonable improvements can be obtained with equivalent iterative algorithms. Two things can be done to eke a little more performance out of your processor when sorting: a. Quick sort - in its usual recursive form - has a reasonably high constant factor relative to a simpler sort such as insertion sort. Thus, when the partitions become small (n < ~10), a switch to insertion sort for the small partition will usually show a measurable speed-up.
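The switch to insertion sort for small partitions could look roughly like this. It is a sketch under stated assumptions: the names hybridQuickSort and insertionRange and the constant CUTOFF are hypothetical, the threshold of 10 is only the rule of thumb quoted above, and the partition uses the middle element as pivot rather than any scheme from the text.

```cpp
#include <utility>

const int CUTOFF = 10; // assumed threshold; the best value is machine-dependent

// Insertion sort on the closed range a[low..high].
void insertionRange(int a[], int low, int high)
{
    for (int i = low + 1; i <= high; i++) {
        int v = a[i];
        int j = i;
        while (j > low && a[j - 1] > v) {
            a[j] = a[j - 1];
            j--;
        }
        a[j] = v;
    }
}

// Quick sort that hands small partitions to insertion sort.
void hybridQuickSort(int a[], int low, int high)
{
    if (high - low + 1 < CUTOFF) {
        insertionRange(a, low, high);
        return;
    }
    int pivot = a[low + (high - low) / 2];
    int i = low, j = high;
    while (i <= j) {                      // two-pointer partition around pivot
        while (a[i] < pivot) i++;
        while (a[j] > pivot) j--;
        if (i <= j) {
            std::swap(a[i], a[j]);
            i++;
            j--;
        }
    }
    hybridQuickSort(a, low, j);           // lower part
    hybridQuickSort(a, i, high);          // upper part
}
```

Whether this is actually faster than plain quick sort on a given machine has to be measured; the cutoff is a tuning parameter, not a fixed rule.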

Heap Sort

We know that heaps provide a means of sorting:
1 construct a heap,
2 add each item to it (maintaining the heap property!),
3 when all items have been added, remove them one by one (restoring the heap property as each one is removed).

Addition and deletion are both O(log n) operations. We need to perform n additions and deletions, leading to an O(n log n) algorithm. We will look at another efficient sorting algorithm, Quicksort, and then compare it with Heap sort.
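The three steps above can be tried out with the heap that the C++ standard library already provides. This is a sketch only: heapSortWithQueue is a hypothetical name, and std::priority_queue stands in for the heap class developed in this course.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper: sorts v into ascending order by adding every item to
// a heap and then removing the items one by one.
void heapSortWithQueue(std::vector<int>& v)
{
    // std::priority_queue is a max-heap by default, so std::greater makes
    // the smallest item come out first.
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int x : v)
        heap.push(x);            // n additions, O(log n) each
    for (int& slot : v) {
        slot = heap.top();       // smallest remaining item
        heap.pop();              // n deletions, O(log n) each
    }
}
```

Unlike the in-place HeapSort shown earlier, this version needs a second container of n items, which is the price of using the ready-made heap.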

Quick Sort

Quicksort is a very efficient sorting algorithm invented by C.A.R. Hoare. It has two phases:
a. the partition phase and
b. the sort phase.

As we will see, most of the work is done in the partition phase - it works out where to divide the work. The sort phase simply sorts the two smaller problems that are generated in the partition phase.


This makes Quicksort a good example of the divide and conquer strategy for solving problems. In quicksort, we divide the array of items to be sorted into two partitions and then call the quicksort procedure recursively to sort the two partitions, i.e. we divide the problem into two smaller ones and conquer by solving the smaller ones. Thus the conquer part of the quicksort routine consists of just two recursive calls, one for each partition.
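The conquer step can be sketched as follows. This is a minimal sketch rather than the text's own listing: quicksortSketch and partitionSketch are hypothetical names, the array is assumed to hold ints, and the partition shown simply uses the first element as the pivot.

```cpp
#include <utility>

// Hypothetical partition: uses a[low] as the pivot and returns its final index.
int partitionSketch(int a[], int low, int high)
{
    int pivotItem = a[low];
    int boundary = low;                  // last index holding an item < pivot
    for (int k = low + 1; k <= high; k++)
        if (a[k] < pivotItem)
            std::swap(a[++boundary], a[k]);
    std::swap(a[low], a[boundary]);      // put the pivot between the two parts
    return boundary;
}

void quicksortSketch(int a[], int low, int high)
{
    if (high > low) {                                // termination condition
        int pivot = partitionSketch(a, low, high);   // divide
        quicksortSketch(a, low, pivot - 1);          // conquer the lower part
        quicksortSketch(a, pivot + 1, high);         // conquer the upper part
    }
}
```

To sort an array of n items the initial call would be quicksortSketch(a, 0, n-1).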

To do this, we choose a pivot element and arrange that all the items in the lower part are less than the pivot and all those in the upper part greater than it. In the most general case, we don't know anything about the items to be sorted, so that any choice of the pivot element will do - the first element is a convenient one.

Quick Sort: Partition in place

Most implementations of quick sort make use of the fact that you can partition in place by keeping two pointers: one moving in from the left and a second moving in from the right. They are moved towards the centre until the left pointer finds an element greater than the pivot and the right one finds an element not greater than the pivot. These two elements are then swapped. The pointers are then moved inward again until they "cross over". The pivot is then swapped into the slot to which the right pointer points and the partition is complete.

int partition( int a[], int low, int high )
{
    int left, right;
    int pivot_item;
    pivot_item = a[low];
    left = low;
    right = high;
    while ( left < right ) {
        /* Move left while item <= pivot (never past the end of the range) */
        while( left < high && a[left] <= pivot_item ) left++;
        /* Move right while item > pivot */
        while( a[right] > pivot_item ) right--;
        /* SWAP is the two-argument macro defined for the bubble sort */
        if ( left < right ) SWAP(a[left],a[right]);
    }
    /* right is final position for the pivot */
    a[low] = a[right];
    a[right] = pivot_item;
    return right;
}

Analysis

The partition routine examines every item in the array at most once, so it is clearly O(n). Usually, the partition routine will divide the problem into two roughly equal sized partitions. We know that we can divide n items in half log2 n times. This makes quicksort an O(n log n) algorithm on average - equivalent to heapsort.
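The argument above can be written as a recurrence. As a sketch, assume that every partition splits its range exactly in half, that partitioning n items costs cn for some constant c, and that n is a power of two (all idealisations):

```latex
\begin{aligned}
T(n) &= 2\,T(n/2) + cn \\
     &= 4\,T(n/4) + 2cn \\
     &= \dots \\
     &= 2^{k}\,T(n/2^{k}) + k\,cn \\
     &= n\,T(1) + cn\log_2 n \qquad (k = \log_2 n)
\end{aligned}
```

so T(n) is O(n log n). If instead the pivot is always the smallest item, one partition is empty and the recurrence becomes T(n) = T(n-1) + cn, which sums to O(n^2).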

1.4

Check your progress

5. Write a program in C++ to sort an array of strings using selection sort.

6. Write a program in C++ to sort an array of strings using insertion sort.

7. Analyse the performance of the Quick sort algorithm.

8. Prepare a table of all sorting techniques along with their complexities.


Unit 2 Arrays and Pointer handling

Structure

2.0 Unit Objective
2.1 Introduction
2.2 Arrays
    2.2.1 Multidimensional Arrays
2.3 Pointers
    2.3.1 Function Pointers
2.4 References
2.5 Typedefs
2.6 Check your progress

2.0

Unit Objective

After going through this unit, you should be able to understand:

Concept of array, pointer, and reference data types

Illustration of their use for defining variables.

Use of pointers etc. in various popular algorithms

Implementation of Referencing

2.1

Introduction

An array consists of a set of objects (called its elements), all of which are of the same type and are arranged contiguously in memory. In general, only the array itself has a symbolic name, not its elements. Each element is identified by an index which denotes the position of the element in the array. The number of elements in an array is called its dimension. The dimension of an array is fixed and predetermined; it cannot be changed during program execution.

Arrays are suitable for representing composite data which consist of many similar, individual items. Examples include: a list of names, a table of world cities and their current temperatures, or the monthly transactions for a bank account.

A pointer is simply the address of an object in memory. Generally, objects can be accessed in two ways: directly by their symbolic name, or indirectly through a pointer. The act of getting to an object via a pointer to it, is called dereferencing the pointer. Pointer variables are defined to point to objects of a specific type so that when the pointer is dereferenced, a typed object is obtained.

Pointers are useful for creating dynamic objects during program execution. Unlike local objects, which are allocated storage on the runtime stack (global objects are given storage for the whole run of the program), a dynamic object is allocated memory from a different storage area called the heap. Dynamic objects do not obey the normal scope rules. Their lifetime is explicitly controlled by the programmer.

A reference provides an alternative symbolic name (alias) for an object. Accessing an object through a reference is exactly the same as accessing it through its original name. References offer the power of pointers and the convenience of direct access to objects. They are used to support the call-by-reference style of function parameters, especially when large objects are being passed to functions.
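A short sketch of both uses of references mentioned above, aliasing and call-by-reference. The function names swapByReference and aliasDemo are hypothetical and only serve the illustration.

```cpp
// Hypothetical helper: swaps two ints through references, so the caller's
// own variables are modified (call-by-reference).
void swapByReference(int& a, int& b)
{
    int t = a;
    a = b;
    b = t;
}

// Shows that a reference is just another name for the same object.
int aliasDemo()
{
    int num = 1;
    int& alias = num;  // alias refers to num; no new int is created
    alias = 42;        // writes to num
    return num;        // 42
}
```

No address-of or dereference operators appear at the call site: swapByReference(x, y) is written exactly as if the arguments were passed by value.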

2.2

Arrays


An array variable is defined by specifying its dimension and the type of its elements. For example, an array representing 10 height measurements (each being an integer quantity) may be defined as:

int heights[10];

The individual elements of the array are accessed by indexing the array. The first array element always has the index 0. Therefore, heights[0] and heights[9] denote, respectively, the first and last element of heights. Each of heights' elements can be treated as an integer variable. So, for example, to set the third element to 177, we may write:

heights[2] = 177;

Attempting to access a nonexistent array element (e.g., heights[-1] or heights[10]) is a serious error (called an index out of bounds error). C++ does not check array indices, so the program's behavior in that case is undefined: it may crash or silently corrupt other data.
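Where checked access is wanted, the standard library's std::array offers an at() member that throws std::out_of_range on a bad index. The sketch below assumes heights is stored in a std::array<int, 10> rather than a built-in array; the helper name isValidIndex is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Hypothetical helper: returns true if index i is valid for a 10-element
// array, using the bounds-checked at() member instead of operator[].
bool isValidIndex(const std::array<int, 10>& heights, int i)
{
    if (i < 0)
        return false;                       // at() takes an unsigned index
    try {
        (void)heights.at(static_cast<std::size_t>(i));
        return true;
    } catch (const std::out_of_range&) {
        return false;                       // at() throws on a bad index
    }
}
```

Built-in arrays have no such member, so with them the check has to be written by hand before indexing.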

Processing of an array usually involves a loop which goes through the array element by element. Listing 1 illustrates this using a function which takes an array of integers and returns the average of its elements.

Listing 1

const int size = 3;

double Average (int nums[size])
{
    double average = 0;

    for (int i = 0; i < size; ++i)
        average += nums[i];
    return average/size;
}

Like other variables, an array may have an initializer. Braces are used to specify a list of comma-separated initial values for array elements. For example,

int nums[3] = {5, 10, 15};

initializes the three elements of nums to 5, 10, and 15, respectively. When the number of values in the initializer is less than the number of elements, the remaining elements are initialized to zero:

int nums[3] = {5, 10};    // nums[2] initializes to 0

When a complete initializer is used, the array dimension becomes redundant, because the number of elements is implicit in the initializer. The first definition of nums can therefore be equivalently written as:

int nums[] = {5, 10, 15};    // no dimension needed

Another situation in which the dimension can be omitted is for an array function parameter. For example, the Average function above can be improved by rewriting it so that the dimension of nums is not fixed to a constant, but specified by an additional parameter. Listing 2 illustrates this.

Listing 2

double Average (int nums[], int size)
{
    double average = 0;

    for (int i = 0; i < size; ++i)
        average += nums[i];
    return average/size;
}

A C-style string in C++ is simply an array of characters. For example,

char str[] = "HELLO";

defines str to be an array of six characters: five letters and a null character. The terminating null character is inserted by the compiler. By contrast,

char str[] = {'H', 'E', 'L', 'L', 'O'};

defines str to be an array of five characters.

It is easy to calculate the dimension of an array using the sizeof operator. For example, given an array ar whose element type is Type, the dimension of ar is:

sizeof(ar) / sizeof(Type)
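The formula can be checked against the arrays defined earlier. The three function names below are hypothetical wrappers, added only so that each array has its own scope.

```cpp
#include <cstddef>

// Hypothetical demo functions: each returns the dimension of a local array,
// computed as sizeof(array) / sizeof(element type).
std::size_t dimensionOfNums()
{
    int nums[] = {5, 10, 15};
    return sizeof(nums) / sizeof(int);        // 3 elements
}

std::size_t dimensionOfLiteral()
{
    char str[] = "HELLO";
    return sizeof(str) / sizeof(char);        // 6: five letters plus '\0'
}

std::size_t dimensionOfCharList()
{
    char str[] = {'H', 'E', 'L', 'L', 'O'};
    return sizeof(str) / sizeof(char);        // 5: no terminating null
}
```

The calculation only works where the array itself is in scope. Inside a function such as Average, the parameter nums is really a pointer, so sizeof(nums) gives the size of a pointer and not of the caller's array.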

2.2.1

Multidimensional Arrays

An array may have more than one dimension (i.e., two, three, or higher). The organization of the array in memory is still the same (a contiguous sequence of elements), but the programmer's perceived organization of the elements is different. For example, suppose we wish to represent the average seasonal temperature for three major Australian capital cities (see Table 5.1).

Table 5.1 Average seasonal temperature.

             Spring   Summer   Autumn   Winter
Sydney         26       34       22       17
Melbourne      24       32       19       13
Brisbane       28       38       25       20

This may be represented by a two-dimensional array of integers:

int seasonTemp[3][4];

The organization of this array in memory is as 12 consecutive integer elements. The programmer, however, can imagine it as three rows of four integer entries each (see Figure 1).

Figure 1 Organization of seasonTemp in memory.

As before, elements are accessed by indexing the array. A separate index is needed for each dimension. For example, Sydney's average summer temperature (first row, second column) is given by seasonTemp[0][1].
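The link between the two indices and the flat layout in memory can be checked directly. The sketch below measures how many ints lie between the first element and element [row][col]; the helper name elementOffset is hypothetical, and a 3-by-4 table like seasonTemp is assumed.

```cpp
#include <cstddef>

// Hypothetical helper: number of ints stored between table[0][0] and
// table[row][col], measured from the actual addresses.
std::size_t elementOffset(const int table[3][4], int row, int col)
{
    const char* base = reinterpret_cast<const char*>(&table[0][0]);
    const char* elem = reinterpret_cast<const char*>(&table[row][col]);
    return static_cast<std::size_t>(elem - base) / sizeof(int);
}
```

For a table with four columns the result equals row * 4 + col, so seasonTemp[0][1] is the second integer in memory and seasonTemp[1][0] is the fifth.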

The array may be initialized using a nested initializer:

int seasonTemp[3][4] = {
    {26, 34, 22, 17},
    {24, 32, 19, 13},
    {28, 38, 25, 20}
};

Because this is mapped to a one-dimensional array of 12 elements in memory, it is equivalent to:

int seasonTemp[3][4] = {
    26, 34, 22, 17, 24, 32, 19, 13, 28, 38, 25, 20
};

The nested initializer is preferred because as well as being more informative, it is more versatile. For example, it makes it possible to initialize only the first element of each row and have the rest default to zero:


int seasonTemp[3][4] = {{26}, {24}, {28}};

We can also omit the first dimension (but not subsequent dimensions) and let it be derived from the initializer:

int seasonTemp[][4] = { {26, 34, 22, 17}, {24, 32, 19, 13}, {28, 38, 25, 20} };

Processing a multidimensional array is similar to a one-dimensional array, but uses nested loops instead of a single loop. Listing 3 illustrates this by showing a function for finding the highest temperature in seasonTemp.

Listing 3

const int rows = 3;
const int columns = 4;

int seasonTemp[rows][columns] = {
    {26, 34, 22, 17},
    {24, 32, 19, 13},
    {28, 38, 25, 20}
};

int HighestTemp (int temp[rows][columns])
{
    int highest = 0;

    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < columns; ++j)
            if (temp[i][j] > highest)
                highest = temp[i][j];
    return highest;
}

2.3

Pointers

A pointer is simply the address of a memory location and provides an indirect way of accessing data in memory. A pointer variable is defined to point to data of a specific type. For example:

int *ptr1;     // pointer to an int
char *ptr2;    // pointer to a char

The value of a pointer variable is the address to which it points. For example, given the definition

int num;

we can write:

ptr1 = &num;

The symbol & is the address operator; it takes a variable as argument and returns the memory address of that variable. The effect of the above assignment is that the address of num is assigned to ptr1. Therefore, we say that ptr1 points to num. Figure 2 illustrates this diagrammatically.

Figure 2 A simple integer pointer.

Given that ptr1 points to num, the expression

*ptr1


dereferences ptr1 to get to what it points to, and is therefore equivalent to num. The symbol * is the dereference operator; it takes a pointer as argument and returns the contents of the location to which it points.

In general, the type of a pointer must match the type of the data it is set to point to. A pointer of type void*, however, will match any type. This is useful for defining pointers which may point to data of different types, or whose type is originally unknown.

A pointer may be cast (type converted) to another type. For example, ptr2 = (char*) ptr1; converts ptr1 to a char pointer before assigning it to ptr2. Regardless of its type, a pointer may be assigned the value 0 (called the null pointer). The null pointer is used for initializing pointers, and for marking the end of pointer-based data structures (e.g., linked lists).
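The address, dereference, void* and null-pointer rules above can be put together in a small sketch. ValueOr, StoreThrough and FirstByteOf are hypothetical helper names chosen for this illustration, not functions from any library.

```cpp
// Returns the value a pointer refers to, or `fallback` for the null pointer.
int ValueOr(const int *ptr, int fallback)
{
    return ptr != 0 ? *ptr : fallback;   // never dereference the null pointer
}

// Stores `value` in the location pointed to; the null pointer is ignored.
void StoreThrough(int *ptr, int value)
{
    if (ptr != 0)
        *ptr = value;
}

// Reads the first byte of an int by going through a void* and a cast.
unsigned char FirstByteOf(int *ptr)
{
    void *any = ptr;   // any object pointer converts to void* implicitly
    unsigned char *bytes = static_cast<unsigned char *>(any);
    return bytes[0];
}
```

Given int num = 5; and int *ptr1 = &num;, calling StoreThrough(ptr1, 9) changes num itself, because ptr1 holds the address of num rather than a copy of its value.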

Dynamic Memory

In addition to the program stack (which is used for storing the stack frames of function calls, with global variables held in a separate static data area), another memory area, called the heap, is provided. The heap is used for dynamically allocating memory blocks during program execution. As a result, it is also called dynamic memory. Similarly, the program stack is also called static memory.

Two operators are used for allocating and deallocating memory blocks on the heap. The new operator takes a type as argument and allocates a memory block for an object of that type. It returns a pointer to the allocated block. For example,

int *ptr = new int;
char *str = new char[10];

allocate, respectively, a block for storing a single integer and a block large enough for storing an array of 10 characters.

Memory allocated from the heap does not obey the same scope rules as normal variables. For example, in

void Foo (void)
{
    char *str = new char[10];
    //...
}

when Foo returns, the local variable str is destroyed, but the memory block pointed to by str is not. The latter remains allocated until explicitly released by the programmer.

The delete operator is used for releasing memory blocks allocated by new. It takes a pointer as argument and releases the memory block to which it points. For example:

delete ptr;       // delete an object
delete [] str;    // delete an array of objects

Note that when the block to be deleted is an array, an additional [] should be included to indicate this. The significance of this will be explained later when we discuss classes.

Should delete be applied to a pointer which points to anything but a dynamically-allocated object (e.g., a variable on the stack), a serious runtime error may occur. It is harmless to apply delete to the 0 pointer.

Dynamic objects are useful for creating data which last beyond the function call which creates them. Listing 4 illustrates this using a function which takes a string parameter and returns a copy of the string.

Listing 4

1  #include <string.h>
2  char* CopyOf (const char *str)
3  {
4      char *copy = new char[strlen(str) + 1];
5      strcpy(copy, str);
6      return copy;
7  }

Explanation

1  This is the standard string header file which declares a variety of functions for manipulating strings.
4  The strlen function (declared in string.h) counts the characters in its string argument up to (but excluding) the final null character. Because the null character is not included in the count, we add 1 to the total and allocate an array of characters of that size.
5  The strcpy function (declared in string.h) copies its second argument to its first, character by character, including the final null character.

Because of the limited memory resources, there is always the possibility that dynamic memory may be exhausted during program execution, especially when many large blocks are allocated and none released. Should new be unable to allocate a block of the requested size, standard C++ signals this by throwing a std::bad_alloc exception; only the new (std::nothrow) form, and some pre-standard compilers, return 0 instead. It is the responsibility of the programmer to deal with such possibilities. The exception handling mechanism of C++ (explained in Chapter 10) provides a practical method of dealing with such problems.
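How a program might guard against allocation failure can be sketched in two ways. In standard C++ a plain new reports failure by throwing std::bad_alloc, while the new (std::nothrow) form returns 0. TryAllocate and CanAllocate are hypothetical helper names used only here.

```cpp
#include <cstddef>
#include <new>

// Requests `count` chars from the heap; returns 0 instead of throwing
// if the block cannot be allocated. The caller releases it with delete [].
char *TryAllocate(std::size_t count)
{
    return new (std::nothrow) char[count];
}

// Makes the same request with plain new, where failure arrives as a
// std::bad_alloc exception. The block is released again straight away.
bool CanAllocate(std::size_t count)
{
    try {
        char *block = new char[count];
        delete [] block;
        return true;
    } catch (const std::bad_alloc &) {
        return false;
    }
}
```

A caller of TryAllocate must test the result against 0 before using it, and must still release a successful block with delete [].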

Pointer Arithmetic

In C++ one can add an integer quantity to or subtract an integer quantity from a pointer. This is frequently used by programmers and is called pointer arithmetic. Pointer arithmetic is not the same as integer arithmetic, because the outcome depends on the size of the object pointed to. For example, suppose that an int is represented by 4 bytes. Now, given

const char *str = "HELLO";
int nums[] = {10, 20, 30, 40};
int *ptr = &nums[0];           // pointer to first element

str++ advances str by one char (i.e., one byte) so that it points to the second character of "HELLO", whereas ptr++ advances ptr by one int (i.e., four bytes) so that it points to the second element of nums. The accompanying figure illustrates this diagrammatically.

It follows, therefore, that the elements of "HELLO" can be referred to as *str, *(str + 1), *(str + 2), etc. Similarly, the elements of nums can be referred to as *ptr, *(ptr + 1), *(ptr + 2), and *(ptr + 3).

Another form of pointer arithmetic allowed in C++ involves subtracting two pointers of the same type. For example:

int *ptr1 = &nums[1];
int *ptr2 = &nums[3];
int n = ptr2 - ptr1;    // n becomes 2
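The following sketch gathers the forms of pointer arithmetic described so far: stepping a pointer through an array, subtracting two pointers, and walking a string up to its null character. SumRange, ElementsBetween and Length are hypothetical names introduced for this illustration.

```cpp
#include <cstddef>

// Adds up the ints in the half-open range [first, last) by stepping a pointer.
int SumRange(const int *first, const int *last)
{
    int sum = 0;
    for (const int *p = first; p != last; ++p)   // ++p advances by one int
        sum += *p;
    return sum;
}

// Number of elements between two pointers into the same array.
std::ptrdiff_t ElementsBetween(const int *from, const int *to)
{
    return to - from;   // measured in elements, not in bytes
}

// Length of a C string, found by advancing a pointer to the null character.
std::size_t Length(const char *str)
{
    const char *p = str;
    while (*p != '\0')
        ++p;
    return static_cast<std::size_t>(p - str);
}
```

With int nums[] = {10, 20, 30, 40};, the call SumRange(nums, nums + 4) visits all four elements, and ElementsBetween(&nums[1], &nums[3]) gives the same 2 as the subtraction above.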

Pointer arithmetic is very handy when processing the elements of an array. Listing 5 shows as an example a string copying function similar to strcpy.

Listing 5

1  void CopyString (char *dest, char *src)
2  {
3      while (*dest++ = *src++)
4          ;
5  }

Explanation

3  The condition of this loop assigns the contents of src to the contents of dest and then increments both pointers. This condition becomes 0 when the final null character of src is copied to dest.

It turns out that an array variable (such as nums) is itself treated as the address of the first element of the array it represents. Hence the elements of nums can also be referred to using pointer arithmetic on nums, that is, nums[i] is equivalent to *(nums + i). The difference between nums and ptr is that nums is a constant, so it cannot be made to point to anything else, whereas ptr is a variable and can be made to point to any other integer.

Listing 6 shows how the HighestTemp function (shown earlier in Listing 3) can be improved using pointer arithmetic.

Listing 6

1  int HighestTemp (const int *temp, const int rows, const int columns)
2  {
3      int highest = 0;
4      for (int i = 0; i < rows; ++i)
5          for (int j = 0; j < columns; ++j)
6              if (*(temp + i * columns + j) > highest)
7                  highest = *(temp + i * columns + j);
8      return highest;
9  }

Explanation

1  Instead of passing an array to the function, we pass an int pointer and two additional parameters which specify the dimensions of the array. In this way, the function is not restricted to a specific array size.
6  The expression *(temp + i * columns + j) is equivalent to temp[i][j] in the previous version of this function.
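The text does not show how a two-dimensional array is handed to a function of this shape. One way, sketched below, is to pass the address of the first element. HighestOf is a hypothetical variant written for this illustration: it starts from the first element instead of 0 so that tables containing only negative values are handled as well, and it assumes the table has at least one element.

```cpp
// Scans rows * columns ints stored row by row, starting at `first`.
// The running maximum is seeded with the first element rather than 0.
int HighestOf(const int *first, int rows, int columns)
{
    int highest = *first;
    for (int i = 0; i < rows; ++i)
        for (int j = 0; j < columns; ++j)
            if (*(first + i * columns + j) > highest)
                highest = *(first + i * columns + j);
    return highest;
}
```

For the seasonTemp table a call would look like HighestOf(&seasonTemp[0][0], 3, 4). Like the listing above, this relies on the rows being laid out one after another in memory.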

HighestTemp can be simplified even further by treating temp as a one-dimensional array of rows * columns integers. This is shown in Listing 7.

Listing 7

int HighestTemp (const int *temp, const int rows, const int columns)
{
    int highest = 0;
    for (int i = 0; i < rows * columns; ++i)
        if (*(temp + i) > highest)
            highest = *(temp + i);
    return highest;
}

2.3.1 Function Pointers

It is possible to take the address of a function and store it in a function pointer. The pointer can then be used to indirectly call the function. For example,

int (*Compare)(const char*, const char*);

defines a function pointer named Compare which can hold the address of any function that takes two constant character pointers as arguments and returns an integer. The string comparison library function strcmp, for example, is such. Therefore:

Compare = &strcmp;    // Compare points to strcmp function

The & operator is not necessary and can be omitted:

Compare = strcmp;     // Compare points to strcmp function

Alternatively, the pointer can be defined and initialized at once:


int (*Compare)(const char*, const char*) = strcmp;

When a function address is assigned to a function pointer, the two types must match. The above definition is valid because strcmp has a matching function prototype:

int strcmp(const char*, const char*);

Given the above definition of Compare, strcmp can be either called directly, or indirectly via Compare. The following three calls are equivalent:

strcmp("Tom", "Tim");        // direct call
(*Compare)("Tom", "Tim");    // indirect call
Compare("Tom", "Tim");       // indirect call (abbreviated)

A common use of a function pointer is to pass it as an argument to another function; typically because the latter requires different versions of the former in different circumstances. A good example is a binary search function for searching through a sorted array of strings. This function may use a comparison function (such as strcmp) for comparing the search string against the array strings. This might not be appropriate for all cases. For example, strcmp is case-sensitive. If we wanted to do the search in a non-case-sensitive manner then a different comparison function would be needed.
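A comparison function that ignores case might be sketched as follows. CompareNoCase is a hypothetical name; the only requirement taken from the text is that its prototype matches that of strcmp, so that its address can be stored in the same kind of function pointer.

```cpp
#include <cctype>

// Compares two C strings while ignoring the case of letters.
// Returns a negative value, zero, or a positive value, like strcmp.
int CompareNoCase(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        int ca = std::tolower(static_cast<unsigned char>(*a));
        int cb = std::tolower(static_cast<unsigned char>(*b));
        if (ca != cb)
            return ca - cb;   // first differing character decides
        ++a;
        ++b;
    }
    // At least one string has ended; the shorter one sorts first.
    return std::tolower(static_cast<unsigned char>(*a)) -
           std::tolower(static_cast<unsigned char>(*b));
}
```

Because the prototype matches, int (*Compare)(const char*, const char*) = CompareNoCase; is a valid definition, and a search function that receives Compare as a parameter needs no changes.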

As shown in Listing 8, by making the comparison function a parameter of the search function, we can make the latter independent of the former.

Listing 8

1   int BinSearch (char *item, char *table[], int n,
2                  int (*Compare)(const char*, const char*))
3   {
4       int bot = 0;
5       int top = n - 1;
6       int mid, cmp;
7       while (bot <= top) {
8           mid = (bot + top) / 2;
9           if ((cmp = Compare(item,table[mid])) == 0)
10              return mid;              // return item index
11          else if (cmp < 0)
12              top = mid - 1;           // restrict search to lower half
13          else
14              bot = mid + 1;           // restrict search to upper half
15      }
16      return -1;                       // not found
17  }

Explanation

1   Binary search is a well-known algorithm for searching through a sorted list of items. The search list is denoted by table which is an array of strings of dimension n. The search item is denoted by item.
2   Compare is the function pointer to be used for comparing item against the array elements.
7   Each time round this loop, the search span is reduced by half. This is repeated until the two ends of the search span (denoted by bot and top) collide, or until a match is found.
9   The item is compared against the middle item of the array.
10  If item matches the middle item, the latter's index is returned.
11  If item is less than the middle item, then the search is restricted to the lower half of the array.
14  If item is greater than the middle item, then the search is restricted to the upper half of the array.
16  Returns -1 to indicate that there was no matching item.

The following example shows how BinSearch may be called with strcmp passed as the comparison function:

char *cities[] = {"Boston", "London", "Sydney", "Tokyo"};
cout << BinSearch("Sydney", cities, 4, strcmp) << '\n';

This will output 2 as expected.

2.4 References

A reference introduces an alias for an object. The notation for defining references is similar to that of pointers, except that & is used instead of *. For example,

double num1 = 3.14;
double &num2 = num1;    // num2 is a reference to num1


defines num2 as a reference to num1. After this definition num1 and num2 both refer to the same object, as if they were the same variable. It should be emphasized that a reference does not create a copy of an object, but merely a symbolic alias for it. Hence, after

num1 = 0.16;

both num1 and num2 will denote the value 0.16.
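That a reference is an alias rather than a copy can be checked by comparing addresses and by writing through either name. AliasTracksOriginal is a hypothetical function written only for this illustration.

```cpp
// Returns true if a reference and the variable it aliases always agree.
bool AliasTracksOriginal()
{
    double num1 = 3.14;
    double &num2 = num1;                // num2 is another name for num1

    num1 = 0.16;                        // a change made through num1 ...
    bool seenThroughAlias = (num2 == num1);

    num2 = 2.5;                         // ... and one made through num2
    bool seenThroughOriginal = (num1 == 2.5);

    // Both names denote one object, so their addresses are identical.
    return seenThroughAlias && seenThroughOriginal && (&num1 == &num2);
}
```

Had num2 been declared as an ordinary double instead of a reference, the assignment num2 = 2.5 would have left num1 unchanged and the function would return false.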

A reference must always be initialized when it is defined: it should be an alias for something. It would be illegal to define a reference and initialize it later.

double &num3;    // illegal: reference without an initializer
num3 = num1;

You can also initialize a reference to a constant, provided the reference is declared const (standard C++ rejects binding a non-const reference to a literal). In this case a copy of the constant is made (after any necessary type conversion) and the reference is set to refer to the copy.

const int &n = 1;    // n refers to a copy of 1

The reason that n becomes a reference to a copy of 1 rather than 1 itself is safety. Consider what could happen if this were not the case and a non-const reference could be bound to the constant directly:

int &x = 1;
++x;
int y = x + 1;

The 1 in the first and the 1 in the third line are likely to be the same object (most compilers do constant optimization and allocate both 1s in the same memory location). So although we expect y to be 3, it could turn out to be 4. By making the reference refer to a copy, and by requiring it to be const, the compiler guarantees that the constant 1 itself can never be modified.

The most common use of references is for function parameters. Reference parameters facilitate the pass-by-reference style of arguments, as opposed to the pass-by-value style which we have used so far. To observe the differences, consider the three swap functions in Listing 9.

Listing 9

void Swap1 (int x, int y)        // pass-by-value (objects)
{
    int temp = x;
    x = y;
    y = temp;
}

void Swap2 (int *x, int *y)      // pass-by-value (pointers)
{
    int temp = *x;
    *x = *y;
    *y = temp;
}

void Swap3 (int &x, int &y)      // pass-by-reference
{
    int temp = x;
    x = y;
    y = temp;
}

Explanation

1  Although Swap1 swaps x and y, this has no effect on the arguments passed to the function, because Swap1 receives a copy of the arguments. What happens to the copy does not affect the original.
2  Swap2 overcomes the problem of Swap1 by using pointer parameters instead. By dereferencing the pointers, Swap2 gets to the original values and swaps them.
3  Swap3 overcomes the problem of Swap1 by using reference parameters instead. The parameters become aliases for the arguments passed to the function and therefore swap them as intended.
4  Swap3 has the added advantage that its call syntax is the same as Swap1 and involves no addressing or dereferencing.

The following main function illustrates the differences:


int main (void)
{
    int i = 10, j = 20;
    Swap1(i, j);      cout << i << ", " << j << '\n';
    Swap2(&i, &j);    cout << i << ", " << j << '\n';
    Swap3(i, j);      cout << i << ", " << j << '\n';
}

When run, it will produce the following output:

10, 20
20, 10
10, 20

2.5 Typedefs

Typedef is a syntactic facility for introducing symbolic names for data types. Just as a reference defines an alias for an object, a typedef defines an alias for a type. Its main use is to simplify otherwise complicated type declarations as an aid to improved readability. Here are a few examples:

typedef char *String;
typedef char Name[12];
typedef unsigned int uint;

The effect of these definitions is that String becomes an alias for char*, Name becomes an alias for an array of 12 chars, and uint becomes an alias for unsigned int. Therefore:

String str;     // is the same as: char *str;
Name name;      // is the same as: char name[12];
uint n;         // is the same as: unsigned int n;

The complicated declaration of Compare in Listing 8 is a good candidate for typedef:

typedef int (*Compare)(const char*, const char*);

int BinSearch (char *item, char *table[], int n, Compare comp)
{
    //...
    if ((cmp = comp(item, table[mid])) == 0)
        return mid;
    //...
}

The typedef introduces Compare as a new type name for a pointer to any function with the given prototype. This makes BinSearch's signature arguably simpler.
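A small, self-contained sketch shows the same typedef in use outside BinSearch. FindFirst and CompareInitial are hypothetical functions invented for this example; only the Compare typedef itself is taken from the text.

```cpp
#include <cstring>

typedef int (*Compare)(const char*, const char*);

// Index of the first table entry that `comp` reports as equal to item,
// or -1 if there is none. A plain linear scan, so the table need not be sorted.
int FindFirst(const char *item, const char *const table[], int n, Compare comp)
{
    for (int i = 0; i < n; ++i)
        if (comp(item, table[i]) == 0)
            return i;
    return -1;
}

// A deliberately crude comparison: only the first characters are compared.
int CompareInitial(const char *a, const char *b)
{
    return *a - *b;
}
```

The same FindFirst can now be called with strcmp for exact matches or with CompareInitial for matches on the initial letter, and its parameter list stays readable because the function pointer type has a name.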

2.6 Check your progress

1  Define two functions which, respectively, input values for the elements of an array of reals and output the array elements:

void ReadArray (double nums[], const int size);
void WriteArray (double nums[], const int size);

2  Define a function which reverses the order of the elements of an array of reals:

void Reverse (double nums[], const int size);

3  The following table specifies the major contents of four brands of breakfast cereals. Define a two-dimensional array to capture this data:

             Fiber    Sugar    Fat     Salt
Top Flake    12g      25g      16g     0.4g
Cornabix     22g      4g       8g      0.3g
Oatabix      28g      5g       9g      0.5g
Ultrabran    32g      7g       2g      0.2g

Write a function which outputs this table element by element.

4  Define a function to input a list of names and store them as dynamically-allocated strings in an array, and a function to output them:

void ReadNames (char *names[], const int size);
void WriteNames (char *names[], const int size);

Write another function which sorts the list using bubble sort:

void BubbleSort (char *names[], const int size);

Bubble sort involves repeated scans of the list, where during each scan adjacent items are compared and swapped if out of order. A scan which involves no swapping indicates that the list is sorted.

5  Rewrite the following function using pointer arithmetic:

char* ReverseString (char *str)
{
    int len = strlen(str);
    char *result = new char[len + 1];

    for (int i = 0; i < len; ++i)
        result[i] = str[len - i - 1];
    result[len] = '\0';
    return result;
}


Unit 3 Linked List I

Structure

3.0 Unit Objective
3.1 Linked List Basics
3.2 Array Review
3.3 Pointer Review
3.4 Implementation : Template
3.5 Other Types of Lists
    3.5.1 Implementation of Singly linked list
    3.5.2 Implementation of Doubly linked list
3.6 Check your progress

3.0 Unit Objective

After going through this unit, you should be able to understand:

Linked List structure
Comparison of Linked List with array
Linked List Template
Basic operations on Linked Lists
List Types
Implementation algorithm and source code

3.1 Linked List Basics

Linked lists are useful to study for two reasons. Most obviously, linked lists are a data structure which you may want to use in real programs. Seeing the strengths and weaknesses of linked lists will give you an appreciation of some of the time, space, and code issues which are useful when thinking about data structures in general.

Somewhat less obviously, linked lists are a great way to learn about pointers. In fact, you may never use a linked list in a real program, but you are certain to use lots of pointers. Linked list problems are a nice combination of algorithms and pointer manipulation. Traditionally, linked lists have been the domain where beginning programmers get the practice to really understand pointers.

Why Linked Lists?

Linked lists and arrays are similar since they both store collections of data. The terminology is that arrays and linked lists store "elements" on behalf of "client" code. The specific type of element is not important since essentially the same structure works to store elements of any type. One way to think about linked lists is to look at how arrays work and think about alternate approaches.

3.2 Array Review

Arrays are probably the most common data structure used to store collections of elements. In most languages, arrays are convenient to declare and they provide the handy [ ] syntax to access any element by its index number. The following example shows some typical array code and a drawing of how the array might look in memory. The code allocates an array int scores[100], sets the first three elements to contain the numbers 1, 2, 3 and leaves the rest of the array uninitialized.

void ArrayTest()
{
    int scores[100];
    // operate on the elements of the scores array...
    scores[0] = 1;
    scores[1] = 2;
    scores[2] = 3;
}

Here is a drawing of how the scores array might look in memory. The key point is that the entire array is allocated as one block of memory. Each element in the array gets its own space in the array. Any element can be accessed directly using the [ ] syntax.

Once the array is set up, access to any element is convenient and fast with the [ ] operator. (Extra for experts) Array access with expressions such as scores[i] is almost always implemented using fast address arithmetic: the address of an element is computed as an offset from the start of the array which only requires one multiplication and one addition.

The disadvantages of arrays are...

1) The size of the array is fixed (100 elements in this case). Most often this size is specified at compile time with a simple declaration such as in the example above. With a little extra effort, the size of the array can be deferred until the array is created at runtime, but after that it remains fixed. (Extra for experts) You can go to the trouble of dynamically allocating an array in the heap and then dynamically resizing it with realloc(), but that requires some real programmer effort.

2) Because of (1), the most convenient thing for programmers to do is to allocate arrays which seem "large enough" (e.g. the 100 in the scores example). Although convenient, this strategy has two disadvantages: (a) most of the time there are just 20 or 30 elements in the array and 70% of the space in the array really is wasted. (b) If the program ever needs to process more than 100 scores, the code breaks. A surprising amount of commercial code has this sort of naive array allocation which wastes space most of the time and crashes for special occasions. (Extra for experts) For relatively large arrays (larger than 8k bytes), the virtual memory system may partially compensate for this problem, since the "wasted" elements are never touched.

3) (minor) Inserting new elements at the front is potentially expensive because existing elements need to be shifted over to make room.

Linked lists have their own strengths and weaknesses, but they happen to be strong where arrays are weak. The array's features all follow from its strategy of allocating the memory for all its elements in one block of memory. Linked lists use an entirely different strategy. As we will see, linked lists allocate memory for each element separately and only when necessary.
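The "real programmer effort" mentioned under disadvantage (1) can be made concrete with a sketch of growing a heap array by hand. The text refers to realloc(); the version below uses new and delete [] instead, and Resize is a hypothetical helper name. Every resize allocates a new block and copies the surviving elements, which is exactly the cost a linked list avoids.

```cpp
#include <cstddef>

// Returns a new heap array of `newSize` ints that holds the first
// min(oldSize, newSize) elements of `old`; any extra elements are zero.
// The old block is released, so the caller must use the returned pointer.
int *Resize(int *old, std::size_t oldSize, std::size_t newSize)
{
    int *block = new int[newSize]();             // () zero-initializes
    std::size_t keep = oldSize < newSize ? oldSize : newSize;
    for (std::size_t i = 0; i < keep; ++i)       // copy surviving elements
        block[i] = old[i];
    delete [] old;                               // harmless if old is 0
    return block;
}
```

A typical call is scores = Resize(scores, 3, 6);. After it, any other pointer that still refers to the old block is a bad pointer.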

3.3 Pointer Review

Here is a quick review of the terminology and rules for pointers. The linked list code to follow will depend on these rules.

Pointer/Pointee: A "pointer" stores a reference to another variable, sometimes known as its "pointee". Alternately, a pointer may be set to the value NULL, which encodes that it does not currently refer to a pointee. (In C and C++ the value NULL can be used as a boolean false.)

Dereference: The dereference operation on a pointer accesses its pointee. A pointer may only be dereferenced after it has been set to refer to a specific pointee. A pointer which does not have a pointee is "bad" (below) and should not be dereferenced.

Bad Pointer: A pointer which does not have an assigned pointee is "bad" and should not be dereferenced. In C and C++, a dereference on a bad pointer sometimes crashes immediately at the dereference and sometimes randomly corrupts the memory of the running program, causing a crash or incorrect computation later. That sort of random bug is difficult to track down. In C and C++, uninitialized local pointers start out with bad values, so it is easy to use a bad pointer accidentally. Correct code sets each pointer to have a good value before using it. Accidentally using a pointer when it is bad is the most common bug in pointer code. In Java and other runtime oriented languages, pointers automatically start out with the NULL value, so dereferencing one is detected immediately. Java programs are much easier to debug for this reason.

Pointer assignment: An assignment operation between two pointers like p=q; makes the two pointers point to the same pointee. It does not copy the pointee memory. After the assignment both pointers will point to the same pointee memory, which is known as a "sharing" situation.

malloc(): malloc() is a system function which allocates a block of memory in the "heap" and returns a pointer to the new block. The prototypes for malloc() and other heap functions are in stdlib.h. The argument to malloc() is the integer size of the block in bytes. Unlike local ("stack") variables, heap memory is not automatically deallocated when the creating function exits. malloc() returns NULL if it cannot fulfill the request. (Extra for experts) You may check for the NULL case with assert() if you wish just to be safe. Most modern programming systems will throw an exception or do some other automatic error handling in their memory allocator, so it is becoming less common that source code needs to explicitly check for allocation failures.

free(): free() is the opposite of malloc(). Call free() on a block of heap memory to indicate to the system that you are done with it. The argument to free() is a pointer to a block of memory in the heap, that is, a pointer which some time earlier was obtained via a call to malloc().
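The malloc(), free() and pointer assignment rules reviewed above can be combined in one sketch. MakeScores and SharingDemo are hypothetical names; the explicit cast on the result of malloc() is needed because the sketch is compiled as C++.

```cpp
#include <cstddef>
#include <cstdlib>

// Allocates `count` ints in the heap and sets them to 0.
// Returns NULL if malloc() cannot fulfill the request.
int *MakeScores(std::size_t count)
{
    int *block = static_cast<int *>(std::malloc(count * sizeof(int)));
    if (block == NULL)
        return NULL;
    for (std::size_t i = 0; i < count; ++i)
        block[i] = 0;
    return block;
}

// After q = p both pointers share one pointee; the pointee is not copied.
bool SharingDemo()
{
    int *p = MakeScores(1);
    if (p == NULL)
        return false;
    int *q = p;                  // copies the pointer only
    *q = 42;                     // the change is visible through p as well
    bool shared = (*p == 42);
    std::free(p);                // release the block exactly once;
                                 // from here on both p and q are bad
    return shared;
}
```

The block returned by MakeScores outlives the call that created it, so the caller is responsible for passing it to free() when finished.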

A list is one of the most basic data structures in programming. It is a logically sequential order of elements, any of which can be accessed without restriction. Any element in the list can be removed, and its value be read or modified. Also, a new element may be inserted into any location in the list structure. Each element points to the next one in the list, and the last does not reference any other item. Physically, the elements of a list can be stored at various locations in memory, and the addresses of each element are not correlated in any way. The list is linked since each element points to the location of the next item.
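The idea that each element points to the next, and that the last one points nowhere, can be tried out on a list built by hand. IntNode, CountNodes, SumNodes and InsertAfter are hypothetical names used only in this sketch; the nodes here are ordinary variables, whereas a real list would normally allocate them in the heap.

```cpp
#include <cstddef>

// One element of the list: a value plus the location of the next element.
struct IntNode {
    int data;
    IntNode *next;    // NULL marks the last node
};

// Follows the next pointers from `head` and counts the nodes visited.
int CountNodes(const IntNode *head)
{
    int count = 0;
    for (const IntNode *p = head; p != NULL; p = p->next)
        ++count;
    return count;
}

// Adds up the data values of all nodes reachable from `head`.
int SumNodes(const IntNode *head)
{
    int sum = 0;
    for (const IntNode *p = head; p != NULL; p = p->next)
        sum += p->data;
    return sum;
}

// Links `node` in directly after `position`; no other element moves.
void InsertAfter(IntNode *position, IntNode *node)
{
    node->next = position->next;
    position->next = node;
}
```

Inserting into the middle changes two pointers and nothing else, in contrast to an array, where the later elements would have to be shifted.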

3.4 Implementation : Template

The dynamic representation of a list is called a linked list. Each element in the list is called a node, and contains two values. The first is the data value that is to be stored. For instance, in a list of names, this would be a value such as "John". The second value in a node is a pointer to the next node in the list. A common representation of a list node is as follows:

template <class ItemType>
struct ListNode
{
    ItemType data;
    ListNode<ItemType> *next;
};

Linked lists can be implemented in many ways, depending on how the programmer will use lists in their program. We will show how to implement a generic class, which can be adapted and modified to use in most situations. The member functions implemented will be those necessary to add, modify, or delete nodes in a linked list. The class will be constructed in such a way that if the implementation were to be changed, the class definition would remain the same, and therefore any program that uses the class will not need to be altered. This is usually good practice in the design of any class. A definition of the linked list class is displayed below.

template <class ItemType>
class List
{
public:
    List();                        //constructor -initialize private variables
    ~List();                       //destructor -free used memory
    void insert(const ItemType);   //insert new node at current location
    void remove();                 //remove the current node (the text calls this delete(), but delete is a reserved word in C++ and cannot name a function)
    void next();                   //set current to the next node in the list
    void prev();                   //set current to the previous node in the list
    void reset();                  //set current to the first node in the list
    void clear();                  //remove all nodes in the list
    int length() const;            //return the number of nodes in the list
    bool IsEmpty() const;          //returns true if the list doesn't have any nodes
    bool IsFull() const;           //returns true if there is no system memory for additional nodes
    ItemType value() const;        //returns the value of the current node
private:
    ListNode<ItemType> *list;      //points to the list header
    ListNode<ItemType> *prevcurrent;
    int len;
};

len will contain the total number of nodes in the list, and is self explanatory. prevcurrent and list will be implemented in a special way, and require further explanation. At first, it would appear logical to have a pointer directly to the node being referenced. However, this would make the implementation of the prev(), as well as the insert() and delete() functions, time consuming. These three functions require access to the node that precedes the current node being referenced. Therefore, if the only pointer were one directly to the current node, the only solution would be to search through the entire list (in the worst case scenario) for the preceding node. A temporary pointer would be created that points to the list's first node, and be used to traverse the list. A condition would then be implemented that triggers when temp->next->data equals the current node's value.

A more efficient approach is to have prevcurrent store the pointer to the node that precedes the one being referenced. Therefore, the time needed to otherwise find this node is not wasted.

This approach raises a new concern: if the item being referenced is the first one in the list (for example, when the list has only one item), there is no preceding node for prevcurrent to point to, so a special case would be needed every place prevcurrent is used. A simple and efficient way of solving this problem is to use a header node. A header node is a "dummy" node that acts as the first node of the list, but is not logically in the list. It is used only in the implementation of the linked list class, and the programmer who uses the class does not need to know about header nodes. It is created, manipulated, and deleted by the member functions. The list pointer will point to the header node.

Now let's look at the implementation of the linked list class. The class constructor will create the header node, and set prevcurrent to point to its location. It will also set length to a value of zero.

template <class ItemType>
List<ItemType>::List()
{
    list = new ListNode<ItemType>;  //create the header node in memory
    list->next = NULL;              //the header is the only node in the list
    prevcurrent = list;             //set prevcurrent to the header, since there are no other nodes
    len = 0;
}

The destructor will delete all nodes of the list, freeing up the memory they occupied. Since the clear() function does this operation, it can be called in the destructor. In addition, the header node will also be deleted in the destructor, as the clear() function only removes actual list nodes. In clear(), two local pointers of type ListNode<ItemType>* are used for the operation. traverse is set to the first node in the list, and used to visit every node. tmp will be used to point to the node to be deleted. Since we cannot move on to the next node if the current node is deleted (the next pointer will no longer exist), traverse will be set to the next node first, after which tmp will be deleted.

template <class ItemType>
void List<ItemType>::clear()
{
    ListNode<ItemType> *tmp;                    //point to the node to be deleted
    ListNode<ItemType> *traverse = list->next;  //used to visit each node in the list.
                                                //The header node is not deleted, so we
                                                //start with the first actual node.
    while(traverse != NULL)         //while there are nodes left to visit
    {
        tmp = traverse;             //store the current node
        traverse = traverse->next;  //visit the next node
        delete tmp;                 //free the memory taken up by the current node
    }
    list->next = NULL;              //the header is again the only node
    prevcurrent = list;             //set prevcurrent to the header node
    len = 0;
}

template <class ItemType>
List<ItemType>::~List()
{
    clear();      //delete all list nodes
    delete list;  //delete the header "dummy" node
}
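To see how the header node removes the special cases, here is a minimal, self-contained sketch. It is not the List class itself: it assumes an int payload, and the names IntNode, IntList and toVector are made up for the illustration. Inserting into an empty list and removing the only item go through exactly the same statements as any other position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the List class: int data only.
struct IntNode {
    int data;
    IntNode* next;
};

struct IntList {
    IntNode* list;         // always points to the header ("dummy") node
    IntNode* prevcurrent;  // node preceding the one being referenced
    int len;

    IntList() : list(new IntNode), prevcurrent(list), len(0) {
        list->data = 0;
        list->next = NULL;
    }

    ~IntList() {
        IntNode* traverse = list;
        while (traverse != NULL) {   // frees the header node as well
            IntNode* tmp = traverse;
            traverse = traverse->next;
            delete tmp;
        }
    }

    // Insert before the referenced node; no empty-list special case.
    void insert(int item) {
        IntNode* n = new IntNode;
        n->data = item;
        n->next = prevcurrent->next;
        prevcurrent->next = n;
        ++len;
    }

    // Remove the referenced node, if there is one.
    void remove() {
        IntNode* doomed = prevcurrent->next;
        if (doomed == NULL) return;
        prevcurrent->next = doomed->next;
        delete doomed;
        --len;
    }

    // Copy the logical contents (header excluded) into a vector.
    std::vector<int> toVector() const {
        std::vector<int> out;
        for (IntNode* p = list->next; p != NULL; p = p->next)
            out.push_back(p->data);
        return out;
    }
};
```

Because prevcurrent always points at a real node (the header at the very least), neither insert() nor remove() has to ask whether the list is empty. The sketch does not define copying, so an IntList should not be copied or assigned.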

The insert() function will create a new node preceding the one being referenced, and move the reference to it. Remember, prevcurrent->next is the node currently being referenced, since prevcurrent points to the preceding node.

template <class ItemType>
void List<ItemType>::insert(const ItemType item)
{
    assert(!IsFull());  //abort if there is no memory to create a new node
    ListNode<ItemType> *NewNode = new ListNode<ItemType>;  //create a new node in memory
    NewNode->data = item;               //set the node's value
    NewNode->next = prevcurrent->next;  //referenced node will follow new node in order
    prevcurrent->next = NewNode;        //The node that preceded the old node now precedes
                                        //the new one. The new node is now referenced.
    len++;                              //increment length
}

The remove() function sets the previous node to point to the node following the one being referenced, thus removing it from the logical list. It is then deleted from memory. (The function cannot be called delete, because delete is a reserved word in C++.) Note that the node to be removed must be saved in a temporary pointer before it is unlinked; otherwise the node that follows it would be deleted instead.

template <class ItemType>
void List<ItemType>::remove()
{
    if(prevcurrent->next != NULL)  //there is a node being referenced; never delete the header
    {
        ListNode<ItemType> *tmp = prevcurrent->next;  //the node to be removed
        prevcurrent->next = tmp->next;  //logically remove it from the list
        delete tmp;                     //free up memory
        len--;                          //decrement length
    }
}

The next() function is very short, and self explanatory. The reference is only moved if there is a node to move to.

template <class ItemType>
void List<ItemType>::next()
{
    if(prevcurrent->next != NULL)
        prevcurrent = prevcurrent->next;
}

The prev() function visits each node until the one that points to prevcurrent is found, and then sets prevcurrent to this node. The node being referenced now becomes the node prevcurrent pointed to before the function was executed.

template <class ItemType>
void List<ItemType>::prev()
{
    if(prevcurrent != list)  //run only if there is an element behind the current one
    {
        ListNode<ItemType> *tmp = list;
        while(tmp->next != prevcurrent)
            tmp = tmp->next;
        prevcurrent = tmp;
    }
}

The reset() function sets the item in reference to the first by setting prevcurrent to the header node.

template <class ItemType>
void List<ItemType>::reset()
{
    prevcurrent = list;
}

Next, the length() function simply returns the value of the private length member len. A function would not be necessary to perform this operation if we were to make the length a public data member, in which case the programmer could read it directly. However, a member function is used to retrieve this value for two reasons. First, if the length data member were public, the programmer could also change the length of the list in the program without modifying the number of nodes in the list. Second, if we were to change the implementation of the class, and the length was no longer controlled by a single variable, the programmer would not have to modify their program in order for it to work.

template <class ItemType>
int List<ItemType>::length() const
{
    return len;
}

The IsEmpty() function works by checking if the length of the list is zero.

template <class ItemType>
bool List<ItemType>::IsEmpty() const
{
    return (len == 0);
}

The IsFull() function checks to see if there is enough system memory to create a new node. This is done by attempting to create a new node with new (std::nothrow), declared in the <new> header, and checking if the resulting pointer is NULL. (Plain new reports failure by throwing std::bad_alloc, so its result would never compare equal to NULL.) If the allocation was unsuccessful, NULL is the return value. If the value is not NULL, the memory space is freed, and false is returned. Otherwise, the function returns true, meaning no more items can be added to the list.

template <class ItemType>
bool List<ItemType>::IsFull() const
{
    ListNode<ItemType> *tmp = new (std::nothrow) ListNode<ItemType>;
    if(tmp == NULL)
        return true;
    else
    {
        delete tmp;
        return false;
    }
}
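The difference between the two forms of new that matters for an IsFull()-style check can be sketched as follows. This is an illustrative fragment rather than part of the List class; the Node type and the helper names canAllocateNode and canAllocateNodeThrowing are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <new>      // std::nothrow, std::bad_alloc

// Hypothetical node type used only for this illustration.
struct Node {
    int data;
    Node* next;
};

// Returns true if a node could be allocated (and frees it again).
// This is the opposite of what IsFull() reports.
bool canAllocateNode() {
    Node* tmp = new (std::nothrow) Node;  // NULL on failure, no exception
    if (tmp == NULL) return false;
    delete tmp;
    return true;
}

// The same check written with plain new, which throws on failure
// instead of returning NULL.
bool canAllocateNodeThrowing() {
    try {
        Node* tmp = new Node;
        delete tmp;
        return true;
    } catch (const std::bad_alloc&) {
        return false;
    }
}
```

Either form works; what does not work is testing the result of plain new against NULL, because control never reaches that test when the allocation fails.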

Finally, the value() function returns the value of the node that is currently being referenced.

template <class ItemType>
ItemType List<ItemType>::value() const
{
    return prevcurrent->next->data;
}

3.5 Other Types of Lists

Like header nodes, lists may also have a trailer node, which is a dummy node at the end of the list. Trailer nodes are maintained in a similar fashion to header nodes; that is, they are not part of the logical list structure. They are used to eliminate any special cases that may arise when inserting a new node at the end of the list. Other standard types of list exist as well. For instance, in a circular list, the last node points to the first node instead of holding a NULL value. This requires minor changes in the class implementation: nodes inserted at the end must now point to the first node, and when searching for the last node we must look for a node which points to the first node instead of NULL. Finally, a doubly linked list maintains a prev pointer in ListNode, which points to the previous node. This makes functions such as insert() and remove(), as well as the general implementation, much simpler. We no longer need to maintain a pointer to the node preceding the one being referenced; since the previous node is required by both functions, we need only check prev for it. Other modifications needed in order to create a doubly linked list include setting each new node's prev pointer when it is created and updating the prev pointer of the node that follows it. The first node has a prev value of NULL and requires a special condition, unless a header node or a circular doubly linked list is used.
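As a rough illustration of the last point, the sketch below combines a header node with a circular doubly linked list, so that every node always has a valid prev and next and no operation needs a NULL check. It assumes an int payload; the names DNode, makeHeader, insertBefore, removeNode and countNodes are hypothetical and are not part of the List class above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type for a doubly linked list of ints.
struct DNode {
    int data;
    DNode* prev;
    DNode* next;
};

// Create an empty circular list: the header points to itself both ways.
DNode* makeHeader() {
    DNode* h = new DNode;
    h->data = 0;
    h->prev = h;
    h->next = h;
    return h;
}

// Insert a new node holding 'item' immediately before 'pos'.
// Passing the header as 'pos' appends to the end of the list.
DNode* insertBefore(DNode* pos, int item) {
    DNode* n = new DNode;
    n->data = item;
    n->next = pos;
    n->prev = pos->prev;
    pos->prev->next = n;
    pos->prev = n;
    return n;
}

// Unlink and free 'node' (which must not be the header).
void removeNode(DNode* node) {
    node->prev->next = node->next;
    node->next->prev = node->prev;
    delete node;
}

// Count the real nodes by walking until we are back at the header.
int countNodes(const DNode* header) {
    int n = 0;
    for (const DNode* p = header->next; p != header; p = p->next) ++n;
    return n;
}
```

Since each node carries its own prev pointer, removeNode() needs only the node itself; there is no separate pointer to the preceding node to keep up to date.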

3.5.1 Implementation of Singly linked list

To perform various operations such as creation, insertion, deletion, search and display on a singly linked list.

Algorithm:
Step 1: Start the process.
Step 2: Initialize and declare variables.
Step 3: Enter the choice.
Step 4: If choice is CREATE then
    a) Initialize the variable FLAG to TRUE
    b) If FLAG = TRUE, create the Head node (the first node) of the linked list
    c) Assign FALSE to FLAG
Step 5: If choice is DISPLAY then
    a) Assign Head node as TEMP
    b) If TEMP = NULL, display "List is empty"; else display the data of each node in turn
Step 6: If choice is INSERT then
    a) Insert a node as a Head node
    b) Insert a node as last node
    c) Insert a node after some node
Step 7: If choice is DELETE then
    a) Before deleting any element from the linked list, first search for the node in the list
    b) To delete the Head node, set the adjacent node as the new head node and then deallocate the previous head node
Step 8: If choice is SEARCH then
    a) Compare data at each node with the key value; if not matching then move to next node
    b) If the node containing desired data is obtained in the linked list then found variable is set to TRUE
Step 9: Stop the process.

CODING (SLL):

#include<stdio.h>
#include<conio.h>   /* clrscr(), getch(), getche(): Turbo C / DOS specific */
#include<stdlib.h>

#define TRUE 1
#define FALSE 0

typedef struct SLL
{
    int data;
    struct SLL *next;
}node;

node *create();

int main()
{
    int choice,val;
    node *head;
    void display(node *);
    node *search(node *, int);
    node *insert(node *);
    void dele(node **);

    head = NULL;
    do
    {
        clrscr();
        printf("\n Program to Perform Various Operations on Singly Linked List");
        printf("\n 1.Create");
        printf("\n 2.Display");
        printf("\n 3.Search for an item");
        printf("\n 4.Insert an element in a list");
        printf("\n 5.Delete an element from list");
        printf("\n 6.Quit");
        printf("\n Enter Your Choice (1-6)");
        scanf("%d",&choice);
        switch(choice)
        {
            case 1 : head = create();
                     break;
            case 2 : display(head);
                     break;
            case 3 : printf("Enter the Element you want to Search");
                     scanf("%d",&val);
                     search(head, val);   /* prints whether the element is present */
                     break;
            case 4 : head = insert(head);
                     break;
            case 5 : dele(&head);
                     break;
            case 6 : exit(0);
            default: clrscr();
                     printf("Invalid Choice, Try again");
                     getch();
        }
    }while(choice !=6);
    return 0;
}

node *create()
{
    node *temp, *New, *head;
    int val, flag;
    char ans='y';
    node *get_node();

    temp=NULL;
    head=NULL;
    flag = TRUE;
    do
    {
        printf("\n Enter the Element");
        scanf("%d",&val);
        New=get_node();
        if (New == NULL)
        {
            printf("\n Memory is not allocated");
            return head;   /* return whatever has been built so far */
        }
        New -> data = val;
        if(flag == TRUE)   /* the first node becomes the head */
        {
            head = New;
            temp = head;
            flag = FALSE;
        }
        else               /* link the new node after the last one */
        {
            temp -> next = New;
            temp = New;
        }
        printf("\n Do you want to enter more elements?(y/n)");
        ans = getche();
    } while(ans == 'y');
    printf("\n The singly linked List is Created\n");
    getch();
    clrscr();
    return head;
}

node *get_node()
{
    node *temp;
    temp = (node *) malloc(sizeof(node));
    if(temp != NULL)
        temp->next=NULL;
    return temp;
}

void display(node *head)
{
    node *temp;
    temp = head;
    if(temp == NULL)
    {
        printf("\n The list is empty\n");
        getch();
        clrscr();
        return;
    }
    while(temp != NULL)
    {
        printf("%d->",temp->data);
        temp = temp -> next;
    }
    printf("NULL");
    getch();
    clrscr();
}

node *search(node *head, int key)
{
    node *temp;
    int found;
    temp=head;
    if(temp == NULL)
    {
        printf("The linked list is Empty\n");
        getch();
        clrscr();
        return NULL;
    }
    found=FALSE;
    while(temp!=NULL && found==FALSE)
    {
        if(temp->data !=key)
            temp=temp->next;
        else
            found=TRUE;
    }
    if(found==TRUE)
    {
        printf("\nThe element is present in the list\n");
        getch();
        return temp;
    }
    else
    {
        printf("\nThe element is not present in the list\n");
        getch();
        return NULL;
    }
}

node *insert(node *head)
{
    int choice;
    node *insert_head(node *);
    node *insert_after(node *);
    node *insert_last(node *);

    printf("\n 1.Insert a node as a head node");
    printf("\n 2.Insert a node as a last node");
    printf("\n 3.Insert a node at intermediate position in the linked list");
    printf("\nEnter your choice for your insertion of node");
    scanf("%d",&choice);
    switch(choice)
    {
        case 1 : head=insert_head(head);
                 break;
        case 2 : head=insert_last(head);    /* head changes if the list was empty */
                 break;
        case 3 : head=insert_after(head);
                 break;
    }
    return head;
}

node *insert_head(node *head)
{
    node *New;
    New=get_node();
    if(New==NULL)
        return head;
    printf("\n Enter the element which you want to insert");
    scanf("%d",&New->data);
    if(head==NULL)
        head=New;
    else
    {
        New->next=head;   /* old head follows the new node */
        head=New;
    }
    return head;
}

node *insert_last(node *head)
{
    node *New,*temp;
    New=get_node();
    if(New==NULL)
        return head;
    printf("\n Enter the element which you want to insert");
    scanf("%d",&New->data);
    if(head==NULL)
        head=New;
    else
    {
        temp=head;
        while(temp->next!=NULL)
            temp=temp->next;
        temp->next=New;
        New->next=NULL;
    }
    return head;
}

node *insert_after(node *head)
{
    int key;
    node *New,*temp;
    New= get_node();
    if(New==NULL)
        return head;
    printf("\n Enter the element which you want to insert");
    scanf("%d",&New->data);
    if(head==NULL)
        return New;
    printf("\n Enter the element after which you want to insert");
    scanf("%d",&key);
    temp=head;
    while(temp!=NULL)
    {
        if(temp->data==key)
        {
            New->next=temp->next;
            temp->next=New;
            return head;
        }
        temp=temp->next;
    }
    printf("\n The element is not present in the list");
    free(New);            /* key not found: discard the unused node */
    return head;
}

/* Returns the node preceding the one that holds val, or NULL if val is
   in the head node or is not present at all. */
node *get_prev(node *head, int val)
{
    node *temp, *prev;
    int flag;
    temp=head;
    if(temp==NULL)
        return NULL;
    flag=FALSE;
    prev=NULL;
    while(temp !=NULL && !flag)
    {
        if(temp->data !=val)
        {
            prev=temp;
            temp=temp->next;
        }
        else
            flag=TRUE;
    }
    if(flag)
        return prev;
    else
        return NULL;
}

void dele(node **head)
{
    node *temp,*prev;
    int key;
    temp=*head;
    if(temp==NULL)
    {
        printf("\n The list is empty");
        getch();
        clrscr();
        return;
    }
    clrscr();
    printf("\n Enter the element you want to delete");
    scanf("%d",&key);
    temp=search(*head, key);
    if(temp != NULL)
    {
        prev=get_prev(*head,key);
        if(prev!=NULL)           /* node is in the middle or at the end */
        {
            prev->next = temp->next;
            free(temp);
        }
        else                     /* node is the head node */
        {
            *head=temp->next;
            free(temp);
        }
        printf("\n The element is Deleted\n");
        getch();
        clrscr();
    }
}

3.5.2 Implementation of Doubly linked list

To perform various operations such as creation, insertion, deletion, search and display on a doubly linked list.

Algorithm:
Step 1: Start the process.
Step 2: Initialize and declare variables.
Step 3: Enter the choice.
Step 4: If choice is CREATE then
    a) Allocate the memory using malloc function
    b) Assign HEAD->NEXT and HEAD->PREV as NULL
Step 5: If choice is GET_NODE then
    a) Assign CUR position as HEAD
    b) While (CUR->NEXT != NULL) CUR = CUR->NEXT
    c) Get the node value and link the new node after the CUR position
Step 6: If choice is INSERT then
    a) Get the position to be inserted
    b) Assign CUR = HEAD
    c) Using a for loop, advance CUR along the NEXT pointers to that position
    d) Get the data to be inserted, then assign the new node to CUR->NEXT
    e) Set the NEXT and PREV pointers of the new node
Step 7: If choice is DELETE then
    a) Get the data d of the node to be deleted
    b) Using a loop, search for that node in the list
    c) If (CUR->NEXT->DATA == d)
           t = CUR->NEXT
           CUR->NEXT = CUR->NEXT->NEXT
           If CUR->NEXT is not NULL, CUR->NEXT->PREV = CUR
    d) free(t)
Step 8: If choice is DISPLAY then
    a) If (HEAD->NEXT == NULL), print list is empty
    b) While (CUR->NEXT != NULL), print CUR->NEXT->DATA
Step 9: Stop the process.

CODING:

#include<stdio.h>
#include<conio.h>   /* clrscr(), getch(): Turbo C / DOS specific */
#include<stdlib.h>

struct node
{
    int data;
    struct node *next,*prev;
}*head,*cur,*t,*q;

typedef struct node x;

void create();
void display();
void get_node();
void insert();
void del();

int main()
{
    int ch;
    clrscr();
    create();    /* allocate the header node */
    do
    {
        printf("\n Program to Perform Various Operations on Doubly Linked List");
        printf("\n 1.Create");
        printf("\n 2.Insert");
        printf("\n 3.Delete");
        printf("\n 4.Display");
        printf("\n 5.Exit");
        printf("\n Enter Your Choice (1-5)");
        scanf("%d",&ch);
        switch(ch)
        {
            case 1 : get_node();
                     break;
            case 2 : insert();
                     break;
            case 3 : del();
                     break;
            case 4 : display();
                     break;
            case 5 : exit(0);
                     break;
        }
    }while(ch<=5);
    getch();
    return 0;
}

void create()
{
    head=(x*)malloc(sizeof(x));
    head->next=NULL;
    head->prev=NULL;
}

void get_node()
{
    cur=head;
    while(cur->next!=NULL)
        cur=cur->next;
    t = (x *) malloc(sizeof(x));
    printf("\n Enter the Data");
    scanf("%d",&t->data);
    cur->next=t;
    t->next=NULL;
    t->prev=cur;
    display();
}

void display()
{
    cur= head;
    if(head->next==NULL)
    {
        printf("\n The list is empty\n");
        getch();
        clrscr();
        return;
    }
    printf("NULL");
    while(cur->next!= NULL)
    {
        printf("<=>%d",cur->next->data);
        cur=cur->next;
    }
    printf("<=>NULL");
    getch();
}

void insert()
{
    int i,pos;
    printf("\n Enter the Position to be inserted");
    scanf("%d",&pos);
    cur=head;
    for(i=1;i<pos && cur->next!=NULL;i++)   /* stop at the last node if pos is too large */
        cur=cur->next;
    t=(x*)malloc(sizeof(x));
    printf("\n Enter the Data");
    scanf("%d",&t->data);
    q=cur->next;
    t->next=q;
    if(q!=NULL)        /* q is NULL when inserting at the end */
        q->prev=t;
    cur->next=t;
    t->prev=cur;
    display();
}

void del()
{
    int d;
    printf("\n Enter the Data to be Deleted");
    scanf("%d",&d);
    cur=head;
    while(cur->next!=NULL)
    {
        if(cur->next->data==d)
        {
            t=cur->next;
            cur->next=t->next;
            if(cur->next!=NULL)     /* nothing to update when the last node is deleted */
                cur->next->prev=cur;
            printf("\n The Deleted element is %d\n",t->data);
            free(t);
            display();
            return;
        }
        else
            cur=cur->next;
    }
    printf("\n Element not found");
    display();
}

3.6 Check your progress

1. Compare and contrast the properties of arrays with those of linked lists.

2. Write C++ code to delete duplicate occurrences of a node from a linked list.

3. What is the running time of Linked List searching and sorting?

4. Write ?


Unit 4 Linked List II

Structure

4.0 Unit Objective
4.1 Introduction
4.2 The Empty List NULL
4.3 Linked List Types: Node and Pointer
4.4 Memory Drawings
4.5 List Building
4.6 About C++
4.7 Code Techniques
4.8 Examples
4.9 Representation of Polynomial addition using linked list
4.10 Check your progress

4.0 Unit Objective

After going through this unit, you should be able to understand:

Linked List Memory representation

Various ways of drawing Linked List

Various coding techniques in C++

Application of Linked List (Polynomial Addition)

4.1 Introduction

An array allocates memory for all its elements lumped together as one block of memory. In contrast, a linked list allocates space for each element separately in its own block of memory called a "linked list element" or "node". The list gets its overall structure by using pointers to connect all its nodes together like the links in a chain.

Each node contains two fields: a "data" field to store whatever element type the list holds for its client, and a "next" field which is a pointer used to link one node to the next node. Each node is allocated in the heap with a call to malloc(), so the node memory continues to exist until it is explicitly deallocated with a call to free(). The front of the list is a pointer to the first node. Here is what a list containing the numbers 1, 2, and 3 might look like...

This drawing shows the list built in memory by the function BuildOneTwoThree(). The beginning of the linked list is stored in a "head" pointer which points to the first node. The first node contains a pointer to the second node. The second node contains a pointer to the third node, and so on. The last node in the list has its .next field set to NULL to mark the end of the list. Code can access any node in the list by starting at the head and following the .next pointers. Operations towards the front of the list are fast, while operations which access nodes farther down the list take longer the further they are from the front. This "linear" cost to access a node is fundamentally more costly than the constant-time [ ] access provided by arrays. In this respect, linked lists are definitely less efficient than arrays.

Drawings such as the one above are important for thinking about pointer code, so most of the examples in this unit will associate code with its memory drawing to emphasize the habit. In this case the head pointer is an ordinary local pointer variable, so it is drawn separately on the left to show that it is in the stack. The list nodes are drawn on the right to show that they are allocated in the heap.
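The linear access cost described above can be made concrete with a small sketch. The helper GetNth and its steps parameter are hypothetical, introduced only to count how many .next pointers must be followed to reach a given position; an array reaches the same position with a single index operation.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    int data;
    struct node* next;
};

// Hypothetical helper: return the data in the node at 'index' (0-based),
// recording in *steps how many links were followed to get there.
int GetNth(struct node* head, int index, int* steps) {
    struct node* current = head;
    *steps = 0;
    for (int i = 0; i < index; i++) {
        assert(current != NULL);   // index must be within the list
        current = current->next;   // one pointer hop per position
        (*steps)++;
    }
    assert(current != NULL);
    return current->data;
}
```

Reaching the node at position n costs n hops, so the work grows with the distance from the head, whereas arr[n] costs the same for every n.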

4.2 The Empty List NULL

The above list, pointed to by head, is described as being of "length three" since it is made of three nodes with the .next field of the last node set to NULL. There needs to be some representation of the empty list, that is, the list with zero nodes. The most common representation chosen for the empty list is a NULL head pointer. The empty list case is the one common weird "boundary case" for linked list code. All of the code presented in this unit works correctly for the empty list case, but that was not without some effort. When working on linked list code, it's a good habit to remember to check the empty list case to verify that it works too. Sometimes the empty list case works the same as all the other cases, but sometimes it requires some special case code. No matter what, it's a good case to at least think about.
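Both situations can be seen in a short sketch. The functions Sum and LastNode are hypothetical examples chosen for illustration: the first handles the empty list with no extra code, while the second needs an explicit test for it.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    int data;
    struct node* next;
};

// Works the same for the empty list: with a NULL head the loop body
// simply never runs and 0 is returned.
int Sum(struct node* head) {
    int total = 0;
    for (struct node* current = head; current != NULL; current = current->next)
        total += current->data;
    return total;
}

// Needs special-case code: an empty list has no last node, and
// testing head->next on a NULL head would crash.
struct node* LastNode(struct node* head) {
    if (head == NULL) return NULL;          // the empty list case
    struct node* current = head;
    while (current->next != NULL)
        current = current->next;
    return current;
}
```

The difference comes from what the loop tests: Sum() tests the pointer itself, while LastNode() must look inside the node, which is only safe once the pointer is known not to be NULL.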

4.3 Linked List Types: Node and Pointer

Before writing the code to build the above list, we need two data types...

Node: The type for the nodes which will make up the body of the list. These are allocated in the heap. Each node contains a single client data element and a pointer to the next node in the list. Type: struct node

struct node {
    int data;
    struct node* next;
};

Node Pointer: The type for pointers to nodes. This will be the type of the head pointer and the .next fields inside each node. In C and C++, no separate type declaration is required since the pointer type is just the node type followed by a '*'. Type: struct node*

BuildOneTwoThree() Function

Here is a simple function which uses pointer operations to build the list {1, 2, 3}. The memory drawing above corresponds to the state of memory at the end of this function. This function demonstrates how calls to malloc() and pointer assignments (=) work to build a pointer structure in the heap.

/* Build the list {1, 2, 3} in the heap and store its head pointer in a local
   stack variable. Returns the head pointer to the caller. */
struct node* BuildOneTwoThree() {
    struct node* head = NULL;
    struct node* second = NULL;
    struct node* third = NULL;

    head = malloc(sizeof(struct node));    // allocate 3 nodes in the heap
    second = malloc(sizeof(struct node));
    third = malloc(sizeof(struct node));

    head->data = 1;          // setup first node
    head->next = second;     // note: pointer assignment rule

    second->data = 2;        // setup second node
    second->next = third;

    third->data = 3;         // setup third link
    third->next = NULL;

    // At this point, the linked list referenced by "head"
    // matches the list in the drawing.
    return head;
}

Length() Function

The Length() function takes a linked list and computes the number of elements in the list. Length() is a simple list function, but it demonstrates several concepts which will be used in later, more complex list functions...

/* Given a linked list head pointer, compute and return the number of nodes
   in the list. */
int Length(struct node* head) {
    struct node* current = head;
    int count = 0;

    while (current != NULL) {
        count++;
        current = current->next;
    }
    return count;
}

There are two common features of linked lists demonstrated in Length()...

1) Pass The List By Passing The Head Pointer
The linked list is passed in to Length() via a single head pointer. The pointer is copied from the caller into the "head" variable local to Length(). Copying this pointer does not duplicate the whole list. It only copies the pointer so that the caller and Length() both have pointers to the same list structure. This is the classic "sharing" feature of pointer code. Both the caller and Length() have copies of the head pointer, but they share the pointee node structure.

2) Iterate Over The List With A Local Pointer
The code to iterate over all the elements is a very common idiom in linked list code...

struct node* current = head;
while (current != NULL) {
    // do something with *current node
    current = current->next;
}

The hallmarks of this code are...

1) The local pointer, current in this case, starts by pointing to the same node as the head pointer with current = head;. When the function exits, current is automatically deallocated since it is just an ordinary local, but the nodes in the heap remain.

2) The while loop tests for the end of the list with (current != NULL). This test smoothly catches the empty list case: current will be NULL on the first test and the while loop will just exit before the first iteration.

3) At the bottom of the while loop, current = current->next; advances the local pointer to the next node in the list. When there are no more links, this sets the pointer to NULL. If you have some linked list code which goes into an infinite loop, often the problem is that step (3) has been forgotten.

Calling Length()

Here's some typical code which calls Length(). It first calls BuildOneTwoThree() to make a list and store the head pointer in a local variable. It then calls Length() on the list and catches the int result in a local variable.

void LengthTest() {
    struct node* myList = BuildOneTwoThree();
    int len = Length(myList);   // results in len == 3
}
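The same iteration idiom carries over to other read-only list functions. As an illustration, the sketch below counts how many nodes hold a given value; the function name Count and its searchFor parameter are assumptions made for this example and do not appear earlier in the unit.

```cpp
#include <cassert>
#include <cstddef>

struct node {
    int data;
    struct node* next;
};

// Hypothetical example: count how many nodes hold 'searchFor'.
// Only the loop body differs from Length().
int Count(struct node* head, int searchFor) {
    int count = 0;
    struct node* current = head;      // start at the first node

    while (current != NULL) {         // also handles the empty list
        if (current->data == searchFor) count++;
        current = current->next;      // advance; forgetting this loops forever
    }
    return count;
}
```

As with Length(), the caller's list is shared rather than copied, and an empty list simply yields 0.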

4.4 Memory Drawings

The best way to design and think about linked list code is to use a drawing to see how the pointer operations are setting up memory. There are drawings below of the state of memory before and during the call to Length(); take this opportunity to practice looking at memory drawings and using them to think about pointer-intensive code. You will be able to understand many of the later, more complex functions only by making memory drawings like this on your own.

Start with the Length() and LengthTest() code and a blank sheet of paper. Trace through the execution of the code and update your drawing to show the state of memory at each step. Memory drawings should distinguish heap memory from local stack memory. Reminder: malloc() allocates memory in the heap which is only deallocated by deliberate calls to free(). In contrast, local stack variables for each function are automatically allocated when the function starts and deallocated when it exits. Our memory drawings show the caller's local stack variables above the callee's, but any convention is fine so long as you realize that the caller and callee are separate.

Drawing 1: Before Length()
Below is the state of memory just before the call to Length() in LengthTest() above. BuildOneTwoThree() has built the {1, 2, 3} list in the heap and returned the head pointer. The head pointer has been caught by the caller and stored in its local variable myList. The local variable len has a random value; it will only be given the value 3 when the call to Length() returns.

Drawing 2: Mid Length
Here is the state of memory midway through the execution of Length(). Length()'s local variables head and current have been automatically allocated. The current pointer started out pointing to the first node, and then the first iteration of the while loop advanced it to point to the second node.

Notice how the local variables in Length() (head and current) are separate from the local variables in LengthTest() (myList and len). The local variables head and current will be deallocated (deleted) automatically when Length() exits. This is fine: the heap-allocated links will remain even though the stack-allocated pointers which were pointing to them have been deleted.

4.5 List Building

BuildOneTwoThree() is a fine example of pointer manipulation code, but it's not a general mechanism to build lists. The best solution will be an independent function which adds a single new node to any list. We can then call that function as many times as we want to build up any list. Before getting into the specific code, we can identify the classic 3-Step Link In operation which adds a single node to the front of a linked list. The 3 steps are...

1) Allocate: Allocate the new node in the heap and set its .data to whatever needs to be stored.

struct node* newNode;
newNode = malloc(sizeof(struct node));
newNode->data = data_client_wants_stored;

2) Link Next: Set the .next pointer of the new node to point to the current first node of the list. This is actually just a pointer assignment; remember: "assigning one pointer to another makes them point to the same thing."

newNode->next = head;

3) Link Head: Change the head pointer to point to the new node, so it is now the first node in the list.

head = newNode;

3-Step Link In Code
The simple LinkTest() function demonstrates the 3-Step Link In...

void LinkTest() {
    struct node* head = BuildTwoThree();    // suppose this builds the {2, 3} list
    struct node* newNode;

    newNode = malloc(sizeof(struct node));  // allocate
    newNode->data = 1;

    newNode->next = head;                   // link next

    head = newNode;                         // link head

    // now head points to the list {1, 2, 3}
}

3-Step Link In Drawing
The drawing of the above 3-Step Link In looks like this (overwritten pointer values are in gray)...

Push() Function
With the 3-Step Link In in mind, the problem is to write a general function which adds a single node to the head end of any list. Historically, this function is called "Push()" since we're adding the link to the head end, which makes the list look a bit like a stack. Alternately it could be called InsertAtFront(), but we'll use the name Push().

WrongPush()
Unfortunately Push() written in C suffers from a basic problem: what should be the parameters to Push()? This is, unfortunately, a sticky area in C. There's a nice, obvious way to write Push() which looks right but is wrong. Seeing exactly how it doesn't work will provide an excuse for more practice with memory drawings, motivate the correct solution, and just generally make you a better programmer...

void WrongPush(struct node* head, int data) {
    struct node* newNode = malloc(sizeof(struct node));

    newNode->data = data;
    newNode->next = head;
    head = newNode;     // NO this line does not work!
}

void WrongPushTest() {
    struct node* head = BuildTwoThree();

    WrongPush(head, 1); // try to push a 1 on front -- doesn't work
}

WrongPush() is very close to being correct. It takes the correct 3-Step Link In and puts it in an almost correct context. The problem is all in the very last line, where the 3-Step Link In dictates that we change the head pointer to refer to the new node. What does the line head = newNode; do in WrongPush()? It sets a head pointer, but not the right one. It sets the variable named head local to WrongPush(). It does not in any way change the variable named head we really cared about, which is back in the caller WrongPushTest().
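The failure can be observed directly by measuring the list before and after the call. The sketch below repeats WrongPush() and Length() so that it is self-contained; the wrapper LengthAfterWrongPush() is a hypothetical helper added for the demonstration, and the cast on malloc() is there only because the sketch is written to compile as C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct node {
    int data;
    struct node* next;
};

int Length(struct node* head) {
    int count = 0;
    for (struct node* current = head; current != NULL; current = current->next)
        count++;
    return count;
}

// Same logic as WrongPush() in the text.
void WrongPush(struct node* head, int data) {
    struct node* newNode = (struct node*) malloc(sizeof(struct node));
    newNode->data = data;
    newNode->next = head;
    head = newNode;   // changes only the local copy of the pointer
}                     // newNode is now unreachable: a memory leak

// Hypothetical helper: the length the caller sees after WrongPush().
int LengthAfterWrongPush() {
    struct node last = {3, NULL};
    struct node first = {2, &last};
    struct node* head = &first;     // the list {2, 3}

    WrongPush(head, 1);
    return Length(head);            // the caller's head was never updated
}
```

The caller still sees a two-element list, and the node that was allocated inside WrongPush() can no longer be reached or freed.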

Correct Push() Code
Here are Push() and PushTest() written correctly. The list is passed via a pointer to the head pointer. In the code, this amounts to use of '&' on the parameter in the caller and use of '*' on the parameter in the callee. Inside Push(), the pointer to the head pointer is named "headRef" instead of just "head" as a reminder that it is not just a simple head pointer.

/* Takes a list and a data value. Creates a new link with the given data and
   pushes it onto the front of the list. The list is not passed in by its head
   pointer. Instead the list is passed in as a "reference" pointer to the head
   pointer -- this allows us to modify the caller's memory. */
void Push(struct node** headRef, int data) {
    struct node* newNode = malloc(sizeof(struct node));

    newNode->data = data;
    newNode->next = *headRef;   // The '*' dereferences back to the real head
    *headRef = newNode;         // ditto
}

void PushTest() {
    struct node* head = BuildTwoThree();   // suppose this returns the list {2, 3}

    Push(&head, 1);     // note the &
    Push(&head, 13);

    // head is now the list {13, 1, 2, 3}
}

Correct Push() Drawing
Here is a drawing of memory just before the first call to Push() exits. The original value of the head pointer is in gray. Notice how the headRef parameter inside Push() points back to the real head pointer back in PushTest(). Push() uses *headRef to access and change the real head pointer.
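With a working Push(), a list of any length can be built by calling it in a loop, starting from the empty list. The sketch below is one way this might look; BuildCountingList and FreeList are hypothetical helper names, and the cast on malloc() is needed only because the sketch is written to compile as C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

struct node {
    int data;
    struct node* next;
};

// Push() as in the text, taking a pointer to the caller's head pointer.
void Push(struct node** headRef, int data) {
    struct node* newNode = (struct node*) malloc(sizeof(struct node));
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;
}

// Hypothetical helper: build {1, 2, ..., n} by pushing n, n-1, ..., 1.
// Each push goes on the front, so the values are pushed in reverse.
struct node* BuildCountingList(int n) {
    struct node* head = NULL;          // start with the empty list
    for (int i = n; i >= 1; i--) {
        Push(&head, i);
    }
    return head;
}

// Free every node; the next pointer is saved before each free().
void FreeList(struct node* head) {
    while (head != NULL) {
        struct node* next = head->next;
        free(head);
        head = next;
    }
}
```

Because Push() always adds at the head end, the loop runs backwards to produce the list in increasing order. With n equal to 0 the loop never runs and the empty list (NULL) is returned.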

4.6

About C++

C++ has its built-in "& argument" feature to implement reference parameters for the programmer. The short story is: append an '&' to the type of a parameter, and the compiler will automatically make the parameter operate by reference for you. The type of the argument is not disturbed by this; the types continue to act as they appear in the source, which is the most convenient for the programmer. So in C++, Push() and PushTest() look like...

/*
 Push() in C++ -- we just add a '&' to the right hand side of the head parameter type,
 and the compiler makes that parameter work by reference. So this code changes the
 caller's memory, but no extra uses of '*' are necessary -- we just access "head"
 directly, and the compiler makes that change reference back to the caller.
*/
void Push(struct node*& head, int data) {
    // Unlike C, C++ requires a cast on the void* returned by malloc()
    struct node* newNode = (struct node*) malloc(sizeof(struct node));
    newNode->data = data;
    newNode->next = head;   // No extra use of * necessary on head -- the compiler
    head = newNode;         // just takes care of it behind the scenes.
}

void PushTest() {
    struct node* head = BuildTwoThree();   // suppose this returns the list {2, 3}
    Push(head, 1);    // No extra use of & necessary -- the compiler takes
    Push(head, 13);   // care of it here too. head is being changed by these calls.
    // head is now the list {13, 1, 2, 3}
}

4.7

Code Techniques

This section summarizes, in list form, the main techniques for linked list code. These techniques are all demonstrated in the examples in the next section.

1) Iterate Down a List

A very frequent technique in linked list code is to iterate a pointer over all the nodes in a list. Traditionally, this is written as a while loop. The head pointer is copied into a local variable current which then iterates down the list. Test for the end of the list with current != NULL. Advance the pointer with current = current->next.

// Return the number of nodes in a list (while-loop version)
int Length(struct node* head) {
    int count = 0;
    struct node* current = head;
    while (current != NULL) {
        count++;
        current = current->next;
    }
    return(count);
}

Alternately, some people prefer to write the loop as a for, which makes the initialization, test, and pointer advance more centralized, and so harder to omit...

for (current = head; current != NULL; current = current->next) {

2) Changing a Pointer With A Reference Pointer

Many list functions need to change the caller's head pointer. To do this, pass a pointer to the head pointer. Such a pointer to a pointer is sometimes called a "reference pointer". The main steps for this technique are...

Design the function to take a pointer to the head pointer. This is the standard technique in C: pass a pointer to the "value of interest" that needs to be changed. To change a struct node*, pass a struct node**.

Use '&' in the caller to compute and pass a pointer to the value of interest.

Use '*' on the parameter in the callee function to access and change the value of interest.

The following simple function sets a head pointer to NULL by using a reference parameter...

// Change the passed in head pointer to be NULL
// Uses a reference pointer to access the caller's memory
void ChangeToNull(struct node** headRef) {   // Takes a pointer to the value of interest
    *headRef = NULL;                         // use '*' to access the value of interest
}

void ChangeCaller() {
    struct node* head1;
    struct node* head2;
    ChangeToNull(&head1);   // use '&' to compute and pass a pointer to
    ChangeToNull(&head2);   // the value of interest
    // head1 and head2 are NULL at this point
}

Here is a drawing showing how the headRef pointer in ChangeToNull() points back to the variable in the caller...

3) Build At Head With Push()

The easiest way to build up a list is by adding nodes at its "head end" with Push(). The code is short and it runs fast: lists naturally support operations at their head end. The disadvantage is that the elements will appear in the list in the reverse of the order in which they are added. If you don't care about order, then the head end is the best.

struct node* AddAtHead() {
    struct node* head = NULL;
    int i;
    for (i=1; i<6; i++) {
        Push(&head, i);
    }
    // head == {5, 4, 3, 2, 1};
    return(head);
}

4) Build With Tail Pointer

What about adding nodes at the "tail end" of the list? Adding a node at the tail of a list most often involves locating the last node in the list, and then changing its .next field from NULL to point to the new node, such as the tail variable in the following example of adding a "3" node to the end of the list {1, 2}...

This is just a special case of the general rule: to insert or delete a node inside a list, you need a pointer to the node just before that position, so you can change its .next field. Many list problems include the sub-problem of advancing a pointer to the node before the point of insertion or deletion. The one exception is if the node is the first in the list; in that case the head pointer itself must be changed. The following examples show the various ways code can handle the single head case and all the interior cases...
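Before the construction examples, the general rule itself can be illustrated with deletion. The following is only a sketch: RemoveFirst() and its parameter target are hypothetical names introduced here, and struct node and Push() are assumed to match the definitions used earlier in this unit (Push() additionally checks the result of malloc, which is a choice made for this sketch). It shows the head case handled through the reference pointer and the interior case handled through the .next field of the preceding node.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node* next;
};

/* Push() as described earlier: links a new node in at the front of the list. */
void Push(struct node** headRef, int data) {
    struct node* newNode = malloc(sizeof(struct node));
    if (newNode == NULL) return;   /* sketch choice: leave the list unchanged on failure */
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;
}

/* Hypothetical helper: remove the first node whose data equals target.
   Returns 1 if a node was removed, 0 if no node matched. */
int RemoveFirst(struct node** headRef, int target) {
    struct node* current;
    struct node* doomed;

    assert(headRef != NULL);
    current = *headRef;
    if (current == NULL) return 0;            /* empty list */

    /* Head case: the first node matches, so the head pointer itself changes. */
    if (current->data == target) {
        *headRef = current->next;
        free(current);
        return 1;
    }

    /* Interior case: stop on the node just BEFORE the one to delete. */
    while (current->next != NULL && current->next->data != target) {
        current = current->next;
    }
    if (current->next == NULL) return 0;      /* reached the end without a match */

    doomed = current->next;
    current->next = doomed->next;             /* splice out via the previous node's .next */
    free(doomed);
    return 1;
}
```

For example, after building {1, 2, 3} with three calls to Push(), RemoveFirst(&head, 2) takes the interior path and leaves {1, 3}, while RemoveFirst(&head, 1) takes the head path.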

5) Build Special Case + Tail Pointer

Consider the problem of building up the list {1, 2, 3, 4, 5} by appending the nodes to the tail end. The difficulty is that the very first node must be added at the head pointer, but all the other nodes are inserted after the last node using a tail pointer. The simplest way to deal with both cases is to just have two separate cases in the code. Special case code first adds the head node {1}. Then there is a separate loop that uses a tail pointer to add all the other nodes. The tail pointer is kept pointing at the last node, and each new node is added at tail->next. The only "problem" with this solution is that writing separate special case code for the first node is a little unsatisfying. Nonetheless, this approach is a solid one for production code: it is simple and runs fast.

struct node* BuildWithSpecialCase() {
    struct node* head = NULL;
    struct node* tail;
    int i;

    // Deal with the head node here, and set the tail pointer
    Push(&head, 1);
    tail = head;

    // Do all the other nodes using 'tail'
    for (i=2; i<6; i++) {
        Push(&(tail->next), i);   // add node at tail->next
        tail = tail->next;        // advance tail to point to last node
    }
    return(head);                 // head == {1, 2, 3, 4, 5};
}

6) Build Dummy Node

Another solution is to use a temporary dummy node at the head of the list during the computation. The trick is that with the dummy, every node appears to be added after the .next field of a node. That way the code for the first node is the same as for the other nodes. The tail pointer plays the same role as in the previous example. The difference is that it now also handles the first node.

struct node* BuildWithDummyNode() {
    struct node dummy;            // Dummy node is temporarily the first node
    struct node* tail = &dummy;   // Start the tail at the dummy.
                                  // Build the list on dummy.next (aka tail->next)
    int i;

    dummy.next = NULL;
    for (i=1; i<6; i++) {
        Push(&(tail->next), i);
        tail = tail->next;
    }
    // The real result list is now in dummy.next
    // dummy.next == {1, 2, 3, 4, 5};
    return(dummy.next);
}

Some linked list implementations keep the dummy node as a permanent part of the list. For this "permanent dummy" strategy, the empty list is not represented by a NULL pointer. Instead, every list has a dummy node at its head. Algorithms skip over the dummy node for all operations. That way the heap allocated dummy node is always present to provide the above sort of convenience in the code. Our dummy-in-the-stack strategy is a little unusual, but it avoids making the dummy a permanent part of the list. Some of the solutions presented in this document will use the temporary dummy strategy. The code for the permanent dummy strategy is extremely similar, but is not shown.
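For readers who want to see the permanent dummy strategy in concrete form, here is one possible sketch. It is not taken from the text: the function names NewDummyList(), AppendWithDummy() and LengthWithDummy() are hypothetical, and struct node is assumed to match the definition used throughout this unit. The point it illustrates is that, with a dummy always present, appending needs no special case for the empty list and traversal simply starts after the dummy.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node* next;
};

/* Create an empty list: one heap-allocated dummy node whose data field is unused.
   Returns NULL only if the allocation fails. */
struct node* NewDummyList(void) {
    struct node* dummy = malloc(sizeof(struct node));
    if (dummy != NULL) {
        dummy->data = 0;
        dummy->next = NULL;
    }
    return dummy;
}

/* Append at the tail. There is always at least the dummy to hang the new node on,
   so the empty list is not a special case. */
void AppendWithDummy(struct node* dummy, int data) {
    struct node* tail = dummy;
    struct node* newNode;

    assert(dummy != NULL);
    newNode = malloc(sizeof(struct node));
    if (newNode == NULL) return;   /* sketch choice: leave the list unchanged on failure */
    newNode->data = data;
    newNode->next = NULL;

    while (tail->next != NULL) {
        tail = tail->next;
    }
    tail->next = newNode;
}

/* Count the real nodes by skipping over the dummy. */
int LengthWithDummy(struct node* dummy) {
    int count = 0;
    struct node* current;

    assert(dummy != NULL);
    for (current = dummy->next; current != NULL; current = current->next) {
        count++;
    }
    return count;
}
```

The trade-off is one extra node per list and the need for every routine to remember to skip it; in return, no function ever has to modify the caller's head pointer.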

4.8

Examples

This section presents some complete list code to demonstrate all of the techniques above.

AppendNode() Example

Consider an AppendNode() function which is like Push(), except it adds the new node at the tail end of the list instead of the head. If the list is empty, it uses the reference pointer to change the head pointer. Otherwise it uses a loop to locate the last node in the list. This version does not use Push(). It builds the new node directly.

// Declared void because the function changes the list through headRef
// and has no value to return
void AppendNode(struct node** headRef, int num) {
    struct node* current = *headRef;
    struct node* newNode;

    newNode = malloc(sizeof(struct node));
    newNode->data = num;
    newNode->next = NULL;

    // special case for length 0
    if (current == NULL) {
        *headRef = newNode;
    }
    else {
        // Locate the last node
        while (current->next != NULL) {
            current = current->next;
        }
        current->next = newNode;
    }
}


AppendNode() With Push() This version is very similar, but relies on Push() to build the new node. Understanding this version requires a real understanding of reference pointers.

// Declared void because the function changes the list through headRef
// and has no value to return
void AppendNode(struct node** headRef, int num) {
    struct node* current = *headRef;

    // special case for the empty list
    if (current == NULL) {
        Push(headRef, num);
    }
    else {
        // Locate the last node
        while (current->next != NULL) {
            current = current->next;
        }
        // Build the node after the last node
        Push(&(current->next), num);
    }
}

4.9

Representation of Polynomial addition using linked list

To represent polynomial addition using a linked list.

Algorithm:
Step 1: Start the process.
Step 2: Initialize and declare variables.
Step 3: Enter the choice.
Step 4: If choice is CREATEPOLY then
    a) Assign HEAD = NULL
    b) Call READNODE(P)
    c) Assign HEAD = INSERTNODE(HEAD, P)
    d) Repeat (b) and (c) until (P->EXP == 0)
    e) Return HEAD
Step 5: If choice is VIEWPOLY then
    a) Assign P = HEAD
    b) While (P != NULL), print P->COEF and P->EXP
    c) P = P->NEXT
Step 6: If choice is INSERTNODE then
    a) If (P->COEF == 0) then Return HEAD
    b) If (HEAD == NULL) then Return P
    c) Else If (P->EXP > HEAD->EXP) then P->NEXT = HEAD, Return P
    d) Else If (P->EXP < HEAD->EXP) then HEAD->NEXT = INSERTNODE(HEAD->NEXT, P)
    e) Else If ((HEAD->COEF + P->COEF) != 0) then HEAD->COEF = HEAD->COEF + P->COEF
    f) Else Return HEAD->NEXT
    g) Return HEAD
Step 7: If choice is POLYADD then
    a) Assign HEAD = NULL
    b) While (POLY1 != NULL), HEAD = INSERTNODE(HEAD, COPYNODE(POLY1, 1)), POLY1 = POLY1->NEXT
    c) While (POLY2 != NULL), HEAD = INSERTNODE(HEAD, COPYNODE(POLY2, 1)), POLY2 = POLY2->NEXT
    d) Return HEAD
Step 8: Stop the process.

CODING:

#include <stdio.h>
#include <stdlib.h>

#define POSITIVE 1
#define NEGATIVE -1

/* One term of a polynomial: coef * x^exp */
typedef struct NODE
{
    float coef;
    int exp;
    struct NODE *next;
} poly;

void viewMenu(void);
void readNode(poly *);
void viewPoly(poly *);
poly *getNode(void);
poly *createPoly(void);
poly *copyNode(poly *, int);
poly *insertNode(poly *, poly *);
poly *polyAdd(poly *p1, poly *p2);

int main(void)
{
    int choice;
    poly *poly1 = NULL, *poly2 = NULL, *res = NULL;
    viewMenu();
    while (1)
    {
        printf("\n?");
        if (scanf("%d", &choice) != 1)   /* treat unreadable input as "Exit" */
            choice = 5;
        switch (choice)
        {
        case 0:
            viewMenu();
            break;
        case 1:
            printf("\n Enter the First polynomial");
            poly1 = createPoly();
            printf("\n The First polynomial is");
            viewPoly(poly1);
            break;
        case 2:
            printf("\n Enter the Second polynomial");
            poly2 = createPoly();
            printf("\n The Second polynomial is");
            viewPoly(poly2);
            break;
        case 3:
            printf("\n The First polynomial is");
            viewPoly(poly1);
            printf("\n The Second polynomial is");
            viewPoly(poly2);
            break;
        case 4:
            printf("\n The Resultant after polynomial addition is\n");
            res = polyAdd(poly1, poly2);
            viewPoly(res);
            break;
        default:
            printf("\n End of Run of your program");
            exit(0);
        }
    }
}

void viewMenu(void)
{
    printf("\n Polynomial Manipulation using Singly linked list");
    printf("\n\t 0.View the Main Menu");
    printf("\n\t 1.Create the First polynomial");
    printf("\n\t 2.Create the Second Polynomial");
    printf("\n\t 3.View the First polynomial and Second polynomial");
    printf("\n\t 4.polynomial Addition ");
    printf("\n\t 5.Exit");
}

/* Read terms until one with exponent 0 is entered */
poly *createPoly(void)
{
    poly *p, *head = NULL;
    do
    {
        p = getNode();
        readNode(p);
        head = insertNode(head, p);
    } while (p->exp != 0);
    return head;
}

poly *getNode(void)
{
    return (poly *)malloc(sizeof(poly));
}

void readNode(poly *newnode)
{
    int exp;
    float coef;
    printf("\n Enter the Coefficient:");
    scanf("%f", &coef);
    printf("\n Enter the Exponent:");
    scanf("%d", &exp);
    newnode->coef = coef;
    newnode->exp = exp;
    newnode->next = NULL;
}

/* Insert term p into the list, kept in descending order of exponent.
   Terms with equal exponents are merged; a merged term that sums to 0 is dropped. */
poly *insertNode(poly *head, poly *p)
{
    if (p->coef == 0.0f)
        return head;
    if (head == NULL)
        return p;
    else if (p->exp > head->exp)
    {
        p->next = head;
        return p;
    }
    else if (p->exp < head->exp)
        head->next = insertNode(head->next, p);
    else if ((head->coef = head->coef + p->coef) == 0.0f)
        return head->next;
    return head;
}

void viewPoly(poly *ply)
{
    if (ply == NULL)
        printf("NULL\n");
    while (ply != NULL)
    {
        printf("%.2fx^%d", ply->coef, ply->exp);
        printf("%s", (ply->next == NULL) ? " = 0\n" : "+");
        ply = ply->next;
    }
}

/* Build a new list holding the sum; poly1 and poly2 are left unchanged */
poly *polyAdd(poly *poly1, poly *poly2)
{
    poly *head = NULL;
    while (poly1 != NULL)
    {
        head = insertNode(head, copyNode(poly1, POSITIVE));
        poly1 = poly1->next;
    }
    while (poly2 != NULL)
    {
        head = insertNode(head, copyNode(poly2, POSITIVE));
        poly2 = poly2->next;
    }
    return head;
}

poly *copyNode(poly *p, int sign)
{
    poly *newnode;
    newnode = getNode();
    newnode->coef = sign * p->coef;
    newnode->exp = p->exp;
    newnode->next = NULL;
    return newnode;
}

4.10

Check your progress

1. Explain the importance of the NULL pointer in the context of the linked list.

2. Write a variant of Length() function given above. Can we run it simultaneously from both ends?

3. Write a C++ code to join two Linked Lists taking one node from each list.


4. Write a C++ code to sort the linked list.


BLOCK 3

Unit 1 Stacks

Structure

1.0

Unit Objective

1.1

Introduction to Stacks

1.2

Programmatic representation of a stack

1.3

Implementation of Stacks

1.4

Application of Stacks

1.4.1

PostFixExpression ConvertToPolishNotation(InfixExpression)

1.4.2

Converting infix expression to postfix notation using Stack

1.5

Check your progress

1.0

Unit Objective

After going through this unit, you should be able to understand : 

Stack structure

Stack usage in recursion

Primitive operations : Push and Pop

Implementation of stack in C++

Application of stacks in evaluating expressions

1.1

Introduction to Stacks

A stack is an ordered list in which items are inserted and removed at only one end called the TOP. There are only 2 operations that are possible on a stack. They are the Push and the Pop operations. A Push operation inserts a value into the stack and the Pop operation retrieves the value from the stack and removes it from the stack as well.

An example of a stack is a stack of plates arranged on a table: plates are added to and taken from the top only. This means that the last item to be added is the first item to be removed. Hence a stack is also called a Last-In-First-Out (LIFO) list.

To understand the usage of a stack, let us see what the system does when a function call is executed.

Whenever a function call is executed, the system needs to know the return address so that it can resume from the place where it left off in the calling function. Take the following example:

void func2(void);
void func3(void);

void func1(void)
{
    printf("I am in function 1\n");
    func2();
    printf("I am again in function 1\n");
}

void func2(void)
{
    printf("I am in function 2\n");
    func3();
    printf("I am again in function 2\n");
}

void func3(void)
{
    printf("I am in function 3\n");
}

When func1() is executed, it prints the line "I am in function 1" and then control is transferred to func2(). func2() prints the line "I am in function 2" and then calls func3(), which prints "I am in function 3".

Once that is over where should the function call return? Should it return to func2() or func1() or to anywhere else? How does the system identify that it has to go to func2() after finishing func3()?

This is done with the help of a stack. Before entering func2(), the system pushes the address to which it has to return onto a stack and then makes the call to func2(). Similarly, before entering func3(), the system pushes onto the stack the return address to which it will return after exiting from func3(). Once this is done, it makes the call to func3(). When a value is taken from a stack, the last value pushed is the first one popped, and hence the system returns exactly to the line after the one where it had left off and continues from there, i.e. it prints the line "I am again in function 2". Now the system again needs to know where it has to return. So it issues another pop, which gives the position at which it has to resume in func1(); func1() then prints "I am again in function 1" and terminates.
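The push and pop sequence described above can be imitated with an explicit stack. This is only an analogy and a sketch: a real system stores machine addresses on a hardware-supported call stack, whereas the code below stores descriptive strings in an array. The names returnStack, CallFrom(), ReturnToCaller() and SimulateCalls() are hypothetical and do not appear elsewhere in this unit.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_DEPTH 16

static const char* returnStack[MAX_DEPTH];   /* pending "return addresses" */
static int depth = 0;                        /* number of addresses currently stored */

/* Before a call: push the place where execution must resume afterwards. */
void CallFrom(const char* resumePoint) {
    assert(depth < MAX_DEPTH);
    returnStack[depth++] = resumePoint;
}

/* When a function ends: pop the most recently pushed resume point. */
const char* ReturnToCaller(void) {
    assert(depth > 0);
    return returnStack[--depth];
}

/* Replays the func1 -> func2 -> func3 sequence from the text. */
void SimulateCalls(void) {
    printf("I am in function 1\n");
    CallFrom("func1, just after the call to func2");
    printf("I am in function 2\n");
    CallFrom("func2, just after the call to func3");
    printf("I am in function 3\n");
    printf("(resume in %s)\n", ReturnToCaller());   /* func3 ends */
    printf("I am again in function 2\n");
    printf("(resume in %s)\n", ReturnToCaller());   /* func2 ends */
    printf("I am again in function 1\n");
}
```

Because the resume point for func2 was pushed last, it is popped first, which is exactly why func3() returns into func2() and not into func1().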

1.2

Programmatic representation of a stack

typedef struct node
{
    int iData;
    struct node* pNext;
} node;

node* pTop = NULL;

The node of a stack is very similar to the node in a list. The difference is only in the way the data is organized. As discussed, there are only 2 operations permitted on a stack: the Push and the Pop.

Push Operation

A push operation is for inserting an element into a stack. Elements are always inserted at the top of the stack, which is the beginning of the linked list.


Steps for the push operation 

Create a new node

Make the new node's next point to the node pointed to by Top

Make the top point to the new node

void push(int iData)
{
    node* pNewNode = (node *)malloc(sizeof(node));
    pNewNode->iData = iData;
    pNewNode->pNext = pTop;
    pTop = pNewNode;
}

Pop Operation

A pop operation is basically used for retrieving a value from the stack. This also removes the element from the stack.

int pop()
{
    int iData;
    node* pTempNode = NULL;
    if (pTop == NULL)
    {
        iData = -1;               /* -1 is returned when the stack is empty */
        printf("Stack Empty");
    }
    else
    {
        pTempNode = pTop;
        iData = pTop->iData;
        pTop = pTop->pNext;
        free(pTempNode);
    }
    return iData;
}

1.3

Implementation of Stacks

A stack is a special type of list, where only the element at one end can be accessed. Items can be "pushed" onto one end of the stack structure. New items are inserted before the others, as each old element moves down one position. The first element is referred to as the "top" item, and is the only item that may be accessed at any time. In order to access items that are further down the stack, they must be moved to the top by "popping" the appropriate number of items. Popping refers to removing the top element of a stack. This is referred to as a LIFO structure, "Last In, First Out". These rules make stacks very restricted in use; however, they are very efficient and much easier to implement than lists. The uses of stacks vary from programming a simple card game to maintaining the order of operations in a complex program. For example, a stack is useful in a management program where the newest tasks must be executed first. The node of a stack is usually presented with the following structure, which is very similar to that of a list node.

template <class ItemType>
struct StackNode
{
    ItemType data;
    StackNode<ItemType> *next;
};

Implementing a generic stack class, which can be modified to work in any type of programming situation is very easy to do. A definition of such a class is shown below.

template <class ItemType>
class Stack
{
public:
    Stack();                     //class constructor - initialize private variables
    ~Stack();                    //class destructor - free up used memory
    void push(const ItemType);   //add a new node to the top of the stack
    ItemType pop();              //remove the top node and return its contents
    ItemType top() const;        //return the top node without popping it
    void clear();                //delete all nodes in the stack
    bool IsEmpty() const;        //return true if the stack has no elements
    bool IsFull() const;         //return true if there is no free memory for new nodes
    int count() const;           //return the number of nodes on the stack
private:
    //pointer to the top node in stack. Note: C++ does not allow a data member and a
    //member function of the same class to share the name "top"; rename one of them
    //(for example, call this pointer topNode) before compiling.
    StackNode<ItemType> *top;
    int counter;                 //maintain the number of nodes in the stack
};


The class constructor sets counter to zero. Since there are no nodes in the stack when an instance of the class is first created, top is set to NULL.

template <class ItemType>
Stack<ItemType>::Stack()
{
    counter = 0;
    top = NULL;
}

The role of the destructor is to delete all nodes in the list, and return the memory they occupy to free store. Since the clear() function does this task, it can be called by the destructor.

template <class ItemType>
Stack<ItemType>::~Stack()
{
    clear();
}

The push() function inserts a new node on top of the stack, and sets the top pointer to reference this new node. First, we check if there is enough system memory to create a new node, and then create the node, assigning it to top. The node's value is set equal to item, and its next component is set to the node that was on top before the creation of the new node.

template <class ItemType>
void Stack<ItemType>::push(const ItemType item)
{
    assert(!IsFull());   //abort if there is not enough memory to create a new node
                         //(assert is declared in <cassert>)
    //create a new node on top of the others with value item.
    //set the original top node to follow the new one.
    StackNode<ItemType> *tmp = new StackNode<ItemType>;
    tmp->data = item;
    tmp->next = top;
    top = tmp;
    counter++;           //increment the number of nodes in the stack
}

The pop() function removes the top node from the stack (freeing up the memory it uses) and returns its value. top is set to the next node in the stack, and a temporary local variable (tmp) is created to point to the original top node. It is then used to reference the memory address of the node to delete. If we were to delete the node using top as a reference, the position of the next node in the stack would be lost, since top->next would no longer exist.

template <class ItemType>
ItemType Stack<ItemType>::pop()
{
    assert(!IsEmpty());               //abort if the stack is empty, no node to pop
    ItemType item = top->data;        //maintain top value, to be returned later
    StackNode<ItemType> *tmp = top;   //create a temporary reference to the top node
    top = top->next;                  //set top to be the next node
    delete tmp;                       //delete the top node
    counter--;                        //decrement the number of nodes in the stack
    return item;                      //return the original top value
}

The top() function simply returns the value of the top node, without any modifications to the stack.

template <class ItemType>
ItemType Stack<ItemType>::top() const
{
    assert(!IsEmpty());
    return top->data;
}

The clear() function deletes all nodes in the stack, and frees the memory they occupy. Each node in the stack is visited using a loop, which executes until it reaches a NULL reference. A temporary variable is used for the same reason as in the list implementation. If we were to delete a node using top as a reference, the position of the next node in the list would be lost, since top->next would no longer exist.

template <class ItemType>
void Stack<ItemType>::clear()
{
    StackNode<ItemType> *tmp;
    while(top != NULL)     //loop through every node in the stack
    {
        tmp = top;         //reference the top node
        top = top->next;   //set top to the next node
        delete tmp;        //delete the original top node
    }
    counter = 0;           //the stack is empty again, so count() must report 0
}

The IsEmpty() function returns true if the stack has no nodes. This task is accomplished very simply by checking to see if the top pointer is NULL.

template <class ItemType>
bool Stack<ItemType>::IsEmpty() const
{
    return (top == NULL);
}

The IsFull() function checks to see if there is enough system memory available to create a new node for the stack. It works exactly the same way as it did in the linked list class. A node is "created" using the new command, which is then evaluated. In standard C++ a plain new reports failure by throwing std::bad_alloc rather than by returning NULL, so the test below uses the non-throwing form new (std::nothrow), which is declared in <new>. If this form fails to set aside the necessary memory, its value is NULL, in which case the function returns true. If the new node is created successfully, it is deleted, and the function returns false.

template <class ItemType>
bool Stack<ItemType>::IsFull() const
{
    //std::nothrow makes new return NULL on failure instead of throwing
    StackNode<ItemType> *tmp = new (std::nothrow) StackNode<ItemType>;
    if(tmp == NULL)
        return true;
    else
    {
        delete tmp;
        return false;
    }
}

The count() function returns the number of nodes in the stack, which is maintained by the class' private counter member. Again, this value could be kept in a public member which the programmer can access directly; however, it is good practice to hide this value from the programmer, since a public member could be modified without any nodes being added or removed. Also, if we were to change the class implementation, a program using the stack class would not require any change, since all the new code would be written in the count() function.

template <class ItemType>
int Stack<ItemType>::count() const
{
    return counter;
}

1.4

Application of Stacks

Evaluating Expressions

In normal practice any arithmetic expression is written in such a way that the operator is placed between its operands. For example

(A + B) * (C + D)

Such expressions are said to be in infix notation.

Polish notation refers to the notation in which the operator is placed before the operands. For example the above expression can be written as

*+AB+CD

The idea is that whenever we write expressions in this notation, parentheses are not required for determining the priority of the operations. Let us see the steps involved in the conversion

(A+B)*(C+D) = (+AB)*(+CD) = *+AB+CD

Reverse polish notation is exactly the opposite of polish notation i.e. the operator is always placed after the operands. For example, the above expression can be written in Reverse Polish Notation as

(A+B) * (C+D) = (AB+) * (CD+) = AB+CD+*


Whenever an expression is evaluated by the system, it usually does so in 2 steps. First it converts the expression into prefix or postfix notation, and then it evaluates the converted expression, which makes the job much simpler.

Converting an infix expression to postfix or prefix expression makes use of stacks extensively.

The following is the algorithm for doing this conversion. A complete program requires a lot more than what is described in the algorithm.

1.4.1

PostFixExpression ConvertToPolishNotation(InfixExpression)

{
1. Push "(" onto the STACK and add ")" to the end of the InfixExpression
2. Scan the InfixExpression from left to right and repeat steps 3 to 6 for each element of InfixExpression until the STACK is empty
3. If an operand is encountered, add it to the Postfix Expression
4. If a left parenthesis is encountered, push it onto the STACK
5. If an operator XX is encountered
   a. Repeatedly pop from the STACK, and add to the Postfix Expression, each operator that has the same or higher precedence than XX
   b. Push XX onto the STACK
6. If a right parenthesis is encountered, then
   a. Repeatedly pop from the STACK and add to the Postfix Expression until a left parenthesis is encountered
   b. Remove the left parenthesis
7. Exit
}

1.4.2

Converting infix expression to postfix notation using Stack

To convert an infix expression into postfix notation using a stack.

Algorithm:
Step 1: Read the expression from left to right.
Step 2: If the input symbol read is "(" then push it onto the stack.
Step 3: If the input symbol read is an operand then place it in the postfix expression.
Step 4: If the input symbol read is an operator then
    a) Check whether the operator on top of the stack has the same or greater precedence than the operator read. If so, remove that operator from the stack and place it in the postfix expression. Repeat Step 4(a) until the operator on top of the stack has lower precedence than the operator being read.
    b) Then push the operator being read onto the stack.
Step 5: If the input symbol read is a closing parenthesis ")" then pop the operators from the stack and place them in the postfix expression until the opening parenthesis is popped. The "(" should not be placed in the postfix expression.
Step 6: Finally print the postfix expression.

CODING:

#include <stdio.h>

#define MAX 40

char inf[MAX], post[MAX];
char st[MAX];      /* operator stack */
int top = -1;      /* index of the top of st; -1 means the stack is empty */

void postfix(void);
void push(char);
char pop(void);
int prec(char);

int main(void)
{
    printf("\n Enter the infix Expression::\n");
    if (scanf("%39s", inf) != 1)
        return 1;
    postfix();
    return 0;
}

/* Precedence of a symbol; "(" ranks lowest so an operator never pops it */
int prec(char op)
{
    switch (op)
    {
    case '^': return 3;
    case '*':
    case '/': return 2;
    case '+':
    case '-': return 1;
    default:  return 0;
    }
}

void postfix(void)
{
    int i, j = 0;
    for (i = 0; inf[i] != '\0'; i++)
    {
        switch (inf[i])
        {
        case '+':
        case '-':
        case '*':
        case '/':
        case '^':
            /* pop operators of the same or higher precedence, then push this one */
            while (top >= 0 && prec(st[top]) >= prec(inf[i]))
                post[j++] = pop();
            push(inf[i]);
            break;
        case '(':
            push('(');
            break;
        case ')':
            while (top >= 0 && st[top] != '(')
                post[j++] = pop();
            if (top >= 0)
                top--;            /* discard the "(" */
            break;
        default:
            post[j++] = inf[i];   /* operand */
        }
    }
    while (top >= 0)
        post[j++] = pop();
    post[j] = '\0';
    printf("\n \t Postfix expression is => \n\n\t\t %s", post);
}

void push(char ele)
{
    top++;
    st[top] = ele;
}

char pop(void)
{
    char el;
    el = st[top];
    top--;
    return (el);
}

OUTPUT: Enter the infix Expression:: (A+B)*(C-D)

Postfix expression is =>

AB+CD-*

1.5

Check your progress

9. How is Stack useful in recursion? Explain.

10. Explain Push and Pop operation of Stack?

11. Write algorithm to convert infix expression to postfix expression.

12. Write C++ program to convert postfix expression to infix expression.


Unit 2 Queues

Structure

2.0

Unit Objective

2.1

Introduction

2.2

Programmatic representation of a Queue

2.3

Deque

2.4

Application of Queues

2.5

Priority Queues

2.6

C++ Implementation of Queues

2.7

Implementation of Double ended queue (Dequeue)

2.8

Check your progress

2.0 Unit Objective

After going through this unit, you should be able to understand:

Queue definition
Basic operations on Queue
Dequeue concepts
Priority Queues
C++ implementation of Queues
C++ implementation of Dequeue

2.1 Introduction

A queue is an ordered list in which all insertions take place at one end, called the rear, and all deletions take place at the other end, called the front. The two operations possible on a queue are insertion and deletion.

A real-life example of a queue is people standing in line at the billing counter of a shop. The first person in the queue is the first to be served. Similarly, the first element inserted into the queue is the first one retrieved, and hence a queue is also called a First In First Out (FIFO) list.

2.2 Programmatic representation of a Queue

The node of a queue has two parts. The first part contains the data and the second part contains the address of the next node. This is very similar to a linear list representation.

In a queue, however, insertions and deletions occur at two different ends. If we had only one pointer, called START, then every insertion would need to traverse the complete queue, because insertions always happen at the end; this would be time consuming.

To avoid this, we use two pointers to represent a queue. One pointer, called FRONT, always points to the first element and is used for deletions; the other, called REAR, always points to the last element in the queue and is used for insertions.

typedef struct node
{
    int iData;
    struct node* pNext;
} node;

node* pFRONT = NULL, *pREAR = NULL;


When the queue is empty, both FRONT and REAR will be pointing to NULL. When there is only one element in the queue, then both FRONT and REAR will be pointing to the same element.

Steps for inserting an element into a Queue

Inserting into a queue can happen only at the REAR end.
1. Create a new node.
2. Make the next of the new node NULL, as it is always the last element.
3. If the queue is empty, make FRONT point to the new node.
4. Otherwise, make the previous node's next (the node pointed to by REAR is always the previous node) point to the new node.
5. Make REAR point to the new node.

/* getNode() is assumed to allocate a node, store iData in it and set its pNext to NULL */
void QInsert(int iData)
{
    node* pTempNode = getNode(iData);
    if (pFRONT == NULL) {
        /* If the queue is empty */
        pFRONT = pTempNode;
    } else {
        pREAR->pNext = pTempNode;
    }
    pREAR = pTempNode;
}

Steps for deleting an element from the queue

Deletion is the only way to retrieve values from a queue. Along with retrieving the value, it removes the entry from the queue. Deletions always happen at the FRONT end.
1. Store the node pointed to by FRONT in a temporary pointer.
2. Make FRONT point to FRONT's next.
3. Delete the node pointed to by the temporary pointer.
4. If all the nodes have been deleted from the queue, make both FRONT and REAR NULL.


int QDelete()
{
    int iData = -1;   /* value returned when the queue is empty */
    node* pDelNode = NULL;
    if (pFRONT == NULL) {
        printf("Queue Empty");
    } else {
        iData = pFRONT->iData;
        pDelNode = pFRONT;
        pFRONT = pFRONT->pNext;
        if (pFRONT == NULL) {
            pREAR = NULL;   /* the last node was removed */
        }
        free(pDelNode);
    }
    return iData;
}
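The insertion and deletion steps of this section can be put together into one small self-contained sketch. It is written in C++ rather than C, wraps the two pointers in a struct (LinkedQueue is a hypothetical name used only here), and reports an empty queue through a boolean return value instead of the -1 sentinel. These are illustrative choices, not part of the listings above.

```cpp
#include <cassert>

// One queue node: the data plus the address of the next node.
struct Node {
    int iData;
    Node* pNext;
};

// Hypothetical wrapper holding the FRONT and REAR pointers.
struct LinkedQueue {
    Node* pFront = nullptr;
    Node* pRear = nullptr;

    // Insert at the REAR end; no traversal is needed.
    void insert(int iData) {
        Node* pNew = new Node{iData, nullptr};
        if (pFront == nullptr) {
            pFront = pNew;         // empty queue: the new node is also the front
        } else {
            pRear->pNext = pNew;   // link the old last node to the new one
        }
        pRear = pNew;
    }

    // Delete from the FRONT end; returns false if the queue is empty.
    bool remove(int& out) {
        if (pFront == nullptr) return false;
        Node* pDel = pFront;
        out = pDel->iData;
        pFront = pFront->pNext;
        if (pFront == nullptr) pRear = nullptr;   // the queue became empty
        delete pDel;
        return true;
    }

    // Release any nodes that are still queued.
    ~LinkedQueue() {
        int ignored;
        while (remove(ignored)) {}
    }
};
```

Inserting 10, 20 and 30 and then calling remove() three times yields the values in the same order, which is the FIFO behaviour described in the introduction.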

2.3 Deque

A deque is a queue in which insertions and deletions can happen at both ends. A deque, or double-ended queue, is a data structure which unites the properties of a queue and a stack. As with a stack, items can be pushed into the deque; once inserted, the last item pushed in may be extracted from one side (popped, as in a stack), and the first item pushed in may be pulled out of the other side (as in a queue).

Implementation of Deque

The push (insert) and pop operations can be performed at both ends of the deque, that is, at its start and at its end. The following pictures show how a deque is formed based on this change in algorithm. Initially the base and end pointers point to NULL or 0 (zero).

We define two pointers, p_base and p_end, to keep track of the front and back of the deque. Initially, when the deque object is created, both p_base and p_end point to NULL. When the first node is created, p_base points to it and p_end points to the same node as p_base. The next and previous pointers of this node are NULL or 0 (zero). Further insertions happen at p_end.
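The description above can be sketched as a doubly linked list with the two pointers p_base and p_end. The struct and member function names (DNode, Deque, push_back and so on) are assumptions made for this illustration, and the sketch also allows insertion at p_base so that both ends are usable.

```cpp
#include <cassert>

// A node with both a previous and a next pointer.
struct DNode {
    int value;
    DNode* prev;
    DNode* next;
};

struct Deque {
    DNode* p_base = nullptr;   // front of the deque
    DNode* p_end = nullptr;    // back of the deque

    bool empty() const { return p_base == nullptr; }

    void push_back(int v) {
        DNode* n = new DNode{v, p_end, nullptr};
        if (empty()) p_base = n; else p_end->next = n;
        p_end = n;
    }

    void push_front(int v) {
        DNode* n = new DNode{v, nullptr, p_base};
        if (empty()) p_end = n; else p_base->prev = n;
        p_base = n;
    }

    int pop_front() {
        assert(!empty());
        DNode* n = p_base;
        int v = n->value;
        p_base = n->next;
        if (p_base != nullptr) p_base->prev = nullptr; else p_end = nullptr;
        delete n;
        return v;
    }

    int pop_back() {
        assert(!empty());
        DNode* n = p_end;
        int v = n->value;
        p_end = n->prev;
        if (p_end != nullptr) p_end->next = nullptr; else p_base = nullptr;
        delete n;
        return v;
    }

    ~Deque() { while (!empty()) pop_front(); }
};
```

After push_back(1), push_back(2), push_back(3), a call to pop_back() returns 3 (stack behaviour) while pop_front() returns 1 (queue behaviour).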

2.4 Application of Queues

The most common occurrence of a queue in computer applications is the scheduling of print jobs.

For example, if several people send print requests, all the requests are queued up at the printer and are processed on a first come, first served basis.

Priority queues are used for job scheduling by the operating system. The operating system assigns a priority to every process currently running in the system. The jobs with the highest priority are taken up for processing first; once all the jobs at that priority are finished, the operating system looks for jobs at the next priority, and so on.

Performance

Queues are dynamic collections which have some concept of order. This can be based on order of entry into the queue - giving us First-In-First-Out (FIFO) or Last-In-First-Out (LIFO) queues. Both of these can be built with linked lists: the simplest "add-to-head" implementation of a linked list gives LIFO behaviour. A minor modification - adding a tail pointer and adjusting the addition method implementation - will produce a FIFO queue.

A straightforward analysis shows that for both these cases, the time needed to add or delete an item is constant and independent of the number of items in the queue. Thus we class both addition and deletion as an O(1) operation. For any given real machine+operating system+language combination, addition may take c1 seconds and deletion c2 seconds, but we aren't interested in the value of the constant, it will vary from machine to machine, language to language, etc. The key point is that the time is not dependent on n - producing O(1) algorithms.
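A minimal sketch of the two variants follows, assuming a singly linked list of integers; the names Item, List, add_to_head, add_to_tail and remove_head are hypothetical. Each operation touches a fixed number of pointers regardless of the list length, which is why it is classed as O(1).

```cpp
#include <cassert>

struct Item {
    int value;
    Item* next;
};

struct List {
    Item* head = nullptr;
    Item* tail = nullptr;

    // Add-to-head: combined with remove_head this gives LIFO behaviour.
    void add_to_head(int v) {
        head = new Item{v, head};
        if (tail == nullptr) tail = head;
    }

    // Add-to-tail using the tail pointer: combined with remove_head this gives FIFO behaviour.
    void add_to_tail(int v) {
        Item* n = new Item{v, nullptr};
        if (tail == nullptr) head = n; else tail->next = n;
        tail = n;
    }

    // Removal always happens at the head.
    int remove_head() {
        assert(head != nullptr);
        Item* n = head;
        int v = n->value;
        head = n->next;
        if (head == nullptr) tail = nullptr;
        delete n;
        return v;
    }

    ~List() { while (head != nullptr) remove_head(); }
};
```

Adding 1, 2, 3 with add_to_head and removing gives 3 first; adding the same values with add_to_tail and removing gives 1 first.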

Once we have written an O(1) method, there is generally little more that we can do from an algorithmic point of view. Occasionally, a better approach may produce a lower constant time. Often, enhancing our compiler, run-time system, machine, etc. will produce some significant improvement. However, O(1) methods are already very fast, and it's unlikely that effort expended in improving such a method will produce much real gain!

2.5 Priority Queues

Often the items added to a queue have a priority associated with them: this priority determines the order in which they exit the queue - highest priority items are removed first.

This situation arises often in process control systems. Imagine the operator's console in a large automated factory. It receives many routine messages from all parts of the system: they are assigned a low priority because they just report the normal functioning of the system - they update various parts of the operator's console display simply so that there is some confirmation that there are no problems. It will make little difference if they are delayed or lost.

A priority queue is the one in which each element will have a priority associated with it. The element with the highest priority is the one that will be processed/deleted first. If two or more nodes have the same priority then they will be processed in the same order as they were entered in to the queue.

However, occasionally something breaks or fails and alarm messages are sent. These have high priority because some action is required to fix the problem (even if it is mass evacuation because nothing can stop the imminent explosion!).

Typically such a system will be composed of many small units, one of which will be a buffer for messages received by the operator's console. The communications system places messages in the buffer so that communications links can be freed for further messages while the console software is processing the message. The console software extracts messages from the buffer and updates appropriate parts of the display system. Obviously we want to sort messages on their priority so that we can ensure that the alarms are processed immediately and not delayed behind a few thousand routine messages while the plant is about to explode.
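The message buffer described above can be sketched with the standard library's std::priority_queue. This is only an illustration under stated assumptions: the names Message, LowerUrgency and MessageBuffer are hypothetical, a larger priority value is taken to mean a more urgent message, and an arrival counter is added so that messages of equal priority leave in the order in which they entered, as required by the definition of a priority queue given earlier.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

struct Message {
    int priority;        // assumption: larger value = more urgent
    unsigned long seq;   // arrival order, used to break ties
    std::string text;
};

// Ordering for std::priority_queue: returns true when a must leave after b.
struct LowerUrgency {
    bool operator()(const Message& a, const Message& b) const {
        if (a.priority != b.priority) return a.priority < b.priority;
        return a.seq > b.seq;   // among equals, the earlier arrival leaves first
    }
};

class MessageBuffer {
public:
    // Called by the communications system.
    void put(int priority, const std::string& text) {
        heap_.push(Message{priority, next_seq_++, text});
    }

    // Called by the console software; returns the most urgent message.
    std::string take() {
        assert(!heap_.empty());
        std::string text = heap_.top().text;
        heap_.pop();
        return text;
    }

    bool empty() const { return heap_.empty(); }

private:
    std::priority_queue<Message, std::vector<Message>, LowerUrgency> heap_;
    unsigned long next_seq_ = 0;
};
```

With two routine messages, one alarm and a third routine message placed in the buffer in that order, take() returns the alarm first and then the routine messages in their arrival order.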

As we have seen, we could use a tree structure - which generally provides O(logn) performance for both insertion and deletion. Unfortunately, if the tree becomes unbalanced, performance will degrade to O(n) in pathological cases. This will probably not be acceptable when dealing with dangerous industrial processes, nuclear reactors, flight control systems and other life-critical systems.

The great majority of computer systems would fall into the broad class of information systems which simply store and process information for the benefit of people who make decisions based on that information. Obviously, in such systems, it usually doesn't matter whether it takes 1 or 100 seconds to retrieve a piece of data - this simply determines whether you take your coffee break now or later. However, as we'll see, using the best known algorithms is usually easy and straightforward: if they're not already coded in libraries, they're in text-books. You don't even have to work out how to code them! In such cases, it's just your reputation that's going to suffer if someone (who has studied his or her algorithms text!) comes along later and says

"Why on earth did X (you!) use this O(n^2) method - there's a well known O(n) one!"

2.6 C++ Implementation of Queues

A queue is a special type of list structure. Elements can only be inserted at the back of a queue, and only the front element can be accessed and modified. The structure of a queue is the same as that of a line of people. A person who wishes to stand in line must go to the back, and the person at the front of the line is served. Thus, a queue is a FIFO structure, "First In, First Out". A generic queue is very simple to implement; a class definition for a linked queue is shown below.

#include <cassert>   //assert()
#include <cstddef>   //NULL
#include <new>       //std::nothrow

template <class ItemType>
struct QueNode
{
    ItemType data;
    QueNode<ItemType> *next;
};

template <class ItemType>
class Queue
{
public:
    Queue();                       //class constructor - initialize variables
    ~Queue();                      //class destructor - return memory used by queue elements
    void enqueue(const ItemType);  //add an item to the back of the queue
    ItemType dequeue();            //remove the first item from the queue and return its value
    ItemType first() const;        //return the value of the first item in the queue without modification to the structure
    void clear();                  //delete all nodes; called by the destructor (its definition is not shown in this unit)
    bool IsEmpty() const;          //returns true if there are no elements in the queue
    bool IsFull() const;           //returns true if there is no system memory for a new queue node
    int length() const;            //returns the number of elements in the queue
private:
    QueNode<ItemType> *front;
    QueNode<ItemType> *back;
    int len;
};

The front pointer references the first node in the queue, and the back pointer references the last node in the queue. It is possible to maintain only a front pointer: the last node in the list points to a NULL value, and can easily be found. However, such a design would be inefficient, since finding the last node every time its location is needed is very time consuming. Therefore, we maintain a reference to it in our class implementation.

The class constructor initializes the private data members.

template <class ItemType>
Queue<ItemType>::Queue()
{
    front = NULL;
    back = NULL;
    len = 0;
}

The destructor deletes all nodes, freeing up the memory they used. The clear() function is called to do this operation.

template <class ItemType>
Queue<ItemType>::~Queue()
{
    clear();
}

The enqueue() function adds a new item to the back of the queue. The algorithm differs slightly, depending on whether the queue is empty or not. If the queue is not empty, the last node is set to point to the newly created node, and the back pointer is set to reference the new node. The value of the new node is item, and its next pointer has a value of NULL. If the queue is empty, a similar procedure is used. However, the front pointer is also set to reference the newly created node. Since there is only one node in the queue, the front and back are one and the same.

template <class ItemType>
void Queue<ItemType>::enqueue(const ItemType item)
{
    assert(!IsFull());  //abort if there is no more memory for a new node
    if(len != 0)  //if the queue is not empty
    {
        back->next = new QueNode<ItemType>;  //create a new node
        back = back->next;  //set the new node as the back node
        back->data = item;
        back->next = NULL;
    }
    else
    {
        back = new QueNode<ItemType>;  //create a new node
        back->data = item;
        back->next = NULL;
        front = back;  //set front to reference the new node. Since it is the only node in the queue, it is both the back and the front.
    }
    len++;  //increment the number of elements in the queue
}

The dequeue() function removes the node at the front of the queue and returns its value. The value of the front node is stored in item. A temporary local pointer is then created to reference the front node, and the front pointer is set to the next element in the queue. The front node is deleted using tmp as a reference. The function then checks if the queue is empty by evaluating front. If it has a value of NULL, the queue is empty, and the back pointer must also be set to NULL since it still holds the address of the now deleted node.

template <class ItemType>
ItemType Queue<ItemType>::dequeue()
{
    assert(!IsEmpty());  //abort if the queue is empty, no node to dequeue
    ItemType item = front->data;  //store the value of the first node, to be returned at the end
    QueNode<ItemType> *tmp = front;  //temporary pointer to the first node
    front = front->next;  //set the second node in the queue as the new front
    delete tmp;  //delete the original first node
    if(front == NULL)  //if the queue is empty, update the back pointer
        back = NULL;
    len--;  //decrement the number of nodes in the queue
    return item;  //return the value of the original first element
}

The first() function returns the value of the front node without modifying the queue.

template <class ItemType>
ItemType Queue<ItemType>::first() const
{
    assert(!IsEmpty());  //abort if the queue is empty
    return front->data;
}

The IsEmpty() function checks to see if there are any nodes in the queue by evaluating the front pointer. If the queue is empty, front has a value of NULL.

template <class ItemType>
bool Queue<ItemType>::IsEmpty() const
{
    return (front == NULL);
}

The IsFull() function works exactly the same way as it did with other data structures. A node is "created" using the new command, and the result is then checked. If new failed to set aside the necessary memory, the resulting pointer is NULL, in which case the function returns true. (The std::nothrow form of new is used here because a plain new signals failure by throwing std::bad_alloc instead of returning NULL.) If the new node is created successfully, it is deleted and the function returns false.

template <class ItemType>
bool Queue<ItemType>::IsFull() const
{
    QueNode<ItemType> *tmp = new (std::nothrow) QueNode<ItemType>;
    if(tmp == NULL)
        return true;
    else
    {
        delete tmp;
        return false;
    }
}

The length() function returns the value of len, which maintains the number of nodes in the queue. Again, a function is used to retrieve this value instead of making it a public data member, to avoid error and keep our class abstract.

template <class ItemType>
int Queue<ItemType>::length() const
{
    return len;
}
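The destructor in this section relies on a clear() function whose body is not given in this unit. The following condensed sketch shows one way it might be written, assuming that it simply deletes nodes from the front until none remain. To keep the example self-contained the class is repeated in a shortened form with the member functions defined inline; the disabled copy operations are an extra precaution added here, not something the text prescribes.

```cpp
#include <cassert>
#include <cstddef>

template <class ItemType>
struct QueNode {
    ItemType data;
    QueNode<ItemType>* next;
};

template <class ItemType>
class Queue {
public:
    Queue() : front(NULL), back(NULL), len(0) {}
    ~Queue() { clear(); }

    void enqueue(const ItemType& item) {
        QueNode<ItemType>* node = new QueNode<ItemType>;
        node->data = item;
        node->next = NULL;
        if (len != 0) back->next = node;   // non-empty: link after the old back
        else front = node;                 // empty: the node is also the front
        back = node;
        len++;
    }

    ItemType dequeue() {
        assert(!IsEmpty());
        ItemType item = front->data;
        QueNode<ItemType>* tmp = front;
        front = front->next;
        delete tmp;
        if (front == NULL) back = NULL;
        len--;
        return item;
    }

    // Assumed implementation: delete nodes from the front until none remain.
    void clear() {
        while (front != NULL) {
            QueNode<ItemType>* tmp = front;
            front = front->next;
            delete tmp;
        }
        back = NULL;
        len = 0;
    }

    bool IsEmpty() const { return front == NULL; }
    int length() const { return len; }

private:
    QueNode<ItemType>* front;
    QueNode<ItemType>* back;
    int len;
    Queue(const Queue&);              // copying is not supported in this sketch
    Queue& operator=(const Queue&);
};
```

After clear() the queue can be reused: enqueueing a new value and dequeuing it returns that value, and length() is 0 again afterwards.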

2.7 Implementation of Double ended queue (Dequeue)

To implement various operations on a double ended queue.

Algorithm:
Step 1: Start the Process
Step 2: Initialize and declare variables
Step 3: Enter the Choice
Step 4: If choice is ENQUEUE at FRONT then
a) Check if dequeue is full
b) else check for FRONT at first position
c) else decrement the FRONT position
Step 5: If choice is ENQUEUE at REAR then
a) Check if dequeue is full
b) else check for REAR at last position
c) else increment the REAR position
Step 6: If choice is DEQUEUE at FRONT then
a) Check if dequeue is empty
b) else check whether the dequeue contains only one element
c) else increment the FRONT position
Step 7: If choice is DEQUEUE at REAR then
a) Check if dequeue is empty
b) else check whether the dequeue contains only one element
c) else decrement the REAR position
Step 8: Stop the Process.

CODING:

#include <stdio.h>
#include <stdlib.h>

#define QSIZE 5

void display(void);
int isEmpty(void);
int isFull(void);
int enqueueRear(int value);
int dequeueRear(int *value);
int enqueueFront(int value);
int dequeueFront(int *value);
int size(void);
void view(void);

int queue[QSIZE], front = -1, rear = -1;

int main(void)
{
    int status, ch, data;
    display();
    while(1)
    {
        printf("\n Enter the Choice");
        scanf("%d", &ch);
        switch(ch)
        {
            case 0:
                display();
                break;
            case 1:
                printf("\n Enter the Element");
                scanf("%d", &data);
                status = enqueueFront(data);
                if(status == -1)
                    printf("Dequeue overflow at Front...");
                break;
            case 2:
                printf("\n Enter the element");
                scanf("%d", &data);
                status = enqueueRear(data);
                if(status == -1)
                    printf("Dequeue overflow at Rear...");
                break;
            case 3:
                status = dequeueFront(&data);
                if(status == -1)
                    printf("Dequeue Underflow at Front...");
                else
                    printf("\n The dequeued value is %d", data);
                break;
            case 4:
                status = dequeueRear(&data);
                if(status == -1)
                    printf("Dequeue Underflow at Rear...");
                else
                    printf("\n The dequeued value is %d", data);
                break;
            case 5:
                printf("Number of elements in dequeue is %d", size());
                break;
            case 6:
                view();
                break;
            default:   /* 7 or any other choice ends the program */
                printf("\n end of Run of your program.......");
                exit(0);
        }
    }
}

void display(void)
{
    printf("\n Representation of Dequeue ");
    printf("\n\t 0.View Menu");
    printf("\n\t 1.Enqueue at front");
    printf("\n\t 2.Enqueue at rear");
    printf("\n\t 3.Dequeue at front");
    printf("\n\t 4.Dequeue at rear");
    printf("\n\t 5.Size of the queue");
    printf("\n\t 6.view");
    printf("\n\t 7.Exit");
}

int isEmpty(void)
{
    if(front == -1 && rear == -1)
        return 1;
    else
        return 0;
}

/* "Full" here means that REAR has reached the last position of the array */
int isFull(void)
{
    if(rear == (QSIZE - 1))
        return 1;
    else
        return 0;
}

int enqueueFront(int value)
{
    if(isEmpty())
        front = rear = 0;
    else if(front == 0)   /* FRONT is at the first position: no room in front */
        return -1;
    else
        front = front - 1;
    queue[front] = value;
    return 0;
}

int enqueueRear(int value)
{
    if(isEmpty())
        front = rear = 0;
    else if(isFull())
        return -1;
    else
        rear = rear + 1;
    queue[rear] = value;
    return 0;
}

int dequeueFront(int *value)
{
    if(isEmpty())
        return -1;
    *value = queue[front];
    if(front == rear)   /* only one element: the dequeue becomes empty */
        front = rear = -1;
    else
        front = front + 1;
    return 0;
}

int dequeueRear(int *value)
{
    if(isEmpty())
        return -1;
    *value = queue[rear];
    if(front == rear)   /* only one element: the dequeue becomes empty */
        front = rear = -1;
    else
        rear = rear - 1;
    return 0;
}

int size(void)
{
    if(isEmpty())
        return 0;
    return (rear - front + 1);
}

void view(void)
{
    int f;
    if(isEmpty())
    {
        printf("\n Dequeue is Empty");
        return;
    }
    printf("\n Content of the Dequeue is...\n FRONT ->");
    for(f = front; f != rear; f = f + 1)
        printf("%d-->", queue[f]);
    printf("%d-->REAR", queue[f]);
    if(isFull())
        printf("\n Dequeue is Full");
}
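In the array version above, an insertion at the front fails whenever FRONT is at the first position, even if there are free slots near the rear. A common alternative is to treat the array as circular so that both ends wrap around. The following is a minimal sketch of that idea in C++, not the implementation used in this unit; the struct name CircularDeque and the use of a front index plus an element count are assumptions made for the example.

```cpp
#include <cassert>

const int QSIZE = 5;

struct CircularDeque {
    int data[QSIZE];
    int front = 0;   // index of the first element
    int count = 0;   // number of stored elements

    bool isEmpty() const { return count == 0; }
    bool isFull() const { return count == QSIZE; }

    bool enqueueFront(int value) {
        if (isFull()) return false;
        front = (front + QSIZE - 1) % QSIZE;     // step back, wrapping around
        data[front] = value;
        ++count;
        return true;
    }

    bool enqueueRear(int value) {
        if (isFull()) return false;
        data[(front + count) % QSIZE] = value;   // first free slot after the rear
        ++count;
        return true;
    }

    bool dequeueFront(int& value) {
        if (isEmpty()) return false;
        value = data[front];
        front = (front + 1) % QSIZE;
        --count;
        return true;
    }

    bool dequeueRear(int& value) {
        if (isEmpty()) return false;
        value = data[(front + count - 1) % QSIZE];
        --count;
        return true;
    }
};
```

With this layout the deque reports overflow only when all QSIZE slots are occupied, whichever end is used for insertion.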

2.8 Check your progress

1. Describe the basic operations on a Queue with C++ code examples.

2. Differentiate between Queue and Dequeue. What operation-level changes are required in a Queue to make it a Dequeue?

3. Give examples of applications of Queue, Dequeue and priority queues.

4. Write C++ code for basic insertions and deletions in a queue.

Unit 3 Binary Trees

Structure

3.0 Unit Objective
3.1 Introduction
3.2 Binary Trees
3.3 Analysis Complete Trees
3.4 General binary trees
3.5 Unbalanced Trees
3.6 C++ implementation of Binary Tree
3.7 Check your progress

3.0 Unit Objective

After going through this unit, you should be able to understand:

Concept of Tree
Definition and structure of Binary tree
Binary tree types like balanced, unbalanced
Binary tree traversals
Binary tree metrics like height etc.
C++ implementation of Binary tree

3.1 Introduction

Binary trees are different from the three structures we have covered so far. Lists, stacks, and queues are all linear structures; that is, their elements logically follow one another.

A binary tree structure contains a root node, which is the first node in the structure. The root points to at most two other nodes, its left and right children. The root is considered to be the parent of these nodes. Each child is also the root of a sub-tree, since it can have one or two children of its own. If a node has no children, it is referred to as a leaf node.

Each node in the tree also has a level associated with it. The root node is at level 0, and the level increases by one with each row of nodes below the root.

Binary trees have many different basic implementations. An array implementation is often used, where space is set aside as if every level were completely filled. In larger trees, this can be a very big waste of space. For our demonstration, we will create a generic class using dynamic memory allocation. This particular implementation was created by the author using a mixture of possible approaches. It is very effective in explaining the concepts behind binary trees.

The binary tree class gives the programmer complete control over the tree. Nodes may be removed and inserted at any location in the tree. The class allows the user to traverse the tree by keeping a current pointer, just as in the linked list class. The programmer can then use the functions left(), right(), and parent() to move from one node to another. The class also allows the user to display the tree in in-order, post-order, and pre-order. The "order" refers to how the nodes are displayed. For instance, in pre-order, a node's value is displayed, then the value of its left child, followed by the right child. In the case of in-order, the node's value is displayed between the values of its left and right children. In post-order, the node's children are displayed before it. The implementation of the binary tree class is given in section 3.6. Although it may look intimidating at first, the code is very easy to follow. The purpose and code behind each function are explained following the definition. You will note that most functions are programmed using recursion. Since each node is actually a tree within itself, using recursion is the easiest approach.
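The three display orders described above can be sketched with three short recursive functions. This is an illustration only: it uses a plain node struct holding an int and collects the values in a std::vector instead of printing them, and the function names preorder, inorder and postorder are chosen for this example rather than taken from the class presented later.

```cpp
#include <cassert>
#include <vector>

struct TreeNode {
    int data;
    TreeNode* left;
    TreeNode* right;
};

// Pre-order: the node first, then its left sub-tree, then its right sub-tree.
void preorder(const TreeNode* n, std::vector<int>& out) {
    if (n == nullptr) return;
    out.push_back(n->data);
    preorder(n->left, out);
    preorder(n->right, out);
}

// In-order: the left sub-tree, then the node, then the right sub-tree.
void inorder(const TreeNode* n, std::vector<int>& out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->data);
    inorder(n->right, out);
}

// Post-order: both sub-trees first, then the node itself.
void postorder(const TreeNode* n, std::vector<int>& out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->data);
}
```

For a root holding 2 with left child 1 and right child 3, the three functions produce 2 1 3, 1 2 3 and 1 3 2 respectively.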

3.2 Binary Trees

The simplest form of tree is a binary tree. A binary tree consists of
a) a node (called the root node) and
b) left and right sub-trees.

Both the sub-trees are themselves binary trees. You now have a recursively defined data structure. (It is also possible to define a list recursively.)

Figure: A binary tree

The nodes at the lowest levels of the tree (the ones with no sub-trees) are called leaves. In an ordered binary tree,
1. the keys of all the nodes in the left sub-tree are less than that of the root,
2. the keys of all the nodes in the right sub-tree are greater than that of the root,
3. the left and right sub-trees are themselves ordered binary trees.
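The three conditions for an ordered binary tree can be checked recursively. The sketch below assumes integer keys and a hypothetical function name isOrdered; it passes down the open interval in which every key of a sub-tree must lie, so that condition 1 and condition 2 are enforced against all ancestors and not only against the immediate parent.

```cpp
#include <cassert>
#include <climits>

struct TreeNode {
    int key;
    TreeNode* left;
    TreeNode* right;
};

// True if every key in the sub-tree lies strictly between lo and hi
// and the sub-tree itself is ordered.
bool isOrdered(const TreeNode* n, long long lo, long long hi) {
    if (n == nullptr) return true;                  // an empty tree is ordered
    if (n->key <= lo || n->key >= hi) return false;
    return isOrdered(n->left, lo, n->key)           // left keys must be less than this key
        && isOrdered(n->right, n->key, hi);         // right keys must be greater
}

bool isOrdered(const TreeNode* root) {
    return isOrdered(root, LLONG_MIN, LLONG_MAX);
}
```

A root 5 with children 1 and 9 passes the check; hanging a node with key 7 below the 1 makes it fail, because 7 then sits in the left sub-tree of 5.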

3.3 Analysis Complete Trees

Before we look at more general cases, let's make the optimistic assumption that we've managed to fill our tree neatly, i.e. that each leaf is the same 'distance' from the root.

This forms a complete tree, whose height is defined as the number of links from the root to the deepest leaf.

First, we need to work out how many nodes, n, we have in such a tree of height h. Now,

$n = 1 + 2^1 + 2^2 + \dots + 2^h$

from which we have

$n = 2^{h+1} - 1$ and $h = \lfloor \log_2 n \rfloor$.

Examination of the Find method shows that in the worst case, h+1 or ceiling( log2 n ) comparisons are needed to find an item. This is the same as for binary search. However, Add also requires ceiling( log2 n ) comparisons to determine where to add an item. Actually adding the item takes a constant number of operations, so we say that a binary tree requires O(log n) operations for both adding and finding an item - a considerable improvement over binary search on a sorted array (where inserting a new item can require moving O(n) elements) for a dynamic structure which often requires addition of new items. Deletion is also an O(log n) operation.
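The relation between the height and the number of nodes of a complete tree can be checked with a small experiment. The sketch below uses hypothetical helper functions (buildComplete, countNodes, height, destroy) and takes the height to be the number of links from the root to the deepest leaf, as defined in this section.

```cpp
#include <cassert>

struct TreeNode {
    TreeNode* left;
    TreeNode* right;
};

// Build a tree in which every leaf is exactly h links below the root.
TreeNode* buildComplete(int h) {
    if (h < 0) return nullptr;
    return new TreeNode{buildComplete(h - 1), buildComplete(h - 1)};
}

int countNodes(const TreeNode* n) {
    return n == nullptr ? 0 : 1 + countNodes(n->left) + countNodes(n->right);
}

// Height in links; an empty tree is given height -1.
int height(const TreeNode* n) {
    if (n == nullptr) return -1;
    int l = height(n->left);
    int r = height(n->right);
    return 1 + (l > r ? l : r);
}

void destroy(TreeNode* n) {
    if (n == nullptr) return;
    destroy(n->left);
    destroy(n->right);
    delete n;
}
```

For every height h tried, countNodes() of the tree returned by buildComplete(h) equals 2^(h+1) - 1, in agreement with the formula above.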

3.4 General binary trees

However, in general addition of items to an ordered tree will not produce a complete tree. The worst case occurs if we add an ordered list of items to a tree. This problem is readily overcome: we use a structure known as a heap. However, before looking at heaps, we should formalise our ideas about the complexity of algorithms by defining carefully what O(f(n)) means.

Root Node: Node at the "top" of a tree - the one from which all operations on the tree commence. The root node may not exist (a NULL tree with no nodes in it) or have 0, 1 or 2 children in a binary tree.

Leaf Node: Node at the "bottom" of a tree - farthest from the root. Leaf nodes have no children.

Complete Tree: Tree in which each leaf is at the same distance from the root. A more precise and formal definition of a complete tree is set out later.

Height: Number of nodes which must be traversed from the root to reach a leaf of a tree.

/* Binary tree implementation of a collection */
#include <assert.h>
#include <stdlib.h>

struct t_node {
    void *item;
    struct t_node *left;
    struct t_node *right;
};
typedef struct t_node *Node;

struct t_collection {
    int size;   /* Needed by FindInCollection */
    Node root;
};
/* Collection is assumed to be declared elsewhere as a pointer to struct t_collection. */
/* ItemKey() and KeyLess() are assumed to be supplied by the user of the collection. */

static void AddToTree( Node *t, Node new )
{
    Node base;
    base = *t;
    /* If it's a null tree, just add it here */
    if ( base == NULL ) {
        *t = new;
        return;
    }
    else {
        if ( KeyLess( ItemKey( new->item ), ItemKey( base->item ) ) ) {
            AddToTree( &(base->left), new );
        }
        else
            AddToTree( &(base->right), new );
    }
}

void AddToCollection( Collection c, void *item )
{
    Node new;
    assert( c != NULL );
    assert( item != NULL );
    /* Allocate space for a node for the new item */
    new = (Node)malloc(sizeof(struct t_node));
    /* Attach the item to the node */
    new->item = item;
    new->left = new->right = (Node)0;
    AddToTree( &(c->root), new );
}

/* Now we need to know whether one key is less, equal or greater than another */
extern int KeyCmp( void *a, void *b );
/* Returns -1, 0, 1 for a < b, a == b, a > b */

void *FindInTree( Node t, void *key )
{
    if ( t == (Node)0 ) return NULL;
    switch( KeyCmp( key, ItemKey(t->item) ) ) {
        case -1 : return FindInTree( t->left, key );
        case 0: return t->item;
        case +1 : return FindInTree( t->right, key );
    }
    return NULL;   /* not reached if KeyCmp returns only -1, 0 or 1 */
}

void *FindInCollection( Collection c, void *key )
{
    /* Find an item in a collection
       Pre-condition: (c is a collection created by a call to ConsCollection) && (key != NULL)
       Post-condition: returns an item identified by key if one exists, otherwise returns NULL */
    assert( c != NULL );
    assert( key != NULL );
    /* Start the search at the root of the tree */
    return FindInTree( c->root, key );
}

3.5 Unbalanced Trees

If items are added to a binary tree in order then the following unbalanced tree results:

The worst case search of this tree may require up to n comparisons. Thus a binary tree's worst case searching time is O(n). Later, we will look at red-black trees, which provide us with a strategy for avoiding this pathological behaviour.
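The effect of the insertion order can be observed with a short experiment. The sketch below uses a plain ordered-tree insertion on integer keys (the function names insert, height and destroy are chosen for this example) and compares the height obtained from sorted input with the height obtained when the same keys arrive in a more favourable order.

```cpp
#include <cassert>

struct TreeNode {
    int key;
    TreeNode* left;
    TreeNode* right;
};

// Ordinary ordered-tree insertion with no rebalancing.
void insert(TreeNode*& t, int key) {
    if (t == nullptr) {
        t = new TreeNode{key, nullptr, nullptr};
        return;
    }
    if (key < t->key) insert(t->left, key);
    else insert(t->right, key);
}

// Height in links; an empty tree is given height -1.
int height(const TreeNode* t) {
    if (t == nullptr) return -1;
    int l = height(t->left);
    int r = height(t->right);
    return 1 + (l > r ? l : r);
}

void destroy(TreeNode* t) {
    if (t == nullptr) return;
    destroy(t->left);
    destroy(t->right);
    delete t;
}
```

Inserting the keys 1 to 7 in increasing order gives a chain of height 6, so a search may need up to 7 comparisons; inserting them in the order 4, 2, 6, 1, 3, 5, 7 gives a complete tree of height 2.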

3.6 C++ implementation of Binary Tree

Many people make a class for a single node, and use it to implement the tree. However, we will separate the structure for each node from that of the entire tree to reduce processing overhead. Each time a node is created, much less time and memory is used than when a whole tree structure is made. Each node will store a value, and pointers to its children and parent. These will be used and modified by the general tree class.

template <class ItemType>
struct TreeNode
{
    ItemType data;
    TreeNode<ItemType> *left;
    TreeNode<ItemType> *right;
    TreeNode<ItemType> *parent;
};

template <class ItemType>
class BinaryTree
{
public:
    BinaryTree();  //create an empty tree: main_root and current are set to NULL
    BinaryTree(TreeNode<ItemType>*,int);  //create new tree with passed node as the new main root. set current to main root. if the second parameter is 0, the new object simply points to the node of the original tree. If the second parameter is 1, a new copy of the subtree is created, which the object points to.
    ~BinaryTree();
    void insert(const ItemType&,int);  //insert new node as child of current. 0=left 1=right
    void remove(TreeNode<ItemType>*);  //delete node and its subtree
    ItemType value() const;  //return value of current

    //navigate the tree
    void left();
    void right();
    void parent();
    void reset();  //go to main_root
    void SetCurrent(TreeNode<ItemType>*);

    //return subtree (node) pointers
    TreeNode<ItemType>* pointer_left() const;
    TreeNode<ItemType>* pointer_right() const;
    TreeNode<ItemType>* pointer_parent() const;
    TreeNode<ItemType>* pointer_current() const;

    //return values of children and parent without leaving current node
    ItemType peek_left() const;
    ItemType peek_right() const;
    ItemType peek_parent() const;

    //print the tree or a subtree. only works if ItemType is supported by the << operator
    void DisplayInorder(TreeNode<ItemType>*) const;
    void DisplayPreorder(TreeNode<ItemType>*) const;
    void DisplayPostorder(TreeNode<ItemType>*) const;

    //delete all nodes in the tree
    void clear();
    bool IsEmpty() const;
    bool IsFull() const;

private:
    TreeNode<ItemType>* current;
    TreeNode<ItemType>* main_root;
    TreeNode<ItemType>* CopyTree(TreeNode<ItemType>*,TreeNode<ItemType>*) const;  //create a new copy of a subtree if passed to the constructor
    bool subtree;  //does it reference a part of a larger object?
};

The first constructor simply sets the main_root and current data members to NULL, since the tree has no nodes. A new tree is made, so it is not part of a larger tree object, and the subtree value is set accordingly.

template <class ItemType>
BinaryTree<ItemType>::BinaryTree()
{
    //the tree starts with no nodes
    main_root = NULL;
    current = NULL;
    subtree = false;
}

The second constructor accepts a pointer to a node, and creates a new tree object with the node that is passed acting as the new tree's main root. current is then set to the main root. The second parameter specifies whether the new subtree object points directly to the original tree's nodes (the root and its descendants), or creates a copy of the subtree and is thus a new tree. The subtree variable records whether the object points directly to the original tree's nodes. As you will later find out, this is important in the class destructor.

template <class ItemType>
BinaryTree<ItemType>::BinaryTree(TreeNode<ItemType>* root, int op)
{
    if(op == 0) //comparison, not assignment
    {
        main_root = root;
        current = root;
        subtree = true;
    }
    else
    {
        main_root = CopyTree(root,NULL);
        current = main_root;
        subtree = false;
    }
}

The CopyTree() function creates a copy of the subtree rooted at root and returns a pointer to the new copy's root node. The second parameter is a pointer to the parent of the subtree being passed. Since CopyTree() uses recursion to traverse the original tree, passing each node's parent as a parameter is a straightforward way of assigning each new node's parent value. Since the parent of the main root is always NULL, we pass NULL as the second parameter in the class constructor above.

template <class ItemType>
TreeNode<ItemType>* BinaryTree<ItemType>::CopyTree(TreeNode<ItemType> *root, TreeNode<ItemType> *parent) const
{
    if(root == NULL) //base case -if the node doesn't exist, return NULL.
        return NULL;
    TreeNode<ItemType>* tmp = new TreeNode<ItemType>; //make a new location in memory
    tmp->data = root->data; //make a copy of the node's data
    tmp->parent = parent; //set the new node's parent
    tmp->left = CopyTree(root->left,tmp); //copy the left subtree of the current node. pass the current node as the subtree's parent
    tmp->right = CopyTree(root->right,tmp); //do the same with the right subtree
    return tmp; //return a pointer to the newly created node.
}
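The recursion in CopyTree() can be tried outside the class. The following is a minimal sketch, assuming a free-function version of the same idea; the names MakeNode, CopySubtree, and IsIndependentCopy are hypothetical and are not part of the BinaryTree class.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the TreeNode struct used in this unit.
template <class ItemType>
struct TreeNode {
    ItemType data;
    TreeNode<ItemType>* left;
    TreeNode<ItemType>* right;
    TreeNode<ItemType>* parent;
};

// Hypothetical helper: allocate a childless node below the given parent.
template <class ItemType>
TreeNode<ItemType>* MakeNode(const ItemType& value, TreeNode<ItemType>* parent) {
    TreeNode<ItemType>* n = new TreeNode<ItemType>;
    n->data = value;
    n->left = NULL;
    n->right = NULL;
    n->parent = parent;
    return n;
}

// Same recursion as CopyTree(), written as a free function (an assumption
// made so the sketch compiles without the rest of the class).
template <class ItemType>
TreeNode<ItemType>* CopySubtree(const TreeNode<ItemType>* root,
                                TreeNode<ItemType>* parent) {
    if (root == NULL) return NULL;            // base case: nothing to copy
    TreeNode<ItemType>* tmp = MakeNode(root->data, parent);
    tmp->left = CopySubtree(root->left, tmp);   // the new node is the parent
    tmp->right = CopySubtree(root->right, tmp); // of both copied subtrees
    return tmp;
}

// Hypothetical check: true when the copy has the same shape and values,
// shares no node with the original, and has consistent parent pointers.
template <class ItemType>
bool IsIndependentCopy(const TreeNode<ItemType>* original,
                       const TreeNode<ItemType>* copy,
                       const TreeNode<ItemType>* expected_parent) {
    if (original == NULL || copy == NULL) return original == copy;
    if (original == copy) return false;               // same node: not a copy
    if (copy->parent != expected_parent) return false;
    if (!(original->data == copy->data)) return false;
    return IsIndependentCopy(original->left, copy->left, copy) &&
           IsIndependentCopy(original->right, copy->right, copy);
}
```

Because the copy shares no nodes with the original, changing a value in the copy leaves the original untouched, which is why the destructor may safely delete a copied tree.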

The job of the class destructor is to delete all the nodes and free their memory, as usual. The clear() function is called just as in the previous data structure implementations. However, this operation is only performed if the object is a main tree. If the object is a subtree that points to the nodes of a larger tree, those nodes will be deleted when the main tree itself is deleted. Attempting to delete the memory associated with the subtree after it has already been deleted by the main tree would have unpredictable results.

template <class ItemType>
BinaryTree<ItemType>::~BinaryTree()
{
    if(!subtree)
        clear(); //delete all nodes
}

The insert() function creates a new node as a child of current. The first parameter is a value for the new node, and the second parameter is an integer indicating which child the new node will become. A value of 0 indicates that the new node will be a left child of current, whereas a value of 1 indicates the new node will be a right child. If a node already exists in the location where the programmer wishes to insert, that node adopts the value passed to insert(). If the tree does not have any nodes, the second parameter is disregarded, and a main root is created.

template <class ItemType>
void BinaryTree<ItemType>::insert(const ItemType &item,int pos)
//insert as child of current 0=left 1=right. if item already exists, replace it
{
    assert(!IsFull());
    //if the tree has no nodes, make a root node, disregard pos.
    if(main_root == NULL)
    {
        main_root = new TreeNode<ItemType>;
        main_root->data = item;
        main_root->left = NULL;
        main_root->right = NULL;
        main_root->parent = NULL;
        current = main_root;
        return; //node created, exit the function
    }
    assert(current != NULL); //current must refer to a node of the tree
    if(pos == 0) //new node is a left child of current
    {
        if(current->left != NULL) //if child already exists, replace value
            (current->left)->data = item;
        else
        {
            current->left = new TreeNode<ItemType>;
            current->left->data = item;
            current->left->left = NULL;
            current->left->right = NULL;
            current->left->parent = current;
        }
    }
    else //new node is a right child of current
    {
        if(current->right != NULL) //if child already exists, replace value
            (current->right)->data = item;
        else
        {
            current->right = new TreeNode<ItemType>;
            current->right->data = item;
            current->right->left = NULL;
            current->right->right = NULL;
            current->right->parent = current;
        }
    }
}

The remove() function removes the subtree referenced by root, including the root node itself. Depending on whether it was a left or right child, the left or right pointer of the parent is set to NULL. The function uses recursion to perform the necessary operation on all nodes of the subtree. We must start with the nodes on the lowest level and work our way up. If we were to delete the top-level nodes first, we would lose the links to the lower levels.

template <class ItemType>
void BinaryTree<ItemType>::remove(TreeNode<ItemType>* root)
{
    if(root == NULL) //base case -if the root doesn't exist, do nothing
        return;
    remove(root->left); //perform the remove operation on the node's left subtree first
    remove(root->right); //then perform the remove operation on the node's right subtree
    if(root->parent == NULL) //if the main root is being deleted, main_root must be set to NULL
        main_root = NULL;
    else
    {
        if(root->parent->left == root) //make sure the parent of the subtree's root points to NULL, since the node no longer exists
            root->parent->left = NULL;
        else
            root->parent->right = NULL;
    }
    current = root->parent; //set current to the parent of the subtree removed.
    delete root;
}
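The bottom-up order that remove() relies on can be made visible by recording each value just before its node is released. This is a minimal sketch with a simplified, non-template node; Node, NewNode, and DestroySubtree are hypothetical names chosen for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified node for illustration (no parent pointer, int data only).
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Hypothetical helper: allocate a node with the given children.
Node* NewNode(int value, Node* left = NULL, Node* right = NULL) {
    Node* n = new Node;
    n->data = value;
    n->left = left;
    n->right = right;
    return n;
}

// Post-order deletion: both subtrees are released before the node itself,
// so no child link is ever read from a node that has already been deleted.
// Each value is appended to 'freed' immediately before its node is deleted.
void DestroySubtree(Node* root, std::vector<int>& freed) {
    if (root == NULL) return;
    DestroySubtree(root->left, freed);
    DestroySubtree(root->right, freed);
    freed.push_back(root->data);
    delete root;
}
```

For a root 1 with children 2 and 3, where 2 has children 4 and 5, the recorded order is 4, 5, 2, 3, 1: every node appears after all of its descendants, and the root is released last.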

The next function returns the value of current.

template <class ItemType>
ItemType BinaryTree<ItemType>::value() const
{
    return current->data;
}

The next five functions are used to navigate the tree. The programmer can visit a node's left child, right child, or parent, and can reset current to the main root. Finally, the programmer can set current to a specific node by supplying a pointer to it. This is very helpful when working with subtrees within the main tree object. Note that SetCurrent() should be used with caution: if it is given a pointer to a node that is not within the tree, the results are unpredictable.

template <class ItemType>
void BinaryTree<ItemType>::left()
{
    current = current->left;
}

template <class ItemType>
void BinaryTree<ItemType>::right()
{
    current = current->right;
}

template <class ItemType>
void BinaryTree<ItemType>::parent()
{
    current = current->parent;
}

template <class ItemType>
void BinaryTree<ItemType>::reset()
{
    current = main_root;
}

template <class ItemType>
void BinaryTree<ItemType>::SetCurrent(TreeNode<ItemType>* root)
{
    current = root;
}

The four functions that follow return pointers to various nodes in the tree, depending on current. Such a pointer is a required parameter for a few of our other functions, such as remove() and the three display functions. It is also used by one of our class constructors, which can make a new tree object from a subtree. Strictly, only pointer_current() is required, since the programmer can navigate the tree to any node; the other three functions were included for ease of use. It is often necessary to perform an operation on a node's children or parent without leaving the node. The functions are also useful if a programmer would like to work on a subtree: an external TreeNode* pointer can be created, set by one of the pointer-returning functions, and then passed to the operation functions of the class.

template <class ItemType>
TreeNode<ItemType>* BinaryTree<ItemType>::pointer_left() const
{
    return current->left;
}

template <class ItemType>
TreeNode<ItemType>* BinaryTree<ItemType>::pointer_right() const
{
    return current->right;
}

template <class ItemType>
TreeNode<ItemType>* BinaryTree<ItemType>::pointer_parent() const
{
    return current->parent;
}

template <class ItemType>
TreeNode<ItemType>* BinaryTree<ItemType>::pointer_current() const
{
    return current;
}

The next three functions are also not required, but were added for ease of use. They return the values of a node's two children and parent without having to leave the node.

template <class ItemType>
ItemType BinaryTree<ItemType>::peek_left() const
{
    assert(current->left != NULL);
    return current->left->data;
}

template <class ItemType>
ItemType BinaryTree<ItemType>::peek_right() const
{
    assert(current->right != NULL);
    return current->right->data;
}

template <class ItemType>
ItemType BinaryTree<ItemType>::peek_parent() const
{
    assert(current->parent != NULL);
    return current->parent->data;
}

The display functions mentioned above are next. Note that these functions work only if ItemType is supported by the << operator. For instance, any simple built-in C/C++ type (such as int, float, char, etc.) will work without any modification.

template <class ItemType>
void BinaryTree<ItemType>::DisplayInorder(TreeNode<ItemType>* root) const
{
    if (root == NULL)
        return;
    DisplayInorder(root->left);
    cout << root->data;
    DisplayInorder(root->right);
}

template <class ItemType>
void BinaryTree<ItemType>::DisplayPreorder(TreeNode<ItemType>* root) const
{
    if (root == NULL)
        return;
    cout << root->data;
    DisplayPreorder(root->left); //recurse with the same traversal, not the in-order one
    DisplayPreorder(root->right);
}

template <class ItemType>
void BinaryTree<ItemType>::DisplayPostorder(TreeNode<ItemType>* root) const
{
    if (root == NULL)
        return;
    DisplayPostorder(root->left); //recurse with the same traversal, not the in-order one
    DisplayPostorder(root->right);
    cout << root->data;
}
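The three traversals differ only in where the node itself is visited relative to its subtrees. The sketch below, which assumes a simplified char node and writes to any output stream rather than directly to cout, makes the difference easy to compare on a small tree; Node and Collect are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Minimal node for illustration; a parent pointer is not needed to traverse.
struct Node {
    char data;
    Node* left;
    Node* right;
};

// Left subtree, then the node, then the right subtree.
void Inorder(const Node* root, std::ostream& out) {
    if (root == NULL) return;
    Inorder(root->left, out);
    out << root->data;
    Inorder(root->right, out);
}

// The node first, then the left and right subtrees.
void Preorder(const Node* root, std::ostream& out) {
    if (root == NULL) return;
    out << root->data;
    Preorder(root->left, out);
    Preorder(root->right, out);
}

// Both subtrees first, the node last.
void Postorder(const Node* root, std::ostream& out) {
    if (root == NULL) return;
    Postorder(root->left, out);
    Postorder(root->right, out);
    out << root->data;
}

// Hypothetical helper: run one traversal and return what it wrote.
std::string Collect(void (*visit)(const Node*, std::ostream&), const Node* root) {
    assert(visit != NULL);
    std::ostringstream out;
    visit(root, out);
    return out.str();
}
```

For a root A with children B and C, where B has children D and E, the in-order result is DBEAC, the pre-order result is ABDEC, and the post-order result is DEBCA.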

The clear() function deletes all nodes in the tree. This is very easy to do, since we can take advantage of the remove() function, which we have already defined. The remove() function deletes all nodes of a subtree, including its root node. Therefore, we can pass the main root to remove() in order to delete all nodes in the tree.

template <class ItemType>
void BinaryTree<ItemType>::clear()
{
    remove(main_root); //use the remove function on the main root
    main_root = NULL; //since there are no more items, set main_root to NULL
    current = NULL;
}

The IsEmpty() function works by evaluating main_root. If there aren't any nodes in the tree, main_root is NULL.

template <class ItemType>
bool BinaryTree<ItemType>::IsEmpty() const
{
    return (main_root == NULL);
}

Finally, other than the data types, the implementation of the IsFull() function does not change from previous classes. Note that a plain new expression reports failure by throwing std::bad_alloc rather than returning NULL, so the test here uses the std::nothrow form (declared in the <new> header), which does return NULL on failure.

template <class ItemType>
bool BinaryTree<ItemType>::IsFull() const
{
    TreeNode<ItemType> *tmp = new (std::nothrow) TreeNode<ItemType>; //requires #include <new>
    if(tmp == NULL)
        return true;
    else
    {
        delete tmp;
        return false;
    }
}

Now let's take a look at two additional functions, which are not part of the tree class. It is often necessary to know how many nodes are in the tree, or how many of them are leaves. One example of when a leaf count is required is a binary expression tree. Binary expression trees store mathematical expressions, for instance 5*x+7=22. Each operator or operand of the expression is represented by one node. An operator stored in a node performs its operation on its two children, so the nodes are arranged in such a way that the expression can be displayed using an in-order traversal, while pre-order and post-order traversals display it in prefix and postfix notation. In such a setup, all operators are internal nodes, whereas variables and constants are leaves. The code for the NodeCount() and LeafCount() functions follows. Both are very short since recursion is used.

template <class ItemType>
int LeafCount(TreeNode<ItemType>* root)
{
    if(root == NULL) //base case -if the node doesn't exist, return 0 (don't count it)
        return 0;
    if((root->left == NULL) && (root->right == NULL)) //if the node has no children return 1 (it is a leaf)
        return 1;
    return LeafCount(root->left) + LeafCount(root->right); //add the leaf nodes in the left and right subtrees
}

template <class ItemType>
int NodeCount(TreeNode<ItemType>* root)
{
    if(root == NULL) //base case -return 0 if the node doesn't exist (don't count it)
        return 0;
    else
        return 1 + NodeCount(root->left) + NodeCount(root->right); //return 1 for the current node, and add the number of nodes in the left and right subtrees
}
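To see the two counters at work on an expression tree, the sketch below builds the tree for 5*x+7 by hand and counts its leaves and nodes. The counters are restated so that the block compiles on its own; OperatorCount is a hypothetical helper, not a function from the text, and relies on the observation that every non-leaf node of an expression tree is an operator.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the TreeNode struct used in this unit.
template <class ItemType>
struct TreeNode {
    ItemType data;
    TreeNode<ItemType>* left;
    TreeNode<ItemType>* right;
    TreeNode<ItemType>* parent;
};

// Number of nodes without children.
template <class ItemType>
int LeafCount(const TreeNode<ItemType>* root) {
    if (root == NULL) return 0;
    if (root->left == NULL && root->right == NULL) return 1;
    return LeafCount(root->left) + LeafCount(root->right);
}

// Number of nodes in the whole subtree.
template <class ItemType>
int NodeCount(const TreeNode<ItemType>* root) {
    if (root == NULL) return 0;
    return 1 + NodeCount(root->left) + NodeCount(root->right);
}

// Hypothetical helper: in an expression tree the internal nodes are the
// operators, so their number is the node count minus the leaf count.
template <class ItemType>
int OperatorCount(const TreeNode<ItemType>* root) {
    int nodes = NodeCount(root);
    int leaves = LeafCount(root);
    assert(leaves <= nodes);   // every leaf is also counted as a node
    return nodes - leaves;
}
```

For 5*x+7 the root is +, its left child is * with the leaves 5 and x, and its right child is the leaf 7. That gives three leaves (the operands), five nodes in total, and two operators.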

3.7 Check your progress

1. Describe the term Tree. What are the basic properties of a binary tree?

2. Draw all possible binary trees with four nodes.

3. Define the height of a tree. Write C++ code to calculate the height of a tree.

4. Write C++ code to compare two trees and return whether they are similar or not.


Unit 4 Binary Search Trees

Structure

4.0 Unit Objective

4.1 Introduction

4.2 Implementation of Binary Search Tree

4.3 Implementation of an Expression tree to perform tree traversals

4.4 Check your progress

4.0 Unit Objective

After going through this unit, you should be able to understand:

Binary search tree concept

C++ implementation of BST

Insertion in BST

C++ program of BST

Expression trees and their usage in tree traversal

4.1 Introduction

A special binary tree is the binary search tree (BST). A BST must conform to the property that every node in the left subtree of a node has a smaller value than the node, and every node in the right subtree has a greater value (in this implementation, equal values are also placed in the right subtree). In order to make our binary tree class a BST, we need only modify the insert() function, since nodes can no longer be placed anywhere in the tree by the programmer. Otherwise, a BST acts in the same fashion as a standard binary tree.

The new insert() function accepts one parameter, the item to be inserted. If the tree is empty, then the main root is added in the same fashion as it was with a standard binary tree, and the function exits. If the tree is not empty, the function proceeds with a new algorithm.

The parent of the new node is found by running the insert_find() function, a new private member function that must be added to the BST class. The insert_find() function accepts two parameters: the root of the tree and the item value. The root is needed as a parameter because insert_find() uses recursion to traverse the tree. The function works by comparing the value of item to each node. If it is less than the node's value, and the node has a left child, the function proceeds to that child and performs the same operation. If it is greater than or equal to the node's value, and a right child exists, then the function proceeds to the right child. If item is less than the node and a left child does not exist, or it is greater than or equal to the node and a right child does not exist, that node is where the new node belongs.

The insert() function receives a pointer to the new parent; however, it must check again whether the new node is to be the left or the right child, because insert_find() returns only the parent and not the side. Returning the side as well would require an extra parameter or a differently structured insert_find(); performing one more comparison in insert() is simpler.

A new node is then created using the same method as in the original version of insert().

template <class ItemType>
void BST<ItemType>::insert(const ItemType &item)
{
    //if the tree has no nodes, make a root node
    if(main_root == NULL)
    {
        main_root = new TreeNode<ItemType>;
        main_root->data = item;
        main_root->left = NULL;
        main_root->right = NULL;
        main_root->parent = NULL;
        current = main_root;
        return;
    }
    TreeNode<ItemType>* new_parent = insert_find(main_root,item); //find the new node's parent
    if(item < new_parent->data) //check whether the new node is a left or right child and create it
    {
        new_parent->left = new TreeNode<ItemType>;
        new_parent->left->data = item;
        new_parent->left->left = NULL;
        new_parent->left->right = NULL;
        new_parent->left->parent = new_parent;
    }
    else
    {
        new_parent->right = new TreeNode<ItemType>;
        new_parent->right->data = item;
        new_parent->right->left = NULL;
        new_parent->right->right = NULL;
        new_parent->right->parent = new_parent;
    }
}

template <class ItemType>
TreeNode<ItemType>* BST<ItemType>::insert_find(TreeNode<ItemType>* root,ItemType item)
{
    if((root->left != NULL) && (item < root->data))
        return insert_find(root->left,item);
    if((root->right != NULL) && (item >= root->data))
        return insert_find(root->right,item);
    return root;
}
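As an alternative sketch, the walk down the tree can be written as a loop that remembers which side it last took, so the final left-or-right comparison does not have to be repeated. This is not the text's insert_find(); it assumes a simplified int node, and InsertIterative, InorderValues, and Destroy are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified BST node for illustration (int data, with a parent pointer).
struct Node {
    int data;
    Node* left;
    Node* right;
    Node* parent;
};

// Walk down from the root, remembering the last node visited and the side
// taken, then attach the new node there. Equal values go to the right,
// matching the (item >= root->data) test in insert_find().
Node* InsertIterative(Node*& root, int item) {
    Node* parent = NULL;
    Node* walk = root;
    bool go_left = false;
    while (walk != NULL) {
        parent = walk;
        go_left = item < walk->data;
        walk = go_left ? walk->left : walk->right;
    }
    Node* fresh = new Node;
    fresh->data = item;
    fresh->left = NULL;
    fresh->right = NULL;
    fresh->parent = parent;
    if (parent == NULL) root = fresh;          // the tree was empty
    else if (go_left) parent->left = fresh;
    else parent->right = fresh;
    return fresh;
}

// In-order listing; for a valid BST the values come out in sorted order.
void InorderValues(const Node* root, std::vector<int>& out) {
    if (root == NULL) return;
    InorderValues(root->left, out);
    out.push_back(root->data);
    InorderValues(root->right, out);
}

// Release every node of the subtree (children before the node itself).
void Destroy(Node* root) {
    if (root == NULL) return;
    Destroy(root->left);
    Destroy(root->right);
    delete root;
}
```

Inserting 50, 30, 70, 20, 40, 60, 80 and then a second 30 places the duplicate in the right subtree of the first 30 (as the left child of 40), and an in-order listing of the result is sorted.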

Since the programmer is no longer in control of the location of each node, we can add a function that removes one node at a time. If the node to be deleted is a leaf, the solution is simple: we set its parent's pointer to NULL. If it has one child, then the parent is set to point to that child. However, what happens if the node has two children? How should we insert those children back into the tree once the node is deleted? The solution can vary, depending on the programming task. Since a binary search tree follows a specific property, we can develop an algorithm that maintains the binary search tree property when rearranging the node's children.

The code for the remove_node() function may look difficult at first; however, when broken down into each possible situation, it is simple to understand. The first case that must be considered is the removal of the main root when it has at most one child. In this situation, that child becomes the new main root (or the tree becomes empty). If the node to be deleted has no children, its parent is set to point to NULL, and the node is then deleted.

Also, as mentioned before, if the node has one child, the parent of the node is set to point to this child, and the node is then deleted.

If delete_node has two children, a little more work must be done. The question arises as to how to attach delete_node's children to the tree while preserving the binary search tree property. One possible solution would be to attach one of the children to delete_node's parent, and reinsert each node from the second child's subtree one by one. However, this method is very inefficient, especially if the node to be deleted has a very large subtree.

The ideal approach is to find another node in the tree which can replace delete_node and still maintain the binary search tree property. We can then replace the value of delete_node, and remove instead the node whose value we replaced it with. The nodes that fit this criterion are the largest node in the left subtree and the smallest node in the right subtree. The largest node in the left subtree is still smaller than any node of the right subtree, and the smallest node in the right subtree is larger than any node in the left subtree, so both are possible values that can replace delete_node. If a right subtree exists, then it is used, since it may contain a value that is equal to delete_node (in which case that value will be used). The largest node in the left subtree can be found by moving to the left one time (starting from delete_node), and then moving to the right as far as possible. The smallest node in the right subtree can be found by moving once to the right, and then to the left as far as possible. This means that either node will have at most one child (the largest node in the left subtree can only have a left child, and the smallest node in the right subtree can only have a right child). It can therefore be removed using the method described earlier, by attaching that child to the node's parent.

The replace_find() private member function returns the node that is used to replace delete_node. The first parameter of the function is a pointer to the first possible node (if we are searching the left subtree, this value is root->left, whereas it is root->right if we are searching the right subtree), and the second parameter is the direction to search in. Zero means the function will locate the largest value in the left subtree, meaning it will travel to the right as far as possible. A value of one means the function will locate the smallest value in the right subtree, therefore travelling to the left as far as possible. Once the node is found, delete_node is set to its value, and the node is deleted.

template <class ItemType>
void BST<ItemType>::remove_node(TreeNode<ItemType>* root)
{
    if((root == main_root) && ((root->left == NULL) || (root->right == NULL)))
    {
        //set the main root's only child as the new root. if it has no children, main_root becomes NULL as the tree is empty.
        if(root->left == NULL)
            main_root = root->right;
        else
            main_root = root->left;
        if(main_root != NULL) //the tree may now be empty, so check before dereferencing
            main_root->parent = NULL; //set the new main root's parent to NULL
        if(current == root) //if current is at the original main root, set it to the new root, since the original will be deleted
            current = main_root;
        delete root;
        return;
    }
    //if current is at a node that is about to be deleted, set it to the node's parent
    if((current == root) && ((root->left == NULL) || (root->right == NULL)))
        current = root->parent;
    if((root->left == NULL) && (root->right == NULL)) //if the root has no children
    {
        //have the parent point to NULL in place of it
        if(root->parent->left == root) //if it is a left child
            root->parent->left = NULL;
        else //it is a right child
            root->parent->right = NULL;
        delete root;
        return;
    }
    //if the root has one child, have the parent point to it in place of root
    if((root->left == NULL) && (root->right != NULL))
    {
        if(root->parent->left == root)
            root->parent->left = root->right;
        else
            root->parent->right = root->right;
        root->right->parent = root->parent; //keep the child's parent pointer consistent
        delete root;
        return;
    }
    if((root->left != NULL) && (root->right == NULL))
    {
        if(root->parent->left == root)
            root->parent->left = root->left;
        else
            root->parent->right = root->left;
        root->left->parent = root->parent; //keep the child's parent pointer consistent
        delete root;
        return;
    }
    //if the node has two children
    TreeNode<ItemType> *tmp;
    if(root->right != NULL) //if the root has a right subtree, search it for the smallest value
        tmp = replace_find(root->right,1);
    else //search the left subtree for the largest value
        tmp = replace_find(root->left,0);
    root->data = tmp->data;
    if(current == tmp) //tmp is about to be deleted; its value now lives in root
        current = root;
    //if tmp has a child, have tmp's parent point to it. Otherwise, have the parent point to NULL in place of tmp.
    if(tmp->parent->left == tmp) //if tmp is a left child
    {
        if(tmp->right != NULL) //if it has a right child, have the parent point to it
            tmp->parent->left = tmp->right;
        else //point to the left child. This value is NULL if there is no left child
            tmp->parent->left = tmp->left;
    }
    else //if tmp is a right child
    {
        if(tmp->right != NULL) //if it has a right child, have the parent point to it
            tmp->parent->right = tmp->right;
        else //point to the left child. This value is NULL if there is no left child
            tmp->parent->right = tmp->left;
    }
    TreeNode<ItemType> *child = (tmp->right != NULL) ? tmp->right : tmp->left;
    if(child != NULL) //keep the moved child's parent pointer consistent
        child->parent = tmp->parent;
    delete tmp;
}

template <class ItemType>
TreeNode<ItemType>* BST<ItemType>::replace_find(TreeNode<ItemType>* root,int direction)
{
    if(direction == 0) //searching left subtree for largest value. go right as much as possible. Return last node.
    {
        if(root->right == NULL)
            return root;
        return replace_find(root->right,0);
    }
    else //searching right subtree for smallest value. go left as much as possible. Return last node.
    {
        if(root->left == NULL)
            return root;
        return replace_find(root->left,1);
    }
}
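The two-children case can be exercised on its own with a small tree. The following is a minimal sketch, assuming a simplified int node and always using the smallest node of the right subtree as the replacement; Insert, Smallest, RemoveWithTwoChildren, and Inorder are hypothetical free functions, not members of the BST class.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified BST node for illustration (int data, with a parent pointer).
struct Node {
    int data;
    Node* left;
    Node* right;
    Node* parent;
};

// Plain recursive BST insertion, used only to build example trees.
// Equal values go to the right.
Node* Insert(Node* root, int item, Node* parent = NULL) {
    if (root == NULL) {
        Node* n = new Node;
        n->data = item;
        n->left = NULL;
        n->right = NULL;
        n->parent = parent;
        return n;
    }
    if (item < root->data) root->left = Insert(root->left, item, root);
    else root->right = Insert(root->right, item, root);
    return root;
}

// Smallest node of a non-empty subtree: keep moving left.
Node* Smallest(Node* root) {
    assert(root != NULL);
    while (root->left != NULL) root = root->left;
    return root;
}

// Remove the value stored in a node that has two children: copy the
// smallest value of the right subtree into it, then unlink and delete the
// node that value came from. That node has no left child, so at most one
// child (its right one) has to be re-attached.
void RemoveWithTwoChildren(Node* target) {
    assert(target != NULL && target->left != NULL && target->right != NULL);
    Node* succ = Smallest(target->right);
    target->data = succ->data;
    Node* child = succ->right;                 // may be NULL
    if (succ->parent->left == succ) succ->parent->left = child;
    else succ->parent->right = child;
    if (child != NULL) child->parent = succ->parent;
    delete succ;
}

// In-order listing, used to confirm the tree is still ordered.
void Inorder(const Node* root, std::vector<int>& out) {
    if (root == NULL) return;
    Inorder(root->left, out);
    out.push_back(root->data);
    Inorder(root->right, out);
}
```

With the values 50, 30, 70, 60, 80, 65 inserted in that order, removing the root's value copies 60 into the root, and 65 (the former right child of 60) becomes the left child of 70. The in-order listing afterwards is 30, 60, 65, 70, 80.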

4.2 Implementation of Binary Search Tree

To implement a binary search tree with insertion and deletion operations.

Algorithm:
Step 1: Start the process.
Step 2: Initialize and declare variables.
Step 3: Enter the choice.
Step 4: If the choice is CREATE
   a) Create a node using the get_node function.
   b) After allocating memory space, the NEW node is assigned to ROOT.
Step 5: If the choice is INSERT
   a) Check whether the ROOT node is NULL.
   b) If the condition is true, the tree is empty; consider the NEW node as the ROOT node.
   c) If the NEW node is less than the ROOT node, check whether the left child of the ROOT node is NULL. If it is, attach the NEW node there; otherwise repeat the comparison from the left child.
   d) If the NEW node is greater than the ROOT node, check whether the right child of the ROOT node is NULL. If it is, attach the NEW node there; otherwise repeat the comparison from the right child.
Step 6: If the choice is DELETE
   a) To delete a leaf node, search for the parent of the leaf node, set its link to the leaf node to NULL, and release the memory of the deleted node.
   b) To delete a node with only one child, search for the parent of the node to be deleted, assign the link of the parent node to the child of the node to be deleted, and release the memory of the deleted node.
   c) To delete a node with two children, find its inorder successor, copy the successor's value into the node to be deleted, and delete the inorder successor node.
Step 7: If the choice is SEARCH
   a) If the key is equal to the ROOT node, the element is found.
   b) If the key is less than the ROOT node, continue the search in the left subtree.
   c) If the key is greater than the ROOT node, continue the search in the right subtree.
Step 8: Stop the process.

CODING:

#include<stdio.h>
#include<stdlib.h>

typedef struct bst
{
    int data;
    struct bst *left, *right;
}node;

node *get_node(void);
void insert(node *,node *);
void inorder(node *);
node *search(node *,int,node **);
node *del(node *,int); /* returns the (possibly new) root */

int main(void)
{
    int ch;
    char ans='n';
    int key;
    node *New,*root,*tmp,*parent;
    root=NULL;
    printf("\n \t Program for Binary Search Tree");
    do
    {
        printf("\n 1.Create \n 2.Search \n 3.Delete \n 4.Display \n 5.Exit");
        printf("\n\n Enter your choice");
        if(scanf("%d",&ch)!=1)
            break; /* stop on invalid input or end of file */
        switch(ch)
        {
        case 1:
            do
            {
                New=get_node();
                printf("\n Enter the Element");
                scanf("%d",&New->data);
                if(root==NULL)
                    root=New;
                else
                    insert(root,New);
                printf("\n Do u want to continue enter more elements(y/n)");
                scanf(" %c",&ans); /* standard replacement for the non-portable getch() */
            }while(ans=='y'||ans=='Y');
            break;
        case 2:
            printf("\n Enter the element which u want to search");
            scanf("%d",&key);
            tmp=search(root,key,&parent);
            if(tmp==NULL)
                printf("\n The %d Element is not present",key);
            else if(parent==NULL)
                printf("\n Node %d is the root and has no parent",tmp->data);
            else
                printf("\n Parent of node %d is %d",tmp->data,parent->data);
            break;
        case 3:
            printf("\n Enter the Element u wish to delete");
            scanf("%d",&key);
            root=del(root,key);
            break;
        case 4:
            if(root==NULL)
                printf("Tree is not created");
            else
            {
                printf("\n The tree is:");
                inorder(root);
            }
            break;
        }
    } while(ch!=5);
    return 0;
}

node *get_node(void)
{
    node *temp;
    temp=(node*)malloc(sizeof(node));
    if(temp==NULL)
    {
        printf("\n Out of memory");
        exit(1);
    }
    temp->left=NULL;
    temp->right=NULL;
    return temp;
}

void insert(node *root,node *New)
{
    if(New->data<root->data)
    {
        if(root->left==NULL)
            root->left=New;
        else
            insert(root->left,New);
    }
    else if(New->data>root->data)
    {
        if(root->right==NULL)
            root->right=New;
        else
            insert(root->right,New);
    }
    else
        free(New); /* duplicate value: it is not stored, so release the node */
}

node *search(node *root,int key,node **parent)
{
    node *temp;
    temp=root;
    *parent=NULL; /* the root has no parent */
    while(temp!=NULL)
    {
        if(temp->data==key)
        {
            printf("\n The %d Element is present",temp->data);
            return temp;
        }
        *parent=temp;
        if(temp->data>key)
            temp=temp->left;
        else
            temp=temp->right;
    }
    return NULL;
}

node *del(node *root,int key)
{
    node *temp,*parent,*temp_succ;
    temp=search(root,key,&parent);
    if(temp==NULL)
    {
        printf("\n Element not found");
        return root;
    }
    if(temp->left!=NULL && temp->right!=NULL)
    {
        /* two children: copy the inorder successor, then unlink it */
        parent=temp;
        temp_succ=temp->right;
        while(temp_succ->left!=NULL)
        {
            parent=temp_succ;
            temp_succ=temp_succ->left;
        }
        temp->data=temp_succ->data;
        if(parent->left==temp_succ)
            parent->left=temp_succ->right;
        else
            parent->right=temp_succ->right;
        free(temp_succ);
        printf("\n Now deleted it!");
        return root;
    }
    if(temp->left!=NULL && temp->right==NULL)
    {
        if(parent==NULL) /* deleting the root */
            root=temp->left;
        else if(parent->left==temp)
            parent->left=temp->left;
        else
            parent->right=temp->left;
        free(temp);
        printf("\n Now deleted it!");
        return root;
    }
    if(temp->left==NULL && temp->right!=NULL)
    {
        if(parent==NULL) /* deleting the root */
            root=temp->right;
        else if(parent->left==temp)
            parent->left=temp->right;
        else
            parent->right=temp->right;
        free(temp);
        printf("\n Now Deleted it!");
        return root;
    }
    /* remaining case: the node is a leaf */
    if(parent==NULL) /* deleting the root */
        root=NULL;
    else if(parent->left==temp)
        parent->left=NULL;
    else
        parent->right=NULL;
    free(temp);
    printf("\n Now Deleted it!");
    return root;
}

void inorder(node *temp)
{
    if(temp!=NULL)
    {
        inorder(temp->left);
        printf(" %d",temp->data);
        inorder(temp->right);
    }
}
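The search step, including the bookkeeping for the parent, can also be sketched in C++ with the result returned as a small struct instead of through a pointer-to-pointer parameter. This is an illustrative variant under that assumption, not the program's search() function; Node, SearchResult, and Search are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Simplified BST node for illustration.
struct Node {
    int data;
    Node* left;
    Node* right;
};

// Hypothetical result type: the node holding the key and that node's parent.
// Both are NULL when the key is absent; parent is NULL when the key is at the root.
struct SearchResult {
    Node* found;
    Node* parent;
};

// Iterative BST search: at each node go left for smaller keys and right for
// larger ones, remembering the node just left behind as the parent.
SearchResult Search(Node* root, int key) {
    SearchResult result = {NULL, NULL};
    Node* parent = NULL;
    Node* walk = root;
    while (walk != NULL) {
        if (key == walk->data) {
            result.found = walk;
            result.parent = parent;
            return result;
        }
        parent = walk;
        walk = (key < walk->data) ? walk->left : walk->right;
    }
    return result;   // key not present: both fields stay NULL
}
```

Checking result.found before using result.parent avoids dereferencing a null pointer when the key is missing or is stored in the root.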

4.3 Implementation of an Expression tree to perform tree traversals

To implement an expression tree and to perform pre-order, in-order and post-order traversals.

Algorithm:
Step 1: Start the process.
Step 2: Initialize and declare variables.
Step 3: Enter the postfix expression whose nodes will be stored on the stack.
Step 4: In the pop operation, check whether the stack is empty; otherwise Node = stack[top], and decrement the top value.
Step 5: In the push operation, check the size of the stack; if the stack is not full, top is incremented and stack[top] = Node.
Step 6: Allocate memory for each new character and set its left and right pointers to NULL.
Step 7: If the character is +, *, / or -, pop the right and then the left child of temp from the stack and push temp onto the stack.
Step 8: The Preorder function traverses the root node, then the left and right subtrees.
Step 9: The Inorder function traverses the left subtree, the root, and then the right subtree.
Step 10: The Postorder function traverses the left subtree, the right subtree, and then the root.
Step 11: Stop the process.

CODING:

#include<stdio.h>
#include<stdlib.h>
#include<ctype.h>
#define size 20

typedef struct node
{
    char data;
    struct node *left;
    struct node *right;
}btree;

btree *stack[size];
int top;

btree *create(char exp[]);
btree *pop(void);
void push(btree *);
void inorder(btree *root);
void preorder(btree *root);
void postorder(btree *root);

int main(void)
{
    btree *root;
    char exp[80];
    printf("Enter the Postfix Expression");
    scanf("%79s",exp); /* the width limit keeps the input inside the buffer */
    top=-1;
    root=create(exp);
    printf("\n The tree is created");
    printf("\n The inorder traversal of tree \n");
    inorder(root);
    printf("\n The preorder traversal of tree \n");
    preorder(root);
    printf("\n The Postorder traversal of tree \n");
    postorder(root);
    return 0;
}

btree *create(char exp[])
{
    btree *temp;
    int pos;
    char ch;
    pos=0;
    ch=exp[pos];
    while(ch!='\0')
    {
        temp=(btree *)malloc(sizeof(btree));
        if(temp==NULL)
        {
            printf("\n Out of memory");
            exit(1);
        }
        temp->left=temp->right=NULL;
        temp->data=ch;
        if(isalpha((unsigned char)ch))
            push(temp);
        else if(ch=='+' || ch=='-' || ch=='*' || ch=='/')
        {
            temp->right=pop();
            temp->left=pop();
            push(temp);
        }
        else
        {
            printf("\n Invalid Character in Expression\n");
            free(temp); /* the node is not used */
        }
        pos++;
        ch=exp[pos];
    }
    temp=pop();
    return(temp);
}

void push(btree *Node)
{
    if(top+1 >= size)
    {
        printf("\n Error : Stack is Full\n");
        return; /* do not write past the end of the array */
    }
    top++;
    stack[top]=Node;
}

btree *pop(void)
{
    btree *Node;
    if(top==-1)
    {
        printf("\n Error: Stack is Empty\n");
        return NULL; /* nothing to pop */
    }
    Node=stack[top];
    top--;
    return(Node);
}

void inorder(btree *root)
{
    btree *temp;
    temp=root;
    if(temp!=NULL)
    {
        inorder(temp->left);
        printf("%c",temp->data);
        inorder(temp->right);
    }
}

void preorder(btree *root)
{
    btree *temp;
    temp=root;
    if(temp!=NULL)
    {
        printf("%c",temp->data);
        preorder(temp->left);
        preorder(temp->right);
    }
}

void postorder(btree *root)
{
    btree *temp;
    temp=root;
    if(temp!=NULL)
    {
        postorder(temp->left);
        postorder(temp->right);
        printf("%c",temp->data);
    }
}

OUTPUT:
Enter the Postfix Expression abc*+
The tree is created
The inorder traversal of tree
a+b*c
The preorder traversal of tree
+a*bc
The Postorder traversal of tree
abc*+

4.4 Check your progress

1. What are the special properties that a tree must have so that it can be called as BST?

2. Write C++ code to find and replace a value in BST.

3. Does the traversal of BST give nodes in any specific order? Why or why not?

4. What are expression trees?


BLOCK 4

Unit 1 Heaps

Structure
1.0 Unit Objective
1.1 Heaps Introduction
1.2 Heaps
1.3 Implementation
1.4 Heap Sort Algorithm Analysis
1.5 Check your progress

1.0 Unit Objective

After going through this unit, you should be able to understand:

● Heap as a special tree
● Heap structure
● C++ implementation of Heap
● Heap Applications
● Heap sort algorithm and analysis

1.1 Heaps Introduction

Another type of special binary tree is called a heap. In order to understand what a heap is, we must first define full and complete binary trees. In a full binary tree, every node is either a parent with two children or a leaf. In a complete binary tree, all the levels except the last must be completely filled. In the last level, nodes must be filled in from the left side without gaps between them; however, the level does not have to be filled to the end.

A heap is a complete binary tree which is partially ordered with either the max-heap or the min-heap property. That is, if a heap is a max-heap, then the children of every node have values less than or equal to that node's value. In a min-heap, the children of every node have values greater than or equal to the node's own. With such a setup, the root always has either the highest or the lowest value in the tree. For demonstration purposes, we will show how to implement a max-heap, as it is also an important part of the HeapSort algorithm, which will be covered later. [It is easy to change the code to work as a min-heap by changing the relational operators between node values.] A max-heap is usually used for maintaining priority queues. Priority queues store values and release the object with the highest "priority" (or value) when needed. For instance, a value is associated with a particular task in a program, put into such a structure, and then executed based on its position. Since a heap must conform to the complete tree property, simple formulae can be developed to find the logical position of a node's children and parent given the position of the node itself. It is therefore very easy and efficient to implement a heap using arrays, and this is done most of the time, even if dynamic memory allocation is available.
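The priority-queue use described above can be tried out with the C++ standard library, whose std::priority_queue behaves as a max-heap by default. The following is only a sketch: the Task type, the function name runInPriorityOrder and the sample priorities are assumptions made for illustration and are not part of this unit's Heap class.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical task record: (priority, name). A larger number is more urgent.
using Task = std::pair<int, std::string>;

// Drains a max-priority queue and returns the task names in the order in
// which they would be executed (highest priority first).
std::vector<std::string> runInPriorityOrder(const std::vector<Task>& tasks) {
    std::priority_queue<Task> pq;            // max-heap by default
    for (const Task& t : tasks) pq.push(t);  // each push costs O(log n)

    std::vector<std::string> order;
    while (!pq.empty()) {
        order.push_back(pq.top().second);    // the root holds the largest priority
        pq.pop();                            // O(log n): restores the heap property
    }
    assert(order.size() == tasks.size());
    return order;
}
```

Pairs are compared by their first member, so the integer priority decides which task sits at the root.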

1.2 Heaps

Heaps are based on the notion of a complete tree, for which we gave an informal definition earlier. Formally: A binary tree is completely full if it is of height h and has $2^{h+1} - 1$ nodes. A binary tree of height h is complete iff


a. it is empty or

b. its left subtree is complete of height h-1 and its right subtree is completely full of height h-2 or

c. its left subtree is completely full of height h-1 and its right subtree is complete of height h-1.

A complete tree is filled from the left:

● all the leaves are on the same level or on two adjacent ones, and
● all nodes at the lowest level are as far to the left as possible.

A binary tree has the heap property iff 

a. it is empty or

b. the key in the root is larger than that in either child and both subtrees have the heap property.

A heap can be used as a priority queue: the highest priority item is at the root and is trivially extracted. But if the root is deleted, we are left with two sub-trees and we must efficiently recreate a single tree with the heap property. The value of the heap structure is that we can both extract the highest priority item and insert a new one in O(logn) time.

Let's start with this heap.

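A deletion will remove the T at the root. Because a complete tree is filled from the left, the position that must become empty is the last one, which is occupied by the M.

Put it in the vacant root position.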

This has violated the condition that the root must be greater than each of its children.

So interchange the M with the larger of its children.

The left subtree has now lost the heap property. So again interchange the M with the larger of its children.

This tree is now a heap again, so we're finished.

We need to make at most h interchanges of a root of a subtree with one of its children to fully restore the heap property. Thus deletion from a heap is O(h) or O(logn).
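The deletion steps above can be sketched on an array-based heap. This is an illustration only: the function name removeMax and the 0-based layout (children of index i at 2i+1 and 2i+2) are assumptions for this sketch, and the letters in the usage note are a hypothetical heap, not necessarily the one in the missing figure.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Removes and returns the root of a max-heap stored 0-based in `heap`.
// Assumption for this sketch: children of index i are at 2i+1 and 2i+2.
char removeMax(std::vector<char>& heap) {
    assert(!heap.empty());
    char top = heap.front();
    heap.front() = heap.back();   // move the last leaf into the vacant root
    heap.pop_back();

    std::size_t i = 0;
    const std::size_t n = heap.size();
    while (true) {
        std::size_t left = 2 * i + 1, right = left + 1, largest = i;
        if (left < n && heap[left] > heap[largest]) largest = left;
        if (right < n && heap[right] > heap[largest]) largest = right;
        if (largest == i) break;             // heap property holds again
        std::swap(heap[i], heap[largest]);   // interchange with the larger child
        i = largest;
    }
    return top;
}
```

For the heap T S R P N O A E I H G M, the call returns T, moves M to the root, and then interchanges M with S and with P, after which the array reads S P R M N O A E I H G. The loop runs at most once per level, which is where the O(h) bound comes from.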


Addition to a heap

To add an item to a heap, we follow the reverse procedure.

Place it in the next leaf position and move it up.

Again, we require O(h) or O(logn) exchanges.
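A minimal sketch of the addition procedure, again assuming a 0-based array in which the parent of index i is at (i-1)/2. The function name heapInsert is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Adds `item` to a max-heap stored 0-based in `heap`.
// Assumption for this sketch: the parent of index i is at (i - 1) / 2.
void heapInsert(std::vector<char>& heap, char item) {
    heap.push_back(item);                    // place it in the next leaf position
    std::size_t i = heap.size() - 1;
    while (i > 0) {
        std::size_t parent = (i - 1) / 2;
        if (heap[parent] >= heap[i]) break;  // parent is large enough: done
        std::swap(heap[parent], heap[i]);    // otherwise move the item up
        i = parent;
    }
    // Either the item reached the root or its parent is not smaller than it.
    assert(i == 0 || heap[(i - 1) / 2] >= heap[i]);
}
```

Inserting T into the heap S P R M N O A E I H G moves it up past O, R and S, one exchange per level, until it becomes the new root.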

Storage of complete trees

The properties of a complete tree lead to a very efficient storage mechanism using n sequential locations in an array.

If we number the nodes from 1 at the root and place:

● the left child of node k at position 2k
● the right child of node k at position 2k+1

Then the 'fill from the left' nature of the complete tree ensures that the heap can be stored in consecutive locations in an array.

Viewed as an array, we can see that the nth node is always in index position n.
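The numbering scheme can be checked with a few helper functions. This sketch keeps the 1-based numbering of the text by leaving slot 0 of the array unused; the helper names are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// 1-based numbering as in the text: slot 0 of the array is left unused.
inline std::size_t leftChild(std::size_t k)  { return 2 * k; }
inline std::size_t rightChild(std::size_t k) { return 2 * k + 1; }
inline std::size_t parentOf(std::size_t k)   { assert(k > 1); return k / 2; }

// Returns true if the nodes stored in a[1..n] satisfy the max-heap property.
bool isMaxHeap1Based(const std::vector<int>& a) {
    if (a.size() < 2) return true;        // no nodes stored
    const std::size_t n = a.size() - 1;   // a[0] is only a placeholder
    for (std::size_t k = 2; k <= n; ++k)
        if (a[parentOf(k)] < a[k]) return false;
    return true;
}
```

Because the parent of node k is simply k/2 (integer division), no pointers are needed to move up or down the tree.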

The code for extracting the highest priority item from a heap is, naturally, recursive. Once we've extracted the root (highest priority) item and swapped the last item into its place, we simply call MoveDown recursively until we get to the bottom of the tree.

Heaps provide us with a method of sorting, known as heapsort. However, we will examine and analyse the simplest method of sorting first.

Complete Tree A balanced tree in which the distance from the root to any leaf is either h or h-1.

1.3 Implementation

In an array implementation, we must allocate a certain amount of memory space that may be used for the heap. The space may not be used up, and is therefore a waste of memory. Other times, we may need to add more nodes to the heap than the allocated memory allows for. However, we usually allocate more space than we think may be required in order to ensure the heap is usable. If we have a tree of very large structures, this space can be significant. However, this is the price we pay for greater efficiency. It should be noted, though, that we no longer have three pointers for every tree node (left, right, parent), which took up a lot of space in the dynamic memory implementation. The logical position of a node in a heap corresponds to the index of the node's array position, thereby making it very easy to access any node. A generic implementation is shown below.

#include <cassert>   //assert() is used by the member functions
#include <utility>   //std::swap
using std::swap;

const int MAX_SIZE = 100; //the maximum number of elements our heap should have.
//This may be changed to any number so long as memory permits, depending on how the heap will be used.

template <class ItemType>
class Heap
{
public:
    Heap();
    int left(int) const;
    int right(int) const;
    int parent(int) const;
    void insert(const ItemType&);
    ItemType remove_max();
    bool IsEmpty() const;
    bool IsFull() const;
    int count() const;
    ItemType value(int) const;

private:
    ItemType array[MAX_SIZE];
    int elements; //how many elements are in the heap
    void ReHeap(int);
};

//default constructor - initialize private variables
template <class ItemType>
Heap<ItemType>::Heap()
{
    elements = 0;
}


The left(), right(), and parent() functions return the index positions of a node's children and parent. Since the index position of each element corresponds to its logical position in the heap, the functions use simple formulae that are derived by observing the heap structure.

template <class ItemType>
int Heap<ItemType>::left(int root) const
{
    assert((root * 2) + 1 < elements); //does a left child exist?
    return (root * 2) + 1;
}

template <class ItemType>
int Heap<ItemType>::right(int root) const
{
    assert(root < (elements-1)/2); //does a right child exist?
    return (root * 2) + 2;
}

template <class ItemType>
int Heap<ItemType>::parent(int child) const
{
    assert(child != 0); //main root has no parent
    return (child - 1) / 2;
}

The insert() function accepts the new item value as its parameter. It works by inserting the new item at the end of the heap, and swapping positions with the parent, if the parent has a smaller value than the item. The new item continues to travel up the heap, swapping its position with its new parents until the item's parent is larger than it.

template <class ItemType>
void Heap<ItemType>::insert(const ItemType &item)
{
    assert(!IsFull());
    array[elements] = item; //elements represents the array position after the last, since indexing starts with 0
    int new_pos = elements; //index of the new item
    elements++; //update the number of elements in heap

    //loop while the item has not become the main root, and while its value is greater than its parent's
    while((new_pos != 0) && (array[new_pos] > array[parent(new_pos)]))
    {
        swap(array[new_pos], array[parent(new_pos)]); //swap the value of item with its lesser parent
        new_pos = parent(new_pos); //update the item's position
    }
}

The remove_max() removes the item with the highest priority and returns its value. The item is swapped with the last item, and elements is updated to one less.

Notice the item is not physically deleted, it will remain as part of the array. It will not be part of the heap since elements is updated, and the heap goes only as far as (elements -1).

The new root may not have the largest priority, therefore the ReHeap() function is then used to insert the new root into its proper position, thus conserving the heap property.

template <class ItemType>
ItemType Heap<ItemType>::remove_max()
{
    assert(!IsEmpty());
    elements--; //update the number of elements in heap
    if(elements != 0) //if we didn't delete the root
    {
        swap(array[0], array[elements]);
        ReHeap(0);
    }
    return array[elements];
}

The ReHeap() function checks if either of root's children is bigger than it, in which case the bigger child is swapped with root. The process is then continued using recursion, on root's new children. The function stops when root has no children or is at least as big as both of them.


template <class ItemType>
void Heap<ItemType>::ReHeap(int root)
{
    int child = (root * 2) + 1; //index of the left child
    if(child >= elements) //root has no children, stop.
        return;
    if((child < (elements-1)) && (array[child] < array[child+1])) //if a right child exists, and it's bigger than the left child, it will be used
        child++;
    if(array[root] >= array[child]) //if root is bigger than its largest child, stop.
        return;
    swap(array[root], array[child]); //swap root and its biggest child
    ReHeap(child); //continue the process on root's new children
}

The rest of our member functions are easy to implement.

template <class ItemType>
int Heap<ItemType>::count() const
{
    return elements;
}

template <class ItemType>
ItemType Heap<ItemType>::value(int pos) const
{
    assert(pos < elements); //is pos a valid index in the heap
    return array[pos];
}

template <class ItemType>
bool Heap<ItemType>::IsEmpty() const
{
    return (elements == 0);
}

template <class ItemType>
bool Heap<ItemType>::IsFull() const
{
    return (elements == MAX_SIZE);
}

1.4 Heap Sort Algorithm Analysis

The heap sort is usually the slowest in practice of the common O(n log n) sorting algorithms, but unlike the merge and quick sorts it doesn't require massive recursion or multiple arrays to work. This makes it an attractive option for very large data sets of millions of items.

The heap sort works as its name suggests - it begins by building a heap out of the data set, and then removing the largest item and placing it at the end of the sorted array. After removing the largest item, it reconstructs the heap and removes the largest remaining item and places it in the next open position from the end of the sorted array. This is repeated until there are no items left in the heap and the sorted array is full. Elementary implementations require two arrays - one to hold the heap and the other to hold the sorted elements.

To do an in-place sort and save the space the second array would require, the algorithm below "cheats" by using the same array to store both the heap and the sorted array. Whenever an item is removed from the heap, it frees up a space at the end of the array that the removed item can be placed in.
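The in-place idea can also be sketched with the heap functions of the C++ standard library (std::make_heap and std::pop_heap from the algorithm header). This is an alternative illustration of the same technique, not the listing given in this unit; the function name heapSortInPlace is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sorts `a` in ascending order using one array for both the heap and the
// sorted part: the heap shrinks from the right while the sorted part grows.
void heapSortInPlace(std::vector<int>& a) {
    std::make_heap(a.begin(), a.end());   // build a max-heap over the whole array
    for (auto end = a.end(); end != a.begin(); --end) {
        // Moves the largest item of [begin, end) to position end-1 and
        // restores the heap property on the shorter range [begin, end-1).
        std::pop_heap(a.begin(), end);
    }
    assert(std::is_sorted(a.begin(), a.end()));
}
```

Each pop frees exactly one slot at the end of the heap, and that slot is where the removed item belongs in the sorted order.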

Pros: In-place and non-recursive, making it a good choice for extremely large data sets

Cons: Slower than the merge and quick sorts.

As mentioned above, the heap sort is slower than the merge and quick sorts but doesn't use multiple arrays or massive recursion like they do. This makes it a good choice for really large sets, but most modern computers have enough memory and processing power to handle the faster sorts unless over a million items are being sorted.

The "million item rule" is just a rule of thumb for common applications - high-end servers and workstations can probably safely handle sorting tens of millions of items with the quick or merge sorts. But if you're working on a common user-level application, there's always going to be some yahoo who tries to run it on junk machine older than the programmer who wrote it, so better safe than sorry.

Source Code

Below is the basic heap sort algorithm. The siftDown() function builds and reconstructs the heap.

void siftDown(int numbers[], int root, int bottom);

void heapSort(int numbers[], int array_size)
{
    int i, temp;

    /* build the heap; bottom is the index of the last heap element */
    for (i = (array_size / 2) - 1; i >= 0; i--)
        siftDown(numbers, i, array_size - 1);

    /* repeatedly move the largest item to the end and shrink the heap */
    for (i = array_size - 1; i >= 1; i--)
    {
        temp = numbers[0];
        numbers[0] = numbers[i];
        numbers[i] = temp;
        siftDown(numbers, 0, i - 1);
    }
}

void siftDown(int numbers[], int root, int bottom)
{
    int done, maxChild, temp;

    done = 0;
    /* with 0-based indexing the children of root are at root*2+1 and root*2+2 */
    while ((root * 2 + 1 <= bottom) && (!done))
    {
        if (root * 2 + 1 == bottom)
            maxChild = root * 2 + 1;
        else if (numbers[root * 2 + 1] > numbers[root * 2 + 2])
            maxChild = root * 2 + 1;
        else
            maxChild = root * 2 + 2;

        if (numbers[root] < numbers[maxChild])
        {
            temp = numbers[root];
            numbers[root] = numbers[maxChild];
            numbers[maxChild] = temp;
            root = maxChild;
        }
        else
            done = 1;
    }
}

1.5 Check your progress

13. Describe the term Heap. What conditions must a tree satisfy to be called a heap?

14. Describe heap sort algorithm. Write C++ program for Heapsort.

15. Write C++ code for insertion in a Heap?

16. Given numbers 34, 65, 21, 54, 23, 56, 8, 59, show stepwise additions of numbers to the heap.

Unit 2 Height Balance Trees

Structure
2.0 Unit Objective
2.1 Red-Black Trees
2.2 Red-Black Tree Operation
2.3 AVL Trees
2.4 Implementation of insertion operation in AVL tree
2.5 Check your progress

2.0 Unit Objective

After going through this unit, you should be able to understand:

● Red Black tree
● Red Black tree operation
● Concept of Height balance
● AVL trees
● Height balancing operations
● Implementation of insertion in AVL tree

2.1 Red-Black Trees

A red-black tree is a binary search tree with one extra attribute for each node: the colour, which is either red or black. We also need to keep track of the parent of each node, so that a red-black tree's node structure would be:

struct t_red_black_node {
    enum { red, black } colour;
    void *item;
    struct t_red_black_node *left, *right, *parent;
};

For the purpose of this discussion, the NULL nodes which terminate the tree are considered to be the leaves and are coloured black.

Definition of a red-black tree

A red-black tree is a binary search tree which has the following red-black properties:

1. Every node is either red or black.
2. Every leaf (NULL) is black.
3. If a node is red, then both its children are black.
4. Every simple path from a node to a descendant leaf contains the same number of black nodes.

Property 3 implies that on any path from the root to a leaf, red nodes must not be adjacent. However, any number of black nodes may appear in a sequence.

Basic red-black tree with the sentinel nodes added. Implementations of the red-black tree algorithms will usually include the sentinel nodes as a convenient means of flagging that you have reached a leaf node. They are the NULL black nodes of property 2.


The number of black nodes on any path from, but not including, a node x to a leaf is called the black-height of a node, denoted bh(x). We can prove the following lemma: a red-black tree with n internal nodes has height at most $2\log_2(n+1)$.
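The black-height definition translates directly into a recursive check. The sketch below is only an illustration: the node type RBNode and the function blackHeight are hypothetical names, NULL children are treated as the black leaves of property 2, and the function returns -1 when property 3 or property 4 is violated somewhere below x.

```cpp
#include <cassert>

enum Colour { RED, BLACK };

// Hypothetical node type for this sketch (no item or parent pointer needed).
struct RBNode {
    Colour colour;
    RBNode* left;
    RBNode* right;
};

// Returns bh(x), counting NULL leaves as black and not counting x itself.
// Returns -1 if a red node has a red child or if two paths disagree.
int blackHeight(const RBNode* x) {
    if (x == nullptr) return 0;               // a NULL leaf has nothing below it
    const RBNode* kids[2] = { x->left, x->right };
    int result[2];
    for (int i = 0; i < 2; ++i) {
        const RBNode* c = kids[i];
        bool childIsBlack = (c == nullptr) || (c->colour == BLACK);
        if (x->colour == RED && !childIsBlack) return -1;   // property 3 violated
        int below = blackHeight(c);
        if (below < 0) return -1;
        result[i] = below + (childIsBlack ? 1 : 0);
    }
    return (result[0] == result[1]) ? result[0] : -1;       // property 4
}
```

A black root with two red children whose own children are all NULL has black-height 1, because each path passes through exactly one black (NULL) leaf.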

As with heaps, additions and deletions from red-black trees can destroy the red-black property, so we need to restore it. To do this we need to look at some operations on red-black trees.

Rotations

A rotation is a local operation in a search tree that preserves in-order traversal key ordering.

Note that in both trees, an in-order traversal yields: AxByC

The left_rotate operation may be encoded:

void left_rotate( Tree T, node x )
{
    node y;
    y = x->right;
    /* Turn y's left sub-tree into x's right sub-tree */
    x->right = y->left;
    if ( y->left != NULL )
        y->left->parent = x;
    /* y's new parent was x's parent */
    y->parent = x->parent;
    /* Set the parent to point to y instead of x */
    /* First see whether we're at the root */
    if ( x->parent == NULL )
        T->root = y;
    else if ( x == (x->parent)->left )
        /* x was on the left of its parent */
        x->parent->left = y;
    else
        /* x must have been on the right */
        x->parent->right = y;
    /* Finally, put x on y's left */
    y->left = x;
    x->parent = y;
}
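To see that a rotation keeps the in-order sequence unchanged, the steps of left_rotate can be run on a small tree. This is a self-contained sketch: the Node type, the root passed by reference in place of the Tree handle, and the inorder helper are assumptions made so that the example compiles on its own.

```cpp
#include <cassert>
#include <string>

// Hypothetical node type for this sketch: one character key plus links.
struct Node {
    char key;
    Node* left;
    Node* right;
    Node* parent;
};

// Left rotation following the steps in the text; `root` is updated when x
// was the root of the tree.
void leftRotate(Node*& root, Node* x) {
    Node* y = x->right;
    assert(y != nullptr);                 // a left rotation needs a right child
    x->right = y->left;                   // y's left sub-tree becomes x's right
    if (y->left != nullptr) y->left->parent = x;
    y->parent = x->parent;
    if (x->parent == nullptr) root = y;
    else if (x == x->parent->left) x->parent->left = y;
    else x->parent->right = y;
    y->left = x;                          // finally, put x on y's left
    x->parent = y;
}

// Returns the keys of the tree in in-order sequence.
std::string inorder(const Node* n) {
    if (n == nullptr) return "";
    return inorder(n->left) + n->key + inorder(n->right);
}
```

With x at the root, A as its left sub-tree and y (holding B and C) as its right child, inorder gives AxByC both before and after leftRotate, while the root changes from x to y.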


Insertion

Insertion is somewhat complex and involves a number of cases. Note that we start by inserting the new node, x, in the tree just as we would for any other binary tree, using the tree_insert function. This new node is labelled red, and possibly destroys the red-black property. The main loop moves up the tree, restoring the red-black property.

void rb_insert( Tree T, node x )
{
    node y;
    /* Insert in the tree in the usual way */
    tree_insert( T, x );
    /* Now restore the red-black property */
    x->colour = red;
    while ( (x != T->root) && (x->parent->colour == red) )
    {
        if ( x->parent == x->parent->parent->left )
        {
            /* If x's parent is a left, y is x's right 'uncle' */
            y = x->parent->parent->right;
            if ( y->colour == red )
            {
                /* case 1 - change the colours */
                x->parent->colour = black;
                y->colour = black;
                x->parent->parent->colour = red;
                /* Move x up the tree */
                x = x->parent->parent;
            }
            else
            {
                /* y is a black node */
                if ( x == x->parent->right )
                {
                    /* and x is to the right */
                    /* case 2 - move x up and rotate */
                    x = x->parent;
                    left_rotate( T, x );
                }
                /* case 3 */
                x->parent->colour = black;
                x->parent->parent->colour = red;
                right_rotate( T, x->parent->parent );
            }
        }
        else
        {
            /* repeat the "if" part with right and left exchanged */
        }
    }
    /* Colour the root black */
    T->root->colour = black;
}


/* Extract the highest priority from the heap */
/* The root is stored in items[0], so the children of k are at 2k+1 and 2k+2 */
#define LEFT(k) (2*(k)+1)
#define RIGHT(k) (2*(k)+2)
#define EMPTY(c,k) ((k)>=(c)->item_cnt)
#define SWAP(i,j) { void *x = c->items[i]; \
                    c->items[i] = c->items[j]; c->items[j] = x; }

void MoveDown( Collection c, int k )
{
    int larger, right, left;
    left = LEFT(k);
    right = RIGHT(k);
    if ( !EMPTY(c,left) ) /* Termination condition! */
    {
        larger = left;
        if ( !EMPTY(c,right) )
        {
            if ( ItemCmp( c->items[right], c->items[larger] ) > 0 )
                larger = right;
        }
        /* Swap only if the item at k is smaller than its larger child */
        if ( ItemCmp( c->items[k], c->items[larger] ) < 0 )
        {
            SWAP( k, larger );
            MoveDown( c, larger );
        }
    }
}

void *HighestPriority( Collection c )
/* Return the highest priority item
   Pre-condition: (c is a collection created by a call to ConsCollection) &&
                  (existing item count >= 1) && (item != NULL)
   Post-condition: item has been deleted from c */
{
    int cnt;
    void *save;
    assert( c != NULL );
    assert( c->item_cnt >= 1 );
    /* Save the root */
    save = c->items[0];
    /* Put the last item in the root */
    cnt = c->item_cnt;
    c->items[0] = c->items[cnt-1];
    /* Adjust the count */
    c->item_cnt--;
    /* Move the new root item down if necessary */
    MoveDown( c, 0 );
    return save;
}

2.2 Red-Black Tree Operation

Here's an example of insertion into a red-black tree

Here's the original tree .. Note that in the following diagrams, the black sentinel nodes have been omitted to keep the diagrams simple.

The tree insert routine has just been called to insert node "4" into the tree.

This is no longer a red-black tree: there are two successive red nodes on the path 11 - 2 - 7 - 5 - 4. Mark the new node, x, and its uncle, y. y is red, so we have case 1 ...

Change the colours of nodes 5, 7 and 8. Move x up to its grandparent, 7. x's parent (2) is still red, so this isn't a red-black tree yet. Mark the uncle, y. In this case, the uncle is black, so we have case 2 ... Move x up and rotate left.

2.3 AVL Trees

An AVL tree is another balanced binary search tree. Named after their inventors, Adelson-Velskii and Landis, they were the first dynamically balanced trees to be proposed. Like red-black trees, they are not perfectly balanced, but pairs of sub-trees differ in height by at most 1, maintaining an O(logn) search time. Addition and deletion operations also take O(logn) time.

Definition of an AVL tree

An AVL tree is a binary search tree which has the following properties:

1. The sub-trees of every node differ in height by at most one.
2. Every sub-tree is an AVL tree.

Balance requirement for an AVL tree: the left and right sub-trees differ by at most 1 in height.

You need to be careful with this definition: it permits some apparently unbalanced trees! For example, here are some trees:

Tree (figure not reproduced). AVL tree? Yes. Examination shows that each left sub-tree has a height 1 greater than each right sub-tree.

Tree (figure not reproduced). AVL tree? No. Sub-tree with root 8 has height 4 and sub-tree with root 18 has height 2.
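The balance requirement can be tested mechanically. The sketch below checks only the height condition at every node, not the search-tree ordering of the keys; the names AvlNode, checkedHeight and isHeightBalanced are assumptions for illustration, and an empty sub-tree is given height 0.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Hypothetical node type for this sketch.
struct AvlNode {
    int key;
    AvlNode* left;
    AvlNode* right;
};

// Returns the height of the sub-tree (empty = 0), or -1 as soon as some node
// is found whose sub-trees differ in height by more than one.
int checkedHeight(const AvlNode* n) {
    if (n == nullptr) return 0;
    int hl = checkedHeight(n->left);
    int hr = checkedHeight(n->right);
    if (hl < 0 || hr < 0 || std::abs(hl - hr) > 1) return -1;
    return 1 + std::max(hl, hr);
}

// True if every node satisfies the AVL balance requirement.
bool isHeightBalanced(const AvlNode* root) {
    return checkedHeight(root) >= 0;
}
```

A tree in which every left sub-tree is exactly one level taller than its right sibling passes this check, which is why such apparently lopsided trees still count as AVL trees.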

Insertion

As with the red-black tree, insertion is somewhat complex and involves a number of cases. Implementations of AVL tree insertion may be found in many textbooks: they rely on adding an extra attribute, the balance factor, to each node. This factor indicates whether the tree is left-heavy (the height of the left sub-tree is 1 greater than the right sub-tree), balanced (both sub-trees are the same height) or right-heavy (the height of the right sub-tree is 1 greater than the left sub-tree). If the balance would be destroyed by an insertion, a rotation is performed to correct the balance.

A new item has been added to the left subtree of node 1, causing its height to become 2 greater than 2's right sub-tree (shown in green). A right-rotation is performed to correct the imbalance.

2.4 Implementation of insertion operation in AVL tree

To implement AVL tree with insert operation.

Algorithm
Step 1: Start the process.
Step 2: Initialize and declare the variables.
Step 3: Enter the choice.
Step 4: If the choice is INSERT:
a) Find the place to insert the new node in the tree. Traverse back from the inserted position until reaching a node X with a balance factor of +1 or -1.
b) If no such node X was found, make another pass from the root, updating the balance factors, and terminate the process.
c) If BF(X) = 1 and the new node was inserted in X's right sub-tree, or BF(X) = -1 and the new node was inserted in X's left sub-tree, set BF(X) = 0, update the balance factors on the path to X and terminate the process.
d) Otherwise, classify the imbalance at X and perform the appropriate one of the four rotations: left to left, right to right, left to right or right to left.
Step 5: If the choice is DISPLAY, print the inorder traversal of the tree.
Step 6: Stop the process.

CODING:

#include <stdio.h>
#include <stdlib.h>   /* malloc, exit */

enum { FALSE, TRUE };

struct node
{
    int info;
    int balance;      /* 1: left-heavy, 0: balanced, -1: right-heavy */
    struct node *lchild;
    struct node *rchild;
};

struct node *insert(int, struct node *, int *);
struct node *search(struct node *, int);
void display(struct node *, int);
void inorder(struct node *);

int main(void)
{
    int ht_inc;       /* set to TRUE when an insertion increased the height */
    int info;
    int choice;
    struct node *root = NULL;

    while (1)
    {
        printf("1.Insert\n");
        printf("2.Display\n");
        printf("3.Quit\n");
        printf("Enter your choice : ");
        scanf("%d", &choice);
        switch (choice)
        {
        case 1:
            printf("Enter the value to be inserted : ");
            scanf("%d", &info);
            if (search(root, info) == NULL)
                root = insert(info, root, &ht_inc);
            else
                printf("Duplicate value ignored\n");
            break;
        case 2:
            if (root == NULL)
            {
                printf("Tree is empty\n");
                continue;
            }
            printf("Tree is :\n");
            display(root, 1);
            printf("\n\n");
            printf("Inorder Traversal is: ");
            inorder(root);
            printf("\n");
            break;
        case 3:
            exit(0);
        default:
            printf("Wrong choice\n");
        }
    }
}

struct node *search(struct node *ptr, int info)
{
    if (ptr != NULL)
    {
        if (info < ptr->info)
            ptr = search(ptr->lchild, info);
        else if (info > ptr->info)
            ptr = search(ptr->rchild, info);
    }
    return (ptr);
}

struct node *insert(int info, struct node *pptr, int *ht_inc)
{
    struct node *aptr;
    struct node *bptr;

    if (pptr == NULL)
    {
        pptr = (struct node *)malloc(sizeof(struct node));
        pptr->info = info;
        pptr->lchild = NULL;
        pptr->rchild = NULL;
        pptr->balance = 0;
        *ht_inc = TRUE;
        return (pptr);
    }

    if (info < pptr->info)
    {
        pptr->lchild = insert(info, pptr->lchild, ht_inc);
        if (*ht_inc == TRUE)
        {
            switch (pptr->balance)
            {
            case -1: /* was right-heavy, now balanced */
                pptr->balance = 0;
                *ht_inc = FALSE;
                break;
            case 0: /* was balanced, now left-heavy */
                pptr->balance = 1;
                break;
            case 1: /* was left-heavy, a rotation is needed */
                aptr = pptr->lchild;
                if (aptr->balance == 1)
                {
                    printf("Left to Left Rotation\n");
                    pptr->lchild = aptr->rchild;
                    aptr->rchild = pptr;
                    pptr->balance = 0;
                    aptr->balance = 0;
                    pptr = aptr;
                }
                else
                {
                    printf("Left to right rotation\n");
                    bptr = aptr->rchild;
                    aptr->rchild = bptr->lchild;
                    bptr->lchild = aptr;
                    pptr->lchild = bptr->rchild;
                    bptr->rchild = pptr;

                    if (bptr->balance == 1)
                        pptr->balance = -1;
                    else
                        pptr->balance = 0;
                    if (bptr->balance == -1)
                        aptr->balance = 1;
                    else
                        aptr->balance = 0;
                    bptr->balance = 0;
                    pptr = bptr;
                }
                *ht_inc = FALSE;
            }
        }
    }
    else if (info > pptr->info)
    {
        pptr->rchild = insert(info, pptr->rchild, ht_inc);
        if (*ht_inc == TRUE)
        {
            switch (pptr->balance)
            {
            case 1: /* was left-heavy, now balanced */
                pptr->balance = 0;
                *ht_inc = FALSE;
                break;
            case 0: /* was balanced, now right-heavy */
                pptr->balance = -1;
                break;
            case -1: /* was right-heavy, a rotation is needed */
                aptr = pptr->rchild;
                if (aptr->balance == -1)
                {
                    printf("Right to Right Rotation\n");
                    pptr->rchild = aptr->lchild;
                    aptr->lchild = pptr;
                    pptr->balance = 0;
                    aptr->balance = 0;
                    pptr = aptr;
                }
                else
                {
                    printf("Right to Left Rotation\n");
                    bptr = aptr->lchild;
                    aptr->lchild = bptr->rchild;
                    bptr->rchild = aptr;
                    pptr->rchild = bptr->lchild;
                    bptr->lchild = pptr;

                    if (bptr->balance == -1)
                        pptr->balance = 1;
                    else
                        pptr->balance = 0;
                    if (bptr->balance == 1)
                        aptr->balance = -1;
                    else
                        aptr->balance = 0;
                    bptr->balance = 0;
                    pptr = bptr;
                }
                *ht_inc = FALSE;
            }
        }
    }

    return (pptr);
}

/* Print the tree sideways: right sub-tree first, one level per indent step */
void display(struct node *ptr, int level)
{
    int i;
    if (ptr != NULL)
    {
        display(ptr->rchild, level + 1);
        printf("\n");
        for (i = 0; i < level; i++)
            printf(" ");
        printf("%d", ptr->info);
        display(ptr->lchild, level + 1);
    }
}

void inorder(struct node *ptr)
{
    if (ptr != NULL)
    {
        inorder(ptr->lchild);
        printf("%d ", ptr->info);
        inorder(ptr->rchild);
    }
}

2.5 Check your progress

1. Describe the term Height Balance. How do trees maintain this property?

2. Give the properties of a Red-Black tree. What are the various operations performed on the Red-Black tree.

3. Draw AVL tree for the following node insertions : 12, 24, 16, 18, 21, 25, 27, 8, 15, 39

4. Write C++ code for deletion in an AVL tree.

Unit 3 Multi way Trees

Structure
3.0 Unit Objective
3.1 Multi-way Trees
3.1.1 Search in M-way Trees
3.2 B-Trees
3.2.1 Insertion into a B-Tree
3.2.2 Deletion from a B-Tree
3.3 B+-tree and its Algorithm
3.3.1 B+-Tree structure
3.3.2 Search operation of B+-Tree
3.3.3 Update on B+-Tree
3.3.4 Insertion on B+-Tree
3.3.5 Deletion on B+-Tree
3.3.6 Adding Records to a B+ Tree
3.3.7 Rotation
3.3.8 Deleting Keys from a B+ tree
3.4 Check your progress

3.0 Unit Objective

After going through this unit, you should be able to understand:

● Multi way tree definition
● Searching in an m-way tree
● Concept of B-tree
● Insertion and deletion in B-tree
● Concept of B+-tree
● Insertion and deletion in B+-tree
● Rotation in B+-tree

3.1 Multi-way Trees

An m-way search tree is a tree in which:

a. The nodes hold between 1 and m-1 distinct keys.
b. The keys in each node are sorted.
c. A node with k values has k+1 subtrees, where the subtrees may be empty.
d. The ith subtree of a node [v1, ..., vk], 0 < i < k, may hold only values v in the range vi < v < vi+1.

In fact, each node may contain up to M - 1 values. A node with k values must have k + 1 subtrees. In a node, values are stored in ascending order: V(1) < V(2) < ... < V(k). The subtrees are placed between adjacent values: each value has a left and a right subtree, and V(i)'s right subtree = V(i+1)'s left subtree. All the values in V(i)'s left subtree are < V(i). All the values in V(i)'s right subtree are > V(i).

3.1.1

Search in M-way Trees

Searching for X:
1. If X < V(1), recursively search in V(1)'s left subtree.
2. If X > V(k), recursively search in V(k)'s right subtree.
3. If X = V(i), for some i, X is found!
4. Else, for some i, V(i) < X < V(i+1); recursively search in the subtree between V(i) and V(i+1).
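The search rules above can be sketched in C as follows. This is only an illustration under stated assumptions: the node layout `struct mnode`, the constant `M` and the function name `msearch` are hypothetical and do not come from the text.

```c
#include <assert.h>
#include <stddef.h>

#define M 4   /* assumed order: at most M-1 values and M subtrees per node */

/* Hypothetical node layout for an m-way search tree. */
struct mnode {
    int count;                /* k: number of values in use, 1..M-1        */
    int value[M - 1];         /* V(1) < ... < V(k) kept in value[0..k-1]   */
    struct mnode *child[M];   /* child[i] lies between value[i-1], value[i] */
};

/* Returns the node that holds x, or NULL if x is not in the tree. */
struct mnode *msearch(struct mnode *node, int x)
{
    int i = 0;

    if (node == NULL)
        return NULL;                      /* empty subtree: x is absent */
    assert(node->count >= 1 && node->count <= M - 1);

    /* Skip every value smaller than x. */
    while (i < node->count && x > node->value[i])
        i++;

    if (i < node->count && x == node->value[i])
        return node;                      /* rule 3: X = V(i) */

    /* Rules 1, 2 and 4: descend into the subtree that brackets x. */
    return msearch(node->child[i], x);
}
```

Because the values in a node are sorted, one left-to-right scan finds either the matching value or the single subtree that can contain X, so only one path from the root is ever followed.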

Example: search for 68 in the tree below.

[Figure: example m-way search tree]

3.2

B-Trees

A B-tree is an M-way search tree such that:
1. It is perfectly balanced: every leaf is at the same depth.
2. Every node, except perhaps the root, is at least half full (has >= (M-1)/2 values).

This tree is not a B-tree:

[Figure: an M-way search tree that is not a B-tree]

This one is, and contains the same values:

[Figure: a B-tree containing the same values]

3.2.1

Insertion into a B-Tree

To insert X:
1. Use the search procedure to find the leaf node where X should be added.
2. Add X to this node at the appropriate place among the values.
3. If there are <= M-1 values, we are done! Otherwise, the node has overflowed.

To repair an overflowed node, split it into 3 parts:
Left: first (M-1)/2 values
Middle: value at position 1+(M-1)/2
Right: last (M-1)/2 values

Left and Right have just enough values: make them into nodes. They become the left and right children of Middle, which we add in the appropriate place in this node's parent. If the parent overflows, we repeat the procedure. If the root overflows, we create a new root with Middle as its only value and Left and Right as its children.

Insert 17

Insert 6. Split: Left=[2,3], Middle=5, Right=[6,7]

Insert 21. Split: Left=[17,21], Middle=22, Right=[44,45].
The parent overflows. Split: Left=[5,10], Middle=22, Right=[50,67].

The tree-insertion algorithms we have previously seen add new nodes at the bottom of the tree, and then have to worry about whether they have created an imbalance. The B-tree insertion algorithm is just the opposite: it adds nodes at the top. All nodes become 1 level deeper: the tree remains balanced.
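The split step of the insertion procedure can be sketched as below. This is a minimal illustration, assuming M is odd so that an overflowed node holds exactly M values; the function name `split_overflow` is hypothetical, and linking Left and Right back into the parent is left out.

```c
#include <assert.h>
#include <string.h>

/* Splits an overflowed node holding m sorted values (m odd) into
   Left, Middle and Right. left and right must each have room for
   (m-1)/2 values. */
void split_overflow(const int *values, int m,
                    int *left, int *middle, int *right)
{
    int half = (m - 1) / 2;               /* size of Left and of Right */

    assert(m >= 3 && m % 2 == 1);         /* this sketch assumes odd M */

    memcpy(left, values, half * sizeof(int));
    *middle = values[half];               /* value at position 1+(M-1)/2 */
    memcpy(right, values + half + 1, half * sizeof(int));
}
```

With M = 5 and the overflowed node [2,3,5,6,7], this gives Left=[2,3], Middle=5 and Right=[6,7], the same split as in the "Insert 6" example.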

3.2.2

Deletion from a B-Tree

Recall: in a BST, if the value to be deleted does not occur in a leaf, we replace it with the largest value from its left subtree, and then delete that value from the left subtree. We proceed similarly in a B-tree. Furthermore, the largest value in a left subtree is guaranteed to be in a leaf node.

To delete X from a leaf node:
1. Remove X from the current node. There are no subtrees to worry about.
2. If the node still has >= (M-1)/2 values, we are done! Otherwise, the node has underflowed and needs repair.

How do we repair a non-root node? Consider deleting 6 from:

[Figure: B-tree before deleting 6]

The leaf node now contains just 7. Repair strategy: try to borrow values from a neighbouring node. We join together the current node and a neighbour to form a combined node. Don't forget to include the value between these 2 adjacent subtrees. We choose to join [7] with [17,22,44,45], which gives [7,10,17,22,44,45]. The combined node is made up of:
the parent, which contributes 1 value;
the node that underflowed, which contributes (M-1)/2 - 1 values;
the neighbour, which contributes between (M-1)/2 and (M-1) values.

We distinguish 2 cases, depending on whether the neighbour contributes exactly (M-1)/2 values or more.

Case 1: the neighbour contributes > (M-1)/2 values. The combined node contains > 1 + ((M-1)/2 - 1) + (M-1)/2 values, i.e. > M - 1. It is too big! Split the combined node into Left, Middle and Right. Since there were >= M values, removing Middle leaves >= M - 1, and splitting those in 2 leaves >= (M-1)/2 on each side. Therefore, Left and Right have enough values to become nodes. Replace the value we borrowed from the parent with Middle, using Left and Right as its 2 children. Since the parent's size doesn't change, we are done!

Case 2: the neighbour contributes exactly (M-1)/2 values. The combined node contains 1 + ((M-1)/2 - 1) + (M-1)/2 = M-1 values. It can become a valid node. Simply erase the borrowed value and the neighbour from the parent, and replace the node that underflowed with the new combined node.

Delete 3 from:

[Figure: B-tree before deleting 3]

Result:

[Figure: B-tree after deleting 3]

Note: the parent has 1 fewer value. It might underflow! The repair strategy may have to be applied repeatedly at successive levels.

If the root underflows, it must have originally contained just 1 value, which has now been removed:
if the root was a leaf, the tree is now empty;
otherwise, the value was consumed by case 2, and the resulting combined node can be used as the new root.

Root underflows:

[Figure: example of a root underflow]
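The choice between case 1 and case 2 can be sketched as below. This is an illustration only: the names `combine`, `REPAIR_SPLIT` and `REPAIR_MERGE` are hypothetical, the neighbour is assumed to be the right-hand sibling, and updating the parent is left out.

```c
#include <assert.h>

enum repair { REPAIR_SPLIT, REPAIR_MERGE };   /* case 1, case 2 */

/* Builds combined = node + separator + neighbour and reports which
   repair case applies for a B-tree of order m. combined must have
   room for n_node + 1 + n_neigh values. */
enum repair combine(const int *node, int n_node, int separator,
                    const int *neighbour, int n_neigh,
                    int m, int *combined, int *n_combined)
{
    int i, n = 0;
    int min = (m - 1) / 2;                 /* minimum values per node */

    assert(n_node == min - 1);             /* the node has underflowed */
    assert(n_neigh >= min && n_neigh <= m - 1);

    for (i = 0; i < n_node; i++)
        combined[n++] = node[i];
    combined[n++] = separator;             /* value borrowed from the parent */
    for (i = 0; i < n_neigh; i++)
        combined[n++] = neighbour[i];
    *n_combined = n;

    /* More than M-1 values: too big, split again (case 1).
       Exactly M-1 values: keep as one node (case 2). */
    return (n > m - 1) ? REPAIR_SPLIT : REPAIR_MERGE;
}
```

For M = 5, joining [7] with the parent value 10 and the neighbour [17,22,44,45] produces six values, so case 1 applies; had the neighbour held only two values, the four combined values would form a valid node under case 2.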

3.3

B+-tree and its Algorithm

The B-tree structure is the standard organization for indexes in a database system. There are several variations of the B-tree, the best known being the B*-tree and the B+-tree. The B-tree guarantees at least 50% storage utilization; that is, at any given time, each of its nodes (other than the root) is at least 50% full. The B+-tree is a slightly different data structure which, in addition to indexed access, also allows sequential data processing and stores all data in the lowest level of the tree.

3.3.1

B+-Tree structure

The structure of a B+-tree is illustrated in Figure 1. Each node in a B+-tree of order p, except for a special node called the root, has one parent node and at most p child nodes. Figure 2 is a B+-tree of order 4. A node that has no child node is called a leaf node; a non-leaf node is called an internal node.

Each internal node in a B+-tree contains at most p-1 search values and p tree pointers in the order <P(1), K(1), P(2), K(2), ..., P(q-1), K(q-1), P(q)>, where q <= p. For all search field values X in the subtree pointed to by P(i), we have K(i-1) <= X < K(i) for 1 < i < q, X < K(i) for i = 1, and K(i-1) <= X for i = q. Each internal node, except the root node, has at least ⌈p/2⌉ pointers. The root node has at least two tree pointers if it is an internal node.

Each leaf node is of the form <<K(1), Pr(1)>, <K(2), Pr(2)>, ..., <K(q-1), Pr(q-1)>, P(next)>, where q <= p. Each Pr(i) is a data pointer which points to the record or data object whose search field value is K(i), and P(next) points to the next leaf node. Each leaf node is at least half full, and all leaf nodes are at the same level.

A node is at level l if there are l internal nodes (excluding the node itself) between the root and the node. The height of a B+-tree is the maximum level of nodes in the tree plus one. For example, the root and the leaf nodes in Figure 2 are at levels 0 and 1, respectively. The height of the tree is 2.

The B+-tree has three basic operations, searching for a key value in the tree, inserting a new key value in the tree, and deleting an old key value from the tree. The algorithms for these three operations will be discussed in the following sections.

3.3.2

Search operation of B+-Tree

The algorithm for search finds the leaf node in which a given data entry belongs. Suppose we want to search for a record with a search-key value of k. The search process begins at the root node, looking for the smallest search-key value greater than k. Assume that this search-key value is K(i). The search process will follow pointer P(i) to another node. If k < K(1), then the search follows P(1) to another node. If the node has m pointers and k >= K(m-1), then the search follows P(m) to another node. Once again, the search process will look for the smallest search-key value greater than k, and follow the corresponding pointer. Eventually, the search process will reach a leaf node, at which point the pointer directs it to the desired data record. Thus, in processing a search, a path is traversed in the tree from the root to a leaf node.
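The descent from the root to a leaf can be sketched as below. The node layout `struct bpnode`, the constant `P` and the function names are assumptions made for this illustration; data pointers and the link between leaves are left out.

```c
#include <assert.h>
#include <stddef.h>

#define P 4   /* assumed order: at most P pointers and P-1 keys per node */

/* Hypothetical node layout; a leaf sets is_leaf and leaves child[] unused. */
struct bpnode {
    int is_leaf;
    int nkeys;                  /* number of keys in use                  */
    int key[P - 1];             /* sorted keys in key[0..nkeys-1]         */
    struct bpnode *child[P];    /* tree pointers in child[0..nkeys]       */
};

/* Returns the leaf in which search-key k belongs. */
struct bpnode *find_leaf(struct bpnode *node, int k)
{
    int i;

    assert(node != NULL);
    while (!node->is_leaf) {
        /* Stop at the smallest key greater than k and follow the pointer
           to its left; if there is no such key, i == nkeys and the last
           pointer is followed. */
        i = 0;
        while (i < node->nkeys && k >= node->key[i])
            i++;
        node = node->child[i];
    }
    return node;
}

/* Returns 1 if k is stored in a leaf of the tree, 0 otherwise. */
int bp_contains(struct bpnode *root, int k)
{
    struct bpnode *leaf = find_leaf(root, k);
    int i;

    for (i = 0; i < leaf->nkeys; i++)
        if (leaf->key[i] == k)
            return 1;
    return 0;
}
```

Every search ends in a leaf, even when the key also appears in an index node, because in a B+-tree the records are reachable only from the lowest level.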

3.3.3

Update on B+-Tree

Insertion and deletion are more complicated than the search process, since it may be necessary to split a node that becomes too large as the result of an insertion, or to combine nodes if a node becomes too small (fewer than ⌈p/2⌉ pointers).

3.3.4

Insertion on B+-Tree

The algorithm for insertion takes an entry, finds the leaf node where it belongs, and inserts it there.

The basic idea behind the algorithm is that we recursively insert the new key value by calling the insert algorithm on the appropriate child node. Usually, this procedure results in going down to the leaf node where the new search key value belongs, placing the new search key value there, and returning all the way back to the root node. Occasionally a node is full and it must be split. When the node is split, a new key value and tree pointer pointing to the node created by the split must be inserted into its parent node. If the old root node is split, a new root node must be created and the height of the tree increases by one.

3.3.5

Deletion on B+-Tree

The algorithm for deletion takes a key value, finds the leaf node it belongs to, and deletes the key value from the node.

The basic idea behind the algorithm is that we recursively delete the key value by calling the delete algorithm on the appropriate child node. We usually go down to the leaf node where the key value belongs, remove it from there, and return all the way back to the root node. Occasionally a node is at minimum occupancy before the deletion and the deletion causes it to go below the occupancy threshold. When this happens, we must either redistribute key values from an adjacent sibling node or merge the node with a sibling node to maintain minimum occupancy. If key values are redistributed between two nodes, their parent node must be updated to reflect this: the key value in the parent node that points to the second node must be changed to the lowest search key value in the second node. If two nodes are merged, their parent node must be updated to reflect this change by deleting the key value pointing to the second node. If the last key value of the root node is deleted in this manner because one of its child nodes is deleted, the height of the tree decreases by one.

3.3.6

Adding Records to a B+ Tree

The key value determines a record's placement in a B+ tree. The leaf pages are maintained in sequential order AND a doubly linked list (not shown) connects each leaf page with its sibling page(s). This doubly linked list speeds data movement as the pages grow and contract. We must consider three scenarios when we add a record to a B+ tree. Each scenario causes a different action in the insert algorithm. The scenarios are:

The insert algorithm for B+ Trees

Leaf Page Full | Index Page Full | Action
---------------|-----------------|-------
NO             | NO              | Place the record in sorted position in the appropriate leaf page.
YES            | NO              | 1. Split the leaf page. 2. Place the middle key in the index page in sorted order. 3. The left leaf page contains records with keys below the middle key. 4. The right leaf page contains records with keys equal to or greater than the middle key.
YES            | YES             | 1. Split the leaf page. 2. Records with keys < middle key go to the left leaf page. 3. Records with keys >= middle key go to the right leaf page. 4. Split the index page. 5. Keys < middle key go to the left index page. 6. Keys > middle key go to the right index page. 7. The middle key goes to the next (higher level) index. If the next level index page is full, continue splitting the index pages.

Illustrations of the insert algorithm

The following examples illustrate each of the insert scenarios. We begin with the simplest scenario: inserting a record into a leaf page that is not full. Since only the leaf node containing 25 and 30 contains expansion room, we're going to insert a record with a key value of 28 into the B+ tree. The following figure shows the result of this addition.

Add Record with Key 28

Adding a record when the leaf page is full but the index page is not

Next, we're going to insert a record with a key value of 70 into our B+ tree. This record should go in the leaf page containing 50, 55, 60, and 65. Unfortunately this page is full. This means that we must split the page as follows:

Left Leaf Page | Right Leaf Page
---------------|----------------
50 55          | 60 65 70

The middle key of 60 is placed in the index page between 50 and 75. The following table shows the B+ tree after the addition of 70.

Add Record with Key 70
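The leaf split used in this example can be sketched as below. It is an illustration under stated assumptions: the capacity `LEAF_CAP` of 4 records is taken from the pages in the example, while the function name `split_leaf` and the use of plain integer keys in place of records are hypothetical.

```c
#include <assert.h>

#define LEAF_CAP 4   /* assumed leaf capacity, as in the 4-record pages above */

/* Inserts key into a full, sorted leaf and splits the result.
   left and right receive the two new pages (room for LEAF_CAP+1 keys
   each is enough). Returns the middle key, which is copied into the
   index page and also remains the first key of the right page. */
int split_leaf(const int *leaf, int key,
               int *left, int *n_left, int *right, int *n_right)
{
    int all[LEAF_CAP + 1];
    int i, j = 0, placed = 0, mid;

    /* Merge the new key into its sorted position. */
    for (i = 0; i < LEAF_CAP; i++) {
        if (!placed && key < leaf[i]) {
            all[j++] = key;
            placed = 1;
        }
        all[j++] = leaf[i];
    }
    if (!placed)
        all[j++] = key;
    assert(j == LEAF_CAP + 1);

    mid = (LEAF_CAP + 1) / 2;            /* index of the middle key */
    *n_left = mid;                       /* keys below the middle key */
    *n_right = LEAF_CAP + 1 - mid;       /* keys >= the middle key    */
    for (i = 0; i < mid; i++)
        left[i] = all[i];
    for (i = mid; i < LEAF_CAP + 1; i++)
        right[i - mid] = all[i];
    return all[mid];
}
```

Calling it with the page [50,55,60,65] and the key 70 yields the left page [50,55], the right page [60,65,70] and the middle key 60, as in the table above.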

Adding a record when both the leaf page and the index page are full

As our last example, we're going to add a record containing a key value of 95 to our B+ tree. This record belongs in the page containing 75, 80, 85, and 90. Since this page is full we split it into two pages:

Left Leaf Page | Right Leaf Page
---------------|----------------
75 80          | 85 90 95

The middle key, 85, rises to the index page. Unfortunately, the index page is also full, so we split the index page:

Left Index Page | Right Index Page | New Index Page
----------------|------------------|---------------
25 50           | 75 85            | 60

The following table illustrates the addition of the record containing 95 to the B+ tree.

Add Record with Key 95
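The index-page split differs from the leaf split in one respect: the middle key moves up and is kept in neither half. A minimal sketch follows, assuming an index capacity `INDEX_CAP` of 4 keys as in the example; the function name `split_index` is hypothetical and the child pointers that travel with the keys are left out.

```c
#include <assert.h>

#define INDEX_CAP 4   /* assumed index-page capacity in keys */

/* Inserts key into a full, sorted index page and splits the result.
   Returns the middle key, which goes to the next (higher level) index
   page and is stored in neither left nor right. */
int split_index(const int *page, int key,
                int *left, int *n_left, int *right, int *n_right)
{
    int all[INDEX_CAP + 1];
    int i, j = 0, placed = 0, mid;

    /* Merge the new key into its sorted position. */
    for (i = 0; i < INDEX_CAP; i++) {
        if (!placed && key < page[i]) {
            all[j++] = key;
            placed = 1;
        }
        all[j++] = page[i];
    }
    if (!placed)
        all[j++] = key;
    assert(j == INDEX_CAP + 1);

    mid = (INDEX_CAP + 1) / 2;           /* index of the middle key */
    *n_left = mid;                       /* keys < middle key */
    *n_right = INDEX_CAP - mid;          /* keys > middle key */
    for (i = 0; i < mid; i++)
        left[i] = all[i];
    for (i = mid + 1; i < INDEX_CAP + 1; i++)
        right[i - mid - 1] = all[i];
    return all[mid];
}
```

With the full index page [25,50,60,75] and the rising key 85, the left index page becomes [25,50], the right index page becomes [75,85], and 60 is returned for the new index page above them.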

3.3.7

Rotation

B+ trees can incorporate rotation to reduce the number of page splits. A rotation occurs when a leaf page is full, but one of its sibling pages is not full. Rather than splitting the leaf page, we move a record to its sibling, adjusting the indices as necessary. Typically, the left sibling is checked first (if it exists) and then the right sibling.


As an example, consider the B+ tree before the addition of the record containing a key of 70. As previously stated, this record belongs in the leaf node containing 50 55 60 65. Notice that this node is full, but its left sibling is not.

Add Record with Key 28

Using rotation we shift the record with the lowest key to its sibling. Since this key appeared in the index page we also modify the index page. The new B+ tree appears in the following table.

Illustration of Rotation
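The rotation just described can be sketched as below. The function name `rotate_left_and_insert` and its parameters are hypothetical, keys stand in for records, and the sketch assumes the left sibling exists and has room.

```c
#include <assert.h>

/* Moves the lowest key of a full leaf into its left sibling, then
   inserts new_key into the freed slot. sibling must have room for cap
   keys. Returns the lowest key now in the leaf, which is the value the
   index page must hold for this leaf after the rotation. */
int rotate_left_and_insert(int *sibling, int *n_sibling,
                           int *leaf, int n_leaf, int cap, int new_key)
{
    int i;

    assert(n_leaf == cap);           /* the leaf is full               */
    assert(*n_sibling < cap);        /* the left sibling has room      */
    assert(new_key > leaf[0]);       /* new_key belongs in this leaf   */

    /* The lowest key of the leaf becomes the highest key of the sibling. */
    sibling[(*n_sibling)++] = leaf[0];

    /* Shift smaller keys one slot down, then place new_key in order. */
    i = 0;
    while (i < n_leaf - 1 && leaf[i + 1] < new_key) {
        leaf[i] = leaf[i + 1];
        i++;
    }
    leaf[i] = new_key;
    return leaf[0];
}
```

Starting from the sibling [25,28,30] and the full leaf [50,55,60,65], adding 70 moves 50 into the sibling and leaves [55,60,65,70], so the index key 50 is replaced by 55 and no page is split.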

3.3.8

Deleting Keys from a B+ tree

We must consider three scenarios when we delete a record from a B+ tree. Each scenario causes a different action in the delete algorithm. The scenarios are:

The delete algorithm for B+ Trees

Leaf Page Below Fill Factor | Index Page Below Fill Factor | Action
----------------------------|------------------------------|-------
NO                          | NO                           | Delete the record from the leaf page. Arrange keys in ascending order to fill the void. If the key of the deleted record appears in the index page, use the next key to replace it.
YES                         | NO                           | Combine the leaf page and its sibling. Change the index page to reflect the change.
YES                         | YES                          | 1. Combine the leaf page and its sibling. 2. Adjust the index page to reflect the change. 3. Combine the index page with its sibling. Continue combining index pages until you reach a page with the correct fill factor or you reach the root page.

As our example, we consider the B+ tree after we added 95 as a key. As a refresher this tree is printed in the following table.

Add Record with Key 95

Delete 70 from the B+ Tree

We begin by deleting the record with key 70 from the B+ tree. This record is in a leaf page containing 60, 65 and 70. This page will contain 2 records after the deletion. Since our fill factor is 50% (2 records), we simply delete 70 from the leaf node. The following table shows the B+ tree after the deletion.

Delete Record with Key 70

Delete 25 from the B+ tree

Next, we delete the record containing 25 from the B+ tree. This record is found in the leaf node containing 25, 28, and 30. The fill factor will be 50% after the deletion; however, 25 appears in the index page. Thus, when we delete 25 we must replace it with 28 in the index page.

The following table shows the B+ tree after this deletion.

Delete Record with Key 25

Delete 60 from the B+ tree

As our last example, we're going to delete 60 from the B+ tree. This deletion is interesting for several reasons:
1. The leaf page containing 60 (60 65) will be below the fill factor after the deletion. Thus, we must combine leaf pages.
2. With recombined pages, the index page will be reduced by one key. Hence, it will also fall below the fill factor. Thus, we must combine index pages.
3. Sixty appears as the only key in the root index page. Obviously, it will be removed with the deletion.

The following table shows the B+ tree after the deletion of 60. Notice that the tree contains a single index page.

Delete Record with Key 60
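The leaf-level part of the three deletions above can be sketched as below. The helper names `leaf_delete`, `index_replace` and `below_fill` are hypothetical, keys stand in for records, and combining pages after an underflow is not shown.

```c
#include <assert.h>

/* Removes key from a sorted leaf page of n keys and closes the gap.
   Returns the new key count, or -1 if the key is not in the page. */
int leaf_delete(int *leaf, int n, int key)
{
    int i, pos = -1;

    assert(n >= 1);
    for (i = 0; i < n; i++) {
        if (leaf[i] == key) {
            pos = i;
            break;
        }
    }
    if (pos < 0)
        return -1;
    for (i = pos; i < n - 1; i++)
        leaf[i] = leaf[i + 1];        /* keep the keys in ascending order */
    return n - 1;
}

/* If key appears in the index page, replaces it with next_key.
   Returns 1 when a replacement was made, 0 otherwise. */
int index_replace(int *index, int n, int key, int next_key)
{
    int i;

    for (i = 0; i < n; i++) {
        if (index[i] == key) {
            index[i] = next_key;
            return 1;
        }
    }
    return 0;
}

/* Returns 1 if a page holding n keys is below the minimum fill. */
int below_fill(int n, int min_keys)
{
    return n < min_keys;
}
```

With a minimum of 2 records per leaf, deleting 70 from [60,65,70] leaves a legal page; deleting 25 from [25,28,30] is also legal but requires replacing 25 with 28 in the index page; deleting 60 from [60,65] leaves one record, which is below the fill factor and triggers the combining steps.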

3.4

Check your progress

1. What is a multi-way tree? What are its basic properties?

2. What is a B-tree? What are the rules for insertion in a B-tree?

3. What is a B+-tree? What are the rules for deletion in a B+-tree?

4. In the last example, show the deletion of nodes 15 and 50.


-------------------------------------------THE END--------------------------------------------------


Published by

Institute of Management & Technical Studies
Address: E41, Sector 3, Noida (U.P)
www.imtsinstitute.com | Contact: +91-9210989898


Turn static files into dynamic content formats.

Create a flipbook
Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.