Data Structures and Algorithms

Contents

DATA STRUCTURES AND ALGORITHMS II Chapter 1: Introduction
    Motivation: 4 Reasons to Learn about Algorithms
    Overview of Course

Brief Revision of Pointers and Dynamic Data
    Introduction
    Pointers
    Pointers and Arrays
    Copying Pointer Variables
    Pointers, Functions and Call by Reference
    Arrays and Functions
    Dynamic Data
    Problems and Pitfalls

Object Oriented Programming
    Introduction
        Motivation: Programming Paradigms
        Object Oriented Programming and ADTs
        Key Concepts and Terminology
    Object Oriented Programming in C++: Basics
        Further Reading
    Further Important C++ OOP Concepts
        Derived classes
        Inheritance and Accessibility
        Constructors
        Destructors
        Inline functions
    Objects, Pointers and Virtual Functions
        Virtual functions
    Exploiting Inheritance when Defining ADTs
        A Named Stack
        Constructors, Destructors and Dynamic Data
        Queues and Stacks derived from Lists
        Programming Style
        Some final points
    Object Oriented Design
        Methodology
        Some Examples
            A Drawing Tool
            File and Printing Systems
            Window and User Interface Systems
        Object Browsers
    Conclusion
        Advantages and Disadvantages of OOP
        Summary

Graph Algorithms
    Introduction
        Motivation
        Terminology and Definitions
        A Graph ADT
        Implementing a Graph ADT
            Adjacency Matrix
            Edge Lists
            Which to use?
    Graph Search Algorithms
        Breadth first and depth first search
        Tree search
        Graph search
        Returning the path
        Example application
    Weighted Graphs and Shortest Path Algorithms
    Topological Sort
        Motivation
        The Algorithm
    Graphs ADTs and OOP
    Summary: Graph Algorithms

String Processing Algorithms
    Introduction
    A String ADT
        Comparing Representations
    String Searching
        Motivation
        A Naive Algorithm
        The Knuth-Morris-Pratt Algorithm
        The Boyer-Moore Algorithm
        Tradeoffs and Issues
        Conclusion
        Exercise
        Further Reading
    Pattern Matching
        Motivation
        Representing Patterns
        A Simple Pattern Matcher
        Further Reading
    Parsing
        Motivation
        Context Free Grammars
        Simple Parsing Methods
            A Top-Down Parser
            A Bottom-Up Parser
        Further Reading
    File Compression
        Motivation
        Run-Length Encoding
        Variable-Length Encoding
        Substitutional Compressors
        JPEG and MPEG
        Summary
        Further Reading
    Cryptography
        Motivation
        A Basic Outline and Terminology
        Simple Symmetric Methods
        Asymmetric Systems: Public Key Cryptosystems
        Summary
        Exercises
        Further Reading

Geometric Algorithms
    Motivation
    Representing Points, Lines and Polygons
    Line Intersection
    Inclusion in a Polygon
    Finding the Convex Hull
    Range Searching

General Methods for Developing Algorithms
    Brute Strength Method
    Divide and Conquer
    Greedy Algorithms
    Dynamic Programming
    Genetic Algorithms




DATA STRUCTURES AND ALGORITHMS II Chapter 1: Introduction

These notes accompany Data Structures and Algorithms II. The course, to a large extent, follows on from Data Structures and Algorithms I. However, while DS&A I focused on fundamental data structures, DS&A II will focus on practical algorithms, applicable to a wide range of tasks. The approach will be somewhat less formal, with a little more focus on applications.

There is a huge range of reasonable textbooks on data structures and algorithms. The main topics covered in this module are: string processing algorithms; graph data structures and algorithms; and object oriented programming. Most books on C++ will probably have enough material on the relevant object oriented programming concepts (Friedman and Koffman give a reasonable introduction in appendix E). Most books on data structures cover graphs and graph algorithms adequately. However, relatively few introductory books cover string processing algorithms. One book that does is Sedgewick's ``Algorithms in C++'', published by Addison-Wesley. This is a fairly comprehensive text on algorithms, going beyond this module, and at a slightly more advanced level. However, it will remain a good reference book for you to use in the future, so is probably worth buying.

A book which covers a lot of material from DS&A I, and about 2/3 of the material in this module, is ``Data Abstraction and Structures using C++'' by M.R. Headington & D.D. Riley, pub. D.C. Heath and Company, 1994, ISBN 0-669-29220-6. This might be particularly useful for those who feel that they are unsure of term 1's material, and will want a clear reference.

• Motivation: 4 Reasons to Learn about Algorithms
• Overview of Course



Motivation: 4 Reasons to Learn about Algorithms

Why should you want to know about algorithms? There are a number of possible reasons:

• To avoid re-inventing the wheel. As you might expect, for many programming problems someone has already developed a good algorithm to solve that problem. For many of these algorithms, people have formally analysed their properties, so you can be confident in their correctness and efficiency. For example, we know that merge sort and quick sort both work correctly, and have average case complexity O(n log n), so we can straightforwardly use either algorithm in our programs. We might be able to choose further between these two algorithms, depending on properties of our data (is it already almost sorted?).

• To help when developing your own algorithms. Many of the principles of data abstraction and algorithm design, illustrated by some of the algorithms discussed here, are important in all programming problems. Not all tasks have ``off-the-shelf'' algorithms available, so you will sometimes have to develop your own. Merge sort illustrates a widely applicable technique: split the problem in two, solve each separately, then combine the results. A knowledge of well known algorithms will provide a source of ideas that may be applied to new tasks.

• To help understand tools that use particular algorithms, so you can select the appropriate tool to use. For example, documentation of various compression programs will tell you that pack uses Huffman coding, compress uses LZW, and gzip uses the Lempel-Ziv algorithm. If you have at least some understanding of these algorithms you will know which is likely to be better, and how much reduction in file size you might expect for a given type of file (e.g., does the file involve a lot of repetition of common sequences of characters?).

• Because they're neat. Many surprisingly simple and elegant algorithms have been invented. If you have even a slightly mathematical disposition you should find them interesting in their own right.

For all these reasons it is useful to have a broad familiarity with a range of algorithms, and to know what they may be used for.



Overview of Course

This course will have two main sections. The first, shorter section will introduce inheritance and polymorphism in object oriented programming. This is material that you will need in the third year, and which makes the implementation of some data structures more elegant and more re-usable. However, this section of the course is fairly independent of the next. The second, longer section will look at different sorts of algorithms, and in particular string processing algorithms and graph algorithms.

There are various concepts from DS&A I which will be assumed. The basic idea of an abstract datatype is important. You also need to be familiar with pointers for some of the first section. You should also be able to implement and use a stack and a queue (with or without the help of a textbook).



Brief Revision of Pointers and Dynamic Data

As some people find pointers difficult, and they are an important concept in C++, here's a brief review. These notes are summarised from chapter 7 of the book ``Data Abstraction and Structures using C++'' by Headington and Riley, D.C. Heath and Company, 1994.

• Introduction
• Pointers
• Pointers and Arrays
• Copying Pointer Variables
• Pointers, Functions and Call by Reference
• Arrays and Functions
• Dynamic Data
• Problems and Pitfalls



Introduction

We can think of computer memory as just a contiguous block of cells:

    --------------------------------------------
    |  |  |  |  |  |  |  |  |  |  |  |  |  |  |
    --------------------------------------------

each cell has a value (or contents) and an address, e.g.:

     12  13  14  15 ...
    --------------------------------------------
    | 3 | 7 | 1 |   |   |   |   |   |   |   |
    --------------------------------------------

So the cell with address 12 contains value 3. (Of course most addresses would be rather larger than 12! But big numbers wouldn't fit in my diagram.) Cells can be named, using variable names. If variable x refers to cell 12 then:

    x = 5

results in:

     12  13  14  15 ...
    --------------------------------------------
    | 5 | 7 | 1 |   |   |   |   |   |   |   |
    --------------------------------------------

Well, actually, if a cell corresponds to 1 byte of memory, a variable might refer to a number of cells, starting at a particular memory address. For 2 byte (16 bit) integers the picture is more like:

     12      14      16 ...
    --------------------------------------------
    |   5   |   1   |       |       |       |
    --------------------------------------------

In languages like C++ we have to allocate memory for variables:

    int x;

will allocate a 2 byte space for x, and associate the address of the first byte with the variable name. If we don't allocate space correctly then we risk memory being overwritten and used for something else.
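A minimal sketch of these ideas (my example, using the same pre-standard <iostream.h> header as the programs later in these notes): every variable has both a value and an address, and the & operator lets us inspect the address.

    #include <iostream.h>

    int main()
    {
      int x = 5;                                 // the compiler allocates space for x
      cout << "value of x:   " << x << endl;
      cout << "address of x: " << &x << endl;    // some machine-dependent address
      return 0;
    }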



Pointers

A pointer is basically like an address. In the above example, if ptr is a pointer variable, ptr might contain the value 14. Pointers are declared as follows:

    int* ptr;

declares that ptr is a pointer to an integer (int *ptr; is also OK). It allocates space for a pointer but does NOT allocate space for the thing pointed to. Generally the syntax is:

    <type>* <identifier>

We can find out what a pointer ptr points to (i.e., the value in the cell whose address is in ptr) using the notation:

    *ptr

and set this value using, e.g.:

    *ptr = 6

From our last diagram, with ptr = 14, this would result in:

     12      14      16 ...
    --------------------------------------------
    |   5   |   6   |       |       |       |
    --------------------------------------------

We can modify what a pointer points to in various ways:

    ptr = &x;

modifies ptr so it points to the cell named by the ordinary integer variable x. In our example above this was memory cell 12, so the result would be that the value pointed to by ptr (*ptr) would now be 5. If two things point to the same cell, weird things can happen. If we now say:

    x = 2

then of course *ptr will be 2 too. But sometimes that's what we want. We'll (virtually) never want to refer to the `actual' address locations, so we'd never want to do:

    ptr = 16

We just don't want to know (or mess with) where in real memory things are being stored. Pointers are often represented using diagrams like the following, which avoid explicitly referring to a particular memory cell address:

     -----        -----
    | ptr | ---> |  5  |
     -----        -----

    (the pointer ptr points to a cell containing 5)


Pointers should always point to something, and `know' what sort of thing they point to (hence they are more than just an address). If we increment a pointer, this results in it pointing to the NEXT object in memory, not just the next memory address. So if it was pointing to a 2 byte integer, and we increment it, it points to the next 2 byte integer. If it was pointing to a 4 byte float, and we add one, it points to the next float.
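A short sketch of this (the array and names are invented for illustration): incrementing an int pointer moves it on by a whole int, whatever size an int is on the machine.

    int a[3] = {10, 20, 30};   // three ints, contiguous in memory
    int* p = a;                // p points to a[0]
    p++;                       // moves on sizeof(int) bytes, to a[1]
    cout << *p << endl;        // prints 20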



Pointers and Arrays

In C++, pointers and arrays are very closely related, and it's important to understand this.

    int vec[5];

essentially makes the variable vec a pointer to the first cell of a block of memory big enough to store 5 integers. Space for those 5 integers is reserved, so it doesn't get clobbered:

     12      14      16      18      20
    -----------------------------------------
    |   5   |   6   |       |       |       |
    -----------------------------------------
    <------------- RESERVED --------------->

In the above example vec = 12 (an address), but we wouldn't usually think about this. We could have declared vec in the almost equivalent way:

    int* vec;

Note that in principle we STILL can set and access the `elements' in the `array' in the same manner:

    vec[1] = 2;

etc. The difference is that we haven't reserved any space. This is a source of errors. If you declare strings or arrays in this manner, and fail to allocate space, your program won't work. You need to allocate space (as described in more detail below) with something like:

    vec = new int[5];

An exception to this is where you make your pointer point to something that already exists, for example a string constant. If "alison" appears explicitly in your code, C++ will allocate space for it. In the following code:

    char* str;
    str = "alison";

the result is that the pointer str ends up pointing to the same address as the (previously allocated) string constant "alison". This is fine. However, it might not be fine if you, say, read someone's name into str without allocating space. The following causes the program to crash:

    char* str;
    cin >> str;
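One safe way around this, as a hedged sketch (the 256-character buffer size is an arbitrary assumption), is to read into storage that definitely exists, then make an exact-size dynamic copy:

    #include <string.h>

    char buffer[256];                           // assumed big enough for one word
    cin >> buffer;                              // reads into reserved space
    char* str = new char[strlen(buffer) + 1];   // +1 for the terminating '\0'
    strcpy(str, buffer);                        // str now points at its own copy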



Copying Pointer Variables

What's the difference between:

    int* a;
    int* b;
    ....
    a = b;

and:

    int* a;
    int* b;
    ....
    *a = *b;

What do you think the effect of the following is:

    int vec[5];
    int* vptr;
    ....
    vptr = vec;
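For comparison, here is a commented sketch of what each fragment does (the initialisations are my additions, so the pointers point somewhere):

    int x = 1, y = 2;
    int* a = &x;
    int* b = &y;

    a = b;        // copies the pointer: a and b now both point to y; x is untouched

    a = &x;       // reset a for the second case
    *a = *b;      // copies the value: x becomes 2; a still points to x

    int vec[5];
    int* vptr;
    vptr = vec;   // vptr now points to vec[0]; vptr[i] and vec[i] are the same cells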



Pointers, Functions and Call by Reference

We can pass a pointer to a function:

    void myfn(int* ptr)
    {
      *ptr = 1;
    }

The result is that ``the int ptr points to'' is set to 1. If we call it with:

    int* p1;
    ......
    myfn(p1);

*p1 will now have the value 1. When myfn is called, ptr is a new pointer variable, a copy of p1, pointing to the same place as p1. So when *ptr is set, that sets the memory cell also pointed to by p1. We could also call the function with, say:

    int p1;
    ......
    myfn(&p1);

which calls myfn with a pointer to (the address of) p1. When myfn finishes, p1 will have the value 1.

Call by reference arguments are much the same, but avoid the need to `dereference' pointers in the body of the function or call, so can be less messy:

    void myfn(int& ptr)
    {
      ptr = 1;
    }
    ...
    int p1;
    myfn(p1);
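Putting the two styles side by side in one runnable sketch (the function names are invented for illustration):

    #include <iostream.h>

    void bypointer(int* p)   { *p = 1; }   // caller must pass an address explicitly
    void byreference(int& r) { r = 2; }    // the compiler passes the address for us

    int main()
    {
      int n = 0;
      bypointer(&n);        // n is now 1
      byreference(n);       // n is now 2; no & or * needed at the call site
      cout << n << endl;    // prints 2
      return 0;
    }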



Arrays and Functions

An array name is generally equivalent to a pointer to the first element, and arrays are passed to functions as pointers. When elements in the array are modified in the function, they are also modified in the calling procedure:

    void test(int c[])
    {
      c[0] = 1;
    }

    void main()
    {
      int a[5];
      test(a);
      cout << a[0];
    }

results in `1' being written out. The test function may be written entirely equivalently as:

    void test(int* c)
    {
      c[0] = 1;
    }

If we want a function to return an array, the pointer version is essential. The following illustrates this:

    int* test()
    {
      int* a;
      a = new int[2];   // space for the two elements we are about to set
      a[0] = 1;
      a[1] = 2;
      return a;
    }

    void main()
    {
      int* b;
      b = test();
      cout << b[0] << b[1];
    }

Note that space for the array (being created) is explicitly allocated within the function using new. We'll come on to that more in the next bit.



Dynamic Data

When variables are declared in the normal fashion, space in memory is assigned at compile time for those variables. If an array is declared, enough space for the whole array is assigned, so:

    int a[1000];

would use 2000 bytes (for 2 byte ints). Obviously this is wasteful if only a few of them are used in a particular run of the program, and problematic if more than 1000 elements are needed. So there's a case for being able to dynamically assign memory as a program runs, based on the particular needs for this run. This is possible using the `new' operator, which explicitly assigns memory:

    int* a;
    int n;
    ....
    cout << "How many entries are there going to be this time?";
    cin >> n;
    a = new int[n];

results in space for n integers being dynamically assigned as the program runs: just enough, and not too much, for that run. We could declare the variable at the same time:

    int* a = new int[n];

We can also dynamically de-allocate memory, to free it up when we're done:

    delete [] a;

(the [] form is the one that matches allocation with new[]), and

de-allocates all that space. This can be useful for strings:

    char* str = new char[n];

could be used to create space for a string of length n (remember that a string of n characters needs n+1 bytes, to hold the terminating '\0'), allowing us to deal with variable length strings efficiently.
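Putting the pieces together, a minimal sketch of the allocate/use/de-allocate cycle (the zeroing loop is just for illustration):

    int n;
    cout << "How many entries? ";
    cin >> n;
    int* a = new int[n];      // exactly enough for this run
    for (int i = 0; i < n; i++)
      a[i] = 0;               // ... use the array ...
    delete [] a;              // the [] form matches allocation with new[]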



Problems and Pitfalls

The problem with the programmer dynamically assigning and de-allocating memory is that they might get it wrong, and then horrible things might happen. Here are some of them:

Inaccessible objects are objects which no longer have a pointer pointing at them (and so are wasted space). This might happen, for example, if you create a new object pointed to by an old pointer:

    oldptr = new int;

A dangling pointer is a pointer pointing to something that has been de-allocated using delete. This might happen if you have two pointers p1 and p2 pointing to the same object, and (say) the object is deleted via p1:

    delete p1;

Now p2 points nowhere, or rather to some random bit of memory.
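Both pitfalls in one short commented sketch (the pointer names are invented for illustration):

    int* oldptr = new int;   // first object allocated
    oldptr = new int;        // first object is now inaccessible: wasted space

    int* p1 = new int;
    int* p2 = p1;            // two pointers to one object
    delete p1;               // the object is de-allocated...
    // ...so p2 is now dangling: using *p2 here would be an error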



Object Oriented Programming

• Introduction
  ♦ Motivation: Programming Paradigms
  ♦ Object Oriented Programming and ADTs
  ♦ Key Concepts and Terminology
• Object Oriented Programming in C++: Basics
  ♦ Further Reading
• Further Important C++ OOP Concepts
  ♦ Derived classes
  ♦ Inheritance and Accessibility
  ♦ Constructors
  ♦ Destructors
  ♦ Inline functions
• Objects, Pointers and Virtual Functions
  ♦ Virtual functions
• Exploiting Inheritance when Defining ADTs
  ♦ A Named Stack
  ♦ Constructors, Destructors and Dynamic Data
    ◊ Constructors
    ◊ Destructors
    ◊ Copy Constructors
  ♦ Queues and Stacks derived from Lists
  ♦ Programming Style
  ♦ Some final points
• Object Oriented Design
  ♦ Methodology
  ♦ Some Examples
    ◊ A Drawing Tool
    ◊ File and Printing Systems
    ◊ Window and User Interface Systems
  ♦ Object Browsers
• Conclusion
  ♦ Advantages and Disadvantages of OOP
  ♦ Summary



Introduction

In this first main section of the course we'll develop further notions of object oriented programming, building on term 1. So far you've used C++ classes as a convenient way to define abstract datatypes. In this section I'll re-introduce object oriented programming, and show how it extends the idea of abstract datatypes.

• Motivation: Programming Paradigms
• Object Oriented Programming and ADTs
• Key Concepts and Terminology


Motivation: Programming Paradigms

Object-oriented programming is one of a number of approaches to programming and programming languages:

• Imperative Programming (e.g., standard Pascal or C). A program is thought of as a sequence of instructions, normally organised in a structured manner using procedures and functions. Running a program involves going through the sequence of instructions.

• Functional Programming (e.g., ML, Lisp). A program is a set of function definitions. Running a program involves calling a function with some arguments to get a result.

• Logic Programming (e.g., Prolog). A program is a set of statements in some logic. Running a program involves giving the system a fact to prove.

• Object Oriented Programming. A program is seen as a set of objects which exchange information with each other. Running a program involves sending a `message' to an object.

It is important to be aware of these different approaches to programming. A given programming language typically supports more than one paradigm, and it is to some extent up to the programmer what style of programming they adopt, even in a given language. For different problems, different approaches are more natural, so there is no panacea. However, imperative and object oriented approaches are arguably the most important and most widely applicable. C++ supports both imperative and object-oriented programming. You can write C++ programs which don't exploit its object oriented features, but very often there will be a cleaner solution that uses objects.




Object Oriented Programming and ADTs

Object-oriented programming can be viewed as an extension of the idea of abstract datatypes. An abstract datatype involves a data structure with a set of defined operations on that data structure; the data structure is only accessed or modified through these operations. The main extension of this idea in object oriented programming is that a given datatype (say, t1) may be a subtype of another (say, t2), and so access the operations and data of that other type. If an operation (or data field) is defined for both t1 and t2, in different ways, the version for t1 overrides the version for t2, and is the one used. This simple feature introduces a great deal of flexibility, and more importantly makes it relatively easy to re-use code in different applications where minor variations of a basic approach are required. Libraries of generally useful datatypes (objects) are made available, but the programmer can easily define special customised versions with slightly different behaviour.


Key Concepts and Terminology

In object oriented programming, datatypes are referred to as classes, and variables of a given type are referred to as instances of the class, or just objects. Operations are referred to as methods, and we sometimes talk of sending a message to an object, rather than just calling a method. Subtypes are referred to as subclasses. A subclass relation exists between a class and its subclass, and the set of all these relations defines a class hierarchy (basically, a tree structure of classes). Family tree terminology is used to refer to relations between classes (e.g., parent class). Where the methods or data of a parent class are used by an object (say, the object is an instance of class C1, which is a subclass of class C2) we say that it inherits those methods or data.

The key concepts in object oriented programming are encapsulation and inheritance. Encapsulation has been met before for ADTs: the internals of the data representation should be hidden, and all communication with the object should be via the specified methods. Inheritance is where methods and data of parent (or ancestor) classes are used by instances of a given class.



Object Oriented Programming in C++: Basics

It is probably best to look at concrete examples of object oriented programming in C++ before going into further theory or concepts. We'll consider two kinds of examples. The first will involve graphical and user interface objects, the second will involve reworking some old list ADTs as objects.

Suppose we wanted to write a simple drawing program that could (among other things) display different shapes on the screen. The shapes to handle are circles and rectangles. They each have a position and size, though this will be defined differently for the different shapes: a circle will be defined by a centre point and radius, a rectangle by a centre point and its length and width. Shapes also have various characteristics such as colour. Useful methods for these objects include: Initialise; Move; SetColour; Draw. Initialise should set the position and size, while Move, SetColour and Draw should do the obvious. It should be clear that Initialise and Draw will be different for circles and rectangles, while SetColour and Move could be defined in a shape-independent way. (We'd also want methods to modify the size, but this won't be included in these tiny examples.)

The first thing to do in object oriented programming is decide on a class hierarchy, that is, a hierarchy of kinds of objects, from very general ones to very specific kinds. In this simple example we can have a general class of `shapes' and more specific classes for `circles' and `rectangles' (a realistic program would of course have many more such classes). A particular circle with specific size, colour etc. will be an instance of the circle class.

We can now decide which operations and datafields can be defined for the most general class (shape). The following gives the type definition, and an example method (assuming that the type ``ColourType'' is defined somewhere):

    class Shape {
    public:
      void Move(int x, int y);
      void SetColour(ColourType c);
    private:
      int xpos, ypos;
      ColourType colour;
    };

    void Shape::SetColour(ColourType c)
    {
      colour = c;
    }

Note that in the SetColour function `colour' is not treated as a global variable, but as the value of the colour field of the object whose colour is being set. We can also access these fields when calling further functions within the function (e.g., we could call a function MovePen(xpos, ypos) and the x and y position of the object in question would be accessed, without these needing to be specified as arguments to the function). Note too that the position and colour of the shape are `private', and can only be accessed or modified via the defined operations.

Now, a circle and a rectangle are both kinds of shape. To specify this we can use the notation `: Shape' after the class name. For reasons to be explained later, we need here to specify `: public Shape'. So for circle and rectangle we might have:

    class Circle : public Shape {
    public:
      Circle();
      void Draw();
    private:
      int rad;
    };

    class Rectangle : public Shape {
    public:
      Rectangle();
      void Draw();
    private:
      int width, length;
    };

Circle and Rectangle would inherit the operations and datafields from Shape, so it is not necessary to repeat them. I'll just assume that the relevant operations can be straightforwardly written. Note that the class `Shape' is not a class that you'd have instances of; it is just defined so that specific shapes like Circle and Rectangle can be defined much more simply. Such classes are referred to as abstract classes.

If we now had an object instance Ob of type Circle, and wanted to set its colour, we could call Ob.SetColour. It would use the SetColour function inherited from Shape. We can think of this as sending a message SetColour to the object Ob.

Now, suppose we wanted to deal with a specialised kind of rectangle: say, a rounded rectangle (one with rounded corners). This can use the standard rectangle's initialise method, but will need a special draw method:

    class RoundedRectangle : public Rectangle {
    public:
      void Draw();
    };

The new Draw procedure will override, or replace, the one that would be inherited from the standard rectangle object.
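As a hedged usage sketch (assuming the Draw methods have been written and that a ColourType value red exists):

    Circle c;
    c.SetColour(red);      // inherited unchanged from Shape
    c.Draw();              // Circle's own Draw

    RoundedRectangle r;
    r.SetColour(red);      // still inherited from Shape, two levels up
    r.Draw();              // the overriding Draw, not Rectangle's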

• Further Reading


Further Reading

Most books on data structures in C++, and general C++ books, cover object oriented programming. See for example Friedman & Koffman appendix E, or Headington & Riley chapter 11.



Further Important C++ OOP Concepts

In this section we'll introduce a number of further concepts, via a simple example elaborating the shapes example above. First, here's a small illustrative program, mostly just based on the concepts already discussed (we should really have the class info in separate shape.h/shape.C files, but this'll work):

    #include <iostream.h>

    enum colour {red, green, blue};

    class shape {
    public:
      shape() {col = blue;}
      shape(colour c) {col = c;}
      void setcol(colour c) {col = c;}
      void display();
      colour getcol() {return col;}
    private:
      colour col;
    };

    class square : public shape {
    public:
      square() {size = 0;}
      square(colour c, int s) : shape(c) {size = s;}
      void display();
      void setsize(int s) {size = s;}
    private:
      int size;
    };

    void shape::display()
    {
      cout << "Shape of: " << endl;
      cout << "colour: " << getcol() << endl << endl;
    }

    void square::display()
    {
      cout << "Square of: " << endl;
      cout << "colour: " << getcol() << endl;
      cout << "size: " << size << endl << endl;
    }

    void main()
    {
      square s1(red, 3);
      square s2;
      s2.setsize(6);
      s1.display();
      s2.display();
    }

When this program is run we get:

    Square of: 
    colour: 0
    size: 3

    Square of: 
    colour: 2
    size: 6



(Enumerated types are written out as integers.) There are many new concepts illustrated in this example, which I'll go through one by one. You should also read through Friedman & Koffman appendix E, or any other introductory chapter to OOP in C++.

• Derived classes
• Inheritance and Accessibility
• Constructors
• Destructors
• Inline functions


Derived classes:

A more specific class such as `square' is referred to as a derived class, and the more general class it is derived from is referred to as the base class (e.g., `shape'). If a class declaration begins with:

    class DerivedClass : BaseClass {

this means that BaseClass is a private base class of DerivedClass, and the derived class can't invoke the base class's member functions. But usually we DO want derived classes to be able to access the base class's functions, so what we want is:

    class DerivedClass : public BaseClass {


Inheritance and Accessibility:

Although a derived class inherits the private data members of the base class, it can only access and modify them through the public member functions of the base class. That's why we used getcol() rather than just col in the square::display() function.
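A minimal sketch of the rule, using the classes above:

    void square::display()
    {
      // cout << col;       // error: col is private to shape, even in a derived class
      cout << getcol();     // fine: getcol() is a public member of shape
    }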


Constructors:

The square and shape classes both have two alternative constructors. The one without arguments is referred to as the default constructor. We'll consider that first. When an instance of a derived class is created, the base class constructor is called as well as the derived class constructor (before it). Therefore the derived class constructor need only include the additional initialisation code, beyond that used by the base class constructor. That's why the `square' default constructor only needed to initialise size, but not colour.

If the base class constructor requires parameters, these must be passed by the derived class in a special way. The alternative constructor for square takes parameters c (colour) and s (size), but the shape constructor just took a parameter c (colour). Again, the base class (shape) constructor will be called when an instance of the derived class (square) is created. So we need to `tell' the system what arguments to use for the constructor of the base class. This is done as follows:

    square::square(colour c, int s) : shape(c)

When s1 is created, both its size and colour are set correctly, the colour via the call to shape(red).
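The base-before-derived ordering can be seen directly in a tiny sketch (the class names are invented for illustration):

    #include <iostream.h>

    class base {
    public:
      base()    { cout << "base constructed" << endl; }
    };

    class derived : public base {
    public:
      derived() { cout << "derived constructed" << endl; }
    };

    int main()
    {
      derived d;   // prints "base constructed", then "derived constructed"
      return 0;
    }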


Destructors

None of the above examples defines a ``destructor'' for a class. While constructor functions are invoked whenever an object is created (e.g., using ``new'', or by declaring a variable of the given type), destructor functions are invoked whenever an object is destroyed (using delete, or when a function exits and its local variables go out of scope). Destructors are only essential when data fields involve pointers (dynamic data): then we must have a destructor if we want to delete the pointed-to data when an object is deleted (this will be illustrated in later examples). In the above example we don't have any such data, so we don't need to define a destructor.


Inline functions:

Note that the code for the simple constructor, setcol, getcol etc. functions is given immediately after the function declaration in the class. This is both a more concise method for small functions, and also results in the functions being defined in a special way: whenever there is a call to the function within the main program, the compiler will effectively replace that call with a copy of the entire function. That reduces the overheads at run time that are normally associated with function calls. It is sensible to define most one or two line functions as inline functions in this way.



Objects, Pointers and Virtual Functions

Now, let us suppose we want to start with some (rather more sophisticated) set of classes for various shapes, and develop a drawing package based on the shape objects. We want the user to be able to select from a menu of shapes, then position the selected shape, and adjust its size and colour. We want the details of the displayed shapes to be stored in an array, so we could in principle redisplay, print, or generally manipulate them. How do we declare a suitable array? We could easily enough declare:

    square s[20];

to give us an array s which can contain 20 squares. All 20 squares will be initialised (size=0, col=blue, from the default constructors), so if we try:

    s[5].display();

an appropriate description of a square will be printed out. However, if we wanted to initialise the values by invoking the constructors with extra arguments (discussed in the last section) you'd have to explicitly do things like:

    s[5] = square(green, 4);

rather than just including arguments in the array declaration somehow. That would be OK, but another problem is that for the outlined application you won't know in advance whether the shapes you want are squares, or something else (circles, rectangles...). So an array of squares isn't much good. We kind of want an array of items that can be shapes, circles or rectangles, so how about:

    shape s[20];

But now, memory is allocated for shapes such that there is room to hold the colour of each shape, but no room for anything else. (When an instance is created, space is allocated for its private data items.) So there's no room for, say, the size, if the shape required is a square. We want a more dynamic way of assigning things, so that when we know what the user wants, the appropriate memory can be assigned for any possible shape and its required data items. You've guessed it: we use pointers. The following creates an array of 20 pointers to shapes (no shape objects exist yet):

    shape* s[20];

Then we can tell the system that we actually want a square for the first item:

    s[0] = new square;

This is allowed because ``a pointer to a derived class object may be converted, without a cast operation, to a pointer to an object of its base class''. Which means in this case that we can create a square, and assign it to a pointer to a shape. Because memory is being dynamically assigned at run time, and we don't have to statically declare an object to be an instance of a particular derived class, we could have a flexible program involving statements like:

    if (userwants == 's')
      s[n] = new square;
    else if (userwants == 'c')
      s[n] = new circle;

• Virtual functions


Virtual functions

This is all very well, but look what happens if we run the following program:

    shape* s[20];
    s[1] = new square;
    s[1]->display();

results in:

    Shape of: 
    colour: 2

It's using the wrong `display' function! That's because, unless we tell it otherwise, C++ decides which version of `overloaded' functions to use at compile time, and all it knows at compile time is that s[1] is a pointer to a shape. So it compiles the call such that `display' means shape::display. Fortunately, all is not lost, as we can tell it to make such decisions at run time. To do this we declare `display' a virtual function, by prefixing it with the word virtual in the class definitions, e.g.:

    virtual void display();

This fixes everything. Now when we run the above it gives:

    Square of: 
    colour: 2
    size: 0

as required. It is tempting sometimes to stick ``virtual'' in front of every function in the class definition. However, the more that is decided at run time, the slower your program will run. So you should only use virtual functions in cases where some confusion may arise (such as when a function may be overridden in a derived class).
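The whole story in one minimal runnable sketch:

    #include <iostream.h>

    class shape {
    public:
      virtual void display() { cout << "shape" << endl; }
    };

    class square : public shape {
    public:
      void display() { cout << "square" << endl; }   // overrides shape::display
    };

    int main()
    {
      square sq;
      shape* p = &sq;    // base class pointer to a derived class object
      p->display();      // prints "square": the choice is made at run time
      return 0;
    }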



Exploiting Inheritance when Defining ADTs

How can we exploit inheritance when defining abstract datatypes like the stacks and queues met last term? In this section I'll give two examples, and discuss a few further general points.

• A Named Stack
• Constructors, Destructors and Dynamic Data
  ♦ Constructors
  ♦ Destructors
  ♦ Copy Constructors
• Queues and Stacks derived from Lists
• Programming Style
• Some final points


A Named Stack

Suppose we have defined and implemented (as a class) a stack ADT, with operations IsEmpty, IsFull, Push, and Pop. We'll assume it has a defined constructor and destructor. Now, for some reason, we need a `named stack' ADT, which also has a name or label associated with it. How would we create this? Obviously all the above operations are still relevant; we just need an additional datafield to store the name, and a `getname' operation to access the name. We might also want to modify the constructor so that it can take a name as argument, e.g.:

    namedstack s("a useful stack");

We want the name to be of arbitrary length, so the type of the datafield for the name should be a pointer to chars (not a fixed length array of chars). So the basic class definition might be:

    class namedstack : public stack {
    public:
      namedstack(char* name);   // constructor
      char* getname();
      ~namedstack();            // destructor
    private:
      char* stkname;
    };

Note that as `stack' is a public base class of namedstack, all public operations of stack are available as public operations of namedstack. So you can push/pop things in a named stack! If stack had been a private base class, the public operations of stack would only be available as private operations in namedstack.



The new constructor now needs defining. One possibility is the following:

    namedstack::namedstack(char* name)
    {
      stkname = new char[strlen(name) + 1];
      strcpy(stkname, name);
    }

Note two things about this implementation. First, we make stkname a copy of the name argument passed to the constructor (rather than just pointing at the same object). This is a good idea in case the argument passed to it goes out of scope, gets destroyed, or gets re-used for something else, e.g.:

   cin >> name;
   namedstackptr1 = new namedstack(name);
   cin >> name;
   namedstackptr2 = new namedstack(name);

[What would happen here if we didn't use strcpy in the namedstack constructor, but rather just set stkname = name?] Second, we dynamically assign memory here, as required, rather than using fixed length arrays. This is a good idea as it allows strings of arbitrary length without wastage of space. To get the right size of memory required, we use the `strlen' function from the string.h library. This basic pattern, of finding the length of an existing string using strlen, assigning memory using new, and using strcpy to copy the string into the new chunk of memory, is very commonly used in C/C++. Now, the `getname' function is fairly trivial:

   char* namedstack::getname()
   {
      return stkname;
   }

But a little more work needs doing on the destructor. Remember that destructors are needed to de-allocate dynamically allocated memory. We've used just such memory for stkname. We can rely on the general `stack' destructor to do whatever is required in de-allocating the main stack memory, but we need, in addition, to de-allocate the memory assigned for the name. This is just:

   namedstack::~namedstack()
   {
      delete [] stkname;   // array form of delete, as stkname was allocated with new[]
   }

Note that nowhere did we have to know how stack was implemented. It could be that stack is implemented using an array, or using a linked list. This is an essential feature of object oriented programming. We are able to derive new specialised classes without knowing how the base class is implemented, just what operations are provided for the base class.





Constructors, Destructors and Dynamic Data

In the above example we defined constructors and destructors. These are generally required when dealing with dynamic data (ie, data allocated with new). Constructors are also useful when some initialisation is required. The namedstack constructor both allocated memory and did some initialisation; the destructor just de-allocated memory. When NOT using dynamic data, constructors/destructors are not strictly required. However, it is probably good style to always include them, and just make them empty when nothing is required. In this section we'll look a little more closely at constructors, destructors and dynamic data.

• Constructors
• Destructors
• Copy Constructors

Constructors

Whenever the data fields of an object include pointers we must normally have a constructor to allocate space for the pointed-to data (unless the object's datafields will always point to pre-existing objects). This was illustrated in the last example:

   namedstack::namedstack(char* name)
   {
      stkname = new char[strlen(name) + 1];
      strcpy(stkname, name);
   }

Linked list examples from last term should also illustrate the use of constructors for objects based on dynamic data.

Destructors

Suppose you have defined a class MyType, and, say, xptr is a pointer to an object of type MyType, and memory has been allocated for an object of type MyType:



   MyType* xptr = new MyType;

Assume that objects of MyType have two datafields, str and s: str is a pointer to char; s is an integer (maybe representing the size of the string).

   int s;
   char* str;

We automatically allocate space for s and str by invoking new, but NOT for what str points to (just for the pointer itself). Space for what str points to will only be allocated if that is done explicitly in the constructor. Let's further suppose the constructor results in space for a large block of chars being assigned:

   MyType::MyType(int n)
   {
      str = new char[n];
      s = n;
   }

A bit later we've finished with the object and want to get rid of it. So we call: delete xptr;

This results in the space allocated for str and s being de-allocated. But the space we allocated that str points to is not de-allocated: only the immediate contents of the private datafields. We end up with a large block of memory that is not pointed to by anything. To get rid of the unwanted block of chars we need a destructor:

   MyType::~MyType()
   {
      delete [] str;   // array delete, since str was allocated with new[]
   }

If we have that, then when we call delete xptr the block allocated for the string will be removed too. Now, you might think that this is not too vital if we rarely want to explicitly delete things of type MyType, but suppose we have a function which has an object of MyType as a local variable:

   void myfn()
   {
      MyType st;
      ....
   }

When st is created, memory is allocated. When the function exits, we want it thrown away (as it was a local variable). But UNLESS we have defined a destructor, the memory pointed to by the datafields won't be de-allocated! In the above example, the destructor was quite simple. For linked lists it will be a touch more complex. For a linked-list based stack, something like:

   mylist::~mylist()
   {
      while(! empty())
         pop();
   }

(assuming pop de-allocates memory).




Copy Constructors

When defining classes where the datafields include pointers, and memory is allocated dynamically, you really need to also define a copy constructor. This is a function that specifies how the fields should be copied. If you don't specify this, then whenever an object of the class is copied (e.g., when it is used as a call-by-value argument) it may not be copied correctly: datafields may point to the original version of some data, rather than a copy. This may cause weird errors. When an object is copied, space is allocated for the contents of each datafield, so in the copy we'd have space for str and s. But no new space would be allocated for what str points to, and indeed that is not copied. This is called ``shallow'' copying, as only the top level is copied; the copied pointers in the datafields will still point to the exact same bit of memory. This is probably not what we want, and to avoid it we need to define a `copy constructor' that does a `deep' copy. Here's an example:

   MyType::MyType(const MyType& another)
   {
      int i;
      s = another.s;
      str = new char[s];
      for(i = 0; i < s; i++)   // copy the s characters one by one
         str[i] = another.str[i];
   }

This may be particularly important if we are going to use arguments of MyType as call-by-value arguments to functions:

   void myfn(MyType x)
   {
      // some stuff
   }

(This is legal, but much more often you'd be dealing with pointers to objects of type MyType, or use call by reference.) Anyway, if you WERE using call-by-value then copies of the object would be created when the function was called. If you want call by value to behave as you would expect (not resulting in the argument being modified when the function exits) then you'd have to define a copy constructor.
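Here is a minimal sketch showing the copy constructor doing its job. It pulls together the constructor, destructor and copy constructor from above; making the datafields public and leaving the myfn body empty are simplifications for illustration only.

   class MyType {
   public:
      int s;
      char* str;
      MyType(int n) { str = new char[n]; s = n; }
      ~MyType() { delete [] str; }
      MyType(const MyType& another) {          // deep copy
         s = another.s;
         str = new char[s];
         for(int i = 0; i < s; i++) str[i] = another.str[i];
      }
   };

   void myfn(MyType x) { /* x is an independent deep copy */ }

   int main() {
      MyType a(10);
      myfn(a);      // copy constructor runs here; a's buffer is untouched
      return 0;     // each object's destructor frees its own buffer exactly once
   }

Without the copy constructor, the copy x inside myfn would share a's buffer, and x's destructor would free it on exit, leaving a pointing at de-allocated memory.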





Queues and Stacks derived from Lists

As another example, suppose we already have a list ADT (called mylist) with the following operations (the deletion operation is called remove here, since delete is a reserved word in C++):

   int isempty();
   int isfull();
   void reset();              // Resets list `cursor' to be at front of list.
   void advance();            // Advance list cursor.
   int currentitem();         // Return item at list cursor.
   int atend();               // True if list cursor is at end of list.
   void insertbefore(int x);  // Insert x before list cursor.
   void insertafter(int x);   // Insert x after list cursor.
   void remove();             // Delete item at list cursor.
                              // List cursor now points to next item.

How would we define a stack ADT which exploits the list ADT already developed and debugged? This should be fairly straightforward:

   class stack : mylist {
   public:
      stack();
      ~stack();
      void push(int x);
      int pop();
      int isempty();
      int isfull();
   private:
      // no extra private data members
   };

Stack is defined as a derived class of mylist; mylist is given as a private base class. This means that its public operations won't be public in stack, but they will be private! Which means that we can use them to implement pop etc.:

   stack::stack()
   {
      // Nothing extra to do
   }

   stack::~stack()
   {
      // Nothing extra to do
   }

   void stack::push(int x)
   {
      mylist::insertbefore(x);
   }

   int stack::pop()
   {
      int temp;
      temp = mylist::currentitem();
      mylist::remove();
      return temp;
   }

   int stack::isempty()
   {
      return mylist::isempty();
   }

   int stack::isfull()
   {
      return mylist::isfull();
   }

Very little extra work was required. We certainly didn't have to mess with linked lists, and indeed didn't have to know how mylist was implemented (it could be using arrays). As we explicitly DON'T want most of the operations of mylist to be available in stack, we make mylist a private base class. However, we CAN use its public methods in the function definitions above. Isempty and isfull just call the methods of the base class (effectively making them public). A queue can be implemented very similarly; it's just mildly more fiddly to write the function to add items to the queue, but still possible with the methods provided by the list class (see the sketch below).
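Here is a sketch of that queue. It assumes (this is not guaranteed by the informal mylist specification above) that when the cursor has been advanced past the last item, insertbefore appends after the last item; under a different cursor convention the enqueue code would need adjusting.

   class queue : mylist {          // mylist is again a private base class
   public:
      queue() { }
      ~queue() { }
      void enqueue(int x) {
         mylist::reset();
         while(! mylist::atend())  // walk the cursor to the back of the list
            mylist::advance();
         mylist::insertbefore(x);  // ie, append after the current last item
      }
      int dequeue() {
         mylist::reset();          // the front of the list is the head of the queue
         int temp = mylist::currentitem();
         mylist::remove();
         return temp;
      }
      int isempty() { return mylist::isempty(); }
      int isfull()  { return mylist::isfull(); }
   };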


Programming Style

Most of the examples given so far have been rather minimal, omitting comments, documentation, or correct use of specification and implementation files! You might get a program working this way, but it won't be much use to someone else who wants to use your classes. So..

• Always use separate specification and implementation files. However, with OOP (or indeed, whenever writing software modules to be used by others) we run the risk that two modules both try to load the same .h file. The result is that you get compile errors. A solution is to enclose the definitions in #ifndef/#endif brackets, e.g.,

   #ifndef MYCLASSH
   #define MYCLASSH
   class myclass {
      ....
   };
   #endif

The #ifndef statement prevents all the stuff up to #endif being read in if MYCLASSH (some arbitrary name) has already been #defined. The second line #defines MYCLASSH, so the next time the file is read in, the #ifndef test fails and everything up to #endif is skipped. This method is illustrated in Friedman & Koffman.



• Comment your code! Especially the .h files, as others may want to examine them to determine how to use your class library. Friedman & Koffman illustrates a fairly minimal style of commenting. Other books adopt more complex conventions on how to comment class definitions.
• Use `const' declarations where appropriate. In my examples I've been sloppy both in using `const' for arguments (that shouldn't have their values changed) and in defining const member functions. Constant member functions are functions that don't change any of the datafields of the class. A sketch is given below.
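As a sketch, here is how the namedstack declaration from earlier might look with `const' used properly; the only changes from the earlier version are the two const annotations.

   class namedstack : public stack {
   public:
      namedstack(const char* name);  // promises not to alter the caller's string
      char* getname() const;         // const member function: reads datafields, never writes them
      ~namedstack();
   private:
      char* stkname;
   };

A const member function can be called on a const namedstack object; a non-const member function cannot.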


Some final points

To illustrate some final points, consider the following problem. You have a database of shapes, managed via an array of pointers to shape. Assume for now the code in section 3.3 defines the operations on shapes. You now want to set a given shape's size:

   shape* s[20];
   s[0] = new square;
   s[0]->setsize(3);

But this will cause an error (the compiler would complain that there is no member function setsize for class shape). How could we get round this? You could:

• Declare setsize in the definition of shape, as an empty virtual function.. but this could result in a very large `shape' definition, containing all of square's and circle's methods as well as those which logically belong to it.
• Avoid using setsize in the main code, but instead use more general methods, such as (maybe) ``setvalues''. This could be defined as a virtual method in shape, and the individual methods in square and circle could invoke shape-specific methods like ``setsize''.. but sometimes that's not what you want.
• Use explicit typecasting. Although you can assign an object of a derived class to a pointer to a base class (e.g., square to pointer to shape), it is sometimes useful/necessary to do explicit typecasts. For example, in our `shapes' example, we can explicitly `cast' the pointer s[0] to a pointer to square:

   shape* s[20];
   s[0] = new square;
   ((square*) s[0])->setsize(3);

However, you have to somehow know that s[0] contains a square before doing this, which introduces more problems; and it may also result in over-complex code.







Object Oriented Design

In this final section I'll make some very brief comments about object oriented software design. This is covered in depth in the third year. I'll consider the basic methodology, tools available to support this (object browsers), and give a few short case studies.

• Methodology
• Some Examples
  ♦ A Drawing Tool
  ♦ File and Printing Systems
  ♦ Window and User Interface Systems
• Object Browsers


Methodology

The development of a large software system for a customer involves three main stages: determining requirements, design, and implementation. The traditional methodology is structured (top down) design, which emphasises control flow issues, and decomposition based on functions; data is just something you pass between software modules. Object oriented design changes the focus to start with looking at the objects or entities needed to solve the problem. (The emphasis is on the data, and what you do with it.) You identify the key objects, then the operations required on those objects. It may be hard to work out what the objects and operations should be. If you have a specification of what is required, often the necessary objects will correspond to nouns in the specification, and operations will correspond to verbs. For example, in the `vehicle database' problem, the nouns `vehicle', `car', `lorry' etc. are all objects. A common first stage is to sketch an `object table' which lists objects and the operations required (e.g., vehicle: getenginesize). The second stage may be to identify relationships between objects. We've already looked at `isa' relationships (e.g., car isa vehicle) and inheritance. But in general there may be other relationships between objects. Any datafield in an object may contain another object (or more likely a pointer to one), so we can capture any relationships. Perhaps we have a `tyre' class that captures data about tyres; a datafield `TyreType' might contain an object of this class. This can be viewed as a kind of `has-a' relationship (carA has-a TyreB), a very common kind of relationship between objects in OOP (see the sketch below). The third stage is designing the `driver': the top level code that glues all the objects together and achieves the overall design purpose. A driver may do little more than process user commands and delegate tasks to the objects.
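A minimal sketch of the `isa' and `has-a' relationships just described, using the vehicle example; the datafields here are invented for illustration.

   class tyre {              // a class capturing data about tyres
   public:
      int diameter;
      int pressure;
   };

   class vehicle {
   public:
      int getenginesize() { return enginesize; }   // from the object table
   protected:
      int enginesize;
   };

   class car : public vehicle {   // `isa': car isa vehicle
   public:
      tyre* TyreType;             // `has-a': carA has-a TyreB
   };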



As with any software design process, the three stages above need not always be sequential, and an iterative process may well be involved (ie, redoing earlier decisions once later design decisions are made). The final stages should be implementation and testing of the design. Hopefully these should be straightforward once a good design has been developed. To implement the design, obviously specific datafields/datatypes must be selected to implement each object's data, and specific algorithms developed to implement the operations. The required datatypes may include ADTs such as stack/queue/tree that are not built in, but which are likely to be available in a library.


Some Examples

The following are simply some examples of the sort of software problem to which OOP methods apply naturally. You should consider how the design methodology outlined above could be applied.

• A Drawing Tool
• File and Printing Systems
• Window and User Interface Systems


A Drawing Tool

Suppose you want to develop a drawing tool, allowing different shapes to be selected from a menu, graphically positioned and sized, modified (changing their colour etc.), filled, and so on. This is a natural extension of the examples used in the text so far, so it should be clear how OO programming applies. A list of (pointers to) object instances could be maintained, corresponding to all the objects on the screen created by the user. As they select new objects and position them, this list would be extended. As the user chooses to modify an object, its datafields would be altered, and hence how it is drawn. Although for this application there is clearly a fair amount of work on the basic interface, it should be clear that an object oriented approach is reasonable for maintaining and manipulating the shapes themselves. A special-purpose design tool might make more specialised `shapes' available. For example, a tool for office layout might have built-in classes for desk, filing cabinet and so on. The user would be able to manipulate their attributes (size, colour) but would have available standard methods for displaying, moving etc. the objects.





File and Printing Systems

Almost all computer applications require facilities to save stuff to disk, and print things out (consider: spreadsheets, drawing packages, netscape, programming language compilers, ..). Getting this right in all its details is very hard, and might take someone six months to program from scratch. Yet most print/save facilities are pretty much the same, with just minor differences in the details. So the programmer will want to re-use code from existing applications (or special purpose libraries). Program re-use is what OOP is good at. The programmer can import classes that have all the basic print/save functionality built in (procedures to save files while making various checks to ensure other files are not overwritten, or data lost if the disk is full or the system crashes etc.). Very little additional code will have to be written (basically, code to convert the data format of the application into a form suitable for printing or writing to a disk). And if the application programmer doesn't like some of the functionality that the library provides (maybe they don't like having a dialogue box popping up asking if you REALLY want to print this file) then they can create subclasses of the library classes and override any procedures they don't like with new ones.


Window and User Interface Systems

Another thing required by almost all applications these days is a nice graphical user interface (GUI). This is regarded as so important that most professional programming environments provide some kind of interface toolkit. Anyway, an interface toolkit will normally provide a library of user interface objects, and some kind of palette/menu through which the user can select the interface objects they want and arrange them in a new window. The figure below gives the general idea. The left hand box contains a selection of user interface objects: some text, a clickable `radio button', a check box (with text), a scroll bar, and a button. The `programmer' can pick these up with their mouse, move them into the other window (right box), and change their attributes (e.g., the text that goes with them; what procedure to call when a button is pressed). As selecting some of the interface objects will result in code being run (e.g., a procedure called when a button is pressed), the interface can be easily linked in with the main program. Generally the interface objects themselves are defined as object classes. This provides advantages for the developer of the interface toolkit, as inheritance can be exploited both to simplify the programming and also (as an added extra) to encourage some uniformity between different user interface objects. Typically user interface objects such as buttons and check boxes (as illustrated above) are subclasses of the class of `dialog item'. All dialog items share the code concerning how they are positioned in a window, sized, have their colour set, and so on. A particular dialog item (e.g., button) will have additional data fields (e.g., the procedure to call) and special display/draw methods.



Dialog items themselves are often a subclass of a more general class, which might include (among its subclasses) the class of windows. A window, like buttons and things, can be positioned, has a particular text font, and so on. It also needs special code to allow windows to be opened and closed, to be brought to the front, iconified, etc. A particular toolkit might also provide special purpose subclasses for, say, a text window or a graphics window. Although these different user interface objects are defined as classes, using inheritance, a naive user of a given toolkit may only be vaguely aware of the class hierarchy, just selecting options from a menu/palette to create their application interface. However, a more sophisticated applications programmer would probably want to get under the surface a bit, and be able to create special purpose subclasses with slightly different behaviour to that provided in the toolkit/library. The underlying OO implementation of the interface objects should make that easy.


Object Browsers

One vital tool to support complex object oriented design and implementation is an object browser. An object browser will display graphically the hierarchy of objects in your compiled program (including any in included modules). Object browsers display objects in a hierarchy. Typically you can click on any of the objects in the hierarchy, and get information about that object. The information would normally include the declarations for all the object's methods and data fields, with an indication of where they were inherited from (and perhaps further info, such as whether they are declared virtual methods). If the source code is available, then by selecting one of the methods you should be taken directly to the place in the source code where that method is defined. An object browser allows you to rapidly move within the text files defining a large object oriented system. If, say, you're editing a procedure which calls the `draw' method, and want to check out/modify that method, you need to go straight to just that version of the draw method that would be used there. Let's say the procedure you're editing is class1::display. You'd find class1 in the graphical class hierarchy, select that, find its draw method, select that, and then be at the appropriate draw method. Simple search tools (that may allow you to search for the procedure draw's definition) are just not adequate: there may be many versions of draw, and class1 may inherit its version from any of its ancestors. Object browsers allow static browsing of the classes. However, many systems also provide more run-time debugging tools, showing the object instances created at run-time, their inherited methods, and their location in the hierarchy.




Conclusion

• Advantages and Disadvantages of OOP
• Summary


Advantages and Disadvantages of OOP

OOP has the following advantages over conventional approaches:

• Provides a clear modular structure for programs. This makes it good for defining abstract datatypes, where implementation details are hidden and the unit has a clearly defined interface.
• Provides a good framework for code libraries, where supplied software components can be easily adapted and modified by the programmer. This is particularly useful for developing graphical user interfaces.
• Makes it easy to maintain and modify existing code, as new objects can be created with small differences to existing ones.

However, there is a slight cost in terms of efficiency. As objects are normally referenced by pointers, with memory allocated dynamically, there is a small space overhead to store the pointers, and a small speed overhead to find space on the heap (at run-time) to store the objects. For dynamic methods there is an additional small time penalty, as the method to choose is found at run time, by searching through the object hierarchy (using the class precedence list for the object). These days, these efficiency penalties seem to be of relatively little importance compared with the software engineering benefits.


Summary

Object oriented programming focusses on defining a hierarchy of classes of objects, each with associated data fields and methods. Subclasses inherit data fields and methods from their parent and ancestor classes, but can override these with more specific versions. When one method can override another it is a good idea to define it as a virtual method, which means that the version chosen will be determined at run-time. Otherwise the wrong version might be used when, for example, one method is called from within another.



Objects are normally accessed via pointers, with memory allocated dynamically. Using this approach, a variable of a given (pointer) type can contain pointers to subclasses of the declared type. Explicit typecasting may be needed to allow this. Dynamic memory allocation also means that space can be allocated and de-allocated in a flexible way. (You could have memory allocated and de-allocated for an object by declaring the object variable locally within a procedure: when that procedure exited the memory would be de-allocated. But this is less flexible.) Constructors, destructors and copy-constructors are required to handle memory allocation in objects where the datafields are pointers with memory to be dynamically allocated. Object oriented programming is generally considered good for software re-use and maintenance, as objects with clearly defined interfaces can be used and modified with relatively little effort. It has been particularly widely used in creating user interface libraries and toolkits. Although there are some time penalties in OOP, these are becoming less significant as machine speed increases. However, navigating round a large object-oriented program requires the use of good object browsers; it is very difficult to follow a large OO program simply by inspecting the code.




Graph Algorithms

• Introduction
  ♦ Motivation
  ♦ Terminology and Definitions
  ♦ A Graph ADT
  ♦ Implementing a Graph ADT
    ◊ Adjacency Matrix
    ◊ Edge Lists
    ◊ Which to use?
• Graph Search Algorithms
  ♦ Breadth first and depth first search
  ♦ Tree search
  ♦ Graph search
  ♦ Returning the path
  ♦ Example application
• Weighted Graphs and Shortest Path Algorithms
• Topological Sort
  ♦ Motivation
  ♦ The Algorithm
• Graphs ADTs and OOP
• Summary: Graph Algorithms




Introduction

• Motivation
• Terminology and Definitions
• A Graph ADT
• Implementing a Graph ADT
  ♦ Adjacency Matrix
  ♦ Edge Lists
  ♦ Which to use?


Motivation

Many real-life problems can be formulated in terms of sets of objects and relationships or connections between objects. Examples include:

• Finding routes between cities: the objects could be towns, and the connections could be road/rail links.
• Deciding what first year courses to take: the objects are courses, and the relationships are prerequisite and corequisite relations. Similarly, planning a course: the objects are topics, and the relations are prerequisites between topics (you have to understand topic X before topic Y will make sense).
• Planning a project: the objects are tasks, and the relations are relationships between tasks.
• Finding out whether two points in an electrical circuit are connected: the objects are electrical components and the connections are wires.
• Deciding on a move in a game: the objects are board states, and the connections correspond to possible moves.

A graph is just a datastructure that consists of a set of vertices (or nodes), which can represent objects, and a set of edges linking vertices, which can represent relationships between the objects. A tree (met in DS&A 1) is a special kind of graph (with certain restrictions). Graph algorithms operate on a graph datastructure, and allow us to, for example, search a graph for a path between two given nodes; find the shortest path between two nodes; or order the vertices in the graph in a particular way. These very general algorithms can then be used to solve problems of the kind mentioned above, where the data is represented as a graph. For example, search algorithms can be used to find a possible winning move in a game; shortest path algorithms can be used to find the shortest route between two cities; ordering algorithms can be used to find a possible sequence of courses to take, given prerequisite relationships between them.





Terminology and Definitions

There is a lot of terminology associated with graphs that needs to be introduced. Figure 4.1 shows examples of two graphs, illustrating some of the main ideas:

Figure 4.1: Example Graphs

Graphically we can think of a graph as a collection of points, some of which are connected. These connections can be two way or one way (illustrated by arrows, for one way connections). Graphs with one way connections are referred to as directed. In an undirected graph, if there is a connection from vertex 1 to vertex 2, there is also one from vertex 2 to vertex 1. Directed graphs may be cyclic or acyclic depending on whether it is possible to get back to a vertex by following arrowed links. Two nodes are adjacent if an edge connects them, and a path is a sequence of adjacent vertices. Graphs may be connected or unconnected, as illustrated in the figure. Vertices may be labelled, as illustrated in the first example in the figure, where the labels are just integers. We often use family tree terminology for graphs. The children of a node n1 are those nodes such that there is an edge from n1 to the child node. (These are also sometimes called the neighbours of the node, particularly when talking about general graphs rather than trees.) The parent relation is the reverse. Ancestors, siblings and descendants are what you might expect. Formally, a graph G = (V, E), where V is a set of vertices and E is a set of edges, each edge being a pair of vertices from V. Definitions for paths, cycles, adjacency, connectivity etc. can be stated in terms of this formal description.


A Graph ADT

An abstract datatype for a graph should provide operations for constructing a graph (adding and removing edges and vertices) and for checking connections in a graph. If we allow labelled nodes then there will be additional operations to add, access and remove labels. Nodes can be indicated just by integers. Assuming that we have suitable type declarations for label and graph, the following is a reasonable minimal set of operations:

   Graph(int n)
   // Creates and initialises a graph of given size (no. of nodes)

   void AddEdge(int n1, int n2)
   // Adds edge from N1 to N2
   // Pre: N1 and N2 are nodes in graph

   void RemoveEdge(int n1, int n2)
   // Removes an edge
   // Pre: There is an edge in graph from N1 to N2

   int EdgeExists(int n1, int n2)
   // 1 if there is an edge in graph from N1 to N2, else 0

   void SetLabel(GraphLabel l, int n)
   // Adds a label L to node N
   // Pre: N is a node in Graph

   GraphLabel GetLabel(int n)
   // Returns the label of a node
   // Pre: N is a node of Graph

We might also want:

   owllist Neighbours(int n);
   // Returns a linear list containing all the neighbours of node N
   // Pre: N is a node in graph

(This assumes type definitions for GraphLabel and owllist; GraphLabel could be a fixed length string.) Other operations might be useful, but this provides a reasonable minimal set. A sketch of how these operations might be packaged as a class is given below.
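The GraphLabel definition below is an assumption (a small fixed-length string), and the private section is deliberately left unspecified: the whole point of the ADT is that callers don't depend on the representation, which is chosen in the next section.

   struct GraphLabel { char text[32]; };   // assumed: labels as short fixed-length strings

   class Graph {
   public:
      Graph(int n);                      // initialise a graph of n nodes
      void AddEdge(int n1, int n2);
      void RemoveEdge(int n1, int n2);
      int  EdgeExists(int n1, int n2);
      void SetLabel(GraphLabel l, int n);
      GraphLabel GetLabel(int n);
      // owllist Neighbours(int n);      // optional, if the owllist type is available
   private:
      // representation: see the next section
   };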


Implementing a Graph ADT

It turns out that there are two reasonable implementations for a graph ADT.

• Adjacency Matrix
• Edge Lists
• Which to use?


Adjacency Matrix

For a graph of N nodes, a simple representation is just an NxN matrix of boolean values, say G[1..Max,1..Max] of boolean. An edge between nodes n and m is indicated by a 'true' entry in the array G[n,m] (lack of an edge by 'false'). For a labelled graph, a further one-dimensional array can give the labels of each node. Finally, an integer can be used to represent the size (number of nodes) of a specific graph. The first graph in the figure would be represented as follows:

       1  2  3  4  5
   1   F  F  F  F  F
   2   T  F  F  F  F
   3   F  T  F  T  F
   4   T  F  F  F  F
   5   F  F  F  T  F

All the ADT operations specified above can be straightforwardly implemented using this representation (a sketch is given below). The details are left as an exercise; in fact, exercise 2.
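For instance, a sketch of the core operations over this representation might look as follows; the value of Max and the use of int for boolean are assumptions for illustration.

   const int Max = 20;

   class Graph {
   public:
      Graph(int n) {
         size = n;
         for(int i = 1; i <= n; i++)        // no edges initially
            for(int j = 1; j <= n; j++)
               edges[i][j] = 0;
      }
      void AddEdge(int n1, int n2)    { edges[n1][n2] = 1; }
      void RemoveEdge(int n1, int n2) { edges[n1][n2] = 0; }
      int  EdgeExists(int n1, int n2) { return edges[n1][n2]; }
   private:
      int edges[Max+1][Max+1];   // 1-based, following G[1..Max,1..Max] above
      int size;                  // number of nodes in this particular graph
   };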


Edge Lists

The above representation is very simple, but it is inefficient (in terms of space) for sparse graphs, ie, those without many edges compared with nodes (vertices). Such a graph would still need an NxN matrix, but it would be almost full of 'false's. So an alternative is to have associated with each node a list (or set) of all the nodes it is linked to via an edge. So, in this representation we have a one-dimensional array, where each element in the array is a list of nodes. For the same graph as the one illustrated above this would give:

   Node:        1      2    3   4      5
   Neighbours:  (2 4)  (3)  ()  (3 5)  ()

Now comes the question of how to represent the list of nodes. One way is to use linked lists. The graph would involve an array, where each element of that array is a linked list. Some of the graph operations are a little more tricky using this representation. However if you have a linked list ADT defined, you can just 'use' this, rather than redefining basic operations.
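A hand-rolled sketch of the idea, for illustration (the edgenode struct and Max are assumptions; as the text says, in practice you might prefer to re-use an existing linked list ADT rather than writing this directly):

   const int Max = 20;

   struct edgenode {
      int neighbour;    // the node this edge leads to
      edgenode* next;   // rest of this node's neighbour list
   };

   class Graph {
   public:
      Graph(int n) {
         size = n;
         for(int i = 1; i <= n; i++) edges[i] = 0;   // all lists empty
      }
      void AddEdge(int n1, int n2) {       // prepend n2 to n1's list
         edgenode* e = new edgenode;
         e->neighbour = n2;
         e->next = edges[n1];
         edges[n1] = e;
      }
      int EdgeExists(int n1, int n2) {     // walk n1's list looking for n2
         for(edgenode* e = edges[n1]; e != 0; e = e->next)
            if(e->neighbour == n2) return 1;
         return 0;
      }
   private:
      edgenode* edges[Max+1];   // one list head per node, 1-based
      int size;
   };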





Which to use?

As is so often the case, the best implementation for a graph depends on properties of the graphs we want to operate on, and on which operations are likely to be used most often. The adjacency matrix representation is not space efficient for sparse graphs, but certain operations such as EdgeExists are likely to be much more efficient using adjacency matrices (particularly for large dense graphs).




Graph Search Algorithms

A graph search (or graph traversal) algorithm is just an algorithm to systematically go through all the nodes in a graph, often with the goal of finding a particular node, or one with a given property. Searching a linear structure such as a list is easy: you can just start at the beginning, and work through to the end. Searching a graph is obviously more complex.

• Breadth first and depth first search
• Tree search
• Graph search
• Returning the path
• Example application


Breadth first and depth first search

There are two main ways to traverse a graph: depth first and breadth first. If we start at a particular node (say, n1), then in breadth first search, all nodes path length M away from n1 are searched before all nodes path length M+1 away. In depth first search, if a node n2 is searched, all nodes connected to n2 are searched before any other nodes. We can also describe it in terms of family tree terminology: in depth first, a node's descendants are searched before its (unvisited) siblings; in breadth first, the siblings are searched before its descendants.


Tree search

For trees (which are, as we said, just a specialised kind of graph, where each node has only one 'parent' and cycles aren't allowed) this distinction is clear graphically. As we see in figure 4.2, in breadth first we search across the tree before we search down. In depth first we follow a path down, before we search across.

Figure 4.2: Breadth first and depth first search



For trees, there is a very simple algorithm for depth first search, which uses a stack of nodes. The following assumes we are searching for some target node, and will quit when it is found:

   stack.push(startnode);      // assumes stack initially empty
   do {
      currentnode = stack.pop();
      for {each neighbour n of currentnode}
         stack.push(n);
   } while(! stack.empty() && currentnode != target);

Pseudocode has been used to describe putting all neighbours of the node on the stack; you'd actually need to either use list traversal of the neighbours, or check through all possible nodes and add them if they are a neighbour. We can see how this will work for the tree in the figure above. (The trace below shows the stack on entering the body of the loop each time, and the current node after it has been popped off the stack.)

   Stack       Current node   Neighbours
   (1)         1              2,3,4
   (2 3 4)     2              5,6
   (5 6 3 4)   5              none
   (6 3 4)     6              none
   (3 4)       3              7
   (7 4)       7              none
   (4)         4              8,9
   (8 9)       8              none
   (9)         9              none

Note that the order of current nodes is correct for depth first. The algorithm for breadth first is exactly the same BUT we use a queue rather than a stack of nodes: put the neighbours on the BACK of the queue, but remove the current node from the front. I'll leave it as an exercise for the reader to verify that this leads to the right search. Depth first search also has a simple recursive version of the algorithm, but using a stack makes it more explicit what is going on.
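To make the pseudocode concrete, here is a small compilable C++ version of the stack-based depth first tree search, using the example tree from the trace. The fixed-size stack and the adjacency matrix are assumptions made purely to keep the sketch self-contained; neighbours are pushed in reverse numerical order so that low-numbered nodes are popped first, matching the trace.

   #include <stdio.h>

   const int MAXNODES = 10;

   struct IntStack {
      int data[MAXNODES * MAXNODES];
      int top;                              // number of items currently held
      IntStack() { top = 0; }
      int empty() { return top == 0; }
      void push(int x) { data[top++] = x; }
      int pop() { return data[--top]; }
   };

   int main() {
      // Adjacency matrix for the example tree: 1->{2,3,4}, 2->{5,6}, 3->{7}, 4->{8,9}
      int g[MAXNODES][MAXNODES] = {0};
      g[1][2] = g[1][3] = g[1][4] = 1;
      g[2][5] = g[2][6] = 1;
      g[3][7] = 1;
      g[4][8] = g[4][9] = 1;

      int target = 7, currentnode;
      IntStack s;
      s.push(1);                                 // the start node
      do {
         currentnode = s.pop();
         printf("visiting %d\n", currentnode);
         for(int n = MAXNODES - 1; n >= 1; n--)  // push neighbours, reversed
            if(g[currentnode][n]) s.push(n);
      } while(!s.empty() && currentnode != target);
      return 0;
   }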


Graph search

If we are searching a general graph rather than a tree it is necessary to keep track of which nodes have already been searched, as they might be met again. If we don't do this, then if there are cycles in the graph the loop might never terminate. Even if there are no cycles, redundant work is done in re-visiting old nodes. Avoiding re-visits to previously visited nodes leads to the following modified algorithm, which keeps track of nodes visited (using an array visited[], which would be initialised appropriately). Again, I'll leave it as an exercise for the reader to work through how it works for some example graphs, and how it is modified for breadth first. Note that any of these algorithms will only result in the nodes in a connected region of the graph being traversed. To traverse the whole of a not fully connected graph you would have to try different starting nodes.



   stack.push(startnode);
   do {
      currentnode = stack.pop();
      if(! visited[currentnode]) {
         visited[currentnode] = 1;
         for {each neighbour n of currentnode}
            if( !visited[n])
               stack.push(n);
      }
   } while(! stack.empty() && currentnode != target);


Returning the path

So far, the algorithm(s) will process (visit) all the nodes in a graph, or, with a minor modification, check whether a node satisfying a given property can be found in the graph from a given starting point. However, what is often of more use is if the path from a start node to a target node can be returned. The graph is searched for that target node, and when it is reached, the loop exits and the path is returned. It turns out that there is a simple trick that can be used to find the path. Whenever we push a node onto the stack, we also make a record of that node's parent (ie, the current node). As we only push nodes on the stack if they haven't already been visited, this will result in a single parent being recorded for each node (so we can just hold the parent record in a 1-d node array). Once the target has been found, we can find its parent, its parent's parent, that node's parent, and so on back to the start node. That sequence will be a path from start to target. This is illustrated in the modification of the algorithm given below. A 'writepath' procedure would also be required to write out the resulting path, given the parent array, start node and target node (a sketch follows the algorithm).

   stack.push(startnode);
   do {
      currentnode = stack.pop();
      if(! visited[currentnode]) {
         visited[currentnode] = 1;
         for {each neighbour n of currentnode}
            if( !visited[n]) {
               stack.push(n);
               parent[n] = currentnode;
            }
      }
   } while(! stack.empty() && currentnode != target);
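The 'writepath' procedure itself might be sketched as follows: it follows the parent records back from the target, then prints the path out in start-to-target order. The array bound is an assumption for illustration.

   #include <stdio.h>

   const int MAXNODES = 100;

   void writepath(int parent[], int start, int target) {
      int path[MAXNODES];
      int len = 0;
      for(int n = target; n != start; n = parent[n])  // walk back to the start
         path[len++] = n;
      path[len++] = start;
      for(int i = len - 1; i >= 0; i--)               // print reversed: start first
         printf("%d ", path[i]);
      printf("\n");
   }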





Example application

The second assessed exercise will involve constructing and searching a graph. Graph search methods are used in, among many other things, solving simple AI problems which involve finding a way to achieve one state from another state. Such problems include game playing problems, where the 'target' state is a winning state; and robot planning problems, where the target state specifies what you want the robot to achieve (e.g., he's brought you your beer, or assembled a car). The methods are often illustrated by considering simple puzzles. The general idea is that a possible 'state' of the problem is represented as a node in the graph. Possible actions, which get us from one state to another, are represented as edges. If we can search from an initial state, and find a target state, then the path between these states will correspond to the sequence of actions needed to achieve your target state.




Weighted Graphs and Shortest Path Algorithms

For many applications it is useful if edges in a graph can be labelled with a weight. For example, we could have a graph representing possible routes between cities, where the weight might be the distance along a given route. In other applications the weight might represent the time taken to traverse an edge (e.g., the time to get from one city to another, or to do some task), or the cost of traversing an edge (e.g., the cost of taking some action corresponding to an edge). Such a graph is referred to as a weighted graph. A weighted graph is generally implemented using an adjacency matrix: whereas before the matrix was an NxN array of booleans, now we use an NxN array of integers or reals, where the number indicates the weight of an edge. (Where there is no edge between two nodes this corresponds in principle to a weight of infinity, but some convention can be adopted for what to put in the matrix in such cases.) The ADT definition would be much as before, but with further operations to modify and access weights. Given a weighted graph, we may want to find the shortest path between two nodes, where the 'length' of the path is just defined as the sum of the weights on the relevant edges. This might be used to, say, find the shortest route between cities, the quickest train journey, or the cheapest way to do a sequence of tasks. If the weights on every edge are the same (e.g., a weight of 1) then we already have an algorithm that will do the trick: breadth first search. This always explores paths of length N before paths of length N+1, and so will find the shortest path first. For arbitrary weights, things are a little more complex (at least, if we want to do the task reasonably efficiently). It is necessary to keep track, as we search, of the shortest distance we have found so far connecting the start node to each other node. This information can be kept in an array (say, called shortest). Then, as we search, given a number of possible nodes to explore, the one that is the shortest distance from the start node is explored first. (This is like breadth first, but it is kind of 'shortest' first.) If we do this we find that when we reach the target node, the path we have effectively traversed is guaranteed to be the shortest. I'll go through how this works with an example. Suppose we are searching the graph in the figure, in order to find the shortest path from node 1 to node 5. We'll use a set W to denote the nodes so far examined. (We could alternatively have used an array ``visited'' like we used in the search algorithms; that implicitly represented a set: all the nodes such that visited[n] = 1 are in the set.)

Figure 4.3: Example weighted graph

   W = {1}

We then look at the nodes connected to 1, and record the shortest distance from node 1 to these nodes, ie: shortest[2] = 3; shortest[6]=5

The one with the shortest distance is added to W, and the 'shortest' array updated to also hold information about the nodes connected to 2: W = {1, 2}; shortest[3]=10

Now, the node with the shortest distance, not already in W is 6, so this is added, and shortest updated as before: W = {1, 2, 6}; shortest[4]=7



Note that we will also have discovered another path to node 3 via 6, but the distance is 13, which is more than shortest[3] = 10 (added earlier), so this path is ignored. Now the best node not in W is 4:

   W = {1, 2, 6, 4};  shortest[5] = 7+6 = 13

Best is now node 3: W = {1, 2, 6, 4, 3}; shortest[5]=11 (overwriting old value)

Best is now node 5, with shortest path length 11. If that's the node we were looking for, we now know its shortest path from node 1; or we could continue to find the shortest paths for all the nodes. With a small extension to the approach, we could have recorded the path taken, which could be returned. C++ doesn't have a built in `set' datatype, so to implement the algorithm directly we'd have to define one. We'll assume that a set datatype has been defined (using the C++ class mechanism), with operations `intersection', `iselement' (is a given element in a given set?), `insert' (adds a single element to a set), and `union'. Small sets may be implemented using bit operations on ordinary numbers. For example, a set of 16 possible elements could be represented using a single (2 byte) integer, where each bit in that integer indicates whether a particular item is in the set. Bitwise operations (like OR and AND) can be used to implement union and intersection. Anyway, the actual algorithm for this is as follows. T[i][j] gives the weight between two nodes; V is the set of all nodes. Pseudocode is used here for conciseness.

It is possible to prove that this algorithm (known as Dijkstra's algorithm) works correctly. Essentially you use induction to show that if the shortest distances are correct for all the nodes in W at one stage, they are still correct after you add a new node to W as specified.
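Here is a minimal C++ sketch of the algorithm. It uses an ordinary array inW[] in place of the set W, and a large constant in place of 'infinity'; the small example graph in main is invented for illustration and is not the graph of figure 4.3.

   #include <stdio.h>

   const int N = 6;           // nodes numbered 0..N-1
   const int INF = 1000000;   // stands in for 'no edge' / infinity

   void dijkstra(int T[N][N], int start, int shortest[N]) {
      int inW[N];
      for(int i = 0; i < N; i++) {
         shortest[i] = T[start][i];   // initialise from the direct edges
         inW[i] = 0;
      }
      inW[start] = 1;
      shortest[start] = 0;
      for(int step = 1; step < N; step++) {
         // find the vertex w in V-W at minimum distance from the start node
         int w = -1;
         for(int i = 0; i < N; i++)
            if(!inW[i] && (w == -1 || shortest[i] < shortest[w])) w = i;
         inW[w] = 1;
         // paths through w may now improve the other recorded distances
         for(int u = 0; u < N; u++)
            if(!inW[u] && shortest[w] + T[w][u] < shortest[u])
               shortest[u] = shortest[w] + T[w][u];
      }
   }

   int main() {
      int T[N][N];
      for(int i = 0; i < N; i++)
         for(int j = 0; j < N; j++) T[i][j] = (i == j) ? 0 : INF;
      T[0][1] = 3; T[0][5] = 5; T[1][2] = 10;   // made-up example weights
      T[5][3] = 2; T[3][4] = 6; T[2][4] = 1;
      int shortest[N];
      dijkstra(T, 0, shortest);
      for(int i = 0; i < N; i++) printf("shortest[%d] = %d\n", i, shortest[i]);
      return 0;
   }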




Topological Sort

• Motivation
• The Algorithm


Motivation

There are many problems where we can easily say that one task has to be done before another, or depends on another, but where it is not so easy to work out an order in which to do a whole bunch of such tasks. For example, it is easy to specify/look up prerequisite relationships between modules in a course, but it may be hard to find an order to take all the modules so that all prerequisite material is covered before the modules that depend on it. The same problem arises in spreadsheets. To recalculate cells after some of the values are changed, it is necessary to consider how one cell's value depends on the value in another, e.g.:

   Cell   Contents             Value
   1      100                  100
   2      (Cell 1) + 10        110
   3      (Cell 1) * 2         200
   4      (Cell 2) + (Cell 3)  310

In the above example, if you change the value in cell 1, you need to first recalculate the values in cell 2 and 3, and only when that is done can you recalculate the value in cell 4. Both these problems are essentially equivalent. The data of both can be represented in a directed graph (see fig ?). In the first each node is a module; in the second example each node is a spreadsheet cell. Directed edges occur when one node depends on the other, because of prerequisite relationships among courses or dependencies among nodes. The problem in both is to find an acceptable ordering of the nodes satisfying the dependencies. This is referred to as a topological ordering.





The Algorithm

A straightforward approach to finding this order would be first to find the nodes each node depends on, e.g.:

   Node   Depends on
   1      none
   2      1
   3      1
   4      2, 3

You would first remove any node that has NO dependencies, as an acceptable first node. That node could then be removed from the dependency lists of the other nodes:

   Node order: 1

   Node   Depends on
   2      none
   3      none
   4      2, 3

The process then repeats, with either node 2 or 3 being chosen. Let's say node 2:

   Node order: 1, 2

   Node   Depends on
   3      none
   4      3

This continues in the obvious manner, with the final order being 1, 2, 3, 4. In fact, to implement this we don't even need to store the lists of nodes that a node depends on, just the NUMBER of such nodes. So you start by finding that number for each node. Then when a node n (which depends on zero nodes) is removed, you decrement the count associated with each of the nodes m depending on it (ie, with an edge from node m to n). Ie:

   Node   Depends on
   1      none (count 0)
   2      1    (count 1)
   3      1    (count 1)
   4      2, 3 (count 2)

   Remove 1; then 2 and 3 have counts of 0.
   Remove 2; then 4 has count 1.
   Remove 3; then 4 has count 0.
   Remove 4.

To implement this, we can use a list datastructure L to return the topologically sorted nodes, and a queue datastructure Q to hold the nodes with zero outstanding dependencies, waiting to be processed. When a node N is removed from Q and added to L, all of N's dependent neighbours have their count decremented, and any that then reach zero are added to Q. Something like:

   Q      L
   1
   2 3    1
   3      1 2
   4      1 2 3
          1 2 3 4
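A minimal C++ sketch of this counting approach (often called Kahn's algorithm), using the four-node example above. edge[m][n] = 1 means m depends on n, matching the text; the array-based queue and the node numbering from 0 are assumptions to keep the sketch self-contained.

   #include <stdio.h>

   const int N = 4;   // node i here corresponds to node i+1 in the text

   int main() {
      int edge[N][N] = {0};
      edge[1][0] = 1;                  // 2 depends on 1
      edge[2][0] = 1;                  // 3 depends on 1
      edge[3][1] = 1; edge[3][2] = 1;  // 4 depends on 2 and 3

      int count[N] = {0};              // number of nodes each node depends on
      for(int m = 0; m < N; m++)
         for(int n = 0; n < N; n++)
            if(edge[m][n]) count[m]++;

      int q[N], front = 0, back = 0;   // simple array-based queue Q
      for(int n = 0; n < N; n++)
         if(count[n] == 0) q[back++] = n;

      int order[N], len = 0;           // the output list L
      while(front < back) {
         int n = q[front++];
         order[len++] = n;
         for(int m = 0; m < N; m++)    // decrement the nodes depending on n
            if(edge[m][n] && --count[m] == 0) q[back++] = m;
      }
      // if len < N here, the graph has a cycle and no valid ordering exists
      for(int i = 0; i < len; i++) printf("%d ", order[i] + 1);
      printf("\n");
      return 0;
   }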






Graphs ADTs and OOP

In this section we've considered various types of graph: directed vs undirected; labelled vs unlabelled; weighted vs not weighted. How could we exploit inheritance and OOP to define these abstract datatypes? There are two approaches you might take. One would be to start with a very minimal (say directed, unlabelled, unweighted) graph ADT and create derived classes to define the more complex graph types (analogous to the ``named stack'' example). These derived types could have additional datafields (e.g., for labels) and could, where necessary, override some of the basic graph ADT's methods (e.g., to make a directed graph into an undirected one). Multiple inheritance could even be used to create, say, labelled undirected graphs. The second approach (illustrated in the ``stack from list'' example in the OOP section) would be to create a graph ADT that has all possible operations and datafields (labels, weights, methods to modify weights and labels etc.) and directed edges. This could be made a private base class for more specialised graph types, such as a labelled undirected non-weighted graph, which would define its operations using the operations defined for the supporting base class. This might provide a better basis for defining the more complex graph types, but would be inefficient for the simple ones, as simple things would be defined in terms of a complex underlying class definition.




Summary: Graph Algorithms

This chapter has presented a range of useful graph algorithms, for undirected, directed, and weighted graphs. The algorithms presented are fairly simple. There are many dozens more algorithms that have been invented for related tasks, and a number of tasks for which there is no efficient algorithm. One of these is the 'travelling salesman' problem, where the objective is to find the minimum cost tour in a weighted graph whereby every node is visited. The development of graph algorithms is still a very active research area. What key ideas should you remember from all this? First, you may, when solving some computational problem, be able to spot that the data can best be represented as a graph, and that solving the problem involves analysing the graph in some way. You can then look up graph algorithms in a major textbook (like Sedgewick) to find out if there is a standard algorithm that fits. Second, the very basic algorithms mentioned here are part of the basic `vocabulary' of computer scientists, so they might be referred to when people describe their system or method. A basic familiarity with the algorithms should help in understanding such descriptions.




String Processing Algorithms

• Introduction
• A String ADT
• String Searching
  ♦ Motivation
  ♦ A Naive Algorithm
  ♦ The Knuth-Morris-Pratt Algorithm
  ♦ The Boyer-Moore Algorithm
  ♦ Tradeoffs and Issues
  ♦ Conclusion
  ♦ Exercise
  ♦ Further Reading
• Pattern Matching
  ♦ Motivation
  ♦ Representing Patterns
  ♦ A Simple Pattern Matcher
  ♦ Further Reading
• Parsing
  ♦ Motivation
  ♦ Context Free Grammars
  ♦ Simple Parsing Methods
    ◊ A Top-Down Parser
    ◊ A Bottom-Up Parser
  ♦ Further Reading
• File Compression
  ♦ Motivation
  ♦ Run-Length Encoding
  ♦ Variable-Length Encoding
  ♦ Substitutional Compressors
  ♦ JPEG and MPEG
  ♦ Summary
    ◊ Further Reading
• Cryptography
  ♦ Motivation
  ♦ A Basic Outline and Terminology
  ♦ Simple Symmetric Methods
  ♦ Asymmetric Systems: Public Key Cryptosystems
  ♦ Summary
  ♦ Exercises
  ♦ Further Reading


String Processing Algorithms

60


Introduction

We'll start by looking at a class of useful algorithms, used in many practical applications: string processing algorithms. These include:

• Simple searching algorithms, useful for applications such as text editors and retrieval tools, where the user may want to search through some large document or collection of documents for the occurrence of a given term.
• Pattern matching algorithms, where the user, rather than specifying a literal string to search for, may want just to enter a pattern, such as '?cat *' (which, assuming a fairly standard representation of such patterns, matches all strings that end in cat, followed by zero or more spaces).
• Parsers, used both in the interpretation of computer programs and natural language. Here, the goal is to analyse a string given a grammar representing acceptable forms, and to decompose it into a structure suitable for further processing.
• File compression algorithms, used in programs such as gzip and compress, which allow files to be stored in less space.
• Encryption algorithms, particularly important now for secure transactions over the network. (These get complex very quickly, so we'll only introduce the main ideas.)

You might want to know about these algorithms for all the reasons listed in chapter 1: they give the basic ideas that will allow you to develop better string processing algorithms of your own (e.g., perhaps you want an algorithm to score two strings according to some complex notion of closeness of match); they are useful in a wide variety of programming tasks; they provide the basis for understanding existing tools (e.g., encryption and compression tools); and some of them are surprisingly neat and simple, and interesting in their own right.



A String ADT

Most languages have strings as a built-in datatype, and a set of operations defined on that type. However, it is also possible for the programmer to implement their own string datatype. The use of that datatype in string processing applications shouldn't depend on how it is implemented, or whether it is built in or user defined. The programmer only has to know what operations are allowed on the string.

The following is a set of the operations we might want to do on strings. We might need operations to access individual characters, obtain the length, copy, compare and concatenate strings, and find the position of a substring within a string:

Getchar(str, n)         Return the nth character in the string.
Putchar(str, n, c)      Set the nth character in the string to c.
Length(str)             Return the number of characters in the string.
Pos(str1, str2)         The position of the first occurrence of str2 found in str1, or 0 if no match.
Concat(str1, str2)      Returns a new string consisting of the characters in str1 followed by the characters in str2.
Substring(str1, i, m)   A substring of length m starting at position i in str1.
Delete(str, i, m)       Deletes m characters from str starting at position i.
Insert(str1, str2, i)   Changes str1 into a new string with str2 inserted at position i.
Compare(str1, str2)     Return an integer indicating whether str1 > str2.

There are a number of ways that strings may be implemented:

• As a fixed length array, where the first element denotes the length of the string, e.g., [6,a,l,i,s,o,n,.....]. This is used as the standard string type in Pascal.
• As an array, but with the end of the string indicated using a special `null' character (denoted '\0'), e.g., [a,l,i,s,o,n,\0,.....]. Memory can be dynamically allocated for the string once we know its length.
• As a (variable length) linked list of characters, dynamically allocating and de-allocating storage as required. (This is in fact rarely used, but illustrates a plausible alternative.)

C++ has only a very basic built-in string data type with few operations defined on strings. The second implementation method above is used, which means that you either have to declare in advance the maximum length of a string, or deal with allocating/de-allocating memory dynamically. C++'s string library string.h provides a wider range of operations on this string datatype, beyond those built in. Some of these are listed on pg 651 of Friedman & Koffman. Many are variants of the list of useful string operations given above.
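To make the idea concrete, here is a minimal sketch of a user-defined string class offering a few of the operations above, using the second (null-terminated array) representation. The class and member names are illustrative assumptions; note that Pos here returns a 0-based index (or -1), a slight deviation from the 1-based convention in the table:

#include <cstring>

class String {
public:
    String(const char* s) {
        data = new char[strlen(s) + 1];
        strcpy(data, s);
    }
    ~String() { delete [] data; }
    int Length() const { return strlen(data); }
    char Getchar(int n) const { return data[n]; }     // nth character (0-based)
    void Putchar(int n, char c) { data[n] = c; }
    int Pos(const char* s) const {                    // first occurrence of s, or -1
        const char* p = strstr(data, s);
        return p ? (int)(p - data) : -1;
    }
private:
    char* data;
};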


Comparing Representations

Looking at strings gives us an opportunity to consider the advantages and disadvantages of different implementations of a given abstract datatype. Obviously, if we pre-allocate an array of a given size, it will have the following problems:

• If the actual string is smaller than the space allocated, the extra space is wasted.
• If the actual string is larger than the space allocated, errors will occur, so care must always be taken to ensure that this does not happen.

For some applications this is critical, and it is necessary to allocate memory dynamically. However, the different representations will also have different efficiency for the different operations. Consider the following:

• How would you find out the length of a string in each of the three representations? In which representation would this be most efficient?
• How would you insert a new string (say ``xyz'') into a string (say ``abcd'') at a given position (say, after the 2nd char, to give ``abxyzcd'')? In which representation would this be easiest and most efficient?
• How would you access the nth character of the string in each implementation?

Clearly the choice of string implementation depends in part on the application. In some applications some operations might be more common, so the use of a representation that makes those operations efficient will be better. In other applications the underlying representation may be relatively unimportant, so long as a convenient set of string handling operations is available.



String Searching

• Motivation
• A Naive Algorithm
• The Knuth-Morris-Pratt Algorithm
• The Boyer-Moore Algorithm
• Tradeoffs and Issues
• Conclusion
• Exercise
• Further Reading


Motivation

As mentioned above, string searching algorithms are important in all sorts of applications that we meet every day. In text editors, we might want to search through a very large document (say, a million characters) for the occurrence of a given string (maybe dozens of characters). In text retrieval tools, we might potentially want to search through thousands of such documents (though normally these files would be indexed, making this unnecessary). Other applications might require string matching algorithms as part of a more complex algorithm (e.g., the Unix program ``diff'' that works out the differences between two similar text files).

Sometimes we might want to search in binary strings (i.e., sequences of 0s and 1s). For example, the ``pbm'' graphics format is based on sequences of 1s and 0s. We could express a task like ``find a wide white stripe in the image'' as a string searching problem.

In all these applications the naive algorithm (that you might first think of) is rather inefficient. There are algorithms that are only a little more complex which give a very substantial increase in efficiency. In this section we'll first introduce the naive algorithm, then two increasingly sophisticated algorithms that give gains in efficiency. We'll end up discussing how properties of your string searching problem might influence choice of algorithm and average case efficiency, and how you might avoid having to search at all!


A Naive Algorithm

The simplest algorithm can be written in a few lines. Note that s1 and s2 are strings, which in C++ are normally represented as arrays of characters. strlen is the standard C++ string library function for finding string length.

int NaiveSearch(char s1[], char s2[])
// Returns the index in s1 of the first match of s2 within s1, or
// -1 if there is no match
{
    int i, j, M = strlen(s2), N = strlen(s1);
    for (i = 0, j = 0; j < M && i < N; i++, j++)
        if (s1[i] != s2[j]) {
            i -= j;
            j = -1;
        }
    if (j == M) return i - M;
    else return -1;
}

We'll illustrate the algorithms by considering a search for the string 'abc' (s2) in the document 'ababcab' (s1). The following trace should be fairly self-explanatory (the ^ indicates where the matching characters are checked each round):

'ababcab'      i=0, j=0  matches: increment i and j
'abc'
 ^
'ababcab'      i=1, j=1  matches: increment i and j
'abc'
  ^
'ababcab'      i=2, j=2  match fails: i=i-j=0, j=-1, increment i and j
'abc'
   ^
'ababcab'      i=1, j=0  match fails: i=i-j=1, j=-1, increment i and j
 'abc'
  ^
'ababcab'      i=2, j=0  matches: increment i and j
  'abc'
   ^
'ababcab'      i=3, j=1  matches: increment i and j
  'abc'
    ^
'ababcab'      i=4, j=2  matches: increment i and j
  'abc'
     ^

i=5, j=3: exit loop (j=M), so return i-M = 2

Note that for this example 7 comparisons were required before the match was found. In general, if we have a string s1 of length N and s2 of length M, then a maximum of approximately (M-1)*N comparisons may be required, though often the number required will be closer to N (if few partial matches occur). To illustrate these extremes consider: s1 = 'aaaabaaaabaaaaab', s2 = 'aaaaa', and s1 = 'abcdefghi', s2 = 'fgh'.


The Knuth-Morris-Pratt Algorithm

The Knuth-Morris-Pratt (KMP) algorithm uses information about the characters in the string you're looking for to determine how much to `move along' that string after a mismatch occurs. To illustrate this, consider one of the examples above: s1 = 'aaaabaaaabaaaaab', s2 = 'aaaaa'. Using the naive algorithm you would start off something like this:

'aaaabaaaabaaaaab'
'aaaaa'
 ^
'aaaabaaaabaaaaab'
'aaaaa'
  ^
'aaaabaaaabaaaaab'
'aaaaa'
   ^
'aaaabaaaabaaaaab'
'aaaaa'
    ^
'aaaabaaaabaaaaab'     match fails on the 'b': move s2 up one..
'aaaaa'
     ^
'aaaabaaaabaaaaab'
 'aaaaa'
  ^
etc etc

but in fact if we look at s2 (and the 'b' in s1 that caused the bad match) we can tell that there is no chance that a match starting at position 2 will work. The 'b' will end up being matched against the 4th character in s2, which is an 'a'. Based on our knowledge of s2, what we really want is the last iteration above replaced with:

'aaaabaaaabaaaaab'
     'aaaaa'
     ^

We can implement this idea quite efficiently by associating with each element position in the searched-for string the amount that you can safely move that string forward if you get a mismatch in that element position. For the above example:

a  1   If mismatch in first el, just move string on 1 place.
a  2   If mismatch here, no point in trying just one place, as that'll
       involve matching with the same el (a), so move 2 places.
a  3
a  4
a  5



In fact the KMP algorithm is a little more cunning than this. Consider the following case:

'aaaab'      i=2, j=2
'aab'
   ^

We can only move the second string up 1, but we KNOW that the first character will then match, as the first two elements are identical, so we want the next iteration to be:

'aaaab'      i=2, j=1
 'aab'
   ^

Note that i has not changed. It turns out that we can make things work by never decrementing i (i.e., just moving forward along s1), but, given a mismatch, just decrementing j by the appropriate amount, to capture the fact that we are moving s2 up a bit along s1, so the position in s2 corresponding to i's position is lower. We can have an array giving, for each position in s2, the position in s2 that you should back up to given a mismatch (while holding the position in s1 constant). We'll call this array next[j].

j        0   1   2   3   4   5   6
s2[j]    a   b   a   b   b   a   a
next[j] -1   0   0   1   2   0   1

In fact next[0] is a special case. If the first match fails we want to keep j fixed and increment i. If we are incrementing i and j each time round the loop this is achieved easily if next[0] = -1.

'abababbaa'      i=4, j=4
'ababbaa'        mismatch, so j = next[j] = 2
     ^
'abababbaa'      i=4, j=2
  'ababbaa'
     ^
-------------------
'abaabbaa'       i=3, j=3
'ababbaa'        mismatch, so j = next[j] = 1
    ^
'abaabbaa'       i=3, j=1
  'ababbaa'
    ^
-------------------
'bababbaa'       i=0, j=0
'ababbaa'        mismatch, so j = next[j] = -1, increment i and j
 ^
'bababbaa'       i=1, j=0
 'ababbaa'
  ^

It's easy enough to implement this algorithm once you have the next[..] array. The bit that is mildly more tricky is how to calculate next[..] given a string. We can do this by trying to match the string against itself. When looking for next[j] we find the first index k such that s2[0..k-1] = s2[j-k..j-1], e.g.:

'ababbaa'
  'aba....'
     ^          s2[0..1] = s2[2..3], so next[4] = 2


If there is no matching region, return 0. If j = 0, return -1. (Essentially we find next[j] by sliding the pattern forward along itself, until we find a match of the first k characters with the k characters before (and not including) position j.) The detailed implementations of these algorithms are left as an exercise for the reader - they're pretty easy, so long as you get the boundary cases right and avoid off-by-one errors. The KMP algorithm is extremely simple once we have the next table:

int i, j, M = strlen(s2), N = strlen(s1);
for (i = 0, j = 0; j < M && i < N; i++, j++)
    while ((j >= 0) && (s1[i] != s2[j]))
        j = next[j];
if (j == M) return i - M;
else return -1;

(If j = M, the length of s2, when the loop exits we have a match and can return something appropriate, such as the index in s1 where the match starts.)
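Since the notes leave computing next[..] as an exercise, here is one possible sketch. The function name is an illustrative assumption; it slides the pattern along itself as described above, with next[0] = -1:

#include <cstring>

void ComputeNext(char s2[], int next[])
{
    int M = strlen(s2);
    next[0] = -1;
    for (int i = 0, j = -1; i < M - 1; ) {
        if (j == -1 || s2[i] == s2[j]) {
            i++; j++;
            next[i] = j;     // first j characters match the j before position i
        } else
            j = next[j];     // slide the pattern along itself
    }
}

// For s2 = "ababbaa" this gives next[] = {-1, 0, 0, 1, 2, 0, 1}, matching
// the table above.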


The Boyer-Moore Algorithm

Although the above algorithm is quite cunning, it doesn't help that much unless the strings you are searching involve a lot of repeated patterns. It'll still require you to go all along the document (s1) to be searched. For most text editor type applications, the average case complexity is little better than the naive algorithm (O(N), where N is the length of s1). (The worst case for KMP is N+M comparisons - much better than naive, so it's useful in certain cases.) The Boyer-Moore algorithm is significantly better, and works by searching the target string s2 from right to left, while moving it left to right along s1. The following example illustrates the general idea:

'the caterpillar'    Match fails: there's no space (' ') in the search
'pill'               string, so move it right along 4 places.
    ^
'the caterpillar'    Match fails: there's no 'e' either, so move along 4.
    'pill'
        ^
'the caterpillar'    'l' matches, so continue trying to match right to left.
        'pill'
            ^
'the caterpillar'    Match fails. But there's an 'i' in 'pill', so move
        'pill'       along to the position where the 'i's line up.
           ^
'the caterpillar'    Matches, as do all the rest..
         'pill'
             ^


This still only requires knowledge of the second string, but we require an array containing an indication, for each possible character that may occur, of where it occurs in the search string and hence how much to move along. So, index['p'] = 0, index['i'] = 1, index['l'] = 3 (index the rightmost 'l' where there are repetitions) but index['r'] = -1 (let the value be -1 for all characters not in the string). When a match fails at a position i in the document, at a character C, we move along the search string to a position where the current character in the document is above the index[C]th character in the string (which we know is a C), and start matching again at the right hand end of the string. (This is only done when it actually results in the string being moved right - otherwise the string is just moved up one place, and the search started again from the right hand end.)

The Boyer-Moore algorithm in fact combines this method of skipping over characters with a method similar to the KMP algorithm (useful to improve efficiency after you've partially matched a string). However, we'll just assume the simpler version that skips based on the position of a character in the search string. It should be reasonably clear that, if it is normally the case that a given letter doesn't appear at all in the search string, then this algorithm only requires approx N/M character comparisons (N = length(s1), M = length(s2)) - a big improvement on the KMP algorithm, which still requires N. However, if this is not the case then we may need up to N+M comparisons again (with the full version of the algorithm). Fortunately, for many applications we get close to the N/M performance. If the search string is very large, then it is likely that a given character WILL appear in it, but we still get a good improvement compared with the other algorithms (approx N*2/alphabet_size if characters are randomly distributed in a string).
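A sketch of this simplified version (skipping on the mismatched character only, without the KMP-style refinement) might look as follows. The function name and the 256-entry index table are assumptions for illustration:

#include <cstring>

int BMSearch(char s1[], char s2[])
{
    int N = strlen(s1), M = strlen(s2);
    int index[256];
    for (int c = 0; c < 256; c++) index[c] = -1;       // -1: not in search string
    for (int j = 0; j < M; j++)
        index[(unsigned char)s2[j]] = j;               // rightmost occurrence

    int i = M - 1;                  // position in s1 under the END of s2
    while (i < N) {
        int k = i, j = M - 1;
        while (j >= 0 && s1[k] == s2[j]) { k--; j--; } // match right to left
        if (j < 0) return k + 1;                       // full match found
        int skip = j - index[(unsigned char)s1[k]];    // line up the 'i's, etc.
        i += (skip > 1) ? skip : 1;                    // always move at least one
    }
    return -1;
}

// For example, BMSearch("the caterpillar", "pill") examines the characters
// shown in the trace above and returns 9.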


Tradeoffs and Issues

There are other string search algorithms that use different methods, but the Boyer-Moore algorithm outlined above is the best known, fairly simple, and widely used. Even today, new algorithms are being developed based on the Boyer-Moore one, but (for example) working with patterns rather than fixed search strings. The best choice of algorithm (and the average case efficiency of your algorithm) will depend on your 'alphabet size' (how many different characters may appear in a string), and the amount of repetition in the strings. For normal natural language text (e.g., an English document), the KMP algorithm gives little advantage over the naive algorithm, but Boyer-Moore does very significantly better. For a binary string where subsequences of the search string may occur frequently, KMP may do quite a lot better than the naive algorithm, while Boyer-Moore may do little better than KMP.

For applications such as text retrieval, where you may be (conceptually) searching through many text files for the occurrence of a given string, approaches that go through each file from start to end are likely to be too inefficient. The only practical thing is to create a giant index which indicates, for each word/character sequence, just which files contain this sequence (and where in the file). You could have a large hash table that allowed you to rapidly get from a given sequence to the list of files and locations. This obviously requires a lot of preprocessing of files, and a fair amount of storage for the index/hash table. However, for any application where many users will be searching the same (fixed) set of texts such indexing will be worthwhile. I won't discuss details of how you'd do it here, just point out that in such cases we would pre-process the files to create indexes rather than searching them on-the-fly from the raw text.


Conclusion

The string matching algorithms introduced vary from worst case N*(M-1), best case N complexity, to worst case N+M, best case N/M complexity. This is a significant gain, without much cost in terms of algorithm complexity (just some jolly clever chaps to think of it in the first place). This is a good illustration of how knowledge of state-of-the-art algorithms may give good gains in performance if you use them in your programs.


Exercise

Try implementing the Boyer-Moore string search algorithm, and the naive algorithm. Do some tests on a large file and count how many character comparisons are done in each for some example searches.


Further Reading

Sedgewick (and several other algorithms books) covers the algorithms above.



Pattern Matching

• Motivation
• Representing Patterns
• A Simple Pattern Matcher
• Further Reading


Motivation

For many search and retrieval tasks we don't want to have to specify exact strings to match against; rather, we want to specify a pattern. For example, we may want to retrieve all texts that include the word `carrot', followed by zero or more exclamation marks, followed by a space. One way would be to specify a pattern such as `carrot!* '. The `*' means zero or more occurrences of the last character. Matching against patterns is important for many text processing applications. New languages have been developed whose major selling point is good string manipulation, such as pattern matching. A good example is the PERL language, which is widely used to process and generate text files to be used on the World Wide Web.

To allow matches against such patterns we need to consider both how to describe such patterns (so that they are both easy to specify by the user and easy to process by the machine), and how we can do efficient matches and searches given such patterns. The next section will briefly discuss how we can represent patterns, and then we'll look at algorithms to do the matching.


Representing Patterns

A pattern can be represented as a regular expression, using special symbols to describe optional repetition, alternatives, etc. A regular expression language may be based on just two special symbols (plus brackets):

'|' to represent alternatives (OR), e.g., 'a|bcd' will match 'acd' or 'bcd', and '(a|bc)d' will match 'ad' or 'bcd'.



'*' to allow zero or more occurrences of the last character (or bracketed expression), e.g., 'ab*' matches 'a', 'ab', 'abb' etc., and '(ab)*' matches ab, abab, ababab, etc.

Other special symbols are often included for representing regular expressions (e.g., `.' to match any character, '+' for 1 or more occurrences, often a character to represent any character except the last one, etc.). However, the above is sufficient, just maybe rather verbose. This simple language is in fact very flexible. We can build up quite complex expressions, such as '(ab|c)*z' to match 'cz', 'abababz', 'ccabccz' and a whole lot more.

A regular expression may be represented graphically as a finite state machine, as illustrated below. (The numbers in brackets are just so we can discuss it more easily.) Each possible string corresponds to a path from start (right arrow on LHS) to end node (box on RHS) of the network. So, for example 'abcabz' corresponds to following the path between the nodes numbered 1, 2, 3, 5, 1, 4, 5, 1, 2, 3, 5, 6.

Figure 5.1: finite state machine for pattern '(ab|c)+z'

We can implement a data structure based on this graphical representation by having an array holding, for each state, the possible next states. So, state 1 has possible next states 2 and 4. (We could use an Nx2 array, where N is the number of states, and agree on what to put in the second position where there is only one successor.) There is then a fairly simple algorithm for checking matches of strings with this representation of a pattern. It is based on using a variant of the queue data structure to hold a representation of the states in the network to check when matching along the string. A `double ended queue' (or `deq') is used, so items may be added to the front or the end. A queue such as (1 2 + 5 6) means ``Possible states to go to in order to process the current character are 1 and 2, and possible next states assuming the current character has already been matched are 5 and 6.'' So the representation allows you to explore many paths simultaneously, by keeping the alternative next states on the list. The special symbol + is used essentially to separate possible states corresponding to the current character from states corresponding to the next character.
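For concreteness, here is one possible encoding of the network in Figure 5.1 using the Nx2 array idea. The state numbering is an assumption, chosen to match the path quoted above for 'abcabz': state 0 is a dummy start (next[0][0] = 1), states 1 and 5 are choice nodes (character '?'), and 7 is the end state. Where a state has only one successor, it is simply duplicated in the second position:

//                  state:  0     1     2     3     4     5     6     7
char ch[]      =         { ' ',  '?',  'a',  'b',  'c',  '?',  'z',  ' ' };
int  next[][2] =         {{1,1},{2,4},{3,3},{5,5},{5,5},{1,6},{7,7},{7,7}};
int  endstate  = 7;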


A Simple Pattern Matcher

Using this special representation it turns out that we can check if a string matches a pattern using the following simple algorithm. It assumes that choice nodes have the character '?' in them, and that the double ended queue data structure has the operations 'empty', 'push', 'put', and 'pop' available. 'put' puts an item at the end of the queue, 'push' at the front.

N = strlen(str);
state = next[0][0];
dq.put('+');                      // put + at END of dq
j = 0;
while (state != endstate && j <= N && !dq.empty()) {
    if (state == '+') {
        j++;
        dq.put('+');
    }
    else if (ch[state] == str[j])
        dq.put(next[state][0]);   // put next state at END of dq
    else if (ch[state] == '?') {
        dq.push(next[state][0]);  // put next states at START of dq
        dq.push(next[state][1]);
    }
    state = dq.pop();
}

A match succeeds if we reach the final state. The position in the string when the loop exits will indicate the shortest initial substring matching the pattern (we'd need slightly different exit conditions to check if the complete string matched the pattern). As an illustration, consider what would happen given the network above if we try to match the pattern against 'abcz'. To simplify things we won't show 'state', but just assume it is the first element of the previous list (which is 'popped off' after each iteration):

String position   List of states to process
0                 (1 +)        Choice point: put next states at front
0                 (2 4 +)      State 2 contains 'a', so put next state at end
0                 (4 + 3)      State 4 doesn't match 'a', so no action
0                 (+ 3)        '+', so increment string pointer, put + at end
1                 (3 +)        State 3 contains 'b', so put next state at end
1                 (+ 5)        '+', so increment string pointer etc.
2                 (5 +)        Choice: put next states at front
2                 (1 6 +)      Choice: put next states at front
2                 (2 4 6 +)    State 2 doesn't match 'c', so no action
2                 (4 6 +)      ..but state 4 does, so put next state at end
2                 (6 + 5)      No action..
2                 (+ 5)        Increment string pointer
3                 (5 +)        Choice: put next states at front
3                 (1 6 +)      Choice: put next states at front
3                 (2 4 6 +)    State 2 doesn't match 'z', so no action
3                 (4 6 +)      State 4 doesn't either..
3                 (6 +)        ..but state 6 does, so put next state at end
3                 (+ 7)        Increment string pointer..
4                 (7 +)        REACHED END! Must match pattern.

Think for a minute what one of the lists corresponds to. The list (6 + 5) at string position 2 means: another node to check for the 3rd character is 6; but there's a next node 5 corresponding to a path we've already found containing the third character (1 2 3 5 1 4). The algorithm outlined works reasonably efficiently: with a small modification to avoid the possibility of having redundant repeats of the same state, we have a worst case of around M*N steps, where M is the number of states and N the length of the string. To complete our pattern matcher we need code to translate our regular expressions into a finite-state machine (our network). This is a non-trivial problem in itself.


Further Reading

Sedgewick describes the above pattern matching methods in more detail.



Parsing

Parsing is the process of taking a sequence (e.g., of characters) and checking whether it is a legal sequence in a given ``language''. The sequence may also be decomposed into a structure which allows further processing. As a very simple example, we might parse an expression ``1 + 2 * 3'', determine that it is a legal arithmetic expression, and return a tree structure such as:

      o
    / | \
   1  +  o
       / | \
      2  *  3

In this section we will give a very brief introduction to parsing. However, it is a very large subject, and so the introduction here will be somewhat superficial, and the methods presented should not be blindly adopted for more serious applications without further study of the subject.

• Motivation
• Context Free Grammars
• Simple Parsing Methods
  ♦ A Top-Down Parser
  ♦ A Bottom-Up Parser
• Further Reading


Motivation

Parsing is essential for processing computer programs. A computer program must be checked to see if it is syntactically correct, and a data structure reflecting the structure of the program built up (e.g., making clear where different statements begin/end). Then it can be compiled into a low level assembly language (or interpreted). Parsing is also important for the interpretation of ``natural'' language such as English, and for other applications such as the processing of ``marked up'' text files (SGML). We'll include some examples of parsing natural language, as the grammar involved should be more familiar.


Context Free Grammars

Parsers presuppose that there is some grammar that defines the legal expressions in a language. For parsing natural language this will just be a grammar of English. For parsing expressions of a programming language this will be the grammar defining legal constructs of that language. Programming languages are often described using a particular type of grammar known as a context free grammar. For example, the following grammar rules define legal arithmetic expressions involving * and +:

<expression> ::= <term> | <term> + <expression>      (rule 1)
<term>       ::= <factor> | <factor> * <term>        (rule 2)
<factor>     ::= ( <expression> ) | v                (rule 3)

Note, other slightly different notations are often used, such as:

expression --> term | term + expression              (rule 1)
term       --> factor | factor `*' term              (rule 2)
factor     --> `(' expression `)' | v                (rule 3)

This grammar is meant to allow expressions like `1 + (1 + 3) * 6', but NOT expressions like `1 + * 2 ('. The grammar is such that the right structure will be assigned to an expression, as well as invalid expressions being rejected. The rules above can be read as follows: ``An expression consists of a term OR a term followed by a `+' followed by an expression. A term consists of a factor OR a factor followed by a `*' followed by a term. A factor consists of an expression in brackets OR a single letter/digit.'' (`v' stands for any letter or digit.) So in the notation above, `::=' can be read as `consists of' or `is a', and `|' means OR.

In the above grammar, '+' and '*' are examples of terminal symbols, corresponding to symbols in the language described, while 'expression' and 'term' are examples of non-terminal symbols, which are internal to the grammar. One non-terminal is distinguished as the start symbol (in this case `expression'). A sequence is recognised by the grammar if we can rewrite it into a sequence containing just the start symbol, by repeatedly replacing sequences matching the right hand side of a rule with sequences matching the left hand side of the rule.


Simple Parsing Methods

Parsing methods can be divided into top-down methods and bottom-up methods. These are illustrated schematically in figure 5.2, for a tiny grammar of English. The illustration for bottom up parsing starts with the sentence itself, and uses the grammar rules to determine how combinations of words can be rewritten as more abstract grammatical categories (e.g., ``The dog'' is a noun phrase). The parse succeeds if this rewriting process concludes with the whole sentence rewritten to the start symbol (sentence). Top down parsing, on the other hand, starts with this start symbol, and rewrites/expands symbols until it finally gets down to a sequence matching the sequence being parsed.

Figure 5.2: Top Down and Bottom up Parses, for a fragment of English

Both top-down and bottom-up parsing are based on the idea of repeatedly modifying a sequence of symbols (based on allowed rewrites in the grammar rules), with the aim of getting between a sequence consisting of just the start symbol and our sequence to be parsed. This can be illustrated as follows:

Top Down
(Sentence)                   apply rule 1
(NounPhrase VerbPhrase)      apply rule 2
(Article Noun VerbPhrase)    apply rule 4
(the Noun VerbPhrase)        apply rule 5
(the dog VerbPhrase)         apply rule 3
(the dog Verb)               apply rule 6
(the dog jumps)              SUCCEEDS

Bottom up
(the dog jumps)              apply rule 6
(the dog Verb)               apply rule 3
(the dog VerbPhrase)         apply rule 5
(the Noun VerbPhrase)        apply rule 4
(Article Noun VerbPhrase)    apply rule 2
(NounPhrase VerbPhrase)      apply rule 1
(Sentence)                   SUCCEEDS

Now, of course, the grammar in the figure is a bit limited! In fact, it can ONLY recognise the sentence 'The dog jumps'. There are no options or alternatives. In any real grammar there will be many alternative structures to explore, and that's where the difficulty in parsing arises. This is particularly a problem when parsing natural language. For artificial languages such as programming languages it is not so bad, but we still need ways of choosing between alternatives. For example, if we have the rule:

<factor> ::= ( <expression> ) | v

and we are doing a top-down parse, how do we know whether to rewrite <factor> to ( <expression> ) or to v (letter/digit)? For programming languages, it is often possible to do this by looking ahead one character. In the above rule, if the next character in the sequence you are parsing is '(' then the first option is correct, otherwise it is the second option. Often, in order to make this one-char lookahead work, the grammar must be modified into an equivalent, but easier to process, form.

• A Top-Down Parser
• A Bottom-Up Parser


A Top-Down Parser

A simple top-down parser (with one char look ahead) can be written by effectively translating the grammar rules into procedures. We don't bother with an explicit sequence of symbols (we're basically relying on the sequence being implicitly represented as the internal stack of procedures still to be executed). We assume that we are traversing an input string s, and that the current position in that sequence is held in a global variable j. As an example, the grammar rule above could be implemented as:

void ParseFactor()
{
    if (s[j] == '(') {
        j++;
        ParseExpression();
        if (s[j] == ')') j++;
        else error();
    }
    else if (LetterOrDigit(s[j])) j++;
    else error();
}

This assumes that there is some error function that will, at least, set a flag to indicate failure. Although rather inelegant, the approach is at least simple, and illustrates the 'one char lookahead' very explicitly in selecting between alternatives in the relevant grammar rule.
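The remaining grammar rules translate into procedures in the same style. The following sketch is an illustration under the same assumptions (global string s, global position j, one-character lookahead), not code from the notes:

void ParseFactor();    // forward declaration: given above

void ParseTerm()       // rule 2: <term> ::= <factor> | <factor> * <term>
{
    ParseFactor();
    if (s[j] == '*') { j++; ParseTerm(); }
}

void ParseExpression() // rule 1: <expression> ::= <term> | <term> + <expression>
{
    ParseTerm();
    if (s[j] == '+') { j++; ParseExpression(); }
}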


A Bottom-Up Parser

A simple bottom-up parser, known as a shift-reduce parser, can be implemented using a stack to hold a sequence of terminal and non-terminal symbols. Symbols from the input string can be shifted onto this stack, or the items on the stack can be reduced by applying a grammar rule, such that the right-hand-side of the rule matches symbols on the top of the stack. The basic mechanism can best be illustrated using an example. (Note that this is similar to the example near the beginning of this section, but we are specifying more precisely the actions to be taken at each iteration.) I'll give examples both for natural language and for arithmetic expressions (I'll use square brackets in the arithmetic expressions as otherwise it's confusing).

Stack                      Input Sequence     Action
()                         (the dog jumps)    SHIFT word onto stack
(the)                      (dog jumps)        REDUCE using grammar rule
(Art)                      (dog jumps)        SHIFT
(Art dog)                  (jumps)            REDUCE
(Art Noun)                 (jumps)            REDUCE
(NounPhrase)               (jumps)            SHIFT
(NounPhrase jumps)         ()                 REDUCE
(NounPhrase Verb)          ()                 REDUCE
(NounPhrase VerbPhrase)    ()                 REDUCE
(Sentence)                 ()                 SUCCESS

Stack                                  Input Sequence    Action
()                                     (2 * [ 1 + 3 ])   SHIFT
(2)                                    (* [ 1 + 3 ])     REDUCE using rule 3
(<Factor>)                             (* [ 1 + 3 ])     SHIFT
(<Factor> *)                           ([ 1 + 3 ])       SHIFT
(<Factor> * [)                         (1 + 3 ])         SHIFT
(<Factor> * [ 1)                       (+ 3 ])           REDUCE (2 times..)
(<Factor> * [ <Term>)                  (+ 3 ])           SHIFT (twice)
(<Factor> * [ <Term> + 3)              ( ])              REDUCE (3 times)
(<Factor> * [ <Term> + <Expression>)   ( ])              REDUCE using rule 1
(<Factor> * [ <Expression>)            ( ])              SHIFT
(<Factor> * [ <Expression> ])          ()                REDUCE using rule 3
(<Factor> * <Factor>)                  ()                REDUCE using rule 2
(<Factor> * <Term>)                    ()                REDUCE using rule 2
(<Term>)                               ()                REDUCE using rule 1
(<Expression>)                         ()                SUCCESS

In the second example, there were many cases where it was unclear whether to shift or reduce. This is a complicated issue. We can build shift-reduce parsers (for certain types of grammar) that use some look-ahead to determine whether to shift or reduce, in a similar manner to that outlined in the example for top-down parsers. However, we'll leave it as an important issue to be discussed in further courses.


Further Reading

Sedgewick, ch 22.



File Compression

In this section we will briefly introduce file compression algorithms. These are concerned with ways to reduce the space a file occupies on the file store. Methods are required that can (fairly) rapidly compress a file, such that the compressed file occupies less space, yet can easily be expanded to recreate the original file. As a simple example of how we might compress a file, suppose we replace every occurrence of the word 'the' with the symbol '#' (and deal with actual occurrences of '#' in some way!). The resulting file would take up less space, but it would be easy to recreate the original. In general we can try to replace common sequences with shorter 'codes' for those sequences.

• Motivation
• Run-Length Encoding
• Variable-Length Encoding
• Substitutional Compressors
• JPEG and MPEG
• Summary
  ♦ Further Reading


Motivation

Despite the rapid decrease in the cost of storage (hard disk, CD etc.), file compression methods are getting more and more important. It is becoming common to keep the complete text of an encyclopaedia, complex colour images, or even longish video extracts, on a hard disk or CD. Such texts, images and videos normally contain a lot of redundancy - texts have oft-repeated words, images have large homogeneous areas, and videos have frames that are very similar to the last. By exploiting this redundancy we can compress these files to a fraction of the original (maybe 50% for text, 10-20% for images/videos). There are four broad classes of compression algorithm, each good for different types of data. These will be discussed in turn.


Run-Length Encoding

A simple compression method uses the fact that, in certain sorts of files, we often get `runs' of repeated characters. A sequence such as:

aaaaabbbbbbbbbccccc

could be represented more concisely as:

5a 9b 5c

For binary files we can be even more concise:

111111111110000000011111111111111

could be represented simply as

11 8 14

as we can assume that we are alternating between 1s and 0s. Of course, numbers take up more space (bits) than 1s/0s to represent, but we can still get good savings if we have quite long runs of 1s or 0s (as may be common in some simple image formats, for example). Run-length encoding isn't much good for text files, as repeated sequences aren't that common. However, it is quite useful for images. Many standard image-file formats are essentially compressed versions of the simple bitmap representation - run-length encoding is one compression method used.
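As a small illustration, here is a sketch of run-length encoding a character string; the function name and the output format (e.g. "5a9b5c") are illustrative assumptions:

#include <iostream>
#include <string>

std::string RunLengthEncode(const std::string& s)
{
    std::string out;
    for (size_t i = 0; i < s.size(); ) {
        size_t run = 1;
        while (i + run < s.size() && s[i + run] == s[i]) run++;  // count the run
        out += std::to_string(run) + s[i];                       // e.g. "5a"
        i += run;
    }
    return out;
}

int main()
{
    std::cout << RunLengthEncode("aaaaabbbbbbbbbccccc") << "\n";  // prints 5a9b5c
}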


Variable-Length Encoding

Variable-length encoding exploits the fact that some characters in a string are more common than others. So, rather than using, say, a full 8 bits to represent each character, we encode common characters using a smaller number of bits. Consider the string 'the'. Using an ASCII file representation, we might encode 't' as 116, 'h' as 104, and 'e' as 101, or in binary: 1110100 1101000 1100101. However, we might say that as 'e' is most common, followed by 't', we'll encode these as '1' and '10'. If, say, 'h' was '101' then we'd have the shorter encoding: 10 101 1. One problem with this is that we have to include spaces between the codes somehow, in order to tell where one code begins and another ends. If we had an encoding such as 101 we wouldn't know whether this represented 'h' or 'te'. One way to avoid this problem is to avoid using two codes such that one code consists of the first few characters of another (e.g., 101 and 10). An acceptable set of codes can be represented as leaves in a binary tree:



      *
     / \
    *   E
   / \
  T   *
     / \
    H   A

A code for a letter corresponds to the path from the root to the leaf labelled with the letter (0 for a left branch, 1 for a right branch). So, we have T: 00, H: 010, A: 011, E: 1. If we have an encoding such as '000101' we can uniquely determine what it represents and translate back to the uncompressed form. The decoding can even be done efficiently, using the binary tree as a data structure to traverse to find the decoding of a given bit sequence.

To actually construct an optimal tree, we can use data on the frequencies of different characters. A general method for finding optimal codes was developed by D. Huffman, and the coding scheme is referred to as Huffman encoding. The method involves building up a binary tree, bottom up, storing in each node the total frequency count for the characters below that node. Let's say we're only dealing with the letters t, a, e and h, and the frequencies are 10, 5, 15 and 3. We start off by creating leaf nodes for each of these:

t:10    a:5    e:15    h:3

Then we pick the two with lowest frequency, and combine them, creating a new node with a combined frequency count:

 *:8
 / \
h:3 a:5      t:10    e:15

We then repeat this, but now the nodes we consider are the new one, e and t:

   *:18
   /  \
 *:8  t:10      e:15
 / \
h:3 a:5

and finally:

     *:33
     /  \
  *:18  e:15
  /  \
 *:8  t:10
 / \
h:3 a:5

We now have an efficient encoding scheme where the most common letter 'e' has a single bit code (1), and the next most common 't' has a 2 bit code. The code to construct such a tree, and use it in encoding/decoding, is fairly simple, and described in Sedgewick, ch 22. Huffman encoding is worth doing for text files, though you won't get any very dramatic compression (maybe 20-30%). It may also be useful for image files, where (for example) a small range of colours may be much more common than others.


Substitutional Compressors

The basic idea behind a substitutional compressor is to replace an occurrence of a particular phrase or group of bytes in a piece of data with a reference to a previous occurrence of that phrase. There are two main classes of schemes, named after Jacob Ziv and Abraham Lempel, who first proposed them in 1977 and 1978.

The LZ78 family of compressors. LZ78-based schemes work by entering phrases into a *dictionary* and then, when a repeat occurrence of that particular phrase is found, outputting the dictionary index instead of the phrase. The dictionary is constructed as the string is traversed. New strings are only added if they extend a current string in the dictionary. By this means, you will tend only to get long strings in the dictionary if they occur frequently in the text.

The LZ77 family of compressors. LZ77-based schemes keep track of the last n bytes of data seen, and when a phrase is encountered that has already been seen, they output a pair of values corresponding to the position of the phrase in the previously-seen buffer of data, and the length of the phrase. So, we might replace the string 'the cat ate the dog' with 'the cat ate (1,4)dog'.

Substitutional compressors are good for strings with frequently recurring subsequences. So, they are useful for text files, but also for graphics files where a certain texture or colour may be represented by a repeated sequence. Many popular compression and archive programs (e.g., zip, gzip, compress) are based primarily on substitutional methods.


JPEG and MPEG

All the compression methods discussed so far are lossless, which means that you can restore an exact copy of the original. Although vital for texts, for images lossy compression may be acceptable, where some fine detail in the image may be lost. JPEG is a lossy compression method for images where very good compression may be achieved at the expense of losing some detail in the image. JPEG is designed so that the detail lost is normally not perceivable by the human eye. JPEG allows you to vary the degree of ``lossiness''. If you want a very small file size, but can put up with a slightly poor quality image, you can set parameters appropriately.

JPEG is jolly complicated. However, there is one key idea that is worth explaining. The basic idea is that you transform (bits of) the image into frequency space by applying a kind of Fourier transform. Then you throw away some of the details of the high (spatial) frequencies. The result of this is that gradual changes are kept, but fast or abrupt changes may be lost. (If you've never heard of Fourier transforms this probably won't make much sense, but it would take too long to explain.) As an illustration, here's a Piglet before and after it's been compressed with JPEG with quality set very low:

[Images: the Piglet picture at original quality, and after JPEG compression with quality set very low]

The size of the file has been reduced by a factor of 20, but the image quality is poor. With higher quality settings any decrease in quality may be barely perceivable by the human eye, yet the image size may still be reduced by a factor of 5 or so.

MPEG is a lossy compression algorithm for video. Very roughly, it uses JPEG-like compression for a frame, but uses difference information between frames to reduce the information needed for each frame. (Two successive frames may be virtually identical.) Any lossy compression algorithm is useless for text. It's no good saving your essay, only to find your compression program has removed all the 'details'! However, typically lossy algorithms can give greater compression than lossless methods, and for video, audio or graphics the result may not be noticeably poorer than the original.


Summary

We've discussed here four broad categories of compression algorithm. Each tends to be good for different kinds of file. For example, run-length encoding may be good for artificial images, where there may be long 'runs' of one colour. Substitutional methods will be quite good for texts, as will Huffman encoding. Lossy algorithms such as JPEG which lose detail are OK for images/video, but no good for text. All the algorithms discussed result in reasonable compression/decompression times (though we've just focused on the actual formats, not the details of the algorithms). In practice many compression programs combine aspects of each approach. For example, Huffman encoding can be used in combination with any of the other techniques, achieving further compression.



Further Reading

Sedgewick ch 22 for basics. There are a number of web sites/news groups that have more up to date or specialised info, e.g., http://simr02.si.ehu.es/DOCS/mice/compression-faq/top.html.



Cryptography

In the last section we looked at methods to encode a string in order to save space. Here we'll look at methods for encoding a string in order to keep it secret from your enemies! We'll briefly discuss a number of encryption methods, from the simplest, to an outline of current techniques. We'll be focusing on the encoding methods (cryptography) rather than methods of trying to break the codes (cryptanalysis).

• Motivation
• A Basic Outline and Terminology
• Simple Symmetric Methods
• Asymmetric Systems: Public Key CryptoSystems
• Summary
• Exercises
• Further Reading


Motivation

Although traditionally used for military/diplomatic communications, encryption is now becoming particularly important for things like electronic funds transfer over the network. There are, for example, encryption methods being used within WWW browsers such as Netscape, to try to ensure that credit card transfers etc. are secure. There are active arguments about the security of the encryption methods used (the current system has been broken, but only with very significant computational resources).


A Basic Outline and Terminology

Encryption/decryption systems fit into the following framework. The sender (of the secret message) encrypts their message (the plaintext) using an encryption method and a key. The resulting string (the cyphertext) is sent to the receiver, who uses a matching decryption method, and a key, to transform the cyphertext back into the original message.



The aim is to develop encryption methods such that someone can't work out the message given the code (cyphertext) without knowing the key. In practice, for many applications we might be willing to have a system where, in principle, one could decode it without the key, but it would be so very expensive it wouldn't be worth it. (Internet financial applications fit this: people are probably not going to use many computer-years of effort to obtain credit card details which might result in a few hundred pounds gain.)

Encryption methods can be divided into symmetric and asymmetric systems. In symmetric systems the same key is used by both the sender and the receiver. In asymmetric systems the sender uses one key (the public key), and the receiver uses another (the private key). Symmetric systems are simpler, so will be described first, but asymmetric methods are now more important for most applications.


Simple Symmetric Methods

One of the simplest methods, called the ``Caesar cipher'', is to encode the message such that the Nth letter in the alphabet is replaced by the N+Kth. (The key is just K.) So, for K=2:

message: attack
code:    cvvcem

However, this is pretty easy to decode: just try the 26 possibilities until you get one that results in a meaningful message. A better approach is to use a key that gives a mapping from the characters of the message (letters of the alphabet, plus space) to the letters that should replace them, e.g.:

 abcdefghij....
zqjprtoisva...

(so space maps to 'z', 'a' to 'q', 'b' to 'j', and so on)

message: a cab
code:    qzpqj

but this is also pretty easy to break: we could use letter frequency information (e.g., spaces, ts and es are common) to guess at the key. To avoid this we could use a method closer to the first, but with a repeated key to decide on the value of K. This is called the ``Vigenere cipher'' (see the sketch below):

key:     abcabc
message: attack
code:    bvwben
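A minimal sketch of the Vigenere cipher, assuming lower-case letters only and the convention (matching the example above) that key letter 'a' shifts by 1:

#include <cstring>

void VigenereEncrypt(const char* msg, const char* key, char* out)
{
    int klen = strlen(key);
    int i;
    for (i = 0; msg[i] != '\0'; i++) {
        int k = key[i % klen] - 'a' + 1;           // 'a' -> shift 1, 'b' -> 2, ...
        out[i] = 'a' + (msg[i] - 'a' + k) % 26;    // wrap around the alphabet
    }
    out[i] = '\0';
}

// VigenereEncrypt("attack", "abc", out) gives "bvwben", as in the example.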

One can also use a method that combines these two ideas, with an offset depending both on a repeated key and the letter itself. If the key is as long as the message itself (and only used once) then this encryption method is provably secure (called the ``Vernam cipher'' or one-time pad). But there is the problem of distributing the key. It is only really useful in applications where a long key can be distributed ahead of time, in order to allow urgent messages to be safely transmitted at a later date.

For simple encryption methods it is often possible to decode a message if you know (or can guess) part of the message. Perhaps you know that an email message starts with a standard header (From: ... Return-Path ...). Obviously if a simple repeated key encryption method is used, with a short key, then this would allow someone to break it. Speech signals often have large areas of silence. This too may be used to find the key. Many encryption methods therefore involve ``generating'' a long key from a shorter key. So, rather than just repeating a short key, we have some cunning method for creating a longer key out of the shorter one, producing a ``pseudo-key'' that is hard to break, even if some plaintext is known.

If you don't know any of the plaintext, breaking the code will be harder whatever the encryption method: possible strategies are brute force methods (try all keys until one produces something with dictionary words in), and statistical methods (use knowledge of letter/word frequencies). However, these strategies will also tend to be foxed if a long pseudo-key has been generated.


Asymmetric Systems: Public Key CryptoSystems

Symmetric cryptosystems have a problem: how do you transport the secret key from the sender to the recipient securely and in a tamperproof fashion? If you could send the secret key securely, you wouldn't need the symmetric cryptosystem in the first place (because you would simply use that same secure channel to send your message). Frequently, trusted couriers are used as a solution to this problem. Another, more efficient and reliable solution is a public key cryptosystem.

In a public key system, different keys are used to encrypt and decrypt. The receiver has a private key that can be used to decode any message encoded using his public key. But the public key is NOT enough to decode the message. To send a message to someone, you look up their public key (which is, as you might expect, quite public), encode your message with it, and send it. The receiver can decode it using his private key. We can express this semi-formally as follows. If P(M) is the encoding of M using the public key, and S(M) is the decoding using the private key, we want a system where S(P(M)) = M (for all M), but where S cannot be (easily) derived from P.

A method that enables this is RSA public-key encryption. Here, the public and private keys are just pairs of BIG numbers. Let the public key be (N, P) and the private key (N, S). The plain text is broken up and translated into numbers (by whatever method); then encryption is just C = M^P mod N, and decryption is M = C^S mod N. For this to work we want keys such that M^(P*S) mod N = M for all M (and such that S can't be derived easily from P). It turns out that if we generate 3 random prime numbers x, y and z, make S the largest (say z), N the product of x and y, and P a number such that P*S mod (x-1)(y-1) = 1, then we have suitable keys. (You can try this out with small primes, as in the sketch below, but really we'd use prime numbers with hundreds of digits!) Finding suitable keys (and trying to break codes!) using this system requires number theory algorithms, such as algorithms for finding greatest common divisors, large primes, prime factors etc.
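As a worked toy example (purely illustrative, with tiny primes): take x = 5, y = 11 and z = S = 23, so N = 55 and (x-1)(y-1) = 40; P = 7 is suitable since 7*23 mod 40 = 161 mod 40 = 1. Encrypting M = 2 gives C = 2^7 mod 55 = 18, and decrypting gives 18^23 mod 55 = 2, recovering the message:

#include <iostream>

long ModExp(long base, long exp, long mod)   // compute base^exp mod mod
{
    long result = 1;
    base %= mod;
    while (exp > 0) {
        if (exp & 1) result = (result * base) % mod;
        base = (base * base) % mod;          // square-and-multiply
        exp >>= 1;
    }
    return result;
}

int main()
{
    long N = 55, P = 7, S = 23, M = 2;
    long C = ModExp(M, P, N);                // encrypt: 2^7 mod 55 = 18
    long D = ModExp(C, S, N);                // decrypt: 18^23 mod 55 = 2
    std::cout << "cyphertext " << C << ", decrypted " << D << "\n";
}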



Encryption and decryption using this method is relatively time-consuming compared with other methods. It is therefore common to use public key encryption just to encode a session key that is passed with the message to the receiver, and to encode the message itself using simpler methods (using the session key). Netscape (a WWW browser) uses RSA public-key encryption to encode session keys that are then used in less costly symmetric encryption algorithms. If, say, a bookstore wanted to set things up so that people could securely send orders including their credit card details and address, they would get hold of suitable public and private keys, then tell their customers their public key (or rather, incorporate the key in the electronic order form so that it is automatically used at the client end to encode messages sent back with the form). Now, if a client sends a message using that public key, that message can (in principle) only be read by the bookstore, so only the store has the credit card details.


Summary

Encryption/decryption methods can be divided into symmetric and asymmetric, depending on whether the same key is used for encryption and decryption. There are many simple symmetric methods, but most are liable to attack using knowledge of letter/word frequencies, or given some fragment of known plaintext. More complex methods avoid this by generating from a given key a pseudo-key which looks sufficiently random to defeat statistical tricks, and which ``spreads'' the key throughout the text. Asymmetric (public key) methods work by letting the receiver tell the sender a key which can be used to encrypt the message, while keeping a separate key for decryption. This method can also make encryption transparent to the user: the system they use can look up the receiver's public key and encode the message without the user ever even knowing what a key or encryption is.


Exercises

• Implement a simple repeated-key encryption scheme.
• Implement a decryption method that will work if you know some initial plaintext at least as long as the key.
• Suggest a way of generating a long pseudo-key that will foil your simple scheme.




Further Reading

Sedgewick (Ch 23) has a reasonable introduction. More up-to-date material is easily available on the WWW. The details of commercial and government encryption methods are often hard to find out about, for understandable reasons.



Geometric Algorithms

• Motivation
• Representing Points, Lines and Polygons
• Line Intersection
• Inclusion in a Polygon
• Finding the Convex Hull
• Range Searching



Motivation

Many computer applications involve manipulating `geometric' objects: points, lines, polygons, cubes and so on. Obvious examples are drawing packages and computer-aided design (CAD) tools, tools that involve manipulating maps, and even just graphical user interfaces (GUIs). Geometric problems are often easy for humans, but it is hard to find good algorithmic solutions for them. As an example, given a polygon defined by its vertices, it is easy for humans to: decide whether a given point falls within the polygon; and find the smallest convex polygon (the `convex hull') that encloses the vertices (see Figure 6.1).

Figure 6.1: Polygon with point inside (A) and outside (B), and `convex hull'

Finding out whether a given point falls within a specified polygon is obviously essential for drawing packages - for example, most of them will allow you to create an arbitrary polygon and have it `filled' with a given colour. Finding the convex hull surrounding a set of points could be useful when working out the shortest boundary enclosing some set of objects (e.g., a fence around a number of trees). In this course I'll only give a very brief introduction to geometric algorithms, roughly corresponding to chapter 24 in Sedgewick. You should, at least, be aware of this category of algorithms, so you know where to look things up if you need them in your programming and problem solving.



Representing Points, Lines and Polygons

Before discussing the algorithms, I'll give data structures for points, lines and polygons:

    struct point { int x, y; };
    struct line  { point p1, p2; };
    point polygon[NMax];

So, the first point on a line l would be l.p1; the y coordinate of that point would be l.p1.y; the second point in a polygon poly would be poly[1]; and its x coordinate poly[1].x. We will require that the end point in a polygon is the same as the first point, so it is possible to know where the polygon array ends.
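A quick sketch (mine, not from the notes) showing the representation in use; NMax is an assumed bound:

    #include <iostream>

    const int NMax = 100;

    struct point { int x, y; };
    struct line  { point p1, p2; };

    int main() {
        // A triangle, closed by repeating the first point as required above.
        point poly[NMax] = { {0, 0}, {4, 0}, {2, 3}, {0, 0} };
        line l = { poly[0], poly[1] };   // the polygon's first edge
        std::cout << "first edge ends at (" << l.p2.x << "," << l.p2.y << ")\n";
        return 0;
    }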



Line Intersection

Given this representation, how do we find out whether two line segments intersect - i.e., whether they actually cross, rather than merely lying near one another? (The original notes illustrate this with a small ASCII sketch of one pair of segments that cross and one pair that do not.)

One way is basically to use school maths (y = mx + c etc.) to find the equations of the lines, then the intersection point of those lines, and then find out whether that intersection point lies `within' each segment. Another approach, given in Sedgewick, which introduces ideas useful in other algorithms, is as follows. First define a function that, given three points, tells you whether you turn clockwise or counter-clockwise when travelling from the first to the second to the third. Call this CCW: it returns true if the turn is counter-clockwise. Now, two lines l1 and l2 intersect if

    (CCW(l1.p1, l1.p2, l2.p1) != CCW(l1.p1, l1.p2, l2.p2)) AND
    (CCW(l2.p1, l2.p2, l1.p1) != CCW(l2.p1, l2.p2, l1.p2))

Basically, this just means that the two endpoints of each line are on different `sides' of the other line. The CCW function can be defined fairly straightforwardly, by comparing the gradient from point 1 to point 2 with the gradient from point 2 to point 3 (cross-multiplying to avoid division). (The implementation must cope with the fact that the slopes may be infinite, or equal, but that's just a minor fiddle - see Sedgewick.) The following gives what might be an initial attempt at both ccw and intersect. I'll leave it as an exercise for the reader to work out how to make it deal with infinite slopes (i.e., dx = 0), and the case where two lines are collinear.

    int ccw(point p1, point p2, point p3)
    // Slightly deficient function to determine whether the path from p1
    // to p2 to p3 turns in a counter-clockwise direction
    {
        int dx1, dx2, dy1, dy2;
        dx1 = p2.x - p1.x;  dy1 = p2.y - p1.y;
        dx2 = p3.x - p2.x;  dy2 = p3.y - p2.y;
        if (dy1*dx2 < dy2*dx1) return 1;   // compare slopes by cross-multiplying
        else return 0;
    }

    int intersect(line l1, line l2)
    {
        return ((ccw(l1.p1, l1.p2, l2.p1) != ccw(l1.p1, l1.p2, l2.p2))
             && (ccw(l2.p1, l2.p2, l1.p1) != ccw(l2.p1, l2.p2, l1.p2)));
    }
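A quick sanity check of the two functions (a sketch of my own; it assumes the point, line, ccw and intersect definitions above are in scope):

    #include <iostream>

    int main() {
        line a = { {0, 0}, {4, 4} };   // rising diagonal
        line b = { {0, 4}, {4, 0} };   // falling diagonal, crossing a
        line c = { {5, 1}, {6, 9} };   // entirely to one side of a
        std::cout << intersect(a, b) << "\n";   // prints 1
        std::cout << intersect(a, c) << "\n";   // prints 0
        return 0;
    }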



Inclusion in a Polygon

A basic approach to finding whether a given point is within a polygon is as follows. Draw a line out from that point, and count how many times it intersects edges of the polygon. If the count is odd, the point must be within the polygon. See Figure 6.2.

Figure 6.2: Determining if a point is included in a polygon

This is slightly complicated by the fact that the line might pass through a vertex of the polygon. If so, it may sometimes count as two intersections and sometimes as one, as illustrated by the two such examples in the figure. Ignoring that complication, the algorithm to determine inclusion is the following. `n' is the number of points in the polygon, with poly[n] equal to poly[0] as required above.

    #include <climits>    // for INT_MAX

    int inside(point p, point poly[], int n)
    {
        int i, count = 0;
        line lt, lp;
        lt.p1 = p; lt.p2 = p;
        lt.p2.x = INT_MAX;            // horizontal line out from p
        for (i = 0; i < n; i++) {
            lp.p1 = poly[i];          // one edge of the polygon
            lp.p2 = poly[i + 1];
            if (intersect(lp, lt)) count++;
        }
        return ((count % 2) != 0);    // odd number of crossings => inside
    }

To handle the case where our line passes through a vertex, we need to check whether the two polygon points either side of that vertex are on the same `side' of the test line. (Convince yourself that this is right.) Implementation details are left to the reader, or can be looked up in Sedgewick.



Finding the Convex Hull

The convex hull of a set of points is really its natural boundary or enclosure, as illustrated in the figure. It is defined to be the smallest convex polygon enclosing all the points. (Convex has its usual meaning: one way to describe a convex polygon is that any line connecting two points within the polygon also falls entirely within the polygon. Another is that, as you follow round the polygon, each line segment bends in the same (clockwise or anticlockwise) direction.)

Figure 6.3: Convex hulls

A function for finding a convex hull takes an array of points, and returns a polygon (another array of points, a subset of the points in the first array). Sometimes it may be easier to modify the original point array rather than returning a new one. The most natural algorithm for finding the convex hull is referred to as package wrapping. Starting with some point guaranteed to be on the hull (say, the one with the smallest y coordinate), take a horizontal line in the positive direction and sweep it round until it hits a point. This point is guaranteed to be on the hull. Record it, anchor the line at this point, and continue sweeping until another point is hit. Continue in this way until the `package' is fully `wrapped', as illustrated.

Figure 6.4: `Package Wrapping' a Convex Hull

So, how do we find the next point on the hull, assuming we have a suitable line which is to be `wrapped' round to it? Basically, the next point is the one for which the angle between the current line and the line from the last hull point to the candidate point is a minimum. (This angle is labelled theta on the diagram.) If we have a function theta that takes two points and returns the angle between the associated line and the horizontal, this results in the following algorithm. It assumes a helper swap(p, i, j) that exchanges p[i] and p[j], and that the array p has room for a sentinel in position N.

    int wrap(point p[], int N)
    // Package wrapping: on return, the first M positions of p hold the
    // hull points in order; the function returns M.
    {
        int i, min = 0, M;
        float th, v;    // v is the current `sweep' angle;
                        // th keeps track of the minimum angle found

        for (i = 1; i < N; i++)                  // find the starting point:
            if (p[i].y < p[min].y) min = i;      // lowest y is on the hull
        th = 0.0;
        p[N] = p[min];                           // sentinel: stop when we
                                                 // wrap back to the start
        for (M = 0; M < N; M++) {                // find the M'th hull point
            swap(p, M, min);                     // move it into position M
            min = N; v = th; th = 360.0;         // reset sweep angle, min, th
            for (i = M + 1; i <= N; i++)
                // if the angle to this point exceeds the current sweep
                // angle, and is the lowest such angle found so far, record it
                if ((theta(p[M], p[i]) > v) && (theta(p[M], p[i]) < th)) {
                    min = i;
                    th = theta(p[M], p[i]);
                }
            if (min == N) return M;              // back at the start: done
        }
        return N;
    }
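The notes leave theta undefined. It need not return the true angle: following Sedgewick, a cheaper `pseudo-angle' that increases monotonically with the real angle is enough, since wrap() only ever compares angles. A sketch (the guard for coincident points is my own addition):

    #include <cstdlib>   // for abs

    // Pseudo-angle of the line p1->p2: a value in [0, 360) that increases
    // with the true angle from the horizontal, computed without trig.
    float theta(point p1, point p2) {
        int dx = p2.x - p1.x, dy = p2.y - p1.y;
        if (dx == 0 && dy == 0) return 0.0;        // coincident points
        float t = (float)dy / (abs(dx) + abs(dy)); // in [-1, 1]
        if (dx < 0) t = 2 - t;                     // angles between 90 and 270
        else if (dy < 0) t = 4 + t;                // angles between 270 and 360
        return t * 90.0;                           // scale to [0, 360)
    }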



Some steps of the above may not be obvious. As points on the hull are found, they are swapped with an actual point in the array, so at any stage the array contains the first M points of the hull followed by the rest of the points. So that the algorithm can exit when the hull is complete, the starting point of the hull is also placed in the N'th position of the array; when the point just found is that sentinel, we exit. The algorithm should look vaguely like the sort algorithms you have already met. This is not surprising, as we are basically sorting a bunch of points according to their position around the hull. Like other simplistic sort procedures (e.g., selection sort), this algorithm doesn't have very good worst case complexity: O(N^2). Other methods for finding the convex hull can have O(N log N) complexity, and have analogues among the other sorting methods.



Range Searching

As a final example of geometric algorithms, consider the following problem. You have a database of objects, each with certain attributes (e.g., age, mileage, mpg, price!). You want to pick out from that database all those items that satisfy some criteria (e.g., price < £3000, age < 4). Finding all the objects that satisfy range restrictions on attributes is called range searching, and is clearly an important practical problem. If we have only ONE attribute to consider (e.g., 2 < age < 4), then an efficient technique is to represent the data as a binary search tree, and do a binary search on the end points of the range.

This requires pre-processing the data to create the tree, but once that is done searches are O(R + log N), where N is the number of items in the database and R is the number of items falling in the range. The goal of range searching algorithms is to achieve similar efficiency for ``multidimensional'' range searching involving several attributes. We'll just consider two attributes/dimensions. In geometric terms we are looking for points (items) that lie inside a rectangle (e.g., capturing 3 < age < 6; 2000 < price < 3000). The simplest thing to do is to examine every item in the database, checking whether each ``point'' lies in the given rectangle (giving O(N) complexity). However, there are more efficient methods, some of which involve some pre-processing of the data. The simplest extension is to first find all the items for which one criterion (e.g., age) is satisfied, and only check the other criterion for those items; but this won't give us much gain in efficiency. A better method is to pre-process the data by dividing the range into a grid of small squares - one square might be 0 < age < 1, 0 < price < 1000. You record which points lie in each grid square. Then, given a particular query, you can identify the grid squares it covers and rapidly retrieve the relevant items. If the size of the squares is carefully chosen, this method gives good efficiency. Another method is to use ``two dimensional trees''. These are like ordinary binary trees, but alternate between the two attributes: the first division might be between age < 5 and 5 < age, and at the next level, for the first branch, a division price < 3000 vs 3000 < price. If these trees are carefully constructed then the complexity for searches is O(R + log N). Both the grid method and the tree method generalise to more than two dimensions/attributes.
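A minimal sketch of such a two dimensional tree (my own; the Item type, field names and test data are invented for illustration). Nodes at even depths split on the first attribute, nodes at odd depths on the second; a query visits a subtree only if qualifying points could lie there:

    #include <iostream>
    #include <vector>

    struct Item { int age, price; };

    struct Node { Item item; Node *left, *right; };

    // Insert, splitting on age at even depths and price at odd depths.
    Node* insert(Node* t, Item it, int depth = 0) {
        if (!t) return new Node{it, nullptr, nullptr};
        bool goLeft = (depth % 2 == 0) ? (it.age < t->item.age)
                                       : (it.price < t->item.price);
        if (goLeft) t->left = insert(t->left, it, depth + 1);
        else        t->right = insert(t->right, it, depth + 1);
        return t;
    }

    // Report every item inside the query rectangle, pruning subtrees
    // that cannot contain qualifying points.
    void search(Node* t, int loAge, int hiAge, int loPrice, int hiPrice,
                std::vector<Item>& out, int depth = 0) {
        if (!t) return;
        const Item& it = t->item;
        if (loAge <= it.age && it.age <= hiAge &&
            loPrice <= it.price && it.price <= hiPrice)
            out.push_back(it);
        int key = (depth % 2 == 0) ? it.age : it.price;
        int lo  = (depth % 2 == 0) ? loAge  : loPrice;
        int hi  = (depth % 2 == 0) ? hiAge  : hiPrice;
        if (lo < key)                  // left subtree holds keys < key
            search(t->left, loAge, hiAge, loPrice, hiPrice, out, depth + 1);
        if (key <= hi)                 // right subtree holds keys >= key
            search(t->right, loAge, hiAge, loPrice, hiPrice, out, depth + 1);
    }

    int main() {
        Node* root = nullptr;
        for (Item it : { Item{4, 2500}, Item{2, 3500},
                         Item{6, 1800}, Item{3, 2900} })
            root = insert(root, it);
        std::vector<Item> hits;
        search(root, 3, 6, 2000, 3000, hits);  // 3<=age<=6, 2000<=price<=3000
        for (const Item& it : hits)
            std::cout << "age " << it.age << " price " << it.price << "\n";
        return 0;
    }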



General Methods for Developing Algorithms

So far, in discussing algorithms, we've focused on particular algorithms for particular tasks. But how does that help us if we need to develop a new algorithm for a new type of problem that cannot be mapped to an existing method? There are various general methods that can be adopted, and some of these are briefly discussed in this section. They provide ways of solving problems where there are many possible ``configurations'' of something, each of which could possibly be a solution, but we want to find a solution fast. Sorting and searching algorithms can be viewed in these terms (including graph search and range search). This material is based on the book ``Understanding Algorithms and Data Structures'' by Brunskill and Turner, McGraw-Hill, 1996; extracts of this book will be handed out.

• Brute Strength Method
• Divide and Conquer
• Greedy Algorithms
• Dynamic Programming
• Genetic Algorithms



Brute Strength Method

The brute strength method is based on going through all possible answers to a problem and choosing the best one. It isn't a terribly clever or efficient method. As an extreme example of a brute strength approach, consider sorting: a brute strength method might try out all possible orderings of the given items until it found one that had the items in the right order. Simple searching algorithms which examine all items in turn can also be viewed as brute strength algorithms. And breadth first and depth first search algorithms can be viewed as brute strength, as you are going through every possible path in turn until you find the item you want. While it isn't a terribly good approach for most problems, it is often possible to find a brute strength approach to a problem, and for some hard problems it may be the best you can manage. More information is in the handout. A sketch of the brute strength sort just described follows.
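This sketch is my own illustration, not from the handout. std::next_permutation steps through the permutations in lexicographic order and wraps round to the smallest (i.e., sorted) arrangement, so the loop needs at most N! steps:

    #include <algorithm>
    #include <iostream>
    #include <vector>

    // Brute strength sorting: march through permutations of the items
    // until we reach the one that is in order. O(N! * N) in the worst case.
    void bruteSort(std::vector<int>& v) {
        while (!std::is_sorted(v.begin(), v.end()))
            std::next_permutation(v.begin(), v.end());
    }

    int main() {
        std::vector<int> v = {3, 1, 4, 1, 5};
        bruteSort(v);
        for (int x : v) std::cout << x << " ";   // 1 1 3 4 5
        std::cout << "\n";
        return 0;
    }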



Divide and Conquer Divide an conquer algorithms divide a problem into parts, solve each of the sub problems, and then combine the subsolutions in some way to find the final answer. The subproblems may be solved in a similar manner. Quick sort and merge sort are both examples of divide and conquer.



Greedy Algorithms

Greedy algorithms formulate a problem so that, at each step, there are a number of possible options, and so that if the BEST available option is chosen at each step then the problem will eventually be solved. If a problem can be set up in such a way then the actual problem solving is straightforward. ``Heuristic'' algorithms are similar, but you can't be sure that what appears to be the best option will actually lead you to the best solution. So there is still some way to ``intelligently'' choose among the options, which may be better than trying them all, but it may not lead to the best solution. A small greedy example follows.
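A classic greedy example (my own illustration, not from the handout): making change with the fewest coins by always taking the largest coin that still fits. For canonical coin systems like the one below this greedy choice happens to be optimal; for arbitrary coin sets it is merely a heuristic, illustrating the distinction drawn above.

    #include <iostream>
    #include <vector>

    // Greedy change-making: repeatedly take the largest coin not
    // exceeding the remaining amount. Coins assumed sorted, largest first.
    std::vector<int> makeChange(int amount, const std::vector<int>& coins) {
        std::vector<int> used;
        for (int c : coins)
            while (amount >= c) { amount -= c; used.push_back(c); }
        return used;
    }

    int main() {
        for (int c : makeChange(67, {50, 20, 10, 5, 2, 1}))
            std::cout << c << " ";       // 50 10 5 2
        std::cout << "\n";
        return 0;
    }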



Dynamic Programming

The principle of dynamic programming is analogous to `divide and conquer'; in fact, dynamic programming can be viewed as that principle taken to extremes. If it is not possible to work out exactly how to divide a problem into smaller problems, it is sometimes worth solving ALL the smaller problems, and storing the solutions away to be combined and used when solving the larger problem. (This idea is illustrated to some extent by the shortest path algorithm we looked at: a table of the shortest distances found so far to each node is maintained, and used when working out new shortest paths.) The basic idea of dynamic programming can be illustrated by looking at the `knapsack' problem. In this problem, a thief robbing a safe finds it filled with N kinds of item of various size and value, but his bag has only limited capacity (M). How does he choose which items to take to maximise the value of his loot? As an example, perhaps there are four objects in the safe, of the following sizes/values: size 12, value 20; size 5, value 8; size 3, value 10; size 6, value 10. If his bag's capacity is 12, he should clearly go for the first object and scrap the rest. But if his bag has capacity 15, he should leave the big object and go for the smaller ones. This problem is clearly of some practical relevance, for example in real packing (e.g., shipping) applications. In a dynamic programming approach we do a rather non-intuitive thing: we calculate the best combination of objects for ALL knapsack sizes up to M. This can be done very efficiently, as illustrated in the following code (which assumes that a potentially unlimited number of items of each given size/value is available, these sizes/values being stored in arrays):

    for (j = 1; j <= N; j++)          // go through each kind of item
        for (i = 1; i <= M; i++)      // consider each size of knapsack
            if (i >= size[j])         // item j fits in a bag of size i
                if (cost[i] < cost[i - size[j]] + value[j]) {
                    cost[i] = cost[i - size[j]] + value[j];
                    best[i] = j;
                }

cost[i] is the highest value that can be achieved with a knapsack of capacity i, and is initialised to zero; best[i] is the last item that was added to achieve that maximum. First we calculate the best we can do using only objects of type 1 (j = 1). Then we calculate the best considering items of types 1 and 2 (using our result for just type 1). And so on. The revised calculation of cost when considering a new item is very simple. cost[i - size[j]] tells us the value we could achieve with the kinds of item considered so far IF we left enough room for one item of the new kind j. If that value PLUS the value of the new item is greater than the old cost[i], then replacing the old item(s) with the new one is a good idea, and cost (and best) are updated to reflect this. (We can see parallels with shortest path algorithms.) As an example, suppose we have the following items, and are looking to fill a bag of size 4:

    Item    1  2  3
    Size    3  2  1
    Value   6  5  2

Initially we just consider item 1. For a bag of size i = 1 or 2 the item won't fit. For a bag of size 3, we can fit one item, and as cost[3] < cost[0] + value[1], it is added; so cost[3] is now 6 and best[3] = 1. For a bag of size 4, we check whether cost[4] < cost[1] + value[1] - this is true, so for this size too we have item 1 in the bag, and cost[4] = 6, best[4] = 1.



Now we consider item 2. For bag size 1, it won't fit. For bag size 2 we check whether cost[2] < cost[0] + value[2]. This is clearly the case (cost[2] is zero, as item 1 wouldn't fit), so we make item 2 the `last' (and only) item in this bag, and cost[2] = 5. However, for size 3 we could already fit a type 1 object - so when we check whether cost[3] < cost[1] + value[2], the answer is no. For size 4 we check whether cost[4] < cost[2] + value[2]. Now cost[4] is 6 and cost[2] has just been calculated as 5, so the check succeeds, which means it is worth throwing out the previous contents of bag 4 and making the top item item 2 (assuming the rest of the bag is the same as what is `best' for bags of size 2). So now we have: cost[0] = 0; cost[1] = 0; cost[2] = 5; cost[3] = 6; cost[4] = 10. Now consider item 3. Item 3 fits in a bag of size 1, so cost[1] = 2. It is not worth using for bags of size 2, but for a bag of size 3 we have cost[3] < cost[2] + value[3], so the top item in this bag becomes item 3 (and we assume the rest is the best we can get for a bag of size 2). For a bag of size 4 the change is not worth making, so at the end of the day we have: cost[0] = 0; cost[1] = 2; cost[2] = 5; cost[3] = 7; cost[4] = 10. We also have best[4] = 2; best[3] = 3; best[2] = 2; best[1] = 3. It turns out that when the calculation is complete we can find the contents of the knapsack using the best array. Suppose best[M] is item k, of size size[k]. Then the next item in the bag must be the last item in a bag of size M - size[k], and so on. So, in the above case, best[4] = 2, so we have an item 2 in the bag; best[4 - size[2]] = best[2] = 2, so we have another item of type 2 in the bag. All this can be proved correct by induction: we show that if the method gives the right result considering N kinds of item, it gives the right result for N+1 kinds, and check that it is correct for one kind; similarly we check that, for N kinds, if it gives the right result for a bag of size M it gives the right result for size M+1. I'll leave that as an exercise for the reader. This approach to the knapsack problem takes N x M steps, which isn't bad. Try thinking of other solutions to the problem. However, the approach does have a limitation - it only works for bags whose size can be represented as an integer (or where this is an acceptable approximation). The more fine-grained a result we want, in terms of bag sizes, the more partial solutions we need to calculate and store for the smaller sizes. For dynamic programming it is necessary to think hard about the time and space requirements of the approach. However, the general idea, of solving simpler cases and then using the results to solve the harder case, is clearly a very widely applicable technique, and should at least be considered as a possible problem solving technique for all sorts of hard problems.
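The walkthrough above can be checked mechanically. A complete minimal program (my own sketch, using the same 1-based conventions as the code and example above):

    #include <iostream>

    int main() {
        const int N = 3, M = 4;                 // 3 item kinds, bag capacity 4
        int size[N + 1]  = {0, 3, 2, 1};        // entry 0 unused (1-based)
        int value[N + 1] = {0, 6, 5, 2};
        int cost[M + 1] = {0}, best[M + 1] = {0};

        for (int j = 1; j <= N; j++)            // each kind of item
            for (int i = 1; i <= M; i++)        // each size of knapsack
                if (i >= size[j] && cost[i] < cost[i - size[j]] + value[j]) {
                    cost[i] = cost[i - size[j]] + value[j];
                    best[i] = j;
                }

        std::cout << "best value: " << cost[M] << "\n";    // prints 10
        for (int i = M; i > 0 && best[i] != 0; i -= size[best[i]])
            std::cout << "take item " << best[i] << "\n";  // item 2, twice
        return 0;
    }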



Genetic Algorithms

Genetic algorithms are increasingly used. Here the principle is that you have some ``population'' of possible solutions/configurations, some way of scoring them according to how good they are, and some way of combining two of them into a new possible solution. At each step of the algorithm, the best few of the population of candidate solutions are selected and combined to create new possibilities, and these replace the worst in the current population. This can sometimes result in a good solution being found in quite a short space of time. A minimal sketch of this loop follows.
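This sketch is my own; the bit-string encoding, the scoring function and all the parameters are arbitrary illustrative choices. It evolves bit strings towards the all-ones string:

    #include <algorithm>
    #include <bitset>
    #include <cstdlib>
    #include <iostream>
    #include <vector>

    const int BITS = 16, POP = 20, STEPS = 200;
    typedef std::bitset<BITS> Genome;

    // Toy scoring function: the number of 1 bits. Any problem whose
    // candidate solutions can be encoded as bit strings could be plugged in.
    int score(const Genome& g) { return (int)g.count(); }

    // Combine two parents: take each bit from one or the other at random,
    // with an occasional random mutation.
    Genome crossover(const Genome& a, const Genome& b) {
        Genome child;
        for (int i = 0; i < BITS; i++)
            child[i] = (std::rand() % 2) ? a[i] : b[i];
        if (std::rand() % 4 == 0) child.flip(std::rand() % BITS);
        return child;
    }

    int main() {
        std::srand(42);
        std::vector<Genome> pop(POP);
        for (Genome& g : pop)                     // random initial population
            for (int i = 0; i < BITS; i++) g[i] = std::rand() % 2;
        for (int step = 0; step < STEPS; step++) {
            // keep the fittest first and the weakest last
            std::sort(pop.begin(), pop.end(),
                      [](const Genome& a, const Genome& b)
                      { return score(a) > score(b); });
            // breed from the best two, replacing the worst
            pop.back() = crossover(pop[0], pop[1]);
        }
        std::cout << "best score " << score(pop[0]) << " of " << BITS << "\n";
        return 0;
    }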




