Data Structure Algorithms

Page 1

Data Structures and Algorithms

DSA

Referenc
Barne Luca Del Tongo
Ann
Granville

DataStructuresandAlgorithms: AnnotatedReferencewithExamples

FirstEdition

Copyright c GranvilleBarnett,andLucaDelTongo2008.

ThisbookismadeexclusivelyavailablefromDotNetSlackers (http://dotnetslackers.com/) the placefor.NETarticles,andnewsfrom someoftheleadingmindsinthesoftwareindustry.

Contents 1Introduction1 1.1Whatthisbookis,andwhatitisn’t................ 1 1.2Assumedknowledge......................... 1 1.2.1BigOhnotation....................... 1 1.2.2Imperativeprogramminglanguage............. 3 1.2.3Objectorientedconcepts.................. 4 1.3Pseudocode.............................. 4 1.4Tipsforworkingthroughtheexamples............... 6 1.5Bookoutline............................. 6 1.6Testing................................ 7 1.7WherecanIgetthecode?...................... 7 1.8Finalmessages............................ 7 IDataStructures8 2LinkedLists9 2.1SinglyLinkedList.......................... 9 2.1.1Insertion............................ 10 2.1.2Searching........................... 10 2.1.3Deletion............................ 11 2.1.4Traversingthelist...................... 12 2.1.5Traversingthelistinreverseorder............. 13 2.2DoublyLinkedList.......................... 13 2.2.1Insertion............................ 15 2.2.2Deletion............................ 15 2.2.3ReverseTraversal....................... 16 2.3Summary............................... 17 3BinarySearchTree19 3.1Insertion................................ 20 3.2Searching............................... 21 3.3Deletion................................ 22 3.4Findingtheparentofagivennode................. 24 3.5Attainingareferencetoanode................... 24 3.6Findingthesmallestandlargestvaluesinthebinarysearchtree 25 3.7TreeTraversals............................ 26 3.7.1Preorder............................ 26 I
3.7.2Postorder........................... 26 3.7.3Inorder............................ 29 3.7.4BreadthFirst......................... 30 3.8Summary............................... 31 4Heap32 4.1Insertion................................ 33 4.2Deletion................................ 37 4.3Searching............................... 38 4.4Traversal............................... 41 4.5Summary............................... 42 5Sets44 5.1Unordered............................... 46 5.1.1Insertion............................ 46 5.2Ordered................................ 47 5.3Summary............................... 47 6Queues48 6.1Astandardqueue........................... 49 6.2PriorityQueue............................ 49 6.3DoubleEndedQueue......................... 49 6.4Summary............................... 53 7AVLTree54 7.1TreeRotations............................ 56 7.2TreeRebalancing........................... 57 7.3Insertion................................ 58 7.4Deletion................................ 59 7.5Summary............................... 61 IIAlgorithms62 8Sorting63 8.1BubbleSort.............................. 63 8.2MergeSort.............................. 63 8.3QuickSort............................... 65 8.4InsertionSort............................. 67 8.5ShellSort............................... 68 8.6RadixSort.............................. 68 8.7Summary............................... 70 9Numeric72 9.1PrimalityTest............................ 72 9.2Baseconversions........................... 72 9.3Attainingthegreatestcommondenominatoroftwonumbers.. 73 9.4Computingthemaximumvalueforanumberofaspecificbase consistingofNdigits......................... 74 9.5Factorialofanumber........................ 74 9.6Summary............................... 75 II
10Searching76 10.1SequentialSearch........................... 76 10.2ProbabilitySearch.......................... 76 10.3Summary............................... 77 11Strings79 11.1Reversingtheorderofwordsinasentence............. 79 11.2Detectingapalindrome....................... 80 11.3Countingthenumberofwordsinastring............. 81 11.4Determiningthenumberofrepeatedwordswithinastring.... 83 11.5Determiningthefirstmatchingcharacterbetweentwostrings.. 84 11.6Summary............................... 85 AAlgorithmWalkthrough86 A.1Iterativealgorithms......................... 86 A.2RecursiveAlgorithms......................... 88 A.3Summary............................... 90 BTranslationWalkthrough91 B.1Summary............................... 92 CRecursiveVs.IterativeSolutions93 C.1ActivationRecords.......................... 94 C.2Someproblemsarerecursiveinnature............... 95 C.3Summary............................... 95 DTesting97 D.1Whatconstitutesaunittest?.................... 97 D.2WhenshouldIwritemytests?................... 98 D.3HowseriouslyshouldIviewmytestsuite?............. 99 D.4ThethreeA’s............................. 99 D.5Thestructuringoftests....................... 99 D.6CodeCoverage............................ 100 D.7Summary............................... 100 ESymbolDefinitions101 III

Preface

Everybookhasastoryastohowitcameaboutandthisoneisnodifferent, althoughwewouldbelyingifwesaiditsdevelopmenthadnotbeensomewhat impromptu.Putsimplythisbookistheresultofaseriesofemailssentback andforthbetweenthetwoauthorsduringthedevelopmentofalibraryfor the.NETframeworkofthesamename(withtheomissionofthesubtitleof course!).Theconversationstartedoffsomethinglike,“Whydon’twecreate amoreaestheticallypleasingwaytopresentourpseudocode?”Afterafew weeksthisnewpresentationstylehadinfactgrownintopseudocodelistings withchunksoftextdescribinghowthedatastructureoralgorithminquestion worksandvariousotherthingsaboutit.Atthispointwethought,“Whatthe heck,let’smakethisthingintoabook!”Andso,inthesummerof2008we beganworkonthisbooksidebysidewiththeactuallibraryimplementation.

Whenwestartedwritingthisbooktheonlythingsthatweweresureabout withrespecttohowthebookshouldbestructuredwere:

1. alwaysmakeexplanationsassimpleaspossiblewhilemaintainingamoderatelyfinedegreeofprecisiontokeepthemoreeagermindedreaderhappy; and

2. injectdiagramstodemystifyproblemsthatareevenmoderatlychallenging tovisualise(...andsowecouldrememberhowourownalgorithmsworked whenlookingbackatthem!);andfinally

3. presentconciseandself-explanatorypseudocodelistingsthatcanbeported easilytomostmainstreamimperativeprogramminglanguageslikeC++, C#,andJava.

Akeyfactorofthisbookanditsassociatedimplementationsisthatall algorithms(unlessotherwisestated)weredesignedbyus,usingthetheoryof thealgorithminquestionasaguideline(forwhichweareeternallygratefulto theiroriginalcreators).Thereforetheymaysometimesturnouttobeworse thanthe“normal”implementations—andsometimesnot.Wearetwofellows oftheopinionthatchoiceisagreatthing.Readourbook,readseveralothers onthesamesubjectandusewhatyouseefitfromeach(ifanything)when implementingyourownversionofthealgorithmsinquestion.

Throughthisbookwehopethatyouwillseetheabsolutenecessityofunderstandingwhichdatastructureoralgorithmtouseforacertainscenario.Inall projects,especiallythosethatareconcernedwithperformance(hereweapply anevengreateremphasisonreal-timesystems)theselectionofthewrongdata structureoralgorithmcanbethecauseofagreatdealofperformancepain.

IV

Thereforeitisabsolutelykeythatyouthinkabouttheruntimecomplexityand spacerequirementsofyourselectedapproach.Inthisbookweonlyexplainthe theoreticalimplicationstoconsider,butthisisforagoodreason:compilersare verydifferentinhowtheywork.OneC++compilermayhavesomeamazing optimisationphasesspecificallytargetedatrecursion,anothermaynot,forexample.Ofcoursethisisjustanexamplebutyouwouldbesurprisedbyhow manysubtledifferencestherearebetweencompilers.Thesedifferenceswhich maymakeafastalgorithmslow,andviceversa.Wecouldalsofactorinthe sameconcernsaboutlanguagesthattargetvirtualmachines,leavingallthe actualvariousimplementationissuestoyougiventhatyouwillknowyourlanguage’scompilermuchbetterthanus...wellinmostcases.Thishasresultedin amoreconcisebookthatfocusesonwhatwethinkarethekeyissues.

Onefinalnote:nevertakethewordsofothersasgospel;verifyallthatcan befeasiblyverifiedandmakeupyourownmind.

Wehopeyouenjoyreadingthisbookasmuchaswehaveenjoyedwritingit.

GranvilleBarnett

LucaDelTongo

V

Acknowledgements

Writingthisshortbookhasbeenafunandrewardingexperience.Wewould liketothank,innoparticularorderthefollowingpeoplewhohavehelpedus duringthewritingofthisbook.

SonuKapoorgenerouslyhostedourbookwhichwhenwereleasedthefirst draftreceivedoverthirteenthousanddownloads,withouthisgenerositythis bookwouldnothavebeenabletoreachsomanypeople.JonSkeetprovidedus withanalarmingnumberofsuggestionsthroughoutforwhichweareeternally grateful.Jonalsoeditedthisbookaswell.

Wewouldalsoliketothankthosewhoprovidedtheoddsuggestionviaemail tous.Allfeedbackwaslistenedtoandyouwillnodoubtseesomecontent influencedbyyoursuggestions.

Aspecialthankyoualsogoesouttothosewhohelpedpublicisethisbook fromMicrosoft’sChannel9weeklyshow(thanksDan!)tothemanybloggers whohelpedspreadtheword.Yougaveusanaudienceandforthatweare extremelygrateful.

Thankyoutoallwhocontributedinsomewaytothisbook.Theprogrammingcommunityneverceasestoamazeusinhowwillingitsconstituentsareto givetimetoprojectssuchasthisone.Thankyou.

VI

AbouttheAuthors GranvilleBarnett

GranvilleiscurrentlyaPh.DcandidateatQueenslandUniversityofTechnology (QUT)workingonparallelismattheMicrosoftQUTeResearchCentre1.Healso holdsadegreeinComputerScience,andisaMicrosoftMVP.Hismaininterests areinprogramminglanguagesandcompilers.Granvillecanbecontactedvia oneoftwoplaces:eitherhispersonalwebsite(http://gbarnett.org)orhis blog(http://msmvps.com/blogs/gbarnett).

LucaDelTongo

LucaiscurrentlystudyingforhismastersdegreeinComputerScienceatFlorence.Hismaininterestsvaryfromwebdevelopmenttoresearchfieldssuchas dataminingandcomputervision.LucaalsomaintainsanItalianblogwhich canbefoundat http://blogs.ugidotnet.org/wetblog/

1 http://www.mquter.qut.edu.au/

VII
Pageintentionallyleftblank.

Chapter1

Introduction

1.1Whatthisbookis,andwhatitisn’t

Thisbookprovidesimplementationsofcommonanduncommonalgorithmsin pseudocodewhichislanguageindependentandprovidesforeasyportingtomost imperativeprogramminglanguages.Itisnotadefinitivebookonthetheoryof datastructuresandalgorithms.

Forthemostpartthisbookpresentsimplementationsdevisedbytheauthors themselvesbasedontheconceptsbywhichtherespectivealgorithmsarebased uponsoitismorethanpossiblethatourimplementationsdifferfromthose consideredthenorm.

Youshouldusethisbookalongsideanotheronthesamesubject,butone thatcontainsformalproofsofthealgorithmsinquestion.Inthisbookweuse theabstractbigOhnotationtodepicttheruntimecomplexityofalgorithms sothatthebookappealstoalargeraudience.

1.2Assumedknowledge

Wehavewrittenthisbookwithfewassumptionsofthereader,butsomehave beennecessaryinordertokeepthebookasconciseandapproachableaspossible. Weassumethatthereaderisfamiliarwiththefollowing:

1. BigOhnotation

2. Animperativeprogramminglanguage

3. Objectorientedconcepts

1.2.1BigOhnotation

ForruntimecomplexityanalysisweusebigOhnotationextensivelysoitisvital thatyouarefamiliarwiththegeneralconceptstodeterminewhichisthebest algorithmforyouincertainscenarios.WehavechosentousebigOhnotation forafewreasons,themostimportantofwhichisthatitprovidesanabstract measurementbywhichwecanjudgetheperformanceofalgorithmswithout usingmathematicalproofs.

1

Figure1.1showssomeoftheruntimestodemonstratehowimportantitisto chooseanefficientalgorithm.Forthesanityofourgraphwehaveomittedcubic

O(n3),andexponential O(2n)runtimes.Cubicandexponentialalgorithms shouldonlyeverbeusedforverysmallproblems(ifever!);avoidthemiffeasibly possible.

ThefollowinglistexplainssomeofthemostcommonbigOhnotations:

O(1) constant:theoperationdoesn’tdependonthesizeofitsinput,e.g.adding anodetothetailofalinkedlistwherewealwaysmaintainapointerto thetailnode.

O(n) linear:theruntimecomplexityisproportionatetothesizeof n

O(logn) logarithmic:normallyassociatedwithalgorithmsthatbreaktheproblem intosmallerchunkspereachinvocation,e.g.searchingabinarysearch tree.

O(nlogn) just nlogn:usuallyassociatedwithanalgorithmthatbreakstheproblem intosmallerchunkspereachinvocation,andthentakestheresultsofthese smallerchunksandstitchesthembacktogether,e.g.quicksort.

O(n2) quadratic:e.g.bubblesort.

O(n3) cubic:veryrare.

O(2n) exponential:incrediblyrare.

Ifyouencountereitherofthelattertwoitems(cubicandexponential)thisis reallyasignalforyoutoreviewthedesignofyouralgorithm.Whileprototypingalgorithmdesignsyoumayjusthavetheintentionofsolvingtheproblem irrespectiveofhowfastitworks.Wewouldstronglyadvisethatyoualways reviewyouralgorithmdesignandoptimisewherepossible—particularlyloops

CHAPTER1.INTRODUCTION 2
Figure1.1:Algorithmicruntimeexpansion

andrecursivecalls—sothatyoucangetthemostefficientruntimesforyour algorithms.

ThebiggestassetthatbigOhnotationgivesusisthatitallowsustoessentiallydiscardthingslikehardware.Ifyouhavetwosortingalgorithms,one withaquadraticruntime,andtheotherwithalogarithmicruntimethenthe logarithmicalgorithmwillalwaysbefasterthanthequadraticonewhenthe datasetbecomessuitablylarge.Thisapplieseveniftheformerisranonamachinethatisfarfasterthanthelatter.Why?BecausebigOhnotationisolates akeyfactorinalgorithmanalysis:growth.Analgorithmwithaquadraticrun timegrowsfasterthanonewithalogarithmicruntime.Itisgenerallysaidat somepointas n →∞ thelogarithmicalgorithmwillbecomefasterthanthe quadraticalgorithm.

BigOhnotationalsoactsasacommunicationtool.Picturethescene:you arehavingameetingwithsomefellowdeveloperswithinyourproductgroup. Youarediscussingprototypealgorithmsfornodediscoveryinmassivenetworks. Severalminuteselapseafteryouandtwoothershavediscussedyourrespective algorithmsandhowtheywork.Doesthisgiveyouagoodideaofhowfasteach respectivealgorithmis?No.Theresultofsuchadiscussionwilltellyoumore aboutthehighlevelalgorithmdesignratherthanitsefficiency.Replaythescene backinyourhead,butthistimeaswellastalkingaboutalgorithmdesigneach respectivedeveloperstatestheasymptoticruntimeoftheiralgorithm.Using thelatterapproachyounotonlygetagoodgeneralideaaboutthealgorithm design,butalsokeyefficiencydatawhichallowsyoutomakebetterchoices whenitcomestoselectinganalgorithmfitforpurpose.

Somereadersmayactuallyworkinaproductgroupwheretheyaregiven budgetsperfeature.Eachfeatureholdswithitabudgetthatrepresentsitsuppermosttimebound.Ifyousavesometimeinonefeatureitdoesn’tnecessarily giveyouabufferfortheremainingfeatures.Imagineyouareworkingonan application,andyouareintheteamthatisdevelopingtheroutinesthatwill essentiallyspinupeverythingthatisrequiredwhentheapplicationisstarted. Everythingisgreatuntilyourbosscomesinandtellsyouthatthestartup timeshouldnotexceed n ms.Theefficiencyofeveryalgorithmthatisinvoked duringstartupinthisexampleisabsolutelykeytoasuccessfulproduct.Even ifyoudon’thavethesebudgetsyoushouldstillstriveforoptimalsolutions. Takingaquantitativeapproachformanysoftwaredevelopmentproperties willmakeyouafarsuperiorprogrammer-measuringone’sworkiscriticalto success.

1.2.2Imperativeprogramminglanguage

Allexamplesaregiveninapseudo-imperativecodingformatandsothereader mustknowthebasicsofsomeimperativemainstreamprogramminglanguage toporttheexampleseffectively,wehavewrittenthisbookwiththefollowing targetlanguagesinmind:

CHAPTER1.INTRODUCTION 3
1. C++ 2. C# 3. Java

Thereasonthatweareexplicitinthisrequirementissimple—allourimplementationsarebasedonanimperativethinkingstyle.Ifyouareafunctional programmeryouwillneedtoapplyvariousaspectsfromthefunctionalparadigm toproduceefficientsolutionswithrespecttoyourfunctionallanguagewhether itbeHaskell,F#,OCaml,etc.

Twoofthelanguagesthatwehavelisted(C#andJava)targetvirtual machineswhichprovidevariousthingslikesecuritysandboxing,andmemory managementviagarbagecollectionalgorithms.Itistrivialtoportourimplementationstotheselanguages.WhenportingtoC++youmustrememberto usepointersforcertainthings.Forexample,whenwedescribealinkedlist nodeashavingareferencetothenextnode,thisdescriptionisinthecontext ofamanagedenvironment.InC++youshouldinterpretthereferenceasa pointertothenextnodeandsoon.Forprogrammerswhohaveafairamount ofexperiencewiththeirrespectivelanguagethesesubtletieswillpresentnoissue,whichiswhywereallydoemphasisethatthereadermustbecomfortable withatleastoneimperativelanguageinordertosuccessfullyportthepseudoimplementationsinthisbook.

Itisessentialthattheuserisfamiliarwithprimitiveimperativelanguage constructsbeforereadingthisbookotherwiseyouwilljustgetlost.Somealgorithmspresentedinthisbookcanbeconfusingtofollowevenforexperienced programmers!

1.2.3Objectorientedconcepts

Forthemostpartthisbookdoesnotusefeaturesthatarespecifictoanyone language.Inparticular,weneverprovidedatastructuresoralgorithmsthat workongenerictypes—thisisinordertomakethesamplesaseasytofollow aspossible.However,toappreciatethedesignsofourdatastructuresyouwill needtobefamiliarwiththefollowingobjectoriented(OO)concepts:

1. Inheritance

2. Encapsulation

3. Polymorphism

ThisisespeciallyimportantifyouareplanningonlookingattheC#target thatwehaveimplemented(moreonthatin §1.7)whichmakesextensiveuse oftheOOconceptslistedabove.Asafinalnoteitisalsodesirablethatthe readerisfamiliarwithinterfacesastheC#targetusesinterfacesthroughout thesortingalgorithms.

1.3Pseudocode

Throughoutthisbookweusepseudocodetodescribeoursolutions.Forthe mostpartinterpretingthepseudocodeistrivialasitlooksverymuchlikea moreabstractC++,orC#,butthereareafewthingstopointout:

1. Pre-conditionsshouldalwaysbeenforced

2. Post-conditionsrepresenttheresultofapplyingalgorithm a todatastructure d

CHAPTER1.INTRODUCTION 4

3. Thetypeofparametersisinferred

4. Allprimitivelanguageconstructsareexplicitlybegunandended Ifanalgorithmhasareturntypeitwilloftenbepresentedinthepostcondition,butwherethereturntypeissufficientlyobviousitmaybeomitted forthesakeofbrevity.

Mostalgorithmsinthisbookrequireparameters,andbecauseweassignno explicittypetothoseparametersthetypeisinferredfromthecontextsinwhich itisused,andtheoperationsperformeduponit.Additionally,thenameof theparameterusuallyactsasthebiggestcluetoitstype.Forinstance n isa pseudo-nameforanumberandsoyoucanassumeunlessotherwisestatedthat n translatestoanintegerthathasthesamenumberofbitsasaWORDona 32bitmachine,similarly l isapseudo-nameforalistwherealistisaresizeable array(e.g.avector).

Thelastmajorpointofreferenceisthatwealwaysexplicitlyendalanguage construct.Forinstanceifwewishtoclosethescopeofa for loopwewill explicitlystate endfor ratherthanleavingtheinterpretationofwhenscopes areclosedtothereader.Whileimplicitscopeclosureworkswellinsimplecode, incomplexcasesitcanleadtoambiguity.

Thepseudocodestylethatweusewithinthisbookisratherstraightforward. Allalgorithmsstartwithasimplealgorithmsignature,e.g.

1) algorithm AlgorithmName(arg1, arg2,..., argN )

2)...

n) end AlgorithmName

Immediatelyafterthealgorithmsignaturewelistany Pre or Post conditions.

1) algorithm AlgorithmName(n)

2) Pre: n isthevaluetocomputethefactorialof

3) n ≥ 0

4) Post: thefactorialof n hasbeencomputed

5)//...

n) end AlgorithmName

Theexampleabovedescribesanalgorithmbythenameof AlgorithmName, whichtakesasinglenumericparameter n.Thepreandpostconditionsfollow thealgorithmsignature;youshouldalwaysenforcethepre-conditionsofan algorithmwhenportingthemtoyourlanguageofchoice.

Normallywhatislistedasapre-coniditioniscriticaltothealgorithmsoperation.Thismaycoverthingsliketheactualparameternotbeingnull,orthatthe collectionpassedinmustcontainatleast n items.Thepost-conditionmainly describestheeffectofthealgorithmsoperation.Anexampleofapost-condition mightbe“Thelisthasbeensortedinascendingorder”

Becauseeverythingwedescribeislanguageindependentyouwillneedto makeyourownminduponhowtobesthandlepre-conditions.Forexample, intheC#targetwehaveimplemented,weconsidernon-conformancetopreconditionstobeexceptionalcases.Weprovideamessageintheexceptionto tellthecallerwhythealgorithmhasfailedtoexecutenormally.

CHAPTER1.INTRODUCTION 5

1.4Tipsforworkingthroughtheexamples

Aswithmostbooksyougetoutwhatyouputinandsowerecommendthatin ordertogetthemostoutofthisbookyouworkthrougheachalgorithmwitha penandpapertotrackthingslikevariablenames,recursivecallsetc.

Thebestwaytoworkthroughalgorithmsistosetupatable,andinthat tablegiveeachvariableitsowncolumnandcontinuouslyupdatethesecolumns. Thiswillhelpyoukeeptrackofandvisualisethemutationsthatareoccurring throughoutthealgorithm.Oftenwhileworkingthroughalgorithmsinsuch awayyoucanintuitivelymaprelationshipsbetweendatastructuresrather thantryingtoworkoutafewvaluesonpaperandtherestinyourhead.We suggestyouputeverythingonpaperirrespectiveofhowtrivialsomevariables andcalculationsmaybesothatyoualwayshaveapointofreference.

Whendealingwithrecursivealgorithmtraceswerecommendyoudothe sameastheabove,butalsohaveatablethatrecordsfunctioncallsandwho theyreturnto.Thisapproachisafarcleanerwaythandrawingoutanelaborate mapoffunctioncallswitharrowstooneanother,whichgetslargequicklyand simplymakesthingsmorecomplextofollow.Trackeverythinginasimpleand systematicwaytomakeyourtimestudyingtheimplementationsfareasier.

1.5Bookoutline

Wehavesplitthisbookintotwoparts:

Part1: Providesdiscussionandpseudo-implementationsofcommonanduncommondatastructures;and

Part2: Providesalgorithmsofvaryingpurposesfromsortingtostringoperations.

Thereaderdoesn’thavetoreadthebooksequentiallyfrombeginningto end:chapterscanbereadindependentlyfromoneanother.Wesuggestthat inpart1youreadeachchapterinitsentirety,butinpart2youcangetaway withjustreadingthesectionofachapterthatdescribesthealgorithmyouare interestedin.

Eachofthechaptersondatastructurespresentinitiallythealgorithmsconcernedwith:

Thepreviouslistrepresentswhatwebelieveinthevastmajorityofcasesto bethemostimportantforeachrespectivedatastructure.

Forallreaderswerecommendthatbeforelookingatanyalgorithmyou quicklylookatAppendixEwhichcontainsatablelistingthevarioussymbols usedwithinouralgorithmsandtheirmeaning.Onekeywordthatwewouldlike topointouthereis yield.Youcanthinkof yield inthesamelightas return The return keywordcausesthemethodtoexitandreturnscontroltothecaller, whereas yield returnseachvaluetothecaller.With yield controlonlyreturns tothecallerwhenallvaluestoreturntothecallerhavebeenexhausted.

CHAPTER1.INTRODUCTION 6
1. Insertion 2. Deletion 3. Searching

1.6Testing

Allthedatastructuresandalgorithmshavebeentestedusingaminimisedtest drivendevelopmentstyleonpapertofleshoutthepseudocodealgorithm.We thentranscribethesetestsintounittestssatisfyingthemonebyone.When allthetestcaseshavebeenprogressivelysatisfiedweconsiderthatalgorithm suitablytested.

Forthemostpartalgorithmshavefairlyobviouscaseswhichneedtobe satisfied.Somehoweverhavemanyareaswhichcanprovetobemorecomplex tosatisfy.Withsuchalgorithmswewillpointoutthetestcaseswhicharetricky andthecorrespondingportionsofpseudocodewithinthealgorithmthatsatisfy thatrespectivecase.

Asyoubecomemorefamiliarwiththeactualproblemyouwillbeableto intuitivelyidentifyareaswhichmaycauseproblemsforyouralgorithmsimplementation.Thisinsomecaseswillyieldanoverwhelminglistofconcernswhich willhinderyourabilitytodesignanalgorithmgreatly.Whenyouarebombardedwithsuchavastamountofconcernslookattheoverallproblemagain andsub-dividetheproblemintosmallerproblems.Solvingthesmallerproblems andthencomposingthemisafareasiertaskthancloudingyourmindwithtoo manylittledetails.

Theonlytypeoftestingthatweuseintheimplementationofallthatis providedinthisbookareunittests.Becauseunittestscontributesuchacore pieceofcreatingsomewhatmorestablesoftwareweinvitethereadertoview AppendixDwhichdescribestestinginmoredepth.

1.7WherecanIgetthecode?

Thisbookdoesn’tprovideanycodespecificallyalignedwithit,howeverwedo activelymaintainanopensourceproject1 thathousesaC#implementationof allthepseudocodelisted.Theprojectisnamed DataStructuresandAlgorithms (DSA)andcanbefoundat http://codeplex.com/dsa.

1.8Finalmessages

Wehavejustafewfinalmessagestothereaderthatwehopeyoudigestbefore youembarkonreadingthisbook:

1. Understandhowthealgorithmworksfirstinanabstractsense;and

2. Alwaysworkthroughthealgorithmsonpapertounderstandhowthey achievetheiroutcome

Ifyoualwaysfollowthesekeypoints,youwillgetthemostoutofthisbook.

1 Allreadersareencouragedtoprovidesuggestions,featurerequests,andbugssowecan furtherimproveourimplementations.

CHAPTER1.INTRODUCTION 7

PartI DataStructures

8

Chapter2

LinkedLists

Linkedlistscanbethoughtoffromahighlevelperspectiveasbeingaseries ofnodes.Eachnodehasatleastasinglepointertothenextnode,andinthe lastnode’scaseanullpointerrepresentingthattherearenomorenodesinthe linkedlist.

InDSAourimplementationsoflinkedlistsalwaysmaintainheadandtail pointerssothatinsertionateithertheheadortailofthelistisaconstant timeoperation.Randominsertionisexcludedfromthisandwillbealinear operation.Assuch,linkedlistsinDSAhavethefollowingcharacteristics:

1. Insertionis O(1)

2. Deletionis O(n)

3. Searchingis O(n)

Outofthethreeoperationstheonethatstandsoutisthatofinsertion.In DSAwechosetoalwaysmaintainpointers(ormoreaptlyreferences)tothe node(s)attheheadandtailofthelinkedlistandsoperformingatraditional insertiontoeitherthefrontorbackofthelinkedlistisan O(1)operation.An exceptiontothisruleisperforminganinsertionbeforeanodethatisneither theheadnortailinasinglylinkedlist.Whenthenodeweareinsertingbefore issomewhereinthemiddleofthelinkedlist(knownasrandominsertion)the complexityis O(n).Inordertoaddbeforethedesignatednodeweneedto traversethelinkedlisttofindthatnode’scurrentpredecessor.Thistraversal yieldsan O(n)runtime.

Thisdatastructureistrivial,butlinkedlistshaveafewkeypointswhichat timesmakethemveryattractive:

1. thelistisdynamicallyresized,thusitincursnocopypenaltylikeanarray orvectorwouldeventuallyincur;and

2. insertionis O(1).

2.1SinglyLinkedList

Singlylinkedlistsareoneofthemostprimitivedatastructuresyouwillfindin thisbook.Eachnodethatmakesupasinglylinkedlistconsistsofavalue,and areferencetothenextnode(ifany)inthelist.

9

2.1.1Insertion

Ingeneralwhenpeopletalkaboutinsertionwithrespecttolinkedlistsofany formtheyimplicitlyrefertotheaddingofanodetothetailofthelist.When youuseanAPIlikethatofDSAandyouseeageneralpurposemethodthat addsanodetothelist,youcanassumethatyouareaddingthenodetothetail ofthelistnotthehead.

Addinganodetoasinglylinkedlisthasonlytwocases:

1. head = ∅ inwhichcasethenodeweareaddingisnowboththe head and tail ofthelist;or

2. wesimplyneedtoappendournodeontotheendofthelistupdatingthe tail referenceappropriately.

1) algorithm Add(value)

2) Pre: value isthevaluetoaddtothelist

3) Post: value hasbeenplacedatthetailofthelist

4) n ← node(value)

5) if head = ∅ 6) head ← n

Asanexampleofthepreviousalgorithmconsideraddingthefollowingsequenceofintegerstothelist:1,45,60,and12,theresultinglististhatof Figure2.2.

2.1.2Searching

Searchingalinkedlistisstraightforward:wesimplytraversethelistchecking thevaluewearelookingforwiththevalueofeachnodeinthelinkedlist.The algorithmlistedinthissectionisverysimilartothatusedfortraversalin §2.1.4.

CHAPTER2.LINKEDLISTS 10
Figure2.1:Singlylinkedlistnode Figure2.2:Asinglylinkedlistpopulatedwithintegers
8)
10) tail
11)
12)
7) tail ← n
else 9) tail.Next ← n
← n
endif
end Add

1) algorithm Contains(head, value)

2) Pre: head istheheadnodeinthelist

3) value isthevaluetosearchfor

4) Post: theitemiseitherinthelinkedlist,true;otherwisefalse

5) n ← head

6) while n = ∅ and n.Value = value

7) n ← n.Next 8) endwhile 9) if n = ∅

returnfalse

endif

returntrue

end Contains

2.1.3Deletion

Deletinganodefromalinkedlistisstraightforwardbutthereareafewcases weneedtoaccountfor:

1. thelistisempty;or

2. thenodetoremoveistheonlynodeinthelinkedlist;or

3. weareremovingtheheadnode;or

4. weareremovingthetailnode;or

5. thenodetoremoveissomewhereinbetweentheheadandtail;or

6. theitemtoremovedoesn’texistinthelinkedlist

Thealgorithmwhosecaseswehavedescribedwillremoveanodefromanywherewithinalistirrespectiveofwhetherthenodeisthe head etc.Ifyouknow thatitemswillonlyeverberemovedfromthe head or tail ofthelistthenyou cancreatemuchmoreconcisealgorithms.Inthecaseofalwaysremovingfrom thefrontofthelinkedlistdeletionbecomesan O(1)operation.

CHAPTER2.LINKEDLISTS 11
10)
11)
12)
13)

1) algorithm Remove(head, value)

2) Pre: head istheheadnodeinthelist

3) value isthevaluetoremovefromthelist

4) Post: value isremovedfromthelist,true;otherwisefalse

5) if head = ∅ 6)//case1

7) returnfalse

8) endif 9) n ← head 10) if n.Value= value 11) if head = tail

13) head ←∅

tail ←∅ 15) else

17) head ← head.Next

endif

returntrue 20) endif

while n.Next = ∅ and n.Next.Value = value

n ← n.Next

endwhile

if n.Next = ∅ 25) if n.Next= tail

27) tail ← n

endif 29)//thisisonlycase5iftheconditionalonline25was false

n.Next ← n.Next.Next

returnfalse

end Remove

2.1.4Traversingthelist

Traversingasinglylinkedlististhesameasthatoftraversingadoublylinked list(definedin §2.2).Youstartattheheadofthelistandcontinueuntilyou comeacrossanodethatis ∅.Thetwocasesareasfollows:

1. node = ∅,wehaveexhaustedallnodesinthelinkedlist;or

2. wemustupdatethe node referencetobe node.Next.

Thealgorithmdescribedisaverysimpleonethatmakesuseofasimple while looptocheckthefirstcase.

CHAPTER2.LINKEDLISTS 12
23)
30)
31)
33)//case6 34)
12)//case2
14)
16)//case3
18)
19)
21)
22)
24)
26)//case4
28)
returntrue 32) endif
35)

1) algorithm Traverse(head)

2) Pre: head istheheadnodeinthelist

3) Post: theitemsinthelisthavebeentraversed

4) n ← head

5) while n =0

6) yield n.Value

7) n ← n.Next 8)

2.1.5Traversingthelistinreverseorder

Traversingasinglylinkedlistinaforwardmanner(i.e.lefttoright)issimple asdemonstratedin §2.1.4.However,whatifwewantedtotraversethenodesin thelinkedlistinreverseorderforsomereason?Thealgorithmtoperformsuch atraversalisverysimple,andjustlikedemonstratedin §2.1.3wewillneedto acquireareferencetothepredecessorofanode,eventhoughthefundamental characteristicsofthenodesthatmakeupasinglylinkedlistmakethisan expensiveoperation.Foreachnode,findingitspredecessorisan O(n)operation, sooverthecourseoftraversingthewholelistbackwardsthecostbecomes O(n2). Figure2.3depictsthefollowingalgorithmbeingappliedtoalinkedlistwith theintegers5,10,1,and40.

1) algorithm ReverseTraversal(head, tail)

2) Pre: head and tail belongtothesamelist

3) Post: theitemsinthelisthavebeentraversedinreverseorder 4) if tail = ∅ 5) curr ← tail

Thisalgorithmisonlyofrealinterestwhenweareusingsinglylinkedlists, asyouwillsoonseethatdoublylinkedlists(definedin §2.2)makereverselist traversalsimpleandefficient,asshownin §2.2.3.

2.2DoublyLinkedList

Doublylinkedlistsareverysimilartosinglylinkedlists.Theonlydifferenceis thateachnodehasareferencetoboththenextandpreviousnodesinthelist.

CHAPTER2.LINKEDLISTS 13
9)
endwhile
end Traverse
6)
7)
8)
9)
10) endwhile 11)
curr
12) curr ← prev 13) endwhile 14) yield curr.Value 15) endif 16)
ReverseTraversal
while curr = head
prev ← head
while prev.Next = curr
prev ← prev.Next
yield
.Value
end
CHAPTER2.LINKEDLISTS 14
Figure2.3:Reversetraveralofasinglylinkedlist Figure2.4:Doublylinkedlistnode

Thefollowingalgorithmsforthedoublylinkedlistareexactlythesameas thoselistedpreviouslyforthesinglylinkedlist:

1. Searching(definedin §2.1.2)

2. Traversal(definedin §2.1.4)

2.2.1Insertion

Theonlymajordifferencebetweenthealgorithmin §2.1.1isthatweneedto remembertobindthepreviouspointerof n totheprevioustailnodeif n was notthefirstnodetobeinsertedintothelist.

1) algorithm Add(value)

2) Pre: value isthevaluetoaddtothelist

3) Post: value hasbeenplacedatthetailofthelist

4) n ← node(value)

5) if head = ∅

6) head ← n

7) tail ← n

8) else

9) n.Previous ← tail

10) tail.Next ← n 11) tail ← n

12) endif

13) end Add

Figure2.5showsthedoublylinkedlistafteraddingthesequenceofintegers definedin §2.1.1.

2.2.2Deletion

Asyoumayofguessedthecasesthatweusefordeletioninadoublylinked listareexactlythesameasthosedefinedin §2.1.3.Likeinsertionwehavethe addedtaskofbindinganadditionalreference(Previous)tothecorrectvalue.

CHAPTER2.LINKEDLISTS 15
Figure2.5:Doublylinkedlistpopulatedwithintegers

1) algorithm Remove(head, value)

2) Pre: head istheheadnodeinthelist

3) value isthevaluetoremovefromthelist

4) Post: value isremovedfromthelist,true;otherwisefalse

Singlylinkedlistshaveaforwardonlydesign,whichiswhythereversetraversal algorithmdefinedin §2.1.5requiredsomecreativeinvention.Doublylinkedlists makereversetraversalassimpleasforwardtraversal(definedin §2.1.4)except thatwestartatthetailnodeandupdatethepointersintheoppositedirection. Figure2.6showsthereversetraversalalgorithminaction.

CHAPTER2.LINKEDLISTS 16
returnfalse
if value = head
if head = tail 10) head ←∅ 11) tail ←∅ 12) else 13) head ← head.Next 14) head.Previous ←∅ 15) endif 16) returntrue 17) endif 18) n ← head.Next 19) while n = ∅ and value = n.Value 20) n ← n.Next 21) endwhile 22) if n = tail 23) tail ← tail.Previous 24) tail.Next ←∅ 25) returntrue 26) elseif n = ∅ 27) n.Previous.Next ← n
28) n.Next.Previous ← n.Previous 29) returntrue 30) endif 31) returnfalse 32) end Remove
5) if head = ∅ 6)
7) endif 8)
.Value 9)
.Next
2.2.3ReverseTraversal

2.3Summary

Linkedlistsaregoodtousewhenyouhaveanunknownnumberofitemsto store.Usingadatastructurelikeanarraywouldrequireyoutospecifythesize upfront;exceedingthatsizeinvolvesinvokingaresizingalgorithmwhichhas alinearruntime.Youshouldalsouselinkedlistswhenyouwillonlyremove nodesateithertheheadortailofthelisttomaintainaconstantruntime. Thisrequiresmaintainingpointerstothenodesattheheadandtailofthelist butthememoryoverheadwillpayforitselfifthisisanoperationyouwillbe performingmanytimes.

Whatlinkedlistsarenotverygoodforisrandominsertion,accessingnodes byindex,andsearching.Attheexpenseofalittlememory(inmostcases4 byteswouldsuffice),andafewmoreread/writesyoucouldmaintaina count variablethattrackshowmanyitemsarecontainedinthelistsothataccessing suchaprimitivepropertyisaconstantoperation-youjustneedtoupdate count duringtheinsertionanddeletionalgorithms.

Singlylinkedlistsshouldbeusedwhenyouareonlyperformingbasicinsertions.Ingeneraldoublylinkedlistsaremoreaccommodatingfornon-trivial operationsonalinkedlist.

Werecommendtheuseofadoublylinkedlistwhenyourequireforwards andbackwardstraversal.Forthemostcasesthisrequirementispresent.For example,consideratokenstreamthatyouwanttoparseinarecursivedescent fashion.Sometimesyouwillhavetobacktrackinordertocreatethecorrect parsetree.Inthisscenarioadoublylinkedlistisbestasitsdesignmakes bi-directionaltraversalmuchsimplerandquickerthanthatofasinglylinked

CHAPTER2.LINKEDLISTS 17
1) algorithm ReverseTraversal(tail) 2) Pre: tail isthetailnodeofthelisttotraverse 3) Post: thelisthasbeentraversedinreverseorder 4) n ← tail 5) while n = ∅ 6) yield n.Value 7) n ← n.Previous 8) endwhile 9) end ReverseTraversal
Figure2.6:Doublylinkedlistreversetraversal
CHAPTER2.LINKEDLISTS 18 list.

Chapter3

BinarySearchTree

Binarysearchtrees(BSTs)areverysimpletounderstand.Westartwitharoot nodewithvalue x,wheretheleftsubtreeof x containsnodeswithvalues <x andtherightsubtreecontainsnodeswhosevaluesare ≥ x.Eachnodefollows thesameruleswithrespecttonodesintheirleftandrightsubtrees.

BSTsareofinterestbecausetheyhaveoperationswhicharefavourablyfast: insertion,lookup,anddeletioncanallbedonein O(logn)time.Itisimportant tonotethatthe O(logn)timesfortheseoperationscanonlybeattainedif theBSTisreasonablybalanced;foratreedatastructurewithselfbalancing propertiesseeAVLtreedefinedin §7).

Inthefollowingexamplesyoucanassume,unlessusedasaparameteralias that root isareferencetotherootnodeofthetree.

19
Figure3.1:Simpleunbalancedbinarysearchtree

3.1Insertion

Asmentionedpreviouslyinsertionisan O(logn)operationprovidedthatthe treeismoderatelybalanced.

1) algorithm Insert(value)

2) Pre: value haspassedcustomtypechecksfortype T

3) Post: value hasbeenplacedinthecorrectlocationinthetree

4) if root = ∅

5) root ← node(value)

6) else

7)InsertNode(root, value)

8) endif

9) end Insert

1) algorithm InsertNode(current, value)

2) Pre: current isthenodetostartfrom

3) Post: value hasbeenplacedinthecorrectlocationinthetree

4) if value<current.Value

5) if current.Left= ∅

6) current.Left ← node(value)

7) else

8)InsertNode(current.Left, value)

9) endif

10) else

11) if current.Right= ∅

12) current.Right ← node(value) 13) else

14)InsertNode(current.Right, value) 15) endif 16) endif 17) end InsertNode

Theinsertionalgorithmissplitforagoodreason.Thefirstalgorithm(nonrecursive)checksaverycorebasecase-whetherornotthetreeisempty.If thetreeisemptythenwesimplycreateourrootnodeandfinish.Inallother casesweinvoketherecursive InsertNode algorithmwhichsimplyguidesusto thefirstappropriateplaceinthetreetoput value.Notethatateachstagewe performabinarychop:weeitherchoosetorecurseintotheleftsubtreeorthe rightbycomparingthenewvaluewiththatofthecurrentnode.Foranytotally orderedtype,novaluecansimultaneouslysatisfytheconditionstoplaceitin bothsubtrees.

CHAPTER3.BINARYSEARCHTREE 20

3.2Searching

SearchingaBSTisevensimplerthaninsertion.Thepseudocodeisself-explanatory butwewilllookbrieflyatthepremiseofthealgorithmnonetheless. Wehavetalkedpreviouslyaboutinsertion,wegoeitherleftorrightwiththe rightsubtreecontainingvaluesthatare ≥ x where x isthevalueofthenode weareinserting.Whensearchingtherulesaremadealittlemoreatomicand atanyonetimewehavefourcasestoconsider:

1. the root = ∅ inwhichcase value isnotintheBST;or

2. root.Value= value inwhichcase value isintheBST;or

3. value<root.Value,wemustinspecttheleftsubtreeof root for value;or

4. value>root.Value,wemustinspecttherightsubtreeof root for value.

1) algorithm Contains(root, value)

2) Pre: root istherootnodeofthetree, value iswhatwewouldliketolocate

3) Post: value iseitherlocatedornot 4) if root = ∅

CHAPTER3.BINARYSEARCHTREE 21
5)
6)
7) if root.Value= value 8) returntrue 9) elseif value<root.Value 10) return Contains(root.Left, value) 11) else 12) return Contains(root.Right, value) 13) endif 14) end Contains
returnfalse
endif

3.3Deletion

RemovinganodefromaBSTisfairlystraightforward,withfourcasestoconsider:

1. thevaluetoremoveisaleafnode;or

2. thevaluetoremovehasarightsubtree,butnoleftsubtree;or

3. thevaluetoremovehasaleftsubtree,butnorightsubtree;or

4. thevaluetoremovehasbothaleftandrightsubtreeinwhichcasewe promotethelargestvalueintheleftsubtree.

Thereisalsoanimplicitfifthcasewherebythenodetoberemovedisthe onlynodeinthetree.Thiscaseisalreadycoveredbythefirst,butshouldbe notedasapossibilitynonetheless.

OfcourseinaBSTavaluemayoccurmorethanonce.Insuchacasethe firstoccurrenceofthatvalueintheBSTwillberemoved.

The Remove algorithmgivenbelowreliesontwofurtherhelperalgorithms named FindParent,and FindNode whicharedescribedin §3.4and §3.5respectively.

CHAPTER3.BINARYSEARCHTREE 22
Figure3.2:binarysearchtreedeletioncases

1) algorithm Remove(value)

2) Pre: value isthevalueofthenodetoremove, root istherootnodeoftheBST

3)CountisthenumberofitemsintheBST

3) Post: nodewith value isremovediffoundinwhichcaseyieldstrue,otherwisefalse

value

CHAPTER3.BINARYSEARCHTREE 23
9)
10)
11) elseif
nodeToRemove
null 12)//case#1
14) parent
15) else 16) parent.Right ←∅ 17) endif 18) elseif nodeToRemove
nodeToRemove
= ∅ 19)//case#2 20) if nodeToRemove.Value <parent.Value 21) parent.Left ← nodeToRemove.Right 22) else 23) parent.Right ← nodeToRemove.Right 24) endif 25) elseif nodeToRemove.Left = ∅ and nodeToRemove.Right= ∅ 26)//case#3 27) if nodeToRemove.Value <parent.Value 28) parent.Left ← nodeToRemove.Left 29) else 30) parent.Right ← nodeToRemove.Left 31) endif 32) else 33)//case#4 34) largestValue ← nodeToRemove.Left 35) while largestValue.Right = ∅ 36)//findthelargestvalueintheleftsubtreeof nodeToRemove 37) largestValue ← largestValue.Right 38) endwhile 39)//settheparents’Rightpointerof largestValue to ∅ 40)FindParent(largestValue.Value).Right ←∅ 41) nodeToRemove.Value ← largestValue.Value 42) endif 43)Count ← Count 1 44) returntrue 45) end Remove
4) nodeToRemove ← FindNode(
) 5) if nodeToRemove = ∅ 6) returnfalse //valuenotinBST 7) endif 8) parent ← FindParent(value)
if Count=1
root ←∅ //weareremovingtheonlynodeintheBST
nodeToRemove.Left= ∅ and
.Right=
13) if nodeToRemove.Value <parent.Value
.Left ←∅
.Left=
and
.Right

3.4Findingtheparentofagivennode

Thepurposeofthisalgorithmissimple-toreturnareference(orpointer)to theparentnodeoftheonewiththegivenvalue.Wehavefoundthatsuchan algorithmisveryuseful,especiallywhenperformingextensivetreetransformations.

1) algorithm FindParent(value, root)

2) Pre: value isthevalueofthenodewewanttofindtheparentof

3) root istherootnodeoftheBSTandis!= ∅

4) Post: areferencetotheparentnodeof value iffound;otherwise ∅

5) if value = root.Value

6) return ∅ 7) endif

8) if value<root.Value

9) if root.Left= ∅

10) return ∅

11) elseif root.Left.Value= value

12) return root

13) else

14) return FindParent(value, root.Left)

15) endif 16) else

17) if root.Right= ∅ 18) return ∅ 19) elseif root.Right.Value= value

20) return root 21) else

22) return FindParent(value, root.Right)

23) endif 24) endif 25) end FindParent

Aspecialcaseintheabovealgorithmiswhenthespecifiedvaluedoesnot existintheBST,inwhichcasewereturn ∅.Callerstothisalgorithmmusttake accountofthispossibilityunlesstheyarealreadycertainthatanodewiththe specifiedvalueexists.

3.5Attainingareferencetoanode

Thisalgorithmisverysimilarto §3.4,butinsteadofreturningareferencetothe parentofthenodewiththespecifiedvalue,itreturnsareferencetothenode itself.Again, ∅ isreturnedifthevalueisn’tfound.

CHAPTER3.BINARYSEARCHTREE 24

1) algorithm FindNode(root, value)

2) Pre: value isthevalueofthenodewewanttofindtheparentof

3) root istherootnodeoftheBST

areferencetothenodeof value iffound;otherwise ∅

Astutereaderswillhavenoticedthatthe FindNode algorithmisexactlythe sameasthe Contains algorithm(definedin §3.2)withthemodificationthat wearereturningareferencetoanodenot true or false.Given FindNode, theeasiestwayofimplementing Contains istocall FindNode andcomparethe returnvaluewith ∅.

3.6Findingthesmallestandlargestvaluesin thebinarysearchtree

TofindthesmallestvalueinaBSTyousimplytraversethenodesintheleft subtreeoftheBSTalwaysgoingleftuponeachencounterwithanode,terminatingwhenyoufindanodewithnoleftsubtree.Theoppositeisthecasewhen findingthelargestvalueintheBST.Bothalgorithmsareincrediblysimple,and arelistedsimplyforcompleteness.

Thebasecaseinboth FindMin,and FindMax algorithmsiswhentheLeft (FindMin),orRight(FindMax)nodereferencesare ∅ inwhichcasewehave reachedthelastnode.

1) algorithm FindMin(root)

2) Pre: root istherootnodeoftheBST

3) root = ∅

4) Post: thesmallestvalueintheBSTislocated 5) if root.Left= ∅ 6) return root.Value 7) endif 8)FindMin(root.Left) 9) end FindMin

CHAPTER3.BINARYSEARCHTREE 25
10)
11)
12)
13)
14)
15)
4) Post:
5) if root = ∅ 6) return ∅ 7) endif 8) if root.Value= value 9) return root
elseif value<root.Value
return FindNode(root.Left, value)
else
return FindNode(root.Right, value)
endif
end FindNode

1) algorithm FindMax(root)

2) Pre: root istherootnodeoftheBST

3) root = ∅

4) Post: thelargestvalueintheBSTislocated

5) if root.Right= ∅

6) return root.Value

7) endif

8)FindMax(root.Right)

9) end FindMax

3.7TreeTraversals

Therearevariousstrategieswhichcanbeemployedtotraversetheitemsina tree;thechoiceofstrategydependsonwhichnodevisitationorderyourequire. InthissectionwewilltouchonthetraversalsthatDSAprovidesonalldata structuresthatderivefrom BinarySearchTree

3.7.1Preorder

Whenusingthepreorderalgorithm,youvisittherootfirst,thentraversetheleft subtreeandfinallytraversetherightsubtree.Anexampleofpreordertraversal isshowninFigure3.3.

1) algorithm Preorder(root)

2) Pre: root istherootnodeoftheBST

3) Post: thenodesintheBSThavebeenvisitedinpreorder

4) if root = ∅

5) yield root.Value

6)Preorder(root.Left)

7)Preorder(root.Right)

8) endif

9) end Preorder

3.7.2Postorder

Thisalgorithmisverysimilartothatdescribedin §3.7.1,howeverthevalue ofthenodeisyieldedaftertraversingbothsubtrees.Anexampleofpostorder traversalisshowninFigure3.4.

1) algorithm Postorder(root)

2) Pre: root istherootnodeoftheBST

3) Post: thenodesintheBSThavebeenvisitedinpostorder

4) if root = ∅

5)Postorder(root.Left)

6)Postorder(root.Right)

7) yield root.Value

8) endif

9) end Postorder

CHAPTER3.BINARYSEARCHTREE 26
CHAPTER3.BINARYSEARCHTREE 27
Figure3.3:Preordervisitbinarysearchtreeexample
CHAPTER3.BINARYSEARCHTREE 28
Figure3.4:Postordervisitbinarysearchtreeexample

3.7.3Inorder

Anothervariationofthealgorithmsdefinedin §3.7.1and §3.7.2isthatofinorder traversalwherethevalueofthecurrentnodeisyieldedinbetweentraversing theleftsubtreeandtherightsubtree.Anexampleofinordertraversalisshown inFigure3.5.

Figure3.5:Inordervisitbinarysearchtreeexample

1) algorithm Inorder(root)

2) Pre: root istherootnodeoftheBST

3) Post: thenodesintheBSThavebeenvisitedininorder

4) if root = ∅

5)Inorder(root.Left)

6) yield root.Value

7)Inorder(root.Right)

8) endif

9) end Inorder

Oneofthebeautiesofinordertraversalisthatvaluesareyieldedintheir comparisonorder.Inotherwords,whentraversingapopulatedBSTwiththe inorderstrategy,theyieldedsequencewouldhaveproperty xi ≤ xi+1∀i.

CHAPTER3.BINARYSEARCHTREE 29

3.7.4BreadthFirst

Traversingatreeinbreadthfirstorderyieldsthevaluesofallnodesofaparticulardepthinthetreebeforeanydeeperones.Inotherwords,givenadepth d wewouldvisitthevaluesofallnodesat d inalefttorightfashion,thenwe wouldproceedto d +1andsoonuntilwehadenomorenodestovisit.An exampleofbreadthfirsttraversalisshowninFigure3.6.

Traditionallybreadthfirsttraversalisimplementedusingalist(vector,resizeablearray,etc)tostorethevaluesofthenodesvisitedinbreadthfirstorder andthenaqueuetostorethosenodesthathaveyettobevisited.

CHAPTER3.BINARYSEARCHTREE 30
Figure3.6:BreadthFirstvisitbinarysearchtreeexample

1) algorithm BreadthFirst(root)

2) Pre: root istherootnodeoftheBST

3) Post: thenodesintheBSThavebeenvisitedinbreadthfirstorder

4) q ← queue

5) while root = ∅

6) yield root.Value

7) if root.Left = ∅

8) q.Enqueue(root.Left)

9) endif 10) if root.Right = ∅ 11) q.Enqueue(root.Right) 12) endif 13) if !q.IsEmpty()

14) root ← q.Dequeue()

15) else

16) root ←∅

17) endif 18) endwhile 19) end BreadthFirst

3.8Summary

Abinarysearchtreeisagoodsolutionwhenyouneedtorepresenttypesthatare orderedaccordingtosomecustomrulesinherenttothattype.Withlogarithmic insertion,lookup,anddeletionitisveryeffecient.Traversalremainslinear,but therearemanywaysinwhichyoucanvisitthenodesofatree.Treesare recursivedatastructures,sotypicallyyouwillfindthatmanyalgorithmsthat operateonatreearerecursive.

Theruntimespresentedinthischapterarebasedonaprettybigassumption -thatthebinarysearchtree’sleftandrightsubtreesarereasonablybalanced. Wecanonlyattainlogarithmicruntimesforthealgorithmspresentedearlier whenthisistrue.Abinarysearchtreedoesnotenforcesuchaproperty,and theruntimesfortheseoperationsonapathologicallyunbalancedtreebecome linear:suchatreeiseffectivelyjustalinkedlist.Laterin §7wewillexamine anAVLtreethatenforcesself-balancingpropertiestohelpattainlogarithmic runtimes.

CHAPTER3.BINARYSEARCHTREE 31

Chapter4

Heap

Aheapcanbethoughtofasasimpletreedatastructure,howeveraheapusually employsoneoftwostrategies:

1. minheap;or

2. maxheap

Eachstrategydeterminesthepropertiesofthetreeanditsvalues.Ifyou weretochoosetheminheapstrategytheneachparentnodewouldhaveavalue thatis ≤ thanitschildren.Forexample,thenodeattherootofthetreewill havethesmallestvalueinthetree.Theoppositeistrueforthemaxheap strategy.Inthisbookyoushouldassumethataheapemploystheminheap strategyunlessotherwisestated.

Unlikeothertreedatastructuresliketheonedefinedin §3aheapisgenerally implementedasanarrayratherthanaseriesofnodeswhicheachhavereferencestoothernodes.Thenodesareconceptuallythesame,however,havingat mosttwochildren.Figure4.1showshowthetree(notaheapdatastructure) (127(32)6(9))wouldberepresentedasanarray.ThearrayinFigure4.1isa resultofsimplyaddingvaluesinatop-to-bottom,left-to-rightfashion.Figure 4.2showsarrowstothedirectleftandrightchildofeachvalueinthearray.

Thischapterisverymuchcentredaroundthenotionofrepresentingatreeas anarrayandbecausethispropertyiskeytounderstandingthischapterFigure 4.3showsastepbystepprocesstorepresentatreedatastructureasanarray. InFigure4.3youcanassumethatthedefaultcapacityofourarrayiseight.

Usingjustanarrayisoftennotsufficientaswehavetobeupfrontaboutthe sizeofthearraytousefortheheap.Oftentheruntimebehaviourofaprogram canbeunpredictablewhenitcomestothesizeofitsinternaldatastructures, soweneedtochooseamoredynamicdatastructurethatcontainsthefollowing properties:

1. wecanspecifyaninitialsizeofthearrayforscenarioswhereweknowthe upperstoragelimitrequired;and

2. thedatastructureencapsulatesresizingalgorithmstogrowthearrayas requiredatruntime

32

Figure4.1doesnotspecifyhowwewouldhandleaddingnullreferencesto theheap.Thisvariesfromcasetocase;sometimesnullvaluesareprohibited entirely;inothercaseswemaytreatthemasbeingsmallerthananynon-null value,orindeedgreaterthananynon-nullvalue.Youwillhavetoresolvethis ambiguityyourselfhavingstudiedyourrequirements.Forthesakeofclaritywe willavoidtheissuebyprohibitingnullvalues.

Becauseweareusinganarrayweneedsomewaytocalculatetheindexofa parentnode,andthechildrenofanode.Therequiredexpressionsforthisare definedasfollowsforanodeat index:

1. (index 1)/2(parentindex)

2. 2 ∗ index +1(leftchild)

3. 2 ∗ index +2(rightchild)

InFigure4.4a)representsthecalculationoftherightchildof12(2 ∗ 0+2); andb)calculatestheindexoftheparentof3((3 1)/2).

4.1Insertion

Designinganalgorithmforheapinsertionissimple,butwemustensurethat heaporderispreservedaftereachinsertion.Generallythisisapost-insertion operation.Insertingavalueintothenextfreeslotinanarrayissimple:wejust needtokeeptrackofthenextfreeindexinthearrayasacounter,andincrement itaftereachinsertion.Insertingourvalueintotheheapisthefirstpartofthe algorithm;thesecondisvalidatingheaporder.Inthecaseofmin-heapordering thisrequiresustoswapthevaluesofaparentanditschildifthevalueofthe childis < thevalueofitsparent.Wemustdothisforeachsubtreecontaining thevaluewejustinserted.

CHAPTER4.HEAP 33
Figure4.1:Arrayrepresentationofasimpletreedatastructure Figure4.2:Directchildrenofthenodesinanarrayrepresentationofatreedata structure 1. Vector 2. ArrayList 3. List
CHAPTER4.HEAP 34
Figure4.3:Convertingatreedatastructuretoitsarraycounterpart

Theruntimeefficiencyforheapinsertionis O(logn).Theruntimeisa byproductofverifyingheaporderasthefirstpartofthealgorithm(theactual insertionintothearray)is O(1).

Figure4.5showsthestepsofinsertingthevalues3,9,12,7,and1intoa min-heap.

CHAPTER4.HEAP 35
Figure4.4:Calculatingnodeproperties
CHAPTER4.HEAP 36
Figure4.5:Insertingvaluesintoamin-heap

1) algorithm Add(value)

2) Pre: value isthevaluetoaddtotheheap

3)Countisthenumberofitemsintheheap

4) Post: thevaluehasbeenaddedtotheheap

5) heap[Count] ← value

6)Count ← Count+1

7)MinHeapify()

8) end Add

1) algorithm MinHeapify()

2) Pre: Countisthenumberofitemsintheheap

3) heap isthearrayusedtostoretheheapitems

4) Post: theheaphaspreservedminheapordering

5) i ← Count 1

6) while i> 0 and heap[i] <heap[(i 1)/2]

7)Swap(heap[i], heap[(i 1)/2]

8) i ← (i 1)/2

9) endwhile

10) end MinHeapify

Thedesignofthe MaxHeapify algorithmisverysimilartothatofthe MinHeapify algorithm,theonlydifferenceisthatthe < operatorinthesecond conditionofenteringthewhileloopischangedto >

4.2Deletion

Justasforinsertion,deletinganiteminvolvesensuringthatheaporderingis preserved.Thealgorithmfordeletionhasthreesteps:

1. findtheindexofthevaluetodelete

2. putthelastvalueintheheapattheindexlocationoftheitemtodelete

3. verifyheaporderingforeachsubtreewhichusedtoincludethevalue

CHAPTER4.HEAP 37

1) algorithm Remove(value)

2) Pre: value isthevaluetoremovefromtheheap

3) left,and right areupdatedalias’for2 ∗ index +1,and2 ∗ index +2respectively

4)Countisthenumberofitemsintheheap

5) heap isthearrayusedtostoretheheapitems

6) Post: value islocatedintheheapandremoved,true;otherwisefalse

7)//step1

8) index ← FindIndex(heap, value)

9) if index< 0

10) returnfalse

11) endif

12)Count ← Count 1

13)//step2

14) heap[index] ← heap[Count]

15)//step3

16) while left< Count and heap[index] >heap[left] or heap[index] >heap[right]

17)//promotesmallestkeyfromsubtree

18) if heap[left] <heap[right]

19)Swap(heap, left, index)

20) index ← left

21) else

22)Swap(heap, right, index)

23) index ← right

24) endif

25) endwhile

26) returntrue

27) end Remove

Figure4.6showsthe Remove algorithmvisually,removing1fromaheap containingthevalues1,3,9,12,and13.InFigure4.6youcanassumethatwe havespecifiedthatthebackingarrayoftheheapshouldhaveaninitialcapacity ofeight.

Pleasenotethatinourdeletionalgorithmthatwedon’tdefaulttheremoved valueinthe heap array.Ifyouareusingaheapforreferencetypes,i.e.objects thatareallocatedonaheapyouwillwanttofreethatmemory.Thisisimportant inbothunmanaged,andmanagedlanguages.Inthelatterwewillwanttonull thatemptyholesothatthegarbagecollectorcanreclaimthatmemory.Ifwe weretonotnullthatholethentheobjectcouldstillbereachedandthuswon’t begarbagecollected.

4.3Searching

Searchingaheapismerelyamatteroftraversingtheitemsintheheaparray sequentially,sothisoperationhasaruntimecomplexityof O(n).Thesearch canbethoughtofasonethatusesabreadthfirsttraversalasdefinedin §3.7.4 tovisitthenodeswithintheheaptocheckforthepresenceofaspecifieditem.

CHAPTER4.HEAP 38
CHAPTER4.HEAP 39
Figure4.6:Deletinganitemfromaheap

Theproblemwiththepreviousalgorithmisthatwedon’ttakeadvantage ofthepropertiesinwhichallvaluesofaheaphold,thatisthepropertyofthe heapstrategybeingused.Forinstanceifwehadaheapthatdidn’tcontainthe value4wewouldhavetoexhaustthewholebackingheaparraybeforewecould determinethatitwasn’tpresentintheheap.Factoringinwhatweknowabout theheapwecanoptimisethesearchalgorithmbyincludinglogicwhichmakes useofthepropertiespresentedbyacertainheapstrategy.

Optimisingtodeterministicallystatethatavalueisintheheapisnotthat straightforward,howevertheproblemisaveryinterestingone.Asanexample consideramin-heapthatdoesn’tcontainthevalue5.Wecanonlyrulethatthe valueisnotintheheapif5 > theparentofthecurrentnodebeinginspected and < thecurrentnodebeinginspected ∀ nodesatthecurrentlevelweare traversing.Ifthisisthecasethen5cannotbeintheheapandsowecan provideananswerwithouttraversingtherestoftheheap.Ifthispropertyis notsatisfiedforanylevelofnodesthatweareinspectingthenthealgorithm willindeedfallbacktoinspectingallthenodesintheheap.Theoptimisation thatwepresentcanbeverycommonandsowefeelthattheextralogicwithin theloopisjustifiedtopreventtheexpensiveworsecaseruntime.

Thefollowingalgorithmisspecificallydesignedforamin-heap.Totailorthe algorithmforamax-heapthetwocomparisonoperationsinthe elseif condition withintheinner while loopshouldbeflipped.

CHAPTER4.HEAP 40 1) algorithm Contains(value) 2) Pre: value isthevaluetosearchtheheapfor 3)Countisthenumberofitemsintheheap 4) heap isthearrayusedtostoretheheapitems 5) Post: value islocatedintheheap,inwhichcasetrue;otherwisefalse 6) i ← 0 7) while i< Count and heap[i] = value 8) i ← i +1 9) endwhile 10) if i< Count 11) returntrue 12) else 13) returnfalse 14) endif 15) end Contains

1) algorithm Contains(value)

2) Pre: value isthevaluetosearchtheheapfor

3)Countisthenumberofitemsintheheap

4) heap isthearrayusedtostoretheheapitems

5) Post: value islocatedintheheap,inwhichcasetrue;otherwisefalse

6) start ← 0

7) nodes ← 1

Thenew Contains algorithmdeterminesifthe value isnotintheheapby checkingwhether count = nodes.Insuchaneventwherethisistruethenwe canconfirmthat ∀ nodes n atlevel i : value> Parent(n), value<n thusthere isnopossiblewaythat value isintheheap.AsanexampleconsiderFigure4.7. Ifwearesearchingforthevalue10withinthemin-heapdisplayeditisobvious thatwedon’tneedtosearchthewholeheaptodetermine9isnotpresent.We canverifythisaftertraversingthenodesinthesecondleveloftheheapasthe previousexpressiondefinedholdstrue.

4.4Traversal

Asmentionedin §4.3traversalofaheapisusuallydonelikethatofanyother arraydatastructurewhichourheapimplementationisbasedupon.Asaresult youtraversethearraystartingattheinitialarrayindex(0inmostlanguages) andthenvisiteachvaluewithinthearrayuntilyouhavereachedtheupper boundoftheheap.Youwillnotethatinthesearchalgorithmthatweuse Count asthisupperboundratherthantheactualphysicalboundoftheallocated array. Count isusedtopartitiontheconceptualheapfromtheactualarray implementationoftheheap:weonlycareabouttheitemsintheheap,notthe wholearray—thelattermaycontainvariousotherbitsofdataasaresultof heapmutation.

CHAPTER4.HEAP 41
11)
12)
13)
14)
15)
16)
17)
18) start
start
19) endwhile 20) if count = nodes 21) returnfalse 22) endif 23) nodes
nodes
24) endwhile 25) returnfalse 26) end Contains
8) while start< Count 9) start ← nodes 1 10) end ← nodes + start
count ← 0
while start< Count and start<end
if value = heap[start]
returntrue
elseif value> Parent(heap[start]) and value<heap[start]
count ← count +1
endif
+1
∗ 2

Ifyouhavefollowedtheadvicewegaveinthedeletionalgorithmthena heapthathasbeenmutatedseveraltimeswillcontainsomeformofdefault valueforitemsnolongerintheheap.Potentiallyyouwillhaveatmost LengthOf (heapArray) Count garbagevaluesinthebackingheaparraydata structure.Thegarbagevaluesofcoursevaryfromplatformtoplatform.To makethingssimplethegarbagevalueofareferencetypewillbesimple ∅ and0 foravaluetype.

Figure4.8showsaheapthatyoucanassumehasbeenmutatedmanytimes. Forthisexamplewecanfurtherassumethatatsomepointtheitemsinindexes 3 5actuallycontainedreferencestoliveobjectsoftype T .InFigure4.8 subscriptisusedtodisambiguateseparateobjectsof T

Fromwhatyouhavereadthusfaryouwillmostlikelyhavepickedupthat traversingtheheapinanyotherorderwouldbeoflittlebenefit.Theheap propertyonlyholdsforthesubtreeofeachnodeandsotraversingaheapin anyotherfashionrequiressomecreativeintervention.Heapsarenotusually traversedinanyotherwaythantheoneprescribedpreviously.

4.5Summary

Heapsaremostcommonlyusedtoimplementpriorityqueues(see §6.2fora sampleimplementation)andtofacilitateheapsort.Asdiscussedinboththe insertion §4.1anddeletion §4.2sectionsaheapmaintainsheaporderaccording totheselectedorderingstrategy.Thesestrategiesarereferredtoasmin-heap,

CHAPTER4.HEAP 42
Figure4.7:Determining10isnotintheheapafterinspectingthenodesofLevel 2 Figure4.8:Livinganddeadspaceintheheapbackingarray

andmaxheap.Theformerstrategyenforcesthatthevalueofaparentnodeis lessthanthatofeachofitschildren,thelatterenforcesthatthevalueofthe parentisgreaterthanthatofeachofitschildren.

Whenyoucomeacrossaheapandyouarenottoldwhatstrategyitenforces youshouldassumethatitusesthemin-heapstrategy.Iftheheapcanbe configuredotherwise,e.g.tousemax-heapthenthiswilloftenrequireyouto statethisexplicitly.Theheapabidesprogressivelytoastrategyduringthe invocationoftheinsertion,anddeletionalgorithms.Thecostofsuchapolicyis thatuponeachinsertionanddeletionweinvokealgorithmsthathavelogarithmic runtimecomplexities.Whilethecostofmaintainingthestrategymightnot seemoverlyexpensiveitdoesstillcomeataprice.Wewillalsohavetofactor inthecostofdynamicarrayexpansionatsomestage.Thiswilloccurifthe numberofitemswithintheheapoutgrowsthespaceallocatedintheheap’s backingarray.Itmaybeinyourbestinteresttoresearchagoodinitialstarting sizeforyourheaparray.Thiswillassistinminimisingtheimpactofdynamic arrayresizing.

CHAPTER4.HEAP 43

Chapter5

Sets

Asetcontainsanumberofvalues,innoparticularorder.Thevalueswithin thesetaredistinctfromoneanother.

Generallysetimplementationstendtocheckthatavalueisnotintheset beforeaddingit,avoidingtheissueofrepeatedvaluesfromeveroccurring.

Thissectiondoesnotcoversettheoryindepth;ratheritdemonstratesbriefly thewaysinwhichthevaluesofsetscanbedefined,andcommonoperationsthat maybeperformeduponthem.

Thenotation A = {4, 7, 9, 12, 0} definesaset A whosevaluesarelistedwithin thecurlybraces.

Giventheset A definedpreviouslywecansaythat4isamemberof A denotedby4 ∈ A,andthat99isnotamemberof A denotedby99 / ∈ A

Oftendefiningasetbymanuallystatingitsmembersistiresome,andmore importantlythesetmaycontainalargenumberofvalues.Amoreconciseway ofdefiningasetanditsmembersisbyprovidingaseriesofpropertiesthatthe valuesofthesetmustsatisfy.Forexample,fromthedefinition A = {x|x> 0,x %2=0} theset A containsonlypositiveintegersthatareeven. x isan aliastothecurrentvalueweareinspectingandtotherighthandsideof | are thepropertiesthat x mustsatisfytobeintheset A.Inthisexample, x must be > 0,andtheremainderofthearithmeticexpression x/2mustbe0.Youwill beabletonotefromthepreviousdefinitionoftheset A thatthesetcancontain aninfinitenumberofvalues,andthatthevaluesoftheset A willbealleven integersthatareamemberofthenaturalnumbersset N,where N = {1, 2, 3,...}.

Finallyinthisbriefintroductiontosetswewillcoversetintersectionand union,bothofwhichareverycommonoperations(amongstmanyothers)performedonsets.Theunionsetcanbedefinedasfollows A ∪ B = {x | x ∈ Aorx ∈ B},andintersection A ∩ B = {x | x ∈ Aandx ∈ B}.Figure5.1 demonstratessetintersectionanduniongraphically.

Giventhesetdefinitions A = {1, 2, 3},and B = {6, 2, 9} theunionofthetwo setsis A ∪ B = {1, 2, 3, 6, 9},andtheintersectionofthetwosetsis A ∩ B = {2}

Bothsetunionandintersectionaresometimesprovidedwithintheframeworkassociatedwithmainstreamlanguages.Thisisthecasein.NET3.51 wheresuchalgorithmsexistasextensionmethodsdefinedinthetype System.Linq.Enumerable 2,asaresultDSAdoesnotprovideimplementationsof

1 http://www.microsoft.com/NET/

2 http://msdn.microsoft.com/en-us/library/system.linq.enumerable_members.aspx

44

thesealgorithms.Mostofthealgorithmsdefinedin System.Linq.Enumerable dealmainlywithsequencesratherthansetsexclusively.

Setunioncanbeimplementedasasimpletraversalofbothsetsaddingeach itemofthetwosetstoanewunionset.

1) algorithm Union(set1, set2)

2) Pre: set1,and set2 = ∅

3) union isaset

3) Post: Aunionof set1,and set2hasbeencreated

4) foreach item in set1

5) union.Add(item)

6) endforeach

7) foreach item in set2

8) union.Add(item)

9) endforeach

10) return union

11) end Union

Theruntimeofour Union algorithmis O(m + n)where m isthenumber ofitemsinthefirstsetand n isthenumberofitemsinthesecondset.This runtimeappliesonlytosetsthatexhibit O(1)insertions.

Setintersectionisalsotrivialtoimplement.Theonlymajorthingworth pointingoutaboutouralgorithmisthatwetraversethesetcontainingthe fewestitems.Wecandothisbecauseifwehaveexhaustedalltheitemsinthe smallerofthetwosetsthentherearenomoreitemsthataremembersofboth sets,thuswehavenomoreitemstoaddtotheintersectionset.

CHAPTER5.SETS 45
Figure5.1:a) A ∩ B;b) A ∪ B

1) algorithm Intersection(set1, set2)

2) Pre: set1,and set2 = ∅

3) intersection,and smallerSet aresets

3) Post: Anintersectionof set1,and set2hasbeencreated

4) if set1.Count <set2.Count

5) smallerSet ← set1

6) else

7) smallerSet ← set2

8) endif

9) foreach item in smallerSet

10) if set1.Contains(item) and set2.Contains(item)

11) intersection.Add(item)

12) endif 13) endforeach 14) return intersection 15) end Intersection

Theruntimeofour Intersection algorithmis O(n)where n isthenumber ofitemsinthesmallerofthetwosets.Justlikeour Union algorithmalinear runtimecanonlybeattainedwhenoperatingonasetwith O(1)insertion.

5.1Unordered

Setsinthegeneralsensedonotenforcetheexplicitorderingoftheirmembers.Forexamplethemembersof B = {6, 2, 9} conformtonoorderingscheme becauseitisnotrequired.

MostlibrariesprovideimplementationsofunorderedsetsandsoDSAdoes not;wesimplymentionitheretodisambiguatebetweenanunorderedsetand orderedset.

Wewillonlylookatinsertionforanunorderedsetandcoverbrieflywhya hashtableisanefficientdatastructuretouseforitsimplementation.

5.1.1Insertion

Anunorderedsetcanbeefficientlyimplementedusingahashtableasitsbacking datastructure.Asmentionedpreviouslyweonlyaddanitemtoasetifthat itemisnotalreadyintheset,sothebackingdatastructureweusemusthave aquicklookupandinsertionruntimecomplexity.

Ahashmapgenerallyprovidesthefollowing:

1. O(1)forinsertion

2. approaching O(1)forlookup

Theabovedependsonhowgoodthehashingalgorithmofthehashtable is,butmosthashtablesemployincrediblyefficientgeneralpurposehashing algorithmsandsotheruntimecomplexitiesforthehashtableinyourlibrary ofchoiceshouldbeverysimilarintermsofefficiency.

CHAPTER5.SETS 46

5.2Ordered

Anorderedsetissimilartoanunorderedsetinthesensethatitsmembersare distinct,butanorderedsetenforcessomepredefinedcomparisononeachofits memberstoproduceasetwhosemembersareorderedappropriately.

InDSA0.5andearlierweusedabinarysearchtree(definedin §3)asthe internalbackingdatastructureforourorderedset.Fromversions0.6onwards wereplacedthebinarysearchtreewithanAVLtreeprimarilybecauseAVLis balanced.

Theorderedsethasitsorderrealisedbyperforminganinordertraversal uponitsbackingtreedatastructurewhichyieldsthecorrectorderedsequence ofsetmembers.

BecauseanorderedsetinDSAissimplyawrapperforanAVLtreethat additionallyensuresthatthetreecontainsuniqueitemsyoushouldread §7to learnmoreabouttheruntimecomplexitiesassociatedwithitsoperations.

5.3Summary

Setsprovideawayofhavingacollectionofuniqueobjects,eitherorderedor unordered.

Whenimplementingaset(eitherorderedorunordered)itiskeytoselect thecorrectbackingdatastructure.Aswediscussedin §5.1.1becausewecheck firstiftheitemisalreadycontainedwithinthesetbeforeaddingitweneed thischecktobeasquickaspossible.Forunorderedsetswecanrelyontheuse ofahashtableandusethekeyofanitemtodeterminewhetherornotitis alreadycontainedwithintheset.Usingahashtablethischeckresultsinanear constantruntimecomplexity.Orderedsetscostalittlemoreforthischeck, howeverthelogarithmicgrowththatweincurbyusingabinarysearchtreeas itsbackingdatastructureisacceptable.

Anotherkeypropertyofsetsimplementedusingtheapproachwedescribeis thatbothhavefavourablyfastlook-uptimes.Justlikethecheckbeforeinsertion,forahashtablethisruntimecomplexityshouldbenearconstant.Ordered setsasdescribedin3performabinarychopateachstagewhensearchingfor theexistenceofanitemyieldingalogarithmicruntime.

Wecanusesetstofacilitatemanyalgorithmsthatwouldotherwisebealittle lessclearintheirimplementation.Forexamplein §11.4weuseanunordered settoassistintheconstructionofanalgorithmthatdeterminesthenumberof repeatedwordswithinastring.

CHAPTER5.SETS 47

Chapter6

Queues

Queuesareanessentialdatastructurethatarefoundinvastamountsofsoftwarefromusermodetokernelmodeapplicationsthatarecoretothesystem. Fundamentallytheyhonourafirstinfirstout(FIFO)strategy,thatistheitem firstputintothequeuewillbethefirstserved,theseconditemaddedtothe queuewillbethesecondtobeservedandsoon.

Atraditionalqueueonlyallowsyoutoaccesstheitematthefrontofthe queue;whenyouaddanitemtothequeuethatitemisplacedatthebackof thequeue.

Historicallyqueuesalwayshavethefollowingthreecoremethods:

Enqueue: placesanitematthebackofthequeue;

Dequeue: retrievestheitematthefrontofthequeue,andremovesitfromthe queue;

Peek: 1 retrievestheitematthefrontofthequeuewithoutremovingitfrom thequeue

Asanexampletodemonstratethebehaviourofaqueuewewillwalkthrough ascenariowherebyweinvokeeachofthepreviouslymentionedmethodsobservingthemutationsuponthequeuedatastructure.Thefollowinglistdescribes theoperationsperformeduponthequeueinFigure6.1:

1. Enqueue(10) 2. Enqueue(12) 3. Enqueue(9) 4. Enqueue(8) 5. Enqueue(3) 6. Dequeue() 7. Peek()
48
1 ThisoperationissometimesreferredtoasFront

6.1Astandardqueue

Aqueueisimplicitlylikethatdescribedpriortothissection.InDSAwedon’t provideastandardqueuebecausequeuesaresopopularandsuchacoredata structurethatyouwillfindprettymucheverymainstreamlibraryprovidesa queuedatastructurethatyoucanusewithyourlanguageofchoice.Inthis sectionwewilldiscusshowyoucan,ifrequired,implementanefficientqueue datastructure.

Themainpropertyofaqueueisthatwehaveaccesstotheitematthe frontofthequeue.Thequeuedatastructurecanbeefficientlyimplemented usingasinglylinkedlist(definedin §2.1).Asinglylinkedlistprovides O(1) insertionanddeletionruntimecomplexities.Thereasonwehavean O(1)run timecomplexityfordeletionisbecauseweonlyeverremoveitemsfromthefront ofqueues(withtheDequeueoperation).Sincewealwayshaveapointertothe itemattheheadofasinglylinkedlist,removalissimplyacaseofreturning thevalueoftheoldheadnode,andthenmodifyingtheheadpointertobethe nextnodeoftheoldheadnode.Theruntimecomplexityforsearchingaqueue remainsthesameasthatofasinglylinkedlist: O(n).

6.2PriorityQueue

Unlikeastandardqueuewhereitemsareorderedintermsofwhoarrivedfirst, apriorityqueuedeterminestheorderofitsitemsbyusingaformofcustom comparertoseewhichitemhasthehighestpriority.Otherthantheitemsina priorityqueuebeingorderedbypriorityitremainsthesameasanormalqueue: youcanonlyaccesstheitematthefrontofthequeue.

Asensibleimplementationofapriorityqueueistouseaheapdatastructure (definedin §4).Usingaheapwecanlookatthefirstiteminthequeuebysimply returningtheitematindex0withintheheaparray.Aheapprovidesuswiththe abilitytoconstructapriorityqueuewheretheitemswiththehighestpriority areeitherthosewiththesmallestvalue,orthosewiththelargest.

6.3DoubleEndedQueue

Unlikethequeueswehavetalkedaboutpreviouslyinthischapteradouble endedqueueallowsyoutoaccesstheitemsatboththefront,andbackofthe queue.Adoubleendedqueueiscommonlyknownasadequewhichisthename wewillhereoninrefertoitas.

Adequeappliesnoprioritizationstrategytoitsitemslikeapriorityqueue does,itemsareaddedinordertoeitherthefrontofbackofthedeque.The formerpropertiesofthedequearedenotedbytheprogrammerutilisingthedata structuresexposedinterface.

CHAPTER6.QUEUES 49
8. Enqueue(33) 9. Peek() 10. Dequeue()
CHAPTER6.QUEUES 50
Figure6.1:Queuemutations

Deque’sprovidefrontandbackspecificversionsofcommonqueueoperations, e.g.youmaywanttoenqueueanitemtothefrontofthequeueratherthan thebackinwhichcaseyouwoulduseamethodwithanamealongthelines of EnqueueFront.Thefollowinglistidentifiesoperationsthatarecommonly supportedbydeque’s:

• EnqueueFront

• EnqueueBack

• DequeueFront

• DequeueBack

• PeekFront

• PeekBack

Figure6.2showsadequeaftertheinvocationofthefollowingmethods(inorder):

1. EnqueueBack(12)

2. EnqueueFront(1)

3. EnqueueBack(23)

4. EnqueueFront(908)

5. DequeueFront()

6. DequeueBack()

Theoperationshaveaone-to-onetranslationintermsofbehaviourwith thoseofanormalqueue,orpriorityqueue.Insomecasesthesetofalgorithms thataddanitemtothebackofthedequemaybenamedastheyarewith normalqueues,e.g. EnqueueBack maysimplybecalled Enqueue ansoon.Some frameworksalsospecifyexplicitbehaviour’sthatdatastructuresmustadhereto. Thisiscertainlythecasein.NETwheremostcollectionsimplementaninterface whichrequiresthedatastructuretoexposeastandard Add method.Insuch ascenarioyoucansafelyassumethatthe Add methodwillsimplyenqueuean itemtothebackofthedeque.

Withrespecttoalgorithmicruntimecomplexitiesadequeisthesameas anormalqueue.Thatisenqueueinganitemtothebackofathequeueis O(1),additionallyenqueuinganitemtothefrontofthequeueisalsoan O(1) operation.

Adequeisawrapperdatastructurethatuseseitheranarray,oradoubly linkedlist.Usinganarrayasthebackingdatastructurewouldrequiretheprogrammertobeexplicitaboutthesizeofthearrayupfront,thiswouldprovide anobviousadvantageiftheprogrammercoulddeterministicallystatethemaximumnumberofitemsthedequewouldcontainatanyonetime.Unfortunately inmostcasesthisdoesn’thold,asaresultthebackingarraywillinherently incurtheexpenseofinvokingaresizingalgorithmwhichwouldmostlikelybe an O(n)operation.Suchanapproachwouldalsoleavethelibrarydeveloper

CHAPTER6.QUEUES 51
CHAPTER6.QUEUES 52
Figure6.2:Dequedatastructureafterseveralmutations

tolookatarrayminimizationtechniquesaswell,itcouldbethatafterseveral invocationsoftheresizingalgorithmandvariousmutationsonthedequelater thatwehaveanarraytakingupaconsiderableamountofmemoryyetweare onlyusingafewsmallpercentageofthatmemory.Analgorithmdescribed wouldalsobe O(n)yetitsinvocationwouldbehardertogaugestrategically.

Tobypassalltheaforementionedissuesadequetypicallyusesadoubly linkedlistasitsbakingdatastructure.Whileanodethathastwopointers consumesmorememorythanitsarrayitemcounterpartitmakesredundantthe needforexpensiveresizingalgorithmsasthedatastructureincreasesinsize dynamically.Withalanguagethattargetsagarbagecollectedvirtualmachine memoryreclamationisanopaqueprocessasthenodesthatarenolongerreferencedbecomeunreachableandarethusmarkedforcollectionuponthenext invocationofthegarbagecollectionalgorithm.WithC++oranyotherlanguagethatusesexplicitmemoryallocationanddeallocationitwillbeuptothe programmertodecidewhenthememorythatstorestheobjectcanbefreed.

6.4Summary

Withnormalqueueswehaveseenthatthosewhoarrivefirstaredealtwithfirst; thatistheyaredealtwithinafirst-in-first-out(FIFO)order.Queuescanbe eversouseful;forexampletheWindowsCPUschedulerusesadifferentqueue foreachpriorityofprocesstodeterminewhichshouldbethenextprocessto utilisetheCPUforaspecifiedtimequantum.Normalqueueshaveconstant insertionanddeletionruntimes.Searchingaqueueisfairlyunusual—typically youareonlyinterestedintheitematthefrontofthequeue.Despitethat, searchingisusuallyexposedonqueuesandtypicallytheruntimeislinear.

Inthischapterwehavealsoseenpriorityqueueswherethoseatthefront ofthequeuehavethehighestpriorityandthosenearthebackhavethelowest. Oneimplementationofapriorityqueueistouseaheapdatastructureasits backingstore,sotheruntimesforinsertion,deletion,andsearchingarethe sameasthoseforaheap(definedin §4).

Queuesareaverynaturaldatastructure,andwhiletheyarefairlyprimitive theycanmakemanyproblemsalotsimpler.Forexamplethebreadthfirst searchdefinedin §3.7.4makesextensiveuseofqueues.

CHAPTER6.QUEUES 53

Chapter7

AVLTree

Intheearly60’sG.M.Adelson-VelskyandE.M.Landisinventedthefirstselfbalancingbinarysearchtreedatastructure,callingitAVLTree.

AnAVLtreeisabinarysearchtree(BST,definedin §3)withaself-balancing conditionstatingthatthedifferencebetweentheheightoftheleftandright subtreescannotbenomorethanone,seeFigure7.1.Thiscondition,restored aftereachtreemodification,forcesthegeneralshapeofanAVLtree.Before continuing,letusfocusonwhybalanceissoimportant.Considerabinary searchtreeobtainedbystartingwithanemptytreeandinsertingsomevalues inthefollowingorder1,2,3,4,5.

TheBSTinFigure7.2representstheworstcasescenarioinwhichtherunningtimeofallcommonoperationssuchassearch,insertionanddeletionare O(n).Byapplyingabalanceconditionweensurethattheworstcaserunning timeofeachcommonoperationis O(logn).TheheightofanAVLtreewith n nodesis O(logn)regardlessoftheorderinwhichvaluesareinserted.

TheAVLbalancecondition,knownalsoasthenodebalancefactorrepresents anadditionalpieceofinformationstoredforeachnode.Thisiscombinedwith atechniquethatefficientlyrestoresthebalanceconditionforthetree.Inan AVLtreetheinventorsmakeuseofawell-knowntechniquecalledtreerotation. h h+1

54
Figure7.1:TheleftandrightsubtreesofanAVLtreedifferinheightbyat most1
CHAPTER7.AVLTREE 55
2 4 5 1 3 4 5 3 2 1
Figure7.2:Unbalancedbinarysearchtree
a) b)
Figure7.3:Avltrees,insertionorder:-a)1,2,3,4,5-b)1,5,4,3,2

7.1TreeRotations

Atreerotationisaconstanttimeoperationonabinarysearchtreethatchanges theshapeofatreewhilepreservingstandardBSTproperties.Thereareleftand rightrotationsbothofthemdecreasetheheightofaBSTbymovingsmaller subtreesdownandlargersubtreesup.

56
CHAPTER7.AVLTREE
Figure7.4:Treeleftandrightrotations

1) algorithm LeftRotation(node)

2) Pre: node.Right!= ∅

3) Post: node.Rightisthenewrootofthesubtree,

4) node hasbecome node.Right’sleftchildand, 5)BSTpropertiesarepreserved

6) RightNode ← node.Right

7) node.Right ← RightNode.Left

8) RightNode.Left ← node

9) end LeftRotation

1) algorithm RightRotation(node)

2) Pre: node.Left!= ∅

3) Post: node.Leftisthenewrootofthesubtree,

4) node hasbecome node.Left’srightchildand,

5)BSTpropertiesarepreserved

6) LeftNode ← node.Left

7) node.Left ← LeftNode.Right

8) LeftNode.Right ← node

9) end RightRotation

Therightandleftrotationalgorithmsaresymmetric.Onlypointersare changedbyarotationresultinginan O(1)runtimecomplexity;theotherfields presentinthenodesarenotchanged.

7.2TreeRebalancing

Thealgorithmthatwepresentinthissectionverifiesthattheleftandright subtreesdifferatmostinheightby1.Ifthispropertyisnotpresentthenwe performthecorrectrotation.

Noticethatweusetwonewalgorithmsthatrepresentdoublerotations. Thesealgorithmsarenamed LeftAndRightRotation,and RightAndLeftRotation Thealgorithmsareselfdocumentingintheirnames,e.g. LeftAndRightRotation firstperformsaleftrotationandthensubsequentlyarightrotation.

CHAPTER7.AVLTREE 57

1) algorithm CheckBalance(current)

2) Pre: current isthenodetostartfrombalancing

3) Post: current heighthasbeenupdatedwhiletreebalanceisifneeded

4)restoredthroughrotations

5) if current.Left= ∅ and current.Right= ∅

6) current.Height=-1;

7) else

8) current.Height=Max(Height(current.Left),Height(current.Right))+1

9) endif

10) if Height(current.Left)- Height(current.Right) > 1

11) if Height(current.Left.Left)- Height(current.Left.Right) > 0

12)RightRotation(current)

13) else

14)LeftAndRightRotation(current)

15) endif

16) elseif Height(current.Left)- Height(current.Right) < 1

17) if Height(current.Right.Left)- Height(current.Right.Right) < 0

18)LeftRotation(current)

19) else

20)RightAndLeftRotation(current)

21) endif

22) endif

23) end CheckBalance

7.3Insertion

AVLinsertionoperatesfirstbyinsertingthegivenvaluethesamewayasBST insertionandthenbyapplyingrebalancingtechniquesifnecessary.Thelatter isonlyperformediftheAVLpropertynolongerholds,thatistheleftandright subtreesheightdifferbymorethan1.EachtimeweinsertanodeintoanAVL tree:

1. Wegodownthetreetofindthecorrectpointatwhichtoinsertthenode, inthesamemannerasforBSTinsertion;then

2. wetravelupthetreefromtheinsertednodeandcheckthatthenode balancingpropertyhasnotbeenviolated;ifthepropertyhasn’tbeen violatedthenweneednotrebalancethetree,theoppositeistrueifthe balancingpropertyhasbeenviolated.

CHAPTER7.AVLTREE 58

1) algorithm Insert(value)

2) Pre: value haspassedcustomtypechecksfortype T

3) Post: value hasbeenplacedinthecorrectlocationinthetree

4) if root = ∅

5) root ← node(value)

6) else

7)InsertNode(root, value)

8) endif

9) end Insert

1) algorithm InsertNode(current, value)

2) Pre: current isthenodetostartfrom

3) Post: value hasbeenplacedinthecorrectlocationinthetreewhile

4)preservingtreebalance

5) if value<current.Value

6) if current.Left= ∅

7) current.Left ← node(value)

8) else

9)InsertNode(current.Left, value)

10) endif

11) else

12) if current.Right= ∅

13) current.Right ← node(value)

14) else

15)InsertNode(current.Right, value)

16) endif

17) endif

18)CheckBalance(current)

19) end InsertNode

7.4Deletion

OurbalancingalgorithmisliketheonepresentedforourBST(definedin §3.3). Themajordifferenceisthatwehavetoensurethatthetreestilladherestothe AVLbalancepropertyaftertheremovalofthenode.Ifthetreedoesn’tneed toberebalancedandthevalueweareremovingiscontainedwithinthetree thennofurthersteparerequired.However,whenthevalueisinthetreeand itsremovalupsetstheAVLbalancepropertythenwemustperformthecorrect rotation(s).

CHAPTER7.AVLTREE 59

1) algorithm Remove(value)

2) Pre: value isthevalueofthenodetoremove, root istherootnode

3)oftheAvl

4) Post: nodewith value isremovedandtreerebalancediffoundinwhich

5)caseyieldstrue,otherwisefalse

6) nodeToRemove ← root

7) parent ←∅

8) Stackpath ← root 9) while nodeToRemove = ∅ and nodeToRemove.Value = Value

parent = nodeToRemove

if value<nodeToRemove.Value

nodeToRemove ← nodeToRemove.Right

endif 16)path.Push(nodeToRemove) 17) endwhile

if nodeToRemove = ∅ 19) returnfalse //valuenotinAvl 20) endif

parent ← FindParent(value)

if count =1// count keepstrackofthe#ofnodesintheAvl

root ←∅ //weareremovingtheonlynodeintheAvl

elseif nodeToRemove.Left= ∅ and nodeToRemove.Right= null

CHAPTER7.AVLTREE 60
10)
11)
12)
13)
14)
15)
21)
23)
24)
25)//case#1 26)
27)
28) else 29) parent
30) endif 31) elseif
32)//case#2 33)
34) parent
nodeToRemove
35) else 36) parent
nodeToRemove
37) endif 38) elseif nodeToRemove
nodeToRemove
∅ 39)//case#3 40) if nodeToRemove
.Value 41) parent.Left ← nodeToRemove.Left 42) else 43) parent
← nodeToRemove.Left 44) endif 45) else 46)//case#4 47) largestValue
nodeToRemove
48) while largestValue
∅ 49)//findthelargestvalueintheleftsubtreeof nodeToRemove 50) largestValue ← largestValue.Right
nodeToRemove ← nodeToRemove.Left
else
18)
22)
if nodeToRemove.Value <parent.Value
parent.Left ←∅
.Right ←∅
nodeToRemove.Left= ∅ and nodeToRemove.Right = ∅
if nodeToRemove.Value <parent.Value
.Left ←
.Right
.Right ←
.Right
.Left =
and
.Right=
.Value <parent
.Right
.Left
.Right =

51) endwhile

52)//settheparents’Rightpointerof largestValue to ∅

53)FindParent(largestValue.Value).Right ←∅

54) nodeToRemove.Value ← largestValue.Value

55) endif

56) while path.Count> 0

57)CheckBalance(path.Pop())//wetrackbacktotherootnodecheckbalance

58) endwhile

59) count ← count 1

60) returntrue

61) end Remove

7.5Summary

TheAVLtreeisasophisticatedselfbalancingtree.Itcanbethoughtofas thesmarter,youngerbrotherofthebinarysearchtree.Unlikeitsolderbrother theAVLtreeavoidsworstcaselinearcomplexityruntimesforitsoperations.

TheAVLtreeguaranteesviatheenforcementofbalancingalgorithmsthatthe leftandrightsubtreesdifferinheightbyatmost1whichyieldsatmosta logarithmicruntimecomplexity.

CHAPTER7.AVLTREE 61

PartII Algorithms

62

Chapter8

Sorting

Allthesortingalgorithmsinthischapterusedatastructuresofaspecifictype todemonstratesorting,e.g.a32bitintegerisoftenusedasitsassociated operations(e.g. <, >,etc)areclearintheirbehaviour.

Thealgorithmsdiscussedcaneasilybetranslatedintogenericsortingalgorithmswithinyourrespectivelanguageofchoice.

8.1BubbleSort

Oneofthemostsimpleformsofsortingisthatofcomparingeachitemwith everyotheriteminsomelist,howeverasthedescriptionmayimplythisform ofsortingisnotparticularlyeffecient O(n2).Init’smostsimpleformbubble sortcanbeimplementedastwoloops.

1) algorithm BubbleSort(list)

2) Pre: list = ∅

3) Post: list hasbeensortedintovaluesofascendingorder

4) for i ← 0to list.Count 1

5) for j ← 0to list.Count 1

6) if list[i] <list[j]

7) Swap(list[i],list[j])

8) endif

9) endfor 10) endfor 11) return list

12) end BubbleSort

8.2MergeSort

MergesortisanalgorithmthathasafairlyefficientspacetimecomplexityO(nlogn)andisfairlytrivialtoimplement.Thealgorithmisbasedonsplitting alist,intotwosimilarsizedlists(left,and right)andsortingeachlistandthen mergingthesortedlistsbacktogether.

Note:thefunctionMergeOrderedsimplytakestwoorderedlistsandmakes themone.

63

1) algorithm Mergesort(list)

2) Pre: list = ∅

3) Post: list hasbeensortedintovaluesofascendingorder

4) if list.Count=1//alreadysorted

5) return list

6) endif

7) m ← list.Count / 2

8) left ← list(m)

9) right ← list(list.Count m)

10) for i ← 0to left.Count 1

11) left[i] ← list[i]

12) endfor

13) for i ← 0to right.Count 1

14) right[i] ← list[i]

15) endfor

16) left ← Mergesort(left)

17) right ← Mergesort(right)

18) return MergeOrdered(left, right)

19) end Mergesort

CHAPTER8.SORTING 64
Figure8.1:BubbleSortIterations

8.3QuickSort

Quicksortisoneofthemostpopularsortingalgorithmsbasedondivideet imperastrategy,resultinginan O(nlogn)complexity.Thealgorithmstartsby pickinganitem,calledpivot,andmovingallsmalleritemsbeforeit,whileall greaterelementsafterit.Thisisthemainquicksortoperation,calledpartition, recursivelyrepeatedonlesserandgreatersublistsuntiltheirsizeisoneorzero -inwhichcasethelistisimplicitlysorted.

Choosinganappropriatepivot,asforexamplethemedianelementisfundamentalforavoidingthedrasticallyreducedperformanceof O(n2).

CHAPTER8.SORTING 65
Figure8.2:MergeSortDivideetImperaApproach

1) algorithm QuickSort(list)

2) Pre: list = ∅

3) Post: list hasbeensortedintovaluesofascendingorder

4) if list.Count=1//alreadysorted

5) return list

6) endif

7) pivot ←MedianValue(list)

8) for i ← 0to list.Count 1

9) if list[i]= pivot

10) equal.Insert(list[i])

11) endif

12) if list[i] <pivot

13) less.Insert(list[i])

14) endif

15) if list[i] >pivot

16) greater.Insert(list[i])

17) endif

18) endfor

19) return Concatenate(QuickSort(less), equal,QuickSort(greater))

20) end Quicksort

CHAPTER8.SORTING 66
Figure8.3:QuickSortExample(pivotmedianstrategy)

8.4InsertionSort

Insertionsortisasomewhatinterestingalgorithmwithanexpensiveruntimeof O(n2).Itcanbebestthoughtofasasortingschemesimilartothatofsorting ahandofplayingcards,i.e.youtakeonecardandthenlookattherestwith theintentofbuildingupanorderedsetofcardsinyourhand.

1) algorithm Insertionsort(list)

2) Pre: list = ∅

3) Post: list hasbeensortedintovaluesofascendingorder

4) unsorted ← 1

5) while unsorted<list.Count

6) hold ← list[unsorted]

7) i ← unsorted 1

8) while i ≥ 0 and hold<list[i]

9) list[i +1] ← list[i]

10) i ← i 1

11) endwhile

12) list[i +1] ← hold

13) unsorted ← unsorted +1

14) endwhile 15) return list

16) end Insertionsort

CHAPTER8.SORTING 67
Figure8.4:InsertionSortIterations

8.5ShellSort

Putsimplyshellsortcanbethoughtofasamoreefficientvariationofinsertion sortasdescribedin §8.4,itachievesthismainlybycomparingitemsofvarying distancesapartresultinginaruntimecomplexityof O(nlog2 n).

Shellsortisfairlystraightforwardbutmayseemsomewhatconfusingat firstasitdiffersfromothersortingalgorithmsinthewayitselectsitemsto compare.Figure8.5showsshellsortbeingranonanarrayofintegers,thered colouredsquareisthecurrentvalueweareholding.

1) algorithm ShellSort(list)

2) Pre: list = ∅

3) Post: list hasbeensortedintovaluesofascendingorder

4) increment ← list.Count / 2

5) while increment =0

6) current ← increment

7) while current<list.Count

8) hold ← list[current]

9) i ← current increment 10) while i ≥ 0 and hold<list[i] 11) list[i + increment] ← list[i] 12) i = increment 13) endwhile 14) list[i + increment] ← hold 15) current ← current +1 16) endwhile 17) increment/ =2 18) endwhile 19) return list 20) end ShellSort

8.6RadixSort

Unlikethesortingalgorithmsdescribedpreviouslyradixsortusesbucketsto sortitems,eachbucketholdsitemswithaparticularpropertycalledakey. Normallyabucketisaqueue,eachtimeradixsortisperformedthesebuckets areemptiedstartingthesmallestkeybuckettothelargest.Whenlookingat itemswithinalisttosortwedosobyisolatingaspecifickey,e.g.intheexample weareabouttoshowwehaveamaximumofthreekeysforallitems,thatis thehighestkeyweneedtolookatishundreds.Becausewearedealingwith,in thisexamplebase10numberswehaveatanyonepoint10possiblekeyvalues 0..9eachofwhichhastheirownbucket.Beforeweshowyouthisfirstsimple versionofradixsortletusclarifywhatwemeanbyisolatingkeys.Giventhe number102ifwelookatthefirstkey,theonesthenwecanseewehavetwoof them,progressingtothenextkey-tenswecanseethatthenumberhaszero ofthem,finallywecanseethatthenumberhasasinglehundred.Thenumber usedasanexamplehasintotalthreekeys:

CHAPTER8.SORTING 68
CHAPTER8.SORTING 69
Figure8.5:Shellsort

Forfurtherclarificationwhatifwewantedtodeterminehowmanythousands thenumber102has?Clearlytherearenone,butoftenlookingatanumberas finallikeweoftendoitisnotsoobvioussowhenaskedthequestionhowmany thousandsdoes102haveyoushouldsimplypadthenumberwithazerointhat location,e.g.0102hereitismoreobviousthatthekeyvalueatthethousands locationiszero.

Thelastthingtoidentifybeforeweactuallyshowyouasimpleimplementationofradixsortthatworksononlypositiveintegers,andrequiresyouto specifythemaximumkeysizeinthelististhatweneedawaytoisolatea specifickeyatanyonetime.Thesolutionisactuallyverysimple,butitsnot oftenyouwanttoisolateakeyinanumbersowewillspellitoutclearly here.Akeycanbeaccessedfromanyintegerwiththefollowingexpression: key ← (number/keyToAccess)%10.Asasimpleexampleletssaythatwe wanttoaccessthetenskeyofthenumber1290,thetenscolumniskey10and soaftersubstitutionyields key ← (1290 / 10)%10=9.Thenextkeyto lookatforanumbercanbeattainedbymultiplyingthelastkeybytenworking lefttorightinasequentialmanner.Thevalueof key isusedinthefollowing algorithmtoworkouttheindexofanarrayofqueuestoenqueuetheiteminto.

1) algorithm Radix(list, maxKeySize)

2) Pre: list = ∅

3) maxKeySize ≥ 0andrepresentsthelargestkeysizeinthelist

4) Post: list hasbeensorted

5) queues ← Queue[10]

6) indexOfKey ← 1

7) fori ← 0 to maxKeySize 1

8) foreach item in list

9) queues[GetQueueIndex(item, indexOfKey)].Enqueue(item)

10) endforeach

11) list ← CollapseQueues(queues)

12)ClearQueues(queues)

13) indexOfKey ← indexOfKey ∗ 10

14) endfor

15) return list

16) end Radix

Figure8.6showsthemembersof queues fromthealgorithmdescribedabove operatingonthelistwhosemembersare90, 12, 8, 791, 123, and61,thekeywe areinterestedinforeachnumberishighlighted.OmittedqueuesinFigure8.6 meanthattheycontainnoitems.

8.7Summary

Throughoutthischapterwehaveseenmanydifferentalgorithmsforsorting lists,someareveryefficient(e.g.quicksortdefinedin §8.3),somearenot(e.g.

CHAPTER8.SORTING 70
1. Ones 2. Tens 3. Hundreds

Selectingthecorrectsortingalgorithmisusuallydenotedpurelybyefficiency, e.g.youwouldalwayschoosemergesortovershellsortandsoon.Thereare alsootherfactorstolookatthoughandthesearebasedontheactualimplementation.Somealgorithmsareverynicelyexpressedinarecursivefashion, howeverthesealgorithmsoughttobeprettyefficient,e.g.implementingalinear, quadratic,orsloweralgorithmusingrecursionwouldbeaverybadidea.

Ifyouwanttolearnmoreaboutwhyyoushouldbevery,verycarefulwhen implementingrecursivealgorithmsseeAppendixC.

CHAPTER8.SORTING 71
Figure8.6:Radixsortbase10algorithm bubblesortdefinedin §8.1).

Chapter9

Numeric

Unlessstatedotherwisethealias n denotesastandard32bitinteger.

9.1PrimalityTest

Asimplealgorithmthatdetermineswhetherornotagivenintegerisaprime number,e.g.2,5,7,and13are all primenumbers,however6isnotasitcan betheresultoftheproductoftwonumbersthatare < 6.

Inanattempttoslowdowntheinnerloopthe √n isusedastheupper bound.

1) algorithm IsPrime(n)

2) Post: n isdeterminedtobeaprimeornot

3) for i ← 2 to n do

4) for j ← 1 to sqrt(n) do

5) if i ∗ j = n

6) return false

7) endif

8) endfor

9) endfor

10) end IsPrime

9.2Baseconversions

DSAcontainsanumberofalgorithmsthatconvertabase10numbertoits equivalentbinary,octalorhexadecimalform.Forexample7810 hasabinary representationof10011102

Table9.1showsthealgorithmtracewhenthenumbertoconverttobinary is74210

72

9.3Attainingthegreatestcommondenominatoroftwonumbers

Afairlyroutineprobleminmathematicsisthatoffindingthegreatestcommon denominatoroftwointegers,whatweareessentiallyafteristhegreatestnumber whichisamultipleofboth,e.g.thegreatestcommondenominatorof9,and 15is3.OneofthemostelegantsolutionstothisproblemisbasedonEuclid’s algorithmthathasaruntimecomplexityof O(n2).

CHAPTER9.NUMERIC 73 1) algorithm ToBinary(n) 2) Pre: n ≥ 0 3) Post: n hasbeenconvertedintoitsbase2representation 4) while n> 0 5) list.Add(n %2) 6) n ← n/2 7) endwhile 8) return Reverse(list) 9) end ToBinary n list 742 { 0 } 371 { 0, 1 } 185 { 0, 1, 1 } 92 { 0, 1, 1, 0 } 46 { 0, 1, 1, 0, 1 } 23 { 0, 1, 1, 0, 1, 1 } 11 { 0, 1, 1, 0, 1, 1, 1 } 5 { 0, 1, 1, 0, 1, 1, 1, 1 } 2 { 0, 1, 1, 0, 1, 1, 1, 1, 0 } 1 { 0, 1, 1, 0, 1, 1, 1, 1, 0, 1 }
Table9.1:AlgorithmtraceofToBinary
1) algorithm GreatestCommonDenominator(m, n) 2) Pre: m and n areintegers 3) Post: thegreatestcommondenominatorofthetwointegersiscalculated 4) if n =0 5) return m 6) endif 7) return GreatestCommonDenominator(n, m % n) 8) end GreatestCommonDenominator

9.4ComputingthemaximumvalueforanumberofaspecificbaseconsistingofNdigits

Thisalgorithmcomputesthemaximumvalueofanumberforagivennumber ofdigits,e.g.usingthebase10systemthemaximumnumberwecanhave madeupof4digitsisthenumber999910.Similarlythemaximumnumberthat consistsof4digitsforabase2numberis11112 whichis1510

Theexpressionbywhichwecancomputethismaximumvaluefor N digits is: BN 1.Inthepreviousexpression B isthenumberbase,and N isthe numberofdigits.Asanexampleifwewantedtodeterminethemaximumvalue forahexadecimalnumber(base16)consistingof6digitstheexpressionwould beasfollows:166 1.Themaximumvalueofthepreviousexamplewouldbe representedas FFFFFF16 whichyields1677721510

Inthefollowingalgorithm numberBase shouldbeconsideredrestrictedto thevaluesof2,8,9,and16.Forthisreasoninouractualimplementation numberBase hasanenumerationtype.The Base enumerationtypeisdefined as:

Base = {Binary ← 2,Octal ← 8,Decimal ← 10,Hexadecimal ← 16}

Thereasonweprovidethedefinitionof Base istogiveyouanideahowthis algorithmcanbemodelledinamorereadablemannerratherthanusingvarious checkstodeterminethecorrectbasetouse.Forourimplementationwecastthe valueof numberBase toaninteger,assuchweextractthevalueassociatedwith therelevantoptioninthe Base enumeration.Asanexampleifweweretocast theoption Octal toanintegerwewouldgetthevalue8.Inthealgorithmlisted belowthecastisimplicitsowejustusetheactualargument numberBase

1) algorithm MaxValue(numberBase, n)

2) Pre: numberBase isthenumbersystemtouse, n isthenumberofdigits

3) Post: themaximumvaluefor numberBase consistingof n digitsiscomputed

4) return Power(numberBase,n) 1

5) end MaxValue

9.5Factorialofanumber

Attainingthefactorialofanumberisaprimitivemathematicaloperation.Many implementationsofthefactorialalgorithmarerecursiveastheproblemisrecursiveinnature,howeverherewepresentaniterativesolution.Theiterative solutionispresentedbecauseittooistrivialtoimplementanddoesn’tsuffer fromtheuseofrecursion(formoreonrecursionsee §C).

Thefactorialof0and1is0.Theaforementionedactsasabasecasethatwe willbuildupon.Thefactorialof2is2∗ thefactorialof1,similarlythefactorial of3is3∗ thefactorialof2andsoon.Wecanindicatethatweareafterthe factorialofanumberusingtheform N !where N isthenumberwewishto attainthefactorialof.Ouralgorithmdoesn’tusesuchnotationbutitishandy toknow.

CHAPTER9.NUMERIC 74

1) algorithm Factorial(n)

2) Pre: n ≥ 0, n isthenumbertocomputethefactorialof

3) Post: thefactorialof n iscomputed

4) if n< 2

9.6Summary

Inthischapterwehavepresentedseveralnumericalgorithms,mostofwhich aresimplyherebecausetheywerefuntodesign.Perhapsthemessagethat thereadershouldgainfromthischapteristhatalgorithmscanbeappliedto severaldomainstomakeworkinthatrespectivedomainattainable.Numeric algorithmsinparticulardrivesomeofthemostadvancedsystemsontheplanet computingsuchdataasweatherforecasts.

CHAPTER9.NUMERIC 75
5) return 1 6) endif 7) factorial ← 1 8) for i ← 2 to n 9) factorial ← factorial ∗ i 10) endfor 11) return factorial 12) end Factorial

Chapter10

Searching 10.1SequentialSearch

Asimplealgorithmthatsearchforaspecificiteminsidealist.Itoperates loopingoneachelement O(n)untilamatchoccursortheendisreached.

1) algorithm SequentialSearch(list, item)

2) Pre: list = ∅

3) Post: return index ofitemiffound,otherwise 1

4) index ← 0

5) while index<list.Count and list[index] = item

6) index ← index +1

7) endwhile

8) if index<list.Count and list[index]= item

9) return index 10) endif 11) return 1

12) end SequentialSearch

10.2ProbabilitySearch

Probabilitysearchisastatisticalsequentialsearchingalgorithm.Inadditionto searchingforanitem,ittakesintoaccountitsfrequencybyswappingitwith it’spredecessorinthelist.Thealgorithmcomplexitystillremainsat O(n)but inanon-uniformitemssearchthemorefrequentitemsareinthefirstpositions, reducinglistscanningtime.

Figure10.1showstheresultingstateofalistaftersearchingfortwoitems, noticehowthesearcheditemshavehadtheirsearchprobabilityincreasedafter eachsearchoperationrespectively.

76

Figure10.1:a)Search(12),b)Search(101)

1) algorithm ProbabilitySearch(list, item)

2) Pre: list = ∅

3) Post: abooleanindicatingwheretheitemisfoundornot; intheformercaseswapfoundeditemwithitspredecessor

4) index ← 0

5) while index<list.Count and list[index] = item

6) index ← index +1

7) endwhile

8) if index ≥ list.Count or list[index] = item

9) return false

10) endif

11) if index> 0

12) Swap(list[index],list[index 1])

13) endif

14) return true

15) end ProbabilitySearch

10.3Summary

Inthischapterwehavepresentedafewnovelsearchingalgorithms.Wehave presentedmoreefficientsearchingalgorithmsearlieron,likeforinstancethe logarithmicsearchingalgorithmthatAVLandBSTtree’suse(definedin §3.2). Wedecidednottocoverasearchingalgorithmknownasbinarychop(another nameforbinarysearch,binarychopusuallyreferstoitsarraycounterpart)as

CHAPTER10.SEARCHING 77

thereaderhasalreadyseensuchanalgorithmin §3. Searchingalgorithmsandtheirefficiencylargelydependsontheunderlying datastructurebeingusedtostorethedata.Forinstanceitisquickertodeterminewhetheranitemisinahashtablethanitisanarray,similarlyitisquicker tosearchaBSTthanitisalinkedlist.Ifyouaregoingtosearchfordatafairly oftenthenwestronglyadvisethatyousitdownandresearchthedatastructures availabletoyou.Inmostcasesusingalistoranyotherprimarilylineardata structureisdowntolackofknowledge.Modelyourdataandthenresearchthe datastructuresthatbestfityourscenario.

CHAPTER10.SEARCHING 78

Chapter11

Strings

Stringshavetheirownchapterinthistextpurelybecausestringoperations andtransformationsareincrediblyfrequentwithinprograms.Thealgorithms presentedarebasedonproblemstheauthorshavecomeacrosspreviously,or wereformulatedtosatisfycuriosity.

11.1Reversingtheorderofwordsinasentence

Definingalgorithmsforprimitivestringoperationsissimple,e.g.extractinga sub-stringofastring,howeversomealgorithmsthatrequiremoreinventiveness canbealittlemoretricky.

Thealgorithmpresentedheredoesnotsimplyreversethecharactersina string,ratheritreversestheorderofwordswithinastring.Thisalgorithm worksontheprincipalthatwordsarealldelimitedbywhitespace,andusinga fewmarkerstodefinewherewordsstartandendwecaneasilyreversethem.

79

1) algorithm ReverseWords(value)

2) Pre: value = ∅, sb isastringbuffer

3) Post: thewordsin value havebeenreversed

4) last ← value.Length 1

5) start ← last

6) while last ≥ 0

7)//skipwhitespace

8) while start ≥ 0 and value[start]=whitespace

9) start ← start 1

10) endwhile

11) last ← start

12)//marchdowntotheindexbeforethebeginningoftheword

13) while start ≥ 0 and start =whitespace

14) start ← start 1

15) endwhile

16)//appendcharsfrom start +1to length +1tostringbuffersb

17) for i ← start +1 to last

18) sb.Append(value[i])

19) endfor

20)//ifthisisn’tthelastwordinthestringaddsomewhitespaceafterthewordinthebuffer

21) if start> 0

22) sb.Append(‘’)

23) endif

24) last ← start 1

25) start ← last 26) endwhile

27)//checkifwehaveaddedonetoomanywhitespaceto sb

28) if sb[sb.Length 1]=whitespace

29)//cutthewhitespace

30) sb.Length ← sb.Length 1

31) endif

32) return sb

33) end ReverseWords

11.2Detectingapalindrome

Althoughnotafrequentalgorithmthatwillbeappliedinreal-lifescenarios detectingapalindromeisafun,andasitturnsoutprettytrivialalgorithmto design.

Thealgorithmthatwepresenthasa O(n)runtimecomplexity.Ouralgorithmusestwopointersatoppositeendsofstringwearecheckingisapalindrome ornot.Thesepointersmarchintowardseachotheralwayscheckingthateach charactertheypointtoisthesamewithrespecttovalue.Figure11.1showsthe IsPalindrome algorithminoperationonthestring“WasitEliot’stoiletIsaw?” Ifyouremoveallpunctuation,andwhitespacefromtheaforementionedstring youwillfindthatitisavalidpalindrome.

CHAPTER11.STRINGS 80

Figure11.1:

1) algorithm IsPalindrome(value)

2) Pre: value = ∅

3) Post: value isdeterminedtobeapalindromeornot

4) word ← value.Strip().ToUpperCase()

5) left ← 0

6) right ← word.Length 1

7) while word[left]= word[right] and left<right

8) left ← left +1

9) right ← right 1

10) endwhile

11) return word[left]= word[right]

12) end IsPalindrome

Inthe IsPalindrome algorithmwecallamethodbythenameof Strip.This algorithmdiscardspunctuationinthestring,includingwhitespace.Asaresult word containsaheavilycompactedrepresentationoftheoriginalstring,each characterofwhichisinitsuppercaserepresentation.

Palindromesdiscardwhitespace,punctuation,andcasemakingthesechanges allowsustodesignasimplealgorithmwhilemakingouralgorithmfairlyrobust withrespecttothepalindromesitwilldetect.

11.3Countingthenumberofwordsinastring

Countingthenumberofwordsinastringcanseemprettytrivialatfirst,however thereareafewcasesthatweneedtobeawareof:

1. trackingwhenweareinastring

2. updatingthewordcountatthecorrectplace

3. skippingwhitespacethatdelimitsthewords

Asanexampleconsiderthestring“Benatehay”Clearlythisstringcontains threewords,eachofwhichdistinguishedviawhitespace.Allofthepreviously listedpointscanbemanagedbyusingthreevariables:

1. index

2. wordCount

3. inWord

CHAPTER11.STRINGS 81
left and right pointersmarchingintowardsoneanother

Ofthepreviouslylisted index keepstrackofthecurrentindexweareatin thestring, wordCount isanintegerthatkeepstrackofthenumberofwordswe haveencountered,andfinally inWord isaBooleanflagthatdenoteswhether ornotatthepresenttimewearewithinaword.Ifwearenotcurrentlyhitting whitespaceweareinaword,theoppositeistrueifatthepresentindexweare hittingwhitespace.

Whatdenotesaword?Inouralgorithmeachwordisseparatedbyoneor moreoccurrencesofwhitespace.Wedon’ttakeintoaccountanyparticular splittingsymbolsyoumayuse,e.g.in.NET String.Split 1 cantakeachar(or arrayofcharacters)thatdeterminesadelimitertousetosplitthecharacters withinthestringintochunksofstrings,resultinginanarrayofsub-strings.

InFigure11.2wepresentastringindexedasanarray.Typicallythepattern isthesameformostwords,delimitedbyasingleoccurrenceofwhitespace. Figure11.3showsthesamestring,withthesamenumberofwordsbutwith varyingwhitespacesplittingthem.

1 http://msdn.microsoft.com/en-us/library/system.string.split.aspx

CHAPTER11.STRINGS 82
Figure11.2:Stringwiththreewords Figure11.3:Stringwithvaryingnumberofwhitespacedelimitingthewords

1) algorithm WordCount(value)

2) Pre: value = ∅

3) Post: thenumberofwordscontainedwithin value isdetermined

4) inWord ← true

5) wordCount ← 0

6) index ← 0

7)//skipinitialwhitespace

8) while value[index]=whitespace and index<value.Length 1

9) index ← index +1

10) endwhile

11)//wasthestringjustwhitespace?

12) if index = value.Length and value[index]=whitespace 13) return 0 14) endif 15) while index<value.Length 16) if value[index]=whitespace 17)//skipallwhitespace

18) while value[index]=whitespace and index<value.Length 1

index ← index +1

inWord ← false

wordCount ← wordCount +1

11.4Determiningthenumberofrepeatedwords withinastring

Withthehelpofanunorderedset,andanalgorithmthatcansplitthewords withinastringusingaspecifieddelimiterthisalgorithmisstraightforwardto implement.Ifwesplitallthewordsusingasingleoccurrenceofwhitespace asourdelimiterwegetallthewordswithinthestringbackaselementsof anarray.Thenifweiteratethroughthesewordsaddingthemtoasetwhich containsonlyuniquestringswecanattainthenumberofuniquewordsfromthe string.Allthatislefttodoissubtracttheuniquewordcountfromthetotal numberofstingscontainedinthearrayreturnedfromthesplitoperation.The splitoperationthatwerefertoisthesameasthatmentionedin §11.3.

CHAPTER11.STRINGS 83
21)
22)
23) else 24)
25) endif 26) index
27) endwhile 28)//lastwordmayhavenotbeenfollowedbywhitespace 29) if inWord 30) wordCount
+1 31) endif 32) return wordCount 33) end WordCount
19)
20) endwhile
inWord ← true
← index +1
← wordCount

Figure11.4:a)Undesired

1) algorithm RepeatedWordCount(value)

2) Pre: value = ∅

3) Post: thenumberofrepeatedwordsin value isreturned

4) words ← value.Split(’’)

5) uniques ← Set

6) foreach word in words

7) uniques.Add(word.Strip())

8) endforeach

9) return words.Length uniques.Count

10) end RepeatedWordCount

Youwillnoticeinthe RepeatedWordCount algorithmthatweusethe Strip methodwereferredtoearlierin §11.1.Thissimplyremovesanypunctuation froma word.Thereasonweperformthisoperationoneach word issothat wecanbuildamoreaccurateuniquestringcollection,e.g.“test”,and“test!” arethesamewordminusthepunctuation.Figure11.4showstheundesiredand desiredsetsforthe unique setrespectively.

11.5Determiningthefirstmatchingcharacter betweentwostrings

Thealgorithmtodeterminewhetheranycharacterofastringmatchesanyofthe charactersinanotherstringisprettytrivial.Putsimply,wecanparsethestrings consideredusingadoubleloopandcheck,discardingpunctuation,theequality betweenanycharactersthusreturninganon-negativeindexthatrepresentsthe locationofthefirstcharacterinthematch(Figure11.5);otherwisewereturn -1ifnomatchoccurs.Thisapproachexhibitaruntimecomplexityof O(n2).

CHAPTER11.STRINGS 84
uniques set;b)desired uniques set

1) algorithm Any(word,match)

2) Pre: word,match = ∅

3) Post: index representingmatchlocationifoccured, 1otherwise

4) for i ← 0to word.Length 1

5) while word[i]=whitespace

6) i ← i +1

7) endwhile

8) for index ← 0to match.Length 1

9) while match[index]=whitespace

10) index ← index +1

11) endwhile

12) if match[index]= word[i]

13) return index 14) endif 15) endfor 16) endfor 17) return 1 18) end Any

11.6Summary

Wehopethatthereaderhasseenhowfunalgorithmsonstringdatatypes are.Stringsareprobablythemostcommondatatype(anddatastructurerememberwearedealingwithanarray)thatyouwillworkwithsoitsimportant thatyoulearntobecreativewiththem.Weforonefindstringsfascinating.A simpleGooglesearchonstringnuancesbetweenlanguagesandencodingswill provideyouwithagreatnumberofproblems.Nowthatwehavespurredyou alongalittlewithourintroductoryalgorithmsyoucandevisesomeofyourown.

CHAPTER11.STRINGS 85 t s e t 01234 s r e t p 0123456 Word Match i t s e t 01234 s r e t p 0123456 i index t s e t 01234 s r e t p 0123456 i index index a) b) c)
Figure11.5:a)FirstStep;b)SecondStepc)MatchOccurred

AppendixA

AlgorithmWalkthrough

Learninghowtodesigngoodalgorithmscanbeassistedgreatlybyusinga structuredapproachtotracingitsbehaviour.Inmostcasestracinganalgorithm onlyrequiresasingletable.Inmostcasestracingisnotenough,youwillalso wanttouseadiagramofthedatastructureyouralgorithmoperateson.This diagramwillbeusedtovisualisetheproblemmoreeffectively.Seeingthings visuallycanhelpyouunderstandtheproblemquicker,andbetter.

Thetracetablewillstoreinformationaboutthevariablesusedinyouralgorithm.Thevalueswithinthistableareconstantlyupdatedwhenthealgorithm mutatesthem.Suchanapproachallowsyoutoattainahistoryofthevarious valueseachvariablehasheld.Youmayalsobeabletoinferpatternsfromthe valueseachvariablehascontainedsothatyoucanmakeyouralgorithmmore efficient.

Wehavefoundthisapproachbothsimple,andpowerful.Bycombininga visualrepresentationoftheproblemaswellashavingahistoryofpastvalues generatedbythealgorithmitcanmakeunderstanding,andsolvingproblems mucheasier.

Inthischapterwewillshowyouhowtoworkthroughbothiterative,and recursivealgorithmsusingthetechniqueoutlined.

A.1Iterativealgorithms

Wewilltracethe IsPalindrome algorithm(definedin §11.2)asourexample iterativewalkthrough.Beforeweevenlookatthevariablesthealgorithmuses, firstwewilllookattheactualdatastructurethealgorithmoperateson.It shouldbeprettyobviousthatweareoperatingonastring,buthowisthis represented?Astringisessentiallyablockofcontiguousmemorythatconsists ofsomechardatatypes,oneaftertheother.Eachcharacterinthestringcan beaccessedviaanindexmuchlikeyouwoulddowhenaccessingitemswithin anarray.Thepictureshouldbepresentingitself-astringcanbethoughtofas anarrayofcharacters.

Forourexamplewewilluse IsPalindrome tooperateonthestring“Never oddoreven”Nowweknowhowthestringdatastructureisrepresented,and thevalueofthestringwewilloperateonlet’sgoaheadanddrawitasshown inFigureA.1.

86

The IsPalindrome algorithmusesthefollowinglistofvariablesinsomeform throughoutitsexecution:

Havingidentifiedthevaluesofthevariablesweneedtokeeptrackofwe simplycreateacolumnforeachinatableasshowninTableA.1.

Now,usingthe IsPalindrome algorithmexecuteeachstatementupdating thevariablevaluesinthetableappropriately.TableA.2showsthefinaltable valuesforeachvariableusedin IsPalindrome respectively.

Whilethisapproachmaylookalittlebloatedinprint,onpaperitismuch morecompact.Wherewehavethestringsinthetableyoushouldannotate thesestringswitharrayindexestoaidthealgorithmwalkthrough.

Thereisoneotherpointthatweshouldclarifyatthistime-whetherto includevariablesthatchangeonlyafewtimes,ornotatallinthetracetable. InTableA.2wehaveincludedboththe value,and word variablesbecauseit wasconvenienttodoso.Youmayfindthatyouwanttopromotethesevalues toalargerdiagram(likethatinFigureA.1)andonlyusethetracetablefor variableswhosevalueschangeduringthealgorithm.Werecommendthatyou promotethecoredatastructurebeingoperatedontoalargerdiagramoutside ofthetablesothatyoucaninterrogateitmoreeasily.

APPENDIXA.ALGORITHMWALKTHROUGH 87
value word left right
FigureA.1:Visualisingthedatastructureweareoperatingon TableA.1:Acolumnforeachvariablewewishtotrack
1. value
2. word 3. left
4. right
value word left right “Neveroddoreven” “NEVERODDOREVEN” 0 13 1 12 2 11 3 10 4 9 5 8 6 7 7 6
TableA.2:Algorithmtracefor IsPalindrome

Wecannotstressenoughhowimportantsuchtracesarewhendesigning youralgorithm.Youcanusethesetracetablestoverifyalgorithmcorrectness. Atthecostofasimpletable,andquicksketchofthedatastructureyouare operatingonyoucandevisecorrectalgorithmsquicker.Visualisingtheproblem domainandkeepingtrackofchangingdatamakesproblemsaloteasiertosolve. Moreoveryoualwayshaveapointofreferencewhichyoucanlookbackon.

A.2RecursiveAlgorithms

Forthemostpartworkingthroughrecursivealgorithmsisassimpleaswalking throughaniterativealgorithm.Oneofthethingsthatweneedtokeeptrack ofthoughiswhichmethodcallreturnstowho.Mostrecursivealgorithmsare muchsimpletofollowwhenyoudrawouttherecursivecallsratherthanusing atablebasedapproach.Inthissectionwewillusearecursiveimplementation ofanalgorithmthatcomputesanumberfromtheFiboncaccisequence.

1) algorithm Fibonacci(n)

2) Pre: n isthenumberinthefibonaccisequencetocompute

3) Post: thefibonaccisequencenumber n hasbeencomputed

4) if n< 1

5) return 0

6) elseif n< 2

7) return 1

8) endif

9) return Fibonacci(n 1)+Fibonacci(n 2)

10) end Fibonacci

Beforewejumpintoshowingyouadiagrammticrepresentationofthealgorithmcallsforthe Fibonacci algorithmwewillbrieflytalkaboutthecasesof thealgorithm.Thealgorithmhasthreecasesintotal:

Thefirsttwoitemsinthepreceedinglistarethebasecasesofthealgorithm. Untilwehitoneofourbasecasesinourrecursivemethodcalltreewewon’t returnanything.Thethirditemfromthelistisourrecursivecase.

Witheachcalltotherecursivecaseweetcheverclosertooneofourbase cases.FigureA.2showsadiagrammticrepresentationoftherecursivecallchain. InFigureA.2theorderinwhichthemethodsarecalledarelabelled.Figure A.3showsthecallchainannotatedwiththereturnvaluesofeachmethodcall aswellastheorderinwhichmethodsreturntotheircallers.InFigureA.3the returnvaluesarerepresentedasannotationstotheredarrows.

Itisimportanttonotethateachrecursivecallonlyeverreturnstoitscaller uponhittingoneofthetwobasecases.Whenyoudoeventuallyhitabasecase thatbranchofrecursivecallsceases.Uponhittingabasecaseyougobackto

APPENDIXA.ALGORITHMWALKTHROUGH 88
1. n< 1 2. n< 2 3. n ≥ 2
APPENDIXA.ALGORITHMWALKTHROUGH 89
FigureA.2:Callchainfor Fibonacci algorithm FigureA.3:Returnchainfor Fibonacci algorithm

thecallerandcontinueexecutionofthatmethod.Executioninthecalleris contiuedatthenextstatement,orexpressionaftertherecursivecallwasmade.

Inthe Fibonacci algorithms’recursivecasewemaketworecursivecalls. Whenthefirstrecursivecall(Fibonacci(n 1))returnstothecallerwethen executethethesecondrecursivecall(Fibonacci(n 2)).Afterbothrecursive callshavereturnedtotheircaller,thecallercanthensubesequentlyreturnto itscallerandsoon.

Recursivealgorithmsaremucheasiertodemonstratediagrammaticallyas FigureA.2demonstrates.Whenyoucomeacrossarecursivealgorithmdraw methodcalldiagramstounderstandhowthealgorithmworksatahighlevel.

A.3Summary

Understandingalgorithmscanbehardattimes,particularlyfromanimplementationperspective.Inordertounderstandanalgorithmtryandworkthrough itusingtracetables.Incaseswherethealgorithmisalsorecursivesketchthe recursivecallsoutsoyoucanvisualisethecall/returnchain.

Inthevastmajorityofcasesimplementinganalgorithmissimpleprovided thatyouknowhowthealgorithmworks.Masteringhowanalgorithmworks fromahighleveliskeyfordevisingawelldesignedsolutiontotheproblemin hand.

APPENDIXA.ALGORITHMWALKTHROUGH 90

AppendixB

TranslationWalkthrough

Theconversionfrompseudotoanactualimperativelanguageisusuallyvery straightforward,toclarifyanexampleisprovided.Inthisexamplewewill convertthealgorithmin §9.1totheC#language.

1)publicstaticboolIsPrime(intnumber)

2) {

3)if(number < 2)

4) { 5)returnfalse;

6) }

7)intinnerLoopBound=(int)Math.Floor(Math.Sqrt(number)); 8)for(inti=1;i < number;i++)

9) { 10)for(intj=1;j <=innerLoopBound;j++)

11) { 12)if(i ∗ j==number)

13) { 14)returnfalse;

15) }

16) }

17) } 18)returntrue;

19) }

Forthemostparttheconversionisastraightforwardprocess,howeveryou mayhavetoinjectvariouscallstootherutilityalgorithmstoascertainthe correctresult.

Aconsiderationtotakenoteofisthatmanyalgorithmshavefairlystrict preconditions,ofwhichtheremaybeseveral-inthesescenariosyouwillneed toinjectthecorrectcodetohandlesuchsituationstopreservethecorrectnessof thealgorithm.Mostofthepreconditionscanbesuitablyhandledbythrowing thecorrectexception.

91

B.1Summary

Asyoucanseefromtheexampleusedinthischapterwehavetriedtomakethe translationofourpseudocodealgorithmstomainstreamimperativelanguages assimpleaspossible.

Wheneveryouencounterakeywordwithinourpseudocodeexamplesthat youareunfamiliarwithjustbrowsetoAppendixEwhichdescirbeseachkeyword.

APPENDIXB.TRANSLATIONWALKTHROUGH 92

AppendixC

RecursiveVs.Iterative Solutions

Oneofthemostsuccinctpropertiesofmodernprogramminglanguageslike C++,C#,andJava(aswellasmanyothers)isthattheselanguagesallow youtodefinemethodsthatreferencethemselves,suchmethodsaresaidtobe recursive.Oneofthebiggestadvantagesrecursivemethodsbringtothetableis thattheyusuallyresultinmorereadable,andcompactsolutionstoproblems.

Arecursivemethodthenisonethatisdefinedintermsofitself.Generally arecursivealgorithmshastwomainproperties:

1. Oneormorebasecases;and

2. Arecursivecase

Fornowwewillbrieflycoverthesetwoaspectsofrecursivealgorithms.With eachrecursivecallweshouldbemakingprogresstoourbasecaseotherwisewe aregoingtorunintotrouble.Thetroublewespeakofmanifestsitselftypically asastackoverflow,wewilldescribewhylater.

Nowthatwehavebrieflydescribedwhatarecursivealgorithmisandwhy youmightwanttousesuchanapproachforyouralgorithmswewillnowtalk aboutiterativesolutions.Aniterativesolutionusesnorecursionwhatsoever. Aniterativesolutionreliesonlyontheuseofloops(e.g.for,while,do-while, etc).Thedownsidetoiterativealgorithmsisthattheytendnottobeasclear astotheirrecursivecounterpartswithrespecttotheiroperation.Themajor advantageofiterativesolutionsisspeed.Mostproductionsoftwareyouwill finduseslittleornorecursivealgorithmswhatsoever.Thelatterpropertycan sometimesbeacompaniesprerequisitetocheckingincode,e.g.uponchecking inastaticanalysistoolmayverifythatthecodethedeveloperischeckingin containsnorecursivealgorithms.Normallyitissystemslevelcodethathasthis zerotolerancepolicyforrecursivealgorithms.

Usingrecursionshouldalwaysbereservedforfastalgorithms,youshould avoiditforthefollowingalgorithmruntimedeficiencies:

1. O(n2) 2. O(n3) 93

Ifyouuserecursionforalgorithmswithanyoftheaboveruntimeefficiency’s youareinvitingtrouble.Thegrowthrateofthesealgorithmsishighandin mostcasessuchalgorithmswillleanveryheavilyontechniqueslikedivideand conquer.Whileconstantlysplittingproblemsintosmallerproblemsisgood practice,inthesecasesyouaregoingtobespawningalotofmethodcalls.All thisoverhead(methodcallsdon’tcome that cheap)willsoonpileupandeither causeyouralgorithmtorunalotslowerthanexpected,orworse,youwillrun outofstackspace.Whenyouexceedtheallottedstackspaceforathreadthe processwillbeshutdownbytheoperatingsystem.Thisisthecaseirrespective oftheplatformyouuse,e.g..NET,ornativeC++etc.Youcanaskforabigger stacksize,butyoutypicallyonlywanttodothisifyouhaveaverygoodreason todoso.

C.1ActivationRecords

Anactivationrecordiscreatedeverytimeyouinvokeamethod.Putsimply anactivationrecordissomethingthatisputonthestacktosupportmethod invocation.Activationrecordstakeasmallamountoftimetocreate,andare prettylightweight.

Normallyanactivationrecordforamethodcallisasfollows(thisisvery general):

• Theactualparametersofthemethodarepushedontothestack

• Thereturnaddressispushedontothestack

• Thetop-of-stackindexisincrementedbythetotalamountofmemory requiredbythelocalvariableswithinthemethod

• Ajumpismadetothemethod

Inmanyrecursivealgorithmsoperatingonlargedatastructures,oralgorithmsthatareinefficientyouwillrunoutofstackspacequickly.Consideran algorithmthatwheninvokedgivenaspecificvalueitcreatesmanyrecursive calls.Insuchacaseabigchunkofthestackwillbeconsumed.Wewillhaveto waituntiltheactivationrecordsstarttobeunwoundafterthenestedmethods inthecallchainexitandreturntotheirrespectivecaller.Whenamethodexits it’sactivationrecordisunwound.Unwindinganactivationrecordresultsin severalsteps:

1. Thetop-of-stackindexisdecrementedbythetotalamountofmemory consumedbythemethod

2. Thereturnaddressispoppedoffthestack

3. Thetop-of-stackindexisdecrementedbythetotalamountofmemory consumedbytheactualparameters

APPENDIXC.RECURSIVEVS.ITERATIVESOLUTIONS 94
O(2n)
3.

Whileactivationrecordsareanefficientwaytosupportmethodcallsthey canbuildupveryquickly.Recursivealgorithmscanexhaustthestacksize allocatedtothethreadfairlyfastgiventhechance.

Justaboutnowweshouldbedustingthecobwebsofftheageoldexampleof aniterativevs.recursivesolutionintheformoftheFibonaccialgorithm.This isafamousexampleasithighlightsboththebeautyandpitfallsofarecursive algorithm.Theiterativesolutionisnotaspretty,norselfdocumentingbutit doesthejobalotquicker.IfweweretogivetheFibonaccialgorithmaninput ofsay60thenwewouldhavetowaitawhiletogetthevaluebackbecauseit hasan O(gn)runtime.Theiterativeversionontheotherhandhasa O(n) runtime.Don’tletthisputyouoffrecursion.Thisexampleismainlyused toshockprogrammersintothinkingabouttheramificationsofrecursionrather thanwarningthemoff.

C.2Someproblemsarerecursiveinnature

Somethingthatyoumaycomeacrossisthatsomedatastructuresandalgorithmsareactuallyrecursiveinnature.Aperfectexampleofthisisatreedata structure.Acommontreenodeusuallycontainsavalue,alongwithtwopointerstotwoothernodesofthesamenodetype.Asyoucanseetreeisrecursive initsmakeupwiteachnodepossiblypointingtotwoothernodes.

Whenusingrecursivealgorithmsontree’sitmakessenseasyouaresimply adheringtotheinherentdesignofthedatastructureyouareoperatingon.Of courseitisnotallgoodnews,afterallwearestillboundbythelimitationswe havementionedpreviouslyinthischapter.

Wecanalsolookatsortingalgorithmslikemergesort,andquicksort.Both ofthesealgorithmsarerecursiveintheirdesignandsoitmakessensetomodel themrecursively.

C.3Summary

Recursionisapowerfultool,andonethatallprogrammersshouldknowof. Oftensoftwareprojectswilltakeatradebetweenreadability,andefficiencyin whichcaserecursionisgreatprovidedyoudon’tgoanduseittoimplement analgorithmwithaquadraticruntimeorhigher.Ofcoursethisisnotarule ofthumb,thisisjustusthrowingcautiontothewind.Defensivecodingwill alwaysprevail.

Manytimesrecursionhasanaturalhomeinrecursivedatastructuresand algorithmswhicharerecursiveinnature.Usingrecursioninsuchscenariosis perfectlyacceptable.Usingrecursionforsomethinglikelinkedlisttraversalis alittleoverkill.Itsiterativecounterpartisprobablylesslinesofcodethanits recursivecounterpart.

Becausewecanonlytalkabouttheimplicationsofusingrecursionfroman abstractpointofviewyoushouldconsultyourcompilerandruntimeenvironmentformoredetails.Itmaybethecasethatyourcompilerrecognisesthings liketailrecursionandcanoptimisethem.Thisisn’tunheardof,infactmost commercialcompilerswilldothis.Theamountofoptimisationcompilerscan

APPENDIXC.RECURSIVEVS.ITERATIVESOLUTIONS
95

dothoughissomewhatlimitedbythefactthatyouarestillusingrecursion. You,asthedeveloperhavetoacceptcertainaccountability’sforperformance.

APPENDIXC.RECURSIVEVS.ITERATIVESOLUTIONS 96

AppendixD

Testing

Testingisanessentialpartofsoftwaredevelopment.Testinghasoftenbeen discardedbymanydevelopersinthebeliefthattheburdenofproofoftheir softwareisonthosewithinthecompanywhoholdtestcentricroles.This couldn’tbefurtherfromthetruth.Asadeveloperyoushouldatleastprovide asuiteofunitteststhatverifycertainboundaryconditionsofyoursoftware.

Agreatthingabouttestingisthatyoubuildupprogressivelyasafetynet.If youaddortweakalgorithmsandthenrunyoursuiteoftestsyouwillbequickly alertedtoanycasesthatyouhavebrokenwithyourrecentchanges.Suchasuite oftestsinanysizeableprojectisabsolutelyessentialtomaintainingafairlyhigh barwhenitcomestoquality.Ofcourseinordertoattainsuchastandardyou needtothinkcarefullyabouttheteststhatyouconstruct.

Unittestingwhichwillbethesubjectofthevastmajorityofthischapter arewidelyavailableonmostplatforms.MostmodernlanguageslikeC++,C#, andJavaofferanimpressivecatalogueoftestingframeworksthatyoucanuse forunittesting.

Thefollowinglistidentifiestestingframeworkswhicharepopular:

JUnit: TargetedatJav., http://www.junit.org/

NUnit: CanbeusedwithlanguagesthattargetMicrosoft’sCommonLanguage Runtime. http://www.nunit.org/index.php

BoostTestLibrary: TargetedatC++.Thetestlibrarythatshipswiththeincrediblypopular Boostlibraries. http://www.boost.org.Adirectlinktothelibrariesdocumentation http://www.boost.org/doc/libs/1_36_0/libs/test/doc/ html/index.html

CppUnit: TargetedatC++. http://cppunit.sourceforge.net/

Don’tworryifyouthinkthatthelistisverysparse,therearefarmoreon offerthanthosethatwehavelisted.Theoneslistedarethetestingframeworks thatwebelievearethemostpopularforC++,C#,andJava.

D.1Whatconstitutesaunittest?

Aunittestshouldfocusonasingleatomicpropertyofthesubjectbeingtested. Donottryandtestmanythingsatonce,thiswillresultinasuiteofsomewhat

97

unstructuredtests.Asanexampleifyouwerewantingtowriteatestthat verifiedthataparticularvalue V isreturnedfromaspecificinput I thenyour testshoulddothesmallestamountofworkpossibletoverifythat V iscorrect given I.Aunittestshouldbesimpleandselfdescribing.

Aswellasaunittestbeingrelativelyatomicyoushouldalsomakesurethat yourunittestsexecutequickly.Ifyoucanimagineinthefuturewhenyoumay haveatestsuiteconsistingofthousandsoftestsyouwantthoseteststoexecute asquicklyaspossible.Failuretoattainsuchagoalwillmostlikelyresultin thesuiteoftestsnotbeingranthatoftenbythedevelopersonyourteam.This canoccurforanumberofreasonsbutthemainonewouldbethatitbecomes incrediblytediouswaitingseveralminutestoruntestsonadeveloperslocal machine.

Buildingupatestsuitecanhelpgreatlyinateamscenario,particularly whenusingacontinuousbuildserver.Insuchascenarioyoucanhavethesuite oftestsdevisedbythedevelopersandtestersranaspartofthebuildprocess.

Employingsuchstrategiescanhelpyoucatchnigglinglittleerrorcasesearly ratherthanviayourcustomerbase.Thereisnothingmoreembarrassingfora developerthantohaveaverytrivialbugintheircodereportedtothemfroma customer.

D.2WhenshouldIwritemytests?

Asourceofgreatdebatewouldbeanunderstatementtopersonifysuchaquestionasthis.Inrecentyearsatestdrivenapproachtodevelopmenthasbecome verypopular.Suchanapproachisknownastestdrivendevelopment,ormore commonlytheacronymTDD.

OneofthefoundingprinciplesofTDDistowritetheunittestfirst,watch itfailandthenmakeitpass.Thepremisebeingthatyouonlyeverwrite enoughcodeatanyonetimetosatisfythestatebasedassertionsmadeinaunit test.Wehavefoundthisapproachtoprovideamorestructuredintenttothe implementationofalgorithms.Atanyonestageyouonlyhaveasinglegoal,to makethefailingtestpass.BecauseTDDmakesyouwritethetestsupfrontyou neverfindyourselfinasituationwhereyouforget,orcan’tbebotheredtowrite testsforyourcode.Thisisoftenthecasewhenyouwriteyourtestsafteryou havecodedupyourimplementation.We,astheauthorsofthisbookourselves useTDDasourpreferredmethod.

AswehavealreadymentionedthatTDDisourfavouredapproachtotesting itwouldbesomewhatofaninjusticetonotlist,anddescribethemantrathat isoftenassociatewithit:

Red: Signifiesthatthetesthasfailed.

Green: Thefailingtestnowpasses.

Refactor: Canwerestructureourprogramsoitmakesmoresense,andeasierto maintain?

Thefirstpointoftheabovelistalwaysoccursatleastonce(moreifyoucount thebuilderror)inTDDinitially.Yourtaskatthisstageissolelytomakethe testpass,thatistomaketherespectivetestgreen.Thelastitemisbasedaround

APPENDIXD.TESTING 98

therestructuringofyourprogramtomakeitasreadableandmaintainableas possible.ThelastpointisveryimportantasTDDisaprogressivemethodology tobuildingasolution.Ifyouadheretoprogressiverevisionsofyouralgorithm restructuringwhenappropriateyouwillfindthatusingTDDyoucanimplement verycleanlystructuredtypesandsoon.

D.3HowseriouslyshouldIviewmytestsuite?

Yourtestsareamajorpartofyourprojectecosystemandsotheyshouldbe treatedwiththesameamountofrespectasyourproductioncode.Thisranges fromcorrect,andcleancodeformatting,tothetestingcodebeingstoredwithin asourcecontrolrepository.

EmployingamethodologylikeTDD,ortestingafterimplementingyouwill findthatyouspendagreatamountoftimewritingtestsandthustheyshould betreatednodifferentlytoyourproductioncode.Alltestsshouldbeclearly named,andfullydocumentedastotheirintent.

D.4ThethreeA’s

Nowthatyouhaveasenseoftheimportanceofyourtestsuiteyouwillinevitably wanttoknowhowtoactuallystructureeachblockofimperativeswithinasingle unittest.Apopularapproach-thethreeA’sisdescribedinthefollowinglist:

Assemble: Createtheobjectsyourequireinordertoperformthestatebasedassertions.

Act: Invoketherespectiveoperationsontheobjectsyouhaveassembledto mutatethestatetothatdesiredforyourassertions.

Assert: Specifywhatyouexpecttoholdaftertheprevioustwosteps.

Thefollowingexampleshowsasimpletestmethodthatemploysthethree A’s:

D.5Thestructuringoftests

Structuringtestscanbevieweduponasbeingthesameasstructuringproductioncode,e.g.allunittestsfora Person typemaybecontainedwithin

APPENDIXD.TESTING 99
publicvoidMyTest() { //assemble
//assert
}
Typet=newType(); //act t.MethodA();
Assert.IsTrue(t.BoolExpr)

a PersonTest type.Typicallyalltestsareabstractedfromproductioncode. Thatisthatthetestsaredisjointfromtheproductioncode,youmayhavetwo dynamiclinklibraries(dll);thefirstcontainingtheproductioncode,thesecond containingyourtestcode.

Wecanalsousethingslikeinheritanceetcwhendefiningclassesoftests. Thepointbeingthatthetestcodeisverymuchlikeyourproductioncodeand youshouldapplythesameamountofthoughttoitsstructureasyouwoulddo theproductioncode.

D.6CodeCoverage

Somethingthatyoucangetasaproductofunittestingarecodecoverage statistics.Codecoverageismerelyanindicatorastotheportionsofproduction codethatyourunitstestscover.UsingTDDitislikelythatyourcodecoverage willbeveryhigh,althoughitwillvarydependingonhoweasyitistouseTDD withinyourproject.

D.7Summary

Testingiskeytothecreationofamoderatelystableproduct.Moreoverunit testingcanbeusedtocreateasafetyblanketwhenaddingandremovingfeatures providinganearlywarningforbreakingchangeswithinyourproductioncode.

APPENDIXD.TESTING 100

AppendixE

SymbolDefinitions

Throughoutthepseudocodelistingsyouwillfindseveralsymbolsused,describes themeaningofeachofthosesymbols.

Symbol Description

← Assignment.

= Equality.

≤ Lessthanorequalto.

< Lessthan.*

≥ Greaterthanorequalto.

> Greaterthan.*

= Inequality.

∅ Null. and Logicaland. or Logicalor. whitespace Singleoccurrenceofwhitespace. yield Like return butbuildsasequence.

*Thissymbolhasadirecttranslationwiththevastmajorityofimperative counterparts.

TableE.1:Pseudosymboldefinitions
101

Turn static files into dynamic content formats.

Create a flipbook
Issuu converts static files into: digital portfolios, online yearbooks, online catalogs, digital photo albums and more. Sign up and create your flipbook.