Data Structure Algorithms by Omar Estrada

Data Structures and Algorithms (DSA): Annotated Reference with Examples

First Edition

Granville Barnett and Luca Del Tongo

This book is made exclusively available from DotNetSlackers (http://dotnetslackers.com/), the place for .NET articles and news from some of the leading minds in the software industry.

Contents

1 Introduction
  1.1 What this book is, and what it isn’t
  1.2 Assumed knowledge
    1.2.1 Big Oh notation
    1.2.2 Imperative programming language
    1.2.3 Object oriented concepts
  1.3 Pseudocode
  1.4 Tips for working through the examples
  1.5 Book outline
  1.6 Testing
  1.7 Where can I get the code?
  1.8 Final messages

I Data Structures

2 Linked Lists
  2.1 Singly Linked List
    2.1.1 Insertion
    2.1.2 Searching
    2.1.3 Deletion
    2.1.4 Traversing the list
    2.1.5 Traversing the list in reverse order
  2.2 Doubly Linked List
    2.2.1 Insertion
    2.2.2 Deletion
    2.2.3 Reverse Traversal
  2.3 Summary

3 Binary Search Tree
  3.1 Insertion
  3.2 Searching
  3.3 Deletion
  3.4 Finding the parent of a given node
  3.5 Attaining a reference to a node
  3.6 Finding the smallest and largest values in the binary search tree
  3.7 Tree Traversals
    3.7.1 Preorder
    3.7.2 Postorder
    3.7.3 Inorder
    3.7.4 Breadth First
  3.8 Summary

4 Heap
  4.1 Insertion
  4.2 Deletion
  4.3 Searching
  4.4 Traversal
  4.5 Summary

5 Sets
  5.1 Unordered
    5.1.1 Insertion
  5.2 Ordered
  5.3 Summary

6 Queues
  6.1 A standard queue
  6.2 Priority Queue
  6.3 Double Ended Queue
  6.4 Summary

7 AVL Tree
  7.1 Tree Rotations
  7.2 Tree Rebalancing
  7.3 Insertion
  7.4 Deletion
  7.5 Summary

II Algorithms

8 Sorting
  8.1 Bubble Sort
  8.2 Merge Sort
  8.3 Quick Sort
  8.4 Insertion Sort
  8.5 Shell Sort
  8.6 Radix Sort
  8.7 Summary

9 Numeric
  9.1 Primality Test
  9.2 Base conversions
  9.3 Attaining the greatest common denominator of two numbers
  9.4 Computing the maximum value for a number of a specific base consisting of N digits
  9.5 Factorial of a number
  9.6 Summary

10 Searching
  10.1 Sequential Search
  10.2 Probability Search
  10.3 Summary

11 Strings
  11.1 Reversing the order of words in a sentence
  11.2 Detecting a palindrome
  11.3 Counting the number of words in a string
  11.4 Determining the number of repeated words within a string
  11.5 Determining the first matching character between two strings
  11.6 Summary

A Algorithm Walkthrough
  A.1 Iterative algorithms
  A.2 Recursive Algorithms
  A.3 Summary

B Translation Walkthrough
  B.1 Summary

C Recursive Vs. Iterative Solutions
  C.1 Activation Records
  C.2 Some problems are recursive in nature
  C.3 Summary

D Testing
  D.1 What constitutes a unit test?
  D.2 When should I write my tests?
  D.3 How seriously should I view my test suite?
  D.4 The three A’s
  D.5 The structuring of tests
  D.6 Code Coverage
  D.7 Summary

E Symbol Definitions

Preface

Every book has a story as to how it came about and this one is no different, although we would be lying if we said its development had not been somewhat impromptu. Put simply, this book is the result of a series of emails sent back and forth between the two authors during the development of a library for the .NET framework of the same name (with the omission of the subtitle of course!). The conversation started off something like, “Why don’t we create a more aesthetically pleasing way to present our pseudocode?” After a few weeks this new presentation style had in fact grown into pseudocode listings with chunks of text describing how the data structure or algorithm in question works and various other things about it. At this point we thought, “What the heck, let’s make this thing into a book!” And so, in the summer of 2008 we began work on this book side by side with the actual library implementation.

When we started writing this book the only things that we were sure about with respect to how the book should be structured were:

1. always make explanations as simple as possible while maintaining a moderately fine degree of precision to keep the more eager minded reader happy; and

2. inject diagrams to demystify problems that are even moderately challenging to visualise (...and so we could remember how our own algorithms worked when looking back at them!); and finally

3. present concise and self-explanatory pseudocode listings that can be ported easily to most mainstream imperative programming languages like C++, C#, and Java.

A key factor of this book and its associated implementations is that all algorithms (unless otherwise stated) were designed by us, using the theory of the algorithm in question as a guideline (for which we are eternally grateful to their original creators). Therefore they may sometimes turn out to be worse than the “normal” implementations, and sometimes not. We are two fellows of the opinion that choice is a great thing. Read our book, read several others on the same subject and use what you see fit from each (if anything) when implementing your own version of the algorithms in question.

Through this book we hope that you will see the absolute necessity of understanding which data structure or algorithm to use for a certain scenario. In all projects, especially those that are concerned with performance (here we apply an even greater emphasis on real-time systems), the selection of the wrong data structure or algorithm can be the cause of a great deal of performance pain.

Therefore it is absolutely key that you think about the run time complexity and space requirements of your selected approach. In this book we only explain the theoretical implications to consider, but this is for a good reason: compilers are very different in how they work. One C++ compiler may have some amazing optimisation phases specifically targeted at recursion, another may not, for example. Of course this is just an example but you would be surprised by how many subtle differences there are between compilers. These differences may make a fast algorithm slow, and vice versa. We could also factor in the same concerns about languages that target virtual machines, leaving all the actual implementation issues to you given that you will know your language’s compiler much better than us... well, in most cases. This has resulted in a more concise book that focuses on what we think are the key issues.

One final note: never take the words of others as gospel; verify all that can be feasibly verified and make up your own mind.

We hope you enjoy reading this book as much as we have enjoyed writing it.

Granville Barnett

Luca Del Tongo

Acknowledgements

Writing this short book has been a fun and rewarding experience. We would like to thank, in no particular order, the following people who have helped us during the writing of this book.

Sonu Kapoor generously hosted our book which, when we released the first draft, received over thirteen thousand downloads; without his generosity this book would not have been able to reach so many people. Jon Skeet provided us with an alarming number of suggestions throughout, for which we are eternally grateful. Jon also edited this book.

We would also like to thank those who provided the odd suggestion via email to us. All feedback was listened to and you will no doubt see some content influenced by your suggestions.

A special thank you also goes out to those who helped publicise this book, from Microsoft’s Channel 9 weekly show (thanks Dan!) to the many bloggers who helped spread the word. You gave us an audience and for that we are extremely grateful.

Thank you to all who contributed in some way to this book. The programming community never ceases to amaze us in how willing its constituents are to give time to projects such as this one. Thank you.

About the Authors

Granville Barnett

Granville is currently a Ph.D candidate at Queensland University of Technology (QUT) working on parallelism at the Microsoft QUT eResearch Centre (http://www.mquter.qut.edu.au/). He also holds a degree in Computer Science, and is a Microsoft MVP. His main interests are in programming languages and compilers. Granville can be contacted via one of two places: either his personal website (http://gbarnett.org) or his blog (http://msmvps.com/blogs/gbarnett).

Luca Del Tongo

Luca is currently studying for his master’s degree in Computer Science at Florence. His main interests vary from web development to research fields such as data mining and computer vision. Luca also maintains an Italian blog which can be found at http://blogs.ugidotnet.org/wetblog/

Chapter 1

Introduction

1.1 What this book is, and what it isn’t

This book provides implementations of common and uncommon algorithms in pseudocode which is language independent and provides for easy porting to most imperative programming languages. It is not a definitive book on the theory of data structures and algorithms.

For the most part this book presents implementations devised by the authors themselves, based on the concepts that underpin the respective algorithms, so it is more than possible that our implementations differ from those considered the norm.

You should use this book alongside another on the same subject, but one that contains formal proofs of the algorithms in question. In this book we use the abstract big Oh notation to depict the run time complexity of algorithms so that the book appeals to a larger audience.

1.2 Assumed knowledge

We have written this book with few assumptions of the reader, but some have been necessary in order to keep the book as concise and approachable as possible. We assume that the reader is familiar with the following:

1. Big Oh notation

2. An imperative programming language

3. Object oriented concepts

1.2.1 Big Oh notation

For run time complexity analysis we use big Oh notation extensively so it is vital that you are familiar with the general concepts to determine which is the best algorithm for you in certain scenarios. We have chosen to use big Oh notation for a few reasons, the most important of which is that it provides an abstract measurement by which we can judge the performance of algorithms without using mathematical proofs.

Figure 1.1 shows some of the run times to demonstrate how important it is to choose an efficient algorithm. For the sanity of our graph we have omitted cubic O(n^3) and exponential O(2^n) run times. Cubic and exponential algorithms should only ever be used for very small problems (if ever!); avoid them if feasibly possible.

The following list explains some of the most common big Oh notations:

O(1) constant: the operation doesn’t depend on the size of its input, e.g. adding a node to the tail of a linked list where we always maintain a pointer to the tail node.

O(n) linear: the run time complexity is proportionate to the size of n.

O(log n) logarithmic: normally associated with algorithms that break the problem into smaller chunks per each invocation, e.g. searching a binary search tree.

O(n log n) just n log n: usually associated with an algorithm that breaks the problem into smaller chunks per each invocation, and then takes the results of these smaller chunks and stitches them back together, e.g. quick sort.

O(n^2) quadratic: e.g. bubble sort.

O(n^3) cubic: very rare.

O(2^n) exponential: incredibly rare.

If you encounter either of the latter two items (cubic and exponential) this is really a signal for you to review the design of your algorithm. While prototyping algorithm designs you may just have the intention of solving the problem irrespective of how fast it works. We would strongly advise that you always review your algorithm design and optimise where possible (particularly loops and recursive calls) so that you can get the most efficient run times for your algorithms.

Figure 1.1: Algorithmic run time expansion

The biggest asset that big Oh notation gives us is that it allows us to essentially discard things like hardware. If you have two sorting algorithms, one with a quadratic run time, and the other with a logarithmic run time, then the logarithmic algorithm will always be faster than the quadratic one when the data set becomes suitably large. This applies even if the former is run on a machine that is far faster than the latter. Why? Because big Oh notation isolates a key factor in algorithm analysis: growth. An algorithm with a quadratic run time grows faster than one with a logarithmic run time. It is generally said that at some point as n → ∞ the logarithmic algorithm will become faster than the quadratic algorithm.
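The hardware argument above can be checked with a little arithmetic. The following is only a sketch under stated assumptions: the cost models, the function names and the speedup factor of 1000 are all hypothetical, chosen to show that a quadratic algorithm on a much faster machine still loses to a logarithmic one once n is large enough.

```python
import math


def quadratic_cost(n, speedup=1000):
    # Hypothetical cost: n^2 steps on a machine assumed to be `speedup` times faster.
    return n * n / speedup


def logarithmic_cost(n):
    # Hypothetical cost: log2(n) steps on the slower reference machine.
    return math.log2(n)


def crossover(speedup=1000):
    # Smallest n >= 2 at which the quadratic algorithm becomes the slower one,
    # despite its hardware advantage.
    n = 2
    while quadratic_cost(n, speedup) <= logarithmic_cost(n):
        n += 1
    return n


first_n = crossover()
print(first_n)
```

Raising the assumed speedup only pushes the crossover point further out; it never removes it, which is the sense in which growth dominates hardware.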

Big Oh notation also acts as a communication tool. Picture the scene: you are having a meeting with some fellow developers within your product group. You are discussing prototype algorithms for node discovery in massive networks. Several minutes elapse after you and two others have discussed your respective algorithms and how they work. Does this give you a good idea of how fast each respective algorithm is? No. The result of such a discussion will tell you more about the high level algorithm design rather than its efficiency. Replay the scene back in your head, but this time as well as talking about algorithm design each respective developer states the asymptotic run time of their algorithm. Using the latter approach you not only get a good general idea about the algorithm design, but also key efficiency data which allows you to make better choices when it comes to selecting an algorithm fit for purpose.

Some readers may actually work in a product group where they are given budgets per feature. Each feature holds with it a budget that represents its uppermost time bound. If you save some time in one feature it doesn’t necessarily give you a buffer for the remaining features. Imagine you are working on an application, and you are in the team that is developing the routines that will essentially spin up everything that is required when the application is started. Everything is great until your boss comes in and tells you that the start up time should not exceed n ms. The efficiency of every algorithm that is invoked during start up in this example is absolutely key to a successful product. Even if you don’t have these budgets you should still strive for optimal solutions.

Taking a quantitative approach for many software development properties will make you a far superior programmer: measuring one’s work is critical to success.

1.2.2 Imperative programming language

All examples are given in a pseudo-imperative coding format and so the reader must know the basics of some imperative mainstream programming language to port the examples effectively. We have written this book with the following target languages in mind:

1. C++

2. C#

3. Java

The reason that we are explicit in this requirement is simple: all our implementations are based on an imperative thinking style. If you are a functional programmer you will need to apply various aspects from the functional paradigm to produce efficient solutions with respect to your functional language, whether it be Haskell, F#, OCaml, etc.

Two of the languages that we have listed (C# and Java) target virtual machines which provide various things like security sandboxing, and memory management via garbage collection algorithms. It is trivial to port our implementations to these languages. When porting to C++ you must remember to use pointers for certain things. For example, when we describe a linked list node as having a reference to the next node, this description is in the context of a managed environment. In C++ you should interpret the reference as a pointer to the next node and so on. For programmers who have a fair amount of experience with their respective language these subtleties will present no issue, which is why we really do emphasise that the reader must be comfortable with at least one imperative language in order to successfully port the pseudo-implementations in this book.

It is essential that the reader is familiar with primitive imperative language constructs before reading this book, otherwise you will just get lost. Some algorithms presented in this book can be confusing to follow even for experienced programmers!

1.2.3 Object oriented concepts

For the most part this book does not use features that are specific to any one language. In particular, we never provide data structures or algorithms that work on generic types; this is in order to make the samples as easy to follow as possible. However, to appreciate the designs of our data structures you will need to be familiar with the following object oriented (OO) concepts:

1. Inheritance

2. Encapsulation

3. Polymorphism

This is especially important if you are planning on looking at the C# target that we have implemented (more on that in §1.7) which makes extensive use of the OO concepts listed above. As a final note it is also desirable that the reader is familiar with interfaces as the C# target uses interfaces throughout the sorting algorithms.

1.3 Pseudocode

Throughout this book we use pseudocode to describe our solutions. For the most part interpreting the pseudocode is trivial as it looks very much like a more abstract C++, or C#, but there are a few things to point out:

1. Pre-conditions should always be enforced

2. Post-conditions represent the result of applying algorithm a to data structure d

3. The type of parameters is inferred

4. All primitive language constructs are explicitly begun and ended

If an algorithm has a return type it will often be presented in the post-condition, but where the return type is sufficiently obvious it may be omitted for the sake of brevity.

Most algorithms in this book require parameters, and because we assign no explicit type to those parameters the type is inferred from the contexts in which it is used, and the operations performed upon it. Additionally, the name of the parameter usually acts as the biggest clue to its type. For instance n is a pseudo-name for a number and so you can assume unless otherwise stated that n translates to an integer that has the same number of bits as a WORD on a 32 bit machine; similarly l is a pseudo-name for a list where a list is a resizeable array (e.g. a vector).

The last major point of reference is that we always explicitly end a language construct. For instance if we wish to close the scope of a for loop we will explicitly state end for rather than leaving the interpretation of when scopes are closed to the reader. While implicit scope closure works well in simple code, in complex cases it can lead to ambiguity.

The pseudocode style that we use within this book is rather straightforward. All algorithms start with a simple algorithm signature, e.g.

1) algorithm AlgorithmName(arg1, arg2, ..., argN)
2)   ...
n) end AlgorithmName

Immediately after the algorithm signature we list any Pre or Post conditions.

1) algorithm AlgorithmName(n)
2)   Pre: n is the value to compute the factorial of
3)        n ≥ 0
4)   Post: the factorial of n has been computed
5)   // ...
n) end AlgorithmName

The example above describes an algorithm by the name of AlgorithmName, which takes a single numeric parameter n. The pre and post conditions follow the algorithm signature; you should always enforce the pre-conditions of an algorithm when porting them to your language of choice.

Normally what is listed as a pre-condition is critical to the algorithm’s operation. This may cover things like the actual parameter not being null, or that the collection passed in must contain at least n items. The post-condition mainly describes the effect of the algorithm’s operation. An example of a post-condition might be “The list has been sorted in ascending order”.

Because everything we describe is language independent you will need to make up your own mind on how best to handle pre-conditions. For example, in the C# target we have implemented, we consider non-conformance to pre-conditions to be exceptional cases. We provide a message in the exception to tell the caller why the algorithm has failed to execute normally.
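As an illustration of enforcing a pre-condition, here is a minimal sketch in Python (not one of the book's target languages, and not the DSA implementation). The function name factorial and the choice of ValueError and TypeError are assumptions made for this example; the pre-condition n ≥ 0 is the one from the listing above.

```python
def factorial(n):
    """Pre: n is an integer and n >= 0. Post: the factorial of n is returned."""
    # Enforce the pre-conditions before doing any work; the exception message
    # tells the caller why the algorithm refused to run.
    if not isinstance(n, int) or isinstance(n, bool):
        raise TypeError("n must be an integer")
    if n < 0:
        raise ValueError("n must be greater than or equal to 0")
    result = 1
    for i in range(2, n + 1):
        result *= i
    return result


print(factorial(5))
```

Treating a violated pre-condition as an exception mirrors the approach described for the C# target; other languages may prefer assertions or error codes.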

1.4 Tips for working through the examples

As with most books you get out what you put in and so we recommend that in order to get the most out of this book you work through each algorithm with a pen and paper to track things like variable names, recursive calls etc.

The best way to work through algorithms is to set up a table, and in that table give each variable its own column and continuously update these columns. This will help you keep track of and visualise the mutations that are occurring throughout the algorithm. Often while working through algorithms in such a way you can intuitively map relationships between data structures rather than trying to work out a few values on paper and the rest in your head. We suggest you put everything on paper irrespective of how trivial some variables and calculations may be so that you always have a point of reference.

When dealing with recursive algorithm traces we recommend you do the same as the above, but also have a table that records function calls and who they return to. This approach is a far cleaner way than drawing out an elaborate map of function calls with arrows to one another, which gets large quickly and simply makes things more complex to follow. Track everything in a simple and systematic way to make your time studying the implementations far easier.

1.5 Book outline

We have split this book into two parts:

Part 1: Provides discussion and pseudo-implementations of common and uncommon data structures; and

Part 2: Provides algorithms of varying purposes from sorting to string operations.

The reader doesn’t have to read the book sequentially from beginning to end: chapters can be read independently from one another. We suggest that in part 1 you read each chapter in its entirety, but in part 2 you can get away with just reading the section of a chapter that describes the algorithm you are interested in.

Each of the chapters on data structures initially presents the algorithms concerned with:

1. Insertion

2. Deletion

3. Searching

The previous list represents what we believe in the vast majority of cases to be the most important for each respective data structure.

For all readers we recommend that before looking at any algorithm you quickly look at Appendix E which contains a table listing the various symbols used within our algorithms and their meaning. One keyword that we would like to point out here is yield. You can think of yield in the same light as return. The return keyword causes the method to exit and returns control to the caller, whereas yield returns each value to the caller. With yield control only returns to the caller when all values to return to the caller have been exhausted.
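The difference between return and yield can be sketched with a Python generator, whose yield keyword is a close analogue of the one used in our pseudocode. The function names and sample data below are hypothetical. One caveat worth stating: Python hands values to the caller one at a time as they are requested, which is an implementation detail of that language rather than something the pseudocode prescribes.

```python
def first_even(numbers):
    # return: the function exits as soon as one value has been produced.
    for n in numbers:
        if n % 2 == 0:
            return n
    return None


def all_evens(numbers):
    # yield: every matching value is handed to the caller; the function is
    # only finished once the input has been exhausted.
    for n in numbers:
        if n % 2 == 0:
            yield n


data = [3, 4, 7, 10, 12]
single = first_even(data)
every = list(all_evens(data))
print(single, every)
```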

1.6 Testing

All the data structures and algorithms have been tested using a minimised test driven development style on paper to flesh out the pseudocode algorithm. We then transcribe these tests into unit tests, satisfying them one by one. When all the test cases have been progressively satisfied we consider that algorithm suitably tested.

For the most part algorithms have fairly obvious cases which need to be satisfied. Some however have many areas which can prove to be more complex to satisfy. With such algorithms we will point out the test cases which are tricky and the corresponding portions of pseudocode within the algorithm that satisfy that respective case.

As you become more familiar with the actual problem you will be able to intuitively identify areas which may cause problems for your algorithm’s implementation. This in some cases will yield an overwhelming list of concerns which will hinder your ability to design an algorithm greatly. When you are bombarded with such a vast amount of concerns look at the overall problem again and sub-divide the problem into smaller problems. Solving the smaller problems and then composing them is a far easier task than clouding your mind with too many little details.

The only type of testing that we use in the implementation of all that is provided in this book is unit testing. Because unit tests contribute such a core piece of creating somewhat more stable software we invite the reader to view Appendix D which describes testing in more depth.

1.7 Where can I get the code?

This book doesn’t provide any code specifically aligned with it; however, we do actively maintain an open source project that houses a C# implementation of all the pseudocode listed. The project is named Data Structures and Algorithms (DSA) and can be found at http://codeplex.com/dsa. All readers are encouraged to provide suggestions, feature requests, and bugs so we can further improve our implementations.

1.8 Final messages

We have just a few final messages to the reader that we hope you digest before you embark on reading this book:

1. Understand how the algorithm works first in an abstract sense; and

2. Always work through the algorithms on paper to understand how they achieve their outcome

If you always follow these key points, you will get the most out of this book.

Part I

Data Structures

Chapter 2

Linked Lists

Linked lists can be thought of from a high level perspective as being a series of nodes. Each node has at least a single pointer to the next node, and in the last node’s case a null pointer representing that there are no more nodes in the linked list.

In DSA our implementations of linked lists always maintain head and tail pointers so that insertion at either the head or tail of the list is a constant time operation. Random insertion is excluded from this and will be a linear operation. As such, linked lists in DSA have the following characteristics:

1. Insertion is O(1)

2. Deletion is O(n)

3. Searching is O(n)

Out of the three operations the one that stands out is that of insertion. In DSA we chose to always maintain pointers (or more aptly references) to the node(s) at the head and tail of the linked list and so performing a traditional insertion to either the front or back of the linked list is an O(1) operation. An exception to this rule is performing an insertion before a node that is neither the head nor tail in a singly linked list. When the node we are inserting before is somewhere in the middle of the linked list (known as random insertion) the complexity is O(n). In order to add before the designated node we need to traverse the linked list to find that node’s current predecessor. This traversal yields an O(n) run time.

This data structure is trivial, but linked lists have a few key points which at times make them very attractive:

1. the list is dynamically resized, thus it incurs no copy penalty like an array or vector would eventually incur; and

2. insertion is O(1).

2.1 Singly Linked List

Singly linked lists are one of the most primitive data structures you will find in this book. Each node that makes up a singly linked list consists of a value, and a reference to the next node (if any) in the list.

2.1.1 Insertion

In general when people talk about insertion with respect to linked lists of any form they implicitly refer to the adding of a node to the tail of the list. When you use an API like that of DSA and you see a general purpose method that adds a node to the list, you can assume that you are adding the node to the tail of the list, not the head.

Adding a node to a singly linked list has only two cases:

1. head = ∅ in which case the node we are adding is now both the head and tail of the list; or

2. we simply need to append our node onto the end of the list, updating the tail reference appropriately.

1) algorithm Add(value)
2)   Pre: value is the value to add to the list
3)   Post: value has been placed at the tail of the list
4)   n ← node(value)
5)   if head = ∅
6)     head ← n
7)     tail ← n
8)   else
9)     tail.Next ← n
10)    tail ← n
11)  end if
12) end Add

As an example of the previous algorithm consider adding the following sequence of integers to the list: 1, 45, 60, and 12. The resulting list is that of Figure 2.2.

Figure 2.1: Singly linked list node

Figure 2.2: A singly linked list populated with integers

2.1.2 Searching

Searching a linked list is straightforward: we simply traverse the list, comparing the value we are looking for with the value of each node in the linked list. The algorithm listed in this section is very similar to that used for traversal in §2.1.4.

1) algorithm Contains(head, value)
2)   Pre: head is the head node in the list
3)        value is the value to search for
4)   Post: the item is either in the linked list, true; otherwise false
5)   n ← head
6)   while n ≠ ∅ and n.Value ≠ value
7)     n ← n.Next
8)   end while
9)   if n = ∅
10)    return false
11)  end if
12)  return true
13) end Contains
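The Add and Contains listings can be sketched in Python as follows. This is an illustrative port, not the DSA C# implementation: the class names Node and SinglyLinkedList and the lower-case method names are assumptions, and None stands in for ∅.

```python
class Node:
    """A singly linked list node: a value plus a reference to the next node."""

    def __init__(self, value):
        self.value = value
        self.next = None  # None plays the role of the empty reference


class SinglyLinkedList:
    """Hypothetical container that always keeps head and tail references."""

    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, value):
        """Append value at the tail in constant time."""
        n = Node(value)
        if self.head is None:
            # Empty list: the new node is both head and tail.
            self.head = n
            self.tail = n
        else:
            # Non-empty list: link after the current tail, then move tail.
            self.tail.next = n
            self.tail = n

    def contains(self, value):
        """Return True if value is stored in the list (linear scan)."""
        n = self.head
        while n is not None and n.value != value:
            n = n.next
        return n is not None


numbers = SinglyLinkedList()
for item in (1, 45, 60, 12):
    numbers.add(item)
print(numbers.contains(60), numbers.contains(7))
```

Because the tail reference is maintained, add never walks the list; contains may have to visit every node, which is where the O(n) bound comes from.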

2.1.3 Deletion

Deleting a node from a linked list is straightforward but there are a few cases we need to account for:

1. the list is empty; or

2. the node to remove is the only node in the linked list; or

3. we are removing the head node; or

4. we are removing the tail node; or

5. the node to remove is somewhere in between the head and tail; or

6. the item to remove doesn’t exist in the linked list

The algorithm whose cases we have described will remove a node from anywhere within a list irrespective of whether the node is the head etc. If you know that items will only ever be removed from the head or tail of the list then you can create much more concise algorithms. In the case of always removing from the front of the linked list deletion becomes an O(1) operation.
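The six cases above can be sketched in Python as shown below. The class and method names (Node, SinglyLinkedList, remove, to_list) are hypothetical helpers for this example rather than the DSA API, and the case numbers in the comments refer to the list above.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class SinglyLinkedList:
    """Hypothetical list with head and tail references, built from an iterable."""

    def __init__(self, values=()):
        self.head = None
        self.tail = None
        for value in values:
            n = Node(value)
            if self.head is None:
                self.head = self.tail = n
            else:
                self.tail.next = n
                self.tail = n

    def remove(self, value):
        """Remove the first node holding value; return True if one was removed."""
        if self.head is None:
            return False  # case 1: the list is empty
        if self.head.value == value:
            if self.head is self.tail:
                self.head = self.tail = None  # case 2: the only node
            else:
                self.head = self.head.next  # case 3: the head node
            return True
        n = self.head
        # Stop on the predecessor of the node to remove, or on the tail.
        while n.next is not None and n.next.value != value:
            n = n.next
        if n.next is None:
            return False  # case 6: the value is not in the list
        if n.next is self.tail:
            self.tail = n  # case 4: the tail node is being removed
        n.next = n.next.next  # cases 4 and 5: unlink the node
        return True

    def to_list(self):
        out, n = [], self.head
        while n is not None:
            out.append(n.value)
            n = n.next
        return out


items = SinglyLinkedList([1, 45, 60, 12])
removed_middle = items.remove(45)   # case 5
removed_tail = items.remove(12)     # case 4
removed_missing = items.remove(99)  # case 6
print(items.to_list())
```

The loop deliberately looks one node ahead (n.next) so that the predecessor is in hand when the match is found; a singly linked node has no way to reach its predecessor afterwards.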

1) algorithm Remove(head, value)
2)   Pre: head is the head node in the list
3)        value is the value to remove from the list
4)   Post: value is removed from the list, true; otherwise false
5)   if head = ∅
6)     // case 1
7)     return false
8)   end if
9)   n ← head
10)  if n.Value = value
11)    if head = tail
12)      // case 2
13)      head ← ∅
14)      tail ← ∅
15)    else
16)      // case 3
17)      head ← head.Next
18)    end if
19)    return true
20)  end if
21)  while n.Next ≠ ∅ and n.Next.Value ≠ value
22)    n ← n.Next
23)  end while
24)  if n.Next ≠ ∅
25)    if n.Next = tail
26)      // case 4
27)      tail ← n
28)    end if
29)    // this is only case 5 if the conditional on line 25 was false
30)    n.Next ← n.Next.Next
31)    return true
32)  end if
33)  // case 6
34)  return false
35) end Remove

2.1.4 Traversing the list

Traversing a singly linked list is the same as traversing a doubly linked list (defined in §2.2). You start at the head of the list and continue until you come across a node that is ∅. The two cases are as follows:

1. node = ∅, we have exhausted all nodes in the linked list; or

2. we must update the node reference to be node.Next.

The algorithm described is a very simple one that makes use of a simple while loop to check the first case.
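Forward traversal of a singly linked list amounts to a single loop, sketched here as a Python generator. The Node class and the function name traverse are assumptions made for this example; None stands in for ∅.

```python
class Node:
    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def traverse(head):
    """Yield each value from head to tail."""
    n = head
    while n is not None:   # stop once every node has been exhausted
        yield n.value
        n = n.next         # otherwise move the reference to the next node


# Hypothetical sample list: 1 -> 45 -> 60 -> 12
head = Node(1, Node(45, Node(60, Node(12))))
values = list(traverse(head))
print(values)
```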

1) algorithm Traverse(head)
2)   Pre: head is the head node in the list
3)   Post: the items in the list have been traversed
4)   n ← head
5)   while n ≠ ∅
6)     yield n.Value
7)     n ← n.Next
8)   end while
9) end Traverse

2.1.5 Traversing the list in reverse order

Traversing a singly linked list in a forward manner (i.e. left to right) is simple as demonstrated in §2.1.4. However, what if we wanted to traverse the nodes in the linked list in reverse order for some reason? The algorithm to perform such a traversal is very simple, and just as demonstrated in §2.1.3 we will need to acquire a reference to the predecessor of a node, even though the fundamental characteristics of the nodes that make up a singly linked list make this an expensive operation. For each node, finding its predecessor is an O(n) operation, so over the course of traversing the whole list backwards the cost becomes O(n^2).

Figure 2.3 depicts the following algorithm being applied to a linked list with the integers 5, 10, 1, and 40.

1) algorithm ReverseTraversal(head, tail)
2)   Pre: head and tail belong to the same list
3)   Post: the items in the list have been traversed in reverse order
4)   if tail ≠ ∅
5)     curr ← tail
6)     while curr ≠ head
7)       prev ← head
8)       while prev.Next ≠ curr
9)         prev ← prev.Next
10)      end while
11)      yield curr.Value
12)      curr ← prev
13)    end while
14)    yield curr.Value
15)  end if
16) end ReverseTraversal

Figure 2.3: Reverse traversal of a singly linked list

This algorithm is only of real interest when we are using singly linked lists, as you will soon see that doubly linked lists (defined in §2.2) make reverse list traversal simple and efficient, as shown in §2.2.3.

2.2 Doubly Linked List

Doubly linked lists are very similar to singly linked lists. The only difference is that each node has a reference to both the next and previous nodes in the list.

Figure 2.4: Doubly linked list node

The following algorithms for the doubly linked list are exactly the same as those listed previously for the singly linked list:

1. Searching (defined in §2.1.2)

2. Traversal (defined in §2.1.4)

2.2.1 Insertion

The only major difference from the algorithm in §2.1.1 is that we need to remember to bind the previous pointer of n to the previous tail node if n was not the first node to be inserted into the list.

1) algorithm Add(value)
2)   Pre: value is the value to add to the list
3)   Post: value has been placed at the tail of the list
4)   n ← node(value)
5)   if head = ∅
6)     head ← n
7)     tail ← n
8)   else
9)     n.Previous ← tail
10)    tail.Next ← n
11)    tail ← n
12)  end if
13) end Add

Figure 2.5 shows the doubly linked list after adding the sequence of integers defined in §2.1.1.
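A minimal Python sketch of the doubly linked Add follows. The names Node, DoublyLinkedList and backwards are hypothetical and this is not the DSA implementation; the backwards helper simply follows the previous references from the tail as a way to check that add bound them correctly.

```python
class Node:
    """A doubly linked list node with references in both directions."""

    def __init__(self, value):
        self.value = value
        self.next = None
        self.previous = None


class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def add(self, value):
        """Append value at the tail, binding both next and previous."""
        n = Node(value)
        if self.head is None:
            self.head = n
            self.tail = n
        else:
            n.previous = self.tail  # the extra step compared with the singly linked list
            self.tail.next = n
            self.tail = n

    def backwards(self):
        """Yield values from tail to head by following previous references."""
        n = self.tail
        while n is not None:
            yield n.value
            n = n.previous


numbers = DoublyLinkedList()
for item in (1, 45, 60, 12):
    numbers.add(item)
reversed_values = list(numbers.backwards())
print(reversed_values)
```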

2.2.2 Deletion

As you may have guessed, the cases that we use for deletion in a doubly linked list are exactly the same as those defined in §2.1.3. Like insertion we have the added task of binding an additional reference (Previous) to the correct value.

Figure 2.5: Doubly linked list populated with integers

1) algorithm Remove(head, value)
2)   Pre: head is the head node in the list
3)        value is the value to remove from the list
4)   Post: value is removed from the list, true; otherwise false
5)   if head = ∅
6)     return false
7)   end if
8)   if value = head.Value
9)     if head = tail
10)      head ← ∅
11)      tail ← ∅
12)    else
13)      head ← head.Next
14)      head.Previous ← ∅
15)    end if
16)    return true
17)  end if
18)  n ← head.Next
19)  while n ≠ ∅ and value ≠ n.Value
20)    n ← n.Next
21)  end while
22)  if n = tail
23)    tail ← tail.Previous
24)    tail.Next ← ∅
25)    return true
26)  else if n ≠ ∅
27)    n.Previous.Next ← n.Next
28)    n.Next.Previous ← n.Previous
29)    return true
30)  end if
31)  return false
32) end Remove

2.2.3 Reverse Traversal

Singly linked lists have a forward-only design, which is why the reverse traversal algorithm defined in §2.1.5 required some creative invention. Doubly linked lists make reverse traversal as simple as forward traversal (defined in §2.1.4), except that we start at the tail node and update the pointers in the opposite direction. Figure 2.6 shows the reverse traversal algorithm in action.

1) algorithm ReverseTraversal(tail)
2)   Pre: tail is the tail node of the list to traverse
3)   Post: the list has been traversed in reverse order
4)   n ← tail
5)   while n ≠ ∅
6)     yield n.Value
7)     n ← n.Previous
8)   end while
9) end ReverseTraversal

Figure 2.6: Doubly linked list reverse traversal

2.3 Summary

Linked lists are good to use when you have an unknown number of items to store. Using a data structure like an array would require you to specify the size up front; exceeding that size involves invoking a resizing algorithm which has a linear run time. You should also use linked lists when you will only remove nodes at either the head or tail of the list to maintain a constant run time. This requires maintaining pointers to the nodes at the head and tail of the list, but the memory overhead will pay for itself if this is an operation you will be performing many times.

What linked lists are not very good for is random insertion, accessing nodes by index, and searching. At the expense of a little memory (in most cases 4 bytes would suffice), and a few more read/writes, you could maintain a count variable that tracks how many items are contained in the list so that accessing such a primitive property is a constant operation: you just need to update count during the insertion and deletion algorithms.

Singly linked lists should be used when you are only performing basic insertions. In general doubly linked lists are more accommodating for non-trivial operations on a linked list.

We recommend the use of a doubly linked list when you require forwards and backwards traversal. In most cases this requirement is present. For example, consider a token stream that you want to parse in a recursive descent fashion. Sometimes you will have to backtrack in order to create the correct parse tree. In this scenario a doubly linked list is best, as its design makes bi-directional traversal much simpler and quicker than that of a singly linked list.
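Looking back at the Remove and ReverseTraversal algorithms of §2.2.2 and §2.2.3, a minimal Python sketch might look like this. The class and method names are hypothetical, and the sketch checks for "value not found" before the tail case, which is a small reordering of the pseudocode rather than a different algorithm.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None
        self.previous = None


class DoublyLinkedList:
    def __init__(self, values=()):
        self.head = None
        self.tail = None
        for v in values:
            self.add(v)

    def add(self, value):
        n = Node(value)
        if self.head is None:
            self.head = self.tail = n
        else:
            n.previous = self.tail
            self.tail.next = n
            self.tail = n

    def remove(self, value):
        """Remove the first node holding value; True if one was found."""
        if self.head is None:
            return False
        if self.head.value == value:
            if self.head is self.tail:      # only node in the list
                self.head = self.tail = None
            else:
                self.head = self.head.next
                self.head.previous = None
            return True
        n = self.head.next
        while n is not None and n.value != value:
            n = n.next
        if n is None:
            return False                    # value is not in the list
        if n is self.tail:
            self.tail = self.tail.previous
            self.tail.next = None
        else:
            n.previous.next = n.next        # unlink n in both directions
            n.next.previous = n.previous
        return True

    def reverse(self):
        """Yield values from tail to head by following previous links."""
        n = self.tail
        while n is not None:
            yield n.value
            n = n.previous


items = DoublyLinkedList([1, 2, 3, 4])
items.remove(3)
print(list(items.reverse()))
```

Each removal touches a constant number of links once the node is located; locating it is still a linear scan.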

Chapter 3

Binary Search Tree

Binary search trees (BSTs) are very simple to understand. We start with a root node with value x, where the left subtree of x contains nodes with values < x and the right subtree contains nodes whose values are ≥ x. Each node follows the same rules with respect to nodes in their left and right subtrees.

BSTs are of interest because they have operations which are favourably fast: insertion, lookup, and deletion can all be done in O(log n) time. It is important to note that the O(log n) times for these operations can only be attained if the BST is reasonably balanced; for a tree data structure with self-balancing properties see the AVL tree defined in §7.

In the following examples you can assume, unless used as a parameter alias, that root is a reference to the root node of the tree.

Figure 3.1: Simple unbalanced binary search tree

3.1 Insertion

As mentioned previously, insertion is an O(log n) operation provided that the tree is moderately balanced.

1) algorithm Insert(value)
2)   Pre: value has passed custom type checks for type T
3)   Post: value has been placed in the correct location in the tree
4)   if root = ∅
5)     root ← node(value)
6)   else
7)     InsertNode(root, value)
8)   end if
9) end Insert

1) algorithm InsertNode(current, value)
2)   Pre: current is the node to start from
3)   Post: value has been placed in the correct location in the tree
4)   if value < current.Value
5)     if current.Left = ∅
6)       current.Left ← node(value)
7)     else
8)       InsertNode(current.Left, value)
9)     end if
10)  else
11)    if current.Right = ∅
12)      current.Right ← node(value)
13)    else
14)      InsertNode(current.Right, value)
15)    end if
16)  end if
17) end InsertNode

The insertion algorithm is split for a good reason. The first algorithm (non-recursive) checks a very core base case: whether or not the tree is empty. If the tree is empty then we simply create our root node and finish. In all other cases we invoke the recursive InsertNode algorithm, which simply guides us to the first appropriate place in the tree to put value. Note that at each stage we perform a binary chop: we either choose to recurse into the left subtree or the right by comparing the new value with that of the current node. For any totally ordered type, no value can simultaneously satisfy the conditions to place it in both subtrees.
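As a rough illustration of the split between Insert and InsertNode, here is a small Python sketch. The names BinarySearchTree, Node, and _insert_node are hypothetical, and the sample values are arbitrary.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


class BinarySearchTree:
    def __init__(self):
        self.root = None

    def insert(self, value):
        """Non-recursive entry point: handles the empty-tree base case."""
        if self.root is None:
            self.root = Node(value)
        else:
            self._insert_node(self.root, value)

    def _insert_node(self, current, value):
        """Binary chop: smaller values go left, all others go right."""
        if value < current.value:
            if current.left is None:
                current.left = Node(value)
            else:
                self._insert_node(current.left, value)
        else:
            if current.right is None:
                current.right = Node(value)
            else:
                self._insert_node(current.right, value)


tree = BinarySearchTree()
for v in (23, 14, 31, 7, 17, 23):
    tree.insert(v)
```

Note that the second 23 is not smaller than the root, so it descends into the right subtree, consistent with the rule that the right subtree holds values ≥ x.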

3.2 Searching

Searching a BST is even simpler than insertion. The pseudocode is self-explanatory, but we will look briefly at the premise of the algorithm nonetheless.

We have talked previously about insertion: we go either left or right, with the right subtree containing values that are ≥ x, where x is the value of the current node. When searching, the rules are made a little more atomic and at any one time we have four cases to consider:

1. root = ∅, in which case value is not in the BST; or

2. root.Value = value, in which case value is in the BST; or

3. value < root.Value, and we must inspect the left subtree of root for value; or

4. value > root.Value, and we must inspect the right subtree of root for value.

1) algorithm Contains(root, value)
2)   Pre: root is the root node of the tree, value is what we would like to locate
3)   Post: value is either located or not
4)   if root = ∅
5)     return false
6)   end if
7)   if root.Value = value
8)     return true
9)   else if value < root.Value
10)    return Contains(root.Left, value)
11)  else
12)    return Contains(root.Right, value)
13)  end if
14) end Contains

3.3 Deletion

Removing a node from a BST is fairly straightforward, with four cases to consider:

1. the value to remove is a leaf node; or

2. the value to remove has a right subtree, but no left subtree; or

3. the value to remove has a left subtree, but no right subtree; or

4. the value to remove has both a left and right subtree, in which case we promote the largest value in the left subtree.

There is also an implicit fifth case whereby the node to be removed is the only node in the tree. This case is already covered by the first, but should be noted as a possibility nonetheless.

Of course in a BST a value may occur more than once. In such a case the first occurrence of that value in the BST will be removed.

The Remove algorithm given below relies on two further helper algorithms named FindParent and FindNode, which are described in §3.4 and §3.5 respectively.

Figure 3.2: Binary search tree deletion cases

1) algorithm Remove(value)
2)   Pre: value is the value of the node to remove, root is the root node of the BST
        Count is the number of items in the BST
3)   Post: node with value is removed if found, in which case yields true, otherwise false
4)   nodeToRemove ← FindNode(value)
5)   if nodeToRemove = ∅
6)     return false // value not in BST
7)   end if
8)   parent ← FindParent(value)
9)   if Count = 1
10)    root ← ∅ // we are removing the only node in the BST
11)  else if nodeToRemove.Left = ∅ and nodeToRemove.Right = ∅
12)    // case #1
13)    if nodeToRemove.Value < parent.Value
14)      parent.Left ← ∅
15)    else
16)      parent.Right ← ∅
17)    end if
18)  else if nodeToRemove.Left = ∅ and nodeToRemove.Right ≠ ∅
19)    // case #2
20)    if nodeToRemove.Value < parent.Value
21)      parent.Left ← nodeToRemove.Right
22)    else
23)      parent.Right ← nodeToRemove.Right
24)    end if
25)  else if nodeToRemove.Left ≠ ∅ and nodeToRemove.Right = ∅
26)    // case #3
27)    if nodeToRemove.Value < parent.Value
28)      parent.Left ← nodeToRemove.Left
29)    else
30)      parent.Right ← nodeToRemove.Left
31)    end if
32)  else
33)    // case #4
34)    largestValue ← nodeToRemove.Left
35)    while largestValue.Right ≠ ∅
36)      // find the largest value in the left subtree of nodeToRemove
37)      largestValue ← largestValue.Right
38)    end while
39)    // set the parents' Right pointer of largestValue to ∅
40)    FindParent(largestValue.Value).Right ← ∅
41)    nodeToRemove.Value ← largestValue.Value
42)  end if
43)  Count ← Count − 1
44)  return true
45) end Remove
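The four deletion cases can also be sketched recursively in Python. This is not a transcription of the Remove pseudocode: instead of calling FindParent, the hypothetical remove function below returns the new root of each subtree, which is one common way to also cover removal of the root node and to keep any left child of the promoted node. All names are assumptions made for the example.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def remove(node, value):
    """Return the root of this subtree after removing one occurrence of value."""
    if node is None:
        return None                       # value not in this subtree
    if value < node.value:
        node.left = remove(node.left, value)
    elif value > node.value:
        node.right = remove(node.right, value)
    else:
        if node.left is None:             # cases 1 and 2: no left subtree
            return node.right
        if node.right is None:            # case 3: only a left subtree
            return node.left
        largest = node.left               # case 4: two subtrees
        while largest.right is not None:
            largest = largest.right       # largest value in the left subtree
        node.value = largest.value        # promote it into this node
        node.left = remove(node.left, largest.value)
    return node


def inorder(node):
    """Yield the values of the subtree in sorted order."""
    if node is not None:
        yield from inorder(node.left)
        yield node.value
        yield from inorder(node.right)


root = Node(23, Node(14, Node(7), Node(17)), Node(31))
root = remove(root, 14)   # a node with two subtrees
print(list(inorder(root)))
```

Reassigning root on every call is what lets the same function handle the "only node in the tree" case without special treatment.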

3.4 Finding the parent of a given node

The purpose of this algorithm is simple: to return a reference (or pointer) to the parent node of the one with the given value. We have found that such an algorithm is very useful, especially when performing extensive tree transformations.

1) algorithm FindParent(value, root)
2)   Pre: value is the value of the node we want to find the parent of
3)        root is the root node of the BST and is ≠ ∅
4)   Post: a reference to the parent node of value if found; otherwise ∅
5)   if value = root.Value
6)     return ∅
7)   end if
8)   if value < root.Value
9)     if root.Left = ∅
10)      return ∅
11)    else if root.Left.Value = value
12)      return root
13)    else
14)      return FindParent(value, root.Left)
15)    end if
16)  else
17)    if root.Right = ∅
18)      return ∅
19)    else if root.Right.Value = value
20)      return root
21)    else
22)      return FindParent(value, root.Right)
23)    end if
24)  end if
25) end FindParent

A special case in the above algorithm is when the specified value does not exist in the BST, in which case we return ∅. Callers to this algorithm must take account of this possibility unless they are already certain that a node with the specified value exists.

3.5 Attaining a reference to a node

This algorithm is very similar to §3.4, but instead of returning a reference to the parent of the node with the specified value, it returns a reference to the node itself. Again, ∅ is returned if the value isn't found.

1) algorithm FindNode(root, value)
2)   Pre: value is the value of the node we want to find
3)        root is the root node of the BST
4)   Post: a reference to the node of value if found; otherwise ∅
5)   if root = ∅
6)     return ∅
7)   end if
8)   if root.Value = value
9)     return root
10)  else if value < root.Value
11)    return FindNode(root.Left, value)
12)  else
13)    return FindNode(root.Right, value)
14)  end if
15) end FindNode

Astute readers will have noticed that the FindNode algorithm is exactly the same as the Contains algorithm (defined in §3.2), with the modification that we are returning a reference to a node, not true or false. Given FindNode, the easiest way of implementing Contains is to call FindNode and compare the return value with ∅.

3.6 Finding the smallest and largest values in the binary search tree

To find the smallest value in a BST you simply traverse the nodes in the left subtree of the BST, always going left upon each encounter with a node, terminating when you find a node with no left subtree. The opposite is the case when finding the largest value in the BST. Both algorithms are incredibly simple, and are listed simply for completeness.

The base case in both the FindMin and FindMax algorithms is when the Left (FindMin) or Right (FindMax) node references are ∅, in which case we have reached the last node.

1) algorithm FindMin(root)
2)   Pre: root is the root node of the BST
3)        root ≠ ∅
4)   Post: the smallest value in the BST is located
5)   if root.Left = ∅
6)     return root.Value
7)   end if
8)   return FindMin(root.Left)
9) end FindMin

1) algorithm FindMax(root)
2)   Pre: root is the root node of the BST
3)        root ≠ ∅
4)   Post: the largest value in the BST is located
5)   if root.Right = ∅
6)     return root.Value
7)   end if
8)   return FindMax(root.Right)
9) end FindMax
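The relationship between FindNode and Contains, and the two extreme-value searches, can be sketched in Python as below. The function names are hypothetical lower-case renderings of the pseudocode names, and the sample tree is arbitrary.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def find_node(root, value):
    """Return the node holding value, or None if it is absent."""
    if root is None:
        return None
    if root.value == value:
        return root
    if value < root.value:
        return find_node(root.left, value)
    return find_node(root.right, value)


def contains(root, value):
    """Contains expressed in terms of find_node."""
    return find_node(root, value) is not None


def find_min(root):
    """Smallest value: keep going left. Assumes root is not None."""
    if root.left is None:
        return root.value
    return find_min(root.left)


def find_max(root):
    """Largest value: keep going right. Assumes root is not None."""
    if root.right is None:
        return root.value
    return find_max(root.right)


tree = Node(23, Node(14, Node(7), Node(17)), Node(31))
print(contains(tree, 17), find_min(tree), find_max(tree))
```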

3.7 Tree Traversals

There are various strategies which can be employed to traverse the items in a tree; the choice of strategy depends on which node visitation order you require. In this section we will touch on the traversals that DSA provides on all data structures that derive from BinarySearchTree.

3.7.1 Preorder

When using the preorder algorithm, you visit the root first, then traverse the left subtree and finally traverse the right subtree. An example of preorder traversal is shown in Figure 3.3.

1) algorithm Preorder(root)
2)   Pre: root is the root node of the BST
3)   Post: the nodes in the BST have been visited in preorder
4)   if root ≠ ∅
5)     yield root.Value
6)     Preorder(root.Left)
7)     Preorder(root.Right)
8)   end if
9) end Preorder

Figure 3.3: Preorder visit binary search tree example

3.7.2 Postorder

This algorithm is very similar to that described in §3.7.1, however the value of the node is yielded after traversing both subtrees. An example of postorder traversal is shown in Figure 3.4.

1) algorithm Postorder(root)
2)   Pre: root is the root node of the BST
3)   Post: the nodes in the BST have been visited in postorder
4)   if root ≠ ∅
5)     Postorder(root.Left)
6)     Postorder(root.Right)
7)     yield root.Value
8)   end if
9) end Postorder

Figure 3.4: Postorder visit binary search tree example

3.7.3 Inorder

Another variation of the algorithms defined in §3.7.1 and §3.7.2 is that of inorder traversal, where the value of the current node is yielded in between traversing the left subtree and the right subtree. An example of inorder traversal is shown in Figure 3.5.

Figure 3.5: Inorder visit binary search tree example

1) algorithm Inorder(root)
2)   Pre: root is the root node of the BST
3)   Post: the nodes in the BST have been visited in inorder
4)   if root ≠ ∅
5)     Inorder(root.Left)
6)     yield root.Value
7)     Inorder(root.Right)
8)   end if
9) end Inorder

One of the beauties of inorder traversal is that values are yielded in their comparison order. In other words, when traversing a populated BST with the inorder strategy, the yielded sequence would have the property x_i ≤ x_{i+1} ∀ i.
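The three depth-first traversals differ only in where the yield sits relative to the two recursive calls. The following Python generators are a sketch of that idea; the Node class and the sample tree are assumptions made for illustration.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def preorder(root):
    if root is not None:
        yield root.value                  # root first
        yield from preorder(root.left)
        yield from preorder(root.right)


def postorder(root):
    if root is not None:
        yield from postorder(root.left)
        yield from postorder(root.right)
        yield root.value                  # root last


def inorder(root):
    if root is not None:
        yield from inorder(root.left)
        yield root.value                  # root between the subtrees
        yield from inorder(root.right)


tree = Node(23, Node(14, Node(7), Node(17)), Node(31))
print(list(preorder(tree)))   # [23, 14, 7, 17, 31]
print(list(postorder(tree)))  # [7, 17, 14, 31, 23]
print(list(inorder(tree)))    # [7, 14, 17, 23, 31]
```

The inorder output is sorted, which is the ordering property described above.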

3.7.4 Breadth First

Traversing a tree in breadth first order yields the values of all nodes of a particular depth in the tree before any deeper ones. In other words, given a depth d we would visit the values of all nodes at d in a left to right fashion, then we would proceed to d + 1 and so on until we had no more nodes to visit. An example of breadth first traversal is shown in Figure 3.6.

Traditionally breadth first traversal is implemented using a list (vector, resizeable array, etc.) to store the values of the nodes visited in breadth first order and then a queue to store those nodes that have yet to be visited.

Figure 3.6: Breadth First visit binary search tree example

1) algorithm BreadthFirst(root)
2)   Pre: root is the root node of the BST
3)   Post: the nodes in the BST have been visited in breadth first order
4)   q ← queue
5)   while root ≠ ∅
6)     yield root.Value
7)     if root.Left ≠ ∅
8)       q.Enqueue(root.Left)
9)     end if
10)    if root.Right ≠ ∅
11)      q.Enqueue(root.Right)
12)    end if
13)    if !q.IsEmpty()
14)      root ← q.Dequeue()
15)    else
16)      root ← ∅
17)    end if
18)  end while
19) end BreadthFirst
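A Python rendering of BreadthFirst might use collections.deque from the standard library as the queue. This is a sketch; the Node class and sample tree are assumptions for the example.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def breadth_first(root):
    """Yield values level by level, left to right."""
    q = deque()                       # nodes that have yet to be visited
    while root is not None:
        yield root.value
        if root.left is not None:
            q.append(root.left)       # Enqueue
        if root.right is not None:
            q.append(root.right)
        root = q.popleft() if q else None   # Dequeue, or stop when empty


tree = Node(23, Node(14, Node(7), Node(17)), Node(31))
print(list(breadth_first(tree)))  # [23, 14, 31, 7, 17]
```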

3.8 Summary

A binary search tree is a good solution when you need to represent types that are ordered according to some custom rules inherent to that type. With logarithmic insertion, lookup, and deletion it is very efficient. Traversal remains linear, but there are many ways in which you can visit the nodes of a tree. Trees are recursive data structures, so typically you will find that many algorithms that operate on a tree are recursive.

The run times presented in this chapter are based on a pretty big assumption: that the binary search tree's left and right subtrees are reasonably balanced. We can only attain logarithmic run times for the algorithms presented earlier when this is true. A binary search tree does not enforce such a property, and the run times for these operations on a pathologically unbalanced tree become linear: such a tree is effectively just a linked list. Later in §7 we will examine an AVL tree that enforces self-balancing properties to help attain logarithmic run times.

Chapter 4

Heap

A heap can be thought of as a simple tree data structure, however a heap usually employs one of two strategies:

1. min heap; or

2. max heap

Each strategy determines the properties of the tree and its values. If you were to choose the min heap strategy then each parent node would have a value that is ≤ the values of its children. For example, the node at the root of the tree will have the smallest value in the tree. The opposite is true for the max heap strategy. In this book you should assume that a heap employs the min heap strategy unless otherwise stated.

Unlike other tree data structures like the one defined in §3, a heap is generally implemented as an array rather than a series of nodes which each have references to other nodes. The nodes are conceptually the same, however, having at most two children. Figure 4.1 shows how the tree (not a heap data structure) (12 7(3 2) 6(9)) would be represented as an array. The array in Figure 4.1 is a result of simply adding values in a top-to-bottom, left-to-right fashion. Figure 4.2 shows arrows to the direct left and right child of each value in the array.

This chapter is very much centred around the notion of representing a tree as an array, and because this property is key to understanding this chapter, Figure 4.3 shows a step by step process to represent a tree data structure as an array. In Figure 4.3 you can assume that the default capacity of our array is eight.

Using just an array is often not sufficient, as we have to be up front about the size of the array to use for the heap. Often the run time behaviour of a program can be unpredictable when it comes to the size of its internal data structures, so we need to choose a more dynamic data structure that contains the following properties:

1. we can specify an initial size of the array for scenarios where we know the upper storage limit required; and

2. the data structure encapsulates resizing algorithms to grow the array as required at run time

Figure 4.1 does not specify how we would handle adding null references to the heap. This varies from case to case; sometimes null values are prohibited entirely; in other cases we may treat them as being smaller than any non-null value, or indeed greater than any non-null value. You will have to resolve this ambiguity yourself having studied your requirements. For the sake of clarity we will avoid the issue by prohibiting null values.

Because we are using an array we need some way to calculate the index of a parent node, and the children of a node. The required expressions for this are defined as follows for a node at index:

1. (index − 1)/2 (parent index)

2. 2 ∗ index + 1 (left child)

3. 2 ∗ index + 2 (right child)

In Figure 4.4 a) represents the calculation of the right child of 12 (2 ∗ 0 + 2); and b) calculates the index of the parent of 3 ((3 − 1)/2).
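The three index expressions are easy to check in code. The sketch below assumes that the division in (index − 1)/2 is integer division, and it assumes the tree (12 7(3 2) 6(9)) is laid out top-to-bottom, left-to-right as the array [12, 7, 6, 3, 2, 9]; the function names are hypothetical.

```python
def parent(index):
    return (index - 1) // 2   # integer division (an assumption of this sketch)


def left(index):
    return 2 * index + 1


def right(index):
    return 2 * index + 2


# (12 7(3 2) 6(9)) added top-to-bottom, left-to-right
tree_as_array = [12, 7, 6, 3, 2, 9]

# a) right child of 12, which sits at index 0
print(right(0), tree_as_array[right(0)])     # 2 6

# b) parent of 3, which sits at index 3
print(parent(3), tree_as_array[parent(3)])   # 1 7
```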

1. Vector
2. ArrayList
3. List

Figure 4.1: Array representation of a simple tree data structure

Figure 4.2: Direct children of the nodes in an array representation of a tree data structure

Figure 4.3: Converting a tree data structure to its array counterpart

Figure 4.4: Calculating node properties

4.1 Insertion

Designing an algorithm for heap insertion is simple, but we must ensure that heap order is preserved after each insertion. Generally this is a post-insertion operation. Inserting a value into the next free slot in an array is simple: we just need to keep track of the next free index in the array as a counter, and increment it after each insertion. Inserting our value into the heap is the first part of the algorithm; the second is validating heap order. In the case of min-heap ordering this requires us to swap the values of a parent and its child if the value of the child is < the value of its parent. We must do this for each subtree containing the value we just inserted.

The run time efficiency for heap insertion is O(log n). The run time is a by-product of verifying heap order, as the first part of the algorithm (the actual insertion into the array) is O(1).

Figure 4.5 shows the steps of inserting the values 3, 9, 12, 7, and 1 into a min-heap.

Figure 4.5: Inserting values into a min-heap

1) algorithm Add(value)
2)   Pre: value is the value to add to the heap
3)        Count is the number of items in the heap
4)   Post: the value has been added to the heap
5)   heap[Count] ← value
6)   Count ← Count + 1
7)   MinHeapify()
8) end Add

1) algorithm MinHeapify()
2)   Pre: Count is the number of items in the heap
3)        heap is the array used to store the heap items
4)   Post: the heap has preserved min heap ordering
5)   i ← Count − 1
6)   while i > 0 and heap[i] < heap[(i − 1)/2]
7)     Swap(heap[i], heap[(i − 1)/2])
8)     i ← (i − 1)/2
9)   end while
10) end MinHeapify

The design of the MaxHeapify algorithm is very similar to that of the MinHeapify algorithm; the only difference is that the < operator in the second condition of entering the while loop is changed to >.
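Add and MinHeapify can be combined into a short Python sketch. A Python list stands in for the dynamically resized backing array, so len(self.items) plays the role of Count; the MinHeap class name is an assumption for the example.

```python
class MinHeap:
    def __init__(self):
        self.items = []   # backing array; Python lists grow on demand

    def add(self, value):
        """Place value in the next free slot, then restore min-heap order."""
        self.items.append(value)
        i = len(self.items) - 1
        # swap upwards while the child is smaller than its parent
        while i > 0 and self.items[i] < self.items[(i - 1) // 2]:
            p = (i - 1) // 2
            self.items[i], self.items[p] = self.items[p], self.items[i]
            i = p


heap = MinHeap()
for v in (3, 9, 12, 7, 1):
    heap.add(v)
print(heap.items)  # [1, 3, 12, 9, 7]
```

Each insertion walks at most one root-to-leaf path, which is where the O(log n) bound comes from. Flipping the < comparison to > would give the max-heap variant.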

4.2 Deletion

Just as for insertion, deleting an item involves ensuring that heap ordering is preserved. The algorithm for deletion has three steps:

1. find the index of the value to delete

2. put the last value in the heap at the index location of the item to delete

3. verify heap ordering for each subtree which used to include the value

1) algorithm Remove(value)
2)   Pre: value is the value to remove from the heap
3)        left and right are updated aliases for 2 ∗ index + 1 and 2 ∗ index + 2 respectively
4)        Count is the number of items in the heap
5)        heap is the array used to store the heap items
6)   Post: value is located in the heap and removed, true; otherwise false
7)   // step 1
8)   index ← FindIndex(heap, value)
9)   if index < 0
10)    return false
11)  end if
12)  Count ← Count − 1
13)  // step 2
14)  heap[index] ← heap[Count]
15)  // step 3
16)  while left < Count and (heap[index] > heap[left] or heap[index] > heap[right])
17)    // promote smallest key from subtree
18)    if heap[left] < heap[right]
19)      Swap(heap, left, index)
20)      index ← left
21)    else
22)      Swap(heap, right, index)
23)      index ← right
24)    end if
25)  end while
26)  return true
27) end Remove

Figure 4.6 shows the Remove algorithm visually, removing 1 from a heap containing the values 1, 3, 9, 12, and 13. In Figure 4.6 you can assume that we have specified that the backing array of the heap should have an initial capacity of eight.

Please note that in our deletion algorithm we don't default the removed value in the heap array. If you are using a heap for reference types, i.e. objects that are allocated on a heap, you will want to free that memory. This is important in both unmanaged and managed languages. In the latter we will want to null that empty hole so that the garbage collector can reclaim that memory. If we were to not null that hole then the object could still be reached and thus won't be garbage collected.
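The three deletion steps can be sketched in Python as follows. This is a cautious variant rather than a transcription of Remove: it bounds-checks the right child before comparing it, and it also swaps upwards when the moved value is smaller than its new parent, which can happen when the removed item is not the root. The function name and the use of a plain list as the heap are assumptions.

```python
def remove(heap, value):
    """Remove one occurrence of value from a min-heap stored in a list."""
    try:
        index = heap.index(value)        # step 1: linear search
    except ValueError:
        return False
    last = heap.pop()                    # shrinking the list drops the reference
    if index == len(heap):
        return True                      # the removed item was the last one
    heap[index] = last                   # step 2: fill the hole
    # step 3a: move up if smaller than the new parent
    while index > 0 and heap[index] < heap[(index - 1) // 2]:
        p = (index - 1) // 2
        heap[index], heap[p] = heap[p], heap[index]
        index = p
    # step 3b: move down, promoting the smallest child each time
    while True:
        left, right = 2 * index + 1, 2 * index + 2
        smallest = index
        if left < len(heap) and heap[left] < heap[smallest]:
            smallest = left
        if right < len(heap) and heap[right] < heap[smallest]:
            smallest = right
        if smallest == index:
            return True
        heap[index], heap[smallest] = heap[smallest], heap[index]
        index = smallest


heap = [1, 3, 9, 12, 13]
remove(heap, 1)
print(heap)  # [3, 12, 9, 13]
```

Because list.pop discards the slot entirely, the question of nulling the vacated hole does not arise in this sketch; with a fixed-capacity array it would.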

Figure 4.6: Deleting an item from a heap

4.3 Searching

Searching a heap is merely a matter of traversing the items in the heap array sequentially, so this operation has a run time complexity of O(n). The search can be thought of as one that uses a breadth first traversal as defined in §3.7.4 to visit the nodes within the heap to check for the presence of a specified item.

1) algorithm Contains(value)
2)   Pre: value is the value to search the heap for
3)        Count is the number of items in the heap
4)        heap is the array used to store the heap items
5)   Post: value is located in the heap, in which case true; otherwise false
6)   i ← 0
7)   while i < Count and heap[i] ≠ value
8)     i ← i + 1
9)   end while
10)  if i < Count
11)    return true
12)  else
13)    return false
14)  end if
15) end Contains

The problem with the previous algorithm is that we don't take advantage of the properties that all values of a heap hold, that is, the property of the heap strategy being used. For instance, if we had a heap that didn't contain the value 4 we would have to exhaust the whole backing heap array before we could determine that it wasn't present in the heap. Factoring in what we know about the heap, we can optimise the search algorithm by including logic which makes use of the properties presented by a certain heap strategy.

Optimising to deterministically state that a value is in the heap is not that straightforward, however the problem is a very interesting one. As an example consider a min-heap that doesn't contain the value 5. We can only rule that the value is not in the heap if 5 > the parent of the current node being inspected and < the current node being inspected ∀ nodes at the current level we are traversing. If this is the case then 5 cannot be in the heap and so we can provide an answer without traversing the rest of the heap. If this property is not satisfied for any level of nodes that we are inspecting then the algorithm will indeed fall back to inspecting all the nodes in the heap. The cases that this optimisation covers can be very common, and so we feel that the extra logic within the loop is justified to prevent the expensive worst case run time.

The following algorithm is specifically designed for a min-heap. To tailor the algorithm for a max-heap the two comparison operations in the else if condition within the inner while loop should be flipped.

1) algorithm Contains(value)
2)   Pre: value is the value to search the heap for
3)        Count is the number of items in the heap
4)        heap is the array used to store the heap items
5)   Post: value is located in the heap, in which case true; otherwise false
6)   start ← 0
7)   nodes ← 1
8)   while start < Count
9)     start ← nodes − 1
10)    end ← nodes + start
11)    count ← 0
12)    while start < Count and start < end
13)      if value = heap[start]
14)        return true
15)      else if value > Parent(heap[start]) and value < heap[start]
16)        count ← count + 1
17)      end if
18)      start ← start + 1
19)    end while
20)    if count = nodes
21)      return false
22)    end if
23)    nodes ← nodes ∗ 2
24)  end while
25)  return false
26) end Contains

The new Contains algorithm determines that the value is not in the heap by checking whether count = nodes. When this is true we can confirm that ∀ nodes n at level i: value > Parent(n), value < n, thus there is no possible way that value is in the heap. As an example consider Figure 4.7. If we are searching for the value 10 within the min-heap displayed, it is obvious that we don't need to search the whole heap to determine that 10 is not present. We can verify this after traversing the nodes in the second level of the heap, as the previous expression defined holds true.

4.4 Traversal

As mentioned in §4.3, traversal of a heap is usually done like that of any other array data structure which our heap implementation is based upon. As a result you traverse the array starting at the initial array index (0 in most languages) and then visit each value within the array until you have reached the upper bound of the heap. You will note that in the search algorithm we use Count as this upper bound rather than the actual physical bound of the allocated array. Count is used to partition the conceptual heap from the actual array implementation of the heap: we only care about the items in the heap, not the whole array; the latter may contain various other bits of data as a result of heap mutation.

If you have followed the advice we gave in the deletion algorithm then a heap that has been mutated several times will contain some form of default value for items no longer in the heap. Potentially you will have at most LengthOf(heapArray) − Count garbage values in the backing heap array data structure. The garbage values of course vary from platform to platform. To make things simple the garbage value of a reference type will be simply ∅, and 0 for a value type.

Figure 4.8 shows a heap that you can assume has been mutated many times. For this example we can further assume that at some point the items in indexes 3 to 5 actually contained references to live objects of type T. In Figure 4.8 subscript is used to disambiguate separate objects of T.

From what you have read thus far you will most likely have picked up that traversing the heap in any other order would be of little benefit. The heap property only holds for the subtree of each node, and so traversing a heap in any other fashion requires some creative intervention. Heaps are not usually traversed in any other way than the one prescribed previously.
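Returning to the level-by-level search of §4.3, the early-exit idea can be sketched in Python as below. This is a simplified variant under a stated assumption: it counts only the nodes on a level that are greater than value and drops the Parent comparison, relying on the fact that in a min-heap every descendant is at least as large as its ancestor on that level. The function name and the sample heap are hypothetical.

```python
def contains(heap, value):
    """Search a min-heap stored in a list, stopping early when a full
    level consists only of values greater than the one sought."""
    start = 0   # index of the first node on the current level
    nodes = 1   # number of nodes a full level holds at this depth
    while start < len(heap):
        end = min(start + nodes, len(heap))
        larger = 0
        for i in range(start, end):
            if heap[i] == value:
                return True
            if heap[i] > value:
                larger += 1
        if larger == nodes:
            return False      # nothing deeper can equal value
        start += nodes        # move on to the next level
        nodes *= 2
    return False


heap = [1, 3, 12, 9, 7]
print(contains(heap, 7))   # True
print(contains(heap, 2))   # False, decided after the second level
```

As in the traversal discussion above, only the live portion of the array is scanned; here that is the whole list, because a Python list holds no spare capacity visible to the caller.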

Figure 4.7: Determining 10 is not in the heap after inspecting the nodes of Level 2

Figure 4.8: Living and dead space in the heap backing array

4.5 Summary

Heaps are most commonly used to implement priority queues (see §6.2 for a sample implementation) and to facilitate heap sort. As discussed in both the insertion (§4.1) and deletion (§4.2) sections, a heap maintains heap order according to the selected ordering strategy. These strategies are referred to as min-heap and max heap. The former strategy enforces that the value of a parent node is no greater than that of each of its children; the latter enforces that the value of the parent is no less than that of each of its children.

When you come across a heap and you are not told what strategy it enforces you should assume that it uses the min-heap strategy. If the heap can be configured otherwise, e.g. to use max-heap, then this will often require you to state this explicitly. The heap abides progressively to a strategy during the invocation of the insertion and deletion algorithms. The cost of such a policy is that upon each insertion and deletion we invoke algorithms that have logarithmic run time complexities. While the cost of maintaining the strategy might not seem overly expensive it does still come at a price. We will also have to factor in the cost of dynamic array expansion at some stage. This will occur if the number of items within the heap outgrows the space allocated in the heap's backing array. It may be in your best interest to research a good initial starting size for your heap array. This will assist in minimising the impact of dynamic array resizing.

Chapter 5

Sets

A set contains a number of values, in no particular order. The values within the set are distinct from one another.

Generally set implementations tend to check that a value is not in the set before adding it, avoiding the issue of repeated values from ever occurring.

This section does not cover set theory in depth; rather it demonstrates briefly the ways in which the values of sets can be defined, and common operations that may be performed upon them.

The notation A = {4, 7, 9, 12, 0} defines a set A whose values are listed within the curly braces.

Given the set A defined previously we can say that 4 is a member of A, denoted by 4 ∈ A, and that 99 is not a member of A, denoted by 99 ∉ A.

Often defining a set by manually stating its members is tiresome, and more importantly the set may contain a large number of values. A more concise way of defining a set and its members is by providing a series of properties that the values of the set must satisfy. For example, from the definition A = {x | x > 0, x % 2 = 0} the set A contains only positive integers that are even. x is an alias to the current value we are inspecting, and to the right hand side of | are the properties that x must satisfy to be in the set A. In this example, x must be > 0, and the remainder of the arithmetic expression x/2 must be 0. You will be able to note from the previous definition of the set A that the set can contain an infinite number of values, and that the values of the set A will be all even integers that are a member of the natural numbers set N, where N = {1, 2, 3, ...}.

Finally in this brief introduction to sets we will cover set intersection and union, both of which are very common operations (amongst many others) performed on sets. The union set can be defined as follows: A ∪ B = {x | x ∈ A or x ∈ B}, and intersection: A ∩ B = {x | x ∈ A and x ∈ B}. Figure 5.1 demonstrates set intersection and union graphically.

Given the set definitions A = {1, 2, 3} and B = {6, 2, 9}, the union of the two sets is A ∪ B = {1, 2, 3, 6, 9}, and the intersection of the two sets is A ∩ B = {2}.
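The notation above maps closely onto Python's built-in set type, which makes for a quick way to check the examples. The sketch uses only standard operators; the variable names and the upper bound of 20 on the set-builder example are choices made for illustration, since an infinite set cannot be materialised.

```python
A = {1, 2, 3}
B = {6, 2, 9}

print(A | B)   # union: the members 1, 2, 3, 6 and 9 (display order may vary)
print(A & B)   # intersection: {2}

# membership, using the earlier set {4, 7, 9, 12, 0}
C = {4, 7, 9, 12, 0}
print(4 in C, 99 not in C)   # True True

# {x | x > 0, x % 2 = 0}, restricted to x ≤ 20 for this sketch
evens = {x for x in range(1, 21) if x % 2 == 0}
```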

Both set union and intersection are sometimes provided within the framework associated with mainstream languages. This is the case in .NET 3.5 [1], where such algorithms exist as extension methods defined in the type System.Linq.Enumerable [2]; as a result DSA does not provide implementations of these algorithms. Most of the algorithms defined in System.Linq.Enumerable deal mainly with sequences rather than sets exclusively.

[1] http://www.microsoft.com/NET/

[2] http://msdn.microsoft.com/en-us/library/system.linq.enumerable_members.aspx

Set union can be implemented as a simple traversal of both sets, adding each item of the two sets to a new union set.

1) algorithm Union(set1, set2)
2)   Pre: set1 and set2 ≠ ∅
        union is a set
3)   Post: a union of set1 and set2 has been created
4)   foreach item in set1
5)     union.Add(item)
6)   end foreach
7)   foreach item in set2
8)     union.Add(item)
9)   end foreach
10)  return union
11) end Union

The run time of our Union algorithm is O(m + n), where m is the number of items in the first set and n is the number of items in the second set. This run time applies only to sets that exhibit O(1) insertions.

Set intersection is also trivial to implement. The only major thing worth pointing out about our algorithm is that we traverse the set containing the fewest items. We can do this because if we have exhausted all the items in the smaller of the two sets then there are no more items that are members of both sets, thus we have no more items to add to the intersection set.

Figure 5.1: a) A ∩ B; b) A ∪ B

1) algorithm Intersection(set1, set2)
2)   Pre: set1 and set2 ≠ ∅
        intersection and smallerSet are sets
3)   Post: an intersection of set1 and set2 has been created
4)   if set1.Count < set2.Count
5)     smallerSet ← set1
6)   else
7)     smallerSet ← set2
8)   end if
9)   foreach item in smallerSet
10)    if set1.Contains(item) and set2.Contains(item)
11)      intersection.Add(item)
12)    end if
13)  end foreach
14)  return intersection
15) end Intersection

The run time of our Intersection algorithm is O(n), where n is the number of items in the smaller of the two sets. Just like our Union algorithm, a linear run time can only be attained when operating on a set with O(1) insertion and, because each item is checked with Contains, O(1) lookup.
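The Union and Intersection algorithms can be written out explicitly in Python, using the built-in hash-based set only as the container. The function names are hypothetical. One small liberty is taken: the intersection tests membership only in the larger set, since every item drawn from the smaller set is already known to be a member of it.

```python
def union(set1, set2):
    """Add every item of both sets to a new set: O(m + n) insertions."""
    result = set()
    for item in set1:
        result.add(item)
    for item in set2:
        result.add(item)
    return result


def intersection(set1, set2):
    """Traverse the smaller set and keep the items the other set contains."""
    if len(set1) < len(set2):
        smaller, larger = set1, set2
    else:
        smaller, larger = set2, set1
    result = set()
    for item in smaller:
        if item in larger:
            result.add(item)
    return result


A = {1, 2, 3}
B = {6, 2, 9, 12}
print(union(A, B))         # the members 1, 2, 3, 6, 9 and 12
print(intersection(A, B))  # {2}
```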

5.1Unordered

Setsinthegeneralsensedonotenforcetheexplicitorderingoftheirmembers.Forexamplethemembersof B = {6, 2, 9} conformtonoorderingscheme becauseitisnotrequired.

MostlibrariesprovideimplementationsofunorderedsetsandsoDSAdoes not;wesimplymentionitheretodisambiguatebetweenanunorderedsetand orderedset.

Wewillonlylookatinsertionforanunorderedsetandcoverbrieﬂywhya hashtableisaneﬃcientdatastructuretouseforitsimplementation.

5.1.1Insertion

Anunorderedsetcanbeeﬃcientlyimplementedusingahashtableasitsbacking datastructure.Asmentionedpreviouslyweonlyaddanitemtoasetifthat itemisnotalreadyintheset,sothebackingdatastructureweusemusthave aquicklookupandinsertionruntimecomplexity.

Ahashmapgenerallyprovidesthefollowing:

1. O(1)forinsertion

2. approaching O(1)forlookup

Theabovedependsonhowgoodthehashingalgorithmofthehashtable is,butmosthashtablesemployincrediblyeﬃcientgeneralpurposehashing algorithmsandsotheruntimecomplexitiesforthehashtableinyourlibrary ofchoiceshouldbeverysimilarintermsofeﬃciency.

CHAPTER5.SETS 46

5.2 Ordered

An ordered set is similar to an unordered set in the sense that its members are distinct, but an ordered set enforces some predefined comparison on each of its members to produce a set whose members are ordered appropriately.

In DSA 0.5 and earlier we used a binary search tree (defined in §3) as the internal backing data structure for our ordered set. From version 0.6 onwards we replaced the binary search tree with an AVL tree, primarily because an AVL tree is balanced.

The ordered set has its order realised by performing an inorder traversal upon its backing tree data structure, which yields the correct ordered sequence of set members.

Because an ordered set in DSA is simply a wrapper for an AVL tree that additionally ensures that the tree contains unique items, you should read §7 to learn more about the runtime complexities associated with its operations.

5.3 Summary

Sets provide a way of having a collection of unique objects, either ordered or unordered.

When implementing a set (either ordered or unordered) it is key to select the correct backing data structure. As we discussed in §5.1.1, because we check first if the item is already contained within the set before adding it, we need this check to be as quick as possible. For unordered sets we can rely on the use of a hash table and use the key of an item to determine whether or not it is already contained within the set. Using a hash table, this check results in a near constant runtime complexity. Ordered sets cost a little more for this check; however, the logarithmic growth that we incur by using a balanced binary search tree as the backing data structure is acceptable.

Another key property of sets implemented using the approach we describe is that both have favourably fast look-up times. Just like the check before insertion, for a hash table this runtime complexity should be near constant. Ordered sets search their backing tree as described in §3, performing a binary chop at each stage when searching for the existence of an item, which yields a logarithmic runtime.

We can use sets to facilitate many algorithms that would otherwise be a little less clear in their implementation. For example, in §11.4 we use an unordered set to assist in the construction of an algorithm that determines the number of repeated words within a string.


Chapter 6

Queues

Queues are an essential data structure found in vast amounts of software, from user mode applications to kernel mode applications that are core to the system. Fundamentally they honour a first in first out (FIFO) strategy; that is, the item first put into the queue will be the first served, the second item added to the queue will be the second to be served, and so on.

A traditional queue only allows you to access the item at the front of the queue; when you add an item to the queue that item is placed at the back of the queue.

Historically queues always have the following three core methods:

Enqueue: places an item at the back of the queue;

Dequeue: retrieves the item at the front of the queue, and removes it from the queue;

Peek (sometimes referred to as Front): retrieves the item at the front of the queue without removing it from the queue.

As an example to demonstrate the behaviour of a queue we will walk through a scenario whereby we invoke each of the previously mentioned methods, observing the mutations upon the queue data structure. The following list describes the operations performed upon the queue in Figure 6.1:

1. Enqueue(10)
2. Enqueue(12)
3. Enqueue(9)
4. Enqueue(8)
5. Enqueue(3)
6. Dequeue()
7. Peek()

6.1 A standard queue

A queue is implicitly like that described prior to this section. In DSA we don't provide a standard queue because queues are so popular and such a core data structure that you will find pretty much every mainstream library provides a queue data structure that you can use with your language of choice. In this section we will discuss how you can, if required, implement an efficient queue data structure.

The main property of a queue is that we have access to the item at the front of the queue. The queue data structure can be efficiently implemented using a singly linked list (defined in §2.1). A singly linked list provides O(1) insertion and deletion runtime complexities, provided it keeps a pointer to its tail as well as its head so that enqueueing never has to traverse the list. The reason we have an O(1) runtime complexity for deletion is that we only ever remove items from the front of queues (with the Dequeue operation). Since we always have a pointer to the item at the head of a singly linked list, removal is simply a case of returning the value of the old head node, and then modifying the head pointer to be the next node of the old head node. The runtime complexity for searching a queue remains the same as that of a singly linked list: O(n).
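The linked-list approach described above might look like the following Python sketch. The class and method names (`LinkedQueue`, `enqueue`, `dequeue`, `peek`) are our own, and raising `IndexError` on an empty queue is an assumption; the text does not say how an empty queue should behave.

```python
class _Node:
    def __init__(self, value):
        self.value = value
        self.next = None


class LinkedQueue:
    """FIFO queue on a singly linked list with head and tail pointers."""

    def __init__(self):
        self._head = None  # front of the queue
        self._tail = None  # back of the queue

    def enqueue(self, value):
        node = _Node(value)
        if self._tail is None:  # empty queue
            self._head = self._tail = node
        else:
            self._tail.next = node
            self._tail = node

    def dequeue(self):
        if self._head is None:
            raise IndexError("dequeue from an empty queue")
        node = self._head
        self._head = node.next  # the old head's successor becomes the head
        if self._head is None:
            self._tail = None
        return node.value

    def peek(self):
        if self._head is None:
            raise IndexError("peek at an empty queue")
        return self._head.value
```

Both `enqueue` and `dequeue` touch a fixed number of pointers, which is where the O(1) behaviour comes from.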

6.2 Priority Queue

Unlike a standard queue where items are ordered in terms of who arrived first, a priority queue determines the order of its items by using a form of custom comparer to see which item has the highest priority. Other than the items in a priority queue being ordered by priority, it remains the same as a normal queue: you can only access the item at the front of the queue.

A sensible implementation of a priority queue is to use a heap data structure (defined in §4). Using a heap we can look at the first item in the queue by simply returning the item at index 0 within the heap array. A heap provides us with the ability to construct a priority queue where the items with the highest priority are either those with the smallest value, or those with the largest.
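A heap-backed priority queue can be sketched with Python's standard `heapq` module, which maintains a min-heap inside a plain list. The wrapper class and its method names are hypothetical; here the smallest value is treated as the highest priority.

```python
import heapq


class PriorityQueue:
    """Priority queue where the smallest value has the highest priority."""

    def __init__(self):
        self._heap = []  # heap stored as an array, as in the text

    def enqueue(self, item):
        heapq.heappush(self._heap, item)

    def peek(self):
        # The highest-priority item always sits at index 0 of the heap array.
        return self._heap[0]

    def dequeue(self):
        return heapq.heappop(self._heap)

    def __len__(self):
        return len(self._heap)
```

To favour the largest value instead, one common approach is to push negated keys.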

6.3 Double Ended Queue

Unlike the queues we have talked about previously in this chapter, a double ended queue allows you to access the items at both the front and back of the queue. A double ended queue is commonly known as a deque, which is the name we will use from here on.

A deque applies no prioritization strategy to its items like a priority queue does; items are added in order to either the front or back of the deque. Which end is used is decided by the programmer through the data structure's exposed interface.

8. Enqueue(33)
9. Peek()
10. Dequeue()

Figure 6.1: Queue mutations

Deques provide front and back specific versions of common queue operations, e.g. you may want to enqueue an item to the front of the queue rather than the back, in which case you would use a method with a name along the lines of EnqueueFront. The following list identifies operations that are commonly supported by deques:

• EnqueueFront

• EnqueueBack

• DequeueFront

• DequeueBack

• PeekFront

• PeekBack

Figure 6.2 shows a deque after the invocation of the following methods (in order):

1. EnqueueBack(12)

2. EnqueueFront(1)

3. EnqueueBack(23)

4. EnqueueFront(908)

5. DequeueFront()

6. DequeueBack()

The operations have a one-to-one translation in terms of behaviour with those of a normal queue, or priority queue. In some cases the set of algorithms that add an item to the back of the deque may be named as they are with normal queues, e.g. EnqueueBack may simply be called Enqueue and so on. Some frameworks also specify explicit behaviours that data structures must adhere to. This is certainly the case in .NET, where most collections implement an interface which requires the data structure to expose a standard Add method. In such a scenario you can safely assume that the Add method will simply enqueue an item to the back of the deque.

With respect to algorithmic runtime complexities a deque is the same as a normal queue. That is, enqueueing an item to the back of the queue is O(1); additionally, enqueueing an item to the front of the queue is also an O(1) operation.

A deque is a wrapper data structure that uses either an array or a doubly linked list. Using an array as the backing data structure would require the programmer to be explicit about the size of the array up front; this would provide an obvious advantage if the programmer could deterministically state the maximum number of items the deque would contain at any one time. Unfortunately in most cases this doesn't hold; as a result the backing array will inherently incur the expense of invoking a resizing algorithm, which would most likely be an O(n) operation. Such an approach would also leave the library developer to look at array minimization techniques as well: it could be that, after several invocations of the resizing algorithm and various mutations on the deque, we have an array taking up a considerable amount of memory yet we are only using a small percentage of that memory. Such a minimization algorithm would also be O(n), yet its invocation would be harder to gauge strategically.

Figure 6.2: Deque data structure after several mutations

To bypass all the aforementioned issues a deque typically uses a doubly linked list as its backing data structure. While a node that has two pointers consumes more memory than its array item counterpart, it makes redundant the need for expensive resizing algorithms, as the data structure increases in size dynamically. With a language that targets a garbage collected virtual machine, memory reclamation is an opaque process, as the nodes that are no longer referenced become unreachable and are thus marked for collection upon the next invocation of the garbage collection algorithm. With C++ or any other language that uses explicit memory allocation and deallocation it will be up to the programmer to decide when the memory that stores the object can be freed.
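To make the doubly linked list approach concrete, here is a small Python sketch. The names (`LinkedDeque`, `enqueue_front`, `to_list` and so on) are illustrative only, and raising `IndexError` on an empty deque is our assumption.

```python
class _DequeNode:
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None


class LinkedDeque:
    """Double ended queue backed by a doubly linked list."""

    def __init__(self):
        self._front = None
        self._back = None

    def enqueue_front(self, value):
        node = _DequeNode(value)
        if self._front is None:
            self._front = self._back = node
        else:
            node.next = self._front
            self._front.prev = node
            self._front = node

    def enqueue_back(self, value):
        node = _DequeNode(value)
        if self._back is None:
            self._front = self._back = node
        else:
            node.prev = self._back
            self._back.next = node
            self._back = node

    def dequeue_front(self):
        if self._front is None:
            raise IndexError("dequeue from an empty deque")
        node = self._front
        self._front = node.next
        if self._front is None:
            self._back = None
        else:
            self._front.prev = None
        return node.value

    def dequeue_back(self):
        if self._back is None:
            raise IndexError("dequeue from an empty deque")
        node = self._back
        self._back = node.prev
        if self._back is None:
            self._front = None
        else:
            self._back.next = None
        return node.value

    def to_list(self):
        """Front-to-back snapshot, handy for inspecting the deque."""
        items, node = [], self._front
        while node is not None:
            items.append(node.value)
            node = node.next
        return items
```

Replaying the six operations listed for Figure 6.2 with this sketch leaves the values 1 and 12 in the deque, front to back. In everyday Python code the standard `collections.deque` would normally be used instead.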

6.4 Summary

With normal queues we have seen that those who arrive first are dealt with first; that is, they are dealt with in a first-in-first-out (FIFO) order. Queues can be ever so useful; for example the Windows CPU scheduler uses a different queue for each priority of process to determine which should be the next process to utilise the CPU for a specified time quantum. Normal queues have constant insertion and deletion runtimes. Searching a queue is fairly unusual; typically you are only interested in the item at the front of the queue. Despite that, searching is usually exposed on queues and typically the runtime is linear.

In this chapter we have also seen priority queues, where those at the front of the queue have the highest priority and those near the back have the lowest. One implementation of a priority queue is to use a heap data structure as its backing store, so the runtimes for insertion, deletion, and searching are the same as those for a heap (defined in §4).

Queues are a very natural data structure, and while they are fairly primitive they can make many problems a lot simpler. For example the breadth first search defined in §3.7.4 makes extensive use of queues.


Chapter 7

AVL Tree

In the early 1960s G.M. Adelson-Velsky and E.M. Landis invented the first self-balancing binary search tree data structure, calling it the AVL tree.

An AVL tree is a binary search tree (BST, defined in §3) with a self-balancing condition stating that the difference between the heights of the left and right subtrees can be no more than one, see Figure 7.1. This condition, restored after each tree modification, forces the general shape of an AVL tree. Before continuing, let us focus on why balance is so important. Consider a binary search tree obtained by starting with an empty tree and inserting some values in the following order: 1, 2, 3, 4, 5.

The BST in Figure 7.2 represents the worst case scenario, in which the running time of all common operations such as search, insertion and deletion is O(n). By applying a balance condition we ensure that the worst case running time of each common operation is O(log n). The height of an AVL tree with n nodes is O(log n) regardless of the order in which values are inserted.

The AVL balance condition, known also as the node balance factor, represents an additional piece of information stored for each node. This is combined with a technique that efficiently restores the balance condition for the tree. In an AVL tree the inventors make use of a well-known technique called tree rotation.

Figure 7.1: The left and right subtrees of an AVL tree differ in height by at most 1

Figure 7.2: Unbalanced binary search tree

Figure 7.3: AVL trees, insertion order: a) 1, 2, 3, 4, 5; b) 1, 5, 4, 3, 2

7.1 Tree Rotations

A tree rotation is a constant time operation on a binary search tree that changes the shape of a tree while preserving standard BST properties. There are left and right rotations; both of them decrease the height of a BST by moving smaller subtrees down and larger subtrees up.

Figure 7.4: Tree left and right rotations

1) algorithm LeftRotation(node)
2)    Pre: node.Right ≠ ∅
3)    Post: node.Right is the new root of the subtree,
4)       node has become node.Right's left child and,
5)       BST properties are preserved
6)    RightNode ← node.Right
7)    node.Right ← RightNode.Left
8)    RightNode.Left ← node
9) end LeftRotation

1) algorithm RightRotation(node)
2)    Pre: node.Left ≠ ∅
3)    Post: node.Left is the new root of the subtree,
4)       node has become node.Left's right child and,
5)       BST properties are preserved
6)    LeftNode ← node.Left
7)    node.Left ← LeftNode.Right
8)    LeftNode.Right ← node
9) end RightRotation

The right and left rotation algorithms are symmetric. Only pointers are changed by a rotation, resulting in an O(1) runtime complexity; the other fields present in the nodes are not changed.
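The two rotations can be sketched in Python as below. The `Node` class is a hypothetical minimal node, and returning the new subtree root is our own choice: the pseudocode leaves re-linking the rotated subtree to its parent implicit, so here the caller is expected to store the returned node.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def left_rotation(node):
    """Rotate left around node; node.right must not be None."""
    right_node = node.right
    node.right = right_node.left  # the middle subtree changes parent
    right_node.left = node
    return right_node  # new root of this subtree


def right_rotation(node):
    """Rotate right around node; node.left must not be None."""
    left_node = node.left
    node.left = left_node.right
    left_node.right = node
    return left_node  # new root of this subtree
```

For instance, rotating the right-leaning chain 1, 2, 3 to the left makes 2 the root with 1 and 3 as its children, and a right rotation on the result restores the original chain.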

7.2 Tree Rebalancing

The algorithm that we present in this section verifies that the left and right subtrees differ at most in height by 1. If this property is not present then we perform the correct rotation.

Notice that we use two new algorithms that represent double rotations. These algorithms are named LeftAndRightRotation and RightAndLeftRotation. The algorithms are self documenting in their names, e.g. LeftAndRightRotation first performs a left rotation and then subsequently a right rotation.


1) algorithm CheckBalance(current)
2)    Pre: current is the node to start balancing from
3)    Post: current height has been updated while tree balance is, if needed,
4)       restored through rotations
5)    if current.Left = ∅ and current.Right = ∅
6)       current.Height ← -1
7)    else
8)       current.Height ← Max(Height(current.Left), Height(current.Right)) + 1
9)    endif
10)   if Height(current.Left) - Height(current.Right) > 1
11)      if Height(current.Left.Left) - Height(current.Left.Right) > 0
12)         RightRotation(current)
13)      else
14)         LeftAndRightRotation(current)
15)      endif
16)   elseif Height(current.Left) - Height(current.Right) < -1
17)      if Height(current.Right.Left) - Height(current.Right.Right) < 0
18)         LeftRotation(current)
19)      else
20)         RightAndLeftRotation(current)
21)      endif
22)   endif
23) end CheckBalance

7.3 Insertion

AVL insertion operates first by inserting the given value the same way as BST insertion and then by applying rebalancing techniques if necessary. The latter is only performed if the AVL property no longer holds, that is the left and right subtrees' heights differ by more than 1. Each time we insert a node into an AVL tree:

1. We go down the tree to find the correct point at which to insert the node, in the same manner as for BST insertion; then

2. we travel up the tree from the inserted node and check that the node balancing property has not been violated; if the property hasn't been violated then we need not rebalance the tree, the opposite is true if the balancing property has been violated.


1) algorithm Insert(value)
2)    Pre: value has passed custom type checks for type T
3)    Post: value has been placed in the correct location in the tree
4)    if root = ∅
5)       root ← node(value)
6)    else
7)       InsertNode(root, value)
8)    endif
9) end Insert

1) algorithm InsertNode(current, value)
2)    Pre: current is the node to start from
3)    Post: value has been placed in the correct location in the tree while
4)       preserving tree balance
5)    if value < current.Value
6)       if current.Left = ∅
7)          current.Left ← node(value)
8)       else
9)          InsertNode(current.Left, value)
10)      endif
11)   else
12)      if current.Right = ∅
13)         current.Right ← node(value)
14)      else
15)         InsertNode(current.Right, value)
16)      endif
17)   endif
18)   CheckBalance(current)
19) end InsertNode
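The insert-then-rebalance idea can be sketched in Python as follows. This is not the DSA implementation: all names are our own, each function returns the (possibly new) root of the subtree it was given so that parents can be re-linked, and we assume the common convention that an empty subtree has height -1 and a leaf has height 0, which may differ from the height convention used by CheckBalance.

```python
class AvlNode:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None
        self.height = 0  # assumed convention: leaf = 0, empty subtree = -1


def height(node):
    return node.height if node is not None else -1


def update_height(node):
    node.height = max(height(node.left), height(node.right)) + 1


def rotate_left(node):
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    update_height(node)
    update_height(pivot)
    return pivot


def rotate_right(node):
    pivot = node.left
    node.left = pivot.right
    pivot.right = node
    update_height(node)
    update_height(pivot)
    return pivot


def rebalance(node):
    """Restore the AVL property at node and return the subtree root."""
    update_height(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)  # left-right case
        return rotate_right(node)
    if balance < -1:  # right heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


def insert(node, value):
    """Insert value below node and return the new subtree root."""
    if node is None:
        return AvlNode(value)
    if value < node.value:
        node.left = insert(node.left, value)
    else:
        node.right = insert(node.right, value)
    return rebalance(node)  # checked on the way back up the tree


def inorder(node):
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)
```

Inserting 1, 2, 3, 4, 5 in that order with `root = insert(root, v)` keeps the tree at height 2 under this convention, instead of degenerating into a chain of five nodes.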

7.4 Deletion

Our deletion algorithm is like the one presented for our BST (defined in §3.3). The major difference is that we have to ensure that the tree still adheres to the AVL balance property after the removal of the node. If the tree doesn't need to be rebalanced and the value we are removing is contained within the tree then no further steps are required. However, when the value is in the tree and its removal upsets the AVL balance property then we must perform the correct rotation(s).


1) algorithm Remove(value)
2)    Pre: value is the value of the node to remove, root is the root node
3)       of the Avl
4)    Post: node with value is removed and tree rebalanced if found in which
5)       case yields true, otherwise false
6)    nodeToRemove ← root
7)    parent ← ∅
8)    Stack path ← root
9)    while nodeToRemove ≠ ∅ and nodeToRemove.Value ≠ value
10)      parent ← nodeToRemove
11)      if value < nodeToRemove.Value
12)         nodeToRemove ← nodeToRemove.Left
13)      else
14)         nodeToRemove ← nodeToRemove.Right
15)      endif
16)      path.Push(nodeToRemove)
17)   endwhile
18)   if nodeToRemove = ∅
19)      return false // value not in Avl
20)   endif
21)   parent ← FindParent(value)
22)   if count = 1 // count keeps track of the # of nodes in the Avl
23)      root ← ∅ // we are removing the only node in the Avl
24)   elseif nodeToRemove.Left = ∅ and nodeToRemove.Right = ∅
25)      // case #1
26)      if nodeToRemove.Value < parent.Value
27)         parent.Left ← ∅
28)      else
29)         parent.Right ← ∅
30)      endif
31)   elseif nodeToRemove.Left = ∅ and nodeToRemove.Right ≠ ∅
32)      // case #2
33)      if nodeToRemove.Value < parent.Value
34)         parent.Left ← nodeToRemove.Right
35)      else
36)         parent.Right ← nodeToRemove.Right
37)      endif
38)   elseif nodeToRemove.Left ≠ ∅ and nodeToRemove.Right = ∅
39)      // case #3
40)      if nodeToRemove.Value < parent.Value
41)         parent.Left ← nodeToRemove.Left
42)      else
43)         parent.Right ← nodeToRemove.Left
44)      endif
45)   else
46)      // case #4
47)      largestValue ← nodeToRemove.Left
48)      while largestValue.Right ≠ ∅
49)         // find the largest value in the left subtree of nodeToRemove
50)         largestValue ← largestValue.Right
51)      endwhile
52)      // set the parents' Right pointer of largestValue to ∅
53)      FindParent(largestValue.Value).Right ← ∅
54)      nodeToRemove.Value ← largestValue.Value
55)   endif
56)   while path.Count > 0
57)      CheckBalance(path.Pop()) // we track back to the root node, checking balance
58)   endwhile
59)   count ← count - 1
60)   return true
61) end Remove

7.5 Summary

The AVL tree is a sophisticated self balancing tree. It can be thought of as the smarter, younger brother of the binary search tree. Unlike its older brother the AVL tree avoids worst case linear complexity runtimes for its operations.

The AVL tree guarantees, via the enforcement of balancing algorithms, that the left and right subtrees differ in height by at most 1, which yields at most a logarithmic runtime complexity.

Part II Algorithms

Chapter 8

Sorting

All the sorting algorithms in this chapter use data structures of a specific type to demonstrate sorting, e.g. a 32 bit integer is often used as its associated operations (e.g. <, >, etc.) are clear in their behaviour.

The algorithms discussed can easily be translated into generic sorting algorithms within your respective language of choice.

8.1 Bubble Sort

One of the most simple forms of sorting is that of comparing each item with every other item in some list; however, as the description may imply, this form of sorting is not particularly efficient: O(n^2). In its most simple form bubble sort can be implemented as two loops.

1) algorithm BubbleSort(list)
2)    Pre: list ≠ ∅
3)    Post: list has been sorted into values of ascending order
4)    for i ← 0 to list.Count - 1
5)       for j ← 0 to list.Count - 1
6)          if list[i] < list[j]
7)             Swap(list[i], list[j])
8)          endif
9)       endfor
10)   endfor
11)   return list
12) end BubbleSort

8.2 Merge Sort

Merge sort is an algorithm that has a fairly efficient time complexity, O(n log n), and is fairly trivial to implement. The algorithm is based on splitting a list into two similar sized lists (left and right), sorting each list and then merging the sorted lists back together.

Note: the function MergeOrdered simply takes two ordered lists and makes them one.

1) algorithm Mergesort(list)
2)    Pre: list ≠ ∅
3)    Post: list has been sorted into values of ascending order
4)    if list.Count ≤ 1 // already sorted
5)       return list
6)    endif
7)    m ← list.Count / 2
8)    left ← list(m)
9)    right ← list(list.Count - m)
10)   for i ← 0 to left.Count - 1
11)      left[i] ← list[i]
12)   endfor
13)   for i ← 0 to right.Count - 1
14)      right[i] ← list[m + i] // the right half starts at index m
15)   endfor
16)   left ← Mergesort(left)
17)   right ← Mergesort(right)
18)   return MergeOrdered(left, right)
19) end Mergesort
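The text does not list MergeOrdered, so the following Python sketch includes one plausible version of it alongside the sort itself. Function names are our own, and list slicing is used in place of the explicit copy loops.

```python
def merge_ordered(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal items in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])  # at most one of these two is non-empty
    merged.extend(right[j:])
    return merged


def merge_sort(items):
    """Return a sorted copy of items using top-down merge sort."""
    if len(items) <= 1:  # already sorted
        return list(items)
    m = len(items) // 2
    left = merge_sort(items[:m])
    right = merge_sort(items[m:])  # the right half starts at index m
    return merge_ordered(left, right)
```

Note that this version returns a new list rather than sorting in place.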

Figure 8.1: Bubble Sort Iterations

8.3 Quick Sort

Quick sort is one of the most popular sorting algorithms based on the divide et impera (divide and conquer) strategy, resulting in an O(n log n) complexity on average. The algorithm starts by picking an item, called the pivot, and moving all smaller items before it and all greater items after it. This is the main quick sort operation, called partition, which is recursively repeated on the lesser and greater sublists until their size is one or zero, in which case the list is implicitly sorted.

Choosing an appropriate pivot, for example the median element, is fundamental for avoiding the drastically reduced performance of O(n^2).

Figure 8.2: Merge Sort Divide et Impera Approach

1) algorithm QuickSort(list)
2)    Pre: list ≠ ∅
3)    Post: list has been sorted into values of ascending order
4)    if list.Count ≤ 1 // already sorted
5)       return list
6)    endif
7)    pivot ← MedianValue(list)
8)    for i ← 0 to list.Count - 1
9)       if list[i] = pivot
10)         equal.Insert(list[i])
11)      endif
12)      if list[i] < pivot
13)         less.Insert(list[i])
14)      endif
15)      if list[i] > pivot
16)         greater.Insert(list[i])
17)      endif
18)   endfor
19)   return Concatenate(QuickSort(less), equal, QuickSort(greater))
20) end QuickSort
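A compact Python sketch of the three-way partition above follows. As a stand-in for MedianValue, whose definition is not shown in the text, it uses `statistics.median_low`, which returns an actual member of the list; be aware that this standard library call sorts its input internally, so it is used here only to illustrate the partitioning step, not as an efficient pivot strategy.

```python
import statistics


def quick_sort(items):
    """Return a sorted copy of items using a three-way partition."""
    if len(items) <= 1:  # zero or one item is implicitly sorted
        return list(items)
    pivot = statistics.median_low(items)  # stand-in for MedianValue
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quick_sort(less) + equal + quick_sort(greater)
```

Keeping a separate `equal` list means duplicates of the pivot are never passed to a recursive call, so the recursion always shrinks.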

Figure 8.3: Quick Sort Example (pivot median strategy)

8.4 Insertion Sort

Insertion sort is a somewhat interesting algorithm with an expensive runtime of O(n^2). It can be best thought of as a sorting scheme similar to that of sorting a hand of playing cards, i.e. you take one card and then look at the rest with the intent of building up an ordered set of cards in your hand.

1) algorithm Insertionsort(list)
2)    Pre: list ≠ ∅
3)    Post: list has been sorted into values of ascending order
4)    unsorted ← 1
5)    while unsorted < list.Count
6)       hold ← list[unsorted]
7)       i ← unsorted - 1
8)       while i ≥ 0 and hold < list[i]
9)          list[i + 1] ← list[i]
10)         i ← i - 1
11)      endwhile
12)      list[i + 1] ← hold
13)      unsorted ← unsorted + 1
14)   endwhile
15)   return list
16) end Insertionsort

Figure 8.4: Insertion Sort Iterations

8.5 Shell Sort

Put simply, shell sort can be thought of as a more efficient variation of insertion sort as described in §8.4; it achieves this mainly by comparing items of varying distances apart, resulting in a runtime complexity of O(n log^2 n).

Shell sort is fairly straightforward but may seem somewhat confusing at first, as it differs from other sorting algorithms in the way it selects items to compare. Figure 8.5 shows shell sort being run on an array of integers; the red coloured square is the current value we are holding.

1) algorithm ShellSort(list)
2)    Pre: list ≠ ∅
3)    Post: list has been sorted into values of ascending order
4)    increment ← list.Count / 2
5)    while increment ≠ 0
6)       current ← increment
7)       while current < list.Count
8)          hold ← list[current]
9)          i ← current - increment
10)         while i ≥ 0 and hold < list[i]
11)            list[i + increment] ← list[i]
12)            i ← i - increment
13)         endwhile
14)         list[i + increment] ← hold
15)         current ← current + 1
16)      endwhile
17)      increment ← increment / 2
18)   endwhile
19)   return list
20) end ShellSort
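The same algorithm in Python, as a direct transcription of the pseudocode (the function name is our own, and `//` is used because the increment must stay an integer):

```python
def shell_sort(items):
    """Sort items in place with shell sort, halving the gap each pass."""
    increment = len(items) // 2
    while increment != 0:
        # A gapped insertion sort: each item is compared with the items
        # that sit `increment` positions before it.
        for current in range(increment, len(items)):
            hold = items[current]
            i = current - increment
            while i >= 0 and hold < items[i]:
                items[i + increment] = items[i]
                i -= increment
            items[i + increment] = hold
        increment //= 2
    return items
```

When the increment reaches 1 the inner loops are exactly those of insertion sort, but by then the list is already nearly ordered.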

8.6 Radix Sort

Unlike the sorting algorithms described previously, radix sort uses buckets to sort items; each bucket holds items with a particular property called a key. Normally a bucket is a queue; each time radix sort is performed these buckets are emptied, starting with the smallest key bucket and ending with the largest. When looking at items within a list to sort we do so by isolating a specific key, e.g. in the example we are about to show we have a maximum of three keys for all items, that is the highest key we need to look at is hundreds. Because we are dealing with, in this example, base 10 numbers we have at any one point 10 possible key values 0..9, each of which has its own bucket. Before we show you this first simple version of radix sort let us clarify what we mean by isolating keys. Given the number 102, if we look at the first key, the ones, then we can see we have two of them; progressing to the next key, tens, we can see that the number has zero of them; finally we can see that the number has a single hundred. The number used as an example has in total three keys: ones, tens and hundreds.

For further clarification, what if we wanted to determine how many thousands the number 102 has? Clearly there are none, but when looking at a number in its final written form this is not so obvious, so when asked the question how many thousands does 102 have you should simply pad the number with a zero in that location, e.g. 0102; here it is more obvious that the key value at the thousands location is zero.

The last thing to identify before we actually show you a simple implementation of radix sort, one that works on only positive integers and requires you to specify the maximum key size in the list, is that we need a way to isolate a specific key at any one time. The solution is actually very simple, but it's not often you want to isolate a key in a number so we will spell it out clearly here. A key can be accessed from any integer with the following expression, using integer division: key ← (number / keyToAccess) % 10. As a simple example let's say that we want to access the tens key of the number 1290; the tens column is key 10 and so after substitution we get key ← (1290 / 10) % 10 = 9. The next key to look at for a number can be attained by multiplying the last key by ten, working from the ones column upwards in a sequential manner. The value of key is used in the following algorithm to work out the index of an array of queues to enqueue the item into.

1) algorithm Radix(list, maxKeySize)
2)    Pre: list ≠ ∅
3)       maxKeySize ≥ 0 and represents the largest key size in the list
4)    Post: list has been sorted
5)    queues ← Queue[10]
6)    indexOfKey ← 1
7)    for i ← 0 to maxKeySize - 1
8)       foreach item in list
9)          queues[GetQueueIndex(item, indexOfKey)].Enqueue(item)
10)      endforeach
11)      list ← CollapseQueues(queues)
12)      ClearQueues(queues)
13)      indexOfKey ← indexOfKey ∗ 10
14)   endfor
15)   return list
16) end Radix
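The Radix algorithm can be sketched in Python as below, under the same restrictions (non-negative integers, base 10, caller supplies the maximum key size). The helper `get_queue_index` is our guess at what GetQueueIndex does, based on the key expression given in the text; CollapseQueues and ClearQueues are folded into a single loop that drains each queue in key order.

```python
from collections import deque


def get_queue_index(item, index_of_key):
    """Isolate one decimal key, e.g. (1290, 10) gives the tens key 9."""
    return (item // index_of_key) % 10


def radix_sort(items, max_key_size):
    """Sort non-negative integers that have at most max_key_size digits."""
    queues = [deque() for _ in range(10)]  # one bucket per key value 0..9
    index_of_key = 1  # 1 = ones, 10 = tens, 100 = hundreds, ...
    for _ in range(max_key_size):
        for item in items:
            queues[get_queue_index(item, index_of_key)].append(item)
        # Collapse the queues from the smallest key to the largest,
        # emptying each one as we go.
        items = []
        for queue in queues:
            while queue:
                items.append(queue.popleft())
        index_of_key *= 10
    return items
```

Because each bucket is a FIFO queue, items with equal keys keep their relative order from the previous pass, which is what makes the digit-by-digit approach work.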

Figure 8.6 shows the members of queues from the algorithm described above operating on the list whose members are 90, 12, 8, 791, 123, and 61; the key we are interested in for each number is highlighted. Omitted queues in Figure 8.6 mean that they contain no items.

Figure 8.6: Radix sort base 10 algorithm

8.7 Summary

Throughout this chapter we have seen many different algorithms for sorting lists; some are very efficient (e.g. quick sort defined in §8.3), some are not (e.g. bubble sort defined in §8.1).

Selecting the correct sorting algorithm is usually dictated purely by efficiency, e.g. you would always choose merge sort over shell sort and so on. There are also other factors to look at though, and these are based on the actual implementation. Some algorithms are very nicely expressed in a recursive fashion, however these algorithms ought to be pretty efficient, e.g. implementing a linear, quadratic, or slower algorithm using recursion would be a very bad idea.

If you want to learn more about why you should be very, very careful when implementing recursive algorithms see Appendix C.

Chapter 9

Numeric

Unless stated otherwise the alias n denotes a standard 32 bit integer.

9.1 Primality Test

A simple algorithm that determines whether or not a given integer is a prime number, e.g. 2, 5, 7, and 13 are all prime numbers, however 6 is not as it can be the result of the product of two numbers that are < 6.

Any composite number n has a divisor no greater than √n, so to cut down the work done by the loop √n is used as the upper bound.

1) algorithm IsPrime(n)
2)    Post: n is determined to be a prime or not
3)    if n < 2
4)       return false
5)    endif
6)    for i ← 2 to sqrt(n) do
7)       if n % i = 0
8)          return false
9)       endif
10)   endfor
11)   return true
12) end IsPrime
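A trial-division primality test bounded by √n can be written in Python as follows. This is a sketch of the general technique rather than the DSA source; `math.isqrt` (Python 3.8 or later) is used so that the bound is an exact integer.

```python
import math


def is_prime(n):
    """Return True when n is prime, testing divisors up to sqrt(n)."""
    if n < 2:  # 0, 1 and negative numbers are not prime
        return False
    for i in range(2, math.isqrt(n) + 1):
        if n % i == 0:  # i divides n, so n is composite
            return False
    return True
```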

9.2 Base conversions

DSA contains a number of algorithms that convert a base 10 number to its equivalent binary, octal or hexadecimal form. For example, 78₁₀ has a binary representation of 1001110₂.

Table 9.1 shows the algorithm trace when the number to convert to binary is 742₁₀.

1) algorithm ToBinary(n)
2)    Pre: n ≥ 0
3)    Post: n has been converted into its base 2 representation
4)    while n > 0
5)       list.Add(n % 2)
6)       n ← n / 2
7)    endwhile
8)    return Reverse(list)
9) end ToBinary

n      list
742    { 0 }
371    { 0, 1 }
185    { 0, 1, 1 }
92     { 0, 1, 1, 0 }
46     { 0, 1, 1, 0, 0 }
23     { 0, 1, 1, 0, 0, 1 }
11     { 0, 1, 1, 0, 0, 1, 1 }
5      { 0, 1, 1, 0, 0, 1, 1, 1 }
2      { 0, 1, 1, 0, 0, 1, 1, 1, 0 }
1      { 0, 1, 1, 0, 0, 1, 1, 1, 0, 1 }

Table 9.1: Algorithm trace of ToBinary

9.3 Attaining the greatest common denominator of two numbers

A fairly routine problem in mathematics is that of finding the greatest common denominator (more commonly called the greatest common divisor) of two integers; what we are essentially after is the greatest number that divides both without remainder, e.g. the greatest common denominator of 9 and 15 is 3. One of the most elegant solutions to this problem is based on Euclid's algorithm, which needs a number of division steps that is logarithmic in the smaller of the two numbers.

1) algorithm GreatestCommonDenominator(m, n)
2)    Pre: m and n are integers
3)    Post: the greatest common denominator of the two integers is calculated
4)    if n = 0
5)       return m
6)    endif
7)    return GreatestCommonDenominator(n, m % n)
8) end GreatestCommonDenominator
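Euclid's algorithm translates almost line for line into Python. The function name mirrors the pseudocode and non-negative inputs are assumed.

```python
def greatest_common_denominator(m, n):
    """Euclid's algorithm for non-negative integers m and n."""
    if n == 0:
        return m
    # The common divisors of (m, n) are exactly those of (n, m % n).
    return greatest_common_denominator(n, m % n)
```

For 9 and 15 the calls proceed through (15, 9), (9, 6), (6, 3) and (3, 0), at which point 3 is returned. Python's standard library offers the same computation as `math.gcd`.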

9.4 Computing the maximum value for a number of a specific base consisting of N digits

This algorithm computes the maximum value of a number for a given number of digits, e.g. using the base 10 system the maximum number we can have made up of 4 digits is the number 9999₁₀. Similarly the maximum number that consists of 4 digits for a base 2 number is 1111₂, which is 15₁₀.

The expression by which we can compute this maximum value for N digits is B^N - 1. In the previous expression B is the number base, and N is the number of digits. As an example, if we wanted to determine the maximum value for a hexadecimal number (base 16) consisting of 6 digits the expression would be as follows: 16^6 - 1. The maximum value of the previous example would be represented as FFFFFF₁₆, which yields 16777215₁₀.

In the following algorithm numberBase should be considered restricted to the values of 2, 8, 10, and 16. For this reason in our actual implementation numberBase has an enumeration type. The Base enumeration type is defined as:

Base = {Binary ← 2, Octal ← 8, Decimal ← 10, Hexadecimal ← 16}

The reason we provide the definition of Base is to give you an idea how this algorithm can be modelled in a more readable manner rather than using various checks to determine the correct base to use. For our implementation we cast the value of numberBase to an integer; as such we extract the value associated with the relevant option in the Base enumeration. As an example, if we were to cast the option Octal to an integer we would get the value 8. In the algorithm listed below the cast is implicit so we just use the actual argument numberBase.

1) algorithm MaxValue(numberBase, n)
2)    Pre: numberBase is the number system to use, n is the number of digits
3)    Post: the maximum value for numberBase consisting of n digits is computed
4)    return Power(numberBase, n) - 1
5) end MaxValue
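One way to model the Base enumeration and MaxValue in Python is with `enum.IntEnum`, whose members convert to their integer values. The member names below are our own rendering of the enumeration in the text.

```python
from enum import IntEnum


class Base(IntEnum):
    BINARY = 2
    OCTAL = 8
    DECIMAL = 10
    HEXADECIMAL = 16


def max_value(number_base, n):
    """Largest value expressible with n digits in number_base: B^N - 1."""
    return int(number_base) ** n - 1
```

With this sketch, `max_value(Base.HEXADECIMAL, 6)` evaluates 16^6 - 1, the FFFFFF₁₆ example from the text.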

9.5 Factorial of a number

Attaining the factorial of a number is a primitive mathematical operation. Many implementations of the factorial algorithm are recursive as the problem is recursive in nature, however here we present an iterative solution. The iterative solution is presented because it too is trivial to implement and doesn't suffer from the use of recursion (for more on recursion see §C).

The factorial of 0 and 1 is 1. The aforementioned acts as a base case that we will build upon. The factorial of 2 is 2 ∗ the factorial of 1, similarly the factorial of 3 is 3 ∗ the factorial of 2 and so on. We can indicate that we are after the factorial of a number using the form N! where N is the number we wish to attain the factorial of. Our algorithm doesn't use such notation but it is handy to know.

1) algorithm Factorial(n)
2)    Pre: n ≥ 0, n is the number to compute the factorial of
3)    Post: the factorial of n is computed
4)    if n < 2
5)       return 1
6)    endif
7)    factorial ← 1
8)    for i ← 2 to n
9)       factorial ← factorial ∗ i
10)   endfor
11)   return factorial
12) end Factorial

9.6 Summary

In this chapter we have presented several numeric algorithms, most of which are simply here because they were fun to design. Perhaps the message that the reader should gain from this chapter is that algorithms can be applied to several domains to make work in that respective domain attainable. Numeric algorithms in particular drive some of the most advanced systems on the planet, computing such data as weather forecasts.

Chapter 10

Searching

10.1 Sequential Search

A simple algorithm that searches for a specific item inside a list. It operates by looping over each element, an O(n) operation, until a match occurs or the end is reached.

1) algorithm SequentialSearch(list, item)
2)    Pre: list ≠ ∅
3)    Post: return index of item if found, otherwise -1
4)    index ← 0
5)    while index < list.Count and list[index] ≠ item
6)       index ← index + 1
7)    endwhile
8)    if index < list.Count and list[index] = item
9)       return index
10)   endif
11)   return -1
12) end SequentialSearch

10.2 Probability Search

Probability search is a statistical sequential searching algorithm. In addition to searching for an item, it takes into account its frequency by swapping it with its predecessor in the list. The algorithm complexity still remains at O(n), but in a non-uniform items search the more frequent items are in the first positions, reducing list scanning time.

Figure 10.1 shows the resulting state of a list after searching for two items; notice how the searched items have had their search probability increased after each search operation respectively.

Figure 10.1: a) Search(12), b) Search(101)

1) algorithm ProbabilitySearch(list, item)
2)    Pre: list ≠ ∅
3)    Post: a boolean indicating whether the item is found or not;
         in the former case swap the found item with its predecessor
4)    index ← 0
5)    while index < list.Count and list[index] ≠ item
6)       index ← index + 1
7)    endwhile
8)    if index ≥ list.Count or list[index] ≠ item
9)       return false
10)   endif
11)   if index > 0
12)      Swap(list[index], list[index - 1])
13)   endif
14)   return true
15) end ProbabilitySearch
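In Python the probability search might be sketched like this (the function name is our own). Each successful search moves the found item one position towards the front.

```python
def probability_search(items, item):
    """Sequential search that promotes a found item by one position."""
    index = 0
    while index < len(items) and items[index] != item:
        index += 1
    if index >= len(items):  # reached the end without a match
        return False
    if index > 0:
        # Swap the found item with its predecessor.
        items[index], items[index - 1] = items[index - 1], items[index]
    return True
```

Searching repeatedly for the same value therefore walks it to the head of the list, where later searches find it sooner.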

10.3 Summary

In this chapter we have presented a few novel searching algorithms. We have presented more efficient searching algorithms earlier on, like for instance the logarithmic searching algorithm that AVL trees and BSTs use (defined in §3.2). We decided not to cover a searching algorithm known as binary chop (another name for binary search; binary chop usually refers to its array counterpart) as the reader has already seen such an algorithm in §3.

Searching algorithms and their efficiency largely depend on the underlying data structure being used to store the data. For instance it is quicker to determine whether an item is in a hash table than it is an array; similarly it is quicker to search a BST than it is a linked list. If you are going to search for data fairly often then we strongly advise that you sit down and research the data structures available to you. In most cases using a list or any other primarily linear data structure is down to lack of knowledge. Model your data and then research the data structures that best fit your scenario.


Chapter 11

Strings

Strings have their own chapter in this text purely because string operations and transformations are incredibly frequent within programs. The algorithms presented are based on problems the authors have come across previously, or were formulated to satisfy curiosity.

11.1 Reversing the order of words in a sentence

Defining algorithms for primitive string operations is simple, e.g. extracting a sub-string of a string; however, some algorithms that require more inventiveness can be a little more tricky.

The algorithm presented here does not simply reverse the characters in a string, rather it reverses the order of words within a string. This algorithm works on the principle that words are all delimited by white space, and using a few markers to define where words start and end we can easily reverse them.

algorithm ReverseWords(value)
    Pre: value ≠ ∅, sb is a string buffer
    Post: the words in value have been reversed
    last ← value.Length − 1
    start ← last
    while last ≥ 0
        // skip whitespace
        while start ≥ 0 and value[start] = whitespace
            start ← start − 1
        end while
        last ← start
        // march down to the index before the beginning of the word
        while start ≥ 0 and value[start] ≠ whitespace
            start ← start − 1
        end while
        // append chars from start + 1 to last to string buffer sb
        for i ← start + 1 to last
            sb.Append(value[i])
        end for
        // if this isn't the last word in the string add some whitespace after the word in the buffer
        if start > 0
            sb.Append(' ')
        end if
        last ← start − 1
        start ← last
    end while
    // check if we have added one too many whitespace to sb
    if sb.Length > 0 and sb[sb.Length − 1] = whitespace
        // cut the whitespace
        sb.Length ← sb.Length − 1
    end if
    return sb
end ReverseWords
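A rough Python sketch of the same backwards scan is shown below. It is an illustration rather than a transcription: the name reverse_words is ours, and instead of trimming a trailing space from a buffer it collects the words in a list and joins them with single spaces.

```python
def reverse_words(value):
    """Return the words of value in reverse order, separated by single spaces."""
    words = []
    last = len(value) - 1
    while last >= 0:
        # skip whitespace to find the end of a word
        while last >= 0 and value[last].isspace():
            last -= 1
        if last < 0:
            break
        # march down to the index just before the start of the word
        start = last
        while start >= 0 and not value[start].isspace():
            start -= 1
        words.append(value[start + 1:last + 1])
        last = start
    return " ".join(words)


result = reverse_words("  the quick  brown fox ")
```

Joining at the end sidesteps the question of whether a separator is needed after the final word, at the cost of holding the words in an intermediate list.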

11.2 Detecting a palindrome

Although not an algorithm that is frequently applied in real-life scenarios, detecting a palindrome is a fun and, as it turns out, pretty trivial algorithm to design.

The algorithm that we present has an O(n) run time complexity. Our algorithm uses two pointers at opposite ends of the string we are checking is a palindrome or not. These pointers march in towards each other, always checking that each character they point to is the same with respect to value. Figure 11.1 shows the IsPalindrome algorithm in operation on the string “Was it Eliot's toilet I saw?” If you remove all punctuation and whitespace from the aforementioned string you will find that it is a valid palindrome.

Figure 11.1: left and right pointers marching in towards one another

algorithm IsPalindrome(value)
    Pre: value ≠ ∅
    Post: value is determined to be a palindrome or not
    word ← value.Strip().ToUpperCase()
    left ← 0
    right ← word.Length − 1
    while word[left] = word[right] and left < right
        left ← left + 1
        right ← right − 1
    end while
    return word[left] = word[right]
end IsPalindrome
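The two-pointer idea can be sketched in Python as follows. The function name is_palindrome is ours, and filtering with str.isalnum is an assumption standing in for the book's Strip method, whose exact behaviour is not specified here.

```python
def is_palindrome(value):
    """Two-pointer palindrome check ignoring case, punctuation and whitespace."""
    # stands in for Strip().ToUpperCase(): keep letters and digits only
    word = [ch.upper() for ch in value if ch.isalnum()]
    left, right = 0, len(word) - 1
    while left < right:
        if word[left] != word[right]:
            return False
        left += 1
        right -= 1
    return True
```

Returning early on the first mismatch means a string with nothing left after stripping is treated as a palindrome, which is a choice of this sketch rather than something the book states.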

In the IsPalindrome algorithm we call a method by the name of Strip. This algorithm discards punctuation in the string, including whitespace. As a result word contains a heavily compacted representation of the original string, each character of which is in its uppercase representation.

Palindromes discard whitespace, punctuation, and case. Making these changes allows us to design a simple algorithm while making our algorithm fairly robust with respect to the palindromes it will detect.

11.3 Counting the number of words in a string

Counting the number of words in a string can seem pretty trivial at first, however there are a few cases that we need to be aware of:

1. tracking when we are in a word
2. updating the word count at the correct place
3. skipping whitespace that delimits the words

As an example consider the string “Ben ate hay”. Clearly this string contains three words, each of which is distinguished via whitespace. All of the previously listed points can be managed by using three variables:

1. index
2. wordCount
3. inWord


Of the previously listed, index keeps track of the current index we are at in the string, wordCount is an integer that keeps track of the number of words we have encountered, and finally inWord is a Boolean flag that denotes whether or not at the present time we are within a word. If we are not currently hitting whitespace we are in a word; the opposite is true if at the present index we are hitting whitespace.

What denotes a word? In our algorithm each word is separated by one or more occurrences of whitespace. We don't take into account any particular splitting symbols you may use, e.g. in .NET String.Split¹ can take a char (or array of characters) that determines a delimiter to use to split the characters within the string into chunks of strings, resulting in an array of sub-strings.

In Figure 11.2 we present a string indexed as an array. Typically the pattern is the same for most words, delimited by a single occurrence of whitespace. Figure 11.3 shows the same string, with the same number of words but with varying whitespace splitting them.

¹ http://msdn.microsoft.com/en-us/library/system.string.split.aspx

Figure 11.2: String with three words

Figure 11.3: String with varying number of whitespace delimiting the words
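The bookkeeping described above (index, wordCount and inWord) can be sketched in Python. This is a minimal sketch under our own assumptions, not the book's listing: the name word_count is ours, and it bumps the count on the first character of each word rather than when the whitespace after a word is reached.

```python
def word_count(value):
    """Count runs of non-whitespace characters in value."""
    count = 0
    in_word = False
    index = 0
    while index < len(value):
        if value[index].isspace():
            in_word = False
        elif not in_word:
            # first character of a new word
            in_word = True
            count += 1
        index += 1
    return count
```

Counting on entry to a word means leading whitespace, trailing whitespace and single-character words need no special handling.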

algorithm WordCount(value)
    Pre: value ≠ ∅
    Post: the number of words contained within value is determined
    inWord ← true
    wordCount ← 0
    index ← 0
    // skip initial whitespace
    while value[index] = whitespace and index < value.Length − 1
        index ← index + 1
    end while
    // was the string just whitespace?
    if index = value.Length − 1 and value[index] = whitespace
        return 0
    end if
    while index < value.Length
        if value[index] = whitespace
            // skip all whitespace
            while value[index] = whitespace and index < value.Length − 1
                index ← index + 1
            end while
            inWord ← false
            wordCount ← wordCount + 1
        else
            inWord ← true
        end if
        index ← index + 1
    end while
    // last word may have not been followed by whitespace
    if inWord
        wordCount ← wordCount + 1
    end if
    return wordCount
end WordCount

11.4 Determining the number of repeated words within a string

With the help of an unordered set, and an algorithm that can split the words within a string using a specified delimiter, this algorithm is straightforward to implement. If we split all the words using a single occurrence of whitespace as our delimiter we get all the words within the string back as elements of an array. Then if we iterate through these words, adding them to a set which contains only unique strings, we can attain the number of unique words from the string. All that is left to do is subtract the unique word count from the total number of strings contained in the array returned from the split operation. The split operation that we refer to is the same as that mentioned in §11.3.

Figure 11.4: a) Undesired uniques set; b) desired uniques set

algorithm RepeatedWordCount(value)
    Pre: value ≠ ∅
    Post: the number of repeated words in value is returned
    words ← value.Split(' ')
    uniques ← Set
    foreach word in words
        uniques.Add(word.Strip())
    end foreach
    return words.Length − uniques.Count
end RepeatedWordCount
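A hedged Python sketch of the set-based approach follows. The name repeated_word_count is ours; str.split() with no argument splits on runs of whitespace (slightly more forgiving than a single-space delimiter), and stripping string.punctuation from both ends of each word is an assumption standing in for Strip.

```python
import string


def repeated_word_count(value):
    """Number of words in value that repeat an earlier word."""
    words = value.split()
    # strip surrounding punctuation so "test" and "test!" count as the same word
    uniques = {word.strip(string.punctuation) for word in words}
    return len(words) - len(uniques)
```

The comparison is case sensitive in this sketch, so "Test" and "test" would be treated as different words.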

You will notice in the RepeatedWordCount algorithm that we use the Strip method we referred to earlier in §11.2. This simply removes any punctuation from a word. The reason we perform this operation on each word is so that we can build a more accurate unique string collection, e.g. “test” and “test!” are the same word minus the punctuation. Figure 11.4 shows the undesired and desired sets for the uniques set respectively.

11.5 Determining the first matching character between two strings

The algorithm to determine whether any character of a string matches any of the characters in another string is pretty trivial. Put simply, we can parse the strings considered using a double loop and check, discarding punctuation, the equality between any characters, thus returning a non-negative index that represents the location of the first character in the match (Figure 11.5); otherwise we return -1 if no match occurs. This approach exhibits a run time complexity of O(n²).


algorithm Any(word, match)
    Pre: word, match ≠ ∅
    Post: index representing match location if occurred, −1 otherwise
    for i ← 0 to word.Length − 1
        while word[i] = whitespace
            i ← i + 1
        end while
        for index ← 0 to match.Length − 1
            while match[index] = whitespace
                index ← index + 1
            end while
            if match[index] = word[i]
                return index
            end if
        end for
    end for
    return −1
end Any
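The double loop can be sketched in Python as shown below. The name any_match is ours, and whitespace characters are simply skipped with a condition rather than by advancing the loop counters, which avoids stepping past the end of either string.

```python
def any_match(word, match):
    """Index in match of the first character shared with word, or -1."""
    for ch in word:
        if ch.isspace():
            continue
        for index, candidate in enumerate(match):
            if not candidate.isspace() and candidate == ch:
                return index
    return -1
```

As in the pseudocode, the returned index refers to a position in match, and the outer loop decides which character of word is tried first.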

11.6 Summary

We hope that the reader has seen how fun algorithms on string data types are. Strings are probably the most common data type (and data structure - remember we are dealing with an array) that you will work with, so it's important that you learn to be creative with them. We for one find strings fascinating. A simple Google search on string nuances between languages and encodings will provide you with a great number of problems. Now that we have spurred you along a little with our introductory algorithms you can devise some of your own.

Figure 11.5: a) First step; b) second step; c) match occurred

Appendix A

Algorithm Walkthrough

Learning how to design good algorithms can be assisted greatly by using a structured approach to tracing their behaviour. In most cases tracing an algorithm only requires a single table. Often, though, tracing alone is not enough; you will also want to use a diagram of the data structure your algorithm operates on. This diagram will be used to visualise the problem more effectively. Seeing things visually can help you understand the problem quicker, and better.

The trace table will store information about the variables used in your algorithm. The values within this table are constantly updated when the algorithm mutates them. Such an approach allows you to attain a history of the various values each variable has held. You may also be able to infer patterns from the values each variable has contained so that you can make your algorithm more efficient.

We have found this approach both simple and powerful. Combining a visual representation of the problem with a history of past values generated by the algorithm can make understanding and solving problems much easier.

In this chapter we will show you how to work through both iterative and recursive algorithms using the technique outlined.

A.1 Iterative algorithms

We will trace the IsPalindrome algorithm (defined in §11.2) as our example iterative walkthrough. Before we even look at the variables the algorithm uses, first we will look at the actual data structure the algorithm operates on. It should be pretty obvious that we are operating on a string, but how is this represented? A string is essentially a block of contiguous memory that consists of some char data types, one after the other. Each character in the string can be accessed via an index much like you would do when accessing items within an array. The picture should be presenting itself - a string can be thought of as an array of characters.

For our example we will use IsPalindrome to operate on the string “Never odd or even”. Now that we know how the string data structure is represented, and the value of the string we will operate on, let's go ahead and draw it as shown in Figure A.1.

Figure A.1: Visualising the data structure we are operating on

The IsPalindrome algorithm uses the following list of variables in some form throughout its execution:

1. value
2. word
3. left
4. right

Having identified the variables we need to keep track of, we simply create a column for each in a table as shown in Table A.1.

value    word    left    right

Table A.1: A column for each variable we wish to track

Now, using the IsPalindrome algorithm, execute each statement, updating the variable values in the table appropriately. Table A.2 shows the final table values for each variable used in IsPalindrome respectively.

value                  word                left    right
“Never odd or even”    “NEVERODDOREVEN”    0       13
                                           1       12
                                           2       11
                                           3       10
                                           4       9
                                           5       8
                                           6       7
                                           7       6

Table A.2: Algorithm trace for IsPalindrome

While this approach may look a little bloated in print, on paper it is much more compact. Where we have the strings in the table you should annotate these strings with array indexes to aid the algorithm walkthrough.

There is one other point that we should clarify at this time - whether to include variables that change only a few times, or not at all, in the trace table. In Table A.2 we have included both the value and word variables because it was convenient to do so. You may find that you want to promote these values to a larger diagram (like that in Figure A.1) and only use the trace table for variables whose values change during the algorithm. We recommend that you promote the core data structure being operated on to a larger diagram outside of the table so that you can interrogate it more easily.
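A trace table can also be produced mechanically. The Python sketch below records the left and right values at each step of a palindrome check; the name trace_is_palindrome and the use of str.isalnum in place of Strip are our own assumptions, and the input is assumed to contain at least one letter or digit.

```python
def trace_is_palindrome(value):
    """Return the palindrome verdict plus the (left, right) pairs visited."""
    # assumes value contains at least one letter or digit
    word = "".join(ch.upper() for ch in value if ch.isalnum())
    left, right = 0, len(word) - 1
    rows = [(left, right)]
    while left < right and word[left] == word[right]:
        left += 1
        right -= 1
        rows.append((left, right))
    return word[left] == word[right], rows


verdict, rows = trace_is_palindrome("Never odd or even")
for left, right in rows:
    print(f"{left:>4} {right:>5}")
```

Printing the rows gives the same left and right columns you would otherwise fill in by hand.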

We cannot stress enough how important such traces are when designing your algorithm. You can use these trace tables to verify algorithm correctness. At the cost of a simple table, and a quick sketch of the data structure you are operating on, you can devise correct algorithms quicker. Visualising the problem domain and keeping track of changing data makes problems a lot easier to solve. Moreover you always have a point of reference which you can look back on.

A.2 Recursive Algorithms

For the most part working through recursive algorithms is as simple as walking through an iterative algorithm. One of the things that we need to keep track of though is which method call returns to which caller. Most recursive algorithms are much simpler to follow when you draw out the recursive calls rather than using a table based approach. In this section we will use a recursive implementation of an algorithm that computes a number from the Fibonacci sequence.

algorithm Fibonacci(n)
    Pre: n is the number in the Fibonacci sequence to compute
    Post: the Fibonacci sequence number n has been computed
    if n < 1
        return 0
    else if n < 2
        return 1
    end if
    return Fibonacci(n − 1) + Fibonacci(n − 2)
end Fibonacci
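For readers who want to run the listing, a direct Python rendering might look like this. The function name fibonacci is ours; the structure follows the pseudocode line for line.

```python
def fibonacci(n):
    """Naive recursive Fibonacci, mirroring the cases of the pseudocode."""
    if n < 1:
        return 0
    if n < 2:
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)


first_eight = [fibonacci(i) for i in range(8)]
```

Keep n small when experimenting, since the number of calls grows very quickly with n.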

Before we jump into showing you a diagrammatic representation of the algorithm calls for the Fibonacci algorithm we will briefly talk about the cases of the algorithm. The algorithm has three cases in total:

1. n < 1
2. n < 2
3. n ≥ 2

The first two items in the preceding list are the base cases of the algorithm. Until we hit one of our base cases in our recursive method call tree we won't return anything. The third item from the list is our recursive case.

With each call to the recursive case we edge ever closer to one of our base cases. Figure A.2 shows a diagrammatic representation of the recursive call chain. In Figure A.2 the order in which the methods are called is labelled. Figure A.3 shows the call chain annotated with the return values of each method call as well as the order in which methods return to their callers. In Figure A.3 the return values are represented as annotations to the red arrows.

Figure A.2: Call chain for Fibonacci algorithm

Figure A.3: Return chain for Fibonacci algorithm

It is important to note that each recursive call only ever returns to its caller upon hitting one of the two base cases. When you do eventually hit a base case that branch of recursive calls ceases. Upon hitting a base case you go back to the caller and continue execution of that method. Execution in the caller is continued at the next statement, or expression, after the recursive call was made.

In the Fibonacci algorithm's recursive case we make two recursive calls. When the first recursive call (Fibonacci(n − 1)) returns to the caller we then execute the second recursive call (Fibonacci(n − 2)). After both recursive calls have returned to their caller, the caller can then subsequently return to its caller and so on.

Recursive algorithms are much easier to demonstrate diagrammatically, as Figure A.2 demonstrates. When you come across a recursive algorithm draw method call diagrams to understand how the algorithm works at a high level.
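If drawing the call diagram by hand feels error prone, the order of calls can be recorded by the algorithm itself. The Python sketch below is one possible way to do that; traced_fibonacci, calls and depth are names we have introduced for illustration.

```python
def traced_fibonacci(n, calls, depth=0):
    """Recursive Fibonacci that records (depth, n) for every call made."""
    calls.append((depth, n))
    if n < 1:
        return 0
    if n < 2:
        return 1
    first = traced_fibonacci(n - 1, calls, depth + 1)
    second = traced_fibonacci(n - 2, calls, depth + 1)
    return first + second


calls = []
result = traced_fibonacci(3, calls)
for depth, n in calls:
    # indent each call by its depth to mimic a call diagram
    print("    " * depth + f"Fibonacci({n})")
```

The indentation in the printed output plays the role of the arrows in a hand-drawn call chain: a deeper indent means the call was made by the line above it at the previous level.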

A.3 Summary

Understanding algorithms can be hard at times, particularly from an implementation perspective. In order to understand an algorithm try to work through it using trace tables. In cases where the algorithm is also recursive, sketch the recursive calls out so you can visualise the call/return chain.

In the vast majority of cases implementing an algorithm is simple provided that you know how the algorithm works. Mastering how an algorithm works from a high level is key for devising a well designed solution to the problem in hand.

Appendix B

Translation Walkthrough

The conversion from pseudocode to an actual imperative language is usually very straightforward; to clarify, an example is provided. In this example we will convert the algorithm in §9.1 to the C# language.

public static bool IsPrime(int number)
{
    if (number < 2)
    {
        return false;
    }
    int innerLoopBound = (int)Math.Floor(Math.Sqrt(number));
    for (int i = 1; i < number; i++)
    {
        for (int j = 1; j <= innerLoopBound; j++)
        {
            if (i * j == number)
            {
                return false;
            }
        }
    }
    return true;
}

For the most part the conversion is a straightforward process, however you may have to inject various calls to other utility algorithms to ascertain the correct result.

A consideration to take note of is that many algorithms have fairly strict preconditions, of which there may be several - in these scenarios you will need to inject the correct code to handle such situations to preserve the correctness of the algorithm. Most of the preconditions can be suitably handled by throwing the correct exception.
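As a small illustration of turning a precondition into an exception, the Python sketch below guards a word-reversal routine against a null argument. The function name reverse_words_checked and the choice of ValueError are our own assumptions; other languages and code bases will have their own preferred exception types.

```python
def reverse_words_checked(value):
    """Reverse word order, rejecting input that breaks the precondition."""
    if value is None:
        # Pre: value ≠ ∅
        raise ValueError("value must not be None")
    return " ".join(reversed(value.split()))


try:
    reverse_words_checked(None)
    rejected = False
except ValueError:
    rejected = True
```

Checking the precondition at the top of the method keeps the remainder of the translation free of defensive branches.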

B.1 Summary

As you can see from the example used in this chapter we have tried to make the translation of our pseudocode algorithms to mainstream imperative languages as simple as possible.

Whenever you encounter a keyword within our pseudocode examples that you are unfamiliar with just browse to Appendix E which describes each keyword.

Appendix C

Recursive Vs. Iterative Solutions

One of the most succinct properties of modern programming languages like C++, C#, and Java (as well as many others) is that these languages allow you to define methods that reference themselves; such methods are said to be recursive. One of the biggest advantages recursive methods bring to the table is that they usually result in more readable, and compact solutions to problems.

A recursive method then is one that is defined in terms of itself. Generally a recursive algorithm has two main properties:

1. One or more base cases; and
2. A recursive case

For now we will briefly cover these two aspects of recursive algorithms. With each recursive call we should be making progress to our base case, otherwise we are going to run into trouble. The trouble we speak of manifests itself typically as a stack overflow; we will describe why later.

Now that we have briefly described what a recursive algorithm is and why you might want to use such an approach for your algorithms we will now talk about iterative solutions. An iterative solution uses no recursion whatsoever. An iterative solution relies only on the use of loops (e.g. for, while, do-while, etc). The downside to iterative algorithms is that they tend not to be as clear as their recursive counterparts with respect to their operation. The major advantage of iterative solutions is speed. Most production software you will find uses little or no recursion whatsoever. The latter property can sometimes be a company's prerequisite to checking in code, e.g. upon checking in a static analysis tool may verify that the code the developer is checking in contains no recursive algorithms. Normally it is systems level code that has this zero tolerance policy for recursive algorithms.

Using recursion should always be reserved for fast algorithms; you should avoid it for algorithms with the following run time complexities:

1. O(n²)
2. O(n³)
3. O(2ⁿ)

If you use recursion for algorithms with any of the above run time complexities you are inviting trouble. The growth rate of these algorithms is high and in most cases such algorithms will lean very heavily on techniques like divide and conquer. While constantly splitting problems into smaller problems is good practice, in these cases you are going to be spawning a lot of method calls. All this overhead (method calls don't come that cheap) will soon pile up and either cause your algorithm to run a lot slower than expected, or worse, you will run out of stack space. When you exceed the allotted stack space for a thread the process will be shut down by the operating system. This is the case irrespective of the platform you use, e.g. .NET, or native C++ etc. You can ask for a bigger stack size, but you typically only want to do this if you have a very good reason to do so.

C.1 Activation Records

An activation record is created every time you invoke a method. Put simply an activation record is something that is put on the stack to support method invocation. Activation records take a small amount of time to create, and are pretty lightweight.

Normally an activation record for a method call is as follows (this is very general):

• The actual parameters of the method are pushed onto the stack
• The return address is pushed onto the stack
• The top-of-stack index is incremented by the total amount of memory required by the local variables within the method
• A jump is made to the method

In many recursive algorithms operating on large data structures, or algorithms that are inefficient, you will run out of stack space quickly. Consider an algorithm that, when invoked with a specific value, creates many recursive calls. In such a case a big chunk of the stack will be consumed. We will have to wait until the activation records start to be unwound after the nested methods in the call chain exit and return to their respective caller. When a method exits its activation record is unwound. Unwinding an activation record results in several steps:

1. The top-of-stack index is decremented by the total amount of memory consumed by the method
2. The return address is popped off the stack
3. The top-of-stack index is decremented by the total amount of memory consumed by the actual parameters

While activation records are an efficient way to support method calls they can build up very quickly. Recursive algorithms can exhaust the stack size allocated to the thread fairly fast given the chance.
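The build-up of activation records is easy to observe. The Python sketch below sums a list recursively (one nested call per element) and iteratively; the function names are ours. Note one platform-specific detail: CPython guards its stack with a recursion limit and raises RecursionError instead of letting the process be shut down.

```python
import sys


def recursive_sum(values, index=0):
    """Sum a list with one nested call (one activation record) per element."""
    if index == len(values):
        return 0
    return values[index] + recursive_sum(values, index + 1)


def iterative_sum(values):
    """Sum a list with a loop; stack use does not grow with the input."""
    total = 0
    for item in values:
        total += item
    return total


big = list(range(sys.getrecursionlimit() * 10))
try:
    recursive_sum(big)
    overflowed = False
except RecursionError:
    # the interpreter stops the recursion before the real stack is exhausted
    overflowed = True
```

The iterative version handles the same input without difficulty, because it needs only a single activation record no matter how long the list is.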

Just about now we should be dusting the cobwebs off the age old example of an iterative vs. recursive solution in the form of the Fibonacci algorithm. This is a famous example as it highlights both the beauty and pitfalls of a recursive algorithm. The iterative solution is not as pretty, nor self documenting, but it does the job a lot quicker. If we were to give the recursive Fibonacci algorithm an input of say 60 then we would have to wait a while to get the value back because it has an O(gⁿ) run time, where g is the golden ratio (roughly 1.618). The iterative version on the other hand has an O(n) run time. Don't let this put you off recursion. This example is mainly used to shock programmers into thinking about the ramifications of recursion rather than warning them off.
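An iterative Fibonacci might be sketched in Python as below. The name fibonacci_iterative is ours; the loop keeps only the two most recent values, which is where the O(n) run time comes from.

```python
def fibonacci_iterative(n):
    """Compute the n-th Fibonacci number with a loop in O(n) additions."""
    previous, current = 0, 1
    for _ in range(n):
        previous, current = current, previous + current
    return previous


sixtieth = fibonacci_iterative(60)
```

An input of 60 returns immediately here, whereas the naive recursive version would have to make an enormous number of calls to reach the same answer.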

C.2 Some problems are recursive in nature

Something that you may come across is that some data structures and algorithms are actually recursive in nature. A perfect example of this is a tree data structure. A common tree node usually contains a value, along with two pointers to two other nodes of the same node type. As you can see a tree is recursive in its makeup, with each node possibly pointing to two other nodes.

When using recursive algorithms on trees it makes sense as you are simply adhering to the inherent design of the data structure you are operating on. Of course it is not all good news, after all we are still bound by the limitations we have mentioned previously in this chapter.

We can also look at sorting algorithms like merge sort, and quick sort. Both of these algorithms are recursive in their design and so it makes sense to model them recursively.
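To make the point about trees concrete, here is a small Python sketch of a recursive node type and a recursive operation over it. The Node class and the height function are illustrative names of our own, not types defined by the book.

```python
class Node:
    """A binary tree node: a value plus references to two nodes of the same type."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    """Number of nodes on the longest path from node down to a leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


tree = Node(8, Node(3, Node(1)), Node(10))
tree_height = height(tree)
```

The function mirrors the shape of the data: the empty tree is the base case, and each node defers to its two subtrees.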

C.3 Summary

Recursion is a powerful tool, and one that all programmers should know of. Often software projects will take a trade between readability and efficiency, in which case recursion is great provided you don't go and use it to implement an algorithm with a quadratic run time or higher. Of course this is not a hard and fast rule; this is just us erring on the side of caution. Defensive coding will always prevail.

Many times recursion has a natural home in recursive data structures and algorithms which are recursive in nature. Using recursion in such scenarios is perfectly acceptable. Using recursion for something like linked list traversal is a little overkill. Its iterative counterpart is probably fewer lines of code than its recursive counterpart.

Because we can only talk about the implications of using recursion from an abstract point of view you should consult your compiler and run time environment for more details. It may be the case that your compiler recognises things like tail recursion and can optimise them. This isn't unheard of, in fact most commercial compilers will do this. The amount of optimisation compilers can do though is somewhat limited by the fact that you are still using recursion. You, as the developer, have to accept a certain accountability for performance.

Appendix D

Testing

Testing is an essential part of software development. Testing has often been discarded by many developers in the belief that the burden of proof of their software is on those within the company who hold test centric roles. This couldn't be further from the truth. As a developer you should at least provide a suite of unit tests that verify certain boundary conditions of your software.

A great thing about testing is that you progressively build up a safety net. If you add or tweak algorithms and then run your suite of tests you will be quickly alerted to any cases that you have broken with your recent changes. Such a suite of tests in any sizeable project is absolutely essential to maintaining a fairly high bar when it comes to quality. Of course in order to attain such a standard you need to think carefully about the tests that you construct.

Unit testing, which will be the subject of the vast majority of this chapter, is widely supported on most platforms. Most modern languages like C++, C#, and Java offer an impressive catalogue of testing frameworks that you can use for unit testing.

The following list identifies testing frameworks which are popular:

JUnit: Targeted at Java. http://www.junit.org/

NUnit: Can be used with languages that target Microsoft's Common Language Runtime. http://www.nunit.org/index.php

Boost Test Library: Targeted at C++. The test library that ships with the incredibly popular Boost libraries. http://www.boost.org. A direct link to the library's documentation: http://www.boost.org/doc/libs/1_36_0/libs/test/doc/html/index.html

CppUnit: Targeted at C++. http://cppunit.sourceforge.net/

Don't worry if you think that the list is very sparse, there are far more on offer than those that we have listed. The ones listed are the testing frameworks that we believe are the most popular for C++, C#, and Java.

D.1 What constitutes a unit test?

A unit test should focus on a single atomic property of the subject being tested. Do not try and test many things at once; this will result in a suite of somewhat unstructured tests. As an example, if you were wanting to write a test that verified that a particular value V is returned from a specific input I then your test should do the smallest amount of work possible to verify that V is correct given I. A unit test should be simple and self describing.

As well as a unit test being relatively atomic you should also make sure that your unit tests execute quickly. If you can imagine in the future when you may have a test suite consisting of thousands of tests you want those tests to execute as quickly as possible. Failure to attain such a goal will most likely result in the suite of tests not being run that often by the developers on your team. This can occur for a number of reasons but the main one would be that it becomes incredibly tedious waiting several minutes to run tests on a developer's local machine.

Building up a test suite can help greatly in a team scenario, particularly when using a continuous build server. In such a scenario you can have the suite of tests devised by the developers and testers run as part of the build process.

Employing such strategies can help you catch niggling little error cases early rather than via your customer base. There is nothing more embarrassing for a developer than to have a very trivial bug in their code reported to them from a customer.

D.2 When should I write my tests?

To call this question a source of great debate would be an understatement. In recent years a test driven approach to development has become very popular. Such an approach is known as test driven development, or more commonly by the acronym TDD.

One of the founding principles of TDD is to write the unit test first, watch it fail and then make it pass. The premise being that you only ever write enough code at any one time to satisfy the state based assertions made in a unit test. We have found this approach to provide a more structured intent to the implementation of algorithms. At any one stage you only have a single goal, to make the failing test pass. Because TDD makes you write the tests up front you never find yourself in a situation where you forget, or can't be bothered to write tests for your code. This is often the case when you write your tests after you have coded up your implementation. We, as the authors of this book, ourselves use TDD as our preferred method.

As we have already mentioned that TDD is our favoured approach to testing it would be somewhat of an injustice to not list, and describe, the mantra that is often associated with it:

Red: Signifies that the test has failed.

Green: The failing test now passes.

Refactor: Can we restructure our program so it makes more sense, and is easier to maintain?

The first point of the above list always occurs at least once (more if you count the build error) in TDD initially. Your task at this stage is solely to make the test pass, that is to make the respective test green. The last item is based around the restructuring of your program to make it as readable and maintainable as possible. The last point is very important as TDD is a progressive methodology to building a solution. If you adhere to progressive revisions of your algorithm, restructuring when appropriate, you will find that using TDD you can implement very cleanly structured types and so on.

D.3 How seriously should I view my test suite?

Your tests are a major part of your project ecosystem and so they should be treated with the same amount of respect as your production code. This ranges from correct, and clean code formatting, to the testing code being stored within a source control repository.

Employing a methodology like TDD, or testing after implementing, you will find that you spend a great amount of time writing tests and thus they should be treated no differently to your production code. All tests should be clearly named, and fully documented as to their intent.

D.4 The three A's

Now that you have a sense of the importance of your test suite you will inevitably want to know how to actually structure each block of imperatives within a single unit test. A popular approach - the three A's - is described in the following list:

Assemble: Create the objects you require in order to perform the state based assertions.

Act: Invoke the respective operations on the objects you have assembled to mutate the state to that desired for your assertions.

Assert: Specify what you expect to hold after the previous two steps.
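The three steps can be sketched with Python's built-in unittest module. The Counter class below is a hypothetical type under test that we have invented purely for this illustration; only the assemble, act, assert layout of the test method matters.

```python
import unittest


class Counter:
    """Hypothetical type under test, introduced only for this illustration."""

    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1


class CounterTest(unittest.TestCase):
    def test_increment_adds_one(self):
        # assemble
        counter = Counter()
        # act
        counter.increment()
        # assert
        self.assertEqual(counter.count, 1)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(CounterTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the three phases visibly separate makes it easy to see at a glance what a failing test was trying to establish.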

The following example shows a simple test method that employs the three A's:

public void MyTest()
{
    // assemble
    Type t = new Type();

    // act
    t.MethodA();

    // assert
    Assert.IsTrue(t.BoolExpr);
}

D.5 The structuring of tests

Structuring tests can be viewed as being the same as structuring production code, e.g. all unit tests for a Person type may be contained within a PersonTest type. Typically all tests are abstracted from production code. That is, the tests are disjoint from the production code; you may have two dynamic link libraries (dll), the first containing the production code, the second containing your test code.

We can also use things like inheritance etc when defining classes of tests. The point being that the test code is very much like your production code and you should apply the same amount of thought to its structure as you would do the production code.

D.6 Code Coverage

Something that you can get as a product of unit testing is code coverage statistics. Code coverage is merely an indicator as to the portions of production code that your unit tests cover. Using TDD it is likely that your code coverage will be very high, although it will vary depending on how easy it is to use TDD within your project.

D.7 Summary

Testing is key to the creation of a moderately stable product. Moreover unit testing can be used to create a safety blanket when adding and removing features, providing an early warning for breaking changes within your production code.

Appendix E

Symbol Definitions

Throughout the pseudocode listings you will find several symbols used. Table E.1 describes the meaning of each of those symbols.

Symbol        Description
←             Assignment.
=             Equality.
≤             Less than or equal to.
<             Less than.*
≥             Greater than or equal to.
>             Greater than.*
≠             Inequality.
∅             Null.
and           Logical and.
or            Logical or.
whitespace    Single occurrence of whitespace.
yield         Like return but builds a sequence.

* This symbol has a direct translation with the vast majority of imperative counterparts.

Table E.1: Pseudo symbol definitions