Time Series Forecasting with R A Hands On Guide 2nd Edition Galit
Visit to download the full and correct content document: https://textbookfull.com/product/practical-time-series-forecasting-with-r-a-hands-on-gu ide-2nd-edition-galit-shmueli/
More products digital (pdf, epub, mobi) instant download maybe you interests ...
Data Mining for Business Analytics Concepts Techniques and Applications with XLMiner 3rd Edition Galit Shmueli
https://textbookfull.com/product/data-mining-for-businessanalytics-concepts-techniques-and-applications-with-xlminer-3rdedition-galit-shmueli/
SAS for Forecasting Time Series John C. Brocklebank
https://textbookfull.com/product/sas-for-forecasting-time-seriesjohn-c-brocklebank/
Hands-on Time Series Analysis With Python: From Basics To Bleeding Edge Techniques B. V. Vishwas
https://textbookfull.com/product/hands-on-time-series-analysiswith-python-from-basics-to-bleeding-edge-techniques-b-v-vishwas/
Applied time series analysis with R Second Edition
Elliott
https://textbookfull.com/product/applied-time-series-analysiswith-r-second-edition-elliott/
Hands-On Entity Resolution: A Practical Guide to Data Matching With Python 1st Edition Michael Shearer
https://textbookfull.com/product/hands-on-entity-resolution-apractical-guide-to-data-matching-with-python-1st-edition-michaelshearer/
Nonlinear Time Series Analysis with R 1st Edition Ray Huffaker
https://textbookfull.com/product/nonlinear-time-series-analysiswith-r-1st-edition-ray-huffaker/
Design Patterns in C#: A Hands-on Guide with Real-world Examples 2nd Edition Vaskaran Sarcar
https://textbookfull.com/product/design-patterns-in-c-a-hands-onguide-with-real-world-examples-2nd-edition-vaskaran-sarcar/
Practical clinical microbiology and infectious diseases: a hands-on guide First Edition Gronthoud
https://textbookfull.com/product/practical-clinical-microbiologyand-infectious-diseases-a-hands-on-guide-first-edition-gronthoud/
Displaying Time Series Spatial and Space Time Data with R Second Edition Oscar Perpinan Lamigueiro
https://textbookfull.com/product/displaying-time-series-spatialand-space-time-data-with-r-second-edition-oscar-perpinanlamigueiro/
GALITSHMUELI KENNETHC.LICHTENDAHLJR. PRACTICAL TIMESERIES FORECASTING WITHR AHANDS-ONGUIDE SECONDEDITION AXELRODSCHNALLPUBLISHERS Copyright© 2016 GalitShmueli&KennethC.LichtendahlJr.
publishedbyaxelrodschnallpublishers
isbn-13: 978-0-9978479-1-8
isbn-10: 0-9978479-1-3
Coverart:PunakhaDzong,Bhutan.Copyright© 2016 BoazShmueli
ALLRIGHTSRESERVED.Nopartofthisworkmaybeusedorreproduced,transmitted, storedorusedinanyformorbyanymeansgraphic,electronic,ormechanical,includingbut notlimitedtophotocopying,recording,scanning,digitizing,taping,Webdistribution,informationnetworksorinformationstorageandretrievalsystems,orinanymannerwhatsoever withoutpriorwrittenpermission.
Forfurtherinformationsee www.forecastingbook.com
SecondEdition,July 2016
ToBoazShmueli,whomadetheproduction ofthePracticalAnalyticsbookseries areality
Preface Thepurposeofthistextbookistointroducethereadertoquantitativeforecastingoftimeseriesinapracticalandhands-on fashion.Mostpredictiveanalyticscoursesindatascienceand businessanalyticsprogramstouchverylightlyontimeseries forecasting,ifatall.Yet,forecastingisextremelypopularand usefulinpractice.
Fromourexperience,learningisbestachievedbydoing. Hence,thebookisdesignedtoachieveself-learninginthefollowingways:
• Thebookisrelativelyshortcomparedtoothertimeseries textbooks,toreducereadingtimeandincreasehands-ontime.
• Explanationsstrivetobeclearandstraightforwardwithmore emphasisonconceptsthanonstatisticaltheory.
• Chaptersincludeend-of-chapterproblems,ranginginfocus fromconceptualtohands-onexercises,withmanyrequiring runningsoftwareonrealdataandinterpretingtheoutputin lightofagivenproblem.
• Realdataisusedtoillustratethemethodsthroughoutthe book.
• Thebookemphasizesthe entireforecastingprocess ratherthan focusingonlyonparticularmodelsandalgorithms.
• Casesaregiveninthelastchapter,guidingthereaderthrough suggestedsteps,butallowingself-solution.Workingonthe caseshelpsintegratetheinformationandexperiencegained.
CoursePlan Thebookwasdesignedforaforecastingcourseatthegraduateorupper-undergraduatelevel.Itcanbetaughtinaminisemester(6-7 weeks)orasasemester-longcourse,usingthe casestointegratethelearningfromdifferentchapters.Asuggestedscheduleforatypicalcourseis:
Week 1 Chapters 1 ("ApproachingForecasting")and 2 ("Data") covergoaldefinition;datacollection,characterization,visualization,andpre-processing.
Week 2 Chapter 3 ("PerformanceEvaluation")coversdatapartitioning,naiveforecasts,measuringpredictiveaccuracyand uncertainty.
Weeks 3-4 Chapter 4 ("ForecastingMethods:Overview")describesandcomparesdifferentapproachesunderlyingforecastingmethods.Chapter 5 ("SmoothingMethods")coversmoving average,exponentialsmoothing,anddifferencing.
Weeks 5-6 Chapters 6 ("RegressionModels:TrendandSeasonality")and 7 ("RegressionModels:AutocorrelationandExternal Information")coverlinearregressionmodels,autoregressive (AR)andARIMAmodels,andmodelingexternalinformationas predictorsinaregressionmodel.
Week 7 Chapter 10 ("CommunicationandMaintenance")discussespracticalissuesofpresenting,reporting,documentingand monitoringforecasts.Thisweekisagoodpointforproviding feedbackonacaseanalysisfromChapter 11.
Week 8 (optional) Chapter 8 ("ForecastingBinaryOutcomes") expandsforecastingtobinaryoutcomes,andintroducesthe methodoflogisticregression.
Week 9 (optional) Chapter 9 ("NeuralNetworks")introduces neuralnetworksforforecastingbothcontinuousandbinary outcomes.
Weeks 10-12 (optional) Chapter 11 ("Cases")offersthreecases thatintegratethelearningandhighlightkeyforecastingpoints.
Ateamprojectishighlyrecommendedinsuchacourse,wherestudents workonarealorrealisticproblemusingrealdata.
SoftwareandData Thefreeandopen-sourcesoftware R (www.r-project.org)is usedthroughoutthebooktoillustratethedifferentmethods andprocedures.Thischoiceisgoodforstudentswhoarecomfortablewithsomecomputinglanguage,butdoesnotrequire priorknowledgewithR.WeprovidecodeforfiguresandoutputstohelpreaderseasilyreplicateourresultswhilelearningthebasicsofR.Inparticular,weusetheR forecastpackage, (robjhyndman.com/software/forecast)whichprovidescomputationallyefficientanduser-friendlyimplementationsofmany forecastingalgorithms.
Tocreateauser-friendlyenvironmentforusingR,download boththeRsoftwarefrom www.r-project.org andRStudiofrom www.rstudio.com.
Finally,weadvocateusinginteractivevisualizationsoftware forexploringthenatureofthedatabeforeattemptinganymodeling,especiallywhenmanyseriesareinvolved.TwosuchpackagesareTableau(www.tableausoftware.com)andTIBCOSpotfire (spotfire.tibco.com).Weillustratethepowerofthesepackages inChapter 1.
NewtotheSecondEdition
Basedonfeedbackfromreadersandinstructors,thiseditionhas twomainimprovements.Firstisanew-and-improvedstructuringofthetopics.Thisreorderingoftopicsisaimedatproviding aneasierintroductionofforecastingmethodswhichappearsto bemoreintuitivetostudents.Italsohelpsprioritizetopicstobe coveredinashortercourse,allowingoptionalcoverageoftopics inChapters 8-9.Therestructuringalsoalignsthisnewedition withtheXLMiner®-basededitionof PracticalTimeSeriesForecasting(3rdedition),offeringinstructorstheflexibilitytoteach
amixedcrowdofprogrammersandnon-programmers.The re-orderingincludes
• relocatingandcombiningthesectionsonautocorrelation,AR andARIMAmodels,andexternalinformationintoaseparate newchapter(Chapter 7).ThediscussionofARIMAmodels nowincludesequationsandfurtherdetailsonparametersand structure
• forecastingbinaryoutcomesisnowaseparatechapter(Chapter 8),introducingthecontextofbinaryoutcomes,performanceevaluation,andlogisticregression
• neuralnetworksarenowinaseparatechapter(Chapter 9) Thesecondupdateistheadditionandexpansionofseveral topics:
• predictionintervalsarenowincludedonallrelevantcharts andadiscussionofpredictionconeswasadded
• ThediscussionofexponentialsmoothingwithmultipleseasonalcyclesinChapter 5 hasbeenextended,withexamples usingRfunctions dshw and tbats
• Chapter 7 includestwonewexamples(bikesharingrentals andWalmartsales)usingRfunctions tslm and stlm toillustrateincorporatingexternalinformationintoalinearmodel andARIMAmodel.Additionally,theSTLapproachfordecomposingatimeseriesisintroducedandillustrated.
SupportingMaterials
Datasets,Rcode,andothersupportingmaterialsusedinthe bookareavailableat www.forecastingbook.com
Acknowledgments
ManythankstoProfessorRobHyndman(MonashUniversity) forinspiringthiseditionandprovidinginvaluableinformation abouttheR"forecast"package.ThankstoProfessorRaviBapna
andPeterBrucefortheirusefulfeedbackandsuggestions.Multiplereadershavesharedusefulcomments-wethankespecially KarlAraoforextensiveRcomments.SpecialthankstoNoa Shmueliforhermeticulousediting.KuberDeokarandShweta JadhavfromStatistics.comprovidedvaluablefeedbackonthe bookproblemsandsolutions.
1 ApproachingForecasting Inthisfirstchapter,welookatforecastingwithinthelarger contextofwhereitisimplementedandintroducethecomplete forecastingprocess.Wealsobrieflytouchuponthemainissues andapproachesthataredetailedinthebook.
1.1 Forecasting:Where? Timeseriesforecastingisperformedinnearlyeveryorganization thatworkswithquantifiabledata.Retailstoresforecastsales. Energycompaniesforecastreserves,production,demand,and prices.Educationalinstitutionsforecastenrollment.Governmentsforecasttaxreceiptsandspending.Internationalfinancial organizationssuchastheWorldBankandInternationalMonetaryFundforecastinflationandeconomicactivity.Passenger transportcompaniesusetimeseriestoforecastfuturetravel. Banksandlendinginstitutionsforecastnewhomepurchases, andventurecapitalfirmsforecastmarketpotentialtoevaluate businessplans.
1.2 BasicNotation Theamountofnotationinthebookiskepttothenecessaryminimum.Letusintroducethebasicnotationusedinthebook.In particular,weusefourtypesofsymbolstodenotetimeperiods, dataseries,forecasts,andforecasterrors:
t = 1,2,3,...Anindexdenotingthetimeperiodofinterest. t = 1isthefirstperiodinaseries.
y1, y2, y3,..., yn
Ft
Aseriesof n valuesmeasuredover n timeperiods, where yt denotesthevalueoftheseriesattimeperiod t. Forexample,foraseriesofdailyaveragetemperatures, t = 1,2,3,...denotesday 1,day 2,andday 3; y1, y2,and y3 denotethetemperaturesondays 1,2,and 3.
Theforecastedvaluefortimeperiod t.
Ft+k The k-step-aheadforecastwhentheforecastingtimeis t Ifwearecurrentlyattimeperiod t,theforecastforthe nexttimeperiod(t + 1)isdenoted Ft+1.
et Theforecasterrorfortimeperiod t,whichisthe differencebetweentheactualvalueandtheforecast attime t,andequalto yt Ft (seeChapter 3).
1.3 TheForecastingProcess Asinalldataanalysis,theprocessofforecastingbeginswith goal definition.Dataisthencollectedandcleaned,andexploredusing visualizationtools.Asetofpotentialforecastingmethodsis selected,basedonthenatureofthedata.Thedifferentmethods areappliedandcomparedintermsofforecastaccuracyand othermeasuresrelatedtothegoal.The"best"methodisthen chosenandusedtogenerateforecasts.
Ofcourse,theprocessdoesnotendonceforecastsaregenerated,becauseforecastingistypicallyanongoinggoal.Hence, forecastaccuracyismonitoredandsometimestheforecasting methodisadaptedorchangedtoaccommodatechangesinthe goalorthedataovertime.Adiagramoftheforecastingprocess isshowninFigure 1.1.
Notethetwosetsofarrows,indicatingthatpartsoftheprocessareiterative.Forinstance,oncetheseriesisexploredone mightdeterminethattheseriesathandcannotachievetherequiredgoal,leadingtothecollectionofneworsupplementary data.Anotheriterativeprocesstakesplacewhenapplyingaforecastingmethodandevaluatingitsperformance.Theevaluation oftenleadstotweakingoradaptingthemethod,oreventrying outothermethods.
Figure 1.1:Diagramofthe forecastingprocess
Giventhesequenceofstepsintheforecastingprocessandthe iterativenatureofmodelingandevaluatingperformance,the bookisorganizedaccordingtothefollowinglogic:Inthischapterweconsiderthecontext-relatedgoaldefinitionstep.Chapter 2 discussesthestepsofdatacollection,exploration,andpreprocessing.NextcomesChapter 3 onperformanceevaluation. Theperformanceevaluationchapterprecedestheforecasting methodchaptersfortworeasons:
1 Understandinghowperformanceisevaluatedaffectsthe choiceofforecastingmethod,aswellastheparticulardetails ofhowaspecificforecastingmethodisexecuted.Withineach oftheforecastingmethodchapters,weinfactrefertoevaluationmetricsandcomparedifferentconfigurationsusingsuch metrics.
2 Acrucialinitialstepforallowingtheevaluationofpredictive performanceisdatapartitioning.Thismeansthattheforecastingmethodisappliedonlytoasubsetoftheseries.Itis thereforeimportanttounderstandwhyandhowpartitioning iscarriedoutbeforeapplyinganyforecastingmethod.
Theforecastingmethodschapters(Chapters 5-9)arefollowed byChapter 10 ("CommunicationandMaintenance"),whichdiscussesthelaststepofimplementingtheforecastsorforecasting systemwithintheorganization.
Beforecontinuing,letuspresentanexamplethatwillbeused throughoutthebookforillustrativepurposes.
IllustrativeExample:RidershiponAmtrakTrains
Amtrak,aU.S.railwaycompany,routinelycollectsdataonridership.OurillustrationisbasedontheseriesofmonthlyAmtrak ridershipbetweenJanuary 1991 andMarch 2004 intheUnited States.
Thedataispubliclyavailableat www.forecastingprinciples. com (clickon Data,andselectSeriesM-34 fromtheT-Competition Data)aswellasonthebookwebsite.
(Imagebygraurcodrin/ FreeDigitalPhotos.net)
GoalDefinition Determiningandclearlydefiningtheforecastinggoalisessential forarrivingatusefulresults.Unliketypicalforecastingcompetitions1,whereasetofdatawithabriefstoryandagivensetof
performancemetricsareprovided,inreallifeneitherofthese componentsarestraightforwardorreadilyavailable.Onemust firstdeterminethepurposeofgeneratingforecasts,thetypeof forecaststhatareneeded,howtheforecastswillbeusedbythe organization,whatarethecostsassociatedwithforecasterrors, whatdatawillbeavailableinthefuture,andmore.
Itisalsocriticaltounderstandthe implications oftheforecasts todifferentstakeholders.Forexample,TheNationalAgricultural StatisticsService(NASS)oftheUnitedStatesDepartmentof Agriculture(USDA)producesforecastsfordifferentcropyields. Theseforecastshaveimportantimplications:
[...]somemarketparticipantscontinuetoexpressthebeliefthat theUSDAhasahiddenagendaassociatedwithproducingtheestimatesandforecasts[forcornandsoybeanyield].This"agenda" centersonpricemanipulationforavarietyofpurposes,includingsuchthingsasmanagingfarmprogramcostsandinfluencing foodprices.LackofunderstandingofNASSmethodologyand/or thebeliefinahiddenagendacanpreventmarketparticipants fromcorrectlyinterpretingandutilizingtheacreageandyield forecasts.2
Inthefollowingweelaborateonseveralimportantissuesthat mustbeconsideredatthegoaldefinitionstage.Theseissuesaffecteverystepintheforecastingprocess,fromdatacollection throughdataexploration,preprocessing,modelingandperformanceevaluation.
Descriptivevs.PredictiveGoals 1 Foralistofpopularforecastingcompetitionssee the"DataResourcesand Competitions"pagesatthe endofthebook
2 Fromfarmdocdailyblog, postedMarch 23, 2011, www. farmdocdaily.illinois. edu/2011/03/post.html; accessedDec 5, 2011
Aswithcross-sectionaldata3,modelingtimeseriesdataisdone 3 Cross-sectionaldatais asetofmeasurements takenatonepointintime. Incontrast,atimeseries consistsofonemeasurement overtime.
foreitherdescriptiveorpredictivepurposes.Indescriptivemodeling,or timeseriesanalysis,atimeseriesismodeledtodetermine itscomponentsintermsofseasonalpatterns,trends,relationto externalfactors,andthelike.Thesecanthenbeusedfordecision makingandpolicyformulation.Incontrast, timeseriesforecasting
usestheinformationinatimeseries(perhapswithadditionalinformation)toforecastfuturevaluesofthatseries.Thedifference betweendescriptiveandpredictivegoalsleadstodifferencesin thetypesofmethodsusedandinthemodelingprocessitself. Forexample,inselectingamethodfordescribingatimeseries orevenforexplainingitspatterns,priorityisgiventomethods thatproduceexplainableresults(ratherthanblack-boxmethods) andtomodelsbasedoncausalarguments.Furthermore,descriptioncanbedoneinretrospect,whilepredictionisprospective innature.Thismeansthatdescriptivemodelscanuse"future" information(e.g.,averagingthevaluesofyesterday,today,and tomorrowtoobtainasmoothrepresentationoftoday’svalue) whereasforecastingmodelscannotusefutureinformation.Finally,apredictivemodelisjudgedbyitspredictiveaccuracy ratherthanbyitsabilitytoprovidecorrectcausalexplanations.
ConsidertheAmtrakridershipexampledescribedatthebeginningofthischapter.Differentanalysisgoalscanbespecified, eachleadingtoadifferentpathintermsofmodeling,performanceevaluation,andimplementation.Onepossibleanalysis goalthatAmtrakmighthaveistoforecastfuturemonthlyridershiponitstrainsforpurposesofpricing.Usingdemanddata todeterminepricingiscalled"revenuemanagement"andisa popularpracticebyairlinesandhotelchains.Clearly,thisisa predictivegoal.
AdifferentgoalforwhichAmtrakmightwanttousetheridershipdataisforimpactassessment:evaluatingtheeffectof someevent,suchasairportclosuresduetoinclementweather,or theopeningofanewlargenationalhighway.Thisgoalisretrospectiveinnature,andisthereforedescriptiveorevenexplanatory.Analysiswouldcomparetheseriesbeforeandafterthe event,withnodirectinterestinfuturevaluesoftheseries.Note thatthesegoalsarealsogeography-specificandwouldtherefore requireusingridershipdataatafinerlevelofgeographywithin theUnitedStates.
AthirdgoalthatAmtrakmightpursueisidentifyingand quantifyingdemandduringdifferentseasonsforplanningthe numberandfrequencyoftrainsneededduringdifferentseasons. Ifthemodelisonlyaimedatproducingmonthlyindexesof
demand,thenitisadescriptivegoal.Incontrast,ifthemodel willbeusedtoforecastseasonaldemandforfutureyears,thenit isapredictivetask.
Finally,theAmtrakridershipdatamightbeusedbynational agencies,suchastheBureauofTransportationStatistics,toevaluatethetrendsintransportationmodesovertheyears.Whether thisisadescriptiveorpredictivegoaldependsonwhattheanalysiswillbeusedfor.Ifitisforthepurposesofreportingpast trends,thenitisdescriptive.Ifthepurposeisforecastingfuture trends,thenitisapredictivegoal.
Thefocusinthisbookison timeseriesforecasting,wherethe goalistopredictfuturevaluesofatimeseries.Someofthe methodspresented,however,canalsobeusedfordescriptive purposes.4
ForecastHorizonandForecastUpdating Howfarintothefutureshouldweforecast?Mustwegenerate allforecastsatasingletimepoint,orcanforecastsbegenerated onanongoingbasis?Theseareimportantquestionstobeansweredatthegoaldefinitionstage.Bothquestionsdependon howtheforecastswillbeusedinpracticeandthereforerequire closecollaborationwiththeforecaststakeholdersintheorganization.Theforecasthorizon k isthenumberofperiodsahead thatwemustforecast,and Ft+k isa k-step-aheadforecast.Inthe Amtrakridershipexample,one-month-aheadforecasts(Ft+1) mightbesufficientforrevenuemanagement(forcreatingflexible pricing),whereaslongertermforecasts,suchasthree-monthahead(Ft+3),aremorelikelytobeneededforschedulingand procurementpurposes.
Howrecentarethedataavailableatthetimeofprediction? Timelinessofdatacollectionandtransferdirectlyaffecttheforecasthorizon:Forecastingnextmonth’sridershipismuchharder ifwedonotyethavedataforthelasttwomonths.Itmeansthat wemustgenerateforecastsoftheform Ft+3 ratherthan Ft+1. Whetherimprovingtimelinessofdatacollectionandtransferis possibleornot,itsimplicationonforecastingmustberecognized atthegoaldefinitionstage.
4 Moststatisticaltimeseries booksfocusondescriptive timeseriesanalysis.Agood introductionisthebook. C.Chatfield. TheAnalysisof TimeSeries:AnIntroduction. Chapman&Hall/CRC, 6th edition, 2003
Whilelong-termforecastingisoftenanecessity,itisimportanttohaverealisticexpectationsregardingforecastaccuracy: thefurtherintothefuture,themorelikelythattheforecasting contextwillchangeandthereforeuncertaintyincreases.Insuch cases,expectedchangesintheforecastingcontextshouldbeincorporatedintotheforecastingmodel,andthemodelshouldbe examinedperiodicallytoassureitssuitabilityforthechanged contextandifpossible,updated.
Evenwhenlong-termforecastsarerequired,itissometimes usefultoprovideperiodicupdatedforecastsbyincorporating newaccumulatedinformation.Forexample,athree-monthaheadforecastforApril 2012,whichisgeneratedinJanuary 2012,mightbeupdatedinFebruaryandagaininMarchofthe sameyear.Suchrefreshingoftheforecastsbasedonnewdatais called roll-forwardforecasting.
Alltheseaspectsoftheforecasthorizonhaveimplicationson therequiredlengthoftheseriesforbuildingtheforecastmodel, onfrequencyandtimelinessofcollection,ontheforecasting methodsused,onperformanceevaluation,andontheuncertaintylevelsoftheforecasts.
ForecastUse Howwilltheforecastsbeused?Understandinghowtheforecastswillbeused,perhapsbydifferentstakeholders,iscriticalforgeneratingforecastsoftherighttypeandwithausefulaccuracylevel.Shouldforecastsbenumericalorbinary ("event"/"non-event")?Doesover-predictioncostmoreorless thanunder-prediction?Willtheforecastsbeuseddirectlyorwill theybe"adjusted"insomewaybeforeuse?Willtheforecasts andforecastingmethodtobepresentedtomanagementortothe technicaldepartment?Answerstosuchquestionsarenecessary forchoosingappropriatedata,methods,andevaluationschemes.
LevelofAutomation Thelevelofrequiredautomationdependsonthenatureofthe forecastingtaskandonhowforecastswillbeusedinpractice. Someimportantquestionstoaskare:
1 Howmanyseriesneedtobeforecasted?
2 Istheforecastinganongoingprocessoraonetimeevent?
3. Whichdataandsoftwarewillbeavailableduringtheforecastingperiod?
4. Whatforecastingexpertisewillbeavailableattheorganizationduringtheforecastingperiod?
Differentanswerswillleadtodifferentchoicesofdata,forecastingmethods,andevaluationschemes.Hence,thesequestions mustbeconsideredalreadyatthegoaldefinitionstage.
Inscenarioswheremanyseriesaretobeforecastedonan ongoingbasis,andnotmuchforecastingexpertisecanbeallocatedtotheprocess,anautomatedsolutioncanbeadvantageous.AclassicexampleisforecastingPointofSale(POS)data forpurposesofinventorycontrolacrossmanystores.Various consultingfirmsofferautomatedforecastingsystemsforsuch applications.
1.5 Problems ImpactofSeptember 11 onAirTravelintheUnitedStates: TheResearchandInnovativeTechnologyAdministration’sBureauof TransportationStatistics(BTS)conductedastudytoevaluate theimpactoftheSeptember 11, 2001,terroristattackonU.S. transportation.Thestudyreportandthedatacanbefoundat www.bts.gov/publications/estimated impacts of 9 11 on us travel.Thegoalofthestudywasstatedasfollows:
Thepurposeofthisstudyistoprovideagreaterunderstanding ofthepassengertravelbehaviorpatternsofpersonsmakinglong distancetripsbeforeandafterSeptember 11.
Thereportanalyzesmonthlypassengermovementdatabetween January 1990 andApril 2004.Dataonthreemonthlytimeseries aregiveninthefile Sept11Travel.xls forthisperiod:(1)actual airlinerevenuepassengermiles(Air),(2)railpassengermiles (Rail),and(3)vehiclemilestraveled(Auto).
InordertoassesstheimpactofSeptember 11,BTStookthe followingapproach:UsingdatabeforeSeptember 11,itforecastedfuturedata(undertheassumptionofnoterroristattack). Then,BTScomparedtheforecastedserieswiththeactualdatato assesstheimpactoftheevent.
1 Isthegoalofthisstudydescriptiveorpredictive?
2 Whatistheforecasthorizontoconsiderinthistask?Are next-monthforecastssufficient?
3. Whatlevelofautomationdoesthisforecastingtaskrequire? Considerthefourquestionsrelatedtoautomation.
4 Whatisthemeaningof t = 1,2,3intheAirseries?Which timeperioddoes t = 1referto?
5. Whatarethevaluesfor y1, y2,and y3 intheAirseries?
(Imagebyafrica/FreeDigitalPhotos.net)
TimeSeriesData 2 .1 DataCollection
Whenconsideringwhichdatatouseforgeneratingforecasts,the forecastinggoalandthevariousaspectsdiscussedinChapter 1 mustbetakenintoaccount.Therearealsoconsiderationsatthe datalevelwhichcanaffecttheforecastingresults.Severalsuch issueswillbeexaminednext.
DataQuality Thequalityofourdataintermsofmeasurementaccuracy,missingvalues,corrupteddata,anddataentryerrorscangreatly affecttheforecastingresults.Dataqualityisespeciallyimportantintimeseriesforecasting,wherethesamplesizeissmall (typicallynotmorethanafewhundredvaluesinaseries).
Iftherearemultiplesourcescollectingorhostingthedataof interest(e.g.,differentdepartmentsinanorganization),itcan beusefultocomparethequalityandchoosethebestdata(or evencombinethesources).However,itisimportanttokeepin mindthatforongoingforecasting,datacollectionisnotaonetimeeffort.Additionaldatawillneedtobecollectedagainin futurefromthesamesource.Moreover,ifforecastedvalueswill becomparedagainstaparticularseriesofactualvalues,then thatseriesmustplayamajorroleintheperformanceevaluation step.Forexample,ifforecasteddailytemperatureswillbecomparedagainstmeasurementsfromaparticularweatherstation,
forecastsbasedonmeasurementsfromothersourcesshouldbe comparedtothedatafromthatweatherstation.
Insomecasestheseriesofinterestaloneissufficientforarrivingatsatisfactoryforecasts,whileinothercasesexternaldata mightbemorepredictiveoftheseriesofinterestthansolelyits history.Inthecaseofexternaldata,itiscrucialtoassurethatthe samedatawillbeavailableatthetimeofprediction.
TemporalFrequency Withtoday’stechnology,manytimeseriesarerecordedonvery frequenttimescales.Stocktickerdataisavailableonaminuteby-minutelevel.Purchasesatonlineandbrick-and-mortarstores arerecordedinrealtime.However,althoughdatamightbe availableataveryfrequentscale,forthepurposeofforecasting itisnotalwayspreferabletousethisfrequency.Inconsideringthechoiceoftemporalfrequency,onemustconsiderthe frequencyoftherequiredforecasts(thegoal)andthelevelof noise1 inthedata.Forexample,ifthegoalistoforecastnext-day 1 "Noise"referstovariability intheseries’valuesthat isnottoaccountfor.See Section 2.2.
salesatagrocerystore,minute-by-minutesalesdataislikely lessusefulthandailyaggregates.Theminute-by-minuteseries willcontainmanysourcesofnoise(e.g.,variationbypeakand nonpeakshoppinghours)thatdegradeitsdaily-levelforecasting power,yetwhenthedataisaggregatedtoacoarserlevel,these noisesourcesarelikelytocancelout.
Evenwhenforecastsareneededonaparticularfrequency (suchasdaily)itissometimesadvantageoustoaggregatethe seriestoalowerfrequency(suchasweekly),andmodelthe aggregatedseriestoproduceforecasts.Theaggregatedforecasts canthenbedisaggregatedtoproducehigher-frequencyforecasts. Forexample,thetopperformersinthe 2008 NN5 TimeSeries ForecastingCompetition2 describetheirapproachforforecasting
dailycashwithdrawalamountsatATMmachines:
Tosimplifytheforecastingproblem,weperformedatimeaggregationsteptoconvertthetimeseriesfromdailytoweekly.... Oncetheforecasthasbeenproduced,weconverttheweekly forecasttoadailyonebyasimplelinearinterpolationscheme.
2 R.R.Andrawis,A.F. Atiya,andH.El-Shishiny. Forecastcombinationsof computationalintelligence andlinearmodelsforthe NN5 timeseriesforecasting competition. International JournalofForecasting, 27:672–688, 2011
SeriesGranularity "Granularity"referstothecoverageofthedata.Thiscanbein termsofgeographicalarea,population,timeofoperation,etc. IntheAmtrakridershipexample,dependingonthegoal,we couldlookatgeographicalcoverage(route-level,state-level,etc.), ataparticulartypeofpopulation(e.g.,seniorcitizens),and/or atparticulartimes(e.g.,duringrushhour).Inallthesecases, theresultingtimeseriesarebasedonasmallerpopulationof interestthanthenationalleveltimeseries.
Aswithtemporalfrequency,thelevelofgranularitymustbe alignedwiththeforecastinggoal,whileconsideringthelevelsof noise.Veryfinecoveragemightleadtolowcounts,perhapseven tomanyzerocounts.Exploringdifferentaggregationandslicing levelsisoftenneededforobtainingadequateseries.Thelevel ofgranularitywilleventuallyaffectthechoiceofpreprocessing, forecastingmethod(s),andevaluationmetrics.Forexample,if weareinterestedindailytrainridershipofseniorcitizenson aparticularroute,andtheresultingseriescontainsmanyzero counts,wemightresorttomethodsforforecastingbinarydata ratherthannumericdata(seeChapter 8).
DomainExpertise Whilethefocusinthissectionisonthequantitativedatatobe usedforforecasting,acriticaladditionalsourceofinformation isdomainexpertise.Withoutdomainexpertise,theprocessof creatingaforecastingmodelandevaluatingitsusefulnessmight nottoachieveitsgoal.
Domainexpertiseisneededfordeterminingwhichdatato use(e.g.,dailyvs.hourly,howfarback,andfromwhichsource), describingandinterpretingpatternsthatappearinthedata, fromseasonalpatternstoextremevaluesanddrasticchanges (e.g.,clarifyingwhatare"afterhours",interpretingmassiveabsencesduetotheorganization’sannualpicnic,andexplaining thedrasticchangeintrendduetoanewcompanypolicy).
Domainexpertiseisalsousedforhelpingevaluatethepracticalimplicationsoftheforecastingperformance.Aswediscussin Chapter 10,implementingtheforecastsandtheforecastingsys-
temrequirescloselinkagewiththeorganizationalgoals.Hence, theabilitytocommunicatewithdomainexpertsduringtheforecastingprocessiscrucialforproducingusefulresults,especially whentheforecastingtaskisoutsourcedtoaconsultingfirm.
2.2 TimeSeriesComponents Forthepurposeofchoosingadequateforecastingmethods,it isusefultodissectatimeseriesintoasystematicpartanda non-systematicpart.Thesystematicpartistypicallydivided intothreecomponents: level, trend,and seasonality.Thenonsystematicpartiscalled noise.Thesystematiccomponentsare assumedtobeunobservable,astheycharacterizetheunderlyingseries,whichweonlyobservewithaddednoise. Level describestheaveragevalueoftheseries, trend isthechangein theseriesfromoneperiodtothenext,and seasonality describes ashort-termcyclicalbehaviorthatcanbeobservedseveraltimes withinthegivenseries.Whilesomeseriesdonotcontaintrend orseasonality,allserieshavealevel.Lastly, noise istherandom variationthatresultsfrommeasurementerrororothercauses thatarenotaccountedfor.Itisalwayspresentinatimeseriesto somedegree,althoughwecannotobserveitdirectly.
Thedifferentcomponentsarecommonlyconsideredtobe either additive or multiplicative.Atimeserieswith additive componentscanbewrittenas:
Atimeserieswith multiplicative componentscanbewrittenas:
Forecastingmethodsattempttoisolatethesystematicpart andquantifythenoiselevel.Thesystematicpartisusedfor generatingpointforecastsandthelevelofnoisehelpsassessthe uncertaintyassociatedwiththepointforecasts.
Trendpatternsarecommonlyapproximatedbylinear,exponentialandothermathematicalfunctions.Illustrationsof differenttrendpatternscanbeseenbycomparingthedifferent
rowsinFigure 2 1.Forseasonalpatterns,twocommonapproximationsare additiveseasonality (wherevaluesindifferentseasons varybyaconstantamount)and multiplicativeseasonality (where valuesindifferentseasonsvarybyapercentage).Illustrations oftheseseasonalitypatternsareshowninthesecondandthird columnsofFigure 2.1.Chapter 6 discussesthesedifferentpatternsinfurtherdetail,andalsointroducesanothersystematic component,whichisthecorrelationbetweenneighboringvalues inaseries.
Figure 2 1:Illustrationsof commontrendandseasonalitypatterns.Reproduced withpermissionfrom Dr.JimFlower’swebsite jcflowers1.iweb.bsu.edu/ rlo/trends.htm
Another random document with no related content on Scribd:
Kabul 2½ ” 5 ”
Under Abdur Rahman some little relief from the oppressive and arbitrary payments, which were extorted alike from the unfortunate merchant and the luckless cultivator, was secured; and, as he instilled a measure of reform into the practices of government, certain sources of taxation were dropped and the burden resting upon industry and agriculture proportionately lightened. The principal means of income to the State now emanated from taxes which were levied upon cultivated lands and fruit-trees, export and import trade, customs, registration and postage fees (contracts, passport fees, marriage settlements, etc.), penalties under law, revenue from Government lands and shops, Government monopolies and manufactures, mines and minerals (salt, rubies, gold, lapis lazuli, coal) and the annual subsidy of eighteen lakhs of rupees—these several branches of the State revenue gradually defining the limits of its present prosperity, which has been somewhat further assisted by the benevolent, economic policy of the present Amir. Abuses in the collection of octroi have been remedied, certain taxes abolished, many mines developed, while to give an impetus to trade in Afghanistan, Habib Ullah has announced that, in future, traders may receive advances from the Kabul Treasury on proper security. This concession is greatly appreciated by the commercial community, as it will enable them to escape the payment of interest to the Hindu bankers from whom they have been in the habit of borrowing. Moreover, it is expected that if full effect is given to the Amir’s wishes trade between India and Afghanistan will soon improve. The loans will be repayable by easy instalments, this novel scheme establishing a very important departure.