USING STATISTICS IN THE SOCIAL AND HEALTH SCIENCES

WITH SPSS® AND EXCEL®

Between and Within Research Designs

Using Different T Tests

Independent T Test: The Procedure

Creating the Sampling Distribution of Differences

The Nature of the Sampling Distribution of Differences

Calculating the Estimated Standard Error of Difference with Equal Sample Size

Using Unequal Sample Sizes

The Independent T Ratio

Independent T Test Example

Hypothesis Test Elements for the Example

Before–After Convention with the Independent T Test

Confidence Intervals for the Independent T Test

Effect Size

The Assumptions for the Independent T Test

SPSS® Explore for Checking the Normal Distribution Assumption

Excel Procedures for Checking the Equal Variance Assumption

SPSS® Procedure for Checking the Equal Variance Assumption

Using SPSS® and Excel with the Independent T Test

SPSS® Procedures for the Independent T Test

Excel Procedures for the Independent T Test

Effect Size for the Independent T Test Example

Parting Comments

Nonparametric Statistics: The Mann–Whitney U Test

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

Graphics in the Data Summary

9 ANALYSIS OF VARIANCE

A Hypothetical Example of ANOVA

The Nature of ANOVA

The Components of Variance

The Process of ANOVA

Calculating ANOVA

Effect Size

Post Hoc Analyses

Assumptions of ANOVA

Additional Considerations with ANOVA

The Hypothesis Test: Interpreting ANOVA Results

Are the Assumptions Met?

Using SPSS® and Excel with One-Way ANOVA

The Need for Diagnostics

Non-Parametric ANOVA Tests: The Kruskal–Wallis Test

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

10 FACTORIAL ANOVA

Extensions of ANOVA

ANCOVA

MANOVA

MANCOVA

Factorial ANOVA

Interaction Effects

Simple Effects

2XANOVA: An Example

Calculating Factorial ANOVA

The Hypotheses Test: Interpreting Factorial ANOVA Results

Effect Size for 2XANOVA: Partial η²

Discussing the Results

Using SPSS® to Analyze 2XANOVA

Summary Chart for 2XANOVA Procedures

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

11 CORRELATION

The Nature of Correlation

The Correlation Design

Pearson’s Correlation Coefficient

Plotting the Correlation: The Scattergram

Using SPSS® to Create Scattergrams

Using Excel to Create Scattergrams

Calculating Pearson’s r

The Z Score Method

The Computation Method

The Hypothesis Test for Pearson’s r

Effect Size: the Coefficient of Determination

Diagnostics: Correlation Problems

Correlation Using SPSS® and Excel

Nonparametric Statistics: Spearman’s Rank Order Correlation (rs)

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

12 BIVARIATE REGRESSION

The Nature of Regression

The Regression Line

Calculating Regression

Effect Size of Regression

The Z Score Formula for Regression

Testing the Regression Hypotheses

The Standard Error of Estimate

Confidence Interval

Explaining Variance Through Regression

A Numerical Example of Partitioning the Variation

Using Excel and SPSS® with Bivariate Regression

The SPSS® Regression Output

The Excel Regression Output

Complete Example of Bivariate Linear Regression

Assumptions of Bivariate Regression

The Omnibus Test Results

Effect Size

The Model Summary

The Regression Equation and Individual Predictor Test of Significance

Advanced Regression Procedures

Detecting Problems in Bivariate Linear Regression

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

13 INTRODUCTION TO MULTIPLE LINEAR REGRESSION

The Elements of Multiple Linear Regression

Same Process as Bivariate Regression

Some Differences between Bivariate Linear Regression and Multiple Linear Regression

Stuff not Covered

Assumptions of Multiple Linear Regression

Analyzing Residuals to Check MLR Assumptions

Diagnostics for MLR: Cleaning and Checking Data

Extreme Scores

Distance Statistics

Influence Statistics

MLR Extended Example Data

Assumptions Met?

Analyzing Residuals: Are Assumptions Met?

Interpreting the SPSS® Findings for MLR

Entering Predictors Together as a Block

Entering Predictors Separately

Additional Entry Methods for MLR Analyses

Example Study Conclusion

Terms and Concepts

Data Lab and Example (with Solution)

Data Lab: Solution

14 CHI-SQUARE AND CONTINGENCY TABLE ANALYSIS

Contingency Tables

The Chi-square Procedure and Research Design

Chi-square Design One: Goodness of Fit

A Hypothetical Example: Goodness of Fit

Effect Size: Goodness of Fit

Chi-square Design Two: The Test of Independence

A Hypothetical Example: Test of Independence

Special 2 × 2 Chi-square

Effect Size in 2 × 2 Tables: PHI

Cramer’s V: Effect Size for the Chi-square Test of Independence

Repeated Measures Chi-square: McNemar Test

Using SPSS® and Excel with Chi-square

Using SPSS® for the Chi-square Test of Independence

Using Excel for Chi-square Analyses

Terms and Concepts

Data Lab and Examples (with Solutions)

Data Lab: Solutions

15 REPEATED MEASURES PROCEDURES: T_dep AND ANOVA_WS

Independent and Dependent Samples in Research Designs

Using Different T Tests

The Dependent T Test Calculation: The “Long” Formula

Example: The Long Formula

The Dependent T Test Calculation: The “Difference” Formula

T_dep and Power

Conducting The T_dep Analysis Using SPSS®

Conducting The T_dep Analysis Using Excel

Within-Subject ANOVA (ANOVA_WS)

Experimental Designs

Post Facto Designs

Within-Subject Example

Using SPSS® for Within-Subject Data

The SPSS® Procedure

The SPSS® Output

Nonparametric Statistics

Terms and Concepts

APPENDICES

Appendix A SPSS® BASICS

Using SPSS®

General Features

Management Functions

Additional Management Functions

Appendix B EXCEL BASICS

Data Management

The Excel Menus

Using Statistical Functions

Data Analysis Procedures

Missing Values and “0” Values in Excel Analyses

Using Excel with “Real Data”

Appendix C STATISTICAL TABLES

Table C.1: Z-Score Table (Values Shown are Percentages – %)

Table C.2: Exclusion Values for the T-Distribution

Table C.3: Critical (Exclusion) Values for the Distribution of F

Table C.4: Tukey’s Range Test (Upper 5% Points)

Table C.5: Critical (Exclusion) Values for Pearson’s Correlation Coefficient, r

Table C.6: Critical Values of the χ² (Chi-Square) Distribution

REFERENCES

Index

PREFACE

The study of statistics is gaining recognition in a great many fields. In particular, researchers in the social and health sciences note its importance for problem solving and its practical importance in their areas. Statistics has always been important, for example, among those hoping to enter careers in medicine but more so now due to the increasing emphasis on “Scientific Inquiry & Reasoning Skills” as preparation for the Medical College Admission Test (MCAT). Sociology, always relying on statistics and research for its core emphases, is now included in the MCAT as well.

This book focuses squarely on the procedures important to an essential understanding of statistics and how it is used in the real world for problem solving. Moreover, my discussion in the book repeatedly ties statistical methodology with research design (see the “companion” volume my colleague and I wrote to emphasize research and design skills in social science; Abbott and McKinney, 2013).

I emphasize applied statistical analyses and as such will use examples throughout the book drawn from my own research as well as from national databases like the GSS and the Behavioral Risk Factor Surveillance System (BRFSS). Using data from these sources allows students the opportunity to see how statistical procedures apply to research in their fields as well as to examine “real data.” A central feature of the book is my discussion and use of SPSS® and Microsoft Excel® to analyze data for problem solving.

Throughout my teaching and research career, I have developed an approach to helping students understand difficult statistical concepts in a new way. I find that the great majority of students are visual learners, so I developed diagrams and figures over the years that help create a conceptual picture of the statistical procedures that are often problematic to students (like sampling distributions!).

Another reason for writing this book was to give students a way to understand statistical computing without having to rely on comprehensive and expensive statistical software programs. Since most students have access to Microsoft Excel, I developed a step-by-step approach to using the powerful statistical procedures in Excel to analyze data and conduct research in each of the statistical topics I cover in the book.¹

I also wanted to make those comprehensive statistical programs more approachable to statistics students, so I have also included a “hands-on” guide to SPSS in parallel with the Excel examples. In some cases, SPSS has the only means to perform some statistical procedures, but in most cases, both Excel and SPSS can be used.

Here are some of the features of the book:

1. Emphasis on the interpretation of findings.

2. Use of clear examples from my existing and former research projects and large databases to illustrate statistical procedures. “Real-world” data can be cumbersome, so I introduce straightforward procedures and examples in order to help students focus more on interpretation of findings.

3. Inclusion of a data lab section in each chapter that provides relevant, clear examples.

4. Introduction to advanced statistical procedures in chapter sections (e.g., regression diagnostics) and separate chapters (e.g., multiple linear regression) for greater relevance to real-world research needs.

5. Strengthening of the connection between statistical application and research designs.

6. Inclusion of detailed sections in each chapter explaining applications from Excel and SPSS.

I use SPSS² (versions 22 and 23) screenshots of menus and tables by permission from the IBM® Company. IBM, the IBM logo, ibm.com, and SPSS are trademarks or registered trademarks of International Business Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at “IBM Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml. Microsoft Excel references and screenshots in this book are used with permission from Microsoft. I use Microsoft Excel® 2013 in this book.³

¹ One limitation to teaching statistics procedures with Excel is that the data analysis features are different depending on whether the user is a “Mac” user or a “PC” user. I am using the PC version, which features a “Data Analysis” suite of statistical tools. This feature may no longer be included in the Mac version of Excel.

² SPSS screen reprints throughout the book are used courtesy of International Business Machines Corporation, © International Business Machines Corporation. SPSS was acquired by IBM in October 2009.

I use GSS (2014) data and codebook for examples in this book.⁴ The BRFSS Survey Questionnaire and Data are used with permission from the CDC.⁵

³ Excel references and screenshots in this book are used with permission from Microsoft®.

⁴ Smith, Tom W., Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2012 [machine-readable data file] / Principal Investigator, Tom W. Smith; Coprincipal Investigator, Peter V. Marsden; Coprincipal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: National Opinion Research Center [producer]; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013. 1 data file (57,061 logical records) + 1 codebook (3432 pp.). (National Data Program for the Social Sciences, No. 21).

⁵ Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Questionnaire. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013 and Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013.

ACKNOWLEDGMENTS

I wish to thank my daughter Kristin Hovaguimian for her outstanding work on the Index to this book (and all the others!), not an easy task with a book of this nature.

I thank my wife Kathleen Abbott for her dedication and amazing contributions to the editing process.

I thank my son Matthew Abbott for the inspiration he has always provided in matters statistical and philosophical.

Thank you Jon Gurstelle and the team at Wiley for your continuing support of this project.

1 INTRODUCTION

The world suddenly has become awash in data! A great many popular books have been written recently that extol “big data” and the information derived for decision makers. These data are considered “big” because a certain “catalog” of data may be so large that traditional ways of managing and analyzing such information cannot easily accommodate it. The data originate from you and me whenever we use certain social media, or make purchases online, or have information derived from us through radio frequency identification (RFID) readers attached to clothing and cars, even implanted in animals, and so on. The result is a massive avalanche of information that exists for business leaders, decision makers, and researchers to use for predicting related behaviors and attitudes.

BIG DATA ANALYSIS

Decision makers are trying to figure out how to manage and use the information available. Typical computer software used for statistical decision making is currently limited to a number of cases far below that which is available for consideration of big data. A traditional approach to address this issue is known as “data mining” in which a number of techniques, including statistics, are used to discover patterns in a large set of data.

Using Statistics in the Social and Health Sciences with SPSS® and Excel®, First Edition. Martin Lee Abbott. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

Researchers may be overjoyed with the availability of such rich data, but it provides both opportunities and challenges. On the opportunity side, never before have such large amounts of information been available to help researchers and policy makers understand widespread public thinking and behavior. On the challenge side, however, are several difficult questions:

• How are such data to be examined?

• Do current social science methods and processes provide guidance to examining data sets that surpass historical data-gathering capacity?

• Are big data representative?

• Do data sets so large obviate the need for probability-based research analyses?

• Do decision makers understand how to use social science methodology to assist in their analyses of emerging data?

• Will the decisions emerging from big data be used ethically, within the context of social science research guidelines?

• Will effect size considerations overshadow questions of significance testing?
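The last question can be made concrete with a small worked illustration. The sketch below is not drawn from this book's data or procedures. It assumes two hypothetical groups of equal size with a known common standard deviation, and the means (100.5 and 100.0), the standard deviation (15), and the function name `two_sample_z` are invented for illustration. Python is used only because it is compact; it is not one of the tools covered in the book. The point is that the same trivially small difference moves from "not significant" to "highly significant" as the number of cases grows, while the effect size does not change.

```python
import math


def two_sample_z(mean_a, mean_b, sd, n):
    """Return (z, two-sided p, Cohen's d) for two groups of size n each.

    Assumes a known common standard deviation `sd`, so a z test applies.
    All inputs here are hypothetical illustration values.
    """
    standard_error = sd * math.sqrt(2.0 / n)   # SE of the difference in means
    z = (mean_a - mean_b) / standard_error
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal p value
    d = (mean_a - mean_b) / sd                 # effect size, independent of n
    return z, p, d


# Same half-point difference on a scale with SD = 15, at three sample sizes.
for n in (100, 10_000, 1_000_000):
    z, p, d = two_sample_z(100.5, 100.0, 15.0, n)
    print(f"n per group = {n:>9,}  z = {z:6.2f}  p = {p:.4g}  d = {d:.3f}")
```

Under these assumptions the effect size stays near 0.03 at every sample size, whereas the p value falls below the conventional 0.05 level once each group reaches 10,000 cases. With data sets of "big data" scale, nearly any difference will pass a significance test, which is one reason effect size may matter more than significance.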

Social scientists can rely on existing statistical methods to manage and analyze big data, but the way in which the analyses are used for decision making will change. One trend is that prediction may be hailed as a more prominent method for understanding the data than traditional hypothesis testing. We will have more to say about this distinction later in the book, but it is important at this point to see that researchers will need to adapt statistical approaches for analyzing big data.

VISUAL DATA ANALYSIS

Another emerging trend for understanding and managing the swell of data is the use of visuals. Of course, visual descriptions of data have been used for centuries. It is commonly acknowledged that the first “pie chart” was published by Playfair (1801). Playfair’s example in Figure 1.1 compares the dynamics of nations over time.

Figure 1.1 compares nations using size, color, and orientation over time. Using this method for comparing information has been useful for viewing the patterns in data not readily observable from numerical analysis.

As with numerical methods, however, there are opportunities and challenges in the use of visual analyses:

• Can visual means be used to convey complex meaning?

• Are there “rules” that will help to ensure a standard way of creating, analyzing, and interpreting such visual information?

• Will visual analyses become divorced from numerical analysis so that observers have no way of objectively confirming the meaning of the images?

Several visual data software analysis programs have appeared over the last several years. Simply running an online search will yield several possibilities including many that offer free (initial) programs for cataloging and presenting data from the user. I offer one very important caveat (see the final bullet point earlier), which is that it is important to perform visual data analysis in concert with numerical analysis. As we will see later in the book, it is easy to intentionally or unintentionally mislead readers using visual presentations when these are divorced from numerical statistical means that discuss the “significance” and “meaningfulness” of the visual data.

Figure 1.1 William Playfair’s pie chart. Source: https://commons.wikimedia.org/wiki/File:Playfair_piecharts.jpg. Public domain.

IMPORTANCE OF STATISTICS FOR THE SOCIAL AND HEALTH SCIENCES AND MEDICINE

The presence of so much rich information presents meaningful opportunities for understanding many of the processes that affect the social world. While much of the time big data analyses are used for understanding business dynamics and economic trends, it is also important to focus on those data patterns that can affect the social sphere beyond these indicators: social and psychological behavior and attitudes, changes in understanding health and medicine, and educational progress. These social indicators have been the subject of a great deal of analysis over the decades and now may make significant advances depending on how big data are analyzed and managed. On a related note, the social sciences (especially sociology and psychology) are now areas included in the new Medical College Admission Test (MCAT), which also includes greater emphasis upon “Scientific Inquiry & Reasoning Skills.” The material we will learn from this book will help to support study in these areas for aspiring health and medical professionals.

In this book, I intend to focus on how to use and analyze data of all sizes and shapes. While we will be limited in our ability to dive into the world of big data fully, we can study the basics of how to recognize, generate, interpret, and critique analyses of data for decision making. One of the first lessons is that data can be understood both numerically and visually. When we describe information, we are attempting to see and convey underlying meaning in the numbers and visual expressions. If I have a collection of data, I cannot recognize its meaning by simply looking at it. However, if I apply certain numerical and visual methods to organize the data, I can see what patterns lie below the surface.

HISTORICAL NOTES: EARLY USE OF STATISTICS

Statistics as a field has had a long and colorful history. Students will recognize some prominent names as the field developed its mathematical identity: Pearson, Fisher, Bayes, Laplace, and others. But it is important to note that some of the earliest statistical studies were based in solving social and political problems.

One of the earliest of such studies was developed by John Graunt who compiled information from Bills of Mortality to detect, among other things, the impact and origins of deaths by plague. Parish records documented christenings, weddings, and burials at the time, so Graunt’s study tracked the number of deaths in the parishes as a way to understand the dynamics of the plague. His broader goal was to predict the population of London using extant data from the parish records.

Figure 1.2 John Snow’s map showing deaths in the London cholera epidemic of 1854. Source: https://commons.wikimedia.org/wiki/File:Snow-cholera-map-1.jpg. Public domain.

Another early use of statistics was Dr John Snow’s map showing deaths in the houses of London’s Soho District during the 1854 cholera epidemic, as popularized by Johnson’s book, The Ghost Map (2006). In order to investigate the reasons for the spread of cholera other than odor (“miasma theory”), Snow created a map showing each death as a black line outside each household, along with features of the neighborhood including the water sources located throughout the district. The map created a visual picture of the concentration of deaths across the district and led to hypotheses about cholera spreading by waterborne contamination rather than smell. (If you were to walk across the same London district today, you would see that the great social theorist Karl Marx lived just a few streets away from the center of the cholera deaths.)

Figure 1.2 shows Snow’s map. You can see that near the center of the map is the “Broad Street Pump” which Snow determined to be the source for the spread of cholera. (At the time, Karl Marx lived on Dean Street, just to the east of the Broad Street Pump.) Notice that the houses nearest this pump recorded the highest numbers of deaths.

The Figure 1.2 example not only shows how descriptive statistics underscored the use of visual means of representing data, but it also helped to clarify possible reasons for an epidemic. Graunt’s tables based on the Bills of Mortality were rudimentary visuals, but Snow’s map was a more effective means of portraying complex data by visual means. A still later statistician made even greater advancements in using visual information to communicate trends in data.

Figure 1.3 Florence Nightingale’s polar chart comparing battlefield and nonbattlefield deaths. Source: https://en.wikipedia.org/wiki/Pie_chart#/media/File:Nightingale-mortality.jpg. Public domain.

Nightingale (1858) is most often remembered as the founder of modern nursing. She is often represented in paintings as “the lady with the lamp,” since she was known to walk among the bedsides checking on the sick and wounded of the war. But Nightingale was also an astute statistician who used statistics to capture the dramatic need in hospitals during the Crimean War. She is credited as being one of the first to use a “pie chart” (more accurately, a “polar chart”). Figure 1.3 shows comparisons in her original polar chart of differences between soldiers who died of battlefield wounds (“red” wedges near the center) and those who died from other causes (“blue” wedges measured from the center of the graph) over time. The relationship between these groups fueled Nightingale’s efforts to obtain further funding for sanitary hospital conditions since those who died of infections were greater in number than those dying of battlefield wounds.

APPROACH OF THE BOOK

Many students and researchers are intimidated by statistical procedures, which may be due to fear of math, problematic math teachers in earlier education, or the lack of exposure to a “discovery” method for understanding difficult procedures. This book is an introduction to understanding statistics in a way that allows students to discover patterns in data and develop skill at making interpretations from data analyses. I describe how to use statistical programs (SPSS and Excel) to make the study more understandable and to teach students how to approach problem solving. Ordinarily, a first course in statistics leads students through the worlds of descriptive and inferential statistics by highlighting the formulas and sequential procedures that lead to statistical decision making. We will do all this in this book, but I place a good deal more attention on conceptual understanding. Thus, rather than memorizing a specific formula and using it in a specific way to solve a problem, I want to make sure the student first understands the nature of the problem, why a specific formula is needed, and how it will result in the appropriate information for decision making.

By using statistical software, we can place more attention on understanding how to interpret findings. Statistics courses taught in mathematics departments, and in some social science departments, often place primary emphases on the formulas/processes themselves. In the extreme, this can limit the usefulness of the analyses to the practitioner. My approach encourages students to focus more on how to understand and make applications of the results of statistical analyses. SPSS and other statistical programs are much more efficient at performing the analyses; the key issue in my approach is how to interpret the results in the context of the research question.

Beginning with my first undergraduate course teaching statistics with conventional textbooks, I have spent countless hours demonstrating how to conduct statistical tests manually and teaching students to do likewise. This is not always a bad strategy; performing the analysis manually can lead the student to understand how formulas treat data and yield valuable information. However, it is often the case that the student gravitates to memorizing the formula or the steps in an analysis. Again, there is nothing wrong with this approach as long as the student does not stop there. The outcome of the analysis is more important than memorizing the steps to the outcome. Examining the appropriate output derived from statistical software shifts the attention from the nuances of a formula to the wealth of information obtained by using it.

It is important to understand that I do indeed teach the student the nuances of formulas, understanding why, when, how, and under what conditions they are used. But in my experience, forcing the student to scrutinize statistical output files accomplishes this and teaches them the appropriate use and limitations of the information derived.

Students in my classes are always surprised (ecstatic) to realize they can use their textbooks and notes on my exams. But they quickly find that, unless they really understand the principles and how they are applied and interpreted, an open book is not going to help them. Over time, they come to realize that the analyses and the outcomes of statistical procedures are simply the ingredients for what comes next: building solutions to research problems. Therefore, their role is more detective and constructor than number juggler.

This approach mirrors the recent national and international debate about math pedagogy. In our recent book, Winning the Math Wars (2010), my colleagues and I addressed these issues in great detail, suggesting that, while traditional ways of teaching math are useful and important, the emphases of reform approaches are not to be dismissed. Understanding and memorizing detail are crucial, but problem solving requires a different approach to learning.

CASES FROM CURRENT RESEARCH

I focus on using real-world data in this book. There are several reasons for doing so, primarily because students need to be grounded in approaches for using data from the real world with all their problems and “grittiness.” When people respond to surveys or interviews, they inevitably fill out information in ways not asked by interviewers (e.g., respondents may choose two possible answers when one is required, etc.). Moreover, transferring data to electronic form may result in miscoded responses or categorization problems. Researchers always confront these issues, and I believe it is important for students to leave the classroom aware of the range of possible problems with real-world data and prepared for dealing with them. Of course, much of the data we will examine will already have been put in standard forms, but other research issues will arise (e.g., how do I recategorize data, assign missing cases, compute new variables, etc.?).

Another reason I use real-world data is to familiarize students with contemporary research questions in the social and health science fields. Classroom data often are contrived to make a certain point or show a specific procedure, which are both helpful. But I believe it is important to draw the focus away from the procedure per se and understand how the procedure will help the researcher resolve a research question. The research questions are important. Policy reflects the available information on a research topic, to some extent, so it is important for students to be able to generate that information as well as to understand it. This is an “active” rather than “passive” learning approach to understanding statistics.

Data Labs are a very important part of this course since they allow students to take charge of their learning. This is the heart of discovery learning. Understanding a statistical procedure in the confines of a classroom is necessary and helpful. However, learning that lasts is best accomplished by students directly engaging the processes with actual data and observing what patterns emerge in the findings that can be applied to real research problems.

Some practice problems may use data created for classroom use, but real-world data from actual research databases will enable a deepening of understanding. In addition to national databases, I use results from my own research for classroom learning. In every case, researchers know that they will discover knotty problems and unusual, sometimes idiosyncratic, information in their data. If students are not exposed to this real-world aspect of research, it will be confusing when they engage in actual research beyond the confines of the classroom.

In this course, we will have several occasions to complete Data Labs that pose research problems with actual data. Students take what they learn from the book material and conduct a statistical investigation using SPSS and Excel. Then, they have the opportunity to examine the results, write research summaries, and compare findings with the solutions presented at the end of the book.

The project labs also introduce students to two software approaches for solving statistical problems. These are quite different in many regards, as we will see in the chapters that follow. SPSS provides additional advanced procedures educational researchers utilize for more complex and extensive research questions. Excel is widely accessible and provides a wealth of information to researchers about many statistical processes they encounter in actual research. The Data Labs provide solutions in both formats so the student can learn the capabilities and approaches of each.

This book makes use of publicly available research data. The General Social Survey or GSS¹ is a nationally representative survey designed to be part of a program of social research to monitor changes in Americans’ social characteristics and attitudes. Funded through the National Science Foundation and administered by the National Opinion Research Center (NORC), the GSS has been administered annually or biennially since 1972. As a general survey, the GSS asks a variety of questions on a series of topics designed to track the opinions of Americans over the last four decades. Other databases we will use in the book include the following:

• The Centers for Disease Control and Prevention (CDC) conducts the Behavioral Risk Factor Surveillance System (BRFSS) as a health-related telephone survey to measure American residents’ health conditions, health behaviors, and use of preventative services.²

• Association of Religion Data Archives (ARDA) presents a series of databases on a variety of religion topics from the sociological perspective. In addition to other databases, the ARDA presents GSS databases on special modules (sets of questions) relevant to religion. By visiting the ARDA (www.thearda.com), you can peruse the codebook for the latest GSS file (www.thearda.com/Archive/GSS.asp) to get a fuller sense of the types of questions a general survey asks. You can also visit the ARDA’s “Learning Center” to take a survey that allows you to compare yourself to a larger national profile. The “Compare Yourself to the Nation” survey allows you to see how you compare to others based on the results from the 2005 Baylor Religion Survey (addressing religious identity, beliefs, experiences, paranormal views, etc.).

¹ Tom W. Smith, Peter Marsden, Michael Hout, and Jibum Kim. General Social Surveys, 1972–2012 [machine-readable data file] / Principal Investigator, Tom W. Smith; Coprincipal Investigator, Peter V. Marsden; Coprincipal Investigator, Michael Hout; Sponsored by National Science Foundation. NORC ed. Chicago: National Opinion Research Center [producer]; Storrs, CT: The Roper Center for Public Opinion Research, University of Connecticut [distributor], 2013. 1 data file (57,061 logical records) + 1 codebook (3432 pp.). (National Data Program for the Social Sciences, No. 21).

² Centers for Disease Control and Prevention (CDC) (2013). Behavioral Risk Factor Surveillance System Survey Data. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention.

RESEARCHDESIGN

Researcherswhowritestatisticsbookshaveadilemmawithrespecttoresearch design.Typically,statisticsandresearchdesignaretaughtseparatelyinorderfor studentstounderstandeachingreaterdepth.Thedifficultywiththisapproachisthat thestudentisleftontheirowntosynthesizetheinformation;thisisoftennotdone successfully.

Colleges and universities attempt to manage this problem differently. Some require statistics as a prerequisite for a research design course or vice versa. Others attempt to synthesize the information into one course, which is difficult to do given the eventual complexity of both “sets” of information. Adding somewhat to the problem is the approach of multiple courses in both domains.

I do not offer a perfect solution to this dilemma. My approach focuses on an in-depth understanding of statistical procedures for actual research problems. What this means is that I cannot devote a great deal of attention in this book to research design apart from the statistical procedures which are an integral part of it. (You may wish to consult a separate book on research design I authored with my colleague Jennifer McKinney, Understanding and Applying Research Design, 2013.)

I try to address the problem in two ways. First, wherever possible, I connect statistics with specific research designs. This provides an additional context in which students can focus on using statistics to answer research questions. The research question drives the decision about which statistical procedures to use; it also calls for discussion of appropriate design in which to use the statistical procedures. We will cover essential information about research design in order to show how these might be used.

Second, I have an online course in research design that can be accessed to continue your exploration from this book. In addition to databases and other research resources, you can follow the web address in the preface to gain access to the online course as additional preparation in research design.

FOCUS ON INTERPRETATION

I call attention to problem solving and interpretation as the important elements of statistical analysis. It is tempting for students to focus so much on using statistical procedures to create meaningful results (a critical matter!) that they do not focus on what the results mean for the research question. They stop after they use a formula and decide whether or not a finding is statistically significant. I strongly encourage students to think about the findings in the context and words of the research question. This is not an easy thing to do because the meaning of the results is not always cut and dried. It requires students to think beyond the formula.

Statisticians and practitioners have devised rules to help researchers with this dilemma by creating criteria for decision making. For example, as we will see in Chapter 11, squaring a correlation yields the “coefficient of determination,” which represents the amount of variance in one variable that is accounted for by the other variable (this is known as “effect size,” a topic which we will spend a great deal of time with in this book). But the next question is, how much of the “accounted for variance” is meaningful? This consideration is key to understanding how to use and make decisions on the basis of big data.
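The squaring step can be illustrated with a short sketch. This book works in SPSS and Excel, so the Python below is only an aside: the two lists of scores are invented for the illustration (they come from none of the databases named above), and the helper name pearson_r is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations, and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical scores for five people on two variables (invented data)
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
r_squared = r ** 2  # coefficient of determination
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

With these invented scores, the squared correlation says that 60% of the variance in one variable is accounted for by the other. Whether that amount is meaningful is the interpretive question, and the arithmetic alone cannot answer it.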

In many ways, interpretation of results is an art undergirded by the canons of science. Much of the ability to develop expertise in interpretation comes by long hours of tutelage with researchers who have done it for many years. We cannot hope to emerge from our study with this expertise, but through constant focus on interpretation, we can become aware of the acceptable ways of understanding and using statistical results.

Statisticians have suggested different ways of helping with interpretation. For example, when dealing with the “accounting of variance” example presented earlier, statisticians have created criteria under which 0.01 (1%) of the variance accounted for is considered “small” while 0.05 (5%) is “medium” and so forth. (And, much to the dismay of many students, there is more than one set of these criteria.) Therefore, if we determine that the correlation between two variables reaches these criteria levels, we can feel secure in sticking to good interpretation guidelines. Problems exist, however, in how to view these statistical results within the context of the research problem.

For example, if a research question is, “Does class size affect math achievement?” and the results suggest that class size accounts for 1% of the variance in math achievement, many researchers might agree the results represent a small and perhaps even inconsequential impact. However, if a research question is, “Does drug X affect Ebola survival rates?,” researchers might consider 1% of the variance to be much more consequential than “small!” This is not to say that math achievement is any less important than Ebola survival rates (although that is another of those debatable questions researchers face), but the researcher must consider a range of factors in determining meaningfulness: the intractability of the research problem, the discovery of new dimensions of the research focus, whether or not the findings represent life and death, and so on. The material point is that statistical criteria are important for establishing meaningfulness of results, but overall interpretation involves the larger context within which the research takes place.

I have found that students have the most difficult time with these matters. Students often find using a formula to create numerical results much preferable to understanding what the results mean in the context of the research question. They have been conditioned to stop after they get the right numerical answer. They typically do not get to the difficult work of what the right answer means because it isn’t always apparent.

I emphasize “practical significance” (effect size) in this book as well as statistical significance. In many ways, this is a more comprehensive approach to uncertainty, since effect size is a measure of “impact” in the research evaluation. It is important to measure the likelihood of chance findings (statistical significance), but the extent of influence represented in the analyses affords the researcher another vantage point to determine the relationship among the research variables.

Coverage of Statistical Procedures

The statistical applications we will discuss in this book are “workhorses.” This is an introductory treatment, so we need to spend time discussing the nature of statistics and basic procedures that allow you to use more sophisticated procedures. We will not be able to examine advanced procedures in much detail. I will provide some references for students who wish to continue their learning in these areas. Hopefully, as you learn the capability of SPSS and Excel, you can explore more advanced procedures on your own, beyond the end of our discussions.

Some readers may have taken statistics coursework previously. If so, my hope is that they are able to enrich what they previously learned and develop a more nuanced understanding of how to address problems in educational research through the use of SPSS and Excel. Whether readers are new to the study or experienced practitioners, my hope is that statistics becomes meaningful as a way of examining problems and debunking prevailing assumptions in the social and health sciences.

Often, well-intentioned people can, through ignorance of appropriate processes, promote ideas that may not be true. Further, policies might be offered that would have a negative impact even though the policy was not based on sound statistical analyses. Statistics are tools that can be misused and influenced by the value perspective of the wielder. However, policies are often generated in the absence of compelling research. Students need to become “research literate” in order to recognize when statistical processes should be used and when they are being used incorrectly.

2

DESCRIPTIVE STATISTICS: CENTRAL TENDENCY

When I teach statistics, I typically begin by offering a series of questions that emphasize the importance of statistics for solving real research problems. Statistical formulas and procedures are logical and crucial, but the primary function for statistical analyses (at least, in my mind) is to bring clarity and understanding to a research question. As I discussed in a recent book dealing with statistics for program evaluation (Abbott, 2010), statistical procedures are best used to discover patterns in the data that are not directly observable. Bringing light to these patterns allows the student and the researcher to understand and engage in problem solving.

WHAT IS THE WHOLE TRUTH? RESEARCH APPLICATIONS (SPURIOUSNESS)

Finding the “truth” is a laudable goal and one that should inform all research efforts. However, in statistics, it is not likely that we will ever really discover ultimate truth. The nature of statistics is that we strive to observe as fully as possible what relationships exist among variables so that we can understand likely causal linkages. Does poverty “cause” crime? Is longevity affected by access to healthcare? These questions intimate valid relationships between the research variables. However, one of the first lessons in statistics and research is that valid and meaningful relationships are not always easily visible. Certainly most realities in contemporary life are much more complex than can be explained by two variables. We therefore must be able to “see” patterns among data using both numerical and visual means that underlie seemingly simple relationships.

Using Statistics in the Social and Health Sciences with SPSS® and Excel®, First Edition. Martin Lee Abbott. © 2017 John Wiley & Sons, Inc. Published 2017 by John Wiley & Sons, Inc.

As we will discuss in Chapter 11, there is a big difference between “correlation” and “causation.” This statistical adage helps to point out the complexity of understanding the patterns among variables. Just because two variables are strongly statistically related does not mean that there is a causal relationship between them. Causality is difficult to prove. In order to understand the apparent causal relationship more fully, we must look at other variables that might have a meaningful but “hidden” relationship with both “visible” variables. Researchers use the term “spuriousness” to describe whether an apparent relationship between two variables might be the influence of variables not in the analysis. An example of spuriousness is the relationship between ice cream consumption and crime.¹

There is a positive relationship between rates of ice cream consumption and crime; when one increases, so does the other. Should we conclude then that ice cream consumption leads to criminal behavior in a causal way? Spuriousness means that there may not be a true or genuine relationship between factors even if it looks like there is. Some unobserved or unnoticed variable may be related to both of the variables we can “see” (in this example ice cream consumption and crime), which may make it appear that the “visible” variables have a cause–effect relationship.

In this example, ice cream consumption increases as crime increases; and, consequently, when crime increases, so does the consumption of ice cream. These two variables appear to be consistently related to each other. They probably do not have a causal relationship, however, since both ice cream consumption and crime are related to a third factor: temperature. When temperatures rise, ice cream consumption increases (people eat more ice cream in the summer than winter). Also, when temperatures rise, crime increases. If we include these additional relationships in our study, then we can see that the apparent causal relationship between ice cream consumption and crime is probably really more an issue of the weather; both of the variables are “linked” by temperature.
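A small simulation can make the temperature explanation concrete. The sketch below is an illustration under stated assumptions, not an analysis of real data: the numbers are randomly generated so that temperature drives both ice cream consumption and crime while neither affects the other, and the helper names are hypothetical. It uses the partial correlation, a standard technique (not introduced in this section) for examining the relationship between two variables while holding a third constant.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def partial_r(r_xy, r_xz, r_yz):
    """Correlation between x and y with the third variable z held constant."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Simulated (invented) data: temperature influences both other variables;
# ice cream and crime have no direct link to each other.
rng = random.Random(0)
n = 2000
temperature = [rng.gauss(0, 1) for _ in range(n)]
ice_cream = [t + rng.gauss(0, 1) for t in temperature]
crime = [t + rng.gauss(0, 1) for t in temperature]

r_ice_crime = pearson_r(ice_cream, crime)
r_ice_temp = pearson_r(ice_cream, temperature)
r_crime_temp = pearson_r(crime, temperature)
r_controlled = partial_r(r_ice_crime, r_ice_temp, r_crime_temp)

print(f"ice cream vs. crime:           {r_ice_crime:.2f}")
print(f"same, controlling temperature: {r_controlled:.2f}")
```

In the simulated data, the two “visible” variables are clearly correlated, yet the correlation falls to roughly zero once temperature is taken into account.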

Without considering spuriousness, some might be tempted to explain why there is a causal relationship between ice cream consumption and crime. For example, does ice cream lead to feelings of grandeur or a propensity for aggression, which causes people to commit crime? Or is it that good ice cream is so expensive that people commit crimes in order to support their ice cream habit? Which makes most sense? Although we could come up with several reasons (mostly fanciful) why one of these variables might be causally related to the other, we need to be cautious.

This situation leads to one of the most profound lessons in social science: objectivity is necessary to pursue knowledge dispassionately. If we assume there is a relationship between things without using objective means of assessing the truth of the situation, then we are simply imposing a subjective understanding of the situation that is not “anchored” in science. Some call this the “procrustean exercise,” referencing the mythological figure who forced people to fit an iron bed by either stretching them or cutting off the excess. Thus, by not taking an objective stance, we may have a tendency to make apparent reality “fit” our mental picture or subjective assumptions.

1 This example and explanation are discussed in Abbott and McKinney (2013).

Figure 2.1 The possible spurious relationship between ice cream consumption and crime.

Figure 2.1 shows the possible relationships among ice cream consumption, crime, and temperature. The top panel shows the apparent relationship between ice cream consumption and crime, with a two-way line connecting the variables indicating that the two are highly related to one another. The bottom panel shows that, when the third variable (temperature) is introduced, the apparent relationship between ice cream consumption and crime disappears, as indicated by the absence of a line connecting them.

Identifying potentially spurious relationships is often quite difficult and comes only after extended research. The researcher must know their data intimately in order to make the discovery. An example of this is a study of industrial democracy I conducted several years ago. It was generally accepted in industry at the time that, if workers were given the ability to participate in decision making, they would have higher job satisfaction (JS). This was a reasonable assumption, given similar findings in the research literature. However, the more I examined my own data from workers in an electronic industry, the more I questioned this assumption and decided to explore the matter further.

I noticed from interviews that many workers did not want to participate in decision making, even though they had the opportunity to do so. I therefore analyzed the original “participation–job satisfaction” relationship but this time added variables that measured workers’ attitudes toward their work and a desire for management. Through a series of analyses, I found a number of surprising results that “modified” the original assumption of a direct (and causal) relationship between participation and JS. One of these findings was that a worker’s attitude toward management had a lot to do with their eventual satisfaction levels. Those workers who participated in decision making and who had a positive view of management showed stronger satisfaction than those workers who did not have such a positive view of management. Thus, a third variable (view of management) that was not originally included in the simple relationship (participation–satisfaction) had an impact on the findings. This subsequent analysis discovered a pattern in the data that was not “visible” at the outset.


The popular press often presents research findings that are somewhat bombastic but might possibly be spurious. Is student achievement really just a matter of ethnicity, or are there other factors involved (e.g., family income)? Do lifestyle choices directly impact longevity, or are there other considerations that need to be taken into account (e.g., social class)? The value of statistics is that it equips the student and researcher with the skills necessary to debunk simplistic findings.

DESCRIPTIVE AND INFERENTIAL STATISTICS

Statistics, like other courses of study, is multifaceted. It includes “divisions” that are each important in understanding the whole. Two major divisions are descriptive and inferential statistics. Descriptive statistics are methods to summarize and “boil down” the essence of a set of information so that it can be understood more readily and from different vantage points. We live in a world rich with data; descriptive statistical techniques are ways of making sense of it. Using these straightforward methods allows the researcher to detect numerical and visual patterns in data that are not immediately apparent.

Inferential statistics are a different matter altogether. These methods allow you to make predictions about attitudes, behaviors, and patterns on a large scale based on small sets of “sample” values. In real life, we are presented with situations that cannot provide us with certainty: Would a national training method improve patients’ satisfaction ratings of their physicians? Can we predict workers’ health scores or longevity in a variety of industries based on their job positions? Inferential statistics allow us to infer or make an observation about an unknown value from sample values that are known. Obviously, we cannot do this with absolute certainty – we do not live in a totally predictable world. But we can do it within certain bounds of probability. Hopefully, statistical procedures will allow us to get closer to certainty than we could get without them.

THE NATURE OF DATA: SCALES OF MEASUREMENT

The first step in understanding complex relationships like the ones I described earlier is to be able to understand and describe the nature of what data are available to a researcher. We often jump into a research analysis without truly understanding the features of the data we are using. Understanding the data is a very important step because it can reveal hidden patterns and it can suggest custom-made statistical procedures that will result in the strongest findings.

One of the first realizations by researchers is that data come in a variety of sizes and shapes. That is, researchers have to work with available information to make statistical decisions and that information takes many forms. Students are identified as either “qualified” or “not qualified” for free or reduced lunches:

1. Workers either “desire participation” or “do not desire participation.”

2. Job satisfaction is measured by worker responses to several questionnaire items asking them to “Agree Strongly,” “Agree,” “Neither Agree nor Disagree,” “Disagree,” or “Disagree Strongly.”

3. Medical researchers measure workers’ physical health by how many days during the last month their physical health was good.

Nominal Data

The first example shows that data can be “either–or” in the sense that they represent mutually exclusive categories. If a worker indicates that they “desire participation” on a survey instrument, for example, they would not fit the “do not desire participation” category. Other examples of “categorical” data are sex (male and female) and experimental groups (treatment or control).

This type of data, called “nominal,” does not represent a continuum with intermediate values. Each value is a separate category only related by the fact they are categories of some larger value (e.g., male and female are both values of sex). These data are called nominal since the root of the word indicates “names” of categories. They are also appropriately called “categorical” data.

The examples of nominal data just mentioned can also be classified as “dichotomous” since they are nominal data that have only two categories. Nominal data also include variables with more than two categories such as schooling (e.g., public, private, homeschooling). We will discuss later that dichotomous data can come in a variety of forms also, like “true dichotomies” in which the categories naturally occur like sex, and “dichotomized variables” that have been created by the researcher from some different kind of data (like satisfied and not satisfied workers). In all cases, nominal data represent mutually exclusive categories. Educators typically confront nominal data in classifying students by gender or race, or, if they are conducting research, they classify groups as “treatment” and “control.”

In order to quantify the variables, researchers assign numerical values to the categories. For example, “treatment groups” might be assigned a value of “1” and “control groups” might be assigned a value of “2.” In these cases, the numbers are only categories; they do not represent actual measurements. Thus, a control group is not twice a treatment group. The numbers are only a convenient way of identifying the different categories.

Because nominal data are categorical, we cannot use the mathematical operations of addition, subtraction, multiplication, and division. It would make no sense to divide the number of Jeeps in a parking lot (one category) by the number of Teslas in the same parking lot (second category) to get a single measure of the automobiles. In order to get an idea of the automobiles in the parking lot, researchers would need to identify the categories of automobiles and find the percentage of each category in the parking lot. Thus, we might say that there are 15% Jeeps, 2% Teslas, 29% Toyotas, and so on in the parking lot.
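The parking lot summary can be sketched in a few lines of Python (offered only as an aside to the SPSS and Excel procedures used in this book). The lot of 100 cars is hypothetical: the counts are invented to match the percentages in the example, and the “Other” category is an assumption standing in for the “and so on.”

```python
from collections import Counter

# Hypothetical parking lot of 100 cars; counts are invented to match
# the percentages in the text, and "Other" stands in for "and so on".
lot = ["Jeep"] * 15 + ["Tesla"] * 2 + ["Toyota"] * 29 + ["Other"] * 54

counts = Counter(lot)          # frequency of each category
total = sum(counts.values())   # number of cars in the lot
percentages = {make: 100 * count / total for make, count in counts.items()}

for make, pct in percentages.items():
    print(f"{make}: {pct:.0f}%")
```

Notice that the only operations applied are counting the members of each category and expressing each count as a share of the total. Nothing is computed from the category labels themselves.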

Ordinal Data

The second example listed in the previous section (THE NATURE OF DATA: SCALES OF MEASUREMENT) indicates another kind of data: ordinal data. These are data with a second characteristic of meaning, position. These data are also categories, as in nominal data, but with the categories related by “more than” and “less than.” Some categories are placed above in value or below in value of some other category.

Medical researchers typically find ordinal data in many places: county surveys regarding citizens’ health and preference for treatment options, for example. In these cases, one person’s response can be more or less than another person’s on the same measure. According to our earlier discussion, JS can be measured by a question that workers answer about their work like the following:

“I am happy with the work I do.”

1. Agree Strongly (SA)

2. Agree (A)

3. Neither Agree nor Disagree (N)

4. Disagree (D)

5. Disagree Strongly (SD)

As you can see, one worker can be quite happy and indicate “Agree Strongly,” while another can report that they are a little less happy by indicating “Agree.” The workers are reporting different levels of happiness, with some being more or less happy than others.

Figure 2.2 shows another example of ordinal data categories; this example is from the BRFSS Codebook, in which medical researchers assigned numbers to respondents’ reported health.²

As you can see in Figure 2.2, the response categories (“Excellent,” “Very good,” etc.) are still categories, but they are linked by “gradual amounts” of agreement.

Variable Name: GENHLTH
Description: Would you say that in general your health is:

Value   Value Label             Frequency   Percentage   Weighted Percentage
1       Excellent                  85,532        17.39                 18.66
2       Very good                 159,104        32.35                 31.68
3       Good                      150,548        30.61                 31.11
4       Fair                       66,700        13.56                 13.31
5       Poor                       27,909         5.68                  4.76
7       Don’t know/not sure           969         0.20                  0.18
9       Refused                     1,004         0.20                  0.29
BLANK   Not asked or missing            7

Figure 2.2 The BRFSS GENHLTH variable values.

2 Centers for Disease Control and Prevention (CDC). Behavioral Risk Factor Surveillance System Survey Questionnaire. Atlanta, Georgia: U.S. Department of Health and Human Services, Centers for Disease Control and Prevention, 2013.

TABLE 2.1 Typical Ordinal Response Scale

According to the data shown, 17.39% of the respondents reported that they would rate their health as excellent, while 5.68% of respondents rated their health as poor.
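The percentages quoted here can be recovered from the frequencies in Figure 2.2. The Python sketch below is an aside rather than part of the SPSS or Excel procedures in this book. It assumes that the percentage column is computed over the seven answered categories only, leaving out the 7 “not asked or missing” records; that assumption reproduces the printed values.

```python
# Frequencies as printed in Figure 2.2 (BRFSS GENHLTH variable).
frequencies = {
    "Excellent": 85532,
    "Very good": 159104,
    "Good": 150548,
    "Fair": 66700,
    "Poor": 27909,
    "Don't know/not sure": 969,
    "Refused": 1004,
}

# Assumption: the 7 blank (not asked or missing) records are excluded
# from the denominator.
total = sum(frequencies.values())
percentages = {label: 100 * n / total for label, n in frequencies.items()}

for label, pct in percentages.items():
    print(f"{label:<20} {pct:6.2f}%")
```

Each percentage is simply a category’s frequency divided by the total number of responses, the same kind of summary used for nominal categories.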

These examples of survey data are the stock-in-trade of social scientists because they provide such a convenient window into people’s thinking. Medical, health, and social researchers use them constantly for gaining insight into, and making decisions about, policies in healthcare, urban planning, worker democracy, education, and other related arenas.

There is a difficulty with these kinds of data for the researcher, however. Typically, the researcher needs to provide a numerical referent for a person’s response to different questionnaire response categories in order to examine and describe the set of responses. Therefore, they assign numbers to the response categories as shown in Table 2.1.

The difficulty arises when the researcher treats the numbers (1–5 in Table 2.1) as integers rather than ordinal indicators. If the researcher thinks of the numbers as integers, they typically create an average rating on a specific questionnaire item for a group of respondents. Thus, assume, for example, that four people responded to the questionnaire item above (“I am happy with the work I do”) with the following results: 2, 4, 3, 1 (i.e., person 1 “Agrees,” receiving a 2 for “Agree”; person 2 “Disagrees”; person 3 is “Neutral”; and person 4 “Strongly Agrees”). The danger is in averaging these by adding them together and dividing by four to get 2.5, as follows: (2 + 4 + 3 + 1)/4 = 2.5. This result would mean that on average, all four respondents indicated an agreement halfway between the 2 and the 3 (and therefore halfway between “Agree” and “Neutral”). This assumes that each of the numbers has an equal distance between them, that is, that the distance between 4 and 3 is the same as the distance between 1 and 2. This is what the scale in Table 2.1 looks like if you simply think of the numbers as integers.

However, an ordinal scale makes no such assumptions. Ordinal data only assume that a 4 is greater than a 3, or a 3 is greater than a 2, but not that the distances between the numbers are the same. Table 2.2 shows a comparison between how an ordinal scale appears and how it might actually be represented in the minds of two different respondents.

TABLE 2.2 Perceived Distances in Ordinal Response Items

According to Table 2.2, respondent 1 is the sort of person who is quite certain when they indicate SA. This same person, however, makes few distinctions between A and N and between D and SD (but they are certain that any disagreement is quite a distance from agreement or neutrality). Respondent 2, by contrast, doesn’t make much of a distinction between SA, A, and N, but seems to make a finer distinction between areas of disagreement, indicating stronger feelings about how much further SD is from D.

Hopefully this example helps you to see that the numbers on an ordinal scale do not represent an objective distance between the numbers, but they are only indicators of ordinal categories and can differ between people on the same item. The upshot, for research, is that you cannot add the numbers and divide by the total to get an average because the distances between the numbers may be different for each respondent! Creating an average would then be based on different meanings of the numbers and would not accurately represent how all the respondents, as a group, responded to the item.
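To see how much an average depends on the equal-distance assumption, consider the sketch below (Python, offered only as an aside). The four responses are the ones from the example above. The “uneven” positions are hypothetical values invented for this illustration, loosely in the spirit of a respondent who sees little distance between A and N or between D and SD; they are not the values of Table 2.2.

```python
from statistics import mean, median_low

responses = [2, 4, 3, 1]  # the four answers from the example in the text

# Codes treated as equally spaced integers
equal_spacing = {1: 1.0, 2: 2.0, 3: 3.0, 4: 4.0, 5: 5.0}

# Hypothetical perceived positions (invented for illustration only)
uneven_spacing = {1: 1.0, 2: 2.8, 3: 3.0, 4: 4.8, 5: 5.0}

def mean_position(codes, spacing):
    """Average of the positions that a given spacing assigns to each code."""
    return mean([spacing[code] for code in codes])

naive_mean = mean_position(responses, equal_spacing)
shifted_mean = mean_position(responses, uneven_spacing)

# An order-based summary does not depend on the spacing at all
middle_category = median_low(responses)

print(f"mean with equal spacing:  {naive_mean:.2f}")
print(f"mean with uneven spacing: {shifted_mean:.2f}")
print(f"lower median category:    {middle_category}")
```

The same four answers give a different “average” as soon as the distances between categories change, whereas a summary that relies only on the order of the categories stays the same.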

Interval Data

The majority of the procedures we will study in this book use interval data. These data are numbers that have the properties of nominal and ordinal data, but add another characteristic, equal distance between the numbers. Interval data are numbers that have equal distance between them, so that the difference between 90 and 91 is the same as the distance between 103 and 104; in both cases, the difference is one unit. The value of this assumption is that you can use mathematical operations (multiplication, addition, subtraction, and division) to analyze the numbers because they have equal distances. Interval data are also “continuous” since an interval variable is expressed through a large number of equal distance measures.

An example of an interval scale is a standardized assessment test such as an intelligence quotient (IQ) test. A standardized test is one that meets strict criteria for testing and can ensure strong validity and reliability. These tests are benchmarked by having been used with a number of different sets of respondents under the same directions, with the same materials, time, and general conditions. They also typically have published norms so that researchers can have an objective measure against which to compare the results of the respondents of their own study.

While psychologists and educational researchers disagree about what IQ really represents, the numbers nevertheless share the equal distance property. With IQ, or other standardized tests, the respondent indicates their answers to a set of questions designed to measure the characteristic or trait studied. Since the IQ measure has been used and benchmarked with so many different groups of people over the decades, the scores come to have the property of equal intervals between IQ scores.

JS is another example. Respondents usually indicate that they strongly agree, agree, etc., with a series of items measuring their attitudes toward their job. The Job Diagnostic Survey (JDS) (Hackman and Oldham, 1980) includes the following item as part of the measurement of JS: “I am generally satisfied with the kind of work I do in this job” (response scale is “Disagree Strongly,” “Disagree,” “Disagree Slightly,” “Neutral,” “Agree Slightly,” “Agree,” and “Agree Strongly”). The measurement of JS uses a series of these kinds of questions to measure a worker’s attitude toward their job.
