Contents

To Teachers: About This Book xv
To Students: What Is Statistics? xxv
About the Authors xxix
Data Table Index xxxi
Beyond the Basics Index xxxiii

Sections marked with an asterisk are optional.

PART I Looking at Data

CHAPTER 1 Looking at Data—Distributions 1
  Introduction 1
  Variables 2
  Measurement: know your variables 5

1.1 Displaying Distributions with Graphs 6
  Graphs for categorical variables 6
  Data analysis in action: don’t hang up on me 7
  Stemplots 9
  Histograms 12
  Examining distributions 15
  Dealing with outliers 17
  Time plots 18
  Beyond the basics: decomposing time series* 19
  Section 1.1 Summary 22
  Section 1.1 Exercises 22

1.2 Describing Distributions with Numbers 30
  Measuring center: the mean 30
  Measuring center: the median 32
  Mean versus median 34
  Measuring spread: the quartiles 34
  The five-number summary and boxplots 36
  The 1.5 × IQR rule for suspected outliers 38
  Measuring spread: the standard deviation 40
  Properties of the standard deviation 42
  Choosing measures of center and spread 43
  Changing the unit of measurement 45
  Section 1.2 Summary 47
  Section 1.2 Exercises 48

1.3 Density Curves and Normal Distributions 53
  Density curves 55
  Measuring center and spread for density curves 56
  Normal distributions 58
  The 68–95–99.7 rule 59
  Standardizing observations 61
  Normal distribution calculations 62
  Using the standard Normal table 64
  Inverse Normal calculations 66
  Normal quantile plots 68
  Beyond the basics: density estimation 71
  Section 1.3 Summary 71
  Section 1.3 Exercises 72
  Chapter 1 Exercises 78

CHAPTER 2 Looking at Data—Relationships 83
  Introduction 83
  Examining relationships 84

2.1 Scatterplots 86
  Interpreting scatterplots 88
  Adding categorical variables to scatterplots 89
  More examples of scatterplots 90
  Beyond the basics: scatterplot smoothers 92
  Categorical explanatory variables 93
  Section 2.1 Summary 94
  Section 2.1 Exercises 95

2.2 Correlation 101
  The correlation r 102
  Properties of correlation 102
  Section 2.2 Summary 105
  Section 2.2 Exercises 105

2.3 Least-Squares Regression 108
  Fitting a line to data 110
  Prediction 111
  Least-squares regression 112
  Interpreting the regression line 115
  Correlation and regression 115
  *Understanding r² 118
  Beyond the basics: transforming relationships 119
  Section 2.3 Summary 121
  Section 2.3 Exercises 122

2.4 Cautions about Correlation and Regression 125
  Residuals 126
  Outliers and influential observations 129
  Beware the lurking variable 132
  Beware correlations based on averaged data 135
  The restricted-range problem 135
  Beyond the basics: data mining 136
  Section 2.4 Summary 137
  Section 2.4 Exercises 137

2.5 Data Analysis for Two-Way Tables 142
  The two-way table 142
  Joint distribution 144
  Marginal distributions 145
  Describing relations in two-way tables 146
  Conditional distributions 146
  Simpson’s paradox 148
  The perils of aggregation 151
  Section 2.5 Summary 151
  Section 2.5 Exercises 152

2.6 The Question of Causation 154
  Explaining association: causation 154
  Explaining association: common response 155
  Explaining association: confounding 156
  Establishing causation 157
  Section 2.6 Summary 159
  Section 2.6 Exercises 159
  Chapter 2 Exercises 161

CHAPTER 3 Producing Data 171
  Introduction 171
  Anecdotal data 171
  Available data 172
  Sample surveys and experiments 173

3.1 Design of Experiments 178
  Comparative experiments 180
  Randomization 181
  Randomized comparative experiments 183
  How to randomize 184
  Cautions about experimentation 188
  Matched pairs designs 189
  Block designs 190
  Section 3.1 Summary 191
  Section 3.1 Exercises 192

3.2 Sampling Design 197
  Simple random samples 200
  Stratified samples 202
  Multistage samples 203
  Cautions about sample surveys 204
  Section 3.2 Summary 207
  Section 3.2 Exercises 208

3.3 Toward Statistical Inference 212
  Sampling variability 213
  Sampling distributions 214
  Bias and variability 217
  Sampling from large populations 219
  Why randomize? 219
  Beyond the basics: capture-recapture sampling 220
  Section 3.3 Summary 221
  Section 3.3 Exercises 221

3.4 Ethics 224
  Institutional review boards 225
  Informed consent 226
  Confidentiality 227
  Clinical trials 228
  Behavioral and social science experiments 230
  Section 3.4 Summary 232
  Section 3.4 Exercises 232
  Chapter 3 Exercises 234

PART II Probability and Inference

CHAPTER 4 Probability: The Study of Randomness 237
  Introduction 237

4.1 Randomness 237
  The language of probability 239
  Thinking about randomness 240


  The uses of probability 241
  Section 4.1 Summary 241
  Section 4.1 Exercises 241

4.2 Probability Models 242
  Sample spaces 243
  Probability rules 245
  Assigning probabilities: finite number of outcomes 248
  Assigning probabilities: equally likely outcomes 249
  Independence and the multiplication rule 251
  Applying the probability rules 254
  Section 4.2 Summary 255
  Section 4.2 Exercises 255

4.3 Random Variables 258
  Discrete random variables 259
  Continuous random variables 263
  Normal distributions as probability distributions 265
  Section 4.3 Summary 267
  Section 4.3 Exercises 267

4.4 Means and Variances of Random Variables 270
  The mean of a random variable 270
  Statistical estimation and the law of large numbers 273
  Thinking about the law of large numbers 275
  Beyond the basics: more laws of large numbers 277
  Rules for means 277
  The variance of a random variable 279
  Rules for variances and standard deviations 281
  Section 4.4 Summary 285
  Section 4.4 Exercises 286

4.5 General Probability Rules* 289
  General addition rules 290
  Conditional probability 293
  General multiplication rules 298
  Tree diagrams 299
  Bayes’s rule 301
  Independence again 302
  Section 4.5 Summary 302
  Section 4.5 Exercises 303
  Chapter 4 Exercises 307

CHAPTER 5 Sampling Distributions 311
  Introduction 311

5.1 Sampling Distributions for Counts and Proportions 313
  The binomial distributions for sample counts 314
  Binomial distributions in statistical sampling 315
  Finding binomial probabilities: software and tables 316
  Binomial mean and standard deviation 319
  Sample proportions 321
  Normal approximation for counts and proportions 322
  The continuity correction* 326
  Binomial formula* 327
  Section 5.1 Summary 330
  Section 5.1 Exercises 331

5.2 The Sampling Distribution of a Sample Mean 335
  The mean and standard deviation of x̄ 337
  The central limit theorem 339
  A few more facts 343
  Beyond the basics: Weibull distributions 344
  Section 5.2 Summary 346
  Section 5.2 Exercises 346
  Chapter 5 Exercises 350

CHAPTER 6 Introduction to Inference 353
  Introduction 353
  Overview of inference 354

6.1 Estimating with Confidence 356
  Statistical confidence 356
  Confidence intervals 358
  Confidence interval for a population mean 360
  How confidence intervals behave 363
  Choosing the sample size 364
  Some cautions 366
  Beyond the basics: the bootstrap 368
  Section 6.1 Summary 368
  Section 6.1 Exercises 369

6.2 Tests of Significance 372
  The reasoning of significance tests 372
  Stating hypotheses 374
  Test statistics 376
  P-values 377
  Statistical significance 379
  Tests for a population mean 382
  Two-sided significance tests and confidence intervals 386
  P-values versus fixed α 388
  Section 6.2 Summary 390
  Section 6.2 Exercises 390

6.3 Use and Abuse of Tests 394
  Choosing a level of significance 395
  What statistical significance does not mean 396
  Don’t ignore lack of significance 397
  Statistical inference is not valid for all sets of data 398
  Beware of searching for significance 398
  Section 6.3 Summary 399
  Section 6.3 Exercises 399

6.4 Power and Inference as a Decision* 401
  Power 401
  Increasing the power 405
  Inference as decision* 406
  Two types of error 406
  Error probabilities 407
  The common practice of testing hypotheses 409
  Section 6.4 Summary 410
  Section 6.4 Exercises 410
  Chapter 6 Exercises 412

CHAPTER 7 Inference for Distributions 417

  Introduction 417

7.1 Inference for the Mean of a Population 418
  The t distributions 418
  The one-sample t confidence interval 420
  The one-sample t test 422
  Matched pairs t procedures 428
  Robustness of the t procedures 432
  The power of the t test* 433
  Inference for non-Normal populations* 435
  Section 7.1 Summary 440
  Section 7.1 Exercises 441

7.2 Comparing Two Means 447
  The two-sample z statistic 448
  The two-sample t procedures 450
  The two-sample t significance test 451
  The two-sample t confidence interval 454
  Robustness of the two-sample procedures 456
  Inference for small samples 457
  Software approximation for the degrees of freedom* 460
  The pooled two-sample t procedures* 461
  Section 7.2 Summary 466
  Section 7.2 Exercises 467

7.3 Optional Topics in Comparing Distributions* 473
  Inference for population spread 473
  The F test for equality of spread 474
  Robustness of Normal inference procedures 476
  The power of the two-sample t test 477
  Section 7.3 Summary 479
  Section 7.3 Exercises 479
  Chapter 7 Exercises 481

CHAPTER 8 Inference for Proportions 487

  Introduction 487

8.1 Inference for a Single Proportion 488
  Large-sample confidence interval for a single proportion 488
  Beyond the basics: the plus four confidence interval for a single proportion 491
  Significance test for a single proportion 493
  Confidence intervals provide additional information 496
  Choosing a sample size 498
  Section 8.1 Summary 501
  Section 8.1 Exercises 502

8.2 Comparing Two Proportions 505
  Large-sample confidence interval for a difference in proportions 506
  Beyond the basics: plus four confidence interval for a difference in proportions 509
  Significance test for a difference in proportions 511
  Beyond the basics: relative risk 515
  Section 8.2 Summary 516
  Section 8.2 Exercises 517
  Chapter 8 Exercises 519


PART III Topics in Inference

CHAPTER 9 Analysis of Two-Way Tables 525
  Introduction 525

9.1 Inference for Two-Way Tables 526
  The hypothesis: no association 529
  Expected cell counts 529
  The chi-square test 530
  The chi-square test and the z test 533
  Beyond the basics: meta-analysis 534
  Section 9.1 Summary 536

9.2 Formulas and Models for Two-Way Tables* 536
  Computations 536
  Computing conditional distributions 537
  Computing expected cell counts 540
  The X² statistic and its P-value 540
  Models for two-way tables 541
  Concluding remarks 544
  Section 9.2 Summary 545

9.3 Goodness of Fit* 545
  Section 9.3 Summary 548
  Chapter 9 Exercises 548

CHAPTER 10 Inference for Regression 559
  Introduction 559

10.1 Simple Linear Regression 560
  Statistical model for linear regression 560
  Data for simple linear regression 561
  Estimating the regression parameters 565
  Confidence intervals and significance tests 570
  Confidence intervals for mean response 572
  Prediction intervals 574
  Beyond the basics: nonlinear regression 576
  Section 10.1 Summary 578

10.2 More Detail about Simple Linear Regression* 579
  Analysis of variance for regression 579
  The ANOVA F test 581
  Calculations for regression inference 583
  Inference for correlation 590
  Section 10.2 Summary 593
  Chapter 10 Exercises 594

CHAPTER 11 Multiple Regression 607
  Introduction 607

11.1 Inference for Multiple Regression 607
  Population multiple regression equation 607
  Data for multiple regression 608
  Multiple linear regression model 609
  Estimation of the multiple regression parameters 610
  Confidence intervals and significance tests for regression coefficients 611
  ANOVA table for multiple regression 612
  Squared multiple correlation R² 613

11.2 A Case Study 615
  Preliminary analysis 615
  Relationships between pairs of variables 616
  Regression on high school grades 618
  Interpretation of results 619
  Residuals 620
  Refining the model 621
  Regression on SAT scores 622
  Regression using all variables 623
  Test for a collection of regression coefficients 623
  Beyond the basics: multiple logistic regression 625
  Chapter 11 Summary 627
  Chapter 11 Exercises 628

CHAPTER 12 One-Way Analysis of Variance 637

  Introduction 637

12.1 Inference for One-Way Analysis of Variance 638
  Data for one-way ANOVA 638
  Comparing means 639
  The two-sample t statistic 640
  An overview of ANOVA 641
  The ANOVA model 644
  Estimates of population parameters 646
  Testing hypotheses in one-way ANOVA 648
  The ANOVA table 649
  The F test 652

12.2 Comparing the Means 655
  Contrasts 655
  Multiple comparisons 661
  Software 665
  Power* 666
  Section 12.2 Summary 669
  Chapter 12 Exercises 670

CHAPTER 13 Two-Way Analysis of Variance 683

  Introduction 683

13.1 The Two-Way ANOVA Model 684
  Advantages of two-way ANOVA 684
  The two-way ANOVA model 688
  Main effects and interactions 689

13.2 Inference for Two-Way ANOVA 694
  The ANOVA table for two-way ANOVA 694
  Section 13.2 Summary 698
  Chapter 13 Exercises 699

Companion Chapters (on the IPS Web site www.whfreeman.com/ips6e and CD-ROM)

CHAPTER 14 Logistic Regression 14-1
  Introduction 14-1

14.1 The Logistic Regression Model 14-1
  Binomial distributions and odds 14-2
  Odds for two samples 14-3
  Model for logistic regression 14-4
  Fitting and interpreting the logistic regression model 14-6

14.2 Inference for Logistic Regression 14-8
  Confidence intervals and significance tests 14-8
  Multiple logistic regression 14-14
  Section 14.2 Summary 14-16
  Chapter 14 Exercises 14-17
  Chapter 14 Notes 14-23

CHAPTER 15 Nonparametric Tests 15-1
  Introduction 15-1

15.1 The Wilcoxon Rank Sum Test 15-3
  The rank transformation 15-4
  The Wilcoxon rank sum test 15-5
  The Normal approximation 15-7
  What hypotheses does Wilcoxon test? 15-8
  Ties 15-10
  Rank, t, and permutation tests 15-12
  Section 15.1 Summary 15-14
  Section 15.1 Exercises 15-15

15.2 The Wilcoxon Signed Rank Test 15-17
  The Normal approximation 15-20
  Ties 15-21
  Section 15.2 Summary 15-23
  Section 15.2 Exercises 15-23

15.3 The Kruskal-Wallis Test* 15-26
  Hypotheses and assumptions 15-28
  The Kruskal-Wallis test 15-28
  Section 15.3 Summary 15-30
  Section 15.3 Exercises 15-31
  Chapter 15 Exercises 15-33
  Chapter 15 Notes 15-35

CHAPTER 16 Bootstrap Methods and Permutation Tests 16-1

  Introduction 16-1
  Software 16-2

16.1 The Bootstrap Idea 16-3
  The big idea: resampling and the bootstrap distribution 16-4
  Thinking about the bootstrap idea 16-9
  Using software 16-10
  Section 16.1 Summary 16-11
  Section 16.1 Exercises 16-11

16.2 First Steps in Using the Bootstrap 16-13
  Bootstrap t confidence intervals 16-13
  Bootstrapping to compare two groups 16-17
  Beyond the basics: the bootstrap for a scatterplot smoother 16-20
  Section 16.2 Summary 16-21
  Section 16.2 Exercises 16-22

16.3 How Accurate Is a Bootstrap Distribution?* 16-24
  Bootstrapping small samples 16-26
  Bootstrapping a sample median 16-28
  Section 16.3 Summary 16-28
  Section 16.3 Exercises 16-30

16.4 Bootstrap Confidence Intervals 16-30
  Bootstrap percentile confidence intervals 16-31
  More accurate bootstrap confidence intervals: BCa and tilting 16-32
  Confidence intervals for the correlation 16-35
  Section 16.4 Summary 16-38
  Section 16.4 Exercises 16-38

16.5 Significance Testing Using Permutation Tests 16-41
  Using software 16-45
  Permutation tests in practice 16-45
  Permutation tests in other settings 16-49
  Section 16.5 Summary 16-52
  Section 16.5 Exercises 16-52
  Chapter 16 Exercises 16-56
  Chapter 16 Notes 16-59

CHAPTER 17 Statistics for Quality: Control and Capability 17-1

  Introduction 17-1
  Use of data to assess quality 17-2

17.1 Processes and Statistical Process Control 17-3
  Describing processes 17-3
  Statistical process control 17-6
  x̄ charts for process monitoring 17-8
  s charts for process monitoring 17-12
  Section 17.1 Summary 17-17
  Section 17.1 Exercises 17-17

17.2 Using Control Charts 17-21
  x̄ and R charts 17-22
  Additional out-of-control rules 17-23
  Setting up control charts 17-25
  Comments on statistical control 17-30
  Don’t confuse control with capability! 17-33
  Section 17.2 Summary 17-34
  Section 17.2 Exercises 17-35

17.3 Process Capability Indexes* 17-39
  The capability indexes Cp and Cpk 17-41
  Cautions about capability indexes 17-44
  Section 17.3 Summary 17-45
  Section 17.3 Exercises 17-46

17.4 Control Charts for Sample Proportions 17-49
  Control limits for p charts 17-50
  Section 17.4 Summary 17-54
  Section 17.4 Exercises 17-54
  Chapter 17 Exercises 17-56
  Chapter 17 Notes 17-57

Data Appendix D-1
Tables T-1
Answers to Odd-Numbered Exercises A-1
Notes and Data Sources N-1
Photo Credits C-1
Index I-1

