SOLUTIONS MANUAL
Elementary Statistics Using Excel, 7th Edition
By Mario F. Triola
Chapters 1-15

CONTENTS

Chapter 1: Introduction to Statistics
Section 1-1: Statistical and Critical Thinking
Section 1-2: Types of Data
Section 1-3: Collecting Sample Data
Section 1-4: Introduction to Excel
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 2: Exploring Data with Tables and Graphs
Section 2-1: Frequency Distributions for Organizing and Summarizing Data
Section 2-2: Histograms
Section 2-3: Graphs That Enlighten and Graphs That Deceive
Section 2-4: Scatterplots, Correlation, and Regression
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 3: Describing, Exploring, and Comparing Data
Section 3-1: Measures of Center
Section 3-2: Measures of Variation
Section 3-3: Measures of Relative Standing and Boxplots
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 4: Probability
Section 4-1: Basic Concepts of Probability
Section 4-2: Addition Rule and Multiplication Rule
Section 4-3: Complements, Conditional Probability, and Bayes’ Theorem
Section 4-4: Counting
Section 4-5: Simulations for Hypotheses Tests
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 5: Discrete Probability Distributions
Section 5-1: Probability Distributions
Section 5-2: Binomial Probability Distributions
Section 5-3: Poisson Probability Distributions
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 6: Normal Probability Distributions
Section 6-1: The Standard Normal Distribution
Section 6-2: Real Applications of Normal Distributions
Section 6-3: Sampling Distributions and Estimators
Section 6-4: The Central Limit Theorem
Section 6-5: Assessing Normality
Section 6-6: Normal as Approximation to Binomial (Download Only)
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 7: Estimating Parameters and Determining Sample Sizes
Section 7-1: Estimating a Population Proportion
Section 7-2: Estimating a Population Mean
Section 7-3: Estimating a Population Standard Deviation or Variance
Section 7-4: Bootstrapping: Using Excel for Estimates
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 8: Hypothesis Testing
Section 8-1: Basics of Hypothesis Testing
Section 8-2: Testing a Claim About a Proportion
Section 8-3: Testing a Claim About a Mean
Section 8-4: Testing a Claim About a Standard Deviation or Variance
Section 8-5: Resampling: Using Excel for Hypothesis Testing
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 9: Inferences from Two Samples
Section 9-1: Two Proportions
Section 9-2: Two Means: Independent Samples
Section 9-3: Matched Pairs
Section 9-4: Two Variances or Standard Deviations
Section 9-5: Resampling: Using Technology for Inferences
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 10: Correlation and Regression
Section 10-1: Correlation
Section 10-2: Regression
Section 10-3: Prediction Intervals and Variation
Section 10-4: Multiple Regression
Section 10-5: Nonlinear Regression
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 11: Goodness-of-Fit and Contingency Tables
Section 11-1: Goodness-of-Fit
Section 11-2: Contingency Tables
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 12: Analysis of Variance
Section 12-1: One-Way ANOVA
Section 12-2: Two-Way ANOVA
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 13: Nonparametric Tests
Section 13-2: Sign Test
Section 13-3: Wilcoxon Signed-Ranks Test for Matched Pairs
Section 13-4: Wilcoxon Rank-Sum Test for Two Independent Samples
Section 13-5: Kruskal-Wallis Test for Three or More Samples
Section 13-6: Rank Correlation
Section 13-7: Runs Test for Randomness
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 14: Statistical Process Control
Section 14-1: Control Charts for Variation and Mean
Section 14-2: Control Charts for Attributes
Quick Quiz
Review Exercises
Cumulative Review Exercises

Chapter 15: Holistic Statistics
Holistic Statistics Exercises
Chapter 1: Introduction to Statistics

Section 1-1: Statistical and Critical Thinking
1. The respondents are a voluntary response sample or a self-selected sample. Because those with strong interests in the topic are more likely to respond, it is very possible that their responses do not reflect the opinions or behavior of the general population.
2. a. The sample consists of the 1046 adults who were surveyed. The population consists of all adults.
   b. When asked, respondents might be inclined to avoid the shame of the unhealthy habit of not washing their hands, so the reported rate of 70% might well be much higher than it is in reality. It is generally better to observe or measure human behavior than to ask subjects about it.
3. Statistical significance is indicated when methods of statistics are used to reach a conclusion that a treatment is effective, but common sense might suggest that the treatment does not make enough of a difference to justify its use or to be practical. Yes, it is possible for a study to have statistical significance, but not practical significance.
4. No. Correlation does not imply causation. The example illustrates a correlation that is clearly not the result of any interaction or cause-and-effect relationship between per capita consumption of margarine and the divorce rate in Maine.
5. Yes, there does appear to be a potential to create a bias.
6. No, there does not appear to be a potential to create a bias.
7. No, there does not appear to be a potential to create a bias.
8. Yes, there does appear to be a potential to create a bias.
9. The sample is a voluntary response sample and has strong potential to be flawed.
10. The samples are voluntary response samples and have potential for being flawed, but this approach might be necessary due to ethical considerations involved in randomly selecting subjects and somehow imposing treatments on them.
11. The sampling method appears to be sound.
12. The sampling method appears to be sound.
13. The Ornish weight loss program has statistical significance, because the results are so unlikely (3 chances in 1000) to occur by chance. It does not have practical significance because the amount of lost weight (3.3 lb) is so small.
14. Because there is only one chance in a thousand of getting such success rates by chance, the difference does appear to have statistical significance. The 92% success rate for surgery appears to be substantially better than the 72% success rate for splints, so the difference does appear to have practical significance.
15. The difference between Mendel’s 25% rate and the result of 26% is not statistically significant. According to Mendel’s theory, 145 of the 580 peas would have yellow pods, but the results consisted of 152 peas with yellow pods. The difference of 7 peas with yellow pods among the 580 offspring does not appear to be statistically significant. The difference does not appear to have practical significance.
16. Because there is a 25% chance of getting such results with a program that has no effect, the program does not appear to have statistical significance. Because the average increase is only 3 IQ points, the program does not appear to have practical significance.
17. With 40 out of 41 ballots having the Democrat first, it appears that the result is statistically significant. Because of the great advantage enjoyed by Democrats, the results also have practical significance.
18. Because it is so unlikely (0.3%) to get these results by chance, the results have statistical significance. With about 57% (from 235/414) of the coin toss winners going on to win the game, the result appears to have practical significance.
19. There appears to be statistical significance given the large discrepancy between 79.1% and 39%. Because the results are so far from yielding a jury of peers, it appears that the results have practical significance.
20. With only a 0.0000006% chance of getting such results, it appears that the results are statistically significant. The discrepancy between the 61% rate for voters who actually did vote and the 70% rate of those who said that they voted is a fairly large discrepancy, and the results appear to have practical significance.
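The counts and percentages quoted in Exercises 15 and 18 above follow from simple ratios. The sketch below is only an illustrative cross-check in Python and is not part of the original solutions; every input number is taken from the answers above, and the variable names are hypothetical.

```python
# Cross-check of the arithmetic quoted in Section 1-1, Exercises 15 and 18.
# Input numbers come from the solutions text; variable names are illustrative.

# Exercise 15: Mendel's theory predicts that 25% of the 580 peas have yellow pods.
peas_total = 580
expected_yellow = 0.25 * peas_total           # expected count under the 25% rate
observed_yellow = 152
observed_rate = observed_yellow / peas_total  # observed proportion of yellow pods
difference = observed_yellow - expected_yellow

# Exercise 18: proportion of coin toss winners who went on to win the game.
toss_winner_rate = 235 / 414

print(f"Expected yellow pods: {expected_yellow:.0f}")
print(f"Observed rate: {observed_rate:.0%}")
print(f"Difference in peas: {difference:.0f}")
print(f"Toss winners who won: {toss_winner_rate:.0%}")
```

The check confirms only the arithmetic. Whether a difference is statistically or practically significant is a separate judgment, as the answers above explain.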
Copyright © 2022 Pearson Education, Inc.
21. Yes. Each column of 8 AM and 12 AM temperatures is recorded from the same subject, so each pair is matched.
22. No. The source is from university researchers who do not appear to gain from distorting the data.
23. The data can be used to address the issue of whether there is a correlation between body temperatures at 8 AM and at 12 AM. Also, the data can be used to determine whether there are differences between body temperatures at 8 AM and at 12 AM.
24. Because the differences could easily occur by chance (with a 64% chance), the differences do not appear to have statistical significance.
25. No. The lemon imports are weights in metric tons and the crash fatality rates are fatalities per 100,000 population, so their differences are meaningless.
26. The issue that can be addressed is whether there is a correlation, or association, between lemon imports and crash fatality rates.
27. No. The author of an article for the Journal of Chemical Information and Modeling has no reason to collect or present the data in a way that is biased.
28. No. Correlation does not imply causation, so a statistical correlation between lemon imports and crash fatality rates should not be used to conclude that lemon imports are the cause of fatal crashes.
29. It is questionable that the sponsor is the Idaho Potato Commission and the favorite vegetable is potatoes.
30. The sample is a voluntary response sample, so there is a good chance that the results do not reflect the larger population of people who have a water preference.
31. The correlation, or association, between two variables does not mean that one of the variables is the cause of the other. Correlation does not imply causation. Clearly, sour cream consumption is not directly related in any way to motorcycle fatalities.
32. The sponsor of the poll is an electronic cigarette maker, so the sponsor does have an interest in the poll results. The source is questionable.
33. The correlation, or association, between two variables does not mean that one of the variables is the cause of the other. Correlation does not imply causation.
34. The correlation, or association, between two variables does not mean that one of the variables is the cause of the other. Correlation does not imply causation.
35. The sample is a voluntary response sample, so there is a good chance that the results do not accurately reflect the larger population.
36. Because the nutritionists are paid such large amounts of money, they might be more inclined to find favorable results. It is very possible that the results represent desired outcomes instead of actual outcomes.
37. a. 700 adults
    b. 55%
38. a. 253.31 subjects
    b. No. Because the result is a count of people among the 347 who were surveyed, the result must be a whole number.
    c. 253 subjects
    d. 32%
39. a. 559.2 respondents
    b. No. Because the result is a count of respondents among the 1165 engaged or married women who were surveyed, the result must be a whole number.
    c. 559 respondents
    d. 8%
40. a. 847.56 drivers
    b. No. Because the result is a count of respondents saying that they text while driving, the result must be a whole number.
    c. 848 drivers
    d. Given that texting while driving is extremely dangerous, the result of 42% of drivers who text while driving is far too high. The result suggests that steps should be taken to substantially lower that rate.
41. Because a reduction of 100% would eliminate all of the size, it is not possible to reduce the size by 100% or more.
42. In an editorial criticizing the statement, the New York Times correctly interpreted the 100% improvement to mean that no baggage is being lost, which was not true.
43. Because a reduction of 100% would eliminate all plaque, it is not possible to reduce it by more than 100%.
44. Because a reduction of 100% would eliminate all car thefts, it is not possible to reduce it by more than 100%.
45. If one subgroup receives a 4% raise and another subgroup receives a 4% raise, the combined group will receive a 4% raise, not an 8% raise. The percentages should not be added in this case.
46. The wording of the question is biased and tends to encourage negative responses. The sample size of 20 is too small. Survey respondents are self-selected instead of being randomly selected by the newspaper. If 20 readers respond, the percentages should be multiples of 5, so 87% and 13% are not possible results.
47. All percentages of success should be multiples of 5. The given percentages cannot be correct.

Section 1-2: Types of Data
1. The population consists of all adults in the United States, and the sample is the 1001 adults who were surveyed. Because the value of 69% refers to the sample, it is a statistic.
2. a. quantitative
   b. categorical
   c. categorical
   d. quantitative
3. Only part (b) describes discrete data.
4. a. The sample is the 36,000 adults who were surveyed. The population is all adults in the United States.
   b. statistic
   c. ratio
   d. discrete
5. statistic
6. parameter
7. parameter
8. statistic
9. statistic
10. statistic
11. parameter
12. parameter
13. continuous
14. continuous
15. discrete
16. discrete
17. continuous
18. discrete
19. discrete
20. continuous
21. nominal
22. ordinal
23. ordinal
24. ratio
25. interval
26. nominal
27. ratio
28. interval
29. The numbers are not counts or measures of anything. They are at the nominal level of measurement, and it makes no sense to compute the average (mean) of them.
30. The digits are not counts or measures of anything. They are at the nominal level of measurement, and it makes no sense to calculate their average (mean).
31. The temperatures are at the interval level of measurement. Because there is no natural starting point with 0°F representing “no heat,” ratios such as “twice” make no sense, so it is wrong to say that it is twice as warm in Paris as it is in Anchorage.
32. The ranks are at the ordinal level of measurement. Differences between the universities cannot be determined, so there is no way to know whether the difference between Harvard and MIT is the same as the difference between Stanford and the University of California at Berkeley.
33. a. Continuous, because the number of possible values is infinite and not countable.
    b. Discrete, because the number of possible values is finite.
    c. Discrete, because the number of possible values is finite.
    d. Discrete, because the number of possible values is infinite and countable.
34. Interval level of measurement. The direction of north represented by 0° is arbitrary, and 0° does not represent “no direction.” Differences between degrees are meaningful; the difference between 30° and 60° is the same as the difference between 150° and 180°. But ratios are not meaningful; the ratio of 60° to 30° does not result in twice some direction. (These degree measurements are directions, not amounts of rotation.)

Section 1-3: Collecting Sample Data
1. The study is an experiment because subjects were given treatments.
2. The subjects in the study did not know whether they were given the magnet treatment or the sham treatment, and those who administered the treatments also did not know.
3. The group sample sizes are large enough so that the researchers could see the effects of the two treatments, but it would have been better to have larger samples.
4. The sample appears to be a convenience sample. Given that the subjects were all patients at a Veterans Affairs hospital, it is not likely that the sample is representative of the population, so it is questionable whether the results can be generalized for the population of subjects with chronic low back pain.
5. The sample appears to be a convenience sample. By e-mailing the survey to a readily available group of Internet users, it was easy to obtain results. Although there is a real potential for getting a sample group that is not representative of the population, indications of which ear is used for cell phone calls and which hand is dominant do not appear to be factors that would be distorted much by a sample bias.
6. The study is an observational study because the subjects were not given any treatment.
7. With 717 responses, the response rate is 14%, which does appear to be quite low. In general, a very low response rate creates a serious potential for getting a biased sample that consists of those with a special interest in the topic.
8. Answers vary, but the following are good possibilities.
   a. Obtain a printed copy of the class roster, assign consecutive numbers (integers), then use a computer to randomly generate six of those numbers.
   b. Select every third student leaving class until six students are chosen.
   c. Randomly select three males and three females.
   d. Randomly select a row, and then select the students in that row. (Use only the first six to meet the requirement of a sample of size six.)
   e. Select the first six students who enter the class.
9. systematic
10. convenience
11. random
12. stratified
13. cluster
14. random
15. stratified
16. systematic
17. random
18. cluster
19. convenience
20. systematic
21. Observational study. The sample is a convenience sample consisting of subjects who decided themselves to respond. Such voluntary response samples have a high chance of not being representative of the larger population, so the sample may well be biased. The question was posted in an electronic edition of a newspaper, so the sample is biased from the beginning.
22. Experiment. The sample subjects consist of male physicians only. It would have been better to include females. Also, it would be better to include males and females who are not physicians.
23. Experiment. This experiment would create an extremely dangerous and illegal situation that has a real potential to result in injury or death. It’s difficult enough to drive in New York City while being completely sober.
24. Observational study. The sample of only three students is too small.
25. Experiment. The biased sample created by using a small sample of college students cannot be fixed by using a larger sample. The larger sample will still be a biased sample that is not representative of the population of all adults.
26. Experiment. Calling the subjects and asking them to report their weights has a high risk of getting results that do not reflect the actual weights. It would have been much better to somehow measure the weights instead of asking the subjects to report them.
27. Observational study. Respondents who have been convicted of felonies are not likely to respond honestly to the second question. The survey will suffer from a “social desirability bias” because subjects will tend to respond in ways that will be viewed favorably by those conducting the survey.
28. Observational study. The number of responses is very small, and the response rate of only 1.52% is far too small. With such a low response rate, there is a real possibility that the sample of respondents is biased and consists only of those with special interests in the survey topic.
29. prospective study
30. retrospective study
31. cross-sectional study
32. prospective study
33. matched pairs design
34. randomized block design
35. completely randomized design
36. matched pairs design
37. a. Not a simple random sample, but it is a random sample.
    b. Simple random sample and also a random sample.
    c. Not a simple random sample and not a random sample.

Section 1-4: Introduction to Excel
1. A spreadsheet is a collection of data organized in an array of cells arranged in rows and columns, and it is used to summarize, analyze, and perform calculations with the data.
2. worksheet
3. An Excel file that contains worksheets.
4. A single spreadsheet, which is a collection of data organized in an array of cells arranged in rows and columns.
5. f. The top half of the worksheet is deleted, then restored.
6. f. The top half of the worksheet is deleted, then restored.
7. f. The top half of the worksheet is deleted, then restored.
8. f. The top half of the worksheet is deleted, then restored.
9. The described worksheet will be printed.
10. The described worksheet will be printed.
11. The described worksheet will be printed.
12. The described worksheet will be printed.
13. Click the File tab, then select Save As.
14. Mean: 179.32; range: 288; minimum: 64; maximum: 352; sum: 8966; count: 50. The mean (“average”) is 179.32 seconds. The difference between the longest and shortest service times is 288 seconds. The shortest service time is 64 seconds. The longest service time is 352 seconds. The sum of the service times is 8966 seconds. The “count” of 50 indicates that 50 service times are included in this sample.

    Excel Data Analysis Output
    Mean                  179.32
    Standard Error        8.900834
    Median                172
    Mode                  121
    Standard Deviation    62.9384
    Sample Variance       3961.242
    Kurtosis              0.912738
    Skewness              0.954476
    Range                 288
    Minimum               64
    Maximum               352
    Sum                   8966
    Count                 50

Quick Quiz
1. No. The numbers do not measure or count anything.
2. nominal
3. continuous
4. quantitative data
5. ratio
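Several entries in the Excel Data Analysis output for Section 1-4, Exercise 14 are tied to one another, which offers a quick sanity check of the table. The sketch below is an illustrative Python cross-check, not part of the original solutions. It assumes the usual definitions (mean = sum / count, range = maximum - minimum, standard error = standard deviation / sqrt(count), sample variance = standard deviation squared), and the dictionary name `excel_output` is hypothetical.

```python
import math

# Values copied from the Excel Data Analysis output in Section 1-4, Exercise 14.
# The dictionary name and the tolerances below are illustrative assumptions.
excel_output = {
    "Mean": 179.32,
    "Standard Error": 8.900834,
    "Standard Deviation": 62.9384,
    "Sample Variance": 3961.242,
    "Range": 288,
    "Minimum": 64,
    "Maximum": 352,
    "Sum": 8966,
    "Count": 50,
}

n = excel_output["Count"]
mean_check = excel_output["Sum"] / n                           # sum / count
range_check = excel_output["Maximum"] - excel_output["Minimum"]  # max - min
se_check = excel_output["Standard Deviation"] / math.sqrt(n)   # s / sqrt(n)
var_check = excel_output["Standard Deviation"] ** 2            # s squared

# Compare each recomputed value with the reported one; small tolerances
# allow for the rounding of the displayed Excel values.
checks = {
    "Mean": math.isclose(mean_check, excel_output["Mean"], abs_tol=1e-9),
    "Range": range_check == excel_output["Range"],
    "Standard Error": math.isclose(se_check, excel_output["Standard Error"], abs_tol=1e-4),
    "Sample Variance": math.isclose(var_check, excel_output["Sample Variance"], abs_tol=1e-2),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'inconsistent'}")
```

Median, mode, kurtosis, and skewness cannot be checked this way because they depend on the individual service times, which are not listed here.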
6. statistic
7. no
8. observational study
9. The subjects did not know whether they were getting aspirin or the placebo.
10. simple random sample

Review Exercises
1. The respondents are a voluntary response sample or a self-selected sample. Because those with strong interests in the topic are more likely to respond, it is very possible that their responses do not reflect the opinions or behavior of the general population.
2. a. The sample is a voluntary response sample, so the results are questionable.
   b. statistic
   c. observational study
3. Randomized: Subjects were assigned to the different groups through a process of random selection, whereby they had the same chance of belonging to each group. Double-blind: The subjects did not know which of the three groups they were in, and the people who evaluated results did not know either.
4. No. Correlation does not imply causality.
5. a. systematic
   b. stratified
   c. simple random sample
   d. convenience
   e. cluster
6. Yes. The two questions give the false impression that they are addressing very different issues. Most people would be in favor of defending marriage, so the first question is likely to receive a substantial number of “yes” responses. The second question better describes the issue and subjects are much more likely to have varied responses.
7. a. discrete
   b. ratio
   c. The mailed responses would be a voluntary response sample, so those with strong opinions or greater interest in the topics are more likely to respond. It is very possible that the results do not reflect the true opinions of the population of all state residents.
   d. stratified
   e. cluster
8. a. If they have no fat at all, they have 100% less than any other amount with fat, so the 125% figure cannot be correct.
   b. 686
   c. 28%
9. a. interval data; systematic sample
   b. nominal data; stratified sample
   c. ordinal data; convenience sample
10. Because there is less than a 1% chance of getting the results by chance, the method does appear to have statistical significance. The result of 239 boys in 291 births is a rate of 82%, so it is above the 50% rate expected by chance, and it does appear to be high enough to have practical significance. The procedure appears to have both statistical significance and practical significance.

Cumulative Review Exercises
1. The mean is $\frac{135+149+145+129+118+119+115+133+107+188+127+131}{12} = 133.0$. The IQ score of 188 appears to be substantially higher than the other IQ scores.
   fx | =(135+149+145+129+118+119+115+133+107+188+127+131)/12
2. $0.5^{13} = 0.000122$
   fx | =0.5^13
3. $\frac{203-176}{6} = 4.50$, which is an unusually high value.
   fx | =(203-176)/6
Copyright © 2022 Pearson Education, Inc.
4. \(\frac{98.2 - 98.6}{0.62/\sqrt{106}} = -6.64\);
fx | =(98.2-98.6)/(0.62/SQRT(106))
5. \(\frac{1.95996^2 \cdot 0.25}{0.03^2} = 1068\) (rounded up);
fx | =1.95996^2*0.25/0.03^2
6. \(\frac{188 - 107}{4} = 20.25\);
fx | =(188-107)/4
7. \(\frac{(135 - 133.0)^2}{11} = 0.364\);
fx | =(135-133.0)^2/11
8. \(\sqrt{\frac{(98.4 - 98.6)^2 + (98.6 - 98.6)^2 + (98.8 - 98.6)^2}{3 - 1}} = \sqrt{0.04} = 0.20\)
fx | =SQRT(((98.4-98.6)^2+(98.6-98.6)^2+(98.8-98.6)^2)/(3-1))
9. \(0.3^6 = 0.000729\);
fx | =0.3^6
10. \(8^{12} = 68{,}719{,}476{,}736\) (or about 68,719,477,000);
fx | =8^12
11. \(85^6 = 377{,}149{,}515{,}625\) (or about 377,149,520,000);
fx | =85^6
12. \(0.2^{12} = 0.000000004096\);
fx | =0.2^12
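The Excel formulas above can also be checked outside a spreadsheet. The following is a minimal Python sketch of Exercises 1, 4, 5, and 8; the variable names are illustrative assumptions and do not come from the textbook, and the rounding up in Exercise 5 is assumed from the stated answer of 1068.

```python
import math

# Exercise 1: mean of the 12 IQ scores listed in the solution.
iq_scores = [135, 149, 145, 129, 118, 119, 115, 133, 107, 188, 127, 131]
mean_iq = sum(iq_scores) / len(iq_scores)

# Exercise 4: same arithmetic as =(98.2-98.6)/(0.62/SQRT(106)).
score = (98.2 - 98.6) / (0.62 / math.sqrt(106))

# Exercise 5: same arithmetic as =1.95996^2*0.25/0.03^2,
# rounded up to the next whole number (assumed from the answer 1068).
n_required = math.ceil(1.95996**2 * 0.25 / 0.03**2)

# Exercise 8: square root of the sum of squared deviations divided by (n - 1).
temps = [98.4, 98.6, 98.8]
center = 98.6
s = math.sqrt(sum((t - center) ** 2 for t in temps) / (len(temps) - 1))

print(round(mean_iq, 1), round(score, 2), n_required, round(s, 2))
```

Python writes exponentiation as `**` where Excel uses `^`, and `math.sqrt` plays the role of `SQRT`.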
Chapter 2: Exploring Data with Tables and Graphs
Section 2-1: Frequency Distributions for Organizing and Summarizing Data
1. The table summarizes 1000 commute times. It is not possible to identify the exact values of all of the original times.
2. The classes of 0–30, 30–60, …, 120–150 overlap, so it is not always clear which class we should put a value in. For example, the commute time value of 30 minutes could go in the first class or the second class. The classes should be mutually exclusive, meaning that there is no overlap.
3.
Daily Commute Time in Boston (minutes) 0–29 30–59 60–89 90–119 120–149
Relative Frequency 46.8% 42.2% 9.2% 1.0% 0.8%
4. The sum of the relative frequencies is 125%, but it should be 100%, apart from a small round-off error. All of the relative frequencies appear to be roughly the same, but if they are from a normal distribution, they should start low, reach a maximum, and then decrease.
5. Class width: 10
Class midpoints: 24.5, 34.5, 44.5, 54.5, 64.5, 74.5, 84.5
Class boundaries: 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5, 89.5
Number: 91
6. Class width: 10
Class midpoints: 24.5, 34.5, 44.5, 54.5, 64.5, 74.5
Class boundaries: 19.5, 29.5, 39.5, 49.5, 59.5, 69.5, 79.5
Number: 91
7. Class width: 100
Class midpoints: 49.5, 149.5, 249.5, 349.5, 449.5, 549.5, 649.5
Class boundaries: –0.5, 99.5, 199.5, 299.5, 399.5, 499.5, 599.5, 699.5
Number: 153
8. Class width: 100
Class midpoints: 149.5, 249.5, 349.5, 449.5, 549.5
Class boundaries: 99.5, 199.5, 299.5, 399.5, 499.5, 599.5
Number: 147
9. No. The maximum frequency is in the second class instead of being near the middle, so the frequencies below the maximum do not mirror those above the maximum.
10. Yes. The frequencies start low, reach a maximum of 38, and then decrease. The values below the maximum are very roughly a mirror image of those above it.
11. Yes. Except for the single value that lies between 600 and 699, the frequencies start low, reach a maximum of 90, and then decrease. The values below the maximum are very roughly a mirror image of those above it. (That single value between 600 and 699 is an outlier that makes the determination of a normal distribution somewhat questionable, but using a loose interpretation of the criteria for normality, it is reasonable to conclude that the distribution is normal.)
12. Yes. Except for two values that lie between 500 and 599, there is a low frequency of 25, then a maximum frequency of 92, and then a low frequency of 28. The values below and above the maximum are roughly a mirror image. (Those two values between 500 and 599 are outliers that make the determination of a normal distribution somewhat questionable, but using a loose interpretation of the criteria for normality, it is reasonable to conclude that the distribution is normal.)
13. The data amounts do not appear to have a normal distribution. The distribution does not appear to be symmetric because the frequencies preceding the maximum frequency of 16 are far outweighed by the frequencies following the maximum.
Daily Commute Time in Chicago (minutes) 0–14 15–29 30–44 45–59 60–74 75–89
Frequency 5 16 14 9 5 1
14. The ages do appear to have a normal distribution. Age (years) 40–44 45–49 50–54 55–59 60–64 65–69 70–74
Frequency 2 7 10 10 6 3 1
15.
Duration (sec) 125–149 150–174 175–199 200–224 225–249 250–274
Frequency 1 0 0 3 34 12
16. The intensities do not appear to have a normal distribution. Tornado F-Scale 0 1 2 3 4
Frequency 24 16 2 2 1
17.
Burger King Lunch Service Times (sec) 70–109 110–149 150–189 190–229 230–269
Frequency 11 23 7 6 3
18.
Burger King Dinner Service Times (sec) 30–69 70–109 110–149 150–189 190–229 230–269 270–309
Frequency 1 6 26 7 3 6 1
19. The distribution does appear to be a normal distribution. Weight (kg) 40–49 50–59 60–69 70–79 80–89 90–99
Frequency 2 22 23 13 3 4
20. The distribution does appear to be a normal distribution. Weight (g) 4.300–4.399 4.400–4.499 4.500–4.599 4.600–4.699 4.700–4.799
Frequency 11 17 28 16 3
21. Because there are disproportionately more 0s and 5s, it appears that the heights were reported instead of measured. Consequently, it is likely that the results are not very accurate. Last Digit 0 1 2 3 4 5 6 7 8 9
Frequency 9 2 1 3 1 15 2 0 3 1
22. Because there are disproportionately more 0s and 5s, it appears that the heights were reported instead of measured. There does appear to be a gap due to the tendency of respondents to round their heights to values ending in 0 or 5. Because the results appear to be reported instead of measured, it is likely that the results are not very accurate.
Last Digit 0 1 2 3 4 5 6 7 8 9
Frequency 26 1 1 2 2 12 1 0 4 1
23. The actresses appear to be generally younger than the actors. Age When Oscar Was Won 20–29 30–39 40–49 50–59 60–69 70–79 80–89
Actresses 34.1% 37.4% 16.5% 3.3% 6.6% 1.1% 1.1%
Actors 1.1% 31.9% 41.8% 17.6% 6.6% 1.1% 0.0%
24. There do appear to be differences, but overall they are not very substantial differences. Blood Platelet Count 0–99 100–199 200–299 300–399 400–499 500–599 600–699
Males 0.7% 33.3% 58.8% 6.5% 0.0% 0.0% 0.7%
Females 0.0% 17.0% 62.6% 19.0% 0.0% 1.4% 0.0%
25.
Age (years) of Best Actress When Oscar Was Won Less than 30 Less than 40 Less than 50 Less than 60 Less than 70 Less than 80 Less than 90
Cumulative Frequency 31 65 80 83 89 90 91
26.
Age (years) of Best Actor When Oscar Was Won Less than 30 Less than 40 Less than 50 Less than 60 Less than 70 Less than 80
Cumulative Frequency 1 30 68 84 90 91
27. No. The United States has 37.1% of the cost of piracy for only the five countries listed, not the total cost of piracy for all countries. Because only the top five costs of piracy are listed, we know only that any other country must have a cost less than $1.9 billion. Country United States China India France United Kingdom
Relative Frequency 37.1% 35.5% 11.0% 8.6% 7.8%
28. Yes, it appears that births occur on the days of the week with frequencies that are about the same. Day Monday Tuesday Wednesday Thursday Friday Saturday Sunday
Relative Frequency 13.0% 16.5% 18.0% 14.3% 14.3% 10.8% 13.3%
29. It is very similar to Table 2-2. Both frequency distributions begin with a low frequency in the first class, followed by the maximum frequency in the second class, and the frequencies are generally lower as you progress from top to bottom in the table. (TI data: Frequencies are 79, 148, 157, 43, 49, 6, 9, 0, 0, 9.) Daily Commute Time in Los Angeles (minutes) 0–14 15–29 30–44 45–59 60–74 75–89 90–104 105–119 120–134 135–149
Frequency 157 324 282 103 98 7 18 0 0 11
30. It is very similar to the frequency distribution for Exercise 1. Both frequency distributions begin with the maximum frequency, and then the frequencies decrease moving from top to bottom in the table. (TI data: Frequencies are 289, 154, 43, 11, 0, 3.)
Daily Commute Time in Dallas (minutes) 0–29 30–59 60–89 90–119 120–149 150–179
Frequency 547 339 92 15 0 7
31. Yes, the frequency distribution appears to be a normal distribution. Systolic Blood Pressure (mm Hg) 80–99 100–119 120–139 140–159 160–179 180–199
Frequency 11 116 131 34 7 1
32. Yes, the frequency distribution appears to be a normal distribution. Diastolic Blood Pressure (mm Hg) 40–54 55–69 70–84 85–99 100–114
Frequency 27 107 133 31 2
33. Yes, the frequency distribution appears to be a normal distribution. Magnitude 1.00–1.49 1.50–1.99 2.00–2.49 2.50–2.99 3.00–3.49 3.50–3.99 4.00–4.49 4.50–4.99
Frequency 19 97 187 147 100 38 8 4
34. No, the frequency distribution does not appear to be a normal distribution. Depth (km) 0.0–9.9 10.0–19.9 20.0–29.9 30.0–39.9 40.0–49.9
Frequency 539 49 10 1 1
35. An outlier can dramatically increase the number of classes.
Weight (lb) 200–219 220–239 240–259 260–279 280–299 300–319 320–339 340–359 360–379 380–399 400–419 420–439 440–459 460–479 480–499 500–519
With Outlier 6 5 12 36 87 28 0 0 0 0 0 0 0 0 0 1
Without Outlier 6 5 12 36 87 28
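The class width, midpoints, and boundaries reported in Exercises 5 through 8 follow mechanically from the class limits. The sketch below shows one way to compute them in Python. The class limits 20–29, 30–39, …, 80–89 are an assumption inferred from the midpoints and boundaries given for Exercise 5, and the helper name `class_summary` is hypothetical. The last lines turn the Chicago frequencies from Exercise 13 into relative frequencies.

```python
def class_summary(lower_limits, upper_limits):
    """Return (class width, class midpoints, class boundaries)."""
    # Class width: difference between consecutive lower class limits.
    width = lower_limits[1] - lower_limits[0]
    # Midpoint: average of the lower and upper limit of each class.
    midpoints = [(lo + hi) / 2 for lo, hi in zip(lower_limits, upper_limits)]
    # Boundaries split the gap between one class's upper limit
    # and the next class's lower limit.
    gap = lower_limits[1] - upper_limits[0]
    boundaries = [lower_limits[0] - gap / 2] + [hi + gap / 2 for hi in upper_limits]
    return width, midpoints, boundaries

# Assumed class limits for Exercise 5: 20-29, 30-39, ..., 80-89.
lower = list(range(20, 90, 10))
upper = [lo + 9 for lo in lower]
width, midpoints, boundaries = class_summary(lower, upper)

# Relative frequencies (percent) for the Chicago commute times in Exercise 13.
chicago = [5, 16, 14, 9, 5, 1]
relative = [round(100 * f / sum(chicago), 1) for f in chicago]

print(width, midpoints, boundaries, relative)
```

Relative frequencies computed this way should sum to 100% up to rounding, which is the check applied in Exercise 4.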
Section 2-2: Histograms
1. The histogram should be bell-shaped.
2. Not necessarily. Because the sample subjects themselves chose to be included, the voluntary response sample might not be representative of the population.
3. With a data set that is so small, the true nature of the distribution cannot be seen with a histogram.
4. The outlier will result in a single bar that is far away from all of the other bars in the histogram, and the height of that bar will correspond to a frequency of 1.
5. 40
6. Approximate values: Class width: 0.1 gram, lower limit of first class: 5.5 grams, upper limit of first class: 5.6 grams
7. The shape of the graph would not change. The vertical scale would be different, but the relative heights of the bars would be the same.
8. 40 of the quarters are “pre-1964” made with 90% silver and 10% copper, and the other 40 quarters are “post-1964” made with a copper-nickel alloy. The histogram depicts weights from two different populations of quarters.
9. Because the data are skewed to the right, the histogram does not appear to depict data from a population with a normal distribution.
10. The histogram does appear to be the graph of data from a population with a normal distribution.
11. Because it is far from being bell-shaped, the histogram does not appear to depict data from a population with a normal distribution.
12. The histogram appears to be skewed to the right (or positively skewed).
13. Using a loose interpretation of “normal,” the histogram appears to be approximately normal. The histogram does show skewness to the right.
14. The histogram isn’t very close to being bell-shaped, so it does not appear to depict data from a population with a normal distribution. The departure from a normal distribution is not too substantial.
15. Because the histogram is roughly bell-shaped, it does appear to depict a normal distribution.
16. Because the histogram isn’t close enough to being bell-shaped, it does not appear to depict data from a population with a normal distribution.
17. The digits 0 and 5 appear to occur more often than the other digits, so it appears that the heights were reported and not actually measured. This suggests that the data might not be very useful.
18. The digits 0 and 5 appear to occur more often than the other digits, so it appears that the weights were reported and not actually measured. This suggests that the data might not be very useful.
19. Only part (c) appears to represent data from a normal distribution. Part (a) has a systematic pattern that is not that of a straight line, part (b) has points that are not close to a straight-line pattern, and part (d) is really bad because it shows a systematic pattern and points that are not close to a straight-line pattern.
20. The histogram for the movie lengths suggests that the data have a distribution that is approximately normal with an outlier of 120 minutes. The histogram for the time of tobacco use appears to be similar to the histogram for the times of alcohol use, but both distributions appear to be skewed to the right, with each histogram having an outlier.
Section 2-3: Graphs That Enlighten and Graphs That Deceive
1. The data set is too small for a dotplot to reveal important characteristics of the data. Because the data are listed in order for each of the last several years, a time-series graph would be most effective for these data.
2. Yes, the original data values can be found from the stemplot.
5 | 4
6 | 8
7 | 0369
8 | 123
9 | 8
3. No. Graphs should be constructed in a way that is fair and objective. The readers should be allowed to make their own judgments, instead of being manipulated by misleading graphs.
4. No. If the sample is a bad sample, such as one obtained from voluntary responses, there are no graphs or statistical methods that can be used to salvage the data.
5. The pulse rate of 36 beats per minute appears to be an outlier.
6. There do not appear to be any outliers.
7. The data are arranged in order from lowest to highest, as 36, 56, 56, and so on.
3 | 6
4 |
5 | 668
6 | 044666
7 | 6888
8 | 02468
9 | 4
8. The two values closest to the middle are 72 mm Hg and 74 mm Hg.
6 | 0022468
7 | 0000246688
8 | 22468
9 | 00
9. There is a gradual upward trend that appears to be leveling off in recent years. An upward trend would be helpful to women so that their earnings become equal to those of men.
10. The numbers of home runs rose from 1994 to 2000, then there was a gradual decline from 2000 to 2014, and there was a steady increase after 2014.
11. Given that box office receipts are closely tracked and widely reported, these data are very accurate.
12. The overwhelming response was that thank-you notes should be sent to everyone who is met during a job interview. Given what is at stake, that seems like a wise strategy.
13.
14.
15. The distribution does not have much skewness.
16. The distribution does not appear to be skewed.
17. Because the vertical scale starts with a frequency of 200 instead of 0, the difference between the “no” and “yes” responses is greatly exaggerated. The graph makes it appear that about five times as many respondents said “no,” when the ratio is actually a little less than 2.5 to 1.
18. The fare increased from $1 to $2.50, so it increased by a factor of 2.5. But when the larger bill is drawn so that the width is 2.5 times that of the smaller bill and the height is 2.5 times that of the smaller bill, the larger bill has an area that is 6.25 times that of the smaller bill (instead of being 2.5 times its size, as it should be). The illustration greatly exaggerates the increase in the fare.
19. The two costs are one-dimensional in nature, but the baby bottles are three-dimensional objects. The $4500 cost isn’t even twice the $2600 cost, but the baby bottles make it appear that the larger cost is about five times the smaller cost.
20. The graph is misleading because it depicts one-dimensional data with three-dimensional boxes. See the first and last boxes in the graph. Workers with advanced degrees have annual incomes that are roughly 3 times the incomes of those with no high school diplomas, but the graph exaggerates this difference by making it appear that workers with advanced degrees have incomes that are roughly 27 times the amounts for workers with no high school diploma.
21.
96 |
96 | 59
97 | 0001112333444
97 | 55666666788888999
98 | 555566666666666666677777788888889
99 | 001244
99 | 56
22. Around 2016, digital ad spending surpassed TV ad spending. TV ad spending appears to be relatively stable with only slight increases over time, but digital ad spending is growing at a much faster rate than TV ad spending.
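The exaggeration factors quoted in Exercises 18 and 20 of Section 2-3 come from scaling every dimension of a pictograph by the same factor. As a small illustrative sketch (the function name `apparent_ratio` is an assumption, not a term from the textbook), the apparent size ratio is the linear factor raised to the number of dimensions drawn:

```python
def apparent_ratio(linear_factor, dimensions):
    """Apparent size ratio when each drawn dimension is scaled by linear_factor."""
    return linear_factor ** dimensions

# Exercise 18: a bill drawn 2.5 times as wide and 2.5 times as tall (2 dimensions).
area_ratio = apparent_ratio(2.5, 2)

# Exercise 20: boxes scaled by about 3 in each of 3 dimensions.
volume_ratio = apparent_ratio(3, 3)

print(area_ratio, volume_ratio)
```

A one-dimensional comparison, such as bar heights on a common baseline starting at 0, keeps the apparent ratio equal to the actual ratio.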
Section 2-4: Scatterplots, Correlation, and Regression
1. The term linear refers to a straight line, and r measures how well a scatterplot of the sample paired data fits a straight-line pattern.
2. No. Finding the presence of a statistical correlation between two variables does not justify any conclusion that one of the variables is a cause of the other.
3. A scatterplot is a graph of paired x, y quantitative data. It helps us by providing a visual image of the data plotted as points, and such an image is helpful in enabling us to see patterns in the data and to recognize that there may be a correlation between the two variables.
4. a. 1 b. c. 0 d. 1
5.
There does not appear to be a correlation. The given data suggest that five-day predicted high temperatures are not very accurate.
6.
There does appear to be a linear correlation between the Sprint and Verizon airport data speeds. The correlation suggests that the speeds of the two companies are somehow related to the airports. One possibility is that the Sprint and Verizon cell towers are in similar locations at these airports.
7.
There does appear to be a linear correlation between the amounts of tar and the amounts of nicotine in cigarettes.
8.
There does not appear to be a correlation between pulse rates of females and males. The major flaw with this exercise is that the data are not paired as required. The results are therefore meaningless.
9. With n = 10 pairs of data, the critical values are ±0.632. Because r = 0.475 is between –0.632 and 0.632, there is not sufficient evidence to conclude that there is a linear correlation.
10. With n = 9 pairs of data, the critical values are ±0.666. Because r = 0.866 is in the right tail region beyond 0.666, there is sufficient evidence to conclude that there is a linear correlation.
11. With n = 9 pairs of data, the critical values are ±0.666. Because r = 0.971 is in the right tail region beyond 0.666, there is sufficient evidence to conclude that there is a linear correlation.
12. With n = 10 pairs of data, the critical values are ±0.632. Because r = 0.076 is between –0.632 and 0.632, there is not sufficient evidence to conclude that there is a linear correlation. The data are not paired, so the results are meaningless.
13. Because the P-value is 0.166, which is not small (such as 0.05 or less), there is a high chance (16.6%) of getting the sample results when there is no correlation, so there is not sufficient evidence to conclude that there is a linear correlation.
14. Because the P-value is 0.003, which is small (such as 0.05 or less), there is a small chance (0.3%) of getting the sample results when there is no correlation, so there is sufficient evidence to conclude that there is a linear correlation.
15. Because the P-value of 0.000 is small (such as 0.05 or less), there is a small chance of getting the sample results when there is no correlation, so there is sufficient evidence to conclude that there is a linear correlation.
16. Because the P-value of 0.835 is not small (such as 0.05 or less), there is a high chance (83.5% chance) of getting the sample results when there is no correlation, so there is not sufficient evidence to conclude that there is a linear correlation. But the data are not paired, so the P-value is meaningless here.
Quick Quiz
1. Class width: 20. It is not possible to identify the original data values.
2. Class limits: 0 and 19.
Class boundaries: –0.5 and 19.5. 3. 69 4. 9.
Annual Tornadoes in Oklahoma 0–19 20–39 40–59 60–79 80–99 100–119 120–139 140–159
Relative Frequency 4.3% 26.1% 30.4% 21.7% 8.7% 7.2% 0.0% 1.4%
5. 30, 30, 30, 31, 34, 34, 36, 36, 39
6. Pareto chart
7. scatterplot
8. No, the term “normal distribution” has a different meaning than the term “normal” that is used in ordinary speech. A normal distribution has a bell shape, but the randomly selected lottery digits will have a uniform or flat shape.
9. variation
10. Parts a, d, and e describe normally distributed data.
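The decisions in Section 2-4, Exercises 9 through 16, apply two simple rules: compare the size of r with the critical value, or compare the P-value with 0.05. The following Python sketch restates those rules using the numbers quoted in the solutions; the function names are illustrative assumptions, and the 0.05 cutoff is the "small" threshold mentioned in the text.

```python
def supports_linear_correlation(r, critical_value):
    """True when r falls in a tail region beyond the critical values."""
    return abs(r) > critical_value

def p_value_is_small(p_value, cutoff=0.05):
    """True when the P-value is small (0.05 or less)."""
    return p_value <= cutoff

# Exercise number -> (r, critical value), as quoted in Exercises 9-12.
r_cases = {9: (0.475, 0.632), 10: (0.866, 0.666), 11: (0.971, 0.666), 12: (0.076, 0.632)}
r_decisions = {ex: supports_linear_correlation(r, cv) for ex, (r, cv) in r_cases.items()}

# Exercise number -> P-value, as quoted in Exercises 13-16.
p_cases = {13: 0.166, 14: 0.003, 15: 0.000, 16: 0.835}
p_decisions = {ex: p_value_is_small(p) for ex, p in p_cases.items()}

print(r_decisions)
print(p_decisions)
```

Neither rule can rescue Exercises 12 and 16, where the data are not paired and the result is meaningless regardless of the numbers.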
Review Exercises
1. Both distributions are skewed to the right.
Email Interarrival Time (minutes) 0–9 10–19 20–29 30–39 40–49 50–59 60–69
Frequency 27 15 7 7 1 2 1
2.
Because the histogram has a shape that is far from being bell-shaped, it suggests that the data are from a population not having a normal distribution.
3.
By using fewer classes, the histogram does a better job of illustrating the distribution.
4.
There are no outliers.
5.
No. There is no pattern suggesting that there is a relationship.
6.
a. time-series graph b. scatterplot c. Pareto chart
7.
A pie chart wastes ink on components that are not data; pie charts lack an appropriate scale; pie charts don’t show relative sizes of different components as well as some other graphs, such as a Pareto chart.
8.
The Pareto chart does a better job. It draws attention to the most annoying words or phrases and shows the relative sizes of the different categories.
Cumulative Review Exercises
1.
Magnitude 1.00–1.49 1.50–1.99 2.00–2.49 2.50–2.99 3.00–3.49 3.50–3.99
Frequency 1 1 5 7 5 1
2. a. 1.00 and 1.49
b. 0.995 and 1.495
c. \(\frac{1.00 + 1.49}{2} = 1.245\)
3. The distribution is closer to being a normal distribution than the others. Only the single lowest value of 1.09 prevents perfect symmetry, but that one value should not be a basis for stating that the distribution is skewed left.
4. a. continuous b. quantitative c. ratio d. sample
5. The scatterplot does not show any pattern. There does not appear to be a correlation between magnitude and depth.
Chapter 3: Describing, Exploring, and Comparing Data
Section 3-1: Measures of Center
1. The term average is not used in statistics. The term mean should be used for the result obtained by adding all of the sample values and dividing the total by the number of sample values.
2. No. The 50 amounts are all weighted equally in the calculation that yields the mean of $56,479, but some states have many more teachers than others, so the mean teacher salary should be calculated using a weighted mean that takes into account the numbers of teachers in each of the 50 different states.
3. They use different approaches for providing a value (or values) of the center or middle of the sorted list of data.
4. Using all ten values, \(\bar{x} = \frac{15 + 20 + 25 + 30 + 30 + 35 + 45 + 50 + 50 + 180}{10} = 48.0\) min and the median is \(\frac{30 + 35}{2} = 32.5\) min. Excluding 180 minutes, \(\bar{x} = \frac{15 + 20 + 25 + 30 + 30 + 35 + 45 + 50 + 50}{9} = 33.3\) min and the median is 30 min. The outlier caused the mean to change by a substantial amount, but the median changed a relatively small amount. The median is resistant to the effect of the outlier, but the mean is not resistant.
XLSTAT Output for all data
Statistic               Wait Times (minutes)
Nbr. of observations    10
Minimum                 15.000
Maximum                 180.000
Median                  32.500
Mean                    48.000
XLSTAT Output with outlier removed
Statistic               Wait Times (minutes)
Nbr. of observations    9
Minimum                 15.000
Maximum                 50.000
Median                  30.000
Mean                    33.333
5. The mean is \(\bar{x} = \frac{12 + 26 + 46 + 15 + 11 + 87 + 77 + 62 + 60 + 69 + 61}{11} = 47.8\). The median is 60.0. There is no mode. The midrange is \(\frac{11 + 87}{2} = 49.0\). The jersey numbers are nominal data that are just replacements for names, and they do not measure or count anything, so the resulting statistics are meaningless.
XLSTAT Output for Exercise 5
Statistic               Jersey Number
Nbr. of observations    11
Minimum                 11.000
Maximum                 87.000
Median                  60.000
Mean                    47.818
6. The mean is \(\bar{x} = \frac{41 + 24 + 30 + 31 + 32 + 29 + 25 + 26 + 26 + 25 + 30}{11} = 29.0\) years. The median is 29.0 years. The modes are 25, 26, and 30 years. The midrange is \(\frac{24 + 41}{2} = 32.5\) years. The jersey numbers in the preceding exercise are nominal data that are just substitutes for names, but the ages in this exercise are data at the ratio level of measurement, so the statistics are meaningful. These ages show that the players are all relatively young, with the exception of the GOAT (Tom Brady, age 41).
XLSTAT Output for Exercise 6
Statistic               Age
Nbr. of observations    11
Minimum                 24.000
Maximum                 41.000
Median                  29.000
Mean                    29.000
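The four measures of center used in Exercises 4 through 6 can be reproduced with Python's standard library. This is a minimal sketch, not the XLSTAT procedure: the helper names `midrange` and `modes` are assumptions introduced here, and `modes` follows the convention used in these solutions that a data set with no repeated value has no mode.

```python
from collections import Counter
from statistics import mean, median

def midrange(values):
    """Midrange: (minimum + maximum) / 2."""
    return (min(values) + max(values)) / 2

def modes(values):
    """Values with the highest frequency; empty list when nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top) if top > 1 else []

# Exercise 4: wait times with the outlier (180) and without it.
wait = [15, 20, 25, 30, 30, 35, 45, 50, 50, 180]
wait_no_outlier = wait[:-1]

# Exercise 5 (jersey numbers) and Exercise 6 (ages).
jersey = [12, 26, 46, 15, 11, 87, 77, 62, 60, 69, 61]
ages = [41, 24, 30, 31, 32, 29, 25, 26, 26, 25, 30]

print(round(mean(wait), 1), median(wait))
print(round(mean(wait_no_outlier), 1), median(wait_no_outlier))
print(round(mean(jersey), 1), median(jersey), modes(jersey), midrange(jersey))
print(round(mean(ages), 1), median(ages), modes(ages), midrange(ages))
```

Dropping the single value of 180 moves the mean far more than the median, which is the resistance property described in Exercise 4.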
7. The mean is \(\bar{x} = \frac{5500 + 3700 + 3200 + 1700 + 1200 + 1000 + 1000 + 950}{8} = 2281\), or $2281 million. The median is \(\frac{1200 + 1700}{2} = 1450\), or $1450 million. The mode is $1000 million. The midrange is \(\frac{950 + 5500}{2} = 3225\), or $3225 million. Apart from the fact that all other celebrities have amounts of net worth lower than those given, nothing meaningful can be known about the population of net worth of all celebrities. The numbers all end in 0, and they appear to be rounded estimates (which is the reason for rounding to the nearest whole number).
XLSTAT Output for Exercise 7
Statistic               Net Worth (millions of dollars)
Nbr. of observations    8
Minimum                 950.000
Maximum                 5500.000
Median                  1450.000
Mean                    2281.250
8. The mean is \(\bar{x} = \frac{\sum x}{9} = 260{,}741.1\), or $260,741.1. The median is $18,259.0. There is no mode. The midrange is \(\frac{16{,}354 + 2{,}200{,}000}{2} = 1{,}108{,}177.0\), or $1,108,177.0. Michael Jordan’s income of $2,200,000 is an outlier that has a dramatic effect on the mean and midrange, but it has relatively little effect on the median. It would be technically correct to say that the mean income is greater than $250,000, but it would be very misleading to do so.
XLSTAT Output for Exercise 8
Statistic               Income (dollars)
Nbr. of observations    9
Minimum                 16354.000
Maximum                 2200000.000
Median                  18259.000
Mean                    260741.111
9. The mean is \(\bar{x} = \frac{70 + 54 + 68 + 82 + 79 + 83 + 76 + 73 + 98 + 81}{10} = 76.4\) attacks. The median is \(\frac{76 + 79}{2} = 77.5\) attacks. There is no mode. The midrange is \(\frac{54 + 98}{2} = 76.0\) attacks. The data are time-series data, but measures of center do not reveal anything about a trend consisting of a pattern of change over time.
XLSTAT Output for Exercise 9
Statistic               Number of Attacks
Nbr. of observations    10
Minimum                 54.000
Maximum                 98.000
Median                  77.500
Mean                    76.400
10. The mean is \(\bar{x} = \frac{\sum x}{25} = 1.9\). The median is 2. The mode is 1. The midrange is \(\frac{1 + 4}{2} = 2.5\). The mode of 1 correctly indicates that the smooth-yellow peas occur more than any other phenotype, but the other measures of center do not make sense with these data at the nominal level of measurement.
XLSTAT Output for Exercise 10
Statistic               Phenotype Code
Nbr. of observations    25
Minimum                 1.000
Maximum                 4.000
Median                  2.000
Mean                    1.880
11. The mean is \(\bar{x} = \frac{250 + 170 + 225 + 100 + 250 + 250 + 130 + 200 + 150 + 250 + 170 + 200 + 180 + 250}{14} = 198.2\), or $198.2. The median is \(\frac{200 + 200}{2} = 200.0\), or $200.0. The mode is $250. The midrange is \(\frac{100 + 250}{2} = 175.0\), or $175.0. The lowest price is a relevant statistic for someone planning to buy one of the smart thermostats.
XLSTAT Output for Exercise 11
Statistic               Selling Price (dollars)
Nbr. of observations    14
Minimum                 100.000
Maximum                 250.000
Median                  200.000
Mean                    198.214
12. The mean is \(\bar{x} = \frac{0.97 + 1.38 + 0.93 + 1.52 + 1.37 + 1.09 + 0.48 + 0.65}{8} = 1.049\) W/kg. The median is \(\frac{0.97 + 1.09}{2} = 1.030\) W/kg. There is no mode. The midrange is \(\frac{0.48 + 1.52}{2} = 1.000\) W/kg. If concerned about radiation, you might purchase the cell phone with the lowest radiation level. All of the cell phones in the sample have radiation levels below the FCC maximum of 1.6 W/kg.
XLSTAT Output for Exercise 12
Statistic               Radiation Rate
Nbr. of observations    8
Minimum                 0.480
Maximum                 1.520
Median                  1.030
Mean                    1.049
13. The mean is \(\bar{x} = \frac{\sum x}{20} = 32.0\) mg. The median is \(\frac{38 + 41}{2} = 39.5\) mg. The mode is 0 mg. The midrange is \(\frac{0 + 55}{2} = 27.5\) mg. Americans consume some brands much more often than others, but the 20 brands are all weighted equally in the calculations, so the statistics are not necessarily representative of the population of all cans of the same 20 brands consumed by Americans.
XLSTAT Output for Exercise 13
Statistic               Caffeine (mg per 12 oz)
Nbr. of observations    20
Minimum                 0.000
Maximum                 55.000
Median                  39.500
Mean                    32.550
14. The mean is \(\bar{x} = \frac{\sum x}{20} = 79.61\%\). The median is \(\frac{80.2 + 80.4}{2} = 80.30\%\). The mode is 80.2%. The midrange is \(\frac{74.4 + 82.5}{2} = 78.45\%\). The data are time-series data, but the measures of center do not reveal anything about a trend consisting of a pattern of change over time.
XLSTAT Output for Exercise 14
Statistic               Percentage of Men’s Earnings
Nbr. of observations    20
Minimum                 74.400
Maximum                 82.500
Median                  80.300
Mean                    79.610
15. The mean is \(\bar{x} = \frac{1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 2 + 2}{10} = 1.2\). The median is \(\frac{1 + 1}{2} = 1.0\). The mode is 1. The midrange is \(\frac{1 + 2}{2} = 1.5\). The statistics are meaningless because the data are at the nominal level of measurement with the numbers being replacements for “right” and “left.” Because the measurements were made in 1988, they are not necessarily representative of the current population of all Army women.
XLSTAT Output for Exercise 15
Statistic               Writing Hand (1 = right)
Nbr. of observations    10
Minimum                 1.000
Maximum                 2.000
Median                  1.000
Mean                    1.200
16. The mean is \(\bar{x} = \frac{\sum x}{10} = 54{,}817.6\), or $54,817.6. The median is \(\frac{54{,}380 + 54{,}770}{2} = 54{,}575.0\), or $54,575.0. There is no mode. The midrange is \(\frac{54{,}010 + 57{,}208}{2} = 55{,}609.0\), or $55,609.0. Apart from the fact that all other colleges have tuition and fee amounts less than those listed, nothing meaningful can be known about the population.
XLSTAT Output for Exercise 16
Statistic               Annual Costs (dollars)
Nbr. of observations    10
Minimum                 54010.000
Maximum                 57208.000
Median                  54575.000
Mean                    54817.600
17. The mean is \(\bar{x} = \frac{39 + 50 + \cdots + 2500}{25} = 365.3\), or $365.30.
The median is 200.0, or $200.00.
The mode is 500, or $500.00.
The midrange is \(\frac{39 + 2500}{2} = 1269.5\), or $1269.50.
The amounts of $1500 and $2500 appear to be outliers.

XLSTAT Output for Exercise 17
Statistic               Cost (dollars)
Nbr. of observations    25
Minimum                 39.000
Maximum                 2500.000
Median                  200.000
Mean                    365.320

18. The mean is \(\bar{x} = \frac{0.3 + 0.6 + \cdots + 14.3}{25} = 3.46\) million albums.
The median is 1.40 million albums.
The mode is 1.4 million albums.
The midrange is \(\frac{0.3 + 14.3}{2} = 7.30\) million albums.
None of the measures of center give us any information about the changing trend over time.

XLSTAT Output for Exercise 18
Statistic               Sales (millions of units)
Nbr. of observations    25
Minimum                 0.300
Maximum                 14.300
Median                  1.400
Mean                    3.456

19. The mean is \(\bar{x} = \frac{0 + 0 + \cdots + 50}{50} = 2.8\) cigarettes.
The median is 0 cigarettes.
The mode is 0 cigarettes.
The midrange is \(\frac{0 + 50}{2} = 25.0\) cigarettes.
Because the selected subjects report the number of cigarettes smoked, it is very possible that the data are not at all accurate. And what about that person who smokes 50 cigarettes (or 2.5 packs) a day? What are they thinking?

XLSTAT Output for Exercise 19
Statistic               Cigarettes Smoked per Day
Nbr. of observations    50
Median                  0.000
Mean                    2.780

20. The mean is \(\bar{x} = \frac{5.3 + 5.3 + 2.6 + 3.2 + 3.4 + 2.3 + 6.8 + 5.1 + 4.8 + 4.4 + 2.8 + 1.9}{12} = 3.99\), or $3.99.
The median is \(\frac{3.4 + 4.4}{2} = 3.90\), or $3.90.
The mode is $5.30.
The midrange is \(\frac{1.9 + 6.8}{2} = 4.35\), or $4.35.
The prices appear to vary by considerable amounts. For example, the price of a Big Mac in Switzerland ($6.80) is more than three times the price in Egypt ($1.90). None of the measures of center reveal anything about the variation of the prices.

XLSTAT Output for Exercise 20
Statistic               Price (dollars)
Nbr. of observations    12
Median                  3.900
Mean                    3.992
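The four measures of center used in Exercises 16 through 20 can be checked with a few lines of Python. This is only a sketch using the standard `statistics` module and the twelve Big Mac prices listed in Exercise 20; the helper name `midrange` is ours, not something from Excel or XLSTAT.

```python
from statistics import mean, median, multimode

# Big Mac prices (dollars) as listed in the solution to Exercise 20.
prices = [5.3, 5.3, 2.6, 3.2, 3.4, 2.3, 6.8, 5.1, 4.8, 4.4, 2.8, 1.9]


def midrange(values):
    """Midpoint between the smallest and largest values."""
    return (min(values) + max(values)) / 2


center = {
    "mean": round(mean(prices), 2),
    "median": round(median(prices), 2),
    "modes": multimode(prices),  # list of all most-frequent values
    "midrange": round(midrange(prices), 2),
}
print(center)
```

`multimode` returns a list, so a data set with no repeated values would return every value rather than reporting "no mode"; that case needs a separate check.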
21. Systolic: The mean is \(\bar{x} = \frac{96 + 116 + 118 + 120 + 122 + 126 + 128 + 136 + 156 + 158}{10} = 127.6\) mm Hg.
The median is \(\frac{122 + 126}{2} = 124.0\) mm Hg.
Diastolic: The mean is \(\bar{x} = \frac{52 + 58 + 64 + 72 + 74 + 76 + 80 + 82 + 88 + 90}{10} = 73.6\) mm Hg.
The median is \(\frac{74 + 76}{2} = 75.0\) mm Hg.
Given that systolic and diastolic blood pressures measure different characteristics, a comparison of the measures of center doesn’t make sense. Because the data are matched, it would make more sense to investigate whether there is an association or correlation between systolic blood pressure measurements and diastolic blood pressure measurements.

XLSTAT Output
Statistic               Systolic (mm Hg)    Diastolic (mm Hg)
Nbr. of observations    10                  10
Median                  124.000             75.000
Mean                    127.600             73.600

22. Brinks: The mean is \(\bar{x} = \frac{1.3 + 1.3 + \cdots + 1.8}{10} = 1.55\), or $1.55 million.
The median is \(\frac{1.5 + 1.6}{2} = 1.55\), or $1.55 million.
Not Brinks: The mean is \(\bar{x} = \frac{1.5 + 1.5 + 1.6 + 1.6 + 1.6 + 1.7 + 1.8 + 1.9 + 1.9 + 2.2}{10} = 1.73\), or $1.73 million.
The median is \(\frac{1.6 + 1.7}{2} = 1.65\), or $1.65 million.
The data do suggest that collections were considerably lower when Brinks was the collection contractor.

XLSTAT Output
Statistic               Brinks (millions of dollars)    Not Brinks (millions of dollars)
Nbr. of observations    10                              10
Median                  1.550                           1.650
Mean                    1.550                           1.730
23. Males: The mean is \(\bar{x} = \frac{54 + 56 + \cdots + 96}{15} = 69.5\) beats per minute.
The median is 66.0 beats per minute.
Females: The mean is \(\bar{x} = \frac{64 + 68 + \cdots + 94}{15} = 82.1\) beats per minute.
The median is 84.0 beats per minute.
The pulse rates of males appear to be lower than those of females.

XLSTAT Output
Statistic               Male (beats per minute)    Female (beats per minute)
Nbr. of observations    15                         15
Median                  66.000                     84.000
Mean                    69.467                     82.133
24. It’s a Small World: The mean is \(\bar{x} = \frac{10 + 5 + \cdots + 15}{50} = 9.7\) minutes.
The median is \(\frac{10 + 10}{2} = 10.0\) minutes.
Avatar Flight of Passage: The mean is \(\bar{x} = \frac{180 + 195 + \cdots + 165}{50} = 140.6\) minutes.
The median is \(\frac{150 + 150}{2} = 150.0\) minutes.
The results are dramatically different, suggesting the wait time for “Avatar Flight of Passage” is much longer than for “It’s a Small World.”

XLSTAT Output
Statistic               Small World (minutes)    Avatar (minutes)
Nbr. of observations    50                       50
Median                  10.000                   150.000
Mean                    9.700                    140.600
25. ANSUR I 1988: The mean is \(\bar{x} = 78.49\) kg and the median is 77.70 kg.
ANSUR II 2012: The mean is \(\bar{x} = 85.52\) kg and the median is 84.60 kg.
It does appear that males have become heavier.
(TI data: ANSUR I 1988: \(\bar{x} = 78.82\) kg and the median is 77.70 kg. ANSUR II 2012: \(\bar{x} = 85.53\) kg and the median is 84.00 kg.)

XLSTAT Output
Statistic               Weight 1998 (kg)    Weight 2012 (kg)
Nbr. of observations    4082                4082
Median                  77.700              84.600
Mean                    78.487              85.524
26. The mean is \(\bar{x} = 2.572\) and the median is 2.490. Yes, the magnitude of that World Series earthquake is an outlier because a magnitude of 7.0 is substantially greater than all of the other magnitudes listed in Data Set 24.

XLSTAT Output
Statistic               Magnitude
Nbr. of observations    600
Median                  2.490
Mean                    2.572
27. The mean is \(\bar{x} = 98.20^\circ\)F and the median is 98.40°F. These results suggest that the mean is less than 98.6°F.

XLSTAT Output for Exercise 27
Statistic               DAY 2 - 12AM (F)
Nbr. of observations    106
Median                  98.400
Mean                    98.200

28. The mean is \(\bar{x} = 3152.0\) g and the median is 3300 g. All of the weights end in 00, so they are all rounded to the nearest 100 grams. This suggests that the results should be rounded as follows: \(\bar{x} = 3150.0\) g and the median is 3300 g.

XLSTAT Output for Exercise 28
Statistic               Birth Weight (grams)
Nbr. of observations    400
Median                  3300.000
Mean                    3152.000

29. The mean is \(\frac{7(6) + 22(18) + 37(14) + 52(5) + 67(5) + 82(1) + 97(1)}{6 + 18 + 14 + 5 + 5 + 1 + 1} = 34.6\) minutes, which is reasonably close to the mean of 31.4 minutes obtained by using the original list of values.
30. The mean is \(\frac{79.5(4) + 99.5(7) + 119.5(6) + 139.5(6) + 159.5(18) + 179.5(5) + 199.5(1) + 219.5(3)}{4 + 7 + 6 + 6 + 18 + 5 + 1 + 3} = 143.9\) minutes, which is close to the mean of 140.6 minutes obtained by using the original list of values. That’s a long wait!
31. The mean is \(\frac{42(2) + 47(7) + 52(10) + 57(10) + 62(6) + 67(3) + 72(1)}{2 + 7 + 10 + 10 + 6 + 3 + 1} = 55.1\) years. The mean from the frequency distribution is quite close to the mean of 55.2 years obtained by using the original list of values.
32. The mean is \(\frac{137(1) + 162(0) + 187(0) + 212(3) + 237(34) + 262(12)}{1 + 0 + 0 + 3 + 34 + 12} = 239.5\) seconds. The mean from the frequency distribution is reasonably close to the mean of 240.2 seconds obtained by using the original list of values.
33. The mean is \(\bar{x} = \frac{3(4) + 3(2) + 3(3) + 4(4) + 1(1)}{3 + 3 + 3 + 4 + 1} = 3.14\), so the student made the dean’s list.
34. The mean is \(\bar{x} = 0.60\left(\frac{63 + 91 + 88 + 84 + 79}{5}\right) + 0.10(86) + 0.15(90) + 0.15(70) = 81.2\), so the student earned a B.
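The means in Exercises 29 through 34 are all weighted means: each value is multiplied by its weight, and the sum of those products is divided by the sum of the weights. A minimal Python sketch follows. The function name `weighted_mean` is ours, and the Exercise 33 inputs assume the reading "credit hours times grade points" of the flattened fraction.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    total = sum(v * w for v, w in zip(values, weights))
    return total / sum(weights)


# Exercise 29: class midpoints weighted by class frequencies.
midpoints = [7, 22, 37, 52, 67, 82, 97]
frequencies = [6, 18, 14, 5, 5, 1, 1]
grouped_mean = weighted_mean(midpoints, frequencies)
print(round(grouped_mean, 1))  # 34.6

# Exercise 33: grade points weighted by credit hours (assumed reading).
grade_points = [4, 2, 3, 4, 1]
credit_hours = [3, 3, 3, 4, 1]
gpa = weighted_mean(grade_points, credit_hours)
print(round(gpa, 2))  # 3.14
```

In Excel the same calculation is commonly written with `SUMPRODUCT` divided by `SUM`.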
35. a. The missing value is \(5(56.2) - 64 - 46 - 54 - 47 = 70\) years.
b. \(n - 1\)
36. The mean ignoring the presidents who are still alive is 15.5 years. The mean including the presidents who are still alive is at least 16.3 years. The two results differ by 0.8 year, which is not very large.
37. 504 lb is an outlier. For the original data, the median is 285.5 lb and the mean is 294.4 lb. The 10% trimmed mean is 285.4 lb and the 20% trimmed mean is 285.8 lb. The median, 10% trimmed mean, and 20% trimmed mean are all quite close, but the untrimmed mean of 294.4 lb differs from them because it is strongly affected by the inclusion of the outlier.
38. a. It took \(\frac{30}{30} + \frac{30}{10} = 4\) hours for the round trip, so the arithmetic mean is \(\frac{60}{4} = 15\) mi/h. The harmonic mean is \(\frac{2}{\frac{1}{30} + \frac{1}{10}} = 15.0\) mi/h.
b. The harmonic mean is \(\frac{2}{\frac{1}{38} + \frac{1}{56}} = 45.3\) mi/h and the arithmetic mean is \(\frac{38 + 56}{2} = 47.0\) mi/h. The harmonic mean of 45.3 mi/h is the actual mean speed for the round trip.
39. The geometric mean is \(\sqrt[6]{(1.0058)(1.0029)(1.0013)(1.0014)(1.0015)(1.0019)} = 1.002465417\), or 1.00247 when rounded, so the growth rate is \(1.00247 - 1 = 0.00247\), or 0.247%.
40. The root mean square (RMS) is \(\sqrt{\frac{0^2 + 60^2 + 110^2 + (-110)^2 + (-60)^2 + 0^2}{6}} = 72.3\) volts, which is very different from the mean of 0 volts.
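The three alternative means in Exercises 38 through 40 can be reproduced with the Python standard library. This is a sketch, not the manual's method. `harmonic_mean` and `geometric_mean` are functions in the `statistics` module (Python 3.8 or later). The minus signs on the voltages are an assumption, based on the solution's statement that the ordinary mean is 0 volts.

```python
from math import sqrt
from statistics import geometric_mean, harmonic_mean

# Exercise 38(b): speeds of 38 mi/h and 56 mi/h over equal distances.
hm = harmonic_mean([38, 56])

# Exercise 39: six growth factors; subtract 1 to get the mean growth rate.
factors = [1.0058, 1.0029, 1.0013, 1.0014, 1.0015, 1.0019]
gm = geometric_mean(factors)
growth_rate = gm - 1

# Exercise 40: voltages; negative signs assumed so that the mean is 0 volts.
volts = [0, 60, 110, -110, -60, 0]
rms = sqrt(sum(v * v for v in volts) / len(volts))

print(round(hm, 1), round(gm, 5), round(rms, 1))
```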
41. The median found using the given expression is \(30 + 15\left(\frac{\frac{50 + 1}{2} - (24 + 1)}{14}\right) = 30.5\) minutes. The median of the 1000 times from Data Set 31 is 30.0 minutes. The difference is 0.5 minute.

Section 3-2: Measures of Variation
1. \(s \approx \frac{193.9 - 155.0}{4} = 9.58\) cm, which is in the general ballpark of the standard deviation of 7.10 cm calculated using the 153 heights. The range rule of thumb does not necessarily give an estimate of s that is very accurate.
2. Significantly low values are less than or equal to \(174.12 - 2(7.10) = 159.92\) cm and significantly high values are greater than or equal to \(174.12 + 2(7.10) = 188.32\) cm. A height of 190 cm is significantly high.
3. \((7.10 \text{ cm})^2 = 50.41\) cm²
4. (a) \(s\), (b) \(\sigma\), (c) \(s^2\), (d) \(\sigma^2\); If sample data consist of weights measured in grams, \(s\) and \(\sigma\) would also be in units of grams (g), but \(s^2\) and \(\sigma^2\) would be in units of grams squared (g²).
5. The range is \(87 - 11 = 76.0\).
The variance is \(s^2 = \frac{(12 - 47.8)^2 + (26 - 47.8)^2 + \cdots + (69 - 47.8)^2 + (61 - 47.8)^2}{11 - 1} = 755.4\).
The standard deviation is \(s = \sqrt{755.4} = 27.5\).
Because the jersey numbers are really just replacements for names, they are at the nominal level of measurement, so the results are meaningless.

XLSTAT Output for Exercise 5
Statistic                   Jersey Number
Nbr. of observations        11
Range                       76.000
Variance (n-1)              755.364
Standard deviation (n-1)    27.484

6. The range is \(41 - 24 = 17.0\) years.
The variance is \(s^2 = \frac{(41 - 29.0)^2 + (24 - 29.0)^2 + \cdots + (25 - 29.0)^2 + (30 - 29.0)^2}{11 - 1} = 23.4\) years².
The standard deviation is \(s = \sqrt{23.4} = 4.8\) years.
Unlike the results from the preceding exercise, these results are meaningful and they give us some insight into the nature of the data. We can see that with a range of 17 years, there is a large difference in age between the oldest player (Tom Brady) and youngest player.

XLSTAT Output for Exercise 6
Statistic                   Age (years)
Nbr. of observations        11
Range                       17.000
Variance (n-1)              23.400
Standard deviation (n-1)    4.837
7. The range is \(5500 - 950 = 4550\), or $4550 million.
The variance is \(s^2 = \frac{(5500 - 2281)^2 + \cdots + (950 - 2281)^2}{8 - 1} = 2{,}825{,}670\) (million dollars)².
The standard deviation is \(s = \sqrt{2{,}825{,}670} = 1681\), or $1681 million.
Because the data are from celebrities with the highest net worth, the measures of variation are not at all typical for all celebrities. Because all of the amounts end with 0, it appears that they are rounded to the nearest ten million dollars, so it would make sense to round the results to the nearest million dollars, as is done here.

XLSTAT Output for Exercise 7
Statistic                   Net Worth (millions of dollars)
Nbr. of observations        8
Range                       4550.000
Variance (n-1)              2825669.643
Standard deviation (n-1)    1680.973

8. The range is \(2{,}200{,}000 - 16{,}354 = 2{,}183{,}646\), or $2,183,646.
The variance is \(s^2 = \frac{(16{,}354 - 260{,}741.1)^2 + \cdots + (2{,}200{,}000 - 260{,}741.1)^2}{9 - 1} = 528{,}853{,}134{,}358.0\) (dollars)².
The standard deviation is \(s = \sqrt{528{,}853{,}134{,}358.0} = 727{,}222.9\), or $727,222.90.
(Because the variance is so large, the exact number usually cannot be found with software or a calculator, so a result such as 528,853,134,000 square dollars is acceptable here.) The income of Michael Jordan does have a substantial effect on the results.

XLSTAT Output for Exercise 8
Statistic                   Income (dollars)
Nbr. of observations        9
Range                       2183646.000
Variance (n-1)              528853134417.111
Standard deviation (n-1)    727222.892

9. The range is \(98 - 54 = 44.0\) attacks.
The variance is \(s^2 = \frac{10(59{,}564) - (764)^2}{10(10 - 1)} = 132.7\) attacks².
The standard deviation is \(s = \sqrt{132.7} = 11.5\) attacks.
The measures of variation are blind to any trend for these time-series data.

XLSTAT Output for Exercise 9
Statistic                   Number of Attacks
Nbr. of observations        10
Range                       44.000
Variance (n-1)              132.711
Standard deviation (n-1)    11.520

XLSTAT Output for Exercise 10
Statistic                   Phenotype Code
Nbr. of observations        25
Range                       3.000
Variance (n-1)              0.860
Standard deviation (n-1)    0.927
10. The range is \(4 - 1 = 3\).
The variance is \(s^2 = \frac{25(109) - (47)^2}{25(25 - 1)} = 0.9\).
The standard deviation is \(s = \sqrt{0.9} = 0.9\).
The measures of variation can be found, but they make no sense because the data don’t measure or count anything. They are nominal data.
11. The range is \(250 - 100 = 150.0\), or $150.00.
The variance is \(s^2 = \frac{14(582{,}725) - (2775)^2}{14(14 - 1)} = 2513.9\) (dollars)².
The standard deviation is \(s = \sqrt{2513.9} = 50.1\), or $50.10.
These statistics are measuring variation among the sample of prices. The minimum price would be a particularly helpful statistic, but none of these statistics is helpful for selecting a smart thermostat for purchase.

XLSTAT Output for Exercise 11
Statistic                   Selling Price (dollars)
Nbr. of observations        14
Range                       150.000
Variance (n-1)              2513.874
Standard deviation (n-1)    50.139

XLSTAT Output for Exercise 12
Statistic                   Radiation Rate (W/kg)
Nbr. of observations        8
Range                       1.040
Variance (n-1)              0.134
Standard deviation (n-1)    0.366

12. The range is \(1.52 - 0.48 = 1.040\) W/kg.
The variance is \(s^2 = \frac{8(9.7385) - (8.39)^2}{8(8 - 1)} = 0.134\) (W/kg)².
The standard deviation is \(s = \sqrt{0.134} = 0.366\) W/kg.
The minimum radiation would be a particularly helpful statistic, but none of these statistics is helpful for selecting a cell phone for purchase.
13. The range is \(55 - 0 = 55.0\) mg.
The variance is \(s^2 = \frac{20(29{,}045) - (651)^2}{20(20 - 1)} = 413.4\) mg².
The standard deviation is \(s = \sqrt{413.4} = 20.3\) mg.
Americans consume some brands much more often than others, but the 20 brands are all weighted equally in the calculations, so the statistics are not necessarily representative of the population of all cans of the same 20 brands consumed by Americans.

XLSTAT Output for Exercise 13
Statistic                   Caffeine (mg per 12 oz)
Nbr. of observations        20
Range                       55.000
Variance (n-1)              413.418
Standard deviation (n-1)    20.333

XLSTAT Output for Exercise 14
Statistic                   Percentage of Men's Earnings
Nbr. of observations        20
Range                       8.100
Variance (n-1)              5.582
Standard deviation (n-1)    2.363

14. The range is \(82.5 - 74.4 = 8.10\) percentage points.
The variance is \(s^2 = \frac{20(126{,}861.1) - (1592.2)^2}{20(20 - 1)} = 5.58\) (percentage points)².
The standard deviation is \(s = \sqrt{5.58} = 2.36\) percentage points.
These statistics are blind to any trend occurring over time.
15. The range is \(2 - 1 = 1.0\).
The variance is \(s^2 = \frac{10(16) - (12)^2}{10(10 - 1)} = 0.2\).
The standard deviation is \(s = \sqrt{0.2} = 0.4\).
Because the numbers are replacements for “right” and “left,” they are at the nominal level of measurement, and the results are meaningless.

XLSTAT Output for Exercise 15
Statistic                   Writing Hand (1 = right hand)
Nbr. of observations        10
Range                       1.000
Variance (n-1)              0.178
Standard deviation (n-1)    0.422

XLSTAT Output for Exercise 16
Statistic                   Annual Costs (dollars)
Nbr. of observations        10
Range                       3198.000
Variance (n-1)              837554.711
Standard deviation (n-1)    915.180
16. The range is \(57{,}208 - 54{,}010 = 3198.0\), or $3198.00.
The variance is \(s^2 = \frac{10(30{,}057{,}230{,}690) - (548{,}176)^2}{10(10 - 1)} = 837{,}554.7\) (dollars)².
The standard deviation is \(s = \sqrt{837{,}554.7} = 915.2\), or $915.20.
Because the data are from the ten most expensive colleges, the measures of variation are not at all typical for all colleges.
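Exercises 9 through 20 use the shortcut form of the sample variance, \(s^2 = \frac{n\sum x^2 - (\sum x)^2}{n(n - 1)}\), which needs only the count, the sum, and the sum of squares. The sketch below plugs in the three totals shown in Exercise 16. The function name `shortcut_variance` is ours.

```python
from math import sqrt


def shortcut_variance(n, sum_x, sum_x_sq):
    """Sample variance from n, the sum of x, and the sum of x squared."""
    return (n * sum_x_sq - sum_x ** 2) / (n * (n - 1))


# Totals taken from the solution to Exercise 16 (annual college costs).
n = 10
sum_x = 548_176
sum_x_sq = 30_057_230_690

variance = shortcut_variance(n, sum_x, sum_x_sq)
std_dev = sqrt(variance)
print(round(variance, 1), round(std_dev, 1))  # 837554.7 915.2
```

Keeping the totals as Python integers makes the numerator exact before the final division, which helps with large values such as those in Exercise 8.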
17. The range is \(2500 - 39 = 2461.0\), or $2461.00.
The variance is \(s^2 = \frac{25(10{,}306{,}077) - (9133)^2}{25(25 - 1)} = 290{,}400.4\) (dollars)².
The standard deviation is \(s = \sqrt{290{,}400.4} = 538.9\), or $538.90.
The amounts of $1500 and $2500 appear to be outliers, and it is likely that they have a large effect on the measures of variation.

XLSTAT Output for Exercise 17
Statistic                   Cost (dollars)
Nbr. of observations        25
Range                       2461.000
Variance (n-1)              290400.393
Standard deviation (n-1)    538.888

XLSTAT Output for Exercise 18
Statistic                   Sales (millions of units)
Nbr. of observations        25
Range                       14.000
Variance (n-1)              17.244
Standard deviation (n-1)    4.153

18. The range is \(14.3 - 0.3 = 14.00\) million units.
The variance is \(s^2 = \frac{25(712.46) - (86.4)^2}{25(25 - 1)} = 17.24\) (million units)².
The standard deviation is \(s = \sqrt{17.24} = 4.15\) million units.
The measures of variation do not give us any information about a changing trend over time.
19. The range is \(50 - 0 = 50.0\) cigarettes.
The variance is \(s^2 = \frac{50(4781) - (139)^2}{50(50 - 1)} = 89.7\) (cigarettes)².
The standard deviation is \(s = \sqrt{89.7} = 9.5\) cigarettes.
Because the selected subjects report the number of cigarettes smoked, it is very possible that the data are not at all accurate, so the results might not reflect the actual smoking behavior of California adults.

XLSTAT Output for Exercise 19
Statistic                   Cigarettes Smoked per Day
Nbr. of observations        50
Range                       50.000
Variance (n-1)              89.685
Standard deviation (n-1)    9.470

XLSTAT Output for Exercise 20
Statistic                   Price (dollars)
Nbr. of observations        12
Range                       4.900
Variance (n-1)              2.266
Standard deviation (n-1)    1.505

20. The range is \(6.8 - 1.9 = 4.90\), or $4.90.
The variance is \(s^2 = \frac{12(216.13) - (47.9)^2}{12(12 - 1)} = 2.27\) (dollars)².
The standard deviation is \(s = \sqrt{2.27} = 1.51\), or $1.51.
The range alone tells us that there are very substantial differences among prices of Big Mac hamburgers in different countries.
21. Systolic: \(\bar{x} = 127.6\), \(s = 18.6\); The coefficient of variation is \(\frac{18.6}{127.6} \cdot 100\% = 14.6\%\).
Diastolic: \(\bar{x} = 73.6\), \(s = 12.5\); The coefficient of variation is \(\frac{12.5}{73.6} \cdot 100\% = 16.9\%\).
The amounts of variation are roughly the same.
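The coefficient of variation in Exercise 21 is the sample standard deviation divided by the mean, expressed as a percentage. As a sketch, it can be computed from the ten systolic and ten diastolic readings listed in Exercise 21 of Section 3-1. The function name `coefficient_of_variation` is ours. `statistics.stdev` uses division by n − 1, matching the "(n-1)" rows in the XLSTAT output.

```python
from statistics import mean, stdev

# Blood pressure readings (mm Hg) from Exercise 21 of Section 3-1.
systolic = [96, 116, 118, 120, 122, 126, 128, 136, 156, 158]
diastolic = [52, 58, 64, 72, 74, 76, 80, 82, 88, 90]


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100


cv_systolic = coefficient_of_variation(systolic)
cv_diastolic = coefficient_of_variation(diastolic)
print(f"{cv_systolic:.1f}% {cv_diastolic:.1f}%")  # 14.6% 16.9%
```

Working from the unrounded standard deviation reproduces the XLSTAT coefficients (0.146 and 0.169).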
XLSTAT Output for Exercise 21
Statistic                      Systolic (mm Hg)    Diastolic (mm Hg)
Nbr. of observations           10                  10
Mean                           127.600             73.600
Standard deviation (n-1)       18.614              12.465
Variation coefficient (n-1)    0.146               0.169

22. Brinks: \(\bar{x} = 1.55\), \(s = 0.18\); The coefficient of variation is \(\frac{0.18}{1.55} \cdot 100\% = 11.5\%\).
Not Brinks: \(\bar{x} = 1.73\), \(s = 0.22\); The coefficient of variation is \(\frac{0.22}{1.73} \cdot 100\% = 12.8\%\).
The results are not very different, so the two samples do not appear to have different amounts of variation. These data do not suggest that theft may have occurred.

XLSTAT Output
Statistic                      Brinks (millions of dollars)    Not Brinks (millions of dollars)
Nbr. of observations           10                              10
Mean                           1.550                           1.730
Standard deviation (n-1)       0.178                           0.221
Variation coefficient (n-1)    0.115                           0.128

23. Male: \(\bar{x} = 69.5\), \(s = 11.3\); The coefficient of variation is \(\frac{11.3}{69.5} \cdot 100\% = 16.2\%\).
Female: \(\bar{x} = 82.1\), \(s = 9.2\); The coefficient of variation is \(\frac{9.2}{82.1} \cdot 100\% = 11.2\%\).
Pulse rates of males appear to vary more than pulse rates of females.

XLSTAT Output
Statistic                      Male (beats per minute)    Female (beats per minute)
Nbr. of observations           15                         15
Mean                           69.467                     82.133
Standard deviation (n-1)       11.275                     9.211
Variation coefficient (n-1)    0.162                      0.112

24. It’s a Small World: \(\bar{x} = 9.7\), \(s = 4.56\), \(\frac{4.56}{9.7} \cdot 100\% = 47.0\%\)
Avatar Flight of Passage: \(\bar{x} = 140.6\), \(s = 36.7\), \(\frac{36.7}{140.6} \cdot 100\% = 26.1\%\)
The data suggest that the two rides have a fundamental difference in variation.

XLSTAT Output
Statistic                      Small World (minutes)    Avatar (minutes)
Nbr. of observations           50                       50
Mean                           9.700                    140.600
Standard deviation (n-1)       4.564                    36.723
Variation coefficient (n-1)    0.470                    0.261
25. Range = 3.10°F, \(s^2 = 0.39\) (°F)², \(s = 0.62\)°F

XLSTAT Output
Statistic                   DAY 2 - 12AM (F)
Nbr. of observations        106
Range                       3.100
Variance (n-1)              0.388
Standard deviation (n-1)    0.623

26. Range = 3.600, \(s^2 = 0.423\), \(s = 0.651\); Including 7.0: Range = 5.910, \(s^2 = 0.455\), \(s = 0.675\); The range changes by a considerable amount, but the variance and standard deviation do not change very much.

XLSTAT Output
Statistic                      Magnitude    Magnitude w/7
Nbr. of observations           600          601
Range                          3.600        5.910
Standard deviation (n-1)       0.651        0.675
Variation coefficient (n-1)    0.253        0.262

27. Right Threshold: Range = 70.0, \(s^2 = 160.7\), \(s = 12.7\)
Left Threshold: Range = 75.0, \(s^2 = 164.9\), \(s = 12.8\)
The amounts of variation for the left and right thresholds do not appear to be very different.

XLSTAT Output
Statistic                   Right Threshold    Left Threshold
Nbr. of observations        350                350
Range                       70.000             75.000
Variance (n-1)              160.670            164.864
Standard deviation (n-1)    12.676             12.840

28. ANSUR I 1988: Range = 80.20 kg, \(s^2 = 123.35\) kg², \(s = 11.11\) kg
ANSUR II 2012: Range = 104.90 kg, \(s^2 = 202.23\) kg², \(s = 14.22\) kg
There are differences among the measures of variation, but they are not substantial, so it appears that amounts of variation have not changed very much.
TI data: ANSUR I 1988: Range = 67.10 kg, \(s^2 = 133.13\) kg², \(s = 11.54\) kg
ANSUR II 2012: Range = 89.30 kg, \(s^2 = 186.91\) kg², \(s = 13.67\) kg

XLSTAT Output
Statistic                   Weight 1998 (kg)    Weight 2012 (kg)
Nbr. of observations        4082                4082
Range                       80.200              104.900
Mean                        78.487              85.524
Variance (n-1)              123.352             202.228
Standard deviation (n-1)    11.106              14.221

29. The rule of thumb standard deviation is \(s \approx \frac{3.10}{4} = 0.78\)°F, which is not substantially different from \(s = 0.62\)°F found by using all of the data.
30. The rule of thumb standard deviation is \(s \approx \frac{3.600}{4} = 0.900\), which is not substantially different from \(s = 0.651\) found by using all of the data.
31. Right Threshold: The rule of thumb standard deviation is \(s \approx \frac{70.0}{4} = 17.5\), which is reasonably close to \(s = 12.7\) found using all of the data.
Left Threshold: The rule of thumb standard deviation is \(s \approx \frac{75.0}{4} = 18.8\), which is in the ballpark of \(s = 12.8\) found using all of the data.
32. ANSUR I 1988: The rule of thumb standard deviation is \(s \approx \frac{80.2}{4} = 20.05\) kg, which differs from \(s = 11.11\) kg by a large amount, but the estimate is in the very general ballpark of \(s = 11.11\) kg.
ANSUR II 2012: The rule of thumb standard deviation is \(s \approx \frac{104.9}{4} = 26.23\) kg, which differs from \(s = 14.22\) kg by a large amount, but the estimate is in the very general ballpark of \(s = 14.22\) kg.
(TI data: The ANSUR I 1988 estimate is 16.78 kg and \(s = 11.54\) kg. The ANSUR II 2012 estimate is 22.33 kg and \(s = 13.67\) kg.)
33. Significantly low values are less than or equal to \(55.2 - 2(6.9) = 41.4\) years, and significantly high values are greater than or equal to \(55.2 + 2(6.9) = 69.0\) years. An age of 35 years would be significantly low.
34. Significantly low values are less than or equal to \(69.6 - 2(11.3) = 47.0\) beats per minute, and significantly high values are greater than or equal to \(69.6 + 2(11.3) = 92.2\) beats per minute. A pulse rate of 50 beats per minute is neither significantly low nor significantly high.
35. Significantly low values are less than or equal to \(27.32 - 2(1.29) = 24.74\) cm, and significantly high values are greater than or equal to \(27.32 + 2(1.29) = 29.90\) cm. A foot length of 30 cm is significantly high.
36. Significantly low values are less than or equal to \(98.20 - 2(0.62) = 96.96\)°F, and significantly high values are greater than or equal to \(98.20 + 2(0.62) = 99.44\)°F. A body temperature of 100°F is significantly high.
37. \(s = \sqrt{\frac{50\left[6 \cdot 7^2 + \cdots + 1 \cdot 97^2\right] - \left[6 \cdot 7 + \cdots + 1 \cdot 97\right]^2}{50(50 - 1)}} = 20.4\) minutes, which is very close to the exact value of 18.5 minutes.
38. \(s = \sqrt{\frac{50\left[4 \cdot 79.5^2 + \cdots + 3 \cdot 219.5^2\right] - \left[4 \cdot 79.5 + \cdots + 3 \cdot 219.5\right]^2}{50(50 - 1)}} = 36.4\) minutes, which is very close to the exact value of 36.7 minutes.
39. \(s = \sqrt{\frac{39\left[2 \cdot 42^2 + \cdots + 1 \cdot 72^2\right] - \left[2 \cdot 42 + \cdots + 1 \cdot 72\right]^2}{39(39 - 1)}} = 7.1\) years, which is very close to the exact value of 6.9 years.
40. \(s = \sqrt{\frac{50\left[1 \cdot 137^2 + \cdots + 12 \cdot 262^2\right] - \left[1 \cdot 137 + \cdots + 12 \cdot 262\right]^2}{50(50 - 1)}} = 19.7\) seconds, which is very close to the exact value of 20.4 seconds.
41. a. The empirical rule states that approximately 95% of women should fall within two standard deviations of the mean.
b. Since \(\frac{189.7 - 255.1}{65.4} = -1\) and \(\frac{320.5 - 255.1}{65.4} = 1\), the empirical rule states that approximately 68% of women should fall within one standard deviation of the mean.
42. a. The empirical rule states that approximately 68% of healthy adults should fall within one standard deviation of the mean.
b. Since \(\frac{96.34 - 98.20}{0.62} = -3\) and \(\frac{100.06 - 98.20}{0.62} = 3\), the empirical rule states that approximately 99.7% of healthy adults should fall within three standard deviations of the mean.
43. At least \(1 - \frac{1}{3^2} = 89\%\) of women have platelet counts within 3 standard deviations of the mean. The minimum count is \(255.1 - 3(65.4) = 58.9\) and the maximum count is \(255.1 + 3(65.4) = 451.3\).
44. At least \(1 - \frac{1}{2^2} = 75\%\) of healthy adults have body temperatures within 2 standard deviations of the mean. The minimum temperature is \(98.20 - 2(0.62) = 96.96\)°F and the maximum temperature is \(98.20 + 2(0.62) = 99.44\)°F.
45. a. \(\mu = \frac{9 + 10 + 20}{3} = 13\) cigarettes and \(\sigma^2 = \frac{(9 - 13)^2 + (10 - 13)^2 + (20 - 13)^2}{3} = 24.7\) cigarettes²
b. The nine possible samples of two values are the following: {(9, 9), (9, 10), (9, 20), (10, 9), (10, 10), (10, 20), (20, 9), (20, 10), (20, 20)}, which have the following corresponding sample variances: {0, 0.5, 60.5, 0.5, 0, 50, 60.5, 50, 0}, which have a mean of 24.7 cigarettes².
c. The population variances of the nine samples above are {0, 0.25, 30.25, 0.25, 0, 25, 30.25, 25, 0}, which have a mean of 12.3 cigarettes².
d. Part (b), because repeated samples result in variances that target the same value (24.7 cigarettes²) as the population variance. Use division by \(n - 1\).
e. No. The mean of the sample variances (24.7 cigarettes²) equals the population variance (24.7 cigarettes²), but the mean of the sample standard deviations (3.5 cigarettes) does not equal the population standard deviation (5.0 cigarettes).
46. The mean absolute deviation of the population is 4.7 cigarettes. With repeated samplings of size 2, the nine different possible samples have mean absolute deviations of 0, 0, 0, 0.5, 0.5, 5, 5, 5.5, 5.5. With many such samples, the mean of those nine results is 2.4 cigarettes, showing that the sample mean absolute deviations tend to center about the value of 2.4 cigarettes instead of the mean absolute deviation of the population, which is 4.7 cigarettes. The sample mean deviations do not target the mean deviation of the population. This is not good. This indicates that a sample mean absolute deviation is not a good estimator of the mean absolute deviation of a population.

Section 3-3: Measures of Relative Standing and Boxplots
1. Brady’s height is 2.66 standard deviations above the mean.
2. The waiting line represented by the bottom boxplot is better because the times have much less variation, so all customers have wait times that are closer together.
3. It appears that weights of U.S. Army males increased from 1988 to 2012. All values of the 5-number summary increased, except for the minimum.
4. 2.00 should be preferred, because it is 2.00 standard deviations above the mean and would correspond to the highest of the five different possible scores.
5. a. The difference is \(98 - 70.2 = 27.8\) mm Hg.
b. \(\frac{27.8}{11.2} = 2.48\) standard deviations
c. \(z = 2.48\)   fx | =STANDARDIZE(98,70.2,11.2)
d. The diastolic blood pressure of 98 mm Hg is significantly high.
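The z score calculation in Exercise 5 can be mirrored outside Excel. The sketch below defines a small Python function in the same argument order as Excel's `STANDARDIZE(x, mean, standard_dev)`. The function name `standardize` and the `is_significant` flag are ours; the cutoff of 2 follows the "significantly low" and "significantly high" rule used throughout these solutions.

```python
def standardize(x, mean, sd):
    """z score: number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


# Exercise 5: diastolic reading of 98 mm Hg, mean 70.2, standard deviation 11.2.
z = standardize(98, 70.2, 11.2)

# Values with z <= -2 or z >= 2 are treated as significant.
is_significant = abs(z) >= 2

print(round(z, 2), is_significant)  # 2.48 True
```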
6. a. The difference is \(40 - 71.3 = -31.3\) mm Hg.
b. \(\frac{31.3}{12.0} = 2.61\) standard deviations
c. \(z = -2.61\)   fx | =STANDARDIZE(40,71.3,12.0)
d. The diastolic blood pressure of 40 mm Hg is significantly low.
7. a. The difference is \(95.0 - 42.6 = 52.4\) minutes.
b. \(\frac{52.4}{26.2} = 2.00\) standard deviations
c. \(z = 2.00\)   fx | =STANDARDIZE(95.0,42.6,26.2)
d. The commute time of 95.0 minutes is significantly high because \(z \ge 2.0\).
8. a. The difference is \(1411 - 16{,}576.1 = -15{,}165.1\) words.
b. \(\frac{15{,}165.1}{7871.5} = 1.93\) standard deviations
c. \(z = -1.93\)   fx | =STANDARDIZE(1411,16576.1,7871.5)
d. The count of 1411 words is neither significantly low nor significantly high.
9. Significantly low scores are less than or equal to \(21.1 - 2(5.2) = 10.7\), and significantly high scores are greater than or equal to \(21.1 + 2(5.2) = 31.5\). Scores that are not significant are between 10.7 and 31.5.
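The cutoffs in Exercise 9 (and in Exercises 10 through 12) follow one pattern: two standard deviations below and above the mean. A short Python sketch of that rule follows. The names `significance_limits` and `classify` are ours.

```python
def significance_limits(mean, sd):
    """Return (low, high) cutoffs located 2 standard deviations from the mean."""
    return mean - 2 * sd, mean + 2 * sd


def classify(x, mean, sd):
    """Label x as significantly low, significantly high, or not significant."""
    low, high = significance_limits(mean, sd)
    if x <= low:
        return "significantly low"
    if x >= high:
        return "significantly high"
    return "not significant"


# Exercise 9: mean 21.1, standard deviation 5.2.
low, high = significance_limits(21.1, 5.2)
print(round(low, 1), round(high, 1))  # 10.7 31.5
print(classify(33, 21.1, 5.2))
```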
10. Significantly low scores are less than or equal to \(100 - 2(15) = 70\), and significantly high scores are greater than or equal to \(100 + 2(15) = 130\). Scores that are not significant are between 70 and 130.
11. Significantly low knee heights are less than or equal to \(21.4 - 2(1.2) = 19.0\) in., and significantly high knee heights are greater than or equal to \(21.4 + 2(1.2) = 23.8\) in. Knee heights that are not significant are between 19.0 in. and 23.8 in.
12. Significantly low hip breadths are less than or equal to \(36.6 - 2(2.5) = 31.6\) cm, and significantly high hip breadths are greater than or equal to \(36.6 + 2(2.5) = 41.6\) cm. Hip breadths that are not significant are between 31.6 cm and 41.6 cm.
13. The tallest man’s z score is \(z = \frac{272 - 174.12}{7.10} = 13.79\) and the shortest man’s z score is \(z = \frac{54.6 - 174.12}{7.10} = -16.83\). Chandra Bahadur Dangi has the more extreme height because his z score of –16.83 is farther from the mean than the z score of 13.79 for Robert Wadlow.
Tallest: fx | =STANDARDIZE(272,174.12,7.10)
Shortest: fx | =STANDARDIZE(54.6,174.12,7.10)
14. The female has a higher pulse rate because her z score is \(z = \frac{99 - 74.0}{12.5} = 2.00\), which is a higher number than the z score of \(z = \frac{50 - 69.6}{11.3} = -1.73\) for the male.
Female: fx | =STANDARDIZE(99,74.0,12.5)
Male: fx | =STANDARDIZE(50,69.6,11.3)
15. The male has a more extreme birth weight because his z score is \(z = \frac{1500 - 3272.8}{660.2} = -2.69\), which is a lower number than the z score of \(z = \frac{1500 - 3037.1}{706.3} = -2.18\) for the female.
Female: fx | =STANDARDIZE(1500,3037.1,706.3)
Male: fx | =STANDARDIZE(1500,3272.8,660.2)
16. The Reese’s Cup has the more extreme weight since its z score is \(z = \frac{8.413 - 8.8637}{0.1608} = -2.80\), which is farther from the mean than the z score of \(z = \frac{4.794 - 4.5338}{0.1039} = 2.50\) for the Hershey’s Kiss.
Reese’s: fx | =STANDARDIZE(8.413,8.8637,0.1608)
Hershey’s: fx | =STANDARDIZE(4.794,4.5338,0.1039)
17. For 0.48 W/kg, \(\frac{3}{50} \cdot 100 = 6\), so it is the 6th percentile.
18. For 1.47 W/kg, \(\frac{46}{50} \cdot 100 = 92\), so it is the 92nd percentile.
19. For 1.10 W/kg, \(\frac{20}{50} \cdot 100 = 40\), so it is the 40th percentile.
20. For 0.98 W/kg, \(\frac{16}{50} \cdot 100 = 32\), so it is the 32nd percentile.
21. \(L = \frac{30 \cdot 50}{100} = 15\), so \(P_{30} = \frac{0.93 + 0.97}{2} = 0.95\) W/kg (Excel: 0.958 W/kg)
22. \(L = \frac{25 \cdot 50}{100} = 12.5\), so \(Q_1 = P_{25} = 0.91\) W/kg
23. \(L = \frac{75 \cdot 50}{100} = 37.5\), so \(Q_3 = P_{75} = 1.28\) W/kg
24. \(L = \frac{40 \cdot 50}{100} = 20\), so \(P_{40} = \frac{1.09 + 1.10}{2} = 1.095\) W/kg (Excel: 1.096 W/kg)
25. \(L = \frac{50 \cdot 50}{100} = 25\), so \(P_{50} = \frac{1.15 + 1.16}{2} = 1.155\) W/kg
26. \(L = \frac{75 \cdot 50}{100} = 37.5\), so \(Q_3 = P_{75} = 1.28\) W/kg
27. \(L = \frac{25 \cdot 50}{100} = 12.5\), so \(Q_1 = P_{25} = 0.91\) W/kg
28. \(L = \frac{85 \cdot 50}{100} = 42.5\), so \(P_{85} = 1.38\) W/kg (Excel: 1.3765 W/kg)
29. The 5-number summary is 2, 6.0, 7.0, 8.0, 10.

XLSTAT Output for Exercise 29
Statistic               Attractiveness
Nbr. of observations    20
Minimum                 2.000
Maximum                 10.000
1st Quartile            6.000
Median                  7.000
3rd Quartile            8.000

30. The 5-number summary is 3 min, 7.5 min, 13.5 min, 32 min, 62 min. (Excel yields \(Q_1 = 9.25\) min and \(Q_3 = 31.5\) min.)

XLSTAT Output for Exercise 30
Statistic               Time
Nbr. of observations    8
Minimum                 3.000
Maximum                 62.000
1st Quartile            9.250
Median                  13.500
3rd Quartile            31.500

31. The 5-number summary is 128 mBq, 140.0 mBq, 150.0 mBq, 158.5 mBq, 172 mBq. (Excel yields \(Q_1 = 141.0\) mBq and \(Q_3 = 157.25\) mBq.)
XLSTAT Output for Exercise 31
Statistic               Radiation
Nbr. of observations    20
Minimum                 128.000
Maximum                 172.000
1st Quartile            141.000
Median                  150.000
3rd Quartile            157.250

XLSTAT Output for Exercise 32
Statistic               Systolic
Nbr. of observations    14
Minimum                 120.000
Maximum                 150.000
1st Quartile            130.000
Median                  132.500
3rd Quartile            140.000

32. The 5-number summary is 120 mm Hg, 130.0 mm Hg, 132.5 mm Hg, 140.0 mm Hg, 150 mm Hg.
33. The top boxplot represents males. Males appear to have slightly lower pulse rates than females.
[Boxplots: Male Pulse, Female Pulse]

XLSTAT Output
Statistic               Pulse (female)    Pulse (male)
Nbr. of observations    147               153
Minimum                 36.000            40.000
Maximum                 104.000           104.000
1st Quartile            64.000            62.000
Median                  74.000            68.000
3rd Quartile            82.000            76.000

34. The top boxplot represents actresses. Although actresses include the oldest age of 80 years, their boxplot shows that they have ages that are generally lower than those of actors. (Excel: For actresses, \(Q_1 = 28.5\) and for actors, \(Q_3 = 49.5\).)
[Boxplots: Actresses, Actors]

XLSTAT Output
Statistic               Actress Age    Actor Age
Nbr. of observations    91             91
Minimum                 21.000         29.000
Maximum                 80.000         76.000
1st Quartile            28.500         38.000
Median                  33.000         42.000
3rd Quartile            41.000         49.500
35. The top boxplot represents the weights from ANSUR I 1988 and the bottom boxplot represents weights from ANSUR II 2012. It appears that the weights of male Army personnel increased somewhat from 1988 to 2012. (TI data: The values of the five-number summary for ANSUR I are 49.8, 70.7, 77.7, 86.0, 116.9, and for ANSUR II those values are 47.8, 75.2, 84.0, 93.2, 137.1.)
[Boxplots: ANSUR I, ANSUR II]

XLSTAT Output
Statistic               Weight 1998 (kg)    Weight 2012 (kg)
Nbr. of observations    4082                4082
Minimum                 47.600              39.300
Maximum                 127.800             144.200
1st Quartile            70.800              75.600
Median                  77.700              84.600
3rd Quartile            85.300              94.400

36. The low lead level group represented in the top boxplot has much more variation and the IQ scores tend to be higher than the IQ scores from the high lead level group. The results suggest that greater exposure to lead corresponds to lower full IQ scores (although a direct cause/effect link is not established).
[Boxplots: Low Lead, High Lead]

XLSTAT Output
Statistic               IQ (low lead)    IQ (high lead)
Nbr. of observations    78               21
Minimum                 50.000           75.000
Maximum                 141.000          104.000
1st Quartile            85.000           80.000
Median                  94.000           85.000
3rd Quartile            101.000          93.000
37. The top boxplot represents males. Males appear to have slightly lower pulse rates than females. The outliers for males are 40 beats per minute, 102 beats per minute, and 104 beats per minute. The outlier for females is 36 beats per minute.
[Modified boxplots: Males, Females]
38. The outliers for actresses are 61, 61, 62, 63, 74, 80. The outlier for actors is 76.
[Modified boxplots: Actresses, Actors]

Quick Quiz
1. The sample mean is \(\bar{x} = \frac{70 + 76 + 97 + 81 + 57 + 151 + 194 + 65 + 117 + 65 + 45 + 105}{12} = 93.6\) km/h.
2. The median is \(\frac{76 + 81}{2} = 78.5\) km/h.
3. The mode is 65 km/h.
4. The variance is (43.1 km/h)² = 1857.6 (km/h)².
5. The speed of 194 km/h appears to be an outlier because it is substantially greater than the other speeds.
6. \(z = \frac{34 - 85.9}{28.7} = -1.81\); No, the speed of 34 km/h is not significantly low.   fx | =STANDARDIZE(34,85.9,28.7)
7. About 75% or \(0.75(92) = 69\) speeds are less than \(Q_3\).
8. minimum, first quartile \(Q_1\), second quartile \(Q_2\) (or median), third quartile \(Q_3\), maximum
9. \(s \approx \frac{194 - 10}{4} = 46\) km/h
10. \(\bar{x}\), \(\mu\), \(s\), \(\sigma\), \(s^2\), \(\sigma^2\)
Review Exercises
1. The differences are given in the table below.
Reported   Measured   Difference
68         67.9       0.1
71         69.9       1.1
63         64.9       –1.9
70         68.3       1.7
71         70.3       0.7
60         60.6       –0.6
65         64.5       0.5
64         67         –3
54         55.6       –1.6
63         74.2       –11.2
66         65         1
72         70.8       1.2
a. The mean is $\bar{x} = \frac{0.1 + 1.1 + (-1.9) + 1.7 + 0.7 + (-0.6) + 0.5 + (-3) + (-1.6) + (-11.2) + 1 + 1.2}{12} = -1.00$ in.
b. The median is $\frac{0.1 + 0.5}{2} = 0.30$ in.
c. There is no mode.
d. The midrange is $\frac{-11.2 + 1.7}{2} = -4.75$ in.
e. The range is $1.7 - (-11.2) = 12.90$ in.
1. (continued)
f. $s = \sqrt{\frac{(0.1 - (-1.0))^2 + (1.1 - (-1.0))^2 + \cdots + (1 - (-1.0))^2 + (1.2 - (-1.0))^2}{12 - 1}} = 3.52$ in.
g. $s^2 = 3.52^2 = 12.39$ in.$^2$
h. $L = \frac{25}{100} \cdot 12 = 3$, so $Q_1 = \frac{-1.9 + (-1.6)}{2} = -1.75$ in. (Excel: $Q_1 = -1.675$ in.)
i. $L = \frac{75}{100} \cdot 12 = 9$, so $Q_3 = \frac{1 + 1.1}{2} = 1.05$ in. (Excel: $Q_3 = 1.025$ in.)
XLSTAT Output
Statistic                  Difference (inches)
Nbr. of observations       12
Minimum                    -11.200
Maximum                    1.700
Range                      12.900
1st Quartile               -1.675
Median                     0.300
3rd Quartile               1.025
Mean                       -1.000
Variance (n-1)             12.387
Standard deviation (n-1)   3.520
2. The difference of –11.2 in. appears to be an outlier. If that outlier is excluded, the mean changes from –1.00 in. to –0.07 in., the median changes from 0.30 in. to 0.50 in., and the standard deviation changes from 3.52 in. to 1.51 in. The outlier has a strong effect on the mean and standard deviation, but very little effect on the median.
3. $z = \frac{-11.2 - (-1.00)}{3.52} = -2.90$; The difference of –11.2 in. is significantly low (because its z score is less than or equal to –2). fx | =STANDARDIZE(-11.2,-1.00,3.52)
4. (See XLSTAT output for Exercise 1.) The 5-number summary is –11.2 in., –1.75 in., 0.30 in., 1.05 in., 1.70 in. (Excel yields $Q_1 = -1.675$ in. and $Q_3 = 1.025$ in.)
5. The mean is $\bar{x} = \frac{12 + 14 + 22 + 27 + 40}{5} = 23.0$. The numbers don’t measure or count anything. They are used as replacements for the names of the categories, so the numbers are at the nominal level of measurement. In this case the mean is a meaningless statistic.
6. Significantly low scores are less than or equal to $504.7 - 2(9.4) = 485.9$, and significantly high scores are greater than or equal to $504.7 + 2(9.4) = 523.5$.
7. The minimum value is 119 mm, the first quartile is 128 mm, the second quartile (or median) is 131 mm, the third quartile is 135 mm, and the maximum value is 141 mm.
8. Based on a minimum of 0.799 g and a maximum of 0.944 g, an estimate of the standard deviation of candy weights would be $s \approx \frac{0.944 - 0.799}{4} = 0.0363$ g, which is very close to the standard deviation of 0.0366 g.
9. $L = \frac{25}{100} \cdot 20 = 5$, so $P_{25} = Q_1 = \frac{0.870 + 0.872}{2} = 0.871$ g (Excel yields 0.8715.)
10. The male z score is $z = \frac{3400 - 3272.8}{660.2} = 0.19$. The female z score is $z = \frac{3200 - 3037.1}{706.3} = 0.23$. The female has the larger relative birth weight because the female has the larger z score. Female: fx | =STANDARDIZE(3200,3037.1,706.3) Male: fx | =STANDARDIZE(3400,3272.8,660.2)
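The z score comparison in Exercise 10 mirrors what Excel's STANDARDIZE function computes. As a minimal sketch (the helper name `z_score` is hypothetical, introduced only for this illustration), the same comparison in Python might look like this:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Values taken from the worked answer for Exercise 10 (birth weights in grams).
male_z = z_score(3400, 3272.8, 660.2)
female_z = z_score(3200, 3037.1, 706.3)

print(round(male_z, 2), round(female_z, 2))
print("female is relatively larger" if female_z > male_z else "male is relatively larger")
```

Comparing z scores rather than raw weights is what makes the two groups, with different means and standard deviations, comparable.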
Cumulative Review Exercises
1. a. quantitative b. ratio level of measurement c. continuous d. sample e. statistic
2.
Weight (mg)   Frequency
3500–3549     6
3550–3599     3
3600–3649     7
3650–3699     3
3700–3749     1
3.
4. For 3467 mg, $\frac{15}{20} \cdot 100 = 75$, so it is the 75th percentile.
5. a. The mean is $\bar{x} = \frac{3511 + 3516 + 3521 + 3531 + \cdots + 3666 + 3673 + 3678 + 3723}{20} = 3605.2$ mg.
b. The median is $\frac{3617 + 3621}{2} = 3619.0$ mg.
c. $s = \sqrt{\frac{20(260{,}016{,}141) - (72{,}103)^2}{20(20 - 1)}} = 62.4$ mg
d. $s^2 = 62.4^2 = 3895.2$ mg$^2$
e. The range is $3723 - 3511 = 212.0$ mg.
XLSTAT Output
Statistic                  Weight (mg)
Nbr. of observations       20
Range                      212.000
Median                     3619.000
Mean                       3605.150
Variance (n-1)             3895.292
Standard deviation (n-1)   62.412
6. The vertical scale does not begin at 0, so the differences among different outcomes are exaggerated.
7. No. A normal distribution would appear in a histogram as being bell-shaped, but the histogram is not bell-shaped.
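The standard deviation in Exercise 5(c) uses the shortcut formula based on the sums of x and x². The sketch below is an illustration only: it takes n = 20, Σx = 72,103 and Σx² = 260,016,141 from the worked answer and recomputes the variance and standard deviation, which can be compared against the XLSTAT output.

```python
import math

# Summary values read from the worked answer for Exercise 5(c).
n = 20
sum_x = 72_103
sum_x_sq = 260_016_141

# Shortcut formula: s^2 = [n * sum(x^2) - (sum(x))^2] / [n * (n - 1)]
variance = (n * sum_x_sq - sum_x ** 2) / (n * (n - 1))
s = math.sqrt(variance)

print(round(sum_x / n, 2), round(variance, 3), round(s, 3))
```

The manual's value of 3895.2 mg² in part (d) reflects squaring before rounding; squaring the rounded 62.4 would give a slightly smaller figure.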
8. Based on the scatterplot, there does appear to be a correlation between heights of fathers and heights of their first sons. Because the points are not very close to a straight-line pattern, the correlation does not appear to be very strong.
Chapter 4: Probability
Section 4-1: Basic Concepts of Probability
1. $P(A) = \frac{1}{10{,}000}$, or 0.0001; $P(\bar{A}) = 1 - \frac{1}{10{,}000} = \frac{9999}{10{,}000}$, or 0.9999
2. The probability of rain today is 0.25, or 1/4.
3. a. 1/6, or 0.167 b. 1/2, or 0.5 c. 0
4. The answers vary, but a high answer in the neighborhood of 0.98 is reasonable.
5. 0, 3/5, 1, 0.135
6. 1/5, or 0.2
7. 1/10, or 0.1
8. {bb, bg, gb, gg}
9. 53 girls is neither significantly low nor significantly high.
10. 35 girls is significantly low.
11. 75 girls is significantly high.
12. 48 girls is neither significantly low nor significantly high.
13. 1/2, or 0.5
14. 1/5, or 0.2
15. 0.33
16. 0.47
17. 1/10, or 0.1
18. 1/2, or 0.5
19. 0
20. 1
21. 239/291, or 0.821; Yes, the technique appears to be effective.
22. 0.17. Yes, it does appear that most adults carry some cash.
23. 428/580, or 0.738; Yes, it is reasonable.
24. a. 1/365 b. yes c. He already knew. d. 0
25. $\frac{3732}{1380 + 3732} = \frac{311}{426}$, or 0.730; No, it is not unlikely for someone to use a social networking site.
26. $\frac{83{,}600}{83{,}600 + 5{,}127{,}400} = \frac{418}{26{,}055}$, or 0.0160; Yes, in a passenger car crash, a rollover is unlikely.
27. a. brown/brown, brown/blue, blue/brown, blue/blue b. 1/4 c. 3/4
28. In the following, the first letter represents the chromosome contributed by the father and the second letter represents the chromosome contributed by the mother. Let X1 and X2 represent the possible X chromosomes contributed by the mother.
a. 0; the possible outcomes are {xX1, xX2, YX1, YX2}, neither son will have the disease.
b. 0; the possible outcomes are {xX1, xX2, YX1, YX2}, neither daughter will have the disease.
c. 1/2, or 0.5; the possible outcomes are {Xx1, XX2, Yx1, YX2}, one of the two sons will have the disease.
d. 0; the possible outcomes are {Xx1, XX2, Yx1, YX2}, neither daughter will have the disease.
29. 3/8, or 0.375
30. 3/8, or 0.375
31. {bbbb, bbbg, bbgb, bbgg, bgbb, bgbg, bggb, bggg, gbbg, gbbb, gbgb, gbgg, ggbb, ggbg, gggb, gggg}; 4/16, or 0.25
32. 2/16, or 0.125
33. Because the probability of a result such as 40 is small (less than or equal to 0.05), and because 40 is much higher than what is expected with randomness, the result of 40 is significantly high.
34. Because the probability of a result as high as 26 is 0.058638, the probability is not small (less than or equal to 0.05). The result of 26 Democrats being placed on the first line is neither significantly low nor significantly high.
35. Because the probability of a result as low as 14 is 0.029792, that probability is small (less than or equal to 0.05). The result of 14 Democrats being placed on the first line is significantly low.
36. Because the probability of a result as high as 27 is 0.029792, that probability is small (less than or equal to 0.05). The result of 27 Democrats being placed on the first line is significantly high.
37. If pregnant women have no ability to predict the genders of their babies, then among 104 predictions, we expect about half of them (or 52) to be correct. The 57 correct predictions is greater than 52. The high probability of 0.189 is greater than 0.05, so 57 correct predictions is not significantly high (and it is not significantly low). It does not appear that pregnant women can correctly predict the genders of their babies.
38. Because the probability of 0.00000000978 is so low (less than or equal to 0.05), the result of 148 is significantly low. This suggests that the claim that the rate is less than 50% is supported.
39. Because the probability of getting 604 or more respondents who have made new friends online is 0.00000306 (less than or equal to 0.05), it appears that 604 is significantly high. This suggests that the true rate is not 50%.
40. Because 0.000235 is so low (less than or equal to 0.05), and because 37 deaths is greater than half of the 49 deaths, it appears that 37 male deaths is significantly high. It appears that male selfie deaths are more likely than female selfie deaths.
41. a. 999:1 b. 499:1 c. Yes. If payoffs were made according to the actual odds, the payoff for a winning ticket would be a profit of $999 instead of only $499.
42. a. 18/38, or 0.474 b. 10:9 c. $18 d. $20
43. a. $7.80 - 2 = \$5.80$ b. 5.80:2 or, by multiplying both numbers by 5, 29:10, which is roughly 3:1. c. 0.8065:0.1935 or, by dividing both numbers by 0.1935, about 4.168:1, which is roughly 4:1. d. The worth of the $2 bet would be approximately $4.168 \cdot 2 + 2 = \$10.34$ (instead of the actual payoff of $7.80).
44. Relative risk: $\frac{26/2103}{22/1671} = 0.939$. Odds ratio: $\frac{\dfrac{26/2103}{1 - 26/2103}}{\dfrac{22/1671}{1 - 22/1671}} = 0.938$. The probability of a headache with Nasonex (0.0124) is slightly less than the probability of a headache with the placebo (0.0132), so Nasonex does not appear to pose a risk of headache.
Section 4-2: Addition Rule and Multiplication Rule
1. $P(D)$ represents the probability of selecting a smartphone with a manufacturing defect, and $P(\bar{D})$ represents the probability that the selected smartphone does not have a manufacturing defect.
2. $P(M \mid B)$ represents the probability of getting a male, given that someone with blue eyes has been selected. Because $P(B \mid M)$ is the probability of selecting someone with blue eyes given that the selected person is a male, $P(M \mid B)$ is not the same as $P(B \mid M)$.
3. Because the selections are made without replacement, the events are dependent. Because the sample size of 1068 is less than 5% of the population size of 30,488,983, the selections can be treated as being independent (based on the 5% guideline for cumbersome calculations).
4. Because the probability of getting a result of 47 or lower is only 0.000000987, it appears that 47 is significantly low. It does appear the claim is supported by the data.
5. $1 - 0.682 = 0.318$; The probability should be 0.5.
6. $1 - 0.18 = 0.82$
7. $1 - 0.13 = 0.87$
8. $P(\bar{I})$ denotes the probability of screening a driver and finding that he or she is not intoxicated, and $P(\bar{I}) = 0.99112$, or 0.991 when rounded.
Use the following table for Exercises 9–20.
                           Drove When Drinking Alcohol?
                           Yes     No      Total
Texted While Driving       731     3054    3785
No Texting While Driving   156     4564    4720
Total                      887     7618    8505
9. 887/8505, or 0.104
10. 4720/8505, or 0.555
11. $\frac{3785}{8505} + \frac{887}{8505} - \frac{731}{8505} = \frac{3941}{8505}$, or 0.463; The two events are not disjoint.
12. $\frac{4720}{8505} + \frac{887}{8505} - \frac{156}{8505} = \frac{5451}{8505}$, or 0.641; The two events are not disjoint.
13. a. $\frac{887}{8505} \cdot \frac{887}{8505} = 0.0109$; Yes, the events are independent.
b. $\frac{887}{8505} \cdot \frac{886}{8504} = 0.0109$; The events are dependent, not independent.
c. In this case, the results are the same when rounded to three significant digits, but with more significant digits, the results of 0.0108767364 and 0.0108657516 are different.
14. a. $\frac{3785}{8505} \cdot \frac{3785}{8505} = 0.198$; Yes, the events are independent.
b. $\frac{3785}{8505} \cdot \frac{3784}{8504} = 0.198$; The events are dependent, not independent.
c. In this case, the results are the same when rounded to three significant digits, but with more significant digits, the results of 0.1980537782 and 0.1980247356 are different.
15. a. $\frac{156}{887} \cdot \frac{156}{887} = 0.0309$; Yes, the events are independent.
b. $\frac{156}{887} \cdot \frac{155}{886} = 0.0308$; The events are dependent, not independent.
16. a. $\frac{156}{4720} \cdot \frac{156}{4720} \cdot \frac{156}{4720} = 0.0000361$; Yes, the events are independent.
b. $\frac{156}{4720} \cdot \frac{155}{4719} \cdot \frac{154}{4718} = 0.0000354$; The events are dependent, not independent.
17. $\frac{731}{8505} = 0.0859$
18. $\frac{4564}{8505} = 0.537$
19. $\frac{887}{8505} \cdot \frac{886}{8504} \cdot \frac{885}{8503} \cdot \frac{884}{8502} = 0.000118$
20. $\frac{3785}{8505} \cdot \frac{3784}{8504} \cdot \frac{3783}{8503} \cdot \frac{3782}{8502} = 0.0392$
21. Use the following table for parts (a) and (b).
                                Positive Test Result   Negative Test Result   Total
Subject Used Marijuana          True Positive 119      False Negative 3       122
Subject Did Not Use Marijuana   False Positive 24      True Negative 154      178
Total                           143                    157                    300
a. There was a total of 300 subjects in the study.
b. 154 subjects had a true negative result.
c. $\frac{154}{300} = \frac{77}{150} = 0.513$
22. $\frac{119 + 3 + 154}{300} = \frac{23}{25} = 0.920$
23. $\frac{119 + 24 + 154}{300} = \frac{99}{100} = 0.990$
24. $\frac{178}{300} = \frac{89}{150} = 0.593$; No, in the general population, the rate of subjects who do not use marijuana is probably much greater than 0.593 or 59.3%.
25. a. 0.0366 b. $0.0366 \cdot 0.0366 = 0.00133956$, or 0.00134 rounded
c. $0.0366 \cdot 0.0366 \cdot 0.0366 = 0.000049027896$, or 0.0000490 rounded
d. By using one drive without a backup, the probability of total failure is 0.0366, and with three independent drives, the probability drops to 0.0000490 (rounded). By changing from one drive to three, the probability of total failure drops from 0.0366 to 0.0000490, and that is a very substantial improvement in reliability. Back up your data!
26. a. $1 - 0.005 = 0.995$ b. $0.005 \cdot 0.052 = 0.000260$; $1 - 0.000260 = 0.99974$ c. $0.005 \cdot 0.052 \cdot 0.001 = 0.00000026$; $1 - 0.000000260 = 0.99999974$ d. Use all three of the alarm clocks for redundancy and improved reliability.
27. $8834 - 504 = 8330$, $\frac{8330}{8834} \cdot \frac{8329}{8833} \cdot \frac{8328}{8832} = 0.838$; The probability of 0.838 is high, so it is likely that the entire batch will be accepted, even though it includes many firmware defects.
28. $400 - 8 = 392$, $\frac{392}{400} \cdot \frac{391}{399} \cdot \frac{390}{398} = 0.941$; It is likely that the entire lot would be accepted.
29. a. $\frac{47{,}637}{47{,}637 + 111{,}874} = \frac{47{,}637}{159{,}511} = 0.299$
b. Using the 5% guideline for cumbersome calculations, $0.299^5 = 0.00239$. Using exact probabilities, $\frac{47{,}637}{159{,}511} \cdot \frac{47{,}636}{159{,}510} \cdot \frac{47{,}635}{159{,}509} \cdot \frac{47{,}634}{159{,}508} \cdot \frac{47{,}633}{159{,}507} = 0.00238$.
30. Using the 5% guideline for cumbersome calculations, $\left(\frac{47{,}637 - 188}{47{,}637}\right)^{40} = 0.854$.
31. a. $0.985 \cdot 0.985 + 0.985 \cdot 0.015 + 0.015 \cdot 0.985 = 0.999775$ b. $0.985 \cdot 0.985 = 0.970225$ c. The series arrangement provides better protection.
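The acceptance-sampling answers in Exercises 27 and 28 multiply conditional probabilities for selections made without replacement. A minimal sketch of that calculation follows; the function name `prob_all_good` and its parameters are hypothetical names introduced for this illustration, and exact fractions are used to avoid rounding until the end.

```python
from fractions import Fraction

def prob_all_good(total, defective, sample_size):
    """P(every sampled item is good) when sampling without replacement."""
    good = total - defective
    p = Fraction(1)
    for i in range(sample_size):
        # After i good items are removed, good - i remain among total - i.
        p *= Fraction(good - i, total - i)
    return p

p27 = float(prob_all_good(8834, 504, 3))  # Exercise 27: batch of 8834, 504 defects
p28 = float(prob_all_good(400, 8, 3))     # Exercise 28: lot of 400, 8 defects

print(round(p27, 3), round(p28, 3))
```

Each factor drops both numerator and denominator by one, which is exactly why the events are dependent rather than independent.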
32. $\frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{341}{365} = 0.431$
33. a. $P(A \text{ or } B) = P(A) + P(B) - 2P(A \text{ and } B)$
b. $\frac{3785}{8505} + \frac{887}{8505} - 2 \cdot \frac{731}{8505} = \frac{3210}{8505}$, or 0.377
34. $P(\overline{A \text{ or } B}) = \frac{4564}{8505}$, or 0.537 and $P(\bar{A} \text{ or } \bar{B}) = \frac{7774}{8505}$, or 0.914; The results are different. In general, $P(\overline{A \text{ or } B})$ is not the same as $P(\bar{A} \text{ or } \bar{B})$.
Section 4-3: Complements, Conditional Probability, and Bayes’ Theorem
1. $\bar{A}$ is the event of not getting at least 1 defect among the 4 calculators, which means that all 4 calculators are good.
2. Parts (a) and (b) are correct.
3. The probability that the polygraph indicated a lie given that the subject did tell a lie.
4. Confusion of the inverse is to think that $P(L \mid Y) = P(Y \mid L)$ or to switch one of those values for the other. That is, confusion of the inverse is to think that the following two probabilities are equal or to incorrectly use one of them for the other: (1) the probability that the subject told a lie given that the polygraph indicated a lie; (2) the probability that the polygraph indicated a lie given that the subject told a lie.
5. $1 - \left(\frac{1}{2}\right)^4 = \frac{15}{16}$, or 0.938
6. 1/2, or 0.5
7. a. $1 - 0.512 = 0.488$ b. $(0.512)^3 = 0.134$ c. $(0.488)^3 = 0.116$ d. $1 - (0.512)^3 = 0.866$
8. a. $1 - 0.526 = 0.474$ b. $(0.474)^4 = 0.0505$ c. $(0.526)^4 = 0.0765$ d. $1 - (0.526)^4 = 0.923$
9. a. $(1 - 0.0001)^{10} = 0.999$ b. $1 - 0.999 = 0.00100$
10. $1 - (1 - 0.20)^{10} = 0.893$; There is a good chance of continuing.
11. $1 - (1 - 0.20)^{15} = 0.965$; The probability is high enough so that she can be reasonably sure of getting a defect for her work.
12. $1 - (1 - 0.31)^5 = 0.844$; With a probability of 0.844, the researcher can be reasonably confident that the five road crash deaths will include at least one that involved alcohol. Don’t drink and drive!
Use the following table for Exercises 13–16.
                               Purchased Gum   Kept the Money   Total
Students Given Four Quarters   27              16               43
Students Given a $1 Bill       12              34               46
Total                          39              50               89
13. a. $P(\text{spent money} \mid \text{given quarters}) = \frac{27}{43}$, or 0.628
b. $P(\text{kept money} \mid \text{given quarters}) = \frac{16}{43}$, or 0.372
c. It appears that when students are given four quarters, they are more likely to spend the money than keep it.
14. a. $P(\text{spent money} \mid \text{given dollar bill}) = \frac{12}{46} = \frac{6}{23}$, or 0.261
b. $P(\text{kept money} \mid \text{given dollar bill}) = \frac{34}{46} = \frac{17}{23}$, or 0.739
c. It appears that when students are given a $1 bill, they are more likely to keep the money than spend it.
15. a. $P(\text{spent money} \mid \text{given quarters}) = \frac{27}{43}$, or 0.628
b. $P(\text{spent money} \mid \text{given dollar bill}) = \frac{12}{46} = \frac{6}{23}$, or 0.261
c. It appears that students are more likely to spend the money when given four quarters than when given a $1 bill.
16. a. $P(\text{kept money} \mid \text{given quarters}) = \frac{16}{43}$, or 0.372
b. $P(\text{kept money} \mid \text{given dollar bill}) = \frac{34}{46} = \frac{17}{23}$, or 0.739
c. It appears that students are more likely to spend the money when given four quarters than when given a $1 bill.
Use the following table for Exercises 17–20.
                                                    Subject Did Not Lie   Subject Lied   Total
Polygraph indicated that the subject lied.          15                    42             57
Polygraph indicated that the subject did not lie.   32                    9              41
Total                                               47                    51             98
17. $P(\text{positive result} \mid \text{did not lie}) = \frac{15}{47}$, or 0.319; This is the probability of the polygraph making it appear that the subject lied when the subject did not lie, so the subject would be unfairly characterized as a liar.
18. $P(\text{negative result} \mid \text{lied}) = \frac{9}{51}$, or 0.176; The subject appears to be truthful when telling a lie, so the subject would receive undeserved credibility.
19. $P(\text{lied} \mid \text{positive result}) = \frac{42}{57}$, or 0.737; To be truly effective, the probability should be much higher. There is not sufficient evidence to conclude that the polygraph is effective.
20. $P(\text{did not lie} \mid \text{negative result}) = \frac{32}{41}$, or 0.780; To be truly effective, the probability should be much higher. There is not sufficient evidence to conclude that the polygraph is effective.
21. a. $1 - (0.0122)^2 = 0.999851$ b. $1 - (0.0122)^3 = 0.999998$; The usual round-off rule for probabilities would result in a probability of 1.00, which would incorrectly indicate that we are certain to have at least one working hard drive.
22. $1 - 0.22^3 = 0.989$; The result is quite high, but there is roughly a 1% chance that in the event of a power failure, none of the backup generators will work. With the possibility of an outage during a major event with thousands of spectators and millions of television viewers (as happened in the Superdome at Super Bowl XLVII), it would be wise for manufacturers to improve the reliability of the backup generators so that the 22% failure rate is lowered and the probability of 0.989 is increased.
23. $1 - (1 - 0.126)^5 = 0.490$; The probability is not low, so further testing of the individual samples will be necessary for about 49% of the combined samples.
24. $1 - (1 - 0.005)^{10} = 0.0489$; The probability is quite low, indicating that further testing of the individual samples will be necessary for about 5% of the combined samples.
25. $1 - \frac{365}{365} \cdot \frac{364}{365} \cdot \frac{363}{365} \cdots \frac{341}{365} = 1 - 0.431 = 0.569$
26. The event “one of the coins turned up heads” includes the case in which both coins turned up heads. The sample space is {HH, HT, TH}, so the probability that both coins turned up heads is 1/3.
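The birthday calculation in Exercise 25 is a product of 25 factors running from 365/365 down to 341/365, followed by the rule of complements. The loop below is a small sketch of that product, assuming 365 equally likely birthdays and a group of 25 people as in the worked answer; the function name is hypothetical.

```python
def prob_no_shared_birthday(people, days=365):
    """P(all birthdays are different), assuming equally likely, independent birthdays."""
    p = 1.0
    for i in range(people):
        # The (i + 1)-th person must avoid the i birthdays already taken.
        p *= (days - i) / days
    return p

p_none = prob_no_shared_birthday(25)
p_at_least_one_match = 1 - p_none  # rule of complements

print(round(p_none, 3), round(p_at_least_one_match, 3))
```

Working with the complement is far easier than adding up the probabilities of exactly one match, exactly two matches, and so on.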
Section 4-4: Counting
1. The symbol ! is the factorial symbol that represents the product of decreasing whole numbers, as in $5! = 5 \cdot 4 \cdot 3 \cdot 2 \cdot 1 = 120$. Five NBA players can stand in line 120 different ways.
2. Combinations, because order does not count and six numbers are selected (from 1 to 35) without replacement.
3. Because repetition is allowed, numbers are selected with replacement, so the combinations rule and the two permutation rules do not apply. The multiplication counting rule can be used to show that the number of possible outcomes is $10 \cdot 10 \cdot 10 = 1000$.
4. No, because order counts. The “combination” of 8 – 21 – 8 will not work if it is entered as 8 – 8 – 21. (Also, because repetition of numbers is allowed, the numbers can be selected with replacement, so the combinations rule does not apply.)
5. $\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = \frac{1}{10{,}000}$
6. $\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = \frac{1}{100{,}000}$
7. There are ${}_{20}C_2 = \frac{20!}{(20 - 2)!\,2!} = 190$ ways to choose a quinela. The probability is 1/190, or 0.00526. fx | =COMBIN(20,2)
8. ${}_{11}C_5 = \frac{11!}{(11 - 5)!\,5!} = 462$; $5! = 120$ fx | =COMBIN(11,5) and fx | =FACT(5)
9. $\frac{10!}{2!\,3!\,3!} = 50{,}400$; The probability is 1/50,400. fx | =FACT(10)/(FACT(2)*FACT(3)*FACT(3))
10. For three additional letters, there are $2 \cdot 26 \cdot 26 \cdot 26 = 35{,}152$ possibilities, for two additional letters, there are $2 \cdot 26 \cdot 26 = 1352$ possibilities, for a total of $35{,}152 + 1352 = 36{,}504$ possibilities.
11. There are ${}_{50}P_4 = \frac{50!}{(50 - 4)!} = 5{,}527{,}200$ possible routes. The probability is 1/5,527,200. fx | =PERMUT(50,4)
12. $\frac{10!}{3!} = 604{,}800$ fx | =FACT(10)/FACT(3)
13. $\frac{1}{100 \cdot 100 \cdot 100 \cdot 100} = \frac{1}{100{,}000{,}000}$; No, there are too many different possibilities.
14. ${}_{5}C_2 = \frac{5!}{(5 - 2)!\,2!} = 10$ fx | =COMBIN(5,2)
15. $\frac{6!}{2!} = 360$; The correct unscrambling is RHYTHM: $P(\text{RHYTHM}) = \frac{1}{360}$. fx | =FACT(6)/FACT(2)
16. $4 \cdot 4 \cdot 4 = 64$
17. $\frac{1}{{}_{69}C_5 \cdot 26} = \frac{1}{292{,}201{,}338}$ fx | =COMBIN(69,5)*26
18. $4! = 24$; The probability is 1/24. fx | =FACT(4)
19. $10 \cdot 10 \cdot 10 \cdot 10 \cdot 10 = 100{,}000$; The probability is 1/100,000, or 0.00001.
20. $\frac{1}{{}_{15}C_5} = \frac{1}{3003}$, or 0.000333; It is likely the company actually fired the oldest employees. fx | =COMBIN(15,5)
21. There are $8 \cdot 10 \cdot 10 - 8 \cdot 1 \cdot 1 = 792$ possible area codes and 792 possible exchange numbers. There are $792 \cdot 792 \cdot 10 \cdot 10 \cdot 10 \cdot 10 = 6{,}272{,}640{,}000$ possible phone numbers. Yes. (With a total population of about 400,000,000, there would be roughly 16 phone numbers for every adult and child.)
22. $\frac{11!}{4!\,4!\,2!} = 34{,}650$; The probability is 1/34,650. fx | =FACT(11)/(FACT(4)*FACT(4)*FACT(2))
23. a. ${}_{15}P_5 = \frac{15!}{(15 - 5)!} = 360{,}360$ fx | =PERMUT(15,5)
b. ${}_{15}C_5 = \frac{15!}{(15 - 5)!\,5!} = 3003$ fx | =COMBIN(15,5)
c. The probability is 1/3003.
24. a. 1/4, or 0.25 b. $\frac{3}{4} \cdot \frac{1}{4} = \frac{3}{16}$, or 0.188 c. Trick question. There is no finite number of attempts, because you could continue to get the wrong position every time.
25. a. $2^{20} = 1{,}048{,}576$ fx | =2^20
b. ${}_{20}C_{10} = \frac{20!}{(20 - 10)!\,10!} = 184{,}756$ fx | =COMBIN(20,10)
c. $\frac{184{,}756}{1{,}048{,}576} = 0.176$ fx | =COMBIN(20,10)/2^20
d. With a probability of 0.176, the result is common, but it should not happen consistently.
26. a. $\frac{1}{10^{16}} = \frac{1}{10{,}000{,}000{,}000{,}000{,}000}$ b. $\frac{1}{10^{12}} = \frac{1}{1{,}000{,}000{,}000{,}000}$ c. $\frac{1}{10^{8}} = \frac{1}{100{,}000{,}000}$; The number of possibilities (100,000,000) is still quite large, so there is no reason to worry (unless the information is taken from your credit card or is hacked from the Internet).
27. $\frac{16!}{2!\,2!\,2!\,2!\,2!} = 653{,}837{,}184{,}000$ fx | =FACT(16)/(FACT(2)*FACT(2)*FACT(2)*FACT(2)*FACT(2))
28. a. ${}_{16}P_{14} = \frac{16!}{(16 - 14)!} = 10{,}461{,}394{,}944{,}000$; (Most calculators will give a result in scientific notation, so an answer such as 10,461,395,000,000 is OK.) fx | =PERMUT(16,14)
b. ${}_{16}C_{14} = \frac{16!}{(16 - 14)!\,14!} = 120$ fx | =COMBIN(16,14)
c. The probability is 1/120.
29. a. $\frac{1}{{}_{70}C_5 \cdot 25} = \frac{1}{302{,}575{,}350}$ fx | =COMBIN(70,5)*25
b. There is a much better chance of being struck by lightning.
c. The probability for the old Mega Millions game was $\frac{1}{{}_{75}C_5 \cdot 15} = \frac{1}{258{,}890{,}850}$. The current Mega Millions game has a substantially lower probability of winning when compared to the old Mega Millions game.
30. $9 \cdot 20 \cdot 30 \cdot 10 \cdot 20 \cdot 30 \cdot 10 \cdot 20 \cdot 20 \cdot 10 \cdot 10 = 12{,}960{,}000{,}000{,}000$; The number of different possible MBIs is substantially larger than the United States population, so there is no danger of running out of MBIs. The number of different possible MBIs is substantially greater than 1,000,000,000, which is the number of different possible Social Security Numbers. A major advantage of the new MBIs is that Social Security Numbers become much more secure because they are no longer being used for the widely circulated Medicare cards.
31. There are $2 + 2 \cdot 2 + 2 \cdot 2 \cdot 2 + 2 \cdot 2 \cdot 2 \cdot 2 + 2 \cdot 2 \cdot 2 \cdot 2 \cdot 2 = 2 + 4 + 8 + 16 + 32 = 62$ different possible characters. The alphabet requires 26 characters and there are 10 digits, so the Morse code system is more than adequate.
32. $\frac{{}_{2}C_1}{{}_{8}C_4} = \frac{2}{70} = \frac{1}{35}$ fx | =COMBIN(2,1)/COMBIN(8,4)
33. 12 ways: {25p, 1n 20p, 2n 15p, 3n 10p, 4n 5p, 5n, 1d 15p, 1d 1n 10p, 1d 2n 5p, 1d 3n, 2d 5p, 2d 1n} (Note: 25p represents 25 pennies, etc.)
34. You can touch five fingers, four fingers, three fingers or two fingers in $\frac{5!}{5!\,0!} + \frac{5!}{4!\,1!} + \frac{5!}{3!\,2!} + \frac{5!}{2!\,3!} = 1 + 5 + 10 + 10 = 26$ different ways. fx | =COMBIN(5,5)+COMBIN(5,4)+COMBIN(5,3)+COMBIN(5,2)
35. a. $\frac{1}{{}_{25}C_6} = \frac{1}{177{,}100}$ fx | =COMBIN(25,6)
b. $\frac{\$177{,}100}{2} = \$88{,}550$
c. No, because the jackpot is too small.
36. a. $\frac{1}{{}_{50}C_6} = \frac{1}{15{,}890{,}700}$ fx | =COMBIN(50,6)
b. $\frac{\$15{,}890{,}700}{2} = \$7{,}945{,}350$
c. With a jackpot of several million dollars, the lottery has a reasonable chance of being appealing.
37. $26 + 26 \cdot 36 + 26 \cdot 36^2 + 26 \cdot 36^3 + 26 \cdot 36^4 + 26 \cdot 36^5 + 26 \cdot 36^6 + 26 \cdot 36^7 = 2{,}095{,}681{,}645{,}538$, or about 2 trillion.
38. a. ${}_{10}C_2 = 45$ fx | =COMBIN(10,2)
b. ${}_{n}C_2 = \frac{n(n - 1)}{2}$
c. $9! = 362{,}880$ fx | =FACT(9)
d. $(n - 1)!$
39. $\frac{{}_{20}C_{10}}{{}_{80}C_{10}} = \frac{184{,}756}{1{,}646{,}492{,}110{,}120} = 0.000000112$ fx | =COMBIN(20,10)/COMBIN(80,10)
40. $60! \approx \sqrt{2\pi(60)} \left(\frac{60}{e}\right)^{60} = 8.309438315 \times 10^{81}$, which is very close to the exact value. fx | =SQRT(2*pi()*60)*(60/exp(1))^60
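The Excel functions COMBIN, PERMUT, and FACT used throughout this section have direct counterparts in Python's standard `math` module (`math.comb` and `math.perm` require Python 3.8 or later). The sketch below re-derives a few of the counts above and compares Stirling's approximation from Exercise 40 with the exact value of 60!; it is a cross-check under those assumptions, not part of the manual's method.

```python
import math

quinela_ways = math.comb(20, 2)   # Exercise 7: same count as =COMBIN(20,2)
routes = math.perm(50, 4)         # Exercise 11: same count as =PERMUT(50,4)
arrangements = math.factorial(10) // (
    math.factorial(2) * math.factorial(3) * math.factorial(3)
)                                 # Exercise 9: 10!/(2!3!3!)

# Exercise 40: Stirling's approximation n! ~ sqrt(2*pi*n) * (n/e)^n for n = 60.
n = 60
stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
ratio = stirling / math.factorial(n)  # close to, but slightly below, 1

print(quinela_ways, routes, arrangements)
print(f"{stirling:.4e}", round(ratio, 4))
```

The ratio shows the approximation slightly underestimates 60!, with a relative error of roughly 1/(12n).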
Section 4-5: Simulations for Hypothesis Tests
1. No. The generated numbers between 2 and 12 would be equally likely, but they are not equally likely with actual dice.
2. Generate one random integer between 1 and 6, then generate a second random integer between 1 and 6, then add the two results.
3. Yes, it does. Each of the 365 birthdays has the same chance of being selected, and the cards are replaced, so it is possible to select the same birthday more than once.
4. If 40 of the 100 samples have means that are at least as extreme as the sample mean, then the sample mean is not significantly low or significantly high, so there is not strong evidence against the assumed mean of 98.6°F. It appears that the population mean could be 98.6°F.
5. Randomly generate 50 integers, with each integer between 1 and 100. Consider the numbers 1 through 95 to be adults who recognize the brand name of McDonald’s, while the numbers 96 through 100 represent adults who do not recognize McDonald’s.
6. Randomly generate 15 integers between 1 and 10, and consider all 1s to be left-handed people, while the integers 2 through 10 are people who are not left-handed.
7. Randomly generate an integer between 1 and 1000 inclusive. Consider an outcome of 1 through 640 to be a pass that was caught and consider an outcome between 641 and 1000 to be a pass that was not caught.
8. Randomly generate 20 integers between 1 and 4 inclusive, and consider an outcome of 1 or 2 or 3 to be a pea with a green pod, and consider an outcome of 4 to be a pea with a yellow pod.
9. Answers vary, but here is a typical result: Among 100 generated samples, the sample mean of $\bar{x} = 97.49$°F or lower never occurred, so the conclusions are essentially the same as in Example 1: With the assumption that the mean body temperature is 98.6°F, we have found that the sample mean of 97.49°F is highly unlikely and is significantly low. Because we did get the sample mean of 97.49°F from Data Set 5, we have strong evidence suggesting that the assumed population mean of 98.6°F is likely to be wrong.
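The simulation recipes in Exercises 5 through 8 all follow the same pattern: generate random integers in a range and map certain values to the outcome of interest. As one hedged illustration, the sketch below follows the Exercise 8 recipe (20 integers from 1 to 4, with 4 standing for a yellow pod). The function name and the fixed seed are assumptions added here so the run is repeatable; they are not part of the exercise.

```python
import random

def simulate_pods(n_peas=20, seed=0):
    """Simulate pea pods: integers 1-3 mean a green pod, 4 means a yellow pod."""
    rng = random.Random(seed)  # seeded generator so results can be reproduced
    draws = [rng.randint(1, 4) for _ in range(n_peas)]  # inclusive on both ends
    yellow = sum(1 for d in draws if d == 4)
    return draws, yellow

draws, yellow = simulate_pods()
print(draws)
print("yellow pods:", yellow, "green pods:", len(draws) - yellow)
```

Changing the range and the success values adapts the same sketch to the other exercises, for example integers 1 to 10 with 1 as a left-handed person in Exercise 6.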
10. Answers vary, but here is a typical result: Among 100 generated samples, the sample mean of $\bar{x} = 98.5$°F or lower occurred 15 times, so 98.5°F is not significantly low. It appears that the sample mean of 98.5°F could easily occur with a population mean of 98.6°F, so there isn’t strong evidence against 98.6°F as the population mean.
11. Sample statistics: $n = 15$, $\bar{x} = 62.7$ seconds, $s = 19.5$ seconds; Generate random samples from a normally distributed population with the assumed mean of 60 seconds, a standard deviation of $s = 19.5$ seconds, and a sample size of $n = 15$. Answers vary, but here is a typical result: Among 100 generated samples, the sample mean of $\bar{x} = 62.7$ seconds or higher occurred 33 times, so 62.7 seconds is not significantly high. The sample mean of 62.7 seconds could easily occur with a population mean of 60 seconds, so there isn’t strong evidence against 60 seconds as the population mean.
12. Sample statistics: $n = 10$, $\bar{x} = 97.1$, $s = 14.7$; Generate random samples from a normally distributed population with the assumed mean of 100, a standard deviation of $s = 14.7$, and a sample size of $n = 10$. Answers vary, but here is a typical result: Among 100 generated samples, the sample mean of $\bar{x} = 97.1$ or lower occurred 35 times, so 97.1 is not significantly low. The sample mean of 97.1 could easily occur with a population mean of 100, so there isn’t strong evidence against 100 as the mean IQ score of the population.
13. With switching: $P(\text{win}) = \frac{2}{3}$; With sticking, $P(\text{win}) = \frac{1}{3}$
14. a. Answers will vary. The probability is 0.970. b. Answers will vary. The probability is 0.127.
15. The reasoning is not correct. The proportion of girls will not increase.
Quick Quiz
1. 1/5, or 0.2
2. 4/5, or 0.8
3. 1/365
4. $\frac{1}{10} \cdot \frac{1}{10} = \frac{1}{100}$, or 0.01
5. Answer varies, but the probability should be somewhat low, such as 0.02.
Use the following table for Exercises 6–10.
                    Developed Flu   Did Not Develop Flu   Total
Vaccine Treatment   14              1056                  1070
Placebo             95              437                   532
Total               109             1493                  1602
6. 109/1602, or 0.0680
7. $\frac{14 + 1056 + 95}{1602} = \frac{1165}{1602}$, or $\frac{109}{1602} + \frac{1070}{1602} - \frac{14}{1602} = \frac{1165}{1602}$, or 0.727
8. $\frac{14}{1602} = \frac{7}{801} = 0.00874$
9. $\frac{109}{1602} \cdot \frac{108}{1601} = 0.00459$
10. $P(\text{developed flu} \mid \text{given vaccine}) = \frac{14}{1070} = \frac{7}{535}$, or 0.0131
Review Exercises
Use the following table for Exercises 1–10.
         Writes with Left Hand?
         Yes    No     Total
Male     23     217    240
Female   65     455    520
Total    88     672    760
1. $\frac{520}{760} = 0.684$, which does not appear to be reasonably close to the proportion of females in the general population. It does not seem that the study subjects were randomly selected from the general population.
2. $P(\text{left hand} \mid \text{female}) = \frac{65}{520} = 0.125$
3. $P(\text{female} \mid \text{left hand}) = \frac{65}{88} = 0.739$
4. $\frac{88}{760} + \frac{520}{760} - \frac{65}{760} = \frac{543}{760} = 0.714$
5. $\frac{88}{760} + \frac{240}{760} - \frac{23}{760} = \frac{305}{760} = 0.401$
6. $\frac{88}{760} \cdot \frac{87}{759} = 0.0133$
7. $\frac{88}{760} \cdot \frac{88}{760} = 0.0134$
8. $\bar{L}$ is the event of randomly selecting one of the study subjects and getting someone who does not write with their left hand, so $P(\bar{L}) = 1 - \frac{88}{760} = \frac{672}{760} = 0.884$.
9. $\bar{M}$ is the event of randomly selecting one of the study subjects and getting someone who is not male, so $P(\bar{M}) = 1 - \frac{240}{760} = \frac{520}{760} = 0.684$.
10. $\frac{88}{760} \cdot \frac{87}{759} \cdot \frac{86}{758} = 0.00151$; Yes, because the probability of getting three lefties is so small.
11. $\frac{15}{65} \cdot \frac{14}{64} \cdot \frac{13}{63} \cdot \frac{12}{62} = 0.00202$; Because that probability is so low, it is very unlikely that the seats were randomly assigned.
12. a. $1 - 0.75 = 0.25$, or 25%
b. $0.75 \cdot 0.75 \cdot 0.75 \cdot 0.75 = 0.316$
c. A result of x successes among n trials is a significantly high number of successes if the probability of x or more successes is unlikely with a probability of 0.05 or less. That is, x is a significantly high number of successes if $P(x \text{ or more}) \le 0.05$. (The value 0.05 is not absolutely rigid. Other values, such as 0.01, could be used to distinguish between results that are significant and those that are not significant.)
d. No, the probability of getting all four people using vision correction is 0.316, which is not unlikely with a small probability such as 0.05 or less. Because the probability of four people using vision correction is so high, that event can easily occur and it is not a significant event.
13. a. 1/365 b. 31/365 c. Answers will vary, but it is probably quite small, such as 0.01 or less. d. yes
14. $1 - \left(1 - \frac{4.3}{1000}\right)^{10} = 1 - \left(1 - \frac{43}{10{,}000}\right)^{10} = 0.0422$; No, it is not likely. fx | =1-(1-4.3/1000)^10
15. $\frac{1}{{}_{35}C_4 \cdot 35} = \frac{1}{1{,}832{,}600}$; No, the jackpot seems disproportionately small given the probability of winning it. But then, no lotteries are fair. fx | =COMBIN(35,4)*35
Cumulative Review Exercises
1. a. The mean is $\bar{x} = \frac{15.53 + 7.27 + 4.70 + 4.50 + 3.44 + 5.70 + \cdots + 4.05 + 4.46}{12} = 6.919$ rnfl.
b. The median is $\frac{5.7 + 7.27}{2} = 6.485$ rnfl.
c. The midrange is $\frac{3.44 + 15.53}{2} = 9.485$ rnfl.
d. The range is $15.53 - 3.44 = 12.090$ rnfl.
e. $s = \sqrt{\frac{(15.53 - 6.919)^2 + (7.27 - 6.919)^2 + \cdots + (4.05 - 6.919)^2 + (4.46 - 6.919)^2}{12 - 1}} = 3.400$ rnfl
f. $s^2 = 3.400^2 = 11.558$ rnfl$^2$
XLSTAT Output
Statistic                  Rainfall (rnfl)
Nbr. of observations       12
Range                      12.090
Median                     6.485
Mean                       6.919
Variance (n-1)             11.558
Standard deviation (n-1)   3.400
2. a. The 5-number summary is 3.44 rnfl, 4.480 rnfl, 6.485 rnfl, 7.845 rnfl, 15.53 rnfl.
XLSTAT Output
Statistic      Rainfall (rnfl)
Minimum        3.440
Maximum        15.530
1st Quartile   4.490
Median         6.485
3rd Quartile   7.648
b. (figure omitted) c. The value of 15.53 rnfl appears to be an outlier.
3. a. $\frac{2346}{5100} = 0.46 = 46\%$ b. 0.460 c. stratified sample
4. a. convenience sample b. If the students at the college are mostly from a surrounding region that includes a large proportion of one ethnic group, the results will not reflect the general population of the United States. c. $0.35 + 0.4 = 0.75$ d. $1 - 0.6^2 = 0.64$
5.
Based on the scatterplot, it is reasonable to conclude that there is no association between heights of presidents and the heights of the presidential candidates who were runners-up. It is also reasonable to conclude that there is a very weak association with increasing heights of winners corresponding to decreasing heights of runners-up. (More objective criteria will be introduced in Chapter 10.)
Chapter 5: Discrete Probability Distributions
Section 5-1: Probability Distributions
1. The random variable is x, which is the number of unlicensed software packages. The possible values of x are 0, 1, 2, 3, and 4. The values of the random variable x are numerical.
2. The random variable is discrete if it has a finite number of values or a countable number of values. Here, the random variable is discrete because it has five possible values, and 5 is a finite number.
3. $\sum P(x) = 0.008 + 0.076 + 0.265 + 0.412 + 0.240 = 1.001$; the sum is not exactly 1 because of a round-off error. The sum is close enough to 1 to satisfy the requirement. Also, the variable x is a numerical random variable and its values are associated with probabilities, and each of the probabilities is between 0 and 1 inclusive, as required. The table does describe a probability distribution.
4. The probability of 0.136 is relevant; 56 is not significantly high because the probability of 56 or more girls is 0.136, which is not small, such as 0.05 or less. With random chance, it is likely that the outcome could be 56 or more girls.
5. a. discrete random variable b. continuous random variable c. discrete random variable d. not a random variable e. discrete random variable
6. a. discrete random variable b. not a random variable c. discrete random variable d. not a random variable e. continuous random variable
7. Not a probability distribution because the causes are not values of a numerical random variable.
8. Not a probability distribution because the sum of the probabilities is 0.617, which is not 1 as required.
9. Probability distribution with $\mu = 0 \cdot 0.104 + 1 \cdot 0.351 + 2 \cdot 0.396 + 3 \cdot 0.149 = 1.6$
$\sigma = \sqrt{(0 - 1.6)^2 \cdot 0.104 + (1 - 1.6)^2 \cdot 0.351 + (2 - 1.6)^2 \cdot 0.396 + (3 - 1.6)^2 \cdot 0.149} = 0.9$
10. Probability distribution with $\mu = 0 \cdot 0.031 + 1 \cdot 0.156 + 2 \cdot 0.313 + 3 \cdot 0.313 + 4 \cdot 0.156 + 5 \cdot 0.031 = 2.5$
$\sigma = \sqrt{(0 - 2.5)^2 \cdot 0.031 + (1 - 2.5)^2 \cdot 0.156 + \cdots + (4 - 2.5)^2 \cdot 0.156 + (5 - 2.5)^2 \cdot 0.031} = 1.1$
11. Not a probability distribution because the responses are not values of a numerical random variable.
12. This is not a probability distribution because the responses are not values of a numerical random variable.
13. Probability distribution with $\mu = 0 \cdot 0.023 + 1 \cdot 0.145 + 2 \cdot 0.340 + 3 \cdot 0.354 + 4 \cdot 0.138 = 2.4$
$\sigma = \sqrt{(0 - 2.4)^2 \cdot 0.023 + (1 - 2.4)^2 \cdot 0.145 + (2 - 2.4)^2 \cdot 0.340 + (3 - 2.4)^2 \cdot 0.354 + (4 - 2.4)^2 \cdot 0.138} = 1.0$
14. Probability distribution with $\mu = 0 \cdot 0.358 + 1 \cdot 0.439 + 2 \cdot 0.179 + 3 \cdot 0.024 = 0.9$
$\sigma = \sqrt{(0 - 0.9)^2 \cdot 0.358 + (1 - 0.9)^2 \cdot 0.439 + (2 - 0.9)^2 \cdot 0.179 + (3 - 0.9)^2 \cdot 0.024} = 0.8$
15. $\mu = 0 \cdot 0.656 + 1 \cdot 0.292 + 2 \cdot 0.049 + 3 \cdot 0.004 + 4 \cdot 0 = 0.4$ matches
$\sigma = \sqrt{(0 - 0.4)^2 \cdot 0.656 + (1 - 0.4)^2 \cdot 0.292 + (2 - 0.4)^2 \cdot 0.049 + (3 - 0.4)^2 \cdot 0.004 + (4 - 0.4)^2 \cdot 0} = 0.6$ matches
16. Significantly low numbers of matches are less than or equal to $\mu - 2\sigma = 0.4 - 2(0.6) = -0.8$ matches. Because 0 matches is not less than or equal to –0.8 matches, it is not a significantly low number of matches.
17. Significantly high numbers of matches are greater than or equal to $\mu + 2\sigma = 0.4 + 2(0.6) = 1.6$ matches. Because 4 matches is greater than or equal to 1.6 matches, it is a significantly high number of matches.
18. a. $P(X = 2) = 0.049$ b. $P(X \ge 2) = 0.049 + 0.004 + 0 = 0.053$ c. The probability from part (b), since it is the probability of the given or more extreme result. d. No, because the probability of 2 or more matches is not low (less than or equal to 0.05).
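The mean, standard deviation, and range-rule-of-thumb limits used in Exercises 15 to 17 can be checked with a short script. This is only a sketch in Python rather than Excel; the names `matches`, `mean_and_sd`, and `range_rule_limits` are illustrative and do not come from the textbook.

```python
from math import sqrt

# Probability distribution from Exercise 15 (number of matches x -> P(x)).
matches = {0: 0.656, 1: 0.292, 2: 0.049, 3: 0.004, 4: 0.0}

def mean_and_sd(dist):
    """Return (mu, sigma) for a discrete distribution given as {x: P(x)}."""
    mu = sum(x * p for x, p in dist.items())
    variance = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, sqrt(variance)

def range_rule_limits(mu, sigma):
    """Return (mu - 2*sigma, mu + 2*sigma), the range-rule-of-thumb cutoffs."""
    return mu - 2 * sigma, mu + 2 * sigma

mu, sigma = mean_and_sd(matches)
# Round to one decimal place first, as the printed solutions do.
low, high = range_rule_limits(round(mu, 1), round(sigma, 1))
print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"significantly low <= {low:.1f}; significantly high >= {high:.1f}")
```

Rounding before applying the rule reproduces the limits of –0.8 and 1.6 matches quoted in the solutions; using unrounded values would shift the limits slightly.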
19. a. $P(X = 3) = 0.004$ b. $P(X \ge 3) = 0.004 + 0 = 0.004$ c. The probability from part (b), since it is the probability of the given or more extreme result. d. Yes, because the probability of 0.004 is low (less than or equal to 0.05).
20. a. $P(X = 1) = 0.292$ b. $P(X \le 1) = 0.656 + 0.292 = 0.948$ c. The result from part (b), since it is the probability of the given or more extreme result. d. No, because the probability of 0.948 is not low (less than or equal to 0.05).
21. $\mu = 0 \cdot 0.066 + 1 \cdot 0.238 + 2 \cdot 0.344 + 3 \cdot 0.249 + 4 \cdot 0.090 + 5 \cdot 0.013 = 2.1$ drivers
$\sigma = \sqrt{(0 - 2.1)^2 \cdot 0.066 + (1 - 2.1)^2 \cdot 0.238 + \cdots + (4 - 2.1)^2 \cdot 0.090 + (5 - 2.1)^2 \cdot 0.013} = 1.1$ drivers
22. Significantly high numbers of drivers who say that they text while driving are greater than or equal to $\mu + 2\sigma = 2.1 + 2(1.1) = 4.3$. Because 4 drivers is not greater than or equal to 4.3, 4 is not a significantly high number of drivers who say that they text while driving.
23. Significantly low numbers of drivers who say that they text while driving are less than or equal to $\mu - 2\sigma = 2.1 - 2(1.1) = -0.1$. Because 1 driver is not less than or equal to –0.1, 1 is not a significantly low number of drivers who say that they text while driving.
24. a. $P(X = 3) = 0.249$ b. $P(X \ge 3) = 0.249 + 0.090 + 0.013 = 0.352$ c. The probability from part (b), since it is the probability of the given or more extreme result. d. No, because the probability of 3 or more drivers who say that they text while driving is not low (less than or equal to 0.05).
25. a. $P(X = 2) = 0.344$ b. $P(X \le 2) = 0.066 + 0.238 + 0.344 = 0.648$ c. The probability from part (b), since it is the probability of the given or more extreme result. d. No, because the probability of 2 or fewer drivers who say that they text while driving is not low (less than or equal to 0.05).
26. a. $P(X = 4) = 0.090$ b. $P(X \ge 4) = 0.090 + 0.013 = 0.103$ c. The probability from part (b), since it is the probability of the given or more extreme result. d. No, because the probability of 4 or more drivers who say that they text while driving is not low (less than or equal to 0.05).
27. Because the probability of 270 or more saying that we should use biometrics is 0.0995, which is not low (less than or equal to 0.05), 270 is not significantly high. Given that 270 is not significantly greater than 50%, there is not sufficient evidence to conclude that the majority of the population says that we should replace passwords with biometric security.
28. Because the probability of 481 or more saying that we should use federal tax money is 0.00389, which is low (less than or equal to 0.05), 481 is significantly high. Given that 481 is significantly greater than 50%, there is sufficient evidence to reject the politician’s claim.
29. a. 10 · 10 · 10 = 1000 b. 1/1000 c. $500 – $1 = $499 d. (–$1)(1) + ($500)(1/1000) = –$0.50 = –50¢ e. The $1 bet on the pass line in craps is better because its expected value of –1.4¢ is much greater than the expected value of –50¢ for the Florida Pick 3 lottery.
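The expected values in Exercises 29 and 30 can be reproduced by weighting each net outcome by its probability. The following is a rough Python sketch; the function name `expected_value` and the way the bets are encoded as (net gain, probability) pairs are assumptions made for illustration.

```python
def expected_value(outcomes):
    """Sum of (net gain * probability) over all outcomes of a bet."""
    return sum(gain * prob for gain, prob in outcomes)

# Pick 3: a $1 bet pays $500, so the net gain on a win is $499.
pick3 = [(499, 1 / 1000), (-1, 999 / 1000)]
# Pick 4: a $1 bet pays $5000, so the net gain on a win is $4999.
pick4 = [(4999, 1 / 10_000), (-1, 9999 / 10_000)]

ev3 = expected_value(pick3)
ev4 = expected_value(pick4)
print(f"Pick 3 expected value: {ev3:.2f} dollars")
print(f"Pick 4 expected value: {ev4:.2f} dollars")
```

Both bets come out to about –$0.50 per $1 wagered, which agrees with the –50¢ stated in the solutions.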
30. a. 10 · 10 · 10 · 10 = 10,000 b. 1/10,000 c. $5000 – $1 = $4999 d. (–$1)(1) + ($5000)(1/10,000) = –$0.50 = –50¢ e. Because both bets have the same expected value of –50¢, neither bet is better than the other.
31. a. –$0.26; ($30)(5/38) – ($5)(33/38) = –$0.39, or –39¢ b. The bet on the number 27 is better because its expected value of –26¢ is greater than the expected value of –39¢ for the other bet.
32. The company will pay $0 for a female that survives the year and –$1,000,000 for a female that does not survive the year. The expected payout per policy would be ($0)(0.99963) – ($1,000,000)(1 – 0.99963) = –$370. The company should charge $400 + $370 = $770 per policy.
Section 5-2: Binomial Probability Distributions
1. The given calculation assumes that the first two speaking characters are females and the last three are not females, but there are other arrangements consisting of two females and three males. The probabilities corresponding to those other arrangements should also be included in the result.
2. $n = 5$, $x = 2$, $p = 0.331$, $q = 1 - 0.331 = 0.669$
3. Because the 50 selections are made without replacement, they are dependent, not independent. Based on the 5% guideline for cumbersome calculations, the 50 selections can be treated as being independent. (The 50 selections constitute 3.33% of the population of 1500 speaking characters, and 3.33% is not more than 5% of the population.)
4. No, 0+ does not indicate that an event is impossible. 0+ indicates that the event is possible with a probability that is a positive number, but its probability is so small that it becomes 0 when rounded to the first few decimal places.
5. Not binomial; each of the ages has more than two possible outcomes.
6. binomial
7. Not binomial; each response has more than two possible outcomes.
8. binomial
9. Not binomial; because the senators are selected without replacement, the selections are not independent. (The 5% guideline for cumbersome calculations cannot be applied because the 40 selected senators constitute 40% of the population of 100 senators, and that exceeds 5%.)
10. Not binomial; because the senators are selected without replacement, they are not independent. (The 5% guideline for cumbersome calculations cannot be applied because the 10 selected senators constitute 10% of the population of 100 senators, and that exceeds 5%.) Also, the numbers of terms have more than two possible outcomes.
11. Binomial; although the events are not independent, they can be treated as being independent by applying the 5% guideline for cumbersome calculations. The sample size of 3600 is not more than 5% of the population of all households.
12. Binomial; although the events are not independent, they can be treated as being independent by applying the 5% guideline for cumbersome calculations. The sample size of 1659 is not more than 5% of the population of all adults with credit cards.
13. a. $\frac{4}{5} \cdot \frac{4}{5} \cdot \frac{1}{5} = 0.128$ b. {WWC, WCW, CWW}; Each has a probability of 0.128. c. $0.128 \cdot 3 = 0.384$
14. a. $P(\text{NNNS}) = 0.92 \cdot 0.92 \cdot 0.92 \cdot 0.08 = 0.0623$ b. {NNNS, NNSN, NSNN, SNNN}; Each has a probability of 0.0623. c. $0.0623 \cdot 4 = 0.249$
15. ${}_8C_6 \cdot 0.4^6 \cdot 0.6^2 = 0.0413$ (Table: 0.041)
fx | =BINOM.DIST(6,8, 0.4, 0)
16. ${}_8C_6 \cdot 0.4^6 \cdot 0.6^2 + {}_8C_7 \cdot 0.4^7 \cdot 0.6^1 + {}_8C_8 \cdot 0.4^8 \cdot 0.6^0 = 0.0498$ (Table: 0.050)
fx | =1-BINOM.DIST(5,8, 0.4,1)
17. ${}_8C_0 \cdot 0.4^0 \cdot 0.6^8 + {}_8C_1 \cdot 0.4^1 \cdot 0.6^7 + {}_8C_2 \cdot 0.4^2 \cdot 0.6^6 = 0.315$ (Table: 0.316)
fx | =BINOM.DIST(2,8, 0.4,1)
18. ${}_8C_0 \cdot 0.4^0 \cdot 0.6^8 + {}_8C_1 \cdot 0.4^1 \cdot 0.6^7 + {}_8C_2 \cdot 0.4^2 \cdot 0.6^6 + {}_8C_3 \cdot 0.4^3 \cdot 0.6^5 = 0.594$ (Table: 0.595)
fx | =BINOM.DIST(3,8, 0.4,1)
19. ${}_8C_0 \cdot 0.4^0 \cdot 0.6^8 = 0.0168$ (Table: 0.017)
fx | =BINOM.DIST(0,8, 0.4, 0)
20. $1 - {}_8C_0 \cdot 0.4^0 \cdot 0.6^8 = 0.983$ (Table: 0.983 or 0.984, depending on method)
fx | =1-BINOM.DIST(0,8, 0.4, 0)
21. ${}_{10}C_6 \cdot 0.65^6 \cdot 0.35^4 = 0.238$
fx | =BINOM.DIST(6,10, 0.65, 0)
22. ${}_{16}C_{12} \cdot 0.65^{12} \cdot 0.35^4 = 0.155$
fx | =BINOM.DIST(12,16, 0.65, 0)
23. ${}_{20}C_{18} \cdot 0.65^{18} \cdot 0.35^2 + {}_{20}C_{19} \cdot 0.65^{19} \cdot 0.35^1 + {}_{20}C_{20} \cdot 0.65^{20} \cdot 0.35^0 = 0.0121$
fx | =1-BINOM.DIST(17, 20, 0.65,1)
24. ${}_{14}C_0 \cdot 0.65^0 \cdot 0.35^{14} + {}_{14}C_1 \cdot 0.65^1 \cdot 0.35^{13} + {}_{14}C_2 \cdot 0.65^2 \cdot 0.35^{12} = 0.000141$
fx | =BINOM.DIST(2,14, 0.65,1)
25. ${}_{90}C_0 \cdot 0.27^0 \cdot 0.73^{90} + {}_{90}C_1 \cdot 0.27^1 \cdot 0.73^{89} + \cdots + {}_{90}C_6 \cdot 0.27^6 \cdot 0.73^{84} + {}_{90}C_7 \cdot 0.27^7 \cdot 0.73^{83} = 0.00000451$; The result of 7 minorities is significantly low. The probability shows that it is highly unlikely that a process of random selection would result in 7 or fewer minorities. (The Supreme Court rejected the claim that the process was random.)
fx | =BINOM.DIST(7, 90, 0.27,1)
26. ${}_{25}C_{14} \cdot 0.36^{14} \cdot 0.64^{11} + \cdots + {}_{25}C_{25} \cdot 0.36^{25} \cdot 0.64^0 = 0.0326$; Yes, 14 would be significantly high because the probability of 14 or more is 0.0326, which is low (such as less than 0.05).
fx | =1-BINOM.DIST(13, 25, 0.36,1)
27. a.
${}_{15}C_{12} \cdot 0.39^{12} \cdot 0.61^3 = 0.00128$
fx | =BINOM.DIST(12,15, 0.39, 0)
b. Yes, because the probability of 12 or more is ${}_{15}C_{12} \cdot 0.39^{12} \cdot 0.61^3 + \cdots + {}_{15}C_{15} \cdot 0.39^{15} \cdot 0.61^0 = 0.00149$, which is low (such as less than 0.05).
fx | =1-BINOM.DIST(11,15, 0.39,1)
c. $1 - {}_{15}C_0 \cdot 0.39^0 \cdot 0.61^{15} = 0.999$
fx | =1-BINOM.DIST(0,15, 0.39, 0)
28. a. ${}_5C_0 \cdot 0.20^0 \cdot 0.80^5 = 0.328$
fx | =BINOM.DIST(0, 5, 0.20, 0)
b. ${}_5C_1 \cdot 0.20^1 \cdot 0.80^4 = 0.410$
fx | =BINOM.DIST(1, 5, 0.20, 0)
c. ${}_5C_0 \cdot 0.20^0 \cdot 0.80^5 + {}_5C_1 \cdot 0.20^1 \cdot 0.80^4 = 0.737$ (Table: 0.738)
fx | =BINOM.DIST(1, 5, 0.20,1)
d. No, the probability from part (c) is not small, so 1 is not an unusually low number.
29. a. $\mu = np = 36 \cdot 0.5 = 18.0$ girls, $\sigma = \sqrt{np(1 - p)} = \sqrt{36 \cdot 0.5 \cdot 0.5} = 3.0$ girls b. Values of $18.0 - 2(3.0) = 12.0$ girls or fewer are significantly low, values of $18.0 + 2(3.0) = 24.0$ girls or more are significantly high, and values between 12.0 girls and 24.0 girls are not significant. c. The result is significantly high because the result of 26 girls is greater than or equal to 24.0 girls. A result of 26 girls would suggest that the XSORT method is effective.
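The binomial answers above pair the formula ${}_nC_x \, p^x (1-p)^{n-x}$ with Excel's BINOM.DIST. As a cross-check outside Excel, the same quantities can be sketched in Python with only the standard library; `binom_pmf`, `binom_cdf`, and `binom_mean_sd` are illustrative names, not functions from the textbook or from Excel.

```python
from math import comb, sqrt

def binom_pmf(x, n, p):
    """P(X = x), the analogue of =BINOM.DIST(x, n, p, 0)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def binom_cdf(x, n, p):
    """P(X <= x), the analogue of =BINOM.DIST(x, n, p, 1)."""
    return sum(binom_pmf(k, n, p) for k in range(x + 1))

def binom_mean_sd(n, p):
    """Mean np and standard deviation sqrt(np(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

# Exercise 15: exactly 6 successes in 8 trials with p = 0.4.
p_exactly_6 = binom_pmf(6, 8, 0.4)
# Exercise 16: at least 6 successes, computed as 1 - P(X <= 5).
p_at_least_6 = 1 - binom_cdf(5, 8, 0.4)
# Exercise 29: mean and standard deviation for n = 36, p = 0.5.
mu, sigma = binom_mean_sd(36, 0.5)

print(f"P(X = 6)  = {p_exactly_6:.4f}")
print(f"P(X >= 6) = {p_at_least_6:.4f}")
print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
```

Writing "at least" probabilities as one minus a cumulative probability follows the same pattern as the `=1-BINOM.DIST(...,1)` formulas in the solutions.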
30. a. $\mu = np = 16 \cdot 0.5 = 8.0$ girls, $\sigma = \sqrt{np(1 - p)} = \sqrt{16 \cdot 0.5 \cdot 0.5} = 2.0$ girls b. Values of $8.0 - 2(2.0) = 4.0$ girls or fewer are significantly low, values of $8.0 + 2(2.0) = 12.0$ girls or more are significantly high, and values between 4.0 girls and 12.0 girls are not significant. c. The result is not significant because the result of 11 girls is not greater than or equal to 12.0 girls. A result of 11 girls would not suggest that the XSORT method is effective.
31. a. $\mu = np = 10 \cdot 0.75 = 7.5$ peas, $\sigma = \sqrt{np(1 - p)} = \sqrt{10 \cdot 0.75 \cdot 0.25} = 1.4$ peas b. Values of $7.5 - 2(1.4) = 4.7$ peas or fewer are significantly low, values of $7.5 + 2(1.4) = 10.3$ peas or more are significantly high, and values between 4.7 peas and 10.3 peas are not significant. c. The result is not significant because the result of 9 peas is not greater than or equal to 10.3 peas.
32. a. $\mu = np = 16 \cdot 0.75 = 12.0$ peas, $\sigma = \sqrt{np(1 - p)} = \sqrt{16 \cdot 0.75 \cdot 0.25} = 1.7$ peas b. Values of $12.0 - 2(1.7) = 8.6$ peas or fewer are significantly low, values of $12.0 + 2(1.7) = 15.4$ peas or more are significantly high, and values between 8.6 peas and 15.4 peas are not significant. c. The result is significant because the result of 7 peas is less than or equal to 8.6 peas.
33. With a probability of $1 - {}_{50}C_0 \cdot 0.00343^0 \cdot 0.99657^{50} = 0.158$ that the combined sample tests positive, it is not unlikely because the probability of 0.158 is not very low.
fx | =1-BINOM.DIST(0, 50, 0.00343, 0)
34. The probability that the combined sample tests positive is $1 - {}_{100}C_0 \cdot 0.042^0 \cdot 0.958^{100} = 0.986$, which means that nearly all of the composite samples will result in further tests. Therefore, it does not seem wise to combine 100 samples into one sample.
fx | =1-BINOM.DIST(0,100, 0.042, 0)
35. ${}_5C_0 \cdot 0.04^0 \cdot 0.96^5 + {}_5C_1 \cdot 0.04^1 \cdot 0.96^4 = 0.985$; The high probability shows that a high percentage of ammo cans will be accepted, so it does not appear that there is a production problem.
fx | =BINOM.DIST(1, 5, 0.04,1)
36. ${}_{50}C_0 \cdot 0.02^0 \cdot 0.98^{50} + {}_{50}C_1 \cdot 0.02^1 \cdot 0.98^{49} + {}_{50}C_2 \cdot 0.02^2 \cdot 0.98^{48} = 0.922$; About 92% of all shipments will be accepted. Almost all shipments will be accepted, and only 8% of the shipments will be rejected.
fx | =BINOM.DIST(2, 50, 0.02,1)
37. a. $\mu = np = 345 \cdot 0.13 = 44.9$ brown M&Ms, $\sigma = \sqrt{345 \cdot 0.13 \cdot 0.87} = 6.2$ brown M&Ms; Values between $345 \cdot 0.13 - 2\sqrt{345 \cdot 0.13 \cdot 0.87} = 32.4$ and $345 \cdot 0.13 + 2\sqrt{345 \cdot 0.13 \cdot 0.87} = 57.3$ (32.5 and 57.3 if using the rounded $\sigma$); the result of 36 brown M&Ms lies between those limits, so it is neither significantly low nor significantly high. b. The probability of exactly 36 brown M&Ms is 0.0241.
fx | =BINOM.DIST(36, 345, 0.13, 0)
c. The probability of 36 or fewer brown M&Ms is 0.0878.
fx | =BINOM.DIST(36, 345, 0.13,1)
d. The probability from part (c) is relevant. The result of 36 brown M&Ms is not significantly low. e. The results do not provide strong evidence against the claim that 13% of M&Ms are brown.
38. a. $\mu = np = 41 \cdot 0.5 = 20.5$, $\sigma = \sqrt{41 \cdot 0.5 \cdot 0.5} = 3.2$; Values between $20.5 - 2(3.2) = 14.1$ and $20.5 + 2(3.2) = 26.9$ are not significant; Because 40 first lines for Democrats is greater than 26.9, the value of 40 is significantly high. b. The probability of exactly 40 first lines is 0.000 when rounded.
fx | =BINOM.DIST(40, 41, 0.5, 0)
c. The probability of 40 or more first lines is 0.000 when rounded.
fx | =1-BINOM.DIST(39, 41, 0.5,1)
d. The probability from part (c) is relevant. The result of 40 first lines for Democrats is significantly high. e. The results suggest that the clerk did not randomly assign the order of candidates’ names on voting ballots.
39. a. $\mu = np = 611 \cdot 0.43 = 262.7$ votes, $\sigma = \sqrt{np(1 - p)} = \sqrt{611 \cdot 0.43 \cdot 0.57} = 12.2$ votes; Values between $262.7 - 2(12.2) = 238.3$ votes and $262.7 + 2(12.2) = 287.1$ votes are not significant. The value of 308 votes is greater than or equal to 287.1, so it is significant. b. The probability of exactly 308 voters is ${}_{611}C_{308} \cdot 0.43^{308} \cdot 0.57^{303} = 0.0000369$.
fx | =BINOM.DIST(308, 611, 0.43, 0)
c. The probability of 308 or more voters is ${}_{611}C_{308} \cdot 0.43^{308} \cdot 0.57^{303} + \cdots + {}_{611}C_{611} \cdot 0.43^{611} \cdot 0.57^0 = 0.000136$.
fx | =1-BINOM.DIST(307, 611, 0.43,1)
d. The probability from part (c) is relevant. The value of 308 votes is significantly high. e. The results suggest that the surveyed voters either lied or had defective memory of how they voted.
40. a. $\mu = np = 580 \cdot 0.25 = 145.0$ yellow peas, $\sigma = \sqrt{np(1 - p)} = \sqrt{580 \cdot 0.25 \cdot 0.75} = 10.4$ yellow peas; Values between $145.0 - 2(10.4) = 124.2$ yellow peas and $145.0 + 2(10.4) = 165.8$ yellow peas are not significant (124.1 and 165.9 if using unrounded values). 152 yellow peas falls between these limits, so it is not significant. b. The probability of exactly 152 yellow peas is ${}_{580}C_{152} \cdot 0.25^{152} \cdot 0.75^{428} = 0.0301$.
fx | =BINOM.DIST(152, 580, 0.25, 0)
c. The probability of 152 or more yellow peas is ${}_{580}C_{152} \cdot 0.25^{152} \cdot 0.75^{428} + \cdots + {}_{580}C_{580} \cdot 0.25^{580} \cdot 0.75^0 = 0.265$.
fx | =1-BINOM.DIST(151, 580, 0.25,1)
d. The probability from part (c) is relevant. The value of 152 yellow peas is not significantly high. e. The results do not provide strong evidence against Mendel’s claim of 25% for yellow peas.
41. $P(5) = 0.06(1 - 0.06)^4 = 0.0468$
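42. $\frac{15!}{7!\,6!\,2!} \left(\frac{18}{38}\right)^7 \left(\frac{18}{38}\right)^6 \left(\frac{2}{38}\right)^2 = 0.0302$
43. $P(4) = \frac{6!}{(6 - 4)!\,4!} \cdot \frac{43!}{(43 - 6 + 4)!\,(6 - 4)!} \div \frac{(6 + 43)!}{(6 + 43 - 6)!\,6!} = 0.000969$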
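Exercises 41 to 43 use the geometric, multinomial, and hypergeometric formulas rather than BINOM.DIST. A small Python sketch of the three calculations follows; the function names are hypothetical, and the inputs are the values that appear in the printed solutions.

```python
from math import comb, factorial

def geometric_pmf(x, p):
    """Probability that the first success occurs on trial x."""
    return p * (1 - p) ** (x - 1)

def multinomial_pmf(counts, probs):
    """n!/(x1! x2! ...) * p1^x1 * p2^x2 * ... for category counts."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    prob = float(coef)
    for c, p in zip(counts, probs):
        prob *= p**c
    return prob

def hypergeometric_pmf(x, a, b, n):
    """P(x objects of type A) when n are drawn without replacement
    from a objects of type A and b objects of type B."""
    return comb(a, x) * comb(b, n - x) / comb(a + b, n)

p41 = geometric_pmf(5, 0.06)
p42 = multinomial_pmf([7, 6, 2], [18 / 38, 18 / 38, 2 / 38])
p43 = hypergeometric_pmf(4, 6, 43, 6)
print(f"Exercise 41: {p41:.4f}")
print(f"Exercise 42: {p42:.4f}")
print(f"Exercise 43: {p43:.6f}")
```

Using `comb` for the hypergeometric case is algebraically the same as the factorial expression in Exercise 43, since each factorial ratio there is a combination count.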
Section 5-3: Poisson Probability Distributions
1. $\mu = \frac{9000}{19{,}130} = 0.470$ arrivals, $x = 2$ arrivals, and $e \approx 2.71828$, which is a constant used in all applications of Formula 5-9.
2. x is a discrete random variable with possible values of 0, 1, 2, 3, . . . (with no upper bound). It is not possible to have $x = 4.8$ Internet arrivals in one thousandth of a minute.
3. The probability of exactly 2 Internet arrivals in one thousandth of a minute is $P(2) = \frac{0.470^2 \cdot e^{-0.470}}{2!} = 0.0690$ (or 0.0691 if using the unrounded value of $\mu$).
fx | =POISSON.DIST(2,0.470,0)
4. The probability of no Internet arrivals in one thousandth of a minute is $P(0) = \frac{0.470^0 \cdot e^{-0.470}}{0!} = 0.625$. For any Poisson probability distribution, $P(0) = \frac{\mu^0 \cdot e^{-\mu}}{0!} = \frac{1 \cdot e^{-\mu}}{1} = e^{-\mu}$.
fx | =POISSON.DIST(0,0.470,0)
5. a. $P(7) = \frac{5.5^7 \cdot e^{-5.5}}{7!} = 0.123$
fx | =POISSON.DIST(7,5.5,0)
b. In 118 years, the expected number of years with 7 hurricanes is $118(0.123) = 14.5$. c. The expected value of 14.5 years is very close to the actual value of 14 years, so in this case the Poisson distribution works quite well.
6. a. $P(0) = \frac{5.5^0 \cdot e^{-5.5}}{0!} = 0.00409$
fx | =POISSON.DIST(0,5.5,0)
b. In 118 years, the expected number of years without any hurricanes is $118(0.00409) = 0.5$. c. The expected value of 0.5 years is very close to the actual value of 2 years, so in this case the Poisson distribution works quite well.
7. a. $P(3) = \frac{5.5^3 \cdot e^{-5.5}}{3!} = 0.113$
fx | =POISSON.DIST(3,5.5,0)
b. In 118 years, the expected number of years with 3 hurricanes is $118(0.113) = 13.3$. c. The expected value of 13.3 years is very close to the actual value of 17 years, so in this case the Poisson distribution works quite well.
8. a. $P(10) = \frac{5.5^{10} \cdot e^{-5.5}}{10!} = 0.0285$
fx | =POISSON.DIST(10,5.5,0)
b. In 118 years, the expected number of years with 10 hurricanes is $118(0.0285) = 3.4$. c. The expected value of 3.4 years is very close to the actual value of 4 years, so in this case the Poisson distribution works quite well.
9. a. $\mu = \frac{5942}{365} = 16.3$ births b. $P(16) = \frac{16.3^{16} \cdot e^{-16.3}}{16!} = 0.0989$ (or 0.0990 if using the unrounded mean)
fx | =POISSON.DIST(16,5942/365,0)
c. $P(0) = \frac{16.3^0 \cdot e^{-16.3}}{0!} \approx 0$ (0.0000000834 or 0.0000000851 if using the unrounded mean). Yes, 0 births in a day would be a significantly low number of births.
fx | =POISSON.DIST(0,5942/365,0)
10. $\mu = \frac{650}{365} = 1.8$ murders (1.780821918 unrounded); the probability of no murders in a day is $P(0) = \frac{1.8^0 \cdot e^{-1.8}}{0!} = 0.165$ (or 0.168 if using the unrounded mean). No, 0 murders in a day would not be a significantly low number of murders.
fx | =POISSON.DIST(0,650/365,0)
11. a. $\mu = \frac{22{,}713}{365} = 62.2$ b. $P(50) = \frac{62.2^{50} \cdot e^{-62.2}}{50!} = 0.0156$ (0.0156 using the rounded mean.)
fx | =POISSON.DIST(50,22713/365,0)
12. $\mu = \frac{196}{280} = 0.7$
a. $P(0) = \frac{0.7^0 \cdot e^{-0.7}}{0!} = 0.497$
fx | =POISSON.DIST(0,0.7,0)
b. $P(1) = \frac{0.7^1 \cdot e^{-0.7}}{1!} = 0.348$
fx | =POISSON.DIST(1,0.7,0)
c. $P(2) = \frac{0.7^2 \cdot e^{-0.7}}{2!} = 0.122$
fx | =POISSON.DIST(2,0.7,0)
d. $P(3) = \frac{0.7^3 \cdot e^{-0.7}}{3!} = 0.0284$
fx | =POISSON.DIST(3,0.7,0)
e. $P(4) = \frac{0.7^4 \cdot e^{-0.7}}{4!} = 0.00497$
fx | =POISSON.DIST(4,0.7,0)
The expected frequencies of 139, 97, 34, 8, and 1.4 compare reasonably well to the actual frequencies, so the Poisson distribution does provide good results.
13. a. $P(2) = \frac{0.929^2 \cdot e^{-0.929}}{2!} = 0.170$
fx | =POISSON.DIST(2,0.929,0)
b. The expected number of regions with exactly 2 hits is between 97.9 and 98.2, depending on rounding. c. The expected number of regions with 2 hits is close to 93, which is the actual number of regions with 2 hits.
14. a. $\mu = 12{,}429 \cdot 0.000011 = 0.136719$ (or 0.137)
b. $P(0 \text{ or } 1) = \frac{0.136719^0 \cdot e^{-0.136719}}{0!} + \frac{0.136719^1 \cdot e^{-0.136719}}{1!} = 0.991464$ (0.990 rounded)
fx | =POISSON.DIST(1,0.136719,1)
c. $1 - 0.991464 = 0.00854$ (or 0.009)
fx | =1-POISSON.DIST(1,0.136719,1)
d. No, the probability of more than one case is extremely small, so the probability of getting as many as four cases is even smaller.
15. a. $\mu = 30.41176471$ chocolate chips (or 30.4 rounded) b. $P(26) = \frac{30.4^{26} \cdot e^{-30.4}}{26!} = 0.0558$ (0.0557 if using the unrounded mean.)
fx | =POISSON.DIST(26,30.41176471,0)
c. The expected number of cookies is $34(0.0558) = 1.9$, and that is very close to the actual number of cookies with 26 chocolate chips, which is 2.
16. a. $\mu = 30.41176471$ chocolate chips (or 30.4 rounded) b. $P(30) = \frac{30.4^{30} \cdot e^{-30.4}}{30!} = 0.0724$
fx | =POISSON.DIST(30,30.41176471,0)
c. The expected number of cookies is $34(0.0724) = 2.5$, and that is very different from the actual number of cookies with 30 chocolate chips, which is 6.
17. The Poisson distribution approximation is valid since $n = 5200 \ge 100$ and $np = \frac{5200}{302{,}575{,}350} = 0.0000172 \le 10$. The probability of winning at least one time is $1 - P(0) = 1 - \frac{0.0000172^0 \cdot e^{-0.0000172}}{0!} = 0.0000172$. No, it is highly unlikely that at least one jackpot win will occur in 50 years. You would need to play for more than 306,534 years to reach a 10% chance of winning at least one jackpot. Don’t hold your breath.
fx | =1-POISSON.DIST(0,0.0000172,0)
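The Poisson answers in this section all apply $P(x) = \mu^x e^{-\mu} / x!$, which Excel exposes as POISSON.DIST. For readers checking the numbers outside Excel, here is a minimal Python sketch; `poisson_pmf` and `poisson_cdf` are illustrative names chosen here, and the inputs are taken from Exercises 3, 5, and 14.

```python
from math import exp, factorial

def poisson_pmf(x, mu):
    """P(X = x), the analogue of =POISSON.DIST(x, mu, 0)."""
    return mu**x * exp(-mu) / factorial(x)

def poisson_cdf(x, mu):
    """P(X <= x), the analogue of =POISSON.DIST(x, mu, 1)."""
    return sum(poisson_pmf(k, mu) for k in range(x + 1))

# Exercise 3: exactly 2 arrivals with mean 0.470.
p_two_arrivals = poisson_pmf(2, 0.470)
# Exercise 5: exactly 7 hurricanes with mean 5.5, then expected years out of 118.
p_seven = poisson_pmf(7, 5.5)
expected_years = 118 * p_seven
# Exercise 14c: more than one case with mean 0.136719.
p_more_than_one = 1 - poisson_cdf(1, 0.136719)

print(f"P(2 arrivals)        = {p_two_arrivals:.4f}")
print(f"P(7 hurricanes)      = {p_seven:.3f}")
print(f"expected years of 118 = {expected_years:.1f}")
print(f"P(more than 1 case)  = {p_more_than_one:.5f}")
```

The expected-years figure here uses the unrounded probability, so it can differ in the last digit from a hand calculation that rounds $P(7)$ to 0.123 first.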
Quick Quiz
1. 0+ indicates that the probability is a very small positive number. It does not indicate that it is impossible for event A to occur.
2. No, the sum of the probabilities is 1.5, which is greater than 1.
3. Yes. The random variable x has numerical values, the sum of the probabilities is 1, and each probability is between 0 and 1.
4. $\mu = 1 \cdot 0.2 + 2 \cdot 0.2 + 3 \cdot 0.2 + 4 \cdot 0.2 + 5 \cdot 0.2 = 3.0$
5. The mean is a parameter.
6. This is a probability distribution because the three requirements are satisfied. First, the variable x is a numerical random variable and its values are associated with probabilities. Second, $\sum P(x) = 0.172 + 0.363 + 0.306 + 0.129 + 0.027 + 0.002 = 0.999$, which is not exactly 1 due to round-off error, but is close enough to satisfy the requirement. Third, each of the probabilities is between 0 and 1 inclusive, as required.
7. $P(X \ge 1) = 1 - P(X = 0) = 1 - 0.172 = 0.828$ or $P(X \ge 1) = 0.363 + 0.306 + 0.129 + 0.027 + 0.002 = 0.827$
8. $\mu = 0 \cdot 0.172 + 1 \cdot 0.363 + 2 \cdot 0.306 + 3 \cdot 0.129 + 4 \cdot 0.027 + 5 \cdot 0.002 = 1.5$ sleepwalkers
9. $\sigma^2 = (1.019041635)^2 = 1.0$ sleepwalker$^2$
10. Yes. If using the range rule of thumb, significantly high numbers of sleepwalkers are greater than or equal to $\mu + 2\sigma = 1.5 + 2(1.0) = 3.5$ sleepwalkers, and 4 sleepwalkers is greater than or equal to 3.5 sleepwalkers. If using probabilities, the probability of 4 or more sleepwalkers is $0.027 + 0.002 = 0.029$, which is small (less than 0.05).
Review Exercises
1.
$P(X = 2) = {}_{10}C_2 \cdot 0.042^2 \cdot 0.958^8 = 0.0563$
fx | =BINOM.DIST(2,10, 0.042, 0)
2. $P(X \ge 1) = 1 - P(X = 0) = 1 - {}_{10}C_0 \cdot 0.042^0 \cdot 0.958^{10} = 0.349$
fx | =1-BINOM.DIST(0,10, 0.042, 0)
3. $\mu = 10 \cdot 0.042 = 0.4$ workers, $\sigma = \sqrt{10 \cdot 0.042 \cdot 0.958} = 0.6$ workers
4. No, 0 is not significantly low. If using the range rule of thumb with $\mu = 0.4$ workers and $\sigma = 0.6$ workers, values are significantly low if they are less than or equal to $\mu - 2\sigma = 0.4 - 2(0.6) = -0.8$, but 0 is not less than or equal to –0.8. If using probabilities, the probability of 0 workers testing positive is 0.651, which is not low (less than or equal to 0.05).
5. Yes, 4 is significantly high. If using the range rule of thumb with $\mu = 0.4$ workers and $\sigma = 0.6$ workers, values are significantly high if they are greater than or equal to $\mu + 2\sigma = 0.4 + 2(0.6) = 1.6$, and 4 is greater than or equal to 1.6. If using probabilities, the probability of 4 or more workers testing positive is 0.000533, which is low (less than or equal to 0.05).
6. This is not a probability distribution because the responses are not numerical as required.
7. This is not a probability distribution because $\sum P(x) = 0.0016 + 0.0250 + 0.1432 + 0.3892 + 0.4096 = 0.9686$, instead of 1 as required. The discrepancy between 0.9686 and 1 is too large to attribute to rounding errors.
8. This is a probability distribution (the sum of the probabilities is 0.999, but that is due to rounding errors) with $\mu = 0 \cdot 0 + 1 \cdot 0.003 + 2 \cdot 0.025 + 3 \cdot 0.111 + 4 \cdot 0.279 + 5 \cdot 0.373 + 6 \cdot 0.208 = 4.6$ people and $\sigma = \sqrt{(0 - 4.6)^2 \cdot 0.0 + (1 - 4.6)^2 \cdot 0.003 + \cdots + (5 - 4.6)^2 \cdot 0.373 + (6 - 4.6)^2 \cdot 0.208} = 1.0$ people.
9. a. $784 \cdot 0.301 = 236.0$ checks b. $\mu = 784 \cdot 0.301 = 236.0$ checks, $\sigma = \sqrt{784 \cdot 0.301 \cdot 0.699} = 12.8$ checks c. The limit separating significantly low values is $\mu - 2\sigma = 236.0 - 2(12.8) = 210.4$ (210.3 if using unrounded values). d. Yes, because 0 is less than or equal to 210.4 (or 210.3) checks.
10. a. $\mu = \frac{7}{365} = 0.0192$ b. $P(X = 0) = \frac{0.0192^0 \cdot e^{-0.0192}}{0!} = 0.981$
fx | =POISSON.DIST(0,7/365,0)
c. $P(X > 1) = P(X \ge 2) = 1 - P(X \le 1) = 1 - \left(\frac{0.0192^0 \cdot e^{-0.0192}}{0!} + \frac{0.0192^1 \cdot e^{-0.0192}}{1!}\right) = 0.000182$
fx | =1-POISSON.DIST(1,7/365,1)
d. No, because the event is so rare. (But it is possible that more than one death occurs in a car crash or some other such event, so it might be wise to consider a contingency plan.)
Cumulative Review Exercises
1. a. The mean is $\bar{x} = \frac{0 + 0 + 1 + 2 + 17 + 28 + 21 + 8}{8} = 9.6$ moons.
b. The median is $\frac{2 + 8}{2} = 5.0$ moons.
c. The mode is 0 moons.
d. The range is $28 - 0 = 28.0$ moons.
e. The standard deviation is $s = \sqrt{\frac{(0 - 9.6)^2 + (0 - 9.6)^2 + \cdots + (21 - 9.6)^2 + (8 - 9.6)^2}{8 - 1}} = 11.0$ moons.
f. The variance is $s^2 = \frac{(0 - 9.6)^2 + (0 - 9.6)^2 + \cdots + (21 - 9.6)^2 + (8 - 9.6)^2}{8 - 1} = 120.3$ moons$^2$.
g. The minimum is $9.6 - 2 \cdot 11.0 = -12.4$ moons and the maximum is $9.6 + 2 \cdot 11.0 = 31.6$ moons.
h. No, because none of the planets have a number of moons less than or equal to –12.4 moons (which is impossible, anyway) and none of the planets have a number of moons equal to or greater than 31.6 moons.
i. ratio
j. discrete
XLSTAT Output
Statistic                      Moons
Nbr. of observations           8
Range                          28.000
Median                         5.000
Mean                           9.625
Standard deviation (n-1)       10.967
Variation coefficient (n-1)    1.139
2. a. $\frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} \cdot \frac{1}{10} = \frac{1}{10{,}000} = 0.0001$
b. $365 \cdot 0.0001 = 0.0365$
c. $(1 - 0.0001)^{365} = 0.964$ or $P(0) = \frac{0.0365^0 \cdot e^{-0.0365}}{0!} = 0.964$
fx | =POISSON.DIST(0,0.0365,0)
d. $-1 \cdot 0.9999 + 2499 \cdot 0.0001 = -0.75$, or –75¢.
3. Refer to the following table.
                       Challenge Upheld with Overturned Call    Challenge Rejected with No Change    Total
Challenges by Men      160                                      398                                  558
Challenges by Women    95                                       292                                  387
Total                  255                                      690                                  945
a. $\frac{255}{945} = 0.270$ b. $\frac{95}{255} = 0.373$ c. $\frac{255}{945} \cdot \frac{254}{944} = 0.0726$ d. $\frac{558}{945} + \frac{255}{945} - \frac{160}{945} = 0.691$ e. $\frac{160}{255} = 0.627$
4. a. $0.29 \cdot 2287 = 663$ b. $0.84 \cdot 663 = 557$ c. $\frac{557}{2287} = 0.244$; About 24.4% received higher pay. d. 29% is a statistic because it is a measure based on a sample, not the entire population.
5. No vertical scale is shown, but a comparison of the numbers shows that 7,066,000 is roughly 1.2 times the number 6,000,000. However, the graph makes it appear that the goal of 7,066,000 people is roughly 3 times the number of people enrolled. The graph is misleading in the sense that it creates the false impression that actual enrollments are far below the goal, which is not the case. Fox News apologized for their graph and provided a corrected graph.
6. Two wins is not significantly high because P(2 or more wins) = 0.380, which is not low (0.05 or less). The probability of making a profit with two or more wins is 0.380, which is not very likely. Things aren’t looking too good for this gambler.
7. a. $P(X = 5) = {}_8C_5 \cdot 0.7^5 \cdot 0.3^3 = 0.254$
fx | =BINOM.DIST(5,8, 0.7, 0)
b. $P(X \ge 7) = {}_8C_7 \cdot 0.7^7 \cdot 0.3^1 + {}_8C_8 \cdot 0.7^8 \cdot 0.3^0 = 0.255$ (Table: 0.256)
fx | =1-BINOM.DIST(6,8, 0.7,1)
c. $\mu = 8 \cdot 0.7 = 5.6$ adults, $\sigma = \sqrt{8 \cdot 0.7 \cdot 0.3} = 1.3$ adults
d. Yes. (Using the range rule of thumb, the limit separating significantly low values is $\mu - 2\sigma = 5.6 - 2(1.3) = 3.0$, and 1 is less than 3. Using probabilities, the probability of 1 or fewer people washing their hands is ${}_8C_0 \cdot 0.7^0 \cdot 0.3^8 + {}_8C_1 \cdot 0.7^1 \cdot 0.3^7 = 0.00129$ (Table: 0.001), which is low, such as less than 0.05.)
fx | =BINOM.DIST(1,8, 0.7,1)
8. Not a probability distribution because the responses are not values of a numerical random variable.
Chapter 6: Normal Probability Distributions
Section 6-1: The Standard Normal Distribution
1. The word "normal" has a special meaning in statistics. It refers to a specific bell-shaped distribution that can be described by Formula 6-1. The lottery digits do not have a normal distribution.
2. (figure omitted)
3. The mean is 0 and the standard deviation is 1.
4. The notation $z_\alpha$ represents the z score with the property that a vertical boundary line at that z score has an area of $\alpha$ to its right.
5. $P(x > 3.00) = 0.2(5 - 3) = 0.4$
6. $P(x < 4.00) = 0.2(4 - 0) = 0.8$
7. $P(2 < x < 3) = 0.2(3 - 2) = 0.2$
8. $P(2.5 < x < 4.5) = 0.2(4.5 - 2.5) = 0.4$
9. $P(z < 0.44) = 0.6700$
fx | =NORM.DIST(0.44,0,1,1)
10. $P(z > -1.04) = 0.8508$
fx | =1-NORM.DIST(-1.04,0,1,1)
11. $P(-0.84 < z < 1.28) = P(z < 1.28) - P(z < -0.84) = 0.8997 - 0.2005 = 0.6992$ (Excel: 0.6993)
fx | =NORM.DIST(1.28,0,1,1)-NORM.DIST(-0.84,0,1,1)
12. $P(-1.07 < z < 0.67) = P(z < 0.67) - P(z < -1.07) = 0.7486 - 0.1423 = 0.6063$
fx | =NORM.DIST(0.67,0,1,1)-NORM.DIST(-1.07,0,1,1)
13. $z = 1.23$
fx | =NORM.INV(0.8907,0,1)
14. $z = -0.51$
fx | =NORM.INV(0.3050,0,1)
15. $z = -1.45$
fx | =NORM.INV(1-0.9265,0,1)
16. $z = 0.82$
fx | =NORM.INV(1-0.2061,0,1)
17. $P(z < -2.00) = 0.0228$
fx | =NORM.DIST(-2.00,0,1,1)
18. $P(z < -0.50) = 0.3085$
fx | =NORM.DIST(-0.50,0,1,1)
19. $P(z < 1.33) = 0.9082$
fx | =NORM.DIST(1.33,0,1,1)
20. $P(z < 2.33) = 0.9901$
fx | =NORM.DIST(2.33,0,1,1)
21. $P(z > 1.00) = 1 - P(z < 1.00) = 1 - 0.8413 = 0.1587$
fx | =1-NORM.DIST(1.00,0,1,1)
22. $P(z > 2.33) = 1 - P(z < 2.33) = 1 - 0.9901 = 0.0099$
fx | =1-NORM.DIST(2.33,0,1,1)
23. $P(z > -1.75) = 1 - P(z < -1.75) = 1 - 0.0401 = 0.9599$
fx | =1-NORM.DIST(-1.75,0,1,1)
24. $P(z > -2.09) = 1 - P(z < -2.09) = 1 - 0.0183 = 0.9817$
fx | =1-NORM.DIST(-2.09,0,1,1)
25. $P(1.50 < z < 2.00) = P(z < 2.00) - P(z < 1.50) = 0.9772 - 0.9332 = 0.0440$ (Excel: 0.0441)
fx | =NORM.DIST(2.00,0,1,1)-NORM.DIST(1.50,0,1,1)
26. $P(1.37 < z < 2.25) = P(z < 2.25) - P(z < 1.37) = 0.9878 - 0.9147 = 0.0731$
fx | =NORM.DIST(2.25,0,1,1)-NORM.DIST(1.37,0,1,1)
27. $P(-2.36 < z < -1.22) = P(z < -1.22) - P(z < -2.36) = 0.1112 - 0.0091 = 0.1021$
fx | =NORM.DIST(-1.22,0,1,1)-NORM.DIST(-2.36,0,1,1)
28. $P(-2.08 < z < -0.45) = P(z < -0.45) - P(z < -2.08) = 0.3264 - 0.0188 = 0.3076$
fx | =NORM.DIST(-0.45,0,1,1)-NORM.DIST(-2.08,0,1,1)
29. $P(-1.55 < z < 1.55) = P(z < 1.55) - P(z < -1.55) = 0.9394 - 0.0606 = 0.8788$ (Excel: 0.8789)
fx | =NORM.DIST(1.55,0,1,1)-NORM.DIST(-1.55,0,1,1)
30. $P(-0.77 < z < 1.42) = P(z < 1.42) - P(z < -0.77) = 0.9222 - 0.2206 = 0.7016$ (Excel: 0.7015)
fx | =NORM.DIST(1.42,0,1,1)-NORM.DIST(-0.77,0,1,1)
31. $P(-2.00 < z < 3.50) = P(z < 3.50) - P(z < -2.00) = 0.9999 - 0.0228 = 0.9771$ (Excel: 0.9770)
fx | =NORM.DIST(3.50,0,1,1)-NORM.DIST(-2.00,0,1,1)
32. $P(-3.52 < z < 2.53) = P(z < 2.53) - P(z < -3.52) = 0.9943 - 0.0001 = 0.9942$ (Excel: 0.9941)
fx | =NORM.DIST(2.53,0,1,1)-NORM.DIST(-3.52,0,1,1)
33. $P(z > -3.77) = 1 - P(z < -3.77) = 1 - 0.0001 = 0.9999$
fx | =1-NORM.DIST(-3.77,0,1,1)
34. $P(z < -3.93) = 0.0000425$ or 0.0000 rounded (Table: 0.0001)
fx | =NORM.DIST(-3.93,0,1,1)
35. $P(-4.00 < z < 4.00) = P(z < 4.00) - P(z < -4.00) = 0.9999 - 0.0001 = 0.9998$ (Excel: 0.9999)
fx | =NORM.DIST(4.00,0,1,1)-NORM.DIST(-4.00,0,1,1)
36. $P(-3.67 < z < 4.25) = P(z < 4.25) - P(z < -3.67) = 0.9999 - 0.0001 = 0.9998$ (Excel: 0.9999)
fx | =NORM.DIST(4.25,0,1,1)-NORM.DIST(-3.67,0,1,1)
37. $P_{99} = 2.33$
fx | =NORM.INV(0.99,0,1)
38. $P_{15} = -1.04$
fx | =NORM.INV(0.15,0,1)
39. $P_1 = -2.33$
fx | =NORM.INV(0.01,0,1)
$P_{99} = 2.33$
fx | =NORM.INV(0.99,0,1)
40. $Q_1 = P_{25} = -0.67$
fx | =NORM.INV(0.25,0,1)
$Q_2 = P_{50} = 0.00$
fx | =NORM.INV(0.50,0,1)
$Q_3 = P_{75} = 0.67$
fx | =NORM.INV(0.75,0,1)
41. $z_{0.25} = 0.67$
fx | =NORM.INV(1-0.25,0,1)
42. $z_{0.90} = -1.28$
fx | =NORM.INV(1-0.90,0,1)
43. $z_{0.02} = 2.05$
fx | =NORM.INV(1-0.02,0,1)
44. $z_{0.05} = 1.64$ (Table: 1.645)
fx | =NORM.INV(1-0.05,0,1)
45. P(-1 < z < 1) = P(z < 1) - P(z < -1) = 0.8413 - 0.1587 = 0.6826 = 68.26% (Excel: 68.27%)
fx | =NORM.DIST(1.00,0,1,1)-NORM.DIST(-1.00,0,1,1)
46. P(-2 < z < 2) = P(z < 2) - P(z < -2) = 0.9772 - 0.0228 = 0.9544 = 95.44% (Excel: 95.45%)
fx | =NORM.DIST(2.00,0,1,1)-NORM.DIST(-2.00,0,1,1)
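The Excel calls in the standard normal exercises above can be cross-checked outside a spreadsheet. The following is a minimal Python sketch, assuming the standard-library `statistics.NormalDist` class as a stand-in for `NORM.DIST(z,0,1,1)` and `NORM.INV(p,0,1)`. The helper names are illustrative and do not come from the text.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def area_left(z):
    """Cumulative area to the left of z, analogous to =NORM.DIST(z,0,1,1)."""
    return STANDARD_NORMAL.cdf(z)


def area_right(z):
    """Area to the right of z, analogous to =1-NORM.DIST(z,0,1,1)."""
    return 1.0 - STANDARD_NORMAL.cdf(z)


def area_between(z_low, z_high):
    """Area between two z scores: left area at z_high minus left area at z_low."""
    return STANDARD_NORMAL.cdf(z_high) - STANDARD_NORMAL.cdf(z_low)


def z_for_area_left(p):
    """z score with cumulative area p to its left, analogous to =NORM.INV(p,0,1)."""
    return STANDARD_NORMAL.inv_cdf(p)


if __name__ == "__main__":
    print(f"Exercise 17: P(z < -2.00)  = {area_left(-2.00):.4f}")
    print(f"Exercise 21: P(z > 1.00)   = {area_right(1.00):.4f}")
    print(f"Exercise 45: P(-1 < z < 1) = {area_between(-1.0, 1.0):.4f}")
    print(f"Exercise 37: P99           = {z_for_area_left(0.99):.2f}")
```

Note that `area_between` always subtracts the area at the lower z score from the area at the higher one, which is the order the "between" formulas need to yield a positive probability.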
47. P(-3 < z < 3) = P(z < 3) - P(z < -3) = 0.9987 - 0.0013 = 0.9974 = 99.74% (Excel: 99.73%)
fx | =NORM.DIST(3.00,0,1,1)-NORM.DIST(-3.00,0,1,1)
48. P(-3.5 < z < 3.5) = P(z < 3.5) - P(z < -3.5) = 0.9999 - 0.0001 = 0.9998 = 99.98% (Excel: 99.95%)
fx | =NORM.DIST(3.50,0,1,1)-NORM.DIST(-3.50,0,1,1)
49. a. P(z > 2) = 1 - P(z < 2) = 1 - 0.9772 = 0.0228 = 2.28%
fx | =1-NORM.DIST(2.00,0,1,1)
b. P(z < -2) = 0.0228 = 2.28%
fx | =NORM.DIST(-2.00,0,1,1)
c. P(-2 < z < 2) = P(z < 2) - P(z < -2) = 0.9772 - 0.0228 = 0.9544 = 95.44% (Excel: 95.45%)
fx | =NORM.DIST(2.00,0,1,1)-NORM.DIST(-2.00,0,1,1)
50. a. μ = 2.5 min and σ = 5/√12 = 1.4 min
b. The probability is 1/√3, or 0.5774, and it is very different from the probability of 0.6827 that would be obtained by incorrectly using the standard normal distribution. The distribution does affect the results very much.

Section 6-2: Real Applications of Normal Distributions
1. a. μ = 0 and σ = 1
b. The z scores are numbers without units of measurement.
2. a. The area equals the maximum probability value of 1.
b. The median is the middle value, and for normally distributed scores that is also the mean, which is 4.5338 g.
c. The mode is also 4.5338 g.
d. The variance is (0.1039 g)² = 0.0108 g².
3. No. A standard normal distribution has a mean of 0 and a standard deviation of 1, but the distribution described in the preceding exercise has a mean different from 0 and a standard deviation different from 1. The distribution is a normal distribution, but not a standard normal distribution.
4. The distribution of the selected numbers will have a rectangular shape in which all of the probabilities are the same (1/25). The distribution is not bell-shaped and is not called a normal distribution. The word "normal" in this context has a meaning different from the word "normal" that is commonly used.
5. z = (118 - 100)/15 = 1.2, which has an area of 0.8849 to the left.
fx | =NORM.DIST(118,100,15,1)
6. z = (91 - 100)/15 = -0.6, which has an area of 0.7257 to the right.
fx | =1-NORM.DIST(91,100,15,1)
7. For x = 133, z = (133 - 100)/15 = 2.2, which has an area of 0.9861 to the left. For x = 79, z = (79 - 100)/15 = -1.4, which has an area of 0.0808 to the left. The area between the two scores is 0.9861 - 0.0808 = 0.9053.
fx | =NORM.DIST(133,100,15,1)-NORM.DIST(79,100,15,1)
8. For x = 124, z = (124 - 100)/15 = 1.6, which has an area of 0.9452 to the left. For x = 112, z = (112 - 100)/15 = 0.8, which has an area of 0.7881 to the left. The area between the two scores is 0.9452 - 0.7881 = 0.1571.
fx | =NORM.DIST(124,100,15,1)-NORM.DIST(112,100,15,1)
9. z = 2.40; so x = 2.40 × 15 + 100 = 136
fx | =NORM.INV(0.9918,100,15)
10. z = 1; so x = 1 × 15 + 100 = 115
fx | =NORM.INV(1-0.1587,100,15)
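Exercises 5 through 10 convert between IQ scores and z scores using z = (x - μ)/σ and x = μ + zσ with μ = 100 and σ = 15. A short Python sketch of both directions follows, again assuming `statistics.NormalDist` as a substitute for the Excel functions. The function and variable names are hypothetical.

```python
from statistics import NormalDist

# IQ scores as used in these exercises: mean 100, standard deviation 15.
IQ = NormalDist(mu=100, sigma=15)


def z_score(x, mu, sigma):
    """Standardize a raw value: z = (x - mu) / sigma."""
    return (x - mu) / sigma


def x_from_z(z, mu, sigma):
    """Convert a z score back to a raw value: x = mu + z * sigma."""
    return mu + z * sigma


# Exercise 7: area between IQ scores of 79 and 133,
# analogous to =NORM.DIST(133,100,15,1)-NORM.DIST(79,100,15,1).
p_between_79_and_133 = IQ.cdf(133) - IQ.cdf(79)

# Exercise 10: score with an area of 0.1587 to its right,
# analogous to =NORM.INV(1-0.1587,100,15).
iq_cutoff = IQ.inv_cdf(1 - 0.1587)

if __name__ == "__main__":
    print(f"z for x = 118: {z_score(118, 100, 15):.2f}")
    print(f"P(79 < x < 133): {p_between_79_and_133:.4f}")
    print(f"IQ cutoff: {iq_cutoff:.0f}")
```

Passing μ and σ directly to `NormalDist` skips the manual standardization step, in the same way the three-argument Excel calls do.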
11. z = -2.07; so x = -2.07 × 15 + 100 = 69
fx | =NORM.INV(1-0.9798,100,15)
12. z = 1.33; so x = 1.33 × 15 + 100 = 120
fx | =NORM.INV(0.9099,100,15)
13. z = (60 - 69.6)/11.3 = -0.85, which has an area of 0.1977 to the left. (Excel: 0.1978)
fx | =NORM.DIST(60,69.6,11.3,1)
14. z = (60 - 74.0)/12.5 = -1.12, which has an area of 0.1314 to the left.
fx | =NORM.DIST(60,74.0,12.5,1)
15. z = (80 - 74.0)/12.5 = 0.48, which has an area of 1 - 0.6844 = 0.3156 to the right.
fx | =1-NORM.DIST(80,74.0,12.5,1)
16. z = (80 - 69.6)/11.3 = 0.92, which has an area of 0.1787 to the right. (Table: 1 - 0.8212 = 0.1788)
fx | =1-NORM.DIST(80,69.6,11.3,1)
17. For x = 70, z = (70 - 74.0)/12.5 = -0.32, which has an area of 0.3745 to the left, and for x = 90, z = (90 - 74.0)/12.5 = 1.28, which has an area of 0.8997 to the left. The area between the two scores is 0.8997 - 0.3745 = 0.5252.
fx | =NORM.DIST(90,74.0,12.5,1)-NORM.DIST(70,74.0,12.5,1)
18. For x = 65, z = (65 - 69.6)/11.3 = -0.41, which has an area of 0.3409 to the left, and for x = 85, z = (85 - 69.6)/11.3 = 1.36, which has an area of 0.9131 to the left. The area between the two scores is 0.9131 - 0.3409 = 0.5722. (Excel: 0.5716)
fx | =NORM.DIST(85,69.6,11.3,1)-NORM.DIST(65,69.6,11.3,1)
19. For x = 70, z = (70 - 69.6)/11.3 = 0.04, which has an area of 0.5160 to the left, and for x = 90, z = (90 - 69.6)/11.3 = 1.81, which has an area of 0.9649 to the left. The area between the two scores is 0.9649 - 0.5160 = 0.4489. (Excel: 0.4505)
fx | =NORM.DIST(90,69.6,11.3,1)-NORM.DIST(70,69.6,11.3,1)
20. For x = 60, z = (60 - 74.0)/12.5 = -1.12, which has an area of 0.1314 to the left, and for x = 70, z = (70 - 74.0)/12.5 = -0.32, which has an area of 0.3745 to the left. The area between the two scores is 0.3745 - 0.1314 = 0.2431.
fx | =NORM.DIST(70,74.0,12.5,1)-NORM.DIST(60,74.0,12.5,1)
21. The z score for P90 is 1.28, so P90 = 69.6 + 1.28 × 11.3 = 84.1 beats per minute.
fx | =NORM.INV(0.90,69.6,11.3)
22. The z score for Q1 = P25 is -0.67, so Q1 = 74.0 - 0.67 × 12.5 = 65.6 beats per minute.
fx | =NORM.INV(0.25,74.0,12.5)
23. The z score for P1 is -2.326, so the lower male pulse rate is 69.6 - 2.326 × 11.3 = 43.3 beats per minute. The z score for P99 is 2.326, so the upper male pulse rate is 69.6 + 2.326 × 11.3 = 95.9 beats per minute.
fx | =NORM.INV(0.01,69.6,11.3) and fx | =NORM.INV(0.99,69.6,11.3)
24. The z score for P2.5 is -1.96, so the lower female pulse rate is 74.0 - 1.96 × 12.5 = 49.5 beats per minute. The z score for P97.5 is 1.96, so the upper female pulse rate is 74.0 + 1.96 × 12.5 = 98.5 beats per minute.
fx | =NORM.INV(0.025,74.0,12.5) and fx | =NORM.INV(0.975,74.0,12.5)
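Exercises 21 through 24 find percentiles and symmetric cutoffs for pulse rates. The sketch below shows one way to compute them in Python, assuming the male (μ = 69.6, σ = 11.3) and female (μ = 74.0, σ = 12.5) parameters quoted in the solutions. `percentile` and `central_interval` are illustrative helper names, not functions from Excel or the text.

```python
from statistics import NormalDist

# Pulse rate parameters (beats per minute) as quoted in the solutions.
MALE_PULSE = NormalDist(mu=69.6, sigma=11.3)
FEMALE_PULSE = NormalDist(mu=74.0, sigma=12.5)


def percentile(dist, k):
    """Value with k percent of the distribution below it, like =NORM.INV(k/100,mu,sigma)."""
    return dist.inv_cdf(k / 100)


def central_interval(dist, coverage):
    """Lower and upper cutoffs that leave equal tails around the middle `coverage` area."""
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


if __name__ == "__main__":
    print(f"Exercise 21, male P90: {percentile(MALE_PULSE, 90):.1f}")
    print(f"Exercise 22, female Q1: {percentile(FEMALE_PULSE, 25):.1f}")
    low, high = central_interval(MALE_PULSE, 0.98)
    print(f"Exercise 23, male P1 and P99: {low:.1f}, {high:.1f}")
    low, high = central_interval(FEMALE_PULSE, 0.95)
    print(f"Exercise 24, female P2.5 and P97.5: {low:.1f}, {high:.1f}")
```

Small differences from the table-based answers are expected, because the inverse function works with unrounded z scores.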
25. a. For x = 78, z = (78 - 63.7)/2.9 = 4.93, which has an area of 0.9999 to the left. For x = 62, z = (62 - 63.7)/2.9 = -0.59, which has an area of 0.2776 to the left. Therefore, the percentage of qualified women is 0.9999 - 0.2776 = 0.7223, or 72.23% (Excel: 72.11%). Yes, about 28% of women are not qualified because of their heights.
fx | =NORM.DIST(78,63.7,2.9,1)-NORM.DIST(62,63.7,2.9,1)
b. The z score with 3% of values to the left is -1.88, which corresponds to a height of -1.88 × 2.9 + 63.7 = 58.2 in. The z score with 3% of values to the right is 1.88, which corresponds to a height of 1.88 × 2.9 + 63.7 = 69.2 in.
fx | =NORM.INV(0.03,63.7,2.9) and fx | =NORM.INV(1-0.03,63.7,2.9)
26. a. For x = 77, z = (77 - 68.6)/2.8 = 3.00, which has an area of 0.9987 to the left. For x = 64, z = (64 - 68.6)/2.8 = -1.64, which has an area of 0.0505 to the left. Therefore, the percentage of qualified men is 0.9987 - 0.0505 = 0.9482, or 94.82% (Excel: 94.84%).
fx | =NORM.DIST(77,68.6,2.8,1)-NORM.DIST(64,68.6,2.8,1)
b. The z score with 2.5% of values to the left is -1.96, which corresponds to a height of -1.96 × 2.8 + 68.6 = 63.1 in. The z score with 2.5% of values to the right is 1.96, which corresponds to a height of 1.96 × 2.8 + 68.6 = 74.1 in.
fx | =NORM.INV(0.025,68.6,2.8) and fx | =NORM.INV(1-0.025,68.6,2.8)
27. a. The z score for the minimum height for men is (56 - 68.6)/2.8 = -4.5 and the z score for the maximum height for men is (62 - 68.6)/2.8 = -2.36. The area between the z scores is 0.0091 - 0.0001 = 0.0090, or 0.90% (Excel: 0.92%). Because so few men can meet the height requirement, it is likely that most Mickey Mouse characters are women.
fx | =NORM.DIST(62,68.6,2.8,1)-NORM.DIST(56,68.6,2.8,1)
b. The z score with 5% of values to the left is -1.645, which corresponds to a height of 68.6 - 1.645 × 2.8 = 64.0 in., and the z score with 50% of values to the right is 0, which corresponds to a height of 0 × 2.8 + 68.6 = 68.6 in.
fx | =NORM.INV(0.05,68.6,2.8) and fx | =NORM.INV(1-0.50,68.6,2.8)
28. a. The z score for the minimum height for women is (64 - 63.7)/2.9 = 0.10 and the z score for the maximum height for women is (67 - 63.7)/2.9 = 1.14. The area between the z scores is 0.8729 - 0.5398 = 0.3331, or 33.31% (Excel: 33.12%).
fx | =NORM.DIST(67,63.7,2.9,1)-NORM.DIST(64,63.7,2.9,1)
b. The z score with 40% of values to the left is -0.25, which corresponds to a height of 63.7 - 0.25 × 2.9 = 63.0 in., and the z score with 5% of values to the right is 1.645, which corresponds to a height of 63.7 + 1.645 × 2.9 = 68.5 in.
fx | =NORM.INV(0.40,63.7,2.9) and fx | =NORM.INV(1-0.05,63.7,2.9)
29. The z score with 5% of values to the left is -1.645, which corresponds to a head circumference of 570.0 - 1.645 × 18.3 = 539.9 mm, and the z score with 5% of values to the right is 1.645, which corresponds to a head circumference of 570.0 + 1.645 × 18.3 = 600.1 mm.
fx | =NORM.INV(0.05,570.0,18.3) and fx | =NORM.INV(1-0.05,570.0,18.3)
30. a. The z score that has 95% of the area to the left is 1.645, which corresponds to a height of 1.645 × 1.2 + 21.4 = 23.4 in. If there is clearance for 95% of males, there will certainly be clearance for all women in the bottom 5%.
fx | =NORM.INV(0.95,21.4,1.2)
b. The men's z score is (23.5 - 21.4)/1.2 = 1.75, which has an area to the left of 0.9599, or 95.99%.
fx | =NORM.DIST(23.5,21.4,1.2,1)
The women's z score is (23.5 - 19.6)/1.1 = 3.55, which has an area to the left of 0.9999, or 99.99% (Excel: 99.98%).
fx | =NORM.DIST(23.5,19.6,1.1,1)
The desk will fit almost everyone except about 4% of the men with the largest sitting knee heights.
31. The z score with 99% of values to the left is 2.326, which corresponds to a width of 14.3 + 2.326 × 0.9 = 16.4 in.
fx | =NORM.INV(0.99,14.3,0.9)
32. z = (15 - 2.6)/10.8 = 1.15, which has an area of 0.8749 to the left. The probability that a male college student gains 15 pounds or more during his freshman year is 1 - 0.8749 = 0.1251 (Excel: 0.1255). It appears that only about 12.6% of males gain 15 pounds or more, so the Freshman 15 appears to be an exaggeration instead of an accurate description of reality.
fx | =1-NORM.DIST(15,2.6,10.8,1)
33. a. z = (308 - 268)/15 = 2.67, which has an area of 0.9962 to the left. The probability of a pregnancy lasting 308 days or longer is 1 - 0.9962 = 0.0038. This is either a very rare event, or the husband is not the father.
fx | =1-NORM.DIST(308,268,15,1)
b. The z score with 3% of values to the left is -1.88, which corresponds to a duration of 268 - 1.88 × 15 = 240 days.
fx | =NORM.INV(0.03,268,15)
34. a. The z score for a temperature of 100.4 is (100.4 - 98.2)/0.62 = 3.55, which corresponds to an area of 1 - 0.9999 = 0.0001 = 0.01% (Excel: 0.02%) to the right, which suggests a cutoff of 100.4°F is appropriate.
fx | =1-NORM.DIST(100.4,98.2,0.62,1)
b. The z score for a probability of 2% to the right is 2.05, which corresponds to a temperature of 2.05 × 0.62 + 98.2 = 99.47°F.
fx | =NORM.INV(1-0.02,98.2,0.62)
35. z = (38.9 - 35.4)/1.8 = 1.94, which has an area of 0.9738, or 97.38% (Excel: 97.41%), to the left. The design is feasible if the loss of 2.62% (Excel: 2.59%) of adult drivers is acceptable.
fx | =NORM.DIST(38.9,35.4,1.8,1)
36. a. The z score for 174 lb is (174 - 188.6)/38.9 = -0.38, which has an area to the left of 0.3520 (Excel: 0.3537).
fx | =NORM.DIST(174,188.6,38.9,1)
b. 3500/140 = 25 people
c. 3500/188.6 = 18.6, so 18 people
d. The mean weight is increasing over time, so safety limits must be periodically updated to avoid an unsafe condition.
37. a. The new mean is equal to the old mean plus the additional points, which is 60 + 15 = 75. The standard deviation is unchanged at 12 (since the same amount was added to each score).
b. No, the conversion should also account for variation.
c. The z score for the bottom 70% is 0.52, which has a corresponding score of 60 + 0.52 × 12 = 66.2 (Excel: 66.3), and the z score for the top 10% is 1.28, which has a corresponding score of 60 + 1.28 × 12 = 75.4.
fx | =NORM.INV(0.70,60,12) and fx | =NORM.INV(1-0.10,60,12)
d. Using a scheme like the one in part (c), because variation is included in the curving process.
38. The z score for Q1 is -0.67, and the z score for Q3 is 0.67. The IQR is 0.67 - (-0.67) = 1.34, and 1.5 × IQR = 2.01, so Q1 - 1.5 × IQR = -0.67 - 2.01 = -2.68 and Q3 + 1.5 × IQR = 0.67 + 2.01 = 2.68. The area to the left of -2.68 is 0.0037 and the area to the right of 2.68 is 0.0037. Therefore, the probability of an outlier is 0.0074 (Excel: 0.0070).
fx | =1-(NORM.DIST(2.68,0,1,1)-NORM.DIST(-2.68,0,1,1))

Section 6-3: Sampling Distributions and Estimators
1. a. In the long run, the sample proportions will have a mean of 0.00559.
b. The sample proportions will tend to have a distribution that is approximately normal.
2. a. without replacement
b. (1) When selecting a relatively small sample from a large population, it makes no significant difference whether we sample with replacement or without replacement. (2) Sampling with replacement results in independent events that are unaffected by previous outcomes, and independent events are easier to analyze and they result in simpler calculations and formulas.
3. sample mean, sample variance, sample proportion
4. No, the data set is only one sample, but the sampling distribution of the mean is the distribution of the means from all samples, not the one sample mean obtained from the one sample in Data Set 1.
5. No. The sample is not a simple random sample from the population of all college students in the United States. It is very possible that incomes of college students in Maine differ substantially from the incomes of college students in the other states.
6. a. The distribution of the sample means is approximately normal.
b. The sample means target the mean annual income of all college presidents.
7. a. The population mean is μ = (4 + 5 + 9)/3 = 6, and the population variance is σ² = ((4 - 6)² + (5 - 6)² + (9 - 6)²)/3 = 4.7.
b. The possible samples of size 2 are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)}, which have the following variances {0, 0.5, 12.5, 0.5, 0, 8, 12.5, 8, 0}, respectively.
Sample Variance    Probability
0.0                3/9
0.5                2/9
8.0                2/9
12.5               2/9
c. The mean is (3 × 0 + 2 × 0.5 + 2 × 8 + 2 × 12.5)/9 = 4.7.
d. Yes. The mean of the sampling distribution of the sample variances (4.7) is equal to the value of the population variance (4.7), so the sample variances target the value of the population variance.
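The enumeration in Exercise 7 can be reproduced mechanically. The following Python sketch lists every sample of size 2 drawn with replacement from the population {4, 5, 9}, computes each sample variance (dividing by n - 1), and compares the mean of those variances with the population variance (dividing by N). Exact fractions are used so that the comparison is not affected by rounding. The variable names are illustrative.

```python
from fractions import Fraction
from itertools import product
from statistics import mean, pvariance, variance

# Population from Exercise 7, stored as exact fractions.
population = [Fraction(v) for v in (4, 5, 9)]

# All 9 equally likely ordered samples of size 2, drawn with replacement.
samples = list(product(population, repeat=2))

# Sample variance of each sample (divides by n - 1).
sample_variances = [variance(sample) for sample in samples]

# Mean of the sampling distribution of the sample variance.
mean_of_sample_variances = mean(sample_variances)

# Population variance (divides by N).
population_variance = pvariance(population)

if __name__ == "__main__":
    for sample, s2 in zip(samples, sample_variances):
        print(tuple(int(v) for v in sample), float(s2))
    print("mean of sample variances:", mean_of_sample_variances)
    print("population variance:     ", population_variance)
```

Both printed values are the fraction 14/3, which rounds to the 4.7 reported in parts (a) and (c).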
8. a. The population standard deviation (using the result from the previous problem) is √4.67 = 2.160.
b. By taking the square root of the sample variances from the previous problem we get
Sample Standard Deviation    Probability
0.000                        3/9
0.707                        2/9
2.828                        2/9
3.536                        2/9
c. The mean is (3 × 0 + 2 × 0.707 + 2 × 2.828 + 2 × 3.536)/9 = 1.571.
d. No, the mean of the sampling distribution of the sample standard deviations is 1.571, and it is not equal to the value of the population standard deviation (2.160), so the sample standard deviations do not target the value of the population standard deviation.
9. a. The population median is 5.
b. The possible samples of size 2 are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)}, which have the following medians {4, 4.5, 6.5, 4.5, 5, 7, 6.5, 7, 9} with the following associated probabilities.
Sample Median    Probability
4.0              1/9
4.5              2/9
5.0              1/9
6.5              2/9
7.0              2/9
9.0              1/9
c. The mean is (4 + 2 × 4.5 + 5 + 2 × 6.5 + 2 × 7 + 9)/9 = 6.0.
d. No, the mean of the sampling distribution of the sample medians is 6.0, and it is not equal to the value of the population median of 5.0, so the sample medians do not target the value of the population median.
10. a. The proportion of odd numbers is 2/3, or 0.7 (there are two odd numbers from the population of 4, 5, and 9).
b. The possible samples of size 2 are {(4, 4), (4, 5), (4, 9), (5, 4), (5, 5), (5, 9), (9, 4), (9, 5), (9, 9)}, which have the following proportions of odd numbers {0, 0.5, 0.5, 0.5, 1, 1, 0.5, 1, 1}.
Sample Proportion    Probability
0.0                  1/9
0.5                  4/9
1.0                  4/9
c. The mean is (0 + 4 × 0.5 + 4 × 1)/9 = 2/3, or 0.7.
d. Yes. The mean of the sampling distribution of the sample proportion of odd numbers is 2/3, and it is equal to the value of the population proportion of odd numbers of 2/3, so the sample proportions target the value of the population proportion.
11. a. The possible samples of size 2 are {(2, 2), (2, 3), (2, 5), (2, 9), (3, 2), (3, 3), (3, 5), (3, 9), (5, 2), (5, 3), (5, 5), (5, 9), (9, 2), (9, 3), (9, 5), (9, 9)}, which have the following means and associated probabilities.
Sample Mean x̄    Probability
2.0               1/16
2.5               2/16
3.0               1/16
3.5               2/16
4.0               2/16
5.0               1/16
5.5               2/16
6.0               2/16
7.0               2/16
9.0               1/16
b. The mean of the population is (2 + 3 + 5 + 9)/4 = 4.75 and the mean of the sample means is (2.0 × 1 + 2.5 × 2 + 3.0 × 1 + 3.5 × 2 + 4.0 × 2 + 5.0 × 1 + 5.5 × 2 + 6.0 × 2 + 7.0 × 2 + 9.0 × 1)/16 = 4.75 as well.
c. The sample means target the population mean. Sample means make good estimators of population means because they target the value of the population mean instead of systematically underestimating or overestimating it.
12. a. The possible samples of size 2 are {(2, 2), (2, 3), (2, 5), (2, 9), (3, 2), (3, 3), (3, 5), (3, 9), (5, 2), (5, 3), (5, 5), (5, 9), (9, 2), (9, 3), (9, 5), (9, 9)}, which have the following medians and associated probabilities.
Sample Median    Probability
2.0              1/16
2.5              2/16
3.0              1/16
3.5              2/16
4.0              2/16
5.0              1/16
5.5              2/16
6.0              2/16
7.0              2/16
9.0              1/16
b. The median of the population is (3 + 5)/2 = 4.0, but the mean of the sample medians is (2.0 × 1 + 2.5 × 2 + 3.0 × 1 + 3.5 × 2 + 4.0 × 2 + 5.0 × 1 + 5.5 × 2 + 6.0 × 2 + 7.0 × 2 + 9.0 × 1)/16 = 4.75, so the values are not equal.
c. The sample medians do not target the population median of 4.0, so the sample medians do not make good estimators of the population medians.
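Exercises 11 and 12 build the same table twice, once for the sample mean and once for the sample median. A small Python sketch under the same setup (population {2, 3, 5, 9}, samples of size 2 drawn with replacement) can generate either sampling distribution from a single helper. `sampling_distribution` and `expected_value` are hypothetical names chosen for this illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product
from statistics import mean, median

# Population from Exercises 11-14, stored as exact fractions.
POPULATION = [Fraction(v) for v in (2, 3, 5, 9)]

# All 16 equally likely ordered samples of size 2, drawn with replacement.
SAMPLES = list(product(POPULATION, repeat=2))


def sampling_distribution(statistic):
    """Map each distinct value of the statistic to its probability."""
    counts = Counter(statistic(sample) for sample in SAMPLES)
    total = len(SAMPLES)
    return {value: Fraction(count, total) for value, count in sorted(counts.items())}


def expected_value(distribution):
    """Mean of a discrete distribution given as {value: probability}."""
    return sum(value * prob for value, prob in distribution.items())


mean_dist = sampling_distribution(mean)
median_dist = sampling_distribution(median)

if __name__ == "__main__":
    for value, prob in mean_dist.items():
        print(f"{float(value):4.1f}  {prob.numerator * 16 // prob.denominator}/16")
    print("mean of sample means:    ", float(expected_value(mean_dist)))
    print("population mean:         ", float(mean(POPULATION)))
    print("mean of sample medians:  ", float(expected_value(median_dist)))
    print("population median:       ", float(median(POPULATION)))
```

For samples of size 2 the median and the mean of a sample coincide, which is why both tables are identical and the sample medians average to the population mean rather than to the population median.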
13. a. The possible samples of size 2 are {(2, 2), (2, 3), (2, 5), (2, 9), (3, 2), (3, 3), (3, 5), (3, 9), (5, 2), (5, 3), (5, 5), (5, 9), (9, 2), (9, 3), (9, 5), (9, 9)}, which have the following ranges and associated probabilities.
Sample Range    Probability
0               4/16
1               2/16
2               2/16
3               2/16
4               2/16
6               2/16
7               2/16
b. The range of the population is 9 - 2 = 7.0, but the mean of the sample ranges is (0 × 4 + 1 × 2 + 2 × 2 + 3 × 2 + 4 × 2 + 6 × 2 + 7 × 2)/16 = 2.875, so the values are not equal.
c. The sample ranges do not target the population range of 7.0, so sample ranges do not make good estimators of the population range.
14. a. The possible samples of size 2 are {(2, 2), (2, 3), (2, 5), (2, 9), (3, 2), (3, 3), (3, 5), (3, 9), (5, 2), (5, 3), (5, 5), (5, 9), (9, 2), (9, 3), (9, 5), (9, 9)}, which have the following variances and associated probabilities.
Sample Variance (s²)    Probability
0.0                     4/16
0.5                     2/16
2.0                     2/16
4.5                     2/16
8.0                     2/16
18.0                    2/16
24.5                    2/16
b. The variance of the population is 7.2 and the mean of the sample variances is (4 × 0 + 2 × 0.5 + 2 × 2.0 + 2 × 4.5 + 2 × 8.0 + 2 × 18.0 + 2 × 24.5)/16 = 7.2. The two values are equal.
c. The sample variances do target the population variance of 7.2, so sample variances do make good estimators of the population variance.
15. The possible birth samples are {(b, b), (b, g), (g, b), (g, g)}.
Proportion of Girls    Probability
0                      0.25
0.5                    0.50
1                      0.25
Yes. The proportion of girls in 2 births is 0.5, and the mean of the sample proportions is 0.5. The result suggests that a sample proportion is an unbiased estimator of the population proportion.
16. The possible birth samples are {bbb, bbg, bgb, gbb, ggg, ggb, gbg, bgg}.
Proportion of Girls    Probability
0                      1/8
1/3                    3/8
2/3                    3/8
1                      1/8
Yes. The proportion of girls in 3 births is 0.5 and the mean of the sample proportions is 0.5. The result suggests that a sample proportion is an unbiased estimator of the population proportion.
17. The possibilities are: both questions incorrect, one question correct (two choices), both questions correct.
a.
Proportion Correct    Probability
0/2 = 0               (4/5)(4/5) = 16/25
1/2 = 0.5             2(1/5)(4/5) = 8/25
2/2 = 1               (1/5)(1/5) = 1/25
b. The mean is (16 × 0 + 8 × 0.5 + 1 × 1)/25 = 0.2.
c. Yes, the sampling distribution of the sample proportions has a mean of 0.2 and the population proportion is also 0.2 (because there is 1 correct answer among 5 choices). Yes, the mean of the sampling distribution of the sample proportions is always equal to the population proportion.
18. a. The proportions of 0, 0.5, and 1 have the following probabilities.
Proportion with Yellow Pods    Probability
0                              1/25
0.5                            8/25
1                              16/25
b. The mean is (1 × 0 + 8 × 0.5 + 16 × 1)/25 = 0.8.
c. Yes, the population proportion is 0.8 and the mean of the sampling proportions is also 0.8. The mean of the sampling distribution of proportions is always equal to the population proportion.
19. The formula yields P(0) = 1/(2(2 - 2 × 0)!(2 × 0)!) = 0.25, P(0.5) = 1/(2(2 - 2 × 0.5)!(2 × 0.5)!) = 0.5, and P(1) = 1/(2(2 - 2 × 1)!(2 × 1)!) = 0.25, which describes the sampling distribution of the sample proportions. The formula is just a different way of presenting the same information in the table that describes the sampling distribution.
20. Sample values of the mean absolute deviation (MAD) do not usually target the value of the population MAD, so a MAD statistic is not good for estimating a population MAD. If the population of {4, 5, 9} from Example 5 is used, the sample MAD values of 0, 0.5, 2, and 2.5 have corresponding probabilities of 3/9, 2/9, 2/9, and 2/9. For these values, the population MAD is 2, but the sample MAD values have a mean of 1.1, so the mean of the sample MAD values (1.1) is not equal to the population MAD of 2.0.

Section 6-4: The Central Limit Theorem
1. The sample must have more than 30 values, or there must be evidence that the population of systolic blood pressures has a normal distribution.
2. No, because the original population is normally distributed, the sample means will be normally distributed for any sample size, not just for n > 30.
3. μ_x̄ represents the mean of all sample means and σ_x̄ represents the standard deviation of all sample means. For samples of 36 IQ scores, μ_x̄ = 100 and σ_x̄ = 15/√36 = 2.5.
4. No. The sample of 50 annual incomes will tend to have a distribution that is skewed to the right, no matter how large the sample is. If we compute the sample mean, we can consider that value to be one value in a normally distributed population.
5. a. z = (0 - 1.2)/4.9 = -0.24, which has an area of 0.4052 to the left. (Excel: 0.4033)
fx | =NORM.DIST(0,1.2,4.9,1)
b. z = (0 - 1.2)/(4.9/√25) = -1.22, which has an area of 0.1112 to the left. (Excel: 0.1104)
fx | =NORM.DIST(0,1.2,4.9/SQRT(25),1)
c. Because the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
6. a. z = (2 - 1.2)/4.9 = 0.16, which has an area of 1 - 0.5636 = 0.4364 to the right. (Excel: 0.4352)
fx | =1-NORM.DIST(2,1.2,4.9,1)
b. z = (2 - 1.2)/(4.9/√16) = 0.65, which has an area of 1 - 0.7422 = 0.2578 to the right. (Excel: 0.2569)
fx | =1-NORM.DIST(2,1.2,4.9/SQRT(16),1)
c. Because the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
7. a. For x = 0, z = (0 - 1.2)/4.9 = -0.24, and for x = 3, z = (3 - 1.2)/4.9 = 0.37, which have an area of 0.6443 - 0.4052 = 0.2391 between them. (Excel: 0.2401)
fx | =NORM.DIST(3,1.2,4.9,1)-NORM.DIST(0,1.2,4.9,1)
b. For x̄ = 0, z = (0 - 1.2)/(4.9/√9) = -0.73, and for x̄ = 3, z = (3 - 1.2)/(4.9/√9) = 1.10, which have an area of 0.8643 - 0.2327 = 0.6316 between them. (Excel: 0.6335)
fx | =NORM.DIST(3,1.2,4.9/SQRT(9),1)-NORM.DIST(0,1.2,4.9/SQRT(9),1)
c. Because the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
8. a. For x = 0.5, z = (0.5 - 1.2)/4.9 = -0.14, and for x = 2.5, z = (2.5 - 1.2)/4.9 = 0.27, which have an area of 0.6064 - 0.4443 = 0.1621 between them. (Excel: 0.1614)
fx | =NORM.DIST(2.5,1.2,4.9,1)-NORM.DIST(0.5,1.2,4.9,1)
b. For x̄ = 0.5, z = (0.5 - 1.2)/(4.9/√4) = -0.29, and for x̄ = 2.5, z = (2.5 - 1.2)/(4.9/√4) = 0.53, which have an area of 0.7019 - 0.3859 = 0.3160 between them. (Excel: 0.3146)
fx | =NORM.DIST(2.5,1.2,4.9/SQRT(4),1)-NORM.DIST(0.5,1.2,4.9/SQRT(4),1)
c. Because the original population has a normal distribution, the distribution of sample means is a normal distribution for any sample size.
9. a. z = (148 - 189)/39 = -1.05, which has an area of 1 - 0.1469 = 0.8531 to the right. (Excel: 0.8534)
fx | =1-NORM.DIST(148,189,39,1)
b. z = (148 - 189)/(39/√27) = -5.46, which has an area of 1 - 0.0001 = 0.9999 to the right. (Excel: 0.99999998)
fx | =1-NORM.DIST(148,189,39/SQRT(27),1)
c. It appears that the elevator would be overloaded with 27 adult males, which is probably unlikely but possible. Also, it is likely that a safety factor is built in so that the elevator can safely take a load greater than the 4000 lb capacity on the placard. But to be safe, instead of boarding the elevator full of adult men, it would be wiser to wait for another elevator.
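The difference between parts (a) and (b) in Exercises 5 through 9 is the standard deviation used: σ for an individual value, and σ/√n for the mean of n values. A brief Python sketch using the Exercise 9 numbers (μ = 189, σ = 39, n = 27) illustrates the idea. The helper names are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_mean_exceeds(threshold, mu, sigma, n=1):
    """P(mean of n values > threshold); n = 1 gives the individual-value case.

    Analogous to =1-NORM.DIST(threshold,mu,sigma/SQRT(n),1).
    """
    dist = NormalDist(mu=mu, sigma=standard_error(sigma, n))
    return 1.0 - dist.cdf(threshold)


# Exercise 9(a): one adult male weighs more than 148 lb.
p_individual = prob_mean_exceeds(148, mu=189, sigma=39)

# Exercise 9(b): the mean weight of 27 adult males exceeds 148 lb.
p_group_of_27 = prob_mean_exceeds(148, mu=189, sigma=39, n=27)

if __name__ == "__main__":
    print(f"standard error for n = 27: {standard_error(39, 27):.3f}")
    print(f"P(x > 148):     {p_individual:.4f}")
    print(f"P(x-bar > 148): {p_group_of_27:.8f}")
```

Dividing σ by √n narrows the distribution of sample means, which is why the group probability in part (b) is so much closer to 1 than the individual probability in part (a).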
10. a. z = (22 - 18.2)/1 = 3.8, which has a probability of 0.9999 to the left. So the percentage is 99.99%.
fx | =NORM.DIST(22,18.2,1,1)
b. z = (18.5 - 18.2)/(1/√36) = 1.8, which has a probability of 0.9641 to the left. No, when considering the diameters of manholes, we should use a design based on individual men, not samples of 36 men.
fx | =NORM.DIST(18.5,18.2,1/SQRT(36),1)
11. a. The mean weight of passengers is 3500/25 = 140 lb.
b. z = (140 - 189)/(39/√25) = -6.28, which has a probability of 1 - 0.0001 = 0.9999 (Excel: 0.999999998) to the right.
fx | =1-NORM.DIST(140,189,39/SQRT(25),1)
c. z = (175 - 189)/(39/√20) = -1.61, which has a probability of 1 - 0.0537 = 0.9463 (Excel: 0.9458) to the right.
fx | =1-NORM.DIST(175,189,39/SQRT(20),1)
d. The new capacity of 20 passengers does not appear to be safe enough because the probability of overloading is too high.
12. a. z = (140 - 189)/(39/√50) = -8.88, which has a probability of 1 - 0.0001 = 0.9999 to the right (Excel: 1.0000 when rounded to four decimal places).
fx | =1-NORM.DIST(140,189,39/SQRT(50),1)
b. z = (174 - 189)/(39/√14) = -1.44, which has a probability of 1 - 0.0749 = 0.9251 to the right (Excel: 0.9249). Because there is a high probability of overloading, the new ratings do not appear to be safe when the boat is loaded with 14 male passengers.
fx | =1-NORM.DIST(174,189,39/SQRT(14),1)
13. a. For x = 211, z = (211 - 171)/46 = 0.87, and for x = 140, z = (140 - 171)/46 = -0.67, which have a probability of 0.8078 - 0.2514 = 0.5564 between them (Excel: 0.5575).
fx | =NORM.DIST(211,171,46,1)-NORM.DIST(140,171,46,1)
b. For x̄ = 211, z = (211 - 171)/(46/√25) = 4.35, and for x̄ = 140, z = (140 - 171)/(46/√25) = -3.37, which have a probability of 0.9999 - 0.0004 = 0.9995 between them (Excel: 0.9996).
fx | =NORM.DIST(211,171,46/SQRT(25),1)-NORM.DIST(140,171,46/SQRT(25),1)
c. Part (a), because the ejection seats will be occupied by individual women, not groups of women.
14. z = (167.6 - 189.0)/(39/√37) = -3.34, which has a probability of 1 - 0.0004 = 0.9996 to the right. There is a 0.9996 probability that the aircraft is overloaded. Because that probability is so high, the pilot should take action, such as removing excess fuel and/or requiring that some passengers disembark and take a later flight.
fx | =1-NORM.DIST(167.6,189,39/SQRT(37),1)
15. a. z = (72 - 68.6)/2.8 = 1.21, which has a probability of 0.8869 (Excel: 0.8877).
fx | =NORM.DIST(72,68.6,2.8,1)
b. z = (72 - 68.6)/(2.8/√100) = 12.1, which has a probability of 0.9999. (Excel: 1.0000 when rounded to four decimal places.)
fx | =NORM.DIST(72,68.6,2.8/SQRT(100),1)
c. The probability of part (a) is more relevant because it shows that about 89% of male passengers will not need to bend. The result from part (b) concerns the mean height of 100 men, so it does not give us useful information about the comfort and safety of individual male passengers.
d. Because men are generally taller than women, a design that accommodates a suitable proportion of men will necessarily accommodate a greater proportion of women.
16. a. z = (53 - 51.6)/2.2 = 0.64, which has an area of 0.7389 to the left. The percentage is 73.89% (Excel: 73.77%), which is too high.
fx | =NORM.DIST(53,51.6,2.2,1)
b. The z score with 95% of values to the right is -1.645, which corresponds to a distance of 51.6 - 1.645 × 2.2 = 48.0 in.
fx | =NORM.INV(1-0.95,51.6,2.2)
c. The z score with 95% of values to the right is -1.645, which corresponds to a distance of 51.6 - 1.645 × 2.2/√40 = 51.0 in. Because the cockpits are occupied by individual pilots, it would be a big mistake to plan a design around the mean for 40 pilots. With a design of 51.0 in., about 39% of individual pilots would not have sufficient overhead grip reach, so far too many pilots would be disqualified.
fx | =NORM.INV(1-0.95,51.6,2.2/SQRT(40))
17. z = (2.6 - 15)/(8.6/√67) = -11.8, which has an area of 0.0001 to the left. If we assume that the true mean weight gain is 15 lb, the probability of getting a sample of 67 college students having a mean weight gain of 2.6 lb or less is 0.0001 (Excel: 0+). Because it is so unlikely to get a sample such as the one obtained, it appears that the assumption of a mean weight gain of 15 lb is an incorrect assumption. The Freshman 15 claim of a mean weight gain of 15 lb appears to be an incorrect claim.
fx | =NORM.DIST(2.6,15,8.6/SQRT(67),1)
18. z = (6.8 - 7)/(2.0/√12) = -0.35, which has an area of 0.3632 to the left (Excel: 0.3645). If the mean really is 7 hours as assumed, the probability of getting a sample with a mean of 6.8 hours or less is not low (such as 0.05 or less), so the mean of 6.8 hours is not significantly low. There isn't enough evidence to support the claim that the mean sleep time is less than 7 hours.
fx | =NORM.DIST(6.8,7,2.0/SQRT(12),1)
19. z = (3.0 - 0)/(4.9/√40) = 3.88, which has an area of 1 - 0.9999 = 0.0001 to the right (Excel: 0.0000540). If the mean really is 0 lb as assumed, the probability of getting a sample with a mean of 3.0 lb or higher is very small, so the mean of 3.0 lb is significantly high. There is strong evidence suggesting that the mean is actually higher than 0 lb, so the diet does appear to be effective. However, the mean weight loss of only 3.0 lb is not very much, so even though the diet appears to be effective, it doesn't appear to be worth using this diet. The amount of weight loss appears to have statistical significance but not practical significance.
fx | =1-NORM.DIST(3.0,0,4.9/SQRT(40),1)
20. a. Yes, the sampling is without replacement and the sample size of 50 is greater than 5% of the finite population size of 275. σ_x̄ = (16/√50) × √((275 - 50)/(275 - 1)) = 2.0504584
b. For x̄ = 105, z = (105 - 95.5)/2.0504584 = 4.63, and for x̄ = 95, z = (95 - 95.5)/2.0504584 = -0.24, which have a probability of 0.9999 - 0.4052 = 0.5947 between them (Excel: 0.5963).
fx | =NORM.DIST(105,95.5,2.0504584,1)-NORM.DIST(95,95.5,2.0504584,1)

Section 6-5: Assessing Normality
1. The requirement is satisfied because the sample size of 147 is greater than 30.
2. The histogram should be approximately bell-shaped, and the normal quantile plot should have points that approximate a straight-line pattern.
3. Either the points are not reasonably close to a straight-line pattern, or there is some systematic pattern that is not a straight-line pattern.
4. Because the histogram is roughly bell-shaped, we can conclude that the data are from a population having a normal distribution.
5. Normal, the points are reasonably close to a straight-line pattern, and there is no other pattern that is not a straight-line pattern.
6. Normal, the points are reasonably close to a straight-line pattern, and there is no other pattern that is not a straight-line pattern.
7. Not normal, the points are not reasonably close to a straight-line pattern, and there appears to be a pattern that is not a straight-line pattern.
8. Not normal, the points are not reasonably close to a straight-line pattern.
9. normal
10. not normal
11. not normal
12. normal
13. Normal, the points are reasonably close to a straight-line pattern and there is no other pattern that is not a straight-line pattern.
Graph for Exercise 13
14. Not normal, the points are not reasonably close to a straight-line pattern and there appears to be a pattern that is not a straight-line pattern.
Graph for Exercise 14
15. Not normal, the points are not reasonably close to a straight-line pattern and there appears to be a pattern that is not a straight-line pattern.
Graph for Exercise 15
16. Normal, the points are reasonably close to a straight-line pattern and there is no other pattern that is not a straight-line pattern.
Graph for Exercise 16
17. Normal; the points are reasonably close to a straight-line pattern, and there is no other pattern that is not a straight-line pattern. The points have coordinates (97.9, -1.28), (98.0, -0.52), (98.2, 0), (98.4, 0.52), (98.7, 1.28).
[Graph for Exercise 17]
18. Not normal; the points are not reasonably close to a straight-line pattern, and there appears to be a pattern that is not a straight-line pattern. The points have coordinates (6.8, -1.38), (7.0, -0.67), (7.0, -0.21), (7.0, 0.21), (8.1, 0.67), (17.3, 1.38).
[Graph for Exercise 18]
19. Not normal; the points are not reasonably close to a straight-line pattern, and there appears to be a pattern that is not a straight-line pattern. The points have coordinates (963, -1.53), (1027, -0.89), (1029, -0.49), (1034, -0.16), (1070, 0.16), (1079, 0.49), (1079, 0.89), (1439, 1.53).
[Graph for Exercise 19]
20. Not normal; the points are not reasonably close to a straight-line pattern, and there appears to be a pattern that is not a straight-line pattern. The points have coordinates (24, -1.59), (25, -0.97), (27, -0.59), (29, -0.28), (30, 0), (33, 0.28), (35, 0.59), (41, 0.97), (80, 1.59).
[Graph for Exercise 20]
21. a. Yes, the histogram or normal quantile plot will remain unchanged.
b. Yes, the histogram or normal quantile plot will remain unchanged.
c. No, the histogram and normal quantile plot will not indicate a normal distribution.
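The z coordinates listed in Exercises 17 through 20 can be reproduced with a short sketch. It assumes that the i-th of n sorted values is assigned the cumulative area (i - 0.5)/n, which agrees with the coordinates listed above; the function name `normal_quantile_points` is illustrative and does not come from the text.

```python
from statistics import NormalDist


def normal_quantile_points(values):
    """Pair each sorted value with the z score of its plotting position.

    Assumption: the i-th of n sorted values gets cumulative area (i - 0.5) / n.
    """
    data = sorted(values)
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [
        (x, round(std_normal.inv_cdf((i - 0.5) / n), 2))
        for i, x in enumerate(data, start=1)
    ]


# Sample values from Exercise 17
points = normal_quantile_points([97.9, 98.0, 98.2, 98.4, 98.7])
```

Plotting these (x, z) pairs and checking whether they fall near a straight line is the visual test applied in the answers above.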
22. The original values are not from a normally distributed population. After taking the logarithm of each value, the values appear to be from a normally distributed population. The original values are from a population with a lognormal distribution.
Section 6-6: Normal as Approximation to Binomial (Download Only)
1. a. The area below (to the left of) 502.5
b. The area between 501.5 and 502.5
c. The area above (to the right of) 502.5
2. Yes, the circumstances correspond to 100 independent trials of a binomial experiment in which the probability of success is 0.2. Also, with $n = 100$, $p = 0.2$, and $q = 0.8$, the requirements of $np = 100 \cdot 0.2 = 20 \ge 5$ and $nq = 100 \cdot 0.8 = 80 \ge 5$ are both satisfied.
3. $p = 0.2$, $q = 0.8$, $\mu = 20$, $\sigma = 4$; the value of $\mu = 20$ shows that for people who make random guesses for the 100 questions, the mean number of correct answers is 20. For people who make 100 random guesses, the standard deviation of $\sigma = 4$ is a measure of how much the numbers of correct responses vary.
4. The histogram should be approximately normal or bell-shaped, because sample proportions tend to approximate a normal distribution.
5. The requirements for the normal approximation are satisfied with $np = 20 \cdot 0.512 = 10.2 \ge 5$ and $nq = 20 \cdot 0.488 = 9.8 \ge 5$. $z_{x=7.5} = \frac{7.5 - 20 \cdot 0.512}{\sqrt{20 \cdot 0.512 \cdot 0.488}} = -1.23$, which has a probability of 0.1093 (Excel: 0.1102) to the left.
fx | =NORM.DIST(7.5,10.24,2.23542,1)
6. The requirement of $np \ge 5$ is not satisfied, because $np = 8 \cdot 0.512 = 4.1$. The normal approximation should not be used.
7. The requirement of $np \ge 5$ is not satisfied, because $np = 20 \cdot 0.2 = 4$. The normal approximation should not be used.
8. The requirements for the normal approximation are satisfied with $np = 50 \cdot 0.2 = 10 \ge 5$ and $nq = 50 \cdot 0.8 = 40 \ge 5$. $z_{x=11.5} = \frac{11.5 - 50 \cdot 0.2}{\sqrt{50 \cdot 0.2 \cdot 0.8}} = 0.53$ and $z_{x=12.5} = \frac{12.5 - 50 \cdot 0.2}{\sqrt{50 \cdot 0.2 \cdot 0.8}} = 0.88$, which have a probability of $0.8106 - 0.7019 = 0.1087$ (Excel: 0.1096) between them.
fx | =NORM.DIST(12.5,10,2.82843,1)-NORM.DIST(11.5,10,2.82843,1)
9. The requirements for the normal approximation are satisfied with $np = 100 \cdot 0.23 = 23 \ge 5$ and $nq = 100 \cdot 0.77 = 77 \ge 5$. $z_{x=19.5} = \frac{19.5 - 100 \cdot 0.23}{\sqrt{100 \cdot 0.23 \cdot 0.77}} = -0.83$, which has a probability of 0.2033 (Excel: 0.2028) to the left. No, 20 is not a significantly low number of white cars. (Excel: Using the binomial distribution: 0.2047.)
fx | =NORM.DIST(19.5,23,4.20833,1) and fx | =BINOM.DIST(19,100,0.23,1)
10. The requirements for the normal approximation are satisfied with $np = 100 \cdot 0.18 = 18 \ge 5$ and $nq = 100 \cdot 0.82 = 82 \ge 5$. $z_{x=24.5} = \frac{24.5 - 100 \cdot 0.18}{\sqrt{100 \cdot 0.18 \cdot 0.82}} = 1.69$, which has a probability of $1 - 0.9545 = 0.0455$ (Excel: 0.0453) to the right. Yes, 25 is a significantly high number of black cars. (Excel: Using the binomial distribution: 0.0496.)
fx | =1-NORM.DIST(24.5,18,3.84187,1) and fx | =1-BINOM.DIST(24,100,0.18,1)
11. The requirements for the normal approximation are satisfied with $np = 100 \cdot 0.10 = 10 \ge 5$ and $nq = 100 \cdot 0.90 = 90 \ge 5$. $z_{x=13.5} = \frac{13.5 - 100 \cdot 0.10}{\sqrt{100 \cdot 0.10 \cdot 0.90}} = 1.17$ and $z_{x=14.5} = \frac{14.5 - 100 \cdot 0.10}{\sqrt{100 \cdot 0.10 \cdot 0.90}} = 1.50$, which have a probability of $0.9332 - 0.8790 = 0.0542$ (Excel: 0.0549) between them. Determination of whether 14 red cars is significantly high should be based on the probability of 14 or more red cars, not the probability of exactly 14 red cars. (Excel: Using the binomial distribution: 0.0513.)
fx | =NORM.DIST(14.5,10,3.0,1)-NORM.DIST(13.5,10,3.0,1) and fx | =BINOM.DIST(14,100,0.10,0)
12. The requirements for the normal approximation are satisfied with $np = 100 \cdot 0.16 = 16 \ge 5$ and $nq = 100 \cdot 0.84 = 84 \ge 5$. $z_{x=9.5} = \frac{9.5 - 100 \cdot 0.16}{\sqrt{100 \cdot 0.16 \cdot 0.84}} = -1.77$ and $z_{x=10.5} = \frac{10.5 - 100 \cdot 0.16}{\sqrt{100 \cdot 0.16 \cdot 0.84}} = -1.50$, which have a probability of $0.0668 - 0.0384 = 0.0284$ (Excel: 0.0287) between them. Determination of whether 10 gray cars is significantly low should be based on the probability of 10 or fewer gray cars, not the probability of exactly 10 gray cars. (Excel: Using the binomial distribution: 0.0292.)
fx | =NORM.DIST(10.5,16,3.66607,1)-NORM.DIST(9.5,16,3.66607,1) and fx | =BINOM.DIST(10,100,0.16,0)
13. a. The requirements for the normal approximation are satisfied with $np = 879 \cdot 0.25 = 219.75 \ge 5$ and $nq = 879 \cdot 0.75 = 659.25 \ge 5$. $z_{x=230.5} = \frac{230.5 - 879 \cdot 0.25}{\sqrt{879 \cdot 0.25 \cdot 0.75}} = 0.84$ and $z_{x=231.5} = \frac{231.5 - 879 \cdot 0.25}{\sqrt{879 \cdot 0.25 \cdot 0.75}} = 0.92$, which have a probability of $0.8212 - 0.7995 = 0.0217$ (Excel: 0.0212) between them. (Excel: Using the binomial distribution: 0.0209.)
fx | =NORM.DIST(231.5,219.75,12.83793,1)-NORM.DIST(230.5,219.75,12.83793,1) and fx | =BINOM.DIST(231,879,0.25,0)
b. The requirements for the normal approximation are satisfied with $np = 879 \cdot 0.25 = 219.75 \ge 5$ and $nq = 879 \cdot 0.75 = 659.25 \ge 5$. $z_{x=230.5} = \frac{230.5 - 879 \cdot 0.25}{\sqrt{879 \cdot 0.25 \cdot 0.75}} = 0.84$, which has a probability of $1 - 0.7995 = 0.2005$ (Excel: 0.2012) to the right. The result of 231 overturned calls is not significantly high. (Excel: Using the binomial distribution: 0.2006.)
fx | =1-NORM.DIST(230.5,219.75,12.83793,1) and fx | =1-BINOM.DIST(230,879,0.25,1)
14. a. The requirements for the normal approximation are satisfied with $np = 879 \cdot 0.22 = 193.38 \ge 5$ and $nq = 879 \cdot 0.78 = 685.62 \ge 5$. $z_{x=230.5} = \frac{230.5 - 879 \cdot 0.22}{\sqrt{879 \cdot 0.22 \cdot 0.78}} = 3.02$ and $z_{x=231.5} = \frac{231.5 - 879 \cdot 0.22}{\sqrt{879 \cdot 0.22 \cdot 0.78}} = 3.10$, which have a probability of $0.9990 - 0.9987 = 0.0003$ between them.
fx | =NORM.DIST(231.5,193.38,12.28155,1)-NORM.DIST(230.5,193.38,12.28155,1) and fx | =BINOM.DIST(231,879,0.22,0)
b. The requirements for the normal approximation are satisfied with $np = 879 \cdot 0.22 = 193.38 \ge 5$ and $nq = 879 \cdot 0.78 = 685.62 \ge 5$. $z_{x=230.5} = \frac{230.5 - 879 \cdot 0.22}{\sqrt{879 \cdot 0.22 \cdot 0.78}} = 3.02$, which has a probability of $1 - 0.9987 = 0.0013$ to the right. The result of 231 overturned calls is significantly high. (Excel: Using the binomial distribution: 0.0015.)
fx | =1-NORM.DIST(230.5,193.38,12.28155,1) and fx | =1-BINOM.DIST(230,879,0.22,1)
15. a. The requirements for the normal approximation are satisfied with $np = 250 \cdot 0.51 = 127.5 \ge 5$ and $nq = 250 \cdot 0.49 = 122.5 \ge 5$. $z_{x=109.5} = \frac{109.5 - 250 \cdot 0.51}{\sqrt{250 \cdot 0.51 \cdot 0.49}} = -2.28$, which has a probability of 0.0113 (Excel: 0.0114) to the left. (Excel: Using the binomial distribution: 0.0113.)
fx | =NORM.DIST(109.5,127.5,7.90411,1) and fx | =BINOM.DIST(109,250,0.51,1)
b. The result of 109 is significantly low.
16. a. The requirements for the normal approximation are satisfied with $np = 650 \cdot 0.12 = 78 \ge 5$ and $nq = 650 \cdot 0.88 = 572 \ge 5$. $z_{x=85.5} = \frac{85.5 - 650 \cdot 0.12}{\sqrt{650 \cdot 0.12 \cdot 0.88}} = 0.91$, which has a probability of $1 - 0.8186 = 0.1814$ (Excel: 0.1827) to the right. (Excel: Using the binomial distribution: 0.1819.)
fx | =1-NORM.DIST(85.5,78.0,8.28493,1) and fx | =1-BINOM.DIST(85,650,0.12,1)
b. The result of 86 people with green eyes is not significantly high.
17. a. The requirements for the normal approximation are satisfied with $np = 929 \cdot 0.75 = 696.75 \ge 5$ and $nq = 929 \cdot 0.25 = 232.25 \ge 5$. $z_{x=704.5} = \frac{704.5 - 929 \cdot 0.75}{\sqrt{929 \cdot 0.75 \cdot 0.25}} = 0.59$, which has a probability of $1 - 0.7224 = 0.2776$ (Excel: 0.2785) to the right. (Excel: Using the binomial distribution: 0.2799.)
fx | =1-NORM.DIST(704.5,696.75,13.19801,1) and fx | =1-BINOM.DIST(704,929,0.75,1)
b. The result of 705 peas with red flowers is not significantly high.
c. The result of 705 peas with red flowers is not strong evidence against Mendel's assumption that 3/4 of peas will have red flowers.
18. a. The requirements for the normal approximation are satisfied with $np = 1480 \cdot 0.292 = 432.16 \ge 5$ and $nq = 1480 \cdot 0.708 = 1047.84 \ge 5$. $z_{x=454.5} = \frac{454.5 - 1480 \cdot 0.292}{\sqrt{1480 \cdot 0.292 \cdot 0.708}} = 1.28$, which has a probability of $1 - 0.8997 = 0.1003$ (Excel: 0.1008) to the right. (Excel: Using the binomial distribution: 0.1012.)
fx | =1-NORM.DIST(454.5,432.16,17.49198,1) and fx | =1-BINOM.DIST(454,1480,0.292,1)
b. The result of 455 who have sleepwalked is not significantly high.
c. The result of 455 does not provide strong evidence against the rate of 29.2%.
19. a. Using the normal approximation: $\mu = 1002 \cdot 0.61 = 611.22$, $\sigma = \sqrt{1002 \cdot 0.61 \cdot 0.39} = 15.4394$, and $z_{x=700.5} = \frac{700.5 - 611.22}{\sqrt{1002 \cdot 0.61 \cdot 0.39}} = 5.78$, which has a probability of 0.0001 (Excel: 0.0000) to the right.
fx | =1-NORM.DIST(700.5,611.22,15.43942,1) and fx | =1-BINOM.DIST(700,1002,0.61,1)
b. The result suggests that the surveyed people did not respond accurately.
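The continuity-correction pattern used throughout these answers can be checked with a short Python sketch. It assumes the requirements $np \ge 5$ and $nq \ge 5$ stated above; the function names are illustrative and are not part of the text or of Excel.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Exact binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def _approximating_normal(n, p):
    """Normal distribution with the binomial mean np and standard deviation sqrt(npq)."""
    q = 1 - p
    if n * p < 5 or n * q < 5:
        raise ValueError("requirements np >= 5 and nq >= 5 are not satisfied")
    return NormalDist(mu=n * p, sigma=sqrt(n * p * q))


def normal_approx_exactly(k, n, p):
    """Approximate P(X = k) by the area between k - 0.5 and k + 0.5."""
    dist = _approximating_normal(n, p)
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)


def normal_approx_at_least(k, n, p):
    """Approximate P(X >= k) by the area to the right of k - 0.5."""
    dist = _approximating_normal(n, p)
    return 1 - dist.cdf(k - 0.5)


# Exactly 7 successes in 11 trials with p = 0.512
approx_exactly_7 = normal_approx_exactly(7, 11, 0.512)
exact_exactly_7 = binomial_pmf(7, 11, 0.512)

# At least 86 successes in 650 trials with p = 0.12
approx_at_least_86 = normal_approx_at_least(86, 650, 0.12)
```

Because the sketch works with unrounded z scores, its results line up with the values marked "Excel" rather than with the table-based values.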
20. a. Using the normal approximation: $\mu = 420{,}095 \cdot 0.00034 = 142.83$, $\sigma = \sqrt{420{,}095 \cdot 0.00034 \cdot 0.99966} = 11.9492$, and $z_{x=135.5} = \frac{135.5 - 142.83}{\sqrt{420{,}095 \cdot 0.00034 \cdot 0.99966}} = -0.61$, which has a probability of 0.2709 (Excel: 0.2697) to the left. (Excel: Using the binomial distribution: 0.2726.)
fx | =NORM.DIST(135.5,142.83,11.9492,1) and fx | =BINOM.DIST(135,420095,0.000340,1)
b. Media reports appear to be wrong.
21. (1) The requirements for the normal approximation are satisfied with $np = 11 \cdot 0.512 = 5.632 \ge 5$ and $nq = 11 \cdot 0.488 = 5.368 \ge 5$. $z_{x=6.5} = \frac{6.5 - 11 \cdot 0.512}{\sqrt{11 \cdot 0.512 \cdot 0.488}} = 0.52$ and $z_{x=7.5} = \frac{7.5 - 11 \cdot 0.512}{\sqrt{11 \cdot 0.512 \cdot 0.488}} = 1.13$, which have a probability of $0.8708 - 0.6985 = 0.1723$ between them.
(2) 0.1704
fx | =NORM.DIST(7.5,5.632,1.65783,1)-NORM.DIST(6.5,5.632,1.65783,1)
(3) 0.1726
fx | =BINOM.DIST(7,11,0.512,0)
No, the approximations are not off by very much.
22. The z score that corresponds to a 0.95 probability is 1.645. This means that we have to solve the equation $1.645\sqrt{n \cdot 0.9005 \cdot 0.0995} + n \cdot 0.9005 = 213$ for n. This has a solution of 229 (Excel: 230) reservations.
Quick Quiz
1.
2. $z = P_{90} = 1.28$
fx | =NORM.INV(0.90,0,1)
3. $P(z > 1.55) = 1 - P(z < 1.55) = 1 - 0.9394 = 0.0606$
fx | =1-NORM.DIST(1.55,0,1,1)
4. $P(-1.00 < z < 2.00) = P(z < 2.00) - P(z < -1.00) = 0.9772 - 0.1587 = 0.8185$ (Excel: 0.8186)
fx | =NORM.DIST(2.00,0,1,1)-NORM.DIST(-1.00,0,1,1)
5. a. $\mu = 0$ and $\sigma = 1$
b. $\mu_{\bar{x}}$ represents the mean of all sample means, and $\sigma_{\bar{x}}$ represents the standard deviation of all sample means.
6. Since $n > 30$, the distribution of the sample means will be a normal distribution.
7. $z_{x=25.0} = \frac{25.0 - 23.5}{1.1} = 1.36$, which has an area of $1 - 0.9131 = 0.0869$ to the right. The probability is 0.0869. (Excel: 0.0863)
fx | =1-NORM.DIST(25.0,23.5,1.1,1)
8. $z_{x=22.0} = \frac{22.0 - 23.5}{1.1} = -1.36$ and $z_{x=26.0} = \frac{26.0 - 23.5}{1.1} = 2.27$. The area between the two scores is $0.9884 - 0.0869 = 0.9015$. The probability is 0.9015. (Excel: 0.9021)
fx | =NORM.DIST(26.0,23.5,1.1,1)-NORM.DIST(22.0,23.5,1.1,1)
9. $z_{\bar{x}=23.0} = \frac{23.0 - 23.5}{1.1/\sqrt{9}} = -1.36$, which has an area of $1 - 0.0869 = 0.9131$ to the right. The probability is 0.9131. (Excel: 0.9137)
fx | =1-NORM.DIST(23.0,23.5,1.1/SQRT(9),1)
10. The normal quantile plot suggests that the scores are from a normally distributed population.
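The difference between an individual value and a sample mean in the Quick Quiz answers can be illustrated with Python's standard library. This is a sketch under the assumption that the population has mean 23.5 and standard deviation 1.1, as in the formulas above; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 23.5, 1.1  # population mean and standard deviation from the quiz formulas

# One individual value: area to the right of 25.0
individual = NormalDist(MU, SIGMA)
p_value_above_25 = 1 - individual.cdf(25.0)

# Mean of n = 9 values: the standard deviation shrinks to sigma / sqrt(n)
sample_mean = NormalDist(MU, SIGMA / sqrt(9))
p_mean_above_23 = 1 - sample_mean.cdf(23.0)
```

Both results use unrounded z scores, so they correspond to the values marked "Excel" (0.0863 and 0.9137).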
Review Exercises
1.
a. $P(z > -1.37) = 1 - P(z < -1.37) = 1 - 0.0853 = 0.9147$
fx | =1-NORM.DIST(-1.37,0,1,1)
b. $P(z < 2.34) = 0.9904$
fx | =NORM.DIST(2.34,0,1,1)
c. $P(-0.67 < z < 1.29) = P(z < 1.29) - P(z < -0.67) = 0.9015 - 0.2514 = 0.6501$ (Excel: 0.6500)
fx | =NORM.DIST(1.29,0,1,1)-NORM.DIST(-0.67,0,1,1)
d. $Q_1 = -0.67$
fx | =NORM.INV(0.25,0,1)
e. $P(\bar{z} > 0.23) = P\left(z > \frac{0.23 - 0.0}{1/\sqrt{9}}\right) = P(z > 0.69) = 1 - 0.7549 = 0.2451$
fx | =1-NORM.DIST(0.23,0,1/SQRT(9),1)
2.
a. $z_{0.99} = 2.33$
fx | =NORM.INV(0.99,0,1)
b. $z_{0.01} = -2.33$
fx | =NORM.INV(0.01,0,1)
c. $z_{0.025} = -1.96$
fx | =NORM.INV(0.025,0,1)
3.
a. An unbiased estimator is a statistic that targets the value of the population parameter in the sense that the sampling distribution of the statistic has a mean that is equal to the corresponding parameter.
b. mean, variance, and proportion
c. true
4.
a. The distribution of sample means is normal.
b. $\mu_{\bar{x}} = 33.64$ cm
c. $\sigma_{\bar{x}} = \frac{4.14}{\sqrt{25}} = 0.83$ cm
5.
a. The area under the curve is 1.
b. The median is 3037.1 g.
c. The mode is 3037.1 g.
d. The variance is $(706.3\ \text{g})^2 = 498{,}859.7\ \text{g}^2$.
6.
a. The z score with 2% of values to the right is 2.054, which corresponds to an IQ score of $100 + 2.054 \cdot 15 = 131$.
fx | =NORM.INV(1-0.02,100,15)
b. $z_{\bar{x}=131} = \frac{131 - 100}{15/\sqrt{4}} = 4.13$, which has an area of 0.0001 to the right. (Excel: 0.0000179)
fx | =1-NORM.DIST(131,100,15/SQRT(4),1)
c. No. It is very possible that the 4 subjects have a mean of 131 while some of them have scores below 131.
7.
a. $z_{x=70} = \frac{70.0 - 63.7}{2.9} = 2.17$, which has a probability of $1 - 0.9850 = 0.0150$, or 1.50% (Excel: 1.49%), to the right.
fx | =1-NORM.DIST(70.0,63.7,2.9,1)
b. The z score for the upper 2.5% is 1.96, which corresponds to a doorway height of $1.96 \cdot 2.9 + 63.7 = 69.4$ in.
fx | =NORM.INV(1-0.025,63.7,2.9)
8.
a. $z_{x=54} = \frac{54 - 59.7}{2.5} = -2.28$, which has an area of 0.0113, or 1.13%, to the left.
fx | =NORM.DIST(54,59.7,2.5,1)
b. The z score with 95% of values to the left is 1.645, which corresponds to a height of $59.7 + 1.645 \cdot 2.5 = 63.8$ in.
fx | =NORM.INV(0.95,59.7,2.5)
9. The z score with 1% of values to the left is -2.326, which corresponds to a height of $59.7 - 2.326 \cdot 2.5 = 53.9$ in. The z score with 1% of values to the right is 2.326, which corresponds to a height of $59.7 + 2.326 \cdot 2.5 = 65.5$ in. Yes, 67 in. is significantly high.
fx | =NORM.INV(0.01,59.7,2.5) and fx | =NORM.INV(0.99,59.7,2.5)
10. No, a histogram is far from bell-shaped, and a normal quantile plot reveals a pattern of points that is far from a straight-line pattern.
Cumulative Review Exercises
1.
a. The mean is $\bar{x} = \frac{35 + 35 + 20 + 50 + 95 + 75 + 45 + 50 + 30 + 35 + 30 + 30}{12} = 44.2$ minutes.
b. The median is $\frac{35 + 35}{2} = 35.0$ minutes.
c. $s = \sqrt{\frac{(35 - 44.2)^2 + (35 - 44.2)^2 + \cdots + (30 - 44.2)^2 + (30 - 44.2)^2}{12 - 1}} = 21.4$ minutes
d. $s^2 = (21.409\ \text{minutes})^2 = 458.3\ \text{minutes}^2$
e. $z_{x=95} = \frac{95 - 44.2}{21.4} = 2.37$
fx | =STANDARDIZE(95,44.2,21.4)
f. Yes. The z score of 2.37 is greater than 2, so the longest wait time of 95 minutes is significantly high.
g. ratio level of measurement
h. The exact unrounded wait times would have been continuous, but the listed wait times are all multiples of 5, so these rounded wait times are discrete data.
XLSTAT Output
Statistic                     Wait Times (Minutes)
Nbr. of observations          12
Median                        35.000
Mean                          44.167
Variance (n-1)                458.333
Standard deviation (n-1)      21.409
2.
a. $Q_1 = \frac{30 + 30}{2} = 30.0$ minutes, $Q_2 = \frac{35 + 35}{2} = 35.0$ minutes, $Q_3 = \frac{50 + 50}{2} = 50.0$ minutes
b.
XLSTAT Output
Statistic                     Wait Times (Minutes)
Nbr. of observations          12
Minimum                       20.000
Maximum                       95.000
1st Quartile                  30.000
Median                        35.000
3rd Quartile                  50.000
c. No, the distribution appears to be skewed instead of being normal.
d. No. The points don't fit a straight-line pattern very well, and there appears to be some other pattern that is not a straight line.
3.
a. $z_{x=221.5} = \frac{221.5 - 246.3}{12.4} = -2.00$, which has an area of 0.0228 to the left.
fx | =NORM.DIST(221.5,246.3,12.4,1)
b. $z_{x=220} = \frac{220 - 246.3}{12.4} = -2.12$ and $z_{x=250} = \frac{250 - 246.3}{12.4} = 0.30$, which have an area of $0.6179 - 0.0170 = 0.6009$ between them. (Excel: 0.6003)
fx | =NORM.DIST(250,246.3,12.4,1)-NORM.DIST(220,246.3,12.4,1)
c. The z score with 95% of values to the left is 1.645, so $P_{95} = 246.3 + 1.645 \cdot 12.4 = 266.7$ mm.
fx | =NORM.INV(0.95,246.3,12.4)
d. $z_{\bar{x}=250} = \frac{250 - 246.3}{12.4/\sqrt{16}} = 1.19$, which has an area of $1 - 0.8830 = 0.1170$ to the right. (Excel: 0.1163)
fx | =1-NORM.DIST(250,246.3,12.4/SQRT(16),1)
e. The result from part (c) is more helpful because it provides a foot length that is potentially the maximum to be used for shoe sizes. The result from part (d) is based on the distribution of means from groups of 16 women, which is irrelevant for planning shoe sizes.
4.
a. $\bar{B}$ is the event of selecting someone who does not have blue eyes.
b. $P(\bar{B}) = 1 - P(B) = 1 - 0.35 = 0.65$
c. $0.35 \cdot 0.35 \cdot 0.35 = 0.0429$
d. ${}_{100}C_{40}\,0.35^{40}\,0.65^{60} + \cdots + {}_{100}C_{100}\,0.35^{100}\,0.65^{0} = 0.1724$
fx | =1-BINOM.DIST(39,100,0.35,1)
Using the normal approximation: 0.1727 (Table: 0.1736) (see Section 6-6)
fx | =1-NORM.DIST(39.5,100*0.35,SQRT(100*0.35*0.65),1)
e. No, a result of 40 people with blue eyes among 100 randomly selected people is not significantly high.
5.
a. The mean is $\bar{x} = \frac{97.6 + 98.2 + 99.6 + 98.7 + 99.4 + 98.2 + 980 + 98.6 + 98.6}{9} = 196.54$°F, which is far too high to be reasonable as the mean body temperature of a group of adult males.
b. The value of 980°F is an outlier and an error because it is not a possible body temperature of an adult male.
c. If the value of 980°F is discarded as an error, the mean appears to be 98.61°F. It is also reasonable to believe that the value of 980°F was incorrectly entered without the decimal point, and if 980°F is changed to 98.0°F, the mean appears to be 98.54°F.
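The summary statistics reported for the 12 wait times in Exercise 1 can be reproduced with Python's `statistics` module. This is only a sketch for checking the arithmetic; the variable names are illustrative.

```python
from statistics import mean, median, stdev, variance

# Wait times (minutes) from Cumulative Review Exercise 1
wait_times = [35, 35, 20, 50, 95, 75, 45, 50, 30, 35, 30, 30]

x_bar = mean(wait_times)
middle = median(wait_times)
s = stdev(wait_times)             # sample standard deviation, divides by n - 1
s_squared = variance(wait_times)  # sample variance, divides by n - 1

# z score of the longest wait time, using unrounded statistics
z_longest = (max(wait_times) - x_bar) / s
```

Note that `stdev` and `variance` divide by n - 1 = 11, which matches the "(n-1)" rows of the XLSTAT output.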
Chapter 7: Estimating Parameters and Determining Sample Sizes
Section 7-1: Estimating a Population Proportion
1. The confidence level (such as 95%) was not provided.
2. When using 51% to estimate the value of the population percentage for those who approve of Kavanaugh, the maximum likely difference between 51% and the true population percentage is 3.5 percentage points, so the interval from $51\% - 3.5\% = 47.5\%$ to $51\% + 3.5\% = 54.5\%$ is likely to contain the true population percentage.
3. $\hat{p} = 0.51$ is the sample proportion; $\hat{q} = 0.49$ (found from evaluating $1 - \hat{p}$); $n = 1144$ is the sample size; $E = 0.035$ is the margin of error; $p$ is the population proportion, which is unknown. The value of $\alpha$ is 0.05.
4. The 95% confidence interval will be wider than the 80% confidence interval. A confidence interval must be wider in order for us to be more confident that it captures the true value of the population proportion. (Think of estimating the age of a classmate. You might be 90% confident that she is between 20 and 30, but you might be 99.9% confident that she is between 10 and 40.)
5. $z_{0.05} = 1.645$
fx | =NORM.S.INV(1-0.10/2)
6. $z_{0.005} = 2.576$ (Table: 2.575)
fx | =NORM.S.INV(1-0.01/2)
7. $z_{0.0025} = 2.81$
fx | =NORM.S.INV(1-0.005/2)
8. $z_{0.01} = 2.33$
fx | =NORM.S.INV(1-0.02/2)
9. $\hat{p} = \frac{0.192 + 0.116}{2} = 0.154$ and $E = \frac{0.192 - 0.116}{2} = 0.038$; $0.154 \pm 0.038$
10. $\hat{p} = \frac{0.193 + 0.283}{2} = 0.238$ and $E = \frac{0.283 - 0.193}{2} = 0.045$; $0.238 \pm 0.045$
11. $\hat{p} = \frac{0.0847 + 0.153}{2} = 0.1189$ and $E = \frac{0.153 - 0.0847}{2} = 0.0342$; $0.1189 - 0.0342 < p < 0.1189 + 0.0342$, or $0.0847 < p < 0.153$
12. $0.255 - 0.046 < p < 0.255 + 0.046$, or $0.209 < p < 0.301$
13. a. $\hat{p} = 88/240 = 0.367$
b. $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96\sqrt{\frac{\left(\frac{88}{240}\right)\left(\frac{152}{240}\right)}{240}} = 0.0610$
c. $\hat{p} - E < p < \hat{p} + E$: $0.367 - 0.0610 < p < 0.367 + 0.0610$, or $0.306 < p < 0.428$
d. We have 95% confidence that the interval from 0.306 to 0.428 actually does contain the true value of the population proportion of successful challenges.
XLSTAT Output for Exercise 13: 95% confidence interval on the proportion (Wald): ( 0.306, 0.428 )
14. a. $\hat{p} = 153/5924 = 0.0258$
b. $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 2.576\sqrt{\frac{\left(\frac{153}{5924}\right)\left(\frac{5771}{5924}\right)}{5924}} = 0.00531$
c. $\hat{p} - E < p < \hat{p} + E$: $0.0258 - 0.00531 < p < 0.0258 + 0.00531$, or $0.0205 < p < 0.0311$
d. We have 99% confidence that the interval from 0.0205 to 0.0311 actually does contain the true value of the population proportion of all Eliquis users who experience nausea.
XLSTAT Output for Exercise 14: 99% confidence interval on the proportion (Wald): ( 0.021, 0.031 )
15. a. $\hat{p} = 717/5000 = 0.143$
b. $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.645\sqrt{\frac{\left(\frac{717}{5000}\right)\left(\frac{4283}{5000}\right)}{5000}} = 0.00815$
c. $\hat{p} - E < p < \hat{p} + E$: $0.143 - 0.00815 < p < 0.143 + 0.00815$, or $0.135 < p < 0.152$
d. We have 90% confidence that the interval from 0.135 to 0.152 actually does contain the true value of the population proportion of returned surveys.
XLSTAT Output for Exercise 15: 90% confidence interval on the proportion (Wald): ( 0.135, 0.152 )
16. a. $\hat{p} = 856/1228 = 0.697$
b. $E = z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96\sqrt{\frac{\left(\frac{856}{1228}\right)\left(\frac{372}{1228}\right)}{1228}} = 0.0257$
c. $\hat{p} - E < p < \hat{p} + E$: $0.697 - 0.0257 < p < 0.697 + 0.0257$, or $0.671 < p < 0.723$
d. We have 95% confidence that the interval from 0.671 to 0.723 actually does contain the true value of the population proportion of medical malpractice lawsuits that are dropped or dismissed.
XLSTAT Output for Exercise 16: 95% confidence interval on the proportion (Wald): ( 0.671, 0.723 )
17. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{426}{860} \pm 1.96\sqrt{\frac{\left(\frac{426}{860}\right)\left(\frac{434}{860}\right)}{860}} \Rightarrow 0.462 < p < 0.529$; Because 0.512 is contained within the confidence interval, there is not strong evidence against 0.512 as the value of the proportion of boys in all births.
XLSTAT Output for Exercise 17: 95% confidence interval on the proportion (Wald): ( 0.462, 0.529 )
18. a. 99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{428}{580} \pm 2.576\sqrt{\frac{\left(\frac{428}{580}\right)\left(\frac{152}{580}\right)}{580}} \Rightarrow 0.691 < p < 0.785$, or $69.1\% < p < 78.5\%$
b. No, the confidence interval includes 75%, so the true percentage could easily equal 75%.
XLSTAT Output for Exercise 18: 99% confidence interval on the proportion (Wald): ( 0.691, 0.785 )
19. a. 99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{33}{137} \pm 2.576\sqrt{\frac{\left(\frac{33}{137}\right)\left(\frac{104}{137}\right)}{137}} \Rightarrow 0.147 < p < 0.335$, or $14.7\% < p < 33.5\%$
b. Because the two confidence intervals overlap, it is very possible that both genders have the same success rates. Neither gender appears to be substantially more successful in their challenges.
XLSTAT Output for Exercise 19: 99% confidence interval on the proportion (Wald): ( 0.147, 0.335 )
20. a. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{52}{227} \pm 1.96\sqrt{\frac{\left(\frac{52}{227}\right)\left(\frac{175}{227}\right)}{227}} \Rightarrow 0.174 < p < 0.284$, or $17.4\% < p < 28.4\%$
b. Because the two confidence intervals overlap, it is possible that the OxyContin treatment group and the placebo group have the same rate of nausea. Nausea does not appear to be an adverse reaction made worse with OxyContin.
XLSTAT Output for Exercise 20: 95% confidence interval on the proportion (Wald): ( 0.174, 0.284 )
21. a. 0.5
b. $\hat{p} = 123/280 = 0.439$
c. 99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{123}{280} \pm 2.576\sqrt{\frac{\left(\frac{123}{280}\right)\left(\frac{157}{280}\right)}{280}} \Rightarrow 0.363 < p < 0.516$
d. If the touch therapists really had an ability to select the correct hand by sensing an energy field, their success rate would be significantly greater than 0.5, but the sample success rate of 0.439 and the confidence interval suggest that they do not have the ability to select the correct hand by sensing an energy field.
XLSTAT Output for Exercise 21: 99% confidence interval on the proportion (Wald): ( 0.363, 0.516 )
22. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{751}{5005} \pm 1.96\sqrt{\frac{\left(\frac{751}{5005}\right)\left(\frac{4254}{5005}\right)}{5005}} \Rightarrow 0.140 < p < 0.160$, or $14.0\% < p < 16.0\%$; Because 48% is not contained within the confidence interval, it appears that the percentage of U.S. adults who do not use the Internet is different from 48%. It appears that the percentage of those who do not use the Internet is now significantly less than 48%.
XLSTAT Output for Exercise 22: 95% confidence interval on the proportion (Wald): ( 0.140, 0.160 )
23. a. $0.459(514) = 236$
b. 99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.459 \pm 2.576\sqrt{\frac{(0.459)(0.541)}{514}} \Rightarrow 0.402 < p < 0.516$ (Using $x = 236$: $0.403 < p < 0.516$)
XLSTAT Output for Part (b): 99% confidence interval on the proportion (Wald): ( 0.402, 0.516 )
c. 80% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.459 \pm 1.282\sqrt{\frac{(0.459)(0.541)}{514}} \Rightarrow 0.431 < p < 0.487$
XLSTAT Output for Part (c): 80% confidence interval on the proportion (Wald): ( 0.431, 0.487 )
d. The 99% confidence interval is wider than the 80% confidence interval. A confidence interval must be wider in order to be more confident that it captures the true value of the population proportion.
24. a. $0.90(514) = 463$
b. 99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.90 \pm 2.576\sqrt{\frac{(0.90)(0.10)}{514}} \Rightarrow 0.866 < p < 0.934$ (Using $x = 463$: $0.867 < p < 0.935$)
XLSTAT Output for Part (b): 99% confidence interval on the proportion (Wald): ( 0.866, 0.934 )
c. 80% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.90 \pm 1.282\sqrt{\frac{(0.90)(0.10)}{514}} \Rightarrow 0.883 < p < 0.917$ (Using $x = 463$: $0.884 < p < 0.918$)
XLSTAT Output for Part (c): 80% confidence interval on the proportion (Wald): ( 0.883, 0.917 )
d. The 99% confidence interval is wider than the 80% confidence interval. A confidence interval must be wider in order to be more confident that it captures the true value of the population proportion.
25. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.042 \pm 1.96\sqrt{\frac{(0.042)(0.958)}{10{,}000{,}000}} \Rightarrow 0.0419 < p < 0.0421$
99% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.042 \pm 2.576\sqrt{\frac{(0.042)(0.958)}{10{,}000{,}000}} \Rightarrow 0.0418 < p < 0.0422$
XLSTAT Output for 95% CI: 95% confidence interval on the proportion (Wald): ( 0.0419, 0.0421 )
XLSTAT Output for 99% CI: 99% confidence interval on the proportion (Wald): ( 0.0418, 0.0422 )
For both confidence levels, the upper and lower confidence interval limits are very close to each other, suggesting that the estimates are very accurate. Also, there is not much difference between the 95% CI and the 99% CI. The extremely large sample size is giving us confidence interval estimates with a very narrow range, so we are getting very precise estimates.
26. XSORT: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{879}{945} \pm 1.96\sqrt{\frac{\left(\frac{879}{945}\right)\left(\frac{66}{945}\right)}{945}} \Rightarrow 0.914 < p < 0.946$, or $91.4\% < p < 94.6\%$
YSORT: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{239}{291} \pm 1.96\sqrt{\frac{\left(\frac{239}{291}\right)\left(\frac{52}{291}\right)}{291}} \Rightarrow 0.777 < p < 0.865$, or $77.7\% < p < 86.5\%$
XLSTAT Output for XSORT: 95% confidence interval on the proportion (Wald): ( 0.914, 0.946 )
XLSTAT Output for YSORT: 95% confidence interval on the proportion (Wald): ( 0.777, 0.865 )
The two confidence intervals do not overlap. It appears that the success rate for the XSORT method is higher than the success rate for the YSORT method, but the big story here is that the XSORT method and the YSORT method both appear to be very effective because the success rates are well above the 50% rates expected with no treatments.
27. Sustained care: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.828 \pm 1.96\sqrt{\frac{(0.828)(0.172)}{198}} \Rightarrow 0.775 < p < 0.881$, or $77.5\% < p < 88.1\%$ (Using $x = 164$: $77.6\% < p < 88.1\%$)
Standard care: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.628 \pm 1.96\sqrt{\frac{(0.628)(0.372)}{199}} \Rightarrow 0.561 < p < 0.695$, or $56.1\% < p < 69.5\%$
XLSTAT Output for Sustained Care: 95% confidence interval on the proportion (Wald): ( 0.775, 0.881 )
XLSTAT Output for Standard Care: 95% confidence interval on the proportion (Wald): ( 0.561, 0.695 )
The two confidence intervals do not overlap. It appears that the success rate is higher with sustained care.
28. Biochemically confirmed results: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.258 \pm 1.96\sqrt{\frac{(0.258)(0.742)}{198}} \Rightarrow 0.197 < p < 0.319$, or $19.7\% < p < 31.9\%$ (Using $x = 51$: $19.7\% < p < 31.8\%$)
Reported results: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.409 \pm 1.96\sqrt{\frac{(0.409)(0.591)}{198}} \Rightarrow 0.341 < p < 0.477$, or $34.1\% < p < 47.7\%$ (Using $x = 81$: $34.1\% < p < 47.8\%$)
XLSTAT Output for Biochemical Results: 95% confidence interval on the proportion (Wald): ( 0.197, 0.319 )
XLSTAT Output for Reported Results: 95% confidence interval on the proportion (Wald): ( 0.341, 0.477 )
The two confidence intervals do not overlap. It appears that the success rate is higher for the reported results than for the biochemically confirmed results.
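The Wald intervals reported in the XLSTAT outputs above follow one formula, $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n}$, which can be sketched in Python. The function name `wald_interval` is illustrative; the critical value is computed from the standard normal distribution rather than read from a table, so small differences from table-based answers are possible in the last digit.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(x, n, confidence):
    """Wald confidence interval for a proportion from x successes in n trials."""
    p_hat = x / n
    q_hat = 1 - p_hat
    # Critical value z_{alpha/2}: area of (1 - confidence) / 2 in the right tail
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


# Exercise 13: 88 successful challenges among 240, 95% confidence
low_13, high_13 = wald_interval(88, 240, 0.95)

# Exercise 26 (XSORT): 879 successes among 945, 95% confidence
low_26, high_26 = wald_interval(879, 945, 0.95)
```

Rounding the limits to three decimal places gives the intervals listed for Exercises 13 and 26.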
29. $\hat{p} = 19/35 = 0.543$; 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{19}{35} \pm 1.96\sqrt{\frac{\left(\frac{19}{35}\right)\left(\frac{16}{35}\right)}{35}} \Rightarrow 0.378 < p < 0.708$, or $37.8\% < p < 70.8\%$; Greater height does not appear to be an advantage for presidential candidates. If greater height is an advantage, then taller candidates should win substantially more than 50% of the elections, but the confidence interval shows that the percentage of elections won by taller candidates is likely to be anywhere between 37.8% and 70.8%.
XLSTAT Output for Exercise 29: 95% confidence interval on the proportion (Wald): ( 0.378, 0.708 )
30. $\hat{p} = 53/345 = 0.154$; 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{53}{345} \pm 1.96\sqrt{\frac{\left(\frac{53}{345}\right)\left(\frac{292}{345}\right)}{345}} \Rightarrow 0.116 < p < 0.192$, or $11.6\% < p < 19.2\%$; Because the confidence interval contains the claimed value of 16%, the sample data do not provide strong evidence against the claim that 16% of M&Ms are green.
XLSTAT Output for Exercise 30: 95% confidence interval on the proportion (Wald): ( 0.116, 0.192 )
31. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.5)(0.5)}{0.03^2} = 1842$ (Excel: 1844)
fx | =NORM.S.INV(1-0.01/2)^2*0.50*0.50/0.03^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.22)(0.78)}{0.03^2} = 1265$ (Excel: 1266)
fx | =NORM.S.INV(1-0.01/2)^2*0.22*(1-0.22)/0.03^2
32. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.645^2 (0.5)(0.5)}{0.02^2} = 1692$ (Excel: 1691)
fx | =NORM.S.INV(1-0.10/2)^2*0.50*0.50/0.02^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.645^2 (0.1)(0.9)}{0.02^2} = 609$
fx | =NORM.S.INV(1-0.10/2)^2*0.10*(1-0.10)/0.02^2
c. Yes, there is a substantial decrease in the sample size.
33. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.96^2 (0.5)(0.5)}{0.015^2} = 4269$
fx | =NORM.S.INV(1-0.05/2)^2*0.50*0.50/0.015^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.96^2 (0.037)(0.963)}{0.015^2} = 609$
fx | =NORM.S.INV(1-0.05/2)^2*0.037*(1-0.037)/0.015^2
c. Yes, there is a substantial decrease in the sample size.
34. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.5)(0.5)}{0.04^2} = 1037$
fx | =NORM.S.INV(1-0.01/2)^2*0.50*0.50/0.04^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.26)(0.74)}{0.04^2} = 798$
fx | =NORM.S.INV(1-0.01/2)^2*0.26*(1-0.26)/0.04^2
35. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.96^2 (0.5)(0.5)}{0.025^2} = 1537$
fx | =NORM.S.INV(1-0.05/2)^2*0.50*0.50/0.025^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.96^2 (0.38)(0.62)}{0.025^2} = 1449$
fx | =NORM.S.INV(1-0.05/2)^2*0.38*(1-0.38)/0.025^2
36. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.5)(0.5)}{0.02^2} = 4145$ (Excel: 4147)
fx | =NORM.S.INV(1-0.01/2)^2*0.50*0.50/0.02^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.18)(0.82)}{0.02^2} = 2447$ (Excel: 2449)
fx | =NORM.S.INV(1-0.01/2)^2*0.18*(1-0.18)/0.02^2
37. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.645^2 (0.5)(0.5)}{0.03^2} = 752$
fx | =NORM.S.INV(1-0.10/2)^2*0.50*0.50/0.03^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{1.645^2 (0.11)(0.89)}{0.03^2} = 295$
fx | =NORM.S.INV(1-0.10/2)^2*0.11*(1-0.11)/0.03^2
c. No. A sample of the people you know is a convenience sample, not a simple random sample, so it is very possible that the results would not be representative of the population.
38. a. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.5)(0.5)}{0.02^2} = 4145$ (Excel: 4147)
fx | =NORM.S.INV(1-0.01/2)^2*0.50*0.50/0.02^2
b. $n = \frac{z_{\alpha/2}^2\,\hat{p}\hat{q}}{E^2} = \frac{2.575^2 (0.82)(0.18)}{0.02^2} = 2447$ (Excel: 2449)
fx | =NORM.S.INV(1-0.01/2)^2*0.82*(1-0.82)/0.02^2
c. Randomly selecting adult women would result in an underestimate, because some women will give birth to their first child after the survey was conducted. It will be important to survey women who have completed the time during which they can give birth.
39. $n = \frac{N\hat{p}\hat{q}\,z_{\alpha/2}^2}{\hat{p}\hat{q}\,z_{\alpha/2}^2 + (N - 1)E^2} = \frac{2500(0.82)(0.18)(2.575)^2}{(0.82)(0.18)(2.575)^2 + (2500 - 1)(0.02)^2} = 1237$ (Excel: 1238)
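The sample-size calculations in Exercises 31 through 38 share the formula $n = z_{\alpha/2}^2\,\hat{p}\hat{q}/E^2$, rounded up to the next whole number. The following is a minimal sketch; the function name `sample_size_for_proportion` is illustrative, and because it uses the unrounded critical value (as the Excel formulas do), it reproduces the "Excel" answers rather than the answers based on table values such as 2.575.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(margin, confidence, p_hat=0.5):
    """Smallest n giving the requested margin of error for a proportion.

    p_hat defaults to 0.5, the value used when no prior estimate is known.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # unrounded z_{alpha/2}
    return ceil(z**2 * p_hat * (1 - p_hat) / margin**2)


n_31a = sample_size_for_proportion(0.03, 0.99)          # no prior estimate
n_32b = sample_size_for_proportion(0.02, 0.90, 0.10)    # prior estimate of 0.10
n_36b = sample_size_for_proportion(0.02, 0.99, 0.18)    # prior estimate of 0.18
```

Supplying a prior estimate away from 0.5 always lowers the required sample size, which is the point of part (c) in Exercises 32 and 33.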
40. Because we have 95% confidence that p is greater than 50.3%, we can safely conclude that more than 50% of undergraduates take online courses. $\hat{p} - z_{\alpha}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.53 - 1.645\sqrt{\frac{(0.53)(0.47)}{950}} \Rightarrow p > 0.503$ ($p > 50.4\%$ if using $x = 504$ so that $\hat{p} = 504/950$.)
41. a. The requirement of at least 5 successes and at least 5 failures is not satisfied, so the normal distribution cannot be used.
b. $3/45 = 0.067$, or 6.67%
Section 7-2: Estimating a Population Mean
1. a. $85.74\ \text{min} < \mu < 91.76\ \text{min}$
b. The best point estimate of $\mu$ is $\bar{x} = \frac{85.74 + 91.76}{2} = 88.75$ min. The margin of error is $E = \frac{91.76 - 85.74}{2} = 3.01$ min.
2. a. $df = 36 - 1 = 35$
b. 2.030
fx | =T.INV.2T(0.05,36-1)
c. In general, the number of degrees of freedom for a collection of sample data is the number of sample values that can vary after certain restrictions have been imposed on all data values.
3. We have 95% confidence that the limits of 85.74 minutes and 91.76 minutes contain the true value of the mean of the population of all times between eruptions.
4. a. The sample is a simple random sample, and either or both of these conditions are satisfied: the population is normally distributed or n > 30.
b. When we say that the confidence interval methods of this section are robust against departures from normality, we mean that these methods work reasonably well with distributions that are not normal, provided that departures from normality are not too extreme.
c. Yes, the sample is a simple random sample and the sample size is 36, which is greater than 30.
5. a. $t_{\alpha/2} = 2.030$
fx | =T.INV.2T(0.05,36-1)
b. $E = t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.030 \cdot \dfrac{695.5}{\sqrt{36}} = 235.3$ g
c. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 3150.0 \pm 2.030 \cdot \dfrac{695.5}{\sqrt{36}}$, or $2914.7\ \text{g} < \mu < 3385.3\ \text{g}$
d. We have 95% confidence that the limits of 2914.7 g and 3385.3 g contain the true value of the mean birth weight of all girls.
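The arithmetic in Exercise 5 can be reproduced from the summary statistics alone. The sketch below is illustrative rather than part of the original solutions: `t_interval` is a hypothetical helper, and the critical value 2.030 is passed in as given above (from `T.INV.2T`), because the Python standard library has no inverse t distribution.

```python
import math


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper, margin) for a t-based interval for the mean.

    t_crit is supplied by the caller, e.g. from Excel's T.INV.2T or a t table.
    """
    margin = t_crit * s / math.sqrt(n)  # E = t * s / sqrt(n)
    return x_bar - margin, x_bar + margin, margin


# Exercise 5 values: x-bar = 3150.0 g, s = 695.5 g, n = 36, t = 2.030.
lower, upper, E = t_interval(x_bar=3150.0, s=695.5, n=36, t_crit=2.030)
print(f"E = {E:.1f} g; {lower:.1f} g < mu < {upper:.1f} g")
```

The same helper applies to the other t intervals in this section once the matching critical value is substituted.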
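6. a. $t_{\alpha/2} = 2.744$
fx | =T.INV.2T(0.01,32-1)
b. $E = t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.744 \cdot \dfrac{0.1077}{\sqrt{32}} = 0.0522$ g
c. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 4.5210 \pm 2.744 \cdot \dfrac{0.1077}{\sqrt{32}}$, or $4.4688\ \text{g} < \mu < 4.5732\ \text{g}$
d. We have 99% confidence that the limits of 4.4688 g and 4.5732 g contain the true value of the mean weight of all Hershey Kisses.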
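7. a. $t_{\alpha/2} = 2.724$
fx | =T.INV.2T(0.01,36-1)
b. $E = t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.724 \cdot \dfrac{0.00570}{\sqrt{36}} = 0.00259$ lb
c. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 0.82410 \pm 2.724 \cdot \dfrac{0.00570}{\sqrt{36}}$, or $0.82151\ \text{lb} < \mu < 0.82669\ \text{lb}$
d. We have 99% confidence that the limits of 0.82151 lb and 0.82669 lb contain the true value of the mean weight of Pepsi in cans.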
8. a. $t_{\alpha/2} = 2.026$
fx | =T.INV.2T(0.05,38-1)
b. $E = t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.026 \cdot \dfrac{15.66}{\sqrt{38}} = 5.15$ Mbps
c. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 18.86 \pm 2.026 \cdot \dfrac{15.66}{\sqrt{38}}$, or $13.71\ \text{Mbps} < \mu < 24.01\ \text{Mbps}$
d. We have 95% confidence that the limits of 13.71 Mbps and 24.01 Mbps contain the true value of the mean of all Verizon airport data speeds.
9. The sample size is greater than 30 and the data appear to be from a population that is normally distributed.
95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 98.20 \pm 1.983 \cdot \dfrac{0.62}{\sqrt{106}}$, or $98.08^\circ\text{F} < \mu < 98.32^\circ\text{F}$; Because the confidence interval does not contain 98.6°F, it appears that the mean body temperature is not 98.6°F, as is commonly believed.
Excel Command for Exercise 9
fx | =T.INV.2T(0.05,106-1)
10. The sample size is greater than 30.
90% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.1 \pm 1.685 \cdot \dfrac{4.8}{\sqrt{40}}$, or $0.8\ \text{lb} < \mu < 3.4\ \text{lb}$; Because the confidence interval does not include 0 or negative values, it does appear that the weight loss program is effective, with a positive loss of weight. Because the amount of weight lost is relatively small, the weight loss program does not appear to be very practical.
Excel Command for Exercise 10
fx | =T.INV.2T(0.10,40-1)
11. It is assumed that the 16 sample values appear to be from a normally distributed population.
98% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 98.9 \pm 2.602 \cdot \dfrac{42.3}{\sqrt{16}}$, or $71.4\ \text{min} < \mu < 126.4\ \text{min}$; The confidence interval includes the mean of 102.8 min that was measured before the treatment, so the mean could be the same after the treatment. This result suggests that the zopiclone treatment does not have a significant effect.
Excel Command for Exercise 11
fx | =T.INV.2T(0.02,16-1)
12. The sample size is greater than 30.
98% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 0.4 \pm 2.407 \cdot \dfrac{21.0}{\sqrt{49}}$, or $-6.8\ \text{mg/dL} < \mu < 7.6\ \text{mg/dL}$; Because the confidence interval includes the value of 0, it is very possible that the mean of the changes in LDL cholesterol is equal to 0, suggesting that the garlic treatment did not affect LDL cholesterol levels. It does not appear that garlic has a significant effect in reducing LDL cholesterol.
Excel Command for Exercise 12
fx | =T.INV.2T(0.02,49-1)
13. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 133.8 \pm 3.499 \cdot \dfrac{5.994}{\sqrt{8}}$, or $126.3\ \text{mm} < \mu < 141.2\ \text{mm}$
XLSTAT Output for Exercise 13
99% confidence interval on the mean: ( 126.334, 141.166 )
14. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 5.64196 \pm 2.262 \cdot \dfrac{0.07407}{\sqrt{10}}$, or $5.58898\ \text{g} < \mu < 5.69494\ \text{g}$; The confidence interval contains the specified weight of 5.670 g, so production appears to be OK provided that the variation is not too large.
XLSTAT Output for Exercise 14
95% confidence interval on the mean: ( 5.58898, 5.69494 )
15. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 22.5 \pm 2.977 \cdot \dfrac{17.3}{\sqrt{15}}$, or $9.2\ \text{minutes} < \mu < 35.8\ \text{minutes}$; Five of the listed values are 5, so the data do not appear to be from a normally distributed population. Also, the sample size is not greater than 30, so the requirement of a "normal distribution or n > 30" is not satisfied. It is very possible that the confidence interval is not a good estimate of the population mean.
XLSTAT Output for Exercise 15
99% confidence interval on the mean: ( 9.224, 35.843 )
16. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 27.3 \pm 2.744 \cdot \dfrac{18.7}{\sqrt{32}}$, or $18.3\ \text{minutes} < \mu < 36.3\ \text{minutes}$; Unlike the preceding exercise, this sample does satisfy the confidence interval requirement because n > 30, so the confidence interval is likely to be a good estimate of the population mean.
XLSTAT Output for Exercise 16
99% confidence interval on the mean: ( 18.308, 36.317 )
17. The sample appears to have a normal distribution.
95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.6 \pm 2.262 \cdot \dfrac{1.07}{\sqrt{10}}$, or $1.8 < \mu < 3.4$; The given numbers are just substitutes for the four DNA base names, so the numbers don't measure or count anything, and they are at the nominal level of measurement. The confidence interval has no practical use.
XLSTAT Output for Exercise 17
95% confidence interval on the mean: ( 1.831, 3.369 )
XLSTAT Output for Exercise 18
90% confidence interval on the mean: ( 5.002, 7.698 )
18. The sample appears to have a normal distribution.
90% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 6.35 \pm 1.833 \cdot \dfrac{2.325}{\sqrt{10}}$, or $5.00\ \text{g} < \mu < 7.70\ \text{g}$; No, the samples obtained from California might be very different from those obtained in Arkansas.
19. The sample appears to have a normal distribution.
98% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 0.719 \pm 3.143 \cdot \dfrac{0.366}{\sqrt{7}}$, or $0.284\ \text{ppm} < \mu < 1.153\ \text{ppm}$; Using the FDA guideline, the confidence interval suggests that there could be too much mercury in fish because it is possible that the mean is greater than 1 ppm. Also, one of the sample values exceeds the FDA guideline of 1 ppm, so at least some of the fish have too much mercury.
XLSTAT Output for Exercise 19
98% confidence interval on the mean: ( 0.284, 1.153 )
20. The presence of five zeros suggests that the sample is not from a normally distributed population, so the normality requirement is violated and the confidence interval might not be a good estimate of the population mean.
99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 32.6 \pm 2.861 \cdot \dfrac{20.33}{\sqrt{20}}$, or $19.5\ \text{mg} < \mu < 45.6\ \text{mg}$; People consume some brands much more often than others, but the 20 brands are all weighted equally in the calculations. This, along with the violation of the normality requirement, means the confidence interval might not be a good estimate of the population mean.
XLSTAT Output for Exercise 20
99% confidence interval on the mean: ( 19.543, 45.557 )
21. The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{1.96 \cdot 15}{3}\right]^2 = 97$. It does appear to be very practical.
fx | =(NORM.S.INV(1-0.05/2)*15/3)^2
22. The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.33 \cdot 15}{2}\right]^2 = 305$. It does appear to be very practical.
fx | =(NORM.S.INV(1-0.02/2)*15/2)^2
23. The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.33 \cdot 19.6}{1.5}\right]^2 = 927$ (Excel: 925). The sample should not be obtained from one movie at one theater, because that sample could be easily biased. Instead, a simple random sample of the broader population should be obtained.
fx | =(NORM.S.INV(1-0.02/2)*19.6/1.5)^2
24. $\sigma \approx \dfrac{\text{range}}{4} = \dfrac{(12 - 1.5)(60)}{4} = 157.5$; The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 157.5}{15}\right]^2 = 732$.
fx | =(NORM.S.INV(1-0.01/2)*157.5/15)^2
25. a. $\sigma \approx \dfrac{\text{range}}{4} = \dfrac{104 - 40}{4} = 16.0$; The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 16.0}{2}\right]^2 = 425$.
fx | =(NORM.S.INV(1-0.01/2)*16.0/2)^2
b. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 11.3}{2}\right]^2 = 212$
fx | =(NORM.S.INV(1-0.01/2)*11.3/2)^2
c. The result from part (a) is substantially larger than the result from part (b). The result from part (b) is likely to be better because it uses s instead of the estimated $\sigma$ obtained from the range rule of thumb.
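The sample-size calculations in Exercises 21 through 25 all follow the same pattern, so they can be sketched as one small function. This is an illustrative sketch, not the manual's method: `sample_size_for_mean` and `range_rule_sigma` are hypothetical names, and the optional `z` argument lets a rounded table value be used in place of the technology value from `NormalDist().inv_cdf`.

```python
import math
from statistics import NormalDist


def range_rule_sigma(high, low):
    """Range rule of thumb: sigma is roughly range / 4."""
    return (high - low) / 4


def sample_size_for_mean(sigma, E, confidence, z=None):
    """n = (z * sigma / E)^2, rounded up to the next whole number."""
    if z is None:
        # Technology critical value, analogous to NORM.S.INV(1 - alpha/2).
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / E) ** 2)


# Exercise 23: sigma = 19.6, E = 1.5, 98% confidence.
n_excel = sample_size_for_mean(19.6, 1.5, 0.98)
n_table = sample_size_for_mean(19.6, 1.5, 0.98, z=2.33)

# Exercise 25(a): sigma estimated from the range 104 - 40, E = 2, table z = 2.575.
n_25a = sample_size_for_mean(range_rule_sigma(104, 40), 2, 0.99, z=2.575)
print(n_excel, n_table, n_25a)
```

The gap between the table and Excel answers comes entirely from rounding the critical value before squaring.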
26. a. $\sigma \approx \dfrac{\text{range}}{4} = \dfrac{104 - 36}{4} = 17.0$; The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 17.0}{2}\right]^2 = 480$.
fx | =(NORM.S.INV(1-0.01/2)*17.0/2)^2
b. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 12.5}{2}\right]^2 = 260$
fx | =(NORM.S.INV(1-0.01/2)*12.5/2)^2
c. The result from part (a) is substantially larger than the result from part (b). The result from part (b) is likely to be better because it uses s instead of the estimated $\sigma$ obtained from the range rule of thumb.
27. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{1.96 \cdot 1}{0.01}\right]^2 = 38{,}416$ (Excel: 38,415); The sample appears to be too large to be practical for a telephone survey.
fx | =(NORM.S.INV(1-0.05/2)*1/0.01)^2
28. a. $\sigma \approx \dfrac{\text{range}}{4} = \dfrac{99.6 - 96.5}{4} = 0.775$; The sample size is $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.33 \cdot 0.775}{0.1}\right]^2 = 327$ (Excel: 326)
fx | =(NORM.S.INV(1-0.02/2)*0.775/0.1)^2
b. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.33 \cdot 0.62}{0.1}\right]^2 = 209$
fx | =(NORM.S.INV(1-0.02/2)*0.62/0.1)^2
c. The result from part (a) is substantially larger than the result from part (b). The result from part (b) is likely to be better because it uses s instead of the estimated $\sigma$ obtained from the range rule of thumb.
29. Both samples appear to have a normal distribution.
Males: 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 69.58 \pm 1.976(0.916)$ with $n = 153$, or $67.8\ \text{bpm} < \mu < 71.4\ \text{bpm}$
Females: 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 74.0 \pm 1.976(1.03)$ with $n = 147$, or $72.0\ \text{bpm} < \mu < 76.1\ \text{bpm}$
Although final conclusions about means of populations should not be based on the overlapping of confidence intervals, the intervals do not overlap, so adult females appear to have a mean pulse rate that is higher than the mean pulse rate of adult males.
XLSTAT Output for Males
95% confidence interval on the mean: ( 67.772, 71.392 )
XLSTAT Output for Females
95% confidence interval on the mean: ( 71.996, 76.085 )
30. SMOKER: 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 266.3922 \pm 1.963 \cdot \dfrac{140.5448}{\sqrt{902}}$, or $257.208\ \text{ng/mL} < \mu < 275.577\ \text{ng/mL}$
XLSTAT Output for SMOKER
95% confidence interval on the mean: ( 257.208, 275.576 )
ETS: 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 28.5707 \pm 1.965 \cdot \dfrac{106.1758}{\sqrt{433}}$, or $18.5419\ \text{ng/mL} < \mu < 38.5995\ \text{ng/mL}$
XLSTAT Output for ETS
95% confidence interval on the mean: ( 18.5419, 38.5995 )
30. (continued)
NO ETS: 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 2.8396 \pm 1.967 \cdot \dfrac{28.9488}{\sqrt{358}}$, or $-0.1692\ \text{ng/mL} < \mu < 5.8486\ \text{ng/mL}$
XLSTAT Output for NO ETS
95% confidence interval on the mean: ( -0.1692, 5.8486 )
None of the confidence intervals overlap, so the three groups appear to have different means. Increased exposure to smoke appears to result in higher levels of cotinine.
31. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 42.6 \pm 1.96 \cdot \dfrac{26.2}{\sqrt{1000}}$, or $41.0\ \text{minutes} < \mu < 44.3\ \text{minutes}$
(TI data: $40.0\ \text{minutes} < \mu < 44.9\ \text{minutes}$)
XLSTAT Output for Exercise 31
95% confidence interval on the mean: ( 41.015, 44.263 )
32. 99% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 31.0 \pm 2.581 \cdot \dfrac{21.5}{\sqrt{1000}}$, or $29.3\ \text{minutes} < \mu < 32.8\ \text{minutes}$
(TI data: $29.6\ \text{minutes} < \mu < 35.0\ \text{minutes}$)
Unlike Exercise 15, this sample does satisfy the confidence interval requirement because n > 30, so the confidence interval is likely to be a good estimate of the population mean.
XLSTAT Output for Exercise 32
99% confidence interval on the mean: ( 29.290, 32.798 )
33. $\bar{x} = \dfrac{20.5(13) + 30.5(61) + 40.5(66) + 50.5(36) + 60.5(14) + 70.5(5)}{13 + 61 + 66 + 36 + 14 + 5} = 40.1$ years
$s = \sqrt{\dfrac{195\left[13 \cdot 20.5^2 + \cdots + 5 \cdot 70.5^2\right] - \left[13 \cdot 20.5 + \cdots + 5 \cdot 70.5\right]^2}{195(195 - 1)}} = 11.3$ years
95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 40.1 \pm 1.972 \cdot \dfrac{11.3}{\sqrt{195}}$, or $38.5\ \text{years} < \mu < 41.7\ \text{years}$
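The grouped-data mean and standard deviation in Exercise 33 can be verified directly from the class midpoints and frequencies shown in the formula. The sketch below assumes those six midpoint and frequency pairs are the complete frequency distribution; the variable names are illustrative.

```python
import math

# Class midpoints and frequencies as they appear in the Exercise 33 formula.
midpoints = [20.5, 30.5, 40.5, 50.5, 60.5, 70.5]
freqs = [13, 61, 66, 36, 14, 5]

n = sum(freqs)                                             # total frequency
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))      # sum of f*x
sum_fx2 = sum(f * x**2 for f, x in zip(freqs, midpoints))  # sum of f*x^2

mean = sum_fx / n
# Shortcut formula for the standard deviation of a frequency distribution.
s = math.sqrt((n * sum_fx2 - sum_fx**2) / (n * (n - 1)))

print(f"n = {n}, mean = {mean:.1f} years, s = {s:.1f} years")
```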
34. a. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 0.8818 \pm 2.015 \cdot \dfrac{0.03436}{\sqrt{45}}$, or $0.8715\ \text{g} < \mu < 0.8921\ \text{g}$
b. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}}\sqrt{\dfrac{N - n}{N - 1}} = 0.8818 \pm 2.015 \cdot \dfrac{0.03436}{\sqrt{45}}\sqrt{\dfrac{345 - 45}{345 - 1}}$, or $0.8722\ \text{g} < \mu < 0.8914\ \text{g}$; The mean of all 345 M&Ms is 0.8839 g, and the confidence interval captures that population mean.
c. The confidence interval from part (b) is a bit narrower than the one from part (a), indicating that we have a more accurate estimate when the relatively large sample is selected without replacement from a relatively small finite population.
Section 7-3: Estimating a Population Standard Deviation or Variance
1. a. Unlike confidence interval estimates of $p$ or $\mu$, confidence interval estimates of $\sigma$ or $\sigma^2$ are not created using a distribution that is symmetric, so there is no "$\pm E$" as in confidence interval estimates of $p$ or $\mu$.
b. The format of 10.9 bpm ± 3.3 bpm implies that s = 10.9 bpm, which is not the case. Because the chi-square distribution is not symmetric, confidence interval estimates of $\sigma$ or $\sigma^2$ cannot be expressed in the format of $s \pm E$.
2. a. The sample must be a simple random sample and the sample must be from a normally distributed population.
b. For confidence interval estimates of $\sigma$ or $\sigma^2$, the requirement of a normal distribution is much stricter, and there is no sample size (such as n > 30) suitable for waiving the normality requirement.
c. No. The selected numbers are equally likely and have a uniform distribution, not a normal distribution as required.
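3. $(0.360\ \text{million cells}/\mu\text{L})^2 < \sigma^2 < (0.454\ \text{million cells}/\mu\text{L})^2$
$0.130\ (\text{million cells}/\mu\text{L})^2 < \sigma^2 < 0.206\ (\text{million cells}/\mu\text{L})^2$
4. df = 20 − 1 = 19; $\chi^2_L = 8.907$ and $\chi^2_R = 32.852$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,20-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,20-1)
5. df = 25 − 1 = 24; $\chi^2_L = 12.401$, $\chi^2_R = 39.364$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(25-1)0.24^2}{39.364}} < \sigma < \sqrt{\dfrac{(25-1)0.24^2}{12.401}}$
95% CI: $0.19\ \text{mg} < \sigma < 0.33\ \text{mg}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,25-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,25-1)
6. df = 20 − 1 = 19; $\chi^2_L = 8.907$, $\chi^2_R = 32.852$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(20-1)0.04111^2}{32.852}} < \sigma < \sqrt{\dfrac{(20-1)0.04111^2}{8.907}}$
95% CI: $0.03126\ \text{g} < \sigma < 0.06004\ \text{g}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,20-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,20-1)
7. df = 147 − 1 = 146; $\chi^2_L = 105.741$, $\chi^2_R = 193.761$ (Table: $\chi^2_L = 67.328$, $\chi^2_R = 140.169$)
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(147-1)1.96^2}{193.761}} < \sigma < \sqrt{\dfrac{(147-1)1.96^2}{105.741}}$
99% CI: $1.70 < \sigma < 2.30$ (in 1000 cells/μL)
Table: $2.00 < \sigma < 2.89$ (in 1000 cells/μL)
$\chi^2_L$: fx | =CHISQ.INV(0.01/2,147-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.01/2,147-1)
8. df = 153 − 1 = 152; $\chi^2_L = 110.846$, $\chi^2_R = 200.657$ (Table: $\chi^2_L = 67.328$, $\chi^2_R = 140.169$)
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(153-1)7.10^2}{200.657}} < \sigma < \sqrt{\dfrac{(153-1)7.10^2}{110.846}}$
99% CI: $6.18\ \text{cm} < \sigma < 8.31\ \text{cm}$
Table: $7.39\ \text{cm} < \sigma < 10.67\ \text{cm}$
$\chi^2_L$: fx | =CHISQ.INV(0.01/2,153-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.01/2,153-1)
9. df = 106 − 1 = 105; $\chi^2_L = 78.536$, $\chi^2_R = 135.247$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(106-1)0.62^2}{135.247}} < \sigma < \sqrt{\dfrac{(106-1)0.62^2}{78.536}}$
95% CI: $0.55^\circ\text{F} < \sigma < 0.72^\circ\text{F}$
Table: $0.56^\circ\text{F} < \sigma < 0.74^\circ\text{F}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,106-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,106-1)
10. df = 40 − 1 = 39; $\chi^2_L = 25.695$, $\chi^2_R = 54.572$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(40-1)4.8^2}{54.572}} < \sigma < \sqrt{\dfrac{(40-1)4.8^2}{25.695}}$
90% CI: $4.1\ \text{lb} < \sigma < 5.8\ \text{lb}$
Table: $4.0\ \text{lb} < \sigma < 5.9\ \text{lb}$
$\chi^2_L$: fx | =CHISQ.INV(0.10/2,40-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.10/2,40-1)
The confidence interval gives us information about the variation among the amounts of lost weight, but it does not give us information about the effectiveness of the diet.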
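Each interval for $\sigma$ in this section comes from the same two square roots, so the arithmetic can be sketched once. This is an illustrative helper, not part of the original solutions: `sigma_interval` is a hypothetical name, and the chi-square critical values are passed in as given in the exercises (from `CHISQ.INV` and `CHISQ.INV.RT`), because the Python standard library has no inverse chi-square function.

```python
import math


def sigma_interval(n, s, chi2_left, chi2_right):
    """Confidence interval for sigma from supplied chi-square critical values.

    The larger (right-tail) critical value produces the LOWER limit.
    """
    ss = (n - 1) * s**2
    return math.sqrt(ss / chi2_right), math.sqrt(ss / chi2_left)


# Exercise 5: n = 25, s = 0.24 mg, critical values 12.401 and 39.364.
low5, high5 = sigma_interval(25, 0.24, chi2_left=12.401, chi2_right=39.364)

# Exercise 9: n = 106, s = 0.62 degrees F, critical values 78.536 and 135.247.
low9, high9 = sigma_interval(106, 0.62, chi2_left=78.536, chi2_right=135.247)

print(f"{low5:.2f} mg < sigma < {high5:.2f} mg")
print(f"{low9:.2f} F < sigma < {high9:.2f} F")
```

Because the chi-square distribution is not symmetric, the two limits are not equidistant from s, which is why no "s ± E" form appears.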
11. df = 16 − 1 = 15; $\chi^2_L = 5.229$, $\chi^2_R = 30.578$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(16-1)42.3^2}{30.578}} < \sigma < \sqrt{\dfrac{(16-1)42.3^2}{5.229}}$
98% CI: $29.6\ \text{min} < \sigma < 71.6\ \text{min}$
$\chi^2_L$: fx | =CHISQ.INV(0.02/2,16-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.02/2,16-1)
No, the confidence interval does not indicate whether the treatment is effective.
12. df = 49 − 1 = 48; $\chi^2_L = 28.177$, $\chi^2_R = 73.682$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(49-1)21.0^2}{73.682}} < \sigma < \sqrt{\dfrac{(49-1)21.0^2}{28.177}}$
98% CI: Excel: $16.9\ \text{mg/dL} < \sigma < 27.4\ \text{mg/dL}$
Table: $16.7\ \text{mg/dL} < \sigma < 26.7\ \text{mg/dL}$
$\chi^2_L$: fx | =CHISQ.INV(0.02/2,49-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.02/2,49-1)
No, the confidence interval does not indicate whether the treatment is effective.
13. Because the confidence interval includes the value of 2.9 in., it is very possible that the standard deviation of heights of women who are professional soccer players is the same as the standard deviation of heights of women in the general population. There is not sufficient evidence to support the claim that women who are professional soccer players have heights that vary less than women in the general population.
df = 23 − 1 = 22; $\chi^2_L = 10.982$, $\chi^2_R = 36.781$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(23-1)2.4113^2}{36.781}} < \sigma < \sqrt{\dfrac{(23-1)2.4113^2}{10.982}}$
95% CI: $1.9\ \text{in.} < \sigma < 3.4\ \text{in.}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,23-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,23-1)
14. The specified value of 0.0230 g is contained within the confidence interval, so it appears that the Mint specification is being met.
df = 9 − 1 = 8; $\chi^2_L = 2.180$, $\chi^2_R = 17.535$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(9-1)0.01588^2}{17.535}} < \sigma < \sqrt{\dfrac{(9-1)0.01588^2}{2.180}}$
95% CI: $0.01073\ \text{g} < \sigma < 0.03042\ \text{g}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,9-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,9-1)
15. Because the two confidence intervals overlap, it is possible that females and males have the same amount of variation, so there does not appear to be a significant difference in variation.
df = 10 − 1 = 9; $\chi^2_L = 2.700$, $\chi^2_R = 19.023$
F: $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(10-1)0.72080^2}{19.023}} < \sigma < \sqrt{\dfrac{(10-1)0.72080^2}{2.700}}$
95% CI: $0.50 < \sigma < 1.32$
M: $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(10-1)0.35277^2}{19.023}} < \sigma < \sqrt{\dfrac{(10-1)0.35277^2}{2.700}}$
95% CI: $0.24 < \sigma < 0.64$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,10-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,10-1)
16. df = 10 − 1 = 9; $\chi^2_L = 2.700$, $\chi^2_R = 19.023$
a. $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(10-1)0.47668^2}{19.023}} < \sigma < \sqrt{\dfrac{(10-1)0.47668^2}{2.700}}$
95% CI: $0.33\ \text{min} < \sigma < 0.87\ \text{min}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,10-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,10-1)
b. $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(10-1)1.82163^2}{19.023}} < \sigma < \sqrt{\dfrac{(10-1)1.82163^2}{2.700}}$
95% CI: $1.25\ \text{min} < \sigma < 3.33\ \text{min}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,10-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,10-1)
c. The variation appears to be significantly lower with a single line. The single line appears to be better because customers are more satisfied if their waiting times are about the same.
17. The sample size is 85, which is very practical.
18. The sample size of 19,205 is too large to be practical. The United States has only 15,000 air traffic controllers.
19. The sample size is 192, which is practical.
20. The sample size is 14, which is very practical, but an estimate of s that is within 50% of the true value is not very accurate and may have limited value.
21. Because the confidence intervals overlap, it is possible that the different line configurations have the same variation, so these results do not support the expectation that the single line has less variation. Requirements: the strict normality requirement is questionable for both data sets, especially the two-line wait times.
df = 85 − 1 = 84; $\chi^2_L = 59.692$, $\chi^2_R = 110.090$
Two-line: $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(85-1)240.28670^2}{110.090}} < \sigma < \sqrt{\dfrac{(85-1)240.28670^2}{59.692}}$
95% CI: $208.8\ \text{sec} < \sigma < 283.0\ \text{sec}$
Table: $213.3\ \text{sec} < \sigma < 291.3\ \text{sec}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,85-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,85-1)
One-line: $\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(85-1)200.58168^2}{110.090}} < \sigma < \sqrt{\dfrac{(85-1)200.58168^2}{59.692}}$
95% CI: $174.3\ \text{sec} < \sigma < 236.3\ \text{sec}$
Table: $178.0\ \text{sec} < \sigma < 243.2\ \text{sec}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,85-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,85-1)
22. a. df = 205 − 1 = 204, $\chi^2_L = 166.338$, $\chi^2_R = 245.448$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(205-1)706.3^2}{245.448}} < \sigma < \sqrt{\dfrac{(205-1)706.3^2}{166.338}}$
95% CI: $643.9\ \text{g} < \sigma < 782.1\ \text{g}$
Table: $886.2\ \text{g} < \sigma < 1170.9\ \text{g}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,205-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,205-1)
22. (continued)
b. df = 195 − 1 = 194, $\chi^2_L = 157.321$, and $\chi^2_R = 234.465$ (Using technology.)
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(195-1)660.2^2}{234.465}} < \sigma < \sqrt{\dfrac{(195-1)660.2^2}{157.321}}$
95% CI: $600.5\ \text{g} < \sigma < 733.1\ \text{g}$
Table: $807.8\ \text{g} < \sigma < 1076.3\ \text{g}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,195-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,195-1)
c. The amounts of variation are about the same.
23. $\chi^2_R = \dfrac{1}{2}\left[+1.96 + \sqrt{2(106-1) - 1}\right]^2 = 134.756$ and $\chi^2_L = \dfrac{1}{2}\left[-1.96 + \sqrt{2(106-1) - 1}\right]^2 = 78.085$; The values from the given approximation are quite close to the actual critical values.
24. $n = \dfrac{1}{2}\left(\dfrac{z_{\alpha/2}}{d}\right)^2 = \dfrac{1}{2}\left(\dfrac{2.326347877}{0.05}\right)^2 = 1083$ (Using the given formula with Table A-2 yields 1086. Statdisk yields 1087.)
Section 7-4: Bootstrapping: Using Excel for Estimates
1. (1) The sample must be collected in an appropriate way, such as a simple random sample. (2) The sample means found from the generated bootstrap samples should have a distribution that is approximately symmetric.
2. For the given sample, a bootstrap sample is another sample of 15 of the listed times randomly selected with replacement.
3. There is no universal exact number, but there should be at least 1000 bootstrap samples, and the use of 10,000 or more is common.
4. The 1000 means will target the sample mean of 54.3 seconds. The bootstrap method will not improve upon a poor estimate from the sample.
5. $0.000 < p < 0.500$
6. $0.125 < p < 0.750$
7. a. $0.1\ \text{kg} < \mu < 8.6\ \text{kg}$
b. $1.9\ \text{kg} < \sigma < 6.3\ \text{kg}$
8. a. $66.3\ \text{cW/kg} < \mu < 98.3\ \text{cW/kg}$
b. $16.7\ \text{cW/kg} < \sigma < 54.6\ \text{cW/kg}$
9. Answers vary, but here are typical answers.
a. $-0.8\ \text{kg} < \mu < 7.8\ \text{kg}$
b. $1.2\ \text{kg} < \sigma < 7.0\ \text{kg}$
10. Answers vary, but here are typical answers.
a. $50.0\ \text{cW/kg} < \mu < 118.3\ \text{cW/kg}$
b. $9.8\ \text{cW/kg} < \sigma < 57.3\ \text{cW/kg}$
11. a. Answers vary, but here is a typical answer: $128.4\ \text{mm} < \mu < 139.3\ \text{mm}$.
b. The confidence interval obtained from the bootstrap method is very close to this confidence interval from Exercise 13 in Section 7-2: $126.3\ \text{mm} < \mu < 141.2\ \text{mm}$.
12. a. Answers vary, but here is a typical answer: $5.60249\ \text{g} < \mu < 5.68677\ \text{g}$
b. The confidence interval contains the specified weight of 5.670 g, so production appears to be OK provided that the variation is not too large.
c. The confidence interval obtained from the bootstrap method is close to this confidence interval from Exercise 14 in Section 7-2: $5.58898\ \text{g} < \mu < 5.69494\ \text{g}$.
13. a. Answers vary, but here is a typical answer: $11.9\ \text{minutes} < \mu < 34.3\ \text{minutes}$.
b. The confidence interval given in part (a) is likely to be better. Given that the sample data do not appear to be from a normally distributed population, the use of the t distribution is questionable, so the confidence interval found in Exercise 15 in Section 7-2 might not be a good estimate of $\mu$.
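The bootstrap procedure described in Exercises 1 through 3 of this section (resample with replacement, compute the statistic, then take percentiles of the sorted results) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `bootstrap_percentile_ci` is hypothetical, the sample values are invented for demonstration and are not from any exercise, and the percentile positions use a simple index rule rather than whatever convention a given Excel workbook applies.

```python
import random
import statistics


def bootstrap_percentile_ci(data, stat, confidence=0.95, n_boot=1000, seed=0):
    """Percentile bootstrap interval for the statistic `stat`."""
    rng = random.Random(seed)  # fixed seed so reruns give the same interval
    n = len(data)
    boot_stats = sorted(
        stat([rng.choice(data) for _ in range(n)])  # resample WITH replacement
        for _ in range(n_boot)
    )
    tail = (1 - confidence) / 2
    lower = boot_stats[round(tail * n_boot)]            # about the 2.5th percentile
    upper = boot_stats[round((1 - tail) * n_boot) - 1]  # about the 97.5th percentile
    return lower, upper


# Hypothetical sample values, invented for illustration only.
sample = [2.4, 3.1, 0.8, 5.6, 4.2, 1.9, 7.3, 3.8, 2.2, 6.1]

mean_ci = bootstrap_percentile_ci(sample, statistics.mean)
sd_ci = bootstrap_percentile_ci(sample, statistics.stdev)
print("mean:", mean_ci)
print("st. dev.:", sd_ci)
```

As the answers above note, results vary from run to run unless the random seed is fixed, and the method cannot improve on a poor sample.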
14. a. Answers vary, but here is a typical answer: $19.5\ \text{minutes} < \mu < 36.2\ \text{minutes}$.
b. The confidence interval obtained from the bootstrap method is close to this confidence interval from Exercise 16 in Section 7-2: $18.3\ \text{minutes} < \mu < 36.3\ \text{minutes}$.
15. a. Answers vary, but here is a typical answer: $1.5\ \text{in.} < \sigma < 3.1\ \text{in.}$
b. The confidence interval obtained from the bootstrap method is close to this confidence interval from Exercise 13 in Section 7-3: $1.9\ \text{in.} < \sigma < 3.4\ \text{in.}$
16. a. Answers vary, but here is a typical answer: $0.00756\ \text{g} < \sigma < 0.01968\ \text{g}$.
b. The confidence interval obtained from the bootstrap method is fairly close to this confidence interval from Exercise 14 in Section 7-3: $0.01073\ \text{g} < \sigma < 0.03042\ \text{g}$.
17. a. Answers vary, but here is a typical answer: $1080.0\ \text{cm}^3 < \mu < 1169.0\ \text{cm}^3$
b. Answers vary, but here is a typical answer: $79.0\ \text{cm}^3 < \sigma < 152.8\ \text{cm}^3$
18. a. Answers vary, but here is a typical answer: $19.8\ \text{mg} < \mu < 42.9\ \text{mg}$.
b. The confidence interval given in part (a) is likely to be better. Given that the sample data do not appear to be from a normally distributed population, the use of the t distribution is questionable, so the confidence interval found in Exercise 20 in Section 7-2 might not be a good estimate of $\mu$.
c. The confidence interval obtained from the bootstrap method is very close to this confidence interval from Exercise 20 in Section 7-2: $19.5\ \text{mg} < \mu < 45.6\ \text{mg}$.
19. Answers vary, but here is a typical result: $0.135 < p < 0.152$. The result is essentially the same as the confidence interval of $0.135 < p < 0.152$ found in Exercise 15 from Section 7-1.
20. Answers vary, but here is a typical result: $0.671 < p < 0.722$. The result is very close to the confidence interval of $0.671 < p < 0.723$ found in Exercise 16 from Section 7-1.
21. Answers vary, but here is a typical result: $0.0208 < p < 0.0317$. This is quite close to the confidence interval of $0.0205 < p < 0.0311$ found in Exercise 14 from Section 7-1.
22. Answers vary, but here is a typical result: $0.860 < p < 0.932$. This is quite close to the confidence interval found in Exercise 24 in Section 7-1.
23. a. Answers vary, but here is a typical result: $2.5 < \sigma < 3.3$.
b. $2.4 < \sigma < 3.7$
c. The confidence interval from the bootstrap method is not very different from the confidence interval found using the methods of Section 7-3. Because a histogram or normal quantile plot shows that the sample appears to be from a population not having a normal distribution, the bootstrap confidence interval of $2.5 < \sigma < 3.3$ would be a better estimate of $\sigma$.
24. a. Answers vary, but here is a typical result: $1.6 < \mu < 3.3$.
b. $1.6 < \mu < 3.3$
c. The confidence interval found from the bootstrap method is essentially the same as the confidence interval found using the methods of Section 7-2.
25. Answers vary, but here is a typical result using 10,000 bootstrap samples: $1.7 < \mu < 3.3$. This result is very close to the confidence interval of $1.6 < \mu < 3.3$ found using 1000 bootstrap samples. In this case, increasing the number of bootstrap samples from 1000 to 10,000 does not have much of an effect on the confidence interval.
26. a. No. A histogram or normal quantile plot would show that the distribution of the sample data is far from normal.
b. Yes. The sample means appear to have a distribution that is approximately normal.
c. Yes. The sample standard deviations appear to have a distribution that is approximately normal.
27. The histogram of the 1000 bootstrap sample means is approximately symmetric as required.
28. The 1000 medians have a mean around 18 sec, and the 95% confidence interval of the 1000 medians is likely to be something like 0 sec < median < 37 sec or 0 sec < median < 74 sec. The 95% confidence interval is good in the sense that it does capture the median of the population, but it is quite wide and does not give us a very accurate estimate of the population median. With this median, the bootstrap method does not work well because the 1000 medians are targeting a value near 18 seconds instead of targeting the median of 5.5 seconds or the sample median of 24.0 seconds.
Quick Quiz
1. $\hat{p} = \dfrac{0.175 + 0.206}{2} = 0.191$, or 19.1%
2. We have 95% confidence that the limits of 17.5% and 20.6% contain the true value of the percentage of motorcycle owners who are women.
3. $z_{0.01/2} = 2.576$ (Table: 2.575)
fx | =NORM.S.INV(1-0.01/2)
4. $40\% - 3.1\% < p < 40\% + 3.1\%$, or $36.9\% < p < 43.1\%$
5. $n = \dfrac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2} = \dfrac{1.96^2(0.5)(0.5)}{0.02^2} = 2401$
fx | =NORM.S.INV(1-0.05/2)^2*0.50*0.50/0.02^2
6. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{2.575 \cdot 15}{2}\right]^2 = 373$ (Excel: 374)
fx | =(NORM.S.INV(1-0.01/2)*15/2)^2
7. There is a loose requirement that the sample values are from a normally distributed population.
8. The degrees of freedom is the number of sample values that can vary after restrictions have been imposed on all of the values. For the sample data described in Exercise 7, df = 6 − 1 = 5.
9. $t_{0.05/2} = 2.571$
fx | =T.INV.2T(0.05,6-1)
10. No, the use of the $\chi^2$ distribution has a fairly strict requirement that the data must be from a normal distribution. The bootstrap method could be used to find a 95% confidence interval estimate of $\sigma$.
Review Exercises
1. a. $n = \dfrac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2} = \dfrac{1.96^2(0.5)(0.5)}{0.1^2} = 97$
fx | =NORM.S.INV(1-0.05/2)^2*0.50*0.50/0.1^2
b. $n = \dfrac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2} = \dfrac{1.96^2(0.4)(0.6)}{0.1^2} = 93$
fx | =NORM.S.INV(1-0.05/2)^2*0.40*(1-0.40)/0.1^2
c. No, in this case the sample size doesn't change much at all.
2. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{1.96 \cdot 1.3}{0.2}\right]^2 = 163$
fx | =(NORM.S.INV(1-0.05/2)*1.3/0.2)^2
3. a. 0.70(1002) = 701
b. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.70 \pm 1.96\sqrt{\dfrac{(0.70)(0.30)}{1002}}$, or $0.671 < p < 0.728$, or $67.1\% < p < 72.8\%$.
c. Based on a survey, 70% of those surveyed say that they voted in the presidential election, and the survey has a margin of error of $\dfrac{72.8 - 67.1}{2} = 2.9\%$ or $1.96\sqrt{\dfrac{(0.70)(0.30)}{1002}} = 0.028 = 2.8\%$.
d. No. Because 61% is not included in the confidence interval, it does not appear that the responses are consistent with the actual voter turnout.
XLSTAT Output for Exercise 3
95% confidence interval on the proportion (Wald): ( 0.672, 0.728 )
XLSTAT Output for Exercise 4
95% confidence interval on the mean: ( 30.697, 52.303 )
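The proportion sample sizes in Quick Quiz Exercise 5 and Review Exercise 1 use the same formula, which can be sketched in a few lines. This is an illustrative sketch only: `sample_size_for_proportion` is a hypothetical name, and `NormalDist().inv_cdf` stands in for the `NORM.S.INV` call shown in the Excel commands.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(E, confidence, p_hat=0.5):
    """n = z^2 * p_hat * q_hat / E^2, rounded up.

    p_hat defaults to 0.5, the conservative choice when no estimate is known.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z**2 * p_hat * (1 - p_hat) / E**2)


n_quiz5 = sample_size_for_proportion(E=0.02, confidence=0.95)                # Quick Quiz 5
n_review1a = sample_size_for_proportion(E=0.1, confidence=0.95)              # Review 1(a)
n_review1b = sample_size_for_proportion(E=0.1, confidence=0.95, p_hat=0.40)  # Review 1(b)
print(n_quiz5, n_review1a, n_review1b)
```

Comparing the last two values shows the point made in Review Exercise 1(c): moving the estimate from 0.5 to 0.4 changes the required sample size only slightly.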
4. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 41.5 \pm 2.262 \cdot \dfrac{15.102}{\sqrt{10}}$, or $30.7\ \text{minutes} < \mu < 52.3\ \text{minutes}$; We have 95% confidence that the limits of 30.7 minutes and 52.3 minutes contain the true value of the population mean $\mu$. The sample does not appear to be from a population having a normal distribution.
5. a. Student t distribution
b. None of the three distributions is appropriate, but a confidence interval could be constructed by using bootstrap methods.
c. $\chi^2$ (chi-square distribution)
d. normal distribution
e. None of the three distributions is appropriate, but a confidence interval could be constructed by using bootstrap methods.
6. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 15.3 \pm 2.365 \cdot \dfrac{42.057}{\sqrt{8}}$, or $-19.9\ \text{minutes} < \mu < 50.4\ \text{minutes}$; The confidence interval includes 0 (on time), so the on-time performance looks reasonably good. The confidence interval is wide, so the population mean delay time is not estimated with much accuracy.
XLSTAT Output
95% confidence interval on the mean: ( -19.910, 50.410 )
7. df = 8 − 1 = 7; $\chi^2_L = 1.690$, $\chi^2_R = 16.013$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(8-1)42.057^2}{16.013}} < \sigma < \sqrt{\dfrac{(8-1)42.057^2}{1.690}}$
95% CI: $27.8\ \text{min} < \sigma < 85.6\ \text{min}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,8-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,8-1)
8. Answers will vary, but this result is typical: $27.8\ \text{min} < \sigma < 85.6\ \text{min}$.
9. a. Yes, the requirements are satisfied (n > 30).
95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 34.1 \pm 2.030 \cdot \dfrac{74.737}{\sqrt{36}}$, or $8.8\ \text{seconds} < \mu < 59.3\ \text{seconds}$
b. Construction of a confidence interval estimate of $\sigma$ has a fairly strict requirement that the sample should be from a population with a normal distribution, regardless of the sample size. That requirement is not satisfied. 19 of the 36 times are 0 seconds, so the sample does not appear to be from a population having a normal distribution.
XLSTAT Output for Exercise 9
95% confidence interval on the mean: ( 8.768, 59.343 )
XLSTAT Output for Exercise 10
95% confidence interval on the mean: ( 27.026, 33.074 )
10. a. The requirements are satisfied.
95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 30.05 \pm 2.262 \cdot \dfrac{4.227}{\sqrt{10}}$, or $27.03\ \text{cm} < \mu < 33.07\ \text{cm}$
b. The requirements are satisfied. df = 10 − 1 = 9; $\chi^2_L = 2.700$, $\chi^2_R = 19.023$
$\sqrt{\dfrac{(n-1)s^2}{\chi^2_R}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_L}}$, so $\sqrt{\dfrac{(10-1)4.227^2}{19.023}} < \sigma < \sqrt{\dfrac{(10-1)4.227^2}{2.700}}$
95% CI: $2.91\ \text{cm} < \sigma < 7.72\ \text{cm}$
$\chi^2_L$: fx | =CHISQ.INV(0.05/2,10-1)
$\chi^2_R$: fx | =CHISQ.INV.RT(0.05/2,10-1)
Cumulative Review Exercises
1. $\bar{x} = \dfrac{1.5 + 0.9 + 1.3 + 1.5 + 1.3 + 1.2 + 1.4 + 1.1 + 1.3 + 1.4 + 1.2}{11} = 1.28$ W/kg; median: $Q_2 = 1.30$ W/kg,
$s = \sqrt{\dfrac{(1.5 - 1.28)^2 + (0.9 - 1.28)^2 + (1.3 - 1.28)^2 + \cdots + (1.4 - 1.28)^2 + (1.2 - 1.28)^2}{11 - 1}} = 0.18$ W/kg,
$s^2 = (0.18\ \text{W/kg})^2 = 0.03\ (\text{W/kg})^2$, range = 1.5 − 0.9 = 0.6 W/kg
XLSTAT Output
Statistic                    Radiation
Nbr. of observations         11
Range                        0.600
Median                       1.300
Mean                         1.282
Variance (n-1)               0.032
Standard deviation (n-1)     0.178
2. Using the range rule of thumb, the limit separating significantly low values is $\mu - 2\sigma = 1.28 - 2(0.18) = 0.92$ W/kg and the limit separating significantly high values is $\mu + 2\sigma = 1.28 + 2(0.18) = 1.64$ W/kg. Because the lowest value of 0.9 W/kg is less than 0.92 W/kg, it is significantly low.
3. Ratio level of measurement; continuous data.
4. The graphs suggest that the sample appears to be from a population having a distribution that is approximately normal.
5. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 1.28 \pm 2.228 \cdot \dfrac{0.18}{\sqrt{11}}$, or $1.16\ \text{W/kg} < \mu < 1.40\ \text{W/kg}$; We have 95% confidence that the limits of 1.16 W/kg and 1.40 W/kg contain the actual value of the population mean $\mu$.
XLSTAT Output for Exercise 5
95% confidence interval on the mean: ( 1.162, 1.401 )
6. $n = \left[\dfrac{z_{\alpha/2}\,\sigma}{E}\right]^2 = \left[\dfrac{1.96 \cdot 0.29}{0.05}\right]^2 = 130$ cell phones
Excel Command for Exercise 6
fx | =(NORM.S.INV(1-0.05/2)*0.29/0.05)^2
7. a. $z_{x=1.6} = \dfrac{1.6 - 1.17}{0.29} = 1.48$; which has a probability of 1 − 0.9306 = 0.0694 (Excel: 0.0691) to the right.
fx | =1-NORM.DIST(1.6,1.17,0.29,1)
b. The z score for the lower 75% is 0.67, which corresponds to a radiation amount of 1.17 + 0.67(0.29) = 1.36 W/kg (Excel: 1.37 W/kg).
fx | =NORM.INV(0.75,1.17,0.29)
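The two normal-distribution lookups in Exercise 7 can be reproduced without a z table. The sketch below is illustrative: it uses `statistics.NormalDist` with the mean 1.17 and standard deviation 0.29 from the exercise, where `cdf` corresponds to `NORM.DIST(..., 1)` and `inv_cdf` corresponds to `NORM.INV`; the variable names are assumptions.

```python
from statistics import NormalDist

# Radiation amounts modeled as normal with mean 1.17 W/kg and sd 0.29 W/kg.
radiation = NormalDist(mu=1.17, sigma=0.29)

# Part (a): area to the right of 1.6 W/kg.
p_above = 1 - radiation.cdf(1.6)

# Part (b): value separating the lower 75% from the upper 25%.
q3 = radiation.inv_cdf(0.75)

print(f"P(x > 1.6) = {p_above:.4f}")
print(f"75th percentile = {q3:.2f} W/kg")
```

The small differences from the table answers (0.0694 and 1.36 W/kg) arise because the table method rounds z to two decimal places first.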
8. a. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = \dfrac{986}{1043} \pm 1.96\sqrt{\dfrac{\left(\frac{986}{1043}\right)\left(\frac{57}{1043}\right)}{1043}}$, or $0.932 < p < 0.959$
b. Based on the result from part (a), it is clear that the majority of the population does not feel that the song is too offensive.
c. Because the respondents chose to respond, the survey involves a self-selected or voluntary response sample, so the results are very questionable. Given that the respondents are a self-selected sample, we don't really know anything about the population.
XLSTAT Output for Exercise 8
95% confidence interval on the proportion (Wald): ( 0.932, 0.959 )
XLSTAT Output for Exercise 9
95% confidence interval on the proportion (Wald): ( 0.639, 0.681 )
9. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.66 \pm 1.96\sqrt{\dfrac{(0.66)(0.34)}{2020}}$, or $0.639 < p < 0.681$, or $63.9\% < p < 68.1\%$; CVS Pharmacy sells flu shots and could potentially benefit from the widespread belief that there will be high demand for flu shots, so there is a potential for bias (although the Harris Poll would likely conduct the survey without any bias).
10. a. 95% CI: $\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 40.22 \pm 2.101 \cdot \dfrac{0.645}{\sqrt{19}}$, or $39.909\ \text{km/h} < \mu < 40.531\ \text{km/h}$
XLSTAT Output
95% confidence interval on the mean: ( 39.909, 40.531 )
b. Given that the speeds are listed in order by year, another tool that would be more helpful is a time series graph of the speeds. The time series graph could reveal any pattern of change over time. The time series graph of the listed data does not reveal any notable pattern of change over time.
Section 8-1: Basics of Hypothesis Testing
121
Chapter 8: Hypothesis Testing
Section 8-1: Basics of Hypothesis Testing
1. a. $H_0$: $\mu = 40$ minutes
b. $H_1$: $\mu \ne 40$ minutes
c. Reject the null hypothesis or fail to reject the null hypothesis.
d. No, in this case, the original claim becomes the null hypothesis. For the claim that the mean wait time is equal to 40 minutes, we can either reject that claim or fail to reject it, but we cannot state that there is sufficient evidence to support that claim.
2. Estimates and hypothesis tests are both methods of inferential statistics, but they have different objectives. We could use the sample data to construct a confidence interval estimate of the population mean, but hypothesis testing is used to test some claim made about the value of the mean amount of Dr. Pepper in the cans.
3. The P-value of 0.001 is preferred because it corresponds to the sample evidence that most strongly supports the alternative hypothesis that the method is effective.
4. The most serious error is the first error (i) because it would result in distributing pills with lisinopril amounts that are very different from the intended amounts. Distributing pills with the wrong amount of a drug is more serious than distributing vitamin pills with the wrong amount of the vitamin. The error of failing to reject the false null hypothesis is a type II error.
5. a. $p < 0.10$
b. $H_0$: $p = 0.10$; $H_1$: $p < 0.10$
6. a. $p > 0.5$
b. $H_0$: $p = 0.5$; $H_1$: $p > 0.5$
7. a. $\mu$ 123 mm Hg
b. $H_0$: $\mu = 123$ mm Hg; $H_1$: $\mu$ 123 mm Hg
8. a. $\sigma$ 5 mm Hg
b. $H_0$: $\sigma = 5$ mm Hg; $H_1$: $\sigma$ 5 mm Hg
9. $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.058 - 0.10}{\sqrt{\dfrac{(0.10)(0.90)}{16{,}113}}} = -17.77$ ($z = -17.76$ if using $x = 0.058(16{,}113) = 935$)
10. $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.72 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{3278}}} = 25.19$
11. $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{122.96 - 123}{15.85/\sqrt{300}} = -0.044$
12. $\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(300 - 1)15.85^2}{5^2} = 3004.621$
13. a. Left-tailed
b. P-value $= P(z < -0.75) = 0.2266$
fx | =NORM.S.DIST(-0.75,1)
c. $0.2266 > 0.05$; Fail to reject $H_0$.
14. a. Right-tailed
b. P-value $= P(z > 2.00) = 0.0228$
fx | =1-NORM.S.DIST(2.00,1)
c. $0.0228 < 0.05$; Reject $H_0$.
15. a. Two-tailed
b. P-value $= 2 \cdot P(z > 1.80) = 0.0719$ (Table: 0.0718)
fx | =2*(1-NORM.S.DIST(1.80,1))
c. $0.0719 > 0.05$; Fail to reject $H_0$.
16. a. Two-tailed
b. P-value $= 2 \cdot P(z < -1.60) = 0.1096$
fx | =2*NORM.S.DIST(-1.60,1)
c. $0.1096 > 0.05$; Fail to reject $H_0$.
17. a. $z = -1.645$
fx | =NORM.S.INV(0.05)
b. $-0.75 > -1.645$; Fail to reject $H_0$.
18. a. $z = 1.645$
fx | =NORM.S.INV(1-0.05)
b. $2.00 > 1.645$; Reject $H_0$.
19. a. $z = \pm 1.96$
fx | =NORM.S.INV(0.05/2) and fx | =NORM.S.INV(1-0.05/2)
b. $-1.96 < 1.80 < 1.96$; Fail to reject $H_0$.
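The P-value and critical value lookups in Exercises 13 to 20 can also be cross-checked outside Excel. The following is a minimal sketch using only Python's standard library; the function names `z_p_value` and `z_critical` are illustrative and are not part of the manual or of Excel.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def z_p_value(z, tail):
    """P-value for a z test statistic; tail is 'left', 'right', or 'two'."""
    if tail == "left":
        return STD_NORMAL.cdf(z)              # like =NORM.S.DIST(z,1)
    if tail == "right":
        return 1 - STD_NORMAL.cdf(z)          # like =1-NORM.S.DIST(z,1)
    if tail == "two":
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    raise ValueError("tail must be 'left', 'right', or 'two'")


def z_critical(alpha, tail):
    """Critical z value; for a two-tailed test the positive value is returned."""
    if tail == "left":
        return STD_NORMAL.inv_cdf(alpha)          # like =NORM.S.INV(alpha)
    if tail == "right":
        return STD_NORMAL.inv_cdf(1 - alpha)      # like =NORM.S.INV(1-alpha)
    if tail == "two":
        return STD_NORMAL.inv_cdf(1 - alpha / 2)  # use as +/- this value
    raise ValueError("tail must be 'left', 'right', or 'two'")


if __name__ == "__main__":
    print(round(z_p_value(-0.75, "left"), 4))   # Exercise 13
    print(round(z_p_value(1.80, "two"), 4))     # Exercise 15
    print(round(z_critical(0.05, "left"), 3))   # Exercise 17
```

Comparing the returned P-value with the significance level, or the test statistic with the critical value, reproduces the reject / fail-to-reject decisions stated above.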
20. a. $z = \pm 1.96$
fx | =NORM.S.INV(0.05/2) and fx | =NORM.S.INV(1-0.05/2)
b. $-1.96 < -1.60 < 1.96$; Fail to reject $H_0$.
21. a. $0.3257 > 0.05$; Fail to reject $H_0$.
b. There is not sufficient evidence to support the claim that more than 58% of adults would erase all of their personal information online if they could.
22. a. $0.00001 < 0.05$; Reject $H_0$.
b. There is sufficient evidence to support the claim that more than 35% of air travelers would choose another airline to have access to inflight Wi-Fi.
23. a. $0.0095 < 0.05$; Reject $H_0$.
b. There is sufficient evidence to warrant rejection of the claim that the mean pulse rate (in beats per minute) of adult males is 72 bpm.
24. a. $0.3045 > 0.05$; Fail to reject $H_0$.
b. There is not sufficient evidence to support the claim that the standard deviation of pulse rates of adult males is more than 11 bpm.
25. Type I error: In reality $p = 0.1$, but we reject the claim that $p = 0.1$. Type II error: In reality $p \ne 0.1$, but we fail to reject the claim that $p = 0.1$.
26. Type I error: In reality $p = 0.34$, but we reject the claim that $p = 0.34$. Type II error: In reality $p \ne 0.34$, but we fail to reject the claim that $p = 0.34$.
27. Type I error: In reality $p = 0.25$, but we support the claim that p 0.25. Type II error: In reality p 0.25, but we fail to support that claim.
28. Type I error: In reality $p = 1/2$, but we support the claim that p 1/2. Type II error: In reality p 1/2, but we fail to support that conclusion.
29. The power of 0.96 shows that there is a 96% chance of rejecting the null hypothesis of $p = 0.08$ when the true proportion is actually 0.18. That is, if the proportion of Chantix users who experience abdominal pain is actually 0.18, then there is a 96% chance of supporting the claim that the proportion of Chantix users who experience abdominal pain is greater than 0.08.
30. a. From $p = 0.5$, $\hat{p} = 0.5 + 1.645\sqrt{\dfrac{(0.5)(0.5)}{64}} = 0.6028125$
From $p = 0.65$, $z = \dfrac{0.6028125 - 0.65}{\sqrt{\dfrac{(0.65)(0.35)}{64}}} = -0.791$; Power $= P(z > -0.791) = 0.7852$ (Excel: 0.7857)
fx | =1-NORM.S.DIST(-0.791623,1)
b. Assuming that $p = 0.5$, as in the null hypothesis, the critical value of $z = 1.645$ corresponds to $\hat{p} = 0.6028125$, so any sample proportion greater than 0.6028125 causes us to reject the null hypothesis, as shown in the shaded critical region of the top graph. If p is actually 0.65, then the null hypothesis of $p = 0.5$ is false, and the actual probability of rejecting the null hypothesis is found by finding the area greater than $\hat{p} = 0.6028125$ in the bottom graph, which is the shaded area. That is, the shaded area in the bottom graph represents the probability of rejecting the false null hypothesis.
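The two-step power calculation in Exercise 30 (find the critical sample proportion under the null hypothesis, then find the area beyond it under the alternative value of p) can be sketched in Python as follows. This is an illustrative sketch for a right-tailed test only; the function name `power_right_tailed` is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed(p0, p_true, n, alpha):
    """Power of a right-tailed z test of H0: p = p0 when the true proportion is p_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)
    # Step 1: smallest sample proportion that leads to rejection, computed under H0.
    p_hat_crit = p0 + z_crit * sqrt(p0 * (1 - p0) / n)
    # Step 2: convert that cutoff to a z score under the true proportion.
    z = (p_hat_crit - p_true) / sqrt(p_true * (1 - p_true) / n)
    # Power is the area to the right of that z score.
    return 1 - std_normal.cdf(z)


if __name__ == "__main__":
    # Exercise 30: p0 = 0.5, true p = 0.65, n = 64, alpha = 0.05
    print(round(power_right_tailed(0.5, 0.65, 64, 0.05), 4))
```

Because the sketch uses the unrounded critical value rather than 1.645, its result should line up with the Excel value rather than the table value.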
31. From $p = 0.5$, $\hat{p} = 0.5 + 1.645\sqrt{\dfrac{(0.5)(0.5)}{n}}$; from $p = 0.55$, since $P(z > -0.842) = 0.8000$, $\hat{p} = 0.55 - 0.842\sqrt{\dfrac{(0.55)(0.45)}{n}}$
So: $0.5 + 1.645\sqrt{\dfrac{(0.5)(0.5)}{n}} = 0.55 - 0.842\sqrt{\dfrac{(0.55)(0.45)}{n}}$
$0.5\sqrt{n} + 1.645\sqrt{0.25} = 0.55\sqrt{n} - 0.842\sqrt{0.2475}$
$0.05\sqrt{n} = 1.645\sqrt{0.25} + 0.842\sqrt{0.2475}$
$n = \left[\dfrac{1.645\sqrt{0.25} + 0.842\sqrt{0.2475}}{0.05}\right]^2 = 617$
32. a. $p \le 0.10$
b. $H_0$: $p = 0.10$; $H_1$: $p > 0.10$; If $H_0$ is rejected, conclude that there is sufficient evidence to warrant rejection of the claim that at most 10% of homes have only a landline telephone and no wireless phone. If we fail to reject $H_0$, conclude that there is not sufficient evidence to warrant rejection of the claim that at most 10% of homes have only a landline telephone and no wireless phone.
Section 8-2: Testing a Claim About a Proportion
1. a. $0.86(1020) = 877$
b. $\hat{p} = 0.86$; the symbol $\hat{p}$ is used to represent a sample proportion.
c. $p = 3/4$ or 0.75
2. a. $H_0$: $p = 3/4$; $H_1$: $p > 3/4$
b. $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.86 - 0.75}{\sqrt{\dfrac{(0.75)(0.25)}{1020}}} = 8.11$ (8.10 if using $x = 0.86(1020) = 877$)
3. (1) It was stated that the sample is a simple random sample. (2) There is a fixed number of trials (1020), the trials are independent because any respondent is independent of the others, there are two categories of outcomes, and the probability remains the same for all respondents. (3) With $n = 1020$ and $p = 3/4$, the conditions $np = 0.75(1020) = 765 \ge 5$ and $nq = 0.25(1020) = 255 \ge 5$ are both satisfied.
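The requirement check and test statistic from Exercises 2 and 3 can be combined in a short routine. The sketch below is illustrative only (the function name `one_proportion_z_test` is an assumption); it uses the normal approximation with the claimed proportion in the standard error, as in the solutions above.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(p_hat, p0, n, tail):
    """Return (z, P-value) for a test of H0: p = p0; tail is 'left', 'right', or 'two'."""
    # Requirement (3): np >= 5 and nq >= 5, using the claimed proportion p0.
    if n * p0 < 5 or n * (1 - p0) < 5:
        raise ValueError("normal approximation requirement np >= 5 and nq >= 5 not met")
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if tail == "left":
        p_value = cdf(z)
    elif tail == "right":
        p_value = 1 - cdf(z)
    elif tail == "two":
        p_value = 2 * (1 - cdf(abs(z)))
    else:
        raise ValueError("tail must be 'left', 'right', or 'two'")
    return z, p_value


if __name__ == "__main__":
    # Exercise 2: p-hat = 0.86, claimed p = 0.75, n = 1020, right-tailed
    z, p_value = one_proportion_z_test(0.86, 0.75, 1020, "right")
    print(round(z, 2), p_value)
```

Passing an unrounded sample proportion such as `91/220` instead of a rounded percentage reproduces the "if using x" variants of the test statistic.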
4. a. It's possible that more than 3/4 of adults can rate themselves as above average drivers, but in reality, it is not possible for more than 3/4 of adults to actually be above average drivers. Half would be below average and half would be above average.
b. The method based on a confidence interval is not equivalent to the P-value method and the critical value method.
c. If the P-value is very low (such as less than or equal to 0.05), "the null must go" means that we should reject the null hypothesis.
d. The statement that "if the P is high, the null will fly" suggests that with a high P-value, the null hypothesis has been proved or is supported, but we should never make such a conclusion of proving or supporting the null hypothesis.
e. Choosing a significance level with a number like 0.0483 would make it seem like you're carefully choosing the significance level to reach a desired conclusion.
5. a. left-tailed
b. $z = -4.46$
c. P-value = 0.000004
d. $H_0$: $p = 0.10$; Reject the null hypothesis.
e. There is sufficient evidence to support the claim that less than 10% of treated subjects experience headaches.
6. a. right-tailed
b. $z = 2.22$
c. P-value = 0.0132
d. $H_0$: $p = 0.25$; Reject the null hypothesis.
e. There is sufficient evidence to support the claim that more than 1/4 of smartphone owners have no lock screen on their phones.
7. a. left-tailed
b. $z = -6.32$
c. P-value = 0.000
d. $H_0$: $p = 0.50$; Reject the null hypothesis.
e. There is sufficient evidence to support the claim that fewer than 50% of adults search for themselves online.
8. a. two-tailed
b. $z = 1.33$
c. P-value = 0.1840
d. $H_0$: $p = 0.5$; Fail to reject the null hypothesis.
e. There is not sufficient evidence to warrant rejection of the claim that half of us say that we should replace passwords with biometric security.
9. $H_0$: $p = 0.40$; $H_1$: $p > 0.40$; Test statistic: $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{91}{220} - 0.40}{\sqrt{\dfrac{(0.40)(0.60)}{220}}} = 0.41$
P-value $= P(z > 0.41) = 0.3409$ (Excel: 0.3399); Critical value: $z = 1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that the sample of cast and crew members is from a population in which the rate of cancer is greater than 40%. There is not sufficient evidence to support a conclusion that the movie was cursed because of a significantly high number of cancer cases.
XLSTAT Output for Exercise 9
z-test for one proportion / Upper-tailed test:
Difference 0.014
z (Observed value) 0.413
z (Critical value) 1.645
p-value (one-tailed) 0.340
alpha 0.050
XLSTAT Output for Exercise 10
z-test for one proportion / Two-tailed test:
Difference -0.104
z (Observed value) -1.443
|z| (Critical value) 1.960
p-value (Two-tailed) 0.149
alpha 0.050
10. $H_0$: $p = 0.5$; $H_1$: $p \ne 0.5$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{19}{48} - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{48}}} = -1.44$
P-value $= 2 \cdot P(z < -1.44) = 0.1498$ (Excel: 0.1489); Critical values: $z = \pm 1.96$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that dropped toast will land with the buttered side down 50% of the time. The real intent of the experiment was to test the common theory that when buttered toast is dropped, it will land with the buttered side down most of the time. That theory is not supported by the experimental data.
11. $H_0$: $p = 0.5$; $H_1$: $p < 0.5$; $0.11(2005) \approx 221$, so 221 of those surveyed knew what the symbol designates.
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.11 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{2005}}} = -34.93$ ($-34.91$ if using $x = 0.11(2005) = 221$)
P-value $= P(z < -34.93) = 0.0001$ (Excel: 0.0000); Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that fewer than half of all Americans know what the symbol designates. Because it appears that so few Americans know what the symbol designates, the symbol should be replaced with one that is much more obvious.
XLSTAT Output for Exercise 11
z-test for one proportion / Lower-tailed test:
Difference -0.390
z (Observed value) -34.926
z (Critical value) -2.326
p-value (one-tailed) <0.0001
alpha 0.010
12. $H_0$: $p = 0.24$; $H_1$: $p \ne 0.24$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.255 - 0.24}{\sqrt{\dfrac{(0.24)(0.76)}{345}}} = 0.65$ (0.66 if using $x = 0.255(345) = 88$)
P-value $= 2 \cdot P(z > 0.66) = 0.5156$ or 0.5092 (Excel: 0.5121); Critical values: $z = \pm 1.96$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that 24% of M&Ms are blue. Corrective action is not warranted. Even if they were making blue M&Ms at a rate significantly different from 24%, that would not be a big consumer issue.
XLSTAT Output for Exercise 12
z-test for one proportion / Two-tailed test:
Difference 0.015
z (Observed value) 0.652
|z| (Critical value) 1.960
p-value (Two-tailed) 0.514
alpha 0.050
13. $H_0$: $p = 0.20$; $H_1$: $p > 0.20$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{52}{227} - 0.20}{\sqrt{\dfrac{(0.20)(0.80)}{227}}} = 1.10$
P-value $= P(z > 1.10) = 0.1357$ (Excel: 0.1367); Critical value: $z = 1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that more than 20% of OxyContin users develop nausea. However, with $\hat{p} = 0.229$, we see that a large percentage of OxyContin users experience nausea, so that rate does appear to be very high.
XLSTAT Output for Exercise 13
z-test for one proportion / Upper-tailed test:
Difference 0.029
z (Observed value) 1.095
z (Critical value) 1.645
p-value (one-tailed) 0.137
alpha 0.050
14. $H_0$: $p = 0.5$; $H_1$: $p > 0.5$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{856}{1228} - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{1228}}} = 13.81$
P-value $= P(z > 13.81) = 0.0001$ (Excel: 0.0000); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that most medical malpractice lawsuits are dropped or dismissed. This should be comforting to physicians.
XLSTAT Output for Exercise 14
z-test for one proportion / Upper-tailed test:
Difference 0.197
z (Observed value) 13.812
z (Critical value) 2.326
p-value (one-tailed) <0.0001
alpha 0.010
15. $H_0$: $p = 0.50$; $H_1$: $p < 0.50$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.47 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{4581}}} = -4.06$ ($x = 0.47(4581) = 2153$)
P-value $= P(z < -4.06) = 0.0001$ (Excel: 0.00002); Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that fewer than half of Americans prefer to watch the news rather than read or listen to it.
XLSTAT Output for Exercise 15
z-test for one proportion / Lower-tailed test:
Difference -0.030
z (Observed value) -4.061
z (Critical value) -2.326
p-value (one-tailed) <0.0001
alpha 0.010
XLSTAT Output for Exercise 16
z-test for one proportion / Lower-tailed test:
Difference -0.330
z (Observed value) -46.723
z (Critical value) -1.645
p-value (one-tailed) <0.0001
alpha 0.050
16. $H_0$: $p = 0.48$; $H_1$: $p < 0.48$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{751}{5005} - 0.48}{\sqrt{\dfrac{(0.48)(0.52)}{5005}}} = -46.72$
P-value $= P(z < -46.72) = 0.0001$ (Excel: 0.0000); Critical value: $z = -1.645$; Reject $H_0$. There is sufficient evidence to support the claim that the percentage of U.S. adults who do not use the Internet is now less than 48%. The percentage appears to be dramatically different than it was in 2000.
17. $H_0$: $p = 0.512$; $H_1$: $p \ne 0.512$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{426}{860} - 0.512}{\sqrt{\dfrac{(0.512)(0.488)}{860}}} = -0.98$
P-value $= 2 \cdot P(z < -0.98) = 0.3270$ (Excel: 0.3286); Critical values: $z = \pm 1.96$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that 51.2% of newborn babies are boys. The results do not support the belief that 51.2% of newborn babies are boys; the results merely show that there is not strong evidence against the rate of 51.2%.
XLSTAT Output for Exercise 17
z-test for one proportion / Two-tailed test:
Difference -0.017
z (Observed value) -0.977
|z| (Critical value) 1.960
p-value (Two-tailed) 0.329
alpha 0.050
XLSTAT Output for Exercise 18
z-test for one proportion / Two-tailed test:
Difference 0.012
z (Observed value) 0.671
|z| (Critical value) 2.576
p-value (Two-tailed) 0.502
alpha 0.010
18. $H_0$: $p = 0.25$; $H_1$: $p \ne 0.25$; Test statistic: $z = \dfrac{\frac{152}{580} - 0.25}{\sqrt{\dfrac{(0.25)(0.75)}{580}}} = 0.67$;
P-value $= 2 \cdot P(z > 0.67) = 0.5028$ (Excel: 0.5021); Critical values: $z = \pm 2.575$ (Excel: $z = \pm 2.576$); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that 25% of offspring peas are yellow. There is not sufficient evidence to conclude that Mendel's rate of 25% is wrong.
19. $H_0$: $p = 0.80$; $H_1$: $p < 0.80$; Test statistic: $z = \dfrac{\frac{74}{98} - 0.80}{\sqrt{\dfrac{(0.80)(0.20)}{98}}} = -1.11$;
P-value $= P(z < -1.11) = 0.1335$ (Excel: 0.1332); Critical value: $z = -1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that the polygraph results are correct less than 80% of the time. However, based on the sample proportion of correct results in 75.5% of the 98 cases, polygraph results do not appear to have the high degree of reliability that would justify the use of polygraph results in court, so polygraph test results should be prohibited as evidence in trials.
XLSTAT Output for Exercise 19
z-test for one proportion / Lower-tailed test:
Difference -0.045
z (Observed value) -1.111
z (Critical value) -1.645
p-value (one-tailed) 0.133
alpha 0.050
XLSTAT Output for Exercise 20
z-test for one proportion / Lower-tailed test:
Difference -0.016
z (Observed value) -1.361
z (Critical value) -2.326
p-value (one-tailed) 0.087
alpha 0.010
20. $H_0$: $p = 0.30$; $H_1$: $p < 0.30$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{446}{1569} - 0.30}{\sqrt{\dfrac{(0.30)(0.70)}{1569}}} = -1.36$
P-value $= P(z < -1.36) = 0.0869$ (Excel: 0.0868); Critical value: $z = -2.33$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that less than 30% of the challenges are successful. Because the success rate is so low, players don't appear to be very good at recognizing referee errors.
21. $H_0$: $p = 0.5$; $H_1$: $p \ne 0.5$; Test statistic: $z = \dfrac{\frac{123}{280} - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{280}}} = -2.03$;
P-value $= 2 \cdot P(z < -2.03) = 0.0424$ (Excel: 0.0422); Critical values: $z = \pm 1.645$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that touch therapists use a method equivalent to random guesses. However, their success rate of 123/280, or 43.9%, indicates that they performed worse than random guesses, so they do not appear to be effective.
XLSTAT Output for Exercise 21
z-test for one proportion / Two-tailed test:
Difference -0.061
z (Observed value) -2.032
|z| (Critical value) 1.645
p-value (Two-tailed) 0.042
alpha 0.100
XLSTAT Output for Exercise 22
z-test for one proportion / Two-tailed test:
Difference 0.070
z (Observed value) 4.558
|z| (Critical value) 2.576
p-value (Two-tailed) <0.0001
alpha 0.010
22. $H_0$: $p = 0.50$; $H_1$: $p \ne 0.50$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.57 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{1060}}} = 4.56$ (4.55 if using $x = 0.57(1060) = 604$)
P-value $= 2 \cdot P(z > 4.56) = 0.0002$ (Excel: 0.000005); Critical values: $z = \pm 2.575$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that half of all teens have made new friends online. (The proportion appears to be greater than one half.)
23. $H_0$: $p = 0.35$; $H_1$: $p \ne 0.35$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.33 - 0.35}{\sqrt{\dfrac{(0.35)(0.65)}{2705}}} = -2.18$ ($-2.17$ if using $x = 0.33(2705) = 893$)
P-value $= 2 \cdot P(z < -2.17) = 0.0300$ (Excel: 0.0303); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the percentage who make angry gestures while driving is equal to 35%. If the significance level is changed to 0.01, the conclusion does change to this: There is not sufficient evidence to warrant rejection of the claim that the percentage who make angry gestures while driving is equal to 35%.
XLSTAT Output for Exercise 23
z-test for one proportion / Two-tailed test:
Difference -0.020
z (Observed value) -2.181
|z| (Critical value) 1.960
p-value (Two-tailed) 0.029
alpha 0.050
XLSTAT Output for Exercise 24
z-test for one proportion / Two-tailed test:
Difference -0.060
z (Observed value) -5.087
|z| (Critical value) 1.960
p-value (Two-tailed) <0.0001
alpha 0.050
24. $H_0$: $p = 0.80$; $H_1$: $p \ne 0.80$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.74 - 0.80}{\sqrt{\dfrac{(0.80)(0.20)}{1150}}} = -5.09$ ($x = 0.74(1150) = 851$)
P-value $= 2 \cdot P(z < -5.09) = 0.0002$ (Excel: 0.0000); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the percentage of all homes with at least one Internet-connected TV device is equal to 80%.
25. $H_0$: $p = 0.50$; $H_1$: $p > 0.50$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{27}{53} - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{53}}} = 0.14$
P-value $= P(z > 0.14) = 0.4443$ (Excel: 0.4454); Critical value: $z = 1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that NFC teams win the majority of Super Bowl games.
XLSTAT Output for Exercise 25
z-test for one proportion / Upper-tailed test:
Difference 0.009
z (Observed value) 0.137
z (Critical value) 1.645
p-value (one-tailed) 0.445
alpha 0.050
XLSTAT Output for Exercise 26
z-test for one proportion / Upper-tailed test:
Difference 0.010
z (Observed value) 0.676
z (Critical value) 1.645
p-value (one-tailed) 0.249
alpha 0.050
26. $H_0$: $p = 0.50$; $H_1$: $p > 0.50$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.51 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{1144}}} = 0.68$ (0.65 if using $x = 0.51(1144) = 583$)
P-value $= P(z > 0.65) = 0.2483$ or 0.2578 (Excel: 0.2577); Critical value: $z = 1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim of the headline that the majority of Americans disapprove of Kavanaugh.
27. $H_0$: $p = 0.50$; $H_1$: $p \ne 0.50$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{\frac{252}{460} - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{460}}} = 2.05$
P-value $= 2 \cdot P(z > 2.05) = 0.0404$ (Excel: 0.0402); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the coin toss is fair in the sense that neither team has an advantage by winning it. The coin toss rule does not appear to be fair. This helps explain why the overtime rules were changed.
XLSTAT Output for Exercise 27
z-test for one proportion / Two-tailed test:
Difference 0.048
z (Observed value) 2.052
|z| (Critical value) 1.960
p-value (Two-tailed) 0.040
alpha 0.050
XLSTAT Output for Exercise 28
z-test for one proportion / Lower-tailed test:
Difference 0.005
z (Observed value) 1.132
z (Critical value) -1.645
p-value (one-tailed) 0.871
alpha 0.050
28. $H_0$: $p = 0.5$; $H_1$: $p < 0.5$; Test statistic: $z = \dfrac{\frac{6062}{12{,}000} - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{12{,}000}}} = 1.13$;
P-value $= P(z < 1.13) = 0.8708$ (Excel: 0.8712); Critical value: $z = -1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that less than 0.5 of the deaths occur the week before Thanksgiving. Based on these results, there is no indication that people can temporarily postpone their death to survive Thanksgiving. (With 50.5% of the deaths occurring before Thanksgiving, there is no way that the claim of a proportion less than 0.5 could be supported.)
29. $H_0$: $p = 1/3$; $H_1$: $p > 1/3$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.42 - 1/3}{\sqrt{\dfrac{(1/3)(2/3)}{2250}}} = 8.72$ ($x = 0.42(2250) = 945$)
P-value $= P(z > 8.72) = 0.0001$ (Excel: 0.0000); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that more than 1/3 of adults believe in ghosts.
XLSTAT Output for Exercise 29
z-test for one proportion / Upper-tailed test:
Difference 0.087
z (Observed value) 8.721
z (Critical value) 2.326
p-value (one-tailed) <0.0001
alpha 0.010
XLSTAT Output for Exercise 30
z-test for one proportion / Two-tailed test:
Difference 0.028
z (Observed value) 0.985
|z| (Critical value) 2.576
p-value (Two-tailed) 0.325
alpha 0.010
30. $H_0$: $p = 0.80$; $H_1$: $p \ne 0.80$; Test statistic: $z = \dfrac{0.828 - 0.80}{\sqrt{\dfrac{(0.80)(0.20)}{198}}} = 0.98$ (0.99 using $x = 0.828(198) = 164$);
P-value $= 2 \cdot P(z > 0.98) = 0.3270$ (Excel: 0.3246) (using $x = 164$: 0.3222, Excel: 0.3198); Critical values: $z = \pm 2.575$ (Excel: $z = \pm 2.576$); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that 80% of patients stop smoking when given sustained care. With a success rate around 80%, it appears that sustained care is effective.
31. $H_0$: $p = 0.791$; $H_1$: $p < 0.791$; Test statistic: $z = \dfrac{0.39 - 0.791}{\sqrt{\dfrac{(0.791)(0.209)}{870}}} = -29.09$ ($-29.11$ using $x = 0.39(870) = 339$);
P-value $= P(z < -29.09) = 0.0001$ (Excel: 0.0000); Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the percentage of selected Americans of Mexican ancestry is less than 79.1%, so the jury selection process appears to be biased.
XLSTAT Output for Exercise 31
z-test for one proportion / Lower-tailed test:
Difference -0.401
z (Observed value) -29.090
z (Critical value) -2.326
p-value (one-tailed) <0.0001
alpha 0.010
XLSTAT Output for Exercise 32
z-test for one proportion / Upper-tailed test:
Difference 0.450
z (Observed value) 47.990
z (Critical value) 2.326
p-value (one-tailed) <0.0001
alpha 0.010
32. $H_0$: $p = 0.12$; $H_1$: $p > 0.12$; $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.57 - 0.12}{\sqrt{\dfrac{(0.12)(0.88)}{1201}}} = 47.99$ (48.03 if using $x = 0.57(1201) = 685$)
P-value $= P(z > 48.03) = 0.0001$ (Excel: 0.0000); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the current rate of support for marijuana is greater than the 12% level in 1969.
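33. $H_0$: $p = 0.5$; $H_1$: $p \ne 0.5$;
Normal approximation: $z = \dfrac{\frac{9}{10} - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{10}}} = 2.53$; P-value $= 2 \cdot P(z > 2.53) = 0.0114$
Exact: P-value $= 2\left[{}_{10}C_{9}(0.5)^{9}(0.5)^{1} + {}_{10}C_{10}(0.5)^{10}(0.5)^{0}\right] = 0.0215$
Continuity Correction: P-value $= 2\left[{}_{10}C_{9}(0.5)^{9}(0.5)^{1} + {}_{10}C_{10}(0.5)^{10}(0.5)^{0} - \tfrac{1}{2}\,{}_{10}C_{9}(0.5)^{9}(0.5)^{1}\right] = 0.0117$

$H_0$: $p = 0.4$; $H_1$: $p \ne 0.4$;
Normal approximation: $z = \dfrac{\frac{9}{10} - 0.4}{\sqrt{\dfrac{(0.4)(0.6)}{10}}} = 3.23$; P-value $= 2 \cdot P(z > 3.23) = 0.0012$
Exact: P-value $= 2\left[{}_{10}C_{9}(0.4)^{9}(0.6)^{1} + {}_{10}C_{10}(0.4)^{10}(0.6)^{0}\right] = 0.0034$
Continuity Correction: P-value $= 2\left[{}_{10}C_{9}(0.4)^{9}(0.6)^{1} + {}_{10}C_{10}(0.4)^{10}(0.6)^{0} - \tfrac{1}{2}\,{}_{10}C_{9}(0.4)^{9}(0.6)^{1}\right] = 0.0018$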
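The three P-values compared in Exercise 33 (normal approximation, exact binomial, and the exact value with a continuity correction) can be reproduced with Python's standard library. This is a minimal sketch; the helper names `binom_pmf` and `right_tail_p_values` are illustrative, and the function returns one-tailed (right-tail) values that are doubled for a two-tailed test, as in the solution above.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def right_tail_p_values(x, n, p0):
    """Right-tail P-values for x successes in n trials under H0: p = p0.

    Returns (normal approximation, exact binomial, exact with continuity correction).
    """
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    normal = 1 - NormalDist().cdf(z)
    exact = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))
    # Continuity correction as used above: subtract half the probability of exactly x.
    corrected = exact - 0.5 * binom_pmf(x, n, p0)
    return normal, exact, corrected


if __name__ == "__main__":
    # Two-tailed test with 9 successes in 10 trials and p0 = 0.5: double each tail.
    for value in right_tail_p_values(9, 10, 0.5):
        print(round(2 * value, 4))
```

With n = 10 the three values differ noticeably, which is the point of the exercise: the normal approximation is rough for small samples.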
33. (continued)
$H_0$: $p = 0.5$; $H_1$: $p > 0.5$;
Normal approximation: $z = \dfrac{\frac{482}{926} - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{926}}} = 1.25$; P-value $= P(z > 1.25) = 0.1056$ (Excel: 0.1059)
Exact: P-value $= {}_{926}C_{482}(0.5)^{482}(0.5)^{444} + \cdots + {}_{926}C_{926}(0.5)^{926}(0.5)^{0} = 0.1120$
Continuity Correction: P-value $= {}_{926}C_{482}(0.5)^{482}(0.5)^{444} + \cdots + {}_{926}C_{926}(0.5)^{926}(0.5)^{0} - \tfrac{1}{2}\,{}_{926}C_{482}(0.5)^{482}(0.5)^{444} = 0.1060$
The P-values agree reasonably well with the large sample size of $n = 926$. The normal approximation to the binomial distribution works better as the sample size increases.
34. a. $H_0$: $p = 0.10$; $H_1$: $p \ne 0.10$; Test statistic: $z = \dfrac{0.119 - 0.1}{\sqrt{\dfrac{(0.1)(0.9)}{1000}}} = 2.00$; Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the proportion of zeros is 0.1.
b. $H_0$: $p = 0.10$; $H_1$: $p \ne 0.10$; Test statistic: $z = \dfrac{0.119 - 0.1}{\sqrt{\dfrac{(0.1)(0.9)}{1000}}} = 2.00$; P-value $= 2 \cdot P(z > 2.00) = 0.0456$ (Excel: 0.0452); Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the proportion of zeros is 0.1.
c. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.119 \pm 1.96\sqrt{\dfrac{(0.119)(0.881)}{1000}} \Rightarrow 0.0989 < p < 0.139$; Because 0.1 is contained within the confidence interval, fail to reject $H_0$: $p = 0.10$. There is not sufficient evidence to warrant rejection of the claim that the proportion of zeros is 0.1.
d. The traditional and P-value methods both lead to rejection of the claim, but the confidence interval method does not lead to rejection of the claim.
35. a. From $p = 0.40$, $\hat{p} = 0.4 - 1.645\sqrt{\dfrac{(0.4)(0.6)}{50}} = 0.286$
From $p = 0.25$, $z = \dfrac{0.286 - 0.25}{\sqrt{\dfrac{(0.25)(0.75)}{50}}} = 0.588$; Power $= P(z < 0.588) = 0.7224$ (Excel: 0.7219)
b. $1 - 0.7224 = 0.2776$ (Excel: 0.2781)
c. The power of 0.7224 shows that there is a reasonably good chance of making the correct decision of rejecting the false null hypothesis. It would be better if the power were even higher, such as greater than 0.8 or 0.9.
36. a. With a claim of at least 50%, the original claim is $p \ge 0.50$, and the opposite of that is $p < 0.50$, so we get $H_0$: $p = 0.50$; $H_1$: $p < 0.50$. The test then changes from being right-tailed to becoming a left-tailed test. The P-value changes from 0.1059 to 0.8941. We fail to reject $H_0$ and we conclude that there is not sufficient evidence to warrant rejection of the claim that at least 50% of Internet users utilize two-factor authentication.
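Exercise 34 shows the z test and the confidence interval method reaching different conclusions from the same data. The reason is that the test statistic uses the claimed proportion in its standard error, while the confidence interval uses the sample proportion. The sketch below illustrates this with the Exercise 34 values; the function names `z_statistic` and `wald_interval` are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(p_hat, p0, n):
    """Test statistic for H0: p = p0; the standard error is based on p0."""
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


def wald_interval(p_hat, n, confidence=0.95):
    """Wald confidence interval; the standard error is based on the sample proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


if __name__ == "__main__":
    p_hat, p0, n = 0.119, 0.10, 1000
    z = z_statistic(p_hat, p0, n)
    low, high = wald_interval(p_hat, n)
    print("reject by critical value method:", abs(z) > 1.96)
    print("claimed value inside interval:", low < p0 < high)
```

Both printed conditions can hold at once when the test statistic is only barely beyond the critical value, which is the situation in Exercise 34.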
36. (continued)
b. With a claim of at most half of Americans prefer to watch the news rather than read or listen to it, the original claim becomes $p \le 0.50$, and the opposite of that is $p > 0.50$, so we get $H_0$: $p = 0.50$; $H_1$: $p > 0.50$. The P-value changes from 0.00002 (Table: 0.0001) to 0.99998 (Table: 0.9999). The conclusion changes to this: There is not sufficient evidence to warrant rejection of the null hypothesis, and there is not sufficient evidence to warrant rejection of the claim that at most half of Americans prefer to watch the news rather than read or listen to it. Very different conclusion!
Section 8-3: Testing a Claim About a Mean
1. The requirements are (1) the sample must be a simple random sample, and (2) either or both of these conditions must be satisfied: the population is normally distributed or $n > 30$. There is not enough information given to determine whether the sample is a simple random sample. Because the sample size is not greater than 30, we must check for normality, but the value of $36 million appears to be an outlier, and a normal quantile plot or histogram suggests that the sample does not appear to be from a normally distributed population. The requirements are not satisfied.
2. df denotes the number of degrees of freedom. For the sample of 15 salaries, df $= 15 - 1 = 14$.
3. A t test is a hypothesis test that uses the Student t distribution, such as the method of testing a claim about a population mean as presented in this section. The letter t is used in reference to the Student t distribution, which is used in a t test. The z test methods require a known value of $\sigma$, but it would be very rare to conduct a hypothesis test for a claim about an unknown value of $\mu$ while we somehow know the value of $\sigma$.
4. Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{6.133333 - 5}{8.862978/\sqrt{15}} = 0.495$; Critical value: $t = 1.761$
fx | =T.INV(1-0.05,15-1)
5. $H_0$: $\mu = 5.670$ g; $H_1$: $\mu \ne 5.670$ g; P-value = 0.0033 (Table: P-value < 0.01); There is sufficient evidence to warrant rejection of the claim that the mean weight of quarters is equal to 5.670 g.
fx | =T.DIST.2T(abs(-3.135),40-1)
6. $H_0$: $\mu = 2.500$ g; $H_1$: $\mu \ne 2.500$ g; P-value = 0.7426 (Table: P-value > 0.20); There is not sufficient evidence to warrant rejection of the claim that the mean weight of pennies is equal to 2.500 g.
fx | =T.DIST.2T(abs(-0.331),37-1)
7. $H_0$: $\mu = 2.84$ ng/mL; $H_1$: $\mu > 2.84$ ng/mL; P-value = 0.0000 (Table: P-value < 0.005); There is sufficient evidence to support the claim that the mean cotinine level of smokers is greater than 2.84 ng/mL.
fx | =T.DIST.RT(56.319,902-1)
8. $H_0$: $\mu = 98.6$°F; $H_1$: $\mu < 98.6$°F; P-value = 0.0001 (Table: P-value < 0.005); There is sufficient evidence to support the claim that the mean body temperature of males is less than 98.6°F.
fx | =T.DIST(-4.411,33-1,1)
9. $H_0$: $\mu = 8.953$ min; $H_1$: $\mu \ne 8.953$ min; Test statistic: $|t| = 3.423$; P-value = 0.0015; Critical values (with $\alpha = 0.05$): $t = \pm 2.026$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the sample is from a population with a mean equal to 8.953 min.
10. $H_0$: $\mu = 15$ min; $H_1$: $\mu < 15$ min; Test statistic: $t = -1.61$; P-value = 0.058; Critical value (with $\alpha = 0.05$): $t = -1.690$ (Table: –1.676); Fail to reject $H_0$. There is not sufficient evidence to support the claim that the mean time of all taxi cab rides is less than 15 min.
11. $H_0$: $\mu = 30$ min; $H_1$: $\mu > 30$ min; Test statistic: $t = 0.940$; P-value = 0.177; Critical value (with $\alpha = 0.05$): $t = 1.685$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that the mean wait time is more than 30 min.
12. $H_0$: $\mu = 300$ sec; $H_1$: $\mu < 300$ sec; Test statistic: $t = -3.6569$; P-value = 0.0005; Critical value (with $\alpha = 0.05$): $t = -1.694$; Reject $H_0$. There is sufficient evidence to support the claim that the mean wait service time is less than 300 seconds.
13. $n = 300 > 30$; $H_0$: $\mu = 120$ mm Hg; $H_1$: $\mu > 120$ mm Hg;
Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{122.96 - 120}{15.85169/\sqrt{300}} = 3.234$; Critical value: $t = 2.339$
P-value = 0.0007 (Table: P-value < 0.005); Reject $H_0$. There is sufficient evidence to support the claim that the sample is from a population with a mean greater than 120 mm Hg.
XLSTAT Output for Exercise 13
One-sample t-test / Upper-tailed test:
Difference 2.960
t (Observed value) 3.234
t (Critical value) 2.339
DF 299
p-value (one-tailed) 0.001
alpha 0.010
XLSTAT Output for Exercise 14
One-sample t-test / Upper-tailed test:
Difference 10.753
t (Observed value) 16.034
t (Critical value) 2.339
DF 299
p-value (one-tailed) <0.0001
alpha 0.010
14. $n = 300 > 30$; $H_0$: $\mu = 60$ mm Hg; $H_1$: $\mu > 60$ mm Hg;
Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{70.75333 - 60}{11.61618/\sqrt{300}} = 16.034$; Critical value: $t = 2.339$
P-value = 0.0000 (Table: P-value < 0.005); Reject $H_0$. There is sufficient evidence to support the claim that the sample is from a population with a mean greater than 60 mm Hg.
15. Since $n = 21 < 30$, assume the data follow a normal distribution. $H_0$: $\mu = 100$; $H_1$: $\mu \ne 100$;
Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{86.90476 - 100}{8.988352/\sqrt{21}} = -6.676$; Critical values: $t = \pm 2.086$
P-value = 0.0000 (Table: P-value < 0.01); Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the sample of children is from a population with mean IQ equal to 100. These results do not "prove" that exposure to lead has an adverse effect on IQ scores of children, but they strongly suggest that such an adverse effect is very possible.
XLSTAT Output for Exercise 15
One-sample t-test / Two-tailed test:
Difference -13.095
t (Observed value) -6.676
|t| (Critical value) 2.086
DF 20
p-value (Two-tailed) <0.0001
alpha 0.050
XLSTAT Output for Exercise 16
One-sample t-test / Lower-tailed test:
Difference -2.965
t (Observed value) -2.243
t (Critical value) -1.685
DF 39
p-value (one-tailed) 0.015
alpha 0.050
16. $n = 40 > 30$; $H_0$: $\mu = \$15.00$; $H_1$: $\mu < \$15.00$;
Test statistic: $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{12.035 - 15.00}{8.361/\sqrt{40}} = -2.243$; Critical value: $t = -1.685$
P-value = 0.0153 (Table: 0.01 < P-value < 0.025); Reject $H_0$. There is sufficient evidence to support the claim that the mean cost of a taxicab ride in New York City is less than $15.00.
17. $n = 40 > 30$; $H_0$: $\mu = 0$ lb; $H_1$: $\mu > 0$ lb;
Test statistic: $t = \dfrac{3.0 - 0.0}{4.9/\sqrt{40}} = 3.872$; Critical value: $t = 2.426$;
P-value = 0.0002 (Table: P-value < 0.005); Reject $H_0$. There is sufficient evidence to support the claim that the mean weight loss is greater than 0. Although the diet appears to have statistical significance, it does not appear to have practical significance, because the mean weight loss of only 3.0 lb does not seem to be worth the effort and cost.
Excel commands for Exercise 17
P-value: fx | =T.DIST.RT(3.872,40-1)
Critical Value: fx | =T.INV(1-0.01,40-1)
Excel commands for Exercise 18
P-value: fx | =T.DIST.RT(1.068,10-1)
Critical Value: fx | =T.INV(1-0.01,10-1)
18. The data cannot be verified to follow a normal distribution and $n < 30$, so proceed with caution. $H_0$: $\mu = 48.0$ words; $H_1$: $\mu > 48.0$ words;
Test statistic: $t = \dfrac{53.3 - 48.0}{15.7/\sqrt{10}} = 1.068$; Critical value: $t = 2.821$;
P-value = 0.1568 (Table: P-value > 0.10); Fail to reject $H_0$. There is not sufficient evidence to support the claim that the mean number of defined words per page is greater than 48.0 words. There is not enough evidence to support the claim that there are more than 70,000 defined words in the dictionary.
19. The data appear right-skewed, but $n = 36 > 30$. $H_0$: $\mu = 12.00$ oz; $H_1$: $\mu \ne 12.00$ oz;
Test statistic: $t = \dfrac{12.19 - 12.00}{0.11/\sqrt{36}} = 10.364$; Critical values: $t = \pm 2.030$;
P-value = 0.0000 (Table: P-value < 0.10); Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the mean volume is equal to 12.00 oz. Because the mean appears to be greater than 12.00 oz, consumers are not being cheated because they are getting slightly more than 12.00 oz.
Excel commands for Exercise 19
P-value: fx | =T.DIST.2T(10.364,36-1)
Critical Value: fx | =T.INV.2T(0.05,36-1)
Excel commands for Exercise 20
P-value: fx | =T.DIST(-0.369,16-1,1)
Critical Value: fx | =T.INV(0.05,16-1)
20. The data cannot be verified to follow a normal distribution and $n < 30$, so proceed with caution. $H_0$: $\mu = 102.8$ min; $H_1$: $\mu < 102.8$ min;
Test statistic: $t = \dfrac{98.9 - 102.8}{42.3/\sqrt{16}} = -0.369$; Critical value ($\alpha = 0.05$): $t = -1.753$;
P-value = 0.3587 (Table: P-value > 0.10); Fail to reject $H_0$. There is not sufficient evidence to support the claim that after treatment with zopiclone, subjects have a mean wake time less than 102.8 min. These results suggest that the zopiclone treatment is not effective.
21. The sample data meet the loose requirement of having a normal distribution. \(H_0: \mu = 14\ \mu\text{g/g}\); \(H_1: \mu < 14\ \mu\text{g/g}\);
Test statistic: \(t = \dfrac{11.05 - 14.0}{6.46/\sqrt{10}} = -1.444\); Critical value: \(t = -1.833\);
P-value = 0.0913 (Table: P-value > 0.05); Fail to reject \(H_0\). There is not sufficient evidence to support the claim that the mean lead concentration for all such medicines is less than 14 μg/g.

XLSTAT Output for Exercise 21
One-sample t-test / Lower-tailed test:
Difference              -2.950
t (Observed value)      -1.444
t (Critical value)      -1.833
DF                      9
p-value (one-tailed)    0.091
alpha                   0.050

XLSTAT Output for Exercise 22
One-sample t-test / Two-tailed test:
Difference              2.667
t (Observed value)      0.530
|t| (Critical value)    2.145
DF                      14
p-value (Two-tailed)    0.604
alpha                   0.050

22. The sample data meet the loose requirement of having a normal distribution. \(H_0: \mu = 60\) sec; \(H_1: \mu \ne 60\) sec;
Test statistic: \(t = \dfrac{62.67 - 60.00}{19.48/\sqrt{15}} = 0.530\); Critical values: \(t = \pm 2.145\);
P-value = 0.6043 (Table: P-value > 0.20); Fail to reject \(H_0\). There is not sufficient evidence to warrant rejection of the claim that the times are from a population with a mean equal to 60 seconds. Although some of the individual estimates are off by a large amount, the group of 15 students yielded the mean of 62.7 sec, which is not significantly different from 60 sec, so as a group they appear to be reasonably good at estimating one minute.
23. \(n = 8 < 30\), but the sample data meet the loose requirement of having a normal distribution. \(H_0: \mu = 1.6\) W/kg; \(H_1: \mu < 1.6\) W/kg;
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{1.04875 - 1.6}{0.36635/\sqrt{8}} = -4.256\); Critical value: \(t = -2.998\);
P-value = 0.0019 (Table: P-value < 0.005); Reject \(H_0\). There is sufficient evidence to support the claim that the sample is from a population of cell phones with a mean amount of radiation that is less than the FCC standard of 1.6 W/kg.

XLSTAT Output for Exercise 23
One-sample t-test / Lower-tailed test:
Difference              -0.551
t (Observed value)      -4.256
t (Critical value)      -2.998
DF                      7
p-value (one-tailed)    0.002
alpha                   0.010

XLSTAT Output for Exercise 24
One-sample t-test / Two-tailed test:
Difference              -0.029
t (Observed value)      -3.151
|t| (Critical value)    2.861
DF                      19
p-value (Two-tailed)    0.005
alpha                   0.010

24. \(n = 20 < 30\) and the sample data do not appear to meet the loose requirement of having a normal distribution, so proceed with caution. \(H_0: \mu = 8.1\) g; \(H_1: \mu \ne 8.1\) g;
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{8.07104 - 8.1}{0.041105/\sqrt{20}} = -3.151\); Critical values: \(t = \pm 2.861\);
P-value = 0.0053 (Table: P-value < 0.01); Reject \(H_0\). There is sufficient evidence to warrant rejection of the claim that dollar coins have a mean weight of 8.1 grams. The sample mean is 8.07104 g, which is just below the claimed mean of 8.1 g. The specification of 8.1 g is for newly minted dollar coins, but those coins experience some wear and loss of material after being in circulation, so it is reasonable that the mean weight of dollar coins in circulation is less than 8.1 g.
25. \(n = 1000 > 30\); \(H_0: \mu = 35\) minutes; \(H_1: \mu < 35\) minutes;
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{31.044 - 35}{21.49517/\sqrt{1000}} = -5.820\); Critical value: \(t = -2.330\);
P-value = 0.000 (Table: P-value < 0.005); Reject \(H_0\). There is sufficient evidence to support the claim that the mean commute time is less than 35 minutes. The sample mean is 31.0 minutes, and the claimed mean is 35 minutes. That difference is statistically significant, but does not appear to have much practical significance. (TI data: Test statistic is \(t = -2.62\) and P-value = 0.005.)

XLSTAT Output for Exercise 25
One-sample t-test / Lower-tailed test:
Difference              -3.956
t (Observed value)      -5.820
t (Critical value)      -2.330
DF                      999
p-value (one-tailed)    <0.0001
alpha                   0.010

XLSTAT Output for Exercise 26
One-sample t-test / Lower-tailed test:
Difference              -2.663
t (Observed value)      -8.156
t (Critical value)      -1.647
DF                      702
p-value (one-tailed)    <0.0001
alpha                   0.050

26. \(n = 703 > 30\); \(H_0: \mu = \$15.00\); \(H_1: \mu < \$15.00\);
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{12.33734 - 15.00}{8.65649/\sqrt{703}} = -8.156\); Critical value: \(t = -1.647\) (Table: \(-1.645\));
P-value = 0.0000 (Table: P-value < 0.005); Reject \(H_0\). There is sufficient evidence to support the claim that the mean cost of a taxicab ride in New York City is less than $15.00.
27. \(n = 4082 > 30\); \(H_0: \mu = 1818\) mm; \(H_1: \mu < 1818\) mm;
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{1814.154 - 1818}{84.63228/\sqrt{4082}} = -2.903\); Critical value: \(t = -2.327\) (Table: \(-2.326\));
P-value = 0.0019 (Table: P-value < 0.005); Reject \(H_0\). There is sufficient evidence to support the claim that the sample is from a population with a mean arm span less than 1818 mm. The sample mean is 1814.154 mm, and the claimed mean is 1818 mm. The difference of 3.846 mm (or 0.15 in.) is statistically significant, but does not appear to have practical significance. (TI data: Test statistic: \(t = 0.126\), P-value = 0.550, Critical value: \(t = -2.334\). Fail to reject \(H_0\). There is not sufficient evidence to support the claim that the sample is from a population with a mean arm span less than 1818 mm.)

XLSTAT Output for Exercise 27
One-sample t-test / Lower-tailed test:
Difference              -3.846
t (Observed value)      -2.903
t (Critical value)      -2.327
DF                      4081
p-value (one-tailed)    0.002
alpha                   0.010

XLSTAT Output for Exercise 28
One-sample t-test / Upper-tailed test:
Difference              0.415
t (Observed value)      0.386
t (Critical value)      1.645
DF                      4081
p-value (one-tailed)    0.350
alpha                   0.050

28. \(n = 4082 > 30\); \(H_0: \mu = 1755.8\) mm; \(H_1: \mu > 1755.8\) mm;
Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{1756.215 - 1755.8}{68.55079/\sqrt{4082}} = 0.386\); Critical value: \(t = 1.645\);
P-value = 0.3496 (Table: P-value > 0.10); Fail to reject \(H_0\). There is not sufficient evidence to support the claim that the sample from 2012 is from a population with a mean height greater than 1755.8 mm. It does not appear that male army personnel were taller in 2012 than in 1988. (TI data: Test statistic: \(t = 0.564\), P-value = 0.2868, critical value: \(t = 1.648\).)
29. \(H_0: \mu = 100\); \(H_1: \mu \ne 100\); Test statistic: \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{100.05 - 100}{15/\sqrt{1{,}000{,}000}} = 3.33\); P-value = 0.0009; The difference between \(\bar{x} = 100.05\) and the claimed mean of 100 is statistically significant, but that difference is so small that it does not have practical significance.
fx | =T.DIST.2T(3.33,1000000-1)
30. The power of 0.9127 shows that the chance of recognizing that \(\mu < 1.6\) W/kg is quite high when in reality \(\mu = 1.0\) W/kg. This is a good value of power. \(\beta = 1 - 0.9127 = 0.0873\), so there is about a 9% chance of failing to recognize that \(\mu < 1.6\) W/kg when in reality \(\mu = 1.0\) W/kg.
31. \(A = \dfrac{1.645(8 \cdot 4081 + 3)}{8 \cdot 4081 + 1} = 1.645100769\) and \(t = \sqrt{4081\left(e^{1.645100769^2/4081} - 1\right)} = 1.645\); the critical t score found using the given approximation is 1.645, which is the same value of 1.645 found from technology. The approximation appears to work quite well, and it provides us with a method for finding critical t scores when the number of degrees of freedom cannot be found from Table A-3 and suitable technology is unavailable.
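The approximation used in Exercise 31 is easy to evaluate numerically. The sketch below assumes the form \(A = z(8\,df + 3)/(8\,df + 1)\) and \(t = \sqrt{df\,(e^{A^2/df} - 1)}\), which is what the arithmetic in the answer implies; the function name `approx_critical_t` is hypothetical.

```python
import math


def approx_critical_t(z: float, df: int) -> float:
    """Approximate a critical t score from the corresponding critical z score.

    Assumed form: A = z(8 df + 3) / (8 df + 1), t = sqrt(df (exp(A^2 / df) - 1)).
    """
    a = z * (8 * df + 3) / (8 * df + 1)
    # math.expm1(x) computes exp(x) - 1 accurately when x is close to 0.
    return math.sqrt(df * math.expm1(a * a / df))


# Exercise 31: z = 1.645 with df = 4081.
print(round(approx_critical_t(1.645, 4081), 3))  # 1.645
```

With a small number of degrees of freedom the same function still lands close to the tabled value. For example, `approx_critical_t(1.645, 9)` is near the critical value 1.833 quoted for Exercise 21.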
32. a. \(z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \dfrac{6.83333333 - 7}{1.99240984/\sqrt{12}} = -0.29\); P-value = 0.3860 (Table: 0.3859); The P-value is smaller with the known value of \(\sigma\). The null and alternative hypotheses are the same and the conclusions are the same. The results of this particular hypothesis test are not affected very much by the knowledge of \(\sigma\).
fx | =NORM.S.DIST(-0.29,1)
b. Because the P-value is smaller with the knowledge of \(\sigma\), it is possible that the conclusion of a hypothesis test could change. It is possible to have a conclusion of fail to reject \(H_0\) without knowledge of \(\sigma\), while the conclusion could be reject \(H_0\) if \(\sigma\) is known.

Section 8-4: Testing a Claim About a Standard Deviation or Variance
1. The sample must be a simple random sample, and the sample must be from a normally distributed population. The normality requirement for a hypothesis test of a claim about a standard deviation is different in these ways: (1) It is much more strict, meaning that the distribution of the population must be much closer to a normal distribution; (2) the normality requirement applies for any sample size, not just for small samples with \(n \le 30\).
2. \(H_0: \sigma = 0.04000\) g; \(H_1: \sigma < 0.04000\) g; \(\chi^2 = \dfrac{(16 - 1)0.02176^2}{0.04000^2} = 4.4390\) (or 4.4389 if using the original data values). The sampling distribution of the test statistic is the \(\chi^2\) (chi-square) distribution.
3. a. Reject \(H_0\).
b. There is sufficient evidence to support the claim that the standard deviation is less than 0.04000 g.
c. It appears that with the new minting process, the variation among weights has decreased, so the weights are more consistent. The new minting process appears to be better than the original minting process.
4. All of the values in the confidence interval are less than the standard deviation of 0.04000 g from the original minting process, so it appears that the new minting process results in a smaller standard deviation. With the new minting process, the dollar coins have weights with less variation.
5. \(H_0: \sigma = 10\) bpm; \(H_1: \sigma \ne 10\) bpm; Test statistic: \(\chi^2 = 195.172\); P-value = 0.0208; Reject \(H_0\). There is sufficient evidence to warrant rejection of the claim that pulse rates of men have a standard deviation equal to 10 beats per minute. Using the normal range of 60 to 100 beats per minute is not very good for estimating \(\sigma\) in this case.
6. \(H_0: \sigma = 10\) bpm; \(H_1: \sigma \ne 10\) bpm; Test statistic: \(\chi^2 = 229.718\) or \(\chi^2 = \dfrac{(147 - 1)12.5^2}{10^2} = 228.125\); P-value = 0.0001; Reject \(H_0\). There is sufficient evidence to warrant rejection of the claim that pulse rates of women have a standard deviation equal to 10 beats per minute. Using the range rule of thumb with the normal range of 60 to 100 beats per minute is not very good for estimating \(\sigma\) in this case.
7. \(H_0: \sigma = 2.08^\circ\)F; \(H_1: \sigma < 2.08^\circ\)F; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(106 - 1)0.62^2}{2.08^2} = 9.329\);
P-value = 0.0000 (Table: P-value < 0.005); Critical value: \(\chi^2 = 74.252\) (Table: \(\chi^2 = 70.065\)); Reject \(H_0\). There is sufficient evidence to support the claim that body temperatures have a standard deviation less than 2.08°F. It is very highly unlikely that the conclusion in the hypothesis test in Example 5 from Section 8-3 would change because of a standard deviation from a different sample.
P-value: fx | =CHISQ.DIST((106-1)*0.62^2/2.08^2,106-1,1)
Critical Value: fx | =CHISQ.INV(0.01,106-1)
8. \(H_0: \sigma = 660.2\) hg; \(H_1: \sigma \ne 660.2\) hg; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(30 - 1)829.5^2}{660.2^2} = 45.780\);
P-value = 0.0493 (Table: P-value > 0.02); Critical values: \(\chi^2 = 13.121\) and \(\chi^2 = 52.336\); Fail to reject \(H_0\). There is not sufficient evidence to warrant rejection of the claim that birth weights of girls have the same standard deviation as the birth weights of boys.
P-value: fx | =2*CHISQ.DIST.RT((30-1)*829.5^2/660.2^2,30-1)
Critical Values: fx | =CHISQ.INV(0.01/2,30-1) and fx | =CHISQ.INV(1-0.01/2,30-1)
9. \(H_0: \sigma = 0.04000\) g; \(H_1: \sigma < 0.04000\) g; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(20 - 1)0.0337^2}{0.0400^2} = 13.486\);
P-value = 0.1872; Critical value: \(\chi^2 = 7.633\); Fail to reject \(H_0\). There is not sufficient evidence to support a claim that M&Ms are manufactured so that they have weights with a standard deviation less than 0.0400 g. It does not appear that the goal is being satisfied.
P-value: fx | =CHISQ.DIST((20-1)*0.0337^2/0.0400^2,20-1,1)
Critical Value: fx | =CHISQ.INV(0.01,20-1)
10. \(H_0: \sigma = 0.01000\) g; \(H_1: \sigma > 0.01000\) g; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(37 - 1)0.01648^2}{0.01000^2} = 97.772\);
P-value = 0.0000 (Table: P-value < 0.005); Critical value: \(\chi^2 = 50.999\) (Table: \(\chi^2 = 43.77\)); Reject \(H_0\). There is sufficient evidence to support a claim that the sample is from a population of pennies with weights having a standard deviation greater than 0.01000 g.
P-value: fx | =CHISQ.DIST.RT((37-1)*0.01648^2/0.01000^2,37-1)
Critical Value: fx | =CHISQ.INV(1-0.05,37-1)
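Every test statistic in this section has the form \(\chi^2 = (n - 1)s^2/\sigma^2\). The following minimal Python sketch recomputes two of them from the summary values quoted in Exercises 7 and 9; `chi_square_statistic` is a hypothetical helper name, and the sketch does not compute P-values or critical values.

```python
def chi_square_statistic(n: int, s: float, sigma0: float) -> float:
    """Return the test statistic (n - 1) s^2 / sigma0^2 for a claim about sigma."""
    return (n - 1) * s ** 2 / sigma0 ** 2


# Exercise 7: n = 106, s = 0.62, claimed sigma = 2.08.
print(round(chi_square_statistic(106, 0.62, 2.08), 3))  # 9.329

# Exercise 9: n = 20, s = 0.0337, claimed sigma = 0.0400.
print(round(chi_square_statistic(20, 0.0337, 0.0400), 3))  # 13.486
```

Small differences in the last digit can appear when the answer was computed from the original data values instead of the rounded summary statistics.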
11. \(H_0: \sigma = 9.9\) min; \(H_1: \sigma > 9.9\) min; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(12 - 1)13.2^2}{9.9^2} = 19.556\);
P-value = 0.0518 (Table: 0.05 < P-value < 0.10); Critical value: \(\chi^2 = 19.675\); Fail to reject \(H_0\). There is not sufficient evidence to support a claim that Friday afternoon ride times have greater variation than the Friday morning ride times.
P-value: fx | =CHISQ.DIST.RT((12-1)*13.2^2/9.9^2,12-1)
Critical Value: fx | =CHISQ.INV(1-0.05,12-1)
12. \(H_0: \sigma = 7460\) words; \(H_1: \sigma > 7460\) words; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(56 - 1)7871^2}{7460^2} = 61.227\);
P-value = 0.2625 (Table: P-value > 0.10); Critical value: \(\chi^2 = 82.292\) (Table: \(\chi^2 = 76.145\)); Fail to reject \(H_0\). There is not sufficient evidence to support the claim that males have a standard deviation that is greater than the standard deviation of 7460 words for females.
P-value: fx | =CHISQ.DIST.RT((56-1)*7871^2/7460^2,56-1)
Critical Value: fx | =CHISQ.INV(1-0.01,56-1)
13. \(H_0: \sigma = 32.2\) ft; \(H_1: \sigma > 32.2\) ft; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(12 - 1)52.441^2}{32.2^2} = 29.176\);
P-value = 0.0021; Critical value: \(\chi^2 = 19.675\); Reject \(H_0\). There is sufficient evidence to support the claim that the new production method has errors with a standard deviation greater than 32.2 ft. The variation appears to be greater than in the past, so the new method appears to be worse, because there will be more altimeters that have larger errors. The company should take immediate action to reduce the variation.

XLSTAT Output for Exercise 13
One-sample variance test / Upper-tailed test:
Variance                       2750.061
Chi-square (Observed value)    29.176
Chi-square (Critical value)    19.675
DF                             11
p-value (one-tailed)           0.002
alpha                          0.050

XLSTAT Output for Exercise 14
One-sample variance test / Lower-tailed test:
Variance                       0.227
Chi-square (Observed value)    0.631
Chi-square (Critical value)    3.325
DF                             9
p-value (one-tailed)           <0.0001
alpha                          0.050

14. \(H_0: \sigma = 1.8\) min; \(H_1: \sigma < 1.8\) min; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(10 - 1)0.4767^2}{1.8^2} = 0.631\);
P-value = 0.0001 (Table: P-value < 0.005); Critical value: \(\chi^2 = 3.325\); Reject \(H_0\). There is sufficient evidence to support the claim that with a single waiting line, the waiting times have a standard deviation less than 1.8 min. Because the variation among waiting times appears to be reduced with the single waiting line, customers are happier because their waiting times are closer to being the same. Customers are not annoyed by being stuck in an individual line that takes much more time than other individual lines.
15. \(H_0: \sigma = 0.2000\) g; \(H_1: \sigma \ne 0.2000\) g; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(6 - 1)0.20543^2}{0.2000^2} = 5.275\);
P-value = 0.7664 (Table: P-value > 0.20); Critical values: \(\chi^2 = 0.831\) and \(\chi^2 = 12.833\); Fail to reject \(H_0\). There is not sufficient evidence to warrant rejection of the claim that the sample is from a population having weights with a standard deviation equal to 0.2000 g.
XLSTAT Output for Exercise 15
One-sample variance test / Two-tailed test:
Variance                         0.042
Chi-square (Observed value)      5.275
|Chi-square| (Critical value)    12.833
DF                               5
p-value (Two-tailed)             0.766
alpha                            0.050

XLSTAT Output for Exercise 16
One-sample variance test / Two-tailed test:
Variance                         0.000
Chi-square (Observed value)      3.814
|Chi-square| (Critical value)    21.955
DF                               8
p-value (Two-tailed)             0.253
alpha                            0.010

16. \(H_0: \sigma = 0.0230\) g; \(H_1: \sigma \ne 0.0230\) g; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(9 - 1)0.01588^2}{0.0230^2} = 3.814\);
P-value = 0.2529 (Table: P-value > 0.20); Critical values: \(\chi^2 = 1.344\) and \(\chi^2 = 21.955\); Fail to reject \(H_0\). There is not sufficient evidence to warrant rejection of the claim that wheat pennies are manufactured so that their weights have a standard deviation equal to 0.0230 g. It appears that the Mint specification is being met.
17. \(H_0: \sigma = 20\) min; \(H_1: \sigma \ne 20\) min; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(1000 - 1)21.49517^2}{20^2} = 1153.950\);
P-value = 0.0009; Critical values: \(\chi^2 = 887.620\) and \(\chi^2 = 1117.889\); Reject \(H_0\). There is sufficient evidence to warrant rejection of the claim that the sample is from a population with a standard deviation equal to 20 minutes. A histogram or normal quantile plot shows that the distribution is far from being normal, so a requirement of the hypothesis test is violated and the results of this hypothesis test are not necessarily valid. (TI data: Test statistic: \(\chi^2 = 679.168\); P-value = 0.0000; Critical values: \(\chi^2 = 421.384\) and \(\chi^2 = 584.125\).)

XLSTAT Output for Exercise 17
One-sample variance test / Two-tailed test:
Variance                         462.042
Chi-square (Observed value)      1153.950
|Chi-square| (Critical value)    1117.890
DF                               999
p-value (Two-tailed)             0.001
alpha                            0.010

XLSTAT Output for Exercise 18
One-sample variance test / Two-tailed test:
Variance                         4699.210
Chi-square (Observed value)      4297.724
|Chi-square| (Critical value)    4259.958
DF                               4081
p-value (Two-tailed)             0.018
alpha                            0.050

18. \(H_0: \sigma = 66.8\) mm; \(H_1: \sigma \ne 66.8\) mm; Test statistic: \(\chi^2 = \dfrac{(n - 1)s^2}{\sigma^2} = \dfrac{(4082 - 1)68.55079^2}{66.8^2} = 4297.725\);
P-value = 0.0180; Critical values: \(\chi^2 = 3905.835\) and \(\chi^2 = 4259.952\); Reject \(H_0\). There is sufficient evidence to support the claim that the sample collected in 2012 is from a population with a standard deviation different from 66.8 mm. A histogram or normal quantile plot shows that the distribution is very close to being normal, so a requirement of the hypothesis test is satisfied. (TI data: Test statistic: \(\chi^2 = 513.244\); P-value = 0.0000; Critical values: \(\chi^2 = 270.492\) and \(\chi^2 = 369.295\).)
19. Critical \(\chi^2 = \dfrac{1}{2}\left(2.33 + \sqrt{2(55) - 1}\right)^2 = 81.54\) (or 81.494 if using \(z = 2.326348\) found from technology), which is reasonably close to the value of 82.292 obtained from technology.
20. Critical \(\chi^2 = 55\left(1 - \dfrac{2}{9(55)} + 2.33\sqrt{\dfrac{2}{9(55)}}\right)^3 = 82.360\) (or 82.309 if using \(z = 2.326348\) found from technology), which is very close to the value of 82.292 obtained from technology.
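The two approximations in Exercises 19 and 20 can be compared side by side. The sketch below assumes the forms implied by the arithmetic above, with \(k = 55\) degrees of freedom and the critical z score as input; the second form is commonly known as the Wilson-Hilferty approximation. Both function names are hypothetical.

```python
import math


def chi2_critical_sqrt_form(z: float, k: int) -> float:
    """Approximation of the Exercise 19 form: 0.5 * (z + sqrt(2k - 1))^2."""
    return 0.5 * (z + math.sqrt(2 * k - 1)) ** 2


def chi2_critical_cube_form(z: float, k: int) -> float:
    """Approximation of the Exercise 20 form: k * (1 - 2/(9k) + z*sqrt(2/(9k)))^3."""
    c = 2 / (9 * k)
    return k * (1 - c + z * math.sqrt(c)) ** 3


# Right-tailed critical value with alpha = 0.01 (z = 2.33) and k = 55.
print(round(chi2_critical_sqrt_form(2.33, 55), 2))   # 81.54
print(round(chi2_critical_cube_form(2.33, 55), 3))   # 82.36
```

With these inputs the cube form lands closer to the technology value of 82.292 than the square-root form does, which matches the wording of the two answers.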
Section 8-5: Resampling: Using Excel for Hypothesis Testing
1. a. A sample of the same size (n = 5) is randomly selected from the given sample.
b. The sampling is done with replacement. If we were to sample without replacement, we would always obtain the same sample consisting of the same five waiting times, and those results would have no real value.
c. A resampling method does not require that the sample data are from a population having a particular distribution (such as normal) or that the sample size meets some minimum requirement.
2. With bootstrapping, the original data values are resampled. With randomization, the original data values are somehow modified to conform to the assumed value of the population parameter.
3. a. \(\hat{p} = 426/860 = 0.495\) and \(p = 0.512\).
b. The sample proportions that are at least as extreme as \(\hat{p} = 0.495\) are those that are 0.495 or lower and those that are 0.529 or higher.
c. The result of 310 samples (among 1000) with a proportion at least as extreme as 426/860 shows that the event of getting such a sample proportion is quite common and can easily occur. It appears that there is not sufficient evidence to warrant rejection of the claim that the proportion of male births is equal to 0.512.
4. a. The sample means at least as extreme as 98.13°F are those that are 98.13°F or lower and those that are 99.07°F or higher.
b. Because none of the resampled data sets have means at least as extreme as the mean of 98.13°F that was obtained, it appears that 98.13°F is significantly different from the claimed mean of 98.6°F. It appears that a mean of 98.13°F is very unlikely to occur with a population having a mean of 98.6°F, so there is sufficient evidence to warrant rejection of the claim that the population has a mean body temperature equal to 98.6°F.
5. Answers vary, but the following is typical. Resample 1000 times to find that 364 of the results have a proportion of 91/220 = 0.414 or greater. There is a likelihood of 0.364 of getting a sample proportion that is at least as extreme as the one obtained, so there is not sufficient evidence to support the claim that the sample of cast and crew members is from a population in which the rate of cancer is greater than 40%.
6. Answers vary, but the following is typical. The sample proportion is 0.396, so in this two-tailed case, the values at least as extreme as 0.396 are those that are 0.396 or lower plus those that are 0.604 or greater. (Note that 0.396 is below the assumed value of 0.5 by 0.104, so the upper bound becomes 0.104 above 0.5, which is 0.604.) Resample 1000 times to find that 184 of the results have a proportion at least as extreme as 0.396. There is a likelihood of 0.184 of getting a sample proportion that is at least as extreme as the one obtained, so there is not sufficient evidence to warrant rejection of the claim that dropped toast will land with the buttered side down 50% of the time.
7. Answers vary, but the following is typical. Create a column of 2005 ones and zeros, with 0.5 of them being 1 (as assumed in the null hypothesis). The sample proportion is 0.11, so in this left-tailed case, the values at least as extreme as 0.11 are those that are 0.11 or lower. Resample 1000 times to find that none of the results have a proportion of 0.11 or lower. There is a very small likelihood of getting a sample proportion that is at least as extreme as the one obtained, so there is sufficient evidence to support the claim that fewer than half of all Americans know what the symbol designates.
8. Answers vary, but the following is typical. The sample proportion is 0.255, so in this two-tailed case, the values at least as extreme as 0.255 are those that are 0.255 or greater plus those that are 0.225 or lower. (Note that 0.255 is above the assumed value of 0.24 by 0.015, so the lower bound becomes 0.015 below 0.24, which is 0.225.) Resample 1000 times to find that 511 of the results have a proportion of 0.225 or lower or 0.255 or greater. There is about a 0.511 likelihood of getting a sample proportion that is at least as extreme as the one obtained, so there is not sufficient evidence to warrant rejection of the claim that 24% of M&Ms are blue.
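The resampling procedure described in Exercises 3 and 5 through 8 can be sketched outside Excel. The Python sketch below simulates samples under the null hypothesis (each observation is a success with the assumed probability, which plays the role of resampling from a column of ones and zeros) and counts how many simulated proportions are at least as far from the assumed value as the observed one. The function name, the fixed seed, and the two-tailed definition of "extreme" are assumptions made for illustration; results vary with the seed, just as the answers above vary.

```python
import random


def resample_proportion_p_value(n: int, p0: float, p_hat: float,
                                n_resamples: int = 1000, seed: int = 1) -> float:
    """Estimate a two-tailed P-value for a proportion by simulation under H0: p = p0."""
    rng = random.Random(seed)
    distance = abs(p_hat - p0)  # how far the observed proportion is from p0
    extreme = 0
    for _ in range(n_resamples):
        # One simulated sample of size n generated with success probability p0.
        successes = sum(1 for _ in range(n) if rng.random() < p0)
        # Small tolerance so that ties with the observed distance count as extreme.
        if abs(successes / n - p0) >= distance - 1e-12:
            extreme += 1
    return extreme / n_resamples


# Exercise 3: 426 boys among 860 births, assumed proportion 0.512.
estimate = resample_proportion_p_value(n=860, p0=0.512, p_hat=426 / 860)
print(estimate)  # typically near 0.3, comparable to the 310/1000 reported above
```

A one-tailed version would compare the simulated proportion with the observed one in a single direction instead of using the absolute distance.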
9. Answers vary, but the following is typical. With \(\bar{x} = 11.05\) μg/g and \(\mu = 14\) μg/g (as assumed in the null hypothesis), add 2.95 μg/g to each sample value (as Statdisk does) so that the sample has the claimed mean of 14 μg/g. Resample the modified sample 1000 times to find that 53 of the results have a mean of 11.05 μg/g or less. There appears to be a likelihood of 0.053 of getting a sample mean of 11.05 μg/g or lower, so there is not sufficient evidence to support the claim that the mean lead concentration for all such medicines is less than 14 μg/g. (Note: Because the likelihood of 0.053 is so close to the significance level of 0.05, it is very possible to find that the likelihood is less than 0.05, so the conclusion would be that there is sufficient evidence to support the claim.)
10. Answers vary, but the following is typical. With \(\bar{x} = 62.66666667\) sec and \(\mu = 60\) sec (as assumed in the null hypothesis), subtract 2.66666667 sec from each sample value (as Statdisk does) so that the sample has the claimed mean of 60 sec. Resample the modified sample 1000 times. The sample means at least as extreme as 62.66666667 sec are those that are 62.66666667 sec or greater and those that are 57.33333333 sec or lower. Among the 1000 generated samples, there are 579 that are at least as extreme as 62.66666667 sec, so there appears to be a likelihood of 0.579 of getting a sample mean at least as extreme as 62.66666667 sec. There is not sufficient evidence to warrant rejection of the claim that the times are from a population with a mean equal to 60 sec.
11. Answers vary, but the following is typical. With \(\bar{x} = 1.04875\) W/kg and \(\mu = 1.6\) W/kg (as assumed in the null hypothesis), add 0.55125 W/kg to each sample value (as Statdisk does) so that the sample has the claimed mean of 1.6 W/kg. Resample the modified sample 1000 times. The sample means at least as extreme as 1.04875 W/kg are those that are 1.04875 W/kg or lower. Among the 1000 generated samples, there are none that are at least as extreme as 1.04875 W/kg, so there appears to be a likelihood of 0.000 of getting a sample mean at least as extreme as 1.04875 W/kg. There is sufficient evidence to support the claim that the sample is from a population of cell phones with a mean amount of radiation that is less than the FCC standard of 1.6 W/kg.
12. Answers vary, but the following is typical. With \(\bar{x} = 8.07104\) g and \(\mu = 8.1\) g (as assumed in the null hypothesis), add 0.02896 g to each sample value (as Statdisk does) so that the sample has the claimed mean of 8.1 g. Resample the modified sample 1000 times. The sample means at least as extreme as 8.07104 g are those that are 8.07104 g or lower and those that are 8.12896 g or greater. Among the 1000 generated samples, there are only 2 that are at least as extreme as 8.07104 g, so there appears to be a likelihood of 0.002 of getting a sample mean at least as extreme as 8.07104 g. There is sufficient evidence to warrant rejection of the claim that dollar coins have a mean weight of 8.1 g.
13. For the data in Example 4, \(s = 0.0480164\) g. For testing the claim that \(\sigma < 0.062\) g, we assume in the null hypothesis that \(\sigma = 0.062\) g. To modify the data set so that s becomes 0.062 g, multiply each of the data values by \(\sigma/s = 0.062/0.0480164 = 1.2912255\), then resample 1000 times. A typical result is that among the 1000 resulting values of s, 69 of them are 0.0480164 or lower. That is, 69 of the 1000 resamples are at least as extreme as the value of \(s = 0.0480164\) g that was obtained. That result corresponds to an estimated P-value of 0.069, which is greater than the significance level of 0.05 used in Example 4. We therefore conclude that there is not sufficient evidence to support the claim that the listed weights are from a population with a standard deviation that is less than 0.062 g. These results from randomization agree quite well with those from Example 1 in Section 8-4, and this suggests that randomization is very effective in this case.
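The shift-then-resample procedure in Exercises 9 through 12 can be sketched in a few lines of Python. The raw data for those exercises are not reproduced in this section, so the sample below is an illustrative assumption: ten values chosen only so that their mean is 11.05, matching the summary in Exercise 9. The function name, the seed, and the tail labels are also hypothetical.

```python
import random
import statistics


def randomization_mean_p_value(sample, mu0, tail="two",
                               n_resamples=1000, seed=1):
    """Estimate a P-value for a claim about a mean by shifting and resampling."""
    rng = random.Random(seed)
    xbar = statistics.mean(sample)
    # Shift every value so the modified sample has the claimed mean mu0.
    shifted = [x + (mu0 - xbar) for x in sample]
    distance = abs(xbar - mu0)
    extreme = 0
    for _ in range(n_resamples):
        # Resample with replacement from the shifted data, same sample size.
        m = statistics.mean(rng.choices(shifted, k=len(sample)))
        if tail == "left":
            hit = m <= xbar
        elif tail == "right":
            hit = m >= xbar
        else:
            hit = abs(m - mu0) >= distance
        extreme += int(hit)
    return extreme / n_resamples


# Illustrative values only (assumed, not taken from the text); their mean is 11.05.
lead = [3.0, 6.5, 6.0, 5.5, 20.5, 7.5, 12.0, 20.5, 11.5, 17.5]
print(randomization_mean_p_value(lead, mu0=14, tail="left"))
```

Because the estimate depends on the random resamples, repeated runs with different seeds can fall on either side of a significance level when the true likelihood is close to it, which is the point made in the note to Exercise 9.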
14. Here are typical results: \(\bar{x} = 62.1\) μg/g, \(s = 6.8\) μg/g, the range is 32.0 μg/g, the minimum is 47 μg/g, and the maximum is 79 μg/g. The distribution appears to be approximately normal, and there are no outliers. The range of values from 47 μg/g to 79 μg/g is somewhat large. Given the way that the values were generated, it is not surprising that the distribution appears to be approximately normal.

Quick Quiz
1. \(H_0: \mu = 2.000\) pounds; \(H_1: \mu < 2.000\) pounds
2. \(t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{1.911 - 2.000}{1.065/\sqrt{62}} = -0.658\)
3. a. t distribution
b. No. The relevant requirement is that the population is normally distributed or \(n > 30\). With \(n = 62\), that requirement is satisfied and it is not necessary to test for normality.
4. a. 0.2565 > 0.05; Fail to reject \(H_0\).
b. There is not sufficient evidence to support the claim that the sample is from a population with a mean less than 2.000 pounds.
5. \(H_0: p = 0.45\); \(H_1: p > 0.45\)
6. \(z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.49 - 0.45}{\sqrt{\dfrac{(0.45)(0.55)}{1150}}} = 2.727\) (2.76 if using \(x = 0.49(1150) = 564\))
7. a. 0.0029 < 0.01; Reject \(H_0\).
b. There is sufficient evidence to support the claim that for the population of all TV households, more than 45% have at least one stand-alone streaming device.
8. a. t distribution b. normal distribution c. chi-square distribution
9. a. True b. False c. False d. False e. False
10. The t test requires that the sample is from a normally distributed population, and the test is robust in the sense that the test works reasonably well if the departure from normality is not too extreme. The \(\chi^2\) (chi-square) test is not robust against a departure from normality, meaning that the test does not work well if the population has a distribution that is far from normal.

Review Exercises
1. \(H_0: p = 0.5\); \(H_1: p > 0.5\); Test statistic: \(z = \dfrac{0.51 - 0.5}{\sqrt{\dfrac{(0.5)(0.5)}{195{,}600}}} = 8.85\);
P-value \(= P(z > 8.85) = 0.0001\) (Excel: 0.0000); Critical value: \(z = 2.33\); Reject \(H_0\). There is sufficient evidence to support the claim that the majority of employees are searching for new jobs.

XLSTAT Output for Exercise 1
z-test for one proportion / Upper-tailed test:
Difference              0.010
z (Observed value)      8.845
z (Critical value)      2.326
p-value (one-tailed)    <0.0001
alpha                   0.010

XLSTAT Output for Exercise 2
One-sample t-test / Two-tailed test:
Difference              13.025
t (Observed value)      92.116
|t| (Critical value)    2.093
DF                      19
p-value (Two-tailed)    <0.0001
alpha                   0.050

2. \(n = 20 < 30\), but the data appear to follow a normal distribution. \(H_0: \mu = 27.2\) km/h; \(H_1: \mu \ne 27.2\) km/h;
Test statistic: \(t = \dfrac{40.225 - 27.2}{0.63235/\sqrt{20}} = 92.116\); Critical values: \(t = \pm 2.093\);
P-value = 0.0000 (Table: P-value < 0.0002); Reject \(H_0\). There is sufficient evidence to reject the claim that current winning speeds are not significantly different from the 27.2 km/h overall speed of winners from the first few races. Current winning speeds appear to be significantly higher.
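The proportion tests in Quick Quiz Exercises 6 and 7 and in Review Exercise 1 use \(z = (\hat{p} - p)/\sqrt{pq/n}\) with a right-tailed P-value from the standard normal distribution. The following Python sketch reproduces that arithmetic with the standard library; the helper names are hypothetical, and `statistics.NormalDist` stands in for the normal table or Excel.

```python
import math
from statistics import NormalDist


def proportion_z(p_hat: float, p0: float, n: int) -> float:
    """Return z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


def upper_tail_p_value(z: float) -> float:
    """Return P(Z > z) for a standard normal Z."""
    return 1 - NormalDist().cdf(z)


# Quick Quiz 6 with the rounded sample proportion 0.49.
print(round(proportion_z(0.49, 0.45, 1150), 3))  # 2.727

# Quick Quiz 6 and 7 using x = 564 successes, so p_hat = 564/1150.
z = proportion_z(564 / 1150, 0.45, 1150)
print(round(z, 2))                      # 2.76
print(round(upper_tail_p_value(z), 4))  # 0.0029
```

For a two-tailed test such as Review Exercise 4, the P-value would be twice the tail area beyond the absolute value of z.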
3. The data cannot be verified to have a normal distribution, but \(n > 30\). \(H_0: \mu = 5.4\) million cells per microliter; \(H_1: \mu < 5.4\) million cells per microliter;
Test statistic: \(t = \dfrac{4.932 - 5.4}{0.504/\sqrt{40}} = -5.873\); Critical value: \(t = -2.426\);
P-value = 0.0000 (Table: P-value < 0.005); Reject \(H_0\). There is sufficient evidence to support the claim that the sample is from a population with a mean less than 5.4 million cells per microliter. The test deals with the distribution of sample means, not individual values, so the result does not suggest that each of the 40 males has a red blood cell count below 5.4 million cells per microliter.

Excel Commands for Exercise 3
P-value: fx | =T.DIST(-5.873,40-1,1)
Critical Value: fx | =T.INV(0.01,40-1)

XLSTAT Output for Exercise 4
z-test for one proportion / Two-tailed test:
Difference              0.074
z (Observed value)      3.699
|z| (Critical value)    1.960
p-value (Two-tailed)    0.000
alpha                   0.050

4. \(H_0: p = 0.43\); \(H_1: p \ne 0.43\); Test statistic: \(z = \dfrac{\dfrac{308}{611} - 0.43}{\sqrt{\dfrac{(0.43)(0.57)}{611}}} = 3.70\);
P-value \(= 2 \cdot P(z > 3.70) = 0.0002\); Critical values: \(z = \pm 1.96\); Reject \(H_0\). There is sufficient evidence to warrant rejection of the claim that the percentage who believe that they voted for the winning candidate is equal to 43%. There appears to be a substantial discrepancy between how people said that they voted and how they actually did vote.
5. a. A type I error is the mistake of rejecting a null hypothesis when it is actually true. A type II error is the mistake of failing to reject a null hypothesis when in reality it is false.
b. Type I error: In reality, the percentage who voted for the winning candidate is equal to 43%, but we conclude that the percentage is different from 43%. Type II error: In reality, the percentage who voted for the winning candidate is different from 43%, but we fail to support the claim that the percentage is different from 43%.
6. a. false b. true c. false d. false e. false

Cumulative Review Exercises
1.
a. $\bar{x} = \frac{46 + 51 + \cdots + 16 + 20}{20} = 34.8$ deaths
b. $Q_2 = \frac{32 + 34}{2} = 33.0$ deaths
c. $s = \sqrt{\frac{(46 - 34.8)^2 + (51 - 34.8)^2 + \cdots + (16 - 34.8)^2 + (20 - 34.8)^2}{20 - 1}} = 10.7$ deaths
d. $s^2 = 10.711^2 = 114.7$ deaths$^2$
e. range $= 51 - 16 = 35.0$ deaths
f. The pattern of the data over time is not revealed by the statistics. A time-series graph would be very helpful in understanding the pattern over time. The data appear to be trending downward.
Copyright © 2022 Pearson Education, Inc.
1. (continued)
XLSTAT Output
Statistic: Lightning Deaths
Nbr. of observations: 20
Range: 35.000
Median: 33.000
Mean: 34.750
Variance (n-1): 114.724
Standard deviation (n-1): 10.711
2. a. ratio b. discrete c. quantitative d. No, the data are from recent and consecutive years, so they are not randomly selected.
3. a. 99% CI: $\bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 34.8 \pm 2.861 \cdot \frac{10.7}{\sqrt{20}} \Rightarrow 27.9 \text{ deaths} < \mu < 41.6 \text{ deaths}$
b. We have 99% confidence that the limits of 27.9 deaths and 41.6 deaths contain the value of the population mean.
c. No. The data appear to be trending downward. Instead of having a stable population, the population characteristics appear to be changing over time.
XLSTAT Output for Exercise 3
99% confidence interval on the mean: ( 27.898, 41.602 )
XLSTAT Output for Exercise 4
One-sample t-test / Lower-tailed test:
Difference: -37.850
t (Observed value): -15.804
t (Critical value): -2.539
DF: 19
p-value (one-tailed): <0.0001
alpha: 0.010
4. $H_0$: $\mu = 72.6$ deaths; $H_1$: $\mu < 72.6$ deaths; Test statistic: $t = \frac{34.8 - 72.6}{10.7/\sqrt{20}} = -15.804$; Critical value: $t = -2.539$; P-value = 0.0000 (Table: P-value < 0.005); Reject $H_0$. There is sufficient evidence to support the claim that the mean number of annual lightning deaths is now less than the mean of 72.6 deaths from the 1980s. Possible factors: Shift in population from rural to urban areas; better lightning protection and grounding in electric and cable and phone lines; better medical treatment of people struck by lightning; fewer people use phones attached to cords; better weather predictions.
5. Because the vertical scale starts at 50 and not at 0, the difference between the number of males and the number of females is exaggerated, so the graph is deceptive by creating the false impression that males account for nearly all lightning strike deaths. A comparison of the numbers of deaths shows that the number of male deaths is roughly 4 times the number of female deaths, but the graph makes it appear that the number of male deaths is around 12 times the number of female deaths.
6. $H_0$: $p = 0.5$; $H_1$: $p > 0.5$; Test statistic: $z = \frac{\frac{242}{306} - 0.5}{\sqrt{\frac{(0.5)(0.5)}{306}}} = 10.18$; P-value $= P(z > 10.18) = 0.0001$ (Excel: 0.0000); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the proportion of male deaths is greater than 1/2. More males are involved in certain outdoor activities such as construction, fishing, and golf.
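The one-proportion test in Exercise 6 can be checked outside Excel. The following is a minimal Python sketch, not part of the original solution; the function name `one_proportion_z` is a hypothetical helper, and the figures (242 of 306, $p_0 = 0.5$, $\alpha = 0.01$) are taken from the exercise.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Return (sample proportion, z statistic) for H0: p = p0 (hypothetical helper)."""
    p_hat = x / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return p_hat, z

# Exercise 6 figures: 242 male deaths among 306, H0: p = 0.5, alpha = 0.01
p_hat, z = one_proportion_z(242, 306, 0.5)
p_value = 1 - NormalDist().cdf(z)          # upper-tailed: P(Z > z)
critical = NormalDist().inv_cdf(1 - 0.01)  # right-tail critical value
reject_h0 = z > critical

print(round(z, 3), round(critical, 3), reject_h0)
```

The P-value underflows to 0.0 in floating point for a z this large, which matches the "Excel: 0.0000" note in the solution.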
XLSTAT Output for Exercise 6
z-test for one proportion / Upper-tailed test:
Difference: 0.291
z (Observed value): 10.176
z (Critical value): 2.326
p-value (one-tailed): <0.0001
alpha: 0.010
7. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{242}{306} \pm 1.96\sqrt{\frac{\left(\frac{242}{306}\right)\left(\frac{64}{306}\right)}{306}} \Rightarrow 0.745 < p < 0.836$; Because the entire confidence interval is greater than 0.5, it does not seem feasible that males and females have equal chances of being killed by lightning. The confidence interval indicates that the proportion of male lightning deaths is substantially greater than 0.5.
XLSTAT Output for Exercise 7
95% confidence interval on the proportion: ( 0.745, 0.836 )
8. a. $0.8 \cdot 0.8 \cdot 0.8 = 0.512$
b. $0.2 \cdot 0.2 \cdot 0.2 = 0.008$
c. $1 - 0.2 \cdot 0.2 \cdot 0.2 = 0.992$
d. ${}_5C_3 \cdot 0.8^3 \cdot 0.2^2 = 0.205$
fx | =BINOM.DIST(3,5,0.8,0)
e. $\mu = np = 50 \cdot 0.8 = 40$ males; $\sigma = \sqrt{npq} = \sqrt{50 \cdot 0.8 \cdot 0.2} = 2.8$ males
f. Yes, using the range rule of thumb, values above $\mu + 2\sigma = 40.0 + 2(2.8) = 45.6$ are considered significantly high. Since 46 is greater than 45.6, 46 males would be a significantly high number in a group of 50.
fx | =NORM.INV(1-0.05/2,40,2.8)
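Parts (d) through (f) of Exercise 8 are short binomial calculations. As a hedged illustration (the helper name `binomial_pmf` is an assumption, not from the manual), they can be reproduced in Python as follows. The sketch keeps $\sigma$ unrounded, so the cutoff differs slightly from the 45.6 obtained with $\sigma$ rounded to 2.8.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p) (hypothetical helper)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Part (d): exactly 3 males among 5 victims when P(male) = 0.8
prob_3_of_5 = binomial_pmf(3, 5, 0.8)

# Part (e): mean and standard deviation for groups of 50
n, p = 50, 0.8
mu = n * p
sigma = sqrt(n * p * (1 - p))

# Part (f): range rule of thumb, significantly high means above mu + 2*sigma
high_cutoff = mu + 2 * sigma
is_46_high = 46 > high_cutoff

print(round(prob_3_of_5, 3), mu, round(sigma, 1), is_46_high)
```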
Chapter 9: Inferences from Two Samples
Section 9-1: Two Proportions
1. The samples are simple random samples that are independent. For each of the two groups, the number of successes is at least 5 and the number of failures is at least 5. (Depending on what we call a success, the four numbers are 33, 115, 201,196 and 200,630, and all of those numbers are at least 5.) The requirements are satisfied.
2. $n_1 = 201{,}229$, $\hat{p}_1 = \frac{33}{201{,}229} = 0.000163992$, $\hat{q}_1 = 1 - 0.000163992 = 0.999836$;
$n_2 = 200{,}745$, $\hat{p}_2 = \frac{115}{200{,}745} = 0.000572866$, $\hat{q}_2 = 1 - 0.000572866 = 0.999427$;
$\bar{p} = \frac{33 + 115}{201{,}229 + 200{,}745} = 0.000368183$, $\bar{q} = 1 - 0.000368183 = 0.999632$
3.
a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$
b. There is sufficient evidence to support the claim that the rate of polio is less for children given the Salk vaccine than it is for children given a placebo. The Salk vaccine appears to be effective.
4. a. The P-value method and the critical value method are equivalent in the sense that they will always lead to the same conclusion, but the confidence interval method is not equivalent to them.
b. 0.90, or 90%
c. Because the confidence interval limits do not contain 0, there appears to be a significant difference between the two proportions. Because the confidence interval consists of negative values only, it appears that the first proportion is less than the second proportion. There is sufficient evidence to support the claim that the rate of polio is less for children given the Salk vaccine than it is for children given a placebo.
5. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = vinyl gloves, population 2 = latex gloves; Test statistic: $z = 12.82$; P-value = 0.0000; Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that vinyl gloves have a greater virus leak rate than latex gloves.
6. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = surgery, population 2 = splints; Test statistic: $z = 3.12$; P-value = 0.0009; Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the success rate is better with surgery.
For Exercises 7–22, assume that the data fit the requirements for the statistical methods for two proportions unless otherwise indicated.
7.
a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = buttered toast, population 2 = X marked toast; Test statistic: $z = -0.62$; P-value = 0.5359 (Table: 0.5352); Critical values: $z = \pm 1.96$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that when dropped, buttered toast and toast marked with an X have the same proportion that land with the buttered/X side down.
$\bar{p} = \frac{19 + 22}{48 + 48} = \frac{41}{96}$; $\bar{q} = 1 - \frac{41}{96} = \frac{55}{96}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{19}{48} - \frac{22}{48}\right) - 0}{\sqrt{\frac{\left(\frac{41}{96}\right)\left(\frac{55}{96}\right)}{48} + \frac{\left(\frac{41}{96}\right)\left(\frac{55}{96}\right)}{48}}} = -0.62$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: -0.063
z (Observed value): -0.619
|z| (Critical value): 1.960
p-value (Two-tailed): 0.536
alpha: 0.050
XLSTAT Output for Part (b)
95% confidence interval on the difference between the proportions: ( -0.260, 0.135 )
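The two calculations used throughout this section (a pooled z test and an unpooled confidence interval for $p_1 - p_2$) can be sketched in Python. This is an illustrative sketch under the assumption that only the standard library is available; `pooled_z` and `diff_ci` are hypothetical helper names, and the counts (19 of 48 and 22 of 48) are those of Exercise 7.

```python
from math import sqrt
from statistics import NormalDist

def pooled_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion p-bar."""
    p_bar = (x1 + x2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    return (x1 / n1 - x2 / n2) / se

def diff_ci(x1, n1, x2, n2, conf=0.95):
    """Confidence interval for p1 - p2, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z_crit * sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) - margin, (p1 - p2) + margin

# Exercise 7 figures: 19 of 48 buttered slices, 22 of 48 X-marked slices
z = pooled_z(19, 48, 22, 48)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
low, high = diff_ci(19, 48, 22, 48)

print(round(z, 3), round(p_value, 3), round(low, 3), round(high, 3))
```

The test pools the two samples because $H_0$ assumes a common proportion, while the interval estimates each proportion separately.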
b. 95% CI: $-0.260 < p_1 - p_2 < 0.135$; Because the confidence interval limits do contain 0, there is not a significant difference between the two sample proportions. There is not sufficient evidence to warrant rejection of the claim that when dropped, buttered toast and toast marked with an X have the same proportion that land with the buttered/X side down.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{19}{48} - \frac{22}{48}\right) \pm 1.96\sqrt{\frac{\left(\frac{19}{48}\right)\left(\frac{29}{48}\right)}{48} + \frac{\left(\frac{22}{48}\right)\left(\frac{26}{48}\right)}{48}}$
8.
a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = men, population 2 = women; Test statistic: $z = 2.52$; P-value = 0.0118; Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that men and women singles players have equal success in challenging referee calls.
$\bar{p} = \frac{1757 + 887}{6036 + 3327} = \frac{2644}{9363}$; $\bar{q} = 1 - \frac{2644}{9363} = \frac{6719}{9363}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{1757}{6036} - \frac{887}{3327}\right) - 0}{\sqrt{\frac{\left(\frac{2644}{9363}\right)\left(\frac{6719}{9363}\right)}{6036} + \frac{\left(\frac{2644}{9363}\right)\left(\frac{6719}{9363}\right)}{3327}}} = 2.52$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: 0.024
z (Observed value): 2.518
|z| (Critical value): 1.960
p-value (Two-tailed): 0.012
alpha: 0.050
XLSTAT Output for Part (b)
95% confidence interval on the difference between the proportions: ( 0.0056, 0.0434 )
b. 95% CI: $0.00558 < p_1 - p_2 < 0.0434$; Because the confidence interval limits do not contain 0, there is a significant difference between the two sample proportions. There is sufficient evidence to warrant rejection of the claim that men and women singles players have equal success in challenging referee calls.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{1757}{6036} - \frac{887}{3327}\right) \pm 1.96\sqrt{\frac{\left(\frac{1757}{6036}\right)\left(\frac{4279}{6036}\right)}{6036} + \frac{\left(\frac{887}{3327}\right)\left(\frac{2440}{3327}\right)}{3327}}$
c. It appears that men and women do not have equal success in challenging calls.
9.
a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$; population 1 = used left ear, population 2 = used right ear; Test statistic: $z = -7.94$; P-value = 0.0000 (Table: 0.0001); Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the rate of right-handedness for those who prefer to use their left ear for cell phones is less than the rate of right-handedness for those who prefer to use their right ear for cell phones.
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = -7.94$
XLSTAT Output for Part (a)
z-test for two proportions / Lower-tailed test:
Difference: -0.196
z (Observed value): -7.944
z (Critical value): -2.326
p-value (one-tailed): <0.0001
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( -0.266, -0.126 )
b. 98% CI: $-0.266 < p_1 - p_2 < -0.126$; Because the confidence interval limits do not contain 0, there is a significant difference between the two proportions. Because the interval consists of negative numbers only, it appears that the claim is supported. The difference between the populations does appear to have practical significance.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}}$, with $z_{\alpha/2} = 2.33$
10. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$; population 1 = single bill, population 2 = multiple bills; Test statistic: $z = -1.85$; P-value = 0.0324 (Table: P-value = 0.0322); Critical value: $z = -1.645$; Reject $H_0$. There is sufficient evidence to support the claim that when given a single large bill, a smaller proportion of women in China spend some or all of the money when compared to the proportion of women in China given the same amount in smaller bills.
$\bar{p} = \frac{60 + 68}{75 + 75} = \frac{64}{75}$; $\bar{q} = 1 - \frac{64}{75} = \frac{11}{75}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{60}{75} - \frac{68}{75}\right) - 0}{\sqrt{\frac{\left(\frac{64}{75}\right)\left(\frac{11}{75}\right)}{75} + \frac{\left(\frac{64}{75}\right)\left(\frac{11}{75}\right)}{75}}} = -1.85$
XLSTAT Output for Part (a)
z-test for two proportions / Lower-tailed test:
Difference: -0.107
z (Observed value): -1.846
z (Critical value): -1.645
p-value (one-tailed): 0.0324
alpha: 0.050
XLSTAT Output for Part (b)
90% confidence interval on the difference between the proportions: ( -0.201, -0.013 )
b. 90% CI: $-0.201 < p_1 - p_2 < -0.0127$; Because the confidence interval does not include 0 and it includes only negative values, it appears that the first proportion is less than the second proportion. There is sufficient evidence to support the claim that when given a single large bill, a smaller proportion of women in China spend some or all of the money when compared to the proportion of women in China given the same amount in smaller bills.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{60}{75} - \frac{68}{75}\right) \pm 1.645\sqrt{\frac{\left(\frac{60}{75}\right)\left(\frac{15}{75}\right)}{75} + \frac{\left(\frac{68}{75}\right)\left(\frac{7}{75}\right)}{75}}$
c. Because the P-value = 0.0324 (Table: 0.0322), the difference is significant at the 0.05 significance level, but not at the 0.01 significance level. The conclusion does change.
11. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = over age 55, population 2 = under age 25; Test statistic: $z = 6.44$; P-value = 0.0000 (Table: 0.0001); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the proportion of people over 55 who dream in black and white is greater than the proportion of those under 25.
$\bar{p} = \frac{68 + 13}{306 + 298} = \frac{81}{604}$; $\bar{q} = 1 - \frac{81}{604} = \frac{523}{604}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{68}{306} - \frac{13}{298}\right) - 0}{\sqrt{\frac{\left(\frac{81}{604}\right)\left(\frac{523}{604}\right)}{306} + \frac{\left(\frac{81}{604}\right)\left(\frac{523}{604}\right)}{298}}} = 6.44$
XLSTAT Output for Part (a)
z-test for two proportions / Upper-tailed test:
Difference: 0.179
z (Observed value): 6.440
z (Critical value): 2.326
p-value (one-tailed): <0.0001
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( 0.117, 0.240 )
b. 98% CI: $0.117 < p_1 - p_2 < 0.240$; Because the confidence interval limits do not include 0, it appears that the two proportions are not equal. Because the confidence interval limits include only positive values, it appears that the proportion of people over 55 who dream in black and white is greater than the proportion of those under 25.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{68}{306} - \frac{13}{298}\right) \pm 2.33\sqrt{\frac{\left(\frac{68}{306}\right)\left(\frac{238}{306}\right)}{306} + \frac{\left(\frac{13}{298}\right)\left(\frac{285}{298}\right)}{298}}$
c. The results suggest that the proportion of people over 55 who dream in black and white is greater than the proportion of those under 25, but the results cannot be used to verify the cause of that difference.
12. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = OxyContin, population 2 = placebo; Test statistic: $z = 1.78$; P-value = 0.0757 (Table: P-value = 0.0750); Critical values: $z = \pm 1.96$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a difference between the rates of nausea for those treated with OxyContin and those given a placebo.
$\bar{p} = \frac{52 + 5}{227 + 45} = \frac{57}{272}$; $\bar{q} = 1 - \frac{57}{272} = \frac{215}{272}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{52}{227} - \frac{5}{45}\right) - 0}{\sqrt{\frac{\left(\frac{57}{272}\right)\left(\frac{215}{272}\right)}{227} + \frac{\left(\frac{57}{272}\right)\left(\frac{215}{272}\right)}{45}}} = 1.78$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: 0.118
z (Observed value): 1.776
|z| (Critical value): 1.960
p-value (Two-tailed): 0.076
alpha: 0.050
XLSTAT Output for Part (b)
95% confidence interval on the difference between the proportions: ( 0.0111, 0.225 )
b. 95% CI: $0.0111 < p_1 - p_2 < 0.225$; Because the confidence interval limits do not contain 0, there is a significant difference between the two proportions.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{52}{227} - \frac{5}{45}\right) \pm 1.96\sqrt{\frac{\left(\frac{52}{227}\right)\left(\frac{175}{227}\right)}{227} + \frac{\left(\frac{5}{45}\right)\left(\frac{40}{45}\right)}{45}}$
c. The conclusions from parts (a) and (b) are different. The conclusion from part (a) results from a hypothesis test instead of an estimate of the difference between the two rates of nausea, so there does not appear to be sufficient evidence to conclude that there is a difference between the rates of nausea for those treated with OxyContin and those given a placebo.
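The disagreement in Exercise 12 comes from the two standard errors: the test uses the pooled proportion, while the interval uses the separate sample proportions. A short Python sketch (variable names are illustrative assumptions, not from the manual) makes the two quantities explicit for the counts 52 of 227 and 5 of 45.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 52, 227   # OxyContin group: nausea cases, group size
x2, n2 = 5, 45     # placebo group
p1, p2 = x1 / n1, x2 / n2
p_bar = (x1 + x2) / (n1 + n2)

# The hypothesis test assumes p1 = p2, so it pools the samples
se_pooled = sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
# The confidence interval estimates each proportion separately
se_unpooled = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

z_crit = NormalDist().inv_cdf(0.975)   # two-tailed, alpha = 0.05
z = (p1 - p2) / se_pooled
low = (p1 - p2) - z_crit * se_unpooled
high = (p1 - p2) + z_crit * se_unpooled

test_rejects = abs(z) > z_crit
ci_excludes_zero = low > 0 or high < 0
print(round(se_pooled, 4), round(se_unpooled, 4), test_rejects, ci_excludes_zero)
```

With these counts the pooled standard error is the larger of the two, which is why the test fails to reject while the interval excludes 0.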
13. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = not wearing seat belt, population 2 = wearing seat belt; Test statistic: $z = 6.11$; P-value = 0.0000 (Table: 0.0001); Critical value: $z = 1.645$; Reject $H_0$. There is sufficient evidence to support the claim that the fatality rate is higher for those not wearing seat belts.
$\bar{p} = \frac{31 + 16}{2823 + 7765} = \frac{47}{10{,}588}$; $\bar{q} = 1 - \frac{47}{10{,}588} = \frac{10{,}541}{10{,}588}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{31}{2823} - \frac{16}{7765}\right) - 0}{\sqrt{\frac{\left(\frac{47}{10{,}588}\right)\left(\frac{10{,}541}{10{,}588}\right)}{2823} + \frac{\left(\frac{47}{10{,}588}\right)\left(\frac{10{,}541}{10{,}588}\right)}{7765}}} = 6.11$
XLSTAT Output for Part (a)
z-test for two proportions / Upper-tailed test:
Difference: 0.009
z (Observed value): 6.106
z (Critical value): 1.645
p-value (one-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Part (b)
90% confidence interval on the difference between the proportions: ( 0.00559, 0.01226 )
b. 90% CI: $0.00559 < p_1 - p_2 < 0.0123$; Because the confidence interval limits do not include 0, it appears that the two fatality rates are not equal. Because the confidence interval limits include only positive values, it appears that the fatality rate is higher for those not wearing seat belts.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{31}{2823} - \frac{16}{7765}\right) \pm 1.645\sqrt{\frac{\left(\frac{31}{2823}\right)\left(\frac{2792}{2823}\right)}{2823} + \frac{\left(\frac{16}{7765}\right)\left(\frac{7749}{7765}\right)}{7765}}$
c. The results suggest that the use of seat belts is associated with fatality rates lower than those associated with not using seat belts.
14. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$; population 1 = text warnings, population 2 = picture warnings; Test statistic: $z = -2.89$; P-value = 0.0019; Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the proportion of smokers who tried to quit in the text warning group is less than the proportion in the picture warning group.
$\bar{p} = \frac{366 + 428}{1078 + 1071} = \frac{794}{2149}$; $\bar{q} = 1 - \frac{794}{2149} = \frac{1355}{2149}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{366}{1078} - \frac{428}{1071}\right) - 0}{\sqrt{\frac{\left(\frac{794}{2149}\right)\left(\frac{1355}{2149}\right)}{1078} + \frac{\left(\frac{794}{2149}\right)\left(\frac{1355}{2149}\right)}{1071}}} = -2.89$
XLSTAT Output for Part (a)
z-test for two proportions / Lower-tailed test:
Difference: -0.060
z (Observed value): -2.887
z (Critical value): -2.326
p-value (one-tailed): 0.002
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( -0.1085, -0.0118 )
b. 98% CI: $-0.108 < p_1 - p_2 < -0.0118$; Because the confidence interval limits do not contain 0, there is a significant difference between the two sample proportions. Because the confidence interval consists of negative values only, it appears that the proportion of smokers who tried to quit in the text warning group is less than the proportion in the picture warning group.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{366}{1078} - \frac{428}{1071}\right) \pm 2.326\sqrt{\frac{\left(\frac{366}{1078}\right)\left(\frac{712}{1078}\right)}{1078} + \frac{\left(\frac{428}{1071}\right)\left(\frac{643}{1071}\right)}{1071}}$
15. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = malaria, population 2 = no malaria; Test statistic: $z = -4.41$; P-value = 0.0000 (Table: P-value = 0.0002); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to reject the claim of no difference between the two rates of correct responses. It appears that the dogs do a better job of correctly identifying subjects without malaria.
$\bar{p} = \frac{123 + 131}{175 + 145} = \frac{127}{160}$; $\bar{q} = 1 - \frac{127}{160} = \frac{33}{160}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{123}{175} - \frac{131}{145}\right) - 0}{\sqrt{\frac{\left(\frac{127}{160}\right)\left(\frac{33}{160}\right)}{175} + \frac{\left(\frac{127}{160}\right)\left(\frac{33}{160}\right)}{145}}} = -4.41$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: -0.201
z (Observed value): -4.415
|z| (Critical value): 1.960
p-value (Two-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Part (b)
95% confidence interval on the difference between the proportions: ( -0.284, -0.118 )
b. 95% CI: $-0.284 < p_1 - p_2 < -0.118$; Because the confidence interval limits do not contain 0, there is a significant difference between the two sample proportions. Because the confidence interval consists of negative values only, it appears that the proportion of correct identifications made with subjects having malaria is less than the proportion of correct identifications made with subjects not having malaria.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{123}{175} - \frac{131}{145}\right) \pm 1.96\sqrt{\frac{\left(\frac{123}{175}\right)\left(\frac{52}{175}\right)}{175} + \frac{\left(\frac{131}{145}\right)\left(\frac{14}{145}\right)}{145}}$
c. With correct identification rates of 70.3% and 90.3%, the dogs are doing quite well for subjects with malaria and subjects without malaria, but they do a much better job of correctly identifying patients without malaria.
16. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$; population 1 = used bednet, population 2 = did not use bednet; Test statistic: $z = -2.44$; P-value = 0.0074; Critical value: $z = -2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the incidence of malaria is lower for infants who use the bednets.
$\bar{p} = \frac{15 + 27}{343 + 294} = \frac{6}{91}$; $\bar{q} = 1 - \frac{6}{91} = \frac{85}{91}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{15}{343} - \frac{27}{294}\right) - 0}{\sqrt{\frac{\left(\frac{6}{91}\right)\left(\frac{85}{91}\right)}{343} + \frac{\left(\frac{6}{91}\right)\left(\frac{85}{91}\right)}{294}}} = -2.44$
XLSTAT Output for Part (a)
z-test for two proportions / Lower-tailed test:
Difference: -0.048
z (Observed value): -2.439
z (Critical value): -2.326
p-value (one-tailed): 0.007
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( -0.09496, -0.00125 )
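b. 98% CI: $-0.0950 < p_1 - p_2 < -0.00125$ (Table: $-0.0950 < p_1 - p_2 < -0.00118$); Because the confidence interval does not include 0 and it includes only negative values, it appears that the rate of malaria is lower for infants who use the bednets.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{15}{343} - \frac{27}{294}\right) \pm 2.33\sqrt{\frac{\left(\frac{15}{343}\right)\left(\frac{328}{343}\right)}{343} + \frac{\left(\frac{27}{294}\right)\left(\frac{267}{294}\right)}{294}}$
c. The bednets appear to be effective.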
17. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = Connecticut cars, population 2 = New York cars; Test statistic: $z = 7.11$; P-value = 0.0000 (Table: P-value = 0.0002); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that Connecticut and New York have the same proportion of cars with rear license plates only.
$\bar{p} = \frac{239 + 9}{2049 + 550} = \frac{248}{2599}$; $\bar{q} = 1 - \frac{248}{2599} = \frac{2351}{2599}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{239}{2049} - \frac{9}{550}\right) - 0}{\sqrt{\frac{\left(\frac{248}{2599}\right)\left(\frac{2351}{2599}\right)}{2049} + \frac{\left(\frac{248}{2599}\right)\left(\frac{2351}{2599}\right)}{550}}} = 7.11$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: 0.100
z (Observed value): 7.107
|z| (Critical value): 1.960
p-value (Two-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Part (b)
95% confidence interval on the difference between the proportions: ( 0.0828, 0.1178 )
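b. 95% CI: $0.0828 < p_1 - p_2 < 0.118$; Because the confidence interval limits do not contain 0, there is sufficient evidence to warrant rejection of the claim that Connecticut and New York have the same proportion of cars with rear license plates only. There appears to be a significant difference between the proportions of cars with rear license plates only.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{239}{2049} - \frac{9}{550}\right) \pm 1.96\sqrt{\frac{\left(\frac{239}{2049}\right)\left(\frac{1810}{2049}\right)}{2049} + \frac{\left(\frac{9}{550}\right)\left(\frac{541}{550}\right)}{550}}$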
18. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = cars, population 2 = trucks; Test statistic: $z = -0.95$; P-value = 0.8280 (Table: P-value = 0.8289); Critical value: $z = 1.645$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that car owners violate license plate laws at a higher rate than owners of commercial trucks.
$\bar{p} = \frac{239 + 45}{2049 + 334} = \frac{284}{2383}$; $\bar{q} = 1 - \frac{284}{2383} = \frac{2099}{2383}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{239}{2049} - \frac{45}{334}\right) - 0}{\sqrt{\frac{\left(\frac{284}{2383}\right)\left(\frac{2099}{2383}\right)}{2049} + \frac{\left(\frac{284}{2383}\right)\left(\frac{2099}{2383}\right)}{334}}} = -0.95$
XLSTAT Output for Part (a)
z-test for two proportions / Upper-tailed test:
Difference: -0.018
z (Observed value): -0.946
z (Critical value): 1.645
p-value (one-tailed): 0.828
alpha: 0.050
XLSTAT Output for Part (b)
90% confidence interval on the difference between the proportions: ( -0.0510, 0.0148 )
b. 90% CI: $-0.0510 < p_1 - p_2 < 0.0148$; Because the confidence interval limits contain 0, there is not a significant difference between the two proportions. There is not sufficient evidence to support the claim that car owners violate license plate laws at a higher rate than owners of commercial trucks.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{239}{2049} - \frac{45}{334}\right) \pm 1.645\sqrt{\frac{\left(\frac{239}{2049}\right)\left(\frac{1810}{2049}\right)}{2049} + \frac{\left(\frac{45}{334}\right)\left(\frac{289}{334}\right)}{334}}$
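Exercise 18 shows why the tail direction matters: an upper-tailed test with a negative z gives a P-value above 0.5. The sketch below, with a hypothetical helper `z_p_value`, converts a z statistic to a P-value for each of the three test types used in this section, using only Python's standard library.

```python
from statistics import NormalDist

def z_p_value(z, tail):
    """P-value for a z statistic; tail is 'lower', 'upper', or 'two' (hypothetical helper)."""
    cdf = NormalDist().cdf(z)
    if tail == "lower":
        return cdf                      # P(Z < z)
    if tail == "upper":
        return 1 - cdf                  # P(Z > z)
    if tail == "two":
        return 2 * min(cdf, 1 - cdf)    # both tails
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

# z values as reported by XLSTAT in Exercises 18, 7, and 14
print(round(z_p_value(-0.946, "upper"), 3))
print(round(z_p_value(-0.619, "two"), 3))
print(round(z_p_value(-2.887, "lower"), 4))
```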
19. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = women, population 2 = men; Test statistic: $z = -2.63$; P-value = 0.0085 (Table: P-value = 0.0086); Critical values: $z = \pm 2.575$; Reject $H_0$. There is sufficient evidence to reject the claim that the proportions of blue eyes are the same for females and males.
$\bar{p} = \frac{370 + 359}{1107 + 919} = \frac{729}{2026}$; $\bar{q} = 1 - \frac{729}{2026} = \frac{1297}{2026}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{370}{1107} - \frac{359}{919}\right) - 0}{\sqrt{\frac{\left(\frac{729}{2026}\right)\left(\frac{1297}{2026}\right)}{1107} + \frac{\left(\frac{729}{2026}\right)\left(\frac{1297}{2026}\right)}{919}}} = -2.63$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: -0.056
z (Observed value): -2.634
|z| (Critical value): 2.576
p-value (Two-tailed): 0.008
alpha: 0.010
XLSTAT Output for Part (b)
99% confidence interval on the difference between the proportions: ( -0.11165, -0.00116 )
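b. 99% CI: $-0.112 < p_1 - p_2 < -0.00116$; Because the confidence interval limits do not contain 0, there appears to be a significant difference between the two sample proportions. Because the confidence interval consists of negative values only, it appears that the proportion of blue eyes among females is less than the proportion of blue eyes among males.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{370}{1107} - \frac{359}{919}\right) \pm 2.575\sqrt{\frac{\left(\frac{370}{1107}\right)\left(\frac{737}{1107}\right)}{1107} + \frac{\left(\frac{359}{919}\right)\left(\frac{560}{919}\right)}{919}}$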
c. The professors used a convenience sample of their students. Convenience samples are typically problematic for making inferences about population parameters, but in this case there isn’t anything about eye color that would seem to make someone more or less likely to be a student in a statistics class, so the data do not appear to have the typical bias of a convenience sample.
20. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = women, population 2 = men; Test statistic: $z = 0.12$; P-value = 0.9074 (Table: P-value = 0.9044); Critical values: $z = \pm 2.575$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the proportions of brown eyes are the same for females and males.
$\bar{p} = \frac{352 + 290}{1107 + 919} = \frac{321}{1013}$; $\bar{q} = 1 - \frac{321}{1013} = \frac{692}{1013}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{352}{1107} - \frac{290}{919}\right) - 0}{\sqrt{\frac{\left(\frac{321}{1013}\right)\left(\frac{692}{1013}\right)}{1107} + \frac{\left(\frac{321}{1013}\right)\left(\frac{692}{1013}\right)}{919}}} = 0.12$
XLSTAT Output for Part (a)
z-test for two proportions / Two-tailed test:
Difference: 0.002
z (Observed value): 0.116
|z| (Critical value): 2.576
p-value (Two-tailed): 0.907
alpha: 0.010
XLSTAT Output for Part (b)
99% confidence interval on the difference between the proportions: ( -0.0511, 0.0559 )
b. 99% CI: $-0.0511 < p_1 - p_2 < 0.0559$; Because the confidence interval limits do contain 0, there is not a significant difference between the two sample proportions. It appears that the proportions of brown eyes are the same for females and males.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{352}{1107} - \frac{290}{919}\right) \pm 2.575\sqrt{\frac{\left(\frac{352}{1107}\right)\left(\frac{755}{1107}\right)}{1107} + \frac{\left(\frac{290}{919}\right)\left(\frac{629}{919}\right)}{919}}$
c. The professors used a convenience sample of their students. Convenience samples are typically problematic for making inferences about population parameters, but in this case there isn’t anything about eye color that would seem to make someone more or less likely to be a student in a statistics class, so the data do not appear to have the typical bias of a convenience sample.
21. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 < p_2$; population 1 = male, population 2 = female; Test statistic: $z = -1.17$; P-value = 0.1214 (Table: 0.1210); Critical value: $z = -2.33$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that the rate of left-handedness among males is less than that among females.
$\bar{p} = \frac{23 + 65}{240 + 520} = \frac{11}{95}$; $\bar{q} = 1 - \frac{11}{95} = \frac{84}{95}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{23}{240} - \frac{65}{520}\right) - 0}{\sqrt{\frac{\left(\frac{11}{95}\right)\left(\frac{84}{95}\right)}{240} + \frac{\left(\frac{11}{95}\right)\left(\frac{84}{95}\right)}{520}}} = -1.17$
XLSTAT Output for Part (a)
z-test for two proportions / Lower-tailed test:
Difference: -0.029
z (Observed value): -1.168
z (Critical value): -2.326
p-value (one-tailed): 0.121
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( -0.0848, 0.0264 )
b. 98% CI: $-0.0848 < p_1 - p_2 < 0.0264$ (Table: $-0.0849 < p_1 - p_2 < 0.0265$); Because the confidence interval limits include 0, there does not appear to be a significant difference between the rate of left-handedness among males and the rate among females. There is not sufficient evidence to support the claim that the rate of left-handedness among males is less than that among females.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{23}{240} - \frac{65}{520}\right) \pm 2.33\sqrt{\frac{\left(\frac{23}{240}\right)\left(\frac{217}{240}\right)}{240} + \frac{\left(\frac{65}{520}\right)\left(\frac{455}{520}\right)}{520}}$
c. The rate of left-handedness among males does not appear to be less than the rate of left-handedness among females.
22. a. $H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = helicopter, population 2 = ground; Test statistic: $z = 10.75$; P-value = 0.0000 (Table: P-value = 0.0001); Critical value: $z = 2.33$; Reject $H_0$. There is sufficient evidence to support the claim that the rate of fatalities is higher for patients transported by helicopter.
$\bar{p} = \frac{7813 + 17{,}775}{61{,}909 + 161{,}566} = \frac{25{,}588}{223{,}475}$; $\bar{q} = 1 - \frac{25{,}588}{223{,}475} = \frac{197{,}887}{223{,}475}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{7813}{61{,}909} - \frac{17{,}775}{161{,}566}\right) - 0}{\sqrt{\frac{\left(\frac{25{,}588}{223{,}475}\right)\left(\frac{197{,}887}{223{,}475}\right)}{61{,}909} + \frac{\left(\frac{25{,}588}{223{,}475}\right)\left(\frac{197{,}887}{223{,}475}\right)}{161{,}566}}} = 10.75$
XLSTAT Output for Part (a)
z-test for two proportions / Upper-tailed test:
Difference: 0.016
z (Observed value): 10.753
z (Critical value): 2.326
p-value (one-tailed): <0.0001
alpha: 0.010
XLSTAT Output for Part (b)
98% confidence interval on the difference between the proportions: ( 0.0126, 0.0198 )
b. 98% CI: $0.0126 < p_1 - p_2 < 0.0198$; Because the confidence interval limits do not contain 0, there is a significant difference between the two proportions. Because the entire range of values in the confidence interval consists of positive numbers, it appears that the rate of fatalities is higher for patients transported by helicopter.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{7813}{61{,}909} - \frac{17{,}775}{161{,}566}\right) \pm 2.33\sqrt{\frac{\left(\frac{7813}{61{,}909}\right)\left(\frac{54{,}096}{61{,}909}\right)}{61{,}909} + \frac{\left(\frac{17{,}775}{161{,}566}\right)\left(\frac{143{,}791}{161{,}566}\right)}{161{,}566}}$
c. The fatality rate for the helicopter sample is 0.126, or 12.6%, and the fatality rate for the ground services sample is 0.110, or 11.0%. The large sample sizes result in a significant difference, but it does not appear that the difference has very much practical significance. Also, it is possible that the most serious of the serious traumatic injuries led to helicopter transportation, and that could partly explain the higher rate of fatalities with helicopter transportation.
23. $n = \frac{z_{\alpha/2}^2}{2E^2} = \frac{1.96^2}{2(0.02)^2} = 4802$; the samples should include 4802 men and 4802 women.
fx | =NORM.S.INV(1-0.05/2)^2/(2*0.02^2)
24. a. The method of this section requires that both samples must have at least 5 successes and 5 failures, but the group not exposed to yawning includes a frequency of 4, which violates that requirement.
b. P-value = 0.3729 (Table: 0.3745), which is not close to the P-value of 0.5128 from Fisher’s exact test.
$H_0$: $p_1 = p_2$; $H_1$: $p_1 > p_2$; population 1 = yawning, population 2 = not yawning;
$\bar{p} = \frac{10 + 4}{34 + 16} = \frac{7}{25}$; $\bar{q} = 1 - \frac{7}{25} = \frac{18}{25}$
$z = \frac{\left(\frac{10}{34} - \frac{4}{16}\right) - 0}{\sqrt{\frac{\left(\frac{7}{25}\right)\left(\frac{18}{25}\right)}{34} + \frac{\left(\frac{7}{25}\right)\left(\frac{18}{25}\right)}{16}}} = 0.32$
XLSTAT Output for Part (b)
z-test for two proportions / Upper-tailed test:
Difference: 0.044
z (Observed value): 0.324
z (Critical value): 1.645
p-value (one-tailed): 0.373
alpha: 0.050
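The Excel formula given for Exercise 23, `=NORM.S.INV(1-0.05/2)^2/(2*0.02^2)`, can be mirrored in Python as a check. This is a sketch only; `sample_size_two_proportions` is a hypothetical name, and rounding up to the next whole subject is the usual convention for sample sizes.

```python
from math import ceil
from statistics import NormalDist

def sample_size_two_proportions(margin, conf=0.95):
    """Per-group sample size n = z^2 / (2 E^2), rounded up (hypothetical helper)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # same role as NORM.S.INV
    return ceil(z**2 / (2 * margin**2))

# Exercise 23 figures: margin of error E = 0.02, 95% confidence
n_each = sample_size_two_proportions(0.02)
print(n_each)
```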
25. a. 95% CI: $0.0227 < p_1 - p_2 < 0.217$; population 1 = first sample, population 2 = second sample; Because the confidence interval limits do not contain 0, it appears that $p_1 = p_2$ can be rejected.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{112}{200} - \frac{88}{200}\right) \pm 1.96\sqrt{\frac{\left(\frac{112}{200}\right)\left(\frac{88}{200}\right)}{200} + \frac{\left(\frac{88}{200}\right)\left(\frac{112}{200}\right)}{200}}$
XLSTAT Output for Part (a)
95% confidence interval on the difference between the proportions: ( 0.023, 0.217 )
XLSTAT Output for Part (c)
z-test for two proportions / Two-tailed test:
Difference: 0.120
z (Observed value): 2.400
|z| (Critical value): 1.960
p-value (Two-tailed): 0.016
alpha: 0.050
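b. First sample: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{112}{200} \pm 1.96\sqrt{\frac{\left(\frac{112}{200}\right)\left(\frac{88}{200}\right)}{200}} \Rightarrow 0.491 < p_1 < 0.629$
Second sample: 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{88}{200} \pm 1.96\sqrt{\frac{\left(\frac{88}{200}\right)\left(\frac{112}{200}\right)}{200}} \Rightarrow 0.371 < p_2 < 0.509$
Because the confidence intervals do overlap, it appears that $p_1 = p_2$ cannot be rejected.
XLSTAT Output for Sample 1
95% confidence interval on the proportion: ( 0.491, 0.629 )
XLSTAT Output for Sample 2
95% confidence interval on the proportion: ( 0.371, 0.509 )
c. $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = first sample, population 2 = second sample; Test statistic: $z = 2.40$; P-value = 0.0164; Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to reject $p_1 = p_2$.
$\bar{p} = \frac{112 + 88}{200 + 200} = \frac{1}{2}$; $\bar{q} = 1 - \frac{1}{2} = \frac{1}{2}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{112}{200} - \frac{88}{200}\right) - 0}{\sqrt{\frac{\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)}{200} + \frac{\left(\frac{1}{2}\right)\left(\frac{1}{2}\right)}{200}}} = 2.40$
d. Reject $p_1 = p_2$. The least effective method is using the overlap between the individual confidence intervals.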
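The three approaches compared in Exercise 25 can be run side by side. The following Python sketch (variable names are illustrative assumptions) uses the counts 112 of 200 and 88 of 200 and reports what each approach concludes about $p_1 = p_2$.

```python
from math import sqrt
from statistics import NormalDist

x1, x2, n = 112, 88, 200
p1, p2 = x1 / n, x2 / n
z_crit = NormalDist().inv_cdf(0.975)   # 95% confidence, two-tailed

def single_ci(p, size):
    """95% confidence interval for one proportion."""
    margin = z_crit * sqrt(p * (1 - p) / size)
    return p - margin, p + margin

# Method 1: do the two individual intervals overlap?
ci1, ci2 = single_ci(p1, n), single_ci(p2, n)
intervals_overlap = ci1[0] <= ci2[1] and ci2[0] <= ci1[1]

# Method 2: interval for the difference p1 - p2 (unpooled standard error)
margin = z_crit * sqrt(p1 * (1 - p1) / n + p2 * (1 - p2) / n)
diff_ci = (p1 - p2 - margin, p1 - p2 + margin)
diff_ci_excludes_zero = diff_ci[0] > 0 or diff_ci[1] < 0

# Method 3: pooled z test of H0: p1 = p2
p_bar = (x1 + x2) / (2 * n)
z = (p1 - p2) / sqrt(p_bar * (1 - p_bar) * (2 / n))
test_rejects = abs(z) > z_crit

print(intervals_overlap, diff_ci_excludes_zero, test_rejects)
```

Only the overlap check fails to detect the difference, consistent with part (d).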
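26. Hypothesis test: $H_0$: $p_1 = p_2$; $H_1$: $p_1 \ne p_2$; population 1 = first sample, population 2 = second sample; Test statistic: $z = -1.9615$; P-value = 0.0498 (Table: P-value = 0.05); Critical values: $z = \pm 1.96$; Reject $H_0$. There is sufficient evidence to reject $p_1 = p_2$.
$\bar{p} = \frac{10 + 1404}{20 + 2000} = \frac{7}{10}$; $\bar{q} = 1 - \frac{7}{10} = \frac{3}{10}$
$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{10}{20} - \frac{1404}{2000}\right) - 0}{\sqrt{\frac{\left(\frac{7}{10}\right)\left(\frac{3}{10}\right)}{20} + \frac{\left(\frac{7}{10}\right)\left(\frac{3}{10}\right)}{2000}}} = -1.9615$
XLSTAT Outputs
z-test for two proportions / Two-tailed test:
Difference: -0.202
z (Observed value): -1.962
|z| (Critical value): 1.960
p-value (Two-tailed): 0.050
alpha: 0.050
95% confidence interval on the difference between the proportions: ( -0.422, 0.018 )
95% CI: $-0.422 < p_1 - p_2 < 0.0180$; population 1 = first sample, population 2 = second sample; which suggests that we should not reject $p_1 = p_2$ (because 0 is included).
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\frac{10}{20} - \frac{1404}{2000}\right) \pm 1.96\sqrt{\frac{\left(\frac{10}{20}\right)\left(\frac{10}{20}\right)}{20} + \frac{\left(\frac{1404}{2000}\right)\left(\frac{596}{2000}\right)}{2000}}$
The hypothesis test and confidence interval lead to different conclusions about the equality of $p_1 = p_2$.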
27. $H_0\colon p_1 = p_2$; $H_1\colon p_1 \ne p_2$; population 1 = ANSUR I 1988, population 2 = ANSUR II 2012; Test statistic: z = -22.59; P-value = 0.0000 (Table: P-value = 0.0002); Critical values: z = ±2.575; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the proportion of males in the sample in Data Set 1 "ANSUR I 1988" is the same as the proportion of males in the sample in Data Set 2 "ANSUR II 2012." The proportion of males in ANSUR II 2012 appears to be significantly higher than in ANSUR I 1988. The 99% CI is $-0.253 < p_1 - p_2 < -0.202$. (TI data: With sample proportions of 227/500 and 319/500, the test statistic is z = -5.84, P-value = 0.0000, and the confidence interval is $-0.264 < p_1 - p_2 < -0.104$.)
$$\bar{p} = \frac{1774 + 4082}{3982 + 6068} = \frac{976}{1675}; \quad \bar{q} = 1 - \frac{976}{1675} = \frac{699}{1675}$$
$$z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\frac{\bar{p}\bar{q}}{n_1} + \frac{\bar{p}\bar{q}}{n_2}}} = \frac{\left(\frac{1774}{3982} - \frac{4082}{6068}\right) - 0}{\sqrt{\frac{\left(\frac{976}{1675}\right)\left(\frac{699}{1675}\right)}{3982} + \frac{\left(\frac{976}{1675}\right)\left(\frac{699}{1675}\right)}{6068}}}$$
XLSTAT Output: z-test for two proportions / Two-tailed test: Difference -0.227; z (Observed value) -22.592; |z| (Critical value) 2.576; p-value (Two-tailed) <0.0001; alpha 0.010

Section 9-2: Two Means: Independent Samples
1. Only parts (b) and (c) describe independent samples.
2. a. Because the confidence interval does not include 0, it appears that there is a significant difference between the mean pulse rate in women and the mean pulse rate in men.
b. We have 95% confidence that the interval from 1.7 bpm to 7.2 bpm actually contains the value of the difference between the two population means ($\mu_1 - \mu_2$).
c. $-7.2\ \text{bpm} < \mu_1 - \mu_2 < -1.7\ \text{bpm}$
3. a. yes b. yes c. 98%
4. The critical value of t = 1.796 is more conservative than t = 1.727 in the sense that rejection of the null hypothesis requires a greater difference between the sample means. The sample evidence must be stronger with t = 1.796.
5. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = No candy, population 2 = Two candies; Test statistic: t = -4.084; P-value = 0.0001 (Table: P-value < 0.005); Critical value: t = -1.695 (Table: t = -1.729); Reject $H_0$. There is sufficient evidence to support the claim that giving candy does result in greater tips.
P-value: fx | =T.DIST(-4.084,31.036,1)
Critical Value: fx | =T.INV(0.05,31.036)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(18.95 - 21.62) - 0}{\sqrt{\frac{1.50^2}{20} + \frac{2.51^2}{20}}}$$
b. 90% CI: $-3.78 < \mu_1 - \mu_2 < -1.56$ (Table: $-3.80 < \mu_1 - \mu_2 < -1.54$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (18.95 - 21.62) \pm 1.729\sqrt{\frac{1.50^2}{20} + \frac{2.51^2}{20}} \quad (\text{df} = 20 - 1 = 19)$$
Critical value: fx | =T.INV.2T(0.10,31.036)
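The same unpooled standard error drives both the test statistic and the confidence interval in this section. The sketch below reproduces the Exercise 5 arithmetic in Python; the helper names are hypothetical, and the critical value 1.729 is taken from the table-based answer above rather than computed.

```python
import math


def unpooled_standard_error(s1, n1, s2, n2):
    """Standard error of x1_bar - x2_bar without pooling the variances."""
    return math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def two_mean_t(x1, s1, n1, x2, s2, n2, null_diff=0.0):
    """Test statistic for H0: mu1 - mu2 = null_diff."""
    return ((x1 - x2) - null_diff) / unpooled_standard_error(s1, n1, s2, n2)


def two_mean_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 given a critical t value."""
    margin = t_crit * unpooled_standard_error(s1, n1, s2, n2)
    return (x1 - x2) - margin, (x1 - x2) + margin


# Exercise 5 summary statistics: no candy vs. two candies.
t_stat = two_mean_t(18.95, 1.50, 20, 21.62, 2.51, 20)
ci_low, ci_high = two_mean_ci(18.95, 1.50, 20, 21.62, 2.51, 20, t_crit=1.729)
print(f"t = {t_stat:.3f}")
print(f"90% CI (table df = 19): ({ci_low:.2f}, {ci_high:.2f})")
```

Swapping in the technology-based critical value instead of 1.729 gives the slightly narrower interval listed first in part (b).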
6.
a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = Roman font, population 2 = Arial font; Test statistic: t = 0.189; P-value = 0.8510 (Table: P-value > 0.20); Critical values: t = ±2.017 (Table: t = ±2.069); Fail to reject $H_0$. There is not sufficient evidence to reject the claim that there is no significant difference in readability between Roman and Arial fonts. There doesn't appear to be a significant difference.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(66.25 - 65.62) - 0}{\sqrt{\frac{12.96^2}{24} + \frac{9.93^2}{24}}}$$
P-value: fx | =T.DIST.2T(0.189,43.083)
Critical Values: fx | =T.INV.2T(0.05,43.083)
b. 95% CI: $-6.09 < \mu_1 - \mu_2 < 7.35$ (Table: $-6.27 < \mu_1 - \mu_2 < 7.53$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (66.25 - 65.62) \pm 2.069\sqrt{\frac{12.96^2}{24} + \frac{9.93^2}{24}} \quad (\text{df} = 24 - 1 = 23)$$
Critical value: fx | =T.INV.2T(0.05,43.083)
7.
a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = Washing with soap, population 2 = Rubbing with alcohol; Test statistic: t = 2.095; P-value = 0.0196 (Table: P-value > 0.01); Critical value: t = 2.372 (Table: t = 2.403); Fail to reject $H_0$. There is not sufficient evidence to support the claim that rubbing with alcohol results in a lower bacteria count.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(69 - 35) - 0}{\sqrt{\frac{106^2}{55} + \frac{59^2}{59}}}$$
P-value: fx | =T.DIST.RT(2.095,83.231)
Critical Value: fx | =T.INV(1-0.01,83.231)
b. 98% CI: $-4 < \mu_1 - \mu_2 < 72$ (Table: $-5 < \mu_1 - \mu_2 < 73$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (69 - 35) \pm 2.403\sqrt{\frac{106^2}{55} + \frac{59^2}{59}} \quad (\text{df} = 55 - 1 = 54)$$
Critical value: fx | =T.INV.2T(0.02,83.231)
c. Yes, the null hypothesis is rejected with a 0.05 significance level, so the conclusion changes to this: There is sufficient evidence to support the claim that rubbing with alcohol results in a lower bacteria count.
8.
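a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = girls, population 2 = boys; Test statistic: t = -3.450; P-value = 0.0003 (Table: P-value < 0.005); Critical value: t = -2.336 (Table: t = -2.345); Reject $H_0$. There is sufficient evidence to support the claim that at birth, girls have a lower mean weight than boys.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(3037.1 - 3272.8) - 0}{\sqrt{\frac{706.3^2}{205} + \frac{660.2^2}{195}}}$$
P-value: fx | =T.DIST(-3.450,397.880,1)
Critical Value: fx | =T.INV(0.01,397.880)
b. 98% CI: $-395.3\ \text{g} < \mu_1 - \mu_2 < -76.1\ \text{g}$ (Table: $-395.5\ \text{g} < \mu_1 - \mu_2 < -75.5\ \text{g}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (3037.1 - 3272.8) \pm 2.345\sqrt{\frac{706.3^2}{205} + \frac{660.2^2}{195}} \quad (\text{df} = 195 - 1 = 194)$$
Critical value: fx | =T.INV.2T(0.02,397.880)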
9.
a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = red background, population 2 = blue background; Test statistic: t = 2.647; P-value = 0.0101 (Table: P-value < 0.02); Critical values: t = ±1.995 (Table: t = ±2.032); Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the samples are from populations with the same mean. Color does appear to have an effect on word recall scores. Red appears to be associated with higher word memory recall scores.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(15.89 - 12.31) - 0}{\sqrt{\frac{5.90^2}{35} + \frac{5.48^2}{36}}}$$
P-value: fx | =T.DIST.2T(2.647,41.775)
Critical Values: fx | =T.INV.2T(0.05,41.775)
b. 95% CI: $0.88 < \mu_1 - \mu_2 < 6.28$ (Table: $0.83 < \mu_1 - \mu_2 < 6.33$); The confidence interval includes positive numbers only, so the mean score with a red background appears to be greater than the mean score with a blue background.
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (15.89 - 12.31) \pm 2.032\sqrt{\frac{5.90^2}{35} + \frac{5.48^2}{36}} \quad (\text{df} = 35 - 1 = 34)$$
Critical Value: fx | =T.INV.2T(0.05,41.775)
c. The background color does appear to have an effect on word recall scores. Red appears to be associated with higher word memory recall scores.
10. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = red background, population 2 = blue background; Test statistic: t = -2.979; P-value = 0.0021 (Table: P-value < 0.005); Critical value: t = -2.392 (Table: t = -2.441); Reject $H_0$. There is sufficient evidence to support the claim that blue enhances performance on a creative task.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(3.39 - 3.97) - 0}{\sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}}}$$
P-value: fx | =T.DIST(-2.979,58.112,1)
Critical Value: fx | =T.INV(0.01,58.112)
b. 98% CI: $-1.05 < \mu_1 - \mu_2 < -0.11$ (Table: $-1.06 < \mu_1 - \mu_2 < -0.10$); The confidence interval consists of negative numbers only and does not include 0, so the mean creativity score with the red background appears to be less than the mean creativity score with the blue background. It appears that blue enhances performance on a creative task.
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (3.39 - 3.97) \pm 2.441\sqrt{\frac{0.97^2}{35} + \frac{0.63^2}{36}} \quad (\text{df} = 35 - 1 = 34)$$
Critical Value: fx | =T.INV.2T(0.02,58.112)
11. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = magnet treatment, population 2 = sham treatment; Test statistic: t = 0.132; P-value = 0.4480 (Table: P-value > 0.10); Critical value: t = 1.691 (Table: t = 1.729); Fail to reject $H_0$. There is not sufficient evidence to support the claim that the magnets are effective in reducing pain.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(0.49 - 0.44) - 0}{\sqrt{\frac{0.96^2}{20} + \frac{1.4^2}{20}}}$$
P-value: fx | =T.DIST.RT(0.132,33.633)
Critical Value: fx | =T.INV(1-0.05,33.633)
b. 90% CI: $-0.59 < \mu_1 - \mu_2 < 0.69$ (Table: $-0.61 < \mu_1 - \mu_2 < 0.71$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (0.49 - 0.44) \pm 1.729\sqrt{\frac{0.96^2}{20} + \frac{1.4^2}{20}} \quad (\text{df} = 20 - 1 = 19)$$
Critical Value: fx | =T.INV.2T(0.10,33.633)
c. Magnets do not appear to be effective in treating back pain. It is valid to argue that the magnets might appear to be effective if the sample sizes were larger.
12. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = exposed, population 2 = not exposed; Test statistic: t = 1.845; P-value = 0.0352 (Table: P-value < 0.05); Critical value: t = 1.673 (Table: t = 1.685); Reject $H_0$. There is sufficient evidence to support the claim that nonsmokers exposed to tobacco smoke have a higher mean cotinine level than nonsmokers not exposed to tobacco smoke.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(60.58 - 16.35) - 0}{\sqrt{\frac{138.08^2}{40} + \frac{62.53^2}{40}}}$$
P-value: fx | =T.DIST.RT(1.845,54.350)
Critical Value: fx | =T.INV(1-0.05,54.350)
b. 90% CI: $4.12\ \text{ng/mL} < \mu_1 - \mu_2 < 84.34\ \text{ng/mL}$ (Table: $3.85\ \text{ng/mL} < \mu_1 - \mu_2 < 84.61\ \text{ng/mL}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (60.58 - 16.35) \pm 1.685\sqrt{\frac{138.08^2}{40} + \frac{62.53^2}{40}} \quad (\text{df} = 40 - 1 = 39)$$
Critical Value: fx | =T.INV.2T(0.10,54.350)
c. Exposure to secondhand smoke appears to be associated with greater amounts of nicotine than for those not exposed to secondhand smoke.
13. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = Heavier bicycle, population 2 = Lighter bicycle; Test statistic: t = -0.393; P-value = 0.6959 (Table: P-value > 0.20); Critical values: t = ±2.012 (Table: t = ±2.060); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the mean commuting time with the heavier bicycle is the same as the mean commuting time with the lighter bicycle.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(107.8 - 108.4) - 0}{\sqrt{\frac{4.9^2}{30} + \frac{6.3^2}{26}}}$$
P-value: fx | =T.DIST.2T(abs(-0.393),46.959)
Critical Values: fx | =T.INV.2T(0.05,46.959)
b. 95% CI: $-3.7\ \text{min} < \mu_1 - \mu_2 < 2.5\ \text{min}$; Because the confidence interval includes 0, there is not sufficient evidence to warrant rejection of the claim that the two samples are from populations with the same mean.
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (107.8 - 108.4) \pm 2.060\sqrt{\frac{4.9^2}{30} + \frac{6.3^2}{26}} \quad (\text{df} = 26 - 1 = 25)$$
Critical Value: fx | =T.INV.2T(0.05,46.959)
14. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = low lead, population 2 = high lead; Test statistic: t = 2.282; P-value = 0.0132 (Table: P-value < 0.05); Critical value: t = 1.673 (Table: t = 1.725); Reject $H_0$. There is sufficient evidence to support the claim that the mean IQ score of people with low blood lead levels is higher than the mean IQ score of people with high blood lead levels.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(92.88462 - 86.90476) - 0}{\sqrt{\frac{15.34451^2}{78} + \frac{8.988352^2}{21}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Upper-tailed test: Difference 5.980; t (Observed value) 2.282; t (Critical value) 1.673; DF 54.917; p-value (one-tailed) 0.013; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (1.596, 10.364)
b. 90% CI: $1.6 < \mu_1 - \mu_2 < 10.4$ (Table: $1.5 < \mu_1 - \mu_2 < 10.5$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (92.88462 - 86.90476) \pm 1.725\sqrt{\frac{15.34451^2}{78} + \frac{8.988352^2}{21}} \quad (\text{df} = 21 - 1 = 20)$$
c. Yes, it does appear that exposure to lead has an effect on IQ scores.
15. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = pre-1964, population 2 = post-1964; Test statistic: t = 32.771; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = 1.667 (Table: t = 1.685); Reject $H_0$. There is sufficient evidence to support the claim that pre-1964 quarters have a mean weight that is greater than the mean weight of post-1964 quarters.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(6.19267 - 5.63930) - 0}{\sqrt{\frac{0.08700^2}{40} + \frac{0.06194^2}{40}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Upper-tailed test: Difference 0.553; t (Observed value) 32.773; t (Critical value) 1.667; DF 70.455; p-value (one-tailed) <0.0001; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (0.52523, 0.58152)
b. 90% CI: $0.52522\ \text{g} < \mu_1 - \mu_2 < 0.58152\ \text{g}$ (Table: $0.52492\ \text{g} < \mu_1 - \mu_2 < 0.58182\ \text{g}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (6.19267 - 5.63930) \pm 1.685\sqrt{\frac{0.08700^2}{40} + \frac{0.06194^2}{40}} \quad (\text{df} = 40 - 1 = 39)$$
c. Yes, vending machines are not affected very much because pre-1964 quarters are mostly out of circulation.
16. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = Disney movies, population 2 = other movies; Test statistic: t = 0.462; P-value = 0.6465 (Table: P-value > 0.20); Critical values: t = ±2.012 (Table: t = ±2.120); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that Disney animated children's movies and other animated children's movies have the same mean time showing tobacco use.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(61.6 - 49.3) - 0}{\sqrt{\frac{118.8^2}{33} + \frac{69.3^2}{17}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Two-tailed test: Difference 12.342; t (Observed value) 0.463; |t| (Critical value) 2.012; DF 47.109; p-value (Two-tailed) 0.645; alpha 0.050
XLSTAT Output for Part (b): 95% confidence interval on the difference between the means: (-41.265, 65.949)
b. 95% CI: $-41.3\ \text{sec} < \mu_1 - \mu_2 < 65.9\ \text{sec}$ (Table: $-44.2\ \text{sec} < \mu_1 - \mu_2 < 68.8\ \text{sec}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (61.6 - 49.3) \pm 2.120\sqrt{\frac{118.8^2}{33} + \frac{69.3^2}{17}} \quad (\text{df} = 17 - 1 = 16)$$
c. The times appear to be from a population with a distribution that is not normal (the sample is right-skewed), but the methods in this section are robust against departures from normality. (Results obtained by using other methods confirm that the results obtained here are quite good, even though the non-Disney times appear to violate the normality requirement.)
17. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = ANSUR I, population 2 = ANSUR II; Test statistic: t = -1.085; P-value = 0.1442 (Table: P-value > 0.10); Critical value: t = -1.708 (Table: t = -1.796); Fail to reject $H_0$. There is not sufficient evidence to support the claim that the mean weight of the 1988 male population is less than the mean weight of the 2012 male population.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(77.67 - 82.25) - 0}{\sqrt{\frac{9.43^2}{12} + \frac{12.46^2}{15}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Lower-tailed test: Difference -4.572; t (Observed value) -1.085; t (Critical value) -1.708; DF 24.949; p-value (one-tailed) 0.144; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-11.770, 2.627)
b. 90% CI: $-11.77\ \text{kg} < \mu_1 - \mu_2 < 2.63\ \text{kg}$ (Table: $-12.14\ \text{kg} < \mu_1 - \mu_2 < 3.00\ \text{kg}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (77.67 - 82.25) \pm 1.796\sqrt{\frac{9.43^2}{12} + \frac{12.46^2}{15}} \quad (\text{df} = 12 - 1 = 11)$$
18. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = Two lines, population 2 = One line; Test statistic: t = -0.158; P-value = 0.8749 (Table: P-value > 0.20); Critical values: t = ±2.703 (Table: t = ±2.831); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that cars in two queues have a mean waiting time equal to that of cars in a single queue.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(411.0 - 423.4) - 0}{\sqrt{\frac{295.0^2}{22} + \frac{244.3^2}{28}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Two-tailed test: Difference -12.357; t (Observed value) -0.158; |t| (Critical value) 2.703; DF 40.568; p-value (Two-tailed) 0.875; alpha 0.010
XLSTAT Output for Part (b): 99% confidence interval on the difference between the means: (-223.222, 198.508)
b. 99% CI: $-223.2\ \text{sec} < \mu_1 - \mu_2 < 198.5\ \text{sec}$ (Table: $-233.2\ \text{sec} < \mu_1 - \mu_2 < 208.5\ \text{sec}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (411.0 - 423.4) \pm 2.831\sqrt{\frac{295.0^2}{22} + \frac{244.3^2}{28}} \quad (\text{df} = 22 - 1 = 21)$$
19. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = Regular Coke, population 2 = Diet Coke; Test statistic: t = 16.084; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = 2.538 (Table: t = 2.718); Reject $H_0$. There is sufficient evidence to support the claim that the contents of cans of regular Coke have weights with a mean that is greater than the mean for Diet Coke.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(0.81776 - 0.78459) - 0}{\sqrt{\frac{0.00606^2}{12} + \frac{0.00437^2}{16}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Upper-tailed test: Difference 0.033; t (Observed value) 16.084; t (Critical value) 2.538; DF 19.114; p-value (one-tailed) <0.0001; alpha 0.010
XLSTAT Output for Part (b): 98% confidence interval on the difference between the means: (0.02794, 0.03841)
b. 98% CI: $0.02794\ \text{lb} < \mu_1 - \mu_2 < 0.03841\ \text{lb}$ (Table: $0.02756\ \text{lb} < \mu_1 - \mu_2 < 0.03878\ \text{lb}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (0.81776 - 0.78459) \pm 2.718\sqrt{\frac{0.00606^2}{12} + \frac{0.00437^2}{16}} \quad (\text{df} = 12 - 1 = 11)$$
c. The contents in cans of regular Coke appear to weigh more, probably due to the sugar present in regular Coke but not Diet Coke.
20. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = easy to difficult, population 2 = difficult to easy; Test statistic: t = -2.657; P-value = 0.0114 (Table: P-value < 0.02); Critical values ($\alpha = 0.05$): t = ±2.023 (Table: t = ±2.131); The conclusion depends on the choice of the significance level. There is a significant difference between the two population means at the 0.05 significance level, but not at the 0.01 significance level.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(27.115 - 31.728) - 0}{\sqrt{\frac{6.857^2}{25} + \frac{4.260^2}{16}}} = -2.657 \quad (\text{df} = 15)$$
XLSTAT Output: t-test for two independent samples / Two-tailed test: Difference -4.613; t (Observed value) -2.657; |t| (Critical value) 2.023; DF 38.988; p-value (Two-tailed) 0.011; alpha 0.050
21. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = ANSUR I, population 2 = ANSUR II; Test statistic: t = -20.393; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = -1.645; Reject $H_0$. There is sufficient evidence to support the claim that the mean weight of the 1988 male population is less than the mean weight of the 2012 male population. It does appear that the male population is getting heavier. (TI data: Test statistic is t = -5.282, P-value = 0.0000, critical value is t = -1.648.)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(78.49 - 85.52) - 0}{\sqrt{\frac{11.12^2}{1774} + \frac{14.22^2}{4082}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Lower-tailed test: Difference -7.037; t (Observed value) -20.393; t (Critical value) -1.645; DF 4259.990; p-value (one-tailed) <0.0001; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-7.605, -6.469)
b. 90% CI: $-7.60\ \text{kg} < \mu_1 - \mu_2 < -6.47\ \text{kg}$ (TI data: $-7.50\ \text{kg} < \mu_1 - \mu_2 < -3.93\ \text{kg}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (78.49 - 85.52) \pm 1.645\sqrt{\frac{11.12^2}{1774} + \frac{14.22^2}{4082}} \quad (\text{df} = 1774 - 1 = 1773)$$
22. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = ANSUR I, population 2 = ANSUR II; Test statistic: t = -0.212; P-value = 0.4159 (Table: P-value > 0.10); Critical value: t = -1.645; Fail to reject $H_0$. There is not sufficient evidence to support the claim that the mean height of the 1988 male population is less than the mean height of the 2012 male population. The larger samples do not provide sufficient evidence to suggest that people are getting taller. (TI data: Test statistic is t = -0.659 and P-value = 0.2552.)
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(1755.8 - 1756.2) - 0}{\sqrt{\frac{66.8^2}{1774} + \frac{68.6^2}{4082}}}$$
XLSTAT Output for Exercise 22: t-test for two independent samples / Lower-tailed test: Difference -0.407; t (Observed value) -0.212; t (Critical value) -1.645; DF 3452.698; p-value (one-tailed) 0.416; alpha 0.050
23. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 < \mu_2$; population 1 = men, population 2 = women; Test statistic: t = -0.132; P-value = 0.4477 (Table: P-value > 0.10); Critical value: t = -1.669 (Table: t = -1.688); Fail to reject $H_0$. There is not sufficient evidence to support the claim that men talk less than women.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(14{,}060.38 - 14{,}296.69) - 0}{\sqrt{\frac{9065.03^2}{37} + \frac{6440.97^2}{42}}} = -0.132 \quad (\text{df} = 36)$$
XLSTAT Output for Exercise 23: t-test for two independent samples / Lower-tailed test: Difference -236.312; t (Observed value) -0.132; t (Critical value) -1.669; DF 64.023; p-value (one-tailed) 0.448; alpha 0.050
24. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = Two lines, population 2 = One line; Test statistic: t = 0.246; P-value = 0.8062 (Table: P-value > 0.20); Critical values: t = ±2.606 (Table: t = ±2.639); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that cars in two queues have a mean waiting time equal to that of cars in a single queue.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(437.8 - 429.4) - 0}{\sqrt{\frac{240.3^2}{85} + \frac{200.6^2}{85}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Two-tailed test: Difference 8.341; t (Observed value) 0.246; |t| (Critical value) 2.606; DF 162.803; p-value (Two-tailed) 0.806; alpha 0.010
XLSTAT Output for Part (b): 99% confidence interval on the difference between the means: (-80.145, 96.827)
b. 99% CI: $-80.1\ \text{sec} < \mu_1 - \mu_2 < 96.8\ \text{sec}$ (Table: $-81.3\ \text{sec} < \mu_1 - \mu_2 < 97.9\ \text{sec}$)
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (437.8 - 429.4) \pm 2.639\sqrt{\frac{240.3^2}{85} + \frac{200.6^2}{85}} \quad (\text{df} = 85 - 1 = 84)$$
25. a. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 > \mu_2$; population 1 = low lead, population 2 = high lead; Test statistic: t = 1.705; P-value = 0.0457 (Table: P-value < 0.05); Critical value: t = 1.661 (Table: t = 1.987); df = 78 + 21 - 2 = 97; Reject $H_0$. There is sufficient evidence to support the claim that the mean IQ score of people with low blood lead levels is higher than the mean IQ score of people with high blood lead levels.
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(78 - 1)15.34451^2 + (21 - 1)8.988352^2}{(78 - 1) + (21 - 1)} = 203.564$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{(92.88462 - 86.90476) - 0}{\sqrt{\frac{203.564}{78} + \frac{203.564}{21}}}$$
XLSTAT Output for Part (a): t-test for two independent samples / Upper-tailed test: Difference 5.980; t (Observed value) 1.705; t (Critical value) 1.661; DF 97; p-value (one-tailed) 0.046; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (0.155, 11.805)
b. 90% CI: $0.15 < \mu_1 - \mu_2 < 11.81$
$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = (92.88462 - 86.90476) \pm 1.662\sqrt{\frac{203.564}{78} + \frac{203.564}{21}} \quad (\text{df} = 97)$$
c. Yes, it does appear that exposure to lead has an effect on IQ scores. With pooling, df increases dramatically to 97, but the test statistic decreases from 2.282 to 1.705 (because the estimated standard deviation increases from 2.620268 to 3.507614), the P-value increases to 0.0457, and the 90% confidence interval becomes wider. With pooling, these results do not show greater significance.
26. df = 38.9884; Using "df = smaller of $n_1 - 1$ and $n_2 - 1$" is a more conservative estimate of the number of degrees of freedom (than the estimate obtained with Formula 9-1) in the sense that the confidence interval is wider, so the difference between the sample means needs to be more extreme to be considered a significant difference.
$$\text{df} = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \frac{\left(s_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\frac{6.857^2}{25} + \frac{4.260^2}{16}\right)^2}{\frac{\left(6.857^2/25\right)^2}{25 - 1} + \frac{\left(4.260^2/16\right)^2}{16 - 1}} = 38.9884$$
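The degrees-of-freedom calculation in Exercise 26 can be checked numerically. The following Python sketch evaluates Formula 9-1 alongside the simpler "smaller of $n_1 - 1$ and $n_2 - 1$" rule; the function names are illustrative only.

```python
def formula_9_1_df(s1, n1, s2, n2):
    """Degrees of freedom from Formula 9-1 (unequal variances)."""
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))


def conservative_df(n1, n2):
    """Simple rule: the smaller of n1 - 1 and n2 - 1."""
    return min(n1 - 1, n2 - 1)


# Exercise 26 uses the Exercise 20 summary statistics.
df_formula = formula_9_1_df(6.857, 25, 4.260, 16)
df_simple = conservative_df(25, 16)
print(f"Formula 9-1 df = {df_formula:.4f}")
print(f"Conservative df = {df_simple}")
```

Because the simple rule gives fewer degrees of freedom, its critical value is larger and the resulting interval is wider, which is why the answer calls it conservative.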
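27. $H_0\colon \mu_1 = \mu_2$; $H_1\colon \mu_1 \ne \mu_2$; population 1 = treatment, population 2 = placebo; Test statistic: t = 15.322; P-value = 0.0000 (Table: P-value < 0.01); Critical values: t = ±2.080; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that the two populations have the same mean.
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(22 - 1)0.015^2 + (22 - 1)0^2}{(22 - 1) + (22 - 1)} = 0.0001125$$
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}}} = \frac{(0.049 - 0.000) - 0}{\sqrt{\frac{0.0001125}{22} + \frac{0.0001125}{22}}} = 15.322$$
P-value: fx | =T.DIST.2T(15.322,21)
Critical Values: fx | =T.INV.2T(0.05,21)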
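Exercises 25 and 27 both use a pooled variance. A short Python sketch of that calculation is given below, with hypothetical helper names; the Exercise 27 inputs assume $s_1 = 0.015$ and $s_2 = 0$, which is how the worked solution above appears to read.

```python
import math


def pooled_variance(s1, n1, s2, n2):
    """Weighted average of the two sample variances."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / ((n1 - 1) + (n2 - 1))


def pooled_t(x1, s1, n1, x2, s2, n2):
    """Test statistic for H0: mu1 = mu2 using the pooled variance."""
    sp2 = pooled_variance(s1, n1, s2, n2)
    return (x1 - x2) / math.sqrt(sp2 / n1 + sp2 / n2)


# Exercise 25: low lead vs. high lead IQ scores.
sp2_25 = pooled_variance(15.34451, 78, 8.988352, 21)
t_25 = pooled_t(92.88462, 15.34451, 78, 86.90476, 8.988352, 21)

# Exercise 27: treatment vs. placebo (assumed s1 = 0.015, s2 = 0).
t_27 = pooled_t(0.049, 0.015, 22, 0.000, 0.000, 22)

print(f"Exercise 25: pooled variance = {sp2_25:.3f}, t = {t_25:.3f}")
print(f"Exercise 27: t = {t_27:.3f}")
```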
Section 9-3: Matched Pairs
1. $H_0\colon \mu_d = 0$ admissions; $H_1\colon \mu_d < 0$ admissions; difference = 6th - 13th
2. a. $\bar{d} = -3.3$ admissions; $s_d = 3.0$ admissions
Friday 6th:    9    6   11   11    3    5
Friday 13th:  13   12   14   10    4   12
Difference:   -4   -6   -3    1   -1   -7
b. $\mu_d$ represents the mean of the differences in hospital admissions from the population of paired data.
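The summary statistics in Exercise 2 come directly from the paired differences. A minimal Python sketch, using only the six pairs listed in the table and illustrative variable names, shows the calculation along with the matched-pairs test statistic $t = \bar{d} / (s_d/\sqrt{n})$ for $\mu_d = 0$.

```python
import math
import statistics

friday_6th = [9, 6, 11, 11, 3, 5]
friday_13th = [13, 12, 14, 10, 4, 12]

# Difference = 6th - 13th for each matched pair.
differences = [a - b for a, b in zip(friday_6th, friday_13th)]

d_bar = statistics.mean(differences)   # mean of the differences
s_d = statistics.stdev(differences)    # sample standard deviation (n - 1)
n = len(differences)

# Test statistic for H0: mu_d = 0.
t_stat = d_bar / (s_d / math.sqrt(n))

print("differences:", differences)
print(f"d_bar = {d_bar:.1f}, s_d = {s_d:.1f}, t = {t_stat:.3f} (df = {n - 1})")
```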
3. a. 90%
b. $t_{0.05} = 2.015$ (df = n - 1 = 6 - 1 = 5)
c. Because the confidence interval limits do not include 0 admissions and the range of values consists of negative values only, there is sufficient evidence to support the claim that fewer hospital admissions due to traffic accidents occur on Friday the 6th than on the following Friday the 13th.
4. a. True
b. False, the sample size is sufficiently large.
c. False, the data are not matched pairs.
d. True, the significance level is the same.
e. False, the sample size is n = 100.
5. a.
$H_0\colon \mu_d = 0$ lb; $H_1\colon \mu_d > 0$ lb; difference = measured - reported; Test statistic: t = 0.407; P-value = 0.3469 (Table: P-value > 0.10); Critical value: t = 1.833; Fail to reject $H_0$. There is not sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{0.370 - 0}{2.877/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Part (a): t-test for two paired samples / Upper-tailed test: Difference 0.370; t (Observed value) 0.407; t (Critical value) 1.833; DF 9; p-value (one-tailed) 0.347; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-1.298, 2.038)
b. 90% CI: $-1.30\ \text{lb} < \mu_d < 2.04\ \text{lb}$; Because the confidence interval includes 0, fail to reject $H_0$. There is not sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 0.370 \pm 1.833\frac{2.877}{\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
6.
a. $H_0\colon \mu_d = 0$ words; $H_1\colon \mu_d < 0$ words; difference = men - women; Test statistic: t = -0.216; P-value = 0.4188 (Table: P-value > 0.10); Critical value: t = -2.015; Fail to reject $H_0$. There is not sufficient evidence to support the claim that men talk less than women.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-978 - 0}{11{,}100/\sqrt{6}} \quad (\text{df} = 6 - 1 = 5)$$
XLSTAT Output for Part (a): t-test for two paired samples / Lower-tailed test: Difference -978; t (Observed value) -0.216; t (Critical value) -2.015; DF 5; p-value (one-tailed) 0.419; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-10109.447, 8153.447)
b. 90% CI: $-10{,}109.4\ \text{words} < \mu_d < 8153.4\ \text{words}$ (Table: $-10{,}109.2\ \text{words} < \mu_d < 8153.2\ \text{words}$); Because the confidence interval does include 0 words, fail to reject $H_0$. There is not sufficient evidence to support the claim that men talk less than women.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = -978 \pm 2.015\frac{11{,}100}{\sqrt{6}} \quad (\text{df} = 6 - 1 = 5)$$
7.
a. $H_0\colon \mu_d = 0$ kg; $H_1\colon \mu_d < 0$ kg; difference = September - April; Test statistic: t = -4.438; P-value = 0.0011 (Table: P-value < 0.005); Critical value: t = -2.896; Reject $H_0$. There is sufficient evidence to support the claim that for the population of freshman male college students, the weights in September are less than the weights in the following April.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-0.889 - 0}{0.601/\sqrt{9}} \quad (\text{df} = 9 - 1 = 8)$$
XLSTAT Output for Part (a): t-test for two paired samples / Lower-tailed test: Difference -0.889; t (Observed value) -4.438; t (Critical value) -2.896; DF 8; p-value (one-tailed) 0.001; alpha 0.010
XLSTAT Output for Part (b): 98% confidence interval on the difference between the means: (-1.469, -0.309)
b. 98% CI: $-1.5\ \text{kg} < \mu_d < -0.3\ \text{kg}$; Because the confidence interval does not include 0 kg, reject $H_0$. There is sufficient evidence to support the claim that for the population of freshman male college students, the weights in September are less than the weights in the following April.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = -0.889 \pm 2.896\frac{0.601}{\sqrt{9}} \quad (\text{df} = 9 - 1 = 8)$$
c. The test does not address the specific weight gain of 15 lb, but it does suggest that males gain weight during their freshman year.
8.
a. $H_0\colon \mu_d = 0$; $H_1\colon \mu_d > 0$; difference = QWERTY - Dvorak; Test statistic: t = 6.013; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = 2.624; Reject $H_0$. There is sufficient evidence to support the claim that the word difficulty scores are higher with the QWERTY keyboard.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{2.867 - 0}{1.846/\sqrt{15}} \quad (\text{df} = 15 - 1 = 14)$$
XLSTAT Output for Part (a): t-test for two paired samples / Upper-tailed test: Difference 2.867; t (Observed value) 6.013; t (Critical value) 2.624; DF 14; p-value (one-tailed) <0.0001; alpha 0.010
XLSTAT Output for Part (b): 98% confidence interval on the difference between the means: (1.615, 4.118)
b. 98% CI: $1.6 < \mu_d < 4.1$; Because the confidence interval does not include 0, reject $H_0$. There is sufficient evidence to support the claim that word difficulty scores are higher with the QWERTY keyboard.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 2.867 \pm 2.624\frac{1.846}{\sqrt{15}} \quad (\text{df} = 15 - 1 = 14)$$
9.
a. $H_0\colon \mu_d = 0$; $H_1\colon \mu_d \ne 0$; difference = right ear - left ear; Test statistic: t = 0.890; P-value = 0.3924 (Table: P-value > 0.20); Critical values: t = ±2.201; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the population of differences has a mean equal to 0.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{2.08 - 0}{8.11/\sqrt{12}} \quad (\text{df} = 12 - 1 = 11)$$
XLSTAT Output for Part (a): t-test for two paired samples / Two-tailed test: Difference 2.083; t (Observed value) 0.890; |t| (Critical value) 2.201; DF 11; p-value (Two-tailed) 0.392; alpha 0.050
XLSTAT Output for Part (b): 95% confidence interval on the difference between the means: (-3.067, 7.234)
b. 95% CI: $-3.1 < \mu_d < 7.2$; Because the confidence interval includes 0, fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the population of differences has a mean equal to 0.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 2.08 \pm 2.201\frac{8.11}{\sqrt{12}} \quad (\text{df} = 12 - 1 = 11)$$
10. a. $H_0\colon \mu_d = 0$; $H_1\colon \mu_d \ne 0$; difference = right eye - left eye; Test statistic: t = 0.896; P-value = 0.3938 (Table: P-value > 0.20); Critical values: t = ±2.262; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the population of differences has a mean equal to 0.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{1.50 - 0}{5.30/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Part (a): t-test for two paired samples / Two-tailed test: Difference 1.500; t (Observed value) 0.896; |t| (Critical value) 2.262; DF 9; p-value (Two-tailed) 0.394; alpha 0.050
XLSTAT Output for Part (b): 95% confidence interval on the difference between the means: (-2.289, 5.289)
b. 95% CI: $-2.3 < \mu_d < 5.3$; Because the confidence interval includes 0, fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the population of differences has a mean equal to 0.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 1.50 \pm 2.262\frac{5.30}{\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
11. a. $H_0\colon \mu_d = 0$ years; $H_1\colon \mu_d < 0$ years; difference = actress - actor; Test statistic: t = -2.609; P-value = 0.0142 (Table: P-value < 0.025); Critical value: t = -1.833; Reject $H_0$. There is sufficient evidence to support the claim that for the population of ages of Best Actresses and Best Actors, the differences have a mean less than 0. There is sufficient evidence to conclude that Best Actresses are generally younger than Best Actors.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-9.70 - 0}{11.76/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Part (a): t-test for two paired samples / Lower-tailed test: Difference -9.700; t (Observed value) -2.609; t (Critical value) -1.833; DF 9; p-value (one-tailed) 0.014; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-16.515, -2.885)
b. 90% CI: $-16.5\ \text{years} < \mu_d < -2.9\ \text{years}$; The confidence interval consists of negative numbers only and does not include 0.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = -9.70 \pm 1.833\frac{11.76}{\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
12. a. $H_0\colon \mu_d = 0$ cm; $H_1\colon \mu_d > 0$ cm; difference = president - opponent; Test statistic: t = 1.304; P-value = 0.1246 (Table: P-value > 0.10); Critical value: t = 2.015; Fail to reject $H_0$. There is not sufficient evidence to support the claim that for the population of heights of presidents and their main opponents, the differences have a mean greater than 0 cm (or that presidents tend to be taller than their opponents).
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{3.67 - 0}{6.890/\sqrt{6}} \quad (\text{df} = 6 - 1 = 5)$$
XLSTAT Output for Part (a): t-test for two paired samples / Upper-tailed test: Difference 3.667; t (Observed value) 1.304; t (Critical value) 2.015; DF 5; p-value (one-tailed) 0.125; alpha 0.050
XLSTAT Output for Part (b): 90% confidence interval on the difference between the means: (-2.001, 9.334)
b. 90% CI: $-2.0\ \text{cm} < \mu_d < 9.3\ \text{cm}$; The confidence interval includes 0, so it is possible that $\mu_d = 0$.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 3.67 \pm 2.015\frac{6.890}{\sqrt{6}} \quad (\text{df} = 6 - 1 = 5)$$
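When only the summary statistics $\bar{d}$, $s_d$, and $n$ are available, the matched-pairs test statistic and confidence interval reduce to two short formulas. The Python sketch below applies them to the Exercise 12 values; the function names are illustrative, and the critical value 2.015 is taken from the answer above rather than computed.

```python
import math


def paired_t(d_bar, s_d, n, mu_d=0.0):
    """Matched-pairs test statistic (d_bar - mu_d) / (s_d / sqrt(n))."""
    return (d_bar - mu_d) / (s_d / math.sqrt(n))


def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for mu_d given a critical t value."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Exercise 12: president height minus main opponent height (cm).
t_stat = paired_t(3.67, 6.890, 6)
ci_low, ci_high = paired_ci(3.67, 6.890, 6, t_crit=2.015)
print(f"t = {t_stat:.3f} (df = 5)")
print(f"90% CI: ({ci_low:.1f}, {ci_high:.1f})")
```

The interval straddles 0, matching the fail-to-reject decision from the test.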
13. $H_0\colon \mu_d = 0$ in.; $H_1\colon \mu_d \ne 0$ in.; difference = mother - daughter; Test statistic: t = -1.379; P-value = 0.2013 (Table: P-value > 0.20); Critical values: t = ±2.262; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that there is no difference in heights between mothers and their first daughters.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-0.95 - 0}{2.179/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Exercise 13: t-test for two paired samples / Two-tailed test: Difference -0.950; t (Observed value) -1.379; |t| (Critical value) 2.262; DF 9; p-value (Two-tailed) 0.201; alpha 0.050
14. $H_0\colon \mu_d = 0$ in.; $H_1\colon \mu_d \ne 0$ in.; difference = father - son; Test statistic: t = 0.034; P-value = 0.9737 (Table: P-value > 0.20); Critical values: t = ±2.262; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that there is no difference in heights between fathers and their first sons.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{0.02 - 0}{1.863/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Exercise 14: t-test for two paired samples / Two-tailed test: Difference 0.020; t (Observed value) 0.034; |t| (Critical value) 2.262; DF 9; p-value (Two-tailed) 0.974; alpha 0.050
15. difference = before - after; 95% CI: $0.69 < \mu_d < 5.56$; Because the confidence interval limits do not contain 0 and they consist of positive values only, it appears that the "before" measurements are greater than the "after" measurements, so hypnotism does appear to be effective in reducing pain.
$$\bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = 3.125 \pm 2.365\frac{2.911}{\sqrt{8}} \quad (\text{df} = 8 - 1 = 7)$$
XLSTAT Output: 95% confidence interval on the difference between the means: (0.691, 5.559)
16. a. $H_0\colon \mu_d = 0$ hours; $H_1\colon \mu_d \ne 0$ hours; difference = Dextro - Laevo; Test statistic: t = -4.672; P-value = 0.0012 (Table: P-value < 0.01); Critical values: t = ±3.250; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that both treatments have the same effect.
$$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-1.670 - 0}{1.130/\sqrt{10}} \quad (\text{df} = 10 - 1 = 9)$$
XLSTAT Output for Part (a): t-test for two paired samples / Two-tailed test: Difference -1.670; t (Observed value) -4.672; |t| (Critical value) 3.250; DF 9; p-value (Two-tailed) 0.001; alpha 0.010
XLSTAT Output for Part (b): 99% confidence interval on the difference between the means: (-2.832, -0.508)
Copyright © 2022 Pearson Education, Inc.
b. 99% CI: $-2.83$ hours $< \mu_d < -0.51$ hours; Because the confidence interval does not include 0, reject H0. There is sufficient evidence to warrant rejection of the claim that both treatments have the same effect.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = -1.670 \pm 3.250 \cdot \dfrac{1.130}{\sqrt{10}}$ (df = 10 − 1 = 9)
17. a. The larger data set changed the results and conclusion. H0: $\mu_d = 0$ lb; H1: $\mu_d > 0$ lb; difference = measured − reported; Test statistic: t = 17.611; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = 1.645; Reject H0. There is sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights. (TI data: Test statistic: t = 4.864, critical value: t = 1.651, P-value = 0.0000.)
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{3.245 - 0}{10.045/\sqrt{2971}}$ (df = 2971 − 1 = 2970)
XLSTAT Output for Part (a)
t-test for two paired samples / Upper-tailed test:
Difference: 3.245
t (Observed value): 17.611
t (Critical value): 1.645
DF: 2970
p-value (one-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Part (b) 90% confidence interval on the difference between the means: ( 2.942, 3.549 )
b. 90% CI: $2.94$ lb $< \mu_d < 3.55$ lb; Because the confidence interval does not include 0, reject H0. There is sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights. (Table: $1.02$ lb $< \mu_d < 2.08$ lb)
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = 3.245 \pm 1.645 \cdot \dfrac{10.045}{\sqrt{2971}}$ (df = 2971 − 1 = 2970)
18. The larger data set changed the results and conclusion. H0: $\mu_d = 0$ lb; H1: $\mu_d > 0$ lb; difference = measured − reported; Test statistic: t = 3.879; P-value = 0.0001 (Table: P-value < 0.005); Critical value: t = 1.645; Reject H0. There is sufficient evidence to support the claim that for males, the measured weights tend to be higher than the reported weights. (TI data: Test statistic: t = 0.773, critical value: t = 1.651, P-value = 0.2202; Using the 247 matched pairs did not have much of an effect on the conclusion from Example 1.)
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{0.811 - 0}{11.034/\sqrt{2784}}$ (df = 2784 − 1 = 2783)
XLSTAT Output
t-test for two paired samples / Upper-tailed test:
Difference: 0.811
t (Observed value): 3.879
t (Critical value): 1.645
DF: 2783
p-value (one-tailed): <0.0001
alpha: 0.050
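The matched-pairs test statistics in this section all follow the same pattern, which can be sketched in a few lines of Python. This is only an illustration from the rounded summary values quoted in Exercise 18; the function name `paired_t_statistic` is hypothetical, and because the manual works from the original data, the last digit may differ slightly from the printed value.

```python
import math

def paired_t_statistic(d_bar, s_d, n, mu_d=0.0):
    """Return (t, df) for a matched-pairs t test from summary statistics.

    d_bar: mean of the paired differences
    s_d:   standard deviation of the paired differences
    n:     number of matched pairs
    mu_d:  hypothesized mean difference (0 under H0 in these exercises)
    """
    standard_error = s_d / math.sqrt(n)
    t = (d_bar - mu_d) / standard_error
    return t, n - 1

# Rounded summary values quoted in Exercise 18 (measured - reported, males).
t, df = paired_t_statistic(d_bar=0.811, s_d=11.034, n=2784)
print(f"t = {t:.3f}, df = {df}")
```

The resulting t is then compared with the critical value (1.645 for this upper-tailed test at the 0.05 level), exactly as in the written solution.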
19. a. H0: $\mu_d = 0$ years; H1: $\mu_d < 0$ years; difference = actress − actor; Test statistic: t = −5.269; P-value = 0.0000 (Table: P-value < 0.005); Critical value: t = −1.662; Reject H0. There is sufficient evidence to support the claim that actresses are generally younger than actors.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-7.75 - 0}{14.03/\sqrt{91}}$ (df = 91 − 1 = 90)
XLSTAT Output for Part (a)
t-test for two paired samples / Lower-tailed test:
Difference: -7.747
t (Observed value): -5.269
t (Critical value): -1.662
DF: 90
p-value (one-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Part (b) 90% confidence interval on the difference between the means: ( -10.191, -5.304 )
b. 90% CI: $-10.2$ years $< \mu_d < -5.3$ years; The confidence interval consists of negative numbers only and does not include 0.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = -7.75 \pm 1.662 \cdot \dfrac{14.03}{\sqrt{91}}$ (df = 91 − 1 = 90)
20. a. H0: $\mu_d = 0$ cm; H1: $\mu_d > 0$ cm; difference = president − opponent; Test statistic: t = 0.399; P-value = 0.3461 (Table: P-value > 0.10); Critical value: t = 1.691; Fail to reject H0. There is not sufficient evidence to support the claim that for the population of heights of presidents and their main opponents, the differences have a mean greater than 0 cm. There is not sufficient evidence to support the claim that presidents tend to be taller than their main opponents.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{0.69 - 0}{10.16/\sqrt{35}}$ (df = 35 − 1 = 34)
XLSTAT Output for Part (a)
t-test for two paired samples / Upper-tailed test:
Difference: 0.686
t (Observed value): 0.399
t (Critical value): 1.691
DF: 34
p-value (one-tailed): 0.346
alpha: 0.050
XLSTAT Output for Part (b) 90% confidence interval on the difference between the means: ( -2.219, 3.590 )
b. 90% CI: $-2.2$ cm $< \mu_d < 3.6$ cm; The confidence interval includes 0, so it is possible that $\mu_d = 0$ cm.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = 0.69 \pm 1.691 \cdot \dfrac{10.16}{\sqrt{35}}$ (df = 35 − 1 = 34)
21. H0: $\mu_d = 0$ cm; H1: $\mu_d \neq 0$ cm; difference = height − arm span; Test statistic: t = −77.006; P-value = 0.0000 (Table: P-value < 0.005); Critical values: t = ±1.961; Reject H0. There is sufficient evidence to warrant rejection of the claim that for males, their height is the same as their arm span. (TI data: Test statistic: t = −22.16, critical values: t = ±1.967, P-value = 0.0000.)
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-57.940 - 0}{48.072/\sqrt{4082}}$ (df = 4082 − 1 = 4081)
XLSTAT Output
t-test for two paired samples / Two-tailed test:
Difference: -57.940
t (Observed value): -77.006
|t| (Critical value): 1.961
DF: 4081
p-value (Two-tailed): <0.0001
alpha: 0.050
22. a. H0: $\mu_d = 0$ words; H1: $\mu_d < 0$ words; difference = male − female; Test statistic: t = −1.560; P-value = 0.0622 (Table: P-value > 0.05); Critical value: t = −1.673 (Table: $-1.676 < t < -1.671$); Fail to reject H0. There is not sufficient evidence to support the claim that among couples, males speak fewer words in a day than females.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-1867.1 - 0}{8955.2/\sqrt{56}}$ (df = 56 − 1 = 55)
XLSTAT Output for Part (a)
t-test for two paired samples / Lower-tailed test:
Difference: -1867.107
t (Observed value): -1.560
t (Critical value): -1.673
DF: 55
p-value (one-tailed): 0.062
alpha: 0.050
XLSTAT Output for Part (b) 90% confidence interval on the difference between the means: ( -3869.198, 134.984 )
b. 90% CI: $-3869.2$ words $< \mu_d < 135.0$ words (Table: $-3872.7$ words $< \mu_d < 138.5$ words); Because the confidence interval does include 0 words, fail to reject H0. There is not sufficient evidence to support the claim that men talk less than women.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = -1867.1 \pm 1.673 \cdot \dfrac{8955.2}{\sqrt{56}}$ (df = 56 − 1 = 55)
23. H0: $\mu_d = 0$ in.; H1: $\mu_d \neq 0$ in.; difference = mother − daughter; Test statistic: t = −4.090; P-value = 0.0001 (Table: P-value < 0.01); Critical values: t = ±1.978 (Table: t ≈ ±1.974); Reject H0. There is sufficient evidence to warrant rejection of the claim of no difference in heights between mothers and their first daughters.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-0.93 - 0}{2.636/\sqrt{134}}$ (df = 134 − 1 = 133)
XLSTAT Output for Exercise 23
t-test for two paired samples / Two-tailed test:
Difference: -0.931
t (Observed value): -4.089
|t| (Critical value): 1.978
DF: 133
p-value (Two-tailed): <0.0001
alpha: 0.050
XLSTAT Output for Exercise 24
t-test for two paired samples / Two-tailed test:
Difference: -1.366
t (Observed value): -6.347
|t| (Critical value): 1.978
DF: 133
p-value (Two-tailed): <0.0001
alpha: 0.050
24. H0: $\mu_d = 0$ in.; H1: $\mu_d \neq 0$ in.; difference = father − son; Test statistic: t = −6.347; P-value = 0.0000 (Table: P-value < 0.01); Critical values: t = ±1.978 (Table: t ≈ ±1.984); Reject H0. There is sufficient evidence to warrant rejection of the claim of no difference in heights between fathers and their first sons.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-1.366 - 0}{2.491/\sqrt{134}}$ (df = 134 − 1 = 133)
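The confidence intervals for matched pairs in this section all use $\bar{d} \pm t_{\alpha/2}\, s_d/\sqrt{n}$. As a rough check, the sketch below recomputes the 99% interval of Exercise 16(b) from the rounded summary values printed in that solution. The function name `paired_confidence_interval` is hypothetical, and the critical value is passed in by hand (it is taken from the manual, not computed here).

```python
import math

def paired_confidence_interval(d_bar, s_d, n, t_crit):
    """Return (lower, upper) limits of the CI for the mean paired difference.

    t_crit is the two-tailed critical value t_{alpha/2} for df = n - 1,
    looked up separately (for example with Excel's T.INV.2T).
    """
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

# Rounded values quoted in Exercise 16(b): d-bar = -1.670, s_d = 1.130, n = 10.
lower, upper = paired_confidence_interval(d_bar=-1.670, s_d=1.130, n=10, t_crit=3.250)
print(f"{lower:.3f} < mu_d < {upper:.3f}")
```

Because both limits are negative, the interval excludes 0, which matches the "reject H0" conclusion of the two-tailed test in part (a).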
25. a. difference = father − son; 95% CI: $-1.31$ in. $< \mu_d < 1.35$ in.; Because the confidence interval includes the value of 0 in., fail to reject H0. There is not sufficient evidence to warrant rejection of the claim of no difference in heights between fathers and their first sons. There does not appear to be a significant difference.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = 0.02 \pm 2.262 \cdot \dfrac{1.863}{\sqrt{10}}$ (df = 10 − 1 = 9)
XLSTAT Output
95% confidence interval on the difference between the means: ( -1.313, 1.353 )
b. Answers vary, but here is a typical 95% CI: $-0.98$ in. $< \mu_d < 1.20$ in. Because the confidence interval includes the value of 0 in., fail to reject H0. There is not sufficient evidence to warrant rejection of the claim of no difference in heights between fathers and their first sons. There does not appear to be a significant difference.
c. The two confidence intervals do not differ by amounts that are very substantial. The conclusions from the two confidence intervals are the same. It appears that the confidence interval constructed using the t distribution and the confidence interval constructed using the bootstrap method are reasonably consistent with each other.
Section 9-4: Two Variances or Standard Deviations
1. a. No, the numerator will always be larger than the denominator in the fraction.
b. No, both variances are nonnegative, so their quotient cannot be negative.
c. The two samples have standard deviations (or variances) that are very close in value.
d. skewed right
2. a. $s_1^2 = 4121.6$ cm² and $s_2^2 = 4045.0$ cm²
b. H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$ or H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = ANSUR I females, population2 = ANSUR II females
c. $F = s_1^2 / s_2^2 = 64.2^2 / 63.6^2 = 1.0189$
d. Fail to reject H0. There is not sufficient evidence to support the claim that heights of female Army personnel in 1988 and 2012 have different amounts of variation.
3. No, unlike some other tests that have a requirement that samples must be from normally distributed populations or the samples must have more than 30 values, the F test has a requirement that the samples must be from normally distributed populations, regardless of how large the samples are.
4. The F test is very sensitive to departures from normality, which means that it works poorly by leading to wrong conclusions when either or both of the populations have a distribution that is not normal.
5. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Before 1964, population2 = After 1964; Test statistic: $F = s_1^2 / s_2^2 = 0.086995^2 / 0.061937^2 = 1.9728$; P-value = 0.0184; Critical value: F = 1.7045 (Table: F ≈ 1.6928); Reject H0. There is sufficient evidence to support the claim that the variation of weights before 1964 is greater than the variation of weights after 1964. Here is one advantage of the change in variation: Weights of quarters now have less variation than they did before 1964, so vending machines can be calibrated to accept a narrower range of weights, and counterfeit coins will be less likely to be accepted.
P-value: fx | =F.DIST.RT((0.086995/0.061937)^2,40-1,40-1)
Critical Value: fx | =F.INV.RT(0.05, 40-1,40-1)
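The F test statistics in this section are ratios of sample variances, written in the solutions as the square of the ratio of the two sample standard deviations. The short Python sketch below reproduces that arithmetic for Exercise 5; the function name `f_statistic` is hypothetical, and the P-value and critical value are still left to the Excel functions shown above, since the sketch uses only the standard library.

```python
def f_statistic(s1, s2):
    """Return F = s1^2 / s2^2 from two sample standard deviations.

    The first argument is the standard deviation of population1 as
    labelled in the exercise; no reordering is done here.
    """
    return (s1 / s2) ** 2

# Sample standard deviations quoted in Exercise 5 (quarters before/after 1964).
f = f_statistic(0.086995, 0.061937)
degrees_of_freedom = (40 - 1, 40 - 1)  # numerator df, denominator df
print(f"F = {f:.4f} with df = {degrees_of_freedom}")
```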
6. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Before 1983, population2 = After 1983; Test statistic: $F = s_1^2 / s_2^2 = 0.03910^2 / 0.01648^2 = 5.6291$; P-value = 0.0000; Critical value: F = 1.7530 (Table: $1.6928 < F < 1.8409$); Reject H0. There is sufficient evidence to support the claim that the variation of weights before 1983 is greater than the variation of weights after 1983. Here is one advantage of the change in variation: Weights of pennies now have less variation than they did before 1983, so machines that count pennies based on weight can be more accurate.
P-value: fx | =F.DIST.RT((0.03910/0.01648)^2,35-1,37-1)
Critical Value: fx | =F.INV.RT(0.05, 35-1,37-1)
7. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = red background, population2 = blue background; Test statistic: $F = s_1^2 / s_2^2 = 0.97^2 / 0.63^2 = 2.3706$; P-value = 0.0129; Upper critical value: F = 1.9678 (Table: $1.8752 < F < 2.0739$); Reject H0. There is sufficient evidence to warrant rejection of the claim that creative task scores have the same variation with a red background and a blue background.
P-value: fx | =2*F.DIST.RT((0.97/0.63)^2,35-1,36-1)
Critical Value: fx | =F.INV.RT(0.05/2,35-1,36-1)
8. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = red background, population2 = blue background; Test statistic: $F = s_1^2 / s_2^2 = 5.90^2 / 5.48^2 = 1.1592$; P-value = 0.6656; Upper critical value: F = 1.9678 (Table: $1.8752 < F < 2.0739$); Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that variation of scores is the same with the red background and blue background.
P-value: fx | =2*F.DIST.RT((5.90/5.48)^2,35-1,36-1)
Critical Value: fx | =F.INV.RT(0.05/2,35-1,36-1)
9. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = ethanol, population2 = placebo; Test statistic: $F = s_1^2 / s_2^2 = 2.20^2 / 0.72^2 = 9.3364$; P-value = 0.0000; Critical value: F = 2.4086 (Table: $2.3675 < F < 2.4247$); Reject H0. There is sufficient evidence to reject the claim that the treatment and placebo groups have the same amount of variation among the errors.
P-value: fx | =2*F.DIST.RT((2.20/0.72)^2,22-1,22-1)
Critical Value: fx | =F.INV.RT(0.05/2,22-1,22-1)
10. a. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = exposed, population2 = not exposed; Test statistic: $F = s_1^2 / s_2^2 = 119.50^2 / 62.53^2 = 3.6522$; P-value = 0.0000; Critical value: F = 1.7045 (Table: $1.6928 < F < 1.8409$); Reject H0. There is sufficient evidence to support the claim that the variation of cotinine in smokers is greater than the variation of cotinine in nonsmokers not exposed to tobacco smoke.
P-value: fx | =F.DIST.RT((119.50/62.53)^2,40-1,40-1)
Critical Value: fx | =F.INV.RT(0.05, 40-1,40-1)
b. The sample is not from a normally distributed population as required, so the results in part (a) are highly questionable.
11. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Lighter bicycle, population2 = Heavier bicycle; Test statistic: $F = s_1^2 / s_2^2 = 6.3^2 / 4.9^2 = 1.6531$; P-value = 0.0966; Critical value: F = 1.8915 (Table: $1.8543 < F < 1.9005$); Fail to reject H0. There is not sufficient evidence to support the claim that commuting times with the lighter bicycle have more variation than commuting times with the heavier bicycle.
P-value: fx | =F.DIST.RT((6.3/4.9)^2,26-1,30-1)
Critical Value: fx | =F.INV.RT(0.05, 26-1,30-1)
12. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Girls, population2 = Boys; Test statistic: $F = s_1^2 / s_2^2 = 706.3^2 / 660.2^2 = 1.1445$; P-value = 0.1714; Critical value: F = 1.2639 (Table A-5 cannot be used with the given sample sizes). Fail to reject H0. There is not sufficient evidence to support the claim that at birth, weights of girls have more variation than weights of boys.
P-value: fx | =F.DIST.RT((706.3/660.2)^2,205-1,195-1)
Critical Value: fx | =F.INV.RT(0.05, 205-1,195-1)
13. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Female pulse rates, population2 = Male pulse rates; Test statistic: $F = s_1^2 / s_2^2 = 13.96^2 / 7.15^2 = 3.8134$; P-value = 0.0340; Critical value: F = 3.3129 (Table: $3.2839 < F < 3.3472$); Reject H0. There is sufficient evidence to support the claim that the variation among pulse rates of females is greater than the variation among males.
XLSTAT Output for Exercise 13
Fisher's F-test / Upper-tailed test:
Ratio: 3.813
F (Observed value): 3.813
F (Critical value): 3.313
DF1: 11
DF2: 8
p-value (one-tailed): 0.034
alpha: 0.050
XLSTAT Output for Exercise 14
Fisher's F-test / Two-tailed test:
Ratio: 5.096
F (Observed value): 5.096
F (Critical value): 3.588
DF1: 9
DF2: 11
p-value (Two-tailed): 0.014
alpha: 0.050
14. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = Female weights in 2012, population2 = Female weights in 1988; Test statistic: $F = s_1^2 / s_2^2 = 12.96^2 / 5.74^2 = 5.0960$; P-value = 0.0139; Upper critical value: F = 3.5879; Reject H0. There is sufficient evidence to reject the claim that the variation among weights did not change from the ANSUR I study in 1988 to the ANSUR II study in 2012. It appears that the variation did change.
15. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 > \sigma_2$; population1 = Medium lead exposure, population2 = High lead exposure; Test statistic: $F = s_1^2 / s_2^2 = 14.29^2 / 8.99^2 = 2.5285$; P-value = 0.0213; Critical value: F = 2.1124 (Table: $2.0825 < F < 2.1242$); Reject H0. There is sufficient evidence to support the claim that IQ scores of subjects with medium lead levels vary more than IQ scores of subjects with high lead levels.
XLSTAT Output for Exercise 15
Fisher's F-test / Upper-tailed test:
Ratio: 2.529
F (Observed value): 2.529
F (Critical value): 2.112
DF1: 21
DF2: 20
p-value (one-tailed): 0.021
alpha: 0.050
XLSTAT Output for Exercise 16
Fisher's F-test / Two-tailed test:
Ratio: 2.591
F (Observed value): 2.591
F (Critical value): 2.701
DF1: 24
DF2: 15
p-value (Two-tailed): 0.060
alpha: 0.050
16. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = Easy to difficult, population2 = Difficult to easy; Test statistic: $F = s_1^2 / s_2^2 = 6.857^2 / 4.260^2 = 2.5908$; P-value = 0.0599; Upper critical value: F = 2.7006; Fail to reject H0. There is not sufficient evidence to support a claim that the two populations of scores have different amounts of variation.
17. a. Calculations not shown.
b. $c_1 = 3$, $c_2 = 0$
c. Critical value $= \dfrac{\log(0.05/2)}{\log\left(\dfrac{25}{25 + 16}\right)} = 7.4569$
d. $c_1 = 3 < 7.4569$; Fail to reject H0. There is not sufficient evidence to support a claim that the two populations of scores have different amounts of variation.
18. H0: $\sigma_1 = \sigma_2$; H1: $\sigma_1 \neq \sigma_2$; population1 = easy to difficult, population2 = difficult to easy; Test statistic: t = 1.403; P-value = 0.1686 (Table: P-value > 0.10); Critical values: t = ±2.024 (Table: t = ±2.131); Fail to reject H0. There is not sufficient evidence to support a claim that the two populations of scores have different amounts of variation.
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(4.914 - 3.296) - 0}{\sqrt{\dfrac{4.766^2}{25} + \dfrac{2.5998^2}{16}}}$ (df = 15)
XLSTAT Output
t-test for two independent samples / Two-tailed test:
Difference: 1.619
t (Observed value): 1.403
|t| (Critical value): 2.024
DF: 38.269
p-value (Two-tailed): 0.169
alpha: 0.050
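The critical value in Exercise 17(c) comes from the ratio of two logarithms, so it is easy to verify numerically. The Python sketch below evaluates that expression for the sample sizes 25 and 16; the function name `count_five_critical_value` and the decision rule `c1 >= critical value` are assumptions made for illustration, inferred from the comparison shown in part (d).

```python
import math

def count_five_critical_value(n1, n2, alpha=0.05):
    """Evaluate log(alpha / 2) / log(n1 / (n1 + n2)).

    The base of the logarithm cancels in the ratio, so natural logs are used.
    """
    return math.log(alpha / 2) / math.log(n1 / (n1 + n2))

critical = count_five_critical_value(n1=25, n2=16)
c1 = 3  # count reported in part (b)
reject_h0 = c1 >= critical  # assumed decision rule, matching part (d)
print(f"critical value = {critical:.4f}, reject H0: {reject_h0}")
```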
19. $F_L = \dfrac{1}{2.4374} = 0.4103$, $F_R = 2.7006$.
Section 9-5: Resampling: Using Technology for Inferences
1. Bootstrapping is used for obtaining a confidence interval estimate, and it involves sampling with replacement of selected sample values. Randomization is used for testing hypotheses. When working with two independent samples, randomization involves sampling without replacement; when working with matched pairs, randomization involves sampling with replacement.
2. Both samples are convenience samples. Randomization will not overcome the flaws of those samples. It is likely that nothing can be done to overcome those flaws.
3. Only part (b) is a randomization. Only part (b) has the same original sample sizes with the sample data selected without replacement.
4. Because the samples are small and the distributions are far from normal, the resampling method is more likely to yield better results.
5. a. Randomization: The difference between the two sample proportions is −0.0625. Results vary but this is typical: Among 1000 resamplings, differences at least as extreme as −0.0625 occurred 695 times. It appears that by chance, it is easy to get a difference like the one obtained, so there is not sufficient evidence to warrant rejection of the claim that when dropped, buttered toast and toast marked with an X have the same proportion that land with the buttered/X side down.
b. Bootstrapping: Results vary but this is typical: 95% CI: $-0.250 < p_1 - p_2 < 0.125$. Because the confidence interval limits do contain 0, there is not a significant difference between the two sample proportions. There is not sufficient evidence to warrant rejection of the claim that when dropped, buttered toast and toast marked with an X have the same proportion that land with the buttered/X side down.
6. a. Randomization: Difference between sample proportions is 0.0245. Typical results for 1000 resamplings: differences at least as extreme as 0.0245 occurred only 9 times. There is sufficient evidence to reject the claim that men and women singles players have equal success in challenging referee calls.
b. Bootstrapping: Typical 95% CI: $0.00478 < p_1 - p_2 < 0.04184$, which doesn't contain 0. Same conclusion as part (a).
7. a. Randomization: The difference between the two sample proportions is −0.196. Results vary but this is typical: Among 1000 resamplings, differences at least as extreme as −0.196 never occurred. It appears that by chance, it is very difficult to get a difference like the one obtained, so there is sufficient evidence to support the claim that the rate of right-handedness for those who prefer to use their left ear for cell phones is less than the rate of right-handedness for those who prefer to use their right ear for cell phones.
b. Bootstrapping: Results vary but this is typical: 98% CI: $-0.266 < p_1 - p_2 < -0.127$. Because the confidence interval limits do not contain 0, there is a significant difference between the two sample proportions. There is sufficient evidence to support the claim that the rate of right-handedness for those who prefer to use their left ear for cell phones is less than the rate of right-handedness for those who prefer to use their right ear for cell phones.
8. a. Randomization: Difference between sample proportions is −0.107. Typical results for 1000 resamplings: differences at least as extreme as −0.107 occurred only 47 times. There is sufficient evidence to support the claim that when given a single large bill, a smaller proportion of women spend some or all of the money when compared to the proportion of women given the same amount in smaller bills. (Note: Because 47 is so close to 50, results could easily suggest that there is not sufficient evidence to support the claim.)
b. Bootstrapping: Typical: 90% CI: $-0.200 < p_1 - p_2 < -0.0133$, which doesn't contain 0. Same conclusion as part (a).
9. a. Randomization: Using the sample data, we get $\bar{x}_1 - \bar{x}_2 = -4.57$ kg. If we use randomization with the two sets of sample data to generate 1000 simulated differences, a typical result is that 150 of those differences will be −4.57 kg or below, so it appears that such differences can easily occur. There is not sufficient evidence to support the claim that the mean weight of the 1988 male population is less than the mean weight of the 2012 male population.
b. Bootstrapping: Results vary but this is typical: 90% CI: $-11.38$ kg $< \mu_1 - \mu_2 < 2.21$ kg. Because the confidence interval does include 0, it appears that there is not a significant difference between the mean weight in 1988 and the mean weight in 2012. There is not sufficient evidence to support the claim that the mean weight of the 1988 male population is less than the mean weight of the 2012 male population.
10. a. Randomization: $\bar{x}_1 - \bar{x}_2 = 12.357$ sec. Typical result with 1000 simulated differences: 886 of those differences are at least as extreme as 12.357 sec. There is not sufficient evidence to warrant rejection of the claim that cars in two queues have a mean waiting time equal to that of cars in a single queue.
b. Bootstrapping: Typical: 99% CI: $-219.6$ sec $< \mu_1 - \mu_2 < 220.8$ sec, which includes 0. Same conclusion as part (a).
11. a. Randomization: Using the sample data, we get $\bar{x}_1 - \bar{x}_2 = 0.03317$ lb. If we use randomization to generate 1000 simulated differences, a typical result is that none of those differences will be at least as extreme as 0.03317 lb, so it appears that such differences are very rare. There is sufficient evidence to support the claim that the contents of cans of regular Coke have weights with a mean that is greater than the mean for Diet Coke.
b. Bootstrapping: Results vary but this is typical: 98% CI: $0.02844$ lb $< \mu_1 - \mu_2 < 0.03773$ lb. Because the confidence interval does not include 0, it appears that there is a significant difference between the two sample means. There is sufficient evidence to support the claim that the contents of cans of regular Coke have weights with a mean that is greater than the mean for Diet Coke.
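The randomization results quoted above come from resampling software, but the idea for two independent samples can be sketched briefly in Python. This is a minimal illustration under stated assumptions, not the procedure used to produce the manual's numbers: the function name `randomization_p_value` is hypothetical, the data are made up, and "at least as extreme" is interpreted here in the two-sided sense (absolute difference).

```python
import random
from statistics import mean

def randomization_p_value(sample1, sample2, resamples=1000, seed=0):
    """Estimate how often a shuffled split gives a mean difference at least
    as extreme (in absolute value) as the observed difference.

    The pooled values are shuffled and re-split without replacement,
    keeping the original sample sizes.
    """
    rng = random.Random(seed)
    observed = mean(sample1) - mean(sample2)
    pooled = list(sample1) + list(sample2)
    n1 = len(sample1)
    extreme = 0
    for _ in range(resamples):
        rng.shuffle(pooled)
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / resamples

# Hypothetical data, for illustration only (not from the textbook data sets).
group_a = [10, 11, 12, 13, 14]
group_b = [0, 1, 2, 3, 4]
observed, proportion = randomization_p_value(group_a, group_b)
print(f"observed difference = {observed}, proportion as extreme = {proportion}")
```

A small proportion suggests the observed difference is hard to get by chance alone, which is the reasoning used in parts (a) of these exercises.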
12. Randomization: $\bar{x}_1 - \bar{x}_2 = 4.613$. Typical result for 1000 simulated differences: 18 of the differences are at least as extreme as 4.613. Whether using a 0.05 or 0.01 significance level, it appears that there is a significant difference between the two population means.
13. a. Randomization: Typical result: With 1000 resamples, 336 of the values of $\bar{d}$ are 0.37 lb or greater, so the value of $\bar{d} = 0.37$ lb can easily occur by chance. There is not sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights.
b. Bootstrapping: Typical result: 90% CI: $-1.02$ lb $< \mu_d < 1.78$ lb. Because the confidence interval does include 0, there is not sufficient evidence to support the claim that for females, the measured weights tend to be higher than the reported weights.
14. a. Randomization: Typical result: With 1000 resamples, 402 of the values of $\bar{d}$ are −978.0 words or less. There is not sufficient evidence to support the claim that men talk less than women.
b. Bootstrapping: Typical result: 90% CI: $-8071.4$ words $< \mu_d < 5505.3$ words, which includes 0. Same conclusion as part (a).
15. a. Randomization: Typical result: With 1000 resamples, 5 of the values of $\bar{d}$ are −0.889 kg or less, so the value of $\bar{d} = -0.889$ kg is very unlikely to occur by chance. There is sufficient evidence to support the claim that for the population of freshman male college students, the weights in September are less than the weights in the following April.
b. Bootstrapping: Typical result: 98% CI: $-1.3$ kg $< \mu_d < -0.4$ kg. Because the confidence interval does not include 0 and consists of negative values only, there is sufficient evidence to support the claim that for the population of freshman male college students, the weights in September are less than the weights in the following April.
16. a. Randomization: Typical result with 1000 resamples: none of the values of $\bar{d}$ are 2.867 or more. There is sufficient evidence to support the claim that the word difficulty scores are higher with the QWERTY keyboard.
b. Bootstrapping: Typical result: 98% CI: $1.8 < \mu_d < 3.9$, which does not include 0. Same conclusion as part (a).
17. Typical result from 1000 resamples: 95% CI: $0.02909$ lb $< \mu_1 - \mu_2 < 0.03478$ lb. Typical result from 10,000 resamples: $0.02908$ lb $< \mu_1 - \mu_2 < 0.03474$ lb. The differences are very small, so the larger number of resamples does not have much of an effect on the results.
18. Typical 95% confidence interval: $1.1753 < \sigma_1 / \sigma_2 < 6.1753$. Because the confidence interval does not include 1 and consists of positive values only, it appears that the weights of regular Coke vary more than the weights of diet Coke.
Quick Quiz
1. H0: $p_1 = p_2$; H1: $p_1 \neq p_2$; population1 = treatment, population2 = placebo
2. $\hat{p}_1 = \dfrac{102}{288} = 0.354$, $\hat{p}_2 = \dfrac{75}{277} = 0.271$, $\bar{p} = \dfrac{102 + 75}{288 + 277} = 0.313$
3. a. P-value $= 2 \cdot P(z > 2.14) = 0.0324$
fx | =2*(1-NORM.S.DIST(2.14,1))
b. 0.0324 < 0.05; Reject H0 and conclude that there is sufficient evidence to warrant rejection of the claim that patients treated with dexamethasone and patients given a placebo have the same rate of complete resolution. It appears that the rates of complete resolution are different.
4. a. 95% CI: $0.00732 < p_1 - p_2 < 0.159$
b. The confidence interval does not include 0, so it appears that the two proportions are not equal. There appears to be a significant difference between the success rate in the treatment group and the success rate in the placebo group.
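Quick Quiz Exercises 2 and 3 can be checked with the pooled two-proportion z statistic. The following Python sketch assumes the counts 102 of 288 and 75 of 277 given in Exercise 2; the function name `two_proportion_z` is hypothetical, and the normal tail area is computed with `math.erf` rather than Excel's NORM.S.DIST, so the P-value reflects the unrounded z rather than z = 2.14.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Return (z, two_tailed_p) for H0: p1 = p2 using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled sample proportion
    q_bar = 1 - p_bar
    standard_error = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat) / standard_error
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return z, 2 * (1 - cdf)

z, p_value = two_proportion_z(102, 288, 75, 277)
print(f"z = {z:.2f}, two-tailed P-value = {p_value:.4f}")
```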
5. a. The two samples are independent because they are not matched or paired in any way.
b. $t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(0.81682 - 0.78479) - 0}{\sqrt{\dfrac{0.00751^2}{36} + \dfrac{0.00439^2}{36}}} = 22.092$ (22.095 if using original data.)
XLSTAT Output for Exercise 5
t-test for two independent samples / Two-tailed test:
Difference: 0.032
t (Observed value): 22.095
|t| (Critical value): 2.003
DF: 56.437
p-value (Two-tailed): <0.0001
alpha: 0.050
6. $F = s_1^2 / s_2^2 = 0.00751^2 / 0.00439^2 = 2.9265$ (2.9233 if using original data.)
XLSTAT Output for Exercise 6
Fisher's F-test / Two-tailed test:
Ratio: 2.923
F (Observed value): 2.923
F (Critical value): 1.961
DF1: 35
DF2: 35
p-value (Two-tailed): 0.002
alpha: 0.050
7. a. Because each pair of data is matched by the same subject, the two sets of data are dependent.
b. H0: $\mu_d = 0$°F; H1: $\mu_d \neq 0$°F; difference = 8 AM − 12 AM
8. Because the confidence interval includes the value of 0°F, it appears that there is not a significant difference between the temperatures at 8 AM and the temperatures at 12 AM.
9. True, since $n > 30$.
10. False, the requirements are $np \geq 5$ and $nq \geq 5$.
Review Exercises
1.
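H0: $p_1 = p_2$; H1: $p_1 < p_2$; population1 = $1 bill, population2 = 4 quarters; Test statistic: z = −3.49; P-value = 0.0002; Critical value: z = −1.645; Reject H0. There is sufficient evidence to support the claim that money in a large denomination is less likely to be spent relative to an equivalent amount in smaller denominations.
$\bar{p} = \dfrac{12 + 27}{46 + 43} = \dfrac{39}{89}$; $\bar{q} = 1 - \dfrac{39}{89} = \dfrac{50}{89}$;
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} = \dfrac{\left(\dfrac{12}{46} - \dfrac{27}{43}\right) - 0}{\sqrt{\dfrac{\frac{39}{89} \cdot \frac{50}{89}}{46} + \dfrac{\frac{39}{89} \cdot \frac{50}{89}}{43}}}$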
XLSTAT Output for Exercise 1
z-test for two proportions / Lower-tailed test:
Difference: -0.367
z (Observed value): -3.487
z (Critical value): -1.645
p-value (one-tailed): 0.000
alpha: 0.050
2.
XLSTAT Output for Exercise 2
90% confidence interval on the difference between the proportions: ( -0.528,-0.206 )
90% CI: $-0.528 < p_1 - p_2 < -0.206$; The confidence interval limits do not contain 0, so it appears that there is a significant difference between the two proportions. Because the confidence interval consists of negative values only, it appears that $p_1$ is less than $p_2$, so it appears that money in a large denomination is less likely to be spent relative to an equivalent amount in smaller denominations.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\dfrac{12}{46} - \dfrac{27}{43}\right) \pm 1.645\sqrt{\dfrac{\frac{12}{46} \cdot \frac{34}{46}}{46} + \dfrac{\frac{27}{43} \cdot \frac{16}{43}}{43}}$
3.
H0: $\mu_d = 0$°F; H1: $\mu_d \neq 0$°F; difference = Actual − Forecast; Test statistic: t = −0.262; P-value = 0.7995 (Table: P-value > 0.10); Critical values: t = ±2.262; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that differences between actual temperatures and temperatures forecast five days earlier are from a population with a mean of 0°F. There does not appear to be a difference between actual temperatures and temperatures forecast five days earlier.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-0.40 - 0}{4.835/\sqrt{10}}$
XLSTAT Output for Exercise 3
t-test for two paired samples / Two-tailed test:
Difference: -0.400
t (Observed value): -0.262
|t| (Critical value): 2.262
DF: 9
p-value (Two-tailed): 0.800
alpha: 0.050
XLSTAT Output for Exercise 4 95% confidence interval on the difference between the means: ( -3.859, 3.059 )
4.
difference = Actual − Forecast; 95% CI: $-3.9$°F $< \mu_d < 3.1$°F; Because the confidence interval includes 0°F, fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that differences between actual temperatures and temperatures forecast five days earlier are from a population with a mean of 0°F. There does not appear to be a difference between actual temperatures and temperatures forecast five days earlier. It appears that weather forecasters are doing a good job.
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = -0.40 \pm 2.262 \cdot \dfrac{4.835}{\sqrt{10}}$ (df = 10 − 1 = 9)
5.
H0: $p_1 = p_2$; H1: $p_1 > p_2$; population1 = sustained care, population2 = standard care; Test statistic: z = 2.64; P-value = 0.0041; Critical value: z = 2.33; Reject H0. There is sufficient evidence to support the claim that the rate of success for smoking cessation is greater with the sustained care program.
$\bar{p} = \dfrac{51 + 30}{198 + 199} = \dfrac{81}{397}$; $\bar{q} = 1 - \dfrac{81}{397} = \dfrac{316}{397}$
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} = \dfrac{\left(\dfrac{51}{198} - \dfrac{30}{199}\right) - 0}{\sqrt{\dfrac{\frac{81}{397} \cdot \frac{316}{397}}{198} + \dfrac{\frac{81}{397} \cdot \frac{316}{397}}{199}}}$
XLSTAT Output for Exercise 5
z-test for two proportions / Upper-tailed test:
Difference: 0.107
z (Observed value): 2.641
z (Critical value): 2.326
p-value (one-tailed): 0.004
alpha: 0.010
6.
XLSTAT Output for Exercise 6
98% confidence interval on the difference between the proportions: ( 0.0135, 0.2001 )
a. 98% CI: $0.0135 < p_1 - p_2 < 0.200$ (Table: $0.0134 < p_1 - p_2 < 0.200$); Because the confidence interval limits do not contain 0, there is a significant difference between the two proportions. Because the interval consists of positive numbers only, it appears that the success rate for the sustained care program is greater than the success rate for the standard care program.
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} = \left(\dfrac{51}{198} - \dfrac{30}{199}\right) \pm 2.326\sqrt{\dfrac{\frac{51}{198} \cdot \frac{147}{198}}{198} + \dfrac{\frac{30}{199} \cdot \frac{169}{199}}{199}}$
b. Based on the samples, the success rates of the programs are 25.8% (sustained care) and 15.1% (standard care). That difference does appear to be substantial, so the difference between the programs does appear to have practical significance.
c. The more successful of the two programs has a success rate of only 25.8%, so there is a failure rate of about 74% for those who try to stop smoking with the sustained care program. The time required and cost of the sustained program relative to standard care must be considered when determining practical significance. If sustained care is much more time consuming and expensive, the practical significance will be lower.
7.
H0: $\mu_1 = \mu_2$; H1: $\mu_1 < \mu_2$; population1 = seat belt, population2 = no seat belt; Test statistic: t = −2.330; P-value = 0.0102 (Table: P-value < 0.025); Critical value: t = −1.649 (Table: t = −1.660); Reject H0. There is sufficient evidence to support the claim that children wearing seat belts have a lower mean length of time in an ICU than the mean for children not wearing seat belts. Buckle up!
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(0.83 - 1.39) - 0}{\sqrt{\dfrac{1.77^2}{123} + \dfrac{3.06^2}{290}}}$ (df = 123 − 1 = 122)
P-value: fx | =T.DIST(-2.330,462.919,1)
Critical Value: fx | =T.INV(1-0.05,462.919)
8.
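The test statistic for two independent samples with unequal variances, as used in Review Exercise 7, can be reproduced from the summary statistics alone. The Python sketch below does only that arithmetic; the function name `two_sample_t_statistic` is hypothetical, and the degrees of freedom, P-value, and critical value are left to the Excel formulas shown in the solution.

```python
import math

def two_sample_t_statistic(x1_bar, s1, n1, x2_bar, s2, n2, delta0=0.0):
    """Return t = ((x1_bar - x2_bar) - delta0) / sqrt(s1^2/n1 + s2^2/n2).

    delta0 is the hypothesized value of mu1 - mu2 (0 under H0 here).
    """
    standard_error = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    return (x1_bar - x2_bar - delta0) / standard_error

# Summary statistics quoted in Review Exercise 7 (seat belt vs. no seat belt).
t = two_sample_t_statistic(0.83, 1.77, 123, 1.39, 3.06, 290)
print(f"t = {t:.3f}")
```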
90% CI: $-0.96$ day $< \mu_1 - \mu_2 < -0.16$ day; The confidence interval does not include 0 days, so there appears to be a significant difference. Because the confidence interval consists of negative values only, it appears that children wearing seat belts spend less time in intensive care units than children who don't wear seat belts. Children should wear seat belts (except for young children who should use properly installed car seats).
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (0.83 - 1.39) \pm 1.660\sqrt{\dfrac{1.77^2}{123} + \dfrac{3.06^2}{290}}$ (df = 123 − 1 = 122)
population1 = seat belt, population2 = no seat belt
Critical value: fx | =T.INV.2T(0.10,462.919)
9.
The waist circumferences from 1988 are measured from eight males, and the waist circumferences from 2012 are from eight different males, so the data are not paired or matched in any meaningful way. The stated claim makes no sense for these two independent samples. The results that would be obtained by blindly following the instruction to test the claim that the differences between the pairs of data are from a population with a mean of 0 mm are given below.
$H_0: \mu_d = 0$ mm; $H_1: \mu_d \ne 0$ mm; difference = 1988 − 2012. Test statistic: $t = -0.834$; P-value = 0.4320 (Table: P-value > 0.20); Critical values: $t = \pm 2.365$; Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the differences between the pairs of data are from a population with a mean of 0 mm. However, these results are not valid and they make no sense with the two independent samples.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{-38.4 - 0}{130.19/\sqrt{8}}$
XLSTAT Output
t-test for two paired samples / Two-tailed test:
Difference              -38.375
t (Observed value)      -0.834
|t| (Critical value)    2.365
DF                      7
p-value (Two-tailed)    0.432
alpha                   0.050
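The unpooled two-sample calculations in Exercises 7 and 8 above can be reproduced from the summary statistics alone. This is a minimal sketch, assuming the means, standard deviations, and sample sizes shown in those formulas and the table critical value 1.660; the helper name `welch_t_and_ci` is hypothetical.

```python
import math

def welch_t_and_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """Return the unpooled t statistic and the (lower, upper) CI for mu1 - mu2."""
    # Standard error without pooling the two sample variances
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = x1 - x2
    t_stat = diff / se  # null hypothesis: mu1 - mu2 = 0
    return t_stat, (diff - t_crit * se, diff + t_crit * se)

# Seat belt (population 1) versus no seat belt (population 2)
t_stat, (lo, hi) = welch_t_and_ci(0.83, 1.77, 123, 1.39, 3.06, 290, 1.660)
print(f"t = {t_stat:.3f}; {lo:.2f} < mu1 - mu2 < {hi:.2f}")
```

The sketch takes the critical value as an input rather than computing it, so it matches the "Table" results; the Excel `T.INV` formulas shown in the exercises give the technology-based critical values.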
10. $H_0: \sigma_1 = \sigma_2$; $H_1: \sigma_1 \ne \sigma_2$; population 1 = no seat belt, population 2 = seat belt. Test statistic: $F = \dfrac{s_1^2}{s_2^2} = \dfrac{3.06^2}{1.77^2} = 2.9888$; P-value = 0.0000; Upper critical value: $F = 1.3630$; Reject $H_0$. There is sufficient evidence to warrant rejection of the claim that for children hospitalized after motor vehicle crashes, the numbers of days in intensive care units for those wearing seat belts and for those not wearing seat belts have the same variation.
10. (continued)
P-value: fx | =2*F.DIST.RT((3.06/1.77)^2,290-1,123-1)
Critical Value: fx | =F.INV.RT(0.05/2,290-1,123-1)

Cumulative Review Exercises
1. a. Key question: Does the bar graph correctly depict the data or is it somehow misleading? (Another reasonable question would be to ask whether significantly more males play video games than females, but no sample sizes are provided to help address that question.)
b. Examine the graph to determine whether it is misleading.
c. Because the vertical scale begins with 40% instead of 0%, the bottom portion of the bar graph is cut off, so the differences are visually exaggerated. The graph makes it appear that about 3.5 times as many males play video games as females, but examination of the percentage values shows that the ratio is closer to 1.5.
2. a. Key question: Estimate the proportion of all females aged 18–29 who play video games. Another reasonable question is this: Are females aged 18–29 who play video games in the minority (with a proportion less than 0.5 or 50%)?
b. The population proportion can be estimated with a confidence interval. Determination of whether the population proportion is less than 0.5 can be addressed with a hypothesis test of the claim that $p < 0.5$.
c. The 95% confidence interval estimate of the population proportion is $0.459 < p < 0.521$. For determining whether the proportion of females aged 18–29 who play video games is less than 0.5 ($H_0: p = 0.5$; $H_1: p < 0.5$), the test statistic is $z = -0.63$ ($-0.64$ if using $x = 0.49(984) = 482$) and the P-value is 0.2619 (Table: 0.2611), and these values show that the sample proportion of 0.49 is not significantly below 0.5. There is not sufficient evidence to support a claim that the proportion is less than 0.5. (Also, there is not sufficient evidence to support a claim that the proportion is significantly different from 0.5.)
$z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \dfrac{0.49 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{984}}}$
$\hat{p} \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}} = 0.49 \pm 1.96\sqrt{\dfrac{(0.49)(0.51)}{984}}$
XLSTAT Outputs for Exercise 2
z-test for one proportion / Lower-tailed test:
Difference              -0.010
z (Observed value)      -0.627
z (Critical value)      -1.645
p-value (one-tailed)    0.265
alpha                   0.050
95% confidence interval on the proportion (Wald): ( 0.459, 0.521 )
3. XLSTAT Outputs for Exercise 3
z-test for two proportions / Upper-tailed test:
Difference              0.230
z (Observed value)      10.531
z (Critical value)      1.645
p-value (one-tailed)    <0.0001
alpha                   0.050
95% confidence interval on the difference between the proportions: ( 0.188, 0.272 )
a. Key question: Do significantly more males play video games than females? Or is there a significant difference between the proportion of male video game players and the proportion of female video game players?
b. Use a hypothesis test or confidence interval with two proportions (as in Section 9-1).
c. Conclude that the proportion of male video game players is greater than (or different from) the proportion of female video game players ($H_0: p_1 = p_2$; $H_1: p_1 > p_2$). Using the methods of Section 9-1, the test statistic of $z = 10.53$ and the P-value of 0.0000 support that claim. Or the 95% confidence interval of $0.188 < p_1 - p_2 < 0.272$ consists of positive values only, which also supports that claim.
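The pooled two-proportion test statistic cited for Exercise 3 can be sketched as follows. The sample proportions 0.72 and 0.49 and the sample sizes 1017 and 984 are taken from the exercise; the function name `pooled_two_proportion_z` is hypothetical, and the pooled proportion is computed from $\hat{p} \cdot n$ rather than from whole-number counts, which is an assumption about how the rounding was handled.

```python
import math

def pooled_two_proportion_z(p1_hat, n1, p2_hat, n2):
    """Return (pooled proportion, z) for testing H0: p1 = p2."""
    # Pooled estimate of the common proportion under H0
    p_bar = (p1_hat * n1 + p2_hat * n2) / (n1 + n2)
    q_bar = 1 - p_bar
    se = math.sqrt(p_bar * q_bar / n1 + p_bar * q_bar / n2)
    z = (p1_hat - p2_hat) / se
    return p_bar, z

p_bar, z = pooled_two_proportion_z(0.72, 1017, 0.49, 984)
print(f"pooled p = {p_bar:.3f}, z = {z:.2f}")
```

A z value this far above the critical value 1.645 corresponds to the very small P-value reported in the XLSTAT output.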
3. (continued)
$\bar{p} = \dfrac{0.72 \cdot 1017 + 0.49 \cdot 984}{1017 + 984} = 0.607$; $\bar{q} = 1 - 0.607 = 0.393$
$z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{\bar{p}\bar{q}}{n_1} + \dfrac{\bar{p}\bar{q}}{n_2}}} = \dfrac{(0.72 - 0.49) - 0}{\sqrt{\dfrac{(0.607)(0.393)}{1017} + \dfrac{(0.607)(0.393)}{984}}}$
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{n_1} + \dfrac{\hat{p}_2\hat{q}_2}{n_2}} = (0.72 - 0.49) \pm 1.96\sqrt{\dfrac{(0.72)(0.28)}{1017} + \dfrac{(0.49)(0.51)}{984}}$
4. a. Key question: Based on the differences between the IQ scores of each matched pair in the sample, is there no difference between the IQ scores of pairs of twins in the population?
b. Use the methods of Section 9-3 for matched pairs.
c. There does not appear to be a difference between the IQ scores of first-born and second-born twins in the population. $H_0: \mu_d = 0$; $H_1: \mu_d \ne 0$; difference = first − second; Test statistic: $t = 0.150$; P-value = 0.8843 (Table: P-value > 0.20); Critical values ($\alpha = 0.05$): $t = \pm 2.262$; Fail to reject $H_0$. 95% CI: $-5.6 < \mu_d < 6.4$.
$t = \dfrac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \dfrac{0.40 - 0}{8.45/\sqrt{10}}$
$\bar{d} \pm t_{\alpha/2}\dfrac{s_d}{\sqrt{n}} = 0.40 \pm 2.262 \cdot \dfrac{8.45}{\sqrt{10}}$
XLSTAT Outputs for Exercise 4
t-test for two paired samples / Two-tailed test:
Difference              0.400
t (Observed value)      0.150
|t| (Critical value)    2.262
DF                      9
p-value (Two-tailed)    0.884
alpha                   0.050
95% confidence interval on the difference between the means: ( -5.644, 6.444 )
5. a. The data are time series data. Key question: What is the trend of the data over time?
b. A time series graph would reveal a trend over time.
c. A time series graph clearly shows that there is a distinct trend of declining sales of CDs.
Graph for Exercise 5
6. a. Key question: Is eyewitness memory of police better (or the same as) with a non-stressful interrogation than with a stressful interrogation?
b. Use the methods for inferences from two independent means (Section 9-2).
c. It appears that eyewitness memory of police is better with a non-stressful interrogation than with a stressful interrogation. $H_0: \mu_1 = \mu_2$; $H_1: \mu_1 > \mu_2$; Test statistic: $t = 2.843$; P-value = 0.0029 (Table: P-value < 0.005); Critical value: $t = 1.665$ (Table: $t = 1.685$); 90% CI: $3.27 < \mu_1 - \mu_2 < 12.53$ (Table: $3.22 < \mu_1 - \mu_2 < 12.58$); Reject $H_0$. There is sufficient evidence to support the claim that eyewitness memory of police is better with a non-stressful interrogation than with a stressful interrogation.
6. (continued)
$t = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \dfrac{(53.3 - 45.4) - 0}{\sqrt{\dfrac{11.6^2}{40} + \dfrac{13.2^2}{40}}}$
P-value: fx | =T.DIST.RT(2.843,76.733)
Critical Value: fx | =T.INV(1-0.05,76.733)
$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}} = (53.3 - 45.4) \pm 1.685\sqrt{\dfrac{11.6^2}{40} + \dfrac{13.2^2}{40}}$
Critical value: fx | =T.INV.2T(0.1,76.733)
7. a. Key question: How do the different categories compare in terms of the numbers of deaths?
b. Construct an effective graph such as a Pareto chart so that we can see which causes of death are most significant.
c. A Pareto chart or bar chart or pie chart shows that pollution is the largest cause of deaths, followed by tobacco use. Deaths from the other causes are relatively much lower.
8. a. Key question: Is the mean height of supermodels greater than (or different from) 162 cm, so that supermodels are generally taller than (or have heights different from) adult women in the general population?
b. Use a hypothesis test or confidence interval for an inference involving a single population mean (as in Section 8-3).
c. The data support the claim that supermodels are taller than (or have heights different from) the mean of 162 cm for adult women in the general population. $H_0: \mu = 162$ cm; $H_1: \mu > 162$ cm; Test statistic: $t = 33.082$; P-value = 0.0000 (Table: P-value < 0.005); Critical value ($\alpha = 0.05$): $t = 1.753$; 90% CI: 176.4 cm $< \mu <$ 178.1 cm; Reject $H_0$. (If testing $H_1: \mu \ne 162$ cm, the test statistic and P-value will be the same.) There is sufficient evidence to support the claim that supermodels have heights with a mean that is greater than the mean height of 162 cm for women in the general population. Supermodels appear to be taller than typical women.
$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{177.25 - 162}{1.84/\sqrt{16}}$
$\bar{x} \pm t_{\alpha/2}\dfrac{s}{\sqrt{n}} = 177.25 \pm 1.753 \cdot \dfrac{1.84}{\sqrt{16}}$
XLSTAT Outputs
One-sample t-test / Upper-tailed test:
Difference              15.250
t (Observed value)      33.082
t (Critical value)      1.753
DF                      15
p-value (one-tailed)    <0.0001
alpha                   0.050
90% confidence interval on the mean: ( 176.442, 178.058 )
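The one-sample calculations in Exercise 8 can be sketched from the summary statistics. This is a rough check, assuming the rounded values $\bar{x} = 177.25$, $s = 1.84$, $n = 16$ and the critical value 1.753 shown above; the function name `one_sample_t` is hypothetical. Because $s$ is rounded here, the computed test statistic is expected to differ slightly from the XLSTAT value of 33.082, which is presumably based on the unrounded standard deviation.

```python
import math

def one_sample_t(x_bar, s, n, mu0, t_crit):
    """Return the t statistic for H0: mu = mu0 and the (lower, upper) CI for mu."""
    se = s / math.sqrt(n)  # standard error of the sample mean
    t_stat = (x_bar - mu0) / se
    return t_stat, (x_bar - t_crit * se, x_bar + t_crit * se)

t_stat, (lo, hi) = one_sample_t(177.25, 1.84, 16, 162, 1.753)
print(f"t = {t_stat:.2f}; {lo:.1f} cm < mu < {hi:.1f} cm")
```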
9. a. Key question: Are the numbers selected in a way that appears to be random, with roughly the same frequency for each number?
b. Construct a histogram or dotplot to visualize the frequencies.
c. The graph shows that the numbers do appear to be drawn with roughly the same frequency. (Chapter 11 will introduce a procedure for testing goodness-of-fit with a uniform distribution, and that procedure is much more thorough and objective.)
10. a. Key question: Is there a correlation between the numbers of pleasure boats and the numbers of manatee fatalities?
b. A scatterplot (see Section 2-4) would be helpful in visualizing whether there is a correlation. (The following chapter will introduce much more thorough and objective criteria for analyzing correlations.)
c. A scatterplot reveals that there does not appear to be a correlation between the numbers of pleasure boats and the numbers of manatee deaths.
Chapter 10: Correlation and Regression
Section 10-1: Correlation
1. a. $r$ is a statistic that represents the value of the linear correlation coefficient computed from the paired sample data, and $\rho$ is a parameter that represents the value of the linear correlation coefficient that would be computed by using all of the paired data in the population of all statistics students.
b. The value of $r$ is estimated to be 0, because it is likely that there is no correlation between heights of statistics students and their scores on the first statistics test.
c. The value of $r$ does not change if the heights are converted from centimeters to inches.
2. No, with $r = 0$, there is no linear correlation, but there might be some association with a scatterplot showing a distinct pattern that is not a straight-line pattern.
3. No, a correlation between two variables indicates that they are somehow associated, but that association does not necessarily imply that one of the variables has a direct effect on the other variable. Correlation does not imply causality.
4. a. –1
b. 0.746
c. 0.268
d. 0.992
e. 1
5. $r = 0.963$; P-value = 0.000; Critical values: $r = \pm 0.268$ (Table: $r \approx \pm 0.279$); Yes, there is sufficient evidence to support the claim that there is a linear correlation between the weights of bears and their chest sizes. It is easier to measure the chest size of a bear than the weight, which would require lifting the bear onto a scale. It does appear that chest size could be used to predict weight.
6. $r = 0.864$; Critical values ($n = 54$): $r = \pm 0.268$ (Table: $\pm 0.279$); Yes, there is sufficient evidence to support the claim that for bears there is a linear correlation between length and weight.
7. $r = 0.319$; Critical values ($n = 56$): $r = \pm 0.263$ (Table: $\pm 0.254$); Yes, there is sufficient evidence to support the claim that there is a linear correlation between the numbers of words spoken in a day by men and women in couple relationships.
8. $r = 0.031$; Critical values ($n = 10$): $r = \pm 0.632$; No, there is not sufficient evidence to support the claim that there is a linear correlation between heights of mothers and heights of their first daughters.
9. a. Answers will vary, but because there appears to be an upward pattern, it is reasonable to think that there is a linear correlation.
b. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.906$; Critical values ($\alpha = 0.05$): $r = \pm 0.632$; P-value = 0.000 (Table: P-value < 0.01); There is sufficient evidence to support the claim of a linear correlation.
c. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0$; Critical values ($\alpha = 0.05$): $r = \pm 0.666$; P-value = 1.000 (Table: P-value > 0.05); There is not sufficient evidence to support the claim of a linear correlation.
d. The effect from a single pair of values can be very substantial, and it can change the conclusion.
XLSTAT Output for All Points
Correlation matrix (Pearson):
Variables   x       y
x           1       0.906
y           0.906   1
p-values (Pearson):
Variables   x       y
x           0       0.000
y           0.000   0
Values in bold are different from 0 with a significance level alpha=0.05
XLSTAT Output without (10, 10)
Correlation matrix (Pearson):
Variables   x       y
x           1       0.000
y           0.000   1
p-values (Pearson):
Variables   x       y
x           0       1.000
y           1.000   0
Values in bold are different from 0 with a significance level alpha=0.05
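The point made in Exercise 9, that a single pair of values can change $r$ substantially, can be illustrated with a short computation. The data below are hypothetical (a 3 by 3 cluster of points plus the single point (10, 10)) and are an assumption for illustration only; they are not stated in this manual. The helper `pearson_r` is a plain implementation of the usual formula for the linear correlation coefficient.

```python
import math

def pearson_r(pairs):
    """Linear correlation coefficient r for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: nine clustered points with no linear pattern, plus one far point.
cluster = [(x, y) for x in (1, 2, 3) for y in (1, 2, 3)]
with_far_point = cluster + [(10, 10)]

print(f"r with (10, 10):    {pearson_r(with_far_point):.3f}")
print(f"r without (10, 10): {pearson_r(cluster):.3f}")
```

With the far point included, $r$ is large; with it removed, the cluster alone shows no linear correlation, which mirrors the change in conclusion between parts (b) and (c).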
10. a. There does not appear to be a linear correlation.
b. There does not appear to be a linear correlation.
c. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0$; Critical values ($\alpha = 0.05$): $r = \pm 0.950$; P-value = 1.000 (Table: P-value > 0.05); There does not appear to be a linear correlation. The same results are obtained with the four points in the upper right corner.
d. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.985$; Critical values ($\alpha = 0.05$): $r = \pm 0.707$; P-value = 0.000 (Table: P-value < 0.01); There is sufficient evidence to support the claim of a linear correlation.
e. There are two different populations that should be considered separately. It is misleading to use the combined data from women and men and conclude that there is a relationship between x and y.
XLSTAT Output for Lower Four Points
Correlation matrix (Pearson):
Variables   x       y
x           1       0.000
y           0.000   1
p-values (Pearson):
Variables   x       y
x           0       1.000
y           1.000   0
Values in bold are different from 0 with a significance level alpha=0.05
XLSTAT Output for All Points
Correlation matrix (Pearson):
Variables   x       y
x           1       0.985
y           0.985   1
p-values (Pearson):
Variables   x         y
x           0         <0.0001
y           <0.0001   0
Values in bold are different from 0 with a significance level alpha=0.05
11. a. See graph below.
b. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.816$; P-value = 0.002 (Table: P-value < 0.01); Critical values ($\alpha = 0.05$): $r = \pm 0.602$; There is sufficient evidence to support the claim of a linear correlation between the variables.
c. The scatterplot reveals a distinct pattern that is not a straight-line pattern.
XLSTAT Output for Exercise 11
Correlation matrix (Pearson):
Variables   x       y
x           1       0.816
y           0.816   1
p-values (Pearson):
Variables   x       y
x           0       0.002
y           0.002   0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 11
Scatterplot for Exercise 12
12. a. See the scatterplot for Exercise 12.
b. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.816$; P-value = 0.002 (Table: P-value < 0.01); Critical values ($\alpha = 0.05$): $r = \pm 0.602$; There is sufficient evidence to support the claim of a linear correlation between the variables.
c. The scatterplot reveals a perfect straight-line pattern, except for the presence of one outlier.
XLSTAT Output for Exercise 12
Correlation matrix (Pearson):
Variables   x       y
x           1       0.816
y           0.816   1
p-values (Pearson):
Variables   x       y
x           0       0.002
y           0.002   0
Values in bold are different from 0 with a significance level alpha=0.05
13. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.532$; P-value = 0.114 (Table: P-value > 0.05); Critical values: $r = \pm 0.632$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between lottery jackpots and numbers of tickets sold. The pair of data in the last column corresponds to a point in the scatterplot that is an outlier. The outlier had a significant effect on the results, and the conclusion changed from the one reached in Example 4. Also, the added pair of values represents a huge jackpot of 400 million dollars, but the relatively low number of ticket sales is not consistent with known lottery behavior. It appears that the added pair of values is in error.
XLSTAT Output for Exercise 13
Correlation matrix (Pearson):
Variables   Jackpot   Tickets
Jackpot     1         0.532
Tickets     0.532     1
p-values (Pearson):
Variables   Jackpot   Tickets
Jackpot     0         0.114
Tickets     0.114     0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 13
Scatterplot for Exercise 14
14. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.985$; P-value = 0.000 (Table: P-value < 0.05); Critical values: $r = \pm 0.632$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between lottery jackpots and numbers of tickets sold. The pair of data in the last column corresponds to a point in the scatterplot that is an outlier. The outlier did not have a significant effect on the results, and the conclusion is the same as the one reached in Example 4.
XLSTAT Output for Exercise 14
Correlation matrix (Pearson):
Variables   Jackpot   Tickets
Jackpot     1         0.985
Tickets     0.985     1
p-values (Pearson):
Variables   Jackpot   Tickets
Jackpot     0         <0.0001
Tickets     <0.0001   0
Values in bold are different from 0 with a significance level alpha=0.05
15. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.298$; P-value = 0.473 (Table: P-value > 0.05); Critical values: $r = \pm 0.707$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the tip amount. It does not appear that riders base their tips on the time of the ride.
XLSTAT Output for Exercise 15
Correlation matrix (Pearson):
Variables   Time    Tip
Time        1       0.298
Tip         0.298   1
p-values (Pearson):
Variables   Time    Tip
Time        0       0.473
Tip         0.473   0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 15
Scatterplot for Exercise 16
16. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = -0.114$; P-value = 0.789 (Table: P-value > 0.05); Critical values: $r = \pm 0.707$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the tip amount. It does not appear that riders base their tips on the distance of the ride.
XLSTAT Output for Exercise 16
Correlation matrix (Pearson):
Variables   Distance   Tip
Distance    1          -0.114
Tip         -0.114     1
p-values (Pearson):
Variables   Distance   Tip
Distance    0          0.789
Tip         0.789      0
Values in bold are different from 0 with a significance level alpha=0.05
17. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.986$; P-value = 0.000 (Table: P-value < 0.01); Critical values: $r = \pm 0.707$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the fare.
XLSTAT Output for Exercise 17
Correlation matrix (Pearson):
Variables   Distance   Fare
Distance    1          0.986
Fare        0.986      1
p-values (Pearson):
Variables   Distance   Fare
Distance    0          <0.0001
Fare        <0.0001    0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 17
Scatterplot for Exercise 18
18. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.953$; P-value = 0.000 (Table: P-value < 0.01); Critical values: $r = \pm 0.707$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the fare.
XLSTAT Output for Exercise 18
Correlation matrix (Pearson):
Variables   Time    Fare
Time        1       0.953
Fare        0.953   1
p-values (Pearson):
Variables   Time    Fare
Time        0       0.000
Fare        0.000   0
Values in bold are different from 0 with a significance level alpha=0.05
19. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = -0.139$; P-value = 0.667 (Table: P-value > 0.05); Critical values: $r = \pm 0.576$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a significant linear correlation between the ages of Best Actresses and Best Actors. Because Best Actresses and Best Actors typically appeared in different movies, we should not expect that there would be a correlation between their ages at the time that they won the awards.
XLSTAT Output for Exercise 19
Correlation matrix (Pearson):
Variables   Actor    Actress
Actor       1        -0.139
Actress     -0.139   1
p-values (Pearson):
Variables   Actor   Actress
Actor       0       0.667
Actress     0.667   0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 19
Scatterplot for Exercise 20
XLSTAT Output for Exercise 20
Correlation matrix (Pearson):
Variables   President   Opponent
President   1           -0.446
Opponent    -0.446      1
p-values (Pearson):
Variables   President   Opponent
President   0           0.229
Opponent    0.229       0
Values in bold are different from 0 with a significance level alpha=0.05
20. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = -0.446$; P-value = 0.229 (Table: P-value > 0.05); Critical values: $r = \pm 0.666$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between heights of winning presidential candidates and heights of their main opponents. In an ideal world, voters would focus on important issues and not height or physical appearance of candidates, so there should not be a correlation.
21. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.990$; P-value = 0.000 (Table: P-value < 0.01); Critical values: $r = \pm 0.632$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between New York City prices of a slice of pizza and a subway ride.
XLSTAT Output for Exercise 21
Correlation matrix (Pearson):
Variables     Pizza Cost   Subway Fare
Pizza Cost    1            0.990
Subway Fare   0.990        1
p-values (Pearson):
Variables     Pizza Cost   Subway Fare
Pizza Cost    0            <0.0001
Subway Fare   <0.0001      0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 21
Scatterplot for Exercise 22
XLSTAT Output for Exercise 22
Correlation matrix (Pearson):
Variables     Subway Fare   CPI
Subway Fare   1             0.985
CPI           0.985         1
p-values (Pearson):
Variables     Subway Fare   CPI
Subway Fare   0             <0.0001
CPI           <0.0001       0
Values in bold are different from 0 with a significance level alpha=0.05
22. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.985$; P-value = 0.000 (Table: P-value < 0.01); Critical values: $r = \pm 0.632$; Reject $H_0$. There is sufficient evidence to support the claim that there is a significant linear correlation between the CPI (Consumer Price Index) and the subway fare.
23. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.857$; P-value = 0.007 (Table: P-value < 0.01); Critical values: $r = \pm 0.707$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between footprint lengths and heights of males. The given results do suggest that police can use a footprint length to estimate the height of a male.
XLSTAT Output for Exercise 23
Correlation matrix (Pearson):
Variables     Foot Length   Height
Foot Length   1             0.857
Height        0.857         1
p-values (Pearson):
Variables     Foot Length   Height
Foot Length   0             0.007
Height        0.007         0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 23
Scatterplot for Exercise 24
XLSTAT Output for Exercise 24
Correlation matrix (Pearson):
Variables     Chirps/min   Temperature
Chirps/min    1            0.874
Temperature   0.874        1
p-values (Pearson):
Variables     Chirps/min   Temperature
Chirps/min    0            0.005
Temperature   0.005        0
Values in bold are different from 0 with a significance level alpha=0.05
24. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.874$; P-value = 0.005 (Table: P-value < 0.01); Critical values: $r = \pm 0.707$; Reject $H_0$. There is sufficient evidence to support the claim of a linear correlation between the number of cricket chirps and the temperature.
25. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.174$; P-value = 0.681 (Table: P-value > 0.05); Critical values: $r = \pm 0.707$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between the numbers of cars sold (thousands) and the numbers of points scored in the Super Bowl. Common sense suggests that it would be unreasonable to expect a correlation between car sales and points scored in the Super Bowl.
XLSTAT Output for Exercise 25
Correlation matrix (Pearson):
Variables   Car Sales   Points
Car Sales   1           0.174
Points      0.174       1
p-values (Pearson):
Variables   Car Sales   Points
Car Sales   0           0.681
Points      0.681       0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 25
Scatterplot for Exercise 26
XLSTAT Output for Exercise 26
Correlation matrix (Pearson):
Variables     Consumption   PhDs
Consumption   1             0.959
PhDs          0.959         1
p-values (Pearson):
Variables     Consumption   PhDs
Consumption   0             <0.0001
PhDs          <0.0001       0
Values in bold are different from 0 with a significance level alpha=0.05
26. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.959$; P-value = 0.000 (Table: P-value < 0.01); Critical values: $r = \pm 0.632$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between per capita consumption of mozzarella cheese and the numbers of civil engineering PhD degrees awarded each year. Common sense suggests that there is no cause/effect relationship between consumption of mozzarella cheese and PhD degrees awarded in civil engineering.
27. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = -0.959$; P-value = 0.010; Critical values: $r = \pm 0.878$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between weights of lemon imports from Mexico and U.S. car fatality rates. The results do not suggest any cause-effect relationship between the two variables.
XLSTAT Output for Exercise 27
Correlation matrix (Pearson):
Variables       Imports   Fatality Rate
Imports         1         -0.959
Fatality Rate   -0.959    1
p-values (Pearson):
Variables       Imports   Fatality Rate
Imports         0         0.010
Fatality Rate   0.010     0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 27
Scatterplot for Exercise 28
XLSTAT Output for Exercise 28
Correlation matrix (Pearson):
Variables   Width   Weight
Width       1       0.948
Weight      0.948   1
p-values (Pearson):
Variables   Width   Weight
Width       0       0.004
Weight      0.004   0
Values in bold are different from 0 with a significance level alpha=0.05
28. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.948$; P-value = 0.004 (Table: P-value < 0.01); Critical values: $r = \pm 0.811$; Reject $H_0$. There is sufficient evidence to support the claim of a linear correlation between the overhead width of a seal in a photograph and the weight of a seal.
29. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.571$; P-value = 0.000 (Table: P-value < 0.05); Critical values: $r = \pm 0.074$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the tip amount. It does appear that riders base their tips on the time of the ride. While the smaller sample of eight pairs of data listed in Exercise 15 did not provide sufficient evidence to support a claim of a linear correlation, the larger data set of 703 pairs of data does provide sufficient evidence to support that claim.
XLSTAT Output for Exercise 29
Correlation matrix (Pearson):
Variables   Time    Tip
Time        1       0.571
Tip         0.571   1
p-values (Pearson):
Variables   Time      Tip
Time        0         <0.0001
Tip         <0.0001   0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 29
Scatterplot for Exercise 30
XLSTAT Output for Exercise 30
Correlation matrix (Pearson):
Variables   Distance   Tip
Distance    1          0.616
Tip         0.616      1
p-values (Pearson):
Variables   Distance   Tip
Distance    0          <0.0001
Tip         <0.0001    0
Values in bold are different from 0 with a significance level alpha=0.05
30. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.616$; P-value = 0.000 (Table: P-value < 0.05); Critical values: $r = \pm 0.074$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the tip amount. It does appear that riders base their tips on the distance of the ride. While the smaller sample of eight pairs of data did not provide sufficient evidence to support a claim of a linear correlation, the larger data set of 703 pairs of data does provide sufficient evidence to support that claim.
31. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.934$; P-value = 0.000 (Table: P-value < 0.05); Critical values: $r = \pm 0.074$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the fare. The smaller sample of eight pairs of data leads to the same conclusion as the larger data set of 703 pairs of data, and the values of the linear correlation coefficient $r$ are not dramatically different.
XLSTAT Output for Exercise 31
Correlation matrix (Pearson):
Variables   Distance   Fare
Distance    1          0.934
Fare        0.934      1
p-values (Pearson):
Variables   Distance   Fare
Distance    0          <0.0001
Fare        <0.0001    0
Values in bold are different from 0 with a significance level alpha=0.05
XLSTAT Output for Exercise 32
Correlation matrix (Pearson):
Variables   Time    Fare
Time        1       0.878
Fare        0.878   1
p-values (Pearson):
Variables   Time      Fare
Time        0         <0.0001
Fare        <0.0001   0
Values in bold are different from 0 with a significance level alpha=0.05
Scatterplot for Exercise 31
Scatterplot for Exercise 32
32. $H_0: \rho = 0$; $H_1: \rho \ne 0$; $r = 0.878$; P-value = 0.000 (Table: P-value < 0.05); Critical values: $r = \pm 0.074$; Reject $H_0$. There is sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the fare. It does appear that riders base their tips on the time of the ride. The smaller sample of eight pairs of data leads to the same conclusion as the larger data set of 703 pairs of data, and the values of the linear correlation coefficient $r$ are not dramatically different.
33. Answers vary, but the following is typical. Resampling 1000 times, there are 108 results that are at least as extreme as $r = 0.532$ found from the sample data. (The results at least as extreme are those with $r = 0.532$ or greater and those with $r = -0.532$ or lower.) It appears that the likelihood of getting a result at least as extreme as the one obtained is 0.108, so there is not sufficient evidence to support the claim that there is a linear correlation between lottery jackpots and numbers of tickets sold.
34. Answers vary, but the following is typical. Resampling 1000 times, there are no results that are at least as extreme as $r = 0.985$ found from the sample data. (The results at least as extreme are those with $r = 0.985$ or greater and those with $r = -0.985$ or lower.) It appears that the likelihood of getting a result at least as extreme as the one obtained is 0.000, so there is sufficient evidence to support the claim that there is a linear correlation between lottery jackpots and numbers of tickets sold.
35. Answers vary, but the following is typical. Resampling 1000 times, there are 473 results that are at least as extreme as $r = 0.298$ found from the sample data. (The results at least as extreme are those with $r = 0.298$ or greater and those with $r = -0.298$ or lower.) It appears that the likelihood of getting a result at least as extreme as the one obtained is 0.473, so there is not sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the tip amount.
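The resampling approach described in Exercises 33 through 35 can be sketched as a permutation test: shuffle one variable many times, recompute $r$ each time, and count how often the result is at least as extreme as the observed value. This is only an illustrative sketch; the paired data below are hypothetical (they are not the jackpot or ride data), and the function names and the fixed seed are assumptions made so the example is repeatable.

```python
import math
import random

def pearson_r(xs, ys):
    """Linear correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def resampling_p_value(xs, ys, resamples=1000, seed=0):
    """Proportion of shuffled samples with |r| at least as large as the observed |r|."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    extreme = 0
    for _ in range(resamples):
        rng.shuffle(shuffled)  # break any pairing between x and y
        if abs(pearson_r(xs, shuffled)) >= observed:
            extreme += 1
    return extreme / resamples

# Hypothetical paired data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [3, 1, 4, 1, 5, 9, 2, 6]
print(f"r = {pearson_r(xs, ys):.3f}, estimated P-value = {resampling_p_value(xs, ys)}")
```

As the answers note, results vary from run to run when no seed is fixed, so a proportion such as 0.108 or 0.473 is typical rather than exact.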
36. Answers vary, but the following is typical. Resampling 1000 times, there are 822 results that are at least as extreme as $r = -0.114$ found from the sample data. (The results at least as extreme are those with $r = 0.114$ or greater and those with $r = -0.114$ or lower.) It appears that the likelihood of getting a result at least as extreme as the one obtained is 0.822, so there is not sufficient evidence to support the claim that there is a linear correlation between the distance of the ride and the tip amount.
37. With $n = 703$, there are $703 - 2 = 701$ degrees of freedom. From Table A-3 use the closest $t$ value of 1.965 in the given formula to get the critical values of $r = \pm 0.074$. Using a more accurate value of $t = 1.963354$ from technology leads to the same critical values of $r = \pm 0.074$.
$r = \pm\dfrac{1.965}{\sqrt{1.965^2 + 703 - 2}} = \pm 0.074$
$r = \pm\dfrac{1.963354}{\sqrt{1.963354^2 + 703 - 2}} = \pm 0.074$

Section 10-2: Regression
1. a. $\hat{y}$ represents the predicted value of highway fuel consumption.
b. The slope is $-0.00749$ and the y-intercept is 58.9.
c. The predictor variable is weight, which is represented by $x$.
d. $\hat{y} = 58.9 - 0.00749(3000) = 36.4$ mi/gal
2. The first equation represents the regression line that best fits sample data, and the second equation represents the regression line that best fits all paired data in a population. The values b0 and b1 are statistics representing the y-intercept and slope of the straight line that best fits the sample data. The values β0 and β1 are parameters representing the y-intercept and slope of the straight line that best fits the data in the population of paired data.
3. a. A residual is a value of y − ŷ, which is the difference between an observed value of y and a predicted value of y.
b. The regression line has the property that the sum of squares of the residuals is the lowest possible sum.
4. The value of r and the value of b1 have the same sign. They are both positive or they are both negative or they are both 0. If r is positive, the regression line has a positive slope and rises from left to right. If r is negative, the slope of the regression line is negative and it falls from left to right.
5. With no significant linear correlation, the best predicted value is ȳ = 37.3 mi/gal.
6. With a significant linear correlation, the best predicted value is ŷ = −212 + 61.9(6.5) = 190 lb.
7. With a significant linear correlation, the best predicted value is ŷ = −106 + 1.10(180) = 92.0 kg.
8. With no significant linear correlation, the best predicted value is ȳ = 1.26 mg of nicotine.
9. ŷ = 3.00 + 0.500x; The data have a pattern that is not a straight line.
XLSTAT Output for Exercise 9
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   3.001   1.125            2.667   0.026
x           0.500   0.118            4.239   0.002
[Scatterplot for Exercise 9]

10. ŷ = 3.00 + 0.500x; There is an outlier.
XLSTAT Output for Exercise 10
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   3.002   1.124            2.670   0.026
x           0.500   0.118            4.239   0.002
[Scatterplot for Exercise 10]

11. a. ŷ = 0.264 + 0.906x
XLSTAT Output
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   0.264   0.565            0.468   0.653
x           0.906   0.150            6.041   0.000
b. ŷ = 2 + 0x (or ŷ = 2)
XLSTAT Output
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   2.000   0.816            2.449   0.044
x           0.000   0.378            0.000   1.000
c. The results are very different, indicating that one point can dramatically affect the regression equation.
12. a. ŷ = 0.0846 + 0.985x
XLSTAT Output
Model parameters (y):
Source      Value   Standard error   t        Pr > |t|
Intercept   0.085   0.486            0.174    0.868
x           0.985   0.071            13.803   <0.0001
b. ŷ = 1.5 + 0x (or ŷ = 1.5)
XLSTAT Output
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   1.500   1.118            1.342   0.312
x           0.000   0.707            0.000   1.000
c. ŷ = 9.5 + 0x (or ŷ = 9.5)
XLSTAT Output
Model parameters (y):
Source      Value   Standard error   t       Pr > |t|
Intercept   9.500   6.727            1.412   0.293
x           0.000   0.707            0.000   1.000
d. The results are very different, indicating that combinations of clusters can produce results that differ dramatically from results within each cluster alone.
13. ŷ = 7.97 + 0.0756x; r = 0.532; P-value = 0.114; With no significant linear correlation, the best predicted value is ȳ = 25.6 million tickets. The best predicted value is very different from the actual value of 90 million tickets that were sold.
XLSTAT Output
Model parameters (Tickets):
Source      Value   Standard error   t       Pr > |t|
Intercept   7.966   10.559           0.754   0.472
Jackpot     0.076   0.043            1.775   0.114
14. ŷ = −7.72 + 0.159x; r = 0.985; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = −7.72 + 0.159(345) = 47 million tickets. The best predicted value is somewhat close to the actual value of 55 million tickets that were sold.
XLSTAT Output
Model parameters (Tickets):
Source      Value    Standard error   t        Pr > |t|
Intercept   -7.725   2.878            -2.684   0.028
Jackpot     0.159    0.010            16.041   <0.0001
15. ŷ = 1.06 + 0.0452x; r = 0.298; P-value = 0.473; With no significant linear correlation, the best predicted value is ȳ = $1.68. The best predicted value is very different from the actual tip of $4.55.
XLSTAT Output
Model parameters (Tip):
Source      Value   Standard error   t       Pr > |t|
Intercept   1.056   1.002            1.053   0.333
Time        0.045   0.059            0.765   0.473
16. ŷ = 1.83 − 0.0399x; r = −0.114; P-value = 0.789; With no significant linear correlation, the best predicted value is ȳ = $1.68. The best predicted value is very different from the actual tip of $4.55.
XLSTAT Output
Model parameters (Tip):
Source      Value    Standard error   t        Pr > |t|
Intercept   1.826    0.790            2.312    0.060
Distance    -0.040   0.142            -0.280   0.789
17. ŷ = 5.19 + 2.70x; r = 0.986; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 5.19 + 2.70(3.10) = $13.55 (or $13.56). The best predicted value is close to the actual fare of $15.30.
XLSTAT Output
Model parameters (Fare):
Source      Value   Standard error   t        Pr > |t|
Intercept   5.185   1.050            4.940    0.003
Distance    2.699   0.189            14.268   <0.0001
18. ŷ = −0.709 + 1.13x; r = 0.953; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = −0.709 + 1.13(20) = $21.89 (or $21.82). The best predicted value isn’t very close to the actual fare of $15.30.
XLSTAT Output
Model parameters (Fare):
Source      Value    Standard error   t        Pr > |t|
Intercept   -0.709   2.493            -0.285   0.786
Time        1.126    0.147            7.665    0.000
19. ŷ = 50.0 − 0.0886x; r = −0.139; P-value = 0.667; With no significant linear correlation, the best predicted value is ȳ = 46.4 years. The best predicted value isn’t close to the actual value of 60 years.
XLSTAT Output
Model parameters (Actress):
Source         Value    Standard error   t        Pr > |t|
Intercept      49.959   8.411            5.939    0.000
Age of Actor   -0.089   0.200            -0.443   0.667
20. ŷ = 284 − 0.567x; r = −0.446; P-value = 0.229; With no significant linear correlation, the best predicted value is ȳ = 178.9 cm. The best predicted value differs from the actual value by 10 cm (rounded), which isn’t a very large discrepancy.
XLSTAT Output
Model parameters (Opponent):
Source      Value     Standard error   t        Pr > |t|
Intercept   284.289   80.023           3.553    0.009
President   -0.567    0.430            -1.318   0.229
21. ŷ = 0.0329 + 0.969x; r = 0.990; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 0.0329 + 0.969(4.00) = $3.91.
XLSTAT Output
Model parameters (Subway Fare):
Source       Value   Standard error   t        Pr > |t|
Intercept    0.033   0.094            0.351    0.735
Pizza Cost   0.969   0.049            19.760   <0.0001
22. ŷ = 27.0 + 82.3x; r = 0.985; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 27.0 + 82.3(3.00) = 273.9 (or 274.0).
XLSTAT Output
Model parameters (CPI):
Source        Value    Standard error   t        Pr > |t|
Intercept     26.996   9.482            2.847    0.022
Subway Fare   82.334   5.021            16.397   <0.0001
23. ŷ = 350 + 5.21x; r = 0.857; P-value = 0.007; With a significant linear correlation, the best predicted value is ŷ = 350 + 5.21(273) = 1772 mm. The best predicted height is close to the actual height.
XLSTAT Output
Model parameters (Height):
Source        Value     Standard error   t       Pr > |t|
Intercept     350.270   342.888          1.022   0.346
Foot Length   5.208     1.278            4.076   0.007
24. ŷ = 27.6 + 0.0523x; r = 0.874; P-value = 0.005; With a significant linear correlation, the best predicted value is ŷ = 27.6 + 0.0523(3000) = 185°F (or 184°F). The value of 3000 chirps in one minute is well beyond the scope of the listed sample data, so the extrapolation might be off by a considerable amount, especially if the cricket is dead from such a high temperature.
XLSTAT Output
Model parameters (Temperature):
Source       Value    Standard error   t       Pr > |t|
Intercept    27.628   12.170           2.270   0.064
Chirps/min   0.052    0.012            4.399   0.005
25. ŷ = 0.923 + 0.00665x; r = 0.174; P-value = 0.681; With no significant linear correlation, the best predicted value is ȳ = 57 points. The best predicted value isn’t close to the actual value of 37 points.
XLSTAT Output
Model parameters (Points):
Source              Value   Standard error   t       Pr > |t|
Intercept           0.923   129.862          0.007   0.995
Car Sales (1000s)   0.007   0.015            0.432   0.681
26. ŷ = −988 + 157x; r = 0.959; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = −988 + 157(12.0) = 896 (or 897). Given the spurious nature of the data, it is not too likely that the predicted value will be accurate.
XLSTAT Output
Model parameters (PhDs):
Source            Value      Standard error   t        Pr > |t|
Intercept         -988.146   167.095          -5.914   0.000
Cheese (pounds)   157.109    16.490           9.527    <0.0001
27. ŷ = 16.5 − 0.00282x; r = −0.959; P-value = 0.010; With a significant linear correlation, the best predicted value is ŷ = 16.5 − 0.00282(500) = 15.1 fatalities per 100,000 population. Common sense suggests that the prediction doesn’t make much sense.
XLSTAT Output
Model parameters (Fatality Rate):
Source          Value      Standard error   t        Pr > |t|
Intercept       16.491     0.188            87.704   <0.0001
Lemon Imports   -0.00282   0.000            -5.858   0.010
28. ŷ = −157 + 40.2x; r = 0.948; P-value = 0.004; With a significant linear correlation, the best predicted value is ŷ = −157 + 40.2(2) = −76.6 kg (or −76.5 kg). The prediction is a negative weight that cannot be correct. The overhead width of 2 cm is well beyond the scope of the sample widths, so the extrapolation might be off by a considerable amount. Clearly, the predicted negative weight makes no sense.
XLSTAT Output
Model parameters (Weight):
Source           Value      Standard error   t        Pr > |t|
Intercept        -156.879   57.414           -2.732   0.052
Overhead Width   40.182     6.712            5.986    0.004
29. ŷ = 0.174 + 0.116x; r = 0.571; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 0.174 + 0.116(20) = $2.49. (Unlike Exercise 15, this larger data set results in a significant linear correlation, so the predicted value is not ȳ.) The best predicted value isn’t very close to the actual tip of $4.55.
XLSTAT Output
Model parameters (Tip):
Source      Value   Standard error   t        Pr > |t|
Intercept   0.174   0.108            1.607    0.108
Time        0.116   0.006            18.429   <0.0001
30. ŷ = 0.820 + 0.439x; r = 0.616; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 0.820 + 0.439(3.1) = $2.18. (Unlike Exercise 16, this larger data set results in a significant linear correlation, so the predicted value is not ȳ.) The best predicted value isn’t very close to the actual tip of $4.55.
XLSTAT Output
Model parameters (Tip):
Source      Value   Standard error   t        Pr > |t|
Intercept   0.820   0.077            10.721   <0.0001
Distance    0.439   0.021            20.679   <0.0001
31. ŷ = 5.95 + 2.86x; r = 0.934; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 5.95 + 2.86(3.1) = $14.82 (or $14.80). The best predicted value is close to the actual fare of $15.30.
XLSTAT Output
Model parameters (Fare):
Source      Value   Standard error   t        Pr > |t|
Intercept   5.948   0.149            40.032   <0.0001
Distance    2.856   0.041            69.343   <0.0001
32. ŷ = 1.60 + 0.765x; r = 0.878; P-value = 0.000; With a significant linear correlation, the best predicted value is ŷ = 1.60 + 0.765(20) = $16.90 (or $16.91). The best predicted value is reasonably close to the actual fare of $15.30.
XLSTAT Output
Model parameters (Fare):
Source      Value   Standard error   t        Pr > |t|
Intercept   1.601   0.270            5.922    <0.0001
Time        0.765   0.016            48.639   <0.0001
33. a. See the tables below.
b. For ŷ = −10.9 + 0.174x, the sum of squares of the residuals is 137.862.
c. For ŷ = −10.0 + 0.200x, the sum of squares of the residuals is 535.560, which is larger than 137.862, the sum of squares of the residuals for the regression line.

ŷ = −10.9 + 0.174x
x      ŷ        y − ŷ     (y − ŷ)²
334    47.216   6.784     46.023
127    11.198   4.802     23.059
300    41.300   –0.300    0.090
227    28.598   –1.598    2.554
202    24.248   –1.248    1.558
180    20.420   –2.420    5.856
164    17.636   0.364     0.132
145    14.330   1.670     2.789
255    33.470   –7.470    55.801
Sum:                      137.862

ŷ = −10.0 + 0.200x
x      ŷ        y − ŷ     (y − ŷ)²
334    56.8     –2.800    7.840
127    15.4     0.600     0.360
300    50.0     –9.000    81.000
227    35.4     –8.400    70.560
202    30.4     –7.400    54.760
180    26.0     –8.000    64.000
164    22.8     –4.800    23.040
145    19.0     –3.000    9.000
255    41.0     –15.000   225.000
Sum:                      535.560
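The comparison in Exercise 33 can be checked with a short sketch like the one below. The x values come from the tables; the y values are an assumption, back-computed here from the tabled ŷ and residual columns (y = ŷ + residual), and the function name is illustrative.

```python
# Jackpot values (x) from the Exercise 33 tables.
x_vals = [334, 127, 300, 227, 202, 180, 164, 145, 255]
# Ticket values (y): NOT listed in the tables; back-computed as y-hat + residual.
y_vals = [54, 16, 41, 27, 23, 18, 18, 16, 26]


def sum_squared_residuals(b0, b1, xs, ys):
    """Sum of (y - y_hat)^2 for the line y_hat = b0 + b1 * x."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


sse_regression = sum_squared_residuals(-10.9, 0.174, x_vals, y_vals)
sse_other_line = sum_squared_residuals(-10.0, 0.200, x_vals, y_vals)

print(round(sse_regression, 3))
print(round(sse_other_line, 3))
print(sse_regression < sse_other_line)
```

The regression line gives the smaller sum, which is the least-squares property described in Exercise 3(b) of this section.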
Section 10-3: Prediction Intervals and Variation
1. The value of s_e = 16.27555 cm is the standard error of estimate, which is a measure of the differences between the observed weights and the weights predicted from the regression equation. It is a measure of the variation of the sample points about the regression line.
2. We have 95% confidence that the limits of 59.0 kg and 123.6 kg contain the value of the weight for a male with a height of 180 cm. The major advantage of using a prediction interval is that it provides us with a range of likely weights, so we have a sense of how accurate the predicted weight is likely to be. The terminology of prediction interval is used for an interval estimate of a variable, whereas the terminology of confidence interval is used for an interval estimate of a population parameter.
3. The coefficient of determination is r² = 0.155. We know that 15.5% of the variation in weight is explained by the linear correlation between height and weight, and 84.5% of the variation in weight is explained by other factors and/or random variation.
4. For the paired weights, s_e = 0 because there is an exact conversion formula with no error. For a student who weighs 100 lb, the predicted weight is 45.4 kg, and there is no prediction interval because the conversion yields an exact result.
5. r² = (0.298)² = 0.089; 8.9% of the variation in tips is explained by the linear correlation between times and tips, and 91.1% of the variation in tips is explained by other factors and/or random variation.
6. r² = (−0.114)² = 0.013; 1.3% of the variation in tips is explained by the linear correlation between distance and tips, and 98.7% of the variation in tips is explained by other factors and/or random variation.
7. r² = (0.986)² = 0.972; 97.2% of the variation in fares is explained by the linear correlation between distances and fares, and 2.8% of the variation in fares is explained by other factors and/or random variation.
8. r² = (0.953)² = 0.908; 90.8% of the variation in fares is explained by the linear correlation between times and fares, and 9.2% of the variation in fares is explained by other factors and/or random variation.
9. r = −0.788; Critical values: r = ±0.576, assuming a 0.05 significance level. There is sufficient evidence to support a claim of a linear correlation between weights of large cars and the highway fuel consumption amounts.
10. Excel: 0.620, or 62.0%; r² = (−0.788)² = 0.621, or 62.1%
11. 29.0 mi/gal
12. 24.6 mi/gal < y < 33.3 mi/gal; We have 95% confidence that the limits of 24.6 mi/gal and 33.3 mi/gal contain the amount of highway fuel consumption for a car weighing 4000 pounds.
13. ŷ = 59.547 − 0.00764(3500) = 32.81 mi/gal
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (2.228)(1.8756368)\sqrt{1 + \dfrac{1}{12} + \dfrac{12(3500 - 4042.1)^2}{12(197{,}046{,}189) - 48{,}505^2}} = 4.91$
95% CI: 32.81 ± 4.91, or 27.9 mi/gal < y < 37.7 mi/gal
XLSTAT Output
Predictions for the new observations (Highway):
Observation   Weight     Pred(Highway)   Lower bound 95% (Observation)   Upper bound 95% (Observation)
PredObs1      3500.000   32.808          27.896                          37.720
14. ŷ = 59.547 − 0.00764(4200) = 27.46 mi/gal
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (3.169)(1.8756368)\sqrt{1 + \dfrac{1}{12} + \dfrac{12(4200 - 4042.1)^2}{12(197{,}046{,}189) - 48{,}505^2}} = 6.23$
99% CI: 27.46 ± 6.23, or 21.2 mi/gal < y < 33.7 mi/gal
XLSTAT Output
Predictions for the new observations (Highway):
Observation   Weight     Pred(Highway)   Lower bound 99% (Observation)   Upper bound 99% (Observation)
PredObs1      4200.000   27.460          21.201                          33.719
15. ŷ = 59.547 − 0.00764(3800) = 30.52 mi/gal
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (3.169)(1.8756368)\sqrt{1 + \dfrac{1}{12} + \dfrac{12(3800 - 4042.1)^2}{12(197{,}046{,}189) - 48{,}505^2}} = 6.35$
99% CI: 30.52 ± 6.35, or 24.2 mi/gal < y < 36.9 mi/gal
XLSTAT Output
Predictions for the new observations (Highway):
Observation   Weight     Pred(Highway)   Lower bound 99% (Observation)   Upper bound 99% (Observation)
PredObs1      3800.000   30.516          24.161                          36.871
16. ŷ = 59.547 − 0.00764(3750) = 30.90 mi/gal
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (2.228)(1.8756368)\sqrt{1 + \dfrac{1}{12} + \dfrac{12(3750 - 4042.1)^2}{12(197{,}046{,}189) - 48{,}505^2}} = 4.52$
95% CI: 30.90 ± 4.52, or 26.4 mi/gal < y < 35.4 mi/gal
XLSTAT Output
Predictions for the new observations (Highway):
Observation   Weight     Pred(Highway)   Lower bound 95% (Observation)   Upper bound 95% (Observation)
PredObs1      3750.000   30.898          26.378                          35.418
17. a. 10,626.59
b. 68.83577
c. 95% CI: 38.0°F < y < 60.4°F
ŷ = 72.5 − 3.68(6.327) = 49.2
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (2.5706)(3.710411)\sqrt{1 + \dfrac{1}{7} + \dfrac{7(6.327 - 20.143)^2}{7(3623) - 141^2}} = 11.2$
XLSTAT Output
Predictions for the new observations (Temperature):
Observation   Altitude   Pred(Temperature)   Lower bound 95% (Observation)   Upper bound 95% (Observation)
PredObs1      6.327      49.188              37.956                          60.419
18. a. 3210.364
b. 1087.191
c. 99% CI: $10,400 < y < $105,000
ŷ = 27.7 + 0.0373(800) = 57.5
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (3.4995)(12.462466)\sqrt{1 + \dfrac{1}{9} + \dfrac{9(800 - 443.1)^2}{9(4{,}076{,}640) - 3988^2}} = 47.1$
XLSTAT Output
Predictions for the new observations (Salary):
Observation   Income    Pred(Salary)   Lower bound 99% (Observation)   Upper bound 99% (Observation)
PredObs1      800.000   57.528         10.430                          104.627
19. a. 352.7278
b. 109.3722
c. 90% CI: 71.09°F < y < 88.71°F
ŷ = 27.628 + 0.05227(1000) = 79.90
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (1.943)(4.2695)\sqrt{1 + \dfrac{1}{8} + \dfrac{8(1000 - 1016.25)^2}{8(8{,}391{,}204) - 8130^2}} = 8.81$
XLSTAT Output
Predictions for the new observations (Temperature):
Observation   Chirps     Pred(Temperature)   Lower bound 90% (Observation)   Upper bound 90% (Observation)
PredObs1      1000.000   79.901              71.093                          88.708
20. a. 8880.12
b. 991.1515
c. 99% CI: 125.0 kg < y < 284.5 kg
ŷ = −156.879 + 40.182(9.0) = 204.76
$E = t_{\alpha/2}\, s_e \sqrt{1 + \dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (4.604)(15.7413)\sqrt{1 + \dfrac{1}{6} + \dfrac{6(9.0 - 8.5)^2}{6(439) - 51^2}} = 79.79$
XLSTAT Output
Predictions for the new observations (Weight):
Observation   Width   Pred(Weight)   Lower bound 99% (Observation)   Upper bound 99% (Observation)
PredObs1      9.000   204.758        124.966                         284.549
21. ŷ = −10.871 + 0.174(625) = 97.879
$E = t_{\alpha/2}\, s_e \sqrt{\dfrac{1}{n} + \dfrac{n(x_0 - \bar{x})^2}{n\left(\sum x^2\right) - \left(\sum x\right)^2}} = (2.365)(4.43722)\sqrt{\dfrac{1}{9} + \dfrac{9(625 - 214.889)^2}{9(455{,}364) - 1934^2}} = 21.863$
95% CI: 97.879 ± 21.863, or 76.1 million tickets < y < 120 million tickets
XLSTAT Output
Predictions for the new observations (Tickets):
Observation   Jackpot   Pred(Tickets)   Lower bound 95% (Mean)   Upper bound 95% (Mean)
PredObs1      625.000   97.985          76.125                   119.844
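The margin-of-error calculations in Exercises 13 through 21 all use the same formula, so they can be checked with a short sketch such as the following. The function name and the `individual` flag are illustrative choices, not part of the textbook or of XLSTAT; the critical t values are typed in from the solutions rather than computed.

```python
import math


def margin_of_error(t_crit, s_e, n, x0, sum_x, sum_x2, individual=True):
    """Margin of error E at x0 for simple linear regression.

    individual=True  -> interval for a single y value (leading 1 under the root)
    individual=False -> interval for the mean of y (no leading 1), as in Exercise 21
    """
    x_bar = sum_x / n
    spread = n * (x0 - x_bar) ** 2 / (n * sum_x2 - sum_x ** 2)
    inside = (1.0 if individual else 0.0) + 1.0 / n + spread
    return t_crit * s_e * math.sqrt(inside)


# Exercise 13: weight 3500 lb, n = 12, t = 2.228, s_e = 1.8756368
e13 = margin_of_error(2.228, 1.8756368, 12, 3500, 48505, 197046189)
y_hat13 = 59.547 - 0.00764 * 3500
print(round(e13, 2), round(y_hat13 - e13, 1), round(y_hat13 + e13, 1))

# Exercise 21: jackpot 625, n = 9, t = 2.365, s_e = 4.43722, interval for the mean
e21 = margin_of_error(2.365, 4.43722, 9, 625, 1934, 455364, individual=False)
print(round(e21, 3))
```

Small differences from the XLSTAT bounds are expected because the hand calculations start from rounded coefficients.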
Section 10-4: Multiple Regression
1. The response variable is Speed (the mean speed of the winner) and the predictor variables are Distance, the number of Stages, and the number of Finishers.
2. No, instead of using four predictor variables, it is better to use the multiple regression equation with the original three predictor variables of Distance, Stages, and Finishers. Given that the adjusted R² did not change with the addition of the fourth variable for the number of entrants, we see that this additional variable should not be included because it did not improve the multiple regression equation.
3. The unadjusted R² increases (or remains the same) as more variables are included, but the adjusted R² is adjusted for the number of variables and sample size. The unadjusted R² incorrectly suggests that the best multiple regression equation is obtained by including all of the available variables, but by taking into account the sample size and number of predictor variables, the adjusted R² is much more helpful in weeding out variables that should not be included.
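The adjustment described in Exercise 3 can be illustrated with a brief sketch. It uses the standard adjusted R² formula and the value R² = 0.3649 reported in Exercise 6 of this section; the sample size n = 134 is an assumption chosen only because it reproduces the reported adjusted value of 0.3552, and the function name is illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2 for n observations and k predictor variables."""
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)


# R^2 = 0.3649 with two predictors; n = 134 is an assumed sample size.
adj_two_predictors = adjusted_r2(0.3649, n=134, k=2)

# Same R^2 with a third predictor that explains nothing extra: the adjusted value drops.
adj_three_predictors = adjusted_r2(0.3649, n=134, k=3)

print(round(adj_two_predictors, 4))
print(adj_three_predictors < adj_two_predictors)
```

Because the unadjusted R² cannot decrease when a variable is added, the drop in the adjusted value is what signals that the extra variable is not worth including.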
4. 89.7% of the variation in the mean speed of the winner can be explained by the variables of Distance, Stages, and Finishers, so 10.3% of the variation in the mean speed of the winner can be explained by other factors and/or random variation.
5. Son = 18.0 + 0.504 Father + 0.277 Mother
6. a. P-value < 0.0001
b. R² = 0.3649
c. adjusted R² = 0.3552
7. A P-value less than 0.0001 is low, but the values of R² (0.3649) and adjusted R² (0.3552) are not high. Although the multiple regression equation fits the sample data best, it is not a good fit, so it should not be used for predicting the height of a son based on the height of his father and the height of his mother.
8. Predicted height: Son = 18.0 + 0.504(70) + 0.277(60) = 70 in.; This result is not likely to be a good predicted value because the multiple regression equation is not a good model (based on the results from Exercise 7).
9. The weight of discarded paper, because it has the best combination of small P-value (0.000) and highest adjusted R² (0.411).
10. The weights of discarded metal and paper, because they have the best combination of small P-value (0.000) and highest adjusted R² (0.498).
11. PLAS = −0.170 + 0.290 METAL + 0.122 PAPER + 0.0777 GLASS; That equation has a low P-value of 0.000 and its adjusted R² value of 0.540 is the largest and it is substantially higher than any of the other values of adjusted R².
12. PLAS = −0.170 + 0.290(3.00) + 0.122(10.25) + 0.0777(9.35) = 2.68; The best predicted weight of discarded plastic is 2.68 lb (based on the result from Exercise 11). Because the adjusted R² of 0.540 is not very high for this equation, the predicted value might not be a good prediction.
13. The best regression equation is HWY = 58.9 − 0.00749 WEIGHT. The three different possible regression equations all have a P-value of 0.000. Given that the single predictor variable of Weight yields an adjusted R² of 0.787 that is only slightly less than the adjusted R² of 0.791 obtained by using the two predictor variables of Weight and Displacement, it is better to use the single predictor variable instead of two predictor variables. (The single predictor variable of Displacement has an adjusted R² of 0.506.) Because the adjusted R² of 0.787 isn’t very close to 1, it is likely that predicted values will not be very accurate. The possible models are shown below:
HWY = 58.9 − 0.00749 WEIGHT; adjusted R² = 0.787
HWY = 43.1 − 4.36 DISP; adjusted R² = 0.506
HWY = 58.0 − 0.00665 WEIGHT − 0.814 DISP; adjusted R² = 0.791
XLSTAT Outputs
Model parameters (HWY):
Source      Value    Standard error   t         Pr > |t|
Intercept   58.862   2.106            27.950    <0.0001
WEIGHT      -0.007   0.001            -13.210   <0.0001
Model parameters (HWY):
Source         Value    Standard error   t        Pr > |t|
Intercept      43.089   1.749            24.639   <0.0001
DISPLACEMENT   -4.361   0.622            -7.010   <0.0001
Model parameters (HWY):
Source         Value    Standard error   t        Pr > |t|
Intercept      57.977   2.187            26.505   <0.0001
WEIGHT         -0.007   0.001            -7.971   <0.0001
DISPLACEMENT   -0.814   0.602            -1.353   0.183
14. The three different possible regression equations all have a P-value of 0.000. Because the highest adjusted R² is 0.827, which is somewhat greater than the next highest adjusted R² of 0.808, the best regression equation uses both predictor variables, so the best regression equation is HEIGHT = 393 + 0.540 ARMSPAN + 1.40 FOOTLENGTH. Because the adjusted R² of 0.827 isn’t close to 1, it is likely that predicted values will not be very accurate. (Comment: It would also be reasonable to conclude that the adjusted R² of 0.808 is not dramatically less than 0.827, so it would be reasonable to conclude that the best regression equation is HEIGHT = 427 + 0.730 ARMSPAN.) (TI data: The P-value is 0.000 for all three possible regression equations. Using FootLength and ArmSpan as predictor variables yields an adjusted R² of 0.825, using only FootLength yields 0.702, and using only ArmSpan yields 0.815. Because 0.815 is only slightly less than 0.825, the best regression is HEIGHT = 451 + 0.716 ARMSPAN.) The possible models are shown below:
HEIGHT = 564 + 4.37 FOOTLENGTH, adjusted R² = 0.714
HEIGHT = 427 + 0.730 ARMSPAN, adjusted R² = 0.808
HEIGHT = 393 + 0.540 ARMSPAN + 1.40 FOOTLENGTH, adjusted R² = 0.827
XLSTAT Outputs
Model parameters (HEIGHT):
Source        Value     Standard error   t         Pr > |t|
Intercept     564.249   9.374            60.193    <0.0001
Foot Length   4.373     0.036            122.965   <0.0001
Model parameters (HEIGHT):
Source      Value     Standard error   t         Pr > |t|
Intercept   426.729   8.075            52.847    <0.0001
Arm Span    0.730     0.005            159.783   <0.0001
Model parameters (HEIGHT):
Source        Value     Standard error   t        Pr > |t|
Intercept     392.734   7.785            50.449   <0.0001
Foot Length   1.402     0.055            25.624   <0.0001
Arm Span      0.540     0.009            62.923   <0.0001
15. The best regression equation is ŷ = 109 − 0.00670x1, where x1 represents volume. It is best because it has the highest adjusted R² value of −0.0513 and the lowest P-value of 0.791. The three regression equations all have adjusted values of R² that are very close to 0, so none of them are good for predicting IQ. It does not appear that people with larger brains have higher IQ scores. The possible models are shown below:
ŷ = 109 − 0.00670x1, adjusted R² = −0.0513
ŷ = 101 − 0.00178x2, adjusted R² = −0.0555
ŷ = 108 − 0.00694x1 + 0.00722x2, adjusted R² = −0.113
XLSTAT Outputs
Model parameters (IQ):
Source      Value     Standard error   t        Pr > |t|
Intercept   108.547   28.169           3.853    0.001
VOL         -0.007    0.025            -0.269   0.791
Model parameters (IQ):
Source      Value     Standard error   t        Pr > |t|
Intercept   101.139   12.463           8.115    <0.0001
WT          -0.002    0.155            -0.011   0.991
Model parameters (IQ):
Source      Value     Standard error   t        Pr > |t|
Intercept   108.257   29.718           3.643    0.002
VOL         -0.007    0.026            -0.265   0.794
WT          0.007     0.163            0.044    0.965
16. The best regression equation is ŷ = −10.0 + 0.567x1 + 0.532x2, where x1 represents verbal IQ score and x2 represents performance IQ score. It is best because it has the highest adjusted R² value of 0.999 and the lowest P-value of 0.000. Because the adjusted R² is so close to 1, it is likely that predicted values will be very accurate. The possible models are shown below:
ŷ = 11.5 + 0.940x1, adjusted R² = 0.762
ŷ = 10.5 + 0.806x2, adjusted R² = 0.814
ŷ = −10.0 + 0.567x1 + 0.532x2, adjusted R² = 0.999
XLSTAT Outputs
Model parameters (IQ FULL):
Source      Value    Standard error   t        Pr > |t|
Intercept   11.504   4.091            2.812    0.006
IQ VERB     0.940    0.048            19.629   <0.0001
Model parameters (IQ FULL):
Source      Value    Standard error   t        Pr > |t|
Intercept   10.465   3.548            2.949    0.004
IQ PERF     0.806    0.035            22.938   <0.0001
Model parameters (IQ FULL):
Source      Value     Standard error   t         Pr > |t|
Intercept   -10.020   0.329            -30.498   <0.0001
IQ VERB     0.567     0.004            132.954   <0.0001
IQ PERF     0.532     0.004            150.489   <0.0001
17. For H0: β1 = 0, Test statistic: t = (0.769317 − 0)/0.0711414 = 10.813917; P-value < 0.0001; Reject H0 and conclude that the regression coefficient of b1 = 0.769 should be kept.
For H0: β2 = 0, Test statistic: t = (1.009510 − 0)/0.0338123 = 29.856; P-value < 0.0001; Reject H0 and conclude that the regression coefficient of b2 = 1.01 should be kept. It appears that the regression equation should include both independent variables of height and waist circumference.
18. 0.629 < β1 < 0.910; 0.943 < β2 < 1.08; Neither confidence interval includes 0, so neither of the two variables should be eliminated from the regression equation.
CI for β1: b1 − E < β1 < b1 + E, where E = t_{α/2} s_{b1}
0.7693 − 1.976(0.0711414) < β1 < 0.7693 + 1.976(0.0711414)
0.629 < β1 < 0.910
CI for β2: b2 − E < β2 < b2 + E, where E = t_{α/2} s_{b2}
1.0095 − 1.976(0.033812) < β2 < 1.0095 + 1.976(0.033812)
0.943 < β2 < 1.08
XLSTAT Output
Model parameters (WEIGHT):
Source      Value      Lower bound (95%)   Upper bound (95%)
Intercept   -149.452   -174.197            -124.707
HEIGHT      0.769      0.629               0.910
WAIST       1.010      0.943               1.076
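The test statistics in Exercise 17 and the confidence intervals in Exercise 18 use the same coefficient estimates and standard errors, so both can be reproduced with a sketch like the one below. The function names are illustrative, and the critical value 1.976 is taken from the solution rather than computed.

```python
def coefficient_t_statistic(b, s_b, beta_null=0.0):
    """Test statistic t = (b - beta_null) / s_b for a single regression coefficient."""
    return (b - beta_null) / s_b


def coefficient_ci(b, s_b, t_crit):
    """Confidence interval b - E < beta < b + E with E = t_crit * s_b."""
    margin = t_crit * s_b
    return b - margin, b + margin


# Exercise 17: test statistics for HEIGHT (b1) and WAIST (b2)
t1 = coefficient_t_statistic(0.769317, 0.0711414)
t2 = coefficient_t_statistic(1.009510, 0.0338123)

# Exercise 18: 95% confidence intervals using t = 1.976 from the solution
low1, high1 = coefficient_ci(0.7693, 0.0711414, 1.976)
low2, high2 = coefficient_ci(1.0095, 0.033812, 1.976)

print(round(t1, 3), round(t2, 3))
print(round(low1, 3), round(high1, 3))
print(round(low2, 3), round(high2, 3))
```

Neither interval contains 0, which agrees with rejecting H0 for both coefficients in Exercise 17.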
19. ŷ = 3.06 + 82.4x1 + 2.91x2, where x1 represents sex and x2 represents age.
Female: ŷ = 3.06 + 82.4(0) + 2.91(20) = 61 lb; Male: ŷ = 3.06 + 82.4(1) + 2.91(20) = 144 lb; The sex of the bear does appear to have an effect on its weight. The regression equation indicates that the predicted weight of a male bear is about 82 lb more than the predicted weight of a female bear with other characteristics being the same.
XLSTAT Output
Model parameters (WEIGHT):
Source      Value    Standard error   t       Pr > |t|
Intercept   3.062    22.458           0.136   0.892
AGE         2.905    0.297            9.769   <0.0001
SEX (1=M)   82.379   20.804           3.960   0.000

Section 10-5: Nonlinear Regression
1. Since the area of a square is the square of its side, the best model is quadratic: y = x²; R² = 1.
2. The exponential model is the best option because it has the highest R² value, but this is not a good model because the value of R² = 0.027 is so low. Using the models discussed in this section, it appears that we cannot make accurate predictions of the numbers of points scored in future Super Bowl games. Common sense suggests that no such model could be found.
3. 2.7% of the variation in Super Bowl points can be explained by the exponential model that relates the variable of year and the variable of points scored. Because such a small percentage of the variation is explained by the model, the model is not very useful.
4. Instead of showing a pattern that approximates the graph of the quadratic equation, the points are scattered about with no obvious pattern. The points do not fit the graph of the quadratic equation well, so the value of R² = 0.205 is very low.
5. Both quadratic and power: d = 0.8t²
Model         R²
Linear        0.9529
Quadratic     1.0000
Logarithmic   0.7704
Power         1.0000
Exponential   0.9189
6. power: Cost = 22x³
Model         R²
Linear        0.8689
Quadratic     0.9977
Logarithmic   0.6431
Power         1.0000
Exponential   0.9192
7. exponential: y = 1000(1.01)^x
Model         R²
Linear        0.9998
Quadratic     1.0000
Logarithmic   0.8654
Power         0.8747
Exponential   1.0000
8.
logarithmic: y 0.00476 4.34 ln x Model Linear Quadratic Logarithmic Power Exponential
9.
R2 0.8949 0.9876 0.8591 0.9972 0.8610
quadratic: y 0.000154x2 0.0799x 6.06; x is theyear with 2000 coded as 1, and y is theworld population in billions. Model Linear
R2 0.9998
Quadratic
0.9999
Logarithmic
0.8591
Power
0.8771
Exponential
0.9995
10. Quadratic: y = −194x² + 357x + 46,505 (with 1975 coded as 1, 1980 coded as 2, and so on). Projected value for 2025: 27,009 deaths. (Using the rounded coefficients, the predicted value is 26,958.)
Model         R²
Linear        0.6508
Quadratic     0.7011
Logarithmic   0.4797
Power         0.4701
Exponential   0.6471
11. Logarithmic: y 3.22 0.293ln x Model Linear Quadratic Logarithmic Power Exponential
R2 0.6198 0.9012 0.9972 0.9887 0.5662
12. logarithmic: y 77.8 35.1ln x Model Linear Quadratic Logarithmic Power Exponential
R2 0.7585 0.9291 0.9303 0.8221 0.8388
13. quadratic: y = 84x² − 953x + 13,289 (Result is based on the year 2000 coded as 1.) Using the rounded coefficients, the projected value for the last year is 25,506.0, which isn’t too far from the actual value of 26,828.4. Because R² = 0.925 for the quadratic model, which is high, predicted values are likely to be reasonably accurate, but we should remember that stock market values can be dramatically affected by events that cannot be foreseen by our most creative minds.
Model         R²
Linear        0.7020
Quadratic     0.9253
Logarithmic   0.4189
Power         0.4611
Exponential   0.7450
14. logarithmic: y = 135 − 26.7 ln x (Result is based on 1980 coded as 1.) With R² = 0.226, this best model is not a good model. A time-series plot shows that sunspots have a cyclical pattern that does not fit any of the five models included in this section.
Model         R²
Linear        0.2078
Quadratic     0.2084
Logarithmic   0.2262
Power         0.1683
Exponential   0.1811
15. Power: y = 7.89x^(−0.371); x is the depth and y is the magnitude. The predicted magnitude is y = 7.89(3.78)^(−0.371) = 4.82, which is far from the actual magnitude of 7.10. Because R² = 0.613 for the power model, which isn’t very high, predicted values are not likely to be very accurate.
Model   Linear   Quadratic   Logarithmic   Power    Exponential
R²      0.4027   0.5172      0.5298        0.6132   0.5001
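As a rough check of Exercise 15, the power model can be evaluated directly. This is a minimal sketch that assumes the rounded coefficients 7.89 and −0.371 and a depth of 3.78, as used in the answer; `power_model` is a hypothetical helper name.

```python
def power_model(depth: float, a: float = 7.89, b: float = -0.371) -> float:
    """Power model y = a * x^b with the rounded coefficients from Exercise 15."""
    return a * depth**b


predicted = power_model(3.78)
print(round(predicted, 2))  # compare with the 4.82 reported in the answer
```

The negative exponent matters: with a positive exponent the same inputs would give a magnitude well above 7.89, which would not match the reported prediction.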
16. Quadratic: y = 0.00641x² − 0.0280x + 13.8 (1880–1889 coded as 1.); the projected value is y = 0.00641(22)² − 0.0280(22) + 13.8 = 16.286°C (16.295°C using unrounded coefficients); the decade of 2090–2099 is too far beyond the scope of the available data, so the predicted value is questionable.
Model   Linear   Quadratic   Logarithmic   Power    Exponential
R²      0.7845   0.8776      0.5642        0.5697   0.7888
17. a. Exponential: y = 2^((2/3)(x − 1)) [or y = 0.629961(1.587401)^x for an initial value of 1 that doubles every 1.5 years].
b. Exponential: y = 1.36558(1.42774)^x, where x is the number of years after 1970.
Model   Linear   Quadratic   Logarithmic   Power   Exponential
R²      0.380    0.550       0.158         0.790   0.991
c. Moore’s law does appear to be working reasonably well. With R² = 0.991, the model appears to be very good.
18. a. 10,149.16
b. 74.07
c. The quadratic sum of squares of residuals (74.07) is less than the sum of squares of residuals from the linear model (10,149.16).
Quick Quiz
1.
2. 3.
0.008 < 0.05 and 0.846 > 0.707; conclude that there is sufficient evidence to support the claim of a linear correlation between the amount of the dinner and the amount of the tip. Since this is an exact algebraic rule, r = 1.
4. ŷ = 0 + 0.20x, or simply ŷ = 0.20x
5. None of the given values change when the variables are switched.
6. The value of r does not change if all values of one of the variables are multiplied by the same constant.
7. Because r must be between −1 and 1 inclusive, the value of 1.200 is the result of an error in the calculations.
8. 0.846 > 0.707; because there is sufficient evidence to support the claim of a linear correlation between the cost of dinner and the tip, the best predicted tip is −0.00777 + 0.145(84.62) = $12.26.
9. 0.132 < 0.707; because there is not sufficient evidence to support the claim of a linear correlation between the cost of dinner and the tip, the best predicted tip is found by computing the mean of the eight sample tips. The best predicted tip is $9.76.
10. Because r² = (0.846)² = 0.716, it follows that 0.716 (or 71.6%) of the variation in tips is explained by the linear relationship between amounts of dinner and amounts of tips. It then follows that 0.284 (or 28.4%) of the variation in tips is not explained by the linear relationship between amounts of dinner and amounts of tips.
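The arithmetic behind Quick Quiz answers 8 and 10 can be sketched as follows. The intercept −0.00777 and slope 0.145 are the rounded values quoted in answer 8; `predict_tip` is an illustrative name, not something defined in the text.

```python
def predict_tip(dinner_cost: float,
                intercept: float = -0.00777,
                slope: float = 0.145) -> float:
    """Regression prediction y-hat = intercept + slope * x (rounded coefficients)."""
    return intercept + slope * dinner_cost


r = 0.846
explained = r**2             # proportion of variation explained by the linear relationship
unexplained = 1 - explained  # proportion not explained

print(round(predict_tip(84.62), 2), round(explained, 3), round(unexplained, 3))
```

The regression prediction is only appropriate when the correlation is significant (answer 8); otherwise the sample mean of the tips is the best predicted value (answer 9).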
Review Exercises
1. r = 0.445; P-value = 0.318 (Table: P-value > 0.05); Critical values (α = 0.05): r = ±0.754; There is not sufficient evidence to support the claim that there is a linear correlation between size and revenue. It does not appear that a casino can increase its revenue by enlarging its size.
XLSTAT Output
Correlation matrix (Pearson):
Variables   Size    Revenue
Size        1       0.445
Revenue     0.445   1
p-values (Pearson):
Variables   Size    Revenue
Size        0       0.318
Revenue     0.318   0
Values in bold are different from 0 with a significance level alpha=0.05
2. a. ŷ = 63.9 + 0.443x
XLSTAT Output
Model parameters (Revenue):
Source      Value    Standard error   t       Pr > |t|
Intercept   63.876   64.834           0.985   0.370
Size        0.443    0.399            1.110   0.318
b. The best predicted value of revenue is ȳ = 134.7 million dollars. Because the predicted amount of revenue is 134.7 million dollars for any casino size, the prediction is not likely to be accurate.
3. a. r = 0.450
XLSTAT Output
Correlation matrix (Pearson):
Variables    Time (sec)   Height (m)
Time (sec)   1            0.450
Height (m)   0.450        1
p-values (Pearson):
Variables    Time (sec)   Height (m)
Time (sec)   0            0.192
Height (m)   0.192        0
Values in bold are different from 0 with a significance level alpha=0.05
b. With P-value = 0.192 (Table: P-value > 0.05) and critical values (α = 0.05) of r = ±0.632, there is not sufficient evidence to support the claim that there is a linear correlation between time and height.
c. Although there is no linear correlation between time and height, the scatterplot shows a very distinct pattern, revealing that time and height are associated by some function that is not linear. (The scatterplot appears to depict a parabola. The quadratic regression equation is y = −4.44x² + 9.13x + 0.0482.)
4. a. NICOTINE = −0.443 + 0.0968 TAR − 0.0262 CO, or ŷ = −0.443 + 0.0968x₁ − 0.0262x₂, where x₁ represents tar and x₂ represents carbon monoxide.
XLSTAT Output
Model parameters (Nicotine):
Source      Value    Standard error   t        Pr > |t|
Intercept   -0.443   0.429            -1.031   0.350
Tar         0.097    0.012            8.021    0.000
CO          -0.026   0.029            -0.897   0.411
b. R² = 0.936; adjusted R² = 0.910; P-value = 0.001
c. With high values of R² and adjusted R² and a small P-value of 0.001, it appears that the regression equation can be used to predict the amount of nicotine given the amounts of tar and carbon monoxide.
d. The predicted value is ŷ = −0.443 + 0.0968(23) − 0.0262(15) = 1.39 mg, or 1.4 mg rounded, which is close to the actual value of 1.3 mg of nicotine.
Cumulative Review Exercises
1. a. Is there a difference between the mean IQ score of airline passengers and the mean IQ score of police officers?
b. Test for a difference between the means of two independent populations using the methods of Section 9-2.
c. H0: μ1 = μ2; H1: μ1 ≠ μ2; population 1 = airline passengers, population 2 = police officers; Test statistic: t = −1.557; P-value = 0.1516 (Table: P-value > 0.05); Critical values (α = 0.05): t = ±2.239 (Table: t = ±2.262); Fail to reject H0. There is not sufficient evidence to support the claim that there is a difference between the mean IQ score of airline passengers and the mean IQ of police officers. (This 95% confidence interval could also be used: −20.0 < μ1 − μ2 < 3.59. Because the confidence interval includes 0, there is not sufficient evidence to support the claim that there is a difference between the mean IQ score of airline passengers and the mean IQ of police officers.)
\[ t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(102.1 - 110.3) - 0}{\sqrt{\dfrac{16.36^2}{10} + \dfrac{3.09^2}{10}}} = -1.557 \]
\[ (\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2}\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (102.1 - 110.3) \pm 2.262\sqrt{\frac{16.36^2}{10} + \frac{3.09^2}{10}} \quad (df = 10 - 1 = 9) \]
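The test statistic for Cumulative Review Exercise 1 can be recomputed from the summary statistics shown in the formula. The sketch below assumes only those rounded values (means 102.1 and 110.3, standard deviations 16.36 and 3.09, n = 10 each); the function names are illustrative, and the degrees-of-freedom formula is the usual Welch–Satterthwaite approximation, which appears to be what the XLSTAT output reports.

```python
import math


def two_sample_t(mean1, s1, n1, mean2, s2, n2):
    """t statistic for two independent samples without pooling the variances."""
    standard_error = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) / standard_error


def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximation to the degrees of freedom."""
    a, b = s1**2 / n1, s2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))


t_stat = two_sample_t(102.1, 16.36, 10, 110.3, 3.09, 10)
df = welch_df(16.36, 10, 3.09, 10)
print(round(t_stat, 3), round(df, 2))
```

The table-based solution uses the simpler conservative choice df = 10 − 1 = 9, which is why its critical value (2.262) is slightly larger than the software value (2.239).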
XLSTAT Outputs for Exercise 1
t-test for two independent samples / Two-tailed test:
Difference               -8.200
t (Observed value)       -1.557
|t| (Critical value)     2.239
DF                       9.643
p-value (Two-tailed)     0.152
alpha                    0.050
95% confidence interval on the difference between the means: ( -19.991, 3.591 )
2. a. Was the training course effective in raising the IQ scores? That is, do the before – after differences have a mean that is less than 0, showing that the course is effective in raising IQ scores?
b. Use the methods of Section 9-3 to test the claim that the mean of the before – after differences is less than 0, showing that the course is effective with larger after scores.
c. H0: μd = 0; H1: μd < 0; difference = before − after; Test statistic: t = −1.541; P-value = 0.0789 (Table: P-value > 0.05); Critical value (α = 0.05): t = −1.833; Fail to reject H0. There is not sufficient evidence to support the claim that the course is effective with higher after scores. (This 90% confidence interval could also be used: −18.0 < μd < 1.56. Because the confidence interval includes 0, there is not sufficient evidence to support the claim that the course is effective with higher after scores.)
\[ t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}} = \frac{-8.2 - 0}{16.83/\sqrt{10}} = -1.541 \]
\[ \bar{d} \pm t_{\alpha/2}\frac{s_d}{\sqrt{n}} = -8.2 \pm 1.833\,\frac{16.83}{\sqrt{10}} \quad (df = 10 - 1 = 9) \]
XLSTAT Outputs for Exercise 2
t-test for two paired samples / Lower-tailed test:
Difference               -8.200
t (Observed value)       -1.541
t (Critical value)       -1.833
DF                       9
p-value (one-tailed)     0.079
alpha                    0.050
90% confidence interval on the difference between the means: ( -17.957, 1.557 )
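The paired-sample calculation in Exercise 2 can be sketched from the summary values in the solution: mean difference −8.2, standard deviation of the differences 16.83, and n = 10. The critical value 1.833 is taken from the text rather than computed, and `paired_t` is a hypothetical helper name.

```python
import math


def paired_t(mean_diff: float, sd_diff: float, n: int, hypothesized: float = 0.0):
    """Return the t statistic and standard error for a mean of paired differences."""
    standard_error = sd_diff / math.sqrt(n)
    return (mean_diff - hypothesized) / standard_error, standard_error


t_stat, se = paired_t(-8.2, 16.83, 10)
margin = 1.833 * se                    # critical t for df = 9, as quoted in the text
lower, upper = -8.2 - margin, -8.2 + margin
print(round(t_stat, 3), round(lower, 2), round(upper, 2))
```

Small differences from the XLSTAT interval are expected because the summary statistics here are rounded.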
3. a. For professional horse jockeys, is there a correlation between weight and number of top three race finishes?
b. Use the methods of Section 10-1 to test for a linear correlation.
c. H0: ρ = 0; H1: ρ ≠ 0; r = −0.060; P-value = 0.869 (Table: P-value > 0.05); Critical values (α = 0.05): r = ±0.632; There is not sufficient evidence to support the claim that there is a linear correlation between weight and the number of top three race finishes.
XLSTAT Output
Correlation matrix (Pearson):
Variables        Weight   Top 3 Finishes
Weight           1        -0.060
Top 3 Finishes   -0.060   1
p-values (Pearson):
Variables        Weight   Top 3 Finishes
Weight           0        0.869
Top 3 Finishes   0.869    0
Values in bold are different from 0 with a significance level alpha=0.05
4. a. Because the table lists time series data, a key question is this: What is the trend of the data over time?
b. Use the methods of Section 2-3 to construct a time series graph that would reveal a trend of the data over time.
c. A time series graph clearly shows that there is a distinct trend of steadily increasing numbers of digital buyers over time. Businesses should ensure they can market and sell their goods and services online.
5. a. What is an estimate of the proportion of all adults who have wireless earbuds?
b. Use the methods of Section 7-1 to construct a confidence interval estimate of the proportion of all adults who use wireless earbuds.
c. 95% CI: 0.280 < p < 0.320; With 95% confidence, it is estimated that between 28.0% and 32.0% of all adults have wireless earbuds. (It would also be reasonable to conduct a hypothesis test, such as a test of the claim that fewer than 50% of adults have wireless earbuds. For the test H0: p = 0.5; H1: p < 0.5, the test statistic is z = −17.95 and P-value = 0.0000, so there is sufficient evidence to support the claim that fewer than 50% of adults have wireless earbuds.)
\[ z = \frac{\hat{p} - p}{\sqrt{\dfrac{pq}{n}}} = \frac{0.30 - 0.50}{\sqrt{\dfrac{(0.50)(0.50)}{2016}}} \]
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.30 \pm 1.96\sqrt{\frac{(0.30)(0.70)}{2016}} \]
XLSTAT Outputs for Exercise 5
z-test for one proportion / Two-tailed test:
Difference               -0.200
z (Observed value)       -17.960
|z| (Critical value)     1.960
p-value (Two-tailed)     <0.0001
alpha                    0.050
95% confidence interval on the proportion (Wald): ( 0.280, 0.320 )
6. a. Is Stephen Curry significantly tall in the population of adult males?
b. Using the methods of Section 3-3, convert Stephen Curry's height to a z score and use the range rule of thumb to determine whether his height is significantly high.
c. Converting Stephen Curry's height to a z score, we get z = (191 − 174.12)/7.10 = 2.38. Stephen Curry's height is 2.38 standard deviations above the mean, so his height is significantly high.
Excel Command for Exercise 6
fx | =STANDARDIZE(191,174.12,7.10)
7. a. Is the mean amount provided by the new device equal to 16 ounces? Is there anything else about the data suggesting that there is a problem with the new device?
b. Explore the sample data to see if there are any undesirable characteristics. Use the methods of Section 8-3 to test the claim that the mean of the amounts is equal to 16 ounces.
c. H0: μ = 16 oz; H1: μ ≠ 16 oz; Test statistic: t = 0.998; P-value = 0.3363 (Table: P-value > 0.20); Critical values (α = 0.05): t = ±2.160; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the mean is equal to 16 oz. (This 95% confidence interval could also be used: 15.42 oz < μ < 17.58 oz. Because the confidence interval includes 16 oz, there is not sufficient evidence to reject the claim that the mean is equal to 16 oz.) Although the mean appears to be OK, exploring the data reveals that the variation appears to be too high. The minimum sample value is 12.6 oz and the maximum is 19.2 oz, and they show that the amount of variation is far too high. Some containers would be dramatically underfilled while others would be overflowing. The filling device must be modified to correct the unacceptably high amount of variation.
\[ t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{16.5 - 16.0}{1.874/\sqrt{14}} = 0.998 \]
\[ \bar{x} \pm t_{\alpha/2}\frac{s}{\sqrt{n}} = 16.5 \pm 2.160\,\frac{1.874}{\sqrt{14}} \quad (df = 14 - 1 = 13) \]
XLSTAT Outputs for Exercise 7
One-sample t-test / Two-tailed test:
Difference               0.500
t (Observed value)       0.998
|t| (Critical value)     2.160
DF                       13
p-value (Two-tailed)     0.336
alpha                    0.050
95% confidence interval on the mean: ( 15.418, 17.582 )
XLSTAT Output for Exercise 8
95% confidence interval on the proportion (Wald): ( 0.058, 0.122 )
8. a. What is an estimate of the proportion of wrong results in the population of all of the drug tests?
b. Use the methods of Section 7-1 to construct a confidence interval estimate of the proportion p of wrong results in the population of all of the drug tests. Decide whether the proportion of wrong results is acceptable.
c. 95% CI: 0.0576 < p < 0.122; This shows that we have 95% confidence that the percentage of wrong test results is between 5.76% and 12.2%. Because wrong test results could possibly have adverse implications, such as unfair rejection of job applicants, it appears that the proportion of wrong results is unacceptably high.
\[ \hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = \frac{27}{300} \pm 1.96\sqrt{\frac{\left(\frac{27}{300}\right)\left(\frac{273}{300}\right)}{300}} \]
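The confidence interval in Exercise 8 follows the Wald form shown above. A minimal sketch, assuming 27 wrong results among 300 tests and the critical value 1.96 used in the solution; `wald_interval` is an illustrative name.

```python
import math


def wald_interval(successes: int, n: int, z: float = 1.96):
    """Wald confidence interval for a proportion: p-hat +/- z * sqrt(p-hat * q-hat / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = wald_interval(27, 300)
print(round(lower, 4), round(upper, 3))
```

The same function applies to Exercise 5 once the number of adults with wireless earbuds is known; only the sample proportion 0.30 and n = 2016 are given there.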
Chapter 11: Goodness-of-Fit and Contingency Tables
Section 11-1: Goodness-of-Fit
1. a. Observed values are represented by O and expected values are represented by E.
b. For the leading digit of 2, O = 62 and E = (317)(0.176) = 55.792.
c. For the leading digit of 2, (O − E)²/E = 0.691.
2. H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, …, p9 = 0.046; H1: At least one of the proportions is not equal to the given claimed value.
3. There is sufficient evidence to warrant rejection of the claim that the leading digits have a distribution that fits well with Benford’s law. Because the leading digits of inter-arrival traffic times do not fit the distribution described by Benford’s law, it appears that those times are not typical. The anomaly could be due to other factors, but there is a good chance that the computer has been hacked. Computer experts should be used to correct the incursion and secure the computer against further incursions.
4.
5. H0: p0 = p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 4.050; P-value = 0.908 (Table: P-value > 0.10); Critical value: χ² = 16.919; Do not reject H0. There is not sufficient evidence to warrant rejection of the claim that the last digits of the reported heights occur with about the same frequency. There is not sufficient evidence to conclude that the heights were reported instead of being measured.
\[ \chi^2 = \frac{(12 - 10.1)^2}{0.1 \cdot 101} + \frac{(9 - 10.1)^2}{0.1 \cdot 101} + \cdots + \frac{(11 - 10.1)^2}{0.1 \cdot 101} + \frac{(14 - 10.1)^2}{0.1 \cdot 101} = 4.050 \quad (df = 9) \]
XLSTAT Output for Exercise 5
Chi-square test:
Chi-square (Observed value)   4.050
Chi-square (Critical value)   16.919
DF                            9
p-value                       0.908
alpha                         0.050
6. H0: p0 = p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 96.762; P-value = 0.000 (Table: P-value < 0.005); Critical value: χ² = 16.919; Reject H0. There is sufficient evidence to reject the claim that the last digits of the reported heights occur with about the same frequency. The results suggest that the heights were reported and not measured. The larger data set in this exercise results in a different conclusion than the one obtained in the preceding exercise, so the larger data set does have a dramatic impact on the results.
\[ \chi^2 = \frac{(321 - 278.4)^2}{0.1 \cdot 2784} + \frac{(315 - 278.4)^2}{0.1 \cdot 2784} + \cdots + \frac{(297 - 278.4)^2}{0.1 \cdot 2784} + \frac{(329 - 278.4)^2}{0.1 \cdot 2784} = 96.762 \quad (df = 9) \]
XLSTAT Output for Exercise 6
Chi-square test:
Chi-square (Observed value)   96.761
Chi-square (Critical value)   16.919
DF                            9
p-value                       <0.0001
alpha                         0.050
7. H0: The frequency counts agree with the claimed distribution. H1: The frequency counts do not agree with the claimed distribution. Test statistic: χ² = 8.185; P-value = 0.516 (Table: P-value > 0.10); Critical value: χ² = 16.919; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the observed outcomes agree with the expected frequencies. The slot machine appears to be functioning as expected.
Excel command for Exercise 7
fx | =CHISQ.DIST.RT(8.185,10-1)
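For readers without Excel, the right-tail chi-square probability that the command above returns can be approximated with the Python standard library alone. This is only a sketch: `chi2_right_tail` is a hypothetical helper, it uses a power series for the regularized lower incomplete gamma function, and it is intended for moderate test statistics such as those in Exercises 7 and 8 (for very large statistics the result simply rounds to 0).

```python
import math


def chi2_right_tail(x: float, df: int, max_terms: int = 10_000) -> float:
    """Approximate P(chi-square with df degrees of freedom >= x)."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    # Series: P(a, z) = z^a e^(-z) / Gamma(a) * sum_{n>=0} z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    for n in range(1, max_terms):
        term *= z / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


print(round(chi2_right_tail(8.185, 10 - 1), 3))  # Exercise 7
print(round(chi2_right_tail(4.600, 3), 3))       # Exercise 8
```

The two calls correspond to the P-values reported for Exercises 7 and 8 (0.516 and 0.204); where SciPy is available, a library routine for the chi-square distribution would be the more usual choice.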
8. H0: p1 = p2 = p3 = p4 = 0.25; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 4.600; P-value = 0.204 (Table: P-value > 0.10); Critical value: χ² = 7.815; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the tires selected by the students are equally likely. It appears that when four students identify a tire, it is not likely that they would all select the same tire.
\[ \chi^2 = \frac{(11 - 10)^2}{0.25 \cdot 40} + \frac{(15 - 10)^2}{0.25 \cdot 40} + \frac{(8 - 10)^2}{0.25 \cdot 40} + \frac{(6 - 10)^2}{0.25 \cdot 40} = 4.600 \quad (df = 3) \]
XLSTAT Output for Exercise 8
Chi-square test:
Chi-square (Observed value)   4.600
Chi-square (Critical value)   7.815
DF                            3
p-value                       0.204
alpha                         0.050
9. H0: p1 = 0.756, p2 = 0.091, p3 = 0.108, p4 = 0.038, p5 = 0.007; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 524.713; P-value = 0.0000 (Table: P-value < 0.005); Critical value: χ² = 13.277; Reject H0. There is sufficient evidence to warrant rejection of the claim that the distribution of clinical trial participants fits well with the population distribution. Hispanics have an observed frequency of 60 and an expected frequency of 391.027, so they are very underrepresented. Also, the Asian/Pacific Islander subjects have an observed frequency of 54 and an expected frequency of 163.286, so they are also underrepresented.
\[ \chi^2 = \frac{(3855 - 3248.532)^2}{0.756 \cdot 4297} + \frac{(60 - 391.027)^2}{0.091 \cdot 4297} + \cdots + \frac{(54 - 163.286)^2}{0.038 \cdot 4297} + \frac{(12 - 30.079)^2}{0.007 \cdot 4297} = 524.713 \quad (df = 4) \]
XLSTAT Output for Exercise 9
Chi-square test:
Chi-square (Observed value)   524.713
Chi-square (Critical value)   13.277
DF                            4
p-value                       <0.0001
alpha                         0.010
10. H0: p1 = p2 = p3 = p4 = p5 = p6 = 1/6; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 5.860; P-value = 0.320 (Table: P-value > 0.10); Critical value: χ² = 11.071; Fail to reject H0. There is not sufficient evidence to support the claim that the outcomes are not equally likely. The outcomes appear to be equally likely, so the loaded die does not appear to behave differently from a fair die.
\[ \chi^2 = \frac{(27 - 33.333)^2}{200/6} + \frac{(31 - 33.333)^2}{200/6} + \cdots + \frac{(28 - 33.333)^2}{200/6} + \frac{(32 - 33.333)^2}{200/6} = 5.860 \quad (df = 5) \]
XLSTAT Output for Exercise 10
Chi-square test:
Chi-square (Observed value)   5.860
Chi-square (Critical value)   11.070
DF                            5
p-value                       0.320
alpha                         0.050
11. H0: pYS = 9/16, pGS = 3/16, pYW = 3/16, pGW = 1/16; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 11.161; P-value = 0.011 (Table: P-value < 0.025); Critical value: χ² = 7.815; Reject H0. There is sufficient evidence to support the claim that the results contradict Mendel’s theory.
\[ \chi^2 = \frac{(307 - 281.25)^2}{0.5625 \cdot 500} + \frac{(77 - 93.75)^2}{0.1875 \cdot 500} + \frac{(98 - 93.75)^2}{0.1875 \cdot 500} + \frac{(18 - 31.25)^2}{0.0625 \cdot 500} = 11.161 \quad (df = 3) \]
XLSTAT Output for Exercise 11
Chi-square test:
Chi-square (Observed value)   11.161
Chi-square (Critical value)   7.815
DF                            3
p-value                       0.011
alpha                         0.050
12. H0: The frequency counts agree with the claimed distribution. H1: The frequency counts do not agree with the claimed distribution. Test statistic: χ² = 0.976; P-value = 0.913 (Table: P-value > 0.10); Critical value: χ² = 9.488; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the actual frequencies fit a Poisson distribution. There is no way to prove that the frequencies fit a Poisson distribution even though it appears that they do.
\[ \chi^2 = \frac{(229 - 227.5)^2}{227.5} + \frac{(211 - 211.4)^2}{211.4} + \frac{(93 - 97.9)^2}{97.9} + \frac{(35 - 30.5)^2}{30.5} + \frac{(8 - 8.7)^2}{8.7} = 0.976 \quad (df = 4) \]
XLSTAT Output for Exercise 12
Chi-square test:
Chi-square (Observed value)   0.976
Chi-square (Critical value)   9.488
DF                            4
p-value                       0.913
alpha                         0.050
13. H0: p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = p10 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 13.689; P-value = 0.134 (Table: P-value > 0.10); Critical value: χ² = 16.919; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the likelihood of winning is the same for the different post positions. Based on these results, post position should not be considered when betting on the Kentucky Derby race.
\[ \chi^2 = \frac{(19 - 11.9)^2}{0.1 \cdot 119} + \frac{(14 - 11.9)^2}{0.1 \cdot 119} + \cdots + \frac{(5 - 11.9)^2}{0.1 \cdot 119} + \frac{(11 - 11.9)^2}{0.1 \cdot 119} = 13.689 \quad (df = 9) \]
XLSTAT Output for Exercise 13
Chi-square test:
Chi-square (Observed value)   13.689
Chi-square (Critical value)   16.919
DF                            9
p-value                       0.134
alpha                         0.050
14. H0: p0 = p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 250.293; P-value = 0.000 (Table: P-value < 0.005); Critical value: χ² = 16.919; Reject H0. There is sufficient evidence to warrant rejection of the claim that the last digits of the scores occur with about the same frequency. Because the last digit of 0 has the highest frequency by far, a good strategy is to choose 0’s for the last digits.
\[ \chi^2 = \frac{(113 - 42.4)^2}{0.1 \cdot 424} + \frac{(25 - 42.4)^2}{0.1 \cdot 424} + \cdots + \frac{(16 - 42.4)^2}{0.1 \cdot 424} + \frac{(20 - 42.4)^2}{0.1 \cdot 424} = 250.293 \quad (df = 9) \]
XLSTAT Output for Exercise 14
Chi-square test:
Chi-square (Observed value)   250.292
Chi-square (Critical value)   16.919
DF                            9
p-value                       <0.0001
alpha                         0.050
15. H0: p4 = 0.125, p5 = 0.25, p6 = 0.3125, p7 = 0.3125; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 7.658; P-value = 0.054 (Table: P-value > 0.05); Critical value: χ² = 7.815; Do not reject H0. There is not sufficient evidence to warrant rejection of the claim that the actual numbers of games fit the distribution indicated by the proportions listed in the given table. It appears that the actual games conform reasonably well with the results expected by theory.
\[ \chi^2 = \frac{(21 - 13.75)^2}{13.75} + \frac{(26 - 27.5)^2}{27.5} + \frac{(24 - 34.375)^2}{34.375} + \frac{(39 - 34.375)^2}{34.375} = 7.658 \quad (df = 3) \]
XLSTAT Output for Exercise 15
Chi-square test:
Chi-square (Observed value)   7.658
Chi-square (Critical value)   7.815
DF                            3
p-value                       0.054
alpha                         0.050
16. H0: pJan = pFeb = pMar = pApr = pMay = pJun = pJul = pAug = pSep = pOct = pNov = pDec = 1/12; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 93.072; P-value = 0.0000 (Table: P-value < 0.005); Critical value: χ² = 19.675; Reject H0. There is sufficient evidence to warrant rejection of the claim that American born Major League Baseball players are born in different months with the same frequency. The sample data appear to support Gladwell’s claim.
\[ \chi^2 = \frac{(387 - 376.25)^2}{4515/12} + \frac{(329 - 376.25)^2}{4515/12} + \cdots + \frac{(398 - 376.25)^2}{4515/12} + \frac{(371 - 376.25)^2}{4515/12} = 93.072 \quad (df = 11) \]
XLSTAT Output for Exercise 16
Chi-square test:
Chi-square (Observed value)   93.072
Chi-square (Critical value)   19.675
DF                            11
p-value                       <0.0001
alpha                         0.050
17. H0: pSun = pMon = pTue = pWed = pThu = pFri = pSat = 1/7; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 9.500; P-value = 0.147 (Table: P-value > 0.10); Critical value: χ² = 16.812; Fail to reject H0. There is not sufficient evidence to support the claim that births do not occur on the seven different days of the week with equal frequency.
Day        Sun     Mon     Tue     Wed     Thu     Fri     Sat
Expected   57.14   57.14   57.14   57.14   57.14   57.14   57.14
Observed   53      52      66      72      57      57      43
\[ \chi^2 = \frac{(53 - 57.14)^2}{400/7} + \frac{(52 - 57.14)^2}{400/7} + \cdots + \frac{(57 - 57.14)^2}{400/7} + \frac{(43 - 57.14)^2}{400/7} = 9.500 \quad (df = 6) \]
XLSTAT Output for Exercise 17
Chi-square test:
Chi-square (Observed value)   9.500
Chi-square (Critical value)   16.812
DF                            6
p-value                       0.147
alpha                         0.010
18. H0: pSun = pMon = pTue = pWed = pThu = pFri = pSat = 1/7; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 10.760; P-value = 0.096 (Table: P-value > 0.05); Critical value: χ² = 16.812; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that discharges occur on the seven different days of the week with equal frequency.
Day        Sun     Mon     Tue     Wed     Thu     Fri     Sat
Expected   57.14   57.14   57.14   57.14   57.14   57.14   57.14
Observed   65      50      47      48      73      64      53
\[ \chi^2 = \frac{(65 - 57.14)^2}{400/7} + \frac{(50 - 57.14)^2}{400/7} + \cdots + \frac{(64 - 57.14)^2}{400/7} + \frac{(53 - 57.14)^2}{400/7} = 10.760 \quad (df = 6) \]
XLSTAT Output for Exercise 18
Chi-square test:
Chi-square (Observed value)   10.760
Chi-square (Critical value)   16.812
DF                            6
p-value                       0.096
alpha                         0.010
19. H0: pred = 0.13, porange = 0.20, pyellow = 0.14, pbrown = 0.13, pblue = 0.24, pgreen = 0.16; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 5.714; P-value = 0.335 (Table: P-value > 0.10); Critical value: χ² = 11.071; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the color distribution is as claimed.
Color      Red     Orange   Yellow   Brown   Blue   Green
Expected   44.85   69.0     48.3     44.85   82.8   55.2
Observed   45      82       41       36      88     53
\[ \chi^2 = \frac{(45 - 44.85)^2}{0.13 \cdot 345} + \frac{(82 - 69.0)^2}{0.20 \cdot 345} + \frac{(41 - 48.3)^2}{0.14 \cdot 345} + \frac{(36 - 44.85)^2}{0.13 \cdot 345} + \frac{(88 - 82.8)^2}{0.24 \cdot 345} + \frac{(53 - 55.2)^2}{0.16 \cdot 345} = 5.714 \]
XLSTAT Output for Exercise 19
Chi-square test:
Chi-square (Observed value)   5.714
Chi-square (Critical value)   11.070
DF                            5
p-value                       0.335
alpha                         0.050
20. H0: p0 = p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 11.400; P-value = 0.249 (Table: P-value > 0.10); Critical value: χ² = 16.919; Fail to reject H0. There is not sufficient evidence to support the claim that the sample is from a population of weights in which the last digits do not occur with the same frequency. That is, the last digits appear to occur with approximately the same frequencies. The results suggest that the weights were measured.
Digit      0    1    2    3    4    5    6    7    8    9
Expected   30   30   30   30   30   30   30   30   30   30
Observed   20   31   30   32   31   39   29   27   39   22
\[ \chi^2 = \frac{(20 - 30)^2}{30} + \frac{(31 - 30)^2}{30} + \cdots + \frac{(39 - 30)^2}{30} + \frac{(22 - 30)^2}{30} = 11.400 \quad (df = 9) \]
XLSTAT Output for Exercise 20
Chi-square test:
Chi-square (Observed value)   11.400
Chi-square (Critical value)   16.919
DF                            9
p-value                       0.249
alpha                         0.050
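Exercises 19 and 20 list every observed frequency, so the goodness-of-fit statistic can be recomputed in full. The sketch below uses the observed counts and claimed proportions from those two exercises; `chi_square_gof` is an illustrative name, not a function from the text.

```python
def chi_square_gof(observed, proportions):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E with E = n * p for each category."""
    n = sum(observed)
    expected = [p * n for p in proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Exercise 19: colors with the claimed proportions
colors_observed = [45, 82, 41, 36, 88, 53]
colors_claimed = [0.13, 0.20, 0.14, 0.13, 0.24, 0.16]
stat19, expected19 = chi_square_gof(colors_observed, colors_claimed)

# Exercise 20: last digits of weights, all ten digits claimed equally likely
digits_observed = [20, 31, 30, 32, 31, 39, 29, 27, 39, 22]
stat20, expected20 = chi_square_gof(digits_observed, [0.1] * 10)

print(round(stat19, 3), round(stat20, 3))
```

The degrees of freedom are the number of categories minus 1 (5 and 9 here), and the statistic is compared with the right-tail critical value for that df.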
21. H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, …, p7 = 0.058, p8 = 0.051, p9 = 0.046; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 3650.251; P-value = 0.000 (Table: P-value < 0.005); Critical value: χ² = 20.090; Reject H0. There is sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford’s law. It does appear that the checks are the result of fraud (although the results cannot confirm that fraud is the cause of the discrepancy between the observed results and the expected results).
\[ \chi^2 = \frac{(0 - 235.98)^2}{0.301 \cdot 784} + \frac{(15 - 137.98)^2}{0.176 \cdot 784} + \cdots + \frac{(23 - 39.98)^2}{0.051 \cdot 784} + \frac{(0 - 36.06)^2}{0.046 \cdot 784} = 3650.251 \quad (df = 8) \]
XLSTAT Output for Exercise 21
Chi-square test:
Chi-square (Observed value)   3650.251
Chi-square (Critical value)   20.090
DF                            8
p-value                       <0.0001
alpha                         0.010
22. H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, …, p7 = 0.058, p8 = 0.051, p9 = 0.046; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 10.327; P-value = 0.243 (Table: P-value > 0.01); Critical value: χ² = 20.090; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford’s law. The author’s check amounts appear to be legitimate. He appears to be an honest person who is a true credit to society.
\[ \chi^2 = \frac{(102 - 90.3)^2}{0.301 \cdot 300} + \frac{(45 - 52.8)^2}{0.176 \cdot 300} + \cdots + \frac{(18 - 15.3)^2}{0.051 \cdot 300} + \frac{(12 - 13.8)^2}{0.046 \cdot 300} = 10.327 \quad (df = 8) \]
XLSTAT Output for Exercise 22
Chi-square test:
Chi-square (Observed value)   10.327
Chi-square (Critical value)   20.090
DF                            8
p-value                       0.243
alpha                         0.010
23. H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, …, p7 = 0.058, p8 = 0.051, p9 = 0.046; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 1.762; P-value = 0.988 (Table: P-value > 0.10); Critical value: χ² = 15.507; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford’s law. The tax entries do appear to be legitimate.
\[ \chi^2 = \frac{(152 - 153.8)^2}{0.301 \cdot 511} + \frac{(89 - 89.9)^2}{0.176 \cdot 511} + \cdots + \frac{(25 - 26.1)^2}{0.051 \cdot 511} + \frac{(27 - 23.5)^2}{0.046 \cdot 511} = 1.762 \quad (df = 8) \]
XLSTAT Output for Exercise 23
Chi-square test:
Chi-square (Observed value)   1.762
Chi-square (Critical value)   15.507
DF                            8
p-value                       0.987
alpha                         0.050
24. H0: p1 = 0.301, p2 = 0.176, p3 = 0.125, …, p7 = 0.058, p8 = 0.051, p9 = 0.046; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 45.618; P-value = 0.000 (Table: P-value < 0.005); Critical value: χ² = 15.507; Reject H0. There is sufficient evidence to warrant rejection of the claim that the leading digits are from a population with a distribution that conforms to Benford’s law.
\[ \chi^2 = \frac{(112 - 110.467)^2}{0.301 \cdot 367} + \frac{(28 - 64.592)^2}{0.176 \cdot 367} + \cdots + \frac{(11 - 18.717)^2}{0.051 \cdot 367} + \frac{(11 - 17.616)^2}{0.046 \cdot 367} = 45.618 \quad (df = 8) \]
XLSTAT Output for Exercise 24
Chi-square test:
Chi-square (Observed value)   45.618
Chi-square (Critical value)   15.507
DF                            8
p-value                       <0.0001
alpha                         0.050
25. H0: Heights selected from a normal distribution. H1: Heights not selected from a normal distribution.
   Height (cm)               Less than 155.45        155.45–162.05           162.05–168.65           Greater than 168.65
a. Frequency                 26                      46                      49                      26
b. Probability (Excel)       0.2023                  0.3171                  0.3046                  0.1761
   Probability (Table)       0.2033                  0.3166                  0.3039                  0.1762
c. Expected (Excel)          0.2023(147) = 29.7381   0.3171(147) = 46.6137   0.3046(147) = 44.7762   0.1761(147) = 25.8867
   Expected (Table)          0.2033(147) = 29.8851   0.3166(147) = 46.5402   0.3039(147) = 44.6733   0.1762(147) = 25.9014
d. Test statistic: χ² = 0.877 (Table: 0.831); P-value = 0.831 (Table: P-value > 0.10); Critical value: χ² = 11.345; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that heights were randomly selected from a normally distributed population. The test suggests that we cannot rule out the possibility that the data are from a normally distributed population.
\[ \chi^2 = \frac{(26 - 29.7381)^2}{29.7381} + \frac{(46 - 46.6137)^2}{46.6137} + \frac{(49 - 44.7762)^2}{44.7762} + \frac{(26 - 25.8867)^2}{25.8867} = 0.877 \quad (df = 3) \]
XLSTAT Output for Exercise 25
Chi-square test:
Chi-square (Observed value)   0.877
Chi-square (Critical value)   11.345
DF                            3
p-value                       0.831
alpha                         0.010
26. H0: p0 = p1 = p2 = p3 = p4 = p5 = p6 = p7 = p8 = p9 = 0.1; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: χ² = 2940.582; P-value = 0.000 (Table: P-value < 0.005); Critical value: χ² = 16.919; There is sufficient evidence to warrant rejection of the claim that the last digits of the reported weights occur with about the same frequency, so it appears that the weights were reported and not measured. Also, examination of the frequencies shows that the frequency of 0 is 1069, which is far larger than any of the other frequencies, so it appears that many of the reported weights were rounded to a multiple of 10. Examination of the measured weights of females shows that they are all rounded to one decimal place, and it is highly unlikely that people would use that precision when reporting weights. (TI data: Test statistic is χ² = 235.656, critical value is χ² = 16.919, and the P-value is 0.000.)
\[ \chi^2 = \frac{(1069 - 297.1)^2}{0.1 \cdot 2971} + \frac{(78 - 297.1)^2}{0.1 \cdot 2971} + \frac{(189 - 297.1)^2}{0.1 \cdot 2971} + \frac{(155 - 297.1)^2}{0.1 \cdot 2971} + \frac{(165 - 297.1)^2}{0.1 \cdot 2971} + \frac{(624 - 297.1)^2}{0.1 \cdot 2971} + \frac{(160 - 297.1)^2}{0.1 \cdot 2971} + \frac{(157 - 297.1)^2}{0.1 \cdot 2971} + \frac{(261 - 297.1)^2}{0.1 \cdot 2971} + \frac{(113 - 297.1)^2}{0.1 \cdot 2971} = 2940.58 \]
XLSTAT Output for Exercise 26
Chi-square test:
Chi-square (Observed value)   2940.582
Chi-square (Critical value)   16.919
DF                            9
p-value                       <0.0001
alpha                         0.050
Section 11-2: Contingency Tables
1. 2.
(123 131)(123 52) 138.906 123 131 52 14 H0: Whether a dog makes a correct identification about malaria is independent of whether malaria is present. E
H1: Whether a dog makes a correct identification and whether malaria is present are dependent.
Copyright © 2022 Pearson Education, Inc.
236 3.
4.
5.
Chapter 11: Goodness-of-Fit and Contingency Tables a. Test statistic: 19.490; P-value 0.000; Reject thenull hypothesis of independence between whether the dog is correct and whether malaria is present. b. No. It is possible that thedogs are wrong significantly more than they are correct. However, with thegiven data, dogs were correct 70.3% of thetime when malaria was present, and they were correct 90.3% of thetime when malaria was not present, so they do appear to be effective in their identifications. The test is right-tailed. thetest statistic is based on differences between observed frequencies and thefrequencies expected with theassumption of independence between therow and column variables. Only largevalues of thetest statistic correspond to substantial differences between theobserved and expected values, andsuch large values are located in theright tail of thedistribution. 2
5. H0: Whether a subject lies is independent of the polygraph indication. H1: Whether a subject lies depends on the polygraph indication. Test statistic: $\chi^2 = 25.571$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim that whether a subject lies is independent of the polygraph test indication. The results suggest that polygraphs are effective in distinguishing between truths and lies, but there are many false positives and false negatives, so they are not highly reliable.

$$\chi^2 = \frac{(15-27.3)^2}{27.3} + \frac{(42-29.7)^2}{29.7} + \frac{(32-19.7)^2}{19.7} + \frac{(9-21.3)^2}{21.3} = 25.571$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 5   Exercise 6
Chi-square (Observed value)    25.571       5.191
Chi-square (Critical value)    3.841        6.635
DF                             1            1
p-value                        <0.0001      0.023
alpha                          0.050        0.010

6. H0: Response to the question about ghosts is independent of gender. H1: Response to the question about ghosts depends on gender. Test statistic: $\chi^2 = 5.191$; P-value = 0.023 (Table: P-value > 0.01); Critical value: $\chi^2 = 6.635$; df = (2 - 1)(2 - 1) = 1; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that response to the question about ghosts is independent of gender. Yes, if the significance level is changed to 0.05, the null hypothesis of independence is rejected so that gender and response appear to be dependent.

$$\chi^2 = \frac{(138-157.5)^2}{157.5} + \frac{(724-704.5)^2}{704.5} + \frac{(228-208.5)^2}{208.5} + \frac{(913-932.5)^2}{932.5} = 5.191$$

7. H0: Texting while driving is independent of driving when drinking alcohol. H1: Texting while driving depends on driving when drinking alcohol. Test statistic: $\chi^2 = 576.224$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between texting while driving and driving when drinking alcohol. Those two risky behaviors appear to be somehow related.

$$\chi^2 = \frac{(731-394.7)^2}{394.7} + \frac{(3054-3390.3)^2}{3390.3} + \frac{(156-492.3)^2}{492.3} + \frac{(4564-4227.7)^2}{4227.7} = 576.224$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 7   Exercise 8
Chi-square (Observed value)    576.224      49.330
Chi-square (Critical value)    3.841        3.841
DF                             1            1
p-value                        <0.0001      <0.0001
alpha                          0.050        0.050

8. H0: Correct identification is independent of whether the evaluator is an expert or a novice. H1: Correct identification depends on whether the evaluator is an expert or a novice. Test statistic: $\chi^2 = 49.330$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between whether the subject is a novice or expert and whether the fingerprint identification was correct or wrong. The experts were correct 92.1% of the time, compared to 74.5% for the novices, so the experts appear to be substantially better than the novices.

$$\chi^2 = \frac{(331-370)^2}{370} + \frac{(113-74)^2}{74} + \frac{(409-370)^2}{370} + \frac{(35-74)^2}{74} = 49.330$$

9. H0: Whether students spent or kept the money is independent of the form of the $1. H1: Whether students spent or kept the money depends on the form of the $1. Test statistic: $\chi^2 = 12.162$; P-value = 0.001 (Table: P-value < 0.005); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim that whether students purchased gum or kept the money is independent of whether they were given four quarters or a $1 bill. It appears that there is a denomination effect.

$$\chi^2 = \frac{(27-18.8)^2}{18.8} + \frac{(16-24.2)^2}{24.2} + \frac{(12-20.2)^2}{20.2} + \frac{(34-25.8)^2}{25.8} = 12.162$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 9   Exercise 10
Chi-square (Observed value)    12.162       0.096
Chi-square (Critical value)    3.841        3.841
DF                             1            1
p-value                        0.000        0.757
alpha                          0.050        0.050

10. H0: Winning an overtime game is independent of playing under the old rule or the new rule. H1: Winning an overtime game depends on playing under the old rule or the new rule. Test statistic: $\chi^2 = 0.096$; P-value = 0.757 (Table: P-value > 0.10); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim of independence between winning an overtime game and whether playing under the old rule or the new rule. It appears that the change in the rule has no substantial effect.

$$\chi^2 = \frac{(252-250.5)^2}{250.5} + \frac{(59-60.5)^2}{60.5} + \frac{(208-209.5)^2}{209.5} + \frac{(52-50.5)^2}{50.5} = 0.096$$
11. H0: Gender is independent of success in challenging referee calls. H1: Success in challenging referee calls depends on gender. Test statistic: $\chi^2 = 6.343$; P-value = 0.012 (Table: P-value < 0.05); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between gender and success in challenging referee calls. However, with success rates of 29.1% for men and 26.7% for women, the difference does not appear to have practical significance.

$$\chi^2 = \frac{(1757-1704.5)^2}{1704.5} + \frac{(4279-4331.5)^2}{4331.5} + \frac{(887-939.5)^2}{939.5} + \frac{(2440-2387.5)^2}{2387.5} = 6.343$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 11   Exercise 12
Chi-square (Observed value)    6.343         86.481
Chi-square (Critical value)    3.841         6.635
DF                             1             1
p-value                        0.012         <0.0001
alpha                          0.050         0.010

12. H0: Deaths on a shift are independent of whether Gilbert was working. H1: Deaths on a shift depend on whether Gilbert was working. Test statistic: $\chi^2 = 86.481$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 6.635$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim that deaths on shifts are independent of whether Gilbert was working. The results favor the guilt of Gilbert.

$$\chi^2 = \frac{(40-11.6)^2}{11.6} + \frac{(217-245.4)^2}{245.4} + \frac{(34-62.4)^2}{62.4} + \frac{(1350-1321.6)^2}{1321.6} = 86.481$$

13. H0: Eye color is independent of gender. H1: Eye color depends on gender. Test statistic: $\chi^2 = 16.091$; P-value = 0.001 (Table: P-value < 0.005); Critical value: $\chi^2 = 11.345$; df = (2 - 1)(4 - 1) = 3; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between gender and eye color. However, because the sample data were reported, it is possible that the sample is not representative of the population.

$$\chi^2 = \frac{(359-330.7)^2}{330.7} + \frac{(290-291.2)^2}{291.2} + \frac{(110-139.7)^2}{139.7} + \frac{(160-157.4)^2}{157.4} + \frac{(370-398.3)^2}{398.3} + \frac{(352-350.8)^2}{350.8} + \frac{(198-168.3)^2}{168.3} + \frac{(187-189.6)^2}{189.6} = 16.091$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 13   Exercise 14
Chi-square (Observed value)    16.091        1.358
Chi-square (Critical value)    11.345        7.815
DF                             3             3
p-value                        0.001         0.715
alpha                          0.010         0.050
14. H0: Amount of smoking is independent of seat belt use. H1: Amount of smoking depends on seat belt use. Test statistic: $\chi^2 = 1.358$; P-value = 0.715 (Table: P-value > 0.10); Critical value ($\alpha = 0.05$): $\chi^2 = 7.815$; df = (2 - 1)(4 - 1) = 3; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the amount of smoking is independent of seat belt use. The theory is not supported by the given data.

$$\chi^2 = \frac{(175-171.5)^2}{171.5} + \cdots + \frac{(6-7.9)^2}{7.9} + \frac{(149-152.5)^2}{152.5} + \cdots + \frac{(9-7.1)^2}{7.1} = 1.358$$

15. H0: Getting a cold is independent of treatment. H1: Getting a cold depends on treatment. Test statistic: $\chi^2 = 2.925$; P-value = 0.232 (Table: P-value > 0.10); Critical value: $\chi^2 = 5.991$; df = (2 - 1)(3 - 1) = 2; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that getting a cold is independent of the treatment group. The results suggest that echinacea is not effective for preventing colds.

$$\chi^2 = \frac{(88-88.6)^2}{88.6} + \frac{(48-44.7)^2}{44.7} + \frac{(42-44.7)^2}{44.7} + \frac{(15-14.4)^2}{14.4} + \frac{(4-7.3)^2}{7.3} + \frac{(10-7.3)^2}{7.3} = 2.925$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 15   Exercise 16
Chi-square (Observed value)    2.925         9.971
Chi-square (Critical value)    5.991         9.488
DF                             2             4
p-value                        0.232         0.041
alpha                          0.050         0.050

16. H0: Injuries are independent of helmet color. H1: Injuries depend on helmet color. Test statistic: $\chi^2 = 9.971$; P-value = 0.041 (Table: P-value < 0.05); Critical value ($\alpha = 0.05$): $\chi^2 = 9.488$; df = (2 - 1)(5 - 1) = 4; Reject H0. There is sufficient evidence to warrant rejection of the claim that injuries are independent of helmet color. It appears that motorcycle drivers should use yellow or orange helmets.

$$\chi^2 = \frac{(491-509.5)^2}{509.5} + \cdots + \frac{(55-58.6)^2}{58.6} + \frac{(213-194.5)^2}{194.5} + \cdots + \frac{(26-22.4)^2}{22.4} = 9.971$$

17. H0: Cooperation of the subject is independent of age category. H1: Cooperation of the subject depends on age category. Test statistic: $\chi^2 = 20.271$; P-value = 0.0011 (Table: P-value < 0.005); Critical value: $\chi^2 = 15.086$; df = (2 - 1)(6 - 1) = 5; Reject H0. There is sufficient evidence to warrant rejection of the claim that cooperation of the subject is independent of the age category. The age group of 60 and over appears to be particularly uncooperative.

$$\chi^2 = \frac{(73-73.1)^2}{73.1} + \cdots + \frac{(202-218.5)^2}{218.5} + \frac{(11-10.9)^2}{10.9} + \cdots + \frac{(49-32.5)^2}{32.5} = 20.271$$
XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 17   Exercise 18
Chi-square (Observed value)    20.271        784.647
Chi-square (Critical value)    15.086        11.345
DF                             5             3
p-value                        0.001         <0.0001
alpha                          0.010         0.010

18. H0: Handedness of offspring is independent of parental handedness. H1: Handedness of offspring depends on parental handedness. Test statistic: $\chi^2 = 784.647$; P-value = 0.000 (Table: P-value < 0.005); Critical value: $\chi^2 = 11.345$; df = (4 - 1)(2 - 1) = 3; Reject H0. There is sufficient evidence to warrant rejection of the claim that handedness of offspring is independent of parental handedness. It appears that handedness of the parents has an effect on handedness of the offspring, so handedness appears to be an inherited trait.

$$\chi^2 = \frac{(5360-6067.9)^2}{6067.9} + \frac{(50{,}928-50{,}220.1)^2}{50{,}220.1} + \frac{(767-337.6)^2}{337.6} + \frac{(2736-3125.4)^2}{3125.4} + \frac{(741-475.2)^2}{475.2} + \frac{(3667-3932.8)^2}{3932.8} + \frac{(94-41.3)^2}{41.3} + \frac{(289-341.7)^2}{341.7} = 784.647$$
19. H0: State is independent of whether a car has front and rear license plates. H1: State depends on whether a car has front and rear license plates. Test statistic: $\chi^2 = 50.446$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 5.991$; df = (2 - 1)(3 - 1) = 2; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between the state and whether a car has front and rear license plates. It does not appear that the license plate laws are followed at the same rates in the three states.

$$\chi^2 = \frac{(35-34.6)^2}{34.6} + \cdots + \frac{(9-33.8)^2}{33.8} + \frac{(528-528.4)^2}{528.4} + \cdots + \frac{(541-516.2)^2}{516.2} = 50.446$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 19   Exercise 20
Chi-square (Observed value)    50.446        4.737
Chi-square (Critical value)    5.991         6.251
DF                             2             3
p-value                        <0.0001       0.192
alpha                          0.050         0.100

20. H0: Home/visitor wins are independent of the sport. H1: Home/visitor wins depend on the sport. Test statistic: $\chi^2 = 4.737$; P-value = 0.192 (Table: P-value > 0.10); Critical value: $\chi^2 = 6.251$; df = (2 - 1)(4 - 1) = 3; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that home/visitor wins are independent of the sport. Based on the given sample data, baseball teams are not effective in modifying field dimensions for their own players.

$$\chi^2 = \frac{(127-116.0)^2}{116.0} + \cdots + \frac{(57-58.0)^2}{58.0} + \frac{(71-82.0)^2}{82.0} + \cdots + \frac{(42-41.0)^2}{41.0} = 4.737$$
21. $\chi^2 = 3.197517514$ and $z = 1.788160371$, so $z^2 = \chi^2$. The critical values are $\chi^2 = 3.841$ and $z = \pm 1.96$ (or $\pm 1.959963986$ from technology), so $z^2 = \chi^2$.

22. Without Yates's correction, the test statistic is $\chi^2 = 12.162$. With Yates's correction, the test statistic is $\chi^2 = 10.717$. Yates's correction decreases the test statistic so that sample data must be more extreme in order to reject the null hypothesis of independence.

Without Yates's correction:

$$\chi^2 = \frac{(27-18.84)^2}{18.84} + \frac{(16-24.16)^2}{24.16} + \frac{(12-20.16)^2}{20.16} + \frac{(34-25.84)^2}{25.84} = 12.162$$

With Yates's correction:

$$\chi^2 = \frac{(|27-18.84|-0.5)^2}{18.84} + \frac{(|16-24.16|-0.5)^2}{24.16} + \frac{(|12-20.16|-0.5)^2}{20.16} + \frac{(|34-25.84|-0.5)^2}{25.84} = 10.717$$

XLSTAT Output: Test of independence between the rows and the columns
                               Without Yates's (Chi-square)   With Yates's (Chi-square with Yates' continuity correction)
Chi-square (Observed value)    12.162                         10.717
Chi-square (Critical value)    3.841                          3.841
DF                             1                              1
p-value                        0.000                          0.001 (Two-tailed)
alpha                          0.050                          0.050
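The two statistics in Exercise 22 differ only in that Yates's correction subtracts 0.5 from each absolute difference before squaring. The sketch below recomputes both in plain Python. The function name is hypothetical, and the table layout is an assumption chosen so that the computed expected frequencies match the values 18.84, 24.16, 20.16, and 25.84 shown above.

```python
def chi_square_2x2(table, yates=False):
    """Chi-square statistic for a 2 x 2 table, optionally with Yates's correction."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            diff = abs(observed - expected)
            if yates:
                # Continuity correction: shrink |O - E| by 0.5 before squaring
                diff -= 0.5
            chi2 += diff ** 2 / expected
    return chi2


# Assumed layout of the observed frequencies from Exercise 22
gum_table = [[27, 16], [12, 34]]
plain = chi_square_2x2(gum_table)
corrected = chi_square_2x2(gum_table, yates=True)
print(round(plain, 3), round(corrected, 3))
```

This simple version assumes every |O - E| exceeds 0.5, which holds for these data; a general-purpose implementation would need to handle smaller differences.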
Quick Quiz

1. H0: $p_0 = p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = p_8 = p_9 = 0.1$; H1: At least one of the probabilities is different from the others.

2. $O = 45$ and $E = 0.1 \cdot 500 = 50$

3. right-tailed

4. df = n - 1 = 10 - 1 = 9

5. There is not sufficient evidence to warrant rejection of the null hypothesis that the digits are all equally likely. Although we cannot conclude that the sample data support the claim that the digits are equally likely (because we can't support a null hypothesis), it does appear that Statdisk generates the digits so that they are equally likely.

6. H0: Surviving the sinking is independent of whether the person is a man, woman, boy, or girl. H1: Surviving the sinking and whether the person is a man, woman, boy, or girl are somehow related.

7. chi-square distribution

8. right-tailed

9. df = (r - 1)(c - 1) = (2 - 1)(4 - 1) = 3

10. There is sufficient evidence to warrant rejection of the claim that surviving the sinking is independent of whether the person is a man, woman, boy, or girl. Most of the women survived, 45% of the boys survived, and most girls survived, but only about 20% of the men survived, so it appears that the rule was followed quite well.
Review Exercises

1. H0: $p_{Jan} = p_{Feb} = p_{Mar} = p_{Apr} = p_{May} = p_{Jun} = p_{Jul} = p_{Aug} = p_{Sep} = p_{Oct} = p_{Nov} = p_{Dec} = 1/12$; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: $\chi^2 = 186.332$; P-value = 0.0000 (Table: P-value < 0.005); Critical value: $\chi^2 = 24.725$; Reject H0. There is sufficient evidence to warrant rejection of the claim that weather-related deaths occur in the different months with the same frequency. The warmer months appear to have disproportionately more weather-related deaths, and that is likely due to the fact that vacations and outdoor activities are much greater during those months.

$$\chi^2 = \frac{(61-44.17)^2}{530/12} + \frac{(14-44.17)^2}{530/12} + \cdots + \frac{(96-44.17)^2}{530/12} + \frac{(16-44.17)^2}{530/12} = 186.332 \quad (\text{df} = 11)$$

XLSTAT Output: Chi-square test
                               Exercise 1   Exercise 2
Chi-square (Observed value)    186.332      396.014
Chi-square (Critical value)    24.725       15.507
DF                             11           8
p-value                        <0.0001      <0.0001
alpha                          0.010        0.050

2. H0: $p_1 = 0.301$, $p_2 = 0.176$, $p_3 = 0.125$, … , $p_7 = 0.058$, $p_8 = 0.051$, $p_9 = 0.046$; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: $\chi^2 = 396.014$; P-value = 0.000 (Table: P-value < 0.005); Critical value ($\alpha = 0.05$): $\chi^2 = 15.507$; Reject H0. There is sufficient evidence to warrant rejection of the claim that the observed frequencies fit the distribution from Benford's law. It appears that the data are from matches in which cheating may have occurred.

$$\chi^2 = \frac{(3679-3257.1)^2}{0.301 \cdot 10{,}821} + \frac{(1515-1904.5)^2}{0.176 \cdot 10{,}821} + \cdots + \frac{(743-551.9)^2}{0.051 \cdot 10{,}821} + \frac{(649-497.8)^2}{0.046 \cdot 10{,}821} = 396.014 \quad (\text{df} = 8)$$

3. H0: Success is independent of type of treatment. H1: Success depends on type of treatment. Test statistic: $\chi^2 = 9.750$; P-value = 0.002 (Table: P-value < 0.005); Critical value: $\chi^2 = 6.635$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim that success is independent of the type of treatment. The results suggest that the surgery treatment is better.

$$\chi^2 = \frac{(60-67.6)^2}{67.6} + \frac{(23-15.4)^2}{15.4} + \frac{(67-59.4)^2}{59.4} + \frac{(6-13.6)^2}{13.6} = 9.750$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 3   Exercise 4
Chi-square (Observed value)    9.750        58.393
Chi-square (Critical value)    6.635        7.815
DF                             1            3
p-value                        0.002        <0.0001
alpha                          0.010        0.050
4. H0: Success is independent of type of treatment. H1: Success depends on type of treatment. Test statistic: $\chi^2 = 58.393$; P-value = 0.000 (Table: P-value < 0.005); Critical value: $\chi^2 = 7.815$; df = (4 - 1)(2 - 1) = 3; Reject H0. There is sufficient evidence to warrant rejection of the claim that success is independent of the treatment. Although the results of this test do not tell us which treatment is best, we can see from the table that the success rates of 81.8%, 44.6%, 95.9%, and 77.3% suggest that the best treatment is to use a non-weight-bearing cast for 6 weeks. These results suggest that the increasing use of surgery is a treatment strategy that is not supported by the evidence.

$$\chi^2 = \frac{(54-47.5)^2}{47.5} + \frac{(12-18.5)^2}{18.5} + \frac{(41-66.2)^2}{66.2} + \frac{(51-25.8)^2}{25.8} + \frac{(70-52.5)^2}{52.5} + \frac{(3-20.5)^2}{20.5} + \frac{(17-15.8)^2}{15.8} + \frac{(5-6.2)^2}{6.2} = 58.393$$

Cumulative Review Exercises

1. a. Here are two different key questions that are appropriate for the data: (1) Is there a correlation between the scores of the females and the scores of their brothers? (2) For the paired scores, is there a significant difference between the scores of the siblings?
   b. (1) Use the linear correlation coefficient, as in Section 10-1. (2) Use the methods for inferences about differences from matched pairs, as in Section 9-3.
   c. (1) Using correlation: r = -0.481; P-value = 0.412 (Table: > 0.05); critical values: $r = \pm 0.878$ (assuming a 0.05 significance level). There is not sufficient evidence to support a claim of a correlation. (2) The differences are 6, 7, 5, 5, 3. Test statistic: t = -0.459; P-value = 0.670 (Table: P-value > 0.20); critical values ($\alpha = 0.05$): $t = \pm 2.776$; Fail to reject H0. There is not sufficient evidence to support a claim of a difference between the scores of the siblings.

$$t = \frac{\bar{d} - \mu_d}{s_d / \sqrt{n}} = \frac{-1.2 - 0}{5.848 / \sqrt{5}} = -0.459$$

XLSTAT Outputs for (1)
Correlation matrix:
             Brother   Sister
Brother      1         -0.481
Sister       -0.481    1

p-values (Pearson):
Variables    Sister    Brother
Sister       0         0.412
Brother      0.412     0

XLSTAT Output for (2): t-test for two paired samples / Two-tailed test
Difference              -1.200
t (Observed value)      -0.459
|t| (Critical value)    2.776
DF                      4
p-value (Two-tailed)    0.670
alpha                   0.050

2. a. Are the last digits from a population in which they all occur with the same frequency?
   b. Use the goodness-of-fit test, as in Section 11-1.
   c. Test statistic: $\chi^2 = 6.500$; P-value = 0.689 (Table: P-value > 0.10); Critical value ($\alpha = 0.05$): $\chi^2 = 16.919$; Fail to reject the null hypothesis that the last digits are from a population in which they all occur with the same frequency.

$$\chi^2 = \frac{(14-16.0)^2}{0.1 \cdot 160} + \frac{(22-16.0)^2}{0.1 \cdot 160} + \frac{(11-16.0)^2}{0.1 \cdot 160} + \frac{(12-16.0)^2}{0.1 \cdot 160} + \frac{(18-16.0)^2}{0.1 \cdot 160} + \frac{(20-16.0)^2}{0.1 \cdot 160} + \frac{(15-16.0)^2}{0.1 \cdot 160} + \frac{(16-16.0)^2}{0.1 \cdot 160} + \frac{(17-16.0)^2}{0.1 \cdot 160} + \frac{(15-16.0)^2}{0.1 \cdot 160} = 6.500$$
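The goodness-of-fit computation above follows the same pattern in every exercise of Section 11-1: each expected frequency is the claimed probability times the sample size, and the statistic sums (O - E)^2 / E. A minimal Python sketch of that arithmetic follows. The function and variable names are illustrative only, and the frequencies are copied from Cumulative Review Exercise 2 and from Exercise 26 of Section 11-1.

```python
def chi_square_goodness_of_fit(observed, probabilities):
    """Return (chi2, df) for observed counts and claimed category probabilities."""
    n = sum(observed)
    expected = [p * n for p in probabilities]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # df = (number of categories) - 1
    return chi2, len(observed) - 1


# Last-digit frequencies from Cumulative Review Exercise 2 (n = 160)
last_digit_counts = [14, 22, 11, 12, 18, 20, 15, 16, 17, 15]
chi2_digits, df_digits = chi_square_goodness_of_fit(last_digit_counts, [0.1] * 10)

# Last-digit frequencies of reported weights, Section 11-1 Exercise 26 (n = 2971)
weight_counts = [1069, 78, 189, 155, 165, 624, 160, 157, 261, 113]
chi2_weights, df_weights = chi_square_goodness_of_fit(weight_counts, [0.1] * 10)

print(round(chi2_digits, 3), df_digits)
print(round(chi2_weights, 3), df_weights)
```

The first result can be compared with the critical value 16.919 (fail to reject), and the second with the same critical value (reject), matching the conclusions stated in the two exercises.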
XLSTAT Output for Exercise 2: Chi-square test
Chi-square (Observed value)    6.500
Chi-square (Critical value)    16.919
DF                             9
p-value                        0.689
alpha                          0.050

XLSTAT Output for Exercise 3: t-test for two independent samples / Two-tailed test
Difference              -1.200
t (Observed value)      -0.536
|t| (Critical value)    2.492
DF                      5.586
p-value (Two-tailed)    0.613
alpha                   0.050
95% confidence interval on the difference between the means: (-6.782, 4.382)

3. a. Is there a significant difference between the distances run by males and the distances run by females?
   b. Use the test for a difference between the means of two independent populations, as in Section 9-2.
   c. H0: $\mu_1 = \mu_2$; H1: $\mu_1 \ne \mu_2$; Test statistic: t = -0.536; P-value = 0.613 (Table: P-value > 0.20). Critical values ($\alpha = 0.05$): $t = \pm 2.492$ (Table: $\pm 2.776$); Fail to reject the null hypothesis of no difference. There is not sufficient evidence to reject the claim of no significant difference between the distances run by males and females. (The same conclusion results from a 95% confidence interval: $-6.8 \text{ miles} < \mu_1 - \mu_2 < 4.4 \text{ miles}$.)

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(15.4 - 16.6) - 0}{\sqrt{\dfrac{4.561^2}{5} + \dfrac{2.074^2}{5}}} = -0.536$$

$$(\bar{x}_1 - \bar{x}_2) \pm t_{\alpha/2} \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = (15.4 - 16.6) \pm 2.776 \sqrt{\frac{4.561^2}{5} + \frac{2.074^2}{5}} \quad (\text{df} = 4)$$
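The test statistic and the non-integer degrees of freedom (5.586) in the XLSTAT output for Exercise 3 can be reproduced from the summary statistics alone. The sketch below assumes the unpooled (Welch) form of the two-sample t test, which is consistent with the fractional DF reported by XLSTAT. The function name is hypothetical, and the critical value 2.492 is taken from the output above rather than computed.

```python
import math


def welch_t(mean1, s1, n1, mean2, s2, n2):
    """Return (t, df, standard error) for two independent samples, variances not pooled."""
    v1 = s1 ** 2 / n1
    v2 = s2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


t_stat, df_welch, se = welch_t(15.4, 4.561, 5, 16.6, 2.074, 5)

# Critical value copied from the XLSTAT output (not computed here)
t_crit = 2.492
diff = 15.4 - 16.6
ci_lower = diff - t_crit * se
ci_upper = diff + t_crit * se
print(round(t_stat, 3), round(df_welch, 3), round(ci_lower, 1), round(ci_upper, 1))
```

Using the table value 2.776 with df = 4 instead, as in the second formula above, gives a wider interval but leads to the same conclusion because 0 remains inside it.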
4. a. Are the variables of workday and shift independent?
   b. Use the $\chi^2$ test for a contingency table, as in Section 11-2.
   c. Test statistic: $\chi^2 = 4.225$; P-value = 0.376 (Table: > 0.10). Critical value ($\alpha = 0.05$): $\chi^2 = 9.488$; df = (2 - 1)(5 - 1) = 4; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim of independence between shift and day of the week.

$$\chi^2 = \frac{(14-16.4)^2}{16.4} + \frac{(22-17.8)^2}{17.8} + \frac{(11-13)^2}{13} + \frac{(12-14)^2}{14} + \frac{(18-15.9)^2}{15.9} + \frac{(20-17.6)^2}{17.6} + \frac{(15-19.2)^2}{19.2} + \frac{(16-14)^2}{14} + \frac{(17-15)^2}{15} + \frac{(15-17.1)^2}{17.1} = 4.225$$

XLSTAT Output: Test of independence between the rows and the columns (Chi-square)
                               Exercise 4   Exercise 5
Chi-square (Observed value)    4.225        3.409
Chi-square (Critical value)    9.488        3.841
DF                             4            1
p-value                        0.376        0.065
alpha                          0.050        0.050
5. H0: Whether money was spent is independent of the form of 100 yuan. H1: Whether money was spent depends on the form of 100 yuan. Test statistic: $\chi^2 = 3.409$; P-value = 0.0648 (Table: P-value > 0.05); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the form of the 100-yuan gift is independent of whether the money was spent. There is not sufficient evidence to support the claim of a denomination effect. Women in China do not appear to be affected by whether 100 yuan are in the form of a single bill or several smaller bills.

$$\chi^2 = \frac{(60-64)^2}{64} + \frac{(15-11)^2}{11} + \frac{(68-64)^2}{64} + \frac{(7-11)^2}{11} = 3.409$$

6. a. $\frac{60+68}{150} = \frac{128}{150} = 0.853$
   b. $\frac{60+68}{150} + \frac{60+15}{150} - \frac{60}{150} = \frac{143}{150} = 0.953$
   c. $\frac{128}{150} \cdot \frac{127}{149} = \frac{8128}{11{,}175} = 0.727$, not $\frac{128}{150} \cdot \frac{128}{150} = 0.728$
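The three probabilities in Exercise 6 come straight from the cell counts 60, 15, 68, and 7 used in Exercise 5. A short sketch with exact fractions makes the addition rule in part (b) and the without-replacement adjustment in part (c) explicit. The row and column labels are not given in this solution, so the code refers only to positions in the table, and that layout is an assumption inferred from the expected frequencies (64 and 11) in Exercise 5.

```python
from fractions import Fraction

# Assumed layout: 60 and 68 share a column (total 128); 60 and 15 share a row (total 75)
table = [[60, 15], [68, 7]]
n = sum(sum(row) for row in table)          # 150
col1_total = table[0][0] + table[1][0]      # 128
row1_total = table[0][0] + table[0][1]      # 75

# (a) Probability of falling in the first column
p_a = Fraction(col1_total, n)

# (b) Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_b = Fraction(col1_total, n) + Fraction(row1_total, n) - Fraction(table[0][0], n)

# (c) Two selections without replacement, both from the first column
p_c = Fraction(col1_total, n) * Fraction(col1_total - 1, n - 1)

# For contrast: the same two selections made with replacement
p_c_with_replacement = Fraction(col1_total, n) ** 2

print(p_a, p_b, p_c, round(float(p_c), 3), round(float(p_c_with_replacement), 3))
```

Keeping the results as fractions shows that part (c) reduces to 8128/11,175, and the decimal forms make the small gap between sampling without and with replacement visible.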
Chapter 12: Analysis of Variance

Section 12-1: One-Way ANOVA

1. a. The chest compression amounts are categorized according to the one characteristic of vehicle size category.
   b. The terminology of analysis of variance refers to the method used to test for equality of the four population means. That method is based on two different estimates of a common population variance.

2. As we increase the number of individual tests of significance, we increase the risk of finding a difference by chance alone (instead of a real difference in the means). The risk of a type I error (finding a difference in one of the pairs when no such difference actually exists) is too high. The method of analysis of variance helps us avoid that particular pitfall (rejecting a true null hypothesis) by using one test for equality of several means, instead of several tests that each compare two means at a time.

3. The test statistic is F = 3.815, and the F distribution applies.

4. The P-value is 0.016. Because the P-value is less than the significance level of 0.05, we reject the null hypothesis of equal means. There is sufficient evidence to warrant rejection of the claim that small cars, midsize cars, large cars, and SUVs have the same mean chest compression in car crash tests. It appears that the four populations have means that are not all the same, but the ANOVA test does not enable us to identify which populations have means that are significantly different.
5. Test statistic: F = 0.570; P-value = 0.638; Fail to reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is not sufficient evidence to warrant rejection of the claim that the four vehicle size categories have the same mean force on the left femur in crash tests. Size of the car does not appear to have an effect on the force on the femur of the left leg in crash tests.

6. Test statistic: F = 15.5012; P-value = 0.000; Reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is sufficient evidence to warrant rejection of the claim that the four categories of giving candy are associated with the same mean amount of tip. It appears that tips are affected by gifts of candy.

7. Test statistic: F = 28.1666; P-value < 0.0001; Reject H0: $\mu_1 = \mu_2 = \mu_3$. There is sufficient evidence to warrant rejection of the claim that the three different types of Chips Ahoy cookies have the same mean number of chocolate chips.

8. Test statistic: F = 969.744; P-value < 0.0001; Reject H0: $\mu_1 = \mu_2 = \mu_3$. There is sufficient evidence to warrant rejection of the claim that the three samples are from populations with the same mean. It appears that exposure to tobacco smoke does affect the amount of nicotine absorbed by the body.
9. Test statistic: F = 9.4695; P-value = 0.001; Reject H0: $\mu_1 = \mu_2 = \mu_3$. There is sufficient evidence to warrant rejection of the claim that pages from books by Clancy, Rowling, and Tolstoy have the same mean Flesch Reading Ease score. It appears that at least one of the authors has a mean Flesch Reading Ease score that is different from the others. With means of 70.73, 80.75, and 66.15, it appears that Rowling is different by having the highest Flesch Reading Ease score, but ANOVA does not justify a conclusion that any particular mean is different from the others.

Excel Output: ANOVA
Source of Variation    SS          df    MS          F          P-value     F crit
Between Groups         1338.002    2     669.0011    9.469487   0.000562    3.284918
Within Groups          2331.387    33    70.64808
Total                  3669.389    35
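The MS and F columns of the Excel ANOVA table follow directly from the SS and df columns: each mean square is SS divided by df, and F is the ratio of the between-groups mean square to the within-groups mean square. The sketch below rebuilds those entries for Exercise 9. The function names are hypothetical. The P-value helper is a special case and not a general method: it uses the closed form of the F distribution's upper tail that holds only when the numerator df equals 2, as it does here.

```python
def anova_from_sums_of_squares(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from the SS and df columns of an ANOVA table."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


def f_upper_tail_numerator_df_2(f_stat, df_within):
    """Upper-tail P-value of an F statistic; valid ONLY when the numerator df is 2."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)


# SS and df values copied from the Excel output for Exercise 9
ms_between, ms_within, f_stat = anova_from_sums_of_squares(1338.002, 2, 2331.387, 33)
p_value = f_upper_tail_numerator_df_2(f_stat, 33)
print(round(ms_between, 4), round(ms_within, 5), round(f_stat, 4), round(p_value, 6))
```

For tables with a different numerator df, such as the six M&M colors or the four rides later in this section, the P-value would have to come from software or a table rather than from this shortcut.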
10. Test statistic: F = 2.6925; P-value = 0.083; Fail to reject H0: $\mu_1 = \mu_2 = \mu_3$. There is not sufficient evidence to warrant rejection of the claim that pages from books by Clancy, Rowling, and Tolstoy have the same mean number of characters per word. It does not appear that any of the authors use longer words.

Excel Output: ANOVA
Source of Variation    SS       df    MS          F          P-value     F crit
Between Groups         0.195    2     0.0975      2.692469   0.082571    3.284918
Within Groups          1.195    33    0.036212
Total                  1.39     35
11. Test statistic: F = 27.2488; P-value = 0.000; Reject H0: $\mu_1 = \mu_2 = \mu_3$. There is sufficient evidence to warrant rejection of the claim that the three different miles have the same mean time. These data suggest that the third mile appears to take longer, and a reasonable explanation is that the third lap has a hill.

Excel Output: ANOVA
Source of Variation    SS          df    MS          F          P-value     F crit
Between Groups         0.103444    2     0.051722    27.24878   3.45E-05    3.885294
Within Groups          0.022778    12    0.001898
Total                  0.126222    14
12. Test statistic: F = 22.9477; P-value = 0.000; Reject H0: $\mu_1 = \mu_2 = \mu_3$. There is sufficient evidence to warrant rejection of the claim that the three different states have the same mean arsenic content in brown rice. Even though Texas has the highest mean, the results from ANOVA do not allow us to conclude that any one specific population mean is different from the others, so we cannot conclude that brown rice from Texas poses the greatest health problem.

Excel Output: ANOVA
Source of Variation    SS          df    MS          F          P-value     F crit
Between Groups         31.65167    2     15.82583    22.94775   5.68E-07    3.284918
Within Groups          22.75833    33    0.689646
Total                  54.41       35
13. Test statistic: F = 0.0558; P-value = 0.946; Fail to reject H0: $\mu_1 = \mu_2 = \mu_3$. There is not sufficient evidence to warrant rejection of the claim that the three vehicle size categories (midsize, large, SUV) have the same mean head injury criterion (HIC) measurements in crash tests. For midsize, large, and SUV cars, size category of the car does not appear to have an effect on the HIC measurements.

Excel Output: ANOVA
Source of Variation    SS          df    MS          F          P-value     F crit
Between Groups         400.3889    2     200.1944    0.055833   0.945786    3.284918
Within Groups          118324.8    33    3585.601
Total                  118725.2    35
14. Test statistic: F = 0.7477; P-value = 0.588; Fail to reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5 = \mu_6$. There is not sufficient evidence to warrant rejection of the claim that the six different colors of M&M candies have the same mean weight. It does not seem feasible that the color would have much of an effect on the weight, so the result is as expected.

Excel Output: ANOVA
Source of Variation    SS          df     MS          F          P-value     F crit
Between Groups         0.005016    5      0.001003    0.747739   0.588218    2.240616
Within Groups          0.454818    339    0.001342
Total                  0.459834    344
15. Test statistic: F = 144.4462; P-value = 0.000; Reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is sufficient evidence to warrant rejection of the claim that the four rides have the same mean wait time. At least one of the rides has a mean 10 AM wait time that is different from the others.

Excel Output: ANOVA
Source of Variation    SS        df     MS          F          P-value     F crit
Between Groups         349251    3      116417      144.4462   2.09E-49    2.650677
Within Groups          157967    196    805.9541
Total                  507218    199
16. Test statistic: F = 44.9964; P-value = 0.000; Reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is sufficient evidence to warrant rejection of the claim that the four rides have the same mean wait time. At least one of the rides has a mean 5 PM wait time that is different from the others.

Excel Output: ANOVA
Source of Variation    SS          df     MS          F          P-value     F crit
Between Groups         84385.22    3      28128.41    44.99642   3.61E-22    2.650677
Within Groups          122524.6    196    625.1254
Total                  206909.8    199
17. The Tukey test results show different P-values, but they are not dramatically different. The Tukey results suggest the same conclusions as the Bonferroni test.

18. a. Test statistic: F = 6.1413; P-value: 0.0056; Reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is sufficient evidence to warrant rejection of the claim that the four treatment categories yield poplar trees with the same mean weight.
    b. The displayed Bonferroni results show that with a P-value of 0.039, there is a significant difference between the mean of the no treatment group (group 1) and the mean of the group treated with both fertilizer and irrigation (group 4). (Also, when comparing group 1 and group 2, there is no significant difference between means, and when comparing group 1 and group 3, there is no significant difference between means.)
    c. Test statistic: t = -4.007; P-value: 6(0.001018) = 0.00611; Reject the null hypothesis that the mean weight from the irrigation treatment group is equal to the mean from the group treated with both fertilizer and irrigation.

Section 12-2: Two-Way ANOVA

1. The measurements are categorized using two different factors of (1) femur side (left, right) and (2) vehicle size (small, midsize, large, SUV).

2. No. To use two individual tests of one-way analysis of variance is to totally ignore the very important feature of the possible effect from an interaction between femur side (left, right) and vehicle size category. If there is an interaction, it doesn't make sense to consider the effects of one factor without the other.

3. a. An interaction between two factors or variables occurs if the effect of one of the factors changes for different categories of the other factor.
   b. If there is an interaction effect, we should not proceed with individual tests for effects from the row factor and column factor. If there is an interaction, we should not consider the effects of one factor without considering the effects of the other factor.
   c. The lines are not too far from being parallel except for the mean for the left femur in small cars. That large difference does suggest that there is an interaction effect between femur side and size of vehicle.

4. Yes, the result is a balanced design because each cell has the same number (10) of values.
Copyright © 2022 Pearson Education, Inc.
5. For interaction, the test statistic is F = 4.4313 and the P-value is 0.0103, so there is sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. Because there appears to be an interaction between femur side (left, right) and vehicle size category, we should not proceed with a test for an effect from the femur side (left/right) and a test for an effect from vehicle size category. It appears that an interaction between the femur side and vehicle size category has an effect on the force measurements. (Remember, these results are based on fabricated data used in one of the cells, so this conclusion does not necessarily apply to real data.)
6. In XLSTAT, the interaction is identified as "Gender*ANSUR." For interaction, the test statistic is F = 0.085 and the P-value is 0.774, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. For the row variable of gender, the test statistic is F = 9.459 and the P-value is 0.006, so there does appear to be an effect from gender. For the column variable of ANSUR (ANSUR I or ANSUR II), the test statistic is F = 0.124 and the P-value is 0.728, so there does not appear to be an effect from ANSUR. It appears that weights are affected by gender.
7. For interaction, the test statistic is F = 3.6296 and the P-value is 0.0749, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. For the row variable of gender, the test statistic is F = 5.3519 and the P-value is 0.0343, so there does appear to be an effect from gender. For the column variable of handedness, the test statistic is F = 2.2407 and the P-value is 0.1539, so there does not appear to be an effect from handedness. It appears that the distance between pupils is affected by gender. For distances between pupils, we expect the following: (1) no interaction between gender and handedness; (2) no effect from handedness (right, left); (3) possible differences based on gender. The results are as expected.
8. For interaction, the test statistic is F = 41.38 and the P-value is 0.000, so there is sufficient evidence to conclude that there is an interaction effect. The ratings appear to be affected by an interaction between the use of a supplement and the amount of whey. Because there appears to be an interaction effect, we should not proceed with individual tests of the row factor (supplement) and the column factor (amount of whey).
9. For interaction, the test statistic is F = 2.1727 and the P-value is 0.1599, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. For the row variable of gender, the test statistic is F = 27.8569 and the P-value is 0.0001, so there does appear to be an effect from gender. For the column variable of handedness, the test statistic is F = 1.4991 and the P-value is 0.2385, so there does not appear to be an effect from handedness. It appears that sitting height is affected by gender. For sitting heights, it is reasonable not to expect an interaction between gender and handedness, it is reasonable not to expect that handedness (right, left) would affect sitting heights, and it is reasonable to expect that there could be differences based on gender, so the results are as expected.
Excel Output: ANOVA
Source of Variation   SS        df   MS        F          P-value    F crit
Row                   39427.2    1   39427.2   27.85686   7.5E-05    4.493998
Columns               2121.8     1   2121.8    1.499134   0.238527   4.493998
Interaction           3075.2     1   3075.2    2.172749   0.159878   4.493998
Within                22645.6   16   1415.35
Total                 67269.8   19
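The F statistics in the Excel output above can be checked by hand from the SS and df columns. The following is a minimal Python sketch, assuming only the values printed in that table; the dictionary names are illustrative and are not part of Excel's output.

```python
# Sums of squares and degrees of freedom copied from the Excel ANOVA output above.
ss = {"Row": 39427.2, "Columns": 2121.8, "Interaction": 3075.2, "Within": 22645.6}
df = {"Row": 1, "Columns": 1, "Interaction": 1, "Within": 16}

# Mean square = SS / df for every source of variation.
ms = {source: ss[source] / df[source] for source in ss}

# Each F statistic compares an effect's mean square with the within-cell mean square.
f_stats = {source: ms[source] / ms["Within"] for source in ("Row", "Columns", "Interaction")}

for source, f in f_stats.items():
    print(f"{source}: MS = {ms[source]:.2f}, F = {f:.4f}")
```

The P-values additionally require the F distribution with (1, 16) degrees of freedom, which this sketch leaves to Excel.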
10. For interaction, the test statistic is F = 0.3328 and the P-value is 0.5747, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. There does not appear to be an interaction between gender and smoking. For the row variable of gender, the test statistic is F = 1.3313 and the P-value is 0.2710, so there is not sufficient evidence to warrant rejection of the claim of no effect from gender. For the column variable of smoking, the test statistic is F = 1.1186 and the P-value is 0.3110, so there is not sufficient evidence to conclude that smoking has an effect on body temperatures. It appears that body temperatures are not affected by an interaction between gender and smoking, they are not affected by gender, and they are not affected by smoking.
Excel Output for Exercise 10: ANOVA
Source of Variation   SS       df   MS         F          P-value    F crit
Row                   0.36      1   0.36       1.331279   0.271041   4.747225
Columns               0.3025    1   0.3025     1.118644   0.311038   4.747225
Interaction           0.09      1   0.09       0.33282    0.574667   4.747225
Within                3.245    12   0.270417
Total                 3.9975   15
11. a. Test statistics and P-values do not change.
b. Test statistics and P-values do not change.
c. Test statistics and P-values do not change.
d. An outlier can dramatically affect and change test statistics and P-values.
Quick Quiz
1. H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$; Because the displayed P-value of 0.000 is small, reject H0. There is sufficient evidence to warrant rejection of the claim that the four samples have the same mean weight.
2. No, it appears that mean weights of Diet Coke and Diet Pepsi are lower than the mean weights of regular Coke and regular Pepsi, but the method of analysis of variance does not justify a conclusion that any particular means are significantly different from the others.
3. Right-tailed.
4. Test statistic: F = 503.06; Larger test statistics result in smaller P-values.
5. The four samples are categorized using only one factor: the type of cola (regular Coke, Diet Coke, regular Pepsi, Diet Pepsi).
6. One-way analysis of variance is used to test a null hypothesis that three or more samples are from populations with equal means.
7. With one-way analysis of variance, data from the different samples are categorized using only one factor, but with two-way analysis of variance, the sample data are categorized into different cells determined by two different factors.
8. Two-way analysis of variance includes a test for an interaction between gender and age bracket, but the two separate tests fail to include that test for an interaction.
9. a. Because the effect from an interaction has a P-value of 0.3973, conclude that there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction. Pulse rates do not appear to be affected by an interaction between gender and age bracket.
b. If there does appear to be an interaction, we should not proceed to test for an effect from the row factor of age bracket and we should not proceed to test for an effect from the column factor of gender. If there is an interaction, we should not consider the effect of either factor without also considering the effect of the other factor.
10. Pulse rates do not appear to be affected by an interaction between gender and age bracket, they do not appear to be affected by age bracket, but they do appear to be affected by gender.
Review Exercises
1. With test statistic F = 2.1436 and P-value = 0.096, fail to reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. Conclude that there is not sufficient evidence to warrant rejection of the claim that subjects in the four age brackets have the same mean cholesterol level. It appears that for the given age brackets, age does not have a significant effect on cholesterol.
2. Test statistic: F = 2.3831; P-value = 0.092; Fail to reject H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$. There is not sufficient evidence to warrant rejection of the claim that subjects in the four different age brackets have the same mean cholesterol level. It appears that for those four age brackets, age does not have an effect on LDL cholesterol.
Excel Output for Exercise 2: ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        6069.163    3   2023.054   2.383104   0.092336   2.975154
Within Groups         22071.8    26   848.9155
Total                 28140.97   29
3. For interaction, the test statistic is F = 0.8476 and the P-value = 0.6413, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. For the row variable of hospital, the test statistic is F = 0.3549 and the P-value = 0.5426, so there does not appear to be an effect from hospital. For the column variable of weekday/weekend, the test statistic is F = 0.6423 and the P-value = 0.5267, so there does not appear to be an effect from whether the day is a weekday or weekend day. It appears that birth weights are not affected by an interaction between hospital and weekday/weekend, they are not affected by the hospital, and they are not affected by whether the day is a weekday or weekend day.
4. For interaction, the test statistic is F = 1.0605 and the P-value = 0.3795, so there is not sufficient evidence to warrant rejection of the null hypothesis of no interaction effect. For the row variable of hospital, the test statistic is F = 0.6423 and the P-value = 0.5935, so there does not appear to be an effect from hospital. For the column variable of weekday/weekend, the test statistic is F = 0.4097 and the P-value = 0.5267, so there does not appear to be an effect from whether the day is a weekday or weekend day. It appears that birth weights are not affected by an interaction between hospital and weekday/weekend, they are not affected by the hospital, and they are not affected by whether the day is a weekday or weekend day.
Excel Output: ANOVA
Source of Variation   SS       df   MS         F          P-value    F crit
Sample                1.204     3   0.401333   0.642262   0.593478   2.90112
Columns               0.256     1   0.256      0.409682   0.526688   4.149097
Interaction           1.988     3   0.662667   1.060479   0.379536   2.90112
Within                19.996   32   0.624875
Total                 23.444   39
Cumulative Review Exercises
1. a. Presidents: $\bar{x} = 15.9$ years; Popes: $\bar{x} = 13.1$ years; Monarchs: $\bar{x} = 22.7$ years
b. Presidents: s = 9.9 years; Popes: s = 9.0 years; Monarchs: s = 18.6 years
c. Presidents: $s^2 = 97.4$ years$^2$; Popes: $s^2 = 80.3$ years$^2$; Monarchs: $s^2 = 346.1$ years$^2$
XLSTAT Output
Statistic                  Presidents   Popes    Monarchs
Nbr. of observations       39           39       39
Mean                       15.872       13.125   22.714
Variance (n-1)             97.430       80.288   346.066
Standard deviation (n-1)   9.871        8.960    18.603
d. Visual inspection of the data shows that among the monarchs, the values of 59 years and 63 years appear to be possible outliers, but they are not outliers according to the criterion of being above the third quartile Q3 by more than 1.5 times the interquartile range.
e. Ratio level of measurement
2. H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$; population 1 = presidents, population 2 = popes; Test statistic: t = 1.136; P-value = 0.2610 (Table: P-value > 0.20); Critical values ($\alpha = 0.05$): $t = \pm 2.006$ (Table: $t = \pm 2.069$); Fail to reject H0. There is not sufficient evidence to support the claim that there is a difference between the mean longevity times of presidents and popes.
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(15.872 - 13.125) - 0}{\sqrt{\dfrac{9.70689^2}{39} + \dfrac{8.960360^2}{24}}} = 1.136$$
XLSTAT Output
t-test for two independent samples / Two-tailed test: Difference 2.747; t (Observed value) 1.136; |t| (Critical value) 2.006; DF 52.468; p-value (Two-tailed) 0.261; alpha 0.050
3. Because the pattern of the points is reasonably close to a straight-line pattern, the longevity times of presidents do appear to be from a population with a normal distribution.
4. n = 39 > 30; 95% CI: $\bar{x} \pm t_{\alpha/2} \dfrac{s}{\sqrt{n}} = 15.9 \pm 2.024 \cdot \dfrac{9.9}{\sqrt{39}}$, or 12.7 years < $\mu$ < 19.1 years; We have 95% confidence that the limits of 12.7 years and 19.1 years contain the value of the population mean for longevity times of presidents.
XLSTAT Output
95% confidence interval on the mean: (12.672, 19.072)
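The confidence interval limits in Exercise 4 follow directly from the margin of error. Below is a minimal Python sketch, assuming the unrounded mean and standard deviation reported by XLSTAT for the presidents (15.872 and 9.871) and the critical value 2.024 quoted in the solution; the variable names are illustrative.

```python
import math

# Sample statistics for presidents (unrounded XLSTAT values) and the
# critical t value for 38 degrees of freedom quoted in the solution.
x_bar, s, n = 15.872, 9.871, 39
t_crit = 2.024

# Margin of error E = t * s / sqrt(n); the interval is x_bar - E to x_bar + E.
margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"{lower:.1f} years < mu < {upper:.1f} years")
```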
5. a. H0: $\mu_1 = \mu_2 = \mu_3$
b. Because the P-value of 0.054 is greater than the significance level of 0.05, fail to reject the null hypothesis of equal means. There is not sufficient evidence to warrant rejection of the claim that the three means are equal. The three populations do not appear to have means that are significantly different.
6. a. $z_{x=5.600} = \dfrac{5.600 - 5.670}{0.062} = -1.13$ and $z_{x=5.700} = \dfrac{5.700 - 5.670}{0.062} = 0.48$, which have a probability of 0.6844 - 0.1292 = 0.5552 (Excel: 0.5563) between them.
fx | =NORM.DIST(5.700,5.670,0.062,1)-NORM.DIST(5.600,5.670,0.062,1)
b. $z_{\bar{x}=5.675} = \dfrac{5.675 - 5.670}{0.062/\sqrt{25}} = 0.40$, which has a probability of 1 - 0.6554 = 0.3446 (Excel: 0.3434) to the right.
fx | =1-NORM.DIST(5.675,5.670,0.062/SQRT(25),1)
c. Half the quarters will weigh less than the mean of 5.670 g, so the probability that eight randomly selected quarters all weigh less than 5.670 g is $\left(\dfrac{1}{2}\right)^8 = \dfrac{1}{256}$, or about 0.00391.
d. The z score for the bottom 10% of values is -1.28, which corresponds to a weight of 5.670 + (-1.28)(0.062) = 5.591 g.
fx | =NORM.INV(0.1,5.67,0.062)
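The four parts of Exercise 6 can be reproduced without Excel. This is a minimal Python sketch using the standard library's `statistics.NormalDist`, assuming the mean of 5.670 g and standard deviation of 0.062 g used in the solution; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Weights of individual quarters: mean 5.670 g, standard deviation 0.062 g.
weights = NormalDist(mu=5.670, sigma=0.062)

# a. Probability that one quarter weighs between 5.600 g and 5.700 g.
p_between = weights.cdf(5.700) - weights.cdf(5.600)

# b. Sample means of n = 25 quarters have standard deviation sigma / sqrt(n).
sample_means = NormalDist(mu=5.670, sigma=0.062 / math.sqrt(25))
p_mean_above = 1 - sample_means.cdf(5.675)

# c. Eight independent quarters, each below the mean with probability 1/2.
p_all_eight_below = 0.5 ** 8

# d. Weight separating the bottom 10% from the rest.
tenth_percentile = weights.inv_cdf(0.10)

print(round(p_between, 4), round(p_mean_above, 4))
print(p_all_eight_below, round(tenth_percentile, 3))
```

The results line up with the Excel figures quoted in the solution rather than the table-based ones, because no z score is rounded to two decimal places along the way.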
7. H0: Symptoms 28 days after a concussion are independent of physical activity within 7 days of the event. H1: Symptoms 28 days after a concussion depend on physical activity within 7 days of the event. Test statistic: $\chi^2 = 85.945$; P-value = 0.000 (Table: P-value < 0.005); Critical value: $\chi^2 = 3.841$; df = (2 - 1)(2 - 1) = 1; Reject H0. There is sufficient evidence to warrant rejection of the claim of independence between physical activity within 7 days after an acute concussion and symptoms 28 days after the concussion. It appears that early physical activity during the week after a concussion does have an effect on symptoms 28 days after a concussion.
$$\chi^2 = \frac{(413 - 509.4)^2}{509.4} + \frac{(1264 - 1167.6)^2}{1167.6} + \frac{(320 - 223.58)^2}{223.58} + \frac{(416 - 512.48)^2}{512.48} = 85.945$$
XLSTAT Output for Exercise 7
Test of independence between the rows and the columns (Chi-square): Chi-square (Observed value) 85.945; Chi-square (Critical value) 3.841; DF 1; p-value <0.0001; alpha 0.050
XLSTAT Output for Exercise 8
Chi-square test: Chi-square (Observed value) 3; Chi-square (Critical value) 16.919; DF 9; p-value 0.964; alpha 0.050
8. a. Because the vertical scale begins at 15 instead of 0, the graph is deceptive by exaggerating the differences among the frequencies.
b. No, a normal distribution is approximately bell-shaped, but the given histogram is far from being bell-shaped. Because the digits are supposed to be equally likely, the histogram should be flat with all bars having approximately the same height.
c. The frequencies are 19, 21, 22, 21, 18, 23, 16, 16, 22, and 22.
d. H0: $p_0 = p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = p_7 = p_8 = p_9 = 0.10$; H1: At least one of the proportions is not equal to the given claimed value. Test statistic: $\chi^2 = 3.000$; P-value = 0.964 (Table: P-value > 0.95); Critical value ($\alpha = 0.05$): $\chi^2 = 16.919$; Fail to reject H0. There is not sufficient evidence to warrant rejection of the claim that the digits are selected from a population in which the digits are all equally likely. There does not appear to be a problem with the lottery.
$$\chi^2 = \frac{(19 - 20)^2}{0.10 \cdot 200} + \frac{(21 - 20)^2}{0.10 \cdot 200} + \cdots + \frac{(22 - 20)^2}{0.10 \cdot 200} + \frac{(22 - 20)^2}{0.10 \cdot 200} = 3.000 \quad (\text{df} = 9)$$
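Both chi-square statistics in Exercises 7 and 8 are sums of (O - E)^2 / E terms, so they can be checked with a few lines of Python. The sketch below is only illustrative: the 2 x 2 arrangement of the concussion counts is an assumption inferred from the expected values in the solution (transposing the table would give the same statistic), and the helper name `chi_square` is hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over matching observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Exercise 8: goodness of fit for ten digits claimed to be equally likely.
digit_counts = [19, 21, 22, 21, 18, 23, 16, 16, 22, 22]
n = sum(digit_counts)
chi_gof = chi_square(digit_counts, [0.10 * n] * len(digit_counts))

# Exercise 7: test of independence. The cell layout is assumed, not given.
table = [[413, 1264], [320, 416]]
row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

# Expected count for each cell = (row total)(column total) / (grand total).
observed_cells = [cell for row in table for cell in row]
expected_cells = [r * c / grand_total for r in row_totals for c in col_totals]
chi_independence = chi_square(observed_cells, expected_cells)

print(round(chi_gof, 3), round(chi_independence, 3))
```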
Chapter 13: Nonparametric Tests
13-2: Sign Test
1. a. The only requirement for the matched pairs is that they constitute a simple random sample.
b. There is no requirement of a normal distribution or any other specific distribution.
c. The sign test is "distribution free" in the sense that it does not require a normal distribution or any other specific distribution.
2. There is 1 positive sign, 5 negative signs, 1 tie, n = 6, and the test statistic is x = 1 (the smaller of 1 and 5).
3. H0: There is no difference between hospital admissions due to traffic accidents that occur on Friday the 6th and those that occur on the following Friday the 13th. H1: There is a difference between hospital admissions due to traffic accidents that occur on Friday the 6th and those that occur on the following Friday the 13th. The sample data do not contradict H1 because the numbers of positive signs (1) and negative signs (5) are not exactly the same.
4. The efficiency of 0.63 indicates that with all other things being equal, the sign test requires 100 sample observations to achieve the same results as 63 sample observations analyzed through a parametric test.
5. The test statistic of x = 3 is not less than or equal to the critical value of 1 (from Table A-7). There is not sufficient evidence to warrant rejection of the claim of no difference. Based on the given data, there does not appear to be a significant difference between measured weights and reported weights.
XLSTAT Output for Exercise 5
Sign test / Two-tailed test: N+ 3; Expected value 5.000; Variance (N+) 2.500; p-value (Two-tailed) 0.344; alpha 0.050
XLSTAT Output for Exercise 6
Sign test / Two-tailed test: N+ 2; Expected value 3.000; Variance (N+) 1.500; p-value (Two-tailed) 0.688; alpha 0.050
6. The test statistic of x = 2 is not less than or equal to the critical value of 0 (from Table A-7). There is not sufficient evidence to warrant rejection of the claim that there is no significant difference between the numbers of words spoken by males and females in couple relationships. Based on the given data, there does not appear to be a significant difference.
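For small samples such as those in Exercises 5 and 6, the two-tailed P-values reported by XLSTAT can be obtained from the binomial distribution with p = 0.5. The following is a minimal Python sketch; the function name is hypothetical, and the sample sizes n = 10 and n = 6 are assumptions inferred from the XLSTAT expected values (n/2 = 5.000 and 3.000).

```python
from math import comb

def sign_test_p_value(x, n):
    """Two-tailed P-value: 2 * P(X <= x) for X ~ Binomial(n, 0.5), capped at 1."""
    lower_tail = sum(comb(n, k) for k in range(x + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)

# x is the less frequent sign count; n excludes ties.
p_exercise_5 = sign_test_p_value(3, 10)
p_exercise_6 = sign_test_p_value(2, 6)

print(p_exercise_5, p_exercise_6)
```

This is the same calculation as the `=2*BINOM.DIST(x,n,0.5,1)` pattern used in the manual's Excel commands.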
7. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(22 + 0.5) - \frac{56}{2}}{\frac{\sqrt{56}}{2}} = -1.47$ results in a P-value of 0.1416, and it does not fall in the critical region bounded by z = -1.96 and z = 1.96. There is not sufficient evidence to warrant rejection of the claim that there is no significant difference between the numbers of words spoken by males and females in couple relationships. Based on the given data, there does not appear to be a significant difference.
XLSTAT Output for Exercise 7
Sign test / Two-tailed test: N+ 22; Expected value 28.000; Variance (N+) 14.000; p-value (Two-tailed) 0.141; alpha 0.050
8. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(20 + 0.5) - \frac{88}{2}}{\frac{\sqrt{88}}{2}} = -5.01$ results in a P-value of 0.0000, and it falls in the critical region bounded by z = -1.96 and z = 1.96. There is sufficient evidence to warrant rejection of the claim that there is no significant difference between ages of Oscar-winning actresses and Oscar-winning actors. Based on the given data, there does appear to be a significant difference between ages of Oscar-winning actresses and Oscar-winning actors.
XLSTAT Output for Exercise 8
Sign test / Two-tailed test: N+ 20; Expected value 44.000; Variance (N+) 22.000; p-value (Two-tailed) <0.0001; alpha 0.050
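For n > 25 the sign test uses the normal approximation with a continuity correction, as in Exercises 7 and 8. The following is a minimal Python sketch of that conversion; the function names are hypothetical, and the inputs are the x and n values from those two exercises.

```python
from math import sqrt
from statistics import NormalDist

def sign_test_z(x, n):
    """z = ((x + 0.5) - n/2) / (sqrt(n)/2), where x is the less frequent sign count."""
    return ((x + 0.5) - n / 2) / (sqrt(n) / 2)

def two_tailed_p(z):
    """Two-tailed P-value from the standard normal distribution."""
    return 2 * NormalDist().cdf(-abs(z))

z7 = sign_test_z(22, 56)   # Exercise 7
z8 = sign_test_z(20, 88)   # Exercise 8

print(round(z7, 2), round(two_tailed_p(z7), 4))
print(round(z8, 2), two_tailed_p(z8) < 0.0001)
```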
9. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(19 + 0.5) - \frac{48}{2}}{\frac{\sqrt{48}}{2}} = -1.30$ results in a P-value of 0.1936, and it does not fall in the critical region bounded by z = -1.96 and z = 1.96. There is not sufficient evidence to warrant rejection of the claim that toast will land with the buttered side down 50% of the time.
Excel Commands for Exercise 9
fx | =2*NORM.S.DIST(-1.30,1)
fx | =2*BINOM.DIST(19,48,0.5,1)
Excel Commands for Exercise 10
fx | =2*NORM.S.DIST(-13.78,1)
fx | =2*BINOM.DIST(372,1228,0.5,1)
10. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(372 + 0.5) - \frac{1228}{2}}{\frac{\sqrt{1228}}{2}} = -13.78$ results in a P-value of 0.0000, and it falls in the critical region bounded by z = -2.575 and z = 2.575. There is sufficient evidence to support the claim that there is a difference between the rate of medical malpractice lawsuits that go to trial and the rate of such lawsuits that are dropped or dismissed.
11. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(208 + 0.5) - \frac{460}{2}}{\frac{\sqrt{460}}{2}} = -2.00$ results in a P-value of 0.0455, and it falls in the critical region bounded by z = -1.96 and z = 1.96. There is sufficient evidence to warrant rejection of the claim that the coin toss is fair in the sense that neither team has an advantage by winning it. The coin toss does not appear to be fair.
Excel Commands for Exercise 11
fx | =2*NORM.S.DIST(-2.00,1)
fx | =2*BINOM.DIST(208,460,0.5,1)
Excel Commands for Exercise 12
fx | =2*NORM.S.DIST(-0.57,1)
fx | =2*BINOM.DIST(52,111,0.5,1)
12. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(52 + 0.5) - \frac{111}{2}}{\frac{\sqrt{111}}{2}} = -0.57$ results in a P-value of 0.5686, and it does not fall in the critical region bounded by z = -1.96 and z = 1.96. There is not sufficient evidence to warrant rejection of the claim that the coin toss is fair in the sense that neither team has an advantage by winning it. The coin toss appears to be fair.
13. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(19 + 0.5) - \frac{84}{2}}{\frac{\sqrt{84}}{2}} = -4.91$ results in a P-value of 0.0000, and it falls in the critical region bounded by z = -2.575 and z = 2.575. There is sufficient evidence to warrant rejection of the claim that the median is equal to 98.6°F.
XLSTAT Output for Exercise 13
Sign test / Two-tailed test: N+ 19; Expected value 42.000; Variance (N+) 21.000; p-value (Two-tailed) <0.0001; alpha 0.010
14. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(12 + 0.5) - \frac{38}{2}}{\frac{\sqrt{38}}{2}} = -2.11$ results in a P-value of 0.0349 (Table: 0.0348), and it falls in the critical region bounded by z = -1.96 and z = 1.96. There is sufficient evidence to warrant rejection of the claim that the median is equal to 8.953 g.
XLSTAT Output for Exercise 14
Sign test / Two-tailed test: N+ 12; Expected value 19.000; Variance (N+) 9.500; p-value (Two-tailed) 0.034; alpha 0.050
15. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(5 + 0.5) - \frac{902}{2}}{\frac{\sqrt{902}}{2}} = -29.67$ results in a P-value of 0.0000, and it falls in the critical region bounded by z = -2.575 and z = 2.575. There is sufficient evidence to warrant rejection of the claim that smokers have a median cotinine level equal to 2.84 ng/mL.
XLSTAT Output for Exercise 15
Sign test / Two-tailed test: N+ 5; Expected value 451.000; Variance (N+) 225.500; p-value (Two-tailed) <0.0001; alpha 0.010
16. The test statistic of $z = \dfrac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \dfrac{(18 + 0.5) - \frac{46}{2}}{\frac{\sqrt{46}}{2}} = -1.33$ results in a P-value of 0.1835 (Table: 0.1836), and it does not fall in the critical region bounded by z = -2.575 and z = 2.575. There is not sufficient evidence to warrant rejection of the claim that the median of all 5:00 PM Tower of Terror wait times is equal to 30 minutes.
XLSTAT Output for Exercise 16
Sign test / Two-tailed test: N+ 18; Expected value 23.000; Variance (N+) 11.500; p-value (Two-tailed) 0.184; alpha 0.010
17. Second approach: The test statistic of $z = \dfrac{(30 + 0.5) - \frac{105}{2}}{\frac{\sqrt{105}}{2}} = -4.29$ is in the critical region bounded by z = -1.645, so the conclusions are the same as in Example 4.
Third approach: The test statistic of $z = \dfrac{(38 + 0.5) - \frac{106}{2}}{\frac{\sqrt{106}}{2}} = -2.82$ is in the critical region bounded by z = -1.645, so the conclusions are the same as in Example 4.
The different approaches can lead to very different results, as seen in the test statistics of -4.21, -4.29, and -2.82. The conclusions are the same in this case, but they could be different in other cases.
18. The column entries are *, *, *, *, *, 0, 0, 0.
13-3: Wilcoxon Signed-Ranks Test for Matched Pairs
1. a. The only requirements are that the matched pairs be a simple random sample and the population of differences be approximately symmetric.
b. There is no requirement of a normal distribution or any other specific distribution.
c. The Wilcoxon signed-ranks test is "distribution free" in the sense that it does not require a normal distribution or any other specific distribution.
2. a. -4, -6, -3, 1, -1, -7
b. 4, 5, 3, 1.5, 1.5, 6
c. -4, -5, -3, 1.5, -1.5, -6
d. 1.5 and 19.5
e. T = 1.5
f. The critical value of T is 1.
3. The sign test uses only the signs of the differences, but the Wilcoxon signed-ranks test uses ranks that are affected by the magnitudes of the differences.
4. The efficiency rating of 0.95 indicates that with all other factors being the same, the Wilcoxon rank-sum test requires 100 sample observations to achieve the same results as 95 observations with the parametric t test of Section 9-2, assuming that the stricter requirements of the parametric t test are satisfied.
5. Test statistic: T = 7 + 9 + 6 = 22; Critical value: T = 8; P-value = 0.575; Fail to reject the null hypothesis that the population of differences has a median of 0. There is not sufficient evidence to warrant rejection of the claim of no difference between measured weights and reported weights of females.
XLSTAT Output for Exercise 5
Wilcoxon signed-rank test / Two-tailed test: V 22; V (standardized) -0.561; Expected value 27.500; Variance (V) 96.125; p-value (Two-tailed) 0.575; alpha 0.050
6. Test statistic: T = 4 + 6 = 10; Critical value: T = 1; P-value = 1; Fail to reject the null hypothesis that the population of differences has a median of 0. There is not sufficient evidence to warrant rejection of the claim of no significant difference between the numbers of words spoken by males and females in couple relationships. Based on the given data, there does not appear to be a significant difference.
XLSTAT Output for Exercise 6
Wilcoxon signed-rank test / Two-tailed test: V 10; Expected value 10.500; Variance (V) 22.750; p-value (Two-tailed) 1.000; alpha 0.050
7. Convert T = 612 to the test statistic z = -1.52. P-value: 0.1285 (Table: 0.1286); Critical values: $z = \pm 1.96$; There is not sufficient evidence to warrant rejection of the claim of no significant difference between the numbers of words spoken by males and females in couple relationships. Based on the given data, there does not appear to be a significant difference.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{612 - \dfrac{56(56+1)}{4}}{\sqrt{\dfrac{56(56+1)(2 \cdot 56+1)}{24}}} = -1.52$$
XLSTAT Output for Exercise 7
Wilcoxon signed-rank test / Two-tailed test: V 612; V (standardized) -1.517; Expected value 798.000; Variance (V) 15028.875; p-value (Two-tailed) 0.129; alpha 0.050
8. Convert T = 721 to the test statistic z = -5.15. P-value < 0.0001; Critical values: $z = \pm 1.96$; There is sufficient evidence to warrant rejection of the claim that there is no significant difference between ages of Oscar-winning actresses and Oscar-winning actors. Based on the given data, there does appear to be a significant difference.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{721 - \dfrac{88(88+1)}{4}}{\sqrt{\dfrac{88(88+1)(2 \cdot 88+1)}{24}}} = -5.15$$
XLSTAT Output for Exercise 8
Wilcoxon signed-rank test / Two-tailed test: V 721; V (standardized) -5.148; Expected value 1958.000; Variance (V) 57734.125; p-value (Two-tailed) <0.0001; alpha 0.050
9. Convert T = 491.5 to the test statistic z = -5.77. P-value < 0.0001; Critical values: $z = \pm 2.575$; There is sufficient evidence to warrant rejection of the claim that the median is equal to 98.6°F.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{491.5 - \dfrac{84(84+1)}{4}}{\sqrt{\dfrac{84(84+1)(2 \cdot 84+1)}{24}}} = -5.77$$
XLSTAT Output for Exercise 9
Wilcoxon signed-rank test / Two-tailed test: V 546.500; V (standardized) -5.527; Expected value 1785.000; Variance (V) 50214.625; p-value (Two-tailed) <0.0001; alpha 0.010
XLSTAT Output for Exercise 10
Wilcoxon signed-rank test / Two-tailed test: V 183; V (standardized) -2.719; Expected value 370.500; Variance (V) 4753.750; p-value (Two-tailed) 0.007; alpha 0.050
10. Convert T = 184 to the test statistic z = -2.70. P-value = 0.007 (Table: 0.0070); Critical values: $z = \pm 1.96$; There is sufficient evidence to warrant rejection of the claim that the median is equal to 8.953 g.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{184 - \dfrac{38(38+1)}{4}}{\sqrt{\dfrac{38(38+1)(2 \cdot 38+1)}{24}}} = -2.70$$
11. Convert T = 19 to the test statistic z = -26.01. P-value < 0.0001; Critical values: $z = \pm 2.575$; There is sufficient evidence to warrant rejection of the claim that smokers have a median cotinine level equal to 2.84 ng/mL.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{19 - \dfrac{902(902+1)}{4}}{\sqrt{\dfrac{902(902+1)(2 \cdot 902+1)}{24}}} = -26.01$$
XLSTAT Output for Exercise 11
Wilcoxon signed-rank test / Two-tailed test: V 19; V (standardized) -26.014; Expected value 203626.500; Variance (V) 61257455.500; p-value (Two-tailed) <0.0001; alpha 0.010
12. Convert T = 410.5 to the test statistic z = -1.42. P-value = 0.1556; Critical values: $z = \pm 2.575$; There is not sufficient evidence to warrant rejection of the claim that the median of all 5:00 PM Tower of Terror wait times is equal to 30 minutes.
$$z = \frac{T - \dfrac{n(n+1)}{4}}{\sqrt{\dfrac{n(n+1)(2n+1)}{24}}} = \frac{410.5 - \dfrac{46(46+1)}{4}}{\sqrt{\dfrac{46(46+1)(2 \cdot 46+1)}{24}}} = -1.42$$
XLSTAT Output for Exercise 12
Wilcoxon signed-rank test / Two-tailed test: V 410.500; V (standardized) -1.435; Expected value 540.500; Variance (V) 8209.125; p-value (Two-tailed) 0.151; alpha 0.010
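The conversion from the rank sum T to a z score in the exercises above uses the same formula each time, so it is easy to check numerically. The following is a minimal Python sketch without any tie correction (which is one reason XLSTAT's standardized V can differ slightly from the manual's z); the function name is hypothetical.

```python
from math import sqrt

def wilcoxon_z(t, n):
    """z = (T - n(n+1)/4) / sqrt(n(n+1)(2n+1)/24) for the signed-ranks test."""
    mean_t = n * (n + 1) / 4
    sd_t = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (t - mean_t) / sd_t

# (T, n) pairs taken from Exercises 10, 11, and 12 above.
for label, t, n in [("Exercise 10", 184, 38), ("Exercise 11", 19, 902), ("Exercise 12", 410.5, 46)]:
    print(label, round(wilcoxon_z(t, n), 2))
```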
13. a. Smallest: T = 0; Largest: $T = 1 + 2 + 3 + \cdots + 48 + 49 + 50 = \dfrac{50(50+1)}{2} = 1275$
b. $\dfrac{1275}{2} = 637.5$
c. 1275 - 165 = 1110
d. $\dfrac{n(n+1)}{2} - k$
13-4: Wilcoxon Rank-Sum Test for Two Independent Samples
1. Yes, the two samples are independent. (The heights from ANSUR I and ANSUR II are not matched in any way.) Each sample has more than 10 values.
2. The ranks are given in the table below: $\sum R = 12 + 25 + 3 + 27 + 9 + 21 + 15 + 7.5 + 19 + 22 + 7.5 + 4 = 172$.
ANSUR I 1988 (rank): 1620 (12), 1693 (25), 1558 (3), 1783 (27), 1609 (9), 1649 (21), 1628 (15), 1597 (7.5), 1640 (19), 1660 (22), 1597 (7.5), 1569 (4)
ANSUR II 2012 (rank): 1672 (23), 1621 (13), 1623 (14), 1633 (17), 1637 (18), 1718 (26), 1588 (6), 1520 (1), 1526 (2), 1618 (11), 1570 (5), 1631 (16), 1616 (10), 1642 (20), 1690 (24)
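The ranks in Exercise 2 come from pooling both samples, sorting, and giving tied values the average of the ranks they occupy. The following is a minimal Python sketch of that procedure; the assignment of heights to the two samples is read off the rank table above, and the function name is hypothetical.

```python
# Heights (mm) as listed in the rank table above.
ansur_1 = [1620, 1693, 1558, 1783, 1609, 1649, 1628, 1597, 1640, 1660, 1597, 1569]
ansur_2 = [1672, 1621, 1623, 1633, 1637, 1718, 1588, 1520,
           1526, 1618, 1570, 1631, 1616, 1642, 1690]

def average_ranks(values):
    """Map each distinct value to its rank, averaging the ranks of tied values."""
    ordered = sorted(values)
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        # Sorted positions i+1 through j are tied, so they share the mean rank.
        ranks[ordered[i]] = (i + 1 + j) / 2
        i = j
    return ranks

ranks = average_ranks(ansur_1 + ansur_2)
r1 = sum(ranks[v] for v in ansur_1)
r2 = sum(ranks[v] for v in ansur_2)

print(r1, r2)
```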
3. H0: The sample of heights from ANSUR I and the sample of heights from ANSUR II are from populations with the same median. There are three different possible alternative hypotheses.
H1: The sample of heights from ANSUR I and the sample of heights from ANSUR II are from populations with different medians.
H1: The ANSUR I sample is from a population with a lower median than the median of the second population.
H1: The ANSUR I sample is from a population with a higher median than the median of the second population.
4. The efficiency rating of 0.95 indicates that with all other factors being the same, the Wilcoxon rank-sum test requires 100 sample observations to achieve the same results as 95 observations with the parametric t test of Section 9-2, assuming that the stricter requirements of the parametric t test are satisfied.
5. $R_1 = 172$; $R_2 = 206$; $\mu_R = 168$; $\sigma_R = 20.4939$; Test statistic: z = 0.20; P-value = 0.857; Critical values: $z = \pm 1.96$; Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that the sample of female heights from ANSUR I and the sample of female heights from ANSUR II are from populations with the same median.
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{12(12 + 15 + 1)}{2} = 168$$
$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{12 \cdot 15 (12 + 15 + 1)}{12}} = 20.4939$$
$$z = \frac{R - \mu_R}{\sigma_R} = \frac{172 - 168}{20.4939} = 0.20$$
XLSTAT Output for Exercise 5
Mann-Whitney test / Two-tailed test: U 94; U (standardized) 0.000; Expected value 90.000; Variance (U) 419.872; p-value (Two-tailed) 0.857; alpha 0.050
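The three formulas used in Exercise 5 can be wrapped in one small function. The following is a minimal Python sketch, assuming the rank sum R = 172 and sample sizes n1 = 12 and n2 = 15 from that exercise; the function name is hypothetical, and no tie correction is applied.

```python
from math import sqrt

def rank_sum_z(r1, n1, n2):
    """Return (mu_R, sigma_R, z) for the Wilcoxon rank-sum test."""
    mu_r = n1 * (n1 + n2 + 1) / 2
    sigma_r = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mu_r, sigma_r, (r1 - mu_r) / sigma_r

mu_r, sigma_r, z = rank_sum_z(172, 12, 15)
print(mu_r, round(sigma_r, 4), round(z, 2))
```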
6. $R_1 = 194.5$; $R_2 = 105.5$; $\mu_R = 150$; $\sigma_R = 17.321$; Test statistic: z = 2.57; P-value = 0.0102 (XLSTAT: P-value = 0.008); Critical values: $z = \pm 1.96$; Reject the null hypothesis that the populations have the same median. There is sufficient evidence to reject the claim that the median amount of strontium-90 from Pennsylvania residents is the same as the median from New York residents.
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{12(12 + 12 + 1)}{2} = 150$$
$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{12 \cdot 12 (12 + 12 + 1)}{12}} = 17.321$$
$$z = \frac{R - \mu_R}{\sigma_R} = \frac{194.5 - 150}{17.321} = 2.57$$
XLSTAT Output for Exercise 6
Mann-Whitney test / Two-tailed test: U 116.500; U (standardized) 0.000; Expected value 72.000; Variance (U) 297.913; p-value (Two-tailed) 0.008; alpha 0.050
7. $R_1 = 253.5$; $R_2 = 124.5$; $\mu_R = 182$; $\sigma_R = 20.607$; Test statistic: z = 3.47; P-value = 0.0005; Critical values: $z = \pm 1.96$; Reject the null hypothesis that the populations have the same median. There is sufficient evidence to reject the claim that for those treated with 20 mg of Lipitor and those treated with 80 mg of Lipitor, changes in LDL cholesterol have the same median. It appears that the dosage amount does have an effect on the change in LDL cholesterol.
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{13(13 + 14 + 1)}{2} = 182$$
$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{13 \cdot 14 (13 + 14 + 1)}{12}} = 20.607$$
$$z = \frac{R - \mu_R}{\sigma_R} = \frac{253.5 - 182}{20.607} = 3.47$$
XLSTAT Output for Exercise 7
Mann-Whitney test / Two-tailed test: U 162.500; U (standardized) 0.000; Expected value 91.000; Variance (U) 422.463; p-value (Two-tailed) 0.000; alpha 0.050
8. $R_1 = 543.5$; $R_2 = 731.5$; $\mu_R = 561$; $\sigma_R = 51.1664$; Test statistic: z = -0.34; P-value = 0.732 (XLSTAT: P-value = 0.740); Critical values: $z = \pm 2.575$; Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that cars in two lines have a median waiting time equal to that of cars in a single line.
$$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{22(22 + 28 + 1)}{2} = 561$$
$$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{22 \cdot 28 (22 + 28 + 1)}{12}} = 51.1664$$
$$z = \frac{R - \mu_R}{\sigma_R} = \frac{543.5 - 561}{51.1664} = -0.34$$
XLSTAT Output for Exercise 8
Mann-Whitney test / Two-tailed test: U 290.500; U (standardized) -0.332; Expected value 308.000; Variance (U) 2617.497; p-value (Two-tailed) 0.740; alpha 0.010
9.
$R_1 = 7236.5$; $R_2 = 7298.5$; $\mu_R = 7267.5$; $\sigma_R = 320.868$; Test statistic: $z = -0.10$; P-value = 0.923; Critical values: $z = \pm 2.575$; Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that cars in two lines have a median waiting time equal to that of cars in a single line.
$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{85(85 + 85 + 1)}{2} = 7267.5$
$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(85)(85)(85 + 85 + 1)}{12}} = 320.868$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{7236.5 - 7267.5}{320.868} = -0.10$
XLSTAT Output for Exercise 9 Mann-Whitney test / Two-tailed test: U 3579.500 U (standardized) -0.101 Expected value 3612.500 Variance (U) 102955.244 p-value (Two-tailed) 0.919 alpha 0.010
XLSTAT Output for Exercise 10 Mann-Whitney test / Two-tailed test: U 731 U (standardized) 0.000 Expected value 777.000 Variance (U) 10360.000 p-value (Two-tailed) 0.657 alpha 0.010
10. $R_1 = 1434$; $R_2 = 1726$; $\mu_R = 1480$; $\sigma_R = 101.7841$; Test statistic: $z = -0.45$; P-value = 0.6527; (XLSTAT: P-value = 0.657); Critical values: $z = \pm 2.575$; Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to warrant rejection of the claim that the median of the numbers of words spoken by men in a day is the same as the median of the numbers of words spoken by women in a day.
$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{37(37 + 42 + 1)}{2} = 1480$
$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(37)(42)(37 + 42 + 1)}{12}} = 101.7841$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{1434 - 1480}{101.7841} = -0.45$
11. $R_1 = 501$; $R_2 = 445$; $\mu_R = 484$; $\sigma_R = 41.15823$; Test statistic: $z = 0.41$; P-value = 0.3409; Critical value: $z = 1.645$; Fail to reject the null hypothesis that the populations have the same median. There is not sufficient evidence to support the claim that subjects with medium lead levels have a higher median of the full IQ scores than subjects with high lead levels. Based on these data, it does not appear that lead level affects full IQ scores.
$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{22(22 + 21 + 1)}{2} = 484$
$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(22)(21)(22 + 21 + 1)}{12}} = 41.15823$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{501 - 484}{41.15823} = 0.41$
XLSTAT Output for Exercise 11 Mann-Whitney test / Upper-tailed test: U 248 U (standardized) 0.401 Expected value 231.000 Variance (U) 1691.058 p-value (one-tailed) 0.344 alpha 0.050
XLSTAT Output for Exercise 12 Mann-Whitney test / Upper-tailed test: U 1097 U (standardized) 2.377 Expected value 819.000 Variance (U) 13629.234 p-value (one-tailed) 0.009 alpha 0.050
12. $R_1 = 4178$; $R_2 = 772$; $\mu_R = 3900$; $\sigma_R = 116.833$; Test statistic: $z = 2.38$; P-value = 0.0087; (XLSTAT: P-value = 0.009); Critical value: $z = 1.645$; Reject the null hypothesis that the populations have the same median. There is sufficient evidence to support the claim that subjects with low lead levels have performance IQ scores with a higher median than subjects with high lead levels. It appears that exposure to lead does have an adverse effect.
$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{78(78 + 21 + 1)}{2} = 3900$
$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(78)(21)(78 + 21 + 1)}{12}} = 116.833$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{4178 - 3900}{116.833} = 2.38$
13. The test statistic is the same value with opposite sign.
$U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R = (12)(15) + \frac{12(12 + 1)}{2} - 127 = 131$
$z = \frac{U - \frac{n_1 n_2}{2}}{\sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}} = \frac{131 - \frac{(12)(15)}{2}}{\sqrt{\frac{(12)(15)(12 + 15 + 1)}{12}}} = 2.00$
14. a.
Rank 1   Rank 2   Rank 3   Rank 4   Rank Sum for Treatment A
A        A        B        B        3
A        B        A        B        4
A        B        B        A        5
B        B        A        A        7
B        A        A        B        5
B        A        B        A        6
b. The R values of 3, 4, 5, 6, 7 have probabilities of 1/6, 1/6, 2/6, 1/6, and 1/6, respectively.
c. No, none of the probabilities for the values of R would be less than 0.10.
13-5 : Kruskal-Wallis Test for Three or More Samples
1.
$R_1 = 103.5$, $R_2 = 48.5$, $R_3 = 90$, $R_4 = 83$
2.
Small       Midsize     Large       SUV
29 (13)     32 (20)     27 (10.5)   24 (4)
31 (16.5)   28 (12)     32 (20)     31 (16.5)
35 (23)     26 (8)      39 (24.5)   31 (16.5)
33 (22)     23 (3)      27 (10.5)   25 (5.5)
26 (8)      25 (5.5)    31 (16.5)   30 (14)
32 (20)                 26 (8)      39 (24.5)
21 (1)                              22 (2)
(Ranks for each value shown in parentheses.)
Yes, the samples are independent simple random samples, and each sample has at least five data values.
3.
$n_1 = 7$, $n_2 = 5$, $n_3 = 6$, $n_4 = 7$, $N = 7 + 5 + 6 + 7 = 25$
4.
The efficiency rating of 0.95 indicates that with all other factors being the same, the Kruskal-Wallis test requires 100 sample observations to achieve the same results as 95 observations with the parametric one-way analysis of variance test, assuming that the stricter requirements of the parametric test are satisfied.
5.
Test statistic: $H = 2.029$; Critical value: $\chi^2 = 7.815$; (Excel: P-value = 0.563); Fail to reject the null hypothesis of equal medians. There is not sufficient evidence to warrant rejection of the claim that small, midsize, large, and SUV vehicles have the same median HIC measurement in car crash tests.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} + \frac{R_4^2}{n_4}\right) - 3(N+1) = \frac{12}{25(25+1)}\left(\frac{103.5^2}{7} + \frac{48.5^2}{5} + \frac{90^2}{6} + \frac{83^2}{7}\right) - 3(25+1) = 2.029$
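The Kruskal-Wallis computation in Exercise 5 can be verified numerically from the rank sums and sample sizes. This is a minimal Python sketch; `kruskal_wallis_h` is a hypothetical helper name, and the formula is the uncorrected $H$ (no adjustment for ties), matching the hand calculation rather than the XLSTAT value.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Uncorrected Kruskal-Wallis H from rank sums R_i and sample sizes n_i."""
    n_total = sum(sizes)                                       # N
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))  # sum of R_i^2 / n_i
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

# Exercise 5: rank sums and sample sizes for small, midsize, large, SUV
h = kruskal_wallis_h([103.5, 48.5, 90, 83], [7, 5, 6, 7])
print(round(h, 3))
```

Software such as XLSTAT typically reports a slightly different value because it applies a correction for tied ranks.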
XLSTAT Output for Exercise 5 Kruskal-Wallis test / Two-tailed test: K (Observed value) 2.046 K (Critical value) 7.815 DF 3 p-value (one-tailed) 0.563 alpha 0.050
6.
XLSTAT Output for Exercise 6 Kruskal-Wallis test / Two-tailed test: K (Observed value) 22.996 K (Critical value) 9.210 DF 2 p-value (one-tailed) <0.0001 alpha 0.010
Test statistic: $H = 22.8157$; Critical value: $\chi^2 = 9.210$; (Excel: P-value = 0.000); Reject the null hypothesis of equal medians. It appears that the three states have median amounts of arsenic that are not all the same.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{36(36+1)}\left(\frac{192^2}{12} + \frac{116.5^2}{12} + \frac{357.5^2}{12}\right) - 3(36+1) = 22.8157$
7.
Test statistic: $H = 16.949$; Critical value: $\chi^2 = 9.210$; (Excel: P-value = 0.0002); Reject the null hypothesis of equal medians. There is sufficient evidence to warrant rejection of the claim that pages from books by those three authors have the same median Flesch Reading Ease score. At least one of the books has a median different from the others.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{36(36+1)}\left(\frac{201.5^2}{12} + \frac{337^2}{12} + \frac{127.5^2}{12}\right) - 3(36+1) = 16.949$
XLSTAT Output for Exercise 7 Kruskal-Wallis test / Two-tailed test: K (Observed value) 16.951 K (Critical value) 9.210 DF 2 p-value (one-tailed) 0.000 alpha 0.010 8.
XLSTAT Output for Exercise 8 Kruskal-Wallis test / Two-tailed test: K (Observed value) 5.532 K (Critical value) 5.991 DF 2 p-value (one-tailed) 0.063 alpha 0.050
Test statistic: $H = 5.329$; Critical value: $\chi^2 = 5.991$; (Excel: P-value = 0.063); Fail to reject the null hypothesis of equal medians. There is not sufficient evidence to warrant rejection of the claim that pages from books by those three authors have the same median number of characters per word. It appears that the three medians are about the same.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{36(36+1)}\left(\frac{247^2}{12} + \frac{154^2}{12} + \frac{265^2}{12}\right) - 3(36+1) = 5.329$
9.
Test statistic: $H = 2.745$; Critical value: $\chi^2 = 11.071$; (Excel: P-value = 0.739); Fail to reject the null hypothesis of equal medians. There is not sufficient evidence to warrant rejection of the claim that the six different colors of M&M candies have the same median weight. It does not seem feasible that the color would have much of an effect on the weight, so the result is as expected.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} + \frac{R_4^2}{n_4} + \frac{R_5^2}{n_5} + \frac{R_6^2}{n_6}\right) - 3(N+1) = \frac{12}{345(345+1)}\left(\frac{7699^2}{45} + \frac{13{,}531^2}{82} + \frac{7076^2}{41} + \frac{5832^2}{36} + \frac{15{,}426.5^2}{88} + \frac{10{,}120.5^2}{53}\right) - 3(345+1) = 2.745$
XLSTAT Output for Exercise 9 Kruskal-Wallis test / Two-tailed test: K (Observed value) 2.746 K (Critical value) 11.070 DF 5 p-value (one-tailed) 0.739 alpha 0.050
XLSTAT Output for Exercise 10 Kruskal-Wallis test / Two-tailed test: K (Observed value) 1175.356 K (Critical value) 9.210 DF 2 p-value (one-tailed) <0.0001 alpha 0.010
10. Test statistic: $H = 1172.165$; Critical value: $\chi^2 = 9.210$; (Excel: P-value = 0.000); Reject the null hypothesis of equal medians. The data suggest that the amounts of nicotine absorbed by smokers are different from the amounts absorbed by people who don’t smoke.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{1693(1693+1)}\left(\frac{1{,}100{,}703^2}{902} + \frac{229{,}121^2}{433} + \frac{104{,}147^2}{358}\right) - 3(1693+1) = 1172.165$
11. Test statistic: $H = 2.5999$; Critical value: $\chi^2 = 7.815$; (Excel: P-value = 0.458); Fail to reject the null hypothesis of equal medians. It appears that the four hospitals have birth weights with the same median.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} + \frac{R_4^2}{n_4}\right) - 3(N+1) = \frac{12}{400(400+1)}\left(\frac{20{,}189^2}{100} + \frac{21{,}262.5^2}{100} + \frac{20{,}106.5^2}{100} + \frac{18{,}642^2}{100}\right) - 3(400+1) = 2.5999$
XLSTAT Output for Exercise 11 Kruskal-Wallis test / Two-tailed test: K (Observed value) 2.608 K (Critical value) 7.815 DF 3 p-value (one-tailed) 0.456 alpha 0.050
XLSTAT Output for Exercise 12 Kruskal-Wallis test / Two-tailed test: K (Observed value) 116.081 K (Critical value) 7.815 DF 3 p-value (one-tailed) <0.0001 alpha 0.050
12. Test statistic: $H = 115.715$; Critical value: $\chi^2 = 7.815$; (Excel: P-value = 0.0001); Reject the null hypothesis of equal medians. There is sufficient evidence to warrant rejection of the claim that the four different rides have 10 AM wait times with the same median. At least one of the rides has a median 10 AM wait time that is different from the others.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3} + \frac{R_4^2}{n_4}\right) - 3(N+1) = \frac{12}{200(200+1)}\left(\frac{3096.5^2}{50} + \frac{5069.5^2}{50} + \frac{3332^2}{50} + \frac{8602^2}{50}\right) - 3(200+1) = 115.715$
13. Using the methods of this section, $H = 0.694$.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{19(19+1)}\left(\frac{86^2}{8} + \frac{50.5^2}{6} + \frac{53.5^2}{5}\right) - 3(19+1) = 0.694$
For the correction factor, the values of $t$ are 2, 2, 2, 2, and 4 (see table below), the values of $T = t^3 - t$ are 6, 6, 6, 6, and 60, and $\Sigma T = 84$. Using $\Sigma T = 84$ and $N = 8 + 6 + 5 = 19$, the corrected value is
$H = \frac{0.694}{1 - \frac{84}{19^3 - 19}} = 0.703$,
which is not substantially different from the value of 0.694. In this case, the large numbers of ties do not appear to have a dramatic effect on the test statistic $H$.
Lead Level   Rank   t   t^3 - t
85           1.5    2   6
90           4.5    2   6
97           6.5    4   60
100          17.0   2   6
107          28.0   2   6
SUM                     84
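The tie correction described in Exercise 13 divides $H$ by $1 - \Sigma T / (N^3 - N)$, where $T = t^3 - t$ for each group of tied values. A minimal Python sketch of that step follows; the function name `tie_corrected_h` is a hypothetical helper introduced only for illustration.

```python
def tie_corrected_h(h, tie_counts, n_total):
    """Apply the tie correction to a Kruskal-Wallis H statistic.

    tie_counts lists t, the number of observations in each group of ties.
    """
    total_t = sum(t ** 3 - t for t in tie_counts)        # sum of T = t^3 - t
    return h / (1 - total_t / (n_total ** 3 - n_total))  # corrected H

# Exercise 13: H = 0.694, tie group sizes 2, 2, 2, 2, 4, and N = 19
h_corrected = tie_corrected_h(0.694, [2, 2, 2, 2, 4], 19)
print(round(h_corrected, 3))
```

Because the divisor is always less than or equal to 1, the correction can only increase $H$, and the change is small unless ties are extensive relative to $N$.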
13-6 : Rank Correlation
1.
Jackpot (rank)   9   1     8   6   5   4     3     2     7
Tickets (rank)   9   1.5   8   7   5   3.5   3.5   1.5   6
2.
$r_s = 0.975$; Critical values: $\pm 0.700$; There is sufficient evidence to support the claim of a correlation between Powerball jackpot amounts and the numbers of tickets sold.
XLSTAT Output
Correlation matrix (Spearman):
Variables   Jackpot   Tickets
Jackpot     1         0.975
Tickets     0.975     1
p-values (Spearman):
Variables   Jackpot    Tickets
Jackpot     5.51E-06   0.000
Tickets     0.000      5.51E-06
3.
$r$ represents the linear correlation coefficient computed from sample paired data; $\rho$ represents the parameter of the linear correlation coefficient computed from a population of paired data; $r_s$ denotes the rank correlation coefficient computed from sample paired data; $\rho_s$ represents the rank correlation coefficient computed from a population of paired data. The subscript $s$ is used so that the rank correlation coefficient can be distinguished from the linear correlation coefficient $r$. The subscript does not represent the standard deviation $s$. It is used in recognition of Charles Spearman, who introduced the rank correlation method.
4.
The efficiency rating of 0.91 indicates that with all other factors being the same, rank correlation requires 100 pairs of sample observations to achieve the same results as 91 pairs of observations with the parametric test using linear correlation, assuming that the stricter requirements for using linear correlation are met.
5.
$r_s = -1.000$; Critical values: $\pm 0.786$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support a claim of a correlation between altitude and temperature.
Points from left to right
Altitude (rank)      1   2   3   4   5   6   7
Temperature (rank)   7   6   5   4   3   2   1
XLSTAT Output
Correlation matrix (Spearman):
Variables     Altitude   Temperature
Altitude      1          -1.000
Temperature   -1.000     1
p-values (Spearman):
Variables     Altitude   Temperature
Altitude      0.000397   0.000
Temperature   0.000      0.000397
Values in bold are different from 0 with a significance level alpha=0.05
6.
$r_s = 1.000$; Critical values: $\pm 0.618$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support a claim of a correlation between the time of travel and the distance traveled.
Points from left to right
Time (rank)       1   2   3   4   5   6   7   8   9   10   11
Distance (rank)   1   2   3   4   5   6   7   8   9   10   11
XLSTAT Output
Correlation matrix (Spearman):
Variables   Time    Distance
Time        1       1.000
Distance    1.000   1
p-values (Spearman):
Variables   Time      Distance
Time        0         <0.0001
Distance    <0.0001   0
Values in bold are different from 0 with a significance level alpha=0.05
7. $r_s = -0.644$; Critical values: $\pm 0.648$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim of a correlation between the quality ranks and the costs. It does not appear that more expensive brands have better quality.
Rank          1   2   3   4    5   6     7   8   9     10
Cost (rank)   9   8   4   10   7   2.5   6   1   2.5   5
XLSTAT Output
Correlation matrix (Spearman):
Variables   Rank     Cost
Rank        1        -0.644
Cost        -0.644   1
p-values (Spearman):
Variables   Rank    Cost
Rank        0       0.051
Cost        0.051   0
Values in bold are different from 0 with a significance level alpha=0.05
8.
$r_s = -0.395$; Critical values: $\pm 0.648$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim of a correlation between the quality ranks and the costs. It does not appear that more expensive brands have better quality.
Rank          1    2   3   4   5     6   7     8   9   10
Cost (rank)   10   3   7   6   4.5   8   4.5   2   9   1
XLSTAT Output
Correlation matrix (Spearman):
Variables   Rank     Cost
Rank        1        -0.395
Cost        -0.395   1
p-values (Spearman):
Variables   Rank    Cost
Rank        0       0.261
Cost        0.261   0
Values in bold are different from 0 with a significance level alpha=0.05
9.
$r_s = 1.000$; Critical values: $\pm 0.866$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to conclude that there is a correlation between overhead widths of seals from photographs and the weights of the seals.
Overhead width (rank)   1   2   6   5   4   3
Weight (rank)           1   2   6   5   4   3
XLSTAT Output
Correlation matrix (Spearman):
Variables   Width   Weight
Width       1       1.000
Weight      1.000   1
p-values (Spearman):
Variables   Width      Weight
Width       0.002778   0.003
Weight      0.003      0.002778
Values in bold are different from 0 with a significance level alpha=0.05
10. $r_s = 0.923$; Critical values: $\pm 0.648$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support the claim of a correlation between cheese consumption and numbers of civil engineering PhD degrees awarded. It should be noted that this is a spurious correlation. Common sense suggests that there is no real association between cheese consumption and numbers of civil engineering PhD degrees awarded.
Cheese Consumption (rank)       1   3   3   3   5   6   7   10   8.5   8.5
Civil Engineering PhDs (rank)   1   2   3   5   4   6   7   8    10    9
XLSTAT Output
Correlation matrix (Spearman):
Variables     Consumption   PhDs
Consumption   1             0.923
PhDs          0.923         1
p-values (Spearman):
Variables     Consumption   PhDs
Consumption   0             0.000
PhDs          0.000         0
Values in bold are different from 0 with a significance level alpha=0.05
11. $r_s = -0.434$; Critical values: $\pm 0.700$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim of a correlation between heights of winning presidential candidates and heights of their main opponents. Hopefully, we do not elect our presidents based on heights or any other physical appearance.
President (rank)   9     2     1   4   6   6   3   6   8
Opponent (rank)    5.5   5.5   7   4   2   9   8   3   1
XLSTAT Output
Correlation matrix (Spearman):
Variables   President   Opponent
President   1           -0.434
Opponent    -0.434      1
p-values (Spearman):
Variables   President   Opponent
President   5.51E-06    0.230
Opponent    0.230       5.51E-06
Values in bold are different from 0 with a significance level alpha=0.05
12. $r_s = 0.857$; Critical values: $\pm 0.738$; Reject the null hypothesis of $\rho_s = 0$. There is sufficient evidence to conclude that there is a correlation between the number of chirps in 1 min and the temperature.
Chirps (rank)        2   7   6   1   8   5   4   3
Temperature (rank)   1   8   6   3   7   5   2   4
XLSTAT Output
Correlation matrix (Spearman):
Variables     Chirps/min   Temperature
Chirps/min    1            0.857
Temperature   0.857        1
p-values (Spearman):
Variables     Chirps/min   Temperature
Chirps/min    4.96E-05     0.015
Temperature   0.015        4.96E-05
Values in bold are different from 0 with a significance level alpha=0.05
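The rank correlation coefficients in this section can be reproduced from the tabulated ranks. The sketch below, in Python, computes $r_s$ as the linear correlation coefficient of the two rank lists, which remains valid when ranks are tied; `spearman_from_ranks` is a hypothetical helper name, and the data are the ranks from Exercise 12.

```python
import math

def spearman_from_ranks(x_ranks, y_ranks):
    """Rank correlation computed as the Pearson correlation of the ranks."""
    n = len(x_ranks)
    mean_x = sum(x_ranks) / n
    mean_y = sum(y_ranks) / n
    # Sums of cross products and squared deviations of the ranks
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(x_ranks, y_ranks))
    sxx = sum((x - mean_x) ** 2 for x in x_ranks)
    syy = sum((y - mean_y) ** 2 for y in y_ranks)
    return sxy / math.sqrt(sxx * syy)

# Exercise 12 ranks
chirps = [2, 7, 6, 1, 8, 5, 4, 3]
temperature = [1, 8, 6, 3, 7, 5, 2, 4]
rs = spearman_from_ranks(chirps, temperature)
print(round(rs, 3))
```

When there are no ties, this gives the same result as the shortcut $r_s = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)}$.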
13. $r_s = 0.429$; Critical values: $r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{703 - 1}} = \pm 0.074$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support the claim of a correlation between distance of the ride and the amount of the tip. It does appear that riders base their tips on the distance of the ride.
XLSTAT Output
Correlation matrix (Spearman):
Variables   Distance   Tip
Distance    1          0.429
Tip         0.429      1
p-values (Spearman):
Variables   Distance   Tip
Distance    0          <0.0001
Tip         <0.0001    0
Values in bold are different from 0 with a significance level alpha=0.05
14. $r_s = 0.867$; Critical values: $r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{703 - 1}} = \pm 0.074$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support the claim of a correlation between distance of the ride and the time of the ride.
XLSTAT Output
Correlation matrix (Spearman):
Variables   Distance   Time
Distance    1          0.867
Time        0.867      1
p-values (Spearman):
Variables   Distance   Time
Distance    0          <0.0001
Time        <0.0001    0
Values in bold are different from 0 with a significance level alpha=0.05
15. $r_s = 0.024$; Critical values: $r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{91 - 1}} = \pm 0.207$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim of a correlation between ages of Best Actresses and Best Actors at the times they won Oscars. There does not appear to be a correlation between ages of Best Actresses and Best Actors.
XLSTAT Output
Correlation matrix (Spearman):
Variables   Actresses   Actors
Actresses   1           0.024
Actors      0.024       1
p-values (Spearman):
Variables   Actresses   Actors
Actresses   0           0.820
Actors      0.820       0
Values in bold are different from 0 with a significance level alpha=0.05
16. $r_s = 0.106$; Critical values: $r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{20 - 1}} = \pm 0.450$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim of a correlation between brain volume and IQ score.
XLSTAT Output
Correlation matrix (Spearman):
Variables   IQ      Volume
IQ          1       0.106
Volume      0.106   1
p-values (Spearman):
Variables   IQ         Volume
IQ          5.98E-06   0.657
Volume      0.657      5.98E-06
Values in bold are different from 0 with a significance level alpha=0.05
17. $r_s = \pm\sqrt{\frac{t^2}{t^2 + n - 2}} = \pm\sqrt{\frac{1.987^2}{1.987^2 + 91 - 2}} = \pm 0.206$ (df $= 91 - 2 = 89$); The critical values of $\pm 0.206$ are very close to the critical values of $\pm 0.207$ found by using Formula 13-1 on page 721.
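The large-sample critical values used in Exercises 13 through 17 follow from two short formulas: the normal approximation $\pm z/\sqrt{n-1}$ and the $t$-based alternative $\pm\sqrt{t^2/(t^2+n-2)}$. The Python sketch below evaluates both; the function names are hypothetical, and the critical $z$ and $t$ values are passed in as given in the solutions rather than computed.

```python
import math

def rs_critical_normal(z, n):
    """Positive critical value of r_s from the normal approximation."""
    return z / math.sqrt(n - 1)

def rs_critical_t(t, n):
    """Positive critical value of r_s from the t distribution with n - 2 df."""
    return math.sqrt(t * t / (t * t + n - 2))

# Exercises 13/14 (n = 703), 15 (n = 91), and 16 (n = 20), all with z = 1.96
for n in (703, 91, 20):
    print(n, round(rs_critical_normal(1.96, n), 3))

# Exercise 17: t = 1.9872 with df = 89 (n = 91)
print(round(rs_critical_t(1.9872, 91), 3))
```

For n = 91 the two approaches differ only in the third decimal place, which is the comparison Exercise 17 makes.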
13-7 : Runs Test for Randomness
1.
No, the runs test can be used to determine whether the sequence of political parties is not random, but the runs test does not show whether the proportion of Republicans is significantly greater than the proportion of Democrats.
2.
$n_1 = 17$; $n_2 = 10$; $G = 17$
3.
The critical values are 8 and 19. Because $G = 17$ is not less than or equal to 8 nor is $G = 17$ greater than or equal to 19, fail to reject randomness. There is not sufficient evidence to conclude that the sequence of political parties is not random. The sequence appears to be random.
4.
No, it is very possible that the sequence of data appears to be random, yet the sampling method (such as voluntary response sampling) might be very unsuitable for statistical methods. Failure to reject randomness does not imply that the sample is good for statistical methods.
5.
The median is 161.0 fatalities. $n_1 = 15$; $n_2 = 15$; $G = 16$; Critical values: 10, 22 (Table A-10); There is not sufficient evidence to reject randomness. There does not appear to be a trend of increasing or decreasing values.
6.
$n_1 = 14$; $n_2 = 11$; $G = 11$; Critical values: 8, 19 (Table A-10); Fail to reject randomness. There is not sufficient evidence to warrant rejection of the claim that odd and even digits occur in random order. The statement from the New York Times appears to be accurate.
7.
$n_1 = 20$; $n_2 = 10$; $G = 16$; Critical values: 9, 20 (Table A-10); Fail to reject randomness. There is not sufficient evidence to reject the claim that the dates before and after July 1 are randomly selected.
8.
The mean is \$5.84 billion. $n_1 = 20$; $n_2 = 15$; $G = 3$; Critical values: 12, 25 (Table A-10); There is sufficient evidence to reject randomness. The amounts of CD sales do not appear to be in a random sequence. The three runs consist of values below the mean at the beginning, with values above the mean in the middle, and values below the mean at the end. This suggests that over time, CD sales have declined.
9.
$n_1 = 27$; $n_2 = 26$; $G = 22$; $\mu_G = 27.49057$; $\sigma_G = 3.603576$; Test statistic: $z = -1.52$; P-value = 0.1276; Critical values: $z = \pm 1.96$; Fail to reject randomness. There is not sufficient evidence to reject randomness. The runs test does not test for disproportionately more occurrences of one of the two categories, so the runs test does not suggest that either conference is superior.
$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(27)(26)}{27 + 26} + 1 = 27.490567$
$\sigma_G = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{(2 \cdot 27 \cdot 26)(2 \cdot 27 \cdot 26 - 27 - 26)}{(27 + 26)^2 (27 + 26 - 1)}} = 3.603576$
$z = \frac{G - \mu_G}{\sigma_G} = \frac{22 - 27.490567}{3.603576} = -1.52$
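The large-sample runs test calculation shown for Exercise 9 can be checked with a few lines of code. This is a minimal Python sketch under the formulas above; `runs_test_z` is a hypothetical helper name, not part of Excel or XLSTAT.

```python
import math

def runs_test_z(g, n1, n2):
    """Return (mu_G, sigma_G, z) for the large-sample runs test."""
    mu_g = 2 * n1 * n2 / (n1 + n2) + 1
    sigma_g = math.sqrt(
        (2 * n1 * n2) * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    z = (g - mu_g) / sigma_g
    return mu_g, sigma_g, z

# Exercise 9: n1 = 27, n2 = 26, G = 22
mu_g, sigma_g, z = runs_test_z(22, 27, 26)
print(round(mu_g, 6), round(sigma_g, 6), round(z, 2))
```

A negative $z$ means fewer runs than expected under randomness, and a positive $z$ means more.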
10. $n_1 = 66$; $n_2 = 48$; $G = 67$; $\mu_G = 56.57895$; $\sigma_G = 5.181178$; Test statistic: $z = 2.01$; P-value = 0.0443; Critical values: $z = \pm 1.96$; Reject randomness. There is sufficient evidence to reject randomness. American League teams and National League teams do not appear to win the World Series in a random sequence.
$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(66)(48)}{66 + 48} + 1 = 56.57895$
$\sigma_G = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{(2 \cdot 66 \cdot 48)(2 \cdot 66 \cdot 48 - 66 - 48)}{(66 + 48)^2 (66 + 48 - 1)}} = 5.181178$
$z = \frac{G - \mu_G}{\sigma_G} = \frac{67 - 56.57895}{5.181178} = 2.01$
11. The median is 3291.0. $n_1 = 27$; $n_2 = 27$; $G = 2$; $\mu_G = 28$; $\sigma_G = 3.639407$; Test statistic: $z = -7.14$; P-value = 0.0000; Critical values: $z = \pm 1.96$; There is sufficient evidence to reject randomness. All of the values below the median occur at the beginning and all of the values above the median occur at the end, so there appears to be an upward trend, and the stock market appears to be a profitable investment for the long term.
$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(27)(27)}{27 + 27} + 1 = 28$
$\sigma_G = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{(2 \cdot 27 \cdot 27)(2 \cdot 27 \cdot 27 - 27 - 27)}{(27 + 27)^2 (27 + 27 - 1)}} = 3.639407$
$z = \frac{G - \mu_G}{\sigma_G} = \frac{2 - 28}{3.639407} = -7.14$
12. The mean is 7.966. $n_1 = 42$; $n_2 = 45$; $G = 40$; $\mu_G = 44.44828$; $\sigma_G = 4.630918$; Test statistic: $z = -0.96$; P-value = 0.3368; Critical values: $z = \pm 1.96$; There is not sufficient evidence to reject randomness. The post positions of winners appear to be random. The most frequent winning post positions would be of more interest to bettors.
$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(42)(45)}{42 + 45} + 1 = 44.44828$
$\sigma_G = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{(2 \cdot 42 \cdot 45)(2 \cdot 42 \cdot 45 - 42 - 45)}{(42 + 45)^2 (42 + 45 - 1)}} = 4.630918$
$z = \frac{G - \mu_G}{\sigma_G} = \frac{40 - 44.44828}{4.630918} = -0.96$
13. a. List of sequences not provided.
b. The 84 sequences yield these results: 2 sequences have 2 runs, 7 sequences have 3 runs, 20 sequences have 4 runs, 25 sequences have 5 runs, 20 sequences have 6 runs, and 10 sequences have 7 runs.
c. With P(2 runs) = 2/84, P(3 runs) = 7/84, P(4 runs) = 20/84, P(5 runs) = 25/84, P(6 runs) = 20/84, and P(7 runs) = 10/84, each of the $G$ values of 3, 4, 5, 6, 7 can easily occur by chance, whereas $G = 2$ is unlikely because P(2 runs) is less than 0.025. The lower critical value of $G$ is therefore 2, and there is no upper critical value that can be equaled or exceeded.
d. The critical value of $G = 2$ agrees with Table A-10. The table lists 8 as the upper critical value, but it is impossible to get 8 runs using the given elements.
Quick Quiz
1. The ranks are 3, 8.5, 8.5, 3, 6, 1, 5, 10, 7, 3. The rank for 1.1 is found using $\frac{2 + 3 + 4}{3} = 3$ and the rank for 1.7 is found using $\frac{8 + 9}{2} = 8.5$.
2. The efficiency rating of 0.91 indicates that with all other factors being the same, rank correlation requires 100 pairs of sample observations to achieve the same results as 91 pairs of observations with the parametric test for linear correlation, assuming that the stricter requirements for using linear correlation are met.
3. a. distribution-free test
b. The term “distribution-free test” suggests correctly that the test does not require that a population must have a particular distribution, such as a normal distribution. The term “nonparametric test” incorrectly suggests that the test is not based on a parameter, but some nonparametric tests are based on the median, which is a parameter; the term “distribution-free test” is better because it does not make that incorrect suggestion.
4.
$r_s = 1.000$; Critical values: $\pm 0.700$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to conclude that there is a correlation between the time of travel and the distance traveled.
5.
Rank correlation can be used in a wider variety of circumstances than linear correlation. Rank correlation does not require a normal distribution for any population. Rank correlation can be used to detect some (not all) relationships that are not linear.
6. The sign test can be used to test claims involving matched pairs of sample data, it can be used to test claims involving nominal data with two categories, and it can be used to test claims about the median of a single population.
7. Because there are only two runs, all of the values below the mean occur at the beginning and all of the values above the mean occur at the end, or vice versa. This indicates the presence of an upward (or downward) trend.
8. Because the sign test uses only signs of differences while the Wilcoxon signed-ranks test uses ranks of the differences, the Wilcoxon signed-ranks test uses more information about the data and tends to yield conclusions that better reflect the true nature of the data.
9. The Wilcoxon signed-ranks test is used to test a claim that a population of matched pairs has the property that they have differences with a median equal to zero or to test a claim that a single population of individual values has a median equal to some claimed value, whereas the Wilcoxon rank-sum test is used to test the null hypothesis that two independent samples are from populations having equal medians.
10. One-way analysis of variance can be used instead of the Kruskal-Wallis test. Like many other nonparametric tests, the Kruskal-Wallis test has no requirement that the populations have a normal distribution or any other particular distribution.
Review Exercises
1. The test statistic of $x = 0$ is less than or equal to the critical value of 0 (from Table A-7). There is sufficient evidence to warrant rejection of the claim that for the population of freshman male college students, there is not a significant difference between the weights in September and the weights in the following April. Based on the given data, there does appear to be a significant difference. The test does not address the specific weight gain of 15 lb.
XLSTAT Output for Exercise 1 Sign test / Two-tailed test: N+ 0 Expected value 3.500 Variance (N+) 1.750 p-value (Two-tailed) 0.016 alpha 0.050
XLSTAT Output for Exercise 2 Wilcoxon signed-rank test / Two-tailed test: V 0 V (standardized) -2.530 Expected value 14.000 Variance (V) 30.625 p-value (Two-tailed) 0.011 alpha 0.050
2.
The test statistic $T = 0$ is less than or equal to the critical value of 2. There is sufficient evidence to warrant rejection of the claim that the median of the differences is equal to 0. There does appear to be a difference. The test does not address the specific weight gain of 15 lb.
3.
$r_s = 0.983$; Critical values: $r_s = \pm 0.700$; Reject the null hypothesis of $\rho_s = 0$ (no correlation). There is sufficient evidence to support the claim of a correlation between the September weights and the April weights. The presence of a correlation tells us nothing about the belief that college students gain 15 lb (or 6.8 kg) during their freshman year.
September (rank)   2   4   9   8     1   6   4     4     7
April (rank)       2   3   9   7.5   1   6   4.5   4.5   7.5
XLSTAT Output
Correlation matrix (Spearman):
Variables   September   April
September   1           0.983
April       0.983       1
p-values (Spearman):
Variables   September   April
September   5.51E-06    0.000
April       0.000       5.51E-06
Values in bold are different from 0 with a significance level alpha=0.05
4.
Test statistic: $H = 0.465$; (Excel: P-value = 0.766); Critical value: $\chi^2 = 5.991$; Fail to reject the null hypothesis of equal medians. There is not sufficient evidence to warrant rejection of the claim that the lengths of stay at the three hospitals have the same median.
$H = \frac{12}{N(N+1)}\left(\frac{R_1^2}{n_1} + \frac{R_2^2}{n_2} + \frac{R_3^2}{n_3}\right) - 3(N+1) = \frac{12}{39(39+1)}\left(\frac{254.5^2}{13} + \frac{282^2}{13} + \frac{243.5^2}{13}\right) - 3(39+1) = 0.465$
XLSTAT Output Kruskal-Wallis test / Two-tailed test: K (Observed value) 0.533 K (Critical value) 9.210 DF 2 p-value (one-tailed) 0.766 alpha 0.010
5.
Test statistic: $z = \frac{(x + 0.5) - \frac{n}{2}}{\frac{\sqrt{n}}{2}} = \frac{(48 + 0.5) - \frac{114}{2}}{\frac{\sqrt{114}}{2}} = -1.59$; P-value = 0.1113; The P-value is not small, so fail to reject the null hypothesis of $p = 0.5$. There is not sufficient evidence to warrant rejection of the claim that in each World Series, the American League team has a 0.5 probability of winning.
6. The test statistic of $x = 3$ is not less than or equal to the critical value of 1 (from Table A-7). There is not sufficient evidence to warrant rejection of the claim that the sample is from a population with a median equal to 15 minutes.
XLSTAT Output for Exercise 6 Sign test / Two-tailed test: N+ 3 Expected value 5.000 Variance (N+) 2.500 p-value (Two-tailed) 0.344 alpha 0.050
XLSTAT Output for Exercise 7 Wilcoxon signed-rank test / Two-tailed test: V 15.500 V (standardized) -1.226 Expected value 27.500 Variance (V) 95.875 p-value (Two-tailed) 0.220 alpha 0.050
7.
The test statistic of $T = 15.5$ is not less than or equal to the critical value of 8. There is not sufficient evidence to warrant rejection of the claim that the sample is from a population with a median equal to 15 minutes.
8.
$n_1 = 14$; $n_2 = 16$; $G = 17$; Critical values: 10, 22 (Table A-10); Fail to reject randomness. There is not sufficient evidence to warrant rejection of the claim that odd and even digits occur in random order. The lottery appears to be working as it should.
9.
$R_1 = 204.5$; $R_2 = 230.5$; $\mu_R = 255$; $\sigma_R = 22.58318$; Test statistic: $z = -2.24$; Excel: P-value = 0.025; Critical values: $z = \pm 1.96$; Reject the null hypothesis that the populations have the same median. There is sufficient evidence to warrant rejection of the claim that the recent eruptions and past eruptions have the same median time interval between eruptions. The conclusion does change with a 0.01 significance level.
$\mu_R = \frac{n_1(n_1 + n_2 + 1)}{2} = \frac{17(17 + 12 + 1)}{2} = 255$
$\sigma_R = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}} = \sqrt{\frac{(17)(12)(17 + 12 + 1)}{12}} = 22.58318$
$z = \frac{R - \mu_R}{\sigma_R} = \frac{204.5 - 255}{22.58318} = -2.24$
XLSTAT Output Mann-Whitney test / Two-tailed test: U 51.500 U (standardized) 0.000 Expected value 102.000 Variance (U) 509.372 p-value (Two-tailed) 0.024 alpha 0.050
10. $r_s = 0.714$. Critical values: $r_s = \frac{\pm z}{\sqrt{n - 1}} = \frac{\pm 1.96}{\sqrt{8 - 1}} = \pm 0.741$; Fail to reject the null hypothesis of $\rho_s = 0$ (no correlation). There is not sufficient evidence to support the claim that there is a correlation between the student ranks and the magazine ranks. When ranking colleges, students and the magazine do not appear to agree.
XLSTAT Output
Correlation matrix (Spearman):
Variables      Student   World Report
Student        1         0.714
World Report   0.714     1
p-values (Spearman):
Variables      Student    World Report
Student        4.96E-05   0.058
World Report   0.058      4.96E-05
Values in bold are different from 0 with a significance level alpha=0.05
Cumulative Review Exercises
1.
The mean is $\bar{x} = \frac{4.488 + 4.781 + 4.697 + 4.54 + 4.458 + 4.524 + 4.395 + 4.513 + 4.512 + 4.666}{10} = 4.5574$ g.
The median is $Q_2 = \frac{4.513 + 4.524}{2} = 4.5185$ g.
The range is $4.781 - 4.395 = 0.3860$ g.
The standard deviation is $s = \sqrt{\frac{(4.488 - 4.5574)^2 + \cdots + (4.666 - 4.5574)^2}{10 - 1}} = 0.1192$ g.
The variance is $s^2 = (0.1192 \text{ g})^2 = 0.0142 \text{ g}^2$.
XLSTAT Output
Statistic                  Weight
Nbr. of observations       10
Range                      0.386
Median                     4.519
Mean                       4.557
Variance (n-1)             0.014
Standard deviation (n-1)   0.119
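The descriptive statistics in Exercise 1 can be reproduced directly from the ten listed weights. The following is a minimal Python sketch using the standard library `statistics` module (sample variance and standard deviation use the $n - 1$ divisor, matching the XLSTAT output); the variable names are chosen here for illustration.

```python
import statistics

# The ten weights (g) listed in Exercise 1
weights = [4.488, 4.781, 4.697, 4.54, 4.458,
           4.524, 4.395, 4.513, 4.512, 4.666]

mean_w = statistics.mean(weights)
median_w = statistics.median(weights)        # average of the two middle values
range_w = max(weights) - min(weights)
stdev_w = statistics.stdev(weights)          # sample standard deviation (n - 1)
variance_w = statistics.variance(weights)    # sample variance (n - 1)

print(round(mean_w, 4), round(median_w, 4), round(range_w, 3),
      round(stdev_w, 4), round(variance_w, 4))
```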
2.
The normal quantile plot shows that the points approximate a straight-line pattern, so the weights do appear to be from a population having a normal distribution.
3.
$H_0$: $\mu = 4.5333$ g; $H_1$: $\mu \ne 4.5333$ g; Test statistic: $t = \frac{\bar{x} - \mu}{s/\sqrt{n}} = \frac{4.5574 - 4.5333}{0.1192/\sqrt{10}} = 0.640$; Critical values: $t = \pm 2.262$; P-value = 0.5384 (Table: P-value > 0.20); Fail to reject $H_0$. There is not sufficient evidence to warrant rejection of the claim that the sample is from a population with mean equal to 4.5333 g. The claim of 340 g appears to be valid.
XLSTAT Output for Exercise 3 One-sample t-test / Two-tailed test: Difference 0.024 t (Observed value) 0.640 |t| (Critical value) 2.262 DF 9 p-value (Two-tailed) 0.538 alpha 0.050
XLSTAT Output for Exercise 4 Sign test / Two-tailed test: N+ 4 Expected value 5.000 Variance (N+) 2.500 p-value (Two-tailed) 0.754 alpha 0.050
4. The test statistic of $x = 4$ is not less than or equal to the critical value of 1 (from Table A-7). There is not sufficient evidence to warrant rejection of the claim that the sample of weights is from a population with a median of 4.5333 g.
5. The test statistic of $T = 27$ is not less than or equal to the critical value of $T = 8$. Fail to reject the null hypothesis that the population has a median of 4.5333 g.
XLSTAT Output for Exercise 5 Wilcoxon signed-rank test / Two-tailed test: V 27 Expected value 27.500 Variance (V) 96.250 p-value (Two-tailed) 1.000 alpha 0.050
XLSTAT Output for Exercise 6 95% confidence interval on the mean: ( 4.472, 4.643 )
6. 95% CI: $\bar{x} \pm t_{\alpha/2} \frac{s}{\sqrt{n}} = 4.5574 \pm 2.262 \cdot \frac{0.1192}{\sqrt{10}}$, so 4.4722 g $< \mu <$ 4.6426 g (df $= 10 - 1 = 9$); Because the confidence interval contains the claimed mean of 4.5333 g, there is not sufficient evidence to warrant rejection of the claim that the population mean is equal to 4.5333 g.
7. Answers vary, but here is a typical answer: 4.4924 g $< \mu <$ 4.6293 g. Because the confidence interval contains the claimed mean of 4.5333 g, there is not sufficient evidence to warrant rejection of the claim that the population mean is equal to 4.5333 g.
8. The sample mean is 55.15 years. $n_1 = 22$; $n_2 = 17$; $G = 17$; $\mu_G = 20.1794$; $\sigma_G = 3.029127$; Test statistic: $z = -1.05$; Excel: P-value = 0.2939; Critical values: $z = \pm 1.96$; Fail to reject the null hypothesis of randomness. There is not sufficient evidence to warrant rejection of the claim that the sequence of ages is random relative to values above and below the mean. The results do not suggest that there is an upward trend or a downward trend.

$\mu_G = \frac{2 n_1 n_2}{n_1 + n_2} + 1 = \frac{2(22)(17)}{22 + 17} + 1 = 20.1794$

$\sigma_G = \sqrt{\frac{(2 n_1 n_2)(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}} = \sqrt{\frac{(2 \cdot 22 \cdot 17)(2 \cdot 22 \cdot 17 - 22 - 17)}{(22 + 17)^2 (22 + 17 - 1)}} = 3.029127$

$z = \frac{G - \mu_G}{\sigma_G} = \frac{17 - 20.1794}{3.029127} = -1.05$
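The runs test arithmetic in Exercise 8 can be reproduced as follows. This is a sketch under the large-sample normal approximation used in the solution; the variable names are illustrative, and the two-tailed P-value is computed from the standard normal distribution in Python's `statistics.NormalDist`.

```python
import math
from statistics import NormalDist

# Counts from Exercise 8: values above/below the mean and the number of runs
n1, n2, G = 22, 17, 17

mu_G = 2 * n1 * n2 / (n1 + n2) + 1
sigma_G = math.sqrt(
    (2 * n1 * n2) * (2 * n1 * n2 - n1 - n2)
    / ((n1 + n2) ** 2 * (n1 + n2 - 1))
)
z = (G - mu_G) / sigma_G

# Two-tailed P-value from the standard normal distribution
p_value = 2 * NormalDist().cdf(-abs(z))

print(mu_G, sigma_G, z, p_value)
```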
9. $n = \frac{[z_{\alpha/2}]^2\,\hat{p}\hat{q}}{E^2} = \frac{[1.96]^2 (0.25)}{0.02^2} = 2401$
fx | =NORM.S.INV(1-0.05/2)^2*0.5*0.5/0.02^2

10. There must be an error, because the rates of 13.7% and 10.6% are not possible with samples of size 100.
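The sample size calculation in Exercise 9 can be sketched in Python as well. The values of alpha and E are taken from the worked equation; the names `alpha`, `E`, `p_hat`, and `q_hat` are illustrative, and rounding up to the next whole number follows the usual convention for sample sizes.

```python
import math
from statistics import NormalDist

alpha, E = 0.05, 0.02
p_hat = q_hat = 0.5  # no prior estimate, so p-hat * q-hat = 0.25

z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
n = math.ceil(z ** 2 * p_hat * q_hat / E ** 2)

print(n)
```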
Chapter 14: Statistical Process Control
Section 14-1: Control Charts for Variation and Mean
1. a. A run chart is a sequential plot of individual data values over time. It is used to monitor process data for any patterns of changes over time.
b. An R chart is a plot of sample ranges. It is used to monitor the variation in a process.
c. An $\bar{x}$ chart is a plot of sample means. It is used to monitor the center in a process.

2. The control limits are boundaries used to separate and identify any points considered to be significantly low or significantly high.

3. A process is statistically stable (or within statistical control) if it has only natural variation, with no patterns, cycles, or significantly low or significantly high points.

4. The mean is out of statistical control. The elevations have decreased substantially in recent years, so Lake Mead is becoming shallower. The decreases are significant, and they are having a dramatic impact on the affected populations. The variations in Lake Mead elevations have become more stable in recent years, so the elevations do not vary as much as they once did, but the variation is also out of statistical control.

5. $\bar{\bar{x}} = 267.11$ lb; $\bar{R} = 54.96$ lb; $n = 7$
For the R chart: LCL $= D_3\bar{R} = 0.076(54.96) = 4.18$ lb and UCL $= D_4\bar{R} = 1.924(54.96) = 105.74$ lb.
For the $\bar{x}$ chart: LCL $= \bar{\bar{x}} - A_2\bar{R} = 267.11 - 0.419(54.96) = 244.08$ lb and UCL $= \bar{\bar{x}} + A_2\bar{R} = 267.11 + 0.419(54.96) = 290.14$ lb.
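The control limits in Exercise 5 follow directly from the control chart constants. The sketch below takes the constants for n = 7 (D3 = 0.076, D4 = 1.924, A2 = 0.419) from the worked solution; the function names are hypothetical helpers, not part of Excel or XLSTAT.

```python
def r_chart_limits(r_bar, d3, d4):
    """Return (LCL, UCL) for an R chart."""
    return d3 * r_bar, d4 * r_bar


def xbar_chart_limits(x_bar_bar, r_bar, a2):
    """Return (LCL, UCL) for an x-bar chart based on the mean range."""
    return x_bar_bar - a2 * r_bar, x_bar_bar + a2 * r_bar


# Values from Exercise 5 (n = 7)
r_lcl, r_ucl = r_chart_limits(54.96, 0.076, 1.924)
x_lcl, x_ucl = xbar_chart_limits(267.11, 54.96, 0.419)

print(round(r_lcl, 2), round(r_ucl, 2), round(x_lcl, 2), round(x_ucl, 2))
```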
6. The run chart does not reveal any patterns that would indicate manufacturing problems needing correction.

7. The R chart does not meet any of the three out-of-control criteria, so the variation of the process appears to be within statistical control.

8. The $\bar{x}$ chart does not meet any of the out-of-control criteria, so the mean of the process appears to be within statistical control.

9. $\bar{\bar{x}} = 5.6955$ g; $\bar{R} = 0.2054$ g; $n = 5$
For the R chart: LCL $= D_3\bar{R} = 0.000(0.2054) = 0.0000$ g; UCL $= D_4\bar{R} = 2.114(0.2054) = 0.4342$ g
For the $\bar{x}$ chart: LCL $= \bar{\bar{x}} - A_2\bar{R} = 5.6955 - 0.577(0.2054) = 5.5770$ g; UCL $= \bar{\bar{x}} + A_2\bar{R} = 5.6955 + 0.577(0.2054) = 5.8140$ g

10. The R chart has eight consecutive points below the centerline, there appears to be an upward trend, and there are points that lie beyond the upper control limit, so the variation of the process appears to be out of statistical control.

11. There are points lying beyond the upper control limit, so the process mean appears to be out of statistical control.

12. There is a pattern of increasing variation, which is quite bad. The process should be fixed.

13. $\bar{s} = 0.1400$°C; $n = 10$; LCL $= B_3\bar{s} = 0.284(0.1400) = 0.0398$°C; UCL $= B_4\bar{s} = 1.716(0.1400) = 0.2402$°C. Except for the values on the vertical scale, the s chart is nearly identical to the R chart shown in Example 3.

14. The two $\bar{x}$ charts are nearly identical. LCL $= \bar{\bar{x}} - A_3\bar{s} = 14.0802 - 0.975(0.1400) = 13.9437$°C; UCL $= \bar{\bar{x}} + A_3\bar{s} = 14.0802 + 0.975(0.1400) = 14.2167$°C
Section 14-2: Control Charts for Attributes
1. No, the process appears to be out of statistical control. There is a downward trend and there are at least eight consecutive points all lying above the centerline. Because the proportions of defects are decreasing, the manufacturing process is not deteriorating; it is improving.

2. $\bar{p}$ is the pooled estimate of the proportion of defective items. It is obtained by finding the total number of defects in all samples combined and dividing that total by the total number of items sampled. UCL is the upper control limit, and LCL is the lower control limit.

3. Because the value of $-0.00325$ is negative and the actual proportion of defects cannot be less than 0, we should replace that value with 0.

4. No, it is very possible that the proportions of defects in repeated samplings behave in a way that makes the process appear to be within statistical control, but the actual proportions of defects could be very high (such as 90%), so that almost all of the euro coins fail to meet the manufacturing specifications. Upper and lower control limits of a control chart for the proportion of defects are based on the actual behavior of the process, not the desired behavior according to specifications.

5. The process appears to be within statistical control. (Considering a shift up, note that the first and last points are about the same.)
$\bar{p} = \frac{32 + 21 + 25 + 19 + 35 + 34 + 27 + 30 + 26 + 33}{(10{,}000)(10)} = 0.00282$; $\bar{q} = 1 - 0.00282 = 0.99718$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.00282 - 3\sqrt{\frac{(0.00282)(0.99718)}{10{,}000}} = 0.001229$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.00282 + 3\sqrt{\frac{(0.00282)(0.99718)}{10{,}000}} = 0.004411$

6. The process appears to be within statistical control. The control chart is the same as the control chart from Exercise 5, except for the values identifying the centerline and control limits (UCL = 0.4170, centerline is at $\bar{p} = 0.282$, and LCL = 0.1470). In this exercise, the proportions of defects are very high. Even though the process is within statistical control, this manufacturing process is yielding far too many defective euro coins, so corrective action should be taken to lower the defect rate.
$\bar{p} = \frac{32 + 21 + 25 + 19 + 35 + 34 + 27 + 30 + 26 + 33}{(100)(10)} = 0.282$; $\bar{q} = 1 - 0.282 = 0.718$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.282 - 3\sqrt{\frac{(0.282)(0.718)}{100}} = 0.1470$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.282 + 3\sqrt{\frac{(0.282)(0.718)}{100}} = 0.4170$

7. The process appears to be out of statistical control because of a downward trend, but the number of defects appears to be decreasing, so the process is improving. Causes for the declining number of defects should be identified so that they can be continued.
$\bar{p} = \frac{16 + 18 + 13 + 9 + 10 + 8 + 6 + 5 + 5 + 3}{(1000)(10)} = 0.0093$; $\bar{q} = 1 - 0.0093 = 0.9907$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.0093 - 3\sqrt{\frac{(0.0093)(0.9907)}{1000}} = 0.00019$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.0093 + 3\sqrt{\frac{(0.0093)(0.9907)}{1000}} = 0.01841$
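The p chart limits in Exercises 5 through 7 all use the same recipe, which can be sketched as a small helper. The function name `p_chart_limits` is hypothetical; the data are the defect counts from Exercise 7, with samples of size 1000. A negative lower limit is replaced by 0, as explained in Exercise 3.

```python
import math


def p_chart_limits(defects, n):
    """Return (p_bar, LCL, UCL) for a p chart with samples of size n."""
    p_bar = sum(defects) / (n * len(defects))
    margin = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    # A proportion cannot be negative, so clamp the lower limit at 0
    return p_bar, max(0.0, p_bar - margin), p_bar + margin


# Defect counts from Exercise 7
p_bar, lcl, ucl = p_chart_limits([16, 18, 13, 9, 10, 8, 6, 5, 5, 3], 1000)

print(round(p_bar, 4), round(lcl, 5), round(ucl, 5))
```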
8. There is a point that lies beyond the upper control limit, so the process is out of statistical control. The cause of the high number of defects in the 11th sample should be identified and corrected.
$\bar{p} = \frac{8 + 5 + 6 + 4 + 9 + 3 + 12 + 7 + 8 + 5 + 22 + 4 + 9 + 10 + 11 + 8 + 7 + 6 + 8 + 5}{(1000)(20)} = 0.00785$; $\bar{q} = 1 - 0.00785 = 0.99215$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.00785 - 3\sqrt{\frac{(0.00785)(0.99215)}{1000}} = -0.000522$, so LCL = 0
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.00785 + 3\sqrt{\frac{(0.00785)(0.99215)}{1000}} = 0.01622$

9. The process is out of statistical control because there are points lying beyond the upper control limit and there are points lying beyond the lower control limit. The graph shows a "sawtooth" pattern because presidential elections held every four years tend to attract many more voters than the national elections that do not include a presidential election.
$\bar{p} = \frac{549 + 725 + 560 + 780 + 576 + 660 + 516 + 638 + 453 + 688 + 475 + 644 + 486 + 644 + 425 + 654}{(1000)(16)} = 0.5921$; $\bar{q} = 1 - 0.5921 = 0.4079$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.5921 - 3\sqrt{\frac{(0.5921)(0.4079)}{1000}} = 0.5454$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.5921 + 3\sqrt{\frac{(0.5921)(0.4079)}{1000}} = 0.6387$
10. The process appears to be out of statistical control. There is a point lying beyond the upper control limit, and there is a pattern of increasing proportions of defects. The manufacturing process requires corrective action.
$\bar{p} = \frac{3 + 4 + 2 + 5 + 3 + 6 + 8 + 9 + 12 + 14 + 17 + 20}{(250)(12)} = 0.03433$; $\bar{q} = 1 - 0.03433 = 0.96567$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.03433 - 3\sqrt{\frac{(0.03433)(0.96567)}{250}} = -0.000216$, so LCL = 0
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.03433 + 3\sqrt{\frac{(0.03433)(0.96567)}{250}} = 0.06888$

11. Based on the control chart for p, it appears that the process is within statistical control. However, the numbers of defects in batches of 100 are significantly high. Although the process is within statistical control, immediate corrective action should be taken to reduce the large numbers of defects.
$\bar{p} = \frac{9 + 6 + 12 + 14 + 8 + 7 + 11 + 11 + 7 + 8 + 9 + 12 + 10 + 5 + 8}{(100)(15)} = 0.0913$; $\bar{q} = 1 - 0.0913 = 0.9087$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.0913 - 3\sqrt{\frac{(0.0913)(0.9087)}{100}} = 0.0049$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.0913 + 3\sqrt{\frac{(0.0913)(0.9087)}{100}} = 0.1778$

12. Because there is a point lying beyond the upper control limit, the process is out of control. Corrective action should be taken to identify the cause of the unusually high number of defects, especially because it occurred on the last day of production.
$\bar{p} = \frac{4 + 3 + 5 + 8 + 12 + 7 + 11 + 9 + 9 + 7 + 6 + 19}{(300)(12)} = 0.02778$; $\bar{q} = 1 - 0.02778 = 0.97222$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.02778 - 3\sqrt{\frac{(0.02778)(0.97222)}{300}} = -0.000685$, so LCL = 0
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.02778 + 3\sqrt{\frac{(0.02778)(0.97222)}{300}} = 0.05624$

13. Except for a different vertical scale, the basic control chart is identical to the one given for Example 1.
$\bar{p} = \frac{2 + 0 + 1 + 3 + 1 + 2 + 2 + 4 + 3 + 5 + 12 + 7}{(100)(12)} = 0.035$; $\bar{q} = 1 - 0.035 = 0.965$; $n\bar{p} = 100(0.035) = 3.5$
LCL $= n\bar{p} - 3\sqrt{n\bar{p}\bar{q}} = 3.5 - 3\sqrt{100(0.035)(0.965)} = -2.013$, so LCL = 0
UCL $= n\bar{p} + 3\sqrt{n\bar{p}\bar{q}} = 3.5 + 3\sqrt{100(0.035)(0.965)} = 9.01$
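The np chart in Exercise 13 works with counts of defects rather than proportions. The following sketch recomputes its limits from the defect counts listed in the solution; the variable names are illustrative.

```python
import math

# Defect counts from Exercise 13, each from a sample of n = 100
defects = [2, 0, 1, 3, 1, 2, 2, 4, 3, 5, 12, 7]
n = 100

p_bar = sum(defects) / (n * len(defects))
center = n * p_bar                                # centerline of the np chart
margin = 3 * math.sqrt(n * p_bar * (1 - p_bar))

raw_lcl = center - margin
lcl = max(0.0, raw_lcl)                           # a count cannot be negative
ucl = center + margin

print(round(p_bar, 3), round(raw_lcl, 3), lcl, round(ucl, 2))
```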
Quick Quiz
1. Process data are data arranged according to some time sequence. They are measurements of a characteristic of goods or services that result from some combination of equipment, people, materials, methods, and conditions.

2. Random variation is due to chance, but assignable variation results from causes that can be identified, such as defective machinery or untrained employees.

3. There is a pattern, trend, or cycle that is obviously not random. There is a point lying outside the region between the upper and lower control limits. There are at least eight consecutive points all above or all below the centerline.

4. An R chart uses ranges to monitor variation, but an $\bar{x}$ chart uses sample means to monitor the center (mean) of a process.

5. No, the R chart has at least eight consecutive points all lying below the centerline, there are at least eight consecutive points all lying above the centerline, there are points lying beyond the upper and lower control limits, and there is a pattern showing that the ranges have jumped in value for the more recent samples. What a mess!

6. $\bar{R} = 67.0$ ft; In general, a value of $\bar{R}$ is found by first finding the range for the values within each individual subgroup; the mean of those ranges is the value of $\bar{R}$.

7. No, the $\bar{x}$ chart has a point lying beyond the upper control limit, and there are at least eight consecutive points lying below the centerline.

8. $\bar{\bar{x}} = 2.24$ ft; In general, a value of $\bar{\bar{x}}$ is found by first finding the mean of the values within each individual subgroup; the mean of those subgroup means is the value of $\bar{\bar{x}}$.

9. No, the control charts can be used to determine whether the mean and variation are within statistical control, but they do not reveal anything about specifications or requirements.

10. Because there is a downward trend, the process is out of statistical control, but the rate of defects is decreasing, so we should investigate and identify the cause of that trend so that it can be continued.

Review Exercises
1. $\bar{\bar{x}} = 3157$ kWh; $\bar{R} = 1729$ kWh; $n = 6$
For the R chart: LCL $= D_3\bar{R} = 0.000(1729) = 0$ kWh and UCL $= D_4\bar{R} = 2.004(1729) = 3465$ kWh
For the $\bar{x}$ chart: LCL $= \bar{\bar{x}} - A_2\bar{R} = 3157 - 0.483(1729) = 2322$ kWh and UCL $= \bar{\bar{x}} + A_2\bar{R} = 3157 + 0.483(1729) = 3992$ kWh
2. The process variation is within statistical control.

3. There appears to be a shift up in the mean values, so the process mean is out of statistical control.

4. There appears to be a slight upward trend. There is 1 point that appears to be exceptionally low. (The author's power company made an error in recording and reporting the energy consumption for that time period.) The process is not within statistical control.

5. The process appears to be within statistical control. However, using the manager's definition of a defect, the rate of defects is far too high, so corrective action should be taken to lower the service time.
$\bar{p} = \frac{19 + 21 + 21 + 17 + 15 + 12 + 24 + 26 + 28 + 21 + 22 + 23 + 21 + 20 + 25}{(50)(15)} = 0.42$; $\bar{q} = 1 - 0.42 = 0.58$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.42 - 3\sqrt{\frac{(0.42)(0.58)}{50}} = 0.2106$
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.42 + 3\sqrt{\frac{(0.42)(0.58)}{50}} = 0.6294$
Cumulative Review Exercises
1. a. 0.08
b. 0.17
c. $0.02 \times 0.83 = 0.0166$
d. $(0.90)^3 = 0.729$
e. First try to get the case dismissed, then try to arrange for a plea deal. Given the very high conviction rate from trials, a trial should be avoided.

2. 95% CI: $\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.47 \pm 1.96\sqrt{\frac{(0.47)(0.53)}{1038}}$, which gives $0.440 < p < 0.501$; Because the confidence interval includes 0.5, we cannot conclude that fewer than half of all adults say that nuclear plants are safe.

XLSTAT Output for Exercise 2
95% confidence interval on the proportion (Wald): ( 0.440, 0.500 )

3. $H_0$: $p = 0.5$; $H_1$: $p < 0.5$; Test statistic: $z = \frac{0.47 - 0.5}{\sqrt{\frac{(0.5)(0.5)}{1038}}} = -1.93$ ($-1.92$ if using $0.47(1038) = 488$); P-value $= P(z < -1.92) = 0.0274$ (Excel: 0.0272); Critical value: $z = -1.645$; Reject $H_0$. There is sufficient evidence to support the claim that fewer than half of all Americans say that nuclear plants are safe. This conclusion does contradict the conclusion from the preceding exercise. Because this hypothesis test is a left-tailed test with a 0.05 level of significance, it is equivalent to a confidence interval with a 90% level of confidence, but the preceding exercise used a 95% level of confidence.

XLSTAT Output for Exercise 3
z-test for one proportion / Lower-tailed test:
Difference              -0.030
z (Observed value)      -1.933
z (Critical value)      -1.645
p-value (one-tailed)    0.027
alpha                   0.050

4. Yes. The vertical scale does not begin with 0, so the difference between 47% and 49% is visually exaggerated to give a false impression.

5. $H_0$: $\rho = 0$; $H_1$: $\rho \ne 0$; $r = -0.419$; P-value = 0.228 (Table: P-value > 0.05); Critical values: $r = \pm 0.632$; Fail to reject $H_0$. There is not sufficient evidence to support the claim that there is a linear correlation between temperature and carbon dioxide. Even if there had been a linear correlation, that would not be a basis for concluding that carbon dioxide causes a rise in temperatures.

XLSTAT Output
Correlation matrix (Pearson):
Variables      CO2       Temperature
CO2            1         -0.419
Temperature    -0.419    1

p-values (Pearson):
Variables      CO2       Temperature
CO2            0         0.228
Temperature    0.228     0

Values in bold are different from 0 with a significance level alpha=0.05
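The confidence interval from Exercise 2 and the test statistic from Exercise 3 can be reproduced with the standard normal distribution. This is a sketch that uses the rounded sample proportion 0.47 rather than the count 488, so it corresponds to the XLSTAT values; the variable names are illustrative.

```python
import math
from statistics import NormalDist

p_hat, n, p0 = 0.47, 1038, 0.5
std_normal = NormalDist()

# Exercise 2: 95% Wald confidence interval for p
z_crit = std_normal.inv_cdf(0.975)
margin = z_crit * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

# Exercise 3: left-tailed z test of H0: p = 0.5
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = std_normal.cdf(z)

print(ci, z, p_value)
```

The interval uses a two-sided 95% level while the test is one-sided at the 0.05 level, which is why the two exercises can reach different conclusions.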
6. $\hat{y} = 16.5 - 0.00412x$; From Exercise 5, there is no significant linear correlation, so the best predicted value of the temperature in the year with a CO2 level of 420 ppm is $\bar{y} = 14.878$°C. Because that predicted value is the same for any CO2 level, it is not likely to be accurate.

XLSTAT Output
Model parameters (Temp.):
Source       Value     Standard error    t         Pr > |t|
Intercept    16.473    1.228             13.412    <0.0001
CO2          -0.004    0.003             -1.305    0.228

7. a. $z_{x=60} = \frac{60.0 - 68.6}{2.8} = -3.07$ and $z_{x=80} = \frac{80.0 - 68.6}{2.8} = 4.07$, which have a probability of $0.9999 - 0.0011 = 0.9988$, or 99.88% (Excel: 99.89%) between them.
fx | =NORM.DIST(80.0,68.6,2.8,1)-NORM.DIST(60.0,68.6,2.8,1)
b. $z_{\bar{x}=70} = \frac{70.0 - 68.6}{2.8/\sqrt{4}} = 1.00$, which has a probability of $1 - 0.8413 = 0.1587$ to the right.
fx | =1-NORM.DIST(70.0,68.6,2.8/SQRT(4),1)

8. There is a pattern of an upward trend, so the process is out of statistical control.
$\bar{p} = \frac{3 + 2 + 4 + 6 + 5 + 9 + 7 + 10 + 12 + 15}{1200} = 0.060833$; $\bar{q} = 1 - 0.06083 = 0.93917$
LCL $= \bar{p} - 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.06083 - 3\sqrt{\frac{(0.06083)(0.93917)}{120}} = -0.0046$, so LCL = 0
UCL $= \bar{p} + 3\sqrt{\frac{\bar{p}\bar{q}}{n}} = 0.06083 + 3\sqrt{\frac{(0.06083)(0.93917)}{120}} = 0.1263$

9. $\bar{x} = 7.3$; $Q_2 = 6.5$; $s = 4.2$; These statistics do not convey information about the changing pattern of the data over time.

XLSTAT Output
Statistic                   Defective Systems
Nbr. of observations        10
Median                      6.500
Mean                        7.300
Standard deviation (n-1)    4.165
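The two normal probabilities in Exercise 7 mirror the Excel `NORM.DIST` formulas shown in that solution. A minimal sketch with Python's `statistics.NormalDist`, assuming the same mean 68.6 and standard deviation 2.8; the names `population` and `sample_means` are illustrative.

```python
import math
from statistics import NormalDist

# Part (a): probability that an individual value falls between 60.0 and 80.0
population = NormalDist(mu=68.6, sigma=2.8)
p_between = population.cdf(80.0) - population.cdf(60.0)

# Part (b): probability that the mean of n = 4 values exceeds 70.0
sample_means = NormalDist(mu=68.6, sigma=2.8 / math.sqrt(4))
p_above = 1 - sample_means.cdf(70.0)

print(round(p_between, 4), round(p_above, 4))
```

Part (b) divides the standard deviation by the square root of the sample size because it concerns a sample mean, not an individual value.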
10. $H_0$: $\mu_1 = \mu_2$; $H_1$: $\mu_1 > \mu_2$; population 1 = nonstress, population 2 = stress
Test statistic: $t = 2.879$; P-value = 0.0026 (Table: P-value < 0.005); Critical value: $t = 2.376$ (Table: $t = 2.426$); Reject $H_0$. There is sufficient evidence to support the claim that the mean number of details recalled is lower for the stress group. It appears that "stress decreases the amount recalled," but we should not conclude that stress is the cause of the decrease.
$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} = \frac{(53.3 - 45.3) - 0}{\sqrt{\frac{11.6^2}{40} + \frac{13.2^2}{40}}} = 2.879$
P-value: fx | =T.DIST.RT(2.879, 76.733)
Critical Value: fx | =T.INV(1-0.01, 76.733)
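The test statistic in Exercise 10 and the 76.733 degrees of freedom used in the Excel formulas can be checked with the unpooled two-sample formula. The degrees of freedom below use the Welch-Satterthwaite approximation, which is an assumption about how the 76.733 was obtained, although it reproduces that value. The Python standard library has no t distribution, so the P-value is left to the Excel formulas.

```python
import math

# Summary statistics from Exercise 10
x1, s1, n1 = 53.3, 11.6, 40  # population 1: nonstress
x2, s2, n2 = 45.3, 13.2, 40  # population 2: stress

v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2

# Test statistic for H0: mu1 = mu2 without pooling the variances
t = (x1 - x2) / math.sqrt(v1 + v2)

# Welch-Satterthwaite approximation for the degrees of freedom
df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))

print(round(t, 3), round(df, 3))
```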
Chapter 15: Holistic Statistics
1. Test statistic from parametric t test (Section 9-3): $t = -8.484$, critical values are $t = \pm 1.996$ (Table: $\pm 2.000$), P-value = 0.000 and 95% confidence interval is $-1.05$°F < $\mu_d$ < $-0.65$°F. Nonparametric sign test: Test statistic is $z = -5.86$, critical values are $\pm 1.96$, P-value = 0.000. Nonparametric Wilcoxon signed-ranks test: Test statistic is $z = -5.98$, critical values are $\pm 1.96$, P-value = 0.000. Randomization test with 1000 resamplings yields no results at least as extreme as $\bar{d} = -0.85072$°F. Bootstrap 95% confidence interval: $-1.04$°F < $\mu_d$ < $-0.66$°F (which can vary). All of the results suggest rejection of $H_0$: $\mu_d = 0$°F. There appears to be a significant difference between the 8 AM and 12 AM second-day temperatures. It appears that there is generally a significant difference between body temperatures measured at 8 AM and at 12 AM.

2. Test statistic from parametric t test (Section 9-3): $t = -1.113$, critical values are $t = \pm 1.986$ (Table: $\pm 1.987$), P-value = 0.269 and 95% confidence interval is $-0.25$°F < $\mu_d$ < 0.07°F. Nonparametric sign test: Test statistic is $z = -1.11$, critical values are $\pm 1.96$, P-value = 0.267. Nonparametric Wilcoxon signed-ranks test: Test statistic is $z = -1.06$, critical values are $\pm 1.96$, P-value = 0.288. Randomization test with 1000 resamplings yields 254 (which can vary) results at least as extreme as $\bar{d} = -0.08913$°F. Bootstrap 95% confidence interval: $-0.24$°F < $\mu_d$ < 0.06°F (which can vary). All of the results suggest failure to reject $H_0$: $\mu_d = 0$°F. There does not appear to be a significant difference between body temperatures at 12 AM on the two consecutive days.
3. $r = 0.206$ and critical values are $\pm 0.237$, which suggests no correlation. $r_s = 0.252$ and critical values are $\pm 0.238$, which suggests that there is a correlation. These results conflict, but a scatterplot shows no clear pattern, so conclude that there is not a correlation.

4. $r = 0.261$ and critical values are $\pm 0.205$, which suggests that there is a correlation. $r_s = 0.262$ and critical values are $\pm 0.205$, which suggests that there is a correlation. These results agree, but a scatterplot shows no clear pattern. Even though there appears to be a correlation, it appears that this correlation is very weak.

5. Using $\mu = 98.20$°F we get $P_{99} = 99.64$°F. Using $\mu = 98.6$°F we get $P_{99} = 100.04$°F. The difference is 0.40°F, which is not insignificant, but it probably does not make too much of a difference in practice.

6. Test statistic from parametric t test (Section 8-2): $t = -13.272$, critical values are $t = \pm 1.995$ (Table: $\pm 1.994$), P-value = 0.000 and 95% confidence interval is 97.32°F < $\mu$ < 97.66°F. Nonparametric sign test: Test statistic is $z = -7.33$, critical values are $\pm 1.96$, P-value = 0.000. Nonparametric Wilcoxon signed-ranks test: Test statistic is $z = -7.05$, critical values are $\pm 1.96$, P-value = 0.000. Results can vary, but a randomization test with 1000 resamplings yields no results at least as extreme as $\bar{x} = 97.48857$°F. Bootstrap 95% confidence interval: 97.33°F < $\mu$ < 97.65°F (which can vary). All of the results suggest rejection of $H_0$: $\mu = 98.6$°F. It appears that the mean body temperature at 8 AM is different from 98.6°F.

7. Test statistic from parametric t test (Section 8-2): $t = -7.102$, critical values are $t = \pm 1.986$ (Table: $\pm 1.987$), P-value = 0.000 and 95% confidence interval is 97.99°F < $\mu$ < 98.26°F. Nonparametric sign test: Test statistic is $z = -4.91$, critical values are $\pm 1.96$, P-value = 0.000. Nonparametric Wilcoxon signed-ranks test: Test statistic is $z = -5.77$, critical values are $\pm 1.96$, P-value = 0.000. Results can vary, but a randomization test with 1000 resamplings yields no results at least as extreme as $\bar{x} = 98.12366$°F. Bootstrap 95% confidence interval: 97.99°F < $\mu$ < 98.25°F (which can vary). All of the results suggest rejection of $H_0$: $\mu = 98.6$°F. It appears that the mean body temperature at 12 AM is different from 98.6°F.

8. Test statistic from parametric t test (Section 8-2): $t = -3.865$, critical values are $t = \pm 2.026$, P-value = 0.000 and 95% confidence interval is 97.88°F < $\mu$ < 98.37°F. Nonparametric sign test: Test statistic is $z = -1.64$, critical values are $\pm 1.96$, P-value = 0.100. Nonparametric Wilcoxon signed-ranks test: Test statistic is $z = -3.32$, critical values are $\pm 1.96$, P-value = 0.001. Results can vary, but a randomization test with 1000 resamplings yields no results at least as extreme as $\bar{x} = 98.12632$°F. Bootstrap 95% confidence interval: 97.88°F < $\mu$ < 98.36°F (which can vary). Except for the results from the sign test, all of the results suggest rejection of $H_0$: $\mu = 98.6$°F. It appears that the mean body temperature at 12 AM is different from 98.6°F.

9. Typical results are shown in this chapter in the discussion of simulations. It would be very rare to get a simulated sample with a mean as extreme as 98.2°F, and this shows that the assumption of a mean equal to 98.6°F is probably an incorrect assumption. It appears that the mean body temperature is not 98.6°F.