TEST BANK for Introduction to Probability and Statistics, 3rd Canadian Edition by Mendenhall, Beaver by StudyGuide

Chapter 1—Describing Data with Graphs MULTIPLE CHOICE 1. Which of the following choices best describes the methods dealt with by the branch of statistics called descriptive statistics? a. organizing and summarizing data b. quantifying and summarizing data c. qualifying and quantifying data d. qualifying and organizing data ANS: A BLM: Remember

PTS:

REF: 5

TOP: 1–3

2. How is the relative frequency of a class computed? a. by dividing the frequency of the class by the number of classes b. by dividing the frequency of the class by the class width c. by dividing the frequency of the class by the total number of observations in the data set d. by dividing the frequency of the class by one less than the total number of observations in the data set ANS: C BLM: Remember

PTS:

REF: 14 | 27-29

TOP: 1–3

3. Which of the following is NOT a goal of descriptive statistics? a. summarizing data b. displaying aspects of the collected data c. reporting numerical findings d. estimating characteristics of populations ANS: D BLM: Remember

PTS:

REF: 5

TOP: 1–3

4. You asked ten of your classmates about their weight. On the basis of this information, you estimated that the average weight of all students at your school is 71.8 kilograms. What is this an example of? a. descriptive statistics b. statistical inference c. a sample d. a population ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 5

TOP: 1–3

5. Which is the best type of chart for comparing two sets of qualitative data? a. a line chart b. a pie chart c. a histogram d. a bar chart

ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 20-21

TOP: 1–3

6. What does the sum of the frequencies for all classes always equal? a. the number of classes b. the class width c. the total number of observations in the data set d. one ANS: C BLM: Remember

PTS:

REF: 14

TOP: 1–3

7. What are the two graphical techniques most commonly used to present qualitative data? a. bar chart and histogram b. bar chart and pie chart c. line chart and pie chart d. line chart and histogram ANS: B BLM: Remember

PTS:

REF: 20-21

TOP: 1–3

8. What are the characteristics of experimental units generally called? a. data sets b. descriptive statistics c. internal data d. variables ANS: D BLM: Remember

PTS:

REF: 10

TOP: 1–3

9. A politician who is running for office in a province with 3 million registered voters commissions a survey. In the survey, 5500 of the 10,000 registered voters interviewed say they plan to vote for her. Which of the following is the population of interest? a. the 3 million registered voters in the province b. the 10,000 registered voters interviewed c. the 5500 voters interviewed who plan to vote for her d. the 4500 voters interviewed who plan not to vote for her ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 10

TOP: 1–3

10. Which of the following terms refers to the set of all possible observations about a specified characteristic of interest? a. a frame b. a multinomial data set c. an observational study d. a population ANS: D BLM: Remember

PTS:

REF: 10

TOP: 1–3

11. What is a single observation about a specified characteristic of interest called? a. a datum b. an elementary unit c. a sample d. a univariate data set ANS: A BLM: Remember

PTS:

REF: 10

TOP: 1–3

12. Which of the following terms refers to a graphical presentation of data that is displayed in order of descending magnitude? a. a line chart b. a stem-and-leaf plot c. a relative frequency histogram d. a Pareto chart ANS: D BLM: Remember

PTS:

REF: 16

TOP: 1–3

13. A market share of 78.5% can be represented in a pie chart by a slice having a central angle measured in degrees. In this case, what would be the size of the central angle of the market share? a. 39.3 degrees b. 78.5 degrees c. 141.3 degrees d. 282.6 degrees ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 14-16

TOP: 1–3
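The arithmetic behind question 13 can be sketched in Python. The function name `central_angle` is hypothetical and introduced only for illustration; the sketch assumes that a slice's central angle is the same fraction of 360 degrees that the category is of the whole.

```python
def central_angle(percent_share):
    """Return the pie-chart central angle, in degrees, for a percentage share."""
    # Assumption: a full circle is 360 degrees, and a slice takes the same
    # fraction of the circle that its category takes of the total.
    return percent_share / 100 * 360


# Market share from question 13.
angle = central_angle(78.5)
print(round(angle, 1))
```

For a 78.5% share this gives 282.6 degrees, which matches choice d.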

14. Which of the following graphics is the most important and most commonly used graphical presentation of quantitative data? a. the bar chart b. the histogram c. the pie chart d. the dotplot ANS: B BLM: Remember

PTS:

REF: 27-28

TOP: 4–5

15. What is the total area of the six bars in a relative frequency histogram for which the width of each bar is ten units? a. 6 b. 10 c. 16 d. 60 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5
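The reasoning behind question 15 can be checked with a short Python sketch. It assumes, as the answer key implies, that each bar's height is the class's relative frequency, so the total area is the common bar width times the sum of the relative frequencies (which is 1). The six relative frequencies below are made up for illustration, and `total_area` is a hypothetical helper name.

```python
from fractions import Fraction


def total_area(relative_frequencies, class_width):
    """Sum the bar areas, taking each bar as height (relative frequency) x width."""
    return sum(rf * class_width for rf in relative_frequencies)


# Hypothetical relative frequencies for six classes (illustration only).
# Exact fractions are used so that the sum is exactly 1.
rel_freqs = [Fraction(1, 10), Fraction(2, 10), Fraction(3, 10),
             Fraction(2, 10), Fraction(1, 10), Fraction(1, 10)]

print(sum(rel_freqs))             # relative frequencies always sum to 1
print(total_area(rel_freqs, 10))  # area with bars ten units wide
```

Whatever relative frequencies are used, the total area equals the bar width, which is why the answer depends only on the width of each bar.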

16. On what does the total area of the bars in a relative frequency histogram depend? a. the sample size b. the number of bars c. the width of each bar d. the population size

ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

17. In general, how do data distributions of incomes of employees in large firms tend to be shaped? a. They are most often skewed to the right. b. They are most often skewed to the left. c. They are generally symmetric about the mean. d. They are generally symmetric about the median. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 25

TOP: 4–5

18. Which of the following numbers constitutes the sum of the relative frequencies found in a relative frequency distribution for quantitative data? a. 0 b. 1 c. 2 d. 3 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

19. Which of the following best describes the term “relative class frequency”? a. It is the number of observations that fall into a given class in a frequency distribution. b. It is the proportion of all observations that fall into a given class in a frequency distribution. c. It is the difference between the numerical lower and upper limits of a class of quantitative data. d. It is the average number of observations that fall into a given class in a frequency distribution. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 14 | 27-29

TOP: 4–5

20. Suppose you have 180 observations divided into classes. One of the classes has a data class frequency of 36. Which of the following would be its relative class frequency? a. 0.10 b. 0.18 c. 0.20 d. 0.36 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 27-29

TOP: 4–5

21. Suppose you are given a graphical portrayal of a relative frequency distribution of continuous quantitative data with the following characteristics. The lower and upper limits of the data classes are identified by tick marks on a horizontal axis. The corresponding relative class frequencies are represented by the areas of vertical rectangles connected to each other and positioned on top of each of these class intervals. What is such a graphical presentation called? a. a stacked bar chart b. a Pareto chart c. a pictogram d. a relative frequency histogram ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

22. A histogram is a graphical device. What kind of data is it commonly used to display? a. time series data b. quantitative data c. qualitative data d. categorical data ANS: B BLM: Remember

PTS:

REF: 27

TOP: 4–5

23. How many classes should generally be used when constructing a relative frequency histogram? a. between 1 and 5 classes b. between 1 and 10 classes c. between 5 and 12 classes d. between 5 and 20 classes ANS: C BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

24. Which of the following should generally NOT be used when constructing a relative frequency histogram? a. open-ended classes b. equal-width classes c. mutually exclusive classes d. exhaustive classes ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

25. Which of the following might best be displayed using a bar chart? a. time series data b. continuous variables c. qualitative variables d. quantitative variables ANS: C BLM: Remember

PTS:

REF: 14-16

TOP: 4–5

26. A stem-and-leaf plot is used to display the distribution of which of the following kinds of data? a. qualitative data b. quantitative data c. two quantitative variables d. two qualitative variables ANS: B BLM: Remember

PTS:

REF: 23-25

TOP: 4–5

27. If the manager of a grocery store wishes to display the sales trend, which of the following is the most effective type of graph to use? a. a bar chart b. a pie chart c. a histogram d. a line chart ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 21-22

TOP: 4–5

TRUE/FALSE 1. A variable is a characteristic that changes over time, and/or varies for different individuals or objects under consideration. An experimental unit is the individual or object on which a variable is measured. ANS: T BLM: Remember

PTS:

REF: 10

TOP: 1–3

2. Bar and pie charts are graphical techniques for qualitative data. The former focus the attention on the frequency of the occurrences of the categories; the latter emphasize the percentage of occurrences of each category. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 14-16 | 20

TOP: 1–3

3. In a sample of 1000 students in a university, 125, or 12.5%, are biology majors. The 12.5% is an example of statistical inference. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 5

TOP: 1–3

4. Individual observations within each class may be found in a frequency distribution. ANS: F BLM: Remember

PTS:

REF: 27-29

TOP: 1–3

5. Twenty-five percent of a sample of 200 professional tennis players indicated that their parents did not play tennis. This is an example of descriptive statistics, rather than inferential statistics. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 5

TOP: 1–3

6. A local cable system using a sample of 1000 subscribers estimates that 50% of its subscribers watch premium channels at least five times per week. This is an example of inferential statistics, as opposed to descriptive statistics. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 5

TOP: 1–3

7. Statistical inference is the process of making an estimate, prediction, or decision about a population based on sample data. ANS: T BLM: Remember

PTS:

REF: 5

TOP: 1–3

8. In the term “frequency distribution,” frequency refers to the number of data values or measurements falling within each class. ANS: T BLM: Remember

PTS:

REF: 14 | 27-29

TOP: 1–3

9. A branch of the statistics discipline that is used to develop and utilize techniques for effectively presenting numerical information is called inferential statistics. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 5

TOP: 1–3

10. A branch of the statistics discipline that is used to develop and utilize techniques for properly making inferences about population characteristics based on information contained in a sample drawn from this population is called inferential statistics. ANS: T BLM: Remember

PTS:

REF: 5

TOP: 1–3

11. Persons or objects that have characteristics of interest to statisticians are called variables. ANS: F BLM: Remember

PTS:

REF: 10

TOP: 1–3

12. A qualitative variable about which observations can be made in only two categories is a bivariate data set. ANS: F BLM: Remember

PTS:

REF: 11

TOP: 1–3

13. A variable that is normally described in words, rather than numerically presented, is a qualitative variable. ANS: T BLM: Remember

PTS:

REF: 12-13

TOP: 1–3

14. A discrete quantitative variable is one that may assume values only at specific points on an interval of values, with inevitable gaps between them. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 12-13

TOP: 1–3

15. A continuous quantitative variable is one that may assume values at all points on an interval of values, with no gaps between possible values. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 12-13

TOP: 1–3

16. Persons or objects on which an experiment is performed are called experimental units. ANS: T BLM: Remember

PTS:

REF: 10

TOP: 1–3

17. Groupings of data created to enhance an understanding of them, usually by making the groups collectively exhaustive and mutually exclusive, are called classes or categories. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 14

TOP: 1–3

18. A tabular summary of a categorical data set, which shows the number of observations that fall into each of several collectively exhaustive and mutually exclusive classes, is called a bar chart. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 14

TOP: 1–3

19. A relative frequency distribution is a tabular summary of a data set showing the proportions of all observations which fall into each of several collectively exhaustive and mutually exclusive classes. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 1–3

20. Univariate data result when a single variable is measured on a single experimental unit. ANS: T BLM: Remember

PTS:

REF: 11

TOP: 1–3

21. Bivariate data result when fewer than two variables are measured on a single experimental unit. ANS: F BLM: Remember

PTS:

REF: 11

TOP: 1–3

22. Multivariate data result when more than two variables are measured. ANS: T BLM: Remember

PTS:

REF: 11

TOP: 1–3

23. When constructing a frequency distribution for categorical data, it is always necessary to develop class boundaries. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 14-16

TOP: 1–3

24. It is often a good idea to convert frequency distributions to relative frequency distributions when you compare two distributions with different amounts of data. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 1–3

25. A pie chart is the familiar circular graph that shows how the measurements are distributed among the categories of a qualitative variable. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 14-16

TOP: 1–3

26. In constructing a pie chart for a categorical variable, one sector of the pie is assigned to each category of the variable. ANS: T BLM: Remember

PTS:

REF: 14-16

TOP: 1–3

27. A bar chart for a categorical variable shows the same distribution of measurements in categories as the pie chart, except that the height of each bar measures how often a particular category was observed. ANS: T BLM: Remember

PTS:

REF: 14-16

TOP: 1–3

28. A bar chart in which the bars are ordered from smallest to largest is called a Pareto chart. ANS: F BLM: Remember

PTS:

REF: 16

TOP: 4–5

29. A relative frequency distribution describes the proportion of the data values falling within each class, and may be presented in a histogram form.

ANS: T BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

30. Compared to the frequency distribution, the stem-and-leaf plot provides more detail, since it can describe the individual data values and show how many are in each group, or stem. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 14 | 23-25

TOP: 4–5

31. The difference between a histogram and a bar chart is that the histogram represents quantitative data while the bar chart represents qualitative data. ANS: T BLM: Remember

PTS:

REF: 27

TOP: 4–5

32. Suppose that the largest value in a set of data is 99, and the lowest value is 20. If the resulting frequency distribution is to have five classes of equal width, the class width should be 16. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 14 | 27-29

TOP: 4–5
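The class-width calculation in item 32 can be sketched as follows. The helper name `class_width` is hypothetical, and rounding up to a whole number is an assumed convention chosen so that the classes cover every observation.

```python
import math


def class_width(smallest, largest, num_classes):
    """Approximate class width: range divided by number of classes, rounded up."""
    # Assumption: round up to the next whole number so that num_classes
    # intervals of this width span the whole range of the data.
    return math.ceil((largest - smallest) / num_classes)


# Values from item 32: lowest 20, largest 99, five classes.
print(class_width(20, 99, 5))
```

Here the range is 99 - 20 = 79, and 79 / 5 = 15.8, which rounds up to 16.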

33. A relative frequency distribution describes the proportion of the data values falling within each category. ANS: T BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

34. The class interval in a frequency distribution is the number of data values falling within each class. ANS: F BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

35. Suppose you are given a stem-and-leaf plot that describes two-digit integers between 30 and 80. For one of the classes displayed, the row appears as 5|234. In this case, the numerical values being described are 25, 35, and 45. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 23-25

TOP: 4–5

36. If the six bars of a relative frequency histogram each have a width of five units, then the total area is 5. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 27-29

TOP: 4–5

37. When a distribution has more values to the left and tails to the right, then it is considered to be skewed to the left. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 25

TOP: 4–5

38. Time series data are often graphically depicted on a line chart. A line chart is a plot of the variable of interest over time. ANS: T BLM: Remember

PTS:

REF: 50-51

TOP: 4–5

39. When a distribution has more values to the right and tails to the left, then it is considered to be skewed to the right. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 25

TOP: 4–5

40. A histogram is said to be symmetric if, when we draw a vertical line down the centre of the histogram, the two sides are identical in shape and size. ANS: T BLM: Remember

PTS:

REF: 25

TOP: 4–5

41. A skewed histogram is one with a long tail extending either to the right or left. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 25

TOP: 4–5

42. A dotplot is a graphical portrayal of an absolute or relative frequency distribution of continuous quantitative data having the following characteristics: a. the lower and upper limits of the data classes are identified by tick marks on a horizontal axis, and b. the corresponding absolute or relative class frequencies are represented by the areas of contiguous rectangles that stand on top of each of these class intervals. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 22-23

TOP: 4–5

43. A pie chart is a portrayal of divisions of some aggregate by a segmented circle in such a way that the sector areas are proportional to the sizes of the divisions in question. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 20-21

TOP: 4–5

44. For the same data, a relative frequency histogram will look exactly the same as a frequency histogram. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

45. When constructing a relative frequency distribution, if the data are discrete, it will always be necessary to develop class boundaries. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

46. Relative frequency distributions are specifically constructed for analyzing discrete data. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

47. If you wish to compare two data sets of different sizes, it is usually a good idea to convert frequency distributions to relative frequency distributions. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 14 | 27-29

TOP: 4–5

48. A relative frequency histogram can be constructed for qualitative as well as quantitative data. ANS: F BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

49. A relative frequency histogram can be constructed by letting either the horizontal axis or the vertical axis represent the variable of interest. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 27-29

TOP: 4–5

50. The four classes: 0 to < 5, 5 to < 10, 10 to < 20, over 20, would be acceptable for developing a frequency distribution. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 14 | 27-29

TOP: 4–5

51. Stem-and-leaf plots are often used to analyze qualitative data in most real-life applications. ANS: F BLM: Remember

PTS:

REF: 23-25

TOP: 4–5

52. One of the differences between a bar chart and a histogram is that a histogram typically displays data in a percentage form. ANS: F BLM: Remember

PTS:

REF: 27-29

TOP: 4–5

53. Bar charts can typically be formed with the bars vertical or horizontal without affecting the interpretation.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 20-21

TOP: 4–5

54. In a line chart, the horizontal axis represents time (such as months, years), and the vertical axis represents the value of the variable of interest. ANS: T BLM: Remember

PTS:

REF: 21-22

TOP: 4–5

55. The data values plotted on a line graph are connected by drawing a straight line between each pair of successive points. ANS: T BLM: Remember

PTS:

REF: 21-22

TOP: 4–5

56. A distribution is skewed to the left if it contains a few unusually large measurements. ANS: F BLM: Remember

PTS:

REF: 25

TOP: 4–5

57. A bar chart and a histogram can be used interchangeably. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 27

TOP: 4–5

58. The simplest graph for quantitative data is the dotplot. ANS: T BLM: Remember

PTS:

REF: 22-23

TOP: 4–5

59. Relative frequency histograms are used most often to display bivariate data. ANS: F BLM: Remember

PTS:

REF: 11 | 27-28

TOP: 4–5

PROBLEM 1. Describe the difference between a population and a sample, and provide two examples for each. ANS: A population is the collection of all measurements of interest to the investigator in a particular study. For example, (1) all Canadian tennis players and (2) all residents of Nova Scotia. A sample is a subset of measurements selected from the population of interest. For example, (1) Canadian female tennis players, (2) Nova Scotia residents living in Halifax.

PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

2. Describe the difference between descriptive statistics and inferential statistics, and give an example for each. ANS: Descriptive statistics consists of procedures used to summarize and describe the important characteristics of a set of measurements. For example, Canada Revenue Agency reports that the mean time to file Form TP4 is 3 hours and 34 minutes. Inferential statistics consists of procedures used to make inferences about population characteristics from information contained in a sample drawn from this population. For example, based on a sample survey by the federal government reported in Maclean’s, only 46% of high school seniors can solve problems involving fractions, decimals, and percentages. PTS: 1 REF: 5 BLM: Higher Order - Understand

TOP: 1–3

3. A hospital administration would like to know the average length of a hospital stay. It would be almost impossible to look through all the past records and average the lengths of stay. Instead, a sample of 1000 patients over the past year was randomly chosen and their lengths of stay were averaged. Describe the population and the sample in this problem. ANS: Population: The lengths of stay of all the patients that the hospital has ever had Sample: The lengths of stay of the 1000 patients sampled PTS: 1 REF: 10 BLM: Higher Order - Analyze

TOP: 1–3

4. A highway department would like to repair a certain highway during the slowest part of a day to help minimize traffic congestion. Because it would be impossible to monitor the traffic flow on every day, the department monitors how many vehicles pass on this highway during one particular day. Based on these results, the department decides when to repair the highway. Describe the population and the sample in this problem. ANS: Population: All the traffic flow on all days on that highway Sample: The cars that passed on that highway on the particular day the department recorded PTS: 1 REF: 10 BLM: Higher Order - Analyze

TOP: 1–3

5. The freshness and overall quality of milk depend upon the type of packaging used. The manager of a dairy company is considering changing the packaging from cartons to plastic. The quality control team of this dairy company packaged milk in 100 containers of each type of material. After a specific amount of time, the team tested the milk for freshness and overall quality. Identify the population and the sample in this problem. ANS: Population: All the milk packaged at this particular dairy company Sample: The 100 selected containers of each kind of packaging PTS: 1 REF: 10 BLM: Higher Order - Analyze

TOP: 1–3

6. Identify each of the following variables as either quantitative or qualitative. a. the brands of ice cream that you purchase b. the daily high temperature for the past four weeks c. the amount of sugar consumed by Canadians in one year d. the species of fish in the zoo e. the lengths of time children wait for the school bus f. your favourite professional football team ANS: a. qualitative b. quantitative c. quantitative d. qualitative e. quantitative f. qualitative PTS: 1 REF: 12-13 BLM: Higher Order - Understand

TOP: 1–3

7. A piano teacher gave 33 lessons this month. The level of each student is listed below, where “B” is beginner, “I” is intermediate, and “A” is advanced. Summarize these data in a bar chart, and explain what the chart tells us.

B I I
I I A A I B
I B B
A B B
B B I
B B A
I A B
B B I
A B I

ANS:

The bar chart emphasizes that beginner students taking piano lessons are the most frequent, followed by intermediate, and advanced. PTS: 1 REF: 20-21 TOP: 1–3 BLM: Higher Order - Apply | Higher Order - Understand

Tobacco Use A doctor is concerned that too many of her patients use tobacco. She conducted a survey of 200 randomly chosen patients; the results of her findings are shown below.

Tobacco Use       Number of Users
Cigarette         40
Pipe/Cigar        7
Smokeless         22
Any combination   35
None              96

8. Refer to the Tobacco Use table. Determine the relative frequencies. ANS:

Tobacco Use       Relative Frequency
Cigarette         0.200
Pipe/Cigar        0.035
Smokeless         0.110
Any combination   0.175
None              0.480

PTS: 1 REF: 27-29 BLM: Higher Order - Apply

TOP: 1–3
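The relative frequencies in question 8 come from dividing each count by the total number of patients surveyed. A minimal Python sketch of that calculation, using the counts from the Tobacco Use table (the variable names are illustrative):

```python
# Counts from the Tobacco Use table.
counts = {
    "Cigarette": 40,
    "Pipe/Cigar": 7,
    "Smokeless": 22,
    "Any combination": 35,
    "None": 96,
}

total = sum(counts.values())  # total number of observations

# Relative frequency = class frequency / total number of observations.
relative = {category: count / total for category, count in counts.items()}

for category, rf in relative.items():
    print(f"{category:<16}{rf:.3f}")
```

The counts add up to the 200 patients surveyed, and the relative frequencies add up to 1.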

9. Refer to the Tobacco Use table. Construct a relative frequency bar chart. ANS:

PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Apply

TOP: 1–3

10. Identify each of the following variables as qualitative or quantitative. a. rating of the effectiveness of a new cold remedy (not effective, effective) b. amount of time spent assembling a five-shelf bookcase c. number of children in a beginners’ swimming class d. university where a student is enrolled e. colour preference for a nursery f. rating the Canadian foreign policy in the Middle East (fair, biased) ANS: a. qualitative b. quantitative c. quantitative d. qualitative e. qualitative f. qualitative PTS: 1 REF: 12-13 BLM: Higher Order - Understand

TOP: 1–3

11. Identify each of the following quantitative variables as discrete or continuous. a. average monthly temperature for a particular city b. number of employees of a statistical consulting firm who own laptop computers c. flight time between two cities d. number of puppies enrolled in an obedience class e. number of persons on a flight from Chicago to Calgary f. amount of gas purchased at a gas station ANS: a. continuous b. discrete c. continuous d. discrete e. discrete f. continuous

PTS: 1 REF: 12-13 BLM: Higher Order - Understand

TOP: 1–3

Potato Preferences A study was performed to examine preferences for various types of potatoes: mashed, french fries, hash browns, steak fries, and baked. The following data were recorded regarding the preferences of 150 people:

Type of Potato   Frequency
Mashed           24
French fries     54
Hash browns      15
Steak fries      12
Baked            45

12. Refer to the Potato Preferences table. What are the experimental units? ANS: The people being asked their preference represent the experimental units. PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

13. Refer to the Potato Preferences table. What is the variable being measured? Is it qualitative or quantitative? ANS: Type of potato. This is a qualitative variable. PTS: 1 REF: 10 | 12-13 BLM: Higher Order - Understand

TOP: 1–3

14. Refer to the Potato Preferences table. Construct a pie chart to describe the data. ANS:

PTS: 1 REF: 14-16 BLM: Higher Order - Apply

TOP: 1–3

15. Refer to the Potato Preferences table. What proportion of people prefer french fries or steak fries? ANS: Proportion of people preferring french fries or steak fries is 0.36 + 0.08 = 0.44. PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 1–3

16. Refer to the Potato Preferences table. Construct a bar chart to describe the data. ANS:

PTS: 1 REF: 20-21 BLM: Higher Order - Apply

TOP: 1–3

17. Classify the following variables, first as qualitative or quantitative and second, if quantitative, as discrete or continuous: a. the colours of cars at an auction b. the amount of money spent on building a new school c. the genders of members of parliament d. the styles of houses (1-story, 2-story, split level, etc.) e. the letter grades of students in a statistics exam (A, B, C, D, F) f. the number of credit cards owned by customers ANS: a. qualitative b. quantitative, continuous c. qualitative d. qualitative e. qualitative f. quantitative, discrete PTS: 1 REF: 12-13 BLM: Higher Order - Understand

TOP: 1–3

18. The owner of an Italian restaurant would like to see a graphical display of the number of customers that the restaurant serves. The following data are the average number of customers served each day for the past 12 months. Construct and interpret a line graph. ANS:

The restaurant serves the most customers in November and December, which may be because of the holiday season. It also has a peak in midsummer and dips in the spring and fall. NARR: Italian restaurant PTS:

REF: 21-22

TOP: 4–5

BLM: Higher Order - Apply | Higher Order - Understand

19. A math teacher would like to present the midterm results to her class in a way that shows the overall spread of the data. The 15 test scores for the history midterm are listed below. Construct and interpret a character dotplot.

45 73 78 70 62 73 98 79 50 80 61 68 91 72 89 62 57 65 64 78 77 50 69 95

ANS: [Character dotplot of Score; horizontal axis from 50 to 100] The points are fairly evenly distributed between 45 and 98, with most scores between 60 and 80. PTS: 1 REF: 22-23 TOP: 4–5 BLM: Higher Order - Apply | Higher Order - Understand

20. The librarian of a small community library has compiled the number of people who visited the library and the respective number of checked-out books, and has created the line chart shown below. Interpret the chart, where the solid line is the number of visitors and the dashed line is the number of books checked out.

ANS: The two lines track each other. This means as the number of visitors increases (or decreases) the number of books checked out increases (or decreases).

PTS: 1 REF: 21-22 BLM: Higher Order - Understand

TOP: 4–5

Garbage Weight A garbage carrier would like to start charging by the weight of a customer’s garbage rather than the number of cans. The weights (in kilograms) of 90 randomly selected cans of garbage are summarized in the chart below.

Class   Interval          Frequency
1       4.9 to < 8.9      4
2       8.9 to < 12.9     11
3       12.9 to < 16.9    16
4       16.9 to < 20.9    27
5       20.9 to < 24.9    19
6       24.9 to < 28.9    10
7       28.9 to < 32.9    3

21. Refer to the Garbage Weight table. Construct the frequency histogram. ANS:

PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Apply

TOP: 4–5

22. Refer to the Garbage Weight table. Find the relative frequency of each class. ANS: The relative frequencies are approximately: 0.044, 0.122, 0.178, 0.300, 0.211, 0.111, and 0.033. PTS:

REF: 14 | 27-29

TOP: 4–5

BLM: Higher Order - Apply

23. Refer to the Garbage Weight table. What percentage of the cans weigh less than 20.9 kilograms? ANS: (58/90) (100%) = 64.44% PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

24. Refer to the Garbage Weight table. What percentage of the cans weigh at least 24.9 kilograms? ANS: (13/90) (100%) = 14.44% PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5
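The answers to questions 22 to 24 can be reproduced from the Garbage Weight table with a short Python sketch. The variable names are illustrative; the class lower bounds and frequencies are taken from the table.

```python
# Class lower bounds (kg) and frequencies from the Garbage Weight table.
lower_bounds = [4.9, 8.9, 12.9, 16.9, 20.9, 24.9, 28.9]
frequencies = [4, 11, 16, 27, 19, 10, 3]

n = sum(frequencies)  # total number of cans

# Question 22: relative frequency of each class.
relative = [round(f / n, 3) for f in frequencies]

# Question 23: cans weighing less than 20.9 kg are those in classes
# that start below 20.9 (each such class also ends at or below 20.9).
below = sum(f for lb, f in zip(lower_bounds, frequencies) if lb < 20.9)

# Question 24: cans weighing at least 24.9 kg are those in classes
# that start at 24.9 or higher.
at_least = sum(f for lb, f in zip(lower_bounds, frequencies) if lb >= 24.9)

print(relative)
print(below, round(100 * below / n, 2))
print(at_least, round(100 * at_least / n, 2))
```

The counts are 58 and 13 out of 90 cans, giving 64.44% and 14.44% respectively.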

25. A high school volleyball coach has summarized the wins, losses, and ties of her team for the past four years in the following stacked bar chart. Interpret the chart. ANS: The team’s best year was 2001 and their worst was 1998. PTS: 1 REF: 20-21 BLM: Higher Order - Understand

TOP: 4–5

Absenteeism A high school band teacher has a record of each student’s absence. The results, in days, are 3, 4, 7, 2, 2, 1, 0, 0, 1, 0, 3, 3, 2, 1, 6, 0, 1, 0, 1, 1, 1, 5, 3, 1, 1, 0, 0, 2, 1, 2, 1, 0, 0, and 4.

26. Refer to the Absenteeism statement. Summarize the data using frequencies and relative frequencies. ANS:

Days   Frequency   Relative Frequency
0      9           0.26
1      11          0.32
2      5           0.15
3      4           0.12
4      2           0.06
5      1           0.03
6      1           0.03
7      1           0.03

PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Apply

TOP: 4–5

27. Refer to the Absenteeism statement. What proportion of students has been absent less than three days? ANS: 0.73 PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

28. Refer to the Absenteeism statement. How many students have been absent more than five days? ANS: 2 PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

29. Refer to the Absenteeism statement. What proportion of students has been absent at least two days? ANS: 0.42 PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

30. Refer to the Absenteeism statement. How many students have been absent at most six days? ANS: 33 PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

Water Temperature A limnologist is studying an Alberta lake in October. He records the temperatures in °C for surface water taken every other day at noon. The data are as follows: 8.5, 8.1, 7.9, 9.0, 7.7, 7.3, 7.1, 6.8, 9.2, 6.8, 6.3, 7.0, 6.5, 5.7, 5.9, 4.9, 4.2, and 6.9.

31. Refer to the Water Temperature statement. Construct a stem-and-leaf plot to display the distribution of the data. ANS:

Leaf unit = 0.1
Stem   Leaves
4      2 9
5      7 9
6      3 5 8 8 9
7      0 1 3 7 9
8      1 5
9      0 2

NARR: Water Temperature PTS: 1 REF: 23-25 BLM: Higher Order - Apply

TOP: 4–5

32. Refer to the Water Temperature statement. Would you describe the distribution of the data as symmetric, skewed to the right, or skewed to the left? Explain. ANS: The distribution is symmetric since the left and right sides of the distribution, when divided at the middle value, form mirror images. PTS: 1 REF: 25 BLM: Higher Order - Evaluate

TOP: 4–5

Weight of Bag of Potatoes The weight of a bag of potatoes is supposed to be 5 kilograms. Each bag may vary slightly from this standard. The weights for 18 bags of potatoes are as follows: 4.5, 5.0, 5.1, 6.0, 5.2, 3.9, 4.7, 4.9, 5.5, 5.6, 5.1, 4.9, 6.1, 4.8, 4.9, 5.1, 4.6, and 3.5.

33. Refer to the Weight of Bag of Potatoes statement. Construct a stem-and-leaf plot to describe the data. ANS:

Leaf unit = 0.1
Stem   Leaves
3      5 9
4      5 6 7 8 9 9 9
5      0 1 1 1 2 5 6
6      0 1

PTS: 1 REF: 23-25 BLM: Higher Order - Apply

TOP: 4–5
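A stem-and-leaf plot such as the one in question 33 can be built by splitting each value into its integer part (stem) and tenths digit (leaf). The sketch below assumes non-negative values recorded to one decimal place; the function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group one-decimal values by integer part (stem) and tenths digit (leaf)."""
    plot = defaultdict(list)
    for value in sorted(values):
        tenths = round(value * 10)        # e.g. 4.5 -> 45; rounding avoids float noise
        stem, leaf = divmod(tenths, 10)   # 45 -> stem 4, leaf 5
        plot[stem].append(leaf)
    return dict(plot)

# Weights in kilograms, copied from the Weight of Bag of Potatoes statement.
weights = [4.5, 5.0, 5.1, 6.0, 5.2, 3.9, 4.7, 4.9, 5.5, 5.6,
           5.1, 4.9, 6.1, 4.8, 4.9, 5.1, 4.6, 3.5]

for stem, leaves in stem_and_leaf(weights).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Because the values are sorted before grouping, the stems and the leaves within each stem both come out in ascending order.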

34. Refer to the Weight of Bag of Potatoes statement. What can be said about the shape of the distribution of the data? Why? ANS: The distribution of the data is symmetric since the left and right sides of the distribution, when divided at the middle value, form mirror images. PTS: 1 REF: 25-27 BLM: Higher Order - Evaluate

TOP: 4–5

Lecture Notes The numbers of pages of notes per lecture taken by a student in a beginning statistics course are as follows: 1, 5, 2, 6, 2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 6, 6, 5, 6, 4, 5, and 6.

35. Refer to the Lecture Notes statement. Construct a dotplot to describe the data. ANS: [Dotplot for Number of Pages: horizontal axis from 1.0 to 6.0, with stacks of 1, 2, 2, 4, 9, and 11 dots at 1 through 6 pages.] PTS: 1 REF: 22-23 BLM: Higher Order - Apply

TOP: 4–5

36. Refer to the Lecture Notes statement. Based on your dotplot, what can be said about the shape of the distribution of the data? Why? ANS: The distribution of the data is skewed to the left since the distribution contains few small measurements at the left side and many large measurements at the right side.

PTS: 1 REF: 25-27 BLM: Higher Order - Evaluate

TOP: 4–5

Kittens in Litters The following data represent the numbers of kittens born in litters for a particular breed of cat: 4, 5, 3, 6, 5, 5, 3, 4, 4, 5, 7, 5, 6, 6, 7, 4, 5, 5, 6, and 6.

37. Refer to the Kittens in Litters statement. Construct a relative frequency histogram. ANS:

PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Apply

TOP: 4–5

38. Refer to the Kittens in Litters statement. What proportion of litters has fewer than five kittens? ANS: Proportion of litters with fewer than five kittens = 0.10 + 0.20 = 0.30. PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

39. Refer to the Kittens in Litters statement. What proportion of litters has more than five kittens? ANS: Proportion of litters with more than five kittens = 0.25 + 0.10 = 0.35. PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

40. Refer to the Kittens in Litters statement. What proportion of litters has between four and six kittens, inclusive?

ANS: The proportion of litters with between four and six kittens = 0.20 + 0.35 + 0.25 = 0.80. PTS: 1 REF: 14 BLM: Higher Order - Analyze

TOP: 4–5

Social Media The Executive Board of a popular social media website wanted to know which kinds of professional people were accessing their website regularly over a two-month period. To that end, a random sample from its database of active users was collected. The following data were recorded:

User                        Number   Average Frequency of Access per Week
Business Persons            24       35
Academics in the Arts       54       21
Engineers                   15       10
Academics in the Sciences   12       8
Lawyers                     45       15
Medical Personnel           50       11

41. Refer to the Social Media table. What are the experimental units? ANS: The people drawn from the database of active users represent the experimental units. PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

42. Refer to the Social Media table. What is the sample size of the data collected? ANS: 200 PTS: 1 REF: 10 BLM: Higher Order - Apply

TOP: 1–3

43. Refer to the Social Media table. What is the variable of primary interest? Is it qualitative or quantitative? ANS: Type of professional person accessing their website. This is a qualitative variable. PTS: 1 REF: 10 | 12-13 BLM: Higher Order - Understand

TOP: 1–3

44. Refer to the Social Media table. Name two of the most common graphical techniques which may be used by the Executive to show to their staff the results of their experiment.

ANS: Bar charts and pie charts. PTS: 1 REF: 20-21 BLM: Higher Order - Understand

TOP: 1–3

45. Refer to the Social Media table. From the data gleaned, approximately what proportion of their users are academics? ANS: Proportion of their users which are academics is 0.27 + 0.06 = 0.33. PTS: 1 REF: 14-16 BLM: Higher Order - Apply

TOP: 1–3

Friendly Encounter The Executive Board of a popular online dating website, called Friendly Encounter, wanted to know how to market the site by targeting the age groups most likely to be interested in its product. To find out, a random sample of 250 users was drawn from its database of participants who had been active during the previous month. The users were then categorized into age groups. The average frequency of access per week was recorded for each age group. The result was organized into the following table of data:

Group Label   User Age Group in Years   Number of Users   Average Number of Logons per Week
15            < 20 years                44                35
25            20 to < 30                54                21
35            30 to < 40                55                15
45            40 to < 50                35                10
55            50 to < 60                42                8
65            ≥ 60 years                20                11

46. Refer to the Friendly Encounter table. What are the experimental units? ANS: The people drawn from the database of active participants represent the experimental units. PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

47. Refer to the Friendly Encounter table. What is the primary variable of interest in this experiment? ANS: The primary variable of interest is the number of users in each age group. PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

48. Refer to the Friendly Encounter table. Find the relative frequency of the primary variable of interest. ANS: The relative frequencies are approximately (in order of group label): 0.176, 0.216, 0.220, 0.140, 0.168, and 0.080. PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Apply

TOP: 4–5

49. Refer to the Friendly Encounter table. From this sample data, what may the Executive Board deduce about the approximate percentage of its dating site users being under 40 years of age? ANS: (0.176 + 0.216 + 0.220) * 100% = 61.2% PTS: 1 REF: 14 BLM: Higher Order - Apply

TOP: 4–5

50. Refer to the Friendly Encounter table. From this sample data, what may the Executive Board deduce about the approximate percentage of its dating site users being 40 years of age and over? ANS: (0.140 + 0.168 + 0.080) *100% = 38.8% PTS: 1 REF: 14 BLM: Higher Order - Apply

TOP: 4–5
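The percentages in questions 48 to 50 follow directly from the user counts. The sketch below assumes the six counts are read from the Friendly Encounter table in order of group label; the dictionary keys are shorthand labels chosen for this illustration.

```python
# Number of users per age group, copied from the Friendly Encounter table.
# The keys are shorthand labels, not the table's exact wording.
users = {"< 20": 44, "20 to < 30": 54, "30 to < 40": 55,
         "40 to < 50": 35, "50 to < 60": 42, "60 and over": 20}

total = sum(users.values())   # sample size, 250

# Relative frequency of each group: count divided by the sample size.
rel_freq = {group: count / total for group, count in users.items()}

under_40 = ["< 20", "20 to < 30", "30 to < 40"]
pct_under_40 = 100 * sum(users[g] for g in under_40) / total
pct_40_plus = 100 * sum(c for g, c in users.items() if g not in under_40) / total

print(round(pct_under_40, 1), round(pct_40_plus, 1))
```

Summing the counts first and dividing once avoids accumulating rounding error from the individual relative frequencies.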

51. Refer to the Friendly Encounter table. Just from the statistics alone, should the Board decide to target the under-40 age group, rather than the 40 & over group? Explain your conclusion. ANS: Yes. From the statistics alone, the under-40 group is the more likely of the two to be interested in the dating site, because about 61.2% of the sampled users fall in that age category. PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Evaluate

TOP: 4–5

52. Refer to the Friendly Encounter table. Should the Board decide to ignore the 40 & over age group altogether in its next marketing campaign? Explain your conclusion. ANS: No. To ignore the 40 & over age group is to ignore approximately 38.8% of the available population. That is too high a percentage to ignore completely without risking the loss of potential income.

PTS: 1 REF: 14 | 27-29 BLM: Higher Order - Evaluate

TOP: 4–5

Chapter 2—Describing Data with Numerical Measures MULTIPLE CHOICE 1. Which of the following is a meaningful measure of centre when the data are qualitative? a. the mean b. the median c. the mode d. the quartile ANS: C BLM: Remember

PTS:

REF: 56 | 59

TOP: 1–3

2. Which of the following is a property of a symmetric distribution? a. The mean is greater than the median. b. The mean and median are equal. c. The mean is less than the median. d. The mean is less than the mode. ANS: B BLM: Remember

PTS:

REF: 59

TOP: 1–3

3. In a histogram, what may be said of the proportion of the total area that must be to the right of the mean? a. It is less than 0.50 if the distribution is skewed to the left. b. It is always exactly 0.50. c. It is more than 0.50 if the distribution is skewed to the right. d. It is exactly 0.50 only if the distribution is symmetric. ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 57-59

TOP: 1–3

4. Which of the following statements applies to this set of data values: 17, 15, 16, 14, 17, 18, and 22? a. The mean, median, and mode are all equal. b. Only the mean and median are equal. c. Only the mean and mode are equal. d. Only the median and mode are equal. ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 57-59

TOP: 1–3

5. Which of the following best describes the relationship between the population mean and the sample mean? a. The population mean is always larger than the sample mean. b. The population mean is always smaller than the sample mean. c. The population mean is always larger than or equal to the sample mean. d. The population mean can be smaller than, larger than, or equal to the sample mean. ANS: D

PTS:

REF: 57-58

TOP: 1–3

BLM: Higher Order - Understand

6. The average score for a class of 35 students was 70. The 20 male students in the class averaged 73. What was the average score for the 15 female students in the class? a. 60 b. 66 c. 70 d. 73 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 57-58

TOP: 1–3
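Question 6 rests on the identity that a group's sum equals its mean times its size. A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
def missing_group_mean(n_total, mean_total, n_known, mean_known):
    """Mean of the remaining group, using sum = mean * count."""
    total_sum = n_total * mean_total    # 35 * 70 = 2450
    known_sum = n_known * mean_known    # 20 * 73 = 1460
    return (total_sum - known_sum) / (n_total - n_known)

female_mean = missing_group_mean(35, 70, 20, 73)
print(female_mean)  # 66.0
```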

7. In a histogram, what may one conclude about the proportion of the total area that must be to the left of the median? a. It is exactly 0.50. b. It is less than 0.50 if the distribution is skewed to the left. c. It is more than 0.50 if the distribution is skewed to the right. d. It is between 0.25 and 0.75 if the distribution is symmetric. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 58-59

TOP: 1–3

8. Which of the following statements about the mean is NOT always correct? a. The sum of the deviations from the mean is 0. b. Half the observations are on either side of the mean. c. The mean is a measure of the middle of a distribution. d. The value of the mean times the number of observations equals the sum of all of the observations. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 57-59

TOP: 1–3

9. Which of the following can be used to summarize data about qualitative variables? a. measures of centre b. measures of variability c. proportions d. measures of relative standing ANS: C BLM: Remember

PTS:

REF: 14

TOP: 1–3

10. Consider this data set: 5, 6, 7, 11, and 15. Which of the following values equals its mean? a. 7.0 b. 7.1 c. 8.1 d. 8.8 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 57-58

TOP: 1–3

11. A random sample from an unknown population had a sample standard deviation of zero. From this piece of information, which one of the following is a reasonable conclusion? a. The sample range must be zero. b. An error was made in computing the sample standard deviation. It must always be greater than zero. c. The population standard deviation must be zero. d. The population standard deviation must be less than zero ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3

12. The following data represent a sample of 10 scores on a 20-point statistics quiz: 16, 16, 16, 16, 16, 18, 18, 20, 20, and 20. After the mean, median, range, and variance were calculated for the scores, it was discovered that one of the scores of 20 should have been an 18. Which of the following pairs of measures will change when the calculations are redone using the correct scores? a. mean and range b. median and range c. mean and variance d. median and variance ANS: C TOP: 1–3

PTS: 1 REF: 57-58 | 63-64 BLM: Higher Order - Apply

13. Which of the following represents a disadvantage of using the sample range to measure dispersion? a. It produces spreads that are not meaningful for data analysis. b. The largest or smallest observation (or both) may be an outlier. c. The sample range is not measured in the same units as the data. d. The sample range is measured in the same units as the data. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 63

TOP: 1–3

14. The following 10 scores were obtained on a 20-point quiz: 4, 5, 8, 9, 11, 13, 15, 18, 18, and 20. The teacher computed the usual descriptive measures of centre and variability for these data, and then discovered an error was made. One of the 18s should have been a 16. Which pair of the following measures, calculated on the corrected data, would change from the original computation? a. mean and standard deviation b. mean and median c. range and median d. mean and range ANS: A TOP: 1–3

PTS: 1 REF: 57-58 | 63-65 BLM: Higher Order - Apply

15. Which of the following is NOT a measure of variability? a. the variance b. the standard deviation c. the mean d. the range ANS: C BLM: Remember

PTS:

REF: 57 | 62-63

TOP: 1–3

16. If two data sets have the same range, which of the following characteristics do these data sets also share? a. The distances from the smallest to the largest observations in both sets will be the same. b. The smallest and largest observations will be the same in both sets. c. They will have the same variance. d. They will have the same interquartile range. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 63

TOP: 1–3

17. A sample of 26 observations has a standard deviation of 4. What is the sum of the squared deviations from the sample mean? a. 21 b. 25 c. 100 d. 400 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 65

TOP: 1–3
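Since the sample variance is the sum of squared deviations divided by n – 1, the sum in question 17 is (n – 1) times the squared standard deviation. The sketch below shows that arithmetic and cross-checks the identity on a small made-up sample (the sample values are not from the test bank).

```python
import statistics

def sum_squared_deviations(n, s):
    """Sum of squared deviations implied by sample size n and sample SD s."""
    return (n - 1) * s ** 2

print(sum_squared_deviations(26, 4))  # 400

# Cross-check on a made-up sample: both routes should agree.
sample = [2, 4, 4, 5, 7, 8]
mean = statistics.mean(sample)
direct = sum((x - mean) ** 2 for x in sample)
via_variance = (len(sample) - 1) * statistics.variance(sample)
```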

18. Which of the following refers to numbers that indicate the spread or scatter of observations in a data set? a. measures of centre b. measures of location c. measures of variability d. measures of shape ANS: C BLM: Remember

PTS:

REF: 62-63

TOP: 1–3

19. Which of the following statements describes the variance of a data set? a. The variance is a mean of absolute deviations. b. The variance is a mean of positive and negative deviations. c. The variance is a mean of squared deviations. d. The variance is a mean of only the positive deviations. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 64

TOP: 1–3

20. If a store manager selected a sample of customers and computed the mean income for this sample, what has he computed? a. a parameter b. a statistic c. a qualitative value d. a categorical value

ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 56

TOP: 1–3

21. Which of the following is a characteristic of a population mean? a. It will always be larger than the mean of a sample selected from that population. b. It will always be larger than the population median. c. It will usually differ in value from the mean of a sample selected from that population. d. It will always be smaller than the population median. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 57-58

TOP: 1–3

22. A sample of students who have taken a calculus test has a mean score of 78.2, a mode of 67, and a median score of 67. Based on this information, what may one deduce about the distribution of the test scores? a. It is symmetric. b. It is right-skewed. c. It is left-skewed. d. It is bimodal. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 57-60

TOP: 1–3

23. Which of the following is the most frequently used measure of variation? a. the mean b. the range c. the variance d. the standard deviation ANS: D BLM: Remember

PTS:

REF: 65

TOP: 1–3

24. Which of the following measures is NOT affected by extreme values in the data? a. the mean b. the median c. the variance d. the range ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 58

TOP: 1–3

25. A university placement office conducted a survey of 100 engineers who had graduated from a local university. For these engineers, the mean salary was computed to be $72,000 with a standard deviation of $8,000. Which of the following best characterizes the percentage of these engineers who earn either more than $96,000 or less than $48,000? a. approximately 2.3% b. at least 5.6% (1/18 of the engineers) c. at most 5.6% (1/18 of the engineers) d. at most 11.1% (1/9 of the engineers)

ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 68-69 | 71

TOP: 4–5

26. According to Tchebysheff’s Theorem, what is the percentage of measurements in a data set that will fall within three standard deviations of the mean? a. 16% b. at least 68% c. 75% d. at least 89% ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 68-69

TOP: 4–5
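Tchebysheff's Theorem guarantees that at least 1 – 1/k² of the measurements lie within k standard deviations of the mean. The following sketch applies that bound to questions 25 and 26; the function name is illustrative.

```python
def tchebysheff_min_within(k):
    """Lower bound on the fraction of measurements within k standard deviations."""
    if k < 1:
        raise ValueError("the bound is only informative for k >= 1")
    return 1 - 1 / k ** 2

# Question 25: $48,000 and $96,000 are each k standard deviations from $72,000.
k = (96_000 - 72_000) / 8_000                 # 3.0
at_least_within = tchebysheff_min_within(k)   # about 0.889, i.e. 8/9
at_most_outside = 1 - at_least_within         # about 0.111, i.e. 1/9

print(round(100 * at_least_within, 1), round(100 * at_most_outside, 1))
```

The bound holds for any distribution, which is why it is phrased as "at least" inside the interval and "at most" outside it.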

27. You are given a distribution of measurements that is approximately mound-shaped. According to the Empirical Rule, what would be the approximate percentage of measurements in a data set that will fall within two standard deviations of their mean? a. 99% b. 95% c. 90% d. 68% ANS: B BLM: Remember

PTS:

REF: 69-70

TOP: 4–5

28. The expression $\bar{x} = \frac{\sum x_i f_i}{n}$, where $n = \sum f_i$, is recognizable as the formula for which of the following measures? a. the population mean, computed from ungrouped data b. the sample mean, computed from ungrouped data c. the population mean, computed from grouped data d. the sample mean, computed from grouped data ANS: D BLM: Remember

PTS:

REF: 76

TOP: 4–5

29. The expression $s^2 = \frac{\sum x_i^2 f_i - (\sum x_i f_i)^2 / n}{n - 1}$, where $n = \sum f_i$, is recognizable as the formula for which of the following measures? a. the sample variance, computed from ungrouped data b. the population variance, computed from ungrouped data c. the sample variance, computed from grouped data d. the population variance, computed from grouped data ANS: C BLM: Remember

PTS:

REF: 76

TOP: 4–5

30. Suppose that a particular statistical population can be described, at least roughly, by the normal curve. Which of the following can we use to estimate the percentages of all population values that lie within specified numbers of standard deviations from the mean? a. Tchebysheff’s Theorem b. the Empirical Rule c. the interquartile range d. a box plot ANS: B BLM: Remember

PTS:

REF: 69-70

TOP: 4–5

31. The lengths of screws produced by a machine are normally distributed, with a mean of 3 cm and a standard deviation of 0.2 cm. What can we conclude from this? a. Approximately 68% of all screws have lengths between 2.8 and 3.2 cm. b. Approximately 95% of all screws have lengths between 2.8 and 3.2 cm. c. Just about all screws have lengths between 2.8 and 3.2 cm. d. Just about all screws have lengths between 2.9 and 3.1 cm. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 69-71

TOP: 4–5

32. According to Tchebysheff’s Theorem, which of the following bounds will delimit the fraction of observations falling within k (where ) standard deviations of the mean? a. at most, 1 – b. at least c. at most, 1 – d. at least 1 – ANS: D BLM: Remember

PTS:

REF: 68-69

TOP: 4–5

33. The distribution of actual volumes of tomato soup in 450 mL cans is thought to be bell-shaped, with a mean of 450 mL and a standard deviation equal to 8 mL. Based on this information, between what two values could we expect 95% of all cans to contain? a. 430 and 470 mL b. 432 and 468 mL c. 434 and 466 mL d. 440 and 460 mL ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 69-71

TOP: 4–5
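For mound-shaped data, the Empirical Rule places roughly 68%, 95%, and 99.7% of measurements within one, two, and three standard deviations of the mean. A small sketch of the interval arithmetic behind questions 31 and 33, with hypothetical names:

```python
# Approximate coverage under the Empirical Rule for mound-shaped data.
EMPIRICAL_COVERAGE = {1: 0.68, 2: 0.95, 3: 0.997}

def empirical_interval(mean, sd, k):
    """Return the interval mean ± k*sd and its approximate coverage."""
    return (mean - k * sd, mean + k * sd), EMPIRICAL_COVERAGE[k]

# Question 33: soup cans with mean 450 mL and standard deviation 8 mL.
interval, coverage = empirical_interval(450, 8, 2)
print(interval, coverage)  # (434, 466) 0.95
```

Unlike Tchebysheff's bound, these percentages are approximations that apply only when the distribution is roughly bell-shaped.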

34. Incomes of workers in an automobile company in Ontario are known to be right-skewed, with a mean equal to $36,200. Applying Tchebysheff’s Theorem, at least 8/9 of all incomes are in the range of $29,600 to $42,800. What is the standard deviation of those incomes from that mean? a. $2,200 b. $4,755 c. $6,500 d. $6,700 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 68-69

TOP: 4–5

35. Which of the following randomly selected measurements, x, might be considered a potential outlier if it were to be selected from the given population? a. x = 0 from a population with μ = 0 and σ = 2 b. x = –5 from a population with μ = 1 and σ = 4 c. x = 7 from a population with μ = 3 and σ = 2 d. x = 4 from a population with μ = 0 and σ = 1 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 59 | 77-78

TOP: 6–7

36. Which of these values represents a lower quartile for the data set 23, 24, 21, and 20? a. 20.25 b. 22.0 c. 22.5 d. 23.5 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 79-81

TOP: 6–7

37. Which one of these values represents the upper quartile of the data set 10, 12, 16, 7, 9, 7, 41, and 14? a. 7 b. 8 c. 15.5 d. 24 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 79-81

TOP: 6–7
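The keyed answers to questions 36 and 37 are consistent with locating the lower and upper quartiles at positions 0.25(n + 1) and 0.75(n + 1) in the sorted data and interpolating between neighbours. The sketch below assumes that convention; other software conventions can give slightly different quartiles.

```python
def quartile(data, p):
    """Value at position p*(n + 1) in the sorted data, with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    pos = p * (n + 1)        # 1-based position, possibly fractional
    lower = int(pos)         # whole part of the position
    frac = pos - lower       # fractional part used for interpolation
    if lower < 1:
        return xs[0]
    if lower >= n:
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])

print(quartile([23, 24, 21, 20], 0.25))                 # 20.25
print(quartile([10, 12, 16, 7, 9, 7, 41, 14], 0.75))    # 15.5
```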

38. Expressed in percentiles, how is the interquartile range defined? a. It is the difference between the 20% and 70% values. b. It is the difference between the 20% and 80% values. c. It is the difference between the 25% and 75% values. d. It is the difference between the 45% and 95% values. ANS: C BLM: Remember

PTS:

REF: 80

TOP: 6–7

39. Scores on a chemistry exam were mound-shaped with a mean score of 90 and a standard deviation of 64. Scores on a statistics exam were also mound-shaped, with a mean score of 70 and a standard deviation of 16. A student who took both exams achieved a grade of 102 on the chemistry exam and a grade of 77 on the statistics exam. Which of these may be inferred from the information given? a. The student did relatively better on the chemistry exam than on the statistics exam, compared to the other students in each class. b. The student did relatively better on the statistics exam than on the chemistry exam, compared to the other students in the two classes. c. The student’s scores on both exams are similar when accounting for the scores of the other students in the two classes. d. Without more information it is impossible to say which of the student’s exam scores indicates the better performance. ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 77-78

TOP: 6–7
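Question 39 compares relative standing, which is usually done with z-scores: the distance from the mean measured in standard deviations. A minimal sketch using the figures as stated in the question:

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (or below) the mean."""
    return (x - mean) / sd

chemistry_z = z_score(102, 90, 64)    # 0.1875
statistics_z = z_score(77, 70, 16)    # 0.4375

print(chemistry_z, statistics_z)  # 0.1875 0.4375
```

The larger z-score belongs to the statistics exam, which matches the keyed answer.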

40. Which of the following summary measures is most affected by outliers? a. the first quartile b. the second quartile c. the third quartile d. the variance ANS: D TOP: 6–7

PTS: 1 BLM: Remember

REF: 64-65 | 79-80

41. What percentage of all observations in a data set lie between the 30th percentile and the third quartile? a. 30% b. 45% c. 75% d. 79% ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 78-81

TOP: 6–7

42. Which of the following describes a graphical device that displays the highest and lowest values in a data set, as well as the upper quartile, the middle value, and the lower quartile? a. a box plot b. a five-number summary c. a dotplot d. a stem-and-leaf plot ANS: A BLM: Remember

PTS:

REF: 81-84

TOP: 6–7

43. Lily’s score on her biochemistry test placed her at the 97th percentile. What does this mean? a. Lily’s score has a z-score of 0.97. b. Lily was in the bottom 3% of the students who took the test. c. Lily scored as high as or higher than 97% of the students who took the test. d. Lily’s score has a z-score of –0.97. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 78

TOP: 6–7

44. A sample of 50 values produced the following summary statistics: and Based on this information, what are the left and right ends, respectively, of the box plot using whiskers? a. 5.3 and 32.0 b. 10 and 14.6 c. 10 and 16.7 d. 14.6 and 16.7 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 81-84

TOP: 6–7

45. A sample of 600 values produced the following summary statistics: and Given this information, which of the following values constitutes the lower fence on a box plot? a. –4.60 b. 26.80 c. 75.80 d. 102.60 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 81-84

TOP: 6–7

46. A sample of 600 values produced the following summary statistics: and Given this information, which of the following values is the upper fence on a box plot of this data set? a. –4.60 b. 26.80 c. 75.80 d. 102.60 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 81-84

TOP: 6–7
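The fences asked about in questions 45 and 46 sit 1.5 interquartile ranges below the lower quartile and above the upper quartile. The sketch below uses made-up quartiles (10 and 20, not the values from the questions) purely to illustrate the calculation.

```python
def box_plot_fences(q1, q3):
    """Lower and upper fences: 1.5 * IQR beyond the lower and upper quartiles."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True when x falls outside the fences."""
    lower, upper = box_plot_fences(q1, q3)
    return x < lower or x > upper

# Hypothetical quartiles, chosen only for illustration.
lower, upper = box_plot_fences(10, 20)
print(lower, upper)  # -5.0 35.0
```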

47. If a data set has 15 values that have been sorted in ascending order, which value in the data set will be at the 25th percentile? a. the fourth value b. the third value c. the second value d. the first value ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 78

TOP: 6–7

48. If the distribution of sales is thought to be symmetric with very little variation, then what may one conclude about the box plot that represents the data set? a. The whiskers on the box plot should be about half as long as the box is wide. b. The width of the box will be very wide but the whiskers will be very short. c. The left and right edges will be approximately at equal distance from the second quartile. d. The width of the box will be very short, but the whiskers will be very long. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 81-84

TOP: 6–7

49. The following summary statistics were computed from a sample of size 250: and Given this information, which of the following statements is correct? a. The distribution of the data is slightly right-skewed. b. The distribution of the data is symmetric. c. A data value of 1 is an outlier. d. A data value of 25 is an outlier. ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 82

TOP: 6–7

TRUE/FALSE 1. Numerical descriptive measures computed from population measurements are called parameters. ANS: T BLM: Remember

PTS:

REF: 56

TOP: 1–3

2. Numerical descriptive measures computed from sample measurements are called statistics. ANS: T BLM: Remember

PTS:

REF: 56

TOP: 1–3

3. Two classes, one with 15 students and the other with 25 students, took the same test and averaged 85 points and 75 points, respectively. If the two classes were combined, the overall average score of the 40 students would be 80 points. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 57

TOP: 1–3

4. If, from a set of data, the sample mean was found to be 15 but the sample median was only 9, then the data set would be said to be skewed to the right. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

5. When data have been grouped (as in a frequency table, a relative frequency histogram, etc.), the class with the highest frequency is called the modal class, and the midpoint of that class is taken to be the mode. ANS: T BLM: Remember

PTS:

REF: 59-60

TOP: 1–3

6. The mode is generally used to describe large data sets. ANS: T BLM: Remember

PTS:

REF: 59-60

TOP: 1–3

7. The mode of a data set or a distribution of measurements, if it exists, is unique. ANS: F BLM: Remember

PTS:

REF: 59-60

TOP: 1–3

8. Jessica has been keeping track of what she spends to eat out. Last week’s expenditures for meals eaten out were $15.69, $15.95, $16.19, $20.91, $17.49, $24.53, and $17.66. The mean amount Jessica spends on meals is $18.35. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 57

TOP: 1–3

9. A data sample has a mean of 87 and a median of 117. The distribution of the data is positively skewed. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

10. A student scores 89, 75, 94, and 88 on four exams during the semester and 97 on the final exam. If the final is weighted double and the four others weighted equally, the student’s final average would be 90. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 57

TOP: 1–3
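Items 3 and 10 are both weighted means: each value is multiplied by its weight, and the total is divided by the sum of the weights. A short sketch, with a hypothetical helper name:

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Item 3: class averages weighted by class size.
combined = weighted_mean([85, 75], [15, 25])
print(combined)  # 78.75

# Item 10: four exams weighted once, the final weighted double.
final_average = weighted_mean([89, 75, 94, 88, 97], [1, 1, 1, 1, 2])
print(final_average)  # 90.0
```

The combined class average is 78.75 rather than 80 because the larger class pulls the mean toward its own average.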

11. In a mound-shaped distribution, there is no difference in the values of the mean and the median. ANS: T BLM: Remember

PTS:

REF: 59

TOP: 1–3

12. Measures of centre are values around which observations tend to cluster and which describe the location of what, in some sense, might be called the “centre” of a data set. ANS: T BLM: Remember

PTS:

REF: 56

TOP: 1–3

13. The median is a measure of centre that divides an ordered array of data into two halves. If the data are arranged in ascending order from smallest to largest, all the observations below the median are smaller than or equal to it, while all the observations above the median are larger than or equal to it. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 58

TOP: 1–3

14. The mode is the sum of a data set’s minimum and maximum values, divided by 2. ANS: F

PTS:

REF: 59

TOP: 1–3

BLM: Remember

15. If the variability of a set of data is very small, then the sample variance may be negative. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 64-65

TOP: 1–3

16. When all the numbers in the data set are the same, the standard deviation, s, must be zero. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3

17. In all cases, the sum of the deviations of the measurements from their mean is 0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 64

TOP: 1–3

18. The sample variance is approximately the average of the squared deviations of the measurements from their mean. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 65

TOP: 1–3

19. The sample variance calculated with a divisor of n gives a better estimate of the population variance, σ², than does the sample variance, s², with a divisor of n – 1. ANS: F BLM: Remember

PTS:

REF: 66

TOP: 1–3

20. The larger the values of the sample variance, s², and the sample standard deviation, s, the greater the variability in the data. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3
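One way to see why item 19 is false is to average both versions of the variance over every possible sample. The sketch below uses a made-up population {1, 2, 3} and all ordered samples of size 2 drawn with replacement; under those assumptions, the n – 1 divisor averages out to the population variance exactly, while the n divisor falls short.

```python
from fractions import Fraction
from itertools import product

def variance(values, divisor_offset):
    """Sum of squared deviations divided by (n - divisor_offset), kept exact."""
    n = len(values)
    mean = Fraction(sum(values), n)
    ss = sum((Fraction(v) - mean) ** 2 for v in values)
    return ss / (n - divisor_offset)

population = [1, 2, 3]                       # made-up population
pop_var = variance(population, 0)            # population variance, divisor N

samples = list(product(population, repeat=2))   # all 9 ordered samples of size 2
avg_n_minus_1 = sum(variance(s, 1) for s in samples) / len(samples)
avg_n = sum(variance(s, 0) for s in samples) / len(samples)

print(pop_var, avg_n_minus_1, avg_n)  # 2/3 2/3 1/3
```

Exact fractions are used so the comparison does not depend on floating-point rounding.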

21. In order to measure the variability in the same units as the original observations, we compute the sample variance. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3

22. Measures of variability describe typical values in the data. ANS: F BLM: Remember

PTS:

REF: 62-63

TOP: 1–3

23. The mean is one of the most frequently used measures of variability. ANS: F

PTS:

REF: 57 | 62-63

TOP: 1–3

BLM: Remember

24. The range is considered the weakest measure of variability. ANS: T BLM: Remember

PTS:

REF: 63

TOP: 1–3

25. The value of the standard deviation will always exceed that of the variance. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 65

TOP: 1–3

26. The standard deviation is expressed in terms of the original units of measurement, but the variance is not. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3

27. The value of the standard deviation may be either positive or negative, while the value of the variance will always be positive or zero. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 67

TOP: 1–3

28. The standard deviation is the positive square root of the variance. ANS: T BLM: Remember

PTS:

REF: 65

TOP: 1–3

29. A sample of 20 observations has a standard deviation of 4. The sum of the squared deviations from the sample mean is 320. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 65

TOP: 1–3

30. The value of the mean times the number of observations equals the sum of all of the observations. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 57

TOP: 1–3

31. In a histogram, the proportion of the total area that must be to the left of the median is less than 0.50 if the distribution is skewed to the left. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

32. In a histogram, if the distribution is skewed to the right, the proportion of the total area that must be to the left of the median is more than 0.50.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

33. If two data sets have the same range, the variances in both sets will be the same. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 63-65

TOP: 1–3

34. The sum of the deviations squared from the mean is always zero. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 64

TOP: 1–3

35. Measures of variability are numbers that indicate the spread or scatter of observations; they show the extent to which individual values in a data set differ from one another and, hence, differ from their central location. ANS: T BLM: Remember

PTS:

REF: 62-63

TOP: 1–3

36. A parameter and a statistic can be used interchangeably. ANS: F BLM: Remember

PTS:

REF: 56

TOP: 1–3

37. The median is one of the most commonly used measures of variability. ANS: F BLM: Remember

PTS:

REF: 58 | 62-63

TOP: 1–3

38. For distributions of data that are skewed to the left or right, the median would likely be the best measure of centre. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

39. You are given the data values 5, 10, 15, 20, and 25. If these data were considered to be a population, and you calculated its mean, you would get the same value as if these data were considered to be a sample from another larger population. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 56-57

TOP: 1–3

40. The value (n + 1)/2 indicates the value of the median in an ordered data set, where n is the number of data values. ANS: F BLM: Remember

PTS:

REF: 58

TOP: 1–3

41. For any distribution, if the mean is equal to the standard deviation, you can conclude that the distribution is symmetric. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

42. A distribution is said to be skewed to the right if the population mean is larger than the sample mean. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

43. One advantage of using the median as a measure of centre is that its value is NOT affected by extreme values. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

44. A data set in which the mean and median are equal is said to be bimodal data. ANS: F BLM: Remember

PTS:

REF: 59-60

TOP: 1–3

45. If the mean value of a distribution is 85 and the median is 67, the distribution must be skewed to the right. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

46. One of the advantages of the standard deviation over the variance as a measure of variability is that the standard deviation is measured in the original units. ANS: T BLM: Remember

PTS:

REF: 67

TOP: 1–3

47. For any distribution, the standard deviation is a measure of the variability of the data around the median. ANS: F BLM: Remember

PTS:

REF: 65

TOP: 1–3

48. Suppose the standard deviation for a given sample is known to be 12. If each data value in the sample is multiplied by 3, the standard deviation will be 36. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 65

TOP: 1–3

49. When the distribution is skewed to the left, then the mean > the median.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

50. When the distribution is skewed to the right, the mean < the median. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 59

TOP: 1–3

51. When the distribution is symmetric and unimodal, the mean = the median. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 57-59

TOP: 1–3

52. If a distribution is strongly skewed by one or more extreme values, you should use the mean rather than the median as a measure of centre. ANS: F BLM: Remember

PTS:

REF: 59

TOP: 1–3

53. Half of the observations in a data set are on either side of the mean. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 58

TOP: 1–3

54. The mean is a measure of the centre of a distribution. ANS: T BLM: Remember

PTS:

REF: 56-57

TOP: 1–3

55. The sum of the squared deviations from the mean is always zero. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 65

TOP: 1–3

56. The standard deviation is always smaller than the variance. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 65

TOP: 1–3

57. Tchebysheff’s Theorem states the following: Given a number k greater than or equal to 1, and a set of measurements, at least (1 – 1/k²) of the measurements in the data set will lie within k standard deviations of their mean. ANS: T BLM: Remember

PTS:

REF: 68-69

TOP: 4–5

58. The Empirical Rule states the following: Given a distribution of measurements that is approximately bell-shaped (mound-shaped), then the interval (μ ± σ) contains approximately 68% of the measurements; the interval (μ ± 2σ) contains approximately 95% of the measurements; and the interval (μ ± 3σ) contains all or almost all of the measurements. ANS: T BLM: Remember

PTS:

REF: 69-70

TOP: 4–5

59. The Empirical Rule and Tchebysheff’s Theorem can be used to describe data sets. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 68-70

TOP: 4–5

60. The Empirical Rule can be applied to any numerical data set. ANS: F BLM: Remember

PTS:

REF: 69-70

TOP: 4–5

61. For larger sample sizes, a rough approximation for the sample standard deviation s is that s ≈ R/4, where R is the range. ANS: T BLM: Remember

PTS:

REF: 63 | 69

TOP: 4–5

62. Since Tchebysheff’s Theorem applies to any distribution, it provides a very conservative estimate of the fraction of measurements that fall into a particular interval. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 69

TOP: 4–5

63. Tchebysheff’s Theorem gives a lower bound to the fraction of measurements to be found in an interval constructed as x̄ ± ks.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 68-69

TOP: 4–5

64. Tchebysheff’s Theorem applies only to data sets which have a mound-shaped distribution. ANS: F BLM: Remember

PTS:

REF: 69

TOP: 4–5

65. While Tchebysheff’s Theorem applies to any distribution, regardless of shape, the Empirical Rule applies only to distributions that are mound-shaped. ANS: T BLM: Remember

PTS:

REF: 69

TOP: 4–5

66. The mean of 40 sales receipts is $69.75 and the standard deviation is $10.25. Using Tchebysheff’s Theorem, at least 75% of the sales receipts were between $49.25 and $90.25. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 68-69

TOP: 4–5

67. According to Tchebysheff’s Theorem, at least 96% of observations should fall within five standard deviations of the mean. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 68-69

TOP: 4–5
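Questions 66 and 67 both apply the bound 1 – 1/k² from Tchebysheff’s Theorem. The sketch below evaluates it for the numbers in those questions; the function name `tchebysheff_bound` is hypothetical and chosen only for this illustration.

```python
def tchebysheff_bound(k):
    """Lower bound 1 - 1/k^2 on the fraction within k standard deviations."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1 - 1 / k ** 2

# Question 66: receipts with mean $69.75 and standard deviation $10.25.
mean, sd, k = 69.75, 10.25, 2
interval = (mean - k * sd, mean + k * sd)
print(interval)                          # (49.25, 90.25)
print(tchebysheff_bound(k))              # 0.75, i.e. at least 75%

# Question 67: five standard deviations.
print(round(tchebysheff_bound(5), 2))    # 0.96, i.e. at least 96%
```

The bound holds for any shape of distribution, which is why it is usually much lower than the fraction actually observed.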

68. Tchebysheff’s Theorem provides us with a measure of the shape of a set of data that focuses on the difference between the mode and the mean, and then relates it to the standard deviation. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 68-69

TOP: 4–5

69. The distribution of chequing account balances for customers at Independent Bank is known to be bell-shaped with a mean of $1800 and a standard deviation of $300. Given this information, the percentage of accounts with balances between $1500 and $2100 is approximately 95%. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 69-71

TOP: 4–5

70. The distribution of dollars paid for home insurance by home owners in Windsor is bell-shaped with a mean equal to $800 every six months, and a standard deviation equal to $120. Based on this information, we can use Tchebysheff’s Theorem to determine the percentage of home owners who will pay between $560 and $1040 for home insurance. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 68-69

TOP: 4–5

71. The distribution of credit card balances for customers is highly skewed to the right, with a mean of $1200 and a standard deviation of $150. Based on this information, approximately 68% of the customers will have credit card balances between $1050 and $1350. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 68-69 | 71

TOP: 4–5

72. The sample z-score is a measure of relative standing defined by z-score = (x – x̄)/s. It measures the distance between an observation and the mean in units of the standard deviation. ANS: T BLM: Remember

PTS:

REF: 77-78

TOP: 6–7

73. z-scores exceeding 3 in absolute value are likely to occur.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 77-78

TOP: 6–7

74. Any unusually large observation (as measured by a z-score greater than 3), or any unusually small observation (as measured by a z-score smaller than –3) is considered to be an outlier. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 77-78

TOP: 6–7

75. The 10th percentile of a set of measurements is the value that exceeds 90% of the measurements and is less than the remaining 10% of the measurements. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 78

TOP: 6–7

76. The difference between the largest and smallest values in an ordered array is called the interquartile range. ANS: F BLM: Remember

PTS:

REF: 63 | 80

TOP: 6–7

77. Quartiles divide the values in a data set into four parts of equal size. ANS: T BLM: Remember

PTS:

REF: 79-81

TOP: 6–7

78. The interquartile range is the difference between the lower and upper quartiles. ANS: T BLM: Remember

PTS:

REF: 80

TOP: 6–7

79. Expressed in percentiles, the upper quartile is the 75th percentile. ANS: T BLM: Remember

PTS:

REF: 79-81

TOP: 6–7

80. Measures of relative standing indicate the position of one observation relative to other observations in a set of data. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 77-78

TOP: 6–7

81. The median equals the second quartile. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 79-81

TOP: 6–7

82. The standard deviation is a measure of relative standing.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 77-78

TOP: 6–7

83. If a set of data has 120 values, the value of the 30th percentile will be calculated using the 36th and 37th values in the data, when the data values have been arranged in ascending order. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 78

TOP: 6–7
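Question 83 can be checked with the position rule p(n + 1), the same rule the quartile answers in this chapter use with 0.25(n + 1) and 0.75(n + 1). This is a minimal sketch under that assumption; `percentile_position` is a hypothetical helper name.

```python
import math

def percentile_position(p, n):
    """1-based position p(n + 1) of a percentile, with p given as a fraction."""
    return p * (n + 1)

position = percentile_position(0.30, 120)   # about 36.3
lower_rank = math.floor(position)           # ordered value just below the position
upper_rank = lower_rank + 1                 # ordered value just above the position
print(lower_rank, upper_rank)               # 36 37
```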

84. The distribution of a set of data is considered to be symmetric if the first quartile and the 25th percentile are equal. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 78-79

TOP: 6–7

85. If the mean value of a set of data is 83.5 and the median is 72.8, then the third quartile will be at least 83.5. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 79-81

TOP: 6–7

86. Assume that 75% of the households in Saskatchewan have incomes of $24,375 or below. Given this information, it is certain that the mean household income is less than $24,375. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 79-81

TOP: 6–7

87. The left and right ends of the box in a box plot represent the 25th and 75th percentiles, respectively. ANS: T BLM: Remember

PTS:

REF: 81-84

TOP: 6–7

88. The following five-number summary for a sample of size 500 was obtained: Minimum = 250, and Maximum = 4,950. Based on this information, the distribution of the data seems to be symmetric. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 79

TOP: 6–7

89. The following five-number summary for a sample of size 500 was obtained: Minimum = 250, and Maximum = 4,950. Based on this information, if you were to construct a box plot, the value 215 would be considered an outlier. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 81-84

TOP: 6–7

90. The following five-number summary for a sample of size 500 was obtained: Minimum = 250, and Maximum = 4,950. Based on this information, if you were to construct a box plot, the value corresponding to the right-hand edge of the box would be 4,800. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 118-120

TOP: 6–7

91. The following five-number summary for a sample of size 500 was obtained: Minimum = 250, and Maximum = 4,950. Based on this information, if you were to construct a box plot, the value corresponding to the upper fence is 10,200. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 118-120

TOP: 6–7

92. A sample of 2500 vehicles in Minnesota showed the following statistics related to the number of accidents per month: Based on these data, we can conclude that the distribution of accidents is skewed. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 59 | 79-81

TOP: 6–7

PROBLEM Motor Skills of Children The times required for 10 children to learn a particular motor skill were recorded as 9, 15, 23, 20, 16, 15, 24, 18, 10, and 20 minutes. 1. Refer to Motor Skills of Children statement. Find the mean time to learn this task. ANS: x̄ = 17 minutes PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

2. Refer to Motor Skills of Children statement. Find the median time to learn this task. ANS: m = 17 PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

3. Refer to Motor Skills of Children statement. Based on the values of the mean and median in the previous two questions, are the measurements symmetric or skewed? Give a reason for your answer. ANS: Since the mean and median values are the same, we conclude that the measurements are symmetric. PTS: 1 REF: 59 BLM: Higher Order - Understand

TOP: 1–3
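The answers to questions 1 to 3 can be reproduced with Python's standard `statistics` module. Comparing the mean with the median is only the rough symmetry check used in the answer, not a formal test.

```python
from statistics import mean, median

# Learning times in minutes, from the Motor Skills of Children statement.
times = [9, 15, 23, 20, 16, 15, 24, 18, 10, 20]

x_bar = mean(times)    # sum of the values divided by n
m = median(times)      # average of the 5th and 6th ordered values

print(x_bar, m)
print("symmetric" if x_bar == m else "skewed")
```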

4. Suppose someone told you that each value of a data set of 5 measurements had been multiplied by 100 and the sample mean was calculated to be 17.20. What was the sample mean of the original data? ANS: x̄ = 0.172 PTS: 1 REF: 57 BLM: Higher Order - Apply

TOP: 1–3

Flu Shot Eight doctors were asked how many flu shots they had given to patients this fall. The numbers of flu shots were 6, 3, 5, 24, 2, 6, 0, and 8. 5. Refer to Flu Shot statement. Find the sample mean. ANS: x̄ = 6.75 PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

6. Refer to Flu Shot statement. Find the median number of flu shots given. ANS: m = 5.5 PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

7. Refer to Flu Shot statement. Based on the values of the mean and median in the previous two questions, are the measurements symmetric or skewed? Why? ANS: Since the mean is larger than the median, we conclude that the measurements are skewed to the right. PTS:

REF: 59

TOP: 1–3

BLM: Higher Order - Understand

8. In assembling a home appliance, workers generally finish the process within 30 minutes to 1 hour. Occasionally, due to system failures, the assembly process takes a long time, possibly as long as 4 to 5 hours. What is the most appropriate measure of central tendency to use in this case if you want the measure to be representative of most of the observed times? Why is it the most appropriate measure? ANS: Median is the most appropriate measure because it is not influenced by extreme values. PTS: 1 REF: 59 BLM: Higher Order - Analyze

TOP: 1–3

9. The following data represent scores on a 15-point aptitude test: 8, 10, 15, 12, 14, and 13. Subtract 5 from every observation and compute the sample mean for both the original data and the new data. What effect, if any, does subtracting 5 from every observation have on the sample mean? ANS: x̄ (original data) = 12, and x̄ (new data) = 7. The sample mean is shifted to the left (decreased) by 5.

PTS: 1 REF: 57 BLM: Higher Order - Apply

TOP: 1–3

Student Ratings Thirty-three students were asked to rate themselves on whether they were outgoing or not, using this five-point scale: 1 = extremely extroverted, 2 = extroverted, 3 = neither extroverted nor introverted, 4 = introverted, or 5 = extremely introverted. The results are shown in the table below: Rating Frequency

10. Refer to Student Ratings table. Calculate the sample mean. ANS: PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

11. Refer to Student Ratings table. Calculate the median. ANS: m = 3

PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

Cracks in Bar The following data represent the number of small cracks per bar for a sample of eight steel bars: 4, 6, 10, 1, 3, 1, 25, and 8. 12. Refer to Cracks in Bar statement. What is the average number of small cracks per bar? ANS: x̄ = 7.25 PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

13. Refer to Cracks in Bar statement. Which, if any, of the observations appear to be outliers? Justify your answer. ANS: The value 25 has a z-score of 2.26 making it a suspect outlier. PTS: 1 REF: 59 BLM: Higher Order - Apply

TOP: 6–7

14. Refer to Cracks in Bar statement. Find the standard deviation for the number of small cracks per bar. ANS: s = 7.85

PTS: 1 REF: 66 BLM: Higher Order - Apply

TOP: 1–3
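Questions 12 to 14 fit together: the z-score quoted for the value 25 depends on the sample mean and the sample standard deviation. A minimal sketch with the standard `statistics` module, assuming the (n – 1) divisor that `stdev` uses:

```python
from statistics import mean, stdev

# Small cracks per bar, from the Cracks in Bar statement.
cracks = [4, 6, 10, 1, 3, 1, 25, 8]

x_bar = mean(cracks)          # sample mean
s = stdev(cracks)             # sample standard deviation (n - 1 divisor)
z = (25 - x_bar) / s          # z-score of the largest observation

print(x_bar, round(s, 2), round(z, 2))
```

A z-score between 2 and 3 in absolute value is what the answer to question 13 calls a suspect outlier.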

Aptitude Tests Twenty-eight applicants interested in working in community services took an examination designed to measure their aptitude for social work. A stem-and-leaf plot of the 28 scores appears below, in which the first column is the count per “branch,” the second column is the stem value, and the remaining digits are the leaves.

Count  Stems  Leaves
1      4      6
1      5      9
4      6      3688
6      7      026799
9      8      145667788
7      9      1234788

15. Refer to Aptitude Tests table. What is the median score? ANS: m = 84.5 PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

16. Refer to Aptitude Tests table. What is the sample mean for this data set? ANS: x̄ = 80.64 PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

17. Refer to Aptitude Tests table. Should the Empirical Rule be applied to this data set? ANS: No. The data do not appear to be mound-shaped. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

18. Refer to Aptitude Tests table. Use the range approximation to determine an approximate value for the standard deviation. Is this a good approximation? ANS: s ≈ R/4 = 13. This approximation is very close to the actual value of s = 12.85. PTS: 1 REF: 72-73 BLM: Higher Order - Apply

TOP: 4–5

19. Refer to Aptitude Tests table. What is the value of the sample standard deviation? ANS: s = 12.85 PTS: 1 REF: 66 BLM: Higher Order - Apply

TOP: 1–3

20. Refer to Aptitude Tests table. What is the range of these data? ANS: R = 52 PTS: 1 REF: 63 | 72-73 BLM: Higher Order - Apply

TOP: 1–3

21. Refer to Aptitude Tests table. What is the value of the first and third quartiles? ANS: Position of first quartile = 0.25(29) = 7.25, then Q1 = 70 + 0.25(2) = 70.5 Position of third quartile = 0.75(29) = 21.75, then Q3 = 88 + 0.75(3) = 90.25 PTS: 1 REF: 79-81 BLM: Higher Order - Apply

TOP: 6–7

22. Refer to Aptitude Tests table. What is the interquartile range? ANS: IQR = Q3 – Q1 = 90.25 – 70.5 = 19.75

PTS: 1 REF: 80 BLM: Higher Order - Apply

TOP: 6–7

23. Refer to Aptitude Tests table. Find the inner fences. ANS: Q1 – 1.5(IQR) = 70.5 – 1.5(19.75) = 40.875, and Q3 + 1.5(IQR) = 90.25 + 1.5(19.75) = 119.875 PTS: 1 REF: 81-84 BLM: Higher Order - Apply

TOP: 6–7

24. Refer to Aptitude Tests table. Find the outer fences. ANS: Q1 – 3(IQR) = 70.5 – 3(19.75) = 11.25, and Q3 + 3(IQR) = 90.25 + 3(19.75) = 149.50 PTS: 1 REF: 81-84 BLM: Higher Order - Apply

TOP: 6–7
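The quartiles, IQR, and fences in questions 21 to 24 can be reproduced as follows. This sketch assumes Python 3.8 or later, where `statistics.quantiles` with `method="exclusive"` places the quartiles at positions 0.25(n + 1) and 0.75(n + 1), the same rule used in the answer to question 21. The score list is read off the stem-and-leaf plot.

```python
from statistics import quantiles

# The 28 aptitude scores read from the stem-and-leaf plot.
scores = [46, 59, 63, 66, 68, 68, 70, 72, 76, 77, 79, 79, 81, 84,
          85, 86, 86, 87, 87, 88, 88, 91, 92, 93, 94, 97, 98, 98]

q1, _, q3 = quantiles(scores, n=4, method="exclusive")
iqr = q3 - q1

inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)   # inner fences
outer = (q1 - 3 * iqr, q3 + 3 * iqr)       # outer fences

# Observations beyond the inner fences would be flagged as outliers.
flagged = [x for x in scores if x < inner[0] or x > inner[1]]

print(q1, q3, iqr)      # 70.5 90.25 19.75
print(inner, outer)     # (40.875, 119.875) (11.25, 149.5)
print(flagged)          # []
```

Other software may use a different interpolation rule and give slightly different quartiles, so the method is worth stating whenever quartiles are reported.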

25. Refer to Aptitude Tests table. Construct a box plot for these data. ANS:

PTS: 1 REF: 81-84 BLM: Higher Order - Apply

TOP: 6–7

26. Refer to Aptitude Tests table. Does the box plot indicate the presence of any outliers? ANS: There do not appear to be any outliers present since there are no observations between the inner and outer fences or outside the outer fences. PTS: 1 REF: 82 BLM: Higher Order - Understand

TOP: 6–7

27. Suppose you are given the following set of sample measurements: –1, 0, 2, 6, 5, and 6. a. Calculate the sample mean. b. Find the median. c. Find the mode. d. Are these data symmetric, skewed to the right or skewed to the left? Justify your answer. ANS: a. x̄ = 3 b. m = (2 + 5)/2 = 3.5 c. Mode = 6 d. The data are skewed to the left since the mean is less than the median.

PTS: 1 REF: 57-59 BLM: Higher Order - Apply

TOP: 1–3

Ice Cream Cone Sales A neighbourhood ice cream vendor reports the following sales of single-scoop ice cream cones (measured in hundreds of cones) for five randomly selected weeks: 5, 4, 6, 5, and 3.

28. Refer to the Ice Cream Cone Sales statement. Find the average number of weekly sales of single-scoop ice cream cones. ANS: x̄ = 4.6 (hundreds of cones) PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

29. Refer to the Ice Cream Cone Sales statement. Find the median number of weekly sales of single-scoop ice cream cones. ANS: m = 5 PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

30. Refer to the Ice Cream Cone Sales statement. Find the variance for the weekly sales of single-scoop ice cream cones. ANS: s² = 1.3 PTS: 1 REF: 65-66 BLM: Higher Order - Apply

TOP: 1–3
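The variance in question 30 follows directly from the definition: sum the squared deviations from the mean and divide by n – 1. The sketch below spells out each step rather than calling a library routine; the variable names are illustrative only.

```python
# Weekly sales in hundreds of cones, from the Ice Cream Cone Sales statement.
sales = [5, 4, 6, 5, 3]
n = len(sales)

x_bar = sum(sales) / n                      # sample mean
deviations = [x - x_bar for x in sales]     # these always sum to zero
sum_sq = sum(d ** 2 for d in deviations)    # sum of squared deviations
s2 = sum_sq / (n - 1)                       # sample variance

print(round(x_bar, 2), round(sum_sq, 2), round(s2, 2))   # 4.6 5.2 1.3
```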

31. The following data represent the sales (measured in $10,000s) of seven real estate salespersons employed by a local agency: 23, 34, 56, 47, 45, 60, and 249. Which measure of centre, the mean or the median, would provide a better measure of the average sales of the company? Give a reason for your answer. ANS: The median would seem to provide a better measure of the average sales since it will not be adversely affected by the extreme value of 249. (The mean will be pulled strongly to the right by the extreme value of 249.) PTS: 1 REF: 59 BLM: Higher Order - Analyze

TOP: 1–3

Athletic Training Time The following data represent the numbers of minutes an athlete spends training per day: 73, 74, 76, 77, 79, 79, 83, 84, 88, 84, 84, 85, 86, 86, 87, 87, 88, 91, 92, 92, 93, 97, 98, 98, 81, and 82. The mean and standard deviation were computed to be 85.54 and 6.97, respectively. 32. Refer to the Athletic Training Time statement. Create a stem-and-leaf plot for the distribution of training times.

ANS:

Stems  Leaves
7      34
7      6799
8      123444
8      5667788
9      1223
9      788

PTS: 1 REF: 52 BLM: Higher Order - Apply

TOP: 1–3

33. Refer to the Athletic Training Time statement. Is the distribution relatively mound-shaped? ANS: Yes, the distribution of training times appears to be relatively mound-shaped. PTS: 1 REF: 53 BLM: Higher Order - Understand

TOP: 1–3

34. Refer to the Athletic Training Time statement. What percentage of measurements would you expect to be between 71.60 and 99.48? ANS: Since the distribution appears to be relatively mound-shaped, the Empirical Rule applies. The interval (71.60, 99.48) represents two standard deviations from the mean, so we would expect approximately 95% of the measurements to lie in this interval. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

35. Refer to the Athletic Training Time statement. What percentage of the measurements lies in the interval (71.60, 99.48)? ANS: 26 of the 26 measurements or 100% of the measurements lie in the given interval. PTS: 1 REF: 69-70 BLM: Higher Order - Apply

TOP: 4–5
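Questions 34 and 35 contrast the Empirical Rule's expectation (about 95%) with the actual count in the interval x̄ ± 2s. A minimal sketch of the actual count, using the standard `statistics` module:

```python
from statistics import mean, stdev

# Daily training times in minutes, from the Athletic Training Time statement.
times = [73, 74, 76, 77, 79, 79, 83, 84, 88, 84, 84, 85, 86,
         86, 87, 87, 88, 91, 92, 92, 93, 97, 98, 98, 81, 82]

x_bar, s = mean(times), stdev(times)
low, high = x_bar - 2 * s, x_bar + 2 * s          # two-standard-deviation interval

inside = sum(low <= t <= high for t in times)     # observations in the interval
print(round(x_bar, 2), round(s, 2))               # 85.54 6.97
print(inside, "of", len(times))                   # 26 of 26
```

With only 26 observations, 100% in the interval is not in conflict with the approximate 95% the Empirical Rule suggests.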

Calories in Soft Drinks The following data represent the number of calories in 340 mL cans of a sample of 8 popular soft drinks: 124, 144, 147, 146, 148, 154, 150, and 234. 36. Refer to the Calories in Soft Drinks statement. Find the median and the sample mean. ANS: m = (147 + 148)/2 = 147.5, x̄ = 155.875

PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

37. Refer to the Calories in Soft Drinks statement. Are these measurements of numbers of calories symmetric or skewed? Justify your conclusion. ANS: Since the mean is larger than the median, we conclude that the measurements are skewed to the right. PTS: 1 REF: 59 BLM: Higher Order - Understand

TOP: 1–3

Psychological Experiments In a psychological experiment, the time on task was recorded for ten subjects having a five-minute time constraint. These measurements (in seconds) were 182, 197, 207, 272, 192, 257, 247, 197, 232, and 237. 38. Refer to the Psychological Experiments statement. Find the average time on task. ANS: x̄ = 222 seconds PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

39. Refer to the Psychological Experiments statement. Find the median time on task. ANS: m = (207 + 232)/2 = 219.5 PTS: 1 REF: 58 BLM: Higher Order - Apply

TOP: 1–3

40. Refer to the Psychological Experiments statement. If you were writing a report to describe these data, which measure of central tendency would you use? Explain. ANS: Since there are no unusually large or small observations to affect the value of the mean, we would probably report the mean or average time on task. PTS: 1 REF: 57-59 BLM: Higher Order - Analyze

TOP: 1–3

41. You are given a sample of 8 measurements: 13, 11, 15, 16, 14, 14, 13, and 15. Calculate the sample mean. ANS: x̄ = 13.875 PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

42. A sample of n = 10 measurements consists of the following values: 15, 12, 13, 16, 11, 12, 14, 15, 11, and 13. Calculate the sample mean and the median of this data set. Are the data mound-shaped? ANS: x̄ = 13.2, and m = 13. No; the data are slightly skewed to the right since the mean is slightly larger than the median. PTS: 1 REF: 57-59 BLM: Higher Order - Apply

TOP: 1–3

43. The following data represent the scores for a sample of 10 students on a 20-point chemistry quiz: 16, 14, 2, 8, 12, 12, 9, 10, 15, and 13. Find the median and the sample mean. ANS: Median m = (12 + 12)/2 = 12, and x̄ = 11.1

PTS: 1 REF: 57-58 BLM: Higher Order - Apply

TOP: 1–3

Community College Raises Assume that all employees of a community college received a monthly raise. 44. Refer to the Community College Raises statement. How would a $150 raise affect the mean of salaries? How would a $150 raise affect the standard deviation of salaries? ANS: a. The mean of salaries will increase by $150. b. The standard deviation of salaries will remain unchanged.

PTS: 1 REF: 57 | 65-66 BLM: Higher Order - Apply

TOP: 1–3

45. Refer to the Community College Raises statement. What would happen to the mean of salaries if all salaries were raised by 5%? What would happen to the standard deviation of salaries if all salaries were raised by 4%? ANS: a. The mean of salaries will increase by 5%. b. The standard deviation of salaries will increase by 4%.

PTS: 1 REF: 57 | 65-66 BLM: Higher Order - Apply

TOP: 1–3
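The effects described in questions 44 and 45 can be seen numerically. The salaries below are hypothetical values invented for this illustration (the statement gives none); the point is only that adding a constant shifts the mean and leaves the standard deviation alone, while a percentage raise scales both.

```python
import math
from statistics import mean, stdev

# Hypothetical monthly salaries, invented for illustration only.
salaries = [3200, 3500, 4100, 4800, 5600]

flat_raise = [x + 150 for x in salaries]       # every salary up by $150
pct_raise = [x * 1.05 for x in salaries]       # every salary up by 5%

# Adding a constant: mean shifts by 150, spread is unchanged.
print(mean(flat_raise) - mean(salaries))
print(math.isclose(stdev(flat_raise), stdev(salaries)))

# Multiplying by a constant: mean and standard deviation both scale by 1.05.
print(math.isclose(mean(pct_raise), 1.05 * mean(salaries)))
print(math.isclose(stdev(pct_raise), 1.05 * stdev(salaries)))
```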

Optometrist Customers The following values denote the number of customers handled by an optometrist during a random sample of four periods of one hour each: 4, 6, 2, and 5. 46. Refer to the Optometrist Customers statement. Find the standard deviation of these values. ANS: s = 1.708 customers PTS: 1 REF: 65-66 BLM: Higher Order - Apply

TOP: 1–3

47. Refer to the Optometrist Customers statement. Find the range R. ANS: R = 6 – 2 = 4 PTS: 1 REF: 63 BLM: Higher Order - Apply

TOP: 1–3

48. The following data represent scores on a 15-point aptitude test: 8, 10, 15, 12, 14, and 13. Subtract 5 from every observation and compute the sample variance for the original data and the new data. What effect, if any, does subtracting 5 from every observation have on the sample variance? ANS: s² (original data) = 6.80, and s² (new data) = 6.80. The sample variance remains unchanged.

PTS: 1 REF: 64-65 BLM: Higher Order - Analyze

TOP: 1–3

Student Extroversion Thirty-three students were asked to rate themselves on whether they were outgoing or not using this five-point scale: 1 = extremely extroverted, 2 = extroverted, 3 = neither extroverted nor introverted, 4 = introverted, or 5 = extremely introverted. The results are shown in the table below: Rating Frequency

49. Refer to the Student Extroversion statement and table. Calculate the sample standard deviation. ANS: s = 0.696 PTS:

REF: 65-66

TOP: 1–3

BLM: Higher Order - Apply

50. Refer to the Student Extroversion statement and table. Find the percentage of measurements in the intervals x̄ ± s and x̄ ± 2s. Compare these results with the Empirical Rule percentages, and comment on the shape of the distribution. ANS: Sixty-one percent of the observations are in the interval x̄ ± s = (2.19, 3.57). The Empirical Rule says if the data set is mound-shaped, we should expect to see approximately 68% of the data within one standard deviation of the mean. Ninety-seven percent of the observations are in the interval x̄ ± 2s = (1.50, 4.26). The Empirical Rule says that if the data set is mound-shaped, we should expect to see approximately 95% of the observations within two standard deviations of the mean. Since both percentages are relatively close to those predicted by the Empirical Rule, the data must be approximately mound-shaped. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

51. Suppose you are given the following set of sample measurements: –1, 0, 2, 6, and 6. a. Calculate the sample variance. b. Calculate the sample standard deviation. c. Calculate the range. ANS: a. s² = 10.8 b. s = √10.8 = 3.286 c. R = 7

PTS: 1 REF: 63 | 64-66 BLM: Higher Order - Apply

TOP: 1–3

52. You are given a sample of 8 measurements: 13, 11, 15, 16, 14, 14, 13, and 15. a. Calculate the range. b. Calculate the sample variance and standard deviation. c. Compare the range and the standard deviation. Approximately how many standard deviations equal the value of the range? ANS: a. R = 5 b. s² = 2.4107, and s = 1.5526 c. The range, R = 5, is 5/1.5526 = 3.22 standard deviations.

PTS: 1 REF: 63 | 64-66 BLM: Higher Order - Apply

TOP: 1–3

53. A sample of n = 10 measurements consists of the following values: 15, 12, 13, 16, 11, 12, 14, 15, 11, and 13. Calculate the value of the standard deviation (s) and the range (R), and use R to approximate s. Is this a good approximation? ANS: s = 1.75, R = 5, s ≈ R/4 = 1.25. Yes, this is a good approximation. PTS: 1 REF: 63 | 65-66 | 72-73 BLM: Higher Order - Apply

TOP: 1–3
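The range approximation in question 53 is easy to compare against the exact value. This sketch uses the data from the question and the rule of thumb s ≈ R/4 quoted in this chapter for larger samples.

```python
from statistics import stdev

# The n = 10 measurements from question 53.
data = [15, 12, 13, 16, 11, 12, 14, 15, 11, 13]

s = stdev(data)                      # exact sample standard deviation
r = max(data) - min(data)            # range R
approx = r / 4                       # range approximation of s

print(round(s, 2), r, approx)        # 1.75 5 1.25
```

The rule gives only a rough check on the size of s, so a gap of this size between 1.25 and 1.75 is within what it can be expected to deliver for a small sample.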

54. The following data represent the scores for a sample of 10 students on a 20-point chemistry quiz: 16, 14, 2, 8, 12, 12, 9, 10, 15, and 13. Calculate the sample variance, the lower and upper quartiles, and the IQR for these data. ANS: s² = 16.767; position of lower quartile = 0.25(11) = 2.75; Q1 = 8 + 0.75(1) = 8.75; position of upper quartile = 0.75(11) = 8.25; Q3 = 14 + 0.25(1) = 14.25, and IQR = Q3 – Q1 = 5.5.

PTS: 1 REF: 64-65 | 79-81 BLM: Higher Order - Apply

TOP: 1–3

Parasites in Foxes A random sample of 100 foxes was examined by a team of veterinarians to determine the prevalence of a particular type of parasite. Counting the number of parasites per fox, the veterinarians found that 65 foxes had no parasites, 20 had one parasite, and so on. A frequency tabulation of the data is given here:

Number of Parasites, x   0   1   2   3   4   5   6   7   8
Number of Foxes, f       65  20  7   3   1   2   1   0   1

55. Refer to the Parasites in Foxes statement and table. Construct a relative frequency histogram for x, the number of parasites per fox. ANS:

PTS: 1 REF: 56-58 BLM: Higher Order - Apply

TOP: 1–3

56. Refer to the Parasites in Foxes statement and table. Calculate the sample mean and the sample standard deviation for the sample. ANS: x̄ = 0.71, and s = 1.387

PTS: 1 REF: 76 BLM: Higher Order - Apply

TOP: 1–3
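For data given as a frequency table, the mean and standard deviation are computed from the weighted sums Σxf and Σx²f. The sketch below applies the shortcut formula s² = (Σx²f – (Σxf)²/n)/(n – 1) to the fox data; the dictionary `counts` is just one convenient way to hold the table.

```python
import math

# Number of parasites x -> number of foxes f, from the frequency table.
counts = {0: 65, 1: 20, 2: 7, 3: 3, 4: 1, 5: 2, 6: 1, 7: 0, 8: 1}

n = sum(counts.values())                              # 100 foxes
sum_x = sum(x * f for x, f in counts.items())         # sum of x * f
sum_x2 = sum(x * x * f for x, f in counts.items())    # sum of x^2 * f

x_bar = sum_x / n
s2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)              # shortcut variance formula
s = math.sqrt(s2)

print(n, x_bar, round(s, 3))                          # 100 0.71 1.387
```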

57. Refer to the Parasites in Foxes statement and table. What fraction of the parasite counts fall within two standard deviations of the mean? Within three standard deviations of the mean? Do these results agree with Tchebysheff’s Theorem? Do they agree with the Empirical Rule? ANS: The two intervals for k = 2, 3 are calculated in the table below along with the actual proportion of measurements falling in the intervals. Tchebysheff’s Theorem is satisfied and the approximations given by the Empirical Rule are fairly close for k = 2 and k = 3.

k  x̄ ± ks        Interval          Fraction in Interval  Tchebysheff’s Theorem  Empirical Rule
2  0.71 ± 2.774  –2.064 to 3.484   95/100 = 0.95         At least 0.75          0.95
3  0.71 ± 4.161  –3.451 to 4.871   96/100 = 0.96         At least 0.89          1.00

PTS: 1 REF: 68-71 BLM: Higher Order - Analyze

TOP: 4–5

58. Refer to the Parasites in Foxes statement and table. Construct a relative frequency histogram for x, the number of parasites per fox. ANS:

PTS: 1 REF: 56-58 | 76 BLM: Higher Order - Apply

TOP: 6–7

59. Refer to the Parasites in Foxes statement and table. Calculate the sample mean and the sample standard deviation for the sample. ANS: x̄ = 0.71, and s = 1.387

PTS: 1 REF: 76 BLM: Higher Order - Apply

TOP: 6–7

60. Refer to the Parasites in Foxes statement and table. What fraction of the parasite counts fall within two standard deviations of the mean? Within three standard deviations? Do these results agree with Tchebysheff’s Theorem? Do they agree with the Empirical Rule? ANS: The two intervals for k = 2, 3 are calculated in the table below along with the actual proportion of measurements falling in the intervals. Tchebysheff’s Theorem is satisfied and the approximations given by the Empirical Rule are fairly close for k = 2 and k = 3.

k  x̄ ± ks        Interval          Fraction in Interval  Tchebysheff’s Theorem  Empirical Rule
2  0.71 ± 2.774  –2.064 to 3.484   95/100 = 0.95         At least 0.75          0.95
3  0.71 ± 4.161  –3.451 to 4.871   96/100 = 0.96         At least 0.89          1.00

PTS: 1 REF: 68-71 BLM: Higher Order - Analyze

TOP: 6–7
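The fractions in the answer table can be recounted from the frequency table. This sketch takes x̄ = 0.71 and s = 1.387 from the preceding answer as given and compares each observed fraction with the Tchebysheff lower bound 1 – 1/k²; the names `counts` and `fractions` are illustrative.

```python
# Number of parasites x -> number of foxes f, from the frequency table.
counts = {0: 65, 1: 20, 2: 7, 3: 3, 4: 1, 5: 2, 6: 1, 7: 0, 8: 1}
n = sum(counts.values())
x_bar, s = 0.71, 1.387        # values taken from the preceding answer

fractions = {}
for k in (2, 3):
    low, high = x_bar - k * s, x_bar + k * s
    inside = sum(f for x, f in counts.items() if low <= x <= high)
    fractions[k] = inside / n
    bound = 1 - 1 / k ** 2    # Tchebysheff lower bound
    print(k, fractions[k], round(bound, 2), fractions[k] >= bound)
```

Both observed fractions sit well above the bounds, which is typical because the theorem has to hold for every possible distribution.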

61. The times required to service customers’ cars at a repair shop are skewed to the right, with a mean of 2.5 hours and a standard deviation of 0.75 hours. What can be said about the percentage of cars whose service time is either less than 1 hour or more than 4 hours? ANS: Applying Tchebysheff’s Theorem, we can say that at most 25% of the cars take less than one hour or more than four hours to service. PTS: 1 REF: 68-69 | 71 BLM: Higher Order - Analyze

TOP: 4–5

Cola Bottling When a machine dispensing cola at a bottling plant is working correctly, it dispenses a mean of 340 mL of cola per bottle, with a standard deviation of 6 mL. 62. Refer to the Cola Bottling statement. When the machine is working correctly, what percentage of the bottles will be filled with between 328 and 352 mL of cola? ANS: At least 75% of the bottles will be filled with between 328 and 352 mL of cola. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

63. Refer to the Cola Bottling statement. On a particular day, the bottling plant supervisor randomly selects two bottles from among those filled by the machine. One bottle contains 336 mL of cola, and the other contains 344 mL of cola. Based on the contents of these two bottles, what can the supervisor conclude about the machine’s performance? ANS: The machine seems to be working correctly. PTS: 1 REF: 68-69 BLM: Higher Order - Evaluate

TOP: 4–5

Job Applicant Test Scores A new manufacturing plant has 20 job openings. To select the best 20 applicants from among the 1000 job seekers, the plant’s personnel office administers a written aptitude test to all applicants. The average score on the aptitude test is 150 points, with a standard deviation of 10 points. Assume the distribution of test scores is approximately mound-shaped. 64. Refer to the Job Applicant Test Scores statement. What percentage of the test scores will fall between 130 and 160 points?

ANS: Approximately 81.5% of the test scores will fall between 130 and 160 points. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

65. Refer to the Job Applicant Test Scores statement. How many applicants will score between 130 and 160 points? ANS: Approximately 815 applicants will score between 130 and 160 points. PTS: 1 REF: 69-70 BLM: Higher Order - Apply

TOP: 4–5

66. Refer to the Job Applicant Test Scores statement. One of the applicants scored 192 points on the test. What might you conclude about this test score? ANS: The score should be regarded as an outlier; the score should be double-checked to see if it was recorded correctly. PTS: 1 REF: 59 | 69-70 BLM: Higher Order - Analyze

TOP: 4–5

Frequency Table Suppose you are given the following frequency table of ratings from 0 to 8: Rating Frequency

Assume that the sample mean and the sample standard deviation are 0.66 and 1.387, respectively. 67. Refer to the Frequency Table. What fraction of the x-values fall within two standard deviations of the mean? Within three standard deviations of the mean? ANS: 0.95 of the x values fall within two standard deviations of the mean. 0.96 of the x values fall within three standard deviations of the mean. PTS: 1 REF: 68-69 | 71 BLM: Higher Order - Analyze

TOP: 4–5

68. Refer to the Frequency Table. Do the results of the previous question agree with Tchebysheff’s Theorem?

ANS: Yes. According to Tchebysheff’s Theorem, at least 3/4 or 0.75 of the measurements fall within two standard deviations of the mean, and at least 8/9 or 0.89 of the measurements fall within three standard deviations of the mean. PTS: 1 REF: 68-69 BLM: Higher Order - Analyze

TOP: 4–5

69. Refer to the Frequency Table. Do the results of the previous question agree with the Empirical Rule? ANS: Yes. According to the Empirical Rule, approximately 95% of the measurements fall within two standard deviations of the mean, and all or almost all of the measurements fall within three standard deviations of the mean. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

Amount of Food Sold Suppose the hourly dollar amount of food sold by a local restaurant follows an approximately mound-shaped distribution, with a mean sales level of $400 per hour and a standard deviation of $60 per hour. 70. Refer to the Amount of Food Sold statement. During what percentage of working hours does this restaurant sell between $280 and $520 worth of food per hour? ANS: 95% of working hours PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5

71. Refer to the Amount of Food Sold statement. During a one-hour period, this restaurant had sales at the 84th percentile. What dollar sales figure does this represent? ANS: $460 PTS: 1 REF: 78 BLM: Higher Order - Apply

TOP: 4–5

72. For Labrador Retrievers, the average weight at 12 months of age is 23 kg, with a standard deviation of 1.2 kg. What can be said about the proportion of 12-month-old Labrador Retrievers that will weigh between 21.2 kg and 24.8 kg? ANS:

Since it is not known whether the distribution of weights is mound-shaped, the Empirical Rule doesn’t necessarily apply. Using Tchebysheff’s Theorem, since the given interval represents 1.5 standard deviations on each side of the mean, at least 1 – 1/(1.5)² ≈ 0.56 of the weights will lie in the interval. PTS: 1 REF: 68-69 BLM: Higher Order - Analyze

TOP: 4–5

73. The mean and variance of a sample of n = 25 measurements are 80 and 100, respectively. Explain in detail how to use Tchebysheff’s Theorem to describe the distribution of the measurements. ANS: You are given x̄ = 80 and s² = 100. The standard deviation is s = 10. The distribution of measurements is centred about x̄ = 80, and Tchebysheff’s Theorem states that ▪ At least 3/4 of the 25 measurements lie in the interval x̄ ± 2s = 80 ± 20; that is, 60 to 100. ▪ At least 8/9 of the measurements lie in the interval x̄ ± 3s = 80 ± 30; that is, 50 to 110. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

Manufacturing Operation Time In a time study conducted at a manufacturing plant, the length of time to complete a specified operation is measured for each one of n = 40 workers. The mean and standard deviation are found to be 15.2 and 1.40, respectively. 74. Refer to the Manufacturing Operation Time statement. Describe the sample data using the Empirical Rule. ANS: To describe the data using the Empirical Rule, calculate these intervals: (x̄ ± s) = 15.2 ± 1.40, or 13.8 to 16.6; (x̄ ± 2s) = 15.2 ± 2.80, or 12.4 to 18.0; (x̄ ± 3s) = 15.2 ± 4.20, or 11.0 to 19.4. If the distribution of measurements is mound-shaped, you can apply the Empirical Rule and expect approximately 68% of the measurements to fall into the interval from 13.8 to 16.6, approximately 95% to fall into the interval from 12.4 to 18.0, and all or almost all to fall into the interval from 11.0 to 19.4. PTS: 1 REF: 69-71 BLM: Higher Order - Apply

TOP: 4–5

75. Refer to the Manufacturing Operation Time statement. Describe the sample data using Tchebysheff’s Theorem. ANS:

If you doubt that the distribution of measurements is mound-shaped, or if you wish for some other reason to be conservative, you can apply Tchebysheff’s Theorem and be absolutely certain of your statements. Tchebysheff’s Theorem tells you that at least 3/4 of the measurements fall into the interval from 12.4 to 18.0, and at least 8/9 into the interval from 11.0 to 19.4. PTS: 1 REF: 68-69 | 71 BLM: Higher Order - Apply

TOP: 4–5

76. A sample of n = 10 measurements consists of the following values: 15, 12, 13, 16, 11, 12, 14, 15, 11, and 13. a. Can you use Tchebysheff’s Theorem to describe this data set? Why or why not? b. Can you use the Empirical Rule to describe this data set? Why or why not? ANS: a. Yes. Tchebysheff’s Theorem applies to any data set, regardless of the shape of its distribution. b. No, since the data set is not mound-shaped. PTS: 1 REF: 71 BLM: Higher Order - Analyze

TOP: 4–5

77. A distribution of measurements is relatively mound-shaped, with mean 70 and standard deviation 10. a. What percentage of the measurements will fall between 60 and 80? b. What percentage of the measurements will fall between 50 and 90? c. What percentage of the measurements will fall between 50 and 80? d. If a measurement is chosen at random from this distribution, what is the probability that it will be greater than 80? ANS: a. The interval from 60 to 80 represents μ ± σ = 70 ± 10. Since the distribution is relatively mound-shaped, the percentage of measurements between 60 and 80 is approximately 68% according to the Empirical Rule. b. Again, using the Empirical Rule, the interval μ ± 2σ = 70 ± 20, or between 50 and 90, contains approximately 95% of the measurements. c. Since approximately 68% of the measurements are between 60 and 80, the symmetry of the distribution implies that approximately 34% of the measurements are between 70 and 80. Similarly, since approximately 95% of the measurements are between 50 and 90, approximately 47.5% of the measurements are between 50 and 70. Thus, the percentage of measurements between 50 and 80 is 34% + 47.5% = 81.5%. d. Since the proportion of the measurements between 70 and 80 is 0.34, and the proportion of the measurements that is greater than 70 is 0.50, the proportion that is greater than 80 must be 0.50 – 0.34 = 0.16. PTS: 1 REF: 69-71 BLM: Higher Order - Analyze

TOP: 4–5
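The symmetry argument in parts (c) and (d) of the preceding question can be checked with a small Python sketch. It assumes the interval endpoints sit a whole number of standard deviations (at most three) from the mean, and it uses 0.997 for the three-standard-deviation band, a commonly quoted value that the text states only as "all or almost all". The names `WITHIN`, `cumulative`, and `fraction_between` are hypothetical.

```python
# Empirical Rule: approximate fraction within k standard deviations of the mean.
# 0.997 for k = 3 is an assumed stand-in for "all or almost all".
WITHIN = {0: 0.0, 1: 0.68, 2: 0.95, 3: 0.997}


def cumulative(z: int) -> float:
    """Approximate fraction of measurements below mean + z standard deviations."""
    if z not in range(-3, 4):
        raise ValueError("z must be a whole number between -3 and 3")
    half = WITHIN[abs(z)] / 2  # by symmetry, half of the band lies on each side
    return 0.5 + half if z >= 0 else 0.5 - half


def fraction_between(mean: float, sd: float, low: float, high: float) -> float:
    """Approximate fraction of a mound-shaped distribution between low and high."""
    z_low = (low - mean) / sd
    z_high = (high - mean) / sd
    if z_low != round(z_low) or z_high != round(z_high):
        raise ValueError("endpoints must be whole standard deviations from the mean")
    return cumulative(round(z_high)) - cumulative(round(z_low))


print(round(fraction_between(70, 10, 50, 80), 3))  # 0.815
print(round(1 - cumulative(1), 2))                 # 0.16, fraction above 80
```

The same call with mean 150, standard deviation 10, and endpoints 130 and 160 reproduces the 81.5% figure in the Job Applicant Test Scores questions.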

78. A sample of n = 10 measurements consists of the following values: 15, 12, 13, 16, 11, 12, 14, 15, 11, and 13. a. Can you use Tchebysheff’s Theorem to describe this data set? Why or why not? b. Can you use the Empirical Rule to describe this data set? Why or why not? ANS: a. Yes. Tchebysheff’s Theorem applies to any data set, regardless of the shape of its distribution. b. No, since the data set is not mound-shaped. PTS: 1 REF: 71 BLM: Higher Order - Analyze

TOP: 4–5

Height of Basketball Players A sample of basketball players has a mean height of 190 cm, with a standard deviation of 12 cm. You know nothing else about the size of the data set or the shape of the data distribution. 79. Refer to the Height of Basketball Players statement. Can you use Tchebysheff’s Theorem and/or the Empirical Rule to describe the data? Explain. ANS: Since nothing is known about the shape of the data distribution, you must use Tchebysheff’s Theorem to describe the data. PTS: 1 REF: 71 BLM: Higher Order - Analyze

TOP: 4–5

80. Refer to the Height of Basketball Players statement. What can you say about the fraction of measurements that fall between 154 and 226 cm? ANS: The interval from 154 to 226 represents x̄ ± 3s = 190 ± 36, which will contain at least 8/9 of the measurements. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

81. Refer to the Height of Basketball Players statement. What can you say about the fraction of measurements that fall between 166 and 214? ANS: The interval from 166 to 214 represents x̄ ± 2s = 190 ± 24, which will contain at least 3/4 of the measurements. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

82. Refer to the Height of Basketball Players statement. What can you say about the fraction of measurements that are less than 166? ANS: The value x = 166 lies two standard deviations below the mean. Since at least 3/4 of the measurements are within the two standard deviations range, at most 1/4 can lie outside that range, which means that at most 1/4 can be less than 166. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

Solution Volumes An analytical chemist wanted to use electrolysis to determine the number of moles of cupric ions in a given volume of solution. The solution was partitioned into n = 30 portions of 0.2 mL each. Each of the n = 30 portions was tested. The average number of moles of cupric ions for the n = 30 portions was found to be 0.185 mole; the standard deviation was 0.015 mole. 83. Refer to the Solution Volumes statement. Calculate the intervals (x̄ ± s), (x̄ ± 2s), and (x̄ ± 3s). ANS: (x̄ ± s) = 0.185 ± 0.015, or 0.170 to 0.200; (x̄ ± 2s) = 0.185 ± 0.030, or 0.155 to 0.215; (x̄ ± 3s) = 0.185 ± 0.045, or 0.140 to 0.230. PTS: 1 REF: 57-58 | 65-66 | 68-69 BLM: Higher Order - Analyze

TOP: 4–5

84. Refer to the Solution Volumes statement. Describe the distribution of the measurements for the n = 30 portions of the solution using Tchebysheff’s Theorem. ANS: If we doubt that the distribution of measurements is mound-shaped, or if no prior information as to the shape of the distribution is available, we use Tchebysheff’s Theorem. The theorem guarantees nothing about the interval 0.17 to 0.20 (at least 0 of the measurements fall there), but at least 3/4 of the measurements fall in the interval 0.155 to 0.215, and at least 8/9 of the measurements fall in the interval from 0.14 to 0.23. PTS: 1 REF: 68-69 BLM: Higher Order - Apply

TOP: 4–5

85. Refer to the Solution Volumes statement. Suppose the chemist had used only n = 5 portions of the solution for the experiment and obtained the readings 0.18, 0.21, 0.20, 0.22, and 0.18. Would the Empirical Rule be suitable for describing the n = 5 measurements? Why? ANS:

If the chemist had used only a sample of size n = 5 for this experiment, the distribution would not be mound-shaped. Therefore, the Empirical Rule would not be suitable for describing n = 5 measurements. PTS: 1 REF: 69-71 BLM: Higher Order - Evaluate

TOP: 4–5

86. Attendance at London Symphony concerts for the past two years showed an average of 3000 people per performance, with a standard deviation of 100 people per performance. Attendance at a randomly selected concert was found to be 3290. If attendance data is mound-shaped, does the attendance at the selected concert appear to be unusual? Justify your conclusion. ANS: The z-score associated with 3290 is 2.90, indicating that 3290 is 2.90 standard deviations above the mean. Although the z-score does not exceed 3, it is close enough for one to suspect that 3290 is an outlier. PTS: 1 REF: 77-78 BLM: Higher Order - Evaluate

TOP: 6–7

87. Consider the following set of measurements: 5.4, 5.9, 3.5, 4.1, 4.6, 2.5, 4.7, 6.0, 5.4, 4.6, 4.9, 4.6, 4.1, 3.4, and 2.2. a. Find the 25th, 50th, and 75th percentiles. b. What is the value of the interquartile range? ANS: a. 25th percentile = Q1 = 3.5; 50th percentile = median = 4.6; 75th percentile = Q3 = 5.4 b. IQR = Q3 – Q1 = 5.4 – 3.5 = 1.9

PTS: 1 REF: 78-80 BLM: Higher Order - Apply

TOP: 6–7
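The quartiles in the preceding answer can be reproduced with a short Python sketch. It assumes the position rule 0.01p(n + 1) with linear interpolation between neighbouring ordered values, which is consistent with the answers given here; other software may use a different rule. The helper name `percentile` is illustrative.

```python
def percentile(data, p):
    """p-th percentile using position p(n + 1)/100 in the ordered data."""
    xs = sorted(data)
    n = len(xs)
    pos = p * (n + 1) / 100      # 1-based position, possibly fractional
    lower = int(pos)             # whole part of the position
    frac = pos - lower           # fractional part used for interpolation
    if lower < 1:
        return xs[0]
    if lower >= n:
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


values = [5.4, 5.9, 3.5, 4.1, 4.6, 2.5, 4.7, 6.0, 5.4, 4.6, 4.9, 4.6, 4.1, 3.4, 2.2]
q1, median, q3 = (percentile(values, p) for p in (25, 50, 75))
print(q1, median, q3)        # 3.5 4.6 5.4
print(round(q3 - q1, 1))     # 1.9
```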

Number of Calories in Soft Drinks The following data represent the number of calories in 340 mL cans of a sample of 8 popular soft drinks: 124, 144, 147, 146, 148, 154, 150, and 234. 88. Refer to the Number of Calories in Soft Drinks statement. Find the inner fences. ANS: Q1 – 1.5(IQR) = 144.5 – 1.5(8.5) = 131.75, and Q3 + 1.5(IQR) = 153 + 1.5(8.5) = 165.75 PTS: 1 REF: 81-84 BLM: Higher Order - Apply

TOP: 6–7

89. Refer to the Number of Calories in Soft Drinks statement. Find the outer fences.

ANS: Q1 – 3(IQR) = 144.5 – 3(8.5) = 119, and Q3 + 3(IQR) = 153 + 3(8.5) = 178.5 PTS: 1 REF: 81-84 BLM: Higher Order - Apply

TOP: 6–7

90. Refer to the Number of Calories in Soft Drinks statement. Construct a box plot for these data. Does the box plot indicate the presence of any outliers? ANS:

Yes, the observation 124 is a suspect outlier since it lies between the lower outer fence and the lower inner fence. Also, the observation 234 is an extreme outlier since it lies above the upper outer fence. PTS: 1 REF: 82 BLM: Higher Order - Apply

TOP: 6–7
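The fence calculations for the soft drink data can be sketched in Python as below. The sketch assumes quartiles are found with the position rule 0.01p(n + 1) and linear interpolation; the names `percentile` and `classify` are hypothetical helpers, not part of the text.

```python
def percentile(data, p):
    """p-th percentile using position p(n + 1)/100 in the ordered data."""
    xs = sorted(data)
    n = len(xs)
    pos = p * (n + 1) / 100
    lower = int(pos)
    frac = pos - lower
    if lower < 1:
        return xs[0]
    if lower >= n:
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


calories = [124, 144, 147, 146, 148, 154, 150, 234]
q1 = percentile(calories, 25)                 # 144.5
q3 = percentile(calories, 75)                 # 153.0
iqr = q3 - q1                                 # 8.5
inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)      # (131.75, 165.75)
outer = (q1 - 3 * iqr, q3 + 3 * iqr)          # (119.0, 178.5)


def classify(x):
    """Label an observation relative to the inner and outer fences."""
    if x < outer[0] or x > outer[1]:
        return "extreme outlier"
    if x < inner[0] or x > inner[1]:
        return "suspect outlier"
    return "not an outlier"


for x in (124, 234):
    print(x, classify(x))
```

Running the loop labels 124 as a suspect outlier and 234 as an extreme outlier, in line with the box plot answer.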

91. The following data represent the scores for a sample of 10 students on a 20-point chemistry quiz: 16, 14, 2, 8, 12, 12, 9, 10, 15, and 13. Calculate the z-score for the smallest and largest observations. Is either of these observations unusually large or unusually small? ANS: For x = 2, z-score = (2 – 11.1)/4.095 = –2.22. For x = 16, z-score = (16 – 11.1)/4.095 = 1.197. Since the z-score for the smallest observation exceeds 2 in absolute value, the smallest observation is unusually small. However, the largest observation is not unusually large. PTS: 1 REF: 77-78 BLM: Higher Order - Apply

TOP: 6–7
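The z-scores in the quiz-score answer can be verified with Python's standard `statistics` module, whose `stdev` function computes the sample standard deviation (divisor n – 1). The helper name `z_score` is illustrative.

```python
import statistics

scores = [16, 14, 2, 8, 12, 12, 9, 10, 15, 13]
mean = statistics.mean(scores)   # sample mean, 11.1
sd = statistics.stdev(scores)    # sample standard deviation, about 4.095


def z_score(x):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


print(round(z_score(min(scores)), 2))   # -2.22
print(round(z_score(max(scores)), 3))   # 1.197
```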

92. Two students are enrolled in different sections of an introductory statistics class at a local university. The first student, enrolled in the morning section, earns a score of 76 on a midterm exam where the class mean was 64 with a standard deviation of 8. The second student, enrolled in the afternoon section, earns a score of 72 on a midterm exam where the class mean was 60 with a standard deviation of 7.5. If the scores on the midterm exams are normally distributed, which student scored better relative to his or her classmates? ANS: z1 = (76 – 64)/8 = 1.5; z2 = (72 – 60)/7.5 = 1.6; the student in the afternoon section scored better relative to her classmates since her z-score is larger. PTS: 1 REF: 77-78 BLM: Higher Order - Evaluate

TOP: 6–7

93. If the 90th and 91st observations in a set of 100 data values are 158 and 167, respectively, what is the 90th percentile value? ANS: 166.1 PTS: 1 REF: 78 BLM: Higher Order - Apply

TOP: 6–7

94. If the 18th and 19th observations in a set of 25 data values are 42.6 and 43.8, what is the 70th percentile value? ANS: 42.84 PTS: 1 REF: 78 BLM: Higher Order - Apply

TOP: 6–7

Chapter 3—Describing Bivariate Data MULTIPLE CHOICE 1. Which of the following is NOT a measure of the linear relationship between two variables? a. the covariance b. the correlation coefficient c. the variance d. the coefficient of determination ANS: C BLM: Remember

PTS:

REF: 107-109

TOP: 1–4

2. Generally speaking, if two variables are unrelated, what will the covariance be? a. a large positive number b. a large negative number c. a positive or negative number close to zero d. a positive number close to 1 ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 107-108

TOP: 1–4

3. Given the least squares regression line y = 3.8 – 2x, which of the following best describes the relationship between the two variables? a. The relationship between x and y is positive. b. The relationship between x and y is negative. c. There is no linear relationship between x and y. d. As x decreases, so does y. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 109-110

TOP: 1–4

4. Given that sx² = 400, sy² = 625, sxy = 350, and n = 10, what is the correlation coefficient? a. 0.875 b. 0.70 c. 0.56 d. 0.156 ANS: B PTS: 1 REF: 107-108 | 112 BLM: Higher Order - Apply

TOP: 1–4
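The keyed answer of 0.70 for question 4 follows from r = sxy / (sx · sy). The Python sketch below assumes the three given values are the variance of x (400), the variance of y (625), and the covariance (350); that reading is inferred from the answer key, since the symbols did not survive in the question. The function name is illustrative.

```python
import math


def correlation(s_xy: float, var_x: float, var_y: float) -> float:
    """Correlation coefficient from the covariance and the two variances."""
    # sqrt(var_x * var_y) equals the product of the two standard deviations.
    return s_xy / math.sqrt(var_x * var_y)


print(correlation(350, 400, 625))  # 0.7
```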

5. Suppose that a regression line for a set of data has a y-intercept of 6.75 and a slope of 1.25. From this information, is it possible to determine the actual value of y when x = 2? a. Yes, it is 9.25. b. Yes, it is 8.75. c. Yes, it is 2.25. d. No, it is not possible. ANS: D BLM: Higher Order - Apply

PTS:

REF: 110-111

TOP: 1–4

6. Which of the following values would be the correlation coefficient produced by a perfectly straight line sloping downward? a. +1 b. –1 c. +2 d. –2 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

7. If all the points in a scatterplot lie on the least squares regression line, then what must the correlation coefficient r be? a. only 1.0 b. only –1.0 c. either 1.0 or –1.0 d. only 100 ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 108 | 110

TOP: 1–4

8. A manager of a supermarket wishes to show the relationship between the number of customers who come to the store on weekends and the total volume of sales (in dollars) during the same weekend. Which of the following graphs would likely be most useful if the manager has a sample of 52 weekends worth of data? a. bar chart b. pie chart c. box and whisker plot d. scatterplot ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 105-106

TOP: 1–4

9. Given that = 100, = 64, = 60, and n = 8, what would be the slope of the best-fitting regression line? a. 7.5 b. 0.75 c. 0.64 d. 0.60 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 110-111

TOP: 1–4

10. For which of the following tasks can the best-fitting regression line be used? a. for finding the actual value of y for a given value of x b. for predicting the value of y for a given value of x c. for calculating the correlation coefficient d. for calculating the covariance

ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 110-111

TOP: 1–4

11. Which of the following values of the correlation coefficient r indicates a stronger correlation than 0.72? a. 0.65 b. 0.60 c. –0.70 d. –0.75 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

12. A scatterplot can be used to depict the relationship between which kinds of variables? a. two qualitative variables b. two quantitative variables c. one qualitative variable and one quantitative variable d. three quantitative variables ANS: B BLM: Remember

PTS:

REF: 105-106

TOP: 1–4

13. Which of the following would NOT be considered appropriate when constructing a scatterplot? a. labelling the x and y axes b. labelling the graph using titles c. connecting the data points on the graph with straight lines d. drawing the best-fitting line on the graph ANS: C BLM: Remember

PTS:

REF: 105-106

TOP: 1–4

TRUE/FALSE 1. If the correlation coefficient , then all the data points lie exactly on a straight line. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

2. The correlation coefficient r is a number that indicates the direction and the strength of the relationship between the dependent variable y and the independent variable x. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

3. If the correlation coefficient r = 0, then there is no linear relationship whatsoever between the dependent variable y and the independent variable x. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

4. A perfectly straight line sloping upward would produce a covariance value of +1. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

5. A perfectly straight line sloping downward would produce a covariance value of –1. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

6. The standard deviation is a measure of the linear relationship between two quantitative variables. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 107-109

TOP: 1–4

7. Generally speaking, if two quantitative variables are unrelated, the covariance will be a positive or negative number close to zero. ANS: T BLM: Remember

PTS:

REF: 107-108

TOP: 1–4

8. The best-fitting line relating the dependent variable y to the independent variable x, often called the regression or least-squares line, is found by minimizing the sum of the squared differences between the data points and the line itself. ANS: T BLM: Remember

PTS:

REF: 109-111

TOP: 1–4

9. The scatterplot is a graph that is used to represent the relationship between two quantitative variables. ANS: T BLM: Remember

PTS:

REF: 105-106

TOP: 1–4

10. When constructing a scatterplot, the independent variable x is placed on the horizontal axis, and the dependent variable y is placed on the vertical axis. ANS: T PTS: 1 BLM: Remember

REF: 105-106 | 109

TOP: 1–4

11. A scatterplot is particularly useful in determining if the relationship between the independent and dependent variables is not linear. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 105-106

TOP: 1–4

12. If the linear relationship between the dependent and independent variables is positive, the scatterplot will show the data points on the (x, y) plane moving generally from the lower left corner to the upper right corner. ANS: T BLM: Remember

PTS:

REF: 108

TOP: 1–4

13. If the correlation coefficient between the independent variable x and the dependent variable y is 0.87, then the best-fitting line would have a slope equal to 0.87. ANS: F BLM: Remember

PTS:

REF: 110-111

TOP: 1–4

14. If two variables have a correlation coefficient equal to 0.005, this means that there is a strong relationship between the two variables. ANS: F BLM: Remember

PTS:

REF: 107-109

TOP: 1–4

15. A perfect correlation between two variables will always produce a correlation coefficient of +1.0. ANS: F BLM: Remember

PTS:

REF: 107-109

TOP: 1–4

PROBLEM Legislation Poll A councillor was interested in determining whether people between the ages of 18 and 30 years of age would react to a piece of legislation differently than people over 30 years of age. The councillor polled a sample of 150 people from his district. The resulting data are shown in the table below:

                     Reaction
Age                  Favour   Oppose   No Opinion
18–30 years old      25       40       15
Over 30 years old    45       20       5

1. Refer to the Legislation Poll table. Construct a side-by-side bar chart. ANS:

PTS: 1 REF: 100-102 BLM: Higher Order - Apply

TOP: 1–4

2. Refer to the Legislation Poll table. Construct a pie chart for each of the age groups. ANS:

PTS: 1 REF: 100-102 BLM: Higher Order - Apply

TOP: 1–4

3. Refer to the Legislation Poll table. Which of the two types of presentations in the previous two questions is more easily understood? ANS: Since the total number of people in each age group is different, the pie charts displaying percentages, as opposed to raw frequencies, are more easily understood. PTS: 1 REF: 100-102 BLM: Higher Order - Analyze

TOP: 1–4

Gender Differences Narrative Male and female respondents to a questionnaire about gender differences are categorized into three groups according to their answers, as shown below:

Gender   Group 1   Group 2   Group 3
Men      52        36        62
Women    17        40        43

4. Refer to Gender Differences Narrative. Create a side-by-side bar chart to describe these data. ANS:

PTS: 1 REF: 100-102 BLM: Higher Order - Apply

TOP: 1–4

5. Refer to Gender Differences Narrative. Create two pie charts (one for men and one for women) to describe these data. ANS:

PTS: 1 REF: 100-102 BLM: Higher Order - Apply

TOP: 1–4

6. Refer to Gender Differences Narrative. Which of the charts created in the above three questions best depicts the difference or similarity of the responses of men and women? Give reasons for your answer. ANS: The differences in the proportions of men and women in the three groups are most graphically portrayed by the pie charts, since the unequal numbers of men and women tend to confuse the interpretation of the bar charts. However, the bar charts are useful in retaining the actual frequencies in each group, which are lost in the pie charts. PTS: 1 REF: 100-102 BLM: Higher Order - Analyze

TOP: 1–4

7. Refer to Gender Differences Narrative. Create a stacked bar chart to describe these data. ANS:

PTS: 1 REF: 100-102 BLM: Higher Order - Apply

TOP: 1–4

Selling Price and Age of Home Narrative A real estate agent is interested in knowing whether there is a relationship between the age of a house and the selling price. Listed below are the ages (in years) and selling prices (in $1000s) of a sample of six houses the agent has sold in the past year:

Age (x)   Price (y)
20        60
15        58
17        59
16        61
15        59
18        60

8. Refer to Selling Price and Age of Home Narrative. Plot the data on a scatterplot. ANS:

PTS: 1 REF: 105-106 BLM: Higher Order - Apply

TOP: 1–4

9. Refer to Selling Price and Age of Home Narrative. Based on the plot in the previous question, does there appear to be a relationship between the age of a house and selling price? ANS: The points on the graphs are randomly scattered, showing no particular relationship. PTS: 1 REF: 105-106 | 108-109 BLM: Higher Order - Understand

TOP: 1–4

10. Consider the following set of bivariate data:

x   y
1   3
3   5
5   7
7   9
3   10
4   6
2   4
6   7

a. Plot the data on a scatterplot. b. Based on the plot in (a), are there any data points that seem unusual (i.e., are there any outliers)? If so, which one or ones? c. Ignoring any observations you have considered to be unusual, what can be said about the relationship between the variables x and y? ANS:

b. There appears to be one outlier: (3, 10). c. If you ignore the point (3, 10), the variables x and y appear to have a positive linear relationship. PTS: 1 REF: 105-106 | 108-109 BLM: Higher Order - Apply

TOP: 1–4

Students’ GPA Narrative A law school administrator was interested in whether a student’s score on the entrance exam can be used to predict a student’s grade point average (GPA) after one year of law school. The administrator took a random sample of 15 students and computed the following summary information, where x = entrance exam score and y = GPA after one year: and

11. Refer to Students’ GPA Narrative. Find the correlation between the entrance exam score and the grade point average after one year of law school. ANS: Since , then the correlation is BLM: Higher Order - Apply

PTS:

REF: 107-108 | 112

TOP: 1–4

12. Refer to Students’ GPA Narrative. Interpret the correlation coefficient found in the previous question as to whether it could be used to predict students’ GPAs after the first year. Justify your answer. ANS: There is a strong positive linear relationship between the score on the entrance exam and grade point average after one year of law school. Thus, the exam would be effective/useful in predicting GPA after one year. PTS: 1 REF: 107-108 BLM: Higher Order - Understand

TOP: 1–4

13. Refer to Students’ GPA Narrative. Find the best-fitting line relating grade point average after one year of law school and score on the entrance exam. ANS: Since b = 0.0583 and a = –1.7868, then ŷ = –1.7868 + 0.0583x. PTS: 1 REF: 110-112 BLM: Higher Order - Apply

TOP: 1–4

14. Refer to Students’ GPA Narrative. If a student scored 91 on the entrance exam, what would you predict the student’s grade point average to be after one year of law school? ANS: ŷ = –1.7868 + 0.0583(91) = 3.5185 PTS: 1 REF: 110-111 BLM: Higher Order - Apply

TOP: 1–4

Gasoline Prices and Fuel Efficiency Narrative. When the price of gasoline gets high, consumers become very concerned about the gas mileage obtained by their cars. One consumer was interested in the relationship between car engine size (number of cylinders) and gas mileage (miles/gallon). The consumer took a random sample of 7 cars and recorded the following information: n = 7, Σxi = 24.7, Σyi = 177, Σxiyi = 600.7, sx = 1.2406, and sy = 4.3861 15. Refer to Gasoline Prices and Fuel Efficiency Narrative. Would you expect the correlation between engine size and fuel efficiency to be positive or negative?

ANS: One would expect the correlation to be positive; as the size of the engine gets bigger, the more gas it will use. PTS: 1 REF: 107-108 BLM: Higher Order - Understand

TOP: 1–4

16. Refer to Gasoline Prices and Fuel Efficiency Narrative. Find the correlation between engine size and fuel efficiency. ANS:

Since

, then the correlation is

PTS: 1 REF: 107-108 | 112 BLM: Higher Order - Apply

TOP: 1–4

17. Refer to Gasoline Prices and Fuel Efficiency Narrative. Find the best-fitting line relating car engine size and fuel efficiency. ANS: ŷ = –3.115 + 2.4702x litres per 100 km. PTS: 1 REF: 110-112 BLM: Higher Order - Apply

TOP: 1–4

18. Refer to Gasoline Prices and Fuel Efficiency Narrative. What fuel efficiency would you predict for a car with a 6-cylinder engine? ANS: ŷ = –3.115 + 2.4702(6) = 11.71 litres per 100 km PTS: 1 REF: 110-111 BLM: Higher Order - Understand

TOP: 1–4

Soft Drink Sales Narrative A soft drink distributor was interested in examining the relationship between the number of ads (x) for her product during prime time on a local television station and the number of sales per week (y) in 1000s of cases. She compiled the figures for 20 weeks and computed the following summary information: n = 20, and

19. Refer to Soft Drink Sales Narrative. Find the correlation coefficient for the number of ads during prime time and weekly sales. ANS:

Since

, then

PTS: 1 REF: 107-108 | 112 BLM: Higher Order - Apply

TOP: 1–4

20. Refer to Soft Drink Sales Narrative. Find the best-fitting line relating the number of ads during prime time and weekly sales. ANS: Since b = 1.8966 and a = 0.1256, then ŷ = 0.1256 + 1.8966x. PTS: 1 REF: 110-112 BLM: Higher Order - Apply

TOP: 1–4

21. Refer to Soft Drink Sales Narrative. If the soft drink distributor ran 21 TV ads per week for her product, what would you predict her sales to be? ANS: ŷ = 0.1256 + 1.8966(21) = 39.9542 thousand cases PTS: 1 REF: 110-111 BLM: Higher Order - Apply

TOP: 1–4

Weekly Amount Spent on Groceries Narrative The number of household members, x, and the amount spent on groceries per week, y, rounded to the nearest dollar, are measured for eight households in the Lakehead area. The data are shown below:

x   y
5   140
2   50
2   55
1   35
4   95
3   70
5   130
3   65

22. Refer to Weekly Amount Spent on Groceries Narrative. Draw a scatterplot of these eight data points. ANS:

PTS: 1 REF: 105-106 BLM: Higher Order - Apply

TOP: 1–4

23. Refer to Weekly Amount Spent on Groceries Narrative. Find the best-fitting regression line for these data. ANS: ŷ = 0.1681 + 25.546x PTS: 1 REF: 110-112 BLM: Higher Order - Apply

TOP: 1–4
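The fitted line for the grocery data can be reproduced from the eight (x, y) pairs with the usual least-squares formulas, b = Sxy / Sxx and a = ȳ – b·x̄. The Python sketch below is one way to carry out that arithmetic; the function name `least_squares` is illustrative.

```python
def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    b = s_xy / s_xx
    a = mean_y - b * mean_x
    return a, b


members = [5, 2, 2, 1, 4, 3, 5, 3]
spending = [140, 50, 55, 35, 95, 70, 130, 65]
a, b = least_squares(members, spending)
print(round(a, 4), round(b, 3))   # 0.1681 25.546
print(round(a + b * 7, 2))        # 178.99, prediction for a household of seven
```

A household of seven lies outside the observed range of x (1 to 5), so the last value is an extrapolation and should be read with caution.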

24. Refer to Weekly Amount Spent on Groceries Narrative. Plot the points and the best-fitting line on the same graph. ANS:

PTS: 1 REF: 110-112 BLM: Higher Order - Apply

TOP: 1–4

25. Refer to Weekly Amount Spent on Groceries Narrative. What would you estimate a household of seven to spend on groceries per week? Should you use the fitted line to estimate this amount? Why or why not? ANS: When x = 7, the estimated value of y is ŷ = 0.1681 + 25.546(7) = 178.99. However, it is risky to try to estimate the value of y for a value of x outside the experimental region, that is, the range of x values for which you have collected data. PTS: 1 REF: 110-111 BLM: Higher Order - Understand

TOP: 1–4

Meet Your Match Media Narrative The Executive Board of a popular online dating website, called Meet Your Match, wanted to know if there was a relationship between the age of a client and the number of persons they made contact with through the site. To find out, a random sample of 50 users from each of the 6 delineated age groups was drawn from its database of active participants. The users were then surveyed as to how many persons had responded to the introduction email during the previous 3-month period. The average number of initial responses for each of the age groups was then recorded and organized into the following table.

Group Label | User Age Group in Years | Average Number of Contacts Made

15 | < 20 years | 15

25 | 20 to < 30 | 22

35 | 30 to < 40 | 10

45 | 40 to < 50 | 8

55 | 50 to < 60 | 15

65 | ≥ 60 years | 11

26. Refer to Meet Your Match Media Narrative. What are the experimental units? ANS: The people drawn from the database of active participants represent the experimental units. PTS: 1 REF: 10 BLM: Higher Order - Understand

TOP: 1–3

27. Refer to Meet Your Match Media Narrative. What is the sample size? ANS: 6 * 50 = 300

PTS: 1 REF: 10 BLM: Higher Order - Apply

TOP: 4–5

28. Refer to Meet Your Match Media Narrative. What are the two primary variables of interest in this experiment? Are they qualitative or quantitative? ANS: The variables are age group of user and average number of contacts made for each age group. Both are quantitative. PTS: 1 REF: 12-13 BLM: Higher Order - Understand

TOP: 1–3

29. Refer to Meet Your Match Media Narrative. Of all the graphical techniques available for displaying data, which would be the most useful to determine if any kind of relationship existed between the two variables? Justify your answer. ANS: Scatter plot because there are two quantitative variables to be compared. PTS: 1 REF: 105-106 BLM: Higher Order - Analyze

TOP: 1–4

Chapter 4A—Probability and Probability Distributions MULTIPLE CHOICE 1. What is the set of all simple events of an experiment called? a. a compound event b. a sample space c. a population d. a random sample ANS: B BLM: Remember

PTS: 1

REF: 131-132

TOP: 1–4

2. Which of the following is a useful graphical method for displaying the sample space of an experiment? a. a tree diagram b. a box plot c. a histogram d. a scatterplot ANS: A BLM: Remember

PTS: 1

REF: 133-134

TOP: 1–4

3. Which of the following correctly describes experiments? a. They are two random events, A and B, such that the probability of one event is not affected by the occurrence of the other event; therefore, P(A) = P(A|B). b. They are different events that have no outcomes in common. c. They are activities that result in one and only one of several clearly defined possible outcomes and from which one may not predict, in advance, which of these outcomes will prevail in any particular instance. d. They are different events that have different outcomes. ANS: C BLM: Remember

PTS: 1

REF: 131

TOP: 1–4

4. Which of the following may be used to represent the sample space of an experiment? a. a joint probability table b. the additive rule of probability c. the multiplicative rule of probability d. a tree diagram ANS: D BLM: Remember

PTS: 1

REF: 133-134

TOP: 1–4

5. What is any subset of the sample space called? a. an event b. an experiment c. a mutually exclusive event d. independent events ANS: A BLM: Remember

PTS: 1

REF: 131-132

TOP: 1–4

6. Suppose that an experiment consists of tossing three unbiased coins simultaneously. How many simple events are contained in this experiment? a. 3 b. 6 c. 8 d. 9 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 141-142

TOP: 1–4

7. Suppose you are told that an experiment consists of three stages and that there are three ways to accomplish the first stage, four ways to accomplish the second stage, and five to accomplish the third stage. What would be the number of ways to accomplish the experiment? a. 12 b. 15 c. 20 d. 60 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 141-142

TOP: 1–4

8. How many ways can one choose a combination of three items out of eight distinct items? a. 28 b. 56 c. 112 d. 224 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 143-144

TOP: 1–4
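The counting answers to questions 6 through 8 can be reproduced directly. The sketch below uses only Python's standard `math` module; the variable names are illustrative.

```python
import math

# Question 6: each of 3 coins has 2 outcomes, so the mn rule gives 2 * 2 * 2.
coin_outcomes = 2 ** 3

# Question 7: stages with 3, 4, and 5 options multiply together.
stage_ways = math.prod([3, 4, 5])

# Question 8: unordered choice of 3 items from 8 distinct items.
combos = math.comb(8, 3)

print(coin_outcomes, stage_ways, combos)  # 8 60 56
```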

9. Which of the following is the best description of an event? a. an experiment that is not controlled by the decision maker b. the list of all possible simple events of an experiment c. a collection of one or more simple events d. a collection of two or more simple events ANS: C BLM: Remember

PTS: 1

REF: 131-132

TOP: 1–4

10. When Cynthia enters a grocery store, there are three simple events: buy nothing, buy a small amount, or buy a large amount. In this situation, if Cynthia buys a small amount, she cannot also buy a large amount or buy nothing. How may one best classify these three events? a. They are mutually exclusive events. b. They are not mutually exclusive events. c. They are dependent events. d. They are independent events. ANS: A TOP: 1–4

PTS: 1 REF: 131-132 | 150 BLM: Higher Order - Analyze

11. What does the notation A^c represent? a. the union of two events b. the intersection of two events c. the complement of an event d. the additive rule of probability ANS: C BLM: Remember

PTS: 1

REF: 146-147

TOP: 5–6

12. What is the sum of the probability of an event and the probability of its complement? a. –1 b. 0 c. 1 d. any value between 0 and 1 ANS: C BLM: Remember

PTS: 1

REF: 149

TOP: 5–6

13. Suppose P(A) = 0.4, P(B) = 0.3, and P(A ∩ B) = 0. Which one of the following statements correctly defines the relationship between events A and B? a. Events A and B are independent, but not mutually exclusive. b. Events A and B are mutually exclusive, but not independent. c. Events A and B are neither mutually exclusive nor independent. d. Events A and B are both mutually exclusive and independent. ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 155-157

TOP: 5–6

14. If events A and B are mutually exclusive, then what is the probability of both events occurring simultaneously? a. –1 b. 0 c. 1 d. any value between 0 and 1 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 132 | 148

TOP: 5–6

15. If P(A/B) = P(A), or P(B/A) = P(B), which of the following best describes the events A and B? a. They are mutually exclusive. b. They are disjoint. c. They are independent. d. They are dependent. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 155-157

TOP: 5–6

16. If P(A) = 0.80, P(B) = 0.70, and P(A ∪ B) = 0.90, then what is the value of P(A ∩ B)? a. 0.56 b. 0.60 c. 0.63 d. 0.72 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 148

TOP: 5–6

17. If P(A) = 0.30, P(B) = 0.40, and P(A ∩ B) = 0.20, what is the value of P(A/B)? a. 0.08 b. 0.12 c. 0.50 d. 0.67 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 153

TOP: 5–6
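Questions 16 and 17 apply the addition rule and the definition of conditional probability. The following is a small sketch of both calculations, assuming the probabilities stated in the questions; the helper names are illustrative.

```python
def intersection_from_union(p_a, p_b, p_union):
    """Addition rule rearranged: P(A and B) = P(A) + P(B) - P(A or B)."""
    return p_a + p_b - p_union


def conditional(p_intersection, p_given):
    """Conditional probability: P(A|B) = P(A and B) / P(B), for P(B) > 0."""
    if p_given <= 0:
        raise ValueError("conditioning event must have positive probability")
    return p_intersection / p_given


q16 = intersection_from_union(0.80, 0.70, 0.90)
q17 = conditional(0.20, 0.40)
print(round(q16, 2), round(q17, 2))  # 0.6 0.5
```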

18. If P(A) = 0.40, P(B) = 0.30, and P(A ∩ B) = 0.12, then what could you deduce about the events A and B? a. They are dependent events. b. They are independent events. c. They are mutually exclusive events. d. They are disjoint events. ANS: B TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

19. If P(A) = 0.42 and P(B) = 0.38, then what is P(A B)? a. 0.80 b. 0.58 c. 0.04 d. cannot be determined from the information given ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 148 | 151

TOP: 5–6

20. Which of the following best describes all the outcomes (simple events) contained in one or the other of two random events, or possibly in both? a. the events of an experiment b. the intersection of two events c. the probability space of an experiment d. the union of two events ANS: D BLM: Remember

PTS: 1

REF: 146-147

TOP: 5–6

21. Which of the following clearly describes the general multiplicative rule of probability? a. It is a rule of probability theory that is used to compute the probability for the occurrence of a union of two or more events: for any two events, A and B, b. It is a rule of probability theory that is used to compute the probability for the occurrence of a union of two or more events: for any two events A and B, c. It is a rule of probability theory that is used to compute the probability for an intersection of two or more events: for any two events, A and B, and also = P(B) P(A|B) d. It is a rule of probability theory that is used to compute the probability for an intersection of two or more events: for any two events A and B, ANS: C BLM: Remember

PTS: 1

REF: 151

TOP: 5–6

22. Which of the following best describes the concept of marginal probability? a. It is a measure of the likelihood that a particular event will occur, regardless of whether another event occurs. b. It is a measure of the likelihood that a particular event will occur, given the fact that another event has already occurred or is certain to occur. c. It is a measure of the likelihood of the simultaneous occurrence of two or more events. d. It is a direct way for defining the sample space of an experiment. It is a measure of the likelihood that either one or the other out of a possible two events will occur. ANS: A BLM: Remember

PTS: 1

REF: 149-150

TOP: 5–6

23. In the case of independent events A, B, and C, which of the following is equal to P(A ∩ B ∩ C)? a. P(A|B) P(B|C) P(C|A) b. P(A|B) + P(B|C) + P(C|A) c. P(A) P(B) P(C) d. P(A) + P(B) + P(C) ANS: C TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

24. Two events, A and B, are said to be dependent if and only if what condition is true? a. P(A) = P(B). b. P(A) increases along with P(B). c. P(A) increases as P(B) decreases. d. Event A is affected or changed by the occurrence of event B. ANS: D TOP: 5–6

PTS: 1 BLM: Remember

REF: 150 | 154-155

25. A false negative in screening tests (e.g., steroid testing of athletes) represents which of the following events? a. The test is negative for a given condition, given that the person does not have the condition. b. The test is positive for a given condition, given that the person does not have the condition. c. The test is negative for a given condition, given that the person has the condition. d. The test is positive for a given condition, given that the person has the condition. ANS: C BLM: Remember

PTS: 1

REF: 163

TOP: 7

26. A false positive in screening (e.g., home pregnancy tests) may be best described by which of the following events? a. The test is negative for a given condition, given that the person does not have the condition. b. The test is positive for a given condition, given that the person does not have the condition. c. The test is negative for a given condition, given that the person has the condition. d. The test is positive for a given condition, given that the person has the condition. ANS: B BLM: Remember

PTS: 1

REF: 163

TOP: 7

27. Screening tests (e.g., HIV testing) are evaluated on the probability of a false negative or a false positive. How may one classify these probabilities? a. They are both conditional probabilities. b. They measure the probability of the intersection of two events. c. They measure the probability of the union of two events. d. They are both marginal probabilities. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 163

TOP: 7

28. Which of these statements is a property of the probability distribution for a discrete random variable x ? a. The probabilities must be nonnegative. b. The probabilities must sum to 1. c. The random variable must take on positive values between 0 and 1. d. Both (a) and (b) are true. ANS: D BLM: Remember

PTS: 1

REF: 171

TOP: 8

29. What is the term for a table, formula, or graph showing all possible values that a random variable x can assume, together with their associated probabilities p(x)? a. a discrete probability distribution b. a continuous probability distribution c. a bivariate probability distribution d. the law of total probability ANS: A BLM: Remember

PTS: 1

REF: 170-171

TOP: 8

30. What would be the expected number of heads turning up in 500 tosses of an unbiased coin? a. 150 b. 200 c. 250 d. 300 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 172-175

TOP: 8

31. Which of the following is a (are) required condition(s) for the distribution of a discrete random variable that can assume values a.

b. c. Both (a) and (b) are required conditions. d. The population data of the distribution must be quantitative. ANS: C BLM: Remember

PTS: 1

REF: 171

TOP: 8

32. Which of the following correctly describes the nature of discrete quantitative variables? a. They can assume values only at specific points on a scale of values, with inevitable gaps between successive observations. b. When dealing with such variables, we can count all possible observations and, with some exceptions, that count leads to a finite result. c. Discrete quantitative variables are correctly described by both (a) and (b). ANS: C BLM: Remember

PTS: 1

REF: 170

TOP: 8

33. What may NOT be said about the mean of a discrete random variable x? a. It is denoted by μ. b. It is the middle value of its probability distribution. c. It is denoted by E(x), because it is the value of x one can expect to find, on average, by numerous repetitions of the random experiment that generates the variable’s actual values. d. It is correctly described by both (a) and (b). ANS: B BLM: Remember

PTS: 1

REF: 172-173

TOP: 8

34. The probability distribution of the number of accidents in North York, Ontario, each day is given by

x: 0 1 2 3 4 5

P(x): 0.20 0.15 0.25 0.15 0.20 0.05

Which of the following could be used to describe this distribution? a. a continuous probability distribution b. a discrete probability distribution c. a conditional probability distribution d. an expected value distribution ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 170-172

TOP: 8

35. The probability distribution of the number of accidents in North York, Ontario, each day is given by

x: 0 1 2 3 4 5

P(x): 0.20 0.15 0.25 0.15 0.20 0.05

Based on this distribution, what would be the expected number of accidents on a given day? a. 4.62 b. 2.15 c. 1.81 d. 1.47 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 172-173

TOP: 8
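The expected value asked for in question 35 is the probability-weighted sum of the x values. The sketch below computes it from the North York table and, for reference, the standard deviation of the same distribution; the dictionary `dist` is simply an illustrative encoding of that table.

```python
import math

# Daily accident counts and their probabilities, copied from the table above.
dist = {0: 0.20, 1: 0.15, 2: 0.25, 3: 0.15, 4: 0.20, 5: 0.05}

# E(x) = sum of x * p(x); variance = sum of (x - mean)^2 * p(x).
mean = sum(x * p for x, p in dist.items())
variance = sum((x - mean) ** 2 * p for x, p in dist.items())
std_dev = math.sqrt(variance)

print(round(mean, 2), round(std_dev, 2))  # 2.15 1.53
```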

36. The probability distribution of the number of accidents in North York, Ontario, each day is given in the table below:

x: 0 1 2 3 4 5

P(x): 0.20 0.15 0.25 0.15 0.20 0.05

Based on this distribution, what is the approximate value of the standard deviation of the number of accidents per day? a. 6.95 b. 2.64 c. 2.33 d. 1.53 ANS: D BLM: Higher Order - Apply

PTS: 1

REF: 172-173

TOP: 8

TRUE/FALSE 1. Probability is the tool that allows the statistician to use sample information to make inferences about or describe the population from which the sample was drawn. ANS: T BLM: Remember

PTS: 1

REF: 131

TOP: 1–4

2. Statistics provides ways to reason from the population to the sample, whereas probability acts in reverse, moving from the sample to the population. ANS: F BLM: Remember

PTS: 1

REF: 131

TOP: 1–4

3. The probability of an event A is equal to the sum of the probabilities of the simple events contained in A. ANS: T BLM: Remember

PTS: 1

REF: 134-136

TOP: 1–4

4. Relative frequency histograms are constructed for a sample of n measurements drawn from the population, whereas the probability histogram is constructed as a model for the entire population of measurements. ANS: T TOP: 1–4

PTS: 1 BLM: Remember

REF: 134 | 171-172

5. Suppose that an experiment consists of tossing four unbiased coins simultaneously. The number of simple events in this experiment is 16. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 141

TOP: 1–4

6. Combinations are distinguishable ordered arrangements of items, all of which have been drawn from a given group of items. ANS: F BLM: Remember

PTS: 1

REF: 143-144

TOP: 1–4

7. An experiment is any activity that results in one and only one of several clearly defined possible outcomes, but does not allow us to tell in advance which of these will prevail in any particular instance. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 131

TOP: 1–4

8. An event is a collection of one or more simple events of an experiment. ANS: T BLM: Remember

PTS: 1

REF: 131-132

TOP: 1–4

9. A tree diagram is a listing of all the simple events of an experiment. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 133-134

TOP: 1–4

10. Different events that have no outcomes in common are mutually exclusive events. ANS: T BLM: Remember

PTS: 1

REF: 132 | 148

TOP: 1–4

11. Invariably, Venn diagrams illustrate the intersection of two events. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 132-133

TOP: 1–4

12. In general, the simple events of an experiment take on values between 0 and 1.0, inclusive. ANS: F BLM: Remember

PTS: 1

REF: 131-132

TOP: 1–4

13. If an investor were interested in assessing the probability that a new supermarket will be successful in a Calgary market area, he would most likely use the relative frequency definition of probability as the method for assessing the probability of success. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 134-136

TOP: 1–4

14. Suppose that a patient who is complaining of several specific symptoms arrives at a doctor’s office and the doctor says that she is 90% certain that the patient has the flu. In this case, it is likely that she is basing her assessment on the relative frequency approach of assigning probabilities. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 134-136

TOP: 1–4

15. An experiment is the process by which an observation or measurement is obtained. ANS: T BLM: Remember

PTS: 1

REF: 131

TOP: 1–4

16. The experiment of tossing a single coin once contains one simple event. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 131-132

TOP: 1–4

17. The experiment of rolling a single die once contains six simple events. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 131-132

TOP: 1–4

18. The experiment of spinning the Monte Carlo roulette wheel once contains 27 simple events. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 131-132

TOP: 1–4

19. The experiment of drawing a single card once from a standard deck contains four events. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 131-132

TOP: 1–4

20. The probability of getting the king of diamonds when randomly drawing a card from a well-shuffled deck is 1/52. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 134-135

TOP: 1–4

21. The probability of getting a black card when randomly drawing a card from a well-shuffled deck is 1/2. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 134-135

TOP: 1–4

22. The probability of getting a 15 when randomly drawing a card from a well-shuffled deck is 4/52. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 134-135

TOP: 1–4

23. The probability of getting two heads when tossing a fair coin twice is 1/4. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 135

TOP: 1–4

24. The sum of the probabilities for all simple events in the sample space equals 1. ANS: T BLM: Remember

PTS: 1

REF: 135

TOP: 1–4

25. The intersection of events A and B is the event that A or B or both occur. ANS: F BLM: Remember

PTS: 1

REF: 146-147

TOP: 5–6

26. The complement of an event A, denoted by A^c, consists of all the simple events in the sample space S that are not in A. ANS: T BLM: Remember

PTS: 1

REF: 146-147

TOP: 5–6

27. The conditional probability of event B, given that event A has occurred, is defined by P(B|A) = P(A ∩ B)/P(A). ANS: T BLM: Remember

PTS: 1

REF: 153

TOP: 5–6

28. Two events A and B are said to be independent if and only if P(A/B) = P(B) or P(B/A) = P(A). ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

29. If P(A) > 0, P(B) > 0, and P(A ∩ B) = 0, then the events A and B are independent. ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

30. If P(A) = 0.4, P(B) = 0.5, and P(A ∩ B) = 0.20, then the events A and B are mutually exclusive.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 148

TOP: 5–6

31. If P(A) > 0 and P(B) > 0, then when A and B are mutually exclusive events, they are also dependent events. ANS: T BLM: Remember

PTS: 1

REF: 155-157

TOP: 5–6

32. If P(A) = 0.3, P(A ∪ B) = 0.7, and P(A ∩ B) = 0.2, then P(B) = 0.2. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 148

TOP: 5–6

33. If P(A) = 0.4, P(B) = 0.5, and P(A ∪ B) = 0.7, then P(A ∩ B) = 0.2. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 148

TOP: 5–6

34. If P(A) = 0.60, P(B) = 0.40, and P(B/A) = 0.60, then P(A/B) = 0.24. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 151-153

TOP: 5–6
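Statement 34 is false because P(A/B) has to be computed through the intersection. The following is a brief sketch of that check, assuming the three probabilities given in the statement; the variable names are illustrative.

```python
p_a, p_b, p_b_given_a = 0.60, 0.40, 0.60

# Multiplication rule: P(A and B) = P(A) * P(B|A).
p_a_and_b = p_a * p_b_given_a

# Definition of conditional probability: P(A|B) = P(A and B) / P(B).
p_a_given_b = p_a_and_b / p_b

print(round(p_a_and_b, 2), round(p_a_given_b, 2))  # 0.36 0.9
```

The value 0.24 in the statement is P(A) × P(B), not P(A/B).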

35. If A and B are independent events with P(A) = 0.30 and P(B) = 0.50, then P(A/B) is 0.15. ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Apply

36. The probability that event A will not occur is 1 – ANS: F BLM: Remember

PTS: 1

REF: 149

TOP: 5–6

37. If P(A/B) = P(A), then events A and B are said to be independent. ANS: T TOP: 5–6

PTS: 1 BLM: Remember

REF: 150 | 153-155

38. Conditional probability is the probability that an event will occur, with no other events taken into consideration. ANS: F BLM: Remember

PTS: 1

REF: 151-153

TOP: 5–6

39. If A and B are two independent events with P(A) = 0.25 and P(B) = 0.45, then P(A B) = 0.70. ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Apply

40. If P(A) = 0, P(B) = 0.4, and P(A ∩ B) = 0, then events A and B are independent. ANS: T TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

41. Two events A and B are said to be independent if P(A B) = P(A) + P(B). ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

42. Two events A and B are said to be mutually exclusive if P(A ∩ B) = 0. ANS: T BLM: Remember

PTS: 1

REF: 148

TOP: 5–6

43. Suppose A and B are mutually exclusive events where P(A) = 0.1 and P(B) = 0.7, then P(A ∩ B) = 0.8. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 148

TOP: 5–6

44. Suppose A and B are mutually exclusive events where P(A) = 0.2 and P(B) = 0.3, then P(A ∪ B) = 0.5. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 148

TOP: 5–6

45. Suppose A and B are events where P(A) = 0.4, P(B) = 0.5, and P(A ∩ B) = 0.2. Then P(B/A) = 0.4. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 151-153

TOP: 5–6

46. All the outcomes contained in one or the other of two events (or possibly in both) constitute the union of two events. ANS: T BLM: Remember

PTS: 1

REF: 146-147

TOP: 5–6

47. The additive rule of probability is used to compute the probability of an intersection of two or more events. In other words, given two events A and B, P(A ∩ B) = P(A) P(B|A) and also P(A ∩ B) = P(B) P(A|B). ANS: F BLM: Remember

PTS: 1

REF: 148 | 153

TOP: 5–6

48. The addition law of probability theory is used to compute the probability for the occurrence of a union of two or more events; namely, given two events A and B, P(A ∪ B) = P(A) + P(B) − P(A ∩ B). ANS: T BLM: Remember

PTS: 1

REF: 148

TOP: 5–6

49. If A and B are independent events, they are also mutually exclusive. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 155-157

TOP: 5–6

50. If A and B are dependent events, they are also mutually exclusive. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 155-157

TOP: 5–6

51. If P(A/B) = P(A ∩ B), then A and B are independent events. ANS: F TOP: 5–6

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Understand

52. If A and B are mutually exclusive events, then A ∩ B can never occur on the same trial of an experiment. ANS: T BLM: Remember

PTS: 1

REF: 132 | 148

TOP: 5–6

53. Bayes’ Rule is a formula for revising an initial subjective (prior) probability value on the basis of results obtained by an empirical investigation, thereby obtaining a new (posterior) probability value. ANS: T BLM: Remember

PTS: 1

REF: 164

TOP: 7

54. Given a set of events that are mutually exclusive and exhaustive and an event A, the law of total probability states that P(A) can be expressed as

ANS: T BLM: Remember

PTS: 1

REF: 162

TOP: 7

55. A false positive in screening tests is the event that the test is negative for a given condition, given that the person has the condition. ANS: F BLM: Remember

PTS: 1

REF: 163

TOP: 7

56. A false negative in screening tests is the event that the test is positive for a given condition, given that the person does not have the condition.

ANS: F BLM: Remember

PTS: 1

REF: 163

TOP: 7

57. The probability distribution for a discrete variable x is a formula, a table, or a graph providing or showing p(x), the probability associated with each of the values of x. ANS: T BLM: Remember

PTS: 1

REF: 170-172

TOP: 8

58. A random variable is any variable for which the numerical value is determined by a random experiment, and, thus, by chance. ANS: T BLM: Remember

PTS: 1

REF: 170

TOP: 8

59. A set of all possible values of a discrete random variable is countable, because the variable can assume values only at specific points on a scale of values, with inevitable gaps in between. ANS: T BLM: Remember

PTS: 1

REF: 170-172

TOP: 8

60. A table, graph, or formula that associates each possible value of a discrete random variable, x, with its probability of occurrence, p(x), is called a discrete probability distribution. ANS: T BLM: Remember

PTS: 1

REF: 170-172

TOP: 8

61. There are two types of random variables: discrete and continuous. ANS: T BLM: Remember

PTS: 1

REF: 170

TOP: 8

62. If x is a discrete random variable, then x can take on only one of two possible values. ANS: F BLM: Remember

PTS: 1

REF: 170

TOP: 8

63. The number of students living off-campus is an example of a discrete random variable. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 170

TOP: 8

64. The time required to assemble a computer could be classified as a discrete random variable. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 170

TOP: 8

65. The number of homes sold by a real estate agency in a three-week period is an example of a continuous random variable. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 170

TOP: 8

66. If the mean of the random variable x is larger than the mean of the random variable y, then the variance of x must be larger than the variance of y. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 172-173

TOP: 8

67. The graph of a discrete random variable looks like a histogram in which the probability of each possible outcome is represented by a bar. ANS: T BLM: Remember

PTS: 1

REF: 170-172

TOP: 8

68. The weight of a box of candy bars is an example of a discrete random variable since there are only a specific number of bars in the box. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 170

TOP: 8

69. The mean of a discrete probability distribution is equal to the square root of the variance. ANS: F BLM: Remember

PTS: 1

REF: 172-173

TOP: 8

70. The expected value of a discrete probability distribution is its long-run average value, if the experiment is to be repeated many times. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 172-173

TOP: 8

71. The standard deviation of a discrete probability distribution measures the average variation of the random variable from the mean. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 172-173

TOP: 8

Chapter 4B—Probability and Probability Distributions PROBLEM Window Frame Supplier A manufacturing firm producing odd-sized decorative windows buys the window frames from either supplier customer

or supplier

or customer

. The firm sells the finished window to either

1. Refer to Window Frame Supplier paragraph. Describe the sample space; that is, list all possible supplier–customer combinations that a given finished window might represent. ANS: The sample space is S = {(

), (

PTS: 1 REF: 131-132 BLM: Higher Order - Apply

), (

)}.

TOP: 1–4

2. Refer to Window Frame Supplier paragraph. If each supplier–customer combination in the previous question is equally likely to occur, what is the probability that a randomly chosen finished window is sold to customer

ANS: 2/4 = 0.50 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

Political Opinions Narrative A political scientist asked a group of people how they felt about two political policy statements. Each person was to respond either A (agree), N (neutral), or D (disagree) to each policy statement. 3. Refer to Political Opinions Narrative. Describe the sample space; that is, list all possible response combinations to the two statements. ANS: The sample space is S = {AA, AN, AD, NA, NN, ND, DA, DN, DD}. PTS: 1 REF: 131-134 BLM: Higher Order - Apply

TOP: 1–4

4. Refer to Political Opinions Narrative. Assuming each response combination in the sample space is equally likely, what is the probability the person being interviewed agrees with at least one of the two policy statements?

ANS: 5/9 ≈ 0.556 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

5. Refer to Political Opinions Narrative. Assuming each response combination in the sample space is equally likely, what is the probability the person being interviewed agrees with exactly one of the two political policy statements? ANS: 4/9 ≈ 0.444 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

6. Refer to Political Opinions Narrative. Assuming each response combination in the sample space is equally likely, what is the probability the person being interviewed agrees with both of the two political policy statements? ANS: 1/9 ≈ 0.111 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4
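The answers to questions 3 through 6 follow from listing the nine equally likely response pairs and counting. The following is a minimal sketch of that enumeration; the helper `prob` is illustrative and assumes every simple event has the same probability, as the questions state.

```python
from fractions import Fraction
from itertools import product

responses = "AND"  # A = agree, N = neutral, D = disagree
sample_space = ["".join(pair) for pair in product(responses, repeat=2)]


def prob(event):
    """Probability of an event when all simple events are equally likely."""
    favourable = [s for s in sample_space if event(s)]
    return Fraction(len(favourable), len(sample_space))


at_least_one = prob(lambda s: s.count("A") >= 1)
exactly_one = prob(lambda s: s.count("A") == 1)
both = prob(lambda s: s.count("A") == 2)
print(at_least_one, exactly_one, both)  # 5/9 4/9 1/9
```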

Driver Education Narrative Three randomly chosen 14-year-old middle school students who had not yet taken driver’s education classes were given the written part of the Manitoba Driver’s Exam. Each student was graded as passing (P) or failing (F) the written exam. 7. Refer to Driver Education Narrative. Describe the sample space; that is, list all possible combinations of the three students’ grades. ANS: The sample space is S = {PPP, PPF, PFP, FPP, PFF, FPF, FFP, FFF}. PTS: 1 REF: 131-134 BLM: Higher Order - Apply

TOP: 1–4

8. Refer to Driver Education Narrative. Assuming each combination in the sample space is equally likely, what is the probability that all three students fail? ANS: 1/8 = 0.125 BLM: Higher Order - Apply

PTS:

REF: 134-136

TOP: 1–4

9. Refer to Driver Education Narrative. Assuming each combination in the sample space is equally likely, what is the probability at least one student passes the written test? ANS: 7/8 = 0.875 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

10. Refer to Driver Education Narrative. Assume each combination in the sample space is equally likely. Then, if you knew that at least one student passed the test, what is the probability all three students passed the test? ANS: 1/7 ≈ 0.143 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4
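Question 10 is a conditional probability: the sample space shrinks to the seven outcomes in which at least one student passes. The sketch below enumerates the eight pass/fail outcomes and reproduces the answers to questions 8 through 10, assuming equally likely outcomes as the questions state; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All pass (P) / fail (F) outcomes for three students.
sample_space = ["".join(t) for t in product("PF", repeat=3)]

# Reduced sample space: outcomes with at least one pass.
at_least_one_pass = [s for s in sample_space if "P" in s]
all_pass = [s for s in at_least_one_pass if s == "PPP"]

p_all_fail = Fraction(sample_space.count("FFF"), len(sample_space))
p_at_least_one = Fraction(len(at_least_one_pass), len(sample_space))
p_all_given_one = Fraction(len(all_pass), len(at_least_one_pass))
print(p_all_fail, p_at_least_one, p_all_given_one)  # 1/8 7/8 1/7
```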

11. Maria selected two M&M candies at random from a bowl containing three M&Ms. One was red; one was yellow; and the remaining one was orange. Describe the sample space if the sampling is done with replacement. ANS: The sample space is S = {RR, RY, RO, YR, YY, YO, OR, OY, OO}. PTS: 1 REF: 131-134 BLM: Higher Order - Apply

TOP: 1–4

Sales Narrative A salesperson either makes a sale (S) or does not make a sale (N) with each of two potential customers. The simple events and their probabilities are given below.

Simple Event: SS SN NS NN

Probability: 0.08 0.12 0.15 0.65

12. Refer to Sales Narrative. What is the probability that no sales are made? ANS: 0.65

PTS: 1 REF: 137 BLM: Higher Order - Apply

TOP: 1–4

13. Refer to Sales Narrative. What is the probability that at least one sale is made? ANS: 0.08 + 0.12 + 0.15 = 0.35 PTS: 1 REF: 137 BLM: Higher Order - Apply

TOP: 1–4

14. Refer to Sales Narrative. What is the probability that exactly one sale is made? ANS: 0.12 + 0.15 = 0.27 PTS: 1 REF: 137 BLM: Higher Order - Apply

TOP: 1–4

15. Refer to Sales Narrative. What is the probability that exactly two sales were made? ANS: 0.08 PTS: 1 REF: 137 BLM: Higher Order - Apply

TOP: 1–4

Job Applicants Narrative Five applicants apply for two jobs. Applicants A and B are male; applicants C, D, and E are female. The personnel officer selects two applicants at random to fill the two jobs. 16. Refer to Job Applicants Narrative. List all possible combinations of the five applicants for the two different jobs. ANS: {AB, AC, AD, AE, BA, BC, BD, BE, CA, CB, CD, CE, DA, DB, DC, DE, EA, EB, EC, ED} PTS: 1 REF: 131-132 BLM: Higher Order - Apply

TOP: 1–4

17. Refer to Job Applicants Narrative. If the two jobs are different, what is the probability that the successful job applicants include at least one male? ANS: 14/20 = 0.7

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

18. Refer to Job Applicants Narrative. If the two jobs are different, what is the probability that the successful job applicants include exactly one male? ANS: 12/20 = 0.6

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

19. Refer to Job Applicants Narrative. If the two jobs are different, what is the probability that the successful job applicants include at least one female? ANS: 18/20 = 0.9

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

20. Refer to Job Applicants Narrative. If the two jobs are different, what is the probability that the successful job applicants include exactly one female? ANS: 12/20 = 0.6

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4
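Worked check for Questions 16 to 20: a minimal sketch that lists the 20 ordered assignments of two different jobs to applicants A to E and counts the outcomes in each event. The helper names (`n_males`, `prob`) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import permutations

applicants = "ABCDE"
males = {"A", "B"}  # applicants C, D, and E are female

# The two jobs are different, so order matters: 5 * 4 = 20 outcomes.
outcomes = list(permutations(applicants, 2))

def n_males(outcome):
    """Number of male applicants among the two selected."""
    return sum(1 for a in outcome if a in males)

def prob(event):
    """Probability of an event when all ordered pairs are equally likely."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

p_at_least_one_male = prob(lambda o: n_males(o) >= 1)
p_exactly_one_male = prob(lambda o: n_males(o) == 1)
p_at_least_one_female = prob(lambda o: n_males(o) <= 1)
p_exactly_one_female = prob(lambda o: n_males(o) == 1)
print(p_at_least_one_male, p_exactly_one_male, p_at_least_one_female)
```

With two people selected, "exactly one female" and "exactly one male" describe the same outcomes, which is why Questions 18 and 20 share the answer 0.6.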

21. A sample space S consists of five simple events with the following probabilities: P(

) = P(

) = 0.20, P(

) = 0.45, and P(

) = 2P(

Find the probabilities for simple events

b. c. d.

Find the probabilities for these two events: A: , , and B: , List the simple events that are in either event A or event B or both. List the simple events that are in both event A and event B.

and

). . .

ANS: a. 2P(

Since

, then P(

), then 3 P(

) + P(

) = 1 – (0.20 + 0.20 + 0.45) = 0.15. But P(

) = 0.15. This implies that P(

b. P(A) = P( ) + P( = 0.20 + 0.10 = 0.30

) + P(

A B={

}

) = 0.05, and P(

) = 0.10

) = 0.20 + 0.45 + 0.10 = 0.75, and P(B) = P(

) + P(

)

}

PTS: 1 REF: 131-132 | 146-147 BLM: Higher Order - Apply

TOP: 1–4

Coffee Brands Narrative A food company plans to conduct an experiment to compare its brand of coffee with that of two competitors. A single person is hired to taste each of three brands of coffee, which are unmarked except for identifying symbols, A, B, and C.

22. Refer to Coffee Brands Narrative. Define the experiment. ANS: Experiment: A taster tastes and marks three varieties of coffee, A, B, and C, according to preference.

PTS: 1 REF: 131 BLM: Higher Order - Understand

TOP: 1–4

23. Refer to Coffee Brands Narrative. List the simple events in the sample space S. ANS: Simple events in S are in triplet form, where 1 is assigned to the most desirable, 2 to the next most desirable, and 3 to the least desirable. The six simple events are (1, 2, 3), (1, 3, 2), (2, 1, 3), (2, 3, 1), (3, 1, 2), and (3, 2, 1).

PTS: 1 REF: 131-132 BLM: Higher Order - Apply

TOP: 1–4

24. Refer to Coffee Brands Narrative. If the taster has no ability to distinguish difference in taste among coffees, what is the probability that the taster will rank coffee type C as the most desirable? As the least desirable? ANS: Define the events. D: variety C is ranked first (i.e., most desirable). F: variety C is ranked third (i.e., least desirable). Each event contains two of the six equally likely simple events. Then,

P(D) = 1/6 + 1/6 = 1/3

P(F) = 1/6 + 1/6 = 1/3

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 1–4

25. A graduate student has decided she needs a day at the beach. She will need a swimsuit, a pair of sunglasses, and a beach towel for the occasion. If she has two swimsuits, three pairs of sunglasses, and five beach towels, how many different choices does she have? ANS: Number of different choices = (2)(3)(5) = 30 PTS: 1 REF: 141 | 143-144 BLM: Higher Order - Apply

TOP: 1–4

26. A professor has received a grant to travel to an archaeological dig site. The grant includes funds for three graduate students to accompany the professor. If there are six graduate students available to the professor and all the funds are to be used (i.e., three students will go), how many choices does the professor have? ANS: Number of choices the professor has = 6!/(3! 3!) = 20 PTS: 1 REF: 143-144 BLM: Higher Order - Apply

TOP: 1–4

27. An interior decorator must furnish two offices. Each office must have a desk, a chair, a file cabinet, and two bookcases. At a local office furniture store there are 6 models of desks, 8 models of chairs, 4 models of file cabinets, and 10 models of bookcases, all of which are compatible. (Any desk can be matched with any chair, etc.) How many choices does the decorator have if he wants to select 2 desks, 2 chairs, 2 file cabinets, and 4 bookcases but he doesn’t want to select more than one of any model? ANS: Number of choices the decorator has = [6!/(2! 4!)][8!/(2! 6!)][4!/(2! 2!)][10!/(4! 6!)] = (15)(28)(6)(210) = 529,200 PTS: 1 REF: 141 | 143-144 BLM: Higher Order - Apply

TOP: 1–4

28. A businessman in Hamilton is preparing an itinerary for a visit to five major cities. Each city will be visited once and only once. The distance travelled, and hence the cost of the trip, will depend on the order in which he plans his route. How many different itineraries (and trip costs) are possible? ANS: 5! = (5)(4)(3)(2)(1) = 120

PTS: 1 REF: 141-142 BLM: Higher Order - Apply

TOP: 1–4

29. An Italian restaurant in Québec City offers a special summer menu in which, for a fixed dinner cost, you can choose from one of two salads, one of three entrees, and one of four desserts. How many different dinners are available? ANS: Using the extended mn Rule, the total number of options are (2)(3)(4) = 24. PTS: 1 REF: 141 | 143-144 BLM: Higher Order - Apply

TOP: 1–4

30. Heidi prepares for an exam by studying a list of 15 problems. She can solve 9 of them. For the exam, the instructor selects 7 questions at random from the list of 15. What is the probability that Heidi can solve all 7 problems on the exam? ANS: The instructor can select 7 out of the list of 15 questions in 15!/(7! 8!) = 6,435 ways. If Heidi is to be able to answer all 7 questions, the instructor must choose 7 questions out of the 9 that Heidi can answer, and none of the 6 questions that Heidi cannot answer. The number of ways in which this event can occur is [9!/(7! 2!)][6!/(0! 6!)] = (36)(1) = 36. Hence, the probability that Heidi can answer all 7 questions = 36/6435 = 0.0056

PTS: 1 REF: 141 | 143-144 BLM: Higher Order - Apply

TOP: 1–4

31. How many different combinations of 5 students can be drawn from a class of 25 students? ANS: Number of combinations = 25!/(5! 20!) = 53,130

PTS: 1 REF: 143-144 BLM: Higher Order - Apply

TOP: 1–4

32. How many permutations of 3 colours can be drawn from a group of 20 colours? ANS: Number of permutations = 20!/17! = (20)(19)(18) = 6,840

PTS: 1 REF: 141-142 BLM: Higher Order - Apply

TOP: 1–4
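Worked check for the counting questions (25 to 32): a short sketch using Python's standard `math.comb`, `math.perm`, and `math.factorial` (Python 3.8 or later). The variable names are illustrative; each line restates the counting rule the corresponding question appears to call for.

```python
from fractions import Fraction
from math import comb, factorial, perm

beach_outfits = 2 * 3 * 5                # Q25: extended mn Rule
dig_teams = comb(6, 3)                   # Q26: choose 3 of 6 students, order irrelevant
furniture_sets = comb(6, 2) * comb(8, 2) * comb(4, 2) * comb(10, 4)  # Q27
itineraries = factorial(5)               # Q28: orderings of 5 cities
dinners = 2 * 3 * 4                      # Q29: extended mn Rule
# Q30: all 7 exam questions come from the 9 Heidi can solve, none from the other 6.
p_heidi = Fraction(comb(9, 7) * comb(6, 0), comb(15, 7))
student_groups = comb(25, 5)             # Q31: combinations of 5 from 25
colour_orders = perm(20, 3)              # Q32: permutations of 3 from 20

print(dig_teams, furniture_sets, itineraries, student_groups, colour_orders)
print(p_heidi, round(float(p_heidi), 4))
```

Using `Fraction` for Question 30 keeps the ratio exact (36/6435 in lowest terms) before rounding to four decimals.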

Smoking Habits of Health Club Members Narrative A group of 40 people at a health club were classified according to their gender and smoking habits, as shown in the table below. One person is selected at random from that group of 40 people.

Smoking Habits
Gender        Smoker (S)   Non-smoker (N)   Total
Male (M)      2            24               26
Female (F)    6            8                14
Total         8            32               40

33. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person smokes? ANS: P(S) = 8/40 = 0.20 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

34. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person does not smoke? ANS: P(N) = 32/40 = 0.80 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

35. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person is female and does not smoke? ANS: P(F ∩ N) = 8/40 = 0.20 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

36. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person is male? ANS: P(M) = 26/40 = 0.65

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

37. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person is female? ANS: P(F) = 14/40 = 0.35 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

38. Refer to Smoking Habits of Health Club Members Narrative. What is the probability the person is male and smokes? ANS: P(M ∩ S) = 2/40 = 0.05 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

39. Refer to Smoking Habits of Health Club Members Narrative. If the person is male, what is the probability he smokes? ANS: P(S/M) = P(M ∩ S)/P(M) = 0.05/0.65 = 0.0769

PTS: 1 REF: 134-136 | 151-153 BLM: Higher Order - Apply

TOP: 5–6

40. Refer to Smoking Habits of Health Club Members Narrative. If the person is female, what is the probability she does not smoke? ANS: P(N/F) = P(F ∩ N)/P(F) = 0.2/0.35 = 0.5714

PTS: 1 REF: 134-136 | 151-153 BLM: Higher Order - Apply

TOP: 5–6

Mall Shopper Narrative One hundred shoppers at a local shopping mall were categorized by age and gender as shown in the frequency distribution below. One shopper is selected at random from that group of 100 shoppers.

Age Group
Gender        Under 25 Years   25–40 Years   Over 40 Years   Total
Male (M)      15               13            12              40
Female (F)    24               18            18              60
Total         39               31            30              100

41. Refer to Mall Shopper Narrative. Convert the frequency table shown above into a probability table. ANS:

Age Group
Gender        Under 25 years   25–40 years   Over 40 years   Total
Male (M)      0.15             0.13          0.12            0.40
Female (F)    0.24             0.18          0.18            0.60
Total         0.39             0.31          0.30            1.00

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

42. Refer to Mall Shopper Narrative. What is the probability that the randomly selected shopper is under 25 years of age? ANS: P(under 25) = 0.39

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

43. Refer to Mall Shopper Narrative. What is the probability that the randomly selected shopper is male? ANS: P(M) = 0.40 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

44. Refer to Mall Shopper Narrative. What is the probability that the randomly selected shopper is male and under 25 years of age? ANS: P(M ∩ under 25) = 0.15

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

45. Refer to Mall Shopper Narrative. If the randomly selected shopper is male, what is the probability he is under 25 years of age? ANS: P(under 25/M) = P(M ∩ under 25)/P(M) = 0.15/0.40 = 0.375

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

46. Refer to Mall Shopper Narrative. If the randomly selected shopper is under 25 years of age, what is the probability that the shopper is male? ANS: P(M/under 25) = P(M ∩ under 25)/P(under 25) = 0.15/0.39 = 0.3846

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

47. Refer to Mall Shopper Narrative. What is the probability that the randomly selected shopper is either female or over 40 years of age? ANS: P(F ∪ over 40) = P(F) + P(over 40) – P(F ∩ over 40) = 0.60 + 0.30 – 0.18 = 0.72

PTS: 1 REF: 134-136 | 148 BLM: Higher Order - Apply

TOP: 5–6

48. Refer to Mall Shopper Narrative. If the randomly selected shopper is female, what is the probability that she is 25 to 40 years old? ANS: P(25–40 years/F) = P(F ∩ 25–40 years)/P(F) = 0.18/0.60 = 0.30

PTS: 1 REF: 134-136 | 151-153 BLM: Higher Order - Apply

TOP: 5–6

49. Refer to Mall Shopper Narrative. Are the gender of the shopper and the shopper’s age mutually exclusive events? Explain. ANS: No, the gender of the shopper and the shopper’s age are not mutually exclusive events. For example, P(F ∩ over 40) = 0.18 ≠ 0.

PTS: 1 REF: 132 | 148 BLM: Higher Order - Analyze

TOP: 5–6

50. Refer to Mall Shopper Narrative. Are the gender of the shopper and the shopper’s age independent events? Explain. ANS: No, gender of the shopper and age are not independent events. For example, the probability P(under 25/M) = 0.375 ≠ P(under 25) = 0.39.

PTS: 1 REF: 154 BLM: Higher Order - Analyze

TOP: 5–6
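Worked check for the Mall Shopper questions: a sketch that stores the joint distribution as counts per 100 shoppers and derives marginal, conditional, and union probabilities from it. The dictionary layout, the age labels, and the helper `p` are illustrative assumptions; the Female / Over 40 count of 18 is inferred from the Female row total of 0.60 rather than read from the table.

```python
from fractions import Fraction

# Counts per 100 shoppers, taken from the probability table in Question 41.
# Assumption: the Female / over-40 cell (18) is inferred from the 0.60 row total.
counts = {
    ("M", "under_25"): 15, ("M", "25_to_40"): 13, ("M", "over_40"): 12,
    ("F", "under_25"): 24, ("F", "25_to_40"): 18, ("F", "over_40"): 18,
}
total = sum(counts.values())

def p(gender=None, age=None):
    """P(gender and age); an argument left as None is summed over."""
    hits = sum(n for (g, a), n in counts.items()
               if gender in (None, g) and age in (None, a))
    return Fraction(hits, total)

p_under_25_given_male = p("M", "under_25") / p(gender="M")        # Question 45
p_male_given_under_25 = p("M", "under_25") / p(age="under_25")    # Question 46
p_female_or_over_40 = p(gender="F") + p(age="over_40") - p("F", "over_40")  # Q47
# Independence requires every joint probability to equal the product of its marginals.
independent = all(p(g, a) == p(gender=g) * p(age=a) for (g, a) in counts)
print(p_under_25_given_male, p_male_given_under_25, p_female_or_over_40, independent)
```

Working with integer counts and `Fraction` avoids floating-point noise when testing the independence condition exactly.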

Psychological Tests Narrative A psychologist tests Grade 7 students on basic word association skills and number pattern recognition skills. Let W be the event a student does well on the word association test. Let N be the event a student does well on the number pattern recognition test. A student is selected at random, and the following probabilities are given: P(W ∩ N) = 0.25, P(W ∩ N^C) = 0.15, P(W^C ∩ N) = 0.10, and P(W^C ∩ N^C) = 0.50.

51. Refer to Psychological Tests Narrative. What is the probability that the randomly selected student does well on the word association test? ANS: P(W) = P(W ∩ N) + P(W ∩ N^C) = 0.25 + 0.15 = 0.40

PTS: 1 REF: 164 BLM: Higher Order - Apply

TOP: 5–6

52. Refer to Psychological Tests Narrative. What is the probability that the randomly selected student does well on the number pattern recognition test? ANS: P(N) = P(W ∩ N) + P(W^C ∩ N) = 0.25 + 0.10 = 0.35

PTS: 1 REF: 164 BLM: Higher Order - Apply

TOP: 5–6

53. Refer to Psychological Tests Narrative. What is the probability that the randomly selected student does well on at least one of the tests? ANS: P(W ∪ N) = P(W) + P(N) – P(W ∩ N) = 0.40 + 0.35 – 0.25 = 0.50, or P(W ∪ N) = 1 – P(W^C ∩ N^C) = 1 – 0.50 = 0.50

PTS: 1 REF: 148-149 BLM: Higher Order - Apply

TOP: 5–6

54. Refer to Psychological Tests Narrative. If the randomly selected student does well on the word association test, what is the probability he or she will also do well on the number pattern recognition test? ANS: P(N/W) = P(W ∩ N)/P(W) = 0.25/0.4 = 0.625

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

55. Refer to Psychological Tests Narrative. If the randomly selected student does well on the number pattern recognition test, what is the probability he or she will also do well on the word association test? ANS: P(W/N) = P(W ∩ N)/P(N) = 0.25/0.35 = 0.7143

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

56. Refer to Psychological Tests Narrative. Are the events W and N mutually exclusive? Justify your answer. ANS: No, they are not mutually exclusive because P(W ∩ N) = 0.25 ≠ 0.

PTS: 1 REF: 132 | 148 BLM: Higher Order - Analyze

TOP: 5–6

57. Refer to Psychological Tests Narrative. Are the events W and N independent? Explain. ANS: No, they are not independent. For example, P(W/N) = 0.7143 ≠ P(W) = 0.40.

PTS: 1 REF: 154 BLM: Higher Order - Analyze

TOP: 5–6

58. Studies have shown a particular television commercial is understood by 25% of Grade 1 students and 80% of Grade 4 students. If a television advertising agency randomly selects one Grade 1 and one Grade 4 student, what is the probability neither child would understand the commercial, assuming the children’s reactions are independent? ANS: Let A: The TV commercial is understood by the Grade 1 student. Let B: The TV commercial is understood by the Grade 4 student. P(A) = 0.25 and P(B) = 0.80.

Since A and B are assumed to be independent of each other, then P(A^C ∩ B^C) = P(A^C) P(B^C) = (0.75)(0.20) = 0.15.

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

TOP: 5–6

Salary of Working Mothers Narrative A researcher studied the relationship between the salary of a working woman with school-aged children and the number of children she had. The results are shown in the following probability table:

Number of Children
Salary           2 or Fewer Children   More than 2 Children
High salary      0.13                  0.02
Medium salary    0.20                  0.10
Low salary       0.30                  0.25

Let A denote the event that a working woman has two or fewer children, and let B denote the event that a working woman has a low salary. 59. Refer to Salary of Working Mothers Narrative. What is the probability that a working woman has two or fewer children? ANS: P(A) = 0.13 + 0.20 + 0.30 = 0.63 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

60. Refer to Salary of Working Mothers Narrative. What is the probability that a working woman has a low salary? ANS: P(B) = 0.30 + 0.25 = 0.55 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

61. Refer to Salary of Working Mothers Narrative. What is the probability that a working woman has two or fewer children and has a low salary? ANS: P(A ∩ B) = 0.30

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

62. Refer to Salary of Working Mothers Narrative. What is the probability that a working woman either has two or fewer children or has a low salary? ANS: P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.63 + 0.55 – 0.30 = 0.88

PTS: 1 REF: 134-136 | 148 BLM: Higher Order - Apply

TOP: 5–6

63. Refer to Salary of Working Mothers Narrative. If a working woman has two or fewer children, what is the probability that she has a low salary? ANS: P(B/A) = P(A ∩ B)/P(A) = 0.30/0.63 = 0.4762

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

64. Refer to Salary of Working Mothers Narrative. If a working woman has a low salary, what is the probability that she has two or fewer children? ANS: P(A/B) = P(A ∩ B)/P(B) = 0.30/0.55 = 0.5455

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

65. Refer to Salary of Working Mothers Narrative. From this information, can one conclude that the salary of a working woman with school-aged children and the number of children she has are independent events? Explain. ANS: No. For example, P(A/B) = 0.5455 ≠ P(A) = 0.63.

PTS: 1 REF: 154 BLM: Higher Order - Evaluate

TOP: 5–6

Drug Offenders Narrative Research studies suggest that the likelihood a drug offender will be convicted of a drug offence within two years after treatment for drug abuse may depend on the person’s educational level. The proportions of the total number of cases that fall into four education/conviction categories are shown in the table below:

Status within Two Years after Treatment
Education                          Convicted   Not Convicted   Total
10 or more years of education      0.10        0.30            0.40
Less than 10 years of education    0.25        0.35            0.60
Total                              0.35        0.65            1.00

Suppose a single offender is selected from the treatment program. Here are two events of interest: A: The offender has 10 or more years of education. B: The offender is convicted within two years after completion of treatment. 66. Refer to Drug Offenders Narrative. Find P(A). ANS: P(A) = 0.40 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

67. Refer to Drug Offenders Narrative. Find P(B). ANS: P(B) = 0.35 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

68. Refer to Drug Offenders Narrative. Find P(A ∩ B). ANS: P(A ∩ B) = 0.10 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

69. Refer to Drug Offenders Narrative. Find P(A ∪ B). ANS: P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.40 + 0.35 – 0.10 = 0.65

PTS: 1 REF: 148 | 134-136 BLM: Higher Order - Apply

TOP: 5–6

70. Refer to Drug Offenders Narrative. Find P(A^C). ANS: P(A^C) = 1 – P(A) = 1 – 0.40 = 0.60

PTS: 1 REF: 149 BLM: Higher Order - Apply

TOP: 5–6

71. Refer to Drug Offenders Narrative. Find P((A ∪ B)^C). ANS: P((A ∪ B)^C) = 1 – P(A ∪ B) = 1 – 0.65 = 0.35

PTS: 1 REF: 149 BLM: Higher Order - Apply

TOP: 5–6

72. Refer to Drug Offenders Narrative. Find P((A ∩ B)^C). ANS: P((A ∩ B)^C) = 1 – P(A ∩ B) = 1 – 0.10 = 0.90

PTS: 1 REF: 149 BLM: Higher Order - Apply

TOP: 5–6

73. Refer to Drug Offenders Narrative. Find the probability of A given that B has occurred. ANS: P(A/B) = P(A ∩ B)/P(B) = 0.10/0.35 = 0.2857

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

74. Refer to Drug Offenders Narrative. Are events A and B independent? Explain. ANS: No. For example, P(A/B) = 0.2857 ≠ P(A) = 0.40.

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

TOP: 5–6

75. Refer to Drug Offenders Narrative. Find the probability of B given that A has occurred. ANS: P(B/A) = P(A ∩ B)/P(A) = 0.10/0.40 = 0.25

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

76. A missile designed to destroy enemy satellites has a 0.80 chance of destroying its target. If the government tests three missiles by firing them at a target, what is the probability all three fail to destroy the target? (Assume the missiles perform independently.) ANS: Probability all three fail to destroy the target = (0.20)^3 = 0.008.

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Apply

TOP: 5–6

Late Night Talk Shows Narrative Let A be the event that a randomly selected person watches the Tonight Show with Jay Leno, and let B be the event that a randomly selected person watches the Late Show with David Letterman. It is possible to time-shift a program to a more convenient hour and thus watch both programs. Suppose the following probabilities are given: P(A ∩ B) = 0.20, P(A ∩ B^C) = 0.40, P(A^C ∩ B) = 0.10, and P(A^C ∩ B^C) = 0.30.

77. Refer to Late Night Talk Shows Narrative. What is the probability that a randomly selected person watches both shows? ANS: P(A ∩ B) = 0.20

PTS: 1 REF: 146-147 BLM: Higher Order - Understand

TOP: 5–6

78. Refer to Late Night Talk Shows Narrative. What is the probability that a randomly selected person watches only Jay Leno? ANS: P(A ∩ B^C) = 0.40

PTS: 1 REF: 146-147 BLM: Higher Order - Understand

TOP: 5–6

79. Refer to Late Night Talk Shows Narrative. What is the probability that a randomly selected person watches only David Letterman? ANS: P(A^C ∩ B) = 0.10

PTS: 1 REF: 146-147 BLM: Higher Order - Understand

TOP: 5–6

80. Refer to Late Night Talk Shows Narrative. What is the probability that a randomly selected person watches Jay Leno? ANS: P(A) = P(A ∩ B) + P(A ∩ B^C) = 0.20 + 0.40 = 0.60

PTS: 1 REF: 162 BLM: Higher Order - Apply

TOP: 5–6

81. Refer to Late Night Talk Shows Narrative. What is the probability that a randomly selected person watches David Letterman? ANS: P(B) = P(A ∩ B) + P(A^C ∩ B) = 0.20 + 0.10 = 0.30

PTS: 1 REF: 162 BLM: Higher Order - Apply

TOP: 5–6

82. Refer to Late Night Talk Shows Narrative. If we know a person watches Jay Leno, what is the probability he or she also watches David Letterman? ANS: P(B/A) = P(A ∩ B)/P(A) = 0.20/0.60 = 0.333

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

83. Refer to Late Night Talk Shows Narrative. If we know a person watches David Letterman, what is the probability he or she also watches Jay Leno? ANS: P(A/B) = P(A ∩ B)/P(B) = 0.20/0.30 = 0.667

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

84. Refer to Late Night Talk Shows Narrative. Are the events A and B independent? Justify your answer. ANS: No, since P(A/B) = 0.667 ≠ P(A) = 0.60 PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

TOP: 5–6

City Council Election Narrative An election is being held to fill two city council seats. Two fiscally conservative candidates (denoted by C) and three small-L liberal candidates (denoted by L) are running for office. Assume the candidates are equally likely to be elected, and independent of each other.

85. Refer to City Council Election Narrative. What are the possible outcomes of the election? ANS: CC, CL, LC, LL PTS: 1 REF: 131-132 BLM: Higher Order - Apply

TOP: 5–6

86. Refer to City Council Election Narrative. What is the probability both seats are filled by Conservatives? ANS: P(CC) = (2/5) (1/4) = 0.10 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

87. Refer to City Council Election Narrative. What is the probability one Conservative and one Liberal are elected to the two city council seats? ANS: P(CL ∪ LC) = P(CL) + P(LC) = (2/5)(3/4) + (3/5)(2/4) = 0.60

PTS: 1 REF: 148 | 150 | 153-155 BLM: Higher Order - Apply

TOP: 5–6

88. There are four different kinds of radar systems designed to detect or monitor the airspace around a major airport for high-flying objects, low-flying objects, runway traffic, and wind shear. Each operates independently from the others. If each system has probability 0.90 of functioning correctly, find the probability at least one radar system will fail. ANS: P(at least one will fail) = 1 – P(none fail) = 1 – (0.90)^4 = 0.3439 PTS: 1 REF: 149 BLM: Higher Order - Analyze

TOP: 5–6
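Worked check for Questions 76 and 88: both answers rest on multiplying probabilities of independent events and, for Question 88, on the complement rule. The function name below is an illustrative assumption.

```python
def p_at_least_one_failure(p_works: float, n_systems: int) -> float:
    """Complement rule: 1 - P(all n independent systems work)."""
    return 1 - p_works ** n_systems

# Question 88: four independent radar systems, each working with probability 0.90.
p_radar_failure = p_at_least_one_failure(0.90, 4)

# Question 76: three independent missiles, each failing with probability 1 - 0.80.
p_all_missiles_fail = (1 - 0.80) ** 3

print(round(p_radar_failure, 4), round(p_all_missiles_fail, 3))
```

Computing "at least one" through the complement avoids summing over the many ways in which one, two, three, or four systems could fail.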

Random Selection of Marbles Narrative A box contains one red, three blue, and two green marbles. Two marbles are randomly selected without replacement. Define events R, B, G, C, and D as follows: R = {The selected marble is red.} B = {The selected marble is blue.} G = {The selected marble is green.} C = {Both marbles selected are the same colour.} D = {At least one of the marbles is blue.} 89. Refer to Random Selection of Marbles Narrative. Find P(C).

ANS: P(C) = P(B ∩ B) + P(G ∩ G) = (3/6)(2/5) + (2/6)(1/5) = 4/15 = 0.2667

PTS: 1 REF: 134-135 | 143-144 BLM: Higher Order - Analyze

TOP: 5–6

90. Refer to Random Selection of Marbles Narrative. Find P(D). ANS: P(D) = 1 – P(neither marble is blue) = 1 – (3/6)(2/5) = 4/5 = 0.80 PTS: 1 REF: 149 | 143-144 BLM: Higher Order - Analyze

TOP: 5–6

91. Refer to Random Selection of Marbles Narrative. Find P(C ∩ D). ANS: P(C ∩ D) = P(B ∩ B) = (3/6)(2/5) = 1/5 = 0.20

PTS: 1 REF: 151-153 BLM: Higher Order - Apply

TOP: 5–6

92. Refer to Random Selection of Marbles Narrative. Find P(C ∪ D). ANS: P(C ∪ D) = P(C) + P(D) – P(C ∩ D) = 0.2667 + 0.80 – 0.20 = 0.8667

PTS: 1 REF: 148 BLM: Higher Order - Apply

TOP: 5–6

93. Refer to Random Selection of Marbles Narrative. Find P(D | C). ANS: P(D/C) = P(C ∩ D)/P(C) = 0.20/0.2667 = 0.75

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6
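Worked check for Questions 89 to 93: a sketch that enumerates every ordered draw of two marbles without replacement and counts outcomes for events C and D. Treating the six marbles as distinguishable positions is an assumption made so that all 30 ordered draws are equally likely; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

marbles = ["R", "B", "B", "B", "G", "G"]  # one red, three blue, two green

# Ordered draws of two distinct marbles (by position): 6 * 5 = 30 outcomes.
draws = list(permutations(range(len(marbles)), 2))

def prob(event):
    """Probability of an event defined on the colours of the two drawn marbles."""
    hits = sum(1 for i, j in draws if event(marbles[i], marbles[j]))
    return Fraction(hits, len(draws))

def same_colour(first, second):   # event C
    return first == second

def has_blue(first, second):      # event D
    return "B" in (first, second)

p_c = prob(same_colour)
p_d = prob(has_blue)
p_c_and_d = prob(lambda a, b: same_colour(a, b) and has_blue(a, b))
p_c_or_d = p_c + p_d - p_c_and_d   # Addition Rule
p_d_given_c = p_c_and_d / p_c      # conditional probability
print(p_c, p_d, p_c_and_d, p_c_or_d, p_d_given_c)
```

The exact fractions 4/15, 4/5, 1/5, 13/15, and 3/4 correspond to the rounded decimals 0.2667, 0.80, 0.20, 0.8667, and 0.75.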

Waste Management Project Narrative A federal agency is trying to decide which of two waste management projects to investigate as the source of air pollution. In the past, projects of the first type were in violation of air quality standards with probability 0.3 on any given day, while projects of the second type were in violation of air quality standards with probability 0.25 on any given day. It is not possible for both projects to pollute the air in one day. For i = 1, 2, consider the event that a project of type i was in violation of air quality standards.

94. Refer to Waste Management Project Narrative. Find the probability of an air pollution problem being caused by either the first project or the second project. ANS: P(first or second project in violation) = P(first in violation) + P(second in violation) = 0.30 + 0.25 = 0.55, since the two events are mutually exclusive.

PTS: 1 REF: 132 | 148 BLM: Higher Order - Apply

TOP: 5–6

95. Refer to Waste Management Project Narrative. If the first project is violating air quality standards, what is the probability the second project is also violating federal air quality standards? ANS: P(second in violation/first in violation) = P(both in violation)/P(first in violation) = 0.0/0.30 = 0.0. This answer was expected without any calculations since both projects can’t pollute on the same day (i.e., the events of the projects polluting on any given day are mutually exclusive). PTS: 1 REF: 148 | 151-153 BLM: Higher Order - Apply

TOP: 5–6

Hand Soap Product Narrative A hand soap manufacturer introduced a new liquid, lotion-enriched, antibacterial soap and conducted an extensive consumer survey to help judge the success of the new product. The survey showed 40% of the consumers had seen an advertisement for the new soap, 20% had tried the new soap, and 15% had both seen an advertisement and tried the new soap. Let event A denote that the consumers had seen an ad for the new soap, and event B denote that the consumers had tried the new soap.

96. Refer to Hand Soap Product Narrative. What is the probability a randomly selected consumer had either seen an advertisement or tried the new soap? ANS: P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.40 + 0.20 – 0.15 = 0.45

PTS: 1 REF: 148 BLM: Higher Order - Apply

TOP: 5–6

97. Refer to Hand Soap Product Narrative. If a randomly chosen consumer has seen an advertisement for the new soap, what is the probability he or she has tried the product? ANS: P(B/A) = P(A ∩ B)/P(A) = 0.15/0.40 = 0.375

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

98. A laboratory test for a disease affecting 5% of the population is either positive, indicating the disease is present, or negative, indicating the disease is not present. When people having the disease are tested, 80% of the tests come back positive, and when people who don’t have the disease are tested, 15% of the tests come back from the lab marked positive (a “false positive” result). What is the chance a randomly chosen person’s test results would come back positive? ANS: Let event D denote that the disease is present. Let the event Y denote that the test comes back positive. P(Y) = P(Y ∩ D) + P(Y ∩ D^C) = P(D) P(Y/D) + P(D^C) P(Y/D^C) = (0.05)(0.80) + (0.95)(0.15) = 0.04 + 0.1425 = 0.1825 PTS: 1 REF: 151-153 | 162 BLM: Higher Order - Analyze

TOP: 5–6
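Worked check for Question 98: a small sketch of the Law of Total Probability applied to the laboratory test. The function and parameter names are illustrative assumptions; the three inputs are the figures stated in the question.

```python
def p_positive_test(p_disease: float, p_pos_given_disease: float,
                    p_pos_given_healthy: float) -> float:
    """P(Y) = P(D) P(Y/D) + P(D^C) P(Y/D^C)."""
    return (p_disease * p_pos_given_disease
            + (1 - p_disease) * p_pos_given_healthy)

# 5% prevalence, 80% positive rate among the diseased, 15% false-positive rate.
p_positive = p_positive_test(0.05, 0.80, 0.15)
print(round(p_positive, 4))  # (0.05)(0.80) + (0.95)(0.15) = 0.04 + 0.1425
```

The two products are the probabilities of the disjoint routes to a positive result (diseased and positive, healthy and positive), so they add to 0.1825.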

Delegates Attendance Narrative Of the delegates at a convention, 60% attended the breakfast forum, 70% attended the dinner speech, and 40% attended both events. Define events A and B as follows: A: attended the breakfast forum B: attended the dinner speech.

99. Refer to Delegates Attendance Narrative. If a randomly selected delegate is known to have attended the dinner speech, what is the probability that she or he also attended the breakfast forum? ANS: P(A/B) = P(A ∩ B)/P(B) = 0.40/0.70 = 0.5714

PTS: 1 REF: 153 BLM: Higher Order - Apply

TOP: 5–6

100. Refer to Delegates Attendance Narrative. What is the probability that a randomly selected delegate either attended the breakfast forum, or attended the dinner speech, or attended both? ANS: P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.60 + 0.70 – 0.40 = 0.90

PTS: 1 REF: 148 BLM: Higher Order - Apply

TOP: 5–6

Lab Rats Experiments Narrative

An experiment was conducted in which rats could choose to enter one of two corridors, A or B. A random sample of three rats is selected. Let x = number of rats that select corridor B.

101. Refer to Lab Rats Experiments Narrative. Assuming the rats select their favourite corridor independently of one another and that the two corridors are equally likely to be selected, find the probability distribution of x. ANS: P(0) = P(AAA) = (1/2)^3 = 1/8 = 0.125, P(1) = P(AAB) + P(ABA) + P(BAA) = 3(1/2)^3 = 3/8 = 0.375, P(2) = P(ABB) + P(BAB) + P(BBA) = 3(1/2)^3 = 3/8 = 0.375, P(3) = P(BBB) = (1/2)^3 = 1/8 = 0.125. The probability distribution of x is shown in the table below:

x    p(x)
0    0.125
1    0.375
2    0.375
3    0.125

PTS: 1 REF: 131-132 | 141 | 171-172 BLM: Higher Order - Analyze

TOP: 5–6

102. Refer to Lab Rats Experiments Narrative. What is the probability that, at most, one rat selects corridor B? ANS: P(x = 0) + P(x = 1) = 0.125 + 0.375 = 0.50 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Analyze

TOP: 5–6

103. Refer to Lab Rats Experiments Narrative. What is the probability that at least one rat selects corridor B? ANS: P(x = 1) + P(x = 2) + P(x = 3) = 0.375 + 0.375 + 0.125 = 0.875 or 1 – P(x = 0) = 1 – 0.125 = 0.875 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Analyze

TOP: 5–6
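Worked check for Questions 101 to 103: with three independent rats and equally likely corridors, x follows what is usually called a binomial distribution with n = 3 and p = 1/2. The sketch below builds that distribution; the function name `binomial_pmf` is an illustrative assumption.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(n, p):
    """Return {k: P(x = k)} for k successes in n independent trials."""
    return {k: comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)}

# Three rats, each choosing corridor B with probability 1/2.
pmf = binomial_pmf(3, Fraction(1, 2))

p_at_most_one = pmf[0] + pmf[1]   # Question 102
p_at_least_one = 1 - pmf[0]       # Question 103, via the complement

for k, pk in pmf.items():
    print(k, float(pk))
print(float(p_at_most_one), float(p_at_least_one))
```

Each of the eight corridor sequences has probability (1/2)^3, and `comb(3, k)` counts how many sequences contain exactly k choices of corridor B.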

Food and Drink Narrative A student has decided to study at a local coffee shop. After some time, she gets hungry. There are two beverages available, tea and coffee, and three bakery items: doughnuts, muffins, and bagels. Define the following events:

C = {student gets coffee to drink}
T = {student gets tea to drink}
D = {student gets a doughnut to eat}
M = {student gets a muffin to eat}
B = {student gets a bagel to eat}

104. Refer to Food and Drink Narrative. The student decides she wants to get one item to eat and one item to drink. List the elements in the sample space S. ANS: The sample space S = {(C,D),(C,M),(C,B),(T,D),(T,M),(T,B)}. PTS: 1 REF: 131-132 BLM: Higher Order - Apply

TOP: 5–6

105. Refer to Food and Drink Narrative. If each combination is equally likely, what is the probability that the student gets a coffee and a bagel? ANS: P(C,B) = 1/6 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

106. Refer to Food and Drink Narrative. If each combination is equally likely, what is the probability that the student gets a muffin and coffee or tea? ANS: P[(C,M) or (T,M)] = 2/6 = 1/3 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

107. Refer to Food and Drink Narrative. If each combination is equally likely, what is the probability the student does not get a doughnut? ANS: P(no doughnut) = 1 – P(gets a doughnut) = 1 – P[(C,D) or (T,D)] = 1 – 2/6 = 2/3 PTS: 1 REF: 134-136 | 149 BLM: Higher Order - Apply

TOP: 5–6

Gender of Athletes Narrative Two hundred single-sport athletes were cross-classified according to gender, as shown in the table below. An athlete is selected at random.

Single-Sport Athletes
Gender    Swimmer   Runner   Cyclist
Male      25        60       25
Female    20        50       20

108. Refer to Gender of Athletes Narrative. What is the probability that the athlete is a runner? ANS: P(Runner) = (60 + 50)/200 = 0.55 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

109. Refer to Gender of Athletes Narrative. What is the probability that the athlete is a female runner? ANS: P(Runner and Female) = 50/200 = 0.25 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

110. Refer to Gender of Athletes Narrative. If the athlete is known to be a runner, what is the probability that the athlete is female? ANS: P(Female/Runner) = P(Runner and Female)/P(Runner) = 0.25/0.55 = 0.4545 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

111. Refer to Gender of Athletes Narrative. What is the probability that the athlete is male or a swimmer or both? ANS: P(Male or Swimmer) = P(Male) + P(Swimmer) – P(Male and Swimmer) = 110/200 + 45/200 – 25/200 = 130/200 = 0.65 PTS: 1 REF: 134-136 | 148 BLM: Higher Order - Apply

112.

TOP: 5–6

Refer to Gender of Athletes Narrative. Are the events being a female and being a swimmer mutually exclusive? Justify your answer. ANS: No, since P(Female and Swimmer) = 20/200 = 0.10 ≠ 0 PTS: 1 REF: 132 | 148 BLM: Higher Order - Analyze

113.

TOP: 5–6

Refer to Gender of Athletes Narrative. Are the single-sport type and gender of athletes independent? Justify your answer. ANS:

No, for example, P(Female/Runner) = 0.4545 ≠ P(Female) = 0.45. PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze 114.

TOP: 5–6
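The checks used in the Gender of Athletes answers can be reproduced with a short Python sketch. It is an illustration only, not part of the test bank: the dictionary `counts` and the helpers `prob`, `female`, and `runner` are hypothetical names, and only the counts come from the narrative. It tests the Female/Runner pair; any other pair of events can be substituted.

```python
from fractions import Fraction

# Counts from the Gender of Athletes table (gender, sport) -> number of athletes.
counts = {
    ("Male", "Swimmer"): 25, ("Male", "Runner"): 60, ("Male", "Cyclist"): 25,
    ("Female", "Swimmer"): 20, ("Female", "Runner"): 50, ("Female", "Cyclist"): 20,
}
total = sum(counts.values())  # 200 athletes in all


def prob(event):
    """Probability that a randomly selected athlete satisfies `event`."""
    return Fraction(sum(n for key, n in counts.items() if event(key)), total)


def female(key):
    return key[0] == "Female"


def runner(key):
    return key[1] == "Runner"


p_female = prob(female)
p_runner = prob(runner)
p_both = prob(lambda key: female(key) and runner(key))
p_female_given_runner = p_both / p_runner

# Mutually exclusive events would have joint probability 0.
mutually_exclusive = (p_both == 0)
# Independent events would satisfy P(F and R) = P(F) * P(R).
independent = (p_both == p_female * p_runner)

print(float(p_female_given_runner), mutually_exclusive, independent)
```

Exact fractions are used so that the independence comparison is not affected by floating-point rounding.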

At a local pet adoption centre for dogs and cats, it is known that if a person adopts a pet, there is a 0.45 probability that it will be a cat and a 0.55 probability that it will be a dog. If a cat is adopted, the probability that it is female is 0.60. If a dog is adopted, the probability that it is female is 0.35. An adopted pet is selected at random and is found to be male. What is the probability that the adopted pet is a dog? ANS: P(Dog/Male) = P(Dog and Male)/P(Male) = P(Male/Dog) P(Dog)/P(Male) But, P(Male/Dog) = 0.65, P(Dog) = 0.55, and P(Male) = P(Male/Dog) P(Dog) + P(Male/Cat) P(Cat) = (0.65) (0.55) + (0.40) (0.45) = 0.5375 Therefore, P(Dog/Male) = (0.65) (0.55)/(0.5375) = 0.6651 PTS: 1 REF: 153 BLM: Higher Order - Analyze

115.

TOP: 5–6

Suppose that P(A) = 0.6, P(B) = 0.7, and that events A and B are independent. a. Find P(A ∩ B). b. Find P(A ∪ B). c. Find P(A/B). d. Find P(B/A). ANS: a. P(A ∩ B) = P(A) · P(B) = (0.6)(0.7) = 0.42 (since A and B are independent). b. P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.6 + 0.7 – 0.42 = 0.88 c. P(A/B) = P(A) = 0.6 (since A and B are independent). d. P(B/A) = P(B) = 0.7 (since A and B are independent). PTS: 1 REF: 148 | 150 | 153-155 BLM: Higher Order - Apply

116.

TOP: 5–6

Suppose that P(A) = 0.4, P(B) = 0.5, and that events A and B are mutually exclusive. a. Find P(A ∩ B). b. Find P(A ∪ B). ANS: a. P(A ∩ B) = 0 (since A and B are mutually exclusive). b. P(A ∪ B) = P(A) + P(B) = 0.4 + 0.5 = 0.9 (since A and B are mutually exclusive). PTS:

REF: 132 | 148

TOP: 5–6

BLM: Higher Order - Apply

Smoking and Gender Narrative An experiment can result in one or both of events A = Smoker and B = Female, with the joint probabilities shown in the table below, where A^c and B^c denote the complements of A and B. A person is selected at random.

        B       B^c
A       0.40    0.15
A^c     0.35    0.10

117.

Refer to Smoking and Gender Narrative. Find the probability that the person is a smoker. ANS: P(A) = P(A ∩ B) + P(A ∩ B^c) = 0.40 + 0.15 = 0.55

PTS: 1 REF: 149-150 | 162 BLM: Higher Order - Apply 118.

TOP: 5–6

Refer to Smoking and Gender Narrative. Find the probability that the person is female. ANS: P(B) = P(A ∩ B) + P(A^c ∩ B) = 0.40 + 0.35 = 0.75

PTS: 1 REF: 149-150 | 162 BLM: Higher Order - Apply 119.

TOP: 5–6

Refer to Smoking and Gender Narrative. Find the probability that the person is a female smoker. ANS: P(A ∩ B) = 0.40 PTS: 1 REF: 149-150 BLM: Higher Order - Apply

120.

Refer to Smoking and Gender Narrative. Find the probability that the person is either a smoker or a female or both. ANS: P(A ∪ B) = P(A) + P(B) – P(A ∩ B) = 0.55 + 0.75 – 0.40 = 0.90

PTS: 1 REF: 148-150 BLM: Higher Order - Apply 121.

TOP: 5–6

Refer to Smoking and Gender Narrative. If the person is female, find the probability that she is a smoker.

ANS: P(A/B) = P(A ∩ B)/P(B) = 0.40/0.75 = 0.5333

PTS: 1 REF: 149-150 | 153 BLM: Higher Order - Apply 122.

TOP: 5–6

Refer to Smoking and Gender Narrative. Find the probability that the person is a female, given that she smokes. ANS: P(B/A) = P(A ∩ B)/P(A) = 0.40/0.55 = 0.7273

PTS: 1 REF: 149-150 | 153 BLM: Higher Order - Apply 123.

Refer to Smoking and Gender Narrative. Are smoking and gender of the person mutually exclusive events? Explain. ANS: Since P(A ∩ B) = 0.40 ≠ 0, smoking and gender of the person are not mutually exclusive.

PTS: 1 REF: 132 | 148 BLM: Higher Order - Analyze 124.

TOP: 5–6

Refer to Smoking and Gender Narrative. Are smoking and gender of the person independent events? Explain. ANS: Since P(A/B) = 0.5333 ≠ P(A) = 0.55, smoking and gender of the person are not independent events.

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

TOP: 5–6

Fast-Food Restaurants Narrative Lily frequents one of two fast-food restaurants, choosing McDonald’s 25% of the time and Burger King 75% of the time. Regardless of where she goes, she buys french fries on 60% of her visits. 125.

Refer to Fast-Food Restaurants Narrative. The next time Lily goes into a fast-food restaurant, what is the probability that she goes to McDonald’s and orders french fries? ANS: Define the following events: M: Lily chooses McDonald’s

B: Lily chooses Burger King F: Lily orders french fries. Then, P(M) = 0.25, P(B) = 0.75, and P(F/M) = P(F/B) = 0.60. Therefore, P(M ∩ F) = P(M) P(F/M) = (0.25)(0.60) = 0.15. PTS: 1 REF: 151 BLM: Higher Order - Analyze 126.

TOP: 5–6

Refer to Fast-Food Restaurants Narrative. Are the two events in the previous question independent? Explain. ANS: Since P(F) = 0.60 regardless of whether Lily visits McDonald’s or Burger King, the two events are independent. PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Analyze

127.

TOP: 5–6

Refer to Fast-Food Restaurants Narrative. If Lily goes to a fast-food restaurant and orders french fries, what is the probability that she is at a Burger King? ANS: P(B/F) = P(B ∩ F)/P(F) = P(B) P(F/B)/P(F) = P(B) = 0.75 (Note: P(F/B) = P(F) is a result of independence).

PTS: 1 REF: 150 | 153-155 BLM: Higher Order - Apply 128.

TOP: 5–6

Refer to Fast-Food Restaurants Narrative. What is the probability that Lily goes to McDonald’s, or orders french fries, or both? ANS: P(M ∪ F) = P(M) + P(F) – P(M ∩ F) = 0.25 + 0.60 – 0.15 = 0.70

PTS: 1 REF: 148-150 BLM: Higher Order - Apply

TOP: 5–6

Dell Computer Owners Narrative Dell computer owners are very faithful. Despite reporting problems with their current systems, 90% of Dell owners said they would buy another computer from the company, based on the service they received. Suppose you randomly select three current Dell computer users and ask them whether they would buy another Dell computer system. 129.

Refer to Dell Computer Owners Narrative. Find the probability distribution for x, the number of Dell users in the sample of three who would buy another Dell computer. ANS: Define the following two events:

Y: Dell owner would buy another Dell computer, N: Dell owner would not buy another Dell. The simple events are NNN, NNY, NYN, YNN, NYY, YNY, YYN, YYY. Since P(Y) = 0.90 and P(N) = 0.10, and the events are independent, the probability distribution for x, the number of Ys in the sample, is given by p(0) = P(NNN) = (0.10)^3 = 0.001, p(1) = P(YNN) + P(NYN) + P(NNY) = 3(0.90)(0.10)^2 = 0.027, p(2) = P(YYN) + P(YNY) + P(NYY) = 3(0.90)^2(0.10) = 0.243, p(3) = P(YYY) = (0.90)^3 = 0.729.

PTS: 1 REF: 131-132 | 141-142 | 170-171 BLM: Higher Order - Analyze 130.

TOP: 5–6
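The distribution in the Dell Computer Owners answer can be checked by enumerating the eight simple events directly. The following Python sketch is illustrative only; the names `p_yes`, `p_no`, and `dist` are hypothetical, and the only inputs taken from the narrative are P(Y) = 0.90 and the sample size of three.

```python
from itertools import product

p_yes = 0.90          # P(Y): owner would buy another Dell (from the narrative)
p_no = 1 - p_yes      # P(N)

# Accumulate p(x) over the 8 simple events YYY, YYN, ..., NNN.
dist = {x: 0.0 for x in range(4)}
for outcome in product("YN", repeat=3):
    prob = 1.0
    for answer in outcome:
        # Independent owners: multiply the individual probabilities.
        prob *= p_yes if answer == "Y" else p_no
    dist[outcome.count("Y")] += prob

for x, p in dist.items():
    print(x, round(p, 3))

# Probability that at least two of the three would buy again.
print(round(dist[2] + dist[3], 3))
```

Enumeration is used here instead of a closed-form formula so that the sketch mirrors the simple-event reasoning of the answer.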

Refer to Dell Computer Owners Narrative. What is the probability that exactly one of the three Dell computer users would buy another Dell computer? ANS: P(1) = 0.027 PTS: 1 REF: 171-172 BLM: Higher Order - Apply

131.

TOP: 5–6

Refer to Dell Computer Owners Narrative. What is the probability that at least two of the three Dell computer users would buy another Dell computer? ANS: P(x ≥ 2) = P(2) + P(3) = 0.243 + 0.729 = 0.972 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

132.

TOP: 5–6

Refer to Dell Computer Owners Narrative. What are the population mean, variance, and standard deviation for the random variable x? ANS: The population mean is μ = 2.7, the population variance is σ^2 = 0.27, and the population standard deviation is σ = 0.5196.

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 133.

TOP: 5–6

Refer to Dell Computer Owners Narrative. Construct the probability histogram for p(x). ANS:

PTS: 1 REF: 171-172 BLM: Higher Order - Apply

TOP: 5–6

College in Ontario Narrative A college in Ontario has 1000 employees. Four hundred of the employees have at least 20 years of experience (event A); 100 of the employees were born in Ontario (event B); and 300 of the employees had a background in Microsoft Office 2003 (event C). Assume events A, B, and C are independent. 134.

Refer to College in Ontario Narrative. What is the probability of finding an employee who meets all three of these criteria? ANS:

PTS: 1 REF: 146-147 | 150 | 153-155 BLM: Higher Order - Analyze 135.

TOP: 5–6
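For the College in Ontario narrative, the assumed independence of A, B, and C means that every joint probability is a product of marginal probabilities. The Python sketch below is a hedged illustration of that reasoning, not the test bank's own worked answer: the names `p`, `joint`, and `patterns` are hypothetical, and the marginals 0.4, 0.1, and 0.3 are simply the narrative's counts divided by 1000.

```python
from itertools import product

# Marginal probabilities from the narrative (counts out of 1000 employees).
p = {"A": 400 / 1000, "B": 100 / 1000, "C": 300 / 1000}


def joint(pattern):
    """Probability of one occurs/does-not-occur pattern, assuming independence."""
    result = 1.0
    for name, occurs in zip("ABC", pattern):
        result *= p[name] if occurs else 1 - p[name]
    return result


# All 8 combinations of (A occurs?, B occurs?, C occurs?).
patterns = list(product([True, False], repeat=3))

p_all_three = joint((True, True, True))
p_at_least_two = sum(joint(pt) for pt in patterns if sum(pt) >= 2)

print(round(p_all_three, 3), round(p_at_least_two, 3))
```

Summing over the patterns with two or more criteria met avoids having to write out the inclusion-exclusion terms by hand.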

Refer to College in Ontario Narrative. What is the probability of finding an employee who meets at least two of the three criteria? ANS:

PTS: 1 REF: 146-147 | 150 | 153-155 BLM: Higher Order - Analyze

TOP: 5–6

Population Samples Narrative A sample is selected from one of two populations, S1 and S2, with probabilities P(S1) = 0.80 and P(S2) = 0.20. If the sample has been selected from S1, the probability of observing an event A is P(A/S1) = 0.15. Similarly, if the sample has been selected from S2, the probability of observing A is P(A/S2) = 0.25. 136.

Refer to Population Samples Narrative. If a sample is randomly selected from one of the two populations, what is the probability that event A occurs? ANS: Using the Law of Total Probability, we have P(A) = P(S1) P(A/S1) + P(S2) P(A/S2) = (0.80)(0.15) + (0.20)(0.25) = 0.17.

PTS: 1 REF: 162-163 BLM: Higher Order - Analyze 137.

TOP: 7

Refer to Population Samples Narrative. If a sample is randomly selected and event A is observed, what is the probability that the sample was selected from population S1? From population S2? ANS: Using the result of the previous question in the form of Bayes’ Rule: P(Si/A) = P(Si) P(A/Si)/P(A).

For i = 1: P(S1/A) = (0.80)(0.15)/0.17 = 0.7059

For i = 2: P(S2/A) = (0.20)(0.25)/0.17 = 0.2941

PTS: 1 REF: 163-165 BLM: Higher Order - Analyze 138.

TOP: 7

Steve takes either a bus or the subway to go to work, with probabilities 0.25 and 0.75, respectively. When he takes the bus, he is late 40% of the time. When he takes the subway, he is late 30% of the time. If Steve is late for work on a particular day, what is the probability that he took the bus? ANS: Define the following events: B: Steve takes the bus. S: Steve takes the subway. L: Steve is late for work. It is given that P(B) = 0.25, P(S) = 0.75, P(L/B) = 0.40, and P(L/S) = 0.30. Using Bayes’ Rule, then

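P(B/L) = P(B) P(L/B) / [P(B) P(L/B) + P(S) P(L/S)] = (0.25)(0.40) / [(0.25)(0.40) + (0.75)(0.30)] = 0.10/0.325 = 0.3077

PTS: 1 REF: 163-165 BLM: Higher Order - Analyze 139.

TOP: 7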

Medical case histories indicate that different illnesses may produce identical symptoms. Suppose a particular set of symptoms, which we will denote as event H, occurs only when any one of these illnesses, A, B, or C, occurs. (For the sake of simplicity, we will assume that illnesses A, B, and C are mutually exclusive.) Studies show the probabilities of getting the three illnesses are as follows: P(A) = 0.015, P(B) = 0.005, and P(C) = 0.025. The probabilities of developing the symptoms H, given a specific illness, are P(H/A) = 0.85, P(H/B) = 0.90, and P(H/C) = 0.70. Assuming that an ill person shows the symptoms H, what is the probability that the person has illness A? ANS: The probability of interest is P(A/H), which can be calculated using Bayes’ Rule and the probabilities given above.

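P(A/H) = P(A) P(H/A) / [P(A) P(H/A) + P(B) P(H/B) + P(C) P(H/C)] = (0.015)(0.85) / [(0.015)(0.85) + (0.005)(0.90) + (0.025)(0.70)] = 0.01275/0.03475 = 0.3669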

PTS: 1 REF: 163-165 BLM: Higher Order - Analyze

TOP: 7
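The Bayes' Rule questions in this block all follow the same pattern: multiply each prior by its conditional probability, then divide by the sum of those products. A minimal Python sketch of that pattern follows; the function name `posterior` and the dictionary keys are assumptions made for illustration, while the numbers are those given in the illness and commuting questions.

```python
def posterior(priors, likelihoods):
    """Return P(hypothesis / evidence) for each hypothesis via Bayes' Rule."""
    # Numerators: P(hypothesis) * P(evidence / hypothesis).
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    # Denominator: Law of Total Probability over the hypotheses.
    evidence = sum(joint.values())
    return {h: value / evidence for h, value in joint.items()}


# Illness question: P(A), P(B), P(C) and P(H/A), P(H/B), P(H/C).
illness = posterior({"A": 0.015, "B": 0.005, "C": 0.025},
                    {"A": 0.85, "B": 0.90, "C": 0.70})

# Commuting question: P(bus), P(subway) and P(late/bus), P(late/subway).
commute = posterior({"bus": 0.25, "subway": 0.75},
                    {"bus": 0.40, "subway": 0.30})

print(round(illness["A"], 4), round(commute["bus"], 4))
```

Because the function normalizes by the total, the posteriors sum to 1 even when, as in the illness question, the priors themselves do not.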

Weight Gain for Calves Narrative Let x denote the weight gain in kilograms per month for a calf. The probability distribution of x is shown below.

x       0     5     10    15
p(x)    0.1   0.5   0.3   0.1

140.

Refer to Weight Gain for Calves Narrative. Find the average weight gain in kilograms per month for a calf. ANS: μ = E(x) = Σ x p(x) = 7 kilograms

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 141.

TOP: 8

Refer to Weight Gain for Calves Narrative. Find the variance of the weight gain. ANS: σ^2 = Σ (x – μ)^2 p(x) = 16 PTS: 1 REF: 172-173 BLM: Higher Order - Apply

142.

TOP: 8
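The mean and variance of a discrete distribution are weighted sums over its table, which can be sketched in a few lines of Python. The function name `mean_and_variance` and the dictionary `calves` are hypothetical; the values are those of the Weight Gain for Calves table.

```python
def mean_and_variance(dist):
    """dist maps each value x to p(x); returns (mu, sigma squared)."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var


# Weight Gain for Calves distribution: x in kilograms per month.
calves = {0: 0.1, 5: 0.5, 10: 0.3, 15: 0.1}
mu, var = mean_and_variance(calves)

print(round(mu, 4), round(var, 4))
```

The same function can be applied to any of the other distribution tables in this section by passing a different dictionary.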

Refer to Weight Gain for Calves Narrative. What is P(x ≥ 10)?

ANS: P(x ≥ 10) = 0.3 + 0.1 = 0.4 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply 143.

Refer to Weight Gain for Calves Narrative. What is P(0 ≤ x ≤ 5)? ANS: P(0 ≤ x ≤ 5) = 0.1 + 0.5 = 0.6

PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply 144.

TOP: 8

Refer to Weight Gain for Calves Narrative. What is the probability that the variable x will lie strictly between 0 and 10 kilograms? ANS: P(0 < x < 10) = P(x = 5) = 0.5 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

TOP: 8

Typists’ Errors Narrative The random variable x is defined as the number of mistakes made by a typist on a randomly chosen page of a physics thesis. The probability distribution follows:

x       0     1     2     3
p(x)    0.3   0.4   0.2   0.1

145.

Refer to Typists’ Errors Narrative. What is E(x)? ANS: E(x) = Σ x p(x) = 1.1

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 146.

TOP: 8

Refer to Typists’ Errors Narrative. Find σ. ANS: σ = 0.9434 PTS: 1 REF: 172-173 BLM: Higher Order - Apply 147.

TOP: 8

Refer to Typists’ Errors Narrative. Find P(x < 1). ANS: P(x < 1) = P(x = 0) = 0.3 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

148.

TOP: 8

Refer to Typists’ Errors Narrative. Find P(0 ≤ x < 2).

ANS: P(0 ≤ x < 2) = P(x = 0) + P(x = 1) = 0.3 + 0.4 = 0.7 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply 149.

TOP: 8

Refer to Typists’ Errors Narrative. In what fraction of pages in the thesis would the number of mistakes made be within two standard deviations of the mean? ANS: μ ± 2σ = 1.1 ± 2(0.9434) = 1.1 ± 1.8868; or –0.7868 to 2.9868. For the random variable x to be in this interval, x = 0, 1, or 2. The fraction of pages where x = 0, 1, or 2 is p(0) + p(1) + p(2) = 0.9 = 9/10. PTS: 1 REF: 137 | 171-173 BLM: Higher Order - Analyze

TOP: 8
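The "within k standard deviations of the mean" calculation can be sketched as follows. This is an illustrative Python helper under stated assumptions: the name `prob_within` and the dictionary `typist` are hypothetical, and the distribution is the one from the Typists' Errors table.

```python
from math import sqrt


def prob_within(dist, k):
    """P(|x - mu| <= k * sigma) for a discrete distribution {x: p(x)}."""
    mu = sum(x * p for x, p in dist.items())
    sigma = sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))
    # Add up p(x) for every value inside the interval mu +/- k*sigma.
    return sum(p for x, p in dist.items() if abs(x - mu) <= k * sigma)


typist = {0: 0.3, 1: 0.4, 2: 0.2, 3: 0.1}
print(round(prob_within(typist, 2), 4))
```

The complement, 1 minus the returned value, gives the probability of falling further than k standard deviations from the mean.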

Casino Card Game Narrative The probability distribution of your winnings at a casino’s card game is shown below.

x       $0    $1    $2    $5
p(x)    0.1   0.4   0.2   0.3

150.

Refer to Casino Card Game Narrative. What is the chance you win more than $1 if you play just once? ANS: P(x = 2) + P(x = 5) = 0.2 + 0.3 = 0.5 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

151.

TOP: 8

Refer to Casino Card Game Narrative. How much should you expect to win if you play the game once? ANS: E(x) = Σ x p(x) = $2.30

PTS: 1 REF: 172-173 BLM: Higher Order - Analyze 152.

TOP: 8

Refer to Casino Card Game Narrative. After breaking the bank at the casino playing this card game, you decide to open your own casino where the customers can play your favourite card game. How much should you charge the players if you want to have an average profit of $1 per play? ANS: E(x) + $1 = $2.30 + $1.00 = $3.30 PTS: 1 REF: 172-173 BLM: Higher Order - Analyze

153.

TOP: 8

Let x be a random variable with the following distribution:

x       –10   –5    0     5     10
p(x)    0.1   0.3   0.1   0.3   0.2

a. Find the expected value of x. b. Find the standard deviation of x. c. What is the probability that x is further than one standard deviation from the mean?

ANS: a. μ = E(x) = 1.0

b. σ = 6.633 c. μ ± σ = 1 ± 6.633, or –5.633 to 7.633, so x would have to take on the values –5, 0, or 5 to be in this interval, and the probability of this occurring is 0.3 + 0.1 + 0.3 = 0.7. The probability of x being outside of this interval is 1 – 0.7 = 0.3. PTS: 1 REF: 172-173 BLM: Higher Order - Apply

TOP: 8

Cheesecake Sales and Donations Narrative The neighbourhood deli specializes in New York Style Cheesecake. The cheesecakes are made fresh daily and any unsold cheesecake is donated to a food bank. Each cheesecake costs $5 to make and sells for $11. The daily demand for cheesecake (the number the deli could sell if it had a cheesecake for everyone who wanted one) has the following distribution:

x       0     1     2     3
p(x)    0.0   0.5   0.3   0.2

154.

Refer to Cheesecake Sales and Donations Narrative. What is the expected demand for cheesecake? ANS: E(x) = Σ x p(x) = 1.7 cheesecakes per day

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 155.

TOP: 8

Refer to Cheesecake Sales and Donations Narrative. What is the expected daily profit if the deli makes only one cheesecake per day? ANS:

Demand x | Profit for a Demand of x Cheesecakes When Making Only One Cheesecake | Probability the Demand Is for x Cheesecakes
0 | Unsold cake is donated: –$5 | 0.0
1 | One cake made, one sold: $11 – $5 = $6 | 0.5
2 | One cake made and sold, one customer wants a cake but there are none to buy: $6 | 0.3
3 | One cake made and sold, two customers want a cake but there are none to buy: $6 | 0.2

Let y be the profit associated with a demand of x cheesecakes.

Expected profit = E(y) = (–$5)(0) + ($6)(0.5) + ($6)(0.3) + ($6)(0.2) = $6 PTS: 1 REF: 172-173 BLM: Higher Order - Analyze 156.

TOP: 8

Refer to Cheesecake Sales and Donations Narrative. What is the expected daily profit if the deli makes only two cheesecakes per day? ANS:

Demand x | Profit for a Demand of x Cheesecakes When Making Only Two Cheesecakes | Probability the Demand Is for x Cheesecakes
0 | Unsold cakes are donated: (–$5) + (–$5) = –$10 | 0.0
1 | Two cakes made, one sold: (–$5) + $6 = $1 | 0.5
2 | Two cakes made and sold: $6 + $6 = $12 | 0.3
3 | Two cakes made and sold, one customer wants a cake but there are none to buy: $6 + $6 = $12 | 0.2

Expected profit = E(y) = (–$10)(0) + ($1)(0.5) + ($12)(0.3) + ($12)(0.2) = $6.50

PTS: 1 REF: 172-173 BLM: Higher Order - Analyze

TOP: 8
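The profit tables for the Cheesecake Sales and Donations narrative follow one rule: revenue comes only from cakes actually sold, while cost is incurred for every cake made. A short Python sketch of that rule is given below as an illustration; the names `COST`, `PRICE`, `demand`, and `expected_profit` are assumptions, and the figures are those of the narrative.

```python
COST, PRICE = 5, 11   # dollars per cheesecake, from the narrative
demand = {0: 0.0, 1: 0.5, 2: 0.3, 3: 0.2}   # p(x) for daily demand x


def expected_profit(made):
    """Expected daily profit when `made` cheesecakes are baked."""
    total = 0.0
    for wanted, p in demand.items():
        sold = min(wanted, made)   # cannot sell more than were baked
        total += (PRICE * sold - COST * made) * p
    return total


for made in range(1, 4):
    print(made, round(expected_profit(made), 2))
```

Looping over the production level makes it easy to compare the one-cake and two-cake policies side by side.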

Defective Bolts Narrative Approximately 5% of the bolts coming off a production line have serious defects. Two bolts are randomly selected for inspection. 157.

Refer to Defective Bolts Narrative. Find the probability distribution for x, the number of defective bolts in the sample. ANS: P(x = 0) = P(both bolts are not defective) = (0.95)^2 = 0.9025 P(x = 1) = P(one bolt is defective) = (0.05)(0.95) + (0.95)(0.05) = 0.095 P(x = 2) = P(both bolts are defective) = (0.05)^2 = 0.0025 The probability distribution of x is shown in the table below:

x       0        1        2
p(x)    0.9025   0.0950   0.0025

PTS: 1 REF: 131-132 | 134-136 | 170-172 BLM: Higher Order - Analyze 158.

TOP: 8

Refer to Defective Bolts Narrative. Find E(x). ANS: E(x) = Σ x p(x) = 0.1

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 159.

TOP: 8

Refer to Defective Bolts Narrative. Find σ^2.

ANS: σ^2 = 0.095 PTS: 1 REF: 172-173 BLM: Higher Order - Apply 160.

TOP: 8

In the “Quick Draw” casino card game, a player chooses one card from a deck of 52 well-shuffled cards. If the card the player selects is the king of hearts, the player wins $104; if the card is an ace, the player wins $78; if the card selected is anything else, the player loses $13. Considering a negative amount won to be a loss, how much should the player expect to win in one play of “Quick Draw”? ANS: A regular deck of playing cards consists of 52 cards: 13 values (ace, 2, 3, 4, 5, 6, 7, 8, 9, 10, jack, queen, and king), each of which occurs in four suits (hearts, diamonds, clubs, and spades). Therefore, P(king of hearts) = 1/52; P(ace) = 4/52; P(other card) = 47/52 E($ winnings) = $104 (1/52) + $78 (4/52) + (–$13) (47/52) = $2 + $6 – $11.75 = –$3.75 The player should expect to lose $3.75. PTS: 1 REF: 134-136 | 172-173 BLM: Higher Order - Analyze

TOP: 8
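The expected winnings in "Quick Draw" can be verified with exact arithmetic. The Python sketch below is illustrative only; the list `payoffs` and the variable `expected` are hypothetical names, and the payoffs and card counts are those stated in the question.

```python
from fractions import Fraction

# (payoff in dollars, number of cards out of 52 that produce it)
payoffs = [
    (104, 1),    # king of hearts
    (78, 4),     # any ace
    (-13, 47),   # every other card
]

# The three cases must cover the whole deck.
assert sum(n for _, n in payoffs) == 52

# E(winnings) = sum of payoff * probability, kept as an exact fraction.
expected = sum(Fraction(amount * n, 52) for amount, n in payoffs)

print(expected, float(expected))
```

Working in fractions shows the three terms as 2, 6, and –47/4, so the sum is –15/4 dollars per play.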

Number of Cars Narrative Let the random variable x represent the number of cars owned by a family. Assume that x can take on five values: 0, 1, 2, 3, 4. A partial probability distribution is shown below:

x       0     1     2     3     4
p(x)    0.2   0.1   0.3   ?     0.1

161.

Refer to Number of Cars Narrative. Find the probability that a family owns three cars. ANS: Since Σ p(x) = 1, then p(3) = 1 – (0.2 + 0.1 + 0.3 + 0.1) = 0.3

PTS: 1 REF: 171-172 BLM: Higher Order - Apply 162.

TOP: 8

Refer to Number of Cars Narrative. Find the probability that a family owns one to three cars, inclusive. ANS: p(1) + p(2) + p(3) = 0.1 + 0.3 + 0.3 = 0.7 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

163.

TOP: 8

Refer to Number of Cars Narrative. Find the probability that a family owns no more than one car. ANS: p(0) + p(1) = 0.2 + 0.1 = 0.3 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

164.

TOP: 8

Refer to Number of Cars Narrative. Construct a probability histogram for p(x). ANS:

PTS: 1 REF: 171-172 BLM: Higher Order - Apply

TOP: 8

165.

Refer to Number of Cars Narrative. Calculate the population mean, variance, and standard deviation. ANS: The population mean is μ = 2.0. The population variance is σ^2 = 1.6. The population standard deviation is σ = 1.2649.

PTS: 1 REF: 172-173 BLM: Higher Order - Apply 166.

TOP: 8

Refer to Number of Cars Narrative. What is the probability that a family owns more than two cars? ANS: P(x > 2) = p(3) + p(4) = 0.3 + 0.1 = 0.4 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

167.

TOP: 8

Refer to Number of Cars Narrative. What is the probability that a family owns, at most, three cars? ANS: P(x ≤ 3) = 1 – P(x = 4) = 1 – 0.1 = 0.9 PTS: 1 REF: 137 | 171-172 BLM: Higher Order - Apply

168.

TOP: 8

From experience, a shipping company knows that the cost of delivering a small package within 24 hours is $16.20. The company charges $16.95 for shipment but guarantees to refund the charge if delivery is not made within 24 hours. If the company fails to deliver only 3% of its packages within the 24-hour period, what is the expected gain per package? ANS: The company will either gain ($16.95 – $16.20) if the package is delivered on time, or will lose $16.20 if the package is not delivered on time. We assume that, if the package is not delivered within 24 hours, the company does not collect the $16.95 delivery fee. Then the probability distribution for x, the company’s gain, is given by

x       0.75    –16.20
p(x)    0.97    0.03

Hence, the expected gain per package is E(x) = (0.75)(0.97) + (–16.20)(0.03) = $0.2415, that is, about 24 cents.

PTS: 1 REF: 172-173 BLM: Higher Order - Analyze

TOP: 8

Working mothers in Canada Statistics Canada wanted to find the distribution of ages of working women living in Canada who were single mothers. The researcher drew a random sample of some 300 families from the government tax records and found the following distribution:

[Frequency table of marital status (Married (M), Single (S)) by age group (Under 25 Years, 25–40 Years, Over 40 Years). The cell counts are missing from the source. Surviving entries: row totals Married (M) = 120 and Single (S) = 180, one column total of 113, and grand total 300.]

One family was selected at random from tax base records of families with working mothers. 169.

Refer to Working mothers in Canada. Convert the frequency table shown above into a probability distribution. ANS:

Marital Status   Under 25 years   25–40 years   Over 40 years   Total
Married (M)      0.08             0.20          0.12            0.40
Single (S)       0.24             0.18          0.18            0.60
Total            0.32             0.38          0.30            1.00

PTS: 1 REF: 134 | 149-150 BLM: Higher Order - Apply 170.

TOP: 5–6

Refer to Working mothers in Canada. What is the probability that the randomly selected working mother is under 25 years of age? ANS: P(Under 25) = 0.08 + 0.24 = 0.32

PTS: 1 REF: 134-136 BLM: Higher Order - Apply

TOP: 5–6

171.

Refer to Working mothers in Canada. What is the probability that the randomly selected working mother is married? ANS: P(M) = 0.08 + 0.20 + 0.12 = 0.40 PTS: 1 REF: 134-136 BLM: Higher Order - Apply

172.

TOP: 5–6

Refer to Working mothers in Canada. What is the probability that the randomly selected working mother is single and under 25 years of age? ANS: P(S ∩ Under 25) = 0.24

PTS: 1 REF: 151-152 BLM: Higher Order - Apply 173.

TOP: 5–6

Refer to Working mothers in Canada. If the randomly selected working mother is married, what is the probability she is over 40 years of age? ANS: P(Over 40 | M) = P(M ∩ Over 40)/P(M) = 0.12/0.40 = 0.30

PTS: 1 REF: 153 BLM: Higher Order - Apply 174.

TOP: 5–6

Refer to Working mothers in Canada. If the randomly selected working mother is between 25 and 40 years of age, what is the probability that she is single? ANS: P(S | 25–40) = P(S ∩ 25–40)/P(25–40) = 0.18/0.38 = 0.474

PTS: 1 REF: 153 BLM: Higher Order - Apply 175.

TOP: 5–6

Refer to Working mothers in Canada. What is the probability that the randomly selected working mother is either single or over 40 years of age? ANS: P(S ∪ Over 40) = P(S) + P(Over 40) – P(S ∩ Over 40) = 0.60 + 0.30 – 0.18 = 0.72

PTS: 1 REF: 148-150 BLM: Higher Order - Apply

TOP: 5–6

Chapter 5—Several Useful Discrete Distributions MULTIPLE CHOICE 1. Which of the following is NOT a characteristic of a binomial problem? a. There are n identical trials, and all trials are independent. b. Each trial has two possible outcomes, which are traditionally labelled “failure” and “success,” and the probability of success p is the same on each trial. c. We are interested in x, the number of successes observed during the n trials. d. The probability of failure may differ from trial to trial. ANS: D BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

2. For a binomial experiment with n trials, p is the probability of success, q is the probability of failure, and x is the number of successes in n trials. Which one of the following statements is NOT a property of such an experiment? a. p + q = 1 b. = 1 for x = 0, 1, . . ., n c. P(x = 0) = d. P(x = 1) = ANS: B TOP: 1–2

PTS: 1 BLM: Remember

REF: 192-193 | 195

3. A telephone survey of Canadian families is conducted to determine the number of children in the average Canadian family. Past experience has shown that 30% of the families who are telephoned will refuse to respond to the survey. Which of the following data could NOT be classified as a binomial random variable? a. the number of families out of 50 who respond to the survey b. the number of families out of 50 who refuse to respond to the survey c. the number of children in a family who respond to the survey ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 192-193

TOP: 1–2

4. Which of the following scenarios is an example of a binomial experiment? a. A shopping mall is interested in the income level of its customers and is taking a survey to gather information. b. A business firm introducing a new product wants to know how many purchases its clients will make each year. c. A sociologist is researching an area in an effort to determine the proportion of households with a male head of household. d. A study is concerned with the average number of hours worked by high school students. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 192-193

TOP: 1–2

5. What is the standard deviation of a binomial distribution for which n = 50 and p = 0.15? a. 50.15 b. 7.082 c. 6.375 d. 2.525 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2

6. Supporters of gun control in a university town claim that 60% of the students are in favour of stronger gun control in Canada. A social scientist at the university conducts a survey of 20 randomly chosen students and finds that 9 of the 20 favour stronger gun control. Given this information, which of the following is a reasonable conclusion? a. There is no reason to doubt the claim. b. The survey results constitute a rare event. c. The claim is incorrect. d. The true percentage of students who favour stronger gun control must be 45%. ANS: A TOP: 1–2

PTS: 1 REF: 192-193 | 195 | 68-69 BLM: Higher Order - Evaluate

7. What is the expected number of heads in 200 tosses of an unbiased coin? a. 50 b. 75 c. 100 d. 125 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2

8. What is the expected value, E(X), of a binomial probability distribution with n trials and a probability p of success? a. n/p b. np(1 – p) c. np d. np – 1 ANS: C BLM: Remember

PTS:

REF: 195

TOP: 1–2

9. Which of the following best describes a binomial probability distribution? a. It is a probability distribution that shows the probabilities associated with possible values of a discrete random variable that are generated by a binomial experiment. b. It is a probability distribution that shows, for each of a random variable’s possible values, the probability of the variable being less than or equal to that value. c. It is a probability distribution that shows the probabilities associated with possible values of a discrete random variable when the probability of success of these values changes from one trial to the next. d. It is a probability distribution that shows the probabilities associated with possible values of a discrete random variable when these values equal the number of

occurrences of a specified event within a specified time or space. ANS: A TOP: 1–2

PTS: 1 BLM: Remember

REF: 192-193 | 195

10. What does the binomial distribution formula do? a. It computes the number of permutations of x successes (and, therefore, n – x failures) that can be achieved in n trials of a random experiment that satisfies the conditions of the binomial experiment. b. It computes the probability of x successes in n trials of a random experiment that satisfies the conditions of a binomial experiment. c. It produces one of two possible outcomes, conventionally called success and failure. d. It produces the probability of x successes when a random sample of size n is drawn without replacement from a population of size N within which M units have the characteristic that denotes success. ANS: B BLM: Remember

PTS:

REF: 195

TOP: 1–2

11. Given that n is the number of trials and p is the probability of success in any one trial of a random experiment, which of the following is equal to the expected value of a binomial random variable? a. (n – 1)p b. p c. np d. n/p ANS: C BLM: Remember

PTS:

REF: 195

TOP: 1–2

12. A manufacturer of tennis balls uses a production process that produces 5% defective balls. A quality inspector takes samples of a week’s output, with replacement. Using the cumulative binomial probability table available in your text, which of the following probabilities can the inspector determine? a. If five units are inspected, the probability of zero units being defective is 0.774. b. If 10 units are inspected, the probability of 2 units being defective is 0.374. c. If 15 units are inspected, the probability of 4 units being defective is 0.999. d. If 20 units are inspected, the probability of 6 units being defective is 0.0569. ANS: A TOP: 1–2

PTS: 1 REF: 197 | 712-717 BLM: Higher Order - Apply

13. A manufacturer of golf balls uses a production process that produces 10% defective balls. A quality inspector takes samples of a week’s output with replacement. Using the cumulative binomial probability table available in your text, which of the following probabilities can the inspector determine? a. If five units are inspected, the probability of at most three of these units being defective is 0.984. b. If 10 units are inspected, the probability of 5 or 6 of these units being defective is 0.002.

c. If 15 units are inspected, the probability of at least 10 of these units being defective is 0.547. d. If 20 units are inspected, the probability of at least 19 of these units being defective is 0.0009. ANS: B TOP: 1–2

PTS: 1 REF: 197-200 | 712-717 BLM: Higher Order - Apply

14. It has been alleged that 40% of all community college students favour Dell computers. If this were true, and we took a random sample of 25 students, the binomial probability table for cumulative values of x available in your text would reveal which of the following probabilities? a. The probability of 10 or fewer students in favour is 0.586. b. The probability of fewer than 19 students in favour is 1.000. c. Both a and b. ANS: C TOP: 1–2

PTS: 1 REF: 197-200 | 712-717 BLM: Higher Order - Apply

15. Which of the following is NOT a characteristic of a binomial experiment? a. The experiment consists of n identical trials. b. The probability of failure on a single trial remains constant from trial to trial. c. The standard deviation of the binomial random variable is independent of the number of trials. d. Each trial results in one of two outcomes. ANS: C BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

16. If the random variable x is binomially distributed with n = 10 and p = 0.05, what is P(x = 2)? a. 0.914 b. 0.599 c. 0.55 d. 0.074 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2
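The binomial calculations in the questions above (a point probability, the mean np, and the standard deviation) can be sketched with Python's standard library. The function names `binom_pmf` and `binom_mean_sd` are hypothetical and introduced only for illustration; the inputs are the values of n and p stated in the questions.

```python
from math import comb, sqrt


def binom_pmf(x, n, p):
    """P(x successes in n independent trials with success probability p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def binom_mean_sd(n, p):
    """Mean np and standard deviation sqrt(npq) of a binomial variable."""
    return n * p, sqrt(n * p * (1 - p))


# P(x = 2) for n = 10, p = 0.05.
print(round(binom_pmf(2, 10, 0.05), 4))

# Standard deviation for n = 50, p = 0.15.
print(round(binom_mean_sd(50, 0.15)[1], 3))

# Expected number of heads in 200 tosses of an unbiased coin.
print(binom_mean_sd(200, 0.5)[0])
```

A cumulative probability such as P(x ≤ k) can be obtained by summing `binom_pmf` over 0 through k, which is what the cumulative tables in the text tabulate.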

17. Which of the following statements is a property of the binomial distribution? a. The binomial distribution tends to be more symmetric as the probability of success p approaches 0.5. b. As the number of trials increases, the expected value of the random variable decreases. c. As the number of trials increases for a given probability of success, the binomial distribution becomes more skewed. d. As the number of trials increases, the probability of success increases ANS: A BLM: Remember

PTS:

REF: 195

TOP: 1–2

18. Which of the following experiments CANNOT be modelled by a Poisson distribution?

a. the number of calls received by a switchboard during a given period of time b. the number of bacteria per small volume of fluid c. the gender of individuals going through a security check at an airport in a given hour d. the number of customer arrivals at a checkout counter during a given minute ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

19. Given a Poisson random variable x, where the average number of times an event occurs in a certain period of time is 2.5, what is P(x = 0)? a. 2.5 b. 1.5811 c. 0.40 d. 0.0821 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 205

TOP: 3

20. Which probability distribution is appropriate when the events of interest occur randomly, independently of one another, and rarely? a. binomial distribution b. Poisson distribution c. hypergeometric distribution d. any discrete probability distribution ANS: B BLM: Remember

PTS:

REF: 205

TOP: 3

21. Which of the following is the mean of a Poisson random variable x, where μ is the average number of times that an event occurs in a certain period of time or space? a. b. c. d.

ANS: A BLM: Remember

PTS:

REF: 205

TOP: 3

22. Which of the following CANNOT generate a Poisson distribution? a. the number of telephone calls received by a switchboard in a specified time period b. the number of customers arriving at a gas station on Christmas day c. the number of bacteria found in a cubic yard of soil d. the number of children in a family ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

23. Which of the following best describes a Poisson random variable? a. It is a continuous random variable with infinitely many possible values.

b. It is a discrete random variable with infinitely many possible values. c. It is a continuous random variable with a finite number of possible values. d. It is a discrete random variable with a finite number of possible values. ANS: B BLM: Remember

PTS:

REF: 205

TOP: 3

24. Which of the following is the expression for the standard deviation of a Poisson distribution, for which μ is the average number of times that an event occurs in a certain period of time or space? a. b. c. d.

ANS: B BLM: Remember

PTS:

REF: 205

TOP: 3

25. Given a Poisson random variable x, where the average number of times an event occurs in a certain period of time or space is 1.5, what is P(x = 2)? a. 0.5020 b. 0.2510 c. 0.2231 d. 0.01116 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 205

TOP: 3
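The Poisson probabilities asked for in items 19 and 25 follow from P(x = k) = μ^k e^(–μ) / k!. The sketch below evaluates that formula; the function name poisson_pmf is an assumption made for illustration.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(x = k) for a Poisson random variable with mean mu."""
    return mu**k * exp(-mu) / factorial(k)

p_zero = poisson_pmf(0, 2.5)  # item 19: mu = 2.5, no occurrences
p_two = poisson_pmf(2, 1.5)   # item 25: mu = 1.5, exactly two occurrences
print(f"{p_zero:.4f} {p_two:.4f}")  # 0.0821 0.2510
```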

26. Which of the following expressions gives the variance of a Poisson random variable for which μ is the average number of times that an event occurs in a certain period of time or space? a. b. c. d.

ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

27. Which of the following correctly describes a Poisson random variable? a. It does not generate a binomial either/or outcome because only a single type of outcome or “event” is occurring during the Poisson process. b. It equals the number of occurrences of a specified event within a specified time or space. c. a and b ANS: C BLM: Remember

PTS:

REF: 205

TOP: 3

28. Given that μ is the Poisson mean or average in a given unit of time or space, and t is the total number of time or space units examined, which of the following correctly describes the expected value of a Poisson random variable? a. b. c. d. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

29. In a book, two misprints occur per 100 pages. Using the cumulative Poisson probability table available in your text, which of the following probabilities can we determine for a book of 500 pages? a. The probability of finding 5 or 6 misprints equals 0.099. b. The probability of finding at least 20 misprints equals 0.003. c. The probability of finding at least 24 misprints equals 0.1234. ANS: B TOP: 3

PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

30. The number of traffic accidents per day on a certain section of highway is thought to be Poisson distributed with a mean equal to 2.19. What is the standard deviation of the number of accidents? a. approximately 4.80 b. 3.14 c. 2.19 d. approximately 1.48 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 205

TOP: 3

31. The number of traffic accidents per day on a certain section of highway is thought to be Poisson distributed with a mean equal to 2.19. Which of the following best approximates the probability of no accidents occurring on this section of highway during a one-day period? a. 0.457 b. 0.318 c. 0.296 d. 0.112 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 205

TOP: 3

32. The number of traffic accidents per day on a certain section of highway is thought to be Poisson distributed with a mean equal to 2.19. Based on this, how many traffic accidents should be expected during a period of one week? a. 15.33 b. approximately 12.21 c. 10.95 d. approximately 10.36

ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 205

TOP: 3

33. If the standard deviation for a Poisson random variable is known to be 3.60, what is its expected value? a. approximately 1.90 b. 3.60 c. 8.28 d. 12.96 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 205

TOP: 3
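Items 30 to 33 all rest on the fact that a Poisson random variable has variance equal to its mean. A minimal sketch of the four calculations follows; the variable names are chosen here for illustration only.

```python
from math import exp, sqrt

mu_day = 2.19                 # mean number of accidents per day (items 30 to 32)

sd_day = sqrt(mu_day)         # item 30: standard deviation is the square root of the mean
p_none = exp(-mu_day)         # item 31: P(x = 0) = e^(-mu)
mu_week = 7 * mu_day          # item 32: the mean scales with the number of days

mean_from_sd = 3.60 ** 2      # item 33: mean = variance = sd squared

print(f"{sd_day:.2f} {p_none:.3f} {mu_week:.2f} {mean_from_sd:.2f}")
# 1.48 0.112 15.33 12.96
```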

34. Which of the following distributions could NOT be used to describe the exact distribution for a discrete random variable? a. binomial distribution b. Poisson distribution c. hypergeometric distribution d. normal distribution ANS: D BLM: Remember

PTS:

REF: 190-191

TOP: 3

35. Which of these statements is NOT a property of a Poisson distribution? a. The Poisson distribution is an example of a discrete probability distribution. b. The Poisson distribution is more skewed to the right for smaller values of the parameter μ. c. The Poisson distribution is symmetrical when the value of the parameter μ is close to 5. d. The mean of the Poisson distribution variable is equal to the variance. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

36. When sampling without replacement, which of the following is the appropriate probability distribution to use? a. the binomial distribution b. the hypergeometric distribution c. the Poisson distribution d. the normal approximation to the binomial distribution ANS: B BLM: Remember

PTS:

REF: 212-213

TOP: 4

37. Which of the following best describes our options when evaluating probabilities if we use the hypergeometric formula P(x = k) = C(M, k) C(N – M, n – k) / C(N, n)? a. We may calculate the probability of k successes when a random sample of size n is drawn without replacement from a population of size N within which M units have the characteristic that denotes success. b. We may assume that n < N and M < N. c. both a and b ANS: C BLM: Remember

PTS:

REF: 212-213

TOP: 4

38. Suppose we are given that n is the number of trials of a random experiment, N is population size, M is the number of population units with the “success” characteristic, and p is the probability of success in the first trial. Under those circumstances, which of these expressions equals the mean of the hypergeometric random variable’s probability distribution? a. n b. np c. n(N/M) d. n(M/N) ANS: B BLM: Remember

PTS:

REF: 212-213

TOP: 4

39. What kinds of probabilities does the hypergeometric probability distribution provide? a. It provides probabilities associated with possible values of a binomial random variable in situations in which sampling is done without replacement. b. It provides probabilities associated with possible values of a binomial random variable in situations in which the probability of success changes from one trial to the next. c. both a and b d. neither a nor b ANS: C BLM: Remember

PTS:

REF: 212-213

TOP: 4

40. In which of the following sampling methods would the hypergeometric probability distribution be used instead of the binomial distribution? a. when sampling is performed with replacement from a finite population b. when sampling is performed without replacement from a finite population c. when sampling is performed without replacement from an infinite population d. when sampling is performed with replacement from an infinite population ANS: B BLM: Remember

PTS:

REF: 212-213

TOP: 4

41. A small community college in Ontario has four student organizations (A, B, C, and D). Organization A has 5 students, B has 8, C has 10, and D has 12. It is thought that new students have no preference for one of these organizations over the other. If seven new students are admitted to the college, what is the probability that one student will choose organization A, one will choose B, two will choose C, and three will choose D? a. approximately 0.059 b. 0.200 c. approximately 0.243 d. 0.258

ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 195 | 153

TOP: 4
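The keyed value of approximately 0.059 for item 41 is reproduced when the seven new students are treated as a sample drawn without replacement from the 35 members of the four organizations, that is, as a multivariate hypergeometric count. That reading is an assumption of this sketch, as are the variable names.

```python
from math import comb

sizes = [5, 8, 10, 12]   # members in organizations A, B, C, D
picks = [1, 1, 2, 3]     # students falling in A, B, C, D

# Ways to choose the required number from each organization
numerator = 1
for size, pick in zip(sizes, picks):
    numerator *= comb(size, pick)

# Ways to choose 7 from all 35 members
denominator = comb(sum(sizes), sum(picks))

prob = numerator / denominator
print(f"{prob:.4f}")  # 0.0589
```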

TRUE/FALSE 1. A binomial random variable is an example of a discrete random variable. ANS: T TOP: 1–2

PTS: 1 BLM: Remember

REF: 190 | 192-193

2. In a binomial experiment, the probability of success is the same on every trial. ANS: T BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

3. A coin-toss experiment represents a binomial experiment only if the coin is balanced, meaning that p = 0.5. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 192-193

TOP: 1–2

4. A jug contains five black marbles and five white marbles well mixed. A marble is removed and its colour is noted. A second marble is removed, without replacing the first marble, and its colour is also noted. If x is the total number of black marbles in the two draws, then x has a binomial distribution. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 195

TOP: 1–2

5. As a rule of thumb, if the sample size n is large relative to the population size N, in particular, if n/N > 0.05, then the resulting experiment will NOT be binomial. ANS: T BLM: Remember

PTS:

REF: 195

TOP: 1–2

6. The binomial random variable is the number of successes that occur in a certain period of time or space. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 192-193

TOP: 1–2

7. If x is a binomial random variable with n = 20, and p = 0.5, then P(x = 20) = 1.0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 195

TOP: 1–2

8. A binomial probability distribution shows the probabilities associated with possible values of a discrete random variable that are generated by a binomial experiment.

ANS: T BLM: Remember

PTS:

REF: 195

TOP: 1–2

9. A binomial experiment is a sequence of n identical trials such that both of the following properties hold: a.) each trial produces one of two outcomes that are conventionally called “success” and “failure,” and b.) each trial is independent of any other trial so that the probability of success or failure is constant from trial to trial. ANS: T BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

10. The number of successes observed during the n trials of a binomial experiment is called the binomial random variable. ANS: T BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

11. A binomial experiment requires that the success and failure probabilities be constant from one trial to the next, and also that these two probabilities be equal to each other. ANS: F BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

12. The binomial probability distribution could be used to describe the speed of tennis balls when the players are serving. ANS: F TOP: 1–2

PTS: 1 REF: 192-193 | 195 BLM: Higher Order - Understand

13. A life insurance salesperson makes 15 sales calls daily. The chance of making a sale on each call is 0.40. The probability that he will make at most two sales is less than 0.10. ANS: T TOP: 1–2

PTS: 1 REF: 205-208 | 718-719 BLM: Higher Order - Apply

14. The distribution of the number of phone calls to a doctor’s office in a one-hour time period is likely to be described by a binomial distribution. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 195 | 205

TOP: 1–2

15. The number of defects in a random sample of 200 parts produced by a machine is binomially distributed with p = 0.03. Based on this information, the expected number of defects in the sample is six. ANS: T BLM: Higher Order - Apply

PTS:

REF: 195

TOP: 1–2

16. The number of defects in a random sample of 200 parts produced by a machine is binomially distributed with p = 0.03. Based on this information, the standard deviation of the number of defects in the sample is 5.82. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2
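Items 15 and 16 use the binomial mean np and standard deviation √(npq). The short sketch below, with variable names chosen only for illustration, shows why 5.82 is the variance rather than the standard deviation.

```python
from math import sqrt

n, p = 200, 0.03

mean = n * p                  # expected number of defects
variance = n * p * (1 - p)    # npq
sd = sqrt(variance)           # standard deviation

print(f"{mean:.2f} {variance:.2f} {sd:.2f}")  # 6.00 5.82 2.41
```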

17. The binomial distribution is used to describe continuous random variables. ANS: F TOP: 1–2

PTS: 1 BLM: Remember

REF: 192-193 | 195

18. Where n = 150 is the number of trials and p = 0.6 is the probability of success in each trial, the standard deviation of a binomial experiment is 36. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2

19. Where n = 100 is the number of trials and p = 0.04 is the probability of success in each trial, the mean of a binomial experiment is . ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 195

TOP: 1–2

20. There are only two possible outcomes in a binomial experiment where n is the number of trials and p is the probability of success in each trial. ANS: T BLM: Remember

PTS:

REF: 192-193

TOP: 1–2

21. The Poisson probability distribution is an example of a continuous probability distribution. ANS: F BLM: Remember

PTS:

REF: 205

TOP: 3

22. The probability distribution of a Poisson random variable provides a good model for data that represent the number of occurrences of a specified event in a given unit of time or space. ANS: T BLM: Remember

PTS:

REF: 205

TOP: 3

23. The Poisson probability distribution provides a good approximation to binomial probabilities when n is large and μ = np is small, preferably with np < 7. ANS: T BLM: Remember

PTS:

REF: 205 | 209

TOP: 3

24. The mean and variance of a Poisson distribution variable are equal. ANS: T BLM: Remember

PTS:

REF: 205

TOP: 3

25. The Poisson distribution is appropriate to determine the probability of a given number of defective items in a shipment. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 195 | 205

TOP: 3

26. The mean of a Poisson distribution variable, where μ is the average number of successes occurring in a specified interval, is . ANS: F BLM: Remember

PTS:

REF: 205

TOP: 3

27. The Poisson random variable is the number of successes achieved when a random sample of size n is drawn without replacement from a population of size N within which M units have the characteristic that denotes success. ANS: F BLM: Remember

PTS:

REF: 205

TOP: 3

28. A Poisson process is the occurrence of a series of events of a given type in a random pattern over time or space such that (1) the number of occurrences within a specified time or space can equal any integer between zero and infinity, (2) the number of occurrences within one unit of time or space is independent of that in any other such (nonoverlapping) unit, and (3) the probability of occurrences is the same in all such units. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 205

TOP: 3

29. The Poisson parameter is the mean number of occurrences of an event per unit of time or space during the Poisson process. ANS: T BLM: Remember

PTS:

REF: 205

TOP: 3

30. The Poisson probability tables list the probabilities of x occurrences in a Poisson process for various values of μ, the mean number of occurrences. ANS: T TOP: 3

PTS: 1 REF: 206-209| 718-719 BLM: Higher Order - Understand

31. Hypergeometric probability distributions are examples of continuous probability distributions. ANS: F TOP: 4

PTS: 1 BLM: Remember

REF: 190| 212-213

32. Hypergeometric probability distributions are examples of discrete probability distributions. ANS: T TOP: 4

PTS: 1 BLM: Remember

REF: 190 | 212-213

33. The hypergeometric probability distribution formula calculates the probability of x successes when a random sample of size n is drawn without replacement from a population of size N within which M units have the characteristic that denotes success, and N – M units have the characteristic that denotes failure. ANS: T BLM: Remember

PTS:

REF: 212-213

TOP: 4

34. A hypergeometric probability distribution shows the probabilities associated with possible values of a discrete random variable when these values are generated by sampling with replacement, and the probability of success, therefore, changes from one trial to the next. ANS: F BLM: Remember

PTS:

REF: 212-213

TOP: 4

35. A warehouse contains six parts made by company A and ten parts made by company B. If four parts are selected at random from the warehouse, the probability that none of the four parts is from company A is approximately 0.1154. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 212-213

TOP: 4

36. A warehouse contains six parts made by company A and ten parts made by company B. If four parts are selected at random from the warehouse, the probability that one of the four parts is from company B is approximately 0.1099. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 212-213

TOP: 4
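The warehouse items can be checked with the hypergeometric formula P(x = k) = C(M, k) C(N – M, n – k) / C(N, n). In the sketch below, the function name hypergeom_pmf and the choice to count a part from company B as a "success" are assumptions made for illustration.

```python
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(x = k): k successes in a sample of n drawn without replacement
    from a population of N that contains M successes."""
    return comb(M, k) * comb(N - M, n - k) / comb(N, n)

N, n = 16, 4        # 6 parts from company A plus 10 from company B; 4 selected
parts_from_b = 10   # a part from company B counts as a success

p_all_b = hypergeom_pmf(4, N, parts_from_b, n)  # all four from B, so none from A
p_one_b = hypergeom_pmf(1, N, parts_from_b, n)  # exactly one from B
print(f"{p_all_b:.4f} {p_one_b:.4f}")  # 0.1154 0.1099
```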

37. A warehouse contains six parts made by company A and ten parts made by company B. If four parts are selected at random from the warehouse, the probability that all four parts are from company B is approximately 0.008. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 212-213

TOP: 4

PROBLEM 1. A fellow student tells you he is working with a discrete random variable, x, which takes on integer values from 0 to 50 and has a mean of 40 and a variance of 10. Is x a binomial random variable? ANS: If x is binomial, it has n = 50 trials, since x = 0, 1, 2, . . ., 50. The binomial mean is np, so if x is binomial, np = 50p = 40, so p = 0.8. If x is binomial with n = 50 and p = 0.8, the variance of x is npq. However, npq = (50) (0.8) (0.2) = 8, not 10 as stated in the problem. Thus the random variable x is not binomially distributed. PTS: 1 REF: 195 BLM: Higher Order - Analyze

TOP: 1–2

2. The taste test for PTC (phenylthiourea) is a common class demonstration in the study of genetics. It is known that 70% of Canadians are “tasters” and 30% are “non-tasters.” Suppose a genetics class of size 20 does the test to see if they match the Canadian percentage of “tasters” and “non-tasters.” (Assume the assignment of students to classes constitutes a random process.) a. What is the probability distribution of the random variable x, the number of “non-tasters” in the class? b. Find P(3 < x < 9). c. Find the mean of x. d. Find the variance of x. ANS:
a. p(x) = C(20, x)(0.3)^x (0.7)^(20 – x) for x = 0, 1, 2, . . ., 20
b. P(3 < x < 9) = 0.78
c. μ = np = 6
d. σ² = npq = 4.2

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

3. An oil firm plans to drill 20 wells, each having a probability 0.2 of striking oil. Each well costs $20,000 to drill; a well that strikes oil will bring in $750,000 in revenue. Find the expected gain from the 20 wells. ANS: Whether or not the wells strike oil, each well costs $20,000 to drill, for a total outlay of $400,000. The expected revenue is the expected number of wells striking oil multiplied by the $750,000 each one will bring in. Therefore, the expected revenue = E(revenue) = np(750,000) = 20(0.2)(750,000) = $3,000,000. As gain = revenue – costs, E(gain) = E(revenue) – costs = $2,600,000. PTS: 1 REF: 195 BLM: Higher Order - Analyze

TOP: 1–2

4. From past experience, it is known 90% of one-year-old children can distinguish their mother’s voice from the voice of a similar-sounding female. A random sample of 20 one-year-olds are given this voice recognition test. a. Find the probability that at least three children do not recognize their mother’s voice. b. Find the probability that all 20 children recognize their mother’s voice. c. Let the random variable x denote the number of children who do not recognize their mother’s voice. Find the mean of x. d. Let the random variable x denote the number of children who do not recognize their mother’s voice. Find the variance of x. e. Find the probability that, at most, four children do not recognize their mother’s voice. ANS:
a. P(x ≥ 3) = 0.323 (here p = 0.10)
b. P(x = 20) = 0.122 (here p = 0.90)
c. μ = np = 2
d. σ² = npq = 1.8
e. P(x ≤ 4) = 0.957 (here p = 0.10)

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

5. A quiz consists of 15 multiple-choice questions. Each question has five choices, with exactly one correct choice. A student, totally unprepared for the quiz, guesses on each of the 15 questions. a. How many questions should the student expect to answer correctly? b. What is the standard deviation of the number of questions answered correctly? c. If at least nine questions must be answered correctly to pass the quiz, what is the chance the student passes? ANS: In this case, n = 15 and p = 1/5 = 0.2.
a. μ = np = 3
b. σ = √(npq) = 1.549
c. P(x ≥ 9) = 0.001

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

6. Suppose 40% of the TV sets in use in Canada on a particular night were tuned into game 7 of the Stanley Cup Playoffs. a. If we were to take a sample of six in-use TV sets that night, what is the probability exactly three are tuned to the Stanley Cup Playoffs? b. If, instead, the sample consisted of 15 in-use TVs, what is the probability 5 or more are tuned to the Stanley Cup Playoffs? ANS:
a. P(x = 3) = 0.277
b. P(x ≥ 5) = 0.783

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

7. In an effort to persuade customers to reduce their electricity consumption during business hours, special discounts have been offered to low-consumption customers. One hydroelectric company estimates that 70% of the local residences have qualified for these discounts. a. If we randomly select four residences, what is the probability exactly two of the four will have qualified for the discounts? b. If, instead, a sample of 20 residences is selected, what is the probability 12 or more will have qualified for the discounts? ANS: a. P(x = 2) = 0.264 (here n = 4) b. P(x ≥ 12) = 0.887 (here n = 20) PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

8. Hotels, like airlines, often overbook, counting on the fact that some people with reservations will cancel at the last minute. A certain hotel chain has found that 20% of the reservations will not be used. a. If we randomly selected 15 reservations, what is the probability more than 8 but fewer than 12 reservations will be used? b. If four reservations are made, what is the chance fewer than two will cancel? ANS: a. P(8 < x < 12) = 0.334 (here p = 0.8) b. P(x < 2) = 0.8192 (here p = 0.2) PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

9. To determine the effectiveness of a diet to reduce the amount of serum cholesterol, 20 people were randomly selected to try the diet. After they had been on the diet a sufficient amount of time, their blood was tested to determine the level of serum cholesterol. The clinic running the experiment will approve and endorse the diet only if at least 16 of the 20 people lower their cholesterol significantly. a. What is the random variable of interest? b. What is the probability the clinic will erroneously endorse the diet when, in fact, it is useless in treating high serum cholesterol? [Hint: If the diet is ineffective, a person’s cholesterol level is equally likely to go up or down.] ANS: a. The number of persons in the study who significantly lowered their serum cholesterol while on the diet. b. P(x ≥ 16) = 0.006 (here p = 0.50)

PTS: 1 REF: 192-193 | 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

10. The local Ford dealership claims 40% of its customers are return buyers, meaning that the customers have previously purchased a Ford vehicle. Let the random variable x be the number of return buyers in a random sample of 10 recent customers. a. Find P(1 < x < 5). b. Find P(x > 6). c. What would you conclude about the accuracy of the dealership’s claim if you found nine repeat buyers in the sample? Explain your reasoning. ANS: a. P(1 < x < 5) = P(x ≤ 4) – P(x ≤ 1) = 0.633 – 0.046 = 0.587 b. P(x > 6) = 1 – P(x ≤ 6) = 1 – 0.945 = 0.055 c. There is reason to doubt the dealership’s claim because P(x = 9) is only about 0.002 if p = 0.40. PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Evaluate

TOP: 1–2

11. An applicant must score at least 80 points on a particular psychological test to be eligible to join CUSO (Canadian University Services Overseas). From several years’ experience, it is known that 60% of the applicants meet this requirement. Let the random variable x be the number of applicants who meet this requirement out of a (randomly selected) group of 25 applicants. a. Find the mean of x. b. Find the standard deviation of x. c. Using the Empirical Rule, within what limits would you expect most (say, 95%) of the measurements to be? ANS: a. μ = np = 15 b. σ = √(npq) = 2.45 c. We would expect most of the measurements to be between 10.1 and 19.9, which, since x is discrete, translates as between 11 and 19. PTS: 1 REF: 195 | 69-70 BLM: Higher Order - Analyze

TOP: 1–2

12. A politician claims 55% of the voters will vote for her in an upcoming election. An independent watchdog organization queried a random sample of 500 likely voters, of whom only 249 said they would vote for the politician in question. Do the results of this sample support the politician’s claim? [Hint: Binomial distributions with p near 50% are quite mound-shaped.] ANS: No, because 249 is more than 2 standard deviations below the mean. Sample results like this would occur approximately 2.5% of the time if the politician’s claim of 55% were true.

PTS: 1 REF: 195 | 69-70 BLM: Higher Order - Evaluate

TOP: 1–2
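The reasoning in problem 12 amounts to measuring how far the observed count lies from the binomial mean in standard-deviation units. A minimal sketch under that reading follows; the variable names are illustrative.

```python
from math import sqrt

n, p = 500, 0.55      # sample size and the politician's claimed proportion
observed = 249        # voters in the sample who support her

mean = n * p                      # expected count if the claim is true
sd = sqrt(n * p * (1 - p))        # binomial standard deviation
z = (observed - mean) / sd        # distance from the mean in standard deviations

print(f"{mean:.1f} {sd:.2f} {z:.2f}")  # 275.0 11.12 -2.34
```

Because the count falls more than two standard deviations below the mean, the sample gives little support to the claim.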

13. A Canadian Medical Association study showed that 20% of all Canadians suffer from high blood pressure. Suppose we randomly sample ten Canadians to determine the number in the sample who have high blood pressure. a. What is the probability that, at most, two persons have high blood pressure? b. If, instead, we sampled 25 Canadians, give a reason why the following statement is true: “The probability that between one and nine individuals from a sample of 25 would have high blood pressure is at least 0.75.” [Hint: For n and p of approximately the sizes given, the distribution would not be particularly symmetric.] ANS: a. P(x ≤ 2) = 0.678 b. μ ± 2σ = 5 ± 4, or (1, 9). Tchebysheff’s Theorem says at least 75% of the observations lie in the interval μ ± 2σ. PTS: 1 REF: 195-197 | 68-69 | 712-717 BLM: Higher Order - Evaluate

TOP: 1–2
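Part (b) of problem 13 can be illustrated by computing the interval μ ± 2σ and comparing the exact binomial probability of falling inside it with the Tchebysheff bound of 0.75. The sketch below is a hedged illustration; its variable names are not from the text.

```python
from math import comb, sqrt

n, p = 25, 0.20

mean = n * p
sd = sqrt(n * p * (1 - p))
low, high = mean - 2 * sd, mean + 2 * sd   # the interval mu +/- 2 sigma

# Exact binomial probability that x lands inside the interval
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n + 1) if low <= k <= high)

print(round(low, 3), round(high, 3), exact >= 0.75)  # 1.0 9.0 True
```

Tchebysheff’s Theorem only guarantees a lower bound, so the exact probability may be well above 0.75.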

14. Consider an experiment with 25 trials where the probability of success on any trial is 0.01, and let the random variable x be the number of successes among the 25 trials. What are p(0), p(1), p(2), and p(3) using the exact binomial distribution? ANS:

x      0      1      2      3
p(x)   0.778  0.196  0.024  0.002

PTS: 1 REF: 195 BLM: Higher Order - Apply

TOP: 1–2
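The four probabilities in problem 14 come straight from the binomial formula. The sketch below recomputes them; the helper name binomial_pmf is an assumption introduced for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(x = k) for n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 25, 0.01
probs = [binomial_pmf(k, n, p) for k in range(4)]  # p(0), p(1), p(2), p(3)
print(" ".join(f"{v:.3f}" for v in probs))  # 0.778 0.196 0.024 0.002
```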

15. It is known that 70% of the customers in a sporting goods store purchase a pair of running shoes. A random sample of 25 customers is selected. Assume that customers’ purchases are made independently, and let x represent the number of customers who purchase running shoes. a. What is the probability that exactly 18 customers purchase running shoes? b. What is the probability that no more than 19 customers purchase running shoes? c. What is the probability that at least 17 customers purchase running shoes? d. What is the probability that between 17 and 21 customers, inclusively, purchase running shoes? ANS:

a. P(x = 18) = P(x ≤ 18) – P(x ≤ 17) = 0.659 – 0.488 = 0.171
b. P(x ≤ 19) = 0.807
c. P(x ≥ 17) = 1 – P(x < 17) = 1 – P(x ≤ 16) = 1 – 0.323 = 0.677
d. P(17 ≤ x ≤ 21) = P(x ≤ 21) – P(x ≤ 16) = 0.967 – 0.323 = 0.644

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2

16. A statistics department is contacting alumni by telephone and asking for donations to help fund a new computer laboratory. Past history shows that 80% of the alumni contacted in this manner will make a contribution of at least $50. A random sample of 20 alumni is selected. Let x represent the number of alumni that make a contribution of at least $50. a. What is the probability that exactly 15 alumni will make a contribution of at least $50? b. What is the probability that between 14 and 18 alumni, inclusively, will make a contribution of at least $50? c. What is the probability that fewer than 17 alumni will make a contribution of at least $50? d. What is the probability that more than 15 alumni will make a contribution of at least $50? e. How many alumni would you expect to make a contribution of at least $50? ANS: a. P(x = 15) = P(x ≤ 15) – P(x ≤ 14) = 0.370 – 0.196 = 0.174 b. P(14 ≤ x ≤ 18) = P(x ≤ 18) – P(x ≤ 13) = 0.931 – 0.087 = 0.844 c. P(x < 17) = P(x ≤ 16) = 0.589 d. P(x > 15) = 1 – P(x ≤ 15) = 1 – 0.370 = 0.630 e. E(x) = np = (20)(0.80) = 16 PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2
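The answers to problem 16 are all differences or complements of cumulative binomial probabilities, which is how the table in the text is used. The sketch below mirrors that table lookup in code; binomial_cdf is a hypothetical helper name, not something defined in the text.

```python
from math import comb

def binomial_cdf(k, n, p):
    """P(x <= k) for a binomial random variable, summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

n, p = 20, 0.80

p_exactly_15 = binomial_cdf(15, n, p) - binomial_cdf(14, n, p)   # part a
p_14_to_18 = binomial_cdf(18, n, p) - binomial_cdf(13, n, p)     # part b
p_fewer_17 = binomial_cdf(16, n, p)                              # part c
p_more_15 = 1 - binomial_cdf(15, n, p)                           # part d
expected = n * p                                                 # part e

print(f"{p_exactly_15:.3f} {p_14_to_18:.3f} {p_fewer_17:.3f} {p_more_15:.3f}")
```

Values computed this way can differ from the table-based answers in the third decimal place, because the table entries are rounded before they are subtracted.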

17. Let x be a binomial random variable with n = 15 and p = 0.20.
a. Calculate P(x ≤ 4) using the binomial formula.
b. Calculate P(x ≤ 4) using the cumulative binomial probabilities table in Appendix I.
c. Compare the results of (a) and (b).
d. Calculate the mean and standard deviation of the random variable x.
e. Use the results of (d) to calculate the intervals μ ± σ, μ ± 2σ, and μ ± 3σ. Find the probability that an observation will fall into each of these intervals.
f. Are the results of part (e) consistent with Tchebysheff’s Theorem? With the Empirical Rule?
ANS:
a. p(0) = 0.035184, p(1) = 0.131941, p(2) = 0.230897, p(3) = 0.250139, p(4) = 0.187604. Hence, P(x ≤ 4) ≈ 0.83577.
b. P(x ≤ 4) = 0.836
c. The results are almost identical.
d. μ = np = 3 and σ = √(npq) = √2.4 = 1.549
e. μ ± σ = 3 ± 1.549, or 1.451 to 4.549, then P(2 ≤ x ≤ 4) = 0.669; μ ± 2σ = 3 ± 3.098, or –0.098 to 6.098, then P(0 ≤ x ≤ 6) = 0.982; μ ± 3σ = 3 ± 4.647, or –1.647 to 7.647, then P(0 ≤ x ≤ 7) = 0.996.
f. The results are consistent with Tchebysheff’s Theorem and the Empirical Rule.

PTS: 1 REF: 195-197 | 712-717 | 68-70 BLM: Higher Order - Evaluate

TOP: 1–2

18. The Scholastic Aptitude Test (SAT) is a standardized test for college admissions in the United States. In 1996, the average combined SAT score (math + verbal) for students in the United States was 1013, and 41% of all high school graduates took this test. Suppose that 500 students are randomly selected throughout the United States. Which of the following random variables has an approximate binomial distribution? In each case, provide a justification for your answer. a. the number of students who took the SAT b. the scores of the 500 students on the SAT c. the number of students who scored above average on the SAT d. the amount of time it took each student to complete the SAT ANS: a. Each student either took (S) or did not take (F) the SAT. Since the population of students is large, the probability p = 0.41 that a particular student took the SAT will not vary from student to student and the trials will be independent. This is a binomial random variable. b. The measurement taken on each student is a score, which can take more than two values. This is not a binomial random variable. c. Each student either will (S) or will not (F) score above average. As in (a), the trials are independent although the value of p, the proportion of students in the population who score above average, is unknown. This is a binomial experiment. d. The measurement taken on each student is amount of time, which can take more than two values. This is not a binomial random variable. PTS: 1 REF: 192-193 | 195 BLM: Higher Order - Analyze

TOP: 1–2

19. Car colour preferences change over the years and according to the particular model that the customer selects. In a recent year, 20% of all luxury cars sold were white. Define x to be the number of cars that are white. Assume that 25 cars of that year and type are randomly selected. a. Find the probability that at least five cars are white. b. Find the probability that at most six cars are white. c. Find the probability that more than three cars are white. d. Find the probability that exactly four cars are white. e. Find the probability that three to five cars (inclusive) are white. f. Find the probability that more than 20 cars are not white.

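ANS: p = P(white) = 0.20 and n = 25. Use the cumulative binomial probabilities table in Appendix I of your book.
a. P(x ≥ 5) = 1 – P(x ≤ 4) = 1 – 0.421 = 0.579
b. P(x ≤ 6) = 0.780
c. P(x > 3) = 1 – P(x ≤ 3) = 1 – 0.234 = 0.766
d. P(x = 4) = P(x ≤ 4) – P(x ≤ 3) = 0.421 – 0.234 = 0.187
e. P(3 ≤ x ≤ 5) = P(x ≤ 5) – P(x ≤ 2) = 0.617 – 0.098 = 0.519
f. P(more than 20 not white) = P(fewer than 5 white) = P(x ≤ 4) = 0.421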

PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Apply

TOP: 1–2

20. A new surgical procedure is said to be successful 90% of the time. Suppose the operation is performed five times and the results are assumed to be independent of one another. Define x to be the number of successful operations. a. Find the probability that all five operations are successful. b. Find the probability that exactly four are successful. c. Find the probability that fewer than two are successful. ANS: Here p = P(success) = 0.9 and n = 5. We will use the binomial formula; however, we may also use the cumulative binomial probabilities table in Appendix I of the text to calculate the necessary probabilities. a. P(x = 5) = (0.9)^5 = 0.59049 b. P(x = 4) = 5(0.9)^4(0.1) = 0.32805 c. P(x < 2) = P(x = 0) + P(x = 1) = (0.1)^5 + 5(0.9)(0.1)^4 = 0.00046 PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Analyze

TOP: 1–2
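The binomial calculations in Questions 19 and 20 can be checked numerically. The following is a minimal sketch, not part of the original answer key; the helper names binom_pmf and binom_cdf are illustrative assumptions, and the cumulative function plays the role of the table in Appendix I.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(x = k) for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(x <= k), the quantity tabulated in a cumulative binomial table."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

# Question 20: n = 5 operations, p = 0.9
all_five = binom_pmf(5, 5, 0.9)        # P(x = 5)
exactly_four = binom_pmf(4, 5, 0.9)    # P(x = 4)
fewer_than_two = binom_cdf(1, 5, 0.9)  # P(x < 2) = P(x <= 1)

# Question 19(a): n = 25 cars, p = 0.20
at_least_five = 1 - binom_cdf(4, 25, 0.20)  # P(x >= 5)

print(round(all_five, 5), round(exactly_four, 5), round(fewer_than_two, 5))
print(round(at_least_five, 3))
```

Working with P(x ≤ k) and complements mirrors how the printed table is used, so each intermediate value can be compared against a table entry.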

21. Four in ten Canadians who travel by car look for gas and food outlets that are close to or visible from the highway. Suppose a random sample of n = 20 Canadians who travel by car are asked how they determine where to stop for food and gas. Let x be the number in the sample who respond that they look for gas and food outlets that are close to or visible from the highway. a. What are the mean and variance of x? b. Calculate the interval μ ± 2σ. What values of the binomial random variable x fall into this interval? c. Find P(4 ≤ x ≤ 12). How does this compare with the fraction in the interval μ ± 2σ for any distribution? For mound-shaped distributions? ANS: a. μ = np = 20(0.4) = 8, σ² = npq = 20(0.4)(0.6) = 4.8, and σ = √4.8 = 2.191. b. μ ± 2σ = 8 ± 4.3818, or 3.6182 to 12.3818. Since x can take only integer values from 0 to 20, this interval consists of the values of x in the range 4 ≤ x ≤ 12. c. P(4 ≤ x ≤ 12) = P(x ≤ 12) – P(x ≤ 3) = 0.979 – 0.016 = 0.963. This value agrees with Tchebysheff’s Theorem (at least 3/4 of the measurements are in this interval) and also with the Empirical Rule (approximately 95% of the measurements are in this interval). PTS: 1 REF: 195-197 | 712-717 | 68-70 BLM: Higher Order - Evaluate

TOP: 1–2

22. Let x be a binomial random variable with n = 20 and p = 0.05. a. Calculate P(x ≤ 2) using the cumulative binomial probabilities table in Appendix I of your text. b. Use the Poisson approximation to calculate P(x ≤ 2). c. Compare the results of (a) and (b). Is the approximation accurate?

ANS: a. P(x ≤ 2) = 0.925 b. With μ = np = 20(0.05) = 1, the approximation is p(k) = μ^k e^(–μ)/k!. Then P(x ≤ 2) ≈ p(0) + p(1) + p(2) = 0.3679 + 0.3679 + 0.1839 = 0.9197. c. The approximation is quite accurate.

PTS: 1 REF: 195-197 | 205 | 209 | 718-719 BLM: Higher Order - Analyze

TOP: 1–2
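The comparison in Question 22 can be reproduced directly. This is a sketch under the assumption that exact binomial sums stand in for the Appendix I table; the function names are illustrative and do not come from the text.

```python
from math import comb, exp, factorial

def binom_cdf(k, n, p):
    """Exact P(x <= k) for a binomial random variable."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def poisson_cdf(k, mu):
    """P(x <= k) for a Poisson random variable with mean mu."""
    return sum(mu**i * exp(-mu) / factorial(i) for i in range(k + 1))

n, p = 20, 0.05
exact = binom_cdf(2, n, p)        # binomial value, part (a)
approx = poisson_cdf(2, n * p)    # Poisson approximation with mu = np = 1, part (b)

print(round(exact, 4), round(approx, 4), round(abs(exact - approx), 4))
```

The printed difference shows how close the approximation is when n is moderately large and p is small.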

23. A psychiatrist believes that 90% of all people who visit doctors have problems of a psychosomatic nature. She decides to select 20 patients at random to test her theory. a. Assuming that the psychiatrist’s theory is true, what is the expected value of x, the number of the 20 patients who have psychosomatic problems? b. What is the variance of x, assuming that the theory is true? c. Find P(x ≤ 14), assuming that the theory is true. d. Based on the probability in (c), if only 14 of the 20 sampled had psychosomatic problems, what conclusions would you make about the psychiatrist’s theory? Explain. ANS: a. E(x) = np = 20(0.9) = 18 b. σ² = npq = 20(0.9)(0.1) = 1.8 c. P(x ≤ 14) = 0.011

d. Assuming that the psychiatrist is correct, observing x = 14 or one of the even less likely values x = 0, 1, 2, ..., 13 is a very unlikely event. Hence, one of two conclusions can be drawn. Either we have observed a very unlikely event, or the psychiatrist is incorrect and p is actually less than 0.9. We would probably conclude that the psychiatrist is incorrect. The probability that we have made an incorrect decision is 0.011, which is quite small. PTS: 1 REF: 195-197 | 712-717 BLM: Higher Order - Evaluate

TOP: 1–2

24. About 80% of Victoria residents believe that Victoria is a nice or very nice place to live. Suppose that five randomly selected Victoria residents are interviewed. a. Find the probability that all five residents think that Victoria is a nice or very nice place to live. b. What is the probability that at least one resident does not think that Victoria is a nice or very nice place to live? c. What is the probability that exactly one resident does not think that Victoria is a nice or very nice place to live? ANS: a. The random variable x, the number of Victorians who think that Victoria is a nice or very nice place to live, has a binomial distribution with n = 5 and p = 0.80. Then, P(x = 5) = (0.8)^5 = 0.3277. b. The probability that at least one does not think Victoria is nice is the same as the probability that at most four do think Victoria is a nice place to live. The desired probability is P(at most four think Victoria is nice) = P(x ≤ 4) = 1 – P(x = 5) = 1 – 0.3277 = 0.6723. c. P(exactly one does not) = P(x = 4) = C(5, 4)(0.8)^4(0.2) = 0.4096

PTS:

REF: 195

TOP: 1–2

BLM: Higher Order

25. Let x be a Poisson random variable with mean 3. What is the probability that x will fall in the interval μ ± 2σ? ANS: σ = √3 = 1.732, so μ ± 2σ = 3 ± 3.464; P(–0.464 ≤ x ≤ 6.464) = P(x ≤ 6) = 0.966

PTS: 1 REF: 205-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

Telephone Switchboard Narrative The number of telephone calls coming into a business’s switchboard averages four calls per minute. Let x be the number of calls received. 26. Refer to Telephone Switchboard Narrative. Find P(x = 0). ANS: x has a Poisson distribution with mean 4. Then, p(0) = e^(–4) = 0.018.

PTS: 1 REF: 205 BLM: Higher Order - Analyze

TOP: 3

27. Refer to Telephone Switchboard Narrative. What is the probability there will be at least one call in a given one-minute period? ANS: P(x ≥ 1) = 1 – P(x = 0) = 1 – 0.018 = 0.982. PTS:

REF: 205

TOP: 3

BLM: Higher Order

28. Refer to Telephone Switchboard Narrative. What is the probability at least one call will be received in a given two-minute period? ANS: P(x ≥ 1) = 1 – P(x = 0) = 1 – P(no calls in first minute ∩ no calls in second minute) = 1 – p(0) p(0) [since calls are independent] = 1 – (0.018)^2 = 0.999676 ≈ 1 PTS: 1 REF: 205 | 153 BLM: Higher Order - Apply

TOP: 3
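The two-minute result in Question 28 can be cross-checked in two ways. The sketch below assumes calls follow a Poisson process, so the count over two minutes is itself Poisson with mean 8; that modelling assumption and the helper name poisson_pmf are not stated in the answer key.

```python
from math import exp, factorial

def poisson_pmf(k, mu):
    """P(x = k) for a Poisson random variable with mean mu."""
    return mu**k * exp(-mu) / factorial(k)

p0_one_minute = poisson_pmf(0, 4)            # Question 26: e^(-4)
at_least_one = 1 - p0_one_minute             # Question 27
two_min_product = 1 - p0_one_minute**2       # Question 28 via independence of the two minutes
two_min_direct = 1 - poisson_pmf(0, 8)       # same event, using mean 4 * 2 = 8

print(round(p0_one_minute, 3), round(at_least_one, 3))
print(round(two_min_product, 6), round(two_min_direct, 6))
```

Both routes give the same unrounded value, roughly 0.99966; the key's 0.999676 comes from squaring the rounded figure 0.018.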

Computer Disks Narrative The quality of computer disks is measured by sending the disks through a certifier that counts the number of missing pulses. A certain brand of computer disks averages 0.1 missing pulses per disk. Let the random variable x denote the number of missing pulses. 29. Refer to Computer Disks Narrative. What is the distribution of x? ANS: x has a Poisson distribution with mean 0.1. PTS: 1 REF: 205 BLM: Higher Order - Analyze

TOP: 3

30. Refer to Computer Disks Narrative. Find the probability that the next inspected disk will have no missing pulse. ANS: P(x = 0) = 0.905 PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

31. Refer to Computer Disks Narrative. Find the probability the next disk inspected will have more than one missing pulse. ANS:

P(x > 1) = 0.005 PTS: 1 REF: 206-208 | 718-719 BLM: Higher Order - Apply

TOP: 3

32. Refer to Computer Disks Narrative. Find the probability neither of the next two disks inspected will contain any missing pulse. ANS: We can assume the disks are independent. P(x = 0 on first disk and x = 0 on second disk) = P(x = 0) · P(x = 0) = (0.905)^2 = 0.819. PTS: 1 REF: 205 | 153 BLM: Higher Order - Apply

TOP: 3

33. The number of teleport inquiries, x, in a timesharing computer system averages 0.2 per millisecond and follows a Poisson distribution. a. Find the probability that no inquiries are made during the next millisecond. b. Find the probability that no inquiries are made during the next three milliseconds. ANS: a. P(x = 0) = e^(–0.2) = 0.819 b. Since teleport inquiries are independent, P(x = 0 in the first, second, and third milliseconds) = (0.819)^3 = 0.549. PTS: 1 REF: 205 | 153 BLM: Higher Order - Apply

TOP: 3

34. Rebuilt ignition systems leave an aircraft repair shop at an average rate of three per hour. The assembly line needs four ignition systems in the next hour. What is the probability they will be available? ANS: Let the random variable x be the number of rebuilt ignition systems available per hour. Then x is a Poisson random variable with mean 3. P(x ≥ 4) = 1 – P(x ≤ 3) = 1 – 0.647 = 0.353. (Any value of x greater than or equal to four will assure there are four available to the assembly line.) PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Analyze

TOP: 3

35. Consider an experiment with 25 trials where the probability of success on any trial is 0.01, and let the random variable x be the number of successes among the 25 trials. What are p(0), p(1), p(2), and p(3), based on the Poisson approximation to the binomial? ANS: With μ = np = 25(0.01) = 0.25, the approximate probabilities are:

x      0      1      2      3
p(x)   0.779  0.195  0.024  0.002

PTS: 1 REF: 205 | 209 BLM: Higher Order - Apply

TOP: 3

36. The probability that the 1993–94 flu vaccine immunizes those receiving it is 0.97. If a random sample of 200 people receive the vaccine, what is the probability the vaccine will be ineffective on at most 5 people? ANS: Let the random variable x be the number of persons in whom the vaccine is ineffective; x has a binomial distribution with n = 200 and p = 0.03. Using the Poisson approximation to the binomial with μ = np = 200(0.03) = 6, P(x ≤ 5) = 0.446. PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Analyze

TOP: 3

37. A salesperson has found the probability of making a sale on a particular product manufactured by his company is 0.05. If the salesperson contacts 140 potential customers, what is the probability he will sell at least 2 of these products? ANS: Let the random variable x be the number of products sold; x then has a binomial distribution with n = 140 and p = 0.05. Using the Poisson approximation to the binomial with μ = np = 140(0.05) = 7, P(x ≥ 2) = 1 – P(x ≤ 1) = 1 – 0.007 = 0.993. PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Analyze

TOP: 3

38. A shipment of six parrots from Brazil includes two parrots with a potentially fatal disease. As usual, the Customs Office at the shipment’s point of entry randomly samples two parrots and tests them for disease. Let the random variable x be the number of healthy parrots in the sample. Find the probability distribution of x. ANS: x has a hypergeometric distribution with two diseased parrots and four healthy parrots, for a total of six parrots. Then, the probability distribution of x is given by p(x) = C(4, x) C(2, 2 – x) / C(6, 2) for x = 0, 1, 2. PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 3

Warehouse Narrative A warehouse contains ten computer printers, four of which are defective. A company randomly selects five of the ten printers to purchase. 39. Refer to Warehouse Narrative. What is the probability all five are non-defective? ANS: Let the random variable x be the number of non-defective printers in the sample; x has a hypergeometric distribution. Then, P(x = 5) = C(6, 5) C(4, 0) / C(10, 5) = 6/252 = 1/42 = 0.0238. PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 3

40. Refer to Warehouse Narrative. What is the mean of x? ANS: μ = n(M/N) = 5(6/10) = 3

PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 3

41. Refer to Warehouse Narrative. What is the variance of x? ANS: 2/3 = 0.667 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 3

Automobile Spark Plugs Narrative An eight-cylinder automobile engine has two misfiring spark plugs. The mechanic removes all four plugs from one side of the engine. 42. Refer to Automobile Spark Plugs Narrative. What is the probability the two misfiring spark plugs are among those removed? ANS: The random variable x = number of misfiring spark plugs removed has a hypergeometric distribution. Then, P(x = 2) = C(2, 2) C(6, 2) / C(8, 4) = 15/70 = 3/14 = 0.2143

PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 3

43. Refer to Automobile Spark Plugs Narrative. What is the mean number of misfiring spark plugs? ANS: 1 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 3

44. Refer to Automobile Spark Plugs Narrative. What is the variance of the number of misfiring spark plugs? ANS: 3/7 = 0.4286 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 3

45. From a group of ten bank officers, three are selected at random to be relocated and to supervise new branch offices. If two of the ten officers are women and eight are men, what is the probability exactly one of the officers to be relocated will be a woman? ANS: The random variable x = number of women to be relocated and to supervise new branch offices has a hypergeometric distribution. Then, P(x = 1) = C(2, 1) C(8, 2) / C(10, 3) = 56/120 = 7/15 = 0.4667.

PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 3

46. A package of six light bulbs contains two defective bulbs. If three bulbs are selected for use, find the probability none are defective. ANS: The random variable x = number of defective bulbs has a hypergeometric distribution. Then, P(x = 0) = C(2, 0) C(4, 3) / C(6, 3) = 4/20 = 1/5 = 0.20

PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 3
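The hypergeometric answers in Questions 39, 42, 45, and 46 all use the same counting formula, which can be verified with exact fractions. This is a minimal sketch; the function name hypergeom_pmf and its argument order (k, N, M, n) are illustrative assumptions, with N the population size, M the number of successes in the population, and n the sample size.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, M, n):
    """P(x = k): k successes in n draws without replacement from N items, M of them successes."""
    return Fraction(comb(M, k) * comb(N - M, n - k), comb(N, n))

printers = hypergeom_pmf(5, 10, 6, 5)     # Question 39: all five non-defective
spark_plugs = hypergeom_pmf(2, 8, 2, 4)   # Question 42: both misfiring plugs removed
officers = hypergeom_pmf(1, 10, 2, 3)     # Question 45: exactly one woman relocated
bulbs = hypergeom_pmf(0, 6, 2, 3)         # Question 46: no defective bulbs

for name, prob in [("printers", printers), ("spark plugs", spark_plugs),
                   ("officers", officers), ("bulbs", bulbs)]:
    print(name, prob, round(float(prob), 4))
```

Using Fraction keeps the results in the reduced form given in the answers (1/42, 3/14, 7/15, 1/5) before converting to decimals.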

47. Three yellow and two blue pencils are in a drawer. If we randomly select two pencils from the drawer, find the probability distribution of x, the number of yellow pencils selected.

ANS: Using the multiplication rule: P(YY) = (3/5)(2/4) = 6/20, and x = 2 yellow pencils; P(YB) = (3/5)(2/4) = 6/20, and x = 1 yellow pencil; P(BY) = (2/5)(3/4) = 6/20, and x = 1 yellow pencil; P(BB) = (2/5)(1/4) = 2/20, and x = 0 yellow pencils. Using the hypergeometric distribution: p(x) = C(3, x) C(2, 2 – x) / C(5, 2) for x = 0, 1, and 2 yellow pencils. Both approaches yield the following probability distribution:

x      0      1      2
p(x)   2/20   12/20  6/20

PTS: 1 REF: 198 | 212-213 BLM: Higher Order - Analyze

TOP: 3

Health Care Narrative Students arrive at a health centre, according to a Poisson distribution, at a rate of four every 15 minutes. Let x represent the number of students arriving in a 15-minute period. 48. Refer to Health Care Narrative. What is the probability that no more than three students arrive during a 15-minute period? ANS: P(x ≤ 3) = 0.433 PTS: 1 REF: 205-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

49. Refer to Health Care Narrative. What is the probability that exactly five students arrive in a 15-minute period? ANS: P(x = 5) = 4^5 e^(–4)/5! = P(x ≤ 5) – P(x ≤ 4) = 0.785 – 0.629 = 0.156

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

50. Refer to Health Care Narrative. What is the probability that more than five students arrive in a 15-minute period?

ANS: P(x > 5) = 1 – P(x ≤ 5) = 1 – 0.785 = 0.215 PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

51. Refer to Health Care Narrative. What is the probability that between four and eight students, inclusively, arrive in a 15-minute period? ANS: P(4 ≤ x ≤ 8) = P(x ≤ 8) – P(x ≤ 3) = 0.979 – 0.433 = 0.546

PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

Bicycle Repair Shop Narrative The number of people arriving at a bicycle repair shop follows a Poisson distribution, with an average of five arrivals per hour. Let x represent the number of people arriving per hour. 52. Refer to Bicycle Repair Shop Narrative. What is the probability that seven people arrive at the bike repair shop in a one-hour period? ANS: P(x = 7) = 5^7 e^(–5)/7! = 0.1044 or P(x ≤ 7) – P(x ≤ 6) = 0.867 – 0.762 = 0.105

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

53. Refer to Bicycle Repair Shop Narrative. What is the probability that, at most, seven people arrive at the bike repair shop in a one-hour period? ANS: P(x ≤ 7) = 0.867 PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

54. Refer to Bicycle Repair Shop Narrative. What is the probability that more than seven people arrive at the bike repair shop in a one-hour period? ANS: P(x > 7) = 1 – P(x ≤ 7) = 1 – 0.867 = 0.133

PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

TOP: 3

55. Refer to Bicycle Repair Shop Narrative. What is the probability that between four and nine people, inclusively, arrive at the bike repair shop in a one-hour period? ANS: P(4 ≤ x ≤ 9) = P(x ≤ 9) – P(x ≤ 3) = 0.968 – 0.265 = 0.703

PTS: 1 REF: 206-209 | 718-719 BLM: Higher Order - Apply

TOP: 3
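The Health Care and Bicycle Repair Shop answers all follow one pattern: an interval probability is a difference of two cumulative Poisson values. The sketch below is illustrative only; poisson_cdf and poisson_between are assumed helper names, and the unrounded results can differ from the table-based answers by about 0.001 because the table entries are rounded before subtracting.

```python
from math import exp, factorial

def poisson_cdf(k, mu):
    """P(x <= k) for a Poisson random variable with mean mu."""
    return sum(mu**i * exp(-mu) / factorial(i) for i in range(k + 1))

def poisson_between(lo, hi, mu):
    """P(lo <= x <= hi) = P(x <= hi) - P(x <= lo - 1)."""
    return poisson_cdf(hi, mu) - poisson_cdf(lo - 1, mu)

health_4_to_8 = poisson_between(4, 8, 4)     # Question 51
health_more_than_5 = 1 - poisson_cdf(5, 4)   # Question 50
bike_exactly_7 = poisson_between(7, 7, 5)    # Question 52
bike_4_to_9 = poisson_between(4, 9, 5)       # Question 55

print(round(health_4_to_8, 3), round(health_more_than_5, 3))
print(round(bike_exactly_7, 4), round(bike_4_to_9, 3))
```

Note that the lower limit uses lo – 1, which is why the inclusive interval from four to eight subtracts P(x ≤ 3) rather than P(x ≤ 4).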

56. It was estimated that 2% of a particular 1997 model minivan had incorrectly installed brake lines. Suppose 300 minivans of this model are selected at random. Let x represent the number of minivans with incorrectly installed brake lines. What is the probability that nine have incorrectly installed brake lines? ANS: Since n is large and p is small, use the Poisson approximation to the binomial distribution with μ = np = 300(0.02) = 6. P(x = 9) = 6^9 e^(–6)/9! = P(x ≤ 9) – P(x ≤ 8) = 0.916 – 0.847 = 0.069

PTS: 1 REF: 205 | 209 BLM: Higher Order - Analyze

TOP: 3

Intensive Care Unit Narrative The number x of people entering the intensive care unit at a particular hospital on any one day has a Poisson probability distribution with mean equal to four persons per day. 57. Refer to Intensive Care Unit Narrative. What is the probability that the number of people entering the intensive care unit on a particular day is two? ANS: P(x = 2) = 4^2 e^(–4)/2! = 0.146525 PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

58. Refer to Intensive Care Unit Narrative. What is the probability that the number of people entering the intensive care unit on a particular day is less than or equal to two? ANS: P(x ≤ 2) = p(0) + p(1) + p(2) = 0.018316 + 0.073263 + 0.146525 = 0.238103 PTS: 1 REF: 205 | 209 | 718-719 BLM: Higher Order - Apply

TOP: 3

59. Refer to Intensive Care Unit Narrative. Is it likely that x will exceed ten? Explain. ANS: Recall that for the Poisson distribution, μ = 4 and σ = √4 = 2.0. Therefore the value x = 10 lies (10 – 4)/2.0 = 3.0 standard deviations above the mean. It is not a very likely event. PTS: 1 REF: 205 | 68-69 BLM: Higher Order - Analyze

TOP: 3

Insulin-Dependent Diabetes Narrative Insulin-dependent diabetes (IDD) is a common chronic disorder of children. Let us assume that an area in Europe has an incidence of 6 cases per 100,000 per year. 60. Refer to Insulin-Dependent Diabetes Narrative. Can the distribution of the number of cases of IDD in this area be approximated by a Poisson distribution? If so, what is the mean? Justify your answers. ANS: Since cases of IDD are not likely to be contagious, these cases of the disorder occur independently at a rate of 6 per 100,000 per year. This random variable can be approximated by the Poisson random variable with μ = 6. PTS: 1 REF: 205 BLM: Higher Order - Analyze

TOP: 3

61. Refer to Insulin-Dependent Diabetes Narrative. What is the probability that the number of cases of IDD in this area is less than or equal to 3 per 100,000? ANS: P(x ≤ 3) = 0.151

PTS: 1 REF: 205-208 | 718-719 BLM: Higher Order - Apply

TOP: 3

62. Refer to Insulin-Dependent Diabetes Narrative. What is the probability that the number of cases is greater than or equal to 3 but less than or equal to 7 per 100,000? ANS: P(3 ≤ x ≤ 7) = P(x ≤ 7) – P(x ≤ 2) = 0.744 – 0.062 = 0.682

PTS: 1 REF: 206-208 | 718-719 BLM: Higher Order - Apply

TOP: 3

63. Refer to Insulin-Dependent Diabetes Narrative. Would you expect to observe 10 or more cases of IDD per 100,000 in this area in a given year? Why or why not?

ANS: The probability of observing 10 or more cases per 100,000 in a year is P(x ≥ 10) = 1 – P(x ≤ 9) = 1 – 0.916 = 0.084. This is an occurrence that we would not expect to see very often, if in fact μ = 6. PTS: 1 REF: 206-208 | 718-719 | 68-69 BLM: Higher Order - Evaluate

TOP: 3

Toll Station Narrative It is known that between 8 and 10 a.m. on Saturdays, cars arrive at a certain toll station at a rate of 60 per hour. Assume that a Poisson process is occurring and that the random variable x represents the number of cars arriving at the station between 9:00 and 9:05 a.m. 64. Refer to Toll Station Narrative. What is the expected number of cars arriving at the toll station between 9:00 and 9:05 a.m.? ANS: T = 5/60 hour; hence, μ = (5/60)(60) = 5

PTS: 1 REF: 205 BLM: Higher Order - Analyze

TOP: 3

65. Refer to Toll Station Narrative. What is the standard deviation of the number of cars arriving at the toll station between 9:00 and 9:05 a.m.? ANS: σ = √μ = √5 = 2.236

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

66. Refer to Toll Station Narrative. Find P(x = 0). ANS: P(x = 0) = e^(–5) = 0.0067

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

67. Refer to Toll Station Narrative. Find P(x = 2). ANS: P(x = 2) = 5^2 e^(–5)/2! = 0.0842

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

68. Refer to Toll Station Narrative. Find P(x = 5). ANS: P(x = 5) = 5^5 e^(–5)/5! = 0.1755

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

69. Refer to Toll Station Narrative. Find P(x = 10). ANS: P(x = 10) = 5^10 e^(–5)/10! = 0.0181

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3
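The Toll Station questions rest on one step: scaling the hourly rate to the five-minute window, μ = (rate)(T), and then applying the Poisson formula. A minimal sketch follows; the variable names are illustrative assumptions rather than notation from the text.

```python
from math import exp, factorial, sqrt

def poisson_pmf(k, mu):
    """P(x = k) for a Poisson random variable with mean mu."""
    return mu**k * exp(-mu) / factorial(k)

rate_per_hour = 60           # arrival rate given in the narrative
interval_hours = 5 / 60      # 9:00 to 9:05 a.m.
mu = rate_per_hour * interval_hours   # expected arrivals in the window (Question 64)
sigma = sqrt(mu)                      # Poisson standard deviation (Question 65)

# Questions 66 to 69: probabilities of 0, 2, 5, and 10 arrivals
probs = {k: poisson_pmf(k, mu) for k in (0, 2, 5, 10)}

print(round(mu, 3), round(sigma, 3))
for k, prob in probs.items():
    print(k, round(prob, 4))
```

The same scaling applies to any other window length: only interval_hours changes, and the mean and every probability follow from it.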

Automobile Service Centre Narrative An automobile service centre can take care of eight cars per hour. Assume that the cars arrive at the service centre randomly and independently of each other at a rate of six per hour, on average. 70. Refer to Automobile Service Centre Narrative. What is the standard deviation of the number of cars that arrive at the centre? ANS: The number of arrivals per hour has a Poisson distribution with μ = 6, so σ = √6 = 2.449

PTS: 1 REF: 205 BLM: Higher Order - Analyze

TOP: 3

71. Refer to Automobile Service Centre Narrative. What is the probability of the service centre being empty in any given hour? ANS: P(x = 0) = e^(–6) = 0.0025 PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

72. Refer to Automobile Service Centre Narrative. What is the probability that exactly six cars will be in the service centre at any point during a given hour? ANS: P(x = 6) = 6^6 e^(–6)/6! = 0.1606

PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

73. Refer to Automobile Service Centre Narrative. What is the probability that fewer than two cars will be in the service centre at any point during a given hour? ANS: P(x = 0) + P(x = 1) = e^(–6) + 6e^(–6) = 0.0025 + 0.0149 = 0.0174 PTS: 1 REF: 205 BLM: Higher Order - Apply

TOP: 3

74. A professor has received a grant to travel to an archaeological dig site. The grant includes funding for five graduate students. If there are five male and four female graduate students eligible and equally qualified, what is the probability that the professor will select three male and two female graduate students to accompany her to the dig site? ANS: P(3 Male, 2 Female) = C(5, 3) C(4, 2) / C(9, 5) = 60/126 = 0.4762

PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

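75. Let x be a hypergeometric random variable with N = 12, n = 4, and M = 3. a. Calculate p(0), p(1), p(2), and p(3). b. Construct the probability histogram for x. c. Calculate the mean μ and variance σ². d. What proportion of the population of measurements fall into the interval μ ± 2σ? Into the interval μ ± 3σ? Do these results agree with those given by Tchebysheff’s Theorem? ANS: a. The formula for p(x) is p(x) = C(3, x) C(9, 4 – x) / C(12, 4), for x = 0, 1, 2, 3. Therefore, p(0) = 126/495 = 0.2545, p(1) = 252/495 = 0.5091, p(2) = 108/495 = 0.2182, and p(3) = 9/495 = 0.0182. b. The probability histogram has bars of height p(0), p(1), p(2), and p(3) above x = 0, 1, 2, and 3. c. The mean is given by μ = n(M/N) = 4(3/12) = 1, and the variance is σ² = n(M/N)((N – M)/N)((N – n)/(N – 1)) = 4(3/12)(9/12)(8/11) = 0.5455. d. The interval μ ± 2σ is –0.4772 to 2.4772, and the interval μ ± 3σ is –1.2157 to 3.2157. Then, P(–0.4772 ≤ x ≤ 2.4772) = p(0) + p(1) + p(2) = 0.9818 and P(–1.2157 ≤ x ≤ 3.2157) = 1.

These results agree with Tchebysheff’s Theorem. PTS: 1 REF: 212-213 | 68-69 BLM: Higher Order - Evaluate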

TOP: 4

Job Applicants Narrative A company has five applicants, three women and two men, for two positions. Suppose that the five applicants are equally qualified and that no preference is given for choosing either gender. Let x equal the number of men chosen to fill the two positions. 76. Refer to Job Applicants Narrative. Write the formula for p(x), the probability distribution of x. ANS:

The random variable x has a hypergeometric distribution with N = 5, M = 2, and n = 2. Then p(x) = C(2, x) C(3, 2 – x) / C(5, 2) for x = 0, 1, 2. PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

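77. Refer to Job Applicants Narrative. What are the mean, variance, and standard deviation of this distribution? ANS: The mean is given by μ = n(M/N) = 2(2/5) = 0.8, and the variance is given by σ² = n(M/N)((N – M)/N)((N – n)/(N – 1)) = 2(2/5)(3/5)(3/4) = 0.36. Hence, the standard deviation is σ = √0.36 = 0.6.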

PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

78. Refer to Job Applicants Narrative. Construct a probability histogram for x. ANS: The probability distribution for x is shown below; the histogram has bars of these heights above x = 0, 1, and 2.

x      0    1    2
p(x)   0.3  0.6  0.1

PTS: 1 REF: 212-213 | 170-171 BLM: Higher Order - Apply

TOP: 4

79. A student has decided to rent three movies for a three-day weekend. If there are four action movies and six romance movies that are of equal interest to the student, what is the probability that the student will select one romance movie and two action movies? ANS: P(1 Romance, 2 Action) = C(6, 1) C(4, 2) / C(10, 3) = 36/120 = 0.30

PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

Defective Items Narrative A random sample of 4 units is taken from a group of 15 items in which 4 units are known to be defective. Assume that sampling occurs without replacement, and the random variable x represents the number of defective units found in the sample. 80. Refer to Defective Items Narrative. What kind of discrete random variable x is this? ANS: Hypergeometric PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

81. Refer to Defective Items Narrative. What is the variance of the random variable x? ANS: 0.6146 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

82. Refer to Defective Items Narrative. What is P(x = 0)? ANS: 0.2418 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

83. Refer to Defective Items Narrative. Find P(x = 1). ANS: 0.4835 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

84. Refer to Defective Items Narrative. Find P(x = 2). ANS: 0.2418 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

85. Refer to Defective Items Narrative. What is P(x =3)? ANS: 0.0322 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

86. Refer to Defective Items Narrative. Evaluate P(x = 4). ANS: 0.0007 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4
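The Defective Items answers (Questions 81 to 86) can be generated together, which also confirms that the five probabilities sum to one and that the variance formula agrees with the distribution. This is a sketch with illustrative variable names, not code from the text.

```python
from math import comb

N, M, n = 15, 4, 4   # population size, defectives in the population, sample size

# p(k) = C(M, k) C(N - M, n - k) / C(N, n) for k = 0, 1, ..., n
pmf = [comb(M, k) * comb(N - M, n - k) / comb(N, n) for k in range(n + 1)]

mean = n * M / N
variance = n * (M / N) * ((N - M) / N) * ((N - n) / (N - 1))

# Variance computed directly from the distribution, as a cross-check
variance_from_pmf = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))

for k, p in enumerate(pmf):
    print(k, round(p, 4))
print(round(mean, 4), round(variance, 4), round(variance_from_pmf, 4))
```

The symmetry p(0) = p(2) in the answers is a coincidence of these particular counts (330/1365 in both cases), not a general property of the hypergeometric distribution.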

Scholarship Narrative A college has seven applicants for three scholarships: four females and three males. Suppose the seven applicants are equally qualified and no preference is given by the selection committee for choosing either gender. Let x equal the number of female students chosen for the three scholarships. 87. Refer to Scholarship Narrative. Write a formula for p(x), the probability distribution of x. ANS: Since N = 7, n = 3, M = 4, and N – M = 3, p(x) = C(4, x) C(3, 3 – x) / C(7, 3) for x = 0, 1, 2, 3. PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

88. Refer to Scholarship Narrative. What is the mean of the distribution of x? ANS: N = 7, n = 3, M = 4, and N – M = 3. Then μ = n(M/N) = 3(4/7) = 1.714 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

89. Refer to Scholarship Narrative. What is the variance of the distribution of x? ANS: σ² = n(M/N)((N – M)/N)((N – n)/(N – 1)) = 3(4/7)(3/7)(4/6) = 0.4898

PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

90. Refer to Scholarship Narrative. What is the probability that only one female will receive a scholarship? ANS: p(1) = C(4, 1) C(3, 2) / C(7, 3) = 12/35 = 0.3429 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

91. Refer to Scholarship Narrative. What is the probability that two females will receive a scholarship? ANS: p(2) = C(4, 2) C(3, 1) / C(7, 3) = 18/35 = 0.5143 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

92. Refer to Scholarship Narrative. What is the probability that none of the three males will receive a scholarship? ANS: p(3) = C(4, 3) C(3, 0) / C(7, 3) = 4/35 = 0.1143 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

93. Refer to Scholarship Narrative. What is the probability that none of the four females will receive a scholarship? ANS: p(0) = C(4, 0) C(3, 3) / C(7, 3) = 1/35 = 0.0286 PTS: 1 REF: 212-213 BLM: Higher Order - Apply

TOP: 4

94. It is known that in an Introductory Statistics class of 20 students, 70% of them own smartphones. If you were to randomly pick 10 students from that class, what is the probability that 70% of those picked would have smartphones? ANS: Using the hypergeometric distribution, define a success to be a student owning a smartphone. Then the population size is N = 20. The number of successes in that population is M = (20)(0.70) = 14. The number of failures in that population is N – M = 6. The number of students picked at random, without replacement, is n = 10. Then the probability of picking k = (10)(0.7) = 7 successes among the 10 students is C(14, 7) C(6, 3) / C(20, 10) = 68,640/184,756 = 0.3715. Note that C(14, 7) C(6, 3) = (3,432)(20) = 68,640 and C(20, 10) = 184,756. PTS: 1 REF: 212-213 BLM: Higher Order - Analyze

TOP: 4

Chapter 6—The Normal Probability Distribution MULTIPLE CHOICE 1. Which of these statements is valid with regard to probabilities of continuous random variables? a. The height of the curve shows the probability of an event. b. The probability of exactly an event A occurring is always equal to one. c. Probabilities of events are determined from areas under the curve. d. The probability distribution is always mound-shaped. ANS: C BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

2. Let x be a continuous random variable and let c be a constant. Which of the following statements does NOT apply to the probability of x? a. The probability that x assumes a value in an interval is the area under the probability density function between the endpoints of that interval. b. P(x = c) = c c. P(x ≤ c) = P(x < c) and P(x ≥ c) = P(x > c) d. P(x = c) = 0 and P(x ≥ c) = P(x > c) ANS: B BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

3. Which of the following incorrectly describes both of the discrete probability distributions and the continuous probability density functions? a. They both reflect the distribution of a random variable. b. They both have outcomes, which occur only in integers. c. They both assist in determining the probability of a given outcome. d. They are both relative frequency distributions. ANS: B BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

4. What proportion of the data from a normal distribution is within two standard deviations from the mean? a. 0.3413 b. 0.4772 c. 0.6826 d. 0.9544 ANS: D TOP: 1–3

PTS: 1 REF: 237-238 | 69-70 BLM: Higher Order - Understand

5. Which of the following values is the z-score representing the first quartile of the standard normal distribution? a. 0.67 b. –0.67 c. 1.28 d. –1.28 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 238 | 79-81

TOP: 1–3

6. Let z0 be a z-score that is unknown but identifiable by position and area. If the area to the right of z0 is 0.7088, then what is the value of z0? a. 1.06 b. 0.55 c. –0.55 d. –1.06 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

7. If z is a standard normal random variable, the area between z = 0.0 and z = 1.20 is 0.3849, while the area between z = 0.0 and z = 1.40 is 0.4192. What is the area between z = –1.20 and z = 1.40? a. 0.0343 b. 0.0808 c. 0.1151 d. 0.8041 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 238-242

TOP: 1–3

8. If z is a standard normal random variable, how does the area between z = 0.0 and z = 1.25 compare to the area between z = 1.25 and z = 2.5? a. The latter area will be larger than the former. b. The latter area will be smaller than the former. c. The two areas are the same. d. The two areas are not the same. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 238-242

TOP: 1–3

9. If x is a normal random variable with a mean of 1228 and a standard deviation of 120, how many standard deviations are there from 1228 to 1380? a. 11.50 b. 10.233 c. 3.1989 d. 1.267 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

10. Using the standard normal table, what is the total probability to the right of z = 2.0 and to the left of z = –2.0? a. 0.0228 b. 0.0456 c. 0.4772 d. 0.9544 ANS: B TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

11. Let z0 be a z-score that is unknown but identifiable by position and area. If the symmetrical area between a negative z0 and a positive z0 is 0.8132, what would be the value of z0? a. z = 2.64 b. z = 1.78 c. z = 1.32 d. z = 0.89 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

12. What is the z-score representing the third quartile of the standard normal distribution? a. 0.67 b. –0.67 c. 1.28 d. –1.28 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 238 | 79-81

TOP: 1–3
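The z-scores asked for in Questions 5, 6, 11, and 12 are all inverse lookups in the standard normal table. The following sketch uses Python's standard-library NormalDist as a stand-in for the printed table; the variable names are illustrative, and results are rounded to two decimals to match table precision.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

first_quartile = z.inv_cdf(0.25)   # Question 5: 25% of the area lies to the left
third_quartile = z.inv_cdf(0.75)   # Question 12: 75% of the area lies to the left

# Question 6: area to the right of z0 is 0.7088, so the area to the left is 1 - 0.7088
z0_right_area = z.inv_cdf(1 - 0.7088)

# Question 11: symmetric central area 0.8132 leaves (1 - 0.8132)/2 in each tail
z0_symmetric = z.inv_cdf(0.5 + 0.8132 / 2)

print(round(first_quartile, 2), round(third_quartile, 2))
print(round(z0_right_area, 2), round(z0_symmetric, 2))
```

Converting each description to "area to the left" before the lookup is the same step a reader performs with a cumulative table.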

13. Given that Z is a standard normal random variable, what is P(–1.2 ≤ Z ≤ 1.5)? a. 0.8181 b. 0.5228 c. 0.4772 d. 0.3849 ANS: A TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

14. Given that z is a standard normal variable, which of the following is the value z0 for which P(z ≤ z0) = 0.242? a. 0.70 b. 0.65 c. –0.65 d. –0.70 ANS: D TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

15. Which of the following is a property of the standard normal distribution? a. It has a mean of 0 and a standard deviation of 1. b. It has a mean of 0 and a standard deviation of 0. c. It has a mean of 1 and a standard deviation of 0. d. It has a mean of 1 and a standard deviation of 1. ANS: A BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

16. If z is a standard normal random variable, what is the value of P(–1.25 ≤ z ≤ –0.75)? a. 0.6678 b. 0.2266 c. 0.1210 d. 0.1056 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

17. If the random variable x is normally distributed with a mean of 88 and a standard deviation of 12, what is the value of P(x ≥ 96)? a. 0.1243 b. 0.2486 c. 0.2514 d. 0.4972 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 243-246

TOP: 1–3
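The normal probabilities in Questions 13, 16, and 17 can be checked with cumulative areas. This is a sketch using the standard-library NormalDist in place of the table; the variable names are illustrative. For Question 17 the software works with the unrounded z = (96 – 88)/12 = 0.6667 and gives about 0.2525, while the keyed answer 0.2514 comes from rounding z to 0.67 before the table lookup.

```python
from statistics import NormalDist

std = NormalDist()           # standard normal distribution
scores = NormalDist(88, 12)  # Question 17: mean 88, standard deviation 12

q13 = std.cdf(1.5) - std.cdf(-1.2)       # P(-1.2 <= Z <= 1.5)
q16 = std.cdf(-0.75) - std.cdf(-1.25)    # P(-1.25 <= z <= -0.75)
q17_exact = 1 - scores.cdf(96)           # P(x >= 96) with unrounded z
q17_table = 1 - std.cdf(0.67)            # same tail with z rounded to 0.67

print(round(q13, 4), round(q16, 4))
print(round(q17_exact, 4), round(q17_table, 4))
```

Each probability is a difference of two left-tail areas, which is the same subtraction performed with the table values.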

18. Which of the following is always true for all probability density functions of continuous random variables? a. They are symmetrical. b. They are skewed to the right. c. They are skewed to the left. d. The area under the curve is 1.0. ANS: D BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

19. Many different types of continuous random variables give rise to a large variety of probability density functions. Which of those listed below are generated by such variables? a. the binomial probability distributions b. the hypergeometric probability distributions c. the normal probability distributions d. the Poisson probability distributions ANS: C BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

20. Which of the following statements best describes continuous random variables? a. They can assume values at all points on an interval with no breaks between possible values. b. We can gauge the likely occurrence of specific values of such variables with the help of one or another of certain probability distributions, including the binomial, Poisson, or hypergeometric distributions. c. They model random variables such as waiting times or lifetimes.

d. They calculate probabilities for the binomial random variable . ANS: A BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

21. Continuous random variables that can assume values at all points on an interval of values, with no breaks between possible values, are quite common. Which of the following items could NOT be classified as a continuous random variable? a. profit per dollar of sales b. cost per credit taken by graduate students c. the average time it takes to assemble a car, or write a test d. heights of human beings rounded to the nearest inch ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

22. Which of the following correctly pertains to a continuous random variable? a. We cannot list all the probabilities for each one of the infinite number of conceivable values of the variable. b. We commonly associate probabilities with ranges of values along the continuum of possible values that the random variable might take on. c. Both a and b ANS: C BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

23. Which of the following correctly describes the normal probability distribution? a. It is single-peaked above the random variable’s mean, median, and mode, all of which are equal to one another. b. It features tails extending indefinitely in both directions from the centre, approaching (but never touching) the horizontal axis, which implies a positive probability for finding values of the random variable anywhere between minus infinity and plus infinity. c. Both a and b ANS: C BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

24. Members of the normal probability distribution family differ from one another only by which of the following measures? a. mean and standard deviation b. mean and median c. mean and mode d. mode and standard deviation ANS: A BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

25. If a normal curve appears more peaked, what may be said of the population parameters? a. The standard deviation σ is smaller. b. The standard deviation σ is larger. c. The mean μ is smaller. d. The mean μ is larger. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 237-238

TOP: 1–3

26. Given a normal distribution with a mean of 80 and a standard deviation of 20, which z-score would correspond to an observation of x = 50? a. z = +3.0 b. z = +2.0 c. z = +1.5 d. z = –1.5 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

27. The z-score of a random variable value of x = 2 is z = –2, while the standard deviation of the random variable x equals 2. What is the mean of x? a. 6 b. 4 c. 2 d. 0 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

28. Which of the following is NOT a characteristic of the normal distribution? a. It is symmetric. b. It is not bell-shaped. c. The mean equals the median. d. The mean equals the mode. ANS: B BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

29. Which of the following is a valid statement about the mean, median, and mode? a. They are equal to one another in any Poisson probability distribution. b. They are equal to one another in any normal probability distribution. c. They are different measures of centre and therefore cannot possibly be equal to one another. d. They are different measures of centre, but can equal each other only if the probability distribution is negatively or positively skewed. ANS: B BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

30. Which of the following probability distributions can be used to describe the distribution for a continuous random variable? a. binomial distribution b. normal distribution c. Poisson distribution

d. hypergeometric distribution ANS: B BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

31. The random variable x is normally distributed, with a mean equal to 0.45 and a standard deviation equal to 0.40. What is the value of P(x ≥ 0.75)? a. 0.7734 b. 0.7500 c. 0.2734 d. 0.2266 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 243-246

TOP: 1–3

32. x1 and x2 are normally distributed random variables with a mean of 95 and a standard deviation of 20, such that x1 and x2 are independent of each other. What is P(x1 ≤ 60 and x2 ≤ 60)? a. 0.2115 b. 0.0802 c. 0.0401 d. 0.0016 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 243-246

TOP: 1–3

33. Suppose you are given that x is a normally distributed random variable with a mean of μ and a standard deviation of 0.15. If P(x < 2.10) = 0.025, what is the value of μ? a. 2.394 b. 2.104 c. 2.096 d. 1.806 ANS: A TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

34. x is a normally distributed random variable with a mean of 8.20 and variance of 4.41, and P(x > b) = 0.08. What is the value of b? a. 3.409 b. 3.448 c. 3.452 d. 11.914 ANS: B TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

35. The time it takes Jessica to bicycle to school is normally distributed, with a mean time of 15 minutes and a variance of 4 minutes. Jessica has to be at school at 8:00 a.m. What time should she leave her house so she will be late only 4% of the time? a. 8:00 a.m.

b. 11.5 minutes before 8:00 a.m. c. 18.5 minutes before 8:00 a.m. d. 22 minutes before 8:00 a.m. ANS: C TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze
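Question 35 is an inverse-normal problem: the time to allow is the 96th percentile of the commute time. A minimal sketch, assuming Python's standard-library `statistics.NormalDist` and illustrative variable names:

```python
from statistics import NormalDist

mean_minutes = 15
sd_minutes = 4 ** 0.5        # the question gives a variance of 4, so sd = 2
late_probability = 0.04

# She is late when the trip takes longer than the time she allowed,
# so the allowance is the 96th percentile of the commute time.
allowance = NormalDist(mean_minutes, sd_minutes).inv_cdf(1 - late_probability)

print(f"Leave about {allowance:.1f} minutes before 8:00 a.m.")
```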

36. The time it takes Jessica to bicycle to school is normally distributed, with a mean time of 15 minutes and a variance of 4 minutes. Jessica has to be at school at 8:00 a.m. Suppose it took her 23 minutes to get to school. What can you reasonably infer or conclude? a. Twenty-three minutes to school is not an unusually long commute time. b. A commuting time of 23 minutes is highly unusual or atypical. c. The distribution of commute times must not be normal with mean 15 minutes and standard deviation 2 minutes. d. Both b and c ANS: D TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Evaluate

37. Given that Z is a standard normal random variable, which of the following is the value of P(–1.0 ≤ Z ≤ 1.5)? a. 0.0919 b. 0.7745 c. 0.8413 d. 0.9332 ANS: B TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

38. Given that Z is a standard normal variable, and P(Z ≤ b) = 0.2580, which of these values is b? a. –0.65 b. 0.242 c. 0.70 d. 0.758 ANS: A TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

39. Which of the following statements best describes a standard normal distribution? a. It is a normal distribution with a mean of 0 and a standard deviation of 1. b. It is a normal distribution with a mean of 1 and a standard deviation of 0. c. It is a normal distribution with a mean usually larger than the standard deviation. d. It is a normal distribution with a mean always larger than the standard deviation. ANS: A BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

40. If Z is a standard normal random variable, then which of these values is P(–1.75 ≤ Z ≤ –1.25)? a. 0.8543 b. 0.1056 c. 0.0655 d. 0.0401 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

41. If Z is a standard normal random variable, then which of these values is the value of z for which P(–z ≤ Z ≤ z) equals 0.8764? a. 3.08 b. 1.54 c. 1.16 d. 0.3764 ANS: B TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply
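For a symmetric central area such as the one in question 41, the cut-off z can be recovered from the inverse cumulative distribution. The sketch below assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

central_area = 0.8764

# By symmetry, the area to the left of +z is 0.5 plus half the central area.
z = NormalDist().inv_cdf(0.5 + central_area / 2)

print(f"z = {z:.2f}")
```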

42. If the random variable X is normally distributed, with a mean of 75 and a standard deviation of 8, which of the following values is the probability that X is greater than or equal to 75? a. 0.125 b. 0.500 c. 0.625 d. 0.975 ANS: B TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

43. Given that Z is a standard normal random variable, which of these is the value of z if the area to the right of z is 0.1949? a. 0.51 b. –0.51 c. 0.86 d. –0.86 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

44. Given that Z is a standard normal random variable, which of these is the value of z if the area to the right of z is 0.9066? a. 1.32 b. –1.32 c. 0.66 d. –0.66 ANS: B TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

45. Given that Z is a standard normal random variable, which of the following is the value for P(Z > –1.58)? a. –0.4429 b. 0.0571

c. 0.5571 d. 0.9429 ANS: D TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

46. Given that the random variable X is normally distributed with a mean of 80 and a standard deviation of 10, which of these is the value of P(85 ≤ X ≤ 90)? a. 0.5328 b. 0.3413 c. 0.1915 d. 0.1498 ANS: D TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

47. What proportion of the data from a normal distribution is within two standard deviations from the mean? a. 0.3413 b. 0.4772 c. 0.6826 d. 0.9544 ANS: D TOP: 1–3

PTS: 1 REF: 237-238 | 69-70 BLM: Higher Order - Apply
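The area within k standard deviations of the mean, as asked in question 47, is the same for every normal distribution, so it can be computed on the standard normal curve. A short sketch, assuming Python's standard-library `statistics.NormalDist` and a hypothetical helper named `area_within`:

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {area_within(k):.4f}")
```

A four-decimal table gives 2(0.4772) = 0.9544 for k = 2; the last digit can differ from the computed value because each half is rounded before doubling.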

48. Given that Z is a standard normal random variable, which of the following best describes the area to the left of a value z? a. P(Z ≥ z) b. P(Z ≤ z) c. P(0 ≤ Z ≤ z) d. P(Z ≥ –z) ANS: B BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

49. Which of the following distributions are always symmetrical? a. exponential b. normal c. binomial d. all continuous distributions ANS: B BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

50. Suppose that the z-value for a given value x of the random variable X is z = 1.96, and that the distribution of X is normally distributed with a mean of 60 and a standard deviation of 6. To what x-value does this z-value correspond? a. 71.76 b. 67.96 c. 61.96

d. 48.24 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

51. If Z is a standard normal random variable, then the area between z = 0.0 and z =1.30 is 0.4032, while the area between z = 0.0 and z = 1.50 is 0.4332. What is the area between z = –1.30 and z = 1.50? a. 0.0300 b. 0.0668 c. 0.0968 d. 0.8364 ANS: D TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

52. If Z is a standard normal random variable, how would one describe the area between z = 0.0 and z =1.50 compared to the area between z = 1.5 and z = 3.0? a. the same b. larger c. smaller ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 238-242

TOP: 1–3

53. Which of the following characteristics is NOT a property of a normal distribution? a. It is unimodal. b. It is symmetrical. c. It is discrete. d. It has a bell shape. ANS: C BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

54. Which of the following distributions is considered to be the cornerstone distribution of statistical inference? a. binomial distribution b. normal distribution c. Poisson distribution d. uniform distribution ANS: B BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

55. Which of the following must be specified in order to completely determine the probability density function f(x) of a random variable X that is normally distributed? a. the mean and median of X b. the median and mode of X c. the mean and mode of X d. the mean and standard deviation of X ANS: D BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

56. For some positive value of z, the probability that a standard normal variable is between 0 and z is 0.3770. What is the value of that z? a. 0.18 b. 0.81 c. 1.16 d. 1.47 ANS: C TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

57. For some value of z, the probability that a standard normal variable is below z is 0.2090. What is the value of that z? a. –0.81 b. –0.31 c. 0.31 d. 1.96 ANS: A TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

58. For some positive value of x, the probability that a standard normal variable is between 0 and +2x is 0.1255. Which of the following is the value of that x? a. 0.99 b. 0.40 c. 0.32 d. 0.16 ANS: D TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

59. For some positive value of x, the probability that a standard normal variable is between 0 and +1.5x is 0.4332. What is the value of x? a. 0.10 b. 0.50 c. 1.00 d. 1.50 ANS: C TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

60. Even though the normal probability distribution deals with continuous variables, under certain circumstances we can use it to approximate various discrete probability distributions. Under which of these conditions is the use of the normal probability distribution permissible? a. in the case of binomial probabilities, when np > 5 and also nq > 5 b. in the case of Poisson probabilities, when μ > 100 c. in the case of hypergeometric probabilities, when np > 1 and also nq > 1 ANS: A BLM: Remember

PTS:

REF: 249-251

TOP: 4

61. Given that X is a binomial random variable, the binomial probability P(X ≥ x) is approximated by the area under a normal curve to the right of which of the following values? a. x – 0.5 b. x + 0.5 c. x – 1 d. x + 1 ANS: A BLM: Remember

PTS:

REF: 251

TOP: 4

62. As a general rule, the normal distribution is used to approximate the sampling distribution of the sample proportion only if which of the following conditions holds? a. The sample size n is greater than 30. b. The population proportion p is close to 0.50. c. The underlying population is normal. d. np and n(1 – p) are both greater than 5. ANS: D BLM: Remember

PTS:

REF: 251

TOP: 4

63. Suppose that the probability p of success on any trial of a binomial distribution equals 0.90. For which of the following number of trials, n, would the normal distribution provide a good approximation to the binomial distribution? a. 25 b. 35 c. 45 d. 55 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 251

TOP: 4
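The rule of thumb used in this chapter (np > 5 and nq > 5) can be applied directly to the candidate sample sizes in question 63. The sketch below is plain Python; the function name `normal_approx_ok` and its `threshold` parameter are hypothetical names introduced for illustration.

```python
p = 0.90
q = 1 - p

def normal_approx_ok(n, p, q, threshold=5):
    """Rule of thumb used in this chapter: np > 5 and nq > 5."""
    return n * p > threshold and n * q > threshold

candidates = [25, 35, 45, 55]
acceptable = [n for n in candidates if normal_approx_ok(n, p, q)]

print(acceptable)
```

With p = 0.90 the binding condition is nq > 5, since q = 0.10 is the smaller of the two probabilities.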

64. Given that X is a binomial random variable, then between which of the following values would the binomial probability P(X = x) be approximated by the area under a normal curve? a. x – 0.5 and 0.0 b. 0.0 and x + 0.5 c. 1 – x and 1 + x d. x – 0.5 and x + 0.5 ANS: D BLM: Remember

PTS:

REF: 251

TOP: 4

65. Given that X is a binomial random variable, the binomial probability P(X ≤ x) is approximated by the area under a normal curve to the left of which of the following values? a. x b. –x c. x + 0.5 d. x – 0.5

ANS: C BLM: Remember

PTS:

REF: 251

TOP: 4

TRUE/FALSE 1. If x denotes a continuous random variable, then P(x = c) = 0 for every number c. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

2. For a continuous random variable x, P(6 < x < 8) is represented by the area under the probability function over the interval from 6 to 8, including both end points. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

3. A normal random variable x is standardized by expressing its value as the number of standard deviations it lies to the right or left of its mean μ. ANS: T BLM: Remember

PTS:

REF: 238

TOP: 1–3

4. The mean, median, and standard deviation of a normally distributed random variable are all at the same position on the horizontal axis since the distribution is symmetric. ANS: F BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

5. For a continuous random variable x, the probabilities P(0 ≤ x ≤ 1) and P(0 < x < 1) are always equal. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

6. Any continuous random variable cannot take on negative values. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

7. For any continuous random variable x, P(x = 5) = 1 is always true. ANS: F BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

8. Continuous random variables can assume infinitely many values corresponding to points on a line interval. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

9. The relative frequency associated with a particular class in the population is the fraction of measurements in the population falling in that class. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

10. The relative frequency associated with a particular class in the population is the probability of drawing a measurement from that class. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 232-233

TOP: 1–3

11. The total area under the curve f(x) of any continuous random variable x is equal to one. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

12. The normal probability distribution is important because a large number of random variables observed in nature possess a frequency distribution that is approximately a normal probability distribution. ANS: T BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

13. The left half of the normal curve is slightly smaller than the right half. ANS: F BLM: Remember

PTS:

REF: 237-239

TOP: 1–3

14. Using the standard normal curve, the area between z = 0 and z = 2.2 is 0.4868. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238-242

TOP: 1–3

15. The shape of the normal probability distribution is determined by the population standard deviation σ. Small values of σ reduce the height of the curve and increase the spread; large values of σ increase the height of the curve and reduce the spread. ANS: F BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

16. The proportion of the total area under the normal curve that lies within one standard deviation of the mean is approximately 0.75. ANS: F TOP: 1–3

PTS: 1 REF: 237-238 | 69-70 BLM: Higher Order - Apply

17. Given a normal random variable x with mean μ and standard deviation σ, the standard normal random variable associated with x is z = (x – μ)/σ. ANS: T BLM: Remember

PTS:

REF: 238

TOP: 1–3

18. Given a normal random variable x with mean μ and standard deviation σ, the mean of the standard normal random variable z associated with x is 1. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

19. Given a normal random variable x with mean μ and standard deviation σ, the variance of the standard normal random variable z associated with x is 0. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

20. Given a normal random variable x with mean of 82 and standard deviation of 12, the value of the standard normal random variable z associated with x = 70 is smaller than 0. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

21. Given a normal random variable x with mean of 70 and standard deviation of 12, the value of the standard normal random variable z associated with x = 82 is larger than 0. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

22. A continuous random variable x is normally distributed with a mean of 1200 and a standard deviation of 150. Given that x = 1410, its corresponding z-score is 1.40. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

23. Given that z is a standard normal random variable, a negative value of z indicates that the standard deviation of z is negative. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

24. Given that z is a standard normal random variable, a value of z that is equal to 0 indicates that the standard deviation of z is also 0. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

25. If x is a normal random variable with mean of 2 and standard deviation of 5, then P(x < 3) = P(x > 7). ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 238-242

TOP: 1–3

26. If x is a normal random variable with mean = 4 and standard deviation = 2, and Y is a normal random variable with mean = 10 and standard deviation = 5, then P(x < 0) = P(Y < 0). ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 238

TOP: 1–3

27. Using the standard normal curve, the area between z = –1.50 and z = 1.50 is 0.4332. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 238-242

TOP: 1–3

28. Let z0 be a z-score that is unknown but identifiable by position and area. If the area to the right of z0 is 0.8849, the value of z0 is 1.2. ANS: F TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

29. Let z0 be a z-score that is unknown but identifiable by position and area. If the symmetrical area between –z0 and +z0 is 0.903, the value of z0 is 1.66. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238-242

TOP: 1–3

30. The z-score representing the 10th percentile of the standard normal curve is –1.28. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238 | 78

TOP: 1–3

31. Using the standard normal curve, the z-score representing the 95th percentile is 1.645. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238 | 78

TOP: 1–3

32. The mean and variance of any normal distribution are always 0 and 1, respectively. ANS: F BLM: Remember

PTS:

REF: 237-239

TOP: 1–3

33. A continuous random variable x is normally distributed with a mean of 100 km and a standard deviation of 20 km. Given that x = 120, its corresponding z-score is –1.0.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

34. The z-score representing the 99th percentile of the standard normal curve is 3.0. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 238 | 78

TOP: 1–3

35. Continuous random variables can assume values at all points on an interval, with no breaks between possible values. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

36. The possible observations of continuous random variables are infinite in number. Characteristics measured in units of money, time, distance, height, volume, length, or weight are examples of such variables. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

37. A smooth frequency curve that describes the probability distribution of a continuous random variable is called a standard normal curve. ANS: F BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

38. For a continuous random variable x and constant a, P(x ≥ a) = P(x > a) because P(x = a) = 0. ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

39. Examples of continuous probability distributions include the normal probability distribution, the Poisson probability distribution, and the binomial probability distribution. ANS: F TOP: 1–3

PTS: 1 BLM: Remember

REF: 237-238 | 205 | 195

40. The normal random variable’s density function is perfectly symmetric about a peaked central value and, thus bell shaped; and characterized by tails extending indefinitely in both directions from the centre, approaching (but never touching) the horizontal axis. All of this implies a positive probability for finding values of the random variable anywhere between minus infinity and plus infinity. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 237-238

TOP: 1–3

41. A standard normal curve is a normal probability density function with a mean of 1 and a standard deviation of 0. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

42. For any random variable x and constant a, P(x ≤ a) = P(x < a) because P(x = a) = 0. ANS: F BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

43. For any random variable x, P(x = a) = 0. ANS: F BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

44. For a continuous random variable x and constant a, P(x ≥ a) = P(x > a) and P(x ≤ a) = P(x < a). ANS: T BLM: Remember

PTS:

REF: 232-233

TOP: 1–3

45. The total area under the normal probability distribution curve is equal to 1. ANS: T TOP: 1–3

PTS: 1 BLM: Remember

REF: 237-238 | 233

46. The normal probability distribution is one of the most commonly used discrete probability distributions. ANS: F BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

47. One difference between a binomial random variable x and a standard normal random variable z is that with the binomial random variable we cannot determine P(x = a), while for the standard normal variable we can determine P(z = a). ANS: F TOP: 1–3

PTS: 1 REF: 233-234 | 239 BLM: Higher Order - Understand

48. If the mean, median, and mode are all equal for a continuous random variable, then the random variable must be normally distributed. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 237-238

TOP: 1–3

49. The time it takes a student to finish a final exam is known to be normally distributed with a mean equal to 84 minutes and a standard deviation equal to 10 minutes. Given this information, the probability that it will take a randomly selected student between 75 and 90 minutes is approximately 0.0902. ANS: F TOP: 1–3

PTS: 1 REF: 237-238 | 69-70 BLM: Higher Order - Apply

50. The normal probability distribution is right-skewed for large values of the standard deviation σ, and left-skewed for small values of σ. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 237-238

TOP: 1–3

51. Assume that x is normally distributed random variable with a mean equal to 13.4 and a standard deviation equal to 3.6. Assume also that P(x > a) = 0.05. The value of a must be 19.322. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 243-246

TOP: 1–3

52. The area between z = 0 and z = 3.50 of a standard normal curve is about 0.50. ANS: T TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

53. In the standard normal curve, the probability or area between z = –1.28 and z = 1.28 is 0.3997. ANS: F TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

54. Let z0 be a z-score that is unknown but identifiable by position and area. If the area to the right of z0 is 0.8413, the value of z0 is 1.0. ANS: F TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

55. Let z0 be a z-score that is unknown but identifiable by position and area. If the symmetrical area between –z0 and +z0 is 0.9544, the value of z0 is 2.0. ANS: T TOP: 1–3

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

56. In the standard normal curve, the z-score representing the 10th percentile is 1.28. ANS: F BLM: Higher Order - Apply

PTS:

REF: 238-239 | 78 TOP: 1–3

57. Using the standard normal curve, the z-score representing the 75th percentile is 0.67. ANS: T TOP: 1–3

PTS: 1 REF: 287-288 | 114-115 BLM: Higher Order - Apply

58. The z-score representing the 90th percentile of the standard normal curve is 1.28. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238-239 | 78 TOP: 1–3

59. The mean and standard deviation of a normally distributed random variable, which has been “standardized,” are 1 and 0, respectively. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

60. A random variable X is normally distributed with a mean of 150 and a variance of 36. Given that X = 120, then its corresponding z-score is 5.0. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

61. A random variable X is normally distributed with a mean of 250 and a standard deviation of 50. Given that X = 175, its corresponding z-score is –1.50. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 238

TOP: 1–3

62. For a normal curve, if the mean is 25 minutes and the standard deviation is 5 minutes, the area to the left of 10 minutes is about 0.50. ANS: F TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

63. For a normal curve, if the mean is 20 minutes and the standard deviation is 5 minutes, the area to the right of 13 minutes is 0.9192. ANS: T TOP: 1–3

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

64. Given that Z is a standard normal random variable, a negative value of Z indicates that the standard deviation of Z is negative. ANS: F BLM: Remember

PTS:

REF: 238-239

TOP: 1–3

65. The mean of any normal distribution is always 0. ANS: F BLM: Remember

PTS:

REF: 237-238

TOP: 1–3

66. Adding or subtracting 0.5 from the binomial random variable x’s interval endpoints (i.e., the so-called continuity correction) improves the normal approximation to the binomial by correcting for or accounting for the “missing” corners of the probability rectangles corresponding to values of x. ANS: T BLM: Remember

PTS:

REF: 251

TOP: 4

67. All binomial distributions can be approximated very closely by normal distributions. ANS: F BLM: Remember

PTS:

REF: 251

TOP: 4

68. The normal approximation to the binomial probability distribution is appropriate when the number of trials n is large and the probability of success p is near 0.50. ANS: T BLM: Remember

PTS:

REF: 251

TOP: 4

69. Even though the normal probability distribution deals with continuous variables, we can use it to approximate the binomial probability distribution whenever np ≥ 30 and also nq ≥ 30. ANS: F BLM: Remember

PTS:

REF: 251

TOP: 4

70. Even though the normal probability distribution deals with continuous variables, we can use it to approximate the binomial distribution whenever np > 5 and also nq > 5. ANS: T BLM: Remember

PTS:

REF: 251

TOP: 4

71. The normal approximation to the binomial distribution works best when the number of trials is large, and when the binomial distribution is symmetrical (like the normal distribution). ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 251

TOP: 4

72. In general, the binomial probability P(X = x) is approximated by the area under a normal curve between x – 0.5 and x + 0.5. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 251

TOP: 4
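The continuity correction in item 72 can be illustrated by comparing an exact binomial probability with the corresponding normal-curve area. The sketch below uses Python's standard library (`math.comb` and `statistics.NormalDist`); the values n = 50, p = 0.5 and x = 25 are hypothetical and chosen only for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

# Hypothetical values chosen only for illustration.
n, p, x = 50, 0.5, 25

# Exact binomial probability P(X = x).
exact = comb(n, x) * p**x * (1 - p) ** (n - x)

# Normal curve with the same mean and standard deviation as the binomial.
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))

# Continuity correction: area between x - 0.5 and x + 0.5.
approx = approx_dist.cdf(x + 0.5) - approx_dist.cdf(x - 0.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

Here np = nq = 25, so the rule of thumb np > 5 and nq > 5 is comfortably met and the two values agree closely.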

73. In general, the binomial probability P(X ≤ x) is approximated by the area under the normal curve to the left of x – 0.5. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 251

TOP: 4

74. In general, the binomial probability P(X ≥ x) is approximated by the area under the normal curve to the right of x – 0.5. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 251

TOP: 4

75. The market share of a relatively popular soft drink in Canada could be classified as a uniform random variable. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 234

TOP: 4

76. The probability density function of a uniform random variable has the shape of a non-zero horizontal line over some finite interval. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 234

TOP: 4

77. The probability density function of a uniform random variable is an example of a normal distribution with a large population standard deviation. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 234

TOP: 4

PROBLEM 1. A random variable x is normally distributed with μ = 100 and σ = 20. a. What is the median of this distribution? b. Find P(x ≤ median). c. Find P(x ≤ 75). ANS: a. Since the normal distribution is symmetric, the mean and the median have the same value, so the median is 100. b. P(x ≤ median) = P(x ≤ 100) = P(z ≤ 0) = 0.50 c. P(x ≤ 75) = P(z ≤ –1.25) = 0.50 – 0.3944 = 0.1056 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

2. Let z denote a standard normal random variable. a. Find P(z > 1.48). b. Find P(–0.44 < z < 2.68). c. Determine the value of z0 which satisfies P(z > z0) = 0.7995. d. Find P(z < –0.87). e. Find P(–1.66 < z < –0.48). f. Find z0 such that P(–z0 < z < z0) = 0.901. g. Find z0 such that P(z < z0) = 0.0375. h. Find z0 such that P(–z0 < z < z0) = 0.7698. ANS: a. 0.0694 b. 0.6663 c. z0 = –0.84 d. 0.1922 e. 0.2671 f. z0 = 1.65 g. z0 = –1.76 h. z0 = 1.2

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

3. A chemical manufacturer sells its product in steel drums. The net weight of the chemical in the drums has a normal distribution with mean 145 kg and standard deviation 2 kg. The manufacturer guarantees its customers each drum will contain at least 140 kg of the chemical. What percentage of the drums will satisfy this guarantee? ANS: P(x ≥ 140) = P(z ≥ –2.5) = 0.9938 = 99.38%

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

High School Graduates IQ’s The distribution of IQ scores for high school graduates is normally distributed with μ = 104 and σ = 16. 4. Refer to High School Graduates IQ’s paragraph. Find the probability a person chosen at random from this group has an IQ score above 146. ANS: P(x > 146) = P(z > 2.63) = 0.0043

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

5. Refer to High School Graduates IQ’s paragraph. What percentage of the IQ scores would be between 97 and 126? ANS: P(97 ≤ x ≤ 126) = P(–0.44 ≤ z ≤ 1.38) = 0.5862. Thus, approximately 58.6% of the IQ scores would lie between 97 and 126. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

6. Refer to High School Graduates IQ’s paragraph. What is the 95th percentile of this normal distribution? ANS: Let x0 denote the 95th percentile, so that P(x > x0) = 0.05. This implies that P(z > (x0 – 104)/16) = 0.05. Therefore, we have (x0 – 104)/16 = 1.645. Hence, the 95th percentile is x0 = 104 + 1.645(16) = 130.32.

PTS: 1 REF: 243-246 | 720-721 | 78 BLM: Higher Order

TOP: 1–3

7. Two students are enrolled in an introductory statistics class at the university. The first student is in a morning section and the second student is in an afternoon section. If the student in the morning section takes a midterm exam and earns a score of 76, and the student in the afternoon section takes a midterm exam and earns a score of 72, which student performed better compared to the rest of the students in his or her respective class? Assume test scores are normally distributed. In the morning section, the mean was 64 and the standard deviation was 8. In the afternoon section, the class mean was 60 with a standard deviation of 7.5. ANS: z1 = (76 – 64)/8 = 1.5; z2 = (72 – 60)/7.5 = 1.6. Since the standardized score z2 is larger than z1, we may conclude that the student in the afternoon section performed better, relative to his or her classmates. PTS: 1 REF: 243 BLM: Higher Order - Evaluate

TOP: 1–3
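The comparison in question 7 rests on the standardization z = (x – μ)/σ. A small sketch of that calculation follows; the helper name `z_score` is hypothetical and chosen only for this example.

```python
def z_score(x, mean, sd):
    """Return how many standard deviations x lies from the mean."""
    return (x - mean) / sd

# Morning section: score 76, class mean 64, standard deviation 8.
z_morning = z_score(76, 64, 8)
# Afternoon section: score 72, class mean 60, standard deviation 7.5.
z_afternoon = z_score(72, 60, 7.5)

# The larger z-score marks the better performance relative to classmates.
better = "afternoon" if z_afternoon > z_morning else "morning"
print(z_morning, z_afternoon, better)
```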

8. Transport Canada studies of fuel consumption indicate compact car fuel consumption to be normally distributed with a mean of 8.5 litres per 100 km and a standard deviation of 1.5 L/100 km. What percentage of compact cars use between 6.8 and 12.0 L/100 km? ANS: P(6.8 ≤ x ≤ 12.0) = P(–1.13 ≤ z ≤ 2.33) = 0.8609. Then, the percentage of compact cars that use between 6.8 and 12.0 L/100 km is 86.09%.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

9. The Scholastic Aptitude Test (SAT) is a standardized test for college admissions in the United States. The mean SAT verbal score of next year’s entering freshmen at the local college is 600. The college also knows 69.5% of the students scored less than 625. If the scores are normally distributed, what is the standard deviation of the SAT scores? ANS: This is a standardization problem. Let the normal random variable x be the SAT verbal score value. Since P(x < 625) = 0.695, then P(z < (625 – 600)/σ) = 0.695. Hence, 25/σ = 0.51, which implies that σ = 49.02.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

10. The distribution of final exam scores in an inferential statistics class is normal, with standard deviation 6. Find the mean score if 93.32% of the students received scores no higher than 90. ANS: This is a standardization problem. Let the normal random variable x be the final exam score. Since P(x ≤ 90) = 0.9332, then P(z ≤ (90 – μ)/6) = 0.9332. Hence, (90 – μ)/6 = 1.5, which implies that μ = 81.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

11. The scores on an aptitude test are normally distributed with an unknown mean, a variance of 36, and only 7% of the scores below 26. What is the mean score? ANS: This is a standardization problem. Let the normal random variable x be the aptitude test score, with σ = √36 = 6. Since P(x < 26) = 0.07, then P(z < (26 – μ)/6) = 0.07. Hence, (26 – μ)/6 = –1.48, which implies that μ = 34.88. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3
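Questions 9 to 11 all rearrange z = (x – μ)/σ to recover an unknown parameter from a known probability. The sketch below shows that rearrangement under the assumption that the quoted probability is P(X < x); the helper names `solve_sigma` and `solve_mu` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def solve_sigma(x, mu, prob_below):
    """Return sigma such that P(X < x) = prob_below for a known mean.

    Not defined for prob_below == 0.5, where z = 0.
    """
    z = std_normal.inv_cdf(prob_below)
    return (x - mu) / z

def solve_mu(x, sigma, prob_below):
    """Return mu such that P(X < x) = prob_below for a known sigma."""
    z = std_normal.inv_cdf(prob_below)
    return x - z * sigma

sigma_sat = solve_sigma(625, 600, 0.695)  # question 9
mu_exam = solve_mu(90, 6, 0.9332)         # question 10
mu_aptitude = solve_mu(26, 6, 0.07)       # question 11, sigma = sqrt(36)
print(sigma_sat, mu_exam, mu_aptitude)
```

Because the code uses an unrounded z, its results agree with the table-based answers only to within rounding.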

12. A television set manufacturer has found the length of time until the first repair is normally distributed, with a mean of 4.5 years and a standard deviation of 1.5 years. If the manufacturer wants only 10.2% of the first repairs to occur within the warranty period, how long should the warranty be?

ANS: P(x ≤ x0) = 0.102 ⇒ P(z ≤ (x0 – 4.5)/1.5) = 0.102 ⇒ (x0 – 4.5)/1.5 = –1.27, which implies that x0 = 2.595; that is, the warranty should expire after x = 2.595 years.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

Salmon Flies Lifespan The lifetime of salmon flies is normally distributed, with a mean of 60 days and a standard deviation of 20 days. 13. Refer to Salmon Flies Lifespan paragraph. What percentage of salmon flies live fewer than 12 days? ANS: P( x < 12) = P( z < –2.4) = 0.0082. Thus, approximately 0.8% of salmon flies live less than 12 days. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

14. Refer to Salmon Flies Lifespan paragraph. What percentage of salmon flies live between 80 and 101 days? ANS: P(80 ≤ x ≤ 101) = P(1 ≤ z ≤ 2.05) = 0.1385. Thus, approximately 13.9% of salmon flies live between 80 and 101 days. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

15. Refer to Salmon Flies Lifespan paragraph. Find the value x0 such that 6.3% of salmon flies live less than x0 days. ANS: P(x < x0) = 0.063 ⇒ P(z < (x0 – 60)/20) = 0.063 ⇒ (x0 – 60)/20 = –1.53; all of which imply that x0 = 29.4 days.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

16. Suppose the amount of heating oil used annually by households in Ontario is normally distributed, with a mean of 760 litres per household per year and a standard deviation of 150 litres of heating oil per household per year.

a. What is the probability that a randomly selected Ontario household uses more than 570 litres of heating oil per year?
b. What is the probability that a randomly selected Ontario household uses between 680 and 1130 litres per year?
c. If the members of a particular household were scared into using fuel conservation measures by newspaper accounts of the probable price of heating oil next year, and they decided they wanted to use less oil than 97.5% of all other Ontario households currently using heating oil, what is the maximum amount of oil they can use and still accomplish their conservation objective?

ANS:
a. P(x > 570) = P(z > –1.27) = 0.8980
b. P(680 ≤ x ≤ 1130) = P(–0.53 ≤ z ≤ 2.47) = 0.2019 + 0.4932 = 0.6951
c. P(x > x0) = 0.975 ⇒ P(z > (x0 – 760)/150) = 0.975 ⇒ (x0 – 760)/150 = –1.96; which implies that x0 = 466 litres of heating oil per year.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

17. A certain type of automobile battery is known to have a lifespan that is normally distributed, with mean 1100 days and standard deviation 80 days. For how long should these batteries be guaranteed if the manufacturer wants to replace only 5% of the batteries sold because they “died” before the guarantee expired? ANS: P(x < x0) = 0.05 ⇒ P(z < (x0 – 1100)/80) = 0.05 ⇒ (x0 – 1100)/80 = –1.645; which implies that x0 = 968.4 days.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

18. Let x denote a normal random variable with mean 30 and standard deviation 5. What x-value is the 67th percentile of this distribution? ANS: P(x ≤ x0) = 0.67 ⇒ P(z ≤ (x0 – 30)/5) = 0.67 ⇒ (x0 – 30)/5 = 0.44; which implies that x0 = 32.2.

PTS: 1 REF: 243-246 | 720-721 | 78 BLM: Higher Order - Apply

TOP: 1–3

Tar Amounts in Cigarettes Narrative Suppose the amount of tar in cigarettes is normally distributed, with mean 3.5 mg and standard deviation 0.5 mg.

19. Refer to Tar Amounts in Cigarettes Narrative. What proportion of cigarettes have a tar content exceeding 4.25 mg? ANS: P(x > 4.25) = P(z > 1.5) = 0.0668 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

20. Refer to Tar Amounts in Cigarettes Narrative. “Low tar” cigarettes must have tar content below the 25th percentile of the tar content distribution. What is the value which is the 25th percentile of the tar content distribution? ANS: P(x ≤ x0) = 0.25 ⇒ P(z ≤ (x0 – 3.5)/0.5) = 0.25 ⇒ (x0 – 3.5)/0.5 = –0.67; which implies that x0 = 3.165 mg of tar. PTS: 1 REF: 243-246 | 720-721 | 78 BLM: Higher Order - Apply

TOP: 1–3

Annual Rainfall Narrative The annual rainfall in a particular area of the country is normally distributed, with mean 100 cm and standard deviation 20 cm.

21. Refer to Annual Rainfall Narrative. In a given year, what is the probability that the annual rainfall will be between 105 and 125 cm? ANS: P(105 ≤ x ≤ 125) = P(0.25 ≤ z ≤ 1.25) = 0.2957

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

22. Refer to Annual Rainfall Narrative. A drought is said to occur after a year in which the annual rainfall dips below the 20th percentile. For a year to be classified as a year of drought, what would the annual rainfall in this area of the country have to be? ANS: P(x ≤ x0) = 0.20 ⇒ P(z ≤ (x0 – 100)/20) = 0.20 ⇒ (x0 – 100)/20 = –0.84; which implies that x0 = 83.2 cm. A drought occurs in this area when the annual rainfall drops below 83.2 cm per year.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

23. The time necessary to assemble a discount store display is normally distributed, with μ = 5 minutes and σ = 30 seconds. A large number of potential employees are timed assembling a practice display. How much time (in minutes) should the personnel manager allow the potential employees to assemble the display if the store wants only 80% of the people to complete the task? ANS: P(x ≤ x0) = 0.80 ⇒ P(z ≤ (x0 – 5)/0.5) = 0.80 ⇒ (x0 – 5)/0.5 = 0.84; which implies that x0 = 5.42 minutes.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

24. For the standard normal distribution, find the value z0 satisfying each of the following conditions.
a. P(–z0 ≤ z ≤ z0) = 0.966
b. P(z < z0) = 0.0174
c. P(z < z0) = 0.5398

ANS:
a. Since the symmetrical area between –z0 and z0 is 0.966, z0 must satisfy P(0 ≤ z ≤ z0) = 0.483; so z0 = 2.12.
b. Since the area to the left of z0 is less than 0.5, z0 must be smaller than 0. Then, P(z0 ≤ z ≤ 0) = 0.5 – 0.0174 = 0.4826, so z0 = –2.11.
c. Since the area to the left of z0 is greater than 0.5, z0 must be larger than 0. Then, P(0 ≤ z ≤ z0) = 0.5398 – 0.5 = 0.0398, so z0 = 0.10.

PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3
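The three conditions in question 24 can each be turned into a left-tail area and inverted. The following sketch assumes Python's `statistics.NormalDist`; the variable names are illustrative, and the results match the table answers only to about two decimals.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution

# a. A symmetric central area of 0.966 leaves 0.5 + 0.966/2 to the left of z0.
z_a = std_normal.inv_cdf(0.5 + 0.966 / 2)

# b. A left-tail area below 0.5 gives a negative z0.
z_b = std_normal.inv_cdf(0.0174)

# c. A left-tail area above 0.5 gives a positive z0.
z_c = std_normal.inv_cdf(0.5398)

print(round(z_a, 2), round(z_b, 2), round(z_c, 2))
```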

Bike Racks Assembly Time Narrative A manufacturer of bike racks for cars claims that the assembly time x for a particular model is normally distributed, with a mean of 1 hour and a standard deviation of 0.10 hours. 25. Refer to Bike Racks Assembly Time Narrative. Find the probability that it takes at most 1.1 hours to assemble a bike rack of this model.

ANS: P(x ≤ 1.1) = P(z ≤ 1) = 0.5 + 0.3413 = 0.8413 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

26. Refer to Bike Racks Assembly Time Narrative. Find the probability that it takes longer than 1.2 hours to assemble a bike rack of this model. ANS: P(x > 1.2) = P(z > 2) = 0.5 – 0.4772 = 0.0228 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

27. Refer to Bike Racks Assembly Time Narrative. Find the probability that it takes between 0.8 hours and 1.1 hours to assemble a bike rack of this model. ANS: P(0.8 ≤ x ≤ 1.1) = P(–2 ≤ z ≤ 1) = 0.4772 + 0.3413 = 0.8185 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

28. Refer to Bike Racks Assembly Time Narrative. Find the assembly time that exceeds 97.5% of all other assembly times for bike racks of this model. ANS: Find x0 such that P(x ≤ x0) = 0.975 = P(z ≤ z0). Since the area to the left of z0 is greater than 0.5, z0 must be larger than 0 such that P(0 ≤ z ≤ z0) = 0.475, so z0 = 1.96. Thus, x0 = 1 + (1.96)(0.10) = 1.196 hours. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

29. Suppose x is normally distributed, with a mean of 75 and a standard deviation of 4. Would it be unusual to observe a value x = 59? ANS: z = (59 – 75)/4 = –4.0. Thus the value x = 59 is 4 standard deviations below the mean of 75, which is very unlikely. PTS: 1 REF: 243-246 | 720-721 | 77-78 BLM: Higher Order - Evaluate

TOP: 1–3

30. Suppose x is normally distributed with a mean of 75 and a standard deviation of 4.
a. Find the 90th percentile.
b. Find the 95th percentile.
c. Find the 5th percentile.

ANS:
a. Find x0 such that P(x ≤ x0) = 0.90 = P(z ≤ z0). Since the area to the left of z0 is greater than 0.5, z0 must be larger than 0 such that P(0 ≤ z ≤ z0) = 0.40, so z0 = 1.28. Thus, x0 = 75 + (1.28)(4) = 80.12.
b. Find x0 such that P(x ≤ x0) = 0.95 = P(z ≤ z0). Since the area to the left of z0 is greater than 0.5, z0 must be larger than 0 such that P(0 ≤ z ≤ z0) = 0.45, so z0 = 1.645. Thus, x0 = 75 + (1.645)(4) = 81.58.
c. Find x0 such that P(x ≤ x0) = 0.05 = P(z ≤ z0). Since the area to the left of z0 is smaller than 0.5, z0 must be smaller than 0 such that P(z0 ≤ z ≤ 0) = 0.45, so z0 = –1.645. Thus, x0 = 75 + (–1.645)(4) = 68.42.

PTS: 1 REF: 243-246 | 720-721 | 78 BLM: Higher Order - Apply

TOP: 1–3

Learning Time of Computer Software Narrative The time x a student spends learning a computer software package is normally distributed with a mean of 8 hours and a standard deviation of 1.5 hours. A student is selected at random.

31. Refer to Learning Time of Computer Software Narrative. What is the probability that the student spends less than 6 hours learning the software package? ANS: P(x < 6) = P(z < –1.33) = 0.5 – 0.4082 = 0.0918 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

32. Refer to Learning Time of Computer Software Narrative. What is the probability that the student spends at least 8.5 hours learning the software package? ANS: P(x ≥ 8.5) = P(z ≥ 0.33) = 0.5 – 0.1293 = 0.3707 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

33. Refer to Learning Time of Computer Software Narrative. What is the probability that the student spends between 6.5 and 8.5 hours learning the software package? ANS: P(6.5 ≤ x ≤ 8.5) = P(–1 ≤ z ≤ 0.33) = 0.3413 + 0.1293 = 0.4706

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

34. A normal random variable x has an unknown mean μ and a standard deviation σ = 2.5. If the probability that x exceeds 7.5 is 0.8289, find μ.

ANS: It is given that x is normally distributed with σ = 2.5 but with unknown mean μ, and that P(x > 7.5) = 0.8289. In terms of the standard normal random variable z, we can write P(z > (7.5 – μ)/2.5) = 0.8289. Since the area to the right of (7.5 – μ)/2.5 is greater than 0.5, then (7.5 – μ)/2.5 must be negative. Moreover, P[(7.5 – μ)/2.5 < z < 0] = 0.3289. Hence, (7.5 – μ)/2.5 = –0.95. This implies μ = 9.875.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

35. A normal random variable x has mean 36.7 and standard deviation 10. Find a value of x such that the area under the normal distribution to the right of this value is equal to 0.01. ANS: This is the 99th percentile of this normal distribution. Find x0 such that P(x ≤ x0) = 0.99 = P(z ≤ z0). Since the area to the left of z0 is greater than 0.5, z0 must be larger than 0, such that P(0 ≤ z ≤ z0) = 0.49, so z0 = 2.33. Thus, the value of x0 = 36.7 + (2.33)(10) = 60.0.

PTS: 1 REF: 243-246 | 720-721 | 78 BLM: Higher Order - Analyze

TOP: 1–3

36. A normal random variable x has mean 54 and standard deviation 15. Would it be unusual to observe the value x = 0? Explain your answer. ANS: The z-value corresponding to x = 0 is z = (0 – 54)/15 = –3.60. Since the value x = 0 lies more than three standard deviations away from the mean, it is considered an unusual observation. The probability of observing a value of z as small as or smaller than z = –3.6 is approximately 0. PTS: 1 REF: 243-246 | 720-721 | 77-78 BLM: Higher Order - Evaluate

TOP: 1–3

37. A normal random variable x has an unknown mean and standard deviation. The probability that x exceeds 4 is 0.975, and the probability that x exceeds 5 is 0.95. Find μ and σ.

ANS: The random variable x is normal with unknown μ and σ. However, it is given that P(x > 4) = P(z > (4 – μ)/σ) = 0.975 and P(x > 5) = P(z > (5 – μ)/σ) = 0.95. Since the area to the right of (4 – μ)/σ is greater than 0.50, then (4 – μ)/σ is negative, with P[(4 – μ)/σ < z < 0] = 0.475. Therefore, (4 – μ)/σ = –1.96 (a). Similarly, (5 – μ)/σ is also negative, with P[(5 – μ)/σ < z < 0] = 0.45. Therefore, (5 – μ)/σ = –1.645 (b). Solving Equations (a) and (b) simultaneously for μ and σ, we get μ = 10.222 and σ = 3.1745. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order

TOP: 1–3
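Question 37 reduces to two linear equations in μ and σ, one per known tail probability. The sketch below solves them by elimination; the function name `solve_mu_sigma` is hypothetical, and unrounded z-values are used, so the output differs slightly from the table-based answer.

```python
from statistics import NormalDist

std_normal = NormalDist()

def solve_mu_sigma(x1, p_exceed1, x2, p_exceed2):
    """Solve (x1 - mu)/sigma = z1 and (x2 - mu)/sigma = z2 for mu and sigma.

    p_exceed1 and p_exceed2 are right-tail probabilities P(X > x1), P(X > x2).
    """
    z1 = std_normal.inv_cdf(1 - p_exceed1)
    z2 = std_normal.inv_cdf(1 - p_exceed2)
    # Subtracting the two equations eliminates mu.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma

mu, sigma = solve_mu_sigma(4, 0.975, 5, 0.95)
print(f"mu = {mu:.3f}, sigma = {sigma:.4f}")
```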

Braking Distance Narrative For a car travelling 60 kilometres per hour (km/h), the distance required to brake to a stop is normally distributed, with mean of 17 metres and a standard deviation of 2.4 metres. Suppose you are travelling 60 km/h in a residential area and a car moves abruptly into your path at a distance of 20 m.

38. Refer to Braking Distance Narrative. If you apply your brakes, what is the probability that you will brake to a stop within 14 m or less? ANS: It is given that x is normally distributed with μ = 17 and σ = 2.4. P(x ≤ 14) = P(z ≤ (14 – 17)/2.4) = P(z ≤ –1.25) = 0.5 – 0.3944 = 0.1056

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

39. Refer to Braking Distance Narrative. If you apply your brakes, what is the probability that you will brake to a stop within 17 m or less? ANS: It is given that x is normally distributed with μ = 17 and σ = 2.4. P(x ≤ 17) = P(z ≤ 0) = 0.5 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

40. Refer to Braking Distance Narrative. If the only way to avoid a collision is to brake to a stop, what is the probability that you will avoid the collision? ANS: In order to avoid a collision, you must brake within 20 m or less. Hence, P(x ≤ 20) = P(z ≤ (20 – 17)/2.4) = P(z ≤ 1.25) = 0.5 + 0.3944 = 0.8944

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

National Achievement Test Scores Narrative The scores on a national achievement test were normally distributed, with a mean of 550 and a standard deviation of 112.

41. Refer to National Achievement Test Scores Narrative. If you achieved a score of 690, how far, in standard deviations, did your score depart from the mean? ANS: It is given that the scores were normally distributed, with mean μ = 550 and standard deviation σ = 112. Therefore, the z-score associated with a score of 690 is z = (690 – 550)/112 = 1.25. PTS: 1 REF: 237-238 BLM: Higher Order - Apply

TOP: 1–3

42. Refer to National Achievement Test Scores Narrative. What percentage of those who took the examination scored higher than you? ANS: P(x > 690) = P(z > 1.25) = 0.5 – 0.3944 = 0.1056. Thus, approximately 10.6% of the people who took the test scored higher than 690. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

Canada Revenue Agency Audits Narrative How does Canada Revenue Agency decide on the percentage of income tax returns to audit for each province? Suppose the Agency does it by randomly selecting 50 values from a normal distribution with a mean equal to 1.25% and a standard deviation equal to 0.4%.

43. Refer to Canada Revenue Agency Audits Narrative. What is the probability that a particular province will have more than 2% of its income tax returns audited? ANS: Define x to be the percentage of returns audited for a particular province. It is given that x is normally distributed with μ = 1.25 and σ = 0.4. P(x > 2.0) = P(z > (2.0 – 1.25)/0.4) = P(z > 1.88) = 0.5 – 0.4699 = 0.0301

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

44. Refer to Canada Revenue Agency Audits Narrative. What is the probability that a province will have less than 1% of its income tax returns audited?

ANS: P(x < 1) = P(z < (1 – 1.25)/0.4) = P(z < –0.63) = 0.5 – 0.2357 = 0.2643

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

45. Suppose the numbers of a particular type of bacterium in samples of 1 millilitre (mL) of drinking water tend to be approximately normally distributed, with a mean of 80 and a standard deviation of 10. What is the probability that a given 1 mL sample will contain more than 100 bacteria? ANS: It is given that the counts of the number of bacteria are normally distributed, with μ = 80 and σ = 10. The z-value corresponding to x = 100 is z = (100 – 80)/10 = 2. Hence, P(x > 100) = P(z > 2.0) = 0.5 – 0.4772 = 0.0228. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

46. A car dealership has found that the length of time before a major repair is required on the new cars it sells is normally distributed, with a mean of 36 months and a standard deviation of 9 months. If the dealer wants only 5% of the cars to fail before the end of the warranty period, for how many months should the cars be guaranteed? ANS: It is given that x is normally distributed with μ = 36 and σ = 9. Let t be the warranty time for the car. It is necessary that only 5% of the cars fail before time t. That is, P(x < t) = 0.05, or P(z < (t – 36)/9) = 0.05. Since the area to the left of (t – 36)/9 is smaller than 0.50, then (t – 36)/9 must be negative and P[(t – 36)/9 < z < 0] = 0.45. Hence, (t – 36)/9 = –1.645, which implies that t = 21.195 months. PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

47. The average length of time required to complete a college achievement test was found to equal 75 minutes, with a standard deviation of 15 minutes. When should the test be terminated if you wish to allow sufficient time for 90% of the students to complete the test? (Assume that the time required to complete the test is normally distributed.)

ANS: It is given that μ = 75 and σ = 15. The objective is to determine a particular value, x0, for the random variable x so that P(x < x0) = 0.90; that is, 90% of the students will finish the examination before the set time limit. Standardizing, P(z < (x0 – 75)/15) = 0.90. Since the area to the left of (x0 – 75)/15 is larger than 0.5, then (x0 – 75)/15 must be greater than 0, and P(0 < z < (x0 – 75)/15) = 0.40. Hence, (x0 – 75)/15 = 1.28, which implies that x0 = 94.2.

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

Gestation Time for Human Babies Narrative The gestation time for human babies is approximately normally distributed, with an average of 280 days and a standard deviation of 12 days.

48. Refer to Gestation Time for Human Babies Narrative. Find the upper and lower quartiles for the gestation times. ANS: The random variable x, the gestation time for a human baby, is normally distributed, with μ = 280 and σ = 12. The values (rounded to two decimal places) z = –0.67 and z = 0.67 represent the 25th and 75th percentiles of the standard normal distribution. Converting these values to their equivalents for the general random variable x using the relationship x = μ + zσ, you have: the lower quartile, x = 280 + (–0.67)(12) = 271.96; and the upper quartile, x = 280 + (0.67)(12) = 288.04.

PTS: 1 REF: 237-238 | 79-81 BLM: Higher Order - Analyze

TOP: 1–3

49. Refer to Gestation Time for Human Babies Narrative. Would it be unusual to deliver a baby after only 6 months of gestation? Explain. ANS: If you consider a month to be approximately 30 days, the value x = 6(30) = 180 is unusual, since z = (180 – 280)/12 = –8.333; that is, it lies 8.333 standard deviations below the mean gestation time.

PTS: 1 REF: 237-238 | 77-78 BLM: Higher Order - Evaluate

TOP: 1–3

Water Usage Narrative The mayor of the city of Windsor was informed that household water usage is a normally distributed random variable with a mean of 95 litres and a standard deviation of 16 litres per day.

50. Refer to Water Usage Narrative. Find the probability that a randomly chosen household uses more than 87 litres per day. ANS: P(x > 87) = P(z > –0.50) = 0.6915 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

51. Refer to Water Usage Narrative. Find the probability that a randomly chosen household uses between 75 and 95 litres per day. ANS: P(75 ≤ x ≤ 95) = P(–1.25 ≤ z ≤ 0) = 0.3944

PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

52. Refer to Water Usage Narrative. Find the probability that a randomly chosen household uses less than 79 litres per day. ANS: P(x < 79) = P(z < –1) = 0.1587 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

53. Refer to Water Usage Narrative. If the mayor wants to give a tax rebate to the 20% lowest water users, what should the litres-per-day cut-off be? ANS: x = μ + zσ = 95 + (–0.84)(16) = 81.56 litres

PTS: 1 REF: 237-238 BLM: Higher Order - Analyze

TOP: 1–3

54. Refer to Water Usage Narrative. The mayor advertised on a local TV channel for about 3 weeks his intention to give a tax rebate to the 20% lowest water users. If the advertisement lowered the mean usage of 95 litres to 87 litres, what should the adjusted litres-per-day cut-off be? ANS: x = μ + zσ = 87 + (–0.84)(16) = 73.56 litres

PTS: 1 REF: 237-238 BLM: Higher Order - Analyze

TOP: 1–3

55. If Z is a standard normal random variable, find the value z for which the following hold:
a. the area between 0 and z is 0.3729
b. the area to the right of z is 0.7123
c. the area to the left of z is 0.1736
d. the area to the left of z is 0.7673
e. the area to the right of z is 0.1841
f. the area between –z and z is 0.6630

ANS: a. 1.14 b. –0.56 c. –0.94 d. 0.73 e. 0.90 f. 0.96 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

56. If X is a normal random variable with a mean of 45 and a standard deviation of 8, find the following probabilities: a. P(X ≥ 50) b. P(X ≤ 32) c. P(37 ≤ X ≤ 48) d. P(50 ≤ X ≤ 60) e. P(X = 45) ANS: a. 0.2643 b. 0.0516 c. 0.4893 d. 0.2342 e. 0.0 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

Soup Cans Narrative The liquid volumes contained in cans of soup produced by a company are normally distributed, with a mean of 425 mL and a standard deviation of 16 mL. 57. Refer to Soup Cans Narrative. What is the probability that a can of soup selected randomly from the entire production line will contain at most 400 mL? ANS: 0.0594 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

58. Refer to Soup Cans Narrative. Determine the minimum volume of the fullest 5% of all cans of soup produced. ANS: 451.32 mL PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

59. Refer to Soup Cans Narrative. If 28,390 of the cans of soup of the entire production contain at least 449 mL, how many cans of soup have been produced? ANS: 425,000 cans PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3
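The answer to question 59 is given without working. One way to arrive at it, sketched here under the assumption that the 28,390 cans are exactly the expected share of production at or above 449 mL, is to divide that count by the tail probability. The variable names are illustrative.

```python
from statistics import NormalDist

# Soup can volume: mean 425 mL, standard deviation 16 mL.
volume = NormalDist(mu=425, sigma=16)

# Proportion of cans holding at least 449 mL (z = 1.5).
p_at_least_449 = 1 - volume.cdf(449)

# If 28,390 cans make up that proportion, scale up to the full production.
cans_at_least_449 = 28_390
total_cans = cans_at_least_449 / p_at_least_449
print(f"proportion = {p_at_least_449:.4f}, total is about {total_cans:,.0f} cans")
```

With the table value 0.0668 the division gives 425,000 cans; the unrounded probability gives a figure a few dozen cans lower.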

60. Given that X is a normally distributed random variable with a mean of 50 and a standard deviation of 2, find the probability that X is between 46 and 52. ANS: 0.8185 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

Pacific Salmon Narrative The owner of a fish market determined that the average weight for a Pacific salmon is 3.6 kg, with a standard deviation of 0.8 kg. Assume the weights of the salmon are normally distributed. 61. Refer to Pacific Salmon Narrative. What is the probability that a randomly selected salmon will weigh more than 4.8 kg? ANS: 0.0668 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

62. Refer to Pacific Salmon Narrative. What is the probability that a randomly selected salmon will weigh between 3 and 5 kg? ANS: 0.7333 PTS: REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

63. Refer to Pacific Salmon Narrative. A randomly selected salmon will weigh more than x kg to be one of the top 5% in weight. What is the value of x? ANS: x = 3.6 + (1.645)(0.8) = 4.916 kg PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

64. Refer to Pacific Salmon Narrative. A randomly selected salmon must weigh less than a certain number, say x, kg to be one of the bottom 20% in weight. What is the value of x? ANS: x = 3.6 + (–0.84)(0.8) = 2.928 kg PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

65. Refer to Pacific Salmon Narrative. Above what weight do 87.70% of the weights occur? ANS: x = 3.6 + (–1.16)(0.8) = 2.672 kg PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

66. Refer to Pacific Salmon Narrative. What is the probability that a randomly selected salmon will weigh less than 3.2 kg? ANS: 0.3085 PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

67. Refer to Pacific Salmon Narrative. Below what weight do 83.4% of the weights occur? ANS: x = 3.6 + (0.97)(0.80) = 4.376 kg PTS: 1 REF: 243-246 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

68. Suppose z has a standard normal distribution. Then below which value do 28.1% of the possible z-values occur? ANS: –0.58 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

69. Suppose z has a standard normal distribution. Then 85.08% of the possible z-values are smaller than which z-value? ANS: 1.04 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

70. Suppose z has a standard normal distribution. Then between which two z-values (symmetrically distributed around the mean) would 95.96% of the possible z-values occur? ANS: –2.05 and 2.05 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

71. Suppose z has a standard normal distribution. Then between which two z-values (symmetrically distributed around the mean) would 59.9% of the possible z-values occur? ANS: –0.84 and 0.84 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

72. Suppose z has a standard normal distribution. Then above which z-value would 16.6% of the possible z-values occur? ANS: 0.97 PTS: 1 REF: 238-242 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

73. Consider a binomial random variable x with n = 25 and p = 0.01. Would the normal approximation be appropriate here? Why or why not? ANS: No. Because p is so small, the distribution of x will be asymmetrical (right-skewed) and the normal curve, being symmetrical, would provide a poor approximation to the distribution of x.

PTS: 1 REF: 251 BLM: Higher Order - Evaluate

TOP: 4
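The appropriateness check in question 73 can be expressed with the rule of thumb used elsewhere in this chapter, namely that both np and nq should exceed 5. The helper below is a hypothetical sketch of that rule, not a formal test of normality.

```python
def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: both np and nq (with q = 1 - p) must exceed the threshold."""
    q = 1 - p
    return n * p > threshold and n * q > threshold

# n = 25, p = 0.01 gives np = 0.25, far below 5, so the check fails.
print(normal_approx_ok(25, 0.01))
# n = 25, p = 0.3 gives np = 7.5 and nq = 17.5, so the check passes.
print(normal_approx_ok(25, 0.3))
```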

University Faculty Survey Narrative A recent survey of university faculty reported 55% were considering going into another profession. 74. Refer to University Faculty Survey Narrative. Based on this information, what is the expected number of faculty who would be considering going into another profession if we randomly sampled 200 faculty members? ANS: 200 (0.55) = 110 faculty members PTS: 1 REF: 249 BLM: Higher Order - Analyze

TOP: 4

75. Refer to University Faculty Survey Narrative. What is the approximate probability 60 or more faculty from a random sample of 200 would be considering going into another profession? ANS: P(x ≥ 60) ≈ P(z > –7.18) ≈ 1.0. We can be almost certain 60 or more faculty members from a random sample of 200 would be considering going into another profession. PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

Eligible Manitoba Voters Narrative Past records indicate 60% of the eligible voters in Manitoba vote in local elections. 76. Refer to Eligible Manitoba Voters Narrative. What is the expected number in a random sample of 200 eligible voters who will vote in the next local election? ANS: 200(0.6) = 120 PTS: 1 REF: 249 BLM: Higher Order - Analyze

TOP: 4

77. Refer to Eligible Manitoba Voters Narrative. What is the approximate probability more than 150 from a sample of 200 will vote in the next local election? ANS: P(x > 150) ≈ P(z > 4.4) ≈ 0

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

Textbook Narrative Twenty-five psychology instructors have formed a committee to pick next year’s textbook, and they have narrowed their decision down to two equally good books, one with a better bibliography and references, and the other with a better format and illustrations. Since the books are considered to be equally good, we will assume the probability an instructor chooses either book is 0.5 and the instructors’ decisions are made independently.

78. Refer to Textbook Narrative. Using the binomial distribution, find the probability 15 or more instructors choose the book with the better format and illustrations. ANS: P(x ≥ 15) = 0.212 PTS: 1 REF: 197-199 BLM: Higher Order - Apply

TOP: 4

79. Refer to Textbook Narrative. Use the normal approximation to the binomial to find the probability 15 or more instructors choose the book with the better format and illustrations. ANS: P(x ≥ 15) ≈ 0.2119 PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

80. Refer to Textbook Narrative. Compare the results in the previous two questions. ANS: The normal approximation is quite close to the actual value. PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4
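The comparison in questions 78 to 80 can be reproduced directly. The sketch below computes the exact binomial tail with `math.comb` and the normal approximation with a continuity correction (the boundary 14.5 is assumed, since "15 or more" on the integer scale starts half a unit below 15). Variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5

# Exact binomial probability P(x >= 15).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(15, n + 1))

# Normal approximation with mean np and standard deviation sqrt(npq),
# using the continuity correction: P(x >= 15) is approximated by P(X > 14.5).
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = 1 - approx_dist.cdf(14.5)

print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

The two values agree to about three decimal places, which is what the answer to question 80 describes as "quite close".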

81. Twenty percent of Canadian men over 30 have high blood pressure. To run an experiment, a researcher needs to identify at least 15 men with high blood pressure. If the researcher examines 100 men over 30, what is the approximate probability he or she finds enough men with high blood pressure to run the experiment? ANS: Let the random variable x denote the number of men who have high blood pressure; x is binomial with n = 100 and p = 0.2. Then, using the normal approximation to the binomial with μ = np = 20 and σ = √(npq) = √16 = 4, and applying the continuity correction, P(x ≥ 15) ≈ P(z > (14.5 – 20)/4) = P(z > –1.38) = 0.9162.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4
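The calculation in question 81 can be sketched in code as follows. The continuity-corrected boundary 14.5 is an assumption consistent with the z-value –1.38 in the answer; the small difference from 0.9162 comes from the table rounding z to two decimals.

```python
from math import sqrt
from statistics import NormalDist

n, p = 100, 0.2
mu = n * p                      # expected number with high blood pressure
sigma = sqrt(n * p * (1 - p))   # standard deviation of the binomial count

# "At least 15" on the integer scale corresponds to X > 14.5 on the normal scale.
z = (14.5 - mu) / sigma
p_enough = 1 - NormalDist(mu, sigma).cdf(14.5)

print(f"z = {z:.3f}, P(x >= 15) is approximately {p_enough:.4f}")
```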

82. Suppose the current median age of Canadian citizens is 30. If a survey of 500 randomly selected Canadian citizens is conducted, what is the probability that, at most, 240 of them will be under 30 years old? ANS: P(under 30 years of age) = 0.5 since the median age is 30, and 50% of the values fall below their median. Let x be the number of persons surveyed who are under 30; x is binomial with n = 500 and p = 0.5. Then, using the normal approximation to the binomial with μ = np = 250 and σ = √(npq) = √125 = 11.18, and applying the continuity correction, P(x ≤ 240) ≈ P(z < (240.5 – 250)/11.18) = P(z < –0.85) = 0.5 – 0.3023 = 0.1977.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

83. A firm with many travelling salespersons decides to check the salespersons’ travel expenses to see if they are correctly reported. An auditor for the firm selects 200 expense reports at random to audit. What is the probability more than 20% of the sampled reports will be incorrect when, in fact, only 10% of the firm’s expense reports are improperly documented? ANS: Let the random variable x denote the number of travel reports improperly documented; x is binomial with n = 200 and p = 0.10. Then, using the normal approximation to the binomial with μ = np = 20 and σ = √(npq) = √18 = 4.243, we have P(x > 40) ≈ P(z > (40.5 – 20)/4.243) = P(z > 4.83) ≈ 0. PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

University Football Narrative A student government representative claims that 55% of the student body of a local university favour a move to Division I in university football. A random sample of 2000 students is selected. 84. Refer to University Football Narrative. What is the expected number of students that will favour the move to Division I?

ANS: E(x) = np = 2000(0.55) = 1100 PTS: 1 REF: 249 BLM: Higher Order - Analyze

TOP: 4

85. Refer to University Football Narrative. What is the approximate probability that fewer than 1150 students of the 2000 selected will favour the move to Division I? ANS: Since np = 2000(0.55) = 1100 > 5, and nq = 2000(0.45) = 900 > 5, use the normal approximation with μ = np = 1100 and σ = √(npq) = √(2000(0.55)(0.45)) = 22.2486. Thus, P(X ≤ 1150) ≈ P(z ≤ (1150.5 – 1100)/22.2486) = P(z ≤ 2.27) = 0.5 + 0.4884 = 0.9884.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

86. Let x be a binomial random variable with n = 25 and p = 0.3. a. Is the normal approximation appropriate for this binomial random variable? b. Find the mean and standard deviation for x. c. Use the normal approximation to find P(6 ≤ x ≤ 9). d. Use the cumulative binomial probabilities table in Appendix 1 of the textbook to calculate the exact probability P(6 ≤ x ≤ 9). Compare the results of parts c and d. How close was your approximation?

ANS: a. The normal approximation will be appropriate if both np and nq are greater than 5. For this binomial experiment, np = 25(0.3) = 7.5 and nq = 25(0.7) = 17.5, and the normal approximation is appropriate. b. For the binomial random variable, μ = np = 7.5 and σ = √(npq) = √(25(0.3)(0.7)) = 2.291. c. To approximate the binomial probability P(6 ≤ x ≤ 9), we use the “correction for continuity” and find the area under a normal curve with mean 7.5 and standard deviation 2.291 between x = 5.5 and x = 9.5. The approximating probability is P(5.5 < x < 9.5) = P(–0.87 < z < 0.87) = 0.6156. d. P(6 ≤ x ≤ 9) = P(x ≤ 9) – P(x ≤ 5) = 0.811 – 0.193 = 0.618, which is not too far from the approximate probability calculated in (c).

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4
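
The comparison asked for in question 86 can also be carried out directly. The sketch below is illustrative only, assuming Python 3.8+ and the standard library; the function names are hypothetical, and the interval 6 ≤ x ≤ 9 is inferred from the continuity-corrected limits 5.5 and 9.5 given in the answer.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_pmf(k, n, p):
    """Exact binomial probability P(x = k)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_between(a, b, n, p):
    """Exact P(a <= x <= b), summing the binomial probabilities."""
    return sum(binom_pmf(k, n, p) for k in range(a, b + 1))


def approx_between(a, b, n, p):
    """Normal approximation to P(a <= x <= b) with continuity correction."""
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    nd = NormalDist(mu, sigma)
    return nd.cdf(b + 0.5) - nd.cdf(a - 0.5)


exact_q86 = exact_between(6, 9, 25, 0.3)
approx_q86 = approx_between(6, 9, 25, 0.3)
print(round(exact_q86, 3), round(approx_q86, 3))
```

The two values agree to about two decimal places, which is the sense in which the approximation is "not too far" from the exact probability.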

87. Suppose the records of a motel show that, on average, 10% of prospective guests will not claim their reservation. If the motel accepts 218 reservations and there are only 200 rooms in the motel, what is the probability that all guests who arrive to claim a room will receive one? ANS: Define x to be the number of guests claiming a reservation at the motel. Then p = P[guest claims reservation] = 1 – 0.1 = 0.9 and n = 218. The motel has only 200 rooms. Hence, if x > 200, a guest will not receive a room. The probability of interest is then P(x ≤ 200). Using the normal approximation, we first calculate μ = np = 218(0.9) = 196.2 and σ = √(npq) = √(218(0.9)(0.1)) = 4.429. The probability P(x ≤ 200) is approximated by the area under the appropriate normal curve to the left of 200.5. Thus, P(x ≤ 200) ≈ P(z ≤ (200.5 – 196.2)/4.429) = P(z ≤ 0.97) = 0.8340.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

U.S. Presidential Election Narrative Do Americans tend to vote for the taller of the two candidates in a presidential election? In 30 presidential elections since 1856, 18 of the winners were taller than their opponents. Assume that Americans are not biased by a candidate’s height and that the winner is just as likely to be taller or shorter than his opponent. Is the observed number of taller winners in the U.S. presidential elections unusual? 88. Refer to U.S. Presidential Election Narrative. Find the approximate probability of finding 18 or more of the 30 pairs in which the taller candidate wins. ANS: Define x to be the number of elections in which the taller candidate won. If Americans are not biased by height, then the random variable x has a binomial distribution with n = 30 and p = 0.5. Calculate μ = np = 15 and σ = √(npq) = √(30(0.5)(0.5)) = 2.739. Using the normal approximation with correction for continuity, we find the area to the right of x = 17.5: P(x ≥ 18) ≈ P(z > (17.5 – 15)/2.739) = P(z > 0.91) = 1 – 0.8186 = 0.1814.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

89. Refer to U.S. Presidential Election Narrative. Based on your answer to the previous question, can you conclude that Americans consider a candidate’s height when casting their ballot? ANS: Since the occurrence of 18 or more of 30 taller choices is not unusual, based on the results of the previous question, it appears that Americans do not consider height when casting a vote for a candidate. PTS: 1 REF: 248-249 BLM: Higher Order - Evaluate

TOP: 4

Blood Types Narrative In a certain population, 18% of the people have Rh-negative blood. A blood bank serving this population receives 95 blood donors on a particular day. 90. Refer to Blood Types Narrative. What is the probability that 10 or fewer are Rh-negative? ANS: Define x to be the number of people with Rh-negative blood. Then the random variable x has a binomial distribution with n = 95 and p = 0.18. Calculate μ = np = 95(0.18) = 17.1 and σ = √(npq) = √(95(0.18)(0.82)) = 3.745. Using the normal approximation with correction for continuity, we find the area to the left of x = 10.5: P(x ≤ 10) ≈ P(z < (10.5 – 17.1)/3.745) = P(z < –1.76) = 0.0392.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

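91. Refer to Blood Types Narrative. What is the probability that 15 to 20 (inclusive) of the donors are Rh-negative? ANS: Similar to the previous question, with μ = 17.1 and σ = 3.745, finding the area between x = 14.5 and x = 20.5: P(15 ≤ x ≤ 20) ≈ P((14.5 – 17.1)/3.745 < z < (20.5 – 17.1)/3.745) = P(–0.69 < z < 0.91) = 0.8186 – 0.2451 = 0.5735.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4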

92. Refer to Blood Types Narrative. What is the probability that more than 80 of the donors are Rh-positive? ANS: The probability that more than 80 donors are Rh-positive is the same as the probability that fewer than 15 donors are Rh-negative, approximated with the area to the left of x = 14.5, using μ = 17.1 and σ = 3.745: P(x < 15) ≈ P(z < (14.5 – 17.1)/3.745) = P(z < –0.69) = 0.2451.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4

College Applicants Narrative The admissions office of a community college in Ontario is asked to accept deposits from a number of qualified prospective freshmen so that, with probability of about 0.95, the size of the freshmen class will be less than or equal to 150. Suppose the applicants constitute a random sample from a population of applicants, 80% of whom would actually enter the freshmen class if accepted. 93. Refer to College Applicants Narrative. How many deposits should the admissions counselor accept? ANS: The random variable x is the size of the freshman class. That is, the admission office will send letters of acceptance to (or accept deposits from) a certain number of qualified students. Of these students, a certain number will actually enter the freshman class. Since the experiment results in one of two outcomes (enter or not enter), the random variable x, the number of students entering the freshman class, has a binomial distribution with n = number of deposits accepted and p = P[student, having been accepted, enters freshman class] = 0.8. It is necessary to find a value for n such that P(x ≤ 150) = 0.95. Note that μ = np = 0.8n and σ = √(npq) = √(n(0.8)(0.2)) = 0.4√n. Using the normal approximation with correction for continuity, the z-value corresponding to x = 150.5 is z = (150.5 – 0.8n)/(0.4√n). The z-value corresponding to an area of 0.05 in the right tail of the normal distribution is 1.645. Then (150.5 – 0.8n)/(0.4√n) = 1.645. Solving for n in the above equation, we obtain the quadratic equation in √n: 0.8n + 0.658√n – 150.5 = 0. It can be shown that the positive root of this equation is √n = 13.3106, or n = 177.1727. Thus, 177 deposits should be accepted.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Analyze

TOP: 4
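
The quadratic in question 93 can be solved numerically as a check. This is a rough sketch under the stated setup (p = 0.8, z = 1.645, continuity-corrected limit 150.5); the variable names are illustrative. Because the code does not round intermediate values, its n differs from the printed 177.1727 in the third decimal place, but the rounded-down answer is the same.

```python
from math import sqrt

p = 0.8          # probability an accepted applicant enters the class
z = 1.645        # z-value leaving an area of 0.05 in the right tail
limit = 150.5    # class-size limit 150 plus the continuity correction

# (limit - p*n) / sqrt(n*p*(1-p)) = z, with r = sqrt(n), becomes
# p*r**2 + z*sqrt(p*(1-p))*r - limit = 0.
a = p
b = z * sqrt(p * (1 - p))
c = -limit

# Positive root of the quadratic in r = sqrt(n).
r = (-b + sqrt(b * b - 4 * a * c)) / (2 * a)
n_deposits = r**2

print(round(n_deposits, 2), int(n_deposits))
```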

94. Refer to College Applicants Narrative. If applicants in the number determined in the previous question are accepted, what is the probability that the freshmen class size will be less than 135? ANS: Once n = 177 has been determined, the mean and standard deviation of the distribution are μ = np = 141.6 and σ = √(npq) = 5.322. Then the approximation for P(x < 135) is: P(x ≤ 134.5) = P(z ≤ –1.33) = 0.50 – 0.4082 = 0.0918.

PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

Loans Narrative Historical data collected at a small bank in Nova Scotia revealed that 80% of all customers applying for a loan are accepted. Suppose that 50 new loan applications are selected at random.

95. Refer to Loans Narrative. Find the expected value and the standard deviation of the number of loans that will be accepted by the bank. ANS: Let X be the number of loans from a total of 50 that are accepted. Then X is a binomial random variable with n = 50 and p = 0.80. Therefore, E(X) = np = 40, and σ = √(npq) = √(50(0.80)(0.20)) = 2.828. PTS: 1 REF: 249 BLM: Higher Order - Analyze

TOP: 4

96. Refer to Loans Narrative. What is the probability that at least 42 loans will be accepted? ANS: 0.2981 PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

97. Refer to Loans Narrative. What is the probability that the number of loans rejected is between 10 and 15, inclusive? ANS: 0.5452 PTS: 1 REF: 251 | 720-721 BLM: Higher Order - Apply

TOP: 4

Chapter 7—Sampling Distributions MULTIPLE CHOICE 1. Which of the following does NOT correctly describe a random sample? a. It is a subset of the population of interest. b. Its summary measures are called parameters. c. Its summary measures are called statistics. d. Each of the elements in it has the same likelihood of being selected. ANS: B BLM: Remember

PTS:

REF: 268-269

TOP: 1–2

2. Which of the following is a commonly used parameter? a. the sample mean, x̄ b. the standard deviation, s c. the standard deviation, σ d. the sample mean, x̄, and the standard deviation, s ANS: C BLM: Remember

PTS:

REF: 268

TOP: 1–2

3. Consider a large population with a mean of 150 and a standard deviation of 27. A random sample of size 36 is taken from this population. Which of these values is equal to the standard error of the sampling distribution of sample mean? a. 4.17 b. 4.50 c. 5.20 d. 5.56 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 281

TOP: 3–5

4. Random samples of size 36 each are taken from a large population whose mean is 120 and standard deviation is 39. In this case, which of the following are the values of the mean and the standard error, respectively, of the sampling distribution of the sample mean? a. 120 and 39 b. 120 and 6.5 c. 39 and 120 d. 6.5 and 120 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 281

TOP: 3–5

5. Which of the sample sizes below will produce a sampling distribution of the mean that is approximately normal? a. 10 b. 20 c. 30 d. 35 ANS: C BLM: Remember

PTS:

REF: 279-280

TOP: 3–5

6. If all possible samples of size n are drawn from a large population with a mean of 20 and a standard deviation of 5, then for which of the following samples sizes would the standard error of the sample mean equal 1.0? a. 15 b. 20 c. 25 d. 30 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 281

TOP: 3–5
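
Questions 3, 4, and 6 all rest on the same relationship, SE = σ/√n. A minimal sketch of that arithmetic follows; the function names are hypothetical and chosen only for illustration.

```python
from math import sqrt


def standard_error(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def sample_size_for_se(sigma, se):
    """Sample size n that gives the requested standard error."""
    return (sigma / se) ** 2


se_q3 = standard_error(27, 36)        # question 3
se_q4 = standard_error(39, 36)        # question 4
n_q6 = sample_size_for_se(5, 1.0)     # question 6
print(se_q3, se_q4, n_q6)
```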

7. Which of the following is the standard error of a statistic used as an estimator of a population parameter? a. the standard deviation of the sampling distribution of the statistic b. the variance of the sampling distribution of the statistic c. the same value as the population standard deviation d. the same value as the population variance ANS: A BLM: Remember

PTS:

REF: 281

TOP: 3–5

8. If all possible samples of size n are drawn from a large population with a mean of μ and a standard deviation of σ, then the standard error of the sample mean is inversely proportional to which of the values listed below? a. b. c. n d. √n ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 281

TOP: 3–5

9. When drawing all possible simple random samples of a given size n from a population, many different values of a sample statistic might occur. What is the term for a listing of all these possible values, along with the associated probabilities of their occurrence? a. the statistic’s finite population correction factor b. the statistic’s probability density function c. the statistic’s sampling distribution d. the statistic’s standard normal deviate ANS: C BLM: Remember

PTS:

REF: 275-277

TOP: 3–5

10. Given a sampling distribution of the sample mean, x̄, that is normally distributed and has a mean of 30 and a standard deviation of 8, to which of the following does a sample mean of 40 correspond? a. a z value of –1.25 b. a z value of +1.25 c. a sample size of 120 d. a sample size of 300

ANS: B TOP: 3–5

PTS: 1 REF: 279 | 281-281 BLM: Higher Order - Apply

11. Given a population variance of σ² = 36 and a sample size of n = 9, which of the following would equal the standard deviation of the sampling distribution of the sample mean, x̄? a. 4 b. 2 c. 1/2 d. 1/4 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

12. Given N = 1000, n = 30, and σ = 6, what is the standard error of the sample mean, SE(x̄)? a. 0.009 b. 0.5 c. 1.095 d. 33.33

ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 281

TOP: 3–5

13. When a sample is selected at random from a population, which of the following will probably be the case? a. The sample mean, x̄, will likely be larger than the population mean. b. The sample mean, x̄, will likely be smaller than the population mean. c. The sample mean, x̄, will likely be different from the population mean. d. The sample mean, x̄, will likely be equal to the population mean. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 275-277

TOP: 3–5

14. What is the term for a numerical descriptive measure calculated from a sample? a. a parameter b. a statistic c. a population d. a sampling distribution ANS: B BLM: Remember

PTS:

REF: 268

TOP: 3–5

15. What is the term for a numerical descriptive measure calculated from the entire population? a. a parameter b. a statistic c. a sample d. a sampling distribution ANS: A BLM: Remember

PTS:

REF: 268

TOP: 3–5

16. Suppose the monthly rents of all one-bedroom apartments in a small town are known to be normally distributed with a mean equal to $175 a month and standard deviation of $35. Which of the following would be the highest individual rent that you might expect to find? a. about $210 b. about $280 c. about $4245 d. more information is needed to answer the question ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 279 | 69-70

TOP: 3–5

17. The scores of a class are normally distributed with a mean of 82 and a standard deviation of 8. What is the mean of the sampling distribution of the sample mean, x̄, if a sample of 64 students is selected at random from all students taking that course? a. 64 b. between 74 and 90 c. between 78 and 86 d. 82 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

18. The scores of a class are normally distributed with a mean of 82 and a standard deviation of 8. What is the standard deviation of the sampling distribution of the sample mean, x̄, if a sample of 64 students is selected at random from all students taking that course? a. 10.25 b. 8.00 c. 1.25 d. 1.00 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

19. The scores of a class are normally distributed with a mean of 82 and a standard deviation of 8. What is the probability that the mean score of a sample of 64 students is at least 80? a. 0.0987 b. 0.4772 c. 0.5987 d. 0.9772 ANS: D TOP: 3–5

PTS: 1 REF: 282 | 720-721 BLM: Higher Order - Apply

20. Suppose that 20% of the households in Alberta have incomes in excess of $60,000. Assume that a random sample of 500 households in Alberta is taken. What will be the standard error of the sampling distribution of sample proportion of households that have incomes in excess of $60,000? a. 0.0003 b. 0.0179 c. 0.0256 d. 0.1600 ANS: B TOP: 6

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Analyze

21. As a general rule, the normal distribution provides a good approximation to the sampling distribution of the sample proportion, p̂, only if which of the following conditions holds? a. The sample size, n, is greater than 30. b. The population proportion, p, is greater than 0.50. c. The underlying population has a small standard deviation and n is large. d. np and n(1 – p) are both greater than 5. ANS: D BLM: Remember

PTS:

REF: 289

TOP: 6

22. If the standard error of the sampling distribution of the sample proportion is 0.02049 for samples of size 500, then what may we conclude about the population proportion? a. The population proportion must be either 0.2 or 0.8. b. The population proportion must be either 0.5 or 0.5. c. The population proportion must be either 0.3 or 0.7. d. The population proportion must be either 0.6 or 0.4. ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 289

TOP: 6

23. Given a population proportion of p = 0.8 and a sample size of n = 100, what is the standard deviation of the sampling distribution of the sample proportion, p̂? a. 0.0258 b. 0.0355 c. 0.0400 d. 0.4000

ANS: C PTS: 1 BLM: Higher Order - Apply

TOP: 6

REF: 289

24. In a recent study, it was reported that the proportion of employees who miss work on Fridays is 0.15, and that the standard deviation of the sampling distribution of sample proportion, p̂, is 0.025. What was the sample size, n? a. 204 b. 108 c. 26 d. 87

ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 289

TOP: 6
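
Questions 22 and 24 both invert the formula SE(p̂) = √(pq/n). The following is a small sketch of that rearrangement; the helper names are illustrative, not taken from the textbook.

```python
def sample_size_from_se(p, se):
    """Solve SE = sqrt(p*(1-p)/n) for n."""
    return p * (1 - p) / se**2


def pq_from_se(se, n):
    """Solve SE = sqrt(p*q/n) for the product p*q."""
    return se**2 * n


n_q24 = sample_size_from_se(0.15, 0.025)   # question 24
pq_q22 = pq_from_se(0.02049, 500)          # question 22
print(round(n_q24), round(pq_q22, 2))
```

In question 22 the product pq is about 0.21, which is satisfied by p = 0.3 or p = 0.7 since 0.3(0.7) = 0.21.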

25. A statistics professor has stated that 90% of his students pass the class. To check this claim, a random sample of 150 students indicated that 129 passed the class. If the professor’s claim is correct, what is the probability that 129 or fewer will pass the class this semester? a. 0.9484 b. 0.5516 c. 0.4484 d. 0.0516 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 290-291

TOP: 6
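
The probability in question 25 uses the sampling distribution of p̂ directly, without a continuity correction, as the answer key appears to do. The sketch below assumes Python 3.8+ and the standard library; the function name is hypothetical. Results differ slightly from the table-based answers because z is not rounded to two decimals.

```python
from math import sqrt
from statistics import NormalDist


def prob_p_hat_at_most(p_hat, p, n):
    """Approximate P(sample proportion <= p_hat) using a normal
    distribution with mean p and standard error sqrt(p*(1-p)/n)."""
    se = sqrt(p * (1 - p) / n)
    return NormalDist(p, se).cdf(p_hat)


# Question 25: p = 0.90, n = 150, observed proportion 129/150.
prob_q25 = prob_p_hat_at_most(129 / 150, 0.90, 150)
print(round(prob_q25, 3))
```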

26. For a control chart, where are the lower and upper control limits usually set? a. one standard deviation from the centreline b. two standard deviations from the centreline c. three standard deviations from the centreline d. four standard deviations from the centreline ANS: C BLM: Remember

PTS:

REF: 294

TOP: 7

27. Twenty-five samples of size 1000 each were drawn from a manufacturing process and the number of defectives in each sample was counted. The average sample proportion was 0.05. What would be the upper control limit for the p chart? a. 0.0206 b. 0.0293 c. 0.0475 d. 0.0707 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 296

TOP: 7

28. The mean of the sample means and the standard deviation of 50 samples of size 5 taken from a production process under control are found to be 300 and 25, respectively. At which of the following values would the lower control limit for the x̄ chart be located? a. 333.54 b. 310.61 c. 289.39 d. 266.46 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 294-295

TOP: 7

29. Fifty samples of size 500 were drawn from a manufacturing process and the number of defectives in each sample was counted. The average of the sample proportion was 0.032. At which of the following values would the centreline for the p chart be located? a. 0.032 b. 0.512 c. 0.968 d. 16.0 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 296

TOP: 7

30. Sixty samples of size 600 each were drawn from a manufacturing process and the number of defectives in each sample was counted. The average sample proportion was 0.04. What would be the lower control limit for the p chart? a. 0.008 b. 0.016 c. 0.048 d. 0.064 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 296

TOP: 7
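
The control limits in questions 27, 28, and 30 follow the three-standard-deviation rule from question 26. A minimal sketch of both chart types follows; the function names are hypothetical, and clamping the lower p-chart limit at zero is an assumption added here, not something stated in the questions.

```python
from math import sqrt


def p_chart_limits(p_bar, n):
    """Lower and upper control limits for a p chart:
    p_bar -/+ 3*sqrt(p_bar*(1 - p_bar)/n).
    The lower limit is clamped at 0 (an assumption of this sketch)."""
    half_width = 3 * sqrt(p_bar * (1 - p_bar) / n)
    return max(0.0, p_bar - half_width), p_bar + half_width


def x_bar_chart_limits(grand_mean, s, n):
    """Lower and upper control limits for an x-bar chart:
    grand_mean -/+ 3*s/sqrt(n)."""
    half_width = 3 * s / sqrt(n)
    return grand_mean - half_width, grand_mean + half_width


lcl_q27, ucl_q27 = p_chart_limits(0.05, 1000)       # question 27
lcl_q28, ucl_q28 = x_bar_chart_limits(300, 25, 5)   # question 28
lcl_q30, ucl_q30 = p_chart_limits(0.04, 600)        # question 30
print(round(ucl_q27, 4), round(lcl_q28, 2), round(lcl_q30, 3))
```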

31. Which of the following is another name for assignable cause variation? a. random variation b. special cause variation c. common cause variation d. controlled variation ANS: B BLM: Remember

PTS:

REF: 293-294

TOP: 7

32. What are some of the most common sources of random variation? a. people b. materials c. neither people nor materials d. both people and materials ANS: D BLM: Remember

PTS:

REF: 293-294

TOP: 7

TRUE/FALSE 1. If random samples of size n = 36 are drawn from a non-normal population with finite mean μ = 75 and standard deviation σ = 15, then the sampling distribution of the sample mean, x̄, is approximately normally distributed with mean equal to 75 and standard deviation equal to 2.5. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 279-280

TOP: 3–5

2. If random samples of size n = 50 are drawn from a non-normal population with finite mean μ = 100 and standard deviation σ = 20, then the sampling distribution of the sum of sample measurements, Σx, is approximately normally distributed with mean equal to 5000 and standard deviation equal to 1000.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 279-280

TOP: 3–5
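
Item 2 is false because the standard deviation of a sum of n independent measurements is σ√n, not nσ. A short illustrative check follows; the function name is hypothetical.

```python
from math import sqrt


def sum_distribution(mu, sigma, n):
    """Mean and standard deviation of the sum of n independent
    measurements from a population with mean mu and std dev sigma."""
    return n * mu, sigma * sqrt(n)


mean_sum, sd_sum = sum_distribution(100, 20, 50)
print(mean_sum, round(sd_sum, 2))
```

The mean of 5000 in the statement is correct; only the stated standard deviation of 1000 is wrong.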

3. The spread of the distribution of sample means is considerably less than the spread of the sampled population. ANS: T BLM: Remember

PTS:

REF: 279

TOP: 3–5

4. The most important contribution of the Central Limit Theorem is in statistical inference. ANS: T BLM: Remember

PTS:

REF: 280

TOP: 3–5

5. The sampling distribution of the sample mean is exactly normally distributed, regardless of the sample size n. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

6. According to the Central Limit Theorem, for large samples, the standard error of the sample mean is the population standard deviation divided by the square root of the sample size. ANS: T BLM: Remember

PTS:

REF: 279

TOP: 3–5

7. The Central Limit Theorem describes the distribution of the sample mean except for populations that are normal. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

8. The Central Limit Theorem does not apply to the sample means of large samples drawn from a discrete distribution. ANS: F BLM: Remember

PTS:

REF: 277-279

TOP: 3–5

9. The total area under a probability density function curve is equal to 1.0. ANS: T BLM: Remember

PTS:

REF: 233

TOP: 3–5

10. The standard deviation of a statistic that is used to estimate an unknown parameter is called the standard error of the statistic. ANS: T BLM: Remember

PTS:

REF: 281

TOP: 3–5

11. As the sample size increases, the standard error of the sample mean decreases. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 281

TOP: 3–5

12. When all possible simple random samples of size n are drawn from a population that is normally distributed, the sampling distribution of the sample means will be normal, regardless of sample size, n. ANS: T BLM: Remember

PTS:

REF: 279-280

TOP: 3–5

13. A summary measure calculated for a population is called a parameter and is designated by Greek letters (such as μ for mean or π for proportion). ANS: T BLM: Remember

PTS:

REF: 268

TOP: 3–5

14. A sampling distribution is defined as a sample chosen in such a way that every possible subset of like size has an equal chance of being selected. ANS: F BLM: Remember

PTS:

REF: 275-277

TOP: 3–5

15. A sampling distribution is a probability distribution that shows the likelihood of occurrence associated with all the possible values of a parameter whose values would be obtained when drawing all possible samples of a given size from a population. ANS: F BLM: Remember

PTS:

REF: 275-277

TOP: 3–5

16. The sample standard deviation measures the variability of all possible sample mean, x̄, values that might be obtained. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

17. If x̄ is the mean of a simple random sample taken from a large population, and if the N population values are normally distributed, the sampling distribution of x̄ is also normally distributed, regardless of sample size, n.

ANS: T BLM: Remember

PTS:

REF: 279-280

TOP: 3–5

18. According to the Central Limit Theorem, any sampling distribution of x̄ is considered normal, provided n ≥ 30. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

19. According to the Central Limit Theorem, any sampling distribution of the sample proportion will be normal provided np < 5, while nq > 5. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

20. A sampling distribution is the distribution of the values that are included in a sample selected randomly from the population. ANS: F BLM: Remember

PTS:

REF: 275

TOP: 3–5

21. In practical business situations, it is very unlikely that a decision maker will actually construct a sampling distribution of any kind. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 275

TOP: 3–5

22. The mean for a sample selected randomly from a population is most likely to be lower or higher than the known value of the population mean. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 275-277

TOP: 3–5

23. The sampling distribution of the sample mean, x̄, is the distribution of all possible sample means that could be computed from all possible samples of a given size, n. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 275

TOP: 3–5

24. The mean of the sampling distribution of sample mean, x̄, is equal to the mean of the population from which the samples are selected to construct the sampling distribution. ANS: T BLM: Remember

PTS:

REF: 279

TOP: 3–5

25. If a population standard deviation is equal to 24.8, then the sampling distribution of the sample mean, x̄, will have a standard deviation that is less than 24.8 for all possible sample sizes.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

26. The population of pop cans filled by a particular machine is known to be normally distributed with a mean of 12 ounces and a standard deviation of 0.16 ounces. Given this information, the sampling distribution of the sample mean, x̄, for a random sample of 16 cans will also be normally distributed with a standard deviation equal to 0.04. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

27. The population of pop cans filled by a particular machine is known to be normally distributed with a mean of 340 mL and a standard deviation of 4.5 mL. Given this information, the sampling distribution of the sample mean, x̄, for a random sample of 16 cans will also be normally distributed with a mean equal to ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

28. If a population standard deviation is equal to 50, then the sampling distribution of the sample mean, x̄, will have a standard deviation that is less than 50 for all samples of size n > 2. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 279

TOP: 3–5

29. The Central Limit Theorem is used to describe the sampling distributions of statistics, such as x̄ and p̂, when the population is known to be normally distributed.

ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

30. The population of incomes in a community college in Ontario is thought to be highly skewed to the right with a mean equal to $38,765 and a standard deviation equal to $2,640. Based on this information, if a sample of size 36 is selected at random, then the highest sample mean that we would expect to see would be approximately $40,085. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 279 | 68-69

TOP: 3–5

31. In order to use the Central Limit Theorem to describe the sampling distribution of the sample mean, x̄, the sample size, n, must be 30 or more. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

32. A population with a large standard deviation will have a sampling distribution that is more spread out for a given sample size than a similar population with a small standard deviation. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 279

TOP: 3–5

33. The Central Limit Theorem states that the sample mean, x̄, is always equal to the population mean, μ. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

34. The Central Limit Theorem states that the sampling distribution of the sample mean, x̄, is approximately normal for large sample sizes (n ≥ 30). ANS: T BLM: Remember

PTS:

REF: 279

TOP: 3–5

35. The Central Limit Theorem states that the sample mean, x̄, is equal to the population mean, μ, provided that n ≥ 30. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

36. The Central Limit Theorem states that the sampling distribution of the population mean, μ, is approximately normal, provided that n ≥ 30. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

37. According to the Central Limit Theorem, if x̄ is the mean of a simple random sample taken from a large population, and if the N population values are not normally distributed, the sampling distribution of x̄ nevertheless approaches a normal distribution as sample size, n, increases. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 279

TOP: 3–5

38. According to the Central Limit Theorem, if x̄ is the mean of a simple random sample taken from a large population, and if the N population values are not normally distributed, the sampling distribution of x̄ nevertheless approaches a normal distribution when n ≥ 30 and also n < 0.05N, because these values make the approximation almost perfect. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 279-280

TOP: 3–5

39. According to the Central Limit Theorem, if x̄ is the mean of a simple random sample taken from a large population, and if the N population values are normally distributed, the sampling distribution of x̄ is also normally distributed, regardless of sample size, n. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 3–5

40. As the sample size increases, the standard error of the sample proportion decreases. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 288-289

TOP: 6

41. The mean of the sampling distribution of the sample proportion, p̂, when n = 100 and p = 0.5 is 5.0. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 289

TOP: 6

42. The standard error of the sampling distribution of the sample proportion, p̂, when n = 100 and p = 0.15 is 0.001275. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 289

TOP: 6

43. Recall the rule of thumb used to indicate when the normal distribution is a good approximation of the sampling distribution for the sample proportion, p̂. For the combination n = 25, p = 0.05, the rule is satisfied. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 289

TOP: 6

44. As a general rule, the normal distribution is used to approximate the sampling distribution of the sample proportion only if the sample size, n, is greater than or equal to 30. ANS: F BLM: Remember

PTS:

REF: 289

TOP: 6

45. The expression SE(p̂) = √(pq/n) represents the standard error of the sampling distribution of the sample proportion, p̂.

ANS: T BLM: Remember

PTS:

REF: 288

TOP: 6

46. The expression SE(p̂) = √(pq/n) represents the standard deviation of the sampling distribution of the sample proportion, p̂.

ANS: T BLM: Remember

PTS:

REF: 288-289

TOP: 6

47. Standard error of the sample proportion is another term for the variance of the sampling distribution of the sample proportion, p̂.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 288-289

TOP: 6

48. The Central Limit Theorem applies to the sampling distribution of sample proportion, but not to the sampling distribution of sample mean, x̄. ANS: F BLM: Remember

PTS:

REF: 279

TOP: 6

49. Suppose that 56% of all registered voters in Ontario are supporting a certain candidate for prime minister. A sample of 1500 voters is selected at random from Ontario. Based on the concept of sampling distribution of sample proportion, p̂, it can be assumed that p̂ = 0.56. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 288-289

TOP: 6

50. The standard error of the sampling distribution of sample proportion, SE(p̂), depends on the value of the population proportion, p, and the closer the value of p to 0.50, the larger SE(p̂) will be for a given sample size n. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 293

TOP: 6

51. The sampling distribution of the sample proportion, p̂, can be approximated by a normal distribution as long as the population proportion, p, is very close to 0.50. ANS: F BLM: Remember

PTS:

REF: 293

TOP: 6

52. On a particular highway in Nova Scotia, it is reported that the proportion of cars that exceed the speed limit is 0.15. Given this information, the probability that a sample of 250 cars will have a sample proportion below 0.12 is approximately 0.0918. ANS: T TOP: 6

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Analyze

53. The mean of all possible sample proportions, p̂, and the expected value of the sampling distribution of p̂ are the same. ANS: T BLM: Higher Order - Understand

PTS:

REF: 288-289

TOP: 6

54. The standard deviation of the sampling distribution of the proportion p̂ is denoted by SE(p̂).

ANS: T BLM: Remember

PTS:

REF: 289

TOP: 6

55. The Central Limit Theorem indicates that the mean of the sampling distribution of the sample proportion, p̂, will be equal to the population proportion, p.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 279

TOP: 6

56. The Central Limit Theorem indicates that the sampling distribution of the sample proportion can be approximated by a normal distribution if np > 5 and nq > 5. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 279 | 289

TOP: 6

57. The Central Limit Theorem can be applied to the sampling distribution of the sample proportion, p̂, regardless of the sample size, n, and the population proportion, p.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 279 | 289

TOP: 6

58. The cause of a change in a process variable being monitored is regarded as random variation if it can be found and corrected. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

59. Small haphazard changes in a process variable due to alteration in the production environment that is not controllable are said to be assignable causes. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

60. If the variation in a process variable is solely random, the process is said to be in control. ANS: T BLM: Remember

PTS:

REF: 293-294

TOP: 7

61. The first objective in statistical process control is to eliminate assignable causes of variation in the process variable and then get the process in control. The next step is to reduce variation and get the measurements on the process variable within specification limits, the limits within which the measurements on usable items or services must fall.

ANS: T BLM: Remember

PTS:

REF: 293-294

TOP: 7

62. Once a process is in control and is producing a satisfactory product, the process variables are monitored by use of control charts. ANS: T BLM: Remember

PTS:

REF: 293-294

TOP: 7

63. Statistical process control (SPC) methodology was developed to monitor, control, and improve products and services. ANS: T BLM: Remember

PTS:

REF: 293-294

TOP: 7

64. If a process is in control, we expect all the data values to fall within one standard deviation of the mean. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 294

TOP: 7

65. If a process is in control, we expect all the data values to fall within two standard deviations of the mean. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 294

TOP: 7

66. If a process is in control, we expect all the data values to fall within three standard deviations of the mean. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 294

TOP: 7

67. Random variation in a process can be eliminated. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

68. Random cause variation refers to variation in the output of a process that is unexpected and has an assignable cause. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

69. Assignable cause variation refers to variation in the output of a process that is naturally occurring and expected, and that may be the result of random causes. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

70. Process control charts, such as the x̄ and p charts, are used to provide signals to indicate when the output of a process is out of control. ANS: T BLM: Remember

PTS:

REF: 294

TOP: 7

71. An in-control process is typically defined as a process in which all output is operating within 3 standard deviations of the centreline of the process. ANS: T PTS: BLM: Higher Order

REF: 294

TOP: 7
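As an illustration of the three-standard-deviation rule in questions 66 and 71, the sketch below flags measurements that fall outside the centreline ± 3 standard deviations. The measurements, centreline, and standard deviation are made up for illustration, and the function name is hypothetical.

```python
def out_of_control_points(values, centre, sd):
    """Return the values lying outside centre ± 3 * sd."""
    lower = centre - 3 * sd
    upper = centre + 3 * sd
    return [v for v in values if v < lower or v > upper]

# Hypothetical process measurements with centreline 10.0 and sd 0.2.
measurements = [10.1, 9.8, 10.3, 9.9, 10.9, 10.0]
flagged = out_of_control_points(measurements, centre=10.0, sd=0.2)
print(flagged)
```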

72. In most processes, the process control limits are set to correspond with the specification limits on the process. ANS: F BLM: Remember

PTS:

REF: 293-294

TOP: 7

73. The c chart is another commonly used process control chart that is used to monitor the number of defects per item in that process. ANS: T BLM: Remember

PTS:

REF: 298

TOP: 7

74. If a sample of n elements is selected from a population of N elements using a sampling plan in which each of the possible samples has the same chance of being selected, then the sampling is said to be random and the resulting sample is a simple random sample. ANS: T BLM: Remember

PTS:

REF: 268-269

TOP: 1–2

75. There are four ways of selecting a distinct, unordered sample of size n = 2 without replacement from a population of size N = 4. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 268-269

TOP: 1–2
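The count in question 75 can be verified by listing every distinct, unordered sample. This is a small sketch; the labels A to D are hypothetical stand-ins for the N = 4 population elements.

```python
from itertools import combinations

population = ["A", "B", "C", "D"]  # hypothetical labels for N = 4 elements
# combinations() yields unordered selections without replacement.
samples = list(combinations(population, 2))
print(len(samples))
for s in samples:
    print(s)
```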

76. A cluster sample is a simple random sample of clusters from the available clusters in the population. ANS: T BLM: Remember

PTS:

REF: 271

TOP: 1–2

77. Non-random samples can be described and can also be used for making inferences. ANS: F BLM: Remember

PTS:

REF: 272

TOP: 1–2

78. A newspaper is trying to decide which comics to carry on a daily basis, which comics to carry only on Sunday, and which comics to discontinue. The editor invites the readers to “vote” for their preferences by clipping a “ballot” from the editorial page and mailing it in. This is an example of a judgment sample. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 272

TOP: 1–2

79. Stratified random sampling involves selecting a simple random sample from each of a given number of subpopulations, called strata. ANS: T BLM: Remember

PTS:

REF: 270-271

TOP: 1–2

80. The publisher of a newspaper decides which articles will be submitted for the consideration of the Pulitzer Prize committee. This is an example of cluster sampling. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 271

TOP: 1–2

81. Numerical descriptive measures calculated from a sample are called statistics. ANS: T BLM: Remember

PTS:

REF: 268

TOP: 1–2

82. The latest census shows the population aged 25 to 40 to be 48% male and 52% female. A market research firm interested in the buying habits of young adults aged 25 to 40 instructs its interviewers to sample 300 persons, 144 of whom are to be men and 156 of whom are to be women. This is an example of a quota sample. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 272

TOP: 1–2

83. A convenience sample is a sample that can be easily and simply obtained with random selection. ANS: F BLM: Remember

PTS:

REF: 271-272

TOP: 1–2

84. Judgment sampling allows the sampler to decide who will or will not be included in the sample. ANS: T BLM: Remember

PTS:

REF: 272

TOP: 1–2

85. A 1-in-k systematic random sample involves the random selection of one of the first k elements in an ordered population, and then the systematic selection of every kth element thereafter.

ANS: T BLM: Remember

PTS:

REF: 271

TOP: 1–2
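The 1-in-k procedure described in question 85 can be sketched as follows. The roster of 100 numbered units, the seed, and the function name are assumptions made for illustration only.

```python
import random

def systematic_sample(population, k, rng):
    """1-in-k systematic sample: random start among the first k, then every kth."""
    start = rng.randrange(k)  # index 0..k-1, i.e. one of the first k elements
    return population[start::k]

roster = list(range(1, 101))  # hypothetical ordered population of 100 units
sample = systematic_sample(roster, k=10, rng=random.Random(1))
print(sample)
```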

86. Quota sampling, in which the makeup of the sample must reflect the makeup of the population on some preselected characteristics, often has a non-random component in the selection process. ANS: T BLM: Remember

PTS:

REF: 272

TOP: 1–2

87. A group of people who, in response to some general appeal, have selected themselves to participate in a survey is called a simple random sample. ANS: F BLM: Remember

PTS:

REF: 268-269

TOP: 1–2

88. The simple random sample is a subset of a population, chosen in such a fashion that every possible subset of like size has an equal chance of being selected. ANS: T BLM: Remember

PTS:

REF: 268-269

TOP: 1–2

89. The stratified random sample is a subset of a population, chosen by randomly selecting one of the first k elements and then including every kth element thereafter until the desired sample size has been reached. ANS: F BLM: Remember

PTS:

REF: 270-271

TOP: 1–2

90. If a sample is selected at random, the main reason that the sample mean, x̄, might be different from the corresponding population mean, μ, is that the sample might be biased. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 279

TOP: 1–2

91. The shape of the normal distribution is determined by its mean, μ, and its standard deviation, σ. ANS: T BLM: Remember

PTS:

REF: 237-238

TOP: 1–2

92. All sampling plans involve random selection of the sample from the corresponding population. ANS: F BLM: Remember

PTS:

REF: 268 | 272

TOP: 1–2

93. A stratified random sample is useful when the population to be sampled is known to contain two or more mutually exclusive and clearly distinguishable subgroups or strata that differ greatly from one another with respect to some characteristic of interest. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 270-271

TOP: 1–2

94. A stratified random sample is chosen by taking separate (simple or systematic) random samples from every stratum, often in such a way that the sizes of the separate samples vary with the importance of the different strata. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 270-271

TOP: 1–2
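Stratified random sampling, as described in questions 93 and 94, can be sketched as a separate simple random sample from each stratum. The strata, the per-stratum sample sizes, and the function name below are hypothetical.

```python
import random

def stratified_sample(strata, sizes, rng):
    """Draw a simple random sample of the requested size from every stratum."""
    return {name: rng.sample(units, sizes[name]) for name, units in strata.items()}

# Hypothetical strata: unit IDs grouped into three age groups.
strata = {
    "under 30": list(range(0, 50)),
    "30 to 59": list(range(50, 130)),
    "60 and over": list(range(130, 200)),
}
sizes = {"under 30": 5, "30 to 59": 8, "60 and over": 7}
sample = stratified_sample(strata, sizes, random.Random(7))
print(sample)
```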

95. A systematic random sample has the advantage of being usable for a population that may be ordered in some way but whose size we may not know at any given time. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 271

TOP: 1–2

96. A systematic random sample is a subset of a population chosen by taking separate censuses in a randomly chosen subset of distinct clusters into which the population is naturally divided. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 271

TOP: 1–2

97. A systematic random sample is a subset of a population, chosen by randomly selecting one of the first k elements and then including every kth element thereafter until the desired sample size has been reached. ANS: T BLM: Remember

PTS:

REF: 271

TOP: 1–2

98. A systematic random sample might be used by a quality inspector who wants to determine the percentage of defective units produced on an assembly line that never shuts down. ANS: T PTS: BLM: Higher Order

REF: 271

TOP: 1–2

PROBLEM

1. Identify the sampling design for each of the following:
a. The Academic Dean of a university decides which research proposals submitted to her by each department will be submitted for federal funding consideration.
b. Every student in an introductory genetics class is assigned a number. The professor then randomly selects 10 numbers and surveys the corresponding students.
c. Class rosters list student names in alphabetical order. A professor wants to select some students to participate in a research project. The professor starts with the fifth name and then selects every tenth student thereafter.
d. An apartment complex manager randomly selects 10 buildings from the complex of 30 buildings, and then interviews one household from every apartment in the 10 buildings.
e. The student body of a large law school comprised 85% in-province residents and 15% out-of-province residents. An evaluation team wishes to interview 200 alumni from the school. They are instructed to contact 170 in-province alumni and 30 out-of-province alumni.
f. A doctor wanted to find out if watching one’s diet for omega-3 fatty acid content was dependent on the age of the patient. He divided his patients into 6 age groups and took a random sample from each group. The question asked of each patient was whether he/she paid attention to the amount of omega-3 fatty acids in their diets.

ANS:
a. judgment sampling
b. simple random sampling
c. 1-in-k systematic sampling
d. cluster sampling
e. quota sampling
f. stratified random sampling
PTS: 1 REF: 272 BLM: Higher Order - Analyze

TOP: 1–2

2. A questionnaire was mailed to 1000 registered municipal voters selected at random. Only 600 questionnaires were returned, and of the 600 returned, 450 respondents were strongly opposed to a surcharge proposed to support the city Parks and Recreation department. Are you willing to accept the 75% figure as a valid estimate of the percentage in the city who are opposed to the surcharge? Why or why not? ANS: The questionnaires that were returned do not constitute a representative sample from the 1000 questionnaires that were randomly sent out. It may be that the voters who chose to return the questionnaire were particularly adamant about the Parks and Recreation surcharge, while the others had no strong feelings one way or the other. The nonresponse of 40% of the voters in the sample will undoubtedly bias the resulting statistics. PTS: 1 REF: 269-270 BLM: Higher Order - Evaluate

TOP: 1–2

MRI Accuracy Narrative A group of medical researchers reported on the accuracy of using magnetic resonance imaging (MRI) to evaluate ligament sprains and tears on 50 patients. Consecutive patients with acute or chronic knee pain were selected from the clinical practice of one of the researchers and agreed to participate in the study.

3. Refer to MRI Accuracy Narrative. Describe the sampling plan used to select study participants. ANS: The sample consisted of consecutive patients, from the clinical practice of one of the researchers, who were willing to participate. This sample is not randomly selected; it is a convenience sample. PTS: 1 REF: 271-272 BLM: Higher Order - Analyze

TOP: 1–2

4. Refer to MRI Accuracy Narrative. Can valid inferences be made using the results of this study? Why or why not? ANS: Valid inferences can be made from this study only if the convenience sample chosen by the researcher behaves like a random sample. That is, the patients in this particular clinical practice must be representative of the population of patients as a whole. PTS: 1 REF: 271-272 BLM: Higher Order - Evaluate

TOP: 1–2

5. Refer to MRI Accuracy Narrative. What chance mechanism could be introduced in order to select a representative sample of 50 individuals with knee pain? ANS: In order to increase the chances of obtaining a sample that is representative of the population of patients as a whole, the researcher might try to obtain a larger base of patients to choose from. Perhaps there is a computerized database from which he or she might select a random sample. PTS: 1 REF: 271-272 BLM: Higher Order - Analyze

TOP: 1–2

Blood Thinner Study Narrative A study of an experimental blood thinner was conducted to determine whether it works better than the simple aspirin tablet in warding off heart attacks and strokes. The study involved 20,000 people who had suffered heart attacks, strokes, or pain from clogged arteries. Each person was randomly assigned to take either aspirin or the experimental drug for one to three years. Assume that each person was equally likely to be assigned one of the two medications. 6. Refer to Blood Thinner Study Narrative. Devise a randomization plan to assign the medications to the patients. ANS:

Since each patient must be randomly assigned to either aspirin or the experimental drug with equal probability, assign the digits 0–4 to the aspirin treatment, and the digits 5–9 to the experimental drug treatment. As each patient enters the study, choose a random digit using the table of random numbers in your book and assign the appropriate treatment. PTS: 1 REF: 270-272 BLM: Higher Order - Analyze

TOP: 1–2

7. Refer to Blood Thinner Study Narrative. Will there be an equal number of patients in each treatment group? Explain. ANS: The randomization scheme in the previous question does not guarantee an equal number of patients in each group. PTS: 1 REF: 270-272 BLM: Higher Order - Analyze

TOP: 1–2
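The randomization plan from questions 6 and 7 can be simulated to see why equal group sizes are not guaranteed. This is a sketch only: it substitutes a seeded pseudo-random generator for the table of random numbers, and the seed and function name are arbitrary assumptions.

```python
import random
from collections import Counter

def assign_treatments(n_patients, rng):
    """Assign each patient by one random digit: 0-4 aspirin, 5-9 experimental."""
    assignments = []
    for _ in range(n_patients):
        digit = rng.randrange(10)  # stands in for a digit from a random number table
        assignments.append("aspirin" if digit <= 4 else "experimental")
    return assignments

assignments = assign_treatments(20000, random.Random(2024))
counts = Counter(assignments)
print(counts)
```

Each patient is assigned independently, so the two counts will usually be close to 10,000 without being exactly equal.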

Battery Lifetime Narrative The lifetime of a particular type of battery is normally distributed with a mean of 1100 days and a standard deviation of 80 days. The manufacturer randomly selects 400 batteries of this type and ships them to a tire retailer.

8. Refer to Battery Lifetime Narrative. What are the mean and standard deviation of the sampling distribution of x̄? ANS: The mean is μ = 1100 days, and the standard deviation is σ/√n = 80/√400 = 4 days.

PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

9. Refer to Battery Lifetime Narrative. What is the probability the average lifetime of these 400 batteries is between 1097 and 1104 days? ANS: P(1097 ≤ x̄ ≤ 1104) = P(–0.75 ≤ z ≤ 1) = 0.2734 + 0.3413 = 0.6147

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5
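The Battery Lifetime calculations in questions 8 and 9 can be reproduced without a normal table. The sketch below assumes Python 3.8 or later for `statistics.NormalDist`; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 1100, 80, 400   # values from the Battery Lifetime Narrative
se = sigma / sqrt(n)           # standard deviation of the sample mean
xbar_dist = NormalDist(mu, se)

# P(1097 <= x-bar <= 1104) as a difference of cumulative probabilities.
prob_between = xbar_dist.cdf(1104) - xbar_dist.cdf(1097)
print(se, round(prob_between, 4))
```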

Baggage Weights Narrative The weight of baggage checked by airline passengers is a random variable with a mean of 25 kg and a standard deviation of 12 kg. The total baggage limit for 100 randomly selected passengers is 2740 kg.

10. Refer to Baggage Weights Narrative. What are the mean and standard deviation of the sampling distribution of Σxᵢ, the sum of the sample measurements?

ANS: The mean is nμ = 100(25) = 2500 kg, and the standard deviation is σ√n = 12√100 = 120 kg.

PTS: 1 REF: 279-280 BLM: Higher Order - Apply

TOP: 3–5

11. Refer to Baggage Weights Narrative. What is the approximate probability the baggage limit will be exceeded? ANS: This is an application of the CLT for the sum of random variables, P(Σxᵢ > 2,740) ≈ P(z > 2) = 0.5 – 0.4772 = 0.0228.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

12. A population consists of four numbers, 0, 1, 2, and 4. Suppose we randomly select two of those four numbers without replacement and compute the sample mean, x̄, of the two numbers selected. Find the sampling distribution of x̄. ANS:

x̄      0.5   1.0   1.5   2.0   2.5   3.0
p(x̄)   2/12  2/12  2/12  2/12  2/12  2/12

PTS: 1 REF: 275-277 BLM: Higher Order - Apply

TOP: 3–5
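The sampling distribution in question 12 can be obtained by listing every possible sample. The sketch below enumerates unordered samples, which gives the same probabilities as counting the 12 ordered selections; exact fractions are used to avoid rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

population = [0, 1, 2, 4]
samples = list(combinations(population, 2))  # 6 equally likely unordered samples

# Tally the sample mean of each sample, then convert counts to probabilities.
counts = Counter(Fraction(sum(s), 2) for s in samples)
dist = {mean: Fraction(c, len(samples)) for mean, c in counts.items()}

for mean in sorted(dist):
    print(float(mean), dist[mean])
```

Each probability prints as 1/6, which is the reduced form of 2/12.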

Examination Times Narrative The time necessary to complete an examination has a mean of 45 minutes and a standard deviation of 4 minutes. The average time necessary for 64 randomly selected students is computed.

13. Refer to Examination Times Narrative. What is the mean of the sampling distribution of x̄?

ANS: The mean: μ = 45 minutes.

PTS:

REF: 279

TOP: 3–5

BLM: Remember

14. Refer to Examination Times Narrative. What is the standard deviation of the sampling distribution of x̄? ANS: The standard deviation is σ/√n = 4/√64 = 0.5 minutes. PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

15. Refer to Examination Times Narrative. Can we say that the sampling distribution of x̄ is approximately normally distributed? Why?

ANS: Yes. Because n = 64 exceeds 30, our rule-of-thumb minimum for large samples, the CLT can be applied to conclude that the sampling distribution of x̄ is approximately normally distributed. PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

16. Refer to Examination Times Narrative. Completely describe the sampling distribution of the sample mean. ANS: The sampling distribution of the sample mean is approximately normally distributed with mean 52 minutes and standard deviation 0.80 minutes.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

17. Refer to Examination Times Narrative. What is the probability this sample produces an average of less than 50 minutes? ANS: P(x̄ < 50) = P(z < –2.5) = 0.50 – 0.4938 = 0.0062. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

18. Refer to Examination Times Narrative. If the sample mean, x̄, is actually 50 minutes, what can be said about the claim that μ = 52?

ANS: 50 minutes is more than 2 standard deviations from the mean, which casts doubt on the claim. PTS: 1 REF: 331-332 | 68-70 BLM: Higher Order - Analyze

TOP: 3–5

19. A fair die is rolled. If the top face is a 5 or 6, you win $1; otherwise you win nothing. Let x be the amount you win in one play of the game and let x̄ be the average amount you win in two plays of the game. Find the sampling distribution of x̄. ANS:

x̄      $1.00  $0.50  $0.00
P(x̄)   1/9    4/9    4/9

PTS: 1 REF: 275-277 BLM: Higher Order - Apply

TOP: 3–5
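The distribution in question 19 can be checked by enumerating all 36 equally likely outcomes of two rolls. This is a small sketch; the helper name `winnings` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def winnings(face):
    """Amount won on one play: $1 for a 5 or 6, otherwise nothing."""
    return 1 if face >= 5 else 0

dist = Counter()
for a, b in product(range(1, 7), repeat=2):
    mean = Fraction(winnings(a) + winnings(b), 2)  # average over two plays
    dist[mean] += Fraction(1, 36)                  # each outcome has probability 1/36

for mean in sorted(dist, reverse=True):
    print(float(mean), dist[mean])
```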

20. The distribution of scores for the 1000 final exams in a statistics course has a mean of 74 and a standard deviation of 15. A random sample of 36 exam papers is selected. What is the probability that the sample mean is higher than 77? ANS: P(x̄ > 77) = P(z > 1.2) = 0.5 – 0.3849 = 0.1151 PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Average Newborn Weights Narrative It is known that the birth weight of newborn babies in Canada has a mean of 3.20 kg with a standard deviation of 0.80 kg. Suppose we randomly sample 64 birth certificates and record the birth weights of these babies.

21. Refer to Average Newborn Weights Narrative. Find the mean and standard deviation of the sampling distribution of x̄. ANS: The mean is μ = 3.20 kg, and the standard deviation is σ/√n = 0.80/√64 = 0.10 kg.

PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

22. Refer to Average Newborn Weights Narrative. What is the probability the sample mean birth weight as recorded on the birth certificates will be less than 3.125 kg? ANS: P(x̄ < 3.125) = P(z < –0.75) = 0.5 – 0.2734 = 0.2266 PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

IQ of Elementary School Children Narrative In a particular large school system, the average IQ of elementary school children is 105 and the standard deviation is 15. A sample of 81 children is randomly selected from elementary schools within the system.

23. Refer to IQ of Elementary School Children Narrative. Find the mean and standard deviation of the sampling distribution of x̄. ANS: The mean is μ = 105, and the standard deviation is σ/√n = 15/√81 = 1.667.

PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

24. Refer to IQ of Elementary School Children Narrative. What is the probability that the average IQ of this sample is below 108? ANS: P(x̄ < 108) = P(z < 1.8) = 0.50 + 0.4641 = 0.9641 PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Department Store Audits Narrative An auditor is going to sample 64 charge accounts from the many active accounts at a large department store. Suppose the average balance owed among all the store’s accounts is $75 with a standard deviation of $100.

25. Refer to Department Store Audits Narrative. Find the mean and standard deviation of the sampling distribution of x̄. ANS: The mean is μ = 75, and the standard deviation is σ/√n = 100/√64 = 12.5.

PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

26. Refer to Department Store Audits Narrative. What is the probability the sample mean of the balances owed among the 64 accounts sampled by the auditor is below $70? ANS: P(x̄ < 70) = P(z < –0.4) = 0.50 – 0.1554 = 0.3446. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Miller Analogies Test Narrative Graduate students applying for entrance to many universities must take a Miller Analogies Test. It is known that the test scores have a mean of 75 and a variance of 16. In 1990, 100 students applied for entrance into graduate school in physics.

27. Refer to Miller Analogies Test Narrative. Find the mean and standard deviation of the sampling distribution of x̄. ANS: The mean is μ = 75, and the standard deviation is σ/√n = 4/√100 = 0.40.

PTS: 1 REF: 279 BLM: Higher Order - Apply

TOP: 3–5

28. Refer to Miller Analogies Test Narrative. Find the probability that the average score of this group of students is higher than 76. ANS: P(x̄ > 76) = P(z > 2.5) = 0.50 – 0.4938 = 0.0062. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Lab Mice Experiments Narrative In a learning experiment, untrained mice are placed in a maze and the time required for each mouse to exit the maze is recorded. The average time for untrained mice to exit the maze is μ = 50 seconds and the standard deviation of their times is σ = 16 seconds. Suppose that a sample of 64 randomly selected untrained mice are placed in the maze and the time necessary to exit the maze is recorded for each.

29. Refer to Lab Mice Experiments Narrative. Find the mean and standard deviation of the sampling distribution of x̄. ANS: The mean is μ = 50, and the standard deviation is σ/√n = 16/√64 = 2.0.

PTS:

REF: 279

TOP: 3–5

BLM: Higher Order - Apply

30. Refer to Lab Mice Experiments Narrative. What is the probability that the sample mean differs from the population mean by more than 3? ANS: P(x̄ < 47 or x̄ > 53) = P(z < –1.5) + P(z > 1.5) = 2(0.5 – 0.4332) = 0.1336.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5
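The two-tailed probability in question 30 can be computed directly. The sketch below assumes the normal approximation justified by n = 64 and Python 3.8 or later; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 50, 16, 64   # values from the Lab Mice Experiments Narrative
se = sigma / sqrt(n)        # standard deviation of the sample mean
margin = 3                  # distance from the population mean, in seconds

z = margin / se
# By symmetry, P(|x-bar - mu| > margin) is twice the upper-tail area beyond z.
prob_outside = 2 * (1 - NormalDist().cdf(z))
print(z, round(prob_outside, 4))
```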

Food Expenditures Narrative The amount of money spent on food per week by a Canadian family is known to have a mean of $92 and a standard deviation of $9. Suppose a random sample of 81 families is taken and their sample mean food expenditure is calculated.

31. Refer to Food Expenditures Narrative. Completely describe the sampling distribution of the sample mean, x̄. ANS: The sampling distribution of the sample mean, x̄, is approximately normal with mean μ = 92 and standard deviation σ/√n = 9/√81 = 1. PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

32. Refer to Food Expenditures Narrative. Find the probability that the sample mean, x̄, does not exceed $90.40.

ANS: P(x̄ ≤ 90.4) = P(z ≤ –1.6) = 0.0548. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Fuel Consumption of Cars Narrative Suppose the average fuel consumption (in litres per 100 km) of a particular model car is 8.4 L/100 km with a standard deviation of 2.4 L/100 km. A random sample of 64 cars of this model is selected.

33. Refer to Fuel Consumption of Cars Narrative. Describe the sampling distribution of x̄.

ANS: The sampling distribution of x̄ is approximately normal with a mean of μ = 8.4 L/100 km and a standard deviation of σ/√n = 2.4/√64 = 0.3 L/100 km.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

34. Refer to Fuel Consumption of Cars Narrative. What is the probability that the sample mean exceeds 9.0 L/100 km? ANS: P(x̄ > 9.0) = P(z > 2) = 0.5 – 0.4772 = 0.0228. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Lifetime of Fluorescent Light Bulbs The mean lifetime of a fluorescent light bulb is 1570 hours with a standard deviation of 200 hours. Suppose we take a sample of 100 bulbs, test them, and calculate the sample mean.

35. Refer to Lifetime of Fluorescent Light Bulbs. Completely describe the sampling distribution of the sample mean. ANS: The sampling distribution of the sample mean is approximately normally distributed with mean μ = 1570 hours and standard deviation σ/√n = 200/√100 = 20 hours.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

36. Refer to Lifetime of Fluorescent Light Bulbs. What is the probability the sample mean exceeds 1560 hours? ANS: P(x̄ > 1560) = P(z > –0.5) = 0.50 + 0.1915 = 0.6915. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Childcare Amounts Paid Narrative The distribution of the amount spent for childcare in a city has a mean of $675 and a standard deviation of $80. A random sample of 64 families paying for childcare is selected.

37. Refer to Childcare Amounts Paid Narrative. Describe the sampling distribution of x̄. ANS: Since the sample size n = 64 is greater than 30, the sampling distribution of x̄ is approximately normal with a mean of μ = $675 and standard deviation of σ/√n = 80/√64 = $10.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

38. Refer to Childcare Amounts Paid Narrative. Find the probability that the sample mean is between $645 and $700. ANS: P(645 ≤ x̄ ≤ 700) = P(–3.00 ≤ z ≤ 2.50) = 0.4987 + 0.4938 = 0.9925.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

39. Refer to Childcare Amounts Paid Narrative. Find the probability that the sample mean is less than $700. ANS: P(x̄ < 700) = P(z < 2.5) = 0.5 + 0.4938 = 0.9938 PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

Heights of Adult Males Narrative The distribution of heights of adult males has a mean of 175.0 cm and a standard deviation of 10.0 cm. A random sample of 36 adult males is selected.

40. Refer to Heights of Adult Males Narrative. Completely describe the sampling distribution of x̄. ANS: Since the sample size n = 36 is greater than 30, the sampling distribution of x̄ is approximately normal with a mean of μ = 175.0 cm and standard deviation of σ/√n = 10.0/√36 = 1.667 cm. PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

41. Refer to Heights of Adult Males Narrative. Find the probability that the average height will be more than 177.5 cm. ANS: P(x̄ > 177.5) = P(z > 1.5) = 0.5 – 0.4332 = 0.0668. PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

42. Refer to Heights of Adult Males Narrative. Find the probability that the average height will be between 171.0 cm and 178.75 cm, inclusively. ANS: P(171.0 ≤ x̄ ≤ 178.75) = P(–2.40 ≤ z ≤ 2.25) = 0.4918 + 0.4878 = 0.9796.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

43. The distribution of the time that a battery pack for a laptop computer can function before requiring recharging is normal with a mean of 6 hours and a standard deviation of 1.8 hours. A random sample of 25 laptops with this type of battery pack is selected and tested. What is the probability that the mean time until recharging is necessary is at least 7 hours? ANS: P(x̄ ≥ 7) = P(z ≥ 2.78) = 0.5 – 0.4973 = 0.0027 PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Analyze

TOP: 3–5

44. Random samples of size n were selected from three populations with the means and variances given here: i. ii. iii. a. Find the mean and standard deviation of the sampling distribution of the sample mean in each of the three cases above. b. If the sampled populations are normal, are the sampling distributions of x̄ for each of the three cases also normal? Justify your answer. c. According to the Central Limit Theorem, if the sampled populations are NOT normal, what can be said about the sampling distribution of x̄ for each of the three cases? ANS: a.

ii. iii. b. If the sampled populations are normal, the distribution of x̄ is also normal for all values of n. c. The Central Limit Theorem states that for sample sizes as small as n = 25, the sampling distribution of x̄ will be approximately normal. Hence, we can be relatively certain that the sampling distribution of x̄ for cases (i) and (ii) will be approximately normal. However, the sample size in case (iii), n = 8, is too small to assume that the distribution of x̄ is approximately normal.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

Final Exam Scores Narrative Suppose a random sample of 25 students is selected from a community college where the scores on the final exam (out of 125 points) are normally distributed having mean equal to 112 and standard deviation equal to 12.

45. Refer to Final Exam Scores Narrative. Find the mean and the standard deviation of the sampling distribution of the sample mean x̄. ANS: The mean is μ = 112, and the standard deviation is σ/√n = 12/√25 = 2.4.

PTS: 1 REF: 279 BLM: Higher Order - Analyze

TOP: 3–5

46. Refer to Final Exam Scores Narrative. Find the probability that x̄ exceeds 116.

ANS: P(x̄ > 116) = P(z > 1.67) = 0.5 – 0.4525 = 0.0475.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

47. Refer to Final Exam Scores Narrative. Find the probability that the sample mean deviates from the population mean μ = 112 by no more than 4. ANS: P(108 ≤ x̄ ≤ 116) = P(–1.67 ≤ z ≤ 1.67) = 2(0.4525) = 0.9050.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order

TOP: 3–5

Salaries of Professors Narrative Suppose that college faculty with the rank of professor at four-year institutions earn an average of $72,500 per year with a standard deviation of $4,500. In an attempt to verify this salary level, a random sample of 60 professors was selected from a personnel database for all four-year institutions in Canada.

48. Refer to Salaries of Professors Narrative. Describe the sampling distribution of the sample mean x̄. ANS: Since the sample size is large, the sampling distribution of x̄ will be approximately normal with mean μ = 72500 and standard deviation σ/√n = 4500/√60 = 580.9475.

PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

49. Refer to Salaries of Professors Narrative. Within what limits would you expect the sample average to be, with probability 0.95? ANS: From the Empirical Rule and the general properties of the normal distribution, approximately 95% of the measurements will lie within two standard deviations of the mean; that is, 72,500 ± 2(580.9475), or $71,338.105 to $73,661.895.

PTS: 1 REF: 279 | 69-70 BLM: Higher Order - Analyze

TOP: 3–5
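The limits in question 49 follow from the standard error of the sample mean. The sketch below computes the two-standard-deviation limits and, for comparison, limits based on the 97.5th percentile of the standard normal distribution (about 1.96); that comparison is an addition for illustration and not part of the answer key.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 72500, 4500, 60   # values from the Salaries of Professors Narrative
se = sigma / sqrt(n)             # standard deviation of the sample mean

# Empirical Rule limits: mean plus or minus two standard deviations.
lower_2se, upper_2se = mu - 2 * se, mu + 2 * se

# Limits using the exact normal multiplier for a central 95% area.
z975 = NormalDist().inv_cdf(0.975)
lower_exact, upper_exact = mu - z975 * se, mu + z975 * se

print(round(se, 4), round(lower_2se, 3), round(upper_2se, 3))
print(round(lower_exact, 3), round(upper_exact, 3))
```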

50. Refer to Salaries of Professors Narrative. Calculate the probability that the sample mean is greater than $75,000. ANS: P(x̄ > 75,000) = P(z > 4.30) ≈ 0.

PTS: 1 REF: 282-285 | 720-721 BLM: Higher Order - Apply

TOP: 3–5

51. Refer to Salaries of Professors Narrative. If your random sample actually produced a sample mean of $75,000, would you consider this unusual? What conclusion might you draw? ANS: Refer to the previous question. You have observed a very unlikely occurrence, assuming that μ = $72,500. Perhaps your sample was not a random sample, or perhaps the average salary of $72,500 is no longer correct. PTS: 1 REF: 77-78 BLM: Higher Order - Evaluate

TOP: 3–5

Deli Sales Narrative The total daily sales, x, in the deli section of a large chain of food stores is the sum of the sales generated by a fixed number of customers who make purchases on a given day. 52. Refer to Deli Sales Narrative. What kind of probability distribution do you expect the total daily sales to have? Explain. ANS:

Since the total daily sales is the sum of the sales made by a fixed number of customers on a given day, it is a sum of random variables, which, according to the Central Limit Theorem, will have an approximate normal distribution. PTS: 1 REF: 279-280 BLM: Higher Order - Analyze

TOP: 3–5

53. Refer to Deli Sales Narrative. For this particular market, the average sale per customer in the deli section is $10.50 with σ = $2.50. If 30 customers make deli purchases on a given day, give the mean and standard deviation of the probability distribution of the total sales, x. ANS: Let xᵢ be the total daily sales for a single customer, with i = 1, 2, …, 30. Then xᵢ has a probability distribution with μ = 10.50 and σ = 2.5. The total daily sales can now be written as x = Σxᵢ. If n = 30, the mean and standard deviation of the sampling distribution of x are given as nμ = 30(10.5) = 315 and σ√n = 2.5√30 = 13.693.

PTS: 1 REF: 280 BLM: Higher Order - Analyze

TOP: 3–5
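The mean and standard deviation of a sum of n independent measurements, nμ and σ√n, can be evaluated for the Deli Sales values as a quick check. This is a minimal sketch with illustrative variable names.

```python
from math import sqrt

mu, sigma, n = 10.50, 2.50, 30   # per-customer mean and sd, number of customers

total_mean = n * mu              # mean of the total daily sales
total_sd = sigma * sqrt(n)       # standard deviation of the total daily sales

print(total_mean, round(total_sd, 3))
```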

54. An accountant reviewed a firm’s billing for an entire year and computed an average bill of $125, with a standard deviation of $15. The firm’s comptroller claims that a sample of 50 bills would have saved a lot of work and achieved the same result. Describe the sampling distribution and comment. ANS: The sampling distribution of x̄ is approximately normal (since n ≥ 30) with mean of 125 and standard deviation of 15/√50 = 2.12. In this case, 68.26% of all possible sample means would have been between $122.88 and $127.12. Also, 95.44% of all sample means would have been between $120.76 and $129.24. The comptroller has a point. PTS: 1 REF: 279-280 | 69-70 BLM: Higher Order - Evaluate

TOP: 3–5
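As a rough check on the billing answer, the following Python sketch computes the standard error of the sample mean and the intervals one and two standard errors either side of the mean. The helper name `sample_mean_interval` is hypothetical.

```python
import math

def sample_mean_interval(mu, sigma, n, k):
    """Interval mu ± k standard errors for the mean of a sample of size n."""
    se = sigma / math.sqrt(n)   # standard error of the sample mean
    return mu - k * se, mu + k * se

one_se = sample_mean_interval(125, 15, 50, 1)
two_se = sample_mean_interval(125, 15, 50, 2)
print("within 1 SE:", tuple(round(v, 2) for v in one_se))
print("within 2 SE:", tuple(round(v, 2) for v in two_se))
```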

55. Completely describe the sampling distribution of the sample proportion for samples of size n = 500 from a population with p = 0.4. ANS: Since np = 200 and n(1 – p) = 300 are both greater than 5, the distribution of the sample proportion p̂ will be approximately normal with a mean p = 0.4 and a standard deviation (or standard error) √(p(1 – p)/n) = √(0.4(0.6)/500) = 0.0219. PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

Federal Budget Narrative Suppose public opinion is split 65% against and 35% for increasing taxes to help balance the federal budget. Suppose also that 500 people from the population are selected randomly and interviewed. 56. Refer to Federal Budget Narrative. Completely describe the sampling distribution of the sample proportion of people who are in favour of increasing taxes to help balance the federal budget. ANS: Since np = 175 and n(1 – p) = 325 are both greater than 5, the distribution of the sample proportion p̂ will be approximately normal with a mean p = 0.35 and a standard deviation √(p(1 – p)/n) = √(0.35(0.65)/500) = 0.02133.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

57. Refer to Federal Budget Narrative. What is the probability the proportion favouring a tax increase is more than 30%? ANS: P(p̂ > 0.30) = P(z > –2.34) = 0.50 + 0.4904 = 0.9904 PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6
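The Federal Budget calculations follow a pattern that recurs throughout this section: check np and n(1 – p), compute the standard error, then find a normal probability. A minimal Python sketch of that pattern is shown here. The function name `phat_distribution` is an assumption for illustration, and it uses `statistics.NormalDist` from the standard library instead of the printed normal table, so results differ slightly from answers based on a rounded z.

```python
import math
from statistics import NormalDist

def phat_distribution(n, p):
    """Normal approximation to the sampling distribution of p-hat.

    Raises ValueError if np or n(1 - p) is not greater than 5.
    """
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation not appropriate")
    return NormalDist(mu=p, sigma=math.sqrt(p * (1 - p) / n))

dist = phat_distribution(500, 0.35)
prob_above_30 = 1 - dist.cdf(0.30)   # P(p-hat > 0.30)
print(f"standard error = {dist.stdev:.5f}")
print(f"P(p-hat > 0.30) = {prob_above_30:.4f}")
```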

Unemployment Rate Narrative An eastern province has an unemployment rate of 6%. The province conducts monthly surveys in order to track the unemployment rate. In a recent month, a random sample of 700 people showed that 35 were unemployed. 58. Refer to Unemployment Rate Narrative. If the true unemployment rate is 6%, describe the sampling distribution of p̂. ANS: Since np = 42 and n(1 – p) = 658 are both greater than 5, the sampling distribution of p̂ is approximately normal with a mean p = 0.06 and a standard deviation √(0.06(0.94)/700) = 0.009.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

59. Refer to Unemployment Rate Narrative. Find the probability that the sample unemployment rate is at most 5%. ANS: P(p̂ ≤ 0.05) = P(z ≤ –1.11) = 0.50 – 0.3665 = 0.1335

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

60. Refer to Unemployment Rate Narrative. Assume the population proportion, p, is unknown. Describe the sampling distribution of p̂ based on the most recent sample. ANS: The sampling distribution of p̂ is approximately normal with a mean p̂ = 35/700 = 0.05 and a standard deviation √(0.05(0.95)/700) = 0.008.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

61. Refer to Unemployment Rate Narrative. Based on the most recent sample, find the probability that the sample proportion will lie within 0.005 of the true proportion p of people who are unemployed. ANS: The sampling distribution is approximately normal with a mean of 0.05 and a standard deviation of 0.008. Hence, P(–0.005 ≤ p̂ – p ≤ 0.005) = P(–0.63 ≤ z ≤ 0.63) = 2(0.2357) = 0.4714

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

Canadian Cancer Society Survey Narrative A recent nationwide survey by the Canadian Cancer Society found the percentage of women who smoke has increased to 30%. That seemed a little low for your province, so you sampled 500 women from your province and found that 180 of them smoke. 62. Refer to Canadian Cancer Society Survey Narrative. Find the sample proportion of women who smoke in your province. ANS: p̂ = 180/500 = 0.36 PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

63. Refer to Canadian Cancer Society Survey Narrative. What is the probability that at least 36% of women in your province are smokers if the true population proportion p = 0.30? ANS: P(p̂ > 0.36) = P(z > 2.93) = 0.50 – 0.4983 = 0.0017 PTS: 1 REF: 290-291 BLM: Higher Order - Apply

TOP: 6

64. Refer to Canadian Cancer Society Survey Narrative. Based on your answer to the previous question, what might you conclude about the Canadian Cancer Society’s claim that p = 0.30? ANS: There is evidence to cast doubt on the Canadian Cancer Society’s claim, at least for this province. PTS: 1 REF: 290-291 BLM: Higher Order - Evaluate

TOP: 6

Defective Items Narrative Suppose a sample of 120 items is drawn from a population of manufactured products and the number of defective items is recorded. Prior experience has shown that the proportion of defectives is 0.05. 65. Refer to Defective Items Narrative. Completely describe the sampling distribution of p̂, the proportion of defectives.

ANS: Since np = 6 and n(1 – p) = 114 are both greater than 5, the sampling distribution of p̂ is approximately normal with a mean p = 0.05 and a standard deviation √(0.05(0.95)/120) = 0.0199.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

66. Refer to Defective Items Narrative. How would the sampling distribution of the sample proportion change if the sample size were raised to 200? ANS: The sampling distribution will remain approximately normal with mean of 0.05, but the standard deviation (or standard error) of p̂ would decrease from 0.0199 to 0.0154.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

67. Refer to Defective Items Narrative. Is the normal approximation to the sampling distribution of p̂ appropriate in this situation? Explain.

ANS: For n = 400, p = 0.2, and np = 80 and nq = 320 are both greater than 5. Therefore, the normal approximation will be appropriate, with standard error given by SE = √(pq/n) = √(0.2(0.8)/400) = 0.02.

PTS: 1 REF: 289 BLM: Higher Order - Evaluate

TOP: 6

68. Refer to Defective Items Narrative. Use the results of the previous question to find the probability that p̂ is greater than 0.23.

ANS: P(p̂ > 0.23) = P(z > (0.23 – 0.20)/0.02) = P(z > 1.5) = 0.5 – 0.4332 = 0.0668

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

69. Refer to Defective Items Narrative. Describe the sampling distribution of the proportion of defectives for a simple random sample of n = 50. ANS: The sampling distribution of p̂ is approximately normal since np > 5 and nq > 5, with mean = p = 0.10, and standard deviation √(pq/n) = √(0.10(0.90)/50) = 0.0424. PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

70. Refer to Defective Items Narrative. What is the likelihood of encountering a sample proportion within of the population proportion? ANS:

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

71. Refer to Defective Items Narrative. If the production process is shut down whenever a sample proportion of defectives exceeds 0.05, what is the likelihood of this happening? ANS: P(p̂ > 0.05) = P(z > –1.18) = 0.381 + 0.500 = 0.881

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

72. Refer to Defective Items Narrative. Calculate the upper and lower control limits for a p chart. ANS: The upper and lower control limits for a p chart are UCL = 0.1022 and LCL = –0.0182 (or 0 for LCL since p cannot be negative). PTS: 1 REF: 296 BLM: Higher Order - Apply

TOP: 7

73. Refer to Defective Items Narrative. Explain how to construct a p chart for the process and how it can be used. ANS: The control chart is constructed by plotting three horizontal lines, one located at the upper control limit 0.1022, one at the centreline p̄, and one at the lower control limit 0. Values of p̂ are plotted and should remain within the control limits. If not, the process should be checked. PTS: 1 REF: 296 BLM: Higher Order - Analyze

TOP: 7

74. Refer to Defective Items Narrative. What is the probability that the sample proportion is less than 0.10? ANS: P(p̂ < 0.10) = P(z < 2.51) = 0.50 + 0.494 = 0.994. PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

Defective Engine Parts Narrative A machine that manufactures a part for a car engine was observed over a period of time before a random sample of 300 parts was selected from those produced by this machine. Of the 300 parts, 15 were defective. 75. Refer to Defective Engine Parts Narrative. Find the proportion of defective parts in the sample. ANS: p̂ = 15/300 = 0.05.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

76. Refer to Defective Engine Parts Narrative. Completely describe the sampling distribution of p̂, the proportion of defectives.

ANS: Since the true proportion, p, is unknown, we will use the sample proportion, p̂, to check the required conditions for normal approximation of the sampling distribution of p̂. Since np̂ = 15 and n(1 – p̂) = 285 are both greater than 5, the sampling distribution is approximately normal with a mean p̂ = 0.05 and a standard deviation √(0.05(0.95)/300) = 0.0126.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

77. Refer to Defective Engine Parts Narrative. What is the probability that the sample proportion will lie within 0.02 of the true population proportion of defective parts? ANS: P(–0.02 ≤ p̂ – p ≤ 0.02) = P(–1.59 < z < 1.59) = 2(0.4441) = 0.8882.

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

Weights of Candy Bars Narrative A candy bar company is interested in reducing the percentage of defective candy bars made, where a defective candy bar is one that has too few almonds by weight. The company randomly samples 100 candy bars a day for 5 days and finds the percentage of the defective bars to be 0.0200, 0.0125, 0.0225, 0.0100, and 0.0150, respectively. The company wants to construct a control chart for the proportion defective in samples of size n = 100. 78. Refer to Weights of Candy Bars Narrative. Estimate the process fraction defective p. ANS: The process fraction defective is estimated by the average of the 5 sample proportions: 0.08/5 = 0.016. PTS: 1 REF: 296 BLM: Higher Order - Analyze

TOP: 6

79. Refer to Weights of Candy Bars Narrative. Estimate the standard deviation of the sample proportions. ANS: √(p̄(1 – p̄)/n) = √(0.016(0.984)/100) = 0.0125. PTS: 1 REF: 296 BLM: Higher Order - Apply

TOP: 6

80. Refer to Weights of Candy Bars Narrative. What are the control limits? ANS: 0.016 ± 3(0.0125) = 0.016 ± 0.0375; UCL = 0.0535 and LCL = –0.0215 (use 0 since p cannot be negative). PTS: 1 REF: 296 BLM: Higher Order - Apply

TOP: 6
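The three candy-bar steps above (estimate p, estimate the standard error, set the limits three standard errors either side) can be sketched in Python as follows. The function name `p_chart_limits` is hypothetical. Because the sketch does not round the standard error to 0.0125 first, its upper limit differs from the printed 0.0535 in the fourth decimal place.

```python
import math

def p_chart_limits(sample_proportions, n):
    """Centreline and 3-sigma control limits for a p chart.

    sample_proportions: observed proportion defective in each sample
    n: common sample size
    """
    p_bar = sum(sample_proportions) / len(sample_proportions)
    se = math.sqrt(p_bar * (1 - p_bar) / n)
    lcl = max(0.0, p_bar - 3 * se)   # a proportion cannot be negative
    ucl = min(1.0, p_bar + 3 * se)   # nor exceed 1
    return p_bar, lcl, ucl

centre, lcl, ucl = p_chart_limits([0.0200, 0.0125, 0.0225, 0.0100, 0.0150], 100)
print(f"centreline = {centre:.4f}, LCL = {lcl:.4f}, UCL = {ucl:.4f}")
```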

81. Refer to Weights of Candy Bars Narrative. How are these control limits used? ANS: When the sample percentage defective exceeds 5.35%, the process needs to be adjusted. PTS: 1 REF: 296-298 BLM: Higher Order - Analyze

TOP: 6

College Football Division Narrative A student government representative at a local university claims that 60% of the undergraduate students favour a move to Division I in college football. A random sample of 250 undergraduate students is selected. 82. Refer to College Football Division Narrative. Completely describe the sampling distribution of the sample proportion. ANS: Since np = 150 and n(1 – p) = 100 are both greater than 5, the sampling distribution is approximately normal with a mean p = 0.60 and a standard deviation √(0.60(0.40)/250) = 0.03098.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

83. Refer to College Football Division Narrative. Find the probability that the sample proportion exceeds 0.65. ANS: P(p̂ > 0.65) = P(z > 1.61) = 0.5 – 0.4463 = 0.0537 BLM: Higher Order - Apply

PTS:

REF: 290-291 | 720-721

TOP: 6

NHL Stanley Cup Finals Narrative A TV pollster believed that 70% of Canadian TV households would be tuned in to Game 5 of the 2008 NHL Stanley Cup series between the Pittsburgh Penguins and the Detroit Red Wings. A random sample of 500 TV households is selected. 84. Refer to NHL Stanley Cup Finals Narrative. Completely describe the sampling distribution of the sample proportion. ANS: Since np = 350 and n(1 – p) = 150 are both greater than 5, the sampling distribution is approximately normal with a mean p = 0.70 and a standard deviation √(0.70(0.30)/500) = 0.02049.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

85. Refer to NHL Stanley Cup Finals Narrative. Find the probability that the sample proportion “on” will be between 0.65 and 0.75. ANS: P(0.65 ≤ p̂ ≤ 0.75) = P(–2.44 ≤ z ≤ 2.44) = 2P(0 ≤ z ≤ 2.44) = 2(0.4927) = 0.9854

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

Content Composition Analysis Narrative A well-known juice manufacturer claims that its citrus punch contains 15% real orange juice. A random sample of 150 cans of the citrus punch is selected and analyzed for content composition. 86. Refer to Content Composition Analysis Narrative. Completely describe the sampling distribution of the sample proportion. ANS: Since np = 22.5 and n(1 – p) = 127.5 are both greater than 5, the sampling distribution is approximately normal with a mean p = 0.15 and a standard deviation √(0.15(0.85)/150) = 0.0292.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

87. Refer to Content Composition Analysis Narrative. Find the probability that the sample proportion will be less than 0.10.

ANS: P(p̂ < 0.10) = P(z < –1.71) = 0.5 – 0.4564 = 0.0436

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

88. Refer to Content Composition Analysis Narrative. Would a value of p̂ = 0.25 be considered unusual? Justify your answer.

ANS: Yes, since the corresponding z-score = 3.42, indicating that the value of p̂ = 0.25 is 3.42 standard deviations above the mean. PTS: 1 REF: 289 | 69-70 BLM: Higher Order - Evaluate

TOP: 6

Rh-Positive Blood Type Narrative The proportion of individuals with an Rh-positive blood type is 88%. You have a random sample of n = 500 individuals. 89. Refer to Rh-Positive Blood Type Narrative. What are the mean and standard deviation of p̂, the sample proportion with Rh-positive blood type? ANS: Mean p = 0.88 and standard deviation √(0.88(0.12)/500) = 0.0145.

PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

90. Refer to Rh-Positive Blood Type Narrative. Is the distribution of p̂ approximately normal? Justify your answer.

ANS: Yes. Since np = 440 and n(1 – p) = 60 are both greater than 5, the sampling distribution of p̂ is approximately normal. PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

91. Refer to Rh-Positive Blood Type Narrative. What is the probability that the sample proportion p̂ exceeds 85%?

ANS: P(p̂ > 0.85) = P(z > –2.07) = 0.5 + 0.4808 = 0.9808.

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

92. Refer to Rh-Positive Blood Type Narrative. What is the probability that the sample proportion p̂ lies between 86% and 91%?

ANS: P(0.86 < p̂ < 0.91) = P(–1.38 < z < 2.07) = 0.4162 + 0.4808 = 0.897.

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

93. Refer to Rh-Positive Blood Type Narrative. Between which two limits would the sample proportion p̂ lie 99% of the time?

ANS: For a normal (or approximately normal) random variable, the interval μ ± 2.575σ will contain 99% of the measurements. For this binomial random variable x, the interval is p ± 2.575√(pq/n) = 0.88 ± 0.0373, or 0.8427 to 0.9173.

PTS: 1 REF: 289 | 69-70 BLM: Higher Order - Analyze

TOP: 6

Stress and Sweets Narrative According to one study, 50% of Canadians admit to overeating sweet foods when stressed. Suppose that the 50% figure is correct and that a random sample of n = 100 Canadians is selected. 94. Refer to Stress and Sweets Narrative. Does p̂, the sample proportion of Canadians who relieve stress by overeating sweet foods, have an approximately normal distribution? If so, what are its mean and standard deviation? ANS: For n = 100 and p = 0.5, np = 50 and nq = 50 are both greater than 5. Therefore, the normal approximation will be appropriate, with mean p = 0.5 and standard deviation √(pq/n) = √(0.5(0.5)/100) = 0.05. PTS: 1 REF: 289 BLM: Higher Order - Analyze

TOP: 6

95. Refer to Stress and Sweets Narrative. What is the probability that the sample proportion, p̂, exceeds 0.54? ANS: P(p̂ > 0.54) = P(z > 0.80) = 0.5 – 0.2881 = 0.2119

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

96. Refer to Stress and Sweets Narrative. What is the probability that p̂ lies within the interval 0.39 to 0.59?

ANS: P(0.39 < p̂ < 0.59) = P(–2.20 < z < 1.80) = 0.4861 + 0.4641 = 0.9502

PTS: 1 REF: 290-291 | 720-721 BLM: Higher Order - Apply

TOP: 6

97. Refer to Stress and Sweets Narrative. What might you conclude if the sample proportion were as small as 34%? ANS: The z-value associated with p̂ = 0.34 is z = (0.34 – 0.50)/0.05 = –3.2. This means that the value p̂ = 0.34 lies 3.2 standard deviations below the mean. This is an unlikely occurrence, assuming that p = 0.5, and would tend to contradict the reported figure. PTS: 1 REF: 77-78 | 69-70 BLM: Higher Order - Evaluate

TOP: 6

Weights of Chocolate Bars Narrative A candy bar factory is pouring chocolate into moulds to cool. The finished bars are sold as 35 gram bars. The company will lose money if the moulds are overfilled. If the moulds are underfilled, the weight of the candy bars will be less than the wrapper label says, and the company will be subject to a fine for misrepresenting the size of its product. The company wants to create an x̄ chart to monitor the weight of the candy bars. Suppose 5 samples are taken, each of size 30, having the sample means 36.1, 34.6, 35.8, 34.9, and 35.8 grams, respectively. 98. Refer to Weights of Chocolate Bars Narrative. What is the centreline value? ANS: The centreline is located at the average of the sample means, or x̿ = 174.8/5 = 34.96 grams. PTS: 1 REF: 294-296 BLM: Higher Order - Analyze

TOP: 7

99. Refer to Weights of Chocolate Bars Narrative. The calculated value of s, the sample standard deviation of all nk = (30)(5) = 150 observations, is 3.36 grams. What is the standard error of the mean of 30 observations? ANS: The estimated standard error of the mean is s/√n = 3.36/√30 = 0.6134 g. PTS: 1 REF: 294-296 BLM: Higher Order - Apply

TOP: 7

100. Refer to Weights of Chocolate Bars Narrative. What are the control limits? ANS: x̿ ± 3(s/√n) = 34.96 ± 1.8402; UCL = 36.8002 and LCL = 33.1198 grams. PTS: 1 REF: 294-296 BLM: Higher Order - Apply

TOP: 7

101. Refer to Weights of Chocolate Bars Narrative. How is the x̄ chart used in this situation?

ANS: Samples are selected and their sample mean calculated. The company readjusts its candy mould-filling machine whenever a sample mean falls outside the control limits. PTS: 1 REF: 294-296 BLM: Higher Order - Understand

TOP: 7

102. A producer of brass rivets randomly samples 500 rivets each hour and calculates the proportion of defectives in the sample. The mean sample proportion calculated from 250 samples was equal to 0.025. Calculate the upper and lower control limits for a control chart for the proportion of defectives in samples of 500 rivets. Explain how the control chart can be of value to a manager. ANS: p̄ ± 3√(p̄(1 – p̄)/n) = 0.025 ± 3√(0.025(0.975)/500) = 0.025 ± 0.0209. The upper and lower control limits for a p chart are then UCL = 0.046 and LCL = 0.004. The manager can use the control chart to detect changes in the production process that might produce an unusually large number of defectives. PTS: 1 REF: 296 BLM: Higher Order - Analyze

TOP: 7
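For the x̄ chart in the Weights of Chocolate Bars questions, the control limits are the centreline plus or minus three estimated standard errors. The Python sketch below takes the centreline of 34.96 g and s = 3.36 g as given in the answers above. The function name `xbar_chart_limits` is illustrative only.

```python
import math

def xbar_chart_limits(centre, s, n):
    """3-sigma control limits for an x-bar chart.

    centre: centreline (average of the sample means)
    s: standard deviation of all observations
    n: number of observations in each sample
    """
    se = s / math.sqrt(n)   # estimated standard error of a sample mean
    return centre - 3 * se, centre + 3 * se

lcl, ucl = xbar_chart_limits(34.96, 3.36, 30)
print(f"LCL = {lcl:.2f} g, UCL = {ucl:.2f} g")
```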

Chapter 8—Large-Sample Estimation MULTIPLE CHOICE 1. Which of the following best describes an unbiased estimator? a. any sample statistic used to approximate a population parameter b. a sample statistic which has an expected value equal to the value of the population parameter c. a sample statistic whose value is usually less than the value of the population parameter d. any estimator whose standard error is relatively small ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 314

TOP: 1–4

2. Which of these options is the best definition of a point estimate? a. It is the average of the sample values. b. It is the average of the population values. c. It is a single value that is the best estimate of an unknown population parameter. d. It is a single value that is the best estimate of an unknown sample statistic. ANS: C BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

3. From a sample of 200 items, 12 items are defective. In this case, what will be the point estimate of the population proportion defective? a. 0.06 b. 0.12 c. 12 d. 16.67 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 316

TOP: 1–4

4. Which of the following best defines statistical estimation? a. a process of inferring the values of unknown population parameters from those of known sample statistics b. a process of inferring the values of unknown sample statistics from those of known population parameters c. any procedure that views the parameter being estimated not as a constant, but, just like the estimator, as a random variable d. a sampling procedure that matches each unit from population A with a “twin” from population B so that any sample observation about a unit in population A automatically yields an associated observation about a unit in population B ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 318

TOP: 1–4

5. What is the type of sample statistic that is used to make inferences about a given type of population parameter? a. the estimator of that parameter b. the confidence level of that parameter c. the confidence interval of that parameter d. the point estimate of that parameter

ANS: A BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

6. Why do those who engage in estimation insist on random sampling, rather than convenience sampling or judgment sampling? a. because random sampling avoids the errors inherent in matched pairs sampling b. because random sampling avoids the errors inherent in work sampling c. because random sampling eliminates the systematic error or bias that arises in non-random sampling ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 315-316

TOP: 1–4

7. What is a sample statistic such that the mean of all its possible values differs from the population parameter that the statistic seeks to estimate? a. an efficient estimator b. an inconsistent estimator c. a biased estimator d. a Bayesian estimator ANS: C BLM: Remember

PTS:

REF: 314

TOP: 1–4

8. Whenever a sampled population is normally distributed, or whenever the conditions of the Central Limit Theorem are fulfilled, what may be said of the sample mean x̄? a. It is a consistent estimator of the population mean, μ, because the mean of the sampling distribution of the sample mean equals μ. b. It is an efficient estimator of the population mean, μ, because the mean of the sampling distribution of the sample mean equals μ. c. It is an unbiased estimator of the population mean, μ, because the mean of the sampling distribution of the sample mean equals μ. d. It is an efficient estimator of the population mean, μ, because the mean of the sampling distribution of the sample proportion equals p. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 312-314

TOP: 1–4

9. In order to estimate the average number of kilometres that students living off-campus commute to classes every day, the following statistics were given: n = 50, x̄ = 5.21, and s = 2.48. Which of the values below would be the best point estimate of the true population mean μ? a. 1.96 b. 2.10 c. 5.21 d. 7.07

ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 315-316

TOP: 1–4

10. Which of the following best describes the term “margin of error”? a. It is the difference between the point estimate and the true value of the population parameter. b. It is the critical value times the standard error of the estimator. c. It is the smallest possible sampling error. d. It is a measurement of the variability of the true value of the population parameter. ANS: B BLM: Remember

PTS:

REF: 315-316

TOP: 1–4

11. Which of these options provides the best interpretation of a 90% confidence interval estimate of the population mean μ? a. If we repeatedly draw samples of the same size from the same population, 90% of the values of the sample means will result in a confidence interval that includes the population mean μ. b. There is a 90% probability that the population mean μ will lie between the lower confidence limit (LCL) and the upper confidence limit (UCL). c. We are 90% confident that we have selected a sample whose range of values does not contain the population mean μ. d. We are 90% confident that 10% of the values of the sample means will result in a confidence interval that includes the population mean μ. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 324-325

TOP: 5

12. Which of these statements is NOT a property of the confidence interval estimate of the population mean? a. Its width narrows when the sample size increases. b. Its width narrows when the value of the sample mean increases. c. Its width widens when the confidence level increases. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 323-324

TOP: 5

13. A 99% confidence interval estimate for a population mean μ is determined to be 85.58 to 96.62. If the confidence level is reduced to 90%, what happens to the confidence interval for μ? a. It becomes wider. b. It remains the same. c. It becomes narrower. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 323-324

TOP: 5

14. In developing an interval estimate for a population mean for which the population standard deviation σ was 8, the interval estimate was 40.52 ± 3.24. If σ had equalled 16, what would the interval estimate have been? a. 40.52 ± 6.48 b. 40.52 ± 11.24 c. 48.52 ± 11.24 d. 81.04 ± 6.48 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 323-324

TOP: 5

15. In developing an interval estimate for a population mean, a sample of 40 observations was used. The interval estimate was 17.25 ± 2.42. If the sample size had been 160 instead of 40, what would the interval estimate have been? a. 17.25 ± 1.21 b. 17.25 ± 9.68 c. 34.50 ± 4.82 d. 69.00 ± 9.68 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 323-324

TOP: 5
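Questions 14 and 15 both rest on the margin of error being proportional to σ/√n, with the critical value held fixed. A small Python sketch of that scaling follows. The function name `rescale_margin` and its parameters are hypothetical.

```python
import math

def rescale_margin(margin, old_n, new_n, old_sigma=1.0, new_sigma=1.0):
    """Rescale a margin of error when sigma and/or n change.

    Relies on margin = z * sigma / sqrt(n) with the same z before and after.
    """
    return margin * (new_sigma / old_sigma) * math.sqrt(old_n / new_n)

# Question 14: sigma doubles from 8 to 16, n unchanged
margin_q14 = rescale_margin(3.24, old_n=1, new_n=1, old_sigma=8, new_sigma=16)
# Question 15: n quadruples from 40 to 160, sigma unchanged
margin_q15 = rescale_margin(2.42, old_n=40, new_n=160)
print(round(margin_q14, 2), round(margin_q15, 2))
```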

16. After constructing a confidence interval estimate for a population mean, you believe that the interval is useless because it is too wide. In order to correct this problem, what should you do? a. increase the population size b. increase the sample mean c. increase the confidence coefficient d. increase the sample size ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 325

TOP: 5

17. Which of the following is NOT a part of the formula for constructing a confidence interval estimate of the population proportion? a. a point estimate of the population proportion b. the standard error of the sampling distribution of the sample proportion c. the confidence coefficient d. the value of the population proportion ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 326

TOP: 5

18. Which of the following must hold before one can make use of the standard normal distribution in order to construct a confidence interval estimate for the population proportion p? a. np̂ and n(1 – p̂) are both greater than 5, where p̂ is the sample proportion. b. np and n(1 – p) are both greater than 5. c. (p + ) and (p – ) are both greater than 1. d. The sample size is greater than 5.

ANS: A BLM: Remember

PTS:

REF: 326

TOP: 5

19. What would be the lower limit of a confidence interval, at the 95% level of confidence, for the population proportion if a sample of size 100 were to have 30 successes? a. 0.2102 b. 0.2959 c. 0.3041 d. 0.3898 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 326-327

TOP: 5
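The lower limit in question 19 comes from p̂ ± z√(p̂(1 – p̂)/n). A minimal Python sketch of that large-sample interval is given below. The function name `proportion_ci` is an assumption, and the critical value is taken from `statistics.NormalDist` instead of a rounded 1.96.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # two-sided critical value
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = proportion_ci(30, 100)
print(f"95% CI: {lower:.4f} to {upper:.4f}")
```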

20. Which of the following best describes an interval estimate? a. It is a sampling procedure that matches each unit from population A with a “twin” from population B, so that any sample observation about a unit in population A automatically yields an associated observation about a unit in population B. b. It is an estimate of a population parameter that is expressed as a range of values within which the unknown but true parameter presumably lies. c. It is a sample statistic such that the mean of all its possible values equals the population parameter the statistic seeks to estimate. d. It is the sum of an estimator’s squared bias plus its variance. ANS: B BLM: Remember

PTS:

REF: 320

TOP: 5

21. Which of the following are possible options when estimating a population mean the population standard deviation is known? a. We may define the limits of an interval estimate of

, where

b. We may define the limits of an interval estimate of as . c. We may choose a smaller z-value, construct a narrower confidence interval, and achieve a higher confidence level. d. We may choose a larger z-value, construct a wider confidence interval, and achieve a lower confidence level. ANS: A BLM: Remember

PTS:

REF: 323-324

TOP: 5

22. To what does the term “confidence level” refer? a. the absolute number of interval estimates that can be expected to contain the actual value of the parameter being estimated when the same procedure of interval construction is used again and again b. the percentage of interval estimates that can be expected to contain the actual value of the parameter being estimated when the same procedure of interval construction is used again and again c. the range of values among which an unknown population parameter can presumably be found d. the sum of an estimator’s squared bias plus its variance, which indicates the degree to which it is consistent, efficient, and unbiased

ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 320

TOP: 5

23. A random sample of 400 students was surveyed to determine an estimate for the proportion of all students who had attended at least three football games. The estimate revealed that between 0.372 and 0.458 of all students attended. Given this information, which of the following is the approximate value of the confidence coefficient? a. 0.95 b. 0.92 c. 0.90 d. 0.88 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 320 | 326

TOP: 5

24. A 95% confidence interval for the population proportion of professional tennis players who earn more than $2 million a year is found to be between 0.82 and 0.88. What was the approximate sample size used to obtain this information? a. 545 b. 387 c. 382 d. 233 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 326

TOP: 5

25. A recent survey indicates that the proportion of season ticket holders for the school hockey team who renew their seats is about 0.80. Using a 95% confidence interval and a margin of error of 0.025, what is the approximate size of the sample needed to estimate the true proportion who plan to renew their seats? a. 689 b. 697 c. 984 d. 1179 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 326

TOP: 5
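Questions 24 and 25 both use the relation n = z²p(1 – p)/E², rounded up to the next whole observation. The Python sketch below applies it to both. The function name `sample_size_for_proportion` is illustrative, and z = 1.96 is assumed for 95% confidence.

```python
import math

def sample_size_for_proportion(p, margin, z=1.96):
    """Smallest n whose margin of error for a proportion is at most `margin`."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Question 25: p about 0.80, margin of error 0.025
n_q25 = sample_size_for_proportion(0.80, 0.025)
# Question 24: interval 0.82 to 0.88, so p-hat = 0.85 and margin = 0.03
n_q24 = sample_size_for_proportion(0.85, 0.03)
print(n_q25, n_q24)
```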

26. If the population standard deviation is known and we wish to estimate the population mean with 95% confidence, which of the following would be the appropriate critical z-value to use? a. 1.28 b. 1.645 c. 1.96 d. 2.33 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 323

TOP: 5

27. If the population standard deviation is known and we wish to estimate the population mean with 90% confidence, what is the appropriate critical z-value to use? a. 1.28 b. 1.645 c. 1.96 d. 2.33 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 323

TOP: 5

28. A statistician wishes to reduce the margin of error associated with a confidence interval estimate for a population proportion p. What does she or he need to do? a. reduce the confidence level 1 – α b. decrease the sample size n c. take another sample d. increase the sample size n ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 325-326

TOP: 5

29. In order to construct a 95% confidence interval estimate for the difference between the means of two normally distributed populations, where the unknown population variances are assumed not to be equal, the following summary statistics were computed from two independent samples: , , , and . In this case, what is the upper confidence limit? a. 6.78 b. 18.78 c. 77.3 d. 89.3 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 331

TOP: 6

30. In developing a confidence interval estimate for the difference between two population means, which of the following will result from an increase in the size of the sample? a. a wider confidence interval b. a narrower confidence interval c. a smaller critical z-value d. a larger critical z-value ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 331

TOP: 6

31. When two independent random samples of sizes n₁ and n₂ have been selected from populations with means μ₁ and μ₂ and variances σ₁² and σ₂², respectively, which of the following is a property of the sampling distribution of x̄₁ – x̄₂? a. If the sampled populations are normally distributed, then the sampling distribution of x̄₁ – x̄₂ is exactly normal only when n₁ and n₂ are both 30 or more. b. If the sampled populations are normally distributed, then the sampling distribution of x̄₁ – x̄₂ is exactly normal regardless of the sizes of n₁ and n₂. c. If the sampled populations are not normally distributed, then the sampling distribution of x̄₁ – x̄₂ is approximately normally distributed regardless of the sizes of n₁ and n₂. d. If the sampled populations are not normally distributed, then the sampling distribution of x̄₁ – x̄₂ is approximately normally distributed only if is 30 or more. ANS: B BLM: Remember

PTS:

REF: 331

TOP: 6

32. Suppose you wish to estimate the difference between two population means when the population variances are known. Which critical values of z can you use to develop the 90% confidence interval estimate? a. 2.33 b. 1.96 c. 1.645 d. 1.28 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 331

TOP: 6

33. Suppose you wish to estimate the difference between two population means when the population variances are known. Which critical value of z can you use to develop the 99% confidence interval estimate? a. 2.575 b. 2.325 c. 1.645 d. 1.275 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 331

TOP: 6

34. If a 90% confidence interval estimate for the difference between two population proportions is to be constructed, what would the confidence coefficient be? a. 0.90 b. 0.45 c. 0.10 d. 0.05 ANS: A TOP: 7

PTS: 1 BLM: Remember

REF: 336-337 | 320

35. If we wish to construct a 95% confidence interval estimate for the difference between two population proportions, what would the confidence level be? a. 1.96 b. 0.95 c. 0.475 d. 0.05

ANS: B TOP: 7

PTS: 1 REF: 336-337 | 320 BLM: Higher Order - Understand

36. What is the z-value needed to construct a 97.8% confidence interval estimate for the difference between two population proportions? a. 2.29 b. 2.02 c. 1.96 d. 1.65 ANS: A TOP: 7

PTS: 1 REF: 336-337 | 320 BLM: Higher Order - Apply

37. What is the z-value needed to construct a 92.5% confidence interval estimate for the difference between two population proportions? a. 2.58 b. 2.33 c. 1.96 d. 1.78 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 336-337

TOP: 7
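The critical z-values asked for in items 27, 32, 33, 36, and 37 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `two_sided_z` is a hypothetical helper, not something from the textbook. Note that the answer key uses the rounded table value 2.575 for 99% confidence, while the exact quantile is closer to 2.576.

```python
from statistics import NormalDist


def two_sided_z(confidence: float) -> float:
    """Return z_{alpha/2}, the critical value for a two-sided interval.

    `confidence` is the confidence coefficient 1 - alpha (for example 0.90).
    """
    alpha = 1.0 - confidence
    # The upper-tail area is alpha/2, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


if __name__ == "__main__":
    for level in (0.90, 0.925, 0.95, 0.978, 0.99):
        print(f"{level:.3f} confidence: z = {two_sided_z(level):.3f}")
```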

38. Suppose the population standard deviation equals 10. What is the sample size needed to estimate, with 95% confidence, a population mean within 1.5 units of its true value? a. 171 b. 121 c. 54 d. 13 ANS: A TOP: 8–9

PTS: 1 REF: 323 | 342-343 BLM: Higher Order - Analyze
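The sample-size calculation in item 38 follows from solving z(σ/√n) ≤ B for n and rounding up. A short sketch, assuming the large-sample normal formula; the helper name `sample_size_for_mean` is illustrative only.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma: float, bound: float, confidence: float) -> int:
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= bound."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Solve for n, then round up because n must be a whole number.
    return math.ceil((z * sigma / bound) ** 2)


if __name__ == "__main__":
    # Item 38: sigma = 10, within 1.5 units, 95% confidence.
    print(sample_size_for_mean(10, 1.5, 0.95))
```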

39. What is the approximate z-value you would use if you wish to construct an 80% lower confidence bound for the population mean μ? a. 0.84 b. 1.28 c. 1.96 d. 2.33 ANS: A TOP: 8–9

PTS: 1 REF: 323 | 340-341 BLM: Higher Order - Apply

40. What is the approximate z-value you would use if you wish to construct an 85% upper confidence bound for the population proportion p? a. 2.33 b. 1.96 c. 1.65 d. 1.04 ANS: D

PTS:

REF: 326 | 340-341

TOP: 8–9

BLM: Higher Order - Apply

41. What is the approximate z-value you would use if you wish to construct a 92% lower confidence bound for the difference between population means in the case of large samples? a. 2.58 b. 1.65 c. 1.41 d. 1.06 ANS: C TOP: 8–9

PTS: 1 REF: 331 | 340-341 BLM: Higher Order - Apply

42. What is the approximate z-value you would use if you wish to construct a 98% upper confidence bound for the difference between population proportions? a. 2.33 b. 2.05 c. 1.65 d. 1.41 ANS: B TOP: 8–9

PTS: 1 REF: 336-337 | 340-341 BLM: Higher Order - Apply
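For one-sided confidence bounds (items 39 to 42) the whole tail area α sits on one side, so the critical value is z_α rather than z_{α/2}. A minimal sketch, with `one_sided_z` as a hypothetical helper name:

```python
from statistics import NormalDist


def one_sided_z(confidence: float) -> float:
    """Return z_alpha for a one-sided bound with confidence 1 - alpha."""
    # All of alpha lies in a single tail, so use the `confidence` quantile.
    return NormalDist().inv_cdf(confidence)


if __name__ == "__main__":
    for level in (0.80, 0.85, 0.92, 0.98):
        print(f"{level:.2f} bound: z = {one_sided_z(level):.2f}")
```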

43. Suppose you wish to estimate a population mean based on a sample of n observations. What sample size is required if you want your estimate to be within 2 units of μ with probability equal to 0.95, if you know the population standard deviation is 12? a. 239 b. 196 c. 139 d. 98 ANS: C TOP: 8–9

PTS: 1 REF: 323 | 342-343 BLM: Higher Order - Analyze

TRUE/FALSE 1. An interval estimate is an interval that provides an upper and a lower bound for a specific population parameter whose value is unknown. ANS: T BLM: Remember

PTS:

REF: 313 | 320

TOP: 1–4

2. A statistic is said to be unbiased if its sampling distribution has the smallest standard error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 314

TOP: 1–4

3. A point estimate is a single number that is used as an estimate of a population parameter or population characteristic. It is usually derived from a random sample from the population of interest.

ANS: T BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

4. The maximum distance between an estimator and the true value of a parameter is called the margin of error. ANS: T BLM: Remember

PTS:

REF: 315-316

TOP: 1–4

5. An unbiased estimator of a population parameter is an estimator whose variance is the same as the actual value of the population variance. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 314

TOP: 1–4

6. The sample standard deviation, s, is an unbiased estimator of the population standard deviation, σ. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 314

TOP: 1–4

7. An estimator is a random variable calculated from a random sample that provides either a point estimate or an interval estimate for some population parameter. ANS: T BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

8. The error of estimation is the distance between an estimate and the estimated parameter. ANS: T BLM: Remember

PTS:

REF: 315

TOP: 1–4

9. An estimator is unbiased if the mean of its sampling distribution is the population parameter being estimated. ANS: T BLM: Remember

PTS:

REF: 314

TOP: 1–4

10. The process of inferring the values of unknown population parameters from those of known sample statistics is called estimation. ANS: T BLM: Remember

PTS:

REF: 311-312

TOP: 1–4

11. A point estimate is an estimate of a population parameter, expressed as a single numerical value. ANS: T BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

12. A sample statistic such that the mean of all its possible values differs from the population parameter that the statistic seeks to estimate is a biased estimator. ANS: T BLM: Remember

PTS:

REF: 314

TOP: 1–4

13. A sample statistic such that the mean of all its possible values equals the population parameter that the statistic seeks to estimate is an unbiased estimator. ANS: T BLM: Remember

PTS:

REF: 314

TOP: 1–4

14. The margin of error equals the sum of an estimator’s squared bias plus its variance. ANS: F BLM: Remember

PTS:

REF: 315-316

TOP: 1–4

15. The error of estimation is the difference between a statistic computed from a sample and the corresponding parameter computed from the population. ANS: T BLM: Remember

PTS:

REF: 315-316

TOP: 1–4

16. If a store manager is interested in estimating the mean amount spent per customer per visit at her store, the sample mean would be the appropriate point estimate. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

17. If the campaign manager of the Conservative Party is interested in estimating the proportion of voters who will support the Conservative Party in the next federal election, the sample proportion, p̂, would be the appropriate point estimate.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

18. If a store manager has recently stated that she estimates the mean amount spent per customer per visit to be between $38.75 and $72.23, the numbers $38.75 and $72.23 are considered point estimates for the true population mean. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

19. A point estimate of a population parameter will likely be different from the corresponding population value due to the fact that point estimates are subject to sampling error. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

20. Increasing the sample size, n, will result in a point estimate that is closer to the true value of the population parameter. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

21. The concept of margin of error applies directly when estimating the population mean, but is not applicable when estimating the population proportion, p. ANS: F BLM: Remember

PTS:

REF: 315-316

TOP: 1–4

22. A point estimate is an estimate of the range of a population parameter. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

23. An interval estimate is an estimate of the range for a sample statistic. ANS: F TOP: 1–4

PTS: 1 BLM: Remember

REF: 312-313 | 320

24. A point estimate is a single value estimate of the value of a population parameter. ANS: T BLM: Remember

PTS:

REF: 312-313

TOP: 1–4

25. Statisticians routinely construct interval estimates by setting the point estimate as the centre of the interval and then creating a range of other possible values, known as the margin of error, below and above the centre. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 320-322

TOP: 1–4

26. The margin of error is a half-width of an interval estimate, equal to the difference between the point estimate on the one hand and either the lower or the upper limit of the interval on the other hand. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 320-322

TOP: 1–4

27. The unknown parameter of a population is presumed to lie at the centre of the interval that the point estimate and margin of error create. ANS: F TOP: 1–4

PTS: 1 REF: 315-316 | 321-322 BLM: Higher Order - Understand

28. A point estimate is subject to sampling error and will almost always be different from the true value of the population parameter. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 312-313

TOP: 1–4

29. As the sample size increases and other factors remain the same, the width of a confidence interval for a population mean tends to decrease. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 323

TOP: 5

30. Suppose a 95% confidence interval for the mean height of a 12-year-old male in Canada is 137 to 165 cm. It can be said that 95% of 12-year-old males in Canada have height greater than or equal to 137 cm and less than or equal to 165 cm. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 324-325

TOP: 5

31. If the population variance is increased and other factors remain the same, the width of a confidence interval for the population mean tends to increase. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 323

TOP: 5

32. The sample proportion p̂ is an unbiased estimator of the population proportion, p. ANS: T BLM: Remember

PTS:

REF: 314

TOP: 5

33. The confidence coefficient is the probability that a confidence interval will enclose the estimated parameter. ANS: T BLM: Remember

PTS:

REF: 320

TOP: 5

34. Given that n = 49, x̄ = 75, and σ = 7, the lower and upper limits of the 68.26% confidence interval for the population mean are 74 and 76, respectively. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 323-324

TOP: 5

35. In the formula x̄ ± z_{α/2}(σ/√n), the α/2 refers to the area in the lower tail or upper tail of the sampling distribution of the sample mean. ANS: T BLM: Remember

PTS:

REF: 322-323

TOP: 5

36. In order to construct a confidence interval estimate of the population proportion p, the value of p is needed. ANS: F BLM: Remember

PTS:

REF: 326

TOP: 5

37. In developing an interval estimate for the population mean μ, the population standard deviation σ was assumed to be 6. The interval estimate was 45.0 ± 1.5. Had σ equalled 12, the interval estimate would be 90 ± 3. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 323-324

TOP: 5

38. A 90% confidence interval estimate for a population mean μ is determined to be 62.8 to 73.4. If the confidence level is reduced to 80%, the confidence interval for μ becomes narrower. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 323-324

TOP: 5

39. When constructing a confidence interval for a population parameter, we generally set the confidence coefficient (1 − α) close to 0 (usually between 0 and 0.05) because it is the probability that the interval does not include the actual value of the population parameter. ANS: F BLM: Remember

PTS:

REF: 321-322

TOP: 5

40. Suppose that a 95% confidence interval for the population proportion p is given by … . This notation means that we are 95% confident that p falls between … and … . ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 326-327

TOP: 5

41. Given that n = 400 and p̂ = 0.10, the lower limit of the 90% confidence interval for the population proportion p is 0.1247. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 326

TOP: 5

42. The term confidence level refers to an estimate of a population parameter, expressed as a range of values within which the unknown but true parameter presumably lies. ANS: F BLM: Remember

PTS:

REF: 321-322

TOP: 5

43. The term confidence interval refers to ranges of values among which an unknown population parameter can presumably be found.

ANS: T BLM: Remember

PTS:

REF: 321-322

TOP: 5

44. The two limits that define an interval estimate are known as confidence limits. ANS: T BLM: Remember

PTS:

REF: 322

TOP: 5

45. A confidence interval for the population mean μ will contain the true value of μ as long as the point estimate is within the lower and the upper confidence limits. ANS: F TOP: 5

PTS: 1 BLM: Remember

REF: 324-325 | 327

46. A confidence interval for the population proportion p may or may not contain the true value of p. ANS: T TOP: 5

PTS: 1 BLM: Remember

REF: 324-325 | 327

47. The wider the confidence interval, the more likely it is that the interval contains the true value of the population parameter. ANS: F BLM: Remember

PTS:

REF: 324

TOP: 5

48. If a population is right skewed, the point estimate will be pushed to the right of the middle of the confidence interval estimate. ANS: F BLM: Remember

PTS:

REF: 313-314

TOP: 5

49. Based on the formula x̄ ± z_{α/2}(σ/√n), we can assume that the point estimate x̄ of the population mean μ will be at the centre of the confidence interval estimate. ANS: T BLM: Remember

PTS:

REF: 323

TOP: 5

TOP: 5

50. A 95% confidence interval for the population proportion p is found to be between 0.214 and 0.336. Based on this information, the sample proportion that generated the confidence interval was 0.122. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 326

TOP: 5

51. A 90% confidence interval for the population mean is found to be between 5.28 and 6.72. Based on this information, the sample mean that generated the confidence interval was 6. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 323-324

TOP: 5

52. In constructing a confidence interval for a population parameter, such as μ or p, the margin of error is directly dependent on the value of the point estimate. ANS: F TOP: 5

PTS: 1 BLM: Remember

REF: 321 | 323 | 326

53. Increasing the confidence coefficient from 0.90 to 0.95 and decreasing the sample size from 100 to 50 has unknown impact on the margin of error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 321-322

TOP: 5

54. One way to reduce the margin of error in a confidence interval is to decrease the confidence coefficient. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 321-322

TOP: 5

55. For a given sample size and given confidence coefficient, the closer the population proportion p to 1.0, the greater the margin of error will be. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 326

TOP: 5

56. Suppose a 95% confidence interval for the mean height of a 12-year-old male in Canada is 137 to 165 cm. In repeated sampling, 95% of the intervals constructed will contain the interval from 137 to 165 cm. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 324-325

TOP: 6

57. Suppose a 90% confidence interval for the mean time it takes to serve a customer at a drive-in bank is 120 seconds to 220 seconds. In repeated sampling, 90% of the intervals constructed using the appropriate formula will contain the actual mean time. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 324-325

TOP: 6

58. Suppose a 90% confidence interval for the mean time it takes to serve a customer at a drive-in bank is 120 seconds to 220 seconds. At the 90% confidence level, there is not enough evidence to conclude that the mean service time is not 200 seconds.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 324-325

TOP: 6

59. The difference between two sample means (x̄₁ − x̄₂) is an unbiased estimator of the difference between two population means (μ₁ − μ₂). ANS: T BLM: Remember

PTS:

REF: 331 | 314

TOP: 6

60. The best estimator of the difference between two population means (μ₁ − μ₂) is the difference between two sample means (x̄₁ − x̄₂). ANS: T BLM: Remember

PTS: 1

REF: 331

TOP: 6

61. The standard error of the sampling distribution of (x̄₁ − x̄₂) is given by the formula SE = … ANS: F BLM: Remember

PTS:

REF: 331

TOP: 6

62. When two independent random samples of sizes n₁ and n₂ have been selected from populations with means μ₁ and μ₂ and variances σ₁² and σ₂², respectively, the standard error of the sampling distribution of (x̄₁ − x̄₂) is found by taking the square root of the sum of the two population variances. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 331

TOP: 6

63. In estimating the difference between two population means, if a 90% confidence interval estimate includes 0, then we can conclude that there is a 90% chance that the difference between the two population means is 0. ANS: F TOP: 6

PTS: 1 REF: 331 | 324-325 BLM: Higher Order - Understand

64. Increasing the confidence level for a confidence interval estimate for the difference between two population means, with all other things held constant, will result in a wider confidence interval estimate. ANS: T PTS: BLM: Higher Order

REF: 331

TOP: 6

65. When estimating the difference between two population means, one can use the properties of the sampling distribution of two corresponding sample means provided the two samples are selected independently of one another. ANS: F BLM: Remember

PTS:

REF: 331

TOP: 6

66. A simple extension of the estimation of a binomial proportion p is the estimation of the difference between two binomial proportions p₁ and p₂. ANS: T BLM: Remember

PTS:

REF: 336-337

TOP: 7

67. Assume that two independent random samples of sizes n₁ and n₂ have been selected from binomial populations with parameters p₁ and p₂, respectively. The standard error of the sampling distribution of (p̂₁ − p̂₂) (the difference between the sample proportions) can be estimated by … ANS: F BLM: Remember

PTS:

REF: 336-337

TOP: 7

68. Assume that two independent random samples of sizes n₁ and n₂ have been selected from binomial populations with parameters p₁ and p₂, respectively. The sampling distribution of (p̂₁ − p̂₂) can be approximated by a normal distribution provided that n₁p₁, n₁q₁, n₂p₂, and n₂q₂ are all greater than 5. ANS: T BLM: Remember

PTS:

REF: 336-337

TOP: 7

69. Two independent random samples of sizes n₁ = … and n₂ = … have been selected from binomial populations with respective parameters p₁ and p₂, resulting in 38 and 65 successes, respectively. Then, the point estimate of the difference (p₁ − p₂) is –27. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 336-337

TOP: 7

70. Two independent random samples of sizes n₁ = … and n₂ = … have been selected from binomial populations with respective parameters p₁ and p₂, resulting in 38 and 65 successes, respectively. Then the standard error of (p̂₁ − p̂₂) is estimated as 0.077. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 336-337

TOP: 7

71. One-sided confidence bounds can be constructed for the population mean μ and population proportion p, but not for (μ₁ − μ₂) (the difference between population means) or (p₁ − p₂) (the difference between population proportions). ANS: F TOP: 8–9

PTS: 1 BLM: Remember

REF: 336-337 | 340-341

72. A 95% lower confidence bound (LCB) for the population mean μ can always be constructed using the following equation: LCB = … ANS: F TOP: 8–9

PTS: 1 REF: 323-324 | 340-341 BLM: Higher Order - Understand

73. A sociologist wanted to discover whether there was any difference between Eastern Canadians and Western Canadians in the collective acceptance of multiculturalism as being beneficial to the country. To test the hypothesis that there was essentially no difference, it would be sufficient to collect a convenient sample size from both populations. ANS: F BLM: Remember

PTS:

REF: 336-337

TOP: 7

PROBLEM 1. A random sample of 45 door-to-door salespersons were asked how long on average they were able to talk to the potential customer. Their answers revealed a mean of 8.5 minutes with a variance of 9 minutes. Give a point estimate for the average conversation length and the margin of error. ANS: Point estimate is x̄ = 8.5 minutes. Margin of error = 1.96 s/√n = 1.96(3)/√45 = 0.88 minutes.

PTS: 1 REF: 316 BLM: Higher Order - Analyze

TOP: 1–4
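The margin of error in Problem 1 can be reproduced with a few lines of Python. This is a sketch under the large-sample assumption used in the answer key (z = 1.96 with s substituted for σ); `margin_of_error_mean` is a hypothetical helper name.

```python
import math


def margin_of_error_mean(s: float, n: int, z: float = 1.96) -> float:
    """Large-sample margin of error for a sample mean: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)


if __name__ == "__main__":
    # Problem 1: variance 9 gives s = 3, with n = 45.
    s = math.sqrt(9)
    print(round(margin_of_error_mean(s, 45), 2))
```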

2. Twenty retired people living within the Crystal city limits were asked if they would use public transportation if a system was implemented. Their responses are listed below where Y = Yes and N = No. Y N N N N N Y Y Y Y N N Y Y Y Y Y Y Y N Use these data to estimate p, the true proportion of all retired people living in the city limits who would use a public transportation system, and find the estimated margin of error. ANS:

The point estimate is p̂ = 12/20 = 0.6. The margin of error is 1.96 √(p̂q̂/n) = (1.96)(0.1095) = 0.2147. PTS: 1 REF: 316 BLM: Higher Order - Analyze

TOP: 1–4

3. A proportion of a basketball team’s season ticket holders renew their tickets for the next season. Let p denote the true proportion of ticket holders who buy tickets again for the following season. A random sample of 125 ticket holders revealed 90 people plan on renewing their tickets. Give a point estimate for p and find the estimated margin of error. ANS: The point estimate is p̂ = 90/125 = 0.72. The margin of error is 1.96 √(p̂q̂/n) = (1.96)(0.0402) = 0.079. PTS: 1 REF: 316 BLM: Higher Order - Analyze

TOP: 1–4
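Problems 2 and 3 use the same recipe: compute p̂ = x/n, then 1.96 √(p̂q̂/n). The sketch below assumes that large-sample formula; the function name `margin_of_error_proportion` is illustrative only.

```python
import math


def margin_of_error_proportion(successes: int, n: int, z: float = 1.96):
    """Return (p_hat, margin) using the estimated standard error sqrt(p*q/n)."""
    p_hat = successes / n
    standard_error = math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, z * standard_error


if __name__ == "__main__":
    # Problem 2: 12 "Yes" answers out of 20; Problem 3: 90 renewals out of 125.
    for x, n in ((12, 20), (90, 125)):
        p_hat, margin = margin_of_error_proportion(x, n)
        print(f"p_hat = {p_hat:.3f}, margin of error = {margin:.4f}")
```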

4. A random sample of n = 1000 observations from a binomial population produced x = 728 successes. Estimate the binomial proportion p and calculate the margin of error. ANS: The point estimate for p is p̂ = 728/1000 = 0.728, and the margin of error is approximately 1.96 √(p̂q̂/n) = (1.96)(0.0141) = 0.0276. PTS:

REF: 316

TOP: 1–4

BLM: Higher Order

5. A random sample of n = 50 observations from a quantitative population produced x̄ = 65.4 and s² = 2.8. Give the best point estimate for the population mean μ, and calculate the margin of error. ANS: The point estimate for μ is x̄ = 65.4, and the margin of error in estimation with s² = 2.8 and n = 50 is approximately 1.96 √(s²/n) = 1.96 √(2.8/50) = 0.4638. PTS: 1 REF: 316 BLM: Higher Order - Analyze

TOP: 1–4

Telephone Poll Radio and television stations often air controversial issues during broadcast time and ask viewers to indicate their agreement or disagreement with a given stand on the issue. A poll is conducted by asking those viewers who agree to call a certain 900 telephone number and those who disagree to call a second 900 telephone number. All respondents pay a fee for their calls.

6. Please refer to Telephone Poll paragraph. Does this polling technique result in a random sample? ANS: This method of sampling would not be random, since only interested viewers (those who were adamant in their approval or disapproval) would reply. PTS: 1 REF: 269 BLM: Higher Order - Analyze

TOP: 1–4

7. Please refer to Telephone Poll paragraph. What can be said about the validity of the results of such a survey? Do you need to be concerned about a margin of error in this case? Justify your conclusion. ANS: The results of such a survey will not be valid, and a margin of error would be useless, since its accuracy is based on the assumption that the sample was random. PTS: 1 REF: 313-315 BLM: Higher Order - Evaluate

TOP: 1–4

8. Independent samples of n₁ = 400 and n₂ = 400 observations were selected from binomial populations 1 and 2, and x₁ = 100 and x₂ = 127 successes were observed, respectively. a. What is the best point estimator for the difference (p₁ − p₂) in the two binomial proportions? b. Calculate the approximate standard error for the statistic used in (a). c. What is the margin of error for this point estimate? ANS: a. p̂₁ = 100/400 = 0.25, and p̂₂ = 127/400 = 0.3175. The best estimate of (p₁ − p₂) is p̂₁ − p̂₂ = 0.25 – 0.3175 = –0.0675. b. The standard error of (p̂₁ − p̂₂) is estimated by SE = √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂) = 0.0318. c. The approximate margin of error is 1.96 SE = 0.0623. PTS: 1 REF: 336-337 BLM: Higher Order - Apply

TOP: 1–4
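The three parts of Problem 8 can be verified together. A minimal sketch, assuming independent samples and the estimated standard error √(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂); the helper name `diff_proportions` is hypothetical.

```python
import math


def diff_proportions(x1: int, n1: int, x2: int, n2: int, z: float = 1.96):
    """Return (point estimate, standard error, margin of error) for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    # Variances of independent sample proportions add.
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    return p1 - p2, se, z * se


if __name__ == "__main__":
    estimate, se, margin = diff_proportions(100, 400, 127, 400)
    print(f"estimate = {estimate:.4f}, SE = {se:.4f}, margin = {margin:.4f}")
```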

9. A random sample of 45 salespersons were asked how long, on average, they were able to talk to a potential customer. Their answers revealed a mean of 8.5 minutes with a variance of 9 minutes. Construct a 95% confidence interval for μ, the time it takes a salesperson to talk to a potential customer. ANS: 8.5 ± 1.96(3)/√45 = 8.5 ± 0.88; LCL = 7.62, and UCL = 9.38

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

10. A study conducted by a commuter train transportation authority involved surveying a random sample of 200 passengers. The results show that a customer had to wait, on average, 9.3 minutes, with a standard deviation of 6.2 minutes, to buy a ticket. Construct a 95% confidence interval for μ, the true mean waiting time. ANS: 9.3 ± (1.96)(6.2)/√200 = (8.44, 10.16)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

11. A study conducted by the doctors of a particular hospital involved monitoring a random sample of 75 patients. The results showed that it took an average of 3 cc of tranquillizer, with a standard deviation of 0.2 cc, to put a patient to sleep before surgery. Construct a 95% confidence interval for μ, the true mean amount of tranquillizer needed to put any patient to sleep. ANS: 3 ± (1.96)(0.2)/√75 = (2.95, 3.05)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

12. A random sample of 60 people revealed that it took an average of 55 minutes, with a standard deviation of 10 minutes, for a person to complete a loan application at the bank. Construct a 90% confidence interval for μ, the true time it takes any person to complete the loan form. ANS: 55 ± (1.645)(10)/√60 = (52.88, 57.12)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

13. An auto mechanic knows that the average time it takes to replace a car radiator is 70 minutes, with a standard deviation of 12 minutes. This average is based on a random sample of 50. Construct a 90% confidence interval for μ, the true time it takes any auto mechanic to replace a car radiator. ANS: 70 ± (1.645)(12)/√50 = (67.21, 72.79)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

14. A random sample of 80 jars of grape jelly has a mean weight of 568 g, with a standard deviation of 48.28 g. Construct a 99% confidence interval for μ, the true weight of a jar of jelly. ANS: 568 ± (2.575)(48.28)/√80 = (554.1, 581.9)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

15. A study was conducted to see how long Dr. Kennedy’s patients had to wait before their scheduled appointments. A random sample of 33 patients showed the average waiting time was 22 minutes, with a standard deviation of 16 minutes. Construct a 99% confidence interval for μ, the true mean waiting time. ANS: 22 ± (2.575)(16)/√33 = (14.83, 29.17)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5
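Problems 9 through 15 all apply the same large-sample interval, x̄ ± z·s/√n, with only the inputs changing. The sketch below is one way to check those answers; `mean_ci` is a hypothetical helper, and the z-values passed in are the table values used in the answer key (1.96, 1.645, 2.575).

```python
import math


def mean_ci(xbar: float, s: float, n: int, z: float):
    """Large-sample confidence interval for a mean: xbar +/- z * s / sqrt(n)."""
    half_width = z * s / math.sqrt(n)
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    # (xbar, s, n, z) taken from Problems 10, 12, and 15.
    for args in ((9.3, 6.2, 200, 1.96), (55, 10, 60, 1.645), (22, 16, 33, 2.575)):
        low, high = mean_ci(*args)
        print(f"({low:.2f}, {high:.2f})")
```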

16. A college graduate school administrator is interested in knowing what proportion of applicants would like to be accepted into a particular chemistry program. Of a random sample of 75 applicants, 12 requested entry into the program. Construct a 95% confidence interval for p, the true proportion of all applicants who want acceptance into the program. ANS: p̂ = 12/75 = 0.16; p̂ ± 1.96 √(p̂q̂/n) = 0.16 ± (1.96)√((0.16)(0.84)/75) = 0.16 ± 0.083 = (0.077, 0.243)

PTS: 1 REF: 326-327 BLM: Higher Order - Analyze

TOP: 5

17. A lawn service owner is testing new environmentally friendly weed killers. He discovers that a particular weed killer is effective 89% of the time. Suppose that this estimate was based on a random sample of 60 applications. Construct a 90% confidence interval for p, the true proportion of weeds killed by this particular brand. ANS: p̂ = 0.89; 0.89 ± (1.645)√((0.89)(0.11)/60) = 0.89 ± 0.07 = (0.82, 0.96)

PTS: 1 REF: 326-327 BLM: Higher Order - Analyze

TOP: 5

18. Some people claim there are health benefits in eating less meat. A health club committee reported the proportion of vegetarians in their city is 0.13. Suppose this estimate was based on a random sample of 80 people. Construct a 99% confidence interval for p, the true proportion of all vegetarians in this particular city. ANS: p̂ = 0.13; 0.13 ± (2.575)√((0.13)(0.87)/80) = (0.033, 0.227)

PTS: 1 REF: 326-327 BLM: Higher Order - Analyze

TOP: 5
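The proportion intervals in Problems 16 to 18 can be checked the same way. This is a sketch of the large-sample interval p̂ ± z√(p̂q̂/n); `proportion_ci` is an illustrative name, and the z-values are the table values from the answer key.

```python
import math


def proportion_ci(p_hat: float, n: int, z: float):
    """Large-sample confidence interval for a binomial proportion."""
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # (p_hat, n, z) taken from Problems 16, 17, and 18.
    for args in ((12 / 75, 75, 1.96), (0.89, 60, 1.645), (0.13, 80, 2.575)):
        low, high = proportion_ci(*args)
        print(f"({low:.3f}, {high:.3f})")
```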

Shuttle Bus Study Narrative An airport bus driver conducted a study to see what proportion of customers use the shuttle bus to get to and from the parking lot. The results of his study are listed below, where B = customer used the bus and W = customer walked. B

B W B W B W W W B B

B W B B B B B W B W

W W

B B

W W

W B

B B

W B

B B

19. Refer to Shuttle Bus Study Narrative. Construct a 95% confidence interval for p, the true proportion of all people who used the bus. ANS: p̂ = 30/48 = 0.625; 0.625 ± (1.96)√((0.625)(0.375)/48) = 0.625 ± 0.137 = (0.488, 0.762)

PTS: 1 REF: 326-327 BLM: Higher Order - Analyze

TOP: 5

20. Refer to Shuttle Bus Study Narrative. Construct a 90% confidence interval for p, the true proportion of all people who used the bus. ANS: p̂ = 0.625; 0.625 ± (1.645)√((0.625)(0.375)/48) = (0.51, 0.74)

PTS: 1 REF: 326-327 BLM: Higher Order - Apply

TOP: 5

Childcare Expenses Narrative A childcare agency was interested in examining the average amount that families pay per child per month for childcare outside the home. A random sample of 64 families was selected and the mean and standard deviation were computed to be $675 and $80, respectively.

21. Refer to Childcare Expenses Narrative. Find a 95% confidence interval for the true average amount spent per child per month for childcare outside the home. ANS: 675 ± (1.96)(80)/√64 = 675 ± 19.6 = (655.4, 694.6)

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 5

22. Refer to Childcare Expenses Narrative. Interpret the interval in the previous question. ANS: One can estimate with 95% confidence that the true average amount spent per child per month for childcare outside the home is between $655.40 and $694.60. PTS: 1 REF: 324-325 BLM: Higher Order - Understand

TOP: 5

Public versus Private Childcare Expenses Narrative A social worker was interested in determining whether there is a significant difference in the average monthly cost per child for childcare outside the home between publically supported facilities and privately owned facilities. Two independent random samples were selected, yielding the following information:

                                   Sample Size   Sample Mean ($)   Standard Deviation ($)
Publically Supported Facilities    64            725               95
Privately Owned Facilities         64            675               80

23. Refer to Public versus Private Childcare Expenses Narrative. Find a 90% confidence interval for the true difference in average monthly cost of childcare between publically supported and privately owned facilities. ANS: (x̄₁ − x̄₂) ± 1.645 √(s₁²/n₁ + s₂²/n₂) = (725 – 675) ± (1.645)√241.02 = 50 ± 25.538 = ($24.462, $75.538)

PTS: 1 REF: 331 BLM: Higher Order - Analyze

TOP: 5
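The two-sample interval in Problem 23 adds the two estimated variances of the sample means before taking the square root. A minimal sketch, assuming independent large samples; `diff_means_ci` is a hypothetical helper name.

```python
import math


def diff_means_ci(x1, s1, n1, x2, s2, n2, z):
    """Large-sample confidence interval for mu1 - mu2 from independent samples."""
    # Estimated standard error of (xbar1 - xbar2).
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se


if __name__ == "__main__":
    # Publically supported (725, 95, 64) versus privately owned (675, 80, 64), 90% level.
    low, high = diff_means_ci(725, 95, 64, 675, 80, 64, 1.645)
    print(f"({low:.3f}, {high:.3f})")
```

Because the interval excludes 0, it agrees with the conclusion drawn in Problem 25.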

24. Refer to Public versus Private Childcare Expenses Narrative. Interpret the confidence interval in the previous question. ANS: One can estimate with 90% confidence that the difference in average monthly cost per child for childcare outside the home between publically supported and privately owned facilities is roughly between $24.50 and $75.50. PTS: 1 REF: 324-325 | 331 BLM: Higher Order - Understand

TOP: 5

25. Refer to Public versus Private Childcare Expenses Narrative. Can one conclude there is a significant difference in the average cost of childcare between the publically supported facilities and the privately owned facilities? Justify your answer. ANS: Since 0 is not within the limits of the confidence interval, it is not likely the means are the same. Therefore, one can conclude there is a difference in the average cost of childcare between publically supported and privately owned facilities. PTS: 1 REF: 327 | 331 BLM: Higher Order - Evaluate

TOP: 5

26. A statistician knows that the population of light bulb lifetimes is normally distributed and has a standard deviation of 30 hours. A simple random sample of 36 bulbs yields a mean lifetime of 504 hours. Construct and interpret a 99% confidence interval for the mean lifetime of all such bulbs. ANS: 504 ± 2.575(30/√36) = 504 ± 12.875, or 491.125 ≤ μ ≤ 516.875. The statistician can be 99% confident that light bulbs last between 491.125 and 516.875 hours on average. PTS: 1 REF: 323-324 BLM: Higher Order - Apply

TOP: 5

Lifetime of Laptop Batteries Narrative The manufacturer of a particular battery pack for a laptop computer claims the battery pack can function for 8 hours, on average, before having to be recharged. A random sample of 36 such battery packs was selected and tested. The mean and standard deviation were found to be 6 hours and 1.8 hours, respectively. 27. Refer to Lifetime of Laptop Batteries Narrative. Find a 90% confidence interval for the true average time the battery pack can function before having to be recharged.

ANS: \(\bar{x} \pm z_{0.05}\,s/\sqrt{n} = 6 \pm (1.645)(1.8)/\sqrt{36} = 6 \pm 0.4935 = (5.5065, 6.4935)\)

PTS: 1 REF: 323-324 BLM: Higher Order - Apply

TOP: 5

28. Refer to Lifetime of Laptop Batteries Narrative. Based on the interval in the previous question, can the manufacturer’s claim be rejected? Justify your answer. ANS: Since 8 is outside the limits of the 90% confidence interval, one can conclude, with 90% confidence, the claim is in error. PTS: 1 REF: 327 | 323-324 BLM: Higher Order - Evaluate

TOP: 5

= 0.56

(1.96)

= 0.56

TOP: 5

30. Refer to College Beach Volleyball Narrative. Based on the interval in the previous question, can the representative’s claim be rejected? Justify your answer. ANS: Since 0.60 is within the limits of the 95% confidence interval, the claim cannot be rejected. PTS: 1 REF: 324-325 BLM: Higher Order - Evaluate

TOP: 5

NBA Series Narrative A TV pollster believed that 70% of all TV households would be tuned in to Game 6 of the 2009 NBA Championship series between the LA Lakers and the Orlando Magic. A random sample of 500 TV households was selected and 365 indicated they were tuned in to the game. 31. Refer to NBA Series Narrative. Find a 99% confidence interval for the true proportion of TV households that tuned in to the game. ANS: \(\hat{p} = 365/500 = 0.73\), and \(\hat{p} \pm 2.575\sqrt{\hat{p}\hat{q}/n} = 0.73 \pm 2.575\sqrt{(0.73)(0.27)/500} = 0.73 \pm 0.0511 = (0.6789, 0.7811)\).

PTS: 1 REF: 326-327 BLM: Higher Order - Analyze

TOP: 5

32. Refer to NBA Series Narrative. Based on the interval in the previous question, can the pollster’s claim be rejected? Justify your answer. ANS: Since 0.70 is within the limits of the 99% confidence interval, the claim cannot be rejected. PTS: 1 REF: 324-325 BLM: Higher Order - Evaluate

TOP: 5
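The NBA Series interval and the check of the pollster's claim can be reproduced with a short calculation. This is a sketch only, assuming the large-sample interval \(\hat{p} \pm z\sqrt{\hat{p}\hat{q}/n}\) and the table value 2.575; the helper name `z_interval_proportion` is hypothetical.

```python
import math

def z_interval_proportion(successes, n, z):
    """Return (p_hat, lower, upper) for p_hat ± z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin

# NBA Series Narrative: 365 of 500 households tuned in; 99% confidence.
p_hat, lower, upper = z_interval_proportion(365, 500, z=2.575)

# The claim (0.70) cannot be rejected if it lies inside the interval.
claim = 0.70
claim_inside = lower < claim < upper
print(round(p_hat, 2), round(lower, 4), round(upper, 4), claim_inside)
```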

33. A comparison between the average jail time of bank robbers and car thieves yielded the following results (in years), respectively: \(n_1 = 80\), \(\bar{x}_1 = 3.2\), \(s_1 = 0.6\), \(n_2 = 90\), \(\bar{x}_2 = 2.8\), and \(s_2 = 0.7\). Estimate \(\mu_1 - \mu_2\), the difference in mean years of jail time, and find the margin of error for your estimate. ANS: The point estimate of \(\mu_1 - \mu_2\) is \(\bar{x}_1 - \bar{x}_2 = 3.2 - 2.8 = 0.4\). The margin of error is \(1.96\sqrt{s_1^2/n_1 + s_2^2/n_2} = (1.96)(0.0997) = 0.195\). PTS: 1 REF: 331 BLM: Higher Order - Analyze

TOP: 6

34. A parent believes the average height for 14-year-old girls differs from that of 14-year-old boys. Estimate the difference in height between girls and boys, using a 95% confidence interval. The summary data are listed below. Based on your interval, do you think there is a significant difference between the true mean height of 14-year-old girls and boys? Explain.

14-year-old girls’ summary data: \(n_1 = 40\), \(\bar{x}_1 = 155\) cm, \(s_1 = 6.1\) cm

14-year-old boys’ summary data: \(n_2 = 40\), \(\bar{x}_2 = 146\) cm, \(s_2 = 9.1\) cm

ANS: \((\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{s_1^2/n_1 + s_2^2/n_2} = (155 - 146) \pm (1.96)(1.732) = (5.61, 12.39)\). Since this interval does not contain 0, the sample data support the conclusion that the heights of girls and boys differ. PTS: 1 REF: 331 | 327 BLM: Higher Order - Evaluate

TOP: 6

35. A dieter believes that the average number of calories in a homemade peanut butter cookie is more than in a store-bought peanut butter cookie. Estimate the difference in the mean calories between the two types of cookies using a 90% confidence interval. The data are \(n_1 = 40\), \(\bar{x}_1 = 180\), \(s_1 = 2\), \(n_2 = 45\), \(\bar{x}_2 = 179\), and \(s_2 = 4\), respectively.

ANS: \((\bar{x}_1 - \bar{x}_2) \pm 1.645\sqrt{s_1^2/n_1 + s_2^2/n_2} = (180 - 179) \pm (1.645)(0.6749) = (-0.11, 2.11)\) PTS: 1 REF: 331 BLM: Higher Order - Analyze

TOP: 6
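The two-sample intervals in this group of questions all follow the same pattern, which can be sketched in Python. The sketch assumes the large-sample formula \((\bar{x}_1 - \bar{x}_2) \pm z\sqrt{s_1^2/n_1 + s_2^2/n_2}\) with table values of z; the function name `z_interval_mean_diff` is illustrative, not from the text.

```python
import math

def z_interval_mean_diff(xbar1, s1, n1, xbar2, s2, n2, z):
    """Return (standard error, lower, upper) for the difference of two means."""
    se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
    diff = xbar1 - xbar2
    return se, diff - z * se, diff + z * se

# Question 35 (cookies): 90% confidence, z = 1.645.
se, lower, upper = z_interval_mean_diff(180, 2, 40, 179, 4, 45, z=1.645)

# An interval that contains 0 gives no evidence of a difference in means.
contains_zero = lower < 0 < upper
print(round(se, 4), round(lower, 2), round(upper, 2), contains_zero)
```

Because the code carries the unrounded standard error through the calculation, limits for other questions may differ from the keyed answers in the last printed digit.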

36. It is of interest to know if the average time it takes police to reach the scene of an accident differs from that of an ambulance to reach the same accident. Use the summary data listed below to estimate the difference in the times (measured in minutes) between the police and the ambulance, using a 99% confidence interval. Interpret the meaning of the interval thus obtained.

Police: \(n_1 = 60\), \(\bar{x}_1 = 4.2\), \(s_1^2 = 0.08\)

Ambulance: \(n_2 = 55\), \(\bar{x}_2 = 4.5\), \(s_2^2 = 0.10\)

ANS: \((\bar{x}_1 - \bar{x}_2) \pm 2.575\sqrt{s_1^2/n_1 + s_2^2/n_2} = (4.2 - 4.5) \pm (2.575)(0.05614) = (-0.445, -0.155)\). Since this interval does not contain 0, the sample data provide evidence that the times differ. PTS: 1 REF: 331 | 327 BLM: Higher Order - Evaluate

TOP: 6

Laptop Batteries Narrative A computer laboratory manager was in charge of purchasing new battery packs for her lab of laptop computers. She narrowed her choices to two models that were available for her machines. Since the two models cost about the same, she was interested in determining whether there was a difference in the average time the battery packs would function before needing to be recharged. She took two independent random samples and computed the following summary information:

                        Sample Size   Sample Mean   Standard Deviation
Battery Pack Model 1    30            6 hours       1.8 hours
Battery Pack Model 2    30            6.5 hours     2.6 hours

37. Refer to Laptop Batteries Narrative. Find a 95% confidence interval for the difference in average functioning time before recharging in the two models. ANS: \((\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{s_1^2/n_1 + s_2^2/n_2} = (6 - 6.5) \pm (1.96)(0.577) = -0.5 \pm 1.131 = (-1.631, 0.631)\)

PTS: 1 REF: 331 BLM: Higher Order - Analyze

TOP: 6

38. Refer to Laptop Batteries Narrative. Based on the interval in the previous question, can one conclude there is a difference in the true average functioning time before recharging between the two models of battery packs? Justify your answer. ANS: Since 0 is within the limits of the confidence interval, it is possible that the means are the same. Therefore one cannot conclude there is a difference in average functioning time before recharging between the two models of battery packs. PTS: 1 REF: 324-325 | 331 BLM: Higher Order - Evaluate

TOP: 6

Ground Beef Weights Narrative The meat department of a local supermarket packages ground beef using meat trays of two sizes: one designed to hold approximately 700 g of meat, and a larger one that holds approximately 1.4 kg. A random sample of 36 packages of the smaller meat trays produced weight measurements with an average of 715 g and a standard deviation of 90 g. 39. Refer to Ground Beef Weights Narrative. Construct a 99% confidence interval for the average weight of all packages sold in the smaller meat trays by this supermarket. ANS: With n = 36, \(\bar{x} = 715\), s = 90, and \(\alpha = 0.01\), a 99% confidence interval for \(\mu\) is approximated by \(\bar{x} \pm 2.575\,s/\sqrt{n} = 715 \pm 2.575(90)/\sqrt{36} = 715 \pm 38.625\), or \(676.375 < \mu < 753.625\) grams.

PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 6

40. Refer to Ground Beef Weights Narrative. What does the phrase “99% confident” mean?

ANS: In repeated sampling, 99% of all intervals constructed in this manner will enclose \(\mu\). Hence, we are fairly certain that this particular interval contains \(\mu\). (In order for this to be true, the sample must be randomly selected.) PTS: 1 REF: 324-325 BLM: Higher Order - Understand

TOP: 6

41. Refer to Ground Beef Weights Narrative. Suppose that the quality control department of this supermarket chain intends that the amount of ground beef in the smaller trays should be 700 g on average. Should the confidence interval above concern the quality control department? Explain. ANS: No. Since the value \(\mu = 700\) g is contained in the interval in the previous question, it is one of several possible values for \(\mu\). The quality control department would have no reason to be concerned that the trays are being over- or underfilled. PTS: 1 REF: 324-325 BLM: Higher Order - Evaluate

TOP: 6

Online Time Usage Narrative An Internet server conducted a survey of 400 of its customers and found that the average amount of time spent online was 12.5 hours per week, with a standard deviation of 5.4 hours. 42. Refer to Online Time Usage Narrative. Do you think that the random variable x, the number of hours spent online, has a mound-shaped distribution? If not, what shape do you expect? ANS: The distribution of hours spent online is probably skewed to the right, with the majority of people spending a relatively small number of hours online, and with a few people spending a very large number of hours online. PTS: 1 REF: 59 BLM: Higher Order - Analyze

TOP: 6

43. Refer to Online Time Usage Narrative. If the distribution of the original measurements is not normal, you can still use the standard normal distribution to construct a confidence interval for \(\mu\), the average online time for all users of this Internet server. Why? ANS: As long as the sample size n is large (in this case n = 400), the Central Limit Theorem will guarantee the approximate normality of the sampling distribution of \(\bar{x}\), which is the basic statistic used in the large sample confidence interval.

PTS: 1 REF: 322-323 BLM: Higher Order - Understand

TOP: 6

44. Refer to Online Time Usage Narrative. Construct a 95% confidence interval for the average online time for all users of the particular Internet server. ANS: The 95% confidence interval for \(\mu\) is approximated by \(\bar{x} \pm 1.96\,s/\sqrt{n} = 12.5 \pm 1.96(5.4)/\sqrt{400} = 12.5 \pm 0.5292\), or \(11.9708 < \mu < 13.0292\).

PTS: 1 REF: 323-324 BLM: Higher Order - Apply

TOP: 6

45. Refer to Online Time Usage Narrative. If the Internet server claimed that its users averaged 15 hours of use per week, would you agree or disagree? Explain. ANS: Disagree. Since the value \(\mu = 15\) is not contained in the confidence interval, it is not one of the likely values for \(\mu\).

PTS: 1 REF: 324-327 BLM: Higher Order - Evaluate

TOP: 6

46. A study was conducted to compare the mean numbers of police emergency calls per eight-hour shift in two districts of Toronto. Samples of 100 eight-hour shifts were randomly selected from the police records for each of the two regions, and the number of emergency calls was recorded for each shift. The sample statistics are listed below:

Sample Statistic    Region 1    Region 2
Sample size         100         100
Sample mean         2.8         3.5
Sample variance     1.64        2.84

Find a 90% confidence interval for the difference in the mean numbers of police emergency calls per shift between the two districts of the city. Interpret the interval. ANS: The 90% confidence interval for \(\mu_1 - \mu_2\) is approximated by \((\bar{x}_1 - \bar{x}_2) \pm 1.645\sqrt{s_1^2/n_1 + s_2^2/n_2} = (2.8 - 3.5) \pm 1.645(0.2117) = -0.7 \pm 0.3482\), or \(-1.0482 < \mu_1 - \mu_2 < -0.3518\).

One can estimate with 90% confidence that the difference in the mean numbers of police emergency calls per shift between the two districts of Toronto is between –1.0482 and –0.3518. In repeated sampling, 90% of all intervals constructed in this manner will enclose \(\mu_1 - \mu_2\). Hence, we are fairly certain that this particular interval encloses \(\mu_1 - \mu_2\).

PTS: 1 REF: 331 | 324-325 BLM: Higher Order - Evaluate

TOP: 6

47. An experiment was conducted to compare two diets A and B, designed for weight reduction. Two groups of 50 overweight dieters each were randomly selected. One group was placed on diet A and the other on diet B, and their weight losses were recorded over a 30-day period. The means and standard deviations of the weight-loss measurements (in kg) for the two groups are shown in the table.

Sample Statistic             Diet A      Diet B
Sample size                  50          50
Sample mean                  10.7 kg     7.1 kg
Sample standard deviation    1.11 kg     0.80 kg

Find a 95% confidence interval for the difference in mean weight loss for the two diets. Interpret your confidence interval. ANS: The 95% confidence interval for \(\mu_1 - \mu_2\) is approximated by \((\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{s_1^2/n_1 + s_2^2/n_2} = (10.7 - 7.1) \pm 1.96(0.1935) = 3.6 \pm 0.37926\), or \(3.22074 < \mu_1 - \mu_2 < 3.97926\) kg. One can estimate with 95% confidence that the difference in the mean weight loss for the two diets is roughly between 3.22 and 3.98 kg. In repeated sampling, 95% of all intervals constructed in this manner will enclose \(\mu_1 - \mu_2\). Hence, we are fairly certain that this particular interval encloses \(\mu_1 - \mu_2\). PTS: 1 REF: 331 | 324-325 BLM: Higher Order - Evaluate

TOP: 6

Number of Salon Hair Colourings Narrative A stylist at the Hair Care Palace gathered data on the number of hair colourings given on Saturdays and on weekdays. Her results are listed below. Assume the two random samples were independently taken from normal populations.

Saturday: \(n_1 = 50\) and \(x_1 = 14\)

Weekday: \(n_2 = 65\) and \(x_2 = 13\)

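48. Refer to Number of Salon Hair Colourings Narrative. Find the point estimate of \(p_1 - p_2\) and the margin of error. ANS: The point estimate is \(\hat{p}_1 - \hat{p}_2 = (14/50 - 13/65) = 0.08\). The margin of error is \(1.96\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = 1.96\sqrt{(0.28)(0.72)/50 + (0.20)(0.80)/65} = 0.158\).

PTS: 1 REF: 336-337 BLM: Higher Order - Analyze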

TOP: 7

49. Refer to Number of Salon Hair Colourings Narrative. Estimate the difference in the true proportions with a 99% confidence interval. Interpret this interval. ANS: \((\hat{p}_1 - \hat{p}_2) \pm 2.575\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = 0.08 \pm (2.575)(0.0806) = 0.08 \pm 0.208 = (-0.128, 0.288)\). Since this interval contains 0, it is highly possible that there may be no difference in these proportions. PTS: 1 REF: 336-337 | 324-325 BLM: Higher Order - Evaluate

TOP: 7
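The salon calculations in questions 48 and 49 share one standard error, which the following Python sketch makes explicit. It assumes the large-sample formula \(\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}\) and the table values 1.96 and 2.575; the helper name `prop_diff_estimate` is hypothetical.

```python
import math

def prop_diff_estimate(x1, n1, x2, n2):
    """Return (point estimate, standard error) for p1 - p2."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return p1 - p2, se

# Salon narrative: 14 of 50 on Saturdays, 13 of 65 on weekdays.
diff, se = prop_diff_estimate(14, 50, 13, 65)

margin_95 = 1.96 * se                              # question 48
ci_99 = (diff - 2.575 * se, diff + 2.575 * se)     # question 49
contains_zero = ci_99[0] < 0 < ci_99[1]
print(round(diff, 2), round(se, 4), round(margin_95, 3), contains_zero)
```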

Defective Glass Bottles Narrative A manufacturing plant has two assembly lines for producing glass bottles. The plant manager was concerned about whether the proportion of defective bottles differs between the two lines. Two independent random samples were selected and the following summary data computed:

          Sample Proportion of Defectives   Sample Size
Line 1    0.10                              100
Line 2    0.13                              100

50. Refer to Defective Glass Bottles Narrative. Find a 95% confidence interval for the true difference in proportion of defective bottles produced by the two assembly lines. ANS: \((\hat{p}_1 - \hat{p}_2) \pm 1.96\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = (0.10 - 0.13) \pm (1.96)(0.0451) = -0.03 \pm 0.0883 = (-0.1183, 0.0583)\). PTS: 1 REF: 336-337 BLM: Higher Order - Analyze

TOP: 7

51. Refer to Defective Glass Bottles Narrative. Based on the interval in the previous question, can one conclude there is a difference in proportion of defective bottles produced by the two lines? Justify your answer. ANS: Since 0 is within the limits of the confidence interval, it is possible the proportions of defective bottles produced by the two assembly lines are the same. Therefore, one cannot conclude there is a difference in the proportions of defective bottles produced by the two lines. PTS: 1 REF: 324-325 BLM: Higher Order - Evaluate

TOP: 7

52. Instead of paying to support welfare recipients, many people want them to find jobs. If necessary, they want each province to create public service jobs for those who cannot find jobs in private industry. In a survey of 800 voters, 400 Conservatives and 400 Liberals, 75% of the Conservatives and 90% of the Liberals favoured the creation of public service jobs. Use a large-sample estimation procedure to compare the difference between the proportions of Conservatives and Liberals who favour creating public service jobs in the population of registered voters. Explain your conclusions. ANS: \(\hat{p}_1 = 0.75\), \(\hat{p}_2 = 0.90\), \(n_1 = n_2 = 400\). The approximate 95% confidence interval is \((\hat{p}_1 - \hat{p}_2) \pm 1.96\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = (0.75 - 0.90) \pm (1.96)(0.0263) = -0.15 \pm 0.0515\), or \(-0.2015 < p_1 - p_2 < -0.0985\).

Since the value \(p_1 - p_2 = 0\) is not in the confidence interval, it is not likely that \(p_1 = p_2\). We should conclude that there is a difference in the proportion of Conservatives and Liberals who favour creating public service jobs. It appears that the percentage of Liberal voters who favour creating these jobs is higher than the Conservative percentage. PTS: 1 REF: 336-337 | 327 BLM: Higher Order - Evaluate

TOP: 7

53. In a study of the relationship between birth order and university success, an investigator found that 140 in a sample of 200 university graduates were firstborn or only children. In a sample of 120 nongraduates of comparable age and socioeconomic background, the number of firstborn or only children was 66. Estimate the difference between the proportions of firstborn or only children in the two populations from which these samples were drawn. Use a 90% confidence interval and interpret your results. ANS: \(\hat{p}_1 = 140/200 = 0.70\), and \(\hat{p}_2 = 66/120 = 0.55\). The approximate 90% confidence interval is \((\hat{p}_1 - \hat{p}_2) \pm 1.645\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = (0.70 - 0.55) \pm (1.645)(0.0558) = 0.15 \pm 0.092\), or \(0.058 < p_1 - p_2 < 0.242\). One can estimate with 90% confidence that the difference between the proportions of firstborn or only children in the two populations from which these samples were drawn is roughly between 0.06 and 0.24. Since the value \(p_1 - p_2 = 0\) is not in the confidence interval, it is not likely that \(p_1 = p_2\). It would appear that firstborn or only children do achieve somewhat greater university success than those who are not. PTS: 1 REF: 336-337 | 327 BLM: Higher Order - Evaluate

TOP: 7

54. A quality control engineer wants to determine the proportion of defective parts coming off the assembly line. Past experiments, based on large sample sizes, have shown this proportion to be 0.19. What sample size does the engineer need in order to estimate, with 90% confidence, this proportion with a margin of error of 0.12? Justify your conclusion. ANS: \(n \ge (z_{0.05}/B)^2\,pq = (1.645/0.12)^2(0.19)(0.81) = 28.9\). The sample size should be at least 29.

PTS: 1 REF: 326 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

55. A researcher wants to determine the proportion of elm trees in Windsor, Ontario, dying of Dutch elm disease. Past experiments, based on large sample sizes, have shown this proportion to be 0.3. What sample size does the researcher need in order to estimate this proportion to within 0.04 with 95% confidence? Justify your conclusion. ANS: \(n \ge (z_{0.025}/B)^2\,pq = (1.96/0.04)^2(0.3)(0.7) = 504.21\). The sample size should be at least 505.

PTS: 1 REF: 326 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

56. A laboratory technician is interested in the proportion of 1 litre containers used in the lab that are glass. How many containers should be sampled in order to estimate, with 99% confidence, this proportion with a margin of error of less than 0.2? Justify your conclusion. ANS: \(n \ge (z_{0.005}/B)^2\,pq = (2.575/0.2)^2(0.5)(0.5) = 41.44\). The sample size should be at least 42.

PTS: 1 REF: 326 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9
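Questions 54 to 56 all solve \(z\sqrt{pq/n} \le B\) for n and round up. A minimal Python sketch of that step follows, assuming the table values of z used in the answers and the conservative choice p = 0.5 when no prior estimate is given; the function name `sample_size_proportion` is illustrative.

```python
import math

def sample_size_proportion(z, bound, p=0.5):
    """Smallest n satisfying z * sqrt(p * (1 - p) / n) <= bound."""
    return math.ceil((z / bound) ** 2 * p * (1 - p))

print(sample_size_proportion(1.645, 0.12, p=0.19))  # question 54
print(sample_size_proportion(1.96, 0.04, p=0.3))    # question 55
print(sample_size_proportion(2.575, 0.2))           # question 56, p unknown
```

Rounding up rather than to the nearest integer is what keeps the margin of error at or below the stated bound.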

57. A provincial job service employee wishes to estimate the mean number of people who register with the service each week. How many weeks should be sampled in order to estimate \(\mu\), the mean number of weekly registrants? (The employee would like the margin of error to be less than 0.5 with confidence of 0.95. Past records show the weekly standard deviation to be 2.5.) Justify your conclusion. ANS: \(n \ge (z_{0.025}\,\sigma/B)^2 = [(1.96)(2.5)/0.5]^2 = 96.04\). The number of weeks sampled should be at least 97. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

58. In a study of radio listening habits, a station owner would like to estimate the average number of hours that teenagers spend listening each day. If it is reasonable to assume that \(\sigma = 1.3\) hours, how large a sample size is needed to be 90% confident that the sample mean is off by, at most, 30 minutes? Justify your conclusion. ANS: \(n \ge (z_{0.05}\,\sigma/B)^2 = [(1.645)(1.3)/0.5]^2 = 18.29\). The sample size should be at least 19. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

59. Trish attends a yoga class four times each week. She would like to estimate the mean number of minutes of continuous exercise until her heart reaches 90 beats per minute. If it can be assumed that \(\sigma = 1.7\) minutes, how large a sample is needed so that it will be possible to assert with 99% confidence that the sample mean has a margin of error of, at most, 0.62 minutes? Justify your conclusion. ANS: \(n \ge (z_{0.005}\,\sigma/B)^2 = [(2.575)(1.7)/0.62]^2 = 49.85\). The sample size should be at least 50. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9
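The sample-size answers for a single mean in questions 57 to 59 can be verified in the same way. The sketch below assumes the rule \(n \ge (z\sigma/B)^2\) with the table values of z; the name `sample_size_mean` is hypothetical, and the 30-minute bound in question 58 is entered as 0.5 hours to match the units of the standard deviation.

```python
import math

def sample_size_mean(z, bound, sd):
    """Smallest n satisfying z * sd / sqrt(n) <= bound."""
    return math.ceil((z * sd / bound) ** 2)

print(sample_size_mean(1.96, 0.5, 2.5))    # question 57
print(sample_size_mean(1.645, 0.5, 1.3))   # question 58 (30 minutes = 0.5 hours)
print(sample_size_mean(2.575, 0.62, 1.7))  # question 59
```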

60. The postmaster at the Huntington Post Office would like to compare the delivery times to two different locations that are the same distance from Huntington. A random sample of letters is to be divided into two equal groups, the first to be delivered to Location A and the second to be delivered to Location B. Each letter will be delivered on a randomly selected day and the number of days for each letter to arrive at its destination is to be recorded. The measurements for both groups are expected to have a range (variability) of approximately 4 days. If the estimate of the difference in mean delivery times is desired to be correct to within 1 day, with probability equal to 0.99, how many letters must be included in each group? (Assume \(n_1 = n_2 = n\).) Justify your conclusion.

ANS: As noted in the problem, the variability of each group of measurements is the same; hence, \(\sigma_1 = \sigma_2 = \sigma\). Since the range, equal to 4, is approximately equal to \(4\sigma\), we have \(4\sigma = 4\), or \(\sigma = 1\).

\(n \ge (z_{0.005}/B)^2(\sigma_1^2 + \sigma_2^2) = (2.575/1)^2(1 + 1) = 13.26\). The sample size should be at least 14 for each group. PTS: 1 REF: 331 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

61. A researcher wants to compare the average ages at which men and women first get their driver’s licence. A random sample of 75 men yielded a mean and standard deviation of 17.3 and 4.7 years, respectively. A random sample of 96 women yielded a mean and standard deviation of 19.6 and 5.1 years, respectively. If the researcher wants to estimate the mean difference to within 1.5 years with 95% confidence, how large a sample should be taken from each population? (Assume n1 = n2 = n.) Justify your conclusion. ANS: \(n \ge (z_{0.025}/B)^2(s_1^2 + s_2^2) = (1.96/1.5)^2(4.7^2 + 5.1^2) = 82.12\). The sample size should be at least 83 for each population. PTS: 1 REF: 331 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9
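For two groups of equal size, questions 60 and 61 use \(n \ge (z/B)^2(\sigma_1^2 + \sigma_2^2)\). A short Python sketch under that assumption is given here; the function name `sample_size_two_means` is illustrative, and \(\sigma = 1\) in question 60 comes from the range approximation in the answer.

```python
import math

def sample_size_two_means(z, bound, sd1, sd2):
    """Smallest common n satisfying z * sqrt((sd1**2 + sd2**2) / n) <= bound."""
    return math.ceil((z / bound) ** 2 * (sd1 ** 2 + sd2 ** 2))

print(sample_size_two_means(2.575, 1, 1, 1))        # question 60
print(sample_size_two_means(1.96, 1.5, 4.7, 5.1))   # question 61
```

The same function covers a difference of two proportions if each standard deviation is replaced by \(\sqrt{pq}\), which is 0.5 in the conservative case.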

62. A childcare agency was interested in examining the amount that families pay per child per month for childcare outside the home. A random sample of 64 families was selected and the mean and standard deviation were computed to be $675 and $80, respectively. Find a 95% upper confidence bound for the true average amount spent per child per month on childcare outside the home. ANS: \(\bar{x} + z_{0.05}\,s/\sqrt{n} = 675 + (1.645)(80)/\sqrt{64} = 675 + 16.45 = 691.45\). Thus, a 95% upper confidence bound is $691.45. PTS: 1 REF: 323-324 BLM: Higher Order - Analyze

TOP: 8–9

63. The manufacturer of a particular battery pack for a laptop computer claims the battery pack can function for eight hours, on average, before having to be recharged. A random sample of 36 such battery packs was selected and tested. The mean and standard deviation were found to be 6 hours and 1.8 hours, respectively. Find a 95% lower confidence bound for the true average time the battery pack can function before having to be recharged. Interpret this bound. ANS: \(\bar{x} - z_{0.05}\,s/\sqrt{n} = 6 - (1.645)(1.8)/\sqrt{36} = 6 - 0.4935 = 5.5065\). One can estimate with 95% confidence that the true average time the battery pack can function before having to be recharged will be no less than 5.5065 hours. PTS: 1 REF: 322-323 | 340-341 BLM: Higher Order - Analyze

TOP: 8–9

64. A machine produces aluminum tins used in packaging cheese. A random sample of 1000 tins was selected and 43 were found to be defective. Find a 95% upper confidence bound for the true proportion of defective tins produced by the machine. Interpret this bound. ANS: \(\hat{p} = 43/1000 = 0.043\), and \(\hat{p} + 1.645\sqrt{\hat{p}\hat{q}/n} = 0.043 + 1.645\sqrt{(0.043)(0.957)/1000} = 0.043 + 0.0106 = 0.0536\). One can estimate with 95% confidence that the true proportion of defective tins produced by the machine will be no more than 0.0536. PTS: 1 REF: 326 | 324-325 BLM: Higher Order - Analyze

TOP: 8–9
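One-sided bounds such as those in questions 62 to 64 put the whole error probability in one tail, so a 95% bound uses 1.645 rather than 1.96. The Python sketch below illustrates this under the large-sample formulas quoted in the answers; all three function names are hypothetical.

```python
import math

def upper_bound_mean(xbar, s, n, z):
    """One-sided upper confidence bound xbar + z * s / sqrt(n)."""
    return xbar + z * s / math.sqrt(n)

def lower_bound_mean(xbar, s, n, z):
    """One-sided lower confidence bound xbar - z * s / sqrt(n)."""
    return xbar - z * s / math.sqrt(n)

def upper_bound_proportion(successes, n, z):
    """One-sided upper confidence bound p_hat + z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    return p_hat + z * math.sqrt(p_hat * (1 - p_hat) / n)

print(round(upper_bound_mean(675, 80, 64, 1.645), 2))        # question 62
print(round(lower_bound_mean(6, 1.8, 36, 1.645), 4))         # question 63
print(round(upper_bound_proportion(43, 1000, 1.645), 4))     # question 64
```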

65. A manufacturer wishes to estimate the mean time a battery pack will function before needing to be recharged, with a margin of error of no more than 0.5 hours and with probability 0.95. If the standard deviation is known to be 1.5 hours, how many observations should be included in the sample? Justify your conclusion. ANS: \(n \ge (z_{0.025}\,\sigma/B)^2 = [(1.96)(1.5)/0.5]^2 = 34.57\). The sample size should be at least 35. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

66. A process control engineer wishes to estimate the true proportion of defective computer chips, with a margin of error of no more than 0.09 and with probability 0.90. How many observations does the engineer need to include in the sample to achieve his goal? Justify your conclusion. ANS: \(n \ge (z_{0.05}/B)^2\,pq = (1.645/0.09)^2(0.5)(0.5) = 83.52\). The sample size should be at least 84.

PTS: 1 REF: 326 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

67. Suppose you wish to estimate a population mean based on a random sample of n observations, and prior experience suggests that \(\sigma = 13.2\). If you wish to estimate \(\mu\) correct to within 1.8, with probability equal to 0.95, how many observations should be included in your sample? Justify your conclusion. ANS: \(n \ge (z_{0.025}\,\sigma/B)^2 = [(1.96)(13.2)/1.8]^2 = 206.59\). The sample should include at least 207 observations. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

68. A questionnaire is designed to investigate attitudes about political corruption in government. The experimenter would like to survey two different groups, Conservatives and Liberals, and compare the responses to various “yes/no” questions for the two groups. The experimenter requires that the sampling error for the difference in the proportions of “yes” responses for the two groups be no more than 4 percentage points, with confidence equal to 0.95. If the two samples are both the same size, how large should the samples be? Justify your conclusion. ANS: \(n \ge (z_{0.025}/B)^2(p_1 q_1 + p_2 q_2) = (1.96/0.04)^2[(0.5)(0.5) + (0.5)(0.5)] = 1200.5\). Hence each sample should include at least 1201 observations. PTS: 1 REF: 336-337 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

69. You want to estimate the difference in grade point averages between two groups of students, accurate to within 0.20 of a grade point, with confidence equal to 0.95. If the standard deviation of the grade point measurements is approximately equal to 0.5, how many students must be included in each group? (Assume that the groups will be of equal size.) Justify your conclusion. ANS: \(n \ge (z_{0.025}/B)^2(\sigma_1^2 + \sigma_2^2) = (1.96/0.20)^2(0.5^2 + 0.5^2) = 48.02\). Hence, at least 49 students should be included in each group. PTS: 1 REF: 331 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

70. Assume that the population standard deviation of annual incomes of all Manitoba residents is $2500. How many individuals must we include in a simple random sample if we want to be 95% confident that the population mean income lies within $150 of our sample mean income? Justify your conclusion. ANS: \(n \ge (z_{0.025}\,\sigma/B)^2 = [(1.96)(2500)/150]^2 = 1067.11\). The required sample size is at least n = 1068. PTS: 1 REF: 323-324 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

71. An airline executive estimates that 25% of all flights arrive late. How many flights must we include in a simple random sample if we want to be 90% confident that the true population proportion of flights that arrive late lies within 0.01 of our sample proportion? Justify your conclusion. ANS: \(n \ge (z_{0.05}/B)^2\,pq = (1.645/0.01)^2(0.25)(0.75) = 5073.80\). The required sample size is at least n = 5074. PTS: 1 REF: 326 | 342-343 BLM: Higher Order - Analyze

TOP: 8–9

Chapter 9A—Large-Sample Tests of Hypotheses MULTIPLE CHOICE 1. A quality control officer tests bottles of shampoo to see if the filling machines are putting the proper amount in each bottle. After testing a sample of bottles, the quality control officer decides to leave the filling machines operating. However, the filling machines are not operating properly. Which type of error, if any, did the quality control officer commit? a. This is a Type I error. b. This is a Type II error. c. This is a correct decision. d. This is an incorrect decision. ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 368

TOP: 1–3

2. When is a Type I error committed? a. when we reject a true null hypothesis b. when we reject a false null hypothesis c. when we don’t reject a false null hypothesis d. when we don’t reject a true null hypothesis ANS: A BLM: Remember

PTS:

REF: 360 | 368

TOP: 1–3

3. A government testing agency studies aspirin capsules to see if they contain less medication than advertised. Suppose the testing agent concludes the capsules contain a mean amount below the advertised level when in fact the advertised level is the true mean. Which type of error, if any, did the testing agency commit? a. This is a Type I error. b. This is a Type II error. c. This is a correct decision. d. It is impossible to answer this question. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 368

TOP: 1–3

4. If we reject the null hypothesis, what are we concluding? a. that there is not enough statistical evidence to infer that the alternative hypothesis is true b. that there is enough statistical evidence to infer that the alternative hypothesis is true c. that there is enough statistical evidence to infer that the null hypothesis is true d. that the test is statistically insignificant at whatever level of significance the test was conducted ANS: B BLM: Remember

PTS:

REF: 357-358

TOP: 1–3

5. In a criminal trial, when is a Type I error committed? a. when a guilty defendant is acquitted b. when an innocent person is convicted c. when a guilty defendant is convicted d. when an innocent person is acquitted ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 360 | 368

TOP: 1–3

6. How is a Type II error committed? a. by rejecting a true null hypothesis b. by rejecting a false null hypothesis c. by not rejecting a true null hypothesis d. by not rejecting a false null hypothesis ANS: D BLM: Remember

PTS:

REF: 368

TOP: 1–3

7. Which of the following statistics denotes the power of a test? a. \(\alpha\) b. \(\beta\) c. \(1 - \alpha\) d. \(1 - \beta\) ANS: D BLM: Remember

PTS:

REF: 369

TOP: 1–3

8. What is the p-value of a test? a. It is the smallest \(\alpha\) at which the null hypothesis can be rejected. b. It is the largest \(\alpha\) at which the null hypothesis can be rejected. c. It is the smallest \(\alpha\) at which the null hypothesis cannot be rejected. d. It is the largest \(\alpha\) at which the null hypothesis cannot be rejected. ANS: A BLM: Remember

PTS:

REF: 364

TOP: 1–3

9. If we do NOT reject the null hypothesis, what are we concluding? a. that there is not enough statistical evidence to infer that the alternative hypothesis is true b. that there is enough statistical evidence to infer that the alternative hypothesis is true c. that there is enough statistical evidence to infer that the null hypothesis is true d. that the test is statistically insignificant at whatever level of significance the test was conducted ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 357-358

TOP: 1–3

10. There is a close connection between Type I errors, Type II errors, and the power of a test. Which of the following statements is NOT true of those interrelationships? a. The probability of committing a Type II error increases as the probability of committing a Type I error decreases. b. The probability of committing a Type II error and the level of significance are the same. c. The power of the test decreases as the level of significance decreases. ANS: B PTS: 1 REF: 368-369 | 371 BLM: Higher Order - Understand

TOP: 1–3
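The trade-off described in statements a and c of question 10 can be seen numerically. The following Python sketch is an illustration under stated assumptions, not part of the test bank: it uses an upper-tailed z test of \(H_0: \mu = \mu_0\) with known \(\sigma\), and every number in it (\(\mu_0 = 100\), true mean 103, \(\sigma = 10\), n = 36) is hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1

def type_ii_error(alpha, mu0, mu1, sigma, n):
    """beta for the upper-tailed z test of H0: mu = mu0 when the true mean is mu1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)        # critical value of the test
    shift = (mu1 - mu0) / (sigma / math.sqrt(n))   # true mean in standard-error units
    return STD_NORMAL.cdf(z_alpha - shift)         # P(fail to reject H0 | mu = mu1)

# Hypothetical setting: mu0 = 100, true mean 103, sigma = 10, n = 36.
results = [(a, type_ii_error(a, 100, 103, 10, 36)) for a in (0.10, 0.05, 0.01)]
for alpha, beta in results:
    print(f"alpha={alpha:.2f}  beta={beta:.3f}  power={1 - beta:.3f}")
```

As the significance level shrinks, the printed beta grows and the power falls, while beta and alpha are never required to be equal, which is why statement b is the false one.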

11. In a two-tailed test for the population mean, the null hypothesis will be rejected at level of significance \(\alpha\) if which of the following conditions holds for the value of the test statistic z? a. |z|> b. |z|<– c. – <z< d. \(|z| > z_{\alpha/2}\) ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 363-364

TOP: 1–3

12. Consider testing the hypothesis . If the value of the test statistic is equal to 1.36, then what is the p-value? a. 0.1738 b. 0.2066 c. 0.4131 d. 0.9131 ANS: A PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Apply

TOP: 1–3
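The keyed answer to question 12 can be reproduced with the standard normal distribution. The hypothesis statement is missing from the question as it appears here, so the sketch below assumes a two-tailed alternative, which is the reading consistent with the keyed value 0.1738; the function name `two_tailed_p_value` is illustrative.

```python
from statistics import NormalDist

def two_tailed_p_value(z):
    """p-value for a two-tailed z test: 2 * P(Z > |z|)."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Question 12: observed test statistic z = 1.36 (two-tailed test ASSUMED).
p = two_tailed_p_value(1.36)
print(round(p, 4))
```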

13. If a hypothesis is rejected at the 0.05 level of significance, what can be deduced from that? a. The hypothesis must be rejected at any level. b. The hypothesis must be rejected at the 0.02 level. c. The hypothesis must not be rejected at the 0.02 level. d. The hypothesis may or may not be rejected at the 0.02 level. ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 365

TOP: 1–3

14. In testing the hypothesis , the following information is known: n = 64, \(\bar{x} = 78\), and \(\sigma = 10\). In this case, what is the value of the test statistic? a. +2.4 b. +1.96 c. –1.96 d. –2.4 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 363-364

TOP: 1–3
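The test statistic in question 14 follows from \(z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})\). The hypothesized mean is missing from the question as it appears here, so the Python sketch below uses \(\mu_0 = 75\) purely as an assumption: it is the value that reproduces the keyed answer of +2.4. The function name `z_statistic` is hypothetical.

```python
import math

def z_statistic(xbar, mu0, sd, n):
    """Large-sample test statistic z = (xbar - mu0) / (sd / sqrt(n))."""
    return (xbar - mu0) / (sd / math.sqrt(n))

# Question 14: n = 64, sample mean 78, standard deviation 10.
# mu0 = 75 is an ASSUMPTION; the hypothesis is not recoverable from the text.
z = z_statistic(78, 75, 10, 64)
print(z)
```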

15. Which of the following p-values will lead us to reject the null hypothesis if the level of significance is \(\alpha = 0.05\)? a. 0.025 b. 0.05 c. 0.10 d. 0.20 ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 365

TOP: 1–3
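The p-value decision rule behind questions 13 and 15 fits in a few lines of Python. This sketch assumes the strict rule "reject when the p-value is less than \(\alpha\)", which matches the keyed answer to question 15 (some texts reject when the p-value is less than or equal to \(\alpha\)); the p-values 0.03 and 0.01 are hypothetical.

```python
def reject_null(p_value, alpha):
    """Reject H0 when the p-value falls below the significance level."""
    return p_value < alpha

# Question 15: only the first option leads to rejection at alpha = 0.05.
decisions = {p: reject_null(p, 0.05) for p in (0.025, 0.05, 0.10, 0.20)}
print(decisions)

# Question 13: rejection at 0.05 does not settle the decision at 0.02.
for p in (0.03, 0.01):  # hypothetical p-values, both below 0.05
    print(p, reject_null(p, 0.05), reject_null(p, 0.02))
```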

16. Which of these terms is a proposition which is tentatively advanced as being possibly true? a. an acceptance region b. a confidence level c. a hypothesis d. a p-value ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 357-358

TOP: 1–3

17. What is the first step in hypothesis testing? a. formulating two opposing hypotheses, called the null and the alternative hypotheses b. selecting a test statistic c. calculating the p-value d. determining the rejection region ANS: A BLM: Remember

PTS:

REF: 357-358

TOP: 1–3

18. Which of the following does NOT correctly describe an alternative hypothesis about a population mean \(\mu\)? a. b. c. d. ANS: B BLM: Remember

PTS:

REF: 363-364

TOP: 1–3

19. Which of the following is an example of a null hypothesis? a. This industrial process makes windshields having an average length equal to 33 inches. b. The average quantity of detergent put into a box by this filling machine is not 1 pound. c. This shipping company’s average delivery time is different from 3 days. d. The average thickness of aluminum sheets is not 0.03 inches, as required. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 363-364

TOP: 1–3

20. In hypothesis testing, to what does the term “critical value” refer? a. the probability, 1 – α, of avoiding the Type I error of erroneously rejecting a null hypothesis that is in fact true b. the value of a test statistic that divides all possible values into an acceptance region and a rejection region c. any sample result that leads to the continued acceptance of the null hypothesis because it has a high probability of occurring when the null hypothesis is true d. the probability, 1 – β, of avoiding the Type II error of erroneously accepting a null hypothesis that is in fact false ANS: B BLM: Remember

PTS:

REF: 360 | 368

TOP: 1–3

21. In a hypothesis test involving the population mean, which of the following would be an acceptable formulation? a. vs. b. vs. c. vs. d. vs. >100

ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 363-364

TOP: 1–3

22. If you wish to determine whether there is evidence that the average starting salary for finance graduates exceeds $42,000, how would you formulate the null and alternative hypotheses? a. so that either a two-tailed or one-tailed test could be used b. so that a two-tailed test should be used c. so that a one-tailed test should be used d. so that the probability of committing a Type II error would be equal to 0.42 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 363-364

TOP: 1–3

23. In a two-tailed test, if the p-value is less than the probability of committing a Type I error, what can you conclude? a. A one-tailed test should be used. b. The null hypothesis should be rejected. c. The null hypothesis should not be rejected. d. Another sample should be selected at random from the population. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 365

TOP: 1–3

24. If a hypothesis test is to be conducted using α = 0.025, what does that imply about the test? a. There is a 2.5% chance that the null hypothesis is true. b. There is a maximum 2.5% chance that a false null hypothesis will be rejected. c. There is a maximum 2.5% chance that a true null hypothesis will be rejected. d. There is 2.5% chance of committing a Type I error and 97.5% chance of committing a Type II error. ANS: C BLM: Higher Order - Understand

PTS:

REF: 365

TOP: 1–3

25. In testing vs. which of the following would happen if the sample size were increased? a. The sampling distribution of the sample mean would have the same variability. b. There would be no effect on the level of significance for the test. c. This would have an effect on whether the null hypothesis is true or not. ANS: B TOP: 1–3

PTS: 1 REF: 360 | 363-364 BLM: Higher Order - Understand

26. In formulating the null and alternative hypotheses, which of the following would be an acceptable null hypothesis? a. The population mean is greater than 20. b. The population mean is smaller than 20. c. The population mean is equal to 20. d. The population mean is equal to 0. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 363-364

TOP: 1–3

27. If the probability of committing a Type I error for a given test is to be decreased, then for a fixed sample size n, what will happen? a. The power of the test will increase. b. The probability of committing a Type II error will increase. c. The probability of committing a Type II error will decrease. d. A two-tailed test must be used. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 368 | 371

TOP: 1–3
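The trade-off in question 27 can be checked numerically. The sketch below computes the Type II error probability for an upper-tailed z-test; every number in it (null mean 75, true mean 78, standard deviation 10, sample sizes 25 and 100) is a hypothetical value chosen only for illustration, and the function name is not from the text.

```python
import math
from statistics import NormalDist

def type_ii_error(alpha, mu0, mu_true, sigma, n):
    """P(fail to reject H0: mu = mu0) for an upper-tailed z-test
    when the true mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)            # rejection cutoff
    shift = (mu_true - mu0) / (sigma / math.sqrt(n))  # true mean in SE units
    return std_normal.cdf(z_crit - shift)

# All inputs below are hypothetical illustration values.
beta_05 = type_ii_error(0.05, mu0=75, mu_true=78, sigma=10, n=25)
beta_01 = type_ii_error(0.01, mu0=75, mu_true=78, sigma=10, n=25)
beta_01_big = type_ii_error(0.01, mu0=75, mu_true=78, sigma=10, n=100)
print(round(beta_05, 3), round(beta_01, 3), round(beta_01_big, 3))
```

With n held fixed, lowering the significance level from 0.05 to 0.01 raises the Type II error probability; raising n at the same level brings it back down.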

28. Suppose that we reject the null hypothesis at the 0.05 level of significance. For which of the following α-values do we also reject the null hypothesis? a. 0.02 b. 0.04 c. 0.06 ANS: C TOP: 4

PTS: 1 REF: 374-375 | 365 BLM: Higher Order - Understand

29. If you wish to construct a confidence interval estimate for the difference between two population means, what would an increase in the sample sizes used result in? a. a decrease in the critical value z b. a narrower confidence interval c. a wider confidence interval d. a confidence interval that contains 0 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 374-376

TOP: 4

30. If you wish to estimate the difference between two population means using two independent large samples, the 90% confidence interval estimate can be constructed using which of the following critical values? a. 2.33 b. 1.96 c. 1.645 d. 1.28 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 374-376

TOP: 4

31. In testing vs. the test statistic value z is found to be 1.69. What is the p-value of the test? a. 0.0455 b. 0.0910 c. 0.1977 d. 0.3023 ANS: A TOP: 4

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

32. Two independent samples of sizes 40 and 50 are randomly selected from two populations to test the difference between the population means. Which of the following best describes the sampling distribution of the sample mean difference? a. It is normally distributed. b. It is approximately normal. c. It is student t-distributed with 88 degrees of freedom. d. It is student t-distributed with 90 degrees of freedom. ANS: B TOP: 4

PTS: 1 REF: 331 | 374-375 BLM: Higher Order - Understand

33. When testing vs. , the observed value of the z-score was found to be –2.15. What would the p-value for this test be? a. 0.0158 b. 0.0316 c. 0.9684 d. 0.9842 ANS: A TOP: 4

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

34. In testing the difference between two population means using two independent samples, the population standard deviations were assumed to be known and the calculated test statistic equalled 2.56. If the test was two-tailed and a 5% level of significance had been specified, what would be the most appropriate conclusion from the findings? a. to reject the null hypothesis b. not to reject the null hypothesis c. to choose two other independent samples d. to accept the null hypothesis ANS: A PTS: 1 BLM: Higher Order - Evaluate

REF: 374-375

TOP: 4

35. When testing vs. , the observed value of the z-score was found to be –2.15. Which of the following is the p-value for this test? a. 0.0158 b. 0.0316 c. 0.9684 d. 0.9842 ANS: C TOP: 4

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

36. In testing for differences between the means of two independent populations, what should the null hypothesis be? a. b. c. d. ANS: B BLM: Remember

PTS:

REF: 374-375

TOP: 4

37. When testing vs. , the observed value of the z-score was found to be –2.15. What is the p-value for this test? a. 0.0158 b. 0.0316 c. 0.9684 d. 0.9842 ANS: B TOP: 4

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply
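The p-value attached to an observed z-score depends on the form of the alternative hypothesis. As a minimal sketch, the hypothetical helper below evaluates z = –2.15 under a lower-tailed, an upper-tailed, and a two-tailed alternative using the standard normal distribution from Python's standard library; it does not assume which alternative each question above intended.

```python
from statistics import NormalDist

def p_value(z, tail):
    """p-value of a z-test; tail is 'lower', 'upper', or 'two'."""
    lower_area = NormalDist().cdf(z)
    if tail == "lower":
        return lower_area
    if tail == "upper":
        return 1 - lower_area
    if tail == "two":
        # Double the smaller of the two tail areas.
        return 2 * min(lower_area, 1 - lower_area)
    raise ValueError("tail must be 'lower', 'upper', or 'two'")

for tail in ("lower", "upper", "two"):
    print(tail, round(p_value(-2.15, tail), 4))
```

The two-tailed value is exactly twice the smaller one-tail area, which is the same doubling used when software reports only a one-tail area.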

38. The necessary conditions having been met, a two-tailed test is being conducted to test the difference between two population means, but your statistical software provides only a one-tail area of 0.036 as part of its output. What is the p-value for this test? a. 0.009 b. 0.018 c. 0.072 d. 0.964 ANS: C TOP: 4

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Understand

39. What is the rejection region for testing at the 0.05 level of significance? a. |z| < 1.28 b. |z| > 1.96 c. z > 1.645 d. z < 2.33 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 379-380

TOP: 5

40. In a hypothesis test involving a population proportion, which of the following would be an acceptable formulation? a. vs. b. vs. c. vs. ANS: B BLM: Remember

PTS:

REF: 379-380

TOP: 5

41. In testing vs. a random sample of size 200 produced a sample proportion Given these results, what is the approximate p-value of the test? a. 0.0384 b. 0.0768 c. 0.4616 d. 0.5384 ANS: B TOP: 5

PTS: 1 REF: 379-380 | 720-721 BLM: Higher Order - Apply

42. In selecting the sample size to estimate the population proportion p, if we have no knowledge of even the approximate values of the sample proportion p̂, what should we do? a. take another sample and estimate p̂ b. take two more samples and find the average of their p̂’s c. let p̂ = 0.50 d. let p̂ = 0.95 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 379-380

TOP: 5

43. Which of the following conditions must hold before one can make use of the standard normal distribution for constructing a confidence interval estimate for the population proportion p? a. np̂ and n(1 – p̂) are both greater than 5, where p̂ denotes the sample proportion b. np and n(1 – p) are both greater than 5 c. n(p + p̂) and n(p – p̂) are both greater than 5, where p̂ denotes the sample proportion d. the sample size is greater than 5 ANS: A BLM: Remember

PTS:

REF: 382

TOP: 5

44. What would be the lower limit of a confidence interval, at the 95% level of confidence, for the population proportion if a sample of size 200 had 40 successes? a. 0.2554 b. 0.2465 c. 0.1535 d. 0.1446 ANS: D TOP: 5

PTS: 1 REF: 381-382 | 720-721 BLM: Higher Order - Apply
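The interval in question 44 can be reproduced with a short sketch of the usual large-sample confidence interval for a proportion. The function name is hypothetical, and the critical value is taken from the standard normal quantile function rather than a printed table.

```python
import math
from statistics import NormalDist

def proportion_ci(successes, n, confidence):
    """Large-sample confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. about 1.96 for 95%
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# 40 successes in a sample of 200, 95% confidence, as in question 44.
lower, upper = proportion_ci(successes=40, n=200, confidence=0.95)
print(round(lower, 4), round(upper, 4))
```

The lower limit rounds to the value in choice d, and the upper limit rounds to the value listed as choice a.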

45. Assuming that all necessary conditions are met, what needs to be changed in the formula so that we can use it to construct a confidence interval estimate for the population proportion p? a. The should be replaced by p. b. The

should be replaced by

c. The

should be replaced by

d. The

should be replaced by

ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 381-382

TOP: 5

46. From a sample of 400 items, 14 are found to be defective. In this case, what is the point estimate of the population proportion defective? a. 0.035 b. 0.05 c. 14 d. 28.57 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 379-380

TOP: 5

47. Which of the following would be an appropriate null hypothesis to test a proportion? a. The sample proportion is equal to 0.60. b. The population proportion is greater than 0.60. c. The population proportion is equal to 0.60. d. The population proportion is not equal to 0.60. ANS: C BLM: Remember

PTS:

REF: 379-380

TOP: 5

48. Which of the following would be an appropriate alternative hypothesis to test a proportion? a. The population proportion is less than 0.65. b. The sample proportion is less than 0.65. c. The population proportion is equal to 0.65. d. The sample proportion is equal to 0.65. ANS: A BLM: Remember

PTS:

REF: 379-380

TOP: 5

49. A survey claims that 9 of 10 doctors recommend aspirin for their patients with headaches. To test this claim against the alternative, that the actual proportion of doctors who recommend aspirin is less than 0.90, a random sample of 100 doctors results in 83 who indicate that they recommend aspirin. What is the approximate value of the test statistic in this problem? a. –2.33 b. –1.86 c. –1.67 d. –0.14 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 379-380

TOP: 5
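The test statistic in question 49 follows from the one-sample z-test for a proportion, where the standard error is computed from the hypothesized value rather than the sample proportion. A minimal sketch, with a hypothetical function name:

```python
import math

def one_proportion_z(successes, n, p0):
    """z statistic for testing H0: p = p0 with a large sample."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)  # uses p0, not p_hat
    return (p_hat - p0) / standard_error

# 83 of 100 doctors, hypothesized proportion 0.90, as in question 49.
z = one_proportion_z(successes=83, n=100, p0=0.90)
print(round(z, 2))
```

Using the sample proportion in the standard error instead would give a different value, which is one way the distractors in such questions can arise.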

50. In constructing a confidence interval estimate for the difference between two population proportions, what is the most appropriate option? a. Pool the population proportions when the populations are normally distributed. b. Pool the population proportions when the population means are equal. c. Pool the population proportions when they are equal. d. Never pool the population proportions to construct a confidence interval for the difference. ANS: D TOP: 6–7

PTS: 1 BLM: Remember

REF: 386-387 | 376

51. A sample of size 100 selected from one population has 60 successes, and a sample of size 150 selected from a second population has 95 successes. In this case, what is the test statistic for testing the equality of the population proportions? a. –0.5319 b. –0.419 c. 0.2702 d. 0.7293 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 384-387

TOP: 6–7
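For question 51, where the null hypothesis is that the two population proportions are equal, the standard error uses the pooled proportion. The sketch below shows that calculation; the function name is hypothetical.

```python
import math

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 = p2, using the pooled proportion estimate."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error

# 60 of 100 successes versus 95 of 150, as in question 51.
z = pooled_two_proportion_z(x1=60, n1=100, x2=95, n2=150)
print(round(z, 4))
```

Pooling is appropriate here only because the null hypothesis asserts a common proportion.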

52. For testing the difference between two population proportions, under what circumstances should the pooled proportion estimate be used to compute the value of the test statistic? a. when the populations are normally distributed b. when the sample sizes are small c. when the samples are independently drawn from the populations d. when the null hypothesis states that the two population proportions are equal ANS: D BLM: Remember

PTS:

REF: 384-385

TOP: 6–7

53. In testing the null hypothesis , if H0 is false, which of the following errors could the test lead to? a. a Type I error b. a Type II error c. a Type III error d. a Type IV error ANS: B TOP: 6–7

PTS: 1 BLM: Remember

REF: 368 | 384-385

54. Which of the following is a required condition for using the normal approximation to the binomial in testing the difference between two population proportions? a. and b. and c. d. ANS: C BLM: Remember

and and PTS:

REF: 385

TOP: 6–7

55. A sample of size 150 from population 1 has 40 successes. A sample of size 250 from population 2 has 30 successes. What is the value of the test statistic for testing the null hypothesis that the proportion of successes in population one exceeds the proportion of successes in population two by 0.05? a. 1.645 b. 1.960 c. 1.977 d. 2.327 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 384-387

TOP: 6–7
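Question 55 tests a nonzero hypothesized difference, so the proportions are not pooled and each sample contributes its own variance term. A minimal sketch under that reading, with a hypothetical function name:

```python
import math

def unpooled_two_proportion_z(x1, n1, x2, n2, hypothesized_diff):
    """z statistic for H0: p1 - p2 = hypothesized_diff (nonzero), no pooling."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    standard_error = math.sqrt(
        p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2
    )
    return (p1_hat - p2_hat - hypothesized_diff) / standard_error

# 40 of 150 versus 30 of 250, hypothesized difference 0.05, as in question 55.
z = unpooled_two_proportion_z(x1=40, n1=150, x2=30, n2=250, hypothesized_diff=0.05)
print(round(z, 3))
```

The result rounds to the value in choice d.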

56. The necessary conditions having been met, a two-tailed test is being conducted to test the difference between two population proportions. If the value of the test statistic is 2.05, what is the p-value? a. 0.4798 b. 0.2399 c. 0.0404 d. 0.0202 ANS: C TOP: 6–7

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

TRUE/FALSE 1. A Type I error for a statistical test is committed if we reject the null hypothesis when it is true. ANS: T BLM: Remember

PTS:

REF: 360 | 368

TOP: 1–3

2. The p-value of a statistical test measures the actual risk of committing a Type I error.

ANS: T BLM: Remember

PTS:

REF: 364

TOP: 1–3

3. A Type II error for a statistical test is committed if we do not reject the null hypothesis when it is false. ANS: T BLM: Remember

PTS:

REF: 368

TOP: 1–3

4. The p-value of a statistical test is the largest value of the significance level α for which the null hypothesis can be rejected. ANS: F BLM: Remember

PTS:

REF: 364

TOP: 1–3

5. If the power of a statistical test is 0.9207, then the probability of accepting a false null hypothesis is 0.0793. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 369

TOP: 1–3

6. For a fixed sample size n, as the probability of a Type II error β decreases, the probability of a Type I error α increases. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 371

TOP: 1–3

7. A two-tail test is a test in which a null hypothesis can be rejected by an extreme result occurring in only one direction. ANS: F BLM: Remember

PTS:

REF: 359

TOP: 1–3

8. A Type I error is represented by , and is the probability of not rejecting a false null hypothesis. ANS: F BLM: Remember

PTS:

REF: 360 | 368

TOP: 1–3

9. As the significance level α increases, the probability of a Type I error increases and the size of the rejection region increases. ANS: T TOP: 1–3

PTS: 1 REF: 360 | 363-364 BLM: Higher Order - Understand

10. Reducing the probability of a Type I error also reduces the probability of a Type II error. ANS: F BLM: Higher Order - Understand

PTS:

REF: 371

TOP: 1–3

11. In a one-tailed test, the p-value is found to be equal to 0.036. If the test had been two-tailed, the p-value would have been 0.072. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 365

TOP: 1–3

12. In order to calculate the p-value associated with a test, it is necessary to know the level of significance α. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 364

TOP: 1–3

13. A professor of mathematics refutes the claim that the average student spends 4.5 hours studying for the final comprehensive exam. To test the claim, she should use the hypothesis

ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 363-364

TOP: 1–3

14. The necessary conditions having been met, a two-tailed test is being conducted to test the difference between two population means, but your statistical software provides only a one-tail area of 0.156 as part of its output. The p-value for this test will be 0.078. ANS: F TOP: 1–3

PTS: 1 REF: 365 | 374-375 BLM: Higher Order - Understand

15. Hypothesis testing is a systematic approach to assessing tentative beliefs about reality, which involves confronting those beliefs with evidence and deciding, in light of this evidence, whether the beliefs can be maintained as reasonable or must be discarded as untenable. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 357-358

TOP: 1–3

16. The null hypothesis is a vehicle for making startling new claims that contradict the conventional wisdom, that assert “guilt without a reasonable doubt.” ANS: F BLM: Remember

PTS:

REF: 357-358

TOP: 1–3

17. If the null hypothesis, H0, cannot be corroborated in a hypothesis test, the alternative hypothesis is tentatively embraced, which calls for action; thus, one can think of the alternative hypothesis as the action hypothesis. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 357-358

TOP: 1–3

18. If H0 is corroborated in a hypothesis test, no action need be taken by anyone. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 357-358

TOP: 1–3

19. A hypothesis that specifies a single value for the unknown parameter is called a point estimate. ANS: F TOP: 1–3

PTS: 1 REF: 357-358 | 313 BLM: Higher Order - Understand

20. A hypothesis that specifies a range of values for the unknown parameter is called an interval estimate. ANS: F TOP: 1

PTS: 1 REF: 357-358 | 320 BLM: Higher Order - Understand

21. A p-value is a statistic computed from a simple random sample taken from the population of interest in a hypothesis test and then used for establishing the probable truth or falsity of the null hypothesis. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 364

TOP: 1–3

22. The p-value in hypothesis testing equals the probability, 1 – α, of avoiding the Type I error of erroneously rejecting a null hypothesis that is in fact true. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 364

TOP: 1–3

23. A set consisting of values that support the alternative hypothesis and lead to rejecting the null hypothesis is called the rejection region. ANS: T BLM: Remember

PTS:

REF: 360

TOP: 1–3

24. In hypothesis testing, the statement of the null hypothesis always contains the equality sign. ANS: T PTS: 1 REF: 358-359 | 363-364 | 374 | 379-380 | 384-385 BLM: Remember

TOP: 1–3

25. In hypothesis testing, the decision to “accept” the null hypothesis is the same as the decision to “fail to reject” the null hypothesis. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 358 | 389

TOP: 1–3

26. A two-tailed test of hypothesis for a population mean μ with a significance level α equal to 0.05 will have a critical value z equal to 0.475. ANS: F TOP: 1–3

PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Apply

27. In a hypothesis test, if the null hypothesis has been rejected, then it is impossible that a Type II error has been committed. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 368

TOP: 1–3

28. In a one-tailed test, the larger the significance level α, the larger the critical value will be. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 360

TOP: 1–3

29. If you wish to conduct a hypothesis test using a small significance level α, you should increase your sample size to lower the probability of making a Type II error. ANS: T TOP: 1–3

PTS: 1 REF: 368 | 371 | 389 BLM: Higher Order - Understand

30. Decision makers have more control over Type I error than Type II error. ANS: T BLM: Remember

PTS:

REF: 368 | 389

TOP: 1–3

31. If the null hypothesis is actually false, then a hypothesis test may result in a Type II error. ANS: T BLM: Remember

PTS:

REF: 368

TOP: 1–3

32. If a hypothesis test leads to incorrectly rejecting the null hypothesis, a Type II error has been committed. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 368

TOP: 1–3

33. If a hypothesis test leads to incorrectly accepting the null hypothesis, a Type I error has been committed. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 360 | 368

TOP: 1–3

34. A statistics professor has claimed that her top student will average more than 98 points in the final exam. If you wish to test this claim, you would formulate the following null and alternative hypotheses:

vs.

ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 363-364

TOP: 1–3

35. If the probability of committing a Type I error is set at 0.10, then the probability of committing a Type II error will be 0.90, since the sum of probabilities must be 1.0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 368

TOP: 1–3

36. A two-tailed hypothesis test of the population mean is used when the alternative hypothesis takes the form ANS: T BLM: Remember

PTS:

REF: 362-363

TOP: 1–3

37. With all other factors held constant, the chance of committing a Type II error increases if the true population mean

is closer to the hypothesized value

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 371

TOP: 1–3

38. If a decision maker is concerned that the chance of committing a Type II error is large, one way to reduce the risk is to increase the significance level. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 368-369

TOP: 1–3

39. The p-value or observed significance level measures the strength of the evidence against the alternative hypothesis. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 364

TOP: 1–3

40. If the p-value for a hypothesis test is less than a preassigned significance level α, then the null hypothesis can be rejected, and you can report that the results are statistically significant at level α. ANS: T BLM: Remember

PTS:

REF: 365 | 389

TOP: 1

41. Type II error is typically greater for two-tailed hypothesis tests than for one-tailed tests. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 367-368

TOP: 1–3

42. The p-value for a hypothesis test is set up by the decision maker to minimize the probability of committing a Type I error.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 364-365

TOP: 1–3

43. The probability of correctly accepting a true null hypothesis equals 1 – α and is called the confidence level of the hypothesis test. ANS: T BLM: Remember

PTS:

REF: 376

TOP: 1–3

44. The probability of making the Type II error of incorrectly rejecting a true null hypothesis equals α and is called the test’s significance level, or α risk. ANS: F BLM: Remember

PTS:

REF: 360 | 368

TOP: 1–3

45. If we reject the null hypothesis at the 0.01 level of significance, then we must also reject it at the 0.05 level. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 365 | 374

TOP: 4

46. In testing the difference between two population means using two independent samples, we use the pooled variance in estimating the standard error of the sampling distribution of the sample mean difference if the populations are normal with equal variances.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 374-375

TOP: 4

47. The z-test can be used to determine whether two population means are equal. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 374-375

TOP: 4

48. In estimating the difference between two population means, if a 90% confidence interval includes 0, then we can be 90% certain that the difference between the two population means is 0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 376

TOP: 4

49. Increasing the sizes of the samples in a study to estimate the difference between two population means will increase the probability of committing a Type I error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 371

TOP: 4

50. In estimating the difference between two population means, the estimate for the standard deviation of the sampling distribution of the sample mean difference is found by taking the square root of the sum of the two sample variances. ANS: F BLM: Remember

PTS:

REF: 374-375

TOP: 4

51. With all other factors held constant, increasing the confidence level for a confidence interval estimate for the difference between two population means will result in a wider confidence interval estimate. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 376

TOP: 4

52. In estimating the difference between two population means, the following summary statistics were found: and Based on these results, the point estimate of is 0.70.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 374-375

TOP: 4

53. The significance level in a hypothesis test for the difference between two population means is the same as the probability of committing a Type I error. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 360 | 368

TOP: 4

54. If you wish to test whether two population means are the same, the appropriate null and alternative hypotheses would be vs. ANS: T BLM: Remember

PTS:

REF: 374-375

TOP: 4

55. In testing the difference between two population means using two independent samples, the sampling distribution of the sample mean difference is normal if the sample sizes are both greater than 30. ANS: F BLM: Remember

PTS:

REF: 374-375

TOP: 4

56. In testing the difference between two population means using two independent samples, the population standard deviations are assumed to be known, and the calculated test statistic equals 2.75. If the test is two-tailed and 5% level of significance has been specified, the conclusion should be not to reject the null hypothesis. ANS: F TOP: 4

PTS: 1 REF: 374-375 | 720-721 BLM: Higher Order - Analyze

57. The sampling distribution of the sample mean difference is normal if the sampled populations are normal, and approximately normal if the populations are non-normal and the sample sizes n1 and n2 are large. ANS: T BLM: Remember

PTS:

REF: 331

TOP: 4

58. When we test for differences between the means of two independent populations, we can use only a two-tailed test. ANS: F BLM: Remember

PTS:

REF: 374-375

TOP: 4

59. The probability of making the Type I error of incorrectly accepting a false null hypothesis equals and is called the -risk. ANS: F BLM: Remember

PTS: 1

REF: 368

TOP: 1–3

60. The probability of correctly rejecting a false null hypothesis equals and is called the power of the hypothesis test. ANS: F BLM: Remember

PTS:

REF: 368-369

TOP: 1–3

61. When formulating a hypothesis test, the null hypothesis should be written in such a way so as to minimize the probability of committing a Type I error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 360

TOP: 1–3

62. When formulating a hypothesis test about a population mean, the alternative hypothesis should avoid using an equality. ANS: T BLM: Remember

PTS:

REF: 363-364

TOP: 1–3

63. The probabilities of committing Type I and Type II errors are related such that when one is increased, the other will increase as well. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 371

TOP: 1–3

64. The decision maker controls the probability of committing a Type I error. ANS: T BLM: Remember

PTS:

REF: 360 | 389

TOP: 1–3

65. If the null hypothesis is rejected at the 0.05 level of significance, it must be rejected at the 0.01 level. ANS: F TOP: 5

PTS: 1 REF: 360 | 363 | 379-380 BLM: Higher Order - Understand

66. In testing vs. any p-value greater than 0.025 will lead to a rejection of the null hypothesis. ANS: F TOP: 5

PTS: 1 REF: 365 | 379-380 BLM: Higher Order - Understand

67. A one-tailed hypothesis test of the population proportion is used when the alternative hypothesis takes the form ANS: F BLM: Remember

PTS:

REF: 379-380

TOP: 5

68. In testing vs. the level of significance must be twice as large as when testing vs. ANS: F TOP: 5

PTS: 1 REF: 379-380 | 389 BLM: Higher Order - Understand

69. A two-tailed hypothesis test of the population proportion takes the form vs. ANS: F BLM: Remember

PTS:

REF: 379-380

TOP: 5

70. The campaign manager for the Conservatives believes that more than 52% of the registered voters will vote Conservative. If you wish to test this claim, the appropriate null and alternative hypotheses are vs. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 379-380

TOP: 5

71. When testing vs. , an increase in the sample size will result in a decrease in the probability of committing a Type I error. ANS: F TOP: 5

PTS: 1 REF: 371 | 379-380 BLM: Higher Order - Understand

72. In testing vs. a random sample of size 100 produced a sample proportion of Given these results, the test statistic value is z = –0.655.

ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 379-380

TOP: 5

73. In testing vs. the test statistic value is found to be equal to 1.20. The p-value for this test would be approximately 0.1151. ANS: F TOP: 5

PTS: 1 REF: 379-380 | 720-721 BLM: Higher Order - Apply

74. In testing vs. a random sample of size 200 produced a sample proportion Given these results, the null hypothesis should not be rejected at α = 0.05.

ANS: T TOP: 5

PTS: 1 REF: 379-380 | 720-721 BLM: Higher Order - Evaluate

75. Suppose in testing a hypothesis about a proportion, the p-value is computed to be 0.027. The null hypothesis should be rejected if the chosen level of significance is 0.01. ANS: F TOP: 5

PTS: 1 REF: 365 | 379-380 BLM: Higher Order - Understand

76. The lower limit of the 90% confidence interval for the population proportion p, given that n = 400 and p̂ = 0.10, is 0.1247. ANS: F TOP: 5

PTS: 1 REF: 381-382 | 376 BLM: Higher Order - Apply

77. If a null hypothesis about the population proportion p is rejected at the 0.10 level of significance, it must be rejected at the 0.05 level. ANS: F TOP: 5

PTS: 1 BLM: Remember

REF: 379-380 | 365

78. In a one-tailed test about the population proportion p, the p-value is found to be equal to 0.0352. If the test had been two-tailed, the p-value would have been 0.0704. ANS: T TOP: 5

PTS: 1 REF: 365 | 379-380 BLM: Higher Order - Understand

79. A professor of statistics refutes the claim that the proportion of Conservative voters in Alberta is at most 44%. To test the claim, the hypotheses , should be used.

ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 379-380

TOP: 5

80. In testing a hypothesis about a population proportion p, the z-test statistic measures how close the computed sample proportion has come to the hypothesized population parameter. ANS: T PTS: BLM: Higher Order

REF: 379-380

TOP: 5

81. In a two-tailed test for the population proportion, if the null hypothesis is rejected when the alternative hypothesis is false, a Type I error is committed. ANS: T TOP: 5

PTS: 1 REF: 360 | 368 | 379-380 BLM: Higher Order - Understand

82. In a one-tailed test for the population proportion, if the null hypothesis is not rejected when the alternative hypothesis is false, a Type II error is committed. ANS: F BLM: Remember

PTS:

REF: 368

TOP: 5

83. The sampling distribution of p̂ is approximately normal, provided that the sample size is large enough (n > 30). ANS: F BLM: Remember

PTS:

REF: 380

TOP: 5

84. Suppose in testing a hypothesis about a proportion, the p-value is computed to be 0.038. The null hypothesis should be rejected if the chosen level of significance is 0.05. ANS: T TOP: 5

PTS: 1 REF: 365 | 379-380 BLM: Higher Order - Understand

85. For a given data set and confidence level, the confidence interval of the population proportion p will be wider for 95% confidence than for 90% confidence. ANS: T TOP: 5

PTS: 1 BLM: Remember

REF: 376 | 379-380

86. A two-tailed test of the population proportion produces a test statistic z = 1.77. The p-value of the test is 0.4616. ANS: F TOP: 5

PTS: 1 REF: 379-382 | 720-721 BLM: Higher Order - Apply

87. The upper limit of the 85% confidence interval for the population proportion p, given that n = 60 and p̂ = 0.20, is 0.274.

ANS: T TOP: 5

PTS: 1 REF: 381-382 | 376 BLM: Higher Order - Apply

88. Suppose in testing a hypothesis about a proportion, the z test statistic is computed to be 1.92. The null hypothesis should be rejected if the chosen level of significance is 0.01 and a two-tailed test is used. ANS: F PTS: 1 BLM: Higher Order - Evaluate

REF: 379-382

TOP: 5

89. The width of a confidence interval estimate for a proportion will be narrower for 99% confidence than for 95% confidence. ANS: F TOP: 5

PTS: 1 REF: 376 | 379-380 BLM: Higher Order - Understand

90. The width of a confidence interval estimate for a proportion will be wider for a sample size of 100 than for a sample size of 50. ANS: F TOP: 5

PTS: 1 REF: 376 | 379-380 BLM: Higher Order - Understand

91. The width of a confidence interval estimate for a proportion will be narrower for 90% confidence than for 95% confidence. ANS: T TOP: 5

PTS: 1 REF: 376 | 379-380 BLM: Higher Order - Understand

92. If we reject the null hypothesis , we conclude that there is not enough statistical evidence to infer that the population proportions are equal. ANS: T TOP: 6–7

PTS: 1 REF: 357-358 | 384-385 BLM: Higher Order - Understand

93. The necessary conditions having been met, a two-tailed test is being conducted to test the difference between two population proportions. The two sample proportions are and , and the standard error of the sampling distribution of is 0.0085. Under these circumstances, the calculated value of the test statistic will be z = 3.41. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 384-385

TOP: 6–7

94. In testing vs. using a significance level equal to 0.05, the critical value that will be used to conduct the test is z = 1.645. ANS: F

PTS:

REF: 384-385

TOP: 6–7

BLM: Higher Order - Apply

95. In testing vs. the test statistic value was found to be z = 1.28. In this case, the p-value of the test would be approximately 0.1003. ANS: T TOP: 6–7

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

96. The test statistic that is used in testing vs. where ANS: F BLM: Remember

PTS:

REF: 384-387

TOP: 6–7

97. In testing vs. the following summary statistics are found: and Based on these results, the null hypothesis should be rejected at the significance level ANS: F PTS: 1 BLM: Higher Order - Evaluate

REF: 384-387

TOP: 6–7

98. The necessary conditions having been met, a two-tailed test is being conducted to test the difference between two population proportions. The two sample proportions are and , respectively, and the standard error of the sampling distribution of 0.04. Then, the calculated value of the test statistic will be 1.50. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 384-387

TOP: 6–7

99. The necessary conditions having been met, a two-tailed test is being conducted for the difference between two population proportions. If the value of the test statistic is –1.35, then the p-value is 0.0885. ANS: F TOP: 6–7

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

100. The necessary conditions having been met, a two-tailed test is being conducted for the difference between two population proportions. If the value of the test statistic is 1.96, then the null hypothesis is rejected at α = 0.10. ANS: T PTS: 1 BLM: Higher Order - Evaluate

REF: 384-387

TOP: 6–7

101. The necessary conditions having been met, an upper-tailed test is being conducted for the difference between two population proportions. If the value of the test statistic is 2.90, then the p-value is 0.0038. ANS: F TOP: 6–7

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

102. The necessary conditions having been met, a lower-tailed test is being conducted for the difference between two population proportions. If the value of the test statistic is –2.43, then the null hypothesis cannot be rejected at α = 0.025. ANS: F PTS: 1 BLM: Higher Order - Evaluate

REF: 384-387

TOP: 6–7

103.

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

Chapter 9B—Large-Sample Tests of Hypotheses PROBLEM Catalogue Mail-Orders Narrative A mail-order catalogue claims that customers will receive their product within four days of ordering. A competitor believes that this claim is an underestimate. 1. Refer to Catalogue Mail-Orders Narrative. State the appropriate null and alternative hypotheses to be tested by the competitor. ANS: H0: μ = 4 vs. Ha: μ > 4, where μ is the average number of days until the product is received.

PTS: 1 REF: 363-364 BLM: Higher Order - Analyze

TOP: 1–3

2. Refer to Catalogue Mail-Orders Narrative. Describe the Type I error for this problem. ANS: A Type I error occurs if the competitor rejects the null hypothesis when, in fact, it is true; that is, if the competitor concludes the mean time until the product is received is more than four days when, in fact, it is not. PTS: 1 REF: 360 | 368 BLM: Higher Order - Analyze

TOP: 1–3

3. Refer to Catalogue Mail-Orders Narrative. Describe the Type II error for this problem. ANS: A Type II error occurs if the competitor does not reject the null hypothesis when, in fact, it is false; that is, if the competitor does not conclude that the mean time until the product is received is more than four days when, in fact, it is. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

Federal Votes Narrative A Conservative party candidate in a federal election believes that 54% of Canadian voters are supporting him. His Liberal party opponent believes this estimate is too high. 4. Refer to Federal Votes Narrative. State the appropriate null and alternative hypotheses to be tested by the Liberal party opponent. ANS: H0: p = 0.54 vs. Ha: p < 0.54, where p is the true proportion of Canadian voters supporting the Conservative party candidate. PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 1–3

5. Refer to Federal Votes Narrative. Describe the Type I error for this problem. ANS: A Type I error occurs if the Liberal party opponent rejects the null hypothesis when, in fact, it is true; that is, if the Liberal party opponent concludes the proportion of voters supporting the Conservative party candidate is less than 0.54 when, in fact, it is not. PTS: 1 REF: 360 | 368 BLM: Higher Order - Analyze

TOP: 1–3

6. Refer to Federal Votes Narrative. Describe the Type II error for this problem. ANS: A Type II error occurs if the Liberal party opponent does not reject the null hypothesis when, in fact, it is false; that is, if the Liberal party opponent does not conclude the proportion of voters supporting the candidate is less than 0.54 when, in fact, it is. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

7. Refer to Federal Votes Narrative. Describe the practical consequences for the Liberal party opponent if he makes a Type I error. ANS: The Liberal party opponent may think he is doing better in the campaign than he actually is. PTS: 1 REF: 360 | 368 BLM: Higher Order - Analyze

TOP: 1–3

8. Refer to Federal Votes Narrative. Describe the practical consequences for the Liberal party opponent if he makes a Type II error. ANS: The Liberal party opponent may think he is doing worse in the campaign than he actually is. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

Defective Toasters Narrative A toaster manufacturer receives large shipments of thermal switches from a supplier. A sample from each shipment is selected and tested. The manufacturer is willing to send the shipment back if the proportion of defective switches is more than 5%. Otherwise, the shipment will be kept. 9. Refer to Defective Toasters Narrative. State the appropriate null and alternative hypotheses to be tested by the manufacturer. ANS: H0: p = 0.05 vs. Ha: p > 0.05, where p is the true proportion of defective switches. PTS:

REF: 378-380

TOP: 1–3

BLM: Higher Order

10. Refer to Defective Toasters Narrative. Describe the Type I error. ANS: A Type I error occurs if the manufacturer rejects the null hypothesis when, in fact, it is true; that is, if the manufacturer concludes the proportion of defective switches is more than 0.05 when, in fact, it is not. PTS:

REF: 360 | 368

TOP: 1–3

BLM: Higher Order

11. Refer to Defective Toasters Narrative. Describe the Type II error for this problem. ANS: A Type II error occurs if the manufacturer does not reject the null hypothesis when, in fact, it is false; that is, if the manufacturer does not conclude the proportion of defective switches is more than 0.05 when, in fact, it is. PTS:

REF: 368

TOP: 1–3

BLM: Higher Order

12. Refer to Defective Toasters Narrative. From the manufacturer’s point of view, which error would be the more serious? Justify your answer. ANS: The Type II error would be the more serious, since the manufacturer would be keeping a shipment that actually had more than 5% defective switches. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

13. Refer to Defective Toasters Narrative. From the supplier’s point of view, which error would be the more serious? Justify your answer. ANS: The Type I error would be more serious since the supplier would have to take back a shipment even though it does not have more than 5% defective switches. PTS:

REF: 368

TOP: 1–3

BLM: Higher Order - Analyze

14. Develop the null and alternative hypotheses that are most appropriate for each of the following situations: a. A meteorologist claims that the average high temperature for the month of August in Montreal is 27°C. If the residents of Montreal do not believe this to be true, what hypotheses should they test? b. A car manufacturing plant is acting in accordance with the public’s interest in making cars that have fuel consumption of at most 8.2 litres per 100 km. The supervisor will let the cars off the manufacturing floor only if the fuel consumption is less than 8.8 L/100 km. What hypotheses should the plant test? c. A spokesperson for the Health Department reports that a fish is unsafe for human consumption if the polychlorinated biphenyl (PCB) concentration exceeds 5 ppb. The Carlson family is interested in the mean PCB concentration in a fish from the lake on which they live. What hypotheses should they test? d. An Internet survey revealed that 50% of Internet users received more than 10 email messages per day. A similar study on the use of email was repeated. The purpose of the study was to see whether use of email has increased. ANS: a. b. c. d. PTS: 1 REF: 363-364 | 379-380 BLM: Higher Order - Analyze

TOP: 1–3

15. A sample of size 150 is to be used to test the hypotheses H0: μ = 3.75 kg vs. Ha: μ ≠ 3.75 kg, where μ is the average weight of a newborn Canadian baby. Give the appropriate rejection region associated with each of the following significance levels. a. α = 0.01 b. α = 0.05 c. α = 0.1 ANS: a. Reject H0 if z > 2.575 or z < –2.575. b. Reject H0 if z > 1.96 or z < –1.96. c. Reject H0 if z > 1.645 or z < –1.645.

PTS: 1 REF: 363-364 BLM: Higher Order - Analyze

TOP: 1–3

16. A sample of size 80 is to be used to test the hypotheses H0: μ = 29 vs. Ha: μ > 29, where μ is the average age of a man when he gets married. What is the appropriate rejection region associated with each of the following significance levels? a. α = 0.01 b. α = 0.005 c. α = 0.05 d. α = 0.1

ANS: a. Reject H0 if z > 2.33. b. Reject H0 if z > 2.575. c. Reject H0 if z > 1.645. d. Reject H0 if z > 1.28.

PTS: 1 REF: 362-364 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3
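
The critical values quoted in questions 15 and 16 come from the standard normal table. As a rough check, they can be reproduced with Python's standard library. This is only a sketch; the helper name critical_z and its tails parameter are hypothetical, and the results differ from the table values in the third decimal place because the table is rounded.

```python
from statistics import NormalDist


def critical_z(alpha, tails=1):
    """Positive critical value of z for a one- or two-tailed test."""
    # Each rejection tail holds an area of alpha / tails
    return NormalDist().inv_cdf(1 - alpha / tails)


# Two-tailed regions (question 15): reject H0 if |z| exceeds the value
for alpha in (0.01, 0.05, 0.1):
    print("two-tailed", alpha, round(critical_z(alpha, tails=2), 3))

# Upper-tailed regions (question 16): reject H0 if z exceeds the value
for alpha in (0.01, 0.005, 0.05, 0.1):
    print("upper-tailed", alpha, round(critical_z(alpha, tails=1), 3))
```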

17. Transport Quebec repaired hundreds of bridges in 1993. To check the average cost to repair a bridge, a random sample of n = 55 bridges was chosen. The mean and standard deviation for the sample are $25,788 and $1540, respectively. Records from previous years indicate that the average bridge repair cost was $25,003. Use the sample data to test that the 1993 mean μ is greater than $25,003. Use α = 0.05. ANS: The hypotheses to be tested are H0: μ = 25,003 vs. Ha: μ > 25,003. The test statistic is z = (x̄ – μ0)/(s/√n) = (25,788 – 25,003)/(1540/√55) = 3.78. Since z > 1.645, we reject the null hypothesis and conclude that the average bridge repair cost in 1993 is greater than $25,003. PTS: 1 REF: 362-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3
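
The large-sample test statistic used in question 17 is z = (x̄ – μ0)/(s/√n). A minimal sketch of the calculation follows, assuming the sample standard deviation s stands in for σ as in the answer above; the function name one_sample_z is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, s, n):
    """Large-sample z statistic for a test about one population mean."""
    return (sample_mean - mu0) / (s / sqrt(n))


z = one_sample_z(25788, 25003, 1540, 55)
critical = NormalDist().inv_cdf(1 - 0.05)  # upper-tailed test at alpha = 0.05
print(round(z, 2), "reject H0" if z > critical else "do not reject H0")
```

The same function applies to the other one-sample problems in this section by changing the four arguments.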

18. A new light bulb is being considered for use in an office. It is decided that the new bulb will be used only if it has a mean lifetime of more than 500 hours. A random sample of 40 bulbs is selected and placed on life test. The mean and standard deviation are found to be 505 hours and 18 hours, respectively. Perform the appropriate test of hypothesis to determine whether the new bulb should be used. Use a 0.01 level of significance. ANS: The hypotheses to be tested are H0: μ = 500 vs. Ha: μ > 500. The test statistic is z = (x̄ – μ0)/(s/√n) = (505 – 500)/(18/√40) = 1.76. Since z < 2.33, we fail to reject the null hypothesis. We conclude that the true average lifetime of the bulb is not significantly larger than 500 hours (i.e., do not use this bulb). PTS: 1 REF: 362-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

19. Refer to Swimming Average Narrative. Perform the appropriate test of hypothesis to determine whether Jessica’s average time has changed. Use α = 0.01.

ANS: The hypotheses to be tested are H0: μ = 148.4 vs. Ha: μ ≠ 148.4. The test statistic is z = (x̄ – μ0)/(s/√n) = (147.8 – 148.4)/(2.3/√50) = –1.845. Since –2.575 < z < 2.575, we fail to reject the null hypothesis. We conclude that Jessica’s true average time to swim the 200 m butterfly is not significantly different from 148.4 seconds (i.e., do not conclude that her time has changed significantly). PTS: 1 REF: 362-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

Swimming Average Narrative Historically, the average time it takes Jessica to swim the 200 m butterfly is 148.4 seconds. Jessica would like to know if her average time has changed. She records her time on 50 randomly selected occasions and computes the mean to be 147.8 seconds with a standard deviation of 2.3 seconds. 20. Refer to Swimming Average Narrative. Compute the power of the test if Jessica’s actual mean swimming time is 147.3 seconds. Interpret the results. ANS: Since the acceptance region is x̄ = 148.4 ± 2.575(2.3/√50) = 148.4 ± 0.84 = (147.56, 149.24), the power of the test is P(x̄ < 147.56 or x̄ > 149.24 when μ = 147.3) = P(z < 0.80) + P(z > 5.96) = 0.7881. Thus, the probability of correctly rejecting the null hypothesis, given that μ = 147.3, is 0.7881. PTS: 1 REF: 369-371 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3
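
The power calculation in question 20 can be sketched in a few lines: find the acceptance region for x̄ under H0, then find the probability that x̄ falls outside it when the true mean is 147.3. This is an illustrative sketch only; the function name two_tailed_power is hypothetical, and s is used in place of σ as in the answer above.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_power(mu0, mu_true, s, n, alpha):
    """Power of a two-tailed large-sample z test when the true mean is mu_true."""
    se = s / sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Acceptance region for the sample mean under H0
    lower = mu0 - z_crit * se
    upper = mu0 + z_crit * se
    # Sampling distribution of the sample mean under the true mean
    sampling = NormalDist(mu_true, se)
    return sampling.cdf(lower) + (1 - sampling.cdf(upper))


power = two_tailed_power(148.4, 147.3, 2.3, 50, 0.01)
print(round(power, 2))
```

The result is about 0.79. The small difference from the table-based 0.7881 comes from rounding the critical value to 2.575 and the z score to 0.80.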

21. Let μ denote the true average delivery time of a letter from a specific carrier. For a large-sample z-test of H0: μ = 3 versus Ha: μ ≠ 3, find the p-value associated with each of the given values of the test statistic, and state whether each p-value will lead to a rejection of the null hypothesis when performing a level 0.05 test. a. 2.16 b. 0.38 c. –2.81 d. 1.07 e. –0.68 ANS: a. p-value = 2(0.5 – 0.4846) = 0.0308. Reject the null hypothesis since p-value < α. b. p-value = 2(0.5 – 0.148) = 0.704. Don’t reject the null hypothesis since p-value > α. c. p-value = 2(0.5 – 0.4975) = 0.005. Reject the null hypothesis since p-value < α. d. p-value = 2(0.5 – 0.3577) = 0.2846. Don’t reject the null hypothesis since p-value > α. e. p-value = 2(0.5 – 0.2517) = 0.4966. Don’t reject the null hypothesis since p-value > α.

PTS: 1 REF: 363-365 BLM: Higher Order - Evaluate

TOP: 1–3
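
The two-tailed p-values in question 21 are twice the tail area beyond |z|. A minimal sketch that reproduces them, with a hypothetical helper named two_tailed_p_value, is shown here; small differences from the answers above in the fourth decimal place are due to table rounding.

```python
from statistics import NormalDist


def two_tailed_p_value(z):
    """p-value for a two-tailed z test: twice the area beyond |z|."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05
for z in (2.16, 0.38, -2.81, 1.07, -0.68):
    p = two_tailed_p_value(z)
    decision = "reject H0" if p < alpha else "do not reject H0"
    print(f"z = {z:5.2f}  p-value = {p:.4f}  {decision}")
```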

Laptop Battery Time Narrative The manufacturer of a particular battery pack for laptop computers claims its battery pack can function for eight hours, on average, before having to be recharged. A random sample of 36 battery packs was selected and placed on test. The mean functioning time before having to be recharged was 7.2 hours with a standard deviation of 1.9 hours. 22. Refer to Laptop Battery Time Narrative. A competitor claims that the manufacturer’s claim is too high. Perform the appropriate test of hypothesis to determine whether the competitor is correct. Test using α = 0.05. ANS: Let μ = true average functioning time of battery before having to recharge. The hypotheses to be tested are H0: μ = 8 vs. Ha: μ < 8. The test statistic is z = (x̄ – μ0)/(s/√n) = (7.2 – 8.0)/(1.9/√36) = –2.53. Since z < –1.645, we reject the null hypothesis and conclude that the true average functioning time of the battery pack before having to be recharged is less than eight hours (i.e., can reject the manufacturer’s claim). PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

23. Refer to Laptop Battery Time Narrative. Find the p-value for this test. ANS: p-value = P(z ≤ –2.53) = 0.5 – 0.4943 = 0.0057 PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

University Housing Costs Narrative A large university claims that the average cost of housing within 3 kilometres of the campus is $6900 per school year. A high school student is preparing her budget for her freshman year at the university. She is concerned that the university’s estimate is too low. Having taken statistics, she decides to perform the following test of hypothesis: H0: μ = 6900 vs. Ha: μ > 6900, where μ represents the average cost of housing per year within 3 kilometres of the university. 24. Refer to University Housing Costs Narrative. Describe the Type I error for this problem. ANS: A Type I error occurs if the student rejects the null hypothesis when, in fact, it is true; that is, if the student concludes the average cost of housing is more than $6900 when, in fact, it is not.

PTS: 1 REF: 360 | 368 BLM: Higher Order - Analyze

TOP: 1–3

25. Refer to University Housing Costs Narrative. Describe the Type II error for this problem. ANS: A Type II error occurs if the student does not reject the null hypothesis when, in fact, it is false; that is, if the student does not conclude that the average cost of housing is more than $6900 when, in fact, it is. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

26. Refer to University Housing Costs Narrative. Which error has more serious consequences for the student? Explain. ANS: The Type II error is more serious since the student would not budget a sufficient amount for the cost of housing. PTS: 1 REF: 368 BLM: Higher Order - Analyze

TOP: 1–3

Average Childcare Costs Narrative The public relations officer for a particular city claims the average monthly cost for childcare outside the home for a single child is $700. A potential resident is interested in whether the claim is correct. She obtains a random sample of 64 records and computes the average monthly cost of this type of childcare to be $689 with a standard deviation of $40. 27. Refer to Average Childcare Costs Narrative. Perform the appropriate test of hypothesis for the potential resident using α = 0.01. ANS: Let μ = true average monthly cost for childcare outside the home for a single child. The hypotheses to be tested are H0: μ = 700 vs. Ha: μ ≠ 700. The test statistic is z = (x̄ – μ0)/(s/√n) = (689 – 700)/(40/√64) = –2.20. Since –2.575 < z < 2.575, we fail to reject the null hypothesis. We cannot conclude that the true average monthly cost for childcare outside the home for a single child is significantly different from $700 (i.e., cannot reject the public relations officer’s claim). PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

28. Refer to Average Childcare Costs Narrative. Find the p-value for the test in the previous question. ANS:

p-value = P(z ≤ –2.20) + P(z ≥ 2.20) = 2P(z ≥ 2.20) = 2(0.5 – 0.4861) = 0.0278 PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Apply

TOP: 1–3

29. Refer to Average Childcare Costs Narrative. What effect, if any, would there be on the conclusion of the test of hypothesis in the first question if you changed α to 0.05? ANS: If you changed the value of α to 0.05, then p-value = 0.0278 ≤ 0.05, and you would reject the null hypothesis.

PTS: 1 REF: 365 BLM: Higher Order - Evaluate

TOP: 1–3

30. Refer to Average Childcare Costs Narrative. Find the power of the test when μ is actually $685 and α = 0.05. Interpret the results.

ANS: The acceptance region is x̄ = 700 ± 1.96(40)/√64 = 700 ± 9.8 = (690.2, 709.8). When μ = 685, z1 = (690.2 – 685)/(40/8) = 1.04 and z2 = (709.8 – 685)/(40/8) = 4.96. Then, the power of the test is 1 – β = 1 – P(1.04 ≤ z ≤ 4.96) = P(z < 1.04) = 0.5 + 0.3508 = 0.8508. Thus the probability of correctly rejecting the null hypothesis, given that μ = $685, is 0.8508.

PTS: 1 REF: 363-364 | 369 | 371 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

31. A random sample of n = 36 observations from a quantitative population produced a mean x̄ = 2.5 and a standard deviation s = 0.30. Suppose your research objective is to show that the population mean μ exceeds 2.4. a. State the null and alternative hypotheses for the test. b. Locate the rejection region for the test using a 5% significance level. c. Find the standard error of the mean. d. Do the data provide sufficient evidence to indicate that μ > 2.4? Test at α = 0.05. e. Calculate the p-value for the test statistic in (d). f. Use the p-value to draw a conclusion at the 5% significance level. g. Compare the conclusion reached in (f) with the conclusion reached in (d). Are they the same? h. Find the critical value of x̄ used for rejecting H0. i. Calculate β = P(accept H0 when μ = 2.5).

ANS: a. The hypotheses to be tested are H0: μ = 2.4 vs. Ha: μ > 2.4. b. If α = 0.05, the critical value of z that separates the rejection and non-rejection regions will be a value (denoted by z0) such that P(z > z0) = α = 0.05. That is, z0 = 1.645. Hence, H0 will be rejected if z > 1.645. c. The standard error of the mean is found using the sample standard deviation s to approximate the population standard deviation σ: SE = s/√n = 0.30/√36 = 0.05. d. To conduct the test, calculate the value of the test statistic z = (x̄ – μ0)/(s/√n) = (2.5 – 2.4)/0.05 = 2.0. The observed value of the test statistic, z = 2.0, falls in the rejection region and the null hypothesis is rejected. There is sufficient evidence to indicate that μ > 2.4. e. Since this is a right-tailed test, the p-value is the area under the standard normal distribution to the right of z = 2.0: p-value = P(z > 2.0) = 0.5 – 0.4772 = 0.0228. f. The p-value, 0.0228, is less than α = 0.05 and the null hypothesis is rejected at the 5% level of significance. There is sufficient evidence to indicate that μ > 2.4. g. The conclusions reached using the critical value approach and the p-value approach are identical. h. The rejection region was given as z > 1.645, where z = (x̄ – 2.4)/0.05; solving for x̄, we obtain the critical value x̄ = 2.4 + 1.645(0.05) = 2.482 necessary for rejection of H0. i. The probability of a Type II error is defined as β = P(accept H0 when H0 is false). Since the acceptance region from part (h) is x̄ ≤ 2.482, β can be rewritten as β = P(x̄ ≤ 2.482 when μ = 2.5) = P(z ≤ (2.482 – 2.5)/0.05) = P(z ≤ –0.36) = 0.5 – 0.1406 = 0.3594.

PTS: 1 REF: 363-365 | 369 | 371 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

32. An Internet server claimed that its users averaged 15 hours per week. To determine whether this was an overstatement, a competitor conducted a survey of 150 customers and found that the average time spent online was 13 hours per week with a standard deviation of 6.5 hours. Do the data provide sufficient evidence to indicate that the average hours of use are less than that claimed by the first Internet server? Test at the 1% level of significance. ANS: The hypotheses to be tested are H0: μ = 15 vs. Ha: μ < 15, with the test statistic z = (x̄ – μ0)/(s/√n) = (13 – 15)/(6.5/√150) = –3.77. Since this is a one-tailed test, the rejection region with α = 0.01 is set in the left tail of the z distribution as z < –2.33. Since the observed value falls in the rejection region, H0 is rejected. There is evidence that the average time is less than claimed by the Internet server. PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

Average Daily Wages Narrative The daily wages in a particular industry are normally distributed with a mean of $60 and a standard deviation of $13. Suppose a company in this industry employs 50 workers and pays them $57.50 on the average. Based on this sample mean, can these workers be viewed as a random sample from among all workers in the industry? 33. Refer to Average Daily Wages Narrative. Find the p-value for the test. ANS: The parameter of interest is μ, the average daily wage of workers in a given industry. A sample of n = 50 workers has been drawn from a particular company within this industry, and the sample average, x̄ = 57.5, has been calculated. The objective is to determine whether this company pays wages different from the total industry. That is, assume that this sample of 50 workers has been drawn from a hypothetical population of workers. Does this population have an average wage μ = 60, or is μ different from 60? Thus, the hypotheses to be tested are H0: μ = 60 vs. Ha: μ ≠ 60. The test statistic is z = (x̄ – μ0)/(σ/√n) = (57.5 – 60)/(13/√50) = –1.36, and the p-value is P(| z | > 1.36) = 2(0.50 – 0.4131) = 0.1738. PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Analyze

TOP: 1–3

34. Refer to Average Daily Wages Narrative. If you planned to conduct your test using what would be your test conclusions? ANS: Since α is smaller than the p-value, 0.1738, H0 cannot be rejected and we cannot conclude that the company is paying wages different from the industry average. These workers can be viewed as a random sample from among all workers in the industry. PTS: 1 REF: 365 BLM: Higher Order - Evaluate

TOP: 1–3

Copper Pipes Narrative A manufacturer of copper pipes must produce pipes with a diameter of precisely 5 cm. The firm’s quality inspector wants to test the hypothesis that pipes of the proper size are being produced. Accordingly, a simple random sample of 100 pipes is taken from the production process. The sample mean diameter turns out to be 4.98 cm and the sample standard deviation 0.2 cm. Using a significance level of α = 0.05, test the appropriate hypotheses. 35. Refer to Copper Pipes Narrative. State the appropriate hypotheses. ANS: H0: μ = 5 vs. Ha: μ ≠ 5 PTS: 1 REF: 363-364 BLM: Higher Order - Analyze

TOP: 1–3

36. Refer to Copper Pipes Narrative. Calculate the value of the test statistic. ANS: z = (x̄ – μ0)/(s/√n) = (4.98 – 5)/(0.2/√100) = –1.0

PTS: 1 REF: 363-364 BLM: Higher Order - Analyze

TOP: 1–3

37. Refer to Copper Pipes Narrative. Calculate the p-value and write your conclusion. ANS: p-value = 2P(z < –1) = 2(0.50 – 0.3413) = 0.3174. Since p-value = 0.3174 > α = 0.05, we fail to reject H0. We conclude that the observed divergence of the sample mean from the 5 cm standard is attributable to sampling error rather than a faulty production process that systematically turns out pipes with an average diameter other than 5 cm. PTS: 1 REF: 363-364 | 720-721 BLM: Higher Order - Evaluate

TOP: 1–3

38. An airline company would like to know if the average number of passengers on a flight in November is less than the average number of passengers on a flight in December. The results of random sampling are printed below. Test the appropriate hypotheses using α = 0.01. November December ANS: The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0, where μ1 and μ2 are the mean numbers of passengers on November and December flights, respectively. The test statistic is z = (476 – 482)/1.30 = –4.62. Since z < –2.33, we reject the null hypothesis and conclude that the average number of passengers on a November flight is significantly less than that on a December flight. PTS: 1 REF: 374-376 | 389 BLM: Higher Order - Evaluate

TOP: 4

Medical School Completion Narrative A university investigation was conducted to determine whether women and men complete medical school in significantly different amounts of time, on the average. Two independent random samples were selected and the following summary information concerning times to completion of medical school computed:

                             Women        Men
Sample Size                  90           100
Sample Mean                  8.4 years    8.5 years
Sample Standard Deviation    0.6 years    0.5 years

39. Refer to Medical School Completion Narrative. Perform the appropriate test of hypothesis to determine whether there is a significant difference in time to completion of medical school between women and men. Test using α = 0.05. ANS: Let women be population 1, men be population 2, and μi be the true average time to complete medical school for population i, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. The test statistic is z = (x̄1 – x̄2)/√(s1²/n1 + s2²/n2) = (8.4 – 8.5)/0.0806 = –1.24. Since –1.96 ≤ z ≤ 1.96, we fail to reject the null hypothesis. Thus one cannot conclude that there is a significant difference in mean time to completion of medical school between women and men. PTS: 1 REF: 374-376 | 389 BLM: Higher Order - Evaluate

TOP: 4
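
The two-sample statistic in question 39 uses the standard error √(s1²/n1 + s2²/n2). A small sketch of the calculation follows, using the summary figures from the Medical School Completion Narrative; the function name two_sample_z is hypothetical, and the p-value returned is for a two-tailed test.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, s1, n1, mean2, s2, n2):
    """Large-sample z statistic and two-tailed p-value for H0: mu1 - mu2 = 0."""
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    z = (mean1 - mean2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Women are population 1 and men are population 2
z, p_value = two_sample_z(8.4, 0.6, 90, 8.5, 0.5, 100)
print(round(z, 2), round(p_value, 3))
```

The statistic rounds to –1.24 and the p-value to about 0.215, in line with the answers to questions 39 and 40.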

40. Refer to Medical School Completion Narrative. Find the p-value associated with the test in the previous question. ANS: p-value = P(z ≤ –1.24) + P(z ≥ 1.24) = 2P(z ≥ 1.24) = 2(0.5 – 0.3925) = 0.2150. PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

TOP: 4

Laptop Battery Charge Time Narrative A computer laboratory manager was in charge of purchasing new battery packs for her lab of laptop computers. She narrowed her choices to two models that were available for her machines. Since the models cost about the same, she was interested in determining whether there was a difference in the average time the battery packs would function before needing to be recharged. She took two independent samples and computed the following summary information:

                      Battery Pack Model 1    Battery Pack Model 2
Sample Size           30                      30
Sample Mean           6 hours                 6.8 hours
Standard Deviation    1.8 hours               1.6 hours

41. Refer to Laptop Battery Charge Time Narrative. Perform the appropriate test of hypothesis to determine whether there is a significant difference in average functioning time before recharging between the two models of battery packs. Test using α = 0.10. ANS: Let Model 1 be population 1, Model 2 be population 2, and μi be the true average functioning time before the battery needs to be recharged for model i, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. The test statistic is z = (x̄1 – x̄2)/√(s1²/n1 + s2²/n2) = (6.0 – 6.8)/0.4397 = –1.82. Since z ≤ –1.645, reject H0. Thus, one can conclude that there is a significant difference in the average functioning time before the battery needs to be recharged between the two types of battery packs. PTS: 1 REF: 374-376 | 389 BLM: Higher Order - Evaluate

TOP: 4

42. Refer to Laptop Battery Charge Time Narrative. Find the p-value for the test in the previous question. ANS: p-value = P(z ≤ –1.82) + P(z ≥ 1.82) = 2P(z ≥ 1.82) = 2(0.5 – 0.4656) = 0.0688 PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

TOP: 4

43. Independent random samples of 36 and 50 observations are drawn from two quantitative populations, 1 and 2, respectively. The sample data summary is shown here:

                   Sample 1    Sample 2
Sample Size        36          50
Sample Mean        1.28        1.35
Sample Variance    0.058       0.056

Do the data present sufficient evidence to indicate that the mean for population 1 is smaller than the mean for population 2? Use the p-value approach and the critical value approach and explain your conclusion. ANS: The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0. The test statistic, calculated under the assumption that μ1 – μ2 = 0, with the unknown σ1² and σ2² estimated by s1² and s2², respectively, is given by z = ((x̄1 – x̄2) – 0)/√(s1²/n1 + s2²/n2) = (1.28 – 1.35)/√(0.058/36 + 0.056/50) = –1.34.

p-value approach: Calculate p-value = P(z < –1.34) = 0.5 – 0.4099 = 0.0901. Since this p-value is greater than 0.05, the null hypothesis is not rejected. There is insufficient evidence to indicate that the mean for population 1 is smaller than the mean for population 2. Critical value approach: The rejection region, with α = 0.05, is z < –1.645. Since the observed value of z does not fall in the rejection region, H0 is not rejected. There is insufficient evidence to indicate that the mean for population 1 is smaller than the mean for population 2. PTS: 1 REF: 374-376 | 389 | 720-721 BLM: Higher Order - Evaluate

TOP: 4

Red Meat Consumption Narrative To test the theory that the consumption of red meat in Canada has decreased over the past 10 years, a researcher decides to select hospital nutrition records for 400 subjects surveyed 10 years ago and to compare their average amount of beef consumed per year to amounts consumed by an equal number of subjects interviewed this year. The data are given in the table.

                             Ten Years Ago    This Year
Sample Mean                  34.1 kg          30.9 kg
Sample Standard Deviation    12.73 kg         14.09 kg

44. Refer to Red Meat Consumption Narrative. Do the data present sufficient evidence to indicate that per capita beef consumption has decreased in the past 10 years? Test at the 1% level of significance. ANS: The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 > 0, where μ1 and μ2 are the mean amounts consumed ten years ago and this year, respectively. The test statistic, calculated under the assumption that μ1 – μ2 = 0, is z = (34.1 – 30.9)/√(12.73²/400 + 14.09²/400) = 3.37. The rejection region, with α = 0.01, is z > 2.33 and H0 is rejected. There is sufficient evidence to indicate that μ1 – μ2 > 0 or μ1 > μ2, that is, the average per capita beef consumption has decreased in the past 10 years. (Alternatively, the p-value for this test is the area to the right of z = 3.37, which is very close to 0 and less than α = 0.01.) PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Evaluate

TOP: 4

45. Refer to Red Meat Consumption Narrative. Find a 99% lower confidence bound for the difference in the average per capita beef consumption for the two groups. Does your confidence bound confirm your conclusions in the previous question? Explain. What additional information does the confidence bound give you? ANS: For the difference in the population means 10 years ago and this year, μ1 – μ2, the 99% lower confidence bound uses z = 2.33 and is calculated as (x̄1 – x̄2) – 2.33√(s1²/n1 + s2²/n2) = 3.2 – 2.33(0.9495) = 0.99, or (μ1 – μ2) > 0.99 kg. Since the lower bound for the difference in the means is positive, you can again conclude that there has been a decrease in the average per capita beef consumption over the past 10 years. In addition, it is likely that the average consumption has decreased by more than 0.99 kg per year. PTS: 1 REF: 376 | 389 BLM: Higher Order - Evaluate

TOP: 4
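
The one-sided bound in question 45 can be reproduced as (x̄1 – x̄2) minus z times the standard error, where z leaves 1% in the upper tail. The following is a hedged sketch with a hypothetical function name, lower_confidence_bound; it uses the exact normal quantile rather than the rounded table value 2.33, so the result agrees with 0.99 only after rounding.

```python
from math import sqrt
from statistics import NormalDist


def lower_confidence_bound(mean1, s1, n1, mean2, s2, n2, confidence):
    """One-sided lower confidence bound for mu1 - mu2 (large samples)."""
    z = NormalDist().inv_cdf(confidence)  # all of the error sits in one tail
    se = sqrt(s1**2 / n1 + s2**2 / n2)
    return (mean1 - mean2) - z * se


# Population 1 is ten years ago and population 2 is this year
bound = lower_confidence_bound(34.1, 12.73, 400, 30.9, 14.09, 400, 0.99)
print(round(bound, 2))
```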

46. In an attempt to compare the starting salaries for graduates who majored in business and education, random samples of 50 recent graduates in each major were selected and the following information was obtained:

Major        Sample Mean    Sample Standard Deviation
Business     30,684         2,625
Education    27,924         2,865

Do the data provide sufficient evidence to indicate a difference in average starting salaries for graduates who majored in business and education? Test using α = 0.05. ANS: The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0, and the test statistic, calculated under the assumption that μ1 – μ2 = 0, is z = (30,684 – 27,924)/√(2,625²/50 + 2,865²/50) = 5.02. The rejection region, with α = 0.05, is | z | > 1.96, and H0 is rejected. There is sufficient evidence to indicate a difference in the means for the graduates in business and education. PTS: 1 REF: 374-376 BLM: Higher Order - Evaluate

TOP: 4
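The calculation in question 46 can be checked numerically. The following is a minimal sketch, assuming the large-sample z test with unpooled variances; the function name `two_sample_z` is hypothetical and is not part of the test bank.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Large-sample z statistic for H0: mu1 - mu2 = 0, with a two-tailed p-value."""
    # Standard error of the difference in sample means (variances not pooled).
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    z = (mean1 - mean2) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Business vs. education starting salaries from question 46.
z, p = two_sample_z(30684, 2625, 50, 27924, 2865, 50)
print(f"z = {z:.2f}, two-tailed p-value = {p:.2g}")
```

Because |z| exceeds 1.96, the sketch agrees with the decision to reject the null hypothesis at the 5% level.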

Life Insurance Narrative An insurance company wants to test the hypothesis that the mean amount of life insurance held by professional men equals that held by professional women. Accordingly, two independent simple random samples are taken from appropriate professional listings of men and women. The sample of 200 men reveals a mean amount of 140,000 dollars with a standard deviation of 26,000 dollars. The sample of 400 women shows a mean amount of 128,000 dollars with a standard deviation of 3,000 dollars.

47. Refer to Life Insurance Narrative. State the appropriate hypotheses. ANS: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \neq 0$, where $\mu_1$ and $\mu_2$ are the mean amounts held by professional men and women, respectively. PTS: 1 REF: 374 BLM: Higher Order - Analyze

TOP: 4

48. Refer to Life Insurance Narrative. Calculate the p-value. ANS: p-value = 2P(z > 6.51) $\approx 0$

PTS: 1 REF: 374-376 | 720-721 BLM: Higher Order - Apply

TOP: 4

49. Refer to Life Insurance Narrative. Determine the critical region using a significance level $\alpha = 0.05$. ANS: Reject $H_0$ if z < –1.96 or z > 1.96

PTS: 1 REF: 374-376 BLM: Higher Order - Analyze

TOP: 4

50. Refer to Life Insurance Narrative. What is your conclusion? ANS: Since z = 6.51 > 1.96, $H_0$ should be rejected. We conclude that the observed divergence between the hypothesized difference and the sample difference between the means is not the result of sampling error alone. The mean amount of life insurance held by men is higher than that held by women. PTS: REF: 374-376 BLM: Higher Order - Evaluate

TOP: 4

51. Refer to Life Insurance Narrative. Construct a 95% confidence interval for the difference in mean amount of life insurance held by professional men and women. ANS: $(\bar{x}_1 - \bar{x}_2) \pm 1.96\sqrt{s_1^2/n_1 + s_2^2/n_2} = 12{,}000 \pm 3{,}615.39$, or $8{,}384.61 < \mu_1 - \mu_2 < 15{,}615.39$ (in dollars)

PTS: 1 REF: 376 BLM: Higher Order - Analyze

TOP: 4

52. Refer to Life Insurance Narrative. Explain how to use the 95% confidence interval for $\mu_1 - \mu_2$ to test the appropriate hypotheses at $\alpha = 0.05$. ANS: Since the hypothesized difference $\mu_1 - \mu_2 = 0$ is not contained in the confidence interval, $H_0$ should be rejected. PTS: 1 REF: 376 | 389 BLM: Higher Order - Evaluate

TOP: 4

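53. Refer to Life Insurance Narrative. Calculate the value of the test statistic. ANS: $z = (\bar{x}_1 - \bar{x}_2)/\sqrt{s_1^2/n_1 + s_2^2/n_2} = (140{,}000 - 128{,}000)/\sqrt{26{,}000^2/200 + 3{,}000^2/400} = 6.51$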

PTS: 1 REF: 374 BLM: Higher Order - Analyze

TOP: 4
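The Life Insurance Narrative answers (test statistic and 95% confidence interval) can be reproduced with a short script. This is a sketch under the large-sample normal approximation; `diff_means_interval` is a hypothetical helper name, and the critical value comes from the standard normal quantile rather than the rounded table value 1.96, so the limits may differ slightly in the last digits.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_interval(mean1, sd1, n1, mean2, sd2, n2, confidence=0.95):
    """Large-sample interval for mu1 - mu2, plus the z statistic for H0: mu1 - mu2 = 0."""
    se = sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Two-sided critical value, about 1.96 for 95% confidence.
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = mean1 - mean2
    return diff - z_crit * se, diff + z_crit * se, diff / se

# Men (sample 1) vs. women (sample 2) from the narrative.
lower, upper, z = diff_means_interval(140_000, 26_000, 200, 128_000, 3_000, 400)
print(f"z = {z:.2f}; 95% CI: ({lower:,.0f}, {upper:,.0f})")
```

The interval excludes 0, which matches the rejection of the null hypothesis at the 5% level.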

54. The proportion of defective computers built by Byte Computer Corporation is 0.15. In an attempt to lower the defective rate, the owner ordered some changes made in the assembly process. After the changes were put into effect, a random sample of 42 computers was tested, revealing a total of 4 defective computers. Perform the appropriate test of hypothesis to determine whether the proportion of defective computers has been lowered. Use $\alpha = 0.01$. ANS: The proportion of defective computers in the sample is $\hat{p}$ = x/n = 4/42 = 0.0952. Let p = the true proportion of defective computers. The hypotheses to be tested are $H_0: p = 0.15$ vs. $H_a: p < 0.15$. The test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.095 – 0.15)/0.0551 = –0.995. Since z = –0.995 does not fall in the rejection region z < –2.33, we fail to reject the null hypothesis. Therefore, we cannot conclude that the proportion of defective computers has been lowered.

PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

Gas Heat Narrative A gas company president for a particular city is interested in the proportion of homes heated by gas. Historically, the proportion of homes heated by gas has been 0.65. A sample of 75 homes was selected and it was found that 44 of them heat with gas.

55. Refer to Gas Heat Narrative. Perform the appropriate test of hypothesis to determine whether the proportion of homes heated by gas has changed. Test using $\alpha = 0.05$. ANS: The sample proportion of homes heated by gas is $\hat{p}$ = x/n = 44/75 = 0.587. Let p = the true proportion of homes heated by gas. The hypotheses to be tested are $H_0: p = 0.65$ vs. $H_a: p \neq 0.65$. The value of the test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.587 – 0.65)/0.0551 = –1.14. Since –1.96 ≤ z ≤ 1.96, we fail to reject the null hypothesis. Therefore, we cannot conclude that the proportion of homes heated by gas has changed. PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

56. Refer to Gas Heat Narrative. Find the p-value for this test. ANS: p-value = P(z ≤ –1.14) + P(z ≥ 1.14) = 2P(z ≥ 1.14) = 2(0.5 – 0.3729) = 0.2542 PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Apply

TOP: 5
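The one-sample proportion test in the Gas Heat Narrative can be sketched in code as follows. The helper name `one_proportion_z` is hypothetical. Because the script does not round the sample proportion to 0.587 or use a printed normal table, it gives z of about –1.15 and a p-value of about 0.25, close to but not identical with the hand-calculated –1.14 and 0.2542.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """Large-sample z test of H0: p = p0 against a two-sided alternative."""
    p_hat = x / n
    # The standard error uses the hypothesized value p0, not the sample proportion.
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# 44 of 75 sampled homes heat with gas; the historical proportion is 0.65.
z, p_value = one_proportion_z(44, 75, 0.65)
print(f"z = {z:.2f}, two-tailed p-value = {p_value:.4f}")
```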

57. The owner of a marina would like to believe that more than 40% of the sailboat owners use their boats more than 6 times each summer. A random sample of 70 sailboat owners showed 42 used their boats more than 6 times each summer. State and test the appropriate hypotheses using a significance level of 0.005. Is there a reason for the marina owner to believe more than 40% of the sailboat owners use their boats more than 6 times each summer? ANS: The sample proportion of sailboat owners who used their boats more than 6 times each summer is $\hat{p}$ = x/n = 42/70 = 0.6. The hypotheses to be tested are $H_0: p = 0.4$ vs. $H_a: p > 0.4$. The test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.6 – 0.4)/0.0586 = 3.41. Since z > 2.575, we reject the null hypothesis and conclude that there is a reason to believe that the proportion of sailboat owners who use their boats more than 6 times each summer is greater than 0.4. PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

College Beach Volleyball Narrative A student government representative at a local university claims that 60% of the undergraduate students favour a move from court volleyball to beach volleyball. A random sample of 250 undergraduate students was selected and 140 students indicated they favoured a move to beach volleyball.

58. Refer to College Beach Volleyball Narrative. Perform the appropriate test of hypothesis to test the representative’s claim. Use $\alpha = 0.05$. ANS: Let p = true proportion of undergraduate students who favour a move to beach volleyball. The sample proportion is $\hat{p}$ = 140/250 = 0.56. The hypotheses to be tested are $H_0: p = 0.60$ vs. $H_a: p \neq 0.60$. The test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.56 – 0.60)/0.03098 = –1.29. Since –1.96 ≤ z ≤ 1.96, we fail to reject the null hypothesis, and therefore cannot conclude that the proportion of undergraduate students who favour a move to beach volleyball is not 0.60. PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

59. Refer to College Beach Volleyball Narrative. Find the p-value for the test in the previous question. ANS: p-value = P(z ≤ –1.29) + P(z ≥ 1.29) = 2P(z ≥ 1.29) = 2(0.5 – 0.4015) = 0.1970 PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Apply

TOP: 5

60. A random sample of 150 observations was selected from a binomial population, and 87 successes were observed. Do the data provide sufficient evidence to indicate that the population proportion p is greater than 0.5? Use the critical value approach and the p-value approach. ANS: The hypotheses to be tested are $H_0: p = 0.5$ vs. $H_a: p > 0.5$. With x = 87 and n = 150, so that $\hat{p} = 87/150 = 0.58$, the test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.58 – 0.5)/0.0408 = 1.96. Critical value approach: The rejection region is one-tailed, z > 1.645 with $\alpha = 0.05$ or z > 2.33 with $\alpha = 0.01$. Hence, $H_0$ is rejected at the 5% level, but not at the 1% level. At the 5% significance level, we conclude that p > 0.5. p-value approach: Calculate p-value = P(z > 1.96) = 0.5 – 0.4750 = 0.0250. Since this p-value is between 0.01 and 0.05, $H_0$ is rejected at the 5% level, but not at the 1% level. At the 5% significance level, we conclude that p > 0.5.

PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Evaluate

TOP: 5

Plant Experiments Narrative A peony plant with red petals was crossed with another plant having streaky petals. A geneticist states that 80% of the offspring resulting from this cross will have red flowers. To test this claim, 120 seeds from this cross were collected and germinated and 84 plants had red petals.

61. Refer to Plant Experiments Narrative. What hypothesis should you use to test the geneticist’s claim? ANS: The hypotheses to be tested are $H_0: p = 0.80$ vs. $H_a: p \neq 0.80$ PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 5

62. Refer to Plant Experiments Narrative. Calculate the test statistic and its observed significance level (p-value). Use the p-value to evaluate the statistical significance of the results at the 1% level. ANS: Since x = 84 and n = 120, then $\hat{p} = 84/120 = 0.70$, and the test statistic is $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n}$ = (0.70 – 0.80)/0.0365 = –2.74 with p-value = P( | z | > 2.74) = 2(0.5 – 0.4969) = 0.0062. Since this p-value is less than 0.01, the null hypothesis is rejected at the 1% level of significance, and the results are declared highly significant. There is evidence that the proportion of red-flowered plants is not 0.80. PTS: 1 REF: 379-382 BLM: Higher Order - Evaluate

TOP: 5

Tennis Magazine Narrative A marketing manager wants to test the hypothesis that 90% of Tennis magazine’s subscribers are homeowners. Accordingly, a simple random sample of 80 is taken from the magazine’s list of subscribers. The sample turns out to contain 64 homeowners. Use a significance level of $\alpha = 0.05$.

63. Refer to Tennis Magazine Narrative. State the appropriate hypotheses. ANS: $H_0: p = 0.90$ vs. $H_a: p \neq 0.90$ PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 5

64. Refer to Tennis Magazine Narrative. Calculate the value of the test statistic. ANS: $z = (\hat{p} - p_0)/\sqrt{p_0 q_0/n} = (0.80 - 0.90)/\sqrt{(0.90)(0.10)/80} = -2.98$

PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 5

65. Refer to Tennis Magazine Narrative. Calculate the p-value and write your conclusion. ANS: p-value = 2P(z < –2.98) = 2(0.50 – 0.4986) = 0.0028. Since p-value = 0.0028 < $\alpha = 0.05$, $H_0$ should be rejected, and we can conclude that the percentage of Tennis magazine’s subscribers who live in their own homes is not 90%. PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Evaluate

TOP: 5

Union Contract Narrative A union composed of several thousand employees is preparing to vote on a new contract. A random sample of 500 employees yielded 320 who planned to vote yes. It is believed that the new contract will receive more than 60% yes votes.

66. Refer to Union Contract Narrative. State the appropriate null and alternative hypotheses. ANS: $H_0: p = 0.60$, $H_a: p > 0.60$ PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 5

67. Refer to Union Contract Narrative. Can we infer at the 5% significance level that the new contract will receive more than 60% yes votes? Justify your conclusion. ANS: Rejection region: z > 1.645. Test statistic: z = 1.83. Conclusion: Reject $H_0$. Yes, we can infer at the 5% significance level that the new contract will receive more than 60% yes votes. PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

68. Refer to Union Contract Narrative. Compute the p-value for the test. ANS: p-value = 0.0336 PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Apply

TOP: 5

Business Graduates Earnings Narrative A professor claims that 70% of business graduates earn more than $45,000 per year. In a random sample of 300 graduates, 195 earn more than $45,000.

69. Refer to Business Graduates Earnings Narrative. State the appropriate null and alternative hypotheses. ANS: H0: p = 0.70, Ha: p ≠ 0.70 PTS: 1 REF: 379-381 BLM: Higher Order - Analyze

TOP: 5

70. Refer to Business Graduates Earnings Narrative. Can we reject the professor’s claim at the 5% significance level? Justify your conclusion. ANS: Rejection region: | z | > 1.96. Test statistic: z = –1.89. Conclusion: Don’t reject $H_0$. No, we can’t reject the professor’s claim at the 5% significance level.

PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

71. Refer to Business Graduates Earnings Narrative. Compute the p-value for the test. ANS: p-value = 2P(z < –1.89) = 2(0.5 – 0.4706) = 0.0588 PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Apply

TOP: 5

Allergy Drug Narrative In clinical studies of an allergy drug, 81 of the 900 subjects experienced drowsiness. A competitor claims that 10% of the users of this drug experience drowsiness.

72. Refer to Allergy Drug Narrative. State the appropriate null and alternative hypotheses. ANS: $H_0: p = 0.10$, $H_a: p \neq 0.10$ PTS: 1 REF: 379-380 BLM: Higher Order - Analyze

TOP: 5

73. Refer to Allergy Drug Narrative. Is there enough evidence at the 5% significance level to infer that the competitor is correct? Justify your conclusion. ANS: Rejection region: | z | > 1.96. Test statistic: z = –1.0. Conclusion: Don’t reject $H_0$. Yes, there is enough evidence at the 5% significance level to infer that the competitor is correct. PTS: 1 REF: 379-381 BLM: Higher Order - Evaluate

TOP: 5

74. Refer to Allergy Drug Narrative. Compute the p-value of the test. ANS: p-value = 0.3174 PTS: 1 REF: 379-381 | 720-721 BLM: Higher Order - Apply

TOP: 5

75. Refer to Allergy Drug Narrative. Construct a 95% confidence interval estimate of the population proportion of the users of this allergy drug who experience drowsiness. ANS: $\hat{p} \pm z_{\alpha/2}\sqrt{\hat{p}\hat{q}/n} = 0.09 \pm 0.019$. Thus, LCL = 0.071, and UCL = 0.109.

PTS: 1 REF: 379 | 381-382 | 376 BLM: Higher Order - Analyze

TOP: 5
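The interval in question 75 follows the usual large-sample formula for a proportion. As a minimal sketch (the function name `proportion_interval` is hypothetical), the limits can be computed as follows:

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(x, n, confidence=0.95):
    """Large-sample confidence interval for a binomial proportion."""
    p_hat = x / n
    # Unlike the test statistic, the interval uses the sample proportion in the standard error.
    se = sqrt(p_hat * (1 - p_hat) / n)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z_crit * se, p_hat + z_crit * se

# 81 of 900 subjects experienced drowsiness.
lcl, ucl = proportion_interval(81, 900)
print(f"LCL = {lcl:.3f}, UCL = {ucl:.3f}")
```

The hypothesized value 0.10 lies between the two limits, which is the basis for the confidence-interval argument in question 76.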

76. Refer to Allergy Drug Narrative. Explain how to use this confidence interval to test the hypotheses. ANS: Since the hypothesized value $p_0 = 0.10$ is included in this 95% confidence interval, we fail to reject $H_0$ at $\alpha = 0.05$.

PTS: 1 REF: 376 | 389 BLM: Higher Order - Evaluate

TOP: 5

Nuclear Weapons Freeze Narrative A group in favour of freezing production of nuclear weapons believes that the proportion of individuals in favour of a nuclear freeze is greater for those who have seen the movie “The Day After” (population 1) than those who have not (population 2). In an attempt to verify this belief, random samples of size 500 are obtained from the populations of interest. Among those who had seen “The Day After,” 228 were in favour of a freeze. For those who had not seen the movie, 196 favoured a freeze.

77. Refer to Nuclear Weapons Freeze Narrative. Set up the appropriate null and alternative hypotheses. ANS: $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 > 0$

PTS: 1 REF: 384-385 BLM: Higher Order - Analyze

TOP: 6–7

78. Refer to Nuclear Weapons Freeze Narrative. Set up the rejection region for this test using $\alpha = 0.05$. ANS: Reject the null hypothesis if z > 1.645. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Analyze

TOP: 6–7

79. Refer to Nuclear Weapons Freeze Narrative. Find the appropriate test statistic. ANS: $\hat{p}_1$ = 228/500 = 0.456, and $\hat{p}_2$ = 196/500 = 0.392. The pooled estimate for p required for the standard error is $\hat{p}$ = (228 + 196)/1000 = 0.424. Then, the test statistic is $z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}$ = (0.456 – 0.392)/0.0313 = 2.045. PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

TOP: 6–7

80. Refer to Nuclear Weapons Freeze Narrative. State and interpret your conclusion. ANS: Since z > 1.645, we reject the null hypothesis. Based on this data, the proportion in favour of a freeze who have seen the movie is greater than the proportion in favour of a freeze who have not seen the movie. PTS: 1 REF: 384-387 BLM: Higher Order - Evaluate

TOP: 6–7
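The pooled two-proportion test used in the Nuclear Weapons Freeze Narrative can be sketched as follows. The helper name `pooled_two_proportion_z` is hypothetical, and the p-value shown is the upper-tail area, matching the one-sided alternative in question 77. Without intermediate rounding of the standard error, the statistic comes out near 2.05 rather than the hand-calculated 2.045.

```python
from math import sqrt
from statistics import NormalDist

def pooled_two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0 using the pooled estimate of p."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Under H0 both samples share one p, so the successes are pooled.
    p_pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    upper_tail_p = 1 - NormalDist().cdf(z)
    return z, upper_tail_p

# 228 of 500 viewers vs. 196 of 500 non-viewers favoured a freeze.
z, p_value = pooled_two_proportion_z(228, 500, 196, 500)
print(f"z = {z:.2f}, one-tailed p-value = {p_value:.4f}")
```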

81. Refer to Environment Canada Project Narrative. Set up the appropriate null and alternative hypotheses. ANS: Let $p_i$ = proportion of plants in violation from industry i, i = 1, 2 for steel and utility, respectively. The null and alternative hypotheses to be tested are $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \neq 0$. PTS: 1 REF: 384-385 BLM: Higher Order - Analyze

TOP: 6–7

82. Refer to Environment Canada Project Narrative. Compute the value of the test statistic. ANS: $\hat{p}_1$ = 12/150 = 0.08, and $\hat{p}_2$ = 12/160 = 0.075. The pooled estimate for p required for the standard error is $\hat{p}$ = (12 + 12)/310 = 0.0774. The test statistic is $z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}$ = (0.08 – 0.075)/0.0304 = 0.1645.

PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

TOP: 6–7

83. Refer to Environment Canada Project Narrative. Set up the appropriate rejection region for $\alpha = 0.01$. ANS: Reject $H_0$ if z ≤ –2.575 or if z ≥ 2.575

PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Analyze

TOP: 6–7

84. Refer to Environment Canada Project Narrative. What is the appropriate conclusion? ANS: Since –2.575 ≤ z ≤ 2.575, we fail to reject the null hypothesis. Thus, one cannot conclude that there is a significant difference in the proportion of violations between the two industries. PTS: 1 REF: 384-387 BLM: Higher Order - Evaluate

TOP: 6–7

85. A manufacturing plant has two assembly lines for producing plastic bottles. The plant manager was concerned about whether the proportion of defective bottles differed between the two lines. Two independent random samples were selected and the following summary data computed:

          Sample Proportion of Defectives    Number of Defectives    Sample Size
Line 1    0.10                               5                       50
Line 2    0.12                               6                       50

Perform the appropriate test of hypothesis using $\alpha = 0.05$. ANS: Let $p_i$ be the proportion of defective bottles from line i, i = 1, 2. The hypotheses to be tested are $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \neq 0$. $\hat{p}_1$ = 5/50 = 0.10, and $\hat{p}_2$ = 6/50 = 0.12. The pooled estimate for p required for the standard error is $\hat{p}$ = (5 + 6)/100 = 0.11. The test statistic is $z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}$ = (0.10 – 0.12)/0.0626 = –0.319. Since –1.96 ≤ z ≤ 1.96, we fail to reject $H_0$. Thus, one cannot conclude that there is a significant difference in the proportion of defective bottles between the two assembly lines. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate

TOP: 6–7

Insurance Policy Sales Narrative Independent random samples of $n_1$ and $n_2$ sales phone calls for an insurance policy were randomly selected from binomial populations 1 and 2, respectively. Sample 1 had 80 successful sales, and sample 2 had 88 successful sales.

86. Refer to Insurance Policy Sales Narrative. Suppose you have no preconceived theory concerning which parameter, $p_1$ or $p_2$, is the larger and you wish to detect only a difference between the two parameters if one exists. What should you choose as the null and alternative hypotheses for a statistical test? ANS: Since it is necessary to detect either $p_1 > p_2$ or $p_1 < p_2$, a two-tailed test is necessary. The null and alternative hypotheses to be tested are $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \neq 0$ PTS: 1 REF: 384-385 | 389 BLM: Higher Order - Analyze

TOP: 6–7

87. Refer to Insurance Policy Sales Narrative. Calculate the standard error of the difference in the two sample proportions, $\hat{p}_1 - \hat{p}_2$. Make sure to use the pooled estimate for the common value of p. ANS: The standard error of $\hat{p}_1 - \hat{p}_2$ is $\sqrt{p_1 q_1/n_1 + p_2 q_2/n_2}$. In order to evaluate the standard error, estimates for $p_1$ and $p_2$ must be obtained, using the assumption that $p_1 - p_2 = 0$. Because we are assuming that $p_1 = p_2$, the best estimate for this common value will be $\hat{p} = (x_1 + x_2)/(n_1 + n_2)$, and the estimated standard error is $\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)}$. PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

TOP: 6–7

88. Refer to Insurance Policy Sales Narrative. Calculate the standard error of . Based on your knowledge of the standard normal distribution, is this a likely or an unlikely observation, assuming that is true and the two population proportions are the same? Justify your conclusion. ANS: and

The test statistic, based on the sample data

will be

This is a likely observation if

is true, since it lies less than one standard deviation below

PTS: 1 REF: 384-387 BLM: Higher Order - Evaluate

TOP: 6–7

89. Refer to Insurance Policy Sales Narrative. p-value approach: Find the p-value for the test. Test for a significant difference in the population proportions at the 1% significance level. ANS: p-value = P( | z | > 0.94) = 2(0.50 – 0.3264) = 0.3472. Since this p-value is greater than 0.01, $H_0$ is not rejected. There is no evidence of a difference in the two population proportions. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate

TOP: 6–7

90. Refer to Insurance Policy Sales Narrative. Critical value approach: Find the rejection region when $\alpha = 0.01$. Do the data provide sufficient evidence to indicate a difference in the population proportions? ANS: The rejection region with $\alpha = 0.01$ is | z | > 2.575 and $H_0$ is not rejected. There is no evidence of a difference in the two population proportions. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate

TOP: 6–7

Drug Testing Narrative An experiment was conducted to test the effect of a new drug on a viral infection. The infection was induced in 100 mice, and the mice were randomly split into 2 groups of 50. The first group, the control group, received no treatment for the infection. The second group received the drug. The proportions of survivors, and in the 2 groups after a 30-day period, were found to be 0.40 and 0.64, respectively. 91. Refer to Drug Testing Narrative. Is there sufficient evidence to indicate that the drug is effective in treating the viral infection? Use ANS: The hypotheses to be tested are then

vs.

statistic is then

. Since = (20 + 32)/100 = 0.52. The test . The rejection with

is and is rejected. There is evidence of a difference in the proportion of survivors for the two groups. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate

TOP: 6–7

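92. Refer to Drug Testing Narrative. Use a 95% confidence interval to estimate the actual difference in the cure rates for the treated versus the control groups. ANS: The approximate 95% confidence interval is given by $(\hat{p}_2 - \hat{p}_1) \pm 1.96\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2}$ = $(0.64 - 0.40) \pm 1.96\sqrt{(0.40)(0.60)/50 + (0.64)(0.36)/50}$ = $0.24 \pm 0.190$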

PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

TOP: 6–7

Cable Narrative A cable company in Ontario is thinking of offering its service in one of two cities: Guelph and Kitchener. Allegedly, the same proportion of households in each city is ready to be hooked up to the cable, but the company wants to test the claim. Accordingly, it takes a simple random sample in each city. In Guelph, 175 of 200 households say they will join. In Kitchener, 665 of 800 households say the same.

93. Refer to Cable Narrative. State the appropriate hypotheses. ANS: $H_0: p_1 - p_2 = 0$ vs. $H_a: p_1 - p_2 \neq 0$ PTS: 1 REF: 384-385 BLM: Higher Order - Analyze

TOP: 6–7

94. Refer to Cable Narrative. Calculate the pooled estimate of the common proportion p. ANS: $\hat{p} = (x_1 + x_2)/(n_1 + n_2) = (175 + 665)/(200 + 800) = 0.84$

PTS: 1 REF: 384-385 BLM: Higher Order - Analyze

TOP: 6–7

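95. Refer to Cable Narrative. Calculate the standard error of $\hat{p}_1 - \hat{p}_2$. ANS: $\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)} = \sqrt{(0.84)(0.16)(1/200 + 1/800)} = 0.0290$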

PTS: 1 REF: 384-385 BLM: Higher Order - Analyze

TOP: 6–7

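96. Refer to Cable Narrative. Calculate the value of the test statistic. ANS: $z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}\hat{q}(1/n_1 + 1/n_2)} = (0.875 - 0.83125)/0.0290 = 1.51$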

PTS: 1 REF: 384-385 BLM: Higher Order - Apply

TOP: 6–7

97. Refer to Cable Narrative. Calculate the p-value and write your conclusion given that $\alpha = 0.05$. ANS: p-value = 2P(z > 1.51) = 2(0.50 – 0.4345) = 0.131. Since p-value = 0.131 > $\alpha = 0.05$, $H_0$ should not be rejected. It is quite likely that the relevant proportions of households that are ready to hook up the cable in the two cities are the same. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate

TOP: 6–7

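98. Refer to Cable Narrative. Construct a 95% confidence interval for the difference in proportions of households in Guelph and Kitchener that are ready to hook up to the cable. ANS: $(\hat{p}_1 - \hat{p}_2) \pm 1.96\sqrt{\hat{p}_1\hat{q}_1/n_1 + \hat{p}_2\hat{q}_2/n_2} = 0.04375 \pm 0.0527$, or $-0.0089 < p_1 - p_2 < 0.0964$ PTS: 1 REF: 386-387 BLM: Higher Order - Analyze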

TOP: 6–7
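For the Cable Narrative, the interval for the difference in proportions uses the separate (unpooled) sample proportions in the standard error, unlike the pooled test statistic. A minimal sketch, with the hypothetical helper name `diff_proportions_interval`:

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_interval(x1, n1, x2, n2, confidence=0.95):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se

# Guelph: 175 of 200 households; Kitchener: 665 of 800 households.
lower, upper = diff_proportions_interval(175, 200, 665, 800)
print(f"95% CI for p1 - p2: ({lower:.4f}, {upper:.4f})")
```

Since 0 falls inside the interval, the sketch is consistent with not rejecting the hypothesis of equal proportions at the 5% level.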

99. Refer to Cable Narrative. Explain how to use the 95% confidence interval for $p_1 - p_2$ to test the appropriate hypotheses at $\alpha = 0.05$. ANS: Since the hypothesized difference $p_1 - p_2 = 0$ is included in the confidence interval, $H_0$ should not be rejected. PTS: 1 REF: 376 | 389 BLM: Higher Order - Evaluate

TOP: 6–7

Soap Sales Narrative In testing the hypotheses vs. statistics: , , , and , where number of Dial Soap sales in the two samples, respectively. 100.

, use the following and represent the

Refer to Soap Sales Narrative. What conclusion can we draw at the 5% significance level? Justify your answer. ANS: Rejection region: |z| > 1.96. Test statistic: z = 1.449. Conclusion: Don’t reject the null hypothesis. PTS: 1 REF: 384-387 BLM: Higher Order - Evaluate

TOP: 6–7

101.

Refer to Soap Sales Narrative. What is the p-value of the test? ANS: p-value = 0.147. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Apply

102.

Refer to Soap Sales Narrative. Explain how to use the p-value to test the hypotheses. ANS: Since p-value = 0.147 > $\alpha = 0.05$, we fail to reject the null hypothesis.

PTS: 1 REF: 365 BLM: Higher Order - Evaluate 103.

TOP: 6–7

Refer to Soap Sales Narrative. Estimate with 95% confidence the difference between the two population proportions. ANS: $0.08 \pm 0.107$. Thus, LCL = –0.027 and UCL = 0.187. PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

104.

TOP: 6–7

Refer to Soap Sales Narrative. Interpret and explain how to use the confidence interval to test the hypotheses. ANS: We estimate that the difference between the population proportions lies between –0.027 and 0.187. Since the hypothesized value 0 is included in the 95% interval estimate, we fail to reject the null hypothesis at $\alpha = 0.05$. PTS: 1 REF: 376 | 389 BLM: Higher Order - Evaluate

TOP: 6

Medical Instruments Narrative In testing the hypotheses vs. , the following statistics were obtained: , , , and , where and represent the number of defective components found in medical instruments in the two samples. 105.

Refer to Medical Instruments Narrative. What conclusion can we draw at the 5% significance level? ANS: Rejection region: z > 1.645. Test statistic: z = 1.199. Conclusion: Don’t reject the null hypothesis. PTS: 1 REF: 384-387 | 720-721 BLM: Higher Order - Evaluate 106.

TOP: 6–7

Refer to Medical Instruments Narrative. What is the p-value of the test? Briefly explain how to use the p-value for testing the hypotheses. ANS: p-value = 0.1151. Since p-value = 0.1151 > $\alpha = 0.05$, we fail to reject the null hypothesis.

PTS: 1 REF: 384-387 | 720-721 | 365 BLM: Higher Order - Evaluate 107.

TOP: 6–7

Refer to Medical Instruments Narrative. Estimate with 95% confidence the difference between the two population proportions. ANS: 0.05 0.0824. Thus, LCL = –0.0324 and UCL = 0.1324 PTS: 1 REF: 384-387 BLM: Higher Order - Analyze

TOP: 6–7

Chapter 10A—Inference From Small Samples MULTIPLE CHOICE 1. When finding a confidence interval for a population mean based on a sample of size 8, which of the following assumptions is made? a. The population standard deviation, $\sigma$, is known. b. The sampling distribution of z is normal. c. The sampled population is approximately normal. d. There is no special assumption made. ANS: C BLM: Remember

PTS:

REF: 403-404

TOP: 1–3

2. As the degrees of freedom for the t distribution increase, which of the following distributions does it more and more closely approximate? a. the chi-square distribution b. the standard normal distribution c. the normal distribution d. the F distribution ANS: B BLM: Remember

PTS:

REF: 401

TOP: 1–3

3. Suppose that a t test is being conducted at the 0.05 level of significance to test $H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$. A sample of size 20 is randomly selected. What is the rejection region? a. t > –2.093 b. t < –1.729 c. t > 1.725 d. t < 2.086 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

4. On which of the following does the shape of the Student’s t distribution primarily depend? a. the population variance b. the number of its degrees of freedom c. the population mean d. the range of the data values ANS: B BLM: Remember

PTS:

REF: 401

TOP: 1–3

5. Which of the following best describes the Student’s t distribution? a. It is a sampling distribution for a random variable, t, derived from a normally distributed population. b. It is a single-peaked distribution above the random variable’s mean, median, and mode of value 1. c. It is always skewed, either to the right or to the left.

ANS: A BLM: Remember

PTS:

REF: 401

TOP: 1–3

6. Which of the following correctly describes the term “degrees of freedom”? a. They are equal to the difference between the point estimate located at the interval centre and either the lower or the upper limit of the interval between which the parameter is free to move. b. They determine the shape of the t distribution, and refer to the number of independent squared deviations in $s^2$ that are available for estimating $\sigma^2$ and, in this sense, can be “freely chosen.” c. They equal sample size minus 2 in the case of a single sample. ANS: B BLM: Remember

PTS:

REF: 401

TOP: 1–3

7. What do the values in a t distribution table measure? a. the area under a specified curve, defined by a given number of degrees of freedom, that lies to the left of a specified value of t b. the area under a specified curve, defined by a given number of degrees of freedom, that lies to the right of a specified value of t c. the heights of a specified curve, defined by a given number of degrees of freedom, at various specified values of t d. the specified values of t for different heights of a specified curve, defined by a given number of degrees of freedom ANS: B BLM: Remember

PTS:

REF: 401-402

TOP: 1–3

8. To estimate with 95% confidence the average number of kilometres that students living off-campus commute to classes every day, a random sample of 20 students was taken and produced a mean equal to 5.2 km and a standard deviation of 3.05 km. Under these circumstances, what is the appropriate critical value? a. 1.7291 b. 1.96 c. 2.086 d. 2.093 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

9. To estimate with 95% confidence the average number of kilometres that students living off-campus commute to classes every day, a random sample of 20 students was taken and produced a mean equal to 5.2 km and a standard deviation of 3.05 km. In this case, what would be the approximate value of the upper limit for a 95% confidence interval estimate for the true population mean? a. 6.63 b. 5.20 c. 3.22 d. 2.15

ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

10. To estimate, with 95% confidence, the average number of kilometres that students living off-campus commute to classes every day, a random sample of 20 students was taken and produced a mean equal to 5.2 km and a standard deviation of 3.05 km. In this case, which of the following is the point estimate for the true population mean? a. 6.63 b. 5.20 c. 3.22 d. 2.15 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

11. A random sample is selected from a normally distributed population. The following sample statistics are obtained: n = 20, $\bar{x}$ = 30, and s = 10. Based on this information, and using a 95% confidence level, which of the following is a valid calculation from the sample statistics? a. The critical value is 1.7921. b. The critical value is 1.96. c. The standard deviation of the sampling distribution of $\bar{x}$ is 0.50. d. The margin of error is approximately 4.68. ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

12. Under which of the following circumstances is it impossible to construct a confidence interval for the population mean? a. a non-normal population with a large sample and an unknown population variance b. a normal population with a large sample and a known population variance c. a non-normal population with a small sample and an unknown population variance d. a normal population with a small sample and an unknown population variance ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 403-404

TOP: 1–3

13. Suppose that a one-tailed t-test is being applied to find out if the population mean is less than 100. The level of significance is 0.05 and 25 observations were sampled. Which of the following best describes the rejection region? a. t > 1.708 b. t < –1.711 c. t > 1.318 d. t < –1.316 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 471-473

TOP: 1–3

14. A random sample of size n has been selected from a normally distributed population. In hypothesis testing for the population mean, when should the t test be used instead of the z test? a. when n < 30 and $\sigma$ is unknown b. when n > 30 and $\sigma$ is unknown c. when n < 30 and $\sigma$ is known d. both a and c ANS: D BLM: Remember

PTS:

REF: 400-401

TOP: 1–3

15. Which of the following is the best description of a robust estimator? a. It is one that is unbiased and symmetrical about 0. b. It is one that is consistent and also mound-shaped. c. It is one that is efficient and less spread out. d. It is one that is not sensitive to a moderate departure from the assumption of normality in the population. ANS: D BLM: Remember

PTS:

REF: 447

TOP: 1–3

16. For a sample of size 20 taken from a normally distributed population with standard deviation equal to 5, which of the following would be required to construct a 90% confidence interval for the population mean? a. z = 1.96 b. t = 1.729 c. z = 1.645 d. t = 1.328 ANS: C TOP: 1–3

PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

17. A major department store chain is interested in estimating the average amount its credit card customers spent. Fifteen credit card accounts were randomly sampled and analyzed with the following results: x̄ = $50.50 and s² = 400. Assuming the distribution of the amount spent is approximately normal, what is the shape of the sampling distribution of the sample mean that will be used to create the desired confidence interval for μ? a. approximately normal with a mean of $50.50 b. a standard normal distribution c. a t distribution with 15 degrees of freedom d. a t distribution with 14 degrees of freedom ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 401-403

TOP: 1–3

18. Researchers determined that 60 tissues is the average number of tissues used during a cold. Suppose a random sample of 100 tissue users yielded the following data on the number of tissues used during a cold: x̄ = 52 and s = 22. Suppose the alternative we wanted to test is Ha: μ < 60. Which of the following best describes the rejection region for α = 0.05? a. Reject H0 if t > 1.6604. b. Reject H0 if t < –1.6604. c. Reject H0 if t > 1.9842 or t < –1.9842. d. Reject H0 if t < –1.9842. ANS: B TOP: 1–3

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

19. A random sample of size 15 taken from a normally distributed population revealed a sample mean of 75 and a sample variance of 25. What would the upper limit of a 95% confidence interval for the population mean equal? a. 77.769 b. 77.273 c. 72.727 d. 72.231 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3
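
Worked check for question 19, as a minimal sketch: the interval is x̄ ± t·s/√n with n – 1 = 14 degrees of freedom. The critical value 2.145 is the usual t-table entry for an upper-tail area of 0.025 with 14 degrees of freedom, entered by hand here; the function name `t_interval` is illustrative.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Small-sample confidence interval for a mean: xbar +/- t * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return xbar - margin, xbar + margin

s = math.sqrt(25)  # the question gives the sample variance, so take the root
lower, upper = t_interval(75, s, 15, 2.145)
print(round(lower, 3), round(upper, 3))
```

The upper limit rounds to option a and the lower limit to option d. The same helper with s = 20 and n = 15 gives the margin of about 11.08 keyed for question 20.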

20. A major department store chain is interested in estimating the average amount its credit card customers spent. Fifteen credit card accounts were randomly sampled and analyzed with the following results: x̄ = $50.50 and s² = 400. Which of the following is a 95% confidence interval for the average amount the credit card customers spent? a. $50.50 ± $9.09 b. $50.50 ± $10.12 c. $50.50 ± $11.00 d. $50.50 ± $11.08 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 404

TOP: 1–3

21. Private colleges and universities rely on money contributed by individuals and corporations for their operating expenses. Much of this money is put into a fund called an endowment, and the college spends only the interest earned by the fund. A recent survey of eight private colleges in Canada revealed the following endowments (in millions of dollars): 60.2, 47.0, 235.1, 490.0, 122.6, 177.5, 95.4, and 220.0. Summary statistics yield x̄ = 180.975 and s = 143.042. Which of the following is a 95% confidence interval for the mean endowment of all the private colleges in Canada? a. $180.975 ± $94.066 b. $180.975 ± $99.123 c. $180.975 ± $116.621 d. $180.975 ± $119.605 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 404-406

TOP: 1–3

22. Researchers determine that 60 tissues is the average number of tissues used during a cold. Suppose a random sample of 100 users yielded the following statistics on the number of tissues used during a cold: x̄ = 52 and s = 22. Using the sample information provided, what is the value of the test statistic? a. t = (52 – 60)/22 b. t = (52 – 60)/(22/10) c. t = (52 – 60)/(22/100) d. t = (52 – 60)/(22/100²) ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 404-406

TOP: 1–3
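
Worked check for question 22, as a sketch: the one-sample statistic is t = (x̄ – μ0)/(s/√n), and √100 = 10 gives the denominator in option b. The comparison against –1.6604 reuses the critical value quoted in question 18 and assumes a lower-tailed alternative; the function name is illustrative.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))

t = one_sample_t(52, 60, 22, 100)
# Assumed lower-tailed test at alpha = 0.05 with 99 degrees of freedom.
reject = t < -1.6604
print(round(t, 3), reject)
```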

23. If you were constructing a 99% confidence interval for the population mean based on a sample of n = 25, where the standard deviation of the sample s = 0.05, what would be the critical value of t? a. 2.7969 b. 2.7874 c. 2.4922 d. 2.4851 ANS: A TOP: 1–3

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

24. Based on sample data, the 90% confidence interval limits for the population mean are LCL = 170.86 and UCL = 195.42. If the 10% level of significance were used in testing the hypotheses vs. , what would you conclude? a. The null hypothesis would be rejected. b. The null hypothesis would not be rejected. c. The null hypothesis would have to be revised. ANS: A PTS: 1 BLM: Higher Order - Evaluate

REF: 404-405

TOP: 1–3

25. Which of the following is NOT a required assumption for constructing a confidence interval estimate of the difference between two population means when the samples are small? a. The populations are normally distributed. b. The population variances are equal. c. The population variances are both equal to 1. d. The samples are selected at random from the populations. ANS: C TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

26. Two independent samples are selected at random from two normal populations. The sample statistics are as follows: and Assuming that a two-tailed hypothesis test is conducted at α = 0.05, what is the value of the test statistic? a. z = 1.645 b. t = 0.891 c. z = 1.960

d. t = 0.928 ANS: B TOP: 4

PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Apply

27. Two independent samples are selected at random from two normal populations. The sample statistics are as follows: and Assuming that a two-tailed hypothesis test is conducted at α = 0.05, what is the critical value? a. t = 2.0484 b. z = 1.96 c. t = 1.7011 d. z = 1.65 ANS: A TOP: 4

PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Apply

28. Two samples of sizes 25 and 35 are independently drawn from two normal populations, where the unknown population variances are assumed to be equal. Which one of the following values is the number of degrees of freedom of the equal-variances t test statistic? a. 60 b. 59 c. 58 d. 35 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 414-415

TOP: 4

29. In constructing a 95% confidence interval estimate for the difference between the means of two normally distributed populations, where the unknown population variances are assumed NOT to be equal, summary statistics computed from two independent samples are as follows: , , , and . In this case, what is the upper confidence limit? a. 28.212 b. 24.911 c. 19.123 d. 5.788 ANS: A TOP: 4

PTS: 1 REF: 414-415 | 418-419 BLM: Higher Order - Apply

30. In testing the difference between the means of two normally distributed populations, the number of degrees of freedom associated with the unequal-variances t test statistic usually results in a non-integer number. In this situation, what is the best recommendation in order to proceed? a. You should round it up to the nearest integer. b. You should round it down to the nearest integer. c. You should change the sample sizes until the number of degrees of freedom becomes an integer. d. You should assume that the population variances are equal, and then use df = n₁ + n₂ – 2.

ANS: B TOP: 4

PTS: 1 REF: 414-415 | 418-419 BLM: Higher Order - Understand
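
To illustrate question 30, here is a sketch of the Satterthwaite approximation commonly used for the unequal-variances degrees of freedom, followed by rounding down. The sample variances and sizes are hypothetical values chosen for illustration, and the function name is an assumption.

```python
import math

def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate df for the unequal-variances t statistic."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

# Hypothetical inputs: s1^2 = 25 with n1 = 12, s2^2 = 100 with n2 = 8.
df_raw = satterthwaite_df(25.0, 12, 100.0, 8)
df_used = math.floor(df_raw)  # round down, the conservative choice
print(round(df_raw, 2), df_used)
```

Rounding down gives a slightly larger critical value, so the procedure errs on the cautious side.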

31. The quantity s² = [(n₁ – 1)s₁² + (n₂ – 1)s₂²]/(n₁ + n₂ – 2) is called the pooled variance estimate of the common variance of two unknown but equal population variances. It is the weighted average of the two sample variances. What do the weights represent? a. sample variances b. sample standard deviations c. sample sizes d. degrees of freedom for each sample ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 418 | 414

TOP: 4
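
A short sketch of the pooled variance in question 31, showing that each sample variance is weighted by its degrees of freedom. The sample sizes 25 and 35 echo question 28; the sample variances 16 and 20 are hypothetical, and the function name is illustrative.

```python
def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Average of two sample variances, weighted by degrees of freedom."""
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)

# Hypothetical variances; df1 + df2 = 24 + 34 = 58, as in question 28.
sp_sq = pooled_variance(16.0, 25, 20.0, 35)
print(round(sp_sq, 4))
```

The result always lies between the two sample variances and sits closer to the one from the larger sample.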

32. Two independent samples of sizes 20 and 30 are randomly selected from two normally distributed populations in order to test the difference between the population means, μ₁ – μ₂. Assume that the population variances are unknown but equal. From the following options, how may the sampling distribution of the sample mean difference x̄₁ – x̄₂ best be described? a. It is normally distributed. b. It is t distributed with 50 degrees of freedom. c. It is t distributed with 48 degrees of freedom. d. It is F distributed with 19 and 29 degrees of freedom. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 413

TOP: 4

33. Two independent samples of sizes 40 and 50 are randomly selected from two populations to test the difference between the population means μ₁ – μ₂. Which of the following best describes the sampling distribution of the sample mean difference x̄₁ – x̄₂? a. It is normally distributed. b. It is approximately normal. c. It is Student’s t distributed, with 88 degrees of freedom. d. It is t distributed, with 88 degrees of freedom. ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 413

TOP: 4

34. Two independent samples of sizes 25 and 35 are randomly selected from two normal populations with equal variances. In order to test the difference between the population means, which of the following would best describe the test statistic? a. It is a standard normal random variable. b. It is an approximately standard normal random variable. c. It is Student’s t distributed with 58 degrees of freedom. d. It is Student’s t distributed with 33 degrees of freedom. ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 414-415

TOP: 4

35. In testing the difference between two population means using two independent samples, under which of the following conditions would we use the pooled variance in estimating the standard error of the sampling distribution of the sample mean difference x̄₁ – x̄₂? a. The sample sizes are both large. b. The populations are normal with equal variances. c. The populations are non-normal with unequal variances. ANS: B BLM: Remember

PTS:

REF: 414-415

TOP: 4

36. In testing the difference between two population means using two independent samples, when is the sampling distribution of the sample mean difference normal? a. when the sample sizes are both greater than 30 b. when the populations are normal c. when the populations are non-normal and the sample sizes are large d. when the populations are non-normal and the sample sizes are small ANS: B BLM: Remember

PTS:

REF: 413

TOP: 4

37. In testing whether the means of two normal populations are equal, summary statistics computed for two independent samples are as follows: , , , and . Assume that the population variances are equal. What is the standard error of the sampling distribution of the sample mean difference x̄₁ – x̄₂? a. 0.1017 b. 0.3189 c. 1.1275 d. 1.2713 ANS: B TOP: 4

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Apply

38. In constructing a confidence interval estimate for the difference between the means of two normally distributed populations, using two independent samples, which of the following approaches should we follow? a. Pool the sample variances when the unknown population variances are equal. b. Pool the sample variances when the population variances are known and equal. c. Pool the sample variances when the population means are equal. d. Never pool the sample variances. ANS: A BLM: Remember

PTS:

REF: 414-415

TOP: 4

39. Which one of the following conditions is assumed with the t test for the difference between the means of two independent populations? a. The respective sample sizes are equal. b. The respective sample variances are equal. c. The respective populations are approximately normal.

ANS: C TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

40. We are testing for the difference between the means of two independent populations with equal variances, and samples of and are taken. What does the number of degrees of freedom equal? a. 29 b. 28 c. 14 d. 13 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 414-415

TOP: 4

41. In testing for differences between the means of two independent populations, what is the null hypothesis? a. b. c. d. ANS: B BLM: Remember

PTS:

REF: 414-415

TOP: 4

42. Given the information: , , , how many degrees of freedom should be used in the pooled variance t test? a. 40 b. 38 c. 25 d. 15 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 414-415

TOP: 4

43. Which of the following could be one possible reason for performing a paired-difference experiment? a. to reduce the degrees of freedom b. to allow one to use a larger value of α c. to reduce the quantity of information in the experiment d. to reduce the variability in the sample data ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 425

TOP: 5

44. Which of the following is an assumption behind the paired-difference t test for the mean difference between two populations? a. The populations are approximately normally distributed. b. The sample variances are equal. c. The sample means are equal. d. The two samples are independent. ANS: A BLM: Remember

PTS:

REF: 446

TOP: 5

45. You wish to test the difference between the means of two paired populations with samples of size 30 each. What is the appropriate value of the degrees of freedom for this test? a. 29 b. 28 c. 15 d. 14 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 426

TOP: 5

46. In testing for differences between the means of two matched pairs populations, how may the null hypothesis best be formulated? a. b. c. d. ANS: C BLM: Remember

PTS:

REF: 426

TOP: 5

47. What is the term used to describe samples in which there exists some natural relationship between each pair of observations that provides a logical reason to compare the first observation of sample 1 with the first observation of sample 2, the second observation of sample 1 with the second observation of sample 2, and so on? a. matched samples b. independent samples c. weighted samples d. random samples ANS: A BLM: Remember

PTS:

REF: 425

TOP: 5

48. What is the number of degrees of freedom associated with the t test when the data are gathered from a matched pairs experiment with 10 pairs? a. 9 b. 10 c. 18 d. 20 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 426

TOP: 5

49. Which of the following values is the number of degrees of freedom associated with the t test when the data are gathered from a matched pairs experiment with 21 pairs? a. 9 b. 10 c. 18 d. 20 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 426

TOP: 5

50. A test is being conducted to test the difference between two population means using data that are gathered from a matched pairs experiment. If the paired differences are normal, then which distribution should be used for testing? a. normal distribution b. binomial distribution c. Student t distribution d. F distribution ANS: C BLM: Remember

PTS:

REF: 426-427

TOP: 5

51. For which kinds of data can we design a matched pairs experiment? Choose the best response. a. observational b. experimental c. controlled d. observational, experimental, and controlled ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 426

TOP: 5

52. We are testing for the difference between the means of two dependent populations (matched pairs experiment) with sample sizes of n₁ = 15 and n₂ = 15. What would be the number of degrees of freedom? a. 29 b. 28 c. 14 d. 13 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 426

TOP: 5

53. In what type of test is the variable of interest the difference between the values of the observations rather than the observations themselves? a. a test for the equality of variances from two independent populations b. a test for the difference between the means of two dependent populations c. a test for the difference between the means of two independent populations ANS: B BLM: Remember

PTS:

REF: 425-427

TOP: 5
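
The paired-difference idea in question 53 can be sketched as follows: the test works on the differences, so the degrees of freedom equal the number of pairs minus one. The data are hypothetical and the function name `paired_t` is illustrative.

```python
import math
from statistics import mean, stdev

def paired_t(sample1, sample2, mu_d0=0.0):
    """Paired-difference t statistic and its degrees of freedom."""
    if len(sample1) != len(sample2):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(sample1, sample2)]
    n = len(diffs)
    t = (mean(diffs) - mu_d0) / (stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [8, 11, 9, 9, 10]
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```

With 10, 21, or 30 pairs the same rule gives 9, 20, and 29 degrees of freedom, matching questions 48, 49, and 45.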

54. In testing for differences between the means of two dependent populations, which of the following is the best formulation of the null hypothesis? a. b. c. d. ANS: B BLM: Remember

PTS:

REF: 426

TOP: 5

55. Which of the following is a characteristic of the chi-square distribution? a. symmetric around 0 b. positively skewed c. negatively skewed d. mound-shaped ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 432

TOP: 6

56. On which of the following does the shape of the chi-square distribution depend? a. the population variance b. the number of its degrees of freedom c. the population mean d. the range of the data values ANS: B BLM: Remember

PTS:

REF: 432

TOP: 6

57. Which of the following may be used to describe the sampling distribution of the quantity (n – 1)s²/σ²? a. It is an F distribution. b. It is a chi-square distribution. c. It is a normal distribution. d. It is a t distribution.

ANS: B BLM: Remember

PTS:

REF: 432

TOP: 6

58. A random sample of 20 observations is selected from a normally distributed population. The sample variance is 12. What is the upper limit of the 95% confidence interval for the population variance? a. 6.940 b. 7.564 c. 22.536 d. 25.599 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 434-437

TOP: 6
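
Worked check for question 58, as a sketch: the interval for σ² is ((n – 1)s²/χ²(α/2), (n – 1)s²/χ²(1 – α/2)). The two chi-square values for 19 degrees of freedom (8.90655 and 32.8523) are standard table entries typed in by hand; the function and parameter names are illustrative.

```python
def variance_interval(s_sq, n, chi2_small, chi2_large):
    """Confidence interval for a population variance.

    chi2_small: table value with area 1 - alpha/2 to its right.
    chi2_large: table value with area alpha/2 to its right.
    """
    ss = (n - 1) * s_sq
    # Dividing by the larger table value gives the lower limit.
    return ss / chi2_large, ss / chi2_small

lower, upper = variance_interval(12, 20, 8.90655, 32.8523)
print(round(lower, 3), round(upper, 3))
```

The upper limit matches option d, and the lower limit reproduces the distractor in option a.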

59. In a hypothesis test for the population variance, the hypotheses to be tested are vs. . The sample size is 20 and the test is being carried out at the 10% level of significance. Under which of the following conditions would the null hypothesis be rejected? a. b. c. d. ANS: B TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Evaluate

60. Which sampling distribution is used to make inferences about a single population variance? a. a normal distribution b. a t distribution c. an F distribution d. a chi-square distribution ANS: D BLM: Remember

PTS:

REF: 434

TOP: 6

61. A hypothesis test is to be conducted regarding a population variance. Which distribution would be used to obtain the critical value of the test? a. a normal distribution b. a t distribution with n – 1 degrees of freedom c. a chi-square distribution with n – 1 degrees of freedom d. a binomial distribution ANS: C BLM: Remember

PTS:

REF: 434

TOP: 6

62. You are testing vs. using a sample of 15 observations and a significance level equal to 0.05. What is the critical value of the test? a. 1.7613 b. 4.867 c. 6.6450 d. 23.685 ANS: D TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Analyze

63. In testing vs. the following sample data were recorded: 5.0, 6.1, and 11.1. In this case, what is the value of the test statistic? a. 10.570 b. 7.400 c. 5.285 d. CHOICE BLANK

ANS: C TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Analyze
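
A sketch of the chi-square statistic in question 63, computed from the three observations. The hypothesized variance did not survive in the question text; the value 4 used below is an assumption, chosen because it reproduces the keyed statistic. The function name is illustrative.

```python
from statistics import variance

def chi_square_statistic(data, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma0^2 for a single variance."""
    n = len(data)
    return (n - 1) * variance(data) / sigma0_sq

# sigma0_sq = 4.0 is an assumed hypothesized variance, not given in the text.
chi2 = chi_square_statistic([5.0, 6.1, 11.1], 4.0)
print(round(chi2, 3))
```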

64. In testing vs. the following sample data were recorded: 11.5, 6.5, and 5.4. What is the p-value of the test? a. It is between 0.05 and 0.10. b. It is between 0.10 and 0.20. c. It is less than 0.05. d. It is greater than 0.20. ANS: A TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Analyze

65. A random sample of 25 observations is selected from a normally distributed population. The sample variance is 10. What is the upper limit of the 95% confidence interval for the population variance? a. 6.097 b. 17.110 c. 17.331 d. 19.353 ANS: D TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Apply

66. In a hypothesis test for the population variance, the hypotheses are vs. . The sample size is 20 and the test is being carried out at the 5% level of significance. Under which of the following conditions would the null hypothesis be rejected? a. b. c. d. ANS: D TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Evaluate

67. In a hypothesis test for the population variance, the hypotheses are vs. . The sample size is 15 and the test is being carried out at the 10% level of significance. What would be the rejection region? a. χ² < 6.571 or χ² > 23.685 b. χ² < 7.261 or χ² > 24.996 c. χ² < 7.790 or χ² > 21.064 d. χ² < 8.547 or χ² > 22.307 ANS: A

PTS:

REF: 434-437 | 723-724

TOP: 6

BLM: Higher Order - Analyze

68. Which of these statements could be used to describe the chi-square distribution? a. It is symmetrical about 0. b. It is positively skewed, ranging between 0 and ∞. c. It is negatively skewed, ranging between –∞ and 0. d. It is mound-shaped. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 432-433

TOP: 6

69. On which of the following does the shape of the chi-square distribution depend? a. the population size b. the number of its degrees of freedom c. the population standard deviation d. whether the population is unimodal or bimodal ANS: B BLM: Remember

PTS:

REF: 432-433

TOP: 6

70. Which of the following characteristics must hold if the statistic (n – 1)s²/σ² is to be chi-square distributed with n – 1 degrees of freedom? a. The population must be normally distributed with variance equal to σ². b. The sample must be normally distributed with variance equal to σ². c. The sample must have a Student’s t distribution with degrees of freedom equal to n – 1. ANS: A BLM: Remember

PTS:

REF: 432 | 434

TOP: 6

71. A random sample of size 20 taken from a normally distributed population resulted in a sample variance of 32. What would be the lower limit of a 90% confidence interval for the population variance? a. 20.170 b. 20.375 c. 52.185 d. 54.931 ANS: A TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Apply

72. A random sample of size 25 taken from a normally distributed population resulted in a sample standard deviation of 0.93054. What are the lower and upper limits of a 99% confidence interval for the population variance? a. 9.886 and 45.559 b. 3.144 and 6.750 c. 0.678 and 1.449 d. 0.456 and 2.102 ANS: D

PTS:

REF: 434-437 | 723-724

TOP: 6

BLM: Higher Order - Apply

73. Which of the following is NOT a property of the F distribution? a. F distributions are non-symmetrical. b. F distributions have (n – 1) degrees of freedom. c. F can assume only positive values. d. There are many F distributions and each has a different shape. ANS: B BLM: Remember

PTS:

REF: 439

TOP: 7

74. Which of the following best describes the sampling distribution of the ratio of two independent sample variances selected randomly from normal populations with equal variances? a. It is a normal distribution. b. It is a t distribution. c. It is an F distribution. d. It is a chi-square distribution. ANS: C BLM: Remember

PTS:

REF: 439

TOP: 7

75. Two random samples of 10 and 12 observations produced sample variances equal to 7.50 and 3.20, respectively. Which of the following is the calculated value of the test statistic when testing vs. ? a. 10.70 b. 7.50 c. 5.49 d. 3.20 ANS: C TOP: 7

PTS: 1 REF: 441-442 | 725-732 BLM: Higher Order - Apply

76. Which of the following distributions is used when testing vs. ? a. chi-square distribution b. normal distribution c. F distribution d. t distribution ANS: C BLM: Remember

PTS:

REF: 441-442

TOP: 7

77. For which of the following ratios is its sampling distribution the F distribution? a. the ratio of two normal population variances b. the ratio of two normal population means c. the ratio of two sample variances provided that the samples are independently drawn from two normal populations with equal variances d. the ratio of two sample variances provided that the sample sizes are large ANS: C

PTS:

REF: 439-440

TOP: 7

BLM: Remember

78. When testing for the difference between two population variances with sample sizes of n₁ = 8 and n₂ = 10, what are the numbers of degrees of freedom? a. 18 and 2 b. 8 and 10 c. 7 and 9 d. 2 and 18

ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 441-442

TOP: 7
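
A sketch of the bookkeeping behind question 78: each sample variance carries n – 1 degrees of freedom, and a common convention (assumed here) is to place the larger sample variance in the numerator. The sample sizes 8 and 10 are illustrative values consistent with the keyed answer, the variances are hypothetical, and the function name is an assumption.

```python
def f_test_setup(s1_sq, n1, s2_sq, n2):
    """Return (F, numerator df, denominator df), larger variance on top."""
    if s1_sq >= s2_sq:
        return s1_sq / s2_sq, n1 - 1, n2 - 1
    return s2_sq / s1_sq, n2 - 1, n1 - 1

# Hypothetical sample variances 6.0 and 3.0 with sample sizes 8 and 10.
f_stat, df_num, df_den = f_test_setup(6.0, 8, 3.0, 10)
print(f_stat, df_num, df_den)
```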

79. Which of the following statements is correct regarding the percentile points of the F distribution? a. b. c. d. ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 439-440

TOP: 7

80. A researcher wants to test for the equality of two population variances when the populations are normally distributed. He has opted for a 10% level of significance. Which of the following upper-tail areas of the F table must he use to determine the rejection region? a. 0.90 b. 0.20 c. 0.10 d. 0.05 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 439-442

TOP: 7

81. In constructing a 90% interval estimate for the ratio of two population variances, σ₁²/σ₂², two independent samples of sizes 40 and 60 are drawn from the populations. The sample variances are 515 and 920. What is the lower confidence limit? a. 0.341 b. 0.352 c. 0.890 d. 0.918 ANS: B TOP: 7

PTS: 1 REF: 441-442 | 725-732 BLM: Higher Order - Apply
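
Worked check for question 81, as a sketch: the lower limit is the ratio of sample variances divided by the upper-tail F value with area 0.05. The value 1.59 is an assumption taken from the nearest F-table row (40 and 60 degrees of freedom), since printed tables usually lack 39 and 59; the function name is illustrative.

```python
def ratio_lower_limit(s1_sq, s2_sq, f_upper):
    """Lower confidence limit for sigma1^2 / sigma2^2."""
    return (s1_sq / s2_sq) / f_upper

# f_upper = 1.59 is an assumed table value, entered by hand.
lower = ratio_lower_limit(515, 920, 1.59)
print(round(lower, 3))
```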

TRUE/FALSE 1. In making inferences about a population mean μ, the degrees of freedom used in the t distribution are equal to the sample size.

ANS: F BLM: Remember

PTS:

REF: 401

TOP: 1–3

2. Regardless of the degrees of freedom, every t distribution is symmetric around 0. ANS: T BLM: Remember

PTS:

REF: 401

TOP: 1–3

3. The “shape” of the t distribution changes as the value of the sample mean changes. ANS: F BLM: Remember

PTS:

REF: 401

TOP: 1–3

4. If a sample of size 15 is randomly selected from a population, the value of A for the probability P(t ≥ A) = 0.01 is 2.602. ANS: F TOP: 1–3

PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

5. If a sample of size 20 is randomly selected from a population, the value of A for the probability P(–A ≤ t ≤ A) = 0.95 is 2.093. ANS: T TOP: 1–3

PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

6. If a sample has 15 observations and a 95% confidence estimate for μ needs to be constructed, the appropriate t score is 2.131. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

7. If a sample has 10 observations and a 90% confidence estimate for μ is needed, the appropriate t score is 1.833. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

8. The Student’s t distribution is a sampling distribution for a random variable, t, derived from a normally distributed population, that is (1) single-peaked above the random variable’s mean, median, and mode of 0; (2) perfectly symmetrical about this central value; and (3) characterized by tails extending indefinitely in both directions from the centre, approaching, but never touching, the horizontal axis. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 401

TOP: 1–3

9. With all other factors held constant, confidence intervals constructed with small samples tend to have greater margins of error than those constructed from larger samples.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 400-403

TOP: 1–3

10. A random sample of size 10 produced a sample mean equal to 12 and a standard deviation equal to 0.15. Based on this information, the upper limit for a 95% confidence interval estimate is approximately 12.107. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

11. A random sample of size 10 produced a sample mean equal to 12 and a standard deviation equal to 0.15. Based on this information, the margin of error associated with a 90% confidence interval estimate for the population mean is 1.8331. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

12. The t distribution is used to construct a confidence interval for the population mean when the population is unknown and the sample size is small. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 404

TOP: 1–3

13. A random sample of size 9 produced a sample mean equal to 13.5 and a standard deviation of 3.2. The margin of error associated with a 95% confidence interval estimate for the population mean is approximately 2.46. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3
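
Items 10, 11, and 13 all turn on the margin of error t·s/√n, so a short sketch may help. The critical values (2.262 and 1.833 for 9 degrees of freedom, 2.306 for 8) are standard t-table entries typed in by hand, and the function name is illustrative.

```python
import math

def margin_of_error(t_crit, s, n):
    """Margin of error for a small-sample mean: t * s / sqrt(n)."""
    return t_crit * s / math.sqrt(n)

m10 = margin_of_error(2.262, 0.15, 10)  # item 10: 95%, n = 10
m11 = margin_of_error(1.833, 0.15, 10)  # item 11: 90%, n = 10
m13 = margin_of_error(2.306, 3.2, 9)    # item 13: 95%, n = 9
print(round(12 + m10, 3), round(m11, 4), round(m13, 2))
```

Item 11 is false because 1.8331 is the critical value itself, not the margin of error, which is under 0.09 here.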

14. A 90% confidence interval estimate for the population mean constructed with a small sample will have a margin of error that is approximately 90% of the population size n. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 404

TOP: 1–3

15. If a sample has 15 observations and a 90% confidence estimate for μ is needed, the appropriate t score is 1.341. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

16. If a sample has 18 observations and a 90% confidence estimate for μ is needed, the appropriate t score is 1.740. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

17. The statistic t = (x̄ – μ)/(s/√n), when the sampled population is normal, is Student’s t distributed with n degrees of freedom. ANS: F BLM: Remember

PTS:

REF: 401 | 403

TOP: 1–3

18. The t distribution approaches the standard normal distribution as the number of degrees of freedom increases. ANS: T BLM: Remember

PTS:

REF: 401

TOP: 1–3

19. In order to determine the p-value associated with hypothesis testing about the population mean μ, it is necessary to know the value of the test statistic. ANS: T BLM: Remember

PTS:

REF: 404-406

TOP: 1–3

20. Statisticians have shown that the mathematical process that derived the Student’s t distribution is robust, which means that if the sampled population is non-normal, the t test of the population mean is still valid, provided that the population is not extremely non-normal. ANS: T BLM: Remember

PTS:

REF: 403

TOP: 1–3

21. A race car driver tested his car for time from 0 to 60 mph, and in 20 tests obtained an average of 4.85 seconds with a standard deviation of 1.47 seconds. A 95% confidence interval for the 0 to 60 time is 4.52 seconds to 5.18 seconds. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

22. In forming a 95% confidence interval for a population mean from a sample size of 20, the number of degrees of freedom from the t distribution equals 18. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 404 | 722

TOP: 1–3

23. The t distribution allows the calculation of confidence intervals for means when the actual standard error is not known. ANS: T BLM: Remember

PTS:

REF: 404

TOP: 1–3

24. The t distribution allows the calculation of confidence intervals for means for small samples when the population variance is not known, regardless of the shape of the distribution in the population.

ANS: F BLM: Remember

PTS:

REF: 404

TOP: 1–3

25. For a t distribution with 12 degrees of freedom, the area between –2.6810 and 2.1788 is 0.980. ANS: F TOP: 1–3

PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

26. The t distribution is symmetric about 0 but is more spread out than the standard normal distribution. ANS: T BLM: Remember

PTS:

REF: 401

TOP: 1–3

27. Even though the t distribution is mound-shaped, as the degrees of freedom get smaller, the t distribution’s dispersion also decreases. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 401

TOP: 1–3

28. The shape of the t distribution depends on the size of the sample because that influences the number of degrees of freedom. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 401

TOP: 1–3

29. When small samples are used to estimate the true population mean where the population standard deviation is unknown, the standard normal distribution must be used to obtain the critical value. ANS: F BLM: Remember

PTS:

REF: 400-401

TOP: 1–3

30. When small samples are used to estimate the true population mean where the population standard deviation is unknown, the t distribution must be used to obtain the critical value. ANS: T BLM: Remember

PTS:

REF: 400-401

TOP: 1–3

31. When small samples are used to estimate the true population mean where the population standard deviation is unknown, the margin of error for the confidence interval estimate tends to be very small. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 401

TOP: 1–3

32. An assumption behind the t distribution is that it assumes the population is normally distributed. ANS: T BLM: Remember

PTS:

REF: 403

TOP: 1–3

33. The Student’s t distribution has more area in the tails and less in the centre than does the normal distribution. ANS: T BLM: Remember

PTS:

REF: 401

TOP: 1–3

34. The Student’s t distribution is used to construct confidence intervals for the population mean when the population standard deviation is known. ANS: F BLM: Remember

PTS:

REF: 404 | 401

TOP: 1–3

35. In testing the difference between two population means using two independent samples, the sampling distribution of the sample mean difference x̄₁ – x̄₂ is normal if the sample sizes are both greater than 30. ANS: F BLM: Remember

PTS:

REF: 413 | 331

TOP: 4

36. Two samples of sizes 15 and 20 are randomly and independently selected from two normally distributed populations, where the unknown population variances are assumed to be equal. The number of degrees of freedom associated with the two-sample t test is equal to 35. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 414-415

TOP: 4

37. The pooled-variances t-test requires that the two population variances need not be the same. ANS: F TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

38. In testing the difference between two population means using two independent random samples, we use the pooled variance in estimating the standard error of the sampling distribution of the sample mean difference x̄₁ – x̄₂ if the populations are normal with equal variances. ANS: T BLM: Remember

PTS:

REF: 414-415

TOP: 4

39. In testing the difference between two population means using two independent random samples, the population standard deviations are assumed to be known, and the calculated test statistic equals 2.75. If the test is two-tailed and 5% level of significance has been specified, the conclusion should be not to reject the null hypothesis. ANS: F TOP: 4

PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Analyze

40. The necessary conditions having been met, a two-tailed test was conducted to test the difference between two population means. However, the statistical software provided only a one-tailed area of 0.045 as part of its output. In this case, the p-value for this test would have been 0.09. ANS: T TOP: 4

PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Understand
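
Items 39 and 40 can be checked with a small sketch of two-tailed p-values. Item 39 uses a z statistic because the population standard deviations are taken as known; item 40 simply doubles the reported one-tailed area. The function names are illustrative.

```python
from statistics import NormalDist

def two_tailed_p_from_z(z):
    """Two-tailed p-value for a standard normal test statistic."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))

def two_tailed_p_from_one_tail(one_tail_area):
    """Double a one-tailed area to get the two-tailed p-value."""
    return 2.0 * one_tail_area

p39 = two_tailed_p_from_z(2.75)
p40 = two_tailed_p_from_one_tail(0.045)
print(round(p39, 4), round(p40, 2))
```

Because the p-value in item 39 is well below 0.05, the null hypothesis is rejected, which is why the statement is keyed false.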

41. Two random samples of sizes 25 and 20 are independently drawn from two normal populations, where the unknown population variances are assumed to be equal. The number of degrees of freedom of the equal-variances t-test statistic is 44. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 414-415

TOP: 4

42. The equal-variances test statistic of μ1 – μ2 is Student’s t distributed with degrees of freedom, provided that the two populations are normal. ANS: F BLM: Remember

PTS:

REF: 414-415

TOP: 4

43. When the population variances are unequal, we estimate each population variance with its sample variance. Hence, the unequal-variances test statistic of μ1 – μ2 is Student’s t distributed with n1 + n2 – 2 degrees of freedom. ANS: F BLM: Remember

PTS:

REF: 418

TOP: 4

44. Statisticians have shown that for given sample sizes n1 and n2, the number of degrees of freedom associated with the equal-variances test statistic and confidence interval estimator of μ1 – μ2 is always greater than or equal to the number of degrees of freedom associated with the unequal-variances test statistic and confidence interval estimator. ANS: T TOP: 4

PTS: 1 REF: 414-415 | 418 BLM: Higher Order - Understand
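The comparison of degrees of freedom in item 44 can be illustrated with a short sketch. It assumes the usual Satterthwaite approximation for the unequal-variances case; the function names and the sample variances (4.0 and 9.0) are hypothetical values chosen only for illustration, while the sample sizes come from items 36 and 41.

```python
def pooled_df(n1, n2):
    """Degrees of freedom for the equal-variances (pooled) two-sample t test."""
    return n1 + n2 - 2

def satterthwaite_df(s1_sq, n1, s2_sq, n2):
    """Approximate degrees of freedom for the unequal-variances t test."""
    a = s1_sq / n1
    b = s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

df_item_36 = pooled_df(15, 20)   # 33, not 35
df_item_41 = pooled_df(25, 20)   # 43, not 44

# Hypothetical sample variances, used only to compare the two approaches.
df_unequal = satterthwaite_df(4.0, 15, 9.0, 20)
```

With these inputs the approximate degrees of freedom fall between 32 and 33, which does not exceed the pooled value of 33.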

45. Both the equal-variances and the unequal-variances test statistics and confidence interval estimator of μ1 – μ2 require that the two populations be normally distributed.

ANS: T TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

46. The sample size in each independent sample must be the same if we are to test for differences between the means of two independent populations. ANS: F BLM: Remember

PTS:

REF: 414-415

TOP: 4

47. When we test for differences between the means of two independent populations, we can use only a two-tailed test. ANS: F BLM: Remember

PTS:

REF: 414-415

TOP: 4

48. To test the difference between two population means by applying the two-sample procedure that uses a pooled estimate of the common variance σ², the two samples must be independent of each other. ANS: T TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

49. To test the difference between two population means by applying the two-sample procedure that uses a pooled estimate of the common variance σ², the populations from which the samples are drawn must be t distributed. ANS: F TOP: 4

PTS: 1 BLM: Remember

REF: 414-415 | 418

50. A political analyst in Manitoba surveys a random sample of registered Liberals and compares the results with those obtained from a random sample of registered Conservatives. This would be an example of an experimental design called a paired-difference or matched-pairs design. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 425

TOP: 5

51. In a paired-difference experiment, two samples of size n are being used. The number of degrees of freedom associated with the paired-difference test is n – 1. ANS: T BLM: Remember

PTS:

REF: 426

TOP: 5

52. In comparing two population means when samples are matched pairs, the variable under consideration is d, the difference between the corresponding sample values. ANS: T BLM: Remember

PTS:

REF: 426

TOP: 5

53. The matched pairs experiment always produces a larger test statistic than the independent-samples experiment. ANS: F TOP: 5

PTS: 1 REF: 426 | 414-415 BLM: Higher Order - Understand

54. A statistics professor wanted to test whether the grades on a statistics test were the same for first- and third-year students. The professor took a random sample of size 12 from each and conducted a test, determining that the variances were equal. For this situation, the professor should use a matched pairs t test. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 425-427

TOP: 5

55. The number of degrees of freedom associated with the t test when the data are gathered from a matched pairs experiment with 8 pairs is 7. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 426

TOP: 5

56. When comparing two population means using data that are gathered from a matched pairs experiment, the test statistic for μd is Student’s t distributed, with n – 1 degrees of freedom, provided that the differences are normally distributed. ANS: T BLM: Remember

PTS:

REF: 426

TOP: 5

57. A Canadian Forces drill instructor recorded the time in which each of 15 recruits completed an obstacle course both before and after basic training. To test whether any improvement occurred, the instructor would use a t distribution with 15 degrees of freedom. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 425-426

TOP: 5

58. In comparing two population means of interval data, we must decide whether the samples are independent (in which case the parameter of interest is μ1 – μ2) or matched pairs (in which case the parameter is μd) in order to select the correct test statistic. ANS: T TOP: 5

PTS: 1 BLM: Remember

REF: 414-415 | 426

59. A researcher is curious about the effect of sleep on students’ test performances. He chooses 50 students and gives each two tests: one given after four hours of sleep and one after eight hours of sleep. The test the researcher should use would be a matched pairs t test. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 425-427

TOP: 5

60. A Canadian Forces instructor recorded the time in which each of 10 recruits completed an obstacle course both before and after basic training. To test whether any improvement occurred, the instructor would use a t distribution with nine degrees of freedom. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 425-426

TOP: 5
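A paired-difference test such as the one in item 60 works on the differences d, one per recruit, so the degrees of freedom are the number of pairs minus one. The sketch below uses hypothetical before and after times (the test bank gives no data for this item) and the one-tailed table value 1.833 for 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Hypothetical obstacle-course times (seconds) for 10 recruits; not from the test bank.
before = [68, 72, 75, 80, 66, 71, 77, 69, 74, 70]
after = [65, 70, 74, 76, 66, 68, 75, 66, 71, 69]

# Work with the paired differences d = before - after.
diffs = [b - a for b, a in zip(before, after)]
n_pairs = len(diffs)
df = n_pairs - 1                     # 9 degrees of freedom for 10 pairs

d_bar = mean(diffs)
s_d = stdev(diffs)                   # sample standard deviation of the differences
t_stat = d_bar / (s_d / math.sqrt(n_pairs))

t_crit = 1.833                       # table value: upper-tail area 0.05, 9 df
improvement_detected = t_stat > t_crit
```

For these made-up times the statistic is about 5.66, well above 1.833, so an improvement would be concluded at the 0.05 level.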

61. Matched pairs sampling is a procedure that matches each unit from population A with a “twin” from population B so that any sample observation about a unit in population A automatically yields an associated observation about a unit in population B. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 425 | 427

TOP: 5

62. Matched pairs sampling is designed to control for extraneous factors that might influence the characteristic being measured in addition to the variable under study. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 425 | 427

TOP: 5

63. Matched pairs sampling may be used when testing the effectiveness of a new drug compared to a traditional one. Each patient in an experimental group might be matched with a partner in a control group of the same age, weight, height, sex, occupation, medical history, lifestyle, and so on. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 425 | 427

TOP: 5

64. The test statistic used to test hypotheses about the population variance σ² is χ² = (n – 1)s²/σ0², which is chi-square distributed with n – 1 degrees of freedom when the population is normally distributed with variance equal to σ0². ANS: T BLM: Remember

PTS:

REF: 433-434

TOP: 6

65. The t distribution with n – 1 degrees of freedom is used when testing a null hypothesis for a population variance. ANS: F BLM: Remember

PTS:

REF: 433-434

TOP: 6

66. If you wish to test vs. at the 0.05 level of significance using a sample of 15 observations, the critical value to be used is 23.685. ANS: T TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Apply

67. If you wish to test vs. at the 0.05 level of significance using a sample of 20 observations, the critical values to be used are 32.852. ANS: F TOP: 6

PTS: 1 REF: 434-437 | 723-724 BLM: Higher Order - Apply

68. For a given level of significance, increasing the sample size will tend to increase the chi-square critical value used in testing the null hypothesis about a population variance. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 433-434

TOP: 6

69. The chi-square critical value denotes the number on the measurement axis such that 10% of the area under the chi-square curve with 6 degrees of freedom lies to the right of . ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 433-434

TOP: 6

70. The 5th percentile of a chi-square distribution with 10 degrees of freedom is equal to 4.3903. ANS: F TOP: 6

PTS: 1 REF: 433-434 | 723-724 BLM: Higher Order - Apply

71. The 90th percentile of a chi-square distribution with 15 degrees of freedom is equal to 22.3072. ANS: T TOP: 6

PTS: 1 REF: 433-434 | 723-724 BLM: Higher Order - Apply
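The percentiles quoted in items 70 and 71 can be sanity-checked without a chi-square table. The sketch below uses the Wilson-Hilferty normal approximation, which is an approximation chosen here for illustration and not the method of the textbook; the function name is hypothetical.

```python
from statistics import NormalDist

def chi2_percentile_approx(p, df):
    """Approximate the 100p-th percentile of a chi-square(df) variable
    using the Wilson-Hilferty cube-root normal approximation."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * c ** 0.5) ** 3

# Item 70: 5th percentile with 10 df (standard table value 3.9403, not 4.3903).
p05_df10 = chi2_percentile_approx(0.05, 10)

# Item 71: 90th percentile with 15 df (standard table value 22.3072).
p90_df15 = chi2_percentile_approx(0.90, 15)
```

The approximation lands within a few hundredths of the table values, which is close enough to confirm that 4.3903 is not the 5th percentile for 10 degrees of freedom.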

72. The area under a chi-square curve with 10 degrees of freedom, which is captured between the critical values ANS: T TOP: 6

, is

PTS: 1 REF: 433-434 | 723-724 BLM: Higher Order

73. The chi-square distribution can be used in constructing confidence intervals and carrying out hypothesis tests regarding the value of a population variance. ANS: T BLM: Remember

PTS:

REF: 434

TOP: 6

74. The chi-square distribution is skewed to the left (negatively skewed), but as degrees of freedom increase, it approaches the shape of the binomial distribution. ANS: F BLM: Higher Order - Understand

PTS:

REF: 432-433

TOP: 6

75. The area to the right of a chi-square variable is 0.025. For five degrees of freedom, the critical value is 11.143. ANS: F TOP: 6

PTS: 1 REF: 433-434 | 723-724 BLM: Higher Order - Apply

76. A right-tailed area in the chi-square distribution equals 0.05. For eight degrees of freedom, the critical value equals 13.362. ANS: F TOP: 6

PTS: 1 REF: 433-434 | 723-724 BLM: Higher Order - Apply

77. In a hypothesis test for a population variance σ², the null hypothesis is stated in terms of σ². ANS: T BLM: Remember

PTS:

REF: 434

TOP: 6

78. In a hypothesis test for a population variance σ², the chi-square distribution is used to determine the critical value. ANS: T BLM: Remember

PTS:

REF: 434

TOP: 6

79. In a hypothesis test for a population variance σ², the critical value increases as the sample size increases for a given level of significance. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 433-434

TOP: 6

80. In testing the equality of two population variances, when the populations are normally distributed, the 5% level of significance has been used. To determine the rejection region, you will refer to the F table corresponding to an upper-tail area of 0.025. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 441-442

TOP: 7

81. We can use either the z test or the t test to determine whether two population variances are equal. ANS: F BLM: Remember

PTS:

REF: 439

TOP: 7

82. The value of F with area 0.05 to its right for df1 = 6 and df2 = 9 is 3.37. ANS: T BLM: Higher Order - Apply

PTS:

REF: 439-440

TOP: 7

83. The value of F that locates an area 0.01 in the upper tail of the F distribution for df1 = 15 and df2 = 10 is 3.80.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 439-440

TOP: 7

84. If two population variances have been tested and found to be equal, then it is reasonable to conclude that the two random samples selected from the two populations have equal variances. ANS: F PTS: 1 BLM: Higher Order - Understand 85. In testing

vs.

REF: 438-439

TOP: 7

the null hypothesis will be rejected if the ratio

is substantially larger than 1.0. ANS: T TOP: 7

PTS: 1 BLM: Remember

REF: 441-442 | 725-732

86. A two-tailed test for two population variances could have the null hypothesis written as

ANS: T BLM: Remember

PTS:

REF: 441-442

TOP: 7

87. In testing vs. the larger the sample sizes from the two populations, the smaller will be the chance of committing a Type I error. ANS: F TOP: 7

PTS: 1 REF: 441-442 | 368 BLM: Higher Order - Understand

88. In testing vs. the critical value is determined from the F distribution table with an upper tail area equal to half the value of the level of significance. ANS: T PTS: 1 BLM: Higher Order - Understand 89. In testing vs. of the test statistic is F = 1.60. ANS: F TOP: 7

REF: 441-442

TOP: 7

and

then the calculated value

PTS: 1 REF: 441-442 | 725-732 BLM: Higher Order - Apply

90. The F distribution is used in testing the hypotheses

vs.

ANS: F BLM: Remember

PTS:

91. In testing where

vs.

REF: 441-442

the F-test statistic is calculated as F =

are the two sample variances and

ANS: F BLM: Remember

PTS:

TOP: 7

is the smaller of the two.

REF: 441-442

TOP: 7

92. The necessary conditions having been met, a two-tailed test is being conducted at = 0.05 to test

. The two sample variances are

sizes are

, and the sample

. The rejection region is F > 2.20 or F < 0.4255.

ANS: T TOP: 7

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Apply

93. In testing for the equality of two population variances, when the populations are normally distributed, the 5% level of significance has been used. To determine the rejection region, it will be necessary to refer to the F table corresponding to an upper-tail area of 0.05. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 441-442

94. The test statistic employed to test

TOP: 7

, which is F distributed with

degrees of freedom, provided that the two populations are F distributed. ANS: F BLM: Remember

PTS:

REF: 441-442

TOP: 7

95. The necessary conditions having been met, a two-tail test is being conducted at test

. The two sample variances are

sizes are

PTS: 1 REF: 441-442 | 725-732 BLM: Higher Order - Apply

96. When comparing two population variances, we use the ratio

ANS: T BLM: Remember

, and the sample

. The calculated value of the test statistic will be F = 2.

ANS: F TOP: 7

difference

= 0.05 to

rather than the

. PTS:

REF: 441-442

TOP: 7

97. The F test used for testing the difference in two population variances is always a one-tailed test. ANS: F BLM: Remember

PTS:

REF: 441-442

TOP: 7

98. The test for the equality of two population variances assumes that each of the two populations is normally distributed. ANS: T TOP: 7

PTS: 1 BLM: Remember

REF: 439 | 441-442

99. The F distribution is symmetric. ANS: F BLM: Remember

PTS:

REF: 439

TOP: 7

100. For s1²/s2² to have an F distribution, random samples must be drawn from each of two normal populations and be independent. ANS: T BLM: Remember

PTS:

REF: 439

TOP: 7

101. For s1²/s2² to have an F distribution, the variability of the measurements in the two populations must be the same and can be measured by a common variance, σ². ANS: T BLM: Remember

PTS:

REF: 439-440

TOP: 7

102.

For an F distribution test statistic, the number of degrees of freedom associated with its denominator must be equal to the number of degrees of freedom associated with its numerator. ANS: F PTS: 1 BLM: Higher Order - Understand

104.

TOP: 7

For an F distribution test statistic, the number of degrees of freedom associated with its denominator must be larger than the number of degrees of freedom associated with its numerator. ANS: F PTS: 1 BLM: Higher Order - Understand

103.

REF: 439

REF: 439-440

TOP: 7

For an F distribution test statistic, the number of degrees of freedom associated with its denominator must be smaller than the number of degrees of freedom associated with its numerator. ANS: F

PTS:

REF: 439-440

TOP: 7

BLM: Higher Order - Understand 105.

For an F distribution test statistic, the number of degrees of freedom associated with its denominator can be larger, smaller, or equal to the number of degrees of freedom associated with its numerator. ANS: T PTS: 1 BLM: Higher Order - Understand

106.

In testing vs. numerator of the F test statistic. ANS: F BLM: Remember

107.

PTS:

TOP: 7

REF: 441-442

TOP: 7

REF: 440

TOP: 7

REF: 439

TOP: 7

The exact shape of the F distribution is determined by two numbers of degrees of freedom. ANS: T BLM: Remember

111.

REF: 441-442

Variables that are F distributed range from 0 to ANS: T PTS: 1 BLM: Higher Order - Understand

110.

the larger sample variance must be used as the

In testing vs. you are free to decide which of the two populations you want to call “Population 1.” ANS: T BLM: Remember

109.

TOP: 7

In testing vs. the level of significance needs to be doubled before finding the critical value in the F table. ANS: F PTS: 1 BLM: Higher Order - Understand

108.

REF: 439-440

PTS:

REF: 439-440

TOP: 7

The sampling distribution of the ratio of two sample variances / is said to be F distributed provided that the samples are dependent and their sizes are large. ANS: F BLM: Remember

PTS:

REF: 439

TOP: 7

Chapter 10B—Inference From Small Samples PROBLEM 1. Given a random variable that has a t distribution with the specified degrees of freedom, in each of the following cases what percentage of the time will its value fall in the indicated region? a. 15 degrees of freedom, between –2.131 and 2.131 b. 19 degrees of freedom, between –2.539 and 2.539 c. 23 degrees of freedom, between –1.319 and 1.319 d. 10 degrees of freedom, between –3.169 and 3.169 ANS: a. 95% b. 98% c. 80% d. 99% PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

TOP: 1–3

2. What is the appropriate t critical value for each of the following confidence levels and sample sizes when testing the two-sided alternative hypothesis? a. 80% confidence, n = 17 b. 90% confidence, n = 7 c. 99% confidence, n = 4 d. 95% confidence, n = 14 ANS: a. 1.337 b. 1.943 c. 5.841 d. 2.16 PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

TOP: 1–3

3. Let μ denote the true average number of minutes of a television commercial. Suppose the hypotheses are tested. Assuming the commercial time is normally distributed, give the appropriate rejection region for each of the following sample sizes and significance levels. a. n = 6, α = 0.01 b. n = 12, α = 0.05 c. n = 20, α = 0.05 d. n = 23, α = 0.1 ANS: a. | t | > 4.032 b. | t | > 2.201 c. | t | > 2.093 d. | t | > 1.717

PTS: 1 REF: 404 | 722 BLM: Higher Order - Apply

TOP: 1–3

Average Fuel Consumption The average fuel consumption of a 4-wheel drive truck is 12.9 L/100 km. The average fuel consumption for seven randomly selected trucks is 13.5, 13.0, 12.6, 12.2, 12.8, 12.9, and 13.1. Assume the fuel consumption distribution is normal. The researcher wishes to know if the sample data suggest that the average fuel consumption is different from 12.9 L/100 km. 4. Please refer to the Average Fuel Consumption paragraph. State the appropriate hypotheses. ANS: H0: μ = 12.9 vs. Ha: μ ≠ 12.9

PTS: 1 REF: 404 BLM: Higher Order - Analyze

TOP: 1–3

5. Please refer to the Average Fuel Consumption paragraph. Compute the test statistic for the hypotheses in the previous question. ANS: The sample mean is x̄ = 12.871, and the sample standard deviation is s = 0.487 L/100 km. Hence, the test statistic is t = (x̄ – μ0)/(s/√n) = (12.871 – 12.9)/(0.487/√7) = –0.158. PTS: 1 REF: 404 BLM: Higher Order - Apply

TOP: 1–3

6. Please refer to the Average Fuel Consumption paragraph. Compute the approximate p-value associated with the test statistic in the previous question. Do the sample data support the null hypothesis at the α = 0.05 level? Justify your conclusion. ANS: p-value = 2P(t < –0.158) ≈ 2(0.5 – 0.1) = 0.80. Yes, the sample data support the null hypothesis at the α = 0.05 level, since p-value > α and H0 is not rejected.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

Vertical Blinds Installation Narrative A drapery store manager was interested in determining whether a new employee can install vertical blinds faster than an employee who has been with the company for two years. The manager takes independent samples of ten vertical blind installations of each of the two employees and computes the following information.

                     New Employee    Veteran Employee
Sample Size          10              10
Sample Mean          22.2 min        24.8 min
Standard Deviation   0.90 min        0.75 min

7. Refer to Vertical Blinds Installation Narrative. State the appropriate null and alternative hypotheses to test whether the new employee installs vertical blinds faster, on the average, than the veteran employee. ANS: Let the new employee be population 1, the veteran employee be population 2, and μi be the true average time it takes employee i to install vertical blinds, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 1–3

8. Refer to Vertical Blinds Installation Narrative. Calculate the pooled estimate of the common variance. ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (7.29 + 5.0625)/18 = 0.68625. PTS: 1 REF: 414-415 BLM: Higher Order - Apply

TOP: 1–3

9. Refer to Vertical Blinds Installation Narrative. Calculate the value of the test statistic. ANS: The test statistic is t = (x̄1 – x̄2)/√[s²(1/n1 + 1/n2)] = (22.2 – 24.8)/0.3705 = –7.018. PTS: 1 REF: 414-415 BLM: Higher Order - Apply

TOP: 1–3

10. Refer to Vertical Blinds Installation Narrative. Set up the appropriate rejection region for the hypotheses above and assume α = 0.05.

ANS: Reject H0 if t < –t0.05 = –1.734.

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 1–3

11. Refer to Vertical Blinds Installation Narrative. What is the appropriate conclusion? Give reasons for your answer. ANS: Since t = –7.018 < –1.734, reject H0 and conclude that the new employee installs vertical blinds faster, on the average, than the veteran employee. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 1–3
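The calculations in items 8 through 11 can be reproduced from the summary statistics in the narrative. This is a minimal sketch; the variable names are illustrative, and the critical value 1.734 is taken from the answer to item 10 and is not computed here.

```python
import math

# Summary statistics from the Vertical Blinds Installation Narrative.
n1, mean1, s1 = 10, 22.2, 0.90   # new employee (population 1)
n2, mean2, s2 = 10, 24.8, 0.75   # veteran employee (population 2)

# Item 8: pooled estimate of the common variance.
pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

# Item 9: standard error of the difference and the t statistic.
std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / std_error

# Items 10 and 11: left-tailed rejection region with n1 + n2 - 2 = 18 df.
df = n1 + n2 - 2
t_crit = 1.734                   # table value for upper-tail area 0.05, 18 df
reject_h0 = t_stat < -t_crit
```

The statistic of roughly –7.018 lies far inside the rejection region, matching the conclusion in item 11.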

12. Refer to Vertical Blinds Installation Narrative. Is it reasonable to assume equality of variances in this problem? Justify your answer. ANS: Yes, since (larger s²)/(smaller s²) = 0.81/0.5625 = 1.44 < 3. PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 7

13. Refer to Vertical Blinds Installation Narrative. Use α = 0.05 to test the hypothesis that the two population variances are equal.

ANS: The hypotheses to be tested are H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². The observed value of the test statistic is F = s1²/s2² = 1.44. The rejection region is F > F0.025 = 4.03. Since F = 1.44, we fail to reject H0, and we conclude that the population variances are equal.

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Evaluate

TOP: 7
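Items 12 and 13 both rest on the ratio of the two sample variances. The sketch below recomputes that ratio from the narrative's standard deviations; the critical value 4.03 is the table value quoted in the answer to item 13, and the variable names are illustrative.

```python
# Sample standard deviations from the Vertical Blinds Installation Narrative.
s_new, s_veteran = 0.90, 0.75

var_new = s_new ** 2          # 0.81
var_veteran = s_veteran ** 2  # 0.5625

# Put the larger sample variance in the numerator.
f_stat = max(var_new, var_veteran) / min(var_new, var_veteran)

# Item 12: rule of thumb for assuming equal variances.
rule_of_thumb_ok = f_stat < 3

# Item 13: two-tailed test at the 0.05 level uses upper-tail area 0.025 (9 and 9 df).
f_crit = 4.03
reject_equal_variances = f_stat > f_crit
```

Both checks agree: the ratio of 1.44 is below 3 and below the critical value, so equality of variances is not rejected.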

14. A logger knows the average time for his cutting machine to cut 20 trees is 9.8 minutes. A new machine on the market claims to cut the trees in less than 9.8 minutes. A random sample of 25 test runs on the new machine yielded a mean of 8.5 minutes with a standard deviation of 1.5. Do the sample data suggest the new machine cuts faster than the logger’s machine? Test at the α = 0.05 level. Assume the cutting time is normally distributed and interpret your results. ANS: H0: μ = 9.8 vs. Ha: μ < 9.8. The test statistic is t = (8.5 – 9.8)/(1.5/√25) = –4.33. Since t < –1.711, reject the null hypothesis. The sample data support the claim that the new machine cuts faster. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

Earthquake Analysis Narrative The length of duration, in minutes, of earthquakes in British Columbia has been recorded for future analysis and information. The length of duration of a random sample of six earthquakes is as follows: 1.1, 0.9, 1.5, 0.7, 1.4, and 1.3. 15. Refer to Earthquake Analysis Narrative. Assuming the distribution of the length of duration of the earthquakes is approximately normal, find a 98% confidence interval for the true average duration of earthquakes in British Columbia. ANS: The sample mean and standard deviation are x̄ = 1.15 and s = 0.308. The 98% confidence interval is x̄ ± t0.01 s/√n = 1.15 ± (3.365)(0.308)/√6 = 1.15 ± 0.423, or 0.727 < μ < 1.573.

PTS: 1 REF: 404 | 722 BLM: Higher Order - Analyze

TOP: 1–3

16. Refer to Earthquake Analysis Narrative. Interpret the interval in the previous question. ANS: One can estimate with 98% confidence that the true average duration of earthquakes in British Columbia is between 0.727 and 1.573 minutes. PTS: 1 REF: 404-406 BLM: Higher Order - Understand

TOP: 1–3

17. Refer to Earthquake Analysis Narrative. An earthquake expert claims that the average duration of earthquakes in British Columbia is 0.5 minutes. Based on the interval calculated above, can this claim be rejected? Justify your answer. ANS: Since 0.5 is outside the limits of the 98% confidence interval, one can reject the expert’s claim that the true average duration of earthquakes in British Columbia is 0.5 minutes. PTS: 1 REF: 404-406 BLM: Higher Order - Evaluate

TOP: 1–3
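The interval in items 15 through 17 can be recomputed directly from the six durations. This is a sketch using only the standard library; the critical value 3.365 is the table value used in the answer to item 15 and is entered by hand.

```python
import math
from statistics import mean, stdev

# Durations (minutes) from the Earthquake Analysis Narrative.
durations = [1.1, 0.9, 1.5, 0.7, 1.4, 1.3]

n = len(durations)
x_bar = mean(durations)
s = stdev(durations)               # sample standard deviation (divides by n - 1)

t_crit = 3.365                     # table value: upper-tail area 0.01, 5 df
margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

# Item 17: the expert's claimed mean of 0.5 minutes is rejected if it is outside the interval.
claim_rejected = not (lower < 0.5 < upper)
```

The interval is roughly 0.727 to 1.573 minutes, and 0.5 falls below the lower limit.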

Average Battery Life Narrative The average life of a certain type and brand of battery is 75 weeks. The average life of each of nine randomly selected batteries is as follows: 74.5, 75.0, 72.3, 76.0, 75.2, 75.1, 75.3, 74.9, and 74.8. Assume the battery life distribution is normal. Do the sample data suggest the average life is smaller than 75 weeks? 18. Refer to Average Battery Life Narrative. State the appropriate hypotheses. ANS: H0: μ = 75 vs. Ha: μ < 75

PTS: 1 REF: 404 BLM: Higher Order - Analyze

TOP: 1–3

19. Refer to Average Battery Life Narrative. Compute the test statistic for the hypotheses in the previous question. ANS: The sample mean and standard deviation are x̄ = 74.789 and s = 1.020, respectively. Hence, the test statistic is t = (74.789 – 75.0)/(1.020/√9) = –0.62.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

TOP: 1–3

20. Refer to Average Battery Life Narrative. Compute the approximate p-value associated with the test statistic in the previous question. Do the sample data support the alternative hypothesis at the α = 0.05 level? Justify your conclusion. ANS: p-value = P(t < –0.62) = P(t > 0.62) > 0.10. No; the sample data do not support the alternative hypothesis at the 0.05 level, since p-value > α and H0 is not rejected. PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

Manufacturing Garment Average Narrative A garment manufacturing company recorded the amount of time that it took to make a pair of jeans on eight different occasions. The times in minutes are as follows: 12.5, 13.0, 11.9, 10.2, 13.1, 13.6, 13.8, and 14.0. Assume these measurements were taken from a population with a normal distribution. Do the sample data suggest that the average time it takes this company to make a pair of jeans is less than 13.5 minutes? 21. Refer to Manufacturing Garment Average Narrative. State the appropriate hypotheses. ANS: H0: μ = 13.5 vs. Ha: μ < 13.5

PTS: 1 REF: 404 BLM: Higher Order - Analyze

TOP: 1–3

22. Refer to Manufacturing Garment Average Narrative. Compute the test statistic for the hypotheses in the previous question. ANS: The sample mean and standard deviation are x̄ = 12.7625 and s = 1.2455, respectively. Hence, the test statistic is t = (12.7625 – 13.5)/(1.2455/√8) = –1.675.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

TOP: 1–3

23. Refer to Manufacturing Garment Average Narrative. Do the sample data support the alternative hypothesis at the α = 0.05 level? Justify your conclusion. ANS: No. Since t = –1.675 > –1.895, do not reject the null hypothesis. The sample data do not support the claim that the average time it takes this company to make a pair of jeans is less than 13.5 minutes. PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

24. Refer to Manufacturing Garment Average Narrative. Construct a 95% confidence interval for the mean amount of time it takes this company to make a pair of jeans. ANS: 12.7625 ± (2.365)(1.2455)/√8 = 12.7625 ± 1.0414, or 11.7211 < μ < 13.8039. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

TOP: 1–3
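Items 22 through 24 use the same sample for a left-tailed test and a confidence interval. The sketch below recomputes both from the eight recorded times; the critical values 1.895 and 2.365 are the table values quoted in the answers, and the variable names are illustrative.

```python
import math
from statistics import mean, stdev

# Times (minutes) from the Manufacturing Garment Average Narrative.
times = [12.5, 13.0, 11.9, 10.2, 13.1, 13.6, 13.8, 14.0]
mu0 = 13.5                          # hypothesized mean under H0

n = len(times)
x_bar = mean(times)
s = stdev(times)
std_error = s / math.sqrt(n)

# Item 22: one-sample t statistic.
t_stat = (x_bar - mu0) / std_error

# Item 23: left-tailed test at the 0.05 level with 7 df.
reject_h0 = t_stat < -1.895

# Item 24: 95% confidence interval with 7 df.
margin = 2.365 * std_error
lower, upper = x_bar - margin, x_bar + margin
```

The statistic of about –1.675 does not fall below –1.895, so the null hypothesis is not rejected, and the interval runs from about 11.72 to 13.80 minutes.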

Temperature Average Narrative The average low temperature for Victoria, B.C. in September is 12°C. The average low temperature for each of eight randomly selected years is 11.0, 12.4, 11.8, 10.9, 11.4, 12.2, 10.8, and 12.2. Assume the September low temperature distribution is normal. Do the sample data suggest the average low temperature is lower than 12°C? 25. Refer to Temperature Average Narrative. State the appropriate hypotheses. ANS: H0: μ = 12 vs. Ha: μ < 12

PTS: 1 REF: 404 BLM: Higher Order - Analyze

TOP: 1–3

26. Refer to Temperature Average Narrative. Compute the test statistic for the hypotheses in the previous question. ANS: The sample mean and standard deviation are x̄ = 11.5875 and s = 0.6468, respectively. Hence, the test statistic is t = (11.5875 – 12.0)/(0.6468/√8) = –1.8038.

PTS: 1 REF: 404-406 BLM: Higher Order - Apply

TOP: 1–3

27. Refer to Temperature Average Narrative. Compute the approximate p-value associated with the test statistic in the previous question. Do the sample data support the null hypothesis at the α = 0.1 level? Justify your conclusion. ANS: p-value = P(t < –1.80) = P(t > 1.80) ≈ 0.05. No; the sample data do not support the null hypothesis at the α = 0.1 level, since p-value < α and H0 is rejected.

PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

Cigarette Tar Content Narrative Ten measurements of the tar content of a certain brand of cigarette are 13.5, 14.0, 13.9, 14.2, 15.1, 14.6, 13.8, 14.0, 14.1, and 14.7 in milligrams per cigarette. Assume these measurements were taken from a population with a normal distribution. 28. Refer to Cigarette Tar Content Narrative. Construct a 90% confidence interval for the mean tar content of any cigarette of this brand. ANS: The sample mean and standard deviation are x̄ = 14.19 and s = 0.477. The 90% confidence interval is x̄ ± t0.05 s/√n = 14.19 ± (1.833)(0.477)/√10 = 14.19 ± 0.276, or 13.914 < μ < 14.466.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

29. Refer to Cigarette Tar Content Narrative. Interpret the interval in the previous question. ANS: One can estimate with 90% confidence that the mean tar content of any cigarette of this brand is roughly between 13.9 and 14.5 milligrams.

PTS: 1 REF: 404-406 BLM: Higher Order - Understand

TOP: 1–3

30. One study revealed that a child under the age of 10 watches television 4.5 hours per day. A group of families from a certain community would like to believe that their children watch less television than the national average. A random sample of 14 children from the community yielded a mean of 4.1 hours per day with a standard deviation of 1.2. Test the appropriate hypotheses at the α = 0.01 level. Assume the viewing time is normally distributed and interpret your results. ANS: H0: μ = 4.5 vs. Ha: μ < 4.5. The test statistic is t = (4.1 – 4.5)/(1.2/√14) = –1.247. Since t > –2.65, do not reject the null hypothesis. The sample data do not support the claim that these children watch less television than the national average. PTS: 1 REF: 404-406 BLM: Higher Order - Evaluate

TOP: 1–3

Laptop Battery Average Narrative The manufacturer of a particular battery pack for laptop computers claims its battery pack can function for 8 hours, on average, before having to be recharged. A random sample of 16 battery packs was selected and tested. The mean functioning time before having to be recharged was 7.2 hours with a standard deviation of 1.9 hours. 31. Refer to Laptop Battery Average Narrative. Assuming the distribution of functioning times is approximately normal, find a 95% confidence interval for the true average functioning time before needing to be recharged. ANS: x̄ ± t0.025 s/√n = 7.2 ± (2.131)(1.9)/√16 = 7.2 ± 1.0122, or 6.1878 < μ < 8.2122.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

32. Refer to Laptop Battery Average Narrative. Interpret the interval in the previous question. ANS: One can estimate with 95% confidence that the true average time the battery pack will function before needing to be recharged is between 6.1878 and 8.2122 hours. PTS: 1 REF: 404-406 BLM: Higher Order - Understand

TOP: 1–3

33. Refer to Laptop Battery Average Narrative. Based on the interval calculated above, can the manufacturer’s claim be rejected? Justify your answer.

ANS: Since 8 is within the limits of the confidence interval, the claim cannot be rejected. PTS: 1 REF: 404-410 BLM: Higher Order - Evaluate

TOP: 1–3

Childcare Costs Narrative The public relations officer for a particular city claims the average monthly cost for childcare outside the home for a single child is $600. A potential resident is interested in whether the claim is correct. She obtains a random sample of 14 records and computes the average monthly cost of this type of childcare to be $589 with a standard deviation of $40. 34. Refer to Childcare Costs Narrative. Perform the appropriate test of hypothesis for the potential resident using α = 0.01. ANS: Let μ be the true average monthly cost for childcare outside the home for a single child. The hypotheses to be tested are H0: μ = 600 vs. Ha: μ ≠ 600. The test statistic is t = (589 – 600)/(40/√14) = –1.029. Reject H0 if | t | > t0.005 = 3.012. Since | t | < 3.012, do not reject H0. One cannot conclude that the true average monthly cost for childcare outside the home for a single child is significantly different from $600 (i.e., one cannot reject the public relations officer’s claim). PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

35. Refer to Childcare Costs Narrative. Approximate the p-value for the test in the previous question. ANS: p-value = P(| t | ≥ 1.029) = 2P(t ≥ 1.029) > 2(0.10) = 0.20. PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Analyze

TOP: 1–3
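The table-based bound in item 35 (p-value > 0.20) can be refined by integrating the Student's t density numerically. The sketch below is one possible way to do this with only the standard library, using Simpson's rule; the function names and the number of integration steps are illustrative choices, not part of the test bank.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(x, df, steps=1000):
    """P(T > x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Childcare Costs Narrative: n = 14, so 13 degrees of freedom.
t_stat = (589 - 600) / (40 / math.sqrt(14))
p_value = 2 * t_upper_tail(abs(t_stat), 13)   # two-tailed p-value
```

The resulting p-value is a little above 0.3, consistent with the answer's bound of more than 0.20 and far larger than the 0.01 significance level.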

Motorcycle Fuel Consumption Narrative A Harley Davidson dealer wants to know the average fuel consumption (in litres per 100 km) of a 1992 XLT. A random sample of 17 was taken from a normally distributed population and produced a mean of 4.5 L/100 km and a standard deviation of 0.36 L/100 km. 36. Refer to Motorcycle Fuel Consumption Narrative. Construct a 95 percent confidence interval for the mean fuel consumption of any 1992 Harley Davidson XLT. ANS: x̄ ± t0.025 s/√n = 4.5 ± (2.120)(0.36)/√17 = 4.5 ± 0.185, or 4.315 < μ < 4.685.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

37. Refer to Motorcycle Fuel Consumption Narrative. Interpret the interval in the previous question. ANS: One can estimate with 95% confidence that the true average fuel consumption of a Harley Davidson 1992 XLT is roughly between 4.315 and 4.685 L/100 km. PTS: 1 REF: 404-406 BLM: Higher Order - Understand

TOP: 1–3

38. Refer to Motorcycle Fuel Consumption Narrative. The dealer claims that the average fuel consumption of a Harley Davidson 1992 XLT is 4.2 L/100 km. At a 95% level of confidence, can this claim be rejected? Justify your answer. ANS: Since 4.2 is outside the limits of the 95% confidence interval, one can reject the dealer’s claim. PTS: 1 REF: 404-406 BLM: Higher Order - Evaluate

TOP: 1–3

Coffee Vending Machines Narrative An automatic coffee vending machine dispenses a different amount of coffee in millilitres (mL) for each cup. Assume the following nine measurements were taken from a population with a normal distribution: 185, 170, 196, 176, 173, 187, 193, 170 and 173 mL.

39. Refer to Coffee Vending Machines Narrative. Construct an 80% confidence interval for the mean amount of coffee that is dispensed for all cups of coffee from this machine.

ANS: The sample mean and standard deviation are x̄ = 180.3 and s = 10.075 mL. The 80% confidence interval is x̄ ± t_0.10 s/√n = 180.3 ± (1.397)(10.075)/√9 = 180.3 ± 4.692, or 175.61 < μ < 184.99 mL.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3
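As a rough check on the Coffee Vending Machines interval, the sample statistics and margin of error can be recomputed from the nine raw measurements. The sketch below uses only Python's standard library; the table value 1.397 (t_0.10 with 8 degrees of freedom) is copied from the answer key rather than computed, and the variable names are illustrative.

```python
import math
import statistics

# The nine measurements (mL) from the Coffee Vending Machines Narrative.
coffee_ml = [185, 170, 196, 176, 173, 187, 193, 170, 173]

n = len(coffee_ml)
xbar = statistics.mean(coffee_ml)
s = statistics.stdev(coffee_ml)  # sample standard deviation (n - 1 divisor)

# t_0.10 with n - 1 = 8 df, taken from the answer key (table lookup).
t_crit = 1.397

margin = t_crit * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin
print(f"xbar = {xbar:.1f}, s = {s:.3f}, margin = {margin:.3f}")
print(f"80% CI: ({lower:.2f}, {upper:.2f})")
```

The script keeps the unrounded mean, so its endpoints can differ in the second decimal place from the hand calculation, which rounds x̄ to 180.3 first.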

40. Refer to Coffee Vending Machines Narrative. Interpret the interval in the previous question. ANS: One can estimate with 80% confidence that the mean amount of coffee that is dispensed for all cups of coffee from this machine is roughly between 175.61 and 184.99 mL.

PTS: 1 REF: 404-406 BLM: Higher Order - Understand

TOP: 1–3

Test Scores Narrative The test scores on a 100-point test were recorded for 20 students: 73, 95, 93, 83, 77, 75, 83, 84, 78, 59, 86, 91, 69, 64, 74, 79, 70, 67, 77, and 86.

41. Refer to Test Scores Narrative. Can you reasonably assume that these test scores have been selected from a normal population? Use a stem and leaf plot to justify your answer.

ANS: The stem and leaf plot is shown below. Notice the mounded shape of the data, which justifies the normality assumption.

Character Stem-and-Leaf Display
Stem-and-Leaf of scores N = 20
Leaf Unit = 1.0
  1   5  9
  2   6  4
  4   6  79
  7   7  034
 (5)  7  57789
  8   8  334
  5   8  66
  3   9  13
  1   9  5

PTS: 1 REF: 23-25 BLM: Higher Order - Evaluate

TOP: 1–3

42. Refer to Test Scores Narrative. Calculate the mean and standard deviation of the scores. ANS: x̄ = 78.15, and s² = 93.2921, hence s = 9.6588. PTS: 1 REF: 57 | 65-66 BLM: Higher Order - Apply

TOP: 1–3

43. Refer to Test Scores Narrative. If these students can be considered a random sample from the population of all students, find a 95% confidence interval for the average test score in the population.

ANS: x̄ ± t_0.025 s/√n = 78.15 ± (2.093)(9.6588)/√20 = 78.15 ± 4.5204, or 73.6296 < μ < 82.6704. One can estimate with 95% confidence that the average test score in the population is roughly between 73.6 and 82.7. Intervals constructed using this procedure will enclose μ 95% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses μ. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

TOP: 1–3

Interest Rates Narrative The following 10 observations are interest rates on unpaid balances on credit cards for a department store: 10.4, 10.1, 9.5, 10.5, 10.6, 9.3, 9.9, 10.7, 9.5, and 10.0.

44. Refer to Interest Rates Narrative. Find the mean and standard deviation of these data.

ANS: x̄ = 10.05, and s² = 0.249444, hence s = 0.4994.

PTS: 1 REF: 57 | 65-66 BLM: Higher Order - Apply

TOP: 1–3

45. Refer to Interest Rates Narrative. Calculate the test statistic t, specify the rejection region and then test the hypothesis H0: μ = 10.5 vs. Ha: μ < 10.5. Use α = 0.01.

ANS: The test statistic is t = (x̄ – μ0)/(s/√n) = (10.05 – 10.5)/(0.4994/√10) = –2.8495. The rejection region with α = 0.01 and 9 degrees of freedom is located in the lower tail of the t distribution and is found as t < –t_0.01 = –2.821. Since the observed value of the test statistic falls in the rejection region, H0 is rejected and we conclude that μ is less than 10.5.

PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

46. Refer to Interest Rates Narrative. Find a 99% confidence interval for the population mean μ, and explain how to use it for testing H0: μ = 10.5 vs. Ha: μ ≠ 10.5 using α = 0.01.

ANS: x̄ ± t_0.005 s/√n = 10.05 ± (3.250)(0.4994)/√10 = 10.05 ± 0.5133, or 9.5367 < μ < 10.5633.

One can estimate with 99% confidence that the population mean is roughly between 9.54 and 10.56. Intervals constructed using this procedure will enclose μ 99% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses μ. Notice that the 99% confidence interval for μ does indeed include the value μ = 10.5. However, the one-tailed test of hypothesis allows us to reject the hypothesis that μ = 10.5 with α = 0.01. These seemingly contradictory results are explained by the fact that the test of hypothesis is one-tailed, while the confidence interval is two-sided. If the test had been two-tailed, or if you had used a one-sided confidence bound, you would have had identical conclusions. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

47. Here are the red blood cell counts (in millions of cells per microlitre) of a healthy person measured on each of 15 days: 5.6, 5.4, 5.2, 5.4, 5.7, 5.5, 5.6, 5.4, 5.3, 5.5, 5.5, 5.1, 5.6, 5.4, and 5.4. Find a 95% confidence interval estimate of the true mean red blood cell count for this person during the period of testing.

ANS: x̄ = 5.44, and s² = 0.02543, hence s = 0.15946.

x̄ ± t_0.025 s/√n = 5.44 ± (2.145)(0.15946)/√15 = 5.44 ± 0.0883, or 5.3517 < μ < 5.5283. One can estimate with 95% confidence that the true mean red blood cell count is roughly between 5.35 and 5.53. Intervals constructed using this procedure will enclose the true mean 95% of the time in repeated sampling. Hence, we are fairly certain that this particular interval encloses the true mean. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

Disinfectant Experiments Narrative An experiment to determine the efficacy of using 95% ethanol or 20% bleach as a disinfectant in removing bacterial and fungal contamination when culturing plant tissues was repeated 15 times for each disinfectant. The plant tissue being cultured was eggplant: five cuttings per plant were placed on a petri dish for each disinfectant and stored at 25°C for four weeks. The observation reported was the number of uncontaminated eggplant cuttings after the four-week storage.

Disinfectant   Sample Size   Sample Mean   Sample Variance
95% Ethanol    15            3.85          2.75
20% Bleach     15            4.92          0.18

48. Refer to Disinfectant Experiments Narrative. Is it reasonable to assume that the underlying variances are equal? Justify your conclusion. ANS: Check the ratio of the two variances using the rule of thumb given in the text: larger s²/smaller s² = 2.75/0.18 = 15.2778, which is greater than three. Therefore, it is not reasonable to assume that the two population variances are equal. PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 1–3

49. Refer to Disinfectant Experiments Narrative. Using the information from the previous question, are you willing to conclude that there is a significant difference in the mean numbers of uncontaminated eggplant cuttings for the two disinfectants tested?

ANS: You should use the unpooled t test with Satterthwaite’s approximation to the degrees of freedom for testing H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. The test statistic is t = (x̄1 – x̄2)/√(s1²/n1 + s2²/n2) = (3.85 – 4.92)/√(2.75/15 + 0.18/15) = –2.421, with degrees of freedom approximately equal to 15.825. Take the integer part of this result and use df = 15. With df = 15, the p-value for this two-tailed test is bounded between 0.02 and 0.05, so that H0 can be rejected at the 5% level of significance. There is evidence of a difference in the mean number of uncontaminated eggplant cuttings for the two disinfectants. PTS: 1 REF: 414-415 | 418-419 | 722 BLM: Higher Order - Evaluate

TOP: 1–3
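The unpooled test for the Disinfectant Experiments data can be sketched in a few lines. The function name `welch_t` is hypothetical, and the degrees-of-freedom expression is the standard Satterthwaite approximation, assumed here to be the one the text refers to.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Unpooled two-sample t statistic and Satterthwaite degrees of freedom."""
    v1, v2 = var1 / n1, var2 / n2  # estimated variance of each sample mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics from the Disinfectant Experiments table
# (population 1 = 95% ethanol, population 2 = 20% bleach).
t_stat, df = welch_t(3.85, 2.75, 15, 4.92, 0.18, 15)

# The answer key takes the integer part of the approximate df.
df_used = math.floor(df)
print(f"t = {t_stat:.3f}, df used = {df_used}")
```

Truncating the degrees of freedom is the conservative choice, since fewer degrees of freedom give a larger critical value.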

50. The following data were drawn from a normal population: 15, 4, 24, 8, 16, 13, 9, 15, 7, and 22. Estimate the population mean with 90% confidence. ANS: x̄ ± t_0.05 s/√n = 13.3 ± 1.833(6.464/√10) = 13.3 ± 3.747. Thus, LCL = 9.553 and UCL = 17.047.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

51. A random sample of seven observations was drawn from a normal population. The following summations were computed: and . Test the hypothesis H0: μ = 8 vs. Ha: μ > 8 at the 1% significance level.

ANS: Rejection region: t > t_0.01 = 3.143 (6 degrees of freedom). Test statistic: t = 3.403. Conclusion: Reject H0. We can infer that the population mean is larger than 8.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

Grocery Receipts Narrative A simple random sample of 100 grocery receipts was drawn from a normal population. The mean and standard deviation of the sample were $120 and $25, respectively.

52. Refer to Grocery Receipts Narrative. Test the hypothesis H0: μ = 125 vs. Ha: μ ≠ 125 at the 10% significance level.

ANS: Rejection region: | t | > t_0.05 = 1.660 (99 degrees of freedom). Test statistic: t = (120 – 125)/(25/√100) = –2.0. Conclusion: Reject H0. We can infer that the population mean is not equal to 125.

PTS: 1 REF: 404-410 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

53. Refer to Grocery Receipts Narrative. Estimate the population mean with 90% confidence. ANS: x̄ ± t_0.05 s/√n = 120 ± 1.660(25/√100) = 120 ± 4.15. Thus, LCL = 115.85 and UCL = 124.15. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Apply

TOP: 1–3

54. Refer to Grocery Receipts Narrative. Explain how to use the confidence interval to test the hypotheses at α = 0.10. ANS: Since the hypothesized value μ = 125 does not lie in the 90% confidence interval, we reject H0 at α = 0.10.

PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Understand

TOP: 1–3

Hourly Wages Narrative A random sample of 15 hourly wages for waitresses (including tips) was drawn from a normal population. The sample mean and sample standard deviation were computed as x̄ = $14.9 and s = $6.75.

55. Refer to Hourly Wages Narrative. Can we infer at the 5% significance level that the population mean is greater than 12? Justify your conclusion.

ANS: H0: μ = 12 vs. Ha: μ > 12. Rejection region: t > t_0.05 = 1.761 (14 degrees of freedom). Test statistic: t = (14.9 – 12)/(6.75/√15) = 1.664.

Conclusion: Don’t reject H0. No, we can’t infer at the 5% significance level that the population mean is greater than 12. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

56. Refer to Hourly Wages Narrative. Can we infer at the 5% significance level that the population mean is greater than 12, assuming that you know the population standard deviation is equal to 6.75? Give reasons for your answer.

ANS: Rejection region: z > z_0.05 = 1.645. Test statistic: z = 1.664. Conclusion: Reject H0. Yes, we can infer that the population mean is greater than 12 because the test statistic falls within the rejection region for α = 0.05. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Evaluate

TOP: 1–3

57. A psychologist is trying to determine how many hours the average person sleeps each night. He takes a random sample of 25 individuals and asks each person how many hours he or she slept the previous night. The sum of the observations and the sum of the squared observations are Σx = 193.5 and Σx² = 1531.7. Estimate with 99% confidence the mean number of hours of sleep.

ANS: x̄ ± t_0.005 s/√n = 7.74 ± 2.797(1.190/√25) = 7.74 ± 0.666. Thus, LCL = 7.074 and UCL = 8.406. We estimate that the mean number of hours of sleep lies between 7.074 hours and 8.406 hours. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

58. During a water shortage, a water company randomly sampled residential water meters in order to monitor daily water consumption. On a particular day, a sample of 25 meters showed a sample mean of 750 litres and a sample standard deviation of 150 litres. Provide a 90% confidence interval estimate of the mean water consumption for the population. ANS: x̄ ± t_0.05 s/√n = 750 ± 1.711(150/√25) = 750 ± 51.33. Thus, LCL = 698.67 and UCL = 801.33 L. We estimate that the mean water consumption for the population lies between 698.67 litres and 801.33 litres. PTS: 1 REF: 404-406 | 722 BLM: Higher Order - Analyze

TOP: 1–3

59. Assume that the population distributions of ages (in years) of students at two different universities in Ontario are normal with equal variances. Two random samples, drawn independently from the populations, showed the following statistics: n1 = 10, x̄1 = 25, s1² = 4; n2 = 9, x̄2 = 24, and s2² = 9. Construct and interpret a 99% confidence interval for the true difference in average ages of students at each university.

ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (36 + 72)/17 = 6.3529. Then, the 99% confidence interval is (x̄1 – x̄2) ± t_0.005 √(s²(1/n1 + 1/n2)) = (25 – 24) ± 2.898(1.1581) = 1 ± 3.356, or –2.356 < μ1 – μ2 < 4.356. Since this interval does contain 0, the sample evidence supports that the two universities’ students do not, on average, have significantly different ages. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

Laptop Battery Charge Time Narrative A computer laboratory manager was in charge of purchasing new battery packs for her lab of laptop computers. She narrowed her choices to two models that were available for her machines. Since the models cost about the same, she was interested in determining whether there was a difference in the average time the battery packs would function before needing to be recharged. She took two independent random samples and computed the following summary information:

                       Sample Size   Sample Mean   Standard Deviation
Battery Pack Model 1   9             5 hours       1.5 hours
Battery Pack Model 2   9             5.5 hours     1.3 hours

60. Refer to Laptop Battery Charge Time Narrative. Perform the appropriate test of hypotheses to determine whether there is a significant difference in average functioning time before recharging between the two models of battery packs. Test using α = 0.05.

ANS: Let Model 1 be population 1 and Model 2 be population 2, and let μi be the true average functioning time before a battery needs to be recharged for model i, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0.

The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (18 + 13.52)/16 = 1.97. Hence, the test statistic is t = (x̄1 – x̄2)/√(s²(1/n1 + 1/n2)) = (5 – 5.5)/0.6616 = –0.7557.

Reject H0 if | t | > t_0.025 = 2.12 (16 degrees of freedom). Since | t | < 2.12, do not reject H0. Thus one cannot conclude that there is a significant difference between the two models of battery packs in the average functioning time before recharging. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4
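The pooled-variance calculation for the Laptop Battery Charge Time data can be reproduced from the summary table. This is a minimal sketch with a made-up helper name (`pooled_t`); the critical value 2.12 comes from the answer key, not from the code.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (pooled variance, t statistic, df) for the equal-variance t test."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))  # standard error of the difference
    return sp2, (mean1 - mean2) / se, df


# Summary statistics for battery pack models 1 and 2.
sp2, t_stat, df = pooled_t(5.0, 1.5, 9, 5.5, 1.3, 9)

t_crit = 2.12  # t_0.025 with 16 df, taken from the answer key
reject_h0 = abs(t_stat) > t_crit
print(f"pooled variance = {sp2:.2f}, t = {t_stat:.4f}, df = {df}")
print("reject H0" if reject_h0 else "do not reject H0")
```

The same helper applies to any of the equal-variance problems in this section that give sample sizes, means and standard deviations.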

61. Refer to Laptop Battery Charge Time Narrative. Is it reasonable to assume equality of variances in this problem? Justify your answer. ANS: Yes, since larger s²/smaller s² = 2.25/1.69 = 1.3314 < 3.

PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 7

62. Refer to Laptop Battery Charge Time Narrative. Use α = 0.05 to test the hypothesis that the two population variances are equal.

ANS: The hypotheses to be tested are H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². The observed value of the test statistic is F = s1²/s2² = 2.25/1.69 = 1.3314. The rejection region is F > F_0.025 = 4.43 (df1 = 8, df2 = 8). Since F = 1.3314, we fail to reject H0; the data give no reason to doubt that the population variances are equal.

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Evaluate

TOP: 7
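Both the rule-of-thumb check and the F statistic for the battery packs rest on the same ratio of sample variances, with the larger variance in the numerator. A small sketch follows; the helper name `variance_ratio` is illustrative, and the critical value 4.43 is the table value quoted for this problem.

```python
def variance_ratio(sd_a, sd_b):
    """Return the larger sample variance divided by the smaller one."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)


# Standard deviations for battery pack models 1 and 2.
f_stat = variance_ratio(1.5, 1.3)

# Rule of thumb used in this chapter: a ratio below 3 supports pooling.
pooling_reasonable = f_stat < 3

# Formal two-tailed F test at alpha = 0.05 with the table value from the text.
f_crit = 4.43
reject_equal_variances = f_stat > f_crit
print(f"F = {f_stat:.4f}, pooling reasonable: {pooling_reasonable}, "
      f"reject equality: {reject_equal_variances}")
```

Putting the larger variance on top means only the upper tail of the F distribution needs to be consulted, which is why the two-tailed test at α = 0.05 uses the upper 0.025 point.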

63. Set up the rejection regions for the following testing conditions. Assume the assumptions of normality and equal variances are satisfied.
a. two-tailed test, n1 = 10, n2 = 12, and α = 0.05.
b. upper-tailed test, n1 = 4, n2 = 8, and α = 0.01.
c. lower-tailed test, n1 = 15, n2 = 15, and α = 0.05.

ANS: a. | t | > 2.086 b. t > 2.764 c. t < –1.701 PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Analyze

TOP: 4

Average Telephone Time on Hold Narrative A customer service representative was interested in comparing the average time (in minutes) customers are placed on hold when calling Gaz Metropolitain and Hydro-Quebec, both in Quebec. The representative obtained two independent random samples and calculated the following summary information:

                    Sample Size   Sample Mean   Sample Standard Deviation
Gaz Metropolitain   9             3.2 min       0.5 min
Hydro-Quebec        12            2.8 min       0.7 min

Assume the distributions of time a customer is on hold are approximately normal.

64. Refer to Average Telephone Time on Hold Narrative. State the appropriate null and alternative hypotheses to test whether there is a significant difference between the two companies in average time a customer is on hold. ANS: Let Gaz Metropolitain be population 1, Hydro-Quebec be population 2, and μi be the true average time a customer is on hold for company i, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

65. Refer to Average Telephone Time on Hold Narrative. Calculate the value of the test statistic.

ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (2.0 + 5.39)/19 = 0.3889. The test statistic is t = (x̄1 – x̄2)/√(s²(1/n1 + 1/n2)) = (3.2 – 2.8)/0.275 = 1.455. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

66. Refer to Average Telephone Time on Hold Narrative. Set up the appropriate rejection region for the hypotheses above, assuming α = 0.10. ANS: Reject H0 if | t | > t_0.05 = 1.729 (19 degrees of freedom).

PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Analyze

TOP: 4

67. Refer to Average Telephone Time on Hold Narrative. What is the appropriate conclusion? Justify your answer. ANS: Since | t | < 1.729, do not reject H0. Thus one cannot conclude that there is a significant difference between the two companies in mean time a customer is on hold. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

68. Refer to Average Telephone Time on Hold Narrative. Is it reasonable to assume equality of variances in this problem? Justify your answer. ANS: Yes, since larger s²/smaller s² = 0.49/0.25 = 1.96 < 3.

PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 7

69. Refer to Average Telephone Time on Hold Narrative. Use α = 0.10 to test the hypotheses that the two population variances are equal.

ANS: The hypotheses to be tested are H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². The observed value of the test statistic is F = larger s²/smaller s² = 0.49/0.25 = 1.96. The rejection region is F > F_0.05 with df1 = 11 and df2 = 8. Since F = 1.96 does not exceed this critical value, we fail to reject H0; the data give no reason to doubt that the population variances are equal.

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Evaluate

TOP: 7

70. Assume that the population distributions of times (in minutes) for two different skiers to race the same course are normal with equal variances. Two random samples, drawn independently from the populations, showed the following statistics: n1 = 4, x̄1 = 7.52, s1² = 0.25; n2 = 5, x̄2 = 8.37, and s2² = 0.09. Construct and interpret a 95% confidence interval for the true difference in average time of skiers to race the same course.

ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (0.75 + 0.36)/7 = 0.15857. Then, the 95% confidence interval is (x̄1 – x̄2) ± t_0.025 √(s²(1/n1 + 1/n2)) = (7.52 – 8.37) ± 2.365(0.2671) = –0.85 ± 0.632, or –1.482 < μ1 – μ2 < –0.218. Since this interval does not contain 0, the sample evidence supports that the two skiers, on average, have different race times. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

71. A child psychologist was interested in the difference in age (in years) between a boy and girl when they first learn to ride a two-wheeled bicycle. The psychologist developed a 99% confidence interval for the difference in average ages to be (–0.58, 0.73). What conclusion, if any, can be drawn from this interval? Justify your answer. ANS: Since 0 is included in the interval, one cannot conclude there is a difference in age, on average, at which boys and girls first learn to ride a two-wheeled bicycle. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

72. In an effort to raise ratings, a television network president decides to reduce the number of commercials. A random sample of eight one-hour programs was monitored from each of two major networks. The data below reflect the time in minutes of commercials for each of the 16 shows. Suppose μ1 denotes the mean commercial time for network 1 and μ2 denotes the mean commercial time for network 2. Estimate μ1 – μ2 using a 95% confidence interval. Assume both population distributions are normal and have equal variances.

Network 1   Network 2
15.2        17.1
14.3        18.3
16.7        14.7
17.8        13.9
13.9        14.6
15.0        15.4
16.3        16.0
14.7        16.3

ANS: Summary statistics:
Network 1: n1 = 8, x̄1 = 15.4875, s1² = 1.764
Network 2: n2 = 8, x̄2 = 15.7875, s2² = 2.0927

The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (12.348 + 14.6489)/14 = 1.9284. Then, the 95% confidence interval is (x̄1 – x̄2) ± t_0.025 √(s²(1/n1 + 1/n2)) = (15.4875 – 15.7875) ± 2.145(0.6943) = –0.3 ± 1.489, or –1.789 < μ1 – μ2 < 1.189. Since this interval contains 0, we are unable to say the two networks have significantly different commercial times. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4
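For the network commercial times, the whole interval can be rebuilt from the raw data. The sketch below uses Python's `statistics` module for the sample means and variances; the table value 2.145 (t_0.025 with 14 degrees of freedom) is copied from the answer key, and the variable names are illustrative.

```python
import math
import statistics

# Commercial minutes per one-hour program, from the table above.
network1 = [15.2, 14.3, 16.7, 17.8, 13.9, 15.0, 16.3, 14.7]
network2 = [17.1, 18.3, 14.7, 13.9, 14.6, 15.4, 16.0, 16.3]

n1, n2 = len(network1), len(network2)
mean1, mean2 = statistics.mean(network1), statistics.mean(network2)
var1, var2 = statistics.variance(network1), statistics.variance(network2)

# Pooled variance: weight each sample variance by its degrees of freedom.
sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))

t_crit = 2.145  # t_0.025 with 14 df, taken from the answer key
diff = mean1 - mean2
lower, upper = diff - t_crit * se, diff + t_crit * se
print(f"pooled variance = {sp2:.4f}")
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

An interval that straddles zero, as this one does, corresponds to failing to reject H0: μ1 – μ2 = 0 in a two-tailed test at α = 0.05.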

73. The playing times (in hours) for five different co-ed volleyball games for two different teams are listed below. Is there sufficient evidence to conclude that the mean playing times for the two teams differ? Justify your answer. Assume the population distributions are normal and σ1² = σ2². Use a 0.05 significance level.

Team A   2.8   3.1   2.7   1.9   6.7
Team B   1.9   3.8   3.6   2.5   3.1

ANS: Summary statistics:
Team A: n1 = 5, x̄1 = 3.44, s1² = 3.518
Team B: n2 = 5, x̄2 = 2.98, s2² = 0.617

The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (14.072 + 2.468)/8 = 2.0675.

Hence, the test statistic is t = (x̄1 – x̄2)/√(s²(1/n1 + 1/n2)) = (3.44 – 2.98)/0.9094 = 0.5058.

Since | t | < t_0.025 = 2.306, do not reject H0. One cannot conclude that there is sufficient evidence to claim the two teams have different playing times. PTS: REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

74. Assume that the population distributions of times (in hours) of two different surgeries are normal with equal variances. Two random samples, drawn independently from the populations, showed the following statistics.

n1 = 10, x̄1 = 2.5, s1² = 0.04
n2 = 11, x̄2 = 2.6, s2² = 0.09

Construct and interpret a 90% confidence interval for the true difference in mean amount of time of the two surgeries.

ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (0.36 + 0.90)/19 = 0.0663. Then, the 90% confidence interval is (x̄1 – x̄2) ± t_0.05 √(s²(1/n1 + 1/n2)) = (2.5 – 2.6) ± 1.729(0.1125) = –0.10 ± 0.1945, or –0.2945 < μ1 – μ2 < 0.0945. Since this interval contains 0, the sample evidence supports that the two surgeries, on average, do not take significantly different amounts of time. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

75. Assume that the population distributions of life expectancy (in years) of men and women are normal with equal variances. Two random samples, drawn independently from the populations, showed the following statistics.

Men: n1 = 10, x̄1 = 76, s1² = 1
Women: n2 = 10, x̄2 = 83, s2² = 4

Construct and interpret a 99% confidence interval for the true difference in average life expectancy of men and women.

ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (9 + 36)/18 = 2.5. Then, the 99% confidence interval is (x̄1 – x̄2) ± t_0.005 √(s²(1/n1 + 1/n2)) = (76 – 83) ± 2.878(0.7071) = –7 ± 2.035, or –9.035 < μ1 – μ2 < –4.965. Since this interval does not contain 0, the sample evidence supports that men and women have, on average, different life expectancies.

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

Studying Time Narrative A faculty advisor was interested in determining whether there is a difference between male and female students in the amount of time (in hours) spent studying on weeknights (Monday through Thursday). The advisor selected a random sample of 12 female students and a second random, but independent, sample of 10 male students and asked each student to indicate the average amount of time spent studying on a weeknight. The following summary statistics are obtained.

         Sample Size   Sample Mean   Sample Standard Deviation
Female   12            3.267         0.749
Male     10            3.390         0.837

76. Refer to Studying Time Narrative. State the null and alternative hypotheses for the advisor. ANS: Let females be population 1, males be population 2, and μi be the true average amount of time spent studying on weeknights for gender i, i = 1 (Female) and 2 (Male). The null and alternative hypotheses to be tested are as follows: H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

77. Refer to Studying Time Narrative. Perform the appropriate test of hypothesis to determine whether there is a significant difference between male and female students in average time spent studying on weeknights. Use α = 0.05.

ANS: The pooled estimate of the common variance is s² = (6.171 + 6.305)/20 = 0.6238. The test statistic is t = (3.267 – 3.39)/0.3382 = –0.3637. Reject H0 if | t | > t_0.025 = 2.086 (20 degrees of freedom). Since | t | < 2.086, do not reject H0. One cannot conclude that there is a significant difference between male and female students in average time spent studying on weeknights. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

78. Refer to Studying Time Narrative. Approximate the p-value for the test in the above question. ANS: p-value = P(| t | > 0.3637) = 2P(t > 0.3637) > 2(0.10) = 0.20. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4
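Bounding a p-value from a printed t-table, as in the Studying Time answer, amounts to finding the tabled critical values on either side of the observed | t |. The sketch below assumes the usual one-tail table row for 20 degrees of freedom (1.325, 1.725, 2.086, 2.528, 2.845); the function name and table layout are illustrative only.

```python
# One-tail areas and critical values for 20 degrees of freedom
# (assumed here to match the standard printed t-table row).
T_TABLE_DF20 = [(0.10, 1.325), (0.05, 1.725), (0.025, 2.086),
                (0.01, 2.528), (0.005, 2.845)]


def bracket_two_tailed_p(t_abs, table):
    """Return (low, high) bounds on the two-tailed p-value for |t| = t_abs."""
    low, high = 0.0, 1.0
    for tail_area, crit in table:
        if t_abs < crit:
            # |t| is smaller than this critical value: p exceeds 2 * tail area.
            low = max(low, 2 * tail_area)
        else:
            # |t| reaches this critical value: p is at most 2 * tail area.
            high = min(high, 2 * tail_area)
    return low, high


low, high = bracket_two_tailed_p(0.3637, T_TABLE_DF20)
print(f"{low} < p-value <= {high}")
```

Here the observed | t | is below every tabled value, so the table can only say that the p-value exceeds 0.20.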

79. Refer to Studying Time Narrative. Using the p-value approach and α = 0.10, what conclusion can be drawn about the difference between male and female students in average time spent studying on weeknights? ANS: Since the p-value > 0.10, the results are not significant; that is, one cannot conclude there is a significant difference between male and female students in average time spent studying on weeknights. PTS: 1 REF: 414-415 BLM: Higher Order - Evaluate

TOP: 4

80. Refer to Studying Time Narrative. Develop a 95% confidence interval for the average amount of time spent studying on weeknights by females. ANS: x̄ ± t_0.025 s/√n = 3.267 ± 2.201(0.749)/√12 = 3.267 ± 0.476 = (2.791, 3.743)

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Apply

TOP: 4

81. Refer to Studying Time Narrative. The advisor assumed equal variances in the analysis. Is this a reasonable assumption? Justify your answer. ANS: Yes, since larger s²/smaller s² = 0.837²/0.749² = 1.2488 < 3.

PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 7

82. Refer to Studying Time Narrative. Use α = 0.05 to test the hypothesis that the two population variances are equal.

ANS: The hypotheses to be tested are H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². The observed value of the test statistic is F = larger s²/smaller s² = 0.837²/0.749² = 1.2488. The rejection region is F > F_0.025 = 3.59 (df1 = 9, df2 = 11). Since F = 1.2488, we fail to reject H0; the data give no reason to doubt that the population variances are equal.

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Evaluate

TOP: 7

Snowmobile Speeds Narrative A customer was interested in comparing the top speed (in kilometres per hour) of two models of snowmobiles. The customer selected two independent random samples of the snowmobiles and calculated the following summary information:

          Sample Size   Sample Mean   Sample Standard Deviation
Model A   8             90            3
Model B   9             84            5

Assume the distribution of top speeds is approximately normal.

83. Refer to Snowmobile Speeds Narrative. State the appropriate null and alternative hypotheses to test whether there is a significant difference between the two models of snowmobiles in average top speed. ANS: Let Model A be population 1, Model B be population 2, and μi be the true average top speed of snowmobile model i, i = 1, 2. The hypotheses to be tested are H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0.

PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

84. Refer to Snowmobile Speeds Narrative. Calculate the value of the test statistic. ANS: The pooled estimate of the common variance is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = (63 + 200)/15 = 17.5333. The test statistic is t = (x̄1 – x̄2)/√(s²(1/n1 + 1/n2)) = (90 – 84)/2.0347 = 2.949. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

85. Refer to Snowmobile Speeds Narrative. Set up the appropriate rejection region for the hypotheses above assuming α = 0.05. ANS: Reject H0 if | t | > t_0.025 = 2.131 (15 degrees of freedom).

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

86. Refer to Snowmobile Speeds Narrative. What is the appropriate conclusion? Be sure to justify your answer. ANS: Since | t | = 2.949 > 2.131, reject H0 and conclude that there is a significant difference between the two models of snowmobiles in average top speed. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

87. Refer to Snowmobile Speeds Narrative. Is it reasonable to assume equality of variances in this problem? Justify your answer. ANS: Yes, since larger s²/smaller s² = 25/9 = 2.78 < 3.

PTS: 1 REF: 418 BLM: Higher Order - Evaluate

TOP: 7

88. Refer to Snowmobile Speeds Narrative. Use α = 0.05 to test the hypothesis that the two population variances are equal.

ANS: The hypotheses to be tested are H0: σ1² = σ2² vs. Ha: σ1² ≠ σ2². The observed value of the test statistic is F = larger s²/smaller s² = 25/9 = 2.778. The rejection region is F > F_0.025 = 4.90 (df1 = 8, df2 = 7). Since F = 2.778, we fail to reject H0; the data give no reason to doubt that the population variances are equal.

PTS: 1 REF: 441-444 | 725-732 BLM: Higher Order - Evaluate

TOP: 7

Quiz Scores Narrative Two independent random samples of sizes n1 = 4 and n2 = 5 are selected from two normal populations. The data below represent the scores in a 20-point quiz.

Population 1   Population 2
14             16
5              9
10             9
7              11
               8

89. Refer to Quiz Scores Narrative. Calculate s², the pooled estimator of σ².

ANS: The pooled estimator of σ² is s² = [(n1 – 1)s1² + (n2 – 1)s2²]/(n1 + n2 – 2) = 12.457.

PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

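90. Refer to Quiz Scores Narrative. Find a 90% confidence interval for (μ1 – μ2), the difference between the two population means.

ANS: A 90% confidence interval for (μ1 – μ2) is given as (x̄1 – x̄2) ± t_0.05 √(s²(1/n1 + 1/n2)) = (9 – 10.6) ± 1.895(2.3676) = –1.6 ± 4.487, or –6.087 < μ1 – μ2 < 2.887. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze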

TOP: 4

91. Refer to Quiz Scores Narrative. Test H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0 for α = 0.05. State and justify your conclusions.

ANS: The hypothesis to be tested is H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 < 0. The test statistic, under the assumption that σ1² = σ2², is calculated using the pooled value of s² in the t-statistic shown below:

t = (x̄1 – x̄2)/√(s²(1/n1 + 1/n2)) = (9 – 10.6)/√(12.457(1/4 + 1/5)) = –0.6758.

The rejection region is one-tailed, based on n1 + n2 – 2 = 7 degrees of freedom. With α = 0.05, the rejection region is t < –t_0.05 = –1.895. Since the observed value, t = –0.6758, does not fall in the rejection region, H0 is not rejected. We do not have sufficient evidence to indicate that μ1 – μ2 < 0.

PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

Employees Test Scores Narrative A random sample of 35 employees who completed two years of college were asked to take a basic mathematics test. The mean and standard deviation of their scores were 75.1 and 12.8, respectively. In a random sample of 50 employees who completed only high school, the mean and standard deviation of the test scores were 72.1 and 14.6, respectively.

92. Refer to Employees Test Scores Narrative. Can we infer at the 10% significance level that a difference exists between the two groups?

ANS: H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 ≠ 0. Rejection region: | t | > 1.664. Test statistic: t = 0.98. Conclusion: Don’t reject the null hypothesis. No, we can’t infer at the 10% significance level that a difference exists between the two groups. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

93. Refer to Employees Test Scores Narrative. Estimate with 90% confidence the difference in mean scores between the two groups of employees. ANS: 3.0 ± 5.094. Thus, LCL = –2.094 and UCL = 8.094. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

94. Refer to Employees Test Scores Narrative. Explain how to use the interval estimate to test the hypotheses. ANS: Since the hypothesized value 0 is included in the 90% confidence interval, we fail to reject the null hypothesis at α = 0.10. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Understand

TOP: 4

Preservatives Narrative A food processing plant wants to compare two preservatives for their effects on retarding spoilage. Suppose 16 cuts of fresh meat are treated with preservative A and 16 are treated with preservative B. The number of hours until spoilage begins is recorded for each of the 32 cuts of meat. The results are summarized in the table below.

                 Sample Mean   Sample Standard Deviation
Preservative A   108.7 hours   10.5 hours
Preservative B   98.7 hours    13.6 hours

95. Refer to Preservatives Narrative. State the null and alternative hypotheses to determine if the average number of hours until spoilage begins differs for the preservatives A and B. ANS: H0: μA – μB = 0 vs. Ha: μA – μB ≠ 0. PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

96. Refer to Preservatives Narrative. Assuming that the population variances are equal, which test would likely be most appropriate to employ to test the equality of the population means? ANS: We would use the equal variances t test. PTS: REF: 414-415 BLM: Remember

TOP: 4

97. Refer to Preservatives Narrative. Calculate the pooled variance and the value of the test statistic. ANS: s² = 147.605, t = 2.33 PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Analyze

TOP: 4

98. Refer to Preservatives Narrative. Determine the rejection region at α = 0.05 and write the proper conclusion.

ANS: t > 2.042 or t < –2.042. Since t = 2.33 > 2.042, we reject the null hypothesis at α = 0.05, and we conclude that the average number of hours until spoilage begins differs for preservatives A and B. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4
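The pooled-variance calculation for the Preservatives Narrative can be checked with a few lines of Python. This is a minimal sketch that works from the summary statistics in the table; the function name `pooled_t_from_summary` is illustrative and not from the textbook.

```python
import math

def pooled_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variances (pooled) two-sample t statistic from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two sample means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return pooled_var, t, df

# Preservative A: mean 108.7, sd 10.5, n 16; Preservative B: mean 98.7, sd 13.6, n 16.
pooled_var, t, df = pooled_t_from_summary(108.7, 10.5, 16, 98.7, 13.6, 16)
print(round(pooled_var, 3), round(t, 2), df)
```

With 30 degrees of freedom, the resulting statistic is then compared with the two-tailed table value 2.042 used in the answer key.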

Coffee Breaks Narrative Do government employees take longer coffee breaks than private sector workers? That is a question that interested a management consultant. To examine the issue, he took a random sample of ten government employees and another random sample of ten private sector workers and measured the amount of time (in minutes) they spent in coffee breaks during the day. The results are listed below.

Government Employees 23 18 34 31 28 33 25 27 32 21

Private-Sector Workers 25 19 18 22 28 25 21 21 20 16

99. Refer to Coffee Breaks Narrative. Do these data provide sufficient evidence at the 5% significance level to support the consultant’s question? Justify your conclusion. ANS: H0: μ1 – μ2 = 0 vs. Ha: μ1 – μ2 > 0, where population 1 is government employees and population 2 is private-sector workers. Rejection region: t > 1.734 Test statistic: t = 2.766 Conclusion: Reject the null hypothesis. Yes, these data provide sufficient evidence at the 5% significance level that government employees take longer coffee breaks, on average, than private-sector workers. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

100. Refer to Coffee Breaks Narrative. Estimate with 95% confidence the difference between the two groups in coffee break mean time. ANS: 5.7 ± 4.329. Thus, LCL = 1.371 and UCL = 10.029. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4
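The Coffee Breaks answers can be reproduced from the raw data. The sketch below assumes the equal-variances t procedure and takes the critical value 2.101 (t with 18 degrees of freedom, upper-tail area 0.025) from a t table rather than computing it; the variable and function names are illustrative.

```python
import math
import statistics

government = [23, 18, 34, 31, 28, 33, 25, 27, 32, 21]
private = [25, 19, 18, 22, 28, 25, 21, 21, 20, 16]

def pooled_t(sample1, sample2):
    """Return (difference in means, standard error, t statistic) under equal variances."""
    n1, n2 = len(sample1), len(sample2)
    diff = statistics.mean(sample1) - statistics.mean(sample2)
    pooled_var = ((n1 - 1) * statistics.variance(sample1)
                  + (n2 - 1) * statistics.variance(sample2)) / (n1 + n2 - 2)
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return diff, se, diff / se

diff, se, t_stat = pooled_t(government, private)

T_CRIT_025_DF18 = 2.101  # table value (assumed), df = 10 + 10 - 2 = 18
margin = T_CRIT_025_DF18 * se
print(round(diff, 1), round(t_stat, 3), round(margin, 3))
print(round(diff - margin, 3), round(diff + margin, 3))
```

The margin of error comes out at about 4.329, which is consistent with the stated limits LCL = 1.371 and UCL = 10.029.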

101. Refer to Coffee Breaks Narrative. Explain what the interval estimate tells you. ANS: We estimate that government employees, on average, take between 1.371 and 10.029 minutes longer for coffee breaks than private-sector workers do. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Understand

TOP: 4

Children Narrative Two random samples of 3-year-old children, each of size 25, are taken from independent populations. The populations are distributed with equal variances. The first sample has a mean of 35.5 and a standard deviation of 3.0 while the second sample has a mean 33.0 and standard deviation of 4.0. A test for the difference between the two population means is conducted on this data.

102. Refer to Children Narrative. What is the pooled variance? ANS: 12.5 PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

103. Refer to Children Narrative. Compute the t statistic of this test. ANS: 2.50 PTS: 1 REF: 414-415 | 722 BLM: Higher Order

TOP: 4

104. Refer to Children Narrative. How many degrees of freedom does this test have? ANS: 48 PTS: 1 REF: 414-415 BLM: Higher Order - Analyze

TOP: 4

105. Refer to Children Narrative. Between which two values does the p-value for a one-tailed test whose computed statistic is 2.50 (in the hypothesized direction) lie? ANS: 0.005 and 0.01 PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Analyze

TOP: 4

106. Refer to Children Narrative. If we were interested in testing against the one-tailed alternative that μ1 > μ2 at the α level of significance, should the null hypothesis be rejected or not be rejected?

ANS: The null hypothesis should be rejected. PTS: 1 REF: 414-417 | 722 BLM: Higher Order - Evaluate

TOP: 4

107. Given a random variable that has a t distribution with the specified degrees of freedom, what percentage of the time will its value fall into the indicated region? a. 15 degrees of freedom, between –2.131 and 2.131 b. 19 degrees of freedom, between –2.539 and 2.539 c. 23 degrees of freedom, between –1.319 and 1.319 d. 10 degrees of freedom, between –3.169 and 3.169 ANS: a. 95% b. 98% c. 80% d. 99% PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

TOP: 5

108. What is the appropriate t critical value for each of the following confidence levels and sample sizes when testing the two-sided alternative hypothesis? a. 80% confidence, n = 17 b. 90% confidence, n = 7 c. 99% confidence, n = 4 d. 95% confidence, n = 14 ANS: a. 1.337 b. 1.943 c. 5.841 d. 2.16 PTS: 1 REF: 401-403 | 722 BLM: Higher Order - Apply

TOP: 5
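Table lookups such as those in question 107 can be cross-checked numerically. The following is a rough sketch using only the standard library: it integrates the Student's t density over (–c, c) with composite Simpson's rule. The function names are illustrative, and the result is an approximation, not an exact table value.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def central_probability(c, df, steps=2000):
    """Approximate P(-c < T < c) with composite Simpson's rule (steps must be even)."""
    h = 2 * c / steps
    total = t_pdf(-c, df) + t_pdf(c, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2  # odd interior points get 4, even get 2
        total += weight * t_pdf(-c + i * h, df)
    return total * h / 3

# (degrees of freedom, cutoff) pairs from question 107
for df, c in [(15, 2.131), (19, 2.539), (23, 1.319), (10, 3.169)]:
    print(df, c, round(100 * central_probability(c, df), 1))
```

The printed percentages should land very close to the 95%, 98%, 80% and 99% given in the answer, since the cutoffs are three-decimal table values.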

109. Let μ denote the true average number of minutes of a television commercial. Suppose two-sided hypotheses about μ are tested. Assuming the commercial time is normally distributed, give the appropriate rejection region for each of the following sample sizes and significance levels. a. n = 6, α = 0.01 b. n = 12, α = 0.05 c. n = 20, α = 0.05 d. n = 23, α = 0.1 ANS: a. | t | > 4.032 b. | t | > 2.201 c. | t | > 2.093 d. | t | > 1.717 PTS:

REF: 404 | 722

TOP: 5

BLM: Higher Order - Analyze

Price Differences Narrative A consumer was interested in determining whether there is a significant difference in the price charged for tools by two hardware stores. The consumer selected five tools and recorded the price for each tool in each store. The following data were recorded:

Tool   Store 1   Store 2
1      $32.00    $30.00
2      $3.95     $2.95
3      $1.50     $1.50
4      $2.95     $2.45
5      $4.00     $5.00

110. Refer to Price Differences Narrative. Are the samples independent? Justify your answer. ANS: No, since the pair of measurements for each tool are related. PTS: 1 REF: 425-427 BLM: Higher Order - Analyze

TOP: 5

111. Refer to Price Differences Narrative. Perform the appropriate test of hypothesis to determine whether there is a significant difference, on average, in the price of tools between the two stores. Use α = 0.05. ANS: Let Store 1 be population 1, Store 2 be population 2, and μ_d be the population mean of the differences (mean of Store 1 price – mean of Store 2 price). The hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d ≠ 0. Differences: d = (Store 1 price – Store 2 price) are given: d = 2, 1, 0, 0.5, –1. The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = 2.5/5 = 0.5, and s_d = 1.118. Hence, the test statistic is: t = d̄/(s_d/√n) = 0.5/0.5 = 1.0. Reject H0 if | t | > t_0.025 = 2.776. Since | t | < 2.776, do not reject H0. One cannot conclude there is a significant difference in the average price of tools between the two stores. PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5
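The paired-difference test for the Price Differences Narrative can be sketched in Python as follows. The function name `paired_t` is illustrative, and the critical value 2.776 (4 degrees of freedom, upper-tail area 0.025) is taken from the answer key rather than computed.

```python
import math
import statistics

store1 = [32.00, 3.95, 1.50, 2.95, 4.00]
store2 = [30.00, 2.95, 1.50, 2.45, 5.00]

def paired_t(x, y):
    """Paired-difference t statistic: mean difference over its standard error."""
    d = [a - b for a, b in zip(x, y)]   # one difference per tool
    n = len(d)
    d_bar = statistics.mean(d)
    s_d = statistics.stdev(d)           # sample standard deviation (divisor n - 1)
    t = d_bar / (s_d / math.sqrt(n))
    return d_bar, s_d, t, n - 1

d_bar, s_d, t_stat, df = paired_t(store1, store2)

T_CRIT = 2.776  # table value for df = 4, two-tailed alpha = 0.05
reject = abs(t_stat) > T_CRIT
print(round(d_bar, 3), round(s_d, 3), round(t_stat, 3), df, reject)
```

Because the same five tools are priced in both stores, the analysis is done on the five differences rather than on two independent samples of five.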

112. In a study to compare average snowfall in two different Canadian cities in December, measurements were taken in each of the cities for ten randomly selected years. Snowfalls, in centimetres, for the two cities are listed below. Assume the two population distributions are normal. Use the data to determine if there is a significant difference in average snowfall in the two cities. Use a significance level of α = 0.05.

Year     1942  1948  1954  1959  1967  1970  1975  1981  1983  1990
City A   45    0     4     21    9     1     30    17    53    8
City B   40    10    2     20    7     9     33    17    50    10

ANS: Let City A be population 1, City B be population 2, and μ_d be the true average difference in snowfall in the two cities. The hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d ≠ 0. Differences: d = (City A snowfall – City B snowfall) are d = 5, –10, 2, 1, 2, –8, –3, 0, 3, –2. The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = –10/10 = –1.0, and s_d = 4.830. Hence, the test statistic is t = d̄/(s_d/√n) = –1.0/1.5275 = –0.655. Reject H0 if | t | > t_0.025 = 2.262. Since | t | < 2.262, do not reject H0. The sample data do not provide sufficient evidence of a significant difference in the snowfalls.

PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5

Contact Lenses Narrative Two different brands of contact lenses are to be compared for length, in hours, of comfortable wear. The lenses are available in any prescription.

113. Refer to Contact Lenses Narrative. How might an experiment resulting in paired data be carried out? ANS: Each brand of lens is tested on each of n people, each person wearing each type of lens for a predetermined length of time.

PTS: 1 REF: 425-427 BLM: Higher Order - Analyze

TOP: 5

114. Refer to Contact Lenses Narrative. How might an independent samples experiment be carried out? ANS: Each brand of lens is tested on n randomly selected people, each person wearing only one brand of lens. PTS: 1 REF: 413-414 BLM: Higher Order - Analyze

TOP: 5

115. A scientist is testing two different types of Secchi discs. This is an instrument used for determining water clarity. The scientist takes a depth reading (in metres below the surface) with each disc at eight different locations on a lake. The results of the eight different locations for each Secchi disc are listed below. Assume the two population distributions are normal. Determine if there is a significant difference in average depth reading for the two discs. Use a significance level of α = 0.01.

Location   1    2    3    4    5    6    7    8
Secchi A   25   10   7    12   19   4    30   8
Secchi B   24   8    7    10   18   4    25   9

ANS: Let Secchi A be population 1, Secchi B be population 2, and μ_d be the true average difference in depth reading for the two discs. The hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d ≠ 0. Differences: d = (Secchi A reading – Secchi B reading) are given for the eight locations: d = 1, 2, 0, 2, 1, 0, 5, –1. The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = 10/8 = 1.25, and s_d = 1.832. Hence, the test statistic is t = d̄/(s_d/√n) = 1.25/0.6477 = 1.9299. Reject H0 if | t | > t_0.005 = 3.499. Since | t | < 3.499, do not reject H0. The sample evidence does not show that the two discs give different measurements.

PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5

116. A researcher believes she has designed a keyboard that is more efficient to use than a standard keyboard. In order to help decide if this is the case, typing times were taken for eight different people on each keyboard, new and standard original. The lengths of time, in minutes, for each of the people to type a preselected manuscript are listed below. Assume the two population distributions are normal. Use the data to determine if the original keyboard yields slower times. Use a significance level of α = 0.05.

Person     1    2    3    4    5    6    7    8
Original   15   9    17   10   9    4    30   29
New        12   8    15   8    5    4    25   21

ANS: Let Original keyboard be population 1, New keyboard be population 2, and μ_d be the true average difference in typing time for the two keyboards. The hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d > 0. Differences: d = (Time on original keyboard – Time on new keyboard) are d = 3, 1, 2, 2, 4, 0, 5, and 8. The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = 25/8 = 3.125, and s_d = 2.532. Hence, the test statistic is t = d̄/(s_d/√n) = 3.125/0.8952 = 3.491. Reject H0 if t > t_0.05 = 1.895. Since t > 1.895, reject H0, and conclude that the original keyboard yields slower times.

PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5

Bottling Productivity Narrative Five soft drink bottling companies have agreed to implement a time management program in hopes of increasing productivity (measured in cases of soft drinks bottled per hour). The number of cases of soft drinks bottled per hour before and after the implementation of the program are listed below:

Company   1     2     3     4     5
Before    500   475   525   490   530
After     510   480   525   495   533

117. Refer to Bottling Productivity Narrative. State the appropriate null and alternative hypotheses to test whether the time management has been effective in increasing productivity. ANS: Let Before be population 1, After be population 2, and μ_d be the true average difference in the number of cases of bottles produced per hour before and after the management program. Then, the hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d < 0.

PTS: 1 REF: 426 BLM: Higher Order - Analyze

TOP: 5

118. Refer to Bottling Productivity Narrative. Calculate the value of the test statistic. ANS:

Company   Before   After   d = B – A
1         500      510     –10
2         475      480     –5
3         525      525     0
4         490      495     –5
5         530      533     –3

The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = –23/5 = –4.6, and s_d = 3.6469. Hence, the test statistic is t = d̄/(s_d/√n) = –4.6/1.631 = –2.82.

PTS: 1 REF: 426-428 BLM: Higher Order - Analyze

TOP: 5

119. Refer to Bottling Productivity Narrative. Set up the appropriate rejection region for the hypotheses above, assuming α = 0.05. ANS: Reject H0 if t < –t_0.05 = –2.132.

PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Analyze

TOP: 5

120. Refer to Bottling Productivity Narrative. What is the appropriate conclusion? Justify your answer. ANS: Since t = –2.82 < –2.132, reject H0 and conclude that the time management program has been effective in increasing production. PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5

121. Refer to Bottling Productivity Narrative. Find the approximate p-value. ANS: 0.01 < p-value < 0.025. PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Analyze

TOP: 5

Running Shoes Narrative A new runner has decided to purchase a new pair of running shoes. He has narrowed his choices to two brands, each of which would be appropriate for his use. His concern is whether there is a significant difference in the average wear between the two brands of shoes. He enlists a random sample of six veteran runners to test the shoes. Each runner wore each brand of shoe until it wore out. The following data were recorded, representing the number of weeks each runner used each pair of shoes:

Runner   Brand A   Brand B
1        10        9
2        6         7
3        14        12
4        12        14
5        10        11
6        13        11

122. Refer to Running Shoes Narrative. The new runner used a paired-difference t test for the analysis. Is this method appropriate? Justify your answer. ANS: Yes, since the samples are related. The pair of measurements for each jogger are definitely related. PTS: 1 REF: 425-427 BLM: Higher Order - Evaluate

TOP: 5

123. Refer to Running Shoes Narrative. Perform the appropriate test of hypothesis to determine whether there is a significant difference in the average wear between the two brands of shoes. Use the 5% level of significance. ANS: Let Brand A be population 1, Brand B be population 2, and μ_d be the population mean difference in wear between the two brands. The hypotheses to be tested are H0: μ_d = 0 vs. Ha: μ_d ≠ 0. Differences: d = (Brand A wear – Brand B wear) are d = 1, –1, 2, –2, –1, and 2. The sample mean and sample standard deviation of the differences, d, are calculated as follows: d̄ = 1/6 = 0.1667, and s_d = 1.7224. The test statistic is t = d̄/(s_d/√n) = 0.1667/0.7032 = 0.237. Reject H0 if | t | > t_0.025 = 2.571. Since | t | < 2.571, do not reject H0. One cannot conclude that there is a significant difference in average wear length between the two brands of shoes. PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Evaluate

TOP: 5

124. Refer to Running Shoes Narrative. Find a 95% confidence interval for the difference in average wear length between the two brands of shoes. Based on this interval, can one conclude there is a significant difference in average wear length between the two brands of shoes? Justify your answer. ANS: The 95% confidence interval is d̄ ± t_0.025(s_d/√n) = 0.1667 ± 2.571(1.7224)/√6 = 0.1667 ± 1.808, or (–1.6413, 1.9747). Since 0 is within the limits of the interval, one cannot conclude there is a significant difference in average wear length between the two brands of shoes. PTS: 1 REF: 426-428 | 722 BLM: Higher Order - Analyze

TOP: 5
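The paired-difference confidence interval for the Running Shoes Narrative can be reproduced as below. This is a small sketch: the function name is illustrative and the critical value 2.571 (5 degrees of freedom, upper-tail area 0.025) is supplied from the table, not computed. Small differences in the last decimal place from the answer key come from the key rounding intermediate values.

```python
import math
import statistics

brand_a = [10, 6, 14, 12, 10, 13]
brand_b = [9, 7, 12, 14, 11, 11]

def paired_confidence_interval(x, y, t_crit):
    """Confidence interval for the mean of paired differences x - y."""
    d = [a - b for a, b in zip(x, y)]
    d_bar = statistics.mean(d)
    # Margin of error: critical value times the standard error of the mean difference.
    margin = t_crit * statistics.stdev(d) / math.sqrt(len(d))
    return d_bar - margin, d_bar + margin

lower, upper = paired_confidence_interval(brand_a, brand_b, t_crit=2.571)
print(round(lower, 3), round(upper, 3))
print("zero inside interval:", lower < 0 < upper)
```

An interval that contains 0 leads to the same conclusion as the two-tailed test at the 5% level: no significant difference is detected.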

Chapter 11A—The Analysis of Variance MULTIPLE CHOICE 1. Which of the following values may be analyzed using one-way ANOVA? a. the difference between more than two population means b. the difference between two population variances c. the difference between two sample means d. the difference between two sample population proportions ANS: A TOP: 1–5

PTS: 1 BLM: Remember

REF: 468 | 473-474

2. Which of the following is equal to the test statistic of the completely randomized ANOVA design? a. Sum of squares for treatments/Sum of squares for error b. Sum of squares for error/Sum of squares for treatments c. Mean square for treatments/Mean square for error d. Mean square for error/Mean square for treatments ANS: C BLM: Remember

PTS:

REF: 473-474

TOP: 1–5

3. In one-way ANOVA, suppose that there are three treatments with

and . What is the rejection region for this test at the 5% level of significance? a. F > 3.24 b. F > 3.63 c. F > 3.81 d. F > 4.08 ANS: C TOP: 1–5

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Apply

4. In a one-way ANOVA test, the test statistic is F = 4.25. The rejection region is F > 3.06 for the 5% level of significance, F > 3.8 for the 2.5% level, and F > 4.89 for the 1% level. For this test, which of the following is a valid statement about the approximate p-value? a. It is greater than 0.05. b. It is between 0.025 and 0.05. c. It is between 0.01 and 0.025. d. It is approximately 0.05. ANS: C TOP: 1–5

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

5. Which of the following is NOT a required condition for one-way ANOVA? a. The populations must be normally distributed. b. The sample sizes must be equal. c. The population variances must be equal. d. The samples must be selected randomly and independently from their respective populations. ANS: B BLM: Remember

PTS:

REF: 470 | 474

TOP: 1–5

6. Which of the following is the distribution of the test statistic for analysis of variance? a. the normal distribution b. the Student’s t distribution c. the F distribution d. the binomial distribution ANS: C BLM: Remember

PTS:

REF: 468

TOP: 1–5

7. In the completely randomized design of ANOVA where there are k treatments and n observations, what are the degrees of freedom for the F statistic? a. n and k b. k and n c. n – k and k – 1 d. k – 1 and n – k ANS: D BLM: Remember

PTS:

REF: 474

TOP: 1–5

8. How many degrees of freedom are there for a denominator in a one-way ANOVA test that includes three population means with ten observations sampled from each population? a. 30 b. 27 c. 13 d. 12 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 474

TOP: 1–5

9. In a completely randomized design for ANOVA, the number of degrees of freedom for the numerator and denominator for test statistic are 3 and 16, respectively. Which of the following is equal to the total number of observations? a. 48 b. 32 c. 20 d. 19 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 474

TOP: 1–5

10. What is the term for an experimental plan that creates one treatment group for each treatment, and then randomly assigns each experimental unit to one of these groups? a. a completely randomized design b. a randomized block design c. a matched-pairs design d. either a or b

ANS: A BLM: Remember

PTS:

REF: 469

TOP: 1–5

11. Which of the following are underlying assumptions in order for an ANOVA completely randomized design to be used? a. The ANOVA test assumes that the sampled populations are F-distributed. b. The ANOVA test assumes that the sampled populations have a common variance σ².

c. The ANOVA test assumes that the samples are randomly selected from their respective populations but they need not be independent. d. both a and b. ANS: B BLM: Remember

PTS:

REF: 468

TOP: 1–5

12. Which of the following correctly describes the treatment sum of squares in one-way ANOVA? a. It equals the sum of the squared deviations between each treatment sample mean and the grand mean, multiplied by the number of observations made for each treatment. b. It equals the sum of the square deviations between each block sample mean and the grand mean, multiplied by the number of observations made for each block, multiplied by the number of observations per cell. c. It equals the number of rows (or columns) multiplied by the sum of the squared deviations of the column-block sample means from the grand mean. d. It equals the number of columns (or rows) multiplied by the sum of the squared deviations of the row-block sample means from the grand mean. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 470

TOP: 1–5

13. Which of the following correctly describes the F statistic for the completely randomized design? a. It equals the ratio of mean squares for treatments (MST) to mean squares for error (MSE). b. It equals the ratio of sum of squares for treatments (SST) to sum squares for error (SSE). c. It equals the ratio of sum of squares for treatments (SST) to total sum of squares (Total SS). d. It equals the ratio of sum of squares for error (SSE) to total sum of squares (Total SS). ANS: A BLM: Remember

PTS:

REF: 474

TOP: 1–5

14. In one-way ANOVA, what is the sum of the squared deviations of each individual sample observation (regardless of the sample to which it belongs) from the mean of all observations? a. the F statistic b. the total sum of squares

c. the grand mean d. the unexplained variance ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 471

TOP: 1–5

15. Which of the following is a characteristic of all members of the F distribution family? a. positively skewed between values of –∞ and +∞ b. positively skewed between values of 0 and +∞ c. negatively skewed between values of –∞ and 0 d. both b and c, which is why the normal distribution is a special case of the F-distribution family ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 474

TOP: 1–5

16. In a one-way ANOVA, if the null hypothesis is that all population means are equal, then what would the alternative hypothesis be? a. At least one of the means is different from the others. b. At least two of the means are different from the others. c. At least three of the means are different from the others. d. All population means differ.

ANS: A BLM: Remember

PTS:

REF: 474

TOP: 1–5

17. In a one-way ANOVA, if the null hypothesis is H0: μ1 = μ2 = μ3 = μ4 = μ5 and a sample of size 25 is selected at random from each of the 5 populations, then which of the following is the correct critical value at α = 0.05? a. z = 1.645 b. t = 2.1318 c. F = 2.45 d. χ² = 9.488 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 474

TOP: 1–5

18. In one-way ANOVA, which of the following statistics is used to measure the amount of total variation that is unexplained? a. the sum of squares for treatments b. the sum of squares for error c. the total sum of squares d. the degrees of freedom ANS: B BLM: Remember

PTS:

REF: 471

TOP: 1–5

19. Which of the following is used as the test statistic of the single-factor ANOVA? a. Sum of squares for treatments/Sum of squares for error

b. Sum of squares for error/Sum of squares for treatments c. Mean square for treatments/Mean square for error d. Mean square for error/Mean square for treatments ANS: C BLM: Remember

PTS:

REF: 474

TOP: 1–5

20. In one-way ANOVA, suppose that there are four treatments with and a. b. c. d.

. What is the rejection region for this test at the 5% level of significance? F> F> F> F>

ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 473-474

TOP: 1–5

21. In an ANOVA test, the test statistic is F = 6.75. The rejection region is F > 3.97 for the 5% level of significance, F > 5.29 for the 2.5% level, and F > 7.46 for the 1% level. For this test, what is the p-value? a. greater than 0.05 b. approximately 0.05 c. between 0.025 and 0.05 d. between 0.01 and 0.025 ANS: D TOP: 1–5

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

22. The analysis of variance is a procedure that allows statisticians to compare two or more of which of the following population parameters? a. means b. proportions c. variances d. standard deviations ANS: A BLM: Remember

PTS:

REF: 468

TOP: 1–5

ANOVA table one

Source of Variation   SS   df   MS    F
Treatments            4    2    2.0   0.80
Error                 30   12   2.5
Total                 34   14

23. Refer to ANOVA table one. In this case, how many treatments are there? a. 13 b. 12 c. 5 d. 3 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 471

TOP: 1–5
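The entries of ANOVA table one follow from its sums of squares and degrees of freedom. A minimal sketch, assuming a completely randomized design; the function name `anova_summary` and the dictionary keys are illustrative.

```python
def anova_summary(ss_treat, df_treat, ss_error, df_error):
    """Fill in the mean squares and F statistic of a one-way ANOVA table."""
    mst = ss_treat / df_treat   # mean square for treatments
    mse = ss_error / df_error   # mean square for error
    return {
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
        "treatments": df_treat + 1,               # k = df(treatments) + 1
        "observations": df_treat + df_error + 1,  # n = df(total) + 1
    }

table_one = anova_summary(ss_treat=4, df_treat=2, ss_error=30, df_error=12)
print(table_one)
```

Reading the number of treatments off the table is just the treatment degrees of freedom plus one, which is the reasoning behind question 23.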

24. Refer to ANOVA table one. A one-way ANOVA test is applied to three independent samples having means 10, 13, and 18, respectively. If each observation in the third sample were increased by 30, how would this affect the value of the F-statistics? a. They would increase. b. They would decrease. c. They would remain unchanged. d. They would increase by exactly 30. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 473-474

TOP: 1–5

25. Refer to ANOVA table one. In a completely randomized design for ANOVA, the numerator and denominator degrees of freedom for the test statistic are 4 and 25, respectively. What must the total number of observations equal? a. 24 b. 25 c. 29 d. 30 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 471 | 474

TOP: 1–5

26. How many degrees of freedom are there for the denominator in a one-way ANOVA test involving 4 population means with 15 observations sampled from each population? a. 60 b. 56 c. 45 d. 19 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 474

TOP: 1–5

ANOVA table two

Source of Variation   SS    df   MS     F
Treatments            75    *    25     6.67
Error                 60    *    3.75
Total                 135   19

27. Refer to ANOVA table two. In the table above, what are the respective numerator and denominator degrees of freedom (identified by asterisks)? a. 3 and 16 b. 4 and 15 c. 15 and 4 d. 16 and 3 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 471

TOP: 1–5

28. For which of the following departures from the conditions required for a completely randomized design is the procedure no longer considered to be robust? a. The populations are not normally distributed. b. The population variances are not equal. c. The samples are not independent. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 468 | 470

TOP: 1–5

29. A one-way ANOVA test is performed on three independent samples with

and . What is the critical value obtained from the F-table for this test at the 2.5% level of significance? a. 3.55 b. 4.56 c. 29.45 d. 39.45 ANS: B TOP: 1–5

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Apply

30. A professor of statistics in Simon Fraser University wants to determine whether the average starting salaries among graduates of the nine universities in British Columbia are equal. A sample of 25 recent graduates from each university was randomly taken. The appropriate critical value for the ANOVA test is obtained from the F-distribution. What are the respective numbers of degrees of freedom for this distribution? a. 8 and 216 b. 9 and 25 c. 25 and 15 d. 360 and 14 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 474

TOP: 1–5
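The degrees-of-freedom questions for the completely randomized design all use the same rule: k – 1 for the numerator and n – k for the denominator. A short sketch with illustrative function names:

```python
def crd_degrees_of_freedom(k, n):
    """Numerator and denominator df for the F test with k treatments, n observations."""
    return k - 1, n - k

def total_observations(df_numerator, df_denominator):
    """Recover n from the two df values: n = (k - 1) + (n - k) + 1."""
    return df_numerator + df_denominator + 1

# Nine universities with 25 graduates each (question 30).
print(crd_degrees_of_freedom(k=9, n=9 * 25))
# Three populations with ten observations each (question 8).
print(crd_degrees_of_freedom(k=3, n=3 * 10))
# Numerator df 3 and denominator df 16 (question 9).
print(total_observations(3, 16))
```

The same two helpers cover the reverse questions, where the degrees of freedom are given and the total number of observations is asked for.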

31. Which of the following correctly describes Tukey’s method of paired comparisons? a. It is a statistical technique designed to test whether the means of more than two quantitative populations are equal. b. It is a method employed as a follow-up to ANOVA that seeks out honestly significant differences between paired sample means. c. It is a method to determine whether different statistical populations have equal variances. d. It is a method to measure a statistical test’s sensitivity to any breach of ANOVA basic assumptions. ANS: B BLM: Remember

PTS:

REF: 482-483

TOP: 6

32. Which of the following is NOT a property of Tukey’s Multiple Comparison Method? a. It is based on the studentized range statistic q to obtain the critical value needed to construct individual confidence intervals.

b. It requires that all sample sizes are equal, or at least similar. c. It can be used instead of the analysis of variance. d. It puts no restriction on the sample means. ANS: D BLM: Remember

PTS:

REF: 482-483

TOP: 6

33. Why would you use the Tukey multiple comparison? a. to test for normality b. to test for homogeneity of variance c. to test independence of errors d. to test for differences in pairwise means ANS: D BLM: Remember

PTS:

REF: 482-483

TOP: 6

34. The equation: “Total SS = SST + SSB + SSE” applies to which ANOVA model? a. one-way ANOVA b. a two-factor factorial design c. completely randomized design d. randomized block design ANS: D BLM: Remember

PTS:

REF: 487

TOP: 7–8

35. In the randomized block design for ANOVA where k is the number of treatments and b is the number of blocks, the degrees of freedom for error are given by which of the following expressions? a. bk – 1 b. kb + 1 c. (b – 1)(k – 1) d. k + b – 1 ANS: C BLM: Remember

PTS:

REF: 487-488

TOP: 7–8

36. What is an analysis of variance that controls extraneous factors by using the randomized block design? a. one-way ANOVA b. two-way ANOVA c. three-way ANOVA d. four-way ANOVA ANS: B BLM: Remember

PTS:

REF: 485-486

TOP: 7–8

37. In the randomized block design for ANOVA, which of the following statements is NOT a property of blocks variation? a. It equals the sum of the squared deviations between each block sample mean and the grand mean. b. It measures the variation among block sample means, which is attributable not to chance but to inherent differences among blocks of experimental units. c. In two-way ANOVA, it equals the blocks mean square. ANS: A BLM: Remember

PTS:

REF: 486-488

TOP: 7–8

38. In a randomized block design of ANOVA, how many factors are there to be analyzed? a. one factor b. two factors c. three factors d. four or more factors ANS: B BLM: Remember

PTS:

REF: 486

TOP: 7–8

39. In a randomized block design of ANOVA, which of the following correctly describes the number of degrees of freedom associated with the sum of squares for treatments? a. one less than the total number of observations in all samples b. one less than the number of blocks c. one less than the number of populations involved ANS: C BLM: Remember

PTS:

REF: 486

TOP: 7–8

40. The F-test of the randomized block design of the analysis of variance requires that the random variable of interest must be normally distributed and the population variances must be equal. When the random variable is NOT normally distributed, which of the following tests may we use? a. one-way ANOVA b. two-way ANOVA c. chi-square test d. Friedman test ANS: D BLM: Remember

PTS:

REF: 509 | 686

TOP: 7–8

41. In the randomized block design for ANOVA, where k is the number of treatments and b is the number of blocks, how many degrees of freedom for error are there? a. k – 1 b. b – 1 c. (k – 1)(b – 1) d. kb – 1 ANS: C BLM: Remember

PTS:

REF: 487-488

TOP: 7–8

42. Three tennis players, one a beginner, one intermediate, and one advanced, have been randomly selected from the membership of a racquet facility club in a large city. Using the same tennis ball, each player hits ten serves, one with each of three racquet models, with the three racquet models selected randomly. The speed of each serve is measured with a machine and the result recorded. Among the ANOVA models listed below, which is the most likely model to fit this situation? a. the one-way ANOVA b. Tukey’s method c. the randomized block design d. the matched-pairs model ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 486

TOP: 7–8

43. What is the primary interest when designing a randomized block experiment? a. to reduce the variation among blocks b. to increase the between-treatments variation to more easily detect differences among the treatment means c. to reduce the within-treatments variation to more easily detect differences among the treatment means d. to increase the total sum of squares ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 486

TOP: 7–8

44. The randomized block design with exactly two treatments is equivalent to which of the following two-tailed tests? a. independent samples z-test b. independent samples equal-variances t-test c. independent samples unequal-variances t-test d. matched pairs t-test ANS: D BLM: Remember

PTS:

REF: 486

TOP: 7–8

45. In the randomized block design ANOVA, which of the expressions below is equal to the sum of squares for error? a. Total SS – SST b. Total SS – SSB c. Total SS – SST – SSB d. Total SS – SS(A) – SS(B) – SS(AB) ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 487-488

TOP: 7–8

46. A randomized block design with 4 treatments and 5 blocks produced the following sum of squares values: Total SS = 1951, SST = 349, SSE = 188. What must the value of SSB be? a. 537 b. 1414 c. 1602 d. 1763

ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 487-488

TOP: 7–8
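For the randomized block design, the partition Total SS = SST + SSB + SSE and the matching degrees of freedom can be written out directly. The sketch below uses the figures from question 46; the function name and dictionary keys are illustrative.

```python
def rbd_partition(total_ss, sst, sse, k, b):
    """Block sum of squares and df breakdown for k treatments in b blocks."""
    ssb = total_ss - sst - sse  # rearranged from Total SS = SST + SSB + SSE
    df = {
        "treatments": k - 1,
        "blocks": b - 1,
        "error": (k - 1) * (b - 1),
        "total": k * b - 1,
    }
    return ssb, df

ssb, df = rbd_partition(total_ss=1951, sst=349, sse=188, k=4, b=5)
print(ssb, df)
```

The three component degrees of freedom add up to the total, which is a quick consistency check on any randomized block ANOVA table.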

47. The equation “Total SS = SSA + SSB + SS(AB) + SSE” applies to which ANOVA model? a. one-way ANOVA b. a two-factor factorial design c. completely randomized design d. randomized block design ANS: B BLM: Remember

PTS:

REF: 498-499

TOP: 9–10

48. In the factorial experiment, where a is the number of levels for factor A, b is the number of levels for factor B, and r is the number of replications of each of the ab factor combinations, the degrees of freedom for interaction is given by which of the listed expressions below? a. (a – 1)(b – 1) b. (a – 1)(r – 1) c. (b – 1)(r – 1) d. ab(r – 1) ANS: A BLM: Remember

PTS:

REF: 500

TOP: 9–10

49. Under which of the following conditions is a complete 3 × 2 factorial experiment said to be balanced? a. Factor A has three levels. b. Factor B has two levels. c. The number of replicates is the same for each treatment. d. The number of observations for each combination of factor A and factor B levels equal at least five. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 498

TOP: 9–10

50. In the two-way ANOVA, where a is the number of factor A levels, b is the number of factor B levels, and r is the number of replicates, which of the following expressions gives the formula for the number of degrees of freedom for error? a. (a – 1)(b – 1) b. abr – 1 c. (a – 1)(r – 1) d. ab(r – 1) ANS: D BLM: Remember

PTS:

REF: 500

TOP: 9–10
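The degrees-of-freedom formulas for an a × b factorial experiment with r replications can be tabulated and checked against one another. This is a sketch only; the function name `factorial_df` is hypothetical, and the values a = 4, b = 5, r = 3 are chosen for illustration.

```python
def factorial_df(a, b, r):
    """Degrees of freedom for an a x b factorial with r replications per cell."""
    return {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),    # interaction
        "Error": a * b * (r - 1),
        "Total": a * b * r - 1,
    }


df = factorial_df(a=4, b=5, r=3)
for source, value in df.items():
    print(f"{source:<6} {value}")

# The component degrees of freedom must add up to the total.
assert df["A"] + df["B"] + df["AB"] + df["Error"] == df["Total"]
```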

51. When the effect of a level for one factor depends on which level of another factor is present, what is the most appropriate ANOVA design to use in this situation? a. one-way ANOVA b. two-way ANOVA c. randomized block design d. matched pairs design ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 496

TOP: 9–10

52. In which of the following models can interaction in an experimental design be tested? a. a completely randomized model b. a randomized block model c. a two-factor model d. all ANOVA models ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 498-499

TOP: 9–10

53. In a two-way ANOVA, there are 4 levels for factor A, 5 levels for factor B, and 3 observations for each combination of factor A and factor B levels. What is the number of treatments in this experiment? a. 60 b. 25 c. 20 d. 16 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 498

TOP: 9–10

54. In a two-way ANOVA, where a is the number of factor A levels and b is the number of factor B levels, which of the following expressions gives the number of the degrees of freedom for the “error term”? a. (a – 1)(b – 1) b. n – ab c. (a – 1) + (b – 1) d. abn + 1 ANS: B BLM: Remember

PTS:

REF: 499-500

TOP: 9–10

TRUE/FALSE 1. Analysis of variance (ANOVA) is a procedure for comparing more than two population means. ANS: T BLM: Remember

PTS:

REF: 468

TOP: 1–5

2. One of the assumptions underlying analysis of variance for a completely randomized design is that the observations within each population are normally distributed with unequal variance. ANS: F BLM: Remember

PTS:

REF: 469

TOP: 1–5

3. Given the significance level 0.05, the F-value for the degrees of freedom $df_1$ = 5 and $df_2$ = 8 is 4.82. ANS: F TOP: 1–5

PTS: 1 REF: 474 | 725-732 BLM: Higher Order - Apply

4. In analysis of variance, the sum of squares for treatments (SST) is 0 when all the sample means are equal. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 470

TOP: 1–5

5. In analysis of variance, the sum of squares for error (SSE) is 0 when all the sample variances are equal. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 471

TOP: 1–5

6. The analysis of variance (ANOVA) technique analyzes the variance of the data to determine whether differences exist between the population variances. ANS: F BLM: Remember

PTS:

REF: 468

TOP: 1–5

7. The equation: Total SS = SST + SSB + SSE, applies to the completely randomized design (one-way ANOVA model). ANS: F BLM: Remember

PTS:

REF: 471

TOP: 1–5

8. Two samples of ten persons, one from the male workers and the second from the female workers of a large company, have been taken. The data involved the wage rate of each worker. To test whether there is any difference in the average wage rate between male and female workers, the most likely ANOVA design to fit this test situation is the randomized block design. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 469

TOP: 1–5

9. In ANOVA, a factor is an independent variable whose values are controlled and varied by the experimenter, while a level is the intensity setting of a factor. ANS: T BLM: Remember

PTS:

REF: 467

TOP: 1–5

10. Randomization is a procedure that eliminates the effects of extraneous factors during an experiment, even prior to the administration of treatments, by creating blocks of experimental units so that all units within any one block are as alike as possible with respect to these factors. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 469-470

TOP: 1–5

11. The completely randomized design is an experimental plan that creates one treatment group for each treatment, and then randomly assigns each experimental unit to one of these groups. ANS: T BLM: Remember

PTS:

REF: 469-470

TOP: 1–5

12. The analysis of variance, ANOVA, is best described as a statistical technique designed to test whether the means of more than two quantitative populations are equal. ANS: T BLM: Remember

PTS:

REF: 468

TOP: 1–5

13. One-factor or one-way ANOVA is one of several versions of the analysis of variance, which controls extraneous factors by using the randomized group design. ANS: T BLM: Remember

PTS:

REF: 469-470

TOP: 1–5

14. Two-factor or two-way ANOVA is one of several versions of the analysis of variance, which controls extraneous factors by using the randomized block design. ANS: T BLM: Remember

PTS:

REF: 485-486

TOP: 1–5

15. In the analysis of variance, explained variation equals the mean square for treatments in one-way ANOVA. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 470-471

TOP: 1–5

16. The F statistic is the arithmetic mean of two or more means. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 473-474

TOP: 1–5

17. In the analysis of variance, the ANOVA table is a summary table that shows, for each source of variation, the sum of squares, the degrees of freedom, and the ratio of the sum of squares to the associated degrees of freedom (called the mean square), and also shows the F statistic. ANS: T BLM: Remember

PTS:

REF: 471

TOP: 1–5

18. In the analysis of variance, the sum of all samples’ sums of squared deviations of individual observations from their sample mean equals the error mean square. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 470-471

TOP: 1–5

19. In one-way ANOVA, there is a single factor of interest but there may be multiple levels of the factor. ANS: T BLM: Remember

PTS:

REF: 469

TOP: 1–5

20. The term “one-way ANOVA” refers to the fact that in conducting the test, there is only one way to partition the total sum of squares. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 469-471

TOP: 1–5

21. The one-way ANOVA assumes that the population means are equal. ANS: F BLM: Remember

PTS:

REF: 468

TOP: 1–5

22. In a one-way ANOVA, the total variation in the data across the various factor levels can be partitioned into two parts: the between-samples variation (called the sum of squares for treatments) and the within-samples variation (called the sum of squares for error). ANS: T BLM: Remember

PTS:

REF: 470-471

TOP: 1–5

23. In a one-way ANOVA, the null hypothesis is written in terms of the population variances. ANS: F BLM: Remember

PTS:

REF: 474

TOP: 1–5

24. The one-way ANOVA requires that the sample sizes are all equal. ANS: F BLM: Remember

PTS:

REF: 474

TOP: 1–5

25. In a one-way ANOVA, total SS is equal to 12,345, and SST is equal to 10,685, then SSE = 1660. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 470-471

TOP: 1–5

26. In a one-way ANOVA, if the sum of squares for error (SSE) is large relative to the sum of squares for treatments (SST), this is an indication that the population means are likely to be different. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 470-471

TOP: 1–5

27. In a one-way ANOVA, if the null hypothesis that all population means are equal is rejected, then we can conclude that the alternative hypothesis is true and that all population means differ. ANS: F BLM: Remember

PTS:

REF: 474

TOP: 1–5

28. The sum of squares for treatments, SST, achieves its smallest value (0) when all the sample means are equal. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 470-471

TOP: 1–5

29. The F test of the analysis of variance requires that the populations be normally distributed with equal variances. ANS: T BLM: Remember

PTS:

REF: 474

TOP: 1–5

30. In one-way ANOVA, the test statistic is defined as the ratio of the mean square for error (MSE) and the mean square for treatments (MST); namely, F = MSE/MST. ANS: F BLM: Remember

PTS:

REF: 474

TOP: 1–5

31. The calculated value of F in a one-way ANOVA is 7.88. The numerator and denominator degrees of freedom are 3 and 9, respectively. The most accurate statement to be made about the p-value is that p-value < 0.01. ANS: T TOP: 1–5

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Apply

32. The numerator or MST degrees of freedom is 3 and the denominator or MSE degrees of freedom is 18. The total number of observations in the completely randomized design must equal 20. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 470-471

TOP: 1–5

33. In any ANOVA test, whenever the computed F statistic falls short of a chosen critical value of F, the null hypothesis of equal population means should be rejected.

ANS: F TOP: 1–5

PTS: 1 REF: 474 | 490 | 500 BLM: Higher Order - Understand

34. In any ANOVA test, whenever the computed F statistic falls short of a chosen critical value of F, the alternative hypothesis of unequal population means should be accepted. ANS: F TOP: 1–5

PTS: 1 REF: 474 | 490 | 500 BLM: Higher Order - Understand

35. In any ANOVA test, whenever the computed F statistic exceeds a chosen critical value of F, the null hypothesis of equal population means should be rejected. ANS: T TOP: 1–5

PTS: 1 REF: 474 | 490 | 500 BLM: Higher Order - Understand

36. In a one-way ANOVA, if the null hypothesis is rejected, it may still be possible that two or more of the population means are equal. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 474

TOP: 1–5

37. In a one-way ANOVA, the degrees of freedom associated with the sum of squares for treatments is equal to one less than the number of populations. ANS: T BLM: Remember

PTS:

REF: 474

TOP: 1–5

38. In a one-way ANOVA, the mean squares for treatments (MST) will be larger than the mean squares for error (MSE) if the null hypothesis is rejected. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 474

TOP: 1–5

39. In a one-way ANOVA, the sum of squares for treatments (SST) measures the amount of variation among the samples. ANS: T BLM: Remember

PTS:

REF: 470-471

TOP: 1–5

40. In a one-way ANOVA, the sum of squares for error (SSE) measures the amount of variation among the samples. ANS: F BLM: Remember

PTS:

REF: 470-471

TOP: 1–5

41. Multiple comparison methods are used in one-way ANOVA if the null hypothesis of no difference among the treatment means is rejected. ANS: T BLM: Higher Order - Understand

PTS:

REF: 482

TOP: 6

42. Tukey’s multiple comparison method determines a critical number such that, if any pair of sample means has a difference greater than this critical number, we conclude that the pair’s two corresponding population means are different. ANS: T BLM: Remember

PTS:

REF: 482

TOP: 6

43. Tukey’s multiple comparison method determines a critical number such that, if any pair of sample means has a difference smaller than this critical number, we conclude that the pair’s two corresponding population means are different. ANS: F BLM: Remember

PTS:

REF: 482

TOP: 6

44. The studentized range is the difference between the smallest and the largest in a set of k sample means. It is used for determining whether there is a difference in a pair of population means. ANS: T BLM: Remember

PTS:

REF: 482

TOP: 6

45. Tukey’s method for making paired comparisons is based on the usual ANOVA assumptions. ANS: T BLM: Remember

PTS:

REF: 482

TOP: 6

46. Tukey’s method for paired comparisons makes the probability of declaring that a difference exists between at least one pair in a set of k treatments, when no difference exists, equal to 1 – $\alpha$. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 482

TOP: 6

47. Tukey’s method for paired comparisons assumes that the sample means are equal and independent of each other. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 482

TOP: 6

48. In a randomized block design of ANOVA, the sum of squares for blocks measures the variation among the block means. ANS: T BLM: Remember

PTS:

REF: 487

TOP: 7–8

49. In a randomized block design of ANOVA, the sum of squares for error measures the variation of the differences among the treatment observations within blocks.

ANS: T BLM: Remember

PTS:

REF: 487

TOP: 7–8

50. A randomized block design should NOT be used when treatments and blocks both correspond to experimental factors of interest to the researcher. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 485-486

TOP: 7–8

51. Three tennis players, one a beginner, one experienced, and one professional, have been randomly selected from the membership of a large city tennis club. Using the same ball, each person hits four serves with each of five racquet models, with the five racquet models selected randomly. Each serve is clocked with a radar gun and the result recorded. Among ANOVA models, this setup is most like the randomized block design. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 485-486

TOP: 7–8

52. In a two-way ANOVA, there are 4 levels for factor A, 3 levels for factor B, and 3 observations within each of the 12 factor combinations. The number of treatments in this experiment will be 36. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 498

TOP: 7–8

53. The randomized block design is a two-way classification design. ANS: T BLM: Remember

PTS:

REF: 485-486

TOP: 7–8

54. A randomized block experiment having five treatments and four blocks produced the following values: Total SS = total sum of squares = 1500, SST = sum of squares for treatments = 275, SSE = sum of squares for error = 153. The value of MSB = mean square for blocks must be 268. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 487-488

TOP: 7–8

55. In a two-way ANOVA, there are 4 levels for factor A, 3 levels for factor B, and 2 observations within each of the 12 factor combinations. The number of treatments in this experiment will be 12. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 498

TOP: 7–8

56. Blocking is a procedure that lets extraneous factors operate during an experiment but assures—by virtue of the random selection of experimental units and their subsequent random assignment to experimental and control groups—that each treatment has an equal chance to be enhanced or handicapped by these factors. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 485-486

TOP: 7–8

57. In employing the randomized block design, the primary interest lies in reducing sum of squares for blocks (SSB). ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 485-486

TOP: 7–8

58. The completely randomized design is an experimental plan that divides all available experimental units into blocks of fairly homogeneous units—each block containing as many units as there are treatments or some multiple of that number—and then randomly matches each treatment with one or more units within each block. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 485 | 469

TOP: 7–8

59. In the analysis of variance, the blocks mean square is equal to the sum of the squared deviations between each block sample mean and the grand mean, multiplied by the number of observations made for each block, multiplied by the number of observations per cell (which may well equal 1). ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 487-488

TOP: 7–8

60. Two samples of ten each from the male and female workers of a large company have been taken. The data involved the wage rate of each worker. To test whether there is any difference in the average wage rate between male and female workers, a pooled-variances t test will be considered. Another test option to consider is ANOVA. The most likely ANOVA to fit this test situation is the randomized block design. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 485-486

TOP: 7–8

61. When the problem objective is to compare more than two populations, the experimental design that is the counterpart of the matched pairs experiment is called the randomized block design. ANS: T BLM: Remember

PTS:

REF: 486

TOP: 7–8

62. A randomized block design ANOVA has two treatments. The test to be performed in this procedure is equivalent to dependent samples t test.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 486

TOP: 7–8

63. A randomized block design ANOVA has five treatments and four blocks. The computed test statistic (value of F) is 6.25. With a 0.05 significance level, the conclusion will be to accept the null hypothesis. ANS: F TOP: 7–8

PTS: 1 REF: 489-490 | 725-732 BLM: Higher Order - Analyze

64. The randomized block design is also called the two-way analysis of variance. ANS: T BLM: Remember

PTS:

REF: 485-486

TOP: 7–8

65. The F test of the randomized block design of the analysis of variance has the same requirements as the independent samples design; that is, the random variable must be normally distributed and the population variances must be equal. ANS: T BLM: Remember

PTS:

REF: 490

TOP: 7–8

66. The purpose of designing a randomized block experiment is to reduce the between-treatments variation (SST) to more easily detect differences between the treatment means. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 486

TOP: 7–8

67. If we first arrange test units into similar groups before assigning treatments to them, the test design we should use is the randomized block design. ANS: T BLM: Remember

PTS:

REF: 485-486

TOP: 7–8

68. The randomized block design with two treatments is equivalent to a nondirectional dependent samples z-test. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 486

TOP: 7–8

69. A randomized block design with four treatments and five blocks produced the following sum of squares values: SS(Total) = 2000, SST = 400, SSE = 200. The value of MSB must be 350. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 487-488

TOP: 7–8

70. In ANOVA for an a × b factorial experiment, it is necessary to have at least two measurements at each level of each factor in order to analyze any interaction between the factors. ANS: T BLM: Remember

PTS:

REF: 498

TOP: 9–10

71. The number of cells in a two-factor ANOVA is equal to a + b – 1; where a is the number of levels of factor A and b is the number of levels of factor B. ANS: F BLM: Remember

PTS:

REF: 498

TOP: 9–10

72. In order to conduct a two-factor ANOVA with replications, the number of replications, r, must be the same in each cell. ANS: T BLM: Remember

PTS:

REF: 498

TOP: 9–10

73. In a two-way factor ANOVA, the smallest number of replications required in any cell is two, but all cells must have the same number of replications. ANS: T BLM: Remember

PTS:

REF: 498

TOP: 9–10

74. In a two-way factor ANOVA, the total sum of squares can be partitioned into four parts: the variation due to factor A, the variation due to factor B, the error variation, and the variation due to blocking. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 498-500

TOP: 9–10

75. In a two-way ANOVA, the variances of the populations are assumed to be equal unless the error variation is 0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 489 | 508

TOP: 9–10

76. In a two-way factor ANOVA, if factors A and B do not interact, then neither A nor B can be considered statistically significant. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 502

TOP: 9–10

77. In a two-way factor ANOVA with replications, the null hypothesis for testing whether interaction exists is that no interaction exists, while the alternative hypothesis is that interaction does exist. ANS: T BLM: Remember

PTS:

REF: 500

TOP: 9–10

78. In a two-way factor ANOVA with replications in which all hypotheses are to be tested at the 0.05 significance level, if the p-value for interaction is 0.0257, then we should conclude that no interaction exists between the levels of the two factors. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 500

TOP: 9–10

79. In a two-way factor ANOVA with replications, the reason for separating out the sum of squares due to interaction between factors A and B is to increase the chance of detecting significant differences across levels of factor A and factor B. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 498-499

TOP: 9–10

80. In a two-way ANOVA, it is easier to interpret main effects when the interaction component is not significant. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 502

TOP: 9–10

81. A study will be undertaken to examine the effect of two kinds of background music and of two assembly methods on the output of workers at a fitness shoe factory. Two workers will be randomly assigned to each of four groups, for a total of eight in the study. Each worker will be given a headphone set so that the music type can be controlled. The number of shoes completed by each worker will be recorded. The ANOVA model most likely to fit this situation is the two-way analysis of variance. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 496-498

TOP: 9–10

82. In a two-factor ANOVA, the sum of squares due to both factors, the interaction sum of squares, and the error sum of squares, must add up to the total sum of squares. ANS: T BLM: Remember

PTS:

REF: 498-499

TOP: 9–10

Chapter 11B—The Analysis of Variance PROBLEM Manufacturing Plant Machines A mechanical engineer at a manufacturing plant keeps a close watch on the performance and condition of the machines. The following data are the weight losses (in milligrams) of certain machine parts due to friction when used with three different lubricants.

                             Lubricant A   Lubricant B   Lubricant C
                             12            10            12
                             9             8             8
                             8             9             14
                             11            13            11
                             10            8             12
                             8             7             13
                             7             5             7
                             6             11            8
Total ($T_i$)                71            71            85
Mean ($\bar{x}_i$)           8.875         8.875         10.625
Standard Deviation ($s_i$)   2.031         2.475         2.615

$\sum x_{ij}^2$ = 2283.0, and $\sum x_{ij}$ = 227

1. Refer to Manufacturing Plant Machines graphics. State the null and alternative hypotheses to test whether there is a significant difference in mean weight losses among the three lubricants. ANS: $H_0: \mu_A = \mu_B = \mu_C$ vs. $H_a$: At least one of the population means is different from the others. PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

2. Refer to Manufacturing Plant Machines graphics. How many degrees of freedom are associated with the F test statistic? ANS: The test statistic F has an F distribution with degrees of freedom given by $df_1$ = k – 1 = 2 and $df_2$ = n – k = 24 – 3 = 21.

PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

3. Refer to Manufacturing Plant Machines graphics. Calculate the value of the test statistic F. ANS: The correction for the mean is CM = $(\sum x_{ij})^2/n$ = $(227)^2$/24 = 2,147.042. Then, Total SS = $\sum x_{ij}^2$ – CM = 2283 – 2147.042 = 135.958. SST = $\sum T_i^2/n_i$ – CM = 2163.375 – 2147.042 = 16.333. MST = SST/(k – 1) = 16.333/2 = 8.1665. SSE = Total SS – SST = 119.625. MSE = SSE/(n – k) = 119.625/21 = 5.6964. The value of the test statistic is F = MST/MSE = 8.1665/5.6964 = 1.4336. PTS: 1 REF: 473-474 BLM: Higher Order - Apply

TOP: 1–5
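The hand calculation above (correction for the mean, Total SS, SST, SSE, mean squares, and F) can be reproduced from the raw lubricant data. The following is a minimal Python sketch, not a definitive implementation; the function name `one_way_anova` and the dictionary keys are hypothetical.

```python
def one_way_anova(samples):
    """One-way ANOVA via the correction-for-the-mean shortcut formulas."""
    k = len(samples)                                    # number of treatments
    n = sum(len(s) for s in samples)                    # total observations
    grand_total = sum(sum(s) for s in samples)
    cm = grand_total ** 2 / n                           # correction for the mean
    total_ss = sum(x * x for s in samples for x in s) - cm
    sst = sum(sum(s) ** 2 / len(s) for s in samples) - cm
    sse = total_ss - sst
    mst = sst / (k - 1)
    mse = sse / (n - k)
    return {"CM": cm, "TotalSS": total_ss, "SST": sst, "SSE": sse,
            "MST": mst, "MSE": mse, "F": mst / mse}


lubricant_a = [12, 9, 8, 11, 10, 8, 7, 6]
lubricant_b = [10, 8, 9, 13, 8, 7, 5, 11]
lubricant_c = [12, 8, 14, 11, 12, 13, 7, 8]

result = one_way_anova([lubricant_a, lubricant_b, lubricant_c])
for name, value in result.items():
    print(f"{name:<8} {value:.4f}")
```

Small differences in the last decimal place relative to the hand calculation come from rounding intermediate values.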

4. Refer to Manufacturing Plant Machines graphics. Set up the ANOVA Table. ANS:

Source of Variation   df   SS        MS       F
Treatments            2    16.333    8.1665   1.4336
Error                 21   119.625   5.6964
Total                 23   135.958

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

5. Refer to Manufacturing Plant Machines graphics. Set up the rejection region for $\alpha$ = 0.01. ANS: With $df_1$ = 2, $df_2$ = 21, and $\alpha$ = 0.01, the rejection region is: reject $H_0$ if F > $F_{0.01}$ = 5.78.

PTS: 1 REF: 473-474 | 725-732 BLM: Higher Order - Apply

TOP: 1–5

6. Refer to Manufacturing Plant Machines graphics. What is the appropriate conclusion for the test? ANS: Since F = 1.4336 < 5.78, do not reject $H_0$; therefore one cannot conclude that there is a significant difference in average weight loss among the three lubricants. PTS: 1 REF: 473-474 BLM: Higher Order - Evaluate

TOP: 1–5

Laundry Detergent Preference Narrative A consumer would like to know if there is a difference in the performance of three different laundry detergents. Eighteen pieces of white cloth were soiled with grape juice, then washed in one of the three detergents A, B, or C. The resulting whiteness readings are listed below, where the larger number indicates whiter fabric.

Detergent A   Detergent B   Detergent C
77            73            84
71            68            78
81            66            82
78            74            81
76            70            78
80            60            80

7. Refer to Laundry Detergent Preference Narrative. State the null and alternative hypotheses to test the hypothesis of equality of the mean whiteness readings for the three detergents. ANS: $H_0: \mu_A = \mu_B = \mu_C$ vs. $H_a$: At least one of the population means is different from the others. PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

8. Refer to Laundry Detergent Preference Narrative. How many degrees of freedom are associated with the F test statistic? ANS: The test statistic F has an F distribution with degrees of freedom given by $df_1$ = k – 1 = 2 and $df_2$ = n – k = 18 – 3 = 15.

PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

9. Refer to Laundry Detergent Preference Narrative. Calculate the value of the test statistic F. ANS: $\sum x_{ij}^2$ = 102,985, $\sum x_{ij}$ = 1,357, $T_A$ = 463, $T_B$ = 411, and $T_C$ = 483. The correction for the mean is CM = $(\sum x_{ij})^2/n$ = $(1357)^2$/18 = 102,302.722. Then, Total SS = $\sum x_{ij}^2$ – CM = 102,985 – 102,302.722 = 682.278. SST = $\sum T_i^2/n_i$ – CM = 102,763.1667 – 102,302.722 = 460.445. MST = SST/(k – 1) = 460.445/2 = 230.223. SSE = Total SS – SST = 682.278 – 460.445 = 221.833. MSE = SSE/(n – k) = 221.833/15 = 14.789. The value of the test statistic is F = MST/MSE = 230.223/14.789 = 15.567.

PTS: 1 REF: 473-474 BLM: Higher Order - Apply

TOP: 1–5

10. Refer to Laundry Detergent Preference Narrative. Set up the ANOVA Table. ANS:

Source of Variation   df   SS        MS        F
Treatments            2    460.445   230.223   15.567
Error                 15   221.833   14.789
Total                 17   682.278

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

11. Refer to Laundry Detergent Preference Narrative. Set up the rejection region for $\alpha$ = 0.05. ANS: With $df_1$ = 2, $df_2$ = 15, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 3.68.

PTS: 1 REF: 473-474 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

12. Refer to Laundry Detergent Preference Narrative. What is the appropriate conclusion for the test? ANS: Since F = 15.567 > 3.68, we reject $H_0$. The sample evidence supports that the true mean whiteness readings are not the same for the three detergents. PTS: 1 REF: 473-474 BLM: Higher Order - Evaluate

TOP: 1–5

13. Refer to Laundry Detergent Preference Narrative. Develop a 95% confidence interval for $\mu_A$. ANS: $\bar{x}_A$ = 463/6 = 77.167. The 95% confidence interval for $\mu_A$ is $\bar{x}_A \pm t_{0.025}\sqrt{MSE/n_A}$ = 77.167 ± 2.131 ($\sqrt{14.789/6}$) = 77.167 ± 3.346, or between 73.821 and 80.513. PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5

14. Refer to Laundry Detergent Preference Narrative. Develop and interpret a 95% confidence interval for $\mu_B - \mu_C$. ANS: $\bar{x}_B$ = 411/6 = 68.5, and $\bar{x}_C$ = 483/6 = 80.5. The 95% confidence interval for $\mu_B - \mu_C$ is $(\bar{x}_B - \bar{x}_C) \pm t_{0.025}\sqrt{MSE(1/n_B + 1/n_C)}$ = (68.5 – 80.5) ± 2.131 (2.220) = –12 ± 4.731, or between –16.731 and –7.269. Since this interval does not contain 0, detergents B and C are significantly different from each other. PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5
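The confidence interval for a difference of two treatment means uses the pooled MSE from the ANOVA table. Below is a minimal Python sketch of that arithmetic for detergents B and C. The function name `diff_ci` is hypothetical, and the critical value t = 2.131 (15 degrees of freedom) is taken from the answer above rather than computed.

```python
import math


def diff_ci(mean_i, mean_j, mse, n_i, n_j, t_crit):
    """Confidence interval for mu_i - mu_j using the pooled MSE from ANOVA."""
    std_err = math.sqrt(mse * (1 / n_i + 1 / n_j))
    margin = t_crit * std_err
    diff = mean_i - mean_j
    return diff - margin, diff + margin


# Detergent B vs. detergent C: means 411/6 and 483/6, MSE = 14.789
lower, upper = diff_ci(mean_i=411 / 6, mean_j=483 / 6,
                       mse=14.789, n_i=6, n_j=6, t_crit=2.131)
print(f"95% CI for mu_B - mu_C: ({lower:.3f}, {upper:.3f})")

# An interval that excludes 0 indicates a significant difference.
print("Contains 0:", lower <= 0 <= upper)
```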

Breaking Strength of Thread Narrative A textile company is interested in knowing if there is a difference in the breaking strength of four different kinds of thread.

                             Thread 1   Thread 2   Thread 3   Thread 4
                             17.8       21.2       16.4       19.2
                             15.2       18.7       18.0       18.0
                             16.5       18.6       16.7       17.9
                             16.3       20.6       19.5       18.1
                             18.5       17.9       16.8       20.3
Total ($T_i$)                84.3       97.0       87.4       93.5
Means ($\bar{x}_i$)          16.86      19.40      17.48      18.70
Standard Deviation ($s_i$)   1.301      1.420      1.283      1.037

$\sum x_{ij}^2$ = 6605.02, and $\sum x_{ij}$ = 362.2

15. Refer to Breaking Strength of Thread Narrative. State the null and alternative hypotheses to test whether there is a significant difference in mean breaking strength of the four kinds of thread. ANS: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: At least one of the population means is different from the others. BLM: Higher Order - Analyze

PTS:

REF: 473-474

TOP: 1–5

16. Refer to Breaking Strength of Thread Narrative. Calculate the test statistic. ANS: The correction for the mean is CM = $(\sum x_{ij})^2/n$ = $(362.2)^2$/20 = 6559.442. Then, Total SS = $\sum x_{ij}^2$ – CM = 6605.02 – 6559.442 = 45.578. SST = $\sum T_i^2/n_i$ – CM = 6579.3 – 6559.442 = 19.858. MST = SST/(k – 1) = 19.858/3 = 6.619. SSE = Total SS – SST = 45.578 – 19.858 = 25.72. MSE = SSE/(n – k) = 25.72/16 = 1.6075. The value of the test statistic is F = MST/MSE = 6.619/1.6075 = 4.118. PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

17. Refer to Breaking Strength of Thread Narrative. Set up the ANOVA Table. ANS:

Source of Variation   df   SS       MS       F
Treatments            3    19.858   6.619    4.118
Error                 16   25.72    1.6075
Total                 19   45.578

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

18. Refer to Breaking Strength of Thread Narrative. Set up the rejection region for $\alpha$ = 0.10. ANS: With $df_1$ = 3, $df_2$ = 16, and $\alpha$ = 0.10, the rejection region is: reject $H_0$ if F > $F_{0.10}$ = 2.46.

PTS: 1 REF: 473-474 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

19. Refer to Breaking Strength of Thread Narrative. What conclusion can be drawn? Give reasons for your answer. ANS: Since F = 4.118 > 2.46, reject $H_0$ and conclude that at least one of the mean breaking strengths of the four kinds of threads is significantly different from the others. PTS: 1 REF: 473-475 BLM: Higher Order - Evaluate

TOP: 1–5

20. Refer to Breaking Strength of Thread Narrative. Find and interpret the approximate p-value.

ANS: p-value = P(F > 4.118). For degrees of freedom 3 and 16, we have P(F > 4.03) = 0.025 and P(F > 5.29) = 0.010. Thus 0.010 < p-value < 0.025. This would indicate the results are significant (i.e., at least one of the mean breaking strengths is significantly different) at the 0.05 level of significance. PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

21. Refer to Breaking Strength of Thread Narrative. Find and interpret a 90% confidence interval for $\mu_1 - \mu_3$. ANS: $\bar{x}_1$ = 84.3/5 = 16.86, and $\bar{x}_3$ = 87.4/5 = 17.48. The 90% confidence interval for $\mu_1 - \mu_3$ is $(\bar{x}_1 - \bar{x}_3) \pm t_{0.05}\sqrt{MSE(1/n_1 + 1/n_3)}$ = (16.86 – 17.48) ± 1.746 (0.802) = –0.62 ± 1.40, or between –2.02 and 0.78. Since 0 is within the limits of the confidence interval, one cannot conclude that there is a significant difference in average breaking strength between thread kind 1 and thread kind 3. PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5

22. Based on public demand, price for seeds, and average yield, a farmer must choose which variety of wheat to grow. In the first step toward making a decision, the farmer planted eight test plots each with three varieties of wheat. The recorded yields (in kilograms per plot) were used in an analysis of variance. Use the output below from the analysis of variance to test the null hypothesis of no difference among the mean yields of the three varieties of wheat. Use $\alpha$ = 0.005 to draw conclusions.

Source   DF   SS         MS        F
Model    2    134.3333   67.1667   28.79
Error    21   49.0000    2.3333
Total    23   183.3333

ANS: The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least one of the population means is different from the others. The test statistic is F = 28.79. With $df_1$ = 2, $df_2$ = 21, and $\alpha$ = 0.005, the rejection region is: reject $H_0$ if F > $F_{0.005}$ = 6.89. Since F > 6.89, reject the null hypothesis. The sample in this problem suggests that the mean yields for the three varieties are not the same. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

23. Ann Day wishes to determine if the mean price of grocery items is the same for five supermarkets in her city. The same seven food (and brand) items were priced at the five stores. Complete the partial ANOVA table below for testing the null hypothesis of no difference in the true mean price for the five stores in a completely randomized design. Can you reject the null hypothesis? Use $\alpha$ = 0.05.

Source   DF   SS         MS       F
Store    *    *          *        *
Error    *    249.1301   8.3043
Total    *    252.0784

ANS:

Source   DF   SS         MS       F
Store    4    2.9483     0.7371   0.0888
Error    30   249.1301   8.3043
Total    34   252.0784

The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ vs. $H_a$: At least one of the population means is different from the others. The test statistic is F = 0.0888. With $df_1$ = 4, $df_2$ = 30, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 2.69. Since F < 2.69, do not reject the null hypothesis. One cannot conclude that there is a significant difference in the true mean price for the five stores. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5
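Completing a partial one-way ANOVA table needs only Total SS, SSE, the number of treatments k, and the total sample size n. The following Python sketch fills in the remaining entries for the supermarket problem (k = 5 stores, n = 5 × 7 = 35 prices). The function name `complete_one_way_table` is hypothetical.

```python
def complete_one_way_table(total_ss, sse, k, n):
    """Fill in the missing entries of a one-way ANOVA table."""
    sst = total_ss - sse            # treatment SS by subtraction
    df_treat, df_error = k - 1, n - k
    mst = sst / df_treat
    mse = sse / df_error
    return {
        "df": (df_treat, df_error, n - 1),
        "SST": sst,
        "MST": mst,
        "MSE": mse,
        "F": mst / mse,
    }


table = complete_one_way_table(total_ss=252.0784, sse=249.1301, k=5, n=35)
print("df (treatments, error, total):", table["df"])
print(f"SST = {table['SST']:.4f}")
print(f"MST = {table['MST']:.4f}, MSE = {table['MSE']:.4f}")
print(f"F   = {table['F']:.4f}")
```

The treatment sum of squares obtained by subtraction is 252.0784 – 249.1301 = 2.9483, which agrees with MST × 4 = 0.7371 × 4.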

24. Suppose a reading comprehension test is given to random samples of three Grade 7 students from each of five different schools. The samples are chosen so that each school provides a student categorized as low IQ, average IQ, and high IQ. Perform an analysis using the data below, treating the IQs as blocks. Is there sufficient evidence to reject the null hypothesis of no difference between the true mean test scores of the five schools? Draw conclusions using $\alpha$ = 0.05.

           School
IQ         A    B    C    D    E
Low        45   65   85   30   47
Average    67   90   90   70   50
High       95   75   95   99   52

ANS: Using statistical software, the following randomized block design ANOVA table is obtained.

Source of Variation   df   SS        MS        F         P-value   F crit
School                4    2615.33   653.833   2.76482   0.10309   3.83785
IQ                    2    2144.13   1072.07   4.53337   0.04828   4.45897
Error                 8    1891.87   236.483
Total                 14   6651.33

The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ vs. $H_a$: At least one of the population means is different from the others. The test statistic is F = 2.765. With $df_1$ = 4, $df_2$ = 8, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 3.84. Since F < 3.84, do not reject the null hypothesis. One cannot conclude that there is a significant difference between the true mean test scores of the five schools.

PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5
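The answer to Question 24 relies on statistical software. As a rough cross-check, the sums of squares for a randomized block design can be reproduced with the usual shortcut formulas. This is a minimal sketch; the function name `randomized_block_anova` and the layout of `scores` (one row per IQ block, one column per school) are assumptions made for illustration.

```python
def randomized_block_anova(data):
    """Randomized block ANOVA; data[i][j] is block i, treatment j."""
    b, k = len(data), len(data[0])
    n = b * k
    cm = sum(sum(row) for row in data) ** 2 / n            # correction for the mean
    total_ss = sum(x * x for row in data for x in row) - cm
    treat_totals = [sum(row[j] for row in data) for j in range(k)]
    block_totals = [sum(row) for row in data]
    sst = sum(t * t for t in treat_totals) / b - cm        # treatments (schools)
    ssb = sum(t * t for t in block_totals) / k - cm        # blocks (IQ levels)
    sse = total_ss - sst - ssb
    mse = sse / ((b - 1) * (k - 1))
    return {
        "SST": sst, "SSB": ssb, "SSE": sse, "TotalSS": total_ss,
        "F_treat": (sst / (k - 1)) / mse,
        "F_block": (ssb / (b - 1)) / mse,
    }


# Rows: Low, Average, High IQ; columns: schools A to E.
scores = [
    [45, 65, 85, 30, 47],
    [67, 90, 90, 70, 50],
    [95, 75, 95, 99, 52],
]
result = randomized_block_anova(scores)
for name, value in result.items():
    print(name, round(value, 3))
```

The P-value and F crit columns of the printout require the F distribution and are not reproduced here.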

25. A travel agency primarily reserves flights with four major airlines. The agency would like to know if the true mean price for the four airlines is the same. Below are the prices for flights leaving from the same city travelling to five different destinations.

           Destination
Airline    V     W     X     Y     Z
A          350   450   300   800   650
B          300   400   325   750   650
C          325   495   325   775   625
D          375   450   375   825   675

Test the null hypothesis of no difference in the true mean price for the four airlines in a completely randomized design. Use $\alpha$ = 0.05. ANS: Using statistical software, the following completely randomized design ANOVA table is obtained.

Source of Variation   df   SS        MS          F         P-value   F crit
Model                 3    7,610     2,536.667   0.06191   0.97915   3.2389
Error                 16   655,620   40,976.25
Total                 19   663,230

The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: At least one of the population means is different from the others. The test statistic is F = 0.0619. With $df_1$ = 3, $df_2$ = 16, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 3.24. Since F < 3.24, do not reject the null hypothesis. One cannot conclude that there is a significant difference between the true mean prices for the four airlines.

PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

26. Due to his high blood pressure, Sam watches the sodium content of the foods that he eats. Five samples for each of four brands of canned turkey (97% fat free) were tested for sodium content, measured in milligrams of sodium per 60 gram serving.

Brand 1   250   251   260   255   245
Brand 2   175   185   175   180   165
Brand 3   175   180   180   170   190
Brand 4   200   210   210   195   205

The following summary table and ANOVA were generated by statistical software as shown below:

Summary Table
Groups    Count   Sum    Average   Variance
Brand 1   5       1261   252.2     31.7
Brand 2   5       880    176       55
Brand 3   5       895    179       55
Brand 4   5       1020   204       42.5

ANOVA Table
Source of Variation   df   SS         MS        F         P-value   F crit
Brand                 3    18,632.4   6,210.8   134.871   0.000     3.23887
Error                 16   736.8      46.05
Total                 19   19,369.2

Use the p-value approach to test whether there is a significant difference in mean amount of sodium in the four brands. Let $\alpha$ = 0.05. ANS: The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: At least one of the population means is different from the others. Since the p-value = 0.000 < 0.05, the results are significant, i.e., at least one of the brands of canned turkey has a significantly different mean sodium content from the others.

PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

Lifetime of Brake Shoes Narrative An automobile parts store was interested in comparing the mean life length of three brands of automobile brake shoes. The following data represent the life length, measured in 1000s of kilometres, of random samples of six sets of brake shoes of each brand:

Brakes 1   43   44   57   41   47   54
Brakes 2   51   65   62   67   58   57
Brakes 3   34   45   39   37   48   38

27. Refer to Lifetime of Brake Shoes Narrative. State the null and alternative hypotheses to test whether there is a significant difference in mean life length among the three brands of brake shoes. Let $\alpha$ = 0.05. ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least one of the population means is different from the others.

PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

28. Refer to Lifetime of Brake Shoes Narrative. Use statistical software to produce the ANOVA Table. ANS:

ANOVA Table
Source of Variation   df   SS         MS         F          P-value   F crit
Between Groups        2    1203.444   601.7222   17.41878   0.00012   3.68232
Within Groups         15   518.1667   34.54444
Total                 17   1721.611

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

29. Refer to Lifetime of Brake Shoes Narrative. Perform the test using the p-value approach. ANS: Since the p-value = 0.00012 < 0.05, the results are significant, i.e., at least one of the brake shoes has a significantly different mean life length. PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

30. The following statistics were calculated based on samples drawn from three normal populations:

            Treatment
Statistic   1    2    3
n           10   10   10
$\bar{x}$   95   86   92
s           10   12   15

Set up the ANOVA table and test at the 5% level of significance to determine whether differences exist among the population means. ANS:

Source of Variation   SS     df   MS        F       F critical
Treatments            420    2    210       1.343   3.35
Error                 4221   27   156.333
Total                 4641   29

$H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least two means differ. Conclusion: Since F = 1.343 < 3.35, don’t reject the null hypothesis. There is insufficient evidence that differences exist among the population means.

PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5
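Question 30 gives only summary statistics, so the sums of squares come from the sample sizes, means, and standard deviations rather than from raw data. The sketch below illustrates that calculation; the function name `anova_from_summary` is hypothetical.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA quantities from per-treatment n, mean, and s."""
    n, k = sum(ns), len(ns)
    grand_mean = sum(ni * m for ni, m in zip(ns, means)) / n
    # Treatment SS: weighted squared deviations of the sample means.
    sst = sum(ni * (m - grand_mean) ** 2 for ni, m in zip(ns, means))
    # Error SS: pooled within-sample variation, (n_i - 1) * s_i^2.
    sse = sum((ni - 1) * s ** 2 for ni, s in zip(ns, sds))
    mst, mse = sst / (k - 1), sse / (n - k)
    return sst, sse, mst, mse, mst / mse


sst, sse, mst, mse, f_stat = anova_from_summary(
    ns=[10, 10, 10], means=[95, 86, 92], sds=[10, 12, 15]
)
print(sst, sse, mst, round(mse, 3), round(f_stat, 3))
```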

31. Fill in the blanks (identified by asterisks) in the following partial ANOVA table:

Source of Variation   SS     df   MS    F
Treatments            *      *    195   *
Error                 625    *    *
Total                 1600   25

ANS:

Source of Variation   SS     df   MS      F
Treatments            975    5    195     6.24
Error                 625    20   31.25
Total                 1600   25

PTS: 1 REF: 471 BLM: Higher Order - Analyze

TOP: 1–5

Laptop Battery Charge Time Narrative A computer laboratory manager must choose between three brands of battery packs for use in a laboratory of laptop computers. A major concern is the time, in hours, the battery packs will function before needing to be recharged. The manager obtains a random sample of eight observations for each of the three brands and records the following information:

                             Battery Pack 1   Battery Pack 2   Battery Pack 3
                             6.9              6.4              6.9
                             7.2              6.9              6.3
                             7.8              6.1              7.8
                             6.3              7.2              7.2
                             6.2              7.4              7.8
                             6.5              6.1              7.3
                             7.0              5.9              7.2
                             7.4              6.3              7.5
Total ($T_i$)                55.3             52.3             58.0
Mean ($\bar{x}_i$)           6.9125           6.5375           7.2500
Standard Deviation ($s_i$)   0.5566           0.5579           0.4928

$\sum x_{ij}^2$ = 1150.72, and $\sum x_{ij}$ = 165.6

32. Refer to Laptop Battery Charge Time Narrative. State the null and alternative hypotheses to test whether there is a significant difference in mean functioning time before needing recharging among the three brands of battery packs. ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least one of the population means is different from the others.

PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

33. Refer to Laptop Battery Charge Time Narrative. Calculate the test statistic. ANS: The correction for the mean is CM = $(\sum x_{ij})^2/n = (165.6)^2/24$ = 1142.64. Then, Total SS = $\sum x_{ij}^2$ – CM = 1150.72 – 1142.64 = 8.08. SST = $\sum T_i^2/n_i$ – CM = 1144.6725 – 1142.64 = 2.0325. MST = SST/(k – 1) = 2.0325/2 = 1.01625. SSE = Total SS – SST = 8.08 – 2.0325 = 6.0475. MSE = SSE/(n – k) = 6.0475/21 = 0.2880. The value of the test statistic is F = MST/MSE = 1.01625/0.288 = 3.529. PTS: 1 REF: 471 | 473-474 BLM: Higher Order - Analyze

TOP: 1–5
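The shortcut calculation in Question 33 can be traced step by step from the totals given in the narrative. The following is a minimal sketch of that arithmetic; the variable names are illustrative and not part of the test bank.

```python
# Totals for the three battery packs and the overall sum of squares,
# as given in the Laptop Battery Charge Time Narrative.
totals = [55.3, 52.3, 58.0]
n_per_pack = 8
sum_of_squares = 1150.72

k = len(totals)
n = k * n_per_pack

cm = sum(totals) ** 2 / n                            # correction for the mean
total_ss = sum_of_squares - cm                       # Total SS
sst = sum(t ** 2 for t in totals) / n_per_pack - cm  # treatment SS
sse = total_ss - sst                                 # error SS by subtraction
mst = sst / (k - 1)
mse = sse / (n - k)
f_stat = mst / mse

print(round(cm, 4), round(total_ss, 4), round(sst, 4), round(sse, 4))
print(round(mst, 5), round(mse, 4), round(f_stat, 3))
```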

34. Refer to Laptop Battery Charge Time Narrative. Set up the rejection region for $\alpha$ = 0.05.

ANS: With $df_1$ = 2, $df_2$ = 21, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 3.47.

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

35. Refer to Laptop Battery Charge Time Narrative. What conclusion can be drawn? Provide a reason for your answer. ANS: Since F = 3.529 > 3.47, reject $H_0$, and conclude that there is a significant difference in mean functioning time before needing recharging among the three brands of battery packs. PTS: 1 REF: 473-475 BLM: Higher Order - Evaluate

TOP: 1–5

36. Refer to Laptop Battery Charge Time Narrative. Find and interpret the approximate p-value. ANS: p-value = P(F $\geq$ 3.529). For degrees of freedom 2 and 21, P(F $\geq$ 3.47) = 0.05 and P(F $\geq$ 4.42) = 0.025. Thus 0.025 < p-value < 0.05. This would indicate the results are significant (i.e., at least one of the mean functioning times until recharging is needed is significantly different) at the 0.05 level of significance. PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

37. Refer to Laptop Battery Charge Time Narrative. Develop and interpret a 90% confidence interval for $\mu_2 - \mu_3$.

ANS: $\bar{x}_2$ = 52.3/8 = 6.5375, and $\bar{x}_3$ = 58/8 = 7.25. The 90% confidence interval for $\mu_2 - \mu_3$ is $(\bar{x}_2 - \bar{x}_3) \pm t_{0.05}\sqrt{MSE(1/n_2 + 1/n_3)}$ = (6.5375 – 7.25) $\pm$ 1.721 (0.2683) = –0.7125 $\pm$ 0.4617, or between –1.1742 and –0.2508. Since 0 is not within the limits of the confidence interval, one can conclude that there is a significant difference in mean functioning time until needing to be recharged between battery pack 2 and battery pack 3. PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5
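The interval in Question 37 uses MSE from the ANOVA as the pooled variance estimate. A small sketch of the calculation follows; the helper name `difference_ci` is hypothetical, and the critical value 1.721 (t with 21 degrees of freedom, upper-tail area 0.05) is taken from the answer rather than computed.

```python
import math


def difference_ci(mean_a, mean_b, n_a, n_b, mse, t_crit):
    """Confidence interval for mu_a - mu_b using MSE as the variance estimate."""
    half_width = t_crit * math.sqrt(mse * (1 / n_a + 1 / n_b))
    diff = mean_a - mean_b
    return diff - half_width, diff + half_width


# Battery pack 2 versus battery pack 3, eight observations each.
low, high = difference_ci(52.3 / 8, 58.0 / 8, 8, 8, mse=0.2880, t_crit=1.721)
print(round(low, 4), round(high, 4))

# The interval excludes 0 when both limits have the same sign.
print("excludes zero:", low > 0 or high < 0)
```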

38. Refer to Laptop Battery Charge Time Narrative. Set up the ANOVA Table. ANS:

Source of Variation   df   SS       MS        F
Treatments            2    2.0325   1.01625   3.529
Error                 21   6.0475   0.2880
Total                 23   8.08

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

Healthy Dog Food Narrative A premium dog food manufacturer claims that, since its food is so highly nutritious, you won’t need to feed your dog as much as with other dog foods. The premium dog food was compared to three other brands with respect to how much a dog needed to eat, in kilograms per week, to maintain its current weight and for the dog to remain healthy. A random sample of 24 similar dogs was selected and divided into 4 groups (i.e., 6 dogs were randomly assigned to each dog food brand). The following partial analysis table was computed:

Analysis of Variance for Dog Food Brands
Source   df   SS       MS   F
Brand    *    1403.4   *    *
Error    *    *        *
Total         1821.6

39. Refer to Healthy Dog Food Narrative. Complete the ANOVA table. ANS:

Source   df   SS       MS      F
Brand    3    1403.4   467.8   22.372
Error    20   418.2    20.91
Total    23   1821.6

PTS: 1 REF: 471 BLM: Higher Order - Analyze

TOP: 1–5

40. Refer to Healthy Dog Food Narrative. State the null and alternative hypotheses to test whether there is a significant difference in the average amount of dog food needed to maintain a dog’s current weight and for the dog to remain healthy. ANS: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: At least one of the population means is different from the others.

PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

41. Refer to Healthy Dog Food Narrative. What is the value of the test statistic? ANS: F = 22.372 PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Apply

TOP: 1–5

42. Refer to Healthy Dog Food Narrative. Set up the rejection region using $\alpha$ = 0.05.

ANS: With $df_1$ = 3, $df_2$ = 20, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 3.10.

PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

43. Refer to Healthy Dog Food Narrative. What conclusion can be drawn? Explain. ANS: Since F = 22.372 > 3.10, reject $H_0$ and conclude that there is a significant difference in the average amount of dog food needed to maintain a dog’s current weight and for the dog to remain healthy. PTS: 1 REF: 473-475 BLM: Higher Order - Evaluate

TOP: 1–5

Absenteeism Narrative The data shown below were collected using a completely randomized design. The data values represent the number of days absent from work for three independent samples of workers.

Sample 1   4   3   5   4   3
Sample 2   5   4   6   3   6
Sample 3   3   1   3   2

44. Refer to Absenteeism Narrative. Calculate CM and Total SS. ANS: $\sum x_{ij}^2$ = 220, $\sum x_{ij}$ = 52, $T_1$ = 19, $T_2$ = 24, $T_3$ = 9, and n = 14. Then, CM = $(\sum x_{ij})^2/n = (52)^2/14$ = 193.1429. Total SS = $\sum x_{ij}^2$ – CM = 220 – 193.1429 = 26.8571.

PTS: 1 REF: 471 BLM: Higher Order - Analyze

TOP: 1–5

45. Refer to Absenteeism Narrative. Calculate SST and MST. ANS: SST = $\sum T_i^2/n_i$ – CM = 207.65 – 193.1429 = 14.5071, and MST = SST/(k – 1) = 14.5071/2 = 7.25355.

PTS: 1 REF: 471 BLM: Higher Order - Analyze

TOP: 1–5

46. Refer to Absenteeism Narrative. Calculate SSE and MSE. ANS: SSE = Total SS – SST = 26.8571 – 14.5071 = 12.35, and MSE = SSE/(n – k) = 12.35/11 = 1.1227. PTS: 1 REF: 471 BLM: Higher Order - Analyze

TOP: 1–5

47. Refer to Absenteeism Narrative. Construct an ANOVA table for the data. ANS:

Source of Variation   df   SS        MS        F
Treatments            2    14.5071   7.25355   6.4608
Error                 11   12.35     1.1227
Total                 13   26.8571

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5

48. Refer to Absenteeism Narrative. State the null and alternative hypotheses for an analysis of variance F test. ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least one pair of means is different.

PTS: 1 REF: 473-474 BLM: Higher Order - Understand

TOP: 1–5

49. Refer to Absenteeism Narrative. Use the p-value approach to determine whether there is a difference in the three population means. ANS: The p-value for the test statistic F = 6.46 is based on an F distribution with 2 and 11 degrees of freedom. Since P(F > 5.25) = 0.025 and P(F > 7.21) = 0.01, and since the observed value F = 6.46 falls between 5.25 and 7.21, 0.01 < p-value < 0.025 and $H_0$ is rejected at the 5% level of significance. There is a difference among the means. PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

50. Refer to Absenteeism Narrative. Do the data provide sufficient evidence to indicate a difference between $\mu_2$ and $\mu_3$? Justify your conclusion. Test using the two-sample independent t test with $\alpha$ = 0.05.

ANS: The hypothesis to be tested is $H_0: \mu_2 - \mu_3 = 0$ vs. $H_a: \mu_2 - \mu_3 \neq 0$. The test statistic is $t = (\bar{x}_2 - \bar{x}_3)/\sqrt{MSE(1/n_2 + 1/n_3)} = (4.8 - 2.25)/\sqrt{1.1227(1/5 + 1/4)} = 3.59$. Notice that the best estimator of $\sigma^2$ is $s^2$ = MSE, which is used in the calculation. The rejection region with $\alpha$ = 0.05 and 11 degrees of freedom is | t | > $t_{0.025}$ = 2.201, and the null hypothesis is rejected. We conclude that there is a difference between the means. PTS: 1 REF: 414-415 | 722 BLM: Higher Order - Evaluate

TOP: 1–5
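Because Question 50 uses MSE in place of a two-sample pooled variance, the t statistic for any pair of samples can be computed from the ANOVA quantities. The sketch below evaluates all three pairs for the Absenteeism data; the critical value 2.201 is the one quoted in the answer, and treating the question as a comparison of samples 2 and 3 is an assumption, since that is the only pair whose statistic exceeds it.

```python
import math
from itertools import combinations

# Days absent for the three independent samples (Absenteeism Narrative).
samples = {1: [4, 3, 5, 4, 3], 2: [5, 4, 6, 3, 6], 3: [3, 1, 3, 2]}
T_CRIT = 2.201  # t with 11 df, upper-tail area 0.025, as quoted in the answer


def mean(xs):
    return sum(xs) / len(xs)


n = sum(len(xs) for xs in samples.values())
k = len(samples)
# SSE is the pooled within-sample sum of squared deviations.
sse = sum((x - mean(xs)) ** 2 for xs in samples.values() for x in xs)
mse = sse / (n - k)

t_stats = {}
for a, b in combinations(sorted(samples), 2):
    se = math.sqrt(mse * (1 / len(samples[a]) + 1 / len(samples[b])))
    t_stats[(a, b)] = (mean(samples[a]) - mean(samples[b])) / se

for pair, t in t_stats.items():
    print(pair, round(t, 3), "reject H0" if abs(t) > T_CRIT else "do not reject H0")
```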

51. Refer to Absenteeism Narrative. Find a 90% confidence interval for

ANS: The 90% confidence interval for

or PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5

52. Refer to Absenteeism Narrative. Find a 90% confidence interval for the difference ( ANS:

The 90% confidence interval for

is =

PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze Water Samples Narrative

= or

. TOP: 1–5

Water samples were taken at four different locations in a river to determine whether the quantity of dissolved oxygen, a measure of water pollution, varied from one location to another. Locations 1 and 2 were selected above an industrial plant, one near the shore and the other in midstream; location 3 was adjacent to the industrial water discharge for the plant; and location 4 was slightly downriver in midstream. Five water specimens were randomly selected at each location, but one specimen, corresponding to location 4, was lost in the laboratory. The data and an ANOVA computer printout are provided here (the greater the pollution, the lower the dissolved oxygen readings).

Location   Mean Dissolved Oxygen Content
1          5.9   6.1   6.3   6.1   6.0
2          6.3   6.6   6.4   6.4   6.5
3          4.8   4.3   5.0   4.7   5.1
4          6.0   6.2   6.1   5.8

Summary Table
Groups       Count   Sum    Average   Variance
Location 1   5       30.4   6.08      0.022
Location 2   5       32.2   6.44      0.013
Location 3   5       23.9   4.78      0.097
Location 4   4       24.1   6.025     0.0292

ANOVA Table
Source of Variation   df   SS       MS       F         P-value   F crit
Location              3    7.8361   2.6120   63.6562   0         3.2874
Error                 15   0.6155   0.0410
Total                 18   8.4516

53. Refer to Water Samples Narrative. Do the data provide sufficient evidence to indicate a difference in the mean dissolved oxygen contents for the four locations? Justify your conclusion. ANS: The design is completely randomized with four treatments. The analysis of variance table is given on the computer printout above, and the test statistic is F = MST/MSE = 63.6562 with p-value = 0.000. $H_0$ is rejected and the results are declared highly significant. There is a significant difference in the mean dissolved oxygen content for the four locations. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

54. Refer to Water Samples Narrative. Compare the mean dissolved oxygen content in midstream above the plant with the mean content adjacent to the plant (location 2 versus location 3). Use a 95% confidence interval. ANS: The 95% confidence interval for $\mu_2 - \mu_3$ is $(\bar{x}_2 - \bar{x}_3) \pm t_{0.025}\sqrt{MSE(1/n_2 + 1/n_3)}$ = 1.66 $\pm$ 2.131$\sqrt{0.0410(1/5 + 1/5)}$ = 1.66 $\pm$ 0.273, or 1.387 < $\mu_2 - \mu_3$ < 1.933.

PTS: 1 REF: 475-476 | 722 BLM: Higher Order - Analyze

TOP: 1–5

55. Physicians depend on laboratory test results when managing medical problems such as diabetes or epilepsy. In a uniformity test for glucose tolerance, three different laboratories were each sent n = 5 identical blood samples from a person who had drunk 50 milligrams (mg) of glucose dissolved in water. The laboratory results (in mg/dL) are listed here:

Lab 1   121.3   111.9   110.1   105.4   101.6
Lab 2   99.5    113.2   108.9   109.1   100.4
Lab 3   104.2   109.7   102.3   111.2   106.6

Do the data indicate a difference in the average readings for the three laboratories? Give reasons for your answer. ANS: The design is completely randomized with three treatments and five replications per treatment. The computer printout below shows the summary table and analysis of variance for this experiment.

Summary Table
Groups   Count   Sum     Average   Variance
Lab 1    5       550.3   110.06    55.753
Lab 2    5       531.2   106.24    36.158
Lab 3    5       534     106.8     13.705

ANOVA Table
Source of Variation   df   SS        MS        F        P-value   F crit
Lab                   2    42.556    21.278    0.6044   0.5622    3.8853
Error                 12   422.464   35.2053
Total                 14   465.02

The analysis of variance F test statistic for $H_0: \mu_1 = \mu_2 = \mu_3$ is F = 0.6044 with p-value = 0.5622. The results are not significant and $H_0$ is not rejected. There is insufficient evidence to indicate a difference in the treatment means. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

56. In a completely randomized design, 7 experimental units were assigned to the first treatment, 13 units to the second treatment, and 10 units to the third treatment. A partial ANOVA table for this experiment is shown below:

Source of Variation   SS   df   MS   F
Treatments            *    *    *    1.50
Error                 *    *    4
Total                 *    *

a. Fill in the blanks (identified by asterisks) in the above ANOVA Table. b. Test at the 5% significance level to determine if differences exist among the three treatment means. ANS: a.

Source of Variation   SS    df   MS   F
Treatments            12    2    6    1.50
Error                 108   27   4
Total                 120   29

b. $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least two means differ. Rejection region: F > $F_{0.05}$ = 3.35 (2 and 27 degrees of freedom). Test statistic: F = 1.50. Conclusion: Don’t reject the null hypothesis. There is insufficient evidence that differences exist among the three treatment means. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

Insurance Company Narrative An insurance company is considering opening a new branch in Lethbridge. The company will choose the final location from two locations within the city. One of the factors in the decision is the annual family income (in thousands of dollars) of five families randomly sampled from a radius of five miles from the potential locations.

Area 1   73   48   46   53   51
Area 2   74   50   81   49   61

57. Refer to Insurance Company Narrative. Perform an equal-variances t test at the 5% significance level to determine whether the population means differ. ANS: $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 \neq 0$. Rejection region: | t | > $t_{0.025}$ = 2.306. Test statistic: t = –1.098. Conclusion: Don’t reject the null hypothesis. No, the population means don’t differ. PTS: 1 REF: 481-482 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

58. Refer to Insurance Company Narrative. Perform an F-test for one-way ANOVA at the 5% level of significance to determine whether the population means differ. ANS:

Source of Variation   SS       df   MS      F
Treatments            193.6    1    193.6   1.206
Error                 1284.8   8    160.6
Total                 1478.4   9

Rejection region: F > 5.32. Test statistic: F = 1.206. Conclusion: Don’t reject the null hypothesis. No, the population means don’t differ. PTS: 1 REF: 471 | 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

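59. Refer to Insurance Company Narrative. What is the mathematical relationship between the observed t and observed F-test statistics from the previous two questions? Does the same relation hold true for the corresponding critical values? Why or why not? ANS: The observed statistics satisfy $F = t^2$, since $(-1.098)^2 = 1.206$. Yes, the same relation holds for the critical values, since $F_{0.05}$ = 5.32 = $(2.306)^2$ = $t_{0.025}^2$. PTS: 1 REF: 481-482 | 473-474 BLM: Higher Order - Analyze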

TOP: 1–5
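The relationship asked about in Question 59 can be verified numerically from the income data. This is a minimal sketch under the equal-variances assumption used in Questions 57 and 58; the variable names are illustrative.

```python
import math

# Annual family income (thousands of dollars), Insurance Company Narrative.
area1 = [73, 48, 46, 53, 51]
area2 = [74, 50, 81, 49, 61]


def mean(xs):
    return sum(xs) / len(xs)


def ss_within(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


n1, n2 = len(area1), len(area2)

# Pooled variance; with two groups this is identical to MSE.
sp2 = (ss_within(area1) + ss_within(area2)) / (n1 + n2 - 2)
t_stat = (mean(area1) - mean(area2)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))

# One-way ANOVA with k = 2 treatments, so MST = SST / 1.
grand = mean(area1 + area2)
sst = n1 * (mean(area1) - grand) ** 2 + n2 * (mean(area2) - grand) ** 2
f_stat = sst / sp2

print(round(t_stat, 3), round(t_stat ** 2, 3), round(f_stat, 3))
```

The two tabled critical values quoted in the answers behave the same way: 2.306 squared is about 5.32.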

60. Refer to Insurance Company Narrative. If we want to determine whether the population mean income for area 2 is higher than that for area 1, can we still use both the t and F tests applied in the previous questions? Explain. ANS: No. We must use the equal-variances t-test of $H_0: \mu_1 - \mu_2 = 0$ vs. $H_a: \mu_1 - \mu_2 < 0$. We cannot use the analysis of variance F-test since this technique allows us to test only for a difference. PTS: 1 REF: 481-482 | 473-474 BLM: Higher Order - Analyze

TOP: 1–5

Wrapping Colour Narrative A firm’s product can be wrapped in any of three colours: red, white, and black. The manager wants to test whether mean monthly sales (in $1000s) are the same, regardless of the colour. Viewing the past five months as a random sample, the manager collected the data shown below.

Sales History (thousands of dollars)
              Wrapping
Observation   Red   White   Black
March         20    20      27
April         21    22      18
May           22    26      22
June          23    31      23
July          24    18      27

61. Refer to Wrapping Colour Narrative. State the appropriate null and alternative hypotheses. ANS: $H_0: \mu_1 = \mu_2 = \mu_3$, i.e., the product’s mean monthly sales are the same regardless of which of the three colours is used for wrapping. $H_a$: At least one of the population means differs from the others. PTS: 1 REF: 473-474 BLM: Higher Order - Analyze

TOP: 1–5

62. Refer to Wrapping Colour Narrative. Create the appropriate ANOVA table. ANS: ANOVA Table

PTS: 1 REF: 471 BLM: Higher Order - Apply

TOP: 1–5
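The quantities in the ANOVA table for Question 62 can be rebuilt directly from the sales data in the narrative. The sketch below is one way to do so, assuming a completely randomized design with colour as the treatment; the function name `one_way_anova` is hypothetical.

```python
# Monthly sales (thousands of dollars) by wrapping colour, March to July.
sales = {
    "Red": [20, 21, 22, 23, 24],
    "White": [20, 22, 26, 31, 18],
    "Black": [27, 18, 22, 23, 27],
}


def one_way_anova(groups):
    """Return (df, SS, MS) rows and F for a completely randomized design."""
    values = [x for g in groups for x in g]
    n, k = len(values), len(groups)
    grand = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    sst = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    mst, mse = sst / (k - 1), sse / (n - k)
    return {
        "Treatments": (k - 1, sst, mst),
        "Error": (n - k, sse, mse),
        "Total": (n - 1, sst + sse, None),
        "F": mst / mse,
    }


anova = one_way_anova(list(sales.values()))
for source in ("Treatments", "Error", "Total"):
    print(source, anova[source])
print("F =", round(anova["F"], 3))
```

The resulting F agrees with the value 0.225 used in Question 63.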

63. Refer to Wrapping Colour Narrative. Perform the test at the 1% level of significance. ANS: The computed test statistic F = 0.225 < $F_{0.01}$ = 6.93 (2 and 12 degrees of freedom). In addition, p-value = 0.802 > 0.01, therefore $H_0$ cannot be rejected. There is insufficient evidence that mean monthly sales differ according to which of the three colours is used for wrapping. PTS: 1 REF: 473-475 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

64. Due to his high blood pressure, Sam watches the sodium content of the foods that he eats. Five samples for each of four brands of canned turkey (97% fat free) were tested for sodium content, measured in milligrams of sodium per 60 gram serving.

Brand 1   250   251   260   255   245
Brand 2   175   185   175   180   165
Brand 3   175   180   180   170   190
Brand 4   200   210   210   195   205

The following summary table and ANOVA were generated by statistical software as shown below:

Summary Table
Groups    Count   Sum     Average   Variance
Brand 1   5       1,261   252.2     31.7
Brand 2   5       880     176       55
Brand 3   5       895     179       55
Brand 4   5       1,020   204       42.5

ANOVA Table
Source of Variation   df   SS         MS        F         P-value   F crit
Brand                 3    18,632.4   6,210.8   134.871   0.000     3.23887
Error                 16   736.8      46.05
Total                 19   19,369.2

Use the following output generated by statistical software to determine which of the means are different.

Tukey’s pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.0113
Critical value = 4.05
Intervals for (column level mean) – (row level mean)

      1                 2                   3
2     (63.91, 88.49)
3     (60.91, 85.49)    (–15.29, 9.29)
4     (35.91, 60.49)    (–40.29, –15.71)    (–37.29, –12.71)

ANS: Recall that if 0 is within the limits of the given interval for the two brands, one cannot conclude a significant difference in the means for the two brands. If the given interval is entirely positive or negative, one can conclude that there is a significant difference in the means for the two brands. Thus, one can conclude: Mean of Brand 1 is significantly different from the mean of Brand 2, Brand 3, and Brand 4. Mean of Brand 2 is significantly different from the mean of Brand 4, but not from the mean of Brand 3. Mean of Brand 3 is significantly different from the mean of Brand 4. PTS: 1 REF: 482-484 BLM: Higher Order - Evaluate

TOP: 6
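The intervals in the Tukey output for Question 64 follow from the brand means, MSE, and the printed critical value. The sketch below reconstructs them under the assumption that the yardstick is the critical value times the square root of MSE divided by the common sample size; the critical value 4.05 is read from the output, not computed.

```python
import math
from itertools import combinations

# Brand means, MSE, and sample size from the summary and ANOVA tables.
means = {1: 252.2, 2: 176.0, 3: 179.0, 4: 204.0}
MSE = 46.05
N_PER_BRAND = 5
Q_CRIT = 4.05  # critical value printed in the Tukey output

omega = Q_CRIT * math.sqrt(MSE / N_PER_BRAND)  # Tukey's yardstick

intervals = {}
for col, row in combinations(sorted(means), 2):
    diff = means[col] - means[row]             # (column mean) - (row mean)
    intervals[(col, row)] = (diff - omega, diff + omega)

# A pair differs significantly when its interval excludes 0.
different = [pair for pair, (lo, hi) in intervals.items() if lo > 0 or hi < 0]

for pair, (lo, hi) in intervals.items():
    print(pair, round(lo, 2), round(hi, 2))
print("significantly different pairs:", different)
```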

65. An automobile parts store was interested in comparing the mean life length of three brands of automobile brake shoes. The following data represent the life length, measured in thousands of kilometres, of random samples of six sets of brake shoes of each brand:

Brakes 1   43   44   57   41   47   54
Brakes 2   51   65   62   67   58   57
Brakes 3   34   45   39   37   48   38

Use the following output from statistical software to determine which of the means differ at the 0.05 level of significance.

Tukey’s pairwise comparisons
Family error rate = 0.0500
Individual error rate = 0.0203
Critical value = 3.67
Intervals for (column level mean) – (row level mean)

      1                    2
2     (–21.139, –3.527)
3     (–1.306, 16.306)     (11.027, 28.639)

ANS: Since the interval for comparing means 1 and 2 is entirely negative, one can conclude there is a significant difference in the mean life length between brake shoes of brands 1 and 2. Since the interval for comparing means 2 and 3 is entirely positive, one can conclude there is a significant difference in the mean life length between brake shoes of brands 2 and 3. Since the interval for comparing means 1 and 3 includes the value 0, one cannot conclude there is a significant difference in the mean life length between brake shoes of brands 1 and 3. PTS: 1 REF: 482-484 BLM: Higher Order - Evaluate

TOP: 6

66. A computer laboratory manager must choose between three brands of battery packs for use in a laboratory of laptop computers. A major concern is the time, in hours, the battery packs will function before needing to be recharged. The manager obtains a random sample of eight observations for each of the three brands and records the following information:

                             Battery Pack 1   Battery Pack 2   Battery Pack 3
                             6.9              6.4              6.9
                             7.2              6.9              6.3
                             7.8              6.1              7.8
                             6.3              7.2              7.2
                             6.2              7.4              7.8
                             6.5              6.1              7.3
                             7.0              5.9              7.2
                             7.4              6.3              7.5
Total ($T_i$)                55.3             52.3             58.0
Mean ($\bar{x}_i$)           6.9125           6.5375           7.2500
Standard Deviation ($s_i$)   0.5566           0.5579           0.4928

$\sum x_{ij}^2$ = 1150.72, and $\sum x_{ij}$ = 165.6. Use Tukey’s method of comparison to determine which of the three battery pack means differ from the others. Let $\alpha$ = 0.05. ANS: $q_{0.05}(3, 21)$ = 3.56, so Tukey’s yardstick for comparing treatment means is $\omega = q_{0.05}(3, 21)\sqrt{MSE/n_t} = 3.56\sqrt{0.2880/8}$ = 0.6755. Compare means: $\bar{x}_3 - \bar{x}_2$ = 7.2500 – 6.5375 = 0.7125 > $\omega$, $\bar{x}_3 - \bar{x}_1$ = 7.2500 – 6.9125 = 0.3375, and $\bar{x}_1 - \bar{x}_2$ = 6.9125 – 6.5375 = 0.3750. Therefore, one can conclude that Battery Pack means 2 and 3 are significantly different. Other pairs of means are not significantly different. PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Evaluate

TOP: 6

Treatment Observation Narrative An independent random sampling design was used to compare the means of six treatments based on samples of four observations per treatment. The pooled estimator of $\sigma^2$ is 9.42, and the sample means follow: $\bar{x}_1$ = 102.1, $\bar{x}_2$ = 98.9, $\bar{x}_3$ = 112.8, $\bar{x}_4$ = 93.4, $\bar{x}_5$ = 104.7, and $\bar{x}_6$ = 114.3.

67. Refer to Treatment Observation Narrative. Give the value of $\omega$ that you would use to make pairwise comparisons of the treatment means for $\alpha$ = 0.05.

ANS:

PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Analyze

TOP: 6

68. Physicians depend on laboratory test results when managing medical problems such as diabetes or epilepsy. In a uniformity test for glucose tolerance, three different laboratories were each sent n = 5 identical blood samples from a person who had drunk 50 milligrams (mg) of glucose dissolved in water. The laboratory results (in mg/dL) are listed here:

Lab 1   121.3   111.9   110.1   105.4   101.6
Lab 2   99.5    113.2   108.9   109.1   100.4
Lab 3   104.2   109.7   102.3   111.2   106.6

Use Tukey’s method for paired comparisons to rank the three treatment means. Use $\alpha$ = 0.05. ANS:

Since the treatment means are not significantly different, there is no need to use Tukey’s test to search for the pairwise differences. Notice that all three intervals generated by statistical software contain 0, indicating that the pairs cannot be judged different. PTS: 1 REF: 482-484 BLM: Higher Order - Evaluate

TOP: 6

69. The data that follow are observations collected from an experiment that compared four treatments, A, B, C, and D, within each of three blocks, using a randomized block design.

        Treatment
Block   A    B    C    D
1       7    11   9    10
2       5    10   6    8
3       13   16   15   15

Rank the four treatment means using Tukey’s method of paired comparisons with $\alpha$ = 0.01.

ANS:

With k = 4, df = 6, and $n_t$ = 3, Tukey’s yardstick $\omega$ is based on $q_{0.01}(4, 6)$. The ranked means are shown below.

A      C       D       B
8.33   10.00   11.00   12.33

A line under two or more means indicates a difference less than $\omega$ and hence no differences between that group of means.

PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Evaluate

TOP: 6

70. A study was conducted to compare fuel consumption of medium-size trucks for three brands of gasoline, A, B, and C. Four trucks of the same make and model were used in the experiment, and each gasoline brand was tested in each truck. Using each brand in the same truck has the effect of eliminating (blocking out) truck-to-truck variability. The data (litres per 100 km) are as follows:

                 Automobile
Gasoline Brand   1      2      3      4
A                16.9   18.2   18.5   17.3
B                18.4   19.3   19.1   18.9
C                17.3   18.7   18.0   19.0

Use an appropriate method to identify the pairwise differences, if any, in the average fuel consumptions for the three brands of gasoline. ANS: To determine where the treatment differences lie, use Tukey’s test with yardstick $\omega$. The ranked means are shown below:

A        C       B
17.725   18.25   18.925

Only gasoline brands A and B are significantly different from each other. PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Evaluate

TOP: 6

71. In order to examine the differences in ages of teachers among five school districts, an educational statistician took random samples of six teachers’ ages in each district. The data are listed below.

Ages of Teachers among Five School Districts
District 1   41   53   28   45   40   59
District 2   39   48   41   51   49   50
District 3   36   28   29   33   27   26
District 4   45   37   46   48   51   49
District 5   53   55   49   56   48   61

Use Tukey’s multiple comparison method to determine which means differ. ANS: $\omega$ = 10.717

District i   District j   $|\bar{x}_i - \bar{x}_j|$   Significant?
1            2            2.0                         No
1            3            14.5                        Yes
1            4            1.667                       No
1            5            9.333                       No
2            3            16.50                       Yes
2            4            0.333                       No
2            5            7.333                       No
3            4            16.167                      Yes
3            5            23.833                      Yes
4            5            7.667                       No

It is clear that the mean for District 3 is significantly different from the mean for each of the other four districts. PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Evaluate

TOP: 6
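The comparison table in Question 71 can be generated from the raw ages once the yardstick is known. The sketch below takes $\omega$ = 10.717 from the answer as given (it is not recomputed here) and flags each pair whose absolute difference in sample means exceeds it.

```python
from itertools import combinations

# Teacher ages by school district (six teachers per district).
ages = {
    1: [41, 53, 28, 45, 40, 59],
    2: [39, 48, 41, 51, 49, 50],
    3: [36, 28, 29, 33, 27, 26],
    4: [45, 37, 46, 48, 51, 49],
    5: [53, 55, 49, 56, 48, 61],
}
OMEGA = 10.717  # Tukey's yardstick as quoted in the answer

means = {d: sum(xs) / len(xs) for d, xs in ages.items()}

# Absolute difference in sample means for every pair of districts.
differences = {
    (i, j): abs(means[i] - means[j]) for i, j in combinations(sorted(ages), 2)
}
significant = sorted(pair for pair, d in differences.items() if d > OMEGA)

for pair, d in differences.items():
    print(pair, round(d, 3), "Yes" if d > OMEGA else "No")
```

Every significant pair involves District 3, which matches the conclusion stated in the answer.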

Supermarket Prices Narrative A building contractor employs three construction engineers, A, B, and C to estimate and bid on jobs. To determine whether one tends to be a more conservative (or liberal) estimator than the others, the contractor selects four projected construction jobs and has each estimator independently estimate the cost (in dollars per square foot) of each job. The data are shown in the table:

            Construction Job
Estimator   1       2       3       4
A           35.10   34.50   29.25   31.60
B           37.45   34.60   33.10   34.40
C           36.30   35.10   32.45   32.90

72. Refer to Supermarket Prices Narrative. If any differences in treatment means exist, use an appropriate method to identify specifically where the differences lie. Has blocking been effective in this experiment? Explain. ANS: The treatment means can be further compared using Tukey’s test with yardstick $\omega$. The ranked means are shown below.

A         C         B
32.6125   34.1875   34.8875

A line under two or more means indicates a difference less than $\omega$ and hence no differences between that group of means. Estimators A and B show a significant difference in average costs. PTS: 1 REF: 482-484 | 739-742 BLM: Higher Order - Evaluate

TOP: 6

73. Refer to Supermarket Prices Narrative. Analyze the experiment using the appropriate methods. Identify the blocks and treatments, and investigate any possible differences in treatment means. ANS:

A randomized block design has been used with “estimators” as treatments and “construction job” as the block factor. A summary table and the analysis of variance table are shown in the computer printout below.

Summary Table
              Count   Sum      Average   Variance
Estimator A   4       130.45   32.6125   7.3606
Estimator B   4       139.55   34.8875   3.3606
Estimator C   4       136.75   34.1875   3.3240

Job 1         3       108.85   36.2833   1.3808
Job 2         3       104.20   34.7333   0.1033
Job 3         3       94.80    31.6000   4.2475
Job 4         3       98.90    32.9667   1.9633

ANOVA Table
Source of Variation   df   SS         MS        F         P-value   F crit
Estimators            2    10.8617    5.4308    7.1958    0.0255    5.1432
Jobs                  3    37.6073    12.5358   16.6098   0.0026    4.7571
Error                 6    4.5283     0.7547
Total                 11   52.99729

Both treatments and blocks are significant. PTS: 1 REF: 486-488 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

74. Refer to Supermarket Prices Narrative. Complete the ANOVA table below where the grocery items were treated as blocks. Can you reject the null hypothesis of no difference between the true mean price for the five stores? Use $\alpha$ = 0.05.

Source   df   SS         MS   F
Store    *    2.9483     *    *
Item     *    247.3979   *    *
Error    *    *          *
Total    *    252.0784

ANS:

Source   df   SS         MS       F
Store    4    2.9483     0.7371   10.209
Item     6    247.3979   41.233   571.09
Error    24   1.7322     0.0722
Total    34   252.0784

The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ vs. $H_a$: At least one of the population means is different from the others. The test statistic is F = 10.209. With $df_1$ = 4, $df_2$ = 24, and $\alpha$ = 0.05, the rejection region is: reject $H_0$ if F > $F_{0.05}$ = 2.78. Since F > 2.78, reject the null hypothesis. One can conclude that there is a significant difference in the true mean price for the five stores. PTS: 1 REF: 486-488 | 489-492 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

75. Refer to Supermarket Prices Narrative. Why is it important to treat the food items as blocks? (Hint: The observed value of the test statistic in a completely randomized design is F = 0.089, and $F_{0.05}$ = 2.69.)

ANS: In this particular problem, the stores’ prices were not considered significantly different when the items were not blocked, since F = 0.089 < 2.69. However, when the items were blocked, the stores were considered to have different prices. It is important to treat the food items as blocks to eliminate as much of the variability as possible. PTS: 1 REF: 486 BLM: Higher Order - Analyze

TOP: 7–8
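The effect of blocking in Question 75 can be reproduced from the completed table in Question 74. In a completely randomized design the item-to-item variation is not separated out, so the block sum of squares and degrees of freedom fall into the error term. The sketch below (variable names are hypothetical) pools them and compares the two F ratios.

```python
# Sums of squares and degrees of freedom from the completed table in Question 74.
ss_store, df_store = 2.9483, 4
ss_item, df_item = 247.3979, 6
ss_error, df_error = 1.7322, 24

ms_store = ss_store / df_store

# Randomized block design: item-to-item variation is removed from the error term.
f_blocked = ms_store / (ss_error / df_error)

# Completely randomized design: item variation stays in the error term,
# so the block SS and df are pooled with the error SS and df.
pooled_mse = (ss_item + ss_error) / (df_item + df_error)
f_unblocked = ms_store / pooled_mse

print(f"F with blocking:    {f_blocked:.3f}")
print(f"F without blocking: {f_unblocked:.3f}")
```

The unblocked ratio agrees with the hint's F = 0.089, while the blocked ratio matches the table's F of about 10.2 up to rounding of the printed mean squares.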

Travel Agency Narrative
A travel agency primarily reserves flights with four major airlines. The agency would like to know if the true mean price for the four airlines is the same. Below are the prices for flights leaving from the same city and travelling to five different destinations.

                    Destination
Airline      V      W      X      Y      Z
A           350    450    300    800    650
B           300    400    325    750    650
C           325    495    325    775    625
D           375    450    375    825    675

76. Refer to Travel Agency Narrative. Perform an analysis of variance treating the destinations as blocks. Use α = 0.05.
ANS:
Source of Variation    SS           df    MS            F          P-value    F crit
Airline                  7,610       3      2,536.667     4.12606  0.03168    3.4903
Destination            648,242.5     4    162,060.6     263.603    0.000      3.25916
Error                    7,377.5    12        614.7917
Total                  663,230      19

The null hypothesis can be rejected because the p-value of 0.03168 is less than 0.05. However, the null hypothesis is barely rejected. Had we been testing at the 0.01 significance level, the null hypothesis would not have been rejected. PTS: 1 REF: 486-488 | 489-492 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8
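The sums of squares in Question 76 can be checked directly from the price table. The following is a minimal sketch using the usual randomized block formulas (correction for the mean, treatment totals, block totals); the function name `rbd_anova` and the dictionary keys are hypothetical, and p-values are left out because they need an F distribution routine.

```python
def rbd_anova(table):
    """Sums of squares for a randomized block design.

    table[i][j] is the response for treatment i in block j.
    """
    k, b = len(table), len(table[0])          # treatments, blocks
    n = k * b
    grand_total = sum(sum(row) for row in table)
    cm = grand_total ** 2 / n                 # correction for the mean

    total_ss = sum(x * x for row in table for x in row) - cm
    sst = sum(sum(row) ** 2 for row in table) / b - cm
    ssb = sum(sum(col) ** 2 for col in zip(*table)) / k - cm
    sse = total_ss - sst - ssb

    mst, msb = sst / (k - 1), ssb / (b - 1)
    mse = sse / ((k - 1) * (b - 1))
    return {"SST": sst, "SSB": ssb, "SSE": sse, "Total SS": total_ss,
            "MST": mst, "MSB": msb, "MSE": mse,
            "F treatments": mst / mse, "F blocks": msb / mse}


# Rows are airlines A-D; columns are destinations V-Z (from the narrative).
prices = [
    [350, 450, 300, 800, 650],
    [300, 400, 325, 750, 650],
    [325, 495, 325, 775, 625],
    [375, 450, 375, 825, 675],
]
result = rbd_anova(prices)
for name, value in result.items():
    print(f"{name:>13}: {value:,.4f}")
```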

77. Refer to Travel Agency Narrative. Why is it necessary to treat the destinations as blocks? Justify your answer. ANS: Because there is variability in the prices of flights to different destinations. Our goal was to see if the airlines had different prices, so it was important to block this variable. PTS: 1 REF: 486 BLM: Higher Order - Analyze

TOP: 7–8

78. Refer to Travel Agency Narrative. Develop and interpret a 90% confidence interval for $\mu_A - \mu_C$.
ANS: $\bar{x}_A$ = 2550/5 = 510, and $\bar{x}_C$ = 2545/5 = 509. The 90% confidence interval for $\mu_A - \mu_C$ is $(\bar{x}_A - \bar{x}_C) \pm t_{0.05}\sqrt{MSE(2/b)}$ = (510 – 509) ± 1.782 $\sqrt{614.7917(2/5)}$ = 1 ± 27.945, or between –26.945 and 28.945. Since this interval contains 0, the mean prices for airlines A and C are not significantly different.
PTS: 1 REF: 491-492 | 722 BLM: Higher Order - Analyze

TOP: 7–8

79. Refer to Travel Agency Narrative. Develop and interpret a 90% confidence interval for $\mu_B - \mu_D$.
ANS: $\bar{x}_B$ = 2,425/5 = 485, and $\bar{x}_D$ = 2,700/5 = 540. The 90% confidence interval for $\mu_B - \mu_D$ is $(\bar{x}_B - \bar{x}_D) \pm t_{0.05}\sqrt{MSE(2/b)}$ = (485 – 540) ± 1.782 $\sqrt{614.7917(2/5)}$ = –55 ± 27.945, or between –82.945 and –27.055. Since this interval does not contain 0, the mean prices for airlines B and D are significantly different.
PTS: 1 REF: 491-492 | 722 BLM: Higher Order - Analyze

TOP: 7–8
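The two intervals in Questions 78 and 79 share the same margin of error, since both use the error mean square from Question 76 and the same number of blocks. A short sketch of the calculation follows; the helper name `mean_difference_ci` is hypothetical, and the t value 1.782 is taken from the answers rather than computed.

```python
import math


def mean_difference_ci(mean_i, mean_j, mse, b, t_crit):
    """Interval (mean_i - mean_j) +/- t_crit * sqrt(MSE * 2 / b)."""
    margin = t_crit * math.sqrt(mse * 2 / b)
    diff = mean_i - mean_j
    return diff - margin, diff + margin


MSE = 614.7917      # error mean square from the Question 76 table
B = 5               # destinations (blocks) behind each airline mean
T_CRIT = 1.782      # tabled t value with 12 error df, as used in the answers

ci_ac = mean_difference_ci(2550 / 5, 2545 / 5, MSE, B, T_CRIT)
ci_bd = mean_difference_ci(2425 / 5, 2700 / 5, MSE, B, T_CRIT)

print(f"A vs C: {ci_ac[0]:.3f} to {ci_ac[1]:.3f}")
print(f"B vs D: {ci_bd[0]:.3f} to {ci_bd[1]:.3f}")
```

An interval that straddles zero (A vs C) indicates no significant difference at this confidence level, while one entirely below zero (B vs D) indicates that the first airline's mean price is lower.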

Tool Prices Comparison Narrative
A consumer was interested in determining whether there is a significant difference in the price charged for tools by three hardware stores. The consumer selected five tools and recorded the price for each tool in each store. The following data were recorded:

                            Tools
Store 1     $32.00    $3.95    $1.50    $2.95    $4.00
Store 2     $30.00    $2.95    $1.50    $2.45    $5.00
Store 3     $29.50    $3.50    $1.75    $2.45    $4.50

80. Refer to Tool Prices Comparison Narrative. What is the appropriate experimental design? ANS: Randomized block design with the tools as blocks and the stores as treatments. PTS: 1 REF: 485-487 BLM: Higher Order - Analyze

TOP: 7–8

81. Refer to Tool Prices Comparison Narrative. Use statistical software to develop the ANOVA table.
ANS: Two-way analysis of variance table

Source of Variation    SS          df    MS          F           P-value    F crit
Tools                  1823.348     4    455.8371    958.4799    0          3.8379
Store                     0.9053    2      0.4527      0.9518    0.4258     4.4590
Error                     3.8047    8      0.4756
Total                  1828.058    14

PTS: 1 REF: 487-488 | 725-732 BLM: Higher Order - Apply

TOP: 7–8

82. Refer to Tool Prices Comparison Narrative. Use the p-value approach to determine whether there is a significant difference in the price of tools. What does this tell you about the appropriateness of the experimental design? Explain your reasoning.
ANS: Since the p-value associated with the tools is 0.000 < 0.05, the results are significant, i.e., there is a significant difference in the average price of the tools. This indicates that blocking is necessary in this problem.

PTS: 1 REF: 489-492 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

83. Refer to Tool Prices Comparison Narrative. Use the p-value approach to determine whether there is a significant difference in the prices charged by the three stores at the 0.05 level of significance.
ANS: Since the p-value associated with the stores is 0.4258 > 0.05, the results are not significant, i.e., there is not a significant difference in the average price of the tools among the stores.
PTS: 1 REF: 489-492 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Running Shoes Narrative
An avid runner was interested in whether there is a significant difference in the average wear (measured in weeks of use) among three brands of running shoes. To answer the question, the runner randomly selected six runners and assigned them to wear each of the three brands of running shoes until the shoes wore out. Each of the runners wore the brands of shoes in a random order. After the data had been recorded, the following output was generated using statistical software:

Two-way Analysis of Variance
Source     df    SS        MS       F        P
Brands      2     17.44     8.72     5.99    0.019
Runners     5     86.44    17.29    11.88    0.001
Error      10     14.56     1.46
Total            118.44

84. Refer to Running Shoes Narrative. What are the blocks?
ANS: The runners are the blocks.
PTS: 1 REF: 485-488 BLM: Higher Order - Analyze

TOP: 7–8

85. Refer to Running Shoes Narrative. What are the treatments?
ANS: The brands of running shoes are the treatments.
PTS: 1 REF: 485-488 BLM: Higher Order - Analyze

TOP: 7–8

86. Refer to Running Shoes Narrative. Is blocking necessary in this problem? Justify your answer. Let α = 0.05.
ANS: Yes. The p-value for the runners is 0.001 < 0.05, indicating significant results; i.e., there is a significant difference in the means of the runners, indicating blocking is necessary.
PTS: 1 REF: 489-492 | 725-732 BLM: Higher Order - Analyze

TOP: 7–8

87. Refer to Running Shoes Narrative. Use the p-value approach to determine whether there is a significant difference in the average wear between the three brands of running shoes. Let α = 0.05.
ANS: The p-value for the brands of running shoes is 0.019 < 0.05, indicating significant results; i.e., at least one of the brands of running shoes has a significantly different average wear than the others.
PTS: 1 REF: 489-492 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Incentive Pay Narrative
A company conducted an experiment to determine the effect of two types of incentive pay plans on worker productivity for workers of two shifts. The company used an equal number of production workers from each of the two shifts and one-half of these workers were assigned to each plan. Then five workers from each pay plan–shift combination were selected and their productivity (in number of items produced) recorded for a one-week period. The following output was generated using statistical software:

Two-way analysis of variance
Source         df    SS          MS          F         P
Plan            1        26.5        26.5      0.39    0.544
Shift           1    12,852.5    12,852.5    187.22    0.000
Interaction     1     2,531.2     2,531.2     36.87    0.000
Error          16     1,098.4        68.7
Total          19    16,508.6

88. Refer to Incentive Pay Narrative. Is there significant interaction present in this problem? Justify your answer. (Let α = 0.05.)
ANS: Yes, since the p-value for interaction is 0.000 < 0.05, significant interaction is present.

PTS: 1 REF: 498-500 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

89. Refer to Incentive Pay Narrative. Based on your answer to the previous question, is testing for the main effects, plan and shift, appropriate? Give reasons for your answer.
ANS: No. Since significant interaction is present, the differences in plan means and shift means should be examined by comparing the factor-level combinations (as opposed to comparing the factor means individually).
PTS: 1 REF: 498-500 | 502 BLM: Higher Order - Evaluate

TOP: 7–8

90. The data that follow are observations collected from an experiment that compared four treatments, A, B, C, and D, within each of three blocks, using a randomized block design.

                  Block
Treatment     1     2     3
A             7     5    13
B            11    10    16
C             9     6    15
D            10     8    15

a. Do the data present sufficient evidence to indicate differences among the treatment means? Test using α = 0.05.
b. Do the data present sufficient evidence to indicate differences among the block means? Test using α = 0.05.
c. Find a 95% confidence interval for the difference in means for treatments A and B.
d. Does it appear that the use of a randomized block design for this experiment was justified? Explain.
ANS:
a. This is a randomized block design with four treatments and three blocks. The computer printout below shows the summary table and analysis of variance for this experiment.

Summary
               Count    Sum    Average    Variance
Block 1          4      37      9.25       2.9167
Block 2          4      29      7.25       4.9167
Block 3          4      59     14.75       1.5833
Treatment A      3      25      8.3333    17.3333
Treatment B      3      37     12.3333    10.3333
Treatment C      3      30     10         21
Treatment D      3      33     11         13

ANOVA Table
Source of Variation    SS          df    MS         F          P-value    F crit
Treatments              25.5833     3     8.5278     19.1875   0.0018     4.7571
Blocks                 120.6667     2    60.3333    135.75     0          5.1432
Error                    2.6667     6     0.4444
Total                  148.9167    11

To test the difference among treatment means, the test statistic is F = MST/MSE = 19.1875, and the rejection region with α = 0.05 and 3 and 6 degrees of freedom is F > 4.76. Also notice that the p-value = 0.0018 < α = 0.05. There is a significant difference among the treatment means.
b. To test the difference among block means, the test statistic is F = MSB/MSE = 135.75, and the rejection region with α = 0.05 and 2 and 6 df is F > 5.14. Also notice that the p-value = 0.00 < α. There is a significant difference among the block means.
c. The 95% confidence interval is $(\bar{x}_A - \bar{x}_B) \pm t_{0.025}\sqrt{MSE(2/b)}$ = (8.333 – 12.333) ± 2.447 $\sqrt{0.4444(2/3)}$ = –4 ± 1.332, or –5.332 < $\mu_A - \mu_B$ < –2.668.
d. Since there is a significant difference among the block means, blocking has been effective. The variation due to block differences can be isolated using the randomized block design.
PTS: 1 REF: 489-492 | 722 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Randomized Block Design Narrative
The partially completed ANOVA table for a randomized block design is presented below:

Source        df    SS      MS    F
Treatments     4    15.5    *     *
Blocks         *    21.0    *     *
Error         24    *       *
Total         34    45.5

91. Refer to Randomized Block Design Narrative. How many blocks are involved in the design? ANS: The degrees of freedom for blocks is b – 1 = 34 – 4 – 24 = 6. Hence, there are b = 7 blocks. PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

92. Refer to Randomized Block Design Narrative. How many observations are in each treatment total?

ANS: There are always b observations in a treatment total. Here, b = 7. PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

93. Refer to Randomized Block Design Narrative. How many observations are in each block total? ANS: There are always k observations in a block total. Here, k – 1 = 4, hence k = 5. PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

94. Refer to Randomized Block Design Narrative. Fill in the blanks in the ANOVA table.
ANS:
Source        df    SS      MS       F
Treatments     4    15.5    3.875    10.333
Blocks         6    21.0    3.5       9.333
Error         24     9.0    0.375
Total         34    45.5

PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8
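The steps in Questions 91 to 94 amount to filling in the table from the entries that are given. A minimal sketch of that bookkeeping follows; the variable names are hypothetical.

```python
# Known entries from the partially completed table in the narrative.
df_treat, df_error, df_total = 4, 24, 34
ss_treat, ss_block, ss_total = 15.5, 21.0, 45.5

# Degrees of freedom add up to the total, which gives the block df.
df_block = df_total - df_treat - df_error
k = df_treat + 1    # number of treatments = observations per block total
b = df_block + 1    # number of blocks = observations per treatment total

# Consistency check: error df in a randomized block design is (k - 1)(b - 1).
assert df_error == (k - 1) * (b - 1)

# Sums of squares also add up to the total.
ss_error = ss_total - ss_treat - ss_block

ms_treat = ss_treat / df_treat
ms_block = ss_block / df_block
ms_error = ss_error / df_error

f_treat = ms_treat / ms_error
f_block = ms_block / ms_error

print(f"b = {b}, k = {k}, SSE = {ss_error}")
print(f"MST = {ms_treat}, MSB = {ms_block}, MSE = {ms_error}")
print(f"F treatments = {f_treat:.3f}, F blocks = {f_block:.3f}")
```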

95. Refer to Randomized Block Design Narrative. Do the data present sufficient evidence to indicate differences among the treatment means? Provide a justification for your answer. (Test using α = 0.05.)
ANS: To test the difference among treatment means, the test statistic is F = MST/MSE = 10.333 and the rejection region with α = 0.05 and 4 and 24 df is F > 2.78. There is a significant difference among the treatment means.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

96. Refer to Randomized Block Design Narrative. Do the data present sufficient evidence to indicate differences among the block means? Why or why not? (Test using α = 0.05.)
ANS: To test the difference among block means, the test statistic is F = MSB/MSE = 9.333 and the rejection region with α = 0.05 and 6 and 24 df is F > 2.51. There is a significant difference among the block means.
PTS: REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Fuel Consumption Study Narrative
A study was conducted to compare fuel consumption of medium-size trucks for three brands of gasoline, A, B, and C. Four trucks of the same make and model were used in the experiment, and each gasoline brand was tested in each truck. Using each brand in the same truck has the effect of eliminating (blocking out) truck-to-truck variability. The data (litres per 100 km) are as follows:

                        Automobile
Gasoline Brand      1       2       3       4
A                 16.9    18.2    18.5    17.3
B                 18.4    19.3    19.1    18.9
C                 17.3    18.7    18.0    19.0

97. Refer to Fuel Consumption Study Narrative. Use statistical software to generate a summary table and the ANOVA table.
ANS:
Summary Table
               Count    Sum     Average     Variance
Gas Brand A      4      70.9    17.725      0.5625
Gas Brand B      4      75.7    18.925      0.149167
Gas Brand C      4      73      18.25       0.576667
Auto 1           3      52.6    17.53333    0.603333
Auto 2           3      56.2    18.73333    0.303333
Auto 3           3      55.6    18.53333    0.303333
Auto 4           3      55.2    18.4        0.91

ANOVA Table
Source of Variation    SS       df    MS        F         P-value    F crit
Treatments             2.895     2    1.4475    6.4572    0.0319     5.1432
Blocks                 2.52      3    0.84      3.7472    0.0792     4.7571
Error                  1.345     6    0.2242
Total                  6.76

PTS: 1 REF: 486-488 BLM: Higher Order - Apply

TOP: 7–8

98. Refer to Fuel Consumption Study Narrative. Do the data provide sufficient evidence to indicate a difference in mean fuel consumption for the three brands of gasoline? Explain. ANS:

To test the null hypothesis that there is no difference in mean fuel consumption for the three gasoline brands, the test statistic is F = MST/MSE = 6.4572. Since p-value = 0.0319 < 0.05, the null hypothesis is rejected at the 5% level of significance. There is a significant difference among the treatment means. PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

99. Refer to Fuel Consumption Study Narrative. Is there evidence of a difference in mean fuel consumption for the four automobiles? Justify your conclusion. ANS: To test the null hypothesis that there is no difference in mean fuel consumption for the four automobiles, the test statistic is F = MSB/MSE = 3.7472. Since p-value = 0.0792, the null hypothesis is not rejected at the 5% level of significance. There is no evidence of a significant difference among the automobiles. PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

100. Refer to Fuel Consumption Study Narrative. Suppose that prior to looking at the data, you had decided to compare the mean fuel consumption for gasoline brands A and B. Find a 90% confidence interval for this difference.
ANS: The 90% confidence interval is $(\bar{x}_A - \bar{x}_B) \pm t_{0.05}\sqrt{MSE(2/b)}$ = (17.725 – 18.925) ± 1.943 $\sqrt{0.2242(2/4)}$ = –1.2 ± 0.65, or –1.85 < $\mu_A - \mu_B$ < –0.55.

PTS: 1 REF: 491-492 | 722 BLM: Higher Order - Analyze

TOP: 7–8

Salary of Business Graduates Narrative
An economist wants to test whether the mean starting salary (in $1000s) of business graduates is the same for three different majors and three types of class standing (top third, middle third, lowest third). No interaction is assumed. Appropriate sampling reveals the data shown below.

Starting Salaries (thousands of dollars per year)
                              Major
Class Rank       Marketing    Finance    Accounting
Top Third           52           75          63
Middle Third        42           59          53
Lowest Third        32           29          43

101. Refer to Salary of Business Graduates Narrative. Create the appropriate ANOVA table. ANS: ANOVA Table

PTS: 1 REF: 486-488 BLM: Higher Order - Apply

TOP: 7–8
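One way to generate the table requested in Question 101 is to apply the randomized block formulas to the salary data, treating majors as treatments and class ranks as blocks. The sketch below is an illustration under that assumption, with hypothetical variable names; it prints sums of squares, mean squares, and F ratios, and leaves p-values to statistical software.

```python
# Starting salaries ($1000s): rows are class ranks (blocks),
# columns are majors (treatments), as in the narrative.
majors = ["Marketing", "Finance", "Accounting"]
salaries = [
    [52, 75, 63],   # Top Third
    [42, 59, 53],   # Middle Third
    [32, 29, 43],   # Lowest Third
]

b, k = len(salaries), len(majors)
n = b * k
cm = sum(map(sum, salaries)) ** 2 / n          # correction for the mean

total_ss = sum(x * x for row in salaries for x in row) - cm
ss_major = sum(sum(col) ** 2 for col in zip(*salaries)) / b - cm
ss_rank = sum(sum(row) ** 2 for row in salaries) / k - cm
ss_error = total_ss - ss_major - ss_rank

df_major, df_rank = k - 1, b - 1
df_error = df_major * df_rank
mse = ss_error / df_error
f_major = (ss_major / df_major) / mse
f_rank = (ss_rank / df_rank) / mse

print(f"{'Source':<12}{'df':>4}{'SS':>12}{'MS':>12}{'F':>9}")
print(f"{'Majors':<12}{df_major:>4}{ss_major:>12.3f}{ss_major / df_major:>12.3f}{f_major:>9.3f}")
print(f"{'Class rank':<12}{df_rank:>4}{ss_rank:>12.3f}{ss_rank / df_rank:>12.3f}{f_rank:>9.3f}")
print(f"{'Error':<12}{df_error:>4}{ss_error:>12.3f}{mse:>12.3f}")
print(f"{'Total':<12}{n - 1:>4}{total_ss:>12.3f}")
```

The two F ratios agree with the test statistics quoted in Questions 102 and 103.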

102. Refer to Salary of Business Graduates Narrative. Do the data present sufficient evidence to indicate differences among the treatment means? Justify your response. (Test using α = 0.05.)
ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least one of the treatment means differs from the others. The computed test statistic F = 2.225 < $F_{0.05}$ = 6.94. In addition, p-value = 0.224 > α = 0.05; therefore $H_0$ cannot be rejected. There is insufficient evidence to conclude that the mean starting salary differs among the three majors.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

103. Refer to Salary of Business Graduates Narrative. Do the data present sufficient evidence to indicate differences among the block means? Why or why not? (Test using α = 0.05.)
ANS: $H_0$: The block means are equal vs. $H_a$: At least one of the block means differs from the others. The computed test statistic F = 10.065 > $F_{0.05}$ = 6.94. In addition, p-value = 0.027 < α = 0.05; therefore $H_0$ should be rejected. We conclude that the starting salary differs among class ranks.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

104. Provide an example for a randomized block design with three treatments (k = 3) and four blocks (b = 4), in which SSB = 0 and SST and SSE are not equal to 0.
ANS:
              Treatment
Block      1     2     3
1          8    10     6
2          6    12     6
3         10     9     5
4          4    11     9

PTS: 1 REF: 485-486 | 488 BLM: Higher Order - Create

TOP: 7–8
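The example in Question 104 works because every block has the same total, so the block means are identical and contribute nothing to SSB. The sketch below checks this with the usual formulas; the layout (rows as blocks, columns as treatments) and the variable names are assumptions made for illustration.

```python
# Rows are blocks 1-4, columns are treatments 1-3 (the example in the answer).
data = [
    [8, 10, 6],
    [6, 12, 6],
    [10, 9, 5],
    [4, 11, 9],
]

b, k = len(data), len(data[0])
cm = sum(map(sum, data)) ** 2 / (b * k)        # correction for the mean

block_totals = [sum(row) for row in data]
treatment_totals = [sum(col) for col in zip(*data)]

ssb = sum(t * t for t in block_totals) / k - cm
sst = sum(t * t for t in treatment_totals) / b - cm
total_ss = sum(x * x for row in data for x in row) - cm
sse = total_ss - sst - ssb

print("Block totals:    ", block_totals)
print("Treatment totals:", treatment_totals)
print(f"SSB = {ssb}, SST = {sst}, SSE = {sse}")
```

Any table with equal block totals but unequal treatment totals, and with rows that are not simple shifts of one another, would serve equally well.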

Automobile Repair Cost Narrative
Automobile insurance appraisers examine cars that have been involved in accidental collisions and estimate the cost of repairs. An insurance executive claims that there are significant differences in the estimates from different appraisers. To support his claim, he takes a random sample of six cars that have recently been damaged in accidents. Three appraisers then estimate the repair costs of all six cars. The data are shown below.

                 Estimated Repair Cost
Car    Appraiser 1    Appraiser 2    Appraiser 3
1          650            600            750
2          930            910           1010
3          440            450            500
4          750            710            810
5         1190           1050           1250
6         1560           1270           1450

105. Refer to Automobile Repair Cost Narrative. Set up the ANOVA Table. Use α = 0.05 to determine the critical values.
ANS:
Source of Variation    SS              df    MS             F          P-value    F critical
Treatments                52,877.78     2     26,438.889      7.457    0.01042    4.103
Blocks                 1,844,311.11     5    368,862.222    104.035    0.00003    3.326
Error                     35,455.56    10      3,545.556
Total                  1,932,644.44    17

PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

106. Refer to Automobile Repair Cost Narrative. Can we infer at the 5% significance level that the executive’s claim is true? Provide a justification for your conclusion.
ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least two means differ. Conclusion: Since F = 7.457 > 4.103 (p-value = 0.01042 < 0.05), reject the null hypothesis. Yes, the insurance executive’s claim is true.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Food Irradiation Narrative
In recent years the irradiation of food to reduce bacteria and preserve the food longer has become more common. A company that performs this service has developed four different methods of irradiating food. To determine which is best, it conducts an experiment where different foods are irradiated and the bacteria count is measured. As part of the experiment, the following foods are irradiated: beef, chicken, turkey, eggs, and milk. The results are shown below.

                        Bacteria Count
Food       Method 1    Method 2    Method 3    Method 4
Beef          47          53          36          68
Chicken       53          61          48          75
Turkey        68          85          55          45
Eggs          25          24          20          27
Milk          44          48          38          46

107. Refer to Food Irradiation Narrative. Set up the ANOVA Table. Use α = 0.01 to determine the critical values.
ANS:
Source of Variation    SS        df    MS         F        P-value    F critical
Treatments              650.2     3    216.733    2.033    0.1630     5.953
Blocks                 3838.7     4    959.675    9.002    0.0013     5.412
Error                  1279.3    12    106.608
Total                  5768.2    19

PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

108. Refer to Food Irradiation Narrative. Can the company infer at the 1% significance level that differences in the bacteria count exist among the four irradiation methods?
ANS: $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: At least two means differ. Conclusion: Since F = 2.033 < 5.953 (p-value = 0.1630 > 0.01), don’t reject the null hypothesis. There is insufficient evidence that differences in the bacteria count exist among the four irradiation methods.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

Baseball Controversy Narrative
In recent years a controversy has arisen in major league baseball. Some players have been accused of “doctoring” their bats to increase the distance the ball travels. However, a physics professor claims that the effect of doctoring is negligible. A major league manager decides to test the professor’s claim. He doctors two bats by inserting cork into one and rubber into another. He then tells five players on his team to hit a ball with an undoctored bat and with the doctored bats. The distances are measured and listed below.

                   Distance Ball Travels (in feet)
Player    Undoctored Bat    Bat with Cork    Bat with Rubber
1              275               265               280
2              315               335               320
3              425               435               440
4              380               375               370
5              450               460               450

109. Refer to Baseball Controversy Narrative. Set up the ANOVA Table. Use α = 0.05 to determine the critical values.
ANS:
Source of Variation    SS            df    MS            F          P-value    F critical
Treatments                 63.333     2        31.667      0.5033   0.622      4.459
Blocks                 67,466.667     4    16,866.667    268.078    0.0        3.838
Error                     503.333     8        62.917
Total                  68,033.333    14

PTS: 1 REF: 486-488 BLM: Higher Order - Analyze

TOP: 7–8

110. Refer to Baseball Controversy Narrative. Do these data provide sufficient evidence with the 5% level of significance to refute the professor’s claim? Justify your conclusion.
ANS: $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: at least two means differ. Conclusion: Since F = 0.5033 < 4.459 (p-value = 0.622 > 0.05), don’t reject the null hypothesis. There isn’t sufficient evidence to refute the professor’s claim.
PTS: 1 REF: 489-491 | 725-732 BLM: Higher Order - Evaluate

TOP: 7–8

3 × 3 Factorial Experiment Narrative
The table below gives data for a 3 × 3 factorial experiment, with two replications per treatment:

                           Levels of Factor A
Levels of Factor B       1          2          3
1                       6, 8      10, 8       5, 7
2                       9, 8      13, 14      8, 11
3                      15, 12      9, 10     13, 16

111. Refer to 3 × 3 Factorial Experiment Narrative. Perform an analysis of variance for the data, and present the results in an analysis of variance table.
ANS: A 3 × 3 factorial design has been used. The analysis of variance table is shown in the computer printout below.

ANOVA Table
Source of Variation    SS          df    MS         F          P-value    F crit
A                        3.1111     2     1.5556     0.6667    0.5370     4.2565
B                       81.4444     2    40.7222    17.4524    0.0008     4.2565
AB                      62.2222     4    15.5556     6.6667    0.0089     3.6331
Error                   21.0000     9     2.3333
Total                  167.7778

PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10
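The sums of squares in Question 111 can be reproduced from the cell data with the standard formulas for a two-factor factorial with replication. The following sketch is illustrative only: the dictionary layout and variable names are hypothetical, and p-values are omitted because they need an F distribution routine.

```python
# Replicated observations keyed by (level of factor A, level of factor B).
cells = {
    (1, 1): [6, 8],   (1, 2): [9, 8],    (1, 3): [15, 12],
    (2, 1): [10, 8],  (2, 2): [13, 14],  (2, 3): [9, 10],
    (3, 1): [5, 7],   (3, 2): [8, 11],   (3, 3): [13, 16],
}

a_levels = sorted({a for a, _ in cells})
b_levels = sorted({b for _, b in cells})
r = 2                                           # replications per cell
n = len(a_levels) * len(b_levels) * r

values = [x for obs in cells.values() for x in obs]
cm = sum(values) ** 2 / n                       # correction for the mean
total_ss = sum(x * x for x in values) - cm

# Main-effect sums of squares from the factor-level totals.
ss_a = sum(sum(sum(cells[(a, b)]) for b in b_levels) ** 2
           for a in a_levels) / (len(b_levels) * r) - cm
ss_b = sum(sum(sum(cells[(a, b)]) for a in a_levels) ** 2
           for b in b_levels) / (len(a_levels) * r) - cm

# Interaction is what remains of the between-cell variation.
ss_cells = sum(sum(obs) ** 2 for obs in cells.values()) / r - cm
ss_ab = ss_cells - ss_a - ss_b
ss_error = total_ss - ss_cells

df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
df_ab = df_a * df_b
df_error = len(a_levels) * len(b_levels) * (r - 1)
mse = ss_error / df_error

f_a = (ss_a / df_a) / mse
f_b = (ss_b / df_b) / mse
f_ab = (ss_ab / df_ab) / mse

print(f"SS(A) = {ss_a:.4f}, SS(B) = {ss_b:.4f}, SS(AB) = {ss_ab:.4f}, SSE = {ss_error:.4f}")
print(f"F(A) = {f_a:.4f}, F(B) = {f_b:.4f}, F(AB) = {f_ab:.4f}")
```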

112. Refer to 3 × 3 Factorial Experiment Narrative. What do we mean when we say that factors A and B interact? Do the data provide sufficient evidence to indicate interaction between factors A and B? Justify your answer. (Test at α = 0.05.)
ANS: The test statistic for interaction is F = MS(AB)/MSE = 6.67 and the rejection region is F > 3.63. There is evidence of a significant interaction. That is, the effect of factor A depends upon the level of factor B at which A is measured.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

113. Refer to 3 × 3 Factorial Experiment Narrative. Find the approximate p-value for the test in the previous question. Use statistical software or Excel and report the exact p-value.
ANS: Since the test statistic F = 6.67 lies between $F_{0.010}$ = 6.42 and $F_{0.005}$ = 7.96 (with 4 and 9 df), the approximate p-value is given by 0.005 < p-value < 0.01. The computer printout for Question 111 shows a p-value = 0.0089.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Apply

TOP: 9–10

114. Refer to 3 × 3 Factorial Experiment Narrative. Do the data provide sufficient evidence to indicate that factors A and B affect the response variable? Explain why or why not. (Test at α = 0.05.)

ANS: The null and alternative hypotheses for testing the main effect of factor A are $H_0$: There are no differences among the factor A means vs. $H_a$: At least two of the factor A means differ. The test statistic for testing factor A is F = 0.6667 and the rejection region is F > 4.26. Hence, $H_0$ is not rejected, and factor A is not significant. That is, there is no evidence to indicate that factor A affects the response variable.
The null and alternative hypotheses for testing the main effect of factor B are $H_0$: There are no differences among the factor B means vs. $H_a$: At least two of the factor B means differ. The test statistic for testing factor B is F = 17.4524 and the rejection region is F > 4.26. Hence, $H_0$ is rejected, and factor B is significant. That is, there is evidence to indicate that factor B affects the response variable.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

Product Markup Narrative A chain of jewellery stores conducted an experiment to investigate the relationship between price and location and the demand for its diamonds. Six small-town stores were selected for the study, as well as six stores located in large suburban malls. Two stores in each of these locations were assigned to each of three item percentage markups. The percentage gain (or loss) in sales for each store was recorded at the end of one month. The data are shown in the accompanying table.

                            Markup
Location             1          2          3
Small towns        12, 6      –1, 9     –8, –22
Suburban malls     16, 20     10, 5     –2, 5

115. Refer to Product Markup Narrative. Do the data provide sufficient evidence to indicate an interaction between markup and location? Justify your conclusion. (Test using α = 0.05.)
ANS: The ANOVA table for this 2 × 3 factorial experiment was generated by computer and is shown in the printout below.

ANOVA TABLE
Source of Variation    SS           df    MS          F          P-value    F crit
Markup                  835.1667     2    417.5833    11.8744    0.0082     5.14325
Location                280.3333     1    280.3333     7.9716    0.0302     5.9874
Interaction              85.16667    2     42.5833     1.2109    0.3616     5.1432
Error                   211          6     35.1667
Total                  1411.667

The null and alternative hypotheses are $H_0$: Markup and location do not interact vs. $H_a$: Markup and location interact. From the printout, F = 1.2109 with p-value = 0.3616. Hence, at the α = 0.05 level, $H_0$ is not rejected. There is insufficient evidence to indicate interaction.

PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

116. Refer to Product Markup Narrative. What are the practical implications of your test in the previous question? ANS: Since no interaction is found, the effects of markup and location can be tested individually. Both main effects of the two factors are significant. PTS: 1 REF: 500-502 BLM: Higher Order - Analyze

TOP: 9–10

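117. Refer to Product Markup Narrative. Find a 95% confidence interval for the difference in mean change in sales for stores in small towns versus those in suburban malls if the stores are using price markup 3.
ANS: With two stores per cell, the cell means at markup 3 are –15.0 (small towns) and 1.5 (suburban malls). The 95% confidence interval is $(\bar{x}_{13} - \bar{x}_{23}) \pm t_{0.025}\sqrt{MSE(1/r + 1/r)}$ = (–15.0 – 1.5) ± 2.447 $\sqrt{35.1667(1/2 + 1/2)}$ = –16.5 ± 14.51, or between –31.01 and –1.99.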

PTS: 1 REF: 491-492 | 722 BLM: Higher Order - Analyze

TOP: 9–10

Headache Treatments Narrative
The following data were generated from a 2 × 2 factorial experiment with three replicates, where factor A levels represent two different injection procedures of an anesthetic to the occipital nerve (located in the back of the neck), and factor B levels represent two different drugs that physicians recommend to increase the effectiveness of the injections. Three headache patients were randomly selected for each combination of injection and drug.

                      Factor B
Factor A         1               2
1             7, 10, 8       13, 11, 12
2            10, 11, 6       16, 15, 11

118. Refer to Headache Treatments Narrative. Test at the 5% significance level to determine if differences exist among the four treatment means.
ANS:
Source of Variation    SS        df    MS       F        P-value    F critical
Treatments             63.000     3    21.00    4.846    0.0330     4.066
Error                  34.667     8     4.333
Total                  97.667    11

$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$ vs. $H_a$: at least two means differ. Conclusion: Reject the null hypothesis. Yes, differences exist among the four treatment means.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

119. Refer to Headache Treatments Narrative. Test at the 5% significance level to determine if factors A and B interact.
ANS:
Source of Variation    SS        df    MS        F        P-value    F critical
Factor A                5.333     1     5.333     1.231   0.2995     5.318
Factor B               56.333     1    56.333    13.00    0.0069     5.318
Interaction             1.333     1     1.333     0.308   0.5943     5.318
Error                  34.667     8     4.333
Total                  97.667    11

$H_0$: Factors A and B do not interact vs. $H_a$: Factors A and B do interact. Conclusion: Don’t reject the null hypothesis. There is no evidence of interaction between injections and drugs.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

120. Refer to Headache Treatments Narrative. Test at the 5% significance level to determine if differences exist among the levels of factor A.
ANS: $H_0$: No difference among the means of the two levels of factor A vs. $H_a$: At least two means differ. Conclusion: Since F = 1.231 < 5.318 (p-value = 0.2995), don’t reject the null hypothesis. No, there is no evidence that differences exist among the levels of factor A (injections).
PTS: REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

121. Refer to Headache Treatments Narrative. Test at the 5% significance level to determine if differences exist among the levels of factor B.
ANS: $H_0$: No difference among the means of the two levels of factor B vs. $H_a$: At least two means differ. Conclusion: Since F = 13.00 > 5.318 (p-value = 0.0069), reject the null hypothesis. Yes, differences exist among the levels of factor B (drugs).
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

Keyboard and Word Processing Narrative
The data shown below were taken from a 2 × 3 factorial experiment to examine the effects of factor A (keyboard configuration, 3 levels) and factor B (word processing package, 2 levels). Each cell consists of four replicates, representing the number of minutes each of four secretaries randomly assigned to that cell required to type a standard document.

                                  Factor A
Factor B           1                    2                    3
1           26  19  20  21       30  24  25  29       26  22  27  17
2           24  21  20  23       33  27  31  29       31  23  24  26

122. Refer to Keyboard and Word Processing Narrative. Is there sufficient evidence at the 5% significance level to infer that factors A and B interact? Explain why or why not.
ANS:
Source of Variation    SS         df    MS        F        P-value    F critical
Factor A               184.333     2    92.167    8.968    0.0020     3.555
Factor B                28.167     1    28.167    2.741    0.1152     4.414
Interaction              8.333     2     4.167    0.405    0.6726     3.555
Error                  185.0      18    10.278
Total                  405.833    23

$H_0$: Factors A and B do not interact vs. $H_a$: Factors A and B do interact. Conclusion: Since F = 0.405 < 3.555 (p-value = 0.6726), don’t reject the null hypothesis. There is no evidence of interaction between factors A and B.
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

123. Refer to Keyboard and Word Processing Narrative. Test at the 5% significance level to determine if differences exist among the levels of factor A.
ANS: $H_0$: No difference among the means of the 3 levels of factor A vs. $H_a$: At least two means differ. Conclusion: Since F = 8.968 > 3.555 (p-value = 0.0020), reject the null hypothesis. Yes, differences exist for at least two levels of factor A (keyboard configuration).
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

124. Refer to Keyboard and Word Processing Narrative. Test at the 5% significance level to determine if differences exist among the levels of factor B.
ANS: $H_0$: No difference among the means of the 2 levels of factor B vs. $H_a$: At least two means differ. Conclusion: Since F = 2.741 < 4.414 (p-value = 0.1152), don’t reject the null hypothesis. There is no evidence that differences exist among the levels of factor B (word processing package).
PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

UBC Building Signs Narrative
A researcher at University of British Columbia (UBC) wanted to determine whether different building signs (building maps versus wall signage) affect the total amount of time visitors require to reach their destination and whether that time depends on whether the starting location is inside or outside the building. Three subjects were assigned to each of the combinations of signs and starting locations, and travel time in seconds from beginning to destination was recorded. A partial computer output of the appropriate analysis is given below:

ANOVA Table
Source of Variation             SS           df    MS           F
Signs (Factor A)                14,008.33          14,008.33
Starting Location (Factor B)    12,288                          2.784
Interaction                         48
Error                           35,305.33           4,413.167
Total                           61,649.67    11

125. Refer to UBC Building Signs Narrative. Find the degrees of freedom for the different building signs. ANS: 1 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

126. Refer to UBC Building Signs Narrative. What are the degrees of freedom for the different starting location? ANS: 1 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

127. Refer to UBC Building Signs Narrative. What are the degrees of freedom for the interaction between the levels of signs and starting location? ANS: 1 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

128. Refer to UBC Building Signs Narrative. What are the error degrees of freedom? ANS: 8 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

129. Refer to UBC Building Signs Narrative. What is the mean squares value for the starting location? ANS: 12,288 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

130. Refer to UBC Building Signs Narrative. Find the F test statistic for testing the main effect of types of signs. ANS: 3.174 PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Analyze

TOP: 9–10

131. Refer to UBC Building Signs Narrative. What is the F test statistic for testing the interaction effect between the types of signs and the starting location? ANS: 0.0109 PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Analyze

TOP: 9–10

132. Refer to UBC Building Signs Narrative. In order to determine the critical value of the F ratio against which to test for differences between the levels of factor A, which numerator and which denominator, respectively, for df should we use? ANS: 1, 8 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

133. Refer to UBC Building Signs Narrative. In order to determine the critical value of the F ratio against which to test for differences between the levels of factor B, which numerator and which denominator, respectively, for df should we use? ANS: 1, 8 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10
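The blank entries in the UBC Building Signs table follow from the design itself: 2 sign types, 2 starting locations, and 3 subjects per combination. The following is a minimal Python sketch of that arithmetic, not the output of any statistics package. The sums of squares are copied from the narrative; the dictionary keys and variable names are illustrative choices.

```python
# Sums of squares as printed in the UBC Building Signs partial ANOVA table.
ss = {
    "signs": 14008.33,
    "location": 12288.0,
    "interaction": 48.0,
    "error": 35305.33,
}

# Design taken from the narrative: 2 sign types, 2 starting locations,
# 3 subjects per combination.
a_levels, b_levels, reps = 2, 2, 3

# Degrees of freedom for a two-factor factorial experiment with replication.
df = {
    "signs": a_levels - 1,
    "location": b_levels - 1,
    "interaction": (a_levels - 1) * (b_levels - 1),
    "error": a_levels * b_levels * (reps - 1),
}
df_total = a_levels * b_levels * reps - 1

# Mean squares are SS / df; each F statistic divides a mean square by MSE.
ms = {source: ss[source] / df[source] for source in ss}
f_stats = {
    source: ms[source] / ms["error"]
    for source in ("signs", "location", "interaction")
}

for source in ss:
    f_text = f"{f_stats[source]:.4f}" if source in f_stats else ""
    print(f"{source:<12} df={df[source]:<2} MS={ms[source]:>10.3f} F={f_text}")
print(f"total        df={df_total}")
```

The rounded results agree with the printed entries (error MS 4,413.167, F = 2.784 for starting location) and with the answer key values of 3.174 and 0.0109 for the other two F statistics.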

134. Refer to UBC Building Signs Narrative. In order to determine the critical value of the F ratio against which to test for interaction between levels of factor A and levels of factor B, which numerator and which denominator, respectively, for df should we use? ANS: 1, 8 PTS: 1 REF: 498-500 BLM: Higher Order - Analyze

TOP: 9–10

Statistical Software Narrative A professor of statistics is trying to determine which of three statistical software programs is best for his students. He believes that the time (in hours) it takes a student to master particular software may be influenced by gender. A 3 × 2 factorial experiment with three replicates was designed, as shown below:

            Gender
Software    Male    Female
1           29      26
            24      32
            20      30
2           32      23
            26      31
            21      25
3           18      27
            20      22
            25      30

135. Refer to Statistical Software Narrative. Is there sufficient evidence at the 10% significance level to infer that the time it takes a student to master software and the gender of the student interact? Justify your conclusion. ANS:

Source of Variation    SS         df    MS        F        P-value    F critical
Software               34.778     2     17.389    0.978    0.4041     2.807
Gender                 53.389     1     53.389    3.003    0.1087     3.177
Interaction            26.778     2     13.389    0.753    0.4919     2.807
Error                  213.333    12    17.778
Total                  328.278    17

H0: Software type and gender do not interact. Ha: Software type and gender do interact. Conclusion: Don’t reject the null hypothesis. There is insufficient evidence, at the 10% significance level, to conclude that software type and the gender of the student interact in their effect on the time it takes a student to master the software. PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

136. Refer to Statistical Software Narrative. Test at the 10% significance level to determine if differences exist among the types of software. ANS: H0: No difference among the means of the types of software. Ha: At least two means differ. Conclusion: Don’t reject the null hypothesis. At the 10% significance level, there is insufficient evidence of a difference among the means of the types of software. PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10

137. Refer to Statistical Software Narrative. Test at the 10% significance level to determine if differences exist between male and female students. ANS: H0: No difference between the means of the male and female students. Ha: The two means differ. Conclusion: Don’t reject the null hypothesis. There is insufficient evidence of a difference between the means of the male and female students. PTS: 1 REF: 498-502 | 725-732 BLM: Higher Order - Evaluate

TOP: 9–10
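The ANOVA table for the Statistical Software Narrative can be recomputed from the raw times. The sketch below uses plain Python and the usual sums-of-squares formulas for a two-factor factorial experiment with replication. One assumption is worth stating loudly: the assignment of the 18 values to cells (values read as Male, Female pairs, three pairs per software program) is inferred from the flattened layout, and is used here because it reproduces the printed sums of squares. The F critical values are copied from the printed table rather than computed.

```python
# data[software][gender] -> three replicate times (hours).
# ASSUMPTION: this cell assignment is inferred from the flattened table;
# it is the reading that reproduces the printed sums of squares.
data = {
    1: {"Male": [29, 24, 20], "Female": [26, 32, 30]},
    2: {"Male": [32, 26, 21], "Female": [23, 31, 25]},
    3: {"Male": [18, 20, 25], "Female": [27, 22, 30]},
}
softwares = list(data)
genders = ["Male", "Female"]
a, b, r = len(softwares), len(genders), 3
n = a * b * r

values = [v for s in softwares for g in genders for v in data[s][g]]
cm = sum(values) ** 2 / n  # correction for the mean

total_ss = sum(v * v for v in values) - cm
ss_software = sum(
    sum(sum(data[s][g]) for g in genders) ** 2 for s in softwares
) / (b * r) - cm
ss_gender = sum(
    sum(sum(data[s][g]) for s in softwares) ** 2 for g in genders
) / (a * r) - cm
ss_cells = sum(sum(data[s][g]) ** 2 for s in softwares for g in genders) / r - cm
ss_interaction = ss_cells - ss_software - ss_gender
ss_error = total_ss - ss_cells

mse = ss_error / (a * b * (r - 1))
f_software = (ss_software / (a - 1)) / mse
f_gender = (ss_gender / (b - 1)) / mse
f_interaction = (ss_interaction / ((a - 1) * (b - 1))) / mse

# F critical values at the 10% level, copied from the printed table.
f_critical = {"software": 2.807, "gender": 3.177, "interaction": 2.807}
f_observed = {
    "software": f_software,
    "gender": f_gender,
    "interaction": f_interaction,
}
for source, f_value in f_observed.items():
    decision = "reject H0" if f_value > f_critical[source] else "do not reject H0"
    print(f"{source:<12} F = {f_value:.3f}  ->  {decision}")
```

Under that reading, none of the three F statistics exceeds its critical value, which matches the conclusions given for questions 135 to 137.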

Chapter 12A—Linear Regression and Correlation MULTIPLE CHOICE 1. In a regression setting, which of the following is NOT an assumption about the random error, ε? a. The errors are linearly correlated. b. The errors are normally distributed with a mean of 0 and a common variance σ². c. The errors are independent in a probabilistic sense. ANS: A BLM: Remember

PTS:

REF: 530

TOP: 1–5

2. In a simple linear regression problem, if the coefficient of determination is 0.96, what does this imply? a. It means that 96% of the y values are positive. b. It means that 96% of the total variation in y can be explained by the regression line. c. It means that 96% of the x values are equal. d. It means that 96% of the total variation in x can be explained by the regression line. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

3. If the sum of squares for error is equal to 0, then what must the coefficient of determination (r²) equal? a. 1.5 b. 1.0 c. 0.0 d. –1.0 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

4. A regression analysis between sales (y in $1000) and advertising (x in $100) resulted in the following least-squares line: ŷ = 82 + 7x. Given this information, if advertising costs were $900, what could we reasonably expect the amount of sales (in dollars) to be? a. $6,382 b. $82,063 c. $88,300 d. $145,000 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 531 | 533

TOP: 1–5
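Questions of this kind turn on the units: x is measured in hundreds of dollars and y in thousands of dollars, so both conversions have to be applied around the least-squares line. A minimal Python sketch of the calculation for question 4 follows; the function name and its parameters are illustrative, not taken from the text.

```python
def predict_sales_dollars(intercept, slope, advertising_dollars,
                          x_unit=100, y_unit=1000):
    """Predict sales in dollars from advertising in dollars.

    x_unit and y_unit are the dollar values of one unit of x and y
    (here x is in $100 and y is in $1000, as stated in the question).
    """
    x = advertising_dollars / x_unit   # convert dollars to units of x
    y_hat = intercept + slope * x      # evaluate the least-squares line
    return y_hat * y_unit              # convert units of y back to dollars


# Question 4: y-hat = 82 + 7x with advertising of $900.
sales = predict_sales_dollars(82, 7, 900)
print(sales)  # 145000.0
```

With $900 of advertising, x = 9, so the line gives 82 + 7(9) = 145, that is, $145,000.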

5. If all of the values of an independent variable x are equal, then regressing a dependent variable y on this independent variable x will result in which of the following coefficients of determination (r²)? a. –1.3 b. –1.0 c. 0.0 d. 1.0 ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

6. For the values of the coefficient of determination listed below, which one yields the greatest value of sum of squares for regression given that the total sum of squares is 200? a. –0.90 b. 0.00 c. 0.90 d. 0.98 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

7. A regression analysis between weight (y in kg) and height (x in cm) resulted in the following least-squares line: ŷ = –20 + 0.5x. Taking this into consideration, if the height is increased by 1 cm, what does this imply about the change in the weight, on average? a. The weight will increase by 1/2 kg. b. The weight will increase by 6 kg. c. The weight will decrease by 1/2 kg. d. The weight will decrease by 20 kg. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 531 | 533

TOP: 1–5

8. Which of the following correctly describes an estimated regression line? a. It is a line calculated from census data by the method of least squares. b. It might be represented by the equation ŷ = a + bx. c. The values of a and b found in the equation of the estimated regression line represent the line’s slope and intercept, respectively. ANS: B BLM: Remember

PTS:

REF: 531-533

TOP: 1–5

9. Which of the following correctly describes a true regression line? a. It is represented by the equation E(y) = α + βx. b. It is a line calculated from sample data by the method of least squares. c. The values of α and β found in the equation of the true regression line represent the line’s slope and intercept, respectively. d. The response y is related to the independent variable x. ANS: A BLM: Remember

PTS:

REF: 531

TOP: 1–5

10. Which of the following is a measure of how well an estimated regression line fits the sample data on which it is based, denoted by r² (and equal to the proportion of the total variation in the values of the dependent variable, y, that can be explained by the association of y with x as measured by the estimated regression line)? a. the sample coefficient of variation b. the sample coefficient of correlation c. the sample coefficient of determination d. the sample coefficient of non-determination ANS: C BLM: Remember

PTS:

REF: 543

TOP: 1–5

11. Which of the following is NOT an assumption for the simple linear regression model? a. The distribution of the error terms will be skewed to left or right, depending on the values of the dependent variable. b. The error terms have equal variances for all values of the independent variable. c. The error terms are independent of each other. d. The mean of the dependent variable for all levels of the independent variable can be connected by a straight line. ANS: A BLM: Remember

PTS:

REF: 530

TOP: 1–5

12. In a simple linear regression analysis, if SSE = 27, Total SS = 63, then what is the approximate percentage of the variation in the dependent variable y that is explained by the independent variable x? a. 27% b. 43% c. 57% d. 63% ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

13. In a simple linear regression analysis, which of the following best describes the standard error of the slope? a. It is a measure of the amount of change in the dependent variable y for a one-unit change in the independent variable x. b. It is a measure of the variation in the regression slope from sample to sample. c. It is equal to the square root of the standard error of the estimate. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 540

TOP: 1–5

14. If an estimated regression line has a y-intercept of 10 and a slope of 4, then when x = 2 what is the actual value of y? a. 18 b. 15 c. 14 d. y is unknown ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

15. Given the least-squares regression line ŷ = 5 – 2x, what may be said about the relationship between the two variables? a. The relationship between x and y is positive. b. The relationship between x and y is negative. c. As x increases, so does y. d. As x decreases, so does y. ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 531 | 562

TOP: 1–5

16. A regression analysis between sales (in $1000) and advertising (in $100) resulted in the following least-squares line: ŷ = 75 + 6x. From this information, if advertising is $800, then what is the predicted amount of sales (in dollars)? a. $4,875 b. $12,300 c. $123,000 d. $487,500 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 533

TOP: 1–5

17. A regression analysis between sales (in $1000) and advertising (in $) resulted in the following least-squares line: ŷ = 80,000 + 5x. What does this imply? a. An increase of $1 in advertising is expected, on average, to result in an increase of $5 in sales. b. An increase of $5 in advertising is expected, on average, to result in an increase of $5000 in sales. c. An increase of $1 in advertising is expected, on average, to result in an increase of $5000 in sales. d. An increase of $1 in advertising is expected, on average, to result in an increase of $80,005 in sales. ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 533

TOP: 1–5

18. In the simple linear regression model, what does the slope represent? a. the value of y when x = 0 b. the average change in y per unit change in x c. the value of x when y = 0 d. the average change in x per unit change in y ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

19. In regression analysis, what do the residuals represent? a. the difference between the actual y values and their predicted values b. the difference between the actual x values and their predicted values c. the square root of the slope of the regression line d. the change in y per unit change in x ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

20. In a simple linear regression problem, the following statistics are calculated from a sample of ten observations: Σ(x – x̄)(y – ȳ) = 2250, s_x = 10, Σx = 50, Σy = 75. Which of the following values equals the least-squares estimates of the slope and y-intercept, respectively? a. 1.5 and 0.5 b. 1.5 and 2.5 c. 2.5 and –5.0 d. 2.5 and 1.5 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 532-533

TOP: 1–5
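The arithmetic behind question 20 can be sketched in a few lines of Python. This reads the four statistics as the sum of cross-deviations (2250), the sample standard deviation of x (10), Σx (50), and Σy (75). That reading is an assumption, adopted because it reproduces answer C; the variable names are illustrative.

```python
n = 10

# ASSUMED meaning of the four statistics given in question 20.
s_xy = 2250.0   # sum of (x - x_bar)(y - y_bar)
s_x = 10.0      # sample standard deviation of x
sum_x = 50.0    # sum of the x values
sum_y = 75.0    # sum of the y values

# The sample variance is S_xx / (n - 1), so S_xx = (n - 1) * s_x**2.
s_xx = (n - 1) * s_x ** 2

# Least-squares estimates: b = S_xy / S_xx and a = y_bar - b * x_bar.
slope = s_xy / s_xx
intercept = sum_y / n - slope * (sum_x / n)

print(slope, intercept)  # 2.5 -5.0
```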

21. If a simple linear regression model has no y-intercept, then which of the following may be deduced about the values of the variables? a. All values of x are 0. b. All values of y are 0. c. When y = 0, x = 0. d. When x = 0, y = 0. ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

22. In the least-squares regression line ŷ = 3 – 2x, which of the following is the correct predicted value of y? a. 1.0 when x = 1.0 b. 1.0 when x = –1.0 c. 2.0 when x = 1.0 d. 2.0 when x = –1.0 ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 533

TOP: 1–5

23. What does the least-squares method for determining the best fit minimize? a. total variation in the dependent variable b. sum of squares for error c. sum of squares for regression d. total variation in the independent variable ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

24. In a simple linear regression problem, the following sums of squares are produced: , , and . What percentage of the variation in y may be explained by the variation in x? a. 25% b. 33% c. 50% d. 75% ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

25. In simple linear regression, most often we perform a two-tailed test of the population slope to determine whether there is sufficient evidence to infer that a linear relationship exists. How should we state the null hypothesis? a. b. c. d. ANS: A BLM: Remember

PTS:

REF: 539-540

TOP: 1–5

26. Testing whether the slope of the population regression line could be 0 is equivalent to testing which of the following? a. whether the sample coefficient of correlation could be 0 b. whether the standard error of estimate could be 0 c. whether the population coefficient of correlation could be 0 d. whether the sum of squares for error could be 0 ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 540 | 543

TOP: 1–5

27. What is the symbol for the population coefficient of correlation? a. r b. ρ c. r² d. ANS: B BLM: Remember

PTS:

REF: 561

TOP: 1–5

28. Given that the sum of squares for error is 60 and the sum of squares for regression is 140, then what is the value of the coefficient of determination? a. 0.300 b. 0.429 c. 0.700 d. 0.837 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5
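Several of the preceding questions rest on the identity Total SS = SSR + SSE, with the coefficient of determination equal to SSR / Total SS. A short Python sketch, with illustrative function names, covers both ways the sums of squares are given in questions 12 and 28.

```python
def r_squared_from_ssr_sse(ssr, sse):
    """Coefficient of determination when SSR and SSE are given."""
    return ssr / (ssr + sse)


def r_squared_from_sse_total(sse, total_ss):
    """Coefficient of determination when SSE and Total SS are given."""
    return (total_ss - sse) / total_ss


# Question 28: SSE = 60 and SSR = 140.
q28 = r_squared_from_ssr_sse(ssr=140, sse=60)
print(q28)  # 0.7

# Question 12: SSE = 27 and Total SS = 63.
q12 = r_squared_from_sse_total(sse=27, total_ss=63)
print(f"{q12:.0%}")  # 57%
```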

29. A regression line using 25 observations produced SSR = 118.68 and SSE = 56.32. What was the standard error of estimate? a. 2.2716 b. 2.1788 c. 1.5648 d. 1.5009 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 534-535

TOP: 1–5
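In simple linear regression the standard error of estimate is the square root of SSE divided by its n – 2 degrees of freedom; SSR is not needed. A minimal Python sketch for question 29, with an illustrative function name:

```python
import math


def standard_error_of_estimate(sse, n):
    """s = sqrt(SSE / (n - 2)) for simple linear regression."""
    return math.sqrt(sse / (n - 2))


# Question 29: n = 25 observations and SSE = 56.32.
s = standard_error_of_estimate(56.32, 25)
print(round(s, 4))  # 1.5648
```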

30. Given the least-squares regression line ŷ = –2.48 + 1.63x, and a coefficient of determination of 0.81, what is the coefficient of correlation? a. –0.85 b. 0.85 c. –0.90 d. 0.90 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

31. If the coefficient of determination is 0.975, then what may be said about the slope of the regression line? a. It must be positive. b. It must be negative. c. It could be either positive or negative. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 543 | 562

TOP: 1–5

32. Which value of the coefficient of correlation r indicates a stronger correlation than 0.65? a. 0.60 b. 0.55 c. –0.45 d. –0.75 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

33. In regression analysis, if the coefficient of determination is 1.0, which of the following statements can be deduced from this information? a. The sum of squares for error must be 1.0. b. The sum of squares for regression must be 1.0. c. The sum of squares for error must be 0.0. d. The sum of squares for regression must be 0.0. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

34. Which of the following is measured by the coefficient of determination? a. the amount of variation in y that is explained by variation in x b. the amount of variation in x that is explained by variation in y c. the amount of variation in y that is not explained by variation in x d. the amount of variation in x that is not explained by variation in y ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

35. If the coefficient of correlation is 0.90, what is the percentage of the variation in the dependent variable y that is explained by the variation in the independent variable x? a. 90% b. 81% c. 0.90% d. 0.81% ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 543

TOP: 1–5

36. In simple linear regression, the plot of residuals versus fitted values can be used to check for which of the following? a. normality b. a constant variance independent of x c. independence ANS: B BLM: Remember

PTS:

REF: 549-550

TOP: 6–8

37. In a simple linear regression problem including n = 10 observations, which of the following critical values would be appropriate for a 95% confidence interval estimation for the average value of y? a. 1.860 b. 2.228 c. 2.262 d. 2.306 ANS: D TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

38. In a regression problem the following pairs of (x, y) are given: (4, 1), (4, –1), (4, 0), (4, –2) and (4, 2). Which of the following statements may be deduced from the given information? a. The correlation coefficient is –1. b. The correlation coefficient is 0. c. The correlation coefficient is 1. d. The coefficient of determination is between –2 and 2. ANS: B TOP: 6–8

PTS: 1 REF: 560-562 | 543 BLM: Higher Order - Analyze

39. Which of these coefficients of correlation (r) indicates a strong negative linear relationship between the two variables of interest? a. –1.3 b. –0.9 c. 0.8 d. 0.9 ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

40. Which of these coefficients of correlation (r) indicates a strong positive linear relationship between the two variables of interest? a. –1.3 b. –0.9 c. 0.8 d. 0.9 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

41. Given the least-squares regression line ŷ = –4.63 + 1.38x, and a coefficient of determination of 0.9025, what is the correlation coefficient? a. –0.95 b. –0.81 c. +0.95 d. +1.38 ANS: C TOP: 6–8

PTS: 1 REF: 543 | 560-562 BLM: Higher Order - Apply
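The coefficient of determination alone fixes only the magnitude of r; the sign comes from the slope of the least-squares line. A minimal Python sketch of that rule, applied to question 41 (the function name is illustrative):

```python
import math


def correlation_from_r_squared(r_squared, slope):
    """Return r with magnitude sqrt(r^2) and the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


# Question 41: r^2 = 0.9025 and the slope 1.38 is positive.
r = correlation_from_r_squared(0.9025, 1.38)
print(round(r, 2))  # 0.95
```

A negative slope with the same coefficient of determination would give –0.95 instead.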

42. Which of the following is an indication of no linear relationship between two variables x and y? a. a coefficient of correlation of 1 b. a coefficient of correlation of 0 c. a coefficient of determination of –1 d. a coefficient of determination of 1 ANS: B TOP: 6–8

PTS: 1 REF: 543 | 560-562 BLM: Higher Order - Understand

43. In publishing the results of some research work, the following values of the correlation coefficient were listed. Which one is incorrect? a. 0.00 b. 0.05 c. 0.95 d. 1.05 ANS: D BLM: Remember

PTS:

REF: 561

TOP: 6–8

44. A sample of 25 observations is selected, and the sample correlation coefficient between the variables x and y is r = 0.525. What is the test statistic value for testing H0: ρ = 0 vs. Ha: ρ ≠ 0? a. about 3.81 b. about 3.65 c. about 3.08 d. about 2.96 ANS: D TOP: 6–8

PTS: 1 REF: 562-563 | 722 BLM: Higher Order - Analyze
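The test statistic for a population correlation of 0 is t = r·sqrt((n – 2) / (1 – r²)), with n – 2 degrees of freedom. A minimal Python sketch for question 44, with an illustrative function name:

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing that the population correlation is 0."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))


# Question 44: r = 0.525 from a sample of n = 25 observations.
t_stat = correlation_t_statistic(0.525, 25)
print(round(t_stat, 2))  # 2.96
```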

45. Which of the following is the appropriate null hypothesis to test whether a population correlation is 0? a. b. c. d. ANS: B BLM: Remember

PTS:

REF: 562

TOP: 6–8

46. In order to predict with 90% confidence the expected value of y for a given value of x in a simple linear regression problem, a random sample of ten observations is taken. Which of the following t-table values would be used? a. 2.306 b. 2.228 c. 1.860 d. 1.812 ANS: C TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

47. In order to predict with 99% confidence the expected value of y for a given value of x in a simple linear regression problem, a random sample of ten observations is taken. Which of the following t-table values would be used? a. 1.860 b. 2.306 c. 2.896 d. 3.355 ANS: D TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

48. In order to predict with 80% confidence the expected value of y for a given value of x in a simple linear regression problem, a random sample of 15 observations is taken. Which of the following t-table values would be used? a. 1.350 b. 1.771 c. 2.160 d. 2.650 ANS: A TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

49. In order to predict with 98% confidence the expected value of y for a given value of x in a simple linear regression problem, a random sample of 15 observations is taken. Which of the following t-table values would be used? a. 1.350 b. 1.771 c. 2.160 d. 2.650 ANS: D TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

50. Which of the following statements is NOT a property of the residuals in a simple linear regression model? a. They sum to 0. b. They have a mean of 0. c They have a median of 0 c. They have a standard deviation of 1. ANS: C BLM: Remember

PTS:

REF: 548-549

TOP: 6–8
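The first two properties in question 50 can be checked numerically: least-squares residuals always sum to 0 (and so have mean 0), while their spread depends on the data. The Python sketch below fits a line to a small data set that is hypothetical and invented only for this illustration.

```python
import math

# HYPOTHETICAL data, not taken from the test bank.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

b = s_xy / s_xx          # least-squares slope
a = y_bar - b * x_bar    # least-squares intercept

residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
sse = sum(e ** 2 for e in residuals)
s = math.sqrt(sse / (n - 2))   # standard error of estimate

print(f"sum of residuals = {sum(residuals):.10f}")
print(f"standard error of estimate = {s:.4f}")
```

The sum of the residuals is 0 up to floating-point rounding, whereas the standard error of estimate for these data is nowhere near 1.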

51. A study of 20 students showed that the correlation between the time spent writing a test and the number of hours studied the night before the test was 0.35. Using a level of significance equal to 0.05, what does this imply? a. The sample correlation coefficient could be 0 since the test statistic does not fall into the rejection region. b. The null hypothesis that the population mean is equal to 0 should not be rejected, and we should conclude that the true correlation coefficient is 0. c. There is not enough statistical evidence to conclude that the true correlation coefficient is different from 0. d. The null hypothesis that the population variance is equal to 0 should be rejected, and we should conclude that the true correlation coefficient is 0. ANS: C TOP: 6–8

PTS: 1 REF: 562-563 | 722 BLM: Higher Order - Evaluate

TRUE/FALSE 1. In a simple linear regression problem, if the sum of squares for regression is 90, then the correlation coefficient is 0.9. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

2. In a simple linear regression problem, if the sum of squares for regression is 90, then the total sum of squares is at least 90. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 534-535

TOP: 1–5

3. In a simple linear regression problem, if the sum of squares for regression is 90, then the sum of squares for error is at most 90. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 534-535

TOP: 1–5

4. In a simple linear regression model, if the regression slope coefficient is negative, then the standard error of the estimate will be positive. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 540

TOP: 1–5

5. In a simple linear regression model, the regression slope coefficient will have the same sign as the correlation coefficient. ANS: T BLM: Remember

PTS:

REF: 561-562

TOP: 1–5

6. In a simple linear regression setting, the deterministic model equation determines an exact value of the dependent variable y when the value of the independent variable x is given, since all points must lie exactly on the line. ANS: T BLM: Remember

PTS:

REF: 534 529

TOP: 1–5

7. The sum of squares for regression can never be larger than the sum of squares for error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 534-535

TOP: 1–5

8. The value of the sum of squares for regression can never be larger than 100. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 534-535

TOP: 1–5

9. In a simple linear regression setting, the probabilistic model equation allows for some deviation of the points about the regression line, making it a more practical model. ANS: T BLM: Remember

PTS:

REF: 530

TOP: 1–5

10. The vertical spread of the data points about the regression line is measured by the y-intercept. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 534

TOP: 1–5

11. The method of least squares requires that the sum of the squared deviations between actual y values in the scatter diagram and y values predicted by the regression line be minimized.

ANS: T BLM: Remember

PTS:

REF: 531

TOP: 1–5

12. A regression analysis between sales (in $1000) and advertising (in $) resulted in the following least-squares line: ŷ = 60 + 5x. This implies that an increase of $1 in advertising is expected to result in an increase of $65 in sales. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 532-533

TOP: 1–5

13. A regression analysis between weight (y, in kg) and height (x, in cm) resulted in the following least-squares line: ŷ = –5 + 0.4x. This implies that if the height is increased by 1 cm, the weight is expected to increase by an average of 0.4 kg. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 532-533

TOP: 1–5

14. In simple linear regression, if the estimated values ŷ and the corresponding actual values y are equal, then the standard error of estimate, SE( ), must equal –1.0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 535 | 554

TOP: 1–5

15. The value of the sum of squares for error can never be larger than the total sum of squares. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 534

TOP: 1–5

16. If a least-squares regression line has a y-intercept of 6.84 and a slope of 2.16, then when x = 1 the actual value of y must be 9. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 534

TOP: 1–5

17. Given that the sum of squares for error is 52 and the sum of squares for regression is 148, then the coefficient of determination is 0.74. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

18. If the coefficient of determination is 0.982, then the slope of the regression line must be positive. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 543 | 561

TOP: 1–5

19. A regression analysis between sales (in $1000) and advertising (in $100) resulted in the following least-squares line: ŷ = 77 + 8x. This implies that if advertising is $600, then the predicted amount of sales is $125,000. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 531-533

TOP: 1–5

20. The residuals are observations of the error variable ε. Consequently, the minimized sum of squared deviations is called the sum of squares for error. ANS: T BLM: Remember

PTS:

REF: 531

TOP: 1–5

21. One way to measure the strength of the relationship between the response variable y and the predictor variable x is to calculate the coefficient of determination, that is, the proportion of the total variation in y that is explained by the linear regression of y on x. ANS: T BLM: Remember

PTS:

REF: 543

TOP: 1–5

22. Regression analysis is a statistical method that seeks to establish an equation that allows the unknown value of one variable to be estimated from the known value of one or more other variables. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 528-529

TOP: 1–5

23. In regression analysis, the independent variable is a variable whose value is known and is being used to explain or predict the value of another variable. ANS: T BLM: Remember

PTS:

REF: 528-529

TOP: 1–5

24. In regression analysis, the dependent variable is a variable whose value is unknown and is being explained or predicted with the help of another variable. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 528-529

TOP: 1–5

25. The values of a and b found in the equation of the estimated regression line represent the line’s y-intercept and slope, respectively, and are called estimated regression coefficients. ANS: T BLM: Remember

PTS:

REF: 531

TOP: 1–5

26. The values of α and β found in the equation of the true regression line E(y) = α + βx represent the line’s y-intercept and slope, respectively, and are called true regression coefficients. ANS: T BLM: Remember

PTS:

REF: 529

TOP: 1–5

27. A measure of how well an estimated regression line fits the sample data on which it is based (denoted by r² and equal to the proportion of the total variation in the values of the dependent variable, y, that can be explained by the association of y with x as measured by the estimated regression line) is called the sample coefficient of correlation. ANS: F BLM: Remember

PTS:

REF: 543

TOP: 1–5

28. If two variables are related in a negative linear manner, the scatterplot will show points on the x,y-space that are generally moving from the upper left to the lower right. ANS: T BLM: Remember

PTS:

REF: 562

TOP: 1–5

29. An automobile company in Ontario is interested in the relationship between the gender of its employees and employee productivity. A good starting point in this analysis would be to compute the coefficient of determination and the correlation coefficient. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 543

TOP: 1–5

30. In developing a simple linear regression model, only one independent variable is used to explain the variation in a single dependent variable. ANS: T BLM: Remember

PTS:

REF: 528-529

TOP: 1–5

31. In a simple linear regression model, the slope coefficient represents the average change in the independent variable x for a one-unit change in the dependent variable y. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 529-530

TOP: 1–5

32. If a set of data contains no values of the independent variable x that are equal to 0, then the y-intercept regression coefficient has no particular meaning. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 529

TOP: 1–5

33. The sign of the correlation coefficient in a simple linear regression model will always be the same as the sign of the y-intercept coefficient a. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 561

TOP: 1–5

34. The standard error of the estimate in a simple linear regression model measures the variation in the slope coefficient from sample to sample. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 540-541

TOP: 1–5

35. In a simple linear regression model, if the independent and dependent variables are negatively linearly related, then the standard error of the estimate will also be negative. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 562 | 543

TOP: 1–5

36. In a simple linear regression model, if the regression model is statistically significant, then the regression slope coefficient is significantly greater than 0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 540-541

TOP: 1–5

37. If a simple linear regression model is developed based on a sample where the independent and dependent variables are known to be positively related, then the sign of the slope regression coefficient will be positive also. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 561-562

TOP: 1–5

38. If a simple linear regression model is developed based on a sample where the independent and dependent variables are known to be negatively related, then the sum of squares for error will be negative also. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 562 | 543

TOP: 1–5

39. If the coefficient of determination value for a simple linear regression model is 0.90, then the correlation coefficient between the two variables will be 0.81. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

40. The value of the sum of squares for regression can never be smaller than 0.0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 534-535

TOP: 1–5

41. The value of the sum of squares for regression can never be smaller than 1.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 534-535

TOP: 1–5

42. If all the values of an independent variable x are equal, then regressing a dependent variable y on x will result in a coefficient of determination of 0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

43. In a simple linear regression model, testing whether the slope of the population regression line could be 0 is the same as testing whether or not the population coefficient of correlation equals 0. ANS: T BLM: Remember

PTS:

REF: 540 | 562

TOP: 1–5

44. When the actual values y of a dependent variable and the corresponding predicted values ŷ are the same, the standard error of estimate will be 0.0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

45. If there is no linear relationship between two variables x and y, the coefficient of determination must be 1.0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

46. The value of the sum of squares for regression can never be larger than the value of sum of squares for error. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 534-535

TOP: 1–5

47. When the actual values y of a dependent variable and the corresponding predicted values are the same, the standard error of estimate will be –1.0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 531

TOP: 1–5

48. In a simple linear regression problem, the least-squares line is ŷ = –3.75 + 1.25x, and the coefficient of determination is 0.81. The coefficient of correlation must be –0.90. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 543

TOP: 1–5

49. In a simple linear regression analysis, if the t test statistic for testing the significance of the regression model is 3.4, then the F test statistic from the ANOVA table for regression will be 11.56. ANS: T TOP: 1–5

PTS: 1 REF: 539-542 | 722 BLM: Higher Order - Apply
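Item 49 rests on the identity F = t² for the slope test in simple linear regression, where the F statistic has one numerator degree of freedom. A minimal Python check follows; the function name is chosen here only for illustration.

```python
def f_from_t(t_stat: float) -> float:
    """F statistic implied by the slope t statistic in simple linear regression."""
    return t_stat ** 2


f_value = f_from_t(3.4)
print(round(f_value, 2))  # 11.56
```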

50. In a simple linear regression model, if you found that the true regression coefficient is significantly greater than 0, then you may also conclude that the two variables are positively correlated. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 1–5

51. In a simple linear regression problem, if the coefficient of determination is 0.95, this means that 95% of the variation in the independent variable x can be explained by regression line. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 1–5

52. If the true correlation between two variables is 0, then there is no linear relationship between the two variables. ANS: T BLM: Remember

PTS:

REF: 560-562

TOP: 6–8

53. When all the points in a scatter diagram lie precisely on the estimated regression line, the sample coefficient of correlation will equal 0. ANS: F BLM: Remember

PTS:

REF: 560-562

TOP: 6–8

54. When all the points in a scatter diagram lie precisely on the estimated regression line, the sample coefficient of correlation will show the variables to be perfectly correlated. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

55. The normal probability plot is a graph that plots the residuals against the expected value of that residual if it had come from a normal distribution. When the residuals are normally distributed or approximately so, the plot should appear as a straight line, sloping upward at a 45° angle. ANS: T BLM: Remember

PTS:

REF: 549-550

TOP: 6–8

56. In simple linear regression, one can use the plot of residuals versus the fitted values of y to check for a constant variance as well as to make sure that the linear model is in fact adequate.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 549-550

TOP: 6–8

57. In regression analysis, a careful study of the differences between observed and estimated y values, given x (in order to decide whether crucial assumptions are fulfilled that allow valid inferences about the true regression line to be made from an estimated regression line) is called residual analysis. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 549-550

TOP: 6–8

58. In regression analysis, a graph of each residual against the corresponding fitted value is called a scatter diagram. ANS: F BLM: Remember

PTS:

REF: 549-550

TOP: 6–8

59. In developing a 90% confidence interval for the average value of y from a simple linear regression problem involving 12 observations, the appropriate t-table value would be 1.796. ANS: F TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

60. If all the points in a scatterplot lie on the least-squares regression line, then the correlation coefficient must be 1.0. ANS: F BLM: Remember

PTS:

REF: 560-562

TOP: 6–8

61. In a simple linear regression problem, the least-squares line is ŷ = 2.73 – 1.02x, and the coefficient of determination is 0.7744. The correlation coefficient must be –0.88. ANS: T TOP: 6–8

PTS: 1 REF: 543 | 561-562 BLM: Higher Order - Apply

62. Simple regression analysis is a statistical technique that establishes an index which provides, in a single number, a measure of the strength of association between two variables. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 6–8

63. If the correlation coefficient between two variables is very close to 0, this means that there is no relationship between the two variables. ANS: F BLM: Remember

PTS:

REF: 560-562

TOP: 6–8

64. A perfect correlation between two variables will always produce a correlation coefficient of +1.0. ANS: F BLM: Remember

PTS:

REF: 560-562

TOP: 6–8

65. In simple linear regression analysis, if the correlation coefficient between the independent variable x and the dependent variable y is –0.85, this means that the scatterplot generated by the same data values would show points that would fall on a straight line with slope equal to –0.85. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

66. If the correlation coefficient for two variables is found to be 0.094, then the scatterplot will show the data upward sloping from lower left to upper right. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

67. If there is a negative correlation between the independent variable x and the dependent variable y, then to test this, the appropriate null and alternative hypotheses would be vs. ANS: F TOP: 6–8

PTS: 1 BLM: Remember

REF: 562-563 | 722

68. In simple linear regression analysis, if the independent variable x and the dependent variable y are highly correlated, this means not only that they are linearly related, but also that a change in x will cause a change in y. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 560-562

TOP: 6–8

69. The regression model ŷ = 36.5 + 20.1x has been computed based on a sample of 50 observations. One observation in the sample was (x, y) = (14, 350.9). Given this, the residual value for this observation is 33. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 531

TOP: 6–8
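The residual in item 69 is the observed y minus the fitted value from the regression line. A short sketch of that arithmetic in Python, with a helper name that is only illustrative:

```python
def residual(x, y, intercept, slope):
    """Observed value minus the value fitted by the line intercept + slope * x."""
    fitted = intercept + slope * x
    return y - fitted


e = residual(14, 350.9, 36.5, 20.1)
print(round(e, 1))  # 33.0
```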

70. In a simple linear regression analysis, it was stated that the correlation between starting salary and years of experience is 0.80. This indicates that 80% of the variation in starting salary is explained by years of experience. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 543

TOP: 6–8

71. When regression analysis is used for prediction, the confidence interval for the average y given x will be wider than the prediction interval for a particular value of y given x. ANS: F BLM: Remember

PTS:

REF: 554

TOP: 6–8

72. A large coefficient of determination value will result in a small standard error of the estimate for the regression model, thus providing prediction intervals that are narrow. ANS: F TOP: 6–8

PTS: 1 REF: 543 | 554-555 BLM: Higher Order - Understand

73. The prediction interval developed from a simple linear regression model will be very narrow when the value of x used to predict y is equal to the mean value x̄. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 554-555

TOP: 6–8

74. In developing a 95% confidence interval for the expected value of y from a simple linear regression problem involving a sample of size ten, the appropriate critical value would be 1.86. ANS: F TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

75. In developing an 80% prediction interval for the particular value of y from a simple linear regression problem involving a sample of size 12, the appropriate t-table value would be 1.372. ANS: T TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

76. In developing a 90% prediction interval for the particular value of y from a simple linear regression problem involving a sample of size 14, the appropriate t-table value would be 2.179. ANS: F TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

77. In order to predict with 95% confidence a particular value of y for a given value of x in a simple linear regression problem, a random sample of 20 observations is taken. The appropriate t-table value that would be used is 2.101. ANS: T TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze

78. The confidence interval estimate of the expected value of y will be narrower than the prediction interval for the same given value of x and confidence level. This is because there is less error in estimating a mean value as opposed to predicting an individual value.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 554

TOP: 6–8

79. The confidence interval estimate of the expected value of y will be wider than the prediction interval for the same given value of x and confidence level. This is because there is more error in estimating a mean value as opposed to predicting an individual value. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 554

TOP: 6–8

80. In developing a 90% confidence interval for the expected value of y from a simple linear regression problem involving a sample of size 15, the appropriate t-table value would be 1.761. ANS: F TOP: 6–8

PTS: 1 REF: 553-555 | 722 BLM: Higher Order - Analyze
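The t-table items above all use n – 2 degrees of freedom and split the non-confidence probability equally between the two tails. The sketch below checks such values without a printed table by integrating the Student t density numerically with Simpson's rule and bisecting for the quantile. The function names, the step count, and the search bound of 100 are choices made here for illustration, not part of the test bank.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=1000):
    """P(T <= x) for x >= 0, using Simpson's rule on [0, x] (steps must be even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3


def t_critical(confidence, n):
    """Two-sided critical value for simple linear regression (df = n - 2)."""
    df = n - 2
    target = 1 - (1 - confidence) / 2  # upper-tail area is (1 - confidence) / 2
    lo, hi = 0.0, 100.0
    for _ in range(50):  # bisection on the monotone CDF
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# (confidence level, sample size) pairs taken from items 59, 74, 75, 76, 77 and 80
for conf, n in [(0.90, 12), (0.95, 10), (0.80, 12), (0.90, 14), (0.95, 20), (0.90, 15)]:
    print(conf, n, round(t_critical(conf, n), 3))
```

Under these assumptions the loop should agree with the usual table entries 1.812, 2.306, 1.372, 1.782, 2.101, and 1.771, which is why items 59, 74, 76, and 80 are false while items 75 and 77 are true.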

Chapter 12B—Linear Regression and Correlation PROBLEM Blacktop Let x be the area (in square metres) to be covered with blacktop, and let y be the time (in minutes) it takes a construction crew to completely cover the area. The simple linear regression model relates x and y where the least-squares estimates of the regression parameters are b = 0.207 and a = 81.6. 1. Refer to Blacktop statement. What is the least-squares best-fitting regression line? ANS: ŷ = 81.6 + 0.207x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

2. Refer to Blacktop statement. What is the estimated amount of time it takes to apply 2400 square metres of blacktop? ANS: ŷ = 81.6 + 0.207(2400) = 578.4 minutes PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

3. Refer to Blacktop statement. What is the average change in time per one square metre increase in area? ANS: The average change in time per one square metre increase in area is b = +0.207. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

Lumber Weight Narrative Let x be the weight in tonnes (1 tonne = 1000 kg) of a load of lumber and y be the time (in hours) it takes to load it onto a truck. A simple linear regression model relates x and y where the least-squares estimates of the regression parameters are b = 6.5 and a = 3.3. 4. Refer to Lumber Weight Narrative. What is the least-squares best-fitting regression line? ANS: ŷ = 3.3 + 6.5x

PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

5. Refer to Lumber Weight Narrative. What is the estimated time it takes to load 9 tonnes of lumber? ANS: ŷ = 3.3 + 6.5(9) = 61.8 hours PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

6. Refer to Lumber Weight Narrative. What is the average change in time per one tonne increase in weight? ANS: The average change in time per one tonne increase in weight is b = +6.5 hours. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

7. Refer to Lumber Weight Narrative. Interpret the y-intercept of the regression line. ANS: The estimated time it takes to load 0 tonnes of lumber on a truck is 3.3 hours. Of course, this makes no practical sense. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

Delivery Time Narrative Let x be the number of pieces of furniture in a delivery truck and y be the time (in hours) it takes the delivery person to deliver all the pieces of furniture. A simple linear regression analysis related x and y where the least-squares estimates of the regression parameters are a = 1.85 and b = 0.55. 8. Refer to Delivery Time Narrative. What is the least-squares best-fitting regression line? ANS: ŷ = 1.85 + 0.55x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

9. Refer to Delivery Time Narrative. Identify and interpret the slope of the equation.

ANS: The slope is b = 0.55. This means the average increase in time per one unit increase of furniture is 0.55 hours. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

10. Refer to Delivery Time Narrative. Identify and interpret the y-intercept. Does this make sense? ANS: The y-intercept is a = 1.85. This means that the estimated time it takes to deliver 0 pieces of furniture is 1.85 hours. This does not make sense. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

11. Refer to Delivery Time Narrative. Use the least-squares regression line to estimate the time it takes to deliver ten pieces of furniture. (You may assume that ten is in the range of the data.) ANS: ŷ = 1.85 + 0.55(10) = 7.35 hours. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5
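The point estimates in the Blacktop, Lumber Weight, and Delivery Time problems all come from substituting x into ŷ = a + bx. A small Python sketch of that substitution, using the estimates quoted in the three narratives (the helper name is illustrative):

```python
def predict(intercept, slope, x):
    """Point estimate from a fitted least-squares line."""
    return intercept + slope * x


print(round(predict(81.6, 0.207, 2400), 1))  # Blacktop: 578.4 minutes
print(round(predict(3.3, 6.5, 9), 1))        # Lumber Weight: 61.8 hours
print(round(predict(1.85, 0.55, 10), 2))     # Delivery Time: 7.35 hours
```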

Salary and Years Narrative A company manager is interested in the relationship between x = number of years that an employee has been with the company and y = the employee’s annual salary (in thousands of dollars). The following statistical software output is from a regression analysis for predicting y from x for n = 15 data points.

Predictor  Coef     St. Dev  t-ratio  p
Constant   16.8221  0.3887   43.28    0.000
x          0.64983  0.02617  24.83    0.000

s = 0.8081 R-sq = 97.9% R-sq(adj) = 97.8%

12. Refer to Salary and Years Narrative. What is the least-squares regression equation? ANS: ŷ = 16.8221 + 0.64983x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

13. Refer to Salary and Years Narrative. What are the least-squares estimates of the slope and the y-intercept? ANS: Slope = b = 0.64983, and y-intercept = a = 16.8221. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

14. Refer to Salary and Years Narrative. Interpret the estimated slope and y-intercept for this problem. ANS: Slope: For each additional year an employee is with this company, his or her salary increases, on average, by $650. y-intercept: An employee just starting a job with this company, i.e., 0 years, has a starting salary of $16,820. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

15. Refer to Salary and Years Narrative. What are the values of s² and the sum of squares for error? ANS: s² = 0.653. Recall that s² = MSE, and MSE = SSE/(n – 2). Hence, SSE = (n – 2) MSE, or SSE = (15 – 2)(0.653) = 8.489. PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

16. Refer to Salary and Years Narrative. Find and interpret the coefficient of determination. ANS: r² = 97.9%. This means that 97.9% of the total variation in y (the employee’s annual salary) is accounted for by regression on x (number of years that an employee has been with the company). PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

17. Refer to Salary and Years Narrative. Does a linear relationship exist between x and y? Test using α = 0.05. ANS: Since p-value = 0.000 < α = 0.05, reject the null hypothesis and conclude that a linear relationship does exist between x and y.

PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

18. Refer to Salary and Years Narrative. Find and interpret the correlation coefficient. ANS: Since b > 0, the correlation coefficient is the positive square root of r²: r = √0.979 = 0.989. This means that there is a strong positive linear relationship between x and y. PTS: 1 REF: 560-562 BLM: Higher Order - Analyze

TOP: 6–8
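The answers to items 15 and 18 can be recovered from three numbers in the Salary and Years output: s, R-sq, and the sign of the slope. The sketch below repeats that arithmetic in Python; rounding s² to three decimals before multiplying mirrors the answer key and is a choice made here, not a requirement.

```python
import math

n = 15           # number of data points stated in the narrative
s = 0.8081       # standard error of estimate from the output
r_sq = 0.979     # R-sq from the output
slope = 0.64983  # coefficient of x from the output

s_sq = round(s ** 2, 3)                    # estimate of the error variance (MSE)
sse = (n - 2) * s_sq                       # SSE = (n - 2) * MSE
r = math.copysign(math.sqrt(r_sq), slope)  # r carries the sign of the slope

print(s_sq, round(sse, 3), round(r, 3))    # 0.653 8.489 0.989
```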

Advertising and Money Spent Narrative A marketing analyst is studying the relationship between x = money spent on television advertising and y = increase in sales. One study reported the following data (in dollars) for a particular company.

x  380  645  360  900    540  670  820  1,050  760  800
y  550  780  530  1,200  620  800  910  1,400  830  905

19. Refer to Advertising and Money Spent Narrative. Develop a scatterplot and determine whether a linear relationship appears to provide a good fit to this data set. ANS:

A linear relationship appears to provide a good fit to this data set. PTS: 1 REF: 562 BLM: Higher Order - Analyze

TOP: 1–5

20. Refer to Advertising and Money Spent Narrative. Use a statistical software package of your choice and report the regression analysis results. ANS: Summary Output

Regression Statistics
Multiple R         0.9522
R Square           0.9067
Adjusted R Square  0.8951
Standard Error     89.3345
Observations       10

ANOVA
            df  SS            MS            F        p-value
Regression  1   620,817.3133  620,817.3133  77.7903  0.0000
Error       8   63,845.1867   7,980.6483
Total       9   684,662.5

           Coefficients  Standard Error  t Stat  p-value
Intercept  27.5431       97.7069         0.2819  0.7852
x          1.1913        0.1351          8.8199  0.0000

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

21. Refer to Advertising and Money Spent Narrative. What is the least-squares regression line? ANS: ŷ = 27.5431 + 1.1913x PTS: 1 REF: 531-533 | 536 BLM: Higher Order - Analyze

TOP: 1–5

22. Refer to Advertising and Money Spent Narrative. State and interpret the slope. ANS: Slope = b = 1.19. This means that for every additional $1 spent on advertising, sales increase, on average, by about $1.19. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

23. Refer to Advertising and Money Spent Narrative. Calculate s² and SSE for these data. ANS: s² = MSE = 7,980.6483. Since MSE = SSE/(n – 2), then SSE = (10 – 2)(7,980.6483) = 63,845.1864. PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

24. Refer to Advertising and Money Spent Narrative. Does a linear relationship exist between x and y? Test using α = 0.05. ANS: Since p-value = 0.0000 < α = 0.05, reject the null hypothesis and conclude that a linear relationship does exist between x and y. Notice that the p-value is 0.0000 for both the F test and the t test. PTS: 1 REF: 540-543 | 722 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5
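The summary output for the Advertising and Money Spent data can be reproduced from the usual sums of squares. The sketch below assumes the ten observations pair up as (380, 550), (645, 780), (360, 530), (900, 1200), (540, 620), (670, 800), (820, 910), (1050, 1400), (760, 830), (800, 905); this pairing is inferred from the extracted table rather than stated explicitly, and the function name is hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y on x with the standard ANOVA quantities."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    syy = sum((y - y_bar) ** 2 for y in ys)  # total sum of squares
    b = sxy / sxx                            # slope
    a = y_bar - b * x_bar                    # intercept
    ssr = b * sxy                            # regression sum of squares
    sse = syy - ssr                          # error sum of squares
    mse = sse / (n - 2)
    return {"a": a, "b": b, "ssr": ssr, "sse": sse, "mse": mse,
            "f": ssr / mse, "r_sq": ssr / syy}


# Assumed pairing of the advertising data (see the note above)
x = [380, 645, 360, 900, 540, 670, 820, 1050, 760, 800]
y = [550, 780, 530, 1200, 620, 800, 910, 1400, 830, 905]

fit = fit_line(x, y)
for name in ("a", "b", "ssr", "sse", "mse", "f", "r_sq"):
    print(name, round(fit[name], 4))
```

With this pairing the fit should return an intercept near 27.5431, a slope near 1.1913, SSE near 63,845.19, and F near 77.79, matching the output reported for item 20.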

Wind Velocity and Windmills Narrative A scientist is studying the relationship between wind velocity (x in km/h) and DC output of a windmill (y). The following MINITAB output is from a regression analysis for predicting y from x.

Predictor  Coef     Stdev    t-ratio  p
Constant   –0.1346  0.1803   –0.75    0.470
X          0.28996  0.03050  9.51     0.000

s = 0.2435 R-sq = 88.3% R-sq(adj) = 87.3%

Analysis of Variance
Source      DF  SS      MS      F      p
Regression  1   5.3606  5.3606  90.40  0.000
Error       12  0.7116  0.0593
Total       13  6.0721

25. Refer to Wind Velocity and Windmills Narrative. What is the least-squares regression line? ANS: ŷ = –0.1346 + 0.28996x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

26. Refer to Wind Velocity and Windmills Narrative. Predict the DC output for a wind velocity of 22 km/h. ANS: ŷ = –0.1346 + 0.28996(22) = 6.2445 PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

27. Refer to Wind Velocity and Windmills Narrative. What is the value of the error sum of squares? ANS: SSE = 0.7116 PTS: 1 REF: 534-536 BLM: Higher Order - Analyze

TOP: 1–5

28. Refer to Wind Velocity and Windmills Narrative. One of the assumptions about the random error in the regression model is that the values of ε have a common variance equal to σ². What is the best estimator of σ? ANS: s = √MSE = 0.2435 PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

29. Refer to Wind Velocity and Windmills Narrative. Identify and interpret the coefficient of determination. ANS: r² = 88.3%. This means that 88.3% of the total variation in y (DC output of a windmill) is accounted for by regression on x (wind velocity). PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5
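The quantities asked for in items 27 to 29 are linked through the ANOVA table: MSE = SSE/df, F = MSR/MSE, s = √MSE, and r² = SSR/Total SS. A brief Python sketch using the sums of squares printed in the MINITAB output; rounding MSE to four decimals follows the printed table and is a choice made here.

```python
import math

# Sums of squares and degrees of freedom as printed in the ANOVA table
ssr, sse, sst = 5.3606, 0.7116, 6.0721
df_reg, df_err = 1, 12

msr = ssr / df_reg
mse = round(sse / df_err, 4)  # 0.0593, the estimate of the error variance
f_stat = msr / mse            # F statistic for the regression
s = math.sqrt(mse)            # standard error of estimate
r_sq = ssr / sst              # coefficient of determination

print(mse, round(f_stat, 2), round(s, 4), round(r_sq, 3))  # 0.0593 90.4 0.2435 0.883
```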

30. Refer to Wind Velocity and Windmills Narrative. Does a linear relationship exist between x and y? Test using α = 0.05. ANS: Since p-value = 0.000 < α = 0.05, reject the null hypothesis and conclude that a linear relationship does exist between x and y. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

Correlation between Shoreline Erosion and Rainfall A scientist is studying the relationship between x = centimetres of annual rainfall and y = centimetres of shoreline erosion. One study reported the following data. Use the following statistical software output to answer the questions below.

x  30   25   90   60   50   35   75   110  45   80
y  0.3  0.2  5.0  3.0  2.0  0.5  4.0  6.0  1.5  4.0

Predictor  Coef      Stdev     t-ratio  p
Constant   –1.7359   0.1882    –9.22    0.000
x          0.073099  0.002867  25.50    0.000

s = 0.2416 R-sq = 98.8% R-sq(adj) = 98.6%

Analysis of Variance
Source      DF  SS      MS      F       p
Regression  1   37.938  37.938  650.14  0.000
Error       8   0.467   0.058
Total       9   38.405

31. Refer to Correlation between Shoreline Erosion and Rainfall. What is the equation of the estimated regression line? ANS: ŷ = –1.7359 + 0.073099x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

32. Refer to Correlation between Shoreline Erosion and Rainfall. Interpret the estimated slope of the regression line in the previous question. ANS: For each additional centimetre of annual rainfall, the shoreline erodes 0.0731 centimetres, on average. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

33. Refer to Correlation between Shoreline Erosion and Rainfall. Is the simple linear regression model useful for predicting erosion from a given amount of rainfall? Test the following hypotheses. ANS: Since p-value = 0.000 < α, reject the null hypothesis and conclude that a linear relationship does exist between x and y. That is, the simple linear regression model is useful for predicting erosion from a given amount of rainfall. BLM: Higher Order - Evaluate

PTS:

REF: 540-542 | 722

TOP: 1–5

34. Refer to Correlation between Shoreline Erosion and Rainfall. Identify and interpret the coefficient of determination. ANS: r² = 98.8%. This means that 98.8% of the total variation in y (centimetres of shoreline erosion) is accounted for by regression on x (centimetres of annual rainfall). PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

Sales and Experience Narrative The general manager of a chain of furniture stores believes that experience is the most important factor in determining the level of success of a salesperson. To examine this belief, she records last month’s sales (in $1,000s) and the years of experience of ten randomly selected salespeople. These data are listed below.

Salesperson          1  2  3   4   5   6   7   8   9   10
Years of Experience  0  2  10  3   8   5   12  7   20  15
Sales                7  9  20  15  18  14  20  17  30  25

35. Refer to Sales and Experience Narrative. Determine the standard error of estimate and describe what this statistic tells you about the regression line. ANS: 1.5724; the model’s fit is good. PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

36. Refer to Sales and Experience Narrative. Determine the coefficient of determination and discuss what its value tells you about the two variables. ANS: 0.9536, which means that 95.36% of the variation in sales is explained by the variation in years of experience of the salesperson.

PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

37. Refer to Sales and Experience Narrative. Calculate the Pearson correlation coefficient. What sign does it have? Why? ANS: 0.9765. It has a positive sign since the slope of the regression line (b = 1.0817) is positive. PTS: 1 REF: 560-561 BLM: Higher Order - Analyze

TOP: 1–5

38. Refer to Sales and Experience Narrative. Conduct a test of the population slope to determine at the 5% significance level whether a linear relationship exists between years of experience and sales. ANS: H0: β = 0 vs. Ha: β ≠ 0 Rejection region: | t | > 2.306 Test statistic: t = 12.8258 Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of experience and sales. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

39. Refer to Sales and Experience Narrative. Predict with 95% confidence the monthly sales of a salesperson with ten years of experience. ANS: 19.447 ± 3.819. Thus LCL = 15.628 (in $1,000s), and UCL = 23.266 (in $1,000s)

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

40. Refer to Sales and Experience Narrative. Estimate with 95% confidence the average monthly sales of all salespersons with ten years of experience. ANS: 19.447 ± 1.199. Thus LCL = 18.248 (in $1,000s), and UCL = 20.646 (in $1,000s)

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8
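Both intervals for the Sales and Experience data are built from the same fitted value and differ only by the extra "1 +" under the square root for an individual prediction. The sketch below recomputes them from the ten data pairs in the narrative; the critical value 2.306 (df = 8) is taken from the answer key rather than computed, and the function name is illustrative.

```python
import math

years = [0, 2, 10, 3, 8, 5, 12, 7, 20, 15]
sales = [7, 9, 20, 15, 18, 14, 20, 17, 30, 25]  # in $1,000s
T_CRIT = 2.306  # t(0.025) with n - 2 = 8 df, as quoted in the answer key

n = len(years)
x_bar = sum(years) / n
y_bar = sum(sales) / n
sxx = sum((x - x_bar) ** 2 for x in years)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))
syy = sum((y - y_bar) ** 2 for y in sales)

b = sxy / sxx
a = y_bar - b * x_bar
s = math.sqrt((syy - b * sxy) / (n - 2))  # standard error of estimate


def half_widths(x0):
    """Half-widths of the 95% confidence and prediction intervals at x0."""
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci = T_CRIT * s * math.sqrt(leverage)      # mean of y at x0
    pi = T_CRIT * s * math.sqrt(1 + leverage)  # single new y at x0
    return ci, pi


y_hat = a + b * 10
ci_half, pi_half = half_widths(10)
print(round(y_hat, 3), round(ci_half, 3), round(pi_half, 3))
```

The output should reproduce 19.447 ± 1.199 for the mean and 19.447 ± 3.819 for an individual salesperson, which also illustrates why the prediction interval is always the wider of the two.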

41. Refer to Sales and Experience Narrative. Which interval in the previous two questions is narrower: the confidence interval estimate of the expected value of y or the prediction interval for the same given value of x (ten years) and same confidence level? Why?

ANS: The confidence interval estimate of the expected value of y is narrower than the prediction interval for the same given value of x (ten years) and same confidence level. This is because there is less error in estimating a mean value as opposed to predicting an individual value. PTS: 1 REF: 554 BLM: Higher Order - Understand

TOP: 6–8

42. Refer to Sales and Experience Narrative. Draw a scatter diagram of the data to determine whether a linear model appears to be appropriate. ANS: It appears that a linear model is appropriate.

PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

43. Refer to Sales and Experience Narrative. Determine the least-squares regression line. ANS: ŷ = 8.63 + 1.0817x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

44. Refer to Sales and Experience Narrative. Interpret the value of the slope of the regression line. ANS: For each additional year of experience, monthly sales of a salesperson increase by an average of $1,081.70. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

45. Refer to Sales and Experience Narrative. Estimate the monthly sales for a salesperson with 16 years of experience. ANS: When x = 16, ŷ = 25.94 (in $1,000s)

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

Income and Education Narrative A professor of economics wants to study the relationship between income (y in $1,000s) and education (x in years). A random sample of eight individuals is taken and the results are shown below.

Education  16  11  15  8   12  10  13  14
Income     58  40  55  35  43  41  52  49

46. Refer to Income and Education Narrative. Determine the standard error of estimate and describe what this statistic tells you about the regression line. ANS: 2.436; the model’s fit to these data is good. PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

47. Refer to Income and Education Narrative. Determine the coefficient of determination and discuss what its value tells you about the two variables. ANS: 0.9223, which means that 92.23% of the variation in income is explained by the variation in years of education. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

48. Refer to Income and Education Narrative. Calculate the Pearson correlation coefficient. What sign does it have? Why? ANS: 0.9604. It has a positive sign since the slope of the regression line (b = 2.9098) is positive. PTS: 1 REF: 560-561 BLM: Higher Order - Analyze

TOP: 1–5

49. Refer to Income and Education Narrative. Conduct a test of the population slope to determine at the 5% significance level whether a linear relationship exists between years of education and income. ANS: H0: β = 0 vs. Ha: β ≠ 0 Rejection region: | t | > 2.447 Test statistic: t = 8.439 Conclusion: Reject the null hypothesis. Yes, a linear relationship exists between years of education and income. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5
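The slope test for the Income and Education data divides the estimated slope by its standard error, s/√Sxx. The sketch below recomputes the slope, s, r², r, and the t statistic from the eight data pairs in the narrative, reading each pair as (education, income); variable names are illustrative.

```python
import math

education = [16, 11, 15, 8, 12, 10, 13, 14]  # x, in years
income = [58, 40, 55, 35, 43, 41, 52, 49]    # y, in $1,000s

n = len(education)
x_bar = sum(education) / n
y_bar = sum(income) / n
sxx = sum((x - x_bar) ** 2 for x in education)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(education, income))
syy = sum((y - y_bar) ** 2 for y in income)

b = sxy / sxx                    # slope
a = y_bar - b * x_bar            # intercept
sse = syy - b * sxy
s = math.sqrt(sse / (n - 2))     # standard error of estimate
r_sq = 1 - sse / syy             # coefficient of determination
r = math.copysign(math.sqrt(r_sq), b)
t_stat = b / (s / math.sqrt(sxx))  # test statistic for H0: slope = 0

print(round(a, 4), round(b, 4), round(s, 3), round(r_sq, 4), round(r, 4), round(t_stat, 2))
```

The computed t statistic is about 8.44, which agrees with the key's 8.439 up to rounding and lies well inside the rejection region | t | > 2.447.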

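50. Refer to Income and Education Narrative. Predict with 95% confidence the income of an individual with ten years of education. ANS: 39.715 ± 2.447(2.710) = 39.715 ± 6.632, where 2.710 is the standard error of prediction and 2.447 is the t-table value with 6 degrees of freedom. Thus, LCL = 33.083 (in $1,000s), and UCL = 46.347 (in $1,000s)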

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

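51. Refer to Income and Education Narrative. Estimate with 95% confidence the average income of all individuals with ten years of education. ANS: 39.715 ± 2.447(1.188) = 39.715 ± 2.908, where 1.188 is the standard error of the estimated mean and 2.447 is the t-table value with 6 degrees of freedom. Thus, LCL = 36.807 (in $1,000s), and UCL = 42.623 (in $1,000s)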

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

52. Refer to Income and Education Narrative. Which interval in the previous two questions is narrower: the confidence interval estimate of the expected value of y or the prediction interval for the same given value of x (ten years) and same confidence level? Why? ANS: The confidence interval estimate of the expected value of y is narrower than the prediction interval for the same given value of x (ten years) and same confidence level. This is because there is less error in estimating a mean value as opposed to predicting an individual value. PTS: 1 REF: 554 BLM: Higher Order - Understand

TOP: 6–8

53. Refer to Income and Education Narrative. Draw a scatter diagram of the data to determine whether a linear model appears to be appropriate. ANS:

It appears that a linear model is appropriate.

PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

54. Refer to Income and Education Narrative. Determine the least-squares regression line. ANS: ŷ = 10.6165 + 2.9098x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

55. Refer to Income and Education Narrative. Interpret the value of the slope of the regression line. ANS: For each additional year of education, the income increases by an average of $2,909.80. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

56. Refer to Income and Education Narrative. Estimate the income of an individual with 15 years of education. ANS: When x = 15, ŷ = 54.264 (in $1,000s), or $54,264

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

Amount of Trees and Beavers A scientist is studying the relationship between x = density (in number per square metre) of aspen trees around a pond and y = beaver abundance. The following statistical software output is from a regression analysis for predicting y from x.

Predictor  Coef     Stdev    t-ratio  p
Constant   2.5595   0.3442   7.44     0.000
x          0.52381  0.03733  14.03    0.000

s = 0.4839 R-sq = 97.0% R-sq(adj) = 96.5%

Analysis of Variance
Source      DF  SS      MS      F       p
Regression  1   46.095  46.095  196.88  0.000
Error       6   1.405   0.234
Total       7   47.500

57. Refer to Amount of Trees and Beavers. What is the least-squares regression equation? ANS: ŷ = 2.5595 + 0.52381x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

58. Refer to Amount of Trees and Beavers. What are the estimated slope and estimated intercept? ANS: Slope = b = 0.52381, and y-intercept = a = 2.5595 PTS: 1 REF: 531-533 | 536 BLM: Higher Order - Understand

TOP: 1–5

59. Refer to Amount of Trees and Beavers. Interpret the estimated slope and intercept for this problem. ANS: Slope: For each increase of one aspen tree per square metre in the density around a pond, the beaver abundance increases, on average, by 0.524. Intercept: A pond with no aspen trees has an estimated beaver abundance of 2.56. This interpretation is valid only if x = 0 lies within the sample range of the predictor variable. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

60. Refer to Amount of Trees and Beavers. What is the value of the error sum of squares? ANS: SSE = 1.405 PTS: 1 REF: 534-536 BLM: Higher Order - Analyze

TOP: 1–5

61. Refer to Amount of Trees and Beavers. What is the value of the coefficient of determination? ANS: Coefficient of determination = r² = 0.97

PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

62. Refer to Amount of Trees and Beavers. What is the percentage of variation in beaver abundance accounted for by regression on the density of aspen trees? ANS: This is simply the coefficient of determination; r² = 97.0%. PTS: 1 REF: 543 BLM: Higher Order - Understand

TOP: 1–5

Young Aspen Trees and Growth Narrative Let x be the number of leaves on a young aspen tree and let y be the growth of the tree (in mm). The data are as follows.

x  30  42  50  65  70  35  44  60  40  46  55
y  3   5   7   8   9   4   6   8   5   6   7

63. Refer to Young Aspen Trees and Growth Narrative. Develop a scatterplot for this data. ANS:

PTS: 1 REF: 534 BLM: Higher Order - Apply

TOP: 1–5

64. Refer to Young Aspen Trees and Growth Narrative. What does the scatterplot developed in the previous question indicate about the relationship between the two variables? ANS: It seems that the simple linear regression model is appropriate for describing the relationship between x and y. PTS: 1 REF: 531-533 | 562 BLM: Higher Order - Understand

TOP: 1–5

65. Refer to Young Aspen Trees and Growth Narrative. Use a statistical software package of your choice and report the regression analysis results. ANS: ANOVA Output

Source      df  SS        MS        F         p-value
Regression  1   32.23381  32.23381  206.8406  0
Residual    9   1.4026    0.155839
Total       10  33.63636

           Coefficients  Standard Error  t Stat   P-value
Intercept  –0.8007       0.4999          –1.6017  0.14369
x          0.1430        0.0099          14.3820  0.00000

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

66. Refer to Young Aspen Trees and Growth Narrative. What is the least-squares regression line? ANS: $\hat{y}$ = –0.8007 + 0.143x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

67. Refer to Young Aspen Trees and Growth Narrative. What is the average change in growth with the increase of one leaf? ANS: This is simply the estimated slope, b = + 0.143 mm. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

68. Refer to Young Aspen Trees and Growth Narrative. Find and interpret the coefficient of determination. ANS: $r^2$ = 0.9583. This means that about 95.8% of the total variation in the y values (growth of the tree in mm) can be explained by regression on x (number of leaves). PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5
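The coefficients and coefficient of determination reported for the Young Aspen Trees data can be checked by hand or with a few lines of code. The following is a minimal sketch in plain Python, assuming the eleven (x, y) pairs listed in the narrative; the function name `fit_line` and the variable names are illustrative and not part of the test bank.

```python
def fit_line(xs, ys):
    """Return (a, b, r2) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squares and cross-products about the means
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx                  # estimated slope
    a = y_bar - b * x_bar            # estimated intercept
    r2 = s_xy ** 2 / (s_xx * s_yy)   # coefficient of determination
    return a, b, r2


# Data as listed in the narrative (leaves, growth in mm)
leaves = [30, 42, 50, 65, 70, 35, 44, 60, 40, 46, 55]
growth = [3, 5, 7, 8, 9, 4, 6, 8, 5, 6, 7]

a, b, r2 = fit_line(leaves, growth)
print(f"y-hat = {a:.4f} + {b:.4f}x, r^2 = {r2:.4f}")
```

The printed values should agree, up to rounding, with the intercept, slope, and coefficient of determination given in the answers above.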

69. Refer to Young Aspen Trees and Growth Narrative. Use the t-test to test the hypotheses $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$.

ANS: The test statistic is t = 14.382 with p-value = 0. Reject $H_0$ and conclude that there is a significant linear relationship between x and y.

PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

70. Refer to Young Aspen Trees and Growth Narrative. Use the F-test to test the hypotheses $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$ at $\alpha$ = 0.05.

ANS: The test statistic is F = 206.84 with p-value = 0. Again, reject $H_0$ and conclude that there is a significant linear relationship between x and y. PTS: 1 REF: 542-543 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

71. Refer to Young Aspen Trees and Growth Narrative. Find and interpret the correlation coefficient. ANS: Since b = 0.143 > 0, the correlation coefficient is $r = +\sqrt{r^2} = \sqrt{0.9583} = 0.9789$. This indicates a strong positive linear relationship between the two variables. PTS: 1 REF: 560-561 BLM: Higher Order - Analyze

TOP: 6–8

Age of Forest and Diameter of Trees A scientist is studying the relationship between age of a forest, x, in years and the average diameter of the trees, y, in cm. One study reported the following data. x

120

150

100

175

72. Refer to Age of Forest and Diameter of Trees. Construct a scatterplot for this data, including the least-squares regression line. ANS:

PTS: 1 REF: 562 BLM: Higher Order - Apply

TOP: 1–5

73. Refer to Age of Forest and Diameter of Trees. Use a statistical software package of your choice and report the regression analysis results. ANS:

Summary Output
Regression Statistics
Multiple R          0.9986
R Square            0.9972
Adjusted R Square   0.9968
Standard Error      1.0963
Observations        9

ANOVA Output
Source       df   SS           MS           F            p-value
Regression   1    2,976.4761   2,976.4761   2,476.6321   0.0000
Residual     7    8.4128       1.2018
Total        8    2,984.8889

             Coefficients   Standard Error   t Stat    P-value
Intercept    12.0560        0.6385           18.8826   0.0000
x            0.3280         0.0066           49.7658   0.0000

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

74. Refer to Age of Forest and Diameter of Trees. What is the equation of the least-squares regression line? ANS: $\hat{y}$ = 12.056 + 0.328x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

75. Refer to Age of Forest and Diameter of Trees. Is the simple linear regression model useful for predicting the diameter of the trees from a given age of the forest? Use the t-test at $\alpha$ = 0.05. ANS: This is the equivalent of testing the hypotheses $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$. The value of the test statistic is t = 49.7658, and p-value = 0. So, we reject the null hypothesis, and conclude that the simple linear regression model is useful for predicting the diameter of the trees from a given age of the forest. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

76. Refer to Age of Forest and Diameter of Trees. Predict the tree diameter of an 83-year-old forest. ANS: $\hat{y}$ = 12.056 + 0.328(83) = 39.28 cm. PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

77. Refer to Age of Forest and Diameter of Trees. Estimate $\beta$ using a 90% confidence interval.

ANS: b = 0.328, so $b \pm t_{0.05}\,SE(b)$ = 0.328 ± 1.895(0.0066) = 0.328 ± 0.0125 = (0.3155, 0.3405)

PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Analyze

TOP: 1–5
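The interval above follows the usual pattern "estimate plus or minus critical value times standard error". A minimal sketch, assuming the slope and standard error from the printout in question 73 and a critical value read from a t table with 7 degrees of freedom; the function name is hypothetical.

```python
def slope_confidence_interval(b, se_b, t_crit):
    """Return (lower, upper) for b +/- t_crit * se_b."""
    margin = t_crit * se_b
    return b - margin, b + margin


# b and SE(b) from the regression printout; t_crit = t_{0.05} with 7 df
# (taken from a t table, not computed here)
lower, upper = slope_confidence_interval(b=0.328, se_b=0.0066, t_crit=1.895)
print(f"90% CI for the slope: ({lower:.4f}, {upper:.4f})")
```

Because the interval excludes zero, it is consistent with the t-test conclusion in question 75.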

Vending Machines Narrative Let x be the number of vending machines and let y be the time (in hours) it takes to stock them. The data are as follows. x 4 8 10 13 16 11 5 9 18 14 6 12 20

y 1

2.5

3.5

1.3

2.5

1.4

10.5

78. Refer to Vending Machines Narrative. Construct a scatterplot for this data including the least-squares regression line. ANS:

PTS: 1 REF: 531-534 BLM: Higher Order - Apply

TOP: 1–5

79. Refer to Vending Machines Narrative. Use a software package of your choice and report the regression analysis results. ANS:

Summary Output
Regression Statistics
Multiple R          0.9822
R Square            0.9646
Adjusted R Square   0.9611
Standard Error      0.6156
Observations        12

ANOVA Table
Source       df   SS         MS         F          p-value
Regression   1    103.4190   103.4190   272.8622   0.0000
Residual     10   3.7902     0.3790
Total        11   107.2092

             Coefficients   Standard Error   t Stat    p-value
Intercept    –3.0306        0.5067           –5.9806   0.00014
x            0.6624         0.0401           16.5185   0.00000

PTS: 1 REF: 535-536 BLM: Higher Order - Apply

TOP: 1–5

80. Refer to Vending Machines Narrative. What is the equation of the estimated regression line? ANS:

PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

81. Refer to Vending Machines Narrative. What is the predicted time required to stock seven vending machines? ANS: –2.5408 + 0.6283(7) = 1.8573 hours PTS: 1 REF: 533 BLM: Higher Order - Apply

TOP: 1–5

82. Refer to Vending Machines Narrative. What percentage of the total variation in y can be explained by the simple linear regression model? ANS: This is simply the coefficient of determination 0.957, so about 95.7% PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

83. Refer to Vending Machines Narrative. Estimate $\beta$ using a 95% confidence interval.

ANS: b = 0.6283, so $b \pm t_{0.025}\,SE(b)$ = 0.6283 ± 0.0793 = (0.549, 0.7076), where $t_{0.025}$ = 2.201.

PTS: 1 REF: 541-542 | 722 BLM: Higher Order - Analyze

TOP: 1–5

Weight and Height Narrative Evidence supports using a simple linear regression model to estimate a person’s weight based on a person’s height. Let x be a person’s height (measured in cm) and y be the person’s weight (measured in kg). A random sample of 11 people was selected and the following data recorded: x 152 185 163 170 155 165 160 173 157 178 188 y 47.7 68.2 54.5 59.1 48.6 55.9 53.6 61.4 50.0 65.9 77.3 The following output was generated using statistical software:

Regression Analysis
The regression equation is y = –67.0 + 0.747x

Predictor   Coef     StDev    T        P
Constant    –67.04   7.787    –8.609   0.000
x           0.7474   0.0463   16.144   0.000

S = 1.7698; R-Sq = 96.7%; R-Sq(adj) = 96.3%

Analysis of Variance Table
Source           DF   SS       MS       F        p
Regression       1    816.39   816.39   260.63   0.000
Residual Error   9    28.19    3.13
Total            10   844.58

Unusual Observations
Obs   x     y      Fit     StDev Fit   Residual   St Resid
2     185   68.2   71.22   2.22        –3.02      –2.18R
11    188   77.3   73.46   2.45        3.84       2.66R

R denotes an observation with a large standardized residual.

84. Refer to Weight and Height Narrative. Based on the scatterplot above, does a simple linear regression model seem appropriate? Justify your answer. ANS: Yes. The data appear to be reasonably linear so a simple linear regression model seems appropriate. PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

85. Refer to Weight and Height Narrative. Use the printout to find the least-squares prediction line. ANS: $\hat{y}$ = –67.04 + 0.7474x PTS: 1 REF: 531-533 | 536 BLM: Higher Order - Analyze

TOP: 1–5
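The fitted values and residuals shown for the two unusual observations in the printout can be reproduced from the raw data. This is a rough sketch, assuming the 11 height and weight pairs listed in the narrative; the helper name `fitted` is illustrative only.

```python
# Heights (cm) and weights (kg) as listed in the narrative
heights = [152, 185, 163, 170, 155, 165, 160, 173, 157, 178, 188]
weights = [47.7, 68.2, 54.5, 59.1, 48.6, 55.9, 53.6, 61.4, 50.0, 65.9, 77.3]

n = len(heights)
x_bar = sum(heights) / n
y_bar = sum(weights) / n
s_xx = sum((x - x_bar) ** 2 for x in heights)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(heights, weights))

b = s_xy / s_xx          # estimated slope
a = y_bar - b * x_bar    # estimated intercept


def fitted(x):
    """Predicted weight for a given height."""
    return a + b * x


# Residual = observed y minus fitted y
residuals = [y - fitted(x) for x, y in zip(heights, weights)]

for obs in (2, 11):      # observations flagged in the printout
    i = obs - 1
    print(obs, heights[i], weights[i],
          round(fitted(heights[i]), 2), round(residuals[i], 2))
```

Standardized residuals are not computed here; they additionally require the leverage of each observation.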

Ice Cream Sales Narrative The manager of an ice cream store is interested in examining the relationship between sales of ice cream (in litres per day) and maximum temperature of the day. The vendor records the following data for a random sample of five days in the summer, where y is number of litres of ice cream sold per day and x is maximum temperature, in degrees Celsius, recorded for the day:

x 29 32 35 31 27
y 19 28 38 23 16

The following summary information was computed: $\sum x = 154$, $\sum y = 124$, $\sum x^2 = 4780$, $\sum y^2 = 3374$, $\sum xy = 3922$.

86. Refer to Ice Cream Sales Narrative. Construct a scatterplot for the data. Do the data appear to be reasonably linear? ANS: Yes, the data appear to be reasonably linear. PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

87. Refer to Ice Cream Sales Narrative. Find the least-squares prediction line. ANS: $S_{xx}$ = 4780 – 4743.2 = 36.8; $S_{xy}$ = 3922 – 3819.2 = 102.8; $b = S_{xy}/S_{xx}$ = 102.8/36.8 = 2.7935, and $a = \bar{y} - b\bar{x}$ = 24.8 – 86.039 = –61.239.

Then, the least-squares prediction line is: $\hat{y}$ = –61.239 + 2.7935x.

PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

88. Refer to Ice Cream Sales Narrative. Find the estimated sales of ice cream for a maximum daily temperature of 34°C. ANS: $\hat{y}$ = –61.239 + 2.7935(34) ≈ 33.74 litres.

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

89. Refer to Ice Cream Sales Narrative. Would you use the least-squares prediction equation line to find the estimated sales of ice cream for a maximum daily temperature of 6°C? Why or why not? ANS: No, since 6°C is outside the range of the x data values. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

90. Refer to Ice Cream Sales Narrative. Find and interpret the coefficient of determination. ANS: The coefficient of determination is $r^2$ = 0.9611. This means that 96.11% of the total variation in y (ice cream sales) can be explained by regression on x (maximum temperature). One can also say that there is a 96.11% reduction in the total variation by using the regression line to predict the response variable y instead of using $\bar{y}$. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

91. Refer to Ice Cream Sales Narrative. Test $H_0: \beta = 0$ against $H_a: \beta \neq 0$ at the 0.05 level of significance.

ANS: SSE = $S_{yy} - (S_{xy})^2/S_{xx}$ = 298.8 – 287.1696 = 11.6304, MSE = SSE/(n – 2) = 11.6304/3 = 3.8768. Then, the value of the test statistic is $t = b/\sqrt{MSE/S_{xx}}$ = 8.607.

With three degrees of freedom, reject $H_0$ if | t | > $t_{0.025}$ = 3.182. Since t = 8.607 > 3.182, reject $H_0$ and conclude that there is a significant linear relationship between maximum daily temperature and daily sales of ice cream. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

92. Refer to Ice Cream Sales Narrative. Find and interpret the correlation between maximum daily temperature and daily sales of ice cream.

ANS: $S_{yy}$ = 3374 – 3075.2 = 298.8. Then, the correlation coefficient is $r = S_{xy}/\sqrt{S_{xx}S_{yy}}$ = 102.8/104.861 = 0.9803. There is a strong positive linear relationship between daily sales of ice cream and maximum daily temperature. PTS: 1 REF: 560-562 BLM: Higher Order - Analyze

TOP: 6–8
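The computing formulas used throughout the Ice Cream Sales answers can be traced in one short script. The following is a sketch only, assuming the five (x, y) pairs from the narrative; variable names are illustrative, and the critical value 3.182 is taken from a t table rather than computed.

```python
import math

temps = [29, 32, 35, 31, 27]   # x: maximum temperature (degrees C)
sales = [19, 28, 38, 23, 16]   # y: litres sold per day
n = len(temps)

# Computing formulas: raw sum minus correction for the mean
s_xx = sum(x * x for x in temps) - sum(temps) ** 2 / n
s_yy = sum(y * y for y in sales) - sum(sales) ** 2 / n
s_xy = sum(x * y for x, y in zip(temps, sales)) - sum(temps) * sum(sales) / n

b = s_xy / s_xx                              # slope
a = sum(sales) / n - b * sum(temps) / n      # intercept

sse = s_yy - s_xy ** 2 / s_xx                # error sum of squares
mse = sse / (n - 2)                          # estimate of error variance
t_stat = b / math.sqrt(mse / s_xx)           # t statistic for H0: beta = 0
r = s_xy / math.sqrt(s_xx * s_yy)            # correlation coefficient

print(f"Sxx={s_xx:.1f} Sxy={s_xy:.1f} Syy={s_yy:.1f}")
print(f"y-hat = {a:.3f} + {b:.4f}x, t = {t_stat:.3f}, r = {r:.4f}")
print("reject H0" if abs(t_stat) > 3.182 else "do not reject H0")
```

Squaring r gives the coefficient of determination reported in question 90.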

SAT Scores and GPA Narrative A university admissions committee was interested in examining the relationship between a student’s score on the Scholastic Aptitude Test, x, and the student’s grade point average, y, at the end of the student’s first year of university. The committee selected a random sample of 25 students and recorded the SAT score and GPA at the end of the first year of university for each student. Use the following output that was generated using statistical software to answer the questions below:

Regression Analysis
The regression equation is GPA = –1.09 + 0.00349 SAT

Predictor   Coef        StDev       T       P
Constant    –1.0851     0.2593      –4.19   0.000
SAT         0.0034868   0.0002171   16.06   0.000

S = 0.1463 R-Sq = 91.8% R-Sq(adj) = 91.5%

Analysis of Variance
Source           DF   SS       MS       F        P
Regression       1    5.5189   5.5189   257.93   0.000
Residual Error   23   0.4921   0.0214
Total            24   6.0111

Correlations (Pearson)
Correlation of SAT and GPA = 0.958

93. Refer to SAT Scores and GPA Narrative. Use the information above to find the least-squares prediction line. ANS: $\hat{y}$ = –1.0851 + 0.0035x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

94. Refer to SAT Scores and GPA Narrative. Find the estimated GPA at the end of the freshman year for a student who scored 1175 on the SAT exam.

ANS: $\hat{y}$ = –1.0851 + 0.0035(1175) = 3.0274 PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

95. Refer to SAT Scores and GPA Narrative. Determine the coefficient of determination, and interpret its value. ANS: The coefficient of determination is $r^2$ = 91.8%. This means that 91.8% of the total variation in GPA can be explained by regression on SAT score. One can also say that there is a 91.8% reduction in the total variation in y (GPA) by using the regression line to predict the response variable y instead of using $\bar{y}$. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

96. Refer to SAT Scores and GPA Narrative. Use the p-value approach to test the usefulness of the linear regression model at the 0.05 level of significance. ANS: Since p-value = 0.000 < 0.05 (specified level of significance), reject $H_0$ and conclude that there is a significant linear relationship between a student’s SAT score and the student’s GPA at the end of the first year of university. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

97. Refer to SAT Scores and GPA Narrative. Determine the correlation between a student’s SAT score and GPA at the end of the freshman year. Interpret the value. ANS: Since b = 0.0035 > 0, then the correlation coefficient r is given by $r = +\sqrt{r^2} = +\sqrt{0.918}$ = +0.958. There is a strong positive linear relationship between a student’s SAT score and GPA at the end of the freshman year. PTS: 1 REF: 560-562 BLM: Higher Order - Analyze

TOP: 6–8
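The sign rule in the answer above (r takes the sign of the slope and the magnitude of the square root of R-Sq) is easy to express in code. A small sketch, assuming the R-Sq and slope values copied from the printout; the variable names are illustrative.

```python
import math

r_squared = 0.918      # R-Sq from the printout, as a proportion
slope = 0.0034868      # SAT coefficient from the printout

# r has the magnitude sqrt(R-Sq) and the same sign as the slope
r = math.copysign(math.sqrt(r_squared), slope)
print(f"r = {r:+.3f}")

# Prediction from question 94, using the rounded coefficients in the answer
gpa_hat = -1.0851 + 0.0035 * 1175
print(f"predicted GPA at SAT = 1175: {gpa_hat:.4f}")
```

With a negative slope the same call would return a negative correlation of equal magnitude.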

Extra Help Sessions Narrative A study was conducted to determine the effect of extra help sessions attended on students’ ability to avoid mistakes on a 20-question test. The data shown below represent the number of extra help sessions attended (x) and the average number of mistakes (y) recorded.

x 1 2 3 4 5 6
y 6.1 5.1 5.0 4.2 3.7 3.2

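98. Refer to Extra Help Sessions Narrative. Use the regression formulas to find the least-squares line for the data. ANS: $S_{xx} = 17.5$ and $S_{xy} = -9.75$. Then $b = S_{xy}/S_{xx} = -0.5571$, $a = \bar{y} - b\bar{x} = 4.55 + 0.5571(3.5) = 6.5$, and the least-squares line is $\hat{y} = 6.5 - 0.5571x$. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze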

TOP: 1–5

99. Refer to Extra Help Sessions Narrative. Plot the six points and graph the line. Does the line appear to provide a good fit to the data points? ANS: The line appears to provide a good fit to the data points.

PTS: 1 REF: 562 BLM: Higher Order - Apply

TOP: 1–5

100. Refer to Extra Help Sessions Narrative. Use the least-squares line to predict the value of y when x = 3.5. ANS: When x = 3.5, the value for y can be predicted using the least-squares line as $\hat{y} = 6.5 - 0.5571(3.5) = 4.55$. PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

101. Refer to Extra Help Sessions Narrative. Use statistical software to construct the ANOVA table for the linear regression. ANS: The completed ANOVA table is shown below.

ANOVA Table
Source       df   SS        MS        F         p-value
Regression   1    5.43214   5.43214   152.118   0.000248
Residual     4    0.14286   0.03571
Total        5    5.575

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

102. Refer to Extra Help Sessions Narrative. Do the data provide sufficient evidence to indicate that y and x are linearly related at the 1% level of significance? ANS: The hypotheses to be tested are $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$. Since p-value = 0.000248 < $\alpha$ = 0.01, reject the null hypothesis and conclude that the relationship between x and y is significant at $\alpha$ = 0.01; that is, y and x are linearly related. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

103. Refer to Extra Help Sessions Narrative. Calculate the coefficient of determination, $r^2$. What information does this value give about the usefulness of the linear model?

ANS: The coefficient of determination, $r^2$, is $r^2$ = SSR/Total SS = 5.43214/5.575 = 0.9744. The coefficient of determination measures the proportion of the total variation in y that is accounted for using the independent variable x. That is, the total variation in y is reduced by 97.44% by using the regression equation $\hat{y}$ rather than $\bar{y}$ to predict the response y. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

Sleep Deprivation Narrative
A study was conducted to determine the effects of sleep deprivation on people’s ability to solve problems. The amount of sleep deprivation varied with 8, 12, 16, 20, and 24 hours without sleep. A total of ten subjects participated in the study, two at each sleep deprivation level. After his or her specified sleep deprivation period, each subject was administered a set of simple addition problems, and the number of errors was recorded. These results were obtained:

Number of Errors, y                9, 7   7, 11   9, 15   15, 13   17, 13
Number of Hours without Sleep, x   9      13      17      21       25

104. Refer to Sleep Deprivation Narrative. How many pairs of observations are in the experiment? What are the total number of degrees of freedom? ANS: There are n = 2(5) = 10 pairs of observations in the experiment, so that the total number of degrees of freedom are n – 1 = 9. PTS: 1 REF: 534-535 BLM: Higher Order - Analyze

TOP: 1–5

105. Refer to Sleep Deprivation Narrative. What is the least-squares prediction equation? ANS:

Source      Coefficients   Standard Error   t Stat   p-value
Intercept   3.525          2.2452           1.5701   0.155044
x           0.475          0.1253           3.7905   0.005308

From the computer printout above, the least-squares prediction line is $\hat{y}$ = 3.525 + 0.475x. PTS: 1 REF: 531-533 | 535 BLM: Higher Order - Analyze

TOP: 1–5

106. Refer to Sleep Deprivation Narrative. Use a statistical software to construct the ANOVA table for the linear regression. ANS:

ANOVA Table
Source       df   SS      MS      F          p-value
Regression   1    72.2    72.2    14.36816   0.005308
Residual     8    40.2    5.025
Total        9    112.4

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

107. Refer to Sleep Deprivation Narrative. Do the data present sufficient evidence at the 1% level of significance to indicate that the number of errors is linearly related to the number of hours without sleep? Identify the two test statistics in the printout that can be used to answer this question. ANS: The test of $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$ is performed using one of two test statistics: t = 3.7905 or F = 14.36816 with p-value = 0.005308. Since the p-value is smaller than $\alpha$ = 0.01, $H_0$ is rejected, and the results are declared highly significant. There is evidence to indicate that x and y are linearly related. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

108. Refer to Sleep Deprivation Narrative. Use the prediction equation to predict the number of errors for a person who has not slept for ten hours. ANS: When x = 10, $\hat{y}$ = 3.525 + 0.475(10) = 8.275. PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

109. Refer to Sleep Deprivation Narrative. Would you expect the relationship between y and x to be linear if x varied over a wider range (say, x = 4 to x = 48)? ANS: If a person is deprived of sleep for as much as 48 hours, their number of errors will probably become extremely high. The relationship will not remain linear, but will become curvilinear. PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

110. Refer to Sleep Deprivation Narrative. How do you describe the strength of the relationship between y and x? ANS: The coefficient of determination, $r^2$, can be calculated as $r^2$ = SSR/Total SS = 72.2/112.4 = 0.642. That is, 64.2% of the total variation in the experiment can be explained by the independent variable x. The total variation in y is reduced by 64.2% by using $\hat{y}$ rather than $\bar{y}$ to predict the response y. This is a relatively strong relationship. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5

111. Refer to Sleep Deprivation Narrative. What is the best estimate of the common population variance $\sigma^2$?

ANS: The population variance $\sigma^2$ is estimated using $s^2$ = MSE = 5.025.

PTS: 1 REF: 540 BLM: Higher Order - Analyze

TOP: 1–5

112. Refer to Sleep Deprivation Narrative. Find a 95% confidence interval for the slope of the line. ANS: The standard error of the slope (the x row of the printout) is 0.1253. Hence, the 95% confidence interval for the slope is $b \pm t_{0.025}\,SE(b)$ = 0.475 ± 2.306(0.1253). This implies that $\beta$ lies within 0.475 ± 0.289, or equivalently 0.186 < $\beta$ < 0.764.

PTS: 1 REF: 541-542 | 722 BLM: Higher Order - Analyze

TOP: 1–5
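Question 107 notes that either the t statistic or the F statistic can be used for the same test. The sketch below illustrates why, assuming the x and y values exactly as they appear in the data table of the narrative (two error counts at each of five sleep-deprivation levels); all names are illustrative.

```python
import math

# Values as listed in the narrative's table: two subjects per x level
hours = [9, 9, 13, 13, 17, 17, 21, 21, 25, 25]
errors = [9, 7, 7, 11, 9, 15, 15, 13, 17, 13]
n = len(hours)

x_bar = sum(hours) / n
y_bar = sum(errors) / n
s_xx = sum((x - x_bar) ** 2 for x in hours)
s_yy = sum((y - y_bar) ** 2 for y in errors)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(hours, errors))

b = s_xy / s_xx
a = y_bar - b * x_bar

ssr = s_xy ** 2 / s_xx       # regression sum of squares
sse = s_yy - ssr             # error sum of squares
mse = sse / (n - 2)          # estimate of sigma^2

t_stat = b / math.sqrt(mse / s_xx)   # t test of H0: beta = 0
f_stat = ssr / mse                   # F test of the same hypothesis

print(f"y-hat = {a:.3f} + {b:.3f}x")
print(f"t = {t_stat:.4f}, t^2 = {t_stat ** 2:.5f}, F = {f_stat:.5f}")
```

In simple linear regression the two statistics always satisfy F = t², so both lead to the same p-value and the same decision.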

Antibiotic Potency Narrative An experiment was conducted to observe the effect of an increase in temperature on the potency of an antibiotic. Three 25 gram portions of the antibiotic were stored for equal lengths of time at each of these temperatures: 0°C, 11°C, 22°C, and 33°C. The potency readings observed at each temperature of the experimental period are listed here:

Potency Readings, y   41, 45, 31   34, 28, 35   21, 29, 25   16, 21, 23
Temperature, x        0            11           22           33

113. Refer to Antibiotic Potency Narrative. Use the computing formulas to find the least-squares line appropriate for these data. ANS: $S_{xx} = 1815$, $S_{xy} = -1061.5$, and $S_{yy} = 814.9167$. Then $b = S_{xy}/S_{xx} = -0.5848$ and $a = \bar{y} - b\bar{x} = 38.733$. The least-squares regression line is $\hat{y} = 38.733 - 0.5848x$. PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

114. Refer to Antibiotic Potency Narrative. Plot the points and graph the line as a check on your calculations for the previous question. ANS: See the scatterplot.

PTS: 1 REF: 534 BLM: Higher Order - Apply

TOP: 1–5

115. Refer to Antibiotic Potency Narrative. Use an appropriate statistical software program to construct the ANOVA table for linear regression. ANS:

ANOVA Table
Source       df   SS         MS         F        p-value
Regression   1    620.8167   620.8167   31.984   0.000211
Residual     10   194.1      19.41
Total        11   814.9167

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

116. Refer to Antibiotic Potency Narrative. Do the data provide sufficient evidence to indicate that potency of an antibiotic is linearly related to the increase in temperature? Test at the 1% level of significance. ANS: The hypotheses to be tested are $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$.

Since the p-value = 0.000211 is smaller than 0.01, $H_0$ is rejected, and the results are declared highly significant. There is evidence to indicate that potency of an antibiotic is linearly related to the increase in temperature. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5
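The ANOVA table for the antibiotic data can be rebuilt from the raw readings. This is a minimal sketch, assuming the twelve potency readings and the four temperature levels shown in the narrative's table (three readings per level); the variable names are illustrative, and no p-value is computed because that would need a statistics library.

```python
# Three potency readings at each temperature level, as listed in the narrative
temps = [0] * 3 + [11] * 3 + [22] * 3 + [33] * 3
potency = [41, 45, 31, 34, 28, 35, 21, 29, 25, 16, 21, 23]
n = len(temps)

x_bar = sum(temps) / n
y_bar = sum(potency) / n
s_xx = sum((x - x_bar) ** 2 for x in temps)
s_yy = sum((y - y_bar) ** 2 for y in potency)   # Total SS
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(temps, potency))

ssr = s_xy ** 2 / s_xx     # Regression SS, 1 df
sse = s_yy - ssr           # Residual SS, n - 2 df
mse = sse / (n - 2)
f_stat = ssr / mse

print(f"Regression  df=1   SS={ssr:.4f}  MS={ssr:.4f}  F={f_stat:.3f}")
print(f"Residual    df={n - 2}  SS={sse:.4f}  MS={mse:.4f}")
print(f"Total       df={n - 1}  SS={s_yy:.4f}")
```

The slope implied by the same sums, `s_xy / s_xx`, is negative, which matches the decreasing potency at higher temperatures.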

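117. Refer to Antibiotic Potency Narrative. Estimate the change in potency for a one-unit change in temperature. Use a 95% confidence interval. ANS: The 95% confidence interval for $\beta$ is $b \pm t_{0.025}\,SE(b) = -0.5848 \pm 2.228(0.1034) = -0.5848 \pm 0.2304$, or $-0.815 < \beta < -0.354$.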

PTS: 1 REF: 541-542 | 722 BLM: Higher Order - Analyze

TOP: 1–5

118. Refer to Antibiotic Potency Narrative. Estimate the mean potency corresponding to a temperature of 10°C. Use a 95% confidence interval.

ANS: When x = 10, the estimate of mean potency E(y) is $\hat{y}$ = 38.733 – 0.5848(10) = 32.885 and the 95% confidence interval is 32.885 ± 3.205, or 29.68 < E(y) < 36.09.

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

119. Refer to Antibiotic Potency Narrative. Suppose that a batch of the antibiotic was stored at 10°C for the same length of time as the experimental period. Predict the potency of the batch at the end of the storage period. Use a 95% prediction interval. ANS: The predictor for y when x = 10 is $\hat{y}$ = 32.885 and the 95% prediction interval is 32.885 ± 10.326, or 22.56 < y < 43.21. PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

Income and Height Narrative
Do tall men earn more than short ones? An economist collected the data shown below for 25 men, where y is the annual income in thousands of dollars and x is the height of the income earner in cm.

x 163 160 157 175 183 170 173 183 175 178 183 178 178
y 11 14 16 41 42 47 51 54 61 63 65 71 78

x 180 188 185 180 185 188 191 183 191 196 188 185
y 78 83 84 86 103 111 112 113 121 127 130 132

120. Refer to Income and Height Narrative. Calculate the preliminary sums of squares and cross-products, $S_{xx}$, $S_{yy}$, and $S_{xy}$. ANS:

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

121. Refer to Income and Height Narrative. Determine the least-squares estimates of the regression parameters, and the least-squares regression line. ANS: b = 3.2794, a = –514.01, and $\hat{y}$ = –514.01 + 3.2794x

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

122. Refer to Income and Height Narrative. Construct the scatterplot and plot the fitted line on the scatterplot. Does the assumption of a linear relationship appear to be reasonable? ANS:

It seems that a linear relationship between annual income and height of earner is reasonable. PTS: 1 REF: 562 BLM: Higher Order - Evaluate

TOP: 1–5

123. Refer to Income and Height Narrative. Predict the annual income for a 6-foot-tall man. ANS:

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

124. Refer to Income and Height Narrative. Construct the ANOVA table for the linear regression. ANS: ANOVA Table

PTS: 1 REF: 534-536 BLM: Higher Order - Apply

TOP: 1–5

125. Refer to Income and Height Narrative. Do the data present sufficient evidence to indicate that annual income and height of income earner are linearly related? Use the F test at the 5% level of significance. ANS: $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$. The observed value of the test statistic is F = 68.245, and the critical value of F with df = 1 and 23 for numerator and denominator, respectively, is 4.28; therefore, we reject $H_0$ and conclude that the data present sufficient evidence to indicate that annual income and height of income earner are linearly related. PTS: 1 REF: 542-544 | 725-732 BLM: Higher Order - Evaluate

TOP: 1–5

126. Refer to Income and Height Narrative. Do the data present sufficient evidence to indicate that annual income and height of income earner are linearly related? Use the t test at the 5% level of significance. ANS: $H_0: \beta = 0$ vs. $H_a: \beta \neq 0$. The observed value of the test statistic is calculated as $t = b/SE(b)$ = 8.261. At $\alpha$ = 0.05 and df = n – 2 = 23, we reject $H_0$ when t > 2.069 or t < –2.069. Since t = 8.261, we reject $H_0$ and conclude that the data present sufficient evidence to indicate that annual income and height of income earner are linearly related. PTS: 1 REF: 540-542 | 722 BLM: Higher Order - Evaluate

TOP: 1–5

127. Refer to Income and Height Narrative. Compare the observed value of the F statistic with that of the t statistic. What is the relationship between the two values? ANS:

PTS: 1 REF: 541-544 | 722 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

128. Refer to Income and Height Narrative. Compare the two-tailed critical value for the t test with the critical value for the F statistic. What is the relationship between the two values? ANS: $t_{0.025}^2 = (2.069)^2 = 4.28 = F_{0.05}$. This is no accident and results from the fact that the square of a t statistic with df = n – 2 has the same distribution as an F statistic with numerator df = 1 and denominator df = n – 2.

PTS: 1 REF: 541-544 | 722 | 725-732 BLM: Higher Order - Analyze

TOP: 1–5

129. Refer to Income and Height Narrative. Calculate the coefficient of determination $r^2$. What information does this value give about the usefulness of the linear regression model?

ANS: $r^2$ = 0.7479. This can be interpreted as follows: a 74.79% reduction in the total variation of annual income (y) is obtained by using the regression line $\hat{y}$ = –514.01 + 3.2794x, instead of ignoring the height of income earner (x) and using the sample mean $\bar{y}$ to predict the response variable y. The regression model seems to be working very well. PTS: 1 REF: 543 BLM: Higher Order - Analyze

TOP: 1–5
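The relationships among t, F, and the coefficient of determination in questions 125 to 129 can be checked numerically. A short sketch, assuming only the statistics quoted in those answers (t = 8.261, F = 68.245, critical values 2.069 and 4.28, residual df = 23); the identity r² = F / (F + df) follows from F = SSR / (SSE / df) and is stated here as a derived check, not as something taken from the test bank.

```python
# Statistics quoted in the answers to questions 125-128
t_stat = 8.261
f_stat = 68.245
t_crit = 2.069      # two-tailed t critical value, alpha = 0.05, 23 df
f_crit = 4.28       # F critical value, alpha = 0.05, df = (1, 23)
df_resid = 23       # n - 2 with n = 25

# In simple linear regression, squaring t gives F (up to rounding)
print(f"t^2 = {t_stat ** 2:.3f} vs F = {f_stat}")
print(f"t_crit^2 = {t_crit ** 2:.3f} vs F_crit = {f_crit}")

# r^2 recovered from F: SSR/SSE = F/df, so r^2 = F / (F + df)
r_squared = f_stat / (f_stat + df_resid)
print(f"r^2 = {r_squared:.4f}")
```

The small differences between the squared t values and the F values come from rounding in the quoted statistics.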

Sunshine and Skin Cancer Narrative A medical statistician wanted to examine the relationship between the amount of sunshine (x) in hours, and incidence of skin cancer (y). As an experiment, he found the number of skin cancers detected per 100,000 of population and the average daily sunshine in eight counties around the country. These data are shown below:

Average Daily Sunshine    5   7    6   7    8    6    4   3
Skin Cancer per 100,000   7   11   9   12   15   10   7   5

130. Refer to Sunshine and Skin Cancer Narrative. Determine the least-squares regression line. ANS: $\hat{y}$ = –1.115 + 1.846x PTS: 1 REF: 531-533 BLM: Higher Order - Analyze

TOP: 1–5

131. Refer to Sunshine and Skin Cancer Narrative. Draw a scatter diagram of the data and plot the least-squares regression line on it. ANS: See Average Daily Sunshine Line Fit Plot

PTS: 1 REF: 534 BLM: Higher Order - Apply

TOP: 1–5

132. Refer to Sunshine and Skin Cancer Narrative. Estimate the number of skin cancers per 100,000 of population for six hours of sunshine. ANS: When x = 6, $\hat{y}$ = –1.115 + 1.846(6) = 9.961

PTS: 1 REF: 531-533 BLM: Higher Order - Apply

TOP: 1–5

133. Refer to Sunshine and Skin Cancer Narrative. What does the value of the slope of the regression line tell you? ANS: If the amount of sunshine x increases by one hour, the amount of skin cancer y increases by an average of 1.846 per 100,000 of population. PTS: 1 REF: 531-533 BLM: Higher Order - Understand

TOP: 1–5

134. Refer to Sunshine and Skin Cancer Narrative. Calculate the residual corresponding to the pair (x, y) = (8, 15). ANS: e = y – $\hat{y}$ = 15 – 13.653 = 1.347

PTS: 1 REF: 531 BLM: Higher Order - Apply

TOP: 1–5

Willie Nelson Concert Narrative At a recent Willie Nelson concert, a survey was conducted that asked a random sample of 20 people their age and how many concerts they have attended since the first of the year. The following data were collected:

Age                  62   57   40   49   67   54   43   65   54   41
Number of Concerts   6    5    4    3    5    5    2    6    3    1

Age                  44   48   55   60   59   63   69   40   38   52
Number of Concerts   3    2    4    5    4    5    4    2    1    3

An Excel output follows:

135. Refer to Willie Nelson Concert Narrative. Use the regression equation to determine the predicted values of y.

ANS: The predicted values are 4.781 4.153 2.016 3.147 5.410 3.776 2.393 5.158 3.776 2.142 2.519 3.022 3.901 4.530 4.404 4.907 5.661 2.016 1.765 3.524 PTS: 1 REF: 533 BLM: Higher Order - Apply

TOP: 6–8

136. Refer to Willie Nelson Concert Narrative. Use the predicted values and the actual values of y to calculate the residuals. ANS:

The residuals are 1.219 0.847 1.984 –0.147 –0.410 1.224 –0.393 0.842 –0.776 –1.142 0.481 –1.022 0.099 0.470 –0.404 0.093 –1.661 –0.016 –0.765 –0.524 PTS: 1 REF: 531 BLM: Higher Order - Apply

TOP: 6–8

137. Refer to Willie Nelson Concert Narrative. Plot the residuals against the predicted values $\hat{y}$. ANS: See Residuals versus Predicted chart.

PTS: 1 REF: 548-550 BLM: Higher Order - Apply

TOP: 6–8

138. Refer to Willie Nelson Concert Narrative. Does it appear that nonconstant error variance (heteroscedasticity) is a problem? Explain. ANS: There is no evidence to indicate that the variance of the error variable is not constant; therefore, heteroscedasticity is not a problem. PTS: 1 REF: 548-550 BLM: Higher Order - Evaluate

TOP: 6–8

139. Refer to Willie Nelson Concert Narrative. Draw a histogram of the residuals. ANS: See Residuals versus Predicted chart.

PTS: 1 REF: 30 | 548-550 BLM: Higher Order - Apply

TOP: 6–8

140. Refer to Willie Nelson Concert Narrative. Does it appear that the errors are normally distributed? Explain. ANS: The histogram is positively skewed. The errors may not be normally distributed. PTS: 1 REF: 548-550 BLM: Higher Order - Analyze

TOP: 6–8

Oil Quality and Price Narrative Quality of oil is measured in API gravity degrees; the higher the degrees API, the higher the quality. The table shown below was produced by an expert in the field who believes that there is a relationship between quality and price per barrel.

Oil Degrees API           27.0 28.5 30.8 31.3 31.9 34.5 34.0 34.7 37.0 41.0 41.0 38.8 39.3
Price per Barrel (in $)   12.02 12.04 12.32 12.27 12.49 12.70 12.80 13.00 13.00 13.17 13.19 13.22 13.27

A partial MINITAB output follows:

Descriptive Statistics
Variable   N    Mean     StDev   SE Mean
Degrees    13   34.60    4.613   1.280
Price      13   12.730   0.457   0.127

Covariances
          Degrees     Price
Degrees   21.281667
Price     2.026750    0.208833

Regression Analysis
Predictor   Coef       StDev      T       P
Constant    9.4349     0.2867     32.91   0.000
Degrees     0.095235   0.008220   11.59   0.000

S = 0.1314 R-Sq = 92.46% R-Sq(adj) = 91.7%

Analysis of Variance
Source           DF   SS       MS       F        P
Regression       1    2.3162   2.3162   134.24   0.000
Residual Error   11   0.1898   0.0173
Total            12   2.5060

141. Refer to Oil Quality and Price Narrative. Use the equation $\hat{y}$ = 9.4349 + 0.095235x to determine the predicted values of y.

ANS: The predicted values are 12.006, 12.149, 12.368, 12.416, 12.473, 12.721, 12.673, 12.740, 12.959, 13.340, 13.340, 13.130, and 13.178. PTS: 1 REF: 533 BLM: Higher Order - Apply

TOP: 6–8

142. Refer to Oil Quality and Price Narrative. Use the predicted values and the actual values of y to calculate the residuals. ANS: The residuals are 0.014, –0.109, –0.048, –0.146, 0.017, –0.021, 0.127, 0.260, 0.041, –0.170, –0.150, 0.090, and 0.092. PTS: 1 REF: 531 BLM: Higher Order - Apply

TOP: 6–8

143. Refer to Oil Quality and Price Narrative. Plot the residuals against the predicted values.

ANS:

PTS: 1 REF: 548-550 BLM: Higher Order - Apply

TOP: 6–8

144. Refer to Oil Quality and Price Narrative. Does it appear that nonconstant error variance (heteroscedasticity) is a problem? Explain.

ANS: There is no evidence to indicate that the variance of the error variable is not constant; therefore, heteroscedasticity is not a problem. PTS: 1 REF: 548-550 BLM: Higher Order - Evaluate

TOP: 6–8

145. Refer to Oil Quality and Price Narrative. Draw a histogram of the residuals. ANS:

PTS: 1 REF: 30 | 531 | 548-550 BLM: Higher Order - Apply

TOP: 6–8

146. Refer to Oil Quality and Price Narrative. Does it appear that the errors are normally distributed? Explain. ANS: The histogram is fairly symmetric; therefore, we may conclude that the errors are normally distributed. PTS: 1 REF: 548-550 BLM: Higher Order - Analyze

TOP: 6–8

Circumference and Age Narrative Evidence supports using a simple linear regression model to estimate the circumference of a pine tree based on its age. Let x be the age of the pine tree (measured in years) and y be the circumference (measured in cm). A random sample of 11 mature pine trees was selected and the following data recorded:

x
y    105  150  120  130  107  123  118  135  110  145  170

The following output was generated using statistical software:

Regression Analysis
The regression equation is y = –148 + 4.18x

Predictor    Coef       StDev     T        P
Constant     –147.64    17.78     –8.30    0.000
x            4.1774     0.2684    15.56    0.000

S = 4.025 R-Sq = 96.4% R-Sq(adj) = 96.0%

Analysis of Variance Table
Source            DF    SS        MS        F        p
Regression         1    3924.9    3924.9    242.23   0.000
Residual Error     9    145.8     16.2
Total             10    4070.7

Unusual Observations
Obs    x       y         Fit       StDev Fit    Residual    St Resid
2      73.0    150.00    157.32    2.22         –7.32       –2.18R
11     74.0    170.00    161.49    2.45         8.51        2.66R

R denotes an observation with a large standardized residual

147.

Refer to Circumference and Age Narrative. Consider the following residual plot of the residuals versus the fitted values. What conclusion can be drawn from the plot?

ANS: The residual plot shows a definite cyclical pattern, indicating that the errors are not independent. PTS: 1 REF: 548-550 BLM: Higher Order - Evaluate 148.

TOP: 6–8

Refer to Circumference and Age Narrative. Consider the following normal probability plot of the residuals. What conclusion can be drawn from the plot? Justify your answer.

ANS: The normal probability plot indicates a departure from normality of the errors since the plot is not even close to a straight line.

PTS: 1 REF: 548-550 BLM: Higher Order - Evaluate 149.

TOP: 6–8

Refer to Circumference and Age Narrative. Based on the plots in the previous two questions, should you use the model in the computer printout to predict circumference? Justify your answer. ANS: No, since the basic assumptions of independent and normally distributed errors have been violated. PTS: 1 REF: 548-550 BLM: Higher Order - Evaluate

150.

TOP: 6–8

Six points have these coordinates:

x    1     2     3     4     5     6
y    6.1   5.1   5.0   4.2   3.7   3.2

The normal probability plot and the residuals versus fitted values plots generated by statistical software are shown below. Does it appear that any regression assumptions have been violated? Explain. ANS: Although there is one data point in each graph that appears somewhat unusual, there is no reason to doubt the validity of the regression assumptions. PTS: 1 REF: 548-550 BLM: Higher Order - Analyze

TOP: 6–8
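The residual plots referred to in the six-point question are not reproduced here, but the residuals themselves can be recomputed from the data. The following is a minimal sketch, assuming an ordinary least-squares line is fitted to the six points; the variable names are illustrative and not taken from the text.

```python
# Least-squares fit and residuals for the six (x, y) points in the question.
xs = [1, 2, 3, 4, 5, 6]
ys = [6.1, 5.1, 5.0, 4.2, 3.7, 3.2]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

# Sums of squares and cross products about the means.
s_xx = sum((x - x_bar) ** 2 for x in xs)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Fitted values and residuals (observed minus fitted).
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

for x, f, r in zip(xs, fitted, residuals):
    print(f"x={x}  fitted={f:.3f}  residual={r:+.3f}")
```

Plotting `residuals` against `fitted`, or sorting the residuals for a normal probability plot, gives the two diagnostic displays the question asks about.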

Income and Attractiveness Narrative In order to determine whether good looks translate into heftier paycheques, an economist collected the data shown below on annual income of doctors (y) in thousands of dollars and attractiveness (x) as recorded on a scale from 1 to 5, based on a panel’s rating of head-and-shoulder photographs.

151.

x y

1 50

2 65

1 75

x y

4 385

4 410

4 415

1 20 4 425

1 205

2 230

3 510

3 250

5 550

2 265 3 555

1 300 5 560

3 310 5 600

3 320

4 350

3 385

5 630

5 645

5 655

Refer to Income and Attractiveness Narrative. Construct a scatterplot and comment on the relationship between income of doctors and attractiveness. ANS:

The scatterplot reveals that a direct relationship exists between annual income and attractiveness of doctors. This suggests that further analysis may be worthwhile. PTS: 1 REF: 562 BLM: Higher Order - Analyze 152.

TOP: 6–8

Refer to Income and Attractiveness Narrative. The normal probability plot is shown below. Does it appear that the normality regression assumption has been violated? Explain.

ANS: The graph does not suggest a violation of the normality assumption. When this assumption is met and y values at any given x are normally distributed, the residuals are normally distributed as well. In that case, a normal probability plot approximates a straight line.

PTS: 1 REF: 548-550 BLM: Higher Order - Analyze 153.

TOP: 6–8

Refer to Income and Attractiveness Narrative. The plot of residuals versus the fitted values is shown below. Does it appear that the constant variance regression assumption has been violated? Explain.

ANS: The graph does not suggest a violation of the constant variance assumption. When this assumption is met, so that the conditional distributions of y have the same standard deviation at every value of x, a plot of each residual value against the associated fitted value should indicate a similar variation of residuals throughout the range of these values. A marked pattern of ever-widening or ever-narrowing scatter, not shown here, would point to a violation of the constant variance assumption. PTS: 1 REF: 548-550 BLM: Higher Order - Analyze

TOP: 6–8

Forest Age and Tree Diameter Narrative A scientist is studying the relationship between the age of a forest, x, in years and the average diameter of the trees, y, in cm. One study reported the following data.

x    15   30   25   120   60   40   150   100   175
y    17   21   20    51   32   25    62    47    68

154.

Refer to Forest Age and Tree Diameter Narrative. Develop a 95% confidence interval for the average value of y when x = 83. ANS:

$S_{xx} = (n - 1)s_x^2$ = 27672.224, $\bar{x}$ = 79.444. The 95% confidence interval for the average value of y when x = 83 is

$\hat{y} \pm t_{0.025}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$ = 39.28 ± 0.866 = (38.414, 40.146).

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 155.

TOP: 6–8

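Refer to Forest Age and Tree Diameter Narrative. Develop a 95% prediction interval of y when x = 83. ANS:

$\hat{y} \pm t_{0.025}\sqrt{\mathrm{MSE}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$ = 39.28 ± 2.733 = (36.547, 42.013)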

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 156.

TOP: 6–8
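The two Forest Age and Tree Diameter intervals can be checked numerically. The sketch below assumes the usual simple linear regression formulas for the interval around the mean response and around a single new observation, and it hard-codes the critical value 2.365 (the tabled t value for 0.025 in the upper tail with n – 2 = 7 degrees of freedom) rather than computing it. All variable names are illustrative.

```python
import math

# Forest age (x, years) and average tree diameter (y, cm) from the narrative.
x = [15, 30, 25, 120, 60, 40, 150, 100, 175]
y = [17, 21, 20, 51, 32, 25, 62, 47, 68]
x0 = 83          # age at which the intervals are wanted
t_crit = 2.365   # assumption: t table value, 0.025 upper tail, 7 df

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

sse = s_yy - s_xy ** 2 / s_xx
mse = sse / (n - 2)

y_hat = intercept + slope * x0
leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx

ci_half = t_crit * math.sqrt(mse * leverage)        # mean response
pi_half = t_crit * math.sqrt(mse * (1 + leverage))  # single new observation

print(f"y_hat = {y_hat:.2f}")
print(f"95% CI: {y_hat - ci_half:.3f} to {y_hat + ci_half:.3f}")
print(f"95% PI: {y_hat - pi_half:.3f} to {y_hat + pi_half:.3f}")
```

The prediction interval is wider than the confidence interval because of the extra 1 inside the square root, which accounts for the variability of an individual observation around the mean response.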

Let x be the number of vending machines and let y be the time (in hours) it takes to stock them. The data are as follows.

x    4    8    10    13    16    11    5     9     18    14    6     12    20
y    1    2    2.5   6     8     3.5   1.3   2.5   9     6     1.4   5     10.5

Develop a 95% confidence interval for the average value of y when x = 7. ANS: $S_{xx} = (n - 1)s_x^2$ = 292.308, $\bar{x}$ = 11.2308, MSE = 0.379. The 95% confidence interval for the average value of y when x = 7 is

$\hat{y} \pm t_{0.025}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$ = 1.8573 ± 0.5037 = (1.3536, 2.3610)

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze Soft Drink Sales Narrative

TOP: 6–8

A soft drink vendor, set up near a beach for the summer, was interested in examining the relationship between sales of soft drinks (in litres per day) and maximum temperature of the day. The vendor records the following data for a random sample of eight days in the summer, where y = number of litres of soft drinks sold per day and x = maximum temperature, in degrees Celsius, recorded for the day:

x    32   35   38   37   31   36   39   31
y    28   32   38   35   25   35   39   25

The following summary information was computed: In addition, the following partial output was generated using statistical software: The regression equation is sales = –28.460 + 1.7372 temp, MSE = 0.496767 157.

Refer to Soft Drink Sales Narrative. Find a 95% confidence interval for the mean value of y (sales) when the maximum temperature is 34°C.

ANS: n = 8, $\bar{x}$ = 279/8 = 34.875, df = n – 2 = 6, $t_{0.025}$ = 2.447, MSE = 0.49676, $S_{xx}$ = 9801 – 9730.125 = 70.875. When x = 34, $\hat{y}$ = –28.460 + 1.7372(34) = 30.6049. The 95% confidence interval for the average value of y when x = 34°C is

$\hat{y} \pm t_{0.025}\sqrt{\mathrm{MSE}\left(\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$ = 30.6049 ± 0.6356 = (29.969, 31.242).

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 158.

TOP: 6–8

Refer to Soft Drink Sales Narrative. Find a 95% prediction interval for some value of y (sales) to be observed in the future when the maximum temperature is 34°C.

ANS: The 95% prediction interval of y when x = 34°C is

$\hat{y} \pm t_{0.025}\sqrt{\mathrm{MSE}\left(1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}\right)}$ = 30.6049 ± 1.8381 = (28.767, 32.443). PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

Microwave Sales Narrative A microwave oven manufacturer has collected the data shown below on number of units sold (y) in the thousands of dollars and the number of ads (x) placed during the month.

159.

x    88   10   26   26   40   190   185   130   116   185
y     6    3    7    6   17    18     8     8    12    16

x    148   67   90   145   177   60   71   101   190   170
y      6    6   15     9    17    8   10    11    15    15

Refer to Microwave Sales Narrative. Calculate the preliminary sums of squares and cross products, ANS:

PTS: 1 REF: 531-533 BLM: Higher Order - Apply 160.

TOP: 6–8

Refer to Microwave Sales Narrative. Calculate the quantities SSE and MSE. ANS:

MSE = SSE/(n – 2) = 283.5035/18 = 15.753 PTS: 1 REF: 534-536 BLM: Higher Order - Apply 161.

TOP: 6–8

Refer to Microwave Sales Narrative. Determine the least-squares estimates of the regression parameters, and the least-squares regression line. ANS:

PTS:

REF: 531-533

TOP: 6–8

BLM: Higher Order - Apply 162.

Refer to Microwave Sales Narrative. Compute a point estimate of number of units sold if there are 140 ads. ANS: units sold. PTS: 1 REF: 531-533 BLM: Higher Order - Apply

163.

TOP: 6–8

Refer to Microwave Sales Narrative. Compute the standard error of the point estimate of number of units sold if there are 140 ads. ANS:

PTS: 1 REF: 554 BLM: Higher Order - Apply 164.

TOP: 6–8

Refer to Microwave Sales Narrative. Compute a 95% confidence interval for the average number of units sold in all months with 140 ads. ANS:

or from 9.781 to 13.935. PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 165.

TOP: 6–8

Refer to Microwave Sales Narrative. Compute a 95% prediction interval for sales during the next month that happens to be associated with 140 ads. ANS:

, or from 3.265 to 20.451. PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze Special Programs Narrative

TOP: 6–8

A social skills training program was implemented with seven students with disabilities in a study to determine whether the program caused improvement in pre/post measures and behaviour ratings. For one such test, the pre- and posttest scores for the seven students are given in the table.

Subject    Pretest    Posttest
Andy       105        117
Bianca      93         93
Carl       116        125
Daniel     109        103
Earl        94        108
Faith       95         98
Gina        93        103

166.

Refer to Special Programs Narrative. What type of correlation, if any, do you expect to see between the pre- and posttest scores? Plot the data. Does the correlation appear to be positive or negative? ANS: When the pretest score x is high, the posttest score y should also be high. There should be a positive correlation, as shown on the scatterplot below.

PTS: 1 REF: 562 BLM: Higher Order - Analyze 167.

TOP: 6–8

Refer to Special Programs Narrative. Calculate the correlation coefficient r. Is there a significant positive correlation between x and y? Explain. ANS:

Calculate $S_{xy}$, $S_{xx}$, and $S_{yy}$. Then $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$. The hypotheses to be tested are $H_0: \rho = 0$ versus $H_a: \rho > 0$, and the test statistic is $t = r\sqrt{(n - 2)/(1 - r^2)}$. The rejection region for $\alpha$ = 0.05 is $t > t_{0.05}$ = 2.015 and $H_0$ is rejected. There is sufficient evidence to indicate positive correlation.

PTS: 1 REF: 543 | 562-563 | 722 BLM: Higher Order - Analyze

TOP: 6–8
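The correlation coefficient and test statistic for the Special Programs data can be recomputed directly from the seven score pairs. This is a minimal sketch, assuming the standard t test for a positive correlation with n – 2 degrees of freedom; the critical value 2.015 is taken from the answer key, and the variable names are illustrative.

```python
import math

# Pretest (x) and posttest (y) scores for the seven students.
pre = [105, 93, 116, 109, 94, 95, 93]
post = [117, 93, 125, 103, 108, 98, 103]

n = len(pre)
x_bar = sum(pre) / n
y_bar = sum(post) / n

s_xx = sum((x - x_bar) ** 2 for x in pre)
s_yy = sum((y - y_bar) ** 2 for y in post)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(pre, post))

# Sample correlation coefficient.
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0 against Ha: rho > 0.
t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))

t_crit = 2.015  # from the answer key: upper 0.05 point of t with 5 df
reject_h0 = t_stat > t_crit

print(f"r = {r:.4f}, t = {t_stat:.3f}, reject H0: {reject_h0}")
```

Because the alternative is one-sided, only large positive values of t fall in the rejection region.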

TV Game Show Revenues Narrative An ardent fan of television game shows has observed that, in general, the more educated the contestant, the less money he or she wins. To test her belief, she gathers data about the last eight winners of her favourite game show. She records their winnings in dollars and the number of years of education. The results are as follows.

Contestant    Years of Education    Winnings
1             11                    750
2             15                    400
3             12                    600
4             16                    350
5             11                    800
6             16                    300
7             13                    650
8             14                    400

168.

Refer to TV Game Show Revenues Narrative. Predict with 95% confidence the winnings of a contestant who has 15 years of education. ANS: 397.500 ± 159.213. Thus, LCL = $238.287, and UCL = $556.713

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 169.

TOP: 6–8

Refer to TV Game Show Revenues Narrative. Predict with 95% confidence the winnings of a contestant who has ten years of education. ANS: 843.333 ± 179.971. Thus, LCL = $663.362, and UCL = $1023.304

PTS:

REF: 553-557 | 722

TOP: 6–8

BLM: Higher Order - Analyze 170.

Refer to TV Game Show Revenues Narrative. Estimate with 95% confidence the average winnings of all contestants who have 15 years of education. ANS: 397.500 ± 64.998. Thus, LCL = $332.502, and UCL = $462.498

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze 171.

TOP: 6–8

Refer to TV Game Show Revenues Narrative. Estimate with 95% confidence the average winnings of all contestants who have ten years of education. ANS: 843.333 ± 106.141. Thus, LCL = $737.192, and UCL = $949.474

PTS: 1 REF: 553-557 | 722 BLM: Higher Order - Analyze

TOP: 6–8

Chapter 13—Multiple Regression Analysis MULTIPLE CHOICE 1. The adjusted multiple coefficient of determination is adjusted for which of the following quantities? a. the number of regression parameters including b. the number of dependent variables and the sample size c. the number of predictor variables and the sample size d. the correlation coefficient ANS: C BLM: Remember

PTS:

REF: 584-585

TOP: 1–4

2. Suppose a regression analysis based on a model fitted to 15 observations produced SSE = 3.55, $\sum y$ = 13.131, and $\sum y^2$ = 125.1. In this case, what is the proportion of the total variability in y that is accounted for by the predictor terms in the model? a. 0.0312 b. 0.2704 c. 0.7296 d. 0.9688 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 582-584

TOP: 1–4

3. In a multiple regression analysis involving 24 data points, the mean squares for error, MSE, is 2, and the sum of squares for error, SSE, is 36. Under these circumstances, what must the number of the predictor variables be? a. 6 b. 5 c. 4 d. 3 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 582-583

TOP: 1–4

4. The test statistic F found in the ANOVA table for testing the usefulness of the regression model is given by which of the following expressions? a. MSR/MSE b. MSE/MSR c. Total SS/SSE d. SSR/SSE ANS: A BLM: Remember

PTS:

REF: 583-584

TOP: 1–4

5. In order to test the validity of a multiple regression model involving 4 predictor variables and 25 observations, what are the numerator and denominator degrees of freedom, respectively, for the critical value of F? a. 4 and 20 b. 4 and 25 c. 20 and 4 d. 25 and 4

ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 582-583

TOP: 1–4

6. In multiple regression analysis, what does the ratio MSR/MSE equal? a. the F-test statistic for testing the usefulness of the regression model b. the coefficient of determination c. the adjusted coefficient of determination (adj) d. the t-test statistic for testing each partial regression coefficient ANS: A BLM: Remember

PTS:

REF: 582-584

TOP: 1–4

7. In a multiple regression analysis involving 4 predictor variables and 25 observations, the total sum of squares is 800, and the error sum of squares is 200. What, then, is the value of the F-test statistic for testing the usefulness of this model? a. 200 b. 50 c. 32 d. 15 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 582-584

TOP: 1–4
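The arithmetic behind question 7 can be laid out step by step. The sketch below assumes the standard ANOVA decomposition for a model with k predictors and n observations (regression df = k, error df = n – k – 1); the function name is illustrative.

```python
def f_statistic(total_ss, sse, k, n):
    """F statistic for testing the overall usefulness of a regression model."""
    ssr = total_ss - sse          # regression sum of squares
    msr = ssr / k                 # mean square for regression, k df
    mse = sse / (n - k - 1)       # mean square for error, n - k - 1 df
    return msr / mse


# Question 7: Total SS = 800, SSE = 200, 4 predictors, 25 observations.
f_value = f_statistic(total_ss=800, sse=200, k=4, n=25)
print(f_value)
```

Here SSR = 600, MSR = 600/4 = 150, and MSE = 200/20 = 10, which gives the keyed answer of 15.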

8. For a multiple regression model the following statistics are given: Total SS = 400, SSR = 350, k = 4, and n = 20. Given this information, what is the coefficient of determination adjusted for degrees of freedom? a. 87.5% b. 84.17% c. 15.83% d. 12.5% ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4
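Question 8 can be verified with a short calculation. This is a sketch under the usual definition of the adjusted coefficient of determination, which divides each sum of squares by its degrees of freedom; the function name is illustrative.

```python
def adjusted_r2(total_ss, ssr, k, n):
    """Adjusted R-squared from Total SS, SSR, k predictors, and n observations."""
    sse = total_ss - ssr
    mse = sse / (n - k - 1)        # error mean square
    mst = total_ss / (n - 1)       # total mean square
    return 1 - mse / mst


# Question 8: Total SS = 400, SSR = 350, k = 4, n = 20.
r2 = 350 / 400
r2_adj = adjusted_r2(total_ss=400, ssr=350, k=4, n=20)
print(f"R-sq = {r2:.2%}, R-sq(adj) = {r2_adj:.2%}")
```

The unadjusted value is 87.5% (option a), while the adjustment for degrees of freedom lowers it to about 84.17% (option b).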

9. A multiple regression model involves 3 predictor variables and a sample of 20 data points. If we want to test the usefulness of the model at the 1% significance level, what is the critical value of the rejection region? a. 5.29 b. 5.93 c. 6.30 d. 7.09

ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 584

10. A multiple regression model has the form

TOP: 1–4

. As

increases by 1

unit, with and held constant, what is expected to happen to y on average? a. It will increase by 1 unit. b. It will decrease by 3 units. c. It will decrease by 4 units. d. It will increase by 10 units. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 580

TOP: 1–4

11. A multiple regression model involves 5 predictor variables and 30 observations. If we want to test the hypotheses at the 5% significance level, what will the critical value of the rejection region be? a. 1.699 b. 1.711 c. 2.045 d. 2.064 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 584-585

TOP: 1–4

12. To test the validity of a multiple regression model involving three predictor variables, which of the following is the best formulation of the null hypothesis to be tested? a. b. c. d. ANS: C BLM: Remember

PTS:

REF: 584

TOP: 1–4

13. For a multiple regression model, the following statistics are given: Total SS = 500, SSE = 60, and n = 20. In this case, what is the coefficient of determination, expressed as a percentage? a. 93.81% b. 88% c. 77.44% d. 12% ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

14. In a regression model involving 40 observations, the following estimated regression model was obtained: . For this model, the following statistics are given: SSR = 501 and SSE = 99. What is the value of MSR?

a. 12.525 b. 15 c. 33 d. 167 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 582-584

TOP: 1–4

15. Which of the following is the correct interpretation of the multiple-regression equation ? a. This equation gives us the estimated value of the dependent variable for any specified pair of values of the independent variables. b. The estimated regression coefficient represents the slope of the line. c. The estimated regression coefficient equals the change in estimated Y that is associated with a unit change in and . ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 579-581

TOP: 1–4

16. A three-variable multiple regression establishes an estimated multiple regression equation. Which of the following is a property of that equation? a. All the estimates derived from it fall on a surface that is called the regression plane. b. All the estimates derived from it fall on a surface that is called the regression plane, which is positioned among the sample points in such a way as to minimize the sum of the square horizontal deviations between these sample points and their associated estimates. c. All the estimates derived from it fall on a surface that is called the regression plane, which is positioned among the sample points in such a way as to minimize the sum of the squared horizontal deviations between these sample points and their associated estimates, all of which lie on this plane. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 580

TOP: 1–4

17. A three-variable multiple regression plane is positioned so as to minimize the sum of squared errors. This sum is given by which of the following expressions? a. b. c. ANS: B BLM: Remember

PTS:

REF: 581-582

TOP: 1–4

18. Which of the following correctly describes a p-value? a. It gives the partial change in Y for a unit change in an independent variable, while holding other independent variables constant. b. It equals the probability of being true, given the claim "The true regression coefficient equals 0." c. It is an index of the degree of linear association among more than two variables, equal to the square root of the sample coefficient of multiple determination. d. It equals the ratio of an estimated partial-regression coefficient to its standard error.

ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

19. Which of the following correctly describes a multiple-regression ANOVA test? a. It is a test of the significance of individual coefficients, rather than of the overall significance of the regression model. b. It amounts to testing the hypothesis that none of the true regression coefficients is 0 and that, therefore, all the independent predictors help explain the variation of the dependent variable. c. It is a test of the overall significance of a regression, rather than a test of the significance of individual coefficients. d. Both a and b. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 582-583

TOP: 1–4

20. Which of the following correctly describes the coefficient of multiple determination? a. It is denoted by $R^2$ and is interpreted as the proportion of the total variation that is explained by the regression of y on the predictor variables. b. It is computed as the ratio of total variation to explained variation. c. In the case of two independent variables, it describes how well the regression hyperplane fits the data. ANS: A BLM: Remember

PTS:

REF: 584

TOP: 1–4

21. Given a multiple regression with a regression sum of squares of 850 and a total sum of squares of 1000, what is the coefficient of multiple determination? a. 0.150 b. 0.387 c. 0.850 d. 0.925 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

22. Given MSR = 345 and MSE = 431.25, what is the value of the F statistic? a. 1.25 b. 0.994 c. 0.894 d. 0.800 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

23. In testing the validity of a multiple regression model, a large value of the F-test statistic is indicative of which of the following situations? a. Most of the variation in the independent variables is explained by the variation in y. b. Most of the variation in y is explained by the regression equation. c. Most of the variation in y is unexplained by the regression equation. d. The model provides a poor fit. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 583-584

TOP: 1–4

24. For which of the following quantities is the adjusted multiple coefficient of determination adjusted? a. the number of regression parameters including the y-intercept b. the number of dependent variables and the sample size c. the number of independent variables and the sample size d. the coefficient of correlation and the significance level ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 585

TOP: 1–4

25. In a multiple regression model, what is assumed to be true of the standard deviation of the error variable ? a. It is assumed to be constant for all values of the independent variables. b. It is assumed to be constant for all values of the dependent variable. c. It is assumed to be equal to 1.0. d. It is assumed to be less than 1.0. ANS: A BLM: Remember

PTS:

REF: 579

TOP: 1–4

26. In multiple regression analysis, which of the following is equal to the ratio MSR/MSE? a. the t-test statistic for testing each individual regression coefficient b. the F-test statistic for testing the validity of the regression equation c. the multiple coefficient of determination d. the adjusted multiple coefficient of determination ANS: B BLM: Remember

PTS:

REF: 584

TOP: 1–4

27. In order to test the validity of a multiple regression model involving 5 independent variables and 30 observations, what are the respective numerator and denominator degrees of freedom for the critical value of F? a. 5 and 24 b. 5 and 30 c. 6 and 25 d. 6 and 29

ANS: A PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

28. In multiple regression models, which of the following may be assumed with respect to the values of the error variable ? a. They are autocorrelated. b. They are dependent on each other. c. They are independent of each other. d. They are always positive. ANS: C BLM: Remember

PTS:

REF: 579

TOP: 1–4

29. A multiple regression model involves five independent variables and a sample of ten data points. If we want to test the validity of the model at the 5% significance level, what is the critical value? a. 9.36 b. 6.26 c. 4.24 d. 3.33 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 583-584

TOP: 1–4

30. To test the validity of a multiple regression model, which of the following tests do we use for the null hypothesis that the regression coefficients are all 0? a. t test b. z test c. F test d. R test ANS: C BLM: Remember

PTS:

REF: 584

TOP: 1–4

31. To test the validity of a multiple regression model involving two independent variables, which of the following is the most appropriate null hypothesis? a. b. c. d. ANS: B BLM: Remember

PTS:

32. A multiple regression model has the form

REF: 583-584

TOP: 1–4

. Which of the following is

the best interpretation of the coefficient ? a. It is the change in y per unit change in . b. It is the change in y per unit change in , holding constant. c. It is the change in y per unit change in , when and values are correlated. d. It is the change in the average value of y per unit change in , holding

constant. ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

33. A multiple regression analysis involving 3 independent variables and 25 data points results in a value of 0.769 for the unadjusted multiple coefficient of determination. What is the adjusted multiple coefficient of determination? a. 0.385 b. 0.591 c. 0.736 d. 0.877 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 585

TOP: 1–4

34. What is the range in values of the coefficient of multiple determination? a. from 1.0 to b. from 0.0 to 1.0 c. from 1.0 to k, where k is the number of independent variables in the model d. from 1.0 to n, where n is the number of observations in the dependent variable ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

35. In testing the validity of a multiple regression model involving 10 independent variables and 100 observations, what will the respective values of the numerator and denominator degrees of freedom for the critical value of F be? a. 9 and 10 b. 9 and 90 c. 10 and 89 d. 10 and 100 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 584

TOP: 1–4

36. In multiple regression analysis involving 10 independent variables and 100 observations, how many degrees of freedom will the critical value of t for testing individual coefficients in the model have? a. 100 degrees of freedom b. 89 degrees of freedom c. 10 degrees of freedom d. 9 degrees of freedom ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 585

TOP: 1–4

37. A multiple regression equation includes five independent variables, and the coefficient of determination is 0.81. What is the percentage of the variation in y that is explained by the regression equation? a. 90% b. 86% c. 81% d. about 16% ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 584

TOP: 1–4

38. In a simple linear regression problem, the following pairs of $(y, \hat{y})$ are given: (6.75, 7.42), (8.96, 8.06), (10.30, 11.65), and (13.24, 12.15). Which of these values is equal to the sum of squares for error? a. –0.0300 b. 4.2695 c. 39.2500 d. 39.2800 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 582-583

TOP: 1–4
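The sum of squares for error in question 38 is the sum of squared differences between each observed value and its fitted value. The sketch below assumes each pair is read as (observed, fitted); because the differences are squared, reading the pairs in the opposite order gives the same total.

```python
# Pairs from question 38, read here as (observed y, fitted y-hat).
pairs = [(6.75, 7.42), (8.96, 8.06), (10.30, 11.65), (13.24, 12.15)]

# Residual for each pair, then the sum of squared residuals.
residuals = [y - y_hat for y, y_hat in pairs]
sse = sum(e ** 2 for e in residuals)

print(f"sum of residuals = {sum(residuals):.4f}")
print(f"SSE = {sse:.4f}")
```

The unsquared residuals sum to –0.03 (option a), which is the distractor for anyone who forgets to square before adding.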

39. Which of the following best describes the model ? a. It is a third-order polynomial model. b. It is a second-order polynomial model. c. It is a first-order polynomial model since there is only one independent variable x. d. It is a quadratic model since the term will often be dropped out of the model based on sample information. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 587

TOP: 1–4

40. In a multiple regression model, if the residuals have a constant variance, which of the following should be evident? a. The residual mean should be close to 0, and the residual standard deviation should be close to 1.0. b. The plot of residuals against each independent variable should show that the spread in the residuals is about the same at all levels of the independent variables. c. The residuals should have a variance equal to 0 for all levels of the independent variable. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 605-607

TOP: 1–4

41. Which of the following methods is used to help assess whether the regression model meets the assumption of having normally distributed residuals? a. Develop a normal probability plot of the residuals. b. Develop a histogram of the residuals. c. both (a) and (b) d. neither (a) nor (b) ANS: C BLM: Remember

PTS:

REF: 586

TOP: 1–4

42. In which of the following situations is stepwise regression especially useful? a. when there are many independent variables b. when there are few independent variables c. when there are many dependent variables d. when there are few dependent variables ANS: A BLM: Remember

PTS:

REF: 607

TOP: 5–8

43. Stepwise regression is an iterative procedure that does which of the following? a. It adds one independent variable at a time. b. It deletes one independent variable at a time. c. It deletes one dependent variable at a time. d. It adds and deletes one independent variable at a time ANS: D BLM: Remember

PTS:

REF: 607

TOP: 5–8

44. In stepwise regression procedure, what occurs if two independent variables are highly correlated? a. Both variables will enter the equation. b. Only one variable will enter the equation. c. Neither variable will enter the equation. d. A third variable equal to their sum will replace them in the equation. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 607-609

TOP: 5–8

45. In multiple regression analysis, which procedure permits variables to enter and leave the model at different stages of its development? a. forward selection b. residual analysis c. backward elimination d. stepwise regression ANS: D BLM: Remember

PTS:

REF: 607

TOP: 5–8

46. How many dummy variables will you need to include if you wish to develop a regression model in which the high school class standing is a qualitative variable with four possible levels of response? a. 8 b. 5 c. 4 d. 3

ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 594-595

TOP: 5–8

47. Assume you are considering including two additional qualitative variables into a regression model. The first variable has four categories, and the second variable has four categories as well. Given this information, how many indicator variables will be incorporated into the model? a. 8 b. 7 c. 6 d. 5 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 594-595

TOP: 5–8
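The counting rule behind questions 46 and 47 is that a qualitative variable with m levels needs m – 1 indicator (dummy) variables, with the remaining level serving as the baseline. A minimal sketch, with an illustrative function name:

```python
def indicator_count(levels_per_variable):
    """Number of dummy variables needed for a list of qualitative variables.

    Each variable with m levels contributes m - 1 indicators; the omitted
    level acts as the baseline category.
    """
    return sum(levels - 1 for levels in levels_per_variable)


# Question 46: one qualitative variable with four levels.
one_variable = indicator_count([4])

# Question 47: two qualitative variables with four categories each.
two_variables = indicator_count([4, 4])

print(one_variable, two_variables)
```

Including a full set of m indicators alongside an intercept would make the predictors perfectly collinear, which is why one level is always dropped.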

48. Which, if any, of the following is an advantage of using stepwise regression compared to just entering all the independent variables at one time? a. There are no advantages of using stepwise regression over entering all the variables at one time. b. Stepwise regression allows us to observe the effects of multicollinearity earlier than when all variables are entered at one time. c. Stepwise regression will generally produce a model with a larger value for the coefficient of determination. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 607-609

TOP: 5–8

49. In regression analysis, to what does the term “multicollinearity” refer? a. the response variables being highly correlated with one another b. the predictor variables being highly correlated with one another c. the response variable and the predictor variables being highly correlated with one another d. the response variables being highly correlated over time ANS: B BLM: Remember

PTS:

REF: 608

TOP: 9–10

50. When the independent variables are correlated with one another in a multiple regression analysis, what is this condition called? a. heteroscedasticity b. homoscedasticity c. multicollinearity d. causality ANS: C BLM: Remember

PTS:

REF: 608

TOP: 9–10

51. Typical symptoms of the presence of multicollinearity include which of the following? a. The estimated regression coefficients vary substantially from sample to sample; this fact raises their standard errors; hence, the t-ratio is unlikely to be greater than 2, or statistically significant. b. The estimated regression coefficients change greatly in value as independent variables are dropped from or added to the regression equation. c. The signs of the estimated regression coefficients are nonsensical; they are negative when common sense suggests positive signs and vice versa. d. All of the above and more.

ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 608

TOP: 9–10

52. Under which of the following conditions does the problem of multicollinearity arise? a. when the dependent variables are highly correlated with one another b. when the independent variables are highly correlated with one another c. when the independent variables are highly correlated with the dependent variable ANS: B BLM: Remember

PTS:

REF: 608

TOP: 9–10

53. What may be deduced if multicollinearity exists among the independent variables included in a multiple regression model? a. The regression coefficients will be difficult to interpret. b. The standard errors of the regression coefficients for the correlated independent variables will increase c. The multiple coefficient of determination will assume a value close to 0. d. Both a and b. ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 608

TOP: 9–10

54. In multiple regression analysis, which of the following is a clue that multicollinearity is present? a. The value of $R^2$ is large, indicating a good fit, but the individual t tests are nonsignificant. b. The signs of the regression coefficients are contrary to what we would intuitively expect the contributions of those variables to be. c. A matrix of correlations, generated by computer, shows which predictor variables are highly correlated with each other and with the response variable y. d. All of (a), (b), and (c). ANS: D BLM: Remember

PTS:

REF: 608

TOP: 9–10

TRUE/FALSE 1. In a multiple linear regression model, $\beta_0$ is the intercept, while $\beta_1, \beta_2, \ldots, \beta_k$ are the partial slopes, or partial regression coefficients. ANS: T BLM: Remember

PTS:

REF: 579-580

TOP: 1–4

2. In a multiple linear regression model, $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \epsilon$, the coefficient $\beta_i$ measures the change in the dependent variable y for a unit change in $x_i$ when all other independent variables are held constant. ANS: T BLM: Remember

PTS:

REF: 579-580

TOP: 1–4

3. In a multiple linear regression model, $x_1, x_2, \ldots, x_k$ are independent predictor variables that are measured without error. ANS: T BLM: Remember

PTS:

REF: 579-580

TOP: 1–4

4. In a regression setting, if you add a predictor variable to an existing model, the value of $R^2$ will either remain the same or increase. However, in selecting a model, you should consider the value of the adjusted $R^2$ as well as $R^2$. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 585

TOP: 1–4

5. In a regression setting, you should always select the model with the largest $R^2$. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

6. In a regression setting, you should select a model where all the regression coefficients have p-values greater than 0.05. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

7. Multiple correlation analysis measures the overall strength of association among more than two variables. ANS: T BLM: Remember

PTS:

REF: 579

TOP: 1–4

8. An adjusted coefficient of multiple determination, denoted by $R^2$(adj), is adjusted for the degrees of freedom. ANS: T BLM: Remember

PTS:

REF: 585

TOP: 1–4

9. An estimated partial-regression coefficient gives the partial change in Y for a unit change in that independent variable, while holding other independent variables constant. ANS: T BLM: Remember

PTS:

REF: 580

TOP: 1–4

10. The coefficient of multiple determination takes on values between 0 and 1, inclusive. ANS: T BLM: Remember

PTS:

REF: 584

TOP: 1–4

11. A coefficient of multiple correlation, denoted by R, equals the proportion of the total variation in the values of the dependent variable, Y, that is explained by the estimated multiple regression of Y on $X_1$, $X_2$, and possibly additional independent variables ($X_3$, $X_4$, and so on). ANS: F BLM: Remember

PTS:

REF: 584

TOP: 1–4

12. Multicollinearity exists in virtually all multiple regression models. ANS: T BLM: Remember

PTS:

REF: 609

TOP: 1–4

13. Multicollinearity is also called collinearity and intercorrelation. ANS: T BLM: Remember

PTS:

REF: 608

TOP: 1–4

14. Multicollinearity is a condition that exists when the independent variables are highly correlated with the dependent variable. ANS: F BLM: Remember

PTS:

REF: 608

TOP: 1–4

15. Multicollinearity does not affect the F-test of the analysis of variance. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 608

TOP: 1–4

16. In a multiple regression analysis involving six independent variables, the sum of squares are calculated as: Total SS = 900, SSR = 600, and SSE = 300. In this case, the value of the F-test statistic is 150. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

17. In a multiple regression model, the coefficient of determination will be equal to the square of the largest correlation value between the dependent variable and the independent variables. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

18. In a multiple regression model, the mean of the residuals is equal to the variance of all combinations of levels of the independent variables. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 583

TOP: 1–4

19. In a multiple regression model, adding more independent variables that have a low correlation with the dependent variable will decrease the value of the coefficient of multiple determination. ANS: F BLM: Remember

PTS:

REF: 585

TOP: 1–4

20. The $R^2$ value will tend to be smaller than the adjusted $R^2$ value when insignificant independent variables are included in the model. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 585

TOP: 1–4

21. The y-intercept will usually be negative in a multiple regression model when the regression slope coefficients are positive. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 581

TOP: 1–4

22. If the confidence interval estimate for the regression slope coefficient, based on the sample information, crosses over 0, then the true population regression slope coefficient could be 0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 586-587

TOP: 1–4

23. The coefficient of determination, $R^2$, represents the proportion of the total variability in y that can be explained by the regression of y on x. When transformed to a percentage, it represents the percentage reduction in the sum of the squares of the error that can be accomplished by using the model to predict the dependent variable as opposed to just using the sample mean of the dependent variable. ANS: T TOP: 1–4

PTS: 1 REF: 584 | 586-587 BLM: Higher Order - Understand

24. Assume that a company is tracking its advertising expenditures as they relate to television ($x_1$) and radio advertising ($x_2$). The owner of the company believes that it would improve the regression model to add a third variable that represents the sum of the advertising on radio and television ($x_3 = x_1 + x_2$). This assessment is generally correct.

ANS: F TOP: 1–4

PTS: 1 REF: 579 | 584-585 BLM: Higher Order - Understand

25. Having a large number of predictors in a regression model guarantees that the model fit is good. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

26. The more predictors that are added to a regression model, the larger the coefficient of determination $R^2$ value will be. ANS: T BLM: Remember

PTS:

REF: 584

TOP: 1–4

27. is an example of a multiple regression model. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 579

TOP: 1–4

28. In multiple regression, the prediction equation is the line that minimizes SSE, the sum of squares of the deviations of the observed values y from the predicted values $\hat{y}$. ANS: T BLM: Remember

PTS:

REF: 581

TOP: 1–4

29. Let be the least squares estimate of the population coefficient . If the regression assumptions hold true, the test statistic given by ] has an F distribution with and degrees of freedom, where n is the number of observations and is the number of predictor variables. ANS: F BLM: Remember

PTS:

REF: 584-585

TOP: 1–4

30. Suppose that one equation has three explanatory variables and an F-ratio of 52. Another equation has five explanatory variables and an F-ratio of 40. The first equation will always be considered a better model. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

31. In order to test the usefulness of a multiple regression model involving 5 predictor variables and 25 observations, the numerator and denominator degrees of freedom for the critical value of F are 4 and 24, respectively. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 582-584

TOP: 1–4
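As a worked check of item 31: the F test for overall model usefulness uses $k$ numerator degrees of freedom and $n - (k + 1)$ denominator degrees of freedom, where $k$ is the number of predictors and $n$ the number of observations. With $k = 5$ and $n = 25$:

```latex
\[
df_1 = k = 5, \qquad df_2 = n - (k + 1) = 25 - 6 = 19
\]
```

These differ from the stated 4 and 24, which agrees with the keyed answer F.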

32. A multiple regression equation includes five predictor variables, and the coefficient of multiple determination is 0.7921. The percentage of the variation in y that is explained by the regression equation is 89%. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

33. A multiple regression model has the form . The coefficient is interpreted as the change in per unit change in . ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 579-581

TOP: 1–4

34. In testing the significance of a multiple regression model in which there are three independent variables, the null hypothesis is . ANS: F BLM: Remember

PTS:

REF: 582-584

TOP: 1–4

35. Some statistical packages print a second statistic, called the adjusted coefficient of determination, which has been adjusted for the degrees of freedom to take into account the sample size and the number of predictor variables. ANS: T BLM: Remember

PTS:

REF: 584-585

TOP: 1–4

36. In reference to the equation , the value 0.63 is the average change in per unit change in , regardless of the value of . ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

37. A multiple regression model involves 40 observations and 4 independent variables and produces SST = 2000 and SSR = 1608. The value of MSE is 11.2. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 583

TOP: 1–4
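The arithmetic behind item 37 can be sketched in a few lines of Python. The helper name `anova_summary` and its dictionary keys are illustrative choices, not part of the test bank; the formulas are the usual ones for a model with k predictors fit to n observations (SSE = SST - SSR, error df = n - k - 1).

```python
def anova_summary(n, k, sst, ssr):
    """Summarize a regression ANOVA for n observations and k predictors."""
    sse = sst - ssr              # unexplained (error) sum of squares
    df_error = n - k - 1         # error degrees of freedom
    mse = sse / df_error         # mean square for error
    msr = ssr / k                # mean square for regression
    return {
        "SSE": sse,
        "MSE": mse,
        "F": msr / mse,          # overall F statistic
        "R2": ssr / sst,         # coefficient of multiple determination
        "s": mse ** 0.5,         # standard error of estimate
    }


# Item 37: 40 observations, 4 predictors, SST = 2000, SSR = 1608
item_37 = anova_summary(n=40, k=4, sst=2000, ssr=1608)
print(item_37["SSE"], round(item_37["MSE"], 1))
```

Here SSE = 392 on 35 error degrees of freedom, so MSE = 392/35 = 11.2, matching the keyed answer T.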

38. In reference to the equation , the value –0.75 is the intercept. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

39. In multiple regression, the descriptor “multiple” refers to more than one dependent variable. ANS: F BLM: Remember

PTS:

REF: 579

TOP: 1–4

40. For each x term in the multiple regression equation, the corresponding $\beta$ is referred to as a partial regression coefficient. ANS: T BLM: Remember

PTS:

REF: 580

TOP: 1–4

41. In a multiple regression problem, the regression equation is . The estimated value for when and is 48. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 580-581

TOP: 1–4

42. In reference to the equation , the value –0.80 is the y-intercept. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

43. In a multiple regression problem involving 24 observations and 3 independent variables, the estimated regression equation is . For this model, SST = 800 and SSE = 245. The value of the F statistic for testing the significance of the model is 15.102. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 582-583

TOP: 1–4
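The F statistic quoted in item 43 can be verified step by step from the given sums of squares, using $n = 24$ observations and $k = 3$ independent variables:

```latex
\[
\begin{aligned}
\text{SSR} &= \text{SST} - \text{SSE} = 800 - 245 = 555 \\
\text{MSR} &= \frac{\text{SSR}}{k} = \frac{555}{3} = 185 \\
\text{MSE} &= \frac{\text{SSE}}{n - k - 1} = \frac{245}{20} = 12.25 \\
F &= \frac{\text{MSR}}{\text{MSE}} = \frac{185}{12.25} \approx 15.102
\end{aligned}
\]
```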

44. A multiple regression equation includes five independent variables, and the coefficient of determination is 0.81. The percentage of the variation in y that is explained by the regression equation is 90%. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

45. Multiple regression analysis is a type of regression analysis in which several independent variables are used to estimate the value of an unknown dependent variable; hence, each of these predictor variables explains part of the total variation of the dependent variable. ANS: T BLM: Remember

PTS:

REF: 579

TOP: 1–4

46. An estimated partial-regression coefficient is the coefficient of a dependent variable in an estimated multiple-regression equation. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

47. An estimated partial-regression coefficient gives the partial change in y for a unit change in that independent variable, while holding other independent variables constant. ANS: T BLM: Remember

PTS:

REF: 580-581

TOP: 1–4

48. If we want to relate a random variable y to two independent variables $x_1$ and $x_2$, a regression hyperplane is the three-dimensional equivalent of a regression line that minimizes the sum of the squared vertical deviations between the sample points suspended in y vs. $x_1$ vs. $x_2$ space and their associated multiple regression estimates, all of which lie on this hyperplane. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 579-580

TOP: 1–4

49. In regression analysis, a p-value provides the probability (judged by the t-value associated with an estimated regression coefficient) of being true, given the claim The true regression coefficient equals 0. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

50. A coefficient of multiple correlation is a measure of how well an estimated regression plane (or hyperplane) fits the sample data on which it is based. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

51. Multiple linear regression is an extension of simple linear regression to allow for more than one dependent variable. ANS: F BLM: Remember

PTS:

REF: 579

TOP: 1–4

52. A coefficient of multiple correlation is denoted by R and equals the proportion of the total variation in the values of the dependent variable, y, that is explained by the estimated multiple regression of y on $x_1$, $x_2$, and possibly additional independent variables ($x_3$, $x_4$, and so on). ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

53. A multiple regression analysis includes 25 data points and 4 independent variables, and produces SST = 400 and SSR = 300. Then, the multiple standard error of estimate is 5. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 582-583

TOP: 1–4

54. Multicollinearity is present if the dependent variable is linearly related to one of the explanatory variables. ANS: F BLM: Remember

PTS:

REF: 584

TOP: 1–4

55. In a multiple regression analysis involving 50 observations and 5 independent variables, SST = 475 and SSE = 71.25. Then, the multiple coefficient of determination is 0.85. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 582-584

TOP: 1–4

56. A multiple regression model has the form . As increases by one unit, holding constant, the value of y will increase by 9 units.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

57. In reference to the multiple regression model , if were to increase by five units, holding and constant, then the value of would decrease on average by 50 units. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

58. In multiple regression, a large value of the test statistic F indicates that most of the variation in y is unexplained by the regression equation and that the model is useless. A small value of F indicates that most of the variation in y is explained by the regression equation and that the model is useful. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

59. When an additional explanatory variable is introduced into a multiple regression model, the coefficient of multiple determination adjusted for degrees of freedom can never decrease. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 585

TOP: 1–4

60. In multiple regression analysis, when the response surface (the graphical depiction of the regression equation) hits every single point, the sum of squares for error SSE = 0, the standard error of estimate $s$ = 0, and the coefficient of determination $R^2$ = 1. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 579-580

TOP: 1–4

61. In multiple regression analysis, the coefficient of determination is sometimes called multiple $R^2$. ANS: T BLM: Remember

PTS:

REF: 584

TOP: 1–4

62. The coefficient of multiple determination is calculated by dividing the regression sum of squares by the total sum of squares (SSR/SST) and subtracting that value from 1. ANS: F BLM: Remember

PTS:

REF: 584

TOP: 1–4

63. In a multiple regression model involving 5 independent variables, if the sum of the squared residuals is 847 and the data set contains 40 points, then the value of the standard error of the estimate is 24.911. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 582-583

TOP: 1–4
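For item 63, the standard error of the estimate is the square root of MSE, not MSE itself. With SSE = 847, $n = 40$, and $k = 5$:

```latex
\[
\text{MSE} = \frac{\text{SSE}}{n - k - 1} = \frac{847}{34} \approx 24.911,
\qquad
s = \sqrt{\text{MSE}} \approx 4.991
\]
```

The value 24.911 quoted in the item is MSE, which is why the statement is keyed F.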

64. The adjusted coefficient of determination is adjusted for the number of independent variables in the model by using sums of squares rather than mean squares. ANS: F BLM: Remember

PTS:

REF: 584-585

TOP: 1–4

65. The adjusted value of $R^2$ is used mainly to compare two or more regression models that have the same number of independent predictors to determine which one fits the data better. ANS: F BLM: Remember

PTS:

REF: 584-585

TOP: 1–4

66. In a multiple regression model, the coefficient of determination (sometimes called multiple $R^2$) can be computed by simply squaring the largest correlation coefficient between the dependent variable and any independent variable. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 584-585

TOP: 1–4

67. The multiple coefficient of determination measures the proportion or percentage of variation in the dependent variable that is explained by the independent variables included in the model. ANS: T BLM: Remember

PTS:

REF: 585

TOP: 1–4

68. In a multiple regression model, the partial regression slope coefficients measure the average change in the dependent variable for a one-unit change in the independent variable of interest, with all other independent variables held constant.

ANS: T BLM: Remember

PTS:

REF: 579-580

TOP: 1–4

69. Consider a multiple regression model with three independent variables. If the y-intercept is negative, then at least two of the partial regression slope coefficients will also be negative. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

70. In a multiple regression model, if there are ten independent variables included in the model, then the sample size should be at least ten. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 579-580

TOP: 1–4

71. In a multiple regression model, it is assumed that the residuals are normally distributed. ANS: T BLM: Remember

PTS:

REF: 579

TOP: 1–4

72. In a multiple regression model, the regression coefficients are calculated such that the quantity

is minimized.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 580-581

TOP: 1–4

73. A multiple regression model forms a plane through multidimensional space. ANS: T BLM: Remember

PTS:

REF: 579-580

TOP: 1–4

74. In a multiple regression model where four independent variables are included in the model, the percentage of explained variation in the dependent variable will be equal to the square root of the sum of the largest correlations between the dependent variable and the four independent variables. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 582-583

TOP: 1–4

75. The larger the value of the coefficient of multiple determination the larger the value of the F-test statistic that is used for testing the usefulness of the multiple regression model. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 584

TOP: 1–4

76. If the $R^2$ value for a multiple regression model with two independent variables is 0.81, then the correlation between the two independent variables will be 0.90.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 584

TOP: 1–4

77. If you wish to test the usefulness of a multiple regression model with three independent variables, the appropriate null and alternative hypotheses are vs. . ANS: F BLM: Remember

PTS:

REF: 582-584

TOP: 1–4

78. A regression model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \epsilon$ is called a quadratic model or a second-order polynomial model. ANS: T BLM: Remember

PTS:

REF: 587

TOP: 1–4

79. A regression model of the form $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \beta_3 x^3 + \epsilon$ is called a third-order polynomial model. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 587

TOP: 1–4

80. To check whether the regression assumption involving normality of the error terms (residuals) is valid, it is appropriate to construct a normal probability plot. If this plot forms a straight line from the lower-left-hand corner to the upper-right-hand corner, the error terms may be assumed to be normally distributed. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 586

TOP: 1–4

81. It is appropriate to compute a correlation coefficient for the relationship between a dummy variable and a dependent variable. ANS: T BLM: Remember

PTS:

REF: 595

TOP: 5–8

82. If a qualitative variable has m categories, you should use m – 1 dummy variables to incorporate the qualitative variable into a regression model. ANS: T BLM: Remember

PTS:

REF: 595

TOP: 5–8

83. Including a dummy variable into a regression model will simplify the regression results and help people to interpret the meaning of the regression parameters. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 595

TOP: 5–8

84. Qualitative predictor variables are entered into a regression model through dummy variables. ANS: T BLM: Remember

PTS:

REF: 595

TOP: 5–8

85. Quantitative predictor variables are entered into a regression model through indicator variables. ANS: T BLM: Remember

PTS:

REF: 595

TOP: 5–8

86. The t distribution with df = n – 2 is used for testing a specific set of regression coefficients, e.g., . ANS: F BLM: Remember

PTS:

REF: 584-585

TOP: 5–8

87. Plots of the residuals against $\hat{y}$ or against the individual independent variables often indicate departures from the assumptions required for an analysis of variance, and they also may suggest changes in the underlying model. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 586

TOP: 5–8

88. Stepwise regression analysis is a procedure that is implemented by computer and is available in most statistical packages. It is used mainly to determine which of a large number of independent variables should be included in the model. ANS: T BLM: Remember

PTS:

REF: 607

TOP: 5–8

89. A dummy or indicator variable is a dependent variable whose values are either 0.0 or 1.0. ANS: F BLM: Remember

PTS:

REF: 595

TOP: 5–8

90. In order to incorporate qualitative variables into a regression model, one or more dummy variables are needed. ANS: T BLM: Remember

PTS:

REF: 595

TOP: 5–8

91. In order to incorporate the marital status variable into a multiple regression model, there are four possible categories for this variable: married, single, divorced, or widowed. Based on this information, four indicator variables will need to be created (one for each category) and incorporated into the regression model.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 594-596

TOP: 5–8
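To illustrate why item 91 is keyed F, here is a minimal sketch of dummy coding for the four marital status categories. The function name `dummy_code` and the choice of "married" as the baseline category are assumptions made for illustration; any one category could serve as the baseline.

```python
def dummy_code(values, categories):
    """Code a qualitative variable with m categories as m - 1 indicators.

    The first entry of `categories` is treated as the baseline and is
    represented by all zeros (an illustrative choice).
    """
    _baseline, *others = categories
    return [[1 if value == cat else 0 for cat in others] for value in values]


categories = ["married", "single", "divorced", "widowed"]
rows = dummy_code(["single", "married", "widowed"], categories)
for row in rows:
    print(row)
```

Each observation is described by three indicators rather than four, because the baseline category is identified by all three indicators being 0.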

92. Three qualitative variables need to be incorporated into a regression model. The first variable has five possible categories, the second one has three possible categories, and the third one has two possible categories. Based on this information, ten dummy variables need to be included in the regression model. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 594-599

TOP: 5–8
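The count in item 92 follows from the rule that a qualitative variable with m categories needs m - 1 dummy variables:

```latex
\[
(5 - 1) + (3 - 1) + (2 - 1) = 4 + 2 + 1 = 7
\]
```

Seven dummy variables are needed, not ten, which agrees with the keyed answer F.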

93. In a multiple regression analysis, the regression equation is obtained. The variable $x_1$ is a quantitative variable, and the variable $x_2$ is a dummy variable with values 0 and 1. Given this information, we can interpret the slope coefficient (–3) on variable $x_2$ as follows: Holding $x_1$ constant, if the value of $x_2$ is changed from 0 to 1, the average value of y will decrease by 3 units. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 595

TOP: 5–8

94. If you wish to develop a multiple regression model that includes a qualitative variable, for example, education status, in which the following categories exist: no degree, high school diploma, college degree, bachelor degree, and graduate degree, you need to code the categories as 1, 2, 3, 4, and 5. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 595

TOP: 5–8

95. Stepwise regression is a statistical technique that is always implemented when developing a regression model to fit a nonlinear relationship between the dependent and potential independent variables. ANS: F BLM: Remember

PTS:

REF: 607

TOP: 5–8

96. In constructing a multiple regression model with two independent variables $x_1$ and $x_2$, it was known that the correlation between $x_1$ and y is 0.75, and the correlation between $x_2$ and y is 0.55. Based on this information, the regression model containing both independent variables will explain 65% of the variation in the dependent variable y. ANS: F TOP: 5–8

PTS: 1 REF: 582-584 | 608 BLM: Higher Order - Analyze

97. When a stepwise procedure is used, a variable selected at an earlier step can be removed from the model if, in the presence of other variables, it no longer contributes significantly to explaining the variation in the dependent variable y.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 607

TOP: 5–8

98. If a stepwise regression procedure is used to enter, one at a time, three variables into a regression model, the resulting regression equation may differ from the regression equation that occurs when all three variables are entered at one step. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 607

TOP: 5–8

99. Stepwise regression analysis is most useful when it is anticipated that there are curvilinear relationships between the dependent variable and the potential independent variables. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 607

TOP: 5–8

100. The stepwise regression analysis is best used as a preliminary tool for identifying which of a large number of variables should be considered in the model. ANS: T BLM: Remember

PTS:

REF: 607

TOP: 5–8

101. Multicollinearity is a situation in which two or more of the independent variables are highly correlated with each other. ANS: T BLM: Remember

PTS:

REF: 608

TOP: 9–10

102. If a multiple regression model includes ten or more predictor variables, it is almost certain that changes in the predictor variables cause changes in the response variable y. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 607-608

TOP: 9–10

103. Multicollinearity is present when there is a high degree of correlation between the dependent variable and all the independent variables included in the model. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 608

TOP: 9–10

104. One of the consequences of multicollinearity in multiple regression is inflated standard errors in some or all of the estimated slope coefficients. ANS: T BLM: Remember

PTS:

REF: 608

TOP: 9–10

105. Multicollinearity will result in excessively low standard errors of the parameter estimates reported in the regression output. ANS: F BLM: Remember

PTS:

REF: 608

TOP: 9–10

106. When two or more of the predictor variables are highly correlated with one another, adding or deleting a predictor variable may cause significant changes in the values of the other regression coefficients. ANS: T BLM: Remember

PTS:

REF: 608

TOP: 9–10

107. When multicollinearity is present, the estimated regression coefficients will have large standard errors, causing imprecision in confidence and prediction intervals. ANS: T BLM: Remember

PTS:

REF: 608

TOP: 9–10

PROBLEM 1. The first-order model $E(y) = 20 + 0.25x_1 - 0.40x_2$ attempts to explain average air temperatures $y$ in degrees Celsius for a particular day as a function of distance from the coast ($x_1$, in km) and altitude ($x_2$, in hundreds of metres). Interpret the parameters $\beta_0$, $\beta_1$, and $\beta_2$.

ANS: $\beta_0$ = 20 is the average temperature at the coast ($x_1$ = 0) at sea level ($x_2$ = 0). $\beta_1$ = 0.25 is the change in y for a one-unit change in $x_1$ when $x_2$ is held constant. It tells us that at a fixed altitude, the temperature increases on average by 0.25 degrees Celsius as distance from the coast increases by one kilometre. $\beta_2$ = –0.40 is the change in y for a one-unit change in $x_2$ when $x_1$ is held constant. It tells us that at a given distance from the coast, temperature decreases on average by 0.40 degrees Celsius as altitude increases by 100 metres above sea level. PTS: 1 REF: 579-581 BLM: Higher Order - Analyze

TOP: 1–4
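The interpretation in Problem 1 can be made concrete by evaluating the fitted surface at a few points. This is a minimal sketch that uses the parameter values 20, 0.25, and –0.40 from the answer; the function name and the sample locations are hypothetical.

```python
def mean_temperature(distance_km, altitude_hundreds_m):
    """Mean air temperature (degrees Celsius) under the first-order model."""
    return 20 + 0.25 * distance_km - 0.40 * altitude_hundreds_m


at_coast_sea_level = mean_temperature(0, 0)   # intercept: 20 degrees
inland_10km = mean_temperature(10, 0)         # 10 km inland, sea level
inland_10km_500m = mean_temperature(10, 5)    # 10 km inland, 500 m up

print(at_coast_sea_level, inland_10km, inland_10km_500m)
```

Moving 10 km inland at a fixed altitude raises the mean by 10 × 0.25 = 2.5 degrees, and climbing 500 m at a fixed distance lowers it by 5 × 0.40 = 2.0 degrees, which is exactly what the partial slopes describe.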

Fuel Consumption and Horsepower An automobile manufacturer would like to know the fuel consumption (y, in litres per 100 km) of a car based on four predictor variables: $x_1$ = horsepower, $x_2$ = torque, $x_3$ = displacement (litres), and $x_4$ = weight (kg). Suppose the following equation does indeed describe the true relationship:

2. Refer to Fuel Consumption and Horsepower. What is the gas mileage for a car with horsepower 160, torque 250, displacement 1.9 L (1900 cm3), and weight 2000 kg? ANS:

y = 8.7 L/100 km PTS: 1 REF: 579-581 BLM: Higher Order - Apply

TOP: 1–4

3. Refer to Fuel Consumption and Horsepower. What is the gas mileage for a car with 210 horsepower, 330 torque, 7 L of displacement, and weight 2600 kg? ANS: y = 11.16 L/100 km PTS: 1 REF: 579-581 BLM: Higher Order - Apply

TOP: 1–4

4. Refer to Fuel Consumption and Horsepower. How would you interpret the values of $\beta_1$, $\beta_2$, $\beta_3$, and $\beta_4$?

ANS: $\beta_1$ = –0.066 is the change in y for a one-unit change in $x_1$ when all other independent variables ($x_2$, $x_3$, $x_4$) are held constant. This means that when the horsepower increases by 1, the fuel consumption will decrease on average by 0.066 L/100 km if all other independent variables are held constant.

$\beta_2$ = 0.008 is the change in y for a one-unit change in $x_2$ when all other independent variables ($x_1$, $x_3$, $x_4$) are held constant. This means that when the torque increases by 1, the fuel consumption will increase on average by 0.008 L/100 km if all other independent variables are held constant.

$\beta_3$ = 3.567 is the change in y for a one-unit change in $x_3$ when all other independent variables ($x_1$, $x_2$, $x_4$) are held constant. This means that when the displacement increases by 1 L (1000 cm3), the fuel consumption will increase on average by 3.567 L/100 km if all other independent variables are held constant.

$\beta_4$ = 0.005 is the change in y for a one-unit change in $x_4$ when all other independent variables ($x_1$, $x_2$, $x_3$) are held constant. This means that when the weight increases by 1 kg, the fuel consumption will increase on average by 0.005 L/100 km if all other independent variables are held constant. PTS: 1 REF: 579-581 BLM: Higher Order - Analyze

TOP: 1–4

Electric Usage Narrative The power company claims the amount of electricity used by a house (y) depends on $x_1$ = square metres of heated space, $x_2$ = mean outside temperature, and $x_3$ = mean hours of sunlight per day. Partial statistical software output is given below.

Regression Analysis

The regression equation is $\hat{y}$ = 357 + 0.808 $x_1$ – 16.6 $x_2$ + 40 $x_3$

Predictor   Coef        StDev        t-ratio   p-value
Constant    357.0000    2,235.0000    0.16     0.876
x1            0.8082        0.1378    5.87     0.000
x2          –16.6100       28.2300   –0.59     0.569
x3           39.7000      232.8000    0.17     0.868

S = 267.7   R-sq = 82.0%   R-sq(adj) = 76.6%

Analysis of Variance

Source       DF   SS          MS          F       p
Regression    3   3,271,175   1,090,392   15.22   0.000
Error        10     716,459      71,646
Total        13   3,987,635

5. Refer to Electric Usage Narrative. Does the regression model appear to be useful? Justify your answer. (Use $\alpha$ = 0.05.) ANS: The hypotheses to be tested are $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs. $H_a$: At least one of $\beta_i$ is not 0. The test statistic is F = 15.22 with p-value = 0.000 < $\alpha$ = 0.05. Therefore, one can conclude that the regression model is useful. PTS: 1 REF: 582-584 BLM: Higher Order - Evaluate

TOP: 1–4
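The summary statistics in the Electric Usage printout can be reproduced from the ANOVA sums of squares alone. The sketch below takes the degrees of freedom and sums of squares from the printout; the variable names are illustrative.

```python
# ANOVA entries from the Electric Usage printout
ss_regression, df_regression = 3_271_175, 3
ss_error, df_error = 716_459, 10
ss_total, df_total = 3_987_635, 13

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

f_statistic = ms_regression / ms_error             # overall F test
r_squared = ss_regression / ss_total               # R-sq
adj_r_squared = 1 - ms_error / (ss_total / df_total)  # R-sq(adj)
s = ms_error ** 0.5                                # standard error S

print(round(f_statistic, 2), round(100 * r_squared, 1),
      round(100 * adj_r_squared, 1), round(s, 1))
```

The results agree with the printed F = 15.22, R-sq = 82.0%, R-sq(adj) = 76.6%, and S = 267.7.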

6. Refer to Electric Usage Narrative. Carry out three separate tests with a significance level of 0.05 to decide if $\beta_1$, $\beta_2$, and $\beta_3$ are significant.

ANS: The three individual t tests are designed to test $H_0: \beta_i = 0$ vs. $H_a: \beta_i \neq 0$ for each of the three partial regression coefficients, given that the other predictor variables are already in the model. By examining the p-values in the last column of the first table (0.000, 0.569, and 0.868) you can see that only variable $x_1$ is significant in predicting y, while $x_2$ and $x_3$ are not significant.

PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

7. Refer to Electric Usage Narrative. Construct a 99% confidence interval for $\beta_1$.

ANS: With 10 degrees of freedom, the 99% confidence interval for $\beta_1$ is 0.8082 ± 3.169(0.1378) = 0.8082 ± 0.4367, or 0.3715 < $\beta_1$ < 1.2449.

PTS: 1 REF: 584-587 BLM: Higher Order - Analyze

TOP: 1–4

8. Refer to Electric Usage Narrative. Obtain a point prediction of the electricity use for a home that has 300 m2 of space, an outside temperature of 3°C, and 10.2 hours of sunlight. ANS: The prediction equation is $\hat{y} = 357 + 0.8082x_1 - 16.61x_2 + 39.7x_3$. When $x_1$ = 300, $x_2$ = 3, and $x_3$ = 10.2, then $\hat{y}$ = 954.57.

PTS: 1 REF: 579-582 | 586-587 BLM: Higher Order - Analyze

TOP: 1–4
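The point prediction in item 8 is a dot product of the fitted coefficients with the predictor values. This sketch uses the full-precision coefficients from the Coef column of the printout (357.0000, 0.8082, –16.6100, 39.7000); the variable names are illustrative.

```python
intercept = 357.0
coefficients = [0.8082, -16.61, 39.7]   # x1, x2, x3 from the Coef column
home = [300, 3, 10.2]                   # square metres, temperature, sunlight hours

# y-hat = b0 + b1*x1 + b2*x2 + b3*x3
prediction = intercept + sum(b * x for b, x in zip(coefficients, home))
print(round(prediction, 2))
```

Using the rounded coefficients shown in the displayed equation (0.808, 16.6, 40) would give a slightly different value, so the answer of 954.57 depends on taking the coefficients from the table.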

Personal Spending and Personal Income Is personal spending linearly related to orders for durable goods and personal income? A recent study reported the amounts of personal spending (in trillions of dollars), amount spent on durable goods (in billions of dollars), and personal income (in trillions of dollars). A statistical package was used to fit a linear regression model, producing the output below.

Source       Sum of Squares   df   Mean Square   F-ratio
Regression   0.0436            2   0.0218        105
Residual     0.0019            9   0.0002

Variable   Coefficient   s.e. of Coeff   t-ratio
Constant   –0.6023       0.3826          –1.57
Goods      –0.0013       0.0016          –0.80
Income      0.9580       0.1023           9.36

$R^2$ = 95.9% $R^2$(adj) = 95.0%, s = 0.0144 with 12 – 3 = 9 df.

9. Refer to Personal Spending and Personal Income. Write the model that was fit. Include the estimates of the parameters. ANS: $\hat{y} = -0.6023 - 0.0013x_1 + 0.9580x_2$, where $x_1$ is the amount spent on durable goods, and $x_2$ is personal income. PTS: 1 REF: 579-582 BLM: Higher Order - Analyze

TOP: 1–4

10. Refer to Personal Spending and Personal Income. Predict the level of personal spending when amount spent on durable goods is 130 and personal income is at 5.10. ANS: When $x_1$ = 130 and $x_2$ = 5.10, then $\hat{y}$ = –0.6023 – 0.0013(130) + 0.958(5.10) = 4.1145.

PTS: 1 REF: 579-582 BLM: Higher Order - Apply

TOP: 1–4

11. Refer to Personal Spending and Personal Income. Use the computer output shown above to calculate 95% confidence intervals for the intercept and the partial regression coefficients. ANS: With 9 degrees of freedom, the 95% confidence intervals for $\beta_i$, i = 0, 1, and 2, are:

$\beta_0$: –0.6023 ± 2.262(0.3826) = –0.6023 ± 0.8654 = (–1.4677, 0.2631)

$\beta_1$: –0.0013 ± 2.262(0.0016) = –0.0013 ± 0.0036 = (–0.0049, 0.0023)

$\beta_2$: 0.9580 ± 2.262(0.1023) = 0.9580 ± 0.2314 = (0.7266, 1.1894).

PTS: 1 REF: 584-587 BLM: Higher Order - Analyze

TOP: 1–4
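The interval arithmetic in question 11 can be checked with a short script. This is only a sketch: the critical value 2.262 is copied from the answer rather than computed, and the names `estimates`, `T_CRIT`, and `conf_interval` are illustrative, not part of any statistical package.

```python
# Coefficient estimates and standard errors copied from the printout.
estimates = {
    "Constant": (-0.6023, 0.3826),
    "Goods": (-0.0013, 0.0016),
    "Income": (0.9580, 0.1023),
}
T_CRIT = 2.262  # t_{0.025} with 9 df, taken from the answer (assumed, not computed)


def conf_interval(coef, se, t_crit=T_CRIT):
    """Return (lower, upper) for coef +/- t_crit * se."""
    margin = t_crit * se
    return coef - margin, coef + margin


intervals = {name: conf_interval(c, s) for name, (c, s) in estimates.items()}
for name, (lower, upper) in intervals.items():
    print(f"{name:8s} ({lower:.4f}, {upper:.4f})")
```

An interval that contains 0, such as the one for Goods, indicates that the corresponding coefficient is not significantly different from 0 at the 5% level.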

12. Refer to Personal Spending and Personal Income. Based on your confidence intervals in the previous question, does the amount spent on durable goods have any predictive power beyond that provided by the other independent variables for determining personal spending? Explain. ANS: No; the confidence interval (–0.0049, 0.0023) for $\beta_1$, the coefficient of durable goods, contains 0. PTS: 1 REF: 584-587 BLM: Higher Order - Evaluate

TOP: 1–4

13. Refer to Personal Spending and Personal Income. Based on your confidence intervals above, does personal income have any predictive power beyond that provided by the other independent variables for determining personal spending? Explain. ANS: Yes; the confidence interval (0.7266, 1.1894) for $\beta_2$, the coefficient of personal income, does not include 0. PTS: 1 REF: 584-587 BLM: Higher Order - Evaluate

TOP: 1–4

14. Refer to Personal Spending and Personal Income. Use the computer output shown above to test the hypotheses $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, where $\beta_1$ is the coefficient of durable goods, at the 5% significance level. What is your conclusion? ANS: With 9 degrees of freedom and $\alpha$ = 0.05, we reject $H_0$ if | t | > 2.262. Since | t | = 0.8 < 2.262, there is insufficient evidence at the 5% significance level to reject the null hypothesis. We conclude that spending on durable goods has not been shown to have predictive power over and above personal income to predict personal spending. PTS: 1 REF: 584-587 BLM: Higher Order - Evaluate

TOP: 1–4
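The t-ratios used in questions 14 and 15 are each coefficient divided by its standard error. The sketch below recomputes them from the rounded printout values, so the result for Goods (about –0.81) differs slightly from the printed –0.80; the function names and the hard-coded critical value are assumptions made for illustration.

```python
T_CRIT = 2.262  # two-tailed 5% critical value with 9 df, copied from the answers


def t_ratio(coef, se):
    """t statistic for H0: beta = 0."""
    return coef / se


def rejects_null(t, t_crit=T_CRIT):
    """Two-tailed decision rule: reject H0 when |t| exceeds the critical value."""
    return abs(t) > t_crit


t_goods = t_ratio(-0.0013, 0.0016)
t_income = t_ratio(0.9580, 0.1023)
print(f"Goods:  t = {t_goods:.2f}, reject H0: {rejects_null(t_goods)}")
print(f"Income: t = {t_income:.2f}, reject H0: {rejects_null(t_income)}")
```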

15. Refer to Personal Spending and Personal Income. Use the computer output shown above to test the hypotheses $H_0: \beta_2 = 0$ vs. $H_a: \beta_2 \neq 0$, where $\beta_2$ is the coefficient of personal income, at the 5% significance level. What is your conclusion? ANS: With 9 degrees of freedom and $\alpha$ = 0.05, we reject $H_0$ if | t | > 2.262. Since t = 9.36 > 2.262, we reject the null hypothesis at the 5% significance level and conclude that personal income does have predictive power beyond that provided by the other independent variable. PTS: 1 REF: 584-587 BLM: Higher Order - Evaluate

TOP: 1–4

16. A medical study investigated the link between obesity and television viewing habits in children. One part of the study involved characterizing the difference in viewing habits between boys and girls. A regression model was used to compare the number of hours of television watched per week by boys with the number watched by girls. Use the computer printout below to determine whether there is a significant difference between these two groups. The variable named “Gender” in the printout is equal to 1 if the subject is female, and 0 if the subject is male. State the null and alternative hypotheses of interest. State your conclusion based on a 0.05 significance level.

The regression equation is Hours = 21.5 – 0.201 Gender

Predictor  Coef   StDev.Coef  t-ratio
Constant   21.50  5.42         3.97
Gender     –0.20  0.09        –2.16

s = 7.89  R-sq = 87.4%

Analysis of Variance
Source      DF  SS        MS
Regression   1  12,954.6  12,954.6
Error       30   1,867.6      62.3
Total       31  14,822.2

ANS: The hypotheses to be tested are $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$. With df = 30 and $\alpha$ = 0.05, we reject $H_0$ if | t | > 2.042. Since | t | = 2.16 > 2.042, $H_0$ is barely rejected, and we conclude that there is a significant difference in TV viewing habits between boys and girls.

PTS: 1 REF: 594-599 BLM: Higher Order - Evaluate

TOP: 1–4

Magazine Sales Narrative

A publisher is studying the effectiveness of advertising to sell a women’s magazine. She wishes to investigate the relationship between the number of magazines sold (10,000s), the “reach” (proportion of the population who see at least one advertisement for the magazine), and the average income of the target market ($1000s). The publisher suspects people at certain income levels might be more susceptible to this advertising campaign than others. Preliminary studies show there is no evidence of a quadratic relationship between sales and either of the other two variables. Use the output below to answer the questions.

The regression equation is Sales = 3.1 + 10341 Reach + 0.871 Income + 3.256 Reach*Income

Predictor     Coef       StDev.Coef  t-ratio
Constant           3.10        0.69  4.50
Reach         10,341.00    1,456.50  7.10
Income             0.87        0.31  2.81
Reach*Income       3.26        1.71  1.90

S = 8.43  R-sq = 82.4%

Analysis of Variance
SOURCE  SS      DF  MS
Model   3654.2   3  1218.1
Error    780.5  11    71.0
Total   4434.7  14

17. Refer to Magazine Sales Narrative. Write the model used for the regression. ANS: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$, where $x_1$ is the reach, and $x_2$ is the average income. PTS: 1 REF: 580-582 BLM: Higher Order - Analyze

TOP: 1–4

18. Refer to Magazine Sales Narrative. Is there an interaction effect? Test at the 5% significance level. ANS: The hypotheses to be tested are $H_0: \beta_3 = 0$ vs. $H_a: \beta_3 \neq 0$, where $\beta_3$ is the coefficient of the Reach*Income term. With df = 11 and $\alpha$ = 0.05, we reject $H_0$ if | t | > 2.201. Since t = 1.90 < 2.201, there is not enough evidence to reject $H_0$. One may conclude that there is no significant interaction between the “reach” (proportion of the population who see at least one advertisement for the magazine) and the average income of the target market. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

19. Refer to Magazine Sales Narrative. Develop 95% confidence intervals for the intercept and the partial regression coefficients. ANS: With 11 degrees of freedom, the 95% confidence interval for $\beta_i$, i = 0, 1, 2, and 3, is $\hat{\beta}_i \pm t_{0.025}\,SE(\hat{\beta}_i)$:

$\beta_0$: 3.10 ± 2.201(0.69) = 3.10 ± 1.5187 = (1.5813, 4.6187)

$\beta_1$: 10,341 ± 2.201(1,456.5) = 10,341 ± 3,205.76 = (7,135.24, 13,546.76)

$\beta_2$: 0.87 ± 2.201(0.31) = 0.87 ± 0.6823 = (0.1877, 1.5523)

$\beta_3$: 3.26 ± 2.201(1.71) = 3.26 ± 3.7637 = (–0.5037, 7.0237).

PTS: 1 REF: 584-587 BLM: Higher Order - Analyze

TOP: 1–4

20. Suppose you have a choice of three multiple linear regression models. Listed below are the values of $R^2$ and adjusted $R^2$ for each model.

Model  $R^2$  Adjusted $R^2$
1      0.949  0.932
2      0.962  0.958
3      0.969  0.947

Which model would you choose as the most appropriate to use? Justify your answer. ANS: Since all three models have a reasonable value of $R^2$, the second model would be most appropriate because it has the highest adjusted $R^2$. PTS: 1 REF: 584-585 | 594-599 BLM: Higher Order - Evaluate

TOP: 1–4

21. Consider the following partial output generated using a statistical software:

Predictor  Coef     StDev   T      P
Constant   228.86   73.95    3.09  0.036
            12.496   4.039   3.09  0.036
           –39.15   25.76   –1.52  0.203
             2.686   9.670   0.28  0.795
           –33.90   13.69   –2.48  0.068

S = 15.51  R-Sq = 95.2%  R-Sq(adj) = 90.3%

a. Which, if any, of the terms should be removed from the model first? Justify your answer. (Use a significance level of 0.05.) b. Which of the predictors makes the most significant contribution in predicting the dependent variable y? Justify your answer. c. Find and interpret the coefficient of determination.
ANS:
a. Remove the term with coefficient 2.686, since it has the largest p-value (0.795 > $\alpha$ = 0.05).
b. The most significant predictor variable is the one with coefficient 12.496, since it has the smallest p-value (0.036 < $\alpha$ = 0.05).
c. $R^2$ = 0.952. This means that 95.2% of the total variability in the dependent variable y can be explained by the current model. Thus, the model is doing a good job.
PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

Chemical Comparisons Narrative

A chemist was interested in examining the effects of three chemicals on a chemical process yield. Let $x_i$, i = 1, 2, 3, represent the effects of the three chemicals, respectively, and y be the process yield. The following output was generated using statistical software.

Regression Analysis
Predictor  Coef     StDev   T      P
Constant   225.19   65.70    3.43  0.019
$x_1$       12.725   3.570   3.56  0.016
$x_2$      –39.45   23.24   –1.70  0.150
$x_3$      –33.48   12.29   –2.73  0.042

S = 14.01  R-Sq = 95.1%  R-Sq(adj) = 92.1%

Analysis of Variance
Source          DF  SS        MS       F
Regression       3  18,951.1  6,317.0  32.19
Residual Error   5     981.1    196.2
Total            8  19,932.2

22. Refer to Chemical Comparisons Narrative. Find the least squares regression equation. ANS: $\hat{y} = 225.19 + 12.725x_1 - 39.45x_2 - 33.48x_3$

PTS: 1 REF: 579-582 BLM: Higher Order - Analyze

TOP: 1–4

23. Refer to Chemical Comparisons Narrative. Test the usefulness of the model at the 0.05 level of significance. ANS: The null and alternative hypotheses of interest are $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs. $H_a$: at least one $\beta_i \neq 0$. The value of the test statistic is F = 32.19. The critical value of F with $\alpha$ = 0.05, $df_1$ = 3, and $df_2$ = 5 is 5.41. Reject $H_0$ if F > 5.41. Since F = 32.19 > 5.41, reject $H_0$ at $\alpha$ = 0.05 and conclude that at least one of $\beta_1, \beta_2, \beta_3$ is not 0; i.e., at least one of the predictor variables (one of the three chemicals) is contributing significant information for the prediction of the chemical process yield. PTS: 1 REF: 582-584 BLM: Higher Order - Evaluate

TOP: 1–4
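The F statistic in question 23 is the ratio of the regression mean square to the error mean square. A minimal sketch, using the sums of squares and degrees of freedom from the Chemical Comparisons printout (n = 9 observations is inferred from the 3 + 5 = 8 total degrees of freedom; the function name and the hard-coded critical value are illustrative assumptions):

```python
def overall_f(ssr, sse, k, n):
    """F = MSR / MSE for a model with k predictors fit to n observations."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr / mse


F_CRIT = 5.41  # F_{0.05} with 3 and 5 df, copied from the answer

f_stat = overall_f(ssr=18951.1, sse=981.1, k=3, n=9)
print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRIT}")
```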

24. Refer to Chemical Comparisons Narrative. Which variable, if any, should be removed from the model if a 0.05 level of significance is desired? ANS: The variable $x_2$ should be removed, since it is not significant at the 0.05 level (the p-value associated with this variable is 0.15 > $\alpha$ = 0.05). PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

25. The sequential sums of squares represent the conditional contribution of each of the predictor variables given the variables already in the model. Use the following partial output generated using MINITAB to determine which predictor variable accounts for the largest proportion of the total variation explained by the regression model. What is the proportion accounted for by the selected variable?

Source  DF  Seq SS
         1  16,560.1
         1     933.5
         1       0.0
         1   1,476.1

ANS: The first variable listed accounts for 16,560.1/18,969.7 = 0.8730, or 87.3% of the total variation explained by the regression model. PTS: 1 REF: 582-583 BLM: Higher Order - Analyze

TOP: 1–4
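The proportion in question 25 is each sequential sum of squares divided by their total, which equals the regression sum of squares. A short sketch with the four values from the printout (the variable names are illustrative):

```python
# Sequential sums of squares in the order printed.
seq_ss = [16560.1, 933.5, 0.0, 1476.1]

ss_regression = sum(seq_ss)  # total variation explained by the model
proportions = [ss / ss_regression for ss in seq_ss]
largest_index = max(range(len(seq_ss)), key=lambda i: seq_ss[i])

print(f"SS(Regression) = {ss_regression:.1f}")
print(f"Largest share: term {largest_index + 1}, {proportions[largest_index]:.4f}")
```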

26. Suppose that you fit the model $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$ to 15 data points and found F equal to 52.36. a. Do the data provide sufficient evidence to indicate that the model contributes information for the prediction of y? Test using a 5% level of significance. b. Use the value of F to calculate $R^2$. Interpret its value.
ANS:
a. The hypotheses to be tested are $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs. $H_a$: at least one $\beta_i$ differs from 0, and the test statistic is $F = \mathrm{MSR}/\mathrm{MSE} = 52.36$, which has an F distribution with $df_1 = k = 3$ and $df_2 = n - k - 1 = 11$. The rejection region for α = 0.05, which is found in the upper tail of the F distribution, is F > 3.59, and $H_0$ is rejected. There is evidence that the model contributes information for the prediction of y.
b. Use the fact that $F = \dfrac{R^2/k}{(1 - R^2)/[n - (k + 1)]}$ and solve for $R^2$, where n = 15 and k = 3; you find that $R^2$ = 0.9346. The value means that the total sum of squares of deviations of the y-values about their mean has been reduced by 93.46% by using the linear model to predict y.
PTS: 1 REF: 584 BLM: Higher Order - Evaluate

TOP: 1–4
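Solving the relation between F and $R^2$ in question 26(b) for $R^2$ gives $R^2 = kF / [kF + n - (k + 1)]$. The sketch below applies that rearrangement to the values in the question; the function name is an illustrative choice.

```python
def r_squared_from_f(f_stat, n, k):
    """Invert F = (R^2 / k) / ((1 - R^2) / (n - k - 1)) for R^2."""
    return k * f_stat / (k * f_stat + (n - k - 1))


r2 = r_squared_from_f(f_stat=52.36, n=15, k=3)
print(f"R^2 = {r2:.4f}")
```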

27. The computer output for a multiple regression analysis including 15 observations and three predictor variables, $x_1$, $x_2$, and $x_3$, provides the estimated coefficients and their standard errors. a. Which, if any, of the independent variables $x_1$, $x_2$, and $x_3$ contribute information for the prediction of y? b. Give the least-squares prediction equation. c. What is the practical interpretation of a partial regression coefficient $\beta_i$?
ANS:
a. The hypotheses of interest are $H_0: \beta_i = 0$ vs. $H_a: \beta_i \neq 0$. For i = 1, 2, 3, the test statistic is $t = \hat{\beta}_i / SE(\hat{\beta}_i)$. For i = 1, 2, and 3, t = 2.978, 4.074, and 2.30, respectively. If you choose $\alpha$ = 0.05, the rejection region, with n – (k + 1) = 11 degrees of freedom, is | t | > 2.201, and all three hypotheses are rejected. Each of the three independent variables contributes to the prediction of y in the presence of the other two variables.
b. From the information given in the exercise, the prediction equation is $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x_1 + \hat{\beta}_2 x_2 + \hat{\beta}_3 x_3$, with the printed estimates substituted.
c. $\beta_i$ represents the change in y for a one-unit change in $x_i$ when the other two predictors are held constant.

PTS: 1 REF: 579-581 | 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

College Textbook Sales Narrative

A publisher of college textbooks conducted a study to relate profit per text y to cost of sales x over a six-year period when its sales force (and sales costs) were growing rapidly. These inflation-adjusted data (in thousands of dollars) were collected:

Profit per text, y      17.2  23.1  25.6  29.5  32.2  36.5
Sales cost per text, x   5.7   6.3   6.8   7.5   8.1   9.3

Expecting profit per book to rise and then plateau, the publisher fitted the model $E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$ to the data.

28. Refer to College Textbook Sales Narrative. Plot the data points. Does it look as though the quadratic model is necessary? ANS: The plotted points in the scatterplot show slight curvature. PTS: 1 REF: 587-588 BLM: Higher Order - Apply

TOP: 1–4

29. Refer to College Textbook Sales Narrative. Use statistical software to perform the multiple regression analysis for the model $E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$.
ANS:
Regression Analysis
The regression equation is y = –55.3 + 17.5x – 0.820x-square

Predictor  Coef     StDev   T      P
Constant   –55.33   10.11   –5.47  0.012
x           17.482   2.744   6.37  0.008
x-square   –0.8198  0.1824  –4.49  0.021

S = 0.5944  R-Sq = 99.6%  R-Sq(adj) = 99.3%

Analysis of Variance
Source          DF  SS      MS      F       P
Regression       2  234.96  117.48  332.53  0.000
Residual Error   3    1.06    0.35
Total            5  236.02

Source    DF  Seq SS
x          1  227.82
x-square   1    7.140

PTS: 1 REF: 589-590 BLM: Higher Order - Apply

TOP: 1–4

30. Refer to College Textbook Sales Narrative. Find s on the printout. Confirm that $s = \sqrt{SSE/(n - k - 1)}$. ANS: Refer to the statistical software printout given above. The value of s is s = 0.5944 and the value of SSE is found in the column labelled “SS” and the row labelled “Residual Error” to be SSE = 1.06. Then $s = \sqrt{SSE/(n - k - 1)} = \sqrt{1.06/3}$ = 0.5944.

PTS: 1 REF: 589-590 BLM: Higher Order - Analyze

TOP: 1–4
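The check in question 30 can be reproduced numerically. A minimal sketch, assuming n = 6 observations and k = 2 predictor terms (x and x-square) as in the printout; the function name is illustrative.

```python
import math


def residual_std_error(sse, n, k):
    """s = sqrt(SSE / (n - k - 1)), the estimated standard deviation of the errors."""
    return math.sqrt(sse / (n - k - 1))


s = residual_std_error(sse=1.06, n=6, k=2)
print(f"s = {s:.4f}")
```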

31. Refer to College Textbook Sales Narrative. Do the data provide sufficient evidence to indicate that the model contributes information for the prediction of y? What is the p-value for this test, and what does it mean? ANS: The hypotheses to be tested are $H_0: \beta_1 = \beta_2 = 0$ vs. $H_a$: at least one $\beta_i$ differs from 0. The test statistic is $F = \mathrm{MSR}/\mathrm{MSE} = 332.53$, which has an F distribution with $df_1 = k = 2$ and $df_2 = n - k - 1 = 3$. The p-value given in the printout is p = 0.000 and $H_0$ is rejected. There is evidence that the model contributes information for the prediction of y. PTS: 1 REF: 589-590 BLM: Higher Order - Evaluate

TOP: 1–4

32. Refer to College Textbook Sales Narrative. How much of the regression sum of squares is accounted for by the quadratic term? By the linear term? ANS:

The sequential sum of squares in the printout gives the needed information. The quadratic term accounts for 0.0304 or 3.04% of the regression variation. The rest of the variation explained by the regression is due to the linear term: 0.9696 or 96.96% of the regression variation. PTS: 1 REF: 589-590 BLM: Higher Order - Analyze

TOP: 1–4

33. Refer to College Textbook Sales Narrative. What sign would you expect the actual value of $\beta_2$ to have? Find the value of $\hat{\beta}_2$ in the printout. Does this value confirm your expectation? Justify your answer.

ANS: The publisher expects y to increase as x increases to a point, and then to decrease as x increases further. Hence, the parabola should be cupped downward. You can discover by inspecting several parabolas that those that are cupped downward have negative coefficients for $x^2$. Hence, we would expect $\beta_2$ to be negative. From the printout, $\hat{\beta}_2$ = –0.8198.

PTS: 1 REF: 588-590 BLM: Higher Order - Analyze

TOP: 1–4

34. Refer to College Textbook Sales Narrative. Do the data indicate a significant curvature in the relationship between y and x? Test at the 5% level of significance. ANS: The hypotheses of interest are $H_0: \beta_2 = 0$ vs. $H_a: \beta_2 \neq 0$. The test statistic is t = –4.49 with p-value = 0.021. Hence, $H_0$ can be rejected for any $\alpha$ > 0.021. Hence, for $\alpha$ = 0.05, $H_0$ is rejected. We conclude that there is curvature in the relationship. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

35. Refer to College Textbook Sales Narrative. Use the values of SSR and Total SS in the printout to calculate $R^2$. Compare this value with the value given in the printout.

ANS: From the printout, SSR = 234.96 and Total SS = 236.02. Then $R^2$ = SSR/Total SS = 234.96/236.02 = 0.9955, which agrees with the printout. PTS: 1 REF: 584 BLM: Higher Order - Apply

TOP: 1–4

36. Refer to College Textbook Sales Narrative. Calculate $R^2$(adj). When would it be appropriate to use this value rather than $R^2$ to assess the fit of the model? ANS: The value of $R^2$(adj) can be used to compare two or more regression models using different numbers of independent predictor variables. PTS: 1 REF: 584-585 BLM: Higher Order - Apply

TOP: 1–4
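Questions 35 and 36 both work from the analysis of variance table. The sketch below computes $R^2$ = SSR/Total SS and the usual adjusted form, $R^2(\text{adj}) = 1 - \dfrac{SSE/(n - k - 1)}{\text{Total SS}/(n - 1)}$, from the printout values; n = 6 and k = 2 are read from the degrees of freedom, and the function names are illustrative.

```python
def r_squared(ssr, total_ss):
    """Proportion of total variation explained by the regression."""
    return ssr / total_ss


def adjusted_r_squared(sse, total_ss, n, k):
    """R^2 adjusted for the number of predictors k."""
    return 1 - (sse / (n - k - 1)) / (total_ss / (n - 1))


r2 = r_squared(ssr=234.96, total_ss=236.02)
r2_adj = adjusted_r_squared(sse=1.06, total_ss=236.02, n=6, k=2)
print(f"R-Sq = {r2:.1%}, R-Sq(adj) = {r2_adj:.1%}")
```

Both values round to the figures shown on the printout, 99.6% and 99.3%.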

37. Refer to College Textbook Sales Narrative. The value of $R^2$(adj) was 95.7% when a simple linear model was fit to the data. Does the linear or the quadratic model fit better? ANS: Since the value of $R^2$(adj) = 99.3% is just slightly larger than the value of $R^2$(adj) = 95.7% for the linear model, the quadratic model fits just slightly better. PTS: 1 REF: 584-585 | 589 BLM: Higher Order - Analyze

TOP: 1–4

38. Refer to College Textbook Sales Narrative. What conclusions can you draw from the accompanying residual plots?

ANS: There are no obvious violations in the assumptions based on the patterns shown in the diagnostic plots.

PTS: 1 REF: 590 BLM: Higher Order - Analyze

TOP: 5–8

Air Pollution Monitors Narrative

An experiment was designed to compare several different types of air pollution monitors. Each monitor was set up and then exposed to different concentrations of ozone, ranging between 15 and 230 parts per million (ppm), for periods of 8–72 hours. Filters on the monitor were then analyzed, and the response of the monitor was measured. The results for one type of monitor showed a linear pattern. The results for another type of monitor are listed in the table.

Ozone (ppm/hr), x                 0.07  0.13  0.19  0.32  0.58  0.66  0.69  1.30
Relative Fluorescence Density, y     9    19    28    34    43    48    53    62

39. Refer to Air Pollution Monitors Narrative. Plot the data. What model would you expect to provide the best fit to the data? Write the equation of that model. ANS: The pattern of the points suggests a quadratic model given by the formula: $y = \beta_0 + \beta_1 x + \beta_2 x^2 + \varepsilon$. PTS: 1 REF: 587-588 BLM: Higher Order - Analyze

TOP: 1–4

40. Refer to Air Pollution Monitors Narrative. Use statistical software to fit the model from the previous question. ANS: The statistical software printout fitting the quadratic model to the data is shown below.

Regression Analysis
The regression equation is y = 7.76 + 86.8x – 34.9x-square

Predictor  Coef     StDev  T      P
Constant     7.764  3.110   2.50  0.055
x           86.82   11.79   7.36  0.001
x-square   –34.917  8.559  –4.08  0.010

S = 3.742  R-Sq = 96.9%  R-Sq(adj) = 95.6%

Analysis of Variance
Source          DF  SS       MS       F      P
Regression       2  2,166.0  1,083.0  77.34  0.000
Residual Error   5     70.0     14.0
Total            7  2,236.0

Source    DF  Seq SS
x          1  1932.9
x-square   1   233.1

PTS: 1 REF: 589-590 BLM: Higher Order - Apply

TOP: 1–4

41. Refer to Air Pollution Monitors Narrative. Find the least-squares regression equation relating the monitor’s response to the ozone concentration. ANS: The least squares equation is $\hat{y} = 7.76 + 86.8x - 34.9x^2$. PTS: 1 REF: 590 BLM: Higher Order - Understand

TOP: 1–4

42. Refer to Air Pollution Monitors Narrative. Does the model contribute significant information for the prediction of the monitor’s response based on ozone exposure? Use the appropriate p-value to make your decision. ANS: The hypotheses of interest are $H_0: \beta_1 = \beta_2 = 0$ vs. $H_a$: at least one $\beta_i \neq 0$. The F test for the overall utility of the model is F = 77.34 with p-value = 0.000. The results are highly significant; the model contributes significant information for the prediction of y. PTS: 1 REF: 589-590 BLM: Higher Order - Evaluate

TOP: 1–4

43. Refer to Air Pollution Monitors Narrative. Find $R^2$ on the printout. What does this value tell you about the effectiveness of the multiple regression analysis? ANS: From the statistical software printout, $R^2$ = 96.9%, which means that 96.9% of the total variation in relative fluorescence density can be explained by the quadratic model. The model is very effective. PTS: REF: 589-590 BLM: Higher Order - Understand

TOP: 1–4

Life Expectancy Narrative

An actuary wanted to develop a model to predict how long individuals will live. After consulting a number of physicians, she collected the age at death (y), the average number of hours of exercise per week ($x_1$), the cholesterol level ($x_2$), and the number of points that the individual’s blood pressure exceeded the recommended value ($x_3$). A random sample of 40 individuals was selected. The computer output of the multiple regression model is shown below.

The regression equation is $\hat{y} = 55.8 + 1.79x_1 - 0.021x_2 - 0.016x_3$

Predictor  Coef    StDev  T
Constant   55.8    11.8    4.729
$x_1$       1.79    0.44   4.068
$x_2$      –0.021   0.011 –1.909
$x_3$      –0.016   0.014 –1.143

S = 9.47  R-Sq = 22.5%

Analysis of Variance
Source of Variation  df  SS    MS      F
Regression            3   936  312     3.477
Error                36  3230   89.722
Total                39  4166

44. Refer to Life Expectancy Narrative. Is there enough evidence at the 10% significance level to infer that the model is useful in predicting length of life? ANS: $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs. $H_a$: At least one $\beta_i$ is not equal to 0.

Rejection region: F > $F_{0.10,\,3,\,36}$ ≈ 2.24, Test statistic: F = 3.477 Conclusion: Reject the null hypothesis. Yes, there is enough evidence at the 10% significance level to infer that the model is useful in predicting length of life. PTS: 1 REF: 582-584 BLM: Higher Order - Evaluate

TOP: 1–4

45. Refer to Life Expectancy Narrative. Is there enough evidence at the 1% significance level to infer that the average number of hours of exercise per week and the age at death are linearly related? Justify your conclusion.

ANS: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, where $\beta_1$ is the coefficient of exercise hours.

Rejection region: | t | > 2.724, Test statistic: t = 4.068 Conclusion: Reject the null hypothesis. Yes, there is enough evidence at the 1% significance level to infer that the average number of hours of exercise per week and the age at death are linearly related. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

46. Refer to Life Expectancy Narrative. Is there enough evidence at the 5% significance level to infer that the cholesterol level and the age at death are negatively linearly related? Justify your conclusion. ANS: $H_0: \beta_2 = 0$ vs. $H_a: \beta_2 < 0$, where $\beta_2$ is the coefficient of cholesterol level.

Rejection region: t < –1.69, Test statistic: t = –1.909 Conclusion: Reject the null hypothesis. Yes, there is enough evidence at the 5% significance level to infer that the cholesterol level and the age at death are negatively linearly related. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

47. Refer to Life Expectancy Narrative. Is there sufficient evidence at the 5% significance level to infer that the number of points that the individual’s blood pressure exceeded the recommended value and the age at death are negatively linearly related? Justify your conclusion. ANS: $H_0: \beta_3 = 0$ vs. $H_a: \beta_3 < 0$, where $\beta_3$ is the coefficient of the blood pressure variable.

Rejection region: t < –1.69, Test statistic: t = –1.143 Conclusion: Don’t reject the null hypothesis. No, there is not sufficient evidence at the 5% significance level to infer that the number of points that the individual’s blood pressure exceeded the recommended value and the age at death are negatively linearly related. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

48. Refer to Life Expectancy Narrative. What is the coefficient of determination? What does this statistic tell you? ANS: $R^2$ = 0.225. This means that 22.5% of the variation in the age at death is explained by the three variables: the average number of hours of exercise per week, the cholesterol level, and the number of points that the individual’s blood pressure exceeded the recommended value, while 77.5% of the variation remains unexplained. PTS: 1 REF: 584 BLM: Higher Order - Understand

TOP: 1–4

49. Refer to Life Expectancy Narrative. Interpret the coefficient $b_1$.

ANS: $b_1$ = 1.79. This tells us that, for each additional hour of exercise per week, the age at death on average is extended by 1.79 years (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

50. Refer to Life Expectancy Narrative. Interpret the coefficient $b_2$.

ANS: $b_2$ = –0.021. This tells us that, for each additional unit increase in the cholesterol level, the age at death on average is shortened by 0.021 years or equivalently about a week (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

51. Refer to Life Expectancy Narrative. Interpret the coefficient $b_3$.

ANS: $b_3$ = –0.016. This tells us that, for each additional point by which the individual’s blood pressure exceeded the recommended value, the age at death on average is shortened by 0.016 years or equivalently about six days (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

Demographic Variables and TV Narrative

A statistician wanted to determine if the demographic variables of age, education, and income influence the number of hours of television watched per week. A random sample of 25 adults was selected to estimate the multiple regression model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \varepsilon$, where y is the number of hours of television watched last week, $x_1$ is the age (in years), $x_2$ is the number of years of education, and $x_3$ is income (in $1000s). The computer output is shown below.

The regression equation is $\hat{y} = 22.3 + 0.41x_1 - 0.29x_2 - 0.12x_3$

Predictor  Coef   StDev  T
Constant   22.3   10.7    2.084
$x_1$       0.41   0.19   2.158
$x_2$      –0.29   0.13  –2.231
$x_3$      –0.12   0.03  –4.00

S = 4.51  R-Sq = 34.8%

Analysis of Variance
Source of Variation  df  SS   MS      F
Regression            3  227  75.667  3.730
Error                21  426  20.286
Total                24  653

52. Refer to Demographic Variables and TV Narrative. Test the overall validity of the model at the 5% significance level. ANS: $H_0: \beta_1 = \beta_2 = \beta_3 = 0$ vs. $H_a$: At least one $\beta_i$ is not equal to 0.

Rejection region: F > $F_{0.05,\,3,\,21}$ = 3.07 Test statistic: F = 3.73 Conclusion: Reject the null hypothesis. The model is valid at $\alpha$ = 0.05. PTS: 1 REF: 582-583 BLM: Higher Order - Evaluate

TOP: 1–4

53. Refer to Demographic Variables and TV Narrative. Is there sufficient evidence at the 1% significance level to indicate that hours of television watched and age are linearly related? Justify your conclusion. ANS: $H_0: \beta_1 = 0$ vs. $H_a: \beta_1 \neq 0$, where $\beta_1$ is the coefficient of age.

Rejection region: | t | > 2.831 Test statistic: t = 2.158 Conclusion: Don’t reject the null hypothesis. There is insufficient evidence at the 1% significance level to indicate that hours of television watched per week and age are linearly related. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

54. Refer to Demographic Variables and TV Narrative. Is there sufficient evidence at the 1% significance level to indicate that hours of television watched and education are negatively linearly related? Justify your conclusion. ANS: $H_0: \beta_2 = 0$ vs. $H_a: \beta_2 < 0$, where $\beta_2$ is the coefficient of years of education.

Rejection region: t < –2.518 Test statistic: t = –2.231 Conclusion: Don’t reject the null hypothesis. There is insufficient evidence at the 1% significance level to indicate that hours of television watched per week and the number of years of education are negatively linearly related. PTS: 1 REF: 584-585 BLM: Higher Order - Evaluate

TOP: 1–4

55. Refer to Demographic Variables and TV Narrative. What is the coefficient of determination? What does this statistic tell you? ANS: $R^2$ = 0.348. This means that 34.8% of the variation in the number of hours of television watched per week is explained by the three variables: age, number of years of education, and income, while 65.2% remains unexplained. PTS: 1 REF: 584 BLM: Higher Order - Understand

TOP: 1–4

56. Refer to Demographic Variables and TV Narrative. Interpret the coefficient $b_1$.

ANS: $b_1$ = 0.41. This tells us that, for each additional year of age, the number of hours of television watched per week on average increases by 0.41 (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

57. Refer to Demographic Variables and TV Narrative. Interpret the coefficient $b_2$.

ANS: $b_2$ = –0.29. This tells us that, for each additional year of education, the number of hours of television watched per week on average decreases by 0.29 (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

58. Refer to Demographic Variables and TV Narrative. Interpret the coefficient $b_3$.

ANS: $b_3$ = –0.12. This tells us that, for each additional $1000 in income, the number of hours of television watched per week on average decreases by 0.12 (assuming that the other independent variables in the model are held constant). PTS: 1 REF: 579-581 BLM: Higher Order - Understand

TOP: 1–4

59. Refer to Eating Habits of Canadians. How well does the model fit? Use any relevant statistics and diagnostic tools from the printout to answer this question. ANS: From the statistical software printout, use F = 69.83 with p-value = 0.000 to test the overall utility of the model. The model contributes significant information for the prediction of y. There are no obvious violations of the regression assumptions, since the patterns in the diagnostic plots are as expected when the regression assumptions are satisfied. PTS: 1 REF: 594-599 BLM: Higher Order - Evaluate

TOP: 5–8

60. Refer to Eating Habits of Canadians. Write the equations of the two straight lines that describe the trend in consumption over the period of 30 years for beef and for chicken. ANS: The LS regression equation is

. The equation for

chicken uses

and the prediction equation is

, while the equation

for beef uses

and the prediction equation is

PTS: 1 REF: 594-599 BLM: Higher Order - Apply

TOP: 5–8

61. Refer to Eating Habits of Canadians. Use the prediction equation to find a point estimate of the average beef consumption per family of three in 2005. Compare this value with the value labelled “Fit” in the printout.

ANS: Using the second prediction equation from part (b), the point estimate is . To within rounding error, this is the value marked “Fit” in the computer printout. PTS: 1 REF: 594-599 BLM: Higher Order - Apply

TOP: 5–8

62. Refer to Eating Habits of Canadians. Use the printout to find a 95% confidence interval for the average beef consumption per family of three in 2005. What is the 95% prediction interval for the beef consumption per family of three in 2005? Is there any problem with the validity of the 95% confidence level for these intervals? ANS: From the printout, the two intervals are and you are predicting outside of the experimental unit, there is a danger of inaccurate predictions! PTS: 1 REF: 594-599 BLM: Higher Order - Evaluate

. Since

TOP: 5–8

Rocket Experiments Narrative

An engineer was investigating the relationship between the thrust of an experimental rocket (y), the percent composition of a secret chemical in the fuel (x1), and the internal temperature of a chamber of the rocket (x2). The engineer starts by fitting a quadratic model, but he believes that the full quadratic model is too complex and can be reduced by including only the linear terms and the interaction term.

63. Refer to Rocket Experiments Narrative. Write the two models the engineer is considering. ANS: Complete model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \beta_4 x_1^2 + \beta_5 x_2^2 + \varepsilon$. Reduced model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1 x_2 + \varepsilon$. PTS: 1 REF: 587-590 BLM: Higher Order - Analyze

TOP: 5–8

64. Refer to Rocket Experiments Narrative. The engineer obtained a random sample of 66 measurements and computed the SSE for both the complete model and the reduced model. The values were 1477.8 and 1678.8, respectively. Perform the appropriate test of hypothesis to determine whether the reduced model is adequate for the engineer’s use. Use $\alpha$ = 0.05. ANS: The hypotheses of interest are $H_0: \beta_4 = \beta_5 = 0$ vs. $H_a$: At least one of $\beta_4, \beta_5$ is not 0, where $\beta_4$ and $\beta_5$ are the coefficients of the two quadratic terms. The test statistic is

$F = \dfrac{(SSE_R - SSE_C)/(k - r)}{SSE_C/[n - (k + 1)]} = \dfrac{(1678.8 - 1477.8)/2}{1477.8/60}$ = 4.08.

The critical value of F with $\alpha$ = 0.05, $df_1$ = k – r = 2, and $df_2$ = n – (k + 1) = 60 is 3.15. Reject $H_0$ if F > 3.15. Since F = 4.08 > 3.15, $H_0$ is rejected. There is evidence to indicate that at least one of the two quadratic variables is contributing significant information for predicting y. Hence, the complete model should be used. PTS: 1 REF: 587-590 BLM: Higher Order - Evaluate

TOP: 5–8
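The test in question 64 compares the error sums of squares of the complete and reduced models. The following is a minimal sketch of that partial F computation; k = 5 terms in the complete model and r = 3 in the reduced model follow from the degrees of freedom quoted in the answer, the critical value is copied rather than computed, and the function name is illustrative.

```python
def partial_f(sse_reduced, sse_complete, n, k, r):
    """F statistic for testing whether the k - r extra terms can be dropped."""
    numerator = (sse_reduced - sse_complete) / (k - r)
    denominator = sse_complete / (n - (k + 1))
    return numerator / denominator


F_CRIT = 3.15  # F_{0.05} with 2 and 60 df, copied from the answer

f_stat = partial_f(sse_reduced=1678.8, sse_complete=1477.8, n=66, k=5, r=3)
print(f"F = {f_stat:.2f}, reject H0: {f_stat > F_CRIT}")
```

The same function applies to any pair of nested models, provided the reduced model's terms are a subset of the complete model's terms.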

65. Refer to Chemical Analysis Narrative. Write the two models the chemist considered. ANS: Complete model: Reduced model: PTS: 1 REF: 594-598 BLM: Higher Order - Analyze

TOP: 5–8

66. Refer to Chemical Analysis Narrative. Use the statistical software output below to test whether the reduced model is adequate at the 0.05 level of significance.

Complete Model
Regression Analysis
Analysis of Variance
Source          DF  SS        MS       F      P
Regression       4  18,969.7  4,742.4  19.71  0.007
Residual Error   4     962.5    240.6
Total            8  19,932.2

Reduced Model
Regression Analysis
Analysis of Variance
Source          DF  SS        MS       F      P
Regression       2  18,385.7  9,192.9  35.67  0.000
Residual Error   6   1,546.5    257.8
Total            8  19,932.2

ANS: The hypotheses of interest are $H_0$: the two coefficients dropped from the complete model are both 0 vs. $H_a$: At least one of them is not 0. The test statistic is

$F = \dfrac{(SSE_R - SSE_C)/(k - r)}{SSE_C/[n - (k + 1)]} = \dfrac{(1{,}546.5 - 962.5)/2}{240.6}$ = 1.2136.

The critical value of F with $\alpha$ = 0.05, $df_1$ = k – r = 2, and $df_2$ = n – (k + 1) = 4 is 6.94. Reject $H_0$ if F > 6.94. Since F = 1.2136 < 6.94, we fail to reject $H_0$ at $\alpha$ = 0.05. There is no evidence to indicate that at least one of the dropped coefficients is not 0. Hence, the reduced model is adequate.

PTS: 1 REF: 594-599 BLM: Higher Order - Evaluate

TOP: 5–8
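The entries of the two Analysis of Variance tables in question 66 are linked by MS = SS/DF and F = MS(Regression)/MS(Residual Error), and the test of the dropped terms uses only the two Residual Error rows. The following sketch recomputes these quantities from the DF and SS columns; the dictionary layout and the function name are assumptions made for illustration.

```python
# (DF, SS) pairs read from the software output in question 66
complete = {"regression": (4, 18969.7), "error": (4, 962.5)}
reduced = {"regression": (2, 18385.7), "error": (6, 1546.5)}


def mean_squares_and_f(table):
    """Return (MSR, MSE, F) from the regression and error rows of an ANOVA table."""
    df_reg, ss_reg = table["regression"]
    df_err, ss_err = table["error"]
    msr = ss_reg / df_reg
    mse = ss_err / df_err
    return msr, mse, msr / mse


msr_c, mse_c, f_c = mean_squares_and_f(complete)   # overall F, complete model
msr_r, mse_r, f_r = mean_squares_and_f(reduced)    # overall F, reduced model

# Partial F: increase in SSE per dropped term, relative to the complete model's MSE
dropped_terms = complete["regression"][0] - reduced["regression"][0]
f_partial = ((reduced["error"][1] - complete["error"][1]) / dropped_terms) / mse_c

print(round(f_c, 2), round(f_r, 2), round(f_partial, 4))
```

With the unrounded MSE (962.5/4 = 240.625) the partial F comes out near 1.2135; the 1.2136 in the answer reflects the rounded table value 240.6. Either way the statistic is far below 6.94.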

67. Use the following partial output and residual plot generated using statistical software to determine whether there are any potential outliers in these data.

Obs   x1     y        Fit      St Resid
1     15.0   145.00   143.89    0.39
2     38.0   228.00   223.04    0.66
3     23.0   150.00   161.20   –1.02
4     16.0   130.00   138.11   –0.64
5     16.0   160.00   134.12    1.96
6     13.0   114.00   118.27   –0.35
7     20.0   142.00   149.54   –0.59
8     34.0   265.00   265.20   –0.07
9     30.0   200.00   200.62   –0.06

ANS: Based on the output and the plot, it would appear that observation #5 is close to being a suspect outlier since it has a standardized residual close to 2. PTS:

REF: 586

TOP: 5–8

BLM: Higher Order - Analyze

68. What is stepwise regression, and when is it desirable to make use of this multiple regression technique? ANS: Stepwise regression is a multiple regression estimation technique whereby independent variables are added to the regression equation one at a time. The first x variable to enter the regression is the one that explains the greatest amount of variation in y. The second variable to enter is the one that explains the greatest amount of the remaining variation. The use of stepwise regression can reduce the possibility of multicollinearity since it is unlikely that two highly correlated x variables will be included in a multiple regression that is estimated using the stepwise technique. This technique is useful when there are a great many independent variables. PTS: 1 REF: 607-609 BLM: Higher Order - Understand

TOP: 5–8

69. The two largest values in a correlation matrix are the 0.89 correlation between y and one independent variable and the 0.83 correlation between y and a second independent variable. During a stepwise regression analysis, the first of these variables is the first independent variable brought into the equation. Will the second necessarily be next? If not, why not?

ANS: The second variable will not necessarily be the next variable brought into the equation. We do not know the correlation between the two independent variables, so we cannot determine whether the second variable will explain the greatest amount of the remaining variation in y. PTS: 1 REF: 607-609 BLM: Higher Order - Analyze

TOP: 5–8
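Question 69 turns on the variation left over once the first variable has entered. One way to see this, offered as a sketch and not as the selection rule of any particular software package, is the first-order partial correlation between y and the runner-up after the first variable is accounted for. Only 0.89 and 0.83 come from the question; the intercorrelation values tried below are hypothetical and were kept inside the range for which the three correlations are mutually consistent.

```python
import math


def partial_corr(r_y_cand, r_y_in, r_cand_in):
    """Correlation between y and a candidate predictor after removing the
    linear effect of a predictor that is already in the model."""
    return (r_y_cand - r_y_in * r_cand_in) / math.sqrt(
        (1 - r_y_in ** 2) * (1 - r_cand_in ** 2)
    )


r_y_first = 0.89    # correlation of y with the first variable entered (from the question)
r_y_second = 0.83   # correlation of y with the runner-up (from the question)

# Hypothetical correlations between the two predictors
for r_between in (0.50, 0.75, 0.95):
    remaining = partial_corr(r_y_second, r_y_first, r_between)
    print(r_between, round(remaining, 3))
```

The more strongly the two predictors are correlated with each other, the less the runner-up has left to explain, so a third variable with a smaller simple correlation could enter ahead of it.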

70. In general, on what basis are independent variables selected for entry into the equation during stepwise regression? ANS: Independent variables are selected for entry into the equation during stepwise regression based upon the amount of the remaining variation in y (the variation that has not already been explained by included variables) that a candidate variable can explain. PTS: 1 REF: 607-609 BLM: Higher Order - Understand

TOP: 5–8

71. Discuss some of the signals for the presence of multicollinearity. ANS: There are several clues to the presence of multicollinearity:

a. An independent variable known to be an important predictor ends up having a partial regression coefficient that is not significant.
b. A partial regression coefficient exhibits the wrong sign.
c. When an independent variable is added or deleted, the partial regression coefficients for the other variables change dramatically.

A more practical way to identify multicollinearity is through the examination of a correlation matrix, which is a matrix that shows the correlation of each variable with each of the other variables. A high correlation between two independent variables is an indication of multicollinearity.

PTS: 1 REF: 608-609 BLM: Higher Order - Understand

TOP: 9–10
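The correlation-matrix check described in the answer to question 71 can be sketched as follows. The predictor values and the 0.8 cut-off are invented for illustration; the text does not prescribe a threshold.

```python
import math


def pearson(u, v):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    s_uv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    s_uu = sum((a - mean_u) ** 2 for a in u)
    s_vv = sum((b - mean_v) ** 2 for b in v)
    return s_uv / math.sqrt(s_uu * s_vv)


# Hypothetical predictor columns, invented for this sketch
predictors = {
    "x1": [2, 4, 5, 7, 9, 11],
    "x2": [1, 3, 6, 6, 10, 12],   # moves almost in step with x1
    "x3": [8, 3, 9, 2, 7, 4],
}

threshold = 0.8                   # arbitrary cut-off chosen for illustration
names = list(predictors)
flagged = []
for i, a in enumerate(names):
    for b in names[i + 1:]:
        r = pearson(predictors[a], predictors[b])
        if abs(r) > threshold:
            flagged.append((a, b))

print(flagged)
```

Each flagged pair is a candidate source of multicollinearity; in practice one of the two variables is often dropped or the pair is combined.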

72. Discuss briefly what is meant by multicollinearity. ANS: Multicollinearity (also called collinearity and intercorrelation) is a condition that exists when two or more of the independent variables are highly correlated with each other. PTS:

REF: 608

TOP: 9–10

BLM: Remember

Chapter 14—Analysis of Categorical Data MULTIPLE CHOICE 1. The area to the right of a chi-square value is 0.05. Given this information, for 9 degrees of freedom, what would be the table value? a. 3.32511 b. 4.16816 c. 16.9190 d. 19.0228 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

2. How is a chi-square goodness-of-fit test conducted? a. as a lower-tailed test b. as an upper-tailed test c. as a two-tailed test d. either as a lower-tailed or as an upper-tailed test ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

3. A left-tailed area in the chi-square distribution equals 0.90. Taking this into consideration, for 7 degrees of freedom, what does the table value equal? a. 1.68987 b. 2.83311 c. 12.0170 d. 14.0671 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

4. In a chi-square goodness-of-fit test with 5 degrees of freedom and a significance level of 0.05, the chi-square value from the table is 11.0705. Which of the following computed values of the chi-square test statistic will lead to rejection of the null hypothesis? a. 7.814 b. 8.952 c. 10.78 d. 17.61 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 626-628

TOP: 1–3

5. In chi-square tests, there is a rule known as Rule of Five. What does this rule require? a. It requires that the observed frequency for each cell be five or more. b. It requires that the degrees of freedom for the test be at least five. c. It requires that the expected frequency for each cell be five or more. d. It requires that the difference between the observed and expected frequency for each cell be at least five.

ANS: C BLM: Remember

PTS:

REF: 644

TOP: 1–3

6. Consider a cell in a contingency table. Given the cell’s row total of 80, the cell’s column total of 60, and a sample size of 250, what is the cell’s expected frequency? a. 1.786 b. 3.125 c. 19.2 d. 20.0 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 1–3
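The expected frequency asked for in question 6 is the row total times the column total divided by the sample size. A minimal sketch, with a function name chosen here for illustration:

```python
def expected_count(row_total, column_total, n):
    """Estimated expected cell count under independence of rows and columns."""
    return row_total * column_total / n


# Question 6: row total 80, column total 60, sample size 250
cell = expected_count(80, 60, 250)
print(cell, cell >= 5)   # the second value checks the Rule of Five for this cell
```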

7. In a goodness-of-fit test, suppose that the value of the test statistic is 11.89 and the degrees of freedom are 5. At the 5% significance level, what may one conclude about the null hypothesis? a. It is rejected, and the p-value for the test is smaller than 0.05. b. It is not rejected, and the p-value for the test is greater than 0.05. c. It is rejected, and the p-value for the test is greater than 0.05. d. It is not rejected, and the p-value for the test is smaller than 0.05. ANS: A PTS: 1 BLM: Higher Order - Evaluate

REF: 626-628

TOP: 1–3

8. To determine whether a single coin is fair, the coin was tossed 250 times, and heads was observed 140 times. What is the value of the test statistic? a. 3.6 b. 30 c. 40 d. 110 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 627-628

TOP: 1–3

9. In a goodness-of-fit test, suppose that a sample showed that the observed frequency $O_i$ and the expected frequency $E_i$ were equal for each cell i. Using this information, what may one conclude about the null hypothesis? a. It is rejected at 0.05 but is not rejected at 0.025. b. It is not rejected at 0.05 but is rejected at 0.025. c. It is rejected at any level $\alpha$. d. It is not rejected at any level. ANS: D PTS: 1 BLM: Higher Order - Evaluate

REF: 626-628

TOP: 1–3

10. In a goodness-of-fit test, suppose that the value of the test statistic is 13.08 and the number of degrees of freedom is 6. At the 5% significance level, what may one conclude about the null hypothesis? a. It is rejected, and the p-value for the test is smaller than 0.05.

b. It is not rejected, and the p-value for the test is greater than 0.05. c. It is rejected, and the p-value for the test is greater than 0.05. d. It is not rejected, and the p-value for the test is smaller than 0.05. ANS: A PTS: 1 BLM: Higher Order - Evaluate

REF: 626-628

TOP: 1–3

11. If each element in a population is classified into one and only one of several categories, which of the following best describes this kind of population? a. It is a normal population. b. It is a multinomial population. c. It is a chi-square population. d. It is a binomial population. ANS: B BLM: Remember

PTS:

REF: 625

TOP: 1–3

12. Which of the following must be known in order to determine the critical values in the chi-square distribution table? a. the degrees of freedom b. the probability of Type I error c. the probability of Type II error d. both the degrees of freedom and the probability of Type I error ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

13. Of the values for a chi-square test statistic listed below, which one is likely to lead to rejecting the null hypothesis in a goodness-of-fit test? a. 0 b. 1.3 c. 1.9 d. 40 ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

14. Which of the following is our best option if the expected frequency for any cell i is less than five and we want to use Pearson’s chi-square statistic in our experiment? a. We must choose another sample of five or more observations. b. We should use the normal distribution instead of the chi-square distribution. c. We should combine the cells such that each observed frequency is five or more. d. We increase the number of degrees of freedom for the test by five. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 644

TOP: 1–3

15. Which statistical technique is appropriate when we describe a single population of qualitative data with two or more categories? a. the z test of the difference between two proportions b. the chi-square goodness-of-fit test c. the chi-square test of a contingency table d. either (a) or (b)

ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 625-627

TOP: 1–3

16. Which of the following best describes the sampling distribution of the test statistic for a goodness-of-fit test with k categories? a. It is a Student t distribution with k – 1 degrees of freedom. b. It is a normal distribution. c. It is a chi-square distribution with k – 1 degrees of freedom. d. It is an approximately chi-square distribution with k – 1 degrees of freedom. ANS: D BLM: Remember

PTS:

REF: 626-627

TOP: 1–3

17. Consider a multinomial experiment with 200 trials, and the outcome of each trial can be classified into one of five categories. In this case, how many degrees of freedom would be associated with the chi-square goodness-of-fit test? a. 195 b. 40 c. 5 d. 4 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

18. Which of the following is NOT a characteristic of a multinomial experiment? a. The experiment consists of a fixed number, n, of trials. b. The outcome of each trial can be classified into one of two categories called successes and failures. c. The probability that the outcome will fall into cell i remains constant for each trial. d. Each trial of the experiment is independent of the other trials. ANS: B BLM: Remember

PTS:

REF: 625

TOP: 1–3

19. In the chi-square goodness-of-fit test, if the expected frequencies, $E_i$, and the observed frequencies, $O_i$, were very different, what would we conclude? a. The null hypothesis is false, and we would reject it. b. The null hypothesis is true, and we would not reject it. c. The alternative hypothesis is false, and we would reject it. d. The chi-square distribution is invalid, and we would use the t-distribution instead. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

20. Which of the following is the appropriate test to use if we wish to determine whether there is evidence that the proportion of successes is higher in group 1 than in group 2? a. z test b. $\chi^2$ test c. t test with 2 degrees of freedom d. F test with 2 degrees of freedom ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 629 | 643

TOP: 1–3

21. What is the appropriate test to use if we wish to determine whether there is evidence that the proportion of successes is the same in group 1 as in group 2? a. the z test b. the $\chi^2$ test c. the z test and the $\chi^2$ test d. t test with 2 degrees of freedom ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 643

TOP: 1–3

22. A chi-square test of independence with 10 degrees of freedom results in a test statistic of 19.25. Using the chi-square table, which of the following is the most accurate statement that can be made about the p-value for this test? a. p-value < 0.025 b. 0.025 < p-value < 0.05 c. 0.05 < p-value < 0.10 d. 0.10 < p-value < 0.20 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4
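The table lookup in question 22 brackets the p-value between two tabulated tail areas. When the degrees of freedom are even, the upper-tail area of the chi-square distribution has a finite closed form, so the bracket can be confirmed directly. The function below is a sketch restricted to that even-df case; its name is an assumption, and for odd df a statistical library or table would be needed.

```python
import math


def chi2_upper_tail_even_df(x, df):
    """P(chi-square > x) for a positive even number of degrees of freedom.

    Uses exp(-x/2) * sum over j < df/2 of (x/2)^j / j!, which is exact
    only when df is even.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2
    terms = (half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * sum(terms)


# Question 22: test statistic 19.25 with 10 degrees of freedom
p_value = chi2_upper_tail_even_df(19.25, 10)
print(round(p_value, 4), 0.025 < p_value < 0.05)
```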

23. Upon which of the following kinds of variables is the chi-square test of independence based? a. two qualitative variables b. two quantitative variables c. three or more qualitative variables d. three or more quantitative variables ANS: A BLM: Remember

PTS:

REF: 631-632

TOP: 4

24. A chi-square test of independence is applied to a contingency table with four rows and five columns for two qualitative variables. What are the degrees of freedom for this test? a. 20 b. 16 c. 15 d. 12 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 632-633

TOP: 4

25. In a chi-square test of independence, the value of the test statistic was , and the critical value at $\alpha$ = 0.025 was 11.1433. Which of the following conclusions may we draw from this information? a. We fail to reject the null hypothesis at $\alpha$ = 0.025. b. We reject the null hypothesis at $\alpha$ = 0.025. c. We don’t have enough evidence to accept or reject the null hypothesis at $\alpha$ = 0.025. d. We should decrease the level of significance in order to reject the null hypothesis. ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

26. How many columns would the table associated with a contingency table test with 4 rows and 15 degrees of freedom have? a. 5 columns b. 6 columns c. 9 columns d. 11 columns ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 632-633

TOP: 4

27. What summary data does a contingency table contain? a. It classifies data with respect to two qualitative variables. b. It divides each variable into two or more categories. c. It contains numbers that show the frequency of occurrence of all possible combinations of categories. d. All of (a), (b), and (c). ANS: D BLM: Remember

PTS:

REF: 631-632

TOP: 4

28. Consider a cell in a contingency table. Given the cell’s row total of 200, the cell’s column total of 75, and a sample size of 1000, what is the cell’s expected frequency? a. 10 b. 15 c. 20 d. 44 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 4

29. Which of the following statements is NOT a property of a contingency table test? a. The nature of the sampling distribution of chi-square depends on the number of degrees of freedom associated with the problem under investigation. b. The degrees of freedom are found as (r – 2)(c – 2), where r represents number of rows and c represents number of columns. c. The contingency table must have a minimum of two rows and two columns.

ANS: B BLM: Remember

PTS:

REF: 633

TOP: 4

30. Which of the following statements is NOT a characteristic of a goodness-of-fit test? a. It determines the likelihood that sample data have been generated from a population that conforms to a specified type of probability distribution. b. It compares the entire shapes of two (discrete or continuous) probability distributions: one describing known population data and the other one describing hypothetical sample data. c. The aim of the test might be limited to identifying only the family to which the underlying distribution belongs. d. The aim of the test might be limited to identifying only the family to which the underlying distribution belongs or it might go further, seeking even to identify a particular member of that family. ANS: B TOP: 4

PTS: 1 REF: 625-627 | 644 BLM: Higher Order - Understand

31. How many columns does the table associated with a contingency table test with 4 rows and 30 degrees of freedom have? a. 7 columns b. 9 columns c. 11 columns d. 13 columns ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 4

32. The president of a large university collected data from students concerning building a new library, and classified the responses into different categories (strongly agree, agree, undecided, disagree, strongly disagree) and according to whether the student was male or female. To determine whether the data provide sufficient evidence to indicate that the responses depend upon gender, which of the following would be the most appropriate test? a. a chi-square goodness-of-fit test b. a chi-square test of a contingency table (test of independence) c. a chi-square test of normality d. a chi-square test of abnormality ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 633

TOP: 4

33. Which of the following is the number of degrees of freedom for a contingency table with six rows and six columns? a. 36 b. 25 c. 12 d. 6 ANS: B PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 4
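Several questions in this group use the degrees-of-freedom rule df = (r – 1)(c – 1), either forward (rows and columns given) or backward (rows and df given). A small sketch with hypothetical function names:

```python
def contingency_df(rows, columns):
    """Degrees of freedom for a chi-square test on an r x c contingency table."""
    return (rows - 1) * (columns - 1)


def columns_from_df(rows, df):
    """Solve df = (rows - 1)(columns - 1) for the number of columns."""
    quotient, remainder = divmod(df, rows - 1)
    if remainder != 0:
        raise ValueError("df is not a multiple of rows - 1")
    return quotient + 1


print(contingency_df(4, 5))      # four rows, five columns
print(contingency_df(6, 6))      # six rows, six columns
print(columns_from_df(4, 15))    # 4 rows and 15 degrees of freedom
print(columns_from_df(4, 30))    # 4 rows and 30 degrees of freedom
```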

34. A chi-square test of a contingency table with 4 rows and 5 columns shows that the value of the test statistic is 22.18. Which of the following is the most accurate statement that can be made about the p-value for this test? a. p-value is greater than 0.05 b. p-value is smaller than 0.025 c. p-value is greater than 0.025 but smaller than 0.05 d. p-value is greater than 0.10 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

35. Which statistical technique is appropriate when we wish to analyze the relationship between two qualitative variables with two or more categories? a. the chi-square test of a multinomial experiment b. the chi-square test of a contingency table c. the t test of the difference between two means d. the z test of the difference between two proportions ANS: B BLM: Remember

PTS:

REF: 631-632

TOP: 4

36. In which of the following situations are contingency tables used? a. to test independence of two samples b. to test dependence in matched pairs c. to test independence of two qualitative variables in a population d. to describe a single population ANS: C BLM: Remember

PTS:

REF: 631-632

TOP: 4

37. Upon which of the following types of variables is the chi-square test of a contingency table based? a. two qualitative variables b. two quantitative variables c. three or more qualitative variables d. three or more quantitative variables ANS: A BLM: Remember

PTS:

REF: 631-632

TOP: 4

38. If we wanted to conduct a one-tailed test of a population proportion, which of the following tests could we employ? a. z test of a population proportion b. the chi-square goodness-of-fit test since $\chi^2 = z^2$ c. the chi-square test of independence ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 380 | 643

TOP: 5–7

39. If we wanted to conduct a two-tailed test of a population proportion, which of these tests could we employ?

a. z test of a population proportion b. the chi-square goodness-of-fit test since $\chi^2 = z^2$ c. the chi-square test of independence d. both a and b ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 380 | 643

TOP: 5–7

40. Which of the following procedures would be considered a suitable application for the chi-square technique? a. testing the alleged independence of two qualitative variables b. making inferences about the relative sizes of more than two population proportions c. conducting a goodness-of-fit-test to determine whether data are consistent with data drawn from a particular probability distribution ANS: A TOP: 5–7

PTS: 1 REF: 631-632 | 643-644 BLM: Higher Order - Understand

TRUE/FALSE 1. The rejection region of the chi-square goodness-of-fit test has k – 1 degrees of freedom, where k is the number of categories (called cells). ANS: T BLM: Remember

PTS:

REF: 626-628

TOP: 1–3

2. The rejection region of the chi-square goodness-of-fit test is $\chi^2 > \chi^2_{\alpha,\,k-1}$, where k is the number of categories and $\chi^2$ is the value of the test statistic.

ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

3. The chi-square goodness-of-fit test involves two categorical variables. ANS: F BLM: Remember

PTS:

REF: 626-628

TOP: 1–3

4. A chi-square goodness-of-fit test with 3 degrees of freedom results in a test statistic of 6.789. Using the chi-square table, the most accurate statement that can be made about the p-value for this test is that 0.05 < p-value < 0.10. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 626-629

TOP: 1–3

5. A chi-square goodness-of-fit test is always conducted as a two-tailed test. ANS: F BLM: Remember

PTS:

REF: 626-628

TOP: 1–3

6. A goodness-of-fit test determines the likelihood that sample data have been generated from a population that conforms to a specified type of probability distribution. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 625-627

TOP: 1–3

7. The degrees of freedom associated with a goodness-of-fit test equal the number of rows times the number of columns in the table. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

8. A right-tailed area in the chi-square distribution equals 0.05. For 6 degrees of freedom the table value equals 12.5916. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 626-629

TOP: 1–3

9. Whenever the expected frequency of a cell is less than five, one remedy for this condition is to increase the significance level. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 644

TOP: 1–3

10. Whenever the expected frequency of a cell is less than five, one remedy for this condition is to increase the size of the sample. ANS: T BLM: Remember

PTS:

REF: 644

TOP: 1–3

11. For a chi-square distributed random variable with 10 degrees of freedom and a level of significance of 0.025, the chi-square table value is 20.4831. The computed value of the test statistic is 16.857. This will lead us to reject the null hypothesis. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 626-629

TOP: 1–3

12. Whenever the expected frequency of a cell is less than five, one remedy for this condition is to decrease the size of the sample. ANS: F BLM: Remember

PTS:

REF: 644

TOP: 1–3

13. The middle 0.95 portion of the chi-square distribution with 9 degrees of freedom has table values of 3.32511 and 16.9190, respectively. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

14. A left-tailed area in the chi-square distribution equals 0.10. For 5 degrees of freedom the table value equals 9.23635. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

15. In applying the chi-square goodness-of-fit test, the rule of thumb for all expected frequencies is that each expected frequency is five or more. ANS: T BLM: Remember

PTS:

REF: 644

TOP: 1–3

16. The area to the right of a chi-square value is 0.01. For 8 degrees of freedom, the table value is 1.64648. ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 626-628

TOP: 1–3

17. A multinomial experiment, where the outcome of each trial can be classified into one of two categories, is identical to a binomial experiment. ANS: T BLM: Remember

PTS:

REF: 625

TOP: 1–3

18. A chi-square goodness-of-fit test is always conducted as a two-tailed test. ANS: F BLM: Remember

PTS:

REF: 626

TOP: 1–3

19. If the test statistic for a chi-square goodness-of-fit test is larger than the critical value, the null hypothesis should be rejected. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 1–3

20. When the expected cell frequencies are smaller than five, the cells should be combined in a meaningful way such that the expected cell frequencies exceed five. ANS: T BLM: Remember

PTS:

REF: 644

TOP: 1–3

21. By combining cells that have expected frequencies smaller than five, we guard against having an inflated test statistic that could have led us to incorrectly accept the null hypothesis. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 644

TOP: 1–3

22. A contingency table classifies data with respect to two qualitative variables that are each divided into two or more categories. ANS: T BLM: Remember

PTS:

REF: 631-632

TOP: 4

23. Numbers in a contingency table show the frequency of occurrence of all possible combinations of categories. ANS: T BLM: Remember

PTS:

REF: 631-632

TOP: 4

24. In a typical chi-square test of independence, we calculate each expected cell frequency ($E_{ij}$) as the product of row total ($r_i$) times column total ($c_j$) times sample size (n). ANS: F BLM: Remember

PTS:

REF: 631-633

TOP: 4

25. Chi-square tests of independence are always lower-tailed because a perfect fit between $O_{ij}$ and $E_{ij}$ makes the test statistic $\chi^2$ equal to 0. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 631-633

TOP: 4

26. The degrees of freedom associated with a chi-square test of independence where data are summarized in a contingency table with r rows and c columns equal the number of rows times the number of columns in the table; that is, rc. ANS: F BLM: Remember

PTS:

REF: 633

TOP: 4

27. To be valid, a chi-square test of independence requires that the expected frequency for each cell in the contingency table equals ten or more. ANS: F BLM: Remember

PTS:

REF: 644

TOP: 4

28. The chi-square test statistic for a contingency table with r rows and c columns can be negative if r is much smaller than c. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 633-635

TOP: 4

29. A chi-square test for independence is applied to a contingency table with three rows and five columns for two qualitative variables. The degrees of freedom for this test is 8. ANS: T

PTS:

REF: 633

TOP: 4

BLM: Higher Order - Apply

30. In a chi-square test of independence with 6 degrees of freedom and a level of significance of 0.05, the critical value from the chi-square table is 12.5916. The computed value of the test statistic is 11.264. This will lead us to reject the null hypothesis. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

31. Numbers in a contingency table show the frequency of occurrence of all possible combinations of categories. ANS: T BLM: Remember

PTS:

REF: 634-635

TOP: 4

32. In a typical contingency table test, we calculate each expected cell frequency as the product of row total times column total times sample size. ANS: F BLM: Remember

PTS:

REF: 634-635

TOP: 4

33. Chi-square tests of independence are always lower-tailed because a perfect fit between $O_{ij}$ and $E_{ij}$ makes $\chi^2$ equal to 0.

ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 631-633

TOP: 4

34. A contingency table classifies data according to two or more categories associated with each of two qualitative variables that are statistically independent of one another. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 631-632

TOP: 4

35. The degrees of freedom associated with a contingency table test of independence equal the number of rows times the number of columns in the table. ANS: F BLM: Remember

PTS:

REF: 633

TOP: 4

36. To be valid, a chi-square test of independence requires that each expected frequency equal 30 or more. ANS: F BLM: Remember

PTS:

REF: 644

TOP: 4

37. A chi-square test for independence is applied to a contingency table with three rows and four columns for two qualitative variables. The degrees of freedom for this test must be 12.

ANS: F PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 4

38. A chi-square test for independence is applied to a contingency table with four rows and four columns for two qualitative variables. The degrees of freedom for this test must be 9. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 633

TOP: 4

39. A chi-square test for independence with 10 degrees of freedom results in a test statistic of 17.894. Using the chi-square table, the most accurate statement that can be made about the p-value for this test is that 0.05 < p-value < 0.10. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

40. In a chi-square test of independence, the value of the test statistic was $\chi^2$ = 15.652, and the critical value at $\alpha$ = 0.025 was 11.1433. Thus, we must reject the null hypothesis at $\alpha$ = 0.025. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

41. The chi-square test of independence is based upon three or more quantitative variables. ANS: F BLM: Remember

PTS:

REF: 631-632

TOP: 4

42. A chi-square test for independence with 6 degrees of freedom results in a test statistic of 13.25. Using the chi-square table, the most accurate statement that can be made about the p-value for this test is that p-value is greater than 0.025 but smaller than 0.05. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 632-634

TOP: 4

43. The chi-square test of a contingency table is used to determine if there is enough evidence to infer that two nominal variables are related, and to infer that differences exist among two or more populations of nominal variables. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 631-632

TOP: 4

44. If we want to perform a two-tailed test for differences between two populations of nominal data with exactly two categories, we can employ either the z test of $p_1 - p_2$ or the chi-square test of homogeneity, since squaring the value of the z statistic yields the value of the $\chi^2$ statistic.

ANS: T BLM: Remember

PTS:

REF: 643

TOP: 5–7
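The equivalence stated in item 44 can be checked on a small example. The counts below are hypothetical, invented only to illustrate that the squared pooled two-sample z statistic matches the Pearson chi-square statistic computed from the corresponding 2 × 2 table.

```python
import math

# Hypothetical data: number of successes and sample size in each group
x1, n1 = 45, 100
x2, n2 = 30, 100

# Two-sample z statistic using the pooled estimate of the common proportion
p1_hat, p2_hat = x1 / n1, x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)
z = (p1_hat - p2_hat) / math.sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

# Pearson chi-square statistic for the equivalent 2 x 2 table
observed = [[x1, n1 - x1], [x2, n2 - x2]]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
chi_square = 0.0
for i in range(2):
    for j in range(2):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (observed[i][j] - expected) ** 2 / expected

print(z ** 2, chi_square)
```

The agreement holds only for the two-sided comparison; the chi-square statistic discards the sign of the difference, which is why a one-tailed test still needs the z form.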

45. The chi-square technique is a statistical technique for testing the alleged independence of two qualitative variables, making inferences about the relative sizes of more than two population proportions, and conducting goodness-of-fit tests to assess the plausibility that sample data come from a population that conforms to a specified probability distribution. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 631-632

TOP: 5–7

46. When there are only two categories in a multinomial experiment, the experiment reduces to a binomial experiment. ANS: T BLM: Remember

PTS:

REF: 625

TOP: 5–7

47. The data that result from two binomial experiments can be displayed as a two-way classification with two rows and two columns, so that the chi-square test of homogeneity can be used to compare the two binomial proportions $p_1$ and $p_2$. ANS: T BLM: Remember

PTS:

REF: 639

TOP: 5–7

48. The goodness-of-fit test is a two-way classification with cell probabilities specified in $H_0$. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 626-628

TOP: 5–7

49. When either the row or the column totals in a contingency table are fixed, the test of independence of classifications becomes a test of the homogeneity of cell probabilities for several multinomial experiments. ANS: T BLM: Remember

PTS:

REF: 638-639

TOP: 5–7

50. The large-sample z tests for one and two binomial proportions are special cases of the chi-square statistic. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 643

TOP: 5–7

51. Tests of homogeneity are used to compare several binomial populations. ANS: T BLM: Remember

PTS:

REF: 639

TOP: 5–7

52. If there are more than two row categories in a contingency table with fixed c column totals, then the test of independence is equivalent to a test of the equality of c sets of multinomial proportions. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 643

TOP: 5–7

53. The total degrees of freedom for an r × c (fixed-column) contingency table are rc + 1. ANS: F BLM: Remember

PTS:

REF: 639

TOP: 5–7

PROBLEM 1. Historically, a bank has employed 50% permanent full-time workers, 25% permanent part-time workers, 15% temporary full-time workers, and 10% temporary part-time workers. In a recent sample of 400 workers, 190 were permanent full-time, 125 were permanent part-time, 55 were temporary full-time, and 30 were temporary part-time. Use these data and $\alpha$ = 0.05 to assess whether there has been a change from the historical percentages.

Category    Percent   Observed   Expected
Perm full   50        190        200
Perm part   25        125        100
Temp full   15         55         60
Temp part   10         30         40

ANS: The hypotheses to be tested are $H_0: p_1 = 0.50,\ p_2 = 0.25,\ p_3 = 0.15,\ p_4 = 0.10$ vs. $H_a$: At least one $p_i$ differs from its specified value.

The chi-square test statistic can now be calculated as

$\chi^2 = \sum \dfrac{(O_i - E_i)^2}{E_i} = \dfrac{(190 - 200)^2}{200} + \dfrac{(125 - 100)^2}{100} + \dfrac{(55 - 60)^2}{60} + \dfrac{(30 - 40)^2}{40} = 9.67.$

With df = k – 1 = 3 and $\alpha$ = 0.05, we reject $H_0$ when $\chi^2 > \chi^2_{0.05}$ = 7.81473. Since $\chi^2$ = 9.67 > 7.81473, we reject $H_0$ and conclude that at least one of the cell probabilities differs from that specified under $H_0$.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3
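The calculation in Problem 1 follows a fixed recipe: expected counts are n times the hypothesized proportions, and the statistic sums (O – E)²/E over the cells. The sketch below reproduces it; the function name is an assumption, and the critical value 7.81473 is copied from the answer rather than computed.

```python
def chi_square_gof(observed, probabilities):
    """Return (statistic, expected counts) for a chi-square goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, expected


# Problem 1: bank workforce categories
observed = [190, 125, 55, 30]
historical = [0.50, 0.25, 0.15, 0.10]

statistic, expected = chi_square_gof(observed, historical)
critical_value = 7.81473     # chi-square table value for alpha = 0.05, df = 3
print(round(statistic, 2), statistic > critical_value)
```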

2. A librarian knows from old records that the percentages of checked-out materials that are fiction, nonfiction, children’s books, books on tape, and magazines are 35%, 25%, 20%, 10%, and 10%, respectively. The material checked out in each of the categories in a recent survey of 600 randomly chosen library items is listed below. Use these data to assess whether there has been a change in the distribution of checked-out library materials. State the hypotheses to be tested and use = 0.01 to interpret your results.

Category Fiction Nonfiction Children Tapes Magazines

Percent 35 25 20 10 10

Observed 200 150 110 100 40

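ANS: The hypotheses to be tested are $H_0: p_1 = 0.35,\ p_2 = 0.25,\ p_3 = 0.20,\ p_4 = 0.10,\ p_5 = 0.10$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value. The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 34.643$. With df = k – 1 = 4 and $\alpha = 0.01$, we reject $H_0$ when $X^2 > \chi^2_{0.01} = 13.2767$. Since $X^2 = 34.643 > 13.2767$, we reject $H_0$ and conclude that at least one of the cell probabilities differs from that specified under $H_0$.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate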

TOP: 1–3

Ages of National College Students Narrative A national survey states that 67% of college students are under the age of 25, 21% are between the ages of 25 and 30, 8% are between 30 and 40, and 4% are over 40. A random sample of 250 students at Vanier College in Montreal yielded the following data:

Age              Frequency
Under 25         138
25 but under 30  62
30 but under 40  32
Over 40          18

3. Refer to Ages of National College Students Narrative. State the null and alternative hypotheses to test whether the distribution of students’ ages at Vanier College agrees with the national survey. ANS: Let $p_i$ be the proportion of students in age category i, i = 1, 2, 3, and 4 for the four age groups as they appear in the frequency table above. The hypotheses to be tested are $H_0: p_1 = 0.67,\ p_2 = 0.21,\ p_3 = 0.08,\ p_4 = 0.04$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value.

PTS: 1 REF: 626-628 BLM: Higher Order - Analyze

TOP: 1–3

4. Refer to Ages of National College Students Narrative. Compute the value of the test statistic. ANS: The expected cell counts for each of the four age categories, computed using the formula $E_i = np_i$, are 167.5, 52.5, 20, and 10, respectively. The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(138-167.5)^2}{167.5} + \frac{(62-52.5)^2}{52.5} + \frac{(32-20)^2}{20} + \frac{(18-10)^2}{10} = 20.515$. PTS: 1 REF: 626-628 BLM: Higher Order - Apply

TOP: 1–3

5. Refer to Ages of National College Students Narrative. Set up the appropriate rejection region for $\alpha = 0.05$. ANS: With df = k – 1 = 3 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 7.81473$.

PTS: 1 REF: 626-629 BLM: Higher Order - Analyze

TOP: 1–3

6. Refer to Ages of National College Students Narrative. What is the appropriate conclusion? ANS: Since $X^2 = 20.515 > 7.81473$, reject $H_0$ and conclude that the distribution of students’ ages at Vanier College does not agree with the national survey. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3
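The critical values quoted in these answers (for example, 7.81473 for df = 3 and $\alpha = 0.05$) come from a chi-square table. As a hedged illustration only, the sketch below reproduces them with the Python standard library: `chi2_sf` and `chi2_critical` are hypothetical helper names, the upper-tail probability uses the standard closed-form series for integer degrees of freedom, and the critical value is found by simple bisection.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X2 > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: finite Poisson-type sum.
        terms = (half ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-half) * sum(terms)
    # Odd df: normal-tail term plus a finite sum with half-integer gamma values.
    tail = math.erfc(math.sqrt(half))
    terms = (half ** (j - 0.5) / math.gamma(j + 0.5)
             for j in range(1, (df - 1) // 2 + 1))
    return tail + math.exp(-half) * sum(terms)


def chi2_critical(alpha, df, lo=0.0, hi=200.0):
    """Find x with P(X2 > x) = alpha by bisection on [lo, hi]."""
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2.0


critical_df3 = chi2_critical(0.05, 3)
p_value_ages = chi2_sf(20.515, 3)  # statistic from Problem 4
print("Critical value, df = 3, alpha = 0.05:", round(critical_df3, 5))
print("Reject H0 for the Vanier College data:", p_value_ages < 0.05)
```

Comparing the p-value with $\alpha$ leads to the same decision as comparing the statistic with the tabulated critical value.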

7. A sample survey of 850 people in Calgary was conducted in 2001 to see how people get to work. The results are shown in the frequency table below. A study of the same nature was conducted in 1990 and revealed 12% walk, 67% drive an automobile, 18% take a bus, and 3% ride a bicycle. Do the current data support that the proportions are the same as in 1990? Use $\alpha = 0.025$.

Method      Frequency
Walk        100
Automobile  550
Bus         175
Bicycle     25

ANS: Let $p_i$ be the proportion of people in each category i, i = 1, 2, 3, and 4 for the four transportation methods as they appear in the frequency table above. The hypotheses to be tested are $H_0: p_1 = 0.12,\ p_2 = 0.67,\ p_3 = 0.18,\ p_4 = 0.03$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value. The expected cell counts for each of the four transportation methods, computed by using the formula $E_i = np_i$, are 102, 569.5, 153, and 25.5, respectively. The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(100-102)^2}{102} + \frac{(550-569.5)^2}{569.5} + \frac{(175-153)^2}{153} + \frac{(25-25.5)^2}{25.5} = 3.88$. With df = k – 1 = 3 and $\alpha = 0.025$, we reject $H_0$ when $X^2 > \chi^2_{0.025} = 9.3484$. Since $X^2 = 3.88 < 9.3484$, do not reject $H_0$. The current data support that the proportions in 2001 are the same as in 1990.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

Average Casino Dice Rolls Narrative A casino customer is concerned about whether the dice the casino is using are “fair” (i.e., each face is equally likely to appear). The customer obtains a die that is being used by the casino and rolls it 240 times. The customer records the following data: Face Frequency

1 46

2 38

3 44

4 45

5 38

6 29

8. Refer to Average Casino Dice Rolls Narrative. State the null and alternative hypotheses. ANS: Let $p_i$ be the probability of face i of the die, i = 1, 2, 3, 4, 5, 6. The hypotheses to be tested are $H_0: p_1 = p_2 = p_3 = p_4 = p_5 = p_6 = 1/6$ (die is fair) vs. $H_a$: at least one $p_i$ differs from 1/6 (die is not fair).

PTS: 1 REF: 626-628 BLM: Higher Order - Analyze

TOP: 1–3

9. Refer to Average Casino Dice Rolls Narrative. Compute the value of the test statistic. ANS: The expected cell counts for each of the six categories, computed using the formula $E_i = np_i$, are all equal to 40 (since $p_i = 1/6$ is the same for every face). The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(46-40)^2 + (38-40)^2 + (44-40)^2 + (45-40)^2 + (38-40)^2 + (29-40)^2}{40} = 5.15$. PTS: 1 REF: 626-628 BLM: Higher Order - Apply

TOP: 1–3

10. Refer to Average Casino Dice Rolls Narrative. Set up the appropriate rejection region for $\alpha = 0.10$. ANS: With df = k – 1 = 5 and $\alpha = 0.10$, we reject $H_0$ when $X^2 > \chi^2_{0.10} = 9.23635$.

PTS: 1 REF: 626-629 BLM: Higher Order - Analyze

TOP: 1–3

11. Refer to Average Casino Dice Rolls Narrative. What is the appropriate conclusion? ANS: Since $X^2 = 5.15 < 9.23635$, do not reject $H_0$. We can conclude that the die is fair.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

12. A total of 200 people in each of three grocery stores were asked if they favoured, opposed, or were indifferent to the sale of lottery tickets in grocery stores. The results of the survey are summarized below. Use the statistical software computer output shown here to conduct a chi-square test of independence at $\alpha = 0.05$.

Opinion      Store A  Store B  Store C
Favour       21       78       94
Oppose       101      63       27
Indifferent  78       59       79

Chi-Square Test
Expected counts are printed below observed counts

             Store A  Store B  Store C  Total
Favour       21       78       94       193
             64.33    64.33    64.33
Oppose       101      63       27       191
             63.67    63.67    63.67
Indifferent  78       59       79       216
             72.00    72.00    72.00
Total        200      200      200      600

Chi-Sq = 29.188 + 2.903 + 13.680 + 21.892 + 0.007 + 21.117 + 0.500 + 2.347 + 0.681 = 92.315
DF = 4

ANS: $H_0$: People’s opinion concerning the sale of lottery tickets in grocery stores and store type are independent. $H_a$: People’s opinion concerning the sale of lottery tickets in grocery stores and store type are dependent. With df = (r – 1)(c – 1) = 4 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 9.48773$. Since $X^2 = 92.315 > 9.48773$, we can reject $H_0$ and conclude that the two variables are dependent; that is, the opinion of people is not the same at the three stores. PTS: 1 REF: 631-636 BLM: Higher Order - Evaluate

TOP: 1–3

Ink Pen Colours Narrative A national survey stated that 30% of the population prefer to use a pen with black ink, 30% prefer blue ink, 25% prefer red ink, and 15% prefer some other colour. A statistics professor took a random sample of 80 students and asked them to state their ink colour preference. The following data was recorded: Colour Frequency

Black 28

Blue 26

Red 18

Other 8

13. Refer to Ink Pen Colours Narrative. State the null and alternative hypotheses to test whether the data agree with the percentages stated in the national survey. ANS: Let $p_i$ be the population proportion who prefer to use an ink pen with colour i, i = 1, 2, 3, and 4 for black, blue, red, and other, respectively. The null and alternative hypotheses to be tested are $H_0: p_1 = 0.30,\ p_2 = 0.30,\ p_3 = 0.25,\ p_4 = 0.15$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

14. Refer to Ink Pen Colours Narrative. Compute the value of the test statistic.

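ANS: The chi-square test statistic can be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(28-24)^2}{24} + \frac{(26-24)^2}{24} + \frac{(18-20)^2}{20} + \frac{(8-12)^2}{12} = 2.367$.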

PTS: 1 REF: 626-628 BLM: Higher Order - Apply

TOP: 1–3

15. Refer to Ink Pen Colours Narrative. Set up the appropriate rejection region for $\alpha = 0.05$. ANS: With df = k – 1 = 3 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 7.81473$.

PTS: 1 REF: 626-629 BLM: Higher Order - Analyze

TOP: 1–3

16. Refer to Ink Pen Colours Narrative. What is the appropriate conclusion? ANS: Since $X^2 = 2.367 < 7.81473$, do not reject $H_0$. Therefore, we conclude that the data agree with the percentages stated in the national survey. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

17. A take-out pizza palace offers four different pizzas. The manager believes that 45% of the single pizza customers prefer pepperoni pizza, 35% prefer the combination pizza, 15% prefer the taco pizza, and 5% prefer the vegetarian pizza. To check his belief, the manager takes a random sample of 100 single pizza orders and records the following information: Pizza Preference Frequency

Pepperoni 43

Combination 39

Taco 12

Veggie 6

Do the sample data present sufficient evidence to support the manager’s belief? Test the appropriate hypotheses at the 10% significance level. Justify your conclusion. ANS: Let $p_i$ be the proportion of customers who prefer pizza variety i, i = 1, 2, 3, and 4, for pepperoni, combination, taco, and veggie, respectively. The null and alternative hypotheses to be tested are $H_0: p_1 = 0.45,\ p_2 = 0.35,\ p_3 = 0.15,\ p_4 = 0.05$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value.

The expected cell counts for each of the four pizza types, computed using the formula $E_i = np_i$, are 45, 35, 15, and 5, respectively. The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(43-45)^2}{45} + \frac{(39-35)^2}{35} + \frac{(12-15)^2}{15} + \frac{(6-5)^2}{5} = 1.346$.

With df = k – 1 = 3 and $\alpha = 0.10$, we reject $H_0$ when $X^2 > \chi^2_{0.10} = 6.25139$. Since $X^2 = 1.346 < 6.25139$, do not reject $H_0$. The current data support the manager’s belief.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

Ice Cream Flavours Narrative The four most popular flavours of a particular brand of ice cream are French vanilla, natural vanilla, chocolate chip, and caramel almond swirl. The question arose as to whether each of these flavours is equally preferred. To answer that question, a random sample of 160 sales was selected, yielding the following data: Ice Cream Flavour Frequency

French Vanilla 45

Natural Vanilla 40

Chocolate Chip 50

Caramel Almond Swirl 20

18. Refer to Ice Cream Flavours Narrative. State the null and alternative hypotheses. ANS: Let $p_i$ be the proportion of people who prefer ice cream flavour i, i = 1, 2, 3, and 4, for French vanilla, natural vanilla, chocolate chip, and caramel almond swirl, respectively. The hypotheses to be tested are $H_0: p_1 = p_2 = p_3 = p_4 = 0.25$ vs. $H_a$: at least one $p_i$ differs from 0.25. PTS: 1 REF: 626-628 BLM: Higher Order - Analyze

TOP: 1–3

19. Refer to Ice Cream Flavours Narrative. Compute the value of the test statistic. ANS: The expected cell counts for each of the four categories, computed using the formula $E_i = np_i$, are all equal to 40 (since $p_i$ is the same). The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = 13.125$. BLM: Higher Order - Apply

PTS:

REF: 626-628

TOP: 1–3

20. Refer to Ice Cream Flavours Narrative. What is the appropriate conclusion? ANS: Since $X^2 = 13.125 > 6.25139$, we reject $H_0$ and conclude that the ice cream flavours are not equally preferred.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

21. Refer to Ice Cream Flavours Narrative. Set up the appropriate rejection region for $\alpha = 0.10$. ANS: With df = k – 1 = 3 and $\alpha = 0.10$, we reject $H_0$ when $X^2 > \chi^2_{0.10} = 6.25139$.

PTS: 1 REF: 626-629 BLM: Higher Order - Analyze

TOP: 1–3

22. A survey in Quebec claims that, among consumers, the grocery store chains of Provigo, IGA, and Metro are equally preferable. To check this claim, a random sample of 150 consumers was selected and the following data recorded: Store Chain Frequency

Provigo 58

IGA 50

Metro 42

Test the appropriate hypotheses at the 5% significance level. ANS: Let $p_i$ be the proportion of people who prefer grocery store chain i, i = 1, 2, and 3 for Provigo, IGA, and Metro, respectively. The hypotheses to be tested are $H_0: p_1 = p_2 = p_3 = 1/3$ vs. $H_a$: at least one $p_i$ differs from 1/3.

The expected cell counts for each of the three stores, computed using the formula $E_i = np_i$, are all 50 (since $p_i$ is the same). The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(58-50)^2 + (50-50)^2 + (42-50)^2}{50} = 2.56$.

With df = k – 1 = 2 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 5.99147$. Since $X^2 = 2.56 < 5.99147$, do not reject $H_0$, and therefore conclude that the three grocery store chains are equally preferred.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

23. A sports enthusiast believes that among sports television viewers 60% prefer to watch hockey, 25% prefer basketball, and 15% prefer football. A random sample of 120 viewers was selected and the following data recorded:

Sport       Frequency
Hockey      80
Basketball  30
Football    10

Perform the appropriate test of hypothesis to determine whether the percentages are correct. Use $\alpha = 0.05$. ANS: Let $p_i$ be the proportion of people who prefer sport i, i = 1, 2, and 3 for hockey, basketball, and football, respectively. The hypotheses to be tested are $H_0: p_1 = 0.60,\ p_2 = 0.25,\ p_3 = 0.15$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value. The expected cell counts for each of the three sports, computed using the formula $E_i = np_i$, are 72, 30, and 18. The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(80-72)^2}{72} + \frac{(30-30)^2}{30} + \frac{(10-18)^2}{18} = 4.444$.

With df = k – 1 = 2 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 5.99147$. Since $X^2 = 4.444 < 5.99147$, do not reject $H_0$, and therefore conclude that the data support the stated percentages; that is, the sports enthusiast’s belief is correct. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

Car Sales Narrative Suppose that a response can fall into one of k = 5 categories with probabilities $p_1, p_2, \ldots, p_5$, and that n = 250 responses produced these category counts: Category Observed Count

1 39

2 53

3 61

4 43

5 54

The categories represent five car salespersons, and the observed frequencies represent the number of cars sold in a six-month period.

24. Refer to Car Sales Narrative. Are the five categories equally likely to occur? Specify the hypotheses of interest. ANS: To see if the five salespersons are equally likely to sell cars, the hypotheses of interest are $H_0: p_1 = p_2 = p_3 = p_4 = p_5 = 1/5$ vs. $H_a$: at least one $p_i$ differs from 1/5. PTS: 1 REF: 626-628 BLM: Higher Order - Analyze

TOP: 1–3

25. Refer to Car Sales Narrative. Calculate the observed value of the test statistic. ANS: The expected cell counts for each of the five categories, computed using the formula $E_i = np_i$, are all equal to 50 (since $p_i$ is the same). The chi-square test statistic can now be calculated as $X^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{(39-50)^2 + (53-50)^2 + (61-50)^2 + (43-50)^2 + (54-50)^2}{50} = 6.32$.

PTS: 1 REF: 626-628 BLM: Higher Order - Apply

TOP: 1–3

26. Refer to Car Sales Narrative. If you were to test this hypothesis using the chi-square statistic, how many degrees of freedom would the test have? What is the rejection region? ANS: $X^2$ has k – 1 = 4 degrees of freedom. The rejection region for this test is located in the upper tail of the chi-square distribution with df = 4. The appropriate upper-tailed rejection region is $X^2 > \chi^2_{\alpha}$; for $\alpha = 0.05$, this is $X^2 > 9.48773$. PTS: 1 REF: 626-629 BLM: Higher Order - Analyze

TOP: 1–3

27. Refer to Car Sales Narrative. Test the hypotheses at $\alpha = 0.05$, and write your conclusion.

ANS: Since the observed value of the test statistic does not fall in the rejection region, we cannot reject the null hypothesis. We conclude that the five salespersons are equally likely to sell cars. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

Peony Plants Narrative A peony plant with red petals was crossed with another plant having streaky petals. A geneticist states that 80% of the offspring from this cross will have red flowers. To test this claim, 100 seeds from this cross were collected and germinated, and 65 plants had red petals.

28. Refer to Peony Plants Narrative. Use the chi-square goodness-of-fit test to determine whether the sample data confirm the geneticist’s prediction. Find the approximate p-value and use it to make your decision. ANS: The hypotheses of interest are $H_0: p_1 = 0.80,\ p_2 = 0.20$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value, where $p_1$ and $p_2$ are the proportions of offspring with and without red petals. With n = 100, the expected cell counts are calculated as $E_1 = 100(0.80) = 80$ and $E_2 = 100(0.20) = 20$. The chi-square test statistic can now be calculated as $X^2 = \frac{(65-80)^2}{80} + \frac{(35-20)^2}{20} = 14.0625$, which has an approximate chi-square distribution (if $H_0$ is true) with df = k – 1 = 1. From the chi-square table, with 1 degree of freedom, the observed value $X^2 = 14.0625$ is greater than $\chi^2_{0.005} = 7.87944$. Hence, p-value < 0.005 and the results are highly significant. That is, we have sufficient evidence to conclude that the geneticist’s model is not correct. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

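29. Refer to Peony Plants Narrative. Use the large-sample z test to test the hypothesis of the previous question, p = 0.80. Verify that the squared value of the test statistic equals the $X^2$ statistic from the previous question. ANS: The hypotheses of interest are $H_0: p = 0.80$ vs. $H_a: p \neq 0.80$, with $\hat{p} = 65/100 = 0.65$, and the test statistic is $z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}} = \frac{0.65 - 0.80}{\sqrt{(0.80)(0.20)/100}} = -3.75$, which, when squared, yields the chi-square statistic: $z^2 = (-3.75)^2 = 14.0625 = X^2$.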

PTS: 1 REF: 380 | 643 BLM: Higher Order - Evaluate

TOP: 1–3
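The identity $z^2 = X^2$ for a two-category problem can be confirmed with a few lines of arithmetic. The sketch below is illustrative only and uses the Peony Plants data; the variable names are hypothetical.

```python
import math

# Peony Plants data: 65 red-petal plants out of n = 100, hypothesized p0 = 0.80.
n, red, p0 = 100, 65, 0.80

# Large-sample z statistic for one binomial proportion.
p_hat = red / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Chi-square goodness-of-fit statistic with k = 2 cells (red, not red).
observed = [red, n - red]
expected = [n * p0, n * (1 - p0)]
x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print("z =", round(z, 4))
print("z squared =", round(z ** 2, 4))
print("X2 =", round(x2, 4))
```

Both routes give the same value because, with only two cells, the two deviations $O_1 - E_1$ and $O_2 - E_2$ are equal in magnitude and the sum collapses algebraically to $z^2$.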

30. Researchers from Germany claimed that the risk of a heart attack for a working person may be as much as 50% greater on Monday than on any other day. In an attempt to verify their claim, they surveyed 420 working people who had recently had heart attacks and recorded the day on which their heart attacks occurred: Sunday 50

Monday 76

Tuesday 56

Wednesday 55

Thursday 67

Friday 55

Saturday 61

Do the data present sufficient evidence to indicate that there is a difference in the incidence of heart attacks depending on the day of the week? Test using $\alpha = 0.05$. ANS: If the frequency of occurrence of a heart attack is the same for each day of the week, then when a heart attack occurs, the probability that it falls in one cell (day) is the same as for any other cell (day). Hence, the hypotheses of interest are $H_0: p_1 = p_2 = \cdots = p_7 = 1/7$ vs. $H_a$: at least one $p_i$ differs from 1/7. Since n = 420, $E_i = np_i = 420(1/7) = 60$ and the test statistic is $X^2 = \sum \frac{(O_i - 60)^2}{60} = \frac{472}{60} = 7.867$. The degrees of freedom for this test of specified cell probabilities is k – 1 = 7 – 1 = 6 and the upper-tailed rejection region is $X^2 > \chi^2_{0.05} = 12.59$. Since $X^2 = 7.867 < 12.59$, $H_0$ is not rejected. There is insufficient evidence to indicate a difference in frequency of occurrence of a heart attack from day to day. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3
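When the null hypothesis specifies equal cell probabilities, every expected count is simply n/k. A minimal sketch for the heart-attack data follows; it is illustrative only, the dictionary and variable names are hypothetical, and the critical value 12.59 is the one quoted in the answer above.

```python
# Heart attacks recorded by day of the week (data from the problem above).
heart_attacks = {
    "Sunday": 50, "Monday": 76, "Tuesday": 56, "Wednesday": 55,
    "Thursday": 67, "Friday": 55, "Saturday": 61,
}

n = sum(heart_attacks.values())   # total number of observations
k = len(heart_attacks)            # number of cells
expected = n / k                  # E_i = n * (1/k) under H0

x2 = sum((o - expected) ** 2 / expected for o in heart_attacks.values())

print("n =", n, "expected per day =", expected)
print("X2 =", round(x2, 3), "with df =", k - 1)
print("Reject H0 at alpha = 0.05:", x2 > 12.59)
```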

31. Medical statistics show that deaths due to four major diseases (call them A, B, C, and D) account for 15%, 21%, 18%, and 14%, respectively, of all non-accidental deaths. A study of the causes of 616 non-accidental deaths at a hospital gives the following counts:

Disease  A   B    C    D   Other
Deaths   86  152  170  42  166

Do these data provide sufficient evidence to indicate that the proportions of people dying of diseases A, B, C, and D at this hospital differ from the proportions accumulated for the population at large? Find the approximate p-value and use it to make your decision. ANS: The hypotheses to be tested are $H_0: p_1 = 0.15,\ p_2 = 0.21,\ p_3 = 0.18,\ p_4 = 0.14,\ p_5 = 0.32$ vs. $H_a$: at least one $p_i$ differs from its hypothesized value. The expected cell counts for each of the four diseases and some other disease, computed by using the formula $E_i = np_i$, are 92.4, 129.36, 110.88, 86.24, and 197.12. The chi-square test statistic can now be calculated as $X^2 = \frac{(86-92.4)^2}{92.4} + \frac{(152-129.36)^2}{129.36} + \frac{(170-110.88)^2}{110.88} + \frac{(42-86.24)^2}{86.24} + \frac{(166-197.12)^2}{197.12} = 63.535$. The number of degrees of freedom is k – 1 = 4 and, since the observed value $X^2 = 63.535$ is greater than $\chi^2_{0.005} = 14.8602$, the p-value is less than 0.005 and the results are declared highly significant. We reject $H_0$ and conclude that the proportions of people dying of diseases A, B, C, and D at this hospital differ from the proportions for the larger population. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

32. A firm has been accused of engaging in prejudicial hiring practices. According to the most recent census, the percentages of whites, blacks, and Asians in a certain community are 72%, 10%, and 18%, respectively. A random sample of 200 employees of the firm revealed that 165 were white, 14 were black, and 21 were Asian. Do the data provide sufficient evidence to conclude at the 5% level of significance that the firm has been engaged in prejudicial hiring practices? Justify your conclusion. ANS: $H_0: p_1 = 0.72,\ p_2 = 0.10,\ p_3 = 0.18$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.05} = 5.991$ (df = 2)

Test statistic: $X^2 = 11.113$

Conclusion: Reject $H_0$. Yes, the firm has been engaged in prejudicial hiring practices.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

33. Five brands of orange juice are displayed side by side in several supermarkets in Vancouver. It was noted that in 1 day, 180 customers purchased orange juice. Of these, 30 picked Brand A, 40 picked Brand B, 25 picked Brand C, 35 picked Brand D, and 50 picked Brand E. Can you conclude at the 5% significance level that there is a preferred brand of orange juice in Vancouver? Justify your answer. ANS: $H_0: p_1 = p_2 = p_3 = p_4 = p_5 = 0.20$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.05} = 9.488$ (df = 4)

Test statistic: $X^2 = 10.278$

Conclusion: Reject $H_0$. Yes, there is a preferred brand of orange juice in Vancouver.

PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

34. In 2003, the student body of a large university in Ontario consisted of 30% first-year students, 25% second-year students, 27% third-year students, and 18% graduate students. A sample of 400 students taken from the 2004 student body showed that there were 138 first-year students, 88 second-year students, 94 third-year students, and 80 graduate students. Test at the 5% significance level to determine whether the student body proportions changed. ANS: $H_0: p_1 = 0.30,\ p_2 = 0.25,\ p_3 = 0.27,\ p_4 = 0.18$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.05} = 7.815$ (df = 3)

Test statistic: $X^2 = 6.844$ Conclusion: Don’t reject the null hypothesis. The student body proportions did not change from 2003 to 2004. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

35. In 2003, Brand A microwaves had 45% of the market, Brand B had 35%, and Brand C had 20%. This year the makers of Brand C launched a heavy advertising campaign. A random sample of appliance stores shows that, of 10,000 microwaves sold, 4,350 were Brand A, 3,450 were Brand B, and 2,200 were Brand C. Has the market changed? Test at $\alpha = 0.01$. ANS: $H_0: p_1 = 0.45,\ p_2 = 0.35,\ p_3 = 0.20$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.01} = 9.210$ (df = 2)

Test statistic: $X^2 = 25.714$ Conclusion: Reject the null hypothesis. Yes, the market has changed since 2003. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

36. Consumer panel preferences for three proposed fast-food restaurants are as follows: Restaurant A 48

Restaurant B 62

Restaurant C 40

Use the 0.05 level of significance and test to see if there is a preference among the three restaurants. ANS: $H_0: p_1 = p_2 = p_3 = 1/3$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.05} = 5.991$ (df = 2)

Test statistic: $X^2 = 4.96$ Conclusion: Don’t reject the null hypothesis. There is no preference among the three restaurants. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

37. A cafeteria proposes to serve four main entrees. For planning purposes, the manager expects that the proportions of each that will be selected by his customers will be Selection Chicken Roast Beef Steak Fish

Proportion 0.50 0.20 0.10 0.20

Of the first 100 customers, 44 selected chicken, 24 selected roast beef, 13 selected steak, and 10 selected fish. Should the manager revise his estimates? Justify your conclusion. (Use $\alpha = 0.01$.) ANS: $H_0: p_1 = 0.50,\ p_2 = 0.20,\ p_3 = 0.10,\ p_4 = 0.20$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.01} = 11.345$ (df = 3)

Test statistic: $X^2 = 7.264$ Conclusion: Don’t reject the null hypothesis. The manager should not revise his estimates. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

38. A statistics professor posted the following grade distribution guidelines for his elementary statistics class: 8% A, 35% B, 40% C, 12% D, and 5% F. A sample of 100 elementary statistics grades at the end of last semester showed 12 As, 30 Bs, 35 Cs, 15 Ds, and 8 Fs. Test at the 5% significance level to determine whether the actual grades deviate significantly from the posted grade distribution guidelines. ANS: $H_0: p_1 = 0.08,\ p_2 = 0.35,\ p_3 = 0.40,\ p_4 = 0.12,\ p_5 = 0.05$; $H_a$: At least one proportion differs from its specified value.

Rejection region: $X^2 > \chi^2_{0.05} = 9.488$ (df = 4)

Test statistic: $X^2 = 5.889$ Conclusion: Don’t reject the null hypothesis. The actual grades do not deviate significantly from the posted grade distribution guidelines. PTS: 1 REF: 626-629 BLM: Higher Order - Evaluate

TOP: 1–3

39. A market research study was conducted to compare three different brands of car oil. The results of the study are summarized below. Use the accompanying statistical software output and $\alpha = 0.005$ to determine whether the brand of oil is independent of opinion.

Opinion         Brand A  Brand B  Brand C
Excellent       65       60       21
Satisfactory    40       73       67
Unsatisfactory  8        18       30

Chi-Square Test
Expected counts are printed below observed counts

                Brand A  Brand B  Brand C  Total
Excellent       65       60       21       146
                43.19    57.71    45.10
Satisfactory    40       73       67       180
                53.25    71.15    55.60
Unsatisfactory  8        18       30       56
                16.57    22.14    17.30
Total           113      151      118      382

Chi-Sq = 11.015 + 0.091 + 12.878 + 3.295 + 0.048 + 2.336 + 4.429 + 0.773 + 9.326 = 44.192
DF = 4, P-Value = 0.000

ANS: The hypotheses to be tested are $H_0$: The brand of oil and opinion are independent vs. $H_a$: The brand of oil and opinion are dependent. With df = (r – 1)(c – 1) = 4 and $\alpha = 0.005$, we reject $H_0$ when $X^2 > \chi^2_{0.005} = 14.8602$. Since $X^2 = 44.192 > 14.8602$, we can reject $H_0$ and conclude that the two variables are dependent. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

Business Decision Narrative The owner of Fit Forever Health Club is considering adding an indoor swimming pool to his facility. The manager decided to take a survey to determine whether member opinion about the addition of a pool was independent of the age of the member. Two hundred members were selected at random and asked to state their opinion. The following data were recorded:

                 Opinion Concerning Pool
Age              Favour  Undecided  Oppose
Under 30         40      20         18
30 but under 50  30      25         20
Over 50          10      16         21

Use the following output generated by statistical software to answer the questions below.

Chi-Square Test
Expected counts are printed below observed counts

                 Favour  Undecided  Oppose  Total
Under 30         40      20         18      78
                 31.20   23.79      23.01
30 but under 50  30      25         20      75
                 30.00   22.88      22.12
Over 50          10      16         21      47
                 18.80   14.34      13.87
Total            80      61         59      200

Chi-Sq = 2.482 + 0.604 + 1.091 + 0.000 + 0.197 + 0.204 + 4.119 + 0.193 + 3.672 = 12.562
DF = 4, P-Value = 0.014

40. Refer to Business Decision Narrative. State the null and alternative hypotheses. ANS: $H_0$: Member opinion concerning the addition of an indoor pool is independent of the member’s age. $H_a$: Member opinion concerning the addition of an indoor pool is dependent on the member’s age. PTS: 1 REF: 631-632 BLM: Higher Order - Analyze

TOP: 4

41. Refer to Business Decision Narrative. What is the value of the test statistic? ANS: $X^2 = 12.562$ PTS: 1 REF: 631-633 BLM: Higher Order - Apply

TOP: 4

42. Refer to Business Decision Narrative. What is the p-value associated with the test? ANS: p-value = 0.014

PTS: 1 REF: 632-635 BLM: Higher Order - Apply

TOP: 4
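The p-values printed by the software for df = 4 can be reproduced by hand, because the chi-square upper-tail probability has a closed form when the degrees of freedom are even. The sketch below is illustrative only; `chi2_sf_even` is a hypothetical helper name, and the statistics are taken from the printouts in this section.

```python
import math


def chi2_sf_even(x, df):
    """P(X2 > x) for a chi-square variable with an even number of df.

    For df = 2m the tail area is exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df % 2 != 0:
        raise ValueError("closed form used here requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


# Test statistics reported with DF = 4 in the software output.
p_pool = chi2_sf_even(12.562, 4)     # Business Decision Narrative
p_seatbelt = chi2_sf_even(4.003, 4)  # Seat Belts On School Buses Narrative

print("Pool survey p-value:", round(p_pool, 3))
print("Seat belt survey p-value:", round(p_seatbelt, 3))
```

Each p-value is then compared with $\alpha$: reject $H_0$ when the p-value is smaller than $\alpha$.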

43. Refer to Business Decision Narrative. What is the appropriate conclusion at the 0.05 level of significance? Justify your answer. ANS: Since p-value = 0.014 < $\alpha$ = 0.05, reject $H_0$ and conclude that member opinion concerning the addition of an indoor pool is dependent on the member’s age. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

Preferred Television Drama Narrative A pollster was interested in determining whether three television dramas are equally preferred by men and women. The following data were recorded:

        Preference
Gender  “Criminal Minds”  “Law & Order”  “Grey’s Anatomy”
Male    40                35             10
Female  30                45             10

44. Refer to Preferred Television Drama Narrative. State the null and alternative hypotheses. ANS: $H_0$: Preference of television drama is independent of the viewer’s gender. $H_a$: Preference of television drama and the viewer’s gender are dependent. PTS: 1 REF: 631-632 BLM: Higher Order - Analyze

TOP: 4

45. Refer to Preferred Television Drama Narrative. Compute the value of the test statistic. ANS: The estimated expected cell counts ($\hat{E}_{ij}$) are shown in parentheses in the table below, together with the observed cell counts ($O_{ij}$).

Gender  “Criminal Minds”  “Law & Order”  “Grey’s Anatomy”  Total
Male    40 (35)           35 (40)        10 (10)           85
Female  30 (35)           45 (40)        10 (10)           85
Total   70                80             20                170

Then the test statistic can be calculated as $X^2 = \sum \frac{(O_{ij} - \hat{E}_{ij})^2}{\hat{E}_{ij}}$ = 0.7143 + 0.625 + 0.000 + 0.7143 + 0.625 + 0.000 = 2.6786. PTS: 1 REF: 631-634 BLM: Higher Order - Apply

TOP: 4
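The estimated expected cell counts in a contingency table are (row total × column total) / n. The following minimal sketch recomputes the table and statistic for the television drama data; it is an illustration under that standard formula, and the function name `chi_square_independence` is hypothetical.

```python
def chi_square_independence(table):
    """Return (expected counts, chi-square statistic, df) for an r x c table.

    table -- list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Estimated expected count for cell (i, j): r_i * c_j / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


# Rows: Male, Female. Columns: the three dramas in the order listed above.
drama_counts = [[40, 35, 10],
                [30, 45, 10]]

expected, x2, df = chi_square_independence(drama_counts)
print("Expected counts:", expected)
print("X2 =", round(x2, 4), "with df =", df)
```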

46. Refer to Preferred Television Drama Narrative. Set up the appropriate rejection region for $\alpha = 0.05$. ANS: With df = (r – 1)(c – 1) = 2 and $\alpha = 0.05$, we reject $H_0$ when $X^2 > \chi^2_{0.05} = 5.99147$.

PTS: 1 REF: 632-635 BLM: Higher Order - Analyze

TOP: 4

47. Refer to Preferred Television Drama Narrative. What is the appropriate conclusion? Explain. ANS: Since $X^2 = 2.6786 < 5.99147$, do not reject $H_0$, and therefore conclude that the preference of television drama is independent of the viewer’s gender. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

48. A marketing research professor at Simon Fraser University conducted a survey to determine whether mode of transportation to the university and the person’s position at the university were independent. The following data were recorded:

          Mode of Transportation
Position  Walk  Bike  Automobile  Other
Faculty   19    28    75          45
Staff     14    20    63          70
Students  27    49    88          67

Perform the appropriate test of hypothesis using $\alpha = 0.10$.

ANS: $H_0$: Mode of transportation to the university and the person’s position at the university are independent.

$H_a$: Mode of transportation to the university and the person’s position at the university are dependent. The estimated expected cell counts ($\hat{E}_{ij}$) are shown in parentheses in the table below, together with the observed cell counts ($O_{ij}$).

          Mode of Transportation
Position  Walk        Bike        Automobile  Other       Total
Faculty   19 (17.74)  28 (28.67)  75 (66.80)  45 (53.79)  167
Staff     14 (17.74)  20 (28.67)  63 (66.80)  70 (53.79)  167
Students  27 (24.53)  49 (39.66)  88 (92.40)  67 (74.41)  231
Total     60          97          226         182         565

Then the test statistic can be calculated as $X^2 = \sum \frac{(O_{ij} - \hat{E}_{ij})^2}{\hat{E}_{ij}}$ = 0.090 + 0.016 + 1.007 + 1.438 + 0.786 + 2.622 + 0.216 + 4.882 + 0.249 + 2.200 + 0.210 + 0.738 = 14.454. With df = (r – 1)(c – 1) = 6 and $\alpha = 0.10$, we reject $H_0$ when $X^2 > \chi^2_{0.10} = 10.6446$. Since $X^2 = 14.454 > 10.6446$, reject $H_0$, and therefore conclude that the mode of transportation to the university and the person’s position at the university are dependent. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

Seat Belts On School Buses Narrative A study was conducted to determine whether opinion concerning the addition of seat belts in school buses is independent of the population density in which a person resides. The following data were recorded:

Opinion Favour Oppose Undecided

Rural 75 40 13

Local Suburban 62 34 18

Urban 87 39 12

Use the following output generated using statistical software to answer the questions below.

Chi-Square Test
Expected counts are printed below observed counts

           Rural   Suburban  Urban   Total
Favour     75      62        87      224
           75.45   67.20     81.35
Oppose     40      34        39      113
           38.06   33.90     41.04
Undecided  13      18        12      43
           14.48   12.90     15.62
Total      128     114       138     380

Chi-Sq = 0.003 + 0.402 + 0.393 + 0.099 + 0.000 + 0.101 + 0.152 + 2.016 + 0.837 = 4.003
DF = 4, P-Value = 0.406

49. Refer to Seat Belts On School Buses Narrative. State the null and alternative hypotheses. ANS: $H_0$: Opinion concerning the addition of seat belts to school buses is independent of population density where the person lives. $H_a$: Opinion concerning the addition of seat belts to school buses and the population density where the person lives are dependent. PTS: 1 REF: 631-632 BLM: Higher Order - Analyze

TOP: 4

50. Refer to Seat Belts On School Buses Narrative. What is the value of the test statistic? ANS: $X^2 = 4.003$ PTS: 1 REF: 631-633 BLM: Higher Order - Apply

TOP: 4

51. Refer to Seat Belts On School Buses Narrative. What is the p-value associated with the test? ANS: p-value = 0.406 PTS: 1 REF: 632-635 BLM: Higher Order - Analyze

TOP: 4

52. Refer to Seat Belts On School Buses Narrative. What is the appropriate conclusion at the 0.05 level of significance? Explain.

ANS: Since the p-value = 0.406 > $\alpha$ = 0.05, do not reject $H_0$, and therefore conclude that opinion concerning the addition of seat belts to school buses is independent of the population density where the person lives. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

53. A survey of 500 respondents produced these cell counts in a 2 × 3 contingency table:

       Columns
Rows   1    2    3    Total
1      46   42   117  205
2      82   72   141  295
Total  128  114  258  500

a. If you wish to test the null hypothesis of “independence” (that the probability that a response falls in any one row is independent of the column it falls into) and you plan to use a chi-square test, how many degrees of freedom will be associated with the chi-square statistic? b. Find the value of the test statistic. c. Find the rejection region for $\alpha = 0.01$. d. Conduct the test and state your conclusions. e. Find the approximate p-value for the test. ANS: a. r = 2 and c = 3; the total degrees of freedom are (r – 1)(c – 1) = (1)(2) = 2. b. The experiment is analyzed as a 2 × 3 contingency table. The contingency table, including the observed cell counts ($O_{ij}$) and the estimated expected cell counts ($\hat{E}_{ij}$, in parentheses), follows:

       Column
Rows   1           2           3             Total
1      46 (52.48)  42 (46.74)  117 (105.78)  205
2      82 (75.52)  72 (67.26)  141 (152.22)  295
Total  128         114         258           500

Then the test statistic can be calculated as $X^2$ = 0.8001 + 0.4807 + 1.1901 + 0.5560 + 0.3340 + 0.8270 = 4.1879. c. Reject $H_0$ if $X^2 > \chi^2_{0.01} = 9.21034$ (df = 2).

d. The observed value does not fall in the rejection region. Hence, $H_0$ is not rejected. There is no reason to expect a dependence between rows and columns. e. p-value > 0.10.

PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4
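With df = 2 the chi-square tail area has a particularly simple form, $P(X^2 > x) = e^{-x/2}$, so both the critical value and the p-value for the 2 × 3 table can be written in closed form. The sketch below is illustrative only and uses hypothetical variable names.

```python
import math

# Degrees of freedom for an r x c contingency table.
r, c = 2, 3
df = (r - 1) * (c - 1)

# For df = 2, P(X2 > x) = exp(-x / 2), so the critical value is -2 * ln(alpha).
alpha = 0.01
critical_value = -2.0 * math.log(alpha)

# Observed statistic from part (b) and its p-value.
x2_observed = 4.1879
p_value = math.exp(-x2_observed / 2.0)

print("df =", df)
print("Critical value at alpha = 0.01:", round(critical_value, 5))
print("p-value:", round(p_value, 4))
print("Reject H0:", x2_observed > critical_value)
```

The computed p-value exceeds 0.10, which agrees with the table-based bound given in part (e).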

54. Is there a difference in the spending patterns of high-school seniors depending on their gender? A study to investigate this question focused on 196 employed high-school seniors. Students were asked to classify the amount of their earnings that they spent on their car during a given month:

Gender  None or Only a Little  Some  About Half  Most  All or Almost All
Male    73                     12    6           4     3
Female  57                     15    11          9     6

A portion of the computer printout is given here. Use the printout to analyze the relationship between spending patterns and gender. Write a short paragraph explaining your statistical conclusions and their practical implications.

Chi-Sq = 0.985 + 0.167 + 0.735 + 0.962 + 0.500 + 0.985 + 0.167 + 0.735 + 0.962 + 0.500 = 6.696
DF = 4, P-Value = 0.153
2 cells with expected counts less than 5.0

ANS: $H_0$: There is no difference in the spending patterns of high-school seniors depending on their gender. $H_a$: There is a difference in the spending patterns of high-school seniors depending on their gender. The printout shows the observed value of the test statistic, its degrees of freedom, and its p-value. The test statistic is given as $X^2 = 6.696$, with p-value = 0.153. $H_0$ is not rejected, and the results are declared non-significant. There is insufficient evidence to indicate a difference in spending patterns between males and females. PTS: 1 REF: 632-636 | 644 BLM: Higher Order - Evaluate

TOP:

55. The number of Canadians who visit fast-food restaurants regularly has grown steadily over the past decade. For this reason, marketing experts are interested in the demographics of fast-food customers. Is a customer’s preference for a fast-food chain affected by the age of the customer? If so, advertising might need to target a particular age group. Suppose a random sample of 400 fast-food customers aged 16 and older was selected, and their favourite fast-food restaurants along with their age groups were recorded, as shown in the table:

Age Group    McDonald’s   Burger King   Wendy’s   Other
16–21        60           27            8         5
21–30        71           34            15        8
30–49        43           42            22        14
49+          17           20            6         8

Use an appropriate method to determine whether or not a customer’s fast-food preference is dependent on age. Write a short paragraph presenting your statistical conclusions and their practical implications for marketing experts.

ANS: The 4 × 4 contingency table is analyzed using statistical software to test H0: fast-food choice is independent of age group versus Ha: fast-food choice is dependent on age group.

Chi-Square Test
Expected counts are printed below observed counts

           McDonald’s   Burger King   Wendy’s   Other    Total
16–21      60           27            8         5        100
           47.75        30.75         12.75     8.75
21–30      71           34            15        8        128
           61.12        39.36         16.32     11.20
30–49      43           42            22        14       121
           57.78        37.21         15.43     10.59
49+        17           20            6         8        51
           24.35        15.68         6.50      4.46
Total      191          123           51        35       400

Chi-Sq = 3.143 + 0.457 + 1.770 + 1.607 + 1.597 + 0.730 + 0.107 + 0.914 + 3.780 + 0.617 + 2.800 + 1.100 + 2.220 + 1.189 + 0.039 + 2.804 = 24.873
DF = 9, P-Value = 0.003
1 cells with expected counts less than 5.0

The statistical software printout shows X² = 24.873 with p-value = 0.003, so H0 is rejected. There is a dependence between age group and favourite fast-food restaurant. The practical implications of this conclusion can be explored by looking at the conditional distribution of fast-food choices for each of the four age groups. Each student will present slightly different conclusions. PTS: 1 REF: 632-636 | 644 BLM: Higher Order - Evaluate

TOP: 4

56. An experiment was conducted to investigate the effect of general hospital experience on the attitudes of physicians toward the lower-income group. A random sample of 50 physicians who had just completed 4 weeks of service in a general hospital were categorized according to their concern for lower-income group before and after their general hospital experience. The data are shown in the table. Do the data provide sufficient evidence to indicate a change in “concern” after the general hospital experience? If so, describe the nature of the change.

                  Concern After
Concern Before    High    Low
Low               26      4
High              10      10

ANS: H0: There is no change in the attitudes of physicians toward the lower-income group after the general hospital experience. Ha: There is a change in the attitudes of physicians toward the lower-income group after the general hospital experience. The data are analyzed as a 2 × 2 contingency table. The contingency table, including column and row totals and the estimated expected cell counts (in parentheses), follows.

                  Concern After
Concern Before    High         Low         Total
Low               26 (21.6)    4 (8.4)     30
High              10 (14.4)    10 (5.6)    20
Total             36           14          50

Then the test statistic can be calculated as X² = 0.8963 + 2.3048 + 1.3444 + 3.4571 = 8.0026.

The p-value with 1 df is less than 0.005, since the observed value exceeds the α = 0.005 critical value of 7.879, and H0 is rejected. There is evidence to indicate a change in concern due to the general hospital experience. PTS: 1 REF: 632-636 | 644 BLM: Higher Order - Evaluate

TOP: 4

57. The personnel manager of a consumer products company asked a random sample of employees how they felt about the work they were doing. The following table gives a breakdown of their responses by age. Is there sufficient evidence to conclude that the level of job satisfaction is related to age? Justify your answer. (Use α = 0.10.)

                     Response
Age                  Very Interesting   Fairly Interesting   Not Interesting
Under 30             31                 24                   13
Between 30 and 50    42                 30                   4
Over 50

ANS: H0: Job satisfaction and age are independent. Ha: Job satisfaction and age are dependent. Rejection region: X² > 7.779. Test statistic: X² = 9.692. Conclusion: Reject the null hypothesis. Yes, job satisfaction is related to age. PTS: 1 REF: 632-636 | 644 BLM: Higher Order - Evaluate

TOP: 4

58. A sport preference poll showed the following data for men and women:

          Favourite Sport
Gender    Baseball   Basketball   Football   Golf   Tennis
Male      24         17           30         18     22
Female    21         20           22         12     28

Use the 5% level of significance and test to determine whether sport preferences depend on gender.

ANS: H0: Gender and sport preferences are independent. Ha: Gender and sport preferences are dependent. Rejection region: X² > 9.488. Test statistic: X² = 3.30. Conclusion: Don’t reject the null hypothesis. Sport preferences don’t depend on gender. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

59. A study of education levels of 500 voters and their political party affiliations in a western province showed the following results:

                              Party Affiliation
Education Level               Liberal   Conservative   Independent
Didn’t Complete High School   40        20             80
High School Diploma           70        30             60
Has College Degree            90        50             60

Use the 1% level of significance and test to see if party affiliation is independent of the educational level of the voters.

ANS: H0: Political party affiliation and education level of voters are independent. Ha: Political party affiliation and education level of voters are dependent. Rejection region: X² > 13.277. Test statistic: X² = 26.830. Conclusion: Reject the null hypothesis. Political party affiliation depends on the education level of the voters. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

60. A major insurance firm interviewed a random sample of 1200 university students to find out the type of life insurance preferred, if any. The results follow:

          Insurance Preference
Gender    Term   Whole Life   No Insurance
Female    100    80           325
Male      160    60           475

Is there evidence that life insurance preference of male students is different than that of female students? Explain. Test using the 5% level of significance.

ANS: H0: Gender and life insurance preference are independent. Ha: Gender and life insurance preference are dependent. Rejection region: X² > 5.991. Test statistic: X² = 15.124. Conclusion: Reject the null hypothesis. Yes, life insurance preference depends on gender; that is, life insurance preference of male students is different than that of female students. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

61. The personnel manager of a consumer product company asked a random sample of employees how they felt about the work they were doing. The following table gives a breakdown of their responses by gender. Do the data provide sufficient evidence to conclude that the level of job satisfaction is related to gender? Explain. Conduct the test at the 0.10 level of significance.

          Response
Gender    Very Interesting   Fairly Interesting   Not Interesting
Male      70                 41                   9
Female

ANS: H0: Gender and level of job satisfaction are independent. Ha: Gender and level of job satisfaction are dependent. Rejection region: Reject H0 if calculated X² > 4.605. Since calculated X² = 4.708 > 4.605, we reject the null hypothesis. The data provide sufficient evidence to conclude that the level of job satisfaction is related to gender. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

62. A large carpet store wishes to determine if the brand of carpet purchased is related to the purchaser’s family income. As a sampling frame, the store mailed a survey to people who have a store credit card. Five hundred customers returned the survey and the results follow:

                 Brand of Carpet
Family Income    Brand A   Brand B   Brand C
High Income      65        32        32
Middle Income    80        68        104
Low Income       25        35        59

At the 5% level of significance, can you conclude that the brand of carpet purchased is related to the purchaser’s family income? Justify your answer.

ANS: H0: Family income and brand of carpet are independent. Ha: Family income and brand of carpet are dependent. Rejection region: Reject H0 if calculated X² > 9.488. Since calculated X² = 27.372 > 9.488, we reject the null hypothesis. We can conclude that the brand of carpet purchased is related to the purchaser’s family income. PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 4

63. The following table shows the results of a study in which random samples of 200 members of each of five large unions were asked whether they are for, undecided, or against a certain piece of legislation. Use the 0.025 level of significance to test whether the unions differ with respect to their views.

Union’s Opinion   Union 1   Union 2   Union 3   Union 4   Union 5
For               100       75        49        95        80
Undecided         62        75        87        28        50
Against           38        50        64        77        70

ANS: H0: Unions’ opinion concerning a certain piece of legislation and union type are independent (that is, unions do not have different views). Ha: Unions’ opinion concerning a certain piece of legislation and union type are dependent (that is, unions have different views). The estimated expected cell counts are shown in parentheses in the table below, together with the observed cell counts.

Union’s Opinion   Union 1       Union 2      Union 3      Union 4      Union 5      Total
For               100 (79.8)    75 (79.8)    49 (79.8)    95 (79.8)    80 (79.8)    399
Undecided         62 (60.4)     75 (60.4)    87 (60.4)    28 (60.4)    50 (60.4)    302
Against           38 (59.8)     50 (59.8)    64 (59.8)    77 (59.8)    70 (59.8)    299
Total             200           200          200          200          200          1000

Then the test statistic can be calculated as X² = 5.113 + 0.289 + 11.888 + 2.895 + 0.001 + 0.042 + 3.529 + 11.715 + 17.380 + 1.791 + 7.947 + 1.606 + 0.295 + 4.947 + 1.740 = 71.178.

With df = (r – 1)(c – 1) = 8 and α = 0.025, we reject H0 when X² > χ²_0.025 = 17.5346. Since X² = 71.178 > 17.5346, we can reject H0 and conclude that the unions have different views.

PTS: 1 REF: 632-636 BLM: Higher Order - Evaluate

TOP: 5–7

64. Suppose you wish to test the null hypothesis that three binomial parameters p1, p2, and p3 are equal versus the alternative hypothesis that at least two of the parameters differ. Independent random samples of 100 observations were selected from each of the populations. The data are shown in the table:

             Population
             A      B      C      Total
Successes    26     21     35     82
Failures     74     79     65     218
Total        100    100    100    300

a. Write the null and alternative hypotheses for testing the equality of the three binomial proportions. b. Calculate the test statistic and find the approximate p-value for the test in part (a). c. Use the approximate p-value to determine the statistical significance of your results. If the results are statistically significant, explore the nature of the differences in the three binomial proportions.

ANS:
a. If we define p1, p2, and p3 as the probability of success for each of the three binomial populations, then the null hypothesis of independence of rows and columns is the same as a test of the equality of the three binomial proportions: H0: p1 = p2 = p3 versus Ha: at least two of the proportions differ.
b. The test procedure is identical to that used for an r × c contingency table. The contingency table, including column and row totals and the estimated expected cell counts (in parentheses), follows.

                       Population
                       1              2              3              Total
Number of Successes    26 (27.333)    21 (27.333)    35 (27.333)    82
Number of Failures     74 (72.667)    79 (72.667)    65 (72.667)    218
Total                  100            100            100            300

Then the test statistic can be calculated as X² = 0.0650 + 1.4673 + 2.1506 + 0.0245 + 0.5519 + 0.8089 = 5.0682.

Since the observed value X² = 5.0682 falls between χ²_0.10 = 4.60517 and χ²_0.05 = 5.99147 (2 df), 0.05 < p-value < 0.10.
c. Since the p-value is greater than 0.05, namely, 0.0793, the null hypothesis is not rejected. There is insufficient evidence to indicate that the proportions depend upon the population from which they were drawn. PTS: 1 REF: 638-641 BLM: Higher Order - Evaluate

TOP: 5–7
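The p-value of 0.0793 quoted in Question 64 can be reproduced without tables. With exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2). The sketch below relies on that fact; the helper name `chi_square_sf_2df` is illustrative and is valid only for 2 df.

```python
import math


def chi_square_sf_2df(x):
    """Upper-tail probability P(X > x) for a chi-square variable with exactly 2 df."""
    return math.exp(-x / 2.0)


# Observed statistic from Question 64.
p_value = chi_square_sf_2df(5.0682)
print(round(p_value, 4))

# The tabled critical values map back to their tail areas.
print(round(chi_square_sf_2df(4.60517), 3), round(chi_square_sf_2df(5.99147), 3))
```

For other degrees of freedom this shortcut does not apply, and a table or statistical software is needed.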

65. A study of the purchase decisions of three stock portfolio managers, A, B, C, was conducted to compare the numbers of stock purchases that resulted in profits over a time period less than or equal to one year. One hundred randomly selected purchases were examined for each of the managers. Do the data provide evidence of differences among the rates of successful purchases for the three managers?

             Manager
Portfolio    A     B     C
Profit       65    73    57
No Profit    35    27    43

ANS: It is necessary to test a hypothesis of equivalence of the rates of successful purchases for three different managers, which is equivalent to a test of the equivalence of three binomial populations. The contingency table, including column and row totals and the estimated expected cell counts (in parentheses), follows.

                       Manager
                       A          B          C          Total
Number of Successes    65 (65)    73 (65)    57 (65)    195
Number of Failures     35 (35)    27 (35)    43 (35)    105
Total                  100        100        100        300

Then the test statistic can be calculated as X² = 0.000 + 0.9846 + 0.9846 + 0.000 + 1.8286 + 1.8286 = 5.6264.

With (r – 1)(c – 1) = 2 df, the p-value is bounded between 0.05 and 0.10. H0 is not rejected and the results are declared not significant. There is not enough information to conclude that the proportion of successful purchases will differ among the managers. PTS: 1 REF: 638-641 BLM: Higher Order - Evaluate

TOP: 5–7

Chapter 15A—Nonparametric Statistics MULTIPLE CHOICE 1. Which of the following correctly states the difference between parametric and nonparametric statistical methods? a. Nonparametric tests do not make any distributional assumptions throughout the entire process. b. Nonparametric tests do not have the hypotheses based on population parameters or assumptions based on the distributions of populations from which the samples are drawn. c. Nonparametric tests involve only interval data, whereas parametric tests involve only ordinal data. d. Nonparametric tests are involved with populations that have no parameters. ANS: B BLM: Remember

PTS:

REF: 660

TOP: 1–2

2. A two-sample t test with independent samples corresponds to which of the following tests? a. a sign test b. a Wilcoxon rank sum test c. a Wilcoxon signed-rank test d. a Friedman test for randomized block design ANS: B BLM: Remember

PTS:

REF: 661-663

TOP: 1–2

3. Under which of the following conditions are nonparametric tests appropriate for quantitative data? a. One or more of the assumptions underlying a particular parametric statistical test has been violated. b. The sample size is very large. c. The underlying population can be assumed to be normally distributed. d. All assumptions for a particular parametric statistical test have been met. ANS: A BLM: Remember

PTS:

REF: 660

TOP: 1–2

4. You are performing the Wilcoxon rank sum test. The 10th through 12th values in an ordered array of pooled sample data all equal $100, while the 9th value is less than $100 and the 13th value is more than $100. What are the appropriate ranks for the three $100 values? a. 9, 9, 9 b. 10, 10, 10 c. 11, 11, 11 d. 12, 12, 12 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 661

TOP: 1–2

5. Consider the following data set: 14, 14, 15, 16, 18, 19, 19, 20, 21, 22, 23, 25, 25, 25, 25, and 28. What is the rank assigned to the four observations of value 25? a. 12 b. 12.5 c. 13 d. 13.5 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 661

TOP: 1–2
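The tie rule used in Questions 4 and 5 (each tied value receives the mean of the ranks it occupies) can be sketched in Python as follows. The function name `average_ranks` is an illustrative choice, not something defined in the text.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) upward; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The run occupies 1-based positions start+1 .. end+1; use their mean.
        mean_rank = (start + 1 + end + 1) / 2.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


# Data set from Question 5; the four 25s occupy positions 12 through 15.
data = [14, 14, 15, 16, 18, 19, 19, 20, 21, 22, 23, 25, 25, 25, 25, 28]
ranks = average_ranks(data)
print(ranks[11])
```

The four observations equal to 25 occupy positions 12, 13, 14, and 15, so each receives (12 + 15) / 2 = 13.5, which matches answer (d).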

6. Which of the following correctly describes the Wilcoxon rank sum test? a. It is a nonparametric test based on two independent simple random samples. b. It is designed to determine whether the relative frequency distributions of two statistical populations of continuous values are identical to or different from one another. c. It is equivalent to a parametric t test of the difference between two independent means. d. all of the above ANS: D TOP: 1–2

PTS: 1 BLM: Remember

REF: 660-661 | 696

7. Suppose we want to compare the output of strawberries grown on plots using fertilizer A with that grown on otherwise identical plots using fertilizer B in order to make a general assessment of relative fertilizer effectiveness. Which of the following tests might we use? a. a Friedman–Smirnov test b. a Kruskal–Wallis test c. a Wilcoxon rank sum test d. a Spearman rank correlation test ANS: C TOP: 1–2

PTS: 1 REF: 660-661 | 696 BLM: Higher Order - Analyze

8. When assigning ranks during a Wilcoxon rank sum test, tied values are each given the mean of the next ranks to be assigned. Which of the following statements is NOT correct regarding this averaging procedure? a. The procedure is crucial when the tied values belong to the same sample. b. Although still employed, the procedure is not crucial (and an arbitrary assignment of ranks to tied values would be acceptable) when the tied values belong to the same sample. c. The procedure is designed to avoid arbitrariness. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 660-661

TOP: 1–2

9. You are performing the Wilcoxon rank sum test. The 14th through 16th values in an ordered array of pooled sample data all equal $160 (while the 13th value is less and the 17th value is more). What are the appropriate ranks for the three $160 values? a. 13.5, 14, 15.5 b. 14, 15, 16 c. 14.5, 15, 16.5 d. 15, 15, 15 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 660-661

TOP: 1–2

10. To apply the Wilcoxon rank sum test to determine whether the location of population 1 is different from the location of population 2, what kind of samples must be used? a. They must be drawn from normal populations. b. They must be drawn from matched pairs experiment. c. They must be independent. d. They must be larger than 30. ANS: C BLM: Remember

PTS:

REF: 660-661

TOP: 1–2

11. A Wilcoxon rank sum test for comparing two populations involves two independent samples of sizes 5 and 7. The alternative hypothesis is stated as: The location of population 1 is different from the location of population 2. In this case, what is the appropriate critical value at the 5% significance level? a. 20 b. 29 c. 33 d. 35 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 662-664

TOP: 1–2

12. The Wilcoxon rank sum test statistic T is approximately normally distributed whenever the sample sizes are larger than or equal to which of these values? a. 10 b. 15 c. 20 d. 25 ANS: A BLM: Remember

PTS:

REF: 665

TOP: 1–2

13. Consider the following two independent samples: Sample A: 16 17 19 22 47 Sample B: 27 31 34 37 40 Given this information, what is the value of the test statistic for a left-tailed Wilcoxon rank sum test? a. 6 b. 20 c. 35 d. 55 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 664-665

TOP: 1–2
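The rank sum in Question 13 can be verified with a small Python sketch. It pools the two samples, ranks them with ties averaged, and adds up the ranks belonging to each sample. The function name `rank_sum` is illustrative, and the samples are assumed to be Python lists.

```python
def rank_sum(sample_a, sample_b):
    """Return (T_a, T_b), the rank sums of each sample within the pooled data."""
    pooled = sorted(sample_a + sample_b)

    def mean_rank(value):
        first = pooled.index(value) + 1          # 1-based rank of first occurrence
        last = first + pooled.count(value) - 1   # 1-based rank of last occurrence
        return (first + last) / 2.0              # ties share the mean rank

    t_a = sum(mean_rank(v) for v in sample_a)
    t_b = sum(mean_rank(v) for v in sample_b)
    return t_a, t_b


# Samples from Question 13.
sample_a = [16, 17, 19, 22, 47]
sample_b = [27, 31, 34, 37, 40]
t_a, t_b = rank_sum(sample_a, sample_b)
print(t_a, t_b)
```

Sample A holds ranks 1, 2, 3, 4, and 10, giving a rank sum of 20, in agreement with answer (b). As a check, the two rank sums always add to n(n + 1)/2 for the pooled sample size n, here 55.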

14. What is the appropriate nonparametric method to compare two populations when the samples are independent and where the normality requirement necessary to perform the parametric test is NOT satisfied? a. Wilcoxon rank sum test b. sign test c. Wilcoxon signed-rank sum test d. equal-variance t test of μ1 – μ2 ANS: A BLM: Remember

PTS:

REF: 660-661

TOP: 1–2

15. Which of the techniques listed below are statistical methods that require few assumptions, if any, about the population distribution? a. parametric techniques b. nonparametric techniques c. free agent techniques d. general techniques ANS: B BLM: Remember

PTS:

REF: 660

TOP: 1–2

16. The nonparametric tests discussed in your text (Wilcoxon rank sum test, sign test, Wilcoxon signed-rank test, Kruskal–Wallis test, and Friedman test) all require that the probability distributions have which of the following properties? a. They must be identical except with respect to location. b. They must be identical except with respect to spread (variance). c. They must be identical except with respect to shape (distribution). d. They must be different with respect to location, spread, and shape. ANS: A TOP: 1–2

PTS: 1 REF: 660 | 669 | 674 | 680 | 686 | 696-697 BLM: Higher Order - Understand

17. Consider the following two independent samples:

Sample A: 15 17 18
Sample B: 14 16 19

In this case, what is the value of the test statistic for a right-tailed Wilcoxon rank sum test? a. 3 b. 7 c. 11 d. 22 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 662-663

TOP: 1–2

18. Which of the following techniques are statistical methods that require, among other assumptions, that the populations be normally distributed? a. distribution-free techniques b. nonparametric techniques c. parametric techniques d. both distribution-free and nonparametric techniques ANS: C BLM: Remember

PTS:

REF: 660

TOP: 1–2

19. In a Wilcoxon rank sum test, the two sample sizes are 4 and 6, and the value of the Wilcoxon test statistic is T = 20. The test is two-tailed and the level of significance is . What may be deduced from the given information? a. The null hypothesis will be rejected. b. The null hypothesis will not be rejected. c. The alternative hypothesis will not be rejected. d. The alternative hypothesis will be rejected. ANS: B PTS: 1 BLM: Higher Order - Evaluate

REF: 662-664

TOP: 1–2

20. A Wilcoxon rank sum test for comparing two populations involves two independent samples of sizes 15 and 20. The nonstandardized test statistic (that is, the rank sum) is T = 210. Under these circumstances, what is the value of the standardized test statistic z? a. 14.0 b. 10.5 c. 6.0 d. –2.0 ANS: D PTS: 1 BLM: Higher Order - Apply

REF: 664-667

TOP: 1–2
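The standardization in Question 20 uses the large-sample mean n1(n1 + n2 + 1)/2 and variance n1·n2(n1 + n2 + 1)/12 of the rank sum T for sample 1. A minimal Python sketch follows; the function name `rank_sum_z` is illustrative, and no correction for ties is applied.

```python
import math


def rank_sum_z(t, n1, n2):
    """Standardize the Wilcoxon rank sum T of sample 1 (normal approximation, no tie correction)."""
    mean_t = n1 * (n1 + n2 + 1) / 2.0
    sd_t = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    return (t - mean_t) / sd_t


# Question 20: n1 = 15, n2 = 20, T = 210.
z = rank_sum_z(210, 15, 20)
print(z)
```

Here the mean is 15(36)/2 = 270 and the standard deviation is sqrt(900) = 30, so z = (210 - 270)/30 = -2.0, matching answer (d).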

21. You are performing the Wilcoxon rank sum test. The 13th through 15th values in an ordered array of pooled sample data all equal $180 (while the 12th value is less and the 16th value is more). What are the appropriate ranks for the three $180 values? a. 12.5, 13, 14.5 b. 12.5, 13, 15 c. 13, 14, 15 d. 14, 14, 14 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 660-661

TOP: 1–2

22. In a normal approximation to the Wilcoxon rank sum test, the standardized test statistic is calculated as z = 1.80. In this case, what would the p-value be for a two-tailed test? a. 0.0359 b. 0.0718 c. 0.2321 d. 0.4641 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 664-667

TOP: 1–2
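The two-tailed p-value in Question 22 can be computed from the standard normal upper tail, which Python's standard library exposes through `math.erfc`. The helper names below are illustrative.

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def two_tailed_p(z):
    """Two-tailed p-value: twice the tail area beyond |z|."""
    return 2.0 * normal_upper_tail(abs(z))


# Question 22: z = 1.80.
p = two_tailed_p(1.80)
print(round(normal_upper_tail(1.80), 4), round(p, 4))
```

The one-tail area is about 0.0359, so the two-tailed p-value is about 0.0719 unrounded. The answer key's 0.0718 comes from doubling the tabled four-decimal value 0.0359.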

23. Which of the following will NEVER be a required condition of a nonparametric test? a. The samples are drawn from normally distributed populations. b. The populations being compared are identical in spread and shape. c. The samples are not drawn from normally distributed populations. d. The populations being compared are not identical in spread and shape. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 660

TOP: 1–2

24. Which of the following is a nonparametric method that is equivalent to the Wilcoxon rank sum test? a. the Wilcoxon signed-rank sum test b. the Friedman test c. the Kruskal–Wallis test d. the Mann–Whitney test ANS: D BLM: Remember

PTS:

REF: 660

TOP: 1–2

25. Consider the following data set: 2.2, 2.3, 2.3, 2.5, 2.6, 2.7, 2.8, 2.8, 2.8, 2.9, 3.1, 3.2, and 3.5. What is the rank assigned to the three observations of value 2.8? a. 6 b. 7 c. 8 d. 9 ANS: C PTS: 1 BLM: Higher Order - Apply

REF: 660-661

TOP: 1–2

26. The first step in a Wilcoxon rank sum test is to combine the data values in the two samples and assign a rank of 1 to which of the following observations? a. the smallest observation b. the first observation c. the largest observation d. the observation that occurs most frequently ANS: A BLM: Remember

PTS:

REF: 661

TOP: 1–2

27. The Wilcoxon rank sum test (like most of the nonparametric tests presented in your text) is used to determine whether the population distributions have certain identical characteristics. Which of these properties would be included in that list of characteristics? a. locations b. spreads (variances) c. shapes d. locations, spreads, and shapes ANS: D PTS: 1 BLM: Higher Order - Understand

REF: 662-663

TOP: 1–2

28. What are inferential procedures that are free from restrictive assumptions about the sampled populations? a. distribution-free tests (because no assumption about the nature of the population distribution is being made) b. distribution-free tests (because no assumption about the nature of the sampling distributions of test statistics is being made) c. parametric tests d. general tests ANS: A BLM: Remember

PTS:

REF: 660 | 696

TOP: 1–2

29. To use the Wilcoxon rank sum test as a test for location, which of the following conditions must we assume? a. that the obtained data are either ordinal or interval where the normality requirement necessary to perform the equal-variances t test of μ1 – μ2 is unsatisfied b. that both samples are randomly and independently drawn from their respective populations c. that both underlying populations from which the samples were drawn are equivalent in shape and dispersion d. all of (a), (b), and (c) ANS: D BLM: Remember

PTS:

REF: 660-661

TOP: 1–2

30. The sign test is a nonparametric procedure for testing which of the following situations? a. whether two populations have identical means b. whether two populations have identical medians c. whether two populations have identical probability distributions d. the amount of skewness in a single population ANS: C BLM: Remember

PTS:

REF: 669

TOP: 3

31. Which one of the following is NOT a reason for using a sign test to make a comparison between two populations? a. Some studies yield responses that are difficult to quantify. b. It is easy to use the sign-testing procedure. c. The data in question consist of count data. d. No assumptions need to be made about the form of the population probability distributions. ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 669

TOP: 3

32. In the case of the sign test, which of the following is the best formulation of the null hypothesis? a. There is a difference in the probability distribution for the two populations. b. There is no difference in the probability distribution for the two populations. c. The two populations in question are both normally distributed. d. p  0.5. ANS: B BLM: Remember

PTS:

REF: 669

TOP: 3

33. Which one of the following is a disadvantage of the sign test? a. Tied pairs are not considered in the analysis. b. Only the signs of the differences and not the actual values are used in the analysis. c. The sign test cannot cope with small samples. d. Results from such tests have little practical use. ANS: B PTS: 1 BLM: Higher Order - Understand

REF: 669-670

TOP: 3

34. Which of the following is a nonparametric method to compare two populations when the samples are matched pairs and the data are ordinal? a. the Wilcoxon signed-rank sum test b. the sign test c. the Wilcoxon rank sum test d. the matched pairs t test ANS: B BLM: Remember

PTS:

REF: 669

TOP: 3

35. In a normal approximation to the sign test, the standardized test statistic is calculated as z = –1.58. To test the alternative hypothesis that the location of population 1 is to the left of the location of population 2, what would be the p-value of the test? a. 0.0571 b. 0.1142 c. 0.2215 d. 0.2284 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 670-671

TOP: 3

36. In the sign test applications, the normal approximation to the binomial distribution works very well even when the number of non-zero differences is as small as which of the following values? a. 5 b. 10 c. 15 d. 25 ANS: B BLM: Remember

PTS:

REF: 670

TOP: 3

37. Which of the following statements correctly describes the sign test? a. It often uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another. b. It is often used to determine whether a sample comes from a population with a specified median. c. Both (a) and (b). d. Neither (a) nor (b). ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 670-671

TOP: 3

38. Which of these tests employs matched pairs sampling? a. the Wilcoxon signed-rank test b. the Wilcoxon rank sum test c. the Mann–Whitney test d. the Friedman test ANS: A BLM: Remember

PTS:

REF: 674

TOP: 4–5

39. Which of the following statements correctly describes the Wilcoxon signed-rank test? a. It makes use of the sign and the magnitude of the rank of the differences between pairs of measurements. b. It is concerned with the analysis of a single population. c. It cannot cope with ordinal data. d. It makes use of the sign but does not consider the magnitude of the differences between pairs of measurements. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 674-675

TOP: 4–5

40. In the Wilcoxon signed-rank test, if two or more measurements have the same non-zero difference, signs ignored, then what would be the logical next step? a. The scores are eliminated from the analysis. b. The scores are assigned the lowest rank of the tied values. c. The scores are assigned the highest rank of the tied values. d. The average of the ranks of the tied values is assigned. ANS: D BLM: Remember

PTS:

REF: 674

TOP: 4–5

41. In which of the following situations might the Wilcoxon signed-rank test be more appropriate than the paired-difference t test? a. when the population of differences is not normal b. when the population of differences is normal c. when the sample size is small d. when the data values are quantitative ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 674 | 676

TOP: 4–5

42. A paired sample, two-sample t test corresponds to which of the following tests? a. a Wilcoxon signed-rank test b. a Wilcoxon rank sum test c. a Kruskal-Wallis H-test d. a Friedman test for randomized block design ANS: A BLM: Remember

PTS:

REF: 674 | 676

TOP: 4–5

43. The Wilcoxon signed-rank test statistic is approximately normally distributed whenever the sample sizes are larger than or equal to which of the following values? a. 10 b. 20 c. 25 d. 35 ANS: C BLM: Remember

PTS:

REF: 677-678

TOP: 4–5

44. The significance level for a Wilcoxon signed-rank sum test is 0.05. The alternative hypothesis is stated as: The location of population 1 is different from the location of population 2. Given this information, what is the appropriate critical value for a sample of size 20 (i.e., the number of non-zero differences is 20)? a. 37 b. 43 c. 52 d. 60 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 674-677

TOP: 4–5

45. The significance level for a Wilcoxon signed-rank test is 0.05. The alternative hypothesis is stated as: The location of population 1 is to the left of the location of population 2. Under these circumstances, what is the appropriate critical value for a sample of size 20 (i.e., the number of non-zero differences)? a. 37 b. 43 c. 52 d. 60 ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 674-677

TOP: 4–5

46. In a normal approximation to the Wilcoxon signed-rank test, the test statistic is calculated as z = 1.36. For a two-tailed test, what is the p-value? a. 0.0869 b. 0.1738 c. 0.2066 d. 0.4131 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 677-678

TOP: 4–5

47. In a Wilcoxon signed-rank test for matched pairs with n = 35, the rank sums of the positive and negative differences are 380 and 225, respectively. What, then, is the value of the standardized test statistic z? a. 1.065 b. 1.206 c. 1.400 d. 1.689 ANS: A PTS: 1 BLM: Higher Order - Analyze

REF: 678

TOP: 4–5

48. Which of these tests is the nonparametric counterpart of the parametric t test of matched pairs? a. the Friedman test b. the Kruskal–Wallis test c. the Wilcoxon signed-rank test d. the Wilcoxon rank sum test ANS: C TOP: 4–5

PTS: 1 BLM: Remember

REF: 674 | 676 | 697

49. Which of these tests is a nonparametric method to compare two populations when the samples are matched pairs and the data are interval, and where the normality requirement necessary to perform the parametric test is unsatisfied? a. the Wilcoxon rank sum test b. the sign test c. the matched pairs t test d. the Wilcoxon signed-rank test ANS: D BLM: Remember

PTS:

REF: 674-675

TOP: 4–5

50. In a Wilcoxon signed-rank test for matched pairs with n = 32, the rank sums of the positive and negative differences are 367.5 and 160.5, respectively. Given this information, what is the value of the standardized test statistic z? a. 3.764 b. 1.935 c. 1.882 d. 1.391 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 678

TOP: 4–5
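Items 47 and 50 both rely on the large-sample normal approximation to the Wilcoxon signed-rank statistic. A minimal sketch follows, assuming the usual mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24 for the positive rank sum; the function name `signed_rank_z` is hypothetical.

```python
import math

def signed_rank_z(t_plus: float, n: int) -> float:
    """Standardize the positive rank sum T+ from n non-zero differences."""
    mean = n * (n + 1) / 4                           # E(T+) under H0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)   # SD(T+) under H0
    return (t_plus - mean) / sd

print(round(signed_rank_z(380, 35), 3))    # item 47
print(round(signed_rank_z(367.5, 32), 3))  # item 50
```

Both values should land on the keyed answers (option a for item 47, option b for item 50).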

51. In a Wilcoxon signed-rank test, the test statistic is calculated as T = 75. If there are n = 15 observations for which the difference is not 0, and a two-tailed test is performed at the 5% significance level, what is an appropriate conclusion? a. Reject the null hypothesis. b. Don’t reject the null hypothesis. c. The test results are inconclusive. d. Perform a parametric test. ANS: B PTS: 1 BLM: Higher Order - Evaluate

REF: 674-677

TOP: 4–5

52. How is the alternative hypothesis to be tested always stated in all applications of the Kruskal–Wallis test? a. The locations of all k populations are the same. b. The locations of all k populations differ. c. At least two population locations are the same. d. At least two population locations differ. ANS: D BLM: Remember

PTS:

REF: 683-684

TOP: 6

53. Which of the following is a parametric alternative to the Kruskal–Wallis test for differences in more than two medians? a. the ANOVA F test for completely randomized experiments b. the Student’s t test for related samples c. the Student’s t test for independent samples d. the Wilcoxon rank sum test for differences in two medians ANS: A TOP: 6

PTS: 1 BLM: Remember

REF: 684 | 696-697

54. In a Kruskal–Wallis test for comparing three populations, the test statistic is calculated as H = 2.80. If the test is conducted at the 5% significance level, then what may one conclude? a. The null hypothesis will be rejected. b. The null hypothesis will not be rejected. c. The test results are inconclusive. d. The t test for matched pairs must be used. ANS: B PTS: 1 BLM: Higher Order - Evaluate

REF: 680-684

TOP: 6

55. Which of the following is the nonparametric counterpart of the parametric one-way analysis of variance F-test? a. the Kruskal–Wallis test b. the Friedman test c. the Wilcoxon rank sum test d. the Wilcoxon signed-rank sum test ANS: A BLM: Remember

PTS:

REF: 680 | 697

TOP: 6

56. Which of the following is a characteristic of the Kruskal–Wallis test? a. It is always one-tailed. b. It is always two-tailed. c. It is always used with one sample. d. It is always used when the populations are normally distributed. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 680-681

TOP: 6

57. In a Kruskal–Wallis test, there are four samples and the value of the test statistic is calculated as H = 8.79. Under these conditions, which of the following is the most accurate statement that can be made about the p-value? a. It is greater than 0.10. b. It is greater than 0.05 but smaller than 0.10. c. It is greater than 0.05. d. It is greater than 0.025 but smaller than 0.05. ANS: D PTS: 1 BLM: Higher Order - Analyze

REF: 680-684

TOP: 6
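The p-value bounds in the Kruskal–Wallis items come from the chi-square approximation with k – 1 degrees of freedom. The sketch below is one way to compute that upper-tail area with the standard library only, assuming integer degrees of freedom; `chi2_sf` is an illustrative helper, not a function from the text.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half_x = x / 2
    if df % 2 == 1:
        k = 1
        tail = math.erfc(math.sqrt(half_x))   # exact tail for df = 1
    else:
        k = 2
        tail = math.exp(-half_x)              # exact tail for df = 2
    while k < df:
        # Recurrence for the regularized upper incomplete gamma function:
        # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1), a = k/2, z = x/2
        a = k / 2
        tail += half_x ** a * math.exp(-half_x) / math.gamma(a + 1)
        k += 2
    return tail

print(chi2_sf(8.79, 3))   # item 57: four samples, so df = 3
print(chi2_sf(2.80, 2))   # item 54: three samples, so df = 2
```

The first value should fall between 0.025 and 0.05 and the second above 0.05, in line with the keyed answers.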

58. Which of these tests is a nonparametric method to compare two or more populations when the samples are independent and the data are either ordinal or interval but not normal? a. the Kruskal–Wallis test b. the Friedman test c. the Wilcoxon rank sum test d. the Wilcoxon signed-rank sum test ANS: A BLM: Remember

PTS:

REF: 683-684

TOP: 6

59. Suppose there is interest in comparing the median response time for three independent groups learning a new specific task. In this case, what is the appropriate nonparametric procedure to use? a. the Wilcoxon rank sum test b. the Wilcoxon signed-rank test c. the Kruskal–Wallis test for differences in medians d. either a or b ANS: C PTS: 1 BLM: Higher Order - Understand

REF: 683-684

TOP: 6

60. Which of the following distributions approximates the Kruskal–Wallis test statistic H when the problem objective is to compare k distributions and the sample sizes are greater than or equal to 5? a. the normal distribution b. the chi-square distribution with k – 1 degrees of freedom c. the Student t distribution with k – 2 degrees of freedom d. either the chi-square distribution with k – 5 degrees of freedom or the Student’s t distribution with k + 5 degrees of freedom ANS: B BLM: Remember

PTS:

REF: 680-681

TOP: 6

61. The Kruskal–Wallis test statistic can be approximated by a chi-square distribution with k – 1 degrees of freedom (where k is the number of populations) whenever the sample sizes are all greater than or equal to which of the following values? a. 5 b. 15 c. 25 d. 30 ANS: A BLM: Remember

PTS:

REF: 680-681

TOP: 6

62. To which of the following tests does a randomized block design analysis of variance test correspond? a. a Wilcoxon signed-rank test for paired samples b. a Wilcoxon rank sum test c. a Wilcoxon signed-rank test for one sample d. a Friedman test for randomized block design ANS: D BLM: Remember

PTS:

REF: 686

TOP: 7

63. Which test is the nonparametric counterpart of the randomized block model of the analysis of variance? a. the Kruskal–Wallis test b. the Friedman test c. the Wilcoxon rank sum test d. the Wilcoxon signed-rank sum test ANS: B BLM: Remember

PTS:

REF: 686 | 697

TOP: 7

64. In a Friedman test for comparing three populations, provided that there are five blocks, the test statistic is calculated as $F_r$ = 6.594. In this case, what is the most accurate statement that can be made about the p-value? a. the p-value < 0.025 b. 0.025 < p-value < 0.05 c. 0.05 < p-value < 0.10 d. the p-value > 0.10 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 686-689

TOP: 7

65. Which of the following is characteristic of the Friedman test? a. It is always one-tailed. b. It is always two-tailed. c. It is always used with one sample. d. It is always used when the populations are normally distributed. ANS: A PTS: 1 BLM: Higher Order - Understand

REF: 688-689

TOP: 7

66. In a Friedman test for comparing four populations, provided that there are eight blocks, the test statistic is calculated as $F_r$ = 10.98. If the test is conducted at the 5% significance level, what may we conclude about the null hypothesis and the p-value? a. Reject the null hypothesis, and 0.01 < p-value < 0.025. b. Reject the null hypothesis, and p-value > 0.025. c. Do not reject the null hypothesis, and 0.025 < p-value < 0.05. d. Do not reject the null hypothesis, and p-value > 0.05. ANS: A PTS: 1 BLM: Higher Order - Evaluate

REF: 686-689

TOP: 7

67. To apply the Friedman test to determine whether the locations of two or more populations are the same, what is a necessary property of the samples drawn? a. They must be from a matched pairs experiment. b. They must be from normal populations. c. They must be independent. d. They must be larger than 20. ANS: A BLM: Remember

PTS:

REF: 686

TOP: 7

68. A Friedman test is applied to a data set that is generated from a randomized block experiment with four treatments and eight blocks. What is the rejection region at the 5% significance level? a. $F_r$ > 1.6450 b. $F_r$ > 7.8147 c. $F_r$ > 9.4877 d. $F_r$ > 11.0705 ANS: B PTS: 1 BLM: Higher Order - Analyze

REF: 686-689

TOP: 7

69. Under which of the following conditions is the rank correlation coefficient used? a. when one is correlating quantitative data b. when one is correlating rankings of individual values for two variables c. when one is analyzing data that are assumed to be linearly related d. when one is not interested in drawing inferences from the study ANS: B BLM: Remember

PTS:

REF: 690

TOP: 8

70. Which of the following statements is NOT a property of the Spearman’s rank correlation coefficient? a. It is the test statistic used in Spearman’s rank correlation test. b. It can take on values only between 0 and +1. c. Positive values near +1 point to a monotonically increasing relationship between the two variables. d. It specifies hypotheses in terms of population distributions rather than parameters. ANS: B TOP: 8

PTS: 1 REF: 690-691 | 697 BLM: Higher Order - Understand

71. The Spearman’s rank correlation coefficient must lie between which of the following values? a. 0 and b. and 0 c. –1 and +1 d. and +1 ANS: C

PTS:

REF: 690-691 | 697

TOP: 8

BLM: Higher Order - Understand

72. When the relationship between two variables is monotonically decreasing, what might the size of Spearman’s rank correlation coefficient be? a. –2 b. –1 c. 0 d. +1 ANS: B TOP: 8

PTS: 1 REF: 690-691 | 697 BLM: Higher Order - Understand

73. In testing $H_0: \rho_s = 0$ against $H_a: \rho_s \ne 0$ at the 5% significance level, a sample of size 20 is used. In this case, what is the rejection region? a. –0.450 $\le r_s \le$ 0.450 b. –0.377 $\le r_s \le$ 0.377 c. $r_s \ge$ 0.450 or $r_s \le$ –0.450 d. $r_s \ge$ 0.777 or $r_s \le$ –0.377 ANS: C PTS: 1 BLM: Higher Order - Analyze

REF: 691-694

TOP: 8

TRUE/FALSE 1. We can safely employ nonparametric tests even when we know nothing at all about the populations from which sample data are being drawn. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

2. We can conduct nonparametric tests even with nominal and ordinal data. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

3. We can conduct nonparametric tests only with data of a higher order than nominal or ordinal. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 660

TOP: 1–2

4. Nonparametric methods can be applied to a wider variety of problems because they have less rigid requirements than parametric methods. ANS: T TOP: 1–2

PTS: 1 BLM: Remember

REF: 660 | 696-697

5. Given sample size and specified significance level $\alpha$, the probability of a Type II error is larger for a nonparametric test. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 673-674

TOP: 1–2

6. Using a nonparametric test when we could employ a parametric test can be less efficient because a nonparametric test often tends to ignore available sample information; for example, by focusing only on the directions rather than the sizes of observed differences. ANS: T TOP: 1–2

PTS: 1 REF: 660 | 673-674 BLM: Higher Order - Understand

7. The Wilcoxon rank sum test is a nonparametric test that measures the degree of association between two variables for which only rank-order data are available. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 660-661

TOP: 1–2

8. The Wilcoxon rank sum test is a nonparametric test that can be used to compare two independent samples when the assumptions for a t test are invalid. ANS: T TOP: 1–2

PTS: 1 BLM: Remember

REF: 660-661 | 696

9. Nonparametric statistical methods generally specify hypotheses in terms of population distributions rather than parameters such as means and standard deviations. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

10. A parametric test is a hypothesis test that depends on certain specific assumptions about the probability distribution of population values or the sizes of population parameters. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

11. Nonparametric tests are methods of inference that make no assumptions whatsoever about the nature of underlying population distributions or their parameters. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 660

TOP: 1–2

12. The Wilcoxon rank sum test is a nonparametric test that uses two independent simple random samples to determine whether the relative frequency distributions of two statistical populations of continuous values are identical to or different from one another. ANS: T BLM: Remember

PTS:

REF: 662-663

TOP: 1–2

13. The Mann–Whitney U test is a nonparametric test that measures the degree of association between two variables for which only rank-order data are available. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 660

TOP: 1–2

14. Statistical tests that are not very sensitive to errors in assumptions are called parametric tests. ANS: F TOP: 1–2

PTS: 1 BLM: Remember

REF: 660 | 696-697

15. Nonparametric tests are often more efficient than parametric tests. ANS: F TOP: 1–2

PTS: 1 BLM: Remember

REF: 660 | 696-697

16. A parametric test is a hypothesis test that depends on certain specific assumptions about the probability distribution of population values or the sizes of population parameters. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

17. A nonparametric test is one that makes no assumptions about the specific shape of the population from which a sample is drawn. ANS: T BLM: Remember

PTS:

REF: 660

TOP: 1–2

18. Nonparametric procedures are often, and perhaps more accurately, called free-agent statistics. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 660

TOP: 1–2

19. A Wilcoxon rank sum test for comparing two independent samples involves two samples of sizes 6 and 9. The alternative hypothesis is that the location of population 1 is to the left of the location of population 2. Using a 0.05 significance level, the appropriate critical values are 31 and 65. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 662-664

TOP: 1–2

20. In a Wilcoxon rank sum test for independent samples, the two sample sizes are 4 and 6, and the value of the Wilcoxon test statistic is T = 25. If the test is two-tailed and the level of significance is 0.05, then the null hypothesis will be rejected. ANS: F BLM: Higher Order - Evaluate

PTS:

REF: 662-664

TOP: 1–2

21. A Wilcoxon rank sum test for comparing two populations involves two independent samples of sizes 15 and 20. If the value of the non-standardized test statistic is T = 225, then the value of the standardized test statistic is z = –1.50. ANS: T PTS: 1 BLM: Higher Order - Apply

REF: 664-667

TOP: 1–2
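The standardized value in item 21 follows from the normal approximation to the Wilcoxon rank sum statistic. The sketch below assumes the usual mean n1(n1 + n2 + 1)/2 and variance n1·n2(n1 + n2 + 1)/12 for the rank sum of sample 1; the function name `rank_sum_z` is hypothetical.

```python
import math

def rank_sum_z(t1: float, n1: int, n2: int) -> float:
    """Standardize the rank sum T1 of sample 1 (sizes n1 and n2)."""
    mean = n1 * (n1 + n2 + 1) / 2                  # E(T1) under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)   # SD(T1) under H0
    return (t1 - mean) / sd

print(rank_sum_z(225, 15, 20))  # -1.5
```

With sizes 15 and 20 the mean is 270 and the standard deviation is 30, so T = 225 standardizes to exactly –1.50.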

22. The Wilcoxon rank sum test is used to compare two populations when the samples are independent but not normally distributed. ANS: T BLM: Remember

PTS:

REF: 662-663

TOP: 1–2

23. In a normal approximation to the Wilcoxon rank sum test, the standardized test statistic is calculated as z = 1.96. For a two-tailed test, the p-value of the test is 0.025. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 664-667

TOP: 1–2

24. The z-test approximation to the Wilcoxon rank sum test for two independent samples requires that both sample sizes are smaller than 10. ANS: F BLM: Remember

PTS:

REF: 665

TOP: 1–2

25. A Wilcoxon rank sum test for comparing two independent samples involves two samples of sizes 6 and 10. The alternative hypothesis is that the location of population 1 is different from the location of population 2. Using a 0.05 significance level, the appropriate critical value is 32. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 662-664

TOP: 1–2

26. The procedure for the Wilcoxon rank sum test requires that we rank each group separately rather than together. ANS: F BLM: Remember

PTS:

REF: 663

TOP: 1–2

27. When the direction (and not the magnitude) of the difference within each matched pair in a paired experiment is known, the sign test can be used while the Wilcoxon signed-rank test cannot be used. ANS: T BLM: Remember

PTS:

REF: 671

TOP: 3

28. The sign test, or Wilcoxon signed-rank test, is a nonparametric test that can be used to compare two dependent samples when the assumptions for a t test are invalid. ANS: T BLM: Remember

PTS:

REF: 671

TOP: 3

29. In a normal approximation to the sign test, the standardized test statistic is calculated as z = 2.17. If the alternative hypothesis states that the location of population 1 is to the right of the location of population 2, then the p-value of the test is 0.015. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 670-671

TOP: 3

30. The sign test is employed to compare two populations when the experimental design is matched pairs, and the data are ordinal but not normally distributed. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 669-670

TOP: 3

31. One of the required conditions of the sign test is that the number of non-zero differences n must be greater than or equal to 30. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 669-670

TOP: 3

32. The sign test is a nonparametric test that (1) uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another, and (2) determines whether a sample comes from a population with a specified median. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 669-670

TOP: 3

33. The critical value is taken from the F distribution whenever the Wilcoxon signed-rank test is employed. ANS: F BLM: Remember

PTS:

REF: 676-677

TOP: 4–5

34. The Wilcoxon signed-rank test is a nonparametric test that uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 674-676

TOP: 4–5

35. A two-sample t test with independent samples corresponds to a Wilcoxon signed-rank test for paired samples.

ANS: F TOP: 4–5

PTS: 1 BLM: Remember

REF: 674 | 696-697

36. The Wilcoxon signed-rank test is a nonparametric test that (1) uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another, and (2) determines whether a sample comes from a population with a specified median. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 674-676

TOP: 4–5

37. A one-sample t test is the parametric counterpart of the Wilcoxon signed-rank test for matched pairs. ANS: F TOP: 4–5

PTS: 1 REF: 674 | 696-697 BLM: Higher Order - Understand

38. The Wilcoxon signed-rank test for matched pairs is the nonparametric counterpart of the paired two-sample t test of ANS: T TOP: 4–5

PTS: 1 BLM: Remember

REF: 674 | 696-697

39. The Wilcoxon signed-rank test is applied to compare two populations when the samples are matched pairs and the data are interval but not normally distributed. ANS: T BLM: Remember

PTS:

REF: 674 | 660

TOP: 4–5

40. The z-test approximation to the Wilcoxon signed-rank sum test is used whenever the number of non-zero differences is at least 50. ANS: F BLM: Remember

PTS:

REF: 677-678

TOP: 4–5

41. Kruskal–Wallis test is a nonparametric test that can be used to compare more than two independent samples when the assumptions for an analysis of variance are invalid. ANS: T BLM: Remember

PTS:

REF: 683-684

TOP: 6

42. A one-sample t test corresponds to a Kruskal–Wallis test. ANS: F TOP: 6

PTS: 1 BLM: Remember

REF: 680 | 683-684

43. The Kruskal–Wallis H test is an extension of the Wilcoxon rank sum test from two to more than two statistical populations.

ANS: T BLM: Remember

PTS:

REF: 680

TOP: 6

44. The Kruskal–Wallis test can be conducted as one- or two-tailed tests. ANS: F TOP: 6

PTS: 1 BLM: Remember

REF: 681 | 683-684

45. A one-sample t test is the parametric counterpart of the Kruskal–Wallis test. ANS: F TOP: 6

PTS: 1 BLM: Remember

REF: 680 | 696-697

46. The Kruskal–Wallis test is applied to compare two or more populations when the samples are independent and the data are either ordinal or interval but not normal. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 680

TOP: 6

47. The critical value is taken from the F distribution whenever the test is a Kruskal–Wallis test. ANS: F BLM: Remember

PTS:

REF: 683-684

TOP: 6

48. In a Kruskal–Wallis test, there are five samples and the value of the test statistic is calculated as H = 12.32. Then the most accurate statement that can be made about the p-value of the test is that it is greater than 0.025 but smaller than 0.05. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 680-684

TOP: 6

49. In a Kruskal–Wallis test, there are three samples and the value of the test statistic is calculated as H = 7.378. Then the p-value of the test is 0.025. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 680-684

TOP: 6

50. The Kruskal–Wallis test can be used to test for a difference between two populations. It will produce the same outcome as the two-tailed Wilcoxon rank sum test. ANS: T BLM: Remember

PTS:

REF: 680

TOP: 6

51. In a Kruskal–Wallis test, there are four samples and the value of the test statistic is calculated as H = 13.21. Then the most accurate statement that can be made about the p-value of the test is that it is greater than 0.005 but smaller than 0.01. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 680-684

TOP: 6

52. The Kruskal–Wallis test can be used to determine whether a difference exists between two populations. However, to determine whether one population location is larger than another, we must apply the Wilcoxon rank sum test. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 680

TOP: 6

53. The Friedman test is a nonparametric test that can be used to compare more than two dependent samples when the assumptions for an analysis of variance are invalid. ANS: T TOP: 7

PTS: 1 REF: 686 | 696-697 BLM: Higher Order - Understand

54. The Friedman test is the nonparametric counterpart to the randomized block design of the analysis of variance. ANS: T TOP: 7

PTS: 1 BLM: Remember

REF: 686 | 696-697

55. We can use the Friedman test to determine whether a difference exists between two populations. However, if we want to determine whether one population location is larger than another, we must use the sign test. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 669

TOP: 7

56. If the Friedman test is applied to a data set that are generated from a randomized block experiment with five treatments and seven blocks, then the rejection region at the 5% significance level is ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 686-689

TOP: 7

57. The Friedman test is employed to compare two or more populations when the data are generated from a matched pairs experiment and are either ordinal or interval, but not normally distributed. ANS: T TOP: 7

PTS: 1 REF: 686 | 696-697 BLM: Higher Order - Understand

58. A one-sample t test is the parametric counterpart of the Friedman test for randomized block experimental design. ANS: F TOP: 7

PTS: 1 BLM: Remember

REF: 686 | 696-697

59. If the Friedman test is applied to a data set that is generated from a randomized block experiment with three treatments and five blocks, then the rejection region at the 2.5% significance level is $F_r$ > 12.8325. ANS: F PTS: 1 BLM: Higher Order - Analyze

REF: 686-689

TOP: 7

60. The Friedman test statistic is approximately chi-square distributed with (k – 1) degrees of freedom, provided that either the number of blocks, b, or the number of treatments, k, is greater than or equal to 5. ANS: T BLM: Remember

PTS:

REF: 687-689

TOP: 7

61. Given that n = 37, and the value of the sample Spearman rank correlation coefficient is $r_s$ = 0.35, then the value of the test statistic for testing $H_0: \rho_s = 0$ is z = 2.10. ANS: T PTS: 1 BLM: Higher Order - Analyze

REF: 691-694

TOP: 8
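The value z = 2.10 in item 61 can be reproduced with the large-sample form of the Spearman test statistic, z = r_s·sqrt(n – 1). This is a minimal sketch under that assumption; the function name `spearman_z` is illustrative.

```python
import math

def spearman_z(r_s: float, n: int) -> float:
    """Large-sample z statistic for testing that the population rank correlation is 0."""
    return r_s * math.sqrt(n - 1)

print(round(spearman_z(0.35, 37), 2))
```

With n = 37 the square root is exactly 6, so the statistic is 0.35 × 6.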

62. The Spearman rank correlation test is a nonparametric test that uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 693-694

TOP: 8

63. The Spearman rank correlation test is a nonparametric test that (1) uses the directions of differences observed in a matched pairs sample to determine whether the relative frequency distributions of two statistical populations are identical to or different from one another, and (2) determines whether a sample comes from a population with a specified median. ANS: F PTS: 1 BLM: Higher Order - Understand

REF: 693-694

TOP: 8

64. The Spearman rank correlation coefficient is calculated by first ranking the data values, and then calculating the Pearson correlation coefficient of the ranks. ANS: T PTS: 1 BLM: Higher Order - Understand

REF: 690-692

TOP: 8

65. The population Spearman correlation coefficient is labelled $\rho_s$, and the sample statistic used to estimate its value is labelled $r_s$. ANS: T BLM: Remember

PTS:

REF: 690-691

TOP: 8
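Item 64 describes the Spearman coefficient as the Pearson correlation of the ranks. A short sketch of that two-step recipe follows; the data values and the helper names (`average_ranks`, `pearson`, `spearman`) are made up for illustration, and tied values are assumed to receive the average of the ranks they occupy.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: y falls steadily as x rises, but not along a straight line.
x = [1, 2, 3, 4, 5]
y = [50, 20, 10, 5, 1]
print(spearman(x, y))
```

Because the hypothetical relationship is monotonically decreasing, the ranks are perfectly reversed and the coefficient sits at the lower bound of –1, even though the raw data are not linear.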

66. To determine if a relationship exists between two variables, the hypotheses to be tested are , where ANS: F PTS: 1 BLM: Higher Order - Understand

is the Spearman rank correlation coefficient. REF: 693-694

TOP: 8

Chapter 15B—Nonparametric Statistics PROBLEM 1. To investigate the effect of sleep on basal metabolism, seven students who averaged seven or more hours of sleep a night (Group A), and five students who averaged less than seven hours of sleep a night (Group B), were examined and their basal metabolism recorded as shown below.

Group A: 36 38 36 29 32 34 37
Group B: 30 34 31 35 33

Since it was not clear whether the assumptions for a t test were valid, the researcher decided to employ nonparametric methods. Use the Wilcoxon rank sum test to determine whether the metabolism measurements for Group A are significantly higher than those of Group B. Use $\alpha$ = 0.05. ANS: $H_0$: The distributions of metabolism measurements for the two groups are identical. $H_a$: The distribution of metabolism measurements for Group B is shifted to the left of Group A (i.e., the metabolism measurements are lower for Group B). Let Group B be population 1 since it has the smaller sample size. We first need to rank the observations from small to large as shown below. The ranks are in parentheses:

Group A: 36 (9.5), 38 (12), 36 (9.5), 29 (1), 32 (4), 34 (6.5), 37 (11)
Group B: 30 (2), 34 (6.5), 31 (3), 35 (8), 33 (5)

The test statistic is $T_1$ = 2 + 6.5 + 3 + 8 + 5 = 24.5. For a left-tailed test with $\alpha$ = 0.05, $n_1$ = 5, and $n_2$ = 7, the critical value for the Wilcoxon rank sum test is 21. Reject $H_0$ if $T_1 \le$ 21. Since $T_1$ > 21, do not reject $H_0$, and therefore we cannot conclude that the metabolism measurements for Group B are lower than those for Group A. PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2
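The pooled ranking in Problem 1, including the tied values 34 and 36, can be reproduced with a few lines of code. This is a sketch only: the helper name `midrank` is hypothetical, and ties are assumed to share the average of the ranks they would otherwise occupy.

```python
def midrank(value, pooled):
    """Rank of value in the pooled sample, averaging ranks across ties."""
    less = sum(1 for v in pooled if v < value)
    equal = sum(1 for v in pooled if v == value)
    # Tied values occupy ranks less+1 .. less+equal; their mean is less + (equal + 1) / 2
    return less + (equal + 1) / 2

group_a = [36, 38, 36, 29, 32, 34, 37]   # seven or more hours of sleep
group_b = [30, 34, 31, 35, 33]           # less than seven hours of sleep
pooled = group_a + group_b

t_b = sum(midrank(v, pooled) for v in group_b)  # rank sum for Group B (population 1)
print(t_b)  # 24.5
```

The rank sum for Group B matches the 24.5 used in the answer, which is then compared with the tabled critical value of 21.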

2. A vendor was interested in determining whether two soft drink machines dispense the same amount of liquid. A sample of size seven was selected from each machine and the amount of liquid dispensed (in mL) was recorded as follows:

Machine A: 287 312 278 290 293 281 295
Machine B: 284 256 241 315 298 273 275

Use the Wilcoxon rank sum test to determine whether the distributions for the amount of liquid dispensed are the same for both machines. Use $\alpha$ = 0.05. ANS: The null and alternative hypotheses are $H_0$: The distributions for the amount of liquid dispensed are identical for the two machines. $H_a$: The distributions for the amount of liquid dispensed differ for the two machines. We first need to rank the observations from small to large as shown below. The ranks are in parentheses:

Machine A: 287 (8), 312 (13), 278 (5), 290 (9), 293 (10), 281 (6), 295 (11)
Machine B: 284 (7), 256 (2), 241 (1), 315 (14), 298 (12), 273 (3), 275 (4)

Then, $T_1$ = 8 + 13 + 5 + 9 + 10 + 6 + 11 = 62, and $T_1^*$ = 7(15) – 62 = 43. The test statistic for the two-tailed Wilcoxon rank sum test is T = min($T_1$, $T_1^*$) = min(62, 43) = 43. For a two-tailed test with $\alpha$ = 0.05, $n_1$ = 7, and $n_2$ = 7, the critical value for the Wilcoxon rank sum test is 36. Reject $H_0$ if T $\le$ 36. Since T > 36, do not reject $H_0$. We cannot conclude that the distributions of the amount of liquid dispensed by the two machines differ. PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2
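The two-tailed statistic in Problem 2 uses the relation between the rank sum of sample 1 and the rank sum it would have had if the observations were ranked from large to small, namely n1(n1 + n2 + 1) minus the original rank sum. A small sketch of that step follows; the function name `rank_sum_statistics` is illustrative.

```python
def rank_sum_statistics(t1: float, n1: int, n2: int):
    """Return (T1*, T) where T1* = n1(n1 + n2 + 1) - T1 and T = min(T1, T1*)."""
    t1_star = n1 * (n1 + n2 + 1) - t1
    return t1_star, min(t1, t1_star)

print(rank_sum_statistics(62, 7, 7))  # (43, 43)
```

The smaller of the two rank sums is the value compared with the lower-tail critical value from the table.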

3. Suppose you want to use the Wilcoxon rank sum test to detect a shift in distribution 1 to the right of distribution 2, based on samples of sizes $n_1$ = 7 and $n_2$ = 9.
a. Should you use $T_1$ or $T_1^*$ as the test statistic?
b. What is the rejection region for the test if $\alpha$ = 0.05?
c. What is the rejection region for the test if $\alpha$ = 0.01?

ANS:
a. If distribution 1 is shifted to the right of distribution 2, the rank sum for sample 1, $T_1$, will tend to be large. The test statistic will be $T_1^*$, the rank sum for sample 1 if the observations had been ranked from large to small. The null hypothesis will be rejected if $T_1^*$ is unusually small.
b. From the table of critical values for the Wilcoxon rank sum test, with $n_1$ = 7, $n_2$ = 9, and $\alpha$ = 0.05, $H_0$ will be rejected if $T_1^* \le$ 43.
c. From the table of critical values for the Wilcoxon rank sum test, with $n_1$ = 7, $n_2$ = 9, and $\alpha$ = 0.01, $H_0$ will be rejected if $T_1^* \le$ 37.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

4. In an investigation of the visual scanning behaviour of hearing-impaired children, measurements of eye movement rates were taken on nine hearing-impaired children and nine hearing children as shown in the table below. Does it appear that the distributions of eye-movement rates for hearing-impaired children and hearing children differ? Test at $\alpha$ = 0.05 using the Wilcoxon rank sum test.

Hearing-Impaired Children: 2.81 2.20 3.29 2.13 2.55 2.24 3.22 2.99 2.26
Hearing Children: 0.95 1.49 1.12 1.07 1.00 1.85 1.18 2.07 1.18

ANS: The null and alternative hypotheses to be tested are $H_0$: The distributions of eye-movement rates for hearing-impaired children and hearing children are identical, and $H_a$: The distributions of eye-movement rates for hearing-impaired children and hearing children are different. The data, with corresponding ranks in parentheses, are shown in the following table.

Hearing Impaired (1): 2.81 (15), 2.20 (11), 3.29 (18), 2.13 (10), 2.55 (14), 2.24 (12), 3.22 (17), 2.99 (16), 2.26 (13)
Hearing (2): 0.95 (1), 1.49 (7), 1.12 (4), 1.07 (3), 1.00 (2), 1.85 (8), 1.18 (5.5), 2.07 (9), 1.18 (5.5)

Calculate $T_1$ = 126 and $T_1^*$ = 9(19) – 126 = 45. The test statistic is T = min($T_1$, $T_1^*$) = min(126, 45) = 45. With $n_1$ = $n_2$ = 9, the two-tailed rejection region with $\alpha$ = 0.05 is found to be T $\le$ 62 in the table of critical values of T for the Wilcoxon rank sum test. The observed value, T = 45, falls in the rejection region and $H_0$ is rejected. We conclude that the hearing-impaired children do differ from the hearing children in eye-movement rate. PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

5. Suppose you want to use the Wilcoxon rank sum test to detect a shift in distribution 1 either to the left or to the right of distribution 2, based on samples of sizes $n_1$ = 7 and $n_2$ = 9.
a. Should you use $T_1$ or $T_1^*$ as the test statistic?
b. What is the rejection region for the test if $\alpha$ = 0.05?
c. What is the rejection region for the test if $\alpha$ = 0.01?

ANS:
a. If the test is two-tailed, you should use the smaller of $T_1$ and $T_1^*$. That is, the test statistic is T = min($T_1$, $T_1^*$).
b. Since the test is two-tailed with $\alpha$ = 0.05, half of this probability should be placed in each of the two tails. From the table of critical values for the Wilcoxon rank sum test (2.5% points), $H_0$ will be rejected if T = min($T_1$, $T_1^*$) $\le$ 40.
c. From the table of critical values for the Wilcoxon rank sum test (0.5% points), $H_0$ will be rejected if T = min($T_1$, $T_1^*$) $\le$ 35.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

6. Use the Wilcoxon rank sum test on the data below to determine at the 10% significance level whether the two population locations differ.

Sample 1: 32 22 19 29 20 34 25 9 28 17
Sample 2: 29 20 18 27 19 23 19 12 22 10

ANS: $H_0$: The two population locations are the same. $H_a$: The two population locations are different.
Rejection region:
Test statistic: T = 90.5
Conclusion: Don’t reject the null hypothesis. The two population locations are the same. PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

7. In testing the hypotheses, $H_0$: The two population locations are the same, and $H_a$: The two population locations are different, the statistics , , and are calculated with data drawn from two independent samples.
a. Which test is used for testing the hypotheses above?
b. What is the value of the test statistic?
c. What is the rejection region for this test at $\alpha$ = 0.05?
d. What is your conclusion at $\alpha$ = 0.05?

ANS:
a. The Wilcoxon rank sum test
b. T = min($T_1$, $T_1^*$) = min(22, 53) = 22
c. Reject $H_0$ if T $\le$ 22
d. Since T = 22, we reject the null hypothesis and conclude that the two population locations are different.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

8. The following statistics are drawn from two independent samples: ; , . Test at the 5% significance level to determine whether the two population locations differ.

ANS:
H0: The two population locations are the same.
Ha: The two population locations differ.
Rejection region:
Test statistic: z = (550 – 675)/56.1249 = –2.227
p-value = 0.0258
Conclusion: Reject the null hypothesis. Yes, the two population locations differ.

PTS: 1 REF: 664-667 BLM: Higher Order - Evaluate

TOP: 1–2
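The large-sample arithmetic in question 8 can be sketched as follows. The sample sizes are not preserved in the question text; n1 = 25 and n2 = 28 are an inference, chosen because they reproduce the mean 675 and standard deviation 56.1249 that appear in the answer. The rank sum 550 is taken from the answer, and the function names are illustrative.

```python
import math


def rank_sum_normal_approximation(t1, n1, n2):
    """Return (mean, sd, z) for the large-sample Wilcoxon rank sum test."""
    mean = n1 * (n1 + n2 + 1) / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return mean, sd, (t1 - mean) / sd


def two_tailed_p(z):
    """Two-tailed standard normal p-value, computed with the error function."""
    return math.erfc(abs(z) / math.sqrt(2))


# n1 = 25 and n2 = 28 are inferred, not quoted from the question.
mean, sd, z = rank_sum_normal_approximation(550, 25, 28)
p_value = two_tailed_p(z)
print(round(mean, 1), round(sd, 4), round(z, 3), round(p_value, 4))
```

The computed p-value differs slightly from a table-based value because normal tables round z to two decimal places.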

9. Given the statistics , , , , use the Wilcoxon rank sum test to determine at the 5% significance level whether the location of population 1 is to the right of the location of population 2.

ANS:
H0: The two population locations are the same.
Ha: The location of population 1 is to the right of the location of population 2.
Rejection region:
Test statistic: T = 54
Conclusion: Don’t reject the null hypothesis. The two population locations are the same.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

10. Because of the rising costs of industrial accidents, many chemical, mining, and manufacturing firms have instituted safety courses. Employees are encouraged to take these courses designed to heighten safety awareness. A company is trying to decide which one of two courses to institute. To help make a decision, eight employees take course 1 and another eight take course 2. Each employee writes a test, which is marked out of a possible 25. The results are shown below. Do these data provide sufficient evidence at the 5% level of significance to conclude that the marks from course 2 are higher than those of course 1? Assume that the scores are not normally distributed.

Safety Test Scores
Course 1   Course 2
14         20
21         18
17         22
14         15
17         23
19         21
20         19
16         15

ANS:
H0: The two population locations are the same.
Ha: The location of population 1 (course 1) is to the left of the location of population 2 (course 2).
Rejection region:
Test statistic:
Conclusion: Don’t reject the null hypothesis. The two population locations are the same.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

11. In testing the hypotheses H0: The two population locations are the same, and Ha: The location of population 1 is to the left of the location of population 2, the statistics , , and are calculated with data drawn from two independent samples.
a. Which test is used for testing the hypotheses above?
b. What is the value of the test statistic?
c. What is the rejection region for this test at α = 0.05?
d. What is your conclusion at α = 0.05?

ANS:
a. The Wilcoxon rank sum test
b.
c. Reject H0 if T ≤ 31.
d. Don’t reject the null hypothesis. The two population locations are the same.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

12. Use the Wilcoxon rank sum test on the data below to determine at the 10% significance level whether the two population locations differ. Sample 1: Sample 2:

17 17

20 25

18 33

25 38

16 15

22 26

ANS:
H0: The two population locations are the same.
Ha: The two population locations are different.
Rejection region:
Test statistic: T = min(34, 50) = 34
Conclusion: Don’t reject the null hypothesis. The two population locations are the same.

PTS: 1 REF: 662-664 BLM: Higher Order - Evaluate

TOP: 1–2

Attitude Test Narrative

Twenty students are given an attitude test before and after viewing a motion picture designed to change their attitudes favourably toward a new curriculum. A high score indicates a favourable attitude and a low score indicates an unfavourable attitude, with the scores ranging from 1 to 30. This problem will use the sign test on the data given below to see if we can conclude the motion picture was successful in improving attitudes.

Student   Before   After
1         15       20
2         20       21
3         17       16
4         6        5
5         12       19
6         14       17
7         20       23
8         15       14
9         19       17
10        26       25
11        19       22
12        14       13
13        8        13
14        11       15
15        24       25
16        19       20
17        16       22
18        22       20
19        15       18
20        21       28

13. Refer to Attitude Test Narrative. State the null and alternative hypotheses.

ANS:
H0: The before and after populations are identically distributed, and P(After – Before is positive) = p = 0.5 for each pair.
Ha: The population of “After” measurements is shifted to the right of the population of “Before” measurements, and p > 0.5.

PTS: 1 REF: 669 BLM: Higher Order - Analyze

TOP: 3

14. Refer to Attitude Test Narrative. Describe what the test statistic is for the sign test. What is the value of the test statistic in this problem? ANS: Let x be the number of times the “After” measurement was larger than the “Before” measurement; that is, number of times where (After – Before) is positive. The observed value of the test statistic is x = 13. PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3
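As a quick check of the sign-test statistic in question 14, the sketch below counts the pairs in the Attitude Test Narrative for which After exceeds Before. The list names are illustrative; the scores are transcribed from the narrative in student order.

```python
# Scores for students 1 to 20, transcribed from the Attitude Test Narrative.
before = [15, 20, 17, 6, 12, 14, 20, 15, 19, 26,
          19, 14, 8, 11, 24, 19, 16, 22, 15, 21]
after = [20, 21, 16, 5, 19, 17, 23, 14, 17, 25,
         22, 13, 13, 15, 25, 20, 22, 20, 18, 28]

differences = [a - b for a, b in zip(after, before)]

# The sign test uses only the sign of each non-zero difference.
n = sum(1 for d in differences if d != 0)
x = sum(1 for d in differences if d > 0)

print(n, x)
```

There are no zero differences in these data, so all 20 pairs are used.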

15. Refer to Attitude Test Narrative. Is this a one-tailed test or a two-tailed test? Find the rejection region for α = 0.10.

ANS: This is a one-tailed test because we are trying to improve the attitudes. Reject H0 if x > 12.

PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3

16. Refer to Attitude Test Narrative. Using α = 0.10, can we conclude the motion picture was successful in changing attitudes?

ANS: Since x = 13 > 12, we reject H0. Yes, we can conclude the motion picture was successful in changing attitudes.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

17. Refer to Attitude Test Narrative. At what level of significance could we reject H0?

ANS: The observed significance level (p-value) is calculated by using the cumulative binomial probabilities table with n = 20 and p = 0.50 as follows: p-value = P(x > 13) = 1 – P(x ≤ 13) = 1 – 0.942 = 0.058.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3
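The table lookup in question 17 can be reproduced with a small cumulative binomial function. This is a minimal sketch using only the standard library; `binom_cdf` is an illustrative name, not a function from any statistics package. For comparison it also prints the upper tail that includes the observed count, P(x ≥ 13).

```python
from math import comb


def binom_cdf(k, n, p=0.5):
    """Cumulative binomial probability P(X <= k) for n trials."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


n = 20
table_value = binom_cdf(13, n)          # entry used in the answer to question 17
tail_above_13 = 1 - table_value         # P(x > 13) = P(x >= 14)
tail_from_13 = 1 - binom_cdf(12, n)     # P(x >= 13), includes the observed x

print(round(table_value, 3), round(tail_above_13, 3), round(tail_from_13, 3))
```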

18. Refer to Attitude Test Narrative. State the null and alternative hypotheses if the Wilcoxon signed-rank test is used on the data to see if we can conclude that the motion picture was successful in changing attitudes.

ANS:
H0: The two population relative frequency distributions are identical.
Ha: The relative frequency distribution for the “After” population is shifted to the right of the relative frequency distribution for the “Before” population.

PTS: 1 REF: 676-677 BLM: Higher Order - Analyze

TOP: 4–5

19. Refer to Attitude Test Narrative. Describe what the test statistic is for the Wilcoxon signed-rank test. What is the value of the test statistic in this problem?

ANS: The test statistic T is the smaller of T+ and T–, the sum of the ranks of the positively and negatively signed differences, respectively. Here, T = 41.5.

PTS: 1 REF: 674-677 BLM: Higher Order - Analyze

TOP: 4–5
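The value T = 41.5 in question 19 can be verified with the sketch below, which ranks the absolute differences (After – Before), averages the ranks of ties, drops zero differences, and sums the ranks by sign. The function name is illustrative.

```python
def signed_rank_sums(first, second):
    """Return (T+, T-) for the differences second - first, ignoring zeros."""
    diffs = [b - a for a, b in zip(first, second) if b != a]
    ordered = sorted(abs(d) for d in diffs)

    def midrank(value):
        # Tied absolute differences share the average of the ranks they occupy.
        lowest = ordered.index(value) + 1
        highest = lowest + ordered.count(value) - 1
        return (lowest + highest) / 2

    t_plus = sum(midrank(abs(d)) for d in diffs if d > 0)
    t_minus = sum(midrank(abs(d)) for d in diffs if d < 0)
    return t_plus, t_minus


# Scores for students 1 to 20 from the Attitude Test Narrative.
before = [15, 20, 17, 6, 12, 14, 20, 15, 19, 26,
          19, 14, 8, 11, 24, 19, 16, 22, 15, 21]
after = [20, 21, 16, 5, 19, 17, 23, 14, 17, 25,
         22, 13, 13, 15, 25, 20, 22, 20, 18, 28]

t_plus, t_minus = signed_rank_sums(before, after)
print(t_plus, t_minus, min(t_plus, t_minus))
```

The two sums add to n(n + 1)/2 = 210, which is a convenient check on the ranking.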

20. Refer to Attitude Test Narrative. Determine the rejection region for the Wilcoxon signed-rank test at the 5% significance level.

ANS: Reject H0 if T < 60.

PTS: 1 REF: 676-677 BLM: Higher Order - Analyze

TOP: 4–5

21. Refer to Attitude Test Narrative. Using the answers to the previous two questions, can we conclude that the motion picture was successful in changing attitudes?

ANS: Since T = 41.5 < 60, we reject H0 and conclude that the relative frequency distribution for the “After” population is shifted to the right of the relative frequency distribution for the “Before” population.

PTS: 1 REF: 676-677 BLM: Higher Order - Evaluate

TOP: 4–5

22. Refer to Attitude Test Narrative. Find the p-value for the Wilcoxon signed-rank test. ANS: p-value = 0.01 PTS: 1 REF: 676-677 BLM: Higher Order - Analyze

TOP: 4–5

23. A car dealer was interested in comparing two brands of tires to see if they yielded different wear length (in thousands of km). The dealer selected eight cars at random and used each of the brands of tires on each car. The wear length was recorded as follows: Car 1 2 3 4 5 6 7 8

Brand A 45 43 54 63 39 47 56 50

Brand B 47 40 57 61 43 45 58 55

Use the sign test to see if the distribution of wear length is the same for both brands of tires. Use α = 0.05.

ANS:
H0: The distributions of wear length are identical for the two brands of tires, and p = 0.50.
Ha: The distributions of wear length are not identical for the two brands of tires, and p ≠ 0.50.

Car 1 2 3 4 5 6 7 8

Brand A 45 43 54 63 39 47 56 50

Brand B 47 40 57 61 43 45 58 55

Sign of Difference – + – + – + – –

The observed value of the test statistic (which is the number of “plus” signs in the table) is x = 3.

Using the cumulative binomial probability table with n = 8 and p = 0.50, the observed significance level (p-value) is 2P(x ≤ 3) = 2(0.363) = 0.726. Since p-value > 0.05, do not reject H0, and therefore conclude the distributions of wear length for the two brands of tires are identical.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3
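The two-tailed sign-test p-value in question 23 can be recomputed exactly rather than from the rounded table entry. This is a minimal sketch; the variable names are illustrative and the tire data are transcribed from the question (a "plus" means Brand A outlasted Brand B).

```python
from math import comb

brand_a = [45, 43, 54, 63, 39, 47, 56, 50]
brand_b = [47, 40, 57, 61, 43, 45, 58, 55]

# Keep only the sign of each non-zero difference A - B.
signs = ["+" if a > b else "-" for a, b in zip(brand_a, brand_b) if a != b]
n = len(signs)
x = signs.count("+")

# x lies below n/2 here, so the lower tail P(X <= x) is the one to double.
lower_tail = sum(comb(n, i) for i in range(x + 1)) / 2**n
p_value = min(1.0, 2 * lower_tail)

print(x, n, round(lower_tail, 3), round(p_value, 4))
```

The exact value is slightly larger than 0.726 because the table entry 0.363 is rounded before doubling.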

24. A dog kennel manager was interested in determining whether there is a difference in the time it takes a dog to complete an obstacle course for two different courses. A random sample of 36 dogs was selected and the time it took each dog to complete each course was recorded. In 12 cases it took the dog longer to complete course 1. Use the normal approximation to the sign test to determine if there is a significant difference in the time it takes to complete the two obstacle courses. Use α = 0.05.

ANS:
H0: The distributions of completion time for an obstacle course are identical for the two different courses, and p = 0.50.
Ha: The distributions of completion time for an obstacle course are not identical for the two different courses, and p ≠ 0.50.
Since n = 36 > 25, we can use the normal approximation to the sign test. The z-test statistic is z = (x – 0.5n)/(0.5√n) = (12 – 18)/3 = –2.0. The rejection region at α = 0.05 is |z| > 1.96. Since |z| = 2 > 1.96, reject H0 and conclude that the distributions of completion time for an obstacle course are not identical for the two different courses.

PTS: 1 REF: 670-671 BLM: Higher Order - Evaluate

TOP: 3
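The normal approximation used in question 24 is short enough to sketch directly. The function name `sign_test_z` is illustrative; it standardizes the count x of one sign among n non-tied pairs using mean 0.5n and standard deviation 0.5√n.

```python
import math


def sign_test_z(x, n):
    """Large-sample z statistic for the sign test with p = 0.5."""
    return (x - 0.5 * n) / (0.5 * math.sqrt(n))


def two_tailed_p(z):
    """Two-tailed standard normal p-value."""
    return math.erfc(abs(z) / math.sqrt(2))


z = sign_test_z(12, 36)   # 12 of 36 dogs took longer on course 1
p_value = two_tailed_p(z)
reject = abs(z) > 1.96    # two-tailed rejection region at the 5% level

print(z, round(p_value, 4), reject)
```

The same function can be applied to any of the sign-test questions in this section that give a count and a number of non-tied pairs.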

25. A paired-difference experiment was conducted to compare two populations. The data are shown in the table. Use a sign test to determine whether the population distributions are different.

Pairs:        1    2    3     4    5     6    7
Population 1: 9.6  8.8  10.0  8.4  11.1  9.0  8.1
Population 2: 9.5  8.1  9.7   8.5  10.6  8.8  7.6

a. State the null and alternative hypotheses for the test.
b. Determine an appropriate rejection region with α ≈ 0.01.
c. Calculate the observed value of the test statistic.
d. Do the data present sufficient evidence to indicate that populations 1 and 2 are different?

ANS:
a. Define x = number of positive differences, p = P(positive difference). The hypotheses of interest are H0: p = 0.5 vs. Ha: p ≠ 0.5.
b. With n = 7, the rejection region must be calculated using the binomial formula or the binomial tables with n = 7 and p = 0.5. If we choose to use {x = 0, x = 7} as the rejection region, then α = 0.008 + 0.008 = 0.016. If we choose to use {x ≤ 1, x ≥ 6} as the rejection region, then α = 0.062 + 0.062 = 0.124. Clearly α = 0.124 is too large. The rejection region will be x = 0 or x = 7 with α = 0.016.
c. There are x = 6 positive differences. The observed value of the test statistic is x = 6.
d. Since the observed value of the test statistic x = 6 does not fall in the rejection region, H0 is not rejected. We cannot detect a difference between the populations.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

Gourmet Meals Narrative Two gourmets, A and B, rated 22 meals on a scale of 1 to 10. The data are shown in the table. Do the data provide sufficient evidence to indicate that one of the gourmets tends to give higher ratings than the other? Meal 1 2 3 4 5 6 7 8 9 10 11

A 5 3 6 7 1 6 8 6 1 3 5

B 7 4 3 6 2 3 8 7 4 2 8

Meal 12 13 14 15 16 17 18 19 20 21 22

A 7 3 2 5 8 8 3 3 4 2 4

B 4 1 2 7 9 7 5 2 3 1 2

26. Refer to Gourmet Meals Narrative. Test by using the sign test with a value of α near 0.05. Use the binomial tables to find the exact rejection region for the test.

ANS: Define p = P(gourmet A’s rating exceeds gourmet B’s rating for a given meal) and x = number of meals for which gourmet A exceeds B. The hypotheses to be tested are H0: p = 0.50 vs. Ha: p ≠ 0.50 using the sign test with x as the test statistic. Notice that n = 20, since a tie rating was given to meals 7 and 14. There are x = 11 positive differences, hence the observed value of the test statistic is x = 11.

Critical value approach: Various two-tailed rejection regions are tried in order to find a region with α ≈ 0.05. These are shown in the following table:

Rejection Region    α
x ≤ 3; x ≥ 17       0.002
x ≤ 4; x ≥ 16       0.012
x ≤ 5; x ≥ 15       0.042
x ≤ 6; x ≥ 14       0.116

We choose to reject H0 if x ≤ 5 or x ≥ 15 with α = 0.042. Since x = 11, H0 is not rejected. There is insufficient evidence to indicate a difference between the two gourmets.

p-value approach: For the observed value x = 11, calculate the two-tailed p-value: p-value = 2P(x ≥ 11) = 2(1 – 0.588) = 0.824. Since the p-value is greater than 0.05, H0 is not rejected. There is insufficient evidence to indicate a difference between the two gourmets.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3
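The search for a two-tailed rejection region in question 26 can be sketched by computing the exact α for symmetric regions of the form {x ≤ c or x ≥ n – c}. The function name is illustrative. Exact values differ slightly from the table-based figures in the answer, which double a probability already rounded to three decimals.

```python
from math import comb


def two_tailed_alpha(c, n):
    """Exact alpha for the region {x <= c or x >= n - c} when p = 0.5.

    Assumes c < n / 2 so that the two tails do not overlap.
    """
    lower_tail = sum(comb(n, i) for i in range(c + 1)) / 2**n
    return 2 * lower_tail  # the binomial with p = 0.5 is symmetric


n = 20  # 22 meals minus the 2 tied ratings
for c in (3, 4, 5, 6):
    print(f"x <= {c} or x >= {n - c}: alpha = {two_tailed_alpha(c, n):.4f}")
```

The region with c = 5 is the largest one whose α stays below 0.05, which matches the choice made in the answer.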

27. Refer to Gourmet Meals Narrative. Use the large-sample z statistic for testing. (Note: Although the large-sample approximation is suggested for n ≥ 25, it works fairly well for values of n as small as 15.)

ANS: The large-sample z statistic, with n = 20 and p = 0.5, is z = (x – 0.5n)/(0.5√n) = (11 – 10)/(0.5√20) = 0.45, and the two-tailed rejection region with α = 0.05 is |z| > 1.96. The null hypothesis is not rejected.

PTS: 1 REF: 670-671 BLM: Higher Order - Evaluate

TOP: 3

28. Refer to Gourmet Meals Narrative. Compare the results of the previous two questions.

ANS: The results are the same.

PTS: 1 REF: 669-671 BLM: Higher Order - Understand

TOP: 3

TV Commercials Narrative

It is important to sponsors of television shows that viewers remember as much as possible about the commercials. The advertising executive of a large company is trying to decide which of two commercials to use on a weekly half-hour comedy. To help make a decision, she decides to have 12 individuals watch both commercials. After each viewing, each respondent is given a quiz consisting of 10 questions. The number of correct responses was recorded and is listed below. Assume that the quiz results are not normally distributed.

Quiz Scores

Respondent 1 2 3 4 5 6 7 8 9 10 11 12

Commercial 1 7 8 6 10 5 7 5 4 6 7 5 8

Commercial 2 9 9 6 10 4 9 7 5 8 9 6 10

29. Refer to TV Commercials Narrative. Which test is appropriate for this situation? ANS: the sign test PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3

30. Refer to TV Commercials Narrative. Do these data provide enough evidence at the 5% significance level to conclude that the two commercials differ? Justify your conclusion.

ANS:
H0: The two population locations are equal.
Ha: The two population locations are not equal.
Rejection region:
Test statistic: z = –2.53
Conclusion: Reject the null hypothesis. Yes, these data provide enough evidence at the 5% significance level to conclude that the two commercials differ.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

Typing Speed Narrative

Ten secretaries were selected at random from among the secretaries of a large university. The typing speed (number of words per minute) was recorded for each secretary on two different brands of computer keyboards. Assume that the typing speeds are not normally distributed. The following results were obtained.

Computer Keyboard
Secretary   Brand A   Brand B
Li          72        74
Shantal     80        86
Carol       68        72
Donna       74        70
Ellen       86        85
Faith       75        73
Mina        78        72
Heather     69        65
Ingrid      76        79
Jody        65        64

31. Refer to Typing Speed Narrative. Which test is appropriate for this situation? ANS: The sign test PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3

32. Refer to Typing Speed Narrative. Perform the test you suggested in the question above to determine if these data provide enough evidence at the 5% significance level to infer that the brands differ with respect to typing speed.

ANS:
H0: The two population locations are the same.
Ha: The two population locations differ.
Rejection region:
Test statistic: z = 0.63
Conclusion: Don’t reject the null hypothesis. No, these data don’t provide enough evidence at the 5% significance level to infer that the brands differ with respect to typing speed.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

Books Manuscripts Narrative

In general, before an academic publisher agrees to publish a book, each manuscript is thoroughly reviewed by university professors. Suppose that the Duxbury Publishing Company has recently received two manuscripts for statistics books. To help them decide which one to publish, both are sent to 30 professors of statistics who rate the manuscripts to judge which one is better. Suppose that 10 professors rate manuscript 1 better and 20 rate manuscript 2 better.

33. Refer to Books Manuscripts Narrative. Which test is appropriate for this situation?

ANS: the sign test

PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3

34. Refer to Books Manuscripts Narrative. Can Duxbury conclude at the 5% significance level that manuscript 2 is more highly rated than manuscript 1?

ANS:
H0: The two population locations are the same.
Ha: The location of population 1 (manuscript 1) is to the left of the location of population 2.
Rejection region:
Test statistic: z = –1.83
Conclusion: Reject the null hypothesis. Yes, Duxbury can conclude at the 5% significance level that manuscript 2 is more highly rated than manuscript 1.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

35. Refer to Books Manuscripts Narrative. What is the p-value of the test you conducted in the previous question? ANS: p-value = 0.0336 PTS: 1 REF: 669-670 BLM: Higher Order - Apply

TOP: 3

36. Refer to Books Manuscripts Narrative. Explain how to use the p-value for testing the hypotheses at the 5% significance level.

ANS: Since p-value < α, we reject the null hypothesis.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

Ice Cream Narrative A supermarket chain has its own house brand of ice cream. The general manager claims that her ice cream is better than the ice cream sold by a well-known ice cream parlour chain. To test the claim, 40 individuals are randomly selected to participate in the following experiment. Each respondent is given the two brands of ice cream to taste (without any identification) and asked to judge which one is better. Suppose that 25 people judge the ice cream parlour brand as better, 4 say that the brands taste the same, and the rest claim that the supermarket brand is better. 37. Refer to Ice Cream Narrative. Which test is appropriate for this situation? ANS: the sign test PTS: 1 REF: 669-670 BLM: Higher Order - Analyze

TOP: 3

38. Refer to Ice Cream Narrative. Can we conclude at the 1% significance level that the general manager’s claim is false? Justify your answer.

ANS:
H0: The two population locations are the same.
Ha: The location of population 1 (own house brand of ice cream) is to the left of the location of population 2 (ice cream parlour brand).
Rejection region:
Test statistic: z = –2.85
Conclusion: Reject the null hypothesis. Yes, we can conclude at the 1% significance level that the general manager’s claim is false.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

39. Refer to Ice Cream Narrative. What is the p-value of the test in the previous question? ANS: p-value = 0.0022 PTS: 1 REF: 669-670 BLM: Higher Order - Apply

TOP: 3

40. A matched pairs experiment yielded the following results: Number of positive differences = 20 Number of negative differences = 8 Number of 0 differences = 2 Can we infer at the 5% significance level that the location of population 1 is to the right of the location of population 2? Justify your response.

ANS:
H0: The two population locations are the same.
Ha: The location of population 1 is to the right of the location of population 2.
We apply the normal approximation of the binomial distribution and use the sign test with x = 20 and n = 28.
Rejection region:
Test statistic: z = 2.27
Conclusion: Reject the null hypothesis. Yes, we can infer at the 5% significance level that the location of population 1 is to the right of the location of population 2.

PTS: 1 REF: 669-670 BLM: Higher Order - Evaluate

TOP: 3

41. Two aptitude tests are currently being used to screen applicants for a certain position within a company. The question arose as to whether the two tests are comparable, i.e., whether they yield the same results. Six applicants were selected at random to take both tests (in a random order). The following scores were recorded: Applicant 1 2 3 4 5 6

Test A 85 93 98 68 76 83

Test B 87 94 92 73 73 85

Use the Wilcoxon signed-rank test to determine whether there is a difference in scores between the two tests. Use α = 0.10.

ANS:
H0: The population frequency distributions of scores are identical for the two aptitude tests.
Ha: The population frequency distributions of scores are not identical for the two aptitude tests.

Applicant 1 2 3 4 5 6

Test A 85 93 98 68 76 83

Test B 87 94 92 73 73 85

Difference –2 –1 +6 –5 +3 –2

Rank 2.5 1 6 5 4 2.5

The rank sum for positive differences and the rank sum for negative differences are T+ = 6 + 4 = 10 and T– = 2.5 + 1 + 5 + 2.5 = 11. The value of the test statistic is T = min(T+, T–) = min(10, 11) = 10.

For a two-tailed test with α = 0.10 and n = 6, the critical value of T for the Wilcoxon signed-rank test is 2. Reject H0 if T ≤ 2. Since T > 2, do not reject H0, and therefore we cannot conclude there is a difference in the frequency distributions of scores for the two aptitude tests.

PTS: 1 REF: 674-677 BLM: Higher Order - Evaluate

TOP: 4–5

42. Suppose you wish to detect a difference in the locations of two population distributions based on a paired-difference experiment consisting of n = 35 pairs.
a. Give the null and alternative hypotheses for the Wilcoxon signed-rank test.
b. Give the test statistic.
c. Give the rejection region for the test for α = 0.05.
d. If T+ = 339, what are your conclusions? [Note: T+ + T– = n(n + 1)/2.]
e. Conduct the test using the large-sample z test. Compare your results with the nonparametric test results in part d.

ANS:
a. H0: Population distributions 1 and 2 are identical. Ha: Population distributions 1 and 2 differ in location.
b. For a two-tailed test, the test statistic is T, the smaller of the rank sum for positive differences (T+) and the rank sum for negative differences (T–).
c. From the table of critical values of T for the Wilcoxon signed-rank test, with n = 35, α = 0.05 and a two-tailed test, the rejection region is T ≤ 195.
d. Since T+ = 339, we can calculate T– = n(n + 1)/2 – T+ = 630 – 339 = 291. The test statistic T is the smaller of T+ and T–, or T = 291, and H0 is not rejected. There is no evidence of a difference between the two distributions.
e. Since n > 25, the large-sample approximation to the signed-rank test can be used to test the hypotheses given in part a. Calculate E(T) = n(n + 1)/4 = 315 and σ = √[n(n + 1)(2n + 1)/24] = 61.05. The test statistic is z = (T+ – E(T))/σ = (339 – 315)/61.05 = 0.39. The two-tailed rejection region with α = 0.05 is |z| > 1.96 and H0 is not rejected. The results agree with part d.

PTS: 1 REF: 674-678 BLM: Higher Order - Evaluate

TOP: 4–5
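The quantities in question 42 follow from n = 35 and the rank sum 339 alone, as the sketch below shows. It assumes 339 is the rank sum of the positive differences; if it were the negative rank sum, only the sign of z would change. The function name is illustrative.

```python
import math


def signed_rank_large_sample(rank_sum, n):
    """Return (other rank sum, mean, sd, z) for the Wilcoxon signed-rank test."""
    total = n * (n + 1) / 2                      # T+ + T- always equals this
    mean = n * (n + 1) / 4                       # expected rank sum under H0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (rank_sum - mean) / sd
    return total - rank_sum, mean, sd, z


other_sum, mean, sd, z = signed_rank_large_sample(339, 35)
t = min(339, other_sum)      # small-sample test statistic
reject_exact = t <= 195      # critical value quoted in the answer to part c
reject_z = abs(z) > 1.96     # two-tailed large-sample rejection region

print(other_sum, mean, round(sd, 2), round(z, 2), reject_exact, reject_z)
```

Both versions of the test lead to the same decision, which is the comparison part e asks for.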

43. Eight people were asked to perform a simple puzzle-assembly task under normal conditions and under stressful conditions. During the stressful time, a mild shock was delivered to subjects 3 minutes after the start of the experiment and every 30 seconds thereafter until the task was completed. Blood pressure readings were taken under both conditions. The data in the table are the highest readings during the experiment. Do the data present sufficient evidence to indicate higher blood pressure readings under stressful conditions? Analyze the data using the Wilcoxon signed-rank test for a paired experiment.

Subject   Normal   Stressful
1         128      132
2         119      120
3         117      127
4         120      122
5         120      123
6         130      127
7         127      132
8         122      122

ANS:

The differences, d = normal – stressful, along with their ranks (according to absolute magnitude), are shown in the following table.

Subject    1    2    3     4    5     6     7    8
d          –4   –1   –10   –2   –3    3     –5   0
Rank |d|   5    1    7     2    3.5   3.5   6    none

Then T+ = 3.5 and T– = 24.5 with n = 7 (one tie). Indexing n = 7 and α = 0.05 in the table of critical values of T for the Wilcoxon signed-rank test gives the lower portion of the two-tailed rejection region; T+ = 3.5 falls in it, so H0 is rejected. We conclude that higher blood pressure readings occur during conditions of stress.

PTS: 1 REF: 674-677 BLM: Higher Order - Evaluate

TOP: 4–5

44. Given the statistics , , and n = 50 from a matched pairs experiment, perform the Wilcoxon signed-rank test to determine whether we can infer at the 5% significance level that the two population locations differ.

ANS:
H0: The two population locations are the same.
Ha: The two population locations differ.
Rejection region:
Test statistic: z = –1.13
Conclusion: Don’t reject the null hypothesis. We can’t infer at the 5% significance level that the two population locations differ.

PTS: 1 REF: 677-678 BLM: Higher Order - Evaluate

TOP: 4–5

45. In testing the hypotheses H0: The two population locations are the same vs. Ha: The two population locations are different, the statistics , , and are calculated with data drawn from a matched pairs experiment.
a. Which test is used for testing the hypotheses above?
b. What is the value of the test statistic?
c. What is the p-value of this test?
d. Can we infer at the 5% significance level that the population locations differ?

ANS:
a. The Wilcoxon signed-rank test
b. z = –2.31
c. p-value = 0.0208
d. Since p-value < α, we reject the null hypothesis. Yes, we can infer at the 5% significance level that the population locations differ.

PTS: 1 REF: 677-678 BLM: Higher Order - Evaluate

TOP: 4–5

46. In testing the hypotheses H0: The two population locations are the same vs. Ha: The location of population A is to the right of the location of population B, the statistics , and are calculated with data drawn from a matched pairs experiment.
a. Which test is used in testing the hypotheses above?
b. What is the value of the test statistic?
c. What is the p-value of this test?
d. Can we infer at the 1% significance level that the location of population A is to the right of the location of population B? Explain.

ANS:
a. The Wilcoxon signed-rank test
b. z = 1.97
c. p-value = 0.0244
d. Since p-value > α, don’t reject the null hypothesis. No, we can’t infer at the 1% significance level that the location of population A is to the right of the location of population B.

PTS: 1 REF: 677-678 BLM: Higher Order - Evaluate

TOP: 4–5

47. A computer laboratory manager was interested in whether there was a difference in functioning time before needing to be recharged for three battery packs for laptop computers. The manager took a random sample of six battery packs of each brand and tested them. The results, in hours of functioning before needing to be recharged, were recorded as follows:

Brand 1 6.75 7.30 7.60 7.50 6.90 7.25

Brand 2 7.80 7.65 7.72 7.85 7.45 7.00

Brand 3 6.25 6.54 6.20 6.35 6.39 6.95

The manager, unsure that the assumptions for the usual parametric analysis of variance were valid, decided to employ nonparametric methods. Use the appropriate nonparametric procedure to determine whether the distribution of functioning time before needing to be recharged is the same for the three brands of battery packs. Use α = 0.05.

ANS:
The Kruskal–Wallis test will be used to test:
H0: The distributions of functioning times before the battery needs to be recharged are identical for the three brands of battery packs.
Ha: At least two of the distributions of functioning times before the battery needs to be recharged differ in location.
To find the value of H, we first rank the n = 18 observations from the smallest (rank 1) to the largest (rank 18). These ranks are shown in parentheses in the table below.

Brand 1: 6.75 (6)   7.30 (11)  7.60 (14)  7.50 (13)  6.90 (7)   7.25 (10)   T1 = 61
Brand 2: 7.80 (17)  7.65 (15)  7.72 (16)  7.85 (18)  7.45 (12)  7.00 (9)    T2 = 87
Brand 3: 6.25 (2)   6.54 (5)   6.20 (1)   6.35 (3)   6.39 (4)   6.95 (8)    T3 = 23

H = [12/(n(n + 1))] Σ(Ti²/ni) – 3(n + 1) = [12/(18(19))](61²/6 + 87²/6 + 23²/6) – 3(19) = 12.117

With α = 0.05 and df = k – 1 = 2, we reject H0 if H > χ² = 5.99147. Since H > 5.99147, reject H0 and conclude that there is sufficient evidence at α = 0.05 to say that at least two of the distributions of functioning times before the battery needs to be recharged differ in location.

PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6
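The H statistic in question 47 depends only on the rank sums and sample sizes, so it can be checked with a few lines. This sketch uses the formula H = [12/(n(n + 1))] Σ(Ti²/ni) – 3(n + 1) without a tie correction; the function name is illustrative, and the critical value 5.99147 is the chi-square value quoted in the answer.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H computed from rank sums (no correction for ties)."""
    n = sum(sizes)
    weighted = sum(t * t / m for t, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * weighted - 3 * (n + 1)


# Rank sums for the three battery brands, six packs each.
h = kruskal_wallis_h([61, 87, 23], [6, 6, 6])
reject = h > 5.99147  # chi-square critical value for alpha = 0.05, df = 2

print(round(h, 3), reject)
```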

48. Three treatments were compared using a completely randomized design. The data are shown in the table.

Treatment 1: 29 32 26 27 31 29
Treatment 2: 30 34 33 31 32 35 33 36
Treatment 3: 28 27 30 25 27 23 24

Do the data provide sufficient evidence to indicate a difference in location for at least two of the population distributions? Test using the Kruskal–Wallis H statistic with α = 0.05.

ANS: The Kruskal–Wallis H test provides a nonparametric analogue to the analysis of variance F test for a completely randomized design. The data are jointly ranked from smallest to largest. The data with corresponding ranks in parentheses are shown below.

Treatment 1: 29 (9.5)   32 (15.5)  26 (4)     27 (6)     31 (13.5)  29 (9.5)                        T1 = 58
Treatment 2: 30 (11.5)  34 (19)    33 (17.5)  31 (13.5)  32 (15.5)  35 (20)  33 (17.5)  36 (21)    T2 = 135.5
Treatment 3: 28 (8)     27 (6)     30 (11.5)  25 (3)     27 (6)     23 (1)   24 (2)                T3 = 37.5

The test statistic, based on the rank sums, is

H = [12/(21(22))](58²/6 + 135.5²/8 + 37.5²/7) – 3(22) = 13.39.

The hypotheses of interest are H0: The three population distributions are identical vs. Ha: At least two of the three population distributions differ in location. The rejection region with α = 0.05 and df = k – 1 = 2 is based on the chi-square distribution, or H > χ² = 5.99147. The null hypothesis is rejected and we conclude that there is a difference in location among the three treatments.

PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

Heart Rates and Exercise Narrative

An experiment was conducted to examine the effect of age on heart rate when a person is subjected to a specific amount of exercise. Ten men were randomly selected from each of four age groups: 10–19, 20–39, 40–59, and 60–69. Each man walked a treadmill at a fixed grade for a period of 12 minutes, and the increase in heart rate (the difference before and after exercise) was recorded (in beats per minute). The data are shown in the table.

Age Group
10–19: 32 36 29 30 42 38 36 32 39 25   Total: 339
20–39: 27 30 36 34 24 31 27 37 24 35   Total: 305
40–59: 40 28 25 36 31 29 33 37 30 36   Total: 325
60–69: 31 32 37 39 24 23 28 27 36 35   Total: 312

49. Refer to Heart Rates and Exercise Narrative. Do the data present sufficient evidence to indicate differences in location for at least two of the four age groups? Test using the Kruskal–Wallis H test with α = 0.01.

ANS: The data with corresponding ranks in parentheses are shown below.

Age Group
10–19: 32 (21)  36 (29.5)  29 (12.5)   30 (15)    42 (40)  38 (36)    36 (29.5)    32 (21)  39 (37.5)  25 (5.5)    T1 = 247.5
20–39: 27 (8)   30 (15)    36 (29.5)   34 (24)    24 (3)   31 (18)    27 (8)       37 (34)  24 (3)     35 (25.5)   T2 = 168
40–59: 40 (39)  28 (10.5)  25 (5.5)    36 (29.5)  31 (18)  29 (12.5)  33 (23)      37 (34)  30 (15)    36 (29.5)   T3 = 216.5
60–69: 31 (18)  32 (21)    37 (34)     39 (37.5)  24 (3)   23 (1)     28 (10.5)    27 (8)   36 (29.5)  35 (25.5)   T4 = 188

Each group has ni = 10 observations.

The test statistic, based on the rank sums, is

H = [12/(40(41))](247.5²/10 + 168²/10 + 216.5²/10 + 188²/10) – 3(41) = 2.63.

The hypotheses of interest are H0: The four population distributions are identical vs. Ha: At least two of the four population distributions differ in location. The rejection region with α = 0.01 and df = k – 1 = 3 is based on the chi-square distribution, or H > χ² = 11.3449. The null hypothesis is not rejected. There is no evidence of a difference in location.

PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

50. Refer to Heart Rates and Exercise Narrative. Find the approximate p-value for the test in the previous question.

ANS: Since the observed value H = 2.63 is less than the 10% chi-square point, χ² = 6.25139, the p-value is greater than 0.10.

PTS: 1 REF: 680-684 BLM: Higher Order - Analyze

TOP: 6

Movie Ratings Narrative

A movie critic wanted to determine whether or not moviegoers of different age groups evaluated a movie differently. With this objective, he commissioned a survey that asked people their ratings of the most recently watched movies. The rating categories were 1 = terrible, 2 = fair, 3 = good, and 4 = excellent. Each respondent was also asked to categorize his or her age as either 1 = teenager, 2 = young adult (20–34), 3 = middle age (35–50), and 4 = senior (over 50). The results are shown below.

Movie Ratings
Teenager:    3 4 3 3 3 4 2 4
Young Adult: 2 3 3 2 2 1 3 2
Middle Age:  3 2 1 2 2 3 1 4
Senior:      3 4 4 3 3 4 4 3

51. Refer to Movie Ratings Narrative. Which test can the movie critic use in this situation? ANS: the Kruskal–Wallis test PTS: 1 REF: 680 | 683-684 BLM: Higher Order - Analyze

TOP: 6

52. Refer to Movie Ratings Narrative. Do these data provide sufficient evidence to infer at the 5% significance level that there were differences in ratings among the different age categories? Justify your response. ANS: $H_0$: The locations of all four populations are the same. $H_a$: At least two population locations differ. Rejection region: $H > \chi^2_{0.05}$ = 7.81473 (df = 3). Test statistic: H = 11.0824 Conclusion: Reject the null hypothesis. Yes, these data provide sufficient evidence to infer at the 5% significance level that there were differences in ratings among the different age categories. PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

53. Refer to Movie Ratings Narrative. What statement can be made about the p-value for this test? ANS: 0.01 < p-value < 0.025 PTS: 1 REF: 680-684 BLM: Higher Order - Analyze

TOP: 6

Advertisement Narrative In a Kruskal–Wallis test to determine whether differences exist among three different advertisements, the following statistics were obtained: ,

and

54. Refer to Advertisement Narrative. Conduct the test at the 5% significance level. ANS: $H_0$: The locations of all three populations are the same. $H_a$: At least two population locations differ. Rejection region: $H > \chi^2_{0.05}$ = 5.99147 (df = 2). Test statistic: H = 10.167 Conclusion: Reject the null hypothesis. There is enough evidence to conclude that differences exist among the three advertisements. PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

55. Refer to Advertisement Narrative. What is the most accurate statement that can be made about the p-value of this test? ANS: 0.005 < p-value < 0.01 PTS: 1 REF: 680-684 BLM: Higher Order - Analyze

TOP: 6

56. Refer to Advertisement Narrative. Explain how to use the p-value for testing the hypotheses. ANS: Since p-value < $\alpha$ = 0.05, reject the null hypothesis.

PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

57. Apply the Kruskal–Wallis test to determine if there is enough evidence at the 5% significance level to infer that at least two populations differ.

Sample 1: 23, 22, 25, 20, 18
Sample 2: 25, 27, 17, 19, 20
Sample 3: 25, 22, 19, 21, 26

ANS: $H_0$: The locations of all three populations are the same. $H_a$: At least two population locations differ. Rejection region: $H > \chi^2_{0.05}$ = 5.99147 (df = 2). Test statistic: H = 0.38 Conclusion: Don’t reject the null hypothesis. There is not enough evidence at the 5% significance level to infer that at least two populations differ. PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6
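As a check on the preceding answer, the following Python sketch ranks the pooled observations of the three samples, averaging ranks for ties, and then applies the Kruskal–Wallis formula. It is an illustration under the usual textbook formula with no tie correction; the helper names are hypothetical.

```python
def midranks(values):
    """Ranks 1..n for the pooled values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = average
        i = j + 1
    return ranks


def kruskal_wallis(samples):
    """Return (rank sums, H) for a list of independent samples."""
    pooled = [x for sample in samples for x in sample]
    ranks = midranks(pooled)
    n = len(pooled)
    rank_sums, weighted, start = [], 0.0, 0
    for sample in samples:
        t = sum(ranks[start:start + len(sample)])
        rank_sums.append(t)
        weighted += t * t / len(sample)
        start += len(sample)
    return rank_sums, 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)


samples = [
    [23, 22, 25, 20, 18],
    [25, 27, 17, 19, 20],
    [25, 22, 19, 21, 26],
]
rank_sums, h = kruskal_wallis(samples)
print(rank_sums, round(h, 2))
```

Because H is well below the 5% critical value for df = 2, the code agrees with the decision not to reject the null hypothesis.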

Customers’ Ages Narrative The marketing manager of a pizza chain is in the process of examining some of the demographic characteristics of her customers. In particular, she would like to investigate the belief that the ages of the customers of pizza parlours, hamburger emporiums, and fast-food chicken restaurants are different. As an experiment, the ages of eight customers of each of the restaurants are recorded and listed below. From previous analysis we know that the ages are not normally distributed.

Customers’ Ages
Pizza: 23, 19, 25, 17, 36, 25, 28, 31
Hamburger: 26, 20, 18, 35, 33, 25, 19, 17
Chicken: 25, 28, 36, 23, 39, 27, 38, 31

58. Refer to Customers’ Ages Narrative. Do these data provide enough evidence at the 10% significance level to infer that there are differences in ages among the customers of the three restaurants? Justify your response. ANS: $H_0$: The locations of all three populations are the same. $H_a$: At least two population locations differ. Rejection region: $H > \chi^2_{0.10}$ = 4.60517 (df = 2). Test statistic: H = 4.3738 Conclusion: Don’t reject the null hypothesis. These data do not provide enough evidence at the 10% significance level to infer that there are differences in ages among the customers of the three restaurants. PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

59. Refer to Customers’ Ages Narrative. Using the appropriate statistical table, what statement can be made about the p-value for this test? ANS:

p-value > 0.10 PTS: 1 REF: 680-684 BLM: Higher Order - Analyze

TOP: 6
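The table-based p-value statements in these answers all follow the same lookup: find the two tabled chi-square critical values that surround the observed statistic. A minimal Python sketch of that lookup follows. The critical values are standard chi-square table entries for df = 2 and df = 3 (several of them appear in the answers in this chapter); the function name and the restriction to these two rows are assumptions made for illustration.

```python
# Upper-tail areas and matching chi-square critical values (standard table entries).
TAIL_AREAS = ("0.10", "0.05", "0.025", "0.01", "0.005")
CHI_SQUARE_CRITICAL = {
    2: (4.60517, 5.99147, 7.37776, 9.21034, 10.5966),
    3: (6.25139, 7.81473, 9.34840, 11.3449, 12.8381),
}


def bracket_p_value(statistic, df):
    """Bound the upper-tail p-value using the tabled critical values for this df."""
    critical = CHI_SQUARE_CRITICAL[df]
    if statistic < critical[0]:
        return "p-value > 0.10"
    for i in range(len(critical) - 1):
        if critical[i] <= statistic < critical[i + 1]:
            return f"{TAIL_AREAS[i + 1]} < p-value < {TAIL_AREAS[i]}"
    return "p-value < 0.005"


print(bracket_p_value(4.3738, 2))    # Customers' Ages, H = 4.3738
print(bracket_p_value(11.0824, 3))   # Movie Ratings, H = 11.0824
print(bracket_p_value(10.167, 2))    # Advertisement, H = 10.167
```

Statistical software would report an exact p-value instead; the bracket is only what a printed table can support.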

60. Refer to Customers’ Ages Narrative. Explain how to use the p-value for testing the hypotheses. ANS: Since p-value > $\alpha$ = 0.10, don’t reject the null hypothesis.

PTS: 1 REF: 680-684 BLM: Higher Order - Evaluate

TOP: 6

Reaction Times Narrative The reaction times to three stimuli were recorded for each of eight subjects. The data, recorded in seconds, are shown below. This problem uses Friedman’s Fr test to determine if there is a difference among the population distributions of reaction times. Subject 1 2 3 4 5 6 7 8

Stimulus I 3.5 4.7 6.1 2.8 4.3 2.6 5.0 3.2

Stimulus II 4.8 6.2 5.2 4.5 4.2 4.3 4.4 3.9

Stimulus III 5.3 5.9 5.8 5.7 5.8 3.7 5.1 4.6

61. Refer to Reaction Times Narrative. What experimental design is being used in this problem? ANS: randomized block design; subjects are the blocks PTS: 1 REF: 686 BLM: Higher Order - Analyze

TOP: 7

62. Refer to Reaction Times Narrative. State the null and alternative hypotheses. ANS: $H_0$: The three population distributions are identical. $H_a$: At least two of the three population distributions differ in location. PTS: 1 REF: 688-689 BLM: Higher Order - Understand

TOP: 7

63. Refer to Reaction Times Narrative. Describe what the test statistic $F_r$ is. What is the value of $F_r$ in this problem?

ANS: $F_r$ is a function of the samples’ rank sums $T_i$. To find the value of $F_r$, we rank the three treatment observations within each block (see below; ranks are in parentheses).

Subject: Stimulus I, Stimulus II, Stimulus III
1: 3.5 (1), 4.8 (2), 5.3 (3)
2: 4.7 (1), 6.2 (3), 5.9 (2)
3: 6.1 (3), 5.2 (1), 5.8 (2)
4: 2.8 (1), 4.5 (2), 5.7 (3)
5: 4.3 (2), 4.2 (1), 5.8 (3)
6: 2.6 (1), 4.3 (3), 3.7 (2)
7: 5.0 (2), 4.4 (1), 5.1 (3)
8: 3.2 (1), 3.9 (2), 4.6 (3)
Rank sum: $T_1$ = 12, $T_2$ = 15, $T_3$ = 21

$F_r = \frac{12}{bk(k+1)} \sum T_i^2 - 3b(k+1) = \frac{12}{8(3)(4)} (12^2 + 15^2 + 21^2) - 3(8)(4) = 5.25.$

PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7
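The within-block ranking and the Friedman statistic for the reaction-time data can be reproduced with a short Python sketch. It assumes there are no ties within any block (true for these data) and uses the standard formula; the function name is hypothetical.

```python
def friedman_no_ties(blocks):
    """Return (treatment rank sums, F_r) for blocks whose values contain no ties."""
    b, k = len(blocks), len(blocks[0])
    rank_sums = [0] * k
    for row in blocks:
        # Rank the k treatment values inside this block from 1 (smallest) to k.
        order = sorted(range(k), key=lambda j: row[j])
        for rank, j in enumerate(order, start=1):
            rank_sums[j] += rank
    f_r = 12.0 / (b * k * (k + 1)) * sum(t * t for t in rank_sums) - 3 * b * (k + 1)
    return rank_sums, f_r


# One row per subject: (Stimulus I, Stimulus II, Stimulus III), in seconds.
reaction_times = [
    (3.5, 4.8, 5.3),
    (4.7, 6.2, 5.9),
    (6.1, 5.2, 5.8),
    (2.8, 4.5, 5.7),
    (4.3, 4.2, 5.8),
    (2.6, 4.3, 3.7),
    (5.0, 4.4, 5.1),
    (3.2, 3.9, 4.6),
]
rank_sums, f_r = friedman_no_ties(reaction_times)
print(rank_sums, f_r)
```

The rank sums must total bk(k + 1)/2 = 48 for eight blocks and three treatments, which is a quick consistency check on any hand ranking.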

64. Refer to Reaction Times Narrative. Find the rejection region for $\alpha$ = 0.05.

ANS: With $\alpha$ = 0.05 and df = k – 1 = 2, we reject $H_0$ if $F_r > \chi^2_{0.05}$ = 5.99147.

PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

65. Refer to Reaction Times Narrative. Is there a difference among the population distributions of reaction times? Justify your answer. ANS: No. Since $F_r$ = 5.25 < 5.99147, do not reject $H_0$; there is insufficient evidence to conclude that the reaction times for the three stimuli are different. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

66. Refer to Reaction Times Narrative. What is the p-value for this problem? ANS:

0.05 < p-value < 0.10 PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

Teaching Methods Narrative Two different workbooks and two distinct teaching machines were to be evaluated on their effectiveness in teaching the concept of multiplication. A Grade 4 class of 24 subjects was randomly assigned to 4 groups, and each group in turn was randomly assigned to a teaching method. A test was given and the number of errors was recorded. This problem uses the Kruskal–Wallis H test to see if the number of errors differs from one teaching method to another. Workbook 1 Workbook 2 2 5 1 5 1 4 3 6 2 8 0 9

Teaching Machine 1 1 1 2 0 3 4

Teaching Machine 2 5 6 7 4 5 9

67. Refer to Teaching Methods Narrative. What experimental design did the statistician use? ANS: randomized block design; ability levels are the blocks PTS: 1 REF: 686 BLM: Higher Order - Analyze

TOP: 7

68. Refer to Teaching Methods Narrative. State the null and alternative hypotheses. ANS: $H_0$: The four population distributions are identical. $H_a$: At least two of the four population distributions differ in location. PTS: 1 REF: 688-689 BLM: Higher Order - Understand

TOP: 7

69. Refer to Teaching Methods Narrative. Describe what the test statistic $F_r$ is. What is the value of $F_r$ in this problem?

ANS: $F_r$ is a function of the samples’ rank sums $T_i$. To find the value of $F_r$, we first rank the four treatment observations within each block, as shown below (ranks are in parentheses).

Ability: Teaching Method 1, Teaching Method 2, Teaching Method 3, Teaching Method 4
Level A: 3 (2), 4 (3.5), 2 (1), 4 (3.5)
Level B: 1 (1), 3 (4), 2 (2.5), 2 (2.5)
Level C: 4 (2), 8 (3), 3 (1), 10 (4)
Level D: 6 (2), 8 (4), 4 (1), 7 (3)
Level E: 1 (2), 3 (4), 0 (1), 2 (3)
Rank sums: $T_1$ = 9, $T_2$ = 18.5, $T_3$ = 6.5, $T_4$ = 16

$F_r = \frac{12}{bk(k+1)} \sum T_i^2 - 3b(k+1) = \frac{12}{5(4)(5)} (9^2 + 18.5^2 + 6.5^2 + 16^2) - 3(5)(5) = 11.58.$ PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

70. Refer to Teaching Methods Narrative. Find the rejection region for $\alpha$ = 0.05. ANS: With $\alpha$ = 0.05 and df = k – 1 = 3, we reject $H_0$ if $F_r > \chi^2_{0.05}$ = 7.81473.

PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

71. Refer to Teaching Methods Narrative. Can we conclude the teaching methods are equally effective? Explain. ANS: No; since $F_r$ = 11.58 > 7.81473, we reject $H_0$ and conclude that the teaching methods are not equally effective. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

72. Refer to Teaching Methods Narrative. What is the observed significance level of this test? ANS: 0.005 < p-value < 0.010 PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

73. A toy store manager was interested in determining whether the assembly time is the same for three models of baby strollers. The manager selected five employees at random and asked each of them to assemble each of the strollers. The assembly time, in minutes, was recorded as follows: Employee 1 2 3 4 5

Model A 32 30 34 29 35

Model B 26 28 35 27 36

Model C 39 42 45 37 34

The manager wasn’t sure whether the assumptions for the usual analysis of variance were valid, so she decided to use a nonparametric procedure. Use the appropriate method to determine whether the assembly time is the same for the three models of baby strollers. Use $\alpha$ = 0.05. ANS: The Friedman test for randomized block design (employees are the blocks) will be used to test $H_0$: The distributions of assembly time are identical for the three models of baby strollers vs. $H_a$: At least two of the distributions of assembly time for the three models of baby strollers differ in location. To find the value of the Friedman test statistic $F_r$, we first rank the three treatment observations within each block, as shown below (ranks are in parentheses):

Employee: Model A, Model B, Model C
1: 32 (2), 26 (1), 39 (3)
2: 30 (2), 28 (1), 42 (3)
3: 34 (1), 35 (2), 45 (3)
4: 29 (2), 27 (1), 37 (3)
5: 35 (2), 36 (3), 34 (1)
Rank sum: $T_1$ = 9, $T_2$ = 8, $T_3$ = 13

Then, $F_r = \frac{12}{5(3)(4)} (9^2 + 8^2 + 13^2) - 3(5)(4) = 2.80.$ With $\alpha$ = 0.05 and df = k – 1 = 2, we reject $H_0$ if $F_r > \chi^2_{0.05}$ = 5.99147. Since $F_r$ = 2.80 < 5.99147, we do not reject $H_0$; there is insufficient evidence to conclude that the distributions of assembly time for the three models of baby strollers differ. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

74. A randomized block design is used to compare three treatments in six blocks.

Block: Treatment 1, Treatment 2, Treatment 3
1: 3.8, 3.7, 3.0
2: 3.4, 3.6, 2.3
3: 5.1, 5.6, 4.5
4: 3.1, 3.3, 3.2
5: 4.3, 4.7, 4.1
6: 3.0, 3.0, 2.6

a. Use the Friedman test to detect differences in location among the three treatment distributions. Test using $\alpha$ = 0.05. b. Find the approximate p-value for the test in (a). c. Perform an analysis of variance and give the ANOVA table for the analysis, using statistical software to approximate the p-value for the F-statistic in testing the equality of the three treatment means. Test using $\alpha$ = 0.05. d. Compare the p-values for the tests in (b) and (c). ANS: a. In using the Friedman test, data are ranked within a block from 1 to k. The treatment rank sums are then calculated as usual. The data and the corresponding ranks (in parentheses) are shown below.

Block: Treatment 1, Treatment 2, Treatment 3
1: 3.8 (3), 3.7 (2), 3.0 (1)
2: 3.4 (2), 3.6 (3), 2.3 (1)
3: 5.1 (2), 5.6 (3), 4.5 (1)
4: 3.1 (1), 3.3 (3), 3.2 (2)
5: 4.3 (2), 4.7 (3), 4.1 (1)
6: 3.0 (2.5), 3.0 (2.5), 2.6 (1)
Rank sum: $T_1$ = 12.5, $T_2$ = 16.5, $T_3$ = 7

The test statistic is $F_r = \frac{12}{6(3)(4)} (12.5^2 + 16.5^2 + 7^2) - 3(6)(4) = 7.58$ and the rejection region is $F_r > \chi^2_{0.05}$ = 5.99147. Hence, $H_0$ is rejected and we conclude that there is a difference among the three treatments. b. The observed value $F_r$ = 7.58 falls between $\chi^2_{0.025}$ = 7.37776 and $\chi^2_{0.01}$ = 9.21034. Hence, 0.01 < p-value < 0.025. c. The ANOVA table is shown below.

ANOVA
Source of Variation: df, SS, MS, F
Treatments: 2, 1.56, 0.78, 10.8333
Blocks: 5, 10.965, 2.193, 30.4583
Error: 10, 0.72, 0.072
Total: 17, 13.245

The hypotheses to be tested are $H_0: \mu_1 = \mu_2 = \mu_3$ vs. $H_a$: At least two of the $\mu_i$’s are different. The approximate p-value using Excel is p-value = 0.00314. Since p-value < 0.05, $H_0$ is rejected, and therefore we can conclude that at least two of the three treatment means are different. d. The p-value for the parametric test is p-value = 0.00314, while the p-value for the nonparametric test (Friedman test) is 0.01 < p-value < 0.025. Hence, if all of the parametric assumptions are met, the parametric test will be more powerful than its nonparametric analogue. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7
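The sums of squares in the ANOVA table for part (c) of the preceding question can be reproduced by hand or with a short Python sketch such as the one below. It assumes the usual randomized block computations (correction for the mean, then treatment, block, and error sums of squares); the function and key names are hypothetical, and the p-value itself is left to statistical software.

```python
def randomized_block_anova(rows):
    """rows[i][j] is the response for treatment i in block j."""
    k, b = len(rows), len(rows[0])
    n = k * b
    grand_total = sum(sum(r) for r in rows)
    cm = grand_total ** 2 / n                       # correction for the mean
    total_ss = sum(x * x for r in rows for x in r) - cm
    sst = sum(sum(r) ** 2 for r in rows) / b - cm   # treatments
    ssb = sum(sum(r[j] for r in rows) ** 2 for j in range(b)) / k - cm  # blocks
    sse = total_ss - sst - ssb
    mse = sse / ((k - 1) * (b - 1))
    return {
        "SST": sst,
        "SSB": ssb,
        "SSE": sse,
        "TotalSS": total_ss,
        "F_treatments": (sst / (k - 1)) / mse,
        "F_blocks": (ssb / (b - 1)) / mse,
    }


# Treatments 1, 2, 3 across blocks 1 to 6.
rows = [
    [3.8, 3.4, 5.1, 3.1, 4.3, 3.0],
    [3.7, 3.6, 5.6, 3.3, 4.7, 3.0],
    [3.0, 2.3, 4.5, 3.2, 4.1, 2.6],
]
anova = randomized_block_anova(rows)
for name, value in anova.items():
    print(f"{name}: {value:.4f}")
```

The error sum of squares is obtained by subtraction, so any slip in the treatment or block totals shows up immediately as a mismatch with the total sum of squares.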

Children and Antibiotics Narrative In a study of the palatability of antibiotics in children, a medical team used a voluntary sample of healthy children to assess their reactions to the taste of four antibiotics. The children’s response was measured on a 10 centimetre (cm) visual analogue scale incorporating the use of faces, from sad (low score) to happy (high score). The minimum score was 0 and the maximum was 10. For the accompanying data, each of five children was asked to taste each of four antibiotics and rate them using the visual (faces) analogue scale from 0 to 10 cm.

Child: 1, 2, 3, 4, 5
Antibiotic 1: 5.0, 8.3, 5.2, 8.1, 4.1
Antibiotic 2: 2.4, 9.4, 2.8, 9.6, 7.6
Antibiotic 3: 7.0, 6.8, 3.8, 5.5, 2.3
Antibiotic 4: 6.4, 9.8, 6.7, 8.7, 2.2

75. Refer to Children and Antibiotics Narrative. What design is used in collecting these data? ANS: The experiment was designed as a randomized block design, with children as blocks and the four antibiotics as treatments. PTS: 1 REF: 686 BLM: Higher Order - Analyze

TOP: 7

76. Refer to Children and Antibiotics Narrative. Using an appropriate statistical package for a two-way classification, produce a normal probability plot of the residuals as well as a plot of residuals vs. antibiotics. Do the usual analysis of variance assumptions appear to be satisfied? Report the ANOVA results. ANS:

The residual plots are shown below. Although the normality assumption does not look unreasonable (even though the scores are measured on an analogue scale), the equality of variance assumption is questionable.

Two-way Analysis of Variance

Analysis of Variance Table
Source: DF, SS, MS, F, P
Child: 4, 67.31, 16.83, 4.27, 0.022
Antibiotic: 3, 7.72, 2.57, 0.65, 0.596
Error: 12, 47.29, 3.94
Total: 19, 122.33

Since the p-value for antibiotics is 0.596, you cannot reject $H_0$. There is insufficient evidence to indicate differences in the responses to the four antibiotics. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

77. Refer to Children and Antibiotics Narrative. Use the appropriate nonparametric test to test for differences in the distributions of responses to the tastes of the four antibiotics. ANS: Friedman’s $F_r$ test is run using MINITAB, and the printout is shown below.

Friedman Test

Friedman test for Score by Antibiotic blocked by Child

S = 1.56 DF = 3 P = 0.668

Antibiotic: N, Median, Sum of Ranks
1: 5, 5.238, 12.0
2: 5, 6.313, 13.0
3: 5, 3.738, 10.0
4: 5, 6.663, 15.0

Grand median = 5.488

Since the p-value is 0.668, you cannot reject $H_0$. There is insufficient evidence to indicate differences in the responses to the four antibiotics. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

78. Refer to Children and Antibiotics Narrative. Comment on the results of the analysis of variance compared with the nonparametric test. ANS: The results are the same for both tests. PTS: 1 REF: 686-689 BLM: Higher Order - Understand

TOP: 7

Frozen TV Dinner Narrative The general manager of a frozen TV dinner maker must decide which one of four new dinners to introduce to the market. He decides to perform an experiment to help make a decision. Each dinner is sampled by ten people who then rate the product on a 7-point scale, where 1 = poor, and 7 = excellent. The results are shown below.

Taste Ratings
Respondent: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Dinner 1: 6, 5, 7, 6, 7, 7, 6, 5, 4, 7
Dinner 2: 6, 5, 7, 6, 6, 5, 4, 6, 4, 5
Dinner 3: 4, 2, 3, 5, 4, 3, 3, 4, 3, 6
Dinner 4: 5, 4, 4, 4, 3, 5, 4, 6, 5, 4

79. Refer to Frozen TV Dinner Narrative. Which statistical technique can the general manager use to help him make a decision? ANS: The Friedman test PTS: 1 REF: 686 BLM: Higher Order - Analyze

TOP: 7

80. Refer to Frozen TV Dinner Narrative. Can the general manager infer at the 5% significance level that there are differences in the taste ratings of the four dinners? Justify your decision. ANS: $H_0$: The locations of all four populations are the same. $H_a$: At least two population locations differ. Rejection region: $F_r > \chi^2_{0.05}$ = 7.81473 (df = 3). Test statistic: Conclusion: Reject the null hypothesis. Yes, the general manager can infer at the 5% significance level that there are differences in the taste ratings of the four dinners. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

81. Refer to Frozen TV Dinner Narrative. Using the appropriate statistical table, what statement can be made about the p-value for this test? ANS: p-value < 0.005

PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

Hamburger Ratings Narrative The restaurant critic at a newspaper claims that the hamburgers that one gets at the hamburger chain restaurants are all equally bad, and that people who claim to like one hamburger over others are victims of advertising. In fact, he claims that if there were no differences in appearance, then all hamburgers would be rated equally. To test the critic’s assertion, ten teenagers are asked to taste hamburgers from three different fast-food chains. Each hamburger is dressed in the same way (mustard, relish, tomato, and pickle) and with the same type of bun. The teenagers taste each hamburger and rate it on a 9-point scale with 1 = bad and 9 = excellent. The data are listed below.

Hamburger Ratings
Teenager: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Chain 1: 7, 5, 6, 9, 4, 4, 6, 5, 8, 9
Chain 2: 5, 3, 4, 8, 3, 5, 5, 4, 7, 8
Chain 3: 6, 4, 5, 8, 2, 4, 5, 5, 9, 7

82. Refer to Hamburger Ratings Narrative. Which statistical technique is appropriate if you want to compare the quality of hamburger of the three chain restaurants? ANS: the Friedman test PTS: 1 REF: 686 BLM: Higher Order - Analyze

TOP: 7

83. Refer to Hamburger Ratings Narrative. Can we infer at the 1% significance level that the critic is wrong? Justify your response. ANS: $H_0$: The locations of all three populations are the same. $H_a$: At least two population locations differ. Rejection region: $F_r > \chi^2_{0.01}$ = 9.21034 (df = 2). Test statistic: $F_r$ = 7.8 Conclusion: Don’t reject the null hypothesis. No, we can’t infer at the 1% significance level that the critic is wrong.

PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

84. Refer to Hamburger Ratings Narrative. Using the appropriate statistical table, what statement can be made about the p-value for this test? ANS: 0.01 < p-value < 0.025 PTS: 1 REF: 686-689 BLM: Higher Order - Analyze

TOP: 7

85. The following data were generated from a blocked experiment. Conduct a Friedman test at the 5% significance level to determine if at least two population locations differ.

Block: 1, 2, 3, 4, 5
Treatment 1: 89, 77, 85, 65, 58
Treatment 2: 84, 67, 77, 72, 47
Treatment 3: 78, 52, 75, 62, 52
Treatment 4: 76, 81, 69, 73, 62

ANS: $H_0$: The locations of all four populations are the same. $H_a$: At least two population locations differ. Rejection region: $F_r > \chi^2_{0.05}$ = 7.81473 (df = 3). Test statistic: $F_r$ = 4.2 Conclusion: Don’t reject the null hypothesis. There is not enough evidence at the 5% significance level to infer that at least two population locations differ. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7

86. Apply the Friedman test to the accompanying table of ordinal data to determine whether we can infer at the 10% significance level that at least two population locations differ.

Block: 1, 2, 3, 4, 5
Treatment 1: 2, 1, 3, 2, 1
Treatment 2: 5, 4, 4, 5, 5
Treatment 3: 3, 5, 2, 4, 3
Treatment 4: 1, 4, 2, 1, 5

ANS: $H_0$: The locations of all four populations are the same. $H_a$: At least two population locations differ. Rejection region: $F_r > \chi^2_{0.10}$ = 6.25139 (df = 3). Test statistic: $F_r$ = 6.3 Conclusion: Reject the null hypothesis. Yes, we can infer at the 10% significance level that at least two population locations differ. PTS: 1 REF: 686-689 BLM: Higher Order - Evaluate

TOP: 7
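When ratings tie within a block, as they do in the ordinal data of the preceding question, each tied value receives the average of the ranks it occupies. The Python sketch below applies that rule and then the standard Friedman formula, with no tie correction to the statistic itself. The function names are hypothetical, and the data are read with treatments as rows and blocks as columns, which is an assumption about the flattened table.

```python
def block_midranks(row):
    """Rank the values in one block, giving tied values their average rank."""
    ranks = []
    for x in row:
        smaller = sum(v < x for v in row)
        equal = sum(v == x for v in row)
        ranks.append(smaller + (equal + 1) / 2)
    return ranks


def friedman(blocks):
    """Return (treatment rank sums, F_r); blocks[i] holds the k values of block i."""
    b, k = len(blocks), len(blocks[0])
    rank_sums = [0.0] * k
    for row in blocks:
        for j, r in enumerate(block_midranks(row)):
            rank_sums[j] += r
    f_r = 12.0 / (b * k * (k + 1)) * sum(t * t for t in rank_sums) - 3 * b * (k + 1)
    return rank_sums, f_r


# Treatments as rows, blocks as columns (assumed layout of the source table).
by_treatment = [
    [2, 1, 3, 2, 1],
    [5, 4, 4, 5, 5],
    [3, 5, 2, 4, 3],
    [1, 4, 2, 1, 5],
]
blocks = [list(column) for column in zip(*by_treatment)]
rank_sums, f_r = friedman(blocks)
print(rank_sums, round(f_r, 2))
```

Under this reading the statistic lands just above the 10% critical value of 6.25139 for df = 3, which matches the decision to reject the null hypothesis.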

Standardized College Entrance Exam Narrative Two psychometricians (educators who are experts in the field of psychological test design) were asked to rank six designs for a new standardized college entrance exam.

Design: 1, 2, 3, 4, 5, 6
Educator 1: 6, 5, 1, 3, 2, 4
Educator 2: 5, 6, 2, 1, 4, 3

This problem uses Spearman’s rank correlation coefficient to see if there is a positive relationship between the educators’ rankings.

87. Refer to Standardized College Entrance Exam Narrative. State the null and alternative hypotheses. ANS: $H_0$: There is no association between the rank pairs. $H_a$: The correlation between the rank pairs is positive. PTS: 1 REF: 693-694 BLM: Higher Order - Analyze

TOP: 8

88. Refer to Standardized College Entrance Exam Narrative. Describe why the test statistic is called the rank correlation coefficient. What is the value of $r_s$ in this problem?

ANS: The test statistic $r_s$ is called the rank correlation coefficient simply because it is the usual correlation coefficient applied to ranks. With no ties, $\sum d_i^2$ = 12 and $r_s = 1 - \frac{6(12)}{6(35)}$ = 0.657. PTS: 1 REF: 690-694 BLM: Higher Order - Apply

TOP: 8

89. Refer to Standardized College Entrance Exam Narrative. Do the data present sufficient evidence to indicate a positive correlation in the rankings of the two educators? What does this result mean in the context of the problem? ANS: No; since $r_s$ < 0.829, we fail to reject $H_0$. There is insufficient evidence to indicate a significant positive correlation between the educators’ rankings, so the data do not show that the two educators agree on the design rankings. One would hope for a positive correlation, indicating that the designs one educator thought were better, the other educator also thought were better. PTS: 1 REF: 690-694 BLM: Higher Order - Evaluate

TOP: 8

90. Refer to Standardized College Entrance Exam Narrative. What is the observed significance level for this test? ANS: p-value > 0.05 PTS: 1 REF: 690-694 BLM: Higher Order - Analyze

TOP: 8

91. Refer to Standardized College Entrance Exam Narrative. Find the rejection region for $\alpha$ = 0.05. NAR: Standardized College Entrance Exam Narrative ANS: Reject $H_0$ if $r_s$ > 0.829.

PTS: 1 REF: 690-694 BLM: Higher Order - Analyze

TOP: 8

92. A professor was interested in the relationship between a student’s rank on an oral exam and the student’s rank on a written exam. The professor selected eight students at random and ranked their scores for both the oral exam and the written exam. The following data were recorded: Student 1 2 3 4 5 6 7 8

Oral Exam 1 3 2 4 8 7 6 5

Written Exam 2 3 1 5 7 8 4 6

Find and interpret the rank correlation between a student’s rank on the oral exam and the student’s rank on the written exam. ANS: Let $x_i$ be the ranks on the oral exam and $y_i$ be the ranks on the written exam for i = 1,2,…,8. Since $\sum x_i = \sum y_i = 36$, $\sum x_i^2 = \sum y_i^2 = 204$, and $\sum x_i y_i = 199$, then

$S_{xy}$ = 199 – (36)(36)/8 = 37

$S_{xx}$ = 204 – ($36^2$/8) = 42

$S_{yy}$ = 204 – ($36^2$/8) = 42

Hence, the Spearman’s rank correlation coefficient is $r_s = S_{xy} / \sqrt{S_{xx} S_{yy}}$ = 37/42 = 0.881. There is a reasonably strong relationship between the student’s rank on the oral exam and the student’s rank on the written exam. PTS: 1 REF: 690-694 BLM: Higher Order - Evaluate

TOP: 8
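The computation in the preceding answer treats Spearman’s coefficient as the ordinary correlation coefficient applied to ranks. A minimal Python sketch of that calculation, using the oral and written exam ranks from the question, is given below; the function name is hypothetical.

```python
import math


def spearman_from_ranks(x, y):
    """r_s = S_xy / sqrt(S_xx * S_yy), computed directly on the two rank lists."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy, s_xx, s_yy, s_xy / math.sqrt(s_xx * s_yy)


oral = [1, 3, 2, 4, 8, 7, 6, 5]
written = [2, 3, 1, 5, 7, 8, 4, 6]
s_xy, s_xx, s_yy, r_s = spearman_from_ranks(oral, written)
print(s_xy, s_xx, s_yy, round(r_s, 3))
```

This form of the formula remains valid when ranks are tied, unlike the shortcut based on squared rank differences.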

93. The following paired observations were obtained on two variables x and y:

x: 1.3, 0.9, 2.2, 3.6, 2.8, 1.6
y: 1.1, 1.4, 0.2, –0.7, –0.1, 0.7

a. Calculate Spearman’s rank correlation coefficient $r_s$. b. Do the data present sufficient evidence to indicate a correlation between x and y? Justify your response. Test using $\alpha$ = 0.05.

ANS: a. To calculate the Spearman’s rank correlation coefficient, the data are ranked separately according to the variables x and y.

Rank x: 2, 1, 4, 6, 5, 3
Rank y: 5, 6, 3, 1, 2, 4

Since there were no tied observations, the simpler formula for $r_s$ is used, and $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} = 1 - \frac{6(70)}{6(35)}$ = –1.

b. The hypotheses of interest are $H_0$: There is no association between the rank pairs vs. $H_a$: There is an association between the rank pairs. To test for correlation with $\alpha$ = 0.05, index 0.025 in the table of critical values for Spearman’s rank correlation coefficient. The rejection region is $|r_s| \geq$ 0.886. Since $|r_s|$ = 1 > 0.886, we reject $H_0$ and conclude that there is a correlation between x and y.

PTS: 1 REF: 690-694 BLM: Higher Order - Evaluate

TOP: 8
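For untied data such as the x and y values in the preceding question, the ranking step and the shortcut formula based on squared rank differences can be sketched in Python as follows. The sketch assumes no ties in either variable, which holds here; the helper names are hypothetical.

```python
def ranks_no_ties(values):
    """Rank from 1 (smallest) to n; valid only when all values are distinct."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks


def spearman_shortcut(x, y):
    """r_s = 1 - 6 * sum(d_i^2) / (n(n^2 - 1)), with d_i the rank differences."""
    rx, ry = ranks_no_ties(x), ranks_no_ties(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return rx, ry, d_squared, 1 - 6 * d_squared / (n * (n * n - 1))


x = [1.3, 0.9, 2.2, 3.6, 2.8, 1.6]
y = [1.1, 1.4, 0.2, -0.7, -0.1, 0.7]
rank_x, rank_y, d_squared, r_s = spearman_shortcut(x, y)
print(rank_x, rank_y, d_squared, r_s)
```

A value of exactly –1 means the two rankings are perfectly reversed: the largest x is paired with the smallest y, and so on down the list.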

Teachers and Students’ IQ Narrative. A school principal suspected that a teacher’s attitude toward a Grade 1 student depended on his original judgment of the child’s ability. The principal also suspected that much of that judgment was based on the Grade 1 student’s IQ score, which was usually known to the teacher. After three weeks of teaching, a teacher was asked to rank the nine children in his class from 1 (highest) to 9 (lowest) as to his opinion of their ability. Teacher Rank IQ Rank

94. Refer to Teachers and Students’ IQ Narrative. Calculate $r_s$ for these teacher–IQ ranks.

ANS: The calculation of $r_s$ involves the ranks of the two variables being compared. The data given in the table above have already been ranked and may be substituted into the simpler formula since no ties exist in either the x or y rankings. $r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$ = 0.8333. PTS: 1 REF: 690-694 BLM: Higher Order - Analyze

TOP: 8

95. Refer to Teachers and Students’ IQ Narrative. Do the data provide sufficient evidence to indicate a positive correlation between the teacher’s ranks and the ranks of the IQs? Justify your answer. Use $\alpha$ = 0.05. ANS: The hypotheses of interest are $H_0$: There is no association between the rank pairs vs. $H_a$: The correlation between the rank pairs is positive. To test for positive correlation with $\alpha$ = 0.05, index 0.05 in the table of critical values for Spearman’s rank correlation coefficient. The rejection region is $r_s \geq$ 0.60. Since $r_s$ = 0.8333 > 0.60, we reject the null hypothesis of no association and conclude that a positive correlation exists between the teacher’s ranks and the ranks of the IQs. PTS:

REF: 690-694

TOP: 8

BLM: Higher Order - Evaluate