Introductory Statistics Exploring the World Through Data 1st Edition by Robert N. Gould TEST BANK by ACADEMIAMILL

Chapter 1 Test A—Multiple Choice

Section 1.1 (Defining and characterizing data)

1. [Objective: Understanding data] Which of the following is not an example of data? a. A list of receipt totals from one day at a national department store. b. A list of the number of cars that are stopped at a stop sign at a busy intersection. The number of cars waiting is recorded every 15 minutes for 8 hours. c. A chart showing the number of goals scored per game for an NHL hockey team during one whole season. d. All of the above are examples of data.

2. [Objective: Understanding variation] Which of the following measurements is likely to have the least variation? a. The individual weights in ounces of oranges in a randomly selected five-pound bag of oranges at the market. b. The individual mass measured in grams of quarters in a randomly selected ten-dollar roll of quarters. c. The individual heights of children, measured in inches, in a randomly selected class of sixth grade students.

3. [Objective: Understanding the language of statistics] Choose the best answer to complete the statement: “In a statistical context, the term variable is used…” a. because it is too difficult to get certain information from each member of a population.” b. because there is variability in the information gathered from the members of a sample or population.” c. because, like algebra, a statistical variable stands in for some unknown numerical value.” d. because it describes a characteristic of the population which is never known.”

Section 1.2 (Classifying and storing data)

4. [Objective: Distinguish between numerical and categorical variables] The average gas mileage of the top selling mini-vans for each U.S. car manufacturer is an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

5. [Objective: Distinguish between numerical and categorical variables] A state senator’s comments about the dangers of global warming are an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

6. [Objective: Distinguish between numerical and categorical variables] Marital status of each member of a randomly selected group of adults is an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

7. [Objective: Display meaning of coded categorical data] In a survey, married couples were asked, “Do you have children?” The response was electronically recorded as a “1” for yes and a “0” for no. This is an example of ____________. a. Coded categorical data b. Unstacked numerical data c. Random sample d. None of the above

8. [Objective: Identify a population from a sample] Researchers want to find out which U.S. movie has the most positive audience reaction for the current week. As they exited a randomly selected movie theater, movie-goers were asked to give the movie they had just viewed a letter grade of A, B, C, D, or F. In this scenario, the movie-goers are an example of a _____. a. Sample b. Population c. Variable

Section 1.3 (Organizing Categorical Data)

9. [Objective: Choose the appropriate tool to organize categorical data] A two-way table is useful for summarizing and comparing what? a. A numerical variable and categorical variable that may be related. b. Two numerical variables that may be related. c. Two categorical variables that may be related.

Consider the following for questions (10) and (11): In a study of 900 adults, 45 out of the 325 men in the study said that they preferred to rent a movie on DVD rather than going out to a movie theater.

10. [Objective: Understand calculations involving percentages or rates] What is the approximate percentage of men in this study who prefer to rent a movie on DVD? a. 13.8% b. 36% c. 5%

11. [Objective: Understand calculations involving percentages or rates] What is the approximate percentage of women who participated in this study? a. 41% b. 63.9% c. 7.8% d. Not enough information available


Consider the following table for questions (12) and (13): The two-way table below shows teenage driver gender and whether or not the respondent had texted at least once while driving during the last thirty days.

[Two-way table. Rows: Teenage driver, male; Teenage driver, female. Columns: Texted at least once while driving during past 30 days; Had not texted at least once while driving during the past 30 days. The cell counts are missing from the source.]

12. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] What percentage of the sample had texted at least once while driving in the past thirty days? a. 62.5% b. 37.5% c. 50% d. 43.75%

13. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] What percentage of the sample were female drivers? a. 62.5% b. 50% c. 78% d. 28.3%

14. [Objective: Show understanding of percentages] In a sample of 800 first-year college students, 72% said that they check their Facebook page at least three times a day. How many students is this? a. 72 b. 576 c. 224 d. Not enough information available.

Section 1.4 (Collecting Data to Understand Causality)

For (15)-(17), indicate whether the study described is an observational study or a controlled experiment.

15. [Objective: Distinguish between an observational study and a controlled experiment] The obesity rates of elementary age children living in urban areas are compared to those living in rural areas to see whether children in urban settings have higher obesity rates. a. Observational study b. Controlled experiment

16. [Objective: Distinguish between an observational study and a controlled experiment] “People with diabetes are at higher risk for certain cancers than those without the blood sugar disease, suggests a new study based on a telephone survey of nearly 400,000 adults.” a. Observational study b. Controlled experiment

17. [Objective: Distinguish between an observational study and a controlled experiment] A group of students is divided into two groups. One group is given a new chewable vitamin and the other group is given a placebo. After six months they are asked to fill out a questionnaire and given a health exam to see whether the new vitamin has health benefits that are better than a placebo. a. Observational study b. Controlled experiment

18. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] Consider the following statement: “Babies who breastfeed are less likely to grow into children with behavioral problems by the time they reach age 5 than those who receive formula milk.” Which of the following is a plausible confounding variable in this study? a. The quality of the formula milk b. Mother’s socioeconomic status c. The age at which breastfeeding ends d. All of the above e. None of the above

19. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] Consider the following statement: “Researchers conducted a large observational study and determined that children who participated in school music programs scored higher on math exams in later grades than those who did not.” Suppose that upon hearing this a politician states that all children should participate in school music programs. What is wrong with the politician’s statement? a. There was a placebo effect. b. This study exhibits bias. c. The controlled experiment was not double-blinded. d. The politician confused correlation with causation.

20. [Objective: Understand difference between treatment and outcome variables] A group of adults was given a new high protein breakfast bar then asked to record their level of alertness just before lunchtime. In this example, _____________ is the treatment variable and _____________ is the outcome variable. a. Alertness level; the breakfast bar b. The breakfast bar; alertness level c. The group of adults; alertness level d. The breakfast bar; lunchtime


Chapter 1 Test A—Answer Key 1. D 2. B 3. B 4. A 5. C 6. B 7. A 8. A 9. C 10. A 11. B 12. B 13. B 14. B 15. A 16. A 17. B 18. D 19. D 20. B
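The arithmetic behind answers 10, 11, and 14 can be checked with a short Python sketch. The helper `percent` and the variable names are illustrative assumptions, not part of the test bank; the counts come from the question stems, and question 11 is read as assuming every participant who is not a man is a woman.

```python
def percent(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Questions 10 and 11: 900 adults in the study, 325 of them men,
# and 45 of those men prefer renting a DVD.
total_adults = 900
men = 325
men_prefer_dvd = 45

# Question 10 asks about men only, so the denominator is the 325 men.
pct_men_prefer_dvd = percent(men_prefer_dvd, men)

# Question 11: the women are taken to be the participants who are not men.
women = total_adults - men
pct_women = percent(women, total_adults)

# Question 14: 72% of a sample of 800 students.
students_checking = round(0.72 * 800)

print(pct_men_prefer_dvd)  # 13.8
print(pct_women)           # 63.9
print(students_checking)   # 576
```

The key step in question 10 is the choice of denominator: dividing by all 900 adults instead of the 325 men gives 5%, which is one of the distractors.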


Chapter 1 Test B—Multiple Choice

Section 1.1 (Defining and characterizing data)

1. [Objective: Understanding variation] Which of the following measurements is likely to have the most variation? a. The individual weights in ounces of tennis balls in a randomly selected can of tennis balls. b. The volume of individual pop cans measured in fluid ounces from a randomly selected twenty-four pack. c. The individual weights in ounces of potatoes in a randomly selected crate of potatoes.

2. [Objective: Understanding data] Which of the following is not an example of data? a. A list of the length of the 25 most popular pop songs of the year. b. A chart showing the number of mailboxes per city block for a map of 10 city blocks. c. A list showing the amount of money collected at an annual charity raffle for the past 10 years. d. All of the above are examples of data.

3. [Objective: Understanding the importance of context] Suppose you are presented with a dataset containing the ages of 50 randomly selected adults and the amount of money they spent at a certain grocery store on a certain day. Which of the following questions does not need to be asked about the dataset? a. Who collected the data? b. How was the data collected? c. Why was the data collected? d. Were the names of respondents recorded so that they could be contacted later if necessary?

Section 1.2 (Classifying and storing data)

4. [Objective: Distinguish between numerical and categorical variables] The ethnicity of the individual respondents in a political poll of a randomly selected group of adults is an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

5. [Objective: Distinguish between numerical and categorical variables] The average number of hours spent completing statistics homework for a randomly selected group of statistics students is an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

6. [Objective: Distinguish between numerical and categorical variables] The number of parents who attended parent teacher conferences at a local elementary school is an example of what type of variable? a. Numerical variable b. Categorical variable c. Neither

7. [Objective: Display meaning of coded categorical data] In a survey, high school graduates were asked “Did you play sports in high school?” The response was electronically recorded as a “1” for yes and a “0” for no. This is an example of ____________. a. Random sample b. Unstacked numerical data c. Coded categorical data d. None of the above

8. [Objective: Identify a population from a sample] The deacons at a local church surveyed the congregation to find out if members would be willing to fund a new construction project. In this example, what is the population of interest? a. The deacons b. The congregation c. The survey respondents d. None of the above

Section 1.3 (Organizing Categorical Data)

9. [Objective: Choose the appropriate tool to organize categorical data] The gender of a sample of adults was recorded, and then they were asked whether they had used a postage stamp in the last thirty days. A good way to organize this categorical data is in a(n) ____________. a. Unstacked data table. b. Two-way table. c. Scatterplot. d. None of the above

Consider the following for questions (10) and (11): In a study of 1050 adults, 175 out of the 650 women in the study said that they preferred driving an SUV to driving a compact car.

10. [Objective: Understand calculations involving percentages or rates] What is the approximate percentage of women in this study who said that they prefer driving an SUV to driving a compact car? a. 61.9% b. 16.7% c. 26.9%

11. [Objective: Understand calculations involving percentages or rates] What is the approximate percentage of study participants who are women? a. 61.9% b. 16.7% c. 26.9% d. Not enough information available


Consider the following table for questions (12) and (13): The two-way table below shows the survey results when sixty adults were asked whether they had made a clothing purchase in the last thirty days.

[Two-way table. Rows: Male; Female. Columns: Purchased clothing in the last thirty days; Had not purchased clothing in the last thirty days. The cell counts are missing from the source.]

12. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] What percentage of the sample had not made a clothing purchase in the past thirty days? a. 35% b. 50% c. 33% d. 65%

13. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] Of the adult males surveyed, what percentage had made a clothing purchase in the last thirty days? a. 35% b. 50% c. 33% d. 65%

14. [Objective: Show understanding of percentages] In a sample of 775 senior citizens, approximately 67% said that they had seen a television commercial for life insurance. About how many senior citizens is this? a. 256 b. 67 c. 519 d. Not enough information available.

Section 1.4 (Collecting Data to Understand Causality)

For (15)-(17), indicate whether the study described is an observational study or a controlled experiment.

15. [Objective: Distinguish between an observational study and a controlled experiment] The smoking rates of teens in urban areas are compared to those living in rural areas to see whether teens living in rural settings have higher rates of smoking. a. Observational study b. Controlled experiment

16. [Objective: Distinguish between an observational study and a controlled experiment] A group of cancer patients is divided into two groups. One group is given a new drug to fight the side effects of chemotherapy and the other group is given a placebo. After three months they are asked to respond to a questionnaire about the frequency and severity of their side effects to see whether the new drug improved the overall negative side effects of chemotherapy. a. Observational study b. Controlled experiment

17. [Objective: Distinguish between an observational study and a controlled experiment] A group of students is divided into two groups. One group listens to classical music while taking a math test and the other group takes the test in silence. The average test scores of the two groups are compared to see whether listening to music during a math test has an effect on scores. a. Observational study b. Controlled experiment

18. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] Consider the following statement: “In a nationwide study, children on an all-organic diet are more alert in school than those not on an all-organic diet.” Which of the following is a plausible confounding variable in this study? a. The quality of the non-organic diet b. Parents’ socioeconomic status c. School start times d. All of the above e. None of the above

19. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] Consider the following statement: “My child was bullied on the school bus and so was my neighbor’s child, so obviously, bullying is a big problem on school buses and something needs to be done about it!” What is wrong with this statement? a. The statement exhibits bias. b. The statement is anecdotal. c. The person making the statement confused correlation with causation. d. None of the above—the statement is valid.

20. [Objective: Understand difference between treatment and outcome variables] A group of college students was given a new energy drink then asked to record their level of alertness at midday. In this example, _____________ is the treatment variable and _____________ is the outcome variable. a. The energy drink; midday b. The energy drink; alertness level c. The group of college students; alertness level d. Alertness level; the energy drink

Chapter 1 Test B—Answer Key 1. C 2. D 3. D 4. B 5. A 6. A 7. C 8. B 9. B 10. C 11. A 12. A 13. B 14. C 15. A 16. B 17. B 18. D 19. B 20. B
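For questions 10, 11, and 14, the following Python sketch shows how the choice of denominator separates the keyed answers from the distractors. The variable names are illustrative assumptions; the counts are taken from the question stems.

```python
# Counts from the stem of questions 10 and 11.
total_adults = 1050
women = 650
women_prefer_suv = 175

# Question 10: share of the women who prefer an SUV (denominator: the 650 women).
pct_of_women = round(100 * women_prefer_suv / women, 1)

# A tempting wrong denominator: all 1050 participants.
pct_of_everyone = round(100 * women_prefer_suv / total_adults, 1)

# Question 11: share of all participants who are women.
pct_women = round(100 * women / total_adults, 1)

# Question 14: about 67% of 775 senior citizens.
seniors = round(0.67 * 775)

print(pct_of_women)     # 26.9
print(pct_of_everyone)  # 16.7
print(pct_women)        # 61.9
print(seniors)          # 519
```

All three percentage options in question 10 appear here, which is why reading the question for its reference group matters more than the division itself.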


Chapter 1 Test C—Short Answer

Provide an appropriate response.

Section 1.1 (Defining and characterizing data)

1. [Objective: Understanding variation] A sticker on a new car advertises that it gets 34 miles per gallon, but cautions that results may vary. Explain what variation means in this context.

2. [Objective: Understanding data] Consider the statement “Data are numbers in context.” Consider a randomly selected group of newborn babies born in a large city. Describe one possible type of numerical data that could be reported about this group that might be of interest to obstetricians.

Section 1.2 (Classifying and storing data)

3. [Objective: Identify a population from a sample] Identify the sample and the population it is most likely intended to represent: The manager at a bicycle store asks 30 customers if they would be interested in participating in a weekly group ride.

For questions (4) and (5): The following is reported in a local newspaper: “The majority of Michigan voters want to keep the individual pricing law. A poll of 600 voters showed 51% wanting to keep the law and 39% supporting its repeal.”

4. [Objective: Identify a population from a sample] Identify the sample and the population it is most likely intended to represent.

5. [Objective: Understanding the importance of context] If possible, answer the following questions about the context of this study: a. What are the objects of interest? b. What variables are being measured? c. How were the variables measured? d. Who collected the data? e. How did they collect the data?

6. [Objective: Distinguish between numerical and categorical variables] Explain what a numerical variable is and give an example of a numerical variable.

7. [Objective: Distinguish between numerical and categorical variables] Explain what a categorical variable is and give an example of a categorical variable.


Section 1.3 (Organizing Categorical Data)

Use the two-way table below to answer questions (8)-(11). A sample of 984 voters was asked in a survey, “Do you support your state’s helmet law for motorcycles?” The results are summarized in a two-way table:

          Support the helmet law    Oppose the helmet law
Male      214                       185
Female    380                       205

8. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] How many men are in this sample? What percent of the sample are men? Round to the nearest whole percent.

9. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] Among the women surveyed, what percent support the helmet law? Round to the nearest whole percent.

10. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] Based on this sample, are women more or less likely to oppose the helmet law than men?

11. [Objective: Perform percentage or rate calculations given data that is organized in a two-way table] Overall, what percentage of the sample supports the helmet law? Round to the nearest whole percent.


Answer questions (12)-(14) using the following table showing fatal injury counts by four major categories from 2004-2009 for Michigan Residents. The table shows total counts for all ages and both sexes.

Fatal injuries                2004    2005    2006    2007    2008    2009
Unintentional Injuries       3,244   3,353   3,496   3,655   3,624   3,589
Self-inflicted/Suicide       1,096   1,103   1,132   1,123   1,173   1,164
Assault/Homicide               672     673     720     702     642     658
All other fatal injuries       466     404     414     446     446     506
All fatal Injuries           5,478   5,533   5,762   5,926   5,885   5,917

12. [Objective: Distinguish between an observational study and a controlled experiment] Was this data collected using an observational study or a controlled experiment?

13. [Objective: Understand calculations involving percentages or rates] What is the rate of fatal injuries caused by Assault/Homicide in 2008?

14. [Objective: Understand calculations involving percentages or rates] Write a statement comparing the rate of fatal injuries that were self-inflicted or suicide from 2005 to 2006. Why is it important to compare rates and not total numbers?

Section 1.4 (Collecting Data to Understand Causality)

15. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] Explain the difference between an observational study and an anecdote. Be sure to provide an example of each.

16. [Objective: Show ability to judge reliability and validity of conclusions of a statistical study] Name two of the four key features of a well-designed controlled experiment. When is it important to have a well-designed controlled experiment?

For questions (17) and (18), use the following description of a controlled experiment: Two similar kindergarten classrooms, each containing 20-25 students, have agreed to participate in a study to see if incorporating math manipulatives into the teacher’s lesson improves understanding of new concepts in geometry, like recognizing shapes and similarities in shapes. The teacher in one classroom gives a standard lesson without the use of manipulatives then gives a three-question quiz about the new geometry concepts that were taught. The teacher in the second classroom gives a geometry lesson that incorporates the use of math manipulatives. The same three-question quiz is given and the scores of the two classrooms are compared to see whether the use of math manipulatives improved quiz scores.

17. [Objective: Understand difference between treatment and outcome variables] In this study, what is the treatment variable and the response variable?

18. [Objective: Show ability to evaluate the characteristics of a controlled experiment] Which features of a well-designed controlled experiment does this study have? Which features are missing?

19. [Objective: Understand the correlation vs. causation error and other common statistical argument errors] In a study of National Hockey League statistics, the number of Stanley Cup wins per team and the number of fights on the ice during Stanley Cup play-off games were compared. It is found that there is a positive correlation between the number of fights on the ice of an NHL team and the number of Stanley Cup wins for that team. That is, NHL teams with more fights on the ice tend to have more Stanley Cups. A sportscaster makes the following statement: “If a team wants to win a Stanley Cup the players should start as many fights on the ice as possible.” What is wrong with the sportscaster’s statement?

20. [Objective: Show ability to evaluate statistical studies in the news] Time Magazine online posted an article that claims that a long work commute (45 minutes or more) is harmful to overall well-being and can contribute to obesity, stress, and loneliness (www.healthland.time.com/2011). Is this more likely to be an observational study or a controlled experiment? Why? Can the reader conclude that taking a job with a long commute will result in poorer health?


Chapter 1 Test C—Answer Key

1. Not every owner of this type of car will get exactly 34 miles per gallon of gas. Different drivers in different driving conditions will get different fuel economy.

2. Answers may vary, but likely answers would be weight or length.

3. The sample is the 30 customers questioned and the population is all bicycle shop customers.

4. The sample is the 600 voters who participated in the poll and the population is all Michigan voters.

5. a. Michigan voters b. Opposition to the individual pricing law c. Participants were asked a question and the response was recorded d. Unknown from information given e. By poll, but it is unknown how the poll was conducted.

6. Numerical variables describe quantities of the objects of interest and will be numbers. Examples will vary.

7. Categorical variables describe qualities of the objects of interest and will be categories. Examples will vary.

8. 399 men, 41%

9. 65% (380/585)

10. Women are less likely to oppose the helmet law than men. 46% of the men in the study opposed the helmet law compared to only 35% of the women.

11. 60% (594/984)

12. Observational study

13. About 11 out of 100 (642/5885)

14. From 2005 to 2006 the rate decreased from about 199 out of 1000 to 196 out of 1000. The rate shows a small decrease even though numbers increased from 2005 to 2006. The rate is more meaningful because it shows the amount relative to the total.

15. In an observational study, subjects are placed into control or treatment groups by their own actions, decisions, or the decisions of someone else. The conclusions of an observational study are based on the unbiased observations made from a large randomly selected group. An anecdote is a story or example from one person’s experience used to make a general conclusion. Examples will vary.

16. A well-designed controlled experiment is important when researchers want to answer questions about causality. The four key features of a well-designed experiment are (1) large sample size; (2) subjects assigned to treatment and control groups at random; (3) the study is double-blind; and (4) the study should use a placebo.

17. The treatment variable is whether manipulatives were used and the response variable is the quiz scores.

18. Various, but students should address the four attributes of a well-designed controlled study: sample size, randomization, double-blind format, and placebo effect. This study is missing randomization and double-blinding.

19. The sportscaster made the faulty assumption that correlation implies causation.

20. This is most likely an observational study because participants were not randomly assigned to a long work commute. Health data was probably gathered from people who chose to have long work commutes. The conclusion that long work commutes cause poorer health cannot be made since this is an observational study.
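The numerical answers 8 through 11, 13, and 14 can be reproduced from the two tables with a short Python sketch. The dictionary layout and the helper `pct` are illustrative assumptions; the counts are copied from the helmet-law table and the fatal-injury table in the questions.

```python
# Counts from the helmet-law two-way table (questions 8-11).
table = {
    "Male":   {"support": 214, "oppose": 185},
    "Female": {"support": 380, "oppose": 205},
}

def pct(part, whole):
    """Percentage rounded to the nearest whole percent."""
    return round(100 * part / whole)

row_totals = {sex: sum(cells.values()) for sex, cells in table.items()}
grand_total = sum(row_totals.values())

men = row_totals["Male"]                                    # question 8
pct_men = pct(men, grand_total)                             # question 8
pct_women_support = pct(table["Female"]["support"],
                        row_totals["Female"])               # question 9
pct_oppose = {sex: pct(table[sex]["oppose"], row_totals[sex])
              for sex in table}                             # question 10
total_support = sum(cells["support"] for cells in table.values())
pct_support = pct(total_support, grand_total)               # question 11

# Rates per 1000 fatal injuries (questions 13 and 14), using the
# category count divided by the "All fatal Injuries" total for that year.
assault_2008 = round(1000 * 642 / 5885)
suicide_2005 = round(1000 * 1103 / 5533)
suicide_2006 = round(1000 * 1132 / 5762)

print(men, pct_men, pct_women_support, pct_oppose, pct_support)
print(assault_2008, suicide_2005, suicide_2006)
```

Question 14 is the instructive case: the suicide count rises from 1,103 to 1,132 while the rate per 1000 falls, because total fatal injuries grew faster.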

Chapter 2 Test A—Multiple Choice

Section 2.1 (Visualizing Variation in Numerical Data)

1. [Objective: Interpret visual displays of numerical data] Each day for twenty days a record store owner counts the number of customers who purchase an album by a certain artist. The data and a dotplot of the data are shown below:

Data set: 1, 3, 4, 4, 5, 6, 7, 2, 3, 4, 4, 5, 6, 8, 2, 3, 4, 5, 6, 7, 9

[Dotplot not recoverable from the source.]

Which of the following statements can be made using the given information? a. On the first day of collecting data the record store owner had one person purchase an album by the artist. b. The dotplot shows that this data has a roughly bell-shaped distribution. c. During the twenty days when the record store owner collected data, there were some days when no one purchased an album by the artist. d. None of the above

A fitness instructor measured the heart rates of the participants in a yoga class at the conclusion of the class. The data is summarized in the histogram below. There were fifteen people who participated in the class between the ages of 25 and 45. Use the histogram to answer questions (2) and (3).

[Histogram: Frequency (vertical axis, 1 to 6) against Beats per minute (bpm) (horizontal axis, 90 to 160); bar heights not recoverable from the source.]

2. [Objective: Interpret visual displays of numerical data] How many participants had a heart rate between 120 and 130 bpm? a. 2 b. 4 c. 3 d. 5

3. [Objective: Interpret visual displays of numerical data] What percentage of the participants had a heart rate greater than 130 bpm? a. 13% b. 27% c. 33% d. 53%

4. [Objective: Interpret visual displays of numerical data] Find the original data set from the stemplot given below.

45 | 0015
46 | 223
47 | 08999

a. 450, 450, 451, 455, 462, 462, 463, 478, 479, 479
b. 450, 450, 451, 455, 462, 462, 463
c. 45, 41, 45, 46, 42, 42, 43, 47, 40, 48, 49, 49, 49
d. 450, 451, 455, 462, 463, 470, 478, 479

5. [Objective: Interpret visual displays of numerical data] A collection of twenty college students was asked how much cash they currently had in their possession. The data is summarized in the stemplot below. Typically, how much money does a student have in his or her possession?

0 | 5
1 | 022
2 | 2
3 | 00339
4 | 00023566
5 | 1
6 | 7

a. $60-$70 b. $10-$30 c. $30-$50 d. Not enough information available
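Reading a stemplot back into raw values is mechanical, and a minimal Python sketch makes the step explicit. It assumes the seven stems 0 through 6 pair in order with the seven leaf groups in the cash stemplot above, and that each stem counts tens of dollars and each leaf single dollars; the dictionary and variable names are illustrative.

```python
from statistics import median

# Stem -> string of leaves, paired in the order they appear in the stemplot.
stemplot = {0: "5", 1: "022", 2: "2", 3: "00339", 4: "00023566", 5: "1", 6: "7"}

# Each leaf is one observation: value = 10 * stem + leaf.
values = [10 * stem + int(leaf)
          for stem, leaves in stemplot.items()
          for leaf in leaves]

# How many observations fall on each stem, i.e. the length of each row.
counts_by_stem = {stem: len(leaves) for stem, leaves in stemplot.items()}

print(len(values))        # 20
print(counts_by_stem)
print(median(values))     # 39.5
```

Under this reading the plot accounts for all twenty students, which is a quick consistency check on the pairing of stems and leaves.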


Section 2.2 (Summarizing Important Features of a Numerical Distribution)

Match one of the following histograms with one of the descriptions in questions (6) – (8):

[Lettered histograms (vertical axis: Frequency) not recoverable from the source.]

6. [Objective: Recognize the shape of a distribution] The distribution of heights of adult males tends to be symmetrical which is displayed in histogram ________.

7. [Objective: Recognize the shape of a distribution] The distribution of the numbers of times individuals in the 18-24 age group log onto a social networking website during the course of a day tends to be right-skewed which is displayed in histogram ________.

8. [Objective: Recognize the shape of a distribution] The distribution of test scores for a group of adults on a written driving exam following a refresher course tends to be left-skewed which is displayed in histogram ________.

9. [Objective: Recognize the center of a distribution] The histogram below shows the distribution of pass rates on a swimming test of all children who completed a four-week summer swim course at the local YMCA. What is the typical pass rate for the swim test?

[Histogram: Frequency (vertical axis, 5 to 25) against Percentage that passed the swim test; bar heights and horizontal scale not recoverable from the source.]

a. About 75% b. About 55% c. About 95% d. Not enough information available

10. [Objective: Recognize the important features of a numerical distribution] The histogram below displays the distribution of the length of time on hold, for a collection of customers, calling a repair call center. Use the histogram to select the true statement.

[Histogram: Frequency (vertical axis, 5 to 20) against Length of time on hold in minutes (horizontal axis, 1 to 8); bar heights not recoverable from the source.]

a. The distribution is symmetrical. The number of callers who waited on hold for less than three minutes was the same as the number of callers who waited on hold for more than three minutes.
b. The distribution is left-skewed and most callers waited on hold at least three minutes.
c. The distribution shows that the data was highly variable with some callers waiting on hold as many as 20 minutes.
d. The distribution is right-skewed and most callers waited on hold less than three minutes.

11. [Objective: Recognize the important features of a numerical distribution] Based on the histogram in question (10), would it be unusual to be on hold for 5 minutes or more at this call center? a. Yes, it would be unusual. b. No, it would not be unusual. c. Not enough information given.

12. [Objective: Recognize the important features of a numerical distribution] The histogram shows the distribution of pitch speeds for a sample of 75 pitches for a college pitcher during one season. Which of the following statements best describes the distribution of the histogram below?

[Histogram: Frequency (vertical axis, 5 to 30) against Pitch Speed (mph) (horizontal axis, 80 to 105); bar heights not recoverable from the source.]

a. The distribution has a large amount of variation which can be seen by comparing the heights of the bars in the histogram.
b. The distribution is right-skewed and shows that most of the pitches were more than 90 mph.
c. The distribution is left-skewed and shows that most of the pitches were less than 95 mph.
d. The distribution is symmetric around a pitch speed of about 93 mph.

Copyright © 2013 Pearson Education, Inc.


13. [Objective: Recognize the important features of a numerical distribution] The histogram below is the distribution of heights for a randomly selected Boy Scout troop. Choose the statement that is true based on information from the histogram.

[Histogram: Frequency (vertical axis, 1 to 6) against Height (in feet) (horizontal axis, 2.0 to 5.5); bar heights not recoverable from the source.]

a. The gap between the two smallest values indicates an outlier may be present.
b. The smallest value is so extreme that it is possible that a mistake was made in recording the data.
c. Although the smallest value does not fit the pattern, it should not be altogether disregarded. It is possible that the Boy Scout is 2.4 feet tall.
d. All of the above are true statements

Section 2.3 (Visualizing Variation in Categorical Variables)

14. [Objective: Interpret visual displays of categorical data] A group of junior high athletes was asked what team sport was their favorite. The data are summarized in the table below. On the pie chart, which area would correspond to the category “Soccer”?

Team Sport    Frequency
Soccer        12
Volleyball    28
Basketball    20
Football      20

[Pie chart: Favorite Team Sport; labeled areas include A and D]
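One way to reason about a pie chart question like this is to convert each frequency into a share of the whole. The sketch below uses only the four frequencies from the table; the function name `pie_slices` is illustrative and not part of the test bank.

```python
# Frequencies copied from the Favorite Team Sport table.
frequencies = {"Soccer": 12, "Volleyball": 28, "Basketball": 20, "Football": 20}


def pie_slices(freqs):
    """Return {category: (percent of total, central angle in degrees)}."""
    total = sum(freqs.values())
    return {
        name: (100 * count / total, 360 * count / total)
        for name, count in freqs.items()
    }


slices = pie_slices(frequencies)
for name, (percent, angle) in slices.items():
    print(f"{name}: {percent:.1f}% of responses, {angle:.0f} degree slice")
```

The category with the smallest frequency gets the smallest slice, and categories with equal frequencies get equal slices.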

15. [Objective: Recognize the important features of a categorical bar graph] Which of the following statements about bar graphs is true? a. It sometimes doesn’t matter in which order you place the bars representing different categories. b. It is appropriate to have gaps between the bars on the graph. c. On a bar graph, the width of the bars has no meaning. d. All of the above are true for bar graphs.

Section 2.4 (Summarizing Categorical Distributions)

The following side-by-side bar graph shows the level of post-secondary education achieved ten years after high school for graduates from the years 1999 and 2001. Use the bar graph to answer questions (16) and (17).

[Side-by-side bar graph: Percentage by level of post-secondary education, 1999 and 2001 graduates]

16. [Objective: Recognize the important features of a categorical bar graph] What was the most common response for 1999? a. No College b. Some College c. Graduated College, Associate’s Degree d. Graduated College, Bachelor’s Degree

17. [Objective: Summarize categorical distributions] All of the juniors and seniors at a college are asked their major. Which of the following graph types would be appropriate for displaying the variability in majors for this data set? a. Bar graph b. Histogram c. Stemplot d. None of the above

18. [Objective: Recognize the important features of a numerical distribution] Data was collected on hand grip strength of adults. The histogram below summarizes the data. Which statement is true about the distribution of the data shown in the graph?

[Histogram: Frequency versus Grip Strength (pounds)]

a. The graph is useless because it is bimodal. b. The best estimate of typical grip strength is 80-90 pounds because it is in the center of the distribution. c. There must have been a mistake made in data collection because the distribution should be bell-shaped. d. The graph shows evidence that two different groups may have been combined into one collection.

Section 2.5 (Interpreting Graphs)

19. [Objective: Analyze statistical graphs] The graph below displays the number of applications for a concealed weapons permit in Montcalm County, Michigan, for each of three years. A reporter interprets this graph to mean that applications in 2010 are more than twice the level in 2008. Is the reporter making a correct interpretation?

[Bar graph: Permits requested by Year (2008, 2009, 2010); values shown on the graph: 30, 50, 55]

a. No. Although the 2010 bar is more than twice the height of the 2008 bar, the bars do not begin at 0 applications, so the graph does not correctly represent the data. Fifty-five is not equal to two times the number of applications made in 2008. b. No. The width of the bars is identical, indicating that the number of applications in 2010 is no different from 2008. c. Yes. The bar for 2010 is twice the height of the bar for 2008 and the number of applications indicated above the bars shows that applications in 2010 are more than twice the level in 2008.

20. [Objective: Analyze statistical graphs] The following graphic was used to visually summarize the following statement made by Supertuf Bicycle Tire Company in a recent magazine advertisement: “Our patented Supertuf bicycle tire design lasts twice as long as the leading competitor’s tire design.” Does the graphic correctly represent the statement made in the advertisement?

[Graphic titled “Supertuf Bicycle Tires Last Twice as Long!” showing two tires labeled “Supertuf Tires” and “Leading Competitor”]

a. Yes, the area of the first tire is twice the area of the second tire. b. No, although the dimensions have doubled, the area of the first tire is more than twice the area of the second tire so the graphic incorrectly represents what is stated in the advertisement. c. Not enough information available to make a judgment. More information is needed about how long Supertuf tires last and how long the leading competitor’s tires last.

Chapter 2 Test A—Answer Key

1. B 2. C 3. D 4. A 5. C 6. B 7. C 8. A 9. A 10. D 11. A 12. D 13. D 14. A 15. D 16. B 17. A 18. D 19. A 20. B

Chapter 2 Test B—Multiple Choice

Section 2.1 (Visualizing Variation in Numerical Data)

1. [Objective: Interpret visual displays of numerical data] For twenty days a record store owner counts the number of customers who purchase an album by a certain artist. The data and a dotplot of the data are shown below:

Data set: 1, 3, 4, 4, 5, 6, 7, 2, 3, 4, 4, 5, 6, 8, 2, 3, 4, 5, 6, 7, 9

Which of the following statements can be made using the given information? a. On five of the twenty days observed by the record store owner, there were four albums by the artist purchased. b. During the twenty days when the record store owner collected data, at least one album by the artist was purchased each day. c. The dotplot shows that this data has a roughly bell-shaped distribution. d. All of the above

[Histogram: Frequency versus Beats per minute (bpm)]

2. [Objective: Interpret visual displays of numerical data] How many participants had a heart rate between 140 and 150 bpm? a. 2 b. 4 c. 3 d. 5

3. [Objective: Interpret visual displays of numerical data] What is the approximate percentage of participants who had a heart rate less than 130 bpm? a. 13% b. 47% c. 33% d. 53%

4. [Objective: Interpret visual displays of numerical data] Find the original data set from the stemplot given below.

45 | 0015
46 | 223
47 | 08999

a. 450, 450, 451, 455, 462, 462, 463
b. 450, 450, 451, 455, 462, 462, 463, 470, 478, 479, 479, 479
c. 45, 41, 45, 46, 42, 42, 43, 47, 40, 48, 49, 49, 49
d. 450, 451, 455, 462, 463, 470, 478, 479
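Reading a stemplot back into raw data is mechanical: each leaf is one final digit appended to its stem, and repeated leaves are repeated observations. A minimal sketch, assuming the stems and leaves shown for question 4; the helper name `expand_stemplot` is illustrative.

```python
def expand_stemplot(rows):
    """Turn (stem, leaves) pairs into the list of values they encode.

    Each character in `leaves` is a single final digit appended to the stem.
    """
    values = []
    for stem, leaves in rows:
        for leaf in leaves:
            values.append(int(f"{stem}{leaf}"))
    return values


# Stems and leaves copied from the stemplot in question 4.
stemplot = [(45, "0015"), (46, "223"), (47, "08999")]
values = expand_stemplot(stemplot)
print(values)
print(len(values), "observations")
```

Every leaf produces exactly one observation, so the recovered data set must have as many values as there are leaves.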

5. [Objective: Interpret visual displays of numerical data] A collection of twenty college students was asked how much cash they currently had in their possession. How many students had between forty and fifty dollars in their possession?

0 | 5
1 | 022
2 | 2
3 | 00339
4 | 00023566
5 | 1
6 | 7

a. 8 b. 9 c. 10 d. Not enough information available

Section 2.2 (Summarizing Important Features of a Numerical Distribution)

6. [Objective: Recognize the important features of a numerical distribution] Data was collected on the heights of a group of five boy scouts, between the ages of seven and eleven. The following heights were recorded in inches: 29, 60, 52, 57, 52. Choose the true statement: a. The gap between the two smallest values indicates an outlier may be present. b. The smallest value is so extreme that it is possible that a mistake was made in recording the data. c. Although the smallest value does not fit the pattern, it should not be altogether disregarded. d. All of the above are true statements

Match one of the following histograms with the descriptions in questions (7) – (9):

[Three histograms labeled a, b, and c; vertical axes: Frequency]

7. [Objective: Recognize the shape of a distribution] The distribution of heights of adult females tends to be symmetrical which is displayed in histogram ________.

10. [Objective: Recognize the center of a distribution] The histogram below shows the distribution of pass rates on a swimming test of all children who completed a four week summer swim course at the local YMCA. How many of the courses had a pass rate less than 40 percent?

[Histogram: Frequency versus Percentage that passed the swim test]

a. About 8 b. About 5 c. About 3 d. Not enough information available

11. [Objective: Recognize the important features of a numerical distribution] Which of the following statements best describes the distribution and variability of the histogram below? The data in the histogram summarizes length of time on hold for a collection of customers calling a repair call center.

[Histogram: Frequency versus Length of time on hold in minutes]

a. The distribution shows that the data was highly variable with some callers waiting on hold as many as 20 minutes. b. The distribution is right-skewed and most callers waited on hold less than three minutes. c. The distribution is symmetrical. The number of callers who waited on hold for less than three minutes was the same as the number of callers who waited on hold for more than three minutes. d. The distribution is left-skewed and most callers waited on hold at least three minutes.

12. [Objective: Recognize the important features of a numerical distribution] Based on the histogram in question (11), would it be unusual to be on hold for at least 6 minutes at this call center? a. Yes, it would be unusual. b. No, it would not be unusual. c. Not enough information given.

13. [Objective: Recognize the important features of a numerical distribution] The data in the histogram summarizes the pitch speed of a sample of 75 pitches for a college pitcher during one season. Which of the following statements best describes the distribution of the histogram below?

[Histogram: Frequency versus Pitch Speed (mph)]

a. The distribution is symmetric around a pitch speed of about 93 mph. b. The distribution has a large amount of variation which can be seen by comparing the heights of the bars in the histogram. c. The distribution is left-skewed and shows that most of the pitches were less than 95 mph. d. The distribution is right-skewed and shows that most of the pitches were more than 90 mph.

Section 2.3 (Visualizing Variation in Categorical Variables)

14. [Objective: Recognize the important features of a categorical bar graph] Which of the following statements about bar graphs is true? a. It sometimes doesn’t matter in which order you place the bars representing different categories. b. It is appropriate to have gaps between the bars on the graph. c. On a bar graph, the width of the bars has no meaning. d. All of the above are true for bar graphs.

15. [Objective: Interpret visual displays of categorical data] A group of junior high athletes was asked what team sport was their favorite. The data are summarized in the table below. On the pie chart, which area would correspond to the category “Volleyball”?

Team Sport    Frequency
Soccer        12
Volleyball    28
Basketball    20
Football      20

[Pie chart: Favorite Team Sport]

Section 2.4 (Summarizing Categorical Distributions)

[Bar graph; vertical axis: Percentage]

16. [Objective: Recognize the important features of a categorical bar graph] What is the mode response for 2001? a. Graduated College, Bachelor’s Degree b. Graduated College, Associate’s Degree c. Some College d. No College

17. [Objective: Recognize the important features of a categorical bar graph] Which category shows the least amount of variation between years? a. No college b. Some college c. Graduated college, Associate’s degree d. Graduated college, Bachelor’s degree

18. [Objective: Recognize the important features of a numerical distribution] Data were collected on hand grip strength of adults. The histogram below summarizes the data. Which statement is true about the distribution of the data shown in the graph?

[Histogram: Frequency versus Grip Strength (pounds)]

a. The graph shows evidence that two different groups may have been combined into one collection. b. The best estimate of typical grip strength is 80-90 pounds because it is in the center of the distribution. c. There must have been a mistake made in data collection because the distribution should be bell-shaped. d. The graph is useless because it is bimodal.

Section 2.5 (Interpreting Graphs)

19. [Objective: Analyze statistical graphs] The graph below displays the number of homicides in the city of Flint, Michigan for each of the last three years. A reporter interprets this graph to mean that the number of murders in 2010 was more than twice the number of murders in 2008. Is the reporter making a correct interpretation?

[Bar graph: Homicides by Year (2008, 2009, 2010); values shown on the graph: 0, 32, 36, 64, 80]

a. No. The width of the bars is identical, indicating that the number of murders in 2010 is no different from 2008. b. Yes. The bar for 2010 is twice the height of the bar for 2008 and the number of murders indicated above the bars confirms that murders in 2010 were more than twice the level in 2008. c. There is not enough information given in the graph to determine whether the reporter’s interpretation is correct or not.

20. [Objective: Analyze statistical graphs] The following graphic was used to visually summarize the following statement made by Incredi-gro Fertilizer Company in a recent newspaper advertisement: “Our new Incredi-gro Fertilizer for flowers will grow your flowers two times faster than water alone.”

[Graphic titled “INCREDI-GRO will grow your flowers two times faster!” showing two flowers labeled “Using Water alone” and “After using Incredi-gro!”]

a. Yes, the area of the first flower is half the area of the second flower. b. No, although the dimensions have doubled, the area of the first flower is one quarter of the area of the second flower so the graphic incorrectly represents what is stated in the advertisement. c. Not enough information available to make a judgment. More information is needed about how fast flowers treated with Incredi-gro fertilizer will grow and how fast flowers treated with water alone will grow.

Chapter 2 Test B—Answer Key 1. D 2. A 3. B 4. B 5. A 6. D 7. A 8. C 9. B 10. C 11. B 12. A 13. A 14. D 15. B 16. C 17. A 18. A 19. B 20. B

Chapter 2 Test C—Short Answer

Provide an appropriate response.

Section 2.1 (Visualizing Variation in Numerical Data)

The dotplot shows how many times a computer was used daily at a public library during a 30-day period. Use information from the dotplot to answer questions (1) and (2).

[Dotplot: Number of daily logins]

1. [Objective: Interpret visual displays of numerical data] Describe the shape of the distribution in context.

2. [Objective: Interpret visual displays of numerical data] The library does not want patrons to wait in line. Usually, a line develops when a computer is used more than 40 times in a day. What percent of days did this occur? Round to the nearest whole percent.

Use the histogram below to answer questions (3), (4) and (5). The histogram shows the distribution of the number of cell phones owned by 210 households of five or more people with a minimum age of nine.

[Histogram: Frequency versus Number of Cell Phones]

3. [Objective: Reading visual displays of numerical data] According to the histogram, about how many households do not own any cell phones?

4. [Objective: Reading visual displays of numerical data] According to the histogram, about how many households own four or more cell phones?

5. [Objective: Reading visual displays of numerical data] About what percentage of households own no more than three cell phones?

6. [Objective: Recognize the important features of a numerical distribution] Define the important features to look for when presented with a numerical distribution.

7. [Objective: Predict the shape of a distribution] A used car salesman decides to track the number of cars he sold each week for the past twelve months. Typically, the salesman sells 13 cars per week. During an unusually slow week he sold only 2 cars, but during his best week he sold 29 cars. Predict the shape of the histogram of the number of cars sold each week.

The histogram below shows the distribution of pass rates on a swimming test of all children who completed a four week summer swim course at the local YMCA. Use the histogram to answer question (8).

[Histogram: Frequency versus Percentage that passed the swim test]

8. [Objective: Interpret visual displays of numerical data] About how many of the swim courses had a pass rate of 70% or better? The YMCA managers have set a goal for the summer that at least half the swim courses have a pass rate of at least 70%. Do you think that this YMCA is running a successful summer swim program? Why?

Section 2.2 (Summarizing Important Features of a Numerical Distribution)

The histograms below show the distribution of the amount of time per week spent outside of school on extracurricular athletic activities for high school boys and high school girls. Use the histograms to answer questions (9), (10) and (11).

[Two histograms: Frequency versus Hours per week for boys; Frequency versus Hours per week for girls]

9. [Objective: Recognize the important features of a numerical distribution] Compare and describe the shape of the distributions.

10. [Objective: Recognize the important features of a numerical distribution] Which group is more likely to spend 8 hours or more on extracurricular athletic activities each week? Which group is more likely to spend 6 hours or less on extracurricular athletic activities each week?

11. [Objective: Recognize the important features of a numerical distribution] About what percent of girls spent 5 hours or less on extracurricular athletic activities each week? Round to the nearest whole percent.

12. [Objective: Recognize the important features of a numerical distribution] You have created a histogram showing the distribution of the amount of money spent weekly on video game purchases of 120 males ages 19-25 over the last 6 months. The histogram shows that the typical amount spent by males in this age category is $35. The histogram also shows that one male reported that he spent $250 on video game purchases during one of the weeks. Explain how you would use the histogram to determine whether $250 is an outlier.

13. [Objective: Recognize the important features of a numerical distribution] The histograms below show the distribution of quiz scores on a ten point math quiz with and without a fifteen minute review before the quiz. Describe the different shapes of the distributions. Does it appear that the fifteen minute review resulted in improved quiz scores? Explain the evidence that supports your conclusion.

[Two histograms: Frequency versus Quiz Scores (out of 10 points); panels: “Without 15 minute review before quiz” and “With 15 minute review before quiz”]

Section 2.3 (Visualizing Variation in Categorical Data) & Section 2.4 (Summarizing Categorical Distributions) 14. [Objective: Recognize the important features of a categorical visual display] Compare and contrast the important similarities and differences between bar charts and histograms. In what context should a bar chart be used? Explain why there are gaps between the bars on a bar chart. In what context should a histogram be used? Explain why there are no gaps between the bars on a histogram.

The bar chart below shows car color for college students at a local community college. Use the bar chart to answer questions (15), (16) and (17).

[Bar chart: Frequency by Car Color; series: Female College Students and Male College Students]

15. [Objective: Recognize the important features of a categorical bar graph] Report the mode for males and females.

16. [Objective: Recognize the important features of a categorical bar graph] Did one group (males or females) show more variability in car color?

17. [Objective: Recognize the important features of a categorical bar graph] Write a sentence to compare color preferences of males and females.

A survey was conducted which asked college students, “What is your favorite type of take-out food?” The results are shown in the graphs below. Use the graphs to answer questions (18) and (19).

[Bar chart: Frequency by Type of Take-out Food]

[Pie chart: Pizza: 26.3%; Chinese: 21%; Submarine Sandwich: 20%; Mexican: 18.3%; Greek: 14.4%]

18. [Objective: Interpret visual displays of categorical data] Which type of take-out food is least popular? Is this easier to determine with the bar chart or with the pie chart? Why?

19. [Objective: Interpret visual displays of categorical data] What percentage of college students preferred either Chinese or pizza? Is this easier to determine with the bar chart or with the pie chart? Why?

20. [Objective: Interpret visual displays of categorical data] Explain how to assess variability when presented with categorical data.

Chapter 2 Test C—Answer Key

1. The distribution is roughly bell-shaped and centered around 30 daily logins. There were at least 10 daily logins, but never more than 50 daily logins, during the 30-day period.
2. 23%
3. About 9 households did not own any cell phones.
4. About 90 households owned four or more cell phones.
5. About 57% of households owned no more than three cell phones.
6. The important features to look for are the shape of the distribution, the typical value (or center) of the distribution, and the variability (or horizontal spread) of the distribution.
7. Answers will vary but should be bounded by a minimum value of 0 cars and a maximum value of 29 cars.
8. About 50 of the courses had a pass rate of 70% or better. About 56% (approximately 50 out of 89) of the swim courses had a pass rate of 70% or better so the summer swim program has met the goal set by management.
9. The histograms for both high school boys and girls are roughly right-skewed.
10. Boys; both groups are equally likely to spend 6 hours or less on extracurricular activities each week.
11. About 75%
12. Answers will vary, but should address some or all of the following: Does the histogram show small or large variation? Was it skewed to the right? Does this piece of data fit the general pattern or is it far removed from the other data points? This observation warrants more investigation. It could be a recording mistake, but it could also be an accurate (although unusually high) response.
13. Without the 15 minute review the distribution of quiz scores is roughly symmetrical with a typical score of 6 out of 10. With the 15 minute review the distribution of quiz scores is left-skewed with a typical score of 8 out of 10. The 15 minute review worked. Without the review roughly 23% scored higher than 8 out of 10, but after the review roughly 70% scored higher than 8 out of 10.
14. Bar charts are used to display the distribution of categorical data. Each bar shows the frequency of a category. Since the horizontal axis contains the categories, the bar chart will have gaps between the bars. A histogram is used to display the distribution of numerical data. Each bar shows the frequency of a “bin” or interval of numerical values, so bars must be ordered without gaps.
15. The mode for females is black, and the mode for males is red.
16. Although the distributions for males and females were different, there does not appear to be a significant difference in variability.
17. Answers will vary. One possible response: For both groups, the least popular car color was white.
18. Greek take-out was the least popular category. It is easiest to see this on the bar chart.
19. The percentage that prefer Chinese or pizza is 47.3%. It is easiest to calculate this using information from the pie chart.
20. Answers will vary, but should include some explanation about how variability on bar charts can be determined by assessing diversity (are there many observations in many categories).
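The arithmetic behind answers 8 and 19 can be checked in a few lines. This is a minimal sketch that takes the counts quoted in the key (about 50 of 89 courses) and the pie chart percentages as given; the variable names are illustrative.

```python
# Answer 8: share of swim courses with a pass rate of 70% or better,
# using the approximate counts quoted in the answer key.
courses_passing = 50
total_courses = 89
share = 100 * courses_passing / total_courses
goal_met = share >= 50  # management's goal: at least half of the courses

# Answer 19: combined share for Chinese and pizza, from the pie chart labels.
chinese_or_pizza = round(21.0 + 26.3, 1)

print(f"{share:.0f}% of courses reached a 70% pass rate; goal met: {goal_met}")
print(f"{chinese_or_pizza}% preferred Chinese or pizza")
```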

Chapter 3 Test A—Multiple Choice

Section 3.1 (Summaries for Symmetric Distributions)

1. [Objective: Calculate the mean of a small data set] The following nine values represent race finish times in hours for a randomly selected group of participants in an extreme 10k race (a 10k race with obstacles). Which of the following is closest to the mean of the following data set? 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.4, 1.4, 1.5

a. x̄ is about 1.1 hours b. x̄ is about 1.3 hours c. x̄ is about 1.5 hours d. x̄ is about 1.6 hours
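The sample mean is the sum of the values divided by how many there are. A minimal sketch using the nine finish times from question 1 and Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean

# Finish times in hours, copied from question 1.
finish_times = [1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.4, 1.4, 1.5]

x_bar = mean(finish_times)
print(f"sum = {sum(finish_times):.1f}, n = {len(finish_times)}")
print(f"mean finish time = {x_bar:.2f} hours")
```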

2. [Objective: Understand the standard deviation in context] Below is the standard deviation for extreme 10k finish times for a randomly selected group of women and men. Choose the statement that best summarizes the meaning of the standard deviation.

Women: s = 0.17; Men: s = 0.21

a. On average, men’s finish times will be 0.21 hours faster than the overall average finish time. b. On average, women’s finish times will be 0.17 hours less than men’s finish times. c. The distribution of men’s finish times is less varied than the distribution of women’s finish times. d. The distribution of women’s finish times is less varied than the distribution of men’s finish times.

3. [Objective: Understand the mean in context] A city planner says, “The typical commute to work for someone living in the city limits is less than the commute to work for someone living in the suburbs.” What does this statement mean? a. If you live in the city limits you will have a longer commute time. b. The center of the distribution of commute times for a city-dweller is less than the center of the distribution for those living in the suburbs. c. All city dwellers spend less time commuting to work than those living in the suburbs. d. There is less variation in the commute time of those living in the suburbs.

The following list shows the age at appointment of U. S. Supreme Court Chief Justices appointed since 1900. Use the data to answer questions (4) and (5).

Last Name    Age
White        65
Taft         63
Hughes       67
Stone        68
Vinson       56
Warren       62
Burger       61
Rehnquist    61
Roberts      50

4. [Objective: Calculate and interpret the mean of a small data set] Find the mean, rounding to the nearest tenth of a year, and interpret the mean in this context. a. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 63.0. b. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 64.1. c. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 61.4. d. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 61.0.
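The summary statistics for the Chief Justice ages can be reproduced directly from the table. The sketch below assumes the sample (n − 1) versions of the variance and standard deviation, which is what `statistics.variance` and `statistics.stdev` compute; whether the original spreadsheet output used the same convention is an assumption, since that output is not reproduced here.

```python
from statistics import mean, stdev, variance

# Ages at appointment, copied from the table of Chief Justices.
ages = [65, 63, 67, 68, 56, 62, 61, 61, 50]

age_mean = mean(ages)
age_var = variance(ages)  # sample variance (divides by n - 1)
age_sd = stdev(ages)      # sample standard deviation

print(f"mean = {age_mean:.1f} years")
print(f"sample variance = {age_var:.1f}")
print(f"sample standard deviation = {age_sd:.1f} years")
```

The standard deviation, not the variance, is in the same units as the data, so it is the one to use in a sentence such as "most ages are within so many years of the mean."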

5. [Objective: Interpret variability of a small data set] The U. S. Supreme Court Chief Justice data was used to create the following output in an Excel spreadsheet. Choose the statement that best summarizes the variability of the dataset.

a. The ages of most of the U. S. Supreme Court Chief Justices since 1900 are within 5.6 years of the mean age. b. The ages of most of the U. S. Supreme Court Chief Justices since 1900 are within 31.3 years of the mean age. c. The ages of most of the U. S. Supreme Court Chief Justices are between 50 and 68 years. d. None of the above.

Section 3.2 (What’s Unusual? The Empirical Rule and z-Scores)

Use the following information to answer questions (6)-(9). The mean age of lead actresses from the top ten grossing movies of 2010 was 29.6 years with a standard deviation of 6.35 years. Assume the distribution of the actresses’ ages is approximately unimodal and symmetric.

6. [Objective: Use the Empirical Rule to determine ranges of usual values] Between what two values would you expect to find about 95% of the lead actresses’ ages? a. 23.25 and 35.95 years b. 10.55 and 48.65 years c. 16.9 and 42.3 years d. None of the above

7. [Objective: Use the Empirical Rule to determine ranges of usual values] Between what two values would you expect to find about 68% of the lead actresses’ ages? a. 23.25 and 35.95 years b. 10.55 and 48.65 years c. 16.9 and 42.3 years d. None of the above
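The Empirical Rule ranges are the mean plus or minus one, two, and three standard deviations. A minimal sketch for the actress ages (mean 29.6, standard deviation 6.35), assuming the unimodal, symmetric shape stated above; the function name `empirical_rule_ranges` is illustrative.

```python
def empirical_rule_ranges(mean, sd):
    """Return the mean +/- 1, 2, and 3 standard deviation intervals."""
    labels = {1: "about 68%", 2: "about 95%", 3: "nearly all"}
    return {
        label: (mean - k * sd, mean + k * sd)
        for k, label in labels.items()
    }


ranges = empirical_rule_ranges(mean=29.6, sd=6.35)
for label, (low, high) in ranges.items():
    print(f"{label}: {low:.2f} to {high:.2f} years")
```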

8. [Objective: Calculate and interpret z-scores] In 2010, actress Helena Bonham Carter was in the movie “The King’s Speech.” She was 44. Was it unusual for a 44-year-old actress to be in a top grossing film of 2010? Assume the Empirical Rule holds and find the z-score and whether it was unusual. a. z = 2.27; Most people would say it is unusual. b. z = −2.27; Most people would say it is not unusual. c. z = 0.27; Most people would say it is unusual. d. z = −0.27; Most people would say it is not unusual.
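A z-score measures how many standard deviations a value lies from the mean: z = (value − mean) / standard deviation. The sketch below applies this to the age in question 8. Treating |z| > 2 as "unusual" is the common Empirical Rule convention and is an assumption here; the helper name `z_score` is illustrative.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies above the mean."""
    return (value - mean) / sd


z = z_score(value=44, mean=29.6, sd=6.35)
is_unusual = abs(z) > 2  # assumed cutoff: more than 2 SDs from the mean

print(f"z = {z:.2f}; unusual: {is_unusual}")
```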

9. [Objective: Use the Empirical Rule to determine ranges of usual values] In 1993, actress Anna Paquin won an Academy Award for the movie “The Piano”. She was 11 years old. Finish the statement: “According to the Empirical Rule, the ages of nearly all lead actresses will be between ______ and _____ years. Anna Paquin was ___________ this range when she won the Academy Award.” a. 16.9; 42.3; not within b. 10.6; 48.7; within c. 10.6; 48.7; not within d. 23.3; 36.0; within

10. [Objective: Calculate and interpret z-scores] The mean price of a pound of ground beef in 75 cities in the Midwest is $2.11 and the standard deviation is $0.56. A histogram of the data shows that the distribution is symmetrical. A local Midwest grocer is selling a pound of ground beef for $3.25. What is this price in standard units? Assuming the Empirical Rule applies, would this price be unusual or not? Round to the nearest hundredth. a. z = 2.04; This is unusually expensive ground beef. b. z = 2.04; This price would not be unusual. c. z = −2.04; This price would not be unusual. d. z = −2.04; This is unusually inexpensive ground beef.

Use the following information to answer questions (11)-(13). The economic impact of an industry, such as sport fishing, can be measured by the retail sales it generates. In 2006, the economic impact of Great Lakes fishing in states bordering the Great Lakes had a mean of $318 and a standard deviation of $83.5. Note that all dollar amounts are in millions of dollars. Assume the distribution of retail sales is unimodal and symmetric. (Source: National Oceanic and Atmospheric Administration).

11. [Objective: Use the Empirical Rule to determine ranges of usual values] For what percentage of Great Lakes states would you expect the economic impact from fishing to be between $234.5 and $401.5 (in millions of dollars)? a. 95% b. 68% c. Nearly all d. None of the above

12. [Objective: Use the Empirical Rule to determine ranges of usual values] The economic impact of fishing for nearly all Great Lakes states should fall within what range (in millions of dollars)? a. $151 to $485 b. $67.5 to $568.5 c. $234.5 to $401.5 d. $83.5 to $318

13. [Objective: Use the Empirical Rule to determine ranges of usual values] If a new report came out saying that the economic impact of Great Lakes sport fishing on the economy of Illinois was $93,588,546, would you say this was unusual? Note that this dollar amount must be converted before calculating a standard score. a. No, it is in the range of typical values. b. Yes, it is unusually high. c. Yes, it is unusually low. d. Not enough information available
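Question 13 depends on a unit conversion: the mean and standard deviation are stated in millions of dollars, while the Illinois figure is in dollars. A minimal sketch of the conversion followed by the standard score; the constant and variable names are illustrative.

```python
# Summary values from the problem statement, in millions of dollars.
MEAN_MILLIONS = 318.0
SD_MILLIONS = 83.5

# The reported Illinois figure is in dollars, so convert it first.
illinois_dollars = 93_588_546
illinois_millions = illinois_dollars / 1_000_000

z = (illinois_millions - MEAN_MILLIONS) / SD_MILLIONS
print(f"Illinois: ${illinois_millions:.1f} million, z = {z:.2f}")
```

Skipping the conversion would produce an enormous positive z-score, which is the mistake the note in the question is warning against.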

Section 3.3 (Summaries for Skewed Distributions) & Section 3.4 (Comparing Measures of Center)

Use the following information to answer questions (14)-(16). Here is a table recording the number of deaths for the top thirteen worst U. S. tornadoes since 1925. A histogram showing the distribution is also included.

Number of deaths: 268, 150, 181, 114, 115, 110, 689, 454, 102, 208, 142, 271, 315

[Histogram: Frequency versus Number of deaths]

14. [Objective: Determine most appropriate typical value for skewed data] Choose the most appropriate measure of center then calculate the typical value rounded to the nearest tenth. a. Median; 181.0 b. Median; 239.9 c. Mean; 239.9 d. Mean; 181.0

15. [Objective: Determine most appropriate measure of variability for skewed data] Estimate the most appropriate measure of variability. a. Standard Deviation; 169.4 b. IQR; 574 c. Standard Deviation; 178.5 d. IQR; 156
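For skewed data such as the tornado deaths, the median and interquartile range are the usual summaries. Quartile conventions differ between textbooks and software, so the IQR can vary with the method. The sketch below uses `statistics.quantiles` with `method="inclusive"`; choosing that convention is an assumption, made because it reproduces the value 156 listed among the options.

```python
from statistics import mean, median, quantiles

# Deaths for the thirteen tornadoes, copied from the table.
deaths = [268, 150, 181, 114, 115, 110, 689, 454, 102, 208, 142, 271, 315]

center_median = median(deaths)
center_mean = mean(deaths)

# quantiles() sorts the data internally; "inclusive" treats the data as
# containing the population minimum and maximum.
q1, q2, q3 = quantiles(deaths, n=4, method="inclusive")
iqr = q3 - q1

print(f"median = {center_median}, mean = {center_mean:.1f}")
print(f"Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

The mean is pulled well above the median by the few very large values, which is the typical signature of a right-skewed distribution.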

16. [Objective: Judging the effects of outliers] The worst tornado on record since 1925 is a tornado that went through Missouri, Illinois, and Indiana on March 18, 1925. It killed 689 people. Suppose that when this value was entered into a calculator or other software a mistake was made and it was entered as 1,689. Choose the statement that describes what effect this mistake will have on the mean and median. a. Both the median and the mean will be higher than they should be. b. The median and the mean will not be affected by the error. Both measures of center are resistant to extreme values. c. The median will not be affected by the error, but the mean will be higher than it should be. d. The median will be higher than it should be, but the mean will not be affected by the error.
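The effect of a single data-entry error on the two measures of center can be checked by recomputing both with the mistyped value. A minimal sketch using the thirteen tornado death counts, with 689 replaced by 1,689 as described in question 16; the variable names are illustrative.

```python
from statistics import mean, median

correct = [268, 150, 181, 114, 115, 110, 689, 454, 102, 208, 142, 271, 315]
# Same data with the largest value mistyped as 1,689.
mistyped = [1689 if value == 689 else value for value in correct]

mean_shift = mean(mistyped) - mean(correct)
median_shift = median(mistyped) - median(correct)

print(f"mean changes by {mean_shift:.1f}")
print(f"median changes by {median_shift}")
```

The extreme value was already the largest observation, so making it larger does not change which value sits in the middle of the sorted list.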

Section 3.5 (Using Boxplots for Displaying Summaries) 17. [Objective: Calculate the five-number summary from a small data set] Calculate the five-number summary for the following dataset. 51 53 62 34 36 39 43 63 73 79 a. 32, 39, 52, 63, 79 b. 34, 37.5, 51, 62.5, 73 c. 34, 37.5, 53, 68, 79 d. 34, 39, 52, 63, 79

Use the side-by-side boxplots below to answer questions (19) and (20). The boxplots summarize the number of sentenced prisoners by state in the Midwest and West.

[Figure: side-by-side boxplots. Values labeled on the Midwest boxplot: 1,435; 4,322; 9,891; 29,928; 50,648. Values labeled on the West boxplot: 2,114; 3887.5; 6,887; 15,706; 22,662; 34,581.]

18. [Objective: Use boxplots to make comparisons] Pick the statement that best describes the shape of the distribution for the states in the West. a. The data appears to be roughly symmetrical with a possible outlier. b. The data appears to be right-skewed with a possible outlier. c. The data appears to be left-skewed with large variability.

19. [Objective: Use boxplots to make comparisons] Based on the boxplot for the Midwest, which of the following is true? a. 25% of the states sentenced less than 1,435 prisoners. b. 25% of the states sentenced more than 29,928 prisoners. c. 50% of the states sentenced less than 4,322 prisoners. d. 50% of the states sentenced more than 29,928 prisoners.


Chapter 3 Test A 20. [Objective: Interpret a boxplot] Using the boxplot for the Midwest, determine which of the following statements about the distribution cannot be justified. a. The range in the Midwest is 49,213. b. About 75% of the Midwestern states had 4,322 or more prisoners. c. The distribution is skewed to the right d. There are fewer states with 3887.5 to 6887 prisoners than states with 6887 to 15706 prisoners.

Chapter 3 Test A—Answer Key 1. B 2. D 3. B 4. C 5. A 6. C 7. A 8. A 9. B 10. A 11. B 12. A 13. C 14. A 15. D 16. C 17. D 18. B 19. B 20. D
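As a check on answers 14 and 16, the following minimal sketch recomputes the mean and median of the thirteen tornado death counts, then repeats the calculation with the data-entry error described in question 16 (1,689 in place of 689). The variable names are illustrative and are not part of the test bank.

```python
from statistics import mean, median

# Deaths for the thirteen tornados listed in the table for questions 14-16.
tornado_deaths = [268, 150, 181, 114, 115, 110,
                  689, 454, 102, 208, 142, 271, 315]

# Question 14: the distribution is right-skewed, so the median is the more
# appropriate typical value; the mean is computed only for comparison.
typical_median = median(tornado_deaths)
typical_mean = mean(tornado_deaths)

# Question 16: replace 689 with the mistyped value 1,689.
with_error = [1689 if d == 689 else d for d in tornado_deaths]
median_with_error = median(with_error)
mean_with_error = mean(with_error)

print(f"median: {typical_median}, mean: {typical_mean:.1f}")
print(f"with error -> median: {median_with_error}, mean: {mean_with_error:.1f}")
```

The median stays at 181 in both cases, while the mean rises from about 239.9 to about 316.8, which is consistent with answer C for question 16.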


Chapter 3 Test B—Multiple Choice Section 3.1 (Summaries for Symmetric Distributions) 1.

x is about 1.1 hours x is about 1.3 hours x is about 1.4 hours x is about 1.6 hours

Men: s = 0.25

On average, men’s finish times will be 0.25 hours faster than the overall average finish time. On average, women’s finish times will be 0.16 hours less than men’s finish times. The distribution of women’s finish times is less varied than the distribution of men’s finish times. The distribution of men’s finish times is less varied than the distribution of women’s finish times.

[Objective: Understand the mean in context] A school board member says, “The typical bus ride to school for a student living in the city limits is more than the bus ride to school for a student living in the suburbs.” What does this statement mean? a. There is less variation in the bus ride times of those living in the suburbs. b. If you are a student living in the city limits you will have a shorter commute time. c. All students living in the city spend less time riding the bus to school than those living in the suburbs. d. The center of the distribution of bus ride times for a city-dweller is more than the center of the distribution for those living in the suburbs.


Age 65 63 67 68 56

Last Name Warren Burger Rehnquist Roberts

Age 62 61 61 50

[Objective: Calculate and interpret the mean of a small data set] Find the mean, rounding to the nearest tenth of a year, and interpret the mean in this context. a. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 61.4. b. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 61.0. c. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 63.0. d. The typical age of a U. S. Supreme Court Chief Justice appointed since 1900 is 64.1.

a. The ages of most of the U. S. Supreme Court Chief Justices are between 50 and 68 years. b. The ages of most of the U. S. Supreme Court Chief Justices since 1900 are within 31.3 years of the mean age. c. The ages of most of the U. S. Supreme Court Chief Justices since 1900 are within 5.6 years of the mean age. d. None of the above.

Section 3.2 (What’s Unusual? The Empirical Rule and z-Scores) Use the following information to answer questions (6)-(9). The mean age of lead actors from the top ten grossing movies of 2007 was 36.4 years with a standard deviation of 9.87 years. Assume the distribution of actors’ ages is approximately unimodal and symmetric. 6.

[Objective: Use the Empirical Rule to determine ranges of usual values] Between what two values would you expect to find about 68% of the lead actors’ ages? a. 6.87 and 66.01 years b. 26.53 and 46.27 years c. 16.66 and 56.14 years d. None of the above


[Objective: Use the Empirical Rule to determine ranges of usual values] Between what two values would you expect to find about 95% of the lead actors’ ages? a. 6.87 and 66.01 years b. 26.53 and 46.27 years c. 16.66 and 56.14 years d. None of the above

[Objective: Calculate and interpret z-scores] In 2007, popular actor and singer Justin Timberlake was 26 years old. What is Justin Timberlake’s age in 2007 if it is standardized? Would it be unusual for a 26-year-old actor to be in a top-grossing film of 2007? Assume the Empirical Rule applies and round to the nearest hundredth. a. z = 1.05 ; It would not be unusual. b. z = 1.05 ; It would be unusual. c. z = −1.05 ; It would be unusual. d. z = −1.05 ; It would not be unusual.

[Objective: Use the Empirical Rule to determine ranges of usual values] In 2002, actor Adrien Brody won an Academy Award for the movie “The Pianist”. He was 29 years old. Finish the statement: “According to the Empirical Rule, the ages of nearly all lead actors will be between ______ and _____ years. Adrien Brody was ___________ this range when he won the Academy Award.” a. 6.8; 66.0; within b. 6.8; 66.0; not within c. 16.7; 56.1; within d. 26.5; 46.3; not within

10. [Objective: Calculate and interpret z-scores] In 2007, the mean price per pound of lobster in New England was $11.48 and the standard deviation was $2.12. A histogram of the data shows that the distribution is symmetrical. A local New England grocer is selling lobster for $8.99 per pound. What is this price in standard units? Assuming the Empirical Rule applies, would this price be considered unusual or not? Round to the nearest hundredth. a. z = 1.17 ; This is unusually expensive lobster. b. z = 1.17 ; This price would not be unusual. c. z = −1.17 ; This price would not be unusual. d. z = −1.17 ; This is unusually inexpensive lobster. Use the following information to answer questions (11)-(13). The economic impact of an industry, such as sport fishing, can be measured by the retail sales it generates. In 2006, the economic impact of great lakes fishing in states bordering the great lakes had a mean of $318 and a standard deviation of $83.5. Note that all dollar amounts are in millions of dollars. Assume the distribution of retail sales is unimodal and symmetric. (Source: National Oceanic and Atmospheric Administration). 11. [Objective: Use the Empirical Rule to determine ranges of usual values] For what percentage of great lakes states would you expect the economic impact from fishing to be between $151.00 and $485.00 (in millions of dollars)? a. 95% b. 68% c. Nearly all d. None of the above


Chapter 3 Test B 12. [Objective: Use the Empirical Rule to determine ranges of usual values] The economic impact of fishing for nearly all great lakes states should fall within what range (in millions of dollars)? a. $234.5 to $401.5 b. $151 to $485 c. $67.5 to $568.5 d. $83.5 to $318 13. [Objective: Use the Empirical Rule to determine ranges of usual values] If a new report came out saying that the economic impact of great lakes sport fishing on the economy of Illinois was $93,588,546, would you say this was unusual? Note that this dollar amount must be converted before calculating a standard score. a. Yes, it is unusually high. b. Yes, it is unusually low. c. No, it is in the range of typical values. d. Not enough information available

Section 3.3 (Summaries for Skewed Distributions) & Section 3.4 (Comparing Measures of Center) Use the following information to answer questions (14)-(16). Here is a table recording the number of deaths for the top thirteen worst U. S. tornados since 1925. A histogram showing the distribution is also included.

Number of deaths: 268 150 181 114 115 110 689 454 102 208 142 271 315

[Histogram: Frequency versus Number of deaths]

14. [Objective: Determine most appropriate typical value for skewed data] Choose the most appropriate measure of center then calculate the typical value rounded to the nearest tenth. a. Mean; 239.9 b. Mean; 181.0 c. Median; 239.9 d. Median; 181.0 15. [Objective: Determine most appropriate measure of variability for skewed data] Estimate the most appropriate measure of variability. a. IQR; 574 b. Standard Deviation; 169.4 c. IQR; 156 d. Standard Deviation; 178.5


16. [Objective: Judging the effects of outliers] In a data set containing the number of casualties (deaths) for all tornados since 1925, the tornado with the most deaths went through Missouri, Illinois and Indiana on March 18, 1925. It killed 689 people. Suppose that when this value was entered into a calculator or other software a mistake was made and it was entered as 1,689. Choose the statement that describes what effect this mistake will have on the median and the mean. a. Both the median and the mean will be higher than they should be. b. The median and the mean will not be affected by the error. Both measures of center are resistant to extreme values. c. The median will be higher than it should be, but the mean will not be affected by the error. d. The median will not be affected by the error, but the mean will be higher than it should be.

19.98, 41.19, 73.295, 83.88, 114.6 19.98, 41.19, 75, 115, 83.88 41.19, 73.295, 83.88, 114.6 19, 41, 63, 84, 115

Use the side-by-side boxplots below to answer questions (19) and (20). The boxplots summarize the number of sentenced prisoners by state in the Midwest and West.

[Figure: side-by-side boxplots. Values labeled on the Midwest boxplot: 1,435; 4,322; 9,891; 29,928; 50,648. Values labeled on the West boxplot: 2,114; 3887.5; 6,887; 15,706; 22,662; 34,581.]

18. [Objective: Use boxplots to make comparisons] Pick the statement that best describes the shape of the distribution for the states in the Midwest. a. The data appears to be roughly symmetrical with a possible outlier. b. The data appears to be right-skewed with a possible outlier. c. The data appears to be left-skewed with large variability. d. The data appears to be right-skewed with large variability. 19. [Objective: Use boxplots to make comparisons] Based on the boxplot for the West, which of the following is true? a. 25% of the states sentenced more than 15,706 prisoners. b. 25% of the states sentenced less than 6,887 prisoners. c. 50% of the states sentenced less than 15,706 prisoners. d. 50% of the states sentenced less than 22,662 prisoners.


20. [Objective: Interpret a boxplot] Using the boxplot for the West, determine which of the following statements about the distribution cannot be justified. a. The range is 32,467. b. About 75% of the West states had 3,887 or more prisoners. c. The distribution is skewed to the right d. There are fewer states with 3887.5 to 6887 prisoners than states with 6887 to 15706 prisoners. e. The interquartile range is about 11,819.

Chapter 3 Test B—Answer Key 1. C 2. C 3. D 4. A 5. C 6. B 7. C 8. D 9. A 10. C 11. A 12. B 13. B 14. D 15. C 16. D 17. A 18. D 19. A 20. D
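The Empirical Rule and z-score answers above (questions 6-13) can be checked with a short sketch like the one below. The helper names and the cutoff of two standard deviations for "unusual" are assumptions made for illustration; the numbers come from the question stems.

```python
def empirical_range(mean, sd, k):
    """Return the interval from mean - k*sd to mean + k*sd."""
    return (mean - k * sd, mean + k * sd)


def z_score(x, mean, sd):
    """Standardize x: the number of standard deviations from the mean."""
    return (x - mean) / sd


def is_unusual(z, cutoff=2.0):
    """Assumed convention: more than `cutoff` standard deviations from the mean."""
    return abs(z) > cutoff


# Questions 6-8: lead actors' ages (mean 36.4 years, SD 9.87 years).
actor_mean, actor_sd = 36.4, 9.87
range_68 = empirical_range(actor_mean, actor_sd, 1)   # about 68% of ages
range_95 = empirical_range(actor_mean, actor_sd, 2)   # about 95% of ages
z_actor = z_score(26, actor_mean, actor_sd)

# Question 10: lobster priced at $8.99 (mean $11.48, SD $2.12).
z_lobster = z_score(8.99, 11.48, 2.12)

# Question 13: convert dollars to millions of dollars before standardizing.
illinois_millions = 93_588_546 / 1_000_000
z_illinois = z_score(illinois_millions, 318, 83.5)

print([round(v, 2) for v in range_68], [round(v, 2) for v in range_95])
print(round(z_actor, 2), is_unusual(z_actor))
print(round(z_lobster, 2), is_unusual(z_lobster))
print(round(z_illinois, 2), is_unusual(z_illinois))
```

Under these assumptions the actor and lobster z-scores fall within two standard deviations of the mean, while the Illinois figure falls more than two standard deviations below it.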


Chapter 3 Test C—Short Answer Provide an appropriate response. Section 3.1 (Summaries for Symmetric Distributions) Use the following information to answer questions (1) and (2). A junior high gym teacher recorded the time, in minutes, that it took two of her classes to run one mile. Here are the summary statistics for each class:

Class A: x̄ = 9.6, s = 1.1

Class B: x̄ = 9.9, s = 1.5

[Objective: Interpret the mean and standard deviation] Write a sentence comparing the Class A and Class B. Does one class run faster than the other? Explain.

[Objective: Calculate the variance] Calculate the sample variance for each class. Round to the nearest hundredth and be sure to use the correct symbols and units.

Explain in your own words what the sample standard deviation is and why it is an important summary statistic.

Use the following information to answer questions (4)-(6). Data and summary statistics about college professors’ salaries were gathered from nine institutions and are presented below. This salary information is from professors teaching at the Master’s level.

Men 81907 66290 55632 95724 70034 57179 73648 59052 49751

Women 77451 64251 54018 91360 68970 56092 69690 57278 48793

Men: x̄ = 67691; s = 14506 Women: x̄ = 65323; s = 13274

[Objective: Interpret the mean and standard deviation] According to a national study, the average salary for a professor teaching at the Master’s level is approximately $70,000. How does the data from these nine institutions compare to this? Do the salaries for either group at these nine institutions appear to agree or disagree with the study?

5.

[Objective: Interpret variability of a data set] Which group has more variability? How would this affect the histogram for the group, compared to the other group? Explain.

[Objective: Use the Empirical rule to determine unusual values] Suppose a female professor is offered a position with an annual salary of $100,000. Compared to the women from the nine institutions in the study, would this be an unusually good salary? Explain.

Section 3.2 (What’s Unusual? The Empirical Rule and z-Scores) 7.

[Objective: Understand and apply the Empirical Rule] In your own words, explain what the Empirical Rule says and what conditions a distribution must meet in order to apply the rule.

Use the following information to answer questions (8)-(10). The average snowfall for cities in Michigan is 71.6 inches with a standard deviation of 9.7 inches. Assume the distribution for annual snowfall is approximately unimodal and symmetrical. 8.

[Objective: Use the Empirical Rule to determine ranges of usual values] What is the range of values for annual snowfall that would contain roughly 68% of the cities in Michigan? Round calculations to the nearest tenth.

[Objective: Use the Empirical Rule to determine ranges of usual values] What is the range of values for annual snowfall that would contain roughly 95% of the cities in Michigan? Round calculations to the nearest tenth.

10. [Objective: Calculate and interpret z-Scores] In 2007, the annual snowfall in Grand Rapids, Michigan was 97.2 inches. What is the standard score for the 2007 snowfall? Was this an unusual amount of snowfall for a city in Michigan? Show all work and round any calculations to the nearest tenth.


11. [Objective: Calculate z-scores and use to compare relative standing] The average grade on an algebra exam was 76% with a standard deviation of 6 percentage points. The average grade on a chemistry exam was 81% with a standard deviation of 2 percentage points. Julie got a grade of 83% on both exams. Which exam did she do relatively better on? Show all work and round any calculations to the nearest hundredth.

Use the following information to answer questions (12)-(14). In 2007, the average number of hours spent online at home for U. S. adults with internet access was 8.9 hours with a standard deviation of 0.4 hours. The U. S average was determined by collecting cluster data from thirty randomly selected states. Assume the distribution of time spent online at home is approximately unimodal and symmetric. 12. [Objective: Use the Empirical Rule to determine ranges of usual values] What is the range of time spent online at home for 95% of adults with internet access? Round to the nearest tenth.

13. [Objective: Use the Empirical Rule to determine ranges of usual values] If a new report came out saying that on average, Floridian adults spent 8.0 hours online while at home, would you say this was an unusual value? If it was unusual, explain how you reached your conclusion and whether it was unusually high or low.

14. [Objective: Calculate and interpret z-scores] The standard score for hours spent online at home for Minnesotans was 3.00. Approximately how many hours do Minnesotan adults spend online at home? Show your work and round all calculations to the nearest tenth.

Section 3.3 (Summaries for Skewed Data) & Section 3.4 (Comparing Measures of Center) 15. [Objective: Determine the most appropriate typical value for skewed data] Eric is contemplating whether to accept a job offer in an unfamiliar city. The move would mean buying a new home for his family of five people. He is curious about typical home prices in the new city. Which information would be more useful to him, average house prices or median house prices? Explain.


Use the following data to answer questions (16)-(18). Here are the scores on a recent statistics midterm exam (Scores have been listed from lowest to highest). A histogram showing the distribution is also included.

Exam Scores: 3 11 12 13 14 15 18 18 19 19 20 20 20 21 22 22 22 23 24 24

[Histogram: Frequency versus Exam Scores]

16. [Objective: Determine the most appropriate typical value for skewed data] Which measure of center, mean or median, would be most appropriate, and why? Using the data and the histogram, find the approximate value of the appropriate measure of center, and describe how you found it.

17. [Objective: Determine the most appropriate measure of variability for skewed data] Choose the most appropriate measure of variability for the data and calculate it. Explain why you chose the measure of variability that you did. Round all calculations to the nearest tenth if necessary.

18. [Objective: Judging the effects of outliers] The worst midterm grade was received by a student who was absent the week prior to the exam due to illness. Should this grade be considered an outlier? Explain and support your reasoning. Be sure to state what you would do with this data value.


Section 3.5 (Using Boxplots for Displaying Data) Use the following information for questions (19) and (20). The boxplots below are from a graphing calculator. The boxplots summarize the cost in dollars of a typical evening out for two people (dinner and a movie) in two different cities in west Michigan, Muskegon (top boxplot) and Grand Rapids (bottom boxplot). The five-number summaries are also given for each city. Muskegon: 59, 62, 80.5, 85, 94 Grand Rapids: 50, 75, 93.5, 105, 125

19. [Objective: Use boxplots to make comparisons] Explain which city you think is more economical for an evening out and why. Be sure to comment on differences in typical value and variation for each city.

20. Brian and his significant other plan to visit west Michigan and eat out every night at a different restaurant. He wants to be careful about his budget. Based on the information provided, which city would be best for them to visit and why?


Chapter 3 Test C—Answer Key
1. Class B has a mean time that is slightly higher than Class A's, but Class B also has a larger standard deviation, so the two classes may not be that different. We cannot say that one class is faster than the other; they are very similar.
2. Class A: s² = 1.21 ; Class B: s² = 2.25
3. Various. Standard deviation is a measurement of variability which helps us to understand whether most values are close to or far from the typical value.
4. Although both groups from this sample have means that are below $70,000, they both have standard deviations that are relatively large, so the sample agrees with the study.
5. Men’s salaries have slightly more variation than women’s salaries, which means that men’s salaries will vary on either side of the mean a little more than women’s salaries. The histogram for men’s salaries will be more spread out horizontally than the histogram for women’s salaries.
6. Yes, this would be an unusually good salary for a female professor because it is more than two standard deviations above the typical value for women in this study.
7. If a distribution is unimodal and approximately symmetrical, then the Empirical Rule says that approximately 68% of the observations will be within one standard deviation of the mean and 95% of the observations will be within two standard deviations of the mean. Nearly all the observations will be within three standard deviations of the mean.
8. 61.9 inches to 81.3 inches of snow.
9. 52.2 inches to 91 inches of snow.
10. z ≈ 2.64 ; yes, this is an unusually large amount of snow.
11. z(algebra) ≈ 1.17, z(chemistry) ≈ 1.00 ; Julie did relatively better on the algebra exam.
12. 8.1 hours to 9.7 hours
13. Yes, this value is unusually low since it lies outside of the range that contains 95% of the observations.
14. x = 10.1 hours
15. Median house values would be more valuable since this is likely to be right-skewed data, which would pull the average up and would therefore not yield the true typical value.
16. The median would be the best measure of center, since the distribution is skewed left. The median has roughly half the observations below it, and from the histogram we see that this would be the 13th or 14th observation. This tells us that an exam score of about 19 or 20 (or 19.5) would be the median.
17. The inter-quartile range (IQR) is the most appropriate measure of variability since this is skewed data. The IQR is 7.5.
18. The histogram shows that the low score of 3 is separated from the rest of the distribution. This data point should be noted as an outlier and two analyses should be done: one including the outlier and one not including the outlier, and the results should be compared.
19. The typical cost of an evening out in Muskegon is $80.50 compared to an evening out in Grand Rapids at $93.50. The Muskegon data is left-skewed and has smaller variability than Grand Rapids. It appears that Muskegon is the more economical city for an evening out.
20. Answers will vary, but should be supported with information from the boxplots. Although, in general, it is more economical to spend an evening out in Muskegon, there is more variation in cost in Grand Rapids, meaning that there are more options for spending less (or more) on the evening out.
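For answers 16 and 17, the median and IQR of the midterm scores can be reproduced with a sketch such as the following. It assumes the quartiles are taken as the medians of the lower and upper halves of the sorted data; other quartile conventions exist and can give slightly different values. The function name is illustrative.

```python
from statistics import median


def five_number_summary(values):
    """Min, Q1, median, Q3, max, with quartiles taken as medians of the halves.

    Assumption: when the number of values is odd, the middle value is left
    out of both halves.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]        # values below the middle
    upper = data[n - half:]    # values above the middle
    return (data[0], median(lower), median(data), median(upper), data[-1])


# The twenty midterm scores listed for questions 16-18.
midterm_scores = [3, 11, 12, 13, 14, 15, 18, 18, 19, 19,
                  20, 20, 20, 21, 22, 22, 22, 23, 24, 24]

low, q1, med, q3, high = five_number_summary(midterm_scores)
iqr = q3 - q1

print(f"median = {med}, Q1 = {q1}, Q3 = {q3}, IQR = {iqr}")
```

With this convention the median is 19.5 and the IQR is 22 − 14.5 = 7.5, matching the answer key.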

Chapter 4 Test A—Multiple Choice Section 4.1 (Visualizing Variability with a Scatterplot) 1.

[Objective: Analyze a scatterplot and recognize trends] The scatterplot below shows the hat size and IQ of some adults. Is the trend positive, negative, or near zero? a. Positive b. Negative c. Near Zero

[Scatterplot; axis label: Hat Size]

[Objective: Analyze a scatter plot and recognize trends] The scatterplot below shows the number of tackles received and the number of concussions received for a team of football players for the most recent season. Choose the statement that best describes the trend. a. Teams that receive a greater number of tackles tend to have a higher number of concussions. b. Teams that receive a greater number of tackles tend to receive a lower number of concussions. c. There is no association between the number of tackles a team receives and the number of concussions.

[Scatterplot; axis labels: Number of concussions, Number of Tackles]

[Objective: Analyze a scatter plot and recognize trends] Doctors hypothesize that smoking cigarettes lowers lung capacity. To test this they measured the lung capacity (in liters) and the number of cigarettes smoked in a typical day for a sample of adults. Is the scatterplot below consistent with the researcher’s hypothesis? a. Yes, it is consistent. b. No, it contradicts this hypothesis. c. There is no evidence in support or contradiction of the hypothesis.

[Scatterplot; axis labels: Lung Capacity, Number of cigarettes smoked per day]

[Objective: Recognize the key components of a scatterplot] What key things should you look for when examining the potential linear association between two variables? a. A noticeable positive or negative trend on the scatterplot. b. The vertical spread of the points on the scatterplot which indicate the strength of an association. c. A noticeable overall linear shape on the scatterplot. d. All of the above.


Section 4.2 (Measuring Strength of Association with Correlation) 5.

[Objective: Interpret the correlation coefficient] Which of the following statements regarding the correlation coefficient is not true? a. The correlation coefficient has values that range from -1.0 to 1.0 inclusive. b. The correlation coefficient measures the strength of the linear relationship between two numerical variables. c. A value of 0.00 indicates that two variables are perfectly linearly correlated. d. All of the above are true statements.

[Objective: Interpret the correlation coefficient] The following calculator screenshots show the scatterplot and the correlation coefficient between the number of days absent and the final grade for a sample of college students in a general education statistics course at a large community college.

The relationship between “days absent” and “final grade” can be described as a. A strong positive linear relationship b. A moderate positive linear relationship c. A strong negative linear relationship d. A weak negative relationship

[Objective: Visualize the correlation coefficient] The table shows the number of minutes ridden on a stationary bike and the approximate number of calories burned. Plot the points on the grid provided then choose the most likely correlation coefficient from the answer choices below.

Minutes   30   75   65   40   55
Calories  250  500  425  300  375

[Blank grid for plotting; axis labels: Calories, Minutes]

a. -0.99 b. 0.99 c. -0.20 d. 0.20


For questions (8)-(10), match each scatterplot to one of the correlation coefficients.

[Objective: visualizing the correlation coefficient] r = 0.8787

[Objective: visualizing the correlation coefficient] r = −0.6542

10. [Objective: visualizing the correlation coefficient] r = −0.3120

Section 4.3 (Modeling Linear Trends) 11. [Objective: Understand the components of a linear regression model] Suppose it has been established that “annual income” and “Years of college” are linearly related, and that the relationship can be modeled using the following equation: Annual Income = $23,400+$7200(Years of College). In this model, “Annual Income” is the __?__ variable, and “Years of College” is the __?__ variable. The two variables have a __?__ linear relationship. a. Dependent; Independent; Positive b. Dependent; Independent; Negative c. Independent; Dependent; Positive d. Independent; Dependent; Negative 12. [Objective: Understand the components of a linear regression model] A veterinarian is going to investigate whether homes with more pets tend to have more fleas. In this scenario, the explanatory variable is _____________ and the response variable is ______________. a. Number of fleas spotted in a period of time; number of pets b. Number of pets; number of fleas spotted in a period of time c. Slope; intercept d. None of the above

Use the following information to answer questions (13)-(15). The following linear regression model can be used to predict ticket sales at a popular water park.

Ticket sales per hour = −631.25 + 11.25 ( current temperature in °F ) 13. [Objective: Interpret a linear model] What is the predicted number of tickets sold per hour if the temperature is 86 °F? Round to the nearest whole ticket. a. About 252 tickets b. About 276 tickets c. About 301 tickets d. About 336 tickets Copyright © 2013 Pearson Education, Inc.


14. [Objective: Interpret a linear model] Choose the statement that best states the meaning of the slope in this context. a. The slope tells us that if ticket sales are decreasing there must have been a drop in temperature. b. The slope tells us that a one degree increase in temperature is associated with an average increase in ticket sales of 11.25 tickets. c. The slope tells us that high temperatures are causing more people to buy tickets to the water park. d. None of the above 15. [Objective: Interpret a linear model] In this context, does the intercept have a reasonable interpretation? a. Yes, it is reasonable for people to go to a water park when it is 0 °F, so park managers might want to know how many tickets they would sell on average on a 0 °F day. b. No, at a temperature of 0 °F, ticket sales would be −631.25 and it is not reasonable (or possible) to have negative ticket sales. c. Not enough information available

16. [Objective: Use technology to find a regression equation] The data in the table represent the amount of raw material (in tons) put into an injection molding machine each day (x), and the amount of scrap plastic (in tons) that is collected from the machine every 4 weeks (y). Also shown below are the outputs from two different statistical technologies (TI-83/84 Calculator and Excel). A scatterplot of the data confirms that there is a linear association. Report the equation for predicting scrap from raw material using words such as scrap, not x and y. State the slope and intercept of the prediction equation. Round all calculations to the nearest hundredth.

x  2.71 2.33 2.33 2.21 2.11 2.08 1.98 1.95 1.84 1.84
y  3.61 2.80 2.77 2.34 2.15 2.06 2.02 1.95 1.73 1.68

a. scrap = 2.19 − 2.38 ( raw material ); slope = 2.19 and the intercept is −2.38.

b. scrap = 2.19 − 2.38 ( raw material ); slope = −2.38 and the intercept is 2.19.

c. scrap = −2.38 + 2.19 ( raw material ); slope = −2.38 and the intercept is 2.19.

d. scrap = −2.38 + 2.19 ( raw material ); slope = 2.19 and the intercept is −2.38.


17. [Objective: Understand the components of a linear regression model] A horticulturist conducted an experiment on 110 thirty-six inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of tomatoes harvested from the plants. The average amount of plant food given was 27.8 milliliters with a standard deviation of 2.1 milliliters. The average number of tomatoes harvested was 7.5 with a standard deviation of 1.5. The correlation coefficient was 0.7691. Use the information to calculate the slope of the linear model that predicts the number of tomatoes harvested from the amount of plant food given. Show your work and round to the nearest hundredth. a. -7.50 b. 1.08 c. 0.55 d. The slope cannot be determined without the actual data.

Section 4.4 (Evaluating the Linear Model)

18. [Objective: Critically evaluate a regression model] The following model was created to show the association between the number of massages received per month and self-predicted stress level: Stress level = 10 − 0.02 ( number of massages per month ) The coefficient of determination for the model is 0.066 or 6.6%. Choose the true statement regarding this model. a. The model shows that getting even one massage a month will decrease your stress level. b. The model shows that on average a person getting no massages will always have a stress level of 10. c. The model shows that there is a negative association between stress level and number of monthly massages, but the relatively small coefficient of determination suggests that very little of the variation is explained by the model, so the model is probably not very good at predicting stress level. d. None of the above.

19. [Objective: Calculate and interpret the coefficient of determination] In the NBA, the correlation between “steals per game” and “blocked shots per game” is found to be 0.8045. Choose the statement that is true about the coefficient of determination. a. The coefficient of determination, r², is equal to approximately 0.6472. b. The coefficient of determination states that about 64.72% of the variation in blocked shots per game is explained by steals per game. c. When given as a percent, the coefficient of determination is always between 0 and 100%. d. All of the above are true statements.


20. [Objective: Critically evaluate a regression model] It is determined that a positive linear association exists between age (for children between the ages of 3 and 9 years) and attention span (measured in minutes). The scatterplot below shows the association. The prediction equation is also given. A college instructor uses the model to predict the attention span of the students in her class who have an average age of 29. Choose the best statement to summarize why this is not an appropriate use for the model.

attention span = 4.68 + 3.40 (age)

a. This is an inappropriate use of the model because the model was used to make predictions beyond the scope of the data. The college instructor is extrapolating. b. This is an inappropriate use of the model because a 29-year-old person would be an outlier in this context. c. This is an inappropriate use of the model because age does not cause attention span to increase. Correlation does not mean causation.

Chapter 4 Test A—Answer Key

1. C 2. A 3. A 4. D 5. C 6. C 7. B 8. B 9. C 10. A 11. A 12. B 13. D 14. B 15. B 16. D 17. C 18. C 19. D 20. A
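The computational items in this key (questions 17, 19, and 20) can be checked with a few lines of Python. The sketch below is illustrative only: it assumes the usual least-squares relationship slope = r(s_y / s_x), and the function names are hypothetical rather than taken from the text.

```python
def slope_from_summary(r, s_x, s_y):
    """Slope of the least-squares line from r and the two standard deviations."""
    return r * s_y / s_x


def coefficient_of_determination(r):
    """Square of the correlation coefficient."""
    return r ** 2


# Question 17: plant food in milliliters (x) and tomatoes harvested (y).
q17_slope = slope_from_summary(r=0.7691, s_x=2.1, s_y=1.5)
print(f"Q17 slope: {q17_slope:.2f}")        # prints "Q17 slope: 0.55"

# Question 19: steals per game and blocked shots per game.
q19_r2 = coefficient_of_determination(0.8045)
print(f"Q19 r^2: {q19_r2:.4f}")             # prints "Q19 r^2: 0.6472"

# Question 20: plugging age 29 into a model built on ages 3 to 9.
q20_prediction = 4.68 + 3.40 * 29
print(f"Q20 prediction: {q20_prediction:.2f} minutes")
```

The last value, roughly 103 minutes, shows why extrapolating far outside the 3-to-9 age range gives a prediction the data cannot support.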


Chapter 4 Test B—Multiple Choice

Section 4.1 (Visualizing Variability with a Scatterplot)

1. [Objective: Analyze a scatterplot and recognize trends] Doctors believe that smoking cigarettes lowers lung capacity. To test this, they measured the lung capacity (in liters) and the number of cigarettes smoked in a typical day for a sample of adults. Is the scatterplot below consistent with the doctors’ hypothesis? a. Yes, it is consistent. b. No, it contradicts this hypothesis. c. There is no evidence in support or contradiction of the hypothesis.

[Scatterplot: lung capacity (vertical axis) versus number of cigarettes smoked per day (horizontal axis)]

2. [Objective: Analyze a scatterplot and recognize trends] The scatterplot below shows the hat size and IQ of some adults. Is the trend positive, negative, or near zero? a. Positive b. Negative c. Near zero

[Scatterplot: IQ versus hat size (horizontal axis)]

3. [Objective: Analyze a scatterplot and recognize trends] The scatterplot below shows the number of tackles received and the number of concussions received for a team of football players for the most recent season. Choose the statement that best describes the trend. a. Teams that receive a greater number of tackles tend to have a higher number of concussions. b. Teams that receive a greater number of tackles tend to receive a lower number of concussions. c. There is no association between the number of tackles a team receives and the number of concussions.

[Scatterplot: number of concussions (vertical axis) versus number of tackles (horizontal axis)]

4. [Objective: Describe an association between two variables] Choose the best statement to summarize the association shown between hat size and IQ in the scatterplot from question (2). a. As hat size increases, IQ scores tend to increase. b. As hat size increases, IQ scores tend to decrease. c. The scatterplot does not show a trend that would indicate an association between hat size and IQ scores. d. Hat size causes IQ to increase.


Section 4.2 (Measuring Strength of Association with Correlation)

5. [Objective: Interpret the correlation coefficient] Which of the following statements regarding the correlation coefficient is not true? a. A correlation coefficient value of 0.00 indicates that two variables have no linear correlation at all. b. The correlation coefficient measures the strength of the linear relationship between two numerical variables. c. The correlation coefficient has values that range from -1.0 to 1.0 inclusive. d. All of the above are true statements.

6. [Objective: Interpret the correlation coefficient] The following calculator screenshots show the scatterplot and the correlation coefficient between car weight and car length for a sample of 2009 model year cars. The relationship between “car length” and “car weight” can be described as a. A strong positive linear relationship b. A moderate positive linear relationship c. A strong negative linear relationship d. A weak negative relationship

7. [Objective: Calculate the correlation coefficient] The table shows the number of minutes ridden on a stationary bike and the approximate number of calories burned. Plot the points on the grid provided, then choose the most likely correlation coefficient from the answer choices below.

Minutes:   30   80   60   40   55
Calories:  200  500  425  250  450

[Grid for plotting: Minutes (horizontal axis), Calories (vertical axis, 100 to 600)]

a. 0.99 b. -0.99 c. -0.30 d. 0.30


For questions (8)-(10), match each scatterplot to one of the correlation coefficients.

8. [Objective: Visualizing the correlation coefficient] r = −0.6542

9. [Objective: Visualizing the correlation coefficient] r = −0.3526

10. [Objective: Visualizing the correlation coefficient] r = 0.8670

Section 4.3 (Modeling Linear Trends)

11. [Objective: Understand the components of a linear regression model] Suppose it has been established that “home value” and “years of college” are linearly related, and that the relationship can be modeled using the following equation: Home value = $75,000 + $12,500 (years of college). In this model, “years of college” is the __?__ variable, and “home value” is the __?__ variable. The two variables have a __?__ linear relationship. a. Dependent; Independent; Positive b. Dependent; Independent; Negative c. Independent; Dependent; Positive d. Independent; Dependent; Negative

12. [Objective: Understand the components of a linear regression model] A concert ticket agent is going to investigate whether an increase in money spent on radio advertisements for a particular venue tends to lead to more concert ticket sales. In this scenario, the response variable is _____________ and the explanatory variable is ______________. a. Amount of radio advertisement money spent; concert ticket sales b. Concert ticket sales; amount of radio advertisement money spent c. Slope; intercept d. None of the above

Use the following information to answer questions (13)-(15). The following linear regression model can be used to predict ticket sales at a popular water park.

Ticket sales per hour = −631.25 + 11.25 (current temperature in °F)

13. [Objective: Interpret a linear model] What is the predicted number of tickets sold per hour if the temperature is 79°F? Round to the nearest whole ticket. a. About 258 tickets b. About 257 tickets c. About 250 tickets d. About 310 tickets


14. [Objective: Interpret a linear model] In this context, does the intercept have a reasonable interpretation? a. Yes, it is reasonable for people to go to a water park when it is 0°F, so park managers might want to know how many tickets they would sell on average on a 0°F day. b. No, at a temperature of 0°F, ticket sales would be −631.25 and it is not reasonable (or possible) to have negative ticket sales. c. Not enough information available

15. [Objective: Interpret a linear model] Choose the statement that best states the meaning of the slope in this context. a. The slope tells us that high temperatures are causing more people to buy tickets to the water park. b. The slope tells us that if ticket sales are decreasing there must have been a drop in temperature. c. The slope tells us that a one degree increase in temperature is associated with an average increase in ticket sales of 11.25 tickets. d. None of the above

16. [Objective: Use technology to find a regression equation] The data in the table represent the amount of pressure (psi) exerted by a stamping machine (x), and the amount of scrap brass shavings (in pounds) that are collected from the machine each hour (y). Also shown below are the outputs from two different statistical technologies (TI-83/84 Calculator and Excel). A scatterplot of the data confirms that there is a linear association. Report the equation for predicting scrap brass shavings using words such as scrap, not x and y. State the slope and intercept of the prediction equation. Round all calculations to the nearest thousandth.

x:  2.00   7.80  14.51   2.80   4.01   6.21  11.84   5.11  11.67   8.70
y:  2.30  15.14  28.65   4.15   6.35  10.52  24.05   8.75  22.22  17.02

a. scrap = 2.134 − 2.019 (pressure); slope = −2.019 and the intercept is 2.134.

b. scrap = −2.019 + 2.134 (pressure); slope = −2.019 and the intercept is 2.134.

c. scrap = 2.134 − 2.019 (pressure); slope = 2.134 and the intercept is −2.019.

d. scrap = −2.019 + 2.134 (pressure); slope = 2.134 and the intercept is −2.019.


17. [Objective: Understand the components of a linear regression model] A horticulturist conducted an experiment on 140 thirty-six-inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of habanero peppers harvested from the plants. The average amount of plant food given was 17.8 milliliters with a standard deviation of 0.7 milliliters. The average number of habanero peppers harvested was 6.5 with a standard deviation of 1.5. The correlation coefficient was 0.8123. Use the information to calculate the slope of the linear model that predicts the number of habanero peppers harvested from the amount of plant food given. Show your work and round to the nearest hundredth. a. -1.50 b. 1.74 c. 0.38 d. The slope cannot be determined without the actual data.

Section 4.4 (Evaluating the Linear Model)

18. [Objective: Critically evaluate a regression model] The following model was created to show the association between the number of massages received per month and self-predicted stress level: Stress level = 10 − 0.02 (number of massages per month). The coefficient of determination for the model is 0.066 or 6.6%. Choose the true statement regarding this model. a. The model shows that there is a negative association between stress level and number of monthly massages, but the relatively small coefficient of determination suggests that very little of the variation is explained by the model, so the model is probably not very good at predicting stress level. b. The model shows that getting even one massage a month will decrease your stress level. c. The model shows that on average a person getting no massages will always have a stress level of 10. d. All of the above are true statements.

19. [Objective: Calculate and interpret the coefficient of determination] In the NHL, the correlation between “goals scored per game” and “minutes on the ice” for a team of players is found to be 0.8178. Choose the statement that is true about the coefficient of determination. a. When given as a percent, the coefficient of determination is always between 0 and 100%. b. The coefficient of determination, r², is equal to approximately 0.6688. c. The coefficient of determination states that about 66.88% of the variation in goals scored per game is explained by minutes on the ice. d. All of the above are true statements.


20. [Objective: Critically evaluate a regression model] In certain urban regions, it can be shown that there is a strong correlation between the number of bars in the region and the crime rate in the region. Choose the best answer. a. This shows that bars cause crime. b. This shows that crimes cause bars. c. Concluding causation from these data is not appropriate because the study is observational.

Chapter 4 Test B—Answer Key

1. A 2. C 3. A 4. C 5. D 6. B 7. A 8. C 9. A 10. B 11. C 12. B 13. A 14. B 15. C 16. D 17. B 18. A 19. D 20. C
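The computational items in this key (questions 7, 13, 16, and 17) can be checked with a short Python sketch. It is illustrative only: `least_squares` is a hypothetical helper written for this check, using the standard sums-of-squares formulas rather than the TI-83/84 or Excel output mentioned in question 16.

```python
from math import sqrt


def least_squares(x, y):
    """Return (intercept, slope, r) for the least-squares line predicting y from x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope, s_xy / sqrt(s_xx * s_yy)


# Question 7: minutes on a stationary bike and calories burned.
minutes = [30, 80, 60, 40, 55]
calories = [200, 500, 425, 250, 450]
_, _, q7_r = least_squares(minutes, calories)
print(f"Q7 r: {q7_r:.2f}")

# Question 13: predicted ticket sales per hour at 79 degrees Fahrenheit.
q13_tickets = -631.25 + 11.25 * 79
print(f"Q13 tickets: {q13_tickets}")        # prints "Q13 tickets: 257.5"

# Question 16: pressure in psi (x) and scrap brass shavings in pounds (y).
pressure = [2.00, 7.80, 14.51, 2.80, 4.01, 6.21, 11.84, 5.11, 11.67, 8.70]
scrap = [2.30, 15.14, 28.65, 4.15, 6.35, 10.52, 24.05, 8.75, 22.22, 17.02]
q16_intercept, q16_slope, _ = least_squares(pressure, scrap)
print(f"Q16: scrap = {q16_intercept:.3f} + {q16_slope:.3f} (pressure)")

# Question 17: slope = r * (s_y / s_x) from the summary statistics.
q17_slope = 0.8123 * 1.5 / 0.7
print(f"Q17 slope: {q17_slope:.2f}")        # prints "Q17 slope: 1.74"
```

For question 7 the computed correlation is about 0.94, so 0.99 is the closest of the listed choices. For question 13 the model gives 257.5, which the key rounds up to 258.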


Chapter 4 Test C—Short Answer

Provide an appropriate response.

Section 4.1 (Visualizing Variability with a Scatterplot)

1. [Objective: Analyze a scatterplot and recognize trends] The scatterplot below shows the ice cream sales and daily high temperatures for a three-week period during the summer. Does there appear to be an association between these two variables? If so, describe the pattern. Be sure to comment on trend, shape, and the strength of the association.

2. [Objective: Analyze a scatterplot and recognize trends] The scatterplot below shows the number of alcoholic drinks consumed and memory test results for some college students. Is there an association? If so, describe the pattern. Be sure to comment on trend, shape, and the strength of the association.

[Scatterplot: memory test results (vertical axis) versus number of alcoholic drinks consumed (horizontal axis)]

3. [Objective: Analyze a scatterplot and recognize trends] Based on the scatterplots below, what is the better predictor for head circumference: height or shoulder width? Explain how you made your decision.

[Two scatterplots: head circumference versus height, and head circumference versus shoulder width]


Section 4.2 (Measuring Strength of Association with Correlation)

Use the data provided in the table below to answer questions (4)-(6). The table shows city size and annual grocery expenditures for eight families. City size is in thousands and expenditures are in hundreds of dollars.

City Size (in thousands):                        30   50   75  100  150  200  175  120
Grocery Expenditures (in hundreds of dollars):   65   77   79   80   82   90   84   81

4. [Objective: Verify conditions for a linear model] Using the data from the table, sketch a scatterplot (by hand or with the aid of technology) of the data. Describe any association that you see. Would it be appropriate to fit a linear model to this data?

5. [Objective: Estimate the correlation coefficient] Based on the scatterplot, estimate the correlation coefficient between city size and expenditures for these eight families. Explain your reasons for choosing your estimate.

6. [Objective: Estimate the correlation coefficient] Suppose each of these families is given a grocery credit of $100, therefore reducing expenditures in the table by one unit (since this variable was recorded in hundreds of dollars). Estimate the new correlation with city size. What happens to the correlation when a constant is added (in this case −100 dollars is added to each number)? Explain your reasoning.


Section 4.3 (Modeling Linear Trends)

7. [Objective: Understand the components of a linear regression model] A horticulturist conducted an experiment on 120 thirty-six-inch flower boxes to see if the amount of plant food given to the flower boxes was associated with the number of blooms on the plants. The average amount of plant food given was 31.6 milliliters with a standard deviation of 2.2 milliliters. The average number of blooms was 10.5 with a standard deviation of 1.5. The correlation coefficient was 0.8891. Use the information to calculate the slope of the linear model that predicts the number of blooms from the amount of plant food given. Show your work and round to the nearest hundredth. Write a sentence explaining the meaning of the slope in this context.

8. [Objective: Understand the components of a linear regression model] The following regression equation was found to model commute distance (in miles) and number of minor accidents per year for a group of adults. Identify the independent variable and the dependent variable.

Minor accidents per year = 1.204 + 0.024 (commute distance)

9. In the previous model (question 8), what is the intercept? What is the interpretation of the intercept in this context? Does it make sense to interpret the intercept in this context?


10. [Objective: Understand the function of the regression line] Suppose that runner height (in inches) and finish time in a 5k (in seconds) have a linear association, as evidenced by a scatterplot showing a roughly linear pattern. Explain the purpose of finding the equation for the regression line that will relate runner height and finish time. What are the limitations of the regression line? Explain the meaning of the slope and intercept of such a model.

Use the following information to answer questions (11)-(13). A scatterplot shows that city size of residence and number of children born to married couples have a decreasing linear trend; that is, on average, as the size of the city where a married couple lives increases, the number of children that the couple has decreases. The correlation is −0.767 and the regression equation is Predicted number of children = 5.2 − 0.08 (city size). The variable “city size” was recorded in thousands of people. It should be noted that this model is only useful for relatively small cities with populations less than 65,000 people.

11. [Objective: Apply a linear model] Use the regression equation to predict how many children a married couple will have if they live in a city with a population of 40,000 people.

12. [Objective: Understand the components of a linear regression model] State the slope and intercept of the regression line and explain each in context. Be sure to explain whether the intercept has a practical meaning in this context.

13. [Objective: Understand the components of a linear regression model] State the explanatory variable and the response variable.


14. [Objective: Apply a linear model] Suppose that environmentalists monitor algae levels in a river and determine that there is a linear association between local rainfall and algae levels. They determine that the best fitting linear model to predict algae cell counts per milliliter from rainfall in inches is: Algae cell count = 229.32 + 79.81 (rainfall in inches), with r² = 0.604. How will a rainfall that is 3 inches above average affect the algae cell count?

Use the following information to answer questions (15) and (16). A scatterplot of data from a large sample of adult women shows that height in inches and weight in pounds have a linear association. Shown below are the outputs from two different statistical technologies (TI-83/84 Calculator and Excel).

15. [Objective: Finding regression equations] Report the equation for predicting weight in pounds from height in inches using words such as weight, not x and y.

16. [Objective: Apply a linear model] Height and weight charts for women show that a woman who is 71 inches tall has a target weight between 135 and 176 pounds. Would the regression model you found for the large sample of adult women (in question 15) place a woman who was 71 inches tall within this range?


Section 4.4 (Evaluating the Linear Model)

17. [Objective: Evaluate a linear model] Suppose that in the Midwest, it is shown that there is a trend between home insurance claims in dollars and the number of cattle owned by the homeowner. The trend shows that higher claims were paid out to homeowners who owned more cattle. Does this trend prove that owning more cattle causes higher insurance claims? Be sure to explain your reasoning; don’t just answer yes or no. What is a potential hidden variable in this context?

18. [Objective: Evaluate a linear model] Explain in your own words what extrapolation is and give an example. Why should extrapolation be avoided when doing regression analysis?

19. [Objective: Calculate and interpret the coefficient of determination] If the correlation between whole milk content per serving and calories per serving for several brands of ice cream is 0.71, report the coefficient of determination (rounded to the nearest tenth of a percent) and explain what it means using a complete sentence. Assume that whole milk content is the predictor and calories per serving is the response, and assume that the association between whole milk content and calories is linear.


20. [Objective: Evaluate the effect of influential points] The figures below show the relationship between salary and personal lunch expenses on weekdays for a group of businessmen. Comment on the difference in the graphs and in the coefficient of determination between the graph that includes a data point for someone who reported earnings of $21,000 per year and personal lunch expenses of $100 per week (second graph) and the graph that does not include this data point (first graph).

[Figure: two scatterplots of weekly lunch expenses versus annual salary (in thousands of dollars); the reported coefficients of determination are r² = 8.2% and r² = 78.4%]


Chapter 4 Test C—Answer Key

1. Days with higher temperatures are associated with a greater number of ice cream cones sold (a positive trend). The shape of the trend is roughly linear and the trend is fairly strong.
2. Consuming more alcoholic drinks is negatively associated with memory test results. The shape of the trend is roughly linear and the trend is fairly strong.
3. Shoulder width; shoulder width has a stronger relationship with head circumference, as shown by the scatterplot where points are less scattered in the vertical direction.
4. The scatterplot shows a positive trend that appears to be linear. It would be appropriate to fit a linear model to this data.
5. The calculated correlation coefficient is 0.890, so students’ estimates should be within a reasonable range of 0.890.
6. The estimate for r should not change; it should be the same as the estimate in question (5). Adding a constant to one of the numerical variables does not affect the correlation coefficient.
7. slope = r(s_y / s_x) = 0.8891(1.5 / 2.2) ≈ 0.61. On average, for every one milliliter increase in plant food there are 0.61 additional blooms.
8. The independent variable is commute miles; the dependent variable is minor accidents per year.
9. The intercept is 1.204. This suggests that people who drive but do not commute tend to have about 1.2 minor accidents per year.
10. The purpose of the regression line is to make predictions about finish times from runner height. The regression line should not be used to make predictions beyond the range of the data. This means the values put into the model for runner height must be reasonable. The intercept would not make sense in this model since runner height cannot be zero. The regression line will provide an average rate of change in finish time given a one inch change in runner height.
11. 2 children
12. The slope is −0.08, which means that each additional thousand people in city size is associated with 0.08 fewer children on average. The intercept is 5.2, which means that couples living in a city of population zero are predicted to have 5.2 children. This does not have a practical meaning in this context.
13. The explanatory variable is city size; the response variable is number of children.
14. The algae cell count will be about 240 cells higher than the average algae cell count.
15. Weight = −23.87 + 2.58 (height)
16. Yes (159.31 pounds)
17. No, correlation does not imply causation; hidden variables will vary (a possible hidden variable could be the higher occurrence of tornadoes in the Midwest).
18. Extrapolation is making predictions based on values of the independent variable that are far beyond the range of reasonable values in the data. Examples will vary.
19. 50.4%; the coefficient of determination of 50.4% means that 50.4% of the variation in calories per serving was explained by whole milk content.
20. The coefficient of determination is 8.2% when the data point is included and 78.4% when it is not included. This indicates that this data point is influential.
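The numerical answers in this key can be reproduced with a short Python sketch. It is illustrative only: `correlation` is a hypothetical helper implementing the usual Pearson formula, and the variable names are chosen here rather than taken from the text.

```python
from math import sqrt


def correlation(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    s_xx = sum((a - x_bar) ** 2 for a in x)
    s_yy = sum((b - y_bar) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


city_size = [30, 50, 75, 100, 150, 200, 175, 120]   # thousands of people
groceries = [65, 77, 79, 80, 82, 90, 84, 81]        # hundreds of dollars

# Question 5: correlation between city size and grocery expenditures.
q5_r = correlation(city_size, groceries)
# Question 6: subtract one unit ($100) from every expenditure and recompute.
q6_r = correlation(city_size, [g - 1 for g in groceries])
print(f"Q5 r: {q5_r:.3f}, Q6 r after the credit: {q6_r:.3f}")

q7_slope = 0.8891 * 1.5 / 2.2          # slope = r * (s_y / s_x)
q11_children = 5.2 - 0.08 * 40         # city size of 40 (thousands)
q14_change = 79.81 * 3                 # change for 3 extra inches of rain
q16_weight = -23.87 + 2.58 * 71        # predicted weight at 71 inches
q19_percent = 100 * 0.71 ** 2          # coefficient of determination, percent

print(f"Q7 {q7_slope:.2f}, Q11 {q11_children:.1f}, Q14 {q14_change:.2f}, "
      f"Q16 {q16_weight:.2f}, Q19 {q19_percent:.1f}%")
```

Shifting every expenditure by the same amount moves the point cloud without changing its shape, which is why the two correlations in the first printed line agree.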

Chapter 5 Test A—Multiple Choice

Section 5.1 (What is Randomness?)

1. [Objective: Understand the meaning of probability] Which of the following statements is not true about probability? a. A probability of zero means that an event will not happen, a probability of one means that an event is certain to happen. b. Probability is used to measure how often random events occur. c. Probabilities are always numbers between 0 and 1 inclusive. d. All of the above are true statements

2. [Objective: Understand simulations of random events] If 20 babies are born, how often are there 8 or fewer male babies? Assume that the gender of a baby is a random event. Which of the following experiments would not simulate this situation? a. Flip a coin twenty times. Designate a head to mean “female” and a tail to mean “male”. b. Roll a die twenty times. Designate a 1, 2, or 3 to mean “female” and a 4, 5, or 6 to mean “male”. c. Choose the first twenty digits from a row in the random number table. Designate odd numbers to mean “female” and even numbers to mean “male”. d. All of the above will simulate the gender of twenty babies.

3. [Objective: Understand the difference between empirical and theoretical probabilities] Is the following an example of theoretical probability or empirical probability? A card player declares that there is a one in thirteen chance that the next card pulled from a well-shuffled, full deck will be a queen. a. Theoretical b. Empirical

4. [Objective: Understand the difference between empirical and theoretical probabilities] Is the following an example of theoretical probability or empirical probability? A homeowner notes that five out of seven days the newspaper arrives before 5 pm. He concludes that the probability that the newspaper will arrive before 5 pm tomorrow is about 71%. a. Theoretical b. Empirical

Section 5.2 (Finding Theoretical Probabilities)

5. [Objective: Calculate the probability of the complement of an event] The National Center for Health Statistics has found that there is a 0.41% chance that an American citizen will die from falling. What is the probability that you will not die from a fall? (Round to the nearest hundredth of a percent.) a. 99.59% b. 93.31% c. 59.00% d. Can’t be determined with the given information.

6. [Objective: Using a Venn Diagram to visualize events] The Venn diagram below depicts gender and occupation of a sample of adults. Which region on the Venn diagram represents the event “The individual is a male nurse”?

[Venn diagram: circles labeled Female and Nurse, with regions numbered 1 through 4]

a. Region 1 b. Region 2 c. Region 3 d. Region 4

Use the following table to answer questions (7)-(13). A random sample of college students was asked to respond to a survey about how they spend their free time on weekends. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Saturday morning. The activity choices were homework, housework, outside employment, recreation, or other.

                      Male   Female   Total
Homework               29      18       47
Housework              15      17       32
Outside Employment     20      26       46
Recreation             23      39       62
Other                   9       4       13
Total                  96     104      200

7. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student is female? a. 0.50 b. 0.48 c. 0.52 d. None of the above

8. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student chose “outside employment” as their most likely activity on a Saturday morning? a. 0.13 b. 0.23 c. 0.43 d. None of the above

9. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student is female and chose “outside employment” as their most likely activity on a Saturday morning? a. 0.13 b. 0.23 c. 0.43 d. None of the above


10. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student chose “homework” or “housework” as their most likely activity on a Saturday morning? a. 0.395 b. 0.145 c. 0.075 d. None of the above

11. [Objective: Apply the probability rules] What is the probability that a randomly chosen survey respondent is male or chose “recreation” as their most likely activity on Saturday mornings? a. 0.790 b. 0.480 c. 0.675 d. None of the above

12. [Objective: Apply the probability rules] Find the probability that a female college student from the group chose “housework” as their most likely activity on Saturday mornings. (Round to the nearest thousandth.) a. 0.531 b. 0.520 c. 0.163 d. None of the above

13. [Objective: Understand mutually exclusive events] Which of the following are mutually exclusive events? a. Student is male and student chose “housework” as their most likely activity on Saturday mornings. b. Student is female and student chose “housework” as their most likely activity on Saturday mornings. c. Student is male and student chose “outside employment” as their most likely activity on Saturday mornings. d. Student chose “recreation” and student chose “other” as their most likely activity on Saturday mornings.

Section 5.3 (Associations in Categorical Variables)

14. [Objective: Differentiate between independent and associated events] Use your intuition to decide whether the following events are likely to be independent or associated. Event A: A randomly selected person is married with no children. Event B: A randomly selected person opposes a tax credit for children. a. Associated b. Independent

15. [Objective: Differentiate between independent and associated events] Use your intuition to decide whether the following events are likely to be independent or associated. Event A: The randomly selected carton of milk you purchased from the store is sour. Event B: Your car won’t start on a randomly selected morning. a. Associated b. Independent


Use the following information to answer questions (16)-(18). Suppose that a recent poll of American households about pet ownership found that for households with one pet, 39% owned a dog, 33% owned a cat, and 7% owned a bird. Suppose that three households are selected randomly and with replacement.

16. [Objective: Apply the probability rules] What is the probability that all three randomly selected households own a dog? (Round to the nearest hundredth) a. 0.39 b. 0.06 c. 0.23 d. None of the above

17. [Objective: Apply the probability rules] What is the probability that none of the three randomly selected households own a cat? (Round to the nearest hundredth) a. 0.70 b. 0.46 c. 0.30 d. None of the above

18. [Objective: Apply the probability rules] What is the probability that at least one of the three randomly selected households owns a bird? (Round to the nearest hundredth) a. 0.20 b. 0.40 c. 0.60 d. 0.80

19. [Objective: Apply the probability rules] A true/false pop quiz contains five questions. What is the probability that when guessing, a student will get at least one question correct? (Round to the nearest hundredth) a. 0.50 b. 0.97 c. 0.76 d. 1.00

Section 5.4 (Finding Empirical Probabilities with Simulations)

20. [Objective: Understand the Law of Large Numbers] Which of the following statements is true about the Law of Large Numbers (LLN)? a. The LLN is almost always true, but there are special occasions, even when outcomes are random, when the LLN can be broken. b. The LLN states that the empirical probability that is observed will simulate the theoretical probability that is expected for any finite number of trials, so a simulation or experiment need not have an excessive number of trials. c. The LLN states that if you simulate or conduct an experiment or simulation enough times the empirical probability observed will always match the theoretical probability that is expected. d. The LLN states that if an experiment with a random outcome is repeated a large number of times, the empirical probability that is observed is likely to be close to the theoretical probability.
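The relationship between empirical and theoretical probabilities in this section can be explored with a small simulation of the situation in question 2 (20 births, counting how often 8 or fewer are male). This is a minimal Python sketch, not part of the original test: it assumes each birth is male with probability 0.5, and the function name, seed, and trial counts are arbitrary choices made for illustration.

```python
import random
from math import comb


def estimate_probability(trials, births=20, max_males=8, seed=1):
    """Empirical probability that at most `max_males` of `births` babies are male."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    hits = 0
    for _ in range(trials):
        males = sum(rng.random() < 0.5 for _ in range(births))
        if males <= max_males:
            hits += 1
    return hits / trials


# Theoretical probability from the binomial distribution with p = 0.5:
# P(X <= 8) = (C(20,0) + ... + C(20,8)) / 2^20.
exact = sum(comb(20, k) for k in range(9)) / 2 ** 20

for trials in (100, 10_000, 100_000):
    print(trials, round(estimate_probability(trials), 4), round(exact, 4))
```

With only 100 trials the estimate can miss the theoretical value (about 0.2517) noticeably; with many more trials it is likely, though never guaranteed, to land close to it.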

Chapter 5 Test A—Answer Key

1. D 2. D 3. A 4. B 5. A 6. C 7. C 8. B 9. A 10. A 11. C 12. C 13. D 14. A 15. B 16. B 17. C 18. A 19. B 20. D
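The probability answers for questions 7 through 12 and 16 through 19 can be reproduced from the survey table and the pet-ownership percentages. The Python sketch below is illustrative only; the dictionary layout and variable names are assumptions made for this check, not part of the test.

```python
# Counts from the Saturday-morning activity table: activity -> (male, female).
counts = {
    "homework": (29, 18),
    "housework": (15, 17),
    "outside employment": (20, 26),
    "recreation": (23, 39),
    "other": (9, 4),
}
total = sum(m + f for m, f in counts.values())
males = sum(m for m, _ in counts.values())
females = sum(f for _, f in counts.values())

p_female = females / total                                        # question 7
p_employment = sum(counts["outside employment"]) / total          # question 8
p_female_and_employment = counts["outside employment"][1] / total # question 9
# Question 10: the two activities are mutually exclusive, so the counts add.
p_homework_or_housework = (
    sum(counts["homework"]) + sum(counts["housework"])
) / total
# Question 11: P(A or B) = P(A) + P(B) - P(A and B).
p_male_or_recreation = (
    males + sum(counts["recreation"]) - counts["recreation"][0]
) / total
# Question 12: conditional probability, restricted to female respondents.
p_housework_given_female = counts["housework"][1] / females

# Questions 16-19: independent selections, so probabilities multiply.
p_all_three_dogs = 0.39 ** 3
p_no_cats = (1 - 0.33) ** 3
p_at_least_one_bird = 1 - (1 - 0.07) ** 3
p_at_least_one_correct = 1 - 0.5 ** 5

print(p_female, p_employment, p_female_and_employment)
print(p_homework_or_housework, p_male_or_recreation,
      round(p_housework_given_female, 3))
print(round(p_all_three_dogs, 2), round(p_no_cats, 2),
      round(p_at_least_one_bird, 2), round(p_at_least_one_correct, 2))
```

The "at least one" items use the complement rule: it is easier to compute the probability of no successes and subtract it from 1 than to add up every case with one or more successes.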


Chapter 5 Test B—Multiple Choice

Section 5.1 (What is Randomness?)

1. [Objective: Understand the meaning of probability] Which of the following statements is not true about probability in general? a. Probability is used to measure how often random events occur. b. Probabilities are always numbers between 0 and 1 inclusive. c. A probability of 0 means that an event will not occur and a probability of 1 means that an event is certain to occur. d. All of the above are true statements

2. [Objective: Understand simulations of random events] If 20 babies are born, how often are there 12 or more female babies? Assume that the gender of a baby is a random event. Which of the following experiments would not simulate this situation? a. Choose the first twenty digits from a row in the random number table. Designate odd numbers to mean “female” and even numbers to mean “male”. b. Flip a coin twenty times. Designate a head to mean “female” and a tail to mean “male”. c. Roll a die twenty times. Designate a 1, 2, or 3 to mean “female” and a 4, 5, or 6 to mean “male”. d. All of the above will simulate the gender of twenty babies.

[Objective: Understand the difference between empirical and theoretical probabilities] Is the following an example of theoretical probability or empirical probability? A fisherman notes that eight out of ten times that he uses a certain lure he catches a fish within an hour. He concludes that the probability that the lure will catch a fish on his next fishing trip is about 80%. a. Theoretical b. Empirical

[Objective: Understand the difference between empirical and theoretical probabilities] Is the following an example of theoretical probability or empirical probability? At a carnival shell game the player can pay three dollars and choose the shell that he or she believes is hiding the prize. There are four shells that are thoroughly mixed up after each guess. The player concludes that there is a one in four chance of randomly picking the winning shell. a. Theoretical b. Empirical

Section 5.2 (Finding Theoretical Probabilities) 5.

[Objective: Calculate the probability of the complement of an event] The National Center for Health Statistics has found that there is a 5.01% chance that an American citizen will die from an accident (unintentional injury). What is the probability that you will not die from an accident? (Round to the nearest hundredth of a percent.) a. 95.00% b. 99.50% c. 94.99% d. Can’t be determined with the given information.

6. [Figure: Venn diagram with sets labeled Female and Nurse and areas marked Region 1, Region 2, Region 3, and Region 4]

a. Region 1 b. Region 2 c. Region 3 d. Region 4

Use the following table to answer questions (7)-(13). A random sample of college students was asked to respond to a survey about how they spend their free time on weekends. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Saturday morning. The activity choices were homework, housework, outside employment, recreation, or other.

                      Male   Female   Total
Homework               29      18       47
Housework              15      17       32
Outside Employment     20      26       46
Recreation             23      39       62
Other                   9       4       13
Total                  96     104      200

[Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student is male? a. 0.50 b. 0.48 c. 0.52 d. None of the above

[Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student chose “recreation” as their most likely activity on a Saturday morning? a. 0.310 b. 0.195 c. 0.115 d. None of the above

[Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student is male and chose “outside employment” as their most likely activity on a Saturday morning? a. 0.13 b. 0.23 c. 0.10 d. None of the above


10. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student chose “recreation” or “other” as their most likely activity on a Saturday morning? a. 0.395 b. 0.375 c. 0.275 d. None of the above

11. [Objective: Apply the probability rules] If one student is randomly chosen from the group, what is the probability that the student is female or chose “homework” as their most likely activity on a Saturday morning? a. 0.665 b. 0.900 c. 0.755 d. None of the above

12. [Objective: Apply the probability rules] Find the probability that a male college student from the group chose “homework” as their most likely activity on Saturday mornings. (Round to the nearest thousandth.) a. 0.302 b. 0.156 c. 0.145 d. None of the above

13. [Objective: Understand mutually exclusive events] Which of the following are mutually exclusive events? a. Student is female and student chose “housework” as their most likely activity on Saturday mornings. b. Student is male and student chose “housework” as their most likely activity on Saturday mornings. c. Student is female and student chose “outside employment” as their most likely activity on Saturday mornings. d. Student chose “outside employment” and student chose “other” as their most likely activity on Saturday mornings.
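The probability rules used with the Saturday-morning activity table can be sketched in Python as follows. The counts are copied from the table; the variable names and the helper `row_total` are assumptions introduced only for this illustration.

```python
# Counts copied from the Saturday-morning activity table: (male, female).
counts = {
    "Homework": (29, 18),
    "Housework": (15, 17),
    "Outside Employment": (20, 26),
    "Recreation": (23, 39),
    "Other": (9, 4),
}

total_male = sum(m for m, _ in counts.values())
total_female = sum(f for _, f in counts.values())
grand_total = total_male + total_female

def row_total(activity):
    """Number of students (male plus female) who chose the activity."""
    return sum(counts[activity])

# Marginal probability: one column total over the grand total.
p_male = total_male / grand_total

# AND: a single cell of the table over the grand total.
p_male_and_employment = counts["Outside Employment"][0] / grand_total

# OR for mutually exclusive events: the row totals simply add.
p_rec_or_other = (row_total("Recreation") + row_total("Other")) / grand_total

# OR for overlapping events: subtract the cell that was counted twice.
p_female_or_homework = (
    total_female + row_total("Homework") - counts["Homework"][1]
) / grand_total

# Conditional probability: restrict the denominator to males only.
p_homework_given_male = counts["Homework"][0] / total_male

print(round(p_male, 2), round(p_male_and_employment, 2))
print(round(p_rec_or_other, 3), round(p_female_or_homework, 3))
print(round(p_homework_given_male, 3))
```

The only structural difference between the "and", "or", and "given" calculations is which counts go in the numerator and which total goes in the denominator.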

Section 5.3 (Associations in Categorical Variables) 14. [Objective: Differentiate between independent and associated events] Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: You reach into your dark closet, without looking, and pull out a black shirt. Event B: You reach into your sock drawer, without looking, and pull out black socks. a. Associated b. Independent

15. [Objective: Differentiate between independent and associated events] Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: A randomly selected registered voter’s political party affiliation is Republican. Event B: A randomly selected registered voter opposes a new tax on fuel. a. Associated b. Independent

Use the following information to answer questions (16)-(18). Suppose that a recent poll of American households about pet ownership found that for households with a pet, 39% owned a dog, 33% owned a cat, and 7% owned a bird. Suppose that three households are selected randomly and with replacement.

16. [Objective: Apply the probability rules] What is the probability that all three randomly selected households own a cat? (Round to the nearest hundredth) a. 0.06 b. 0.03 c. 0.04 d. None of the above

17. [Objective: Apply the probability rules] What is the probability that none of the three randomly selected households owns a bird? (Round to the nearest hundredth) a. 0.80 b. 0.93 c. 0.86 d. None of the above

18. [Objective: Apply the probability rules] What is the probability that at least one of the three randomly selected households owns a dog? (Round to the nearest hundredth) a. 0.27 b. 0.77 c. 0.61 d. 0.70

19. [Objective: Apply the probability rules] A true/false pop quiz contains seven questions. What is the probability that when guessing, a student will get at least one question correct? (Round to the nearest hundredth) a. 0.50 b. 0.97 c. 0.99 d. 1.00
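Because the households are selected with replacement, the selections can be treated as independent, so probabilities multiply, and "at least one" is most easily found through the complement. The sketch below illustrates this; the function names are assumptions made for the example.

```python
def prob_all(p, n):
    """P(the event happens in all n independent trials)."""
    return p ** n

def prob_none(p, n):
    """P(the event happens in none of n independent trials)."""
    return (1 - p) ** n

def prob_at_least_one(p, n):
    """Complement rule: 1 minus P(the event never happens)."""
    return 1 - prob_none(p, n)

# Percentages from the pet-ownership poll, three households.
all_cats = prob_all(0.33, 3)
no_birds = prob_none(0.07, 3)
some_dog = prob_at_least_one(0.39, 3)

# Seven true/false questions, each guessed correctly with probability 0.5.
some_correct = prob_at_least_one(0.5, 7)

for value in (all_cats, no_birds, some_dog, some_correct):
    print(round(value, 2))
```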


Section 5.4 (Finding Empirical Probabilities with Simulations) 20. [Objective: Understand the Law of Large Numbers] Which of the following statements is true about the “law of large numbers” (LLN)? a. The LLN is almost always true, but there are special occasions, even when outcomes are random, when the LLN can be broken. b. The LLN states that the empirical probability that is observed will simulate the theoretical probability that is expected for any finite number of trials, so a simulation or experiment need not have an excessive number of trials. c. The LLN states that if you simulate or conduct an experiment or simulation enough times the empirical probability observed will always match the theoretical probability that is expected. d. The LLN states that if an experiment with a random outcome is repeated a large number of times, the empirical probability that is observed is likely to be close to the theoretical probability.


Chapter 5 Test B—Answer Key 1. D 2. D 3. B 4. A 5. C 6. C 7. B 8. A 9. C 10. B 11. A 12. A 13. D 14. B 15. A 16. C 17. A 18. B 19. C 20. D

Chapter 5 Test C—Short Answer Section 5.1 (What is Randomness?) 1.

[Objective: Understand the meaning of randomness and probability] Consider one toss of a fair six-sided die. State the sample space of possible outcomes. State one possible random event then state the theoretical probability of that event. Explain how you know that this is the probability of the random event.

[Objective: Understand the meaning of randomness and probability] Describe the difference between a theoretical probability and an empirical probability. Give at least one example of each type of probability.

[Objective: Understand the difference between empirical and theoretical probabilities] A card player claims that the probability of choosing a red jack from a well-shuffled deck of cards is 1/26 because choosing any card is equally likely and there are two red jacks in the deck of fifty-two cards. Is this an example of a theoretical probability or an empirical probability? Explain.

[Objective: Understand the difference between empirical and theoretical probabilities] A football game official tosses a coin at the beginning of each game to determine who will have possession of the ball first. In the previous ten games the toss has come up tails four times. The official says the probability that the coin will come up tails again is 40%. Is the official referring to a theoretical probability or an empirical probability? Explain.


Section 5.2 (Finding Theoretical Probabilities) & Section 5.3 (Associations in Categorical Variables) Use the following information to answer questions (5)-(7). Suppose that the typical work schedule for the wait staff at Sam’s BBQ Shack, which is open seven days a week, is five days on with two days off each week. A week begins on Monday and ends on Sunday. Assume that any day of the week is equally likely to be a day off. 5.

[Objective: Apply the rules of probability] What is the probability that Isaac, a waiter at Sam’s BBQ Shack, will have Saturday or Sunday off? Show your work and round to the nearest tenth of a percent.

[Objective: Calculate the probability of the complement of an event] State the complement of the event given in question (5) and calculate the probability of the complement. Show your work and round to the nearest tenth of a percent.

[Objective: Apply the rules of probability] What is the probability that Isaac will have Saturday and Sunday off? Show your work and round to the nearest tenth of a percent.

Use the following table to answer questions (8)-(13). A random sample of 200 new vehicle buyers was asked to respond to a survey about what kind of vehicle they purchased. One question, summarized in the table below, asked each respondent to choose the category that best described the type of vehicle that they purchased. The vehicle choices were car, pick-up truck, sport utility vehicle, van, or other.

          Male   Female   Total
Car        34      37       71
Truck      14       3       17
SUV        30      23       53
Van        21      17       38
Other      13       8       21
Total     112      88      200

[Objective: Apply the rules of probability] If one person is chosen randomly from the group, what is the probability that the person is female?

[Objective: Apply the rules of probability] If one person is chosen randomly from the group, what is the probability that the person purchased a van?

10. [Objective: Apply the rules of probability] If one person is chosen randomly from the group, what is the probability that the person was male and bought a car?


11. [Objective: Apply the rules of probability] If one person is chosen randomly from the group, what is the probability that the person purchased a sport utility vehicle or a pick-up truck?

12. [Objective: Apply the rules of probability] Find the probability that a randomly chosen female buyer bought a van. (Round to the nearest hundredth)

13. [Objective: Understand mutually exclusive events] Using this example, state two events that are mutually exclusive.

14. [Objective: Differentiate between independent and associated events] Suppose you would like a mug of hot chocolate with cinnamon. You reach into the kitchen cupboard containing twenty mixed up mismatched mugs without looking and pull out a pink coffee cup. You also reach into a kitchen drawer containing 30 different mixed up spice jars without looking and pull out the cinnamon. Use your intuition and state whether these two events are associated or independent. Explain.

15. [Objective: Differentiate between independent and associated events] Use your intuition and state whether these two events are likely to be associated or independent. Explain. Event A: A randomly selected adult is a pet owner. Event B: A randomly selected adult responds favorably to the survey question “Should a portion of the beach be set aside as an (unleashed) dog beach?”.


Use the following information to answer questions (16)-(18). Suppose that in a recent poll, single people over the age of thirty-five were asked about their living arrangements. The poll found that 34% rented a house or apartment, 21% owned a house, and 17% owned a condominium. Suppose that four single people are selected randomly and with replacement.

16. [Objective: Apply the probability rules] What is the probability that all four people rent a house or apartment? Show your work and round to the nearest thousandth.

17. [Objective: Apply the probability rules] What is the probability that none of the four randomly selected people rent a house or apartment? Show your work and round to the nearest thousandth.

18. [Objective: Apply the probability rules] What is the probability that at least one of the four randomly selected people rents a house or apartment? Show your work and round to the nearest thousandth.

19. [Objective: Apply the probability rules] A multiple choice quiz contains five questions. Each question has four answer choices. Michael is not prepared for the quiz and decides to guess for each question. What is the probability that Michael will get at least one question correct? What is the probability that Michael will get all five questions correct? Show your work and round to the nearest thousandth.

Section 5.4 (Finding Empirical Probabilities with Simulations) 20. [Objective: Understand the Law of Large Numbers] Jody flips a coin ten times and observes the outcome of heads three times. Yvonne flips a coin one hundred times and observes the outcome of heads forty-eight times. Jody states that his coin must not be fair because so few heads were observed. Pretend you are Yvonne and explain to Jody why his results do not indicate that he has an unfair coin by explaining to him what the Law of Large Numbers is, and how it justifies the results that were observed in both experiments.
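The behavior described by the Law of Large Numbers can be illustrated with a small simulation of a fair coin. This is a sketch under stated assumptions: the helper names `proportion_heads` and `typical_gap`, the seed, and the numbers of repetitions are all chosen only for the example.

```python
import random

def proportion_heads(n_flips, rng):
    """Empirical probability of heads in n_flips tosses of a fair coin."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

def typical_gap(n_flips, rng, n_repeats=500):
    """Average distance between the empirical probability and 0.5."""
    gaps = [abs(proportion_heads(n_flips, rng) - 0.5) for _ in range(n_repeats)]
    return sum(gaps) / n_repeats

rng = random.Random(2024)  # fixed seed so the run is repeatable
for n in (10, 100, 1000):
    print(n, "flips: typical gap from 0.5 is about", round(typical_gap(n, rng), 3))
```

With only ten flips, results such as three heads are common, so a short run says little about whether a coin is fair; the typical gap shrinks as the number of flips grows.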


Chapter 5 Test C—Answer Key

1. Sample space S = {1, 2, 3, 4, 5, 6}; any of the individual components of the sample space is a possible random event. The probability of any single event is 1/6 because the outcomes are equally likely.
2. An empirical probability is a short-run relative frequency based on an experiment. A theoretical probability is a long-run relative frequency of an event after infinitely many trials.
3. This is a theoretical probability because it is not based on a short-run experiment.
4. This is an empirical probability because it is based on a short-run experiment.
5. 28.6%
6. Complement: Isaac will not have Saturday or Sunday off; 71.4%
7. 2.4%
8. 0.44
9. 0.19
10. 0.17
11. 0.35
12. 0.19
13. Various. Any combination of the events “buyer purchases vehicle type x” and “buyer purchases vehicle type y”.
14. These two events are most likely independent.
15. These two events are most likely associated.
16. 0.013
17. 0.190
18. 0.810
19. 0.763, 0.001
20. The LLN states that if an experiment with a random outcome is repeated a large number of times, the empirical probability is likely to be close to the true (theoretical) probability. A pattern of heads in the short run (like ten trials) can be highly variable, but the more trials that are conducted the closer the empirical probability will tend to approach the theoretical probability. Yvonne did ten times as many trials as Jody and observed an empirical probability that is much closer to the theoretical probability of one half.

Chapter 6 Test A—Multiple Choice Note: Some questions require the use of either a standard normal probability table or technology that can calculate normal probabilities. Section 6.1 (Probability Distributions are Models of Random Experiments) 1.

[Objective: Distinguish between discrete and continuous-valued variables] Determine whether the variable would best be modeled as continuous or discrete: The temperature of a greenhouse at a certain time of the day. a. Continuous b. Discrete

[Objective: Distinguish between discrete and continuous-valued variables] Determine whether the variable would best be modeled as continuous or discrete: The number of tomatoes harvested each week from a greenhouse tomato plant. a. Continuous b. Discrete

[Objective: Understand the uniform probability distribution] At a course in public speaking, the instructor always gives an opening speech that lasts between fifteen and nineteen minutes. The length of the speech can be modeled by a uniform distribution, that is, the speech is just as likely to last fifteen minutes as it is to last nineteen minutes. The probability density curve is shown below. What is the probability that the speech will last at least seventeen minutes? What is the probability that the speech will last between fifteen and sixteen minutes?

a. Can’t be determined with the given information b. 0.50; 0.75 c. 0.25; 0.50 d. 0.50; 0.25

4. [Objective: Understand the properties of a probability distribution] An MP3 playlist, containing several songs from five genres, is set to shuffle. The following table shows the genre and the associated probability for the first song played. Does the table represent a probability distribution?

Genre        Probability
Rock         0.302
Pop          0.290
Country      0.203
Jazz         0.123
Classical    0.090

a. Yes b. No c. Can’t be determined with the given information
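One way to check whether a table like the one in this question is a probability distribution is to test the two defining properties: every probability lies between 0 and 1, and the probabilities sum to 1. The sketch below does this; the function name and the small rounding tolerance are assumptions made for the example.

```python
import math

def is_probability_distribution(probs, tol=1e-9):
    """True if every value is in [0, 1] and the values sum to 1."""
    in_range = all(0 <= p <= 1 for p in probs)
    # Allow a tiny tolerance for floating-point rounding in the sum.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return in_range and sums_to_one

# Probabilities copied from the genre table in this question.
genre_probs = {
    "Rock": 0.302,
    "Pop": 0.290,
    "Country": 0.203,
    "Jazz": 0.123,
    "Classical": 0.090,
}

print("Sum of probabilities:", round(sum(genre_probs.values()), 3))
print("Valid distribution?", is_probability_distribution(genre_probs.values()))
```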

Section 6.2 (The Normal Model) Use the following information for questions (5)-(7). Male players at the high school, college and professional ranks use a regulation basketball that weighs 22.0 ounces with a standard deviation of 1.0 ounce. Assume that the weights of basketballs are approximately normally distributed. 5.

[Objective: Apply the normal model to find probabilities] Roughly what percentage of regulation basketballs weigh less than 20.7 ounces? Round to the nearest tenth of a percent. a. 40.3% of the basketballs will weigh less than 20.7 ounces. b. 22.3% of the basketballs will weigh less than 20.7 ounces. c. 9.7% of the basketballs will weigh less than 20.7 ounces. d. 5.7% of the basketballs will weigh less than 20.7 ounces.

[Objective: Apply the normal model to find probabilities] If a regulation basketball is randomly selected, what is the probability that it will weigh between 20.5 and 23.5 ounces? Round to the nearest thousandth. a. 0.866 b. 0.134 c. 0.267 d. 0.704

[Objective: Apply the normal model to find probabilities] Some statisticians use a guideline that says that events that happen 5% of the time or less often should be considered “unusual.” By this standard, is it unusual to find a basketball that weighs 23.75 ounces or more? a. Yes, this would be unusual. b. No, this would not be unusual.
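For readers using technology rather than a standard normal table, the three kinds of normal probability asked about above (below a value, between two values, above a value) can be sketched with `NormalDist` from Python's standard `statistics` module. The variable names are assumptions made for the example; the mean of 22.0 ounces and standard deviation of 1.0 ounce come from the problem statement.

```python
from statistics import NormalDist

# Normal model for basketball weights: mean 22.0 oz, standard deviation 1.0 oz.
weights = NormalDist(mu=22.0, sigma=1.0)

# Area to the left of 20.7 ounces.
p_below = weights.cdf(20.7)

# Area between 20.5 and 23.5 ounces: difference of two left-hand areas.
p_between = weights.cdf(23.5) - weights.cdf(20.5)

# Area to the right of 23.75 ounces: complement of the left-hand area.
p_above = 1 - weights.cdf(23.75)

print(f"P(weight < 20.7)        is about {p_below:.3f}")
print(f"P(20.5 < weight < 23.5) is about {p_between:.3f}")
print(f"P(weight > 23.75)       is about {p_above:.3f}")
```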

8. [Objective: Apply the normal model to find probabilities] Suppose that weights of cans of AJ’s brand whipped cream have a population mean of 7.5 ounces and a population standard deviation of 0.27 ounces and are approximately normally distributed. Which of the following statements are correct? Choose the best statement.

a. Approximately 95% of all cans of AJ’s whipped cream will weigh between 6.96 ounces and 8.04 ounces.
b. The probability that a randomly selected can of AJ’s whipped cream will weigh between 7.8 ounces and 8.3 ounces is approximately 0.131.
c. Less than 1% of all cans of AJ’s whipped cream will weigh more than 8.3 ounces.
d. All of the above statements are true.

Use the following information for questions (9) - (11). The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed. 9.

[Objective: Apply the normal model to find probabilities] Approximately what percentage of people living and working in Kokomo have a travel time to work that is at least 20 minutes? Round to the nearest whole percent. a. 75% b. 25% c. 15% d. None of the above.

10. [Objective: Distinguish between a percentile and a measurement] Which of these statements is asking for a measurement (i.e., is an inverse normal question)? a. What percentage of people living and working in Kokomo have a travel time to work that is between thirteen and fifteen minutes? b. If 15% of people living and working in Kokomo have travel time to work that is below a certain number of minutes, how many minutes would that be?

11. [Objective: Calculate a data value from a percentile or z-score] Suppose that it is reported in the news that 12% of the people living and working in Kokomo feel that their commute is too long. What is the travel time to work that separates the top 12% of people with the longest travel times and the lower 88%? Round to the nearest tenth of a minute. a. 26.0 minutes b. 18.1 minutes c. 22.3 minutes d. None of the above

12. [Objective: Calculate a data value from a percentile or z-score] The normal model N(58, 21) describes the distribution of weights of chicken eggs in grams. Suppose that the weight of a randomly selected chicken egg has a z-score of 1.78. What is the weight of this egg in grams? Round to the nearest hundredth of a gram. a. 95.38 grams b. 89.50 grams c. 65.25 grams d. 79.50 grams
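Inverse normal questions run the normal model backwards: they start from a percentile or a z-score and return a measurement. A minimal sketch using `NormalDist.inv_cdf` from Python's standard `statistics` module follows. The variable and function names are assumptions made for the example, and N(58, 21) is read here as mean 58 and standard deviation 21.

```python
from statistics import NormalDist

# Kokomo travel times: mean 17 minutes, standard deviation 4.5 minutes.
commute = NormalDist(mu=17, sigma=4.5)

# The value separating the top 12% from the lower 88% is the 88th percentile.
cutoff_top_12 = commute.inv_cdf(0.88)

def value_from_z(z, mean, sd):
    """Convert a z-score back to the original units: x = mean + z * sd."""
    return mean + z * sd

# Egg weights: z-score of 1.78 under the model with mean 58 g and sd 21 g.
egg_weight = value_from_z(1.78, mean=58, sd=21)

print(f"88th percentile of travel time is about {cutoff_top_12:.1f} minutes")
print(f"Egg weight for z = 1.78 is about {egg_weight:.2f} grams")
```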

Section 6.3 (The Binomial Model)

13. [Objective: Understand the binomial model] Which of the following characteristics are not required for the binomial model? a. There are a fixed number of trials. b. The trials must be independent. c. Only two outcomes are possible at each trial. d. The probability of success must be the same as the probability of failure.

14. [Objective: Understand the binomial model] Determine which of the given procedures describe a binomial distribution. a. Record the number of ear piercings in a group of 30 randomly selected college students. b. Observing that five out of the next ten customers at a hotdog stand order hot peppers given that the probability of ordering hot peppers is 0.18. c. Surveying customers leaving a hardware store until a customer responds that he or she spent more than fifty dollars.

Use the following information to answer questions (15)-(18). Suppose that the probability that a person books a hotel using an online travel website is 0.68. For the questions that follow, consider a sample of fifteen randomly selected people who recently booked a hotel. 15. [Objective: Calculate probabilities using the binomial model] What is the probability that exactly ten people out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. a. 0.048 b. 0.552 c. 0.287 d. 0.213


16. [Objective: Calculate probabilities using the binomial model] What is the probability that at least fourteen out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. a. 0.978 b. 0.323 c. 0.022 d. 0.028

17. [Objective: Calculate probabilities using the binomial model] What is the probability that no more than four out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. a. 0.111 b. 0.001 c. 0.321 d. None of the above

18. [Objective: Calculate the mean and standard deviation using the binomial model] Out of fifteen randomly selected people, how many would you expect to use an online travel website to book their hotel, give or take how many? Round to the nearest whole person. a. 10 people, give or take 2 people b. 5 people, give or take 2 people c. 10 people, give or take 3 people d. 9 people, give or take 3 people

19. [Objective: Calculate probabilities using the binomial model] Five identical poker chips are tossed in a hat and mixed up. Two of the chips have been marked with an X to indicate that if drawn a valuable prize will be awarded. If you and two of your friends each draws a chip (with replacement), what is the probability that at least one of your group of three will win the valuable prize? Round to the nearest thousandth. a. 0.216 b. 0.784 c. 0.978 d. None of the above

20. [Objective: Understand expected value and the binomial model] Suppose that the probability that a person between the ages of 19 and 24 checks their daily horoscope is 0.12. If 400 randomly selected people between the ages of 19 and 24 were asked “Do you check your daily horoscope?”, would you be surprised if 63 or more said yes to this question? Why? a. Yes, 63 would be an unusually small number of people given the known probability of 0.12. b. No, 63 is within the expected range of people. c. Yes, 63 would be an unusually large number of people given the known probability of 0.12. d. Cannot be determined with the given information.
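The "expect how many, give or take how many" reasoning for a binomial count uses the mean np and the standard deviation sqrt(np(1 - p)). The sketch below applies it to the horoscope question; the function name and the use of two standard deviations as the cutoff for "surprising" are assumptions made for this illustration.

```python
import math

def binomial_mean_sd(n, p):
    """Expected count and its standard deviation for a binomial model."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return mean, sd

# Horoscope question: n = 400 people, probability of "yes" is 0.12.
mean, sd = binomial_mean_sd(400, 0.12)

# How many standard deviations above the expected count is 63?
z = (63 - mean) / sd

print(f"Expected count: {mean:.0f}, give or take {sd:.1f}")
print(f"63 is {z:.2f} standard deviations above the expected count")
# Rule of thumb assumed here: more than 2 standard deviations is surprising.
print("Surprising?", abs(z) > 2)
```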


Chapter 6 Test A—Answer Key

1. A 2. B 3. D 4. B 5. C 6. A 7. A 8. D 9. B 10. B 11. C 12. A 13. D 14. B 15. D 16. C 17. B 18. A 19. B 20. C

Chapter 6 Test B—Multiple Choice Note: Some questions require the use of either a standard normal probability table or technology that can calculate normal probabilities. Section 6.1 (Probability Distributions are Models of Random Experiments) 1.

[Objective: Distinguish between discrete and continuous-valued variables] Determine whether the variable would best be modeled as continuous or discrete: The number of cups dispensed from a beverage vending machine during a 24-hour period. a. Continuous b. Discrete

[Objective: Distinguish between discrete and continuous-valued variables] Determine whether the variable would best be modeled as continuous or discrete: The temperature of a cup of coffee dispensed from a beverage vending machine, taken four times during a 24-hour period. a. Continuous b. Discrete

[Objective: Understand the uniform probability distribution] At a course in public speaking, the instructor always gives an opening speech that lasts between fifteen and nineteen minutes. The length of the speech can be modeled by a uniform distribution, that is, the speech is just as likely to last fifteen minutes as it is to last nineteen minutes. The probability density curve is shown below. What is the probability that the speech will last sixteen minutes or more? What is the probability that the speech will last between eighteen and nineteen minutes?

a. Can’t be determined with the given information b. 0.50; 0.75 c. 0.75; 0.25 d. 0.50; 0.25

4. [Objective: Understand the properties of a probability distribution] A box containing recipes from five categories is dropped so that the recipe cards are thoroughly mixed up. The following table shows the possible categories and the associated probability for a recipe randomly chosen. Does the table represent a probability distribution?

Category     Probability
Main Dish    0.421
Appetizer    0.210
Dessert      0.103
Salad        0.173
Vegetable    0.093

a. Yes b. No c. Can’t be determined with the given information

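Section 6.2 (The Normal Model) Use the following information for questions (5)-(7). Male players at the high school, college and professional ranks use a regulation basketball that weighs 22.0 ounces with a standard deviation of 1.0 ounce. Assume that the weights of basketballs are approximately normally distributed.

5. [Objective: Apply the normal model to find probabilities] Roughly what percentage of regulation basketballs weigh more than 23.1 ounces? Round to the nearest tenth of a percent. a. Roughly 15.1% of the basketballs will weigh more than 23.1 ounces. b. Roughly 42.3% of the basketballs will weigh more than 23.1 ounces. c. Roughly 36.4% of the basketballs will weigh more than 23.1 ounces. d. Roughly 13.6% of the basketballs will weigh more than 23.1 ounces.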

[Objective: Apply the normal model to find probabilities] If a regulation basketball is randomly selected, what is the probability that it will weigh between 19.5 and 22.5 ounces? Round to the nearest thousandth. a. 0.547 b. 0.315 c. 0.685 d. 0.723

[Objective: Apply the normal model to find probabilities] Would it be unusual to randomly select a regulation basketball and find that it weighs 23.75 ounces? a. Yes, this would be unusual. b. No, this would not be unusual.

8. [Objective: Apply the normal model to find probabilities] Suppose that weights of cans of Benneke brand peaches have a population mean of 13.5 ounces and a population standard deviation of 0.33 ounces and are approximately normally distributed. Which of the following statements are correct? Choose the best statement.

a. Approximately 95% of all Benneke brand canned peaches will weigh between 12.85 ounces and 14.15 ounces.
b. The probability that a randomly selected can of Benneke peaches will weigh between 12.9 ounces and 13.6 ounces is approximately 0.585.
c. About 4% of all cans of Benneke peaches will weigh less than 12.9 ounces.
d. All of the above statements are true.

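Use the following information for questions (9)-(11). The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed.

9. [Objective: Apply the normal model to find probabilities] Approximately what percentage of people living and working in Kokomo have a travel time to work that is less than 15.5 minutes? Round to the nearest whole percent. a. 37% b. 63% c. 25% d. None of the above.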

10. [Objective: Distinguish between a percentile and a measurement] Which of these statements is asking for a probability? a. What percentage of people living and working in Kokomo has a travel time to work that is between thirteen and fifteen minutes? b. If 15% of people living and working in Kokomo have travel time to work that is below a certain number of minutes, how many minutes would that be?

11. [Objective: Calculate a data value from a percentile or z-score] Suppose that it is reported in the news that 8% of the people living and working in Kokomo feel “very satisfied” with their commute time to work. What is the travel time to work that separates the bottom 8% of people with the shortest travel times and the upper 92%? Round to the nearest tenth of a minute. a. 17.2 minutes b. 13.8 minutes c. 10.7 minutes d. None of the above

12. [Objective: Calculate a data value from a percentile or z-score] The normal model N(58, 21) describes the distribution of weights of chicken eggs in grams. Suppose that the weight of a randomly selected chicken egg has a z-score of -2.01. What is the weight of this egg in grams? Round to the nearest hundredth of a gram. a. 31.20 grams b. 15.80 grams c. 28.50 grams d. 38.10 grams

Section 6.3 (The Binomial Model)

13. [Objective: Understand the binomial model] Which of the following characteristics are not required for the binomial model? a. The probability of success and of failure must be equal. b. There are a fixed number of trials. c. The trials must be independent. d. The probability of success is the same at each trial.

14. [Objective: Understand the binomial model] Determine which of the given procedures describe a binomial distribution. a. Record the number of songs downloaded in a month for a group of 30 randomly selected college students. b. Observing that ten out of the next twenty customers at a grocery store checkout use a credit card given that the probability of using a credit card is 0.58. c. Surveying customers entering a sporting equipment store until a customer responds that he or she was shopping for a bicycle.

Use the following information to answer questions (15)-(18). Suppose that the probability that a person books an airline ticket using an online travel website is 0.72. For the questions that follow, consider a sample of ten randomly selected people who recently booked an airline ticket. 15. [Objective: Calculate probabilities using the binomial model] What is the probability that exactly seven out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. a. 0.035 b. 0.480 c. 0.998 d. 0.264


16. [Objective: Calculate probabilities using the binomial model] What is the probability that at least nine out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. a. 0.065 b. 0.183 c. 0.935 d. 0.857

17. [Objective: Calculate probabilities using the binomial model] What is the probability that no more than three out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. a. 0.733 b. 0.115 c. 0.007 d. None of the above

18. [Objective: Calculate the mean and standard deviation using the binomial model] Out of ten randomly selected people, how many would you expect to use an online travel website to book their airline ticket, give or take how many? Round to the nearest whole person. a. 7 people, give or take 1 person b. 8 people, give or take 2 people c. 7 people, give or take 2 people d. 3 people, give or take 4 people

19. [Objective: Calculate probabilities using the binomial model] Five identical poker chips are tossed in a hat and mixed up. Two of the chips have been marked with an X to indicate that if drawn a valuable prize will be awarded. If you and three of your friends each draws a chip (with replacement), what is the probability that at least one of your group of four will win the valuable prize? Round to the nearest thousandth. a. 0.870 b. 0.130 c. 0.758 d. None of the above

20. [Objective: Understand expected value and the binomial model] Suppose that the probability that a person between the ages of 19 and 24 buys at least one tabloid magazine per week is 0.115. If 500 randomly selected people between the ages of 19 and 24 were asked “Do you buy at least one tabloid magazine per week?”, would you be surprised if 55 or more said yes to this question? Why? a. Yes, 55 would be an unusually small number of people given the known probability of 0.115. b. No, 55 is within the expected range of people. c. Yes, 55 would be an unusually large number of people given the known probability of 0.115. d. Cannot be determined with the given information.

Chapter 6 Test B—Answer Key 1. B 2. A 3. C 4. A 5. D 6. C 7. B 8. D 9. A 10. A 11. C 12. B 13. A 14. B 15. D 16. B 17. C 18. A 19. A 20. B
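
The binomial answers to questions (15)-(18) can be checked with a short calculation. The following is a minimal Python sketch, not part of the original test; the helper names `binom_pmf` and `binom_cdf` are illustrative, and it assumes the ten bookings are independent with the same probability 0.72.

```python
from math import comb, sqrt

def binom_pmf(k, n, p):
    """P(X = k) for a binomial count with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def binom_cdf(k, n, p):
    """P(X <= k), found by summing the pmf from 0 through k."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n, p = 10, 0.72  # sample size and probability from questions (15)-(18)

exactly_seven = binom_pmf(7, n, p)       # question 15
at_least_nine = 1 - binom_cdf(8, n, p)   # question 16: complement of "8 or fewer"
at_most_three = binom_cdf(3, n, p)       # question 17
mean = n * p                             # question 18: expected count
sd = sqrt(n * p * (1 - p))               # question 18: "give or take"

print(round(exactly_seven, 3), round(at_least_nine, 3), round(at_most_three, 3))
print(round(mean), round(sd))
```

The "at least nine" probability is computed through the complement so that only one cumulative helper is needed.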

Chapter 6 Test C—Short Answer Section 6.1 (Probability Distributions are Models of Random Experiments) 1.

[Objective: Distinguish between discrete and continuous-valued variables] Explain the difference between a discrete random variable and a continuous random variable and give an example of each.

[Objective: Understand the uniform probability distribution] When exposed to heat, a certain chemical always reacts after thirteen minutes but before seventeen minutes. Reaction times for this chemical can be modeled by a uniform distribution; that is, every reaction time between thirteen and seventeen minutes is equally likely. Find the probability that the reaction will happen after fifteen minutes. Shade the appropriate area and calculate the numerical value of the probability.

[Objective: Understand the uniform probability distribution] Using the information from question (2), find the probability that the reaction time will occur after fourteen minutes, but before fifteen minutes. Shade the appropriate area and calculate the numerical value of the probability.

4.

[Objective: Understand the properties of a probability distribution] A suit of standard playing cards has thirteen cards. Suppose the suit of hearts is thoroughly shuffled and that you have the opportunity to play the following game: You win $10 if you choose the ten or queen of hearts, you lose $5 if you choose the jack or king of hearts, you win $2 if you choose the five or seven of hearts, and you lose $10 if you choose the ace of hearts. With any other outcome you will win or lose nothing. Complete the table that shows the probability distribution. Would it be sensible to play this game?

Winnings       +$10    -$5     +$2     -$10    $0
Probability    ____    ____    ____    ____    ____

[Objective: Understand the properties of a probability distribution] List two requirements that must be satisfied for any probability distribution. Explain how the probability distribution in question (4) satisfies these requirements.

Section 6.2 (The Normal Model) Use the information that follows to answer questions (6)-(8). Circumferences of regulation soccer balls have a mean of 69 cm with a standard deviation of 1.50 cm. Assume that the circumferences of soccer balls are approximately normally distributed. 6.

[Objective: Apply the normal model to find probabilities] Roughly what percentage of regulation soccer balls has a circumference that is greater than 69.9 cm? Round to the nearest tenth of a percent.

[Objective: Apply the normal model to find probabilities] If a regulation soccer ball is randomly selected, what is the probability that it will have a circumference between 66.9 and 68.9 cm? Round to the nearest thousandth.

8.

[Objective: Apply the normal model to find probabilities] Would it be unusual to randomly select a regulation soccer ball and find that it has a circumference that is greater than 72.3 cm?

Use the following information to answer questions (9) and (10). Suppose that the weights of jars of Puff brand marshmallow cream have a population mean of 24.5 ounces and a population standard deviation of 0.19 ounces and are approximately normally distributed. Use the figure below to determine the specified probabilities.

[Objective: Apply the normal model to find probabilities] What is the probability that a randomly selected jar of Puff marshmallow cream will be greater than 24.45 ounces? What is the probability that a randomly selected jar of Puff marshmallow cream will be less than 24.22 ounces? Round to the nearest thousandth.

10. [Objective: Apply the normal model to find probabilities] If a large random sample of Puff marshmallow cream jars were weighed, approximately what percentage of the jars would weigh between 24.22 and 24.45 ounces? Round to the nearest tenth of a percent.

Use the following information for questions (11) and (12). On a busy day, the average roller coaster wait time at a large amusement park is 27 minutes. Suppose the standard deviation of wait time is 11.9 minutes and the distribution of wait times is approximately normally distributed. 11. [Objective: Calculate a data value from a percentile or z-score] If it is a busy day, approximately what percentage of people at the amusement park will have a wait time that is at least 30 minutes? Round to the nearest whole percent.

12. [Objective: Calculate a data value from a percentile or z-score] To improve customer satisfaction, the amusement park manager has decided to give food coupons to customers with long wait times. The manager decides to give the coupons to the top 10% of people waiting the longest. What is the minimum wait time for the top 10%? Round to the nearest tenth of a minute.

13. [Objective: Calculate a data value from a percentile or z-score] The normal model N ( 210, 45 ) describes the weights of baby elephants in pounds. Suppose that the weight of a randomly selected baby elephant has a z-score of -1.35. What is the weight of this baby elephant in pounds? Round to the nearest tenth of a pound.

Section 6.3 (The Binomial Model) 14. [Objective: Understand the binomial model] List the four characteristics of the binomial model.

15. [Objective: Understand the binomial model] At a county animal shelter, the probability that a stray cat comes into the shelter with rabies is 0.145 and the probability that a stray dog comes into the shelter with rabies is 0.157. A volunteer at the shelter records whether each of the next thirty cats or dogs delivered to the shelter has rabies. Which condition or conditions for use of the binomial model are not met?

Use the following information for questions (16) – (19). Suppose that the probability that the train travelling from Holland, Michigan to Chicago, Illinois arrives on time is 0.80. For the questions that follow, consider a sample of twenty randomly selected train trips. 16. [Objective: Calculate the mean and standard deviation using the binomial model] Approximately how many trips would you expect to arrive on time out of twenty randomly selected trips, give or take how many trips? Round to the nearest whole trip.

17. [Objective: Calculate probabilities using the binomial model] What is the probability that exactly fifteen trips out of twenty will arrive in Chicago on time? Round to the nearest thousandth.

18. [Objective: Calculate probabilities using the binomial model] What is the probability that at least nineteen trips out of twenty arrive in Chicago on time? Round to the nearest thousandth.

19. [Objective: Calculate probabilities using the binomial model] What is the probability that thirteen or fewer trips out of twenty arrive in Chicago on time? Round to the nearest thousandth.

20. [Objective: Understand expected value and the binomial model] Suppose the probability that a person does not recycle is 0.23. If 300 randomly selected people were asked “Do you recycle?” would you be surprised if 65 said no to this question? Why?

Chapter 6 Test C—Answer Key 1.

A discrete random variable has numerical outcomes that can be listed or counted. A continuous random variable takes values over an infinite range that cannot be listed or counted. Examples will vary.
2. 0.50; the right half of the distribution should be shaded.
3. 0.25; the rectangle between 14 and 15 should be shaded.
4. The probability row should contain the following values: 2/13, 2/13, 2/13, 1/13, 6/13. Your chances of winning money are slightly better than your chances of losing.
5. A probability distribution must list all the possible outcomes and list all the associated probabilities.
6. 27.4%
7. 0.393
8. Yes, 72.3 cm would be unusual. It is more than two standard deviations from the mean.
9. 0.604; 0.074
10. 32.3%
11. 40% will have at least a 30 minute wait time.
12. 42.3 minutes
13. 149.3 lbs.
14. (1) A fixed number of trials, (2) Only two possible outcomes for each trial, (3) The probability of success is the same from trial to trial, and (4) The trials are independent.
15. There is a fixed number of trials, there are only two outcomes (rabies or no rabies), and the trials are independent, but the probability of success is not the same from trial to trial since the trials include cats and dogs.
16. 16 trips, give or take 2 trips
17. 0.175
18. 0.069
19. 0.087
20. No, 65 is within the expected range.
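
The uniform and normal answers above can be reproduced with the Python standard library. This is a minimal sketch rather than the method the key was prepared with: `uniform_prob` is an illustrative helper, and `statistics.NormalDist` stands in for a normal table or calculator, so results may differ from table-based answers in the last digit.

```python
from statistics import NormalDist

def uniform_prob(lo, hi, a, b):
    """P(a < X < b) for X uniform on [lo, hi]: shaded width over total width."""
    a, b = max(a, lo), min(b, hi)
    return max(b - a, 0) / (hi - lo)

after_fifteen = uniform_prob(13, 17, 15, 17)         # question 2
fourteen_to_fifteen = uniform_prob(13, 17, 14, 15)   # question 3

soccer = NormalDist(mu=69, sigma=1.5)                # questions 6 and 8
pct_over_69_9 = 100 * (1 - soccer.cdf(69.9))
z_for_72_3 = (72.3 - 69) / 1.5                       # standard deviations above the mean

waits = NormalDist(mu=27, sigma=11.9)                # questions 11 and 12
pct_wait_30_plus = 100 * (1 - waits.cdf(30))
top_10_cutoff = waits.inv_cdf(0.90)                  # 90th percentile of wait time

elephant_weight = 210 + (-1.35) * 45                 # question 13: mean + z * SD

print(after_fifteen, fourteen_to_fifteen)
print(round(pct_over_69_9, 1), z_for_72_3)
print(round(pct_wait_30_plus), round(top_10_cutoff, 1), elephant_weight)
```

Question 13 works out to about 149.25 pounds, which the key reports as 149.3; the value is printed unrounded because Python's `round` resolves exact ties toward the even digit.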

Chapter 7 Test A—Multiple Choice Section 7.1 (Learning About the World through Surveys) For questions (1)-(3), fill in the blank to complete the statement. 1.

[Objective: Understand survey terminology] The collection of the ages of all the U. S. first ladies when they married is a _______________. a. Population b. Sample c. Parameter d. Statistic

[Objective: Understand survey terminology] Suppose that the age of all the U. S. first ladies when they married was recorded. The mean age of U. S. first ladies when they married would be a __________________ . a. Population b. Sample c. Parameter d. Statistic

[Objective: Understand survey terminology] Researchers are interested in learning more about the age of women when they marry for the first time so they survey 500 married or divorced women and ask them how old they were when they first married. The collection of the ages of the 500 women when they first married is a_______________. a. Population b. Sample c. Parameter d. Statistic

[Objective: Identify statistical bias] Frances is interested in whether students at his college would like to see a portion of the campus preserved as green space. Using student numbers, he randomly contacts 300 students and receives a response from 75. Of those who responded, 64% favored the preservation of green space on campus. This scenario is describing what type of sampling bias? a. Measurement bias b. Survey bias c. Voluntary response bias d. Nonresponse bias

Section 7.2 (Measuring the Quality of a Survey) 5.

[Objective: Identify qualities of good estimation methods] If it is being used to make inferences about a population, a good statistic (or estimator) should a. Be derived from population data. b. Be accurate and precise. c. Show correlation. d. None of the above.

6.

[Objective: Identify qualities of a sampling distribution] Which of the following statements is not true about a sampling distribution? a. It is the probability distribution of a statistic. b. It is used for making inferences about a population. c. It tells us how often we can expect to see particular values of our estimator d. All the above statements are true

[Objective: Interpreting expected value and standard error for a sample proportion] According to a snack cracker manufacturer, a batch of butter crackers has a defect rate of 8%. Suppose a quality inspector randomly inspects 500 crackers. Complete the following statement: The quality inspector should expect ________ defective crackers, give or take ______ crackers. a. 60; 16 b. 40; 6 c. 40; 16 d. 60; 12

[Objective: Understand expected value for a sample proportion] There are four colors in a bag containing 500 plastic chips. It is known that 28% of the chips are green. On average, how many chips from a random sample of 50 (with replacement) would be expected to be green? a. 18 b. 28 c. 14 d. Not enough information to determine expected value.

Section 7.3 (The Central Limit Theorem for Sample Proportion) 9.

[Objective: Verify the conditions for the Central Limit Theorem] Suppose that New Mexico lawmakers survey 160 randomly selected registered voters to see if they favor stricter laws regarding motorcycle helmet use for riders over the age of 17. The lawmakers believe the population proportion in favor of changing the law is 6% (based on historical data and previous votes). Which of the following conditions for the Central Limit Theorem are not met? a. The population proportion is too small and will not have enough expected successes. b. Relative to the population, the sample is not large enough. c. The population proportion is too small and will not have enough expected failures. d. None of the above; all the conditions of the CLT are met.

Use the following information to answer questions (10)-(12). A pollotarian is a person who eats poultry but no red meat. A wedding planner does some research and finds that approximately 3.5% of the people in the area where a large wedding is to be held are pollotarian. Treat the 300 guests expected at the wedding as a simple random sample from the local population of about 200,000. 10. [Objective: Apply the Central Limit Theorem] On average, how many of the guests would be expected to be pollotarian, give or take how many? Round to the nearest whole person. a. There is not enough information given to calculate expected value. b. 20 people, give or take 5 people c. 15 people, give or take 4 people d. 11 people, give or take 3 people

11. [Objective: Apply the Central Limit Theorem] Suppose the wedding planner assumes that 5% of the guests will be pollotarian so she orders 15 pollotarian meals. What is the approximate probability that more than 5% of the guests are pollotarian and therefore she will not have enough pollotarian meals? Round to the nearest thousandth. a. 0.079 b. 0.421 c. 0.489 d. None of the Above 12. [Objective: Apply the Central Limit Theorem] Suppose the wedding planner assumes that only 3% of the guests will be pollotarian so she orders 9 pollotarian meals. What is the approximate probability that she will have too many pollotarian meals? Round to the nearest thousandth. a. 0.477 b. 0.681 c. 0.319 d. 0.251

Section 7.4 (Estimating the Population Proportion with Confidence Intervals) 13. [Objective: Understand the confidence interval for a proportion] Which of the following statements is true about the confidence interval for a population proportion? a. It is equal to the population proportion plus or minus a calculated amount called the standard error. b. It is equal to the sample proportion plus or minus a calculated amount called the margin of error. c. The confidence interval for a proportion will always contain the true population proportion. d. The confidence interval for a proportion does not need a specified confidence level. 14. [Objective: Understand the confidence interval for a proportion] Complete the statement by filling in the blank. When constructing a confidence interval, if the level of confidence increases the margin of error will _________________ and the confidence interval will be _________________. A larger sample size will improve the accuracy of the confidence interval, therefore margin of error will _________________ and the confidence interval will be _________________. a. Decrease, narrower. Increase, wider. b. Decrease, wider. Increase, narrower c. Increase; narrower. Decrease, wider. d. Increase; wider. Decrease; narrower. Use the following information to answer questions (15)-(17). In a recent poll of 1200 randomly selected adult office workers, 32% said they had worn a Halloween costume to the office at least once. 15. [Objective: Calculate and interpret confidence intervals for a proportion] What is the standard error for the estimate of the proportion of all American adult office workers that have worn a Halloween costume to the office? Round to the nearest ten-thousandth. a. 0.0002 b. 0.0135 c. 0.4672 d. 0.0143

16. [Objective: Calculate and interpret confidence intervals for a proportion] What is the margin of error, using a 95% confidence level, for estimating the true population proportion of adult office workers who have worn a Halloween costume to the office at least once? (Round to the nearest thousandth) a. 0.158 b. 0.053 c. 0.013 d. 0.026

17. [Objective: Calculate and interpret confidence intervals for a proportion] Report the 95% confidence interval for the proportion of all adult office workers who have worn a Halloween costume to the office at least once. (Round final calculations to the nearest tenth of a percent) a. (28.0%, 36.1%) b. (30.7%, 33.4%) c. (29.4%, 34.6%) d. None of the above

18. [Objective: Calculate and interpret confidence intervals for a proportion] A random sample of 830 adult television viewers showed that 52% planned to watch sporting event X. The margin of error is 3 percentage points with a 95% confidence. Does the confidence interval support the claim that the majority of adult television viewers plan to watch sporting event X? Why? a. No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 49% and 55%. The true proportion could be less than 50%. b. No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 50.5% and 53.5%. The lower limit of the confidence interval is just too close to 50% to say for sure. c. Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 50.5% and 53.5%. This is strong evidence that the true proportion is greater than 50%. d. Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 49% and 55%. Since the confidence interval is mostly above 50% it is likely that the true proportion is greater than 50%.

19. [Objective: Calculate and interpret confidence intervals for a proportion] Suppose that in a recent poll of 1200 adults between the ages of 35 and 45, 38% surveyed said they have thought about getting elective plastic surgery. Find the 95% confidence interval for the proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery then choose the correct interpretation. (Round to the nearest tenth of a percent) a. The population proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 35.2% and 40.7%, with a confidence level of 95%. b. There is a 95% chance that the population of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 35.2% and 40.7%. c. The population proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 28.5% and 47.5%, with a confidence level of 95%. d. There is a 95% chance that the population of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 28.5% and 47.5%.

Section 7.5 (Margin of Error and Sample Size for a Proportion—Optional) 20. [Objective: Determine the appropriate sample size] A polling agency wants to determine the size of a random sample needed to estimate the proportion of homeowners who keep a hand gun in their home for security. The estimate should have a margin of error of no more than 2.5 percentage points at a 95% level of confidence. Choose the most conservative answer that is closest to your calculated number, rounding to the nearest whole person. a. They should poll at least 1000 people. b. They should poll at least 1600 people. c. They should poll at least 400 people. d. They should poll at least 1200 people.

Chapter 7 Test A—Answer Key 1. A 2. C 3. B 4. D 5. B 6. D 7. B 8. C 9. A 10. D 11. A 12. C 13. B 14. D 15. B 16. D 17. C 18. A 19. A 20. B
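
The expected-count and Central Limit Theorem answers for questions (7), (11), and (12) can be checked numerically. The following is a hedged Python sketch, not part of the original test: the helper names are illustrative, and it assumes the CLT conditions hold so that the sample proportion is approximately normal with standard error sqrt(p(1 - p)/n).

```python
from math import sqrt
from statistics import NormalDist

def count_mean_sd(n, p):
    """Expected number of successes and its 'give or take' for n trials."""
    return n * p, sqrt(n * p * (1 - p))

def proportion_model(p, n):
    """Approximate normal model for a sample proportion (CLT assumed to apply)."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))

# Question 7: 500 crackers inspected, 8% defect rate
expected, spread = count_mean_sd(500, 0.08)

# Questions 11 and 12: 300 guests, 3.5% pollotarian in the population
guests = proportion_model(0.035, 300)
prob_short = 1 - guests.cdf(0.05)   # more than 5% pollotarian: too few meals
prob_extra = guests.cdf(0.03)       # fewer than 3% pollotarian: too many meals

print(round(expected), round(spread))
print(round(prob_short, 3), round(prob_extra, 3))
```

Working directly on the proportion scale avoids converting meal counts back and forth; the same probabilities result if counts are used with the mean and standard deviation from `count_mean_sd`.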

Chapter 7 Test B—Multiple Choice Section 7.1 (Learning About the World through Surveys) For questions (1)-(3), fill in the blank to complete the statement. 1.

[Objective: Understand survey terminology] Suppose that the age of all the U. S. vice presidents when they took office was recorded. The collection of the ages of all the U. S. vice presidents when they took office is a_______________. a. Population b. Sample c. Parameter d. Statistic

[Objective: Understand survey terminology] The mean age of all the U. S. vice presidents when they took office would be a _________________. a. Population b. Sample c. Parameter d. Statistic

[Objective: Understand survey terminology] Researchers are interested in learning more about the age of men when they marry for the first time so they survey 500 married or divorced men and ask them how old they were when they first married. The mean age of the 500 men when they married for the first time would be a _________________. a. Population b. Sample c. Parameter d. Statistic

[Objective: Identify statistical bias] Max organizes weekly concerts in the local park. He is interested in knowing what type of music people enjoy. Before one particular concert, he makes an announcement to the audience, and asks people to visit a web page and take a survey to vote on whether or not they liked the concert. 75 people take the survey, and 58% respond favorably. Max claims that 58% of all of those who were at the concert liked the music. What type of bias is in Max's method? a. Measurement bias b. Survey bias c. Voluntary response bias d. Nonresponse bias

Section 7.2 (Measuring the Quality of a Survey) 5.

[Objective: Identify qualities of a sampling distribution] Which of the following statements is not true about a sampling distribution? a. It gives probabilities for a statistic. b. It gives characteristics of the estimator, such as bias and precision. c. It is used for making inferences about a sample. d. It is the probability distribution of a statistic.

6.

[Objective: Identify qualities of good estimation methods] If it is being used to make inferences about a population, a good statistic (or estimator) should a. Be accurate and precise. b. Show correlation. c. Always equal the population parameter. d. None of the above.

[Objective: Interpreting expected value and standard error for a sample proportion] According to a snack cracker manufacturer, a batch of butter crackers has a defect rate of 6%. Suppose a quality inspector randomly inspects 400 crackers. Complete the following statement: The quality inspector should expect ________ defective crackers, give or take ______ crackers. a. 45; 6 b. 24; 5 c. 25; 12 d. 48; 5

[Objective: Understand expected value for a sample proportion] There are four colors in a bag containing 600 plastic chips. It is known that 34% of the chips are yellow. On average, how many chips from a random sample of 30 (with replacement) would be expected to be yellow? Round to the nearest whole chip. a. About 5 b. About 10 c. About 16 d. Not enough information to determine expected value.

Section 7.3 (The Central Limit Theorem for Sample Proportion) 9.

[Objective: Verify the conditions for the Central Limit Theorem] Suppose that Illinois lawmakers survey 130 randomly selected registered voters to see if they favor charging a deposit on aluminum cans to encourage recycling. The lawmakers believe the population proportion in favor of changing the law is 93% (based on historical data and previous votes). Which of the following conditions for the Central Limit Theorem are not met? a. The population proportion is too large and will not have enough expected failures. b. Relative to the population, the sample is not large enough. c. The population proportion is too small and will not have enough expected successes. d. None of the above; all the conditions of the CLT are met.

Use the following information to answer questions (10)-(12). A pescatarian is a person who eats fish and seafood but no other animal. An event planner does some research and finds that approximately 2.75% of the people in the area where a large event is to be held are pescatarian. Treat the 250 guests expected at the event as a simple random sample from the local population of about 150,000. 10. [Objective: Apply the Central Limit Theorem] On average, how many of the guests would be expected to be pescatarian, give or take how many? Round to the nearest whole person. a. There is not enough information given to calculate expected value. b. 6 people, give or take 5 people c. 8 people, give or take 4 people d. 7 people, give or take 3 people

11. [Objective: Apply the Central Limit Theorem] Suppose the event planner assumes that 4% of the guests will be pescatarian so he orders 10 pescatarian meals. What is the approximate probability that more than 4% of the guests are pescatarian and that he will not have enough pescatarian meals? Round to the nearest thousandth. a. 0.387 b. 0.113 c. 0.470 d. None of the Above 12. [Objective: Apply the Central Limit Theorem] Suppose the event planner assumes that only 1.6% of the guests will be pescatarian so he orders 4 pescatarian meals. What is the approximate probability that he will have too many pescatarian meals? Round to the nearest thousandth. a. 0.613 b. 0.387 c. 0.113 d. 0.245

Section 7.4 (Estimating the Population Proportion with Confidence Intervals) 13. [Objective: Understand the confidence interval for a proportion] Complete the statement by filling in the blank. When constructing a confidence interval, if the level of confidence increases the margin of error will _________________ and the confidence interval will be _________________. A larger sample size will improve the accuracy of the confidence interval, therefore margin of error will _________________ and the confidence interval will be _________________. a. Decrease, wider. Increase, narrower b. Decrease, narrower. Increase, wider. c. Increase; wider. Decrease; narrower. d. Increase; narrower. Decrease, wider. 14. [Objective: Understand the confidence interval for a proportion] Which of the following statements is true about the confidence interval for a population proportion? a. The confidence interval for a proportion will always contain the true population proportion. b. The confidence interval for a proportion does not need a specified confidence level. c. It is equal to the population proportion plus or minus a calculated amount called the standard error. d. It is equal to the sample proportion plus or minus a calculated amount called the margin of error.

Use the following information to answer questions (15)-(17). In a recent poll of 1100 randomly selected home delivery truck drivers, 26% said they had encountered an aggressive dog on the job at least once. 15. [Objective: Calculate and interpret confidence intervals for a proportion] What is the standard error for the estimate of the proportion of all home delivery truck drivers who have encountered an aggressive dog on the job at least once? Round to the nearest ten-thousandth. a. 0.1322 b. 0.0132 c. 0.0002 d. 0.0141

16. [Objective: Calculate and interpret confidence intervals for a proportion] What is the margin of error, using a 95% confidence level, for estimating the true population proportion of home delivery truck drivers who have encountered an aggressive dog on the job at least once? (Round to the nearest thousandth) a. 0.026 b. 0.013 c. 0.004 d. 0.053

17. [Objective: Calculate and interpret confidence intervals for a proportion] Report the 95% confidence interval for the proportion of all home delivery truck drivers who have encountered an aggressive dog on the job at least once. (Round final calculations to the nearest tenth of a percent) a. (24.7%, 27.3%) b. (23.4%, 28.6%) c. (20.7%, 31.3%) d. None of the above

18. [Objective: Calculate and interpret confidence intervals for a proportion] A random sample of 950 adult television viewers showed that 48% planned to watch sporting event X. The margin of error is 4 percentage points with a 95% confidence. Does the confidence interval support the claim that the majority of adult television viewers plan to watch sporting event X? Why? a. No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 44% and 52%. b. No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 46% and 50%. c. Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 44% and 52%. d. Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 46% and 50%.

19. [Objective: Calculate and interpret confidence intervals for a proportion] Suppose that in a recent poll of 900 adults between the ages of 35 and 45, 22% surveyed said they have thought about participating in an extreme sport such as bungee jumping. Find the 95% confidence interval for the proportion of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping then choose the correct interpretation. (Round to the nearest tenth of a percent) a. The population proportion of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 13.9% and 30.1%, with a confidence level of 95%. b. There is a 95% chance that the population of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 13.9% and 30.1%. c. There is a 95% chance that the population of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 19.3% and 24.7%. d. The population proportion of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 19.3% and 24.7%, with a confidence level of 95%.

Section 7.5 (Margin of Error and Sample Size for a Proportion—Optional) 20. [Objective: Determine the appropriate sample size] A polling agency wants to determine the size of a random sample needed to estimate the proportion of homeowners who have an electronic home security system. The estimate should have a margin of error of no more than 4 percentage points at a 95% level of confidence. Choose the answer that is closest to your calculated number, rounding to the nearest whole person. a. They should poll at least 600 people. b. They should poll at least 500 people. c. They should poll at least 1000 people. d. They should poll at least 1100 people.

Chapter 7 Test B—Answer Key 1. A 2. C 3. D 4. C 5. C 6. A 7. B 8. B 9. A 10. D 11. B 12. C 13. C 14. D 15. B 16. A 17. B 18. A 19. D 20. A
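
The confidence interval answers for questions (15)-(17) and the sample size in question (20) follow from two short formulas. The following is a minimal Python sketch under stated assumptions: the multiplier 1.96 for 95% confidence and the worst-case proportion 0.5 for sample size are conventional choices not stated in the questions, and the function names are illustrative.

```python
from math import ceil, sqrt

Z_STAR = 1.96  # assumed multiplier for a 95% confidence level

def proportion_ci(p_hat, n, z_star=Z_STAR):
    """Standard error, margin of error, and interval for a sample proportion."""
    se = sqrt(p_hat * (1 - p_hat) / n)
    me = z_star * se
    return se, me, (p_hat - me, p_hat + me)

def conservative_sample_size(margin, z_star=Z_STAR):
    """Smallest n with z* * sqrt(0.25 / n) <= margin, using worst case p = 0.5."""
    return ceil((z_star / margin) ** 2 * 0.25)

# Questions (15)-(17): 1100 drivers polled, 26% encountered an aggressive dog
se, me, (low, high) = proportion_ci(0.26, 1100)

# Question (20): margin of error of at most 4 percentage points
n_needed = conservative_sample_size(0.04)

print(round(se, 4), round(me, 3))
print(round(100 * low, 1), round(100 * high, 1))
print(n_needed)
```

Under these assumptions the required sample size comes out just above 600, which is why the closest listed choice is the one in the key.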

Chapter 7 Test C—Short Answer Section 7.1 (Learning About the World through Surveys) 1.

[Objective: Understand survey terminology] Explain the difference between a population and a sample. Give an example of each.

[Objective: Understand survey terminology] Explain the difference between a statistic and a parameter. Give an example of each.

[Objective: Understand survey terminology] Can the way a survey question is asked affect the sample results? Explain why such a sample is or is not reflective of the population.

[Objective: Identify Statistical bias] Frederick is interested in whether residents of his community are opposed to the construction of a party store on the corner of a busy intersection. He randomly polls 150 residents in the community and receives responses from 55 residents. Of those who responded, 60% were opposed to the construction of the party store in the community so Frederick concludes that the majority of residents in his community oppose the construction of the party store. Explain what is wrong with this approach.

Section 7.2 (Measuring the Quality of a Survey) 5.

[Objective: Assess the quality of a sampling distribution] A sampling method should be as precise and accurate as possible. Explain what these two terms mean and how each is measured.


Use the following information to answer questions (6)-(8). A marble manufacturer advertises that its bags of marbles will contain 25% “milky-white” marbles. Suppose that a bag containing 80 marbles is inspected. 6.

[Objective: Interpreting expected value and standard error for a sample proportion] What value should we expect for our sampling percentage of milky-white marbles? How many marbles would this be? Round to the nearest whole marble.

[Objective: Interpreting expected value and standard error for a sample proportion] What is the standard error? Round to the nearest tenth of a percent.

[Objective: Interpreting expected value and standard error for a sample proportion] Use your answers to fill in the blanks: We expect ______% milky-white marbles, give or take _____%
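The arithmetic behind questions (6)-(8) uses only the advertised proportion p = 0.25 and the bag size n = 80. A minimal sketch, with variable names chosen here for illustration:

```python
import math

p = 0.25   # advertised proportion of milky-white marbles
n = 80     # marbles in the inspected bag

# Expected sample proportion equals p; expected count is n * p.
expected_count = round(n * p)

# Standard error of the sample proportion: sqrt(p(1 - p)/n).
standard_error = math.sqrt(p * (1 - p) / n)

print(expected_count)                 # 20
print(round(standard_error * 100, 1)) # 4.8 (percent)
```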

Section 7.3 (The Central Limit Theorem for Sample Proportion) 9.

[Objective: Verify the conditions for the Central Limit theorem] Suppose that Michigan lawmakers survey 500 randomly selected registered voters to see if they favor an extension of the fall duck hunting season. The lawmakers believe the population proportion in favor of extending the duck hunting season is 45% (based on historical data and previous votes). State the three conditions of the Central Limit Theorem and explain whether each condition is satisfied in this scenario.

Use the following information to answer questions (10)-(12). An event planner does some research and finds that in the area where a large children’s event is to be held, approximately 1.75% of the children are lactose intolerant. Treat the 250 children expected at the event as a simple random sample from the local population of about 100,000 children.

10. [Objective: Apply the Central Limit Theorem] On average, how many of the children attending the event would be expected to be lactose intolerant, give or take how many? Round to the nearest whole person.

11. [Objective: Apply the Central Limit Theorem] Suppose the event planner assumes that 2.8% of the children attending the event will be lactose intolerant so he orders 7 lactose-free meals. What is the approximate probability that more than 2.8% of the children attending the event are lactose intolerant and that he will not have enough lactose-free meals? Round to the nearest thousandth.


12. [Objective: Apply the Central Limit Theorem] Suppose the event planner assumes that only 0.8% of the children attending the event will be lactose intolerant so he orders 2 lactose-free meals. What is the approximate probability that he will have too many lactose-free meals? Round to the nearest thousandth.
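Questions (10)-(12) can be worked with a normal model for the sample proportion, centered at p = 0.0175 with standard error sqrt(p(1 − p)/n). The sketch below is one way to carry out that calculation; the variable names are illustrative. Note that n·p is only about 4.4 here, below the usual guideline of 10 expected successes, so the normal approximation is rough.

```python
import math
from statistics import NormalDist

p = 0.0175  # population proportion of lactose-intolerant children
n = 250     # children expected at the event

se = math.sqrt(p * (1 - p) / n)   # standard error of the sample proportion

# Question 10: expected count and its give-or-take, in children.
expected_children = n * p         # about 4.4
give_or_take = n * se             # about 2.1

# Normal model for the sample proportion under the Central Limit Theorem.
sampling_dist = NormalDist(mu=p, sigma=se)

# Question 11: probability the sample proportion exceeds 2.8%.
prob_more_than_2_8 = 1 - sampling_dist.cdf(0.028)

# Question 12: probability the sample proportion falls below 0.8%.
prob_less_than_0_8 = sampling_dist.cdf(0.008)

print(round(expected_children), round(give_or_take))
print(round(prob_more_than_2_8, 3), round(prob_less_than_0_8, 3))
```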

Section 7.4 (Estimating the Population Proportion with Confidence Intervals) 13. [Objective: Understand the confidence interval for a proportion] Suppose a city manager conducts a poll and finds that a 95% confidence interval for the proportion of residents who support yard watering restrictions during extended periods of no rain is 43% to 51%. Explain what the “95%” means.

14. [Objective: Understand the confidence interval for a proportion] Explain the difference between the standard error of a sample proportion and the margin of error of a confidence interval for a population proportion.

Use the following information to answer questions (15)-(17). In a recent poll of 900 randomly selected adults, 37% reported that they could not swim 24 yards (the length of a typical gymnasium lap pool). 15. [Objective: Calculate and interpret confidence intervals for a proportion] What is the standard error for the estimate of the proportion of all adults that self-report that they cannot swim 24 yards? Round to the nearest ten-thousandth.

16. [Objective: Calculate and interpret confidence intervals for a proportion] What is the margin of error, using a 95% confidence level, for estimating the true proportion of adults who self-report that they cannot swim 24 yards? Round to the nearest thousandth.


17. [Objective: Calculate and interpret confidence intervals for a proportion] Report the 95% confidence interval for the proportion of adults who self-report that they cannot swim 24 yards. Round final calculations to the nearest tenth of a percent.

18. [Objective: Calculate and interpret confidence intervals for a proportion] A survey of 800 randomly selected senior citizens showed that 55% said they planned to watch an upcoming political debate on television. The margin of error for the 95% confidence interval is 3.5 percentage points. Does the confidence interval support the claim that the majority of senior citizens plan to watch the upcoming political debate on television? Explain why or why not.

19. [Objective: Interpret confidence intervals for a proportion] Suppose that you and a friend read the following statement in a news report: “A recent poll found that 54% of voters, give or take 3%, plan to vote for candidate X in the next election (with a confidence level of 95%).” Your friend then says, “Hey, look, there’s a 95% chance that somewhere between 51% and 57% of voters plan to vote for candidate X!” Explain to your friend why his statement is incorrect. Be sure to provide the correct interpretation of the confidence interval.

Section 7.5 (Margin of Error and Sample Size for a Proportion—Optional) 20. [Objective: Determine the appropriate sample size] A polling agency wants to determine the size of the random sample needed to estimate the proportion of voters who favor proposal X. The estimate should have a margin of error no more than 4.5 percentage points at a 95% level of confidence. Determine the minimum size of the sample, rounding to the nearest whole person.


Chapter 7 Test C—Answer Key 1.

A population is the entire group of objects or people being studied. A sample is a collection of objects or people taken from the population of interest. Examples will vary.

2. A statistic is a numerical summary of a sample of data. A parameter is a numerical value that characterizes some aspect of the population. Examples will vary.

3. Yes, a persuasive survey question will produce biased results, which may not accurately reflect the true sentiments of the population.

4. Frederick’s survey may have nonresponse bias. The residents who chose not to participate may have different views about the survey topic than those who did respond.

5. Precision means that sampling results are consistent when a sampling method is repeated. The precision of a sampling method is measured by the standard error. Accuracy means sampling results are centered around the population parameter. Accuracy is measured in terms of bias.

6. 25%; 20 milky-white marbles

7. 4.8%

8. 25%; 4.8%

9. The sample is random and independent: it is stated that this is a random sample, and voters are independent. The sample is large: a sample of 500 is large enough since it will have at least 10 successes and 10 failures (0.45 × 500 ≥ 10 and 0.55 × 500 ≥ 10). The population is big: the population is at least ten times larger than the sample of 500.

10. 4 children, give or take 2 children.

11. 0.103

12. 0.126

13. The 95% indicates that if many polls were taken, 95% of them would result in confidence intervals that include the true population proportion of residents that support yard watering restrictions.

14. The margin of error is an amount that is added to and subtracted from the estimate, which provides the range of plausible values around the sample proportion based on a chosen level of confidence. The standard error of a sample proportion is the measure of variation for the data collected in a particular sample.

15. 0.0161

16. 0.032

17. (33.8%, 40.2%)

18. Yes, a confidence interval of 55% ± 3.5% would include only plausible population parameter values that are greater than 50%, so the claim would not be unreasonable.

19. Answers will vary, but should reference the fact that there is no chance that the population parameter will change, which is implied when one interprets a confidence level as a probability. The correct interpretation is that the proportion of voters who plan to vote for candidate X is between 51% and 57%, with a confidence level of 95%, which means the process used to produce the interval will capture the true population proportion 95% of the time.

20. If the student uses the complete formula $n = \left(\frac{z^*}{m}\right)^2 \left(\frac{1}{4}\right)$, then they should poll at least 474 people. If the student uses the simplified formula $n = \frac{1}{m^2}$, then they should poll at least 494 people.
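Both sample-size answers for question 20 come from solving the margin-of-error formula for n with the conservative value p(1 − p) = 1/4. A minimal sketch follows, assuming z* = 1.96 for 95% confidence; the function names are illustrative, not from the text.

```python
def sample_size_complete(margin, z_star=1.96):
    """n = (z*/m)^2 * (1/4), using the conservative p = 0.5."""
    return (z_star / margin) ** 2 * 0.25

def sample_size_simplified(margin):
    """n = 1/m^2, which treats z* as 2."""
    return 1 / margin ** 2

m = 0.045  # margin of error of 4.5 percentage points
n_complete = round(sample_size_complete(m))
n_simplified = round(sample_size_simplified(m))
print(n_complete, n_simplified)  # 474 494
```

The same functions applied to a margin of 0.04 give about 600 and 625, consistent with the multiple-choice version of this question in Test B. Rounding to the nearest whole person follows the question's instruction; rounding up would be the choice if the margin must never exceed the target.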

Chapter 8 Test A—Multiple Choice Section 8.1 (The Main Ingredients of Hypothesis Testing) 1.

[Objective: Understand the components of hypothesis testing] Complete the statement by filling in the blanks. The null hypothesis is ____________ to be ______ and is only rejected when the observed outcome is shown to be _____________________________. a. Proven; true; impossible b. Known; true; the population parameter c. Assumed; true; extremely unlikely d. Likely; false; extremely likely

[Objective: Understand the components of hypothesis testing] Read the following problem description then choose the correct null and alternative hypotheses. A new drug is being tested to see whether it can reduce the rate of asthma attacks in children ages 5 to 14 with asthma. The rate of asthma attacks in the population of concern is 0.0744.

a. H0: p = 0.0744; Ha: p > 0.0744

b. H0: p = 0.0744; Ha: p < 0.0744

c. H0: p < 0.0744; Ha: p < 0.0744

d. H0: p > 0.0744; Ha: p ≠ 0.0744

Use the following information to answer questions (3) - (5). A janitor at a large office building believes that his supply of light bulbs has too many defective bulbs. The janitor’s null hypothesis is that the supply of light bulbs has a defect rate of p = 0.07 (the light bulb manufacturer’s stated defect rate). Suppose he does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypothesis are as follows: H 0 : p = 0.07 and H a : p > 0.07 3.

[Objective: Understand the components of hypothesis testing] Choose the statement that best describes the significance level in the context of the hypothesis test. a. The significance level of 0.05 is the defect rate we believe is the true defect rate. b. The significance level of 0.05 is the test statistic that we will use to compare the observed outcome to the null hypothesis. c. The significance level of 0.05 is the probability of concluding that the defect rate is equal to 0.07 when in fact it is greater than 0.07. d. The significance level of 0.05 is the probability of concluding that the defect rate is higher than 0.07 when in fact the defect rate is equal to 0.07.

[Objective: Calculate the z test statistic] Suppose that the janitor tests 300 randomly selected light bulbs and finds that 27 bulbs are defective. What value of the test statistic should he report? Round to the nearest hundredth. a. z = −1.96 b. z = 1.36 c. z = −1.36 d. z = 1.96


Section 8.2 (Characterizing P-Values)

[Objective: Understand the p-value] The janitor calculates a p-value for the hypothesis test of approximately 0.087. Choose the correct interpretation for the p-value. a. The p-value tells us that if the defect rate is 0.07, then the probability that the janitor will have 27 or more defective light bulbs out of 300 is approximately 0.087. At a significance level of 0.05, this would not be an unusual outcome. b. The p-value tells us that the probability of concluding that the defect rate is equal to 0.07, when in fact it is greater than 0.07, is approximately 0.087. c. The p-value tells us that the true population rate of defective light bulbs is approximately 0.087. d. None of the above

[Objective: Understand the p-value] From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area depicts a p-value for a two-tailed test. [Screenshots for options a-d not reproduced.]

[Objective: Understand the p-value] A research firm carried out a hypothesis test on a population proportion using a right-tailed alternative hypothesis. Which of the following z-scores would be associated with a p-value of 0.04? Round to the nearest hundredth. a. z = 2.50 b. z = −2.50 c. z = 1.75 d. z = −1.75

[Objective: Understand the p-value] From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area correctly depicts the following hypothesis test results: H 0 : p = 0.15, H a : p ≠ 0.15, α = .05, z = −1.82, p − value = 0.0688


Section 8.3 (Hypothesis Testing in Four Steps)

[Objective: Calculate the observed value of the z-statistic from sample data] Suppose that the following is to be tested: H 0 : p = 0.35 and H a : p > 0.35 . Calculate the observed z-statistic for the following sample data: Forty out of eighty test subjects have the characteristic of interest. Round to the nearest hundredth. a. z = −0.94 b. z = 2.81 c. z = 1.88 d. z = −1.87

10. [Objective: Understand the four steps of the hypothesis test] Which of the following is not one of the four steps of the hypothesis test? a. State the null and alternative hypothesis about the population parameter. b. Make a decision to reject or not reject the null hypothesis. c. State the level of significance, choose a test, and check the conditions for the test. d. All of the above are steps of the hypothesis test.

Use the following information to answer questions (11) and (12). A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The researcher’s null hypothesis for this test is H 0 : p = 0.26 and the alternative hypothesis is H a : p > 0.26 . The researcher collected data from a random sample of 75 adults in the region of interest.

11. [Objective: Verify and check the conditions for using the z-test statistic] Check that the conditions hold so that the sampling distribution of the z-statistic will approximately follow the standard normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. a. Yes, all the conditions are satisfied. b. No, the researcher did not collect a random sample. c. No, the researcher did not collect a large enough sample. d. No, the population of interest is not large enough to assume independence.

12. [Objective: Test a hypothesis for a population proportion] To continue the study into the drinking habits of adults, the researcher decides to collect data from adults working in “blue collar” jobs to see whether their drinking habits are in the same proportion as the general public. The null hypothesis for this test is H0: p = 0.26 and the alternative hypothesis is Ha: p > 0.26. The researcher collected data from a random sample of 90 adults with “blue collar” jobs, of which 30 stated that they drank once a week or less in the last month. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂, the test statistic, and the p-value associated with the test statistic. Round all values to the nearest thousandth.

a. p̂ = 0.333, z = 0.067, p-value = 0.946

b. p̂ = 0.667, z = 8.795, p-value = 0.000+

c. p̂ = 0.333, z = 1.586, p-value = 0.056

d. p̂ = 0.289, z = −0.829, p-value = 0.407


13. [Objective: Test a hypothesis for a population proportion] Suppose a city official conducts a hypothesis test to test the claim that the majority of voters oppose a proposed school tax. Assume that all the conditions for proceeding with a one-sample test on proportions have been met. The calculated test statistic is approximately 1.46 with an associated p-value of approximately 0.072. Choose the conclusion that provides the best interpretation for the p-value at a significance level of α = 0.05. a. If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.46 is 0.072. This result is surprising and could not easily happen by chance. b. If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.46 is 0.072. This result is not surprising and could easily happen by chance. c. The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. d. None of the above.

Section 8.4 (Comparing Proportions from Two Populations)

14. [Objective: Understand the hypothesis test of proportions from two populations] A researcher believes that the reading habits of men and women are different. He takes a random sample from each population and records the response to the question, “Did you read at least one book last month?” The null hypothesis is H0: pwomen = pmen. Choose the correct alternative hypothesis.

a. Ha: pwomen < pmen

b. Ha: pwomen > pmen

c. Ha: pwomen ≠ pmen

d. Ha: p = 0

15. [Objective: Understand the hypothesis test of proportions from two populations] Which of the following is not a condition that must be checked before proceeding with a two-sample test? a. Both samples must be large enough so that the product of each sample size (n1 and n2) and the pooled estimate, p̂, is greater than or equal to 10. b. Each sample must be a random sample. c. The samples must be independent of each other. d. All of the above are conditions that must be checked to proceed with a two-sample test.


16. [Objective: Understand the hypothesis test of proportions from two populations] A researcher believes that children who attend elementary school in a rural setting are more physically active than children who attend elementary school in an urban setting. The researcher collects a random sample from each population and records the proportion of children in each sample who reported participating in at least one hour of rigorous activity a day. The data is summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.

Rural: n1 = 90, x1 = 74

Urban: n2 = 78, x2 = 55

Find the z-statistic (rounded to the nearest hundredth) and p-value (rounded to the nearest thousandth) for this hypothesis test. Using a 5% significance level, state the correct conclusion regarding the null hypothesis, H0: prural = purban.

a. z = −1.79, p = 0.037. There is insufficient evidence to reject the null hypothesis.

b. z = 1.79, p = 0.037. There is sufficient evidence to reject the null hypothesis.

c. z = 0.82, p = 0.073. There is sufficient evidence to accept the null hypothesis.

d. z = 0.71, p = 0.073. There is sufficient evidence to reject the null hypothesis.
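The two-sample calculation in question 16 can be sketched as follows, assuming the standard pooled two-proportion z-statistic and a right-tailed alternative (rural greater than urban). The function name `two_proportion_z` is illustrative and not taken from the text.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-statistic for H0: p1 = p2."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    # Pooled estimate of the common proportion under the null hypothesis.
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se

# Question 16: rural 74 of 90, urban 55 of 78.
z = two_proportion_z(74, 90, 55, 78)
# Right-tailed p-value, since the alternative is p_rural > p_urban.
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2))     # 1.79
print(p_value < 0.05)  # True, so the null hypothesis is rejected at the 5% level
```

The p-value computed this way falls just above 0.0365, in line with the 0.037 listed in the answer choices.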

Section 8.5 (Understanding Hypothesis Testing)

17. [Objective: Interpret the parts of the hypothesis test] A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0: p = 0.4 and Ha: p < 0.4. The test statistic and p-value for the test are z = −3.01 and p-value = 0.0013. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. a. There is not sufficient evidence to reject the null hypothesis that the population proportion is equal to 0.4. b. There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.4. c. There is sufficient evidence to conclude that the population proportion is significantly different from 0.4. d. There is not sufficient evidence to conclude that the population proportion is significantly different from 0.4.

18. [Objective: Interpret the parts of the hypothesis test] Which statement best describes the power of a hypothesis test? a. The probability of rejecting the null hypothesis when the null hypothesis is true. b. The probability of rejecting the null hypothesis when the null hypothesis is not true. c. The probability of failing to reject the null hypothesis when the null hypothesis is not true. d. None of the above


19. [Objective: Interpret the parts of the hypothesis test] Read the following then choose the appropriate test and name the population(s). A researcher asks random samples of men and women whether they had purchased organically grown food in the last three months. He wants to determine whether the proportion of women who purchase organically grown food is greater than the proportion of men who purchase organically grown food. a. One-proportion z-test; the population is all men. b. One-proportion z-test; the population is all women. c. Two-proportion z-test; one population is all men and the other population is all women. d. Two-proportion z-test; one population is all adults who buy organically grown food and the other population is all adults who do not buy organically grown food.

20. [Objective: Interpret the parts of the hypothesis test] A polling agency is interested in testing whether the proportion of women who support a female candidate for office is greater than the proportion of men. The null hypothesis is that there is no difference in the proportion of men and women who support the female candidate. The alternative hypothesis is that the proportion of women who support the female candidate is greater than the proportion of men. The test results in a p-value of 0.112. Which of the following is the best interpretation of the p-value? a. The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is greater than the proportion of men. b. The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. c. The p-value is the probability that men will support the female candidate. d. The p-value is the probability that women will support the female candidate.

Chapter 8 Test A—Answer Key

1. C 2. B 3. D 4. B 5. A 6. C 7. C 8. A 9. B 10. D 11. A 12. C 13. B 14. C 15. D 16. B 17. C 18. B 19. C 20. B
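The keyed answers for the janitor's light bulb test (questions 4 and 5) can be checked with the one-proportion z-statistic, z = (p̂ − p0)/sqrt(p0(1 − p0)/n). This is a minimal sketch under that formula; the function name `one_proportion_z` is illustrative.

```python
import math
from statistics import NormalDist

def one_proportion_z(x, n, p0):
    """z-statistic for testing H0: p = p0 from x successes in n trials."""
    p_hat = x / n
    # The standard error uses the null value p0, not the sample proportion.
    se = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se

# Question 4: 27 defective bulbs out of 300, null defect rate 0.07.
z = one_proportion_z(27, 300, 0.07)
# Question 5: right-tailed p-value for Ha: p > 0.07.
p_value = 1 - NormalDist().cdf(z)

print(round(z, 2), round(p_value, 3))  # 1.36 0.087
```

Since 0.087 is larger than the significance level of 0.05, the janitor would not reject the null hypothesis.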


Chapter 8 Test B—Multiple Choice Section 8.1 (The Main Ingredients of Hypothesis Testing) 1.

[Objective: Understand the components of hypothesis testing] Which of the following is not true about the alternative hypothesis? a. It is sometimes called the research hypothesis. b. It is assumed to be true based on the sample results. c. Like the null hypothesis, it is always a statement about a population parameter. d. It is usually a statement that the researcher hopes to demonstrate is true.

[Objective: Understand the components of hypothesis testing] Read the following problem description then choose the correct null and alternative hypotheses. A new drug is being tested to see whether it can reduce the rate of food-related allergic reactions in children ages 1 to 3 with food allergies. The rate of allergic reactions in the population of concern is 0.03.

a. H0: p < 0.03; Ha: p = 0.03

b. H0: p ≠ 0.03; Ha: p = 0.03

c. H0: p = 0.03; Ha: p > 0.03

d. H0: p = 0.03; Ha: p < 0.03

Use the following information to answer questions (3) - (5). A janitor at a large office building believes that his supply of light bulbs has a defect rate that is different than the defect rate stated by the manufacturer. The janitor’s null hypothesis is that the supply of light bulbs has a defect rate of p = 0.09 (the light bulb manufacturer’s stated defect rate). Suppose we do a test with a significance level of 0.01. Symbolically, the null and alternative hypothesis are as follows: H 0 : p = 0.09 and H a : p > 0.09 3.

[Objective: Understand the components of hypothesis testing] Choose the statement that best describes the significance level in the context of the hypothesis test. a. The significance level of 0.01 is the probability of concluding the defect rate is more than 0.09 when it is equal to 0.09. b. The significance level of 0.01 is the defect rate we believe is the true defect rate. c. The significance level of 0.01 is the z-statistic that we will use to compare the observed outcome to the null hypothesis. d. The significance level of 0.01 is the probability of concluding that the defect rate is equal to 0.09 when in fact it is greater than 0.09.

[Objective: Calculate the z test statistic] Suppose the janitor tests 300 light bulbs and finds that 33 bulbs are defective. What value of the test statistic should he report? Round to the nearest hundredth. a. z = 1.21 b. z = −1.21 c. z = 2.17 d. z = −2.17


Section 8.2 (Characterizing P-Values)

[Objective: Understand the p-value] The janitor calculates a p-value for the hypothesis test of approximately 0.113. Choose the correct interpretation for the p-value. a. The p-value tells us that the probability of concluding that the defect rate is equal to 0.09, when in fact it is greater than 0.09, is approximately 0.113. b. The p-value tells us that if the defect rate is 0.09, then the probability that the janitor will have 33 or more defective light bulbs out of 300 is approximately 0.113. At a significance level of 0.01, this would not be an unusual outcome. c. The p-value tells us that the true population rate of defective light bulbs is approximately 0.113. d. None of the above

[Objective: Understand the p-value] From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area depicts a p-value for a two-tailed test. [Screenshots for options a-d not reproduced.]

[Objective: Understand the p-value] A research firm carried out a hypothesis test on a population proportion using a left-tailed alternative hypothesis. Which of the following z-scores would be associated with a p-value of 0.04? Round to the nearest hundredth. a. z = 2.50 b. z = −2.50 c. z = 1.75 d. z = −1.75

[Objective: Understand the p-value] From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area correctly depicts the following hypothesis test results: H 0 : p = 0.25, H a : p > 0.25, α = .05, z = 2.01, p − value = 0.022

Section 8.3 (Hypothesis Testing in Four Steps)

[Objective: Calculate the observed value of the z-statistic from sample data] Suppose that the following is to be tested: H 0 : p = 0.72 and H a : p ≠ 0.72 . Calculate the observed z-statistic for the following sample data: Sixty-eight out of ninety test subjects have the characteristic of interest. Round to the nearest thousandth. a. z = 0.751 b. z = 0.453 c. z = 0.756 d. z = −0.751

10. [Objective: Understand the four steps of the hypothesis test] Which of the following is not one of the components of a hypothesis test? a. State the null and alternative hypothesis about the population parameter. b. Make a decision to either accept the null hypothesis or accept the alternative hypothesis. c. State the level of significance, choose a test, and check the conditions for the test. d. Calculate the test statistic and the p-value.

Use the following information to answer questions (11) - (13). A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0: p = 0.26 and the alternative hypothesis is Ha: p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region.

11. [Objective: Verify and check the conditions for using the z-test statistic] Check that the conditions hold so that the sampling distribution of the z-statistic will approximately follow the standard normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. a. No, the researcher did not collect a random sample. b. No, the researcher did not collect a large enough sample. c. No, the population of interest is not large enough to assume independence. d. Yes, all the conditions are satisfied.

12. [Objective: Test a hypothesis for a population proportion] To continue the study into the drinking habits of adults, the researcher decides to collect data from adults working in “white collar” jobs to see whether their drinking habits are in the same proportion as the general public. The null hypothesis for this test is H0: p = 0.26 and the alternative hypothesis is Ha: p < 0.26. The researcher collected data from a random sample of 120 adults with “white collar” jobs, of which 25 stated that they drank once a week or less in the last month. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂, the test statistic, and the p-value associated with the test statistic. Round all values to the nearest thousandth.

a. p̂ = 0.208, z = −0.250, p-value = 0.401

b. p̂ = 0.75, z = −1.32, p-value = 0.599

c. p̂ = 0.208, z = −1.290, p-value = 0.098

d. p̂ = 0.30, z = 0.803, p-value = 0.041


13. [Objective: Test a hypothesis for a population proportion] Suppose a city official conducts a hypothesis test to test the claim that the majority of voters support a proposed tax to build sidewalks. Assume that all the conditions for proceeding with a one-sample test on proportions have been met. The calculated test statistic is approximately 1.40 with an associated p-value of approximately 0.081. Choose the conclusion that provides the best interpretation for the p-value at a significance level of α = 0.05. a. If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.40 is 0.081. This result is surprising and could not easily happen by chance. b. If the null hypothesis is true, then the probability of getting a test statistic as large or larger than 1.40 is 0.081. This result is not surprising and could easily happen by chance. c. The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. d. None of the above.

Section 8.4 (Comparing Proportions from Two Populations)

14. [Objective: Understand the hypothesis test of proportions from two populations] A researcher believes that the proportion of women who exercise with a friend is greater than the proportion of men. He takes a random sample from each population and records the response to the question, “Have you exercised with a friend at least once in the last seven days?” The null hypothesis is H0: pwomen = pmen. Choose the correct alternative hypothesis.

a. Ha: pwomen < pmen

b. Ha: pwomen > pmen

c. Ha: pwomen ≠ pmen

d. Ha: p = 0

15. [Objective: Understand the hypothesis test of proportions from two populations] Which of the following is not a condition that must be checked before proceeding with a two-sample test? a. The observations within each sample must be independent of one another. b. Each sample must be a random sample. c. The samples must be independent of each other. d. All of the above are conditions that must be checked to proceed with a two-sample test.

16. [Objective: Understand the hypothesis test of proportions from two populations] A researcher believes that children who attend elementary school in a rural setting have lower obesity rates than children who attend elementary school in an urban setting. The researcher collects a random sample from each population and records the proportion of children in each sample who were clinically obese. The data is summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.

Rural: n1 = 95, x1 = 14

Urban: n2 = 100, x2 = 26

a. z = −1.95, p = 0.026. There is sufficient evidence to reject the null hypothesis.

b. z = −1.95, p = 0.026. There is not sufficient evidence to reject the null hypothesis.

c. z = 1.95, p = 0.026. There is sufficient evidence to prove that the population proportions are the same.

d. z = −1.85, p = 0.032. There is sufficient evidence to accept the null hypothesis.

Section 8.5 (Understanding Hypothesis Testing)

17. [Objective: Interpret the parts of the hypothesis test] A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0: p = 0.6 and Ha: p < 0.6. The test statistic and p-value for the test are z = −1.51 and p-value = 0.0655. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. a. There is insufficient evidence to reject the null hypothesis that the population proportion is equal to 0.6. b. There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.6. c. There is sufficient evidence to conclude that the population proportion is significantly different from 0.6. d. None of the above.

18. [Objective: Interpret the parts of the hypothesis test] Which statement best describes the significance level of a hypothesis test? a. The probability of rejecting the null hypothesis when the null hypothesis is true. b. The probability of rejecting the null hypothesis when the null hypothesis is not true. c. The probability of failing to reject the null hypothesis when the null hypothesis is not true. d. None of the above

19. [Objective: Interpret the parts of the hypothesis test] Read the following then choose the appropriate test and name the population(s). A researcher asks a random sample of 200 men whether they had made an online purchase in the last three months. He wants to determine whether the proportion of men who make online purchases is less than 0.18. a. Two-proportion z-test; the population is the 200 men surveyed. b. One-proportion z-test; the population is all men. c. One-proportion z-test; the population is all adults who make online purchases. d. Two-proportion z-test; one population is all men who make online purchases and the other population is all men who do not make online purchases.

a. The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is less than the proportion of men. b. The p-value is the probability that the majority of women will support the female candidate. c. The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. d. The p-value is the probability that the majority of men will support the female candidate.

Chapter 8 Test B—Answer Key

1. B 2. D 3. A 4. A 5. B 6. C 7. D 8. B 9. A 10. B 11. A 12. C 13. B 14. B 15. D 16. A 17. A 18. A 19. B 20. C
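The pooled two-proportion calculation behind question 16 can be sketched as follows. This is a minimal illustration and not part of the original test bank: the function name two_proportion_z is hypothetical, and the sketch assumes the left-tailed alternative Ha: prural < purban implied by the researcher's belief.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic for H0: p1 = p2 (hypothetical helper)."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    # Under H0 both samples share one proportion, so pool the successes.
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / se


# Question 16 data: rural 14 of 95, urban 26 of 100.
z = two_proportion_z(14, 95, 26, 100)
# Left-tailed p-value: area under the standard normal curve below z.
p_left = NormalDist().cdf(z)

print(round(z, 2), round(p_left, 3))
```

Because the p-value is below 0.05, the sketch agrees with the keyed choice that the null hypothesis is rejected.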

Chapter 8 Test C—Short Answer
Section 8.1 (The Main Ingredients of Hypothesis Testing)

1. [Objective: Understand the components of hypothesis testing] The worker at a carnival game claims that he can communicate with a small magic rock and to prove it he tells you to hide it in your hand behind your back and he will identify the hand holding the rock. Being a wise student of statistics, you decide to stand back and observe the outcome of the next ten games before deciding whether to pay your three dollars to play the game. You have just conducted an informal hypothesis test. State the null and alternative hypotheses.

2. [Objective: Understand the components of hypothesis testing] A researcher wishes to test the claim that the proportion of children with blue eyes in his region is different from one in six, the national rate of blue eyes in children. State and explain the null and alternative hypotheses that should be used to test the claim.

[Objective: Understand the components of hypothesis testing] Use the following information to answer questions (3) - (5). A manager at a large office supply store believes that her supply of pencils has a defect rate that is higher than the defect rate stated by the manufacturer. The manager’s null hypothesis is that the supply of pencils has a defect rate of p = 0.025 (the pencil manufacturer’s stated defect rate). Suppose we do a test with a significance level of 0.05. Symbolically, the null and alternative hypotheses are as follows:

H0: p = 0.025
Ha: p > 0.025

3. [Objective: Understand the components of hypothesis testing] Write a statement describing the meaning of the level of significance in the context of the hypothesis test.

4. [Objective: Calculate the z test statistic] Suppose the store manager tests 150 pencils and finds that 9 are defective. Calculate the z test statistic. Round to the nearest hundredth.

Section 8.2 (Characterizing P-Values)

5. [Objective: Understand the p-value] The store manager calculates a p-value for the hypothesis test of 0.0030. Write a statement explaining what the p-value means and how it should be interpreted.

For questions (6) and (7), shade the approximate area that would represent the p-value for the alternative hypothesis and z-score, and then calculate the p-value. Round to the nearest thousandth.

6. [Objective: Understand the p-value] The alternative hypothesis is right-tailed with a z-score = 0.21

7. [Objective: Understand the p-value] The alternative hypothesis is two-tailed with a z-score = −1.88

8. [Objective: Understand the p-value] Two different students conduct a coin flipping experiment with a left-tailed alternative. They obtain the following test statistics: Student #1: z = −2.05 Student #2: z = −1.28 Which of the test statistics has a smaller p-value and why?

Section 8.3 (Hypothesis Testing in Four Steps)

9. [Objective: Understand the essential ingredients of the hypothesis test] List and briefly summarize the essential ingredients of the hypothesis test.

10. [Objective: Calculate the observed value of the z test statistic from sample data] Suppose the following is to be tested: H 0 : p = 0.4 and H a : p ≠ 0.4 . Calculate the observed z test statistic for the following sample data: n = 80 and 25 test subjects have the characteristic of interest. Round to the nearest thousandth.

Use the following information to answer questions (11) – (13). A health foods shop owner is wondering if his customers’ daily vitamin supplement habits are in the same proportion as the general population of adults. The shop owner heard in a news report that 60% of all adults reported that they took a daily vitamin. The shop owner believes that his customers have a greater proportion of adults who take a daily vitamin, so he decides to conduct a hypothesis test using the following null and alternative hypotheses: H0: p = 0.6 and Ha: p > 0.6. The shop owner collected data from 50 randomly selected customers.

11. [Objective: Verify and check the conditions for using the z test statistic] List and verify that the conditions hold so that the sampling distribution of the z test statistic will approximately follow the standard normal distribution.

12. [Objective: Test a hypothesis for a population proportion] To continue the study, the shop owner decides to collect data from 60 customers between the ages of 22 and 27 to see whether the proportion in this age group is different from the general population of adults. From this sample, 26 reported that they took a daily vitamin. The null hypothesis for this test is H 0 : p = 0.6 and the alternative hypothesis is H a : p ≠ 0.6 . Assume that the conditions that must be met in order for us to use the N ( 0,1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂ , the observed test statistic, and the p-value associated with this observed value. Round all values to the nearest thousandth.

13. [Objective: Interpreting p-value] Based on a 5% significance level, write a conclusion by interpreting the p-value. Be sure to clearly state the decision regarding the null hypothesis.

Section 8.4 (Comparing Proportions from Two Populations) 14. [Objective: Understand the hypothesis test of proportions from two populations] A sociologist believes that the proportion of single men who attend church on a regular basis is less than the proportion of single women. She takes a random sample from each population and records the proportion from each that reported that they attended church on a regular basis. The null hypothesis is H 0 : pmen = pwomen . State the correct alternative hypothesis with a sentence and symbolically.

15. [Objective: Understand the hypothesis test of proportions from two populations] A sociologist believes that families that eat at least one meal a day together (without the interference of any other media) will have better communication skills. The sociologist conducts a study to see if there is a difference in the proportion of meals that are eaten together as a family for families living in a rural setting compared to families living in an urban setting. She collects a random sample from each population and records the proportion of test subjects that reported that they had eaten at least 3 meals per week together as a family. The data is summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.

    Rural       Urban
    n1 = 75     n2 = 140
    x1 = 12     x2 = 40

Find the z test statistic (rounded to the nearest hundredth) and p-value (rounded to the nearest thousandth) for testing the hypothesis that the population proportions are different. At the 5% significance level, state the correct conclusion regarding the null hypothesis H0: prural = purban. Round all calculations to the nearest hundredth.


16. [Objective: Understand the hypothesis test of proportions from two populations] When a two-sample test of proportions is conducted, there are two conditions of independence that must be checked. State the two conditions of independence. Be sure that your statement clearly states the difference between the two conditions.

Use the following information to answer questions (17) and (18). A child psychologist believes that controlled physical outbursts of anger (like punching a pillow) may improve the mood of young boys with emotional impairment. He believes that the proportion of boys that would benefit from this treatment is greater than the proportion of girls. A random sample from each population receives counseling in the treatment and is asked about their mood after an episode (x is the number of test subjects that reported an improvement in mood). The results of the study are summarized in the table below.

    Boys        Girls
    n1 = 200    n2 = 210
    x1 = 35     x2 = 28

17. [Objective: Understand the hypothesis test of proportions from two populations] Find the percentage of children that reported an improved mood from each group. Compare the percentages. Do the initial (untested) findings show what the psychologist expected?

18. [Objective: Understand the hypothesis test of proportions from two populations] Assuming the conditions for a two-proportion z-test hold, state the alternative hypothesis, then find the observed test statistic and p-value. State your decision regarding the null hypothesis, H0: p1 = p2. How do the test results compare to the expectations of the psychologist? Round to the nearest hundredth.


Section 8.5 (Understanding Hypothesis Testing) 19. [Objective: Interpret the parts of the hypothesis test] Explain why failing to reject the null hypothesis does not prove that the null hypothesis is true.

20. [Objective: Interpret the parts of the hypothesis test] For the following description, state whether a one-proportion z-test or a two-proportion z-test would be appropriate, and name the population. A researcher asks people who are 20-29 years old and senior citizens (people over 65) whether they support a new tax on income. He wants to determine whether the proportions that support the tax differ for these age groups.


Chapter 8 Test C—Answer Key

1. The null hypothesis is that the carnival worker has no special powers, therefore H0: p = 0.5. The alternative hypothesis is that he can communicate with a rock, therefore Ha: p > 0.5.

2. H0: p = 1/6 and Ha: p ≠ 1/6. The null hypothesis states that the population parameter is no different than what is expected and is assumed to be true. The alternative hypothesis states that the population parameter may be different and contains the claim that the researcher is trying to show.

3. The significance level of 0.05 is the probability of concluding that the defect rate is greater than 0.025, when in fact the defect rate is equal to 0.025.

4. z = 2.75

5. The p-value tells us that if the defect rate is 0.025, then the probability that the store manager will find 9 or more defective pencils out of 150 is 0.0030. At a significance level of 0.05, this would be an unusual outcome.

6. The right tail of the curve should be shaded and should approximately represent the p-value of 0.417.

7. Both tails of the curve should be shaded and should approximately represent the p-value of 0.060.

8. If the null hypothesis is correct, then the test statistic should be close to zero. Values farther from zero are more surprising and so have smaller p-values. Since -2.05 is farther from zero than is -1.28, the area under the normal curve in the left tail is smaller for -2.05, therefore -2.05 will have a smaller p-value.

9. (1) State the null and alternative hypotheses about the population parameter; (2) State the significance level, an appropriate test statistic, check the required conditions, and state any assumptions; (3) Compute the test statistic and associated p-value; (4) State your conclusion regarding the null hypothesis. Will you reject or fail to reject the null hypothesis? Explain in the context of the data.

10. z = −1.598

11. Random sample: stated in problem description. Large enough sample size: there are at least 10 expected successes and failures (50 * 0.6 ≥ 10 and 50 * 0.4 ≥ 10). Large enough population: yes, it is reasonable to assume that there are at least 500 customers in the population. Independence: yes, it is reasonable to assume that customer responses are independent. Null hypothesis: yes, it is reasonable to assume the null hypothesis is true.

12. p̂ = 0.433, z = −2.635, p = 0.008

13. If the null hypothesis is true, then getting a calculated z-score that is this far from zero is surprising and could not easily happen by chance. The null hypothesis should be rejected. There is sufficient evidence to support the claim that the proportion of customers who take a daily vitamin is different than 0.60.

14. The population proportion of single men who attend church regularly is less than the population proportion of single women who attend church regularly. Ha: pmen < pwomen

15. z = −2.05; p = 0.04. There is sufficient evidence to reject the null hypothesis that the population proportions are equal.

16. One condition is that the two samples themselves must be independent of each other. The second condition is that the observations within each sample must be independent of each other.

17. Boys: 17.5%, girls: 13.3%.

18. Ha: p1 > p2; z = 1.17; p = 0.12; Do not reject the null hypothesis. Although the sample percentages showed that the percentage of boys with improved mood was greater than the percentage of girls, the conclusion of the hypothesis test was that the results are not surprising and could easily happen by chance.

19. Failing to reject the null hypothesis does not prove that the null hypothesis is true, only that the sample evidence does not show surprising enough results to suggest that the assumption that the null hypothesis is true is incorrect. To say the null is proven true means that there is no doubt about it. This is not possible based on chance processes.

20. Two-proportion z-test. One population is all people in the 20-29 age bracket, the other population is all senior citizens.

Copyright © 2013 Pearson Education, Inc.
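The one-proportion z statistics and normal-curve p-values keyed for questions 6, 7, 10, and 12 can be reproduced with a short sketch. It is offered only as a check on the arithmetic: the helper names one_proportion_z and p_value are hypothetical, and the sketch assumes the standard normal model for the test statistic that the questions themselves assume.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(x, n, p0):
    """z statistic for H0: p = p0, using the null value in the standard error."""
    p_hat = x / n
    se = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / se


def p_value(z, tail):
    """Area under N(0, 1) for a 'left', 'right', or 'two' tailed alternative."""
    std = NormalDist()
    if tail == "left":
        return std.cdf(z)
    if tail == "right":
        return 1 - std.cdf(z)
    return 2 * (1 - std.cdf(abs(z)))


p6 = p_value(0.21, "right")             # question 6
p7 = p_value(-1.88, "two")              # question 7
z10 = one_proportion_z(25, 80, 0.4)     # question 10
z12 = one_proportion_z(26, 60, 0.6)     # question 12
p12 = p_value(z12, "two")

print(round(p6, 3), round(p7, 3), round(z10, 3), round(z12, 3), round(p12, 3))
```

The standard error uses the null value p0 rather than the sample proportion, which matches the one-proportion z-test described in Section 8.3.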

Chapter 9 Test A—Multiple Choice
Section 9.1 (Sample Means of Random Samples)

Use the following information to answer questions (1) – (4). A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants was taken and the mean finishing time was found to be 1.59 hours with a standard deviation of 0.30 hours.

1. [Objective: Differentiate between a parameter and a statistic] In this example, the numerical values of 1.67 hours and 0.25 hours are _______________. a. estimates b. statistics c. parameters d. unbiased estimators

2. [Objective: Differentiate between population distribution, the distribution of a sample, or the sampling distribution of means] Suppose we were to make a histogram of the finishing times of all participants in the duathlon. Would the histogram be a display of the population distribution, the distribution of a sample, or the sampling distribution of means? a. population distribution b. distribution of a sample c. sampling distribution of means

3. [Objective: Differentiate between population distribution, the distribution of a sample, or the sampling distribution of means] Suppose the process of taking random samples of size 30 is repeated 200 times and a histogram of the 200 sample means is created. Would the histogram be an approximate display of the population distribution, the distribution of a sample, or the sampling distribution of means? a. population distribution b. distribution of a sample c. sampling distribution of means

4. [Objective: Show understanding of mean and standard error of a sampling distribution of means] What is the standard error for the mean finish time of 30 randomly selected participants? Round to the nearest thousandth. a. 0.250 b. 0.300 c. 0.055 d. 0.046

5. [Objective: Show understanding of mean and standard error of a sampling distribution of means] Choose the statement that best describes what is meant when we say that the sample mean is unbiased when estimating the population mean. a. The sample mean will always equal the population mean. b. On average, the sample mean is the same as the population mean. c. The standard deviation of the sampling distribution (also called the standard error) and the population standard deviation are equal. d. None of the above.

Section 9.2 (The Central Limit Theorem for Sample Means)

6. [Objective: Understand the Central Limit Theorem for sample means] Which of the following is not a true statement about the Central Limit Theorem for sample means? a. All of the following statements are true about the Central Limit Theorem for sample means b. If the sample size is large, it doesn’t matter what the distribution of the population it was drawn from is, the normal distribution can still be used to perform statistical inference. c. If conditions are met, the mean of the sampling distribution is equal to the population mean. d. The Central Limit Theorem helps us find probabilities for sample means when those means are based on a random sample from a population.

7. [Objective: Apply the Central Limit Theorem] Suppose that the average pop song length in America is 4 minutes with a standard deviation of 1.25 minutes. It is known that song length is not normally distributed. Suppose a sample of 25 songs is taken from the population. What is the approximate probability that the average song length will be less than 3.5 minutes? Round to the nearest thousandth. a. 0.345 b. 0.023 c. 0.155 d. 0.477

8. [Objective: Apply the Central Limit Theorem] Suppose that the average song length in America is 4 minutes with a standard deviation of 1.25 minutes. It is known that song length is not normally distributed. Find the probability that a single randomly selected song from the population will be longer than 4.25 minutes. Round to the nearest thousandth. a. 0.579 b. 0.421 c. 0.079 d. This probability cannot be determined because we do not know the distribution of the population.

9. [Objective: Understand the characteristics of the t-distribution] Which of the following statements is not true about the t-distribution? a. All of the following statements about the t-distribution are true. b. For small sample sizes, the t-distribution has all the same properties of the normal curve. c. Like the normal distribution, the t-distribution is symmetric and unimodal. d. Since population standard deviation is usually unknown, the standard error uses the sample standard deviation to estimate population standard deviation. The formula is SEest = s/√n

Section 9.3 (Answering Questions about the Mean of a Population)

Use the following information to answer questions (10)-(12). Many couples believe that it is getting too expensive to host an “average” wedding in the United States. According to the website www.costofwedding.com, the average cost of a wedding in the U. S. in 2009 was $24,066. Recently, in a random sample of 40 weddings in the U. S. it was found that the average cost of a wedding was $23,224, with a standard deviation of $2,903. On the basis of this, a 95% confidence interval for the mean cost of weddings in the U. S. is $22,296 to $24,152.

10. [Objective: Verify the conditions for using a confidence interval] For this description, which of the following does not describe a condition for a valid confidence interval? a. The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. b. The sample observations are independent because knowledge about the cost of any one wedding tells us nothing about the cost of any other wedding in the sample. c. The sample distribution must be normally distributed in order to have a valid confidence interval. The problem does not describe the distribution of the sample, so this condition is not met. d. The sample size of 40 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied.

11. [Objective: Interpret a confidence interval] Does the confidence interval provide evidence that the mean cost of a wedding has decreased? a. Yes b. No

12. [Objective: Interpret a confidence interval] Choose the statement that is the best interpretation of the confidence interval. a. In about 95% of all samples of 40 U. S. weddings, the resulting confidence interval will contain the mean cost of all weddings in the U. S. b. We are extremely confident that the mean cost of a U. S. wedding is between $22,296 and $24,152. c. The probability that a U. S. wedding will cost more than $24,152 is less than 3%.

13. [Objective: Find a confidence interval for sample mean] The weights at birth of five randomly chosen baby giraffes were 111, 115, 120, 103, and 106 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby giraffes. Use technology for your calculations. Give the confidence interval in the form “estimate ± margin of error”. Round to the nearest tenth of a pound. a. There is not enough information given to calculate the confidence interval. b. 110.0 ± 8.5 pounds c. 111.5 ± 9.0 pounds d. 111.0 ± 8.5 pounds

14. [Objective: Understand the hypothesis test of the mean] Suppose a consumer product researcher wanted to find out whether a highlighter lasted longer than the manufacturer’s claim that their highlighters could write continuously for 14 hours. The researcher tested 40 highlighters and recorded the number of continuous hours each highlighter wrote before drying up. Test the hypothesis that the highlighters wrote for more than 14 continuous hours. Following are the summary statistics: x̄ = 14.5 hours, s = 1.2 hours. Report the test statistic, p-value, and your decision regarding the null hypothesis. At the 5% significance level, state your conclusion about the original claim. Round all values to the nearest thousandth.

a. z = 9.583; p = 0.000+; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last longer than 14 hours.
b. z = 9.583; p = 0.000+; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last longer than 14 hours.
c. t = 2.635; p = 0.006; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last longer than 14 hours.
d. t = 2.635; p = 0.006; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last longer than 14 hours.

15. [Objective: Understand the hypothesis test of the mean] An economist conducted a hypothesis test to test the claim that the average cost of eating a meal at home increased from 2009 to 2010. The average cost of eating a meal at home in 2009 was $5.25 per person per meal. Assume that all conditions for testing have been met. He used technology to complete the hypothesis test. Following are his null and alternative hypotheses and the output from his graphing calculator.

H0: μ = $5.25
Ha: μ > $5.25

T-Test
μ > 5.25
t = 4.644687288
p = 3.4015934E−5
x̄ = 6.31
Sx = 1.25
n = 30

At the 5% significance level, choose the statement that contains the correct conclusion regarding the hypothesis and the original claim.

a. Reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating at home has increased since 2009.
b. Reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating at home has increased since 2009.
c. Fail to reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating at home has increased since 2009.
d. Fail to reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating at home has increased since 2009.


Section 9.4 (Comparing Two Population Means)

16. [Objective: Differentiate between dependent and independent samples] The reading level of a random sample of men and a random sample of women are measured. Researchers want to know whether women typically read at a higher level than men. a. Dependent b. Independent

17. [Objective: Differentiate between dependent and independent samples] The college GPA’s of identical twins are compared to see whether the means are different. a. Dependent b. Independent

18. [Objective: Conduct a hypothesis test on two independent samples] A researcher wants to know whether athletic women are more flexible than non-athletic women. For this experiment, a woman who exercised vigorously at least four times per week was considered “athletic”. Flexibility is measured in inches on a sit & reach box. Test the researcher’s claim using the following summary statistics:

    Athletic women        Non-athletic women
    n = 50                n = 30
    x̄ = 5.0 inches        x̄ = 4.6 inches
    s = 1.4 inches        s = 0.8 inches

Assume that all conditions for testing have been met. Report the test statistic and p-value. At the 1% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth.

a. t = 1.623; p = 0.054; Fail to reject the null hypothesis; there is not strong evidence to suggest that athletic women are more flexible than non-athletic women.
b. t = −1.623; p = 0.054; Reject the null hypothesis; there is strong evidence to suggest that athletic women are more flexible than non-athletic women.
c. t = −1.623; p = 0.108; Reject the null hypothesis; there is strong evidence to suggest that athletic women are more flexible than non-athletic women.
d. t = 1.623; p = 0.108; Reject the null hypothesis; there is not strong evidence to suggest that athletic women are more flexible than non-athletic women.

19. [Objective: Conduct a hypothesis test on two dependent samples] A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classical music. Test the hypothesis that mood rating improved after being exposed to classical music. Following are the mood ratings for the four participants:

                      Before exposure    After exposure
    Participant #1    4.0                5.0
    Participant #2    5.2                6.0
    Participant #3    3.2                4.0
    Participant #4    4.5                6.2

Assume that all conditions for testing have been met. Report the null and alternative hypotheses and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth.

a. H0: μdifference = 0, Ha: μdifference > 0; p = 0.922; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating.
b. H0: μ1 = μ2, Ha: μ1 ≠ μ2; p = 0.008; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating.
c. H0: μdifference = 0, Ha: μdifference < 0; p = 0.008; Reject the null hypothesis; there is strong evidence to suggest that exposure to classical music improved mood rating.
d. H0: μ1 = μ2, Ha: μ1 < μ2; p = 0.077; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating.

Section 9.5 (Overview of Analyzing Means) 20. [Objective: Compare confidence intervals and hypothesis tests] Choose the statement that describes a situation where a confidence interval and a hypothesis test will yield the same results. a. When the null hypothesis contains a population parameter that is equal to zero. b. When the alternative hypothesis is two-tailed. c. Both (a) and (b) d. Neither (a) nor (b). The confidence interval cannot yield results that are the same as the hypothesis test.

Chapter 9 Test A—Answer Key

1. C 2. A 3. C 4. D 5. B 6. A 7. B 8. D 9. B 10. C 11. B 12. A 13. D 14. C 15. A 16. B 17. A 18. A 19. C 20. B
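The standard error, Central Limit Theorem, and one-sample t calculations keyed for questions 4, 7, 14, and 15 can be checked with a brief sketch. The helper names standard_error and one_sample_t are hypothetical and used only for illustration. The sketch computes test statistics only; the t-distribution p-values would need a statistics package or calculator, as the questions themselves assume.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sd, n):
    """Standard error of the sample mean: sd / sqrt(n)."""
    return sd / sqrt(n)


def one_sample_t(x_bar, mu0, s, n):
    """t statistic for H0: mu = mu0, using the sample standard deviation s."""
    return (x_bar - mu0) / standard_error(s, n)


# Question 4: population sd 0.25 hours, samples of 30 participants.
se4 = standard_error(0.25, 30)

# Question 7: by the CLT the mean of 25 songs is approximately
# N(4, 1.25 / sqrt(25)), so P(mean < 3.5) is a normal-curve area.
prob7 = NormalDist(mu=4, sigma=standard_error(1.25, 25)).cdf(3.5)

# Questions 14 and 15: observed t statistics from the summary data.
t14 = one_sample_t(14.5, 14, 1.2, 40)
t15 = one_sample_t(6.31, 5.25, 1.25, 30)

print(round(se4, 3), round(prob7, 3), round(t14, 3), round(t15, 3))
```

Question 8 is deliberately left out: with a single song from a non-normal population the Central Limit Theorem does not apply, which is why the keyed answer is that the probability cannot be determined.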


Chapter 9 Test B—Multiple Choice
Section 9.1 (Sample Means of Random Samples)

Use the following information to answer questions (1) – (4). A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants in the 40-44 age group was taken and the mean finishing time was found to be 1.62 hours with a standard deviation of 0.40 hours.

1. [Objective: Differentiate between a parameter and a statistic] In this example, the numerical values of 1.62 hours and 0.40 hours are __________. a. estimates b. statistics c. parameters d. unbiased estimators

2. [Objective: Differentiate between population distribution, the distribution of a sample, or the sampling distribution of means] Suppose we were to make a histogram of the finishing times of the 30 participants in the 40-44 age group. Would the histogram be a display of the population distribution, the distribution of a sample, or the sampling distribution of means? a. population distribution b. distribution of a sample c. sampling distribution of means

3. [Objective: Differentiate between population distribution, the distribution of a sample, or the sampling distribution of means] Suppose the process of taking random samples of size 30 from the 40-44 age group is repeated 200 times and a histogram of the 200 sample means is created. Which statement best describes the shape of the histogram? a. The histogram will be roughly symmetrical. b. The histogram will be unimodal. c. The histogram will be roughly bell-shaped. d. All of the above statements are true.

4. [Objective: Show understanding of mean and standard error of a sampling distribution of means] What is the standard error for the mean finish time of 30 randomly selected participants in the 40-44 age group? Round to the nearest thousandth. a. 0.046 b. 0.300 c. 0.250 d. 0.055

5. [Objective: Show understanding of mean and standard error of a sampling distribution of means] Choose the statement that best describes what is meant when we say that the sample mean is unbiased when estimating the population mean. a. The sample mean will always equal the population mean. b. The standard deviation of the sampling distribution (also called the standard error) and the population standard deviation are equal. c. On average, the sample mean is the same as the population mean. d. None of the above.


Section 9.2 (The Central Limit Theorem for Sample Means)

[Objective: Apply the Central Limit Theorem] Suppose that the average country song length in America is 4.75 minutes with a standard deviation of 1.10 minutes. It is known that song length is not normally distributed. Suppose a sample of 25 songs is taken from the population. What is the approximate probability that the average song length will last more than 5.25 minutes? Round to the nearest thousandth. a. 0.488 b. 0.012 c. 0.325 d. 0.175

[Objective: Apply the Central Limit Theorem] Suppose that the average country song length in America is 4.75 minutes with a standard deviation of 1.10 minutes. It is known that song length is not normally distributed. Find the probability that a single randomly selected song from the population will be less than 4.20 minutes. Round to the nearest thousandth. a. 0.006 b. 0.494 c. 0.068 d. This probability cannot be determined because we do not know the distribution of the population.

[Objective: Understand the characteristics of the t-distribution] Which of the following statements is not true about the t-distribution? a. All of the following statements about the t-distribution are true. b. The t-distribution is generally bell-shaped; as sample size increases, the shape of the t-distribution gets closer to the shape of the normal distribution. c. Since population standard deviation is usually unknown, the standard error uses the sample standard deviation to estimate population standard deviation. The formula is SEest = s/√n. d. Like the normal distribution, the t-distribution is symmetric and unimodal.


Section 9.3 (Answering Questions about the Mean of a Population) Use the following information to answer questions (10)-(12). According to the website www.costofwedding.com, the average cost of flowers for a wedding is $698. Recently, in a random sample of 40 weddings in the U. S. it was found that the average cost of the flowers was $734, with a standard deviation of $102. On the basis of this, a 95% confidence interval for the mean cost of flowers for a wedding is $701 to $767.

10. [Objective: Verify the conditions for using a confidence interval] For this description, which of the following does not describe a condition for a valid confidence interval? a. The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. b. The sample observations are independent because knowledge about the cost of flowers for any one wedding tells us nothing about the cost of flowers for any other wedding in the sample. c. The sample size of 40 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied. d. All of the above describe conditions for a valid confidence interval.

11. [Objective: Interpret a confidence interval] Does the confidence interval provide evidence that the mean cost of flowers for a wedding has increased? a. Yes b. No

12. [Objective: Interpret a confidence interval] Choose the statement that is the best interpretation of the confidence interval. a. The probability that the flowers at a wedding will cost more than $698 is greater than 5%. b. In about 95% of all samples of size 40, the resulting confidence interval will contain the mean cost of flowers at weddings. c. We are extremely confident that the mean cost of flowers at a wedding is between $701 and $767. d. The probability that flowers at a wedding will cost less than $767 is nearly 100%.

13. [Objective: Find a confidence interval for sample mean] The weights at birth of five randomly chosen baby Orca whales were 425, 454, 380, 405, and 426 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby Orca whales. Use technology for your calculations. Give the confidence interval in the form “estimate ± margin of error”. Round to the nearest tenth of a pound. a. There is not enough information given to calculate the confidence interval. b. 384.0 ± 68.0 pounds c. 418.0 ± 34.0 pounds d. 418.0 ± 34.5 pounds


14. [Objective: Understand the hypothesis test of the mean] Suppose a consumer product researcher wanted to find out whether a highlighter lasted less than the manufacturer’s claim that their highlighters could write continuously for 14 hours. The researcher tested 40 highlighters and recorded the number of continuous hours each highlighter wrote before drying up. Test the hypothesis that the highlighters wrote for less than 14 continuous hours. Following are the summary statistics:

x̄ = 13.6 hours, s = 1.3 hours

Report the test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth.
a. z = 1.946; p = 0.029; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last less than 14 hours.
b. z = 1.946; p = 0.974; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last less than 14 hours.
c. t = −1.946; p = 0.029; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last less than 14 hours.
d. t = −1.946; p = 0.029; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last less than 14 hours.

15. [Objective: Understand the hypothesis test of the mean] An economist conducted a hypothesis test to test the claim that the average cost of eating a meal away from home decreased from 2009 to 2010. The average cost of eating a meal away from home in 2009 was $7.15 per person per meal. Assume that all conditions for testing have been met. He used technology to complete the hypothesis test. Following is his null and alternative hypothesis and the output from his graphing calculator.

H0: μ = $7.15
Ha: μ < $7.15

T-Test
μ < 7.15
t = −1.043281062
p = .1527187422
x̄ = 6.95
Sx = 1.05
n = 30

Choose the statement that contains the correct conclusion regarding the hypothesis and the original claim.
a. Reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009.
b. Reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009.
c. Fail to reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009.
d. Fail to reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009.


Section 9.4 (Comparing Two Population Means)

16. [Objective: Differentiate between dependent and independent samples] The productivity of manufacturing plant workers is compared before and after the installation of air conditioning. a. Dependent b. Independent

17. [Objective: Differentiate between dependent and independent samples] The weights of King Salmon from Lake Michigan and Lake Superior are measured. Researchers want to know whether Lake Michigan King Salmon weigh less than those from Lake Superior. a. Dependent b. Independent

18. [Objective: Conduct a hypothesis test on two independent samples] A researcher wants to know whether athletic men are more flexible than non-athletic men. For this experiment, a man who exercised vigorously at least four times per week was considered “athletic”. Flexibility is measured in inches on a sit & reach box. Test the researcher’s claim using the following summary statistics:

Athletic men: n = 50, x̄ = 4.3 inches, s = 2.1 inches
Non-athletic men: n = 40, x̄ = 3.2 inches, s = 1.0 inches

Assume that all conditions for testing have been met. Report the test statistic and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth.
a. t = 3.270; p = 0.001; Fail to reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men.
b. t = 3.270; p = 0.001; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men.
c. t = −3.270; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men.
d. t = −3.270; p = 0.002; Reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men.


19. [Objective: Conduct a hypothesis test on two dependent samples] A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classic rock music. Test the hypothesis that mood rating decreased after being exposed to classic rock music. Following are the mood ratings for the four participants:

                   Participant #1   Participant #2   Participant #3   Participant #4
Before exposure    5.1              4.3              2.5              4.9
After exposure     5.2              4.0              2.5              4.0

a. H0: μdifference = 0, Ha: μdifference > 0; p = 0.154; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating.
b. H0: μdifference = 0, Ha: μdifference > 0; p = 0.154; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating.
c. H0: μdifference = 0, Ha: μdifference ≠ 0; p = 0.309; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating.
d. H0: μdifference = 0, Ha: μdifference > 0; p = 0.015; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating.

Section 9.5 (Overview of Analyzing Means) 20. [Objective: Compare confidence intervals and hypothesis tests] Choose the statement that describes a situation where a confidence interval and a hypothesis test will yield the same results. a. Both (c) and (d) below. b. Neither (c) nor (d). The confidence interval cannot yield results that are the same as the hypothesis test. c. When the null hypothesis contains a population parameter that is equal to zero. d. When the alternative hypothesis is two-tailed.

Chapter 9 Test B—Answer Key 1. B 2. B 3. D 4. A 5. C 6. B 7. D 8. A 9. A 10. D 11. A 12. B 13. C 14. D 15. D 16. A 17. B 18. B 19. A 20. D
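
Keyed answers 4 and 6 can be checked numerically. The following is a minimal Python sketch, assuming the population values stated in the questions (standard deviation 0.25 hours and n = 30 for the duathlon; mean 4.75 minutes, standard deviation 1.10 minutes, and n = 25 for song length) and the normal approximation from the Central Limit Theorem. The function names are illustrative and do not come from the text.

```python
from math import sqrt
from statistics import NormalDist


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)


def prob_sample_mean_above(mu: float, sigma: float, n: int, cutoff: float) -> float:
    """Approximate P(sample mean > cutoff) using the CLT normal approximation."""
    sampling_dist = NormalDist(mu, standard_error(sigma, n))
    return 1.0 - sampling_dist.cdf(cutoff)


# Question 4: population SD 0.25 hours, samples of 30 participants.
se_duathlon = standard_error(0.25, 30)

# Question 6: mean 4.75 min, SD 1.10 min, sample of 25 songs, cutoff 5.25 min.
p_songs = prob_sample_mean_above(4.75, 1.10, 25, 5.25)

print(round(se_duathlon, 3))  # 0.046
print(round(p_songs, 3))      # 0.012
```

Note that the standard error uses the population standard deviation (0.25), not the sample standard deviation (0.40), because the population value is known in this setting.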


Chapter 9 Test C—Short Answer Section 9.1 (Sample Means of Random Samples) Use the following information to answer questions (1) – (4). A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all male participants in a recent large duathlon was 1.54 hours with a standard deviation of 0.22 hours. The distribution of finish times for males is right-skewed. Suppose that a random sample of 30 male participants is taken.

1. [Objective: Differentiate between a parameter and a statistic] Is the number 1.54 a statistic or parameter? Explain.

[Objective: Understand the sampling distribution of a mean] What is the expected finish time for a male participant in the sample of 30? Will the expected mean finish time be the same for any sample of 30 males drawn from the population? Explain.

[Objective: Understand the sampling distribution of a mean] Calculate the standard error for the mean finish time of 30 randomly selected male participants. Show all your work and round to the nearest thousandth.

[Objective: Understand the sampling distribution of a mean] Suppose that the process of drawing samples of size 30 from the population of all male participants is repeated 100 times. If possible, sketch and describe what the sampling distribution of the means will look like and state the approximate mean value of the distribution. Round to the nearest thousandth.

5. [Objective: Show understanding of mean and standard error of a sampling distribution of means] Explain what is meant when we say that the sample mean is an unbiased estimator.

Section 9.2 (The Central Limit Theorem for Sample Means)

6. [Objective: Understand the Central Limit theorem for sample means] How might the shapes of a population distribution and a sampling distribution look the same? How might they look different?

[Objective: Apply the Central Limit Theorem] Suppose that a major league baseball game has an average length of 2.9 hours with a standard deviation of 0.5 hours. It is known that game length is not normally distributed. Suppose a random sample of 36 games is taken from the population. Sketch the probability distribution and shade in the region that corresponds to the probability. What is the approximate probability that average game length will be greater than 3.1 hours? Round to the nearest thousandth.

[Objective: Apply the Central Limit Theorem] Suppose that a major league baseball game has an average length of 2.9 hours with a standard deviation of 0.5 hours. It is known that game length is not normally distributed. Suppose a random sample of 36 games is taken from the population. What is the approximate probability that average game length will be greater than 3.15 hours or less than 2.75 hours? Round to the nearest thousandth.

[Objective: Understand the characteristics of the t-distribution] Compare the normal distribution and the t-distribution. How are they similar? How are they different?


11. [Objective: Interpret a confidence interval] Explain whether the confidence interval provides evidence that the mean catering cost of a wedding has increased.

12. [Objective: Interpret a confidence interval] A popular wedding magazine states in an article on catering costs that “There is a 95% chance that the catering cost of your wedding will be between $106 and $114 per person.” Explain what is wrong with this statement and write a better statement that correctly interprets the confidence interval.

13. [Objective: Find a confidence interval for sample mean] The weights at birth of five randomly chosen baby hippopotamuses were 75, 99, 107, 82, and 63 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby hippopotamuses. Use technology for your calculations. Give the confidence interval in the form “ estimate ± margin of error ”. Round to the nearest pound.


14. [Objective: Understand the hypothesis test of the mean] Suppose a consumer product researcher wanted to find out whether a printer ink cartridge lasted longer than the manufacturer’s claim that their ink cartridges could print 400 pages. The researcher tested 40 ink cartridges and recorded the number of pages that were printed before the ink started to fade. Test the hypothesis that the ink cartridges lasted for more than 400 pages. Following are the summary statistics:

x̄ = 415 pages, s = 30 pages

Be sure to report the null and alternative hypothesis, test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth.

15. [Objective: Understand the hypothesis test of the mean] The quality engineer at a paint manufacturer conducted a hypothesis test to test the claim that the mean volume of paint cans had changed after an adjustment in the manufacturing process. Mean volume in paint cans before the adjustment was 1.02 gallons. Assume that all conditions for testing have been met. She used technology to complete the hypothesis test. Following is the null and alternative hypothesis and the output from her graphing calculator.

H0: μ = 1.02 gallons
Ha: μ ≠ 1.02 gallons

T-Test
μ ≠ 1.02
t = −5.89255651
p = 3.4248373E−7
x̄ = .97
Sx = .06
n = 50

Write a statement explaining what her decision regarding the null hypothesis should be and a statement summarizing her conclusion regarding the claim that average volume of paint cans had changed. Has the adjustment in the manufacturing process changed the average volume of paint cans?


Section 9.4 (Comparing Two Population Means)

16. [Objective: Differentiate between dependent and independent samples] State whether the situation has dependent or independent samples. A researcher wants to know if reaction time is affected by body type of the vehicle being driven. He measures the reaction time of 40 drivers while they drive a compact car, then he measures their reaction time while they drive an SUV.

17. [Objective: Differentiate between dependent and independent samples] State whether the situation has dependent or independent samples. A researcher wants to know if reaction time is affected by the gender of the driver. He measures the reaction time of 30 female drivers while they drive a compact car, then he measures the reaction time of 30 male drivers while they drive a compact car.

18. [Objective: Conduct a hypothesis test on two independent samples] A sociologist wants to know whether there is a difference in the mean number of times that men and women in the U. S. check their Smartphone during the day. Test the hypothesis that the mean number of times that men and women in the U. S. check their Smartphone during the day is different. Following are the summary statistics:

Men: n = 25, x̄ = 30 checks per day, s = 6 checks per day
Women: n = 40, x̄ = 35 checks per day, s = 7 checks per day

Assume that all conditions for testing have been met. Be sure to report the null and alternative hypothesis, test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth.

19. [Objective: Conduct a hypothesis test on two dependent samples] A chewing gum manufacturer wants to know whether students score higher on math tests when they are allowed to chew gum during a test. He conducts an experiment with a sample of four randomly selected students and records the test results with gum chewing and without. Test the hypothesis that test scores were higher when students were allowed to chew gum during the test. Following are the test scores for the four participants:

               Participant #1   Participant #2   Participant #3   Participant #4
Without gum    79               95               85               82
With gum       80               94               87               84

Section 9.5 (Overview of Analyzing Means) 20. [Objective: Compare confidence intervals and hypothesis tests] Describe the circumstances under which a confidence interval and a hypothesis test yield the same results.


Chapter 9 Test C—Answer Key

1. This is a parameter. It is calculated using data from every member of the population. The population it was drawn from is small enough in scope that the population mean can be calculated.
2. The sample mean can vary from sample to sample, but the expected value of any sample of 30 drawn from the population will typically be the same as the population mean of 1.54 hours.
3. SE = 0.22/√30 ≈ 0.040
4. The distribution will be approximately bell-shaped (normally distributed). The mean will be 1.54 hours (the same as the population mean).
5. When many samples of size n are taken from a population, the mean of the sampling distribution of sample means is equal to the population mean. Since sample mean is an accurate estimate of the population parameter, it is called an unbiased estimator.
6. A population distribution and sampling distribution might both be normally distributed, but not necessarily. The population distribution histogram can have any distribution, but if the sample size is large, the sampling distribution will always be approximately normal.
7. 0.008
8. 0.037
9. Both distributions are symmetrical and unimodal, but the t-distribution takes on a slightly different shape depending on sample size. The t-distribution will have thicker tails for smaller samples, but as sample size increases, the shape will more closely resemble the bell-shaped normal distribution.
10. All conditions are met. The sample is randomly selected, the sample observations are independent, and the sample size is large enough that population distribution does not matter.
11. The confidence interval provides strong evidence that the catering cost of a wedding has increased because the 95% confidence interval is above $100 and there is no overlap.
12. A confidence interval does not describe a probability. The correct statement should be similar to the following: “We are 95% confident that the mean catering cost of a wedding will be between $106 and $114 per person.”
13. 85 ± 22 pounds
14. H0: μ = 400 and Ha: μ > 400; t = 3.162; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest the ink cartridges last for more than 400 pages (the claim is supported).
15. Reject the null hypothesis; there is strong evidence to suggest that average volume of paint cans is different than 1.02 gallons. The adjustment in the manufacturing process has affected volume of paint cans.
16. The samples are dependent.
17. The samples are independent.
18. H0: μ1 = μ2 and Ha: μ1 ≠ μ2; t = −3.063; p = 0.003; Reject the null hypothesis; there is strong evidence to suggest that there is a difference in the mean number of times that men and women check their Smartphone during the day.
19. H0: μdifference = 0 and Ha: μdifference < 0; t = −1.414; p = 0.126; Fail to reject the null hypothesis; there is insufficient evidence to support the claim that there is a difference in the average test scores while chewing gum and not chewing gum. The chewing gum manufacturer’s claim is not supported.
20. When the alternative hypothesis is two-tailed, the confidence interval and hypothesis test will yield the same results.
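
The test statistics reported in answers 14, 18, and 19 can be reproduced from the summary statistics in the questions. The sketch below is a minimal Python illustration, assuming an unpooled standard error for the two-sample case and differences taken as "without gum" minus "with gum" for the paired case. The function names are hypothetical, and p-values are left to technology, as the questions direct.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(xbar: float, mu0: float, s: float, n: int) -> float:
    """One-sample t statistic: (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


def two_sample_t(x1: float, s1: float, n1: int,
                 x2: float, s2: float, n2: int) -> float:
    """Two-sample t statistic with an unpooled standard error."""
    return (x1 - x2) / sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)


def paired_t(first: list, second: list) -> float:
    """Paired t statistic computed from the differences first - second."""
    diffs = [a - b for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


# Question 14: ink cartridges, claim of 400 pages.
t14 = one_sample_t(415, 400, 30, 40)

# Question 18: men (group 1) versus women (group 2).
t18 = two_sample_t(30, 6, 25, 35, 7, 40)

# Question 19: scores without gum versus with gum for the same four students.
t19 = paired_t([79, 95, 85, 82], [80, 94, 87, 84])

print(round(t14, 3), round(t18, 3), round(t19, 3))  # 3.162 -3.063 -1.414
```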

Chapter 10 Test A—Multiple Choice Section 10.1 (The Basic Ingredients for Testing with Categorical Variables) Use the following table to answer questions (1) - (3). The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of flying than males.

         Expressed a fear of flying   Did not express a fear of flying
Men      42                           58
Women    112                          98

[Objective: Understand a two-way table] How many categorical variables are summarized in the table? a. Four b. Three c. Two d. Zero

[Objective: Understand a two-way table] What fraction represents the proportion of people in the study who did not express a fear of flying? Round to the nearest tenth of a percent. a. 156/310 b. 154/310 c. 100/310 d. 210/310

[Objective: Calculate expected value] Find the expected number of women who should express a fear of flying, if the variables are independent. Round to the nearest whole number. a. 81 b. 104 c. 50 d. 11
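
The expected count in a cell of a two-way table, under independence, is the row total times the column total divided by the grand total. A minimal Python sketch using the fear-of-flying counts from the table above follows; the function and variable names are illustrative only.

```python
# Observed counts from the fear-of-flying table
# (rows: men, women; columns: expressed a fear, did not express a fear).
observed = [[42, 58], [112, 98]]


def expected_counts(table):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


expected = expected_counts(observed)
women_fear = expected[1][0]  # women who expressed a fear of flying
print(round(women_fear))  # 104
```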

Use the following information to answer questions (4) and (5). Lambda olive oil is touted as the “World’s most expensive olive oil”. A twelve-ounce bottle typically costs fifty dollars or more. In a blind taste test, a group of food experts tasted three premium olive oils, one of which was Lambda olive oil. When asked to pick the Lambda olive oil, 84 got it right and 87 got it wrong.

4. [Objective: Calculate expected value] If this group were just guessing, how many people (out of 171) would be expected to guess correctly? a. 86 b. 57 c. 114 d. Not enough information given to calculate expected value.


[Objective: Calculate the Chi-square test statistic] Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. a. 6.39 b. 12.79 c. 23.68 d. 19.18
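
The chi-square statistic sums (observed − expected)² / expected over the categories. The following is a minimal Python sketch for the olive oil taste test, assuming that a guesser picks the correct oil one time in three; the names are illustrative only.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [84, 87]                        # correct, incorrect
n_tasters = sum(observed)                  # 171 tasters in total
expected_correct = n_tasters / 3           # guessing among three oils
expected_wrong = n_tasters - expected_correct
expected = [expected_correct, expected_wrong]

chi2 = chi_square_statistic(observed, expected)
print(expected)        # [57.0, 114.0]
print(round(chi2, 2))  # 19.18
```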

[Objective: Understand the Chi-square Distribution] Choose the statement that is not true about the chi-square distribution or choose (d) if all the statements are true. a. The shape of the chi-square distribution depends on the degrees of freedom. b. The lower the degrees of freedom, the more skewed to the right the chi-square distribution will be. c. Values for the chi-square statistic (on the horizontal axis) can be negative, positive, or zero. d. All of the above statements are true about the chi-square distribution.

Section 10.2 (The Chi-square Test for Goodness-of-fit) Use the following information to answer questions (7) – (9). A dowsing rod is a “Y” or “L” shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.

                                                   Observed   Expected
Correctly identified location of ground water      21         15.6
Incorrectly identified location of ground water    57         62.4

7. [Objective: Conduct a goodness-of-fit test] Choose the correct null and alternative hypothesis.
a. H0: The dowsing rods correctly identify the location of ground water 50% of the time. Ha: The dowsing rods correctly identify the location of ground water more than 50% of the time.
b. H0: The dowsing rods correctly identify the location of ground water 20% of the time. Ha: The dowsing rods correctly identify the location of ground water more than 20% of the time.
c. H0: The dowsing rods work better at locating ground water than guessing. Ha: The dowsing rods work no better at locating ground water than guessing.
d. H0: The dowsing rods work no better at locating ground water than guessing. Ha: The dowsing rods work better at locating ground water than guessing.

8. [Objective: Conduct a goodness-of-fit test] Test the hypothesis that the dowsing rods worked better at locating ground water than guessing. Using a goodness-of-fit test and a 0.05 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement.
a. Fail to reject H0; There is not enough evidence to conclude that the dowsing rods worked better than guessing.
b. Reject H0; There is not enough evidence to conclude that the dowsing rods worked better than guessing.
c. Fail to reject H0; There is enough evidence to conclude that the dowsing rods worked better than guessing.
d. Reject H0; There is enough evidence to conclude that the dowsing rods worked better than guessing.

9. [Objective: Understand the goodness-of-fit test] Of the following statements, which one is not true about the chi-square statistic and p-value? Choose (d) if all statements are true. a. The larger the chi-square statistic, the smaller the p-value. b. Under the assumption that the null is true, the p-value is the probability that the chi-square statistic will be as big as or bigger than the observed value. c. On the chi-square distribution, the p-value is represented by the area under the curve to the right of the chi-square statistic. d. All of the above statements are true.

Section 10.3 (Chi-square Tests for Associations between Categorical Variables)

10. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a random sample of 1,220 U. S. adults were asked about their opinion regarding federal spending on public education. Respondents were asked whether federal spending on public education was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence? a. Homogeneity b. Independence

11. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a researcher was interested in learning more about parents’ concerns when their children go away to college. The researcher asks the parents of 900 randomly selected freshmen at a private college and the parents of 1,020 randomly selected freshmen at a public college to rate their level of concern with the following statement: We are (a) not at all concerned (b) somewhat concerned or (c) very concerned about the potential pressure to drink alcohol that our child will be exposed to while at college. If we wanted to test whether there was an association between the response to the question and the type of college that the freshman was attending, would this be a test of homogeneity or of independence? a. Homogeneity b. Independence


12. [Objective: Identify the correct hypotheses for tests of independence/homogeneity] Suppose a random sample of 1,220 U. S. adults were asked about their opinion regarding federal spending on public education. Respondents were asked whether federal spending on public education was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. Choose the correct hypotheses to test whether there is an association between the response to the question and income level.
a. H0: Among U. S. adults, opinions about federal spending on education and income level are associated. Ha: Among U. S. adults, opinions about federal spending on education and income level are independent.
b. H0: Among U. S. adults, opinions about federal spending on education and income level are independent. Ha: Among U. S. adults, opinions about federal spending on education and income level are associated.
c. H0: There is no difference between the proportions of U. S. adults who responded (a), (b) or (c) to the opinion question. Ha: There is a difference between the proportions of U. S. adults who responded (a), (b) or (c) to the opinion question.
d. None of the above

13. [Objective: Understand the Chi-square test] The table below shows the gender and the percentage of each gender that spent different amounts at a local toy store. The data was taken from a random sample of single shoppers collected over five consecutive Saturdays at the toy store. Choose the reason(s) why you cannot do a chi-square test with this data.

                  $0-$20   >$20-$50   >$50
Male shopper      18%      49%        33%
Female shopper    20%      62%        18%

a. The samples were not collected randomly.
b. The data are from the entire population, not a sample, so inference is unnecessary.
c. There is not enough information to convert the percentages to counts.
d. All of the above.


14. [Objective: Conduct a chi-square test] Suppose a study was conducted to see whether there is an association between marital status and breast cancer remission. The table below shows the results from the study. Assume all conditions for testing have been met.

Patient was in remission _________ after treatment    Married (at the time of treatment)    Unmarried (at the time of treatment)
1 year
2 years
3 or more years

This was an observational study of randomly chosen patients who had received similar chemotherapy treatments. Test the hypothesis that marital status and remission from breast cancer are associated, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.

χ²-Test: χ² = .5601810006, p = .7557153459, df = 2

a. Reject the null hypothesis; marital status and cancer remission are not associated.
b. Reject the null hypothesis; marital status and cancer remission are associated.
c. Fail to reject the null hypothesis; marital status and cancer remission are not associated.
d. Fail to reject the null hypothesis; marital status and cancer remission are associated.


15. [Objective: Conduct a chi-square test] A health foods store owner is thinking about carrying some new products and is interested in her customers’ opinions. The shop owner decides to randomly sample 202 customers and ask them (a) whether they have heard about the health benefits of coconut milk and (b) whether they have heard of the health benefits of Quinoa flour. She also asked each respondent how often they typically make purchases during a month. The table below shows the results from the study. Assume all conditions for testing have been met.

                    Coconut milk, yes?   Coconut milk, no?   Quinoa flour, yes?   Quinoa flour, no?
1-2 purchases/mo.   57                   12                  19                   10
3+ purchases/mo.    21                   11                  25                   8
5+ purchases/mo.    11                   10                  10                   8

Test the hypothesis that how the respondents answered the questions is associated with number of monthly purchases, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.

χ²-Test: χ² = 19.43852896, p = .0034836969, df = 6

a. Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.
b. Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.
c. Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.
d. Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.

16. [Objective: Understand the chi-square test] Choose the statement that is not true about the chi-square test or choose (d) if all the statements are true. a. The conclusion of a chi-square test tells only whether the variables under study are associated, not how they are associated. b. The test statistic is the same for a test of homogeneity or a test for independence; it is chi-square (χ²). c. To conduct a chi-square test you must have a large enough sample. This condition is met if each expected value is 5 or more. d. All of the above statements are true.


Section 10.4 (Hypothesis Tests When Sample Sizes are Small) 17. [Objective: Understand Fisher’s Exact Test] Choose the statement that is not true about Fisher’s Exact Test or choose (d) if all the statements are true. a. When sample size is small resulting in expected cell counts that are less than 5, Fisher’s Exact Test is one option that can be used to conduct a hypothesis test. b. Fisher’s Exact Test cannot be used for tables with more than two rows or columns. c. With Fisher’s Exact Test an exact p-value can be calculated instead of using an approximation for the p-value as is the case with the chi-square test. d. All of the above statements are true.

Use the following information to answer questions (18) and (19). The data in the top row of the table shows the number of vacation days taken by the respondent in the previous 90 days. The respondents also reported their level of happiness; Very H means very happy, and so on.

Vacation days taken in last 90 days

              0    1    2    3    4    5+
Very H        9    15   20   25   31   37
Fairly H      5    8    26   35   48   50
Not very H    6    6    4    2    1    2
Not H         5    7    5    3    0    1

18. [Objective: Conduct hypothesis test when sample size is small] We wish to test whether happiness is associated with taking vacation days. Choose the statement that is true about the hypothesis test or choose (d) if all the statements are false. a. The chi-square test is not appropriate because some expected cell counts will be less than 5. b. The sample size is large so the chi-square test is appropriate. c. A hypothesis test cannot be conducted on this data because some of the observed cell counts are zero and there must be at least one observation in each cell. d. All of the above statements are false.


19. [Objective: Conduct hypothesis test on merged data] The following table shows the data after merging categories so that there are two column categories (0-1 vacation days and 2 or more vacation days), and two row categories (happy and unhappy). Expected values for each cell are also shown in parentheses. Test the hypothesis that there is an association between happiness and number of vacation days taken in the last 90 days, using a significance level of 0.05. State the value of the test statistic rounded to two decimal places and state whether the p-value is closer to zero or one.

                     Happy          Unhappy
0-1 vacation days    37 (53.701)    24 (7.2991)
2+ vacation days     272 (255.3)    18 (34.701)

a. χ² = 4.22; The p-value will be close to zero.
b. χ² = 4.22; The p-value will be close to one.
c. χ² = 52.54; The p-value will be close to zero.
d. χ² = 52.54; The p-value will be close to one.

20. [Objective: Conduct Fisher’s Exact Test] The following table shows the results from a study to see if a home remedy ointment for mosquito bites worked better than a placebo. Each participant was randomly assigned to receive the home remedy ointment or the placebo ointment. “Improvement” means no symptoms of itching after three minutes.

               No Improvement   Improvement   Total
Home remedy    2                6             8
Placebo        4                5             9
Total          6                11            17

The alternative hypothesis is that the home remedy ointment leads to improvement (in this case, less itching). The p-value for a one-tailed Fisher’s exact test with these data is 0.618. Suppose the study had turned out differently, as in the following table.

               No Improvement   Improvement   Total
Home remedy    0                8             8
Placebo        6                3             9
Total          6                11            17

Would Fisher’s Exact Test have led to a p-value larger or smaller than 0.618?

a. The p-value would be larger.
b. The p-value would be smaller.

Chapter 10 Test A—Answer Key 1. C 2. A 3. B 4. B 5. D 6. C 7. D 8. A 9. D 10. B 11. A 12. B 13. C 14. C 15. D 16. D 17. B 18. A 19. C 20. B
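As a worked check of answers 18 and 19, the following Python sketch recomputes the expected counts and the chi-square statistic. It assumes the counts for questions (18) and (19) are read with one row per happiness level and one column per number of vacation days (0, 1, 2, 3, 4, 5+). The function names are illustrative and are not part of the test bank.

```python
# Observed counts transcribed from the table for questions (18) and (19).
# Assumption: rows are happiness levels, columns are vacation days 0, 1, 2, 3, 4, 5+.
observed = [
    [9, 15, 20, 25, 31, 37],  # Very H
    [5, 8, 26, 35, 48, 50],   # Fairly H
    [6, 6, 4, 2, 1, 2],       # Not very H
    [5, 7, 5, 3, 0, 1],       # Not H
]


def expected_counts(table):
    """Expected count per cell under independence: row total * column total / n."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (observed - expected)^2 / expected over all cells."""
    expected = expected_counts(table)
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )


def merge(table):
    """Collapse to 2x2: rows 0-1 days and 2+ days, columns happy and unhappy."""
    happy = [a + b for a, b in zip(table[0], table[1])]
    unhappy = [a + b for a, b in zip(table[2], table[3])]
    return [
        [sum(happy[:2]), sum(unhappy[:2])],
        [sum(happy[2:]), sum(unhappy[2:])],
    ]


smallest_expected = min(min(row) for row in expected_counts(observed))
merged = merge(observed)
statistic = chi_square(merged)

print(smallest_expected < 5)   # True: the full table violates the "5 or more" condition
print(merged)                  # [[37, 24], [272, 18]]
print(round(statistic, 2))     # 52.54
```

The smallest expected count in the full table is below 5, which is the reason given in answer 18(a); after merging, the statistic agrees with answer 19(c).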


Chapter 10 Test B—Multiple Choice Section 10.1 (The Basic Ingredients for Testing with Categorical Variables) Use the following table to answer questions (1) - (3). The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of heights than males.

         Expressed a fear of heights   Did not express a fear of heights
Men      68                            94
Women    109                           89

[Objective: Understand a two-way table] What fraction represents the proportion of people in the study who expressed a fear of heights? a. 183/360 b. 177/360 c. 162/360 d. 198/360

[Objective: Understand a two-way table] How many categorical variables are summarized in the table? a. Two b. Three c. Four d. zero

[Objective: Calculate expected value] Find the expected number of men who should express a fear of heights, if the variables are independent. Round to the nearest whole number. a. 97 b. 80 c. 87 d. 93

Use the following information to answer questions (4) and (5). Bellei Extravecchio Balsamico Tradizionale is a very expensive brand of balsamic vinegar. A twelve-ounce bottle typically costs two hundred dollars or more. In a blind taste test, a group of food experts tasted three premium balsamic vinegars, with the most expensive one being Bellei Extravecchio Balsamico Tradizionale. When asked to pick the most expensive balsamic vinegar, 92 got it right and 76 got it wrong.

4. [Objective: Calculate expected value] If this group were just guessing, how many people (out of 168) would be expected to guess correctly? a. 84 b. 70 c. 56 d. Not enough information given to calculate expected value.


[Objective: Calculate the Chi-square test statistic] Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. a. 11.57 b. 23.14 c. 41.21 d. 34.71

[Objective: Understand the Chi-square Distribution] Choose the statement that is not true about the chi-square distribution or choose (d) if all the statements are true. a. Values for the chi-square statistic (on the horizontal axis) can be negative, positive, or zero. b. The shape of the chi-square distribution depends on the degrees of freedom. c. The lower the degrees of freedom, the more skewed to the right the chi-square distribution will be. d. All of the above statements are true about the chi-square distribution.

                                                   Observed   Expected
Correctly identified location of ground water      23         18.2
Incorrectly identified location of ground water    68         72.8

[Objective: Conduct a goodness-of-fit test] Choose the correct null and alternative hypothesis.

a. H0: The dowsing rods correctly identify the location of ground water 50% of the time. Ha: The dowsing rods correctly identify the location of ground water more than 50% of the time.

b. H0: The dowsing rods work no better at locating ground water than guessing. Ha: The dowsing rods work better at locating ground water than guessing.

c. H0: The dowsing rods correctly identify the location of ground water 20% of the time. Ha: The dowsing rods correctly identify the location of ground water more than 20% of the time.

d. H0: The dowsing rods work better at locating ground water than guessing. Ha: The dowsing rods work no better at locating ground water than guessing.

8. [Objective: Conduct a goodness-of-fit test] Test the hypothesis that the dowsing rods worked better at locating ground water than guessing. Using a goodness-of-fit test and a 0.05 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement.

a. Reject H0. There is enough evidence to conclude that the dowsing rods worked better than guessing.
b. Fail to reject H0. There is not enough evidence to conclude that the dowsing rods worked better than guessing.
c. Reject H0. There is not enough evidence to conclude that the dowsing rods worked better than guessing.
d. Fail to reject H0. There is enough evidence to conclude that the dowsing rods worked better than guessing.

[Objective: Understand the goodness-of-fit test] Of the following statements, which one is not true about the chi-square statistic and p-value? Choose (d) if all statements are true. a. The smaller the chi-square statistic, the larger the p-value. b. Under the assumption that the null is true, the p-value is the probability that the chi-square statistic will be as big as or bigger than the observed value. c. On the chi-square distribution, the p-value is represented by the area under the curve to the right of the chi-square statistic. d. All of the above statements are true.

Section 10.3 (Chi-square Tests for Associations between Categorical Variables) 10. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a researcher was interested in learning more about parents’ concerns when their children start elementary school. The researcher asks the parents of 800 randomly selected first graders in a rural school district and the parents of 950 randomly selected first graders in an urban school district to rate their level of concern with the following statement: We are (a) not at all concerned (b) somewhat concerned or (c) very concerned about the nutrition level of school lunches. If we wanted to test whether there was an association between the response to the question and the type of school district that the first grader was attending, would this be a test of homogeneity or of independence? a. Homogeneity b. Independence

11. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a random sample of 1,220 U. S. adults were asked about their opinion regarding federal spending on infrastructure (i.e. roads and bridges). Respondents were asked whether federal spending on infrastructure was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence? a. Homogeneity b. Independence


12. [Objective: Identify the correct hypotheses for tests of independence/homogeneity] Suppose a random sample of 1,220 U. S. adults were asked about their opinion regarding federal spending on infrastructure (i.e. roads and bridges). Respondents were asked whether federal spending on infrastructure was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. Choose the correct hypotheses to test whether there is an association between the response to the question and income level. a.

H 0 : Among U. S. adults, opinions about federal spending on infrastructure and income level are associated. H a : Among U. S. adults, opinions about federal spending on infrastructure and income level are independent.

H 0 : Among U. S. adults, opinions about federal spending on infrastructure and income level are independent. H a : Among U. S. adults, opinions about federal spending on infrastructure and income level are associated.

None of the above

13. [Objective: Understand the Chi-square test] The table below shows the gender and the percentage of each gender that spent different amounts at a local hardware store. The data was taken from a random sample of single shoppers collected over five consecutive Saturdays at the hardware store. Choose the reason(s) why you cannot do a chi-square test with this data.

             Male shopper   Female shopper
$0-$20       27%            46%
>$20-$50     42%            35%
>$50         31%            19%

a. There is not enough information to convert the percentages to counts.
b. The samples were not collected randomly.
c. The data are from the entire population, not a sample, so inference is unnecessary.
d. All of the above.


14. [Objective: Conduct a chi-square test] Suppose a study was conducted to see whether there is an association between marital status and vehicle color. The table below shows the results from the study. Assume all conditions for testing have been met.

Vehicle color    red    black/white/silver    blue/green/other
Married
Unmarried

This was an observational study of randomly chosen vehicle owners. Test the hypothesis that marital status and vehicle color are associated, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.

χ²-Test: χ² = 1.754774169, p = .4158681217, df = 2

a. Reject the null hypothesis; marital status and vehicle color are not associated.
b. Reject the null hypothesis; marital status and vehicle color are associated.
c. Fail to reject the null hypothesis; marital status and vehicle color are associated.
d. Fail to reject the null hypothesis; marital status and vehicle color are not associated.


                    Coconut milk, yes?   Coconut milk, no?   Quinoa flour, yes?   Quinoa flour, no?
1-2 purchases/mo.   57                   12                  19                   10
3+ purchases/mo.    21                   11                  25                   8
5+ purchases/mo.    11                   10                  10                   8

χ²-Test: χ² = 19.43852896, p = .0034836969, df = 6

a. Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.
b. Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.
c. Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.
d. Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.

16. [Objective: Understand the chi-square test] Choose the statement that is not true about the chi-square test or choose (d) if all the statements are true. a. The conclusion of a chi-square test tells whether the variables under study are associated and how they are associated. b. The test statistic is the same for a test of homogeneity or a test for independence; it is chi-square (χ²). c. To conduct a chi-square test you must have a large enough sample. This condition is met if each expected value is 5 or more. d. All of the above statements are true.


Section 10.4 (Hypothesis Tests When Sample Sizes are Small) 17. [Objective: Understand Fisher’s Exact Test] Choose the statement that is not true about Fisher’s Exact Test or choose (d) if all the statements are true. a. When sample size is small resulting in expected cell counts that are less than 5, Fisher’s Exact Test is one option that can be used to conduct a hypothesis test. b. Fisher’s Exact Test can be used for tables with more than two rows or columns. c. With Fisher’s Exact Test an exact p-value can be calculated instead of using an approximation for the p-value as is the case with the chi-square test. d. All of the above statements are true.

Use the following information to answer questions (18) and (19). The data in the table shows the number of vacation days taken by the respondent in the previous 90 days in the top row. The respondents also reported their level of happiness; Very H means very happy, and so on.

Vacation days taken in last 90 days

              0    1    2    3    4    5+
Very H        10   14   19   27   30   37
Fairly H      6    7    28   33   45   49
Not very H    5    7    6    3    0    1
Not H         6    6    5    2    1    2


                     Happy        Unhappy
0-1 vacation days    84 (104)     35 (15.003)
2+ vacation days     221 (201)    9 (28.997)

a. χ² = 46.28; The p-value will be close to one.
b. χ² = 46.28; The p-value will be close to zero.
c. χ² = 5.03; The p-value will be close to one.
d. χ² = 5.03; The p-value will be close to zero.

               No Improvement   Improvement   Total
Home remedy    2                6             8
Placebo        4                5             9
Total          6                11            17

The alternative hypothesis is that the home remedy ointment leads to improvement (in this case, less itching). The p-value for a one-tailed Fisher’s Exact Test with these data is 0.618. Suppose the study had turned out differently, as in the following table.

               No Improvement   Improvement   Total
Home remedy    6                2             8
Placebo        0                9             9
Total          6                11            17

Would Fisher’s Exact Test have led to a p-value larger or smaller than 0.618?

a. The p-value would be larger.
b. The p-value would be smaller.


Chapter 10 Test B—Answer Key 1. B 2. A 3. B 4. C 5. D 6. A 7. B 8. B 9. D 10. A 11. B 12. C 13. A 14. D 15. C 16. A 17. D 18. A 19. B 20. A
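As a worked check of answers 4 and 5, the sketch below computes the expected counts and the goodness-of-fit statistic for the balsamic vinegar taste test. It assumes that "just guessing" among three vinegars means a 1/3 chance of a correct pick; the function name `goodness_of_fit` is illustrative only.

```python
def goodness_of_fit(observed, probabilities):
    """Return expected counts and the chi-square goodness-of-fit statistic."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic


# 92 correct and 76 incorrect picks out of 168 tasters.
# Assumption: pure guessing among three vinegars gives P(correct) = 1/3.
expected, statistic = goodness_of_fit([92, 76], [1 / 3, 2 / 3])

print([round(e, 2) for e in expected])  # [56.0, 112.0]
print(round(statistic, 2))              # 34.71
```

The same function applies to any goodness-of-fit question in this chapter once the null-hypothesis probabilities are stated.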


Chapter 10 Test C—Short Answer Section 10.1 (The Basic Ingredients for Testing with Categorical Variables) Use the following table to answer questions (1) - (3). The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of public speaking than males.

         Expressed a fear of public speaking   Did not express a fear of public speaking
Men      112                                   91
Women    79                                    82

[Objective: Understand a two-way table] Describe the categorical variables that are summarized in the table. Describe one association that could be tested based on the information from the table.

[Objective: Understand a two-way table] Calculate the percentage of people in the study that did not express a fear of public speaking. Round to the nearest tenth of a percent.

[Objective: Calculate expected value] Find the expected number of men who should express a fear of public speaking, if the variables are independent. Round to the nearest hundredth.

Use the following information to answer questions (4) and (5). Almas caviar is a very expensive Iranian brand of Beluga sturgeon caviar. A one-pound gold-plated tin typically costs $12,000 or more. In a blind taste test, a group of food experts tasted four premium Beluga sturgeon caviars, with the most expensive one being Almas caviar. When asked to pick the most expensive caviar, 141 got it right and 79 got it wrong.

4. [Objective: Calculate expected value] If this group were just guessing, how many people (out of 220) would be expected to guess correctly?

[Objective: Calculate the Chi-square test statistic] Calculate the observed value of the chi-square statistic by hand. Show all your work. Round to the nearest hundredth.

6. [Objective: Understand the Chi-square Distribution] Compare the sampling distributions of the normally distributed test statistic z and the chi-square test statistic. How do the sampling distributions differ in shape and range of values?

Section 10.2 (The Chi-square Test for Goodness-of-fit) Use the following information to answer questions (7) – (8). There are those who believe that identical twins have a psychic connection. Skeptics of this belief conducted an experiment to see if identical twins had some kind of psychic connection. A twin was placed at random behind one of four similar doors and three other non-related people were placed behind the other three doors. The other twin was asked to identify which door their sibling was behind. The results of the experiment are summarized below, along with the output for the goodness-of-fit test.

                                                       Observed   Expected
Correctly identified the door concealing the twin      51         30.25
Incorrectly identified the door concealing the twin    70         90.75

χ²-Test: χ² = 18.97796143, p = 1.3223704E-5, df = 1

[Objective: Conduct a goodness-of-fit test] In the context of this description, state the correct null and alternative hypothesis to test the claim that identical twins have a psychic connection.

[Objective: Conduct a goodness-of-fit test] Test the hypothesis that identical twins have a psychic connection and could locate their twin behind the door more often than would be expected by guessing. Using a goodness-of-fit test and a 0.05 level of significance, state the correct decision regarding the null hypothesis and summarize your conclusion using a complete sentence.

9. [Objective: Understand the goodness-of-fit test] Suppose a goodness-of-fit test is used to test the claim that obesity rates in the elderly have changed since the time of the Egyptian mummies. The p-value is calculated to be 0.00023. Describe the value of the chi-square test statistic (is it likely to be large? small?) and the decision regarding the null hypothesis that there is no difference in obesity rates for mummies and modern-day elderly people.

Section 10.3 (Chi-square Tests for Associations between Categorical Variables) 10. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a researcher was interested in learning more about high school seniors’ concerns about the future. The researcher asks 750 randomly selected female high school seniors and 800 randomly selected male high school seniors to rate their level of concern with the following statement: I am (a) not at all concerned (b) somewhat concerned or (c) very concerned about getting accepted into a two- or four-year college. If we wanted to test whether there was an association between the response to the question and the gender of the high school senior, would this be a test of homogeneity or independence?

11. [Objective: Differentiate between a test for homogeneity and a test for independence] Suppose a random sample of 1,105 adults were asked about their opinion regarding salaries of full-time employees at a local state funded university. Respondents were asked whether salaries at the university were (a) too low, (b) adequate, or (c) too high. Respondents were also classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence?

12. [Objective: Identify the correct hypotheses for tests of independence/homogeneity] Suppose a random sample of 1,105 adults were asked about their opinion regarding salaries of full-time employees at a local state-funded university. Respondents were asked whether salaries at the university were (a) too low, (b) adequate, or (c) too high. Respondents were also classified by income level. State the correct hypotheses to test whether there is an association between the response to the question and income level.


13. [Objective: Understand the Chi-square test] The table below shows the gender and the percentage of each gender that spent different amounts at a local book store. The data was taken from a random sample of single shoppers collected over five consecutive Saturdays at the book store. Can a chi-square test be done with this data? Explain why or why not.

             Male shopper   Female shopper
$0-$20       36%            67%
>$20-$50     42%            21%
>$50         22%            12%

Use the following information to answer questions (14) and (15). Suppose that you are a researcher and that you want to research whether a home remedy that recommends applying a salve of tobacco and saliva to cuts or scrapes to absorb toxins really works. You decide to conduct a study to see whether there is an association between the treatment and the outcome. A positive outcome means that the patient reported lower pain levels and the wound had healed without infection in less than 14 days. The table below shows the results from your study. Assume all conditions for testing have been met.

                        Home remedy    Placebo
Positive outcome
Not positive outcome

14. [Objective: Conduct a chi-square test] Test the hypothesis that the treatment is associated with the outcome, using a significance level of 0.05. State the correct decision regarding the null hypothesis and write a conclusion using a full sentence. Refer to the TI-83 output below.

χ²-Test: χ² = 4.776341117, p = .0288533643, df = 1


15. [Objective: Understand the chi-square test] From a purely scientific point-of-view, would you suggest the tobacco and saliva salve to a co-worker who expressed an interest in home remedies? Explain why or why not.

16. [Objective: Conduct a chi-square test] An eco-friendly home improvement store owner is thinking about carrying some new products and is interested in her customers’ opinions. The shop owner decides to randomly sample 397 customers and ask them (a) whether they have heard about bricks made from recycled materials and (b) whether they have heard of landscaping materials made from recycled glass. She also asked each respondent how often they typically make purchases during a month. The table below shows the results from the study. Assume all conditions for testing have been met.

                    Recycled bricks, yes?   Recycled bricks, no?   Landscaping glass, yes?   Landscaping glass, no?
1-2 purchases/mo.   82                      44                     23                        11
3+ purchases/mo.    42                      32                     7                         18
5+ purchases/mo.    72                      12                     44                        10

Test the hypothesis that how the respondents answered the questions is associated with number of monthly purchases, using a significance level of 0.05. State the correct decision regarding the null hypothesis and write a conclusion using a full sentence. Refer to the output below.

χ²-Test: χ² = 50.30560555, p = 4.0824186E-9, df = 6


Section 10.4 (Hypothesis Tests When Sample Sizes are Small) Use the following information to answer questions (17) and (18). The data in the top row of the table shows the number of days for which the respondent participated in an outdoor activity for at least thirty minutes in the previous 60 days. The respondents also reported their level of happiness; Very H means very happy, and so on.

Days of outdoor activity (30 minutes or more)

Level of Happiness    0    1    2    3    4    5+
Very H                8    12   17   19   21   24
Fairly H              4    5    25   14   25   30
Not very H            4    6    6    3    0    1
Not H                 7    5    6    1    0    1

17. [Objective: Conduct hypothesis test when sample size is small] Suppose you want to test the hypothesis that happiness level is associated with level of daily outdoor activity. Can a chi-square test for independence be used to test the hypothesis? Explain why or why not.

18. [Objective: Conduct hypothesis test on merged data] The following table shows the data after merging categories so that there are two column categories (0-2 days of outdoor activity of at least thirty minutes and 3 or more days of outdoor activity of at least thirty minutes), and two row categories (happy and unhappy). Expected values for each cell are also shown in parentheses. Test the hypothesis that there is an association between happiness and level of daily outdoor activity in the last 60 days, using a significance level of 0.05. State the value of the test statistic rounded to two decimal places and state whether the p-value is closer to zero or one.

                                Happy           Unhappy
0-2 days of outdoor activity    71 (87.787)     34 (17.213)
3+ days of outdoor activity     133 (116.21)    6 (22.787)


19. [Objective: Conduct hypothesis test when sample size is small] Describe at least one advantage and one disadvantage of combining categories as was done in the previous question.

20. [Objective: Conduct Fisher’s Exact Test] The following table shows the results from a study to see if a home remedy salve for bee stings worked better than a placebo. Each participant was randomly assigned to receive the home remedy salve or a placebo ointment. “Improvement” means that there were no symptoms of pain after fifteen minutes.

               No Improvement   Improvement   Total
Home remedy    2                6             8
Placebo        1                7             8
Total          3                13            16

The alternative hypothesis is that the home remedy salve leads to improvement (in this case, no pain). The p-value for a one-tailed Fisher’s Exact Test with these data is 0.500. Suppose the study had turned out differently, as in the following table.

               No Improvement   Improvement   Total
Home remedy    0                8             8
Placebo        3                5             8
Total          3                13            16

Would Fisher’s Exact Test have led to a p-value larger or smaller than 0.500? Explain.


Chapter 10 Test C—Answer Key

1. Gender and the response to the question regarding fear of public speaking are the categorical variables. One association that could be tested is whether there is an association between gender and a fear of public speaking.
2. 173/364 = 0.475 or 47.5%
3. 106.52
4. 55
5. χ² = 179.30
6. The sampling distribution of z is bell-shaped and values can be any real number. The sampling distribution of χ² is usually not symmetric and is often right skewed. The values of χ² cannot be negative.
7. H0: The identical twin did no better than guessing which door concealed the other twin; Ha: The identical twin did better than guessing which door concealed the other twin.
8. Reject the null hypothesis; there is enough evidence to conclude that the identical twin did better than guessing which door concealed the other twin.
9. The chi-square statistic will be large. The p-value supports a decision to reject the null hypothesis.
10. Homogeneity
11. Independence
12. H0: Among adults, opinions about state-funded university salaries for full-time employees and income level are independent. Ha: Among adults, opinions about state-funded university salaries for full-time employees and income level are associated.
13. The data are given as percentages, not frequencies, and there is not enough information given to convert the percentages to counts.
14. Reject the null hypothesis. There is enough evidence to conclude that the treatment and the outcome are associated.
15. Yes, they should try the home remedy because the study showed that the treatment and the positive outcome were associated.
16. Reject the null hypothesis. How the respondents answered the question and number of monthly purchases are associated.
17. No, because some of the expected cell counts will be less than 5 and a chi-square test would result in inaccurate p-values, which could lead to a wrong conclusion.
18. χ² = 34.37; the p-value will be close to zero.
19. If combined categories result in expected cell counts that are 5 or greater, then an advantage of combining categories is that you can conduct a chi-square test which will result in accurate p-values. A disadvantage of combining categories is that any conclusions will now apply to a broader group and may lose some of their practical value.
20. The p-value would be smaller than 0.500 because these results are more extreme.
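The p-values in the calculator output quoted in questions 7-8, 14, and 16 can be reproduced without a statistics library. The sketch below relies on two standard closed forms for the right-tail area of the chi-square distribution: the complementary error function for df = 1, and a finite sum for even df. The function name `chi_square_p_value` is illustrative, and odd df greater than 1 are deliberately left unsupported.

```python
import math


def chi_square_p_value(statistic, df):
    """Right-tail area of the chi-square distribution for df = 1 or even df."""
    half = statistic / 2
    if df == 1:
        # For df = 1 the statistic is a squared standard normal value.
        return math.erfc(math.sqrt(half))
    if df > 0 and df % 2 == 0:
        # For even df the tail area is a finite sum of Poisson-like terms.
        terms = df // 2
        return math.exp(-half) * sum(half ** k / math.factorial(k) for k in range(terms))
    raise ValueError("this sketch supports only df = 1 or a positive even df")


# Statistics and degrees of freedom as quoted in the test output.
p_twins = chi_square_p_value(18.97796143, 1)
p_salve = chi_square_p_value(4.776341117, 1)
p_store = chi_square_p_value(50.30560555, 6)

for p in (p_twins, p_salve, p_store):
    # Each p-value is below 0.05, so the null hypothesis is rejected in each case.
    print(p, "reject H0" if p < 0.05 else "fail to reject H0")
```

A larger statistic always gives a smaller right-tail area, which is why a p-value of 0.00023 in question 9 corresponds to a large chi-square statistic.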

Chapter 11 Test A: Multiple Choice

Section 11.1 (The Problem of Multiple Comparisons)

1. [Objective: Demonstrate understanding of ANOVA] Choose the statement that is not true about multiple comparisons and ANOVA. Choose (d) if all the statements are true. a. ANOVA is a method for testing whether there is an association between a categorical variable and a numerical variable. b. When doing multiple comparisons, the response variable is always numerical, but the independent variable can be numerical or categorical. c. When doing a multiple comparison, the overall significance level will increase, meaning it is more likely that an incorrect conclusion will be drawn. d. All of the above statements are true.

2. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: You wish to test whether the mean number of words recalled from short term memory is different for males and females. a. One-sample t-test b. Two-sample t-test c. ANOVA d. Chi-square test

3. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: You wish to test whether an association exists between the type of vehicle a driver owns and the cost of speeding tickets. a. One-sample t-test b. Two-sample t-test c. ANOVA d. Chi-square test

5. [Objective: Demonstrate understanding of the Bonferroni Correction] Suppose you have observations from six different regions within your state and you wish to do hypothesis tests to compare the mean income across groups. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.10? Round to the nearest thousandth. a. 0.050 b. 0.007 c. 0.033 d. None of the above
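The Bonferroni arithmetic in the question above can be sketched in a few lines of Python. This is a minimal illustration, assuming every pair of groups is compared once; the function name `bonferroni_alpha` is hypothetical and not part of the test bank.

```python
from math import comb


def bonferroni_alpha(num_groups: int, overall_alpha: float) -> float:
    """Per-test significance level when all pairs of groups are compared."""
    # Number of pairwise comparisons among the groups: "num_groups choose 2".
    num_comparisons = comb(num_groups, 2)
    # Bonferroni Correction: divide the overall level by the number of tests.
    return overall_alpha / num_comparisons


# Six regions with an overall significance level of 0.10.
print(comb(6, 2))                            # 15 pairwise comparisons
print(round(bonferroni_alpha(6, 0.10), 3))   # 0.007
```

With six groups there are 15 pairwise comparisons, so each test would use roughly 0.10 / 15 ≈ 0.007.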

Section 11.2 (Analysis of Variance)

6. [Objective: Understand the purpose of ANOVA] Choose the statement that best describes the purpose of ANOVA. a. ANOVA is a procedure for comparing the means of several groups. b. ANOVA is a procedure for comparing different categories for several groups. c. The ANOVA procedure will reveal whether the means of several groups are different and which group or groups have a different mean. d. None of the above.

7. [Objective: Understand the F-statistic] Identify the test statistic used for the ANOVA procedure and how it is calculated. a. The test statistic is z and is the ratio of the mean within a group to the variation between groups. b. The test statistic is F and is calculated by finding the difference between group means. c. The test statistic is z and is calculated by finding the mean z-score between groups. d. The test statistic is F and is the ratio of the variation between groups to the variation within groups.

8. [Objective: Understand the F-statistic] Suppose a researcher collected data to compare whether dogs of different size categories differed in mean cost to a pet owner. Dogs were categorized as small, medium, or large. Cost was calculated as the average annual amount spent on food, veterinary visits, and medications. The calculated F-statistic was 421.58. Given this test statistic, which of the following is the most reasonable conclusion? a. The F-statistic shows that variation within groups is larger than variation between groups; therefore, the researcher will likely conclude that there is not an association between dog size and mean cost to the pet owner. b. The F-statistic shows that large dogs have a significantly higher cost to the pet owner than the other categories; therefore, the researcher will conclude that there is an association between dog size and mean cost to the pet owner. c. The F-statistic shows that variation between groups is larger than variation within groups; therefore, the researcher will likely conclude that there is an association between dog size and mean cost to the pet owner. d. None of the above

Use the following information for questions (9) and (10). Researchers conducted a study that examined marital status and stress levels. A hypothesis test was conducted to test the claim that people with different marital statuses have a different mean stress level. The TI-84 output for the test is shown below.

One-way ANOVA
F=5.21104034
p=.0105994136
Factor df=3 SS=76.7 MS=25.5666667

9. [Objective: Understand the ANOVA test] State the null and alternative hypotheses.
a. H0: Marital status and stress levels are associated. Ha: Marital status and stress levels are not associated.
b. H0: Marital status and stress levels are not associated. Ha: Marital status and stress levels are associated.
c. H0: Marital status and stress levels are not associated. Ha: Mean stress levels of married people are greater than the mean stress levels of unmarried people.
d. H0: There is no difference in the mean stress levels of married and unmarried people. Ha: There is a difference in the mean stress levels of married and unmarried people.

10. [Objective: Understand the ANOVA test] What is the value of the test statistic? Round to the nearest hundredth if necessary. a. 25.27 b. 76.70 c. 5.21 d. Can’t be determined with the given information

11. [Objective: Understand the ANOVA test] In the context of the ANOVA test, which of the following phrases is equivalent to the phrase “variation between groups”? a. Variation due to treatment b. Variation due to factors c. Explained variation d. All of the above

Section 11.3 (The ANOVA Test)

12. [Objective: Understand the ANOVA test] A movie studio did a poll to determine whether women in different age groups watched different amounts of horror movies. Check the computer output to see whether the same-variance condition for ANOVA holds. Is ANOVA appropriate?

One-way ANOVA
F=15.44055794
p=7.5111227E-5
Factor df=2 SS=1957.90476 MS=978.952381
Error df=21 SS=1331.42857 MS=63.4013605
Sxp=7.96249713

Cat      n    Mean   StDev
17-24    20   9.5    6.02
25-32    12   14.7   4.82
33-40    21   6.5    10.21

a. Yes b. No
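The same-variance check asked for above can be sketched as follows. This is a minimal sketch, assuming the rule of thumb used in the Test C answer key (the largest standard deviation should be no more than twice the smallest); the function name `same_variance_ok` and the `max_ratio` parameter are hypothetical.

```python
def same_variance_ok(std_devs, max_ratio=2.0):
    """Rule-of-thumb check: largest SD is at most max_ratio times the smallest."""
    largest = max(std_devs)
    smallest = min(std_devs)
    return largest <= max_ratio * smallest


# Group standard deviations from the output above.
horror_movie_sds = [6.02, 4.82, 10.21]
# 10.21 exceeds 2 * 4.82 = 9.64, so the condition does not hold.
print(same_variance_ok(horror_movie_sds))  # False
```

Because the check fails for these data, ANOVA would not be considered appropriate under this rule of thumb.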

13. [Objective: Understand the ANOVA test] Which of the following is not one of the conditions that must be checked in order for the calculated F-statistic to follow the F-distribution? a. The groups are independent of each other. b. Each group’s population must be at least 10 times larger than its respective sample. c. The variances or standard deviations of the groups must be equal. d. The distribution of the observations is Normal in each group’s population or the sample size is large.

14. [Objective: Understand the ANOVA test] Choose the statement that best describes the F-statistic for the ANOVA test. a. The F-statistic compares the variation between groups to the variation within groups. A large F-statistic indicates that variation between groups is small relative to variation within groups. b. The F-statistic is the probability of getting the sample results, assuming that there is no difference in the groups. c. The F-statistic compares the variation between groups to the variation within groups. A large F-statistic indicates that variation between groups is large relative to variation within groups. d. None of the above

Use the following information for questions (15) – (16). A group of home gardeners wants to test whether the type of soil used to grow heirloom tomatoes has an effect on the number of tomatoes harvested. Gardeners randomly assigned tomato plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Forty plants were assigned to each group. At the end of the growing season the number of tomatoes harvested was counted. Assume that all other conditions for the ANOVA test have been met.

One-way ANOVA
F=12.57818941
p=2.5637132E-4
Factor df=2 SS=710.349206 MS=355.174603

15. [Objective: Understand the ANOVA test] State the null and alternative hypotheses.
a. H0: µ_none < µ_commercial < µ_compost. Ha: There is no difference in the mean number of tomatoes harvested.
b. H0: µ_none = µ_commercial = µ_compost. Ha: The mean number of tomatoes harvested differs by type of fertilizer used.
c. H0: The mean number of tomatoes harvested differs by type of fertilizer used. Ha: There is no difference in the mean number of tomatoes harvested by type of fertilizer used.
d. None of the above

16. [Objective: Understand the ANOVA test] Using the test results provided, test the hypothesis that soil treatment affects the number of tomatoes harvested. Use a significance level of 5%. Choose the correct decision regarding the null hypothesis and correct conclusion.
a. Reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested.
b. Fail to reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested.
c. Reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested.
d. Fail to reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested.
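The decision step in the question above compares the reported p-value with the chosen significance level. A minimal Python sketch of that comparison follows; the function name `anova_decision` is hypothetical, and the sketch assumes the usual convention of rejecting when the p-value is less than or equal to the significance level.

```python
def anova_decision(p_value: float, alpha: float = 0.05) -> str:
    """Return the decision about H0 for a given p-value and significance level."""
    if p_value <= alpha:
        return "Reject H0"
    return "Fail to reject H0"


# p-value reported in the tomato output above, tested at the 5% level.
print(anova_decision(2.5637132e-4))  # Reject H0
```

The decision only says whether H0 is rejected; the wording of the conclusion (that soil treatment affects the harvest) still has to be stated in context.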

Use the following information to answer questions (17) - (19). Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study: red, white, black, and silver. Thirty vehicles were randomly assigned to each group.

One-way ANOVA
Variation between groups=462.5
Variation within groups=388.54
p=.3547790947

17. [Objective: Understand the ANOVA test] Compute the F-statistic. Round to the nearest hundredth. a. 1.19 b. 0.84 c. 0.35 d. Not enough information is given
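As a quick check of the computation asked for above, the F-statistic is the ratio of the variation between groups to the variation within groups. The sketch below uses the two values from the output; the function name `f_statistic` is hypothetical.

```python
def f_statistic(variation_between: float, variation_within: float) -> float:
    """F is the variation between groups divided by the variation within groups."""
    return variation_between / variation_within


# Values from the speeding-ticket output above.
print(round(f_statistic(462.5, 388.54), 2))  # 1.19
```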

18. [Objective: Understand the ANOVA test] Choose the correct conclusion for the hypothesis that vehicle color affects the amount of a speeding ticket. Assume all ANOVA test conditions have been satisfied.
a. Reject H0. The vehicle color has an effect on the amount of the speeding ticket.
b. Fail to reject H0. The vehicle color has an effect on the amount of the speeding ticket.
c. Reject H0. The vehicle color has no effect on the amount of the speeding ticket.
d. Fail to reject H0. The vehicle color has no effect on the amount of the speeding ticket.

Section 11.4 (Post Hoc Procedures)

19. [Objective: Understand Post-hoc Procedures] Do the ANOVA test results warrant a Post-hoc procedure? a. Yes b. No

20. [Objective: Understand Post-hoc Procedures] An ANOVA test was conducted to see whether bike frame type (Type A, Type B, or Type C) had an effect on speed over a one mile distance. Test results warranted post-hoc procedures. The Tukey HSD approach was used with the following results:

Group Comparison     98.33% Confidence Interval
Type A – Type B      (−12.70, −4.29)
Type A – Type C      (−19.63, −8.87)
Type B – Type C      (−9.89, −1.61)

Is there evidence that one type of bike frame is faster than the others? Which type of frame appears to be the fastest?
a. Yes, the confidence interval results show that frame type A is faster than B or C.
b. Yes, the confidence interval results show that frame type B is faster than A or C.
c. Yes, the confidence interval results show that frame type C is faster than A or B.
d. No, there is not enough evidence to say with confidence that one frame type is faster than the others because none of the confidence intervals contains zero.
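Reading the post-hoc intervals above comes down to checking whether each interval contains zero. The Python sketch below shows that check; the dictionary layout and the function name `contains_zero` are hypothetical. Which frame counts as "fastest" also depends on an assumption the question leaves implicit, namely that smaller measured values (for example, times over the mile) are better.

```python
# Confidence intervals from the bike frame comparison above.
intervals = {
    "Type A - Type B": (-12.70, -4.29),
    "Type A - Type C": (-19.63, -8.87),
    "Type B - Type C": (-9.89, -1.61),
}


def contains_zero(interval):
    """True when zero lies inside the interval, i.e. no clear difference in means."""
    low, high = interval
    return low <= 0 <= high


for comparison, interval in intervals.items():
    if contains_zero(interval):
        print(f"{comparison}: no significant difference")
    else:
        print(f"{comparison}: significant difference")
```

All three intervals lie entirely below zero, so each comparison indicates a difference, and the first-named group has the smaller mean in every pair.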

Chapter 11 Test A: Answer Key

1. B 2. B 3. C 4. A 5. B 6. A 7. D 8. C 9. B 10. C 11. D 12. B 13. B 14. C 15. B 16. A 17. A 18. D 19. B 20. A

Chapter 11 Test B: Multiple Choice

Section 11.1 (The Problem of Multiple Comparisons)

1. [Objective: Demonstrate understanding of ANOVA] Choose the statement that is not true about multiple comparisons and ANOVA. Choose (d) if all the statements are true. a. ANOVA is a method for testing whether there is an association between a categorical variable and a numerical variable. b. When doing a multiple comparison, the overall significance level will increase, meaning it is more likely that an incorrect conclusion will be drawn. c. When doing multiple comparisons, the response variable is always numerical, but the independent variable can be numerical or categorical. d. All of the above statements are true.

2. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: You wish to test whether an association exists between type of vehicle purchased and vehicle color. a. One-sample t-test b. Two-sample t-test c. ANOVA d. Chi-square test

3. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: You wish to test whether an association exists between the type of vehicle purchased and how many children the buyer has. a. One-sample t-test b. Two-sample t-test c. ANOVA d. Chi-square test

5. [Objective: Demonstrate understanding of the Bonferroni Correction] Suppose you have observations from five different regions within your state and you wish to do hypothesis tests to compare the mean home value across groups. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.05? Round to the nearest thousandth. a. 0.050 b. 0.010 c. 0.005 d. None of the above

Section 11.2 (Analysis of Variance)

6. [Objective: Understand the purpose of ANOVA] Choose the statement that best describes the purpose of ANOVA. a. The ANOVA procedure will reveal whether the means of several groups are different and which group or groups have a different mean. b. ANOVA is a procedure for comparing the means of several groups. c. ANOVA is a procedure for comparing different categories for several groups. d. None of the above.

7. [Objective: Understand the F-statistic] Identify the test statistic used for the ANOVA procedure and how it is calculated. a. The test statistic is z and is the ratio of the mean within a group to the variation between groups. b. The test statistic is F and is the ratio of the variation between groups to the variation within groups. c. The test statistic is z and is calculated by finding the mean z-score between groups. d. The test statistic is F and is calculated by finding the difference between group means.

8. [Objective: Understand the F-statistic] Suppose a researcher collected data to compare whether dogs of different size categories differed in mean cost to a pet owner. Dogs were categorized as small, medium, or large. Cost was calculated as the average annual amount spent on food, veterinary visits, and medications. The calculated F-statistic was 421.58. Given this test statistic, which of the following is the most reasonable conclusion? a. The F-statistic shows that variation between groups is larger than variation within groups; therefore, the researcher will likely conclude that there is an association between dog size and mean cost to the pet owner. b. The F-statistic shows that variation within groups is larger than variation between groups; therefore, the researcher will likely conclude that there is not an association between dog size and mean cost to the pet owner. c. The F-statistic shows that large dogs have a significantly higher cost to the pet owner than the other categories; therefore, the researcher will conclude that there is an association between dog size and mean cost to the pet owner. d. None of the above

H 0 : Marital status and stress levels are not associated. H a : Mean stress levels of married people are greater than the mean stress levels of unmarried people.

H 0 : Marital status and stress levels are not associated. H a : Marital status and stress levels are associated.

H 0 : There is no difference in the mean stress levels of married and unmarried people. H a : There is a difference in the mean stress levels of married and unmarried people.

10. [Objective: Understand the ANOVA test] What is the value of the test statistic? Round to the nearest hundredth if necessary. a. 25.27 b. 5.21 c. 76.70 d. Can’t be determined with the given information

11. [Objective: Understand the ANOVA test] In the context of the ANOVA test, which of the following phrases is equivalent to the phrase “variation within groups”? a. Variation due to error b. Residual variation c. Unexplained variation d. All of the above

Cat      n    Mean   StDev
17-24    20   9.5    7.02
25-32    12   14.7   5.82
33-40    21   6.5    9.21

a. Yes b. No

13. [Objective: Understand the ANOVA test] Which of the following is not one of the conditions that must be checked in order for the calculated F-statistic to follow the F-distribution? a. The groups are independent of each other. b. The variances or standard deviations of the groups must be equal. c. The distribution of the observations is Normal in each group’s population or the sample size is large. d. Each group’s population must be at least 10 times larger than its respective sample.

14. [Objective: Understand the ANOVA test] Choose the statement that best describes the F-statistic for the ANOVA test. a. The F-statistic compares the variation between groups to the variation within groups. A small F-statistic indicates that variation between groups is small relative to variation within groups. b. The F-statistic compares the variation between groups to the variation within groups. A small F-statistic indicates that variation between groups is large relative to variation within groups. c. The F-statistic is the probability of getting the sample results, assuming that there is no difference in the groups. d. None of the above

Use the following information for questions (15) – (16). A group of home gardeners wants to test whether the type of soil used to grow heirloom tomatoes has an effect on the number of tomatoes harvested. Gardeners randomly assigned tomato plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Forty plants were assigned to each group. At the end of the growing season the number of tomatoes harvested was counted. Assume that all other conditions for the ANOVA test have been met.

One-way ANOVA
F=0.762135
p=0.5202354
Factor df=2 SS=710.349206 MS=355.174603

15. [Objective: Understand the ANOVA test] State the null and alternative hypotheses.
a. H0: µ_none < µ_commercial < µ_compost. Ha: There is no difference in the mean number of tomatoes harvested.
b. H0: The mean number of tomatoes harvested differs by type of fertilizer used. Ha: There is no difference in the mean number of tomatoes harvested by type of fertilizer used.
c. H0: µ_none = µ_commercial = µ_compost. Ha: The mean number of tomatoes harvested differs by type of fertilizer used.
d. None of the above

tomatoes harvested.
b. Fail to reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested.
c. Reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested.
d. Fail to reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested.

17. [Objective: Understand the ANOVA test] Compute the F-statistic. Round to the nearest hundredth. a. 35.48 b. 10.51 c. 0.10 d. Not enough information is given

18. [Objective: Understand the ANOVA test] Choose the correct conclusion for the hypothesis that vehicle color affects the amount of a speeding ticket. Assume all ANOVA test conditions have been satisfied.
a. Reject H0. The vehicle color has an effect on the amount of the speeding ticket.
b. Fail to reject H0. The vehicle color has an effect on the amount of the speeding ticket.
c. Reject H0. The vehicle color has no effect on the amount of the speeding ticket.
d. Fail to reject H0. The vehicle color has no effect on the amount of the speeding ticket.

Group Comparison     98.33% Confidence Interval
Type A – Type B      (−12.70, −4.29)
Type A – Type C      (−19.63, −8.87)
Type B – Type C      (−9.89, −1.61)

Is there evidence that one type of bike frame is faster than the others? Which type of frame appears to be the fastest?
a. Yes, the confidence interval results show that frame type B is faster than A or C.
b. Yes, the confidence interval results show that frame type C is faster than A or B.
c. Yes, the confidence interval results show that frame type A is faster than B or C.
d. No, there is not enough evidence to say with confidence that one frame type is faster than the others because none of the confidence intervals contains zero.

Chapter 11 Test B: Answer Key

1. C 2. D 3. C 4. A 5. C 6. B 7. B 8. A 9. C 10. B 11. D 12. A 13. D 14. A 15. C 16. D 17. B 18. A 19. B 20. C

Chapter 11 Test C: Multiple Choice

Section 11.1 (The Problem of Multiple Comparisons)

1. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: Suppose that researchers wish to study whether dieting affects alertness levels. Three popular dieting plans were included in the study. After following the diet for several weeks, participants were given a test that measured response time to a stimulus.

2. [Objective: Choose the appropriate test] Choose the appropriate test for the following situation: You wish to test whether an association exists between type of vehicle transmission purchased (automatic or manual) and gender of the purchaser.

3. [Objective: Demonstrate understanding of the Bonferroni Correction] Suppose you have observations from six different school districts within your state and you wish to do hypothesis tests to compare the mean home value across school districts. How many comparisons can be done with six groups?

4. [Objective: Demonstrate understanding of the Bonferroni Correction] Suppose you have observations from six different school districts within your state and you wish to do hypothesis tests to compare the mean home value across groups. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.10? Round to the nearest thousandth.

Section 11.2 (Analysis of Variance)

5. [Objective: Understand the purpose of ANOVA] What is the meaning of the overall significance level of a test? Explain what happens to the overall significance level when multiple comparisons are made, that is, when multiple hypothesis tests are conducted in an effort to compare the means of several groups.
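The growth of the overall significance level described in the question above can be illustrated numerically. The sketch below assumes the individual tests are independent, which is a simplifying assumption and not something the test bank states; the function name `overall_significance` is hypothetical.

```python
def overall_significance(per_test_alpha: float, num_tests: int) -> float:
    """Probability of at least one mistaken rejection across independent tests."""
    # Each test avoids a mistaken rejection with probability (1 - alpha);
    # all of them avoid it with probability (1 - alpha) ** num_tests.
    return 1 - (1 - per_test_alpha) ** num_tests


# One test, three tests, and the 15 pairwise comparisons among six groups.
for num_tests in (1, 3, 15):
    print(num_tests, round(overall_significance(0.05, num_tests), 3))
```

Under this assumption, running 15 tests at the 0.05 level each pushes the chance of at least one mistaken rejection to roughly one half, which is why a correction such as Bonferroni is applied.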

6. [Objective: Understand the ANOVA test] What is the test statistic used for the ANOVA procedure? Explain how it is calculated.

7. [Objective: Understand the F-statistic] Suppose a researcher collected data to compare whether cats of different size categories differed in mean number of hours slept per day. Cats were categorized as small, medium, or large. The calculated F-statistic was 220.41. Given this test statistic, can we conclude that there is more variation between groups or within groups? Based on the test statistic, is it likely that an association exists between cat size and mean number of hours that a cat sleeps per day? Explain.

Use the following information for questions (8) and (9). Researchers conducted a study that examined marital status and work hours. A hypothesis test was conducted to test the claim that people with different marital statuses have a different mean number of work hours. Participants were categorized as married, single, or divorced. Assume that all conditions for the ANOVA test have been met. The TI-84 output for the test is shown below.

One-way ANOVA
F=7.010859729
p=.007083712
Factor df=2 SS=860.777778 MS=430.388889

8. [Objective: Understand the ANOVA test] State the null and alternative hypotheses.

9. [Objective: Understand the ANOVA test] What is the value of the test statistic? Round to the nearest hundredth if necessary.

10. [Objective: Understand the ANOVA test] In the context of the ANOVA test, do the phrases “variation due to treatment”, “explained variation”, and “variation due to factors”, describe variation within groups or variation between groups?

Section 11.3 (The ANOVA Test)

11. [Objective: Understand the ANOVA test] A television studio did a poll to determine whether women in different age groups watched different amounts of comedy programming. Check the TI-84 output to see whether the same-variance condition for ANOVA holds. Show your work. Is ANOVA appropriate?

One-way ANOVA
F=1.739515653
p=.2092138468
Factor df=2 SS=16.3611111 MS=8.18055556
Error df=15 SS=70.5416667 MS=4.70277778
Sxp=2.16858889

Cat      n    Mean   StDev
17-24    25   3.42   1.28
25-32    15   4.50   3.45
33-40    21   2.17   0.75

Use the following information for questions (12) – (14). A group of home gardeners wants to test whether the type of soil used to grow carrots has an effect on the number of carrots harvested. Gardeners randomly assigned carrot plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Fifty plants were assigned to each group. At the end of the growing season the number of carrots harvested was counted. Use the output below to answer questions (12) – (14). Assume that all other conditions for the ANOVA test have been met.

One-way ANOVA
F=1.110676532
p=0.3434138248
Factor df=2 SS=19888.2933 MS=9944.14663

12. [Objective: Understand the ANOVA test] Interpret the boxplots given. Compare medians, interquartile ranges, and shapes, and mention any potential outliers.

[Figure: boxplots of Number of Carrots Harvested for the No Fertilizer, Plant Food, and Compost groups]

13. [Objective: Understand the ANOVA test] State the null and alternative hypothesis.

14. [Objective: Understand the ANOVA test] Using the test results provided, test the hypothesis that soil treatment affects the number of carrots harvested. Use a significance level of 5%. State the correct decision regarding the null hypothesis and write a sentence summarizing the conclusion.

15. [Objective: Understand the ANOVA test] Describe two of the four conditions that must be checked in order for the calculated F-statistic to follow the F-distribution.

16. [Objective: Understand the ANOVA test] The figure below shows the F-distribution with 5 and 10 degrees of freedom to test the hypothesis that age groups and reading speed are associated. The shaded area represents the p-value. Assume that all conditions for ANOVA have been met. Should the null hypothesis that the age group population means are equal be rejected? What conclusion can be drawn about the association between age group and reading speed?

Use the following information to answer questions (17) - (19). Researchers want to test whether the color of a cyclist’s jersey has an effect on race finish times. Four jersey colors were used for the study: red, green, blue, and yellow. Thirty cyclists were randomly assigned to each group.

One-way ANOVA
Variation between groups=21.8848611
Variation within groups=17.82025
p=.3255653716

17. [Objective: Understand the ANOVA test] Compute the F-statistic. Show your work. Round to the nearest hundredth.

18. [Objective: Understand the ANOVA test] Assume that all ANOVA test conditions have been satisfied. State the correct decision regarding the null hypothesis for the claim that jersey color affects race finish times. Write a sentence summarizing your conclusion.

Section 11.4 (Post Hoc Procedures) 19. [Objective: Understand post-hoc procedures] Do the ANOVA test results warrant a post-hoc procedure? Explain why or why not.

20. [Objective: Understand post-hoc procedures] An ANOVA test was conducted to see whether running shoe sole type (Type A, Type B, or Type C) had an effect on shoe wear over a six month period. Test results warranted post-hoc procedures. The Tukey HSD approach was used with the following results:

Group Comparison     98.33% Confidence Interval
Type A – Type B      (−14.02, 11.23)
Type C – Type B      (−9.25, −1.25)
Type C – Type A      (−10.11, −3.21)

Is there evidence that one type of running shoe sole wears better than the others? Which type of running shoe sole appears to wear the best? Explain.

Chapter 11 Test C: Answer Key

1. ANOVA test
2. Chi-square test
3. 15 comparisons
4. 0.007
5. The overall significance level is the probability that you will mistakenly reject the null hypothesis in at least one of several hypothesis tests. When multiple comparisons are made the overall significance level will increase.
6. The test statistic is F and is calculated by dividing the variation between groups by the variation within groups.
7. There is more variation between groups. Since F, the ratio of variation between groups to the variation within groups, is relatively large, it is likely that an association exists.
8. H0: Marital status and work hours are not associated. Ha: Marital status and work hours are associated.
9. F = 7.01
10. Variation between groups
11. The smallest SD is 0.75. The largest SD is more than 2 × 0.75, so it appears that variances are not the same. ANOVA would not be appropriate.
12. The medians are similar, although the median for compost was the largest. The shapes are not strongly skewed, although the plant food group has a shape that is skewed somewhat to the left. There do not appear to be any potential outliers.
13. H0: µ_none = µ_commercial = µ_compost. Ha: The mean number of carrots harvested differs by type of fertilizer used.
14. We cannot reject the null hypothesis that population means are all the same. There’s not enough evidence to conclude that there is an association between soil type and number of carrots harvested.
15. The four conditions are (1) random sample and independent measurements, (2) independent groups, (3) same variance, and (4) Normal distribution or large sample.
16. Since the p-value is relatively large, the null hypothesis that age group means are equal should not be rejected. Although reading speed may vary from person to person, it doesn’t seem to have anything to do with age group.
17. F = 21.8848611 / 17.82025 ≈ 1.23
18. We cannot reject the null hypothesis that population means are all the same. There’s not enough evidence to conclude that there is an association between jersey color and race finish times.
19. No. Post-hoc procedures are only warranted when the ANOVA test results in rejection of the null hypothesis. Rejection of the null hypothesis is an indication that at least one of the group means is different, which would warrant post-hoc procedures.
20. Sole type C has a mean that is less than the mean for the others. There is insufficient evidence to conclude that there is a difference between the mean wear of sole type A and the mean wear of sole type B.

Chapter 12 Test A: Multiple Choice

Section 12.1 (Variation out of Control)

1. [Objective: Identify treatment and response variables] Which workout recovery drink is better: water, a fortified sports drink, or chocolate milk? In a study, researchers randomly assigned 50 similar athletes to one of three groups. All subjects participated in the same workout. Depending on which group they were assigned to, subjects then drank 12 ounces of water, a fortified sports drink, or chocolate milk. Afterwards, subjects were measured for muscle fatigue. Researchers found that the athletes who drank chocolate milk after a workout recovered from muscle fatigue faster than those who drank either water or the sports drink. For this controlled experiment, identify the treatment and the response variables. a. Treatment: Water, sports drink, or chocolate milk. Response: Change in muscle fatigue. b. Treatment: Change in muscle fatigue. Response: Water, sports drink, or chocolate milk. c. Treatment: Chocolate milk only. Response: Change in muscle fatigue.

Use the following information to answer questions (2) and (3). Which treatment is most effective at treating head lice: Benzyl alcohol lotion, dandruff shampoo, or mayonnaise? In a study, researchers randomly assigned 55 subjects with head lice to one of three groups. All subjects had similar cases of head lice. Depending on which group they were assigned to, subjects received an over-the-counter remedy of Benzyl alcohol lotion, dandruff shampoo, or mayonnaise, which was applied by a trained professional. Afterwards, each subject was examined and any lice or eggs found were counted. Researchers found that Benzyl alcohol lotion produced better results than both dandruff shampoo and mayonnaise.

2. [Objective: Identify treatment and response variables] Was this an observational study or a controlled experiment? a. Observational Study b. Controlled Experiment

3. [Objective: Understand cause-and-effect conclusions] Choose the statement that restates the conclusion of the study in terms of a cause-and-effect conclusion. a. People who use Benzyl alcohol lotion have lower instances of head lice than those who used shampoo or mayonnaise. b. Benzyl alcohol lotion effectively treats head lice compared to mayonnaise and shampoo. c. Mayonnaise effectively treats head lice.

4. [Objective: Understanding the power of a test] Which of the following is not a primary factor that affects the power of a test? a. Natural variability within the population b. The size of the true difference between treatment groups c. Sample size d. The sample standard deviation

5. [Objective: Understand block design] A chocolate chip manufacturer has developed a new recipe for its chocolate chips and is planning a consumer taste test. Researchers in the test kitchen want to measure whether there is a difference in consumer opinions between the old recipe and the new recipe. Company researchers believe that consumer reaction to the new chip will depend on age so they decide to block on age. To do this, they create blocks for consumers between the ages 12-18 years, 19-25 years, 26-32 years, and 33 years and older. They then randomly assign subjects in each block to taste cookies made with the new and old recipe chocolate chips. Following the taste test, participants respond to a questionnaire about the cookies they tasted. Is this an effective design for this study? a. Yes b. No

Use the following information to answer questions (6) and (7). These two headlines are on the same topic. Headline A: Gaining weight after giving birth for the first time leads to pregnancy-related diabetes during second pregnancy, a new study finds. Headline B: Women who gain weight after giving birth for the first time dramatically increase their risk of developing pregnancy-related diabetes during their second pregnancy, a new study suggests.

6. [Objective: Understand cause-and-effect conclusions] Which one has language that suggests a cause-and-effect relationship? a. Headline A b. Headline B

7. [Objective: Understand the difference between a controlled experiment and an observational study] Was the study referenced most likely a controlled experiment or an observational study? a. Controlled experiment b. Observational study

8. [Objective: Choose the appropriate statistical test] Suppose a new engine additive is designed to improve performance in race cars and researchers wish to test the effectiveness of the new additive. On a closed race course, the individual race times for forty race cars are recorded. The race times (on the same course) for each of the cars after being treated with the engine additive are then recorded. Which test design would be most appropriate for this scenario? a. Two-sample t-test b. Paired t-test c. None of the above

Section 12.2 (Controlling Variation in Surveys)

9. [Objective: Understand the difference between sampling strategies] Purple Loosestrife is considered an invasive plant in Michigan. To detect the presence of Purple Loosestrife on public land, environmental researchers partition land into one acre parcels then randomly select a sample of parcels to be fully inspected for the presence of Purple Loosestrife. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

10. [Objective: Understand the difference between sampling strategies] Suppose a manufacturer of rearview mirrors decides to inspect every fifteenth part for defects. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

11. [Objective: Understand the difference between sampling strategies] Suppose state lawmakers are interested in finding out whether a newly instituted elementary school program about bullying is effective. A statistician divides the state into three regions then randomly selects a sample of thirty elementary schools from each region. Students at the selected schools complete a questionnaire about bullying. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

12. [Objective: Understand the difference between sampling strategies] Which of the following is not a benefit of a stratified sampling plan? a. Members of a stratum, or similar group, are likely to respond the same, which leads to results with lower variability. b. Statistics from a stratum will reflect the parameters of the population (from which the stratum was drawn) as a whole. c. Increased precision d. All of the above are benefits of a stratified sampling plan.

13. [Objective: Understand the difference between sampling strategies] Which of the following statements is not true about a cluster sampling plan? a. Cluster sampling can make it easier to access very large populations. b. If carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. c. In cluster sampling, the clusters contain objects that are as similar as possible. d. All of the above statements are true of a cluster sampling plan.

14. [Objective: Understand the difference between sampling strategies] Which of the following statements is not true about a systematic sampling plan? a. With systematic sampling, objects from the population are sampled at regular intervals. b. Systematic sampling works best when objects are received in sequence during a specific time period and the characteristic of interest can be reasonably assumed to be randomly mixed during the time period in which the data will be collected. c. A systematic sampling plan is often used for exit polls and quality control studies. d. All of the above statements are true for a systematic sampling plan.

15. [Objective: Choosing a sampling strategy] Suppose a large endowment has been left to your city by a private donor. The city council has decided that the endowment should be used to build an activity center near the center of town. One option is to build an activity center for senior citizens. It is decided that a stratified sampling plan would be the best way to get input from the public since opinions within a stratum are likely to be similar. Which method for stratifying seems to be most reasonable for this scenario? a. Stratify by gender. b. Stratify by age. c. Stratify by income. d. None of the above.

16. [Objective: Choosing a sampling strategy] A newspaper reporter is interested in the outcome of a local election on a controversial issue. She knows that opinions of voters who visit the polls in the morning could be different than those who visit later in the day and she does not want a biased sample. Which sampling method is likely to result in an unbiased sample? a. A large sample of voters that have been stratified by age. b. A cluster sample of all the voters who visit the polls between 8:00 am and 9:00 am and between 5:00 pm and 6:00 pm. c. A systematic sample of every tenth voter throughout the day. d. None of the above.

Section 12.3 (Putting It All Together: Reading Research Papers)

Use the following information to answer questions (17) and (18). A researcher wonders whether receiving a job offer is affected by whether an interviewee wears a red tie. The researcher used the following study design to collect data: The researcher chose five large companies in a large city and observed the tie color of all interviewees during a two-week period. He also recorded whether the person was hired by the company. He found that interviewees who wore red ties were more likely to be hired than interviewees who did not wear red ties.

17. [Objective: Understand research papers] Choose the statement that correctly explains why we can or cannot generalize these results to a larger population. a. The interviewees (and companies where they interviewed) were not selected from a larger population, so the results cannot be generalized beyond the sample. b. The interviewees were randomly selected to participate in the study so the results can be generalized to the larger population. c. The interviewees were not randomly assigned to wear a red tie to the interview so the results cannot be generalized beyond the sample. d. This was an observational study so results can be generalized to a larger population.

18. [Objective: Understand research papers and statistical reports] Choose the statement that correctly explains why we can or cannot make a cause-and-effect conclusion. a. This was an observational study. Since random assignment was not used, we cannot conclude that getting hired was caused by wearing a red tie. b. This was an experimental study so we conclude that wearing a red tie caused an interviewee to be hired. c. The interviewees (and companies where they were interviewed) were not selected from a larger population, so we cannot conclude that there was a cause-and-effect relationship. d. Since this was a controlled observational study we can conclude that there was a cause-and-effect relationship.

Copyright © 2013 Pearson Education, Inc.

19. [Objective: Understand research papers and statistical reports] Suppose it was reported on the news that a recent study concluded that the probability that you will get brain cancer if you use a cell phone more than doubled from 1 in 460,000 to 1 in 230,000. Choose the statement that best summarizes the significance of this result. a. Many would say that this result has both clinical and statistical significance because the probability more than doubled and this result could have a significant impact on lives. b. Many would say that this result does not have clinical or statistical significance because the probabilities are so small that they are meaningless. c. Although this result may have clinical significance, it does not have statistical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives. d. Although this result may have statistical significance, it does not have clinical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives.

20. [Objective: Understand research papers and statistical reports] Suppose a sociologist reports that after analyzing data that he collected on the drinking habits of college students, he decided to test the hypothesis that there is no difference in the drinking habits of fourth year students and first year students. What research error has been committed by the sociologist? a. Media bias b. Data dredging c. Publication bias d. Clinical significance

Chapter 12 Test A—Answer Key 1. A 2. B 3. B 4. D 5. A 6. A 7. B 8. B 9. C 10. A 11. B 12. B 13. C 14. D 15. B 16. C 17. A 18. A 19. D 20. B
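The three sampling plans behind answers 9 through 11 differ mainly in where the randomness is applied. The Python sketch below is an illustration only, not part of the test bank: the function names, the fixed seed, and the made-up populations (150 parts, three regions of 100 schools, 40 parcels of 5 plots) are all assumptions chosen to mirror the questions.

```python
import random


def systematic_sample(items, k, start=0):
    """Take every k-th item in sequence, beginning at index `start`."""
    return items[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample of the same size from every stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep every item in each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical populations, invented for illustration.
parts = list(range(150))
regions = {r: [f"{r}-school-{i}" for i in range(100)]
           for r in ("north", "central", "south")}
parcels = {f"parcel-{p}": [f"parcel-{p}-plot-{i}" for i in range(5)]
           for p in range(40)}

every_15th = systematic_sample(parts, 15)       # regular interval, like question 10
schools = stratified_sample(regions, 30, rng)   # sample within each region, like question 11
inspected = cluster_sample(parcels, 4, rng)     # inspect whole parcels, like question 9
```

In the stratified plan every region contributes units, while in the cluster plan most parcels contribute nothing and the selected parcels are inspected in full.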

Chapter 12 Test B—Multiple Choice Section 12.1 (Variation out of Control)

1. [Objective: Identify treatment and response variables] Which energy drink is better? In a study, researchers randomly assigned 50 similar adults to one of three groups. All subjects were allowed to sleep for eight hours followed by staying awake for ten hours. Depending on which group they were assigned to, subjects then drank 12 ounces of strong coffee, a name brand “5-hour” energy drink, or 16 ounces of a name brand sports drink containing caffeine and vitamins. Afterwards, subjects were measured for mental fatigue. Researchers found that the adults that drank strong coffee after being awake for ten hours were less fatigued than either those who drank the “5-hour” energy drink or the sports drink. For this controlled experiment, identify the treatment and the response variables. a. Treatment: Strong coffee only. Response: Change in mental fatigue. b. Treatment: Strong coffee, 5-hour energy drink, and sports drink. Response: Change in mental fatigue. c. Treatment: Change in mental fatigue. Response: Strong coffee, 5-hour energy drink, and sports drink.

Use the following information to answer questions (2) and (3). Which treatment is most effective at treating carpet stains: vinegar, ammonia, or hydrogen peroxide? In a study, researchers randomly assigned 55 carpet samples with identical stains to one of three groups. Depending on which group they were assigned to, each carpet sample received a home remedy of water mixed with vinegar, ammonia, or hydrogen peroxide. Afterwards, each carpet sample was examined and any remaining stain was measured. Researchers found that a vinegar and water mixture produced better results than both ammonia and hydrogen peroxide.

2. [Objective: Identify treatment and response variables] Was this a controlled experiment or an observational study? a. Controlled Experiment b. Observational Study

3. [Objective: Understand cause-and-effect conclusions] Choose the statement that restates the conclusion of the study in terms of a cause-and-effect conclusion. a. A water and vinegar mixture effectively treats carpet stains compared to ammonia and hydrogen peroxide. b. People who use water and vinegar to treat carpet stains will have fewer visible carpet stains. c. A water and ammonia mixture effectively treats carpet stains compared to ammonia and hydrogen peroxide.

4. [Objective: Understanding the power of a test] Which of the following is not a primary factor that affects the power of a test? a. The size of the true difference between treatment groups b. Natural variability within the population c. The sample standard deviation d. Sample size

5. [Objective: Understand block design] A ranch salad dressing manufacturer has developed a new recipe for its ranch dressing and is planning a consumer taste test. Researchers in the test kitchen want to measure whether there is a difference in consumer opinions between the old recipe and the new recipe. Company researchers believe that consumer reaction to the new dressing will depend on gender so they decide to block on gender. To do this, they create blocks for female adult consumers and male adult consumers. They then randomly assign subjects in each block to taste dressings using the new and old recipe. Following the taste test, participants respond to a questionnaire about the dressings they tasted. Is this an effective design for this study? a. No b. Yes

Use the following information to answer questions (6) and (7). These two headlines are on the same topic. Headline A: Married men who gain weight following the birth of their first child dramatically increase their risk of further weight gain following the birth of more children, a new study suggests. Headline B: Married male weight gain after the birth of the first child leads to additional weight gain following the birth of more children, a new study finds.

6. [Objective: Understand cause-and-effect conclusions] Which one has language that suggests a cause-and-effect relationship? a. Headline A b. Headline B

8. [Objective: Choose the appropriate statistical test] Suppose a sociologist is interested in finding out if there is an association between gender and opinions on human cloning. Which test design would be most appropriate for this scenario? a. Two-sample t-test b. Chi-square c. Paired t-test d. ANOVA

Section 12.2 (Controlling Variation in Surveys)

9. [Objective: Understand the difference between sampling strategies] Suppose a gumball manufacturer decides to inspect every twenty-fifth gumball for defects. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

10. [Objective: Understand the difference between sampling strategies] Suppose state lawmakers are interested in finding out whether a newly instituted state program about distracted driving is effective. A statistician divides the state into three regions then randomly selects a sample of adult licensed drivers from each region. Participants are then asked to fill out a questionnaire about distracted driving. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

11. [Objective: Understand the difference between sampling strategies] Sirex noctilio, a wood wasp, is considered an invasive species in Michigan and can harm and even kill pine trees. To detect the presence of the wood wasp on public land, environmental researchers partition land into one acre parcels then randomly select a sample of parcels to be fully inspected for the presence of the wood wasps in pine trees. What kind of sampling does this illustrate? a. Systematic b. Stratified c. Cluster d. Random Sampling

12. [Objective: Understand the difference between sampling strategies] Which of the following is not a benefit of a stratified sampling plan? a. Members of a stratum, or similar group, are likely to respond the same, which leads to results with lower variability. b. Increased precision c. Statistics from a stratum will reflect the parameters of the population as a whole. d. All of the above are benefits of a stratified sampling plan.

13. [Objective: Understand the difference between sampling strategies] Which of the following statements is not true about a cluster sampling plan? a. Cluster sampling can make it easier to access very large populations. b. If carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. c. In cluster sampling, some natural or convenient distinction is used to divide the population. d. All of the above statements are true of a cluster sampling plan.

14. [Objective: Understand the difference between sampling strategies] Which of the following statements is not true about a systematic sampling plan? a. A systematic sampling plan divides the population into mini-populations that will have lower variability. b. With systematic sampling, objects from the population are sampled at regular intervals. c. Systematic sampling works best when objects are received in sequence during a specific time period and the characteristic of interest can be reasonably assumed to be randomly mixed during the time period in which the data will be collected. d. All of the above statements are true for a systematic sampling plan.

15. [Objective: Choosing a sampling strategy] Suppose a large endowment has been left to your city by a private donor. The city council has decided that the endowment should be used to build a recreation park near the center of town. One option is to build a skate park. It is decided that a stratified sampling plan would be the best way to get input from the public since opinions within a stratum are likely to be similar. Which method for stratifying seems to be most reasonable for this scenario? a. Stratify by age. b. Stratify by gender. c. Stratify by income. d. None of the above.

16. [Objective: Choosing a sampling strategy] An office building janitorial company is interested in the opinions of people who visit the office building on a typical day. The company thinks that opinions of people who visit the office building in the morning could be different than those who visit later in the day and it does not want a biased sample. Which sampling method is likely to result in an unbiased sample? a. A large sample of people who visit the building that have been stratified by age. b. A systematic sample of every tenth person who visits the office building throughout the day. c. A cluster sample of all the people who visit the office building between 8:00 am and 9:00 am and between 5:00 pm and 6:00 pm. d. None of the above.

Section 12.3 (Putting It All Together: Reading Research Papers)

Use the following information to answer questions (17) and (18). A researcher wonders whether speed of service at a diner is affected by whether a male customer is wearing a suit jacket. The researcher used the following study design to collect data: The researcher chose five diners in a large city and recorded the number of male customers that wore a suit jacket during a two-week period. He also recorded how long it took a waiter or waitress to address the customer. He found that male customers who wore suit jackets were addressed by the wait staff faster than male customers who did not wear a suit jacket.

17. [Objective: Understand research papers] Choose the statement that correctly explains why we can or cannot generalize these results to a larger population. a. This was an observational study so results can be generalized to a larger population. b. The male customers (and diners) were not selected from a larger population, so the results cannot be generalized beyond the sample. c. The male customers were randomly selected to participate in the study so the results can be generalized to the larger population. d. The male customers were not randomly assigned to wear a suit jacket so the results cannot be generalized beyond the sample.

18. [Objective: Understand research papers and statistical reports] Choose the statement that correctly explains why we can or cannot make a cause-and-effect conclusion. a. This was an experimental study so we conclude that wearing a suit jacket caused a male customer to be addressed faster. b. The male customers (and diners) were not selected from a larger population, so we cannot conclude that there was a cause-and-effect relationship. c. This was an observational study. Since random assignment was not used, we cannot conclude that getting addressed faster was caused by wearing a suit jacket. d. Since this was a controlled observational study we can conclude that there was a cause-and-effect relationship.

19. [Objective: Understand research papers and statistical reports] Which of the following is not an important principle that should always be kept in mind when reading research articles containing statistical research results? a. Don’t rely solely on the conclusions of any single paper. b. Be wary of conclusions based on very complex statistical or mathematical models. c. Stick to peer-reviewed journals. d. All of the above are important principles.

20. [Objective: Understand research papers and statistical reports] Suppose it was reported on the news that a recent study concluded that the probability that you will get brain cancer if you use a cell phone more than doubled from 1 in 460,000 to 1 in 230,000. Choose the statement that best summarizes the significance of this result. a. Although this result may have statistical significance, it does not have clinical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives. b. Many would say that this result has both clinical and statistical significance because the probability more than doubled and this result could have a significant impact on lives. c. Many would say that this result does not have clinical or statistical significance because the probabilities are so small that they are meaningless. d. Although this result may have clinical significance, it does not have statistical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives.

Chapter 12 Test B—Answer Key 1. B 2. A 3. A 4. C 5. B 6. B 7. A 8. B 9. A 10. B 11. C 12. C 13. D 14. A 15. A 16. B 17. B 18. C 19. D 20. A
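Answer 20 rests on the gap between a relative change and an absolute change in risk. The Python sketch below works through the two probabilities quoted in the question using exact fractions. The variable names and the "per million people" scaling are illustrative assumptions, not part of the test bank.

```python
from fractions import Fraction

# Probabilities quoted in question 20.
risk_before = Fraction(1, 460_000)
risk_after = Fraction(1, 230_000)

# Relative change: how many times larger the new risk is.
relative_risk = risk_after / risk_before

# Absolute change: how much probability was actually added.
absolute_increase = risk_after - risk_before

# Scale the absolute change to a population of one million people
# (an illustrative choice of scale).
extra_cases_per_million = float(absolute_increase * 1_000_000)

print(relative_risk)
print(round(extra_cases_per_million, 2))
```

With these figures the ratio comes out to 2, yet the added risk is only about two cases per million people. A doubling can be statistically detectable while the absolute change stays too small to matter in practice, which is the distinction the answer draws.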

Chapter 12 Test C—Short Answer Section 12.1 (Variation out of Control)

1. [Objective: Identify treatment and response variables] Does listening to music improve efficiency for mundane tasks? In a study, researchers randomly assigned 50 similar adults to one of three groups. All subjects were asked to stuff 400 envelopes. Depending on which group they were assigned to, subjects heard Top 40 music, classical music, or no music while they worked. Researchers recorded how long it took the participants to stuff the envelopes. Researchers found that the adults who listened to no music completed the work faster than both the group that listened to Top 40 music and the group that listened to classical music. For this controlled experiment, state the treatment and the response variables.

Use the following information to answer questions (2) and (3). Which treatment is most effective at treating bathroom mildew: water mixed with vinegar, ammonia, or soap? In a study, researchers randomly assigned 55 similar tile samples with similar amounts of mildew to one of three groups. Depending on which group they were assigned to, each tile sample received a home remedy of water mixed with vinegar, ammonia, or soap. Afterwards, each tile sample was examined and any remaining mildew was measured. Researchers found that a soap and water mixture produced better results than both vinegar and ammonia.

2. [Objective: Identify treatment and response variables] Is this a controlled experiment or an observational study? Explain.

3. [Objective: Understand cause-and-effect conclusions] Write a statement that restates the conclusion of the study in terms of a cause-and-effect conclusion.

4. [Objective: Understanding the power of a test] List three factors that affect the statistical power of a test. Be sure to list the factor over which researchers actually have some control.

5. [Objective: Understand block design] A publisher is considering publishing a new magazine about environmentally-friendly living in the city and plans to gauge interest in the magazine using potential consumers. Company researchers want to measure whether there is a difference in opinion between the new magazine and the competing publication. Company researchers believe that consumer reaction to the new magazine will depend on age so they decide to block on age. To do this, they create blocks for consumers between the ages 18-24 years, 25-32 years, 33-45 years, and 46 years and older. They then randomly select two of the blocks to read the first issue of the new magazine and the other two blocks to read a similar competing publication. Following the test, participants respond to a questionnaire about the magazine they read. Is this an effective design for the study? If not, describe an improvement.

Use the following information to answer questions (6) and (7). These two headlines are on the same topic. Headline A: Children who view advertisements for fast food dramatically increase their risk of becoming obese adolescents, a new study suggests. Headline B: Viewing advertisements for fast food leads to childhood obesity, a new study finds.

6. [Objective: Understand the difference between a controlled experiment and an observational study] Was the study referenced most likely a controlled experiment or an observational study? Explain.

7. [Objective: Understand cause-and-effect conclusions] Which one has language that suggests a cause-and-effect relationship? Is this problematic? Can you think of a confounding factor other than exposure to fast food advertisements that might cause childhood obesity?

8. [Objective: Choose the appropriate statistical test] Suppose a sleep expert conducts a study to detect the effects of a new doctor-recommended breathing and relaxation routine on insomnia. Hours of sleep per night were measured for participants before starting the routine and four weeks after using the breathing and relaxation routine just before bed. Which test design would be most appropriate for this scenario? Explain.

Section 12.2 (Controlling Variation in Surveys)

9. [Objective: Understand the difference between sampling strategies] Suppose a snow blower manufacturer decides to inspect every seventh snow blower for paint defects. What kind of sampling does this illustrate?

10. [Objective: Understand the difference between sampling strategies] Suppose the CEO of a large corporation is interested in finding out whether a newly instituted program about professionalism in the workplace is effective. A statistician divides the employees into three job types then randomly selects a sample of employees from each job type. Participants are then asked to fill out a questionnaire about professionalism in the workplace. What kind of sampling does this illustrate?

11. [Objective: Understand the difference between sampling strategies] Asphodelus fistulosus, a noxious weed commonly known as onionweed, is considered an invasive species in New Mexico and can choke out other vegetation. To detect the presence of onionweed on public land, environmental researchers partition land into one acre parcels then randomly select a sample of parcels to be fully inspected for the presence of onionweed. What kind of sampling does this illustrate?

12. [Objective: Understand the difference between sampling strategies] Describe one benefit of a stratified sampling plan.

13. [Objective: Understand the difference between sampling strategies] Describe one benefit of a cluster sampling plan.

14. [Objective: Understand the difference between sampling strategies] Describe a situation where a systematic sampling plan would be appropriate.

15. [Objective: Choosing a sampling strategy] Suppose a large donation has been left to a local community college by a private donor. School officials have decided that one option for the money would be to build a new gymnasium for student and community use. It is decided that a stratified sampling plan would be the best way to get input from the public about this option since opinions within a stratum are likely to be similar. What would be a reasonable way to stratify the population?

16. [Objective: Choosing a sampling strategy] A bank manager is interested in the opinions of people who visit the bank on a typical day. The manager thinks that opinions of people who visit the bank in the morning could be different than those who visit later in the day and he does not want a biased sample. Describe a sampling method that is likely to result in an unbiased sample. Explain.

Section 12.3 (Putting It All Together: Reading Research Papers)

Use the following information to answer questions (17) and (18). A researcher wonders whether length of time that it takes to hail a taxi is affected by the gender of the person hailing the taxi. The researcher used the following study design to collect data: The researcher chose five busy intersections in a large city and recorded the number of males and females who hailed a taxi. He also recorded how long it took to get a taxi to stop. He found that males waited a shorter period of time for a taxi than females did.

17. [Objective: Understand research papers] Explain why we can or cannot generalize these results to a larger population.

18. [Objective: Understand research papers and statistical reports] Restate the conclusion of the study in terms of a cause-and-effect conclusion. Explain why we can or cannot make a cause-and-effect conclusion.

19. [Objective: Understand research papers and statistical reports] Write at least two questions that you should ask yourself when reading articles containing statistical research results. Explain why these questions are important.

20. [Objective: Understand research papers and statistical reports] Suppose it was reported on the news that a recent study concluded that pollutants in the air have dramatically increased our chances of getting a rare form of skin cancer from 1 in 250,000 to 1 in 150,000. Compare the clinical and statistical significance of this result.

Chapter 12 Test C—Answer Key
1. The treatments were Top 40 music, classical music, and no music. The response variable was the time it took participants to stuff the 400 envelopes.
2. Controlled experiment.
3. A soap and water mixture effectively treats mildew stains on tile compared to vinegar and ammonia.
4. Sample size, the size of the true difference between the groups, and the natural variability within the population. Researchers have control over sample size only.
5. This is not an effective design for the study because researchers randomly assigned entire blocks to treatment groups. To improve the study they should randomize within blocks.
6. This is most likely an observational study since it would not be ethical to purposely try to cause childhood obesity.
7. Headline B implies a cause-and-effect relationship. This is problematic because it is inappropriate to make cause-and-effect statements based on observational studies. Confounding factors will vary.
8. A paired t-test is the most appropriate test because the treatment was given to the same group of people with measurements taken before and after the treatment.
9. Systematic
10. Stratified
11. Cluster
12. Some of the benefits of a stratified sampling plan are increased precision and decreased variability within strata.
13. Cluster sampling can make it easier to access very large populations and, if carefully executed, a cluster sampling plan will produce estimates that are as precise as possible.
14. Answers will vary, but systematic sampling plans are often used for exit polls or in quality control studies.
15. Answers will vary, but stratification by age or gender is likely to be the most reasonable.
16. A systematic sampling plan would yield an unbiased sample. Opinions of people visiting the bank are likely to be mixed during a typical day so a systematic sampling plan where every kth customer participates will provide an unbiased sample.
17. The male and female customers (and intersections) were not selected from a larger population, so the results cannot be generalized beyond the sample.
18. Possible cause-and-effect conclusion: Being male leads to shorter wait times for a taxi, a new study finds. This was an observational study. Since random assignment was not used, we cannot conclude that being male caused the taxis to stop faster.
19. Possible questions: Did the study use random sampling and random assignment? Is the conclusion supported by other similar studies? Are the conclusions based on complex statistical or mathematical models? Is the journal containing the article peer-reviewed? Is the evidence compelling enough to support the conclusion? These questions encourage critical evaluation of published research or televised news reports that could affect our lives.
20. Although this result may have statistical significance, it does not have clinical significance, since the probability of getting the rare form of skin cancer is still so small that the increase is unlikely to have a meaningful effect on lives.
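The arithmetic behind question 20 can be made concrete. The short Python sketch below is an illustration only, not part of the test bank; it uses the two risk figures quoted in the question and compares the change in absolute terms with the change in relative terms. The variable names are made up for this example.

```python
# Risk figures quoted in question 20.
risk_before = 1 / 250_000
risk_after = 1 / 150_000

# Absolute change: additional chance of the cancer per person.
absolute_increase = risk_after - risk_before

# Relative change: how many times larger the new risk is than the old one.
relative_risk = risk_after / risk_before

print(f"extra cases per million people: {absolute_increase * 1_000_000:.1f}")
print(f"relative risk: {relative_risk:.2f}")
```

The relative change looks large while the absolute change amounts to only a few extra cases per million people, which is the contrast the question asks students to discuss.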

Chapter 13 Test A—Multiple Choice Section 13.1 (Transforming Data)

1. [Objective: Understand the need for nonparametric inference techniques] Which of the following is an indication that nonparametric inference might be necessary? a. The sample size is too small to assume the CLT holds. b. The distribution of the population is not Normal. c. The data are strongly skewed. d. All of the above are indications that nonparametric inference might be necessary.

2. [Objective: Interpreting a QQ Plot] In the context of nonparametric inference, what information can a QQ plot provide? a. It is a tool that displays quartile locations of data values which can provide information about the sample distribution. b. It is a tool that can help you determine whether a sample is drawn from a Normal population. c. It is a tool that displays the distribution of transformed data. d. None of the above

3. [Objective: Interpreting a QQ Plot] Which of the following QQ plots most closely depicts data from a normally distributed population?

[Figure: answer choices a through d are QQ plots of Var 1 (vertical axis) against Normal Quantiles (horizontal axis).]

Use the following information to answer questions (4) and (5). Suppose the manager of a large high-end jewelry store wants to estimate the amount spent by customers during the holiday season. She took a random sample of customers and recorded the amount they spent. A histogram of the data shows that the data is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.

(A) One-Sample T: Purch
Variable  N   Mean   StDev  SE Mean  95% CI
Purch     15  223.5  100.4  26.0     (167.9, 279.1)

(B) One-Sample T: LogPurch
Variable  N   Mean   StDev  SE Mean  95% CI
LogPurch  15  2.311  0.236  0.101    (2.2, 2.4)

4. [Objective: Understand Transformed Data] Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into dollars). Which interval is narrower? a. Width of interval for untransformed data: 111.2; width of interval for transformed data: 92.7. The width of the interval for the log-transformed data is narrower. b. Width of interval for untransformed data: 111.2; width of interval for transformed data: 204.6. The width of the interval for the untransformed data is narrower. c. Width of interval for untransformed data: 55.6; width of interval for transformed data: 111.2. The width of the interval for the log-transformed data is narrower. d. Cannot be determined with the given information

5. [Objective: Understand Transformed Data] Choose the statement that explains which confidence interval is likely to be a more precise estimate of amount spent and why. a. The confidence interval for the untransformed data is more precise because the values are in actual dollars which is more meaningful. b. The confidence interval for the geometric mean is more precise because the distribution of the log-transformed data is more symmetric. c. The confidence interval for the untransformed data is more precise because it is strongly left-skewed and the confidence interval gives a wider interval. d. None of the above.

6. [Objective: Understand the Geometric Mean] Find the mean, median, and geometric mean for the following numbers: 10, 300, 1500, and 33,000. Round to the nearest tenth. a. 8702.5, 900.0, 29.2 b. 8702.5, 900.0, 620.8 c. 8700.0, 620.8, 950.0 d. 900.0, 1000.0, 8700.0

Section 13.2 (The Sign Test for Paired Data)

7. [Objective: Understand the sign test for paired data] Which of the following statements could be a reason to justify the use of the sign test? a. The sample size is small b. The distribution of the population is unknown or not Normal c. The data are matched pairs d. All of the above

8. [Objective: Understand the sign test for paired data] Which of the following statements is not true about the sign test? a. Matched pairs must be independent of other pairs in the sample. b. The p-value is based on the normal distribution. c. The sign test relies on the signs (negative or positive) of the measured differences in pairs. d. The binomial model is used to find an exact p-value

Use the following information to answer questions (9) – (11). Can stretching help you stay alert in class? Thirty-six subjects were measured for alertness at the beginning of class; the subjects then participated in some light arm and neck stretches followed by a forty-five minute lecture. Each subject was then measured for alertness at the end of the lecture. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met.

Hypothesis test results:
Parameter: Median of variable
H0: median = 0
HA: median ≠ 0

Variable    n   n for tests  Sample Median  Below  Equal  Above  P-value
Difference  36  32           1              12     4      20     0.2153

9. [Objective: Understand the sign test for paired data] Choose the correct null and alternative hypotheses. a. H0: The median difference in alertness is 0. HA: The median difference in alertness is not 0. b. H0: The median difference in alertness is not 0. HA: The median difference in alertness is 0. c. H0: The median difference in alertness is 1. HA: The median difference in alertness is not 1. d. None of the above.

10. [Objective: Perform the Sign Test] What is the name and value of the test statistic? a. F = 20 b. S = 20 c. S = 12 d. Z = 12

11. [Objective: Perform the Sign Test] Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. a. Fail to reject H0. There is evidence to suggest that there is a difference in alertness after light arm and neck stretches before the lecture. b. Reject H0. There is evidence to suggest that there is no difference in alertness after light arm and neck stretches before the lecture. c. Fail to reject H0. There is evidence to suggest that there is no difference in alertness after light arm and neck stretches before the lecture. d. Reject H0. There is evidence to suggest that there is a difference in alertness after light arm and neck stretches before the lecture.

Copyright © 2013 Pearson Education, Inc.

Section 13.3 (Mann-Whitney Test for Two Independent Groups)

12. [Objective: Understand the Mann-Whitney Test] Choose the statement that is not true about the Mann-Whitney Test. a. The Mann-Whitney Test can be used when the Normal condition of the t-test is not met. b. The Mann-Whitney Test is based on the ranks of the observations, not on their actual values. c. The Mann-Whitney Test is used to compare the centers of two groups of numerical variables. d. The Mann-Whitney Test is based on the number of pairs with positive differences.

Use the following information to answer questions (13) and (14). Suppose the Nielson Organization conducted a survey to find out how many minutes of reality-type television programming people watched in one week. Assume that all conditions for the Mann-Whitney test have been met. Use the following test output to answer questions (13) and (14).

Hypothesis test results:
m1 = median for women
m2 = median for men
Parameter: m1-m2

Difference  n1  n2  Diff. Est.  Test Stat.  P-value  Method
m1-m2       10  10  50          38.2        0.022    Norm. Approx.

13. [Objective: Understand the Mann-Whitney Test] Choose the correct null and alternative hypotheses to test the claim that men and women watch different amounts of reality-type programming. a. H0: The medians for men and women are not equal. HA: The median for men is equal to the median for women. b. H0: The median for men is equal to the median for women. HA: The medians for men and women are not equal. c. H0: The median for men is equal to the median for women. HA: The median for men is less than the median for women. d. H0: The median for men is equal to the median for women. HA: The median for men is greater than the median for women.

14. [Objective: Understand the Mann-Whitney Test] Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. a. Fail to reject H0. There is evidence to suggest that there is a difference in the median amount of reality-type television that men and women watch. b. Reject H0. There is evidence to suggest that there is no difference in the median amount of reality-type television that men and women watch. c. Fail to reject H0. There is evidence to suggest that there is no difference in the median amount of reality-type television that men and women watch. d. Reject H0. There is evidence to suggest that there is a difference in the median amount of reality-type television that men and women watch.

15. [Objective: Choosing a nonparametric test] You are presented with data from two independent samples. The variable being measured is continuous. The distribution of the population of each sample is right-skewed. You wish to test the hypothesis that there is a difference in the median value of the variable for the samples. What type of test/method should you use? a. Data transformation and t-interval b. Sign Test c. Mann-Whitney Test d. Paired t-test

16. [Objective: Choosing a nonparametric test] A used car lot owner wanted to estimate the amount spent by customers during the summer months. She took a random sample of customers and recorded the amount they spent. A histogram showed the data was right-skewed so she took the log of each value and verified that the distribution of these values was closer to Normal. What test/method should she use to estimate the mean amount spent during the summer months? a. Data transformation and t-interval b. Sign Test c. Mann-Whitney Test d. Paired t-test

17. [Objective: Choosing a nonparametric test] A new fiber bar is advertised to curb hunger for three hours. A sample of thirty-six hungry subjects were asked to record their level of hunger before eating the fiber bar and again three hours after eating the fiber bar. Which test should be used to test the hypothesis that there is no difference in the level of hunger three hours after eating the fiber bar (i.e. the fiber bar curbed hunger for three hours)? a. Data transformation b. Sign Test c. Mann-Whitney Test d. Paired t-test

Section 13.4 (Randomization Tests)

Use the following information to answer questions (18) – (20). Math self-efficacy can be defined as one’s belief in his or her own ability to perform mathematical tasks. A college math professor wishes to find out if her students’ math self-efficacy matches reality. To do this she gives a math quiz then asks her students to rate their level of confidence in how well they did on the quiz. She plans to test whether those who had little confidence that they did well on the quiz actually performed worse than those who had a high level of confidence that they did well on the quiz. Shown below is the approximate sampling distribution of the difference in mean quiz scores. The table below shows the summary statistics for the two groups. Assume that all conditions for a randomization test have been satisfied.

[Figure: histogram of the simulated differences; vertical axis Frequency, horizontal axis Difference in Mean Quiz Scores.]

Group       n    Mean  Median  Standard Deviation  IQR
High Conf.  106  78.6  77.5    5.5                 9.5
Low Conf.   211  73.2  72.5    4.2                 8.3

Test Stat: Mean High Conf. – Mean Low Conf.
Number of simulations: 350

18. [Objective: Understand Randomization Tests] State the null and alternative hypotheses and also the value of the test statistic for the professor’s randomization test. a. H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. HA: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is 5.4. b. H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. HA: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is −5.4. c. H0: The typical quiz score for those with high confidence is greater than that of those with low confidence. HA: The typical quiz score for those with high confidence is the same as that

19. [Objective: Understand Randomization Tests] Use the histogram to roughly estimate the p-value. Choose the answer that most closely approximates the p-value. (Approximations have been made to the nearest hundredth.) a. p = 0.00 b. p = 0.40 c. p = 0.90 d. None of the above

20. [Objective: Understand Randomization Tests] Carry out the randomization test. What is the professor’s conclusion? Are differences in mean quiz scores due to chance? a. Fail to reject H0. The professor should conclude that the typical quiz score for those with high confidence is greater than that of those with low confidence. The students’ self-efficacy matches reality. b. Reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students’ self-efficacy does not match reality. c. Fail to reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students’ self-efficacy does not match reality. d. Reject H0. The professor should conclude that the typical quiz score for those with high confidence is greater than that of those with low confidence. The students’ self-efficacy matches reality.

Chapter 13 Test A—Answer Key 1. D 2. B 3. A 4. A 5. C 6. B 7. D 8. B 9. A 10. B 11. C 12. D 13. B 14. D 15. C 16. A 17. B 18. A 19. B 20. C
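The keyed answers to questions 4, 6, and the sign test p-value reported for questions 9 to 11 can be checked with a few lines of arithmetic. The Python sketch below is an illustration only and not part of the test bank. It assumes the log transform is base 10 (the source does not say so, but the answer choices are consistent with it), and the helper names are made up for this example.

```python
from math import comb, prod
from statistics import mean, median


def back_transformed_width(log_low, log_high, base=10):
    """Width of a log-scale interval after converting back to original units."""
    return base ** log_high - base ** log_low


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return prod(values) ** (1 / len(values))


def sign_test_p_value(above, below):
    """Two-sided exact sign test p-value from the binomial model with p = 0.5.

    Ties (differences equal to zero) are assumed to be dropped already.
    """
    n = above + below
    s = max(above, below)
    tail = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Question 4: interval widths in dollars.
raw_width = 279.1 - 167.9
log_width = back_transformed_width(2.2, 2.4)  # assumes base-10 logs

# Question 6: mean, median, and geometric mean.
data = [10, 300, 1500, 33000]

# Questions 9 to 11: 20 differences above zero, 12 below, 4 ties dropped.
p_value = sign_test_p_value(above=20, below=12)

print(round(raw_width, 1), round(log_width, 1))
print(mean(data), median(data), round(geometric_mean(data), 1))
print(round(p_value, 4))
```

Under the base-10 assumption the widths, the three averages, and the p-value agree with the figures given in the questions.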

Chapter 13 Test B—Multiple Choice Section 13.1 (Transforming Data)

1. [Objective: Understand the need for nonparametric inference techniques] Which of the following is not necessarily an indication that nonparametric inference might be necessary? a. Variables must be matched pairs. b. The sample size is too small to assume the CLT holds. c. The distribution of the population is not Normal. d. The data are strongly skewed.

2. [Objective: Interpreting a QQ Plot] In the context of nonparametric inference, what information can a QQ plot provide? a. It is a tool that displays quartile locations of data values which can provide information about the sample distribution. b. It is a tool that displays the distribution of transformed data. c. It is a tool that can help you determine whether a sample is drawn from a Normal population. d. None of the above

3. [Objective: Interpreting a QQ Plot] Which of the following QQ plots most closely depicts data from a skewed population?

[Figure: answer choices a through d are QQ plots of Var 1 (vertical axis) against Normal Quantiles (horizontal axis).]

Use the following information to answer questions (4) and (5). Suppose the manager of a large furniture store wants to estimate the amount spent by customers during the holiday season. She took a random sample of customers and recorded the amount they spent. A histogram of the data shows that the data is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.

(A) One-Sample T: Purch
Variable  N   Mean   StDev  SE Mean  95% CI
Purch     17  544.5  140.2  26.0     (472.4, 616.6)

(B) One-Sample T: LogPurch
Variable  N   Mean  StDev  SE Mean  95% CI
LogPurch  17  2.73  0.121  0.101    (2.7, 2.8)

4. [Objective: Understand Transformed Data] Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into dollars). Which interval is narrower? a. Width of interval for untransformed data: 144.2; width of interval for transformed data: 152.5. The width of the interval for the untransformed data is narrower. b. Width of interval for untransformed data: 144.2; width of interval for transformed data: 129.8. The width of the interval for the log-transformed data is narrower. c. Width of interval for untransformed data: 72.1; width of interval for transformed data: 273. The width of the interval for the log-transformed data is narrower. d. Cannot be determined with the given information

5. [Objective: Understand Transformed Data] Choose the statement that explains which confidence interval more precisely depicts the data and why. a. The confidence interval for the untransformed data is more precise because it is strongly left-skewed and the confidence interval gives a wider interval. b. The confidence interval for the untransformed data is more precise because the values are in actual dollars which is more meaningful. c. The confidence interval for the geometric mean is more precise because the distribution of the log-transformed data is more symmetric. d. None of the above.

6. [Objective: Understand the Geometric Mean] Find the mean, median, and geometric mean for the following numbers: 120, 400, 1300, and 22,000. List from smallest to largest and round to the nearest tenth. a. 9300.0, 850.0, 120.0 b. 400.0, 1082.4, 9273.8 c. 4765.0, 850.0, 1185.5 d. 9273.8, 850.0, 1082.4

Section 13.2 (The Sign Test for Paired Data)

7. [Objective: Understand the sign test for paired data] Which of the following statements is not true about the sign test? a. Matched pairs must be independent of other pairs in the sample. b. The sign test relies on the signs (negative or positive) of the measured differences in pairs. c. The p-value is based on the normal distribution. d. The binomial model is used to find an exact p-value

8. [Objective: Understand the sign test for paired data] Which of the following statements could be a reason to justify the use of the sign test? a. The data are matched pairs b. The sample size is small c. The distribution of the population is unknown or not Normal d. All of the above

Use the following information to answer questions (9) – (11). Can deep-knee bends help you stay alert in class? Forty subjects were measured for alertness at the beginning of class then voluntarily performed fifteen deep-knee bends followed by a forty-five minute lecture. Each subject was then measured for alertness at the end of the lecture. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met.

Hypothesis test results:
Parameter: Median of variable
H0: median = 0
HA: median ≠ 0

Variable    n   n for tests  Sample Median  Below  Equal  Above  P-value
Difference  40  35           1              14     5      21     0.4777

9. [Objective: Understand the sign test for paired data] Choose the correct null and alternative hypotheses. a. H0: The median difference in alertness is not 0. HA: The median difference in alertness is 0. b. H0: The median difference in alertness is 0. HA: The median difference in alertness is not 0. c. H0: The median difference in alertness is 1. HA: The median difference in alertness is not 1. d. None of the above.

10. [Objective: Perform the Sign Test] What is the name and value of the test statistic? a. F = 21 b. S = 14 c. Z = 5 d. S = 21

11. [Objective: Perform the Sign Test] Using a significance level of 5%, state the correct decision regarding the null hypothesis and concluding statement. a. Fail to reject H0. There is evidence to suggest that there is no difference in alertness after deep knee bends before the lecture. b. Fail to reject H0. There is evidence to suggest that there is a difference in alertness after deep knee bends before the lecture. c. Reject H0. There is evidence to suggest that there is no difference in alertness after deep knee bends before the lecture. d. Reject H0. There is evidence to suggest that there is a difference in alertness after deep knee bends before the lecture.

Section 13.3 (Mann-Whitney Test for Two Independent Groups)

12. [Objective: Understand the Mann-Whitney Test] Choose the statement that is not true about the Mann-Whitney Test. a. The Mann-Whitney Test is based on the number of pairs with negative differences. b. The Mann-Whitney Test can be used when the Normal condition of the t-test is not met. c. The Mann-Whitney Test is based on the ranks of the observations, not on their actual values. d. The Mann-Whitney Test is used to compare the centers of two groups of numerical variables.

Use the following information to answer questions (13) and (14). Suppose the Nielson Organization conducted a survey to find out how many minutes of televised sporting events people watched in one week. Assume that all conditions for the Mann-Whitney test have been met. Use the following test output to answer questions (13) and (14).

Hypothesis test results:
m1 = median for adults ages 24-34
m2 = median for adults ages 35-45
Parameter: m1-m2

Difference  n1  n2  Diff. Est.  Test Stat.  P-value  Method
m1-m2       12  12  65          45.7        0.120    Norm. Approx.

13. [Objective: Understand the Mann-Whitney Test] Choose the correct null and alternative hypotheses to test the claim that adults between the ages of 24 and 34 and adults between the ages of 35 and 45 watch different amounts of televised sporting events. a. H0: The medians for the two age groups are not equal. HA: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. b. H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The medians for the two age groups are not equal. c. H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The median for adults ages 24 to 34 is less than the median for adults ages 35 to 45. d. H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The median for adults ages 24 to 34 is greater than the median for adults ages 35 to 45.

televised sporting events that adults in the two age groups watched. b. Reject H0. There is evidence to suggest that there is no difference in the median amount of televised sporting events that adults in the two age groups watched. c. Fail to reject H0. There is evidence to suggest that there is no difference in the median amount of televised sporting events that adults in the two age groups watched. d. Reject H0. There is evidence to suggest that there is a difference in the median amount of televised sporting events that adults in the two age groups watched.

15. [Objective: Choosing a nonparametric test] A new fiber bar is advertised to curb hunger for three hours. A sample of thirty-six hungry subjects were asked to record their level of hunger before eating the fiber bar and again three hours after eating the fiber bar. Which test should be used to test the hypothesis that there is no difference in the level of hunger three hours after eating the fiber bar (i.e. the fiber bar curbed hunger for three hours)? a. Data transformation b. Paired t-test c. Sign Test d. Mann-Whitney Test

16. [Objective: Choosing a nonparametric test] You are presented with data from two independent samples. The variable being measured is continuous. The distribution of the population of each sample is right-skewed. You wish to test the hypothesis that there is a difference in the median value of the variable for the samples. What type of test/method should you use? a. Data transformation b. Paired t-test c. Sign Test d. Mann-Whitney Test

17. [Objective: Choosing a nonparametric test] A used car lot owner wanted to estimate the amount spent by customers during the summer months. She took a random sample of customers and recorded the amount they spent. A histogram showed the data was right-skewed so she took the log of each value and verified that the distribution of these values was closer to Normal. What test/method should she use to estimate the mean amount spent during the summer months? a. Data transformation b. Paired t-test c. Sign Test d. Mann-Whitney Test

Section 13.4 (Randomization Tests)

Use the following information to answer questions (18) – (20). Math self-efficacy can be defined as one’s belief in his or her own ability to perform mathematical tasks. A college math professor wishes to find out if her female students’ math self-efficacy matches reality. To do this she gives a math quiz to the female students then asks them to rate their level of confidence in how well they did on the quiz. She plans to test whether those who had little confidence that they did well on the quiz actually performed worse than those who had a high level of confidence that they did well on the quiz. Shown below is the approximate sampling distribution of the difference in mean quiz scores. The table below shows the summary statistics for the two groups. Assume that all conditions for a randomization test have been satisfied.

[Figure: histogram of the simulated differences; vertical axis Frequency, horizontal axis Difference in Mean Quiz Scores.]

Group       n    Mean  Median  Standard Deviation  IQR
High Conf.  106  77.2  75.5    6.5                 10.5
Low Conf.   211  62.2  59.2    5.9                 9.3

Test Stat: Mean High Conf. – Mean Low Conf.
Number of simulations: 350

18. [Objective: Understand Randomization Tests] State the null and alternative hypotheses and also the value of the test statistic for the professor’s randomization test. a. H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. HA: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is 15. b. H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. HA: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is −15. c. H0: The typical quiz score for those with high confidence is greater than that of those with low confidence. HA: The typical quiz score for those with high confidence is the same as that of those with low confidence. The Test Statistic is −16.3. d. H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. HA: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is −1.2.

p = 0.40

p = 0.00

None of the above

confidence is less than that of those with low confidence. The students’ self-efficacy matches reality. Reject H0. The professor should conclude that there is no difference in mean quiz scores for

Chapter 13 Test B—Answer Key 1. A 2. C 3. B 4. B 5. A 6. D 7. C 8. D 9. B 10. D 11. A 12. A 13. B 14. C 15. C 16. D 17. A 18. A 19. A 20. D
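Questions 18 through 20 rely on a randomization test. A minimal Python sketch of the idea follows, assuming a one-sided test on the difference in means. It is an illustration only: the quiz scores are invented (the source reports only summary statistics), the function name is hypothetical, and the default of 350 shuffles simply mirrors the number of simulations quoted in the question.

```python
import random
from statistics import mean


def randomization_test(group_a, group_b, n_sims=350, seed=0):
    """One-sided randomization test for mean(group_a) - mean(group_b) > 0.

    Returns the observed difference and the share of shuffled differences
    that are at least as large as the observed one (the estimated p-value).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_sims):
        # Under the null hypothesis the group labels are interchangeable,
        # so reshuffle the pooled scores and split them again.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_sims


# Hypothetical quiz scores, invented for illustration only.
high_conf = [82, 75, 79, 88, 71, 77]
low_conf = [64, 58, 70, 61, 66, 59, 63]

observed, p_value = randomization_test(high_conf, low_conf)
print(f"observed difference: {observed:.1f}")
print(f"estimated p-value: {p_value:.3f}")
```

The test statistic is the observed difference in means, and the p-value is read off the shuffled differences in the same way a student would read it off the histogram: as the fraction of simulated differences at least as extreme as the observed one.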

Chapter 13 Test C—Multiple Choice Section 13.1 (Transforming Data)

1. [Objective: Understand the need for nonparametric inference techniques] Explain some factors that might indicate that nonparametric inference might be necessary.

2. [Objective: Interpreting a QQ Plot] Suppose you are asked to analyze sample data but you do not know the distribution of the population it came from. Explain how a QQ plot can be used to give you information about the population from which the sample was drawn.

Refer to the following two histograms and QQ plots of the same data to answer questions (3) and (4).

[Figure: two histograms labeled (a) and (b), one of IQ Scores and one of Income (thousands of dollars per year), each with vertical axis Frequency; two QQ plots of Var 1 (vertical axis) against Normal Quantiles (horizontal axis).]

3. [Objective: Interpreting a QQ Plot] Match the histogram with the corresponding QQ plot. Histogram (a) goes with QQ plot ________. Histogram (b) goes with QQ plot ________.

4. [Objective: Interpreting a QQ Plot] For which sample might a log transform be useful? Explain. (There are no zeros or negative values in either data set.)

Use the following information to answer questions (5) and (6). Suppose the manager of a large appliance and electronics store wants to estimate the amount spent by customers during the holiday season. He took a random sample of customers and recorded the amount they spent. A histogram of the data shows that the data is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.

(A) One-Sample T: Purch
Variable  N   Mean   StDev  SE Mean  95% CI
Purch     20  675.2  350.1  26.0     (511.35, 839.05)

(B) One-Sample T: LogPurch
Variable  N   Mean  StDev  SE Mean  95% CI
LogPurch  21  2.78  0.246  0.101    (2.67, 2.89)

[Objective: Understand Transformed Data] Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into dollars) and state which interval is narrower.

[Objective: Understand Transformed Data] Which interval should the manager report to the store owner about the typical amount of money spent during the holiday season? Explain.

[Objective: Understand the Geometric Mean] Calculate the mean, median, and geometric mean for the following numbers: 110, 500, 1700, and 31,000. List from smallest to largest and round to the nearest tenth.
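As a rough check on this kind of calculation, the three summaries can be computed with Python's standard `statistics` module. This is only a sketch; the variable names are illustrative and are not part of the original test.

```python
import statistics

# The four values given in the question.
values = [110, 500, 1700, 31000]

mean = statistics.mean(values)
median = statistics.median(values)
# The geometric mean is the n-th root of the product of the n values,
# which is the same as back-transforming the mean of the logs.
geometric_mean = statistics.geometric_mean(values)

# List the three summaries from smallest to largest, rounded to the nearest tenth.
summaries = [("mean", mean), ("median", median), ("geometric mean", geometric_mean)]
for name, value in sorted(summaries, key=lambda pair: pair[1]):
    print(f"{name}: {value:.1f}")
```

Because the geometric mean works on the log scale, the single large value (31,000) pulls it up far less than it pulls up the arithmetic mean.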

Section 13.2 (The Sign Test for Paired Data) 8.

[Objective: Understand the sign test for paired data] Describe some conditions that might indicate that the sign test should be used for inference.


Use the following information to answer questions (9) – (12). Can dogs lower anxiety in math class? Fifty subjects who reported anxiety about attending math class were measured for stress at the beginning of a math class then spent 15 minutes interacting with a dog followed by a forty-five minute math lecture. Each subject was then measured for stress at the end of the lecture. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met:

Hypothesis test results:
Parameter: Median of variable
H0: median = 0
HA: median ≠ 0

Variable     n    n for tests   Sample Median   Below   Equal   Above   P-value
Difference   50   46            1               18      4       28      0.1839

[Objective: Understand the sign test for paired data] State the null and alternative hypotheses to test the claim that dogs can affect anxiety levels.

10. [Objective: Understand the sign test for paired data] Explain how the binomial model is used to calculate the p-value.

11. [Objective: Perform the Sign Test] Calculate the value of the test statistic and state the value of the p-value.

12. [Objective: Perform the Sign Test] Using a significance level of 5%, state the correct decision regarding the null hypothesis and write a sentence which summarizes the conclusion and addresses the claim.
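The binomial calculation behind the sign test in questions (9) – (12) can be sketched as follows. This is an illustrative sketch rather than the software that produced the output; the function name `sign_test_p_value` is hypothetical, and ties are assumed to have been dropped before counting.

```python
from math import comb

def sign_test_p_value(below, above):
    """Two-sided sign-test p-value; ties are assumed to be dropped already."""
    n = below + above
    extreme = max(below, above)
    # Under H0 each non-tied difference is positive with probability 0.5,
    # so the count of positive differences follows Binomial(n, 0.5).
    upper_tail = sum(comb(n, k) for k in range(extreme, n + 1)) / 2 ** n
    # Binomial(n, 0.5) is symmetric, so the two-sided p-value doubles one tail.
    return min(1.0, 2 * upper_tail)

# Counts from the output: 18 differences below zero, 28 above, 4 ties dropped.
p_value = sign_test_p_value(below=18, above=28)
print(f"n for test = {18 + 28}, S = 28, p-value = {p_value:.4f}")
```

Doubling one tail works here only because the null distribution uses p = 0.5, which makes the two tails mirror images of each other.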


Section 13.3 (Mann-Whitney Test for Two Independent Groups) 13. [Objective: Understand the Mann-Whitney Test] List three of the five conditions, pertaining to the sample, which must be met in order to use the Mann-Whitney test.

Use the following information to answer questions (14) and (15). Suppose the Nielson Organization conducted a survey to find out how many minutes of crime dramas people watched in one week. Assume that all conditions for the Mann-Whitney test have been met.

Hypothesis test results:
m1 = median for adults ages 24-39
m2 = median for adults ages 40-55
Parameter: m1-m2

Difference   n1   n2   Diff. Est.   Test Stat   P-value   Method
m1-m2        13   13   65           45.7        0.120     Norm. Approx.

14. [Objective: Understand the Mann-Whitney Test] State the null and alternative hypotheses to test the claim that adults between the ages of 24 and 39 and adults between the ages of 40 and 55 watch different amounts of crime dramas on television.

15. [Objective: Understand the Mann-Whitney Test] Using a significance level of 5%, state the correct decision regarding the null hypothesis and write a sentence which summarizes the conclusion and addresses the claim.

16. [Objective: Choosing a nonparametric test] Suppose data was collected from ten women and twelve men about the length in minutes of their commute to work. The histogram for men was roughly normal, but the histogram for women was strongly skewed to the right. Explain why the t-test is not appropriate to test whether men and women have different commute times.


17. [Objective: Choosing a nonparametric test] You are presented with data from two independent samples. The variable being measured is continuous. The distribution of the population of each sample is right skewed. You wish to test the hypothesis that there is a difference in the median value of the variable for the samples. What type of test/method should you use? Explain why the t-test is not appropriate in this situation.

Section 13.4 (Randomization Tests) Use the following information to answer questions (18) – (20). Math self-efficacy can be defined as one’s belief in his or her own ability to perform mathematical tasks. A college math professor wishes to find out if her male students’ math self-efficacy matches reality. To do this she gives a math quiz to the male students then asks them to rate their level of confidence in how well they did on the quiz. She plans to test whether those who had little confidence that they did well on the quiz actually performed worse than those who had a high level of confidence that they did well on the quiz. Shown below is the approximate sampling distribution of the difference in mean quiz scores. The table below shows the summary statistics for the two groups. Assume that all conditions for a randomization test have been satisfied.

Group        n     Mean   Median   Standard Deviation   IQR
High Conf.   105   82.5   84.0     7.1                  12.5
Low Conf.    201   72.1   69.9     6.2                  10.4

Test Stat: Mean High Conf. – Mean Low Conf.
Number of simulations: 350

[Figure: histogram of the approximate sampling distribution, with Difference in Mean Quiz Scores on the horizontal axis and Frequency on the vertical axis.]

18. [Objective: Understand Randomization Tests] State the null and alternative hypotheses and also the value of the test statistic for the professor’s randomization test.


19. [Objective: Understand Randomization Tests] Explain how you would use the histogram to get an approximate p-value and state your p-value estimation.

20. [Objective: Understand Randomization Tests] Complete the randomization test by stating the proper decision regarding the null hypothesis and the professor’s conclusion. Are differences in mean quiz scores due to chance?
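The general idea of a randomization test for a difference in means can be sketched in a few lines. The scores below are hypothetical values invented for illustration (the test provides only summary statistics, not raw data), and the function name `randomization_p_value` is likewise an assumption.

```python
import random
from statistics import mean

def randomization_p_value(group_a, group_b, n_simulations=2000, seed=0):
    """One-sided p-value for mean(group_a) - mean(group_b) by shuffling group labels."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(n_simulations):
        # Under H0 the labels are arbitrary, so reassign them at random.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(group_a)]) - mean(pooled[len(group_a):])
        if diff >= observed:
            count += 1
    # Proportion of shuffled differences at least as large as the observed one.
    return observed, count / n_simulations

# Hypothetical quiz scores, invented for illustration only.
high_conf = [84, 88, 79, 91, 85, 80, 86, 90]
low_conf = [70, 72, 68, 75, 66, 74, 71, 69]

observed_diff, p_value = randomization_p_value(high_conf, low_conf)
print(f"observed difference = {observed_diff:.2f}, approximate p-value = {p_value:.4f}")
```

The proportion returned by the function plays the same role as the share of the histogram lying at or beyond the observed test statistic.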


Chapter 13 Test C—Answer Key
1. Sample size is small, the population is not normal, the data is strongly skewed.
2. The QQ plot is a tool that can help you determine whether the sample was drawn from a normal population.
3. (a) goes with (c), (b) goes with (d).
4. A log transform might be useful for the data shown in histogram (b) because it is right skewed.
5. The width of the untransformed data is 327.7, the width of the transformed data is 314.24. The width of the confidence interval for the transformed data is narrower.
6. The confidence interval for the geometric mean is more appropriate because the log-transformed data had a symmetric distribution. The confidence interval for the geometric mean will be more accurate and precise.
7. Median: 1100.0, Geometric Mean: 1304.8, Mean: 8327.5
8. Sample data are matched pairs and the normal conditions of the paired t-test are not satisfied.
9. H0: The median difference in stress is 0. HA: The median difference in stress is not 0.
10. The sampling distribution for S is binomial with n = 46 and p = 0.5. For a two-tailed test, find the probability that S will be as extreme as or more extreme than 28 or 18.
11. Test statistic S = 28 and p-value = 0.1839.
12. Fail to reject H0. There was no difference in stress after interacting with a dog before a math lecture.
13. The five conditions are (1) there are two independent groups, (2) the response variable is numerical and continuous, (3) each group is a random sample from some representative population, (4) the observations are independent of one another, and (5) the population distribution of each group has the same shape.
14. H0: The median for adults ages 24 to 39 is equal to the median for adults ages 40 to 55. HA: The medians for the two age groups are not equal.
15. Fail to reject H0. There was no difference in the median amount of crime dramas that adults in the two age groups watched.
16. The sample sizes are small and the women’s data is skewed.
17. Because the data is from independent samples, the variable is continuous and the population distribution of each group has the same shape, the Mann-Whitney test would be the best method. The t-test is not appropriate because the population distributions are not normal.
18. H0: The typical quiz score for male students with high confidence is the same as that of those with low confidence. HA: The typical quiz score for male students with high confidence is greater than that of those with low confidence. The Test Statistic is 14.1.
19. To approximate the p-value using the histogram, draw a vertical line at about 14.1, then approximate the proportion of observations to the right of the vertical line. From the histogram it appears that there are no observations to the right of 14.1, so the p-value is approximately zero.
20. Reject H0. The professor should conclude that typical quiz scores for male students with high confidence are greater than those of students with low confidence. The students’ self-efficacy matches reality.

Chapter 14 Test A—Multiple Choice Section 14.1 (The Linear Regression Model) 1.

[Objective: Understanding the Linear Regression Model] Which of the following is not true about residuals? If all the statements are true choose (d). a. The residuals can be described as the excess, due to randomness, that doesn’t fit on the line. b. The residuals can be determined by finding the difference between the actual observed value and the predicted dependent variable. c. The residuals are the result of natural variation in the independent variable. d. All of the above are true.

[Objective: Understanding the Linear Regression Model] Choose the condition of the linear regression model that cannot be verified by examining the residuals. Choose (d) if all the conditions given can be verified by examining the residuals. a. Linearity b. Constant Standard Deviation c. Normality of errors d. All of the above can be verified by examining the residuals

[Objective: Understanding the Linear Regression Model] Biologists studying the relationship between the number of Round Goby (an invasive prey fish) and the number of salmon eggs in streams believe that the deterministic component of the relationship is a straight line. A scatterplot shows that even though the general trend is linear, the points do not fall exactly on a straight line. Which of the following factors might account for the random component of this regression model? a. Different size salmon might affect the number of eggs laid. b. Variation in the size of the Goby might cause variation in the amount of salmon eggs consumed. c. Variability might appear in the instrument used to count salmon eggs. d. All of the above are possible factors that could account for the random component of the regression model.

4. [Objective: Understanding the Linear Regression Model] Suppose that you were presented with data showing the association between days absent from class and final class average. Which of the following residual plots below suggests that the association between number of days absent from class and final class average is linear?

[Figure: residual plots (a) and (b), each showing Residuals against Number of days absent from class.]

[Objective: Understanding the Linear Regression Model] Which of the following statements is not true about the constant standard deviation condition of the linear regression model? a. A residual plot can highlight the existence of a nonconstant standard deviation even when it is hard to see in the original scatterplot. b. A QQ plot can help you determine whether the constant standard deviation condition holds. c. A constant standard deviation means that the vertical spread of the y-values about the line is the same across the entire line. d. A fan shape in a residual plot indicates that the constant standard deviation condition does not hold.


Use the following information to answer questions (6) - (8). Below is the scatterplot showing the association between raw material (in tons) put into an injection molding machine each day (x), and the amount of scrap plastic (in tons) that is collected from the machine every 4 weeks (y). The residual plot of the data is also shown along with a QQ plot of the residuals.

[Figure: Scatterplot of Scrap plastic (in tons) against Raw Material (in tons); Residual Plot of Residuals against Raw Material (in tons); QQ Plot of Sample Quantiles against Theoretical Quantiles.]

[Objective: Use the residual plot to check conditions of the linear regression model] Choose the statement that best describes whether the condition for linearity does or does not hold for the linear regression model. a. The QQ plot mostly follows a straight line trend—the QQ plot is consistent with the claim of linearity. b. The residual plot shows no trend—the residual plot is consistent with the claim of linearity. c. The residual plot does not display a fan shape—the residual plot is consistent with the claim of linearity. d. The residual plot shows a horizontal trend—the residual plot is not consistent with the claim of linearity.

[Objective: Use the residual plot to check conditions of the linear regression model] Choose the statement that best describes whether the condition for constant standard deviation does or does not hold for the linear regression model. a. The QQ plot mostly follows a straight line—the QQ plot is consistent with the claim of constant standard deviation. b. The scatter plot shows a linear trend—the scatter plot is not consistent with the claim of constant standard deviation. c. The residual plot does not display a fan shape—the residual plot is consistent with the claim of constant standard deviation. d. The residual plot shows no trend—the residual plot is not consistent with the claim of constant standard deviation.

[Objective: Use the residual plot to check conditions of the linear regression model] Choose the statement that best describes whether the condition for normality of errors does or does not hold for the linear regression model. a. The QQ plot mostly follows a straight line, therefore the normality condition is satisfied. b. The residual plot shows no trend, therefore the normality condition is satisfied. c. The residual plot does not display a fan shape, therefore the normality condition is satisfied. d. The residual plot shows no trend, therefore the normality condition is not satisfied.


Section 14.2 (Using the Linear Model) 9.

[Objective: Understand the linear model] Choose the statement that is true about the estimators for slope and intercept of a regression line when the conditions of the linear model hold. a. The standard errors for the estimators must come from a population that is Normally distributed. b. The estimators for the slope and intercept of a regression line are unbiased. c. The sampling distributions of the estimators follow the chi-square model. d. None of the above

Use the following information to answer questions (10) – (12). A statistics professor is interested in learning whether there is a positive association between number of posts by online students on a message board and the final class average in an online statistics course. The computer output below shows the results from a regression model in which the final class average was predicted by the number of message board posts. Assume that the conditions of the linear regression model are satisfied.

LinRegTTest for y = a + bx
β > 0
a = 69.7104476     t = 3.707824045
b = .8144278607    p = 0.0024306618
s = 5.14275481     r = .7774060152
df = 119

10. [Objective: Conduct a hypothesis test on a linear model] Choose the correct null and alternative hypotheses to test whether there is an association between final class average and number of message board posts.
a. H0: There is no linear association between the number of message board posts and the final class average. Ha: There is a positive linear association between the number of message board posts and the final class average.
b. H0: There is a linear association between the number of message board posts and the final class average. Ha: There is no linear association between the number of message board posts and the final class average.
c. H0: The correlation is positive. Ha: The correlation is zero.
d. None of the above.

11. [Objective: Conduct a hypothesis test on a linear model] Choose the correct observed value of the test statistic and the p-value. Round to the nearest thousandth.
a. t = 3.708, p = 0.005
b. t = 0.002, p = 3.707
c. t = 0.777, p = 0.002
d. t = 3.708, p = 0.002


12. [Objective: Conduct a hypothesis test on a linear model] Choose the correct decision regarding the null hypothesis and the correct conclusion. State your conclusion using a significance level of 5%.
a. Fail to reject H0. There is enough evidence to conclude that the final class average is positively associated with the number of message board posts.
b. Reject H0. There is enough evidence to conclude that the final class average is positively associated with the number of message board posts.
c. Fail to reject H0. There is not enough evidence to conclude that the final class average is positively associated with the number of message board posts.
d. Reject H0. There is not enough evidence to conclude that the final class average is positively associated with the number of message board posts.

Use the following information to answer questions (13) - (15). A random sample of 30 married couples were asked to report the height of their spouse and the height of their biological parent of the same gender as their spouse. The output of a regression analysis for predicting spouse height from parent height is shown. Assume that the conditions of the linear regression model are satisfied.

Regression Analysis: Spouse versus Parent
The regression equation is spouse = 48.40 + 0.25 parent

Predictor   Parameter Estimate   Standard Error   T-statistic   p-value
Constant    48.398               39.695           1.219         0.277
Parent      0.247                0.566            0.437         0.680

S=7.794899045 R-sq=0.036858791 r=0.1919864344
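Each T-statistic in this output is the parameter estimate divided by its standard error. A minimal sketch of that arithmetic follows; the dictionary layout is illustrative only. Because the printed estimates are rounded, the recomputed ratios may differ from the printed T-statistics in the last digit.

```python
# Estimates and standard errors copied from the regression output.
coefficients = {
    "Constant": (48.398, 39.695),
    "Parent": (0.247, 0.566),
}

# t = estimate / standard error, testing H0: the parameter equals zero.
t_statistics = {
    name: estimate / std_error
    for name, (estimate, std_error) in coefficients.items()
}

for name, t in t_statistics.items():
    print(f"{name}: t = {t:.3f}")
```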


14. [Objective: Conduct a hypothesis test on a linear model] Test the hypothesis that the slope is zero (significance level is 0.05), then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion.
a. Reject H0. There is enough evidence to reject a slope of zero which is an indication that a linear association exists between parent height and spouse height.
b. Fail to reject H0. We don’t have enough evidence to reject a slope of zero which is an indication that no linear association exists between parent height and spouse height.
c. Reject H0. We don’t have enough evidence to reject a slope of zero which is an indication that no linear association exists between parent height and spouse height.
d. Fail to reject H0. There is enough evidence to reject a slope of zero which is an indication that a linear association exists between parent height and spouse height.

15. [Objective: Conduct a hypothesis test on a linear model] If the intercept was 0 and the slope was 1, what would that say about the association? a. It would mean that on average, the spouse and the parent are the same height. b. It would mean that on average, the spouse is 1 inch taller than the parent. c. It would mean that the spouse height should not be predicted using parent height. d. None of the above.

16. [Objective: Interpreting confidence intervals for regression] The regression output below is the result of testing whether there is an association between the number of practice test problems a student completed and the number of questions answered correctly on the test. Assume that the conditions of the linear regression model are satisfied. What is the 95% confidence interval for the intercept (rounded to the nearest hundredth)? Does this interval support the theory that the intercept is zero? Choose the statement that summarizes your answer in context.

            Coefficients   Standard Error   Lower 95%     Upper 95%
Intercept   -2.12319885    2.3432           -8.14673883   3.90034114
Slope       1.108789625    0.0977           0.85765303    1.359926221

a. (−8.15, 3.90). This interval supports the theory that the intercept could be zero. In this context this would mean that a student who completed zero practice test problems could reasonably expect to get a zero on the test.
b. (0.86, 1.36). This interval does not support the theory that the intercept could be zero. In this context the intercept is greater than zero so a student could expect to get a positive score on the test even if they did none of the practice problems.
c. (−8.15, 3.90). This interval does not support the theory that the intercept could be zero. In this context this would mean that for approximately every two practice problems completed, the student could expect to get approximately one test question correct.
d. None of the above
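The interval limits in the regression output for question 16 follow the usual form, estimate ± t* × standard error. The sketch below reproduces the intercept interval under one stated assumption: the output does not give the degrees of freedom, so the critical value t* ≈ 2.5706 (the two-sided 95% value for 5 degrees of freedom) is inferred from the printed limits rather than taken from the source.

```python
def confidence_interval(estimate, std_error, t_star):
    """Return (lower, upper) for estimate +/- t_star * std_error."""
    margin = t_star * std_error
    return estimate - margin, estimate + margin

# Assumed critical value: the two-sided 95% t value for 5 degrees of freedom.
# The degrees of freedom are not stated in the output; this is inferred
# from the printed interval limits.
T_STAR = 2.5706

lower, upper = confidence_interval(-2.12319885, 2.3432, T_STAR)
contains_zero = lower <= 0 <= upper
print(f"intercept: ({lower:.2f}, {upper:.2f}), contains zero: {contains_zero}")
```

An interval that contains zero means a zero intercept is plausible at the 95% confidence level; it does not show that the intercept is exactly zero.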


Section 14.3 (Predicting Values and Estimating Means) 17. [Objective: Understand confidence intervals and prediction intervals] Which of the following statements is not true about prediction intervals? a. Prediction intervals are wider than confidence intervals because there is more uncertainty in predicting an individual’s value. b. A prediction interval is concerned with estimating a population parameter. c. The width of the prediction interval is affected by the size of the standard deviation of the population distribution. d. All of the above statements are true.

Use the following information to answer questions (18) and (19). A high school boys cross country coach performs a regression to predict the finish times of runners in the 10k event from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.

Regression Analysis: Training versus Finish Time
The regression equation is finish time = 65.84 − 0.15 training minutes

Predicted Values:
X value   Pred. Y   s.e. (Pred. y)   95% C.I.         95% P.I.
180       38.84     1.44112          (34.73, 42.95)   (31.13, 46.55)

18. [Objective: Understand confidence intervals and prediction intervals] The coach wants to predict the finish time of his top runner who trained for 180 minutes the previous week. Should the coach use a confidence interval or a prediction interval? a. Confidence Interval b. Prediction Interval

19. [Objective: Understand confidence intervals and prediction intervals] Suppose the coach’s top runner trained for 180 minutes the previous week. If this runner participates in the 10k event, what is the coach’s expected finish time for this runner? Can he be reasonably confident that this runner will beat the previous season’s record of 43 minutes?
a. Expected finish time is 38.84 minutes. The coach can be confident that this runner will beat the previous season’s record of 43 minutes because the interval contains the value of 43 minutes.
b. Expected finish time is 65.84 minutes. The coach cannot be confident that this runner will beat the previous season’s record because the interval contains the value of 43 minutes. To be confident the interval would have to lie completely below 43 minutes.
c. Expected finish time is 38.84 minutes. The coach cannot be confident that this runner will beat the previous season’s record because the interval contains the value of 43 minutes. To be confident the interval would have to lie completely below 43 minutes.
d. Expected finish time is 65.84 minutes. The coach can be confident that this runner will beat the previous season’s record of 43 minutes because the interval contains the value of 43 minutes.
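The arithmetic behind questions (18) and (19) can be sketched as follows: plug the training time into the regression equation, then compare the record with the 95% prediction interval for an individual runner. The function and variable names are illustrative and are not part of the original output.

```python
def predicted_finish_time(training_minutes):
    # Regression equation from the output:
    # finish time = 65.84 - 0.15 * training minutes
    return 65.84 - 0.15 * training_minutes

prediction = predicted_finish_time(180)
prediction_interval = (31.13, 46.55)  # 95% P.I. copied from the output
record = 43

# Being confident of beating the record would require the whole
# prediction interval to lie below the record time.
confident = prediction_interval[1] < record
print(f"predicted finish time = {prediction:.2f} minutes")
print(f"confident of beating {record} minutes: {confident}")
```

The prediction interval is used here, rather than the confidence interval, because the question concerns one runner's time and not the mean time of all runners who train 180 minutes.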


20. [Objective: Evaluating the regression model when some conditions do not hold] Of the following conditions, which one can fail yet produce a regression model that is reasonably good, in many cases, if the sample size is large? a. Normality of errors b. Linearity c. Independence of errors d. None of the above

Chapter 14 Test A—Answer Key 1. C 2. D 3. D 4. A 5. B 6. B 7. C 8. A 9. B 10. A 11. D 12. B 13. C 14. B 15. A 16. A 17. B 18. B 19. C 20. A

Chapter 14 Test B—Multiple Choice Section 14.1 (The Linear Regression Model) 1.

[Objective: Understanding the Linear Regression Model] Which of the following is not a condition of the linear regression model? a. Linearity b. Residuals must be normally distributed c. Constant Standard Deviation d. Normality of errors

[Objective: Understanding the Linear Regression Model] Suppose that you were presented with data showing the association between days absent from class and final class average. Which of the following residual plots below suggests that the association between number of days absent from class and final class average is linear?

[Figure: residual plots (a) and (b), each showing Residuals against Number of days absent from class.]


[Objective: Understanding the Linear Regression Model] Environmental biologists studying the relationship between the number of owls in a forested region and the number of field mice in the region believe that the deterministic component of the relationship is a straight line. A scatterplot shows that even though the general trend is linear, the points do not fall exactly on a straight line. Which of the following factors might account for the random component of this regression model? a. Different size owls might affect the number of mice eaten. b. Variability might appear in the instrument used to count mice. c. Variation in the size of the mice might cause variation in the amount of mice consumed by owls. d. All of the above are possible factors that could account for the random component of the regression model.

[Objective: Understanding the Linear Regression Model] Which of the following statements is not true about the constant standard deviation condition of the linear regression model? a. A residual plot can highlight the existence of a nonconstant standard deviation even when it is hard to see in the original scatterplot. b. A constant standard deviation means that the vertical spread of the y-values about the line is the same across the entire line. c. A fan shape in a residual plot indicates that the constant standard deviation condition does not hold. d. All of the above are true about the constant standard deviation condition.

Use the following information to answer questions (6) - (8). Below is the scatterplot showing the association between miles driven in a semi truck (x), and the amount of tread wear on the tires (y). The residual plot of the data is also shown along with a QQ plot of the residuals.

[Figure: Scatterplot of Tire tread wear against Miles driven; Residual Plot of Residuals against Miles driven; QQ Plot of Sample Quantiles against Theoretical Quantiles.]

[Objective: Use the residual plot to check conditions of the linear regression model] Based on plots provided, choose the statement that best describes whether the condition for linearity does or does not hold for the linear regression model. a. The residual plot shows no trend—the residual plot is consistent with the claim of linearity. b. The residual plot shows a horizontal trend—the residual plot is not consistent with the claim of linearity. c. The QQ plot mostly follows a straight line trend—the QQ plot is consistent with the claim of linearity. d. The residual plot does not display a fan shape—the residual plot is consistent with the claim of linearity.


[Objective: Use the residual plot to check conditions of the linear regression model] Based on plots provided, choose the statement that best describes whether the condition for constant standard deviation does or does not hold for the linear regression model. a. The residual plot does not display a fan shape—the residual plot is consistent with the claim of constant standard deviation. b. The QQ plot mostly follows a straight line—the QQ plot is consistent with the claim of constant standard deviation. c. The residual plot shows no trend—the residual plot is not consistent with the claim of constant standard deviation. d. The scatter plot shows a linear trend—the scatter plot is not consistent with the claim of constant standard deviation.

[Objective: Use the residual plot to check conditions of the linear regression model] Choose the statement that best describes whether the condition for normality of errors does or does not hold for the linear regression model. a. The residual plot shows no trend, therefore the normality condition is satisfied. b. The residual plot does not display a fan shape, therefore the normality condition is satisfied. c. The QQ plot mostly follows a straight line, therefore the normality condition is satisfied. d. The residual plot mostly follows a horizontal line which would have a slope of zero, therefore the normality condition is not satisfied.

Section 14.2 (Using the Linear Model) 9.

[Objective: Understand the linear model] Choose the statement(s) that are not true about the estimators for slope and intercept of a regression line when the conditions of the linear model hold. If each statement is true choose (d). a. The sampling distribution of the estimators will follow the Normal model. b. The estimators for the slope and intercept of a regression line are unbiased. c. The sampling distributions of the estimators follow the chi-square model. d. All of the above are true.


Use the following information to answer questions (10) – (12). A humanities professor is interested in learning whether there is a positive association between average online homework scores and the final class average in an online humanities course. The computer output below shows the results from a regression model in which the final class average was predicted by the average online homework score. Assume that the conditions of the linear regression model are satisfied.

LinRegTTest for y = a + bx
β > 0
a = −11.15986949   t = 8.285803594
b = 1.111745514    p = 2.0898031E−4
s = 4.349532611    r = .9654613007
df = 124

10. [Objective: Conduct a hypothesis test on a linear model] Choose the correct null and alternative hypotheses to test whether there is an association between the final class average and average online homework scores.
a. H0: There is no linear association between the final class average and the average online homework score. Ha: There is a positive linear association between the final class average and the average online homework score.
b. H0: There is a linear association between the final class average and the average online homework score. Ha: There is no linear association between the final class average and the average online homework score.
c. H0: The correlation is positive. Ha: The correlation is zero.
d. None of the above.

11. [Objective: Conduct a hypothesis test on a linear model] Choose the correct observed value of the test statistic and the p-value. Round to the nearest thousandth.
a. t = 4.345, p = 0.005
b. t = 8.286, p = 0.000
c. t = 8.286, p = 2.090
d. t = 1.112, p = 0.000

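12. [Objective: Conduct a hypothesis test on a linear model] Choose the correct decision regarding the null hypothesis and the correct conclusion. State your conclusion using a significance level of 5%.
a. Fail to reject H0. There is enough evidence to conclude that the final class average is positively associated with the average online homework score.
b. Reject H0. There is not enough evidence to conclude that the final class average is positively associated with the average online homework score.
c. Reject H0. There is enough evidence to conclude that the final class average is positively associated with the average online homework score.
d. Fail to reject H0. There is not enough evidence to conclude that the final class average is positively associated with the average online homework score.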


Use the following information to answer questions (13) - (15). A random sample of 30 couples who were also new home owners were asked to report the cost of their first house and their combined age when they married. The output of a regression analysis for predicting home cost from combined age is shown. Assume that the conditions of the linear regression model are satisfied.

Regression Analysis: Combined age versus home cost
The regression equation is home cost = 73.74 + 2122.75 combined age

Predictor      Parameter Estimate   Standard Error   T-statistic   p-value
Constant       73.74                24.655           0.003         0.998
Combined Age   2122.75              4.814            4.882         0.008

S=19837.70325 R-sq=0.8562890509 r=0.9253588768

13. [Objective: Interpret the regression analysis] What is the slope of the regression line? Choose the statement that is the correct interpretation of the slope in context. a. The slope is 2122.75. On average, for each additional year in combined age, home cost is about $2,122.75 higher, in the sample. b. The slope is 73.74. On average, for couples with a combined age over 73.74, the home cost is an additional $2,122.75 per year over 73.74. c. The slope is 73.74. On average, for each additional year in combined age, the home cost is about $2,122.75 higher, in the sample. d. The slope is 2122.75. On average, for each additional $2,122.75 in home cost, the combined age is about 1 year higher, in the sample.

14. [Objective: Conduct a hypothesis test on a linear model] Test the hypothesis that the slope is zero (significance level is 0.05), then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion. a. Reject H0. There is enough evidence to reject a slope of zero which is an indication that a linear association exists between combined age of the couple and home cost. b. Fail to reject H0. We don’t have enough evidence to reject a slope of zero which is an indication that no linear association exists between combined age of the couple and home cost. c. Reject H0. We don’t have enough evidence to reject a slope of zero which is an indication that no linear association exists between combined age of the couple and home cost. d. Fail to reject H0. There is enough evidence to reject a slope of zero which is an indication that a linear association exists between combined age of the couple and home cost.

15. [Objective: Conduct a hypothesis test on a linear model] If the slope were 1, what would that say about the association? a. If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be 1 dollar more. b. If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be $2,122.75 higher. c. If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be $2,122.75 lower. d. None of the above.

16. [Objective: Interpreting confidence intervals for regression] The regression output below is the result of testing whether there is an association between the number of hours of sleep a student had the night before an exam and the number of questions answered correctly on the exam. Assume that the conditions of the linear regression model are satisfied. What is the 95% confidence interval for the intercept (rounded to the nearest hundredth)? Does this interval support the theory that the intercept is zero? Choose the statement that summarizes your answer in context.

            Coefficients   Standard Error   Lower 95%     Upper 95%
Intercept   -0.25          1.85713484       -4.64144652   4.14144652
Slope       0.087621024    0.023022636      0.033181139   0.142060908

a. (0.03, 0.14). This interval does not support the theory that the intercept could be zero. In this context the intercept is greater than zero so a student could expect to get a positive score on the test even if they did not get any hours of sleep. b. (−4.64, 4.14). This interval supports the theory that the intercept could be zero. In this context this would mean that a student who had zero hours of sleep could reasonably expect to get a zero on the test. c. (−4.64, 4.14). This interval does not support the theory that the intercept could be zero. In this context this would mean that for approximately every 0.08 hours of sleep, the student could expect to get approximately one test question correct. d. None of the above.

Section 14.3 (Predicting Values and Estimating Means)

17. [Objective: Understand confidence intervals and prediction intervals] Which of the following statements is true about prediction intervals? a. Prediction intervals are wider than confidence intervals because there is more uncertainty in predicting an individual’s value. b. A prediction interval is concerned with predicting values for individuals. c. The width of the prediction interval is affected by the size of the standard deviation of the population distribution. d. All of the above statements are true.

Use the following information to answer questions (18) and (19). A high school girls cross country coach performs a regression to predict the finish times of runners in the 10k event from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.

Regression Analysis: Training versus Finish Time
The regression equation is finish time = 69.71 − 0.13 training minutes

Predicted Values:
X value   Pred. Y   s.e. (Pred. y)   95% C.I.         95% P.I.
145       50.86     1.82543          (46.09, 55.63)   (44.85, 56.87)

18. [Objective: Understand confidence intervals and prediction intervals] The coach wants to predict the finish time of his top runner who trained for 145 minutes the previous week. Should the coach use a confidence interval or a prediction interval? a. Confidence Interval b. Prediction Interval

19. [Objective: Understand confidence intervals and prediction intervals] Suppose the coach’s top runner trained for 145 minutes the previous week. If this runner participates in the 10k event, what is the coach’s expected finish time for this runner? Can he be reasonably confident that this runner will beat the time she had at the last meet of 51 minutes? a. Expected finish time is 50.86 minutes. The coach cannot be confident that this runner will beat the previous meet’s time because the interval contains the value of 51 minutes. To be confident the interval would have to lie completely below 51 minutes. b. Expected finish time is 50.86 minutes. The coach can be confident that this runner will beat the previous meet’s time of 51 minutes because the interval contains the value of 51 minutes. c. Expected finish time is 88.56 minutes. The coach cannot be confident that this runner will beat the previous meet’s time because the interval contains the value of 51 minutes. To be confident the interval would have to lie completely below 51 minutes. d. Expected finish time is 88.56 minutes. The coach can be confident that this runner will beat the previous meet’s time of 51 minutes because the interval contains the value of 51 minutes.

20. [Objective: Evaluating the regression model when some conditions do not hold] Which of the following is not true about the coefficient of determination, r²? a. The coefficient of determination, r², ranges from 0% to 100% and represents the amount of variability in the response variable (y) explained by the regression line. b. In order to interpret r², the linearity condition of the linear regression model must be satisfied. c. A hypothesis test should be conducted to verify that r² is large enough to conclude a linear relationship exists. d. The coefficient of determination, r², is a statistic that will give some information about how well the data fits the model, but it should not be the only piece of information taken into consideration when determining how useful a linear model might be.

Chapter 14 Test B—Answer Key 1. D 2. B 3. A 4. D 5. D 6. A 7. A 8. C 9. C 10. A 11. B 12. C 13. A 14. A 15. A 16. B 17. D 18. B 19. A 20. C
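The reasoning behind the keyed answer to question 16 can be sketched in Python. This is only an illustration and not part of the original test: the helper name `interval_contains` is hypothetical, and the bounds are copied from the question 16 regression output. An interval that contains zero is consistent with a true value of zero; an interval that excludes zero is not.

```python
def interval_contains(lower, upper, value=0.0):
    """Return True when value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper

# 95% bounds copied from the question 16 regression output
intercept_ci = (-4.64144652, 4.14144652)
slope_ci = (0.033181139, 0.142060908)

# Zero inside the interval means a true value of zero is plausible
intercept_plausibly_zero = interval_contains(*intercept_ci)
slope_plausibly_zero = interval_contains(*slope_ci)

# Bounds rounded to the nearest hundredth, as the question asks
print(tuple(round(b, 2) for b in intercept_ci), intercept_plausibly_zero)
print(tuple(round(b, 2) for b in slope_ci), slope_plausibly_zero)
```

The intercept interval contains zero, while the slope interval lies entirely above zero.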

Chapter 14 Test C—Short Answer

Section 14.1 (The Linear Regression Model)

1. [Objective: Understanding the Linear Regression Model] Explain what residuals are. Where do residuals come from? How are residuals calculated? Complete the table below by calculating the residuals for the following small data set. The linear model relating x and y is y = 5 − 10x.

x     y      Residual
11    -100
5     -51
7     -63
4     -34

2. [Objective: Understanding the Linear Regression Model] Explain how a residual plot can be useful in determining whether the condition for linearity and the condition for a constant standard deviation have been satisfied.

3. [Objective: Understanding the Linear Regression Model] Researchers studying the relationship between the number of fat grams and the net weight in ounces of a fun size package of peanut M & M’s believe that the deterministic component of the relationship is a straight line. A scatterplot shows that even though the general trend is linear, the points do not fall exactly on a straight line. Describe two factors that might account for the random component of this regression model.

Suppose that you were presented with data showing the association between days absent from class and final class average. Use the following residual plots, based on the data, to answer questions (4) and (5).

[Residual plots, labeled by letter: Residuals versus Number of days absent from class]

4. Which of the residual plots above would suggest that the association between number of days absent from class and final class average is linear? Explain.

5. Which of the residual plots above would suggest that the condition for constant standard deviation might not be satisfied? Explain.

Use the following information to answer questions (6) - (8). Below is the scatterplot showing the association between the number of workers on an assembly team (x), and the number of parts assembled in an 8-hour shift (y). The residual plot of the data is also shown along with a QQ plot of the residuals.

[Figures: Scatterplot (Number of parts versus Number of team members); Residual Plot (Residuals versus Number of team members); QQ Plot (Sample Quantiles versus Theoretical Quantiles)]

6. [Objective: Use the residual plot to check conditions of the linear regression model] Use the plot(s) above to explain whether the condition for normality of errors is satisfied.

7. [Objective: Use the residual plot to check conditions of the linear regression model] Use the plot(s) above to explain whether the condition for constant standard deviation is satisfied.

8. [Objective: Use the residual plot to check conditions of the linear regression model] Use the plot(s) above to explain whether the condition for linearity is satisfied.

Section 14.2 (Using the Linear Model)

9. [Objective: Understand the linear model] Consider the following statement: “When the conditions of the linear model hold, the estimators for slope and intercept are unbiased.” What is meant by the word unbiased in this context?

Use the following information to answer questions (10) – (12). An engineer is interested in learning whether there is an association between temperature (°F) and the strength of an automotive plastic cup holder, which is measured by finding the pounds per square inch (psi) it takes to break the cup holder. The computer output below shows the results from a regression model in which the breaking point in psi was predicted by the temperature. Assume that the conditions of the linear regression model are satisfied.

LinRegTTest for y = a + bx
β > 0
a = 10.06846186    t = 5.9838909
b = .2252509467    p = 9.3423882E−4
s = 2.595293293    r = .9367346759
df = 119

10. [Objective: Conduct a hypothesis test on a linear model] State, in two different ways, the null and alternative hypotheses to test whether there is an association between temperature and break strength.

11. [Objective: Conduct a hypothesis test on a linear model] What is the observed value of the test statistic and the p-value? Round to the nearest thousandth.

12. [Objective: Conduct a hypothesis test on a linear model] State the decision regarding the null hypothesis and the correct conclusion. State your conclusion using a significance level of 5%.

Use the following information to answer questions (13) - (15). The pulse rate of a random sample of 30 second graders was recorded before and after a fifteen minute recess. The output of a regression analysis for predicting pulse rate after recess from pulse rate before recess is shown. Assume that the conditions of the linear regression model are satisfied.

Regression Analysis: Pulse before recess versus Pulse after recess
The regression equation is pulse after = 14.62 + 0.92 pulse before

Predictor       Parameter Estimate   Standard Error   T-statistic   p-value
Constant        14.6190              8.536            1.713         0.147
Pulse before    0.9167               0.111            8.26          0.000

S = 7.794899045   R-sq = 0.036858791   r = 0.1919864344

13. [Objective: Interpret the regression analysis] State the slope of the regression line. Write a sentence explaining what this slope means in this context.

14. [Objective: Conduct a hypothesis test on a linear model] Test the hypothesis that the slope is zero (significance level is 0.05), then state the correct decision regarding the null hypothesis and write a statement that correctly summarizes the conclusion in context.

15. [Objective: Conduct a hypothesis test on a linear model] If the intercept was 0 and the slope was 1, explain how the linear model would be interpreted in context.

16. [Objective: Interpreting confidence intervals for regression] The regression output below is the result of testing whether there is an association between the number of practice test problems a student completed and the number of questions answered correctly on the test. Assume that the conditions of the linear regression model are satisfied. What is the 95% confidence interval for the slope (rounded to the nearest hundredth)? Does this interval support the theory that the slope is zero? Write a statement that summarizes your answer in context.

            Coefficients   Standard Error   Lower 95%     Upper 95%
Intercept   -2.12319885    2.3432           -8.14673883   3.90034114
Slope       1.108789625    0.0977           0.85765303    1.359926221

Section 14.3 (Predicting Values and Estimating Means)

17. [Objective: Understand confidence intervals and prediction intervals] Explain the difference between confidence intervals and prediction intervals. Be sure to include the type of situation in which each type of interval would be used. Which interval is likely to be wider and why?

Use the following information to answer questions (18) and (19). A high school boys track and field coach performs a regression to predict the vault height (in feet) of a pole vault from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.

Regression Analysis: Training versus vault height
The regression equation is vault height = 12.75 + 0.015 training minutes

Predicted Values:
X value   Pred. Y   s.e. (Pred. y)   95% C.I.           95% P.I.
175       15.375    1.24113          (12.275, 18.475)   (10.875, 19.875)

18. [Objective: Understand confidence intervals and prediction intervals] The coach wants to predict the jump height of his top pole vaulter, who trained for 175 minutes the previous week. Should the coach use a confidence interval or a prediction interval?

19. [Objective: Understand confidence intervals and prediction intervals] Suppose the coach’s top pole vaulter trained for 175 minutes the previous week. If this athlete participates in the pole vault event, what is the coach’s expected jump height for this athlete? Can he be reasonably confident that this athlete will beat the previous season’s record of 19.8 feet? Explain.

20. [Objective: Evaluating the regression model when some conditions do not hold] State the two conditions that must be satisfied, without exception, to make inferences using a linear model.

Chapter 14 Test C—Answer Key

1. A residual is the vertical distance between an observed value and the value predicted by the linear model. Residuals arise because observations naturally vary around the line. A residual is calculated as the observed value minus the predicted value of the dependent variable.

x     y      Residual
11    -100   5
5     -51    -6
7     -63    2
4     -34    1
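The residual calculation for answer 1 can be checked with a short Python sketch. It is an illustration only, not part of the original key; the function name `predict` is hypothetical, and the model y = 5 − 10x and the data are taken from question 1. Each residual is the observed y minus the predicted y.

```python
def predict(x):
    """Predicted y from the stated linear model y = 5 - 10x."""
    return 5 - 10 * x

# Data from question 1
xs = [11, 5, 7, 4]
ys = [-100, -51, -63, -34]

# Residual = observed y minus predicted y
residuals = [y - predict(x) for x, y in zip(xs, ys)]

for x, y, res in zip(xs, ys, residuals):
    print(x, y, predict(x), res)
```

For example, at x = 11 the model predicts 5 − 110 = −105, so the residual is −100 − (−105) = 5.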

2. If the original data is linear, the residual plot should have no slope or trend. If the residual plot shows a fan shape, then the condition that y-values have a constant standard deviation may not be satisfied.

3. Answers will vary, but possible answers could include (1) M & M’s might contain different size peanuts and (2) there might be variability in the instrument used to measure net weight.

4. Residual plot (a) indicates that the relationship is linear because the plot shows no trend and a slope of zero.

5. Residual plot (d) indicates that the condition for a constant standard deviation may not be satisfied because the plot shows a fan shape.

6. The QQ plot shows a linear trend, so it is consistent with the claim of normality of errors.

7. The residual plot does not have a fan shape, so it is consistent with the claim of constant standard deviation.

8. The residual plot shows no trend, so it is consistent with the claim of linearity.

9. This means that, on average over many samples, the calculated values for the intercept and slope, based on the sample data, will equal the population values.

10. Possible answers: H0: There is no linear association between the temperature and the break strength of the cup holder. Ha: There is a positive linear association between the temperature and the break strength of the cup holder. H0: The slope equals zero. Ha: The slope does not equal zero. H0: The correlation is zero. Ha: The correlation is not zero.

11. t = 5.984, p = 0.001

12. Reject H0. There is enough evidence to conclude that the temperature is positively associated with the break strength of the cup holder.

13. The slope is 0.92. On average, for each additional beat per minute before recess, the second grader has a pulse rate that is about 0.92 beats per minute higher after recess.

14. Reject H0. There is enough evidence to reject a slope of zero which is an indication that a linear association exists between pulse rate before recess and pulse rate after recess.

15. It would mean that on average, pulse rate before and after recess is the same.

16. (0.86, 1.36). This interval does not support the theory that the slope could be zero. In this context the slope is greater than zero, so students who complete more practice problems tend to answer more test questions correctly.

17. A confidence interval is used to estimate a population parameter. A prediction interval is used to estimate a value for an individual. Prediction intervals are likely to be wider than confidence intervals because the sampling distribution for the mean has a smaller standard deviation than the population distribution.
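For answers 13 and 14, the t-statistics in the pulse-rate output can be reproduced from the estimates and standard errors. The sketch below is illustrative only and not part of the original key; the function name `t_statistic` is hypothetical, and the inputs are copied from the regression output for questions (13) - (15). For a null hypothesis that a coefficient is zero, the test statistic is the estimate divided by its standard error.

```python
def t_statistic(estimate, standard_error):
    """Test statistic for H0: coefficient = 0."""
    return estimate / standard_error

# Values copied from the pulse-rate regression output
intercept_t = t_statistic(14.6190, 8.536)
slope_t = t_statistic(0.9167, 0.111)

alpha = 0.05
slope_p_value = 0.000  # as reported in the output (rounded)
reject_null = slope_p_value < alpha

print(round(intercept_t, 3), round(slope_t, 2), reject_null)
```

The computed values agree with the reported T-statistics of 1.713 and 8.26, and the reported p-value for the slope is below 0.05, which matches the decision to reject the null hypothesis.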

18. Prediction interval

19. Expected pole vault height is 15.375 feet. The coach cannot be confident that this athlete will beat the previous season’s record of 19.8 feet, because almost all of the 95% prediction interval (10.875, 19.875) lies below 19.8 feet. To be confident, the interval would have to lie entirely above 19.8 feet.

20. (1) The linearity condition and (2) the independence of errors must be satisfied for the linear model to be appropriate.
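The arithmetic behind answer 19 can be checked with a short Python sketch. It is an illustration under stated assumptions, not part of the original key: the function name `predicted_vault_height` is hypothetical, and the regression equation, prediction interval, and record are copied from questions (18) and (19). The sketch computes the predicted height and compares the record with the 95% prediction interval.

```python
def predicted_vault_height(training_minutes):
    """Regression equation from the output: 12.75 + 0.015 * training minutes."""
    return 12.75 + 0.015 * training_minutes

prediction_interval = (10.875, 19.875)  # 95% P.I. from the output, in feet
record = 19.8                           # previous season's record, in feet

height = predicted_vault_height(175)
lower, upper = prediction_interval

# Confidence in beating the record would require the whole interval above it
interval_entirely_above_record = lower > record
record_inside_interval = lower <= record <= upper

print(round(height, 3), interval_entirely_above_record, record_inside_interval)
```

The record of 19.8 feet falls just inside the upper end of the prediction interval, so the interval does not lie entirely above it.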