Ch. 1 Introduction to Data 1.1 What Are Data? 1 Understand Concepts Regarding Data MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Data can be defined as numbers in context. Suppose you are given the following set of numbers: 1.73, 1.83, 1.57, 1.88, 1.70, 1.65 What additional information would allow you to define these numbers as data? A) Units of measurement. This could represent the heights of six 5-year-olds, in meters. B) Units of measurement. This could represent the heights of six 20-year-olds, in meters. C) We need to know where these numbers were collected. D) We need to know who collected these numbers. Answer: B 2) Data can be defined as numbers in context. Suppose you are given the following set of numbers: 18, 22, 22, 20, 19, 21 What additional information would allow you to define these numbers as data? A) We need to know where these numbers were collected. B) We need to know who collected these numbers. C) Units of measurement. This could represent the ages of six high school students. D) Units of measurement. This could represent the ages of six college students. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) Give an example of how data could be collected about you on a daily basis. Answer: Answers will vary. Examples might include: Facebook postings, Twitter tweets, Instagram photos, emails sent/received, credit/debit card swipes, GPS, text messaging, etc.
1.2 Classifying and Storing Data
1 Understand the Fundamentals of Statistics
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) A statistics student collected data from other students in her class who ride a bike to school. The following table shows data about their bikes:
Color   Series Number   Weight (lbs)   Road Bike   Average Speed (mph)
Black   A120            32             0           16
Blue    B640            22             1           24
Green   C300            26             0           14
Black   D90             16             1           23
How many variables are there? A) 5 B) 4 C) 20 D) 7
Answer: A
Page 1 Copyright © 2020 Pearson Education, Inc.
2) A statistics student collected data from other students in her class who ride a bike to school. The following table shows data about their bikes:
Color   Series Number   Weight (lbs)   Road Bike   Average Speed (mph)
Black   A120            32             0           16
Blue    B640            21             1           24
Green   C300            29             0           14
Black   D90             14             1           23
Observations were made on how many bikes? A) 4 B) 5 C) 20 D) 7
Answer: A
3) In a recent school poll, the administrators asked if students were satisfied with the school's course offerings. What is the population of interest here? A) All students who are satisfied with the course offerings. B) All students who are not satisfied with the course offerings. C) All students who attend the school. D) All students who participated in the poll. Answer: C
4) In a recent high school poll, the principal asked if students were satisfied with the amount of after-school activities offered. What is the population of interest here? A) All students who attend the school. B) All students who participated in the poll. C) All students who are satisfied with the amount of after-school activities that are offered. D) All students who are not satisfied with the amount of after-school activities that are offered. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
5) In a recent survey at UCLA, some incoming freshman students were asked if they planned to take more than one math class before they graduated. What is the population of interest here and what is the sample? Answer: The population is the entire freshman class at UCLA. The sample includes the particular freshmen who participated in the survey.
2 Distinguish Between Numerical and Categorical Variables
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) The average gas mileage of the top-selling mini-vans for each U.S. car manufacturer is an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: A
2) A state senator's comments about the dangers of global warming are an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: C
3) Marital status of each member of a randomly selected group of adults is an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: B
4) The ethnicity of the individual respondents in a political poll of a randomly selected group of adults is an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: B
5) The average number of hours spent completing statistics homework for a randomly selected group of statistics students is an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: A
6) The number of parents who attended parent teacher conferences at a local elementary school is an example of what type of variable? A) Numerical variable B) Categorical variable C) Neither Answer: A
7) A bicycle manufacturer produces four different bicycle models. Information is summarized in the table below:
Model         Series Number   Weight   Style
Ascension     A120            33       Mountain
Road Runner   B640            20       Road
All Terrain   C300            29       Hybrid
Class Above   D90             14       Racing
Identify the variables and determine whether each variable is numerical or categorical. A) series number: categorical; weight: numerical; style: categorical B) series number: numerical; weight: numerical; style: categorical C) series number: numerical; weight: categorical; style: categorical D) series number: categorical; weight: categorical; style: categorical Answer: A
8) An international relations professor is supervising four master's students. Information about the students is summarized in the table.
Student Name   Student Number   Area of Interest   GPA
Anna           914589205        Africa             3.44
Pierre         981672635        Middle East        3.13
Juan           906539012        Latin America      3.30
Yoko           977530271        Asia               3.48
Identify the variables and determine whether each variable is numerical or categorical. A) student number: categorical; area of interest: categorical; GPA: numerical B) student number: numerical; area of interest: categorical; GPA: numerical C) student number: numerical; area of interest: categorical; GPA: categorical D) student number: categorical; area of interest: categorical; GPA: categorical Answer: A
9) Determine which of the following five variables are numerical and which are categorical: age, gender, weight, ethnicity, favorite math class. A) All of the variables are categorical. B) All of the variables are numerical. C) Age, weight, and favorite math class are numerical variables. Gender and ethnicity are categorical variables. D) Age and weight are numerical variables. Gender, ethnicity, and favorite math class are categorical variables. Answer: D
10) Determine which of the following five variables are numerical and which are categorical: age, gender, height, favorite candy, eye color. A) Age, height, and favorite candy are numerical variables. Gender and ethnicity are categorical variables. B) Age and height are numerical variables. Gender, favorite candy, and eye color are categorical variables. C) All of the variables are categorical. D) All of the variables are numerical. Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
11) Give an example of one categorical variable and one numerical variable. Answer: Answers will vary. Examples might include: categorical - gender, favorite candy, year in school, favorite color, etc.; numerical - age, height, weight, speed, etc.
3 Understand Methods for Coding Categorical Variables
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) In a survey, married couples were asked, "Do you have children?" The response was electronically recorded as a "1" for yes and a "0" for no. This is an example of ____. A) Coded categorical data B) Unstacked numerical data C) Random sample D) None of these Answer: A
2) In a survey, high school graduates were asked, "Did you play sports in high school?" The response was electronically recorded as a "1" for yes and a "0" for no. This is an example of ____. A) Random sample B) Unstacked numerical data C) Coded categorical data D) None of these Answer: C
3) According to the following data table, which variable(s) is(are) categorical?
Age   Gender   Weight   Ethnicity
23    1        180      1
18    0        126      0
20    0        139      2
19    1        154      1
20    1        202      3
A) None are categorical because there are only numbers in the table B) Age, gender, and ethnicity C) Gender and ethnicity D) Gender Answer: C
4) According to the following data table, which variable(s) is(are) categorical?
Age   Gender   Shoe Size   Ethnicity
18    1        10          1
23    0        7           0
21    0        6           2
19    1        11          1
20    1        10          3
A) Gender B) Gender and ethnicity C) Gender, shoe size, and ethnicity D) None are categorical because there are only numbers in the table Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
5) In the following table, gender is a categorical variable. Give one possible way the variable could have been coded.
Answer: Two possible ways to code: 0 - Male, 1 - Female; OR 0 - Female, 1 - Male
4 Organize Data in Stacked Format and Unstacked Format
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) The table gives the GPA and gender of students in a business class.
GPA    Female
3.89   1
3.45   0
3.56   0
3.58   1
Is the format of the data set stacked or unstacked? A) stacked B) unstacked
Answer: A
2) The table gives the GPA of some students in two math classes. One class meets in the morning and one in the afternoon.
Morning   Afternoon
3.67      3.59
2.97      3.84
3.12      3.78
3.64      3.63
Is the format of the data set stacked or unstacked? A) unstacked B) stacked
Answer: A
3) The following data table is organized using which method?
Men's Ages   Women's Ages
35           42
39           33
41           37
37           35
40           39
A) This is stacked data because the ages are separated by groups (in this case, gender). B) This is stacked data because each row represents one person. C) This is unstacked data because the ages are separated by groups (in this case, gender). D) This is unstacked data because each row represents one person. Answer: C
4) The following data table is organized using which method?
Gender   Age
Male     35
Female   42
Female   33
Male     37
Female   39
A) This is stacked data because the ages are separated by groups (in this case, gender). B) This is stacked data because each row represents one person. C) This is unstacked data because the ages are separated by groups (in this case, gender). D) This is unstacked data because each row represents one person. Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
5) Determine whether the following data table is stacked or unstacked and explain your reasoning.
Answer: This is stacked data because each row represents one person.
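The difference between the two layouts in this section can be sketched in a few lines of Python. This is only an illustration, not part of the test bank: it uses the ages from question 3 as the unstacked input, and the helper names `stack` and `unstack` and the 1 = Female, 0 = Male coding are assumptions chosen for the example.

```python
# Unstacked layout: one list of values per group, as in question 3.
unstacked = {
    "Male": [35, 39, 41, 37, 40],
    "Female": [42, 33, 37, 35, 39],
}


def stack(groups):
    """Return one (group, value) row per individual (stacked layout)."""
    return [(group, value) for group, values in groups.items() for value in values]


def unstack(rows):
    """Collect (group, value) rows back into one list per group."""
    groups = {}
    for group, value in rows:
        groups.setdefault(group, []).append(value)
    return groups


stacked = stack(unstacked)

# Coded categorical variable (assumed coding): 1 for Female, 0 for Male.
coded = [(1 if group == "Female" else 0, age) for group, age in stacked]

for row in stacked:
    print(row)
```

In the stacked result each row describes one person, which is the reasoning the answers above rely on; `unstack(stacked)` recovers the original per-group lists.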
1.3 Investigating Data 1 Determine Whether Questions Related to Variables in a Given Table Can be Answered by the Table MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the data in Table 1A to answer the question.
Note: 1 is female, 0 is male.
1) Suppose you wanted to know whether the studentʹs commute distance was associated with the studentʹs living situation. Using the data table if possible, which variables would you use? A) Use Commute Distance (Miles) and Living Situation. B) Use Commute Distance (Miles) and College Units Acquired. C) Data on studentʹs living situation are not included in this study. D) Use College Units Acquired and Living Situation. Answer: A 2) Suppose you wanted to know whether the men or the women had larger ring sizes. In the Female column of the table, 1 represents Female and 0 stands for Male. Using the data table, if possible, which variables would you use? A) Use Female and Ring Size. B) Use Female and Height. C) Data on studentʹs ring size are not included in this study. D) Use Height and Ring Size. Answer: A 3) Suppose you wanted to know whether the studentʹs height was associated with the studentʹs weight. Using the data table, if possible, which variables would you use? A) Data on studentʹs weight are not included in this study. B) Use Height and Weight. C) Use Female and Height. D) Use Weight and Ring Size. Answer: A
4) Suppose you wanted to know whether the student's hair color was associated with the shoe size. Using the data table, if possible, which variables would you use? A) Data on Shoe Size are not included in this study. B) Use Hair Color and Number of Aunts. C) Use Hair Color and Living Situation. D) Use Hair Color and Ring Size. Answer: A
A data set on Shark Attacks Worldwide posted on StatCrunch records data on all shark attacks in recorded history including attacks before 1800. Variables contained in the data include time of attack, date, location, activity the victim was engaged in when attacked, type of injuries sustained by the victim, whether or not the injury was fatal, and species of shark. Which of the following questions could not be answered using this data set? (Source: www.sharkattackfile.net)
5) Using the data described, if possible, which variable(s) would you use to determine in which year the fewest shark attacks occurred? A) Use Date. B) Use Hair Color and Number of Aunts. C) Use Location. D) Data on the year are not included in the table. Answer: A
6) Using the data described, if possible, which variable would you use to determine if shark attacks happen more often to men than women? A) Data on gender of the victim are not included in the table. B) Use Activity of the Victim. C) Use Type of Injury. D) Use Species of Shark. Answer: A
1.4 Organizing Categorical Data
1 Find Frequencies, Proportions, and Percentages and Use Them to Describe and Compare Data
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
In a study of 900 adults, 45 out of the 325 men in the study said that they preferred to rent a movie on DVD rather than going out to a movie theater.
1) What is the approximate percentage of men in this study who prefer to rent a movie on DVD? A) 13.8% B) 36% C) 5% Answer: A
2) What is the approximate percentage of women who participated in this study? A) 41% B) 63.9% C) 7.8% D) Not enough information available Answer: B
In a study of 1050 adults, 175 out of the 650 women in the study said that they preferred to drive an SUV to driving a compact car.
3) What is the approximate percentage of women in this study who said that they prefer to drive an SUV to driving a compact car? A) 61.9% B) 16.7% C) 26.9% Answer: C
4) What is the approximate percentage of study participants who are women? A) 61.9% B) 16.7% C) 26.9% D) Not enough information available Answer: A
Solve the problem.
5) In a sample of 775 senior citizens, approximately 67% said that they had seen a television commercial for life insurance. About how many senior citizens is this? A) 256 B) 67 C) 519 D) Not enough information available. Answer: C
6) In a sample of 800 first-year college students, 72% said that they check their Facebook page at least three times a day. How many students is this? A) 72 B) 576 C) 224 D) Not enough information available. Answer: B
The two-way table below shows teenage driver gender and whether or not the respondent had texted at least once while driving during the last thirty days.
7) What percentage of the sample had texted at least once while driving in the past thirty days? A) 62.5% B) 37.5% C) 50% D) 43.75% Answer: B
8) What percentage of the sample were female drivers? A) 62.5% B) 50% C) 78% D) 28.3% Answer: B
The two-way table below shows the survey results when sixty adults were asked whether they had made a clothing purchase in the last thirty days.
9) What percentage of the sample had not made a clothing purchase in the past thirty days? A) 35% B) 50% C) 33% D) 65% Answer: A
10) Of the adult males surveyed, what percentage had made a clothing purchase in the last thirty days? A) 35% B) 50% C) 33% D) 65% Answer: B
In a study of 1350 elementary school children, 118 out of the 615 girls in the study said they want to be a teacher when they grow up.
11) What percent of the study's participants were boys? A) 19.2% B) 45.6% C) 54.4% D) 83.7% Answer: C
12) What percent of girls want to be a teacher when they grow up? A) 8.7% B) 19.2% C) 45.6% D) 80.8% Answer: B
In a study of 1200 adults, 480 out of the 630 women in the study said they attended a state college or university.
13) What percent of the study's participants were women? A) 40% B) 47.5% C) 52.5% D) 76.2% Answer: C
14) What percent of women attended a state college or university? A) 40% B) 47.5% C) 52.5% D) 76.2% Answer: D
Solve the problem.
15) According to the following two-way table, what percent of people in the sample prefer dogs?
       Male   Female
Dog    40     25
Cat    25     10
A) 25% B) 35% C) 40% D) 65% Answer: D
16) According to the following two-way table, why are percentages more useful than counts to compare pet preferences between males and females?
       Male   Female
Dog    40     25
Cat    25     10
A) There are more males than females in the sample. B) There are more people who prefer dogs than cats in the sample. C) You should only use counts in a two-way table. D) You should only use percentages in a two-way table. Answer: A
17) According to the following two-way table, what percent of people in the sample take naps? A) 25% B) 35% C) 55% D) 60% Answer: C
18) According to the following two-way table, why are percentages more useful than counts to compare the amount of males and females who take naps? A) There are more males than females in the sample. B) There are more people who take naps than people who do not in the sample. C) You should only use counts in a two-way table. D) You should only use percentages in a two-way table. Answer: A
19) A two-way table is useful for describing which types of variables? A) Two numerical variables. B) Two categorical variables. C) One numerical variable. D) One numerical variable and one categorical variable. Answer: B 20) A two-way table could be used for which of the following pairs of variables? A) Age and height B) Gender and age C) Gender and favorite class D) Age and favorite class Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 21) What types of variables are represented in a two-way table? Give an example. Answer: Two categorical variables. Answers will vary. Examples might include: gender & favorite color, gender & year in school, year in school & favorite animal, etc. In a recent study of 1200 adult smokers, 125 out of the 560 males in the study said they were interested in joining a help group to quit smoking. 22) What percent of the studyʹs participants were female? Answer:
640/1200 = 0.533 = 53.3%
23) What percent of males are interested in joining this group? Answer:
125/560 = 0.223 = 22.3%
Solve the problem. 24) According to the following two-way table, what percent of people in the sample eat breakfast?
Answer:
75/100 = 0.75 = 75%
25) According to the following two-way table, why are percentages more useful than counts to compare the amount of males and females who eat breakfast?
Answer: The group sizes are different. There are 55 males, but only 45 females.
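The percentage calculations in this section all follow the same pattern: divide a count by the size of the group it belongs to. The Python sketch below is an illustration only, using the dog/cat two-way table from questions 15 and 16; the helper name `percent` and the dictionary layout are assumptions made for the example.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Two-way table from questions 15 and 16: rows are pets, columns are genders.
table = {
    "Dog": {"Male": 40, "Female": 25},
    "Cat": {"Male": 25, "Female": 10},
}

total = sum(sum(row.values()) for row in table.values())
dog_total = sum(table["Dog"].values())
male_total = sum(row["Male"] for row in table.values())
female_total = sum(row["Female"] for row in table.values())

# Share of the whole sample that prefers dogs.
print(percent(dog_total, total))

# Share of each gender that prefers dogs (denominator is that gender's total).
print(percent(table["Dog"]["Male"], male_total))
print(percent(table["Dog"]["Female"], female_total))
```

Because the sample has 65 males and 35 females, the raw counts (40 versus 25) and the within-group percentages point in different directions, which is why the answers above favor percentages when group sizes differ. The same helper reproduces the short-answer arithmetic, for example `percent(640, 1200)` and `percent(125, 560)`.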
2 Identify Missing Information MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Identify the type of sampling used. 1) A recent report showed there were 43 accidents involving pedestrians in City A and 62 accidents involving pedestrians in City B this year. The mayor of City A claims that his city is safer for pedestrians than City B. What information is missing that might contradict this claim? A) The total number of pedestrians in both City A and City B B) The number of accidents that do not involve pedestrians in both City A and City B C) The number of crosswalks in both City A and City B D) The number of accidents involving pedestrians from the previous year Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Answer the question. 2) Only two cafeterias are available at a large university. The first offers vegetarian food and the second offers only non-vegetarian meals. The vegetarian cafeteria serves 30 students on a given Friday, while the non-vegetarian cafeteria serves 15 lunches on that same Friday. A student claims that this is evidence that students who were on campus on that Friday preferred vegetarian food. What information is missing that might contradict this claim? Answer: It is not known the percentage of the student body in the two cafeterias on Friday. The larger number of students eating at the first cafeteria on Friday could be because the first cafeteria has a larger capacity than the second cafeteria or that it is closer to campus. An alternate possibility could be that we donʹt know the number of students on campus that Friday. Quite possibly the university has more than 45 students, and we donʹt know what the rest of them ate. (Presumably they went off campus or brought their own food.) 3) In a national safety report, the number of bicyclist fatalities in City X was 108 and the number of bicyclist fatalities in City Y was 59. 
Can we conclude that bicyclists are less safe in City X than in City Y? If you answered no, what additional data would allow us to make a conclusion about which city is less safe for bicyclists? Answer: We cannot conclude that bicyclists are less safe in City X than in City Y. The population of each city would be needed to compare the fatality percent or rate with respect to total population.
4) The number of clinically obese men in State A is 156,261 and the number of clinically obese men in State B is 294,269. Someone makes the claim that this is evidence that men exercise more in State A. What information is missing that might contradict this claim? Answer: We need to know the total number of men in State A and State B so that a comparison can be made of the percentage of the men in each state that are clinically obese. There could be a much higher male population in State B than State A. Also, assumptions about exercise and obesity are being made.
5) In a study at one university, it has been recorded that Model 1 smart phone screens were brought to a shop to be repaired 5,876 times in one year. Model 2 smart phone screens were brought into the same shop to be repaired only 702 times that year. Can we conclude that Model 1 smart phone screens are more fragile than Model 2 smart phone screens? If you answered no, what additional data would allow us to make a conclusion about which type of smart phone screen is more fragile? Answer: It cannot be concluded that Model 1 smart phone screens are more fragile than Model 2 smart phone screens. We need to know the percentage of each smart phone model brought into the store for screen repairs. To find this percentage, the number of smart phones of each model in the population is required. Model 1 smart phones could be a lot more popular than Model 2 smart phones, for instance.
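The common thread in these answers is that a raw count needs a denominator before two groups can be compared. As a rough Python sketch, the fatality counts below come from question 3, while the two population figures are hypothetical values invented purely to show how the comparison can flip; the function name `rate_per_100k` is likewise an assumption.

```python
def rate_per_100k(count, population):
    """Return the number of events per 100,000 people."""
    return count / population * 100_000


# Counts from question 3.
fatalities_x = 108
fatalities_y = 59

# HYPOTHETICAL populations, not given in the question.
population_x = 1_200_000
population_y = 300_000

rate_x = rate_per_100k(fatalities_x, population_x)
rate_y = rate_per_100k(fatalities_y, population_y)

print(f"City X: {rate_x:.1f} per 100,000")
print(f"City Y: {rate_y:.1f} per 100,000")
```

Under these made-up populations City X has more fatalities but the lower rate, so the count alone would be misleading. With different populations the conclusion could go the other way, which is exactly why the missing totals matter.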
1.5 Collecting Data to Understand Causality
1 Distinguish Between Observational Studies and Controlled Experiments
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Indicate whether the study described is an observational study or a controlled experiment.
1) The obesity rates of elementary age children living in urban areas are compared to those living in rural areas to see whether children in urban settings have higher obesity rates. A) Observational study B) Controlled experiment Answer: A
2) "People with diabetes are at higher risk for certain cancers than those without the blood sugar disease, suggests a new study based on a telephone survey of nearly 400,000 adults." A) Observational study B) Controlled experiment Answer: A
3) A group of students is divided into two groups. One group is given a new chewable vitamin and the other group is given a placebo. After six months they are asked to fill out a questionnaire and given a health exam to see whether the new vitamin has health benefits that are better than a placebo. A) Observational study B) Controlled experiment Answer: B
4) The smoking rates of teens in urban areas are compared to those living in rural areas to see whether teens living in rural settings have higher rates of smoking. A) Observational study B) Controlled experiment Answer: A
5) A group of cancer patients is divided into two groups. One group is given a new drug to fight the side effects of chemotherapy and the other group is given a placebo. After three months they are asked to respond to a questionnaire about the frequency and severity of their side effects to see whether the new drug improved the overall negative side effects of chemotherapy. A) Observational study B) Controlled experiment Answer: B
6) A group of students is divided into two groups. One group listens to classical music while taking a math test and the other group takes the test in silence. The average test scores of the two groups are compared to see whether listening to music during a math test has an effect on scores. A) Observational study B) Controlled experiment Answer: B
Determine if the following scenario is an observational study or a controlled experiment.
7) A doctor is interested in determining whether a certain medication increases the risk of high blood pressure. He randomly selects 100 people for his study - 50 who will take the medication, and 50 who will take a placebo. He checks the patients' blood pressures weekly for six months. A) Observational study B) Controlled experiment C) Neither Answer: B
8) A doctor is interested in determining whether a certain medication increases the risk of high blood pressure. He reviews his patients' medical records and finds that a higher proportion of people who take the medication are suffering from high blood pressure. A) Observational study B) Controlled experiment C) Neither Answer: A
9) A doctor is interested in determining whether a certain medication reduces migraines. She randomly selects 100 people for her study - 50 who will take the medication, and 50 who will take a placebo. The patients are examined once a week for six weeks. A) Observational study B) Controlled experiment C) Neither Answer: B
10) A doctor is interested in determining whether a certain medication reduces migraines. She reviews her patients' medical records and finds that a higher proportion of people who take the medication have fewer migraines than those who did not take the medication. A) Observational study B) Controlled experiment C) Neither Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Determine if the following scenario is an observational study or a controlled experiment and explain your reasoning.
11) A school teacher is interested in determining whether students who take multiple choice tests do better than students who take true/false tests. She has been giving multiple choice tests since she started teaching and is wondering if she should change her testing method. She randomly assigns half of her students to take a multiple choice test about grammar rules, and the other half to take a true/false test about grammar rules. She compares the test scores of the students in each group. Answer: This is a controlled experiment because the students are randomly assigned to the treatment group (true/false test) and the control group (multiple choice test).
12) A doctor is interested in determining whether a certain medication is effective at treating abdominal pain. He reviews his patients' medical records and finds that a higher proportion of people who took the medication had fewer abdominal pain symptoms than those who did not take the medication. Answer: This is an observational study because the doctor did not randomly assign patients into groups. Instead, he simply looked at medical files.
2 Identify Potential Problems and/or Improvements for a Research Study MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) Consider the following statement ʺMy child was bullied on the school bus and so was my neighborʹs child, so obviously, bullying is a big problem on school buses and something needs to be done about it!ʺ What is wrong with this statement? A) The statement exhibits bias. B) The statement is anecdotal. C) The person making the statement confused correlation with causation. D) None of these--the statement is valid. Answer: B 2) Before opening a new dealership, an auto manufacturer wants to gather information about car ownership and driving habits of the local residents. The marketing manager of the company randomly selects 1000 households from all households in the area and mails a questionnaire to them. Of the 1000 surveys mailed, she receives 130 back. What is the problem with how the information is gathered? A) The only responses were from people who chose to send the survey back. B) The 1000 surveys were not sent to randomly selected households. C) Only residents from the local area were polled. D) To get a random sample, surveys would have to be mailed to every household. Answer: A
3 Understand When and Why to Infer or Not Infer a Cause-and-Effect Relationship from a Research Study
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Consider the following statement, "Babies who breastfeed are less likely to grow into children with behavioral problems by the time they reach age 5 than those who receive formula milk." Which of the following is a plausible confounding variable in this study? A) The quality of the formula milk B) Mother's social-economic status C) The age at which breastfeeding ends D) All of these E) None of these Answer: D
2) Consider the following statement: "Researchers conducted a large observational study and determined that children who participated in school music programs scored higher on math exams in later grades than those who did not." Suppose that upon hearing this a politician states that all children should participate in school music programs. What is wrong with the politician's statement? A) There was a placebo effect. B) This study exhibits bias. C) The controlled experiment was not double-blinded. D) The politician confused correlation with causation. Answer: D
3) Consider the following statement, "In a nationwide study, children on an all-organic diet are more alert in school than those not on an all-organic diet." Which of the following is a plausible confounding variable in this study? A) The quality of the non-organic diet B) Parents' social-economic status C) School start times D) All of these E) None of these Answer: D
4) Researchers conducted an experiment to determine if riding a bike to school improves attention span. What are the treatment and outcome variables? A) The treatment variable is riding a bike to school. The outcome variable is whether or not the child rode a bike to school. B) The treatment variable is riding a bike to school. The outcome variable is the child's attention span. C) The treatment variable is attention span. The outcome variable is whether or not the child rode a bike to school. D) The treatment variable is attention span. The outcome variable is the child's attention span score. Answer: B
5) Researchers conducted an experiment to determine if children who participate in a new after-school tutoring program do better on state-mandated tests than children who do not attend the program. What are the treatment and outcome variables?
A) The treatment variable is participation in the after-school program. The outcome variable is whether or not a child attended.
B) The treatment variable is participation in the after-school program. The outcome variable is the test score on the state-mandated test.
C) The treatment variable is the state-mandated test. The outcome variable is the participation in the after-school program.
D) The treatment variable is the state-mandated test. The outcome variable is the test score on the state-mandated test.
Answer: B
6) Researchers conducted a study and determined that students who carpool have fewer friends than students who ride the bus to school. Can we conclude that carpooling causes students to have fewer friends?
A) Yes, this is an observational study and we can conclude causation.
B) Yes, this is an experiment and we can conclude causation.
C) No, this is an observational study and we cannot conclude causation.
D) No, this is an experiment and we cannot conclude causation.
Answer: C
7) Researchers conducted a study and determined that students who participate in sports are happier than students who do not. Can we conclude that participating in sports makes students happier?
A) Yes, this is an observational study and we can conclude causation.
B) Yes, this is an experiment and we can conclude causation.
C) No, this is an observational study and we cannot conclude causation.
D) No, this is an experiment and we cannot conclude causation.
Answer: C
8) A gym is offering a new 6-week diet plan for its members. Members who sign up for the program are weighed and measured once a week for the duration of the program. The owners of the gym want to know if the diet plan actually helps people lose weight. What variable could be a possible confounding factor in determining the cause of weight loss?
A) The person's education level.
B) The person's marital status.
C) The person's social life.
D) The person's exercise routine.
Answer: D
9) A gym is offering a new 6-week weight loss exercise program for its members. Members who sign up for the program are weighed and measured once a week for the duration of the program. The owners of the gym want to know if the weight loss program actually helps people lose weight. What variable could be a possible confounding factor in determining the cause of weight loss?
A) The person's commitment to the program.
B) The person's marital status.
C) The person's family structure.
D) The person's diet.
Answer: D
10) Coconut oil has become quite popular in recent years. People who use coconut oil claim it helps with hair care, skin care, stress relief, weight loss, and a boosted immune system. Can we conclude that the use of coconut oil causes these health benefits?
A) Yes, the claims are anecdotes and give us a good comparison group to find health differences.
B) No, the claims are anecdotes and do not give us a true comparison group to find health differences.
C) Yes, the claims are true stories, so we do have evidence of the health benefits.
D) No, the claims are lies, so we do not have evidence of the health benefits.
Answer: B
11) In Los Angeles, juice cleansing is very popular. Some people have claimed that the cleanses are beneficial for weight loss, body detoxification, and treatment and prevention of illnesses. Can we conclude that juice cleansing causes these health benefits? A) Yes, the claims are true stories, so we do have evidence of the health benefits. B) No, the claims are lies, so we do not have evidence of the health benefits. C) Yes, the claims are anecdotes and give us a good comparison group to find health differences. D) No, the claims are anecdotes and do not give us a true comparison group to find health differences. Answer: D 12) What does it mean for an experiment to be random? A) Assignment into the control and treatment groups is determined by chance. B) Assignment into the control and treatment groups is determined by the researcher. C) Assignment into the control and treatment groups is determined by the participants. D) Assignment into the control and treatment groups is determined by a person who is not involved in the research. Answer: A 13) What does it mean for an experiment to be double-blinded? A) The researcher does not know which participants are in the treatment and control groups. B) The participants do not know who is in the treatment and control groups. C) Neither the researcher nor the participants know who is in the treatment and control groups. D) The researcher and the participants know which group they are in because it is unethical to keep this information from them. Answer: C A group of 500 patients who suffer from skin cancer were asked to participate in a study to determine the effectiveness of a new medication. The patients were randomly divided into two groups, one that was given the actual medication, and one that received a placebo pill. A good outcome was defined as the cancer being in remission after 6 months of treatment. The results of the study are below.
14) Approximately what percent of patients who took the medication had cancer remission?
A) 48%
B) 50%
C) 58%
D) 67%
Answer: D
15) Was the new medication effective for cancer remission?
A) Yes, a higher percent of patients who took the medication had cancer remissions than the patients who took the placebo.
B) Yes, both groups had more patients with cancer remissions.
C) No, the patients who took the placebo also had cancer remissions.
D) No, this was not a controlled experiment.
Answer: A
16) Can we conclude that the cancer remissions were caused by the new medication?
A) Yes, this is a controlled experiment. Since a higher percent of patients who took the medication had cancer remissions, we can conclude causation.
B) Yes, this is a controlled experiment. We can always conclude causation with a controlled experiment.
C) No, even though this is a controlled experiment, there was no difference between the treatment and control groups, so we cannot conclude causation.
D) No, even though this is a controlled experiment, there might be a confounding factor since the placebo group had cancer remissions too.
Answer: A
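Random assignment, as described in question 12 and in the study description above, can be illustrated with a short Python sketch. The patient ID numbers, the even 250/250 split, the seed, and the function name `randomly_assign` are assumptions made for illustration only; the study description does not state how large each group was.

```python
import random


def randomly_assign(participant_ids, seed=None):
    """Split participants into treatment and control groups by chance alone."""
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)              # chance, not the researcher, sets the order
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical ID numbers 1..500 standing in for the 500 patients.
patient_ids = range(1, 501)
treatment, control = randomly_assign(patient_ids, seed=2020)
print(len(treatment), len(control))
```

Because neither the researcher nor the participants choose the groups, any difference in outcomes is less likely to be explained by a confounding variable.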
A group of 500 patients who suffer from hypothyroidism, a condition in which your thyroid does not produce enough of certain hormones, were asked to participate in a study to determine the effectiveness of a new medication. The patients were randomly divided into two groups, one that was given the actual medication, and one that received a placebo pill. The results of the study are below.

                            Medication   Placebo
Symptoms improved               205        140
Symptoms did not improve         65         90

17) What percent of patients who took the medication had improved symptoms?
A) 41%
B) 54%
C) 65.2%
D) 75.9%
Answer: D
18) Was the new medication effective in treating hypothyroidism?
A) Yes, a higher percent of patients who took the medication had improved symptoms than the patients who took the placebo.
B) Yes, both groups had more patients with improved symptoms.
C) No, the patients who took the placebo also had improved symptoms.
D) No, this was not a controlled experiment.
Answer: A
19) Can we conclude that the improved symptoms were caused by the new medication?
A) Yes, this is a controlled experiment. Since a higher percent of patients who took the medication had improved symptoms, we can conclude causation.
B) Yes, this is a controlled experiment. We can always conclude causation with a controlled experiment.
C) No, even though this is a controlled experiment, there was no difference between the treatment and control groups, so we cannot conclude causation.
D) No, even though this is a controlled experiment, there might be a confounding factor since the placebo group had improved symptoms too.
Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Solve the problem.
20) Researchers conducted an experiment to determine if having a dog day on college campuses during final exam week lowers students' stress levels. A dog day is when dogs from a local animal shelter are brought onto campus for students to play and interact with. What are the treatment and outcome variables for this experiment?
Answer: Treatment variable - whether or not a campus had a dog day. Outcome variable - students' stress levels during final exams.
21) Researchers conducted a study and determined that coworkers who socialize outside of work are more productive than coworkers who do not. Can we conclude that socializing outside of work causes coworkers to be more productive? Explain your reasoning.
Answer: No, this is an observational study and we cannot conclude causation.
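The percentages behind questions 17 and 18 can be checked with a minimal Python sketch. The counts come from the hypothyroidism table above; the dictionary layout and the function name `percent_improved` are illustrative choices, not part of the test bank.

```python
# Counts from the hypothyroidism two-way table (questions 17-19).
counts = {
    "Medication": {"improved": 205, "not_improved": 65},
    "Placebo": {"improved": 140, "not_improved": 90},
}


def percent_improved(group):
    """Percent of one group whose symptoms improved (a conditional percentage)."""
    total = group["improved"] + group["not_improved"]
    return 100 * group["improved"] / total


rates = {name: percent_improved(group) for name, group in counts.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f}% improved")
```

Each percentage is taken within its own group (205 out of 270, and 140 out of 230), which is what makes the two groups comparable even though their sizes differ.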
22) A college is offering a new free tutoring program for students in an introductory statistics class. The school wants to know if this new program improves students' test scores on their midterms and final exams. What variable could be a possible confounding factor in determining why students' scores improved or not?
Answer: Answers will vary. Examples might include: a student's access to other help/tutoring programs, a student's major on campus (e.g. a mathematics major versus a history major), a student's study skills prior to the program, etc.
23) Give an example of how anecdotal evidence can be used to persuade consumers to purchase a product.
Answer: Answers will vary. Examples might include: (1) a pregnancy blog references a few individual women's experiences with cocoa butter lotion and its reduction of stretch marks, (2) a local health store includes quotes from 5 customers on an advertisement that claims coconut oil consumption can reduce stress and improve health, (3) a commercial for skincare products interviews a small group of people that claim the product has cured their acne, etc.
24) What is the difference between a blind and a double blind study? Which is preferable?
Answer: In a blind study, the participants do not know which group they have been assigned to. For example, in a medical experiment, the patients do not know if they are receiving actual medication or just a placebo. In a double blind study, neither the researchers nor the participants know which group the participants have been assigned to. A double blind study is better than a blind study.
A group of 500 patients who suffer from severe migraines were asked to participate in a study to determine the effectiveness of a new medication. The patients were randomly divided into two groups, one that was given the actual medication, and one that received a placebo pill. A good outcome was defined as a reduction in the number of migraines during a month's time. The results of the study are below.
25) Approximately what percent of patients who took the medication had a reduction in the number of migraines?
Answer: 185/(185 + 90) = 185/275 = 0.6727 = 67.3%
26) Was the new medication effective for reducing migraines? Explain your reasoning and include any calculations.
Answer: Yes, a higher percent of patients who took the medication had fewer migraines (185/275 = 67.3%) than the patients who took the placebo (70/225 = 31.1%).
27) Can we conclude that the reduction of migraines was caused by the new medication? Explain your reasoning. Answer: Yes, this is a controlled experiment. Since a higher percent of patients who took the medication had fewer migraines, we can conclude causation.
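The arithmetic behind questions 25 and 26 can be written out as a short worked calculation. The placebo group size is not printed in this section; the value 225 below is inferred on the assumption that each of the 500 patients was placed in exactly one of the two groups, so it should be read as a reconstruction rather than a figure taken from the results table.

```latex
\[
\text{Medication: } \frac{185}{185 + 90} = \frac{185}{275} \approx 0.673 = 67.3\%
\]
\[
\text{Placebo: } \frac{70}{500 - 275} = \frac{70}{225} \approx 0.311 = 31.1\%
\]
```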
Ch. 1 Introduction to Data Answer Key 1.1 What Are Data? 1 Understand Concepts Regarding Data 1) B 2) D 3) Answers will vary. Examples might include: Facebook postings, Twitter tweets, Instagram photos, emails sent/received, credit/debit card swipes, GPS, text messaging, etc.
1.2 Classifying and Storing Data 1 Understand the Fundamentals of Statistics 1) A 2) A 3) C 4) A 5) The population is the entire freshman class at UCLA. The sample includes the particular freshmen who participated in the survey. 2 Distinguish Between Numerical and Categorical Variables 1) A 2) C 3) B 4) B 5) A 6) A 7) A 8) A 9) D 10) B 11) Answers will vary. Examples might include: categorical - gender, favorite candy, year in school, favorite color, etc.; numerical - age, height, weight, speed, etc. 3 Understand Methods for Coding Categorical Variables 1) A 2) C 3) C 4) B 5) 2 possible ways to code: 0 - Male, 1 - Female; OR 0 - Female, 1 - Male 4 Organize Data in Stacked Format and Unstacked Format 1) A 2) A 3) C 4) B 5) This is stacked data because each row represents one person.
1.3 Investigating Data 1 Determine Whether Questions Related to Variables in a Given Table Can be Answered by the Table 1) A 2) A 3) A 4) A 5) A 6) A
1.4 Organizing Categorical Data 1 Find Frequencies, Proportions, and Percentages and Use them to Describe and Compare Data 1) A 2) B 3) C 4) A 5) C 6) B 7) B 8) B 9) A 10) B 11) C 12) B 13) C 14) D 15) D 16) A 17) C 18) A 19) B 20) C 21) Two categorical variables. Answers will vary. Examples might include: gender & favorite color, gender & year in school, year in school & favorite animal, etc. 22) 640/1200 = 0.533 = 53.3% 23) 125/560 = 0.223 = 22.3% 24) 75/100 = 0.75 = 75%
25) The group sizes are different. There are 55 males, but only 45 females. 2 Identify Missing Information 1) A 2) The percentage of the student body that ate in each of the two cafeterias on Friday is not known. The larger number of students eating at the first cafeteria on Friday could be because the first cafeteria has a larger capacity than the second cafeteria or because it is closer to campus. An alternate possibility could be that we don't know the number of students on campus that Friday. Quite possibly the university has more than 45 students, and we don't know what the rest of them ate. (Presumably they went off campus or brought their own food.) 3) We cannot conclude that bicyclists are less safe in City X than in City Y. The population of each city would be needed to compare the fatality percent or rate with respect to total population. 4) We need to know the total number of men in State A and State B so that a comparison can be made of the percentage of the men in each state that are clinically obese. There could be a much higher male population in State B than State A. Also, assumptions about exercise and obesity are being made. 5) It cannot be concluded that Model 1 smart phone screens are more fragile than Model 2 smart phone screens. We need to know the percentage of each smart phone model brought into the store for screen repairs. To find this percentage, the number of smart phones of each model in the population is required. Model 1 smart phones could be a lot more popular than Model 2 smart phones, for instance.
1.5 Collecting Data to Understand Causality 1 Distinguish Between Observational Studies and Controlled Experiments 1) A 2) A 3) B 4) A 5) B 6) B 7) B
8) A 9) B 10) A 11) This is a controlled experiment because the students are randomly assigned to the treatment group (true/false test) and the control group (multiple choice test). 12) This is an observational study because the doctor did not randomly assign patients into groups. Instead, he simply looked at medical files. 2 Identify Potential Problems and/or Improvements for a Research Study 1) B 2) A 3 Understand When and Why to Infer or not Infer a Cause-and-Effect Relationship from a Research Study 1) D 2) D 3) D 4) B 5) B 6) C 7) C 8) D 9) D 10) B 11) D 12) A 13) C 14) D 15) A 16) A 17) D 18) A 19) A 20) Treatment variable - whether or not a campus had a dog day. Outcome variable - students' stress levels during final exams. 21) No, this is an observational study and we cannot conclude causation. 22) Answers will vary. Examples might include: a student's access to other help/tutoring programs, a student's major on campus (e.g. a mathematics major versus a history major), a student's study skills prior to the program, etc. 23) Answers will vary. Examples might include: (1) a pregnancy blog references a few individual women's experiences with cocoa butter lotion and its reduction of stretch marks, (2) a local health store includes quotes from 5 customers on an advertisement that claims coconut oil consumption can reduce stress and improve health, (3) a commercial for skincare products interviews a small group of people that claim the product has cured their acne, etc. 24) In a blind study, the participants do not know which group they have been assigned to. For example, in a medical experiment, the patients do not know if they are receiving actual medication or just a placebo. In a double blind study, neither the researchers nor the participants know which group the participants have been assigned to. A double blind study is better than a blind study.
25) 185/(185 + 90) = 185/275 = 0.6727 = 67.3% 26) Yes, a higher percent of patients who took the medication had fewer migraines (185/275 = 67.3%) than the patients who took the placebo (70/225 = 31.1%).
27) Yes, this is a controlled experiment. Since a higher percent of patients who took the medication had fewer migraines, we can conclude causation.
Ch. 2 Picturing Variation with Graphs 2.1 Sec 1-2. Visualizing Variation in Numerical Data/Summarizing Important Features of a Numerical Distribution 1 Interpret Dotplots and Histograms MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. A fitness instructor measured the heart rates of the participants in a yoga class at the conclusion of the class. The data is summarized in the histogram below. There were fifteen people who participated in the class between the ages of 25 and 45. Use the histogram to answer the question.
1) How many participants had a heart rate between 120 and 130 bpm?
A) 2
B) 4
C) 3
D) 5
Answer: C
2) How many participants had a heart rate between 140 and 150 bpm?
A) 2
B) 4
C) 3
D) 5
Answer: A
3) What percentage of the participants had a heart rate greater than 130 bpm?
A) 13%
B) 27%
C) 33%
D) 53%
Answer: D
4) What is the approximate percentage of participants that had a heart rate less than 130 bpm?
A) 13%
B) 47%
C) 33%
D) 53%
Answer: B
Solve the problem.
5) Each day for twenty days a record store owner counts the number of customers who purchase an album by a certain artist. The data and a dotplot of the data are shown below:
Data set: 1, 3, 4, 4, 5, 6, 7, 2, 3, 4, 4, 5, 6, 8, 2, 3, 4, 5, 6, 7, 9
Which of the following statements can be made using the given information?
A) On the first day of collecting data the record store owner had one person purchase an album by the artist.
B) The dotplot shows that this data has a roughly bell-shaped distribution.
C) During the twenty days when the record store owner collected data, there were some days when no one purchased an album by the artist.
D) None of these
Answer: B
6) For twenty days a record store owner counts the number of customers who purchase an album by a certain artist. The data and a dotplot of the data are shown below:
Data set: 1, 3, 4, 4, 5, 6, 7, 2, 3, 4, 4, 5, 6, 8, 2, 3, 4, 5, 6, 7, 9
Which of the following statements can be made using the given information?
A) On five of the twenty days observed by the record store owner, there were four albums by the artist purchased.
B) During the twenty days when the record store owner collected data, at least one album by the artist was purchased each day.
C) The dotplot shows that this data has a roughly bell-shaped distribution.
D) All of these
Answer: D
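The statements in questions 5 and 6 can be checked by tallying the data set, which is the same count a dotplot shows as stacked dots. The following is a minimal Python sketch; the variable names are illustrative, and the list is copied exactly as printed above (as printed, it contains 21 values).

```python
from collections import Counter

# Data set copied as printed in questions 5 and 6.
data = [1, 3, 4, 4, 5, 6, 7, 2, 3, 4, 4, 5, 6, 8, 2, 3, 4, 5, 6, 7, 9]

counts = Counter(data)       # value -> number of days with that many purchases
days_with_four = counts[4]   # height of the stack of dots above 4
fewest = min(data)           # smallest number of purchases on any day

for value in sorted(counts):
    print(value, counts[value])
```

The tally peaks at 4 and tapers off on both sides, which is the roughly bell-shaped pattern the answers refer to. A value of 0 never appears, so at least one album was purchased on every recorded day.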
7) The histogram below shows the distribution of pass rates on a swimming test of all children who completed a four-week summer swim course at the local YMCA. How many of the courses had a pass rate less than 40 percent?
A) About 8
B) About 5
C) About 3
D) Not enough information available
Answer: C
8) The histogram below shows the distribution of pass rates on a swimming test taken by all children who completed a four-week summer swim course at the local YMCA. What is the typical pass rate for the swim test?
A) About 75%
B) About 55%
C) About 95%
D) Not enough information available
Answer: A
9) Based on the histogram below, would it be unusual to be on hold for 5 minutes or more at this call center?
A) Yes, it would be unusual. B) No, it would not be unusual. C) Not enough information given. Answer: A 10) A dot plot of the speeds of a sample of 50 cars passing a policeman with a radar gun is shown below.
What proportion of the motorists were driving above the posted speed limit of 55 miles per hour?
A) 0.50
B) 0.64
C) 0.14
D) 7
Answer: A
Provide an appropriate response.
11) How many people were 20 years old?
A) 9
B) 11
C) 13
D) 8
Answer: A
12) How many people were 22 years old or older?
A) 13 people
B) 8 people
C) 17 people
D) 15 people
Answer: A
Solve the problem.
13) In the following histogram, what can you conclude about the bin width?
A) The bin width is too small. We are given too much detail.
B) The bin width is too large. We are given too much detail.
C) The bin width is too small. We are hiding details of the distribution.
D) The bin width is too large. We are hiding details of the distribution.
Answer: A
14) In the following histogram, what can you conclude about the bin width?
A) The bin width is too small. We are given too much detail. B) The bin width is too large. We are given too much detail. C) The bin width is too small. We are hiding details of the distribution. D) The bin width is too large. We are hiding details of the distribution. Answer: D
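The effect of bin width described in questions 13 and 14 can be sketched in Python. The data values below are hypothetical, since the original histograms are not reproduced here, and `bin_counts` is an illustrative helper rather than a library function.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall in each bin [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical measurements, used only to illustrate the idea.
values = [51, 52, 54, 55, 57, 58, 58, 60, 61, 61, 62, 63, 65, 66, 68, 71, 74, 79]

narrow = bin_counts(values, start=50, width=1, n_bins=30)   # too small: mostly 0s and 1s
moderate = bin_counts(values, start=50, width=5, n_bins=6)  # shape becomes visible
wide = bin_counts(values, start=50, width=30, n_bins=1)     # too large: one bar hides everything

print(narrow)
print(moderate)
print(wide)
```

With a width of 1 the bars show too much detail and no clear shape; with a width of 30 a single bar hides the shape entirely. The intermediate width is the one that reveals where the data are centered and how they spread out.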
15) Which histogram represents the same data as the dotplot shown below?
A)
B)
C)
D)
Answer: A
16) Which dotplot represents the same data as the histogram shown below?
A)
B)
C)
D)
Answer: D
17) The following histogram represents audience movie ratings (on a scale of 1-100) of 489 movies. What is the typical movie rating given by audiences according to this distribution?
A) The typical value is about 40.
B) The typical value is about 50.
C) The typical value is about 60.
D) The typical value is about 70.
Answer: C 18) What is the typical value for the histogram shown below?
A) The typical value is 40 because it is the center of the distribution. B) The typical value is 40 because it is the average of 20 and 60. C) Since the data are bimodal, a typical value cannot be found. D) Since the data are bimodal, there are two typical values - one is about 20 and the other is about 60. Answer: D
19) The following histogram represents the movie runtimes (length of a movie in minutes) of 489 movies. What is the typical movie runtime according to this distribution?
A) The typical value is about 90.
B) The typical value is about 100.
C) The typical value is about 120.
D) The typical value is about 130.
Answer: B 20) What is the typical value for the histogram shown below?
A) The typical value is 70 because it is the average of 50 and 90. B) The typical value is 70 because it is the center of the distribution. C) Since the data are bimodal, a typical value cannot be found. D) Since the data are bimodal, there are two typical values - one is about 50 and the other is about 90. Answer: D 21) What is the difference between a histogram and a relative frequency histogram? A) A histogram uses numbers to record how many observations are in a data set, and a relative histogram uses categories. B) A histogram uses categories to record how many observations are in a data set, and a relative histogram uses counts. C) A histogram uses counts to record how many observations are in a data set, and a relative histogram uses proportions. D) A histogram uses proportions to record how many observations are in a data set, and a relative histogram uses counts. Answer: C
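The distinction in question 21 between counts and proportions can be sketched in Python. The bin counts below are hypothetical and `relative_frequencies` is an illustrative helper name.

```python
def relative_frequencies(counts):
    """Convert histogram bar heights from counts to proportions of the total."""
    total = sum(counts)
    return [count / total for count in counts]


# Hypothetical bar heights of a (frequency) histogram with five bins.
bar_counts = [2, 5, 8, 4, 1]
proportions = relative_frequencies(bar_counts)

print(bar_counts)    # heights of the histogram
print(proportions)   # heights of the relative frequency histogram
```

Both versions have the same shape, because every bar is divided by the same total. Only the scale on the vertical axis changes, and the proportions always add up to 1.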
22) Which of the following would likely show a bimodal distribution in a histogram?
A) The heights of all students in a high school band.
B) The ages of students who attend a 4-year university.
C) The number of hours preschoolers play outside.
D) The final exam grades for an introductory statistics course.
Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
23) How is a dotplot similar to a histogram? How is it different?
Answer: A dotplot and a histogram both show the overall shape of a distribution. They both can help determine a distribution's shape, center, and spread. They differ in terms of appearance in only one way. A dotplot displays a dot to represent each observation in the data, while a histogram uses bars to display intervals of observations.
24) Below are two histograms. One corresponds to the ages at which a sample of people applied for marriage licenses; the other corresponds to the last digit of a sample of social security numbers. Which graph is which, and why?
Answer: Histogram (a) displays the last digits of social security numbers because all of the values are roughly equally likely. Since the last digits of social security numbers are created randomly, we would expect any digit between 0 and 9 to show up just as often as another digit. Histogram (b) displays the ages at which a sample of people applied for a marriage license. Since most people get married in their early to mid-twenties, but there are also people who wait to get married until a later age, we would expect the distribution to be right-skewed.
25) The following histogram represents the number of reviews a movie received on a popular website. What is the typical number of reviews a movie is expected to receive, according to this distribution? Explain your reasoning.
Answer: The typical number of reviews a movie will receive is about 130. We know this because the distribution is centered around the value 130 on the x-axis. 26) How would you describe the typical value for this histogram? Explain your reasoning.
Answer: Since the data are bimodal, there are two typical values - one is about 40 and the other is about 80.
27) If you were to create a dotplot to display the same data that is represented in the following histogram, how many dots would you draw to represent heights that fall between 1.5 meters and 1.6 meters?
Answer: About 20 dots should be drawn because there are about 20 people whose heights fall between 1.5 meters and 1.6 meters, as shown by the frequency value on the y-axis.
2 Summarize Important Features of a Numerical Distribution
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) The histogram shows the distribution of pitch speeds for a sample of 75 pitches for a college pitcher during one season. Which of the following statements best describes the distribution of the histogram below?
A) The distribution has a large amount of variation which can be seen by comparing the heights of the bars in the histogram. B) The distribution is right-skewed and shows that most of the pitches were more than 90 mph. C) The distribution is left-skewed and shows that most of the pitches were less than 95 mph. D) The distribution is symmetric around a pitch speed of about 93 mph. Answer: D
2) The histogram below is the distribution of heights for a randomly selected Boy Scout troop. Choose the statement that is true based on information from the histogram.
A) The gap between the two smallest values indicates an outlier may be present.
B) The smallest value is so extreme that it is possible that a mistake was made in recording the data.
C) Although the smallest value does not fit the pattern, it should not be altogether disregarded. It is possible that the Boy Scout is 2.4 feet tall.
D) All of these are true statements.
Answer: D
3) Data was collected on hand grip strength of adults. The histogram below summarizes the data. Which statement is true about the distribution of the data shown in the graph?
A) The graph is useless because it is bimodal.
B) The best estimate of typical grip strength is 80-90 pounds because it is in the center of the distribution.
C) There must have been a mistake made in data collection because the distribution should be bell-shaped.
D) The graph shows evidence that two different groups may have been combined into one collection.
Answer: D
4) The histogram below displays the distribution of the length of time on hold, for a collection of customers, calling a repair call center. Use the histogram to select the true statement.
A) The distribution is symmetrical. The number of callers who waited on hold for less than three minutes was the same as the number of callers who waited on hold for more than three minutes. B) The distribution is left-skewed and most callers waited on hold at least three minutes. C) The distribution shows that the data was highly variable with some callers waiting on hold as many as 20 minutes. D) The distribution is right-skewed and most callers waited on hold less than three minutes. Answer: D
Choose the histogram that matches the description. 5) The distribution of heights of adult males tends to be symmetrical. A)
B)
C)
Answer: B
6) The distribution of the number of times individuals in the 18-24 age group log onto a social networking website during the course of a day tends to be right skewed. A)
B)
C)
Answer: C
7) The distribution of test scores for a group of adults on a written driving exam following a refresher course tends to be left-skewed. A)
B)
C)
Answer: A
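Construct the dotplot for the given data.
8) A store manager counts the number of customers who make a purchase in his store each day. The data are as follows: 5 6 3 9 2 5 5 6 3 2
A) [dotplot; horizontal axis from 0 to 10]
B) [dotplot; horizontal axis from 0 to 10]
C) [dotplot; horizontal axis from 0 to 10]
D) [dotplot; horizontal axis from 0 to 10]
Answer: A
9) The following data represent the number of cars passing through a toll booth during a certain time period over a number of days. 18 19 17 17 24 18 21 18 19 15 22 19 23 17 21
A) [dotplot; horizontal axis from 15 to 25]
B) [dotplot; horizontal axis from 15 to 25]
C) [dotplot; horizontal axis from 15 to 25]
D) [dotplot; horizontal axis from 15 to 25]
Answer: A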
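Since the answer-choice dotplots for questions 8 and 9 are images, a text version can help when checking a hand-drawn plot. The following Python sketch prints one row per value with one asterisk per observation, so it is a dotplot turned on its side. The function name `text_dotplot` and the row layout are illustrative assumptions; the data are taken from the two questions.

```python
from collections import Counter


def text_dotplot(values):
    """Return a sideways dotplot: one row per value, one '*' per observation."""
    counts = Counter(values)
    lines = []
    for value in range(min(values), max(values) + 1):
        lines.append(f"{value:>3} | " + "*" * counts[value])
    return "\n".join(lines)


customers = [5, 6, 3, 9, 2, 5, 5, 6, 3, 2]                                   # question 8
cars = [18, 19, 17, 17, 24, 18, 21, 18, 19, 15, 22, 19, 23, 17, 21]           # question 9

print(text_dotplot(customers))
print()
print(text_dotplot(cars))
```

Values that never occur, such as 4 in the customer data, still get a row so that gaps in the distribution remain visible.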
Solve the problem. 10) How are individual observations recorded in a dotplot, a histogram, and a stemplot? A) A dotplot displays the actual values of observations. A histogram displays a dot for every observation. A stemplot uses bars to display intervals of observations. B) A dotplot displays a dot for every observation. A histogram uses bars to display intervals of observations. A stemplot displays the actual values of observations. C) A dotplot displays the actual values of observations. A histogram uses bars to display intervals of observations. A stemplot displays a dot for every observation. D) A dotplot uses bars to display intervals of observations. A histogram displays a dot for every observation. A stemplot displays the actual values of observations. Answer: B 11) How are individual observations recorded in a dotplot versus a stemplot? A) A dotplot displays the actual values of observations. A stemplot uses bars to display intervals of observations. B) A dotplot displays the actual values of observations. A stemplot displays a dot for every observation. C) A dotplot displays a dot for every observation. A stemplot displays the actual values of observations. D) A dotplot displays a dot for every observation. A stemplot uses bars to display intervals of observations. Answer: C 12) When examining distributions of numerical data, what three components should you look for? A) Symmetry, center, and spread B) Symmetry, skewness, and spread D) Shape, center, and spread C) Shape, symmetry, and spread Answer: D 13) The two histograms below display the exact same data. How do the plots differ?
A) Histogram (i) uses frequencies to simply count the number of observations at a given value. Histogram (ii) uses relative frequencies to show the proportion of observations at a given value. B) Histogram (i) uses relative frequencies to show the proportion of observations at a given value. Histogram (ii) uses frequencies to simply count the number of observations at a given value. C) Histograms (i) and (ii) are exactly the same; there are no differences between the plots. D) Histograms (i) and (ii) do not display the same data because the values listed on the y-axis do not match. Answer: B
Match one of the histograms with its description. 14) The distribution of scores on an easy test is displayed in histogram __________. A) B) C)
Answer: A 15) The distribution of household income in a large city is displayed in histogram __________. A) B) C)
Answer: C 16) The distribution of female heights is displayed in histogram __________. A) B) C)
Answer: B 17) The distribution of test scores for a group of students who received a 15-minute study session prior to taking a test is displayed in histogram ________. A) B) C)
Answer: C
18) The distribution of male heights is displayed in histogram ________. A) B) C)
Answer: B 19) The distribution of the number of "friends" that users of a popular social media site have is displayed in histogram ________. A) B) C)
Answer: A Solve the problem. 20) Order the following histograms from least to most variability.
A) (i), (ii), (iii)
B) (ii), (i), (iii)
C) (ii), (iii), (i)
D) (iii), (i), (ii)
Answer: D
21) Order the following histograms from most to least variability.
A) (i), (ii), (iii)
B) (ii), (i), (iii)
C) (ii), (iii), (i)
D) (iii), (i), (ii)
Answer: B 22) When examining distributions of numerical data, what three components should you look for? A) Shape, center, and spread B) Shape, symmetry, and spread C) Symmetry, skewness, and spread D) Symmetry, center, and spread Answer: A 23) Which of the following would likely show a bimodal distribution in a histogram? A) The midterm exam scores for an introduction to Spanish course. B) The ages of students who attend a local high school. C) The number of hours a college student spends on homework per night. D) The price of college tuition, including both public and private schools. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 24) When examining distributions of numerical data, what three components should you try to describe? Answer: Shape, center, and spread of the data. 25) Describe a scenario in which a distribution could be bimodal. Explain your reasoning. Answer: Answers may vary. Some examples include: (1) The price of college tuition, including both public and private schools (the different types of colleges would create two modes - private colleges would most likely have higher tuition costs compared to public schools). (2) The heights of all students at a high school (the different genders would create two modes - males are typically taller than females). 26) The two histograms below display the exact same data. How do the plots differ?
Answer: Histogram (a) uses frequencies to simply count the number of observations at a given value. Histogram (b) uses relative frequencies to show the proportion of observations at a given value.
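The distinction between frequencies and relative frequencies can be checked numerically. The sketch below is illustrative only: it reuses the toll-booth data from question 9 of the dotplot section and, as a simplifying assumption, treats each distinct value as its own histogram bin.

```python
from collections import Counter

# Toll-booth counts from question 9 of the dotplot section.
values = [18, 19, 17, 17, 24, 18, 21, 18, 19, 15, 22, 19, 23, 17, 21]

# Frequency: how many observations fall at each value.
frequency = Counter(values)

# Relative frequency: the same counts divided by the number of observations.
n = len(values)
relative_frequency = {value: count / n for value, count in frequency.items()}

for value in sorted(frequency):
    print(value, frequency[value], round(relative_frequency[value], 3))
```

Both versions give bars with the same relative heights, so the two histograms have the same shape. Only the scale on the vertical axis changes, and the relative frequencies sum to 1.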
27) Order the following histograms from least to most variability. Explain your reasoning.
Answer: Least to most variability: (iii), (ii), (i). Histogram (iii) has the least variability because it has more points that are close to the center of the distribution. Histogram (i) has the most variability because it has more points that are far away from the center of the distribution. What would you expect the shape of the distribution described to look like? Explain your reasoning. 28) The distribution of the household incomes in a large city. Answer: The distribution of incomes would most likely be right-skewed because most people earn middle-class salaries, but the very wealthy people are likely to earn incomes much higher than average. 29) The distribution of scores on an easy test. Answer: The distribution of scores on an easy test would most likely be left-skewed because most test-takers will do well on the test, and a few will still do poorly. 30) The distribution of the time (in minutes) it takes to drive to work using the same route each day. Answer: The distribution of the time it takes to drive to work using the same route each day should be roughly symmetric because the time you leave your house is probably the same each day. The commute times will be very similar on a day-to-day basis.
2.2 Sec 3-4. Visualizing Variation in Categorical Variables/Summarizing Categorical Distributions 1 Interpret Bar Charts and Pie Charts MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A group of junior high athletes was asked what team sport was their favorite. The data are summarized in the table below. On the pie chart, which area would correspond to the category "Soccer"?
A) Area A
B) Area B
C) Area C
D) Area D
Answer: A 2) A group of junior high athletes was asked what team sport was their favorite. The data are summarized in the table below. On the pie chart, which area would correspond to the category "Volleyball"?
A) Area A
B) Area B
C) Area C
D) Area D
Answer: B
3) The graph below displays the number of homicides in the city of Flint, Michigan for each of the last three years. A reporter interprets this graph to mean that the number of murders in 2010 was more than twice the number of murders in 2008. Is the reporter making a correct interpretation?
A) No. The width of the bars is identical, indicating that the number of murders in 2010 is no different from 2008. B) Yes. The bar for 2010 is twice the height of the bar for 2008 and the number of murders indicated above the bars confirms that murders in 2010 were more than twice the level in 2008. C) There is not enough information given in the graph to determine whether the reporter's interpretation is correct or not. Answer: B
4) The graph below displays the number of applications for a concealed weapons permit in Montcalm County, Michigan, for each of three years. A reporter interprets this graph to mean that applications in 2010 are more than twice the level in 2008. Is the reporter making a correct interpretation?
A) No. Although the 2010 bar is more than twice the height of the 2008 bar, the bars do not begin at 0 applications, so the graph does not correctly represent the data. Fifty-five is not equal to two times the number of applications made in 2008. B) No. The width of the bars is identical, indicating that the number of applications in 2010 is no different from 2008. C) Yes. The bar for 2010 is twice the height of the bar for 2008 and the number of applications indicated above the bars shows that applications in 2010 are more than twice the level in 2008. Answer: A
5) Which of the following statements about bar graphs is true? A) It sometimes doesn't matter in which order you place the bars representing different categories. B) It is appropriate to have gaps between the bars on the graph. C) On a bar graph, the width of the bars has no meaning. D) All of these are true for bar graphs. Answer: D
The following double-bar graph illustrates the revenue for a company for the four quarters of the year for two different years. Use the graph to answer the question.
6) In what quarter was the revenue the greatest for Year 2? A) fourth quarter B) first quarter C) second quarter D) third quarter Answer: A
7) In what quarter was the revenue the least for Year 1? A) second quarter B) first quarter C) fourth quarter D) third quarter Answer: A
8) What was the revenue for the third quarter of Year 1? A) $35 million B) $7 million C) $50 million D) $10 million Answer: A
Construct a pie chart representing the given data set. 9) The following data give the distribution of the types of houses in a town containing 48,000 houses.
House Type   Frequency   Percentage
Cape         12,000      25%
Garrison     19,200      40%
Split        16,800      35%
A)
B)
Answer: A 10) 1,000 movie critics rated a movie. The following data give the rating distribution.
Rating      Frequency   Percentage
Excellent   200         20%
Good        500         50%
Fair        300         30%
A)
B)
Answer: A
11) The following figures give the distribution of land (in acres) for a county containing 70,000 acres.
Land Use   Acres    Percentage
Forest     10,500   15%
Farm       7,000    10%
Urban      52,500   75%
A)
B)
Answer: A Solve the problem. 12) What is the difference between a bar chart and a histogram? A) They can both be used to represent numerical data. B) They can both be used to represent categorical data. C) A bar chart represents numerical data and a histogram represents categorical data. D) A bar chart represents categorical data and a histogram represents numerical data. Answer: D
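The percentage column in the pie-chart tables, and the size of each slice, follow directly from the frequencies. The sketch below works through the land-use table from question 11. The variable names are illustrative, and the central-angle step assumes the usual convention that a full pie is 360 degrees.

```python
# Land-use data (in acres) from question 11.
acres = {"Forest": 10_500, "Farm": 7_000, "Urban": 52_500}

total = sum(acres.values())  # 70,000 acres in the county

# Each category's share of the whole, as a percentage.
percentages = {use: 100 * amount / total for use, amount in acres.items()}

# Central angle of each pie slice, assuming a full circle of 360 degrees.
angles = {use: 360 * amount / total for use, amount in acres.items()}

for use in acres:
    print(f"{use}: {percentages[use]:.0f}% of the pie, {angles[use]:.0f} degrees")
```

The same check is a quick way to confirm that a table's frequency and percentage columns agree before choosing between candidate pie charts.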
13) Which statement below is NOT supported by the following bar chart?
A) In general, people always wear seat belts. B) About 2000 people wear seat belts "sometimes." C) More females wear seat belts compared to males. D) More males wear seat belts compared to females. Answer: D 14) Which statement below is NOT supported by the following bar chart?
A) More females wear sunscreen than males. B) Very few people, in general, always wear sunscreen. C) More males wear sunscreen than females. D) About 50% of males never wear sunscreen. Answer: C 15) What does it mean to find the mode of a bar chart? A) You cannot find a mode for categorical data. Modes are only used with numerical data. B) The mode can be found by finding the bar, or category, with the most observations. C) The mode can be found by adding up the total number of categories. D) The mode can be found by adding up the total number of observations and dividing by the number of categories. Answer: B
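Finding the mode of a categorical variable amounts to picking the category with the largest count. As a small illustration, the sketch below reuses the movie-rating frequencies from question 10; the variable names are hypothetical.

```python
# Rating frequencies from question 10 (1,000 movie critics).
ratings = {"Excellent": 200, "Good": 500, "Fair": 300}

# The mode is the category with the most observations,
# which is the tallest bar in a bar chart of these data.
mode = max(ratings, key=ratings.get)

print(mode)
```

Note that the mode is a category name, not a number computed from the counts, which is why options C and D in question 15 do not describe it.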
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 16) What is the difference between a bar chart and a histogram? Answer: A bar chart represents a categorical variable and a histogram represents a numerical variable. 17) What does it mean to find the mode of a bar chart? Answer: The mode can be found by finding the bar, or category, with the most observations. It will be the highest bar in the plot. 18) Using the following bar chart, what can you say about the difference in seat belt use for males versus females?
Answer: Answers may vary. Some examples include: (1) In general, people always wear seat belts. (2) Females wear seat belts more than males. (3) About the same number of males and females report wearing seat belts "sometimes."
2 Summarize Categorical Distributions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. The following side-by-side bar graph shows the level of post-secondary education achieved ten years after high school for graduates from the years 1999 and 2001. Use the bar graph to answer the question.
1) What was the most common response for 1999? A) No College B) Some College C) Graduated College, Associate's Degree D) Graduated College, Bachelor's Degree
Answer: B 2) In which category was there more variability? A) No College B) Some College C) Graduated College, Associate's Degree D) Graduated College, Bachelor's Degree
Answer: A 3) What is the mode response for 2001? A) Graduated College, Bachelor's Degree B) Graduated College, Associate's Degree C) Some College D) No College
Answer: C 4) Which category shows the least amount of variation between years? A) No College B) Some College C) Graduated College, Associate's Degree D) Graduated College, Bachelor's Degree Answer: A
Solve the problem. 5) The bar charts below depict the marital statuses of Americans, separated by gender. Which bar chart shows more variability in marital status? Why?
A) The female bar chart shows more variability because many of the observations fall into one category ("Married"). B) The female bar chart shows more variability because there are more observations in the different categories than there are for males. C) The male bar chart shows more variability because many of the observations fall into one category ("Married"). D) The male bar chart shows more variability because there are more observations in the different categories than there are for females. Answer: B
6) The bar charts below depict the veteran statuses of Americans, separated by gender. Which bar chart has more variability in veteran status? Why?
A) The female bar chart shows more variability because many of the observations fall into one category ("Non-Veteran"). B) The female bar chart shows more variability because there are more observations in the different categories than there are for males. C) The male bar chart shows more variability because many of the observations fall into one category ("Non-Veteran"). D) The male bar chart shows more variability because there are more observations in the different categories than there are for females. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) The bar charts below depict the MPAA movie ratings of 489 movies, separated by high and low critic scores. Which bar chart shows more variability in MPAA movie ratings? Why?
Answer: The "high critic scores" bar chart shows more variability because there are more observations in the different categories than there are for the "low critic scores."
2.3 Section 5: Interpreting Graphs 1 Explain Why a Graphic May be Hard to Interpret, and Identify Other Possible Graphical Displays MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
1) Parking at a university has become a problem. University administrators are interested in determining the average time it takes a student to find a parking spot. An administrator inconspicuously followed 100 students and recorded how long it took each of them to find a parking spot. Which of the following types of graphs should NOT be used to display information concerning the students' parking times? A) pie chart B) stemplot C) histogram D) dotplot Answer: A
A large state university conducted a survey among its students and received 400 responses. The survey asked the students to provide the following information: * Age * Year in School (Freshman, Sophomore, Junior, Senior) * Major
2) What type of graph would you use to describe the distribution of the variable Major? A) A histogram because Major is a numerical variable. B) A histogram because Major is a categorical variable. C) A bar chart because Major is a numerical variable. D) A bar chart because Major is a categorical variable. Answer: D
3) What type of graph would you use to answer questions about how the popularity of majors (Major) differs for the different years of students (Year in School)? A) Side-by-side bar charts should be used since these are two categorical variables. B) Side-by-side bar charts should be used since these are two numerical variables. C) Side-by-side histograms should be used since these are two categorical variables. D) Side-by-side histograms should be used since these are two numerical variables. Answer: A
A large state university conducted a survey among its students and received 300 responses. The survey asked the students to provide the following information: * Age * Year in School (Freshman, Sophomore, Junior, Senior) * Gender * GPA * Height
4) What type of graph would you use to describe the distribution of the variable Height? A) A histogram because Height is a numerical variable. B) A histogram because Height is a categorical variable. C) A bar chart because Height is a numerical variable. D) A bar chart because Height is a categorical variable. Answer: A
5) What type of graph would you use to represent the relationship between Gender and Year in School? A) A side-by-side histogram should be used since these are two numerical variables. B) A side-by-side histogram should be used since these are two categorical variables. C) A side-by-side bar chart should be used since these are two numerical variables. D) A side-by-side bar chart should be used since these are two categorical variables. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A large state university conducted a survey among its students and received 400 responses. The survey asked the students to provide the following information: * Age * Year in School (Freshman, Sophomore, Junior, Senior) * Gender
6) What type of graph would you use to describe the distribution of the variable Year in School? Explain your reasoning. Answer: A bar chart since Year in School is a categorical variable.
7) What type of graph would you use to describe the distribution of the variables Gender and Year in School? Explain your reasoning. Answer: A side-by-side bar chart should be used since these are two categorical variables.
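The graph-selection rule used throughout these survey questions can be written as a small lookup. This is a sketch of the rule as the answers above apply it, not a complete guide to choosing graphs; the function name `suggest_graph` and the type labels are hypothetical.

```python
def suggest_graph(*variable_types):
    """Suggest a display for one or two variables, following the answers above."""
    kinds = tuple(variable_types)
    if kinds == ("categorical",):
        return "bar chart"                # e.g. Major, Year in School
    if kinds == ("numerical",):
        return "histogram"                # e.g. Height
    if kinds == ("categorical", "categorical"):
        return "side-by-side bar chart"   # e.g. Gender and Year in School
    raise ValueError(f"no rule in this section covers {kinds!r}")


print(suggest_graph("categorical"))
print(suggest_graph("numerical"))
print(suggest_graph("categorical", "categorical"))
```

Other displays named in this section also fit the one-variable cases: a pie chart for a categorical variable, and a dotplot or stemplot for a numerical one.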
8) The accompanying graph shows the distribution of data on whether houses in a neighborhood have a swimming pool. (A 1 indicates the house has a swimming pool, and a 0 indicates it does not have a swimming pool.) A real estate agent claims that almost twice as many homes in this neighborhood have swimming pools than do not. Does this graph support this claim or not? Explain.
Answer: The graph supports the claim. Roughly 15 homes do not have swimming pools, and over 40, which is more than twice 15, do have swimming pools.
9) A student has gathered data on self-perceived height image, where 1 represents “short,” 2 represents “about right,” and 3 represents “tall.” A graph is given for these data. What type of graph would be a better choice to display these data? Explain.
Answer: This data set is categorical since the numbers (1, 2, and 3) represent categories. Therefore a more appropriate graph would be a bar graph or pie graph.
10) The pie chart reports the distribution for the number of hours of studying “last night” for a sample of 200 college students. What would be a better type of graph for displaying these data? Explain why this pie chart is hard to interpret.
Answer: Hours spent studying is a numerical variable. A histogram or dotplot would better enable us to see the distribution of values. Because there are so many possible numerical values, this pie chart has so many “slices” that it is difficult to tell which is which. 11) The following graph shows the ages of females (labeled 1) and males (labeled 0) who are majoring in business at a community college. What type(s) of graph(s) would be more appropriate?
Answer: This is a bar chart (or bar graph). Bar graphs are for categorical data. These data are numerical and would better be shown with a pair of histograms or a pair of dotplots.
2 Interpret Graphs MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. A word cloud was created using the first chapter of John Green's The Fault in Our Stars. (Note that filler words such as "the," "a/an," and "and" were excluded from the plot.)
1) According to the word cloud, what is the most common word in the first chapter of The Fault in Our Stars? Why? A) The most common word is "thing" because it appears in the middle of the cloud. B) The most common word is "augustus" because he is a main character in the story. C) The most common word is "hazel" because that is the narrator's name. D) The most common word is "augustus" because it is the largest in size. Answer: D
2) What information is NOT explicitly portrayed in the word cloud? A) The words that occur most frequently in the chapter. B) The number of times each word occurs. C) The specific word that occurs most often. Answer: B
A word cloud was created using the first chapter of Lewis Carroll's Alice's Adventures in Wonderland. (Note that filler words such as "the," "a/an," and "and" were excluded from the plot.)
3) According to the word cloud, what is the most common word in the first chapter of Alice's Adventures in Wonderland? Why? A) The most common word is "alice" because it is the largest in size. B) The most common word is "alice" because she is a main character in the story. C) The most common word is "marked" because it appears at the top of the cloud. D) The most common word is "garden" because it appears in the middle of the cloud. Answer: A
4) What information is NOT explicitly portrayed in the word cloud? A) The words that occur most frequently in the chapter. B) The specific word that occurs most often. C) The number of times each word occurs. Answer: C
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A word cloud was created using the first chapter of J.K. Rowling's Harry Potter and the Sorcerer's Stone. (Note that filler words such as "the," "a/an," and "and" were excluded from the plot.)
5) According to the word cloud, what is the most common word in the first chapter of Harry Potter and the Sorcerer's Stone? Why? Answer: The most common word is "dursley" because it is the largest in size.
6) The words "owls" and "eyes" appear to be similar in size. Does this mean that each of these words is used the same number of times in the first chapter of the book? Why or why not? Answer: A word cloud can only tell us which words are the most common, which shows relative frequency. So if the words are the same size, they occur with the same relative frequency. Since these words come from the same chapter, this means they also occur the same number of times.
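The counting behind a word cloud can be sketched as follows. This is a minimal illustration under stated assumptions: the sample sentence is made up for the example (it is not from any of the books above), the filler list contains only the words named in the note, and `word_counts` is a hypothetical helper.

```python
import re
from collections import Counter

# Filler words excluded from the cloud, as described in the note above.
FILLER = {"the", "a", "an", "and"}


def word_counts(text):
    """Count each non-filler word, ignoring case and punctuation."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(word for word in words if word not in FILLER)


# Made-up sample text, used only to demonstrate the counting.
sample = "The owl and the cat saw an owl. The cat saw the eyes of the owl."
counts = word_counts(sample)
total = sum(counts.values())

for word, count in counts.most_common():
    print(word, count, round(count / total, 3))
```

Because every relative frequency is a count divided by the same total, two words from the same text with equal relative frequency must also have equal counts, which is the reasoning in the answer to question 6.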
7) People who played a sport as children were asked how many hours a day they practiced when they were teenagers, and whether they still play now that they are adults. To understand the graph, look at the third bar (spanning 1.0 to 1.5); it shows that there were five people (the grey part of the bar) who practiced between 1.0 and 1.5 hours and do not still play as adults, and there were two people (the white part of the bar) who practiced 1.0 to 1.5 hours and still play as adults. Did those who still play sports tend to practice a different amount as children than those who did not? Explain.
Answer: Those who still play tended to have practiced more as teenagers, which we can see because the center of the distribution for those who still play is about 2 or 2.5 hours, compared to only about 1 or 1.5 hours for those who do not. The distribution could be displayed as a pair of histograms or a pair of dotplots.
8) Refer to the accompanying bar chart, which shows the time spent on a typical day watching television for a sample of men and women. Each person was asked to choose the one of four intervals that best fit the amount he or she spent watching television (for example, “0 to 4 hours” or “12 or more hours”). a. Identify the two variables. State whether they are categorical or numerical. b. Which would be the better choice for these data, histogram or bar chart? c. If you had the actual number of hours for each person, rather than just an interval, what type of graph should you use to display the distribution of the actual numbers of hours? d. Explain what this graph tells us about the difference between these men's and women's television-watching h
Answer: a. Gender is categorical and Time range is also categorical. b. The bar chart is appropriate since both variables are categorical. c. You could make two histograms (or two dotplots) for the data because the time would be numerical. It would be ideal to use a common horizontal axis for easy comparison of the two graphs. d. For this sample, the men tended to watch more television than the women. (The mode for the men is 4-8 hours, and the mode for the women is 0-4 hours.)
9) People who grew up on a farm as children were asked how many years they lived on a farm when they were children, and whether they became farmers as adults. To understand the graph, look at the second bar (spanning 2 to 4); it shows that there was one person (the grey part of the bar) who lived on a farm as a child between 2 and 4 years and did not become a farmer as an adult, and there were two people (the white part of the bar) who lived on a farm as a child between 2 and 4 years and became farmers as adults. Comment on what the graph shows. What other type of graphs could be used for this data set?
Answer: Those in the sample who are farmers as adults tended to live on a farm as a child for fewer years. The distribution could be displayed as a pair of side-by-side histograms or a pair of dotplots.
10) Refer to the accompanying bar chart, which shows a sample of the number of flights in a typical year for some travelers. Each traveler was asked to choose the one of four intervals that best fit the number of flights he or she took for either business or pleasure (for example, “0 to 5 flights” or “15+ flights”). a. Identify the two variables. Then state whether they are categorical or numerical. b. Which would be the better choice for these data, histogram or bar chart? c. If you had the actual number of flights for each person, rather than just an interval, what type of graph should you use to display the distribution of the actual numbers of flights? d. Explain what this graph tells us about the difference between the number of flights taken for business versus the number of flights taken for pleasure.
Answer: a. Type of flight is categorical and Number of Flights range is also categorical. b. Since both variables are categorical, the bar chart is appropriate. c. You could make two histograms (or two dotplots) for the data because the number of flights would be numerical. It would be ideal to use a common horizontal axis for easy comparison of the two graphs. d. For this sample, the distributions show that the travelers tend to fly about the same amount for business and pleasure. (The mode for business is 5-10 flights and the mode for pleasure is also 5-10 flights.)
Ch. 2 Picturing Variation with Graphs Answer Key 2.1 Sec 1-2. Visualizing Variation in Numerical Data/Summarizing Important Features of a Numerical Distribution 1 Interpret Dotplots and Histograms 1) C 2) A 3) D 4) B 5) B 6) D 7) C 8) A 9) A 10) A 11) A 12) A 13) A 14) D 15) A 16) D 17) C 18) D 19) B 20) D 21) C 22) A
23) A dotplot and a histogram both show the overall shape of a distribution. They both can help determine a distribution's shape, center, and spread. They differ in terms of appearance in only one way. A dotplot displays a dot to represent each observation in the data, while a histogram uses bars to display intervals of observations.
24) Histogram (a) displays the last digits of social security numbers because all of the values are roughly equally likely. Since the last digits of social security numbers are created randomly, we would expect any digit between 0 and 9 to show up just as often as another digit. Histogram (b) displays the ages at which a sample of people applied for a marriage license. Since most people get married in their early to mid-twenties, but there are also people who wait to get married until a later age, we would expect the distribution to be right-skewed.
25) The typical number of reviews a movie will receive is about 130. We know this because the distribution is centered around the value 130 on the x-axis.
26) Since the data are bimodal, there are two typical values - one is about 40 and the other is about 80.
27) About 20 dots should be drawn because there are about 20 people whose heights fall between 1.5 meters and 1.6 meters, as shown by the frequency value on the y-axis.
2 Summarize Important Features of a Numerical Distribution 1) D 2) D 3) D 4) D 5) B 6) C 7) A 8) A 9) A 10) B 11) C 12) D
13) B 14) A 15) C 16) B 17) C 18) B 19) A 20) D 21) B 22) A 23) D 24) Shape, center, and spread of the data. 25) Answers may vary. Some examples include: (1) The price of college tuition, including both public and private schools (the different types of colleges would create two modes - private colleges would most likely have higher tuition costs compared to public schools). (2) The heights of all students at a high school (the different genders would create two modes - males are typically taller than females). 26) Histogram (a) uses frequencies to simply count the number of observations at a given value. Histogram (b) uses relative frequencies to show the proportion of observations at a given value. 27) Least to most variability: (iii), (ii), (i). Histogram (iii) has the least variability because it has more points that are close to the center of the distribution. Histogram (i) has the most variability because it has more points that are far away from the center of the distribution. 28) The distribution of incomes would most likely be right-skewed because most people earn middle-class salaries, but the very wealthy people are likely to earn incomes much higher than average. 29) The distribution of scores on an easy test would most likely be left-skewed because most test-takers will do well on the test, and a few will still do poorly. 30) The distribution of the time it takes to drive to work using the same route each day should be roughly symmetric because the time you leave your house is probably the same each day. The commute times will be very similar on a day-to-day basis.
2.2 Sec 3-4. Visualizing Variation in Categorical Variables/Summarizing Categorical Distributions 1 Interpret Bar Charts and Pie Charts 1) A 2) B 3) B 4) A 5) D 6) A 7) A 8) A 9) A 10) A 11) A 12) D 13) D 14) C 15) B
16) A bar chart represents a categorical variable and a histogram represents a numerical variable.
17) The mode can be found by finding the bar, or category, with the most observations. It will be the highest bar in the plot.
18) Answers may vary. Some examples include: (1) In general, people always wear seat belts. (2) Females wear seat belts more than males. (3) About the same number of males and females report wearing seat belts "sometimes."
2 Summarize Categorical Distributions 1) B 2) A 3) C
4) A 5) B 6) D 7) The "high critic scores" bar chart shows more variability because there are more observations in the different categories than there are for the "low critic scores."
2.3 Section 5: Interpreting Graphs 1 Explain Why a Graphic May be Hard to Interpret, and Identify Other Possible Graphical Displays 1) A 2) D 3) A 4) A 5) D
6) A bar chart since Year in School is a categorical variable.
7) A side-by-side bar chart should be used since these are two categorical variables.
8) The graph supports the claim. Roughly 15 homes do not have swimming pools, and over 40, which is more than twice 15, do have swimming pools.
9) This data set is categorical since the numbers (1, 2, and 3) represent categories. Therefore a more appropriate graph would be a bar graph or pie graph.
10) Hours spent studying is a numerical variable. A histogram or dotplot would better enable us to see the distribution of values. Because there are so many possible numerical values, this pie chart has so many “slices” that it is difficult to tell which is which.
11) This is a bar chart (or bar graph). Bar graphs are for categorical data. These data are numerical and would better be shown with a pair of histograms or a pair of dotplots.
2 Interpret Graphs 1) D 2) B 3) A 4) C
5) The most common word is "dursley" because it is the largest in size.
6) A word cloud can only tell us which words are the most common, which shows relative frequency. So if the words are the same size, they occur with the same relative frequency. Since these words come from the same chapter, this means they also occur the same number of times.
7) Those who still play tended to have practiced more as teenagers, which we can see because the center of the distribution for those who still play is about 2 or 2.5 hours, compared to only about 1 or 1.5 hours for those who do not. The distribution could be displayed as a pair of histograms or a pair of dotplots. 8) a. Gender is categorical and Time range is also categorical. b. The bar chart is appropriate since both variables are categorical. c. You could make two histograms (or two dotplots) for the data because the time would be numerical. It would be ideal to use a common horizontal axis for easy comparison of the two graphs. d. For this sample, the men tended to watch more television than the women. (The mode for men is 4-8 hours, and the mode for women is 0-4 hours.) 9) Those in the sample who are farmers as adults tended to live on a farm as a child for fewer years. The distribution could be displayed as a pair of side-by-side histograms or a pair of dotplots. 10) a. Type of flight is categorical and Number of Flights range is also categorical. b. Since both variables are categorical, the bar chart is appropriate. c. You could make two histograms (or two dotplots) for the data because the number of flights would be numerical. It would be ideal to use a common horizontal axis for easy comparison of the two graphs. d. For this sample, the distributions show that the travelers tend to fly about the same amount for business and pleasure. (The mode for business is 5-10 flights and the mode for pleasure is also 5-10 flights.)
Ch. 3 Numerical Summaries of Center and Variation 3.1 Summaries for Symmetric Distributions 1 Find and Interpret the Mean and Standard Deviation of a Data Set MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A city planner says, ʺThe typical commute to work for someone living in the city limits is less than the commute to work for someone living in the suburbs.ʺ What does this statement mean? A) If you live in the city limits you will have a longer commute time. B) The center of the distribution of commute times for a city-dweller is less than the center of the distribution for those living in the suburbs. C) All city dwellers spend less time commuting to work than those living in the suburbs. D) There is less variation in the commute time of those living in the suburbs. Answer: B 2) A school board member says, ʺThe typical bus ride to school for a student living in the city limits is more than the bus ride to school for a student living in the suburbs.ʺ What does this statement mean? A) There is less variation in the bus ride times of those living in the suburbs. B) If you are a student living in the city limits you will have a shorter commute time. C) All students living in the city spend less time riding the bus to school than those living in the suburbs. D) The center of the distribution of bus ride times for a city-dweller is more than the center of the distribution for those living in the suburbs Answer: D
The following list shows the age at appointment of U.S. Supreme Court Chief Justices appointed since 1900. Use the data to answer the question.
3) The U.S. Supreme Court Chief Justice data was used to create the following output in an Excel spreadsheet. Choose the statement that best summarizes the variability of the dataset.
A) The ages of most of the U.S. Supreme Court Chief Justices since 1900 are within 5.6 years of the mean age. B) The ages of most of the U.S. Supreme Court Chief Justices since 1900 are within 31.3 years of the mean age. C) The ages of most of the U.S. Supreme Court Chief Justices are between 50 and 68 years. D) None of these. Answer: A Solve the problem. 4) The following nine values represent race finish times in hours for a randomly selected group of participants in an extreme 10 km race (a 10 km race with obstacles). Which of the following is closest to the mean of the data set? 1.0, 1.1, 1.2, 1.2, 1.3, 1.4, 1.4, 1.4, 1.5 A) 1.1 hours
B) 1.3 hours
C) 1.5 hours
D) 1.6 hours
Answer: B
5) The following nine values represent race finish times in hours for a randomly selected group of participants in an extreme 10k race (a 10k race with obstacles). Which of the following is closest to the mean of the following data set? 1.0, 1.2, 1.2, 1.3, 1.4, 1.5, 1.5, 1.7, 2.1 A) x̄ is about 1.1 hours
B) x̄ is about 1.3 hours
C) x̄ is about 1.4 hours
D) x̄ is about 1.6 hours
Answer: C The following list shows the age at appointment of U.S. Supreme Court Chief Justices appointed since 1900. Use the data to answer the question.
6) Find the mean, rounding to the nearest tenth of a year, and interpret the mean in this context. A) The typical age of a U.S. Supreme Court Chief Justice appointed since 1900 is 63.0. B) The typical age of a U.S. Supreme Court Chief Justice appointed since 1900 is 64.1. C) The typical age of a U.S. Supreme Court Chief Justice appointed since 1900 is 61.4. D) The typical age of a U.S. Supreme Court Chief Justice appointed since 1900 is 61.0. Answer: C Find the mean for the given sample data. Unless otherwise specified, round your answer to one more decimal place than that used for the observations. 7) The students in Hugh Loganʹs math class took the Scholastic Aptitude Test. Their math scores are shown below. Find the mean score. 623 617 351 346 604 351 344 597 470 482 A) 478.5
B) 488.3
C) 469.1
D) 476
Answer: A 8) Last year, nine employees of an electronics company retired. Their ages at retirement are listed below. Find the mean retirement age. 56 62 67 55 67 58 67 53 51 A) 59.6 yr
B) 58.9 yr
C) 58.3 yr
D) 58 yr
Answer: A 9) The grocery expenses for six families were $85.87, $74.96, $60.28, $75.72, $55.54, and $82.77. Compute the mean grocery bill. Round your answer to the nearest cent. A) $72.52 B) $108.79 C) $87.03 D) $75.03 Answer: A
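The means in questions 7 through 9 can be checked with a few lines of Python. This is only a sketch for verifying the arithmetic, using the standard library's `statistics.mean`; the variable names are illustrative and do not come from the test bank.

```python
from statistics import mean

# Data copied from questions 7-9 above.
sat_scores = [623, 617, 351, 346, 604, 351, 344, 597, 470, 482]
retirement_ages = [56, 62, 67, 55, 67, 58, 67, 53, 51]
grocery_bills = [85.87, 74.96, 60.28, 75.72, 55.54, 82.77]

# Round as each question instructs: one extra decimal place,
# or to the nearest cent for the grocery bills.
sat_mean = round(mean(sat_scores), 1)
retirement_mean = round(mean(retirement_ages), 1)
grocery_mean = round(mean(grocery_bills), 2)

print(sat_mean, retirement_mean, grocery_mean)
```

Each result matches the keyed choice A for its question.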
Find the standard deviation for the given sample data. Round your answer to one more decimal place than is present in the original data. 10) Christine is currently taking college astronomy. The instructor often gives quizzes. On the past seven quizzes, Christine got the following scores: 49 12 35 30 20 42 77 A) 21.3 B) 35 C) 12,763 D) 10,032.1 Answer: A 11) The top nine scores on the organic chemistry midterm are as follows. 80, 47, 24, 45, 58, 68, 56, 30, 45 A) 17.5 B) 16.5 C) 6.3
D) 18.7
Answer: A 12) The numbers listed below represent the amount of precipitation (in inches) last year in six different U.S. cities. 19.3 18.3 32.9 42.4 21.2 18.1 A) 10.04 in. B) 37.7 in. C) 4,364.6 in. D) 3,860.8 in. Answer: A Solve the problem. 13) The mean can be thought of as the balancing point of a distribution. According to this description, at what value is the following distribution balanced?
A) About 85
B) About 95
C) About 105
D) About 115
Answer: C
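The sample standard deviations asked for in questions 10 through 12 of this objective can be verified with the sketch below. It assumes the sample formula (dividing by n - 1), which is what `statistics.stdev` computes; the variable names are illustrative only.

```python
from statistics import stdev

# Data copied from questions 10-12 of this objective.
quiz_scores = [49, 12, 35, 30, 20, 42, 77]
midterm_scores = [80, 47, 24, 45, 58, 68, 56, 30, 45]
precipitation = [19.3, 18.3, 32.9, 42.4, 21.2, 18.1]

# stdev() uses the sample formula: sqrt(sum((x - mean)^2) / (n - 1)).
quiz_sd = round(stdev(quiz_scores), 1)
midterm_sd = round(stdev(midterm_scores), 1)
precip_sd = round(stdev(precipitation), 2)

print(quiz_sd, midterm_sd, precip_sd)
```

Several of the distractors in those questions are intermediate quantities rather than the final answer. For example, 12,763 is the sum of the squared quiz scores, and 4,364.6 is the sum of the squared precipitation values.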
14) The mean can be thought of as the balancing point of a distribution. According to this description, at what value is the following distribution balanced?
A) About 20
B) About 30
C) About 40
D) About 50
Answer: B 15) Two Geometry classes at North Hollywood High School took the same quiz. Mr. Davis had 15 students in his class with a mean score of 70. Mrs. Brown's class of 25 students had a mean score of 80. Overall, what was the mean score for all students on the quiz? A) 74.50 B) 75.00 C) 76.25 D) 75.75 Answer: C 16) Two Physics classes at Jefferson High School took the same quiz. Mr. Spears had 20 students in his class with a mean score of 80. Mrs. Guyton's class of 30 students had a mean score of 90. Overall, what was the mean score for all students on the quiz? A) 84 B) 85 C) 86 D) 87 Answer: C 17) The heights (in meters) of five adults are listed below. Calculate the mean height of these adults. 1.57, 1.65, 1.73, 1.75, 1.78 A) 1.70 meters
B) 1.71 meters
C) 1.72 meters
D) 1.73 meters
Answer: A 18) The ages of five adults are listed below. Calculate the mean age of these adults. 26, 38, 45, 50, 61 A) 44
B) 45
C) 46
D) 47
Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 19) The mean can be thought of as the balancing point of a distribution. According to this description, at approximately what value is the following distribution balanced?
Answer: The mean is approximately 27. 20) Two algebra classes at University High School took the same quiz. Mr. Athens had 25 students in his class with a mean score of 80. Mrs. Sutton's class of 30 students had a mean score of 75. Overall, what was the mean score for all students on the quiz? Answer: Mean Score = [(25 × 80) + (30 × 75)] / (25 + 30) = 4250 / 55 ≈ 77.27.
21) The ages of five college students are listed below. Calculate the mean age of these students. 18, 18, 19, 20, 21 Answer: Mean Age = (18 + 18 + 19 + 20 + 21) / 5 = 96 / 5 = 19.2 years old.
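The combined-class questions in this objective (15, 16, and 20) all use the same idea: weight each class mean by its class size, then divide by the total number of students. A minimal sketch follows; `combined_mean` is a hypothetical helper name introduced here for illustration.

```python
def combined_mean(groups):
    """Overall mean from (count, mean) pairs, weighting each mean by its count."""
    total_points = sum(count * group_mean for count, group_mean in groups)
    total_count = sum(count for count, _ in groups)
    return total_points / total_count

# (class size, class mean) pairs taken from questions 15, 16, and 20.
geometry = combined_mean([(15, 70), (25, 80)])            # 3050 / 40
physics = combined_mean([(20, 80), (30, 90)])             # 4300 / 50
algebra = round(combined_mean([(25, 80), (30, 75)]), 2)   # 4250 / 55

print(geometry, physics, algebra)
```

Simply averaging the two class means (for example, (70 + 80) / 2 = 75) ignores the different class sizes, which is why 75.00 appears as a distractor in question 15.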
2 Compare Means and Standard Deviations MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Below are the standard deviations for extreme 10k finish times for a randomly selected group of women and men. Choose the statement that best summarizes the meaning of the standard deviation. Women: s = 0.16
Men: s = 0.25
A) On average, menʹs finish times will be 0.25 hours faster than the overall average finish time. B) On average, womenʹs finish times will be 0.16 hours less than menʹs finish times. C) The distribution of womenʹs finish times is less varied than the distribution of menʹs finish times. D) The distribution of menʹs finish times is less varied than the distribution of womenʹs finish times. Answer: C
2) Below are the standard deviations for extreme 10k finish times for a randomly selected group of women and men. Choose the statement that best summarizes the meaning of the standard deviation. Women: s = 0.17
Men: s = 0.21
A) On average, men's finish times will be 0.21 hours faster than the overall average finish time. B) On average, women's finish times will be 0.17 hours less than men's finish times. C) The distribution of men's finish times is less varied than the distribution of women's finish times. D) The distribution of women's finish times is less varied than the distribution of men's finish times. Answer: D 3) Which of the following measurements is likely to have the least variation? A) The individual weights in ounces of oranges in a randomly selected five-pound bag of oranges at the market. B) The individual mass measured in grams of quarters in a randomly selected ten-dollar roll of quarters. C) The individual heights of children, measured in inches, in a randomly selected class of sixth grade students. Answer: B 4) Which of the following measurements is likely to have the most variation? A) The individual weights in ounces of tennis balls in a randomly selected can of tennis balls. B) The volume of individual pop cans measured in fluid ounces from a randomly selected twenty-four pack. C) The individual weights in ounces of potatoes in a randomly selected crate of potatoes. Answer: C 5) For the pair of histograms below, determine which distribution has the larger standard deviation.
A) (i) has a larger standard deviation than (ii). B) (ii) has a larger standard deviation than (i). C) Both histograms have the same standard deviation. D) There is no way to determine which one is larger. Answer: A
6) For the pair of histograms below, determine which distribution has the smaller standard deviation.
A) (i) has a smaller standard deviation than (ii). B) (ii) has a smaller standard deviation than (i). C) Both histograms have the same standard deviation. D) There is no way to determine which one is smaller. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) For the pair of histograms below, determine which distribution has the smaller standard deviation. Why?
Answer: Histogram (ii) has the smaller standard deviation because more observations are closer to the mean. 3 Use Means and Standard Deviations MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) At one college, GPAs have a distribution that is unimodal and approximately symmetric with a mean of 2.6 and a standard deviation of 0.4. Give the values for GPAs from one standard deviation below the mean to one standard deviation above the mean. A) 2.2 to 3 B) 2.2 to 2.6 C) 2.6 to 3 D) 0.4 to 3 Answer: A
2) At one college, GPAs have a distribution that is unimodal and approximately symmetric with a mean of 3 and a standard deviation of 0.4. Is a GPA of 3.8 more than one standard deviation above the mean? A) Yes B) No Answer: A 3) In Country X, the mean weight gain for women during a full-term pregnancy is 29.8 pounds. The standard deviation of weight gain for this group is 8.7 pounds, and the shape of the distribution of weight gains is symmetric and unimodal. a. State the weight gain for women one standard deviation below the mean and for one standard deviation above the mean. b. Is a weight gain of 40 pounds more or less than one standard deviation from the mean? A) a. 21.1 to 38.5 pounds, b. 40 is more than one standard deviation from the mean. B) a. 21.1 to 38.5 pounds, b. 40 is less than one standard deviation from the mean. C) a. 12.4 to 47.2 pounds, b. 40 is more than one standard deviation from the mean. D) a. 12.4 to 47.2 pounds, b. 40 is less than one standard deviation from the mean. Answer: A 4) In Country X, the mean birth length for children born at full term (after 40 weeks) is 53.5 centimeters (about 21.1 inches). Suppose the standard deviation is 2.3 centimeters and the distributions are unimodal and symmetric. a. What is the range of birth lengths (in centimeters) of children born in Country X from one standard deviation below the mean to one standard deviation above the mean? b. Is a birth length of 55 centimeters more than one standard deviation above the mean? A) a. 51.2 to 55.8 centimeters, b. 55 is not more than one standard deviation from the mean because 55 is not more than 55.8. B) a. 51.2 to 54.7 centimeters, b. 55 is more than one standard deviation from the mean because 55 is more than 54.7. C) a. 50.6 to 54.7 centimeters, b. 55 is more than one standard deviation from the mean because 55 is more than 54.7. D) a. 50.6 to 55.8 centimeters, b. 55 is not more than one standard deviation from the mean because 55 is not more than 55.8.
Answer: A 5) In Country X, the mean height for males is 175 centimeters. The standard deviation of height for this group is 5.8 centimeters, and the shape of the distribution of heights is symmetric and unimodal. a. State the height for males one standard deviation below the mean and for one standard deviation above the mean. b. Is a height of 179 centimeters more or less than one standard deviation from the mean? A) a. 169.2 to 180.8 centimeters, b. 179 is less than one standard deviation from the mean. B) a. 169.2 to 180.8 centimeters, b. 179 is more than one standard deviation from the mean. C) a. 163.4 to 186.6 centimeters, b. 179 is less than one standard deviation from the mean. D) a. 163.4 to 186.6 centimeters, b. 179 is more than one standard deviation from the mean. Answer: A 6) In a certain manufacturing process, the mean weight of an item produced is 90.4 grams (about 3.2 ounces). Suppose the standard deviation is 0.8 gram and the distributions are unimodal and symmetric. a. What is the range of weights (in grams) of items manufactured from one standard deviation below the mean to one standard deviation above the mean? b. Is a weight of 89.2 grams more than one standard deviation above the mean? A) a. 89.6 to 91.2 grams, b. 89.2 is not more than one standard deviation above the mean. B) a. 89.6 to 91.2 grams, b. 89.2 is more than one standard deviation above the mean. C) a. 88.8 to 92.0 grams, b. 89.2 is more than one standard deviation above the mean. D) a. 88.8 to 92.0 grams, b. 89.2 is not more than one standard deviation above the mean. Answer: A
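The questions in this objective all follow the same two steps: form the interval from (mean - SD) to (mean + SD), then compare a value's distance from the mean with one SD. The sketch below checks a few of them; both function names are hypothetical and introduced only for illustration.

```python
def one_sd_interval(mean, sd):
    """Return (mean - sd, mean + sd), rounded to limit floating-point noise."""
    return (round(mean - sd, 2), round(mean + sd, 2))

def beyond_one_sd(value, mean, sd):
    """True when the value lies more than one SD from the mean, in either direction."""
    return abs(value - mean) > sd

gpa_interval = one_sd_interval(2.6, 0.4)        # question 1
weight_interval = one_sd_interval(29.8, 8.7)    # question 3a
weight_40_far = beyond_one_sd(40, 29.8, 8.7)    # question 3b
height_179_far = beyond_one_sd(179, 175, 5.8)   # question 5b

print(gpa_interval, weight_interval, weight_40_far, height_179_far)
```

Note that `beyond_one_sd` measures distance in either direction. A question that asks specifically about being "above the mean" needs the one-sided comparison `value > mean + sd` instead.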
3.2 What's Unusual? The Empirical Rule and z-Scores 1 Use the Empirical Rule MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. The mean age of lead actresses from the top ten grossing movies of 2010 was 29.6 years with a standard deviation of 6.35 years. Assume the distribution of the actresses' ages is approximately unimodal and symmetric. 1) Between what two values would you expect to find about 95% of the lead actresses' ages? A) 23.25 and 35.95 years B) 10.55 and 48.65 years C) 16.9 and 42.3 years D) None of these Answer: C 2) Between what two values would you expect to find about 68% of the lead actresses' ages? A) 23.25 and 35.95 years B) 10.55 and 48.65 years C) 16.9 and 42.3 years D) None of these Answer: A 3) In 1993, actress Anna Paquin won an Academy Award for the movie "The Piano." She was 11 years old. Finish the statement: "According to the Empirical Rule, the ages of nearly all lead actresses will be between ____ and ____ years. Anna Paquin was ____ this range when she won the Academy Award." A) 16.9; 42.3; not within B) 10.6; 48.7; within C) 10.6; 48.7; not within D) 23.3; 36.0; within
Answer: B Use the following information to answer the question. The mean age of lead actors from the top ten grossing movies of 2007 was 36.4 years with a standard deviation of 9.87 years. Assume the distribution of the actors' ages is approximately unimodal and symmetric. 4) Between what two values would you expect to find about 68% of the lead actors' ages? A) 6.87 and 66.01 years B) 26.53 and 46.27 years C) 16.66 and 56.14 years D) None of these Answer: B 5) Between what two values would you expect to find about 95% of the lead actors' ages? A) 6.87 and 66.01 years B) 26.53 and 46.27 years C) 16.66 and 56.14 years D) None of these Answer: C 6) In 2002, actor Adrien Brody won an Academy Award for the movie "The Pianist." He was 29 years old. Finish the statement: "According to the Empirical Rule, the ages of nearly all lead actors will be between ____ and ____ years. Adrien Brody was ____ this range when he won the Academy Award." A) 6.8; 66.0; within B) 6.8; 66.0; not within C) 16.7; 56.1; within D) 26.5; 46.3; not within
Answer: A
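The Empirical Rule intervals used in questions 1 through 6 can be generated mechanically: about 68% of values fall within one SD of the mean, about 95% within two, and nearly all within three. The sketch below assumes a unimodal, symmetric distribution, as the questions do; the function name is hypothetical.

```python
def empirical_rule_intervals(mean, sd):
    """Intervals within 1, 2, and 3 standard deviations of the mean."""
    labels = {1: "68%", 2: "95%", 3: "nearly all"}
    return {
        label: (round(mean - k * sd, 2), round(mean + k * sd, 2))
        for k, label in labels.items()
    }

# Lead actresses, 2010: mean 29.6 years, SD 6.35 years.
actresses = empirical_rule_intervals(29.6, 6.35)
# Lead actors, 2007: mean 36.4 years, SD 9.87 years.
actors = empirical_rule_intervals(36.4, 9.87)

# An age of 11 checked against the "nearly all" range for actresses.
low, high = actresses["nearly all"]
age_11_within = low <= 11 <= high

print(actresses)
print(actors)
print(age_11_within)
```

The answer choices in questions 3 and 6 round the three-SD bounds to one decimal place.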
Use the following information to answer the question. The economic impact of an industry, such as sport fishing, can be measured by the retail sales it generates. In 2006, the economic impact of Great Lakes fishing in states bordering the Great Lakes had a mean of $318 and a standard deviation of $83.5. Note that all dollar amounts are in millions of dollars. Assume the distribution of retail sales is unimodal and symmetric. (Source: National Oceanic and Atmospheric Administration). 7) For what percentage of Great Lakes states would you expect the economic impact from fishing to be between $234.5 and $401.5 (in millions of dollars)? A) 95% B) 68% C) Nearly all D) None of these Answer: B 8) The economic impact of fishing for nearly all Great Lakes states should fall within what range (in millions of dollars)? A) $151 to $485 B) $67.5 to $568.5 C) $234.5 to $401.5 D) $83.5 to $318 Answer: B 9) For what percentage of Great Lakes states would you expect the economic impact from fishing to be between $151.00 and $485.00 (in millions of dollars)? A) 95% B) 68% C) Nearly all D) None of these Answer: A Provide an appropriate response. 10) The amount of television viewed by today's youth is of primary concern to Parents Against Watching Television (PAWT). 300 parents of elementary school-aged children were asked to estimate the number of hours per week that their child watched television. The mean and the standard deviation for their responses were 17 and 2, respectively. PAWT constructed a stem-and-leaf display for the data that showed that the distribution of times was a bell-shaped distribution. Give an interval around the mean where you believe most (approximately 95%) of the television viewing times fell in the distribution.
A) between 13 and 21 hours per week B) less than 15 and more than 19 hours per week C) between 11 and 23 hours per week D) between 15 and 19 hours per week Answer: A 11) A small computing center has found that the number of jobs submitted per day to its computers has a distribution that is approximately bell shaped, with a mean of 71 jobs and a standard deviation of 11. Where do we expect most (approximately 95%) of the distribution to fall? A) between 49 and 93 jobs per day B) between 60 and 82 jobs per day C) between 38 and 104 jobs per day D) between 49 and 104 jobs per day Answer: A Solve the problem. 12) A class of 30 introductory statistics students took a quiz worth 100 points. The standard deviation of the scores was 0. What must be true about the students' scores? A) The mean, median, and mode must all be 0. B) Every student received the same score. C) Every student scored 100 on the quiz. D) There was an error in the calculation for standard deviation. Answer: B 13) A class of 30 French students took a quiz worth 50 points. The standard deviation of the scores was 0. What must be true about the students' scores? A) The mean, median, and mode must all be 0. B) Every student received the same score. C) Every student scored 50 on the quiz. D) There was an error in the calculation for standard deviation. Answer: B
Use the following information to answer the question. The distribution of the number of hours of sleep people get per night is unimodal and symmetric with a mean of 6 hours and a standard deviation of 1.5 hours. 14) Approximately what percent of people sleep between 6 and 7.5 hours per night? A) 16% B) 24% C) 34% D) 68% Answer: C Use the following information to answer the question. The distribution of the number of hours people spend at work per day is unimodal and symmetric with a mean of 8 hours and a standard deviation of 0.5 hours. 15) 95% of all people work between ______ and ______ hours per day? A) 7.5 and 8.5 hours B) 7 and 8.5 hours C) 7.5 and 9 hours D) 7 and 9 hours Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Use the following information to answer the question. A math professor has two routes he can take to campus. The length of time it takes to drive each of these routes depends on traffic. Route A has a mean travel time of 20 minutes, with a standard deviation of 4 minutes. Route B has a mean travel time of 16 minutes, with a standard deviation of 3 minutes. Assume that both distributions are unimodal and roughly symmetric. 16) If the professor takes Route A, how often will he be able to drive to campus between 16 and 24 minutes? Answer: According to the Empirical Rule, the professor will be able to drive to campus between 16 and 24 minutes 68% of the time. This is because we move one standard deviation above and below the mean (20 - 4 = 16 and 20 + 4 = 24). Solve the problem. 17) A class of 20 history students took a quiz worth 100 points. The standard deviation of the scores was 0. What can you say about the scores of the students on this quiz? Answer: Every student received the same score on the quiz. 2 Solve Problems Using z-Scores MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 
Use the following information to answer the question. The mean age of lead actresses from the top ten grossing movies of 2010 was 29.6 years with a standard deviation of 6.35 years. Assume the distribution of the actresses' ages is approximately unimodal and symmetric. 1) In 2010, popular actress Jennifer Aniston was 41 years old. What is Jennifer Aniston's age if it is standardized? Would it be unusual for a 41-year-old actress to be in a top-grossing film of 2010? Assume the Empirical Rule applies and round to the nearest hundredth. A) z = 1.80; It would not be unusual. B) z = 1.80; It would be unusual. C) z = -1.80; It would be unusual. D) z = -1.80; It would not be unusual. Answer: A Use the following information to answer the question. The mean age of lead actors from the top ten grossing movies of 2007 was 36.4 years with a standard deviation of 9.87 years. Assume the distribution of the actors' ages is approximately unimodal and symmetric. 2) In 2007, popular actor and singer Justin Timberlake was 26 years old. What is Justin Timberlake's age in 2007 if it is standardized? Would it be unusual for a 26-year-old actor to be in a top-grossing film of 2007? Assume the Empirical Rule applies and round to the nearest hundredth. A) z = 1.05; It would not be unusual. B) z = 1.05; It would be unusual. C) z = -1.05; It would be unusual. D) z = -1.05; It would not be unusual. Answer: D
Use the following information to answer the question. The economic impact of an industry, such as sport fishing, can be measured by the retail sales it generates. In 2006, the economic impact of Great Lakes fishing in states bordering the Great Lakes had a mean of $318 and a standard deviation of $83.5. Note that all dollar amounts are in millions of dollars. Assume the distribution of retail sales is unimodal and symmetric. (Source: National Oceanic and Atmospheric Administration). 3) If a new report came out saying that the economic impact of Great Lakes sport fishing on the economy of Illinois was $93,588,546, would you say this was unusual? Note that this dollar amount must be converted before calculating a standard score. A) No, it is in the range of typical values. B) Yes, it is unusually high. C) Yes, it is unusually low. D) Not enough information available Answer: C Solve the problem. 4) The mean price of a pound of ground beef in 75 cities in the Midwest is $2.11 and the standard deviation is $0.56. A histogram of the data shows that the distribution is symmetrical. A local Midwest grocer is selling a pound of ground beef for $3.25. What is this price in standard units? Assuming the Empirical Rule applies, would this price be unusual or not? Round to the nearest hundredth. A) z = 2.04; This is unusually expensive ground beef. B) z = 2.04; This price would not be unusual. C) z = -2.04; This price would not be unusual. D) z = -2.04; This is unusually inexpensive ground beef. Answer: A 5) In 2007, the mean price per pound of lobster in New England was $11.48 and the standard deviation was $2.12. A histogram of the data shows that the distribution is symmetrical. A local New England grocer is selling lobster for $8.99 per pound. What is this price in standard units? Assuming the Empirical Rule applies, would this price be considered unusual or not? Round to the nearest hundredth. A) z = 1.17; This is unusually expensive lobster. B) z = 1.17; This price would not be unusual. C) z = -1.17; This price would not be unusual. D) z = -1.17; This is unusually inexpensive lobster. Answer: C Provide an appropriate response. 6) Many firms use on-the-job training to teach their employees new software. Suppose you work in the personnel department of a firm that just finished training a group of its employees in new software, and you have been requested to review the performance of one of the trainees on the final test that was given to all trainees. The mean and standard deviation of the test scores are 70 and 2, respectively, and the distribution of scores is mound-shaped and symmetric. Suppose the trainee in question received a score of 65. Compute the trainee's z-score. A) z = -2.50 B) z = 2.5 C) z = -0.9 D) z = 0.90 Answer: A 7) A television station claims that the amount of advertising per hour of broadcast time has an average of 17 minutes and a standard deviation equal to 2.8 minutes. You watch the station for 1 hour, at a randomly selected time, and carefully observe that the amount of advertising time is equal to 14 minutes. Calculate the z-score for this amount of advertising time. A) z = -1.07 B) z = 1.07 C) z = -0.66 D) z = 0.66 Answer: A Use the following information to answer the question. The distribution of the number of hours people spend at work per day is unimodal and symmetric with a mean of 8 hours and a standard deviation of 0.5 hours. 8) If Anthony's z-score for his work hours was -1.3, how many hours did he work? A) 7.20 hours B) 7.35 hours C) 7.50 hours D) 8.65 hours Answer: B
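The z-score questions above all apply z = (value - mean) / SD. The sketch below reproduces several of them. The cutoff of two standard deviations for "unusual" is an assumption taken from how the answers in this section treat the Empirical Rule, and both function names are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

def is_unusual(z, cutoff=2.0):
    """Assumed convention: more than `cutoff` SDs from the mean is unusual."""
    return abs(z) > cutoff

aniston_z = round(z_score(41, 29.6, 6.35), 2)       # question 1
beef_z = round(z_score(3.25, 2.11, 0.56), 2)        # question 4
lobster_z = round(z_score(8.99, 11.48, 2.12), 2)    # question 5

# Question 3: convert $93,588,546 to millions before standardizing.
illinois_z = round(z_score(93_588_546 / 1_000_000, 318, 83.5), 2)

print(aniston_z, is_unusual(aniston_z))
print(beef_z, is_unusual(beef_z))
print(lobster_z, is_unusual(lobster_z))
print(illinois_z, is_unusual(illinois_z))
```

The sign of z indicates direction: the Illinois figure is unusual on the low side, while the ground beef price is unusual on the high side.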
9) If Anthony's z-score for his work hours was -1.3, explain what this value means in terms of the number of hours he works. A) Anthony works 1.3 hours less than the average person. B) Anthony works 1.3 hours more than the average person. C) The number of hours Anthony works is 1.3 standard deviations below the mean. D) The number of hours Anthony works is 1.3 standard deviations above the mean. Answer: C 10) If Stacy worked 8.7 hours yesterday, would you consider this unusual? A) No, 8.7 hours is less than 2 standard deviations above the mean. B) No, 8.7 hours is less than 1 standard deviation above the mean. C) Yes, 8.7 hours is more than 2 standard deviations above the mean. D) Yes, 8.7 hours is more than 1 standard deviation above the mean. Answer: A Use the following information to answer the question. The distribution of the number of hours of sleep people get per night is unimodal and symmetric with a mean of 6 hours and a standard deviation of 1.5 hours. 11) If James had a z-score of 1.2, how many hours did he sleep? A) 7.5 hours B) 7.8 hours C) 8.0 hours D) 8.2 hours Answer: B 12) If James had a z-score of 1.2, explain what this value means in terms of the number of hours of sleep he gets. A) James sleeps 1.2 hours less than the average person. B) James sleeps 1.2 hours more than the average person. C) The number of hours James sleeps is 1.2 standard deviations below the mean. D) The number of hours James sleeps is 1.2 standard deviations above the mean. Answer: D 13) If Amanda slept 9.2 hours last night, would you consider this unusual? A) No, 9.2 hours is less than 2 standard deviations above the mean. B) No, 9.2 hours is less than 3 standard deviations above the mean. C) Yes, 9.2 hours is more than 2 standard deviations above the mean. D) Yes, 9.2 hours is more than 3 standard deviations above the mean. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Use the following information to answer the question. A math professor has two routes he can take to campus. The length of time it takes to drive each of these routes depends on traffic. Route A has a mean travel time of 20 minutes, with a standard deviation of 4 minutes. Route B has a mean travel time of 16 minutes, with a standard deviation of 3 minutes. Assume that both distributions are unimodal and roughly symmetric. 14) Today, the z-score for the professor's travel time was -2.1. How many minutes did it take him to drive to campus if he took Route B? Answer: Using the z-score formula, -2.1 = (x - 16) / 3. If we solve for x, we get x = 16 + (-2.1)(3) = 9.7. So, it took the professor 9.7 minutes to drive to campus. 15) Explain what a z-score of 1.3 would mean if the professor took Route A. Answer: The number of minutes it took the professor to get to campus is 1.3 standard deviations above the mean. It took him longer than average to get to campus.
16) If it took the professor 24 minutes to drive to campus using Route B one day, would you consider this an unusual drive time? Why? Answer: Yes, 24 minutes is more than 2 standard deviations above the mean (16 + 3 + 3 = 22). Since 24 > 22, it took the professor an unusually long amount of time to get to campus.
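Several questions in this objective run the z-score formula in reverse: given z, recover the original value with x = mean + z × SD. A minimal sketch, with a hypothetical function name:

```python
def value_from_z(z, mean, sd):
    """Invert z = (x - mean) / sd to recover x."""
    return mean + z * sd

# Work hours: mean 8, SD 0.5, z = -1.3.
anthony_hours = round(value_from_z(-1.3, 8, 0.5), 2)
# Sleep hours: mean 6, SD 1.5, z = 1.2.
james_hours = round(value_from_z(1.2, 6, 1.5), 2)
# Route B travel time: mean 16 minutes, SD 3 minutes, z = -2.1.
route_b_minutes = round(value_from_z(-2.1, 16, 3), 2)

print(anthony_hours, james_hours, route_b_minutes)
```

The same function gives the two-SD cutoff used in question 16: `value_from_z(2, 16, 3)` is 22 minutes, so a 24-minute drive on Route B lies beyond it.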
3.3 Summaries for Skewed Distributions 1 Understand Concepts Related to Measures of Center and Spread MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Suppose we have a distribution of the number of "friends" all users of a popular social media site have. What measure of spread would be best to describe this data? A) The spread should be described with the standard deviation because the distribution will be symmetric. B) The spread should be described with the IQR because the distribution will be symmetric. C) The spread should be described with the standard deviation because the distribution will be skewed to the right. D) The spread should be described with the IQR because the distribution will be skewed to the right. Answer: D 2) Suppose we have a distribution of student exam scores on an easy test. What measure of spread would be best to describe this data? A) The spread should be described with the standard deviation because the distribution will be symmetric. B) The spread should be described with the IQR because the distribution will be symmetric. C) The spread should be described with the standard deviation because the distribution will be skewed to the left. D) The spread should be described with the IQR because the distribution will be skewed to the left. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) Suppose we have the distribution of household income in the United States. What measures of center and spread would be best to describe this data? Answer: The median should be used to describe the center, and the IQR should be used to describe the spread because the distribution will be skewed to the right.
2 Find the Median and Interquartile Range of a Data Set MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Here is a table recording the number of deaths for the top thirteen worst U.S. tornados since 1925. A histogram showing the distribution is also included.
1) Estimate the most appropriate measure of variability. A) Standard Deviation; 169.4 B) IQR; 574 C) Standard Deviation; 178.5 D) IQR; 156
Answer: D Find the median for the given sample data. 2) The salaries of ten randomly selected doctors are shown below. $109,000 $116,000 $172,000 $239,000 $213,000 $123,000 $145,000 $795,000 $232,000 $188,000 A) $180,000
B) $259,000
C) $233,000
D) $172,000
Answer: A 3) A new business had the following monthly net gains: $6,659 $2,182 $2,449 $7,781 $6,854 $2,284 $1,749 $6,054 $4,399 $5,953 A) $5,176.00
B) $5,151.56
C) $4,636.40
D) $4,399.00
Answer: A
4) In ten trips to Las Vegas, a person had the following net gains: $2,441 $2,490 $4,590 $6,988 $1,206 $1,179 $6,444 $8,099 $6,427 $4,800 A) $4,695.00
B) $4,963.72
C) $4,466.90
D) $4,590
Answer: A 5) A store manager kept track of the number of newspapers sold each week over a seven-week period. The results are shown below. 65, 65, 215, 147, 281, 241, 241 A) 215 newspapers
B) 179 newspapers
C) 147 newspapers
D) 241 newspapers
Answer: A 6) The number of vehicles passing through a bank drive-up line during each 15-minute period was recorded. The results are shown below. 29 31 29 32 32 29 34 31 39 35 35 33 28 35 29 24 19 31 31 31 A) 31 vehicles
B) 30.85 vehicles
C) 35 vehicles
D) 32 vehicles
Answer: A Determine the interquartile range. 7) Determine the interquartile range. 1, 3, 5, 7, 9, 11, 1, 3, 5, 7, 9, 11 A) 6 B) 6.5 C) 5.5 D) 3 Answer: A 8) The test scores of 19 students are listed below. Find the interquartile range. 91 45 86 63 97 56 82 83 50 44 92 94 71 60 90 80 88 72 67 A) 27.5
B) 30
C) 29.5
D) 30.5
Answer: A 9) The weekly salaries (in dollars) of sixteen government workers are listed below. Find the interquartile range. 787 641 820 475 615 524 576 688 875 598 460 560 676 667 508 490 A) $166.00
B) $168
C) $297
D) $173.00
Answer: A
Solve the problem. 10) The number of students enrolled in a college algebra class for the last seven semesters are listed below. Find the median. 71 68 68 76 72 73 68 A) 68
B) 70
C) 71
D) 76
Answer: C 11) The number of students enrolled in a college algebra class for the last seven semesters are listed below. Find the median. 60 61 55 57 64 58 58 A) 57
B) 58
C) 59
D) 60
Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 12) The number of dogs available at a local animal shelter was recorded for the past 8 days. Find the median. 53 27 45 43 46 67 50 32 Answer: The median number of dogs at the animal shelter is 45.5. When the numbers are ordered (27, 32, 43, 45, 46, 50, 53, 67), we see that the median is between 45 and 46. So, taking the average of those two numbers yields a median of 45.5 dogs. 3 Interpret the Meanings of Quartiles and Median MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Data were collected on the total energy consumption per capita (in million BTUs) for a number of cities in Country X. A summary of the data is shown in the following table. Summary statistics for Total BTU: Min = 186.3, Q1 = 242.1, Median = 309.5, Q3 = 388.3, Max = 909.8. a. What percentage of Country X consumed more than 388.3 million BTUs per capita? b. What percentage of Country X consumed more than 242.1 million BTUs per capita? c. What percentage of Country X consumed less than 309.5 million BTUs per capita? d. Find and interpret in context the IQR for this data set. A) a. 25%, b. 75%, c. 50%, d. IQR = 146.2. The range of the middle 50% of the sorted data is 146.2 million BTUs. B) a. 25%, b. 75%, c. 50%, d. IQR = 67.4. The range of the middle 50% of the sorted data is 67.4 million BTUs. C) a. 50%, b. 75%, c. 25%, d. IQR = 146.2. The range of the middle 50% of the sorted data is 146.2 million BTUs. D) a. 50%, b. 75%, c. 25%, d. IQR = 67.4. The range of the middle 50% of the sorted data is 67.4 million BTUs. Answer: A
2) Data were collected on the total energy consumption per capita (in million BTUs) for a number of cities in Country X. A summary of the data is shown in the following table. Summary statistics for Total BTU: Min = 11.7, Q1 = 49.8, Median = 94.3, Q3 = 153.7, Max = 639.2. a. What percentage of Country X consumed fewer than 49.8 million BTUs per capita? b. What percentage of Country X consumed fewer than 153.7 million BTUs per capita? c. Complete this sentence: 50% of Country X consumed more than ___ million BTUs per capita. d. Find and interpret in context the IQR for this data set. A) a. 25%, b. 75%, c. 94.3, d. IQR = 103.9. The range of the middle 50% of the sorted data is 103.9 million BTUs. B) a. 50%, b. 75%, c. 94.3, d. IQR = 101.8. The range of the middle 50% of the sorted data is 101.8 million BTUs. C) a. 25%, b. 50%, c. 94.3, d. IQR = 103.9. The range of the middle 50% of the sorted data is 103.9 million BTUs. D) a. 25%, b. 50%, c. 94.3, d. IQR = 101.8. The range of the middle 50% of the sorted data is 101.8 million BTUs. Answer: A 3) Data were collected on the total water consumption per capita (in million liters) for a number of cities in Country X. A summary of the data is shown in the following table. Summary statistics for Total Liters: Min = 1.3, Q1 = 2.4, Median = 3.9, Q3 = 4.6, Max = 8.7. a. What percentage of Country X consumed more than 4.6 million liters of water per capita? b. What percentage of Country X consumed more than 2.4 million liters of water per capita? c. What percentage of Country X consumed less than 3.9 million liters of water per capita? d. Find and interpret in context the IQR for this data set. A) a. 25%, b. 75%, c. 50%, d. IQR = 2.2. The range of the middle 50% of the sorted data is 2.2 million liters. B) a. 25%, b. 75%, c. 50%, d. IQR = 3.9. The range of the middle 50% of the sorted data is 3.9 million BTUs. C) a. 50%, b. 75%, c. 25%, d. IQR = 2.2. The range of the middle 50% of the sorted data is 2.2 million liters. D) a. 50%, b. 75%, c. 25%, d. IQR = 3.9. The range of the middle 50% of the sorted data is 3.9 million BTUs. Answer: A
4) Data were collected on the industrial water consumption (in millions of liters) per capita for a number of cities in Country X. A summary of the data is shown in the following table. Summary statistics for Total Liters: Min = 0.1, Q1 = 0.6, Median = 1.4, Q3 = 2.3, Max = 4.8. a. What percentage of Country X consumed fewer than 0.6 million liters of water per capita? b. What percentage of Country X consumed fewer than 2.3 million liters of water per capita? c. Complete this sentence: 50% of Country X consumed more than ___ million liters of water per capita. d. Find and interpret in context the IQR for this data set. A) a. 25%, b. 75%, c. 1.4, d. IQR = 1.7. The range of the middle 50% of the sorted data is 1.7 million liters of water. B) a. 50%, b. 75%, c. 1.6, d. IQR = 2.5. The range of the middle 50% of the sorted data is 2.5 million liters of water. C) a. 25%, b. 75%, c. 1.6, d. IQR = 2.5. The range of the middle 50% of the sorted data is 2.5 million liters of water. D) a. 50%, b. 75%, c. 1.4, d. IQR = 1.7. The range of the middle 50% of the sorted data is 1.7 million liters of water. Answer: A
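The IQR values in the four summary-table questions above follow directly from IQR = Q3 - Q1. The following Python sketch is an illustrative check only; the dictionary keys and the function name are invented for this example, and the numbers are copied from the summary statistics in the questions.

```python
# Five-number summaries (Min, Q1, Median, Q3, Max) from the four questions above.
summaries = {
    "question 1 (million BTUs)": (186.3, 242.1, 309.5, 388.3, 909.8),
    "question 2 (million BTUs)": (11.7, 49.8, 94.3, 153.7, 639.2),
    "question 3 (million liters)": (1.3, 2.4, 3.9, 4.6, 8.7),
    "question 4 (million liters)": (0.1, 0.6, 1.4, 2.3, 4.8),
}


def iqr_from_summary(summary):
    """IQR = Q3 - Q1, the width of the middle 50% of the sorted data."""
    _minimum, q1, _median, q3, _maximum = summary
    return q3 - q1


for label, summary in summaries.items():
    # Round to one decimal place to hide floating-point noise.
    print(label, round(iqr_from_summary(summary), 1))
```

The percentage parts need no computation: by definition about 25% of the observations fall below Q1, 50% below the median, and 75% below Q3.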
3.4 Comparing Measures of Center 1 Understand Concepts Related to Outliers, Including How they Influence Measures of Center MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Here is a table recording the number of deaths for the top thirteen worst U.S. tornados since 1925. A histogram showing the distribution is also included.
1) The worst tornado on record since 1925 is a tornado that went through Missouri, Illinois, and Indiana on March 18, 1925. It killed 689 people. Suppose that when this value was entered into a calculator or other software a mistake was made and it was entered as 1,689. Choose the statement that describes what effect this mistake will have on the mean and median. A) Both the median and the mean will be higher than they should be. B) The median and the mean will not be affected by the error. Both measures of center are resistant to extreme values. C) The median will not be affected by the error, but the mean will be higher than it should be. D) The median will be higher than it should be, but the mean will not be affected by the error. Answer: C Solve the problem. Round monetary amounts to the nearest dollar. 2) Packages of a certain candy vary slightly in weight. Here are the measured weights of nine packages, in ounces: 1.7023 1.7044 1.7051 1.7066 1.6762 1.7048 1.7029 1.7035 1.7058 a. Find the mean and the median of these weights. b. Which, if any, of these weights would you consider to be an outlier? c. What are the mean and median weights if the outlier is excluded? A) a. Mean: 1.7013; median: 1.7044 b. 1.6762 is an outlier. c. Mean: 1.7044; median: 1.7046 B) a. Mean: 1.7013; median: 1.7044 b. 1.6762 is an outlier. c. Mean: 1.7044; median: 1.7044 C) a. Mean: 1.7013; median: 1.7044 b. 1.6762 is an outlier. c. Mean: 1.7044; median: 1.7056 D) a. Mean: 1.7013; median: 1.7044 b. There is no outlier. c. No change. Mean: 1.7013; median: 1.7044 Answer: A
Solve the problem. 3) The distribution of income in the United States is strongly skewed to the right. Which of the following is true? A) The median income is smaller than the mean income. B) The median income is larger than the mean income. C) The median income is equal to the mean income. D) There is not enough information to compare the median and mean values. Answer: A 4) The distribution of the number of "friends" all users of a popular social media site have is strongly skewed to the right. Which of the following is true? A) The median number of friends is smaller than the mean. B) The median number of friends is larger than the mean. C) The median number of friends is equal to the mean. D) There is not enough information to compare the median and mean values. Answer: A 5) Suppose we have a data set of the number of car accidents per day in Los Angeles during the year 2013. The data was input into a spreadsheet manually by an assistant at the Department of Transportation. For one day in June 2013, he input that there were 1230 car accidents; but there were actually only 123 that day. How will this error affect the measures of center for this data? A) The mean and the median will be higher than they should be. B) The mean and the median will not be affected by this error. C) The mean will be higher than it should be, but the median will not be affected. D) The median will be higher than it should be, but the mean will not be affected. Answer: C 6) Suppose we have a data set of the number of car accidents per day in Los Angeles during the year 2013. The data was input into a spreadsheet manually by an assistant at the Department of Transportation. For one day in July 2013, he input that there were 14 car accidents; but there were actually 140 that day. How will this error affect the measures of center for this data? A) The mean and the median will be lower than they should be. B) The mean and the median will not be affected by this error. C) The mean will be lower than it should be, but the median will not be affected. D) The median will be lower than it should be, but the mean will not be affected. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) For a left-skewed distribution, how will the median value compare to the mean value? Answer: For a left-skewed distribution, the median value will be larger than the mean. 8) Suppose we have a data set of the number of car accidents per day in Los Angeles during the year 2013. The data was input into a spreadsheet manually by an assistant at the Department of Transportation. For one day in July 2013, he input that there were 1500 car accidents; but there were actually only 150 that day. How will this error affect the measures of center for this data? Answer: The mean will be higher than it should be, but the median will not be affected. This is because the median is resistant to outliers, but the mean is not.
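The candy-weight question in this section gives a concrete way to see how an outlier moves the mean much more than the median. The sketch below uses Python's standard statistics module on the nine weights from that question; the variable names are chosen only for this illustration.

```python
from statistics import mean, median

# Measured weights of nine candy packages, in ounces (from the question above).
weights = [1.7023, 1.7044, 1.7051, 1.7066, 1.6762,
           1.7048, 1.7029, 1.7035, 1.7058]

# The same data with the low outlier 1.6762 excluded.
without_outlier = [w for w in weights if w != 1.6762]

mean_all, median_all = mean(weights), median(weights)
mean_trimmed, median_trimmed = mean(without_outlier), median(without_outlier)

# Removing one low value shifts the mean noticeably but the median only slightly.
print(round(mean_all, 4), round(median_all, 4))
print(round(mean_trimmed, 4), round(median_trimmed, 4))
print("change in mean:  ", round(mean_trimmed - mean_all, 4))
print("change in median:", round(median_trimmed - median_all, 4))
```

The same comparison applies to the data-entry questions: a single mistyped extreme value pulls the mean toward it, while the median depends only on the middle of the sorted list.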
2 Determine an Appropriate Measure of Center MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Here is a table recording the number of deaths for the top thirteen worst U.S. tornados since 1925. A histogram showing the distribution is also included.
1) Choose the most appropriate measure of center, then calculate the typical value rounded to the nearest tenth. A) Median; 181.0 B) Median; 239.9 C) Mean; 239.9 D) Mean; 181.0 Answer: A Provide an appropriate response. 2) The annual profits of five large corporations in a certain area are given below. Which measure of central tendency should be used? $165,000 $173,000 $193,000 $163,000 $1,243,000 A) median B) mean C) mode D) standard deviation Answer: A
3.5 Using Boxplots for Displaying Summaries 1 Interpret Boxplots MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the side-by-side boxplots below to answer the question. The boxplots summarize the number of sentenced prisoners by state in the Midwest and West.
1) Pick the statement that best describes the shape of the distribution for the states in the West. A) The data appears to be roughly symmetrical with a possible outlier. B) The data appears to be right-skewed with a possible outlier. C) The data appears to be left-skewed with large variability. Answer: B 2) Based on the boxplot for the Midwest, which of the following is true? A) 25% of the states sentenced less than 1,435 prisoners. B) 25% of the states sentenced more than 29,928 prisoners. C) 50% of the states sentenced less than 4,322 prisoners. D) 50% of the states sentenced more than 29,928 prisoners. Answer: B 3) Using the boxplot for the Midwest, determine which of the following statements about the distribution cannot be justified. A) The range is 32,467. B) About 75% of the West states had 3,887 or more prisoners. C) The distribution is skewed to the right. D) There are fewer states with 3887.5 to 6887 prisoners than states with 6887 to 15,706 prisoners. Answer: D 4) Pick the statement that best describes the shape of the distribution for the states in the Midwest. A) The data appears to be roughly symmetrical with a possible outlier. B) The data appears to be right-skewed with a possible outlier. C) The data appears to be left-skewed with large variability. D) The data appears to be right-skewed with large variability. Answer: D 5) Based on the boxplot for the West, which of the following is true? A) 25% of the states sentenced more than 15,706 prisoners. B) 25% of the states sentenced less than 3,888 prisoners. C) 50% of the states sentenced less than 15,706 prisoners. D) 50% of the states sentenced less than 22,662 prisoners. Answer: A
6) Using the boxplot for the West, determine which of the following statements about the distribution cannot be justified. A) The range is 32,467. B) About 75% of the West states had 3,887 or more prisoners. C) The distribution is skewed to the right. D) There are fewer states with 3887.5 to 6887 prisoners than states with 6887 to 15,706 prisoners. E) The interquartile range is about 11,819. Answer: D The following boxplot contains information about the length of time (in minutes) it took women participants to finish the marathon race at the 2012 London Olympics.
7) What can be said about the shape of the distribution of women's running times for the marathon? A) The distribution is symmetric. B) The distribution is unimodal. C) The distribution is left-skewed. D) The distribution is right-skewed. Answer: D 8) The fastest 25% of women participants ran the marathon how quickly? A) Less than 150 minutes. B) Less than 155 minutes. C) More than 155 minutes. D) More than 160 minutes. Answer: A
The following boxplot contains information about the length of time (in minutes) it took men participants to finish the marathon race at the 2012 London Olympics.
9) What can be said about the shape of the distribution of men's running times for the marathon? A) The distribution is symmetric. B) The distribution is unimodal. C) The distribution is left-skewed. D) The distribution is right-skewed. Answer: D 10) The slowest 25% of men participants ran the marathon how quickly? A) Less than 136 minutes. B) Less than 139 minutes. C) More than 139 minutes. D) More than 143 minutes. Answer: D Solve the problem. 11) The boxplots below represent movie runtimes (length of a movie in minutes) for movies that have been rated by the Motion Picture Association of America as R, PG-13, PG, and G. List ratings according to their median runtimes, from shortest to longest.
A) R, PG-13, PG, G
B) G, PG, PG-13, R
C) G, PG, R, PG-13
D) PG-13, R, PG, G
Answer: C
12) The boxplots below represent movie runtimes (length of a movie in minutes) for movies that have been rated by the Motion Picture Association of America as R, PG-13, PG, and G. List ratings according to their median runtimes, from longest to shortest.
A) R, PG-13, PG, G
B) G, PG, PG-13, R
C) G, PG, R, PG-13
D) PG-13, R, PG, G
Answer: D 13) What feature of a distribution can NOT be determined from a boxplot? A) Skewness B) Modality C) Center D) Spread Answer: B 14) What feature of a distribution can NOT be determined from a boxplot? A) Center B) Spread C) Skewness D) Number of observations Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 15) The boxplots below represent critics' movie ratings (from 0 to 100) for movies that have been rated by the Motion Picture Association of America as R, PG-13, PG, and G. List the movie ratings according to their median critics' rating, from worst to best.
Answer: Worst to best: PG, PG-13, R, G. 16) Approximately what value corresponds to the first quartile of the following boxplot?
Answer: Quartile 1 (Q1) is represented by the line that creates the left edge of the box. The value is greater than 20, but less than 30, so an appropriate estimate for Q1 would be 25. 17) What is one feature of a distribution that can NOT be determined from a boxplot? Answer: There are 2 answers here: modality and the number of observations in the original data.
The following boxplot contains information about the length of time (in minutes) it took men participants to finish the marathon race at the 2012 London Olympics.
18) What do the dots on the right side of the boxplot represent? What does this mean in terms of marathon times? Answer: The dots on the right side of the boxplot represent high outliers. This means that the men whose times are represented by the dots ran the marathon much slower than the rest of the participants. 19) The fastest 25% of men participants ran the marathon how quickly? Answer: Faster than 136 minutes. (Note: Values may vary based on approximations, but should be between 135 and 137 minutes.) 2 Construct Boxplots MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Calculate the five-number summary for the following dataset. 51 53 62 34 36 39 43 63 73 79 A) 32, 39, 52, 63, 79 B) 34, 37.5, 51, 62.5, 73 C) 34, 37.5, 53, 68, 79 D) 34, 39, 52, 63, 79 Answer: D 2) Calculate the five-number summary for the following dataset. 41.19, 83.51, 19.98, 114.60, 63.08, 83.88 A) 19.98, 41.19, 73.295, 83.88, 114.6 B) 19.98, 41.19, 75, 115, 83.88 C) 41.19, 73.295, 83.88, 114.6 D) 19, 41, 63, 84, 115 Answer: A Obtain the five-number summary for the given data. 3) 1, 4, 6, 7, 10, 11, 13 A) 1, 5, 7, 10.5, 13
B) 1, 4, 7, 11, 13
C) 1, 6, 7, 10, 13
D) 1, 5.5, 6, 4.5, 13
Answer: A
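The five-number summaries in questions 1 to 3 can be reproduced with a short Python sketch. Quartile conventions differ between textbooks and software, so the function below (a hypothetical helper written for this illustration) exposes one choice as a parameter: when the number of observations is odd, the median is either left out of both halves or included in both. The keyed answer to question 3 appears to match the convention that includes the median in both halves.

```python
from statistics import median


def five_number_summary(data, include_median=False):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    For an odd number of observations, include_median controls whether the
    middle value is counted in both halves (True) or in neither (False).
    For an even number of observations the two conventions agree.
    """
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    if n % 2 == 0:
        lower, upper = xs[:half], xs[half:]
    elif include_median:
        lower, upper = xs[:half + 1], xs[half:]
    else:
        lower, upper = xs[:half], xs[half + 1:]
    return (xs[0], median(lower), median(xs), median(upper), xs[-1])


data_q1 = [51, 53, 62, 34, 36, 39, 43, 63, 73, 79]
data_q2 = [41.19, 83.51, 19.98, 114.60, 63.08, 83.88]
data_q3 = [1, 4, 6, 7, 10, 11, 13]

print(five_number_summary(data_q1))
print(five_number_summary(data_q2))
print(five_number_summary(data_q3, include_median=False))
print(five_number_summary(data_q3, include_median=True))
```

Because the two conventions give different quartiles for question 3, it is worth stating which one is in use whenever an odd-sized data set is summarized.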
Solve the problem. 4) Which boxplot represents the same data as the histogram shown below?
A)
B)
C)
D)
Answer: D
5) Which boxplot represents the same data as the histogram shown below?
A)
B)
C)
D)
Answer: C 6) Calculate the five-number summary for the following dataset. 51 53 62 34 36 39 43 63 73 79 A) 32, 39, 52, 63, 79 B) 34, 37.5, 51, 62.5, 73 C) 34, 37.5, 53, 68, 79 D) 34, 39, 52, 63, 79 Answer: D
7) Calculate the five-number summary for the following dataset. 41.19, 83.51, 19.98, 114.60, 63.08, 83.88 A) 19.98, 41.19, 73.295, 83.88, 114.6 B) 19.98, 41.19, 75, 115, 83.88 C) 41.19, 73.295, 83.88, 114.6 D) 19, 41, 63, 84, 115 Answer: A 8) The five-number summary of the ages of passengers on a cruise ship is listed below.
Consider the following two statements regarding outliers for this data and determine which, if any, are correct. (i) There is at least one passenger whose age is a low outlier. (ii) There is at least one passenger whose age is a high outlier. A) Only statement (i) is correct. B) Only statement (ii) is correct. C) Both statements (i) and (ii) are correct. D) Neither statement (i) nor (ii) is correct. Answer: B 9) The five-number summary of the ages of a group of 100 adults is listed below.
Consider the following two statements regarding outliers for this data and determine which, if any, are correct. (i) There is at least one adult whose age is a low outlier. (ii) There is at least one adult whose age is a high outlier. A) Only statement (i) is correct. B) Only statement (ii) is correct. C) Both statements (i) and (ii) are correct. D) Neither statement (i) nor (ii) is correct. Answer: B 10) Ten households were asked how many pets they currently own. The results are shown below. 1 0 2 4 3 2 1 1 6 4
What is the IQR for this set of data? A) 1 B) 2 C) 3 D) 4
Answer: C 11) Ten parents were asked the ages of their oldest child. The results are shown below. 29 12 10 6 22 19 16 14 2 28
What is the IQR for this set of data? A) 10 B) 12 C) 15 D) 27
Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 12) The five-number summary of the length of phone calls of 100 people is listed below.
Consider the following two statements regarding outliers for this data and determine which, if any, are correct. Explain your reasoning. (i) There is at least one person who made a phone call that is a low outlier. (ii) There is at least one person who made a phone call that is a high outlier. Answer: Only statement (ii) is correct - there is at least one phone call whose length is a high outlier. In order to determine which values may be outliers, use the 1.5 × IQR Rule as follows:
13) Ten parents were asked the ages of their youngest child. The results are shown below. 6 25 12 20 6 2 17 22 23 10
What is the IQR for this set of data? Answer: When the values are arranged in order (2, 6, 6, 10, 12, 17, 20, 22, 23, 25), we find that Q3 = 22 and Q1 = 6. Therefore, IQR = Q3 - Q1 = 22 - 6 = 16.
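The IQR in question 13, and the 1.5 × IQR rule mentioned in question 12, can be sketched in Python as follows. This is an illustration under stated assumptions: the function name is made up, quartiles are taken as the medians of the lower and upper halves of the sorted data (leaving out the middle value when the count is odd), and the rule is applied to the ages from question 13 because the five-number summary for question 12 is not reproduced in the text.

```python
from statistics import median


def iqr_fences(data, k=1.5):
    """Return (Q1, Q3, IQR, lower fence, upper fence) for the k * IQR rule."""
    xs = sorted(data)
    half = len(xs) // 2
    q1 = median(xs[:half])             # median of the lower half
    q3 = median(xs[len(xs) - half:])   # median of the upper half
    iqr = q3 - q1
    return q1, q3, iqr, q1 - k * iqr, q3 + k * iqr


# Ages of the youngest child from question 13.
ages = [6, 25, 12, 20, 6, 2, 17, 22, 23, 10]

q1, q3, iqr, low_fence, high_fence = iqr_fences(ages)

# Values outside the fences are flagged as potential outliers.
outliers = [x for x in ages if x < low_fence or x > high_fence]

print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
print("fences:", low_fence, high_fence)
print("potential outliers:", outliers)
```

For a question that gives only a five-number summary, the same fences can be computed from Q1 and Q3 and then compared with the minimum and maximum.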
Ch. 3 Numerical Summaries of Center and Variation Answer Key 3.1 Summaries for Symmetric Distributions 1 Find and Interpret the Mean and Standard Deviation of a Data Set 1) B 2) D 3) A 4) B 5) C 6) C 7) A 8) A 9) A 10) A 11) A 12) A 13) C 14) B 15) A 16) C 17) A 18) A 19) The mean is approximately 27. 20) Mean Score = ((25 × 80) + (30 × 75)) / (25 + 30) = 4250 / 55 = 77.27. 21) Mean Age = (18 + 18 + 19 + 20 + 21) / 5 = 96 / 5 = 19.2 years old.
2 Compare Means and Standard Deviations 1) C 2) D 3) B 4) C 5) A 6) B 7) Histogram (ii) has the smaller standard deviation because more observations are closer to the mean. 3 Use Means and Standard Deviations 1) A 2) A 3) A 4) A 5) A 6) A
3.2 What's Unusual? The Empirical Rule and z-Scores 1 Use the Empirical Rule 1) C 2) A 3) B 4) B 5) C 6) A 7) B 8) A 9) A
10) A 11) A 12) B 13) B 14) C 15) D 16) According to the Empirical Rule, the professor will be able to drive to campus between 16 and 24 minutes 68% of the time. This is because we move one standard deviation above and below the mean (20 - 4 = 16 and 20 + 4 = 24). 17) Every student received the same score on the quiz. 2 Solve Problems Using z-Scores 1) A 2) D 3) C 4) A 5) C 6) A 7) A 8) B 9) C 10) A 11) B 12) D 13) C 14) Using the z-score formula, -2.1 = (x - 16) / 3. If we solve for x, we get x = 9.7. So, it took the professor 9.7 minutes to drive to campus. 15) The number of minutes it took the professor to get to campus is 1.3 standard deviations above the mean. It took him longer than average to get to campus. 16) Yes, 24 minutes is more than 2 standard deviations above the mean (16 + 3 + 3 = 22). Since 24 > 22, it took the professor an unusually long amount of time to get to campus.
3.3 Summaries for Skewed Distributions 1 Understand Concepts Related to Measures of Center and Spread 1) D 2) D 3) The median should be used to describe the center, and the IQR should be used to describe the spread because the distribution will be skewed to the right. 2 Find the Median and Interquartile Range of a Data Set 1) D 2) A 3) A 4) A 5) A 6) A 7) A 8) A 9) A 10) C 11) B 12) The median number of dogs at the animal shelter is 45.5. When the numbers are ordered (27, 32, 43, 45, 46, 50, 53, 67), we see that the median is between 45 and 46. So, taking the average of those two numbers yields a median of 45.5 dogs. 3 Interpret the Meanings of Quartiles and Median 1) A 2) A 3) A 4) A
3.4 Comparing Measures of Center 1 Understand Concepts Related to Outliers, Including How they Influence Measures of Center 1) C 2) A 3) A 4) A 5) C 6) C 7) For a left-skewed distribution, the median value will be larger than the mean. 8) The mean will be higher than it should be, but the median will not be affected. This is because the median is resistant to outliers, but the mean is not. 2 Determine an Appropriate Measure of Center 1) A 2) A
3.5 Using Boxplots for Displaying Summaries 1 Interpret Boxplots 1) B 2) B 3) D 4) D 5) A 6) D 7) D 8) A 9) D 10) D 11) C 12) D 13) B 14) D 15) Worst to best: PG, PG-13, R, G. 16) Quartile 1 (Q1) is represented by the line that creates the left edge of the box. The value is greater than 20, but less than 30, so an appropriate estimate for Q1 would be 25. 17) There are 2 answers here: modality and the number of observations in the original data. 18) The dots on the right side of the boxplot represent high outliers. This means that the men whose times are represented by the dots ran the marathon much slower than the rest of the participants. 19) Faster than 136 minutes. (Note: Values may vary based on approximations, but should be between 135 and 137 minutes.) 2 Construct Boxplots 1) D 2) A 3) A 4) D 5) C 6) D 7) A 8) B 9) B 10) C 11) B
12) Only statement (ii) is correct - there is at least one phone call whose length is a high outlier. In order to determine which values may be outliers, use the 1.5 × IQR Rule as follows:
13) When the values are arranged in order (2, 6, 6, 10, 12, 17, 20, 22, 23, 25), we find that Q3 = 22 and Q1 = 6. Therefore, IQR = Q3 - Q1 = 22 - 6 = 16.
4) Which scatterplot below depicts a stronger linear relationship? Why?
A) Scatterplot (i) shows a stronger linear relationship because it has less vertical variation between points. B) Scatterplot (i) shows a stronger linear relationship because it has more vertical variation between points. C) Scatterplot (ii) shows a stronger linear relationship because it has less vertical variation between points. D) Scatterplot (ii) shows a stronger linear relationship because it has more vertical variation between points. Answer: A 5) Which scatterplot below depicts a stronger linear relationship? Why?
A) Scatterplot (i) shows a stronger linear relationship because it has less vertical variation between points. B) Scatterplot (i) shows a stronger linear relationship because it has more vertical variation between points. C) Scatterplot (ii) shows a stronger linear relationship because it has less vertical variation between points. D) Scatterplot (ii) shows a stronger linear relationship because it has more vertical variation between points. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 6) Suppose data were collected on neighborhoods about the number of crimes committed and the number of police who patrol the area. Which variable is the explanatory variable and which one is the response? Explain your reasoning. Answer: There are two possible answers here: (1) The number of crimes committed is the explanatory variable and the number of police on patrol is the response because the amount of crime can explain a need for more or less police officers. (2) The number of police on patrol is the explanatory variable and the amount of crime is the response because the more police that are on patrol could explain a reduction in the amount of crime.
7) Which scatterplot below depicts a stronger linear relationship? Why? Explain what the scatterplot shows regarding a tree's volume.
Answer: Scatterplot (ii) shows a stronger linear relationship because it has less vertical variation between points. As a tree's diameter increases, its volume typically increases as well. 2 Describe and Interpret Increasing Trends, Decreasing Trends, or no Trends from Scatterplots MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The scatterplot below shows the hat size and IQ of some adults. Is the trend positive, negative, or near zero?
A) Positive
B) Negative
C) Near Zero
Answer: C 2) Choose the best statement to summarize the association shown between hat size and IQ in the scatterplot below.
A) As hat size increases, IQ scores tend to increase. B) As hat size increases, IQ scores tend to decrease. C) The scatterplot does not show a trend that would indicate an association between hat size and IQ scores. D) Hat size causes IQ to increase. Answer: C
3) Doctors hypothesize that smoking cigarettes inflames the bronchial tubes and so makes it harder to breathe. They measured the lung capacity (in liters) and the number of cigarettes smoked in a typical day for a sample of adults. Is the scatterplot below consistent with the doctors' hypothesis?
A) Yes, it is consistent. B) No, it contradicts this hypothesis. C) There is no evidence in support or contradiction of the hypothesis. Answer: B 4) The scatterplot below shows the number of tackles received and the number of concussions received for a team of football players for the most recent season. Choose the statement that best describes the trend.
A) Players who receive a greater number of tackles tend to have a higher number of concussions. B) Players who receive a greater number of tackles tend to have a lower number of concussions. C) There is no association between the number of tackles a player receives and the number of concussions. Answer: A
5) What key things should you look for when examining the potential linear association between two variables? A) A noticeable positive or negative trend on the scatterplot. B) The vertical spread of the points on the scatterplot, which indicates the strength of an association. C) A noticeable overall linear shape on the scatterplot. D) All of these. Answer: D
4.2 Measuring Strength of Association with Correlation 1 Identify and Interpret Correlation in a Scatterplot MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The following calculator screenshots show the scatterplot and the correlation coefficient between car weight and car length for a sample of 2009 model year cars.
The relationship between "car length" and "car weight" can be described as A) A strong positive linear relationship B) A moderate positive linear relationship C) A strong negative linear relationship D) A weak negative relationship Answer: B
2) The following calculator screenshots show the scatterplot and the correlation coefficient between the number of days absent and the final grade for a sample of college students in a general education statistics course at a large community college.
The relationship between "days absent" and "final grade" can be described as A) A strong positive linear relationship B) A moderate positive linear relationship C) A strong negative linear relationship D) A weak negative relationship Answer: C
3) The table shows the number of minutes ridden on a stationary bike and the approximate number of calories burned. Plot the points on the grid provided then choose the most likely correlation coefficient from the answer choices below.
A) -0.99
B) 0.99
C) -0.20
D) 0.20
Answer: B 4) The table shows the number of minutes ridden on a stationary bike and the approximate number of calories burned. Plot the points on the grid provided then choose the most likely correlation coefficient from the answer choices below.
A) 0.99
B) -0.99
C) -0.30
D) 0.30
Answer: A
Choose the scatterplot that matches the given correlation coefficient. 5) r = 0.8787 A)
B)
C)
Answer: B
6) r = -0.6542 A)
B)
C)
Answer: C
7) r = -0.3120 A)
B)
C)
Answer: A
8) r = -0.3526 A)
B)
C)
Answer: A
9) r = 0.8670 A)
B)
C)
Answer: B Solve the problem. 10) Which of the following statements regarding the correlation coefficient is not true? A) A correlation coefficient value of 0.00 indicates that two variables have no linear correlation at all. B) The correlation coefficient measures the strength of the linear relationship between two numerical variables. C) The correlation coefficient has values that range from -1.0 to 1.0 inclusive. D) All of these are true statements. Answer: D 11) Which of the following statements regarding the correlation coefficient is not true? A) The correlation coefficient has values that range from -1.0 to 1.0 inclusive. B) The correlation coefficient measures the strength of the linear relationship between two numerical variables. C) A value of 0.00 indicates that two variables are perfectly linearly correlated. D) All of these are true statements. Answer: C
12) Which of the following scatterplots shows data with the highest correlation between the explanatory and response variables? A) B)
C)
D)
Answer: A
13) Which of the following scatterplots shows data with the highest correlation between the explanatory and response variables? A) B)
C)
D)
Answer: C
14) Given the following scatterplot, if the point in the upper right corner were removed, what would happen to the value of the correlation coefficient, r?
A) r will get closer to 1. B) r will get closer to 0. C) r will get closer to -1. D) r will not change.
Answer: C
15) Given the following scatterplot, if a point was added in the upper left corner with an x-value of 10 and a y-value of 20, what would happen to the value of the correlation coefficient, r?
A) r will get closer to 0. B) r will get closer to 1. C) r will get closer to -1. D) r will not change.
Answer: A 16) Which of the following statements regarding the correlation coefficient is true? A) The correlation coefficient is a value between 0 and 1. B) A high correlation coefficient tells us the data are linear. C) A correlation coefficient of 0 means that two variables are perfectly linearly related. D) A correlation coefficient of -1 means that as one variable increases, the other decreases. Answer: D
17) Which of the following statements regarding the correlation coefficient is false? A) The correlation coefficient is a value between -1 and 1. B) A high correlation coefficient tells us the data are linear. C) A correlation coefficient of 1 means that two variables are perfectly linearly related. D) A correlation coefficient of -1 means that as one variable increases, the other decreases. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 18) When will a correlation coefficient be negative? Answer: A correlation coefficient will be negative when there is a negative linear trend in a scatterplot. So, as the values of x increase, the values of y tend to decrease. 19) Which of the following scatterplots shows data with the highest correlation between the explanatory and response variables? Explain your reasoning.
Answer: Scatterplot (i) has the highest correlation because it is more linear than scatterplot (ii). The points have less vertical variation in scatterplot (i), so the relationship is stronger. 20) Given the following scatterplot, if the point in the upper left corner were removed, what would happen to the value of the correlation coefficient, r?
Answer: The correlation coefficient, r, will get closer to 1 because the point is an outlier. Since the rest of the points show a strongly positive linear relationship, r would reflect that and get closer to 1.
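Two ideas tested above can be checked numerically: a high r does not by itself show that the data are linear, and a single outlier can pull r away from 1. The sketch below is illustrative only. The helper `pearson_r` and both data sets are made up for the demonstration; they are not the points from the scatterplots in these questions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical curved data: y = x^2 is not linear, yet r is close to 1.
xs_curve = list(range(1, 11))
ys_curve = [x ** 2 for x in xs_curve]
r_curve = pearson_r(xs_curve, ys_curve)

# Hypothetical points on a straight line, plus one outlier in the upper left.
xs_line = [1, 2, 3, 4, 5, 6, 7, 8]
ys_line = [2 * x for x in xs_line]
r_with_outlier = pearson_r(xs_line + [1], ys_line + [20])
r_without_outlier = pearson_r(xs_line, ys_line)

print(f"curved data:     r = {r_curve:.3f}")
print(f"with outlier:    r = {r_with_outlier:.3f}")
print(f"outlier removed: r = {r_without_outlier:.3f}")
```

Removing the outlier moves r back toward 1, while the curved data keep a high r even though a straight line is the wrong model for them.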
2 Calculate the Correlation Coefficient of Data
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response.
1) The data below are the average one-way commute times (in minutes) of selected students during a summer literature class and the number of absences for those students for the term. Calculate the correlation coefficient.
Commute time (min), x: 76 89 95 94 92 102 79 104 84
Number of absences, y: -1 3 6 6 4 11 0 11 1
A) 0.980 B) 0.890 C) 0.881 D) 0.819
Answer: A
2) Calculate the correlation coefficient for the data below.
x: -7 -5 2 -1 -3 -4 -2 0 1 -6
y: 18 1 15 4 5 8 12 2 13 14
A) -0.104 B) -0.132 C) -0.549 D) -0.581
Answer: A
3) The data below are the ages and annual pharmacy bills (in dollars) of 9 randomly selected employees. Calculate the correlation coefficient.
Age, x: 35 38 42 45 48 50 54 58 62
Pharmacy bill ($), y: 118 122 125 133 144 147 150 152 154
A) 0.960 B) 0.998 C) 0.890 D) 0.908
Answer: A
4) In an area of the Great Plains, records were kept on the relationship between the rainfall (in inches) and the yield of wheat (bushels per acre). Calculate the correlation coefficient.
Rainfall (in inches), x: 11.7 10 14.6 13.7 20 11.5 8.2 16.8 17.2
Yield (bushels per acre), y: 49.5 45.2 57.8 58 81.4 48.2 30.9 75 77.8
A) 0.981 B) 0.998 C) 0.900 D) 0.899
Answer: A
5) The data below are the number of hours worked (per week) and the final grades of 9 randomly selected students in a drama class. Calculate the correlation coefficient.
Hours worked, x: 0 3 6 4 9 2 15 8 5
Final grade, y: 91 79 73 75 64 85 48 69 75
A) -0.991 B) -0.888 C) -0.918 D) -0.899
Answer: A
Solve the problem.
6) Suppose 50 married couples are asked to provide their ages. For all couples, the husband is exactly 2 years older than his wife. For example, a 50-year-old husband has a wife who is 48 years old. What is the value of the correlation coefficient between the ages of husbands and their wives? A) No conclusions can be drawn without seeing a scatterplot of the data. B) The correlation coefficient between the ages of husbands and wives is equal to 1. C) The correlation coefficient between the ages of husbands and wives is equal to 0. D) The correlation coefficient between the ages of husbands and wives is equal to -1. Answer: B
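The correlation coefficients keyed above can be reproduced without a graphing calculator. The following is a minimal sketch, not the textbook's prescribed method: `pearson_r` is a helper written for this illustration, the first two data sets are copied from questions 3 and 5, and the wife ages used for question 6 are hypothetical values chosen only to show the "exactly 2 years older" pattern.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Question 3: age (x) and annual pharmacy bill in dollars (y).
ages = [35, 38, 42, 45, 48, 50, 54, 58, 62]
bills = [118, 122, 125, 133, 144, 147, 150, 152, 154]
r_age_bill = pearson_r(ages, bills)

# Question 5: hours worked per week (x) and final grade (y).
hours = [0, 3, 6, 4, 9, 2, 15, 8, 5]
grades = [91, 79, 73, 75, 64, 85, 48, 69, 75]
r_hours_grade = pearson_r(hours, grades)

# Question 6: hypothetical wife ages; each husband is exactly 2 years older.
wives = [25, 31, 48, 52, 60]
husbands = [w + 2 for w in wives]
r_couples = pearson_r(wives, husbands)

print(f"question 3: r = {r_age_bill:.3f}")
print(f"question 5: r = {r_hours_grade:.3f}")
print(f"question 6: r = {r_couples:.3f}")
```

Rounded to three decimals, the first two values agree with the keyed answers (0.960 and -0.991), and the couples example gives r = 1 because the points fall exactly on a line with positive slope.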
7) Suppose daily high and low temperature measures were taken for 50 days. For each day, the daily high temperature was exactly 10 degrees higher than the daily low temperature. For example, on a day with a high temperature of 80°, the low temperature was 70°. What is the value of the correlation coefficient between the daily high and daily low temperatures? A) No conclusions can be drawn without seeing a scatterplot of the data. B) The correlation coefficient between the daily high and low temperatures is equal to -1. C) The correlation coefficient between the daily high and low temperatures is equal to 0. D) The correlation coefficient between the daily high and low temperatures is equal to 1. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
8) Suppose the heights of teenagers were recorded in 2003 and then recorded again 5 years later, in 2008. For each person, the height in 2008 was 6 inches (15.24 centimeters) taller than the height in 2003. For example, a teenager who was 53 inches (134.62 cm) tall in 2003 measured at a height of 59 inches (149.86 cm) in 2008. What would be the correlation coefficient between the 2003 heights and the 2008 heights? Explain your reasoning.
Answer: The correlation coefficient between the heights of the teenagers in 2003 and 2008 is equal to 1 because for every person, the height increased exactly 6 inches (15.24 cm) from 2003 to 2008. If we plotted the data, the points would form a straight line.
9) The distance (in kilometers) and price (in dollars) for one-way airline tickets from City A to several cities are shown in the table.
Destination   Distance (km)   Price ($)
City B        2387            232
City C        4523            399
City D        1035            159
City E        2290            155
City F        3875            290
a. Find the correlation coefficient for this data. Use distance as the x-variable and price as the y-variable.
b. Recalculate the correlation coefficient for this data using price as the x-variable and distance as the y-variable. What effect does this have on the correlation coefficient?
c. Suppose a $25 security fee was added to the price of each ticket. What effect would this have on the correlation coefficient?
d. Suppose the airline had a sale where all ticket prices decreased by 50%. What effect would this have on the correlation coefficient?
Answer: a. r = 0.92 b. r = 0.92. The correlation coefficient stays the same. c. r = 0.92. Adding a constant to all y-values does not change the value of r. d. r = 0.92. The correlation coefficient stays the same.
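The four parts of question 9 can be verified directly from the table. This is a minimal sketch: `pearson_r` is a helper written for the illustration, and the 50% sale is modeled as multiplying every price by 0.5.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Cities B through F from the table in question 9.
distance_km = [2387, 4523, 1035, 2290, 3875]
price = [232, 399, 159, 155, 290]

r_base = pearson_r(distance_km, price)                       # part a
r_swapped = pearson_r(price, distance_km)                    # part b: swap x and y
r_fee = pearson_r(distance_km, [p + 25 for p in price])      # part c: add $25
r_sale = pearson_r(distance_km, [p * 0.5 for p in price])    # part d: 50% off

for label, r in [("a", r_base), ("b", r_swapped), ("c", r_fee), ("d", r_sale)]:
    print(f"part {label}: r = {r:.2f}")
```

All four values round to 0.92. Swapping the variables, adding a constant, and multiplying by a positive constant leave r unchanged; multiplying by a negative constant would flip its sign.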
10) The table shows distances between selected cities and the cost of a business class train ticket for travel between these cities.
Distance (mi): 399 104 216 312 403
Cost ($): 231 124 157 275 299
a. Calculate the correlation coefficient for the data shown in the table above.
b. The table for part (b) shows the same information, except that the distance was converted to kilometers by multiplying the number of miles by 1.609. What happens to the correlation when the numbers are multiplied by a constant?
Distance (km): 642 167 348 502 648
Cost ($): 266 159 192 310 334
c. Suppose a surcharge is added to every train ticket to fund baggage handling. A fee of $35 is added to each ticket, no matter how long the trip is. What happens to the correlation coefficient when a constant is added to each number?
Answer: a. r = 0.88 b. r = 0.88. Multiplying by a constant does not change the value of r. c. r = 0.88. Adding a constant does not change the value of r because the strength of the association is not affected.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question.
11) A graph was created showing the relationship between high school GPA and college GPA. High school GPA score was the predictor and college GPA was the response variable. If you reverse the variables so that college GPA was the predictor and high school GPA was the response variable, what effect would this have on the numerical value of the correlation coefficient? A) The value of r would stay the same. B) The value of r would increase. C) The value of r would decrease. Answer: A
12) A news magazine published an article with the headline "Positive Correlation Found between Volunteerism and Career Success." Explain what a positive correlation means in the context of this headline. A) Higher volunteerism is associated with higher career success. B) Higher volunteerism is associated with lower career success. C) Higher career success is associated with higher volunteerism. D) Higher career success is associated with lower volunteerism. Answer: A
13) An internet news site published an article with the headline "Study Finds Correlation between Wealth and Healthcare." Would you expect this correlation to be negative or positive? Explain your reasoning in the context of this headline. A) We expect the correlation is positive. Higher levels of wealth are associated with better healthcare. B) We expect the correlation is negative. Higher levels of wealth are associated with worse healthcare. Answer: A
4.3 Modeling Linear Trends
1 Predict Values From a Regression Equation
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information to answer the question. The following linear regression model can be used to predict ticket sales at a popular water park. Ticket sales per hour = -631.25 + 11.25(current temperature in °F)
1) What is the predicted number of tickets sold per hour if the temperature is 86°F? Round to the nearest whole ticket. A) About 252 tickets B) About 276 tickets C) About 301 tickets D) About 336 tickets Answer: D
2) What is the predicted number of tickets sold per hour if the temperature is 79°F? Round to the nearest whole ticket. A) About 258 tickets B) About 257 tickets C) About 250 tickets D) About 310 tickets Answer: A
Use the following regression equation regarding professor salaries to answer the question.
Predicted Salary = 95000 + 1280 (Years)
Note that Years is the number of years a professor has worked at a college, and Salary is the annual salary (in dollars) the professor earns.
3) What is the expected salary of a professor who has worked at a college for 15 years? A) $95,000 B) $96,280 C) $106,700 D) $114,200 Answer: D
Use the following regression equation regarding car mileage to answer the question.
Predicted Highway = 0.892 + 1.337 (City)
Note that City is the estimated miles per gallon (mpg) a car gets while driving on city streets, and Highway is the estimated miles per gallon (mpg) a car gets while driving on highways.
4) What is the expected highway mileage of a car that gets 23 mpg in the city? A) 16.5 mpg B) 21.9 mpg C) 30.8 mpg D) 31.6 mpg Answer: D
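Predicting from a regression equation is a matter of substituting the x-value. The sketch below checks the keyed answers for the water park and professor salary questions. The function names are made up for this illustration, and `round_half_up` is included because the 79°F prediction lands exactly on 257.5, which is rounded up to 258 in the answer key.

```python
import math

def round_half_up(value):
    """Round to the nearest whole number, with .5 always rounding up."""
    return math.floor(value + 0.5)

def predict_ticket_sales(temp_f):
    """Ticket sales per hour = -631.25 + 11.25(current temperature in °F)."""
    return -631.25 + 11.25 * temp_f

def predict_salary(years):
    """Predicted Salary = 95000 + 1280 (Years)."""
    return 95000 + 1280 * years

tickets_86 = round_half_up(predict_ticket_sales(86))  # 336.25 before rounding
tickets_79 = round_half_up(predict_ticket_sales(79))  # 257.5 before rounding
salary_15 = predict_salary(15)

print(tickets_86, tickets_79, salary_15)
```

The printed values are 336 tickets, 258 tickets, and a salary of $114,200.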
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Use the following regression equation regarding airline tickets to answer the question.
Predicted Price = 49 + 0.22 ∙ (Distance)
Note that Distance is the number of miles between the departure and arrival cities, and Price is the cost of an airline ticket.
5) What is the expected ticket price for a flight between Los Angeles and New York City, which are 2448.3 miles apart?
Answer: Predicted Price = 49 + 0.22 (Distance) = 49 + 0.22 (2448.3) = 49 + 538.63 = $587.63
Suppose data were recorded for 100 employees of a large company that included annual salaries and the number of years the employee has been in their current position. The mean annual salary was $58,000 with a standard deviation of 12,500. The mean number of years in the current position is 10 with a standard deviation of 3. The correlation coefficient between the two variables is approximately 0.93.
6) If an employee earns an annual salary of $85,000, approximately how long has she held her current position at the company? Round your answer to the nearest year.
Answer: Using the regression equation, we find that she has held her current position in the company for approximately 17 years. Salary = 19250 + 3875 (Years in Position) 85,000 = 19,250 + 3875 (Years in Position) 65,750 = 3875 (Years in Position) 16.97 = (Years in Position)
2 Interpret or Set Up a Regression Equation and/or Scatterplot
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose it has been established that "annual income" and "Years of college" are linearly related, and that the relationship can be modeled using the following equation: Annual Income = $23,400+$7200(Years of College). In this model, "Annual Income" is the ____ variable, and "Years of College" is the ____ variable. The two variables have a ____ linear relationship.
A) Dependent; Independent; Positive B) Dependent; Independent; Negative C) Independent; Dependent; Positive D) Independent; Dependent; Negative
Answer: A
2) Suppose it has been established that "home value" and "Years of college" are linearly related, and that the relationship can be modeled using the following equation: Home value = $75,000+$12,500(Years of College). In this model, "years of college" is the ____ variable, and "home value" is the ____ variable. The two variables have a ____ linear relationship.
A) Dependent; Independent; Positive B) Dependent; Independent; Negative C) Independent; Dependent; Positive D) Independent; Dependent; Negative
Answer: C
3) A veterinarian is going to investigate whether homes with more pets tend to have more fleas. In this scenario, the explanatory variable is ____ and the response variable is ____. A) Number of fleas spotted in a period of time; number of pets B) Number of pets; number of fleas spotted in a period of time C) Slope; intercept D) None of these Answer: B
4) A concert ticket agent is going to investigate whether an increase in money spent on radio advertisements for a particular venue tends to lead to more concert ticket sales. In this scenario, the response variable is ____ and the explanatory variable is ____. A) Amount of radio advertisement money spent; concert ticket sales B) Concert ticket sales; amount of radio advertisement money spent C) Slope; intercept D) None of these Answer: B
Use the following information to answer the question. The following linear regression model can be used to predict ticket sales at a popular water park. Ticket sales per hour = -631.25 + 11.25(current temperature in °F)
5) Choose the statement that best states the meaning of the slope in this context. A) The slope tells us that if ticket sales are decreasing there must have been a drop in temperature. B) The slope tells us that a one degree increase in temperature is associated with an average increase in ticket sales of 11.25 tickets. C) The slope tells us that high temperatures are causing more people to buy tickets to the water park. D) None of these Answer: B
6) In this context, does the intercept have a reasonable interpretation? A) Yes, it is reasonable for people to go to a water park when it is 0°F, so park managers might want to know how many tickets they would sell on average on a 0°F day. B) No, at a temperature of 0°F, ticket sales would be -631.25 and it is not reasonable (or possible) to have negative ticket sales. C) Not enough information available Answer: B
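The slope and intercept interpretations for the water park model can be read off numerically. In this sketch the function name and the `break_even_temp` calculation are additions for illustration and are not part of the question: the slope is the change in the prediction for a one degree increase, the intercept is the prediction at 0°F, and the break-even temperature is where the predicted sales cross zero.

```python
def predict_ticket_sales(temp_f):
    """Ticket sales per hour = -631.25 + 11.25(current temperature in °F)."""
    return -631.25 + 11.25 * temp_f

# Slope: difference between predictions one degree apart.
change_per_degree = predict_ticket_sales(80) - predict_ticket_sales(79)

# Intercept: prediction at 0°F, which is negative and so not meaningful.
sales_at_zero = predict_ticket_sales(0)

# Temperature at which the model predicts zero sales (about 56°F).
break_even_temp = 631.25 / 11.25

print(f"change per degree: {change_per_degree}")
print(f"prediction at 0°F: {sales_at_zero}")
print(f"predicted sales reach zero near {break_even_temp:.1f}°F")
```

Below the break-even temperature the line predicts negative sales, which suggests the model should only be used for temperatures in the range where the park actually collected data.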
Use the following information to answer the question. A random sample of 30 married couples were asked to report the height of their spouse and the height of their biological parent of the same gender as their spouse. The output of a regression analysis for predicting spouse height from parent height is shown. Assume that the conditions of the linear regression model are satisfied.
7) What is the slope of the regression line? Choose the statement that is the correct interpretation of the slope in context. A) The slope is 48.40. On average, for each inch taller a parent is, the spouse is about 0.25 inches taller, in the sample. B) The slope is 48.40. On average, for each inch taller a parent is, the spouse is about 48.40 inches taller, in the sample. C) The slope is 0.25. On average, for each inch taller a parent is, the spouse is about 0.25 inches taller, in the sample. D) The slope is 0.25. On average, for each 0.25 inches taller a parent is, the spouse is about 1 inch taller, in the sample. Answer: C
8) If the intercept was 0 and the slope was 1, what would that say about the association? A) It would mean that on average, the spouse and the parent are the same height. B) It would mean that on average, the spouse is 1 inch taller than the parent. C) It would mean that the spouse height should not be predicted using parent height. D) None of these. Answer: A
Use the following information to answer the question. A random sample of 30 couples who were also new home owners were asked to report the cost of their first house and their combined age when they married. The output of a regression analysis for predicting home cost from combined age is shown. Assume that the conditions of the linear regression model are satisfied.
9) What is the slope of the regression line? Choose the statement that is the correct interpretation of the slope in context. A) The slope is 2122.75. On average, for each additional year in combined age, home cost is about $2,122.75 higher, in the sample. B) The slope is 73.74. On average, for couples with a combined age over 73.74, the home cost is an additional $2,122.75 per year over 73.74. C) The slope is 73.74. On average, for each additional year in combined age, the home cost is about $2,122.75 higher, in the sample. D) The slope is 2122.75. On average, for each additional $2,122.75 in home cost, the combined age is about 1 year higher, in the sample. Answer: A 10) If the slope were 1, what would that say about the association? A) If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be 1 dollar more. B) If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be $2,122.75 higher. C) If the slope were 1, it would mean that on average, for every additional year in combined age, the home cost would be $2,122.75 lower. D) None of these. Answer: A Solve the problem. 11) A horticulturist conducted an experiment on 110 thirty-six inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of tomatoes harvested from the plants. The average amount of plant food given was 27.8 milliliters with a standard deviation of 2.1 milliliters. The average number of tomatoes harvested was 7.5 with a standard deviation of 1.5. The correlation coefficient was 0.7691. Use the information to calculate the slope of the linear model that predicts the number of tomatoes harvested from the amount of plant food given. Show your work and round to the nearest hundredth. A) -7.50 B) 1.08 C) 0.55 D) The slope cannot be determined without the actual data. Answer: C
12) A horticulturist conducted an experiment on 140 thirty-six inch plant boxes to see if the amount of plant food given to the plant boxes was associated with the number of habanero peppers harvested from the plants. The average amount of plant food given was 17.8 milliliters with a standard deviation of 0.7 milliliters. The average number of habanero peppers harvested was 6.5 with a standard deviation of 1.5. The correlation coefficient was 0.8123. Use the information to calculate the slope of the linear model that predicts the number of habanero peppers harvested from the amount of plant food given. Show your work and round to the nearest hundredth. A) -1.50 B) 1.74 C) 0.38 D) The slope cannot be determined without the actual data. Answer: B
A random sample of 100 cars was taken and data were recorded on the miles per gallon (mpg) in the city and on the highway. The mean city mpg was 28 with a standard deviation of 9.2. The mean highway mpg was 35 with a standard deviation of 8.6. The correlation coefficient between city mpg and highway mpg is 0.95.
13) Determine the correct value of the slope for the linear model that predicts highway mpg from city mpg and interpret it in context. A) The slope is 0.888. For every one mpg increase in city mpg, the highway mpg is predicted to increase by 0.888. B) The slope is 0.888. For every one mpg increase in highway mpg, the city mpg is predicted to increase by 0.888. C) The slope is 1.016. For every one mpg increase in city mpg, the highway mpg is predicted to increase by 1.016. D) The slope is 1.016. For every one mpg increase in highway mpg, the city mpg is predicted to increase by 1.016. Answer: A
14) Determine which is the correct calculation of the intercept for the linear model that predicts highway mpg from city mpg.
A) a = 28 - 0.888(35) = -3.08 B) a = 28 - 1.016(35) = -7.56 C) a = 35 - 0.888(28) = 10.136 D) a = 35 - 1.016(28) = 6.552 Answer: C
Data were recorded for 117 months on a household's gas bill (in dollars) and the average monthly temperatures for its neighborhood. The mean monthly temperature was 48.7°F with a standard deviation of 20.6. The mean gas bill price was $81.20 with a standard deviation of 66.5. The correlation coefficient between monthly temperature and gas bill price is -0.92.
15) Determine the correct value of the slope for the linear model that predicts gas bill price from monthly temperature and interpret it in context. A) The slope is -2.97. For every one degree increase in monthly temperature, the gas bill price is predicted to decrease by $2.97. B) The slope is -2.97. For every one dollar increase in gas bill price, the monthly temperature is predicted to decrease by 2.97°. C) The slope is -0.28. For every one degree increase in monthly temperature, the gas bill price is predicted to decrease by $0.28. D) The slope is -0.28. For every one dollar increase in gas bill price, the monthly temperature is predicted to decrease by 0.28°. Answer: A
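The slope and intercept questions above use only summary statistics: the slope is r times the ratio of the y standard deviation to the x standard deviation, and the intercept is the mean of y minus the slope times the mean of x. The sketch below applies these to the mpg and gas bill data. The function names are made up for the illustration, and the intercept is computed from the slope rounded to three decimals because that is how the keyed answer does it.

```python
def slope_from_summary(r, sd_x, sd_y):
    """Slope of the regression line: b = r * (s_y / s_x)."""
    return r * sd_y / sd_x

def intercept_from_summary(mean_x, mean_y, slope):
    """Intercept of the regression line: a = mean_y - b * mean_x."""
    return mean_y - slope * mean_x

# Questions 13 and 14: predict highway mpg (y) from city mpg (x).
mpg_slope = round(slope_from_summary(r=0.95, sd_x=9.2, sd_y=8.6), 3)
mpg_intercept = intercept_from_summary(mean_x=28, mean_y=35, slope=mpg_slope)

# Question 15: predict gas bill (y) from monthly temperature (x).
gas_slope = slope_from_summary(r=-0.92, sd_x=20.6, sd_y=66.5)

print(f"mpg slope = {mpg_slope}, intercept = {mpg_intercept:.3f}")
print(f"gas bill slope = {gas_slope:.2f}")
```

Putting x in the denominator matters: dividing the standard deviations the other way around gives the distractor values 1.016 and -0.28.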
16) Determine which is the correct calculation of the intercept for the linear model that predicts gas bill price from monthly temperature. A) a = 48.7 + 2.97(81.20) = 289.86 B) a = 48.7 + 0.28(81.20) = 71.44 C) a = 81.20 + 2.97(48.7) = 225.84 D) a = 81.20 + 0.28(48.7) = 94.84 Answer: C
Use the following regression equation regarding professor salaries to answer the question.
Predicted Salary = 95000 + 1280 (Years)
Note that Years is the number of years a professor has worked at a college, and Salary is the annual salary (in dollars) the professor earns.
17) Interpret the slope in the context of the data. A) The slope is 1280. If a professor has never worked at a college, his/her salary is expected to be $1,280. B) The slope is 95000. If a professor has never worked at a college, his/her salary is expected to be $95,000. C) The slope is 1280. For every additional year a professor works at a college, his/her salary is predicted to increase by $1,280. D) The slope is 95000. For every additional year a professor works at a college, his/her salary is predicted to increase by $95,000. Answer: C
18) Interpret the intercept in the context of the data. State whether the value is meaningful. A) The intercept is 1280. If a professor has completed 0 years of work at a college, his/her salary is expected to be $1,280. The value is meaningful because it represents the starting salary of a professor. B) The intercept is 95000. If a professor has completed 0 years of work at a college, his/her salary is expected to be $95,000. The value is meaningful because it represents the starting salary of a professor. C) The intercept is 1280. For every additional year a professor works at a college, his/her salary is predicted to increase by $1,280. The value is meaningful because the more a person works, the more money he/she typically makes. D) The intercept is 95000. For every additional year a professor works at a college, his/her salary is predicted to increase by $95,000. The value is meaningful because the more a person works, the more money he/she typically makes. Answer: B
Use the following regression equation regarding car mileage to answer the question.
Predicted Highway = 0.892 + 1.337 (City)
Note that City is the estimated miles per gallon (mpg) a car gets while driving on city streets, and Highway is the estimated miles per gallon (mpg) a car gets while driving on highways.
19) Interpret the slope in the context of the data. A) The slope is 0.892. For every additional mpg a car gets in the city, its highway mpg is predicted to increase by 0.892. B) The slope is 1.337. For every additional mpg a car gets in the city, its highway mpg is predicted to increase by 1.337. C) The slope is 0.892. If a car gets 0 mpg in the city, it will get 0.892 mpg on the highway. D) The slope is 1.337. If a car gets 0 mpg in the city, it will get 1.337 mpg on the highway. Answer: B
20) Interpret the intercept in the context of the data. State whether the value is meaningful. A) The intercept is 0.892. For every additional mpg a car gets in the city, its highway mpg is predicted to increase by 0.892. The value is meaningful. B) The intercept is 1.337. For every additional mpg a car gets in the city, its highway mpg is predicted to increase by 1.337. The value is meaningful. C) The intercept is 0.892. If a car gets 0 mpg in the city, it will get 0.892 mpg on the highway. The value is not meaningful because if a car is not moving, it cannot have a mpg value. D) The intercept is 1.337. If a car gets 0 mpg in the city, it will get 1.337 mpg on the highway. The value is not meaningful because if a car is not moving, it cannot have a mpg value. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Use the following regression equation regarding airline tickets to answer the question. ^
Price = 49 + 0.22 ∙ (Distance) Note that Distance is the number of miles between the departure and arrival cities, and Price is the cost of an airline ticket. 21) Interpret the slope of the regression equation in the context of the data. Answer: The slope is 0.22. For every additional mile of flight travel, the price of the airline ticket is predicted to increase by $0.22. 22) Interpret the intercept of the regression equation in the context of the data. Explain why the value is or is not meaningful. Answer: The intercept is 49. If you are traveling 0 miles, the price of an airline ticket is predicted to be $49. The value is not meaningful because you would not pay for an airline ticket if you are not traveling anywhere. Suppose data recorded for 100 employees of a large company included each employee's annual salary and the number of years the employee has been in his/her current position. The mean annual salary was $58,000 with a standard deviation of $12,500. The mean number of years in the current position was 10 with a standard deviation of 3. The correlation coefficient between the two variables is approximately 0.93. 23) Determine the correct value of the slope for the linear model that predicts annual salary from the number of years in current position and interpret it in context.
Answer: b = r × (s_y / s_x) = 0.93 × (12500 / 3) = 0.93 × 4166.67 = 3875.
So, the slope is $3875. For every additional year an employee has held his/her current position in the company, he/she is expected to earn $3875 more in annual salary. 24) Calculate the value of the intercept for the linear model that predicts annual salary from the number of years in current position and interpret it in context, if appropriate.
Answer: a = ȳ - b(x̄) = 58000 - 3875(10) = 58000 - 38750 = 19,250.
So, the intercept is $19,250. If an employee has worked for 0 years in the current position, he/she is predicted to earn a salary of $19,250. This is reasonable because $19,250 could be the employee's starting salary.
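The arithmetic in answers 23 and 24 can be checked with a short Python sketch. The function name line_from_summary and its parameter names are illustrative assumptions, not part of the test bank; the formulas are the ones used in the answers, b = r × (s_y / s_x) and a = ȳ - b(x̄).

```python
def line_from_summary(r, mean_x, sd_x, mean_y, sd_y):
    """Return (intercept, slope) of the least-squares line from summary statistics."""
    slope = r * sd_y / sd_x              # b = r * (s_y / s_x)
    intercept = mean_y - slope * mean_x  # a = y-bar - b * x-bar
    return intercept, slope


# Summary statistics from the employee salary problem (questions 23 and 24).
intercept, slope = line_from_summary(r=0.93, mean_x=10, sd_x=3,
                                     mean_y=58000, sd_y=12500)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

Run as written, the sketch should reproduce the slope of 3875 and the intercept of 19,250 given in the answers.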
4.4 Evaluating the Linear Model 1 Apply Concepts for Linear Models MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) In the NHL, the correlation between "Goals scored per game" and "minutes on the ice" for a team of players is found to be 0.8178. Choose the statement that is true about the coefficient of determination. A) When given as a percent, the coefficient of determination is always between 0 and 100%. B) The coefficient of determination, r^2, is equal to approximately 0.6688. C) The coefficient of determination states that about 66.88% of the variation in goals scored per game is explained by minutes on the ice. D) All of these are true statements. Answer: D 2) In the NBA, the correlation between "steals per game" and "blocked shots per game" is found to be 0.8045. Choose the statement that is true about the coefficient of determination. A) The coefficient of determination, r^2, is equal to approximately 0.6472. B) The coefficient of determination states that about 64.72% of the variation in blocked shots per game is explained by steals per game. C) When given as a percent, the coefficient of determination is always between 0 and 100%. D) All of these are true statements. Answer: D 3) Which of the following is not true about the coefficient of determination, r^2? A) The coefficient of determination, r^2, ranges from 0% to 100% and represents the amount of variability in the response variable (y) explained by the regression line. B) In order to interpret r^2, the linearity condition of the linear regression model must be satisfied. C) A hypothesis test should be conducted to verify that r^2 is large enough to conclude a linear relationship exists.
D) The coefficient of determination, r^2, is a statistic that will give some information about how well the data fit the model, but it should not be the only piece of information taken into consideration when determining how useful a linear model might be. Answer: C 4) The following model was created to show the association between the number of massages received per month and self-predicted stress level: Stress level = 10 - 0.02(number of massages per month). The coefficient of determination for the model is 0.066 or 6.6%. Choose the true statement regarding this model. A) The model shows that getting even one massage a month will decrease your stress level. B) The model shows that on average a person getting no massages will always have a stress level of 10. C) The model shows that there is a negative association between stress level and number of monthly massages, but the relatively small coefficient of determination suggests that very little of the variation is explained by the model, so the model is probably not very good at predicting stress level. D) None of these. Answer: C
5) It is determined that a positive linear association exists between age (for children between the ages of 3 and 9 years) and attention span (measured in minutes). The scatterplot below shows the association. The prediction equation is also given. A college instructor uses the model to predict the attention span of the students in her class who have an average age of 29. Choose the best statement to summarize why this is not an appropriate use for the model. attention span = 4.68 + 3.40(age)
A) This is an inappropriate use of the model because the model was used to make predictions beyond the scope of the data. The college instructor is extrapolating. B) This is an inappropriate use of the model because a 29-year-old person would be an outlier in this context. C) This is an inappropriate use of the model because age does not cause attention span to increase. Correlation does not mean causation. Answer: A 6) It is determined that a positive linear association exists between amount of a new smoking cessation drug taken (in milliliters) and weight gain in women (in pounds). The scatterplot below shows the association. The prediction equation is also given. A pharmacy technician uses the model to predict the potential weight gain for a man who takes the recommended dosage. Choose the best statement to summarize why this is not an appropriate use for the model. A) This is an inappropriate use of the model because a man would be an outlier in this context. B) This is an inappropriate use of the model because the model was used to make predictions beyond the scope of the data. The pharmacy technician is extrapolating. C) This is an inappropriate use of the model because taking the smoking cessation drug does not cause weight gain. Correlation does not mean causation. Answer: B 7) For any set of data, the regression equation will always pass through A) The point that represents the mean value of x and the mean value of y. B) Every point in the data set. C) At least two points in the data set. D) The intercept and the slope. Answer: A
8) For any set of data, the regression equation will always pass through A) At least two points in the data set. B) Every point in the data set. C) The point that represents the mean value of x and the mean value of y. D) The point with the smallest value of x and the largest value of x. Answer: C 9) What effects might an outlier have on a regression equation? A) An outlier has no effect on a regression equation. B) An outlier may affect only the slope of a regression equation. C) An outlier may affect only the correlation coefficient of a regression equation. D) An outlier may affect both the slope and the correlation coefficient of a regression equation. Answer: D 10) What effects might an outlier have on the slope of a regression equation? A) An outlier has no effect on the slope of a regression equation. B) An outlier could either increase or decrease the slope, depending on its location. C) An outlier always decreases the slope. D) An outlier always increases the slope. Answer: B 11) Which of the following sets of numbers would result in a negative correlation of -1?
[Answer choices A) through D) were presented as sets of numbers that are not reproduced here.] Answer: D 12) Which of the following sets of numbers would result in a positive correlation of 1? [Answer choices A) through D) were presented as sets of numbers that are not reproduced here.] Answer: C
The scatterplot below shows the relationship between the ages of women when they first married and the ages when they had their first child. The correlation coefficient between the values is 0.9315.
The regression equation for the data is
Predicted Age-first child = 6.404 + 0.955 (Age-married) 13) Would it be appropriate to say that the age at which a woman first marries causes her to have her first child after that age? A) Yes, the scatterplot shows a strong linear association, so the older a woman is when she first marries, the older she will be when she has her first child. B) Yes, since the correlation coefficient is close to 1, we can conclude causation. C) No, since the correlation coefficient does not equal 1, we cannot conclude causation. D) No, correlation never implies causation. Answer: D 14) Can we use the regression equation to predict the age at which a woman has her first child for a woman who first married at the age of 50? A) Yes, we can always make predictions once we have a regression equation. B) Yes, we can make a prediction because the scatterplot shows a strong positive linear relationship. C) No, we cannot make a prediction because a woman who marries at age 50 is outside the range of our data. We would be extrapolating. D) No, we cannot make a prediction because the correlation coefficient does not equal 1. Answer: C 15) Suppose the correlation coefficient describing how the heights of teenage boys predict their weights is 0.85. What is the coefficient of determination (r^2) and what does it mean in context? A) r^2 is 0.7225. 72.25% of the variation in heights of teenage boys can be explained by their weight. B) r^2 is 0.7225. 72.25% of the variation in weights of teenage boys can be explained by their height. C) r^2 is 0.85. 85% of the variation in heights of teenage boys can be explained by their weight. D) r^2 is 0.85. 85% of the variation in weights of teenage boys can be explained by their height. Answer: B
The scatterplot below shows the relationship between a car's speed and the distance it traveled to come to a complete stop when hitting the brakes. The correlation coefficient between the values is 0.81.
The regression equation for the data is
Predicted Distance = -17.579 + 3.932 (Speed) 16) Would it be appropriate to say that a car's speed causes its stopping distance? A) Yes, the scatterplot shows a strong linear association, so the faster a car is going, the longer it takes to stop it. B) Yes, since the correlation coefficient is close to 1, we can conclude causation. C) No, since the correlation coefficient does not equal 1, we cannot conclude causation. D) No, correlation never implies causation. Answer: D 17) Can we use the regression equation to predict the stopping distance of a car that is traveling at 40 mph? A) No, we cannot make a prediction because a car that is traveling at 40 mph is outside the range of our data. We would be extrapolating. B) No, we cannot make a prediction because the correlation coefficient does not equal 1. C) Yes, we can always make predictions once we have a regression equation. D) Yes, we can make a prediction because the scatterplot shows a strong positive linear relationship. Answer: A 18) What is the value of the coefficient of determination (r^2) and what does it mean in context? A) r^2 is 0.6561. 65.61% of the variation in a car's speed can be explained by its stopping distance. B) r^2 is 0.6561. 65.61% of the variation in a car's stopping distance can be explained by its speed. C) r^2 is 0.81. 81% of the variation in a car's speed can be explained by its stopping distance. D) r^2 is 0.81. 81% of the variation in a car's stopping distance can be explained by its speed. Answer: B
The scatterplot shows the relationship between a vehicle's age (in years) and its current value (in dollars). The coefficient of determination was found to be 98.74%.
19) Calculate the value of the correlation coefficient between a car's age and its current value. A) r = 0.9937 B) r = 0.9874 C) r = -0.9937 D) r = -0.9874 Answer: C 20) If we converted the x-axis to months, instead of years, what would happen to the value of the correlation coefficient? A) The correlation coefficient would increase because we are adding 12 additional values (one for each month) on the x-axis. B) The correlation coefficient would decrease because we have fewer data points per month as opposed to per year. C) The correlation coefficient would be unchanged because it is not affected by changes in units. D) We cannot make any conclusions about the correlation coefficient because we would need to use the new data values to calculate it. Answer: C
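Because r^2 discards the sign of r, recovering r in question 19 takes two steps: take the square root, then attach the sign of the trend seen in the scatterplot. A minimal Python sketch follows; the function name and the trend_is_positive flag are illustrative assumptions, and the negative trend is assumed from the keyed answer because the scatterplot itself is not reproduced here.

```python
import math


def r_from_r_squared(r_squared, trend_is_positive):
    """Recover r from r^2, taking the sign from the direction of the trend."""
    magnitude = math.sqrt(r_squared)
    return magnitude if trend_is_positive else -magnitude


# Question 19: r^2 = 98.74%; a negative trend is assumed here
# (consistent with the keyed answer, since the scatterplot is not shown).
r_vehicle = r_from_r_squared(0.9874, trend_is_positive=False)
print(round(r_vehicle, 4))
```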
The scatterplot shows the relationship between the foot lengths and foot widths of children. The coefficient of determination was found to be 41.09%.
21) Calculate the value of the correlation coefficient between foot length and foot width. A) r = -0.4109 B) r = -0.6410 C) r = 0.4109 D) r = 0.6410 Answer: D 22) If we converted the x-axis to inches, instead of centimeters, what would happen to the value of the correlation coefficient? A) The correlation coefficient would increase because there are 2.54 centimeters in 1 inch. B) The correlation coefficient would decrease because there are 2.54 centimeters in 1 inch. C) The correlation coefficient would be unchanged because it is not affected by changes in units. D) We cannot make any conclusions about the correlation coefficient because we would need to use the new data values (with inches) to calculate it. Answer: C
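The claim in question 22, that a change of units leaves the correlation coefficient unchanged, can be illustrated numerically. The sketch below is only a demonstration: the foot measurements are made up for illustration (the data behind the scatterplot are not given), and pearson_r is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical foot measurements in centimeters (made up for illustration).
length_cm = [20.1, 21.5, 22.0, 23.4, 24.8, 25.2]
width_cm = [8.0, 8.4, 8.1, 9.0, 9.3, 9.1]

# Convert the x-axis from centimeters to inches (2.54 cm per inch).
length_in = [x / 2.54 for x in length_cm]

r_cm = pearson_r(length_cm, width_cm)
r_in = pearson_r(length_in, width_cm)
print(round(r_cm, 6), round(r_in, 6))  # the two values agree
```

Dividing every x-value by the same positive constant rescales the deviations in the numerator and the denominator by the same factor, so the ratio is unchanged.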
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. The scatterplot below shows the relationship between the average monthly temperature and the monthly cost of a gas bill. The correlation coefficient between the values is -0.92.
The regression equation for the data is
Predicted Gas Bill = 225.78 - 2.97 (Temperature) 23) Do lower monthly temperatures cause gas bills to be more expensive? Explain your reasoning. Answer: No, lower monthly temperatures do not cause gas bills to be more expensive. Correlation never implies causation. We can say that lower temperatures are associated with higher gas bills, but we cannot say that the temperature causes the price. 24) If appropriate, calculate the expected gas bill cost for a month with an average temperature of -20°. If it's not appropriate, explain why. Answer: It would not be appropriate to make a prediction because a temperature of -20° is outside the range of our data. We would be extrapolating. 25) Determine the value of the coefficient of determination (r^2) and explain its meaning in context. Answer: r^2 = (-0.92)^2 = 0.8464. This means that 84.64% of the variation in the cost of a gas bill can be explained by the monthly average temperature.
The following scatterplot shows the relationship between heights (in cm) and weights (in kg) of 100 Americans. The coefficient of determination was found to be 37.9%.
26) Calculate the value of the correlation coefficient between height and weight. Answer: r = ±√(r^2) = ±√0.379 = ±0.6156. Since the plot shows a positive relationship between height and weight, r = 0.6156. 27) If we converted the x-axis to inches, instead of centimeters, what would happen to the value of the correlation coefficient? Answer: The correlation coefficient would be unchanged because it is not affected by changes in units. Solve the problem. 28) Create a data set of 2 variables (X and Y) such that the relationship between X and Y would have a perfect negative correlation of -1. Note: The data set can be very small (three to four X and Y values). Answer: Answers may vary. [The example data set is not reproduced here.]
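One possible answer to question 28, offered only as an illustration (the values below are made up and are not the test bank's own example), is any set of points that falls exactly on a line with negative slope. The Python sketch checks that such a data set has a correlation of -1; pearson_r is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data lying exactly on the line Y = 10 - 2X.
xs = [1, 2, 3, 4]
ys = [8, 6, 4, 2]

r = pearson_r(xs, ys)
print(r)
```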
29) For the following scatterplot, what effect would the outlier have on the slope of the regression equation? Explain your reasoning.
Answer: The outlier would decrease the slope because it would influence the lower left hand side of the regression line to move closer to it. This would make the line less steep, which results in a smaller slope value. 2 Perform a Linear Regression MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The data in the table represent the amount of pressure (psi) exerted by a stamping machine ( x), and the amount of scrap brass shavings (in pounds) that are collected from the machine each hour ( y). Also shown below are the outputs from two different statistical technologies (TI-83/84/94 Calculator and Excel). A scatterplot of the data confirms that there is a linear association. Report the equation for predicting scrap brass shavings using words such as scrap, not x and y. State the slope and intercept of the prediction equation. Round all calculations to the nearest thousandth.
A) scrap = 2.134 - 2.019(pressure); slope = -2.019 and the intercept is 2.134. B) scrap = -2.019 + 2.134(pressure); slope = -2.019 and the intercept is 2.134. C) scrap = 2.134 - 2.019(pressure); slope = 2.134 and the intercept is -2.019. D) scrap = -2.019 + 2.134(pressure); slope = 2.134 and the intercept is -2.019. Answer: D
2) The data in the table represent the amount of raw material (in tons) put into an injection molding machine each day (x), and the amount of scrap plastic (in tons) that is collected from the machine every four weeks (y). Also shown below are the outputs from two different statistical technologies (TI-83/84/94 Calculator and Excel). A scatterplot of the data confirms that there is a linear association. Report the equation for predicting scrap from raw material using words such as scrap, not x and y. State the slope and intercept of the prediction equation. Round all calculations to the nearest hundredth.
A) scrap = 2.19 - 2.38(raw material); slope = 2.19 and the intercept is -2.38. B) scrap = 2.19 - 2.38(raw material); slope = -2.38 and the intercept is 2.19. C) scrap = -2.38 + 2.19(raw material); slope = -2.38 and the intercept is 2.19. D) scrap = -2.38 + 2.19(raw material); slope = 2.19 and the intercept is -2.38. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) The following table shows the weights and prices of some whole rotisserie chickens at different supermarkets.
a. Make a scatterplot with weight on the x-axis and cost on the y-axis. Include the regression line on your scatterplot.
b. Find the numerical value for the correlation between weight and price. Explain what the sign of the correlation indicates.
c. Report the equation of the best-fit straight line, using weight as the predictor (x) and cost as the response (y).
d. Report the slope and intercept of the regression line, and explain what they show. If the intercept is not appropriate to report, explain why.
e. Find and interpret the coefficient of determination using the original data.
Weight (lb.)   2.8     3.7     2.9     4.2     5.3     4.7
Price          $3.92   $4.70   $4.41   $5.38   $6.84   $5.99
Answer: a. [Scatterplot of price against weight with the regression line; figure not reproduced.]
b. r = 0.981. A positive correlation suggests that larger chickens tend to have higher prices. c. Predicted Price = 0.998 + 1.07 Weight. d. The slope: for each additional pound, the price goes up by $1.07. The interpretation of the intercept is inappropriate because it is not possible to have a chicken that weighs 0 pounds. e. r^2 = 0.96. 96% of the variation in chicken price is explained by weight.
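The values reported in parts b through e can be reproduced from the six (weight, price) pairs in the table. The following Python sketch uses the standard least-squares formulas; the function name least_squares is an illustrative assumption, and a calculator or spreadsheet may differ from it in the last printed digit because of rounding.

```python
import math

# Data from the rotisserie chicken table in question 3.
weights = [2.8, 3.7, 2.9, 4.2, 5.3, 4.7]       # pounds
prices = [3.92, 4.70, 4.41, 5.38, 6.84, 5.99]  # dollars


def least_squares(xs, ys):
    """Return (intercept, slope, r) for the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return intercept, slope, r


intercept, slope, r = least_squares(weights, prices)
print(f"Predicted Price = {intercept:.3f} + {slope:.2f} Weight")
print(f"r = {r:.3f}, r^2 = {r * r:.2f}")
```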
4) Data were collected that included information on the weight of the trash (in pounds) on the street for one week and the number of people who live in the house. The following figure shows a scatterplot with the regression line. a. Is the trend positive or negative? What does that mean? b. Now calculate the correlation between the weight of trash and the number of people. c. Report the slope. For each additional person in the house, there are, on average, how many additional pounds of trash? d. Either report the intercept or explain why it is not appropriate to interpret it.
Answer: a. The data suggest a positive trend. This means that there tends to be more trash if there are more people living in the house. In other words, as the number of people in the household increases, the pounds of trash increase. b. r = 0.906. c. The slope is 8.481. For each additional person, on average there are an additional 8.481 pounds of trash. d. This suggests that with 0 people there should be 6.167 pounds of trash. But we should not draw that conclusion because it is not meaningful to think of a house producing trash all by itself (without people present).
5) Data shown in the table are the 4th-grade reading and math scores for a sample of regions from a certain assessment organization. The scores represent the percentage of 4th-graders in each state who scored at or above the basic level in reading and math. A scatterplot of the data suggests a linear trend.
4th-Grade Reading Scores   4th-Grade Math Scores
63                         77
60                         76
54                         68
61                         76
73                         80
66                         79
58                         73
78                         88
69                         79
62                         78
71                         84
a. Find and report the value for the correlation coefficient and the regression equation predicting the math score from the reading score. Use the words Reading and Math in your regression equation and round off to two decimal places. Then find the predicted math score for a region with a reading score of 70.
b. Find and report the value of the correlation coefficient and the regression equation for predicting the reading score from the math score. Then find the predicted reading score for a state with a math score of 70.
c. Discuss the effect of changing the choice of dependent and independent variable on the value of r and on the regression equation.
Answer: a. r = 0.94, Math = 33.14 + 0.69 Reading, 33.14 + 0.69(70) = 81. Predicted math score is 81. b. r = 0.94, Reading = -34.48 + 1.28 Math, -34.48 + 1.28(70) = 55. Predicted reading score is 55. c. Changing the choice of the dependent and independent variables does not change r but does change the regression equation.
6) Assume that in an ecology class, the teacher gives a midterm exam and a final exam. Assume that the association between midterm and final scores is linear. Here are the summary statistics:
Midterm: Mean = 74, Standard deviation = 6
Final: Mean = 74, Standard deviation = 6
Also, r = 0.75 and n = 30.
a. Find and report the equation of the regression line to predict the final exam score from the midterm score. b. For a student who gets 60 on the midterm, predict the final exam score. c. Consider a student who gets a 100 on the midterm. Without doing any calculations, state whether the predicted score on the final exam would be higher, lower, or the same as 100.
Answer:
a. b = r × (s_y / s_x) = 0.75 × (6 / 6) = 0.75
a = ȳ - b(x̄) = 74 - 0.75(74) = 74 - 55.5 = 18.5
Predicted Final = 18.5 + 0.75 Midterm
b. Predicted Final = 18.5 + 0.75(60) = 63.5
c. It should be lower than 100 because of regression toward the mean.
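Part c of question 6 asks for a judgment without calculation, but the fitted line makes the regression-toward-the-mean effect easy to see. The sketch below, with the hypothetical function name predicted_final, evaluates Predicted Final = 18.5 + 0.75 Midterm at a few midterm scores.

```python
def predicted_final(midterm, intercept=18.5, slope=0.75):
    """Predicted final exam score from the regression line in question 6."""
    return intercept + slope * midterm


# A low score, the mean score, and a very high score on the midterm.
for midterm in (60, 74, 100):
    print(midterm, predicted_final(midterm))
```

The prediction for a midterm score of 60 is 63.5 and the prediction for 100 is 93.5: each lies closer to the mean of 74 than the midterm score does, and a student at the mean is predicted to stay at the mean.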
Ch. 4 Regression Analysis: Exploring Associations between Variables Answer Key 4.1 Visualizing Variability with a Scatterplot 1 Compare Scatterplots to Determine Which One Shows a Greater Strength of Association 1) A 2) B 3) C 4) A 5) C 6) There are two possible answers here: (1) The number of crimes committed is the explanatory variable and the number of police on patrol is the response because the amount of crime can explain a need for more or less police officers. (2) The number of police on patrol is the explanatory variable and the amount of crime is the response because the more police that are on patrol could explain a reduction in the amount of crime. 7) Scatterplot (ii) shows a stronger linear relationship because it has less vertical variation between points. As a treeʹs diameter increases, its volume typically increases as well. 2 Describe and Interpret Increasing Trends, Decreasing Trends, or no Trends from Scatterplots 1) C 2) C 3) B 4) A 5) D 6) A 7) A 8) A 9) C 10) D 11) Answers may vary. Example: People who spend more time working out tend to lose more weight.
4.2 Measuring Strength of Association with Correlation 1 Identify and Interpret Correlation in a Scatterplot 1) B 2) C 3) B 4) A 5) B 6) C 7) A 8) A 9) B 10) D 11) C 12) A 13) C 14) C 15) A 16) D 17) B 18) A correlation coefficient will be negative when there is a negative linear trend in a scatterplot. So, as the values of x increase, the values of y tend to decrease. 19) Scatterplot (i) has the highest correlation because it is more linear than scatterplot (ii). The points have less vertical variation in scatterplot (i), so the relationship is stronger. 20) The correlation coefficient, r, will get closer to 1 because the point is an outlier. Since the rest of the points show a strongly positive linear relationship, r would reflect that and get closer to 1.
2 Calculate the Correlation Coefficient of Data 1) A 2) A 3) A 4) A 5) A 6) B 7) D 8) The correlation coefficient between the heights of the teenagers in 2003 and 2008 is equal to 1 because for every person, the height increased exactly 6 inches (15.24 cm) from 2003 to 2008. If we plotted the data, the points would form a straight line. 9) a. r = 0.92 b. r = 0.92. The correlation coefficient stays the same. c. r = 0.92. Adding a constant to all y-values does not change the value of r. d. r = 0.92. The correlation coefficient stays the same. 10) a. r = 0.88 b. r = 0.88. Multiplying by a constant does not change the value of r. c. r = 0.88. Adding a constant does not change the value of r because the strength of the association is not affected. 11) A 12) A 13) A
4.3 Modeling Linear Trends 1 Predict Values From a Regression Equation 1) D 2) A 3) D 4) D
5) Predicted Price = 49 + 0.22 (Distance) = 49 + 0.22 (2448.3) = 49 + 538.63 = $587.63
6) Using the regression equation, we find that she has held her current position in the company for approximately 17 years. Salary = 19250 + 3875 (Years in Position); 85,000 = 19,250 + 3875 (Years in Position); 65,750 = 3875 (Years in Position); 16.97 = (Years in Position)
2 Interpret or Set Up a Regression Equation and/or Scatterplot 1) A 2) C 3) B 4) B 5) B 6) B 7) C 8) A 9) A 10) A 11) C 12) B 13) A 14) C 15) A 16) C 17) C 18) B 19) B 20) C
21) The slope is 0.22. For every additional mile of flight travel, the price of the airline ticket is predicted to increase by $0.22. 22) The intercept is 49. If you are traveling 0 miles, the price of an airline ticket is predicted to be $49. The value is not meaningful because you would not pay for an airline ticket if you are not traveling anywhere.
23) b = r × (s_y / s_x) = 0.93 × (12500 / 3) = 0.93 × 4166.67 = 3875. So, the slope is $3875. For every additional year an employee has held his/her current position in the company, he/she is expected to earn $3875 more in annual salary.
24) a = ȳ - b(x̄) = 58000 - 3875(10) = 58000 - 38750 = 19,250. So, the intercept is $19,250. If an employee has worked for 0 years in the current position, he/she is predicted to earn a salary of $19,250. This is reasonable because $19,250 could be the employee's starting salary.
4.4 Evaluating the Linear Model 1 Apply Concepts for Linear Models 1) D 2) D 3) C 4) C 5) A 6) B 7) A 8) C 9) D 10) B 11) D 12) C 13) D 14) C 15) B 16) D 17) A 18) B 19) C 20) C 21) D 22) C 23) No, lower monthly temperatures do not cause gas bills to be more expensive. Correlation never implies causation. We can say that lower temperatures are associated with higher gas bills, but we cannot say that the temperature causes the price. 24) It would not be appropriate to make a prediction because a temperature of -20° is outside the range of our data. We would be extrapolating. 25) r^2 = (-0.92)^2 = 0.8464. This means that 84.64% of the variation in the cost of a gas bill can be explained by the monthly average temperature. 26) r = ±√(r^2) = ±√0.379 = ±0.6156. Since the plot shows a positive relationship between height and weight, r = 0.6156. 27) The correlation coefficient would be unchanged because it is not affected by changes in units. 28) Answers may vary. [The example data set is not reproduced here.]
29) The outlier would decrease the slope because it would influence the lower left hand side of the regression line to move closer to it. This would make the line less steep, which results in a smaller slope value.
2 Perform a Linear Regression 1) D 2) D 3) a. [Scatterplot of price against weight with the regression line; figure not reproduced.]
b. r = 0.981. A positive correlation suggests that larger chickens tend to have higher prices. c. Predicted Price = 0.998 + 1.07 Weight. d. The slope: for each additional pound, the price goes up by $1.07. The interpretation of the intercept is inappropriate because it is not possible to have a chicken that weighs 0 pounds. e. r^2 = 0.96. 96% of the variation in chicken price is explained by weight.
4) a. The data suggest a positive trend. This means that there tends to be more trash if there are more people living in the house. In other words, as the number of people in the household increases, the pounds of trash increase. b. r = 0.906. c. The slope is 8.481. For each additional person, on average there are an additional 8.481 pounds of trash. d. This suggests that with 0 people there should be 6.167 pounds of trash. But we should not draw that conclusion because it is not meaningful to think of a house producing trash all by itself (without people present).
5) a. r = 0.94, Math = 33.14 + 0.69 Reading, 33.14 + 0.69(70) = 81. Predicted math score is 81. b. r = 0.94, Reading = -34.48 + 1.28 Math, -34.48 + 1.28(70) = 55. Predicted reading score is 55. c. Changing the choice of the dependent and independent variables does not change r but does change the regression equation.
6) a. b = r × (s_y / s_x) = 0.75 × (6 / 6) = 0.75
a = ȳ - b(x̄) = 74 - 0.75(74) = 74 - 55.5 = 18.5
Predicted Final = 18.5 + 0.75 Midterm
b. Predicted Final = 18.5 + 0.75(60) = 63.5
c. It should be lower than 100 because of regression toward the mean.
Ch. 5 Modeling Variation with Probability 5.1 What is Randomness? 1 Use Simulations to Calculate Probabilities MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) If 20 babies are born, how often are there 8 or fewer male babies? Assume that the gender of a baby is a random outcome and that the probability of a female is 0.50. Which of the following experiments would not simulate this situation? A) Flip a coin twenty times. Designate a head to mean "female" and a tail to mean "male". B) Roll a die twenty times. Designate a 1, 2, or 3 to mean "female" and a 4, 5, or 6 to mean "male". C) Choose the first twenty digits from a row in the random number table. Designate odd numbers to mean "female" and even numbers to mean "male". D) All of these will simulate the gender of twenty babies. Answer: D 2) A baseball player has a 0.33 probability of a hit when at bat. After 30 at-bats, we want to know how often he will have more than 20 hits. Assume that having a hit is a random outcome. Which of the following experiments would not simulate this situation? A) Choose the first 30 digits from a row in the random number table. Designate odd numbers to mean "hit" and even numbers to mean "non-hit". B) Flip a coin thirty times. Designate a head to mean "hit" and a tail to mean "non-hit". C) All of these will simulate the number of hits. D) None of these will simulate the number of hits. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
3) If we flip a coin 10 times, what percentage of the time will the coin land on heads? A first step to answering this question is to simulate 10 flips. Use the random number table in Appendix A, partially shown here, to simulate flipping a coin 10 times. Let the digits 0, 1, 2, 3, 4 represent heads and the digits 5, 6, 7, 8, 9 represent tails. Begin with the first digit in the sixth row.
a. Write the sequence of 10 random digits. b. Change the sequence of 10 random digits to a sequence of heads and tails, writing H for the digits 0, 1, 2, 3, 4 and T for the digits 5, 6, 7, 8, 9. c. What was the longest streak of heads in your list? d. What percentage of the flips were heads?
Answer: a. 8 7 9 6 4 4 3 7 5 1 b. T T T T H H H T T H c. Longest streak was 3 heads. d. 40% of the flips were heads.
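The conversion in parts b through d can be checked mechanically. This sketch takes the ten digits from the answer to part a as given; the variable names are illustrative only.

```python
from itertools import groupby

digits = [8, 7, 9, 6, 4, 4, 3, 7, 5, 1]  # sequence from the answer to part a

# Digits 0-4 mean heads, digits 5-9 mean tails.
flips = "".join("H" if d <= 4 else "T" for d in digits)

# Longest run of consecutive heads.
longest_heads = max(
    (len(list(run)) for face, run in groupby(flips) if face == "H"),
    default=0,
)
percent_heads = 100 * flips.count("H") / len(flips)
print(flips, longest_heads, percent_heads)  # TTTTHHHTTH 3 40.0
```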
4) Suppose you are carrying out a randomized experiment to test if there is a difference in the amount of information remembered between students who take notes using a computer versus those who take notes by hand using pen and paper. You have 25 college student participants and want each to have an equal chance of being assigned to the computer group or the pen and paper group. Let the even digits (0, 2, 4, 6, 8) represent assignment to the computer group and the odd digits (1, 3, 5, 7, 9) represent assignment to the pen and paper group. Begin with the first digit in the eleventh row of the random number table in Appendix A partially shown here.
a. Write the sequence of 25 random digits. b. Change the sequence of 25 random digits to a sequence of C for computer (digits 0, 2, 4, 6, 8) and P for paper (digits 1, 3, 5, 7, 9). c. What percentage of the 25 participants were assigned to the computer group? d. Describe another way the random number table could have been used to randomly assign participants to one of two groups.
Answer: a. 2 7 5 8 3 0 1 8 6 6 5 8 2 5 0 3 8 1 0 3 3 5 8 2 5 b. C P P C P C P C C C P C C P C P C P C P P P C C P c. 52% were assigned to the computer group. d. The digits 0, 1, 2, 3, 4 could have assigned participants to Computer, and the digits 5, 6, 7, 8, 9 could have assigned participants to Paper. Other solutions are possible.
2 Identify Empirical and Theoretical Probabilities
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Is the following an example of theoretical probability or empirical probability? A card player declares that there is a one in thirteen chance that the next card pulled from a well-shuffled, full deck will be a queen. A) Theoretical B) Empirical Answer: A
2) Is the following an example of theoretical probability or empirical probability? A homeowner notes that five out of seven days the newspaper arrives before 5 pm. He concludes that the probability that the newspaper will arrive before 5 pm tomorrow is about 71%. A) Theoretical B) Empirical Answer: B
3) Is the following an example of theoretical probability or empirical probability? A fisherman notes that eight out of ten times that he uses a certain lure he catches a fish within an hour. He concludes that the probability that the lure will catch a fish on his next fishing trip is about 80%. A) Theoretical B) Empirical Answer: B
4) Is the following an example of theoretical probability or empirical probability? At a carnival shell game the player can pay three dollars and choose the shell that he or she believes is hiding the prize. There are four shells that are thoroughly mixed up after each guess. The player concludes that there is a one in four chance of randomly picking the winning shell. A) Theoretical B) Empirical Answer: A
5) A student tosses two coins 20 times and records the number of tails.
Outcome    Frequency
2 tails    10
1 tail     2
0 tails    8
The student concludes that the probability of getting exactly 1 tail when tossing two coins is 1/10. Is this an example of empirical probability or theoretical probability? A) Empirical B) Theoretical Answer: A
6) A bag contains 9 red marbles, 6 blue marbles, and 4 green marbles. Jeffery claims that if a marble is selected at random from the bag, the probability of choosing a blue marble is 6/19. Is this an example of empirical probability or theoretical probability? A) Theoretical B) Empirical
Answer: A
7) A homeowner notices that 8 out of 14 days the mail arrives before 3 pm. She concludes that the probability that the mail will arrive before 3 pm tomorrow is about 57%. Is this an example of a theoretical or empirical probability? A) Theoretical B) Empirical Answer: B
8) A six-sided die is rolled and a coin is tossed. The probability of getting a tail on the coin and a 2 on the die is 8.3%. Is this an example of a theoretical or empirical probability? A) Theoretical B) Empirical Answer: A
9) Is the following an example of theoretical probability or empirical probability? A survey was conducted to determine a group of elderly adults' favorite breeds of dogs. The surveyor concludes that the probability that an elderly adult prefers a poodle is about 30%. A) Theoretical B) Empirical Answer: B
10) Is the following an example of theoretical probability or empirical probability? Joyce and Joel roll two dice 50 times and record their results in the accompanying chart. They calculate the number of times 7 was rolled. This is an example of what type of probability? A) Theoretical B) Empirical Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
11) Describe the difference between a theoretical probability and an empirical probability. Give at least one example of each type of probability. Answer: An empirical probability is a short-run relative frequency based on an experiment. A theoretical probability is a long-run relative frequency of an event after infinitely many trials.
12) A card player claims that the probability of choosing a red jack from a well-shuffled deck of cards is 1/26 because choosing any card is equally likely and there are two red jacks in the deck of fifty-two cards. Is this an example of a theoretical probability or an empirical probability? Explain. Answer: This is a theoretical probability because it is not based on a short-run experiment.
13) A researcher wants to compare experimental results with a theoretical value for the probability of getting a tail on a coin and a 1 on a die. Design an experiment to compare the probabilities. State which one is the theoretical probability and which one is the empirical probability. Explain. Answer: The probability of each independent event is 1/2 for P(tail) and 1/6 for P(1). The probability of both events together is 1/2 * 1/6 = 1/12, or about 8.3%. This is the theoretical value. An experiment can be designed in which the coin is flipped and the die is rolled separately, the results are recorded for 10 flips and rolls, and the observed relative frequency is compared to the theoretical value. This is the empirical probability.
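The experiment in question 13 can also be run in software. The sketch below is one possible design, assuming a fair coin and a fair six-sided die; the function name, the seed, and the larger trial count are illustrative choices and not part of the answer key.

```python
import random
from fractions import Fraction

# Theoretical value: independent events, so the probabilities multiply.
theoretical = Fraction(1, 2) * Fraction(1, 6)

def run_experiment(trials, seed=0):
    """Empirical relative frequency of (tail on the coin AND 1 on the die)."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        coin = rng.choice(["H", "T"])
        die = rng.randint(1, 6)
        if coin == "T" and die == 1:
            successes += 1
    return successes / trials

empirical_small = run_experiment(10)        # the 10 trials described in the answer
empirical_large = run_experiment(100_000)   # many more trials, for comparison
print(theoretical, empirical_small, empirical_large)
```

With only 10 trials the empirical value can sit far from 1/12; with many trials it tends to settle close to the theoretical value.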
5.2 Finding Theoretical Probabilities 1 Find Theoretical Probabilities MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following statements is not true about probability? A) A probability of zero means that an event will not happen, a probability of one means that an event is certain to happen. B) Probability is used to measure how often random events occur. C) Probabilities are always numbers between 0 and 1 inclusive. D) All of these are true statements Answer: D 2) The National Center for Health Statistics has found that there is a 0.41% chance that an American citizen will die from falling. What is the probability that you will not die from a fall? (Round to the nearest hundredth of a percent) A) 99.59% B) 93.31% C) 59.00% D) Canʹt be determined with the given information. Answer: A 3) The National Center for Health Statistics has found that there is a 5.01% chance that an American citizen will die from an accident (unintentional injury). What is the probability that you will not die from an accident? (Round to the nearest hundredth of a percent) A) 95.00% B) 99.50% C) 94.99% D) Canʹt be determined with the given information. Answer: C
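Questions 2 and 3 both rely on the complement rule, P(not A) = 1 - P(A). A minimal sketch working in percentages follows; the helper name is hypothetical.

```python
def complement_percent(p_percent):
    """Return the percentage chance that an event does NOT happen."""
    return round(100 - p_percent, 2)

not_fall = complement_percent(0.41)      # question 2: chance of dying from a fall is 0.41%
not_accident = complement_percent(5.01)  # question 3: chance of dying from an accident is 5.01%
print(not_fall, not_accident)
```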
Use the following table to answer the question. A random sample of college students was asked to respond to a survey about how they spend their free time on weekends. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Saturday morning. The activity choices were homework, housework, outside employment, recreation, or other.
4) If one student is randomly chosen from the group, what is the probability that the student is female? A) 0.50 B) 0.48 C) 0.52 D) None of these Answer: C
5) If one student is randomly chosen from the group, what is the probability that the student chose "outside employment" as their most likely activity on a Saturday morning? A) 0.13 B) 0.23 C) 0.43 D) None of these Answer: B
6) If one student is randomly chosen from the group, what is the probability that the student is male? A) 0.50 B) 0.48 C) 0.52 D) None of these Answer: B
7) If one student is randomly chosen from the group, what is the probability that the student chose "recreation" as their most likely activity on a Saturday morning? A) 0.310 B) 0.195 C) 0.115 D) None of these Answer: A
8) What is the probability that a randomly chosen survey respondent is male or chose "recreation" as their most likely activity on Saturday mornings? A) 0.790 B) 0.480 C) 0.675 D) None of these Answer: C
9) If one student is randomly chosen from the group, what is the probability that the student chose "recreation" or "other" as their most likely activity on a Saturday morning? A) 0.395 B) 0.375 C) 0.275 D) None of these Answer: B
10) If one student is randomly chosen from the group, what is the probability that the student chose "homework" or "housework" as their most likely activity on a Saturday morning? A) 0.395 B) 0.145 C) 0.075 D) None of these Answer: A
11) If one student is randomly chosen from the group, what is the probability that the student is female or chose "homework" as their most likely activity on a Saturday morning? A) 0.665 B) 0.900 C) 0.755 D) None of these Answer: A
12) If one student is randomly chosen from the group, what is the probability that the student is female and chose "outside employment" as their most likely activity on a Saturday morning? A) 0.13 B) 0.23 C) 0.43 D) None of these Answer: A
13) If one student is randomly chosen from the group, what is the probability that the student is male and chose "outside employment" as their most likely activity on a Saturday morning? A) 0.13 B) 0.23 C) 0.10 D) None of these Answer: C
Provide an appropriate response.
14) Use the spinner below to answer the question. Assume that it is equally probable that the pointer will land on any of the five numbered spaces. If the pointer lands on a borderline, spin again.
Find the probability that the arrow will land on 4 or 1. A) 2/5 B) 4 C) 1/3 D) 3/4 Answer: A
15) You are dealt one card from a standard 52-card deck. Find the probability of being dealt an ace or a 6. A) 2/13 B) 7/26 C) 13/2 D) 7 Answer: A
16) A die is rolled. The set of equally likely outcomes is {1, 2, 3, 4, 5, 6}. Find the probability of getting a 5. A) 1/6 B) 5/6 C) 5 D) 0 Answer: A
17) A die is rolled. The set of equally likely outcomes is {1, 2, 3, 4, 5, 6}. Find the probability of getting a 9. A) 0 B) 1 C) 9 D) 9/6 Answer: A
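Questions 14 through 17 all count favorable outcomes in a set of equally likely outcomes. The sketch below shows that counting approach with exact fractions; the helper name and the way the deck and spinner are encoded are assumptions made for illustration.

```python
from fractions import Fraction

def probability(event, sample_space):
    """P(event) when every outcome in sample_space is equally likely."""
    outcomes = list(sample_space)
    favorable = [o for o in outcomes if event(o)]
    return Fraction(len(favorable), len(outcomes))

spinner = range(1, 6)   # five numbered spaces
die = range(1, 7)
ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(rank, suit) for rank in ranks for suit in suits]

p_spinner = probability(lambda s: s in (4, 1), spinner)      # question 14
p_ace_or_6 = probability(lambda c: c[0] in ("A", "6"), deck)  # question 15
p_five = probability(lambda d: d == 5, die)                   # question 16
p_nine = probability(lambda d: d == 9, die)                   # question 17
print(p_spinner, p_ace_or_6, p_five, p_nine)  # 2/5 2/13 1/6 0
```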
Find the indicated probability. 18) The following table shows the career and age of retirement for a group of retired people.
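(Table: career by age of retirement. The row and column labels and some rows were lost in extraction. Surviving rows, with the row total as the last value:)
11    35    90    45    181
10    47    86    46    189
60   171   309   190    730   (column totals)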
Suppose one of these people is selected at random. Compute the probability that the person selected was a store A) 0.249 B) 0.099 C) 0.300 D) 0.025 Answer: A
19) The Book Industry Study Group, Inc., performs sample surveys to obtain information on characteristics of book readers. A book reader is defined to be one who read one or more books in the six months prior to the survey; a non-book reader is defined to be one who read newspapers or magazines but no books in the six months prior to the survey; a nonreader is defined to be one who did not read a book, newspaper, or magazine in the six months prior to the survey. The following data were obtained from a random sample of people 16 years old and over.
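(Table: reader type by respondent category. The row and column labels and one row were lost in extraction. Surviving rows, with the row total as the last value:)
144   159    24     327
164   140    11     315
219    91     4     314
700   657    94   1,451   (column totals)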
Suppose one of these people is selected at random. Compute the probability that the person is a nonreader. A) 0.065 B) 0.111 C) 0.038 D) 0.585 Answer: A
Solve the problem.
20) The Social Security Administration reported in 2014 that if you are a 70-year-old man, there is a 65.8% chance of making it to your 80th birthday. What is the probability that you will die before reaching your 80th birthday? (Round to the nearest hundredth.) A) 0.34% B) 0.66% C) 34.20% D) Can't be determined with the given information. Answer: C
A random sample of adults was asked to respond to a survey about whether they use their cell phones to shop for a specific item. One question, summarized in the table below reflecting probabilities, asked each respondent to choose whether or not they use a cell phone app to shop. The major age groupings were under 40 or 40 years or older.
21) If one adult is randomly chosen from the group, what is the probability that the adult is 40 years or older? A) 40% B) 0.40% C) 60% D) 50% Answer: A
22) If one adult is randomly chosen from the group, what is the probability that the adult uses a cell phone app for shopping and is under 40 years of age? A) 40% B) 10% C) 60% D) 30% Answer: A
23) What type of probability is the 0.30 shown in the table? A) Theoretical B) Empirical Answer: B
24) If one adult is randomly chosen from the group, what is the probability that the adult does not use a cell phone app for shopping and is over 40 years old? A) 40% B) 30% C) 60% D) 70% Answer: B
25) Find the probability that an adult randomly chosen from the group uses a cell phone app to shop. (Round to the nearest thousandth.) A) 60% B) 50% C) 40% D) 0.50% Answer: B
A random sample of car buyers was asked to respond to a survey about what was the most important quality of the car they purchased. This question is summarized in the table below. The important contributors were fuel efficiency, looks, manufacturer reputation, price, or other.
26) If one car buyer is randomly chosen from the group, what is the probability that the buyer is female? A) 0.50 B) 0.48 C) 0.52 D) 0.38 Answer: C 27) If one car buyer is randomly chosen from the group, what is the probability that the buyer chose a car based on ʺlooksʺ as their most important factor for the purchase? A) 0.320 B) 0.235 C) 0.160 D) 0.640 Answer: C 28) If one car buyer is randomly chosen from the group, what is the probability that the buyer is male and chose ʺmanufacturer reputationʺ as their most important factor for the purchase? A) 0.208 B) 0.100 C) 0.230 D) 0.130 Answer: B
29) If one car buyer is randomly chosen from the group, what is the probability that the buyer is female and chose "fuel efficiency" or "other" as their most important factor for the purchase? A) 0.212 B) 0.110 C) 0.300 D) 0.173 Answer: B
30) If one car buyer is randomly chosen from the group, what is the probability amongst female buyers that they chose "price" as their most important factor for the purchase? A) 0.629 B) 0.240 C) 0.310 D) 0.375 Answer: D
31) Find the probability that a car buyer who chose "looks" as their most important factor was a female car buyer. A) 0.160 B) 0.469 C) 0.531 D) 0.163 Answer: C
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Suppose that the typical work schedule for the wait staff at Sam's BBQ Shack, which is open seven days a week, is five days on with two days off each week. A week begins on Monday and ends on Sunday. Assume that any day of the week is equally likely to be a day off and the selection of a day off is independent.
32) What is the probability that Isaac, a waiter at Sam's BBQ Shack, will have Saturday or Sunday off? Show your work and round to the nearest tenth of a percent. Answer: 1/7 + 1/7 = 0.286
33) State the complement of the event given in the previous question and calculate the probability of the complement. Show your work and round to the nearest tenth of a percent. Answer: Complement: Isaac will work Saturday and will work Sunday (not(A or B) = (not A AND not B)); 0.714
34) What is the probability that Isaac will have Saturday and Sunday off? Show your work and round to the nearest tenth of a percent. Answer: 1/7 * 1/7 = 0.020
A random sample of college students was asked to respond to a survey about how they spend their free time on a week night. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Wednesday afternoon/evening.
The activity choices were homework, housework, outside employment, recreation, or other.
35) If one person is chosen randomly from the group, what is the probability that the person is working on their homework on a Wednesday afternoon/evening? (round to the nearest thousandth) Answer: 77/200 = 0.385 36) If one person is chosen randomly from the group, what is the probability that the student is female and working at outside employment on a Wednesday afternoon/evening? (round to the nearest thousandth) Answer: 30/200 = 0.150 37) If one person is chosen randomly from the group, what is the probability that the person was male and doing housework on a Wednesday afternoon/evening? (round to the nearest thousandth) Answer: 2/200 = 0.010
38) If one person is chosen randomly from the group, what is the probability that the person was either doing homework or outside employment on a Wednesday afternoon/evening? (round to the nearest thousandth) Answer: (77+70)/200 = 0.735 39) Compare the probability that a randomly chosen female student was engaged in recreation versus a male student on a Wednesday afternoon/evening. (Round to the nearest thousandth) Answer: (10/88) versus (11/112) = 0.114 versus 0.098 2 Identify Mutually Exclusive Events MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following table to answer the question. A random sample of college students was asked to respond to a survey about how they spend their free time on weekends. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Saturday morning. The activity choices were homework, housework, outside employment, recreation, or other.
1) Which of the following are mutually exclusive events? A) Student is male and student chose ʺhouseworkʺ as their most likely activity on Saturday mornings. B) Student is female and student chose ʺhouseworkʺ as their most likely activity on Saturday mornings. C) Student is male and student chose ʺoutside employmentʺ as their most likely activity on Saturday mornings. D) Student chose ʺrecreationʺ and student chose ʺotherʺ as their most likely activity on Saturday mornings. Answer: D 2) Which of the following are mutually exclusive events? A) Student is female and student chose ʺhouseworkʺ as their most likely activity on Saturday mornings. B) Student is male and student chose ʺhouseworkʺ as their most likely activity on Saturday mornings. C) Student is female and student chose ʺoutside employmentʺ as their most likely activity on Saturday mornings. D) Student chose ʺoutside employmentʺ and student chose ʺotherʺ as their most likely activity on Saturday mornings. Answer: D Determine whether the events are mutually exclusive. 3) The number of hours needed by sixth grade students to complete a research project was recorded. A student is selected at random. The events A and B are defined as follows. A = event the student took at most 9 hours B = event the student took at least 9 hours Are the events A and B mutually exclusive? A) Yes
B) No
Answer: B
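Two events are mutually exclusive when no single outcome belongs to both. The sketch below applies that test to question 3; the helper name and the range of possible hour values are hypothetical and chosen only to make the check concrete.

```python
def mutually_exclusive(event_a, event_b, sample_space):
    """True when no outcome in sample_space satisfies both events."""
    return not any(event_a(x) and event_b(x) for x in sample_space)

hours = range(0, 41)  # assumed range of whole-hour completion times

def at_most_9(h):
    return h <= 9

def at_least_9(h):
    return h >= 9

shared = [h for h in hours if at_most_9(h) and at_least_9(h)]
result = mutually_exclusive(at_most_9, at_least_9, hours)
print(shared, result)  # [9] False
```

A student who took exactly 9 hours falls in both events, which is why the answer is "No".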
4) The age distribution of students at a community college is recorded. A student from the community college is selected at random. The events A and B are defined as follows. A = event the student is at most 32 B = event the student is at least 37 Are the events A and B mutually exclusive? A) Yes
B) No
Answer: A A random sample of adults was asked to respond to a survey about whether they use their cell phones to shop for a specific item. One question, summarized in the table below reflecting probabilities, asked each respondent to choose whether or not they use a cell phone app to shop. The major age groupings were under 40 or 40 years or older.
5) Which of the following are mutually exclusive events in this study? A) Adult is over 40 years of age and uses a cell phone app for shopping. B) Adult is over 40 years of age and does not use a cell phone app for shopping. C) Adult is under 40 years of age and does not use a cell phone app for shopping. D) None of these choices. Answer: D
Solve the problem.
6) A study asks a sample of adults whether they prefer cats or dogs or neither. Only one answer is allowed. The survey also records whether each respondent is male or female. Which of the following are mutually exclusive events in this study? A) Adult is male and adult prefers neither cats nor dogs. B) Adult is female and likes dogs. C) Adult is male and likes cats and dogs. D) Adult is female and prefers cats. Answer: C
7) Which of the following are mutually exclusive events? A) A car buyer is female and a car buyer chose "looks" as their most important factor for their purchase. B) A car buyer is male and a car buyer chose "manufacturer reputation" as their most important factor for their purchase. C) A car buyer is female and a car buyer chose "fuel efficiency" as their most important factor for their purchase. D) A car buyer chose "fuel efficiency" and "other" as their most important factor for their purchase. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A random sample of college students was asked to respond to a survey about how they spend their free time on a week night. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Wednesday afternoon/evening. The activity choices were homework, housework, outside employment, recreation, or other.
8) Using this example, state two events that are mutually exclusive. Answer: Various. Any combination of the events ʺstudent doing homework – type xʺ and ʺstudent engaged in outside employment – type yʺ.
5.3 Associations in Categorical Variables 1 Find Conditional Probabilities MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following table to answer the question. A random sample of college students was asked to respond to a survey about how they spend their free time on weekends. One question, summarized in the table below, asked each respondent to choose the one activity that they are most likely to participate in on a Saturday morning. The activity choices were homework, housework, outside employment, recreation, or other.
1) Find the probability that a female college student from the group chose "housework" as their most likely activity on Saturday mornings. (Round to the nearest thousandth.) A) 0.531 B) 0.520 C) 0.163 D) None of these Answer: C
2) Find the probability that a male college student from the group chose "housework" as their most likely activity on Saturday mornings. (Round to the nearest thousandth.) A) 0.302 B) 0.156 C) 0.145 D) None of these Answer: A
Find the conditional probability.
3) The table below describes the smoking habits of a group of asthma sufferers.
         Nonsmoker   Light smoker   Heavy smoker   Total
Men         336           69             68         473
Women       349           63             62         474
Total       685          132            130         947
If one of the 947 subjects is randomly selected, find the probability that the person chosen is a nonsmoker given that the person is a woman. A) 0.736 B) 0.369 C) 0.509 D) 0.501 Answer: A
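The conditional probability in question 3 restricts attention to the "Women" row and divides by that row's total. A minimal sketch using the counts from the table follows; the dictionary layout and function name are illustrative.

```python
# Counts from the question 3 table (asthma sufferers by sex and smoking habit).
table = {
    "Men":   {"Nonsmoker": 336, "Light smoker": 69, "Heavy smoker": 68},
    "Women": {"Nonsmoker": 349, "Light smoker": 63, "Heavy smoker": 62},
}

def p_column_given_row(table, column, row):
    """P(column category | row category) = cell count / row total."""
    row_total = sum(table[row].values())
    return table[row][column] / row_total

p_nonsmoker_given_woman = p_column_given_row(table, "Nonsmoker", "Women")
print(round(p_nonsmoker_given_woman, 3))  # 0.736
```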
4) The table below describes the smoking habits of a group of asthma sufferers.
         Nonsmoker   Light smoker   Heavy smoker   Total
Men         315           65             64         444
Women       374           60             74         508
Total       689          125            138         952
If one of the 952 subjects is randomly selected, find the probability that the person chosen is a woman given that the person is a light smoker. A) 0.48 B) 0.063 C) 0.118 D) 0.131 Answer: A
Suppose that a recent poll of American households about pet ownership found that for households with pets, 45% owned a dog, 34% owned a cat, and 10% owned a bird. Suppose that three households are selected randomly and with replacement and the ownership is mutually exclusive.
5) What is the probability that all three randomly selected households own a dog? (Round to the nearest hundredth) A) 0.91 B) 0.09 C) 0.80 D) 0.04 Answer: B
6) What is the probability that none of the three randomly selected households own a cat? (Round to the nearest hundredth) A) 0.17 B) 0.96 C) 0.71 D) 0.29 Answer: D
7) What is the probability that at least one of the three randomly selected households own a bird? (Round to the nearest hundredth) A) 0.08 B) 0.92 C) 0.73 D) 0.27 Answer: D
8) What is the probability that at least two of the three randomly selected households own either a cat or a dog? (Round to the nearest hundredth) A) 0.97 B) 0.89 C) 0.06 D) 0.94 Answer: B
Suppose that a recent poll of American households about car ownership found that for households with a car, 39% owned a sedan, 33% owned a van, and 7% owned a sports car. Suppose that three households are selected randomly and with replacement.
9) What is the probability that all three randomly selected households own a van? (Round to the nearest thousandth) A) 0.059 B) 0.036 C) 0.003 D) 0.964 Answer: B
10) What is the probability that none of the three randomly selected households own a van? (Round to the nearest thousandth) A) 0.699 B) 0.060 C) 0.036 D) 0.301 Answer: D
11) What is the probability that at least one of the three randomly selected households own a sports car? (Round to the nearest thousandth) A) 0.200 B) 0.800 C) 0.627 D) 0.003 Answer: A
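The pet ownership questions (5 through 8) follow from independent selections with a fixed probability per household. The sketch below shows one way to compute them; the function names are hypothetical, and the "cat or dog" probability of 0.79 assumes ownership is mutually exclusive, as the prompt states.

```python
from math import comb

def all_of(p, n):
    """P(all n independent selections have the property)."""
    return p ** n

def none_of(p, n):
    """P(no selection has the property)."""
    return (1 - p) ** n

def at_least(k, p, n):
    """P(at least k of n selections have the property), binomial sum."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

p_all_dog = all_of(0.45, 3)                        # question 5
p_no_cat = none_of(0.34, 3)                        # question 6
p_some_bird = at_least(1, 0.10, 3)                 # question 7
p_two_cat_or_dog = at_least(2, 0.34 + 0.45, 3)     # question 8
print(round(p_all_dog, 2), round(p_no_cat, 2),
      round(p_some_bird, 2), round(p_two_cat_or_dog, 2))
```

The same helpers apply to the car ownership questions by substituting 0.33 for vans or 0.07 for sports cars.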
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Suppose that single people over the age of thirty-five were asked about their living arrangements in a recent poll. The poll found that 34% rented a house or apartment, 21% owned a house, and 17% owned a condominium. Suppose that four single people are selected randomly and with replacement.
12) What is the probability that all four people rent a house or apartment? Show your work and round to the nearest thousandth. Answer: 0.013
13) What is the probability that none of the four randomly selected people rent a house or apartment? Show your work and round to the nearest thousandth. Answer: 0.190
14) What is the probability that at least one of the four randomly selected people rents a house or apartment? Show your work and round to the nearest thousandth. Answer: 0.810
Solve the problem.
15) A multiple choice quiz contains five questions. Each question has four answer choices. Michael is not prepared for the quiz and decides to guess for each question. What is the probability that Michael will get at least one question correct? What is the probability that Michael will get all five questions correct? Show your work and round to the nearest thousandth. Answer: 0.763, 0.001
2 Determine if Events are Independent or Associated
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Use your intuition to decide whether the following events are likely to be independent or associated. Event A: A randomly selected person is married with no children. Event B: A randomly selected person opposes a tax credit for children. A) Associated B) Independent Answer: A
2) Use your intuition to decide whether the following events are likely to be independent or associated. Event A: The randomly selected carton of milk you purchased from the store is sour. Event B: Your car won't start on a randomly selected morning. A) Associated B) Independent Answer: B
3) Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: You reach into your dark closet, without looking, and pull out a black shirt. Event B: You reach into your sock drawer, without looking, and pull out black socks. A) Associated B) Independent Answer: B
4) Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: A randomly selected registered voter's political party affiliation is Republican. Event B: A randomly selected registered voter opposes a new tax on fuel. A) Associated B) Independent Answer: A
5) Classify the events as independent or not independent: Events A and B where the probability of event A occurring is 0.9, the probability of event B occurring is 0.5, and the probability of both events occurring is 0.45. A) independent B) not independent Answer: A
6) Classify the events as independent or not independent: Events A and B where the probability of event A occurring is 0.1, the probability of event B occurring is 0.9, and the probability of both events occurring is 0.08. A) not independent B) independent Answer: A
7) Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: You roll a number larger than four on a die. Event B: Rolling a six on a die. A) Associated B) Independent Answer: A
8) Use your intuition to decide whether the following two events are likely to be independent or associated. Event A: Drawing a club from a deck of cards. Event B: Drawing a card with a black symbol from a deck of cards. A) Associated B) Independent Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
9) Suppose you would like a mug of hot chocolate with cinnamon. You reach into the kitchen cupboard containing twenty mixed up mismatched mugs without looking and pull out a pink coffee cup. You also reach into a kitchen drawer containing 30 different mixed up spice jars without looking and pull out the cinnamon. Use your intuition and state whether these two events are associated or independent. Explain. Answer: These two events are most likely independent.
10) Use your intuition and state whether these two events are likely to be associated or independent. Explain. Event A: A randomly selected adult is a pet owner. Event B: A randomly selected adult responds favorably to the survey question "Should a portion of the beach be set aside as an (unleashed) dog beach?". Answer: These two events are most likely associated.
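Questions 5 and 6 can be settled numerically: events A and B are independent exactly when P(A and B) = P(A) × P(B). The sketch below applies that test; the function name and tolerance are illustrative choices.

```python
import math

def is_independent(p_a, p_b, p_both, tol=1e-9):
    """True when P(A and B) equals P(A) * P(B), up to floating-point tolerance."""
    return math.isclose(p_a * p_b, p_both, abs_tol=tol)

q5 = is_independent(0.9, 0.5, 0.45)  # 0.9 * 0.5 = 0.45, matches
q6 = is_independent(0.1, 0.9, 0.08)  # 0.1 * 0.9 = 0.09, does not match 0.08
print(q5, q6)  # True False
```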
3 Find the Probabilities of Sequences of Events MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Suppose that a recent poll of American households about pet ownership found that for households with one pet, 39% owned a dog, 33% owned a cat, and 7% owned a bird. Suppose that three households are selected randomly and with replacement. 1) What is the probability that all three randomly selected households own a dog? (Round to the nearest hundredth) A) 0.39 B) 0.06 C) 0.23 D) None of these Answer: B 2) What is the probability that none of the three randomly selected households own a cat? (Round to the nearest hundredth) A) 0.70 B) 0.46 C) 0.30 D) None of these Answer: C
Page 17 Copyright © 2020 Pearson Education, Inc.
3) What is the probability that all three randomly selected households own a cat? (Round to the nearest hundredth) A) 0.06 B) 0.03 C) 0.04 D) None of these Answer: C
4) What is the probability that none of the three randomly selected households own a bird? (Round to the nearest hundredth) A) 0.80 B) 0.93 C) 0.86 D) None of these Answer: A
5) What is the probability that at least one of the three randomly selected households own a dog? (Round to the nearest hundredth) A) 0.27 B) 0.77 C) 0.61 D) 0.70 Answer: B
6) What is the probability that at least one of the three randomly selected households own a bird? (Round to the nearest hundredth) A) 0.20 B) 0.40 C) 0.60 D) 0.80 Answer: A
Solve the problem.
7) A true/false pop quiz contains five questions. What is the probability that when guessing, a student will get at least one question correct? (Round to the nearest hundredth) A) 0.50 B) 0.97 C) 0.76 D) 1.00 Answer: B
8) A true/false pop quiz contains seven questions. What is the probability that when guessing, a student will get at least one question correct? (Round to the nearest hundredth) A) 0.50 B) 0.97 C) 0.99 D) 1.00 Answer: C
Find the indicated probability.
9) In one town, 45% of all voters are Democrats. If two voters are randomly selected for a survey, find the probability that they are both Democrats. Round to the nearest thousandth if necessary. A) 0.203 B) 0.900 C) 0.198 D) 0.450 Answer: A
10) Find the probability of correctly answering the first 4 questions on a multiple choice test if random guesses are made and each question has 6 possible answers. A) 1/1296 B) 2/3 C) 3/2 D) 1/4096 Answer: A
11) In one town, 73% of adults have health insurance. What is the probability that 8 adults selected at random from the town all have health insurance? Round to the nearest thousandth if necessary. A) 0.081 B) 5.84 C) 0.11 D) 0.73 Answer: A
12) In a homicide case 8 different witnesses picked the same man from a line up. The line up contained 5 men. If the identifications were made by random guesses, find the probability that all 8 witnesses would pick the same person. A) 0.0000128 B) 0.0000026 C) 0.0000305 D) 1.6 Answer: A
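The sequence-of-events questions above all rest on multiplying probabilities of independent selections and, for "at least one", on taking a complement. The following is a minimal sketch of that arithmetic; the helper names `prob_all`, `prob_none`, and `prob_at_least_one` are hypothetical and introduced only for illustration.

```python
def prob_all(p: float, n: int) -> float:
    """Probability that an event with probability p occurs on all n independent trials."""
    return p ** n


def prob_none(p: float, n: int) -> float:
    """Probability that the event occurs on none of the n independent trials."""
    return (1 - p) ** n


def prob_at_least_one(p: float, n: int) -> float:
    """Complement rule: 1 minus the probability of no occurrences."""
    return 1 - prob_none(p, n)


# Pet poll: 39% dog, 33% cat, 7% bird; three households drawn with replacement.
print(round(prob_all(0.39, 3), 2))           # 0.06 (all three own a dog)
print(round(prob_none(0.33, 3), 2))          # 0.3  (none own a cat)
print(round(prob_at_least_one(0.39, 3), 2))  # 0.77 (at least one owns a dog)

# True/false quiz with five questions, guessing each with probability 1/2.
print(round(prob_at_least_one(0.5, 5), 2))   # 0.97

# Line-up: any one of the 5 men could be the shared pick of all 8 witnesses.
print(5 * prob_all(1 / 5, 8))
```

The last line multiplies by 5 because the question asks for all witnesses agreeing on the same person, not on one particular person named in advance.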
5.4 Finding Empirical Probabilities 1 Use Simulations to Compute Empirical Probabilities
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) If 20 babies are born, how often are there 8 or fewer male babies? Assume that the gender of a baby is a random outcome. Which of the following experiments would not simulate this situation? A) Flip a coin twenty times. Designate a head to mean "female" and a tail to mean "male." B) Roll a die twenty times. Designate a 1, 2, or 3 to mean "female" and a 4, 5, or 6 to mean "male." C) Choose the first twenty digits from a row in the random number table. Designate odd numbers to mean "female" and even numbers to mean "male." D) All of these will simulate the gender of twenty babies. Answer: D
2) A fair coin is tossed 1000 times. What can you say about getting the outcome of exactly 500 tails? a. Getting 500 tails is no more likely than getting any other number of tails in 1000 tosses. b. You will sometimes get exactly 500 tails in 1000 tosses, but should not be surprised to see fewer or more heads. c. You should expect between 400 and 600 tails in 1000 tosses. d. You should expect outcomes to change since the coin may be wearing. A) b B) a C) c D) d Answer: A
3) Swinging Sammy Skor's batting prowess was simulated to get an estimate of the probability that Sammy will get a hit in 42 tries. Let 1 = HIT and 0 = OUT. The output from a simulation was as follows. 1 0 0 0 1 0 0 1 0 0 1 1 1 0 0 0 0 1 1 1 1 0 0 0 0 1 1 0 0 1 1 1 1 0 0 0 0 0 1 1 1 1 Estimate the probability that he gets a hit. A) 0.476 B) 0.452 C) 0.301 D) 0.286
Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 4) A student flips a coin ten times and observes the outcome of heads three times. He states that his coin must not be fair because so few heads were observed. Explain why his results do not indicate that he has an unfair coin using the Law of Large Numbers and how it justifies the results that were observed. Answer: The LLN states that if an experiment with a random outcome is repeated a large number of times, the empirical probability is likely to be close to the true (theoretical) probability. A pattern of heads in the short run (like ten trials) can be highly variable, but the more trials that are conducted the closer the empirical probability will tend to approach the theoretical probability.
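The empirical probability in question 3 and the Law of Large Numbers argument in question 4 can both be illustrated with a short Python sketch. The hit/out sequence is copied from question 3; the coin-flip helper `empirical_heads`, the seed, and the chosen numbers of flips are assumptions made only for this illustration.

```python
import random

# Simulation output from question 3 (1 = HIT, 0 = OUT).
outcomes = [int(x) for x in (
    "1 0 0 0 1 0 0 1 0 0 1 1 1 0 0 0 0 1 1 1 1 "
    "0 0 0 0 1 1 0 0 1 1 1 1 0 0 0 0 0 1 1 1 1"
).split()]

# Empirical probability = number of hits / number of tries.
hit_rate = sum(outcomes) / len(outcomes)
print(sum(outcomes), len(outcomes), round(hit_rate, 3))  # 20 42 0.476


def empirical_heads(n_flips: int, rng: random.Random) -> float:
    """Proportion of heads in n_flips simulated tosses of a fair coin."""
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips


# Short runs vary a lot; long runs tend to settle near the theoretical 0.5.
rng = random.Random(0)  # arbitrary seed, assumed here for repeatability
for n in (10, 100, 10_000):
    print(n, empirical_heads(n, rng))
```

With only ten flips, a result such as three heads is well within ordinary variation, which is the point of the answer to question 4.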
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
5) A multiple-choice test has 50 questions. Each question has four choices, but only one choice is correct. If a random number table is used, which of the following methods is a valid simulation of a student who circles his choices randomly? Explain. (Note: there might be more than one valid method.)
a. Each of 8 digits represents the student's attempt on one question. The digits 1 and 8 represent a correct choice; 2, 3, 4, 5, 6 and 7 represent an incorrect choice.
b. The digits 0, 2, 4, and 6 represent the student's attempt on one question. All other digits are ignored. The 0 represents the correct choice, the 2, 4 and 6 represent incorrect choices.
c. The digits 1, 2, 3, and 4 represent the student's attempt on one question. All other digits are ignored. The 4 represents the correct choice, the 1, 2 and 3 represent incorrect choices.
A) All are valid methods because the probability of a correct choice is 1/4.
B) Choices a and b are valid methods because the probability of a correct choice is 1/4. Choice c is invalid because the probability of a correct choice is 1/3.
C) Choices b and c are valid methods because the probability of a correct choice is 1/4. Choice a is invalid because the probability of a correct choice is 1/3.
D) Choice b is a valid method because the probability of a correct choice is 1/4. Choices a and c are invalid because the probability of a correct choice is 1/3.
Answer: A
6) A true/false test has 40 questions. Each question has two choices (true or false), and only one choice is correct. Which of the following methods is a valid simulation of a student who guesses randomly on each question? Explain. (Note: there might be more than one valid method.)
a. Forty digits are selected using a row from a random number table. Each digit represents one question on the test. If the number is odd, the answer is correct. If the number is even, the answer is incorrect.
b. A die is rolled 40 times. Each roll represents one question on the test. If the die lands on an even number, the answer is correct. If the die lands on an odd number, the answer is incorrect.
c. A die is rolled 40 times. Each roll represents one question on the test. If the die lands on a 1 or 6, the answer is correct; otherwise the answer is incorrect.
A) Both a and b are valid methods because the probability of a correct choice is 1/2. Choice c is invalid because the probability of a correct choice is 1/3.
B) Both a and c are valid methods because the probability of a correct choice is 1/2. Choice b is invalid because the probability of a correct choice is 1/3.
C) All are valid methods because the probability of a correct choice is 1/2.
D) All are valid methods because the probability of a correct choice is 1/3.
Answer: A
7) A multiple-choice test has 100 questions. Each question has five choices, but only one choice is correct. If a random number table is used, which of the following methods is a valid simulation of a student who circles her choices randomly? Explain. (Note: there might be more than one valid method.)
a. Each of 5 digits represents the student's attempt on one question. The digit 5 represents a correct choice; 1, 2, 3, and 4 represent an incorrect choice.
b. The digits 1, 3, and 5 represent the student's attempt on one question. All other digits are ignored. The 1 represents the correct choice, the 3 and 5 represent incorrect choices.
c. The digits 0, 1, 2, 3, and 4 represent the student's attempt on one question. All other digits are ignored. The 0 represents the correct choice, the 1, 2, 3 and 4 represent incorrect choices.
A) Choices a and c are valid because the probability of a correct choice is 1/5, but choice b is invalid because the probability of a correct choice is 1/3.
B) Choices a and b are valid because the probability of a correct choice is 1/5, but choice c is invalid because the probability of a correct choice is 1/4.
C) Choice c is valid because the probability of a correct choice is 1/5, but choices a and b are invalid because the probability of a correct choice is 1/4.
D) All are valid methods because the probability of a correct choice is 1/5.
Answer: A
8) A yes/no test has 75 questions. Each question has two choices (yes or no), and only one choice is correct. Which of the following methods is a valid simulation of a student who guesses randomly on each question? Explain. (Note: there might be more than one valid method.)
a. A die is rolled 75 times. Each roll represents one question on the test. If the die lands on a 1, 2, 3, or 4, the answer is correct; otherwise the answer is incorrect.
b. A die is rolled 75 times. Each roll represents one question on the test. If the die lands on an odd number, the answer is correct. If the die lands on an even number, the answer is incorrect.
c. Using a row from a random number table, 75 digits are selected. Each digit represents one question on the test. If the number is even, the answer is correct. If the number is odd, the answer is incorrect.
A) Choices b and c are valid methods because the probability of a correct choice is 1/2, but choice a is an invalid method because the probability of a correct choice is 2/3.
B) Choices a and c are valid methods because the probability of a correct choice is 1/4, but choice a is an invalid method because the probability of a correct choice is 3/4.
C) All are valid methods because the probability of a correct choice is 1/2.
D) All are valid methods because the probability of a correct choice is 1/4.
Answer: A
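Questions 5 through 8 all come down to one check: among the equally likely outcomes a scheme actually uses, what fraction counts as a correct answer, and does that fraction match the chance of guessing correctly? The sketch below illustrates that check for questions 7 and 8. The helper `prob_correct` is hypothetical, and reading "each of 5 digits" in question 7a as the digits 1 through 5 is an interpretation, not something the test bank states explicitly.

```python
from fractions import Fraction


def prob_correct(used, correct) -> Fraction:
    """Chance that a kept outcome is scored as correct.

    used:    equally likely outcomes the scheme keeps (ignored ones are left out)
    correct: the subset of kept outcomes that is scored as a correct answer
    """
    used, correct = set(used), set(correct)
    if not correct <= used:
        raise ValueError("correct outcomes must be among the used outcomes")
    return Fraction(len(correct), len(used))


# Question 7: five answer choices, so a valid scheme needs probability 1/5.
q7 = {
    "a": prob_correct(range(1, 6), {5}),   # digits 1-5 (assumed reading), 5 is correct
    "b": prob_correct({1, 3, 5}, {1}),     # only three digits are kept
    "c": prob_correct(range(0, 5), {0}),   # digits 0-4, 0 is correct
}
for name, p in q7.items():
    print(name, p, "valid" if p == Fraction(1, 5) else "invalid")

# Question 8: yes/no questions, so a valid scheme needs probability 1/2.
q8 = {
    "a": prob_correct(range(1, 7), {1, 2, 3, 4}),     # die: 1-4 correct
    "b": prob_correct(range(1, 7), {1, 3, 5}),        # die: odd correct
    "c": prob_correct(range(0, 10), {0, 2, 4, 6, 8}), # digit: even correct
}
for name, p in q8.items():
    print(name, p, "valid" if p == Fraction(1, 2) else "invalid")
```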
Ch. 5 Modeling Variation with Probability Answer Key 5.1 What is Randomness? 1 Use Simulations to Calculate Probabilities 1) D 2) D 3) a. 8 7 9 6 4 4 3 7 5 1 b. T T T T H H H T T H c. Longest streak was 3 heads. d. 40% of the flips were heads. 4) a. 2 7 5 8 3 0 1 8 6 6 5 8 2 5 0 3 8 1 0 3 3 5 8 2 5 b. C P P C P C P C C C P C C P C P C P C P P P C C P c. 52% were assigned to the computer group. d. The digits 0, 1, 2, 3, 4 could have assigned participants to Computer, and the digits 5, 6, 7, 8, 9 could have assigned participants to Paper. Other solutions are possible. 2 Identify Empirical and Theoretical Probabilities 1) A 2) B 3) B 4) A 5) A 6) A 7) B 8) A 9) B 10) B 11) An empirical probability is short-run relative frequency based on an experiment. A theoretical probability is a long-run relative frequency of an event after infinitely many trials. 12) This is a theoretical probability because it is not based on a short -run experiment. 13) The probability of each independent event is 1/2 for P(tail) and 1/6 for P(1). The probability of both events together would be 1/2 * 1/6 = 1/12 or 8.3%. This is a theoretical value. An experiment can be designed to throw each coin and die separately and record the results for 10 flips and rolls and compare to the theoretical value. This will be the empirical probability.
5.2 Finding Theoretical Probabilities 1 Find Theoretical Probabilities 1) D 2) A 3) C 4) C 5) B 6) B 7) A 8) C 9) B 10) A 11) A 12) A 13) C 14) A 15) A 16) A 17) A 18) A
19) A 20) C 21) A 22) A 23) B 24) B 25) B 26) C 27) C 28) B 29) B 30) D 31) C 32) 1/7 + 1/7 = 0.286 33) Complement: Isaac will work Saturday and will work Sunday (not(A or B) = (not A AND not B)); 0.714 34) 1/7 * 1/7 = 0.020 35) 77/200 = 0.385 36) 30/200 = 0.150 37) 2/200 = 0.010 38) (77+70)/200 = 0.735 39) (10/88) versus (11/112) = 0.114 versus 0.098 2 Identify Mutually Exclusive Events 1) D 2) D 3) B 4) A 5) D 6) C 7) D 8) Various. Any combination of the events ʺstudent doing homework – type xʺ and ʺstudent engaged in outside employment – type yʺ.
5.3 Associations in Categorical Variables 1 Find Conditional Probabilities 1) C 2) A 3) A 4) A 5) B 6) D 7) D 8) B 9) B 10) D 11) A 12) 0.013 13) 0.190 14) 0.810 15) 0.763, 0.001 2 Determine if Events are Independent or Associated 1) A 2) B 3) B 4) A 5) A
6) A 7) A 8) A 9) These two events are most likely independent. 10) These two events are most likely associated. 3 Find the Probabilities of Sequences of Events 1) B 2) C 3) C 4) A 5) B 6) A 7) B 8) C 9) A 10) A 11) A 12) A
5.4 Finding Empirical Probabilities 1 Use Simulations to Compute Empirical Probabilities 1) D 2) A 3) A 4) The LLN states that if an experiment with a random outcome is repeated a large number of times, the empirical probability is likely to be close to the true (theoretical) probability. A pattern of heads in the short run (like ten trials) can be highly variable, but the more trials that are conducted the closer the empirical probability will tend to approach the theoretical probability. 5) A 6) A 7) A 8) A
Ch. 6 Modeling Random Events: The Normal and Binomial Models
6.1 Probability Distributions Are Models of Random Experiments 1 Determine if a Variable is Continuous or Discrete
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Determine whether the variable would best be modeled as continuous or discrete: The temperature of a greenhouse at a certain time of the day. A) Continuous B) Discrete Answer: A
2) Determine whether the variable would best be modeled as continuous or discrete: The number of tomatoes harvested each week from a greenhouse tomato plant. A) Continuous B) Discrete Answer: B
3) Determine whether the variable would best be modeled as continuous or discrete: The number of cups dispensed from a beverage vending machine during a 24-hour period. A) Continuous B) Discrete Answer: B
4) Determine whether the variable would best be modeled as continuous or discrete: The temperature of a cup of coffee dispensed from a beverage vending machine, taken four times during a 24-hour period. A) Continuous B) Discrete Answer: A
Provide an appropriate response.
5) The peak shopping time at a home improvement store is between 8:00 am and 11:00 am on Saturday mornings. Management at the home improvement store randomly selected 95 customers last Saturday morning and decided to observe their shopping habits. They recorded the number of items that each of the customers purchased as well as the total time the customers spent in the store. Identify the types of variables recorded by the home improvement store. A) number of items - discrete; total time - continuous B) number of items - continuous; total time - continuous C) number of items - continuous; total time - discrete D) number of items - discrete; total time - discrete Answer: A
Solve the problem.
6) Determine whether the variable would best be modeled as continuous or discrete: The number of tails when flipping ten coins. A) Continuous B) Discrete Answer: B
7) Determine whether the variable would best be modeled as continuous or discrete: The amount of time it takes students to get to school. A) Continuous B) Discrete Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 8) Explain the difference between a discrete random variable and a continuous random variable and give an example of each. Answer: A discrete random variable has a numerical outcome that can be listed or counted. A continuous random variable occurs over an infinite range of values and cannot be listed or counted. Examples will vary. 2 Create a Probability Density Function Table or Graph MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A box containing recipes from five categories is dropped so that the recipe cards are thoroughly mixed up. The following table shows the possible categories and the associated probability for a recipe randomly chosen. Does the table represent a probability distribution?
A) Yes B) No C) Canʹt be determined with the given information Answer: A 2) An MP3 playlist, containing several songs from five genres, is set to shuffle. The following table shows the genre and the associated probability for the first song played. Does the table represent a probability distribution?
A) Yes B) No C) Canʹt be determined with the given information Answer: B
Determine whether the following is a probability distribution. If not, identify the requirement that is not satisfied.
3) If a person is randomly selected from a certain town, the probability distribution for the number, x, of siblings is as described in the accompanying table.
x:    0     1     2     3     4     5
P(x): 0.28  0.24  0.23  0.15  0.06  0.03
A) No B) Yes C) Can't be determined with the given information Answer: A
4) In a certain town, 50% of adults have a college degree. The accompanying table describes the probability distribution for the number of adults (among 4 randomly selected adults) who have a college degree.
x:    0       1       2       3       4
P(x): 0.0625  0.2500  0.3750  0.2500  0.0625
A) Yes B) No C) Can't be determined with the given information Answer: A
3 Find Probabilities for Discrete-Valued Outcomes
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response.
1) Students at a local high school were asked how many living grandparents they have and the results are shown below. Find the probability of each number of living grandparents for a randomly selected student in the school, and record the results in table form. Be sure the total of all the probabilities is 1.
Number of living grandparents: 0   1   2    3    4
Frequency:                     40  89  151  204  154
A) Grandparents: 0, 1, 2, 3, 4; Probability: 0.063, 0.139, 0.237, 0.320, 0.241
B) Grandparents: 0, 1, 2, 3, 4; Probability: 0.2, 0.2, 0.2, 0.2, 0.2
C) Grandparents: 1, 2, 3, 4; Probability: 0.149, 0.253, 0.341, 0.258
D) Grandparents: 0, 1, 2, 3, 4; Probability: 0.071, 0.155, 0.237, 0.304, 0.234
Answer: A
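The grandparents question converts a frequency table into probabilities, and the two preceding questions check whether a table of probabilities is a valid distribution. Both steps can be sketched as follows; the function names `relative_frequencies` and `is_probability_distribution` are hypothetical helpers introduced only for this illustration.

```python
def relative_frequencies(freq: dict) -> dict:
    """Divide each frequency by the grand total to get a probability per outcome."""
    total = sum(freq.values())
    return {value: count / total for value, count in freq.items()}


def is_probability_distribution(probs, tol: float = 1e-9) -> bool:
    """Each probability must lie in [0, 1] and the probabilities must sum to 1."""
    probs = list(probs)
    return all(0 <= p <= 1 for p in probs) and abs(sum(probs) - 1) <= tol


# Living grandparents: frequencies copied from question 1 (638 students in total).
grandparents = {0: 40, 1: 89, 2: 151, 3: 204, 4: 154}
probs = relative_frequencies(grandparents)
print({k: round(p, 3) for k, p in probs.items()})
# {0: 0.063, 1: 0.139, 2: 0.237, 3: 0.32, 4: 0.241}
print(is_probability_distribution(probs.values()))  # True

# Sibling table: the listed probabilities sum to 0.99, so the check fails.
print(is_probability_distribution([0.28, 0.24, 0.23, 0.15, 0.06, 0.03]))  # False

# College-degree table: the probabilities sum to exactly 1.
print(is_probability_distribution([0.0625, 0.25, 0.375, 0.25, 0.0625]))  # True
```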
2) Students at a local middle school were asked how many siblings they have and the results are shown below. Find the probability of each number of siblings for a randomly selected student in the school, and record the results in table form. Be sure the total of all the probabilities is 1.
Number of siblings: 0    1    2    3   4   5   6  7
Frequency:          184  259  125  57  20  10  7  2
A) Siblings: 0, 1, 2, 3, 4, 5, 6, 7; Probability: 0.277, 0.390, 0.188, 0.086, 0.030, 0.015, 0.011, 0.003
B) Siblings: 1, 2, 3, 4, 5, 6, 7; Probability: 0.540, 0.260, 0.119, 0.042, 0.021, 0.015, 0.004
C) Siblings: 0, 1, 2, 3, 4, 5, 6, 7; Probability: 0.292, 0.375, 0.200, 0.074, 0.030, 0.015, 0.011, 0.003
D) Siblings: 0, 1, 2, 3, 4, 5, 6, 7; Probability: 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125, 0.125
Answer: A
Solve the problem.
3) A summary of the types of movies showing in a small city, contains five genres. The following table shows the genres and the associated probability for that type of movie available on a particular weekend. The table is a probability distribution. If we chose a theater and movie to attend at random, what probability would we have of seeing anything other than a romantic or sci-fi movie?
A) 0.646 B) 0.677 C) 0.880 D) 0.323
Answer: B
A box contains recipes from five categories. The following table shows the possible categories and the associated probability for a recipe randomly chosen.
4) What is the probability of randomly selecting either an appetizer or salad category? Round to the nearest tenth of a percent. A) 3.7% B) 61.7% C) 3.8% D) 38.3% Answer: D 5) What is the probability of randomly selecting anything but a vegetable or main dish category? Round to the nearest tenth of a percent. A) 4.9% B) 48.6% C) 51.4% D) 5.1% Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 6) A suit of standard playing cards has thirteen cards. Suppose the suit of hearts is thoroughly shuffled and that you have the opportunity to play the following game: You win $10 if you choose the ten or queen of hearts, you lose $5 if you choose the jack or king of hearts, you win $2 if you choose the five or seven of hearts, and you lose $10 if you choose the ace of hearts. With any other outcome you will win or lose nothing. Complete the table that shows the probability distribution. Would it be sensible to play this game?
Answer: The probability column should contain the following values: 2/13, 2/13, 2/13, 1/13, 6/13. Your chances of winning money are slightly better than losing. 7) A researcher summarized the genres of movies showing in a particular city. The following table shows the possible genres and the associated probability for a movie randomly chosen. What type of distribution does this table represent? What are its characteristics? What is the probability of randomly selecting a movie genre other than drama or foreign? Round to the nearest tenth of a percent.
Answer: a. This is a probability distribution. b. Characteristics are that it must list all the possible outcomes and list all the associated probabilities. c. The probability of randomly selecting a movie genre other than drama or foreign is 1 - (0.421 + 0.093) = 0.486 or 48.6%.
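For the card game in question 6 above, the probability column from the answer can be combined with the payoffs to see how the game behaves on average. The expected-value calculation below is an added illustration, not part of the test bank's stated answer, and the variable names are assumptions.

```python
from fractions import Fraction

# Payoff in dollars -> probability, using the thirteen hearts from question 6.
game = {
    10: Fraction(2, 13),   # ten or queen
    -5: Fraction(2, 13),   # jack or king
    2: Fraction(2, 13),    # five or seven
    -10: Fraction(1, 13),  # ace
    0: Fraction(6, 13),    # any other card
}

# A probability distribution must account for every card exactly once.
print(sum(game.values()))  # 1

p_win = sum(p for payoff, p in game.items() if payoff > 0)
p_lose = sum(p for payoff, p in game.items() if payoff < 0)
print(p_win, p_lose)  # 4/13 3/13

# Expected value: sum of payoff * probability over all outcomes.
expected_value = sum(payoff * p for payoff, p in game.items())
print(expected_value)  # 4/13
```

Under these payoffs the chance of winning (4/13) is slightly higher than the chance of losing (3/13), which agrees with the answer given for question 6.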
4 Find Probabilities for Continuous-Valued Outcomes
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) At a course in public speaking, the instructor always gives an opening speech that lasts between fifteen and nineteen minutes. The length of the speech can be modeled by a uniform distribution, that is, the speech is just as likely to last fifteen minutes as it is to last nineteen minutes. The probability density curve is shown below. What is the probability that the speech will last at least seventeen minutes? What is the probability that the speech will last between fifteen and sixteen minutes?
A) Can't be determined with the given information B) 0.50; 0.75 C) 0.25; 0.50 D) 0.50; 0.25
Answer: D
2) At a course in public speaking, the instructor always gives an opening speech that lasts between fifteen and nineteen minutes. The length of the speech can be modeled by a uniform distribution, that is, the speech is just as likely to last fifteen minutes as it is to last nineteen minutes. The probability density curve is shown below. What is the probability that the speech will last sixteen minutes or more? What is the probability that the speech will last between eighteen and nineteen minutes?
A) Can't be determined with the given information B) 0.50; 0.75 C) 0.75; 0.25 D) 0.50; 0.25
Answer: C
Using the following probability density curve, answer the question.
3) What is the probability that the random variable has a value greater than 5? A) 0.375 B) 0.250 C) 0.500 D) 0.325
Answer: A
4) What is the probability that the random variable has a value less than 7? A) 0.875 B) 0.750 C) 0.625 D) 1.000
Answer: A
A package delivery service divides its packages into weight classes. Suppose that the packages in the 14-20 pound class are uniformly distributed, meaning that all weights within that class are equally likely to occur. The probability density curve is shown below.
5) Find the probability that a randomly selected package will weigh less than 18 pounds. A) 0.67 B) 0.80 C) 0.20 D) 0.30 Answer: A
6) Find the probability that a randomly selected package will weigh between 16 and 18 pounds. A) 0.80 B) 3.60 C) 0.33 D) 0.20 Answer: C
7) What is the probability that a package weighs less than 20 pounds? A) 0.80 B) 0.40 C) 1.00 D) 0.20 Answer: C
8) What is the probability that a package weighs between 14 and 18 pounds? A) 0.67 B) 0.33 C) 0.40 D) 0.80 Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. When exposed to heat, the reaction time of a certain chemical always occurs after thirteen minutes, but before seventeen minutes. Reaction times for this chemical can be modeled by a uniform distribution, that is, the reaction time is just as likely to occur at thirteen minutes as it is to occur at seventeen minutes.
9) Find the probability that the reaction will happen after fifteen minutes. Shade the appropriate area and calculate the numerical value of the probability.
Answer: 0.50, the right half of the distribution should be shaded.
10) Find the probability that the reaction time will occur after fourteen minutes, but before fifteen minutes. Shade the appropriate area and calculate the numerical value of the probability.
Answer: 0.25, the rectangle between 14 and 15 should be shaded.
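For a uniform distribution, each probability in the package and reaction-time questions is the width of the interval of interest divided by the full width of the distribution (the area of a rectangle under a flat density curve). A minimal sketch, with `uniform_prob` as a hypothetical helper name:

```python
def uniform_prob(low: float, high: float, a: float, b: float) -> float:
    """P(a <= X <= b) for X uniform on [low, high].

    The interval [a, b] is clipped to [low, high], since the density is zero outside it.
    """
    left, right = max(a, low), min(b, high)
    return max(0.0, right - left) / (high - low)


# Packages uniformly distributed between 14 and 20 pounds.
print(round(uniform_prob(14, 20, 14, 18), 2))  # 0.67 (less than 18 pounds)
print(round(uniform_prob(14, 20, 16, 18), 2))  # 0.33 (between 16 and 18 pounds)
print(uniform_prob(14, 20, 14, 20))            # 1.0  (less than 20 pounds)

# Reaction time uniformly distributed between 13 and 17 minutes.
print(uniform_prob(13, 17, 15, 17))  # 0.5  (after fifteen minutes)
print(uniform_prob(13, 17, 14, 15))  # 0.25 (between fourteen and fifteen minutes)
```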
6.2 The Normal Model 1 Understand the Empirical Rule and Normal Distributions
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information for the question. The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed.
1) Which of these statements is asking for a measurement (i.e., is an inverse normal question)? A) What percentage of people living and working in Kokomo have a travel time to work that is between thirteen and fifteen minutes? B) If 15% of people living and working in Kokomo have travel time to work that is below a certain number of minutes, how many minutes would that be? Answer: B
2) Which of these statements is asking for a probability? A) What percentage of people living and working in Kokomo have a travel time to work that is between thirteen and fifteen minutes? B) If 15% of people living and working in Kokomo have travel time to work that is below a certain number of minutes, how many minutes would that be? Answer: A
Use the empirical rule to solve the problem.
3) At one college, GPAs are normally distributed with a mean of 3 and a standard deviation of 0.6. What percentage of students at the college have a GPA between 2.4 and 3.6? A) 68% B) Almost all C) 84.13% D) 95% Answer: A
4) The amount of Jen's monthly phone bill is normally distributed with a mean of $65 and a standard deviation of $8. What percentage of her phone bills are between $41 and $89? A) Almost all B) 50% C) 68% D) 95% Answer: A
5) Solar energy is considered by many to be the energy of the future. A recent survey was taken to compare the cost of solar energy to the cost of gas or electric energy. Results of the survey revealed that the distribution of the amount of the monthly utility bill of a 3-bedroom house using gas or electric energy had a mean of $101 and a standard deviation of $10. If the distribution is normal, what percentage of homes will have a monthly utility bill of more than $91? A) approximately 84% B) approximately 95% C) approximately 16% D) approximately 32% Answer: A
A police radar gun is used to measure the speeds of cars on a highway. The speeds of cars are normally distributed with a mean of 55 mi/hr and a standard deviation of 5 mi/hr.
6) Roughly what percentage of cars are driving less than 45 mi/hr? (Round to the nearest tenth of a percent) A) 5.0% B) 95.0% C) 18.0% D) 2.5% Answer: D
7) Roughly what proportion of cars are driving between 60 and 70 mi/hr? (Round to the nearest thousandth) A) 0.157 B) 0.0843 C) 0.314 D) 0.628 Answer: A
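The empirical-rule answers above can be compared against exact normal-model areas using `statistics.NormalDist` from the Python standard library. This is a sketch for checking the arithmetic, not the method the test bank prescribes; the variable names are assumptions. The exact areas differ slightly from the rounded 68/95/99.7 figures (for example, about 2.3% rather than 2.5% below two standard deviations).

```python
from statistics import NormalDist

# Car speeds: mean 55 mi/hr, standard deviation 5 mi/hr.
speeds = NormalDist(mu=55, sigma=5)
below_45 = speeds.cdf(45)                          # area more than 2 SD below the mean
between_60_70 = speeds.cdf(70) - speeds.cdf(60)    # area between +1 SD and +3 SD
print(round(below_45, 3), round(between_60_70, 3))  # 0.023 0.157

# GPAs: mean 3, standard deviation 0.6; area within 1 SD of the mean.
gpas = NormalDist(mu=3, sigma=0.6)
print(round(gpas.cdf(3.6) - gpas.cdf(2.4), 2))      # 0.68

# Utility bills: mean $101, SD $10; area above $91 (1 SD below the mean).
bills = NormalDist(mu=101, sigma=10)
print(round(1 - bills.cdf(91), 2))                  # 0.84

# Inverse normal question (Kokomo travel times): the time below which 15% fall.
kokomo = NormalDist(mu=17, sigma=4.5)
minutes_15th_percentile = kokomo.inv_cdf(0.15)
print(round(minutes_15th_percentile, 1))            # 12.3
```

The last calculation shows the difference between the two Kokomo statements: `cdf` turns a measurement into a probability, while `inv_cdf` turns a probability into a measurement.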
Solve the problem. 8) Some statisticians use a guideline that says that events that happen 5% of the time or less often should be considered "unusual." By this standard, is it unusual to have a car speed of 75 mi/hr or greater? A) No, this would not be unusual because it is only 2 standard deviations higher than the mean. B) No, this would not be unusual because it is only 1 standard deviation higher than the mean. C) Yes, this would be unusual because it is 3 standard deviations higher than the mean. D) Yes, this would be unusual because it is 4 standard deviations higher than the mean. Answer: D The Normal distribution below models the distribution of highway gas consumption (in miles per gallon) for a large collection of SUVs when driving on level roads. The center is shown at 24.8 mpg and the standard deviations reflect the spread of the data.
9) Choose the most accurate statement. A) The top 5% of SUVs get 37.2 mpg or higher. B) The top 2.5% of SUVs get 37.2 mpg or higher. C) The bottom 32% of SUVs get 18.6 mpg or lower. D) The bottom 0.3% of SUVs get 6.2 mpg or lower.
Answer: B
10) What percent of SUVs consume 37.2 mpg or less? A) 90% B) 97.5% C) 95% D) 50%
Answer: B 11) Choose the most accurate statement. A) Approximately 95% of all SUVs have gas consumption between 12.4 and 37.2 mpg. B) Approximately 99.7% of all SUVs have gas consumption between 12.4 and 37.2 mpg. C) The top 25% of gas consumption of SUVs is greater than 31.0 mpg. D) The bottom 30% gas consumption of all SUVs is less than 18.6 mpg. Answer: A Solve the problem. 12) The distribution of gas consumption for SUVs driving on flat highways is approximately Normal with a center of 24.8 mpg and a standard deviation of 5 mpg. Which one of the following statements would explain a larger spread of the normal distribution? A) The standard deviation was 7.5 mpg. B) The standard deviation was 2.5 mpg. C) The center of the distribution is greater than 24.8 mpg. D) The center of the distribution is less than 24.8 mpg. Answer: A
13) The distribution of gas consumption for SUVs (in miles per gallon) is approximately normally distributed with a center of 24.8 mpg and a standard deviation of 5 mpg. A similar data set for a group of sedans is approximately normally distributed with a center of 28 mpg and a standard deviation of 4 mpg. Which one of the following statements is true when the distributions are compared on a number line? A) The normal distribution for the sedans is shifted to the right and not as spread out. B) The normal distribution for the SUVs is centered at the same value as for the sedans but spreads out further. C) The normal distribution for the sedans is shifted to the right and more spread out. D) The normal distribution of the SUVs is centered at the same value as for the sedans but does not spread out as far. Answer: A 14) A data set of gas consumption for trucks (in miles per gallon) is approximately normally distributed with a center of 24.8 mpg and a standard deviation of 5 mpg. Which one of the following statements would explain a smaller spread of the normal distribution? A) The standard deviation was 7.5 mpg. B) The standard deviation was 2.5 mpg. C) The center of the distribution would be greater than 24.8 mpg. D) The center of the distribution would be less than 24.8 mpg. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 15) The normal distribution below reflects a data set of gas consumption (in miles per gallon) for a group of SUVs. a. State the specifics of what is shown on the bell-shaped curve below, addressing the 68%, 95%, and 99.7% intervals and what they represent. b. Specify what the middle and the standard deviation of this distribution are and what they mean.
Answer: a. The intervals shown represent +/- 1 standard deviation, or the middle 68% (68% of values range from 18.6 to 31.0 mpg); +/- 2 standard deviations, or the middle 95% (95% of values range from 12.4 to 37.2 mpg); and +/- 3 standard deviations, or the middle 99.7% (99.7% of values range from 6.2 to 43.4 mpg). b. The normal distribution shown has a center (mean, median) of 24.8 mpg with a standard deviation of 6.2 mpg (representing the spread).
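The intervals in question 15 can be checked with a short Python sketch. It assumes a center of 24.8 mpg and a standard deviation of 6.2 mpg (inferred from the 18.6 to 31.0 mpg interval); the function name `interval` is illustrative only and is not part of the original material.

```python
from statistics import NormalDist

# Assumed values: center 24.8 mpg, standard deviation 6.2 mpg.
MEAN, SD = 24.8, 6.2
suv = NormalDist(mu=MEAN, sigma=SD)

def interval(k):
    """Return (low, high, share) for the interval MEAN +/- k standard deviations."""
    low, high = MEAN - k * SD, MEAN + k * SD
    share = suv.cdf(high) - suv.cdf(low)  # area under the curve between the bounds
    return low, high, share

for k in (1, 2, 3):
    low, high, share = interval(k)
    print(f"+/- {k} SD: {low:.1f} to {high:.1f} mpg, about {share:.1%} of values")
```

The computed shares are slightly more precise than the rounded 68%, 95%, and 99.7% figures of the Empirical Rule.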
16) A study examines commuting time for workers in a certain city. The distribution is clustered on the left with most of the values less than about 40 minutes commute. There are a fairly large number of values that are much higher, even stretching out to 150 minutes. There is a 15% chance that a random worker has a commute time of greater than 45 minutes. If you were to sketch this distribution, what would it look like, how would you describe it, and how do you represent the 15% probability? Answer: The distribution should be sketched as a right-skewed distribution, with most of the values clustered on the left side (zero to about 40 minutes) and a tail stretching out to the right, reaching 150 minutes on the far right. The 15% probability is the shaded area under the curve from a vertical line at 45 all the way to the right end. 2 Solve Applications of Normal Distributions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information for the question. Male players at the high school, college and professional ranks use a regulation basketball that weighs 22.0 ounces with a standard deviation of 1.0 ounce. Assume that the weights of basketballs are approximately normally distributed. 1) Roughly what percentage of regulation basketballs weigh less than 20.7 ounces? Round to the nearest tenth of a percent. A) 40.3% of the basketballs will weigh less than 20.7 ounces. B) 22.3% of the basketballs will weigh less than 20.7 ounces. C) 9.7% of the basketballs will weigh less than 20.7 ounces. D) 5.7% of the basketballs will weigh less than 20.7 ounces. Answer: C 2) Roughly what percentage of regulation basketballs weigh more than 23.1 ounces? Round to the nearest tenth of a percent. A) Roughly 15.1% of the basketballs will weigh more than 23.1 ounces. B) Roughly 42.3% of the basketballs will weigh more than 23.1 ounces. C) Roughly 36.4% of the basketballs will weigh more than 23.1 ounces. 
D) Roughly 13.6% of the basketballs will weigh more than 23.1 ounces. Answer: D 3) If a regulation basketball is randomly selected, what is the probability that it will weigh between 20.5 and 23.5 ounces? Round to the nearest thousandth. A) 0.866 B) 0.134 C) 0.267 D) 0.704 Answer: A 4) If a regulation basketball is randomly selected, what is the probability that it will weigh between 19.5 and 22.5 ounces? Round to the nearest thousandth. A) 0.547 B) 0.315 C) 0.685 D) 0.723 Answer: C 5) Would it be unusual to randomly select a regulation basketball and find that it weighs 17.9 ounces? A) Yes, this would be unusual. B) No, this would not be unusual. Answer: A 6) Would it be unusual to randomly select a regulation basketball and find that it weighs 23.75 ounces? A) Yes, this would be unusual. B) No, this would not be unusual. Answer: B
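For readers using technology instead of a table, the basketball questions above can be reproduced with a minimal Python sketch. The helper names are hypothetical, and the "unusual" rule (more than two standard deviations from the mean) is an assumed rule of thumb rather than a definition given in these questions.

```python
from statistics import NormalDist

# Regulation basketball weights: mean 22.0 oz, standard deviation 1.0 oz.
balls = NormalDist(mu=22.0, sigma=1.0)

def prob_below(x):
    """P(weight < x)."""
    return balls.cdf(x)

def prob_above(x):
    """P(weight > x)."""
    return 1 - balls.cdf(x)

def prob_between(a, b):
    """P(a < weight < b)."""
    return balls.cdf(b) - balls.cdf(a)

def is_unusual(x, cutoff=2.0):
    """Assumed rule of thumb: unusual if more than `cutoff` SDs from the mean."""
    z = (x - balls.mean) / balls.stdev
    return abs(z) > cutoff

print(f"P(weight < 20.7)        = {prob_below(20.7):.3f}")
print(f"P(weight > 23.1)        = {prob_above(23.1):.3f}")
print(f"P(20.5 < weight < 23.5) = {prob_between(20.5, 23.5):.3f}")
print(f"P(19.5 < weight < 22.5) = {prob_between(19.5, 22.5):.3f}")
print(f"17.9 oz unusual?  {is_unusual(17.9)}")
print(f"23.75 oz unusual? {is_unusual(23.75)}")
```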
Use the following information for the question. The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed. 7) Approximately what percentage of people living and working in Kokomo have a travel time to work of at least 20 minutes? Round to the nearest whole percent. A) 75% B) 25% C) 15% D) None of these. Answer: B 8) Approximately what percentage of people living and working in Kokomo have a travel time to work that is less than 15.5 minutes? Round to the nearest whole percent. A) 37% B) 63% C) 25% D) None of these. Answer: A Solve the problem. 9) Suppose that weights of cans of AJ's brand whipped cream have a population mean of 7.5 ounces and a population standard deviation of 0.27 ounces and are approximately normally distributed. Which of the following statements are correct? Choose the best statement.
A) Approximately 95% of all cans of AJ's whipped cream will weigh between 6.96 ounces and 8.04 ounces. B) The probability that a randomly selected can of AJ's whipped cream will weigh between 7.8 ounces and 8.3 ounces is approximately 0.131. C) Less than 1% of all cans of AJ's whipped cream will weigh more than 8.3 ounces. D) All of these statements are true. Answer: D
10) Suppose that weights of cans of Benneke brand peaches have a population mean of 13.5 ounces and a population standard deviation of 0.33 ounces and are approximately normally distributed. Which of the following statements are correct? Choose the best statement.
A) Approximately 95% of all Benneke brand canned peaches will weigh between 12.85 ounces and 14.15 ounces. B) The probability that a randomly selected can of Benneke peaches will weigh between 12.9 ounces and 13.6 ounces is approximately 0.585. C) About 4% of all cans of Benneke peaches will weigh less than 12.9 ounces. D) All of these statements are true. Answer: D Use a table or technology to answer the question. 11) Find the area to the left of a z-score of 1.95. A) 0.9744 B) 0.8289 C) 0.1711 D) 0.0256 Answer: A 12) Find the probability that a z-score will be more than 0.59. A) 0.2776 B) 0.2224 C) 0.7224 D) 0.2190 Answer: A 13) Find the probability that a z-score will be between 0.8 and 1.4. A) 0.1311 B) 0.7881 C) 0.9192 D) 0.2119 Answer: A
Provide an appropriate response. 14) A physical fitness association is including the mile run in its secondary-school fitness test. The time for this event for boys in secondary school is known to possess a normal distribution with a mean of 450 seconds and a standard deviation of 40 seconds. Find the probability that a randomly selected boy in secondary school can run the mile in less than 358 seconds. A) 0.0107 B) 0.4893 C) 0.9893 D) 0.5107 Answer: A
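Questions 11 through 14 ask for areas under a normal curve "using a table or technology". One possible technology route, sketched here in Python under the assumption that the standard library's `statistics.NormalDist` is available (Python 3.8 or later), is shown below; the variable names are illustrative.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, standard deviation 1

area_left = z.cdf(1.95)               # area to the left of z = 1.95
area_right = 1 - z.cdf(0.59)          # area to the right of z = 0.59
area_between = z.cdf(1.4) - z.cdf(0.8)  # area between z = 0.8 and z = 1.4

# Mile-run times: mean 450 seconds, standard deviation 40 seconds.
mile = NormalDist(mu=450, sigma=40)
p_under_358 = mile.cdf(358)           # same as the area left of z = -2.3

print(f"{area_left:.4f} {area_right:.4f} {area_between:.4f} {p_under_358:.4f}")
```

The same three patterns (left area, right area as a complement, and a difference of two left areas) cover every probability question in this section.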
15) A physical fitness association is including the mile run in its secondary-school fitness test. The time for this event for boys in secondary school is known to possess a normal distribution with a mean of 460 seconds and a standard deviation of 50 seconds. Find the probability that a randomly selected boy in secondary school will take longer than 345 seconds to run the mile. A) 0.9893 B) 0.4893 C) 0.0107 D) 0.5107 Answer: A 16) The length of time it takes college students to find a parking spot in the library parking lot follows a normal distribution with a mean of 3.0 minutes and a standard deviation of 1 minute. Find the probability that a randomly selected college student will take between 1.5 and 4.0 minutes to find a parking spot in the library lot. A) 0.7745 B) 0.4938 C) 0.0919 D) 0.2255 Answer: A 17) The amount of soda a dispensing machine pours into a 12 ounce can of soda follows a normal distribution with a mean of 12.54 ounces and a standard deviation of 0.36 ounce. The cans only hold 12.90 ounces of soda. Every can that has more than 12.90 ounces of soda poured into it causes a spill and the can needs to go through a special cleaning process before it can be sold. What is the probability a randomly selected can will need to go through this process? A) 0.1587 B) 0.3413 C) 0.8413 D) 0.6587 Answer: A The mean time it takes for workers at a factory to insert a delicate bolt into an engine is 15 minutes. The standard deviation of time to insert the bolt is 4.0 minutes and the distribution of time is approximately normally distributed. 18) For a randomly selected worker, what's the approximate probability the bolt will be inserted in 19 minutes or less? Round to the nearest whole percent. A) 50% B) 84% C) 18% D) 16% Answer: B 19) 97.5% of the time, the bolt is inserted in less than which time value? Round to the nearest whole number. A) 23 minutes B) 19 minutes C) 7 minutes D) 15 minutes Answer: A Solve the problem. 
20) Admission to a certain university is determined by an entry exam. The scores of this test are Normally distributed with a mean of 400 and a standard deviation of 60. Only students who score in the top 30% will be offered admission. Amy scores 425 on the test. Choose the most accurate statement. A) The top 30% is defined with a score less than or equal to 431.4 so she will be admitted. B) The top 30% is defined with a score greater than or equal to 431.4 so she will not be admitted. C) The top 30% of all students have scores greater than or equal to 520 so she will not be admitted. D) The top 30% of all students have scores greater than or equal to 460 so she will not be admitted. Answer: B The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed. 21) Approximately what percentage of people living and working in Kokomo have a travel time to work that is less than 12.5 minutes? Round to the nearest whole percent. A) 16% B) 68% C) 5% D) 84% Answer: A
Solve the problem. 22) The normal distribution below reflects a data set of gas consumption (in miles per gallon) for a group of trucks. The center is shown at 24 mpg and the standard deviation is 4 mpg. Choose the most accurate statement. A) Approximately 68% of the trucks have gas consumption between 20 and 28 mpg. B) Approximately 99.7% of the trucks have gas consumption between 12 and 36 mpg. C) The top 2.5% of the trucks have gas consumption greater than 32 mpg. D) All of these statements are accurate. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Circumferences of regulation soccer balls have a mean of 69 cm with a standard deviation of 1.50 cm. Assume that the circumferences of soccer balls are approximately normally distributed. 23) Roughly what percentage of regulation soccer balls has a circumference that is greater than 69.9 cm? Round to the nearest tenth of a percent. Answer: 27.4% 24) If a regulation soccer ball is randomly selected, what is the probability that it will have a circumference between 66.9 and 68.9 cm? Round to the nearest thousandth. Answer: 0.393 25) Would it be unusual to randomly select a regulation soccer ball and find that it has a circumference that is greater than 72.3 cm? Answer: Yes, 72.3 cm would be unusual. It is more than two standard deviations from the mean. Suppose that weights of jars of Puff brand marshmallow cream have a population mean of 24.5 ounces and a population standard deviation of 0.19 ounces and are approximately normally distributed. Use the figure below to determine the specified probabilities.
26) What is the probability that a randomly selected jar of Puff marshmallow cream will be greater than 24.45 ounces? What is the probability that a randomly selected jar of Puff marshmallow cream will be less than 24.22 ounces? Round to the nearest thousandth. Answer: 0.604; 0.070 27) If a large random sample of Puff marshmallow cream jars were weighed, approximately what percentage of the jars would weigh between 24.22 and 24.45 ounces? Round to the nearest tenth of a percent. Answer: 32.6%
3 Use the Normal Distribution to Find Percentiles MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information for the question. The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed. 1) Suppose that it is reported in the news that 12% of the people living and working in Kokomo feel that their commute is too long. What is the travel time to work that separates the top 12% of people with the longest travel times and the lower 88%? Round to the nearest tenth of a minute. A) 26.0 minutes B) 18.1 minutes C) 22.3 minutes D) None of these Answer: C 2) Suppose that it is reported in the news that 8% of the people living and working in Kokomo feel "very satisfied" with their commute time to work. What is the travel time to work that separates the bottom 8% of people with the shortest travel times and the upper 92%? Round to the nearest tenth of a minute. A) 17.2 minutes B) 13.8 minutes C) 10.7 minutes D) None of these Answer: C Solve the problem. 3) The normal model N(58, 21) describes the distribution of weights of chicken eggs in grams. Suppose that the weight of a randomly selected chicken egg has a z-score of 1.78. What is the weight of this egg in grams? Round to the nearest hundredth of a gram. A) 95.38 grams B) 89.50 grams C) 65.25 grams D) 79.50 grams Answer: A 4) The normal model N(58, 21) describes the distribution of weights of chicken eggs in grams. Suppose that the weight of a randomly selected chicken egg has a z-score of -2.01. What is the weight of this egg in grams? Round to the nearest hundredth of a gram. A) 31.20 grams B) 15.80 grams C) 28.50 grams D) 38.10 grams Answer: B Provide an appropriate response. 
5) Assume that adults have IQ scores that are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test). Find the IQ score separating the bottom 20% from the top 80%. A) 87.4 B) 86.1 C) 87.9 D) 88.7 Answer: A 6) Assume that adults have IQ scores that are normally distributed with a mean of 100 and a standard deviation of 15 (as on the Wechsler test). Find the IQ score separating the top 14% from the others. A) 116.2 B) 83.7 C) 102.8 D) 99.9 Answer: A 7) The amount of rainfall in January in a certain city is normally distributed with a mean of 3.6 inches and a standard deviation of 0.5 inches. Find the value of the 25th percentile, rounded to the nearest tenth. A) 3.3 B) 3.9 C) 3.4 D) 0.9 Answer: A 8) The tread life of a particular brand of tire is a random variable best described by a normal distribution with a mean of 60,000 miles and a standard deviation of 1,100 miles. What warranty should the company use if they want 96% of the tires to outlast the warranty? A) 58,075 mi B) 61,925 mi C) 58,900 mi D) 61,100 mi Answer: A
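Percentile questions such as 5 through 8 run the normal calculation in reverse: a proportion goes in and a measurement comes out. A minimal Python sketch, assuming `statistics.NormalDist.inv_cdf` as the inverse-Normal tool and using the hypothetical helper name `cutoff_below`, might look like this:

```python
from statistics import NormalDist

def cutoff_below(mean, sd, p):
    """Return the value that has proportion p of the distribution below it."""
    return NormalDist(mu=mean, sigma=sd).inv_cdf(p)

# "Top 14%" means 86% of scores lie below the cutoff, so p = 1 - 0.14.
iq_bottom_20 = cutoff_below(100, 15, 0.20)
iq_top_14 = cutoff_below(100, 15, 1 - 0.14)
rain_p25 = cutoff_below(3.6, 0.5, 0.25)
# 96% of tires should outlast the warranty, so only 4% fall below it.
warranty = cutoff_below(60_000, 1_100, 1 - 0.96)

print(f"IQ, bottom 20% cutoff: {iq_bottom_20:.1f}")
print(f"IQ, top 14% cutoff:    {iq_top_14:.1f}")
print(f"Rainfall, 25th pct:    {rain_p25:.1f} in")
print(f"Tire warranty:         {warranty:,.0f} mi")
```

The warranty value may differ from the printed answer by about a mile, because a z-table rounds the z-score to two decimal places.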
The average travel time to work for a person living and working in Kokomo, Indiana is 17 minutes. Suppose the standard deviation of travel time to work is 4.5 minutes and the distribution of travel time is approximately normally distributed. 9) What range of travel times to work lies between the cutoff for the bottom 2.5% of people and the median (50th percentile)? Round to the nearest tenth of a minute. A) 7 to 17 minutes B) 3.5 to 17 minutes C) 8 to 17 minutes D) 12.5 to 17 minutes Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 10) The normal model N(210, 45) describes the weights of baby elephants in pounds. Suppose that the weight of a randomly selected baby elephant has a z-score of -1.35. What is the weight of this baby elephant in pounds? Round to the nearest tenth of a pound. Answer: 149.3 lbs. On a busy day, the average roller coaster wait time at a large amusement park is 27 minutes. Suppose the standard deviation of wait time is 11.9 minutes and the distribution of wait times is approximately normally distributed. 11) If it is a busy day, approximately what percentage of people at the amusement park will have a wait time that is at least 30 minutes? Round to the nearest whole percent. Answer: 40% will have at least a 30 minute wait time. 12) To improve customer satisfaction, the amusement park manager has decided to give food coupons to customers with long wait times. The manager decides to give the coupons to the top 10% of people waiting the longest. What is the minimum wait time for the top 10%? Round to the nearest tenth of a minute. Answer: 42.3 minutes
6.3 The Binomial Model 1 Understand the Binomial Model MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following characteristics are not required for the binomial model? A) There are a fixed number of trials. B) The trials must be independent. C) Only two outcomes are possible at each trial. D) The probability of success must be the same as the probability of failure. Answer: D 2) Which of the following characteristics are not required for the binomial model? A) The outcome must be a discrete numerical variable. B) There are a fixed number of trials. C) The trials must be independent. D) The probability of success is the same at each trial. Answer: A 3) Determine which situation could be modeled with the binomial model. A) Record the number of ear piercings in a group of 30 randomly selected college students. B) The probability that a customer at a hotdog stand orders hot peppers is 0.18. How many customers in the next batch of 10 customers will order hot peppers? C) Surveying customers leaving a hardware store until a customer responds that he or she spent more than fifty dollars. Answer: B
4) Determine which situation could be modeled with the binomial model. A) Record the number of songs downloaded in a month for a group of 30 randomly selected college students. B) The probability that a customer in a store uses a credit card is 0.58. How many customers in the next batch of 10 customers will use a credit card? C) Surveying customers entering a sporting equipment store until a customer responds that he or she was shopping for a bicycle. Answer: B Provide an appropriate response. 5) Decide whether the experiment is a binomial experiment. If it is not, explain why. You observe the gender of the next 600 babies born at a local hospital. Then you count the number of boys. A) This is a binomial experiment. B) This is not a binomial experiment since there are more than two outcomes. C) This is not a binomial experiment since the trials are not independent. D) This is not a binomial experiment since the probability of success is not the same for each trial. Answer: A 6) Decide whether the experiment is a binomial experiment. If it is not, explain why. You draw a marble 450 times from a bag with three colors of marbles. You want to count the drawn marbles by color. A) This is not a binomial experiment since the outcomes are categorical. B) This is a binomial experiment. C) This is not a binomial experiment since the trials are not independent. D) This is not a binomial experiment since the probability of success is not the same for each trial. Answer: A 7) Decide whether the experiment is a binomial experiment. If it is not, explain why. In a game you spin a wheel that has 12 different letters 1,000 times. You want to count the selected letter on each spin of the wheel. A) This is not a binomial experiment since the outcomes are categorical. B) This is a binomial experiment. C) This is not a binomial experiment since the trials are not independent. D) This is not a binomial experiment since the probability of success is not the same for each trial. 
Answer: A 8) Decide whether the experiment is a binomial experiment. If it is not, explain why. Test a cough suppressant using 400 people to determine if it is effective. You want to count the number of people who find the cough suppressant effective. A) This is a binomial experiment. B) This is not a binomial experiment since there are more than two outcomes. C) This is not a binomial experiment since the trials are not independent. D) This is not a binomial experiment since the probability of success is not the same for each trial. Answer: A
9) Determine which of the given examples describes a binomial distribution. A) The number of tattoos in a group of 30 randomly selected college students B) The number of college students with tattoos in a group of 30 randomly selected college students C) The number of customers surveyed, while leaving a grocery store, until a customer responds that he or she spent more than fifty dollars D) The ages of every 5th person entering a department store Answer: B
10) Can the following problem be solved by using a binomial model? A company has 50 personal computers. The probability that any one of them will need repair on any day is 0.02. Find the probability that exactly 5 of the computers will need repair on any given day. A) This problem cannot be solved using the binomial model because the computers are different ages and brands. B) There is not enough information provided to determine if the binomial model can be used. C) The problem can be solved using the binomial model since the experiment has all the required characteristics. D) A binomial model is only valid for the average probability rather than a particular day’s probability. Answer: C 11) Which of the following characteristics is not required for the binomial model? A) The probability of success is 0.5 and the probability of failure is 0.5. B) There are a fixed number of trials. C) The trials must be independent. D) The probability of success is the same at each trial. Answer: A 12) Determine which of the given examples describes a binomial distribution. A) Record the number of songs downloaded in a month for a group of 30 randomly selected college students B) The number of college students surveyed that are female in a group of 30 randomly selected college students C) The number of customers surveyed entering a sporting equipment store until a customer responds that he or she was shopping for a bicycle D) The ages of every 5th person going into a department store Answer: B 13) Assume 80% of adults with arthritis symptoms report relief with a specific medication. If we are interested in the probability that the medication effectively relieves symptoms in 8 or more patients out of 10 with arthritis symptoms, can we use the binomial model? Select the most accurate statement to investigate the assumptions for using a binomial distribution model. A) The outcome is relief from symptoms, yes or no with relief as a yes. 
B) The probability of success for each new patient is 0.8. C) It is reasonable to assume that the replications are independent. D) All of these assumptions are true. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 14) a. List the four characteristics of the binomial model. b. Consider the given scenario: At a county animal shelter, the probability that a stray cat comes into the shelter with rabies is 0.145 and the probability that a stray dog comes into the shelter with rabies is 0.157. A volunteer at the shelter records whether each of the next thirty cats or dogs delivered to the shelter has rabies. Which of the conditions for use of the binomial model is or are not met? Answer: a. (1) A fixed number of trials, (2) Only two possible outcomes for each trial, (3) The probability of success is the same from trial to trial, and (4) The trials are independent. b. (1) There are a fixed number of trials. (2) There are only two outcomes (rabies or no rabies). (3) The probability of success is not the same from trial to trial since the trials include cats and dogs.
In the new class of college freshmen at a local college, students were asked at their orientation if they would eat their meals at the school cafeteria. In the past it has been determined that 80% of new freshmen ate there. For the question that follows, consider a sample of twenty college freshmen selected randomly. 15) Approximately how many freshmen would you expect to eat at the school cafeteria out of twenty randomly selected freshmen? Round to the nearest whole number. Answer: Approximately 16 freshmen In a recent survey of the new class of 235 college freshmen at a local college, students were asked if they would eat their meals at the school cafeteria. In the past it has been determined that 3% of new freshmen ate there. Suppose that a binomial distribution is illustrated by the figure below:
16) a. What is displayed in this graph, being as specific as possible? b. How would you determine the probability for x being less than 5? Round to the nearest hundredth place. Answer: a. The graph shows the binomial distribution for a binomial experiment in which n = 235 and the probability of success is 0.03. It indicates the probability of 5 successes in 235 trials. b. The method to determine the probability for x less than 5 is to add all of the probabilities for x = 4, x = 3, x = 2, x = 1, x = 0. The result is approximately 0.16 or 16%. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 17) A bag contains a yellow ball, a red ball, and a green ball. One ball is chosen blindly from the bag 10 times, and the color of the ball is recorded. The result is a tally showing the counts for each color found in the 10 tries. Explain why this is not a binomial experiment. Name a condition for use of the binomial model that is not met. A) Each trial results in a count that reflects the three possible outcomes (yellow, red, green). In a binomial experiment, each trial only has counts for the two possible outcomes. B) Each trial results in a count that reflects only three possible outcomes (yellow, red, green). In a binomial experiment, each trial has counts for the four possible outcomes. C) Each trial has ten possible outcomes (1, 2, 3, 4, 5, 6, 7, 8, 9, 10). In a binomial experiment, each trial only has two possible outcomes. D) Each trial has ten possible outcomes (1, 2, 3, 4, 5, 6, 7, 8, 9, 10). In a binomial experiment, each trial only has four possible outcomes. Answer: A
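The summation described in the answer to question 16 (adding the probabilities for x = 0 through x = 4 with n = 235 and p = 0.03) can be sketched in Python as follows. The function name `binom_pmf` is an illustrative helper built on the standard binomial formula; it is not taken from the original material.

```python
from math import comb

def binom_pmf(k, n, p):
    """Binomial probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 235, 0.03

# P(x < 5) = P(0) + P(1) + P(2) + P(3) + P(4)
p_less_than_5 = sum(binom_pmf(k, n, p) for k in range(5))

# Expected number of successes, n * p, for comparison with the graph's peak.
expected = n * p

print(f"P(x < 5) is about {p_less_than_5:.3f}")
print(f"Expected count is about {expected:.2f}")
```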
18) A football player has a field goal success rate of 82%. Suppose he kicks as many field goals as he can in a half hour. Why would it be inappropriate to use the binomial model to find the probability that he makes at least 5 shots in one minute? What condition or conditions for use of the binomial model is or are not met? A) There is not a fixed number of trials because he is going to kick as many field goals as he can in a half hour. B) Each trial has only two possible outcomes (field goal made, field goal missed). In a binomial experiment, each trial has four possible outcomes. C) The trials are not independent. If he makes one field goal, his chance of making the next one decreases. D) The trials are not independent. If he makes one field goal, his chance of making the next one increases. Answer: A 19) A rail service company has an on-time arrival rate of 65%. Assume that in one day, this rail service company has 60 train trips scheduled. Suppose we pick one day in August and find the number of on-time arrivals for this rail service company. Why would it be inappropriate to use the binomial model to find the probability that at least 45 of the 60 train trips arrive on time? What condition or conditions for use of the binomial model is or are not met? A) The trials are not independent. If one train is late, it may increase the chance that a subsequent train is also late. B) Each trial has only two possible outcomes (train on time, train late). In a binomial experiment, each trial has four possible outcomes. C) Each trial has three possible outcomes (train on time, train late, train trip canceled). In a binomial experiment, each trial has only two possible outcomes. D) The trials are independent. Trials must be dependent in order to use the binomial model. Answer: A 2 Find Binomial Probabilities MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. 
Suppose that the probability that a person books a hotel using an online travel website is 0.68. For the questions that follow, consider a sample of fifteen randomly selected people who recently booked a hotel. 1) What is the probability that exactly ten people out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. A) 0.048 B) 0.552 C) 0.287 D) 0.213 Answer: D 2) What is the probability that at least fourteen out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. A) 0.978 B) 0.323 C) 0.025 D) 0.028 Answer: C 3) What is the probability that no more than four out of fifteen people used an online travel website when they booked their hotel? Round to the nearest thousandth. A) 0.111 B) 0.001 C) 0.321 D) None of these Answer: B Use the following information to answer the question. Suppose that among those who book airline tickets, the probability that they use an online travel website is 0.72. For the questions that follow, consider a sample of ten randomly selected people who recently booked an airline ticket. 4) What is the probability that exactly seven out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. A) 0.035 B) 0.480 C) 0.998 D) 0.264 Answer: D
5) What is the probability that at least nine out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. A) 0.065 B) 0.183 C) 0.935 D) 0.857 Answer: B 6) What is the probability that no more than three out of ten people used an online travel website when they booked their airline ticket? Round to the nearest thousandth. A) 0.733 B) 0.115 C) 0.007 D) None of these Answer: C
7) Five identical poker chips are tossed in a hat and mixed up. Two of the chips have been marked with an X to indicate that if drawn a valuable prize will be awarded. If you and two of your friends each draws a chip (with replacement), what is the probability that at least one of your group of three will win the valuable prize? Round to the nearest thousandth. B) 0.784 C) 0.978 D) None of these A) 0.216 Answer: B 8) Five identical poker chips are tossed in a hat and mixed up. Two of the chips have been marked with an X to indicate that if drawn a valuable prize will be awarded. If you and three of your friends each draws a chip (with replacement), what is the probability that at least one of your group of four will win the valuable prize? Round to the nearest thousandth. C) 0.758 D) None of these B) 0.130 A) 0.870 Answer: A Find the indicated probability. Round to three decimal places. 9) Assume that male and female births are equally likely (Assume female birth probability is 0.50) and that the birth of any child does not affect the probability of the gender of any other children. Find the probability of exactly six girls in ten births. A) 0.205 B) 0.6 C) 3.281 D) 0.06 Answer: A 10) In a recent survey, 63% of the community favored building a health center in their neighborhood. If 14 citizens are chosen, find the probability that exactly 8 of them favor the building of the health center. C) 0.571 D) 0.630 B) 0.066 A) 0.191 Answer: A 11) The probability that an individual has 20-20 vision is 0.13. In a class of 80 students, what is the probability of finding five people with 20-20 vision? C) 0.000 D) 0.13 A) 0.026 B) 0.063 Answer: A 12) According to insurance records a car with a certain protection system will be recovered 86% of the time. Find the probability that 4 of 8 stolen cars will be recovered. A) 0.015 B) 0.500 C) 0.86 D) 0.14 Answer: A 13) The probability that a football game will go into overtime is 12%. 
What is the probability that exactly two of three football games will go into overtime? A) 0.038 B) 0.12 C) 0.279 D) 0.0144 Answer: A
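The "exactly k successes" answers in questions 9, 12, and 13 can be cross-checked by evaluating the binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k) directly. The following is a minimal sketch in Python; the function name binom_pmf is an illustrative assumption and is not part of the test bank.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p.

    The name binom_pmf is a hypothetical helper chosen for this sketch.
    """
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Question 9: exactly six girls in ten births, p = 0.50
print(round(binom_pmf(6, 10, 0.50), 3))  # 0.205

# Question 12: exactly 4 of 8 stolen cars recovered, p = 0.86
print(round(binom_pmf(4, 8, 0.86), 3))   # 0.015

# Question 13: exactly two of three games go into overtime, p = 0.12
print(round(binom_pmf(2, 3, 0.12), 3))   # 0.038
```

Each printed value matches answer choice A for the corresponding question.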
14) Fifty percent of the people who use the Internet order something online. Find the probability that exactly four of 10 Internet users will order something online. A) 0.205 B) 0.400 C) 0.001 D) 13.125 Answer: A
15) The probability that a house in an urban area will develop a leak is 4%. If 64 houses are randomly selected, what is the probability that none of the houses will develop a leak? A) 0.073 B) 0.040 C) 0.000 D) 0.001 Answer: A
16) Sixty-five percent of men consider themselves knowledgeable soccer fans. If 13 men are randomly selected, find the probability that exactly seven of them will consider themselves knowledgeable fans. A) 0.155 B) 0.083 C) 0.65 D) 0.538 Answer: A
17) A test consists of 10 true/false questions. To pass the test a student must answer at least 7 questions correctly. If a student guesses on each question, what is the probability that the student will pass the test? A) 0.172 B) 0.117 C) 0.055 D) 0.945 Answer: A
18) A machine has 7 identical components which function independently. The probability that a component will fail is 0.2. The machine will stop working if more than three components fail. Find the probability that the machine will be working. A) 0.967 B) 0.033 C) 0.029 D) 0.996 Answer: A
19) Find the probability of at least 2 girls in 8 births. Assume that male and female births are equally likely and that the births are independent events. A) 0.965 B) 0.035 C) 0.109 D) 0.855 Answer: A
20) An airline estimates that 91% of people booked on their flights actually show up. If the airline books 80 people on a flight for which the maximum number is 78, what is the probability that the number of people who show up will exceed the capacity of the plane? A) 0.005 B) 0.001 C) 0.004 D) 0.021 Answer: A
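Questions 17 through 19 ask for "at least" or "no more than" probabilities, which are sums of individual binomial probabilities. The sketch below shows one way to check them in Python; the helper names binom_pmf, binom_at_most, and binom_at_least are assumptions made for this illustration.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial model (hypothetical helper name)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_at_most(k, n, p):
    """P(X <= k): add the probabilities for 0, 1, ..., k successes."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

def binom_at_least(k, n, p):
    """P(X >= k), computed as the complement of P(X <= k - 1)."""
    return 1.0 - binom_at_most(k - 1, n, p)

# Question 17: pass with at least 7 of 10 correct when guessing (p = 0.5)
print(round(binom_at_least(7, 10, 0.5), 3))  # 0.172

# Question 18: machine works if at most 3 of 7 components fail (p = 0.2)
print(round(binom_at_most(3, 7, 0.2), 3))    # 0.967

# Question 19: at least 2 girls in 8 births (p = 0.5)
print(round(binom_at_least(2, 8, 0.5), 3))   # 0.965
```

Using the complement for "at least" keeps the sum short when k is small, as in question 19, where only the terms for 0 and 1 girls are needed.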
In a recent survey of the new class of 235 college freshmen at a local college, students were asked if they would eat their meals at the school cafeteria. In the past it has been determined that 3% of new freshmen ate there. Use the following information to answer the question.
21) What is the approximate probability that five or fewer students eat at the cafeteria? Round to the nearest hundredth. A) 0.29 B) 0.13 C) 0.17 D) 0.71 Answer: A
22) What is the approximate probability that more than 10 students eat in the cafeteria? Round to the nearest hundredth. A) 0.20 B) 0.10 C) 0.02 D) 0.88 Answer: B
23) What is the approximate probability that six or fewer students eat at the cafeteria? Round to the nearest hundredth. A) 0.44 B) 0.56 C) 0.13 D) 0.76 Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
In the new class of college freshmen at a local college, students were asked at their orientation if they would eat their meals at the school cafeteria. In the past it has been determined that 80% of new freshmen ate there. For the question that follows, consider a sample of twenty college freshmen selected randomly.
24) What is the probability that exactly fifteen students out of twenty will eat their meals in the school cafeteria? Round to the nearest thousandth. Answer: 0.175
In a recent survey of the new class of 235 college freshmen at a local college, students were asked if they would eat their meals at the school cafeteria. In the past it has been determined that 3% of new freshmen ate there. Suppose that a binomial distribution is illustrated by the figure below:
25) a. What is the approximate probability that between 7 and 15 students eat in the cafeteria? Round to the nearest hundredth. b. Does this probability make sense when looking at the graph of the distribution? Why? Answer: a. The probability that between 7 and 15 students eat at the cafeteria is approximately 0.56 or 56%. b. The shape of the distribution shows that the probability is greater than approximately half of the distribution since the probabilities skew slightly to the right.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
26) According to a national survey, 45% of people in Country X own a dog. Use a calculator with statistics capabilities. Round to the nearest percent. a. Find the probability that exactly 3 out of 10 randomly selected people from Country X own a dog. b. In a random sample of 10 people from Country X, find the probability that 3 or fewer own a dog. A) a. 17%, b. 27% B) a. 17%, b. 90% C) a. 24%, b. 50% D) a. 24%, b. 73% Answer: A
27) According to data, the percentage of people from Country X who have a passport has risen dramatically. In 2007, only 19% of people from Country X had a passport; in 2017 that percentage had risen to 38%. Assume that currently 38% of people from Country X have a passport. Suppose 50 people from Country X are selected at random. Use a calculator with statistics capabilities. Round to the nearest percent. a. Find the probability that at most 20 have a passport. b. Find the probability that at least 22 have a passport. c. Find the probability that fewer than 15 have a passport. A) a. 67%, b. 23%, c. 9% B) a. 11%, b. 8%, c. 6% C) a. 67%, b. 9%, c. 23% D) a. 11%, b. 6%, c. 8% Answer: A
28) According to data, 89% of households in Country X had no landline and only had cell phone service. Suppose a random sample of 30 households in Country X is taken. Use a calculator with statistics capabilities. Round to the nearest percent. a. Find the probability that exactly 25 of the households sampled only have cell phone service. b. Find the probability that fewer than 25 households only have cell phone service. c. Find the probability that at most 25 households only have cell phone service. d. Find the probability that between 25 and 29 households only have cell phone service. A) a. 12%, b. 10%, c. 23%, d. 86% B) a. 19%, b. 29%, c. 48%, d. 92% C) a. 12%, b. 6%, c. 12%, d. 11% D) a. 19%, b. 85%, c. 71%, d. 5% Answer: A
29) The use of drones, aircraft without onboard human pilots, is becoming more prevalent in Country X. According to a 2017 report, 57% of people in Country X had seen a drone in action. Suppose 50 people in Country X are randomly selected. Use a calculator with statistics capabilities. Round to the nearest percent. a. What is the probability that at least 30 had seen a drone? b. What is the probability that more than 35 had seen a drone? c. What is the probability that between 38 and 40 had seen a drone? d. What is the probability that more than 30 had not seen a drone? A) a. 39%, b. 2%, c. 0.4%, d. 0.5% B) a. 71%, b. 98%, c. 9%, d. 0.6% C) a. 10%, b. 4%, c. 0.8%, d. 29% D) a. 41%, b. 8%, c. 0.9%, d. 0.4% Answer: A
30) According to a report, 32 % of pedestrians in Country X admit to texting while walking. Suppose two pedestrians are randomly selected. a. If the pedestrian texts while walking, record a T. If not, record an N. List all possible sequences of Ts and Ns for the two pedestrians. b. For each sequence, find the probability that it will occur by assuming independence. c. What is the probability that neither pedestrian texts while walking? d. What is the probability that both pedestrians text while walking? e. What is the probability that exactly one of the pedestrians texts while walking? A) a. TT, TN, NT, NN, b. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, P(TN) = (0.32)(0.68) = 0.2176 or about 22%, P(NT) = (0.68)(0.32) = 0.2176 or about 22%, P(NN) = (0.68)(0.68) = 0.4624 or about 46%, c. P(NN) = (0.68)(0.68) = 0.4624 or about 46%, d. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, e. P(TN) + P(NT) = 0.2176 + 0.2176 = 0.4352 or about 44% B) a. TT, NT, NN, b. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, P(TN) = (0.32)(0.68) = 0.2176 or about 22%, P(NT) = (0.68)(0.32) = 0.2176 or about 22%, P(NN) = (0.68)(0.68) = 0.3248 or about 32%, c. P(NN) = (0.68)(0.68) = 0.3248 or about 32%, d. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, e. P(NT) = 0.2176 or about 22% C) a. TT, TN, NN, b. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, P(TN) = (0.32)(0.68) = 0.2176 or about 22%, P(NT) = (0.68)(0.32) = 0.2176 or about 22%, P(NN) = (0.68)(0.68) = 0.4624 or about 46%, c. P(NN) = (0.68)(0.68) = 0.4624 or about 46%, d. P(TT) = (0.32)(0.32) = 0.1024 or about 10%, e. P(TN) = 0.2176 = 0.2176 or about 22% D) a. TT, TN, NT, NN, b. P(TT) = (0.32)(0.32) = 0.2176 or about 22%, P(TN) = (0.32)(0.68) = 0.1024 or about 10%, P(NT) = (0.68)(0.32) = 0.1024 or about 10%, P(NN) = (0.68)(0.68) = 0.3248 or about 32%, c. P(NN) = (0.68)(0.68) = 0.3248 or about 32%, d. P(TT) = (0.32)(0.32) = 0.2176 or about 22%, e. 
P(TN) + P(NT) = 0.1024 + 0.1024 = 0.2048 or about 20% Answer: A
3 Find the Expected Number and Standard Deviation for a Binomial Distribution
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information to answer the question. Suppose that the probability that a person books a hotel using an online travel website is 0.68. For the questions that follow, consider a sample of fifteen randomly selected people who recently booked a hotel.
1) Out of fifteen randomly selected people, how many would you expect to use an online travel website to book their hotel, give or take how many? Round to the nearest whole person. A) 10 people, give or take 2 people B) 5 people, give or take 2 people C) 10 people, give or take 3 people D) 9 people, give or take 3 people Answer: A
Use the following information to answer the question. Suppose that among those who book airline tickets, the probability that they use an online travel website is 0.72. For the questions that follow, consider a sample of ten randomly selected people who recently booked an airline ticket.
2) Out of ten randomly selected people, how many would you expect to use an online travel website to book their airline ticket, give or take how many? Round to the nearest whole person. A) 7 people, give or take 1 person B) 8 people, give or take 2 people C) 7 people, give or take 2 people D) 3 people, give or take 4 people Answer: A
3) Suppose that the probability that a person between the ages of 19 and 24 checks their daily horoscope is 0.12. If 400 randomly selected people between the ages of 19 and 24 were asked "Do you check your daily horoscope?", would you be surprised if 63 or more said yes to this question? Why? A) Yes, 63 would be an unusually small number of people given the known probability of 0.12. B) No, 63 is within the expected range of people. C) Yes, 63 would be an unusually large number of people given the known probability of 0.12. D) Cannot be determined with the given information. Answer: C
4) Suppose that the probability that a person between the ages of 19 and 24 buys at least one tabloid magazine per week is 0.115. If 500 randomly selected people between the ages of 19 and 24 were asked "Do you buy at least one tabloid magazine per week?", would you be surprised if 45 or more said yes to this question? Why? A) Yes, 45 would be an unusually small number of people given the known probability of 0.115. B) No, 45 is within the expected range of people. C) Yes, 45 would be an unusually large number of people given the known probability of 0.115. D) Cannot be determined with the given information. Answer: B
5) On a multiple choice test with 14 questions, each question has four possible answers, one of which is correct. For students who guess at all answers, find the expected number of correct answers. A) 3.5 B) 7 C) 4.7 D) 10.5 Answer: A
6) The probability that a radish seed will germinate is 0.7. A gardener plants seeds in batches of 14. Find the expected number of seeds germinating in each batch. A) 9.8 B) 4.2 C) 12.6 D) 9.94 Answer: A
7) The probability that a person has immunity to a particular disease is 0.4. Find the expected number of people who have immunity in samples of size 17. A) 6.8 B) 10.2 C) 8.5 D) 0.4 Answer: A
8) The probability is 0.7 that a person shopping at a certain store will spend less than $20.
For groups of size 26, find the expected number of shoppers who spend less than $20. A) 18.2 B) 7.8 C) 14.0 D) 6.0 Answer: A 9) In a certain town, 20 percent of voters are in favor of a given ballot measure and 80 percent are opposed. For groups of 140 voters, find the expected number of voters who oppose the measure. A) 112 B) 28 C) 20 D) 80 Answer: A
10) On a multiple choice test with 23 questions, each question has four possible answers, one of which is correct. For students who guess at all answers, find the standard deviation for the number of correct answers. A) 2.1 B) 4.3 C) 18.6 D) 43.1 Answer: A
11) The probability that a radish seed will germinate is 0.7. A gardener plants seeds in batches of 8. Find the standard deviation for the number of seeds germinating in each batch. A) 1.3 B) 1.7 C) 2.8 D) 16.8 Answer: A
12) According to a research center, 66% of Gen-Xers (those born between 1965 and 1976) reported using a library or bookmobile within the last year. Suppose that a random sample of 300 Gen-Xers is taken. a. Complete this sentence: We would expect ____ of the sample to have used a library or bookmobile within the last year, give or take ____. b. Would it be surprising to find that 225 of the sample have used a library or bookmobile within the last year? Why or why not? A) a. We would expect 198 of the sample to have used a library or bookmobile within the last year, give or take 8.2. b. Since 225 is more than two standard deviations from the mean, it would be surprising. B) a. We would expect 198 of the sample to have used a library or bookmobile within the last year, give or take 8.2. b. Since 225 is less than two standard deviations from the mean, it would not be surprising. C) a. We would expect 102 of the sample to have used a library or bookmobile within the last year, give or take 8.2. b. Since 225 is more than two standard deviations from the mean, it would be surprising. D) a. We would expect 102 of the sample to have used a library or bookmobile within the last year, give or take 8.2. b. Since 225 is less than two standard deviations from the mean, it would not be surprising. Answer: A
13) A professional basketball player is a 75% free-throw shooter. Assume that free throw shots are independent. Suppose, over the course of a season, this player attempts 500 free throws. a.
Find the mean and the standard deviation of the number of free throws we expect this player to make. b. Would it be surprising if he only made 390 of his free throws? Why or why not? A) a. Mean = 375, standard deviation = 9.7, b. No, it would not be surprising because 390 is less than 2 standard deviations from the mean. B) a. Mean = 375, standard deviation = 9.7, b. Yes, it would be surprising because 390 is more than 2 standard deviations from the mean. C) a. Mean = 125, standard deviation = 9.7, b. No, it would not be surprising because 390 is less than 2 standard deviations from the mean. D) a. Mean = 125, standard deviation = 9.7, b. Yes, it would be surprising because 390 is more than 2 standard deviations from the mean. Answer: A
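The "expect ____, give or take ____" and "would it be surprising" questions in this section rest on two formulas for a binomial model: mean = np and standard deviation = sqrt(np(1 - p)), together with the two-standard-deviation rule of thumb used in the answer choices. A minimal sketch in Python follows; the function names binom_mean_sd and is_surprising are assumptions made for illustration.

```python
from math import sqrt

def binom_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial count: n*p and sqrt(n*p*(1-p))."""
    return n * p, sqrt(n * p * (1 - p))

def is_surprising(x, n, p, cutoff=2.0):
    """True when x lies more than `cutoff` standard deviations from the mean.

    The default cutoff of 2 mirrors the rule of thumb in the answer choices.
    """
    mean, sd = binom_mean_sd(n, p)
    return abs(x - mean) > cutoff * sd

# Question 13: 500 free throws at p = 0.75
mean, sd = binom_mean_sd(500, 0.75)
print(round(mean), round(sd, 1))        # 375 9.7
print(is_surprising(390, 500, 0.75))    # False

# Question 12: 300 Gen-Xers at p = 0.66
mean, sd = binom_mean_sd(300, 0.66)
print(round(mean), round(sd, 1))        # 198 8.2
print(is_surprising(225, 300, 0.66))    # True
```

In question 13, 390 is about 1.5 standard deviations above 375, so it is not flagged; in question 12, 225 is more than 3 standard deviations above 198, so it is.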
Ch. 6 Modeling Random Events: The Normal and Binomial Models Answer Key
6.1 Probability Distributions Are Models of Random Experiments
1 Determine if a Variable is Continuous or Discrete 1) A 2) B 3) B 4) A 5) A 6) B 7) A 8) A discrete random variable has a numerical outcome that can be listed or counted. A continuous random variable occurs over an infinite range of values and cannot be listed or counted. Examples will vary.
2 Create a Probability Density Function Table or Graph 1) A 2) B 3) A 4) A
3 Find Probabilities for Discrete-Valued Outcomes 1) A 2) A 3) B 4) D 5) B 6) The probability column should contain the following values: 2/13, 2/13, 2/13, 1/13, 6/13. Your chances of winning money are slightly better than losing. 7) a. This is a probability distribution. b. Characteristics are that it must list all the possible outcomes and list all the associated probabilities. c. The probability of randomly selecting a movie genre other than drama or foreign is 1 - (0.421 + 0.093) = 0.486 or 48.6%
4 Find Probabilities for Continuous-Valued Outcomes 1) D 2) C 3) A 4) A 5) A 6) C 7) C 8) A 9) 0.50, the right half of the distribution should be shaded. 10) 0.25, the rectangle between 14 and 15 should be shaded.
6.2 The Normal Model
1 Understand the Empirical Rule and Normal Distributions 1) B 2) A 3) A 4) A 5) A 6) D 7) A 8) D 9) B 10) B 11) A
12) A 13) A 14) B 15) a. The intervals shown represent +/- 1 standard deviation or the middle 68% (68% of middle values range from 18.6 to 31.0 mpg), +/- 2 standard deviations or the middle 95% (95% of middle values range from 12.4 to 37.2 mpg), +/- 3 standard deviations or the middle 99.7% (99.7% of middle values range from 6.2 to 43.4 mpg). b. The normal distribution shown has a center (mean, median) of 24.8 mpg with a standard deviation of 6.2 mpg (representing the spread). 16) The distribution should be sketched as a right-skewed distribution, with most of the values clustered on the left side (zero to about 40 minutes) and then stretched out to the right, showing 150 minutes on the far right. The 15% probability is the shaded area under the curve from a vertical line at 45 all the way to the right end.
2 Solve Applications of Normal Distributions 1) C 2) D 3) A 4) C 5) A 6) B 7) B 8) A 9) D 10) D 11) A 12) A 13) A 14) A 15) A 16) A 17) A 18) B 19) A 20) B 21) A 22) D 23) 27.4% 24) 0.393 25) Yes, 72.3 cm would be unusual. It is more than two standard deviations from the mean. 26) 0.604; 0.070 27) 32.6%
3 Use the Normal Distribution to Find Percentiles 1) C 2) C 3) A 4) B 5) A 6) A 7) A 8) A 9) C 10) 149.3 lbs. 11) 40% will have at least a 30 minute wait time. 12) 42.3 minutes
6.3 The Binomial Model
1 Understand the Binomial Model 1) D 2) A 3) B 4) B 5) A 6) A 7) A 8) A 9) B 10) C 11) A 12) B 13) D 14) a. (1) A fixed number of trials, (2) Only two possible outcomes for each trial, (3) The probability of success is the same from trial to trial, and (4) The trials are independent. b. (1) There are a fixed number of trials. (2) There are only two outcomes (rabies or no rabies). (3) The probability of success is not the same from trial to trial since the trials include cats and dogs. 15) Approximately 16 trips 16) a. The graph shows the binomial distribution for a binomial experiment in which n = 235 and the probability of success is 0.03. It indicates the probability of 5 successes in 235 trials. b. The method to determine the probability for x less than 5 is to add all of the probabilities for x = 4, x = 3, x = 2, x = 1, and x = 0. The result is approximately 0.16 or 16%. 17) A 18) A 19) A
2 Find Binomial Probabilities 1) D 2) C 3) B 4) D 5) B 6) C 7) B 8) A 9) A 10) A 11) A 12) A 13) A 14) A 15) A 16) A 17) A 18) A 19) A 20) A 21) A 22) B 23) A 24) 0.175
25) a. The probability that between 7 and 15 students eat at the cafeteria is approximately 0.56 or 56%. b. The shape of the distribution shows that the probability is greater than approximately half of the distribution since the probabilities skew slightly to the right. 26) A 27) A 28) A 29) A 30) A
3 Find the Expected Number and Standard Deviation for a Binomial Distribution 1) A 2) A 3) C 4) B 5) A 6) A 7) A 8) A 9) A 10) A 11) A 12) A 13) A
Ch. 7 Survey Sampling and Inference
7.1 Learning about the World through Surveys
1 Identify Populations, Samples, Parameters of Interest, and Statistics
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Fill in the blank to complete the statement.
1) The collection of the ages of all the U.S. first ladies when they married is a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: A
2) Suppose that the age of all the U.S. first ladies when they married was recorded. The mean age of U.S. first ladies when they married would be a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: C
3) Researchers are interested in learning more about the age of women when they marry for the first time so they survey 500 married or divorced women and ask them how old they were when they first married. The collection of the ages of the 500 women when they first married is a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: B
4) Suppose that the age of all the U.S. vice presidents when they took office was recorded. The collection of the ages of all the U.S. vice presidents when they took office is a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: A
5) The mean age of all the U.S. vice presidents when they took office would be a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: C
6) Researchers are interested in learning more about the age of men when they marry for the first time so they survey 500 married or divorced men and ask them how old they were when they first married. The mean age of the 500 men when they married for the first time would be a _____________. A) Population
B) Sample
C) Parameter
D) Statistic
Answer: D
7) Researchers want to find out which U.S. movie has the most positive audience reaction for the current week. As they exited a randomly selected movie theater, moviegoers were asked to give the movie they had just viewed a letter grade of A, B, C, D, or F. In this scenario, the moviegoers are an example of a _____________. A) Sample
B) Population
C) Variable
Answer: A Solve the problem. 8) The deacons at a local church surveyed the congregation to find out if members would be willing to fund a new construction project. In this example, what is the population of interest? A) The deacons B) The congregation C) The survey respondents D) None of these Answer: B
Answer the question.
9) A magazine publisher mails a survey to every subscriber asking about the timeliness of its subscription service. The publisher finds that only 5% of the subscribers responded. This 5% represents what? A) The population B) The sample Answer: B
10) A magazine publisher always mails out a questionnaire six months before a subscription ends. This questionnaire asks its subscribers if they are going to renew their subscriptions. On average, only 7% of the subscribers respond to the questionnaire. Of the 7% who do respond, an average of 44% say that they will renew their subscription. This 7% who respond to the questionnaire are known as what? A) The population B) The sample Answer: B
11) A computer network manager wants to test the reliability of some new and expensive fiber-optic Ethernet cables that the computer department just received. The computer department received 7 boxes containing 50 cables each. The manager does not have the time to test every cable in each box. The manager will choose one box at random and test 10 cables chosen randomly within that box. What is the population? A) 350 cables B) The one box that was chosen at random from the 7 boxes C) The 7 boxes D) The 10 cables chosen randomly for testing Answer: A
12) A computer network manager wants to test the reliability of some new and expensive fiber-optic Ethernet cables that the computer department just received. The computer department received 3 boxes containing 50 cables each. The manager does not have the time to test every cable in each box. The manager will choose one box at random and test 10 cables chosen randomly within that box. What is the sample? A) The 10 cables chosen for testing B) The one box that was chosen at random from the 3 boxes C) The 3 boxes D) 150 cables Answer: A
Solve the problem.
13) All daily maximum temperatures in August last year for all major U.S. cities were recorded. The mean of all maximum temperatures would be a _____________.
A) Population B) Sample C) Parameter D) Statistic Answer: C
14) Researchers are interested in learning more about the age of young adults who watch a certain television program. By interviewing people at a shopping mall, they can identify people who watch this show. The collection of the ages of these young adults who watch this television program is a _____________. A) Population B) Sample C) Parameter D) Statistic Answer: B
15) A factory manager is monitoring the quality of production of a small battery-powered toy. The factory produces 600 toys in an hour. Thirty toys are tested from each hour's production. Which one of the following statements is the most accurate? A) The toys being tested are the sample and the sample size is 30. All toys produced in that hour are the population. B) The toys being tested are the population. All toys produced in that hour are the sample and the sample size is 600. C) The toys being tested are the sample. The factory's total production is the population. D) None of these statements accurately describes this situation. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
16) Explain the difference between a population and a sample. Give an example of each. Answer: A population is the total collection of objects or people being studied. A sample is a collection of objects or people taken from the population of interest. Examples will vary.
17) Describe the importance of how survey questions are phrased, in particular the effect on the sample results. Explain how such a sample may not be reflective of the population. Answer: A persuasive survey question will produce biased results, which may not accurately reflect the true sentiments of the population. A confusing question with multiple negatives and irrelevant information can result in inaccurate responses.
2 Determine Whether Sampling Methods are Likely to Result in Representative or Biased Samples
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Provide an appropriate response.
1) Frances is interested in whether students at his college would like to see a portion of the campus preserved as green space. Using student identification numbers, he randomly contacts 300 students and receives a response from 75.
Of those who responded, 64% favored the preservation of green space on campus. This scenario is describing what type of sampling bias? A) Measurement bias B) Gender bias C) Voluntary response bias D) Nonresponse bias Answer: D
2) Max is interested in whether there is community interest in having local musicians perform music in the park in the evenings during the summer. Max goes to the park for several evenings in a row. He sets up a booth with a sign "Give your opinion on music". He asks people visiting the park whether they would like to hear music in the evening. Out of the 200 people he surveys, 58% respond favorably. This scenario is describing what type of sampling bias? A) Measurement bias B) Gender bias C) Voluntary response bias D) Nonresponse bias Answer: C
3) Before opening a new dealership, an auto manufacturer wants to gather information about car ownership and driving habits of the local residents. The marketing manager of the company randomly selects 1000 households from all households in the area and mails a questionnaire to them. Of the 1000 surveys mailed, she receives 135 back. Determine the type of bias. A) Nonresponse bias B) Measurement bias C) Gender bias D) Voluntary response bias Answer: A
Provide an appropriate response.
4) You are receiving a large shipment of light bulbs and want to test their lifetimes. These light bulbs are for use in company offices. Explain whether you would want to test a sample of light bulbs or the entire population. A) You want to test a sample of light bulbs. If you tested them all until they burned out, no usable light bulbs would be left. B) You want to test a sample of light bulbs. If you tested them all, it would not provide the correct data. C) You want to test all of the light bulbs. If you tested only a sample, it would not provide the correct data. D) You want to test all of the light bulbs. If you tested only a sample until those burned out, too many light bulbs would be left. Answer: A
5) Suppose you want to estimate the mean SAT scores of all of the seniors at your local high school. You set up a table in the auditorium at graduation practice. The table is labeled “Senior Survey”. The students are asked to volunteer their SAT scores. Do you think you would get a representative sample? Why or why not? A) This would be a biased sample. Some seniors may not be there for various reasons so you would be missing their data. Also, students with low SATs would be less willing to report their SAT score. B) This would be a biased sample. There would be no volunteers available in the auditorium. Also, students with high SATs would be less willing to report their SAT score. C) This would be a representative sample. Volunteers are willing to report their SAT score no matter how well they did on the test. D) This would be a representative sample. Senior boys are the only volunteers you need for your study. Answer: A
6) Explain the difference between sampling with replacement and sampling without replacement. Suppose you have 8 uniquely colored balls, each the same size, and want to select two balls. Describe both procedures. A) First, all 8 balls are put in a bag.
Then one is drawn out and noted. “With replacement”: The ball that is selected is replaced in the bag, and a second draw is done. It is possible that the same ball could be picked twice. “Without replacement”: After the first ball is drawn out, it is not replaced, and the second draw must be a different ball. B) First, all 8 balls are put in a bag. Then one is drawn out and noted. “Without replacement”: After the first ball is drawn out, it is not replaced, and the second draw must be a different ball. “With replacement”: The ball that is selected is replaced in the bag, and a second draw is done. It is possible that the same ball could be picked twice. C) First, all 8 balls are put in a bag. Then two are drawn out and noted. “With replacement”: The balls that are selected are replaced in the bag. It is possible that the same two balls could be picked twice. “Without replacement”: After the balls are drawn out, they are not replaced. D) First, all 8 balls are put in a bag. Then two are drawn out and noted. “With replacement”: After the balls are drawn out, they are not replaced. “Without replacement”: The balls that are selected are replaced in the bag. It is possible that the same two balls could be picked twice. Answer: A
TRUE/FALSE. Write 'T' if the statement is true and 'F' if the statement is false.
7) True or False? Simple random sampling is usually done with replacement. Answer: FALSE
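The two procedures described in question 6 can be simulated to make the difference concrete. The sketch below uses Python's standard random module; the ball colors, the seed, and the function name draw_two are assumptions chosen only for illustration.

```python
import random

# Eight uniquely colored balls (the specific colors are hypothetical)
BALLS = ["red", "orange", "yellow", "green", "blue", "purple", "black", "white"]

def draw_two(balls, with_replacement, rng):
    """Draw two balls from the bag.

    With replacement: the first ball goes back, so it may be drawn again.
    Without replacement: the second ball must differ from the first.
    """
    if with_replacement:
        return rng.choices(balls, k=2)   # independent draws, repeats allowed
    return rng.sample(balls, 2)          # distinct balls only

rng = random.Random(2020)  # fixed seed so a run can be reproduced
print(draw_two(BALLS, with_replacement=True, rng=rng))
print(draw_two(BALLS, with_replacement=False, rng=rng))
```

Repeating the with-replacement draw many times will occasionally return the same color twice (on average one time in eight), while the without-replacement draw never does.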
3 Use Random Number Tables to Select Random Samples
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Provide an appropriate response.
1) The government of a town needs to determine if the city's residents will support the construction of a new town hall. The government decides to conduct a survey of a sample of the city's residents. Which one of the following procedures would be most appropriate for obtaining a sample of the town's residents? A) Survey a random sample of persons within each geographic region of the city. B) Survey a random sample of employees at the old city hall. C) Survey every 13th person who walks into city hall on a given day. D) Survey the first 400 people listed in the town's telephone directory. Answer: A
2) The city council of a small town needs to determine if the town's residents will support the building of a new library. The council decides to conduct a survey of a sample of the town's residents. Which one of the following procedures would be most appropriate for obtaining a sample of the town's residents? A) Survey a random sample of persons within each neighborhood of the town. B) Survey a random sample of librarians who live in the town. C) Survey 500 individuals who are randomly selected from a list of all people living in the state in which the town is located. D) Survey every 11th person who enters the old library on a given day. Answer: A
3) You need to select a simple random sample of five from seven friends who will participate in a survey. Assume the friends are numbered 1, 2, 3, 4, 5, 6, and 7. Select five friends, using the two lines of numbers in the next column from a random number table. Read off each digit, skipping any digit not assigned to one of the friends. The sampling is without replacement, meaning that you cannot select the same person twice. Write down the numbers chosen. The first person is number 1.
Which five friends are chosen? A) 1, 6, 7, 3, 5 B) 0, 9, 1, 0, 6 C) 1, 6, 7, 3, 1 D) 6, 7, 3, 1, 5
Answer: A 4) Assume your family has 25 members and you want a simple random sample of 8 of them. Describe how to randomly select 8 people from your family using the random number table. Members are chosen without replacement. A) Assign each family member a pair of digits 00–24 (or 01–25). Read off pairs of digits from the random number table. The family members whose digits are called are in the sample. Skip any repeats. Stop after the first 8 are selected. B) Assign each family member a pair of digits 00–24 (or 01–25). Read off pairs of digits from the random number table. The family members whose digits are called are in the sample. Make sure to use repeats. Stop after the first 8 are selected. C) Assign each family member a pair of digits 00–08 (or 01–09). Read off pairs of digits from the random number table. The family members whose digits are called are in the sample. Skip repeats. Stop after the first 25 are selected. D) Assign each family member a pair of digits 00–08 (or 01–09). Read off pairs of digits from the random number table. The family members whose digits are called are in the sample. Make sure to use repeats. Stop after the first 25 are selected. Answer: A
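Worked sketch for question 4 (an illustration, not part of the original answer key): the procedure in option A can be written as a short Python function. The row of digits below is hypothetical and is not taken from the textbook's random number table; the function name and parameters are likewise assumptions made for this example.

```python
def sample_from_digit_table(digits, population_size, sample_size, width=2):
    """Read fixed-width labels from a run of random digits.

    Labels outside 0..population_size-1 are skipped, repeats are skipped
    (sampling without replacement), and reading stops once enough
    members have been selected.
    """
    digits = digits.replace(" ", "")  # allow spaces for readability
    chosen = []
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        if label < population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break
    return chosen


# Hypothetical row of random digits, read as pairs (00-24 label the 25 members).
row = "21 31 50 73 24 02 19 80 05 12 03 21 17"
print(sample_from_digit_table(row, population_size=25, sample_size=8))
```

In this made-up row the pairs 31, 50, 73, and 80 are skipped because no family member has those labels, and the second 21 is skipped because it is a repeat.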
5) Assume your class has 36 students and you want a simple random sample of 10 of them. A student suggests asking each student to roll a die, and if a student rolls a 6, then he or she is in your sample. Explain why this is not a good method. How many students would you expect to get? A) You would probably not get 10 students in your sample if each student rolled the die; we would expect about 6 students. B) You would probably not get 10 students in your sample if each student rolled the die; we would expect about 4 students. C) You would probably not get 10 students in your sample if each student rolled the die; we would expect about 2 students. D) You would probably not get 10 students in your sample if each student rolled the die; we would expect about 8 students. Answer: A 4 Use Surveys to Make Conclusions About the Population MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
1) A researcher is interested in knowing how many students of a particular college would be interested in having a sandwich shop open within a block of campus. She surveys 300 students on campus at different times of day by asking students randomly if they would answer a few questions. She receives a response from 35. Of those who responded, 65% favored having a sandwich shop within a block of campus. This scenario describes sampling bias. Which statement most accurately describes the bias? I. The researcher asked only one question, which results in bias. II. Only students that volunteered responded, which could result in bias. III. Not enough students responded, which can result in bias. A) I only B) III only C) both II and III D) II only Answer: D 2) Max organizes weekly concerts in the local park. He is interested in knowing what type of music people enjoy. Before one particular concert, he makes an announcement to the audience, and asks people to visit a web page and take a survey to vote on whether or not they liked the concert. 75 people take the survey, and 58% respond favorably. Max claims that 58% of all of those who were at the concert liked the music. This scenario describes sampling bias. Which statement most accurately describes the bias in Max's method? I. Max asked only one question, which results in bias. II. The attendees voluntarily responded, which could result in bias. III. Not enough attendees responded, which can result in bias. A) I only B) II only C) III only D) both II and III Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) Explain the difference between a statistic and a parameter. Give an example of each. Answer: A statistic is a numerical summary of a sample of data; examples will vary. A parameter is a numerical value that characterizes some aspect of the population; examples will vary.
4) Frederick is interested in whether residents of his community are opposed to the construction of a party store on the corner of a busy intersection. He creates a simple random sample of 150 residents in the community. When this sample is polled, he receives responses from 55 residents. Of those who responded, 60% were opposed to the construction of the party store in the community, so Frederick concludes that the majority of residents in his community oppose the construction of the party store. Explain what is wrong with this approach. Answer: Frederick's survey may have nonresponse bias. The residents who chose not to participate may have different views about the survey topic than those who did respond.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 5) A township conducts a survey to determine whether voters favor passing a bond to fund a library addition construction project. All registered voters are called. Of those called, 20% answer the survey call. Of those who respond, 59% say they favor passing the bond. Give a reason why the township should be confident or cautious about predicting that the bond will pass. A) The township should be cautious because of the low survey response rate. B) The township should be confident because of the low survey response rate. C) The township should be confident because of the high survey response rate. D) The township should be cautious because of the high survey response rate. Answer: A 6) To determine if patrons are satisfied with food quality, a restaurant surveys patrons by placing a paper survey inside their menus one evening. All patrons receive a menu as they are seated at their table. Completed surveys are placed in boxes at the restaurant exits. On the evening of the survey, 150 patrons ate dinner at the restaurant. Forty surveys were completed, and 80% of these surveys indicated dissatisfaction with the food. Should the restaurant conclude that patrons were dissatisfied with food quality? Explain. A) The survey response might be biased because those who had strong feelings about the food (positive or negative) may be more likely to return the survey. Also the survey response rate was low (40/150 = 27%). B) The survey response might be biased because those who had strong feelings about the food (positive or negative) may be less likely to return the survey. Also the survey response rate was low (40/150 = 27%). C) The survey response might be slightly biased because those who had strong feelings about the food (positive or negative) may be more likely to return the survey. Also the survey response rate was moderate (80/150 = 53%). D) The survey response might be slightly biased because those who had strong feelings about the food (positive or negative) may be less likely to return the survey. Also the survey response rate was moderate (40/80 = 50%). Answer: A
7) A random sample was taken from a study of views on euthanasia. In the study, a student asked a question two ways: 1. With persuasion: “My sister has been in a coma for six years. Her doctor says she is very likely to come out of the coma and make a full recovery. Now do you support or oppose euthanasia?” 2. Without persuasion: “Do you support or oppose euthanasia?” Here is a breakdown of her actual data.
Men                   With Persuasion   No Persuasion
For euthanasia               7               12
Against euthanasia           8                1
Women                 With Persuasion   No Persuasion
For euthanasia               3                6
Against euthanasia           7                4
a. What percentage of those persuaded against it support euthanasia? b. What percentage of those not persuaded against it support euthanasia? c. Compare the percentages in parts a and b. What should we conclude about the role of persuasion in survey questioning? A) a. With persuasion: 10/25 = 40%. b. Without persuasion: 18/23 = 78%. c. As expected, those who received persuasion seemed to be more likely to oppose euthanasia. B) a. With persuasion: 12/28 = 64%. b. Without persuasion: 10/28 = 36%. c. As expected, she spoke against it, but more who heard her statements against it supported euthanasia, compared with those who did not hear her persuasion. C) a. With persuasion: 5/20 = 25%. b. Without persuasion: 15/20 = 75%. c. As expected, she spoke against it, and fewer who heard her statements against it supported euthanasia, compared with those who did not hear her persuasion. D) a. With persuasion: 15/20 = 75%. b. Without persuasion: 5/20 = 25%. c. Unexpectedly, she spoke against it, but more who heard her statements against it supported euthanasia, compared with those who did not hear her persuasion. Answer: A 8) Use the following tables for the problem.
Men                   With Persuasion   No Persuasion
For euthanasia               7               12
Against euthanasia           8                1
Women                 With Persuasion   No Persuasion
For euthanasia               3                6
Against euthanasia           7                4
A random sample of men and women was taken from a study of views on euthanasia. The result by gender is given in the tables. Make a single table by combining men for euthanasia into one group, men opposing it into another, women for it into one group, and women opposing it into another. Show your table. The student who collected the data could have made the results misleading by trying persuasion more often on one gender than on the other. She used persuasion on 10 of 20 women (50%) and on 15 of 28 men (54%). a. What percentage of the men support euthanasia? What percentage of the women support it? b. On the basis of these results, if you were in a coma and did not want to be euthanized, would you want a man or a woman in charge of your fate? A)
                      Men   Women
For Euthanasia         19      9
Against Euthanasia      9     11
a. 19/28 = 68% of men are for euthanasia, 9/20 = 45% of women are for euthanasia. b. You would want a woman in charge of your fate. B)
                      Men   Women
For Euthanasia          9     11
Against Euthanasia     19      9
a. 9/28 = 32% of men are for euthanasia, 11/20 = 55% of women are for euthanasia. b. You would want a man in charge of your fate. C)
                      Men   Women
For Euthanasia          9     19
Against Euthanasia     11      9
a. 9/20 = 45% of men are for euthanasia, 19/28 = 68% of women are for euthanasia. b. You would want a man in charge of your fate. D)
                      Men   Women
For Euthanasia         11      9
Against Euthanasia      9     19
a. 11/20 = 55% of men are for euthanasia, 9/28 = 32% of women are for euthanasia. b. You would want a woman in charge of your fate. Answer: A
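Worked sketch for questions 7 and 8 (an illustration, not part of the original answer key): the keyed percentages can be recomputed from the four cells of the men's and women's tables. The dictionary layout and the helper name `percent_for` are assumptions made for this example.

```python
# (for, against) counts copied from the tables in questions 7 and 8
data = {
    ("men", "persuasion"): (7, 8),
    ("men", "no persuasion"): (12, 1),
    ("women", "persuasion"): (3, 7),
    ("women", "no persuasion"): (6, 4),
}


def percent_for(position, value):
    """Percent supporting euthanasia among cells whose key matches `value`.

    position 0 selects by gender, position 1 selects by treatment.
    """
    cells = [cell for key, cell in data.items() if key[position] == value]
    supporters = sum(f for f, _ in cells)
    total = sum(f + a for f, a in cells)
    return 100 * supporters / total


for treatment in ("persuasion", "no persuasion"):
    print(treatment, round(percent_for(1, treatment)))
for gender in ("men", "women"):
    print(gender, round(percent_for(0, gender)))
```

Grouping by treatment reproduces 10/25 and 18/23 from question 7; grouping by gender reproduces 19/28 and 9/20 from question 8.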
7.2 Measuring the Quality of a Survey 1 Understand Bias and Precision MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) If it is being used to make inferences about a population, a good statistic (or estimator) should A) Be derived from population data. B) Be accurate and precise. C) Show correlation. D) None of these. Answer: B
2) Which of the following statements is not true about a sampling distribution? A) It is the probability distribution of a statistic. B) It is used for making inferences about a population. C) It tells us how often we can expect to see particular values of our estimator. D) All these statements are true. Answer: D 3) Which of the following statements is not true about a sampling distribution? A) It gives probabilities for a statistic. B) It gives characteristics of the estimator, such as bias and precision. C) It is used for making inferences about a sample. D) It is the probability distribution of a statistic. Answer: C 4) According to a snack cracker manufacturer, a batch of butter crackers has a defect rate of 8%. Suppose a quality inspector randomly inspects 500 crackers. Complete the following statement: The quality inspector should expect ____ defective crackers, give or take ____ crackers. A) 60; 16 B) 40; 6 C) 40; 16 D) 60; 12
Answer: B 5) According to a snack cracker manufacturer, a batch of butter crackers has a defect rate of 6%. Suppose a quality inspector randomly inspects 400 crackers. Complete the following statement: The quality inspector should expect ____ defective crackers, give or take ____ crackers. A) 45; 6 B) 24; 5 C) 25; 12 D) 48; 5
Answer: B 6) There are four colors in a bag containing 500 plastic chips. It is known that 28% of the chips are green. On average, how many chips from a random sample of 50 (with replacement) would be expected to be green? A) 18 B) 28 C) 14 D) Not enough information to determine expected value. Answer: C 7) There are four colors in a bag containing 600 plastic chips. It is known that 34% of the chips are yellow. On average, how many chips from a random sample of 30 (with replacement) would be expected to be yellow? Round to the nearest whole chip. A) About 5 B) About 10 C) About 16 D) Not enough information to determine expected value. Answer: B Use the following information to answer the question. A pescatarian is a person who eats fish and seafood but no other animal. An event planner does some research and finds that approximately 2.75% of the people in the area where a large event is to be held are pescatarian. Treat the 250 guests expected at the event as a simple random sample from the local population of about 150,000. 8) On average, how many of the guests would be expected to be pescatarian, give or take how many? Round to the nearest whole person. A) There is not enough information given to calculate expected value. B) 6 people, give or take 5 people C) 8 people, give or take 4 people D) 7 people, give or take 3 people Answer: D
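Worked sketch for questions 4 through 8 (an illustration, not part of the original answer key): for n independent draws with success rate p, the expected count is n·p and the "give or take" amount is the standard error of the count, sqrt(n·p·(1 − p)). The function name below is an assumption made for this example.

```python
from math import sqrt


def expected_count(n, p):
    """Return (expected count, standard error of the count) for n draws."""
    return n * p, sqrt(n * p * (1 - p))


examples = [
    ("crackers, 8% defective", 500, 0.08),
    ("crackers, 6% defective", 400, 0.06),
    ("pescatarian guests", 250, 0.0275),
]
for label, n, p in examples:
    mean, se = expected_count(n, p)
    print(f"{label}: expect {mean:.2f}, give or take {se:.2f}")
```

Rounding each pair to whole units gives the keyed answers for questions 4, 5, and 8.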
Use the following information to answer the question. A pollotarian is a person who eats poultry but no red meat. A wedding planner does some research and finds that approximately 3.5% of the people in the area where a large wedding is to be held are pollotarian. Treat the 300 guests expected at the wedding as a simple random sample from the local population of about 200,000. 9) On average, how many of the guests would be expected to be pollotarian, give or take how many? Round to the nearest whole person. A) There is not enough information given to calculate expected value. B) 20 people, give or take 5 people C) 15 people, give or take 4 people D) 11 people, give or take 3 people Answer: D Use the following information to answer the question. In a recent poll of 1100 randomly selected home delivery truck drivers, 26% said they had encountered an aggressive dog on the job at least once. 10) What is the standard error for the estimate of the proportion of all home delivery truck drivers who have encountered an aggressive dog on the job at least once? Round to the nearest ten-thousandth. A) 0.1322 B) 0.0132 C) 0.0002 D) 0.0141 Answer: B Use the following information to answer the question. In a recent poll of 1200 randomly selected adult office workers, 32% said they had worn a Halloween costume to the office at least once. 11) What is the standard error for the estimate of the proportion of all American adult office workers that have worn a Halloween costume to the office? Round to the nearest ten-thousandth. A) 0.0002 B) 0.0135 C) 0.4672 D) 0.0143 Answer: B Solve the problem. 12) If you use an instrument to measure 1 ml of water and do it 3 times, weighing the water each time, you get 0.751, 0.753, and 0.750 grams. One ml of water should weigh 1.0 grams. The measurements are _________.
A) accurate but not precise B) precise but not accurate C) both precise and accurate D) neither precise nor accurate Answer: B 13) Which of the following statements is a characteristic of the sampling distribution of a sample proportion? A) It is the probability distribution of a parameter. B) It cannot be used for making inferences about a population. C) The standard deviation of the sampling distribution is the same as the standard deviation of the population from which the data were sampled. D) The mean of the sampling distribution is the same as the mean of the population from which the data were sampled. Answer: D
14) The bias of p̂ is zero if certain conditions are met. Identify which condition is not required. I. The sample is randomly selected from the population of interest. II. The population must be at least 100 times bigger than the sample size. III. The standard error of the sampling distribution is the same as the standard deviation of the population. IV. The mean of the sampling distribution is the mean of the population from which the data were sampled. A) I only B) II only C) II and III D) I and IV Answer: C
15) A survey recently reported that 35% of U.S. citizens believe that we never landed on the moon. The pertinent question in the survey was "Do you think it was possible or impossible that the event of the U.S. landing on the moon never happened?" Select the most accurate statement about this survey. A) The question worded in a confusing way caused measurement bias. B) The question worded in a confusing way caused sampling bias. C) The question worded in a confusing way caused response bias. D) The results are significant. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 16) A sampling method should be as precise and accurate as possible. Explain what these two terms mean and how each is measured. Answer: Precision means that sampling results are consistent when a sampling method is repeated. The precision of a sampling method is measured by the standard error. Accuracy means sampling results are centered around the population parameter. Accuracy is measured in terms of bias. 2 Find the Error in a Sample MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) What generally happens to the sampling error as the sample size is increased? A) It gets smaller. B) It gets larger. C) It gets more predictable. D) It gets less predictable. Answer: A Solve the problem. 2) A group of screwdrivers produced in a day at a particular factory has a defect rate of 0.7%. Suppose a quality inspector randomly inspects 500 screwdrivers. Complete the following statement: The quality inspector should expect ____ defective screwdrivers, with an error of ____. A) 35; 0.4% B) 3.5; 0.4% C) 1; 4.0% D) 1; 0.4% Answer: B 3) A group of battery powered toys produced in a day at a factory has a defect rate of 0.5%. Suppose a quality inspector randomly inspects 200 of the toys.
Complete the following statement: The quality inspector should expect ____ defective toys, with an error of ____. A) 1; 5% B) 10; 5% C) 1; 0.5% D) 10; 0.5% Answer: C 4) There are five colors available in each bag of Skittles. It is known that 18% of the Skittles are red. On average, how many Skittles from a random sample of 40 (with replacement) would be expected to be red? Round to the nearest whole Skittle. A) About 1 B) About 7 C) About 16 D) About 8 Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A marble manufacturer advertises that its bags of marbles will contain 25% "milky-white" marbles. Suppose that a bag containing 80 marbles is inspected. 5) What value should we expect for our sampling percentage of milky-white marbles? How many marbles would this be? Round to the nearest whole marble. Answer: 25%; 20 milky-white marbles
6) What is the standard error? Round to the nearest tenth of a percent. Answer: 4.8% 7) Use your answers to fill in the blanks: We expect ________% milky-white marbles, give or take ________%. Answer: 25%; 4.8%
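Worked sketch for the standard error questions in this section (an illustration, not part of the original answer key): the standard error of a sample proportion is sqrt(p·(1 − p)/n). The function name is an assumption made for this example; in the two poll questions the sample proportion stands in for the unknown p.

```python
from math import sqrt


def standard_error(p, n):
    """Standard error of a sample proportion for sample size n."""
    return sqrt(p * (1 - p) / n)


# Marble bag: 25% milky-white, 80 marbles inspected
print(f"{100 * standard_error(0.25, 80):.1f}%")
# Polls: 26% of 1100 truck drivers, 32% of 1200 office workers
print(f"{standard_error(0.26, 1100):.4f}")
print(f"{standard_error(0.32, 1200):.4f}")
```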
7.3 The Central Limit Theorem for Sample Proportions 1 Use the Central Limit Theorem for Sample Proportions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Suppose that New Mexico lawmakers survey 160 randomly selected registered voters to see if they favor stricter laws regarding motorcycle helmet use for riders over the age of 17. The lawmakers believe the population proportion in favor of changing the law is 6% (based on historical data and previous votes). Which of the following conditions for the Central Limit Theorem are not met? A) The population proportion is too small and will not have enough expected successes. B) Relative to the population, the sample is not large enough. C) The population proportion is too small and will not have enough expected failures. D) None of these, all the conditions of the CLT are met. Answer: A 2) Suppose that Illinois lawmakers survey 130 randomly selected registered voters to see if they favor charging a deposit on aluminum cans to encourage recycling. The lawmakers believe the population proportion in favor of changing the law is 93% (based on historical data and previous votes). Which of the following conditions for the Central Limit Theorem are not met? A) The population proportion is too large and will not have enough expected failures. B) Relative to the population, the sample is not large enough. C) The population proportion is too small and will not have enough expected successes. D) None of these, all the conditions of the CLT are met. Answer: A Use the following information to answer the question. A pollotarian is a person who eats poultry but no red meat. A wedding planner does some research and finds that approximately 3.5% of the people in the area where a large wedding is to be held are pollotarian. Treat the 300 guests expected at the wedding as a simple random sample from the local population of about 200,000.
3) Suppose the wedding planner assumes that 5% of the guests will be pollotarian so she orders 15 pollotarian meals. What is the approximate probability that more than 5% of the guests are pollotarian and therefore she will not have enough pollotarian meals? Round to the nearest thousandth. A) 0.079 B) 0.421 C) 0.489 D) None of these Answer: A 4) Suppose the wedding planner assumes that only 3% of the guests will be pollotarian so she orders 9 pollotarian meals. What is the approximate probability that she will have too many pollotarian meals? Round to the nearest thousandth. A) 0.477 B) 0.681 C) 0.319 D) 0.251 Answer: C
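Worked sketch for questions 3 and 4 (an illustration, not part of the original answer key): under the Central Limit Theorem the sample proportion is treated as approximately Normal with mean p and standard deviation sqrt(p·(1 − p)/n). The sketch assumes no continuity correction, which appears to match the keyed answers; the function name is an assumption made for this example.

```python
from math import sqrt
from statistics import NormalDist


def sampling_distribution(p, n):
    """Normal approximation to the distribution of a sample proportion."""
    return NormalDist(mu=p, sigma=sqrt(p * (1 - p) / n))


dist = sampling_distribution(0.035, 300)  # pollotarian wedding guests
too_few_meals = 1 - dist.cdf(0.05)   # P(sample proportion > 5%)
too_many_meals = dist.cdf(0.03)      # P(sample proportion < 3%)
print(round(too_few_meals, 3), round(too_many_meals, 3))
```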
Use the following information to answer the question. A pescatarian is a person who eats fish and seafood but no other animal. An event planner does some research and finds that approximately 2.75% of the people in the area where a large event is to be held are pescatarian. Treat the 250 guests expected at the event as a simple random sample from the local population of about 150,000. 5) Suppose the event planner assumes that 4% of the guests will be pescatarian so he orders 10 pescatarian meals. What is the approximate probability that more than 4% of the guests are pescatarian and that he will not have enough pescatarian meals? Round to the nearest thousandth. A) 0.387 B) 0.113 C) 0.470 D) None of these Answer: B 6) Suppose the event planner assumes that only 1.6% of the guests will be pescatarian so he orders 4 pescatarian meals. What is the approximate probability that he will have too many pescatarian meals? Round to the nearest thousandth. A) 0.613 B) 0.387 C) 0.113 D) 0.245 Answer: C Solve the problem. 7) It is thought that 10% of all children have some level of nearsightedness. 200 randomly selected children, selected without replacement, had their eyesight tested. Can the Central Limit Theorem be used to find a good approximation of the probability that more than 15% of the children will be nearsighted? Which statement is not true? I. Relative to the population, the sample is not large enough. II. The sample does not need to be randomized since all children are likely to have vision problems. III. Sample size should be large enough so np > 10 and nq > 10. This condition is met. A) I only B) I and II C) II and III D) I and III Answer: B 8) A survey investigates whether residents of a certain city support an educational tax increase. Which of the following statements are true? A) The true proportion of "Yes" in the population is p. B) The proportion of "Yes" in one sample of size n will be close to p if the Central Limit Theorem conditions are satisfied.
C) The proportions of "Yes" in many samples of size n will be approximately normally distributed with mean = p and standard deviation equal to the square root of p*(1 - p)/n. D) All of these statements are true. Answer: D 9) According to the manufacturer of the candy Skittles, 20% of the candy produced are red. If we take a random sample of 100 bags of Skittles, what is the probability that the proportion in our sample of red candies will be less than 20%? Which statement, if any, is not true for the conditions to use the Central Limit Theorem? A) The success/failure conditions are satisfied. B) The sample size is small compared to the population. C) Though not from a random sample, we can assume that the bags are representative of the population. D) All of these statements are true. Answer: D 10) Suppose that Illinois lawmakers survey 130 randomly selected registered voters to see if they favor charging a deposit on aluminum cans to encourage recycling. The lawmakers believe the population proportion in favor of changing the law is 93% (based on historical data and previous votes). Which of the following conditions for the Central Limit Theorem are not met? A) The population proportion is too large and will not have enough expected failures. B) Relative to the population, the sample is not large enough. C) The population proportion is too small and will not have enough expected successes. D) None of these, all the conditions of the CLT are met. Answer: A
11) According to the manufacturer of the candy Skittles, 25% of the candy produced are green. If we take a random sample of 10 bags of Skittles, what is the probability that the proportion in our sample of green candies will be more than 25%? Which statement, if any, is true for the conditions to use the Central Limit Theorem? A) The success/failure conditions may not be satisfied. B) The sample size is small compared to the population. C) Though not from a random sample, we can assume that the bags are representative of the population. D) All of these statements are not true. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 12) Suppose that Michigan lawmakers survey 500 randomly selected registered voters to see if they favor an extension of the fall duck hunting season. The lawmakers believe the population proportion in favor of extending the duck hunting season is 45% (based on historical data and previous votes). State the three conditions of the Central Limit Theorem and explain whether each condition is satisfied in this scenario. Answer: The sample is random and independent: it is stated that this is a random sample, and voters are independent. The sample is large: a sample of 500 is large enough since it will have at least 10 successes and failures (0.45 × 500 ≥ 10 and 0.55 × 500 ≥ 10). The population is big: the population is at least ten times larger than the sample of 500. An event planner does some research and finds that in the area where a large children's event is to be held, approximately 1.75% of the children are lactose intolerant. Treat the 250 children expected at the event as a simple random sample from the local population of about 100,000 children. 13) On average, how many of the children attending the event would be expected to be lactose intolerant, give or take how many? Round to the nearest whole person. Answer: 4 children, give or take 2 children.
14) Suppose the event planner assumes that 2.8% of the children attending the event will be lactose intolerant so he orders 7 lactose-free meals. What is the approximate probability that more than 2.8% of the children attending the event are lactose intolerant and that he will not have enough lactose-free meals? Round to the nearest thousandth. Answer: 0.103 15) Suppose the event planner assumes that only 0.8% of the children attending the event will be lactose intolerant so he orders 2 lactose-free meals. What is the approximate probability that he will have too many lactose-free meals? Round to the nearest thousandth. Answer: 0.126
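Worked sketch for the condition-checking questions in this section (an illustration, not part of the original answer key): the two numeric Central Limit Theorem conditions named in the answer to question 12 can be checked mechanically. The function name, the dictionary keys, and the optional `population_size` parameter are assumptions made for this example. The random-sample condition depends on how the data were collected and cannot be checked from the numbers.

```python
def clt_conditions(n, p, population_size=None):
    """Check the large-sample and big-population conditions for a proportion."""
    checks = {
        "at least 10 expected successes": n * p >= 10,
        "at least 10 expected failures": n * (1 - p) >= 10,
    }
    # The population check is included only when a population size is known.
    if population_size is not None:
        checks["population at least 10 times the sample"] = (
            population_size >= 10 * n
        )
    return checks


print(clt_conditions(160, 0.06))   # New Mexico helmet survey
print(clt_conditions(130, 0.93))   # Illinois deposit survey
print(clt_conditions(500, 0.45))   # Michigan duck hunting survey
print(clt_conditions(300, 0.035, population_size=200_000))  # wedding guests
```

The first call reports too few expected successes (160 × 0.06 = 9.6) and the second too few expected failures (130 × 0.07 = 9.1), matching the keyed answers to questions 1, 2, and 10.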
7.4 Estimating the Population Proportion with Confidence Intervals 1 Find Confidence Intervals for Population Proportions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following statements is true about the confidence interval for a population proportion? A) The end points of the confidence interval are equal to the population proportion plus-or-minus a calculated amount called the standard error. B) The end points of the confidence interval are equal to the sample proportion plus-or-minus a calculated amount called the margin of error. C) The confidence interval for a proportion will always contain the true population proportion. D) The confidence interval for a proportion does not need a specified confidence level. Answer: B
2) Complete the statements by filling in the blank. When constructing a confidence interval, to increase the level of confidence, you must ________ [increase/decrease] the margin of error and the confidence interval will be _________ [narrower/wider]. A larger sample size will improve the precision of the confidence interval, therefore, all things being equal, a larger sample size will result in a __________ [smaller/larger] margin of error and the confidence interval will be _________ [narrower/wider]. A) Decrease, narrower. Larger, wider. B) Decrease, wider. Larger, narrower. C) Increase, narrower. Larger, wider. D) Increase, wider. Smaller, narrower. Answer: D Use the following information to answer the question. In a recent poll of 1200 randomly selected adult office workers, 32% said they had worn a Halloween costume to the office at least once. 3) What is the margin of error, using a 95% confidence level, for estimating the true population proportion of adult office workers who have worn a Halloween costume to the office at least once? (Round to the nearest thousandth) A) 0.158 B) 0.053 C) 0.013 D) 0.026 Answer: D 4) Report the 95% confidence interval for the percentage of all adult office workers who have worn a Halloween costume to the office at least once. (Round final calculations to the nearest tenth of a percent) A) (28.0%, 36.1%) B) (30.7%, 33.4%) C) (29.4%, 34.6%) D) None of these Answer: C Use the following information to answer the question. In a recent poll of 1100 randomly selected home delivery truck drivers, 26% said they had encountered an aggressive dog on the job at least once. 5) What is the margin of error, using a 95% confidence level, for estimating the true population percentage of home delivery truck drivers who have encountered an aggressive dog on the job at least once?
(Round to the nearest thousandth) A) 0.026 B) 0.013 C) 0.004 D) 0.053 Answer: A 6) Report the 95% confidence interval for the percentage of all home delivery truck drivers who have encountered an aggressive dog on the job at least once. (Round final calculations to the nearest tenth of a percent) A) (24.7%, 27.3%) B) (23.4%, 28.6%) C) (20.7%, 31.3%) D) None of these Answer: B Solve the problem. 7) A random sample of 830 adult television viewers showed that 52% planned to watch sporting event X. The margin of error is 3 percentage points with a 95% confidence. Does the confidence interval support the claim that the majority of adult television viewers plan to watch sporting event X? Why? A) No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 49% and 55%. The true proportion could be less than 50%. B) No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 50.5% and 53.5%. The lower limit of the confidence interval is just too close to 50% to say for sure. C) Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 50.5% and 53.5%. This is strong evidence that the true proportion is greater than 50%. D) Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 49% and 55%. Since the confidence interval is mostly above 50% it is likely that the true proportion is greater than 50%. Answer: A
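Worked sketch for questions 3 through 6 (an illustration, not part of the original answer key): a 95% confidence interval for a proportion is the sample proportion plus or minus the margin of error, where the margin of error is a multiplier times the estimated standard error. The multiplier z = 1.96 is an assumption here; some texts use 2 instead, which can shift the last rounded digit. The function name is also an assumption made for this example.

```python
from math import sqrt


def proportion_ci(p_hat, n, z=1.96):
    """Return (margin of error, lower limit, upper limit) for a proportion."""
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return margin, p_hat - margin, p_hat + margin


for label, p_hat, n in [("truck drivers", 0.26, 1100),
                        ("office workers", 0.32, 1200)]:
    margin, low, high = proportion_ci(p_hat, n)
    print(f"{label}: margin {margin:.3f}, "
          f"interval ({100 * low:.1f}%, {100 * high:.1f}%)")
```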
8) A random sample of 950 adult television viewers showed that 48% planned to watch sporting event X. The margin of error is 4 percentage points with a 95% confidence. Does the confidence interval support the claim that the majority of adult television viewers plan to watch sporting event X? Why? A) No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 44% and 52%. B) No; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 46% and 50%. C) Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 44% and 52%. D) Yes; the confidence interval means that we are 95% confident that the population proportion of adult television viewers who plan to watch sporting event X is between 46% and 50%. Answer: C 9) Suppose that in a recent poll of 1200 adults between the ages of 35 and 45, 38% surveyed said they have thought about getting elective plastic surgery. Find the 95% confidence interval for the proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery then choose the correct interpretation. (Round to the nearest tenth of a percent) A) The population proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 35.2% and 40.7%, with a confidence level of 95%. B) There is a 95% chance that the population of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 35.2% and 40.7%. C) The population proportion of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 28.5% and 47.5%, with a confidence level of 95%. 
D) There is a 95% chance that the population of adults ages 35 to 45 who have thought about getting elective plastic surgery is between 28.5% and 47.5%. Answer: A Find the required confidence interval. 10) Suppose that in a recent poll of 900 adults between the ages of 35 and 45, 22% surveyed said they have thought about participating in an extreme sport such as bungee jumping. Find the 95% confidence interval for the proportion of adults ages 35 to 45 who have thought about participating in an extreme sport such as bungee jumping, then choose the correct interpretation. (Round to the nearest tenth of a percent) A) The population proportion of adults aged 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 13.9% and 30.1%, with a confidence level of 95%. B) There is a 95% chance that the population of adults aged 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 13.9% and 30.1%. C) There is a 95% chance that the population of adults aged 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 19.3% and 24.7%. D) The population proportion of adults aged 35 to 45 who have thought about participating in an extreme sport such as bungee jumping is between 19.3% and 24.7%, with a confidence level of 95%. Answer: D 11) A researcher wishes to estimate the proportion of adults in the city of Darby who are vegetarian. In a random sample of 1,613 adults from this city, the proportion that are vegetarian is 0.075. Find a 90% confidence interval for the proportion of all adults in the city of Darby that are vegetarians. A) (0.0642, 0.0858) B) (0.0684, 0.0816) C) (0.0666, 0.0834) D) (0.0545, 0.0955) Answer: A 12) In a sample of 517 patients who underwent a certain type of surgery, 17% experienced complications. Find a 90% confidence interval for the proportion of all those undergoing this surgery who experience complications.
A) (0.1428, 0.1972) B) (0.1535, 0.1865) C) (0.1488, 0.1912) D) (0.1338, 0.2062) Answer: A
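The margin-of-error and interval answers in the items above can be checked numerically. The sketch below assumes the usual large-sample formula, margin of error = z* × sqrt(p-hat × (1 − p-hat) / n), with z* = 1.96 for 95% confidence and z* = 1.645 for 90% confidence. The function name `proportion_ci` is a hypothetical helper introduced for illustration, not something defined in the text.

```python
import math

def proportion_ci(p_hat, n, z_star=1.96):
    """Large-sample confidence interval for one proportion (hypothetical helper).

    Returns (standard_error, margin_of_error, lower, upper).
    """
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # estimated standard error
    me = z_star * se                          # margin of error
    return se, me, p_hat - me, p_hat + me

# Halloween-costume poll: n = 1200, p-hat = 0.32, 95% confidence
se, me, lower, upper = proportion_ci(0.32, 1200)
print(round(me, 3))                                   # 0.026
print(round(lower * 100, 1), round(upper * 100, 1))   # 29.4 34.6

# Darby vegetarians: n = 1613, p-hat = 0.075, 90% confidence (z* = 1.645)
_, _, lower90, upper90 = proportion_ci(0.075, 1613, z_star=1.645)
print(round(lower90, 4), round(upper90, 4))           # 0.0642 0.0858
```

Changing only `z_star` or only `n` in these calls shows the two effects the fill-in-the-blank items ask about: a higher confidence level widens the interval, and a larger sample narrows it.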
Answer the question. 13) Which of the following statements is true about the confidence interval for a population proportion? A) It is equal to the population proportion plus-or-minus a calculated amount called the standard error. B) It is equal to the sample proportion plus-or-minus a calculated amount called the margin of error. C) The confidence interval for a proportion will always contain the true population proportion. D) The confidence interval for a proportion does not need a specified confidence level. Answer: B
14) Complete the statement by filling in the blanks. When constructing a confidence interval, if the level of confidence increases, the margin of error must _____ [increase/decrease] and the confidence interval will be _____ [narrower/wider]. A) Decrease, narrower. B) Decrease, wider. C) Increase, narrower. D) Increase, wider. Answer: D 15) Complete the statement by filling in the blanks. A larger sample size will improve the precision of the confidence interval; therefore, assuming no other values change, the margin of error will _____ [increase/decrease] and the confidence interval will be _____ [narrower/wider]. A) Increase, wider. B) Increase, narrower. C) Decrease, wider. D) Decrease, narrower. Answer: D 16) Complete the statement by filling in the blanks. When constructing a confidence interval, if the level of confidence decreases, the margin of error will _________ [increase/decrease] and the confidence interval will be _________ [narrower/wider]. A) Decrease, wider. B) Decrease, narrower. C) Increase, wider. D) Increase, narrower. Answer: B 17) Complete the statement by filling in the blanks. When constructing a confidence interval, collecting a larger sample will, assuming no other values change, _____ [increase/decrease] the margin of error and the confidence interval will be _________ [narrower/wider]. A) Increase, narrower. B) Increase, wider. C) Decrease, narrower. D) Decrease, wider. Answer: C From a random sample of workers at a large corporation you find that 58% of 200 went on a vacation last year away from home for at least a week. 18) An approximate 95% confidence interval is (0.50, 0.66). A student said this means 95% of the workers fall in the interval 0.50 to 0.66. What is the correct interpretation? A) 95% of the coworkers fall in the interval (0.50, 0.66). B) We are 95% confident that the proportion of coworkers who went on a vacation last year away from home for at least a week is between 50% and 66%.
C) There is a 95% chance that a randomly selected coworker went on a vacation last year away from home for at least a week. D) We are 95% confident that between 50% and 66% of the samples will have a proportion near 58%. Answer: B
19) Which of the following statements are correct concerning the 95% confidence interval of (0.50, 0.66) of coworkers who went on a vacation last year away from home for at least a week? A) If the sample size is 500 instead of 200, the confidence interval will be larger. B) A maximum of 66% of the coworkers went on a vacation last year away from home for at least a week. C) If the confidence level were changed from 95% to 99%, the confidence interval would become wider. D) If the confidence level were changed from 95% to 90%, the confidence interval would become wider. Answer: C 20) What is the margin of error for the 95% confidence interval of (0.50, 0.66) of coworkers who went on a vacation last year away from home for at least a week? A) 0.08 B) 0.16 C) 0.58 D) 0.04 Answer: A In a recent poll of 1100 randomly selected home delivery truck drivers, 26% said they had encountered an aggressive dog on the job at least once. 21) What is the standard error for the estimate of the proportion of all home delivery truck drivers who have encountered an aggressive dog on the job at least once? Round to the nearest ten-thousandth. A) 0.1322 B) 0.0132 C) 0.0002 D) 0.0141 Answer: B Solve the problem. 22) Is it plausible that more than 10% of Americans believe in aliens? A random sample of 2000 adult Americans was surveyed and 15% of them said that they believed in aliens. Find the 95% confidence interval for the proportion of Americans who believe in aliens, then choose the correct interpretation. (Round to the nearest tenth of a percent) A) The population proportion of Americans who believe in aliens is between 15% +/- 1.6% with a confidence level of 95%. The interval is higher than 10% and therefore, it is plausible that more than 10% of Americans believe in aliens. B) The population proportion of Americans who believe in aliens is between 15% +/- 1.6% with a confidence level of 95%.
The interval includes 10% and therefore, it is plausible that at least 10% of Americans believe in aliens. C) The population proportion of Americans who believe in aliens is between 10% +/- 1.6% with a confidence level of 95%. The interval includes 10% and therefore, it is plausible that at least 10% of Americans believe in aliens. D) The population proportion of Americans who believe in aliens is between 15% +/- 0.8% with a confidence level of 95%. The interval does not include 10% and therefore, it is not plausible that at least 10% of Americans believe in aliens. Answer: A 23) Which of the following statements is true about the confidence interval for a population proportion? A) The confidence interval for a proportion will always contain the true population proportion. B) The confidence interval for a proportion does not need a specified confidence level. C) It is equal to the population proportion plus-or-minus the standard error. D) It is equal to the sample proportion plus-or-minus a calculated amount called the margin of error. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 24) Suppose a city manager conducts a poll and finds that a 95% confidence interval for the proportion of residents who support parking restrictions during snow removal periods is 63% to 71%. Explain the meaning of the 95% confidence level and what the interval represents. Answer: The 95% indicates that if many polls were taken, 95% of them would result in confidence intervals that include the true population proportion of residents that support parking restrictions during snow removal. The interval represents a population estimate +/- a margin of error which depends on the confidence level (95%) and the sample size. 25) Explain the difference between the standard error of a sample proportion and the margin of error of a confidence interval for a population proportion. Answer: The margin of error is an amount that is added and subtracted from the estimate which provides the range of plausible values around the sample proportion based on a chosen level of confidence. The standard error measures the variability of the sample proportion as it varies from sample to sample. It is not about the variability of a particular sample. In a recent poll of 900 randomly selected adults, 37% reported that they could not swim 24 yards (the length of a typical gymnasium lap pool). 26) What is the margin of error, using a 95% confidence level, for estimating the true proportion of adults who self-report that they cannot swim 24 yards? Round to the nearest thousandth. Answer: 0.032 27) Report the 95% confidence interval for the proportion of adults who self-report that they cannot swim 24 yards. Round final calculations to the nearest tenth of a percent. Answer: The upper limit would be 37% + 3.2% = 40.2%. The lower limit would be 37% – 3.2% = 33.8%. (33.8%, 40.2%) Solve the problem.
28) A survey of 800 randomly selected senior citizens showed that 55% said they planned to watch an upcoming political debate on television. The margin of error for the 95% confidence interval is 3.5 percentage points. Does the confidence interval support the claim that the majority of senior citizens plan to watch the upcoming political debate on television? Explain why or why not. Answer: Yes, a confidence interval of 55% ± 3.5% would include plausible population parameter values that are greater than 50% so the claim would not be unreasonable. 29) Suppose that you and a friend read the following statement in a news report, "A recent poll found that 54% of voters, give or take 3%, plan to vote for candidate X in the next election (with a confidence level of 95%)". Your friend then makes the statement, "Hey, look, there's a 95% chance that somewhere between 51% and 57% of voters plan to vote for candidate X!" How would you explain to your friend why his statement is incorrect? Be sure to provide your friend with the correct interpretation of the confidence interval. Answer: Answers will vary, but should reference the fact that there is no chance that the population parameter will change, which is implied when one interprets a confidence level as a probability. The correct interpretation is that the proportion of voters who plan to vote for candidate X is between 51% and 57%, with a confidence level of 95% which means the process used to produce the interval will capture the true population proportion 95% of the time.
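The idea that "the process used to produce the interval will capture the true population proportion 95% of the time" can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the true proportion is set to 0.54 (in a real poll it is unknown), each simulated poll has a hypothetical sample size of 1000, and `coverage` is a made-up helper name. It repeatedly draws a poll, builds the 95% interval, and counts how often the fixed true proportion lands inside.

```python
import math
import random

def coverage(p_true, n, trials, z_star=1.96, seed=0):
    """Fraction of simulated polls whose interval contains p_true (hypothetical helper)."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Simulate one poll of n voters, each supporting X with probability p_true
        successes = sum(rng.random() < p_true for _ in range(n))
        p_hat = successes / n
        me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
        # The parameter is fixed; only the interval changes from poll to poll
        if p_hat - me <= p_true <= p_hat + me:
            hits += 1
    return hits / trials

rate = coverage(p_true=0.54, n=1000, trials=2000)
print(rate)   # expected to be close to 0.95
```

The parameter never moves between trials, which is why the 95% describes the interval-building procedure and not a probability statement about the parameter.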
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 30) A research organization conducted a study to estimate the percentage of people in Country X who do not use the internet. a. If a 95% confidence level is used, how many people should be included in the survey if the researchers wanted to have a margin of error of 5%? b. How would the sample size change if the researchers wanted to estimate the percentage with a margin of error of 2%? A) a. 1/(0.05)^2 = 400 people, b. 1/(0.02)^2 = 2500 people B) a. 1/0.05 = 20 people, b. 1/0.02 = 50 people C) a. 50^2 = 2500 people, b. 20^2 = 400 people D) a. 5^2 = 25 people, b. 2^2 = 4 people Answer: A 31) In a study, researchers wanted to estimate the percentage of high school boys who planned to major in a STEM field. a. If a 95% confidence level is used, how many people should be included in the survey if the researchers wanted to have a margin of error of 4%? b. If they decrease the number of study participants and change nothing else, will their margin of error need to be smaller or larger? A) a. 1/(0.04)^2 = 625 boys, b. They could increase their margin of error. This would decrease the number of study participants needed. B) a. 1/(0.04)^2 = 625 boys, b. They could decrease their margin of error. This would decrease the number of study participants needed. C) a. 40^2 = 1600 boys, b. They could decrease their margin of error. This would decrease the number of study participants needed. D) a. 40^2 = 1600 boys, b. They could increase their margin of error. This would decrease the number of study participants needed. Answer: A
32) A research organization conducted a study to estimate the percentage of people in Country X who have health insurance. a. If a 95% confidence level is used, how many people should be included in the survey if the researchers wanted to have a margin of error of 3%? b. How many people would be required if the researchers wanted to estimate the percentage with a margin of error of 7%? c. What is the relationship between the size of the margin of error and the sample size? A) a. 1/(0.03)^2 = 1111 people, b. 1/(0.07)^2 = 204 people, c. As the margin of error is increased, the required sample size decreases. B) a. 1/0.03 = 33 people, b. 1/0.07 = 14 people, c. As the margin of error is increased, the required sample size decreases. C) a. 30^2 = 900 people, b. 70^2 = 4900 people, c. As the margin of error is increased, the required sample size also increases. D) a. 3^2 = 9 people, b. 7^2 = 49 people, c. As the margin of error is increased, the required sample size also increases. Answer: A
33) In a study, researchers wanted to estimate the percentage of high school seniors who planned to major in an art a. If a 95% confidence level is used, how many people should be included in the survey if the researchers wanted to have a margin of error of 9%? b. If they were able to increase the number of study participants, what could happen to their margin of error? A) a. 1/(0.09)^2 = 123 seniors, b. If they could increase the number of study participants, their margin of error could decrease. B) a. 1/(0.09)^2 = 123 seniors, b. If they could increase the number of study participants, their margin of error could also increase. C) a. 9^2 = 81 seniors, b. If they could increase the number of study participants, their margin of error could decrease. D) a. 9^2 = 81 seniors, b. If they could increase the number of study participants, their margin of error could also increase. Answer: A
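The sample-size answers in items 30 to 33 appear to rely on the 95% rule of thumb n ≈ 1/m^2, where m is the desired margin of error written as a decimal. It follows from taking z* ≈ 2 and the most conservative value p-hat = 0.5, so that m ≈ 2 × sqrt(0.25/n) = 1/sqrt(n). The sketch below is a minimal check of that arithmetic; `approx_sample_size` is a hypothetical helper, and it rounds to the nearest whole number because that is how the answer choices are stated (rounding up would be the more cautious choice in practice).

```python
def approx_sample_size(margin):
    """Rule-of-thumb sample size for 95% confidence: n is about 1 / margin^2.

    Hypothetical helper; rounds to the nearest integer to match the answer choices.
    """
    return round(1 / margin ** 2)

# Margins of error used in items 30 to 33
for m in (0.05, 0.02, 0.04, 0.03, 0.07, 0.09):
    print(m, approx_sample_size(m))
# 0.05 400
# 0.02 2500
# 0.04 625
# 0.03 1111
# 0.07 204
# 0.09 123
```

Because n grows like 1/m^2, cutting the margin of error from 5% to 2% multiplies the required sample size by 6.25 (400 to 2500), which is the inverse relationship that part c of item 32 asks about.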
7.5 Comparing Two Population Proportions with Confidence 1 Find and Interpret Confidence Intervals for Two Population Proportions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Obtain the required confidence interval for the difference between two population proportions. Assume that independent simple random samples have been selected from the two populations. 1) In a random sample of 36 Democrats from one city, 18 approved of the mayor's performance. In a random sample of 48 Republicans from the city, 28 approved of the mayor's performance. Find a 90% confidence interval for the difference between the proportions of Democrats and Republicans who approve of the mayor's performance. A) (-0.264, 0.097) B) (0.285, 0.715) C) (0.320, 0.680) D) (0.286, 0.715) Answer: A 2) A survey found that 37 of 74 randomly selected women and 47 of 74 randomly selected men follow a regular exercise program. Find a 95% confidence interval for the difference between the proportions of women and men who follow a regular exercise program. A) (-0.293, 0.023) B) (-0.323, 0.688) C) (0.342, 0.658) D) (0.312, 0.688) Answer: A Solve the problem. 3) A polling agency wants to compare support for the president's foreign policies between men and women. They surveyed 2500 U.S. citizens and found a 95% confidence interval for the difference in proportions between men and women who support the president's foreign policies as (-0.05 to -0.025) where population 1 is men and population 2 is women. Select the correct interpretation of this result. A) The interval estimates that p1 – p2 < 0, which shows that women are more likely than men to support the president's foreign policies. B) The interval estimates that p1 – p2 < 0, which shows that men are more likely than women to support the president's foreign policies.
C) The interval does not contain 0, which shows that there is no significant difference in the proportions between men and women. D) The number surveyed should be increased to increase the accuracy of the confidence interval. Answer: A
4) Confidence intervals can be used to determine whether different sample proportions reflect a "real" difference in the population. The basic approach is to... A) find the margin of error for each proportion and see if the difference is less than zero. B) find the difference in the proportions and see if the difference is less than zero. C) find a confidence interval at the significance level desired for the difference in proportions. D) find the difference in the proportions and see if the difference is greater than zero. Answer: C 5) A polling agency wants to estimate the proportion of U.S. citizens who support the president's domestic policies. They surveyed 2500 U.S. citizens and found a 95% confidence interval for the difference in proportions between men and women who support the president's domestic policies as (-0.025 to 0.050) where population 1 is men and population 2 is women. Select the correct interpretation of this result. A) The interval contains zero which shows that women are more likely than men to disagree with the president's foreign policies. B) The interval contains zero which shows that men are more likely than women to disagree with the president's foreign policies. C) The interval does not contain zero which shows that there is no significant difference in the proportions between men and women. D) The interval contains zero which shows that there is no significant difference in the proportions between men and women. Answer: D 6) A medical study examined data on patients with cardiovascular disease who were currently non-smokers and those who were current smokers. Population 1 were smokers and population 2 were non-smokers. After data analysis, the 95% confidence interval for the difference in proportions is 0.015 +/- 0.011. The most accurate interpretation is... A) We are 95% confident that the difference in the proportion of smokers compared to non-smokers is between 0.004 and 0.026.
There is a significant difference indicating higher cardiovascular disease amongst smokers. B) We are 95% confident that the interval of the difference in the proportions contains zero. There is not a significant difference between smokers and non-smokers. C) We are 95% confident that the difference in the proportion of smokers compared to non-smokers is between -0.004 and 0.026. There is not a significant difference in the proportions. D) We are 95% confident that the proportion of smokers compared to non-smokers is between 0.004 and 0.026. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) Confidence intervals can be used to determine whether different sample proportions reflect a "real" difference in the population. Explain the method of how this is accomplished and what the assumptions are. Answer: The method for determining whether different sample proportions reflect a "real" difference in the population is to find the difference between the two sample proportions taken from two populations, subtracting the sample proportion for population 2 from the sample proportion for population 1. Similar to the confidence interval for a single proportion, a margin of error is calculated that depends on the confidence level, the sample proportions, and the sample size. The assumptions are that the samples are random and independent, are sufficiently large, and are taken from big populations.
8) A polling agency wants to estimate the proportion of U.S. citizens who support the president's educational policies. They surveyed 1500 U.S. citizens and found a 95% confidence interval for the difference in proportions between men and women who support the president's educational policies as (-0.075 to 0.025) where population 1 is men and population 2 is women. Interpret the result and state if the assumptions are satisfied. Answer: The interval contains zero which shows that there is no significant difference in the proportions between men and women. The assumptions are satisfied: the samples are randomly drawn from their populations, the samples are independent, and the samples are large enough and come from big populations.
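The two-proportion intervals in items 1 and 2 of this section can be reproduced with the standard large-sample formula: (p1-hat − p2-hat) ± z* × sqrt(p1-hat(1 − p1-hat)/n1 + p2-hat(1 − p2-hat)/n2). The sketch below assumes that formula with z* = 1.645 for 90% and 1.96 for 95%; `two_proportion_ci` is a hypothetical helper name used only for illustration.

```python
import math

def two_proportion_ci(x1, n1, x2, n2, z_star):
    """Large-sample CI for p1 - p2 from success counts (hypothetical helper)."""
    p1, p2 = x1 / n1, x2 / n2
    # Standard error of the difference: the two samples' variances add
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    me = z_star * se
    return diff - me, diff + me

# Item 1: Democrats 18 of 36, Republicans 28 of 48, 90% confidence
lo1, hi1 = two_proportion_ci(18, 36, 28, 48, z_star=1.645)
print(round(lo1, 3), round(hi1, 3))   # -0.264 0.097

# Item 2: women 37 of 74, men 47 of 74, 95% confidence
lo2, hi2 = two_proportion_ci(37, 74, 47, 74, z_star=1.96)
print(round(lo2, 3), round(hi2, 3))   # -0.293 0.023
```

Both intervals contain zero, so by the reasoning used in items 5 and 8 neither sample shows a significant difference between the two groups.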
Ch. 7 Survey Sampling and Inference Answer Key 7.1 Learning about the World through Surveys 1 Identify Populations, Samples, Parameters of Interest, and Statistics 1) A 2) C 3) B 4) A 5) C 6) D 7) A 8) B 9) B 10) B 11) A 12) A 13) C 14) B 15) A 16) A population is a group of objects or people which are being studied. A population is a total collection. A sample is a collection of objects or people taken from the population of interest. Examples will vary. 17) A persuasive survey question will result in biased results, which may not accurately reflect the true sentiments of the population. A confusing question with multiple negatives and irrelevant information can result in inaccurate responses. 2 Determine Whether Sampling Methods are Likely to Result in Representative or Biased Samples 1) D 2) C 3) A 4) A 5) A 6) A 7) FALSE 3 Use Random Number Tables to Select Random Samples 1) A 2) A 3) A 4) A 5) A 4 Use Surveys to Make Conclusions About the Population 1) D 2) D 3) A statistic is a numerical summary of a sample of data; examples will vary. A parameter is a numerical value that characterizes some aspect of the population; examples will vary. 4) Frederick's survey may have nonresponse bias. The residents who chose not to participate may have different views about the survey topic than those who did respond. 5) A 6) A 7) A 8) A
7.2 Measuring the Quality of a Survey 1 Understand Bias and Precision 1) B 2) D
3) C 4) B 5) B 6) C 7) B 8) D 9) D 10) B 11) B 12) B 13) D 14) C 15) A 16) Precision means that sampling results are consistent when a sampling method is repeated. The precision of a sampling method is measured by the standard error. Accuracy means sampling results are centered around the population parameter. Accuracy is measured in terms of bias. 2 Find the Error in a Sample 1) A 2) B 3) C 4) B 5) 25%; 20 milky-white marbles 6) 4.8% 7) 25%; 4.8%
7.3 The Central Limit Theorem for Sample Proportions 1 Use the Central Limit Theorem for Sample Proportions 1) A 2) A 3) A 4) C 5) B 6) C 7) B 8) D 9) D 10) A 11) A 12) Sample is random and independent: it is stated that this is a random sample and voters are independent. The sample is large: a sample of 500 is large enough since it will have at least 10 successes and failures (0.45 × 500 ≥ 10 and 0.55 × 500 ≥ 10). The population is big: a sample of 500 is large enough because the population is at least ten times larger. 13) 4 children, give or take 2 children. 14) 0.103 15) 0.126
7.4 Estimating the Population Proportion with Confidence Intervals 1 Find Confidence Intervals for Population Proportions 1) B 2) D 3) D 4) C 5) A 6) B 7) A 8) C 9) A
10) D 11) A 12) A 13) B 14) D 15) D 16) B 17) C 18) B 19) C 20) A 21) B 22) A 23) D 24) The 95% indicates that if many polls were taken, 95% of them would result in confidence intervals that include the true population proportion of residents that support parking restrictions during snow removal. The interval represents a population estimate +/- a margin of error which depends on the confidence level (95%) and the sample size. 25) The margin of error is an amount that is added and subtracted from the estimate which provides the range of plausible values around the sample proportion based on a chosen level of confidence. The standard error measures the variability of the sample proportion as it varies from sample to sample. It is not about the variability of a particular sample. 26) 0.032 27) The upper limit would be 37% + 3.2% = 40.2%. The lower limit would be 37% – 3.2% = 33.8%. (33.8%, 40.2%) 28) Yes, a confidence interval of 55% ± 3.5% would include plausible population parameter values that are greater than 50% so the claim would not be unreasonable. 29) Answers will vary, but should reference the fact that there is no chance that the population parameter will change, which is implied when one interprets a confidence level as a probability. The correct interpretation is that the proportion of voters who plan to vote for candidate X is between 51% and 57%, with a confidence level of 95% which means the process used to produce the interval will capture the true population proportion 95% of the time. 30) A 31) A 32) A 33) A
7.5 Comparing Two Population Proportions with Confidence 1 Find and Interpret Confidence Intervals for Two Population Proportions 1) A 2) A 3) A 4) C 5) D 6) A 7) The method for determining whether different sample proportions reflect a "real" difference in the population is to find the difference between the two sample proportions taken from two populations, subtracting the sample proportion for population 2 from the sample proportion for population 1. Similar to the confidence interval for a single proportion, a margin of error is calculated that depends on the confidence level, the sample proportions, and the sample size. The assumptions are that the samples are random and independent, are sufficiently large, and are taken from big populations. 8) The interval contains zero which shows that there is no significant difference in the proportions between men and women. The assumptions are satisfied: the samples are randomly drawn from their populations, the samples are independent, and the samples are large enough and come from big populations.
Ch. 8 Hypothesis Testing for Population Proportions 8.1 The Essential Ingredients of Hypothesis Testing 1 Identify Null and Alternative Hypotheses MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Read the following problem description then choose the correct null and alternative hypothesis. A new drug is being tested to see whether it can reduce the rate of asthma attacks in children ages 5 to 14. The rate of asthma attacks in the population of concern is 0.0744. A) H0 : p = 0.0744; Ha : p > 0.0744 B) H0 : p = 0.0744; Ha : p < 0.0744 C) H0 : p < 0.0744; Ha : p < 0.0744
D) H0 : p > 0.0744; Ha : p ≠ 0.0744
Answer: B 2) Read the following problem description then choose the correct null and alternative hypothesis. A new drug is being tested to see whether it can reduce the rate of food-related allergic reactions in children aged 1 to 3 with food allergies. The rate of allergic reactions in the population of concern is 0.03. A) H0 : p < 0.03; Ha : p = 0.03 B) H0 : p ≠ 0.03; Ha : p = 0.03 C) H0 : p = 0.03; Ha : p > 0.03
D) H0 : p = 0.03; Ha : p < 0.03
Answer: D Express the null hypothesis and the alternative hypothesis in symbolic form. 3) An entomologist writes an article in a scientific journal which claims that fewer than 13 in ten thousand male fireflies are unable to produce light due to a genetic mutation. A) H0 : p = 0.0013; Ha : p < 0.0013 B) H0 : p < 0.0013; Ha : p ≥ 0.0013 C) H0 : p = 0.0013; Ha : p > 0.0013 D) H0 : p > 0.0013; Ha : p ≤ 0.0013
Answer: A 4) A skeptical paranormal researcher claims that the proportion of Americans that have seen a UFO is less than 5 in every one thousand. A) H0 : p = 0.005; Ha : p < 0.005 B) H0 : p > 0.005; Ha : p ≤ 0.005 C) H0 : p < 0.005; Ha : p ≥ 0.005 D) H0 : p = 0.005; Ha : p > 0.005
Answer: A Solve the problem. 5) Read the following problem description then choose the correct null and alternative hypothesis. A new drug is being tested to see whether it can reduce the diastolic blood pressure measurement for adults ages 45-60 years. The upper limit for diastolic blood pressure measurement should be 90 mmHg. A) H0 : p = 90 mmHg; Ha : p ≠ 90 mmHg B) H0 : p ≠ 90 mmHg; Ha : p < 90 mmHg C) H0 : p < 90 mmHg; Ha : p > 90 mmHg
D) H0 : p = 90 mmHg; Ha : p < 90 mmHg
Answer: D 6) Read the following problem description then choose the correct null and alternative hypothesis: A new drug is being tested to see whether it can reduce the rate of food-related allergic reactions in children ages 1 to 3 with food allergies. The rate of allergic reactions in the population of concern is 0.03. A) H0 : p < 0.03; Ha : p = 0.03 B) H0 : p ≠ 0.03; Ha : p = 0.03 C) H0 : p = 0.03; Ha : p > 0.03
D) H0 : p = 0.03; Ha : p < 0.03
Answer: D
7) Which of the following is not true about the alternative hypothesis? A) It is sometimes called the research hypothesis. B) It is assumed to be true. C) Like the null hypothesis, it is always a statement about a population parameter. D) It is usually a statement that the researcher hopes to demonstrate is true. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 8) The worker at a carnival game claims that he can communicate with a small magic rock and to prove it he tells you to hide it in your hand behind your back and he will identify the hand holding the rock. Being a wise student of statistics, you decide to stand back and observe the outcome of the next ten games before deciding whether to pay your three dollars to play the game. You have just conducted an informal hypothesis test. State the null and alternative hypothesis. Answer: Null hypothesis is that the carnival worker has no special powers, therefore H0 : p = 0.5 . The alternative hypothesis is that he can communicate with a rock, therefore Ha: p > 0.5 . 9) A researcher wishes to test the claim that the proportion of children with blue eyes in his region is different than one in six, the national rate of blue eyes in children. State and explain the null and alternative hypothesis that should be used to test the claim. Answer: H0 : p = 1 / 6 and Ha: p ≠ 1 / 6 ; The null hypothesis states that the population parameter is no different than what is expected and is assumed to be true. The alternative hypothesis states that the population parameter may be different and contains the claim that the researcher is trying to show. A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test whether the claim is true or not, a random sample of 100 doctors is obtained. Of these doctors, 82 indicated that they recommend aspirin for headaches. Is the claim accurate? Test with a significance level of 0.05. 
10) State and explain the null and alternative hypotheses that should be used to test the claim. Answer: The null and alternative hypotheses are as follows: H0 : p = 0.90 and Ha: p ≠ 0.90.
2 Find Test Statistics
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose that the following is to be tested: H0 : p = 0.35 and Ha : p > 0.35. Calculate the observed z-statistic for the following sample data: Forty out of eighty test subjects have the characteristic of interest. Round to the nearest hundredth. A) z = -0.94 B) z = 2.81 C) z = 1.88 D) z = -1.87 Answer: B
2) Suppose that the following is to be tested: H0 : p = 0.72 and Ha : p ≠ 0.72. Calculate the observed z-statistic for the following sample data: Sixty-eight out of ninety test subjects have the characteristic of interest. Round to the nearest thousandth. A) z = 0.751 B) z = 0.453 C) z = 0.756 D) z = -0.751 Answer: A
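Worked check (an illustrative sketch, not part of the original test bank): the two answers above can be reproduced with the usual one-proportion formula z = (p̂ − p0) / sqrt(p0(1 − p0)/n). The function name one_proportion_z is a hypothetical helper introduced here for illustration.

```python
import math

def one_proportion_z(successes, n, p0):
    """Observed z-statistic for H0: p = p0, using the standard error under H0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

# Question 1: 40 of 80 subjects, H0: p = 0.35
z1 = one_proportion_z(40, 80, 0.35)
# Question 2: 68 of 90 subjects, H0: p = 0.72
z2 = one_proportion_z(68, 90, 0.72)

print(round(z1, 2), round(z2, 3))  # 2.81 0.751
```

Note that the standard error uses the null value p0 rather than the sample proportion, because the test statistic is computed assuming the null hypothesis is true.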
Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has too many defective bulbs. The janitor's null hypothesis is that the supply of light bulbs has a defect rate of p = 0.07 (the light bulb manufacturer's stated defect rate). Suppose he does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.07 and Ha : p > 0.07.
3) Suppose that the janitor tests 300 randomly selected light bulbs and finds that 27 bulbs are defective. What value of the test statistic should he report? Round to the nearest hundredth. A) z = -1.96 B) z = 1.36 C) z = -1.36 D) z = 1.96 Answer: B
Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has a defect rate that is different than the defect rate stated by the manufacturer. The janitor's null hypothesis is that the supply of light bulbs has a defect rate of p = 0.09 (the light bulb manufacturer's stated defect rate). Suppose we do a hypothesis test with a significance level of 0.01. Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.09 and Ha : p > 0.09.
4) Suppose the janitor tests 300 light bulbs and finds that 33 bulbs are defective. What value of the test statistic should he report? Round to the nearest hundredth. A) z = 1.21 B) z = -1.21 C) z = 2.17 D) z = -2.17 Answer: A
Find the value of the test statistic z.
5) A claim is made that the proportion of children who play sports is 0.5, and the sample statistics include 1,544 subjects with 30% saying that they play a sport. A) z = -15.72 B) z = 15.72 C) z = -32.08 D) z = 32.08 Answer: A
6) The claim is that the proportion of drowning deaths of children attributable to beaches is 0.25, and the sample statistics include 693 drowning deaths of children with 30% of them attributable to beaches.
A) z = 3.04 B) z = 2.87 C) z = -3.04 D) z = -2.87 Answer: A
A janitor at a large office building believes that his supply of light bulbs has a defect rate that is higher than the defect rate stated by the manufacturer. The janitor's null hypothesis is that the supply of light bulbs has a manufacturer's defect rate of p = 0.09. He performs a test at a significance level of 0.01. The null and alternative hypotheses are as follows: H0 : p = 0.09 and Ha: p > 0.09.
7) Suppose the janitor tests 300 light bulbs and finds that 33 bulbs are defective. The calculated test statistic is z = 1.21. Select the appropriate interpretation of the test statistic. A) A test statistic of 1.21 is 1.21 standard deviations greater than the mean (between 1 and 2) indicating that the result is not significant at a level of 0.01 using a one-sided alternative hypothesis. B) A test statistic of 1.21 is 1.21 standard deviations greater than the mean (between 1 and 2) indicating that the result could be significant using a two-sided alternative hypothesis. C) A test statistic of -1.21 is 1.21 standard deviations less than the mean (between 1 and 2) indicating that the result could be significant at a level of 0.01 using a one-sided alternative hypothesis. D) A test statistic of 1.21 is 1.21 standard deviations less than the mean (between 1 and 2) indicating that the result is not significant. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test whether the claim is true or not, a random sample of 100 doctors is obtained. Of these doctors, 82 indicated that they recommend aspirin for headaches. Is the claim accurate? Test with a significance level of 0.05.
8) Calculate the z test statistic for the sample results. Round to the nearest hundredth. Answer: z = -2.67
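Worked check (an illustrative sketch, not part of the original test bank): questions 7 and 8 can be checked by comparing the observed z to a cutoff on the standard normal curve. The helper name critical_z is hypothetical; the cutoff comes from the inverse normal CDF in Python's standard library.

```python
from statistics import NormalDist

def critical_z(alpha, two_sided):
    """Cutoff on the standard normal curve for a given significance level."""
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

# Question 7: right-tailed test at alpha = 0.01 with observed z = 1.21.
cutoff_q7 = critical_z(0.01, two_sided=False)
significant_q7 = 1.21 > cutoff_q7

# Question 8: 82 of 100 doctors against the claimed proportion 0.90.
z_doctors = (82 / 100 - 0.90) / (0.90 * 0.10 / 100) ** 0.5

print(round(cutoff_q7, 2), significant_q7, round(z_doctors, 2))  # 2.33 False -2.67
```

A z of 1.21 falls well short of the one-sided cutoff of about 2.33, which is why the result in question 7 is not significant at the 0.01 level.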
9) Explain how you can determine the significance by using the calculated z-value. Answer: For a two-sided test at a significance level of 0.05, the cutoff z-values are ±1.96. The calculated value of -2.67 is farther from zero than the cutoff (its absolute value, 2.67, is greater than 1.96), and therefore the result is significant.
3 Find and Interpret p-Values
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has too many defective bulbs. The janitor's null hypothesis is that the supply of light bulbs has a defect rate of p = 0.07 (the light bulb manufacturer's stated defect rate). Suppose he does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypotheses are as follows: H0 : p = 0.07 and Ha : p > 0.07.
1) Choose the statement that best describes the significance level in the context of the hypothesis test. A) The significance level of 0.05 is the defect rate we believe is the true defect rate. B) The significance level of 0.05 is the test statistic that we will use to compare the observed outcome to the null hypothesis. C) The significance level of 0.05 is the probability of concluding that the defect rate is equal to 0.07 when in fact it is greater than 0.07. D) The significance level of 0.05 is the probability of concluding that the defect rate is higher than 0.07 when in fact the defect rate is equal to 0.07. Answer: D
Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has a defect rate that is different than the defect rate stated by the manufacturer. The janitor's null hypothesis is that the supply of light bulbs has a defect rate of p = 0.09 (the light bulb manufacturer's stated defect rate). Suppose we do a hypothesis test with a significance level of 0.01.
Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.09 and Ha : p > 0.09. 2) Choose the statement that best describes the significance level in the context of the hypothesis test. A) The significance level of 0.01 is the probability of concluding that the defect rate is different than 0.09 when in fact the defect rate is equal to 0.09. B) The significance level of 0.01 is the defect rate we believe is the true defect rate. C) The significance level of 0.01 is the z-statistic that we will use to compare the observed outcome to the null hypothesis. D) The significance level of 0.01 is the probability of concluding that the defect rate is equal to 0.09 when in fact it is greater than 0.09. Answer: A Solve the problem. 3) A polling agency is interested in testing whether the proportion of women who support a female candidate for office is greater than the proportion of men. The null hypothesis is that there is no difference in the proportion of men and women who support the female candidate. The alternative hypothesis is that the proportion of women who support the female candidate is greater than the proportion of men. The test results in a p -value of 0.112. Which of the following is the best interpretation of the p-value? A) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is greater than the proportion of men. B) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. C) The p-value is the probability that men will support the female candidate. D) The p-value is the probability that women will support the female candidate. Answer: B
4) A polling agency is interested in testing whether the proportion of women who support a female candidate for office is less than the proportion of men. The null hypothesis is that there is no difference in the proportions of men and women who support the female candidate. The alternative hypothesis is that the proportion of women who support the female candidate is less than the proportion of men. The test results in a p -value of 0.041. Which of the following is the best interpretation of the p-value? A) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is less than the proportion of men. B) The p-value is the probability that the majority of women will support the female candidate. C) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. D) The p-value is the probability that the majority of men will support the female candidate. Answer: C 5) Suppose a city official conducts a hypothesis test to test the claim that the majority of voters support a proposed tax to build sidewalks. Assume that all the conditions for proceeding with a one -sample test on proportions have been met. The calculated test statistic is approximately 1.40 with an associated p -value of approximately 0.081. Choose the conclusion that provides the best interpretation for the p -value at a significance level of α = 0.05. A) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.40 is 0.081. This result is surprising and could not easily happen by chance. B) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.40 is 0.081. 
This result is not surprising and could easily happen by chance. C) The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. D) None of these. Answer: B 6) Suppose a city official conducts a hypothesis test to test the claim that the majority of voters oppose a proposed school tax. Assume that all the conditions for proceeding with a one-sample test on proportions have been met. The calculated test statistic is approximately 1.46 with an associated p -value of approximately 0.072. Choose the conclusion that provides the best interpretation for the p-value at a significance level of α = 0.05. A) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.46 is 0.072. This result is surprising and could not easily happen by chance. B) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.46 is 0.072. This result is not surprising and could easily happen by chance. C) The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. D) None of these. Answer: B
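Worked check (an illustrative sketch, not part of the original test bank): the p-values quoted in questions 5 and 6 are right-tail areas under the standard normal curve, assuming the alternative in each case is p > 0.5 (a "majority" claim). The helper name right_tail_p_value is hypothetical.

```python
from statistics import NormalDist

def right_tail_p_value(z):
    """Area under the standard normal curve to the right of z."""
    return 1 - NormalDist().cdf(z)

alpha = 0.05
p_sidewalk = right_tail_p_value(1.40)  # question 5
p_school = right_tail_p_value(1.46)    # question 6

# Reject H0 only when the p-value is at or below the significance level.
reject_sidewalk = p_sidewalk <= alpha
reject_school = p_school <= alpha

print(round(p_sidewalk, 3), reject_sidewalk)  # 0.081 False
print(round(p_school, 3), reject_school)      # 0.072 False
```

Both p-values exceed 0.05, so in each case the observed result could easily happen by chance when the null hypothesis is true.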
Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has too many defective bulbs. The janitorʹs null hypothesis is that the supply of light bulbs has a defect rate of p = 0.07 (the light bulb manufacturerʹs stated defect rate). Suppose he does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.07 and Ha : p > 0.07. 7) The janitor calculates a p-value for the hypothesis test of approximately 0.087. Choose the correct interpretation for the p-value. A) The p-value tells us that if the defect rate is 0.07, then the probability that the janitor will have 27 defective light bulbs out of 300 is approximately 0.087. At a significance level of 0.05, this would not be an unusual outcome. B) The p-value tells us that the probability of concluding that the defect rate is equal to 0.07, when in fact it is greater than 0.07, is approximately 0.087. C) The p-value tells us that the true population rate of defective light bulbs is approximately 0.087. D) None of these Answer: A Use the following information to answer the question. A janitor at a large office building believes that his supply of light bulbs has a defect rate that is different than the defect rate stated by the manufacturer. The janitorʹs null hypothesis is that the supply of light bulbs has a defect rate of p = 0.09 (the light bulb manufacturerʹs stated defect rate). Suppose we do a hypothesis test with a significance level of 0.01. Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.09 and Ha : p > 0.09. 8) The janitor calculates a p-value for the hypothesis test of approximately 0.113. Choose the correct interpretation for the p-value. A) The p-value tells us that the probability of concluding that the defect rate is equal to 0.09, when in fact it is greater than 0.09, is approximately 0.113. 
B) The p-value tells us that if the defect rate is 0.09, then the probability that the janitor will have 33 defective light bulbs out of 300 is approximately 0.113. At a significance level of 0.01, this would not be an unusual outcome. C) The p-value tells us that the true population rate of defective light bulbs is approximately 0.113. D) None of these Answer: B
Solve the problem.
9) A research firm carried out a hypothesis test on a population proportion using a right-tailed alternative hypothesis. Which of the following z-scores would be associated with a p-value of 0.04? Round to the nearest hundredth. A) z = 2.50 B) z = -2.50 C) z = 1.75 D) z = -1.75 Answer: C
10) A research firm carried out a hypothesis test on a population proportion using a left-tailed alternative hypothesis. Which of the following z-scores would be associated with a p-value of 0.04? Round to the nearest hundredth. A) z = 2.50 B) z = -2.50 C) z = 1.75 D) z = -1.75 Answer: D
4 Know Concepts: The Essential Ingredients of Hypothesis Testing
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Complete the statement by filling in the blanks. The null hypothesis is ________ to be ________ and is only rejected when the observed outcome is shown to be ________. A) Proven; true; impossible B) Known; true; the population parameter C) Assumed; true; extremely unlikely D) Likely; false; extremely likely Answer: C
2) Which statement best describes the significance level of a hypothesis test? A) The probability of rejecting the null hypothesis when the null hypothesis is true. B) The probability of rejecting the null hypothesis when the null hypothesis is not true. C) The probability of failing to reject the null hypothesis when the null hypothesis is not true. D) None of these Answer: A
3) Which of the following is not true about the alternative hypothesis? A) It is sometimes called the research hypothesis. B) It is assumed to be true based on the sample results. C) Like the null hypothesis, it is always a statement about a population parameter. D) It is usually a statement that the researcher hopes to demonstrate is true. Answer: B
4) Complete the statement by filling in the blanks. A research hypothesis is always expressed in terms of ________ __________ because we are interested in making statements about the _________ based on _______ statistics. A) sample; statistics; population; sample B) population; statistics; population; parameter C) population; parameters; population; sample D) population; parameters; sample; population Answer: C
5) Complete the statement by filling in the blanks. The null hypothesis H0 is the statement of _________ and always has a ______ sign. The alternative hypothesis Ha is the __________ hypothesis. It is a statement about the value of a __________ that we intend to test. A) no consequence; > ; research; sample B) no change; = ; research; parameter C) change; = ; no change; parameter D) no change; = ; research; sample
Answer: B 6) A quality control manager believes that there are too many defective light bulbs being produced, higher than the advertised rate. The managerʹs null hypothesis is that the production line of light bulbs has a defect rate of p = 0.025 (the light bulbʹs stated defect rate). He does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypothesis are as follows: H0 : p = 0.025 and Ha: p > 0.025. Choose the statement that best describes the significance level in the context of the hypothesis test. A) The significance level of 0.05 is the defect rate we believe is the true defect rate. B) The significance level of 0.05 is the probability of concluding that the defect rate is higher than 0.025 when in fact the defect rate is equal to 0.025. C) The significance level of 0.05 is the probability of concluding that the defect rate is equal to 0.025 when in fact it is greater than 0.025. D) The significance level of 0.05 is the test statistic that we will use to compare the observed outcome to the null hypothesis. Answer: B
A janitor at a large office building believes that his supply of light bulbs has a defect rate that is higher than the defect rate stated by the manufacturer. The janitor's null hypothesis is that the supply of light bulbs has a manufacturer's defect rate of p = 0.09. He performs a test at a significance level of 0.01. The null and alternative hypotheses are as follows: H0 : p = 0.09 and Ha: p > 0.09.
7) Choose the statement that best describes the significance level in the context of the hypothesis test. A) The significance level of 0.01 is the probability of concluding the defect rate is more than 0.09 when it is equal to 0.09. B) The significance level of 0.01 is the defect rate we believe is the true defect rate. C) The significance level of 0.01 is the z-statistic that we will use to compare the observed outcome to the null hypothesis. D) The significance level of 0.01 is the probability of concluding that the defect rate is equal to 0.09 when in fact it is greater than 0.09. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test whether the claim is true or not, a random sample of 100 doctors is obtained. Of these doctors, 82 indicated that they recommend aspirin for headaches. Is the claim accurate? Test with a significance level of 0.05.
8) Write a statement describing the meaning of the level of significance in the context of the hypothesis test. Answer: The significance level of 0.05 is the probability of concluding that the proportion of doctors who recommend aspirin is different from 0.90, when in fact the proportion is equal to 0.90.
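Worked check (an illustrative sketch, not part of the original test bank): the significance level is the chance of rejecting a true null hypothesis. For the aspirin survey (n = 100, p0 = 0.90) that chance can be computed exactly by adding up the binomial probabilities of every sample count that would lead to rejection with the ±1.96 cutoff. The helper binom_pmf and the variable names are hypothetical. Because the z-test relies on a normal approximation, the exact rate is expected to land near 0.05 rather than exactly on it.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p0, z_cutoff = 100, 0.90, 1.96
standard_error = math.sqrt(p0 * (1 - p0) / n)

# Sum the probability, under H0, of every count whose z-statistic
# falls beyond the two-sided cutoff (these are the rejected samples).
type_one_error_rate = sum(
    binom_pmf(k, n, p0)
    for k in range(n + 1)
    if abs((k / n - p0) / standard_error) > z_cutoff
)

print(type_one_error_rate)
```

The printed value is the exact probability of wrongly rejecting H0 : p = 0.90 under this rejection rule, which is what the significance level is meant to control.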
8.2 Hypothesis Testing in Four Steps 1 Check the Conditions for a One-Proportion z-Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The researcherʹs null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p > 0.26. The researcher collected data from a random sample of 75 adults in the region of interest. 1) Check that the conditions hold so that the sampling distribution of the z -statistic will approximately follow the standard normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. A) Yes, all the conditions are satisfied. B) No, the researcher did not collect a random sample. C) No, the researcher did not collect a large enough sample. D) No, the population of interest is not large enough to assume independence. Answer: A Use the following information to answer the question. A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region. 2) Check that the conditions hold so that the sampling distribution of the z -statistic will approximately follow the standard normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. 
A) No, the researcher did not collect a random sample. B) No, the researcher did not collect a large enough sample. C) No, the population of interest is not large enough to assume independence. D) Yes, all the conditions are satisfied. Answer: A
A researcher is wondering whether the smoking habits of young adults (18 -25 years of age) in a certain city in the U.S. are the same as the proportion of the general population of young adults in the U.S. A recent study stated that the proportion of young adults who reported smoking at least twice a week or more in the last month was 0.16. The researcher collected data from a random sample of 75 adults in the city of interest. 3) Check that the conditions hold so that the sampling distribution of the z -statistic will approximately follow the standard normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. A) Yes, all the conditions are satisfied. B) No, the researcher did not collect a random sample. C) No, the researcher did not collect a large enough sample. D) No, the population of interest is not large enough to assume independence. Answer: A A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults in the general population who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha: p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region. 4) Check that the conditions hold so that the sampling distribution of the z -statistic will approximately follow the standard Normal distribution. Are the conditions satisfied? If not, choose the condition that is not satisfied. A) No the conditions are not satisfied; the researcher did not collect a random sample. B) Yes, the population of proportions can be assumed to be roughly symmetric. C) No, the population of interest is not large enough to assume independence. D) Yes, all the conditions are satisfied. Answer: A SHORT ANSWER. 
Write the word or phrase that best completes each statement or answers the question. A health foods shop owner is wondering if his customers' daily vitamin supplement habits are in the same proportion as the general population of adults. The shop owner heard in a news report that 60% of all adults reported that they took a daily vitamin. The shop owner believes that his customers have a greater proportion of adults who take a daily vitamin, so he decides to conduct a hypothesis test using the following null and alternative hypotheses: H0 : p = 0.6 and Ha: p > 0.6. The shop owner collected data from 50 randomly selected customers.
5) List and verify that the conditions hold so that the sampling distribution of the z test statistic will approximately follow the standard normal distribution. Answer: Random sample: yes, stated in the problem description. Large enough sample size: yes, since there are at least 10 expected successes and 10 expected failures (50 * 0.6 = 30 ≥ 10 and 50 * 0.4 = 20 ≥ 10). Large enough population: yes, it is reasonable to assume that there are at least 500 customers in the population. Independence: yes, it is reasonable to assume that customer responses are independent. Null hypothesis: yes, it is reasonable to assume the null hypothesis is true.
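Worked check (an illustrative sketch, not part of the original test bank): the two numeric conditions in the answer above can be written as simple checks. The function names are hypothetical, and the population rule is assumed to be "at least 10 times the sample size", which matches the 500 customers quoted for a sample of 50. Random sampling and independence cannot be verified by arithmetic and still have to be judged from the study description.

```python
def large_sample_ok(n, p0, minimum=10):
    """At least `minimum` expected successes and failures under H0."""
    expected_successes = n * p0
    expected_failures = n * (1 - p0)
    return expected_successes >= minimum and expected_failures >= minimum

def large_population_ok(n, population_size):
    """Assumed rule: population is at least 10 times the sample size."""
    return population_size >= 10 * n

# Shop owner: n = 50, p0 = 0.6, population assumed to be 500 or more.
print(large_sample_ok(50, 0.6), large_population_ok(50, 500))  # True True

# A hypothetical small sample that would fail the size condition.
print(large_sample_ok(30, 0.1))  # False
```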
2 Find the Test Statistic and Corresponding p-value for a One-Proportion z-Test
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information to answer the question. A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The researcher's null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p > 0.26. The researcher collected data from a random sample of 75 adults in the region of interest.
1) To continue the study into the drinking habits of adults, the researcher decides to collect data from adults working in "blue collar" jobs to see whether their drinking habits are in the same proportion as the general public. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p > 0.26. The researcher collected data from a random sample of 90 adults with "blue collar" jobs of which 30 stated that they drank once a week or less in the last month. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂, the test statistic, and the p-value associated with the test statistic. Round all values to the nearest thousandth.
A) p̂ = 0.333, z = 0.067, p-value = 0.946
B) p̂ = 0.667, z = 8.795, p-value = 0.000
C) p̂ = 0.333, z = 1.586, p-value = 0.056
D) p̂ = 0.289, z = -0.829, p-value = 0.407
Answer: C
Use the following information to answer the question. A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region.
2) To continue the study into the drinking habits of adults, the researcher decides to collect data from adults working in "white collar" jobs to see whether their drinking habits are in the same proportion as the general public. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p < 0.26. The researcher collected data from a random sample of 120 adults with "white collar" jobs of which 25 stated that they drank once a week or less in the last month. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂, the test statistic, and the p-value associated with the test statistic. Round all values to the nearest thousandth.
A) p̂ = 0.208, z = -0.250, p-value = 0.401
B) p̂ = 0.75, z = -1.32, p-value = 0.599
C) p̂ = 0.208, z = -1.290, p-value = 0.098
D) p̂ = 0.30, z = 0.803, p-value = 0.041
Answer: C
Solve the problem.
3) Two researchers are comparing a blood pressure reducing drug with a two-sided alternative hypothesis. Their test statistics are z1 = 1.87 and z2 = −2.45. Which one of these has the smaller p-value and why? A) z1 = 1.87 because it is closer to the mean. B) z2 = −2.45 because the bigger z-value has a bigger area between −2.45 and 2.45. C) z2 = −2.45 because the z-value indicates almost two and a half standard deviations away from the mean, with the remaining tail areas smaller than those for 1.87. D) z1 = −1.87 because the area between −1.87 and 1.87 is smaller. Answer: C
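Worked check (an illustrative sketch, not part of the original test bank): for a two-sided alternative the p-value is the combined area in both tails beyond |z|, so the statistic farther from zero has the smaller p-value. The helper name two_sided_p_value is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Combined area in both tails beyond |z| on the standard normal curve."""
    return 2 * NormalDist().cdf(-abs(z))

p_z1 = two_sided_p_value(1.87)
p_z2 = two_sided_p_value(-2.45)

print(round(p_z1, 3), round(p_z2, 3))  # 0.061 0.014
print(p_z2 < p_z1)                     # True
```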
4) Suppose that the following is to be tested: H0 : p = 0.72 and Ha: p ≠ 0.72. Calculate the observed z-statistic for the following sample data: Sixty-eight out of ninety test subjects have the characteristic of interest. Round to the nearest thousandth. A) z = 0.751 B) z = 0.453 C) z = 0.756 D) z = -0.751 Answer: A
5) A research firm carried out a hypothesis test on a population proportion using a left-tailed alternative hypothesis. Which of the following z-scores would be associated with a p-value of 0.025? Round to the nearest hundredth. A) z = 2.97 B) z = -2.97 C) z = 1.96 D) z = -1.96 Answer: D
A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults in the general population who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha: p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region.
6) To continue the study into the drinking habits of adults, the researcher decides to collect data from adults working in "white collar" jobs to see whether their drinking habits are in the same proportion as the general public. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha: p < 0.26. The researcher collected data from a random sample of 120 adults with "white collar" jobs of which 25 stated that they drank once a week or less in the last month. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion, p̂, and the test statistic. Round all values to the nearest thousandth.
A) p̂ = 0.208, z = −0.250
B) p̂ = 0.75, z = −1.32
C) p̂ = 0.208, z = −1.290
D) p̂ = 0.30, z = 0.803
Answer: C
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem.
7) Two different students conduct a coin flipping experiment with a left-tailed alternative. They obtain the following test statistics: Student #1: z = −2.05 Student #2: z = −1.28 Which of the test statistics has a smaller p-value and why? Answer: If the null hypothesis is correct, then the test statistic should be close to zero. Values farther from zero are more surprising and so have smaller p-values. Since −2.05 is farther from zero than −1.28, the area under the normal curve in the left tail is smaller for −2.05; therefore −2.05 has the smaller p-value.
8) Suppose the following is to be tested: H0 : p = 0.4 and Ha : p ≠ 0.4. Calculate the observed z test statistic for the following sample data: n = 80 and 25 test subjects have the characteristic of interest. Round to the nearest thousandth. Answer: z = −1.598
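Worked check (an illustrative sketch, not part of the original test bank): for a left-tailed alternative the p-value is the area to the left of z, and the inverse normal CDF goes the other way, from a left-tail p-value back to a z-score. The variable names are hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Question 7: left-tail areas for the two students' test statistics.
p_student1 = standard_normal.cdf(-2.05)
p_student2 = standard_normal.cdf(-1.28)
print(p_student1 < p_student2)  # True

# Left-tailed z-score whose p-value is 0.025.
z_for_p = standard_normal.inv_cdf(0.025)
print(round(z_for_p, 2))  # -1.96
```

For a right-tailed alternative with the same p-value, the z-score is the mirror image, inv_cdf(1 - 0.025), which is about 1.96.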
Page 11 Copyright © 2020 Pearson Education, Inc.
A health foods shop owner is wondering if his customers' daily vitamin supplement habits are in the same proportion as the general population of adults. The shop owner heard in a news report that 60% of all adults reported that they took a daily vitamin. The shop owner believes that his customers have a greater proportion of adults who take a daily vitamin, so he decides to conduct a hypothesis test using the following null and alternative hypotheses: H0 : p = 0.6 and Ha : p > 0.6. The shop owner collected data from 50 randomly selected customers. 9) To continue the study, the shop owner decides to collect data from 60 customers between the ages of 22 and 27 to see whether the proportion in this age group is different from the general population of adults. From this sample, 26 reported that they took a daily vitamin. The null hypothesis for this test is H0 : p = 0.6 and the alternative hypothesis is Ha : p ≠ 0.6. Assume that the conditions that must be met in order for us to use the N(0, 1) distribution as the sampling distribution are satisfied. Find the values of the sample proportion (p̂), the observed test statistic, and the associated p-value. Round all values to the nearest thousandth.
Answer: p̂ = 0.433, z = −2.635, p-value = 0.008
3 Identify the Graph that Shows a p-value for a Given Alternative Hypothesis MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area depicts a p-value for a left-tailed test. A)
B)
C)
Answer: A
2) From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area correctly depicts the following hypothesis test results: H0 : p = 0.15, Ha : p ≠ 0.15, α = 0.05, z = -1.82, p-value = 0.0688 A)
B)
C)
Answer: A
3) From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area depicts a p-value for a two-tailed test. A)
B)
C)
Answer: C
4) From the TI-84 graphing calculator screenshots below, choose the screenshot whose shaded area correctly depicts the following hypothesis test results: H0 : p = 0.25, Ha : p > 0.25, α = 0.05, z = 2.01, p-value = 0.022 A)
B)
C)
Answer: B
5) From the TI-84 graphing calculator screenshots below, there are specific shaded areas that represent p-values. Choose the statement that best describes the interpretation of these p-values.
A) The p-values shown in graphics a and b display one-sided tests while c displays a shaded area showing a two-sided p-value. B) The p-value shown in graphic c displays a one-sided test with a small p-value. C) The p-value shown in graphic c displays a small two-sided p-value. D) The p-value shown in graphic b displays a one-sided test with a small p-value. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Shade the approximate area that would represent the p-value for the alternative hypothesis and z-score, and then calculate the p-value. Round to the nearest thousandth. 6) The alternative hypothesis is right-tailed with a z-score = 0.21
Answer: The right tail of the curve should be shaded and should approximately represent the p-value of 0.417. 7) The alternative hypothesis is two-tailed with a z-score = −1.88
Answer: Both tails of the curve should be shaded and should approximately represent the p-value of 0.060.
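The p-values in problems 6 and 7 are areas under the standard normal curve, so they can be checked numerically. The sketch below assumes Python 3.8 or later, where statistics.NormalDist is available; the helper name p_value and its tail argument are illustrative choices, not notation from the test bank.

```python
from statistics import NormalDist

def p_value(z, tail):
    """Area under N(0, 1) for a z-statistic; tail is 'left', 'right', or 'two'."""
    cdf = NormalDist().cdf  # cumulative area to the left of a z-score
    if tail == "left":
        return cdf(z)
    if tail == "right":
        return 1 - cdf(z)
    if tail == "two":
        return 2 * cdf(-abs(z))  # both tails, by symmetry
    raise ValueError("tail must be 'left', 'right', or 'two'")

print(round(p_value(0.21, "right"), 3))  # problem 6
print(round(p_value(-1.88, "two"), 3))   # problem 7
```

The same helper applies to the left-tailed case: a z-score of −1.96 with a left-tailed alternative gives an area of about 0.025.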
4 Perform a One-Proportion z-Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0 : p = 0.4 and Ha : p < 0.4. The test statistic and p-value for the test are z = -3.01 and p-value = 0.0013. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. A) There is not sufficient evidence to reject the null hypothesis that the population proportion is equal to 0.4. B) There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.4. C) There is sufficient evidence to conclude that the population proportion is significantly different from 0.4. D) There is not sufficient evidence to conclude that the population proportion is significantly different from 0.4. Answer: C 2) A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0 : p = 0.6 and Ha : p < 0.6. The test statistic and p-value for the test are z = -1.51 and p-value = 0.0655. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. A) There is insufficient evidence to reject the null hypothesis that the population proportion is equal to 0.6. B) There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.6. C) There is sufficient evidence to conclude that the population proportion is significantly different from 0.6. D) None of these. Answer: A Conduct a full hypothesis test and determine if the given claim is supported or not supported at the 0.05 significance level. 3) A manufacturer considers his production process to be out of control when defects exceed 3%. In a random sample of 85 items, the defect rate is 5.7%, but the manager claims that this is only a sample fluctuation and production is not really out of control. Test whether the manufacturer's claim that production is out of control is supported or not supported. A) not supported B) supported Answer: A 4) A supplier of 3.5" disks claims that no more than 1% of the disks are defective. In a random sample of 600 disks, it is found that 3% are defective, but the supplier claims that this is only a sample fluctuation. Test whether the supplier's claim that no more than 1% are defective is supported or not supported. A) not supported B) supported Answer: A 5) According to a recent poll, 53% of Americans would vote for the incumbent president. If a random sample of 100 people results in 40% who would vote for the incumbent, test whether the claim that the actual percentage is different from 53% is supported or not supported. A) supported B) not supported Answer: A Solve the problem. 6) Which of the following is not one of the four steps of the hypothesis test? A) State the null and alternative hypotheses about the population parameter. B) Make a decision to reject or not reject the null hypothesis. C) State the level of significance, choose a test, and check the conditions for the test. D) All of these are steps of the hypothesis test. Answer: D
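The full tests in problems 3 through 5 can be worked through in code. This is a sketch only: it assumes a right-tailed alternative for problems 3 and 4 (Ha : p > 0.03 and Ha : p > 0.01) and a two-sided alternative for problem 5, and the function name one_prop_test is hypothetical.

```python
import math
from statistics import NormalDist

def one_prop_test(p_hat, n, p0, tail, alpha=0.05):
    """One-proportion z-test; tail is 'left', 'right', or anything else for two-sided."""
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    cdf = NormalDist().cdf
    if tail == "left":
        p = cdf(z)
    elif tail == "right":
        p = 1 - cdf(z)
    else:
        p = 2 * cdf(-abs(z))
    return z, p, p <= alpha  # third value is True when H0 is rejected

# (label, p_hat, n, p0, tail) for problems 3, 4, and 5
cases = [
    ("3) defects exceed 3%", 0.057, 85, 0.03, "right"),
    ("4) no more than 1% defective", 0.03, 600, 0.01, "right"),
    ("5) different from 53%", 0.40, 100, 0.53, "two"),
]
for label, p_hat, n, p0, tail in cases:
    z, p, reject = one_prop_test(p_hat, n, p0, tail)
    print(f"{label}: z = {z:.2f}, p-value = {p:.4f}, reject H0: {reject}")
```

Whether rejecting H0 supports the claim depends on where the claim sits. In problems 3 and 5 the claim is the alternative hypothesis, so it is supported only if H0 is rejected. In problem 4 the supplier's claim is contained in the null hypothesis, so rejecting H0 means the claim is not supported.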
A researcher is wondering whether the smoking habits of young adults (18-25 years of age) in a certain city in the U.S. are the same as the proportion of the general population of young adults in the U.S. A recent study stated that the proportion of young adults who reported smoking at least twice a week in the last month was 0.16. The researcher collected data from a random sample of 75 adults in the city of interest. 7) State the hypotheses to be tested for this study. A) H0 : p = 0.16; Ha : p < 0.16 B) H0 : p ≠ 0.16; Ha : p < 0.16 C) H0 : p = 0.16; Ha : p > 0.16
D) H0 : p = 0.16; Ha : p ≠ 0.16
Answer: D 8) A researcher completes a hypothesis test with a resulting p-value = 0.076. Choose the best statement to interpret the results. A) The p-value for a two-sided test is divided by 2 resulting in a value less than a standard cutoff value of α = 0.05 supporting the hypothesis that the city of interest has a different proportion of smokers than the general public. B) The standard cutoff value of α = 0.05 is multiplied by two for a two-sided test and the resulting value of 0.10 is greater than the p-value. Therefore there is no evidence to support that the city of interest has a different proportion of smokers than the general public. C) The p-value is above a standard cutoff value of α = 0.05 and therefore there is insufficient evidence to support that the city of interest has a different proportion of smokers than the general public. D) The p-value is above a standard cutoff value of α = 0.05 and therefore there is sufficient evidence to support that the city of interest has a different proportion of smokers than the general public. Answer: C Solve the problem. 9) A medical researcher conducts a hypothesis test to test the claim that U.S. adult males have gained weight over the past 15 years. Assume that all the conditions for proceeding with a one-sample test on proportions have been met. The calculated test statistic is approximately 1.71 with an associated p-value of approximately 0.0436. Choose the conclusion that provides the best interpretation for the p-value at a significance level of α = 0.05. A) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.71 is 0.0436. This result is not surprising and could easily happen by chance. B) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.71 is 0.0436. This result is surprising and could not easily happen by chance. C) The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. D) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.71 is 0.0436. The result should be doubled for a two-sided test. This result is not surprising and could easily happen by chance. Answer: B 10) Which of the following is not one of the four steps of the hypothesis test? A) State the null and alternative hypotheses about the population parameter. B) Make a decision to either accept the null hypothesis or accept the alternative hypothesis. C) State the level of significance, choose a test, and check the conditions for the test. D) Calculate the test statistic and the p-value. Answer: B
A researcher is wondering whether the drinking habits of adults in a certain region of the country are in the same proportion as the general population of adults. Suppose a recent study stated that the proportion of adults in the general population who reported drinking once a week or less in the last month was 0.26. The null hypothesis for this test is H0 : p = 0.26 and the alternative hypothesis is Ha : p < 0.26. The researcher collected data from 150 surveys he handed out at a busy park located in the region. 11) Suppose a city official conducts a hypothesis test to test the claim that the majority of voters support a proposed tax to build sidewalks. Assume that all the conditions for proceeding with a one-sample test on proportions have been met. The calculated test statistic is approximately 1.40 with an associated p-value of approximately 0.081. Choose the conclusion that provides the best interpretation for the p-value at a significance level of α = 0.05. A) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.40 is 0.081. This result is surprising and could not easily happen by chance. B) If the null hypothesis is true, then the probability of getting a test statistic as large or larger than 1.40 is 0.081. This result is not surprising and could easily happen by chance. C) The p-value should be considered extreme; therefore the hypothesis test proves that the null hypothesis is true. D) If the null hypothesis is true, then the probability of getting a test statistic that is as extreme or more extreme than the calculated test statistic of 1.40 is 0.081. The result should be doubled for a two-sided test. This result is not surprising and could easily happen by chance. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 12) List and briefly summarize the four steps of the hypothesis test. Answer: (1) State the null and alternative hypotheses about the population parameter; (2) State the significance level, choose an appropriate test statistic, check the required conditions, and state any assumptions; (3) Compute the test statistic and associated p-value; (4) State your conclusion regarding the null hypothesis. Will you reject or fail to reject the null hypothesis? Explain in the context of the data. A health foods shop owner is wondering if his customers' daily vitamin supplement habits are in the same proportion as the general population of adults. The shop owner heard in a news report that 60% of all adults reported that they took a daily vitamin. The shop owner believes that his customers have a greater proportion of adults who take a daily vitamin, so he decides to conduct a hypothesis test using the following null and alternative hypotheses: H0 : p = 0.6 and Ha : p > 0.6. The shop owner collected data from 50 randomly selected customers. 13) Based on a 5% significance level, write a conclusion by interpreting the p-value. Be sure to clearly state the decision regarding the null hypothesis. Answer: If the null hypothesis is true, then the probability of getting a calculated z-score that is this far from zero is surprising and could not easily happen by chance. The null hypothesis should be rejected. There is sufficient evidence to support the claim that the proportion of customers who take a daily vitamin is greater than 0.60.
8.3 Hypothesis Tests in Detail 1 Understand p-Values MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A quality control manager thinks that there is a higher defective rate on the production line than the advertised value of p = 0.025. She does a hypothesis test with a significance level of 0.05. Symbolically, the null and alternative hypotheses are as follows: H0 : p = 0.025 and Ha : p > 0.025. She calculates a p-value for the hypothesis test of defective light bulbs to be approximately 0.067. Choose the correct interpretation for the p-value. A) The p-value tells us that if the defect rate is 0.025, then the probability that she would observe the percentage she actually observed or higher is 0.067. At a significance level of 0.05, this would not be an unusual outcome. B) The p-value tells us that the probability of concluding that the defect rate is equal to 0.025, when in fact it is greater than 0.025, is approximately 0.067. C) The p-value tells us that the true population rate of defective light bulbs is approximately 0.067. D) The p-value tells us that the result is significantly higher than the advertised value using a significance level of 0.05. Answer: A A janitor at a large office building believes that his supply of light bulbs has a defect rate that is higher than the defect rate stated by the manufacturer. The janitor's null hypothesis is that the supply of light bulbs has a manufacturer's defect rate of p = 0.09. He performs a test at a significance level of 0.01. The null and alternative hypotheses are as follows: H0 : p = 0.09 and Ha : p > 0.09. 2) The janitor calculates a p-value for the hypothesis test of approximately 0.113. Choose the correct interpretation for the p-value. A) The p-value tells us that the probability of concluding that the defect rate is equal to 0.09, when in fact it is greater than 0.09, is approximately 0.113. B) The p-value tells us that if the defect rate is 0.09, then the probability that the janitor will have 33 or more defective light bulbs out of 300 is approximately 0.113. C) The p-value tells us that the true population rate of defective light bulbs is approximately 0.113. D) The p-value tells us that the result is significantly higher than the advertised value using a significance level of 0.05. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A survey claims that 9 out of 10 doctors recommend aspirin for their patients with headaches. To test whether the claim is true or not, a random sample of 100 doctors is obtained. Of these doctors, 82 indicated that they recommend aspirin for headaches. Is the claim accurate? Test with a significance level of 0.05. 3) Write a statement explaining what the p-value means and how it should be interpreted. Answer: The calculated p-value is 0.0076 (0.0038 × 2 for a two-sided test), which is less than 0.05, the stated level of significance. Therefore, the result is significant. The sample results give evidence that the given proportion of 0.90 is not correct.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 4) Online dating sites and mobile dating apps have become more popular over the years for American adults, according to the Pew Research Center. In 2013, 11% of adults in the US reported using online sites and mobile apps for dating. In a follow-up Pew Research poll in 2016, that percent jumped up to 15% of American adults. From the 2016 data, a hypothesis test was performed using a significance level of 0.01 and the results are shown in the output below. Based on this, can we conclude that the percentage of American adults who use online sites or mobile apps for dating has increased since 2013? If not, what conclusion would be appropriate based on the sample data?
A) Yes, we can conclude that the percentage of American adults who use online sites or mobile apps for dating has increased since 2013 because the p-value is less than the significance level of 0.01. B) Yes, we can conclude that the percentage of American adults who use online sites or mobile apps for dating has increased since 2013 because the p-value is not less than the significance level of 0.01. C) No, we cannot conclude that the percentage of American adults who use online sites or mobile apps for dating has increased since 2013 because the p-value is less than the significance level of 0.01. D) No, we cannot conclude that the percentage of American adults who use online sites or mobile apps for dating has increased since 2013 because the p-value is not less than the significance level of 0.01. Answer: A 5) The null hypothesis on a multiple-choice test with four answer choices (A, B, C, D) is that a student is guessing, and therefore, the proportion of right answers is 0.25. A student who takes a ten-question multiple-choice test gets 5 correct out of 10. He says that this means he knows the material because the one-tailed p-value from the one-proportion z-test is 0.0336, and he is using a significance level of 0.05. What is wrong with his approach? A) There is nothing wrong with his approach. The student is correct in his reasoning. B) He should have calculated a two-tailed p-value from the one-proportion z-test instead of a one-tailed p-value. C) The Large Sample condition of the z-test was not satisfied, so the z-test does not provide a good approximate p-value. D) The Large Population condition of the z-test was not satisfied, so the z-test does not provide a good approximate p-value. Answer: C
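Problem 5 turns on the Large Sample condition. The sketch below assumes the common rule of thumb that the expected numbers of successes and failures under H0 should each be at least 10 (the threshold is an assumption; some texts use a different cutoff). It also computes the exact binomial tail for comparison. The function names are illustrative.

```python
import math

def large_sample_ok(n, p0, minimum=10):
    """Check that expected successes and failures under H0 both reach `minimum`."""
    return n * p0 >= minimum and n * (1 - p0) >= minimum

def exact_right_tail(x, n, p0):
    """Exact binomial probability P(X >= x) when the true proportion is p0."""
    return sum(math.comb(n, k) * p0**k * (1 - p0)**(n - k) for k in range(x, n + 1))

n, p0, correct = 10, 0.25, 5
print("expected successes under H0:", n * p0)
print("large sample condition met:", large_sample_ok(n, p0))
print("exact binomial tail:", round(exact_right_tail(correct, n, p0), 4))
```

With only 2.5 expected correct answers under guessing, the condition fails. The exact binomial tail comes out near 0.078, noticeably larger than the 0.0336 the z-test reports, which illustrates why the normal approximation is not trustworthy here.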
2 Identify Both Types of Errors MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. For the given hypothesis test, explain the meaning of the kind of error, as requested. 1) Suppose you are testing your friend to see whether she can tell the difference between the name brand and generic peanut butter. You give her 60 samples selected randomly, half from the name brand and half from the generic brand. The null hypothesis is that she is just guessing and should get about half right. Explain what the first kind of error would be in this case (when you reject the null hypothesis when it is actually true). A) The first kind of error would be saying that your friend can tell the difference between the two kinds of peanut butter, when she really cannot. B) The first kind of error would be saying that your friend cannot tell the difference between the two kinds of peanut butter, when she really can. C) The first kind of error would be saying that your friend can tell the difference between the two kinds of peanut butter, when she really can. D) The first kind of error would be saying that your friend cannot tell the difference between the two kinds of peanut butter, when she really cannot. Answer: A 2) A statistics student has heard that about 22% of the students on his campus attend sporting events weekly. He wants to know if statistics students attend events in the same proportions as the general student body. Explain what the second type of error would be in this case (where the student fails to reject a null hypothesis that is actually false). A) The second kind of error would be saying that there is no difference in the attendance of statistics students and the student body as a whole at sporting events, even though there really is. B) The second kind of error would be saying that statistics students attend sporting events in different proportions than the student body as a whole, even though they actually have the same attendance proportion. C) The second kind of error would be saying that there is no difference in the attendance of statistics students and the student body as a whole at sporting events, even though statistics students actually go much less often. D) The second kind of error would be saying that statistics students attend sporting events in much higher proportions than the student body as a whole, even though they actually have the same attendance proportion. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 3) Explain why failing to reject the null hypothesis does not prove that the null hypothesis is true. Answer: Failing to reject the null hypothesis does not prove that the null hypothesis is true, only that the sample evidence does not show surprising enough results to suggest that the assumption that the null hypothesis is true is incorrect. To say the null hypothesis is proven true means that there is no doubt about it. This is not possible based on chance processes.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 4) According to a 2011 Pew Research Center survey, roughly 35% of Americans owned a smartphone. Suppose we were interested in whether the percent of Americans who own a smartphone in 2019 is higher. Describe the two types of errors we might make in conducting this hypothesis test. A) The first type of error would be concluding that the percent of Americans who own a smartphone in 2019 is not higher than 35% when it actually is not. The second type of error would be concluding that the percent of Americans who own a smartphone in 2019 is higher than 35% when, in fact, it actually is. B) The first type of error would be concluding that the percent of Americans who own a smartphone in 2019 is higher than 35% when it actually is. The second type of error would be concluding that the percent of Americans who own a smartphone in 2019 is not higher than 35% when it actually is not. C) The first type of error would be concluding that the percent of Americans who own a smartphone in 2019 is not higher than 35% when it actually is. The second type of error would be concluding that the percent of Americans who own a smartphone in 2019 is higher than 35% when it actually is. D) The first type of error would be concluding that the percent of Americans who own a smartphone in 2019 is higher than 35% when, in fact, it is not. The second type of error would be concluding that the percent of Americans who own a smartphone in 2019 is not higher than 35% when it actually is. Answer: D 5) Suppose you are testing someone to see whether he or she can tell Coca-Cola® from Pepsi® by taste only. You have many small pre-made cups ready, where half are filled with Coca-Cola® and half are filled with Pepsi®. You randomly select one cup at a time to give the blindfolded taste-tester. The null hypothesis is that the taster is just guessing and should get about half right. Report the two possible errors, in context, when performing such a test. A) One error is saying the taster can tell Coca-Cola® from Pepsi® when she cannot. The other error is saying that the taster cannot tell Coca-Cola® from Pepsi® when she can. B) One error is saying the taster cannot tell Coca-Cola® from Pepsi® when she can. The other error is saying that the taster can tell Coca-Cola® from Pepsi® when she cannot. C) One error is saying the taster can tell Coca-Cola® from Pepsi® when she can. The other error is saying that the taster cannot tell Coca-Cola® from Pepsi® when she can. D) One error is saying the taster cannot tell Coca-Cola® from Pepsi® when she cannot. The other error is saying that the taster can tell Coca-Cola® from Pepsi® when she can. Answer: A 6) When we set the significance level to a small value, what are we trying to avoid? A) We are trying to avoid one of the types of mistakes because the significance level is the probability of failing to reject the null hypothesis when it is false. B) We are trying to avoid one of the types of mistakes because the significance level is the probability of rejecting the null hypothesis when it is true. C) We are trying to avoid one of the types of mistakes because the significance level is the probability of failing to reject the null hypothesis when it is true. D) We are trying to avoid one of the types of mistakes because the significance level is the probability of rejecting the null hypothesis when it is false. Answer: B
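The link between the significance level and the first kind of error (problem 6) can be illustrated with a taste test of 60 trials, the sample size used in problem 1 of this section. The sketch below rests on several assumptions that the problems do not state: a right-tailed one-proportion z-test, α = 0.05, and the critical value 1.645. It adds up the exact binomial probability of every outcome that would lead to rejecting a true null hypothesis of pure guessing. The function name is hypothetical.

```python
import math

def type_one_error_rate(n, p0, z_crit):
    """Exact chance of rejecting H0 with a right-tailed z-test when p really equals p0."""
    se = math.sqrt(p0 * (1 - p0) / n)
    total = 0.0
    for x in range(n + 1):
        z = (x / n - p0) / se
        if z > z_crit:  # this count of correct answers would lead to rejecting H0
            total += math.comb(n, x) * p0**x * (1 - p0)**(n - x)
    return total

# Assumed setup: 60 tastings, guessing (p0 = 0.5), alpha = 0.05 so z_crit = 1.645
rate = type_one_error_rate(60, 0.5, 1.645)
print(f"probability of the first kind of error: {rate:.4f}")
```

The result should land close to, but not exactly at, the nominal 0.05, because the number of correct answers is discrete. Choosing a smaller significance level raises the critical value and shrinks this probability.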
3 Choose the Appropriate Test of Significance and Perform the Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) The business college computing center wants to determine the proportion of business students who have personal computers (PCs) at home, because if the proportion exceeds 25%, then the lab will scale back a proposed enlargement of its facilities. Suppose 300 business students were randomly sampled and 85 have PCs at home. Should they use a hypothesis test or a confidence interval to answer this question? A) Hypothesis test. B) Confidence interval. Answer: A 2) The city council wants to know what percentage of people in their town have health insurance. They take a poll, and of 92 adults selected randomly from the town, 64 have health insurance. Should the council use a hypothesis test or confidence interval to answer their question? A) Confidence interval. B) Hypothesis test. Answer: A Solve the problem. 3) Read the following, then choose the appropriate test and name the population(s). A researcher asks a random sample of 200 men whether they had made an online purchase in the last three months and wants to determine whether the proportion of men who make online purchases is less than 0.18. A) Two-proportion z-test; the population is the 200 men surveyed. B) One-proportion z-test; the population is all men. C) One-proportion z-test; the population is all adults who make online purchases. D) Two-proportion z-test; one population is all men who make online purchases and the other population is all men who do not make online purchases. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 4) For the following description, state whether a one-proportion z-test or a two-proportion z-test would be appropriate, and name the population. A researcher asks people who are 20-29 years old and senior citizens (people over 65) whether they support a new tax on income. He wants to determine whether the proportions that support the tax differ for these age groups. Answer: Two-proportion z-test. One population is all people in the 20-29 age bracket, the other population is all senior citizens. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 5) A supporter of a political candidate wants to estimate whether the candidate will win the majority vote, using a 5% significance level. Suppose a poll is taken and 570 out of 1000 randomly selected people support that candidate and plan to vote for her. Should the supporter use a hypothesis test or a confidence interval to answer this question? Why? A) The supporter should use a hypothesis test because she wants to know whether or not the candidate will get more than 50% of the votes. B) The supporter should use a confidence interval because she wants to know whether or not the candidate will get more than 50% of the votes. C) The supporter should use a hypothesis test because she wants an estimate for the population percentage. D) The supporter should use a confidence interval because she wants an estimate for the population percentage. Answer: A
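The poll of 570 supporters out of 1000 can be analyzed both ways, which shows how the two tools answer different questions. The sketch below assumes a right-tailed test of H0 : p = 0.5 for the yes-or-no question about a majority and a 95% confidence level for the estimate; the confidence level is an assumption, since only the 5% significance level is given.

```python
import math
from statistics import NormalDist

x, n, p0 = 570, 1000, 0.5
p_hat = x / n

# Hypothesis test: H0: p = 0.5 versus Ha: p > 0.5 (standard error uses p0).
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
p_value = 1 - NormalDist().cdf(z)

# Confidence interval: estimates p itself (standard error uses p_hat).
z_star = NormalDist().inv_cdf(0.975)  # about 1.96 for 95% confidence
margin = z_star * math.sqrt(p_hat * (1 - p_hat) / n)
ci = (p_hat - margin, p_hat + margin)

print(f"z = {z:.2f}, p-value = {p_value:.2g}")
print(f"95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```

The test returns a decision about whether the proportion exceeds 50%, while the interval returns a range of plausible values for the population percentage.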
6) A supporter of a political candidate wants to know the population percentage of people who will vote for that candidate. Suppose a poll is taken and 570 out of 1000 randomly selected people support that candidate and plan to vote for her. Should the supporter use a hypothesis test or a confidence interval to answer this question? Why? A) The supporter should use a hypothesis test because she wants to know whether or not the candidate will get more than 50% of the votes. B) The supporter should use a confidence interval because she wants to know whether or not the candidate will get more than 50% of the votes. C) The supporter should use a hypothesis test because she wants an estimate for the population percentage. D) The supporter should use a confidence interval because she wants an estimate for the population percentage. Answer: D 4 Understand Hypotheses and How to Interpret the Results of a Hypothesis Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Formulate the indicated conclusion in nontechnical terms. Be sure to address the original claim. 1) An entomologist writes an article in a scientific journal which claims that fewer than 21 in ten thousand male fireflies are unable to produce light due to a genetic mutation. Assuming that a hypothesis test of the claim has been conducted and that the conclusion is to reject the null hypothesis, state the conclusion in nontechnical terms. A) There is sufficient evidence to support the claim that the true proportion is less than 21 in ten thousand. B) There is not sufficient evidence to support the claim that the true proportion is less than 21 in ten thousand. C) There is sufficient evidence to support the claim that the true proportion is greater than 21 in ten thousand. D) There is not sufficient evidence to support the claim that the true proportion is greater than 21 in ten thousand. 
Answer: A 2) A skeptical paranormal researcher claims that the proportion of Americans that have seen a UFO, p, is less than 1 in every ten thousand. Assuming that a hypothesis test of the claim has been conducted and that the conclusion is failure to reject the null hypothesis, state the conclusion in nontechnical terms. A) There is not sufficient evidence to support the claim that the true proportion is less than 1 in ten thousand. B) There is sufficient evidence to support the claim that the true proportion is less than 1 in ten thousand. C) There is sufficient evidence to support the claim that the true proportion is greater than 1 in ten thousand. D) There is not sufficient evidence to support the claim that the true proportion is greater than 1 in ten thousand. Answer: A 3) A psychologist claims that more than 23 percent of the population suffers from professional problems due to extreme shyness. Assuming that a hypothesis test of the claim has been conducted and that the conclusion is failure to reject the null hypothesis, state the conclusion in nontechnical terms. A) There is not sufficient evidence to support the claim that the true proportion is greater than 23 percent. B) There is sufficient evidence to support the claim that the true proportion is greater than 23 percent. C) There is not sufficient evidence to support the claim that the true proportion is less than 23 percent. D) There is sufficient evidence to support the claim that the true proportion is less than 23 percent. Answer: A
Page 26 Copyright © 2020 Pearson Education, Inc.
Solve the problem. 4) A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0 : p = 0.6 and Ha: p < 0.6. The test statistic and p-value for the test are z = −1.51 and p-value = 0.0655. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. A) There is insufficient evidence to reject the null hypothesis that the population proportion is equal to 0.6. B) There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.6. C) There is sufficient evidence to conclude that the population proportion is significantly different from 0.6. D) There is insufficient evidence to determine the significance. Answer: A 5) Which statement best describes the significance level of a hypothesis test? A) The probability of rejecting the null hypothesis when the null hypothesis is true. B) The probability of rejecting the null hypothesis when the null hypothesis is not true. C) The probability of failing to reject the null hypothesis when the null hypothesis is not true. D) None of these Answer: A 6) When conducting hypothesis tests, is it acceptable practice to create the alternative hypothesis based on the results from your data? A) Yes, this will help you determine if you should make the alternative hypothesis “less than” or “greater than.” B) Yes, this will help you determine if you should use a one-sided or two-sided alternative hypothesis. C) No, this is cheating. You should make the alternative hypothesis before looking at any data. Otherwise, your significance level will be incorrect. D) It does not matter if you make the alternative hypothesis before or after looking at data. Answer: C 7) Agree or disagree with the following statement: failing to reject the null hypothesis means that we are accepting it as true. A) Agree. We have shown that there is not enough evidence for the alternative hypothesis, so the null hypothesis must be true.
B) Disagree. We have only shown that there is not enough evidence to reject the null hypothesis, but not that it is true. Answer: B 8) A pharmaceutical company claims that its new migraine medication reduces migraine frequency due to its particular drug formulation. To test this claim, researchers randomly assign migraine sufferers to two groups. One group takes the new migraine medication and the other group takes a placebo. After 3 months, the researchers test the claim that the migraine drug is better than the placebo in reducing the frequency of migraines. They record the proportion of each group that saw a reduction in their migraine frequencies. They announce that they failed to reject the null hypothesis. Which of the following is a valid interpretation of the researchers’ findings? A) The medication and placebo are equally effective in reducing migraine frequency. B) The placebo was less effective in reducing migraine frequency than the medication. C) There is a significant difference in the proportion of sufferers that saw a reduction in their migraine frequencies between the medication and placebo groups. D) There is no significant difference in the proportion of sufferers that saw a reduction in their migraine frequencies between the medication and placebo groups. Answer: D
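The decision in Question 4 above (z = −1.51, α = 0.05, left-tailed alternative) can be checked numerically. The following is a minimal sketch, not part of the original test bank; it assumes the standard normal model via Python's `statistics.NormalDist`, and the helper names `left_tail_p_value` and `decide` are hypothetical.

```python
from statistics import NormalDist


def left_tail_p_value(z: float) -> float:
    """P(Z <= z) for a standard normal Z; applies when Ha is 'less than'."""
    return NormalDist().cdf(z)


def decide(p_value: float, alpha: float) -> str:
    """Reject H0 only when the p-value is at or below the significance level."""
    return "reject H0" if p_value <= alpha else "fail to reject H0"


# Values from Question 4: H0: p = 0.6, Ha: p < 0.6, z = -1.51, alpha = 0.05
p_value = left_tail_p_value(-1.51)
print(round(p_value, 4), decide(p_value, 0.05))
```

Because the p-value exceeds α, the sketch reports a failure to reject, which agrees with answer A; it says nothing about the null hypothesis being true.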
8.4 Comparing Proportions from Two Populations 1 Understand How to Interpret Two-Proportion z-Tests MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A researcher conducts a hypothesis test on a population proportion. Her null and alternative hypotheses are H0 : p = 0.4 and Ha: p < 0.4. The test statistic and p-value for the test are z = −3.01 and p-value = 0.0013. For a significance level of α = 0.05, choose the correct conclusion regarding the null hypothesis. A) There is not sufficient evidence to reject the null hypothesis that the population proportion is equal to 0.4. B) There is sufficient evidence to accept the null hypothesis that the population proportion is equal to 0.4. C) There is sufficient evidence to conclude that the population proportion is significantly different from 0.4. D) There is not sufficient evidence to conclude that the population proportion is significantly different from 0.4. Answer: C 2) A polling agency is interested in testing whether the proportion of women who support a female candidate for office is greater than the proportion of men. The null hypothesis is that there is no difference in the proportion of men and women who support the female candidate. The alternative hypothesis is that the proportion of women who support the female candidate is greater than the proportion of men. The test results in a p-value of 0.112. Which of the following is the best interpretation of the p-value? A) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is greater than the proportion of men. B) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. C) The p-value is the probability that men will support the female candidate.
D) The p-value is the probability that women will support the female candidate. Answer: B 3) A polling agency is interested in testing whether the proportion of women who support a female candidate for office is less than the proportion of men. The null hypothesis is that there is no difference in the proportions of men and women who support the female candidate. The alternative hypothesis is that the proportion of women who support the female candidate is less than the proportion of men. The test results in a p -value of 0.041. Which of the following is the best interpretation of the p-value? A) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that the proportion of women who support the female candidate is less than the proportion of men. B) The p-value is the probability that the majority of women will support the female candidate. C) The p-value is the probability of getting a result that is as extreme as or more extreme than the one obtained, assuming that there is no difference in the proportions. D) The p-value is the probability that the majority of men will support the female candidate. Answer: C
2 Perform a Two-Proportion z-Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following is not a condition that must be checked before proceeding with a two-sample test? A) Both samples must be large enough so that the product of each sample size (n1 and n2) and the pooled estimate, p̂, is greater than or equal to 10. B) Each sample must be a random sample. C) The samples must be independent of each other. D) All of these are conditions that must be checked to proceed with a two-sample test. Answer: D 2) Which of the following is not a condition that must be checked before proceeding with a two-sample test? A) The observations within each sample must be independent of one another. B) Each sample must be a random sample. C) The samples must be independent of each other. D) All of these are conditions that must be checked to proceed with a two-sample test. Answer: D 3) A researcher believes that the proportion of women who exercise with a friend is greater than the proportion of men. He takes a random sample from each population and records the response to the question, "Have you exercised with a friend at least once in the last seven days?" The null hypothesis is H0 : pwomen = pmen. Choose the correct alternative hypothesis. A) Ha : pwomen < pmen
B) Ha : pwomen > pmen
C) Ha : pwomen ≠ pmen
D) Ha : p = 0
Answer: B 4) A researcher believes that the reading habits of men and women are different. He takes a random sample from each population and records the response to the question, "Did you read at least one book last month?" The null hypothesis is H0 : pwomen = pmen. Choose the correct alternative hypothesis. A) Ha : pwomen < pmen
B) Ha : pwomen > pmen
C) Ha : pwomen ≠ pmen
D) Ha : p = 0
Answer: C
5) A researcher believes that children who attend elementary school in a rural setting have lower obesity rates than children who attend elementary school in an urban setting. The researcher collects a random sample from each population and records the proportion of children in each sample who are clinically obese. The data are summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.
Find the z-statistic (rounded to the nearest hundredth) and p-value (rounded to the nearest thousandth) for this hypothesis test. Using a 5% significance level, state the correct conclusion regarding the null hypothesis H0 : prural = purban. A) z = -1.95, p = 0.026. There is sufficient evidence to reject the null hypothesis. B) z = -1.95, p = 0.026. There is not sufficient evidence to reject the null hypothesis. C) z = 1.95, p = 0.026. There is sufficient evidence to prove that the population proportions are the same. D) z = -1.85, p = 0.032. There is sufficient evidence to accept the null hypothesis. Answer: A 6) A researcher believes that children who attend elementary school in a rural setting are more physically active than children who attend elementary school in an urban setting. The researcher collects a random sample from each population and records the proportion of children in each sample who reported participating in at least one hour of rigorous activity a day. The data are summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.
Find the z-statistic (rounded to the nearest hundredth) and p-value (rounded to the nearest thousandth) for this hypothesis test. Using a 5% significance level, state the correct conclusion regarding the null hypothesis, H0 : prural = purban. A) z = -1.79, p = 0.037. There is insufficient evidence to reject the null hypothesis. B) z = 1.79, p = 0.037. There is sufficient evidence to reject the null hypothesis. C) z = 0.82, p = 0.073. There is sufficient evidence to accept the null hypothesis. D) z = 0.71, p = 0.073. There is sufficient evidence to reject the null hypothesis. Answer: B 7) A researcher believes that the exercise habits of men and women are different. He takes a random sample from each population and records the response to the question, "Did you exercise for at least 30 minutes twice a week?" The null hypothesis is H0 : p women = p men. Choose the correct alternative hypothesis. A) Ha: p women < p men
B) Ha: p women > p men
C) Ha: p women ≠ p men
D) Ha: p = 0
Answer: C
8) Which of the following is not a condition that must be checked before proceeding with a two-sample test? A) Both samples must be large enough so that the product of each sample size (n1 and n2) and the pooled estimate, p̂, is greater than or equal to 10. B) Each sample must be a random sample. C) The samples must be independent of each other. D) Each sample must be from populations with the same standard deviation. Answer: D 9) Which of the following is not a condition that must be checked before proceeding with a two-sample test? A) The observations within each sample must be independent of one another. B) Each sample must be a random sample. C) The samples must be independent of each other. D) All of these are conditions that must be checked to proceed with a two-sample test. Answer: D 10) Two movie reviewers give movies "thumbs up" and "thumbs down" ratings. You sample 100 movies that they both have rated and find that they both gave "thumbs up" to 25 movies, both gave "thumbs down" to 30 movies, Sarah gave "thumbs up" and Jessica "thumbs down" to 28 movies, and the remaining movies Sarah gave "thumbs down" and Jessica "thumbs up". Test whether there is a tendency for one reviewer to give more movies "thumbs up" (proportion 1) than the other (proportion 2). A) z = 1.56; For a two-sided test at the α = 0.05 level, there is insufficient evidence to reject the null hypothesis because the cutoff z-value is at 1.96. B) z = −1.56; For a two-sided test at the α = 0.05 level, there is insufficient evidence to reject the null hypothesis because the cutoff z-value is at 1.96. C) z = 1.96 There is sufficient evidence to accept the null hypothesis. D) z = −1.96 There is sufficient evidence to reject the null hypothesis. Answer: A 11) A researcher asks random samples of residents of two separate counties as to whether they had purchased organically grown food in the last three months. He wants to determine whether the proportion of residents of one county who purchase organically grown food is greater than the proportion of residents of the second county who purchase organically grown food. Choose the appropriate test and name the population(s).
A) One-proportion z-test; the population is all residents of a state. B) One-proportion z-test; the population is all residents of the first county. C) Two-proportion z-test; one population is all residents of the first county and the other population is residents of the second county. D) Two-proportion z-test; one population is all adults who buy organically grown food and the other population is all adults who do not buy organically grown food. Answer: C
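The z-statistic in Question 10 above can be reproduced with the pooled two-proportion formula. The sketch below is illustrative only: the function name `two_proportion_z` is hypothetical, and it simply follows the answer key's approach of treating the two reviewers' ratings as two samples of 100 (the same movies are rated by both reviewers, so the independence condition is an assumption here).

```python
from math import sqrt


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z-statistic for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # pooled estimate of the common proportion
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Counts from Question 10 (100 movies in total)
both_up, both_down, sarah_up_only = 25, 30, 28
jessica_up_only = 100 - both_up - both_down - sarah_up_only  # remaining movies

sarah_up = both_up + sarah_up_only      # proportion 1 numerator
jessica_up = both_up + jessica_up_only  # proportion 2 numerator

z = two_proportion_z(sarah_up, 100, jessica_up, 100)
print(round(z, 2))
```

Since the magnitude of z is below the two-sided cutoff of 1.96, the result is consistent with answer A.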
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 12) A sociologist believes that families that eat at least one meal a day together (without the interference of any other media) will have better communication skills. The sociologist conducts a study to see if there is a difference in the proportion of meals that are eaten together as a family for families living in a rural setting compared to families living in an urban setting. She collects a random sample from each population and records the proportion of test subjects that reported that they had eaten at least 3 meals per week together as a family. The data are summarized in the table below. Assume that all conditions for proceeding with a two-sample test have been met.
Find the z test statistic (rounded to the nearest hundredth) and p -value (rounded to the nearest thousandth) for testing the hypothesis that the population proportions are different. At the 5% significance level, state the correct conclusion regarding the null hypothesis H0 : p rural = p urban. Round all calculations to the nearest hundredth. Answer: z = −2.05; p = 0.04 . There is sufficient evidence to reject the null hypothesis that the population proportions are equal. 13) A sociologist believes that the proportion of single men who attend church on a regular basis is less than the proportion of single women. She takes a random sample from each population and records the proportion from each that reported that they attended church on a regular basis. The null hypothesis is H0 : p men = p women . State the correct alternative hypothesis with a sentence and symbolically. Answer: The population proportion of single men who attend church regularly is less than the population proportion of single women who attend church regularly. Ha: p men < p women 14) When a two-sample test of proportions is conducted, there are two conditions of independence that must be checked. State the two conditions of independence. Be sure that your statement clearly states the difference between the two conditions. Answer: One condition is that the two samples themselves must be independent of each other. The second condition is that the observations within each sample must be independent of each other.
Ch. 8 Hypothesis Testing for Population Proportions Answer Key 8.1 The Essential Ingredients of Hypothesis Testing 1 Identify Null and Alternative Hypotheses 1) B 2) D 3) A 4) A 5) D 6) D 7) B 8) The null hypothesis is that the carnival worker has no special powers, therefore H0 : p = 0.5. The alternative hypothesis is that he can communicate with a rock, therefore Ha: p > 0.5. 9) H0 : p = 1/6 and Ha: p ≠ 1/6; The null hypothesis states that the population parameter is no different than what is expected and is assumed to be true. The alternative hypothesis states that the population parameter may be different and contains the claim that the researcher is trying to show. 10) The null and alternative hypotheses are as follows: H0 : p = 0.90 and Ha: p ≠ 0.90. 2 Find Test Statistics 1) B 2) A 3) B 4) A 5) A 6) A 7) A 8) z = -2.667 9) The cutoff z-value for a significance level of 0.05 is +/-1.96. The calculated value of -2.667 is farther from zero than the cutoff value and therefore the result is significant. 3 Find and Interpret p-Values 1) D 2) A 3) B 4) C 5) B 6) B 7) A 8) B 9) C 10) D 4 Know Concepts: The Essential Ingredients of Hypothesis Testing 1) C 2) A 3) B 4) C 5) B 6) B 7) A 8) The significance level of 0.05 is the probability of concluding that the defect rate is different from 0.90, when in fact the defect rate is equal to 0.90.
8.2 Hypothesis Testing in Four Steps 1 Check the Conditions for a One-Proportion z-Test 1) A
2) A 3) A 4) A 5) Random sample: stated in problem description; large enough sample size: there are at least 10 expected successes and failures (50 * 0.6 ≥ 10 and 50 * 0.4 ≥ 10); large enough population: yes, it is reasonable to assume that there are at least 500 customers in the population; independence: yes, reasonable to assume that customer responses are independent; null hypothesis: yes, it is reasonable to assume the null hypothesis is true. 2 Find the Test Statistic and Corresponding p-value for a One-Proportion z-Test 1) C 2) C 3) C 4) A 5) D 6) C 7) If the null hypothesis is correct, then the test statistic should be close to zero. Values farther from zero are more surprising and so have smaller p-values. Since -2.05 is farther from zero than is -1.28, the area under the normal curve in the left tail is smaller for -2.05, therefore -2.05 will have a smaller p-value. 8) z = −1.598
9) p̂ = 0.433, z = 02.635, p = 0.008 3 Identify the Graph that Shows a p-value for a Given Alternative Hypothesis 1) A 2) A 3) C 4) B 5) A 6) The right tail of the curve should be shaded and should approximately represent the p-value of 0.417. 7) Both tails of the curve should be shaded and should approximately represent the p-value of 0.060. 4 Perform a One-Proportion z-Test 1) C 2) A 3) A 4) A 5) A 6) D 7) D 8) C 9) B 10) B 11) B 12) (1) State the null and alternative hypotheses about the population parameter; (2) State the significance level, an appropriate test statistic, check the required conditions, and state any assumptions; (3) Compute the test statistic and associated p-value; (4) State your conclusion regarding the null hypothesis. Will you reject or fail to reject the null hypothesis? Explain in the context of the data. 13) If the null hypothesis is true, then getting a calculated z-score that is this far from zero is surprising and could not easily happen by chance. The null hypothesis should be rejected. There is sufficient evidence to support the claim that the proportion of customers who take a daily vitamin is different than 0.60.
8.3 Hypothesis Tests in Detail 1 Understand p-Values 1) A 2) B 3) The calculated p-value is 0.0076 (0.0038 * 2 for a two-sided test), which is less than 0.05, the stated level of significance. Therefore, the result is significant. The sample results give evidence that the given proportion of 0.90 is not correct. 4) A 5) C
2 Identify Both Types of Errors 1) A 2) A 3) Failing to reject the null hypothesis does not prove that the null hypothesis is true, only that the sample evidence does not show surprising enough results to suggest that the assumption that the null hypothesis is true is incorrect. To say the null hypothesis is proven true means that there is no doubt about it. This is not possible based on chance processes. 4) D 5) A 6) B 3 Choose the Appropriate Test of Significance and Perform the Test 1) A 2) A 3) B 4) Two-proportion z-test. One population is all people in the 20-29 age bracket, the other population is all senior citizens. 5) A 6) D 4 Understand Hypotheses and How to Interpret the Results of a Hypothesis Test 1) A 2) A 3) A 4) A 5) A 6) C 7) B 8) D
8.4 Comparing Proportions from Two Populations 1 Understand How to Interpret Two-Proportion z-Tests 1) C 2) B 3) C 2 Perform a Two-Proportion z-Test 1) D 2) D 3) B 4) C 5) A 6) B 7) C 8) D 9) D 10) A 11) C 12) z = −2.05; p = 0.04 . There is sufficient evidence to reject the null hypothesis that the population proportions are equal. 13) The population proportion of single men who attend church regularly is less than the population proportion of single women who attend church regularly. Ha: p men < p women 14) One condition is that the two samples themselves must be independent of each other. The second condition is that the observations within each sample must be independent of each other.
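Two numeric statements in the Chapter 8 answer key, the ±1.96 cutoff for a 0.05 significance level and the doubling of the one-tail area 0.0038 for z = −2.667, can be checked against the standard normal distribution. This is a hedged sketch rather than the textbook's procedure; the helper names are hypothetical, and `statistics.NormalDist` stands in for a z-table.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def one_tail_area(z: float) -> float:
    """Area under the standard normal curve beyond |z| in a single tail."""
    return std_normal.cdf(-abs(z))


def two_sided_cutoff(alpha: float) -> float:
    """Positive critical z-value that leaves alpha / 2 in each tail."""
    return std_normal.inv_cdf(1 - alpha / 2)


z = -2.667  # test statistic quoted in the answer key
cutoff = two_sided_cutoff(0.05)
tail = one_tail_area(z)

print(round(cutoff, 2))    # two-sided cutoff for alpha = 0.05
print(round(tail, 4))      # one-tail area, as a z-table would report it
print(abs(z) > cutoff)     # True means the result is significant
print(2 * round(tail, 4))  # the key doubles the rounded one-tail area
```

Doubling the unrounded tail area gives a slightly different fourth decimal than doubling the rounded table value, which is why small discrepancies from the key's 0.0076 are possible.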
Ch. 9 Inferring Population Means 9.1 Sample Means of Random Samples 1 Distinguish Between Parameters and Statistics MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants was taken and the mean finishing time was found to be 1.59 hours with a standard deviation of 0.30 hours. 1) In this example, the numerical values of 1.67 hours and 0.25 hours are A) estimates B) statistics C) parameters D) unbiased estimators
Answer: C Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants in the 40-44 age group was taken and the mean finishing time was found to be 1.62 hours with a standard deviation of 0.40 hours. 2) In this example, the numerical values of 1.62 hours and 0.40 hours are A) estimates B) statistics C) parameters D) unbiased estimators
Answer: B Feature movie lengths (in hours) were measured for all movies shown in the past year in the U.S. The mean length of all feature length movies shown was 1.80 hours with a standard deviation of 0.15 hours. Suppose the length of a random sample of 20 movies was recorded from all movies released this year. The mean length of the feature length movies was found to be 1.72 hours with a standard deviation of 0.18 hours. 3) In this example, the numerical values of 1.80 hours and 0.15 hours are A) estimates B) statistics C) parameters D) unbiased estimators
Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all male participants in a recent large duathlon was 1.54 hours with a standard deviation of 0.22 hours. The distribution of finish times for males is right-skewed. Suppose that a sample of 30 randomly selected male participants is selected. 4) Is the number 1.54 a statistic or parameter? Explain. Answer: This is a parameter. It is calculated using data from every member of the population. The population it was drawn from is small enough in scope that the population mean can be calculated.
2 Sketch Population Distributions and Find Probabilities Using the Empirical Rule MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants in the 40-44 age group was taken and the mean finishing time was found to be 1.62 hours with a standard deviation of 0.40 hours. 1) Suppose the process of taking random samples of size 30 from the 40-44 age group is repeated 200 times and a histogram of the 200 sample means is created. Which statement best describes the shape of the histogram? A) The histogram will be roughly symmetrical. B) The histogram will be unimodal. C) The histogram will be roughly bell-shaped. D) All of these statements are true. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all male participants in a recent large duathlon was 1.54 hours with a standard deviation of 0.22 hours. The distribution of finish times for males is right-skewed. Suppose that a sample of 30 randomly selected male participants is selected. 2) Suppose that the process of drawing samples of size 30 from the population of all male participants is repeated 100 times. If possible, sketch and describe what the sampling distribution of the means will look like and state the approximate mean value of the distribution. Round to the nearest thousandth. Answer: The distribution will be approximately bell-shaped (normally distributed). The mean will be 1.54 hours (the same as the population mean). 3 Decide What Type of Distribution is Shown in a Histogram Made from the Data in a Sample MULTIPLE CHOICE.
Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants was taken and the mean finishing time was found to be 1.59 hours with a standard deviation of 0.30 hours. 1) Suppose we were to make a histogram of the finishing times of all participants in the duathlon. Would the histogram be a display of the population distribution, the distribution of a sample, or the sampling distribution of means? A) population distribution B) distribution of a sample C) sampling distribution of means Answer: A 2) Suppose the process of taking random samples of size 30 is repeated 200 times and a histogram of the 200 sample means is created. Would the histogram be an approximate display of the population distribution, the distribution of a sample, or the sampling distribution of means? A) population distribution B) distribution of a sample C) sampling distribution of means Answer: C
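The difference between a population distribution and a sampling distribution of means can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the test bank: it models the male finish times with a hypothetical, mildly right-skewed gamma distribution whose mean (1.54 hours) and standard deviation (0.22 hours) match the stated parameters, and it uses more repetitions than the questions do only to stabilize the simulated summaries.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

POP_MEAN, POP_SD, SAMPLE_SIZE = 1.54, 0.22, 30

# Hypothetical population model: gamma distribution matched to the
# stated population mean and standard deviation (mean = shape * scale).
shape = (POP_MEAN / POP_SD) ** 2
scale = POP_SD ** 2 / POP_MEAN


def sample_mean(n: int) -> float:
    """Mean finish time of one random sample of n participants."""
    return mean(random.gammavariate(shape, scale) for _ in range(n))


# Each entry is the mean of one sample of 30; together they approximate
# the sampling distribution of the mean.
sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(2000)]

print(round(mean(sample_means), 2))   # close to the population mean
print(round(stdev(sample_means), 3))  # close to 0.22 / sqrt(30)
```

A histogram of `sample_means` would display the (approximate) sampling distribution of means, whereas a histogram of the individual simulated finish times would display the population model itself.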
Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants in the 40-44 age group was taken and the mean finishing time was found to be 1.62 hours with a standard deviation of 0.40 hours. 3) Suppose we were to make a histogram of the finishing times of the 30 participants in the 40-44 age group. Would the histogram be a display of the population distribution, the distribution of a sample, or the sampling distribution of means? A) population distribution B) distribution of a sample C) sampling distribution of means Answer: B 4) Suppose the process of taking random samples of size 30 from the 40-44 age group is repeated 200 times and a histogram of the 200 sample means is created. Which statement best describes the shape of the histogram? A) The histogram will be roughly symmetrical. B) The histogram will be unimodal. C) The histogram will be roughly bell-shaped. D) All of these statements are true. Answer: D Feature movie lengths (in hours) were measured for all movies shown in the past year in the U.S. The mean length of all feature length movies shown was 1.80 hours with a standard deviation of 0.15 hours. Suppose the length of a random sample of 20 movies was recorded from all movies released this year. The mean length of the feature length movies was found to be 1.72 hours with a standard deviation of 0.18 hours. 5) Suppose we were to make a histogram of the feature length movie times of all movies in the past year. The histogram would be a display of which of the following?
A) population distribution B) distribution of a sample C) sampling distribution of means D) Normal distribution Answer: A 6) Suppose the process of taking random samples of size 20 is repeated 200 times and a histogram of the 200 sample means is created. The histogram would be a display of which of the following? A) population distribution B) distribution of a sample C) sampling distribution of means D) Normal distribution Answer: C 4 Understand and Calculate the Sample Mean and the Standard Error of the Mean MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants was taken and the mean finishing time was found to be 1.59 hours with a standard deviation of 0.30 hours. 1) What is the standard error for the mean finish time of 30 randomly selected participants? Round to the nearest thousandth. A) 0.250 B) 0.300 C) 0.055 D) 0.046 Answer: D
Use the following information to answer the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all participants in a recent large duathlon was 1.67 hours with a standard deviation of 0.25 hours. Suppose a random sample of 30 participants in the 40-44 age group was taken and the mean finishing time was found to be 1.62 hours with a standard deviation of 0.40 hours. 2) What is the standard error for the mean finish time of 30 randomly selected participants in the 40-44 age group? Round to the nearest thousandth. A) 0.046 B) 0.300 C) 0.250 D) 0.055 Answer: A Solve the problem. 3) Choose the statement that best describes what is meant when we say that the sample mean is unbiased when estimating the population mean. A) The sample mean will always equal the population mean. B) On average, the sample mean is the same as the population mean. C) The standard deviation of the sampling distribution (also called the standard error) and the population standard deviation are equal. D) None of these. Answer: B Feature movie lengths (in hours) were measured for all movies shown in the past year in the U.S. The mean length of all feature length movies shown was 1.80 hours with a standard deviation of 0.15 hours. Suppose the length of a random sample of 20 movies was recorded from all movies released this year. The mean length of the feature length movies was found to be 1.72 hours with a standard deviation of 0.18 hours. 4) What is the standard error for the estimated mean feature length movie time of the 20 randomly selected movies? Round to the nearest thousandth. A) 0.356 B) 0.040 C) 0.055 D) 0.034 Answer: D 5) If we create a sampling distribution of sample means, what would be the mean and standard deviation of that distribution given the sample size of 20? A) The mean length would be 1.80 hours with a standard deviation of 0.18 hours.
B) The mean length would be 1.80 hours with a standard deviation of 0.15 hours. C) The mean length would be 1.72 hours with a standard deviation of 0.18 hours. D) The mean length would be 1.80 hours with a standard deviation of 0.034 hours. Answer: D Solve the problem. 6) Choose the statement that best describes what is meant when we say that the sample mean is unbiased when estimating the population mean. A) The sample mean will always equal the population mean. B) On average, the sample mean is the same as the population mean. C) The standard deviation of the sampling distribution (also called the standard error) and the population standard deviation are equal. D) We cannot say that the sample mean is unbiased. Answer: B 7) When the mean of the sampling distribution is the same value as the population parameter, we can say that the statistic is A) a representation of an unbiased parameter. B) the standard error. C) an unbiased estimator. D) unlimited. Answer: C
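The idea behind items 5 to 7 (the sampling distribution of the sample mean is centered at the population mean, with spread σ/√n) can be checked with a small simulation. The sketch below is illustrative only: the exponential population with mean 1.0, the seed, and the function name simulate_sample_means are assumptions chosen for the demonstration and are not part of the test bank.

```python
import math
import random
import statistics

def simulate_sample_means(n, reps, pop_mean=1.0, seed=0):
    """Draw `reps` random samples of size `n` from a hypothetical
    right-skewed (exponential) population and return the sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.expovariate(1.0 / pop_mean) for _ in range(n))
        for _ in range(reps)
    ]

means = simulate_sample_means(n=20, reps=2000)
center = statistics.fmean(means)        # should land close to the population mean
spread = statistics.stdev(means)        # should land close to sigma / sqrt(n)
theoretical_se = 1.0 / math.sqrt(20)    # for an exponential, sigma equals the mean

print(f"mean of sample means: {center:.3f}")
print(f"SD of sample means:   {spread:.3f} (theory: {theoretical_se:.3f})")
```

Any single sample mean can miss the population mean, but the average of many sample means settles near it, which is what "unbiased estimator" refers to.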
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A sprint duathlon consists of a 5 km run, a 20 km bike ride, followed by another 5 km run. The mean finish time of all male participants in a recent large duathlon was 1.54 hours with a standard deviation of 0.22 hours. The distribution of finish times for males is right-skewed. Suppose that a sample of 30 randomly selected male participants is selected. 8) What is the expected finish time for a male participant in the sample of 30? Will the expected mean finish time be the same for any sample of 30 males drawn from the population? Explain. Answer: The sample mean can vary from sample to sample, but the expected value of the sample mean for any sample of 30 drawn from the population is the population mean of 1.54 hours. 9) Calculate the standard error for the mean finish time of 30 randomly selected male participants. Show all your work and round to the nearest thousandth. Answer: SE = σ/√n = 0.22/√30 ≈ 0.22/5.477 ≈ 0.040 Solve the problem. 10) Explain what is meant when we say that the sample mean is an unbiased estimator. Answer: When many samples of size n are taken from a population, the mean of the sampling distribution of sample means is equal to the population mean. Because the sample mean is, on average, equal to the population parameter, it is called an unbiased estimator.
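The standard-error calculations in this objective all follow the same formula, SE = σ/√n. A minimal sketch follows; the function name standard_error is an assumption made for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard error of the sample mean: population SD divided by sqrt(n)."""
    return sigma / math.sqrt(n)

# Values taken from the questions above.
print(f"{standard_error(0.22, 30):.3f}")  # male duathlon finish times
print(f"{standard_error(0.25, 30):.3f}")  # all duathlon participants
print(f"{standard_error(0.15, 20):.3f}")  # feature movie lengths
```

Note that each call uses the population standard deviation given in the question. The sample standard deviations quoted in the multiple-choice items (0.30, 0.40, 0.18) are not used in the keyed answers.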
9.2 The Central Limit Theorem for Sample Means 1 Understand and Use the Central Limit Theorem MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following is not a true statement about the Central Limit Theorem for sample means? A) If the sample size is large, it doesn't matter what the distribution of the population it was drawn from is, the normal distribution can still be used to perform statistical inference. B) If conditions are met, the mean of the sampling distribution is equal to the population mean. C) The Central Limit Theorem helps us find probabilities for sample means when those means are based on a random sample from a population. D) All of these statements are true about the Central Limit Theorem for sample means. Answer: D 2) Suppose that the average pop song length in America is 4 minutes with a standard deviation of 1.25 minutes. It is known that song length is not normally distributed. Suppose a sample of 25 songs is taken from the population. What is the approximate probability that the average song length will be less than 3.5 minutes? Round to the nearest thousandth. A) 0.345 B) 0.023 C) 0.155 D) 0.477 Answer: B 3) Suppose that the average song length in America is 4 minutes with a standard deviation of 1.25 minutes. It is known that song length is not normally distributed. Find the probability that a single randomly selected song from the population will be longer than 4.25 minutes. Round to the nearest thousandth. A) 0.579 B) 0.421 C) 0.079 D) This probability cannot be determined because we do not know the distribution of the population. Answer: D
4) Suppose that the average country song length in America is 4.75 minutes with a standard deviation of 1.10 minutes. It is known that song length is not normally distributed. Suppose a sample of 25 songs is taken from the population. What is the approximate probability that the average song length will last more than 5.25 minutes? Round to the nearest thousandth. A) 0.488 B) 0.012 C) 0.325 D) 0.175 Answer: B 5) Suppose that the average country song length in America is 4.75 minutes with a standard deviation of 1.10 minutes. It is known that song length is not normally distributed. Find the probability that a single randomly selected song from the population will be less than 4.20 minutes. Round to the nearest thousandth. A) 0.006 B) 0.494 C) 0.068 D) This probability cannot be determined because we do not know the distribution of the population. Answer: D Use the empirical rule to solve the problem. 6) Some sources report that the systolic blood pressures of 18-year-old women are Normally distributed with a mean of 120 mmHg and a standard deviation of 12 mmHg. What percentage of 18-year-old women have a systolic blood pressure between 96 mmHg and 144 mmHg? A) 95% B) 99.7% C) 99.99% D) 68% Answer: A 7) At one college, the GPAs are Normally distributed with a mean of 3 and a standard deviation of 0.5. What percentage of students at the college have a GPA between 2.5 and 3.5? A) 68% B) 99.7% C) 84.13% D) 95% Answer: A Solve the problem. 8) Which of the following statements is true about the t-distribution? A) For small sample sizes, the t-distribution has the same properties as the normal curve. B) For large sample sizes, the t-distribution has the same properties as the normal curve. C) Like the Normal distribution, the t-distribution is symmetric for small n. D) Since population standard deviation is usually unknown, the standard error uses the sample standard deviation to estimate population standard deviation. 
Answer: B 9) Which of the following is a true statement about the Central Limit Theorem for sample means? A) If the sample size is large, it doesn't matter what the distribution of the population it was drawn from is, the normal distribution can still be used to perform statistical inference. B) If conditions are met, the mean of the sampling distribution is equal to the population mean. C) The Central Limit Theorem helps us find probabilities for sample means when those means are based on a random sample from a population. D) All of the statements are true about the Central Limit Theorem for sample means. Answer: D Suppose that the mean Systolic blood pressure for adults age 50-54 is 125 mmHg with a standard deviation of 5 mmHg. It is known that Systolic blood pressure is not Normally distributed. Suppose a sample of 25 adult Systolic blood pressure measurements is taken from the population. 10) What is the approximate standard error for the sampling distribution of mean blood pressures? Round to the nearest hundredth. A) 0.20 B) 17.00 C) 5.00 D) 1.00 Answer: D
11) What is the interpretation of the z score used to find the probability that the average Systolic blood pressure will be less than 122 mmHg? Assume technology is not available and so the values must first be converted to standard units. Round to the nearest hundredth. A) z = -3.00 which is less than or equal to 3 standard deviations above the mean. B) z = -3.00 which is less than or equal to 3 standard deviations below the mean. C) z = 3.00 which is less than or equal to 3 standard deviations above the mean. D) This probability cannot be determined because we do not know the distribution of the population. Answer: B 12) What is the approximate probability with interpretation that the average Systolic blood pressure will be less than 122 mmHg, given the above information? Round to the nearest thousandth. A) The probability for z = 3.00 or less is less than or equal to 3 standard deviations which is not a significant result. B) The probability for z = -3.00 or less is less than or equal to 3 standard deviations which is a significant result, using an approximation of 0.5 of (1 – 99.7%) = 0.0015. C) The probability for z = -3.00 or less is less than or equal to 3 standard deviations which is a significant result, using an approximation of 2*(1 – 99.7%) = 0.0060. D) None of these statements are correct. Answer: B Assume that the average Systolic blood pressure for adults age 50-54 is 125 mmHg with a standard deviation of 5 mmHg. It is known that Systolic blood pressure is not normally distributed. A sample of 25 adult Systolic blood pressure measurements is taken from the population. 13) What type of test would be appropriate for this example to see if the sample results are different from the average Systolic blood pressure measurements? I. A one-sample t test because the data are not normally distributed. II. A one-sample z test because the data are not normally distributed. III. A 95% confidence interval using a critical t statistic. 
A) I only B) II only C) III only D) I and III Answer: D 14) What is the approximate z-value with interpretation for the probability that the average Systolic blood pressure will be less than 122 mmHg? Round to the nearest hundredth. A) z = -3.00 which is less than or equal to 3 standard deviations which is not a significant result. B) z = -3.00 which is less than or equal to 3 standard deviations which is a significant result. C) z = 3.00 which is greater than or equal to 3 standard deviations which is a significant result. D) This probability cannot be determined because we do not know the distribution of the population. Answer: B 15) What is the approximate z score with interpretation that the average Systolic blood pressure will be greater than 128 mmHg? Round to the nearest thousandth. A) The probability for z = 3.00 or greater is greater than or equal to 3 standard deviations which is a significant result. B) The probability for z = -3.00 or greater than or equal to 3 standard deviations which is not a significant result. C) The probability for z = -3.00 or less is less than or equal to 3 standard deviations which is a significant result. D) None of these statements are correct. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 16) How might the shapes of a population distribution and a sampling distribution look the same? How might they look different? Answer: A population distribution and sampling distribution might both be normally distributed, but not necessarily. The population distribution histogram can have any distribution, but if the sample size is large, the sampling distribution will always be approximately normal. 17) Describe the center and standard deviation of a sampling distribution. Answer: The mean of this Normal distribution will be the same as the population mean. The standard deviation of this distribution is the standard error. 18) Suppose that a major league baseball game has an average length of 2.9 hours with a standard deviation of 0.5 hours. It is known that game length is not normally distributed. Suppose a random sample of 36 games is taken from the population. Sketch the probability distribution and shade in the region that corresponds to the probability. What is the approximate probability that average game length will be greater than 3.1 hours? Round to the nearest thousandth. Answer: 0.008 19) Suppose that a major league baseball game has an average length of 2.9 hours with a standard deviation of 0.5 hours. It is known that game length is not normally distributed. Suppose a random sample of 36 games is taken from the population. What is the approximate probability that average game length will be greater than 3.15 hours or less than 2.75 hours? Round to the nearest thousandth. Answer: 0.037 20) Compare the normal distribution and the t-distribution. How are they similar? How are they different? Answer: Both distributions are symmetrical and unimodal, but the t-distribution takes on a slightly different shape depending on sample size. 
The t-distribution will have thicker tails for smaller samples, but as sample size increases, the shape will more closely resemble the bell-shaped normal distribution.
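The Central Limit Theorem probabilities in this section (for example items 2, 18, and 19) can be reproduced with the Normal approximation for the sample mean. This is a sketch using Python's standard-library NormalDist; the helper name prob_sample_mean and its lower/upper parameters are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def prob_sample_mean(mu, sigma, n, lower=None, upper=None):
    """Approximate P(lower < sample mean < upper) using the CLT:
    the sample mean is treated as Normal(mu, sigma / sqrt(n))."""
    dist = NormalDist(mu, sigma / math.sqrt(n))
    lo = dist.cdf(lower) if lower is not None else 0.0
    hi = dist.cdf(upper) if upper is not None else 1.0
    return hi - lo

# Item 2: pop songs, P(mean of 25 songs < 3.5 minutes)
p_songs = prob_sample_mean(4, 1.25, 25, upper=3.5)

# Item 18: baseball games, P(mean of 36 games > 3.1 hours)
p_games_over = prob_sample_mean(2.9, 0.5, 36, lower=3.1)

# Item 19: P(mean > 3.15 or mean < 2.75) is the complement of the middle region
p_games_tails = 1 - prob_sample_mean(2.9, 0.5, 36, lower=2.75, upper=3.15)

print(f"{p_songs:.3f} {p_games_over:.3f} {p_games_tails:.3f}")
```

The same approach does not apply to a single observation from a non-Normal population (items 3 and 5), because the CLT describes sample means, not individual values.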
9.3 Answering Questions about the Mean of a Population 1 Find, Interpret, and Use Confidence Levels and Intervals for a Single Population Mean MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Many couples believe that it is getting too expensive to host an "average" wedding in the United States. According to the website www.costofwedding.com, the average cost of a wedding in the U.S. in 2009 was $24,066. Recently, in a random sample of 40 weddings in the U.S. it was found that the average cost of a wedding was $23,224, with a standard deviation of $2,903. On the basis of this, a 95% confidence interval for the mean cost of weddings in the U.S. is $22,296 to $24,152. 1) For this description, which of the following does not describe a condition for a valid confidence interval? A) The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. B) The sample observations are independent because knowledge about the cost of any one wedding tells us nothing about the cost of any other wedding in the sample. C) The sample distribution must be normally distributed in order to have a valid confidence interval. The problem does not describe the distribution of the sample, so this condition is not met. D) The sample size of 40 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied. Answer: C
2) Does the confidence interval provide evidence that the mean cost of a wedding has decreased? A) Yes B) No Answer: B 3) Choose the statement that is the best interpretation of the confidence interval. A) In about 95% of all samples of 40 U.S. weddings, the resulting confidence interval will contain the mean cost of all weddings in the U.S. B) We are extremely confident that the mean cost of a U.S. wedding is between $22,296 and $24,152. C) The probability that a U.S. wedding will cost more than $24,152 is less than 3%. Answer: A Solve the problem. 4) The weights at birth of five randomly chosen baby giraffes were 111, 115, 120, 103, and 106 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby giraffes. Use technology for your calculations. Give the confidence interval in the form "estimate ± margin of error." Round to the nearest tenth of a pound. A) There is not enough information given to calculate the confidence interval. B) 110.0 ± 8.5 pounds C) 111.5 ± 9.0 pounds D) 111.0 ± 8.5 pounds Answer: D 5) Choose the statement that describes a situation where a confidence interval and a hypothesis test will yield the same results. A) When the null hypothesis contains a population parameter that is equal to zero. B) When the alternative hypothesis is two-tailed. C) Both (a) and (b). D) Neither (a) nor (b). The confidence interval cannot yield results that are the same as the hypothesis test. Answer: B Use the following information to answer the question. According to the website www.costofwedding.com, the average cost of flowers for a wedding is $698. Recently, in a random sample of 40 weddings in the U.S. it was found that the average cost of the flowers was $734, with a standard deviation of $102. On the basis of this, a 95% confidence interval for the mean cost of flowers for a wedding is $701 to $767. 
6) For this description, which of the following does not describe a condition for a valid confidence interval? A) The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. B) The sample observations are independent because knowledge about the cost of flowers for any one wedding tells us nothing about the cost of flowers for any other wedding in the sample. C) The sample size of 40 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied. D) All of these describe conditions for a valid confidence interval. Answer: D 7) Does the confidence interval provide evidence that the mean cost of flowers for a wedding has increased? A) Yes B) No Answer: A
8) Choose the statement that is the best interpretation of the confidence interval. A) The probability that the flowers at a wedding will cost more than $698 is greater than 5%. B) In about 95% of all samples of size 40, the resulting confidence interval will contain the mean cost of flowers at weddings. C) We are extremely confident that the mean cost of flowers at a wedding is between $701 and $767. D) The probability that flowers at a wedding will cost less than $767 is nearly 100%. Answer: B Solve the problem. 9) The weights at birth of five randomly chosen baby Orca whales were 425, 454, 380, 405, and 426 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby Orca whales. Use technology for your calculations. Give the confidence interval in the form "estimate ± margin of error." Round to the nearest tenth of a pound. A) There is not enough information given to calculate the confidence interval. B) 384.0 ± 68.0 pounds C) 418.0 ± 34.1 pounds D) 418.0 ± 34.5 pounds Answer: C Many couples believe that it is getting too expensive to host an "average" wedding in the United States. According to a statistics study in the U.S., the average cost of a wedding in the U.S. in 2014 was $25,200. Recently, in a random sample of 35 weddings in the U.S. it was found that the average cost of a wedding was $24,224 with a standard deviation of $2,210. 10) For this description, which of the following does not describe a required condition for a valid confidence interval based on the sample results? A) The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. B) The sample observations are independent because knowledge about the cost of any one wedding tells us nothing about the cost of any other wedding in the sample. 
C) The sample distribution must be normally distributed in order to have a valid confidence interval. The problem does not describe the distribution of the sample, so this condition is not met. D) The sample size of 35 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied. Answer: C 11) Which method should be used for finding a 95% margin of error for the sample mean for this example? A) Find the critical z value for the given level of confidence and multiply by the standard error using the formula z/√n. B) Find the critical t value for the given level of confidence and multiply by the standard error using the formula s/√n. C) Find the critical z value for the given level of confidence and multiply by the standard error using the formula s/√n. D) Find the critical t value for the given level of confidence and multiply by the standard error using the formula σ/√n. Answer: B
12) Which is the best interpretation of a 95% confidence interval for the sample mean? A) The 95% confidence interval means that we are 95% confident that the sample mean is between the low and high interval values. B) The 95% confidence interval means that there is a 95% probability that the population mean is between the low and high interval values. C) The 95% confidence interval means that we are 95% confident that the population mean is between the low and high interval values. D) The 95% confidence interval means that there is a 95% probability that the sample mean is between the low and high interval values. Answer: C 13) If a 95% confidence interval for the mean for the wedding sample is ($23,465, $24,983), does this mean that the sample results are significantly different from the claimed value for the mean of $25,200? A) Since the claimed population mean is outside of the 95% confidence interval, we conclude that the sample results are significantly different. B) Since the claimed population mean is outside of the 95% confidence interval, we conclude that there is a 95% chance that the sample results are significantly different. C) Since the accepted population average is outside of the 95% confidence interval, we conclude that there is a 5% chance that the sample results are significantly different. D) Since the accepted population average is outside of the 95% confidence interval, we conclude that the sample results are not significantly different. Answer: A According to the website www.costofwedding.com, the average cost of flowers for a wedding is $698. Recently, in a random sample of 40 weddings in the U.S. it was found that the average cost of the flowers was $734, with a standard deviation of $102. On the basis of this, a 95% confidence interval for the mean cost of flowers for a wedding is $701 to $767. 14) For this description, which of the following does not describe a condition for a valid confidence interval? 
A) The description states that the sample was randomly selected, so we can assume that the condition which states that the data must represent a random sample is satisfied. B) The sample observations are independent because knowledge about the cost of flowers for any one wedding tells us nothing about the cost of flowers for any other wedding in the sample. C) The sample size of 40 is large enough that knowledge about the population distribution is not necessary and the condition that the population be normally distributed or sample size be larger than 25 is satisfied. D) All of these describe conditions for a valid confidence interval. Answer: D 15) Choose the statement that is the best interpretation of the confidence interval. I. The probability that the flowers at a wedding will cost more than $698 is greater than 5%. II. In about 95% of all samples of size 40, the resulting confidence interval will contain the mean cost of flowers at weddings. III. We are extremely confident that the mean cost of flowers at a wedding is between $701 and $767. A) I only B) II only C) III only D) II and III are both correct. Answer: D
Solve the problem. 16) The weights at birth of five randomly chosen baby Orca whales were 425, 454, 380, 405, and 426 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby Orca whales. Use technology for your calculations. Give the confidence interval in the form "estimate ± margin of error". Round to the nearest tenth of a pound. A) There is not enough information given to calculate the confidence interval. B) 384.0 ± 68.0 pounds C) 418.0 ± 34.1 pounds D) 418.0 ± 34.5 pounds Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. According to the website www.costofwedding.com, the average cost of catering a wedding (with an open bar) is $100 per person. Recently, in a random sample of 40 weddings in the U.S. it was found that the average catering cost was $110, with a standard deviation of $12. On the basis of this, a 95% confidence interval for the mean catering cost for a wedding is $106 to $114. 17) Verify that the conditions for a valid confidence interval are met. Answer: All conditions are met. The sample is randomly selected, the sample observations are independent, and the sample size is large enough that population distribution does not matter. 18) Explain whether the confidence interval provides evidence that the mean catering cost of a wedding has increased. Be specific about the reasoning of using the confidence interval. Answer: The confidence interval provides strong evidence that the catering cost of a wedding has increased because the 95% confidence interval is above $100 and there is no overlap. If the value is not included in the interval, the result is significant. 
19) A popular wedding magazine states in an article on catering costs that "There is a 95% chance that the catering cost of your wedding will be between $106 and $114 per person." Explain what is wrong with this statement and write a better statement that correctly interprets the confidence interval. Answer: A confidence interval does not describe a probability. The correct statement should be similar to the following: "We are 95% confident that the mean catering cost of a wedding will be between $106 and $114 per person." Solve the problem. 20) The weights at birth of five randomly chosen baby hippopotamuses were 75, 99, 107, 82, and 63 pounds. Assume the distribution of weights is normally distributed. Find a 95% confidence interval for the mean weight of all baby hippopotamuses. Use technology for your calculations. Give the confidence interval in the form "estimate ± margin of error". Round to the nearest pound. Answer: 85 ± 22 pounds
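The confidence intervals in this section all have the form estimate ± t* · s/√n. The sketch below reproduces several of them. The function names are assumptions, and the critical values are supplied as constants read from a t table (about 2.776 for 4 degrees of freedom and about 2.0227 for 39 degrees of freedom at 95% confidence) rather than computed, since the standard library has no t-distribution.

```python
import math
import statistics

T_CRIT_95_DF4 = 2.776     # table value, 95% confidence, n = 5
T_CRIT_95_DF39 = 2.0227   # table value, 95% confidence, n = 40

def t_interval(mean, s, n, t_crit):
    """Return (estimate, margin of error) for a one-sample t interval."""
    margin = t_crit * s / math.sqrt(n)
    return mean, margin

def t_interval_from_data(data, t_crit):
    """Same interval, computed from raw observations."""
    return t_interval(statistics.fmean(data), statistics.stdev(data),
                      len(data), t_crit)

giraffes = t_interval_from_data([111, 115, 120, 103, 106], T_CRIT_95_DF4)
orcas = t_interval_from_data([425, 454, 380, 405, 426], T_CRIT_95_DF4)
hippos = t_interval_from_data([75, 99, 107, 82, 63], T_CRIT_95_DF4)
catering = t_interval(110, 12, 40, T_CRIT_95_DF39)

for name, (est, moe) in [("giraffes", giraffes), ("orcas", orcas),
                         ("hippos", hippos), ("catering", catering)]:
    print(f"{name}: {est:.1f} ± {moe:.1f}")
```

statistics.stdev uses the n − 1 divisor, which is the sample standard deviation s that the t interval calls for.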
9.4 Hypothesis Testing for Means 1 Test Hypotheses Concerning a Population Mean MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Suppose a consumer product researcher wanted to find out whether a highlighter lasted longer than the manufacturer's claim that their highlighters could write continuously for 14 hours. The researcher tested 40 highlighters and recorded the number of continuous hours each highlighter wrote before drying up. Test the hypothesis that the highlighters wrote for more than 14 continuous hours. Following are the summary statistics: x̄ = 14.5 hours, s = 1.2 hours Report the test statistic, p-value, your decision regarding the null hypothesis. At the 5% significance level, state your conclusion about the original claim. Round all values to the nearest thousandth. A) z = 9.583; p = 0.000+; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last longer than 14 hours. B) z = 9.583; p = 0.000+; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last longer than 14 hours. C) t = 2.635; p = 0.006; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last longer than 14 hours. D) t = 2.635; p = 0.006; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last longer than 14 hours. Answer: C 2) Suppose a consumer product researcher wanted to find out whether a highlighter lasted less than the manufacturer's claim that their highlighters could write continuously for 14 hours. The researcher tested 40 highlighters and recorded the number of continuous hours each highlighter wrote before drying up. Test the hypothesis that the highlighters wrote for less than 14 continuous hours. 
Following are the summary statistics: x̄ = 13.6 hours, s = 1.3 hours Report the test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth. A) z = 1.946; p = 0.029; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last less than 14 hours. B) z = 1.946; p = 0.974; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last less than 14 hours. C) t = -1.946; p = 0.029; Fail to reject the null hypothesis; there is not strong evidence to suggest that the highlighters last less than 14 hours. D) t = -1.946; p = 0.029; Reject the null hypothesis; there is strong evidence to suggest that the highlighters last less than 14 hours. Answer: D
3) An economist conducted a hypothesis test to test the claim that the average cost of eating a meal at home increased from 2009 to 2010. The average cost of eating a meal at home in 2009 was $5.25 per person per meal. Assume that all conditions for testing have been met. He used technology to complete the hypothesis test. Following is his null and alternative hypothesis and the output from his graphing calculator. H0 : μ = $5.25 Ha : μ > $5.25
At the 5% significance level, choose the statement that contains the correct conclusion regarding the hypothesis and the original claim. A) Reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating at home has increased since 2009. B) Reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating at home has increased since 2009. C) Fail to reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating at home has increased since 2009. D) Fail to reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating at home has increased since 2009. Answer: A
4) An economist conducted a hypothesis test to test the claim that the average cost of eating a meal away from home decreased from 2009 to 2010. The average cost of eating a meal away from home in 2009 was $7.15 per person per meal. Assume that all conditions for testing have been met. He used technology to complete the hypothesis test. Following is his null and alternative hypothesis and the output from his graphing calculator. H0 : μ = $7.15 Ha : μ < $7.15
Choose the statement that contains the correct conclusion regarding the hypothesis and the original claim. A) Reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009. B) Reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009. C) Fail to reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009. D) Fail to reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2009. Answer: D 5) Suppose a consumer product researcher wanted to find out whether a Sharpie lasted longer than the manufacturer's claim that their Sharpies could write continuously for a mean of 14 hours. The researcher tested 40 Sharpies and recorded the number of continuous hours each Sharpie wrote before drying up. Test the hypothesis that Sharpies can write for more than a mean of 14 continuous hours. Following are the summary statistics: x̄ = 14.5 hours, s = 1.2 hours At the 5% significance level, t = 2.635; p = 0.006. State your conclusion about the original claim. A) Do not reject the null hypothesis; there is not strong enough evidence to suggest that the Sharpies last longer than a mean of 14 hours. B) Reject the alternative hypothesis; there is strong evidence to suggest that the Sharpies last longer than a mean of 14 hours. C) Reject the null hypothesis; there is strong evidence to suggest that the Sharpies last longer than a mean of 14 hours. D) There needs to be more data to determine if the Sharpies last longer than a mean of 14 hours. Answer: C
6) An economist conducted a hypothesis test to test the claim that the average cost of eating a meal away from home decreased from 2012 to 2013. The average cost of eating a meal away from home in 2012 was $7.15 per person per meal. Assume that all conditions for testing have been met. He used technology to complete the hypothesis test. Following is his null and alternative hypothesis and the output from his graphing calculator. H0 : μ = $7.15 Ha : μ < $7.15
Choose the statement that contains the correct conclusion regarding the hypothesis and the original claim. A) Reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2012. B) Reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2012. C) Fail to reject the null hypothesis; there is sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2012. D) Fail to reject the null hypothesis; there is not sufficient evidence to support the claim that the average cost of eating away from home has decreased since 2012. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) Suppose a consumer product researcher wanted to find out whether a printer ink cartridge lasted longer than the manufacturer's claim that their ink cartridges could print 400 pages. The researcher tested 40 ink cartridges and recorded the number of pages that were printed before the ink started to fade. Test the hypothesis that the ink cartridges lasted for more than 400 pages. Following are the summary statistics: x = 415 pages, s = 30 pages Report the null and alternative hypothesis, test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth. Answer: H0 : μ = 400 and Ha : μ > 400; t = 3.162; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest the ink cartridges last for more than 400 pages (the claim is supported).
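The test statistics reported for the Sharpie and ink cartridge questions above can be checked by hand with t = (x − μ0) / (s / √n). The following is a minimal sketch using only the Python standard library. The function names are illustrative, and the p-value is approximated by numerically integrating the t density with Simpson's rule, which is an assumption about method and not how a graphing calculator computes it.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 using Simpson's rule on [0, t]."""
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def one_sample_t(xbar, s, n, mu0):
    """Return (t, p) for H0: mu = mu0 against Ha: mu > mu0."""
    t = (xbar - mu0) / (s / math.sqrt(n))
    p = t_upper_tail(t, n - 1) if t >= 0 else 1 - t_upper_tail(-t, n - 1)
    return t, p

# Summary statistics taken from the two questions above.
t_ink, p_ink = one_sample_t(xbar=415, s=30, n=40, mu0=400)
t_pen, p_pen = one_sample_t(xbar=14.5, s=1.2, n=40, mu0=14)

print(f"Ink cartridges: t = {t_ink:.3f}, p = {p_ink:.4f}")
print(f"Sharpies:       t = {t_pen:.3f}, p = {p_pen:.4f}")
```

In both cases the p-value falls below the 5% significance level, which is why the keyed answers reject the null hypothesis.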
8) The quality engineer at a paint manufacturer conducted a hypothesis test to test the claim that the mean volume of paint cans had changed after an adjustment in the manufacturing process. Mean volume in paint cans before the adjustment was 1.02 gallons. Assume that all conditions for testing have been met. She used technology to complete the hypothesis test. Following is the null and alternative hypothesis and the output from her graphing calculator. H0 : μ = 1.02 gallons Ha: μ ≠ 1.02 gallons
Write a statement explaining what her decision regarding the null hypothesis should be and a statement summarizing her conclusion regarding the claim that average volume of paint cans had changed. Has the adjustment in the manufacturing process changed the average volume of paint cans? Answer: Reject the null hypothesis; there is strong evidence to suggest that average volume of paint cans is different than 1.02 gallons. The adjustment in the manufacturing process has affected volume of paint cans. 2 Find and Interpret Confidence Intervals for a Population Mean MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Suppose a random sample of 40 students from a university were asked their ages. The table below gives a summary of the reported ages. Report and interpret a 95% confidence interval for the population mean. Assume the necessary conditions hold.
Age             18   19   20   21   22
# of students    8   10    9    7    6
A) (19.4, 20.3). We are 95% confident that the mean age of university students is between 19.4 and 20.3. B) (18.0, 22.0). We are 95% confident that the mean age of university students is between 18 and 22. C) (19.4, 20.3). We are 95% confident that the mean age of university students in this sample is between 19.4 and 20.3. D) (18.0, 22.0). We are 95% confident that the mean age of university students in this sample is between 18 and 22. Answer: A 2) According to Forbes.com, singer Rihanna’s “Diamond World Tour” concert tickets in 2013 averaged $200 per seat. Suppose a random sample of 30 ticket prices in Los Angeles had a sample mean of $227 with a standard deviation of $35. Construct a 90% confidence interval for the price of Rihanna concert tickets in Los Angeles. Assume the necessary conditions hold. A) ($189.14, $210.86) B) ($216.14, $237.86) C) ($186.93, $213.07) D) ($213.93, $240.07) Answer: B
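Both confidence intervals above follow the pattern x ± t* · s / √n. The sketch below reproduces them with the Python standard library. The critical values t* = 1.699 (90%, 29 degrees of freedom) and t* = 2.023 (95%, 39 degrees of freedom) are assumed to come from a t-table, and the helper name mean_ci is illustrative only.

```python
import math

def mean_ci(xbar, s, n, t_star):
    """Confidence interval for a population mean: xbar +/- t_star * s / sqrt(n)."""
    margin = t_star * s / math.sqrt(n)
    return xbar - margin, xbar + margin

# Question 2: ticket prices, 90% interval (t* for 29 degrees of freedom).
lo, hi = mean_ci(xbar=227, s=35, n=30, t_star=1.699)
print(f"Ticket prices: (${lo:.2f}, ${hi:.2f})")

# Question 1: ages given as a frequency table {age: number of students}.
ages = {18: 8, 19: 10, 20: 9, 21: 7, 22: 6}
n = sum(ages.values())
xbar = sum(age * count for age, count in ages.items()) / n
s = math.sqrt(sum(count * (age - xbar) ** 2 for age, count in ages.items()) / (n - 1))

# 95% interval (t* for 39 degrees of freedom).
age_lo, age_hi = mean_ci(xbar, s, n, t_star=2.023)
print(f"Ages: n = {n}, mean = {xbar:.3f}, s = {s:.3f}, CI = ({age_lo:.1f}, {age_hi:.1f})")
```

The intervals describe the population mean, not the sample, which is what separates option A from option C in Question 1.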
3) Suppose that a class of 250 students each took a random sample of 40 students (with replacement) from their large college and recorded the number of Twitter followers each person in their sample reported. Then, each student used his or her data to calculate a 95% confidence interval for the mean number of Twitter followers for all students at the college. How many of the 250 intervals would you expect to capture the true population mean number of Twitter followers? Explain. A) We would expect 40 intervals to capture the true population mean number of Twitter followers because that’s how many students were in each sample. B) We would expect 3 intervals to capture the true population mean number of Twitter followers because 5% of 250 is 3. C) We would expect about 125 intervals to capture the true population mean number of Twitter followers because only half of the intervals should capture it. D) We would expect about 238 intervals to capture the true population mean number of Twitter followers because 95% of 250 is 238. Answer: D 4) A 95% confidence interval for the mean net worth of the world’s richest people was calculated to be ($58 billion, $122 billion) using the top 5 richest people on the list. Either interpret the interval or explain why it should not be interpreted. A) We are 95% confident that the population mean net worth of the richest people in the world is between $58 billion and $122 billion. B) We are 95% confident that the population mean net worth of all the people in the world is between $58 billion and $122 billion. C) We cannot interpret the interval because a random sample was not taken. D) We cannot interpret the interval because it included the entire population. Answer: C
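The reasoning behind Question 3 (about 95% of 250 intervals, or roughly 238, should capture the true mean) can be illustrated with a small simulation. This is only a sketch: the population mean and standard deviation are hypothetical values chosen for illustration, the population is assumed to be normal for simplicity, and t* = 2.023 is the table value assumed for 39 degrees of freedom.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_MEAN, TRUE_SD = 300.0, 120.0   # hypothetical population of follower counts
SAMPLE_SIZE = 40                    # each student samples 40 people
CLASS_SIZE = 250                    # number of students, hence number of intervals
T_STAR = 2.023                      # assumed t-table value, 95%, df = 39

captured = 0
for _ in range(CLASS_SIZE):
    sample = [random.gauss(TRUE_MEAN, TRUE_SD) for _ in range(SAMPLE_SIZE)]
    xbar = statistics.mean(sample)
    margin = T_STAR * statistics.stdev(sample) / math.sqrt(SAMPLE_SIZE)
    if xbar - margin <= TRUE_MEAN <= xbar + margin:
        captured += 1

expected = 0.95 * CLASS_SIZE
print(f"Expected to capture the true mean: about {expected:.1f} of {CLASS_SIZE}")
print(f"Captured in this simulated class:  {captured} of {CLASS_SIZE}")
```

The simulated count varies from run to run around the expected value; the confidence level describes the long-run capture rate of the method, not any single interval.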
9.5 Comparing Two Population Means 1 Distinguish Between Independent and Paired (Dependent) Samples MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The reading level of a random sample of men and a random sample of women are measured. Researchers want to know whether women typically read at a higher level than men. A) Dependent B) Independent Answer: B 2) The college GPAs of identical twins are compared to see whether the means are different. A) Dependent B) Independent Answer: A 3) The productivity of manufacturing plant workers is compared before and after the installation of air conditioning. A) Dependent B) Independent Answer: A 4) The weight of King Salmon from Lake Michigan and Lake Superior are measured. Researchers want to know whether Lake Michigan King Salmon weigh less than those from Lake Superior. A) Dependent B) Independent Answer: B
5) The reading level of a random sample of men and a random sample of women is measured. Researchers want to know whether women typically read at a higher level than men. The samples A) are dependent B) are independent C) follow a normal distribution D) are not a type that can be determined Answer: B 6) The weight of King Salmon from Lake Michigan and Lake Superior are measured. Researchers want to know whether Lake Michigan King Salmon weigh less than those from Lake Superior. The samples A) are dependent B) are independent C) follow a normal distribution D) are not a type that can be determined Answer: B 7) The productivity of manufacturing plant workers is compared before and after the installation of air conditioning. The samples A) are dependent B) are independent C) follow a normal distribution D) are not a type that can be determined Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 8) State whether the situation has dependent or independent samples. A researcher wants to know if reaction time is affected by body type of the vehicle being driven. He measures the reaction time of 40 drivers while they drive a compact car, then he measures the reaction time while they drive an SUV. Answer: The samples are dependent 9) State whether the situation has dependent or independent samples. A researcher wants to know if reaction time is affected by the gender of the driver. He measures the reaction time of 30 female drivers while they drive a compact car, then he measures the reaction time of 30 male drivers while they drive a compact car. Answer: The samples are independent
2 Test Hypotheses Concerning the Comparison of Two Population Means MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) A researcher wants to know whether athletic women are more flexible than non-athletic women. For this experiment, a woman who exercised vigorously at least four times per week was considered "athletic." Flexibility is measured in inches on a sit & reach box. Test the researcher's claim using the following summary statistics:
Assume that all conditions for testing have been met. Report the test statistic and p-value. At the 1% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) t = 1.623; p = 0.054; Fail to reject the null hypothesis; there is not strong evidence to suggest that athletic women are more flexible than non-athletic women. B) t = -1.623; p = 0.054; Reject the null hypothesis; there is strong evidence to suggest that athletic women are more flexible than non-athletic women. C) t = -1.623; p = 0.108; Reject the null hypothesis; there is strong evidence to suggest that athletic women are more flexible than non-athletic women. D) t = 1.623; p = 0.108; Reject the null hypothesis; there is not strong evidence to suggest that athletic women are more flexible than non-athletic women. Answer: A 2) A researcher wants to know whether athletic men are more flexible than non-athletic men. For this experiment, a man who exercised vigorously at least four times per week was considered "athletic." Flexibility is measured in inches on a sit & reach box. Test the researcher's claim using the following summary statistics:
Assume that all conditions for testing have been met. Report the test statistic and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) t = 3.270; p = 0.001; Fail to reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men. B) t = 3.270; p = 0.001; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men. C) t = -3.270; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men. D) t = -3.270; p = 0.002; Reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men. Answer: B
3) A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classical music. Test the hypothesis that mood rating improved after being exposed to classical music. Following are the mood ratings for the four participants:
Assume that all conditions for testing have been met. Report the null and alternative hypothesis and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.922; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating. B) H0 : μ1 = μ2 , Ha : μ1 ≠ μ2 ; p = 0.008; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating. C) H0 : μdifference = 0, Ha : μdifference < 0; p = 0.008; Reject the null hypothesis; there is strong evidence to suggest that exposure to classical music improved mood rating. D) H0 : μ1 = μ2 , Ha : μ1 < μ2 ; p = 0.077; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improved mood rating. Answer: C 4) A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classic rock music. Test the hypothesis that mood rating decreased after being exposed to classic rock music. Following are the mood ratings for the four participants:
Assume that all conditions for testing have been met. Report the null and alternative hypothesis and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.154; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating. B) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.154; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating. C) H0 : μdifference = 0, Ha : μdifference ≠ 0; p = 0.309; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating. D) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.015; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating. Answer: A
5) A researcher wants to know whether athletic women are more flexible than non-athletic women. For this experiment, a woman who exercised vigorously at least four times per week was considered "athletic." Flexibility is measured in inches on a sit & reach box. A researcher tested his claim using the following summary statistics:
Assume that all conditions for testing have been met. t = 1.626; p = 0.057; At the 1% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. A) Fail to reject the null hypothesis; there is not strong enough evidence to suggest that athletic women are more flexible, on average, than non-athletic women. B) Reject the null hypothesis; there is strong evidence to suggest that athletic women are more flexible, on average, than non-athletic women. C) Reject the null hypothesis; there is strong evidence to suggest that non-athletic women are more flexible, on average, than athletic women. D) Fail to reject the null hypothesis; there is strong evidence to suggest that non-athletic women are more flexible, on average, than athletic women. Answer: A 6) A researcher wants to know whether athletic men are more flexible than non-athletic men. For this experiment, a man who exercised vigorously at least four times per week was considered "athletic." Flexibility is measured in inches on a sit & reach box. Test the researcher's claim using the following summary statistics:
Assume that all conditions for testing have been met. Report the test statistic and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) t = 3.270; p = 0.001; Fail to reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men. B) t = 3.270; p = 0.001; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men. C) t = −3.270; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest that athletic men are more flexible than non-athletic men. D) t = −3.270; p = 0.002; Reject the null hypothesis; there is not strong evidence to suggest that athletic men are more flexible than non-athletic men. Answer: B
7) A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classical music. She is interested in testing whether the mean mood rating improves after an adult is exposed to classical music. Assume that all conditions for testing have been met. The resulting p-value is 0.008. Following are the mood ratings for the four participants:
A) H0 : μdifference = 0, Ha : μdifference > 0; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improves mean mood rating. B) H0 : μ1 = μ2 , Ha : μ1 ≠ μ2 ; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improves mean mood rating. C) H0 : μdifference = 0, Ha : μdifference < 0; Reject the null hypothesis; there is strong evidence to suggest that exposure to classical music improves mean mood rating. D) H0 : μ1 = μ2 , Ha : μ1 < μ2 ; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classical music improves mean mood rating. Answer: C 8) A researcher wants to know if mood is affected by music. She conducts a test on a sample of 4 randomly selected adults and measures mood rating before and after being exposed to classic rock music. Test the hypothesis that mood rating decreased after being exposed to classic rock music. Following are the mood ratings for the four participants:
Assume that all conditions for testing have been met. Report the null and alternative hypothesis and p-value. At the 5% significance level, state your decision regarding the null hypothesis and your conclusion about the original claim. Round all values to the nearest thousandth. A) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.154; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating. B) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.154; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating. C) H0 : μdifference = 0, Ha : μdifference ≠ 0; p = 0.309; Reject the null hypothesis; there is strong evidence to suggest that exposure to classic rock music decreased mood rating. D) H0 : μdifference = 0, Ha : μdifference > 0; p = 0.015; Fail to reject the null hypothesis; there is not strong evidence to suggest that exposure to classic rock music decreased mood rating. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 9) A sociologist wants to know whether there is a difference in the mean number of times that men and women in the U.S. check their Smartphone during the day. Test the hypothesis that the mean number of times that men and women in the U.S. check their Smartphone during the day is different. Following are the summary statistics:
Assume that all conditions for testing have been met. Be sure to report the null and alternative hypothesis, test statistic, p-value, your decision regarding the null hypothesis, and your conclusion about the original claim. Round all values to the nearest thousandth. Answer: H0 : μ1 = μ2 and Ha : μ1 ≠ μ2 ; t = −3.063; p = 0.003; Reject the null hypothesis; there is strong evidence to suggest that there is a difference in the mean number of times that men and women check their Smartphone during the day. 3 Find, Interpret, and Use Confidence Intervals for the Difference of Two Sample Means MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) A student survey was conducted at a major university. A random sample of 99 students reported data on their gender and GPA. Summary statistics for each gender are displayed in the table below:
Gender    N    Mean GPA   StDev     SE Mean
Female    59   3.21       0.51131   0.06657
Male      40   2.96       0.57194   0.09043
A 95% confidence interval for the mean difference in GPA between females and males (GPA F − GPA M) was calculated to be (0.02, 0.47). Interpret the interval in context and explain what the positive values indicate. A) We are 95% confident that the true mean difference in GPA between females and males is between 0.02 and 0.47. The positive values in the interval indicate that females have higher mean GPAs than males. B) We are 95% confident that the true mean difference in GPA between females and males is between 0.02 and 0.47. The positive values in the interval indicate that males have higher mean GPAs than females. C) We are 95% confident that the true mean difference in GPA between females and males is between 0.51 and 0.57. The positive values in the interval indicate that females have higher mean GPAs than males. D) We are 95% confident that the true mean difference in GPA between females and males is between 0.51 and 0.57. The positive values in the interval indicate that males have higher mean GPAs than females. Answer: A
2) A student survey was conducted at a major university. A random sample of 99 students reported data on their gender and GPA. Summary statistics for each gender are displayed in the table below:
Gender    N    Mean GPA   StDev     SE Mean
Female    59   3.21       0.51131   0.06657
Male      40   2.96       0.57194   0.09043
A 95% confidence interval for the mean difference in GPA between females and males (GPA F − GPA M) was calculated to be (0.02, 0.47). Does the interval capture 0? Explain what this means. A) The interval does contain 0. This means there is a statistically significant difference in the mean GPAs of males and females at the university. B) The interval does contain 0. This means there is no statistically significant difference in the mean GPAs of males and females at the university. C) The interval does not contain 0. This means there is a statistically significant difference in the mean GPAs of males and females at the university. D) The interval does not contain 0. This means there is no statistically significant difference in the mean GPAs of males and females at the university. Answer: C 3) Flight delay data for July 2014 was collected for all airlines at all US airports. Summary statistics for two major airlines, American and Delta, are provided in the table below. The data are based on random samples from all Delta and American flights in July. Sample sizes are indicated in the table.
Airline    N     Mean # Delayed Flights   StDev      SE Mean
American   84    146.5                    377.0219   41.1365
Delta      137   70.3                     217.6505   18.5951
A two-tailed hypothesis test was performed to determine if there was a difference in the mean number of delayed flights between the two airlines (μAmerican − μDelta). Using a significance level of 0.05, the null hypothesis could not be rejected (p-value = 0.0942). If you found a 95% confidence interval for the difference between the means, would it capture 0? Explain. A) The interval would capture 0 since we cannot conclude the means are different. It is possible they are the same. B) The interval would not capture 0 because we reject the hypothesis that the mean # of delays are the same. C) The interval would capture 0 because we could reject the hypothesis that the mean # of delays are the same. D) The interval would not capture 0 since we cannot conclude the means are different. It is possible they are the same. Answer: A 4) Flight delay data for July 2014 was collected for all airlines at all US airports. Summary statistics for two major airlines, American and Delta, are provided in the table below. The data are based on random samples from all Delta and American flights in July. Sample sizes are indicated in the table.
Airline    N     Mean # Delayed Flights   StDev      SE Mean
American   84    146.5                    377.0219   41.1365
Delta      137   70.3                     217.6505   18.5951
Using the provided information, calculate a 95% confidence interval for the difference in population means. A) (1.33, 151.02) B) (-7.83, 160.17) C) (217.65, 377.02) D) (-13.23, 165.57) Answer: D
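The interval in Question 4 can be approximated from the table with the unpooled (Welch) standard error, SE = √(SE1² + SE2²). The sketch below is a hedged illustration, not the publisher's worked solution: the helper name welch_ci is made up for this example, and t* = 1.980 is an assumed table value for roughly 117 degrees of freedom. Because the table means are rounded, the result matches option D only to about one decimal place.

```python
import math

def welch_ci(mean1, se1, n1, mean2, se2, n2, t_star):
    """CI for mean1 - mean2 using the unpooled standard error.

    Returns (lower, upper, se_diff, df), where df is the
    Welch-Satterthwaite approximation to the degrees of freedom.
    """
    diff = mean1 - mean2
    se_diff = math.sqrt(se1 ** 2 + se2 ** 2)
    df = se_diff ** 4 / (se1 ** 4 / (n1 - 1) + se2 ** 4 / (n2 - 1))
    margin = t_star * se_diff
    return diff - margin, diff + margin, se_diff, df

# American vs. Delta, values taken from the summary table above.
lower, upper, se_diff, df = welch_ci(
    mean1=146.5, se1=41.1365, n1=84,
    mean2=70.3, se2=18.5951, n2=137,
    t_star=1.980,  # assumed t-table value for about 117 degrees of freedom
)

print(f"SE of difference = {se_diff:.3f}, approximate df = {df:.1f}")
print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f})")
```

The interval contains 0, which agrees with the failure to reject the null hypothesis in Question 3.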
9.6 Overview of Analyzing Means 1 Compare Hypothesis Tests and Confidence Intervals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Choose the statement that describes a situation where a confidence interval and a hypothesis test would yield the same results. I. When the null hypothesis contains a population parameter that is equal to zero. II. When the alternative hypothesis is two-tailed. A) I only B) II only C) I and II D) Neither I nor II. The confidence interval cannot yield results that are the same as the hypothesis test. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 2) Describe the circumstances under which a confidence interval and hypothesis test yield the same results. Answer: When the alternative hypothesis is two-tailed, the confidence interval and hypothesis test will yield the same results.
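The agreement between a two-tailed test at significance level α and a (1 − α) confidence interval can be seen directly: |t| > t* holds exactly when the null value falls outside estimate ± t* · SE. The sketch below checks this on the GPA and flight-delay summaries from Section 9.5. The function name and the critical values (1.980 and 1.991, assumed t-table values for about 117 and 77 degrees of freedom) are illustrative assumptions.

```python
import math

def two_sided_decisions(diff, se, t_star, null_value=0.0):
    """Return (test rejects H0, CI excludes the null value) for a two-tailed test."""
    t = (diff - null_value) / se
    rejects = abs(t) > t_star
    lower, upper = diff - t_star * se, diff + t_star * se
    ci_excludes_null = not (lower <= null_value <= upper)
    return rejects, ci_excludes_null

# Flight delays: American minus Delta (p-value reported as 0.0942).
flights = two_sided_decisions(
    diff=146.5 - 70.3,
    se=math.hypot(41.1365, 18.5951),  # sqrt(SE1^2 + SE2^2)
    t_star=1.980,
)

# GPA: female minus male (reported interval (0.02, 0.47)).
gpa = two_sided_decisions(
    diff=3.21 - 2.96,
    se=math.hypot(0.06657, 0.09043),
    t_star=1.991,
)

print("Flights: rejects H0 =", flights[0], "| CI excludes 0 =", flights[1])
print("GPA:     rejects H0 =", gpa[0], "| CI excludes 0 =", gpa[1])
```

In each case the two decisions match, because both come from comparing the same distance |estimate − null value| with the same margin t* · SE. A one-tailed test uses a different critical value, so the match is not guaranteed there.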
Ch. 9 Inferring Population Means Answer Key 9.1 Sample Means of Random Samples 1 Distinguish Between Parameters and Statistics 1) C 2) B 3) C 4) This is a parameter. It is calculated using data from every member of the population. The population it was drawn from is small enough in scope that the population mean can be calculated. 2 Sketch Population Distributions and Find Probabilities Using the Empirical Rule 1) D 2) The distribution will be approximately bell-shaped (normally distributed). The mean will be 1.54 hours (the same as the population mean). 3 Decide What Type of Distribution is Shown in a Histogram Made from the Data in a Sample 1) A 2) C 3) B 4) D 5) A 6) C 4 Understand and Calculate the Sample Mean and the Standard Error of the Mean 1) D 2) A 3) B 4) D 5) D 6) B 7) C 8) The sample mean can vary from sample to sample, but the expected value of any sample of 30 drawn from the population will typically be the same as the population mean of 1.54 hours. 9) SE = σ/√n ≈ 0.22/5.477 ≈ 0.040 10) When many samples of size n are taken from a population, the mean of the sampling distribution of sample means is equal to the population mean. Since sample mean is an accurate estimate of the population parameter, it is called an unbiased estimator.
9.2 The Central Limit Theorem for Sample Means 1 Understand and Use the Central Limit Theorem 1) D 2) B 3) D 4) B 5) D 6) A 7) A 8) B 9) D 10) D 11) B 12) B 13) D 14) B 15) A
16) A population distribution and sampling distribution might both be normally distributed, but not necessarily. The population distribution histogram can have any distribution, but if the sample size is large, the sampling distribution will always be approximately normal. 17) The mean of this Normal distribution will be the same as the population mean. The standard deviation of this distribution is the standard error. 18) 0.008 19) 0.037 20) Both distributions are symmetrical and unimodal, but the t-distribution takes on a slightly different shape depending on sample size. The t-distribution will have thicker tails for smaller samples, but as sample size increases, the shape will more closely resemble the bell-shaped normal distribution.
9.3 Answering Questions about the Mean of a Population 1 Find, Interpret, and Use Confidence Levels and Intervals for a Single Population Mean 1) C 2) B 3) A 4) D 5) B 6) D 7) A 8) B 9) C 10) C 11) B 12) C 13) A 14) D 15) D 16) C 17) All conditions are met. The sample is randomly selected, the sample observations are independent, and the sample size is large enough that population distribution does not matter. 18) The confidence interval provides strong evidence that the catering cost of a wedding has increased because the 95% confidence interval is above $100 and there is no overlap. If the value is not included in the interval, the result is significant. 19) A confidence interval does not describe a probability. The correct statement should be similar to the following: "We are 95% confident that the mean catering cost of a wedding will be between $106 and $114 per person." 20) 85 ± 22 pounds
9.4 Hypothesis Testing for Means 1 Test Hypotheses Concerning a Population Mean 1) C 2) D 3) A 4) D 5) C 6) D 7) H0 : μ = 400 and Ha: μ > 400; t = 3.162; p = 0.002; Reject the null hypothesis; there is strong evidence to suggest the ink cartridges last for more than 400 pages (the claim is supported). 8) Reject the null hypothesis; there is strong evidence to suggest that average volume of paint cans is different than 1.02 gallons. The adjustment in the manufacturing process has affected volume of paint cans. 2 Find and Interpret Confidence Intervals for a Population Mean 1) A 2) B 3) D 4) C
9.5 Comparing Two Population Means 1 Distinguish Between Independent and Paired (Dependent) Samples 1) B 2) A 3) A 4) B 5) B 6) B 7) A 8) The samples are dependent 9) The samples are independent 2 Test Hypotheses Concerning the Comparison of Two Population Means 1) A 2) B 3) C 4) A 5) A 6) B 7) C 8) A 9) H0 : μ1 = μ2 and Ha: μ1 ≠ μ2 ; t = −3.063; p = 0.003; Reject the null hypothesis; there is strong evidence to suggest that there is a difference in the mean number of times that men and women check their Smartphone during the day. 3 Find, Interpret, and Use Confidence Intervals for the Difference of Two Sample Means 1) A 2) C 3) A 4) D
9.6 Overview of Analyzing Means 1 Compare Hypothesis Tests and Confidence Intervals 1) B 2) When the alternative hypothesis is two-tailed, the confidence interval and hypothesis test will yield the same results.
Ch. 10 Associations between Categorical Variables 10.1 The Basic Ingredients for Testing with Categorical Variables 1 Classify Tests as Using Categorical or Numerical Data MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Are hypothesis tests of proportions used for categorical or numerical data? Explain. A) Hypothesis tests of proportions are used for categorical data because proportions come from counts of categories. B) Hypothesis tests of proportions are used for categorical data because proportions are numbers. C) Hypothesis tests of proportions are used for numerical data because proportions are numbers. D) Hypothesis tests of proportions are used for numerical data because proportions come from counts of categories. Answer: A 2) Are hypothesis tests of means used for categorical or numerical data? Explain. A) Hypothesis tests of means are used for categorical data because means come from counts of categories. B) Hypothesis tests of means are used for categorical data because means are numbers. C) Hypothesis tests of means are used for numerical data because means are numbers. D) Hypothesis tests of means are used for numerical data because means come from counts of categories. Answer: C 3) What kind of data are needed to perform chi-square tests? A) Numerical data. B) Categorical data. C) Either numerical or categorical data. D) Neither numerical nor categorical data. Answer: B 4) Numerical data can be used for which of the following types of hypothesis tests? A) Hypothesis tests of proportions. B) Chi-square hypothesis tests. C) Hypothesis tests of means. D) Numerical data can be used for any type of hypothesis test. Answer: C 2 Construct Two-Way Tables MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following table to answer the question. 
The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of flying than males.
1) How many categorical variables are summarized in the table? A) Four B) Three C) Two D) Zero Answer: C
2) What fraction represents the proportion of people in the study who did not express a fear of flying? A) 156/310 B) 154/310 C) 100/310 D) 210/310 Answer: A Use the following table to answer the question. The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of heights than males.
3) How many categorical variables are summarized in the table? A) Two B) Three C) Four D) Zero Answer: A
4) What fraction represents the proportion of people in the study who expressed a fear of heights? A) 183/360 B) 177/360 C) 162/360 D) 198/360 Answer: B The following table summarizes the results of a study to determine if job satisfaction is related to pay category.
5) How many categorical variables are summarized in the table? A) One B) Two C) Three D) Four Answer: B
6) What fraction of the people in the study were satisfied with their jobs? A) 381/1310 B) 194/1310 C) 605/1310 D) 350/1310 Answer: C
7) What fraction of salaried employees are satisfied with their jobs? A) 200/605 B) 200/350 C) 605/1310 D) 200/1310 Answer: B The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of heights than males.
8) What fraction represents the proportion of people in the study who expressed a fear of heights? A) 183/360 B) 177/360 C) 162/360 D) 198/360 Answer: B
9) How many categorical variables are summarized in the table? A) Two B) Three C) Four D) Zero Answer: A
10) What fraction of women expressed a fear of heights? A) 109/360 B) 109/198 C) 177/360 D) 109/177 Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. The following table summarizes the results of a study to determine if job satisfaction is related to pay category.
11) Describe the categorical variables that are summarized in the table. Describe an association that could be tested based on the information from the table. Answer: Pay category and job satisfaction are the categorical variables. One association that could be tested is whether there is an association between pay category and job satisfaction.
12) Calculate the percentage of people in the study who were dissatisfied with their job. Round your answer to the nearest tenth of a percent. Answer: 194/1310 ≈ 0.148, or 14.8%
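The proportions asked for in this objective can be checked with a few lines of code. The sketch below is in Python and rests on an assumption: the fear-of-heights table is not reproduced in this copy, so its four cells are reconstructed from counts that appear in the answer choices (109 of 198 women and 68 of 162 men expressed a fear of heights). The dictionary layout and the function names are illustrative only.

```python
from fractions import Fraction

# Assumed reconstruction of the fear-of-heights two-way table.
# Rows are the response, columns are sex; counts inferred from answer choices.
table = {
    "fear":    {"women": 109, "men": 68},
    "no fear": {"women": 89,  "men": 94},
}

def grand_total(t):
    """Total number of people in the study."""
    return sum(sum(row.values()) for row in t.values())

def row_total(t, row):
    """Total for one response category."""
    return sum(t[row].values())

def column_total(t, col):
    """Total for one sex."""
    return sum(row[col] for row in t.values())

n = grand_total(table)

# Marginal proportion: everyone who expressed a fear, out of all participants.
p_fear = Fraction(row_total(table, "fear"), n)

# Conditional proportion: women who expressed a fear, out of all women.
p_fear_given_woman = Fraction(table["fear"]["women"], column_total(table, "women"))

print(n, float(p_fear), float(p_fear_given_woman))
```

The distinction that matters in questions 8 and 10 is the denominator: the whole study (360) for a marginal proportion, and only the women (198) for a conditional one.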
3 Classify Variables as Numeric or Categorical MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) The table below summarizes the outcomes of a study about the heights of superheroes in two major franchises, DC Comics and Marvel Comics. Identify the variable or variables and classify as numerical or categorical.
                    DC Comics    Marvel Comics
Mean Height (cm)    180.9        190.5
A) Height (numerical). B) Height (categorical). C) Height (numerical) and Franchise (categorical). D) Height (categorical) and Franchise (numerical).
Answer: C 2) The table below summarizes the outcomes of a study about the heights of superheroes in two major franchises, DC Comics and Marvel Comics. Identify the numerical variable and determine whether it is continuous or discrete.
                    DC Comics    Marvel Comics
Mean Height (cm)    180.9        190.5
A) Height; continuous. B) Height; discrete. C) Franchise; continuous. D) Franchise; discrete.
Answer: A
3) The table below summarizes the outcomes of a study about the moral alignment of superheroes in two major franchises, DC Comics and Marvel Comics. Comic book enthusiasts want to determine whether one franchise is more likely to have bad superheroes than the other. Identify the categorical variable(s). Assume the data are based on a random sample of characters.
           DC Comics    Marvel Comics
Good       142          259
Bad        59           115
Neutral    13           11
A) Franchise. B) Moral Alignment. C) Franchise and Moral Alignment. D) All of the variables are numerical.
Answer: C 4) The table below summarizes the outcomes of a study about superheroes in two major franchises, DC Comics and Marvel Comics. Comic book enthusiasts want to determine whether one franchise is more likely to have bad superheroes than the other. Determine whether Moral Alignment is categorical or numerical; if numerical, whether continuous or discrete. Assume the data are based on a random sample of characters.
           DC Comics    Marvel Comics
Good       142          259
Bad        59           115
Neutral    13           11
A) Numerical; continuous. B) Numerical; discrete. C) Categorical.
Answer: C 4 Find Expected Values MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following table to answer the question. The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of flying than males.
1) Find the expected number of women who should express a fear of flying, if the variables are independent. Round to the nearest whole number. A) 81 B) 104 C) 50 D) 11 Answer: B Use the following table to answer the question. The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of heights than males.
2) Find the expected number of men who should express a fear of heights, if the variables are independent. Round to the nearest whole number. A) 97 B) 80 C) 87 D) 93 Answer: B
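The expected counts in these questions follow the usual rule for independence: multiply the row total by the column total and divide by the grand total. A minimal Python sketch is given below. The tables are missing from this copy, so the totals used here (162 men, 177 people who expressed a fear of heights, 360 participants; 350 salaried employees, 605 satisfied employees, 1310 employees) are assumptions inferred from answer choices elsewhere in this section, and `expected_count` is a hypothetical helper name.

```python
def expected_count(row_total, column_total, grand_total):
    """Expected cell count if the two categorical variables are independent."""
    return row_total * column_total / grand_total

# Fear-of-heights study (totals are assumptions inferred from answer choices):
# 162 men, 177 people expressing a fear of heights, 360 participants.
men_fear = expected_count(162, 177, 360)

# Job-satisfaction study (totals are assumptions inferred from answer choices):
# 350 salaried employees, 605 satisfied employees, 1310 employees in all.
salaried_satisfied = expected_count(350, 605, 1310)

print(round(men_fear), round(salaried_satisfied, 2))
```

Expected counts are usually not whole numbers, which is why the questions specify how to round.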
Use the following information to answer the question. Lambda olive oil is touted as the "World's most expensive olive oil." A twelve-ounce bottle typically costs fifty dollars or more. In a blind taste test, a group of food experts tasted three premium olive oils, one of which was Lambda olive oil. When asked to pick the Lambda olive oil, 84 got it right and 87 got it wrong.
3) If this group were just guessing, how many people (out of 171) would be expected to guess correctly? A) 86 B) 57 C) 114 D) Not enough information given to calculate expected value. Answer: B
Use the following information to answer the question. Bellei Extravecchio Balsamico Tradizionale is a very expensive brand of balsamic vinegar. A twelve-ounce bottle typically costs two hundred dollars or more. In a blind taste test, a group of food experts tasted three premium balsamic vinegars, with the most expensive one being Bellei Extravecchio Balsamico Tradizionale. When asked to pick the most expensive balsamic vinegar, 92 got it right and 76 got it wrong.
4) If this group were just guessing, how many people (out of 168) would be expected to guess correctly? A) 84 B) 70 C) 56 D) Not enough information given to calculate expected value. Answer: C The following table summarizes the results of a study to determine if job satisfaction is related to pay category.
5) The proportion of hourly employees is about 0.733. If the variables are independent, what is the expected number of hourly employees who are dissatisfied with their jobs? A) 0.733(1310) B) 0.733(174) C) 0.733(194) D) 0.733(20) Answer: C The following table summarizes the outcomes of a study that researchers carried out to determine if females expressed a greater fear of heights than males.
6) The proportion of those who expressed a fear of heights is about 0.492. If the variables are independent, what is the expected number of men who expressed a fear of heights? A) 0.492(360) B) 0.492(162) C) 0.492(68) D) 0.492(94) Answer: B Solve the problem.
7) Package design is an important marketing tool, especially when marketing to children. The same food was presented to children using three different packaging designs. Each child was asked to select his or her favorite. In a taste test among 243 children, 55 selected package A as tasting best, 112 selected package B as tasting best, and 76 selected package C as tasting best. If there is no difference in the products in the 3 package designs, how many children out of the 243 would you expect to pick package B as tasting best? A) 112 B) 61 C) 121 D) 81 Answer: D
8) Package design is an important marketing tool, especially when marketing to children. The same food was presented to children using three different packaging designs. Each child was asked to select his or her favorite. In a taste test among 267 children, 68 selected package A as tasting best, 105 selected package B as tasting best, and 94 selected package C as tasting best. If there is no difference in the products in the 3 package designs, how many children out of the 267 would you expect to pick package B as tasting best? A) 105 B) 128 C) 89 D) 133 Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. The following table summarizes the results of a study to determine if job satisfaction is related to pay category.
9) Find the expected number of salaried workers who should be satisfied with their jobs if the variables are independent. Round to the nearest hundredth. Answer: 161.64
Almas caviar is a very expensive Iranian brand of Beluga sturgeon caviar. A one-pound gold-plated tin typically costs $12,000 or more. In a blind taste test, a group of food experts tasted four premium Beluga sturgeon caviars, with the most expensive one being Almas caviar. When asked to pick the most expensive caviar, 141 got it right and 79 got it wrong.
10) If this group were just guessing, how many people (out of 220) would be expected to guess correctly? Answer: 55
5 Calculate Chi-Square Statistics MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Lambda olive oil is touted as the "World's most expensive olive oil." A twelve-ounce bottle typically costs fifty dollars or more. In a blind taste test, a group of food experts tasted three premium olive oils, one of which was Lambda olive oil. When asked to pick the Lambda olive oil, 84 got it right and 87 got it wrong.
1) Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. A) 6.39 B) 12.79 C) 23.68 D) 19.18 Answer: D
Use the following information to answer the question. Bellei Extravecchio Balsamico Tradizionale is a very expensive brand of balsamic vinegar. A twelve-ounce bottle typically costs two hundred dollars or more. In a blind taste test, a group of food experts tasted three premium balsamic vinegars, with the most expensive one being Bellei Extravecchio Balsamico Tradizionale. When asked to pick the most expensive balsamic vinegar, 92 got it right and 76 got it wrong.
2) Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. A) 11.57 B) 23.14 C) 41.21 D) 34.71 Answer: D
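The two taste-test statistics above can be reproduced by summing (observed − expected)² / expected over the "right" and "wrong" categories, with expected counts taken from pure guessing among three options. The Python sketch below shows one way to do this; the function names `chi_square_statistic` and `guessing_expected` are hypothetical, chosen only for illustration.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def guessing_expected(n, options):
    """Expected [right, wrong] counts if n tasters guess among `options` items."""
    right = n / options
    return [right, n - right]

# Olive oil: 84 right and 87 wrong out of 171 tasters, three oils.
olive_oil = chi_square_statistic([84, 87], guessing_expected(171, 3))

# Balsamic vinegar: 92 right and 76 wrong out of 168 tasters, three vinegars.
vinegar = chi_square_statistic([92, 76], guessing_expected(168, 3))

print(round(olive_oil, 2), round(vinegar, 2))
```

Both categories contribute to the statistic; using only the "right" category gives 12.79 and 23.14, which appear among the distractors.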
Solve the problem. 3) Typically the percentage of women in an engineering class at a particular university has been 15%. A new recruiting plan was implemented in the past year to increase the proportion of women in the incoming class. This year, there were 49 women and 208 men in the incoming class. The data are summarized below.
Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. A) 3.33 B) 2.75 C) 218.41 D) 98.37 Answer: A 4) Typically the percentage of women in an engineering class at a particular university has been 18%. A new recruiting plan was implemented in the past year to increase the proportion of women in the incoming class. This year, there were 57 women and 201 men in the incoming class. The data are summarized below.
Calculate the observed value of the chi-square statistic. Round to the nearest hundredth. A) 223.03 B) 2.51 C) 0.86 D) 2.93 Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Almas caviar is a very expensive Iranian brand of Beluga sturgeon caviar. A one-pound gold-plated tin typically costs $12,000 or more. In a blind taste test, a group of food experts tasted four premium Beluga sturgeon caviars, with the most expensive one being Almas caviar. When asked to pick the most expensive caviar, 141 got it right and 79 got it wrong.
5) Calculate the observed value of the chi-square statistic to test whether the proportion of people selecting the correct brand as most expensive differs from chance. Show all your work. Round to the nearest hundredth. Answer: χ² = 179.30
Solve the problem. 6) Compare the sampling distributions of the normally distributed test statistic z and the chi-square test statistic. Specifically, how do the sampling distributions differ in shape and range of values? Answer: The sampling distribution of z is bell-shaped and values can be any real number. The sampling distribution of χ² is usually not symmetric and is often right-skewed. The values of χ² cannot be negative.
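One way to see why χ² can never be negative while z can take either sign: with only two categories, the chi-square statistic equals the square of the one-proportion z statistic. The Python sketch below checks this for the engineering-class data (49 women out of 257, hypothesized proportion 0.15) and also recomputes the caviar statistic. The link to the z-test is added here for illustration and is not part of the original answer key; the function names are hypothetical.

```python
import math

def chi_square_two_categories(successes, n, p0):
    """Chi-square statistic for a two-category goodness-of-fit test."""
    observed = [successes, n - successes]
    expected = [n * p0, n * (1 - p0)]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def one_proportion_z(successes, n, p0):
    """z statistic for a one-proportion test with null value p0."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Engineering class: 49 women out of 257 students, historical proportion 15%.
chi2_engineering = chi_square_two_categories(49, 257, 0.15)
z_engineering = one_proportion_z(49, 257, 0.15)

# Caviar: 141 correct out of 220 tasters, chance proportion 1/4.
chi2_caviar = chi_square_two_categories(141, 220, 0.25)

print(round(chi2_engineering, 2), round(z_engineering ** 2, 2), round(chi2_caviar, 2))
```

Because squaring discards the sign of z, the chi-square statistic measures only how far the counts are from what was expected, not in which direction.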
10.2 The Chi-Square Test for Goodness of Fit 1 Identify the Properties of Chi-Square Tests for Goodness of Fit MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A dowsing rod is a "Y" or "L" shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.
1) Of the following statements, which one is not true about the chi-square statistic and p-value? Choose (D) if all statements are true. A) The larger the chi-square statistic, the smaller the p-value. B) Under the assumption that the null is true, the p-value is the probability that the chi-square statistic will be as big as or bigger than the observed value. C) On the chi-square distribution, the p-value is represented by the area under the curve to the right of the chi-square statistic. D) All of these statements are true. Answer: D
Solve the problem. 2) Of the following statements, which one is true about the chi-square statistic and p-value? A) The p-value for the chi-square distribution is the area under the curve to the left of the chi-square statistic. B) Under the assumption that the null hypothesis is true, the p-value is the probability that the chi-square statistic will be as small as or smaller than the observed value. C) The larger the chi-square statistic, the smaller the p-value. D) All of these statements are true. Answer: C
3) Of the following statements, which one is true about the chi-square statistic and p-value? A) The p-value for the chi-square distribution is the area under the curve to the right of the observed chi-square statistic. B) Under the assumption that the null hypothesis is true, the p-value is the probability that the chi-square statistic will be as small as or smaller than the observed value. C) The larger the chi-square statistic, the larger the p-value. D) All of these statements are true. Answer: A
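The relationship in these three questions (the p-value is the right-tail area, so a larger statistic gives a smaller p-value) can be checked numerically. The Python sketch below uses only the standard library and, as a stated limitation, handles just 1 and 2 degrees of freedom, where the right-tail area has a simple closed form. The function name `chi_square_p_value` is hypothetical; statistical software would normally be used for other degrees of freedom.

```python
import math

def chi_square_p_value(statistic, df):
    """Right-tail area of the chi-square distribution (df = 1 or 2 only)."""
    if statistic < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    if df == 1:
        # A chi-square variable with 1 df is the square of a standard normal,
        # so P(chi2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # With 2 df the chi-square distribution is exponential with mean 2.
        return math.exp(-statistic / 2)
    raise NotImplementedError("this sketch only covers df = 1 and df = 2")

# The p-value shrinks as the statistic grows.
for stat in (0.5, 3.84, 10.0):
    print(stat, chi_square_p_value(stat, 1))
```

A statistic of zero gives a p-value of 1, since every possible value of the statistic is at least that large.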
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 4) Suppose a goodness-of-fit test is used to test the claim that obesity rates in the elderly have changed since the time of the Egyptian mummies. The p-value is calculated to be 0.00023. Describe the value of the chi-square test statistic (is it likely to be large? small?) and the decision regarding the null hypothesis that there is no difference in obesity rates for mummies and modern-day elderly people. Answer: The chi-square statistic will be large. The p-value supports a decision to reject the null hypothesis.
2 Perform Chi-Square Tests for Goodness of Fit MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A dowsing rod is a "Y" or "L" shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.
1) Choose the correct null and alternative hypothesis. A) H0 : The dowsing rods correctly identify the location of ground water 50% of the time. Ha : The dowsing rods correctly identify the location of ground water more than 50% of the time. B) H0 : The dowsing rods correctly identify the location of ground water 20% of the time. Ha : The dowsing rods correctly identify the location of ground water more than 20% of the time. C) H0 : The dowsing rods work better at locating ground water than guessing. Ha : The dowsing rods work no better at locating ground water than guessing. D) H0 : The dowsing rods work no better at locating ground water than guessing. Ha : The dowsing rods work better at locating ground water than guessing. Answer: D
Use the following information to answer the question. A dowsing rod is a "Y" or "L" shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.
2) Choose the correct null and alternative hypothesis. A) H0 : The dowsing rods correctly identify the location of ground water 50% of the time. Ha : The dowsing rods correctly identify the location of ground water more than 50% of the time. B) H0 : The dowsing rods work no better at locating ground water than guessing. Ha : The dowsing rods work better at locating ground water than guessing. C) H0 : The dowsing rods correctly identify the location of ground water 20% of the time. Ha : The dowsing rods correctly identify the location of ground water more than 20% of the time. D) H0 : The dowsing rods work better at locating ground water than guessing. Ha : The dowsing rods work no better at locating ground water than guessing. Answer: B
Use the following information to answer the question. A dowsing rod is a "Y" or "L" shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.
3) Test the hypothesis that the dowsing rods worked better at locating ground water than guessing. Using a goodness-of-fit test and a 0.05 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement. A) Fail to reject H0 ; There is not enough evidence to conclude that the dowsing rods worked better than guessing. B) Reject H0 ; There is not enough evidence to conclude that the dowsing rods worked better than guessing. C) Fail to reject H0 ; There is enough evidence to conclude that the dowsing rods worked better than guessing. D) Reject H0 ; There is enough evidence to conclude that the dowsing rods worked better than guessing. Answer: A
Use the following information to answer the question. A dowsing rod is a "Y" or "L" shaped instrument that some believe can find ground water. Many dowsers today use a pair of simple L-shaped metal rods. One rod is held in each hand, with the short arm of the L held upright, and the long arm pointing forward. When something is found, the rods cross over one another making an "X" over the found object. Skeptics of dowsing conducted an experiment to see if dowsing rods could find ground water. Five identical 3 foot by 3 foot plots of land were sectioned off and a container of water was buried in one of the plots. Below is a summary of the experiment results and the output for the goodness-of-fit test.
4) Test the hypothesis that the dowsing rods worked better at locating ground water than guessing. Using a goodness-of-fit test and a 0.05 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement. A) Reject H0 ; There is enough evidence to conclude that the dowsing rods worked better than guessing. B) Fail to reject H0 ; There is not enough evidence to conclude that the dowsing rods worked better than guessing. C) Reject H0 ; There is not enough evidence to conclude that the dowsing rods worked better than guessing. D) Fail to reject H0 ; There is enough evidence to conclude that the dowsing rods worked better than guessing. Answer: B
Solve the problem. 5) In many pharmaceutical studies, about 16% of all adult volunteers who take placebos (but who are told they have taken a cold remedy) reported that they suffered side effects of drowsiness, stomach upset, or headaches. A study was run on 250 children aged 8 to 12 to see if the results for children differ from those for adults. Below is a summary of the experiment conducted with 250 children aged 8 to 12 and the output for the goodness-of-fit test.
Choose the correct null and alternative hypotheses. A) H0 : The percent of children reporting side effects is 16% Ha: The percent of children reporting side effects differs from 16% B) H0 : The percent of children reporting side effects differs from 16% Ha: The percent of children reporting side effects is 16% C) H0 : The percent of children reporting side effects is 12.8% Ha: The percent of children reporting side effects differs from 12.8% D) H0 : The percent of children reporting side effects differs from 12.8% Ha: The percent of children reporting side effects is 12.8% Answer: A 6) A traffic study found that of 676 automobiles entering a busy intersection during the period from 6 A.M. to 9 A.M., 257 turned left, 221 turned right, and 198 drove straight through the intersection. Below is a summary of the data and the output for the goodness-of-fit test.
Test the hypothesis that the traffic is not equally divided among the three directions. Using a goodness-of-fit test and a 0.05 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement. A) Fail to reject H0; There is enough evidence to conclude that the traffic is equally divided among the three directions B) Fail to reject H0; There is enough evidence to conclude that the traffic is not equally divided among the three directions C) Reject H0; There is enough evidence to conclude that the traffic is equally divided among the three directions D) Reject H0; There is enough evidence to conclude that the traffic is not equally divided among the three directions Answer: D
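The software output for the traffic study is not reproduced in this copy, but the test can be rebuilt from the counts in the question (257 left, 221 right, 198 straight). The Python sketch below computes the statistic under the null hypothesis of an equal split and obtains the p-value from the closed form that holds for 2 degrees of freedom. The variable names are illustrative, and the numbers it prints are computed here rather than taken from the original output.

```python
import math

# Observed turning counts from the traffic study.
observed = {"left": 257, "right": 221, "straight": 198}

n = sum(observed.values())
# Under H0 the traffic is evenly divided, so each direction expects n / 3 cars.
expected = n / len(observed)

chi2 = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1  # three categories give 2 degrees of freedom

# For df = 2 the right-tail area of the chi-square distribution is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

alpha = 0.05
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"
print(round(chi2, 2), round(p_value, 4), decision)
```

The p-value falls below 0.05, which agrees with the keyed decision to reject the null hypothesis of an even split.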
7) A traffic study found that of 676 automobiles entering a busy intersection during the period from 6 A.M. to 9 A.M., 257 turned left, 221 turned right, and 198 drove straight through the intersection. Below is a summary of the data and the output for the goodness-of-fit test. Choose the correct null and alternative hypotheses if we want to see if traffic is not evenly divided among the three directions.
A) H0: 38% of traffic turns left, 33% of traffic turns right and 29% of traffic drives straight Ha: At least one of the percentages differs from its hypothesized value B) H0: 38% of traffic turns left, 33% of traffic turns right and 29% of traffic drives straight Ha: The traffic is evenly divided among the three directions C) H0: The traffic is evenly divided among the three directions Ha: The traffic is not evenly divided among the three directions D) H0: The traffic is not evenly divided among the three directions Ha: The traffic is evenly divided among the three directions Answer: C
8) In many pharmaceutical studies, about 16% of all adult volunteers who take placebos (but who are told they have taken a cold remedy) reported that they suffered side effects of drowsiness, stomach upset, or headaches. A study was run on 275 children aged 8 to 12 to see if the results are similar. Below is a summary of the experiment conducted with 275 children aged 8 to 12 and the output for the goodness-of-fit test.
Test the hypothesis that the percent of children reporting side effects differs from that of adults. Using a goodness-of-fit test and a 0.10 level of significance, choose the correct decision regarding the null hypothesis and conclusion statement. A) Reject H0 ; There is enough evidence to conclude the percent of children reporting side effects differs from 16% B) Reject H0 ; There is not enough evidence to conclude the percent of children reporting side effects differs from 16% C) Fail to reject H0 ; There is enough evidence to conclude the percent of children reporting side effects differs from 16% D) Fail to reject H0 ; There is not enough evidence to conclude the percent of children reporting side effects differs from 16% Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. There are those who believe that identical twins have a psychic connection. Skeptics of this belief conducted an experiment to see if identical twins had some kind of psychic connection. A twin was placed at random behind one of four similar doors and three other non-related people were placed behind the other three doors. The other twin was asked to identify which door their sibling was behind. The results of the experiment are summarized below, along with the output for the goodness-of-fit test.
9) In the context of this description, state the correct null and alternative hypothesis to test the claim that identical twins have a psychic connection. Answer: H0: The identical twin did no better than guessing which door concealed the other twin; Ha: The identical twin did better than guessing which door concealed the other twin.
10) Test the hypothesis that identical twins have a psychic connection and could locate their twin behind the door more often than would be expected by guessing. Using a goodness-of-fit test and a 0.05 level of significance, state the correct decision regarding the null hypothesis and summarize your conclusion using a complete sentence. Answer: Reject the null hypothesis; there is enough evidence to conclude that the identical twin did better than guessing which door concealed the other twin.
10.3 Chi-Square Tests for Associations between Categorical Variables 1 Determine Whether a Test is a Test of Independence or Homogeneity MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose a random sample of 1,220 U.S. adults were asked about their opinion regarding federal spending on public education. Respondents were asked whether federal spending on public education was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: B
2) Suppose a random sample of 1,220 U.S. adults were asked about their opinion regarding federal spending on infrastructure (i.e., roads and bridges). Respondents were asked whether federal spending on infrastructure was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: B
3) Suppose a researcher was interested in learning more about parents' concerns when their children start elementary school. The researcher asks the parents of 800 randomly selected first graders in a rural school district and the parents of 950 randomly selected first graders in an urban school district to rate their level of concern with the following statement: We are (a) not at all concerned (b) somewhat concerned or (c) very concerned about the nutrition level of school lunches. If we wanted to test whether there was an association between the response to the question and the type of school district that the first grader was attending, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: A
4) Suppose a researcher was interested in learning more about parents' concerns when their children go away to college. The researcher asks the parents of 900 randomly selected freshmen at a private college and the parents of 1,020 randomly selected freshmen at a public college to rate their level of concern with the following statement: We are (a) not at all concerned (b) somewhat concerned or (c) very concerned about the potential pressure to drink alcohol that our child will be exposed to while at college. If we wanted to test whether there was an association between the response to the question and the type of college that the freshman was attending, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: A
The roads in a particular state are in very poor condition, but the state claims that it does not have enough money to repair the roads. A proposal is on the ballot to raise the sales tax in the state from 6% to 7% to pay for repairing the roads. A month before the election, a survey of likely voters is taken.
A random sample of 500 likely voters under the age of 30 is selected, a random sample of 500 likely voters aged 30-49 is selected, and a random sample of 500 likely voters aged 50 or more is selected. Each likely voter is asked whether he/she is in favor of the increase in the sales tax or not.
5) If we want to test whether there is an association between the response to the tax increase and age, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: A
Solve the problem. 6) In 2004, voters in the state of Michigan approved a constitutional amendment that banned same-sex marriage and civil unions in the state. To see if voter sentiment has changed over the years, a survey of 500 registered Michigan voters was taken. Each voter was asked his/her opinion on same-sex relationships with the following options: (a) same-sex marriage should be legalized, (b) same-sex unions should be legalized, (c) no same-sex relationships should be legalized, or (d) no opinion. Each voter was also classified by sex. If we wanted to test whether there was an association between the response to the question and sex, would this be a test of homogeneity or of independence? A) Homogeneity B) Independence Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) Suppose a researcher was interested in learning more about high school seniors' concerns about the future. The researcher asks 750 randomly selected female high school seniors and 800 randomly selected male high school seniors to rate their level of concern with the following statement: I am (a) not at all concerned (b) somewhat concerned or (c) very concerned about getting accepted into a two or four year college.
If we wanted to test whether there was an association between the response to the question and the gender of the high school senior, would this be a test of homogeneity or independence? Answer: Homogeneity
8) Suppose a random sample of 1,105 adults were asked about their opinion regarding salaries of full-time employees at a local state-funded university. Respondents were asked whether salaries at the university were (a) too low, (b) adequate, or (c) too high. Respondents were also classified by income level. If we wanted to test whether there was an association between the response to the question and income level, would this be a test of homogeneity or of independence?
Answer: Independence

2 Identify the Properties of Tests of Independence and Homogeneity

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Choose the statement that is not true about the chi-square test or choose (d) if all the statements are true.
A) The conclusion of a chi-square test tells whether the variables under study are associated and how they are associated.
B) The test statistic is the same for a test of homogeneity or a test for independence; it is chi-square (χ²).
C) To conduct a chi-square test you must have a large enough sample. This condition is met if each expected value is 5 or more.
D) All of these statements are true.
Answer: A

2) Choose the statement that is not true about the chi-square test or choose (d) if all the statements are true.
A) The conclusion of a chi-square test tells only whether the variables under study are associated, not how they are associated.
B) The test statistic is the same for a test of homogeneity or a test for independence; it is chi-square (χ²).
C) To conduct a chi-square test you must have a large enough sample. This condition is met if each expected value is 5 or more.
D) All of these statements are true.
Answer: D

3) The table below shows the gender and the percentage of each gender that spent different amounts at a local toy store. The data were taken from a random sample of single shoppers collected over five consecutive Saturdays at the toy store.
Choose the reason(s) why you cannot do a chi-square test with this data.
A) The samples were not collected randomly. B) The data are from the entire population, not a sample, so inference is unnecessary. C) There is not enough information to convert the percentages to counts. D) All of these. Answer: C
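Why percentages alone are not enough can be illustrated with a short sketch. The percentages and group sizes below are hypothetical (they are not the toy store data), and the helper name `percentages_to_counts` is an assumption made for illustration.

```python
def percentages_to_counts(row_percentages, group_size):
    """Convert one group's percentages to cell counts, given that group's sample size."""
    return [round(p / 100 * group_size) for p in row_percentages]

# Hypothetical percentages for one group across three spending categories.
percentages = [50.0, 30.0, 20.0]

# The same percentages give different counts for different (hypothetical) group sizes,
# so the chi-square statistic cannot be computed from the percentages alone.
small_group = percentages_to_counts(percentages, 20)
large_group = percentages_to_counts(percentages, 200)

print(small_group)
print(large_group)
```

Because the chi-square statistic is built from counts, the two group sizes would lead to different test statistics even though the percentages are identical.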
4) The table below shows the gender and the percentage of each gender that spent different amounts at a local hardware store. The data were taken from a random sample of single shoppers collected over five consecutive Saturdays at the hardware store. Choose the reason(s) why you cannot do a chi-square test with this data.
A) There is not enough information to convert the percentages to counts.
B) The samples were not collected randomly.
C) The data are from the entire population, not a sample, so inference is unnecessary.
D) All of these.
Answer: A

3 Perform Chi-Square Tests of Independence and Homogeneity

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Suppose a random sample of 1,220 U.S. adults were asked about their opinion regarding federal spending on public education. Respondents were asked whether federal spending on public education was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. Choose the correct hypotheses to test whether there is an association between the response to the question and income level.
A) H0: Among U.S. adults, opinions about federal spending on education and income level are associated. Ha: Among U.S. adults, opinions about federal spending on education and income level are independent.
B) H0: Among U.S. adults, opinions about federal spending on education and income level are independent. Ha: Among U.S. adults, opinions about federal spending on education and income level are associated.
C) H0: There is no difference between the proportions of U.S. adults who responded (a), (b) or (c) to the opinion question. Ha: There is a difference between the proportions of U.S. adults who responded (a), (b) or (c) to the opinion question.
D) None of these
Answer: B
2) Suppose a random sample of 1,220 U.S. adults were asked about their opinion regarding federal spending on infrastructure (i.e., roads and bridges). Respondents were asked whether federal spending on infrastructure was (a) too low, (b) adequate, or (c) too high. Respondents were classified by income level. Choose the correct hypotheses to test whether there is an association between the response to the question and income level.
A) H0: There is no difference between the proportions of U.S. adults who responded (a), (b) or (c) to the opinion question. Ha: There is a difference between the proportions of U.S. adults who responded (a), (b) or (c) to the opinion question.
B) H0: Among U.S. adults, opinions about federal spending on infrastructure and income level are associated. Ha: Among U.S. adults, opinions about federal spending on infrastructure and income level are independent.
C) H0: Among U.S. adults, opinions about federal spending on infrastructure and income level are independent. Ha: Among U.S. adults, opinions about federal spending on infrastructure and income level are associated.
D) None of these
Answer: C

3) A health foods store owner is thinking about carrying some new products and is interested in her customers' opinions. The shop owner decides to randomly sample 202 customers and ask them whether they have heard about (a) the health benefits of coconut milk and (b) the health benefits of quinoa flour. She also asked each respondent how often they typically make purchases during a month. The table below shows the results from the study. Assume all conditions for testing have been met.
Test the hypothesis that how the respondents answered the questions is associated with number of monthly purchases, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.
A) Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.
B) Fail to reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.
C) Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are not associated.
D) Reject the null hypothesis; how the respondents answered the questions and number of monthly purchases are associated.
Answer: D
4) Suppose a study was conducted to see whether there is an association between marital status and breast cancer remission. The table below shows the results from the study. Assume all conditions for testing have been met.
This was an observational study of randomly chosen patients who had received similar chemotherapy treatment. Test the hypothesis that marital status and remission from breast cancer are associated, using a significance level of 0… Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.
A) Reject the null hypothesis; marital status and cancer remission are not associated. B) Reject the null hypothesis; marital status and cancer remission are associated. C) Fail to reject the null hypothesis; marital status and cancer remission are not associated. D) Fail to reject the null hypothesis; marital status and cancer remission are associated. Answer: C 5) Suppose a study was conducted to see whether there is an association between marital status and vehicle color. The table below shows the results from the study. Assume all conditions for testing have been met.
This was an observational study of randomly chosen vehicle owners. Test the hypothesis that marital status and vehicle color are associated, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output below.
A) Reject the null hypothesis; marital status and vehicle color are not associated. B) Reject the null hypothesis; marital status and vehicle color are associated. C) Fail to reject the null hypothesis; marital status and vehicle color are associated. D) Fail to reject the null hypothesis; marital status and vehicle color are not associated. Answer: D
The roads in a particular state are in very poor condition, but the state claims that it does not have enough money to repair the roads. A proposal is on the ballot to raise the sales tax in the state from 6% to 7% to pay for repairing the roads. A month before the election, a survey of likely voters is taken. A random sample of 500 likely voters under the age of 30 is selected, a random sample of 500 likely voters aged 30-49 is selected, and a random sample of 500 likely voters aged 50 or more is selected. Each likely voter is asked whether he/she is in favor of the increase in the sales tax or not.

6) Choose the correct hypotheses to test whether there is an association between the response to whether to increase the sales tax or not and age.
A) H0: There is no difference in the proportions of likely voters who favor and who do not favor the increase in sales tax. Ha: There is a difference in the proportions of likely voters who favor and who do not favor the increase in sales tax.
B) H0: There is a difference in the proportions of likely voters who favor and who do not favor the increase in sales tax. Ha: There is no difference in the proportions of likely voters who favor and who do not favor the increase in sales tax.
C) H0: Among likely voters, opinions on the sales tax increase and age are independent. Ha: Among likely voters, opinions on the sales tax increase and age are associated.
D) H0: Among likely voters, opinions on the sales tax increase and age are associated. Ha: Among likely voters, opinions on the sales tax increase and age are independent.
Answer: C

Solve the problem.
7) Recently, many states have been considering legalizing the use of marijuana. A survey was taken last year where adults in one state were asked if they favored legalizing the use of marijuana. The following table displays the percentage of adults in the state who favor and who do not favor the legalization of marijuana for both males and females:
Choose the reason(s) why you cannot perform a chi-square test on the data.
I. The percentages do not add up to 100%.
II. There is not enough information to convert the percentages to counts.
III. The samples were not randomly selected.
A) I only
B) II only
C) III only
D) I, II, and III
Answer: B
8) Recently, many states have been considering legalizing the use of marijuana. A survey was taken last year where adults in one state were asked if they favored legalizing the use of marijuana. The following table displays the percentage of adults in the state who favor and who do not favor the legalization of marijuana for both males and females:
What additional information would be necessary to be able to perform a chi-square test on the data?
A) We would have to know the total number of those favoring legalizing marijuana in the sample and the total number of those not favoring legalizing marijuana in the sample.
B) We would just need to know the total number of people in the sample.
C) We would have to know the total number of men in the sample and the total number of women in the sample.
D) None of these
Answer: C

9) The American Academy of Pediatrics recommends breast milk as the best nutrition for infants. A survey was conducted to see if pregnant women in their last trimester of pregnancy planned to breast feed their babies. It was thought that the income level of the women might be related to whether they planned to breast feed or not. The table below shows the results of the study:
Assume that all conditions for testing have been met. This was an observational study of randomly selected pregnant women in their last trimester of pregnancy. Test whether there is an association between whether a woman plans to breast feed her baby or not and her income level, using a significance level of 0.05. Choose the correct decision and conclusion regarding the null hypothesis that no association exists. Refer to the computer output above.
A) Reject the null hypothesis; income level and plan to breast feed are not associated.
B) Reject the null hypothesis; income level and plan to breast feed are associated.
C) Fail to reject the null hypothesis; income level and plan to breast feed are associated.
D) Fail to reject the null hypothesis; income level and plan to breast feed are not associated.
Answer: D
10) A study of middle school and high school students was conducted to examine the ethnic/racial self-classification of adolescents. A group of 500 students in the eighth grade were randomly selected and followed for 4 years. At the beginning of each year, the students were asked among many questions to classify themselves as White, Black, Hispanic, or Multiracial (additional classifications were eliminated because of small frequencies). At the end of the study, the number of times each student changed his/her classification was recorded. The table below shows the results from the study.
Test the hypothesis that the number of changes in racial classification is associated with students' racial classification in eighth grade, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output above.
A) Reject the null hypothesis; racial classification in 8th grade is not associated with number of changes in racial classification.
B) Reject the null hypothesis; racial classification in 8th grade is associated with number of changes in racial classification.
C) Fail to reject the null hypothesis; racial classification in 8th grade is not associated with number of changes in racial classification.
D) Fail to reject the null hypothesis; racial classification in 8th grade is associated with number of changes in racial classification.
Answer: B

11) Suppose a study was conducted to see whether there is an association between marital status and vehicle color. The table below shows the results from the study. Assume all conditions for testing have been met.
This was an observational study of randomly chosen vehicle owners. Test the hypothesis that marital status and vehicle color are associated, using a significance level of 0.05. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output above.
A) Reject the null hypothesis; marital status and vehicle color are not associated.
B) Reject the null hypothesis; marital status and vehicle color are associated.
C) Fail to reject the null hypothesis; marital status and vehicle color are associated.
D) Fail to reject the null hypothesis; marital status and vehicle color are not associated.
Answer: D
12) A study was conducted to see if teenagers work longer hours to pay for newer cars. A random sample of teenagers was taken and the number of hours each worked was recorded along with the age of the car that each usually drove. The table below shows the results from the study.
Test the hypothesis that the age of the car driven is associated with the number of hours worked per week, using a significance level of 0.10. Choose the correct decision regarding the null hypothesis and conclusion. Refer to the computer output above.
A) Reject the null hypothesis; age of car usually driven is not associated with number of hours worked per week.
B) Reject the null hypothesis; age of car usually driven is associated with number of hours worked per week.
C) Fail to reject the null hypothesis; age of car usually driven is associated with number of hours worked per week.
D) Fail to reject the null hypothesis; age of car usually driven is not associated with number of hours worked per week.
Answer: B

13) Choose the statement that is not true about the chi-square test or state that none of the statements is true.
A) To conduct a chi-square test you must have a large enough sample. This condition is met if each of the observed cells is 5 or more.
B) To conduct a chi-square test you must have a large enough sample. This condition is met if the sample size is at least 40.
C) The conclusion of a chi-square test tells us whether the variables are associated as well as how they are associated.
D) None of these statements are true.
Answer: D

14) Choose the statement that is not true about the chi-square test or state that all the statements are true.
A) The test statistic is the same for a test of homogeneity or a test for independence; it is chi-square (χ²).
B) To conduct a chi-square test you must have a large enough sample. This condition is met if each expected value is 5 or more.
C) The conclusion of a chi-square test tells whether the variables under study are associated and how they are associated.
D) All of these statements are true.
Answer: C

SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
15) Suppose a random sample of 1,105 adults were asked about their opinion regarding salaries of full-time employees at a local state-funded university. Respondents were asked whether salaries at the university were (a) too low, (b) adequate, or (c) too high. Respondents were also classified by income level. State the correct hypotheses to test whether there is an association between the response to the question and income level.
Answer: H0: Among adults, opinions about state-funded university salaries for full-time employees and income level are independent. Ha: Among adults, opinions about state-funded university salaries for full-time employees and income level are associated.
16) Recently, many states have been considering legalizing the use of marijuana. A survey was taken in 2014 where adults in one state were asked if they favored legalizing the use of marijuana. The following table displays the percentage of adults in the state who favor and who do not favor the legalization of marijuana for both males and females. Can a chi-square test be done with these data? Explain why or why not.
Answer: The data are given as percentages, not frequencies, and there is not enough information given to convert the percentages to counts. Suppose that you are a researcher and that you want to research whether a home remedy that recommends applying a salve of tobacco and saliva to cuts or scrapes to absorb toxins really works. You decide to conduct a study to see whether there is an association between the treatment and the outcome. A positive outcome means that the patient reported lower pain levels and the wound had healed without infection in less than 14 days. The table below shows the results from your study. Assume all conditions for testing have been met.
17) Test the hypothesis that the treatment is associated with the outcome, using a significance level of 0.05. State the correct decision regarding the null hypothesis and write a conclusion using a full sentence. Refer to the computer output above. Answer: Reject the null hypothesis. There is enough evidence to conclude that the treatment and the outcome are associated. 18) From a purely scientific point-of-view, would you suggest the tobacco and saliva salve to a co-worker who expressed an interest in home remedies? Explain why or why not. Answer: Yes, they should try the home remedy because the study showed that the treatment and the positive outcome were associated.
Solve the problem. 19) A study of middle school and high school students was conducted to examine the ethnic/racial self-classification of adolescents. A group of 500 students in the eighth grade were randomly selected and followed for 4 years. At the beginning of each year, the students were asked among many questions to classify themselves as White, Black, Hispanic, or Multiracial (additional classifications were eliminated because of small frequencies). At the end of the study, the number of times each student changed his/her classification was recorded. The table below shows the results from the study. Assume all conditions for testing have been met.
Test the hypothesis that the number of changes in racial classification is not associated with students' racial classification in eighth grade, using a significance level of 0.05. State the correct decision regarding the null hypothesis and write a conclusion using a full sentence. Refer to the output above.
Answer: Reject the null hypothesis. The number of changes in racial classification and students' classification in the eighth grade are associated.
10.4 Hypothesis Tests When Sample Sizes Are Small

1 Perform a Chi-Square Test by Combining Categories

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Use the following information to answer the question. The data in the top row of the table shows the number of vacation days taken by the respondent in the previous 90 days. The respondents also reported their level of happiness; Very H means very happy, and so on.
1) We wish to test whether happiness is associated with taking vacation days. Choose the statement that is true about the hypothesis test or choose (d) if all the statements are false. A) The chi-square test is not appropriate because some expected cell counts will be less than 5. B) The sample size is large so the chi-square test is appropriate. C) A hypothesis test cannot be conducted on this data because some of the observed cell counts are zero and there must be at least one observation in each cell. D) All of these statements are false. Answer: A
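The expected-count condition behind this question can be checked with a short sketch. The observed counts below are hypothetical stand-ins (the original table is not reproduced here), and `expected_counts` is an illustrative helper name, not part of any library.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Hypothetical counts (rows: happiness levels, columns: vacation-day categories).
observed = [
    [30, 40, 2],
    [25, 35, 1],
    [10, 5, 0],
]

expected = expected_counts(observed)

# Cells whose expected count falls below 5 violate the large-sample condition.
too_small = [
    (i, j)
    for i, row in enumerate(expected)
    for j, e in enumerate(row)
    if e < 5
]
print(too_small)
```

In this made-up table the whole last column has expected counts below 5. The zero observed count is not the problem in itself; the condition is stated in terms of expected counts.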
Use the following information to answer the question. The data in the top row of the table shows the number of vacation days taken by the respondent in the previous 90 days. The respondents also reported their level of happiness; Very H means very happy, and so on.
2) We wish to test whether happiness is associated with taking vacation days. Choose the statement that is true about the hypothesis test or choose (d) if all the statements are false. A) The chi-square test is not appropriate because some expected cell counts will be less than 5. B) The sample size is large so the chi-square test is appropriate. C) A hypothesis test cannot be conducted on this data because some of the observed cell counts are zero and there must be at least one observation in each cell. D) All of these statements are false. Answer: A Use the following information to answer the question. The data in the top row of the table shows the number of vacation days taken by the respondent in the previous 90 days. The respondents also reported their level of happiness; Very H means very happy, and so on.
3) The following table shows the data after merging categories so that there are two column categories (0-1 vacation days and 2 or more vacation days), and two row categories (happy and unhappy). Expected values for each cell are also shown in parentheses. Test the hypothesis that there is an association between happiness and number of vacation days taken in the last 90 days, using a significance level of 0.05. State the value of the test statistic rounded to two decimal places and state whether the p-value is closer to zero or one.
A) χ² = 4.22; The p-value will be close to zero.
B) χ² = 4.22; The p-value will be close to one.
C) χ² = 52.54; The p-value will be close to zero.
D) χ² = 52.54; The p-value will be close to one.
Answer: C
Use the following information to answer the question. The data in the top row of the table shows the number of vacation days taken by the respondent in the previous 90 days. The respondents also reported their level of happiness; Very H means very happy, and so on.
4) The following table shows the data after merging categories so that there are two column categories (0-2 vacation days and 3 or more vacation days), and two row categories (happy and unhappy). Expected values for each cell are also shown in parentheses. Test the hypothesis that there is an association between happiness and number of vacation days taken in the last 90 days, using a significance level of 0.05. State the value of the test statistic rounded to two decimal places and state whether the p-value is closer to zero or one.

          0-2 vacation days   3+ vacation days
Happy     84 (104)            221 (201)
Unhappy   35 (15.003)         9 (28.997)

A) χ² = 46.28; The p-value will be close to one.
B) χ² = 46.28; The p-value will be close to zero.
C) χ² = 5.03; The p-value will be close to one.
D) χ² = 5.03; The p-value will be close to zero.
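As a worked check of question 4, the statistic can be recomputed from the observed counts in the table. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the p-value line relies on the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math

# Observed counts from the merged table (rows: Happy, Unhappy; columns: 0-2 days, 3+ days).
observed = [[84, 221], [35, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# A 2x2 table has df = 1, so the upper-tail probability is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(round(chi_square, 2))  # 46.28
print(p_value < 0.05)        # True
```

The recomputed expected counts agree with the values in parentheses to the precision shown, and the large statistic on 1 degree of freedom gives a p-value very close to zero.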
Answer: B

Bullying is a problem faced by many schools. To see if student perceptions of how well a high school is doing in dealing with bullying are related to class standing, a small survey was taken. Forty students were randomly selected from Central High School and asked their level of agreement to the following statement: "Central High School is doing a good job dealing with bullying in the school." The numbers in the parentheses are the expected values if "view on bullying" and "class standing" are independent.
5) Suppose we want to test to see if there is an association between class standing and view on bullying. Choose the statement that is true about the hypothesis test or state that all statements are false. A) The chi-square test is appropriate because two observed values are greater than 5. B) The chi-square test is appropriate because only one observed value is 0. C) The chi-square test is not appropriate because some of the expected cell counts are less than 5. D) All these statements are false. Answer: C
6) Suppose the cells in the above table are collapsed so that the new table is as follows:
Expected values for each of the new cells are listed in the table in parentheses. Compute the value of the chi-square statistic rounded to 2 decimal places.
A) χ² = 6.30
B) χ² = 5.23
C) χ² = 49.00
D) χ² = 1.23
Answer: B

Students applying for admission to a Master's program in statistics must submit scores from the GRE test, which includes a verbal and a quantitative component. Shown here are the scores for 100 applicants to a particular Master's statistics program. The numbers in parentheses are the expected values if the verbal scores and quantitative scores are independent.
7) We wish to test whether quantitative scores are associated with verbal scores. Choose the statement that is true about the hypothesis test or state that all of the statements are false.
A) The chi-square test is not appropriate because some expected cell counts are less than 5.
B) The sample size is large so the chi-square test is appropriate.
C) A hypothesis test cannot be conducted on these data because one of the observed cell counts is zero and there must be at least one observation in each cell.
D) All of these statements are false.
Answer: A

8) The following table shows the data after merging categories so that there are two column categories (quantitative scores 130-149 and 150-170), and two row categories (verbal scores 130-149 and 150-170). Expected values for each cell are also shown in parentheses. Calculate the value of the test statistic rounded to two decimal places.
A) χ² = 2.08
B) χ² = 1.77
C) χ² = 30.25
D) χ² = 0.30
Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
Bullying is a problem faced by many students in many schools. To see if student perceptions of how well a high school is doing in dealing with bullying are related to class standing, a small survey was taken. One hundred students were randomly selected from Central High School and asked their level of agreement to the following statement: Central High School is doing a good job dealing with bullying in the school. The numbers in parentheses are the expected values if class standing and view on bullying are independent.
9) Suppose you want to test the hypothesis that view on bullying is associated with class standing. Can a chi-square test for independence be used to test the hypothesis? Explain why or why not.
Answer: No, because some of the expected cell counts will be less than 5 and a chi-square test would result in inaccurate p-values, which could lead to a wrong conclusion.

10) The following table shows the data after merging categories so that there are two column categories (Agree that school is doing a good job dealing with bullying and disagree/neutral that school is doing a good job dealing with bullying), and two row categories (freshman/sophomore and junior/senior). Expected values for each cell are also shown in parentheses. Test the hypothesis that there is an association between view on bullying and class standing, using a significance level of 0.05. State the value of the test statistic rounded to two decimal places, and state whether the p-value is closer to zero or one.
Answer: χ² = 9.66; the p-value will be closer to zero.

Solve the problem.
11) Describe at least one advantage and one disadvantage of combining categories as was done in the previous question.
Answer: If combined categories result in expected cell counts that are 5 or greater, then an advantage of combining categories is that you can conduct a chi-square test which will result in accurate p-values. A disadvantage of combining categories is that any conclusions will now apply to a broader group and may lose some of their practical value.

2 Use Fisher's Exact Test

MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
Solve the problem.
1) Choose the statement that is not true about Fisher's Exact Test or choose (d) if all the statements are true.
A) When sample size is small resulting in expected cell counts that are less than 5, Fisher's Exact Test is one option that can be used to conduct a hypothesis test.
B) Fisher's Exact Test cannot be used for tables with more than two rows or columns.
C) With Fisher's Exact Test an exact p-value can be calculated instead of using an approximation for the p-value as is the case with the chi-square test.
D) All of these statements are true.
Answer: B
2) Choose the statement that is not true about Fisher's Exact Test or choose (d) if all the statements are true.
A) When sample size is small resulting in expected cell counts that are less than 5, Fisher's Exact Test is one option that can be used to conduct a hypothesis test.
B) Fisher's Exact Test can be used for tables with more than two rows or columns.
C) With Fisher's Exact Test an exact p-value can be calculated instead of using an approximation for the p-value as is the case with the chi-square test.
D) All of these statements are true.
Answer: D

3) The following table shows the results from a study to see if a home remedy ointment for mosquito bites worked better than a placebo. Each participant was randomly assigned to receive the home remedy ointment or the placebo ointment. "Improvement" means no symptoms of itching after three minutes.
              No Improvement   Improvement   Total
Home Remedy   2                6             8
Placebo       4                5             9
Total         6                11            17

The alternative hypothesis is that the home remedy ointment leads to improvement (in this case, less itching). The p-value for a one-tailed Fisher's exact test with these data is 0.618. Suppose the study had turned out differently, as in the following table.

              No Improvement   Improvement   Total
Home Remedy   0                8             8
Placebo       6                3             9
Total         6                11            17
Would Fisher's Exact Test have led to a p-value larger or smaller than 0.618? A) The p-value would be larger. B) The p-value would be smaller. Answer: B
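The comparison in this question can be checked numerically. The following is an illustrative sketch only, not part of the test bank's solutions: the function name `fisher_one_sided` is hypothetical, and the table columns are reordered as (Improvement, No Improvement) so that the upper tail corresponds to "home remedy leads to improvement." The absolute p-value depends on how the tail is defined, so the sketch is meant for comparing tables with the same margins rather than for reproducing the value quoted in the question.

```python
from math import comb

def fisher_one_sided(a, b, c, d):
    """Upper-tail Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the top-left count under the
    hypergeometric distribution with all row and column totals fixed.
    """
    row1 = a + b          # first-row total (e.g. home remedy group)
    col1 = a + c          # first-column total (e.g. all improvements)
    n = a + b + c + d     # grand total
    largest_a = min(row1, col1)
    total_tables = comb(n, row1)
    tail = sum(
        comb(col1, k) * comb(n - col1, row1 - k)
        for k in range(a, largest_a + 1)
    )
    return tail / total_tables

# Rows: home remedy, placebo. Columns: improvement, no improvement.
p_original = fisher_one_sided(6, 2, 5, 4)        # table given in the question
p_more_extreme = fisher_one_sided(8, 0, 3, 6)    # all 8 home-remedy users improve
p_opposite = fisher_one_sided(2, 6, 9, 0)        # only 2 home-remedy users improve

print(p_more_extreme < p_original < p_opposite)
```

A table that favors the alternative more strongly gives a smaller upper-tail probability, and a table that points the other way gives a larger one.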
4) The following table shows the results from a study to see if a home remedy ointment for mosquito bites worked better than a placebo. Each participant was randomly assigned to receive the home remedy ointment or the placebo ointment. "Improvement" means there were no symptoms of itching after three minutes.
No Improvement Improvement Total
Home Remedy 2 6 8
Placebo 4 5 9
Total 6 11 17
The alternative hypothesis is that the home remedy ointment leads to improvement (in this case, less itching). The p-value for a one-tailed Fisher's Exact Test with these data is 0.618. Suppose the study had turned out differently, as in the following table.
No Improvement Improvement Total
Home Remedy 6 2 8
Placebo 0 9 9
Total 6 11 17
Would Fisher's Exact Test have led to a p-value larger or smaller than 0.618? A) The p-value would be larger. B) The p-value would be smaller. Answer: A
5) Choose the statement about Fisher's Exact Test that is true or state that none of the statements is true. A) Fisher's Exact Test can only be used with tables that have two rows and two columns. B) The p-value calculated for Fisher's Exact Test is always the same as the p-value associated with a chi-square test. C) If one or more of the expected cell sizes is less than 5, Fisher's Exact Test is one option that can be used to conduct a test of hypothesis. D) None of these statements is true. Answer: C
6) Genetically modified food has generated much controversy as to whether it is safe for humans to consume. A recent study was conducted to see if age is related to how a person views genetically modified food. The results appear in the following table:
The alternative hypothesis is that age and view on genetically modified foods are associated. The p-value for Fisher's Exact Test with these data is 0.1748. Suppose the study had turned out differently, as in the following table.
Would the p-value for Fisher's Exact Test be larger or smaller than 0.1748? A) The p-value would be larger. B) The p-value would be smaller. Answer: A
7) Genetically modified food has generated much controversy as to whether it is safe for humans to consume. A recent study was conducted to see if age is related to how a person views genetically modified food. The results appear in the following table:
The alternative hypothesis is that age and view on genetically modified foods are associated. The p-value for Fisher's Exact Test with these data is 0.1748. Suppose the study had turned out differently, as in the following table.
Would the p-value for Fisher's Exact Test be larger or smaller than 0.1748? A) The p-value would be larger. B) The p-value would be smaller. Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
8) The following table shows the results from a study to see if a home remedy salve for bee stings works differently than a placebo. Each participant was randomly assigned to receive the home remedy salve or a placebo ointment. "Improvement" means that there were no symptoms of pain after fifteen minutes.
The alternative hypothesis is that the home remedy salve and the placebo work differently. The p-value for Fisher's Exact Test with these data is 0.6285. Suppose the study had turned out differently, as in the following table.
Would Fisher's Exact Test have led to a p-value larger or smaller than 0.6285? Explain. Answer: The p-value would be smaller than 0.6285 because these results are more extreme.
Ch. 10 Associations between Categorical Variables Answer Key
10.1 The Basic Ingredients for Testing with Categorical Variables
1 Classify Tests as Using Categorical or Numerical Data 1) A 2) C 3) B 4) C
2 Construct Two-Way Tables 1) C 2) A 3) A 4) B 5) B 6) C 7) B 8) B 9) A 10) B 11) Pay category and job satisfaction are the categorical variables. One association that could be tested is whether there is an association between pay category and job satisfaction. 12) 194/1310 = 0.148 or 14.8%
3 Classify Variables as Numeric or Categorical 1) C 2) A 3) C 4) C
4 Find Expected Values 1) B 2) B 3) B 4) C 5) C 6) B 7) D 8) C 9) 161.64 10) 55
5 Calculate Chi-Square Statistics 1) D 2) D 3) A 4) D 5) χ² = 179.30 6) The sampling distribution of z is bell-shaped and values can be any real number. The sampling distribution of χ² is usually not symmetric and is often right skewed. The values of χ² must be positive.
10.2 The Chi-Square Test for Goodness of Fit 1 Identify the Properties of Chi-Square Tests for Goodness of Fit 1) D 2) C 3) A
4) The chi-square statistic will be large. The p-value supports a decision to reject the null hypothesis. 2 Perform Chi-Square Tests for Goodness of Fit 1) D 2) B 3) A 4) B 5) A 6) D 7) C 8) D 9) H0 : The identical twin did no better than guessing which door concealed the other twin; Ha: The identical twin did better than guessing which door concealed the other twin. 10) Reject the null hypothesis; there is enough evidence to conclude that the identical twin did better than guessing which door concealed the other twin.
10.3 Chi-Square Tests for Associations between Categorical Variables 1 Determine Whether a Test is a Test of Independence or Homogeneity 1) B 2) B 3) A 4) A 5) A 6) B 7) Homogeneity 8) Independence 2 Identify the Properties of Tests of Independence and Homogeneity 1) A 2) D 3) C 4) A 3 Perform Chi-Square Tests of Independence and Homogeneity 1) B 2) C 3) D 4) C 5) D 6) C 7) B 8) C 9) D 10) B 11) D 12) B 13) D 14) C 15) H0 : Among adults, opinions about state-funded university salaries for full-time employees and income level are independent. Ha: Among adults, opinions about state-funded university salaries for full-time employees and income level are associated. 16) The data are given as percentages, not frequencies, and there is not enough information given to convert the percentages to counts. 17) Reject the null hypothesis. There is enough evidence to conclude that the treatment and the outcome are associated. 18) Yes, they should try the home remedy because the study showed that the treatment and the positive outcome were associated.
19) Reject the null hypothesis. The number of changes in racial classification and students' classification in the eighth grade are associated.
10.4 Hypothesis Tests When Sample Sizes Are Small 1 Perform a Chi-Square Test by Combining Categories 1) A 2) A 3) C 4) B 5) C 6) B 7) A 8) B 9) No, because some of the expected cell counts will be less than 5 and a chi-square test would result in inaccurate p-values, which could lead to a wrong conclusion. 10) χ² = 9.66; the p-value will be closer to zero. 11) If combined categories result in expected cell counts that are 5 or greater, then an advantage of combining categories is that you can conduct a chi-square test, which will result in accurate p-values. A disadvantage of combining categories is that any conclusions will now apply to a broader group and may lose some of their practical value. 2 Use Fisher's Exact Test 1) B 2) D 3) B 4) A 5) C 6) A 7) B 8) The p-value would be smaller than 0.6285 because these results are more extreme.
Ch. 11 Multiple Comparisons and Analysis of Variance 11.1 Multiple Comparisons 1 Choose an Appropriate Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Choose the appropriate test for the following situation: You wish to test whether the mean number of words recalled from short-term memory is different for males and females. A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: B
2) Choose the appropriate test for the following situation: You wish to test whether an association exists between the type of vehicle a driver owns and the cost of speeding tickets. A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: C
3) Choose the appropriate test for the following situation: You wish to test whether an association exists between type of vehicle purchased and vehicle color. A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: D
4) Choose the appropriate test for the following situation: You wish to test whether an association exists between the type of vehicle purchased and how many children the buyer has. A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: C
5) Choose the statement that is true about the level of significance when performing several hypothesis tests at the same time. Or state that all statements are true. A) The overall significance level is always smaller than the significance level for any one of the individual tests. B) The overall level of significance is the probability of mistakenly rejecting the null hypothesis in at least one of several hypothesis tests. C) The overall level of significance is found by adding the significance levels of each of the several hypothesis tests. D) All of these statements are true. Answer: B
6) Choose the statement that is not true about multiple comparisons and ANOVA. Or state that all the statements are true. A) ANOVA is a method for testing whether there is an association between a categorical variable and a numerical variable. B) When doing a multiple comparison, the overall significance level will increase, meaning it is more likely that an incorrect conclusion will be drawn. C) When doing multiple comparisons, the response variable is always numerical, but the independent variable can be numerical or categorical. D) All of these statements are true. Answer: C
7) Choose the appropriate test for the following situation: You wish to test whether the mean number of boxes of a new cereal sold in several different markets around the country is different for three different package types. A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: C
8) Choose the appropriate test for the following situation. A company has instituted a new work policy that it hopes will reduce the number of sick days used per year by its employees. You wish to test whether the mean number of sick days used by employees this year differs from the mean number of sick days used last year by all employees (which was 8.2). A) One-sample t-test B) Two-sample t-test C) ANOVA D) Chi-square test Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
9) Name the appropriate test for the following situation: Suppose that researchers wish to study whether type of diet affects alertness levels. Three popular dieting plans were included in the study. After following the diet for several weeks, participants were given a test that measured response time to a stimulus. Answer: ANOVA test
10) Name the appropriate test for the following situation: You wish to test whether an association exists between type of vehicle transmission purchased (automatic or manual) and gender of the purchaser. Answer: Chi-square test
2 Use the Bonferroni Correction MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose you have observations from six different regions within your state and you wish to do hypothesis tests to compare the mean income across groups. How many comparisons can be done with six groups? A) 15 B) 20 C) 30 D) 11 Answer: A
2) Suppose you have observations from six different regions within your state and you wish to do hypothesis tests to compare the mean income across groups. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.10? Round to the nearest thousandth. A) 0.050 B) 0.007 C) 0.033 D) None of these Answer: B
3) Suppose you have observations from five different regions within your state and you wish to do hypothesis tests to compare the mean home value across groups. How many comparisons can be done with five groups? A) 10 B) 15 C) 20 D) 30 Answer: A
4) Suppose you have observations from five different regions within your state and you wish to do hypothesis tests to compare the mean home value across groups. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.05? Round to the nearest thousandth. A) 0.050 B) 0.010 C) 0.005 D) None of these Answer: C
5) Suppose you wish to compare the means of m groups pair-by-pair. Choose the statement that is not true about the Bonferroni Correction. Or state that none of the statements is true. A) The Bonferroni Correction is done by using a larger significance level for each individual test. B) If you want an overall significance level of α and you are running m tests, then each individual test should be run with significance level mα. C) The number of pairwise comparisons required is always equal to the number of groups that are being compared. D) None of these statements is true. Answer: D
6) Suppose you wish to compare the means of m groups pair-by-pair. Choose the statement that is true about the Bonferroni Correction. Or state that none of the statements is true. A) The Bonferroni Correction is done by using a significance level for each individual test. B) If you want an overall significance level of α and you are running m tests, then each individual test should be run with significance level α / m. C) The number of comparisons required is always equal to the number of groups that are being compared. D) None of these statements is true. Answer: B
7) Suppose that you wish to compare the mean starting salaries of students from seven different majors based on a random sample of students from these majors. There will be a total of 21 pairwise comparisons. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance level of 0.05? Round your answer to the nearest thousandth. A) 0.050 B) 0.025 C) 0.007 D) 0.002 Answer: D
8) Suppose you have observations from six different regions within your state and you wish to perform hypothesis tests to compare the mean home value across groups. There will be a total of 15 pairwise comparisons. Using the Bonferroni Correction, what significance level should you use for each hypothesis test if you want an overall significance of 0.05? Round to the nearest thousandth. A) 0.050 B) 0.008 C) 0.003 D) None of these Answer: C
9) Suppose that you wish to compare the mean starting salaries of students from three different majors. The Bonferroni-corrected intervals for all three comparisons are shown in the table. Assume an overall confidence level of 95%, that is, an individual confidence level of 98.33%. Choose the statement that is true about the differences among the three means. Or state that all of the statements are true.
A) The interval comparing Business to Education does not capture 0; there is a significant difference in the mean starting salaries between Business majors and Education majors. B) The interval comparing Computer Science to Education does not capture 0; there is a significant difference between the starting salaries of Computer Science majors and Education majors. C) The interval comparing Business to Computer Science captures 0; there is no significant difference in mean starting salaries between Business majors and Computer Science majors. D) All of these statements are true. Answer: D
10) Suppose that you wish to compare the mean home values from three different regions. The Bonferroni-corrected intervals for all three comparisons are shown in the table. Assume an overall confidence level of 95%, that is, an individual confidence level of 98.33%. Choose the statement that is true about the differences among the three means. Or state that none of the statements is true.
A) The interval comparing North East to North Central does not capture 0; there is no significant difference in the mean home values between the North East and the North Central regions. B) The interval comparing Far West to North Central does not capture 0; there is no significant difference in the mean home values between the Far West and the North Central regions. C) The interval comparing North East to Far West captures 0; there is no significant difference in mean home values between the North East and the Far West regions. D) None of these statements is true. Answer: C
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
11) Suppose that you wish to determine whether the mean starting salaries of students from seven different majors differ. How many comparisons (comparing two groups at a time) can be made with seven groups? Answer: 21 comparisons
12) Suppose that you wish to determine whether the mean starting salaries of students from seven different majors differ. If you compared groups two-at-a-time, what significance level should you use for each hypothesis test if you want an overall significance level of 0.10? Round your answer to the nearest thousandth. Answer: 0.005
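The arithmetic behind the Bonferroni questions in this section can be sketched as follows. This is an illustrative example, not part of the test bank; the function names `pairwise_comparisons` and `bonferroni_alpha` are hypothetical. It counts the pairs that can be formed from k groups, k(k - 1)/2, and divides the desired overall significance level by that count.

```python
from math import comb

def pairwise_comparisons(groups):
    """Number of two-at-a-time comparisons among `groups` groups: k(k-1)/2."""
    return comb(groups, 2)

def bonferroni_alpha(overall_alpha, groups):
    """Per-test significance level when every pair of groups is compared."""
    return overall_alpha / pairwise_comparisons(groups)

# Six groups, overall significance level 0.10
print(pairwise_comparisons(6), round(bonferroni_alpha(0.10, 6), 3))
# Seven groups, overall significance level 0.05
print(pairwise_comparisons(7), round(bonferroni_alpha(0.05, 7), 3))
```

The per-test level shrinks quickly as groups are added, because the number of pairs grows roughly with the square of the number of groups.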
11.2 The Analysis of Variance 1 Compare F-values using Boxplots MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose a researcher collected data to compare whether dogs of different size categories differed in mean cost to a pet owner. Dogs were categorized as small, medium, or large. Cost was calculated as the average annual amount spent on food, veterinary visits, and medications. The calculated F-statistic was 421.58. Given this test statistic, which of the following is the most reasonable conclusion? A) The F-statistic shows that variation within groups is larger than variation between groups; therefore, the researcher will likely conclude that there is not an association between dog size and mean cost to the pet owner. B) The F-statistic shows that large dogs have a significantly higher cost to the pet owner than the other categories; therefore, the researcher will conclude that there is an association between dog size and mean cost to the pet owner. C) The F-statistic shows that variation between groups is larger than variation within groups; therefore, the researcher will likely conclude that there is an association between dog size and mean cost to the pet owner. D) None of these Answer: C
2) Refer to the figure below. Assume that all the distributions are symmetric (i.e., the sample mean and median are approximately equal) and that all the sample sizes are the same. Imagine carrying out two ANOVAs. The first ANOVA compares the means based on samples A, B, and C (left of the vertical line), and the second ANOVA compares the means based on samples X, Y, and Z (right of the vertical line). One of the calculated values of the F-statistic is 3.64 and the other is 30.92. Which value is which, and why?
A) 3.64 goes with A, B, and C; 30.92 goes with X, Y, and Z. The F-statistic for X, Y, and Z is larger because the variation between groups is larger relative to the variation within groups. B) 3.64 goes with A, B, and C; 30.92 goes with X, Y, and Z. The F-statistic for X, Y, and Z is larger because the variation within groups is larger relative to the variation between groups. C) 30.92 goes with A, B, and C; 3.64 goes with X, Y, and Z. The F-statistic for A, B, and C is larger because the variation between groups is larger relative to the variation within groups. D) 30.92 goes with A, B, and C; 3.64 goes with X, Y, and Z. The F-statistic for A, B, and C is larger because the variation within groups is larger relative to the variation between groups. Answer: A
3) An increase in which type of variance (between group or within group) will result in a larger F-statistic value? Explain. A) Between group variance because it would increase the denominator of the F-statistic and therefore increase the entire F-statistic value. B) Between group variance because it would increase the numerator of the F-statistic and therefore increase the entire F-statistic value. C) Within group variance because it would increase the denominator of the F-statistic and therefore increase the entire F-statistic value. D) Within group variance because it would increase the numerator of the F-statistic and therefore increase the entire F-statistic value. Answer: B
4) A decrease in which type of variance (between group or within group) will result in a larger F-statistic value? Explain. A) Between group variance because it would decrease the denominator of the F-statistic and therefore increase the entire F-statistic value. B) Between group variance because it would decrease the numerator of the F-statistic and therefore increase the entire F-statistic value. C) Within group variance because it would decrease the denominator of the F-statistic and therefore increase the entire F-statistic value. D) Within group variance because it would decrease the numerator of the F-statistic and therefore increase the entire F-statistic value. Answer: C
5) Refer to the figure below. Assume that all the distributions are symmetric (i.e., the sample mean and median are approximately equal) and that all the sample sizes are the same. Imagine carrying out two ANOVAs. The first ANOVA compares the means based on samples A, B, and C (left of the vertical line), and the second ANOVA compares the means based on samples X, Y, and Z (right of the vertical line). One of the calculated values of the F-statistic is 0.92 and the other is 33.54. Which value is which, and why?
A) 33.54 goes with A, B, and C; 0.92 goes with X, Y, and Z. The F-statistic for A, B, and C is larger because the variation between groups is smaller relative to the variation within groups. B) 33.54 goes with A, B, and C; 0.92 goes with X, Y, and Z. The F-statistic for A, B, and C is larger because the variation within groups is smaller relative to the variation between groups. C) 0.92 goes with A, B, and C; 33.54 goes with X, Y, and Z. The F-statistic for X, Y, and Z is larger because the variation between groups is smaller relative to the variation within groups. D) 0.92 goes with A, B, and C; 33.54 goes with X, Y, and Z. The F-statistic for X, Y, and Z is larger because the variation within groups is smaller relative to the variation between groups. Answer: D
2 Interpret ANOVA Results MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information for the question. Researchers conducted a study that examined marital status and stress levels. A hypothesis test was conducted to test the claim that people with different marital statuses have a different mean stress level. The TI-84 output for the test is shown below.
1) State the null and alternative hypotheses. A) H0 : Marital status and stress levels are associated. Ha : Marital status and stress levels are not associated. B) H0 : Marital status and stress levels are not associated. Ha : Marital status and stress levels are associated. C) H0 : Marital status and stress levels are not associated. Ha : Mean stress levels of married people are greater than the mean stress levels of unmarried people. D) H0 : There is no difference in the mean stress levels of married and unmarried people. Ha : There is a difference in the mean stress levels of married and unmarried people. Answer: B
2) What is the value of the test statistic? Round to the nearest hundredth if necessary. A) 25.27 B) 76.70 C) 5.21 D) Can't be determined with the given information Answer: C
Solve the problem.
3) Choose the statement that is not true about ANOVA. A) ANOVA can be used to compare the means of two groups. B) ANOVA is a more powerful approach than the Bonferroni Correction. C) ANOVA is used to compare two categorical variables. D) ANOVA is a procedure for comparing the means of several groups. Answer: C
4) Choose the statement that is true about the F-statistic. Or state that all statements are true. A) The F-statistic compares the variation between the groups to the variation within the groups. B) Large values of the F-statistic support the null hypothesis that there is no difference among the group means. C) The F-statistic is large when the variation within the groups is large compared to the variation between the groups. D) All of these statements are true. Answer: A
5) Choose the statement that best describes the purpose of ANOVA. A) The ANOVA procedure will reveal whether the means of several groups are different and which group or groups have a different mean. B) ANOVA is a procedure for comparing the means of several groups. C) ANOVA is a procedure for comparing different categories for several groups. D) None of these Answer: B 6) Identify the test statistic used for the ANOVA procedure and how it is calculated. A) The test statistic is z and is the ratio of the mean within a group to the variation between groups. B) The test statistic is F and is the ratio of the variation between groups to the variation within groups. C) The test statistic is z and is calculated by finding the mean z-score between groups. D) The test statistic is F and is calculated by finding the difference between group means. Answer: B 7) Refer to the figure. Assume that distributions within each group are symmetric and that all the sample sizes are the same. Suppose that two ANOVAs are carried out. The first compares the means of samples A, B, and C. The second compares the means of samples L, M, and N. Which of the following statements is true?
A) The F-statistic for comparing the means of samples A, B, and C is larger than the F-statistic for comparing the means of samples L, M, and N. B) The F-statistic for comparing the means of samples A, B, and C is smaller than the F-statistic for comparing the means of samples L, M, and N. C) The F-statistic for comparing the means of samples A, B, and C is the same as the F-statistic for comparing the means of samples L, M, and N. D) There is not enough information to determine how the F-statistic for comparing the means of samples A, B, and C and the F-statistic for comparing the means of samples L, M, and N compare. Answer: B
Vanadium has recently been recognized as an important trace element. An experiment was conducted to compare the concentrations of vanadium in different foods. The amount of Vanadium, measured in nanograms per gram, was recorded for 8 samples of each of the following foods: green beans, tuna, and soy milk. The output from an ANOVA is shown below:
8) State the null and alternative hypotheses to carry out ANOVA with these data. A) H0 : There is a difference in the mean level of Vanadium among the three food products Ha: There is no difference in the mean level of Vanadium among the three food products B) H0 : There is no difference in the mean level of Vanadium among the three food products Ha: There is a difference in the mean level of Vanadium among the three food products C) H0 : Food products and level of Vanadium are related Ha: Food products and Vanadium are not related D) H0 : Food products and level of Vanadium are not related Ha: Food products and level of Vanadium are related Answer: B 9) What is the value of the test statistic? A) 0.4938 B) 1.0591
C) 2.14 D) 0.142
Answer: C
Vanadium has recently been recognized as an important trace element. An experiment was conducted to compare the concentrations of vanadium in different foods. The amount of vanadium, measured in nanograms per gram, was recorded for 8 samples of each of the following foods: green beans, tuna, and soy milk. The output from an ANOVA is shown below
10) State the null and alternative hypotheses to carry out ANOVA with these data. A) H0 : Food products and level of vanadium are related Ha: Food products and vanadium are not related B) H0 : Food products and level of vanadium are not related Ha: Food products and level of vanadium are related C) H0 : There is no difference in the mean level of vanadium among the three food products Ha: There is a difference in the mean level of vanadium among the three food products D) H0 : There is a difference in the mean level of vanadium among the three food products Ha: There is no difference in the mean level of vanadium among the three food products Answer: C 11) What is the value of the test statistic? A) 0.541 B) 1.559
C) 0.078 D) 2.88
Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 12) What is the meaning of the overall significance level of a test? Explain what happens to the overall significance level when multiple comparisons are made, that is, when multiple hypothesis tests are conducted in an effort to compare the means of several groups. Answer: The overall significance level is the probability that you will mistakenly reject the null hypothesis in at least one of several hypothesis tests. When multiple comparisons are made the overall significance level will increase. 13) What is the test statistic used for the ANOVA procedure? Explain how it is calculated. Answer: The test statistic is F and is calculated by dividing the variation between groups by the variation within groups. 14) Suppose a researcher collected data to compare whether cats of different size categories differed in mean number of hours slept per day. Cats were categorized as small, medium, or large. The calculated F-statistic was 220.41. Given this test statistic, can we conclude that there is relatively more variation between groups or within groups? Based on the test statistic, is it likely that an association exists between cat size and mean number of hours that a cat sleeps per day? Explain. Answer: There is more variation between groups. Since F - the ratio of variation between groups to the variation within groups - is relatively large, it is likely that an association exists.
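The description of the F-statistic as variation between groups divided by variation within groups can be sketched in code. This is an illustrative example under stated assumptions, not the test bank's method: the function name `f_statistic` and both small data sets are hypothetical, and "variation" is taken to be the mean square (sum of squares divided by its degrees of freedom), as in a one-way ANOVA.

```python
def f_statistic(groups):
    """One-way ANOVA F: mean square between groups / mean square within groups."""
    k = len(groups)                                # number of groups
    n = sum(len(g) for g in groups)                # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the observations around their own group mean
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical data: same spread within each group, different separation between groups
well_separated = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
overlapping = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]

print(f_statistic(well_separated), f_statistic(overlapping))
```

With the within-group spread held fixed, moving the group means farther apart increases only the numerator, so F grows.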
Vanadium has recently been recognized as an important trace element. An experiment was conducted to compare the concentrations of vanadium in different foods. The amount of vanadium, measured in nanograms per gram, was recorded for 8 samples of each of the following foods: green beans, tuna, and soy milk. The output from an ANOVA is shown below
15) State the null and alternative hypotheses. Answer: H0 : Food product and amount of vanadium are not associated; Ha: Food product and amount of vanadium are associated.
16) What is the value of the test statistic? Answer: F = 2.88
Solve the problem.
17) In the context of the ANOVA test, do the phrases "variation due to treatment", "explained variation", and "variation due to factors" describe variation within groups or variation between groups? Answer: Variation between groups
3 Complete an ANOVA Table MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
1) Compute the F-statistic. Round to the nearest hundredth. A) 1.19 B) 0.84 C) 0.35 D) Not enough information is given
Answer: A
Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
2) Compute the F-statistic. Round to the nearest hundredth. A) 35.48 B) 10.51 C) 0.10 D) Not enough information is given
Answer: B
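Worked illustration (not part of the original test bank): the ANOVA output for the questions above is not reproduced here, so the following sketch uses hypothetical sums of squares to show how an F-statistic is completed from an ANOVA table. Only the design (four groups of thirty vehicles) is taken from the questions; the function name and the numeric inputs are assumptions.

```python
# Sketch only: the sums of squares below are made-up values, not the
# output referred to in the questions.

def f_statistic(ss_between: float, ss_within: float,
                num_groups: int, total_n: int) -> float:
    """F = (mean square between groups) / (mean square within groups)."""
    df_between = num_groups - 1        # degrees of freedom for the factor
    df_within = total_n - num_groups   # degrees of freedom for the error
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


if __name__ == "__main__":
    # Four colors with thirty vehicles each gives df = 3 and df = 116.
    f_value = f_statistic(ss_between=300.0, ss_within=11600.0,
                          num_groups=4, total_n=120)
    print(f"F = {f_value:.2f}")
```

With these hypothetical inputs both mean squares equal 100, so F is 1, meaning the variation between groups is no larger than the variation within groups.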
11.3 The ANOVA Test 1 Perform an ANOVA Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
1) Choose the correct conclusion for the hypothesis that vehicle color affects the amount of a speeding ticket. Assume all ANOVA test conditions have been satisfied. A) Reject H0 . The vehicle color has an effect on the amount of the speeding ticket. B) Fail to reject H0 . The vehicle color has an effect on the amount of the speeding ticket. C) Reject H0 . The vehicle color has no effect on the amount of the speeding ticket. D) Fail to reject H0 . The vehicle color has no effect on the amount of the speeding ticket. Answer: D Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
2) Choose the correct conclusion for the hypothesis that vehicle color affects the amount of a speeding ticket. Assume all ANOVA test conditions have been satisfied. A) Reject H0 . The vehicle color has an effect on the amount of the speeding ticket. B) Fail to reject H0 . The vehicle color has an effect on the amount of the speeding ticket. C) Reject H0 . The vehicle color has no effect on the amount of the speeding ticket. D) Fail to reject H0 . The vehicle color has no effect on the amount of the speeding ticket. Answer: A
Use the following information for the question. A group of home gardeners want to test whether the type of soil used to grow heirloom tomatoes has an effect on the number of tomatoes harvested. Gardeners randomly assigned tomato plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Forty plants were assigned to each group. At the end of the growing season the number of tomatoes harvested was counted. Assume that all other conditions for the ANOVA test have been met.
3) State the null and alternative hypotheses. A) H0: μnone < μcommercial < μcompost Ha: There is no difference in the mean number of tomatoes harvested. B) H0: μnone = μcommercial = μcompost Ha: The mean number of tomatoes harvested differs by type of fertilizer used. C) H0: The mean number of tomatoes harvested differs by type of fertilizer used. Ha: There is no difference in the mean number of tomatoes harvested by type of fertilizer used. D) None of these Answer: B 4) Using the test results provided, test the hypothesis that soil treatment affects the number of tomatoes harvested. Use a significance level of 5%. Choose the correct decision regarding the null hypothesis and correct conclusion. A) Reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested. B) Fail to reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested. C) Reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested. D) Fail to reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested. Answer: A
Use the following information for the question. A group of home gardeners want to test whether the type of soil used to grow heirloom tomatoes has an effect on the number of tomatoes harvested. Gardeners randomly assigned tomato plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Forty plants were assigned to each group. At the end of the growing season the number of tomatoes harvested was counted. Assume that all other conditions for the ANOVA test have been met.
5) State the null and alternative hypotheses. A) H0: μnone < μcommercial < μcompost Ha: There is no difference in the mean number of tomatoes harvested. B) H0: The mean number of tomatoes harvested differs by type of fertilizer used. Ha: There is no difference in the mean number of tomatoes harvested by type of fertilizer used. C) H0: μnone = μcommercial = μcompost Ha: The mean number of tomatoes harvested differs by type of fertilizer used. D) None of these Answer: C 6) Using the test results provided, test the hypothesis that soil treatment affects the number of tomatoes harvested. Use a significance level of 5%. Choose the correct decision regarding the null hypothesis and correct conclusion. A) Reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested. B) Fail to reject H0. We can conclude that the treatment of the soil affects the number of heirloom tomatoes harvested. C) Reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested. D) Fail to reject H0. We can conclude that the treatment of the soil does not affect the number of heirloom tomatoes harvested. Answer: D
The waiting time (in minutes) was measured for a random sample of patients with non-life-threatening injuries in emergency rooms at four local hospitals who arrived between 2:00 and 3:00 pm on a particular Saturday. The results of the ANOVA and the plot of the residuals follow:
7) Using the information in the outputs, is there evidence that the condition of same variance holds? A) Yes B) No Answer: B 8) Using the information in the outputs, is there evidence that the distribution of observations is Normal in each hospital's population? A) Yes B) No Answer: B
The waiting time (in minutes) was measured for a random sample of patients with non-life-threatening injuries in emergency rooms at four local hospitals who arrived between 2:00 and 3:00 pm on a particular Saturday. The results of the ANOVA and the plot of the residuals follow:
9) Using the information in the outputs, is there evidence that the distribution of observations is Normal in each hospital's population? A) Yes B) No Answer: B 10) Using the information in the outputs, is there evidence that the condition of same variance holds? A) Yes B) No Answer: A Solve the problem. 11) Which of the following statements about the p-value of an ANOVA test is not true? A) The p-value for an ANOVA test is the probability of getting an F-statistic as small as or smaller than the observed value, assuming the null hypothesis is true. B) The p-value for an ANOVA test is the probability of getting an F-statistic as large as or larger than the observed value, assuming the null hypothesis is true. C) A very small p-value for an ANOVA test suggests that at least two population means are different. D) A large p-value for an ANOVA test suggests that there is no evidence of differences among the population means. Answer: A
12) Choose the statement that best describes the F-statistic for the ANOVA test. A) The F-statistic compares the variation between groups to the variation within groups. A small F-statistic indicates that variation between groups is small relative to variation within groups. B) The F-statistic compares the variation between groups to the variation within groups. A small F-statistic indicates that variation between groups is large relative to variation within groups. C) The F-statistic is the probability of getting the sample results, assuming that there is no difference in the groups. D) None of these Answer: A High levels of ozone in the air indicate that air pollution is present. Eight air samples were collected from each of 3 locations in southern California and the amounts of ozone (in parts per million) were recorded for each. Assume that all conditions of the ANOVA test have been met.
13) State the null and alternative hypotheses for carrying out an ANOVA with these data. A) H0: μ1 = μ2 = μ3 Ha: μ1 > μ2 > μ3 B) H0: μ1 > μ2 > μ3 Ha: There is no difference in the mean amounts of ozone among the three locations C) H0: There is no difference in the mean amounts of ozone among the three locations Ha: μ1 > μ2 > μ3 D) H0: μ1 = μ2 = μ3 Ha: The mean amount of ozone differs among the three locations Answer: D 14) Using the results of the ANOVA provided, test the hypothesis that the mean amount of ozone varies by location. Use a significance level of 1%. Choose the correct decision regarding the null hypothesis and correct conclusion. A) Fail to reject H0. We conclude that the mean amount of ozone differs among the three locations. B) Reject H0. We conclude that the mean amount of ozone differs among the three locations. C) Fail to reject H0. We conclude that there is no difference in the mean amount of ozone among the three locations. D) Reject H0. We conclude that there is no difference in the mean amount of ozone among the three locations. Answer: B
High levels of ozone in the air indicate that air pollution is present. Eight air samples were collected from each of 3 locations in southern California and the amounts of ozone (in parts per million) were recorded for each. Assume that all conditions of the ANOVA test have been met.
15) State the null and alternative hypotheses for carrying out an ANOVA with these data. A) H0: There is no difference in the mean amounts of ozone among the three locations Ha: μ1 > μ2 > μ3 B) H0: μ1 = μ2 = μ3 Ha: The mean amount of ozone differs among the three locations C) H0: μ1 = μ2 = μ3 Ha: μ1 > μ2 > μ3 D) H0: μ1 > μ2 > μ3 Ha: There is no difference in the mean amounts of ozone among the three locations Answer: B 16) Using the results of the ANOVA provided, test the hypothesis that the mean amount of ozone varies by location. Use a significance level of 1%. Choose the correct decision regarding the null hypothesis and correct conclusion. A) Fail to reject H0. We conclude that the mean amount of ozone differs among the three locations. B) Reject H0. We conclude that the mean amount of ozone differs among the three locations. C) Fail to reject H0. We conclude that there is no difference in the mean amount of ozone among the three locations. D) Reject H0. We conclude that there is no difference in the mean amount of ozone among the three locations. Answer: C
Polychlorinated biphenyls (PCBs), used in manufacturing of many materials, are extremely toxic contaminants when released into the environment. Thirty water samples were taken from each of 6 rivers in the Northeast and analyzed for PCB concentrations (in parts per million).
17) Compute the F-statistic. Round to the nearest hundredth. A) 0.78 B) 1.28 C) 20.71 D) 16.16
Answer: B 18) Choose the correct conclusion to test the hypothesis that mean PCB concentrations differ among the six rivers. Use a significance level of 5%. Assume that all ANOVA conditions are met. A) Fail to reject H0 . We conclude that the mean PCB concentration differs among the six rivers. B) Reject H0 . We conclude that the mean PCB concentration differs among the six rivers. C) Fail to reject H0 . We conclude that there is no evidence of a difference in the mean PCB concentration among the six rivers. D) Reject H0 . We conclude that there is no difference in the mean PCB concentration among the six rivers. Answer: C Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study - red, white, black, and silver. Thirty vehicles were randomly assigned to each group.
19) Compute the F-statistic. Round to the nearest hundredth. A) 269.31 B) 6.96 C) 0.14 D) Not enough information is given
Answer: B 20) Choose the correct conclusion for the hypothesis that vehicle color affects the amount of a speeding ticket. Assume all ANOVA test conditions have been satisfied. A) Reject H0 . The vehicle color has an effect on the amount of the speeding ticket. B) Fail to reject H0 . The vehicle color has an effect on the amount of the speeding ticket. C) Reject H0 . The vehicle color has no effect on the amount of the speeding ticket. D) Fail to reject H0 . The vehicle color has no effect on the amount of the speeding ticket. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A group of home gardeners want to test whether the type of soil used to grow carrots has an effect on the number of carrots harvested. Gardeners randomly assigned carrot plants to be grown in soil with no fertilizer, commercial plant food, and homemade compost. All other growing conditions were kept the same. Fifty plants were assigned to each group. At the end of the growing season the number of carrots harvested was counted. Use the output below. Assume that other conditions for the ANOVA test have been met.
21) Interpret the boxplots given. Compare medians, interquartile ranges, and shapes, and mention any potential outliers. Answer: The medians are similar, although the median for compost was the largest. The shapes are not strongly skewed, although the plant food group has a shape that is skewed somewhat to the left. There do not appear to be any potential outliers. 22) State the null and alternative hypotheses for carrying out ANOVA with these data. Answer: H0: μnone = μcommercial = μcompost Ha: The mean number of carrots harvested differs by type of fertilizer used. 23) Using the results provided, test the hypothesis that soil treatment affects the number of carrots harvested. Use a significance level of 5%. State the correct decision regarding the null hypothesis and write a sentence summarizing the conclusion. Answer: We cannot reject the null hypothesis that population means are all the same. There's not enough evidence to conclude that there is an association between soil type and number of carrots harvested. Solve the problem. 24) Describe two of the four conditions that must be checked in order for the calculated F-statistic to follow the F-distribution. Answer: The four conditions are (1) random sample and independent measurements, (2) independent groups, (3) same variance, and (4) Normal distribution or large sample.
25) The figure below shows the F-distribution with 5 and 12 degrees of freedom to test the hypothesis that age groups and reading speed are associated. The shaded area represents the p-value. Assume that all conditions for ANOVA have been met. Should the null hypothesis that the age group population means are equal be rejected? What conclusion can be drawn about the association between age group and reading speed?
Answer: Since the p-value is relatively large, the null hypothesis that age group means are equal should not be rejected. Although reading speed may vary from person to person, it doesn't seem to have anything to do with age group. Polychlorinated biphenyls (PCBs), used in manufacturing of many materials, are extremely toxic contaminants when released into the environment. Thirty water samples were taken from each of 6 rivers in the Northeast and analyzed for PCB concentrations (in parts per million).
26) State the null and alternative hypotheses for carrying out ANOVA with these data. Answer: H0: μA = μB = μC = μD = μE = μF Ha: The mean PCB concentrations differ by river. 27) Assume that all ANOVA test conditions have been satisfied. State the correct decision regarding the null hypothesis for the claim that river affects PCB concentrations. Write a sentence summarizing your conclusion. Answer: We cannot reject the null hypothesis that population mean PCB levels are all the same. There's not enough evidence to conclude that there is an association between river and mean PCB concentrations.
2 Determine if an ANOVA Test is Appropriate MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following is not one of the conditions that must be checked in order for the calculated F-statistic to follow the F-distribution? A) The groups are independent of each other. B) Each group's population must be at least 10 times larger than its respective sample. C) The variances or standard deviations of the groups must be equal. D) The distribution of the observations is Normal in each group's population or the sample size is large. Answer: B 2) Choose the statement that is not true about multiple comparisons and ANOVA. Choose D if all the statements are true. A) ANOVA is a method for testing whether there is an association between a categorical variable and a numerical variable. B) When doing multiple comparisons, the response variable is always numerical, but the independent variable can be numerical or categorical. C) When doing a multiple comparison, the overall significance level will increase, meaning it is more likely that an incorrect conclusion will be drawn. D) All of these statements are true. Answer: B 3) Choose the statement that best describes the purpose of ANOVA. A) ANOVA is a procedure for comparing the means of several groups. B) ANOVA is a procedure for comparing different categories for several groups. C) The ANOVA procedure will reveal whether the means of several groups are different and which group or groups have a different mean. D) None of these Answer: A 4) A movie studio did a poll to determine whether women in different age groups watched different amounts of horror movies. Check the computer output to see whether the same-variance condition for ANOVA holds. Is ANOVA appropriate?
A) Yes
B) No
Answer: B
5) A movie studio did a poll to determine whether women in different age groups watched different amounts of horror movies. Check the computer output to see whether the same-variance condition for ANOVA holds. Is ANOVA appropriate?
A) Yes
B) No
Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 6) The waiting time (in minutes) was measured for a random sample of patients with non-life-threatening injuries in emergency rooms at four local hospitals who arrived between 2:00 and 3:00 pm on a particular Saturday. Check the output to see whether the same-variance condition for ANOVA holds. Show your work. Is ANOVA appropriate?
Answer: The smallest SD is 1.60. The largest SD is more than 2 × 1.60, so it appears that variances are not the same. ANOVA would not be appropriate.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 7) Body temperatures (in degrees Fahrenheit) were taken for 5 randomly selected people, each in three different situations: sitting, after exercising, and while sleeping. Explain why it would not be appropriate to use one-way ANOVA to test whether the population mean body temperatures were associated with activity.
Person       A      B      C      D      E
Sitting      98.2   98.6   97.9   98.8   98.3
Exercising   99.0   98.9   98.5   99.2   98.8
Sleeping     97.9   98.5   97.7   98.6   97.8
A) The random sample and independent measurements condition fails because the body temperature measurements within each activity level are not independent of each other. B) The independent groups condition fails because the activity levels are not independent of each other. C) The same variance condition fails because the variances of the body temperature measurements in each activity level are all very different. D) All of the conditions for using a one-way ANOVA test are satisfied. Answer: B 8) The calorie counts for cereals made by three popular cereal manufacturers (Kellogg, General Mills, and Post) were collected and a one-way ANOVA was performed, as displayed in the table below. Do the results indicate that the same-variance condition for performing one-way ANOVA tests is satisfied? Explain why or why not.
A) The same-variance condition is satisfied. General Mills and Post have similar standard deviations and 10.54/10.37 = 1.02 < 2. B) The same-variance condition is not satisfied. General Mills and Post have different standard deviations and 10.54/10.37 = 1.02 < 2. C) The same-variance condition is satisfied. General Mills and Kellogg have similar standard deviations and 22.22/10.37 = 2.14 > 2. D) The same-variance condition is not satisfied. General Mills and Kellogg have different standard deviations and 22.22/10.37 = 2.14 > 2. Answer: D
9) The grams of carbohydrates for cereals made by three popular cereal manufacturers (Kellogg, General Mills, and Post) were collected and a one-way ANOVA was performed, as displayed in the table below. Do the results indicate that the Normal distribution/large sample condition for performing one-way ANOVA tests is satisfied? Explain why or why not.
A) The Normal distribution/large sample condition is satisfied. Although the sample sizes for each group are less than 25, the boxplots and histograms show roughly symmetric distributions. B) The Normal distribution/large sample condition is satisfied. The sample sizes for each group are greater than 25. C) The Normal distribution/large sample condition is not satisfied. The sample sizes for each group are less than 25. D) The Normal distribution/large sample condition is not satisfied. The sample sizes for each group are less than 25 and the boxplots and histograms are skewed. Answer: A
10) The grams of carbohydrates for cereals made by three popular cereal manufacturers (Kellogg, General Mills, and Post) were collected and a one-way ANOVA was performed, as displayed in the table below. Do the results indicate that the same variance condition for performing one-way ANOVA tests is satisfied? Explain why or why not.
A) The same variance condition is satisfied. When you compare the largest standard deviation (Kellogg) to the smallest (Post), you see that 4.465/1.922 = 2.323 > 2. B) The same variance condition is not satisfied. When you compare the largest standard deviation (Kellogg) to the smallest (Post), you see that 4.465/1.922 = 2.323 > 2. C) The same variance condition is satisfied. When you compare the second-smallest standard deviation (General Mills) to the smallest (Post), you see that 3.347/1.922 = 1.741 < 2. D) The same variance condition is not satisfied. When you compare the second-smallest standard deviation (General Mills) to the smallest (Post), you see that 3.347/1.922 = 1.741 < 2. Answer: B
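Worked illustration (not part of the original test bank): the same-variance check used in questions 6, 8, and 10 compares the largest standard deviation to the smallest and treats a ratio of 2 or more as a failure. The sketch below applies that rule of thumb to the standard deviations quoted in the answer choices; the function name and the exact cutoff handling are assumptions.

```python
# Sketch only: rule of thumb that the largest SD should be less than
# twice the smallest SD. Standard deviations are those quoted in the
# answer choices for the two cereal questions.

def same_variance_check(std_devs, max_ratio=2.0):
    """Return (ratio, condition_satisfied) for a collection of group SDs."""
    ratio = max(std_devs) / min(std_devs)
    return ratio, ratio < max_ratio


if __name__ == "__main__":
    calories = {"Kellogg": 22.22, "General Mills": 10.37, "Post": 10.54}
    carbs = {"Kellogg": 4.465, "General Mills": 3.347, "Post": 1.922}
    for label, sds in (("calories", calories), ("carbohydrates", carbs)):
        ratio, ok = same_variance_check(sds.values())
        verdict = "satisfied" if ok else "not satisfied"
        print(f"{label}: ratio {ratio:.3f}, same-variance condition {verdict}")
```

Comparing only the two closest standard deviations, as some of the distractors do, can hide a violation; the check has to use the extremes.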
11.4 Post-Hoc Procedures 1 Perform Post-Hoc Tests MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
1) Do the ANOVA test results warrant a Post-hoc procedure? A) Yes B) No Answer: B
Use the following information to answer the question. Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study --red, white, black, and silver. Thirty vehicles were randomly assigned to each group. Use the output below to answer the question.
2) Do the ANOVA test results warrant a Post-hoc procedure? A) Yes B) No Answer: B Solve the problem. 3) An ANOVA test was conducted to see whether bike frame type (Type A, Type B, or Type C) had an effect on speed over a one mile distance. Test results warranted post-hoc procedures. The Tukey HSD approach was used with the following results:
Group Comparison     98.33% Confidence Interval
Type A - Type B      (-12.70, -4.29)
Type A - Type C      (-19.63, -8.87)
Type B - Type C      (-9.89, -1.61)
Is there evidence that one type of bike frame is faster than the others? Which type of frame appears to be the fastest? A) Yes, the confidence interval results show that frame type A is faster than B or C. B) Yes, the confidence interval results show that frame type B is faster than A or C. C) Yes, the confidence interval results show that frame type C is faster than A or B. D) No, there is not enough evidence to say with confidence that one frame type is faster than the others because none of the confidence intervals contain zero. Answer: A Polychlorinated biphenyls (PCBs), used in manufacturing of many materials, are extremely toxic contaminants when released into the environment. Thirty water samples were taken from each of 6 rivers in the Northeast and analyzed for PCB concentrations (in parts per million).
4) Do the results of the ANOVA test warrant a Post-hoc procedure? A) Yes B) No Answer: B
Researchers want to test whether the color of a vehicle ticketed for speeding has an effect on the amount of the ticket. Four vehicle colors were used for the study - red, white, black, and silver. Thirty vehicles were randomly assigned to each group.
5) Do the ANOVA test results warrant a Post-hoc procedure? A) Yes B) No Answer: A Solve the problem. 6) In a bumper test, three types of sub-compact cars were deliberately crashed into a barrier at 5 mph with the resulting damage (in dollars) recorded. Eight test cars of each type were crashed. Test results indicated that post-hoc procedures were warranted. The TUKEY HSD approach was used with the following results:
Using an overall significance level of 5%, is there evidence that one type of car had less bumper damage than the others? Which car appears to have the least bumper damage? A) Yes, the confidence interval results show that Car 1 has less damage than 2 or 3. B) Yes, the confidence interval results show that Car 2 has less damage than 1 or 3. C) Yes, the confidence interval results show that Car 3 has less damage than 1 or 2. D) No, there is not enough evidence to say that one type of car has less damage than the others because the intervals all overlap. Answer: A
7) In a bumper test, three types of sub-compact cars were deliberately crashed into a barrier at 5 mph with the resulting damage (in dollars) recorded. Eight test cars of each type were crashed. Test results indicated that post-hoc procedures were warranted. The TUKEY HSD approach was used with the following results:
Using an overall significance level of 5%, is there evidence that one type of car had less bumper damage than the others? Which car appears to have the least bumper damage? A) Yes, the confidence interval results show that Car 3 has less damage than 1 or 2. B) Yes, the confidence interval results show that Car 2 has less damage than 1 or 3. C) Yes, the confidence interval results show that Car 1 has less damage than 2 or 3. D) No, there is not enough evidence to say that one type of car has less damage than the others because the intervals all overlap. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Polychlorinated biphenyls (PCBs), used in manufacturing of many materials, are extremely toxic contaminants when released into the environment. Thirty water samples were taken from each of 6 rivers in the Northeast and analyzed for PCB concentrations (in parts per million).
8) When do results from an ANOVA procedure warrant a post-hoc analysis? Answer: Post-hoc procedures are warranted when the ANOVA test results in rejection of the null hypothesis. Rejection of the null hypothesis is an indication that at least one of the group means is different, which would warrant post-hoc procedures.
Solve the problem. 9) In a car bumper test, three types of sub-compact cars were deliberately crashed into a barrier at 5 mph with the resulting damage (in dollars) recorded. Eight test cars of each type were crashed. Test results indicated that post-hoc procedures were warranted. The TUKEY HSD approach was used with the following results:
Is there evidence that one type of car had less mean bumper damage than the others? Which car appears to have least mean bumper damage? Explain. Answer: Car type 1 has a mean bumper damage that is less than the means for the others. There is insufficient evidence to conclude that there is a difference between the mean bumper damage of car type 2 and the mean bumper damage of car type 3.
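Worked illustration (not part of the original test bank): post-hoc confidence intervals are read by checking whether each interval for a difference in means contains zero. The sketch below applies that reading to the three intervals given in the bike frame question earlier in this section; the function name and output wording are hypothetical.

```python
# Sketch only: each interval estimates (first group mean - second group mean).
# Interval values are those listed in the bike frame question.

def interpret(pair, interval):
    """Describe what a confidence interval for a difference in means shows."""
    first, second = pair
    low, high = interval
    if low <= 0 <= high:
        return f"{first} vs {second}: no evidence of a difference"
    # An entirely negative interval means the first group's mean is smaller.
    lower = first if high < 0 else second
    return f"{first} vs {second}: means differ; {lower} has the smaller mean"


if __name__ == "__main__":
    intervals = {
        ("Type A", "Type B"): (-12.70, -4.29),
        ("Type A", "Type C"): (-19.63, -8.87),
        ("Type B", "Type C"): (-9.89, -1.61),
    }
    for pair, interval in intervals.items():
        print(interpret(pair, interval))
```

Whether the smaller mean counts as better depends on what was measured: for bumper damage in dollars a smaller mean is preferable, while for a timing or speed measurement the direction has to be read from the context of the question.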
Ch. 11 Multiple Comparisons and Analysis of Variance Answer Key 11.1 Multiple Comparisons 1 Choose an Appropriate Test 1) B 2) C 3) D 4) C 5) B 6) C 7) C 8) A 9) ANOVA test 10) Chi-square test 2 Use the Bonferroni Correction 1) A 2) B 3) A 4) C 5) D 6) B 7) D 8) C 9) D 10) C 11) 21 comparisons 12) 0.005
11.2 The Analysis of Variance 1 Compare F-values using Boxplots 1) C 2) A 3) B 4) C 5) D 2 Interpret ANOVA Results 1) B 2) C 3) C 4) A 5) B 6) B 7) B 8) B 9) C 10) C 11) D 12) The overall significance level is the probability that you will mistakenly reject the null hypothesis in at least one of several hypothesis tests. When multiple comparisons are made the overall significance level will increase. 13) The test statistic is F and is calculated by dividing the variation between groups by the variation within groups. 14) There is more variation between groups. Since F - the ratio of variation between groups to the variation within groups - is relatively large, it is likely that an association exists. 15) H0: Food product and amount of vanadium are not associated; Ha: Food product and amount of vanadium are associated. 16) F = 2.88 17) Variation between groups 3 Complete an ANOVA Table 1) A 2) B
11.3 The ANOVA Test 1 Perform an ANOVA Test 1) D 2) A 3) B 4) A 5) C 6) D 7) B 8) B 9) B 10) A 11) A 12) A 13) D 14) B 15) B 16) C 17) B 18) C 19) B 20) A 21) The medians are similar, although the median for compost was the largest. The shapes are not strongly skewed, although the plant food group has a shape that is skewed somewhat to the left. There do not appear to be any potential outliers. 22) H0: μnone = μcommercial = μcompost Ha: The mean number of carrots harvested differs by type of fertilizer used. 23) We cannot reject the null hypothesis that population means are all the same. There's not enough evidence to conclude that there is an association between soil type and number of carrots harvested. 24) The four conditions are (1) random sample and independent measurements, (2) independent groups, (3) same variance, and (4) Normal distribution or large sample. 25) Since the p-value is relatively large, the null hypothesis that age group means are equal should not be rejected. Although reading speed may vary from person to person, it doesn't seem to have anything to do with age group. 26) H0: μA = μB = μC = μD = μE = μF Ha: The mean PCB concentrations differ by river. 27) We cannot reject the null hypothesis that population mean PCB levels are all the same. There's not enough evidence to conclude that there is an association between river and mean PCB concentrations. 2 Determine if an ANOVA Test is Appropriate 1) B 2) B 3) A 4) B 5) A 6) The smallest SD is 1.60. The largest SD is more than 2 × 1.60, so it appears that variances are not the same. ANOVA would not be appropriate. 7) B 8) D 9) A 10) B
11.4 Post-Hoc Procedures 1 Perform Post-Hoc Tests 1) B 2) B 3) A 4) B 5) A 6) A 7) C 8) Post-hoc procedures are warranted when the ANOVA test results in rejection of the null hypothesis. Rejection of the null hypothesis is an indication that at least one of the group means is different, which would warrant post-hoc procedures. 9) Car type 1 has a mean bumper damage that is less than the means for the others. There is insufficient evidence to conclude that there is a difference between the mean bumper damage of car type 2 and the mean bumper damage of car type 3.
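Answer 12 in section 11.2 states that the overall significance level grows when multiple comparisons are made. A minimal sketch of that growth follows. It assumes the individual tests are independent, which is a simplifying assumption and not something the answer key states. The function name `overall_significance` is hypothetical.

```python
def overall_significance(alpha, num_tests):
    """Probability of at least one mistaken rejection across independent tests.

    Assumes every null hypothesis is true and the tests are independent.
    """
    return 1 - (1 - alpha) ** num_tests


# Each test is run at the 5% level; only the number of comparisons changes.
for k in (1, 3, 10):
    print(k, round(overall_significance(0.05, k), 3))
```

With three pairwise comparisons at the 5% level, the chance of at least one false rejection is already about 14%, which is why post-hoc procedures adjust the individual significance levels.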
Ch. 12 Experimental Design: Controlling Variation 12.1 Variation Out of Control 1 Discuss the Design of Studies MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Provide an appropriate response. 1) A drug company wanted to test a new indigestion medication. The researchers found 100 adults aged 25-35 and randomly assigned them to two groups. The first group received the new drug, while the second received a placebo. After one month of treatment, the percentage of each group whose indigestion symptoms decreased was recorded and compared. What is the response variable in this experiment? A) the percentage who had decreased indigestion symptoms B) the type of drug (medication or placebo) C) the 100 adults aged 25-35 D) the one month treatment time Answer: A 2) A drug company wanted to test a new depression medication. The researchers found 700 adults aged 25-35 and randomly assigned them to two groups. The first group received the new drug, while the second received a placebo. After one month of treatment, the percentage of each group whose depression symptoms decreased was recorded and compared. What is the treatment in this experiment? A) the drug B) the percentage who had decreased depression symptoms C) the 700 adults aged 25-35 D) the one month treatment time Answer: A 3) A medical journal published the results of an experiment on insomnia. The experiment investigated the effects of a controversial new therapy for insomnia. Researchers measured the insomnia levels of 97 adult women who suffer moderate conditions of the disorder. After the therapy, the researchers again measured the women's insomnia levels. The differences between the pre- and post-therapy insomnia levels were reported. What is the response variable in this experiment?
A) the differences between the pre- and post-therapy insomnia levels B) the 97 adult women who suffer from insomnia C) the disorder (insomnia or no insomnia) D) the therapy Answer: A 4) A medical journal published the results of an experiment on anorexia. The experiment investigated the effects of a controversial new therapy for anorexia. Researchers measured the anorexia levels of 78 adult women who suffer moderate conditions of the disorder. After the therapy, the researchers again measured the women's anorexia levels. The differences between the pre- and post-therapy anorexia levels were reported. What is the treatment in this experiment? A) the therapy B) the 78 adult women who suffer from anorexia C) the disorder (anorexia or no anorexia) D) the differences between the pre- and post-therapy anorexia levels Answer: A
5) A farmer wishes to test the effects of a new fertilizer on her tomato yield. She has four equal-sized plots of land: one with sandy soil, one with rocky soil, one with clay-rich soil, and one with average soil. She divides each of the four plots into three equal-sized portions and randomly labels them A, B, and C. The four A portions of land are treated with her old fertilizer. The four B portions are treated with the new fertilizer, and the four C's are treated with no fertilizer. At harvest time, the tomato yield is recorded for each section of land. What is the response variable in this experiment? A) the tomato yield recorded for each section of land B) the type of fertilizer (old, new, or none) C) the section of land (A, B, or C) D) the four types of soil Answer: A 6) A farmer wishes to test the effects of a new fertilizer on her tomato yield. She has four equal-sized plots of land: one with sandy soil, one with rocky soil, one with clay-rich soil, and one with average soil. She divides each of the four plots into three equal-sized portions and randomly labels them A, B, and C. The four A portions of land are treated with her old fertilizer. The four B portions are treated with the new fertilizer, and the four C's are treated with no fertilizer. At harvest time, the tomato yield is recorded for each section of land. What is the treatment in this experiment? A) the fertilizers B) the tomato yield recorded for each section of land C) the section of land (A, B, or C) D) the four types of soil Answer: A Solve the problem. 7) A chocolate chip manufacturer has developed a new recipe for its chocolate chips and is planning a consumer taste test. Researchers in the test kitchen want to measure whether there is a difference in consumer opinions between the old recipe and the new recipe. Company researchers believe that consumer reaction to the new chip will depend on age so they decide to block on age.
To do this, they create blocks for consumers between the ages 12-18 years, 19-25 years, 26-32 years, and 32 years and older. They then randomly assign subjects in each block to taste cookies made with the new and old recipe chocolate chip. Following the taste test, participants respond to a questionnaire about the cookies they tasted. Is this an effective design for this study? A) Yes B) No Answer: A 8) A ranch salad dressing manufacturer has developed a new recipe for its ranch dressing and is planning a consumer taste test. Researchers in the test kitchen want to measure whether there is a difference in consumer opinions between the old recipe and the new recipe. Company researchers believe that consumer reaction to the new dressing will depend on gender so they decide to block on gender. To do this, they create blocks for female adult consumers and male adult consumers. They then randomly assign subjects in each block to taste dressings using the new and old recipe. Following the taste test, participants respond to a questionnaire about the dressings they tasted. Is this an effective design for this study? A) No B) Yes Answer: B 9) Suppose a new engine additive is designed to improve performance in race cars and researchers wish to test the effectiveness of the new additive. On a closed race course, the individual race times for forty race cars are recorded. The race times (on the same course) for each of the cars after being treated with the engine additive are then recorded. Which test design would be most appropriate for this scenario? A) Two-sample t-test B) Paired t-test C) None of these Answer: B
10) Suppose a sociologist is interested in finding out if there is an association between gender and opinions on human cloning. Which test design would be most appropriate for this scenario? A) Two-sample t-test B) Chi-square C) Paired t-test D) ANOVA Answer: B Scientists were interested in determining if the amount of nicotine present in tobacco leaves is related to the amount of water received by the plants. In a greenhouse, a total of 120 tobacco plants were selected. During the growing season, 30 plants were randomly assigned to receive the normal amount of water, 30 plants received 75% of the normal amount of water, 30 plants received 50% of the normal amount of water, and 30 plants received 25% of the normal amount of water. At the end of the growing season, one leaf was selected at random from each plant and the amount of nicotine present was recorded for each. Researchers determined that leaves from tobacco plants receiving the normal amount of water during the growing season had more nicotine than leaves from tobacco plants receiving less water. 11) For this experiment, identify the response and treatment variables. A) Response: Amount of nicotine in leaf. Treatment: Amount of water received by the plant. B) Response: Amount of water received by the plant. Treatment: Amount of nicotine in leaf. C) Response: Amount of nicotine in leaf. Treatment: Normal amount of water. D) Response: Normal amount of water. Treatment: Amount of nicotine in leaf. Answer: A 12) Was this an observational study or a controlled experiment? A) Observational study because the plants were not a random sample. B) Observational study because the researchers determined how much water each group of plants would receive. C) Experimental because researchers determined, through randomized assignment, which plants belonged to which treatment group. D) Experimental because the researchers could not control the amount of nicotine in the plants. 
Answer: C 13) Choose the statement that restates the conclusions of the study in terms of a cause-and-effect conclusion. A) Reduced amounts of water during the growing season are associated with a decrease in the amount of nicotine. B) Normal amounts of water during the growing season produced less nicotine in tobacco leaves than reduced amounts of water. C) Normal amounts of water during the growing season produced more nicotine in tobacco leaves than reduced amounts of water. D) The amount of water received by tobacco plants during the growing season does not affect the amount of nicotine in the tobacco leaves. Answer: C A tire manufacturer has made some modifications to its most popular tire and wants to see if the tread on the modified tire will last longer than the original tire. Because there is much variation in the way people drive, the manufacturer decides to use blocking to reduce this variation. Twenty cars will be used to test the tires. Ten cars are randomly selected to receive an original tire on the right rear wheel and a modified tire on the left rear wheel. The remaining ten cars will receive an original tire on the left rear wheel and a modified tire on the right rear wheel. Each car will then be driven 10,000 miles and the amount of wear recorded for each wheel. 14) Is this a correct use of blocking? A) Yes B) No Answer: A 15) Which test design would be most appropriate for this scenario? A) Two-sample t-test B) Paired t-test C) Two-sample z-test D) Paired z-test Answer: B
Which treatment is most effective at treating carpet stains: vinegar, ammonia, or hydrogen peroxide? In a study, researchers randomly assigned 55 carpet samples with identical stains to one of three groups. Depending on which group they were assigned to, each carpet sample received a home remedy of water mixed with vinegar, ammonia, or hydrogen peroxide. Afterwards, each carpet sample was examined and any remaining stain was measured. Researchers found that a vinegar and water mixture produced better results than both ammonia and hydrogen peroxide. 16) Choose the statement that restates the conclusion of the study in terms of a cause-and-effect conclusion. A) A water and vinegar mixture effectively treats carpet stains compared to ammonia and hydrogen peroxide. B) People who use water and vinegar to treat carpet stains will have fewer visible carpet stains. C) A water and ammonia mixture effectively treats carpet stains compared to ammonia and hydrogen peroxide. D) The type of treatment received is associated with the amount of stain removed. Answer: A Solve the problem. 17) Scientists were interested in determining if the amount of nicotine present in tobacco leaves is related to the amount of water received by the plants. In a greenhouse, a total of 120 tobacco plants were selected. During the growing season, 30 plants received the normal amount of water, 30 plants received 75% of the normal amount of water, 30 plants received 50% of the normal amount of water, and 30 plants received 25% of the normal amount of water. At the end of the growing season, one leaf was selected from each plant and the amount of nicotine present was recorded for each. Researchers determined that leaves from tobacco plants receiving the normal amount of water during the growing season had more nicotine than leaves from tobacco plants receiving less water. For this experiment, identify the response and treatment variables. A) Response: Amount of nicotine in leaf.
Treatment: Normal amount of water. B) Response: Normal amount of water. Treatment: Amount of nicotine in leaf. C) Response: Amount of nicotine in leaf. Treatment: Amount of water received by the plant. D) Response: Amount of water received by the plant. Treatment: Amount of nicotine in leaf. Answer: C 18) Which statement is not true about randomized block designs? A) Each block should have objects in all treatment and control groups. B) A block consists of objects that are similar on one or more variables. C) Blocks are randomly assigned to each of the treatment and control groups. D) All of these statements are true. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 19) Does listening to music improve efficiency for mundane tasks? In a study, researchers randomly assigned 50 similar adults to one of three groups. All subjects were asked to stuff 400 envelopes. Depending on which group they were assigned to, subjects heard Top 40 music, classical music, or no music while they worked. Researchers recorded how long it took the participants to stuff the envelopes. Researchers found that the adults that listened to no music completed the work faster than both the group that listened to Top 40 music and classical music. For this controlled experiment, state the treatment and the response variables. Answer: The treatments were Top 40 music, classical music, and no music. The response variable was the time it took participants to stuff the 400 envelopes.
20) A publisher is considering publishing a new magazine about environmentally-friendly living in the city and plans to gauge interest in the magazine using potential consumers. Company researchers want to measure whether there is a difference in opinion between the new magazine and the competing publication. Company researchers believe that consumer reaction to the new magazine will depend on age so they decide to block on age. To do this, they create blocks for consumers between the ages 18-24 years, 25-32 years, 33-45 years, and 46 years and older. They then randomly select two of the blocks to read the first issue of the new magazine and the other two blocks to read a similar competing publication. Following the test, participants respond to a questionnaire about the magazine they read. Is this an effective design for the study? If not, describe an improvement. Answer: This is not an effective design for the study because researchers randomly assigned entire blocks to treatment groups. To improve the study they should randomize within blocks. 21) Suppose the president of a large company is in the process of deciding whether to adopt a lunchtime exercise program. The purpose of the program is to improve the health of the employees and thus reduce the medical expenses. To get more information, he instituted the exercise program for all the employees in one office. The president knows that during the winter months medical expenses are relatively high because of high occurrences of colds and flu. Therefore, he records the medical expenses of the employees in the selected office for each of the twelve months prior to the exercise program and then for each of the twelve months after the exercise program is initiated. The "before" and "after" expenses are compared on a month-to-month basis. Which test design would be the most appropriate for this scenario?
Answer: A paired t-test is the most appropriate test because the medical expenses were compared for the same months both before and after the implementation of the exercise program. Which treatment is most effective at treating bathroom mildew: water mixed with vinegar, ammonia, or soap? In a study, researchers randomly assigned 55 similar tile samples with similar amounts of mildew to one of three groups. Depending on which group they were assigned to, each tile sample received a home remedy of water mixed with vinegar, ammonia, or soap. Afterwards, each tile sample was examined and any remaining mildew was measured. Researchers found that a soap and water mixture produced better results than both vinegar and ammonia. 22) Is this a controlled experiment or an observational study? Explain. Answer: Controlled experiment. 23) Write a statement that restates the conclusion of the study in terms of a cause-and-effect conclusion. Answer: A soap and water mixture effectively treats mildew stains on tile compared to vinegar and ammonia. These two headlines are on the same topic. Headline A: Children who view advertisements for fast food dramatically increase their risk of becoming obese adolescents, a new study suggests. Headline B: Viewing advertisements for fast food leads to childhood obesity, a new study finds. 24) Headline B implies a cause-and-effect relationship. This is problematic because it is inappropriate to make cause-and-effect statements based on observational studies. Confounding factors will vary. Answer: This is most likely an observational study since it would not be ethical to purposely try to cause childhood obesity.
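The answer to question 20 recommends randomizing within blocks instead of assigning whole blocks to a treatment. One way that assignment could be carried out is sketched below. The function name `randomize_within_blocks`, the subject IDs, and the block sizes are hypothetical; only the age blocks and the two magazines come from the question.

```python
import random


def randomize_within_blocks(blocks, treatments, seed=None):
    """Assign treatments at random inside each block, as evenly as possible."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, subjects in blocks.items():
        shuffled = list(subjects)
        rng.shuffle(shuffled)
        # Deal the shuffled subjects out to the treatments in rotation so
        # that every treatment appears in every block.
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment


# Hypothetical subject IDs; the age blocks follow question 20.
blocks = {
    "18-24": ["s1", "s2", "s3", "s4"],
    "25-32": ["s5", "s6", "s7", "s8"],
    "33-45": ["s9", "s10", "s11", "s12"],
    "46+": ["s13", "s14", "s15", "s16"],
}
treatments = ["new magazine", "competing publication"]
assignment = randomize_within_blocks(blocks, treatments, seed=1)
for subject, (block, treatment) in assignment.items():
    print(subject, block, treatment)
```

Because both magazines are read inside every age block, a difference in opinion between the magazines can no longer be explained by age alone.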
2 Identify Observational and Experimental Studies MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Which treatment is most effective at treating head lice: Benzyl alcohol lotion, dandruff shampoo, or mayonnaise? In a study, researchers randomly assigned 55 subjects with head lice to one of three groups. All subjects had similar cases of head lice. Depending on which group they were assigned to, subjects received an over-the-counter remedy of Benzyl alcohol lotion, dandruff shampoo, or mayonnaise, which was applied by a trained professional. Afterwards, each subject was examined and any lice or eggs found were counted. Researchers found that Benzyl alcohol lotion produced better results than both dandruff shampoo and mayonnaise. 1) Was this an observational study or a controlled experiment? A) Observational Study B) Controlled Experiment Answer: B Use the following information to answer the question. Which treatment is most effective at treating carpet stains: vinegar, ammonia, or hydrogen peroxide? In a study, researchers randomly assigned 55 carpet samples with identical stains to one of three groups. Depending on which group they were assigned to, each carpet sample received a home remedy of water mixed with vinegar, ammonia, or hydrogen peroxide. Afterwards, each carpet sample was examined and any remaining stain was measured. Researchers found that a vinegar and water mixture produced better results than both ammonia and hydrogen peroxide. 2) Was this a controlled experiment or an observational study? A) Controlled Experiment B) Observational Study Answer: A Use the following information to answer the question. These two headlines are on the same topic. Headline A: Gaining weight after giving birth for the first time leads to pregnancy-related diabetes during second pregnancy, a new study finds.
Headline B: Women who gain weight after giving birth for the first time dramatically increase their risk of developing pregnancy-related diabetes during their second pregnancy, a new study suggests. 3) Was the study referenced most likely a controlled experiment or an observational study? A) Controlled experiment B) Observational study Answer: B Use the following information to answer the question. These two headlines are on the same topic. Headline A: Married men who gain weight following the birth of their first child dramatically increase their risk of further weight gain following the birth of more children, a new study suggests. Headline B: Married male weight gain after the birth of the first child leads to additional weight gain following the birth of more children, a new study finds. 4) Was the study referenced most likely a controlled experiment or an observational study? A) Observational study B) Controlled experiment Answer: A
A tire manufacturer has made some modifications to its most popular tire and wants to see if the tread on the modified tire will last longer than the original tire. Because there is much variation in the way people drive, the manufacturer decides to use blocking to reduce this variation. Twenty cars will be used to test the tires. Ten cars are randomly selected to receive an original tire on the right rear wheel and a modified tire on the left rear wheel. The remaining ten cars will receive an original tire on the left rear wheel and a modified tire on the right rear wheel. Each car will then be driven 10,000 miles and the amount of wear recorded for each wheel. 5) Was the study described above most likely an observational study or a controlled experiment? A) Experimental because the researchers placed both types of tires on each car, randomly assigning one type of tire to the right side and the other type to the left side. B) Experimental because the researcher could not control how fast each car was driven. C) Observational study because the researcher decided which car received which type of tire. D) Observational study because the tires were not a random sample. Answer: A Which treatment is most effective at treating carpet stains: vinegar, ammonia, or hydrogen peroxide? In a study, researchers randomly assigned 55 carpet samples with identical stains to one of three groups. Depending on which group they were assigned to, each carpet sample received a home remedy of water mixed with vinegar, ammonia, or hydrogen peroxide. Afterwards, each carpet sample was examined and any remaining stain was measured. Researchers found that a vinegar and water mixture produced better results than both ammonia and hydrogen peroxide. 6) Was this a controlled experiment or an observational study? 
A) Experimental because researchers determined, through randomized assignment, which treatment each carpet sample received B) Experimental because researchers could not determine how much of the stain was removed C) Observational study because researchers determined which treatment each carpet sample received D) Observational study because the carpet samples were not a random sample Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. These two headlines are on the same topic. Headline A: Children who view advertisements for fast food dramatically increase their risk of becoming obese adolescents, a new study suggests. Headline B: Viewing advertisements for fast food leads to childhood obesity, a new study finds. 7) Was the study referenced most likely a controlled experiment or an observational study? Explain. Answer: This is most likely an observational study since it would not be ethical to purposely try to cause childhood obesity. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 8) Suppose researchers wanted to determine the effects of smoking on lung capacity. They studied 150 adults who smoke cigarettes regularly and 128 adults who do not smoke and found that the people who regularly smoked have smaller lung capacities. Would this be considered a controlled experiment or an observational study? Explain. A) This is a controlled experiment because researchers assigned participants to control and treatment groups. B) This is a controlled experiment because researchers did not assign participants to control and treatment groups. C) This is an observational study because researchers assigned participants to control and treatment groups. D) This is an observational study because researchers did not assign participants to control and treatment groups. Answer: D
9) In 2015, Assistance Publique Hopitaux De Marseille conducted a study to determine if relaxation optimized by virtual reality helped with anxiety disorders more than classical relaxation. 58 adults who had previously been diagnosed with general anxiety disorders were randomly assigned to one of the two relaxation groups. Would this be considered a controlled experiment or an observational study? Explain. A) This is a controlled experiment because researchers assigned participants to control and treatment groups. B) This is a controlled experiment because researchers could control anxiety levels. C) This is an observational study because researchers could not control anxiety levels. D) This is an observational study because researchers did not assign participants to control and treatment groups. Answer: A 10) Suppose researchers wanted to determine the effects of smoking on lung capacity. They studied 150 adults who smoke cigarettes regularly and 128 adults who do not smoke and found that the people who regularly smoked have smaller lung capacities. Can a causal conclusion be made from this study? Explain. A) A causal conclusion can be made because this was an observational study. B) A causal conclusion can be made because this was a controlled experiment. C) A causal conclusion cannot be made because this was an observational study. D) A causal conclusion cannot be made because this was a controlled experiment. Answer: C 3 Explain the Power of Studies MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following is not a primary factor that affects the power of a test? A) Natural variability within the population B) The size of the true difference between treatment groups C) Sample size D) The sample standard deviation Answer: D 2) Choose the statement that best describes the power of a test.
A) The power of a test is the probability that the null hypothesis will be accepted when it is true. B) The power of a test is the probability that the alternative hypothesis will be rejected when it is false. C) The power of a test is the probability that the null hypothesis will be rejected when it is false. D) None of these statements describes the power of a test. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) List three factors that affect the statistical power of a test. Be sure to list the factor over which researchers actually have some control. Answer: Sample size, the size of the true difference between the groups, and the natural variability within the population. Researchers have control over sample size only.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 4) Suppose two studies of a meditation program are created that claim to help people lower their Body Mass Index (BMI) values. The first study is based on a random sample of 100 adults who participate in the meditation program for 4 months. A hypothesis test is performed to determine whether their mean BMI change from the start of the program to the end is negative (i.e. BMI values decreased). The second study is based on a random sample of 100 women who followed the meditation program for 4 months. The same hypothesis test is performed to determine whether their mean BMI change is negative (i.e. BMI values decreased). Which study will most likely have more variability in the populations from which the samples are drawn? Explain. A) The first study will most likely have more variability in the populations because it includes only women in the sample. B) The first study will most likely have more variability in the populations because it includes both men and women in the sample. C) The second study will most likely have more variability in the populations because it includes only women in the sample. D) The second study will most likely have more variability in the populations because it includes both men and women in the sample. Answer: B 5) Suppose two studies of a meditation program are created that claim to help people lower their Body Mass Index (BMI) values. The first study is based on a random sample of 100 adults who participate in the meditation program for 4 months. A hypothesis test is performed to determine whether their mean BMI change from the start of the program to the end is negative (i.e. BMI values decreased). The second study is based on a random sample of 100 women who followed the meditation program for 4 months. The same hypothesis test is performed to determine whether their mean BMI change is negative (i.e. BMI values decreased). 
If we assume the meditation program is more effective for women than men, which study will have more power? Explain. A) The first study will have more power because it is drawing a sample from a population that has more variability. B) The first study will have more power because it is drawing a sample from a population that has less variability. C) The second study will have more power because it is drawing a sample from a population that has more variability. D) The second study will have more power because it is drawing a sample from a population that has less variability. Answer: D 6) Suppose Investment Company A can actually get clients 3% return on their investments, on average. Investment Company B uses a more aggressive approach and can actually get clients 5% return on their investments, on average. You have been hired by a regulating agency to test the companies’ claims that their investment programs increase return percentages. For both companies, you will collect a random sample of clients who will invest money and then test the hypothesis that the mean return is greater than 0%. Suppose that it is important to keep the power of both studies at 85%. Which study would require a larger sample size? Explain. A) The study for Investment Company A would require a larger sample size because the true effect of their investment program is larger. B) The study for Investment Company A would require a larger sample size because the true effect of their investment program is smaller. C) The study for Investment Company B would require a larger sample size because the true effect of their investment program is larger. D) The study for Investment Company B would require a larger sample size because the true effect of their investment program is smaller. Answer: B
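Questions 3 through 6 tie power to sample size, the size of the true effect, and population variability. The sketch below illustrates the relationship in question 6 under a deliberately simplified model: a one-sided z-test with a known standard deviation. The value `sigma=10` (percentage points) is invented for the example and is not given in the question, and both function names are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def power_one_sided_z(effect, sigma, n, alpha=0.05):
    """Power of a one-sided z-test of H0: mu = 0 against Ha: mu > 0."""
    z_crit = NormalDist().inv_cdf(1 - alpha)
    return 1 - NormalDist().cdf(z_crit - effect * sqrt(n) / sigma)


def sample_size_for_power(effect, sigma, power=0.85, alpha=0.05):
    """Smallest n that reaches the target power under the same z-test model."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    z_power = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_power) * sigma / effect) ** 2)


# sigma = 10 is a made-up value used only to make the comparison concrete.
n_a = sample_size_for_power(effect=3, sigma=10)  # Company A, true mean return 3%
n_b = sample_size_for_power(effect=5, sigma=10)  # Company B, true mean return 5%
print(n_a, n_b)
```

Under these assumptions the smaller true effect (Company A) needs the larger sample to reach 85% power, which matches answer B. Raising `sigma` raises both sample sizes, which is the variability effect behind questions 4 and 5.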
12.2 Controlling Variation in Surveys 1 Identify the Type of Sampling Used MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Purple Loosestrife is considered an invasive plant in Michigan. To detect the presence of Purple Loosestrife on public land, environmental researchers partition land into one-acre parcels then randomly select a sample of parcels to be fully inspected for the presence of Purple Loosestrife. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: C 2) Suppose a manufacturer of rearview mirrors decides to inspect every fifteenth part for defects. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: A 3) Suppose state lawmakers are interested in finding out whether a newly instituted elementary school program about bullying is effective. A statistician divides the state into three regions then randomly selects a sample of thirty elementary schools from each region. Students at the selected schools complete a questionnaire about bullying. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: B 4) Suppose a gumball manufacturer decides to inspect every twenty-fifth gumball for defects. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: A 5) Suppose state lawmakers are interested in finding out whether a newly instituted state program about distracted driving is effective. A statistician divides the state into three regions then randomly selects a sample of adult licensed drivers from each region. Participants are then asked to fill out a questionnaire about distracted driving. What kind of sampling does this illustrate?
A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: B 6) The Sirex Noctilio, a wood wasp, is considered an invasive species in Michigan and can harm and even kill pine trees. To detect the presence of the wood wasp on public land, environmental researchers partition land into one-acre parcels then randomly select a sample of parcels to be fully inspected for the presence of the wood wasps in pine trees. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: C 7) Suppose a large endowment has been left to your city by a private donor. The city council has decided that the endowment should be used to build a recreation park near the center of town. One option is to build a skate park. It is decided that a stratified sampling plan would be the best way to get input from the public since opinions within a stratum are likely to be similar. Which method for stratifying seems to be most reasonable for this scenario? A) Stratify by age. B) Stratify by gender. C) Stratify by income. D) None of these. Answer: A
8) An office building janitorial company is interested in the opinions of people who visit the office building on a typical day. The company thinks that opinions of people who visit the office building in the morning could be different from those of people who visit later in the day, and it does not want a biased sample. Which sampling method is likely to result in an unbiased sample? A) A large sample of people who visit the building that have been stratified by age. B) A systematic sample of every tenth person who visits the office building throughout the day. C) A cluster sample of all the people who visit the office building between 8:00 am and 9:00 am and between 5:00 pm and 6:00 pm. D) None of these. Answer: B 9) Suppose a large endowment has been left to your city by a private donor. The city council has decided that the endowment should be used to build an activity center near the center of town. One option is to build an activity center for senior citizens. It is decided that a stratified sampling plan would be the best way to get input from the public since opinions within a stratum are likely to be similar. Which method for stratifying seems to be most reasonable for this scenario? A) Stratify by gender. B) Stratify by age. C) Stratify by income. D) None of these. Answer: B 10) A newspaper reporter is interested in the outcome of a local election on a controversial issue. She knows that opinions of voters who visit the polls in the morning could be different from those of voters who visit later in the day, and she does not want a biased sample. Which sampling method is likely to result in an unbiased sample? A) A large sample of voters that have been stratified by age. B) A cluster sample of all the voters who visit the polls between 8:00 am and 9:00 am and between 5:00 pm and 6:00 pm. C) A systematic sample of every tenth voter throughout the day. D) None of these. Answer: C 11) Which of the following is not a benefit of a stratified sampling plan? 
A) Members of a stratum, or similar group, are likely to respond the same, which leads to results with lower variability. B) Statistics from a stratum will reflect the parameters of the population (from which the stratum was drawn) as a whole. C) Increased precision D) All of these are benefits of a stratified sampling plan. Answer: B 12) Which of the following statements is not true about a cluster sampling plan? A) Cluster sampling can make it easier to access very large populations. B) If carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. C) In cluster sampling, the clusters contain objects that are as similar as possible. D) All of these statements are true of a cluster sampling plan. Answer: C 13) Which of the following statements is not true about a systematic sampling plan? A) With systematic sampling, objects from the population are sampled at regular intervals. B) Systematic sampling works best when objects are received in sequence during a specific time period and the characteristic of interest can be reasonably assumed to be randomly mixed during the time period in which the data will be collected. C) A systematic sampling plan is often used for exit polls and quality control studies. D) All of these statements are true for a systematic sampling plan. Answer: D
14) Which of the following statements is not true about a cluster sampling plan? A) Cluster sampling can make it easier to access very large populations. B) If carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. C) In cluster sampling, some natural or convenient distinction is used to divide the population. D) All of these statements are true of a cluster sampling plan. Answer: D 15) Which of the following statements is not true about a systematic sampling plan? A) A systematic sampling divides the population into mini-populations that will have lower variability. B) With systematic sampling, objects from the population are sampled at regular intervals. C) Systematic sampling works best when objects are received in sequence during a specific time period and the characteristic of interest can be reasonably assumed to be randomly mixed during the time period in which the data will be collected. D) All of these statements are true for a systematic sampling plan. Answer: A 16) A coin collector has over 54 books containing varying numbers of coins of various denominations. He is very interested in knowing the value of the collection, but does not want to take the time to look up the value of each coin in the collection. He decides to randomly select 4 books and determines the value of each coin in each of the 4 books by looking up the coin in an online pricing book. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: C 17) The personnel manager of a corporation wants to estimate, for one year, the total number of days used for sick leave among all 116 plants in his firm. The 116 plants are divided into 30 "small" plants, 45 "medium" plants, and 41 "large" plants. He decides to randomly select 40 employees from each of the three sizes of plants. What kind of sampling does this illustrate? 
A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: B 18) In order to estimate the average amount of money due on 484 open accounts, the accountant at a large manufacturing company randomly selects 25 open accounts to audit. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: D 19) Many landfills and dumps in the United States contain toxic materials. To see what proportion of the landfill contains toxic materials, the landfill is divided into 10,000 square grids that are each 10 feet by 10 feet square. After selecting one square grid at random to test for toxic materials, every five hundredth grid after that is selected for testing. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: A 20) Suppose a gumball manufacturer decides to inspect every twenty-fifth gumball for defects. What kind of sampling does this illustrate? A) Systematic B) Stratified C) Cluster D) Random Sampling Answer: A 21) Which of the following statements is true about a cluster sampling plan? A) Members of a cluster are likely to respond the same, which leads to results with lower variability. B) With cluster sampling, objects from the population are sampled at regular intervals. C) A cluster sampling plan is often used for exit polls and quality control studies. D) Cluster sampling can make it easier to access very large populations. Answer: D
22) Which of the following statements is true about a stratified sampling plan? A) In stratified sampling, objects in the same stratum are as varied as possible. B) Members of the same stratum are likely to respond the same, which leads to results with lower variability. C) In stratified sampling, several strata are randomly selected and data is collected on every member of the selected strata. D) Stratified sampling works best when objects are received in sequence during a specific time period and the characteristic of interest can be reasonably assumed to be randomly mixed during the time period in which the data will be collected. Answer: B 23) The city council of a moderate size city wants to gather input from the citizens of the city on the proposed expansion to the local airport. It has been decided that a stratified sampling plan would be the best way to get the desired information because opinions within the strata are likely to be similar. Which method of stratifying seems to be the most reasonable for this situation? A) Stratify by gender. B) Stratify by age. C) Stratify by distance from the airport. D) Stratify by income. Answer: C 24) The quality control section of an industrial firm wishes to estimate the average fill of 12-ounce cans coming off an assembly line. It is expected that as the machine dispensing the liquid into the cans heats up during the day, the amount of liquid dispensed changes. Which sampling method is likely to result in an unbiased sample? A) A systematic sample of every 50th can throughout the day. B) A simple random sample of cans filled after 12:00 pm. C) A cluster sample of all cans filled between 8:00 am and 9:00 am and between 4:00 pm and 5:00 pm. D) A simple random sample of cans filled before 12:00 pm. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 
25) A corporation desires to estimate the total number of worker-hours lost for a given month because of accidents among all employees. Because laborers, technicians, and administrators have different accident rates, the researcher decides to randomly select employees from each of the three groups and records the number of worker-hours lost for the given month for each employee. What kind of sampling does this illustrate? Answer: Stratified 26) A researcher is interested in knowing how satisfied people who live in independent living facilities are with their living conditions. To answer this question, the researcher randomly selects 50 independent living facilities from throughout the United States and asks each resident of the selected facilities how satisfied he/she is with his/her current living conditions. What kind of sampling does this illustrate? Answer: Cluster 27) Recently, it was reported that the proportion of people using seatbelts while driving has begun to drop after years of steady increases. To determine the proportion of drivers using seatbelts in a particular community, it was decided to set up a checkpoint on a busy road and stop every twentieth driver to see if he/she was using a seatbelt. What kind of sampling does this illustrate? Answer: Systematic 28) Describe one benefit to a stratified sampling plan. Answer: Some of the benefits of a stratified sampling plan are increased precision and decreased variability within strata.
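The four sampling plans named in this section can be sketched in a few lines of code. The functions and data below are hypothetical illustrations, not part of the test bank; they only mirror the textbook definitions (a random start followed by every k-th item for systematic sampling, a random draw inside every stratum for stratified sampling, and every member of a few randomly chosen clusters for cluster sampling).

```python
import random


def simple_random_sample(population, n, rng):
    """Every set of n units is equally likely to be chosen."""
    return rng.sample(population, n)


def systematic_sample(population, k, rng):
    """Pick a random starting position, then take every k-th unit after it."""
    start = rng.randrange(k)
    return population[start::k]


def stratified_sample(strata, n_per_stratum, rng):
    """Draw a random sample from EVERY stratum (strata: name -> list of units)."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters, then keep EVERY unit in the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical data: 100 gumballs off a line, plants grouped by size, coin books.
gumballs = list(range(1, 101))
plants = {"small": list(range(30)), "medium": list(range(45)), "large": list(range(41))}
books = {f"book{i}": [f"book{i}-coin{j}" for j in range(5)] for i in range(54)}

every_25th = systematic_sample(gumballs, 25, rng)
per_size = stratified_sample(plants, 10, rng)
coins = cluster_sample(books, 4, rng)
print(len(every_25th), {k: len(v) for k, v in per_size.items()}, len(coins))
```

The key contrast is visible in the code: stratified sampling visits every group and samples within it, while cluster sampling samples the groups themselves and then takes everything inside the chosen ones.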
29) Describe one benefit to a cluster sampling plan. Answer: Cluster sampling can make it easier to access very large populations and if carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. 30) Describe a situation where a systematic sampling plan would be appropriate. Answer: Answers will vary, but systematic sampling plans are often used for exit polls or in quality control studies. 31) A study is to be undertaken to determine the teacher workload in the public schools in a large county in California. The target population is all high school teachers (grades 9 – 12) with at least one year of experience. The county contains a total of thirty-seven public high schools and a total of 4,857 teachers. All teachers selected in the sample will be asked to complete a short survey. Describe a sampling plan that would allow the researchers to obtain the desired information. Answer: Answers will vary, but cluster sampling is likely to be a reasonable choice. One could randomly select high schools from the thirty-seven and then survey all teachers in the selected high schools. 32) The city council of a moderate size city wants to gather input from the citizens of the city on the proposed expansion to the local airport. It has been decided that a stratified sampling plan would be the best way to get the desired information because opinions within the strata are likely to be similar. What would be a reasonable way to stratify the population? Answer: Answers will vary, but stratification by distance from the airport is likely to be the most reasonable. 2 Determine if Stratified Sampling Should be Used MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Suppose a college is interested in building a new bicycle rack on campus. 
They have 3 possible locations in mind and are asking a random sample of town citizens to rank the locations with 1 being the most desirable and 3 being the least desirable. Explain why the college might want to stratify the sampling into two groups: people who have bicycles and people who do not. A) A stratified sample is not necessary. A simple random sample would likely give the same results. B) People who do not own a bicycle may feel the location of the bicycle rack is more important than students who do own a bicycle because they likely would not use this resource. C) People who own a bicycle may feel the location of the bicycle rack is more important than students who do not own a bicycle because they would likely use this resource. D) The location of the bicycle rack would be important for both people with and without bicycles because anyone can use this resource. Answer: C 2) In which of the scenarios below would a stratified sampling method be most useful for estimating a population parameter? A) A small company wants to decide where to place a new water cooler in the office. B) A high school wants to determine if their students want longer lunch times. C) A town wants to decide where to build a new park. D) A high school wants to determine if students’ favorite subjects differ by class level (Freshman, Sophomore, Junior, Senior). Answer: D
3) Suppose a college is deciding whether or not to provide more resources to a free weight center at the campus gym. Explain why the school might want to use a stratified sample rather than a simple random sample from the student body before making a final decision. A) A stratified sample is not necessary. Sampling the entire school would likely give the same results. B) They should use a stratified sample because choice of gym materials may be more important to gym users than non-gym users. C) They should use a stratified sample because choice of gym materials may be more important to female students than male students. D) They should use a stratified sample because choice of gym materials may be more important to students than faculty. Answer: B 4) Suppose a local school district, which is responsible for decisions about elementary, middle, and high schools in the area, is deciding whether or not to increase the length of their spring break vacation. They plan to randomly survey people who live in the school district. Explain why the school might want to use a stratified sample rather than a simple random sample. A) A stratified sample is not necessary. A simple random sample would likely give the same results. B) They should use a stratified sample because spring break length may be more important to parents than children. C) They should use a stratified sample because spring break length may be more important to people who have children in these schools than people who do not. D) They should use a stratified sample because spring break length may be more important to people who live in the school district than people who do not. Answer: C
12.3 Reading Research Papers 1 Interpret the Results of Studies MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A researcher wonders whether receiving a job offer is affected by whether an interviewee wears a red tie. The researcher used the following study design to collect data: The researcher chose five large companies in a large city and observed the tie color of all interviewees during a two-week period. He also recorded whether the person was hired by the company. He finds that interviewees who wear red ties are more likely to be hired than interviewees who do not wear red ties. 1) Choose the statement that correctly explains why we can or cannot generalize these results to a larger population. A) The interviewees (and companies where they interviewed) were not selected from a larger population, so the results cannot be generalized beyond the sample. B) The interviewees were randomly selected to participate in the study so the results can be generalized to the larger population. C) The interviewees were not randomly assigned to wear a red tie to the interview so the results cannot be generalized beyond the sample. D) This was an observational study so results can be generalized to a larger population. Answer: A
2) Choose the statement that correctly explains why we can or cannot make a cause-and-effect conclusion. A) This was an observational study. Since random assignment was not used, we cannot conclude that getting hired was caused by wearing a red tie. B) This was an experimental study so we conclude that wearing a red tie caused an interviewee to be hired. C) The interviewees (and companies where they were interviewed) were not selected from a larger population, so we cannot conclude that there was a cause-and-effect relationship. D) Since this was a controlled observational study we can conclude that there was a cause-and-effect relationship. Answer: A Often college students taking an introductory course in psychology are required to participate in research experiments. One experiment was performed to see if using colored paper increased reading comprehension. One hundred students enrolled in an introductory college psychology course volunteered to participate in the study. Fifty of the students were randomly chosen to read a chapter in a psychology text that was printed on yellow paper and the remaining fifty students were given the same chapter to read printed on white paper. After reading the chapter, each student was asked to answer a series of questions to test his/her comprehension. The results indicated that those students reading the chapter printed on yellow paper tended to have higher comprehension scores than those students reading the chapter printed on white paper. 3) Choose the statement that correctly explains why we can or cannot generalize these results to a larger population. A) This was an observational study so the results can be generalized to a larger population. B) The participants were randomly assigned to a paper color, so the results can be generalized to a larger population. C) The participants were randomly selected, so the results can be generalized to a larger population. 
D) The participants were not selected from a larger population so the results cannot be generalized beyond the sample. Answer: D 4) Choose the statement that correctly explains why we can or cannot make a cause-and-effect conclusion. A) This was an observational study so we cannot conclude that reading comprehension increases when reading text printed on yellow paper. B) The participants were not selected from a larger population so we cannot conclude a cause-and-effect relationship. C) Since this was a controlled observational study, we can conclude there was a cause-and-effect relationship. D) This was an experimental study so we conclude that reading comprehension increases when reading from text printed on yellow paper, but the results cannot be generalized beyond the sample. Answer: D A researcher wonders whether speed of service at a diner is affected by whether a male customer is wearing a suit jacket. The researcher used the following study design to collect data: The researcher chose five diners in a large city and recorded the number of male customers who wore a suit jacket during a two-week period. He also recorded how long it took a waiter or waitress to address the customer. He finds that male customers who wear suit jackets were addressed by the wait staff faster than male customers who did not wear a suit jacket. 5) Choose the statement that correctly explains why we can or cannot generalize these results to a larger population. A) This was an observational study so results can be generalized to a larger population. B) The male customers (and diners) were not selected from a larger population, so the results cannot be generalized beyond the sample. C) The male customers were randomly selected to participate in the study so the results can be generalized to the larger population. D) The male customers were randomly assigned to wear a suit jacket so the results cannot be generalized beyond the sample. 
Answer: B
6) Choose the statement that correctly explains why we can or cannot make a cause-and-effect conclusion. A) This was an experimental study so we conclude that wearing a suit jacket caused a male customer to be addressed faster. B) The male customers (and diners) were not selected from a larger population, so we cannot conclude that there was a cause-and-effect relationship. C) This was an observational study. Since random assignment was not used, we cannot conclude that getting addressed faster was caused by wearing a suit jacket. D) Since this was a controlled observational study we can conclude that there was a cause-and-effect relationship. Answer: C Solve the problem. 7) A small study was conducted on a new procedure for treating atrial fibrillation. The study involved a new way to perform ablation surgery. One method currently being used had a success rate of only 50% in preventing additional episodes of atrial fibrillation within one year. The new method of performing ablation surgery was performed on 8 patients and 6 of the 8 patients did not suffer additional episodes of atrial fibrillation within one year. Choose the statement that best summarizes the significance of this result. A) Many people would say that this result has both clinical and statistical significance because the probability of success increases from 50% to 75% and the result could have significant impact on lives. B) Although this result may have clinical significance, it does not have statistical significance because the sample size is so small. C) Although this result may have statistical significance, it does not have clinical significance because the sample size is so small. D) Although this result may have statistical significance, it does not have clinical significance because the sample was not randomly selected. Answer: B 8) Which of the following statements are true concerning meta-analysis? 
A) A meta-analysis considers all studies done to test a particular treatment and tries to reconcile different conclusions, attempting to determine whether other factors played a role in the reported outcomes. B) Meta-analysis is the practice of stating hypotheses after first looking at the data. C) Meta-analysis is used to prove clinical significance but not statistical significance. D) Meta-analysis is used to prove statistical significance but not clinical significance. Answer: A 9) Which of the following is not an important principle that should always be kept in mind when reading research articles containing statistical research results? A) Don't rely solely on the conclusions of any single paper. B) Be wary of conclusions based on very complex statistical or mathematical models. C) Stick to peer-reviewed journals. D) All of these are important principles. Answer: D
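The claim in question 7 that 6 successes out of 8 lacks statistical significance can be checked with a short calculation. The sketch below assumes a one-sided exact binomial test of H0: p = 0.5 and the conventional 0.05 significance level; the question itself does not name a test or a level, so both are assumptions made for illustration.

```python
from math import comb


def upper_tail(successes, n, p):
    """P(X >= successes) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(successes, n + 1))


# Probability of seeing 6 or more successes in 8 patients if the true rate were still 50%.
p_value = upper_tail(6, 8, 0.5)  # equals (28 + 8 + 1) / 256 = 37/256, about 0.145
print(f"one-sided p-value = {p_value:.4f}")
```

A p-value of roughly 0.145 is well above 0.05, so with only 8 patients the jump from 50% to 75% is not statistically significant, even though a difference of that size could matter clinically.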
10) Suppose it was reported on the news that a recent study concluded that the probability that you will get brain cancer if you use a cell phone doubled from 1 in 460,000 to 1 in 230,000. Choose the statement that best summarizes the significance of this result. A) Although this result may have statistical significance, it does not have clinical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives. B) Many would say that this result has both clinical and statistical significance because the probability more than doubled and this result could have a significant impact on lives. C) Many would say that this result does not have clinical or statistical significance because the probabilities are so small that they are meaningless. D) Although this result may have clinical significance, it does not have statistical significance since the probability that you will get brain cancer if you use a cell phone is still so small that it is unlikely to have a meaningful effect on lives. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. A researcher wonders whether length of time that it takes to hail a taxi is affected by the gender of the person hailing the taxi. The researcher used the following study design to collect data: The researcher chose five busy intersections in a large city and recorded the number of males and females who hailed a taxi. He also recorded how long it took to get a taxi to stop. He finds that male customers waited a shorter period of time than females waited for a taxi. 11) Explain why we can or cannot generalize these results to a larger population. Answer: The male and female customers (and intersections) were not selected from a larger population, so the results cannot be generalized beyond the sample. 12) Restate the conclusion of the study in terms of a cause-and-effect conclusion. 
Explain why we can or cannot make a cause-and-effect conclusion. Answer: Possible cause-and-effect conclusion: Being male leads to shorter wait times for a taxi, a new study finds. This was an observational study. Since random assignment was not used, we cannot conclude that being male caused the taxis to stop faster. Solve the problem. 13) Write at least two questions that you should ask yourself when reading articles containing statistical research results. Explain why these questions are important. Answer: Possible questions: Did the study use random sampling and random assignment? Is the conclusion supported by other similar studies? Are the conclusions based on complex statistical or mathematical models? Is the journal containing the article peer-reviewed? Is the evidence compelling enough to support the conclusion? These questions encourage critical evaluation of published research or televised news reports that could affect our lives. 14) Suppose it was reported on the news that a recent study concluded that pollutants in the air have dramatically increased our chances of getting a rare form of skin cancer from 1 in 250,000 to 1 in 150,000. Compare the clinical and statistical significance of this result. Answer: Although this result may have statistical significance, it does not have clinical significance since the probability that you will get the rare form of skin cancer is still so small that it is unlikely to have a meaningful effect on lives.
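The distinction drawn in questions 10 and 14 comes down to relative versus absolute change in risk. The sketch below uses the probabilities stated in those questions; the function name and the choice to report these two particular summaries are illustrative assumptions, not something the test bank prescribes.

```python
from fractions import Fraction


def risk_summary(before, after):
    """Return the relative risk (a ratio) and the absolute increase (a difference)."""
    return {"relative_risk": after / before, "absolute_increase": after - before}


# Question 10: brain cancer risk reported as rising from 1 in 460,000 to 1 in 230,000.
cell_phone = risk_summary(Fraction(1, 460_000), Fraction(1, 230_000))

# Question 14: skin cancer risk reported as rising from 1 in 250,000 to 1 in 150,000.
skin_cancer = risk_summary(Fraction(1, 250_000), Fraction(1, 150_000))

for label, result in (("cell phone", cell_phone), ("skin cancer", skin_cancer)):
    print(label, float(result["relative_risk"]), float(result["absolute_increase"]))
```

A relative risk of 2 sounds dramatic, but the absolute increase is only 1 in 460,000 (and 1 in 375,000 for the skin cancer example), which is why both answers describe the results as lacking clinical significance.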
Ch. 12 Experimental Design: Controlling Variation Answer Key 12.1 Variation Out of Control 1 Discuss the Design of Studies 1) A 2) A 3) A 4) A 5) A 6) A 7) A 8) B 9) B 10) B 11) A 12) C 13) C 14) A 15) B 16) A 17) C 18) C 19) The treatments were Top 40 music, classical music, and no music. The response variable was the time it took participants to stuff the 400 envelopes. 20) This is not an effective design for the study because researchers randomly assigned entire blocks to treatment groups. To improve the study they should randomize within blocks. 21) A paired t-test is the most appropriate test because the medical expenses were compared for the same months both before and after the implementation of the exercise program. 22) Controlled experiment. 23) A soap and water mixture effectively treats mildew stains on tile compared to vinegar and ammonia. 24) This is most likely an observational study since it would not be ethical to purposely try to cause childhood obesity. 2 Identify Observational and Experimental Studies 1) B 2) A 3) B 4) A 5) A 6) A 7) This is most likely an observational study since it would not be ethical to purposely try to cause childhood obesity. 8) D 9) A 10) A 3 Explain the Power of Studies 1) D 2) C 3) Sample size, the size of the true difference between the groups, and the natural variability within the population. Researchers have control over sample size only. 4) B 5) D 6) B
12.2 Controlling Variation in Surveys 1 Identify the Type of Sampling Used 1) C
2) A 3) B 4) A 5) B 6) C 7) A 8) B 9) B 10) C 11) B 12) C 13) D 14) D 15) A 16) C 17) B 18) D 19) A 20) A 21) D 22) B 23) C 24) A 25) Stratified 26) Cluster 27) Systematic 28) Some of the benefits of a stratified sampling plan are increased precision and decreased variability within strata. 29) Cluster sampling can make it easier to access very large populations and if carefully executed, a cluster sampling plan will produce estimates that are as precise as possible. 30) Answers will vary, but systematic sampling plans are often used for exit polls or in quality control studies. 31) Answers will vary, but cluster sampling is likely to be a reasonable choice. One could randomly select high schools from the thirty-seven and then survey all teachers in the selected high schools. 32) Answers will vary, but stratification by distance from the airport is likely to be the most reasonable. 2 Determine if Stratified Sampling Should be Used 1) C 2) D 3) B 4) C
12.3 Reading Research Papers 1 Interpret the Results of Studies 1) A 2) A 3) D 4) D 5) B 6) C 7) B 8) A 9) D 10) A 11) The male and female customers (and intersections) were not selected from a larger population, so the results cannot be generalized beyond the sample.
12) Possible cause-and-effect conclusion: Being male leads to shorter wait times for a taxi, a new study finds. This was an observational study. Since random assignment was not used, we cannot conclude that being male caused the taxis to stop faster. 13) Possible questions: Did the study use random sampling and random assignment? Is the conclusion supported by other similar studies? Are the conclusions based on complex statistical or mathematical models? Is the journal containing the article peer-reviewed? Is the evidence compelling enough to support the conclusion? These questions encourage critical evaluation of published research or televised news reports that could affect our lives. 14) Although this result may have statistical significance, it does not have clinical significance since the probability that you will get the rare form of skin cancer is still so small that it is unlikely to have a meaningful effect on lives.
Ch. 13 Inference Without Normality 13.1 Transforming Data 1 Understand Inference and Nonparametric Statistics MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following is an indication that nonparametric inference might be necessary? A) The sample size is too large to assume the CLT holds. B) The distribution of the population is strongly skewed. C) The distribution of the population is Normal. D) The sample standard deviation is smaller than the sample mean. Answer: B 2) Which of the following is not necessarily an indication that nonparametric inference might be necessary? A) Observations are matched pairs. B) The sample size is too small to assume the CLT holds. C) The distribution of the population is not Normal. D) The distribution of the sample is strongly skewed. Answer: A SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 3) Explain some features of the data that might indicate that nonparametric inference might be useful. Answer: Sample size is small, the population is not Normal, the data are strongly skewed. MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 4) In which of the following situations would nonparametric statistics NOT be useful? A) The data are symmetric. B) The data are strongly skewed. C) The data are strongly skewed. D) The sample sizes are small. Answer: A 5) When performing a two-sample t-test, which of the following conditions is NOT required if the size of both samples is large? A) Data drawn from random samples. B) Data drawn from Normal distributions. C) Observations independent of each other. D) The groups independent of each other. Answer: B 6) With which type of distribution should you use the median to report the “typical” value of a data set? A) Uniform. B) Normal. C) Skewed. D) Symmetric. Answer: C
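A small numerical illustration may help with question 6. The data below are invented for this sketch (hypothetical incomes in thousands of dollars, with one very large value creating a right skew); they show how a single extreme observation pulls the mean away from where most of the data sit while leaving the median nearly untouched.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: nine similar values and one extreme value.
incomes = [28, 31, 33, 35, 36, 38, 40, 42, 45, 400]

typical_by_mean = mean(incomes)      # dragged upward by the value 400
typical_by_median = median(incomes)  # middle of the ordered values

print("mean:", typical_by_mean)
print("median:", typical_by_median)
```

Nine of the ten values are at or below 45, so the median describes a typical observation far better than the mean does for this sample.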
2 Understand QQ Plots MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) In the context of nonparametric inference, what information can the QQ plot provide? A) It is a tool that displays quartile locations of data values which can provide information about the sample distribution. B) It is a tool that can help you determine whether a sample is drawn from a normal population. C) It is a tool that displays the distribution of transformed data. D) None of these Answer: B 2) Which of the following QQ plots most closely depicts data from a normally distributed population? A) [QQ plot] B) [QQ plot] C) [QQ plot] D) [QQ plot]
Answer: A
3) Which of the following QQ plots most closely depicts data from a skewed population? A) [QQ plot] B) [QQ plot] C) [QQ plot] D) [QQ plot]
Answer: B 4) In the context of nonparametric inference, what information can the QQ plot provide? A) It is a tool that can be used to determine if the mean and median are equal to each other. B) It is a tool that can be used to determine if the standard deviation is normally distributed. C) It is a tool that can be used to determine whether a sample is drawn from a uniform distribution. D) It is a tool that can be used to determine whether a sample is drawn from a normal population. Answer: D 5) Match each of the histograms with the corresponding QQ plot.
A) Histogram A goes with QQ plot C and histogram B goes with QQ plot D. B) Histogram A goes with QQ plot D and histogram B goes with QQ plot C. Answer: B
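As a rough illustration of what a QQ plot displays, the following Python sketch pairs each sorted, standardized sample value with the Normal quantile expected at the same position. The function name `qq_points`, the plotting positions (i - 0.5)/n, and the sample values are assumptions made for this sketch; statistical software may use slightly different conventions. Points that fall near the line y = x suggest a sample drawn from a Normal population, while a curved pattern suggests skew.

```python
from statistics import NormalDist, mean, stdev


def qq_points(sample):
    """Return (theoretical Normal quantile, standardized observed value) pairs."""
    ordered = sorted(sample)
    n = len(ordered)
    center, spread = mean(ordered), stdev(ordered)
    points = []
    for i, value in enumerate(ordered, start=1):
        # Assumed plotting position: (i - 0.5) / n, one common convention.
        probability = (i - 0.5) / n
        theoretical = NormalDist().inv_cdf(probability)
        observed = (value - center) / spread
        points.append((theoretical, observed))
    return points


# Hypothetical right-skewed sample (not taken from the test bank).
skewed_sample = [1, 2, 2, 3, 3, 4, 5, 8, 15, 40]
for theoretical, observed in qq_points(skewed_sample):
    print(f"{theoretical:6.2f}  {observed:6.2f}")
```

For the skewed sample, the largest observed value sits far above its theoretical quantile, which is the curved pattern the questions above ask students to recognize.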
6) Which of the following QQ plots most closely depicts data from a skewed population? A) [QQ plot] B) [QQ plot] C) [QQ plot] D) [QQ plot]
Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 7) Suppose you are asked to analyze sample data but you do not know the distribution of the population it came from. Explain how a QQ plot can be used to give you information about the population from which the sample was drawn. Answer: The QQ plot is a tool that can help you determine whether the sample was drawn from a normal population. Use the following plots to answer the question.
8) Match each of the histograms with the corresponding QQ plot. Histogram A goes with QQ plot ______. Histogram B goes with QQ plot ______. Answer: A goes with D, B goes with C.
9) For which sample might a log transform be useful? Explain. (There are no zeros or negative values in either data set.) Answer: A log transform might be useful for the data shown in histogram B because it is right skewed. 3 Perform Log Transforms MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question. 1) Find the log (base 10) transformation of the number 1 million. A) 1 B) 3 C) 6
D) 9
Answer: C 2) Do the back transformation by finding the antilog (base 10) of the number 5. A) 10000 B) 100000 C) 1000000
D) 10000000
Answer: B 3) Find the log (base 10) transformation of the number 1650. Round to one decimal place, if needed. A) 2.2 B) 3.2 C) 4.2 D) 5.2 Answer: B 4) Find the log (base 10) transformation of the number 25580. Round to one decimal place, if needed. A) 2.4 B) 3.4 C) 4.4 D) 5.4 Answer: C 4 Find Geometric Means MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Find the mean, median, and geometric mean for the following numbers: 10, 300, 1500, and 33,000. Round to the nearest tenth. A) 8702.5, 900.0, 29.2 B) 8702.5, 900.0, 620.8 C) 8700.0, 620.8, 950.0 D) 900.0, 1000.0, 8700.0 Answer: B 2) Find the mean, median, and geometric mean for the following numbers: 120, 400, 1300, and 22,000. List from smallest to largest and round to the nearest tenth. A) 9300.0, 850.0, 120.0 B) 400.0, 1082.4, 9273.8 C) 4765.0, 850.0, 1185.5 D) 9273.8, 850.0, 1082.4 Answer: D 3) Which of the following presents, in this order, the mean, median, and geometric mean of these numbers: 410, 20, 1750, 66000. Round to the nearest tenth. A) 17045.0, 885.0, 986.5 B) 17045.0, 885.0, 3.0 C) 17045.0, 1080.0, 3.0 D) 17045.0, 1080.0, 986.5 Answer: D 4) Which of the following presents, in this order, the mean, median, and geometric mean of these numbers: 400, 120, 1300, and 22000. Round to the nearest tenth. A) 5955.0, 850.0, 1185.5 B) 5955.0, 710.0, 1082.4 C) 5955.0, 710.0, 1185.5 D) 5955.0, 850.0, 1082.4 Answer: D
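The log-transform and geometric-mean calculations in these questions can be checked with a short Python sketch. It assumes base-10 logs, as the questions state, and computes the geometric mean by averaging the logs and back-transforming; the helper names `geometric_mean` and `median` are illustrative, not from the text.

```python
import math


def geometric_mean(values):
    """Average the base-10 logs of positive values, then take the antilog."""
    logs = [math.log10(v) for v in values]
    return 10 ** (sum(logs) / len(logs))


def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Log transform and back transformation (antilog), base 10.
print(round(math.log10(1650), 1))
print(10 ** 5)

# Mean, median, and geometric mean for the data in question 1 of this objective.
data = [10, 300, 1500, 33000]
mean = sum(data) / len(data)
print(round(mean, 1), round(median(data), 1), round(geometric_mean(data), 1))
```

For right-skewed data such as these, the geometric mean lands much closer to the median than the arithmetic mean does.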
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 5) Calculate the mean, median, and geometric mean for the following numbers: 710, 27,000, 1400, and 260. Round to the nearest tenth. Answer: Median: 1055.0, Geometric Mean: 1625.3, Mean: 7342.5 MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. 6) Find the geometric mean for the numbers 100 and 36. A) 50 B) 60
C) 68
D) 3600
Answer: B 7) How would the mean compare to the geometric mean for the numbers 152, 238, and 1000? A) The mean and geometric mean would be equal. B) The mean would be less than the geometric mean. C) The mean would be greater than the geometric mean. D) There is no relationship between the mean and the geometric mean. Answer: C 8) How would the median compare to the geometric mean for the numbers 152, 238, and 1000? A) The median and geometric mean would be equal. B) The median would be less than the geometric mean. C) The median would be greater than the geometric mean. D) There is no relationship between the median and the geometric mean. Answer: B 5 Find and Analyze Log-Transformed Data MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Suppose the manager of a large high-end jewelry store wants to estimate the amount spent by customers during the holiday season. She took a random sample of customers and recorded the amount they spent. A histogram of the data shows that the data is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set. Use this information to answer the question.
1) Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into dollars). Which interval is narrower? A) Width of interval for untransformed data: 111.2; width of interval for transformed data: 92.7. The width of the interval for the log-transformed data is narrower. B) Width of interval for untransformed data: 111.2; width of interval for transformed data: 204.6. The width of the interval for the untransformed data is narrower. C) Width of interval for untransformed data: 55.6; width of interval for transformed data: 111.2. The width of the interval for the log-transformed data is narrower. D) Cannot be determined with the given information Answer: A
2) Choose the statement that explains which confidence interval is likely to be a more precise estimate of amount spent and why. A) The confidence interval for the untransformed data is more precise because the values are in actual dollars which is more meaningful. B) The confidence interval for the geometric mean is more precise because the distribution of the log-transformed data is more symmetric. C) The confidence interval for the untransformed data is more precise because it is strongly left-skewed and the confidence interval gives a wider interval. D) None of these. Answer: C Use the following information to answer the question. Suppose the manager of a large furniture store wants to estimate the amount spent by customers during the holiday season. She took a random sample of customers and recorded the amount they spent. A histogram of the data shows that the data is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set. Use this information to answer the question.
3) Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into dollars). Which interval is narrower? A) Width of interval for untransformed data: 144.2; width of interval for transformed data: 152.5. The width of the interval for the untransformed data is narrower. B) Width of interval for untransformed data: 144.2; width of interval for transformed data: 129.8. The width of the interval for the log-transformed data is narrower. C) Width of interval for untransformed data: 72.1; width of interval for transformed data: 273. The width of the interval for the log-transformed data is narrower. D) Cannot be determined with the given information Answer: B 4) Choose the statement that explains which confidence interval more precisely depicts the data and why. A) The confidence interval for the untransformed data is more precise because it is strongly left-skewed and the confidence interval gives a wider interval. B) The confidence interval for the untransformed data is more precise because the values are in actual dollars which is more meaningful. C) The confidence interval for the geometric mean is more precise because the distribution of the log-transformed data is more symmetric. D) None of these. Answer: A
Suppose that a college professor lives 65 miles from campus and is interested in the average time it takes her to commute to work each morning. Driving the same route each day, she records the time it takes her to get to work (in minutes) on 20 randomly selected days. Because there is construction on the route that she usually takes, there are days when it takes much longer than others to get to work. A histogram of the data shows that the data are skewed to the right. The information provided below shows the confidence intervals for the mean time to get to work using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.
5) Calculate the width of both intervals in the original units of minutes (note that you will need to convert the log-transformed interval back into minutes). Which interval is narrower? A) Width of interval for untransformed data: 5.60; width of interval for transformed data: 5.02. The width of the interval for the log transformed data is narrower. B) Width of interval for untransformed data: 5.60; width of interval for transformed data: 6.53. The width of the interval for the untransformed data is narrower. C) Width of interval for untransformed data: 5.02; width of interval for transformed data: 5.60. The width of the interval for the untransformed data is narrower. D) Cannot be determined with the given information Answer: A 6) Choose the statement that explains which confidence interval is likely to be a more precise estimate of time spent and why. A) The confidence interval for the untransformed data is more precise because the values are in actual minutes which is more meaningful. B) The confidence interval for the geometric mean is more precise because the distribution of the log transformed data is more symmetric. C) The confidence interval for the untransformed data is more precise because the sample distribution is strongly skewed to the right and the confidence interval for the untransformed data gives a wider interval. D) None of these. Answer: B
Suppose that a college professor lives 65 miles from campus and is interested in the average time it takes her to commute to work each morning. Driving the same route each day, she records the time it takes her to get to work (in minutes) on 20 randomly selected days. Because there is construction on the route that she usually takes, there are days when it takes much longer than others to get to work. A histogram of the data shows that the data are skewed to the right. The information provided below shows the confidence intervals for the mean time to get to work using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.
7) Calculate the width of both intervals (note that you will need to convert the log-transformed interval back into minutes). Which interval is narrower? A) Width of interval for untransformed data: 5.71; width of interval for transformed data: 6.43. The width of the interval for the untransformed data is narrower. B) Width of interval for untransformed data: 5.30; width of interval for transformed data: 5.71. The width of the interval for the untransformed data is narrower. C) Width of interval for untransformed data: 5.71; width of interval for transformed data: 5.30. The width of the interval for the log-transformed data is narrower. D) Cannot be determined with the given information Answer: C 8) Choose the statement that explains which confidence interval is likely to be a more precise estimate of time spent and why. A) The confidence interval for the geometric mean is more precise because the distribution of the log-transformed data is more symmetric. B) The confidence interval for the untransformed data is more precise because the values are in actual minutes which is more meaningful. C) The confidence interval for the untransformed data is more precise because the sample distribution is strongly skewed to the right and the confidence interval for the untransformed data gives a wider interval. D) None of these. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Suppose the manager of a large appliance and electronics store wants to estimate the amount spent by customers during the holiday season. He took a random sample of customers and recorded the amount they spent. A histogram of the amount spent shows that the distribution of the sample is strongly left-skewed. The figures below show the confidence intervals for the mean amount spent using (A) raw (untransformed) data, and (B) log-transformed data, which showed a more normally distributed data set.
9) Calculate the width of both intervals and state which interval is narrower after converting the log units back to the original units. Answer: The width of the interval for the untransformed data is 327.7; the width of the interval for the transformed data is 308.51. The confidence interval for the transformed data is narrower. 10) Which interval should the manager report to the store owner about the typical amount of money spent during the holiday season? Explain. Answer: The confidence interval for the geometric mean is more appropriate because the log-transformed data had a symmetric distribution. The confidence interval for the geometric mean will be more accurate and precise.
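The width comparisons in this objective all follow the same steps: back-transform each endpoint of the log-scale interval, then subtract. A minimal Python sketch of those steps follows. The figures with the actual intervals are not reproduced here, so the endpoints below are hypothetical, and the sketch assumes the logs are base 10.

```python
def back_transform_interval(log_lower, log_upper, base=10):
    """Convert a confidence interval on the log scale back to original units."""
    return base ** log_lower, base ** log_upper


def interval_width(lower, upper):
    """Width of an interval in whatever units its endpoints use."""
    return upper - lower


# Hypothetical intervals (NOT the values from the missing figures).
raw_interval = (120.0, 210.0)   # original units, e.g. dollars
log_interval = (2.10, 2.30)     # base-10 log units (assumed base)

raw_width = interval_width(*raw_interval)
back_lower, back_upper = back_transform_interval(*log_interval)
transformed_width = interval_width(back_lower, back_upper)

print(f"Untransformed width: {raw_width:.1f}")
print(f"Back-transformed interval: ({back_lower:.1f}, {back_upper:.1f})")
print(f"Back-transformed width: {transformed_width:.1f}")
print("Narrower:", "log-transformed" if transformed_width < raw_width else "untransformed")
```

The widths must be compared only after back-transforming, because a width measured in log units is not in the same units as the raw interval.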
13.2 The Sign Test for Paired Data 1 Understand the Sign Test and Perform it if Appropriate MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following statements could be a reason to justify the use of the sign test? A) The sample size is small B) The distribution of the population is unknown or not Normal C) The data are matched pairs D) All of these Answer: D 2) Which of the following statements is not true about the sign test? A) Matched pairs must be independent of other pairs in the sample. B) The p-value is based on the normal distribution. C) The sign test relies on the signs (negative or positive) of the measured differences in pairs. D) The binomial model is used to find an exact p-value. Answer: B
Use the following information to answer the question. Can stretching help you stay alert in class? Thirty-six subjects were measured for alertness at the beginning of class; the subjects then participated in some light arm and neck stretches followed by a forty-five minute lecture. Each subject was then measured for alertness at the end of the lecture. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met:
3) Choose the correct null and alternative hypothesis. A) H0: The median difference in alertness is 0. HA: The median difference in alertness is not 0. B) H0: The median difference in alertness is not 0. HA: The median difference in alertness is 0. C) H0: The median difference in alertness is 1. HA: The median difference in alertness is not 1. D) None of these. Answer: A 4) What is the name and value of the test statistic? A) F = 20 B) S = 20
C) S = 12
D) Z = 12
Answer: B 5) Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Fail to reject H0. There is evidence to suggest that there is a difference in alertness after light arm and neck stretches before the lecture. B) Reject H0. There is evidence to suggest that there is no difference in alertness after light arm and neck stretches before the lecture. C) Fail to reject H0. There is evidence to suggest that there is no difference in alertness after light arm and neck stretches before the lecture. D) Reject H0. There is evidence to suggest that there is a difference in alertness after light arm and neck stretches before the lecture. Answer: C
Use the following information to answer the question. Can deep-knee bends help you stay alert in class? Forty subjects were measured for alertness at the beginning of class then voluntarily performed fifteen deep-knee bends followed by a forty-five minute lecture. Each subject was then measured for alertness at the end of the lecture. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met:
6) Choose the correct null and alternative hypothesis. A) H0: The median difference in alertness is not 0. HA: The median difference in alertness is 0. B) H0: The median difference in alertness is 0. HA: The median difference in alertness is not 0. C) H0: The median difference in alertness is 1. HA: The median difference in alertness is not 1. D) None of these. Answer: B 7) What is the name and value of the test statistic? A) F = 21 B) S = 14
C) Z = 5
D) S = 21
Answer: D 8) Using a significance level of 5%, state the correct decision regarding the null hypothesis and concluding statement. A) Fail to reject H0. There is evidence to suggest that there is no difference in alertness after deep-knee bends before the lecture. B) Fail to reject H0. There is evidence to suggest that there is a difference in alertness after deep-knee bends before the lecture. C) Reject H0. There is evidence to suggest that there is no difference in alertness after deep-knee bends before the lecture. D) Reject H0. There is evidence to suggest that there is a difference in alertness after deep-knee bends before the lecture. Answer: A Solve the problem. 9) Which of the following statements is not true about the sign test for paired data? A) The sign test is a nonparametric test that can be used in place of the paired t-test. B) The sign test is a nonparametric test based on the median of a population. C) The sign test is similar to the paired t-test in that both are based on examining differences between paired observations. D) The sign test is a nonparametric test based on the mean of a population. Answer: D 10) Which of the following statements is true about the sign test for paired data? A) The sign test relies only on the signs of the differences in pairs, not the values of the differences. B) The sign test relies only on the values of the differences in pairs, not the signs of the differences. C) The sign test is a nonparametric test because it makes two assumptions about the distributions of the populations. D) The sign test is most useful if you know your data are normally distributed. Answer: A
An insurance company's procedure for settling a claim under $10,000 for fire or water damage to a home is to require two estimates for cleanup and repair of structural damage before allowing the insured to proceed with the work. The insurance company compared the estimates from two contractors on 38 jobs in the region. The insurance company is interested in knowing if the contractors typically produce different estimates for the projects. Assume that all conditions for testing have been met.
11) Choose the correct null and alternative hypotheses. A) H0 : The median difference in estimated cost is 500 Ha : The median difference in estimated cost is not 500 B) H0 : The median difference in estimated cost is not 500 Ha : The median difference in estimated cost is 500 C) H0 : The median difference in estimated cost is 0 Ha : The median difference in estimated cost is not 0 D) H0 : The median difference in estimated cost is not 0 Ha : The median difference in estimated cost is 0 Answer: C 12) What is the name and value of the test statistic? A) F = 11 B) S = 24
C) Z = 24
D) S = 14
Answer: B 13) Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Reject H0 . There is evidence to suggest that there is typically a difference in estimated cost between the two contractors. B) Reject H0 . There is no evidence to suggest that there is typically a difference in estimated cost between the two contractors. C) Fail to reject H0 . There is evidence to suggest that there is typically a difference in estimated cost between the two contractors. D) Fail to reject H0 . There is no evidence to suggest that there is typically a difference in estimated cost between the two contractors. Answer: A
An insurance company's procedure for settling a claim under $10,000 for fire or water damage to a home is to require two estimates for cleanup and repair of structural damage before allowing the insured to proceed with the work. The insurance company compared the estimates from two contractors on 38 jobs in the region. The insurance company is interested in knowing if the contractors typically produce different estimates for the projects. Assume that all conditions for testing have been met.
14) Choose the correct null and alternative hypotheses. A) H0 : The median difference in estimated cost is 500 Ha : The median difference in estimated cost is not 500 B) H0 : The median difference in estimated cost is not 500 Ha : The median difference in estimated cost is 500 C) H0 : The median difference in estimated cost is 0 Ha : The median difference in estimated cost is not 0 D) H0 : The median difference in estimated cost is not 0 Ha : The median difference in estimated cost is 0 Answer: C 15) What is the name and value of the test statistic? A) F = 23 B) S = 15
C) Z = 12
D) S = 23
Answer: D 16) Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Reject H0. There is evidence to suggest that there is typically a difference in estimated cost between the two contractors. B) Reject H0. There is no evidence to suggest that there is typically a difference in estimated cost between the two contractors. C) Fail to reject H0. There is evidence to suggest that there is typically a difference in estimated cost between the two contractors. D) Fail to reject H0. There is no evidence to suggest that there is typically a difference in estimated cost between the two contractors. Answer: D SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 17) Describe some features of the data that might indicate that the sign test would be useful for inference. Answer: Sample data are matched pairs and the normal conditions of the paired t-test are not satisfied. Also, the sample size is too small.
The annual amount of phosphorus deposited into Lake Erie is of concern to environmentalists because phosphorus can aid in the growth of algae blooms in the lake. The annual amount of phosphorus deposited into Lake Erie from two major rivers was recorded in each of 24 years. Environmentalists want to determine if the median annual amount of phosphorus deposited into Lake Erie differs for the two major rivers. The hypothesis test results for the sign test are summarized below. Assume that all conditions for testing have been met:
18) State the null and alternative hypotheses to determine if the median annual amount of phosphorus deposited in Lake Erie differs for the two rivers. Answer: H0 : The median difference in deposited phosphorus is 0. Ha: The median difference in deposited phosphorus is not 0. 19) Explain how the binomial model is used to calculate the p-value. Answer: The sampling distribution for S is binomial with n = 24 and p = 0.5 . For a two-tailed test, find the probability that S will be as extreme as or more extreme than 16 or 8. 20) Calculate the value of the test statistic and state the value of the p-value. Answer: Test statistic S = 8 and p-value = 0.1516. 21) Using a significance level of 5%, state the correct decision regarding the null hypothesis and write a sentence which summarizes the conclusion and addresses the question. Answer: Fail to reject H0 . There was no difference in the median amounts of phosphorus deposited between the two rivers.
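The binomial calculation described in the answer to question 19 can be sketched in Python. The function name `sign_test_p_value` is illustrative; the sketch assumes a two-tailed test with S distributed as Binomial(n, 0.5) under the null hypothesis, and doubles the smaller tail probability.

```python
from math import comb


def sign_test_p_value(s, n):
    """Exact two-tailed p-value for a sign test statistic S ~ Binomial(n, 0.5)."""
    # Work with the smaller of S and n - S so we always sum the lower tail.
    k = min(s, n - s)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    # Double the tail for a two-tailed test; a probability cannot exceed 1.
    return min(1.0, 2 * lower_tail)


# Lake Erie example: n = 24 years, S = 8.
p_value = sign_test_p_value(8, 24)
print(round(p_value, 4))
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

With n = 24 and S = 8, the result rounds to 0.1516, matching the p-value in the answer to question 20, so the null hypothesis is not rejected at the 5% level.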
13.3 Mann-Whitney Test for Two Independent Groups 1 Perform a Mann-Whitney Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Choose the statement that is not true about the Mann-Whitney Test. A) The Mann-Whitney Test can be used when the Normal condition of the t-test is not met. B) The Mann-Whitney Test is based on the ranks of the observations, not on their actual values. C) The Mann-Whitney Test is used to compare the centers of two groups of numerical variables. D) The Mann-Whitney Test is based on the number of pairs with positive differences. Answer: D 2) Choose the statement that is not true about the Mann-Whitney Test. A) The Mann-Whitney Test is based on the number of pairs with negative differences. B) The Mann-Whitney Test can be used when the Normal condition of the t-test is not met. C) The Mann-Whitney Test is based on the ranks of the observations, not on their actual values. D) The Mann-Whitney Test is used to compare the centers of two groups of numerical variables. Answer: A
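To illustrate what "based on the ranks of the observations" means, the following Python sketch pools two independent groups, ranks the pooled values (tied values share the average of their ranks), and computes the rank sum and U statistics. The function names and the data are hypothetical, and which of these statistics a given software package reports is an assumption that varies by package.

```python
def average_ranks(values):
    """Rank values from 1 to N; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (rank sum of group A, U for group A, U for group B)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a
    return rank_sum_a, u_a, u_b


# Hypothetical weekly viewing minutes for two independent groups.
men = [30, 45, 60, 120, 15]
women = [90, 150, 200, 75, 240]
print(mann_whitney_u(men, women))
```

Only the ordering of the observations matters: replacing 240 with 2400 in the second group leaves every rank, and therefore the statistics, unchanged.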
Use the following information to answer the question. Suppose the Nielson Organization conducted a survey to find out how many minutes of reality-type television programming people watched in one week. Assume that all conditions for the Mann-Whitney test have been met. Use the following test output to answer the question.
3) Choose the correct null and alternative hypothesis to test the claim that men and women watch different amounts of reality-type programming. A) H0: The medians for men and women are not equal. HA: The median for men is equal to the median for women. B) H0: The median for men is equal to the median for women. HA: The medians for men and women are not equal. C) H0: The median for men is equal to the median for women. HA: The median for men is less than the median for women. D) H0: The median for men is equal to the median for women. HA: The median for men is greater than the median for women. Answer: B 4) Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Fail to reject H0. There is evidence to suggest that there is a difference in the median amount of reality-type television that men and women watch. B) Reject H0. There is evidence to suggest that there is no difference in the median amount of reality-type television that men and women watch. C) Fail to reject H0. There is evidence to suggest that there is no difference in the median amount of reality-type television that men and women watch. D) Reject H0. There is evidence to suggest that there is a difference in the median amount of reality-type television that men and women watch. Answer: D
Use the following information to answer the question. Suppose the Nielson Organization conducted a survey to find out how many minutes of televised sporting events people watched in one week. Assume that all conditions for the Mann-Whitney test have been met. Use the following test output to answer the question.
5) Choose the correct null and alternative hypothesis to test the claim that adults between the ages of 24 and 34 and adults between the ages of 35 and 45 watch different amounts of televised sporting events. A) H0: The medians for the two age groups are not equal. HA: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. B) H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The medians for the two age groups are not equal. C) H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The median for adults ages 24 to 34 is less than the median for adults ages 35 to 45. D) H0: The median for adults ages 24 to 34 is equal to the median for adults ages 35 to 45. HA: The median for adults ages 24 to 34 is greater than the median for adults ages 35 to 45. Answer: B 6) Using a significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Fail to reject H0. There is evidence to suggest that there is a difference in the median amount of televised sporting events that adults in the two age groups watched. B) Reject H0. There is evidence to suggest that there is no difference in the median amount of televised sporting events that adults in the two age groups watched. C) Fail to reject H0. There is evidence to suggest that there is no difference in the median amount of televised sporting events that adults in the two age groups watched. D) Reject H0. There is evidence to suggest that there is a difference in the median amount of televised sporting events that adults in the two age groups watched. Answer: C Solve the problem. 7) Which of the following conditions is not a requirement for the Mann-Whitney Test? A) The population distributions of the groups have the same shape. B) The response variable must be numerical and continuous. C) There are two independent groups. D) There are two paired groups.
Answer: D 8) Do daughters tend to be taller than their mothers? To answer this question, the heights of mothers and daughters from 20 families were recorded. Which of the following is the better method for testing whether daughters tend to be taller than mothers? A) Sign test B) Data transformation and t-confidence interval of transformed data C) Mann-Whitney Test D) Independent samples t-test Answer: A
9) Are price/earnings ratios typically different for firms in the energy equipment sector and firms in the hotel, restaurant and leisure sector? To answer this question, 15 firms in each sector are randomly selected and the price/earnings ratios recorded for each. Which of the following is the better method for testing whether the sectors typically differ in price/earnings ratio? A) Sign test B) Data transformation and t-confidence interval of transformed data C) Mann-Whitney Test D) Independent samples t-test Answer: C 10) A soccer fan was interested in estimating the mean salary of professional soccer players. He took a random sample of professional soccer players and recorded the annual salaries of each. After looking at a histogram of the data, he noticed that the data were right-skewed. He took the log of each value and verified that the distribution of these transformed data was approximately normal. What test/method should be used to estimate the mean annual salary of professional soccer players? A) Sign test B) Data transformation and t-confidence interval of transformed data C) Mann-Whitney Test D) Independent samples t-test Answer: B 11) A new fiber bar is advertised to curb hunger for three hours. A sample of fourteen hungry subjects was asked to record their level of hunger before eating the fiber bar and again three hours after eating the fiber bar. Which of the following is a better method for testing whether there is a difference in the level of hunger three hours after eating the fiber bar (i.e. the fiber bar curbed hunger for three hours)? A) Data transformation and t-confidence interval of transformed data B) Paired t-test C) Sign Test D) Mann-Whitney Test Answer: C 12) You are presented with data from two independent samples. The variable being measured is continuous. The distribution of the population of each sample is right skewed. You wish to test the hypothesis that there is a difference in the median value of the variable for the samples.
Which of the following is the better method for testing whether the medians differ? A) Data transformation and t-confidence interval of transformed data B) Paired t-test C) Sign Test D) Mann-Whitney Test Answer: D 13) A used car lot owner wanted to estimate the amount spent by customers during the summer months. She took a random sample of customers and recorded the amount they spent. A histogram showed the data was right-skewed so she took the log of each value and verified that the distribution of these values was more normally distributed. What test/method should she use to estimate the mean amount spent during the summer months? B) Paired t-test A) Data transformation D) Mann-Whitney Test C) Sign Test Answer: A
Readers of a national magazine were asked to rate their satisfaction with 36 restaurant chains in the United States on a scale from 0 to 100. Each chain was then classified as a low -price restaurant (under $16 per person) or a high -price restaurant ($16 or more). Assume that all conditions for the Mann-Whitney test have been met.
14) Choose the correct null and alternative hypotheses to test the claim that customer satisfaction with low -priced and high-priced restaurant chains differ. A) H0 : The median satisfaction for low-priced and high-priced restaurants is the same Ha: The median satisfaction for low-priced restaurants is less than the median for high-priced restaurants B) H0 : The median satisfaction for low-priced and high-priced restaurants is the same Ha: The median satisfaction for low-priced restaurants is greater than the median for high-priced restaurants. C) H0 : The median satisfaction for low-priced and high-priced restaurants is the same Ha: The median satisfaction for low-priced and high-priced restaurants differs D) H0 : The median satisfaction for low-priced restaurants is less than the median for high-priced restaurants. Ha: The median satisfaction for low-priced and high-priced restaurants is the same Answer: C 15) Using the significance level of 5%, state the correct decision regarding the null hypothesis and the concluding statement. A) Fail to reject H0 . There is evidence to suggest that there is a difference in the median satisfaction for low-priced and high-priced restaurants B) Fail to reject H0 . There is evidence to suggest that there is no difference in the median satisfaction for low-priced and high-priced restaurants C) Reject H0 . There is evidence to suggest that there is a difference in the median satisfaction for low -priced and high-priced restaurants D) Reject H0 . There is evidence to suggest that the median satisfaction for low-priced restaurants is greater than the median satisfaction for high-priced restaurants Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 16) List three of the five conditions, pertaining to the sample, which must be met in order to use the Mann-Whitney test. 
Answer: The five conditions are (1) there are two independent groups, (2) the response variable is numerical and continuous, (3) each group is a random sample from some representative population, (4) the observations are independent of one another, and (5) the population distribution of each group has the same shape.
17) Suppose data were collected from ten women and twelve men about the length in minutes of their commute to work. The histogram for men was roughly normal, but the histogram for women was strongly skewed to the right. Explain why the t-test is not the best test of whether men and women have different typical commute times. Answer: The sample sizes are small and the women's data is skewed.
18) You are presented with data from two independent samples. The variable being measured is numerical and continuous-valued. The distribution of the population of each sample is right skewed. You wish to test the hypothesis that there is a difference in the median value of the variable for the samples. What type of test/method should you use? Explain why the t-test is not the best method in this situation. Answer: Because the data are from independent samples, the variable is continuous and the population distribution of each group has the same shape, the Mann-Whitney test would be the best method. The t-test is not appropriate because the population distributions are not normal.
A trucking company would like to compare two different routes for efficiency. Truckers are randomly assigned to two different routes and the travel times to complete the routes were recorded for each trucker. Assume that all conditions for the Mann-Whitney test have been met.
19) State the null and alternative hypotheses to test whether the median travel times differ for the two routes. Answer: H0 : The median travel times for the two routes are equal. Ha: The median travel times for the two routes are not equal. 20) Using a significance level of 5%, state the correct decision regarding the null hypothesis and write a sentence which summarizes the conclusion and addresses the question. Answer: Fail to reject H0 . There was no difference in the median travel times for the two routes.
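As a rough illustration of what the Mann-Whitney test measures, the sketch below computes the U statistic for two independent samples by counting pairs. The function name `mann_whitney_u` and the travel times are hypothetical and made up for this example; the sketch stops at the statistic and does not compute a p-value, which would normally come from a table or statistical software.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) for two independent samples.

    U_a counts the pairs (a, b) in which a exceeds b; a tie counts one half.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    # The two U values always add up to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical travel times in minutes (illustrative values, not from the test bank).
route_1 = [52, 47, 60, 55, 49, 58]
route_2 = [50, 46, 57, 53, 48, 51]

u_1, u_2 = mann_whitney_u(route_1, route_2)
print(u_1, u_2)
```

If the two routes had the same distribution, each U would be expected to land near half the number of pairs; a U far from that value is evidence that the medians differ.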
13.4 Randomization Tests 1 Understand the Procedure Behind Randomization Tests MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in mean GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that all conditions for a randomization test have been satisfied.
1) State the null and alternative hypotheses and also the value of the test statistic for this randomization test. A) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is greater than that for male college athletes The test statistic is 0.42. B) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is greater than that for male college athletes The test statistic is 0.46. C) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is less than that for male college athletes The test statistic is -0.42. D) H0: The mean GPA for female college athletes is greater than that of male college athletes Ha: The mean GPA for female college athletes is the same as that for male college athletes The test statistic is 0.42. Answer: A
2) Use the histogram to roughly estimate the p-value. Choose the answer that most closely approximates the p-value. (Approximations have been made to the nearest thousandth.) A) p = 0.40 B) p = 0.998 C) p = 0.002 D) None of these Answer: C
Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in median GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that conditions for a randomization test have been satisfied.
3) State the null and alternative hypotheses and also the value of the test statistic for this randomization test. A) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is greater than that for male college athletes The test statistic is 0.42. B) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is greater than that for male college athletes The test statistic is 0.46. C) H0: The mean GPA for female college athletes is the same as that of male college athletes Ha: The mean GPA for female college athletes is less than that for male college athletes The test statistic is -0.46. D) H0: The mean GPA for female college athletes is greater than that of male college athletes Ha: The mean GPA for female college athletes is the same as that for male college athletes The test statistic is 0.46. Answer: B
4) Use the histogram to roughly estimate the p-value. Choose the answer that most closely approximates the p-value. (Approximations have been made to the nearest thousandth.) A) p = 0.040 B) p = 0.002 C) p = 0.998 D) None of these Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in mean GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that all conditions for a randomization test have been satisfied.
5) State the null and alternative hypotheses and also the value of the test statistic for testing whether the mean GPA for female college athletes is greater than that for male college athletes. Answer: H0: The typical GPA of college women athletes is the same as that of college men athletes. Ha: The typical GPA of college women athletes is greater than that of college men athletes. The Test Statistic is 0.420.
6) Explain how you would use the histogram to get an approximate p-value and state your p-value estimation. Answer: To approximate the p-value using the histogram, draw a vertical line at about 0.420, then approximate the proportion of observations to the right of the vertical line. From the histogram it appears that there are two observations to the right of 0.420 out of the 1000 shuffles, so the p-value is approximately 2/1000 = 0.002.
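The shuffling procedure described in this objective can be sketched in a few lines. The following is a minimal illustration, not the textbook's own software: the function name `randomization_p_value`, the seed, and the small GPA lists are assumptions invented for the example (the original study used 107 and 115 athletes, whose data are not given here).

```python
import random


def randomization_p_value(group_a, group_b, n_shuffles=1000, seed=0):
    """Estimate a one-sided p-value for mean(group_a) - mean(group_b).

    Returns the observed difference and the proportion of shuffles whose
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = sum(group_a) / len(group_a) - sum(group_b) / len(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign the group labels at random
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if mean_a - mean_b >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / n_shuffles


# Hypothetical GPAs (illustrative values only).
female = [3.4, 3.1, 3.8, 3.6, 3.2, 3.9, 3.5, 3.7]
male = [3.0, 2.8, 3.3, 3.1, 2.9, 3.4, 3.2, 2.7]

observed_diff, p_value = randomization_p_value(female, male)
print(observed_diff, p_value)
```

Counting the shuffled differences at or beyond the observed one is the numerical counterpart of drawing the vertical line on the histogram and estimating the area to its right.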
2 Perform a Randomization Test MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. Math self-efficacy can be defined as one's belief in his or her own ability to perform mathematical tasks. A college math professor wishes to find out if her students' math self-efficacy matches reality. To do this she gives a math quiz then asks her students to rate their level of confidence in how well they did on the quiz. She plans to test whether those who had little confidence that they did well on the quiz actually performed worse than those who had a high level of confidence that they did well on the quiz. Shown below is the approximate sampling distribution of the difference in mean quiz scores. The table below shows the summary statistics for the two groups. Assume that all conditions for a randomization test have been satisfied.
1) State the null and alternative hypotheses and also the value of the test statistic for the professor's randomization test. A) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is 5.4. B) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is -5.4. C) H0: The typical quiz score for those with high confidence is greater than that of those with low confidence. Ha: The typical quiz score for those with high confidence is the same as that of those with low confidence. The Test Statistic is 5.4. D) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is 1.2. Answer: A
2) Use the histogram to roughly estimate the p-value. Choose the answer that most closely approximates the p-value. (Approximations have been made to the nearest hundredth.) A) p = 0.00 B) p = 0.40 C) p = 0.90 D) None of these Answer: B
3) Carry out the randomization test. What is the professor's conclusion? Are differences in mean quiz scores due to chance? A) Fail to reject H0. The professor should conclude that the typical quiz score for those with high confidence is greater than that of those with low confidence. The students' self-efficacy matches reality. B) Reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students' self-efficacy does not match reality. C) Fail to reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students' self-efficacy does not match reality. D) Reject H0. The professor should conclude that the typical quiz score for those with high confidence is greater than that of those with low confidence. The students' self-efficacy matches reality. Answer: C
Use the following information to answer the question. Math self-efficacy can be defined as one's belief in his or her own ability to perform mathematical tasks. A college math professor wishes to find out if her female students' math self-efficacy matches reality. To do this she gives a math quiz to the female students then asks them to rate their level of confidence in how well they did on the quiz. She plans to test whether those who had little confidence that they did well on the quiz actually performed worse than those who had a high level of confidence that they did well on the quiz. Shown below is the approximate sampling distribution of the difference in mean quiz scores. The table below shows the summary statistics for the two groups. Assume that all conditions for a randomization test have been satisfied.
4) State the null and alternative hypotheses and also the value of the test statistic for the professor's randomization test. A) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is 15. B) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is -15. C) H0: The typical quiz score for those with high confidence is greater than that of those with low confidence. Ha: The typical quiz score for those with high confidence is the same as that of those with low confidence. The Test Statistic is -16.3. D) H0: The typical quiz score for those with high confidence is the same as that of those with low confidence. Ha: The typical quiz score for those with high confidence is greater than that of those with low confidence. The Test Statistic is -1.2. Answer: A
5) Use the histogram to roughly estimate the p-value. Choose the answer that most closely approximates the p-value. (Approximations have been made to the nearest hundredth.) A) p = 0.90 B) p = 0.40 C) p = 0.00 D) None of these Answer: A
6) Carry out the randomization test. What is the professor's conclusion? Are differences in mean quiz scores due to chance? A) Fail to reject H0. The professor should conclude that the typical quiz score for those with high confidence is less than that of those with low confidence. The students' self-efficacy matches reality. B) Reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students' self-efficacy does not match reality. C) Fail to reject H0. The professor should conclude that there is no difference in mean quiz scores for those with high confidence and those with low confidence. The students' self-efficacy does not match reality. D) Reject H0. The professor should conclude that the typical quiz score for those with high confidence is greater than that of those with low confidence. The students' self-efficacy matches reality. Answer: D
Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in mean GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that all conditions for a randomization test have been satisfied.
7) Carry out the randomization test. Do female college athletes have higher GPAs than male college athletes? Are differences in mean GPAs due to chance? A) Fail to reject H0 . One can conclude the female college athletes typically have higher GPAs than male college athletes. B) Reject H0 . One can conclude that there is no difference in mean GPAs between female college athletes and male college athletes. C) Fail to reject H0 . One can conclude that there is no difference in mean GPAs between female college athletes and male college athletes. D) Reject H0 . One can conclude the female college athletes typically have higher GPAs than male college athletes. Answer: D
Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in median GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that conditions for a randomization test have been satisfied.
8) Carry out the randomization test. Do female college athletes have higher GPAs than male college athletes? Are differences in median GPAs due to chance? A) Fail to reject H0 . One can conclude the female college athletes typically have higher GPAs than male college athletes. B) Reject H0 . One can conclude that there is no difference in mean GPAs between female college athletes and male college athletes. C) Fail to reject H0 . One can conclude that there is no difference in mean GPAs between female college athletes and male college athletes. D) Reject H0 . One can conclude the female college athletes typically have higher GPAs than male college athletes. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Do female college athletes tend to have higher GPAs than male college athletes? To answer this question, the GPAs of 107 randomly selected college female athletes and 115 randomly selected college male athletes were recorded. The table below shows the summary statistics for males and females. Shown below is the approximate sampling distribution of the differences in mean GPAs, obtained by randomly shuffling the gender labels in the data set 1000 times. Assume that all conditions for a randomization test have been satisfied.
9) Complete the randomization test by stating the proper decision regarding the null hypothesis and answering the question. Are differences in mean GPAs due to chance? Answer: Reject H0 . We can conclude that typical GPA for female college athletes is greater than that of male college athletes.
Ch. 13 Inference Without Normality Answer Key 13.1 Transforming Data 1 Understand Inference and Nonparametric Statistics 1) B 2) A 3) Sample size is small, the population is not normal, the data is strongly skewed. 4) A 5) B 6) C 2 Understand QQ Plots 1) B 2) A 3) B 4) D 5) B 6) B 7) The QQ plot is a tool that can help you determine whether the sample was drawn from a normal population. 8) A goes with D, B goes with C. 9) A log transform might be useful for the data shown in histogram B because it is right skewed. 3 Perform Log Transforms 1) C 2) B 3) B 4) C 4 Find Geometric Means 1) B 2) D 3) D 4) D 5) Median: 1055.0, Geometric Mean: 1625.3, Mean: 7342.5 6) B 7) C 8) B 5 Find and Analyze Log-Transformed Data 1) A 2) C 3) B 4) A 5) A 6) B 7) C 8) A 9) The width of the untransformed data is 327.7, the width of the transformed data is 308.51. The width of the confidence interval for the transformed data is narrower. 10) The confidence interval for the geometric mean is more appropriate because the log-transformed data had a symmetric distribution. The confidence interval for the geometric mean will be more accurate and precise.
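For readers checking the log-transform answers by hand, the sketch below shows one way to obtain a geometric mean and a t-based confidence interval for it: take logs, build the interval on the log scale, then back-transform. The function name `geometric_mean_interval` and the sample values are hypothetical; the critical value t* must be supplied by the user (2.262 is the 95% value for 9 degrees of freedom).

```python
import math
import statistics


def geometric_mean_interval(values, t_star):
    """Return (geometric mean, lower, upper) from a t-interval on the logs."""
    logs = [math.log(v) for v in values]
    mean_log = statistics.mean(logs)
    # Standard error of the mean of the log-transformed data.
    se_log = statistics.stdev(logs) / math.sqrt(len(logs))
    lower = math.exp(mean_log - t_star * se_log)
    upper = math.exp(mean_log + t_star * se_log)
    return math.exp(mean_log), lower, upper


# Hypothetical right-skewed amounts (illustrative values only).
amounts = [120, 150, 180, 210, 260, 340, 480, 750, 1400, 5200]

# n = 10, so df = 9 and the 95% critical value is t* = 2.262.
gm, lower, upper = geometric_mean_interval(amounts, t_star=2.262)
print(gm, lower, upper)
```

For right-skewed data like these, the geometric mean falls below the ordinary mean, consistent with the ordering of the geometric mean and mean reported in the key.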
13.2 The Sign Test for Paired Data 1 Understand the Sign Test and Perform it if Appropriate 1) D 2) B 3) A 4) B
5) C 6) B 7) D 8) A 9) D 10) A 11) C 12) B 13) A 14) C 15) D 16) D 17) Sample data are matched pairs and the normal conditions of the paired t-test are not satisfied. Also, the sample size is too small. 18) H0 : The median difference in deposited phosphorus is 0. Ha: The median difference in deposited phosphorus is not 0. 19) The sampling distribution for S is binomial with n = 24 and p = 0.5 . For a two-tailed test, find the probability that S will be as extreme as or more extreme than 16 or 8. 20) Test statistic S = 8 and p-value = 0.1516. 21) Fail to reject H0 . There was no difference in the median amounts of phosphorus deposited between the two rivers.
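The p-value quoted in answers 19 and 20 can be reproduced directly from the binomial distribution with n = 24 and p = 0.5. The sketch below is a minimal check using only the Python standard library; the function name `sign_test_two_tailed_p` is an assumption chosen for this example.

```python
from math import comb


def sign_test_two_tailed_p(n, s):
    """Two-tailed sign-test p-value for n non-zero differences and statistic S = s.

    Under H0, S follows a binomial distribution with parameters n and 0.5.
    """
    k = min(s, n - s)  # work with the smaller tail; the distribution is symmetric
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


p_value = sign_test_two_tailed_p(24, 8)
print(round(p_value, 4))  # 0.1516
```

Because the binomial distribution with p = 0.5 is symmetric, S = 8 and S = 16 give the same two-tailed p-value, which is why the key treats them as equally extreme.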
13.3 Mann-Whitney Test for Two Independent Groups 1 Perform a Mann-Whitney Test 1) D 2) A 3) B 4) D 5) B 6) C 7) D 8) A 9) C 10) B 11) C 12) D 13) A 14) C 15) C 16) The five conditions are (1) there are two independent groups, (2) the response variable is numerical and continuous, (3) each group is a random sample from some representative population, (4) the observations are independent of one another, and (5) the population distribution of each group has the same shape. 17) The sample sizes are small and the women's data is skewed. 18) Because the data are from independent samples, the variable is continuous and the population distribution of each group has the same shape, the Mann-Whitney test would be the best method. The t-test is not appropriate because the population distributions are not normal. 19) H0: The median travel times for the two routes are equal. Ha: The median travel times for the two routes are not equal. 20) Fail to reject H0. There was no difference in the median travel times for the two routes.
13.4 Randomization Tests 1 Understand the Procedure Behind Randomization Tests 1) A 2) C 3) B 4) B
5) H0 : The typical GPA of college women athletes is the same as that of college men athletes. Ha: The typical GPA of college women athletes is greater than that of college men athletes. The Test Statistic is 0.420. 6) To approximate the p-value using the histogram draw a vertical line at about 0.420 then approximate the proportion of observations to the right of the vertical line. From the histogram it appears that there are two observations to the right of 0.420 so the p-value is approximately 0.002. 2 Perform a Randomization Test 1) A 2) B 3) C 4) A 5) A 6) D 7) D 8) D 9) Reject H0 . We can conclude that typical GPA for female college athletes is greater than that of male college athletes.
Ch. 14 Inference for Regression 14.1 The Linear Regression Model 1 Identify Factors that Might Contribute to the Random Component of a Linear Regression Model MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Which of the following is not true about residuals? If all the statements are true choose (d). A) The residuals can be described as the excess, due to randomness, that doesn't fit on the line. B) The residuals can be determined by finding the difference between the actual observed value and the predicted dependent variable. C) The residuals are the result of natural variation in the independent variable. D) All of these are true. Answer: C
2) Which of the following is not true about residuals? If all the statements are true choose (d). A) The residuals can be described as the excess, due to randomness, that doesn't fit on the line. B) The residuals can be determined by finding the difference between the actual observed value and the predicted dependent variable. C) The residuals are the result of natural variation in the dependent variable. D) All of these are true. Answer: D
3) Biologists studying the relationship between the number of Round Goby (an invasive prey fish) and the number of salmon eggs in streams believe that the deterministic component of the relationship is a straight line. A scatterplot shows that even though the general trend is linear, the points do not fall exactly on a straight line. Which of the following factors might account for the random component of this regression model? A) Different size salmon might affect the number of eggs laid. B) Variation in the size of the Goby might cause variation in the number of salmon eggs consumed. C) Variability might appear in the instrument used to count salmon eggs. D) All of these are possible factors that could account for the random component of the regression model. Answer: D
4) Environmental biologists studying the relationship between the number of owls in a forested region and the number of field mice in the region believe that the deterministic component of the relationship is a straight line. A scatterplot shows that even though the general trend is linear, the points do not fall exactly on a straight line. Which of the following factors might account for the random component of this regression model? A) Different size owls might affect the number of mice eaten. B) Variability might appear in the instrument used to count mice. C) Variation in the size of the mice might cause variation in the number of mice consumed by owls. D) All of these are possible factors that could account for the random component of the regression model. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
5) Many universities use ACT test scores as one criterion for admission. Admission counselors believe that the deterministic part of a regression model predicting college GPA at the end of the first year from the ACT score is a straight line. What factors might contribute to the random component? In other words, why might a student's end of year GPA not fall exactly on this line? Answer: Answers will vary, but possible answers could include (1) some students are poor test takers and thus the ACT may not accurately reflect the student's actual ability and (2) the student may have taken courses in his first year that did not interest him so his GPA may not reflect his ability.
2 Calculate Residuals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. A salesman says he can predict the sales price of used Toyota Corollas by using the equation Sale Price = 14300 - 960 Age 1) Using the equation above, find the predicted value for a used Toyota Corolla that is 6 years old. A) $8540 B) $6950 C) $13340 D) Cannot be determined from the information given. Answer: A 2) Using the information above, find the value of the residual for a used Toyota Corolla that is 7 years old and has an actual price of $6950. A) $630 B) $0 C) -$630 D) =$6390 Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 3) A salesman says he can predict the sales price of a used Ford Focus by using the equation Sale Price = 15,700 – 850 Age Complete the table below by calculating the residuals for the following small data set.
Answer: (completed residual table not reproduced)
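The arithmetic behind questions 1 and 2 of this objective can be checked with a short sketch. It uses the salesman's Toyota Corolla equation, Sale Price = 14300 - 960 Age, and the usual definition of a residual as the actual value minus the predicted value; the function names `predicted_price` and `residual` are hypothetical.

```python
def predicted_price(age_years, intercept=14300, slope=-960):
    """Predicted sale price from the line: intercept + slope * age."""
    return intercept + slope * age_years


def residual(actual_price, age_years):
    """Residual = actual observed value minus predicted value."""
    return actual_price - predicted_price(age_years)


print(predicted_price(6))  # 8540
print(residual(6950, 7))   # -630
```

A negative residual means the car sold for less than the line predicts, which matches answer C in question 2.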
3 Use Residual and/or QQ Plots to Determine if a Linear Regression Model is Appropriate MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Suppose that you were presented with data showing the association between days absent from class and final class average. Which of the following residual plots below suggests that the association between number of days absent from class and final class average is linear? A) B) C) D) (four residual plots, not reproduced) Answer: A
2) Which of the following statements is not true about the constant standard deviation condition of the linear regression model? A) A residual plot can highlight the existence of a nonconstant standard deviation even when it is hard to see in the original scatterplot. B) A QQ plot can help you determine whether the constant standard deviation condition holds. C) A constant standard deviation means that the vertical spread of the y-values about the line is the same across the entire line. D) A fan shape in a residual plot indicates that the constant standard deviation condition does not hold. Answer: B
3) Which of the following statements is not true about the constant standard deviation condition of the linear regression model? A) A residual plot can highlight the existence of a nonconstant standard deviation even when it is hard to see in the original scatterplot. B) A constant standard deviation means that the vertical spread of the y-values about the line is the same across the entire line. C) A fan shape in a residual plot indicates that the constant standard deviation condition does not hold. D) All of these are true about the constant standard deviation condition. Answer: D
4) Choose the condition of the linear regression model that cannot be verified by examining the residuals. Choose (d) if all the conditions given can be verified by examining the residuals. A) Linearity B) Constant Standard Deviation C) Normality of errors D) All of these can be verified by examining the residuals Answer: D
Use the following information to answer the question. Below is the scatterplot showing the association between raw material (in tons) put into an injection molding machine each day (x), and the amount of scrap plastic (in tons) that is collected from the machine every 4 weeks (y). The residual plot of the data is also shown along with a QQ plot of the residuals.
5) Choose the statement that best describes whether the condition for linearity does or does not hold for the linear regression model. A) The QQ plot mostly follows a straight line trend--the QQ plot is consistent with the claim of linearity. B) The residual plot shows no trend--the residual plot is consistent with the claim of linearity. C) The residual plot does not display a fan shape--the residual plot is consistent with the claim of linearity. D) The residual plot shows a horizontal trend--the residual plot is not consistent with the claim of linearity. Answer: B
6) Choose the statement that best describes whether the condition for constant standard deviation does or does not hold for the linear regression model. A) The QQ plot mostly follows a straight line--the QQ plot is consistent with the claim of constant standard deviation. B) The scatter plot shows a linear trend--the scatter plot is not consistent with the claim of constant standard deviation. C) The residual plot does not display a fan shape--the residual plot is consistent with the claim of constant standard deviation. D) The residual plot shows no trend--the residual plot is not consistent with the claim of constant standard deviation. Answer: C 7) Choose the statement that best describes whether the condition for normality of errors does or does not hold for the linear regression model. A) The QQ plot mostly follows a straight line, therefore the normality condition is satisfied. B) The residual plot shows no trend, therefore the normality condition is satisfied. C) The residual plot does not display a fan shape, therefore the normality condition is satisfied. D) The residual plot shows no trend, therefore the normality condition is not satisfied. Answer: A
Use the following information to answer the question. Below is the scatterplot showing the association between miles driven in a semi truck (x), and the amount of tread wear on the tires ( y). The residual plot of the data is also shown along with a QQ plot of the residuals.
8) Based on the plots provided, choose the statement that best describes whether the condition for linearity does or does not hold for the linear regression model. A) The residual plot shows no trend--the residual plot is consistent with the claim of linearity. B) The residual plot shows a horizontal trend--the residual plot is not consistent with the claim of linearity. C) The QQ plot mostly follows a straight line trend--the QQ plot is consistent with the claim of linearity. D) The residual plot does not display a fan shape--the residual plot is consistent with the claim of linearity. Answer: A 9) Based on the plots provided, choose the statement that best describes whether the condition for constant standard deviation does or does not hold for the linear regression model. A) The residual plot does not display a fan shape--the residual plot is consistent with the claim of constant standard deviation. B) The QQ plot mostly follows a straight line--the QQ plot is consistent with the claim of constant standard deviation. C) The residual plot shows no trend--the residual plot is not consistent with the claim of constant standard deviation. D) The scatterplot shows a linear trend--the scatterplot is not consistent with the claim of constant standard deviation. Answer: A
10) Choose the statement that best describes whether the condition for normality of errors does or does not hold for the linear regression model. A) The residual plot shows no trend, therefore the normality condition is satisfied. B) The residual plot does not display a fan shape, therefore the normality condition is satisfied. C) The QQ plot mostly follows a straight line, therefore the normality condition is satisfied. D) The residual mostly follows a horizontal line which would have a slope of zero, therefore the normality condition is not satisfied. Answer: C Solve the problem. 11) Suppose that you were presented with data showing the association between number of days absent from class and score on the first exam for a classroom of students. Which of the following residual plots below suggests that the association between number of days absent from class and score on the first exam is not linear, but curved? A)
B)
C)
D)
Answer: D 12) Suppose that you were presented with data showing the association between days absent from class and final class average. Which of the following residual plots below suggests that both linearity and the constant standard deviation conditions hold? A)
B)
C)
D)
Answer: A
Below is the scatterplot showing the relationship between the age of an internet user and the amount of time spent browsing the internet per week (in minutes). The residual plot is also shown along with the QQ plot of the residuals.
13) Choose the statement that best describes whether the condition for Normality of errors does or does not hold for the linear regression model. A) The residual plot displays a fan shape; therefore the Normality condition is not satisfied. B) The scatterplot shows a negative trend; therefore the Normality condition is satisfied. C) The residual plot shows no trend; therefore the Normality condition is not satisfied. D) The QQ plot mostly follows a straight line; therefore the Normality condition is satisfied. Answer: D 14) Choose the statement that best describes whether the condition of constant standard deviation does or does not hold for the linear regression model. A) The residual plot displays a fan shape – the residual plot is not consistent with the claim of constant standard deviation. B) The scatterplot shows a linear trend – the scatterplot is consistent with the claim of constant standard deviation. C) The QQ plot mostly follows a straight line – the QQ plot is not consistent with the claim of constant standard deviation. D) The residual plot displays a fan shape – the residual plot is consistent with the claim of constant standard deviation. Answer: A 15) Choose the statement that best describes whether the condition for linearity does or does not hold for the linear regression model. A) The QQ plot mostly follows a straight line – the QQ plot is not consistent with the claim of linearity. B) The QQ plot mostly follows a straight line – the QQ plot is consistent with the claim of linearity. C) The residual plot shows no trend – the residual plot is consistent with the claim of linearity. D) The residual plot shows a horizontal trend – the residual plot is not consistent with the claim of linearity. Answer: C
Solve the problem. 16) Choose the statement that best describes whether the condition for Normality of errors does or does not hold for the linear regression model. A) The residual plot displays a fan shape; therefore the Normality condition is not satisfied. B) The scatterplot shows a negative trend; therefore the Normality condition is satisfied. C) The QQ plot does not follow a straight line; therefore the Normality condition is not satisfied. D) The QQ plot mostly follows a straight line; therefore the Normality condition is satisfied. Answer: D 17) Which of the following statements is true? A) The residual plot displays a fan shape; therefore we can conclude that the claim of constant standard deviation is not met. B) The scatterplot shows a linear trend; therefore we can conclude that the claim of constant standard deviation is met. C) The QQ plot mostly follows a straight line; therefore we can conclude that the claim of constant standard deviation is met. D) The residual plot displays a fan shape; therefore we can conclude that the claim of constant standard deviation is met. Answer: A 18) Which of the following statements is true? A) The QQ plot mostly follows a straight line; therefore we can conclude that the claim of linearity is not met. B) The QQ plot does not follow a straight line; therefore we can conclude that the claim of linearity is not met. C) The residual plot shows no slope; therefore we can conclude that the claim of linearity is met. D) The residual plot displays a fan shape; therefore we can conclude that the claim of linearity is not met. Answer: C SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Suppose that you were presented with data showing the association between days absent from class and final class average.
19) Which of the residual plots above would suggest that the association between number of days absent from class and final class average meets both the linearity and the constant standard deviation conditions required by the linear model? Explain. Answer: Residual plot (a) indicates that the relationship is linear because the plot shows no trend and a slope of zero. Because plot (a) is formless, it also indicates constant standard deviation. 20) Which of the residual plots above would suggest that the condition for constant standard deviation might not be satisfied? Explain. Answer: Residual plot (d) indicates that the condition for a constant standard deviation may not be satisfied because the plot shows a fan shape.
Below is the scatterplot showing the association between the number of workers on an assembly team, and the number of parts assembled in an 8-hour shift. The residual plot of the data is also shown along with a QQ plot of the residuals.
21) Use the plot(s) above to explain whether the condition for normality of errors is satisfied. Answer: The QQ plot shows a linear trend - the QQ plot is consistent with the claim of normality of errors. 22) Use the plot(s) above to explain whether the condition for constant standard deviation is satisfied. Answer: The residual plot shows no pattern or is formless - the residual plot is consistent with the claim of constant standard deviation. 23) Use the plot(s) above to explain whether the condition for linearity is satisfied. Answer: The residual plot shows no slope - the residual plot is consistent with the claim of linearity. 4 Understand Concepts Related to Linear Regression MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) Which of the following statements is true about residuals? If all of the statements are true choose (d). A) The residual can be computed by finding the difference between the deterministic component and the random component of the regression model. B) The residual can be found by finding the difference between the actual observed value and the predicted value of the dependent variable. C) The residual can be found by finding the difference between the actual observed value and the predicted value of the independent variable. D) All of the above are true. Answer: B
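As a worked illustration of question 1 above (a residual is the observed value of the dependent variable minus its predicted value), the following minimal sketch fits a least-squares line and computes the residuals. The function names and the data values are hypothetical, invented only for this example; they are not taken from the plots in this section.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for the line y = b0 + b1 * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def residuals(xs, ys):
    """Observed y minus predicted y for each data point."""
    b0, b1 = fit_line(xs, ys)
    return [y - (b0 + b1 * x) for x, y in zip(xs, ys)]


# Hypothetical data (assumption): team size and parts assembled per shift.
workers = [1, 2, 3, 4, 5]
parts = [12, 19, 31, 38, 50]

res = residuals(workers, parts)
print(res)
```

With a least-squares fit that includes an intercept, the residuals always sum to zero (up to rounding), which is a quick sanity check on the arithmetic.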
2) Which of the following statements is true about checking for the constant standard deviation condition? A) If the points of the QQ plot are close to the straight line, then we can conclude the constant standard deviation condition is met. B) If the plot of the residuals against time shows no trend, then we can conclude the constant standard deviation condition is met. C) If the residual plot shows a fan shape, then we can conclude the constant standard deviation condition is met. D) If the residual plot shows no features such as a trend or a fan shape, then we can conclude the constant standard deviation condition is met. Answer: D 3) Which of the following is not true about residuals? Assume the conditions of the linear model hold. Or state that all the statements are true. A) The residuals can be described as the distances the points are from the line due to randomness. B) The residuals can be determined by finding the difference between the actual observed value and the predicted dependent variable. C) The residuals are the result of natural variation in the dependent variable. D) All of these are true. Answer: D 4) Which of the following is not a condition of the linear regression model? A) Linearity B) Residuals must be fan-shaped C) Constant Standard Deviation of errors D) Normality of errors Answer: B 5) Which of the following statements is not true about the constant standard deviation condition of the linear regression model? A) A residual plot can highlight the existence of a nonconstant standard deviation even when it is hard to see in the original scatterplot. B) A constant standard deviation means that the vertical spread of the y-values about the line is the same across the entire line. C) A fan shape in a residual plot indicates that the constant standard deviation condition does not hold. D) All of these are true about the constant standard deviation condition. Answer: D
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. 6) Explain how a residual plot can be useful in determining whether the condition for linearity and the condition for a constant standard deviation have been satisfied. Answer: If the original data is linear, the residual plot should have no slope or trend. If the residual plot shows a fan shape, then the condition that y-values have a constant standard deviation may not be satisfied.
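The fan-shape idea in the answer above can be sketched numerically. The snippet below is a rough heuristic, not a formal test and not a method from this test bank: it compares the standard deviation of the residuals in the upper half of the x-values with that in the lower half. The function name `spread_ratio` and both sets of residuals are hypothetical, made up for illustration.

```python
import statistics


def spread_ratio(xs, res):
    """Residual SD in the upper-x half divided by residual SD in the lower-x half.

    A ratio far from 1 hints at a fan shape (nonconstant standard deviation).
    This is only an informal screen; a residual plot should still be inspected.
    """
    pairs = sorted(zip(xs, res))          # order the residuals by x
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(high) / statistics.stdev(low)


xs = [1, 2, 3, 4, 5, 6, 7, 8]

# Hypothetical residuals whose spread grows with x (fan shape).
fan = [0.1, -0.2, 0.3, -0.4, 1.5, -2.0, 2.5, -3.0]

# Hypothetical residuals with roughly even spread (formless plot).
even = [0.5, -0.6, 0.4, -0.5, 0.6, -0.4, 0.5, -0.5]

print(spread_ratio(xs, fan))
print(spread_ratio(xs, even))
```

The first ratio comes out much larger than 1, while the second stays close to 1, which mirrors what a fan-shaped and a formless residual plot would show.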
14.2 Using the Linear Model 1 Perform Hypothesis Tests of the Slope and Intercept MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A random sample of 30 married couples were asked to report the height of their spouse and the height of their biological parent of the same gender as their spouse. The output of a regression analysis for predicting spouse height from parent height is shown. Assume that the conditions of the linear regression model are satisfied.
1) Test the hypothesis that the slope is zero (significance level is 0.05), then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion. A) Reject H0 . There is enough evidence to reject a slope of zero which is an indication that a linear association exists between parent height and spouse height. B) Fail to reject H0 . We don't have enough evidence to reject a slope of zero which is an indication that no linear association exists between parent height and spouse height. C) Reject H0 . We don't have enough evidence to reject a slope of zero which is an indication that no linear association exists between parent height and spouse height. D) Fail to reject H0 . There is enough evidence to reject a slope of zero which is an indication that a linear association exists between parent height and spouse height. Answer: B
Use the following information to answer the question. A random sample of 30 couples who were also new home owners were asked to report the cost of their first house and their combined age when they married. The output of a regression analysis for predicting home cost from combined age is shown. Assume that the conditions of the linear regression model are satisfied.
2) Test the hypothesis that the slope is zero (significance level is 0.05), then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion. A) Reject H0 . There is enough evidence to reject a slope of zero which is an indication that a linear association exists between combined age of the couple and home cost. B) Fail to reject H0 . We don't have enough evidence to reject a slope of zero which is an indication that no linear association exists between combined age of the couple and home cost. C) Reject H0 . We don't have enough evidence to reject a slope of zero which is an indication that no linear association exists between combined age of the couple and home cost. D) Fail to reject H0 . There is enough evidence to reject a slope of zero which is an indication that a linear association exists between combined age of the couple and home cost. Answer: C Use the following information to answer the question. A statistics professor is interested in learning whether there is a positive association between number of posts by online students on a message board and the final class average in an online statistics course. The computer output below shows the results from a regression model in which the final class average was predicted by the number of message board posts. Assume that the conditions of the linear regression model are satisfied.
3) Choose the correct null and alternative hypothesis to test whether there is an association between final class average and number of message board posts. A) H0 : There is no linear association between the number of message board posts and the final class average. Ha : There is a positive linear association between the number of message board posts and the final class average. B) H0 : There is a linear association between the number of message board posts and the final class average. Ha : There is no linear association between the number of message board posts and the final class average. C) H0 : The correlation is positive. Ha : The correlation is zero. D) None of these. Answer: A
4) Choose the correct observed value of the test statistic and the p-value. Round to the nearest thousandth. A) t = 3.708, p = 0.005 B) t = 0.002, p = 3.707 C) t = 0.777, p = 0.002 D) t = 3.708, p = 0.002 Answer: D 5) Choose the correct decision regarding the null hypothesis and the correct conclusion. State your conclusion using a significance level of 5%. A) Fail to reject H0 . There is enough evidence to conclude that the final class average is positively associated with the number of message board posts. B) Reject H0 . There is enough evidence to conclude that the final class average is positively associated with the number of message board posts. C) Fail to reject H0 . There is not enough evidence to conclude that the final class average is positively associated with the number of message board posts. D) Reject H0 . There is not enough evidence to conclude that the final class average is positively associated with the number of message board posts. Answer: B Use the following information to answer the question. A humanities professor is interested in learning whether there is a positive association between average online homework scores and the final class average in an online humanities course. The computer output below shows the results from a regression model in which the final class average was predicted by the average online homework score. Assume that the conditions of the linear regression model are satisfied.
6) Choose the correct null and alternative hypothesis to test whether there is an association between final class average and average online homework scores. A) H0 : There is no linear association between the final class average and the average online homework score. Ha : There is a positive linear association between the final class average and the average online homework score. B) H0 : There is a linear association between the final class average and the average online homework score. Ha : There is no linear association between the final class average and the average online homework score. C) H0 : The correlation is positive. Ha : The correlation is zero. D) None of these. Answer: A 7) Choose the correct observed value of the test statistic and the p-value. Round to the nearest thousandth. A) t = 4.345, p = 0.005 B) t = 8.286, p = 0.000 C) t = 8.286, p = 2.090 D) t = 1.112, p = 0.000 Answer: B
8) Choose the correct decision regarding the null hypothesis and the correct conclusion. State your conclusion using a significance level of 5%. A) Fail to reject H0 . There is enough evidence to conclude that the final class average is positively associated with the average online homework score. B) Reject H0 . There is not enough evidence to conclude that the final class average is positively associated with the average online homework score. C) Reject H0 . There is enough evidence to conclude that the final class average is positively associated with the average online homework score. D) Fail to reject H0 . There is not enough evidence to conclude that the final class average is positively associated with the average online homework score. Answer: C Provide an appropriate response. 9) Wassamatta University offers supplemental instruction (SI) for introductory chemistry students three times a week. The table below shows the number of SI visits during the semester for a sample of students along with each student's final class average. Use technology to compute a p-value and use it to determine whether the regression equation is useful for making predictions about the benefit of attending SI. Test at the 5% significance level with the hypotheses H0 : β1 = 0 and Ha : β1 ≠ 0.
# of SI visits   Final class average   # of SI visits   Final class average
2                81                    8                81
8                85                    10               71
10               76                    7                66
9                76                    7                88
15               72                    5                63
7                72                    4                62
5                42                    4                61
4                59                    11               70
A) p = 0.2529. Since p > α, do not reject the null hypothesis H0 : β1 = 0. The regression equation is not useful for making predictions. B) p = 0.2633. Since p > α, reject the null hypothesis H0 : β1 = 0. The regression equation is useful for making predictions. C) p = 0.04229. Since p < α, reject the null hypothesis H0 : β1 = 0. The regression equation is useful for making predictions. D) p = 0.0325. Since p < α, do not reject the null hypothesis H0 : β1 = 0. The regression equation is not useful for making predictions. Answer: A
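Question 9 asks for a p-value computed with technology. One way this could be done is sketched below in plain Python, assuming the 16 (visits, average) pairs are read from the table as two side-by-side column pairs. The helper names (`t_pdf`, `two_sided_p`, `slope_test`) are hypothetical, and the Student t tail area is approximated by Simpson's rule rather than taken from a statistics library.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on the density over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 <= T <= |t|)
    return 1 - 2 * area


def slope_test(xs, ys):
    """Slope estimate, t statistic and two-sided p-value for H0: beta1 = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    se_b1 = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    t = b1 / se_b1
    return b1, t, two_sided_p(t, n - 2)


# Data from question 9 (left column pair first, then right column pair).
visits = [2, 8, 10, 9, 15, 7, 5, 4, 8, 10, 7, 7, 5, 4, 4, 11]
averages = [81, 85, 76, 76, 72, 72, 42, 59, 81, 71, 66, 88, 63, 62, 61, 70]

b1, t, p = slope_test(visits, averages)
print(round(b1, 4), round(t, 3), round(p, 4))
```

Under this reading of the table the p-value comes out near 0.25, which is larger than α = 0.05 and agrees with answer choice A.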
10) The maintenance costs incurred over the past year for a particular make, model, and year of automobile are given below, along with each car's end-of-year mileage. Use technology to compute a p-value and use it to determine whether the regression equation is useful for making predictions about maintenance costs for this automobile model. Test at the 1% significance level with the hypotheses H0 : β1 = 0 and Ha : β1 ≠ 0. Make sure to eliminate any outliers and/or influential points from the data.
End-of-year mileage (thousands of miles)   Last year's maintenance cost (dollars)
23.1                                       119
58.2                                       198
44.6                                       161
50.2                                       270
27.5                                       122
38.9                                       151
38.1                                       148
36.8                                       133
A) p = 0.00014. Since p < α, reject the null hypothesis. The regression equation is useful for making maintenance cost predictions. B) p = 0.02317. Since p > α, do not reject the null hypothesis. The regression equation is not useful for making maintenance cost predictions. C) p = 0.00014. Since p < α, do not reject the null hypothesis. The regression equation is not useful for making maintenance cost predictions. D) p = 0.02317. Since p > α, reject the null hypothesis. The regression equation is useful for making maintenance cost predictions. Answer: A 11) The table below gives the career free-throw percentage and the player height for a sample of NBA basketball players, both past and present. Use technology to compute a p-value and use it to determine whether the regression equation is useful for making predictions about free-throw percentage. Test at the 10% significance level with the hypotheses H0 : β1 = 0 and Ha : β1 ≠ 0.
Height (meters)   Career Free-throw %   Height (meters)   Career Free-throw %
1.83              76.0                  2.18              72.1
2.16              54.2                  2.08              69.2
2.13              71.0                  2.01              58.4
2.26              81.1                  1.60              82.7
2.11              75.6                  2.06              88.6
2.01              78.2                  2.03              84.8
1.83              90.4                  2.21              64.9
1.98              82.1                  2.29              56.1
A) p = 0.04622. Since p < α, reject the null hypothesis. The regression equation is useful for making free throw percentage predictions. B) p = 0.02311. Since p < α, reject the null hypothesis. The regression equation is useful for making free throw percentage predictions. C) p = 0.04622. Since p < α, do not reject the null hypothesis. The regression equation is not useful for making free throw percentage predictions. D) p = 0.02311. Since p < α, do not reject the null hypothesis. The regression equation is not useful for making free throw percentage predictions. Answer: A
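Question 10 asks for outliers to be removed before testing. The sketch below shows one possible workflow on the mileage data: fit once, drop the single point with the largest absolute residual, and fit again. Dropping exactly one point by largest residual is an assumption made for this illustration, not a rule stated in the question, and the helper names are hypothetical. The t tail area is approximated with Simpson's rule.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """P(|T| >= |t|), using Simpson's rule on the density over [0, |t|]."""
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1 - 2 * (total * h / 3)


def fit(xs, ys):
    """Return the residuals and the two-sided p-value for H0: beta1 = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    res = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
    se_b1 = math.sqrt(sum(r * r for r in res) / (n - 2) / sxx)
    return res, two_sided_p(b1 / se_b1, n - 2)


# Data from question 10: mileage (thousands of miles) and cost (dollars).
miles = [23.1, 58.2, 44.6, 50.2, 27.5, 38.9, 38.1, 36.8]
cost = [119, 198, 161, 270, 122, 151, 148, 133]

res_all, p_all = fit(miles, cost)

# Assumption: treat the point with the largest absolute residual as the outlier.
worst = max(range(len(res_all)), key=lambda i: abs(res_all[i]))
miles_clean = miles[:worst] + miles[worst + 1:]
cost_clean = cost[:worst] + cost[worst + 1:]

_, p_clean = fit(miles_clean, cost_clean)
print(miles[worst], cost[worst], round(p_all, 5), round(p_clean, 5))
```

With all eight points the p-value is above the 1% level; after the point (50.2, 270) is dropped it falls well below 1%, which is consistent with answer choice A.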
Solve the problem. 12) Choose the statement that is true about the estimators for the slope and intercept of a regression line. Assume the conditions of the linear model hold. Or state that all statements are true. A) The estimators of the slope and intercept of a regression line are unbiased. B) The sampling distributions of the estimators will follow the Normal distribution if the Normal condition of the error terms is met. C) The sampling distributions of the estimators will be approximately Normal if the Normal condition of the error terms is not met but the sample size is large. D) All of these are true. Answer: D 13) Choose the statement(s) that are not true about the estimators for slope and intercept of a regression line when the conditions of the linear model hold. Or state that each statement is true. A) The sampling distribution of the estimators will follow the Normal model. B) The estimators for the slope and intercept of a regression line are unbiased. C) The sampling distributions of the estimators follow the chi-square model. D) All of these are true. Answer: C 14) A researcher wanted to see if there was a linear relationship between the length of a person's leg and the length of a person's arm. Suppose a regression analysis was run to predict the lengths of legs from the lengths of arms. What would it mean if the intercept were 6 and the slope of the regression line 1? A) On average, legs are 1 inch longer than arms. B) On average, arms are 1 inch longer than legs. C) On average, legs are 6 inches longer than arms. D) On average, arms are 6 inches longer than legs. Answer: C 15) A researcher wanted to see if there was a linear relationship between the heights of sons and the heights of their fathers. Suppose a regression analysis was run to predict the heights of the sons from the heights of their fathers. What would it mean if the intercept were 2 and the slope of the regression line 1? A) On average, sons are 2 inches taller than their fathers. B) On average, fathers are 2 inches taller than their sons. C) On average, sons are 1 inch taller than their fathers. D) On average, fathers are 1 inch taller than their sons. Answer: A
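The reasoning behind questions 14 and 15 can be checked with a few lines of arithmetic: when the slope is 1, the predicted value always exceeds the predictor by exactly the intercept. In the sketch below the function name `predict` and the sample arm lengths and father height are hypothetical values chosen only for illustration.

```python
def predict(x, intercept, slope):
    """Predicted value from the regression line: intercept + slope * x."""
    return intercept + slope * x


# Question 14: leg length predicted from arm length, intercept 6, slope 1.
for arm in (20, 25, 30):                  # hypothetical arm lengths in inches
    leg = predict(arm, intercept=6, slope=1)
    print(arm, leg, leg - arm)            # the difference is 6 every time

# Question 15: son height predicted from father height, intercept 2, slope 1.
print(predict(70, intercept=2, slope=1) - 70)
```

Because the difference does not depend on the predictor, the intercept alone gives the average gap: 6 inches in question 14 and 2 inches in question 15.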
In golf, does it pay to be a long hitter? The average driving distance (the distance the ball is hit in yards) and the percent of greens reached in regulation (the percentage of times that the ball was hit and landed on the green – the area immediately surrounding the hole) were recorded for 50 of the top golfers in the world. The computer output below shows the results from a regression model in which the percentage of greens reached in regulation was predicted by average driving distance. Assume the conditions of the linear regression model are satisfied.
16) Choose the correct null and alternative hypotheses to test whether there is an association between the percent of greens reached in regulation and driving distance. A) H0 : There is an association between the percent of greens reached in regulation and driving distance. Ha: There is no association between the percent of greens reached in regulation and driving distance. B) H0 : There is no association between the percent of greens reached in regulation and driving distance. Ha: There is a negative association between the percent of greens reached in regulation and driving distance. C) H0 : There is no association between the percent of greens reached in regulation and driving distance. Ha: There is an association between the percent of greens reached in regulation and driving distance. D) H0 : The correlation between greens reached in regulation and driving distance is positive. Ha: There is no correlation between greens reached in regulation and driving distance. Answer: C 17) Choose the correct observed value of the test statistic and the p-value to test whether there is a linear relationship between greens reached in regulation and driving distance. A) t = 3.95, p = 0.000 B) t = 1.24, p = 0.223 C) s = 2.86830, p = 0.0308 D) t = 0.0540, p = 0.223 Answer: B 18) Test the hypothesis that the slope is not zero. Then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion using a significance level of 5%. A) Fail to reject H0 . There is enough evidence to conclude that the percent of greens reached in regulation is linearly related to driving distance. B) Reject H0 . There is enough evidence to conclude that the percent of greens reached in regulation is linearly related to driving distance. C) Fail to reject H0 . There is not enough evidence to conclude that the percent of greens reached in regulation is linearly related to driving distance. D) Reject H0 . 
There is not enough evidence to conclude that the percent of greens reached in regulation is linearly related to driving distance. Answer: C
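The decision step used throughout these questions compares the p-value with the significance level. A minimal sketch follows; the function name `decide` is hypothetical, and the strict comparison p < α is one common convention (some texts use p ≤ α).

```python
def decide(p_value, alpha=0.05):
    """Return the decision about H0 for a given p-value and significance level."""
    if p_value < alpha:
        return "Reject H0"
    return "Fail to reject H0"


# Golf example above: p = 0.223 from question 17, tested at the 5% level.
print(decide(0.223))

# The same rule at other p-values that appear in this section.
print(decide(0.002))
print(decide(0.2529))
```

For the golf data, 0.223 is greater than 0.05, so the rule returns "Fail to reject H0", matching the answer to question 18.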
An experiment was run to see whether the time (in minutes) it takes to drill a distance of 5 feet in rock is related to the depth of the drilling (in feet). The computer output below shows the results from a regression model in which the time to drill 5 feet was predicted by the depth of the drilling. Assume the conditions of the linear regression model are satisfied.
19) According to the regression line, what is the mean time to drill 5 feet at a depth of 0 feet? A) 0.01363 B) 6.65 C) 3.92 D) 5.408 Answer: D Airline pilots face the risk of progressive hearing loss due to the noisy cockpits of most jet aircraft. Much of the noise comes from air roar which increases at high speeds rather than from the engines. The cockpit noise was measured to the left of the pilot's ear on 60 randomly selected flights of one type of aircraft. The computer output below shows the results from a regression model in which the noise level was predicted by the air speed. Assume the conditions of the linear regression model are satisfied.
20) Choose the correct null and alternative hypotheses to test whether there is a relationship between cockpit noise and air speed. A) H0 : There is no relationship between cockpit noise and air speed. Ha: There is a relationship between cockpit noise and air speed. B) H0 : The correlation between cockpit noise and air speed is positive. Ha: There is no correlation between cockpit noise and air speed. C) H0 : There is a relationship between cockpit noise and air speed. Ha: There is no relationship between cockpit noise and air speed. D) H0 : There is no relationship between cockpit noise and air speed. Ha: There is a negative relationship between cockpit noise and air speed. Answer: A
21) Choose the correct observed value of the test statistic and the p-value. A) t = 65.624, p = 0.000 B) t = 6.28, p = 0.000 C) s = 11.4088, p = 0.0885 D) t = 2.37, p = 0.021 Answer: D 22) Test the hypothesis that the slope is not zero. Then choose the correct decision regarding the null hypothesis and the statement that correctly summarizes the conclusion using a significance level of 5%. A) Fail to reject H0 . There is enough evidence to conclude that cockpit noise is linearly related to air speed. B) Reject H0 . There is enough evidence to conclude that cockpit noise is linearly related to air speed. C) Fail to reject H0 . There is not enough evidence to conclude that cockpit noise is linearly related to air speed. D) Reject H0 . There is not enough evidence to conclude that cockpit noise is linearly related to air speed. Answer: B 23) What is the slope of the line? Choose the statement that gives the correct interpretation of the slope in context of the problem. A) The slope is 1. On average, for each additional increase in air speed of 0.073417, the cockpit noise is 1 unit higher, in the sample. B) The slope is 65.624. On average, for each additional unit increase in air speed, the cockpit noise is about 65.624 units higher, in the sample. C) The slope is 0.073417. On average, for each additional unit increase in air speed, the cockpit noise is 0.073417 units higher, in the sample. D) The slope is 2.37. On average, for each additional unit increase in air speed, the cockpit noise is 2.37 units higher, in the sample. Answer: C Airline pilots face the risk of progressive hearing loss due to the noisy cockpits of most jet aircraft. Much of the noise comes from air roar which increases at high speeds rather than from the engines.
The cockpit noise (in decibels) was measured to the left of the pilot's ear on 60 randomly selected flights of one type of aircraft. The computer output below shows the results from a regression model in which the cockpit noise level was predicted by the air speed (in mph). Assume the conditions of the linear regression model are satisfied within the range of air speeds tested.
24) According to the regression line, what is the estimated mean cockpit noise for a plane sitting on the runway, not moving, with its engines running? A) 0.073417 decibels B) 11.4088 decibels C) 65.624 decibels D) 6.28 decibels Answer: C
25) Test the hypothesis that the intercept is 0 using a significance level of 0.05. A) Reject H0. There is enough evidence to reject an intercept of 0 decibels. B) Reject H0. There is not enough evidence to reject an intercept of 65.624 decibels. C) Fail to reject H0. There is not enough evidence to reject an intercept of 0 decibels. D) Fail to reject H0. There is enough evidence to reject an intercept of 65.624 decibels. Answer: A
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem.
26) Consider the following statement: "When the conditions of the linear model hold, the estimators for slope and intercept are unbiased." What is meant by the word unbiased in this context? Answer: An estimator is said to be unbiased if the mean of the sampling distribution of that estimator is equal to the parameter being estimated.
An economist wanted to analyze the relationship between the speed of a car and its gas mileage. One car was driven at several different specified speeds and the gas mileage was measured for each speed. The output of the regression analysis for predicting gas mileage from car speed is shown below. Assume that the conditions of the linear regression model are satisfied.
27) State the null and alternative hypotheses to test whether there is an association between gas mileage and car speed. Answer: H0: There is no association between gas mileage and speed of the car. Ha: There is an association between gas mileage and speed of the car.
28) What is the observed value of the test statistic and the p-value for testing if there is an association between gas mileage and car speed? Answer: t = -2.95, p = 0.009
29) Test the hypothesis that gas mileage is associated with car speed. State the correct decision regarding the null hypothesis and summarize the conclusion in context. Answer: Reject H0. There is enough evidence to conclude that the gas mileage is associated with speed of the car.
The manager of a chain of appliance stores believes that experience is a very important factor in determining the success of a salesperson. The manager selects a random sample of 20 salespeople and records the last monthʹs sales (in thousands of dollars) and the number of years of experience selling appliances. The computer output below shows the results from a regression model in which the last monthʹs sales were predicted by years of experience. Assume the conditions of the linear regression model are satisfied.
30) State the slope of the regression line. Write a sentence explaining what this slope means in the context of the problem. Answer: The slope is 1.580. On average, for each additional year of experience, the last monthʹs sales will be 1.580 thousand dollars higher, in the sample. 31) Test the hypothesis that there is a linear relationship between last monthʹs sales and years of experience. State the correct decision regarding the null hypothesis and write a statement that correctly summarizes the conclusion in the context of the problem. Use the 5% level of significance. Answer: Fail to reject H0 . There is not enough evidence to reject the null hypothesis. We cannot conclude that there is a linear association between years of experience and last monthʹs sales. 32) If the intercept were 0 and the slope 1, explain how the linear model would be interpreted in context of the problem. Answer: It would mean that on average, last monthʹs sales are 1 thousand times the number of years of experience.
2 Report Confidence Intervals for the Slope and Intercept MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem. 1) The regression output below is the result of testing whether there is an association between the number of practice test problems a student completed and the number of questions answered correctly on the test. Assume that the conditions of the linear regression model are satisfied. What is the 95% confidence interval for the intercept (rounded to the nearest hundredth)? Does this interval support the theory that the intercept is zero? Choose the statement that summarizes your answer in context.
A) (-8.15, 3.90). This interval supports the theory that the intercept could be zero. In this context this would mean that a student who completed zero practice test problems could reasonably expect to get a zero on the test. B) (0.86, 1.36). This interval does not support the theory that the intercept could be zero. In this context the intercept is greater than zero so a student could expect to get a positive score on the test even if they did none of the practice problems. C) (-8.15, 3.90). This interval does not support the theory that the intercept could be zero. In this context this would mean that for approximately every two practice problems completed, the student could expect to get approximately one test question correct. D) None of these Answer: A 2) The regression output below is the result of testing whether there is an association between the number of hours of sleep a student had the night before an exam and the number of questions answered correctly on the exam. Assume that the conditions of the linear regression model are satisfied. What is the 95% confidence interval for the intercept (rounded to the nearest hundredth)? Does this interval support the theory that the intercept is zero? Choose the statement that summarizes your answer in context.
A) (0.03, 0.14). This interval does not support the theory that the intercept could be zero. In this context the intercept is greater than zero so a student could expect to get a positive score on the test even if they did not get any hours of sleep. B) (-4.64, 4.14). This interval supports the theory that the intercept could be zero. In this context this would mean that a student who had zero hours of sleep could reasonably expect to get a zero on the test. C) (-4.64, 4.14). This interval does not support the theory that the intercept could be zero. In this context this would mean that for approximately every 0.08 hours of sleep, the student could expect to get approximately one test question correct. D) None of these Answer: B
Airline pilots face the risk of progressive hearing loss due to the noisy cockpits of most jet aircraft. Much of the noise comes from air roar which increases at high speeds rather than from the engines. The cockpit noise (in decibels) was measured to the left of the pilotʹs ear on 60 randomly selected flights of one type of aircraft. The computer output below shows the results from a regression model in which the cockpit noise level was predicted by the air speed (in mph). Assume the conditions of the linear regression model are satisfied within the range of air speeds tested.
3) What is the 95% confidence interval for the slope of the line? Does this interval support the theory that cockpit noise is related to air speed? Choose the statement that summarizes your answer in the context of the problem. A) (44.77, 86.47); This interval does not contain 1. Therefore, there is no evidence to indicate cockpit noise is related to air speed. B) (44.77, 86.47); This interval does not contain 0. Therefore, there is evidence to indicate cockpit noise is related to air speed. C) (0.0115, 0.1354); This interval does not contain 1. Therefore, there is no evidence to indicate cockpit noise is related to air speed. D) (0.0115, 0.1354); This interval does not contain 0. Therefore, there is evidence to indicate that cockpit noise is related to air speed. Answer: D
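The interval in the keyed answer can be approximately reconstructed from figures quoted in the cockpit noise questions. The sketch below is illustrative only: it assumes the slope estimate 0.073417 and the t statistic 2.37 from those questions, recovers the standard error as slope divided by t, and uses a table value of about 2.0017 for the 95% critical value of t with 58 degrees of freedom (an assumption, since the regression output itself is not reproduced here). The function name is hypothetical.

```python
# Figures quoted in the cockpit noise questions.
slope = 0.073417   # estimated slope (decibels per mph)
t_stat = 2.37      # t statistic for testing slope = 0
n = 60             # number of flights sampled
df = n - 2         # degrees of freedom for simple linear regression

# Assumption: two-sided 95% critical value of t with 58 degrees of freedom,
# read from a t table (approximately 2.0017).
t_crit = 2.0017


def slope_confidence_interval(estimate, t_statistic, critical_value):
    """Return (lower, upper) for estimate +/- critical_value * SE."""
    # Because t = estimate / SE when testing a slope of 0, SE = estimate / t.
    standard_error = estimate / t_statistic
    margin = critical_value * standard_error
    return estimate - margin, estimate + margin


lower, upper = slope_confidence_interval(slope, t_stat, t_crit)
contains_zero = lower <= 0 <= upper
print(f"df = {df}")
print(f"95% CI for slope: ({lower:.4f}, {upper:.4f})")
print(f"Interval contains 0: {contains_zero}")
```

The result should land close to (0.0115, 0.1354); small differences in the last digit come from the rounded t statistic. Because the interval excludes 0, it agrees with rejecting a slope of zero at the 5% level.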
An experiment was run to see whether the time (in minutes) it takes to drill a distance of 5 feet in rock is related to the depth of the drilling (in feet). The computer output below shows the results from a regression model in which the time to drill 5 feet was predicted by the depth of the drilling. Assume the conditions of the linear regression model are satisfied.
4) What is the 95% confidence interval for the slope of the line? Does this interval support the theory that the time to drill 5 feet is linearly related to depth of drilling? Choose the statement that summarizes your answer in the context of the problem. A) (0.00622, 0.02103); This interval does not contain 1. Therefore, there is no evidence to indicate time to drill 5 feet is linearly related to drilling depth. B) (0.00622, 0.02103); This interval does not contain 0. Therefore, there is evidence to indicate that time to drill 5 feet is linearly related to drilling depth. C) (3.675, 7.141); This interval does not contain 1. Therefore, there is no evidence to indicate time to drill 5 feet is linearly related to drilling depth. D) (3.675, 7.141); This interval does not contain 0. Therefore, there is evidence to indicate time to drill 5 feet is linearly related to drilling depth. Answer: B
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. The manager of a chain of appliance stores believes that experience is a very important factor in determining the success of a salesperson. The manager selects a random sample of 20 salespeople and records the last monthʹs sales (in thousands of dollars) and the number of years of experience selling appliances. The computer output below shows the results from a regression model in which the last monthʹs sales were predicted by years of experience. Assume the conditions of the linear regression model are satisfied.
5) What is the 95% confidence interval for the intercept? Does this support the theory that the intercept is something different from 0? Write a statement that summarizes your answer in the context of the problem. Answer: (-8.59, 21.66). This interval does not support the theory that the intercept is something different from 0. In this context, it would not be unusual for a salesperson with no experience to not have any sales the last month.
14.3 Predicting Values and Estimating Means 1 Understand the Differences Between Confidence Intervals and Prediction Intervals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A high school boys cross country coach performs a regression to predict the finish times of runners in the 10k event from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.
1) The coach wants to predict the finish time of his top runner who trained for 180 minutes the previous week. Should the coach use a confidence interval or a prediction interval? A) Confidence Interval B) Prediction Interval Answer: B
Use the following information to answer the question. A high school girls cross country coach performs a regression to predict the finish times of runners in the 10k event from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.
2) The coach wants to predict the finish time of his top runner who trained for 145 minutes the previous week. Should the coach use a confidence interval or a prediction interval? A) Confidence Interval B) Prediction Interval Answer: B Solve the problem. 3) Which of the following statements is not true about prediction intervals? A) Prediction intervals are wider than confidence intervals because there is more uncertainty in predicting an individualʹs value. B) A prediction interval is concerned with estimating a population parameter. C) The width of the prediction interval is affected by the size of the standard deviation of the population distribution. D) All of these statements are true. Answer: B 4) Which of the following statements is true about prediction intervals? A) Prediction intervals are wider than confidence intervals because there is more uncertainty in predicting an individualʹs value. B) A prediction interval is concerned with predicting values for individuals. C) The width of the prediction interval is affected by the size of the standard deviation of the population distribution. D) All of these statements are true. Answer: D
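The width comparison in the statements above can be illustrated numerically. The sketch below is a minimal example on made-up data (the training minutes and finish times are hypothetical, not taken from the coach's output). It uses the standard simple-regression formulas: the margin for the mean response at x0 uses s·sqrt(1/n + (x0 − x̄)²/Sxx), while the margin for an individual prediction adds 1 under the square root. The critical value 2.447 is the table value of t for 6 degrees of freedom at 95%, stated here as an assumption.

```python
import math


def interval_half_widths(x, y, x0, t_crit):
    """Fit y = a + b*x by least squares and return
    (fitted value at x0, CI half-width for the mean, PI half-width)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Residual standard deviation with n - 2 degrees of freedom.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))

    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    se_mean = s * math.sqrt(leverage)        # uncertainty in the mean response
    se_pred = s * math.sqrt(1 + leverage)    # adds individual-to-individual variation
    fitted = intercept + slope * x0
    return fitted, t_crit * se_mean, t_crit * se_pred


# Hypothetical data: minutes of training and 10k finish times (minutes).
minutes = [100, 120, 140, 150, 160, 180, 200, 210]
times = [52, 50, 47, 48, 45, 43, 42, 40]

# Assumption: 95% critical value of t with n - 2 = 6 degrees of freedom.
fitted, ci_half, pi_half = interval_half_widths(minutes, times, x0=180, t_crit=2.447)
print(f"Fitted finish time at 180 minutes: {fitted:.2f}")
print(f"CI for the mean:      {fitted - ci_half:.2f} to {fitted + ci_half:.2f}")
print(f"PI for one runner:    {fitted - pi_half:.2f} to {fitted + pi_half:.2f}")
```

Both intervals share the same center, and the prediction interval is always the wider of the two because of the extra 1 under the square root.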
Many factors affect the price of wine. One factor that is expected to be related to price is taste. Twenty-five bottles of wine were randomly selected and scored on a scale from 0 to 20 for taste, where the higher the score, the better the taste. The price of the bottle of wine (in dollars) was also recorded. A regression analysis was run where price was predicted by taste with the following results. Assume the conditions for the linear regression model hold.
5) If we wanted to estimate the average price of all bottles of wine with a taste score of 18, we would use a A) Confidence Interval B) Prediction Interval Answer: A
6) Which statement about prediction intervals and confidence intervals is true? Or state that none of the statements is true. A) Prediction intervals are always smaller than confidence intervals because confidence intervals are predicting the mean of the population and prediction intervals are predicting the value of only one item in the population. B) Prediction intervals are always smaller than confidence intervals because prediction intervals are predicting the mean of all the items in the population and confidence intervals are predicting the value of only one item in the population. C) Sometimes prediction intervals are smaller than confidence intervals and sometimes prediction intervals are larger than confidence intervals. D) None of the statements is true. Answer: D
7) Is the amount of nicotine contained in a cigarette related to the weight of the cigarette? A scatterplot of the amounts of nicotine and the weights (in grams) of 25 brands of cigarettes is shown below. The scatterplot also depicts the boundaries of either confidence intervals or prediction intervals. Choose the statement that correctly describes the intervals shown on the plot.
A) The intervals depicted on the plot are most likely prediction intervals because many of the observations lie outside the intervals. B) The intervals depicted on the plot are most likely confidence intervals because many of the observations lie outside the intervals. C) The intervals are most likely prediction intervals because the regression line lies entirely within the intervals. D) The intervals are most likely confidence intervals because the regression line lies entirely within the intervals. Answer: B SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question. Solve the problem. 8) Explain the difference between confidence intervals and prediction intervals. Be sure to include the type of situation in which each type of interval would be used. Which interval is likely to be wider and why? Answer: A confidence interval is used to estimate a population parameter. A prediction interval is used to estimate a value for an individual. Prediction intervals are likely to be wider than confidence intervals because the sampling distribution for the mean has a smaller standard deviation than the population distribution. A high school boysʹ track and field coach performs a regression to predict the vault height (in feet) of a pole vault from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.
9) The coach wants to predict the vault height of his top pole vaulter, who trained for 175 minutes the previous week. Should the coach use a confidence interval or a prediction interval? Answer: Prediction interval
2 Use Confidence Intervals and Prediction Intervals to Answer Questions MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Use the following information to answer the question. A high school boys cross country coach performs a regression to predict the finish times of runners in the 10k event from the number of minutes of training in the previous week. The output is shown below. Assume that the conditions of the linear regression model hold.
1) Suppose the coachʹs top runner trained for 180 minutes the previous week. If this runner participates in the 10k event, what is the coachʹs expected finish time for this runner? Can he be reasonably confident that this runner will beat the previous seasonʹs record of 43 minutes? A) Expected finish time is 38.84 minutes. The coach can be confident that this runner will beat the previous seasonʹs record of 43 minutes because the interval contains the value of 43 minutes. B) Expected finish time is 65.84 minutes. The coach cannot be confident that this runner will beat the previous seasonʹs record because the interval contains the value of 43 minutes. To be confident the interval would have to lie completely below 43 minutes. C) Expected finish time is 38.84 minutes. The coach cannot be confident that this runner will beat the previous seasonʹs record because the interval contains the value of 43 minutes. To be confident the interval would have to lie completely below 43 minutes. D) Expected finish time is 65.84 minutes. The coach can be confident that this runner will beat the previous seasonʹs record of 43 minutes because the interval contains the value of 43 minutes. Answer: C
Many factors affect the price of wine. One factor that is expected to be related to price is taste. Twenty-five bottles of wine were randomly selected and scored on a scale from 0 to 20 for taste, where the higher the score, the better the taste. The price of the bottle of wine (in dollars) was also recorded. A regression analysis was run where price was predicted by taste with the following results. Assume the conditions for the linear regression model hold.
2) Suppose we know that a particular bottle of wine had a taste score of 15. What can be said about the price of this bottle of wine? Select the statement that is true about the price of this bottle of wine. A) The expected price is $6.18. We can be confident that this bottle of wine will cost less than $20. B) The expected price is $25.61. We can be confident that the actual price of this bottle of wine will be between $21.25 and $29.97. C) The expected price is $25.61. We can be confident that the actual price of this bottle of wine will be between $24.45 and $29.97. D) The expected price is $25.61. We can be confident that the average price of all bottles of wine with a taste score of 15 will be between $21.25 and $29.97. Answer: B
Answer the question. 3) In 2017, nutritional data for menu items at 12 fast food restaurants were collected. The scatterplot and regression results below show the relationship between the number of calories and the total fat (in grams) for 126 randomly selected food items.
A customer wants to know the fat content of the burger she just bought, which has 450 calories. Should she use a prediction interval or a confidence interval? Explain. A) She should use a prediction interval because we want to predict the value for one item, not the mean of the group of items. B) She should use a prediction interval because we want to predict the value for the group of items, not the value for one item. C) She should use a confidence interval because we want to predict the value for one item, not the mean of the group of items. D) She should use a confidence interval because we want to predict the value for the group of items, not the value for one item. Answer: A
4) In 2017, nutritional data for menu items at 12 fast food restaurants were collected. The scatterplot and regression results below show the relationship between the number of calories and the total fat (in grams) for 126 randomly selected food items.
A customer wants to know the uncertainty in the total fat content of the burger she just bought, which has 450 calories. What is the appropriate 95% interval for the predicted total fat? A) (-10.46, -5.31) B) (0.06, 0.07) C) (21.75, 24.05) D) (10.56, 35.25) Answer: D
5) In 2017, nutritional data for menu items at 12 fast food restaurants were collected. The scatterplot and regression results below show the relationship between the number of calories and the total fat (in grams) for 126 randomly selected food items.
A customer ate a burger, which has 450 calories, but wants to know if it has less than 8 grams of total fat since she is on a diet. Is the burger likely to have this fat content? Explain. A) Yes, she is likely to find an item to eat because 8 is within the prediction interval. B) No, she is not likely to find an item to eat because 8 is within the prediction interval. C) Yes, she is likely to find an item to eat because 8 is not within the prediction interval. D) No, she is not likely to find an item to eat because 8 is not within the prediction interval. Answer: D
3 Estimate Prediction Intervals from Graphs MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Answer the question.
1) The Times Higher Education World University Rankings have been collected from 2011 to 2016. The relationship between the teaching rating and the research rating for the 2603 colleges and universities is shown in the scatterplot below. The scatterplot also displays 95% prediction intervals. Using the graph, estimate the upper and lower values for the prediction interval used for predicting the research rating of a university with a teaching rating of 40%.
A) Lower: 15, Upper: 65
B) Lower: 20, Upper: 55
C) Lower: 25, Upper: 65
D) Lower: 30, Upper: 55
Answer: B
2) Data on 77 cereal brands were collected in 2017. The following scatterplot shows the relationship between the calorie count and grams of sugar in these cereals. The graph also displays 95% prediction intervals. Using the graph, estimate the upper and lower values for the prediction interval used for predicting the grams of sugar for a cereal with 100 calories.
A) Lower: 0, Upper: 10
B) Lower: 0, Upper: 14
C) Lower: -1, Upper: 10
D) Lower: -1, Upper: 14
Answer: D 3) Data were collected from 83 high school students who had taken at least one AP test while in school. The scatterplot below shows the relationship between the students’ GPAs and their average AP test scores, along with 95% prediction intervals. Note that GPAs have values between 0.0 and 4.0 and AP test scores are between 0 and 5. Explain why the prediction interval for a student with a GPA of 3.5 either would or would not be useful.
A) The prediction interval is useful because it includes scores of 1 to 4.5, which basically covers all possible scores on the AP test. B) The prediction interval is not useful because it includes scores of 1 to 4.5, which basically covers all possible scores on the AP test. C) The prediction interval is useful because it includes all of the actual points for students with GPAs of 3.5. D) The prediction interval is not useful because it includes all of the actual points for students with GPAs of 3.5. Answer: B
4) Data were collected on the Pittsburgh Steelers between 1943 and 2008. The scatterplot below shows the relationship between the number of years with the head coach and the number of season wins, along with 95% prediction intervals. Using the graph, estimate the upper and lower values for the prediction interval used for predicting the number of season wins for a team when the head coach has been with the team for 5 years.
A) (0, 15)
B) (1, 13)
C) (2, 14)
D) (3, 15)
Answer: B
4 Understand and Interpret Confidence Intervals and Prediction Intervals MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question. Solve the problem.
1) Which of the following statements is true about prediction intervals? A) Prediction intervals are wider than confidence intervals because there is more uncertainty in predicting an individual's value. B) A prediction interval is concerned with predicting values for individuals. C) The width of the prediction interval is affected by the size of the standard deviation of the population distribution. D) All of these statements are true. Answer: D
2) Which of the following is not true about the coefficient of determination, r²? A) The coefficient of determination, r², ranges from 0% to 100% and represents the amount of variability in the response variable explained by the regression line. B) In order to interpret the coefficient of determination, r², the linearity condition of the linear regression model must be satisfied. C) A hypothesis test should be conducted to verify that the coefficient of determination, r², is large enough to conclude a linear relationship exists. D) The coefficient of determination, r², is a statistic that will give some information about how well the data fit the model, but it should not be the only piece of information taken into consideration when determining how useful a linear model might be. Answer: C
SHORT ANSWER. Write the word or phrase that best completes each statement or answers the question.
3) State the two conditions that must be satisfied, without exception, to make inferences using a linear model. Answer: (1) the linearity condition and (2) the independence of errors must be satisfied for the linear model to be appropriate.
MULTIPLE CHOICE. Choose the one alternative that best completes the statement or answers the question.
4) Generally, are prediction intervals or confidence intervals wider? Explain. A) Prediction intervals are wider because they include estimates of means instead of predictions for individual values. B) Confidence intervals are wider because they include estimates of means instead of predictions for individual values. C) Prediction intervals are wider because they include predictions for individual values instead of estimates of means. D) Confidence intervals are wider because they include predictions for individual values instead of estimates of means. Answer: C
5) In 2017, nutritional data for menu items at 12 fast food restaurants were collected. The scatterplot and regression results below show the relationship between the number of calories and the total fat (in grams) for 126 randomly selected food items.
When considering the fat content of all items with 450 calories, what is a 95% confidence interval for the mean fat content? A) (21.75, 24.05). We are 95% confident that the mean grams of total fat for all fast food items with 450 calories is between 21.75 and 24.05. B) (21.75, 24.05). We are 95% confident that one fast food item with 450 calories will have between 21.75 and 24.05 grams of total fat. C) (10.56, 35.25). We are 95% confident that the mean grams of total fat for all fast food items with 450 calories is between 10.56 and 35.25. D) (10.56, 35.25). We are 95% confident that one fast food item with 450 calories will have between 10.56 and 35.25 grams of total fat.
Answer: A 6) In 2017, nutritional data for menu items at 12 fast food restaurants were collected. The scatterplot and regression results below show the relationship between the number of calories and the total fat (in grams) for 126 randomly selected food items.
A customer wants to buy a fast food item with 450 calories. Report and interpret the 95% prediction interval for the total fat. A) (21.75, 24.05). We are 95% confident that the mean grams of total fat for all fast food items with 450 calories is between 21.75 and 24.05. B) (21.75, 24.05). We are 95% confident that one fast food item with 450 calories will have between 21.75 and 24.05 grams of total fat. C) (10.56, 35.25). We are 95% confident that the mean grams of total fat for all fast food items with 450 calories is between 10.56 and 35.25. D) (10.56, 35.25). We are 95% confident that one fast food item with 450 calories will have between 10.56 and 35.25 grams of total fat. Answer: D
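The two fast food intervals quoted in these questions can be compared directly. The short sketch below uses only the interval endpoints given in the answer choices, (21.75, 24.05) for the mean and (10.56, 35.25) for a single item; the helper names are hypothetical. It checks that both intervals are centered on essentially the same predicted value, that the prediction interval is wider, and that 8 grams falls outside the prediction interval.

```python
# Interval endpoints quoted in the questions (grams of total fat at 450 calories).
ci_mean = (21.75, 24.05)   # 95% confidence interval for the mean
pi_item = (10.56, 35.25)   # 95% prediction interval for one item


def midpoint(interval):
    """Center of an interval given as (lower, upper)."""
    return (interval[0] + interval[1]) / 2


def half_width(interval):
    """Margin of error of an interval given as (lower, upper)."""
    return (interval[1] - interval[0]) / 2


def contains(interval, value):
    """True when value lies inside the closed interval."""
    return interval[0] <= value <= interval[1]


print(f"CI center {midpoint(ci_mean):.2f}, margin {half_width(ci_mean):.2f}")
print(f"PI center {midpoint(pi_item):.2f}, margin {half_width(pi_item):.2f}")
print(f"8 grams inside the prediction interval: {contains(pi_item, 8)}")
```

The shared center reflects that both intervals are built around the same fitted value from the regression line; only the margin differs.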
Ch. 14 Inference for Regression Answer Key 14.1 The Linear Regression Model 1 Identify Factors that Might Contribute to the Random Component of a Linear Regression Model 1) C 2) D 3) D 4) D 5) Answers will vary, but possible answers could include (1) some students are poor test takers and thus the ACT may not accurately reflect the studentʹs actual ability and (2) the student may have taken courses in his first year that did not interest him so his GPA may not reflect his ability. 2 Calculate Residuals 1) A 2) C 3)
3 Use Residual and/or QQ Plots to Determine if a Linear Regression Model is Appropriate 1) A 2) B 3) D 4) D 5) B 6) C 7) A 8) A 9) A 10) C 11) D 12) A 13) D 14) A 15) C 16) D 17) A 18) C 19) Residual plot (a) indicates that the relationship is linear because the plot shows no trend and a slope of zero. Because plot (a) is formless, it also indicates constant standard deviation. 20) Residual plot (d) indicates that the condition for a constant standard deviation may not be satisfied because the plot shows fan shape. 21) The QQ plot shows a linear trend - the QQ plot is consistent with the claim of normality of errors. 22) The residual plot shows no pattern or is formless - the residual plot is consistent with the claim of constant standard deviation. 23) The residual plot shows no slope - the residual plot is consistent with the claim of linearity. 4 Understand Concepts Related to Linear Regression 1) B 2) D 3) D 4) B
5) D 6) If the original data is linear, the residual plot should have no slope or trend. If the residual plot shows a fan shape, then the condition that y-values have a constant standard deviation may not be satisfied.
14.2 Using the Linear Model 1 Perform Hypothesis Tests of the Slope and Intercept 1) B 2) C 3) A 4) D 5) B 6) A 7) B 8) C 9) A 10) A 11) A 12) D 13) C 14) C 15) A 16) C 17) B 18) C 19) D 20) A 21) D 22) B 23) C 24) C 25) A 26) An estimator is said to be unbiased if the mean of the sampling distribution of that estimator is equal to the parameter being estimated. 27) H0 : There is no association between gas mileage and speed of the car. Ha: There is an association between gas mileage and speed of the car. 28) t = -2.95, p = 0.009 29) Reject H0 . There is enough evidence to conclude that the gas mileage is associated with speed of the car. 30) The slope is 1.580. On average, for each additional year of experience, the last monthʹs sales will be 1.580 thousand dollars higher, in the sample. 31) Fail to reject H0 . There is not enough evidence to reject the null hypothesis. We cannot conclude that there is a linear association between years of experience and last monthʹs sales. 32) It would mean that on average, last monthʹs sales are 1 thousand times the number of years of experience. 2 Report Confidence Intervals for the Slope and Intercept 1) A 2) B 3) D 4) B 5) (-8.59, 21.66). This interval does not support the theory that the intercept is something different from 0. In this context, it would not be unusual for a salesperson with no experience to not have any sales the last month.
14.3 Predicting Values and Estimating Means 1 Understand the Differences Between Confidence Intervals and Prediction Intervals 1) B 2) B 3) B
4) D 5) A 6) D 7) B 8) A confidence interval is used to estimate a population parameter. A prediction interval is used to estimate a value for an individual. Prediction intervals are likely to be wider than confidence intervals because the sampling distribution for the mean has a smaller standard deviation than the population distribution. 9) Prediction interval 2 Use Confidence Intervals and Prediction Intervals to Answer Questions 1) C 2) B 3) A 4) D 5) D 3 Estimate Prediction Intervals from Graphs 1) B 2) D 3) B 4) B 4 Understand and Interpret Confidence Intervals and Prediction Intervals 1) D 2) C 3) (1) the linearity condition and (2) the independence of errors must be satisfied for the linear model to be appropriate. 4) C 5) A 6) D