CHAPTER 01 – 4e 1) The following data represent a sample of non-elementary mathematics teachers in Bergen County, New Jersey. Identify the qualitative and quantitative variables, the categories associated with each qualitative variable, and the measurement scales for all variables. School Brookside E.S. Program 3-Emotionally Distur. Program 3-Emotionally Distur. Program 3-Emotionally Distur. Program 3-Emotionally Distur. Program 5-Life Skills Program 5-Life Skills Bergen AcademiesHackensack Bergen AcademiesHackensack Bergen AcademiesHackensack
Degree Masters Masters
Years Experience 8 25
Salary $ 67,945 $ 82,910
Classes Taught 6 4
Bachelors
25
$ 86,030
7
Bachelors
3
$ 62,690
5
Masters
21
$ 82,620
5
Masters Masters Masters
11 41 31
$ 82,330 $ 79,790 $ 82,626
5 4 5
Masters
10
$ 82,626
4
Masters
3
$ 98,291
4
Source: http://php.app.com/edstaff/results2.php?county=BERGEN&district=%25&school=%25&lname= &fname=&job1=Math+Non-Elementary&Submit=Submit
2) The following data concern a sample of employees of the U.S. Marshals in the state of New York. Identify the qualitative and quantitative variables, the categories associated with each qualitative variable, and the measurement scales for all variables. Country New York
Version 1
Station NEW YORK-NY
Title MISCELLANEOUS CLERK AND
Grade GS 07
Salary $ 51,030
1
Country New York Country Kings Country Erie Country Onondaga Country New York Country Onondaga Country Erie Country New York Country Kings Country
NEW YORK-NY NEW YORKKINGS BUFFALO SYRACUSE NEW YORK-NY SYRACUSE BUFFALO NEW YORK-NY NEW YORKKINGS
ASSISTANT MISCELLANEOUS CLERK AND ASSISTANT ACCOUNTING TECHNICIAN
GS 07
$ 55,405
GS 07
$ 45,196
ADMINISTRATIVE OFFICER BUDGET ANALYSIS
GS 13 GS 09
$ 95,023 $ 53,773
GENERAL BUSINESS AND INDUSTRY GENERAL BUSINESS AND INDUSTRY MISCELLANEOUS ADMINISTRATION AND PROGRAM MISCELLANEOUS ADMINISTRATION AND PROGRAM GENERAL BUSINESS AND INDUSTRY
GS 11
$ 66,887
GS 11
$ 74,628
GS 09
$ 59,962
GS 09
$ 57,065
GS 11
$ 66,887
Source: http://php.app.com/fed_employees10/results.php?fullname=&agency_name=U.S.+MARSHALS +SERVICE&statename=New+York&countyname=%25&Submit=Search
3) Why do we spend time on inspecting and preparing the data before performing statistical analysis?
4)
The following data represent a sample of information on our customers.
Version 1
2
Name Al Brooks Beth Gloria Mary King Bob Julissa
Gender Male Female Female Male
Date of Birth 09/17/1986 05/27/1983 08/01/1981 07/08/1962
Phone Number 205-228-5876 760-301-3657 708-326-7311
Income $ 19,000 $ 125,025 $ 100,700 $ 164,032
Ed Taylor
Male
04/07/1977
413-212-4506
$ 76,048
What data preparation task is most appropriate to answer the following questions? A. Are there missing values in Phone Number? B. How many customers do we have? C. How many of our customers are female? D. Who is our youngest customer? E. How do our male customers compare to our female customers?
5)
The study of statistics is NOT about A) the language of data. B) the art and science of getting information from data. C) the study of collecting, analyzing, presenting, and interpreting data. D) the study of mind and consciousness.
6)
When reading published statistics (numerical facts), you should
Version 1
3
A) never believe what you read, because all statistics are lies. B) only believe those statistics that are adequately supported. C) believe what you read, because they wouldn’t be published if they weren’t correct. D) only believe those statistics that are presented in so-called quality publications.
7)
The two branches of the study of statistics are generally referred to as A) descriptive and inferential statistics. B) inferential and differential statistics. C) descriptive and referential statistics. D) differential and descriptive statistics.
8)
Population parameters are difficult to calculate due to
A) cost prohibitions on data collection. B) the infeasibility of collecting data on the entire population. C) the fact that samples are difficult to draw due to the nature of the data. D) both cost prohibitions on data collection and the infeasibility of collecting data on the entire population.
9) The teachers’ union in California wants to know the average salary for high school teachers throughout the country. What is the teachers’ union presumably planning to calculate? A) Sample statistic B) Sample parameter C) Population statistic D) Population parameter
Version 1
4
10)
A population consists of A) all items of interest in a sample. B) a subject of interest in a sample. C) all items of interest in a statistical problem. D) a subject of interest in a statistical problem.
11)
In inferential statistics, we calculate statistics of sample data to A) estimate unknown population parameters. B) conduct tests about unknown population parameters. C) Both of these choices are correct. D) Neither of these choices is correct.
12)
Which of the following represents a population and a sample from that population? A) Residents of Albany, New York, and registered voters in Albany, New York B) Teachers of a high school and members of the parent-teacher group C) Fans at a concert who purchase T-shirts, and fans at a concert who purchase soda D) Freshmen at St. Joseph’s University and basketball players at St. Joseph’s University
13)
Which of the following represents a population and a sample from that population?
Version 1
5
A) Attendees at a sporting event, and those who purchased popcorn at said sporting event B) Full-time employees at a marketing firm, and temporary summer interns at the marketing firm C) Seniors at Boston College and students in a first-semester business statistics course D) Stocks available on the NYSE and stocks on the NASDAQ
14)
Which of the following is an example of cross-sectional data? A) GDP of the United States from 1990–2010 B) Daily price of DuPont stock during the first quarter C) Quarterly housing starts collected over the last 60 years D) Results of market research testing consumer preferences for soda
15)
Which of the following is an example of time series data? A) The sale prices of town houses sold last year B) Quarterly housing stats collected over the last 60 years C) Results of market research testing consumer preferences for soda D) Starting salaries of recent business graduates at Penn State University
16)
The estimation of which of the following requires sampling? A) U.S. unemployment rate B) Total rainfall in Phoenix, Arizona, in 2010 C) The Cleveland Indians’ hitting percentage in 2010 D) The average SAT score of incoming freshmen at a university
Version 1
6
17) A company wants to estimate the mean price of oil over the past 10 years. What kind of data does the company need? A) Time series data B) Inferential statistics C) Cross-sectional data D) Descriptive statistics
18)
For which of the following population parameters is sampling not necessary? A) The average height of NBA players B) The average life of light bulbs produced by a manufacturer C) The average content of cereal boxes produced by a manufacturer D) The percentage of the U.S. public school teachers who support Democrats
19) Sampling is used heavily in manufacturing and service settings to ensure high-quality products. In which of the following areas would sampling be inappropriate? A) Computer assembly B) Custom cabinet making C) Cell phone manufacturing D) Technical support by phone
20)
Which of the following is NOT an example of cross-sectional data?
Version 1
7
A) The test scores of students in a class B) The current average prices of regular gasoline in different states C) The sales prices of single-family homes sold last month in California D) The sales prices of single-family homes in Arizona collected since 2008.
21) An analyst studies a data set of the year-end book value per share for all companies listed on the New York Stock Exchange. This data set is best described as A) time series data. B) cross-sectional data. C) neither time series nor cross-sectional data. D) a combination of time series and cross-sectional data.
22)
Which type of data, cross-sectional versus time series, is more important to research?
A) Neither type of data is important. B) Cross-sectional data is more important than time series data. C) Time series data is more important than cross-sectional data. D) Time series data and cross-sectional data are equally as valuable in different types of research.
23)
Which of the following variables is categorical? A) Height B) Gender C) Weight D) Temperature
Version 1
8
24)
Which of the following variables is quantitative? A) Gender B) Temperature C) Marital status D) Religious affiliation
25)
Which of the following is a qualitative variable? A) House age B) House size C) House price D) House district
26) San Francisco 49ers’ linebacker Patrick Willis won the Defensive Rookie of the Year Award in 2007 with a total of 153 tackles. Tackles are measured on what kind of a scale? Is a variable measuring the number of tackles considered continuous or discrete? A) Ratio scale; discrete B) Ratio scale; continuous C) Interval scale; discrete D) Interval scale; continuous
27) San Francisco 49ers’ linebacker Patrick Willis won the Defensive Rookie of the Year Award in 2007 with a total of 174 tackles. Tackles are measured on what kind of a scale? Is a variable measuring the number of tackles considered continuous or discrete?
Version 1
9
A) Ratio scale; discrete B) Interval scale; discrete C) Ratio scale; continuous D) Interval scale; continuous
28)
Which of the following variables is not continuous? A) Height of NBA players B) Time of a flight between Atlanta and Chicago C) Average temperature in the month of July in Orlando D) The number of obtained heads when a fair coin is tossed 20 times
29)
The ordinal scale of data measurement is A) less sophisticated than the nominal scale. B) more sophisticated than the interval scale. C) more sophisticated than the nominal scale. D) as equally sophisticated as the nominal scale.
30)
The interval scale of data measurement is A) less sophisticated than the ratio scale. B) more sophisticated than the ratio scale. C) less sophisticated than the ordinal scale. D) equally sophisticated as the ratio scale because both are appropriate for quantitative
data.
Version 1
10
31) A recent survey of 300 small firms (annual revenue less than $8 million) asked whether an increase in the minimum wage would cause the firm to decrease capital spending. Possible responses to the survey question were: "Yes," "No," or "Don’t Know." This data is best classified as A) interval scale. B) ordinal scale. C) nominal scale. D) ratio scale.
32) A recent survey of 200 small firms (annual revenue less than $10 million) asked whether an increase in the minimum wage would cause the firm to decrease capital spending. Possible responses to the survey question were: "Yes," "No," or "Don’t Know." This data is best classified as A) ratio scale. B) ordinal scale. C) interval scale. D) nominal scale.
33) Which scale of data measurement is appropriate for the names of companies listed on the Dow Jones Industrial Average? A) Ratio scale B) Ordinal scale C) Interval scale D) Nominal scale
34) An analyst collects data on the weekly closing price of gold throughout a year. The scale of this data is
Version 1
11
A) ratio scale. B) ordinal scale. C) interval scale. D) nominal scale.
35) An undergraduate student’s status (freshman, sophomore, junior, or senior) is an example of which scale of measurement? A) Ratio scale B) Ordinal scale C) Interval scale D) Nominal scale
36)
The Fahrenheit scale for measuring temperature would be classified as a(n) A) ratio scale. B) ordinal scale. C) interval scale. D) nominal scale.
37) At the end of a semester college students evaluate their instructors by assigning them to one of the following categories: Excellent, Good, Average, Below Average, and Poor. The measurement scale is a(n) A) ratio scale. B) ordinal scale. C) interval scale. D) nominal scale.
Version 1
12
38)
What is the scale of measurement of the distance between any two locations? A) Ratio scale B) Ordinal scale C) Interval scale D) Nominal scale
39)
Which scales of data measurement are associated with quantitative data? A) Interval and ratio B) Ratio and nominal C) Ordinal and interval D) Nominal and ordinal
40)
Which data scales of measurement are associated with qualitative data? A) Interval and ratio B) Ratio and nominal C) Ordinal and interval D) Nominal and ordinal
41) The data represents the stock price for Google at the end of the past four quarters. Which of the following types of data best describe these values?
Version 1
13
A) Cross-sectional B) Nominal C) Time series D) Ordinal
42) Your business statistics class had a test last week. The average score for the class is an example of A) secondary data B) qualitative data C) descriptive statistics D) inferential statistics
43)
A sample statistic is an estimate of A) population parameter. B) population statistic. C) sample parameter. D) descriptive statistic.
44)
A __________ represents all possible subjects of interest. A) sample B) population C) statistic D) parameter
Version 1
14
45) A major portion of __________ <span style="display: none;"></span> is concerned with the problem of estimating population parameters or testing hypothesis about such parameters. A) descriptive statistics B) population statistics C) inferential statistics D) business statistics
46)
Data that describe a characteristic about a sample is known as a A) population. B) survey. C) parameter. D) statistic.
47) When a characteristic of interest differs among various observations, then it can be termed a A) parameter. B) variable. C) data. D) information.
48) A(n) __________ variable is characterized by infinitely uncountable values and can take any value within interval.
Version 1
15
A) discrete B) infinite C) continuous D) quantitative
49)
Differences between categories are meaningless with __________ data. A) ordinal B) interval C) ratio D) continuous
50)
Which of the following scales represents the strongest level of measurement? A) Ordinal B) Nominal C) Ratio D) Interval
51)
Which of the following scales represents the least sophisticated level of measurement? A) Ordinal B) Nominal C) Ratio D) Interval
52)
The values of data on a(n) __________ scale can be categorized and ranked.
Version 1
16
A) ordinal B) nominal C) ratio D) interval
53)
Which of the following characteristics does the interval scale not have? A) Values can be categorized. B) Values can be ranked. C) There is a true zero point. D) The differences between values are valid.
54)
Which of the following is an example of quantitative data? A) The ZIP code of your home address B) Google’s closing stock price today C) Your gender D) Your Social Security number
55)
Which of the following is an example of qualitative data? A) Today’s high temperature B) The class average of last test C) The amount of time you spent for your homework D) Your last name
Version 1
17
56) A respondent of a survey is asked whether the Philadelphia Flyers’ performance in the last game was excellent, good, fair, or poor. The person indicates that the performance was "good." This is an example of A) nominal data B) ordinal data C) interval data D) ratio data
57)
Which of the following is a data preparation task? A) Collecting relevant data B) Conducting a survey C) Counting relevant variables D) Communicating insights from analysis
58) You have a point-of-sale data set that provides information on time of sale, products bought, and payment methods customers used to complete their purchases. You want to compare purchases made by cash versus those made by credit card. How should the data be best prepared to make comparisons between the two types of purchases? A) Exclude cash and credit card purchases from the data set B) Impute cash and credit card purchases from the data set C) Count the data by method of payment D) Subset the data by method of payment
59) Philadelphia experienced a record amount of rainfall in August. During the last week of the month, the city received additional rain from a hurricane. Because global warming is thought to cause extreme weather patterns, one conclusion that could be drawn is that these patterns are evidence of global warming. What is wrong with this conclusion?
Version 1
18
60) Administrators have concluded that the SAT exam results for 2011 show a distinct change in student capabilities when compared with the year 1991. In 1991 the SAT exam included only multiple choice sections and was later redesigned. What is wrong with this conclusion?
61) A university is interested in tracking the success of its graduates by measuring the length of each graduate’s job search before getting a position in his or her chosen field. How would you define the appropriate population?
62) We would like to determine whether there is a difference between the height of a college team of basketball players at the Ohio State University and the height of the overall student body. Identify the two populations in this study.
Version 1
19
63) In each of the following statements, determine whether the branch of statistics is best classified as descriptive statistics or inferential statistics. A. The average of a data set is equal to 35.7. B. The minimum value of a data set is 78, and the maximum value is 146. C. Because the average age in a sample is 23, it is likely that the average age in the population is about 23. D. Because the values in the sample are so widely dispersed, the spread of the population must be high.
64) A car company wants to know the average age of cars of their brand that are still on the road. How would you define the appropriate population? Will the car company calculate a population parameter or a sample statistic? Why?
65)
What are the primary reasons that sampling is necessary?
66) An investor wants to know today’s average closing price of the stocks listed on the Standard and Poor’s 500 Index. Will the investor calculate a population parameter or sample statistic? Why?
Version 1
20
67) We would like to determine the average height of a college team of basketball players at Ohio State University. Is it necessary to take a sample of basketball players? Explain.
68) We would like to determine the average height of the overall student body at Ohio State University. Does it seem necessary to take a sample from the overall student body?
69) Researchers are interested in completing a study examining trends in the sale of foods in the U.S. They have decided to examine the quantity of organic vegetables sold by supermarkets. Will researchers be able to gather population data?
70) Every 10 years, a census is taken in the U.S. by the Census Bureau. Despite the intent of gathering data on the population of the United States, issues exist that make true population data impossible to gather. Identify at least two issues in collecting these data.
Version 1
21
71) Social networking sites support themselves in large part by selling advertising space. The hit rate on these ads is a critical measure when trying to solicit advertising. The hit rate is used as a measure of success for ads. How would you recommend a social networking site use sampling to evaluate its existing ads?
72) The following table includes the number of white women over the age of 20 in the civilian labor force. Because it is time series data, what would the entries of the first column refer to? ?
Number in Civilian Labor Force 43216 43479 44663 45409 45543 46613 47051 47833 48611 49128 48562
Source: https://data.bls.gov
Version 1
22
73) A study of teen smoking is planned. Researchers are interested in collecting crosssectional data, which allow them to draw conclusions about the likelihood, frequency, and longevity of teen smoking. You have been asked to design this study and will collect no more than five pieces of data. What information will you collect?
74) Define the measurement scale of a car’s fuel efficiency (measured in miles per gallon). Is a car’s fuel efficiency discrete or continuous?
75) A study of teen smoking is planned. Researchers are interested in collecting data which allow them to draw conclusions about the likelihood, frequency, and longevity of teen smoking. The questions asked include: “What is your gender?” “What is your age?” “Do you smoke (yes or no)?” “How many cigarettes per day do you smoke?” “For how long have you smoked (in years)?” What is the measurement scale for each variable?
76) The following data represent a sample of property sales in Cape May County during the year 2000. Identify the qualitative and quantitative variables. What are the natural categories for Town and Class? Identify the measurement scales for all variables. Town Avalon Avalon Wildwood
Version 1
Class Residential Residential Commercial
Date 12/28/2000 04/14/2000 05/01/2000
Price $ 500,000 $ 500,000 $ 500,000
Assessment $ 288,600 $ 325,900 $ 250,000
23
Avalon North Wildwood Avalon North Wildwood Avalon Avalon Wildwood
Residential Commercial Residential Residential Commercial Residential Residential
05/22/2000 06/02/2000 09/16/2000 04/07/2000 01/15/2000 01/15/2000 06/14/2000
$ 500,000 $ 500,000 $ 518,000 $ 520,000 $ 520,000 $ 525,000 $ 525,000
$ 332,500 $ 607,700 $ 269,900 $ 373,100 $ 414,600 $ 373,500 $ 379,600
77)
Statistics is the science of transforming data into useful information. ⊚ true ⊚ false
78)
Statistics is the methodology of extracting unnecessary information from a data set. ⊚ ⊚
true false
79) The branch of statistical studies called descriptive statistics summarizes important aspects of a data set. ⊚ ⊚
true false
80) The branch of statistical studies called inferential statistics refers to drawing conclusions about sample data by analyzing the corresponding population. ⊚ ⊚
Version 1
true false
24
81)
A population is a larger data set than its corresponding sample. ⊚ true ⊚ false
82)
Population parameters are used to estimate corresponding sample statistics. ⊚ true ⊚ false
83)
Typically, it is possible to examine every member of the population. ⊚ ⊚
84)
Cross-sectional data contain values of a characteristic of one subject collected over time. ⊚ ⊚
85)
true false
true false
Time series data contain values of a characteristic of a subject over time. ⊚ ⊚
true false
86) Structured data tends to include numbers, dates, and groups of words and numbers called strings. ⊚ ⊚
Version 1
true false
25
87)
Unstructured data conforms to a predefined row-column format. ⊚ ⊚
true false
88)
Big data is a catchphrase that implies a complete set of population data. ⊚ true ⊚ false
89)
A qualitative variable assumes meaningful numerical values. ⊚ ⊚
90)
true false
Both discrete and continuous variables may assume an uncountable number of values. ⊚ ⊚
true false
91)
A discrete variable cannot assume an infinite number of values. ⊚ true ⊚ false
92)
A continuous variable assumes any value from an interval (or collection of intervals). ⊚ ⊚
Version 1
true false
26
93) A professor's gender (male, female) as well as rank (assistant, associate, full) represent ordinal data. ⊚ ⊚
94)
A professor's rank (assistant, associate, and full), as well as salary, represent ordinal data. ⊚ ⊚
95)
true false
Data and data interpretation do not show up in every facet of life. ⊚ ⊚
98)
true false
The weather forecast cannot be based on only the weather for the last three days. ⊚ ⊚
97)
true false
Many people believe that statistics has no use in real life. ⊚ ⊚
96)
true false
true false
A population is defined as all possible subjects of a specific group.
Version 1
27
⊚ ⊚
99)
true false
Researchers use sample results in an attempt to estimate an unknown population statistic. ⊚ ⊚
true false
100) The recorded body temperature of patients in the group of patients under research study is an example of time series data. ⊚ ⊚
true false
101)
Body weight is an example of a discrete variable. ⊚ true ⊚ false
102)
The mathematical operation of addition can be performed on nominal data. ⊚ ⊚
103)
A ZIP code is an example of numerical data. ⊚ ⊚
104)
true false
true false
Ordinal scale reflects a stronger level of measurement than the nominal scale.
Version 1
28
⊚ ⊚
105)
true false
All mathematical operations can be performed on ratio-scaled data. ⊚ ⊚
true false
106) A respondent to a survey indicates that she drives a Nissan Pathfinder. This is an example of qualitative data. ⊚ ⊚
107)
The zero point of an interval scale reflects a complete absence of what is being measured. ⊚ ⊚
108)
true false
Statistical analysis begins with counting and sorting data. ⊚ ⊚
110)
true false
Nominal and interval scales are used for qualitative variables. ⊚ ⊚
109)
true false
true false
Data subsetting can be used to remove observations that contain missing values.
Version 1
29
⊚ ⊚
Version 1
true false
30
Answer Key Test name: Chap 01_4e 1) Qualitative: School, Degree; Quantitative: Years Experience, Salary, Classes Taught Categories for School: Brookside E.S., Program 3Emotionally Distur., Program 5-Life Skills, and Berger AcademiesHackensack; Categories for Degree: Bachelors and Masters Nominal: School; Ordinal: Degree; Ratio: Years Experience, Classes Taught, and Salary 2) Qualitative: County, Station, Title, and Grade; Quantitative: Salary Categories for County: New York County, Kings County, Erie County, and Onondaga County; Categories for Station: NEW YORK - NY, NEW YORK - KINGS, BUFFALO, and SYRACUSE; Categories for Title: MISCELLANEOUS CLERK AND ASSISTANT, ACCOUNTING TECHNICIAN, ADMINISTRATIVE OFFICER, BUDGET ANALYSIS, GENERAL BUSINESS AND INDUSTRY, MISCELLANEOUS ADMINISTRATION AND PROGRAM Categories for Grade: GS 07, GS 13, GS 09, GS 11 Nominal: County, Station, and Title; Ordinal: Grade; Ratio: Salary 3) We want to ensure that the data is complete so that there are no missing values for important variables. We also want to review the range of values for each variable so that there are no outliers.
Version 1
31
4) A. Counting the number of blanks in Phone Number; B. Counting the number of observations in the data; C. Counting the number of observations having Female as the value ofvalue of Gender. D. Sorting the data by Date of Birth in ascending order; E. Subsetting the data by Gender. 5) D 6) B 7) A 8) D 9) A 10) C 11) C 12) A 13) A 14) D 15) B 16) A 17) A 18) A 19) B 20) D 21) B 22) D 23) B 24) B 25) D Version 1
32
26) A 27) A 28) D 29) C 30) A 31) C 32) D 33) D 34) A 35) B 36) C 37) B 38) A 39) A 40) D 41) C 42) C 43) A 44) B 45) C 46) D 47) B 48) C 49) A 50) C 51) B 52) A 53) C 54) B 55) D Version 1
33
56) B 57) C 58) D 59) The existence or nonexistence of climate change cannot be based on one month’s worth of data. We must instead look at long-term trends. 60) The redesign of the exam makes any conclusions comparing 1991 to 2011 invalid. 61) The population should include all graduates of the university within a specified time period. 62) Varsity basketball players at OSU, all students at OSU 63) Descriptive Statistics: The average of a data set is equal to 35.7 and the minimum value of a data set is 78, and the maximum value is 146. Inferential Statistics: Because the average age in a sample is 23, it is likely that the average age in the population is about 23, and because the values in the sample are so widely dispersed, the spread of the population must be high. 64) The population includes all cars of their brand still on the road. The car company will calculate a sample statistic. It would be very costly to find the age of every car of their brand on the road. 65) Gathering population data can sometimes be impossible, very timeconsuming, or very expensive. 66) The investor will calculate a population parameter. It is easy to gather all the prices for the firms on the Standard and Poor’s 500 Index. 67) No, it is a small population and it is very easy to get the heights of the entire basketball team. 68) Yes, it would be very difficult and expensive to collect the heights of so many students. 69) No, population data would be very impractical and expensive to gather.
Version 1
34
70) Homeless population not surveyed, people providing intentionally false responses, illegal immigrants not wanting to be counted, no enforcement on returning completed survey, new homes being missed or demolished homes being counted, etc. 71) The existing ads should be split into categories of like products. Then some number of ads from each category could be selected for the sample to determine the average number of hits for the relevant category. Any other logical sampling methods are acceptable. 72) Time measured in years, quarters, months, etc.
73) “What is your gender?” “What is your age?” “Do you smoke?” “How many cigarettes per day do you smoke?” “For how long have you smoked?” (Other reasonable answers are acceptable.) 74) Fuel efficiency is measured on a ratio scale. The number zero is meaningful. Fuel efficiency is a continuous variable. 75) Gender: nominal; Age: ratio; Smoking: nominal; Number of cigarettes per day: ratio; Time of smoking: ratio 76) Qualitative variables: Town, Class; Quantitative variables: Date, Price, Assessment Town categories: Avalon, Wildwood, and North Wildwood; Class categories: Residential and Commercial Nominal: Town and Class; Interval: Date; Ratio: Price and Assessment 77) TRUE 78) FALSE 79) TRUE 80) FALSE 81) TRUE 82) FALSE 83) FALSE 84) FALSE 85) TRUE 86) TRUE
Version 1
35
87) FALSE 88) FALSE 89) FALSE 90) FALSE 91) FALSE 92) TRUE 93) FALSE 94) FALSE 95) FALSE 96) TRUE 97) FALSE 98) TRUE 99) FALSE 100) FALSE 101) FALSE 102) FALSE 103) FALSE 104) TRUE 105) TRUE 106) TRUE 107) FALSE 108) FALSE 109) FALSE 110) TRUE
Version 1
36
CHAPTER 02 – 4e 1) For a categorical variable, a frequency distribution groups its values into __________ and records the number of __________.
2) Graphically, we can show a(n) __________ __________ for a categorical variable by constructing a pie chart or a bar chart.
3) When constructing a frequency distribution for a numerical variable, classes are mutually __________ and __________.
4) A __________ __________ is a table that shows the number of data observations that fall into specific interval.
5) The shape of most data distributions can be categorized as either __________ or __________.
6)
A stem-and-leaf diagram most resembles a(n) __________.
7) A scatterplot depicts a positive __________ relationship, if as x increases, y tends to increase at an increasing rate.
Version 1
1
8) Using a scatterplot above we observe a __________ linear relationship between two variables: Education and Income.
9) Using the contingency table below, we observe that most of the __________ have High School as their highest level of education. Employed Associate No Yes All
Version 1
4 7 11
Highest Education Bachelors Doctorate High School 4 4 7 5 5 5 9 9 12
Masters
All
4 5 9
23 27 50
2
10) Using the stacked column chart below, we observe that more __________ than __________ have a Masters degree.
11)
Frequency distributions may be used to describe which of the following types of data? A) Nominal and ordinal data only B) Nominal and interval data only C) Nominal, ordinal, and interval data only D) Nominal, ordinal, interval, and ratio data
12)
In order to summarize a categorical variable, a useful tool is a __________.
Version 1
3
A) histogram B) frequency distribution C) stem-and-leaf diagram D) line chart
13) For both categorical and numerical variables, what is the difference between the relative frequency and the percent frequency? A) The relative frequency equals the percent frequency multiplied by 100. B) The percent frequency equals the relative frequency multiplied by 100. C) As opposed to the relative frequency, the percent frequency is divided by the number of observations in the data set. D) As opposed to the percent frequency, the relative frequency is divided by the number of observations in the data set.
14)
For which of the following data sets will a pie chart be most useful? A) Heights of high school freshmen B) Ambient temperatures in the U.S. Capitol Building C) Percentage of net sales by product for Lenovo in Year 1 D) Growth rates of firms in a particular industry
15) An auto parts chain asked customers to complete a survey rating the chain's customer service as average, above average, or below average. The following shows the results from the survey: Average
Below Average
Average
Above Average
Above Average
Above Average
Below Average
Average
Average
Below Average
Average
Below Average
Below Average
Below Average
Below Average
Version 1
4
{MISSING IMAGE}Click here for the Excel Data File The proportion of customers who felt the customer service was Average is the closest to _______. A) 0.20 B) 0.33 C) 0.46 D) 0.53
16) An auto parts chain asked customers to complete a survey rating the chain's customer service as average, above average, or below average. The following table shows the results from the survey. Average
Below Average
Average
Above Average
Above Average
Above Average
Below Average
Average
Average
Below Average
Average
Below Average
Below Average
Below Average
Below Average
{MISSING IMAGE}Click here for the Excel Data File A rating of Average or Above Average accounted for what number of responses to the survey? A) 3 B) 7 C) 8 D) 10
17) 1.
The following is a list of five of the world's busiest airports by passenger traffic for Year Name
Location
Number of Passengers (in millions)
Hartsfield-Jackson
Atlanta, Georgia, United States
89
Capital International
Beijing, China
74
London Heathrow
London, United Kingdom
67
O’Hare
Chicago, Illinois, United
66
Version 1
5
States Tokyo
Tokyo, Japan
64
The percentage of passenger traffic in the five busiest airports that occurred in Asia is the closest to ___________. A) 18% B) 21% C) 25% D) 38%
18) 1.
The following is a list of five of the world's busiest airports by passenger traffic for Year Name
Location
Number of Passengers (in millions)
Hartsfield-Jackson
Atlanta, Georgia, United States
89
Capital International
Beijing, China
74
London Heathrow
London, United Kingdom
67
O’Hare
Chicago, Illinois, United States
66
Tokyo
Tokyo, Japan
64
How many more millions of passengers flew out of Atlanta than flew out of Chicago? A) 13 B) 21 C) 23 D) 25
19) A city in California spent $7 million repairing damage to its public buildings in Year 1. The following table shows the categories where the money was directed. Cause Termites Water Damage
Version 1
Percent 30% 5%
6
Mold Earthquake Other
19% 35% 11%
How much did the city spend to fix damage caused by mold? A) $2,660,000 B) $1,330,000 C) $2,100,000 D) $665,000
20) A city in California spent $6 million repairing damage to its public buildings in Year 1. The following table shows the categories where the money was directed. Cause Termites Water Damage Mold Earthquake Other
Percent 22% 6% 12% 27% 33%
How much did the city spend to fix damage caused by mold? A) $360,000 B) $720,000 C) $1,440,000 D) $1,800,000
21) A city in California spent $6 million repairing damage to its public buildings in Year 1. The following table shows the categories where the money was directed. Cause Termites Water Damage Mold Earthquake Other
Version 1
Percent 22% 6% 12% 27% 33%
7
How much more did the city spend to fix damage caused by termites compared to the damage caused by water? A) $360,000 B) $720,000 C) $960,000 D) $1,320,000
22) Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table: 2
5
4
5
1
4
2
4
4
1
2
2
3
5
5
5
3
1
3
1
3
1
5
2
5
2
4
3
3
1
{MISSING IMAGE}Click here for the Excel Data File What is the most common score given in the evaluations? A) 5 B) 1 C) 3 D) 2
23) Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table: 1
4
4
5
5
3
4
3
4
1
5
5
4
4
2
3
3
2
3
3
4
5
5
5
5
3
2
3
3
2
{MISSING IMAGE}Click here for the Excel Data File What is the most common score given in the evaluations?
Version 1
8
A) 2 B) 3 C) 4 D) 5
24) Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table. 1
4
4
5
5
3
4
3
4
1
5
5
4
4
2
3
3
2
3
3
4
5
5
5
5
3
2
3
3
2
{MISSING IMAGE}Click here for the Excel Data File What percentage of students gave Professor Smith an evaluation higher than 3? A) 20% B) 30% C) 50% D) 80%
25) Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table. 1
4
4
5
5
3
4
3
4
1
5
5
4
4
2
3
3
2
3
3
4
5
5
5
5
3
2
3
3
2
{MISSING IMAGE}Click here for the Excel Data File What percentage of students gave Professor Smith an evaluation of 2 or less?
Version 1
9
A) 6.7% B) 13.3% C) 20% D) 80%
26) Students in Professor Smith's business statistics course have evaluated the overall effectiveness of the professor's instruction on a five-point scale, where a score of 1 indicates very poor performance and a score of 5 indicates outstanding performance. The raw scores are displayed in the accompanying table: 1
4
4
5
5
3
4
3
4
1
5
5
4
4
2
3
3
2
3
3
4
5
5
5
5
3
2
3
3
2
{MISSING IMAGE}Click here for the Excel Data File What is the relative frequency of the students who gave Professor Smith an evaluation of 3? A) 0.3 B) 0.5 C) 9 D) 15
27) In the following pie chart representing a collection of cookbooks, which author has more titles?
Version 1
10
A) Jeff Smith B) Julia Child C) Rachael Ray D) Paula Deen
28) The accompanying chart shows the numbers of books written by each author in a collection of cookbooks. What type of chart is this?
A) Bar chart for qualitative data B) Bar chart for quantitative data C) Frequency histogram for qualitative data D) Frequency histogram for quantitative data
29) The accompanying chart shows the number of books written by each author in a collection of cookbooks. What type of data is being represented?
Version 1
11
A) Quantitative, ordinal B) Quantitative, ratio C) Qualitative, nominal D) Qualitative, ordinal
30)
Horizontal bar charts are constructed by placing __________.
A) each category on the vertical axis and the appropriate range of values on the horizontal axis B) each category on the horizontal axis and the appropriate range of values on the vertical axis C) each interval of values on the vertical axis and the appropriate range of values on the horizontal axis D) each interval of values stacking on top of one another
31) When constructing a frequency distribution for a numerical variable, it is important to remember that __________. A) classes are mutually exclusive B) classes are overlapping C) the total number of classes cannot be 15 D) the total number of classes should be at least 100
32)
Which of the following best describes a frequency distribution for a categorical variable?
Version 1
12
A) It groups data into histograms and records the proportion (fraction) of observations in each histogram. B) It groups data into categories and records the number of observations in each category. C) It groups data into intervals called classes and records the proportion (fraction) of observations in each class. D) It groups data into intervals called classes and records the number of observations in each class.
33) What graphical tool is best used to display the relative frequency of a numerical variable? A) Ogive B) Pie chart C) Bar chart D) Histogram
34)
The following data represent scores on a pop quiz in a statistics section. 34
82
91
85
75
36
53
68
35
89
66
64
82
28
15
59
58
40
33
20
{MISSING IMAGE}Click here for the Excel Data File Suppose the data on quiz scores will be grouped into five classes. The width of the classes for a frequency distribution or histogram is the closest to __________. A) 14 B) 16 C) 18 D) 12
Version 1
13
35)
The following data represent scores on a pop quiz in a statistics section. 45
66
74
72
62
44
55
70
33
82
56
56
84
16
16
47
32
32
17
37
{MISSING IMAGE}Click here for the Excel Data File Suppose the data on quiz scores will be grouped into five classes. The width of the classes for a frequency distribution or histogram is the closest to __________. A) 10 B) 12 C) 14 D) 16
36)
The following data represent scores on a pop quiz in a statistics section: 45
66
74
72
62
44
55
70
33
82
56
56
84
16
16
47
32
32
17
37
{MISSING IMAGE}Click here for the Excel Data File Suppose the data are grouped into five classes, and one of them will be "30 up to 44"—that is, {x; 30 ≤ x < 44}. The frequency of this class is __________. A) 0.20 B) 0.25 C) 4 D) 5
37)
The following data represent scores on a pop quiz in a statistics section. 45
66
74
72
62
44
55
70
33
82
56
56
84
16
16
47
32
32
17
37
{MISSING IMAGE}Click here for the Excel Data File Suppose the data are grouped into five classes, and one of them will be "30 up to 44"—that is, {x; 30 ≤ x < 44}. The relative frequency of this class is _____.
Version 1
14
A) 0.20 B) 0.25 C) 4 D) 5
38) The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city. 199
124
145
163
216
128
207
227
220
138
180
210
224
178
118
177
133
166
179
243
113
228
173
263
{MISSING IMAGE}Click here for the Excel Data File Suppose the data on house prices will be grouped into five classes. The width of the classes for a frequency distribution or histogram is the closest to __________. A) 30 B) 35 C) 25 D) 20
39) The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city. 187
125
165
170
230
139
195
229
239
135
188
210
228
172
127
139
122
181
196
237
115
199
170
239
{MISSING IMAGE}Click here for the Excel Data File Suppose the data on house prices will be grouped into five classes. The width of the classes for a frequency distribution or histogram is the closest to __________.
Version 1
15
A) 15 B) 20 C) 25 D) 30
40) The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city 187
125
165
170
230
139
195
229
239
135
188
210
228
172
127
139
122
181
196
237
115
199
170
239
{MISSING IMAGE}Click here for the Excel Data File Suppose the data are grouped into five classes, and one of them will be "115 up to 140"—that is, {x; 115 ≤ x < 140}. The relative frequency of this class is __________. A) 6/24 B) 7/24 C) 6 D) 7
41) The following data represent the recent sales price (in $1,000s) of 24 homes in a Midwestern city. 187
125
165
170
230
139
195
229
239
135
188
210
228
172
127
139
122
181
196
237
115
199
170
239
{MISSING IMAGE}Click here for the Excel Data File Suppose the data are grouped into five classes, and one of them will be "165 up to 190"—that is, {x; 165 ≤ x < 190}. The frequency of this class is __________. A) 6/24 B) 7/24 C) 6 D) 7
Version 1
16
42) Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 2,030
1,900
2,120
1,740
1,630
2,160
2,340
2,350
1,910
1,770
2,030
2,110
2,110
2,010
1,460
1,850
2,360
2,010
2,040
2,380
2,160
1,840
1,780
2,180
1,490
2,190
2,320
1,980
2,350
1,770
{MISSING IMAGE}Click here for the Excel Data File Consider a frequency distribution of the data that groups the data in classes of 1,400 up to 1,600, 1,600 up to 1,800, 1,800 up to 2,000, and so on. How many students scored at least 1,800 but less than 2,000? A) 7 B) 3 C) 2 D) 5
43) Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 1,450
1,620
1,800
1,740
1,650
1,710
1,900
1,910
1,950
1,820
1,800
2,010
1,780
1,840
1,490
1,590
2,350
2,260
1,870
1,530
1,620
1,480
2,390
1,640
1,830
1,950
2,000
1,830
1,980
2,100
{MISSING IMAGE}Click here for the Excel Data File Consider a frequency distribution of the data that groups the data in classes of 1,400 up to 1,600, 1,600 up to 1,800, 1,800 up to 2,000, and so on. How many students scored at least 1,800 but less than 2,000? A) 3 B) 7 C) 12 D) 18
Version 1
17
44) Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 1,450
1,620
1,800
1,740
1,650
1,710
1,900
1,910
1,950
1,820
1,800
2,010
1,780
1,840
1,490
1,590
2,350
2,260
1,870
1,530
1,620
1,480
2,390
1,640
1,830
1,950
2,000
1,830
1,980
2,100
{MISSING IMAGE}Click here for the Excel Data File Consider a frequency distribution of the data that groups the data in classes of 1,400 up to 1,600, 1,600 up to 1,800, 1,800 up to 2,000, and so on. What percent of students scored less than 2,200? A) 10% B) 20% C) 80% D) 90%
45) Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 1,450
1,620
1,800
1,740
1,650
1,710
1,900
1,910
1,950
1,820
1,800
2,010
1,780
1,840
1,490
1,590
2,350
2,260
1,870
1,530
1,620
1,480
2,390
1,640
1,830
1,950
2,000
1,830
1,980
2,100
{MISSING IMAGE}Click here for the Excel Data File Consider a frequency distribution of the data that groups the data in classes of 1,400 up to 1,600, 1,600 up to 1,800, 1,800 up to 2,000, and so on. What is the approximate relative frequency of students who scored more than 1,600 but less than 1,800? A) 0.17 B) 0.23 C) 0.40 D) 0.77
46) Thirty students at Eastside High School took the SAT on the same Saturday. Their raw scores are given next. 1,450
1,620
1,800
1,740
1,650
1,710
1,900
1,910
1,950
1,820
1,800
2,010
1,780
1,840
1,490
1,590
2,350
2,260
1,870
1,530
Version 1
18
1,620
1,480
2,390
1,640
1,830
1,950
2,000
1,830
1,980
2,100
{MISSING IMAGE}Click here for the Excel Data File Consider a frequency distribution of the data that groups the data in classes of 1,400 up to 1,600, 1,600 up to 1,800, 1,800 up to 2,000, and so on. What graphical tool would you use to display the cumulative relative frequency of the grouped data? A) Ogive B) Polygon C) Pie chart D) Bar chart
47)
Consider the following frequency distribution. Class
Frequency
12 up to 15
6
15 up to 18
8
18 up to 21
8
21 up to 24
9
24 up to 27
3
The total number of observations in the frequency distribution is __________. A) 19 B) 20 C) 34 D) 38
48)
Consider the following frequency distribution. Class
Frequency
12 up to 15
3
15 up to 18
6
18 up to 21
3
21 up to 24
4
24 up to 27
4
The total number of observations in the frequency distribution is __________.
Version 1
19
A) 5 B) 6 C) 20 D) 24
49)
Consider the following frequency distribution. Class
Frequency
12 up to 15
3
15 up to 18
6
18 up to 21
3
21 up to 24
4
24 up to 27
4
How many observations are at least 15 but less than 18? A) 3 B) 4 C) 5 D) 6
50)
Consider the following frequency distribution. Class
Frequency
12 up to 15
3
15 up to 18
6
18 up to 21
3
21 up to 24
4
24 up to 27
4
How many observations are less than 21? A) 6 B) 12 C) 18 D) 24
Version 1
20
51)
Consider the following frequency distribution. Class
Frequency
12 up to 15
3
15 up to 18
6
18 up to 21
3
21 up to 24
4
24 up to 27
4
What proportion of the observations are at least 15 but less than 18? A) 0.20 B) 0.25 C) 0.30 D) 0.35
52)
Consider the following frequency distribution. Class
Frequency
12 up to 15
3
15 up to 18
6
18 up to 21
3
21 up to 24
4
24 up to 27
4
What proportion of the observations are less than 21? A) 0.30 B) 0.60 C) 0.90 D) 1.00
Version 1
21
53) The following histogram represents the number of pages in each book within a collection. What is the frequency of books containing at least 250 but fewer than 300 pages?
A) 5 B) 6 C) 7 D) 12
54) The following histogram represents the number of pages in each book within a collection. What is the frequency of books containing at least 200 but fewer than 250 pages?
A) 4 B) 5 C) 6 D) 7
Version 1
22
55) The following histogram represents the number of pages in each book within a collection. What is the frequency of books containing at least 250 but fewer than 400 pages?
A) 7 B) 10 C) 11 D) 12
56) An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks. Class (in percent)
Frequency
–10 up to 0
5
0 up to 10
22
10 up to 20
18
20 up to 30
8
The number of stocks with returns of 0% up to 10% is __________. A) 8 B) 18 C) 5 D) 22
57) An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks.
Version 1
Class (in percent)
Frequency
−10 up to 0
8
23
0 up to 10
25
10 up to 20
15
20 up to 30
2
The number of stocks with returns of 0% up to 10% is __________. A) 2 B) 8 C) 15 D) 25
58) An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks. Class (in percent)
Frequency
−10 up to 0
8
0 up to 10
25
10 up to 20
15
20 up to 30
2
The number of stocks with returns of less than 10% is __________. A) 8 B) 25 C) 33 D) 48
59) An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks: Class (in percent)
Frequency
−10 up to 0
8
0 up to 10
25
10 up to 20
15
20 up to 30
2
The proportion of stocks with returns of 0% up to 10% is __________.
Version 1
24
A) 0.30 B) 0.50 C) 0.66 D) 0.80
60) An analyst constructed the following frequency distribution on the monthly returns for 50 selected stocks. Class (in percent)
Frequency
−10 up to 0
8
0 up to 10
25
10 up to 20
15
20 up to 30
2
The proportion of stocks with returns of less than 10% is __________. A) 0.30 B) 0.50 C) 0.66 D) 0.80
61) Automobiles traveling on a road with a posted speed limit of 65 miles per hour are checked for speed by a state police radar system. The following table is a frequency distribution of speeds. Speed (miles per hour)
Frequency
45 up to 55
58
55 up to 65
316
65 up to 75
266
75 up to 85
20
How many of the cars traveled less than 75 miles per hour? A) 315 B) 265 C) 640 D) 665
Version 1
25
62) Automobiles traveling on a road with a posted speed limit of 65 miles per hour are checked for speed by a state police radar system. The following table is a frequency distribution of speeds. Speed (miles per hour)
Frequency
45 up to 55
50
55 up to 65
325
65 up to 75
275
75 up to 85
25
How many of the cars traveled less than 75 miles per hour? A) 275 B) 325 C) 650 D) 675
63) Automobiles traveling on a road with a posted speed limit of 65 miles per hour are checked for speed by a state police radar system. The following table is a frequency distribution of speeds. Speed (miles per hour)
Frequency
45 up to 55
50
55 up to 65
325
65 up to 75
275
75 up to 85
25
What proportion of the cars traveled at least 55 but less than 65 miles per hour? A) 0.33 B) 0.48 C) 0.56 D) 0.80
64)
When using a polygon to visualize a numerical variable, what does each point represent?
Version 1
26
A) The lower limit of a particular class and its width B) The midpoint of a particular class and its associated frequency or relative frequency C) The midpoint of a particular class and its associated cumulative frequency or cumulative relative frequency D) The upper limit of a particular class and its associated cumulative frequency or cumulative relative frequency
65)
The accompanying table shows students' scores from the final exam in a history course. Scores
Cumulative Frequency
50 up to 60
11
60 up to 70
33
70 up to 80
64
80 up to 90
93
90 up to 100
100
How many of the students scored at least 70 but less than 90? A) 60 B) 29 C) 36 D) 93
66)
The accompanying table shows students' scores from the final exam in a history course. Scores
Cumulative Frequency
50 up to 60
12
60 up to 70
33
70 up to 80
64
80 up to 90
88
90 up to 100
100
How many of the students scored at least 70 but less than 90?
Version 1
27
A) 24 B) 31 C) 55 D) 88
67) The following table shows the number of payroll jobs the government added during the years it added jobs (since 1973). The jobs are in thousands. Jobs Added
Frequency
100 up to 200
5
200 up to 300
8
300 up to 400
7
400 up to 500
5
500 up to 600
1
Approximately what percent of the time did the government add 200,000 or more jobs? A) 19% B) 50% C) 77% D) 81%
68) The accompanying relative frequency distribution represents the last year car sales for the sales force at Kelly's Mega Used Car Center. Car Sales
Relative Frequency
35 up to 45
0.08
45 up to 55
0.19
55 up to 65
0.30
65 up to 75
0.26
75 up to 85
0.17
If Kelly's employs 100 salespeople, how many of these salespeople have sold at least 35 but fewer than 45 cars in the last year?
Version 1
28
A) 6 B) 11 C) 16 D) 8
69) The accompanying relative frequency distribution represents the last year car sales for the sales force at Kelly's Mega Used Car Center. Car Sales
Relative Frequency
35 up to 45
0.07
45 up to 55
0.15
55 up to 65
0.31
65 up to 75
0.22
75 up to 85
0.25
If Kelly's employs 100 salespeople, how many of these salespeople have sold at least 35 but fewer than 45 cars in the last year? A) 5 B) 7 C) 10 D) 15
70) The accompanying relative frequency distribution represents the last year car sales for the sales force at Kelly's Mega Used Car Center. Car Sales
Relative Frequency
35 up to 45
0.07
45 up to 55
0.15
55 up to 65
0.31
65 up to 75
0.22
75 up to 85
0.25
If Kelly's employs 100 salespeople, how many of these salespeople have sold at least 45 but fewer than 65 cars in the last year?
Version 1
29
A) 15 B) 31 C) 40 D) 46
71) The accompanying relative frequency distribution represents the last year car sales for the sales force at Kelly's Mega Used Car Center. Car Sales
Relative Frequency
35 up to 45
0.07
45 up to 55
0.15
55 up to 65
0.31
65 up to 75
0.22
75 up to 85
0.25
If Kelly's employs 100 salespeople, how many of these salespeople have sold at least 65 cars in the last year? A) 22 B) 25 C) 31 D) 47
72)
When visualizing a numerical variable, what is an ogive used to plot?
A) Frequency or relative frequency of each class against the midpoint of the corresponding class B) Cumulative frequency or cumulative relative frequency of each class against the upper limit of the corresponding class C) Frequency or relative frequency of each class against the midpoint of the corresponding class and cumulative frequency or cumulative relative frequency of each class against the upper limit of the corresponding class D) Cumulative frequency or cumulative relative frequency of each class against the lower limit of the corresponding class
Version 1
30
73)
How does an ogive differ from a polygon?
A) An ogive is used for qualitative data, while a polygon is used for numerical data. B) An ogive is used for quantitative data, while a polygon is used for categorical data. C) An ogive is a graphical depiction of a frequency or relative distribution, while a polygon is a graphical depiction of a cumulative frequency or cumulative relative frequency distribution. D) An ogive is a graphical depiction of a cumulative frequency or cumulative relative frequency distribution, while a polygon is a graphical depiction of a frequency or relative frequency distribution.
74) Recent home sales in a suburb of Washington, D.C., are shown in the accompanying ogive.
Approximate the percentage of houses that sold for less than $600,000. A) 60% B) 70% C) 80% D) 90%
Version 1
31
75) Recent home sales in a suburb of Washington, D.C., are shown in the accompanying ogive.
Approximate the percentage of houses that sold for more than $500,000. A) 40% B) 50% C) 60% D) 70%
76) The organization of the Girl Scouts has completed its annual cookie drive. The sales are reported in the accompanying ogive.
Approximate the percentage of girls who sold fewer than 90 boxes of cookies.
Version 1
32
A) 45% B) 55% C) 65% D) 75%
77) The organization of the Girl Scouts has completed its annual cookie drive. The sales are reported in the accompanying ogive.
Approximate the percentage of girls who sold more than 70 boxes of cookies. A) 45% B) 55% C) 65% D) 75%
78) A stem-and-leaf diagram is constructed by separating each value of a data set into two parts. What are these parts? A) Stem consisting of the last digit and leaf consisting of the leftmost digits B) Stem consisting of the leftmost digits and leaf consisting of the second digit C) Stem consisting of the second digit and leaf consisting of the last digit D) Stem consisting of the leftmost digits and the leaf consists of the remaining digits.
Version 1
33
79) In the accompanying stem-and-leaf diagram, the values in the stem and leaf portions represent 10s and 1s digits, respectively. Stem
Leaf
1
1 2 2 4 5 7
2
0 4 4 5 5 7 7 7 7 8 8 9
3
0 0 4 7 8
4
0 7
Which of the following numbers appears in the stem-and-leaf diagram? A) 3,000 B) 30 C) 300 D) 3.0
80) In the accompanying stem-and-leaf diagram, the values in the stem and leaf portions represent 10s and 1s digits, respectively. Stem
Leaf
1
3 5 6 8 8 9
2
0 1 2 2 3 5 6 6 8 8 8 9
3
0 1 2 2 8
4
2 2
Which of the following numbers appears in the stem-and-leaf diagram? A) 3800 B) 380 C) 38 D) 3.8
81) In the accompanying stem-and-leaf diagram, the values in the stem-and-leaf portions represent 10s and 1s digits, respectively. Stem 1
Version 1
Leaf 3 5 6 8 8 9 34
2
0 1 2 2 3 5 6 6 8 8 8 9
3
0 1 2 2 8
4
2 2
What would be the frequency of the class 35 up to 45, that is {x; 35 ≤ x < 45}? A) 0 B) 1 C) 2 D) 3
82) In the accompanying stem-and-leaf diagram, the values in the stem-and-leaf portions represent 10s and 1s digits, respectively. Stem
Leaf
1
3 5 6 8 8 9
2
0 1 2 2 3 5 6 6 8 8 8 9
3
0 1 2 2 8
4
2 2
How many values are at least 25 but less than 35? A) 10 B) 11 C) 12 D) 13
83) In the accompanying stem-and-leaf diagram, the values in the stem-and-leaf portions represent 10s and 1s digits, respectively. Stem
Leaf
1
3 5 6 8 8 9
2
0 1 2 2 3 5 6 6 8 8 8 9
3
0 1 2 2 8
4
2 2
Find the frequency associated with data values that are more than 28.
Version 1
35
A) 8 B) 9 C) 10 D) 11
84) In the accompanying stem-and-leaf diagram, the values in the stem-and-leaf portions represent 10s and 1s digits, respectively. Stem
Leaf
1
3 5 6 8 8 9
2
0 1 2 2 3 5 6 6 8 8 8 9
3
0 1 2 2 8
4
2 2
The stem-and-leaf diagram shows that the distribution is __________. A) symmetric B) positively skewed C) negatively skewed D) bimodal
85) The following stem-and-leaf diagram shows the speeds in miles per hour (mph) of 14 cars approaching a toll booth on a bridge in Oakland, California. Stem
Leaf
2
2 4 6 8 9
3
2 2 2 4 8
4
0 0 7 9
How many of the cars were traveling faster than 25 mph but slower than 40 mph?
A) 7 B) 8 C) 9 D) 11
Version 1
36
86) The following stem-and-leaf diagram shows the speeds in miles per hour (mph) of 14 cars approaching a toll booth on a bridge in Oakland, California. Stem
Leaf
2
5 6 6 7 9
3
4 7 7 8 9
4
0 0 2 3
How many of the cars were traveling faster than 25 mph but slower than 40 mph? A) 8 B) 9 C) 10 D) 12
87) The following stem-and-leaf diagram shows the last 20 dividend payments (in cents) paid by Procter and Gamble. Stem
Leaf
3
1 5 5 5 5
4
0 0 0 0 4 4 4 4 4 8 8 8
5
3 3 3
The most common dividend payment is __________. A) 0.35 B) 0.40 C) 0.44 D) 0.48
88)
What may NOT be revealed from a scatterplot?
Version 1
37
A) No relationship between two variables B) A linear relationship between two variables C) A curvilinear relationship between two variables D) A causal relationship between two variables
89)
What type of relationship is indicated in the scatterplot?
A) No relationship B) A negative linear relationship C) A negative curvilinear relationship D) A positive linear or curvilinear relationship
90)
What type of relationship is indicated in the scatterplot?
Version 1
38
A) No relationship B) A negative linear relationship C) A positive linear relationship D) A positive curvilinear relationship
91)
Use the following data to construct a scatterplot. What type of relationship is implied? x
3
6
10
14
18
23
y
34
28
20
12
5
0
{MISSING IMAGE}Click here for the Excel Data File A) No relationship B) A positive relationship C) A negative relationship D) There is not enough information to answer
92)
Use the following data to construct a scatterplot. What type of relationship is implied? x
1
5
9
14
18
23
y
2
4
15
12
15
20
{MISSING IMAGE}Click here for the Excel Data File A) No relationship B) A positive relationship C) A negative relationship D) Not enough information to answer
Version 1
39
93) A car dealership created a scatterplot showing the manufacturer's retail price and profit margin for the cars they have on their lot.
As the manufacturer's suggested retail price increases, the profit margin tends to __________. A) increase B) decrease C) stay the same D) oscillate
94) The statistics professor has kept attendance records and recorded the number of absent students per class. The recorded data is displayed in the following bar chart with the frequency of each number of absent students shown above the bars.
How many statistics classes had three or more students absent?
Version 1
40
A) 8 B) 13 C) 22 D) 43
95) The following table shows the percentage of e-mail that is sent each day of the business week according to an Intermedia survey. Day Monday Tuesday Wednesday Thursday Friday
Percentage 15% 23% 22% 21% 19%
Which of the following best displays this data? A) Horizontal bar chart B) Vertical bar chart C) Pie chart D) Histogram
96) The following frequency distribution displays the weekly sales of a certain brand of television at an electronics store. Number Sold
Frequency
01-05
4
06-10
6
11-15
21
16-20
12
21-25
4
How many weeks of data are included in this frequency distribution?
Version 1
41
A) 72 B) 47 C) 94 D) 22
97) The following frequency distribution displays the weekly sales of a certain brand of television at an electronics store. Number Sold
Frequency
01-05
3
06-10
7
11-15
14
16-20
22
21-25
4
How many weeks of data are included in this frequency distribution? A) 25 B) 50 C) 75 D) 100
98) The following frequency distribution shows the frequency of the asking price, in thousands of dollars, for current homes on the market in a particular city. Asking Price
Frequency
$350 up to $400
16
$400 up to $450
16
$450 up to $500
5
$500 up to $550
6
$550 up to $600
4
What percentage of houses has an asking price between $350,000 and under $400,000?
Version 1
42
A) 34.0% B) 45.5% C) 43.1% D) 28.6%
99) The following frequency distribution shows the frequency of the asking price, in thousands of dollars, for current homes on the market in a particular city. Asking Price
Frequency
$350 up to $400
12
$400 up to $450
9
$450 up to $500
17
$500 up to $550
11
$550 up to $600
6
What percentage of houses has an asking price between $350,000 and under $400,000? A) 16.4% B) 21.8% C) 30.9% D) 33.3%
100) The following frequency distribution shows the frequency of the asking price, in thousands of dollars, for current homes on the market in a particular city. Asking Price
Frequency
$350 up to $400
12
$400 up to $450
9
$450 up to $500
17
$500 up to $550
11
$550 up to $600
6
What percentage of houses has an asking price under $550,000?
Version 1
43
A) 50.5% B) 69.1% C) 89.1% D) 95.0%
101) A survey conducted by CBS news asked 1,026 respondents: "What would you do with an unexpected tax refund?" The responses are summarized in the following table. Category Pay off debts Put it in the bank Spend it I never get a refund Other
Percentage 47% 30% 11% 10% 2%
How many people will either put it in the bank or spend it? A) 421 B) 411 C) 113 D) 482
102) The manager at a water park constructed the following frequency distribution to summarize attendance in July and August. Attendance
Frequency
1,000 up to 1,250
5
1,250 up to 1,500
6
1,500 up to 1,750
10
1,750 up to 2,000
20
2,000 up to 2,250
15
2,250 up to 2,500
4
What of the following is the most likely attendance range?
Version 1
44
A) 2,000 up to 2,500 B) 1,000 up to 1,750 C) 1,250 up to 1,750 D) 1,750 up to 2,000
103) The Statistical Abstract of the United States provided the following frequency distribution of the number of people who live below the poverty level by region. Region
Number of People (in 1000s)
Northeast
6,166
Midwest
7,237
South
15,501
West
8,372
What is the percentage of people who live below the poverty level in the West or Midwest? A) 35.96% B) 41.87% C) 41.58% D) 31.96%
104)
Consider the following stem-and-leaf diagram.
Stem
Leaf
3
1 1 1 4 5
4
4 6 7
5
0 0 4 5 6 6 8 9
6
1 3 3 6
Which data value occurs most often? A) 1 B) 56 C) 31 D) 63
Version 1
45
105)
Consider the following stem-and-leaf diagram.
Stem
Leaf
3
1 1 1 4 5
4
4 6 7
5
0 0 4 5 6 6 8 9
6
1 3 3 6
Which of the following statements is correct? A) There is a total of 10 data values in this data set. B) The data value that occurs most often is 50. C) This largest data value is 59. D) The range 50-59 contains the most values.
106) Which of the following methods best describe the relationship between two categorical variables? A) A pie chart B) A bar chart C) A stacked column chart D) A column chart
107) The Chief Diversity Officer of an accounting firm wants to make sure gender equity is achieved among all levels of positions of the firm. The Officer has information on an employee’s gender, position level, and marital status as shown below: Employee 1 2 : 5000
Gender Male Female : Male
Position Level Top Bottom : Middle
Marital Status Widowed Married : Single
Which of the following methods is best to help determine the firm’s status on gender equity?
Version 1
46
A) A contingency table by Gender and Position Level. B) A stacked column chart by Gender and Marital Status. C) A frequency distribution of Position Level. D) A column chart by Gender.
108)
Which of the following is not a graphical technique to display numerical data? A) stem-and-leaf B) histogram C) scatterplot D) bar chart
109) A survey of 400 unemployed people was completed at a job fair. Each person was asked to categorize his or her job interests. The accompanying relative frequency distribution was constructed. Field
Relative Frequency
Management
0.15
Business and financial operations
0.20
Computer and mathematical
0.10
Life, physical, and social science
0.30
Community and social service
0.25
a. Outside of Connect, construct the corresponding frequency distribution. How many of these people designated that the computer and mathematical industry was their job interest? b. By hand or in Excel, construct a pie chart.
110) A hair stylist records the hair color of her 25 most recent appointments, classifying the color as blonde, brown, black, or red. Her data set is displayed next. Red
Version 1
Blonde
Black
Red
Blonde 47
Blonde
Black
Blonde
Red
Blonde
Brown
Black
Red
Blonde
Brown
Brown
Red
Black
Black
Red
Brown
Black
Brown
Blonde
Blonde
{MISSING IMAGE}Click here for the Excel Data File a. Outside of Connect, construct a frequency and relative frequency distribution of the hair color of the stylist's customers. What is the frequency and relative frequency of each hair color? b. By hand or in Excel, construct a pie chart. Which hair color is the most common among the stylist's customers? c. By hand or in Excel, create a bar chart to display the frequency distribution. How many customers had black hair?
111) The following table lists some of the busiest ports in the world based on the number of containers in Year 1. Location of Port
Number of Containers (in millions)
Shanghai
29
Singapore
28
Hong Kong
24
Rotterdam
11
Los Angeles
7
New York
5
By hand or in Excel, construct a pie chart to summarize the data. Approximately what percent of the total number of containers go through Hong Kong?
Version 1
48
112) Johnson and Johnson (JNJ) is a consumer staples company. Consumer staples are products people need and buy even during times of financial hardship. Do you think JNJ will have a volatile stock price? Does the accompanying graph accurately depict the volatility of JNJ stock? Explain.
113) Each month the Bureau of Labor Statistics reports the number of people (in thousands) employed in the United States by age. The accompanying frequency distribution shows the results for August. Age
Frequency
16 to 19
4,794
20 to 24
13,273
25 to 34
30,789
35 to 44
30,021
45 to 54
32,798
55 and over
28,660
a. Outside of Connect, construct a relative frequency distribution. What proportion of workers is between 20 and 24 years old? b. Outside of Connect, construct a cumulative relative frequency distribution. What proportion of workers is younger than 35 years old? c. By hand or in Excel, construct a relative frequency histogram.
Version 1
49
114) The following table displays the top 40 American League batting averages of the last season. Player
Batting Average Player
Batting Average
Miguel Cabrera
0.344
Yunel Escobar
0.290
Adrian Gonzalez
0.338
Vladimir Guerrero
0.290
Michael Young
0.338
Alberto Callaspo
0.288
Victor Martinez
0.33
Howard Kendrick
0.285
Jacoby Ellsbury
0.321
Jeff Francoeur
0.285
David Ortiz
0.309
Nick Markakis
0.284
Dustin Pedroia
0.307
Michael Cuddyer
0.284
Casey Kotchman
0.306
Adam Jones
0.280
Melky Cabrera
0.305
Elvis Andrus
0.279
Alex Gordon
0.303
Erick Aybar
0.279
Jose Bautista
0.302
Juan Pierre
0.279
Robinson Cano
0.302
Matt Joyce
0.277
Paul Konerko
0.300
Asdrubal Cabrera
0.273
Jhonny Peralta
0.299
Edwin Encarnacion
0.272
Josh Hamilton
0.298
Ichiro Suzuki
0.272
Derek Jeter
0.297
Peter Bourjos
0.271
Adrian Beltre
0.296
J.J. Hardy
0.269
Alex Avila
0.295
Alexei Ramirez
0.269
Eric Hosmer
0.293
Ben Zobrist
0.269
Billy Butler
0.291
Delmon Young
0.268
Version 1
50
{MISSING IMAGE}Click here for the Excel Data File a. Outside of Connect, construct frequency, relative frequency, and cumulative relative frequency distributions that group the data in classes of 0.265 up to 0.280, 0.280 up to 0.295, 0.295 up to 0.310, and so on. b. How many of these players have a batting average above 0.340? What proportion of these players has a batting average of at least 0.280 but below 0.295? What percentage of these players has a batting average below 0.325? c. Construct a relative frequency histogram. Is the distribution symmetric? If not, is it positively or negatively skewed? d. By hand or in Excel, construct an ogive. Using the ogive, approximately what proportion of the players in this group has a batting average above 0.290?
115) The following table shows analyst sentiment ratings for the 30 stocks listed in the Dow Jones Industrial Average. 7
4
6
8
4
9
4
2
2
4
6
4
5
6
5
3
8
4
9
6
2
9
7
8
4
3
9
4
6
7
{MISSING IMAGE}Click here for the Excel Data File a. Outside of Connect, construct a frequency distribution, relative frequency distribution, cumulative frequency distribution, and relative cumulative frequency distribution using classes of 2 up to 4, 4 up to 6, 6 up to 8, and 8 up to 10. b. Outside of Connect, construct a histogram that summarizes the data. What percentage of the stocks in the Dow Jones Industrial Average received a sentiment rating less than 8? c. What percentage of the stocks in the Dow Jones Industrial Average received a sentiment rating of 6 or more?
Version 1
51
116) The accompanying cumulative relative frequency distribution shows a summary of the scores from an Algebra II exam at a local high school. Twenty students took the exam. Class
Cumulative Relative Frequency
51 - 60
0.05
61 - 70
0.20
71 - 80
0.45
81 - 90
0.80
91 - 100
1.00
a. Outside of Connect, construct the relative frequency distribution. What proportion of students scored between 81 and 90? b. Outside of Connect, construct the frequency distribution. How many students scored between 71 and 80? c. By hand or in Excel, construct an ogive. What is the approximate percentage of students that scored less than 85?
117) The dividend yields of the stocks in an investor's portfolio are shown in the following cumulative relative frequency distribution. Dividend Yields
Cumulative Relative Frequency
0% up to 2%
0.55
2% up to 4%
0.85
4% up to 6%
0.90
6% up to 8%
0.96
8% up to 10%
1.00
Version 1
52
By hand or in Excel, construct an ogive. Approximately what percent of the stocks had a dividend yield of 3% or larger?
118)
Outside of Connect, construct a stem-and-leaf diagram with the following data set.
3.2
1.3
2.1
2.4
4.3
3.1
3.2
1.1
1.4
2.5
2.4
2.9
3.8
1.7
2.3
1.2
3.2
1.4
1.5
2.6
{MISSING IMAGE}Click here for the Excel Data File Is the distribution symmetric? Stem
Leaf
1
1 2 3 4 4 5 7
2
1 3 4 4 5 6 9
3
1 2 2 2 8
4
3
119)
Outside of Connect, construct a stem-and-leaf diagram for the following data set.
74
75
63
62
56
79
58
79
53
49
78
69
74
72
53
72
64
65
67
77
{MISSING IMAGE}Click here for the Excel Data File Is the distribution symmetric? Stem
Leaf
4
9
5
3 3 6 8
6
2 3 4 5 7 9
7
2 2 4 4 5 7 8 9 9
Version 1
53
120) The following table shows average wind speeds (in miles per hour) during 15 major fires in California. 44
55
22
32
29
24
47
33
32
27
58
39
38
51
41
{MISSING IMAGE}Click here for the Excel Data File Outside of Connect, construct a stem-and-leaf diagram. Were most of these storms fueled by 45+ mile-per-hour winds? Explain.
121) The following table shows the prices (in $1,000s) of the last 15 trucks sold at a Toyota dealership. 32
21
26
33
23
24
31
22
17
25
18
23
22
19
35
{MISSING IMAGE}Click here for the Excel Data File Outside of Connect, construct a stem-and-leaf diagram. Given this diagram, estimate the price that a potential buyer would likely pay for a Toyota truck.
122) The following data represent the ages of patients in the cardiac section of the local hospital. Outside of Connect, construct a stem-and-leaf diagram. Comment on whether or not the distribution is symmetric. 48
53
60
61
62
63
70
70
72
77
78
79
80
82
87
88
{MISSING IMAGE}Click here for the Excel Data File Outside of Connect, construct a stem-and-leaf diagram. Comment on whether or not the distribution is symmetric.
Version 1
54
90
123) A high school football league recorded the average points scored per game, as well as the winning percentage, for the 10 teams in the league. Points per Game 24 21 27 13 16 18 15 17 19 22
Winning Percentage 88% 66% 78% 28% 32% 52% 30% 44% 32% 50%
{MISSING IMAGE}Click here for the Excel Data File Outside of Connect, construct a scatterplot. Does scoring more points appear to be associated with a higher winning percentage?
124) A statistics instructor computes the grade and percentage of classes that each of his students attends. Outside of Connect, construct a scatterplot from the data displayed next. Does a relationship exist between attendance and grade? Attendance
47
60
75
86
95
98
100
Grade
58
72
85
84
90
97
92
{MISSING IMAGE}Click here for the Excel Data File
125) A grocery store manager wants to know how the annual demand for wild-caught versus farm-raised salmon changes over time. He has demand data for the two types of salmon from 2010 to 2020: Year 2010
Version 1
Wild-caught ('000 lbs.) 15.25
Farm-raised ('000 lbs.) 18.54
55
2011 2012 2013 2014 2015 2016 2017 2018 2019 2020
13.63 18.00 19.34 18.71 25.08 28.44 28.70 30.11 39.54 40.06
20.37 20.25 20.15 20.06 19.98 17.91 18.82 19.70 17.59 16.47
{MISSING IMAGE}Click here for the Excel Data File By hand or in Excel, construct a line chart to show the changes in demand over time for the two types of salmon. Which type of salmon has an upward trend in demand?
126) You want to examine if there is a performance difference between online and traditional students in your Business Statistics class. The data below shows information on a student’s grade and whether he/she is an online student: Student 1 2 3 : 50
Online Yes Yes No : Yes
Grade A B A : A
{MISSING IMAGE}Click here for the Excel Data File Outside of Connect, construct a contingency table that cross-classifies the data by Online and Grade. How many are online students? How many online students earned an A/B grade? How many traditional students earn an A/B grade? How many online students earned an E grade? How many traditional students earned an E grade? Outside of Connect, construct a stacked column chart. Does there appear to be a performance gap for online students?
Version 1
56
127) A frequency distribution for a categorical variable groups these data into classes called intervals and records the total number of observations in each class. ⊚ true ⊚ false
128) The relative frequency of a category is calculated by dividing the category's frequency by the total number of observations. ⊚ true ⊚ false
129) The percent frequency of a category equals the frequency of the category multiplied by 100%. ⊚ true ⊚ false
130) A pie chart is a segmented circle that portrays the categories and relative sizes of some numerical variable. ⊚ true ⊚ false
131) A bar chart depicts the frequency or relative frequency of each category of the categorical variable as a bar rising vertically from the horizontal axis. It is also acceptable for the bar to extend horizontally from the vertical axis. ⊚ true ⊚ false
Version 1
57
132)
A bar chart may be displayed horizontally. ⊚ true ⊚ false
133)
A contingency table shows the frequencies for two categorical variables. ⊚ ⊚
true false
134) A stacked column chart is an advanced version of a bar chart designed to visualize one numerical variable only. ⊚ ⊚
true false
135) To approximate the width of a class in the creation of a bar chart, we may use this formula: (Maximum value − Minimum value) / Number of classes ⊚ true ⊚ false
136) For a numerical variable, a relative frequency distribution identifies the proportion of observations that fall into each class. ⊚ true ⊚ false
137) For a numerical variable, a cumulative relative frequency distribution records the proportion (fraction) of values that fall below the upper limit of each class. Version 1
58
⊚ ⊚
true false
138) A histogram is a series of rectangles where the width and height of each rectangle represent the frequency (or relative frequency) and the width of the respective class. ⊚ true ⊚ false
139) A polygon connects a series of neighboring points where each point represents the midpoint of a particular class and its associated frequency or relative frequency. ⊚ true ⊚ false
140) An ogive is a graph that plots the cumulative frequency (or the cumulative relative frequency) of each class above the lower limit of the corresponding class. ⊚ true ⊚ false
141) A stem-and-leaf diagram is useful in that it gives an overall picture of where the observations of a numerical variable are centered and how the data are dispersed from the center. ⊚ true ⊚ false
142) A scatterplot is a graphical tool that helps determine whether or not two numerical variables are related. ⊚ true ⊚ false
Version 1
59
143) When constructing a scatterplot for two numerical variables, we usually refer to one variable as x and another one as y. Typically, we graph x on the vertical axis and y on the horizontal axis. ⊚ true ⊚ false
144) When constructing a pie chart, only a few, the most frequent, categories must be included in the pie. ⊚ true ⊚ false
145) When visualizing a numerical variable, it is always better to have up to 30 classes in a frequency distribution. ⊚ ⊚
true false
146) Scatterplot is a graphical tool that is useful to show the relationship between two numerical variables, but has no way to incorporate a categorical variable. ⊚ ⊚
Version 1
true false
60
Answer Key Test name: Chap 02_4e 1) [categories, observations] 2) frequency distribution 3) [exclusive, exhaustive] 4) frequency distribution 5) [symmetric, skewed] 6) histogram 7) curvilinear 8) positive 9) unemployed 10) [males, females] 11) D 12) B 13) B 14) C 15) B 16) C 17) D 18) C 19) B 20) B 21) C 22) A 23) B 24) C 25) C 26) A Version 1
61
27) B 28) A 29) C 30) A 31) A 32) B 33) D 34) B 35) C 36) C 37) A 38) A 39) C 40) B 41) D 42) D 43) C 44) D 45) B 46) A 47) C 48) C 49) D 50) B 51) C 52) B 53) C 54) B 55) C 56) D Version 1
62
57) D 58) C 59) B 60) C 61) C 62) C 63) B 64) B 65) A 66) C 67) D 68) D 69) B 70) D 71) D 72) B 73) D 74) C 75) C 76) D 77) B 78) D 79) B 80) C 81) D 82) B 83) A 84) B 85) B 86) B Version 1
63
87) C 88) D 89) D 90) B 91) C 92) B 93) A 94) D 95) C 96) B 97) B 98) A In this case, ({{[a(6)]:#,###}} + {{[a(7)]:#,###}} + {{[a(8)]:#,###}} + {{[a(9)]:#,###}}) / ({{[a(6)]:#,###}} + {{[a(7)]:#,###}} + {{[a(8)]:#,###}} + {{[a(9)]:#,###}} + {{[a(10)]:#,###}}) = {{[a(12)]:#,###}}/{{[a(11)]:#,###}} = {{[a(14)]:#,###.00}}%. 99) B In this case, 12 / (12 + 9 + 17 + 11 + 6) = 12/55 = 21.8% 100) C In this case, (12 + 9 + 17 + 11) / (12 + 9 + 17 + 11 + 6) = 49/55 = 89.1% 101) A In this case, 1026 × (0.30 + 0.11) = 420.66 102) B 103) B In this case, (7237 + 8372)/(sum of all regions) = 15609/37276 = 41.87% 104) C 105) D 106) C 107) A Version 1
64
108) D 109) a. See the table below for the frequency distribution. Forty people designated that the computer and mathematical field was their job interest. Field
Relative Frequency
Frequency
Management
0.15
60
Business and financial operations
0.20
80
Computer and mathematical
0.10
40
Life, physical, and social science
0.30
120
Community and social service
0.25
100
b. [MISSING IMAGE: A pie chart shows the survey of 400 unemployed people. Management: 15%. Business and financial operations: 20%. Computer and mathematical: 10%. Life, physical and social science: 30%. Community and social service: 25%., A pie chart shows the survey of 400 unemployed people. Management: 15%. Business and financial operations: 20%. Computer and mathematical: 10%. Life, physical and social science: 30%. Community and social service: 25%. ]
110) a. Hair Color Black Blonde Brown Red
Frequency 6 8 5 6
Relative Frequency 0.24 0.32 0.20 0.24
b. The most common hair color is Blonde. [MISSING IMAGE: A pie chart shows the record of hair color. Red: 24%. Blonde: 32%. Brown: 20%. Black: 24%., A pie chart shows the record of hair color. Red: 24%. Blonde: 32%. Brown: 20%. Black: 24%. ] c. Six customers have black hair. [MISSING IMAGE: A pie chart records the hair color of customers. Red: 6. Blonde: 8. Brown: 5. Black: 6., A pie chart records the hair color of customers. Red: 6. Blonde: 8. Brown: 5. Black: 6. ]
Version 1
65
111) Twenty-three percent of the containers traveled through Hong Kong. [MISSING IMAGE: A pie chart shows the number of containers in millions. Shanghai: 28%. Singapore: 27%. Hong Kong: 23%. Rotterdam: 10%. Los Angeles: 7%. New York: 5%., A pie chart shows the number of containers in millions. Shanghai: 28%. Singapore: 27%. Hong Kong: 23%. Rotterdam: 10%. Los Angeles: 7%. New York: 5%. ]
112) Consumer staples companies tend to have stable stocks. No, the graph does not accurately depict the volatility of JNJ stock. The vertical axis starts at 54 and should start at zero.
113) a. 0.095. b. 0.348. Age
Frequency
Relative Frequency
Cumulative Relative Frequency
16 to 19
4,794
0.034
0.034
20 to 24
13,273
0.095
0.129
25 to 34
30,789
0.219
0.348
35 to 44
30,021
0.214
0.562
45 to 54
32,798
0.234
0.796
55 and over
28,660
0.204
1.000
c. [MISSING IMAGE: A histogram plots relative frequency versus age of unemployed worker. The horizontal axis is labeled age of unemployed worker. It is marked from left to right as follows: 16 to 19, 20 to 24, 25 to 34, 35 to 44, 45 to 54, and 55 and over. The vertical axis is labeled relative frequency. It ranges from 0 to 0.25 in increments of 0.05. 16 to 19: 0.04. 20 to 24: 0.09. 25 to 34: 0.23. 35 to 44: 0.22. 45 to 54: 0.24. 55 and over: 0.19., A histogram plots relative frequency versus age of unemployed worker. The horizontal axis is labeled age of unemployed worker. It is marked from left to right as follows: 16 to 19, 20 to 24, 25 to 34, 35 to 44, 45 to 54, and 55 and over. The vertical axis is labeled relative frequency. It ranges from 0 to 0.25 in increments of 0.05. 16 to 19: 0.04. 20 to 24: 0.09. 25 to 34: 0.23. 35 to 44: 0.22. 45 to 54: 0.24. 55 and over: 0.19. ]
114) a. Batting Average
Frequency
Relative Frequency
Cumulative Relative Frequency
0.265 – 0.280
12
0.300
0.300
0.280 – 0.295
10
0.250
0.550
0.295 – 0.310
13
0.325
0.875
0.310 – 0.325
1
0.025
0.900
Version 1
66
0.325 – 0.340
3
0.075
0.975
0.340 – 0.355
1
0.025
1.000
b. One player has a batting average above 0.340; 25% of the players have a batting average of at least 0.280 but less than 0.295; 90% of the players have batting averages below 0.325. c. The distribution is not symmetric; it is positively skewed. [MISSING IMAGE: A histogram plots relative frequency versus batting average. The horizontal axis is labeled batting average. It is labeled from left to right as follows: .265 to .280, .280 to .295, .295 to .310, .310 to .325, .325 to 340, and .340 to .355. The vertical axis is labeled relative frequency. It ranges from 0 to 14 in increments of 2. .265 to .280: 12. .280 to .295: 10. .295 to .310: 13. .310 to .325: 1. .325 to .340: 3. .340 to .355: 0.8., A histogram plots relative frequency versus batting average. The horizontal axis is labeled batting average. It is labeled from left to right as follows: .265 to .280, .280 to .295, .295 to .310, .310 to .325, .325 to 340, and .340 to .355. The vertical axis is labeled relative frequency. It ranges from 0 to 14 in increments of 2. .265 to .280: 12. .280 to .295: 10. .295 to .310: 13. .310 to .325: 1. .325 to .340: 3. .340 to .355: 0.8. ] d. [MISSING IMAGE: A graph plots cumulative relative frequency versus batting average. The graph shows a curve that begins at the bottom left, goes up and to the right with increasing steepness until a point, and then continues to go up and to the right with decreasing steepness., A graph plots cumulative relative frequency versus batting average. The graph shows a curve that begins at the bottom left, goes up and to the right with increasing steepness until a point, and then continues to go up and to the right with decreasing steepness. ] e. Approximately 0.55 115) b. 23/30 ≈ 0.77 or about 77%. See cumulative relative frequency distribution in part a. c. 15/30 = 0.5 or 50%. See cumulative relative frequency distribution in part a.
116) a. 0.35 b. 5 Class
Cumulative Relative Frequency
Relative Frequency
Frequency
51 - 60
0.05
0.05
1
61 - 70
0.20
0.15
3
71 - 80
0.45
0.25
5
81 - 90
0.80
0.35
7
91 - 100
1.00
0.20
4
Version 1
67
c. Approximately 60% of students scored less than 85. [MISSING IMAGE: A graph plots relative frequency versus scores. The horizontal axis is labeled scores. It ranges from 0 to 120 in increments of 20. The vertical axis is labeled relative frequency. It ranges from 0 to 1 in increments of 0.1. The graph shows a curve that begins at (50, 0), passes through (60, 0.05), (70, 0.2), (80, 0.5), (90, 0.8), and ends at (100, 1)., A graph plots relative frequency versus scores. The horizontal axis is labeled scores. It ranges from 0 to 120 in increments of 20. The vertical axis is labeled relative frequency. It ranges from 0 to 1 in increments of 0.1. The graph shows a curve that begins at (50, 0), passes through (60, 0.05), (70, 0.2), (80, 0.5), (90, 0.8), and ends at (100, 1). ]
117)
[MISSING IMAGE: A graph plots relative frequency versus dividend yield. The horizontal axis is labeled dividend yield. It ranges from 0 to 0.1 in increments of 0.02. The vertical axis is labeled relative frequency. It ranges from 0 to 1 in increments of 0.1. The graph shows a curve that begins at (0, 0), passes through (0.02, 0.55), (0.04, 0.85), (0.06, 0.9), (0.08, 0.95), and ends at (0.1, 1)., A graph plots relative frequency versus dividend yield. The horizontal axis is labeled dividend yield. It ranges from 0 to 0.1 in increments of 0.02. The vertical axis is labeled relative frequency. It ranges from 0 to 1 in increments of 0.1. The graph shows a curve that begins at (0, 0), passes through (0.02, 0.55), (0.04, 0.85), (0.06, 0.9), (0.08, 0.95), and ends at (0.1, 1). ]
Approximately 30% of the stocks had a dividend yield of 3% or greater. 118) No, the distribution is positively skewed. 119) No, the distribution is negatively skewed. 120) No, most of the time the average wind speed was below 45 mph; only 4 out of the 15 storms had average wind speeds exceeding 45 mph. Stem
Leaf
2
2 4 7 9
3
2 2 3 8 9
4
1 4 7
5
1 5 8
121) A potential buyer of a Toyota truck is likely to pay in the low to mid $20s (in thousands). Stem
Leaf
1
7 8 9
2
1 2 2 3 3 4 5 6
3
1 2 3 5
122) Stem
Leaf
4
8
5
3
Version 1
68
6
0 1 2 3
7
0 0 2 7 8 9
8
0 2 7 8
9
0
The distribution is not symmetric; it is slightly negatively skewed due to patient who is 48 years old. 40s
48
50s
53
60s
60, 61, 62, 63
70s
70, 70, 72, 77, 78, 79
80s
80, 82, 87, 88
90s
90
123) Teams with higher points per game tend to have a higher winning percentage. [MISSING IMAGE: A scatter plot shows winning percentage versus average points per game. The horizontal axis is labeled average points per game. It ranges from 0 to 30 in increments of 5. The vertical axis is labeled winning percentage. It ranges from 0% to 100% in increments of 10%. The relationship between winning percentage and average points per game is positive and linear., A scatter plot shows winning percentage versus average points per game. The horizontal axis is labeled average points per game. It ranges from 0 to 30 in increments of 5. The vertical axis is labeled winning percentage. It ranges from 0% to 100% in increments of 10%. The relationship between winning percentage and average points per game is positive and linear. ] 124) [MISSING IMAGE: A scatter plot shows grade out of 100 versus attendance in percentage of classes attended. The horizontal axis is labeled attendance in percentage of classes attended. It ranges from 0 to 100 in increments of 20. The vertical axis is labeled grade out of 100. It ranges from 0 to 100 in increments of 20. The relationship between grade out of 100 and attendance is positive and linear., A scatter plot shows grade out of 100 versus attendance in percentage of classes attended. The horizontal axis is labeled attendance in percentage of classes attended. It ranges from 0 to 100 in increments of 20. The vertical axis is labeled grade out of 100. It ranges from 0 to 100 in increments of 20. The relationship between grade out of 100 and attendance is positive and linear. ]
Yes, there appears to be a positive relationship. 125) The wild-caught salmon has an upward trend in demand. 126)
Version 1
69
Grade A B C D E Grand Total
Online-No 11 8 4 4 1 28
Online-Yes 7 4 5 3 3 22
Grand Total 18 12 9 7 4 50
There are 22 online students; 11 online students earn an A/B grade; 19 traditional students earn an A/B grade; 3 online student earned an E grade; 1 traditional students earned an E grade. [MISSING IMAGE: A stacked bar graph shows the count of students attending online classes. Grade A. Online, yes: 11. Online, no: 7. Grade B. Online, yes: 7. Online, no: 5. Grade C. Online, yes: 4. Online, no: 5. Grade D. Online, yes: 4. Online, no: 3. Grade E. Online, yes: 1. Online, no: 3., A stacked bar graph shows the count of students attending online classes. Grade A. Online, yes: 11. Online, no: 7. Grade B. Online, yes: 7. Online, no: 5. Grade C. Online, yes: 4. Online, no: 5. Grade D. Online, yes: 4. Online, no: 3. Grade E. Online, yes: 1. Online, no: 3. ]
[MISSING IMAGE: A stacked bar graph shows the count of students attending online classes. Grade A. Online, yes: 2. Online, no: 3. Grade B. Online, yes: 2. Online, no: 1. Grade C. Online, yes: 2. Online, no: 1. Grade D. Online, yes: 1., A stacked bar graph shows the count of students attending online classes. Grade A. Online, yes: 2. Online, no: 3. Grade B. Online, yes: 2. Online, no: 1. Grade C. Online, yes: 2. Online, no: 1. Grade D. Online, yes: 1. ] There appears to be a performance gap for online students. There are fewer A/Bs and more Es for online students as opposed to traditional students.
127) FALSE 128) TRUE 129) FALSE 130) FALSE 131) TRUE 132) TRUE 133) TRUE 134) FALSE 135) FALSE 136) TRUE Version 1
70
137) TRUE 138) FALSE 139) TRUE 140) FALSE 141) TRUE 142) TRUE 143) FALSE 144) FALSE 145) FALSE 146) FALSE
Version 1
71
CHAPTER 03 – 4e 1)
Which are characteristic(s) of the coefficient of variation? A) It adjusts for differences in the magnitude of means. B) It has the same units of measurement as the observations. C) It allows for direct comparisons across different data sets. D) It is a measure of central location.
2)
The median is defined as the __________. A) middle point in a data set. B) geometric average of a data set. C) arithmetic average of a data set. D) most common value of a data set.
3)
The mode is defined as the __________. A) middle point in a data set. B) geometric average of a data set. C) arithmetic average of a data set. D) most frequent value in a data set.
4)
Which of the following is the most influenced by outliers? A) Mode B) Median C) 75th percentile D) Arithmetic mean
Version 1
1
5)
How do we find the median if the number of observations in a data set is odd? A) By averaging the first and the third quartile B) By taking the middle value in the sorted data set C) By averaging the minimum and maximum values D) By taking the middle value in the sorted data set after eliminating outliers
6)
Is it possible for a data set to have no mode? A) Yes, if two observations occur twice. B) No, unless there is an odd number of observations. C) No, if the data set is nonempty, there is always a mode. D) Yes, if there are no observations that occur more than once.
7)
Is it possible for a data set to have more than one mode?
A) No, there must always be a single mode, or else there is no mode. B) Yes, if two or more values in a data set occur the same number of times. C) Yes, if there are at least two different values in a data set, there is always more than one mode. D) Yes, if two or more values in a data set occur with the most frequency and the frequency is greater than one.
8) The Boom company has recently decided to raise the salaries of all employees by 10%. Which of the following is (are) expected to be affected by this raise? A) Mean and mode only B) Mean and median only C) Mode and median only D) Mean, median, and mode
Version 1
2
9) The owner of a company has recently decided to raise the salary of one employee, who was already making the highest salary, by 20%. Which of the following is(are) expected to be affected by this raise? A) Mean only B) Median only C) Mean and median only D) Mean, median, and mode
10) The manager of a nationwide department store wants to compare average store sales by region to identify which region has the lowest store sales. Which of the following numerical descriptive measures is best for this purpose? A) The arithmetic mean B) The subsetted mean C) The geometric mean D) The weighted mean
11) The table below gives the deviations of a portfolio’s annual total returns from its benchmark’s annual returns, for a six-year period ending in Year 6. Portfolio’s deviations from Benchmark Return, Year 1 – Year 6 Year 1 −7.62% Year 2 2.37% Year 3 −9.11% Year 4 0.55% Year 5 5.48% Year 6 −1.67%
{MISSING IMAGE}Click here for the Excel Data File The arithmetic mean return and median return are the closest to __________.
Version 1
3
A) mean = −2.00% and median = −4.28%. B) mean = −2.00% and median = −1.67%. C) mean = −1.67% and median = −0.56%. D) mean = −1.67% and median = 0.56%.
12)
Which of the following statements is most accurate when defining percentiles? A) The pth percentile divides a data set into equal parts. B) Approximately p% of the observations are greater than the pth percentile. C) Approximately (100 − p)% of the observations are less than the pth percentile. D) Approximately (100 − p)% of the observations are greater than the pth percentile.
13)
Which five values are graphed on a box plot? A) Min, Quintile 1, Mean, Quintile 3, Max B) Min, Quartile 1, Mean, Quartile 3, Max C) Min, Quintile 1, Median, Quintile 3, Max D) Min, Quartile 1, Median, Quartile 3, Max
14)
What is the interquartile range? A) Q3 – Q1 B) Max – Min C) Mean – Median D) All values between Q1 and Q3
Version 1
4
15) Consider the following data: 1, 2, 4, 5, 10, 12, 18. The 30th percentile is the closest to __________. {MISSING IMAGE}Click here for the Excel Data File A) 2.0 B) 2.4 C) 2.8 D) 5.0
16) Calculate the interquartile range from the following data: 1, 2, 4, 5, 10, 12, 18. {MISSING IMAGE}Click here for the Excel Data File A) 5 B) 6 C) 10 D) 17
17) As of September 30, the earnings per share (EPS) of five firms in the beverages industry are as follows: 1.13
2.41
1.52
1.40
0.41
{MISSING IMAGE}Click here for the Excel Data File The 25th percentile and the 75th percentile of the EPS are the closest to __________. A) 0.77 and 1.97 B) 0.91 and 1.77 C) 1.77 and 0.91 D) 1.97 and 0.77
18)
In what way(s) is(are) the concept of geometric mean useful?
Version 1
5
A) In evaluating investment returns B) In calculating average growth rates C) In assessing the dispersion of the data D) Both in evaluating investment returns and in calculating average growth rates
19)
What is(are) characteristic(s) of the geometric mean?
A) It is always greater than the arithmetic mean. B) It is the mathematical equivalent to the median. C) It is always less than or equal to the arithmetic mean. D) Both it is the mathematical equivalent to the median and it is always less than or equal to the arithmetic mean.
20) Sales for Adidas grew at a rate of 0.5196 in Year 1, 0.0213 in Year 2, 0.0485 in Year 3, and −0.0387 in Year 4. The average growth rate for Adidas during these four years is the closest to __________. A) 3.49% B) 11.83% C) 13.77% D) 14.02%
21) Total revenue for Apple Computers (in millions) was $42,905 in Year 1, $65,225 in Year 2, and $108,249 in Year 3. The average growth rate of revenue during these three years is the closest to __________. A) 36.13% B) 39.33% C) 58.84% D) 58.99%
Version 1
6
22)
The following data represent monthly returns (in percent):
−7.24
1.64
3.48
−2.49
9.30
{MISSING IMAGE}Click here for the Excel Data File The geometric mean return is the closest to __________. A) −0.43% B) 0.78% C) 0.94% D) 4.79%
23) A portfolio manager generates a 5% return in Year 1, a 12% return in Year 2, a negative 6% return in Year 3, and a return of 2% (nonannualized) in the first quarter in Year 4. The annualized return for the entire period is the closest to __________. A) 3.05% B) 3.25% C) 3.50% D) 3.77%
24)
The range is defined as __________. A) Q3 − Q1 B) Max− Q1 C) Max− Min D) Max− Median
25)
What is(are) the most widely used measure(s) of dispersion?
Version 1
7
A) Range B) Interquartile range C) Variance and standard deviation D) Covariance and the correlation coefficient
26)
What is the relationship between the variance and the standard deviation? A) The standard deviation is the absolute value of the variance. B) The variance is the absolute value of the standard deviation. C) The variance is the positive square root of the standard deviation. D) The standard deviation is the positive square root of the variance.
27)
Which of the following statements about variance is the most accurate? A) Variance is the square root of the standard deviation. B) Variance can be both, positive or negative. C) Variance is denominated in the same units as the original data. D) Variance is the average of the squared deviations from the mean.
28) Which of the following statements about the mean absolute deviation (MAD) is the most accurate? A) It is the square root of the standard deviation. B) It can be a positive number or a negative number. C) It is measured in the same units as the original data. D) It is the arithmetic mean of the squared deviations from the mean.
29) An analyst gathered the following information about the net profit margins of companies in two industries: Version 1
8
Net Profit Margin Mean Standard deviation Range
Industry A 15.0% 2.0% 10.0%
Industry B 5.0% 0.8% 15%
Compared with the other industry, the relative dispersion of net profit margins is smaller for Industry __________. A) B, because it has a smaller mean deviation. B) B, because it has a smaller range of variation. C) A, because it has a smaller standard deviation. D) A, because it has a smaller coefficient of variation.
30)
Consider a population with data values of 12
8
28
22
12
30
14
30
14
{MISSING IMAGE}Click here for the Excel Data File The population mean is __________. A) 12 B) 14 C) 18 D) 22
31)
Consider a population with data values of 12
8
28
22
12
{MISSING IMAGE}Click here for the Excel Data File The median is __________. A) 12 B) 14 C) 18 D) 22
32)
Consider a population with data values of
Version 1
9
12
8
28
22
12
30
14
30
14
{MISSING IMAGE}Click here for the Excel Data File The mode is __________. A) 12 B) 14 C) 18 D) 22
33)
Consider a population with data values of 12
8
28
22
12
{MISSING IMAGE}Click here for the Excel Data File The population variance is the closest to __________. A) 8.00 B) 8.64 C) 64.00 D) 74.67
34)
Consider a population with data values of 12
8
28
22
12
30
14
{MISSING IMAGE}Click here for the Excel Data File The population standard deviation is the closest to __________. A) 8.00 B) 8.64 C) 64.00 D) 74.67
35) The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday’s Business Statistics exam. 3
Version 1
12
2
3
5
10
{MISSING IMAGE}Click here for the Excel Data File The mean and the median of the numbers of hours spent by the five students are __________. A) 2 hours and 3 hours, respectively B) 3 hours and 5 hours, respectively C) 5 hours and 2 hours, respectively D) 5 hours and 3 hours, respectively
36) The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday’s Business Statistics exam. 3
12
2
3
5
{MISSING IMAGE}Click here for the Excel Data File The sample standard deviation of the number of hours spent by the five students is the closest to __________. A) 3.6 hours B) 4.1 hours C) 13.2 hours D) 16.5 hours
37) The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday’s Business Statistics exam. 3
12
2
3
5
{MISSING IMAGE}Click here for the Excel Data File The 75th percentile of the data is the closest to __________. A) 3 hours B) 4.5 hours C) 8.5 hours D) 10 hours
Version 1
11
38) The sample data below shows the number of hours spent by five students over the weekend to prepare for Monday’s Business Statistics exam. 3
12
2
3
5
{MISSING IMAGE}Click here for the Excel Data File The interquartile range of the data is the closest to __________. A) 4 hours B) 6 hours C) 10 hours D) 12 hours
39)
A bowler’s scores for a sample of seven games were
172
168
188
190
172
182
174
{MISSING IMAGE}Click here for the Excel Data File The bowler’s average score is __________. A) 172 B) 174 C) 178 D) 190
40)
A bowler’s scores for a sample of six games were
172
168
188
190
172
182
174
{MISSING IMAGE}Click here for the Excel Data File The bowler’s median score is __________. A) 172 B) 174 C) 178 D) 190
41)
A bowler’s scores for a sample of six games were
Version 1
12
172
168
188
190
172
182
174
The bowler’s modal score is __________. A) 172 B) 174 C) 178 D) 190
42)
A bowler’s scores for a sample of six games were
172
168
188
190
172
182
174
{MISSING IMAGE}Click here for the Excel Data File The sample standard deviation is the closest to __________. A) 8.00 B) 8.64 C) 64.00 D) 74.67
43) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. With the beginning of the next academic year, all professors will get a 2% raise. What will be the average and the standard deviation of their new salaries? A) $80,000 and $6,120. B) $81,600 and $6,000. C) $81,600 and $6,120. D) $82,000 and $6,200.
44) The annual returns (in percent) for a sample of stocks in the technology industry over the past year are as follows: 4.2
−9.4
2.8
−16.0
−6.6
{MISSING IMAGE}Click here for the Excel Data File The average return is the closest to __________.
Version 1
13
A) −6.6 B) −5 C) 0 D) 2.8
45) The annual returns (in percent) for a sample of stocks in the technology industry over the past year are as follows: 4.2
−9.4
2.8
−16.0
−6.6
{MISSING IMAGE}Click here for the Excel Data File The median return is the closest to __________. A) −6.6 B) −5 C) 0 D) 2.8
46) The annual returns (in percent) for a sample of stocks in the technology industry over the past year are as follows: 4.2
−9.4
2.8
−16.0
−6.6
{MISSING IMAGE}Click here for the Excel Data File The sample standard deviation is the closest to __________. A) 7.59 B) 8.49 C) 57.61 D) 72.01
47)
The coefficient of variation is best described as __________.
Version 1
14
A) a relative measure of dispersion B) an absolute measure of dispersion C) a relative measure of central location D) an absolute measure of central location
48) The mean return on equity (ROE) for a group of firms in an industry is 15% with a variance of 9%. The coefficient of variation of the industry's ROE is __________. A) 0.2 B) 0.6 C) 1.7 D) 5.0
49) The advantage of using mean absolute deviation rather than variance as a measure of dispersion is that mean absolute deviation __________. A) is less sensitive to extreme deviations B) requires fewer observations to be a valid measure C) is a relative measure rather than an absolute measure of risk D) considers only unfavorable (negative) deviations from the mean
50) A college professor collected data on the number of hours spent by his 100 students over the weekend to prepare for Monday’s Business Statistics exam. He processed the data by Excel and the following incomplete output is available. Mean
10
Sample Variance
7.23
Skewness
1.08
The median is most likely to be __________.
Version 1
15
A) about 10 hours B) Cannot tell from the information provided C) greater than 10 hours D) less than 10 hours
51) A college professor collected data on the number of hours spent by his 100 students over the weekend to prepare for Monday’s Business Statistics exam. He processed the data by Excel and the following incomplete output is available. Mean
7
Sample Variance
7.84
Skewness
1.17
The median is most likely to be __________. A) about 7 hours B) less than 7 hours C) greater than 7 hours D) Cannot tell from the information provided
52) A college professor collected data on the number of hours spent by his 100 students over the weekend to prepare for Monday’s Business Statistics exam. He processed the data by Excel and the following incomplete output is available. Mean
7
Sample Variance
7.84
Skewness
1.17
The coefficient of variation in the data is __________. A) 40% B) 90% C) 111% D) 243%
Version 1
16
53) The price to earnings ratio, also called the P/E ratio of a stock, is a measure of the price of a share relative to the annual net income per share earned by the firm. Suppose the P/Es for a firm’s common stock during the past four quarters are 10, 12, 15, and 11, respectively. The standard deviation of the P/E ratio over the four quarters is __________. A) 1.87 B) 2.16 C) 3.50 D) 4.67
54) As of September 30, the earnings per share (EPS) of five firms in the biotechnology industry are 1.51
2.34
2.13
1.70
0.18
{MISSING IMAGE}Click here for the Excel Data File The sample mean and the sample standard deviation are the closest to __________. A) 1.73 and 0.74 B) 1.73 and 0.85 C) 1.57 and 0.85 D) 1.57 and 0.74
55) As of September 30, the earnings per share (EPS) of five firms in the biotechnology industry are 1.53
2.29
2.07
1.69
0.07
{MISSING IMAGE}Click here for the Excel Data File The sample mean and the sample standard deviation are the closest to __________. A) 1.53 and 0.76 B) 1.53 and 0.87 C) 1.69 and 0.76 D) 1.69 and 0.87
Version 1
17
56)
A portfolio’s annual total returns (in percent) for a five-year period are:
−6.92
1.58
2.37
−2.53
9.54
{MISSING IMAGE}Click here for the Excel Data File The median and the standard deviation for this sample are the closest to __________. A) 2.46 and 6.13 B) 1.58 and 5.48 C) 1.58 and 6.13 D) 0.71 and 5.48
57)
A portfolio’s annual total returns (in percent) for a five-year period are:
−7.14
1.62
2.50
−2.50
9.27
{MISSING IMAGE}Click here for the Excel Data File The median and the standard deviation for this sample are the closest to __________. A) 0.75 and 5.46 B) 1.62 and 5.46 C) 1.62 and 6.11 D) 2.50 and 6.11
58)
The Sharpe ratio measures __________. A) the extra reward per unit of risk B) the extra risk per unit of reward C) the increase in mean per unit of risk D) the extra variance per unit of reward
59) The table below gives statistics relating to a hypothetical 10-year record of two portfolios. Assume other statistics relating to these portfolios are the same and the risk-free rate is 3.5%.
Version 1
18
Arithmetic Mean Return Portfolio A Portfolio B
15.3% 10.2%
Standard Deviation of Return 22.8% 15.9%
Using the coefficient of variation and the Sharpe ratio, the fund that is preferred in terms of relative risk and return per unit of risk is __________. A) Portfolio A because it has a higher coefficient of variation and a lower Sharpe ratio B) Portfolio A because it has a lower coefficient of variation and a higher Sharpe ratio C) Portfolio B because it has a higher coefficient of variation and a lower Sharpe ratio D) Portfolio B because it has a lower coefficient of variation and a higher Sharper ratio
60) The following table summarizes selected statistics for two portfolios for a 10-year period. Assume that the risk-free rate is 4% over this period. Arithmetic Mean Fund A Fund B
11.64% 12.58%
Standard Deviation of Return 12.65% 18.44%
As measured by the Sharpe ratio, the fund with the superior risk-adjusted performance during this period is __________. A) Fund A because it has a lower positive Sharpe ratio than Fund B B) Fund B because it has a lower positive Sharpe ratio than Fund A C) Fund A because it has a higher positive Sharpe ratio than Fund B D) Fund B because it has a higher positive Sharpe ratio than Fund A
61) For k> 1, Chebyshev’s theorem is useful in estimating the proportion of observations that fall within __________. A) k standard deviations from the mean B) k2 standard deviations from the mean C) (1 − 1/k) standard deviations from the mean D) (1− 1/k2) standard deviations from the mean
Version 1
19
62)
Chebyshev’s theorem is applicable when the data are __________. A) any shape B) skewed to the left C) skewed to the right D) approximately symmetric and bell-shaped
63)
Which of the following capabilities does Analysis of Relative Location provide?
A) They make statements regarding the shape of the data. B) They make statements regarding the central location of the data. C) They make statements regarding the dispersion of the data around the median. D) They make statements regarding the percentage of data values that fall within some number of standard deviations from the mean.
64)
What is the difference between Chebyshev’s Theorem and the Empirical Rule?
A) The empirical rule applies to all data sets, whether their distributions are symmetric and bell-shaped or not. B) Chebyshev’s theorem applies to all data sets except those that have approximately a symmetric and bell-shaped distribution. C) The empirical rule applies to all data sets, whereas Chebyshev’s theorem is appropriate when the data have approximately a symmetric and bell-shaped distribution. D) Chebyshev’s theorem applies to all data sets, whereas the empirical rule is only appropriate when the data have approximately a symmetric and bell-shaped distribution.
65)
In its standard form, Chebyshev’s theorem provides a lower bound on __________.
Version 1
20
A) the number of observations lying within a certain interval B) the number of observations lying outside a certain interval C) the proportion (or percentage) of observations lying within a certain interval D) the proportion (or percentage) of observations lying outside a certain interval
66)
The empirical rule can be used to estimate some proportions __________. A) when data has any shape. B) when it is skewed to the left. C) when it is skewed to the right. D) when it is approximately symmetric and bell-shaped.
67) When applicable, the empirical rule provides the approximate percentage of observations that fall within A) 1, 2, or 3 standard deviations. B) 2, 3, or 4 standard deviations. C) k standard deviations for every k > 1. D) 1 − 1/k2 standard deviations for every k > 1.
68)
Which of the following is true when using the empirical rule for a set of sample data? A) <p>Almost all observations are in the interval B) <p>Approximately 68% of all observations are in the interval C) <p>Approximately 95% of all observations are in the interval D) <p>Approximately 68% of all observations are in the interval
69)
When using the empirical rule, which of the following assumptions is made?
Version 1
21
A) The data only comes from a sample. B) The data only comes from a population. C) The data are exactly symmetric and bell-shaped. D) The data are approximately symmetric and bell-shaped.
70) In a marketing class of 60 students, the mean and the standard deviation of scores was 70 and 5, respectively. Use Chebyshev's theorem to determine the number of students who scored less than 60 or more than 80. A) At least 45 B) At most 15 C) At most 45 D) At least 15
71) In an accounting class of 200 students, the mean and standard deviation of scores was 70 and 5, respectively. Use the empirical rule to determine the number of students who scored less than 65 or more than 75. A) It is about 32. B) It is about 64. C) It is about 68. D) It is about 136.
72) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. Using Chebyshev’s inequality, the percentage of salaries falling between $68,000 and $92,000 is at least __________.
Version 1
22
A) 68% B) 65% C) 75% D) 95%
73) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. The salary distribution cannot be regarded as bell-shaped. What can be said about the percentage of salaries that are less than $68,000 or more than or more than $92,000? A) It is at least 75%. B) It is at most 75%. C) It is at least 25%. D) It is at most 25%.
74) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. The salary distribution is approximately bell-shaped. What can be said about the percentage of salaries that are less than $68,000 or more than $92,000? A) It is about 5%. B) It is about 32%. C) It is about 68%. D) It is about 95%.
75) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. The salary distribution is approximately bell-shaped. What can be said about the percentage of salaries that are at least $74,000? A) It is about 68%. B) It is about 84%. C) It is about 95%. D) It is about 97.5%.
Version 1
23
76) Professors at a local university earn an average salary of $80,000 with a standard deviation of $6,000. The salary distribution is approximately bell-shaped. Because of budget limitations, it has been decided that only those whose salaries are approximately in the bottom 2.5% would get a raise. What is the maximum current salary that qualifies for the raise? A) It is about $58,000. B) It is about $62,000. C) It is about $68,000. D) It is about $74,000.
77) Amounts spent by a sample of 200 customers at a retail store are summarized in the following relative frequency distribution. Amount Spent (in $)
Frequency
0 up to 10
15
10 up to 20
75
20 up to 30
55
30 up to 40
55
{MISSING IMAGE}Click here for the Excel Data File The mean amount spent by customers is the closest to __________. A) $17.50 B) $20.00 C) $22.50 D) $50.00
78) Amounts spent by a sample of 200 customers at a retail store are summarized in the following relative frequency distribution. Amount Spent (in $)
Frequency
0 up to 10
15
10 up to 20
75
20 up to 30
55
30 up to 40
55
Version 1
24
{MISSING IMAGE}Click here for the Excel Data File The median amount will fall in the following class interval __________. A) 0 up to 10 B) 10 up to 20 C) 20 up to 30 D) 30 up to 40
79) Amounts spent by a sample of 50 customers at a retail store are summarized in the following relative frequency distribution. Amount Spent (in $)
Relative Frequency
0 up to 10
0.20
10 up to 20
0.40
20 up to 30
0.30
30 up to 40
0.10
{MISSING IMAGE}Click here for the Excel Data File The mean amount spent by customers is the closest to __________. A) $0.36 B) $18.00 C) $20.00 D) $25.00
80) Amounts spent by a sample of 50 customers at a retail store are summarized in the following relative frequency distribution. Amount Spent (in $)
Relative Frequency
0 up to 10
0.20
10 up to 20
0.40
20 up to 30
0.30
30 up to 40
0.10
{MISSING IMAGE}Click here for the Excel Data File The median amount will fall in the following class interval __________.
Version 1
25
A) 0 up to 10 B) 10 up to 20 C) 20 up to 30 D) 30 up to 40
81) The following frequency distribution represents the number of hours studied per week by a sample of 60 students. Amount Spent (in $)
Midpoint
Frequency
2.5 less than 5.5
4
5
5.5 less than 8.5
7
30
8.5 less than 11.5
10
25
{MISSING IMAGE}Click here for the Excel Data File The mean number of hours studied is __________. A) 7 B) 8 C) 22.8 D) 480
82) You buy 50 stocks of Company A, 30 of Company B, and 20 of Company C. The annual returns of these companies are 8%, 12%, and 10% respectively. The average return for one year is the closest to __________. A) 9.1% B) 9.6% C) 10.0% D) 10.5%
83) The mean grade of the 30 students in Section 1 is 80. The mean grade of the 40 students in Section 2 is 85. The mean grade of the 30 students in Section 3 is 80. What is the mean grade of all students from the three sections combined?
Version 1
26
A) 80.00 B) 81.67 C) 82.00 D) 85.00
84) An investor bought common stock of Blackstone Company on several occasions at the following prices. The following frequency distribution represents the number of shares and its price per share. Number of Shares 100
Price per Share $ 31
200
$ 29
400
$ 38
{MISSING IMAGE}Click here for the Excel Data File The average price per share at which the investor bought these shares of common stock was the closest to __________. A) $35.67 B) $34.43 C) $33.00 D) $36.00
85) An investor bought common stock of Blackstone Company on several occasions at the following prices. The following frequency distribution represents the number of shares and its price per share. Number of Shares 100 200 400
Version 1
Price per Share $ 34 $ 30 $ 28
27
{MISSING IMAGE}Click here for the Excel Data File The average price per share at which the investor bought these shares of common stock was the closest to __________. A) $28.00 B) $29.43 C) $30.67 D) $31.00
86) Automobiles traveling on a road with a posted speed limit of 65 miles per hour are checked for speed by a state police radar system. The following is a frequency distribution of speeds. Speed (miles per hour)
Frequency
45 up to 55
50
55 up to 65
325
65 up to 75
275
75 up to 85
25
{MISSING IMAGE}Click here for the Excel Data File The mean speed of the automobiles traveling on this road is the closest to __________. A) 55.0 B) 57.8 C) 64.1 D) 65.0
87)
What does the covariance measure? A) The direction of a linear relationship between two variables. B) The direction of a curvilinear relationship between two variables. C) The direction of a linear relationship between two or more variables. D) The direction of a curvilinear relationship between two or more variables.
Version 1
28
88) When interpreting the covariance between variables x and y, which of the following statements is the most accurate? A) A negative value of covariance that, on average, if x is below its mean, then y tends to be below its mean. B) A positive value of covariance indicates that, on average, if x is above its mean, then y tends to be above its mean. C) A positive value of covariance indicates that, on average, if x is above its mean, then y tends to be below its mean. D) A negative value of covariance indicates that, on average, if x is above its mean, then y tends to be above its mean.
89)
What is an advantage of the correlation coefficient over the covariance?
A) It falls between 0 and 1. B) It falls between −1 and 1. C) It is a unit-free measure, therefore making it easier to interpret. D) Both answers—that it falls between −1 and 1 and that it is a unit-free measure—are correct.
90) Which of the following relationships cannot be concluded from examining the correlation coefficient? A) No relationship B) A positive relationship C) A curvilinear relationship D) A negative relationship
Version 1
29
91) The covariance between the returns of A and B is −0.175. The standard deviation of the rates of return is 0.33 for stock A and 0.74 for stock B. The correlation of the rates of return between A and B is the closest to ______. A) −0.72 B) −2.07 C) 0.72 D) 2.07
92) The covariance between the returns of A and B is −0.112. The standard deviation of the rates of return is 0.26 for stock A and 0.81 for stock B. The correlation of the rates of return between A and B is the closest to __________. A) −1.88 B) −0.53 C) 0.53 D) 1.88
93) The covariance between the returns on two assets is negative. This occurs when __________. A) the variance of one asset has a negative linear relationship with the variance of the other asset B) the standard deviation of one asset has a positive linear relationship with the standard deviation of the other asset C) on average, the return on one asset is below its expected value and the return on the other asset is above its expected value D) on average, the return on one asset is below its expected value and the return on the other asset is below its expected value
Version 1
30
94) The covariance between the returns of stock A and stock B is −125. The standard deviation of the rates of return is 20 for stock A and 10 for stock B. The correlation coefficient of the rates of return between A and B is closest to __________. A) −0.625 B) −0.375 C) 0.375 D) 0.625
95)
The daily high temperature in Philadelphia over an eight-day period is shown below. 69
77
70
81
84
86
69
69
{MISSING IMAGE}Click here for the Excel Data File The median for this data set is __________. A) 69 B) 70 C) 73.5 D) 77
96)
The following data represents the actual talk time, in hours, for the iPhone for 11 users.
22.0 10.4 11.1 26.9 13.3 30.0 10.5 26.8 16.4 15.1 19.0
{MISSING IMAGE}Click here for the Excel Data File The 60thpercentile of this data set is ________. A) 16.4 B) 19.6 C) 15.1 D) 18.3
97)
The following data represents the actual talk time, in hours, for the iPhone for 11 users.
25.1
Version 1
19.1
21.6
9.5
20.3
18.0
24.6
25.2
21.9
29.7
28.5 31
{MISSING IMAGE}Click here for the Excel Data File The 60th percentile of this data set is __________. A) 21.6 B) 21.9 C) 23.3 D) 24.7
98) The following data represent the wait time, in minutes, for customers calling Dell technical support. 14
21
37
24
19
12
16
69
13
{MISSING IMAGE}Click here for the Excel Data File The interquartile range is __________.
A) 13.5 minutes B) 17 minutes C) 19 minutes D) 30.5 minutes
99) The following data represent the wait time, in minutes, for customers calling Dell technical support. 14
21
37
24
19
12
16
69
13
{MISSING IMAGE}Click here for the Excel Data File The upper limit for determining outliers for a box-and-whisker plot is __________. A) 17 minutes B) 30.5 minutes C) 56 minutes D) 62.5 minutes
100) The __________ identifies the number of standard deviations a particular value is from the mean of its distribution.
Version 1
32
A) coefficient of variation B) z-score C) median D) empirical rule
101)
Which of the following is not true concerning the attributes of z-scores?
A) z-scores are positive for data values above the mean of the distribution. B) z-scores are negative for data values below the mean of the distribution. C) z-scores can be positive or negative for data values above the mean of the distribution. D) z-scores are equal to zero for data values equal to the mean of the distribution.
102) The average class size this semester in the business school of a particular university is 38.1 students with a standard deviation of 13.4 students. The z-score for a class with 20 students is _____. A) 1.49 B) 0 C) 1.35 D) 0.78
103) The average class size this semester in the business school of a particular university is 38.1 students with a standard deviation of 12.9 students. The z-score for a class with 21 students is __________. A) −1.33 B) 0 C) 0.8 D) 1.51
Version 1
33
104) Suppose the wait to pass through immigration at JFK Airport in New York is thought to be bell-shaped and symmetrical with a mean of 23 minutes. It is known that 68% of travelers will spend between 17 and 29 minutes waiting to pass through immigration. The standard deviation for the wait time through immigration is _________. A) 10 minutes B) 8 minutes C) 6 minutes D) 9 minutes
105) Suppose the wait to pass through immigration at JFK Airport in New York is thought to be bell-shaped and symmetrical with a mean of 22 minutes. It is known that 68% of travelers will spend between 16 and 28 minutes waiting to pass through immigration. The standard deviation for the wait time through immigration is __________. A) 6 minutes B) 8 minutes C) 9 minutes D) 10 minutes
106) Suppose the dealer incentive per vehicle for Honda’s Acura brand is thought to be bellshaped and symmetrical with a mean of $3,280 and a standard deviation of $250. Based on this information, what interval of dealer incentives would we expect approximately 99.7% of vehicles to fall within? A) $2,280 to $4,280 B) $2,780 to $3,780 C) $3,030 to $3,530 D) $2,530 to $4,030
Version 1
34
107) Suppose the dealer incentive per vehicle for Honda’s Acura brand is thought to be bellshaped and symmetrical with a mean of $2,500 and a standard deviation of $300. Based on this information, what interval of dealer incentives would we expect approximately 99.7% of vehicles to fall within? A) $2,200 to $2,800 B) $1,900 to $3,100 C) $1,600 to $3,400 D) $1,300 to $3,700
108) Suppose the average price for new cars has a mean of $30,100, a standard deviation of $5,600 and is normally distributed. Based on this information, what interval of prices would we expect at least 95% of new car prices to fall within? A) $24,500 to $35,700 B) $18,900 to $41,300 C) $13,300 to $46,900 D) $7.700 to $52,500
109) There are five rows of students seated in a marketing class. The following table shows the number of students in each row and the average score of the most recent test for that row. Row
Number of Students
Row Average Score
1
6
82.3
2
7
91.4
3
4
85.0
4
5
78.3
5
6
89.1
{MISSING IMAGE}Click here for the Excel Data File What is the average test score for this class?
Version 1
35
A) 84.7 B) 85.7 C) 83.0 D) 86.8
110) The director of graduate admissions is analyzing the relationship between scores in the Graduate Record Examination (GRE) and student performance in graduate school, as measured by a student’s GPA. The table below shows a sample of 10 students. GRE
1,500
1,400
1,000
1,050
1,100
1,250
800
850
950
1,350
GPA
3.4
3.5
3.0
2.9
3.0
3.3
2.7
2.8
3.2
3.3
{MISSING IMAGE}Click here for the Excel Data File The covariance is __________. A) 53.5 B) 51.75 C) 57.5 D) 58.75
111) The director of graduate admissions is analyzing the relationship between scores in the GRE and student performance in graduate school, as measured by a student’s GPA. The table below shows a sample of 10 students. GRE
1,840
1,160
130
1,050
1,040
1,350
800
650
910
930
GPA
2.2
4.1
3.9
2.9
2.4
2.1
3.6
2.1
2.8
3.5
{MISSING IMAGE}Click here for the Excel Data File Which of the following statements is correct? A) The correlation between GRE and GPA is positive and strong. B) The correlation between GRE and GPA is positive and weak. C) The correlation between GRE and GPA is negative and strong. D) The correlation between GRE and GPA is negative and weak.
Version 1
36
112) The director of graduate admissions is analyzing the relationship between scores in the GRE and student performance in graduate school, as measured by a student’s GPA. The table below shows a sample of 10 students. GRE
1,500
1,400
1,000
1,050
1,100
1,250
800
850
950
1,350
GPA
3.4
3.5
3.0
2.9
3.0
3.3
2.7
2.8
3.2
3.3
{MISSING IMAGE}Click here for the Excel Data File Which of the following statements is correct? A) The correlation between GRE and GPA is negative and weak. B) The correlation between GRE and GPA is positive and weak. C) The correlation between GRE and GPA is negative and strong. D) The correlation between GRE and GPA is positive and strong.
113)
Which of the following is true regarding the weighted mean formula: for i = 1, ..., n A) B) C) D)
114)
=1 1</p> 1</p>
Calculate the mean, median, and mode of the sample data below.
6
3
9
5
3
7
8
1
{MISSING IMAGE}Click here for the Excel Data File
115) Janice Anooshian asks eight of her friends about the number of hours they spend daily on Facebook. Their responses are: 2
Version 1
1
1
8
2
1
1
2
37
{MISSING IMAGE}Click here for the Excel Data File Calculate the mean, median, and mode numbers of hours her friends spent on Facebook. Does the mean accurately reflect the center of the data?
116)
The following represent the sizes of fleece jackets for kids sold at a local Old Navy Store.
6
7
4
8
1
0
4
5
4
4
6
{MISSING IMAGE}Click here for the Excel Data File Calculate the mean, median, and mode size of fleece jackets for kids. Which of these measures of the central location represents the age that the store would like to target for advertisement dollars.
117) The following data are a list of the magnitudes of six of Alaska’s largest recorded earthquakes. 9.2
7.9
8.7
8.6
7.9
8.1
{MISSING IMAGE}Click here for the Excel Data File Calculate the mean, median, and mode of the magnitude of Alaska’s earthquakes.
118) The following is a list of the number of touchdowns LaDainian Tomlinson scored in five nonconsecutive years as a running back in the NFL. 10
11
28
14
12
{MISSING IMAGE}Click here for the Excel Data File Calculate the mean and the median of the number of touchdowns LT scored. Does the mean accurately reflect the center of the data?
Version 1
38
119) The following sample data shows the starting salaries of six graduates from the accounting program at California Polytechnic State University. The data are in thousands of dollars. 24
46
48
52
56
58
60
{MISSING IMAGE}Click here for the Excel Data File a. Calculate and interpret the 25th, 50th, and 75th percentiles. b. Are there any outliers? Is the distribution symmetric? If not, comment on its skewness.
120) John lives in Los Angeles and hates the traffic. He asked a sample of six of his coworkers who live all over Los Angles how many hours they spend commuting every year. These are their responses in hours per year. 20
240
260
300
310
570
{MISSING IMAGE}Click here for the Excel Data File a. Calculate and interpret the 30th, 50th, and 70th percentiles. b. Are there any outliers? Is the distribution symmetric? If not, comment on its skewness.
Version 1
39
121) The following are daily returns for the Dow Jones Industrial Average during the week of October 13. The returns are rounded to the nearest whole number. 11%
−1.00%
−8.00%
5.00%
−1.00%
{MISSING IMAGE}Click here for the Excel Data File a. Calculate the arithmetic mean return. b. Calculate the geometric mean return.
122) The Yearly Prices (rounded to the nearest dollar) for GLD (a gold exchange traded fund) and SLV (a silver exchange traded fund) are reported in the following table. Year January-08 January-09 January-10 January-11
GLD $ 83 $ 87 $ 110 $ 139
SLV $ 15 $ 11 $ 17 $ 30
{MISSING IMAGE}Click here for the Excel Data File a.Calculate the sample variance and sample standard deviation for the GLD ETF and SLV ETF. b.Which asset had a greater variance? c.Which asset had the greater relative dispersion?
123) The following is a list of average wind speeds at a local surf spot in California over the last week. 8
Version 1
17
19
6
3
9
12
40
{MISSING IMAGE}Click here for the Excel Data File a. What is the range in wind speed? b. What is the Mean Absolute Deviation of the wind speed?
124) The data show operating expenses (in millions) for British Petroleum for the years Year 1 to Year 3. Year Operating Expenses (Millions)
Year 1 $ 331,814
Year 2 $ 219,712
Year 3 $ 312,630
a. Use the growth rates from Year 1 − 2 and Year 2 − Year 3 to calculate the average growth rate. b. Calculate the average growth rate directly from sales.
125) Three investment options are under consideration for inclusion in a mutual fund. Given that the return on a one-year T-bill is 4.5%, use the Sharpe ratio to select the best option. Option A B C
Mean Return 11.3% 7.7% 5.5%
Variance 1,246.35(%)2 843.92(%)2 134.83(%)2
{MISSING IMAGE}Click here for the Excel Data File
Version 1
41
126) The following is return data for a retail sector ETF and energy sector ETF for the years, Year 1 to Year 5. Year
Retail Sector ETF
Energy Sector ETF
1
−0.27
0.45
2
−0.31
−0.44
3
0.72
0.33
4
0.29
0.13
5
0.1
−0.03
{MISSING IMAGE}Click here for the Excel Data File a. What is the arithmetic mean return for each ETF? b. What is the geometric mean return for each ETF? c. What is the sample standard deviation for each ETF? Which ETF was riskier over this time period? d. Given a risk-free rate of 5%, what is the Sharpe Ratio for each ETF? Which investment had a better return per unit of risk over this time period?
127) The following table shows the annual returns (in percent) of Chevron and Caterpillar for Year 1–4. Year 1 2 3 4
Version 1
Chevron 35% −20% 3% 15%
Caterpillar 14% −30% 30% 55%
42
{MISSING IMAGE}Click here for the Excel Data File a. Which fund had the higher arithmetic average return? b. Which fund was riskier over this time period? c. Given a risk-free rate of 1%, which fund has the higher Sharpe ratio? What does this imply?
128)
The following gives summary measures for Google and Apple for Year 1–5.
a. Which fund had the higher arithmetic average return? b. Which fund was riskier over this time period? c. Given a risk-free rate of 1%, which fund has the higher Sharpe ratio? What does this imply?
129) The mean starting salary of recent business graduates at a university is $52,000 with a standard deviation of $16,000. The distribution of starting salaries is unknown. a. What proportion of business graduates has a starting salary between $20,000 and $84,000? b. Suppose 600 business graduates from this university got hired. How many of them started with a salary between $20,000 and $84,000?
Version 1
43
130) Your used car is expected to last an average of 200,000 miles with a standard deviation of 25,000 miles before it requires a new transmission. a. Use Chebyshev’s theorem to approximate the probability that the engine will last between 150,000 miles and 250,000 miles. b. Assume a symmetric bell-shaped distribution to approximate the probability that the engine will last between 150,000 miles and 250,000 miles.
131) The mean starting salary of recent business graduates at a university is $52,000 with a standard deviation of $16,000. The distribution of starting salaries is assumed to be symmetric and bell-shaped. a. What proportion of business graduates has a starting salary between $20,000 and $84,000. b. Suppose 600 business graduates from this university got hired. How many of them started with a salary between $20,000 and $84,000?
Version 1
44
132) The following data represent motor vehicle theft rates per 100,000 people for the cities of Detroit, Michigan; Newark, New Jersey; St. Louis, Missouri; Oakland, California; Atlanta, Georgia; and Fresno, California. These six cities had the highest per-capita motor vehicle theft rates in the nation. City
State
Avg Vehicle Theft Rate
Detroit
Michigan
1400
Newark
New Jersey
1290
St. Louis
Missouri
1200
Oakland
California
1130
Atlanta
Georgia
940
Fresno
California
940
{MISSING IMAGE}Click here for the Excel Data File a. What are the mean and median per capita theft rates of the above cities? b. Given that the standard deviation of the per capita crime rate in Detroit is 200 thefts per 100,000, use the empirical rule to calculate the probability Detroit has more than 1,800 thefts per 100,000 next year?
133) A luxury apartment complex in South Beach Miami is for sale. The owner has received the following offers in millions of dollars. 64
72
66
58
78
82
{MISSING IMAGE}Click here for the Excel Data File a. What is the mean offer price? What is the median offer price? Is the mean a good measure of central location? b. What is the sample standard deviation of the offers? c. What is equivalent to a 75th percentile offer?
Version 1
45
134) The following data represent the number of unique visitors and the revenue a website generated for the months of July through December. Unique Visitors
Revenue
July
26
2.2
August
18
3.4
September
14
6.2
October
22
8.6
November
55
5.6
December
75
5.8
{MISSING IMAGE}Click here for the Excel Data File a. What is the sample standard deviation for the number of unique visitors and the revenue? b. Calculate the coefficient of variations. Which variable has a higher relative dispersion? c. Calculate the sample correlation coefficient between the number of unique visitors and Revenue. d. Comment on the strength of the linear relationship. What does this mean for the owner of the website?
135) The following is a list of GPA ranges and frequencies from a high school. Use 1.5 as the midpoint of the 2.0 or less category. GPA
ƒi
2.0 or less
2
2.0-2.5
8
Version 1
46
2.5-3.0
34
3.0-3.5
4
3.5-4.0
2
{MISSING IMAGE}Click here for the Excel Data File What is the mean GPA?
136) A surfer visited his favorite beach 50 times and recorded the wave height each time in the following table. Heights of waves in feet
Frequency
0 to 2
20
3 to 5
15
6 to 8
10
9 to 11
5
Total
50
{MISSING IMAGE}Click here for the Excel Data File Calculate the average wave height.
137) A large city in Southern California collected data on education and the unemployment rate for its residents with a survey. The following is the survey data. Education Level
Frequency
Average Unemployment Rate
Less than High School
35
0.12
High School Grad
25
0.09
Some College
20
0.08
College Grad
20
0.05
{MISSING IMAGE}Click here for the Excel Data File Calculate the mean unemployment rate for the city.
Version 1
47
138) Yearly returns (rounded to the nearest percent) for GLD (a gold exchange traded fund) and SLV (a silver exchange traded fund) are reported in the following table. Year 1 2 3 4 5
GLD 38% 5% 26% 26% 12%
SLV 25% −27% 55% 76% −4%
{MISSING IMAGE}Click here for the Excel Data File a. Calculate the covariance between GLD and SLV. b. Calculate and interpret the correlation coefficient.
139) The following is data a veterinarian collected from some of her clients. It is a rough estimate of a dog’s weight and how long the dog lived. Estimated of Dog's Weight 20
13
40
12
60
10
100
7
130
6
sWeight = 44.70
Version 1
Life Span
sLife Span = 3.05
48
{MISSING IMAGE}Click here for the Excel Data File a. Calculate the covariance between a dog’s weight and its life span. b. Calculate and interpret the correlation coefficient.
140) The marketing analyst uses information on 100 sales associates that includes each associate’s level of experience, and yearly number of new customers acquired by region to examine the performance of the associates. A portion of the data is shown below: Sales Associate 1 2 : 100
Experience
Northeast
Southwest
West
Southeast
Midwest
Low High : Low
15 43 : 68
45 80 : 27
40 95 : 36
30 93 : 96
30 40 : 41
{MISSING IMAGE}Click here for the Excel Data File a. Which region has the highest average yearly number of new customers acquired? b. Which region has the lowest average yearly number of new customers acquired? c. Find the average yearly number of new customers acquired for high-experience versus lowexperience associates by region. Which subgroup acquired more customers for each region?
141) The term central location refers to the way numerical data tend to cluster around some middle or central value.
Version 1
49
⊚ ⊚
true false
142)
The arithmetic mean is the middle value of a data set. ⊚ true ⊚ false
143)
Approximately 60% of the observations in a data set fall below the 60th percentile. ⊚ true ⊚ false
144)
The median is not always the 50th percentile. ⊚ true ⊚ false
145) set.
In a data set, an outlier is a large or small value regarded as an extreme value in the data ⊚ ⊚
true false
146) A box plot is useful when comparing similar information gathered at different places or times. ⊚ true ⊚ false
147) The geometric mean is a multiplicative average of a data set used to measure values over a period of time.
Version 1
50
⊚ ⊚
true false
148) The mean absolute deviation (MAD) is a less effective measure of variation when compared with the average deviation from the mean. ⊚ true ⊚ false
149) The variance and standard deviation are the most widely used measures of central location. ⊚ true ⊚ false
150)
The standard deviation is the positive square root of the variance. ⊚ true ⊚ false
151)
The variance is an average squared deviation from the mean. ⊚ true ⊚ false
152)
The coefficient of variation is a unit-free measure of dispersion. ⊚ true ⊚ false
153) Mean-variance analysis suggests that investments with lower average returns are also associated with higher risks.
Version 1
51
⊚ ⊚
true false
154)
The Sharpe ratio measures the extra reward per unit of risk. ⊚ true ⊚ false
155)
Chebyshev’s theorem is only applicable for sample data. ⊚ true ⊚ false
156)
The empirical rule is only applicable for approximately bell-shaped data. ⊚ true ⊚ false
157)
Z-scores can always be used to detect outliers. ⊚ true ⊚ false
158)
The formula for a z-score is = ⊚ true ⊚ false
159)
Outliers are extreme values above or below the mean that require special consideration. ⊚ true ⊚ false
Version 1
52
160) Mark’s grade on the recent business statistics test was an 85 on a scale of 0–100. Based on this information we can conclude that Mark’s grade was in the 85th percentile in his class. ⊚ true ⊚ false
161)
Geometric mean is greater than the arithmetic mean. ⊚ true ⊚ false
162) In quality control settings, businesses prefer a larger standard deviation, which is an indication of more consistency in the process. ⊚ true ⊚ false
163) The z-score has no units even though the original values will normally be expressed in units such as dollars, years, pounds, or calories. ⊚ true ⊚ false
164) When working with frequency distribution, the interval median is the value in the middle of the interval and can be found by taking the average of the endpoints for the interval. ⊚ true ⊚ false
165)
There is no meaningful measure of central location to describe a categorical variable. ⊚ true ⊚ false
Version 1
53
166) The weighted mean of observations in a sample is calculated differently than observations in a population. ⊚ true ⊚ false
Version 1
54
Answer Key Test name: Chap 03_4e 1) [A, C] 2) A 3) D 4) D 5) B 6) D 7) D 8) D 9) A 10) B 11) C 12) D 13) D 14) A 15) C 16) C 17) A 18) D 19) C 20) B =QUARTILE.EXC(<data>,1) = 0.77 and =QUARTILE.EXC(<data>,3) = 1.97.
21) C =(108,249/42,905)^(1/2) − 1 = 58.84%
22) B =GEOMEAN(1−0.0724,1+0.0164,1+0.0348,1−0.0249,1+0.0930)−1 = 0.78%
23) D
Version 1
55
= (1.05× 1.12× 0.94× 1.02)^(1/3.25) − 1 (Note: Since these returns are not for full years, the Excel command GEOMEAN cannot be used here.)
24) C 25) C 26) D 27) D 28) C 29) D Coefficient of Variation: A = 2/15 = 0.133; B = 0.8/5 = 0.16.
30) C In Excel, the function used is =AVERAGE(12,8,28,22,12,30,14) = 18
31) B In Excel, the function used is =MEDIAN(12,8,28,22,12,30,14) = 14
32) A In Excel, the function used is =MODE(12,8,28,22,12,30,14) = 12
33) C In Excel, the function used is =VAR.P(12,8,28,22,12,30,14) = 64; Wrong answer would be using the sample variance of =VAR.S().
34) A In Excel, the function used is =STDEV.P(12,8,28,22,12,30,14) = 8; Wrong answer would be using the sample standard deviation of =STDEV.S()
35) D In Excel, the function used is =AVERAGE(3,12,2,3,5) = 5; =MEDIAN(3,12,2,3,5) = 3
36) B In Excel, the function used is =STDEV.S(3,12,2,3,5) = 4.1
37) C In Excel, the function used is =QUARTILE.EXC(<data>,3)
38) B In Excel, the function used is =QUARTILE.EXC(<data>,3)−QUARTILE.EXT(<data>,1) = 6
39) C In Excel, the function used is =AVERAGE(172,168,188,190,172,182,174) = 178
40) B In Excel, the function used is =MEDIAN(172,168,188,190,172,182,174) = 174
41) A Version 1
56
In Excel, the function used is =MODE(172,168,188,190,172,182,174) = 172 42) B In Excel, the function used is =STDEV.S(172,168,188,190,172,182,174) = 8.64
43) C 44) B In Excel, the function used is =AVERAGE(4.2,−9.4,2.8,−16.0,−6.6) = −5
45) A In Excel, the function used is =MEDIAN(4.2,-9.4,2.8,−16.0,−6.6) = −6.6
46) B In Excel, the function used is =STDEV.S(4.2,−9.4,2.8,−16.0,−6.6) = 8.49
47) A 48) A First, standard deviation =SQRT(9) = 3. Then, CV = 3/15 = 0.20. 49) A 50) D 51) B 52) A Coefficient of Variation = SQRT(7.84)/7 = 0.40 53) B The Excel function used is =STDEV.S(10,12,15,11) = 2.16.
54) C In Excel, the function used is =AVERAGE({{[a(1)]:#,###.00}},{{[a(2)]:#,###.00}},{{[a(3)]:#,###.00}},{{[a(4)]:#,###.00}}, {{[a(5)]:#,##0.00}}) = {{[a(6)]:#,###.00}} and =STDEV.S({{[a(1)]:#,###.00}},{{[a(2)]:#,###.00}},{{[a(3)]:#,###.00}},{{[a(4)]:#,###.00}},{{ [a(5)]:#,##0.00}}) = {{[a(7)]:#,##0.00}}.
55) B In Excel, the function used is =AVERAGE(1.53,2.29,2.07,1.69,0.07) = 1.53 and =STDEV.S(1.53,2.29,2.07,1.69,0.07) = 0.87
56) C
Version 1
57
In Excel, the function used is =MEDIAN({{[a(1)]:#,###.00}},{{[a(2)]:#,###.00}},{{[a(3)]:#,###.00}},{{[a(4)]:#,###.00}},{{ [a(5)]:#,###.00}}) = {{[a(6)]:#,###.00}} and =STDEV.S({{[a(1)]:#,###.00}},{{[a(2)]:#,###.00}},{{[a(3)]:#,###.00}},{{[a(4)]:#,###.00}},{{ [a(5)]:#,###.00}}) = {{[a(7)]:#,###.00}}.
57) C In Excel, the function used is =MEDIAN(−7.14,1.62,2.50,−2.50,9.27) = 1.62 and =STDEV.S(−7.14,1.62,2.50,−2.50,9.27) = 6.11
58) A 59) B 60) C The Sharpe Ratio for Fund A is (11.64 − 4)/12.65 = 0.6040 and Fund B is (12.58 − 4)/18.44 = 0.4653.
61) A 62) A 63) D 64) D 65) C 66) D 67) A 68) B 69) D 70) B 71) B 72) C 73) D 74) A 75) B 76) C 77) C 78) C Version 1
58
79) B = 5 × 0.20 + 15 × 0.40 + 25 × 0.30 + 35 × 0.10 = $18.00.
80) B 81) B 82) B 83) C 84) B 85) B 86) C 87) A 88) B 89) D 90) C 91) A 92) B 93) C 94) A 95) C 96) B =PERCENTILE.EXC(<data>,{{[a(30)]:#,##0.00}}) = {{[a(25)]:#,###.0}}.
97) D Version 1
59
=PERCENTILE.EXC(<data>,0.60) = 24.7
98) B 99) C 100) B 101) C 102) C z = ({{[a(3)]:#,###}} − {{[a(1)]:#,###.0}})/{{[a(2)]:#,###.0}} = {{[a(4)]:#,##0.00}}.
103) A z = (21 − 38.1)/12.9 = −1.33
104) C Since 68% is one standard deviation from the mean, then {{[a(4)]:#,###}} − {{[a(1)]:#,###}} = {{[a(5)]:#,###}} or {{[a(1)]:#,###}} − {{[a(3)]:#,###}} = {{[a(5)]:#,###}} minutes.
105) A Since 68% is one standard deviation from the mean, then 28 − 22 = 6 or 22 − 16 = 6 minutes
106) D ${{[a(1)]:#,###}} +/− 3 × ${{[a(2)]:#,###}} = ${{[a(4)]:#,###}} to ${{[a(5)]:#,###}}.
107) C $2,500 +/− 3 × $300 = $1,600 to $3,400
108) B $30,100 ±2 × $5,600 = $18,900 to $41,300
109) B 110) C =COVARIANCE.S(<GRE data>,<GPA data>) = 57.5
111) D =CORREL(<GRE data>,<GPA data>) = {{[a(24)]:#,##0.000}}.
112) D =CORREL(<GRE data>,<GPA data>) = 0.894
113) A =CORREL(<GRE data>,<GPA data>) = 0.894 114) Mean = 5.25, Median = 5.5, Mode = 3 115) Mean = 2.25 hours, Median = 1.5, Mode = 1. No. Since the mean is significantly higher than the median in this set of data, the median will be a better measure of center for reporting purposes.
Version 1
60
116) Mean = 4.45; Median = 4; Mode = 4 Since the mean and median are very close, the best measure of central location for targeting advertisement dollars is the mean or approximately 6 years old. 117) Mean = 8.4, Median = 8.35, Mode = 7.9 118) Mean = 15, Median = 12. No, it is sensitive to the outlier of 28. Since the mean is significantly higher than the median in this set of data, the median will be a better measure of center for reporting purposes. 119) a. 25th percentile is 46; 50th percentile is 52; 75th percentile is 58. b. Lower Limit = 46 − 1.5 × (58 − 46) = 28. Since 24 < 28, the value of 24 is considered an outlier and the distribution is negatively skewed. 120) a. 30th percentile =PERCENTILE.EXC(<data>,0.30) 50th percentile =PERCENTILE.EXC(<data>,0.50) or =MEDIAN() 70th percentile =PERCENTILE.EXC(<data>,0.70) b. 25th quartile =QUARTILE.EXC(<data>,1) = 185 75th quartile =QUARTILE.EXC(<data>,3) = 375 IQR = 375 − 185 = 190 Lower Limit = 185 − 1.5 × 190 = −100 Upper Limit = 375 + 1.5 × 190 = 660 Based on the lower and upper limit, there are no outliers. The graph is very symmetric. 121) a. 1.2% Arithmetic Mean =AVERAGE(<data>) = 0.012 b. 1.00% Geometric Mean =GEOMEAN(1+0.11,1−0.011,1−0.08,1+0.05,1−0.01)−1 = 0.00977 or 1.0% 122) <p>For GLD: =AVERAGE(<data>) = 104.75 For SLV: =AVERAGE(<data>) = 18.25 a. For GLD: s2 = 662.92; s = 25.75. For SLV: s2 = 67.58; s = 8.22 b. GLD had a greater variance c. For GLD, CV = 0.25; for SLV, CV = 0.45; Silver had the greater relative dispersion. 123) a. Range = 16 Range = 19 − 3 = 16 b. MAD = 4.65 =AVERAGE(<data>) = 10.57 MAD =AVEDEV(<data>) = 4.65 Version 1
61
124) a) Growth from Year 1-2: Growth from Year 2-3: b) Average: =GEOMEAN(1−0.3378,1+0.4229)−1 = −0.0293 125) <p>Option A Ratio: Option B Ratio: Option C Ratio: Option A offers the best reward per unit of risk. 126) a) Arithmetic Mean ● Retail =AVERAGE(<data>) = 0.106 ● Energy =AVERAGE(<data>) =0.088 For retail ETF, the mean = 10.60%; for Energy ETF, the mean = 8.80% b) Geometric Mean ● Retail =GEOMEAN(1−0.27,1−0.31,1+0.72,1+0.29,1+0.1)−1 = 0.0422 ● Energy =GEOMEAN(1+0.45,1−0.44,1+0.33,1+0.13,1−0.03)−1 = 0.0343 For retail ETF, the geometric mean return = 4.22%; for Energy ETF, the geometric mean return = 3.43%. c) Standard Deviation ● Retail =STDEV.S(<data>) = 0.4258 ● Energy =STDEV.S(<data>) = 0.3479 For retail ETF, the standard deviation = 0.42; for Energy ETF, the standard deviation = 0.34. The Retail ETF is riskier. d) Sharpe Ratio ● Retail =(10.6 − 5)/42.58 = 0.1315 ● Energy =(8.8 − 5)/34.79 = 0.1092 For retail ETF, the Sharpe ratio = 0.13; for Energy ETF, the Sharpe ratio = 0.11. The Retail ETF has a higher return per unit of risk.
Version 1
62
127) a) Arithmetic Mean ● Chevron =AVERAGE(<data>)=0.0825 ● Caterpilla =AVERAGE(<data>)=0.1725 Caterpillar has the higher average return. b) Standard Deviation ● Chevron =STDEV.S(<data>)=0.22998 ● Caterpillar =STDEV.S(<data>)=0.34734 Caterpillar has higher risk. c) Sharpe Ratio ● Chevron =(8.25− 1)/22.998=0.3152 ● Caterpillar =(17.25 − 1)/34.734=0.4547 SRChevron = 0.32; SRCaterpillar = 0.45. So Caterpillar has better risk per unit of reward. 128) a. Apple b. Apple c. Sharpe Ratio: ● Apple = (66-1)/89=0.7313 ● Google = (20-1)/60.5=0.3140 Apple has better return per unit of risk.
129) a. At least 75% $20,000 & $84,000 are 2 standard deviations below/above the mean $52,000± 2 × $16,000 Chebyshev’s Theorem: 1 − 1/(22) = 75% b. At least 450 600 × 0.75 = 450 130) a. At least 75% of the time. 150,000 & 250,000 are 2 standard deviations below/above the mean 200,000 ±2 × 25,000 Chebyshev’s Theorem: 1 − 1/(22) = 75% b. About 95% of the time 131) a. At least 95% of the time. $20,000 & 84,000 are 2 standard deviations below/above the mean $52,000 ±2 × 16,000 so above 95% of the distribution b. 600 × 0.95 = 570 Version 1
63
132) <p>a. Median = 1,165 ● Mean =AVERAGE(<data>) = 1,150 ● Median =MEDIAN(<data>) = 1,165 b. 2.5% With mean of 1,400 and standard deviation of 200, 1,800 is two standard deviations from the mean (1,800 − 1,400) = 400/200 = 2. Therefore, 95% of the data falls in the middle of the curve leaving 2.5% above 1,800. 133) a. Mean = 70. Yes, it is a good measure of central location; No outliers; Median = 69 Mean =AVERAGE(<data>) = 70 Median =MEDIAN(<data>) = 69 b. Sample Standard Deviation ≈ 9. Standard Deviation=STDEV.S(<data>) = 9.033 c. 75th percentile is 79 75th Percentile: =QUARTILE.EXC(<data>,3) = 79 134) a. The sample standard deviation for the number of visitors = 24.41; the sample standard deviation for revenue = 2.25. b. The coefficient of variation for the number of visitors = 0.70; the coefficient of variation for revenue = 0.42; the number of visitors had a higher relative dispersion. c. The correlation coefficient between the number of visitors and revenue = 0.089. d. The linear relationship is weak. The number of unique visitors is not the major driver of revenue. 135) a. Mean GPA = 2.70. First, 136) a) First, 137) 138) a. sxy = 0.0375 ● =COVARIANCE.S(<GLD data>,<SLV data>) = 0.0375 b. rxy = 0.69; strong positive correlation. ● =CORREL(<GLD data>,<SLV data>) = 0.69
Version 1
64
139) a. sxy = −135 ● =COVARIANCE.S(<GLD data>,<SLV data>) = −135 b. rxy= −0.99 very strong negative correlation. ● =CORREL(<GLD data>,<SLV data>) = −0.9899 140) a. Southeast b. Southwest c. The high-experience associates acquired more customers than the less-experience counterpart for all five regions:
Northeast 56.11
Southwest 49.15
West 59.98
Southeast 60.29
Midwest 59.75
141) TRUE 142) FALSE 143) TRUE 144) FALSE 145) TRUE 146) TRUE 147) TRUE 148) FALSE 149) FALSE 150) TRUE 151) TRUE 152) TRUE 153) FALSE 154) TRUE 155) FALSE 156) TRUE 157) FALSE 158) FALSE Version 1
65
159) TRUE 160) FALSE 161) FALSE 162) FALSE 163) TRUE 164) FALSE 165) FALSE 166) FALSE
Version 1
66
CHAPTER 04 – 4e 1)
A(n) _____ _____ of an experiment records all possible outcomes of the experiment.
2)
The intersection of two events is the event consisting of all outcomes in A ____ B.
3) A(n) ________ probability is calculated as the relative frequency with which an event occurs.
4) According to the ________, if we flip a fair coin a large number of times, tails will shows up ______% of the time.
5)
<p>If A and B are _______ ________ events, then the addition rule is defined as
6) Events are considered _________ if the occurrence of one is related to the probability of the occurrence of the other.
7) An intuitive way to express the total probability rule is with the help of a _______________.
8)
What is probability?
Version 1
1
A) Any value between 0 and 1 is always treated as a probability of an event. B) A numerical value assigned to an event that measures the number of its occurrences. C) A value between 0 and 1 assigned to an event that measures the likelihood of its occurrence. D) A value between 0 and 1 assigned to an event that measures the unlikelihood of its occurrence.
9)
For an experiment in which a single die is rolled, the sample space is __________. A) {1, 1, 3, 4, 5, 6}. B) {2, 1, 3, 6, 5, 4}. C) {1, 2, 3, 4, 4, 5}. D) {3, 5, 5, 6, 4, 2}.
10)
A sample space contains _________________. A) outcomes of the relevant events. B) several outcomes of an experiment. C) all possible outcomes of an experiment. D) one of several outcomes of an experiment.
11)
What is a simple event? A) An event that contains all outcomes of a sample space B) An event that contains several outcomes of a sample space C) An event that contains only one outcome of a sample space D) An event that contains the most probable outcome of a sample space
Version 1
2
12) Which of the following is not an event when considering the sample space of tossing two coins? A) {HH, HT} B) {HH, TT, HT} C) {HH, TT, HTH} D) {HH, HT, TH, TT}
13)
Events are collectively exhaustive if _____________. A) they include all events B) they are included in all events C) they contain all outcomes of an experiment D) they do not share any common outcomes of an experiment
14)
Mutually exclusive events _____________. A) contain all possible outcomes B) may share common outcomes C) do not share common outcomes D) do not contain all possible outcomes
15) Which of the following are mutually exclusive events of an experiment in which two coins are tossed? A) {TT, HH} and {TT} B) {HT, TH} and {TH} C) {TT, HT} and {HT} D) {TT, HH} and {TH}
Version 1
3
16) In an experiment in which a coin is tossed twice, which of the following represents mutually exclusive and collectively exhaustive events? A) {TT, HH} and {TT, HT} B) {HT, TH} and {HH, TH} C) {TT, HH} and {TH, HT} D) {TT, HT} and {HT, TH}
17) Which of the following sets of outcomes described below in I and II represent mutually exclusive events? 1. "Your final course grade is an A"; "Your final course grade is a B." 2. "Your final course grade is an A"; "Your final course grade is a Pass."
A) Neither I nor II represent mutually exclusive events. B) Both I and II represent mutually exclusive events. C) Only I represents mutually exclusive events. D) Only II represents mutually exclusive events.
19)
Mutually exclusive and collectively exhaustive events ______________.
A) contain all outcomes in a sample space and may share common outcomes B) contain all outcomes in a sample space and do not share common outcomes C) do not have to contain all outcomes in a sample space but do not share common outcomes D) do not have to contain all outcomes in a sample space and may share common outcomes
20)
The union of events A and B, denoted by
Version 1
, ____________.
4
A) contains all outcomes that are in A or B B) contains all outcomes of an experiment C) contains no outcomes that are in A and B D) consists only of outcomes that are in A and B
21)
The intersection of events A and B, denoted by
, ___________.
A) contains outcomes that are either in A or B B) contains outcomes that are both in A and B C) does not contain outcomes that are either in A or B D) does not contain outcomes that are both in A and B
22) The intersection of events A = {apple pie, peach pie, pumpkin pie} and B = {cherry pie, blueberry pie, pumpkin pie} is ____________. A) {pumpkin pie} B) {apple pie, peach pie, cherry pie, blueberry pie} C) {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie} D) {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie, pumpkin pie}
23) The union of events A = {apple pie, peach pie, pumpkin pie} and B = {cherry pie, blueberry pie, pumpkin pie} is ________. A) {pumpkin pie} B) {apple pie, peach pie, cherry pie, blueberry pie} C) {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie} D) {apple pie, peach pie, pumpkin pie, cherry pie, blueberry pie, pumpkin pie}
Version 1
5
24) The complement of an event A, within the sample space S, is the event consisting of ____________. A) all outcomes in A that are in S B) all outcomes in S that are in A C) all outcomes in S that are not in A D) all outcomes in A that are not in S
25) For the sample space S = {apple pie, cherry pie, peach pie, pumpkin pie}, what is the complement of A = {pumpkin pie, cherry pie}? A) {apple pie} B) {peach pie} C) {apple pie, peach pie} D) {pumpkin pie, cherry pie}
26) Assume the sample space S = {yes, no}. Select the choice that fulfills the requirements of the definition of probability. A) P({yes}) = 0.7, P({no}) = 0.3 B) P({yes}) = 0.5, P({no}) = −0.5 C) P({yes}) = 0.7, P({no}) = 0.2 D) P({yes}) = 1.0, P({no}) = 0.1
27) Assume the sample space S = {win, loss}. Select the choice that fulfills the requirements of the definition of probability. A) P({win}) = 0.7, P({loss}) = 0.2 B) P({win}) = 0.7, P({loss}) = 0.3 C) P({win}) = 1.0, P({loss}) = 0.1 D) P({win}) = 0.5, P({loss}) = −0.5
Version 1
6
28) A probability based on logical analysis rather than on observation or personal judgment is best referred to as a(n) _______________. A) classical probability B) empirical probability C) subjective probability D) finite probability
29) An analyst believes the probability that U.S. stock returns exceed long-term corporate bond returns over a five-year period is based on personal assessment. This type of probability is best characterized as a(n) ___________. A) classical probability B) empirical probability C) objective probability D) subjective probability
30)
Which of the following represents a subjective probability? A) The probability of rolling a 2 on a single die is one in six. B) Based on a conducted experiment, the probability of tossing a head on an unfair coin
is 0.6. C) A skier believes she has a 10% chance of winning a gold medal. D) Based on past observation, a manager believes there is a three-in-five chance of retaining an employee for at least one year.
31)
Which of the following represents an empirical probability?
Version 1
7
A) The probability of tossing a head on a coin is 0.5. B) The probability of rolling a 2 on a single die is one in six. C) A skier believes she has a 0.10 chance of winning a gold medal. D) Based on past observation, a manager believes there is a three-in-five chance of retaining an employee for at least one year.
32) An analyst has a limit order outstanding on a stock. He argues that the probability that the order will execute before the close of trading is 0.30. What is the probability that the order will execute after the closing of trading? A) 0.70 B) 0.30 C) 1.0 D) 0.75
33) After extensive research examining 10,000 calls from customers on a typical 12-hour day at C&A technical support center, the analyst uses the following table to summarize the frequency distribution of calls: Hours of the Day
Frequency
8 AM to 11 AM
254
11:01 AM to 2 PM
3080
2:01 PM to 5 PM
2541
5:01 PM to 8 PM
4125
What is the probability that a randomly selected call is placed before 2:01 PM? A) 31% B) 25% C) 33% D) 67%
Version 1
8
34) After extensive research examining 10,000 calls from customers on a typical 12-hour day at C&A technical support center, the analyst uses the following table to summarize the frequency distribution of calls: Hours of the Day
Frequency
8 AM to 11 AM
254
11:01 AM to 2 PM
3080
2:01 PM to 5 PM
2541
5:01 PM to 8 PM
4125
What is the probability that a randomly selected call is placed after 2 PM? A) 31% B) 25% C) 33% D) 67%
35) There is a 25% chance that a customer who purchases milk will also purchase bread. The probability of a milk purchase is 70% and the probability of a bread purchase is 50%. What is the probability that a customer will purchase bread given that he/she buys milk? A) 36% B) 50% C) 25% D) 64%
36) An experiment consists of tossing three fair coins. What is the probability of tossing two tails? A) 1/8 B) 1/4 C) 3/8 D) 1/2
Version 1
9
37)
Let P(A) = 0.6, P(B) = 0.3, and P(A∪B)C = 0.1. Calculate P(A∩B). A) 0 B) 0.3 C) 0.9 D) Not enough information to calculate.
38) Given an experiment in which a fair coin is tossed three times, the sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}. Event A is defined as tossing no tail (T). What is the event Ac and what is the probability of this event? A) Ac = {TTT, HHH, HTH}; P(Ac) = 0.625 B) Ac = {HHT, HTH, THH, HTT, THT, TTH, TTT}; P(Ac) = 0.875 C) Ac = {TTT, THH, HHH, HHT}; P(Ac) = 0.500 D) Ac = {TTT, HHH, HHT, HTH, HTT}; P(Ac) = 0.875
39) Given an experiment in which a fair coin is tossed three times, the sample space is S = {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}. Event A is defined as tossing one head (H). What is the event Ac and what is the probability of this event? A) Ac = {TTT, HHH, HTH}; P(Ac) = 0.375 B) Ac = {TTT, THH, HHH, HHT}; P(Ac) = 0.500 C) Ac = {TTT, HHH, HHT, HTH, HTT}; P(Ac) = 0.625 D) Ac = {TTT, HHT, HTH, THH, HHH}; P(Ac) = 0.625
40)
<p>Let
Version 1
. Calculate
.
10
A) 0.20 B) 0.33 C) 0.40 D) Not enough information to calculate.
41) Alison has all her money invested in two mutual funds, A and B. She knows that there is a 65% chance that fund A will rise in price, and a 80% chance that fund B will rise in price given that fund A rises in price. What is the probability that both fund A and fund B will rise in price? A) 0.40 B) 1.00 C) 0.76 D) 0.52
42) Alison has all her money invested in two mutual funds, A and B. She knows that there is a 40% chance that fund A will rise in price, and a 60% chance that fund B will rise in price given that fund A rises in price. What is the probability that both fund A and fund B will rise in price? A) 0.24 B) 0.40 C) 0.76 D) 1.00
43) Alison has all her money invested in two mutual funds, A and B. She knows that there is a 40% chance that fund A will rise in price, and a 60% chance that fund B will rise in price given that fund A rises in price. There is also a 30% chance that fund B will rise in price. What is the probability that at least one of the funds will rise in price?
Version 1
11
A) 0.24 B) 0.46 C) 0.60 D) 0.76
44) Alison has all her money invested in two mutual funds, A and B. She knows that there is a 40% chance that fund A will rise in price, and a 60% chance that fund B will rise in price given that fund A rises in price. There is also a 20% chance that fund B will rise in price. What is the probability that neither fund will rise in price? A) 0.24 B) 0.36 C) 0.40 D) 0.64
45) A recent survey shows that the probability of a college student drinking alcohol is 0.6. Further, given that the student is over 21 years old, the probability of drinking alcohol is 0.8. It is also known that 30% of the college students are over 21 years old. The probability of drinking or being over 21 years old is __________. A) 0.24 B) 0.42 C) 0.66 D) 0.90
46) <p>Let of P(B|A)?
Version 1
Suppose A and B are independent. What is the value
12
A) 0.12 B) 0.3 C) 0.4 D) 0.7
47)
If A and B are independent events, which of the following is correct? A) B) C) D)
48) Let A and B be two independent events with P(A) = 0.40 and P(B) = 0.20. Which of the following is correct? A) B) C) D)
49) Peter applied to an accounting firm and a consulting firm. He knows that 30% of similarly qualified applicants receive job offers from the accounting firm, while only 20% of similarly qualified applicants receive job offers from the consulting firm. Assume that receiving an offer from one firm is independent of receiving an offer from the other. What is the probability that both firms offer Peter a job? A) 0.05 B) 0.06 C) 0.44 D) 0.50
Version 1
13
50) The likelihood of Company A’s stock price rising is 15%, and the likelihood of Company B’s stock price rising is 40%. Assume that the returns of Company A and Company B stock are independent of each other. The probability that the stock price of at least one of the companies will rise is ______. A) 15% B) 6% C) 49% D) 55%
51) The likelihood of Company A’s stock price rising is 20%, and the likelihood of Company B’s stock price rising is 30%. Assume that the returns of Company A and Company B stock are independent of each other. The probability that the stock price of at least one of the companies will rise is ______. A) 6%. B) 10%. C) 44%. D) 50%.
52)
Find the missing values marked xx and yy in the following contingency table. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
167
111
yy
369
35 years or older
63
xx
32
134
Total
230
150
123
503
A) xx = 102, yy = 89 B) xx = 39, yy = 91 C) xx = 102, yy = 91 D) xx = 39, yy = 89
Version 1
14
53)
Find the missing values marked xx and yy in the following contingency table. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
yy
357
35 years or older
45
xx
87
159
Total
202
148
166
516
A) xx = 72, yy = 79 B) xx = 27, yy = 77 C) xx = 27, yy = 79 D) xx = 72, yy = 77
54) The contingency table below provides frequencies for the preferred type of exercise for people under the age of 35 and those 35 years of age or older. Find the probability that an individual prefers running. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
79
357
35 years or older
45
27
87
159
Total
202
148
166
516
A) 0.3159 B) 0.3915 C) 0.6085 D) 0.6805
55) The contingency table below provides frequencies for the preferred type of exercise for people under the age of 35 and those 35 years of age or older. Find the probability that an individual prefers running and is under 35 years of age. Age Group
Preferred Form of Exercise Running
Version 1
Biking
Total
Swimming
15
Under 35 years
157
121
79
357
35 years or older
45
27
87
159
Total
202
148
166
516
A) 0.3042 B) 0.3915 C) 0.4398 D) 0.6918
56) The contingency table below provides frequencies for the preferred type of exercise for people under the age of 35, and those 35 years of age or older. Find the probability that an individual prefers biking given that he or she is 35 years old or older. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
79
357
35 years or older
45
27
87
159
Total
202
148
166
516
A) 0.1698 B) 0.1824 C) 0.8175 D) 0.8302
57) The following probability table shows probabilities concerning Favorite Subject and Gender. What is the probability of selecting an individual who is a female or prefers science? Gender
Favorite Subject
Total
Math
English
Science
Male
0.200
0.050
0.175
0.425
Female
0.100
0.325
0.150
0.575
Total
0.300
0.375
0.325
1.000
Version 1
16
A) 0.250 B) 0.375 C) 0.625 D) 0.750
58) The following probability table shows probabilities concerning Favorite Subject and Gender. What is the probability of selecting an individual preferring science if she is female? Gender
Favorite Subject
Total
Math
English
Science
Male
0.200
0.050
0.175
0.425
Female
0.100
0.325
0.150
0.575
Total
0.300
0.375
0.325
1.000
A) 0.2609 B) 0.4615 C) 0.5385 D) 0.7391
59) Two hundred people were asked if they had read a book in the last month. The accompanying contingency table, cross-classified by age, is produced. Under 30
30+
Yes
76
65
No
24
35
The probability that a respondent is at least 30 years old is the closest to ______. A) 0.33 B) 0.46 C) 0.50 D) 0.65
Version 1
17
60) Two hundred people were asked if they had read a book in the last month. The accompanying contingency table, cross-classified by age, is produced. Under 30
30+
Yes
76
65
No
24
35
The probability that a respondent read a book in the last month and is at least 30 years old is the closest to _______. A) 0.12 B) 0.33 C) 0.46 D) 0.88
61) Two hundred people were asked if they had read a book in the last month. The accompanying contingency table, cross-classified by age, is produced. Under 30
30+
Yes
76
65
No
24
35
Given that a respondent read a book in the last month, the probability that he or she is at least 30 years old is the closest to ________. A) 0.33 B) 0.46 C) 0.65 D) 0.88
62) Mark Zuckerberg, the founder of Facebook, announced that he will eat meat only from animals that he has killed himself (Vanity Fair, November 2011). Suppose 257 people were asked, "Does the idea of killing your own food appeal to you, or not?" The accompanying contingency table, cross-classified by gender, is produced from the 187 respondents.
Yes
Version 1
Male
Female
35
20
18
No
56
76
The probability that a respondent to the survey is male is the closest to ____. A) 0.19 B) 0.38 C) 0.49 D) 0.64
63) Mark Zuckerberg, the founder of Facebook, announced that he will eat meat only from animals that he has killed himself (Vanity Fair, November 2011). Suppose 257 people were asked, "Does the idea of killing your own food appeal to you, or not?" The accompanying contingency table, cross-classified by gender, is produced from the 187 respondents. Male
Female
Yes
35
20
No
56
76
The probability that a respondent is male and feels that the idea of killing his own food is appealing is the closest to _____. A) 0.19 B) 0.38 C) 0.41 D) 0.59
64) Mark Zuckerberg, the founder of Facebook, announced that he will eat meat only from animals that he has killed himself (Vanity Fair, November 2011). Suppose 257 people were asked, "Does the idea of killing your own food appeal to you, or not?" The accompanying contingency table, cross-classified by gender, is produced from the 187 respondents. Male
Female
Yes
35
20
No
56
76
Given that the respondent is male, the probability that he feels that the idea of killing his own food is appealing is the closest to ____.
Version 1
19
A) 0.19 B) 0.38 C) 0.59 D) 0.64
65) The 150 residents of the town of Wonderland were asked their age and whether they preferred vanilla, chocolate, or swirled frozen yogurt. The results are displayed next. Chocolate
Vanilla
Swirl
Under 25 years old
22
37
16
At least 25 years old
19
29
27
What is the probability that a randomly selected customer prefers vanilla? A) 0.37 B) 0.31 C) 0.17 D) 0.44
66) The 150 residents of the town of Wonderland were asked their age and whether they preferred vanilla, chocolate, or swirled frozen yogurt. The results are displayed next. Chocolate
Vanilla
Swirl
Under 25 years old
40
20
15
At least 25 years old
15
40
20
What is the probability that a randomly selected customer prefers vanilla? A) 0.13 B) 0.27 C) 0.33 D) 0.40
Version 1
20
67) The 150 residents of the town of Wonderland were asked their age and whether they preferred vanilla, chocolate, or swirled frozen yogurt. The results are displayed next. Chocolate
Vanilla
Swirl
Under 25 years old
40
20
15
At least 25 years old
15
40
20
What is the probability a randomly selected customer prefers chocolate given he or she is at least 25 years old? A) 0.10 B) 0.20 C) 0.27 D) 0.77
68) The 150 residents of the town of Wonderland were asked their age and whether they preferred vanilla, chocolate, or swirled frozen yogurt. The results are displayed next. Chocolate
Vanilla
Swirl
Under 25 years old
40
20
15
At least 25 years old
15
40
20
What is the probability a randomly selected customer prefers swirled yogurt or is at least 25 years old? A) 0.10 B) 0.20 C) 0.60 D) 0.90
69)
<p>Let
Version 1
, and
. Compute P(A).
21
A) 0.15 B) 0.30 C) 0.45 D) 0.50
70)
Let P(A∩B) = 0.3, and P(A∩Bc) = 0.10. Compute P(B|A). A) 0.53 B) 0.75 C) 0.41 D) 0.58
71)
<p>Let
, and
. Compute
.
A) 0.33 B) 0.45 C) 0.5 D) 0.67
72)
Let P(A) = 0.70, P(B|A) = 0.70, and P(B|Ac) = 0.35. Compute P(B). A) 0.375 B) 0.60 C) 0.45 D) 0.40
73)
Let P(A) = 0.4, P(B|A) = 0.5, and P(B|Ac) = 0.25. Compute P(B).
Version 1
22
A) 0.125 B) 0.15 C) 0.20 D) 0.35
74)
<p>Let .
, and
, and
. Compute
A) 0.20 B) 0.50 C) 0.65 D) 0.70
75)
Let P(A) = 0.4, P(B|A) = 0.5, and P(B|Ac) = 0.25. Compute P(A|B) . A) 0.20 B) 0.35 C) 0.57 D) 0.80
76) An analyst expects that 10% of all publicly traded companies will experience a decline in earnings next year. The analyst has developed a ratio to help forecast this decline. If the company is headed for a decline, there is a 70% chance that this ratio will be negative. If the company is not headed for a decline, there is only a 20% chance that the ratio will be negative. The analyst randomly selects a company and its ratio is negative. Based on Bayes’ theorem, the posterior probability that the company will experience a decline is ____. A) 3%. B) 7%. C) 18%. D) 28%.
Version 1
23
77) An analyst expects that 20% of all publicly traded companies will experience a decline in earnings next year. The analyst has developed a ratio to help forecast this decline. If the company is headed for a decline, there is a 90% chance that this ratio will be negative. If the company is not headed for a decline, there is only a 10% chance that the ratio will be negative. The analyst randomly selects a company with a negative ratio. Based on Bayes’ theorem, the posterior probability that the company will experience a decline is A) 18%. B) 26%. C) 44%. D) 69%.
78) Romi, a production manager, is trying to improve the efficiency of his assembly line. He knows that the machine is set up correctly only 60% of the time. He also knows that if the machine is set up correctly, it will produce good parts 80% of the time, but if set up incorrectly, it will produce good parts only 20% of the time. Romi starts the machine and produces one part before he begins the production run. He finds the first part to be good. What is the revised probability that the machine was set up correctly? A) 48% B) 56% C) 86% D) 96%
79) Romi, a production manager, is trying to improve the efficiency of his assembly line. He knows that the machine is set up correctly only 65% of the time. He also knows that if the machine is set up correctly, it will produce good parts 76% of the time, but if set up incorrectly, it will produce good parts only 10% of the time. Romi starts the machine and produces one part before he begins the production run. He finds the first part to be good. What is the revised probability that the machine was set up correctly?
Version 1
24
A) 93.40% B) 16.40% C) 20.70% D) 49.40%
80) Romi, a production manager, is trying to improve the efficiency of his assembly line. He knows that the machine is set up correctly only 70% of the time. He also knows that if the machine is set up correctly, it will produce good parts 95% of the time, but if set up incorrectly, it will produce good parts only 40% of the time. Romi starts the machine and produces one part before he begins the production run. He finds the first part to be good. What is the revised probability that the machine was set up correctly? A) 12% B) 33.5% C) 66.5% D) 84.7%
81)
5! is equal to _______. A) 5 × 4 × 3 × 2 B) 5 × 4 C) 5 × 4 × 3 D) 5 × 4 × 3 × 2 × 1
82)
How many ways can a committee of four students be selected from a 15-member club? A) 15!/44! B) 15!/(11! × 4!) C) 15 ×14 × 13 D) 15!/(11! × 4!) and 15 × 14 × 13
Version 1
25
83) How many ways can a potential four-letter word, whether or not it has a meaning, be created out of 10 available different letters? A) 4 × 3 × 2 × 1 B) 10 × 9 × 8 × 7/(4 × 3 × 2) C) 10 × 9 × 8 × 7 × 6 × 5/(4 × 3 × 2) D) 10 × 9 × 8 × 7
84)
When some objects are randomly selected, which of the following is true?
A) The order in which objects are selected matters in combinations. B) The order in which objects are selected does not matter in permutations. C) The order in which objects are selected does not matter in combinations. D) The order in which objects are selected matters in both permutations and combinations.
85) A small company that manufactures juggling equipment makes seven different types of clubs. The company wants to start an ad campaign that emphasizes the myriad combinations the avid juggler can create with the company’s clubs. If a juggler wishes to juggle four clubs, each of a different type, how many different combinations of the company’s clubs can he or she make? A) 7 B) 28 C) 35 D) 210
86) Boeing currently produces five models of airplanes for commercial sale. The airline that Lauren works for is rapidly expanding and would like to purchase three airplanes of different models to service various routes. Her job is to analyze which three to buy. How many combinations will she have to analyze?
Version 1
26
A) 5 B) 10 C) 15 D) 20
87) How many project teams composed of five students can be created out of a class of 10 students? A) 10 B) 50 C) 252 D) 30,240
88) How many project teams composed of five students can be created out of a class of 10 students, if each of the five students is assigned a specific position in each group such as president, vice president, secretary, treasurer, and social coordinator? A) 10 B) 50 C) 252 D) 30,240
89) A survey of adults who typically work full time from home recorded their current education level. The results are shown in the table below. Education Level
Frequency
Bachelor's degree or higher
32
Associate degree
12
High school only
4
Less than high school
2
Calculating the probability that a randomly selected adult who works full time from home has an associate degree is using ________ probability.
Version 1
27
A) empirical B) simple C) subjective D) classical
90) A survey of adults who typically work full time from home recorded their current education level. The results are shown in the table below. Education Level
Frequency
Bachelor's degree or higher
32
Associate degree
12
High school only
4
Less than high school
2
The probability that a randomly selected adult who works full time from home has a bachelor’s degree or higher is ______. A) 0.24. B) 0.29. C) 0.33. D) 0.64.
91) A company is bidding on two projects, A and B. The probability that the company wins project A is 0.40 and the probability that the company wins project B is 0.25. Winning project A and winning project B are independent events. What is the probability that the company wins project A or project B? A) 0.55 B) 0.65 C) 0.30 D) 0.25
Version 1
28
92) A company is bidding on two projects, A and B. The probability that the company wins project A is 0.50 and the probability that the company wins project B is 0.20. Winning project A and winning project B are independent events. What is the probability that the company does not win either project? A) 0.70 B) 0.65 C) 0.30 D) 0.40
93) A company is bidding on two projects, A and B. The probability that the company wins project A is 0.40 and the probability that the company wins project B is 0.25. Winning project A and winning project B are independent events. What is the probability that the company does not win either project? A) 0.35 B) 0.70 C) 0.45 D) 0.75
94) A random sample of computer users was asked which browser they primarily relied on for surfing the Internet and whether this browser was installed on a PC or on a Mac computer. The following contingency table shows these results. Browser
PC
Mac
Internet Explorer
18
33
Firefox
12
24
Chrome
15
30
Safari
3
9
Other
3
3
Given that the randomly selected computer was a PC, the probability that the computer used the Firefox browser is ______.
Version 1
29
A) 0.235 B) 0.320 C) 0.361 D) 0.443
95) A business statistics class roster includes 14 business major students and 21 students of another major. Sixteen students in this class are male. There are eight female business majors. What is the number of men in the class who are not business majors? A) 6 B) 8 C) 10 D) 11
96) The following contingency table shows the number of customers who bought various brands of digital cameras at Amazon.com and Ritz Camera. Brand
Amazon.com
Ritz Camera
Nikon
53
6
Canon
34
30
Sony
26
20
Other
90
79
Given that the camera was purchased at Ritz Camera, the probability that it was a Sony brand is ______. A) 0.148 B) 0.131 C) 0.204 D) 0.216
Version 1
30
97) The following contingency table shows the number of customers who bought various brands of digital cameras at Amazon.com and Ritz Camera. Brand
Amazon.com
Ritz Camera
Nikon
42
9
Canon
30
24
Sony
33
15
Other
82
75
Given that the camera was purchased at Ritz Camera, the probability that it was a Sony brand is ______. A) 0.105 B) 0.122 C) 0.178 D) 0.190
98) At a local bar in a small town, beer and wine are the only two alcoholic options. The manager noted that of all male customers, 150 ordered beer, 40 ordered wine, and 20 asked for soft drinks. Of female customers 38 ordered beer, 20 ordered wine, and 12 asked for soft drinks. What is the probability that a randomly selected customer orders wine? A) 0.143 B) 0.250 C) 0.071 D) 0.214
99) Suppose that 60% of the students do homework regularly. It is also known that 80% of students who had been doing homework regularly, end up doing well in the course. Only 20% of students who had not been doing homework regularly end up doing well in the course. Given that a student did well in the course, what is the probability that the student had been doing homework regularly?
Version 1
31
A) 0.286 B) 0.857 C) 0.143 D) 0.429
100) Suppose that 60% of the students do homework regularly. It is also known that 80% of students who had been doing homework regularly end up doing well in the course. Only 20% of students who had not been doing homework regularly end up doing well in the course. What is the probability that a student does well in the course? A) 0.080 B) 0.480 C) 0.560 D) 0.857
101) There are 23 chess players participating in a weekend tournament. Prices are awarded to the first-, second-, third-, and fourth-place finishers. The number of different ways these four places can be filled by those playing is ______. A) 3,450 B) 8,855 C) 70,256 D) 212,520
102) There are 30 Major League Baseball teams in the National League. Five of these teams will make the playoffs at the end of the season. The number of unique groups of teams that can make the playoffs is ______.
Version 1
32
A) 142,506 B) 252,640 C) 752,988 D) 17,100,720
103) The following table summarizes the ages of the 400 richest Americans. Suppose we select one of these individuals. Find the probability that the selected individual is at least 60 years old. Ages
Frequency
less than 40
7
40 up to 50
47
50 up to 60
90
60 up to 70
109
70 up to 80
93
80 up to 90
45
90 or more
9
104) The following table summarizes the ages of the 400 richest Americans. Suppose we select one of these individuals. Find the probability that the selected individual is less than 80 years old. Ages
Frequency
less than 40
7
40 up to 50
47
50 up to 60
90
60 up to 70
109
70 up to 80
93
80 up to 90
45
90 or more
9
Version 1
33
105)
An experiment consists of rolling a fair die. Find the probability that we roll a 4 or a 6.
106) An experiment consists of tossing a fair coin and rolling a fair die. Find the probability of tossing a head and rolling a 6.
107) The likelihood of rain has been reported at 80% for this week, 40% for next week, and 32% for both weeks. a.What is the probability that it rains in at least one of these two weeks? Which rule of probability applies? b.What is the probability that it does not rain in either of these two weeks? Which rule of probability applies?
108) The likelihood of rain has been reported at 80% for this week, 40% for next week, and 32% for both weeks. Does whether it rains this week influence whether it rains next week? Explain.
Version 1
34
109) Exams are approaching and Helen is allocating time to studying for exams. She thinks that with the appropriate amount of studying, she has an 80% chance of getting an A in Marketing. She also feels that she has a 60% chance of getting an A in Spanish with the appropriate amount of studying. Given the demands on her time, she feels that she has only a 45% chance of getting an A in both classes. What is the probability that Helen gets an A in at least one of her two classes?
110) Exams are approaching and Helen is allocating time to studying for exams. She feels that with the appropriate amount of studying, she has an 80% chance of getting an A in Marketing. She also feels that she has a 60% chance of getting an A in Spanish with the appropriate amount of studying. Given the demands on her time, she feels that she has only a 45% chance of getting an A in both classes. What is the probability that Helen does not get an A in either class?
111) Exams are approaching and Helen is allocating time to studying for exams. She thinks that with the appropriate amount of studying, she has a 70% chance of getting an A in Spanish, and a 20% chance of getting a B in Marketing. Assuming she has no obstacles limiting her available time to allocate to each subject, what is the probability that Helen gets an A in Spanish or a B in Marketing?
Version 1
35
112) Exams are approaching and Helen is allocating time to studying for exams. She plays for the very successful women’s lacrosse team and must schedule her studying around lacrosse practices and play-off games. The entire team assumes that the probability of making the playoffs is 50%. Helen thinks that with the appropriate preparation, she has a 70% chance of getting an A in Marketing, but this chance decreases to 60% if the lacrosse team makes the play-offs. Find the probability of getting an A in Marketing and making the lacrosse play-offs.
113) Exams are approaching and Helen is allocating time to studying for exams. She plays for the very successful women’s lacrosse team and must schedule her studying around lacrosse practices and play-off games. She thinks that with the appropriate preparation, she has a 70% chance of getting an A in Marketing. She also believes that this chance will decrease to 60% if the lacrosse team makes the play-offs. Are getting an A on the exam and being in the lacrosse play-offs independent events? Show evidence of your response.
114) Ryan is hoping to attend graduate school next year. Two of the schools he applied to are the University of Utah and Ohio State University. The probability he gets accepted to Utah given he got accepted to Ohio State is 0.75. The probability he gets accepted to Ohio State is 0.05. The probability he gets accepted to Utah is 0.10. What is the probability he gets accepted to Ohio State given he gets accepted to Utah?
115) Two stocks, A and B, have a historical correlation indistinguishable from zero. The probability that stock A increases next year is 0.2, and the probability that stock B increases next year is 0.4. Calculate the probability that A and B both increase next year.
Version 1
36
116) An investor is keeping a careful eye on the real estate markets in Las Vegas and the Inland Empire. The following are her predictions for the real estate market in 2016. ● With 0.32 probability, foreclosures will increase in Las Vegas. ● With 0.46 probability, foreclosures will increase in Las Vegas or the Inland Empire. ● With 0.27 probability, foreclosures will increase in Las Vegas and the Inland Empire. What is the probability that foreclosures will increase in the Inland Empire?
117) An investor is keeping a careful eye on the real estate markets in Las Vegas and the Inland Empire. The following are her predictions for the real estate market for the next year. ● With 0.32 probability, foreclosures will increase in Las Vegas. ● With 0.46 probability, foreclosures will increase in Las Vegas or the Inland Empire. ● With 0.27 probability, foreclosures will increase in Las Vegas and the Inland Empire. What is the probability that foreclosures will increase in the Inland Empire given that they increased in Vegas?
118) The following contingency table provides frequencies for the preferred type of exercise for people under the age of 35 and those 35 years of age or older. Here xx and yy represent missing values. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
79
xx
35 years or older
45
27
87
159
Total
yy
148
166
516
Version 1
37
Compute the probability that an individual is under 35 and prefers running.
119) The following contingency table provides frequencies for the preferred type of exercise for people under the age of 35 and those 35 years of age or older. Here xx and yy represent missing values. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
79
xx
35 years or older
45
27
87
159
Total
yy
48
166
516
Determine whether selecting an individual under 35 is independent of selecting an individual who prefers swimming.
120) The following contingency table provides frequencies for the preferred type of exercise for people under the age of 35 and those 35 years of age or older. Here xx and yy represent missing values. Age Group
Preferred Form of Exercise
Total
Running
Biking
Swimming
Under 35 years
157
121
79
xx
35 years or older
45
27
87
159
Total
yy
48
166
516
Determine whether selecting an individual under 35 is independent of selecting an individual who prefers swimming.
Version 1
38
121) In January 2012, the second stop for a Republican to get votes toward the presidential nomination was at the New Hampshire Primary. The following exhibit shows the votes several candidates received from registered Republicans and Independents. Party Affiliation Mitt Romney Ron Paul Jon Huntsman
Others
Total
Republicans
62,645
16,792
12,733
27,610
119,780
Not Republican
34,887
40,056
29,212
24,550
128,705
Source: ABC News What is the probability that a randomly selected voter voted for Ron Paul?
122) In January of 2012, the second stop for a Republican to get votes toward the presidential nomination was at the New Hampshire Primary. The following exhibit shows the votes several candidates received from registered Republicans and Independents. Party Affiliation Mitt Romney Ron Paul Jon Huntsman
Others
Total
Republicans
62,645
16,792
12,733
27,610
119,780
Not Republican
34,887
40,056
29,212
24,550
128,705
Source: ABC News Given that a randomly selected voter is not a Republican, what is the probability that he or she voted for Ron Paul?
123) In January of 2012, the second stop for a Republican to get votes toward the presidential nomination was at the New Hampshire Primary. The following exhibit shows the votes several candidates received from registered Republicans and Independents. Party Affiliation Mitt Romney Ron Paul Jon Huntsman
Others
Total
Republicans
62,645
16,792
12,733
27,610
119,780
Not Republican
34,887
40,056
29,212
24,550
128,705
Version 1
39
Source: ABC News If a randomly selected voter voted for Jon Huntsman, what is the probability he or she is a Republican?
124) As part of pharmaceutical testing for drowsiness as a side effect of a drug, 200 patients are randomly assigned to one of two groups of 100 each. One group is given the actual drug and the other a placebo. The number of people who felt drowsy in the next hour is recorded as Drug
Placebo
Drowsy
40
30
Not drowsy
60
70
a.What is the probability that a randomly picked patient in the study feels drowsy in the next hour? b.What is the probability that a randomly picked patient in the study takes the placebo or feels drowsy in the next hour? c.Given that the patient was given the drug, what is the probability that he or she feels drowsy in the next hour? d.Is whether a patient feels drowsy independent of taking the drug? Explain using probabilities.
125) Restaurants in London, Paris, and New York want diners to experience eating in pitch darkness to heighten their senses of taste and smell (Vanity Fair, December 2011). Suppose 400 people were asked, “If given the opportunity, would you eat at one of these restaurants?” The accompanying contingency table, cross-classified by age, would be produced. 18–29
30–44
45–64
65+
Yes
50
39
76
65
No
50
61
24
35
Convert the contingency table to a joint probability table. Version 1
40
126) Restaurants in London, Paris, and New York want diners to experience eating in pitch darkness to heighten their senses of taste and smell (Vanity Fair, December 2011). Suppose 400 people were asked, “If given the opportunity, would you eat at one of these restaurants?" The accompanying contingency table, cross-classified by age, would be produced. 18–29
30–44
45–64
65+
Yes
50
39
76
65
No
50
61
24
35
a.What is the probability that a respondent would eat at one of these restaurants? b.What is the probability that a respondent would eat at one of these restaurants or is in the 30–44 age bracket? c.Given that the respondent would eat at one of these restaurants, what is the probability that he or she is in the 30–44 age bracket? d.Is whether a respondent would eat at one of these restaurants independent of one’s age? Explain using probabilities.
127) Members of the saxophone section of a marching band are asked to attend saxophoneonly practices once a week. These practices are not mandatory, but they are the only opportunity for the entire section to play together in preparation for the marching season. The section leader has noted that 70% of the section regularly attends practices. Further, given that they attend regularly, there is a 40% chance of earning an A on their performance when later graded by the band director. If they do not attend regularly, there is only a 10% chance of earning an A on their performance. What is the probability that a randomly chosen musician receives an A on their performance?
Version 1
41
128) A certain weightlifter is prone to back injury. He finds that he has a 20% chance of hurting his back if he uses the proper form of bending at the hips and keeping his spine locked. The probability that he will hurt his back with bad form is 95%. The probability that he uses proper form is 75%. What is the probability he hurts his back?
129) Approximately 70% of the state of Pennsylvania sits on a shale formation from which natural gas may be extracted. If a geological test is positive, it has an 80% accuracy rate in correctly identifying a productive drilling site (shale is under the ground). If there is no shale under the ground, geological testing is falsely positive with a probability of 20%. Suppose the result of the geological test comes back positive (the test says there is shale under the ground). It is most important for us to know the probability that there is indeed shale under the ground given a positive geological test result. Find this probability.
130) An investor in Apple is worried the latest management earnings forecast is too aggressive and the company will fall short. His favorite analyst that covers Apple is going to release his report on Apple the week before the earnings announcement. Report stands for the analyst’s report, and Forecast stands for the earnings announcement. Prior Probabilities
Conditional Probabilities
P(Good Report) = 0.2
P(Below Forecast | Good Report) = 0.1
P(Medium Report) = 0.5
P(Below Forecast | Medium Report) = 0.4
P(Bad Report) = 0.3
P(Below Forecast | Bad Report) = 0.9
What is the probability the earnings announcement is below the forecast?
Version 1
42
131) An investor in Apple is worried the latest management earnings forecast is too aggressive and the company will fall short. His favorite analyst that covers Apple is going to release his report on Apple the week before the earnings announcement. Report stands for the analyst’s report, and Forecast stands for the earnings announcement. Prior Probabilities
Conditional Probabilities
P(Good Report) = 0.2
P(Below Forecast | Good Report) = 0.1
P(Medium Report) = 0.5
P(Below Forecast | Medium Report) = 0.4
P(Bad Report) = 0.3
P(Below Forecast | Bad Report) = 0.9
What is the probability the analyst issued a good report given Apple’s earnings announcement was below the forecast?
132) A certain state wants to use three letters followed by three digits on their license plates. If each letter and each digit can be used only once, how many possible license plates can be created?
133) A pension plan gives you 12 possible funds in which to invest, but you are allowed to hold four at any given point in time. How many possible combinations of the 12 funds could you have in your portfolio?
134) There are three unfilled seats on a small plane, but 10 people showed up to try to get one of those three unfilled seats. If the airline picks three people at random to get on the plane, what is the probability you and your two kids will get on the plane? Version 1
43
135) A group of students has 12 girls and 10 boys. A project team, including three girls and two boys, must be created. Find the number of possible project teams.
136) 5}.
For an experiment in which a single die is rolled, the sample space may be {1, 1, 2, 3, 4, ⊚ ⊚
true false
137)
The probability of a union of events can be greater than 1. ⊚ true ⊚ false
138)
Events are exhaustive if they do not share common outcomes of a sample space. ⊚ true ⊚ false
139)
Mutually exclusive events may share common outcomes of a sample space. ⊚ true ⊚ false
Version 1
44
140) Mutually exclusive and collectively exhaustive events contain all outcomes of a sample space, and they do not share any common outcomes. ⊚ true ⊚ false
141) The union of two events A and B, denoted by A ∪ B, does not have outcomes from both A and B. ⊚ true ⊚ false
142) The complement of an event A, denoted by AC, within the sample space S, is the event consisting of all outcomes of A that are not in S. ⊚ true ⊚ false
143) The intersection of two events A and B, denoted by A ∩ B, is the event consisting of all outcomes that are in A and B. ⊚ true ⊚ false
144)
Two events A and B can be both mutually exclusive and independent at the same time. ⊚ true ⊚ false
145)
Subjective probability is assigned to an event by drawing on logical analysis. ⊚ true ⊚ false
Version 1
45
146)
For two independent events A and B, the probability of their intersection is zero. ⊚ true ⊚ false
147)
Two events can be both mutually exclusive and independent at the same time. ⊚ true ⊚ false
148) Two events A and B are independent if the probability of one does not influence the probability of the other. ⊚ true ⊚ false
149) The total probability rule is useful only when the unconditional probability is expressed in terms of probabilities conditional on two mutually exclusive and exhaustive events. ⊚ true ⊚ false
150) Bayes’ theorem uses the total probability rule to update the prior probability of an event that has not been affected by any new evidence. ⊚ true ⊚ false
151) Bayes’ theorem is used to update prior probabilities based on the arrival of new relevant information. ⊚ true ⊚ false
Version 1
46
152)
Combinations are used when the order in which different objects are arranged matters. ⊚ true ⊚ false
153)
Permutations are used when the order in which different objects are arranged matters. ⊚ true ⊚ false
154) Consider these events. A = The survey respondent is less than 40 years old. B = The survey respondent is 40 years or older. Events A and B are mutually exclusive and exhaustive. ⊚ true ⊚ false
155) The addition rule is used to determine the probability of the union of two events occurring and is defined as a sum of the probabilities of both events. ⊚ true ⊚ false
156) Joint probability of two independent events A and B equals the sum of the individual probabilities of A and B. ⊚ true ⊚ false
157)
The total probability rule is defined as P(A) = P(A ⊚ ⊚
Version 1
B) + P(A
)
true false
47
158) Bayes’ theorem is a rule that uses the total probability rule and the addition rule to update the probability of the event. ⊚ true ⊚ false
159)
0! = 0. ⊚ true ⊚ false
Version 1
48
Answer Key Test name: Chap 04_4e 1) sample space 2) 3) empirical 4) [law of large numbers, 50] 5) mutually exclusive 6) dependent 7) probability tree 8) C 9) B 10) C 11) C 12) C 13) C 14) C 15) D 16) C 17) C 18) C 19) B 20) A 21) B 22) A 23) C 24) C 25) C 26) A Version 1
49
27) B 28) A 29) D 30) C 31) D 32) A 33) C 34) D 35) A 36) C 37) A If P(A∪B)C = 0.10 the P(A∪B) = 0.90 P(A∪B) = P(A) + P(B) − P(A∩B) So, 0.90 = 0.60 + 0.30 − P(A∩B) and P(A∩B) = 0 38) B 39) D 40) C 41) D 42) A 43) B 44) D 45) C 46) C 47) C 48) D 49) B 50) C 51) C 52) B 53) C Version 1
50
54) B 55) A 56) A 57) D 58) A 59) C 60) B 61) B 62) C 63) A 64) B 65) D 66) D 67) B 68) C 69) C 70) B 71) D 72) B 73) D 74) A 75) C 76) D 77) D 78) C 79) A 80) D 81) D 82) B 83) D Version 1
51
84) C 85) C 86) B 87) C 88) D 89) A 90) D 91) A 92) D 93) C 94) A 95) C 96) A 97) B 98) D 99) B 100) C 101) D 102) A 103) 104) 105) 106)
0.64 0.865 0.3333 1/12 or 0.0833
107) a.88%; the addition rule. b.12%; the compliment rule. 108) Whether it rains next week is not dependent on whether it rains this week because P(rain this week ∩ rain next week) = P(rain this week) × P(rain next week). 109) 0.95 110) 0.05
Version 1
52
111) 0.90 112) 0.30
113) The events "Getting an A in Marketing" and "Making the playoffs" are not independent. 114) 0.375 120) Selecting a 35-or-older individual is not independent of selecting an individual who prefers swimming. 124) a.0.35 b.0.70 c.0.40 d.Not independent
125) 18–29
30–44
45–64
65+
Total
Yes
0.125
0.975
0.19
0.1625
0.575
No
0.125
0.1525
0.06
0.0875
0.425
Total
0.250
0.250
0.250
0.250
1.000
126) a.0.575 b.0.7275 c.0.1696 d.Not independent 127) 0.31 128) 0.3875 129) 0.9032
130) 0.49 131) 0.04
136) FALSE 137) FALSE 138) FALSE 139) FALSE 140) TRUE 141) FALSE 142) FALSE Version 1
53
143) TRUE 144) FALSE 145) FALSE 146) FALSE 147) FALSE 148) TRUE 149) FALSE 150) FALSE 151) TRUE 152) FALSE 153) TRUE 154) TRUE 155) FALSE 156) FALSE 157) TRUE 158) FALSE 159) FALSE
Version 1
54
CHAPTER 05 – 4e 1) risk.
A risk-averse consumer demands a(n) _______ expected gain as compensation for taking
2) The risk of the portfolio depends not only on the individual risks of the assets but also on the ________ between the asset returns.
3) To illustrate the possible outcomes and their associated probabilities we can construct a probability ______.
4)
An Excel’s function _________ is used for calculating Poisson probabilities.
5) The hypergeometric probability distribution is appropriate in applications where we cannot assume the trials are _________.
6)
<p>The following formula defines the ________ of a hypergeometric random variable
7)
Which of the following can be represented by a discrete random variable? A) The number of obtained spots when rolling a six-sided die B) The height of college students C) The average outside temperature taken every day for two weeks D) The finishing time of participants in a cross-country meet
Version 1
1
8)
Which of the following can be represented by a discrete random variable? A) The circumference of a randomly generated circle B) The time of a flight between Chicago and New York C) The number of defective light bulbs in a sample of five D) The average distance achieved in a series of long jumps
9)
Which of the following can be represented by a continuous random variable? A) The time of a flight between Chicago and New York B) The number of defective light bulbs in a sample of five C) The number of arrivals to a drive-through bank window in a four-hour period D) The score of a randomly selected student on a five-question multiple-choice quiz
10)
Which of the following can be represented by a continuous random variable?
A) The average temperature in Tampa, Florida, during the month of July B) The number of typos found on a randomly selected page of this test bank C) The number of students who will get financial assistance in a group of 50 randomly selected students D) The number of customers who visit a department store between 10:00 a.m. and 11:00 a.m. on Mondays
11) Which of the following is NOT a characteristic of the probability mass function of a discrete random variable X? A) The sum of probabilities P(X = x) over all possible values x is 1. B) For every possible value x, the probability P(X = x) is between 0 and 1. C) Describes all possible values x with the associated probabilities P(X = x). D) The probability P(X ≤ x) for every possible value x is equal to 1.
Version 1
2
12)
What are the two key properties of a discrete probability distribution? A) 0≤P (X = x) ≤1 and ∑P (X = xi ) = 0 B) 0≤P (X = x) ≤1 and ∑P (X = xi ) = 1 C) −1≤P (X = x) ≤1 and ∑P (X = xi ) = 1 D) −1≤P (X = x) ≤1 and ∑P (X = xi ) = 0
13)
Consider the following discrete probability distribution. x
−10
0
10
20
P(X = x)
0.35
0.10
0.15
0.40
What is the probability that X is 0? A) 0.10 B) 0.35 C) 0.55 D) 0.65
14)
Consider the following discrete probability distribution. x
–10
0
10
20
P(X = x)
0.30
0.18
0.11
0.41
What is the probability that X is greater than 0? A) 0.18 B) 0.30 C) 0.52 D) 0.62
15)
Consider the following discrete probability distribution. x
−10
0
10
20
P(X = x)
0.35
0.10
0.15
0.40
Version 1
3
What is the probability that X is greater than 0? A) 0.10 B) 0.35 C) 0.55 D) 0.65
16)
Consider the following discrete probability distribution. x
−10
0
10
20
P(X = x)
0.35
0.10
0.15
0.40
What is the probability that X is negative? A) 0.00 B) 0.10 C) 0.15 D) 0.35
17)
Consider the following discrete probability distribution. x
−10
0
10
20
P(X = x)
0.35
0.10
0.15
0.40
What is the probability that X is less than 5? A) 0.10 B) 0.15 C) 0.35 D) 0.45
18) X.
Consider the following cumulative distribution function for the discrete random variable x
1
2
3
4
P(X ≤ x)
0.30
0.44
0.72
1.00
Version 1
4
What is the probability that X is less than or equal to 2? A) 0.14 B) 0.30 C) 0.44 D) 0.56
19) X.
Consider the following cumulative distribution function for the discrete random variable x
1
2
3
4
P(X ≤ x)
0.26
0.49
0.58
1.00
What is the probability that X equals 2? A) 0.23 B) 0.65 C) 0.49 D) 0.26
20) X.
Consider the following cumulative distribution function for the discrete random variable x
1
2
3
4
P(X ≤ x)
0.30
0.44
0.72
1.00
What is the probability that X equals 2? A) 0.14 B) 0.30 C) 0.44 D) 0.56
Version 1
5
21) X.
Consider the following cumulative distribution function for the discrete random variable x
1
2
3
4
5
P(X ≤ x)
0.10
0.35
0.75
0.85
1.00
What is the probability that X is greater than 3? A) 0.75 B) 0.45 C) 0.35 D) 0.25
22)
We can think of the expected value of a random variable X as ________.
A) the long-run average of the random variable values generated over 100 independent repetitions B) the long-run average of the random variable values generated over 1,000 independent repetitions C) the long-run average of the random variable values generated over infinitely many independent repetitions D) the long-run average of the random variable values generated over a finite number of independent repetitions
23) The expected value of a random variable X cannot be referred to or denoted as ___________. A) µ B) E(X) C) the population mean D) the mode
24)
Consider the following probability distribution. xi
Version 1
P(X = xi)
6
0
0.1
1
0.1
2
0.1
3
0.7
The expected value is _____. A) 1.4 B) 2.0 C) 2.4 D) 3.0
25)
Consider the following probability distribution. xi
P(X = xi)
0
0.1
1
0.2
2
0.3
3
0.4
The expected value is _____.
A) 0.9 B) 1.5 C) 2.0 D) 2.5
26)
Consider the following probability distribution. xi
P(X = xi)
0
0.1
1
0.2
2
0.4
3
0.3
The expected value is _____.
Version 1
7
A) 0.89 B) 0.94 C) 1.65 D) 1.90
27)
Consider the following probability distribution. xi
P(X = xi)
0
0.1
1
0.2
2
0.4
3
0.3
The standard deviation is _________. A) 0.89 B) 0.94 C) 1.65 D) 1.90
28)
Consider the following probability distribution. xi
P(X = xi)
–2
0.2
–1
0.1
0
0.3
1
0.4
The expected value is _____. A) −1.0 B) −0.1 C) 0.1 D) 1.0
Version 1
8
29)
Consider the following probability distribution. xi
P(X = xi)
−2
0.2
−1
0.1
0
0.3
1
0.4
The variance is _____. A) 1.14 B) 1.29 C) 1.65 D) 1.94
30)
Consider the following probability distribution. xi
P(X = xi)
–2
0.2
–1
0.1
0
0.3
1
0.4
The standard deviation is _____.
A) 1.14 B) 1.29 C) 1.65 D) 1.94
31) An analyst has constructed the following probability distribution for firm X’s predicted return for the upcoming year. Return
Probability
−5
0.20
0
0.30
5
0.40
10
0.10
Version 1
9
The expected value and the variance of this distribution are _____ and _____. Expected Value
Variance
A.
1
4.58
B.
2
21.00
C.
2.5
4.58
D.
2.5
21.00
A) Option A B) Option B C) Option C D) Option D
32) An analyst believes that a stock’s return depends on the state of the economy, for which she has estimated the following probabilities: State of the Economy
Probability
Return
Good
0.60
15%
Normal
0.30
10%
Poor
0.10
−2%
According to the analyst’s estimates, the expected return of the stock is ____. A) 7.8% B) 11.4% C) 11.8% D) 15.0%
33)
An analyst estimates that the year-end price of a stock has the following probabilities: Stock Price
Probability
$80
0.10
$85
0.30
$90
0.40
$95
0.20
The stock’s expected price at the end of the year is _______.
Version 1
10
A) $87.50 B) $88.50 C) $89.00 D) $90.00
34) The number of homes sold by a realtor during a month has the following probability distribution: Number Sold
Probability
0
0.20
1
0.40
2
0.40
What is the probability that the realtor will sell at least one house during a month? A) 0.20 B) 0.40 C) 0.60 D) 0.80
35) The number of homes sold by a realtor during a month has the following probability distribution: Number Sold
Probability
0
0.20
1
0.40
2
0.40
What is the probability that the realtor sells no more than one house during a month? A) 0.20 B) 0.40 C) 0.60 D) 0.80
Version 1
11
36) The number of homes sold by a realtor during a month has the following probability distribution: Number Sold
Probability
0
0.20
1
0.40
2
0.40
What is the expected number of homes sold by the realtor during a month? A) 1 B) 1.2 C) 1.5 D) 2
37) The number of homes sold by a realtor during a month has the following probability distribution: Number Sold
Probability
0
0.20
1
0.10
2
0.70
What is the standard deviation of the number of homes sold by the realtor during a month? A) 0.81 B) 0.65 C) 1.50 D) 1.50
38) The number of homes sold by a realtor during a month has the following probability distribution: Number Sold
Version 1
Probability
0
0.20
1
0.40
2
0.40
12
What is the standard deviation of the number of homes sold by the realtor during a month? A) 0.56 B) 0.75 C) 1 D) 1.2
39) The number of cars sold by a car salesperson during each of the last 25 weeks is the following: Number Sold
Frequency
0
10
1
10
2
5
What is the probability that the salesperson will sell one car during a week? A) 0.20 B) 0.40 C) 0.60 D) 0.80
40) The number of cars sold by a car salesperson during each of the last 25 weeks is the following: Number Sold
Frequency
0
10
1
10
2
5
What is the probability that the salesperson sells no more than one car during a week? A) 0.20 B) 0.40 C) 0.60 D) 0.80
Version 1
13
41) The number of cars sold by a car salesperson during each of the last 25 weeks is the following: Number Sold
Frequency
0
10
1
10
2
5
What is the expected number of cars sold by the salesperson during a week? A) 0 B) 0.8 C) 1 D) 1.5
42) The number of cars sold by a car salesperson during each of the last 25 weeks is the following: Number Sold
Frequency
0
10
1
10
2
5
What is the standard deviation of the number of cars sold by the salesperson during a week? A) 0.56 B) 0.75 C) 0.80 D) 1
43)
A consumer who is risk averse is best characterized as __________.
Version 1
14
A) a consumer who may accept a risky prospect even if the expected gain is negative B) a consumer who demands a positive expected gain as compensation for taking risk C) a consumer who completely ignores risk and makes his or her decisions based solely on expected values D) a consumer who is indifferent to risk
44) A consumer who is risk neutral is best characterized as ______________________________________________________. A) a consumer who may accept a risky prospect even if the expected gain is negative B) a consumer who demands a positive expected gain as compensation for taking risk C) a consumer who completely ignores risk and makes his or her decisions based solely on expected values D) a consumer who underplays risk
45)
How would you characterize a consumer who is risk loving?
A) A consumer who may accept a risky prospect even if the expected gain is negative. B) A consumer who demands a positive expected gain as compensation for taking risk. C) A consumer who completely ignores risk and makes his or her decisions solely on the basis of expected values. D) A consumer who is indifferent to risk.
46) An investor has a $200,000 portfolio of which $120,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are shown in the accompanying table. Stock A
Stock B
E(RA ) = µA = 8.4%
E(RB ) = µB = 6.5%
σA = 11.82%
σB = 7.19%
Cov(RA,RB ) = σAB = 17.10%
Version 1
15
The correlation coefficient between the returns on Stocks A and B is _____. A) −0.17 B) 0.20 C) 0.80 D) 4.97
47) An investor has a $200,000 portfolio of which $120,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are shown in the accompanying table. Stock A
Stock B
E(RA ) = µA = 8.4%
E(RB ) = µB = 6.5%
σA = 11.82%
σB = 7.19%
Cov(RA,RB ) = σAB = 17.10%
The expected return of the portfolio is ____. A) 2.60% B) 5.04% C) 7.64% D) 14.90%
48) An investor has a $200,000 portfolio of which $120,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are shown in the accompanying table. Stock A
Stock B
E(RA ) = µA = 8.4%
E(RB ) = µB = 6.5%
σA = 11.82%
σB = 7.19%
Cov(RA,RB ) = σAB = 17.10%
The portfolio variance is ______.
Version 1
16
A) 8.17% B) 13.80% C) 66.78 (%)2 D) 190.70 (%)2
49) An investor has a $100,000 portfolio of which $75,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are shown in the accompanying table. Stock A
Stock B
E(RA ) = µA = 8.0%
E(RB ) = µB = 5.5%
σA = 11.82%
σB = 7.19%
Cov(RA,RB ) = σAB = 17.10%
The expected return of the portfolio is _____. A) 6.30% B) 6.75% C) 7.38% D) 13.50%
50) An investor has a $100,000 portfolio of which $75,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are shown in the accompanying table. Stock A
Stock B
E(RA ) = µA = 8.4%
E(RB ) = µB = 6.5%
σA = 10.80%
σB = 7.29%
Cov(RA,RB ) = σAB = 16.70%
The standard deviation of the portfolio is _____.
Version 1
17
A) 9.39 (%) B) 14.19 (%). C) 88.23 (%)2. D) 201.41 (%)2.
51) Given the information in the accompanying table, calculate the correlation coefficient between the returns on Stocks A and B. Stock A
Stock B
E(RA ) = µA = 8.4%
E(RB ) = µB = 6.5%
σA = 10.80%
σB = 7.29%
Cov(RA,RB ) = σAB = 16.70%
A) −0.212 B) −0.167 C) 0.167 D) 0.212
52) Which of the following statements is the most accurate about a binomial random variable? A) It has a bell-shaped distribution. B) It is a continuous random variable. C) It counts the number of successes in a given number of trials. D) It counts the number of successes in a specified time interval or region.
53) It is known that 15% of the calculators shipped from a particular factory are defective. What is the probability that exactly four of ten chosen calculators are defective?
Version 1
18
A) 0.99 B) 0.01 C) 0.4 D) 0.04
54) It is known that 10% of the calculators shipped from a particular factory are defective. What is the probability that none in a random sample of four calculators is defective? A) 0.0010 B) 0.2916 C) 0.3439 D) 0.6561
55) It is known that 10% of the calculators shipped from a particular factory are defective. What is the probability that at least one in a random sample of four calculators is defective? A) 0.0010 B) 0.2916 C) 0.3439 D) 0.6561
56) It is known that 10% of the calculators shipped from a particular factory are defective. What is the probability that no more than one in a random sample of four calculators is defective? A) 0.2916 B) 0.3439 C) 0.6561 D) 0.9477
Version 1
19
57) Thirty percent of the CFA candidates have a degree in economics. A random sample of three CFA candidates is selected. What is the probability that none of them has a degree in economics? A) 0.027 B) 0.300 C) 0.343 D) 0.900
58) Thirty percent of the CFA candidates have a degree in economics. A random sample of three CFA candidates is selected. What is the probability that at least one of them has a degree in economics? A) 0.300 B) 0.343 C) 0.657 D) 0.900
59) On a particular production line, the likelihood that a light bulb is defective is 5%. Ten light bulbs are randomly selected. What is the probability that two light bulbs will be defective? A) 0.0105 B) 0.0746 C) 0.3151 D) 0.5987
60) On a particular production line, the likelihood that a light bulb is defective is 5%. Ten light bulbs are randomly selected. What is the probability that none of the light bulbs will be defective?
Version 1
20
A) 0.0105 B) 0.0746 C) 0.3151 D) 0.5987
61) On a particular production line, the likelihood that a light bulb is defective is 12%. Four light bulbs are randomly selected. What is the probability that at most 2 of the light bulbs will be defective? A) 0.0062 B) 0.0052 C) 0.9937 D) 0.9832
62) On a particular production line, the likelihood that a light bulb is defective is 5%. Ten light bulbs are randomly selected. What is the probability that at most 3 of the light bulbs will be defective? A) 0.0105 B) 0.0115 C) 0.9885 D) 0.9990
63) On a particular production line, the likelihood that a light bulb is defective is 5%. Ten light bulbs are randomly selected. What are the mean and variance of the number of defective bulbs? A) 0.475 and 0.475 B) 0.475 and 0.6892 C) 0.50 and 0.475 D) 0.50 and 0.6892
Version 1
21
64) According to a study by the Centers for Disease Control and Prevention, about 33% of U.S. births are Caesarean deliveries. Suppose seven expectant mothers are randomly selected. What is the probability that two of the expectant mothers will have a Caesarean delivery? A) 0.0147 B) 0.0606 C) 0.2090 D) 0.3088
65) According to a study by the Centers for Disease Control and Prevention, about 33% of U.S. births are Caesarean deliveries. Suppose seven expectant mothers are randomly selected. What is the probability that at least one of the expectant mothers will have a Caesarean delivery? A) 0.0606 B) 0.2090 C) 0.9394 D) 0.9742
66) According to a study by the Centers for Disease Control and Prevention, about 33% of U.S. births are Caesarean deliveries. Suppose seven expectant mothers are randomly selected. What is the probability that at most two of the expectant mothers will have a Caesarean delivery? A) 0.2696 B) 0.3088 C) 0.4217 D) 0.5783
67) According to a study by the Centers for Disease Control and Prevention, about 33% of U.S. births are Caesarean deliveries. Suppose seven expectant mothers are randomly selected. The expected number of mothers who will not have a Caesarean delivery is ______.
Version 1
22
A) 1.24 B) 2.31 C) 3.50 D) 4.69
68) According to a study by the Centers for Disease Control and Prevention, about 33% of U.S. births are Caesarean deliveries. Suppose seven expectant mothers are randomly selected. What is the standard deviation of the number of mothers who will have a Caesarean delivery? A) 1.24 B) 1.54 C) 2.31 D) 4.69
69) For a particular clothing store, a marketing firm finds that 16% of $10-off coupons delivered by mail are redeemed. Suppose six customers are randomly selected and are mailed $10-off coupons. What is the probability that three of the customers redeem the coupon? A) 0.0486 B) 0.1912 C) 0.3513 D) 0.4015
70) For a particular clothing store, a marketing firm finds that 5% of $10-off coupons delivered by mail are redeemed. Suppose three customers are randomly selected and are mailed $10-off coupons. What is the probability that no more than one of the customers redeems the coupon?
Version 1
23
A) 0.5913 B) 0.6415 C) 0.9928 D) 0.4872
71) For a particular clothing store, a marketing firm finds that 16% of $10-off coupons delivered by mail are redeemed. Suppose six customers are randomly selected and are mailed $10-off coupons. What is the probability that no more than one of the customers redeems the coupon? A) 0.2472 B) 0.3513 C) 0.4015 D) 0.7528
72) For a particular clothing store, a marketing firm finds that 16% of $10-off coupons delivered by mail are redeemed. Suppose six customers are randomly selected and are mailed $10-off coupons. What is the probability that at least two of the customers redeem the coupon? A) 0.2472 B) 0.3513 C) 0.4015 D) 0.7528
73) For a particular clothing store, a marketing firm finds that 16% of $10-off coupons delivered by mail are redeemed. Suppose six customers are randomly selected and are mailed $10-off coupons. What is the expected number of coupons that will be redeemed?
Version 1
24
A) 0.81 B) 0.96 C) 3.42 D) 5.04
74) According to a Department of Labor report, the city of Detroit had a 20% unemployment rate in May. Eight working-age residents were chosen at random. What is the probability that exactly one of the residents was unemployed? A) 0.0419 B) 0.1678 C) 0.2936 D) 0.3355
75) According to a Department of Labor report, the city of Detroit had a 20% unemployment rate in May. Eight working-age residents were chosen at random. What is the probability that at least two of the residents were unemployed? A) 0.1678 B) 0.3355 C) 0.4967 D) 0.5033
76) According to a Department of Labor report, the city of Detroit had a 20% unemployment rate in May. Eight working-age residents were chosen at random. What is the probability that exactly four residents were unemployed? A) 0.0013 B) 0.0091 C) 0.0459 D) 0.1468
Version 1
25
77) According to a Department of Labor report, the city of Detroit had a 20% unemployment rate in May. Eight working-age residents were chosen at random. What was the expected number of unemployed residents, when eight working-age residents were randomly selected? A) 1.0 B) 1.6 C) 2.0 D) 6.4
78) Chauncey Billups, a current shooting guard for the Los Angeles Clippers, has a career free-throw percentage of 89.4%. Suppose he shoots six free throws in tonight’s game. What is the probability that Billups makes all six free throws? A) 0.1070 B) 0.3632 C) 0.5105 D) 0.6530
79) Chauncey Billups, a current shooting guard for the Los Angeles Clippers, has a career free-throw percentage of 92.80%. Suppose he shoots six free throws in tonight’s game. What is the probability that Billups makes five or more of his free throws? A) 0.5728 B) 0.4255 C) 0.9360 D) 0.9563
80) Chauncey Billups, a current shooting guard for the Los Angeles Clippers, has a career free-throw percentage of 89.4%. Suppose he shoots six free throws in tonight’s game. What is the probability that Billups makes five or more of his free throws?
Version 1
26
A) 0.3632 B) 0.5105 C) 0.8737 D) 0.8940
81) Chauncey Billups, a current shooting guard for the Los Angeles Clippers, has a career free-throw percentage of 89.4%. Suppose he shoots six free throws in tonight’s game. What is the expected number of free throws that Billups will make? A) 0.636 B) 5.364 C) 5.686 D) 6.000
82) Chauncey Billups, a current shooting guard for the Los Angeles Clippers, has a career free-throw percentage of 89.4%. Suppose he shoots six free throws in tonight’s game. What is the standard deviation of the number of free throws that Billups will make? A) 0.5364 B) 0.5686 C) 0.7540 D) 5.6860
83)
Which of the following statements is the most accurate about a Poisson random variable? A) It counts the number of successes in a given number of trials. B) It counts the number of successes in a specified time or space interval. C) It is a continuous random variable. D) It has a bell-shaped distribution.
Version 1
27
84) The foreclosure crisis has been particularly devastating in housing markets in much of the south and west United States, but even when analysis is restricted to relatively strong housing markets the numbers are staggering. For example, in 2017 an average of three residential properties were auctioned off each weekday in the city of Boston, up from an average of one per week in 2011. What is the probability that exactly four foreclosure auctions occurred on a randomly selected weekday of 2017 in Boston? A) 0.1680 B) 0.1954 C) 0.2240 D) 0.8153
85) The foreclosure crisis has been particularly devastating in housing markets in much of the south and west United States, but even when analysis is restricted to relatively strong housing markets the numbers are staggering. For example, in 2017 an average of three residential properties were auctioned off each weekday in the city of Boston, up from an average of one per week in 2011. What is the probability that at least one foreclosure auction occurred in Boston on a randomly selected weekday of 2017? A) 0.0498 B) 0.1494 C) 0.8009 D) 0.9502
86) The foreclosure crisis has been particularly devastating in housing markets in much of the south and west United States, but even when analysis is restricted to relatively strong housing markets the numbers are staggering. For example, in 2017 an average of three residential properties were auctioned off each weekday in the city of Boston, up from an average of one per week in 2011. What is the probability that no more than two foreclosure auctions occurred on a randomly selected weekday of 2017 in Boston?
Version 1
28
A) 0.1991 B) 0.2240 C) 0.4232 D) 0.5768
87) The foreclosure crisis has been particularly devastating in housing markets in much of the south and west United States, but even when analysis is restricted to relatively strong housing markets the numbers are staggering. For example, in 2017 an average of three residential properties were auctioned off each weekday in the city of Boston, up from an average of one per week in 2011. What is the probability that exactly 10 foreclosure auctions occurred during a randomly selected five-day week in 2017 in Boston? A) 0.0008 B) 0.0486 C) 0.1185 D) 0.9514
88) A bank manager estimates that an average of four customers enter the tellers’ queue every nine minutes. Assume that the number of customers that enter the tellers’ queue is Poisson distributed. What is the probability that exactly five customers enter the queue in a randomly selected nine-minute period? A) 0.1999 B) 0.2466 C) 0.0661 D) 0.1563
89) A bank manager estimates that an average of two customers enter the tellers’ queue every five minutes. Assume that the number of customers that enter the tellers’ queue is Poisson distributed. What is the probability that exactly three customers enter the queue in a randomly selected five-minute period?
Version 1
29
A) 0.0902 B) 0.1804 C) 0.2240 D) 0.2707
90) A bank manager estimates that an average of two customers enters the tellers’ queue every five minutes. Assume that the number of customers that enters the tellers’ queue is Poisson distributed. What is the probability that fewer than two customers enter the queue in a randomly selected five-minute period? A) 0.1353 B) 0.2707 C) 0.4060 D) 0.6767
91) A bank manager estimates that an average of two customers enters the tellers’ queue every five minutes. Assume that the number of customers that enters the tellers’ queue is Poisson distributed. What is the probability that at least two customers enter the queue in a randomly selected five-minute period? A) 0.1353 B) 0.2707 C) 0.4060 D) 0.5940
92) A bank manager estimates that an average of two customers enters the tellers’ queue every five minutes. Assume that the number of customers that enters the tellers’ queue is Poisson distributed. What is the probability that exactly seven customers enter the queue in a randomly selected 15-minute period?
Version 1
30
A) 0.0034 B) 0.1033 C) 0.1377 D) 0.1606
93) Cars arrive randomly at a tollbooth at a rate of 30 cars per 8 minutes during rush hour. What is the probability that exactly five cars will arrive over a five-minute interval during rush hour? A) 0.4623 B) 0.0874 C) 0.0123 D) 0.0001
94) Cars arrive randomly at a tollbooth at a rate of 20 cars per 10 minutes during rush hour. What is the probability that exactly five cars will arrive over a five-minute interval during rush hour? A) 0.0378 B) 0.0500 C) 0.1251 D) 0.5000
95) A roll of steel is manufactured on a processing line. The anticipated number of defects in a 10-foot segment of this roll is two. What is the probability of no defects in 10 feet of steel? A) 0.0002 B) 0.1353 C) 0.1804 D) 0.8647
Version 1
31
96) According to geologists, the San Francisco Bay Area experiences five earthquakes with a magnitude of 6.5 or greater every 100 years. What is the probability that no earthquakes with a magnitude of 6.5 or greater will strike the San Francisco Bay Area in the next 40 years? A) 0.0067 B) 0.0337 C) 0.1353 D) 0.2707
97) According to geologists, the San Francisco Bay Area experiences five earthquakes with a magnitude of 6.5 or greater every 100 years. What is the probability that more than two earthquakes with a magnitude of 6.5 or greater will strike the San Francisco Bay Area in the next 40 years? A) 0.1353 B) 0.2706 C) 0.3233 D) 0.8754
98) According to geologists, the San Francisco Bay Area experiences six earthquakes with a magnitude of 4.6 or greater every 100 years. What is the probability that one or more earthquakes with a magnitude of 4.6 or greater will strike the San Francisco Bay Area in the next year? A) 0.4972 B) 0.1447 C) 0.9418 D) 0.0582
Version 1
32
99) According to geologists, the San Francisco Bay Area experiences five earthquakes with a magnitude of 6.5 or greater every 100 years. What is the probability that one or more earthquakes with a magnitude of 6.5 or greater will strike the San Francisco Bay Area in the next year? A) 0.0488 B) 0.1353 C) 0.4878 D) 0.9512
100) According to geologists, the San Francisco Bay Area experiences five earthquakes with a magnitude of 6.5 or greater every 100 years. What is the expected value of the number of earthquakes with a magnitude of 6.5 or greater striking the San Francisco Bay Area in the next 40 years? A) 1.414 B) 2.000 C) 2.236 D) 5.000
101) According to geologists, the San Francisco Bay Area experiences four earthquakes with a magnitude of 5.1 or greater every 100 years. What is the standard deviation of the number of earthquakes with a magnitude of 5.1 or greater striking the San Francisco Bay Area in the next 40 years? A) 1.836 B) 4.000 C) 1.600 D) 1.265
Version 1
33
102) According to geologists, the San Francisco Bay Area experiences five earthquakes with a magnitude of 6.5 or greater every 100 years. What is the standard deviation of the number of earthquakes with a magnitude of 6.5 or greater striking the San Francisco Bay Area in the next 40 years? A) 1.414 B) 2.000 C) 2.236 D) 5.000
103)
Which of the following is true about the hypergeometric distribution? A) The trials are independent and the probability of success may change from trial to
trial. B) The trials are independent and the probability of success does not change from trial to trial. C) The trials are not independent and the probability of success may change from trial to trial. D) The trials are not independent and the probability of success does not change from trial to trial.
104) An urn contains 15 balls, seven of which are red. The selection of a red ball is desired and is therefore considered to be a success. If a person draws three balls from the urn, what is the probability of two successes? A) 0.3692 B) 0.2101 C) 0.7320 D) 0.8919
105) An urn contains 12 balls, five of which are red. The selection of a red ball is desired and is therefore considered to be a success. If a person draws three balls from the urn, what is the probability of two successes? Version 1
34
A) 0.1591 B) 0.3182 C) 0.6810 D) 0.8409
106) An urn contains 12 balls, five of which are red. Selection of a red ball is desired and is therefore considered to be a success. If three balls are selected, what is the expected value of the distribution of the number of selected red balls? A) 0.4167 B) 0.8333 C) 0.5833 D) 1.2500
107) Suppose a baseball team has 14 players on the roster who are not members of the pitching staff. Of those 14 players, assume that three have recently taken a performance-enhancing drug. Suppose the league decides to randomly test five members of the team. What is the probability that exactly two of the tested players are found to have taken a performance-enhancing drug? A) 0.2308 B) 0.2473 C) 0.4945 D) 0.7692
108) Suppose a baseball team has 14 players on the roster who are not members of the pitching staff. Of those 14 players, assume that two have recently taken a performance-enhancing drug. Suppose the league decides to randomly test seven members of the team. What is the probability that at least one of the tested players is found to have taken a performance-enhancing drug?
Version 1
35
A) 0.2308 B) 0.7692 C) 0.2473 D) 0.4595
109) Suppose a baseball team has 14 players on the roster who are not members of the pitching staff. Of those 14 players, assume that three have recently taken a performance-enhancing drug. Suppose the league decides to randomly test five members of the team. What is the probability that at least one of the tested players is found to have taken a performance-enhancing drug? A) 0.2308 B) 0.2473 C) 0.4945 D) 0.7692
110) There are currently 18 pit bulls at the pound. Of the 18 pit bulls, four have attacked another dog in the last year. Joe, a member of the staff, randomly selects six of the pit bulls for his group. What is the probability that exactly one of the pit bulls in Joe’s group attacked another dog last year? A) 0.1618 B) 0.3235 C) 0.4314 D) 0.4853
111) There are currently 18 pit bulls at the pound. Of the 18 pit bulls, four have attacked another dog in the last year. Joe, a member of the staff, randomly selects six of the pit bulls for his group. What is the probability that at least one of the pit bulls in Joe’s group attacked another dog last year?
Version 1
36
A) 0.1618 B) 0.4314 C) 0.5686 D) 0.8382
112) There are currently 18 pit bulls at the pound. Of the 18 pit bulls, four have attacked another dog in the last year. Joe, a member of the staff, randomly selects six of the pit bulls for his group. What is the probability that no more than two of the pit bulls in Joe’s group attacked another dog last year? A) 0.0833 B) 0.3235 C) 0.5931 D) 0.9167
113) The following table shows the number of students who did not attend statistics class over the 35 class meetings last semester. Number of students
Frequency
0
4
1
12
2
9
3
5
4
5
What is the expected number of absent students per class for the next semester? A) 2.123 B) 1.857 C) 1.965 D) 1.246
Version 1
37
114) The following table shows the number of students who did not attend statistics class over the 35 class meetings last semester. Number of students
Frequency
0
4
1
12
2
9
3
5
4
5
What is the probability that there will be more than one student absent in class today? A) 0.54 B) 0.87 C) 0.70 D) 0.63
115) An investor owns a portfolio consisting of two mutual funds, A and B, with 35% invested in A. The following table lists the inputs for these funds. Measure
Fund A
Fund B
Expected Value
10
5
Variance
98
26
Covariance
22
The expected return for the portfolio return is ____. A) 5.76% B) 7.65% C) 6.75% D) 12.45%
116) An investor owns a portfolio consisting of two mutual funds, A and B, with 35% invested in A. The following table lists the inputs for these funds. Measure Expected Value
Version 1
Fund A
Fund B
10
5
38
Variance Covariance
98
26 22
The standard deviation for the portfolio return is ____. A) 5.74% B) 7.54% C) 6.57% D) 3.74%
117) According to the Mortgage Bankers Association, 8% of U.S. mortgages were delinquent last year. A delinquent mortgage is one that has missed at least one payment but has not yet gone to foreclosure. A random sample of eight mortgages was selected. What is the probability that exactly one of these mortgages is delinquent? A) 0.1296 B) 0.3570 C) 0.4926 D) 0.6157
118) According to the Mortgage Bankers Association, 8% of U.S. mortgages were delinquent last year. A delinquent mortgage is one that has missed at least one payment but has not yet gone to foreclosure. A random sample of eight mortgages was selected. What is the probability that fewer than three of these mortgages are delinquent? A) 0.7018 B) 0.7583 C) 0.9789 D) 0.8225
119) Forty-four percent of consumers with credit cards carry balances from month to month. Four consumers with credit cards are randomly selected. What is the probability that all consumers carry a credit card balance?
Version 1
39
A) 0.1015 B) 0.0375 C) 0.3133 D) 0.1390
120) Forty-four percent of consumers with credit cards carry balances from month to month. Four consumers with credit cards are randomly selected. What is the probability that fewer than two consumers carry a credit card balance? A) 0.4074 B) 0.6734 C) 0.7717 D) 0.3643
121) According to the Department of Transportation, 26% of domestic flights were delayed in the last year at JFK airport. Five flights are randomly selected at JFK. What is the probability that all five flights are delayed? A) 0.0192 B) 0.0220 C) 0.0012 D) 0.0206
122) According to the Department of Transportation, 27% of domestic flights were delayed in the last year at JFK airport. Five flights are randomly selected at JFK. What is the probability that all five flights are delayed? A) 0.0194 B) 0.0208 C) 0.0014 D) 0.0222
Version 1
40
123) According to the Department of Transportation, 27% of domestic flights were delayed in the last year at JFK airport. Five flights are randomly selected at JFK. What is the probability that all five flights are on time? A) 0.1513 B) 0.7927 C) 0.8487 D) 0.2073
124) Studies have shown that bats can consume an average of 10 mosquitoes per minute. What is the probability that a bat does not consume any mosquitoes in a 30-second interval? A) 0.0067 B) 0.1755 C) 0.0023 D) 0.0090
125) Studies have shown that bats can consume an average of 10 mosquitoes per minute. What is the probability that a bat consumes four mosquitoes in a 30-second interval? A) 0.1404 B) 0.1755 C) 0.3159 D) 0.0189
126) A professor has learned that three students in his class of 20 will cheat on the final exam. He decides to focus his attention on four randomly chosen students during the exam. What is the probability that he finds at least one of the students cheating?
Version 1
41
A) 0.4912 B) 0.4211 C) 0.5088 D) 0.5789
127)
A discrete uniform distribution has the following characteristics EXCEPT: A) The distribution is symmetric. B) The distribution is bell-shaped. C) The distribution has a finite number of specific values. D) The distribution has the same probability value for all outcomes.
128)
The return of a portfolio is a __________ of individual returns. A) random outcome B) product C) linear combination D) covariance
129)
Which of the following scenarios fits the condition of a Bernoulli process? A) A product is being rated on a 5-point scale. B) A product is being liked or not. C) A product is being ranked among 10 other similar products. D) A product is being used during different times of the day.
130)
Which of the following is best described as a Poisson variable?
Version 1
42
A) The number of students earning an A grade in Business Statistics. B) The number of heads in coin tossing. C) The number of ones in die casting. D) The number of positive reviews in a week.
131) Which of the following problems is solved by an appropriate application of a hypergeometric distribution? A) What is the probability of getting two of aces in drawing five cards from a wellshuffled deck? B) What is the probability of getting two heads in tossing a fair coin five times? C) What is the probability of getting two ones in casting a die five times? D) What is the probability of getting two positive reviews from five randomly selected customers from a pool of 1,000?
132)
A six-sided, unfair (weighted) die has the following probability distribution.
xi P(X = xi)
1 1/4
2 1/12
3 1/3
4 1/12
5 1/6
6 1/12
Find the probability of rolling a 3 or less.
133)
Does the following table describe a discrete probability distribution? xi
P(X = xi)
−1
0.30
0
0.20
1
0.10
2
0.20
3
0.10
Version 1
43
134)
An analyst estimates that a stock has the following probabilities of year-end prices.
Stock Price
Probability
$80
0.10
$85
0.40
$90
0.30
$95
0.20
a. Calculate the expected price at year-end. b. Calculate the variance and the standard deviation.
135) You have inherited a lottery ticket that may be a $5,000 winner. You have a 35% chance of winning the $5,000 and a 65% chance of winning $0. You have an opportunity to sell the lottery ticket for $1,500. What is your expected return and what should you do if are risk neutral?
136) You have inherited a lottery ticket may be a $10,000 winner. You have a 0.25 chance of winning the $10,000 and a 0.75 chance of winning $0. You have an opportunity to sell the lottery ticket for $2,500. What is your expected return and what should you do if are risk averse?
Version 1
44
137) An investor has a $120,000 portfolio of which $50,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are as follows: Stock A
Stock B
E(RA ) = µA = 11.2%
E(RB ) = µB = 8.6%
σA = 15.93%
σB = 9.65%
Cov(RA,RB ) = σAB = 38.20%
a.Calculate the correlation coefficient. b.Calculate the expected return of the portfolio. c.Calculate the standard deviation of the portfolio.
138) An investor has an $80,000 portfolio of which $60,000 has been invested in Stock A and the remainder in Stock B. Other characteristics of the portfolio are as follows: Stock A
Stock B
E(RA ) = µA = −2.3%
E(RB ) = µB = 5.1%
σA = −23.82%
σB = 7.54%
Cov(RA,RB ) = σAB = −11.20%
a.Calculate the correlation coefficient. b.Calculate the expected return of the portfolio. c.Calculate the standard deviation of the portfolio.
Version 1
45
139) Suppose your firm is buying five new computers. The manufacturer offers a warranty to replace any computer that breaks down within three years. Suppose there is a 25% chance that any given computer breaks down within three years. a.What is the probability that exactly one of the computers breaks down within five years? b.What is the probability that at least one of the computers breaks down within five years? c.Suppose the warranty for five computers costs $700, while a new computer costs $600. Is the warranty less expensive than the expected cost of replacing the broken computers?
140) Lisa is in a free-throw shooting contest where each contestant attempts 10 free throws. On average, Lisa makes 77% of the free throws she attempts. a.What is the probability that she makes exactly eight free throws? b.What is the probability she makes at least nine free throws? c.What is the probability she makes fewer than nine free throws? d.Lisa is competing against Bill to see who can make the most free throws in 10 attempts. Suppose Bill goes first and makes seven. Should we expect Lisa to make at least as many as Bill? Explain.
141) George buys six lottery tickets for $2 each. In addition to the grand prize, there is a 20% chance that each lottery ticket gives a prize of $4. Assume that these tickets are not grand prize winners. a.What is the probability that the tickets are winners more than the average that George expected? b.What is the probability that none of the tickets are winners? c.What is the probability that at least one of the tickets is a winner?
Version 1
46
142) A car salesperson has a 5% chance of landing a sale with a random customer on the lot. Suppose 10 people come onto the lot today. a.What is the probability that the salesperson sells exactly three cars today? b.What is the probability he or she sells fewer than two cars today? c.What is the expected number of cars he or she is going to sell today?
143) A company is going to release four quarterly reports this year. Suppose the company has a 32% chance of beating analyst expectations each quarter. a.What is the probability that the company beats analyst expectations every quarter of this year? b.What is the probability the company beats analyst expectations more than half the time this year? c.What is the expected number of times the company will beat analyst expectations this year?
144) Assume that the mean success rate of a Poisson process is six successes per hour. a.Find the expected number of successes in a 40-minute period. b.Find the expected number of successes in a three-hour period. c.Find the probability of at least two successes in a 30-minute period.
Version 1
47
145) Sam is a trucker and believes that for every 60 miles he drives on the freeway in Indiana, there is an average of two state troopers checking his speed with a radar gun. a.What is the probability that at least one trooper is checking his speed on a randomly selected 60-mile stretch? b.What is the probability that exactly three troopers are checking his speed on a randomly selected 60-mile stretch? c.Sam drives 240 miles a day. What is the average number of state troopers who check his speed on a given day? d.Sam drives 240 miles a day. What is the probability that exactly five troopers check Sam’s speed on a randomly selected day?
146) Due to turnover and promotion, a bank manager knows that, on average, she hires four new tellers per year. Suppose the number of tellers she hires is Poisson distributed. a.What is the probability that in a given year, the manager hires exactly five new tellers? b.What is the average number of tellers the manager hires in a six-month period? c.What is the probability that the manager hires at least one new teller in a given six-month period?
Version 1
48
147) A telemarketer knows that, on average, he is able to make three sales in a 30-minute period. Suppose the number of sales he can make in a given time period is Poisson distributed. a.What is the probability that he makes exactly four sales in a 30-minute period? b.What is the probability that he makes at least two sales in a 30-minute period? c.What is the probability that he makes five sales in an hour-long period?
148) A construction company found that on average its workers get into four car accidents per week. a.What is the probability of exactly six car accidents in a random week? b.What is the probability that there are fewer than two car accidents in a random week? c.What is the probability that there are exactly eight car accidents over the course of three weeks?
149) During an hour of class, a professor anticipates six questions on average. a.What is the probability that in a given hour of class, exactly six questions are asked? b.What is the expected number of questions asked in a 20-minute period? c.What is the probability that no questions are asked over a 20-minute period?
Version 1
49
150) A plane taking off from an airport in New York can expect to run into a flock of birds once out of every 1,250 takeoffs. a.What is the expected number of bird strikes for 10,000 takeoffs? b.What is the standard deviation of the number of bird strikes for 10,000 takeoffs? c.What is the probability of running into seven flocks of birds in 10,000 takeoffs?
151) You are attending a baseball game with two family members when it is announced that four fans in your section (which holds 20 spectators total) will win a free autographed baseball. a.What is the probability that at least one member of your family (including yourself) wins an autographed baseball? b.What is the probability that two members of your family win an autographed baseball? c.What is the probability that all three of you win an autographed baseball?
152) Four of your students submitted an entry to a writing contest. A total of 24 entries were submitted. Six of the entries will move on to the next round. a.What is the probability that all four of your students will move on to the next round? b.How many of your students are expected to move to the next round? c.What is the probability that fewer of your students than expected will make it to the next round than expected? d.What is the standard deviation of the number of students expected to move on to the next round?
Version 1
50
153) In a particular game of cards, success is measured by the number of aces drawn by each player. Eight cards are drawn by the first player. Given that the player is drawing from a full poker deck of 52 cards, find the probability that this player will draw two aces from the deck.
154) An urn is filled with three different colors of balls: red, blue, and white. There are three red balls, seven blue balls, and five white balls. If four balls are drawn, what is the probability of drawing two blue balls?
155) A random variable is a function that assigns numerical values to the outcomes of a random experiment. ⊚ true ⊚ false
156) A discrete random variable X may assume an (infinitely) uncountable number of distinct values. ⊚ true ⊚ false
157) A continuous random variable X assumes an (infinitely) uncountable number of distinct values. ⊚ true ⊚ false
Version 1
51
158) A probability distribution of a continuous random variable X gives the probability that X takes on a particular value x, P (X = x). ⊚ true ⊚ false
159) A cumulative probability distribution of a random variable X is the probability P (X = x), where X is equal to a particular value x. ⊚ true ⊚ false
160)
The expected value of a random variable X can be referred to as the population mean. ⊚ true ⊚ false
161) The variance of a random variable X provides us with a measure of central location of the distribution of X. ⊚ true ⊚ false
162) The relationship between the variance and the standard deviation is such that the standard deviation is the positive square root of the variance. ⊚ true ⊚ false
163) A risk-averse consumer may decline a risky prospect even if it offers a positive expected value.
Version 1
52
⊚ ⊚
true false
164) A risk-averse consumer ignores risk and makes his or her decisions solely on the basis of expected value. ⊚ true ⊚ false
165) A risk-neutral consumer ignores risk and makes his or her decisions solely on the basis of expected value. ⊚ true ⊚ false
166)
A risk-loving consumer will take on risk even if the expected gain is negative. ⊚ true ⊚ false
167) Given two random variables X and Y, the expected value of their sum, E (X + Y), is equal to the sum of their individual expected values, E (X) and E (Y). ⊚ true ⊚ false
168) A Bernoulli process consists of a series of n independent and identical trials of an experiment such that in each trial there are three possible outcomes and the probabilities of each outcome remain the same. ⊚ true ⊚ false
Version 1
53
169) A binomial random variable is defined as the number of successes achieved in n trials of a Bernoulli process. ⊚ true ⊚ false
170) A Poisson random variable counts the number of successes (occurrences of a certain event) over a given interval of time or space. ⊚ true ⊚ false
171) We use the hypergeometric distribution in place of the binomial distribution when we are sampling with replacement from a population whose size N is significantly larger than the sample size n. ⊚ true ⊚ false
172) The ounces of soda consumed by an adult next month are an example of a discrete random variable. ⊚ true ⊚ false
173) The expected value of simple information is the mean of a discrete probability distribution when the discrete random variable is expressed in term of dollars. ⊚ true ⊚ false
174) Testing whether the computer is infected or not would be best described using binomial probability distribution. ⊚ true ⊚ false
Version 1
54
175) The equation used to calculate binomial probabilities relies on permutations of n objects selected x at a time. ⊚ true ⊚ false
176) The number of syntactic errors found in a program code would best be described using a Poisson probability distribution. ⊚ true ⊚ false
177) The Excel’s function HYPERGEOM.DIST can be used to solve hypergeometric probabilities. ⊚ true ⊚ false
178) An example of a discrete random variable is waiting time at the checkout line of a supermarket. ⊚ true ⊚ false
Version 1
55
Answer Key Test name: Chap 05_4e 1) positive 2) correlation 3) tree 4) POISSON.DIST 5) independent 6) variance 7) A 8) C 9) A 10) A 11) D 12) B 13) A 14) C 15) C 16) D 17) D 18) C 19) A 20) A 21) D 22) C 23) D 24) C 25) C 26) D Version 1
56
27) B 28) B 29) B 30) A 31) B 32) C 33) B 34) D 35) C 36) B 37) A 38) B 39) B 40) D 41) B 42) B 43) B 44) C 45) A 46) B 47) C 48) C 49) C 50) A 51) D 52) C 53) D 54) D 55) C 56) D Version 1
57
57) C 58) C 59) B 60) D 61) C 62) D 63) C 64) D 65) C 66) D 67) D 68) A 69) A 70) C 71) D 72) A 73) B 74) D 75) C 76) C 77) B 78) C 79) C 80) C 81) B 82) C 83) B 84) A 85) D 86) C Version 1
58
87) B 88) D 89) B 90) C 91) D 92) C 93) D 94) A 95) B 96) C 97) C 98) D 99) A 100) B 101) C 102) A 103) C 104) A 105) B 106) D 107) B 108) B 109) D 110) C 111) D 112) D 113) B 114) A 115) C 116) A Version 1
59
117) B 118) C 119) B 120) A 121) C 122) C 123) D 124) A 125) B 126) C 127) B 128) C 129) B 130) D 131) A 132) 0.6667
133) No, the sum of probabilities is 0.9 rather than 1. 134) E(X) = 88, Var(X) = 21, and SD(X) = 4.58
135) E(X) = $1,750. Therefore, keep the ticket, since the expected value is greater than the sale price ($1,500). 136) Sell the lottery ticket. 137) a.0.2485 b.9.6833% c.9.71% 138) a.–0.06 b.–0.45% c.17.84% 139) a.0.3955 b.0.7630 c.Yes
Version 1
60
140) a.0.2942 b.0.2921 c.0.7079 d.Yes, on average we expect Lisa to make as many free throws as Bill since 7.7 > 7. 141) a.0.0170 b.0.2621 c.0.7379 142) a.0.0105 b.0.9139 c.0.5 143) a.0.0105 b.0.0996 c.1.28 144) a.4 b.18 c.0.8008 145) a.0.8647 b.0.1804 c.Eight troopers d.0.0916 146) a.0.1563 b.Two tellers c.0.8647 147) a.0.1680 b.0.8010 c.0.1606 These results can also be found using Excel’s POISSON.DIST function. (See the text for details.)
Version 1
61
148) a. 0.1042 b. 0.0916 c. 0.0655 149) a.0.1606 b.2 c.0.1353 150) a.8 b.2.8284 c.0.1396 151) a.0.5088 b.0.0842 c.0.0035 152) a.0.0014 b.1 c.0.2880 d.0.8076
155) TRUE 156) FALSE 157) TRUE 158) FALSE 159) FALSE 160) TRUE 161) FALSE 162) TRUE 163) TRUE 164) FALSE 165) TRUE 166) TRUE 167) TRUE Version 1
62
168) FALSE 169) TRUE 170) TRUE 171) FALSE 172) FALSE 173) FALSE 174) TRUE 175) FALSE 176) TRUE 177) FALSE 178) FALSE
Version 1
63
CHAPTER 06 – 4e 1) The normal distribution is ______ in the sense that the tails get closer and closer to the horizontal axis but never touch it.
2)
The z table provides ______ probabilities for positive and negative values of z.
3) Scores on a business statistics final exam are normally distributed with a mean of 70 and standard deviation of 10. z value for the exam score of 85 equals ______.
4) We can use the ______ transformation, x = µ + zσ, to compute x values for given probabilities.
5)
Excel provides function ______ to compute probabilities for the exponential distribution.
6) The exponential distribution is based entirely on one parameter called the ______ parameter.
7)
For σ < 1, the lognormal distribution somewhat resembles the ______ distribution.
8) If the mean and the standard deviation of the underlying normal random variable equals respectively µ = 2 and σ = 1, the mean of a lognormal random variable equals ______.
9)
Which of the following is correct?
Version 1
1
A) A continuous random variable has a probability density function but not a cumulative distribution function. B) A discrete random variable has a probability mass function but not a cumulative distribution function. C) A continuous random variable has a probability mass function, and a discrete random variable has a probability density function. D) A continuous random variable has a probability density function, and a discrete random variable has a probability mass function.
10)
Which of the following does not represent a continuous random variable? A) Height of oak trees in a park. B) Heights and weights of newborn babies. C) Time of a flight between Chicago and New York. D) The number of customer arrivals to a bank between 10 am and 11 am.
11)
Which of the following is not a characteristic of a probability density function f(x)?
A) f(x) ≥ 0 for all values of x. B) f(x) is symmetric around the mean. C) The area under f(x) over all values of x equals one. D) f(x) becomes zero or approaches zero if x increases to +infinity or decreases to −infinity.
12)
The cumulative distribution function is denoted and defined as which of the following? A) f(x) and f(x) = P(X ≤ x) B) f(x) and f(x) = P(X ≥ x) C) F(x) and F(x) = P(X ≤ x) D) F(x) and F(x) = P(X ≥ x)
Version 1
2
13) The cumulative distribution function F(x) of a continuous random variable X with the probability density function f(x) is which of the following? A) The area under f over all values x. B) The area under f over all values that are x or less. C) The area under f over all values that are x or more. D) The area under f over all non-negative values that are x or less.
14) A continuous random variable has the uniform distribution on the interval [a, b] if its probability density function f(x) ______. A) provides all probabilities for all x between a and b B) is bell-shaped between a and b C) is constant for all x between a and b, and 0 otherwise D) asymptotically approaches the x axis when x increases to +∞ or decreases to −∞
15) The height of the probability density function f(x) of the uniform distribution defined on the interval [a, b] is ______. A) 1/(b − a) between a and b, and zero otherwise B) (b − a)/2 between a and b, and zero otherwise C) (a + b)/2 between a and b, and zero otherwise D) 1/(a + b) between a and b, and zero otherwise
16) The waiting time at an elevator is uniformly distributed between 30 and 200 seconds. Find the mean and standard deviation of the waiting time.
Version 1
3
A) 115 seconds and 49.07 seconds B) 1.15 minutes and 0.4907 minutes C) 1.15 minutes and 24.08333 (minute)2 D) 115 seconds and 2408.3333 (second)2
17) The waiting time at an elevator is uniformly distributed between 30 and 200 seconds. What is the probability a rider must wait between 1 minute and 1.5 minutes? A) 0.1765 B) 0.3529 C) 0.5294 D) 0.8824
18) The waiting time at an elevator is uniformly distributed between 30 and 200 seconds. What is the probability a rider must wait more than 1.5 minutes? A) 0.3529 B) 0.4500 C) 0.5294 D) 0.6471
19) The waiting time at an elevator is uniformly distributed between 30 and 200 seconds. What is the probability a rider waits less than two minutes? A) 0.4706 B) 0.5294 C) 0.6000 D) 0.7059
Version 1
4
20) The time of a call to a technical support line is uniformly distributed between 2 and 10 minutes. What are the mean and variance of this distribution? A) 6 minutes and 2.3094 (minutes)2 B) 6 minutes and 5.3333 (minutes)2 C) 6 minutes and 5.3333 minutes D) 8 minutes and 2.3094 minutes
21) An analyst is forecasting net income for Excellence Corporation for the next fiscal year. Her low-end estimate of net income is $270,000, and her high-end estimate is $375,000. Prior research allows her to assume that net income follows a continuous uniform distribution. The probability that net income will be greater than or equal to $362,500 is ______. A) 97.0% B) 88.1% C) 11.9% D) 29.0%
22) An analyst is forecasting net income for Excellence Corporation for the next fiscal year. Her low-end estimate of net income is $250,000, and her high-end estimate is $350,000. Prior research allows her to assume that net income follows a continuous uniform distribution. The probability that net income will be greater than or equal to $337,500 is ______. A) 12.5% B) 29.6% C) 87.5% D) 96.4%
23) Alex is in a hurry to get to work and is rushing to catch the bus. She knows that the bus arrives every six minutes during rush hour, but does not know the exact times the bus is due. She realizes that from the time she arrives at the stop, the amount of time that she will have to wait follows a uniform distribution with a lower bound of 0 minutes and an upper bound of six minutes. What is the probability that she will have to wait less than two minutes? Version 1
5
A) 0.1667 B) 0.3333 C) 0.6667 D) 1.0000
24) Suppose the average price of gasoline for a city in the United States follows a continuous uniform distribution with a lower bound of $4.00 per gallon and an upper bound of $4.30 per gallon. What is the probability a randomly chosen gas station charges more than $4.10 per gallon?
A) 0.6754 B) 0.6667 C) 1.0000 D) 0.3000
25) Suppose the average price of gasoline for a city in the United States follows a continuous uniform distribution with a lower bound of $3.50 per gallon and an upper bound of $3.80 per gallon. What is the probability a randomly chosen gas station charges more than $3.70 per gallon? A) 0.3000 B) 0.3333 C) 0.6667 D) 1.0000
26) Suppose the average price of gasoline for a city in the United States follows a continuous uniform distribution with a lower bound of $3.50 per gallon and an upper bound of $3.80 per gallon. What is the probability a randomly chosen gas station charges between $3.70 and $3.90 per gallon?
Version 1
6
A) 0.3000 B) 0.3333 C) 0.6667 D) 1.0000
27) Suppose the average price of gasoline for a city in the United States follows a continuous uniform distribution with a lower bound of $3.50 per gallon and an upper bound of $3.80 per gallon. What is the probability a randomly chosen gas station charges less than $3.70 per gallon? A) 0.3000 B) 0.3333 C) 0.6667 D) 1.0000
28)
How many parameters are needed to fully describe any normal distribution? A) 1 B) 2 C) 3 D) 4
29) What does it mean when we say that the tails of the normal curve are asymptotic to the x axis? A) The tails get closer and closer to the x axis but never touch it. B) The tails get closer and closer to the x axis and eventually touch it. C) The tails get closer and closer to the x axis and eventually cross this axis. D) The tails get closer and closer to the x axis and eventually become this axis.
Version 1
7
30)
The probability that a normal random variable is less than its mean is ______. A) 0.0 B) 0.5 C) 1.0 D) Cannot be determined
31) Let X be normally distributed with mean μ and standard deviation σ > 0. Which of the following is false about the z value corresponding to a given x value? A) A positive z = (x − μ)/σ indicates how many standard deviations x is above μ. B) A negative z = (x − μ)/σ indicates how many standard deviations x is below μ. C) The z value corresponding to x = μ is zero. D) The z value corresponding to a given value of x assumes any value between 0 and 1.
32) It is known that the length of a certain product X is normally distributed with μ = 35 inches. How is the probability P(X > 31) related to P(X < 31)? A) No comparison can be made with the given information. B) P(X>31) is the same as P(X<31). C) P(X>31) is greater than P(X<31). D) P(X>31) is smaller than P(X<31).
33) It is known that the length of a certain product X is normally distributed with μ = 20 inches. How is the probability P(X > 16) related to P(X < 16)? A) P(X > 16) is greater than P(X < 16). B) P(X > 16) is smaller than P(X < 16). C) P(X > 16) is the same as P(X < 16). D) No comparison can be made with the given information.
Version 1
8
34) It is known that the length of a certain product X is normally distributed with μ = 20 inches. How is the probability P(X < 20) related to P(X < 16)? A) P(X < 20) is greater than P(X < 16). B) P(X < 20) is smaller than P(X < 16). C) P(X < 20) is the same as P(X < 16). D) No comparison can be made with the given information.
35) It is known that the length of a certain product X is normally distributed with μ = 20 inches. How is the probability P(X > 24) related to P(X < 16)? A) P(X > 24) is greater than P(X < 16). B) P(X > 24) is smaller than P(X < 16). C) P(X > 24) is the same as P(X < 16). D) No comparison can be made with the given information.
36) It is known that the length of a certain product X is normally distributed with μ = 40 inches and σ = 4 inches. How is the probability P(X > 48) related to P(X < 36)? A) P(X>48) is the same as P(X<36). B) P(X>48) is greater than P(X<36). C) No comparison can be made with the given information. D) P(X>48) is smaller than P(X<36).
37) It is known that the length of a certain product X is normally distributed with μ = 20 inches and σ = 4 inches. How is the probability P(X > 28) related to P(X < 16)?
Version 1
9
A) P(X > 28) is greater than (X < 16). B) P(X > 28) is smaller than (X < 16). C) P(X > 28) is the same as P(X < 16). D) No comparison can be made with the given information.
38)
The probability P(Z < −1.28) is closest to ______. A) −0.10 B) 0.10 C) 0.20 D) 0.90
39)
The probability P(Z > 1.28) is closest to ______. A) −0.10 B) 0.10 C) 0.20 D) 0.90
40)
Find the probability P(−1.76 ≤ Z ≤ 0). A) 0.5390 B) 0.4610 C) 0.0780 D) 0.0390
41)
Find the probability P(−1.96 ≤ Z ≤ 0).
Version 1
10
A) 0.0250 B) 0.0500 C) 0.4750 D) 0.5250
42)
Find the probability P(−1.96 ≤ Z ≤ 1.96). A) 0.0500 B) 0.9500 C) 0.9750 D) 1.9500
43)
Find the z value such that P(Z ≤ z) = 0.6915. A) z = 0.6923 B) z = 0.50 C) z = −0.50 D) z = 0.3077
44)
Find the z value such that P(Z ≤ z) = 0.9082. A) z = −1.33 B) z = 0.1814 C) z = 0.8186 D) z = 1.33
45)
Find the z value such that P(−z ≤ Z ≤ z) = 0.922.
Version 1
11
A) z = 1.447 B) z = −1.447 C) z = 1.762 D) z = −1.762
46)
Find the z value such that P(−z ≤ Z ≤ z) = 0.95. A) z = −1.645 B) z = −1.96 C) z = 1.645 D) z = 1.96
47) You work in marketing for a company that produces work boots. Quality control has sent you a memo detailing the length of time before the boots wear out under heavy use. They find that the boots wear out in an average of 208 days, but the exact amount of time varies, following a normal distribution with a standard deviation of 14 days. For an upcoming ad campaign, you need to know the percent of the pairs that last longer than six months—that is, 180 days. Use the empirical rule to approximate this percent. A) 2.5% B) 5% C) 95% D) 97.5%
48) A hedge fund returns on average 26% per year with a standard deviation of 12%. Using the empirical rule, approximate the probability the fund returns over 50% next year.
Version 1
12
A) 0.5% B) 1% C) 2.5% D) 5%
49) For any normally distributed random variable with mean μ and standard deviation σ, the percent of the observations that fall between [μ − 2σ, μ + 2σ] is the closest to ______. A) 68% B) 68.26% C) 95% D) 99.73%
50) For any normally distributed random variable with mean μ and standard deviation σ, the proportion of the observations that fall outside the interval [μ − σ, μ + σ] is the closest to ______. A) 0.0466 B) 0.3174 C) 0.8413 D) 0.1687
51) Sarah’s portfolio has an expected annual return at 8%, with an annual standard deviation at 12%. If her investment returns are normally distributed, then in any given year Sarah has an approximate ______. A) 50% chance that the actual return will be greater than 8% B) 68% chance that the actual return will fall within 4% and 20% C) 68% chance that the actual return will fall within −20% and 20% D) 95% chance that the actual return will fall within −4% and 28%
Version 1
13
52) Sarah’s portfolio has an expected annual return at 10%, with an annual standard deviation at 12%. If her investment returns are normally distributed, then in any given year Sarah has an approximate ________. A) 50% chance that the actual return will be greater than 12% B) 68% chance that the actual return will fall within −2% and 22% C) 68% chance that the actual return will fall within −20% and 22% D) 95% chance that the actual return will fall within −2% and 24%
53) If X has a normal distribution with µ = 100 and σ = 5, then the probability P(90 ≤ X ≤ 95) can be expressed in terms of a standard normal variable Z as ______. A) P(2 ≤ Z ≤ 1) B) P(−2 ≤ Z ≤ 1) C) P(−2 ≤ Z ≤ −1) D) P(−2 ≤ Z ≤ −2)
54) The time to complete the construction of a soapbox derby car is normally distributed with a mean of three hours and a standard deviation of one hour. Find the probability that it would take more than five hours to construct a soapbox derby car. A) 0 B) 0.0228 C) 0.4772 D) 0.9772
55) The time to complete the construction of a soapbox derby car is normally distributed with a mean of three hours and a standard deviation of one hour. Find the probability that it would take between 2.5 and 3.5 hours to construct a soapbox derby car.
Version 1
14
A) 0.3085 B) 0.3830 C) 0.6170 D) 0.6915
56) The time to complete the construction of a soapbox derby car is normally distributed with a mean of three hours and a standard deviation of one hour. Find the probability that it would take exactly 3.7 hours to construct a soapbox derby car. A) 0.0000 B) 0.5000 C) 0.7580 D) 0.2420
57) Let X be normally distributed with mean µ = 250 and standard deviation σ = 80. Find the value x such that P(X ≤ x) = 0.0606. A) −1.55 B) 1.55 C) 126 D) 374
58) Let X be normally distributed with mean µ = 250 and standard deviation σ = 80. Find the value x such that P(X ≤ x) = 0.9394. A) −1.55 B) 1.55 C) 126 D) 374
Version 1
15
59) Let X be normally distributed with mean µ = 25 and standard deviation σ = 5. Find the value x such that P(X ≥ x) = 0.1736. A) −0.94 B) 0.94 C) 20.30 D) 29.70
60) The salary of teachers in a particular school district is normally distributed with a mean of $50,000 and a standard deviation of $2,500. Due to budget limitations, it has been decided that the teachers who are in the top 2.5% of the salaries would not get a raise. What is the salary level that divides the teachers into one group that gets a raise and one that doesn’t? A) −1.96 B) 1.96 C) 45,100 D) 54,900
61) The starting salary of an administrative assistant is normally distributed with a mean of $50,000 and a standard deviation of $2,500. We know that the probability of a randomly selected administrative assistant making a salary between μ − x and μ + x is 0.7416. Find the salary range referred to in this statement. A) $42,825 to $52,825 B) $42,825 to $57,175 C) $47,175 to $52,825 D) $47,175 to $57,175
Version 1
16
62) An investment consultant tells her client that the probability of making a positive return with her suggested portfolio is 0.90. What is the risk, measured by standard deviation that this investment manager has assumed in her calculation if it is known that returns from her suggested portfolio are normally distributed with a mean of 6%? A) 1.28% B) 4.69% C) 6.00% D) 10.0%
63) The stock price of a particular asset has a mean and standard deviation of $58.50 and $8.25, respectively. Use the normal distribution to compute the 95th percentile of this stock price. A) −1.645 B) 1.645 C) 44.93 D) 72.07
64) You are planning a May camping trip to Denali National Park in Alaska and want to make sure your sleeping bag is warm enough. The average low temperature in the park for May follows a normal distribution with a mean of 32°F and a standard deviation of 8°F. One sleeping bag you are considering advertises that it is good for temperatures down to 25°F. What is the probability that this bag will be warm enough on a randomly selected May night at the park? A) 0.1894 B) 0.3106 C) 0.8092 D) 0.8800
Version 1
17
65) You are planning a May camping trip to Denali National Park in Alaska and want to make sure your sleeping bag is warm enough. The average low temperature in the park for May follows a normal distribution with a mean of 32°F and a standard deviation of 8°F. An inexpensive bag you are considering advertises to be good for temperatures down to 38°F. What is the probability that the bag will not be warm enough? A) 0.2266 B) 0.2734 C) 0.7500 D) 0.7734
66) You are planning a May camping trip to Denali National Park in Alaska and want to make sure your sleeping bag is warm enough. The average low temperature in the park for May follows a normal distribution with a mean of 32°F and a standard deviation of 8°F. Above what temperature must the sleeping bag be suited such that the temperature will be too cold only 5% of the time? A) −1.645 B) 1.645 C) 18.84 D) 45.16
67) Gold miners in Alaska have found, on average, 12 ounces of gold per 1,000 tons of dirt excavated with a standard deviation of 3 ounces. Assume the amount of gold found per 1,000 tons of dirt is normally distributed. What is the probability the miners find more than 16 ounces of gold in the next 1,000 tons of dirt excavated? A) 0.0912 B) 0.4082 C) 0.5918 D) 0.9082
Version 1
18
68) Gold miners in Alaska have found, on average, 12 ounces of gold per 1,000 tons of dirt excavated with a standard deviation of 3 ounces. Assume the amount of gold found per 1,000 tons of dirt is normally distributed. What is the probability the miners find between 10 and 14 ounces of gold in the next 1,000 tons of dirt excavated? A) 0.2514 B) 0.4950 C) 0.5028 D) 0.7486
69) Gold miners in Alaska have found, on average, 12 ounces of gold per 1,000 tons of dirt excavated with a standard deviation of 3 ounces. Assume the amount of gold found per 1,000 tons of dirt is normally distributed. If the miners excavated 1,000 tons of dirt, how little gold must they have found such that they find that amount or less only 15% of the time? A) −1.04 B) 1.04 C) 8.89 D) 15.12
70) Suppose the life of a particular brand of laptop battery is normally distributed with a mean of 8 hours and a standard deviation of 0.6 hours. What is the probability that the battery will last more than 9 hours before running out of power? A) 0.0478 B) 0.4525 C) 0.9525 D) 1.6667
Version 1
19
71) A superstar major league baseball player just signed a new deal that pays him a record amount of money. The star has driven in an average of 110 runs over the course of his career, with a standard deviation of 31 runs. An average player at his position drives in 80 runs. What is the probability the superstar bats in fewer runs than an average player next year? Assume the number of runs batted in is normally distributed. A) 0.1666 B) 0.3340 C) 0. 8340 D) 0.9700
72)
If an exponential distribution has the rate parameter λ = 6, what is its expected value? A) 1/6 B) 1/36 C) 6 D) 6/2
73)
If an exponential distribution has the rate parameter λ = 5, what is its expected value? A) 5 B) 1/5 C) 1/25 D) 5/2
74)
If an exponential distribution has the rate parameter λ = 5, what is its variance? A) 5 B) 1/5 C) 1/25 D) 5/2
Version 1
20
75) What can be said about the expected value and standard deviation of an exponential distribution? A) The expected value is equal to the standard deviation. B) The expected value is equal to the square of the standard deviation. C) The expected value is equal to the reciprocal of the standard deviation. D) The expected value is equal to the square root of the standard deviation.
76) Let the time between two consecutive arrivals at a grocery store checkout line be exponentially distributed with a mean of three minutes. Find the probability that the next arrival does not occur until at least three minutes have passed since the last arrival.
A) 0.6321 B) 0.4229 C) 0.6711 D) 0.3679
77) Let the time between two consecutive arrivals at a grocery store checkout line be exponentially distributed with a mean of four minutes. Find the probability that the next arrival does not occur until at least three minutes have passed since the last arrival. A) 0.5276 B) 0.2636 C) 0.4724 D) 0.7364
78) Patients scheduled to see their primary care physician at a particular hospital wait, on average, an additional nine minutes after their appointment is scheduled to start. Assume the time that patients wait is exponentially distributed. What is the probability a randomly selected patient will have to wait more than 13 minutes? Version 1
21
A) 0.2359 B) 0.6629 C) 0.3987 D) 0.5001
79) Patients scheduled to see their primary care physician at a particular hospital wait, on average, an additional eight minutes after their appointment is scheduled to start. Assume the time that patients wait is exponentially distributed. What is the probability a randomly selected patient will have to wait more than 10 minutes? A) 0.2865 B) 0.4493 C) 0.5507 D) 0.7135
80) Patients scheduled to see their primary care physician at a particular hospital wait, on average, an additional eight minutes after their appointment is scheduled to start. Assume the time that patients wait is exponentially distributed. What is the probability a randomly selected patient will see the doctor within thirteen minutes of the scheduled time? A) 0.8737 B) 0.5404 C) 0.4596 D) 0.8031
81) Patients scheduled to see their primary care physician at a particular hospital wait, on average, an additional eight minutes after their appointment is scheduled to start. Assume the time that patients wait is exponentially distributed. What is the probability a randomly selected patient will see the doctor within five minutes of the scheduled time?
Version 1
22
A) 0.2019 B) 0.4647 C) 0.5353 D) 0.7981
82) The average time between trades for a high-frequency trading investment firm is 40 seconds. Assume the time between trades is exponentially distributed. What is the probability that the time between trades for a randomly selected trade and the one proceeding it is less than 20 seconds? A) 0.1354 B) 0.3935 C) 0.6065 D) 0.8446
83) The average time between trades for a high-frequency trading investment firm is 40 seconds. Assume the time between trades is exponentially distributed. What is the probability that the time between trades for a randomly selected trade and the one proceeding it is more than a minute? A) 0.2231 B) 0.4869 C) 0.5134 D) 0.7769
84) If Y = eX has a lognormal distribution, what can be said of the distribution of the random variable X?
Version 1
23
A) X follows a normal distribution. B) X follows an exponential distribution. C) X follows a standard normal distribution. D) X follows a continuous uniform distribution.
85) Find the mean of the lognormal variable if the mean and standard deviation of the underlying normal variable are 1.8 and 0.9, respectively. A) 0.69 B) 2.32 C) 9.07 D) 8.40
86) Find the variance of the lognormal variable if the mean and variance of the underlying normal variable are 2 and 1, respectively. A) 0 B) 12.18 C) 15.97 D) 255.02
87) The mean travel time to work is 25.2 minutes (U.S. Census). Further, suppose that commute time follows a log-normal distribution with a standard deviation of 10 minutes. What is the probability a randomly selected U.S. worker has a commute time of more than half an hour? A) 25.78% B) 31.56% C) 68.44% D) 74.22%
Version 1
24
88) The mean travel time to work is 25.2 minutes (U.S. Census). Further, suppose that commute time follows a log-normal distribution with a standard deviation of 10 minutes. What is the probability a randomly selected U.S. worker has a commute time of less than 20 minutes? A) 30.15% B) 33.97% C) 65.91% D) 69.85%
89) Let the lifetime of a new Jet Ski be represented by a lognormal variable, Y = eX where X is normally distributed. Let the mean of the lifetime of the Jet Ski be six years with a standard deviation of three years. What proportion of the Jet Skis will last less than seven years? A) 0.2877 B) 0.3707 C) 0.6293 D) 0.7131
90) Let the lifetime of a new Jet Ski be represented by a lognormal variable, Y = eX where X is normally distributed. Let the mean of the lifetime of the Jet Ski be six years with a standard deviation of three years. What proportion of the Jet Skis will last nine or more years? A) 0.1369 B) 0.1587 C) 0.8413 D) 0.8621
91) A daily mail is delivered to your house between 1:00 p.m. and 6:00 p.m. Assume delivery times follow the continuous uniform distribution. Determine the percentage of mail deliveries that are made after 4:00 p.m.
Version 1
25
A) 48.3% B) 40% C) 75% D) 52.5%
92) A daily mail is delivered to your house between 1:00 p.m. and 5:00 p.m. Assume delivery times follow the continuous uniform distribution. Determine the percentage of mail deliveries that are made after 4:00 p.m. A) 25% B) 33.3% C) 37.5% D) 27.5%
93) The time for a professor to grade a student’s homework in statistics is normally distributed with a mean of 14.2 minutes and a standard deviation of 2.0 minutes. What is the probability that randomly selected homework will require less than 17 minutes to grade? A) 0.1462 B) 0.7795 C) 0.9192 D) 0.5971
94) The time for a professor to grade a student’s homework in statistics is normally distributed with a mean of 12.6 minutes and a standard deviation of 2.5 minutes. What is the probability that randomly selected homework will require less than 16 minutes to grade? A) 0.1401 B) 0.5910 C) 0.7734 D) 0.9131
Version 1
26
95) The time for a professor to grade a homework in statistics is normally distributed with a mean of 12.6 minutes and a standard deviation of 2.5 minutes. What is the probability that randomly selected homework will require more than 12 minutes to grade? A) 0.8023 B) 0.5948 C) 0.6664 D) 0.5040
96) The time for a professor to grade a student’s homework in statistics is normally distributed with a mean of 12.6 minutes and a standard deviation of 2.5 minutes. What is the probability that randomly selected homework will require between 11 and 15 minutes to grade? A) 0.4875 B) 0.7567 C) 0.5704 D) 0.9637
97) Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and San Francisco will be more than $450? A) 0.1796 B) 0.0788 C) 0.3369 D) 0.2033
Version 1
27
98) Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and San Francisco will be less than $300? A) 0.4286 B) 0.1015 C) 0.2005 D) 0.3192
99) Suppose the round-trip airfare between Boston and Orlando follows the normal probability distribution with a mean of $387.20 and a standard deviation of $68.50. What is the probability that a randomly selected airfare between Boston and San Francisco will be between $325 and $425? A) 0.2650 B) 0.1548 C) 0.5275 D) 0.3875
100) On a particular busy section of the Garden State Parkway in New Jersey, police use radar guns to detect speeding drivers. Assume the time that elapses between successive speeders is exponentially distributed with the mean of 15 minutes. What is the probability of a waiting time less than 10 minutes between successive speeders? A) 0.5132 B) 0.3776 C) 0.6234 D) 0.4866
Version 1
28
101) On a particular busy section of the Garden State Parkway in New Jersey, police use radar guns to detect speeding drivers. Assume the time that elapses between successive speeders is exponentially distributed with the mean of 31 minutes. What is the probability of a waiting time in excess of 41 minutes between successive speeders? A) 0.8111 B) 0.8888 C) 0.3449 D) 0.2664
102) On a particular busy section of the Garden State Parkway in New Jersey, police use radar guns to detect speeding drivers. Assume the time that elapses between successive speeders is exponentially distributed with the mean of 15 minutes. What is the probability of a waiting time in excess of 25 minutes between successive speeders? A) 0.1889 B) 0.2674 C) 0.7336 D) 0.8113
103) On average, a certain kind of kitchen appliance requires repairs once every four years. Assume that the times between repairs are exponentially distributed. Which of the following Excel commands computes the probability that the appliance will work no more than three years without requiring repairs? A) EXPON.DIST(3,1.0,1) B) EXPON.DIST(3,0.25,1) C) EXPON.DIST(3.1.0,0) D) EXPON.DIST(3.0.25,0)
104) On average, a certain kind of kitchen appliance requires repairs once every four years. Assume that the times between repairs are exponentially distributed. What is the probability that the appliance will work no more than three years without requiring repairs? Version 1
29
A) 0.3254 B) 0.6744 C) 0.5276 D) 0.4724
105) On average, a certain kind of kitchen appliance requires repairs once every five years. Assume that the times between repairs are exponentially distributed. What is the probability that the appliance will work at least sven years without requiring repairs? A) 0.6779 B) 0.3691 C) 0.8004 D) 0.2466
106) On average, a certain kind of kitchen appliance requires repairs once every four years. Assume that the times between repairs are exponentially distributed. What is the probability that the appliance will work at least six years without requiring repairs? A) 0.7769 B) 0.3456 C) 0.6544 D) 0.2231
107)
Which of the following is FALSE about a continuous random variable?
A) It has uncountable value within an interval. B) The probability that its value is within a specific interval is equal to one. C) Its probability density function is nonnegative. D) The cumulative distribution function of a specific value of the variable is defined as the area under its probability density function up to that value.
Version 1
30
108)
Which of the following is an example of a uniformly distributed random variable? A) The waiting time at a checkout counter of a supermarket. B) The failure rate of a light bulb. C) The scheduled arrival time of a cable technician. D) The scores on the Business Statistics exam.
109) Which of the following statements is FALSE regarding the given curves showing the shapes of the probability density function for various values of mean and standard deviation?
A) The black and red curves have the same mean. B) The green curve has a smaller standard deviation than the red curve. C) The green curve has a higher mean than the other two curves. D) All three curves are depicting a standard normal distribution.
Version 1
31
110) Which of the following statements is FALSE regarding the given curves showing the shapes of a probability density function for different values of a key parameter λ?
A) All three curves are depicting an exponential probability density function. B) The green curve has the highest value of the rate parameter λ. C) The blue curve has the lowest value of the rate parameter λ. D) The red curve has the lowest value of the rate parameter λ.
111) When attending a movie, patrons are interested in avoiding the pre-movie trivia games, ads, and previews. It is known that the previews begin at the scheduled movie start time and they last between 5 and 15 minutes. Assume that the time of the previews is uniformly distributed. a. Find the expected time and variance of the movie preview duration. b. What is the probability that on a given day the previews last between 10 and 12 minutes?
Version 1
32
112) Snack food companies always have a target weight when filling a box of snack crackers. It is not possible, however, to always fill boxes to the exact target weight. For a particular box of crackers, the target weight is 14 ounces. The filling machine drops between 13.5 and 15 ounces of crackers into each box. If these weights are uniformly distributed, what is the expected value of the fill weights? Does this value match the target weight? If not, what impact does this difference have on the manufacturer’s costs to produce these crackers?
113) Snack food companies always have a target weight when filling a box of snack crackers. It is not possible, however, to always fill boxes to the exact target weight. For a particular box of crackers, the target weight is 14 ounces. The filling machine drops between 13.5 and 15 ounces of crackers into each box. Let these weights be uniformly distributed. a. What is the probability that the weight in a box will be less than 14 ounces? b. What is the probability that the weight in a box will be between 14.5 and 15.5 ounces?
114) The time you must wait for an Orange Line train of the Massachusetts Bay Transit Authority follows a uniform distribution with a lower bound of 0 minutes and an upper bound of 8 minutes. Jonathan is running to catch the train to get to a meeting. He knows that the train needs to arrive within five minutes or else he will be late. What is the probability that he will be late to his meeting?
Version 1
33
115) Suppose Jennifer is waiting for a taxicab. A taxicab’s arrival time is equally likely at any constant time range in the next 12 minutes. a. Compute the expected arrival time. b. What is the probability that a taxi arrives in three minutes or less?
116) Find the following probabilities for a standard normal random variable Z. a.P(0 < Z ≤ 1.96) b.P(−1.96 ≤ Z < 0) c.P(−1.96 < Z ≤ 1.96) d.P(Z ≥ 1.96)
117) Find the following probabilities for a standard normal random variable Z. a.P(−0.25 < Z ≤ 0.50) b.P(−2.5 ≤ Z < 0) c.P(−1.45 < Z < 3.09) d.P(Z < −1.75)
Version 1
34
118) Find the value of z for which the standard normal random variable Z satisfies the following. a.P(Z ≤ z) = 0.9573 b.P(Z < z) = 0.0122 c.P(−z ≤ Z ≤ z) = 0.2434 d.P(Z ≤ z) = 0.9265
119) Given normally distributed random variable X with a mean of 10 and a variance of 4, find the following probabilities. a.P(8 < X ≤ 10) b.P(6 ≤ X < 8) c.P(X < 13) d.P(X > 4)
120) Given normally distributed random variable X with a mean of 12 and a standard deviation of 3.4, find the following probabilities. a.P(11 < X ≤ 16.4) b.P(9 ≤ X < 11.5) c.P(8.6 < X < 15.4) d.P(X > 13.6)
Version 1
35
121) A normal random variable X has a mean of 17 and a variance of 5. a. Find the value x for which P(X ≤ x) = 0.0020. b. Find the value of x for which P(X > x) = 0.0122.
122) A soft drink company fills two-liter bottles on several different lines of production equipment. The fill volumes are normally distributed with a mean of 1.97 liters and a variance of 0.04 (liter)2. a. Find the probability that a randomly selected two-liter bottle would contain between 1.95 and 2.03 liters. b. If X is the fill volume of a randomly selected two-liter bottle, find the value of x for which P(X > x) = 0.3228.
Version 1
36
123) A soft drink company fills two-liter bottles on several different lines of production equipment. The fill volumes are normally distributed with a mean of 1.97 liters and a variance of 0.04 (liter)2. a. Find the probability that a randomly selected two-liter bottle would contain more than 1.92 liters. b. If X is the fill volume of a randomly selected two-liter bottle, find the value of x for which P(X < x) = 0.6293.
124) You are considering the risk-return profile of two mutual funds for investment. The relatively risky fund promises an expected return of 9%, with a standard deviation of 12%. The relatively less risky fund promises an expected return and standard deviation of 5% and 8%, respectively. a. Which mutual fund will you pick if your objective is to minimize the probability of earning a negative return? b. Which mutual fund will you pick if your objective is to maximize the probability of earning a return of between 8% and 12%?
125) The annual return of a well-known mutual fund has historically had a mean of about 10% and a standard deviation of 21%. Suppose the return for the following year follows a normal distribution, with the historical mean and standard deviation. What is the probability that you will lose money in the next year by investing in this fund?
Version 1
37
126) The East Los Angeles Interchange is the busiest freeway interchange in the world. Last year, an average of 550,000 cars passed through the intersection per day with a standard deviation of 100,000. What is the probability more than 620,000 use the interchange on a random day? Assume the number of cars on the interchange is approximately normally distributed.
127) The weight of competition pumpkins at the Circleville Pumpkin Show in Circleville, Ohio, can be represented by a normal distribution with a mean of 703 pounds and a standard deviation of 347 pounds. a. Find the probability that a randomly selected pumpkin weighs at least 1,622 pounds. b. Find the probability that a randomly selected pumpkin weighs between 465.1 and 1,622 pounds.
128) The average annual percentage rate (APR) for credit cards held by U.S. consumers is approximately 15 percent ("Ouch - Credit Card APR Now Tops 15 Percent," Time, January 3, 2012). Suppose the APR for new credit card offers is normally distributed with a mean of 15% and a standard deviation of 4%. What APR must a credit card charge to be in the bottom 10% of all cards?
Version 1
38
129) The average annual inflation rate in the United States over the past 98 years is 3.37% and has a standard deviation of approximately 5% ( Inflationdata.com). In 1980, the inflation rate was above 13%. If the annual inflation rate is normally distributed, what is the probability that inflation will be above 13% next year?
130) The Japan Sumo Association has begun to measure the body fat of wrestlers to try to combat the growing problem of excessive obesity within the sport. As of last year, the average wrestler weighed 412 pounds. Suppose the weights of sumo wrestlers are normally distributed, with a standard deviation of 37 pounds. What is the probability that a randomly selected wrestler weighs between 350 and 450 pounds?
131) A producer has a history of making bad movies. The movies he has produced have averaged $160,000 dollars at the box office with a standard deviation of $185,000. He thinks his latest movie will be a huge hit. How much will the movie earn at the box office if it is only expected to earn that much or more 0.5% of the time? (Assume that the profit at the box office is normally distributed.)
Version 1
39
132) Suppose the amount of time customers must wait to check bags at the ticketing counter in Boston Logan Airport is exponentially distributed with a mean of 14 minutes. What is the probability that a randomly selected customer will have to wait more than 20 minutes?
133) The average wait time to see a doctor at a maternity ward is 16 minutes. What is the probability that a patient will have to wait between 20 and 30 minutes before seeing a doctor? Suppose the wait time is exponentially distributed.
134) The average wait time at a McDonald’s drive-through window is about three minutes ("The Doctor Will See You Eventually," The Wall Street Journal, October 18, 2010). Suppose the wait time is exponentially distributed. What is the probability that a randomly selected customer will have to wait no more than five minutes?
135) Jennifer is waiting for a taxicab. The average wait time for a taxi is six minutes. Suppose the wait time is exponentially distributed. What is the probability that a taxi arrives in three minutes or less?
Version 1
40
136) Customers arrive at a drive-through teller window of a bank. They stay in line when the teller is busy. The service time is exponentially distributed with a mean of four minutes. a. What is the probability that the next customer in line will take longer than seven minutes to be served? b. What is the probability that the next customer in line will take less than eight minutes to be served? c. What is the probability that the next customer in line will take between three and six minutes to be served?
137) Compute the mean and variance of a lognormal variable Y if the mean and the variance of the underlying normal variable are μ = 2; σ2 = 1.8.
138) After a heavy snow, the city of Boston spends millions of dollars plowing the streets. Suppose the amount of time the city must spend before the streets are clear follows a log-normal distribution. Further suppose that the average amount of time is 12 hours and the standard deviation is 5 hours. What percentage of the time does it take more than 10 hours for the streets to be cleared?
Version 1
41
139) Let the household income of residents of the United States be represented by Y = eX, where X is normally distributed. Last year, the mean U.S. household income was approximately $50,000. Suppose the standard deviation was $16,000. Estimate the proportion of U.S. households that have an income less than $80,000.
140) Compute the mean and variance of a lognormal variable Y if the mean and the variance of the underlying normal variable are μ = 1; σ2 = 0.5.
141) The mean household income of France is approximately 20,000 euros. Suppose the household income in France has a standard deviation of 10,000 euros and follows a log-normal distribution. Estimate the proportion of French households that have an income of more than 25,000 euros.
142) Let house prices in a rich community in Chicago be represented by Y = ex where X is normally distributed. Suppose the mean house price is $1.8 million and the standard deviation is $0.4 million. What is the proportion of the houses that are worth more than $2.5 million?
143) A project has a mean completion time of 120 days with a variance of 100 days. There is a 20% chance that the project will be completed before day ______ (rounded to nearest day). Assume the project completion time follows a normal distribution.
Version 1
42
144) Complete the following statements with respect to the given graph showing the shapes of the lognormal probability density function for various values ofσ along with μ = 0.
a. The _____ curve has the highest value of σ. b. The _____ curve has the lowest value of σ. c. The value of σ for the blue curve is _____1. d. The value of σ for the red curve is _____1.
145) A continuous random variable is characterized by uncountable values and can take on any value within an interval. ⊚ true ⊚ false
146) We are often interested in finding the probability that a continuous random variable assumes a particular value.
Version 1
43
⊚ ⊚
true false
147) The probability density function of a continuous random variable is the counterpart to the probability mass function of a discrete random variable. ⊚ true ⊚ false
148) Cumulative distribution functions can only be used to compute probabilities for continuous random variables. ⊚ true ⊚ false
149) The continuous uniform distribution describes a random variable, defined on the interval [a, b], that has an equally likely chance of assuming values within a specified range. ⊚ true ⊚ false
150) The probability density function for a continuous uniform distribution is positive for all values between −∞ and +∞. ⊚ true ⊚ false
151) The mean of a continuous uniform distribution is simply the average of the upper and lower limits of the interval on which the distribution is defined. ⊚ true ⊚ false
Version 1
44
152)
The mean and standard deviation of the continuous uniform distribution are equal. ⊚ true ⊚ false
153)
The normal probability distribution is symmetric and bell-shaped. ⊚ true ⊚ false
154) Examples of random variables that closely follow a normal distribution include the age and the class year designation of a college student. ⊚ true ⊚ false
155) Given that the probability distribution is normal, it is completely described by its mean μ > 0 and its standard deviation σ > 0. ⊚ true ⊚ false
156) Just as in the case of the continuous uniform distribution, the probability density function of the normal distribution may be easily used to compute probabilities. ⊚ true ⊚ false
157) The standard normal distribution is a normal distribution with a mean equal to zero and a standard deviation equal to one. ⊚ true ⊚ false
Version 1
45
158)
The letter Z is used to denote a random variable with any normal distribution. ⊚ true ⊚ false
159)
The standard normal table is also referred to as the z table. ⊚ true ⊚ false
160) According the empirical rule for normally distributed variables, 75% of the values fall within one standard deviation of the mean. ⊚ true ⊚ false
161) A standard normal variable Z can be transformed to the normally distributed random variable X with only mean µ known. ⊚ true ⊚ false
162) Excel’s function NORM.DIST can be used to compute probabilities for the normal distribution. ⊚ true ⊚ false
163)
The exponential distribution is related to the Poisson distribution. ⊚ true ⊚ false
164)
The lognormal distribution is clearly negatively skewed for σ > 1.
Version 1
46
⊚ ⊚
Version 1
true false
47
Answer Key Test name: Chap 06_4e 1) asymptotic 2) cumulative 3) 1.5 4) inverse 5) EXPON.DIST 6) rate 7) normal 8) 12.18 9) D 10) D 11) B 12) C 13) B 14) C 15) A 16) A 17) A 18) D 19) B 20) B 21) C 22) A 23) B 24) B 25) B 26) B Version 1
48
27) C 28) B 29) A 30) B 31) D 32) C 33) A 34) A 35) C 36) D 37) B 38) B 39) B 40) B 41) C 42) B 43) B 44) D 45) C 46) D 47) D 48) C 49) C 50) B 51) A 52) B 53) C 54) B 55) B 56) A Version 1
49
57) C 58) D 59) D 60) D 61) C 62) B 63) D 64) C 65) D 66) C 67) A 68) B 69) C 70) A 71) A 72) A 73) B 74) C 75) A 76) D 77) C 78) A 79) A 80) D 81) B 82) B 83) A 84) A 85) C 86) D Version 1
50
87) A 88) B 89) D 90) A 91) B 92) A 93) C 94) D 95) B 96) C 97) A 98) B 99) C 100) D 101) D 102) A 103) B 104) C 105) D 106) D 107) B 108) C 109) D 110) D 111) a. 10 and 8.3333 b. 0.2 112) The expected value equals 14.25 ounces, μ = (15 + 13.5)/2, which is greater than the fill target of 14 ounces. The manufacturer should adjust the machine. Since the machine is dispensing more than the target weight, the manufacturer is losing money for each product made. Therefore, the manufacturer should adjust the machine. 113) a. 0.3333 b. 0.3333
Version 1
51
114) 0.375 115) a. 6 minutes b. 0.25
116) a. 0.4750 b. 0.4750 c. 0.9500 d. 0.0250 117) a. 0.2902 b. 0.4938 c. 0.9255 d. 0.0401 118) a. 1.72 b. −2.25 c. 0.31 d. −1.45 119) a. 0.3413 b. 0.1359 c. 0.9332 d. 0.9987 120) a. 0.5178 b. 0.2527 c. 0.6827 d. 0.319
121) a. 10.56 b. 22.03 122) a. 0.1577 b. 2.062 123) a. 0.5987 b. 2.036 liters 124) a. risky fund b. less risky fund 125) 0.3169 126) 0.2420
Version 1
52
127) a. 0.0040 b. 0.7495
128) The APR must be no more than 9.87%. 129) 0.0271 130) 0.8009
131) $636,528.42. 132) 0.2397 133) 0.1331 134) 0.8111 135) 0.3935 136) a. 0.1738 b. 0.8647 c. 0.2492 137) a. 18.1741 b. 1667.90 138) 0.6009 139) 0.9517 140) 3.4903 and 7.9030 141) 0.2393 142) 0.5413 143) 112 144) a. green b. blue c. < d. ≥
145) TRUE 146) FALSE 147) TRUE 148) FALSE 149) TRUE Version 1
53
150) FALSE 151) TRUE 152) FALSE 153) TRUE 154) FALSE 155) FALSE 156) FALSE 157) TRUE 158) FALSE 159) TRUE 160) FALSE 161) FALSE 162) TRUE 163) TRUE 164) FALSE
Version 1
54
CHAPTER 07 – 4e 1) Ten random samples of 250 observations each are used to track the proportion of defects in the production process of circuit breakers, which has had a 4% defect rate. Following are the number of defective circuit breakers found in each sample: Sample
1
2
3
4
5
6
7
8
9
10
# of Defects
16
14
13
17
18
11
22
19
21
20
a.What kind of control chart should be used? b. Is the central limit theorem applicable? Explain. c. Is the production process in control? Explain.
2)
Stratified random sampling is preferred when the objective is to ______.
3)
Cluster sampling is preferred when the objective is to ______.
4)
<p>The ______ of is the same as the mean of the individual observations.
5) <p>If is normally distributed then we can transform it into a(n) ______ normal random variable.
6) <p>For making statistical inferences, it is essential that the sampling distribution of is ______ distributed.
Version 1
1
7) <p>According to the central limit theorem, the sampling distribution of approaches the normal distribution as the sample size ______.
8)
A preferred approach to quality control is the ______ approach.
9) The ______ variation in the production process is caused by specific events or factors that can usually be identified and eliminated.
10)
Bias can occur in sampling. Bias refers to ______.
A) the division of the population into overlapping groups B) the creation of strata, which are proportional to the stratum’s size C) the use of cluster sampling instead of stratified random sampling D) the tendency of a sample statistic to systematically over- or underestimate a population parameter
11)
Selection bias occurs when ______. A) the population has been divided into strata B) portions of the population are excluded from the consideration for the sample C) cluster sampling is used instead of stratified random sampling D) those responding to a survey or poll differ systematically from the nonrespondents
12)
Nonresponse bias occurs when ______.
Version 1
2
A) the population has been divided into strata B) portions of the population are excluded from the sample C) cluster sampling is used instead of stratified random sampling D) those responding to a survey or poll differ systematically from the nonrespondents
13)
Which of the following is not a form of bias? A) Portions of the population are excluded from the sample. B) Information from the sample is typical of information in the population. C) Information from the sample overemphasizes a particular stratum of the population. D) Those responding to a survey or poll differ systematically from the nonrespondents.
14)
Which of the following meets the requirements of a simple random sample?
A) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people who volunteer for the sample. B) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people chosen at random, without regard to age. C) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six males chosen at random, without regard to age. D) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include two people chosen at random under the age of 25 and four people chosen at random over 25.
15)
Which of the following meets the requirements of a cluster sample?
Version 1
3
A) A population can be divided into 50 city blocks. The sample will include one hundred people who volunteer for the sample from any city block. B) A population can be divided into 50 city blocks. The sample will include one hundred people chosen at random, without regard to the city block where they live. C) A population can be divided into 50 city blocks. The sample will include residents from two randomly chosen city blocks. D) A population can be divided into 50 city blocks. The sample will include two people chosen at random from each city block.
16)
Which of the following meets the requirements of a stratified random sample?
A) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people who volunteer for the sample. B) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six people chosen at random, without regard to age. C) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include six males chosen at random, without regard to age. D) A population contains 10 members under the age of 25 and 20 members over the age of 25. The sample will include two people chosen at random under the age of 25 and four people chosen at random over 25.
17) Which of the following is true about statistics such as the sample mean or sample proportion? A) A statistic is a constant. B) A statistic is a parameter. C) A statistic is always known. D) A statistic is a random variable.
18) Statistics are used to estimate population parameters, particularly when it is impossible or too expensive to poll an entire population. A particular value of a statistic is referred to as a(n) ______. Version 1
4
A) mean B) stratum C) estimate D) finite correction factor
19)
Which of the following is considered an estimator? A) B) µ C) σ D) σ2
20)
Which of the following is considered an estimate? A) B) C) D)
21) What is the relationship between the expected value of the sample mean and the expected value of the population? A) B) C) D)
22)
How does the variance of the sample mean compare to the variance of the population?
Version 1
5
A) It is smaller and therefore suggests that averages have less variation than individual observations. B) It is larger and therefore suggests that averages have less variation than individual observations. C) It is smaller and therefore suggests that averages have more variation than individual observations. D) It is larger and therefore suggests that averages have more variation than individual observations.
23) What is the relationship between the standard deviation of the sample mean and the population standard deviation? A) B) C) D)
24) A nursery sells trees of different types and heights. These trees average 90 inches in height with a standard deviation of 17 inches. Suppose that 80 pine trees are sold for planting at City Hall. What is the standard deviation for the sample mean? A) 3.47 B) 4 C) 1.90 D) 17
25) A nursery sells trees of different types and heights. These trees average 60 inches in height with a standard deviation of 16 inches. Suppose that 75 pine trees are sold for planting at City Hall. What is the standard deviation for the sample mean?
Version 1
6
A) 1.85 B) 3.41 C) 4 D) 16
26) If a population is known to be normally distributed, what can be said of the sampling distribution of the sample mean drawn from this population? A) For any sample size n, the sampling distribution of the sample mean is normally distributed. B) For a sample size n < 50, the sampling distribution of the sample mean is normally distributed. C) For a sample size n < 30, the sampling distribution of the sample mean is normally distributed. D) For a sample size n > 30, the sampling distribution of the sample mean is normally distributed.
27) Over the entire six years that students attend an Ohio elementary school, they are absent, on average, 28 days due to influenza. Assume that the standard deviation over this time period is σ = 9 days. Upon graduation from elementary school, a random sample of 36 students is taken and asked how many days of school they missed due to influenza. What is the expected value for the sampling distribution of theaverage number of school days missed due to influenza? A) 6 B) 9 C) 28 D) 168
Version 1
7
28) Over the entire six years that students attend an Ohio elementary school, they are absent, on average, 28 days due to influenza. Assume that the standard deviation over this time period is σ = 9 days. Upon graduation from elementary school, a random sample of 36 students is taken and asked how many days of school they missed due to influenza. What is the standard deviation for the sampling distribution of the average number of school days missed due to influenza? A) 1.22 B) 1.50 C) 2.25 D) 9.00
29) Over the entire six years that students attend an Ohio elementary school, they are absent, on average, 28 days due to influenza. Assume that the standard deviation over this time period is σ = 9 days. Upon graduation from elementary school, a random sample of 36 students is taken and asked how many days of school they missed due to influenza. The probability that the sample mean is less than 30 school days is ______. A) 0.0918 B) 0.4129 C) 0.5871 D) 0.9088
30) Over the entire six years that students attend an Ohio elementary school, they are absent, on average, 28 days due to influenza. Assume that the standard deviation over this time period is σ = 9 days. Upon graduation from elementary school, a random sample of 36 students is taken and asked how many days of school they missed due to influenza. The probability that the sample mean is between 25 and 30 school days is _______. A) 0.0228 B) 0.0918 C) 0.8860 D) 0.9082
Version 1
8
31) Suppose that, on average, electricians earn approximately µ = $54,100 per year in the United States. Assume that the distribution for electricians’ yearly earnings is normally distributed and that the standard deviation is σ = $13,000. Given a sample of fourelectricians, what is the standard deviation for the sampling distribution of the sample mean? A) 6,500 B) 58,500 C) 39,000 D) 13,000
32) Suppose that, on average, electricians earn approximately µ = $54,000 per year in the United States. Assume that the distribution for electricians’ yearly earnings is normally distributed and that the standard deviation is σ = $12,000. Given a sample of four electricians, what is the standard deviation for the sampling distribution of the sample mean? A) 6,000 B) 12,000 C) 36,000 D) 54,000
33) Suppose that, on average, electricians earn approximately µ = $54,000 per year in the United States. Assume that the distribution for electricians’ yearly earnings is normally distributed and that the standard deviation is σ = $12,000. What is the probability that the average salary of four randomly selected electricians exceeds $60,000?
Version 1
9
A) 0.1587 B) 0.3085 C) 0.6915 D) 0.8413
34) Suppose that, on average, electricians earn approximately µ = $54,000 per year in the United States. Assume that the distribution for electricians’ yearly earnings is normally distributed and that the standard deviation is σ = $12,000. What is the probability that the average salary of four randomly selected electricians is less than $50,000? A) 0.2525 B) 0.3707 C) 0.6293 D) 0.7486
35) Suppose that, on average, electricians earn approximately µ = $54,000 per year in the United States. Assume that the distribution for electricians’ yearly earnings is normally distributed and that the standard deviation is σ = $12,000 dollars. What is the probability that the average salary of four randomly selected electricians is more than $50,000 but less than $60,000? A) 0.5889 B) 0.7486 C) 0.8413 D) 0.9048
Version 1
10
36) Susan has been on a bowling team for 13 years. After examining all of her scores over that period of time, she finds that they follow a normal distribution. Her average score is 230, with a standard deviation of 22. What is the probability that in a one-game playoff, her score is more than 236? A) 0.3247 B) 0.1519 C) 0.0167 D) 0.6753
37) Susan has been on a bowling team for 14 years. After examining all of her scores over that period of time, she finds that they follow a normal distribution. Her average score is 225, with a standard deviation of 13. What is the probability that in a one-game playoff, her score is more than 227? A) 0.2676 B) 0.4244 C) 0.5596 D) 0.7324
38) Susan has been on a bowling team for 14 years. After examining all of her scores over that period of time, she finds that they follow a normal distribution. Her average score is 225, with a standard deviation of 13. If during a typical week Susan bowls 16 games, what is the probability that her average score is more than 230? A) 0.0620 B) 0.3520 C) 0.6480 D) 0.9382
Version 1
11
39) Susan has been on a bowling team for 14 years. After examining all of her scores over that period of time, she finds that they follow a normal distribution. Her average score is 225, with a standard deviation of 13. If during a typical week Susan bowls 16 games, what is the probability that her average score for the week is between 220 and 228? A) 0.0618 B) 0.2390 C) 0.7600 D) 0.8212
40) Susan has been on a bowling team for 14 years. After examining all of her scores over that period of time, she finds that they follow a normal distribution. Her average score is 225, with a standard deviation of 13. If during a typical month Susan bowls 64 games, what is the probability that her average score in this month is above 227? A) 0.1092 B) 0.4404 C) 0.5596 D) 0.8907
41) Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. What is the probability that a class of 15 students will have a class average greater than 70 on Professor Elderman’s final exam? A) 0.0262 B) 0.6915 C) 0.9738 D) Cannot be determined.
Version 1
12
42) Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 74 and a standard deviation of 12. What is the probability that a class of 36 students will have an average greater than 70 on Professor Elderman’s final exam? A) 0.6700 B) 0.0229 C) 0.9772 D) 0.2870
43) Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. What is the probability that a class of 36 students will have an average greater than 70 on Professor Elderman’s final exam? A) 0.0014 B) 0.3085 C) 0.6915 D) 0.9986
44) Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. Professor Elderman offers his class of 36 a pizza party if the class average is above 80. What is the probability that he will have to deliver on his promise?
Version 1
13
A) 0.0228 B) 0.3707 C) 0.6293 D) 0.9772
45) Professor Elderman has given the same multiple-choice final exam in his Principles of Microeconomics class for many years. After examining his records from the past 10 years, he finds that the scores have a mean of 76 and a standard deviation of 12. What is the probability Professor Elderman’s class of 36 has a class average below 78? A) 0.1587 B) 0.5675 C) 0.8413 D) Cannot be determined.
46) According to the central limit theorem, the distribution of the sample means is normal if ________. A) the underlying population is normal B) the sample size n ≥ 30 C) the standard deviation of the population is known D) both the underlying population is normal and the sample size n ≥ 30 are correct
47) The central limit theorem states that, for any distribution, as n gets larger, the sampling distribution of the sample mean ______. A) becomes larger B) becomes smaller C) is closer to a normal distribution D) is closer to the standard deviation
Version 1
14
48) A random sample of size 36 is taken from a population with mean µ = 20 and standard deviation σ = 5. What are the expected value and the standard deviation for the sampling distribution of the sample mean? A) 20 and 0.83 B) 3.425 and 0.83 C) 3.425 and 2.66 D) 20 and 5
49) A random sample of size 36 is taken from a population with mean µ = 17 and standard deviation σ = 6. What are the expected value and the standard deviation for the sampling distribution of the sample mean? A) 0.425 and 1.00 B) 0.425 and 2.83 C) 17 and 1.00 D) 17 and 2.83
50) A random sample of size 36 is taken from a population with mean µ = 30 and standard deviation σ = 5. The probability that the sample mean is greater than 32 is ______. A) 0.6908 B) 0.2820 C) 0.0082 D) 0.4170
Version 1
15
51) A random sample of size 36 is taken from a population with mean µ = 17 and standard deviation σ = 6. The probability that the sample mean is greater than 18 is ______. A) 0.1587 B) 0.4325 C) 0.5675 D) 0.8413
52) A random sample of size 81 is taken from a population with mean µ = 24 and standard deviation σ = 7. The probability that the sample mean is less than22 is ______. A) 0.3530 B) 0.0051 C) 0.6116 D) 0.9949
53) A random sample of size 36 is taken from a population with mean µ = 17 and standard deviation σ = 6. The probability that the sample mean is less than 15 is ______. A) 0.0228 B) 0.3707 C) 0.6293 D) 0.9772
54) A random sample of size 36 is taken from a population with mean µ = 17 and standard deviation σ = 6. The probability that the sample mean is between 15 and 18 is ______.
Version 1
16
A) 0.0228 B) 0.8186 C) 0.8413 D) 0.8641
55) Using the central limit theorem, applied to the sampling distribution of the sample proportion, what conditions must be met? A) B) C) D)
and and and and
56) A random sample of size 100 is taken from a population described by the proportion p = 0.60. What are the expected value and the standard error for the sampling distribution of the sample proportion? A) 0.006 and 0.0024 B) 0.060 and 0.049 C) 0.600 and 0.0024 D) 0.600 and 0.049
57) A random sample of size 100 is taken from a population described by the proportion p = 0.60. The probability that the sample proportion is greater than 0.62 is ______.
Version 1
17
A) 0.3415 B) 0.4082 C) 0.6591 D) 1</p>
58) A random sample of size 100 is taken from a population described by the proportion p = 0.60. The probability that the sample proportion is less than 0.55 is ______. A) ≈ 0 B) 0.1537 C) 0.3669 D) 0.8461
59) A random sample of size 100 is taken from a population described by the proportion p = 0.60. The probability that the sample proportion is between 0.55 and 0.62 is ______. A) 0.1539 B) 0.5047 C) 0.6591 D) 0.8130
60) A university administrator expects that 25% of students in a core course will receive an A. He looks at the grades assigned to 60 students. What are the expected value and the standard error for the proportion of students that receive an A?
Version 1
18
A) 0.25 and 0.0031 B) 0.25 and 0.0559 C) 15.0 and 0.0031 D) 15.0 and 0.0559
61) A university administrator expects that 25% of students in a core course will receive an A. He looks at the grades assigned to 60 students. The probability that the proportion of students that receive an A is 0.20 or less is ______. A) 0.1855 B) 0.6266 C) 0.8133 D) 0.8900
62) A university administrator expects that 25% of students in a core course will receive an A. He looks at the grades assigned to 60 students. The probability that the proportion of students who receive an A is between 0.20 and 0.35 is A) 0.1867. B) 0.7776. C) 0.8133. D) 0.9633.
63) A university administrator expects that 25% of students in a core course will receive an A. He looks at the grades assigned to 60 students. The probability that the proportion of students who receive an A is not between 0.20 and 0.30 is ______.
Version 1
19
A) 0.1867 B) 0.3711 C) 0.6266 D) 0.8133
64) The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 65.6% (Calculatedrisk.com). A marketing company asks 150 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. What are the expected value and the standard error for a labor participation rate in the company’s sample? A) 76.44 and 0.0019 B) 0.656 and 0.0388 C) 76.44 and 0.0388 D) 0.656 and 0.0019
65) The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 63.7% (Calculatedrisk.com). A marketing company asks 120 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. What are the expected value and the standard error for a labor participation rate in the company’s sample? A) 0.637 and 0.0019 B) 0.637 and 0.0439 C) 76.44 and 0.0019 D) 76.44 and 0.0439
Version 1
20
66) The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 63.7% (Calculatedrisk.com). A marketing company asks 120 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. For the company’s sample, the probability that the proportion of people who are in the labor force is greater than 0.65 is ______. A) 0.1179 B) 0.3000 C) 0.3836 D) 0.6179
67) The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 63.7% (Calculatedrisk.com). A marketing company asks 120 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. What is the probability that fewer than 60% of those surveyed are members of the labor force? A) 0.1996 B) 0.7995 C) 0.8400 D) 0.9706
Version 1
21
68) The labor force participation rate is the number of people in the labor force divided by the number of people in the country who are of working age and not institutionalized. The BLS reported in February 2012 that the labor force participation rate in the United States was 63.7% (Calculatedrisk.com). A marketing company asks 120 working-age people if they either have a job or are looking for a job, or, in other words, whether they are in the labor force. What is the probability that between 60% and 62.5% of those surveyed are members of the labor force? A) 0.0243 B) 0.1926 C) 0.2005 D) 0.3936
69) Super Bowl XLVI was played between the New York Giants and the New England Patriots in Indianapolis. Due to a decade-long rivalry between the Patriots and the city’s own team, the Colts, most Indianapolis residents were rooting heartily for the Giants. Suppose that 90% of Indianapolis residents wanted the Giants to beat the Patriots. What is the probability that, of a sample of 100 Indianapolis residents, at least 15% were rooting for the Patriots in Super Bowl XLVI? A) 0.0300 B) 0.0478 C) 0.4763 D) 0.9525
70) Super Bowl XLVI was played between the New York Giants and the New England Patriots in Indianapolis. Due to a decade-long rivalry between the Patriots and the city’s own team, the Colts, most Indianapolis residents were rooting heartily for the Giants. Suppose that 90% of Indianapolis residents wanted the Giants to beat the Patriots. What is the probability that from a sample of 100 Indianapolis residents, fewer than 95% were rooting for the Giants in Super Bowl XLVI?
Version 1
22
A) 0.0300 B) 0.0475 C) 0.4763 D) 0.9522
71) Super Bowl XLVI was played between the New York Giants and the New England Patriots in Indianapolis. Due to a decade-long rivalry between the Patriots and the city’s own team, the Colts, most Indianapolis residents were rooting heartily for the Giants. Suppose that 91% of Indianapolis residents wanted the Giants to beat the Patriots. What is the probability that from a sample of 40 Indianapolis residents, fewer than 95% were rooting for the Giants in Super Bowl XLVI? A) 0.0474 B) 0.1469 C) 0.8531 D) Cannot be determined
72) Super Bowl XLVI was played between the New York Giants and the New England Patriots in Indianapolis. Due to a decade-long rivalry between the Patriots and the city’s own team, the Colts, most Indianapolis residents were rooting heartily for the Giants. Suppose that 90% of Indianapolis residents wanted the Giants to beat the Patriots. What is the probability that from a sample of 40 Indianapolis residents, fewer than 95% were rooting for the Giants in Super Bowl XLVI? A) 0.0474 B) 0.1469 C) 0.8531 D) Cannot be determined
Version 1
23
73) Super Bowl XLVI was played between the New York Giants and the New England Patriots in Indianapolis. Due to a decade-long rivalry between the Patriots and the city’s own team, the Colts, most Indianapolis residents were rooting heartily for the Giants. Suppose that 90% of Indianapolis residents wanted the Giants to beat the Patriots. What is the probability that from a sample of 200 Indianapolis residents, fewer than 170 were rooting for the Giants in Super Bowl XLVI? A) 0.0092 B) 0.0212 C) 0.4954 D) 0.9908
74) According to the 2011 Gallup daily tracking polls (https://www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that at least 60% of a random sample of 200 Mississippi residents identify themselves as conservative? A) 0.0307 B) 0.3530 C) 0.4847 D) 0.9693
75) According to the 2011 Gallup daily tracking polls (https://www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that at least 100 but fewer than 115 respondents of a random sample of 200 Mississippi residents identify as conservative? A) 0.1685 B) 0.3370 C) 0.7099 D) 0.8770
Version 1
24
76) According to the 2011 Gallup daily tracking polls (https://www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 58.7 percent of its residents identifying themselves as conservative. What is the probability that at least 50 respondents of a random sample of 100 Mississippi residents do not identify themselves as conservative? A) 0.0492 B) 0.4436 C) 0.5425 D) 0.0386
77) According to the 2011 Gallup daily tracking polls (https://www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that at least 50 respondents of a random sample of 100 Mississippi residents do not identify themselves as conservative? A) 0.0499 B) 0.2478 C) 0.4966 D) 0.7517
78) According to the 2011 Gallup daily tracking polls (https://www.gallup.com, February 3, 2012), Mississippi is the most conservative U.S. state, with 53.4 percent of its residents identifying themselves as conservative. What is the probability that fewer than 45 respondents of a random sample of 100 Mississippi residents do not identify themselves as conservative? A) 0.0499 B) 0.1873 C) 0.3742 D) 0.6255
Version 1
25
79) <p>Under what condition is the finite population correction factor used for computing the standard error of and ? A) n ≥ Np B) n < Np C) n ≥ 0.05N D) n < 0.05N
80)
The finite correction factor is always ______.
A) <p>less than one, and therefore increases the standard deviations of and computed under the assumption of infinite population</p> B) <p>less than one, and therefore decreases the standard deviations of and computed under the assumption of infinite population</p> C) <p>greater than one, and therefore increases the standard deviations of and computed under the assumption of infinite population</p> D) <p>greater than one, and therefore decreases the standard deviations of and computed under the assumption of infinite population</p>
81) A local company makes snack-size bags of potato chips. Each day, the company produces batches of 480 snack-size bags using a process designed to fill each bag with an average of 2 ounces of potato chips. However, due to imperfect technology, the actual amount placed in a given bag varies. Assume the amount placed in each of the 480 bags is normally distributed and has a standard deviation of 0.1 ounce. What is the probability that a sample of 40 bags has an average weight of at least 2.02 ounces? A) 0.0152 B) 0.1086 C) 0.4255 D) 0.0935
Version 1
26
82) A local company makes snack-size bags of potato chips. Each day, the company produces batches of 400 snack-size bags using a process designed to fill each bag with an average of 2 ounces of potato chips. However, due to imperfect technology, the actual amount placed in a given bag varies. Assume the amount placed in each of the 400 bags is normally distributed and has a standard deviation of 0.1 ounce. What is the probability that a sample of 40 bags has an average weight of at least 2.02 ounces? A) 0.0150 B) 0.0915 C) 0.1038 D) 0.4207
83) Suppose 35% of homes in a Miami, Florida, neighborhood are under water (in other words, the amount due on the mortgage is larger than the value of the home). There are 160 homes in the neighborhood and 30 of those homes are owned by your friends. What is the probability that fewer than 30% of your friends’ homes are under water? A) 0.2627 B) 0.2843 C) 0.6400 D) 0.7389
84) Successful firms must focus on the quality of the products and services they offer. Which of the following factors does not contribute to the quest for quality? A) Global competition B) Consumer expectations C) Technological advances D) First mover advantages
85)
Acceptance sampling is _______.
Version 1
27
A) a division of the population into strata B) a plot of calculated statistics of the production process over time C) an inspection of a portion of the products at the completion of the production process D) a determination of a point at which the production process does not conform to specifications
86)
The detection approach to statistical quality control ______.
A) divides the population into strata B) inspects a portion of the products at the completion of the production process C) determines at which point the production process does not conform to specifications D) uses the finite correction factor when the sample size is not much smaller than the population size
87) In any production process, variation in the quality of the end product is inevitable. Chance variation, or common variation, refers to ______. A) the variation caused by stratified random sampling B) the variation caused by the use of the finite correction factor C) specific events or factors that can usually be identified and eliminated D) a number of randomly occurring events that are part of the production process
88) In any production process, variations in the quality of the end product are inevitable. Assignable variation refers to ______. A) the variation caused by stratified random sampling B) the variation caused by the use of the finite correction factor C) specific events or factors that can usually be identified and eliminated D) a number of randomly occurring events that are part of the production process
Version 1
28
89) A local company makes snack-size bags of potato chips. The company produces batches of 400 snack-size bags using a process designed to fill each bag with an average of 2 ounces of potato chips. However, due to imperfect technology, the actual amount placed in a given bag varies. Assume the population of filling weights is normally distributed with a standard deviation of 0.1 ounce. The company periodically weighs samples of 10 bags to ensure the proper filling process. The last five sample means, in ounces, were 1.99, 2.02, 2.07, 1.96, and 2.01. Is the production process under control? A) No, because the sample means show a downward trend. B) Yes, because the sample means show a downward trend. C) No, because the sample means fall within the upper and lower control limits. D) Yes, because the sample means fall within the upper and lower control limits.
90) A manufacturing process produces computer chips in batches of 300. The firm believes that the percent of defective computer chips is 2%. If in five batches the percent of chips defective were 3%, 8%, 1%, 2%, and 7%, how many of these fell outside of the upper or lower control limits for the proportion of defective computer chips in a batch? A) 0 B) 1 C) 2 D) 3
91) For a numerical variable, the most appropriate control chart to monitor central tendency is a(n) A) chart.</p> B) chart.</p> C) c chart. D) s chart.
Version 1
29
92) For a categorical variable, the most appropriate control chart to monitor the proportion of a certain characteristic is a(n) A) chart.</p> B) chart.</p> C) c chart. D) s chart.
93) <p>A random sample of 120 cast aluminum pots is taken from a production line once every day. The number of defective pots is counted. The proportion of defective pots has been closely examined in the past and is believed to be 0.05. What are the upper and lower control limits for the chart? A) −0.0097 and 0.1097 B) 0.0 and 0.1097 C) 0.1097 and −0.0097 D) 0.1097 and 0.0
94) A random sample of 150 cast aluminum pots is taken from a production line once every day. The number of defective pots is counted. The proportion of defective pots has been closely examined in the past and is believed to be 0.04. The sample proportions for the week are shown in the accompanying table. Day of the Week
Proportion Defective
Monday
0.06
Tuesday
0.07
Wednesday
0.04
Thursday
0.06
Friday
0.05
Is the production process in control?
Version 1
30
A) No, because the sample proportions show a downward trend B) No, because the sample proportions fall within the upper and lower control limits C) Yes, because the sample proportions show a downward trend D) Yes, because the sample proportions fall within the upper and lower control limits
95) John would like to conduct a survey in his neighborhood to get homeowners’ opinion on the Delmarva proposal to switch to natural gas. Which of the following is an example of a stratified sample? A) John randomly chooses three streets and selects every third house on those streets. B) John selects the first 25 homes that he passes as he walks into the entrance of the development. C) John selects every third house on each street. D) John divides the population into two-story, split-level, and ranch houses. Then, he selects a proportional number of houses from each group.
96) Peggy would like to conduct a survey in her neighborhood to get homeowners’ opinions on the Delmarva proposal to switch to natural gas. Which of the following is an example of a convenience sample? A) John randomly chooses three streets and selects every third house on those streets. B) John selects the first 25 homes that he passes as he walks into the entrance of the development. C) John selects every third house on each street. D) John divides the population into two-story, split-level, and ranch houses. Then, he selects a proportional number of houses from each group.
97) Suppose the average math SAT score for students enrolled at a local community college is 490.4 with a standard deviation of 63.7. A random sample of 49 students has been selected. The standard error of the mean for this sample is ______.
Version 1
31
A) 10.6 B) 9.1 C) 63.7 D) 28.4
98) According to the IRS, the average refund in the 2011 tax year was $3,109. Assuming that the standard deviation for these refunds was $874, what is the standard error of the sample mean for a random sample of 50 tax returns? A) $74.66 B) $100.56 C) $123.60 D) $152.25
99) According to the Bureau of Labor Statistics it takes an average of 22 weeks for someone over 55 to find a new job. Assume that the probability distribution is normal and that the standard deviation is two weeks. What is the probability that eight workers over the age of 55 take an average of more than 20 weeks to find a job? A) 0.8977 B) 0.6777 C) 0.7788 D) 0.9977
100) According to the Bureau of Labor Statistics it takes an average of 21 weeks for young workers to find a new job. Assume that the probability distribution is normal and that the standard deviation is two weeks. What is the probability that 25 young workers average less than 20 weeks to find a job?
Version 1
32
A) 0.0062 B) 0.0387 C) 0.0162 D) 0.0312
101) According to the Bureau of Labor Statistics it takes an average of 16 weeks for young workers to find a new job. Assume that the probability distribution is normal and that the standard deviation is two weeks. What is the probability that 20 young workers average less than 15 weeks to find a job? A) 0.0225 B) 0.0450 C) 0.0375 D) 0.0127
102) Suppose the local Best Buy store averages 522 customers every day entering the facility with a standard deviation of 124 customers. A random sample of 40 business days was selected. What is the probability that the average number of customers in the sample is between 530 and 540? A) 0.2387 B) 0.1623 C) 0.3057 D) 0.0572
103) A key metric in the cell phone industry is average revenue per user (ARPU), which represents the average dollar amount that a customer spends per store visit. In 2011, AT&T reported their ARPU as $63.81. Suppose the standard deviation for this population is $21.70. What is the probability that the ARPU will be between $60 and $63 from a random sample of 38 customers?
Version 1
33
A) 0.5559 B) 0.2695 C) 0.4090 D) 0.1396
104) A key metric in the cell phone industry is average revenue per user (ARPU), which represents the average dollar amount that a customer spends per store visit. In 2011, AT&T reported their ARPU as $63.76. Suppose the standard deviation for this population is $22.50. What is the probability that the ARPU will be between $60 and $63 from a random sample of 38 customers? A) 0.5517 B) 0.4168 C) 0.2661 D) 0.1515
105) According to the recent report, scientists in New England say they have identified a set of genetic variants that predicts extreme longevity with 77% accuracy. Assume 150 patients decide to get their genome sequenced. If the claim by scientists is accurate, what is the probability that more than 120 patients will get a correct diagnosis for extreme longevity? A) 0.5151 B) 0.2219 C) 0.1515 D) 0.1913
106) According to the recent report, scientists in New England say they have identified a set of genetic variants that predicts extreme longevity with 77% accuracy. Assume 150 patients decide to get their genome sequenced. If the claim by scientists is accurate, what is the probability that fewer than 70% of the patients will get a correct diagnosis for extreme longevity?
Version 1
34
A) 0.1422 B) 0.0722 C) 0.2214 D) 0.0208
107) We are told that 16% of college graduates, under the age of 26 are unemployed. What is the probability that at least 200 out of 235 college graduates under age 26 are employed? A) 0.3218 B) 0.6782 C) 0.1075 D) 0.9231
108) We are told that 8% of college graduates, under the age of 25 are unemployed. What is the probability that at least 200 out of 215 college graduates under age 25 are employed? A) 0.8925 B) 0.7088 C) 0.2901 D) 0.1075
109) The finite population correction factor for the standard error of the mean should be used when the ratio of the sample size to the size of population exceeds ______. A) 1% B) 2% C) 3% D) 5%
Version 1
35
110) A packaging company strives to maintain a constant temperature for its packages that require a specific temperature range. It is believed that the temperature of packages follows a normal distribution with a mean of 7 degrees Celsius and a standard deviation of 0.1 degree Celsius. Inspectors take weekly samples for 5 weeks of eight randomly selected boxes and report their temperatures. The five weekly sample means are shown in the table below. Week 1
4.85
Week 2
6.91
Week 3
6.97
Week 4
5.13
Week 5
4.98
How many points are outside the control limits? A) 1 B) 3 C) 2 D) 4
111) A packaging company strives to maintain a constant temperature for its packages that require a specific temperature range. It is believed that the temperature of packages follows a normal distribution with a mean of 5 degrees Celsius and a standard deviation of 0.3 degree Celsius. Inspectors take weekly samples for 5 weeks of eight randomly selected boxes and report their temperatures. The five weekly sample means are shown in the table below. Week 1
4.97
Week 2
5.16
Week 3
5.12
Week 4
5.35
Week 5
5.50
How many points are outside the control limits? A) 1 B) 2 C) 3 D) 4
Version 1
36
112) A survey is designed to collect data on students’ evaluations of their instructor’s teaching performance. Which of the following situations most likely results in social-desirability biases? A) The survey is administered in class with the instructor in the room the entire time. B) The survey is conducted during the first week of class. C) The survey takes over thirty minutes to complete. D) The survey is mailed to students’ home address during exam week.
113) In order to increase participation, survey respondents will enter for a chance to win one of fifty $10 gift cards upon completion of the survey. There are 785 survey respondents. Which of the following is R’s function to generate a simple random sample of fifty gift card winners? A) randbetween(785,1) B) sample(1:785, 50) C) sample_draw(1:785, 50) D) randbetween(1,785)
114) Which of the following statements regarding the sampling distribution of the sample mean is TRUE? A) The expected value of the sample mean from a large sample is smaller than that from a small sample. B) The variance of the sample mean is equal to the variance of all individual observations in the population. C) The variance of the sample mean is smaller than the variance of all individual observations in the population. D) The expected value of the sample mean from a large sample is greater than that from a small sample.
Version 1
37
115)
The graph below shows the population distribution of variable X with expected value µ.
<p>Which of the following statements is TRUE regarding the given sampling distribution of the sample mean, , with different sample size?
A) The green curve has a larger sample size than the brown curve. B) The green curve has a smaller sample size than the brown curve. C) The green curve has the same sample size as the brown curve. D) The green curve is representative of the central limit theorem.
116) According to the Bureau of Justice Statistics, 10% of persons age 16 or older had been victims of identity theft during the prior 12 months in 2016. What is the minimal sample size of a random sample of persons age 16 or older to generate a normal distribution approximation? A) 30 B) 10 C) 50 D) 20
117) Sixty percent of 800 students in a Business Statistics class are female. If you want to make probability estimates of the sample proportion of female students without applying the finite population correction factor, what minimal sample size of the random sample should be used?
Version 1
38
A) 80 B) 30 C) 100 D) 40
118) The California Department of Education wants to gauge the difficulty of a new exam by having a sample of students at a particular school take the exam. The quality of the students at the chosen school varies widely and the school administrators are allowed to choose who gets to take the exam. The administrators have a strong incentive for the school to do well on the exam. Do you think the results will represent the true ability of the students at school? What kind of bias, if any, do you think will be present? Explain.
119) The campaign manager for a candidate for governor in Arizona wants to conduct a poll to better understand his candidate’s chances for the upcoming election. a.What is the population of interest? b.Why may the poll be biased if a simple random sample of voters in the last gubernatorial election (four years prior) is taken?
120) It is known that college students at a local community college study 12 hours per week with a standard deviation of five hours. What are the expected value and variance for a sample of nine students?
Version 1
39
121) A fast-food restaurant uses an average of 110 grams of meat per burger patty. Suppose the amount of meat in a burger patty is normally distributed with a standard deviation of 20 grams. What is the probability that the average amount of meat in four randomly selected burgers is less than 105 grams?
122) Suppose residents in a well-to-do neighborhood pay an average overall tax rate of 25% with a standard deviation of 8%. Assume tax rates are normally distributed. What is the probability that the mean tax rate of 16 randomly selected residents is between 20% and 30%.
123) Suppose the average casino patron in Las Vegas loses $110 per day, with a standard deviation of $700. Assume winnings/losses are normally distributed. a.What is the probability that a random group of nine people averages more than $500 in winnings on their one-day trip to Las Vegas? b.What is the probability that a random group of nine people averages more than $500 in losses on their one-day trip to Las Vegas?
124) A ski resort gets an average of 2000 customers per weekday with a standard deviation of 800 customers. Assume the underlying distribution is normal. What is the probability a ski resort averages between 1500 customers and 3000 customers per weekday over the course of four weekdays?
Version 1
40
125) A mining company made some changes to their mining process in an attempt to save fuel. Before the changes were made, it took an average of 20 gallons of diesel fuel to mine 1,000 pounds of copper. Suppose the standard deviation of fuel used per 1,000 pounds of copper mined is 6 gallons. After the changes were made, the company only used an average of 18 gallons of diesel for the next 30,000 pounds of copper mined. a.How unusual would it be to get a sample average of 18 gallons or less for 30,000 pounds of copper mined if the changes to the mining process had no effect? b.Do you think the changes in the mining process actually lowered the fuel used? Explain.
126) A gym knows that each member, on average, spends 70 minutes at the gym per week, with a standard deviation of 20 minutes. Assume the amount of time each customer spends at the gym is normally distributed. a.What is the probability that a randomly selected customer spends less than 65 minutes at the gym? b.Suppose the gym surveys a random sample of 49 members about the amount of time they spend at the gym each week. What are the expected value and standard deviation (standard error) of the sample mean of the time spent at the gym? c.If 49 members are randomly selected, what is the probability that the average time spent at the gym exceeds 75 minutes?
Version 1
41
127) A book publisher knows that it takes an average of nine business days from when the material for the book is finalized until the first edition is printed and ready to sell. Suppose the exact amount of time has a standard deviation of four days. a.Suppose the publisher examines the printing time for a sample of 36 books. What is the probability that the sample mean time is shorter than eight days? b.Suppose the publisher examines the printing time for a sample of 36 books. What is the probability that the sample mean time is between 7 and 10 days? c.Suppose the publisher signs a contract for the printer to print 100 books. If the average printing time for the 100 books is longer than 9.3 days, the printer must pay a penalty. What is the probability the penalty clause will be activated? d.Suppose the publisher signs a contract for the printer to print 10 books. If the average printing time for the 10 books is longer than 9.7 days, the printer must pay a penalty. What is the probability the penalty clause will be activated?
128) In a large metropolitan area, the top providers for television and Internet services are a phone company, a satellite company, and a cable company. The satellite company serves 43% of the homes in the area. What is the probability that in a survey of 1,000 homes, more than 447 of them are served by the satellite company?
129) In early 2012, the U.S. Congress approval rating was approximately 10% (Reuters.com). In a poll of 400 Americans, what is the probability that their approval rating is between 8% and 12%?
Version 1
42
130) A tutoring company claims that 75% of the high school students who hire one of their tutors will improve their grades. a.In a sample of 100 high school students, what is the probability that 80% or more improved their grades? b.In a sample of 200 high school students, what is the probability that 80% or more improved their grades? c.Comment on the reason for the difference between the computed probabilities in parts a and b.
131) A school is required by the government to give some randomly chosen students a standardized test. From previous experience, the school knows about 68% of their students will receive passing scores in math and English. To improve funding, the school needs to score at least 70% on the standardized test. This year the school can decide if it wants to test 100 or 200 students. Should the school test 100 or 200 students? Explain the answer.
132) The office of career services at a major university knows that 74% of its graduates find full-time positions in the field of their choosing within six months of graduation. Suppose the office of career services surveys 25 alumni six months after graduation. a.What is the probability that at least 80% of the alumni have a job in the field of their choosing? b.What is the probability that between 60% and 76% of the alumni have a job in the field of their choosing? c.What is the probability that fewer than 60% of the alumni have a job in the field of their choosing?
Version 1
43
133) Administrative assistants in a local university have been asked to prove their proficiency in the use of spreadsheet software by taking a proficiency test. Historically, the mean test score has been 74 with a standard deviation of 4. A random sample of size 40 is taken from the 100 administrative assistants and asked to complete the proficiency test. a.Calculate the expected value and the standard deviation (standard error) of the sample mean. b.What is the probability that the sample mean score is more than 75, the predetermined passing score?
134) In a small town, there are 3,000 registered voters. An editor of a local newspaper would like to predict the outcome of the next election; in particular, he is interested in the likelihood that Eli Brady will be elected. The editor believes that Eli, a local hero, will garner 54% of the vote. A poll of 500 registered voters is taken. Assuming that the editor’s belief is true, calculate: a.The expected value and the standard deviation (standard error) of the sample proportion. b.The probability that the sample proportion score is more than 0.58.
Version 1
44
135) A random sample of nine cast aluminum pots is taken from a production line once every hour. The interior diameter of the pots is measured and the sample mean is calculated. The target for the diameter is 12 inches and the standard deviation for the pot diameter is 0.05 inches. Assume the pot diameter is normally distributed. a.Construct the centerline and the upper and lower control limits for the chart. b.The means of the samples for a given eight-hour day are 12.01 12.06 11.97 12.08 11.92 11.95 11.97 12.04. Does it appear that the process is under control? Explain.
136) In a recent investigation, the National Highway Traffic Safety Administration (NHTSA) found that the Chevrolet Volt and other electric vehicles do not pose a greater risk of fire than gasoline-powered vehicles (The Boston Globe, January 25, 2012). Specifically, it was determined that “no discernible defect trend exists.” Suppose a consumer advocacy group wants to verify some of these claims by constructing a chart. The group expects 2% of electric cars to catch fire each month. For each of the last six months, 500 electric car owners are asked if their cars have caught fire. These sample proportions are obtained: 0.010 0.020 0.015 0.030 0.025 0.015. a.Assuming that the group expectation is correct, construct the centerline and the upper and lower control limits for the chart. b.Do the consumer group’s findings support those of the NHTSA? Explain the answer.
Version 1
45
137) A bottled water plant uses a production process designed to fill bottles with 20 ounces of water. The population of filling volumes is normally distributed with a standard deviation of 1.3 ounces. Periodically, process engineers take 20-bottle samples and compute the sample mean. a.What are the upper and lower control limits? b.Suppose the last five sample means were 19.4 20.2 20.5 20.7 21.1 ounces. Is the process under control?
138) A manufacturing process produces tubeless mountain bike tires in batches of 200. Past records show that 6% of the tires will not hold air. An engineer tests five batches, each one week apart, and shows the proportion of tires that will not hold air below. Proportion of tires that will not hold air: 0.074
0.068
0.062
0.055
0.040
a. Construct the centerline and the upper control limit for the chart. b. Should the engineer be worried? Comment on any trend in the proportion of tubeless tires that will not hold air.
139) A large accounting firm gives out 1,000 job offers every year to new college graduates. Suppose that 85% of those who received offers accept the position. The following shows the number of graduates who have accepted jobs in the last four years. Number of job offers accepted: 2007
Version 1
2008
2009
2010
2011
46
840
860
870
830
810
<p>a. Construct the centerline and the upper and lower control limits for the chart. b. Does the company need to worry about its ability to attract college graduates to the firm?
140) Bias refers to the tendency of a sample statistic to systematically over - or underestimate a population parameter. ⊚ true ⊚ false
141)
We use a population parameter to make inferences about a sample statistic. ⊚ true ⊚ false
142) Selection bias occurs when the sample is mistakenly divided into strata, and random samples are drawn from each stratum. ⊚ true ⊚ false
143) Nonresponse bias occurs when those responding to a survey or poll differ systematically from the nonrespondents. ⊚ true ⊚ false
Version 1
47
144) Social-desirability bias refers to systematic difference between a group's "socially acceptable" responses to a survey or poll and this group’s ultimate choice. ⊚ true ⊚ false
145) A simple random sample is a sample of n observations that has the same probability of being selected from the population as any other sample of n observations. ⊚ true ⊚ false
146) In stratified random sampling, the population is first divided up into mutually exclusive and collectively exhaustive groups, called strata. A stratified sample includes randomly selected observations from each stratum, which are proportional to the stratum’s size. ⊚ true ⊚ false
147) In cluster random sampling, the population is first divided up into mutually exclusive and collectively exhaustive groups, called clusters. A cluster sample includes randomly selected observations from each cluster, which are proportional to the cluster’s size. ⊚ true ⊚ false
148)
Cluster sampling is preferred when the objective is to increase precision. ⊚ true ⊚ false
149)
A parameter is a random variable, whereas a sample statistic is a constant. ⊚ true ⊚ false
Version 1
48
150) When a statistic is used to estimate a parameter, the statistic is referred to as an estimator. A particular value of the estimator is called an estimate. ⊚ true ⊚ false
151)
<p>The standard deviation of (standard error of the sample mean) equals the population
standard deviation divided by the square root of the sample size, or, ⊚ true ⊚ false
equivalently.
152) <p>The standard deviation of suggests that the variation between observations is smaller than the variation between averages. ⊚ true ⊚ false
153)
A point estimator refers to an estimator that provides a single value. ⊚ true ⊚ false
154) If the expected value of a sample mean equals the population mean, the sample mean is biased. ⊚ true ⊚ false
155) For any sample size n, the sampling distribution of is normal if the population from which the sample is drawn is uniformly distributed.
Version 1
49
⊚ ⊚
true false
156) For any population X with expected value µ and standard deviation σ, the sampling distribution of will be approximately normal if the sample size n is sufficiently small. As a general guideline, the normal distribution approximation is justified when n < 30. ⊚ true ⊚ false
157)
The central limit theorem approximation improves as the sample size decreases. ⊚ true ⊚ false
158) For any population proportion p, the sampling distribution of will be approximately normal if the sample size n is sufficiently large. As a general guideline, the normal distribution approximation is justified when np ≥ 5 and n(1 − p) ≥ 5. ⊚ true ⊚ false
Version 1
50
Answer Key Test name: Chap 07_4e 1) a. A p-chart. b. np = 250 × 0.04 = 10 and n(1-p) = 250 × 0.96 = 240 both are ≥ 5 which means the central limit theorem applies. c. UCL is 0.0772 and LCL is 0.028. Sample proportion of samples 7, 9, and 10 are 0.088, 0.084, and 0.080 fall above the UCL of 0.0772, which means the process is not in control. 2) increase precision 3) reduce costs 4) expected value 5) standard 6) normally 7) increases 8) detection 9) assignable 10) D 11) B 12) D 13) B 14) B 15) C 16) D 17) D 18) C 19) A 20) B 21) A Version 1
51
22) A 23) B 24) C 25) A 26) A 27) C 28) B 29) D 30) C 31) A 32) A 33) A 34) A 35) A 36) C 37) B 38) A 39) C 40) A 41) D 42) C 43) D 44) A 45) C 46) B 47) C 48) A 49) C 50) C 51) A Version 1
52
52) B 53) A 54) B 55) C 56) D 57) A 58) B 59) B 60) B 61) A 62) B 63) B 64) B 65) B 66) C 67) A 68) B 69) B 70) D 71) D 72) D 73) A 74) A 75) C 76) D 77) B 78) C 79) C 80) B 81) D Version 1
53
82) B 83) A 84) D 85) C 86) C 87) D 88) C 89) D 90) C 91) B 92) A 93) D 94) D 95) D 96) B 97) B 98) C 99) D 100) A 101) D 102) B 103) B 104) C 105) D 106) D 107) A 108) C 109) D 110) B 111) B Version 1
54
112) A 113) B 114) C 115) B 116) C 117) D 118) Results will be biased because of selection bias. 119) a.Voters in the upcoming election b.Selection bias 120) Expected value equals 12, and variance equals 2.7779. 121) 0.3085 122) 0.9876 123) a. 0.0045 b. 0.0473 124) 0.8881 125) a. 0.03394 b. Yes, it’s very unlikely to average 18 gallons or less for 30,000 pounds of copper mined given no changes in the mining process. 126) a. 0.4013 b. 70 and 2.8571 c. 0.0401 127) a. 0.0668 b. 0.9318 c. 0.2266 d. Cannot be determined 128) 0.1388 129) 0.8176 130) a. 0.1241 b. 0.0512 c. The standard error of is lower with a larger sample size. 131) 100 students 132) a. 0.2470 b. 0.5349 c. 0.0553 133) a. 74 and 0.4924 b. 0.0211 Version 1
55
134) a.0.54 and 0.0204 b.0.025 135) a. UCL =12+3*0.05/SQRT(9) = 12.05, Centerline = 12.00, LCL =12-3*0.05/SQRT(9) = 11.95; b. The process needs adjustment as three sample means (12.06, 12.08, and 11.92) fall outside the control limits. 136) a. UCL =0.02+3*SQRT(0.02*(1-0.02)/500) = 0.0388; LCL =0.02-3*SQRT(0.02*(10.02)/500) = 0.0012 [MISSING IMAGE: A graph plots U C L, centerline, L C L, and sample proportions. The horizontal axis ranges from 0 to 6 in increments of 1. The vertical axis ranges from 0 to 0.045 in increments of 0.005. A horizontal line labeled U C L is from (1, 0.039) to (6, 0.039). Another horizontal line labeled centerline is from (1, 0.02) to (6, 0.02). Another horizontal line labeled L C L is from (1, 0.001) to (6, 0.001). A curve labeled sample proportions begins at (1, 0.01), passes through (2, 0.02), (3, 0.015), (4, 0.03), (5, 0.025), and ends at (6, 0.015)., A graph plots U C L, centerline, L C L, and sample proportions. The horizontal axis ranges from 0 to 6 in increments of 1. The vertical axis ranges from 0 to 0.045 in increments of 0.005. A horizontal line labeled U C L is from (1, 0.039) to (6, 0.039). Another horizontal line labeled centerline is from (1, 0.02) to (6, 0.02). Another horizontal line labeled L C L is from (1, 0.001) to (6, 0.001). A curve labeled sample proportions begins at (1, 0.01), passes through (2, 0.02), (3, 0.015), (4, 0.03), (5, 0.025), and ends at (6, 0.015). ] b. Yes, all points are randomly dispersed between the upper and lower limits. There is no evidence of a defective trend. 137) a. UCL =20+*1.3/SQRT(20) = 20.87; LCL =20-3*1.3/SQRT(20) = 19.13; b. No, the process is out of control. Two of the five sample means are above the upper control limit. 138) a. UCL=0.06+3*SQRT(0.06*(1-0.06)/200) = 0.1104, Centerline: 0.06 b. No, all proportions are below the UCL limit and the trend is improving (trending downward). 139) a. UCL=0.85+3*SQRT(0.85*(1-0.85)/1000) = 0.8839, LCL=0.85-3*SQRT(0.85*(10.85)/1000) = 0.8161, Centerline: 0.85 b. Yes, in 2011, fewer people accepted jobs than the number resulting from the LCL.
140) TRUE 141) FALSE 142) FALSE 143) TRUE 144) TRUE 145) TRUE
Version 1
56
146) TRUE 147) FALSE 148) FALSE 149) FALSE 150) TRUE 151) TRUE 152) FALSE 153) TRUE 154) FALSE 155) FALSE 156) FALSE 157) FALSE 158) TRUE
Version 1
57
CHAPTER 08 – 4e 1)
A confidence interval is generally associated with a(n) _____________.
2) To construct a 95% confidence interval for µ, the sampling distribution of must be _______.
3)
The confidence coefficient is equal to ________.
4)
The width of the confidence interval is _____ times the margin of error.
5)
The ___________ determine the extent of the broadness of the tails of the distribution.
6) It is important to know that __________ is increased when we estimate the population standard deviation with the sample standard deviation.
7)
The parameter p represents the ________ of successes in the population.
8) If calculated required sample size is a noninteger value, we should always _______ calculated value.
9)
What is the purpose of calculating a confidence interval?
Version 1
1
A) To provide a range of values that has a certain large probability of containing the sample statistic of interest. B) To provide a range of values that, with a certain measure of confidence, contains the sample statistic of interest. C) To provide a range of values that, with a certain measure of confidence, contains the population parameter of interest. D) To provide a range of values that has a certain large probability of containing the population parameter of interest.
10)
What is the most typical form of a calculated confidence interval? A) Point estimate ± Standard error B) Point estimate ± Margin of error C) Population parameter ± Standard error D) Population parameter ± Margin of error
11) Which of the following is the necessary condition for creating confidence intervals for the population mean? A) Normality of the estimator B) Normality of the population C) Known population parameter D) Known standard deviation of the estimator
12)
<p>What is
for a 95% confidence interval of the population mean?
A) 0.48 B) 0.49 C) 1.645 D) 1.96
Version 1
2
13)
<p>What is
for a 90% confidence interval of the population mean?
A) 0.48 B) 0.49 C) 1.645 D) 1.96
14) We draw a random sample of size 36 from the normal population with variance 1.9. If the sample mean is 17.5, what is a 90% confidence interval for the population mean? A) [16.6801, 18.3199] B) [16.8036, 18.1964] C) [17.1221, 17.8779] D) [17.1994, 17.8006]
15) We draw a random sample of size 25 from the normal population with variance 2.4. If the sample mean is 12.5, what is a 99% confidence interval for the population mean? A) [11.2600, 13.7400] B) [11.3835, 13.6165] C) [11.7019, 13.2981] D) [11.7793, 13.2207]
16) We draw a random sample of size 36 from a population with standard deviation 3.5. If the sample mean is 27, what is a 95% confidence interval for the population mean? A) [25.8567, 28.1433] B) [26.0405, 27.9595] C) [26.8100, 27.1900] D) [26.8401, 27.1599]
Version 1
3
17) In an examination of purchasing patterns of shoppers, a sample of 16 shoppers revealed that they spent, on average, $54 per hour of shopping. Based on previous years, the population standard deviation is thought to be $21 per hour of shopping. Assuming that the amount spent per hour of shopping is normally distributed, find a 90% confidence interval for the mean amount. A) [$45.36, $62.64] B) [$47.27, $60.73] C) [$51.84, $56.16] D) [$52.32, $55.68]
18) At an academically challenging high school, the average GPA of a high school senior is known to be normally distributed with a variance of 0.25. A sample of 20 seniors is taken and their average GPA is found to be 2.71. Develop a 90% confidence interval for the population mean GPA. A) [2.5261, 2.8939] B) [2.5667, 2.8533] C) [2.6180, 2.8020] D) [2.6384, 2.7816]
19) The daily revenue from the sale of fried dough at a local street vendor in Boston is known to be normally distributed with a known standard deviation of $120. The revenue on each of the last 25 days is noted, and the average is computed as $550. Construct a 95% confidence interval for the population mean of the sale of fried dough by this vendor. A) 120 ± 1.645(550/5) B) 120 ± 1.96(550/5) C) 550 ± 1.645(120/5) D) 550 ± 1.96(120/5)
Version 1
4
20) The ages of MBA students at a university are normally distributed with a known population variance of 10.24. Suppose you are asked to construct a 95% confidence interval for the population mean age if the mean of a sample of 36 students is 26.5 years. What is the margin of error for a 95% confidence interval for the population mean? A) 1.645(3.20/6) B) 1.645(10.24/6) C) 1.96(3.20/6) D) 1.96(10.24/6)
21) The ages of MBA students at a university are normally distributed with a known population variance of 10.24. Suppose you are asked to construct a 95% confidence interval for the population mean age if the mean of a sample of 36 students is 26.5 years. If a 99% confidence interval is constructed instead of a 95% confidence interval for the population mean, then __________. A) the resulting margin of error will increase and the risk of reporting an incorrect interval will increase B) the resulting margin of error will decrease and the risk of reporting an incorrect interval will increase C) the resulting margin of error will increase and the risk of reporting an incorrect interval will decrease D) the resulting margin of error will decrease and the risk of reporting an incorrect interval will decrease
22) An analyst takes a random sample of 25 firms in the telecommunications industry and constructs a confidence interval for the mean return for the prior year. Holding all else constant, if he increased the sample size to 30 firms, how are the standard error of the mean and the width of the confidence interval affected? Standard error of the mean
Width of confidence interval
A
Increases
Becomes wider
B
Increases
Becomes narrower
Version 1
5
C
Decreases
Becomes wider
D
Decreases
Becomes narrower
A) A B) B C) C D) D
23) The daily revenue from the sale of fried dough at a local street vendor in Boston is known to be normally distributed with a known standard deviation of $120. The revenue on each of the last 25 days is noted, and the average is computed as $550. A 95% confidence interval is constructed for the population mean revenue. If the data from the last 40 days had been used, then the resulting 95% confidence intervals would have been _____________________. A) wider, with a larger probability of reporting an incorrect interval B) wider, with the same probability of reporting an incorrect interval C) narrower, with a larger probability of reporting an incorrect interval D) narrower, with the same probability of reporting an incorrect interval
24) A sample of a given size is used to construct a 95% confidence interval for the population mean with a known population standard deviation. If a bigger sample had been used instead, then the 95% confidence interval would have been _______ and the probability of making an error would have been ________. A) wider; smaller B) narrower; smaller C) wider; unchanged D) narrower; unchanged
25) For a given confidence level and sample size, which of the following is true in the interval estimation of the population mean when σ is known?
Version 1
6
A) If the sample standard deviation is smaller, the interval is wider. B) If the population standard deviation is greater, the interval is wider. C) If the population standard deviation is smaller, the interval is wider. D) If the population standard deviation is greater, the interval is narrower.
26) For a given confidence level and population standard deviation, which of the following is true in the interval estimation of the population mean? A) If the sample size is bigger, the interval is narrower. B) If the sample size is smaller, the interval is narrower. C) If the population size is bigger, the interval is narrower. D) If the population size is smaller, the interval is narrower.
27) For a given sample size and population standard deviation, which of the following is true in the interval estimation of the population mean? A) If the confidence level is greater, the interval is wider. B) If the confidence level is smaller, the interval is wider. C) If the confidence level is greater, the interval is more precise. D) If the confidence level is smaller, the interval is less precise.
28) A 90% confidence interval is constructed for the population mean. If a 95% confidence interval had been constructed instead (everything else remaining the same), the width of the interval would have been ________ and the probability of making an error would have been _________. A) wider; bigger B) wider; smaller C) narrower; bigger D) narrower; smaller
Version 1
7
29) Suppose taxi fares from Logan Airport to downtown Boston is known to be normally distributed and a sample of seven taxi fares produces a mean fare of $12.51 and a 95% confidence interval of [$11.93, $13.07]. Which of the following statements is a validexplanation of the confidence interval. A) 95% of all taxi fares are between $11.93 and $13.07. B) We are 95% confident that a randomly selected taxi fare will be between $11.93 and $13.07. C) The mean amount of a taxi fare is $12.51, 95% of the time. D) We are 95% confident that the average taxi fare between Logan Airport and downtown Boston will fall between $11.93 and $13.07.
30) Suppose taxi fares from Logan Airport to downtown Boston is known to be normally distributed and a sample of seven taxi fares produces a mean fare of $22.31 and a 95% confidence interval of [$20.5051, $24.2091]. Which of the following statements is a valid explanation of the confidence interval. A) 95% of all taxi fares are between $20.51 and $24.21. B) We are 95% confident that a randomly selected taxi fare will be between $20.51 and $24.21. C) The mean amount of a taxi fare is $22.31, 95% of the time. D) We are 95% confident that the average taxi fare between Logan Airport and downtown Boston will fall between $20.51 and $24.21.
31) Suppose taxi fare from Logan Airport to downtown Boston is known to be normally distributed with a standard deviation of $2.50. The last seven times John has taken a taxi from Logan to downtown Boston, the fares have been $22.94, $22.61, $23.12, $24.38, $20.83, $20.73, and $23.34.What is a 95% confidence interval for the population mean taxi fare?
Version 1
8
A) [$17.66, $27.46] B) [$20.71, $24.42] C) [$21.01, $24.12] D) [$21.86, $23.26]
32) Suppose taxi fare from Logan Airport to downtown Boston is known to be normally distributed with a standard deviation of $2.50. The last seven times John has taken a taxi from Logan to downtown Boston, the fares have been $22.10, $23.25, $21.35, $24.50, $21.90, $20.75, and $22.65. What is a 95% confidence interval for the population mean taxi fare? A) [$17.46, $27.26] B) [$20.46, $24.16] C) [$20.80, $23.91] D) [$21.66, $23.06]
33) A website advertises job openings on its website, but job seekers have to pay to access the list of job openings. The website recently completed a survey to estimate the number of days it takes to find a new job using its service. It took the last 35 customers an average of 50 days to find a job. Assume the population standard deviation is 10 days. Calculate a 99% confidence interval of the population mean number of days it takes to find a job. A) [29.6251, 70.3749] B) [45.1336, 54.8664] C) [45.6461, 54.3539] D) [46.2217, 53.7783]
34) A website advertises job openings on its website, but job seekers have to pay to access the list of job openings. The website recently completed a survey to estimate the number of days it takes to find a new job using its service. It took the last 30 customers an average of 60 days to find a job. Assume the population standard deviation is 10 days. Calculate a 95% confidence interval of the population mean number of days it takes to find a job. Version 1
9
A) [40.4000, 79.6000] B) [55.9085, 64.0915] C) [56.4216, 63.5784] D) [56.9966, 63.0034]
35)
The tdf distribution is similar to the z distribution because __________.
A) as the degrees of freedom go to infinity, the t distribution converges to the z distribution, but both do not have asymptomatic tails B) both have asymptotic tails—that is, their tails become closer and closer to the horizontal axis and eventually cross the axis C) as the degrees of freedom go to infinity, the t distribution converges to the z distribution and both have asymptotic tails—that is, their tails become closer and closer to the horizontal axis but never touch it D) Neither as the degrees of freedom go to infinity, the t distribution converges to the z distribution nor both have asymptotic tails—that is, their tails become closer and closer to the horizontal axis but never touch it
36)
How do the tdf and z distributions differ?
A) There is no difference between the tdf and z distributions; they both do not have asymptomatic tails. B) The z distribution has broader tails (it is flatter around zero). C) The tdf distribution has broader tails (it is flatter around zero). D) The z distribution has asymptotic tails, while the tdf distribution does not.
37) <p>What is for a 95% confidence interval of the population mean based on a sample of 15 observations?
Version 1
10
A) 1.761 B) 1.960 C) 2.131 D) 2.145
38) <p>What is for a 99% confidence interval of the population mean based on a sample of 25 observations? A) 2.492 B) 2.576 C) 2.787 D) 2.797
39) Confidence intervals of the population mean may be created for the cases when the population standard deviation is known or unknown. How are these two cases treated differently? A) Use the z table when s is unknown; use the t table when s is known. B) Use the z table when s is known; use the t table when s is unknown. C) Use the z table when σ is unknown; use the t table when σ is known. D) Use the z table when σ is known; use the t table when σ is unknown.
40) What conditions are required by the central limit theorem before a confidence interval of the population mean may be created?
Version 1
11
A) The underlying population must be normally distributed. B) The underlying population must be normally distributed if the sample size is 30 or more. C) The underlying population need not be normally distributed if the sample size is 30 or more. D) The underlying population need not be normally distributed if the population standard deviation is known.
41) The GPA of accounting students in a university is known to be normally distributed. A random sample of 32 accounting students results in a mean of 2.97 and a standard deviation of 0.16. Construct the 90% confidence interval for the mean GPA of all accounting students at this university. A) 2.97 B) 2.97 C) 2.97 D) 2.97
±1.729 ( 0.16 / 32 ) ±1.96 ( 0.16 / 32 ) ±2.086 ( 0.16 / 32 ) ±1.696 ( 0.16 / 32 )
42) The GPA of accounting students in a university is known to be normally distributed. A random sample of 20 accounting students results in a mean of 2.92 and a standard deviation of 0.16. Construct the 95% confidence interval for the mean GPA of all accounting students at this university. A) B) C) D)
Version 1
12
43) The starting salary of business students in a university is known to be normally distributed. A random sample of 18 business students results in a mean salary of $46,500 with a standard deviation of $10,200. Construct the 90% confidence interval for the mean starting salary of business students in this university. A) B) C) D)
44) Given a sample mean of 27 and a sample standard deviation of 3.5 computed from a sample of size 36, find a 95% confidence interval on the population mean. A) [25.8158, 28.1842] B) [25.8170, 28.1830] C) [26.0142, 27.9858] D) [26.9930, 27.0070]
45) Given a sample mean of 12.5—drawn from a normal population, a sample of size 25, and a sample variance of 2.4—find a 99% confidence interval for the population mean.
A) [11.7019, 13.2981] B) [11.1574, 13.8426] C) [11.6334, 13.3666] D) [11.7279, 13.2721]
46) At a particular academically challenging high school, the average GPA of a high school senior is known to be normally distributed. After a sample of 20 seniors is taken, the average GPA is found to be 2.71 and the variance is determined to be 0.25. Find a 90% confidence interval for the population mean GPA.
Version 1
13
A) [2.5167, 2.9033] B) [2.5261, 2.8961] C) [2.5615, 2.8585] D) [2.6358, 2.7842]
47) In an examination of holiday spending (known to be normally distributed) of a sample of 16 holiday shoppers at a local mall, an average of $54 was spent per hour of shopping. Based on the current sample, the standard deviation is equal to $21. Find a 90% confidence interval for the population mean level of spending per hour. A) [$44.83, $63.17] B) [$44.79, $63.20] C) [$45.36, $62.63] D) [$46,96, $61.04]
48) The mortgage foreclosure crisis that preceded the Great Recession impacted the U.S. economy in many ways, but it also impacted the foreclosure process itself as community activists better learned how to delay foreclosure, and lenders became more wary of filing faulty documentation. Suppose the duration of the eight most recent foreclosures filed in the city of Boston (from the beginning of foreclosure proceedings to the filing of the foreclosure deed, transferring the property) has been 230 days, 420 days, 340 days, 367 days, 295 days, 314 days, 385 days, and 311 days. Assume the duration is normally distributed. Construct a 99% confidence interval for the mean duration of the foreclosure process in Boston. A) [126.2730, 539.2270] B) [259.7393, 405.7607] C) [279.0056, 386.4944] D) [291.8600, 373.6400]
Version 1
14
49) The mortgage foreclosure crisis that preceded the Great Recession impacted the U.S. economy in many ways, but it also impacted the foreclosure process itself as community activists better learned how to delay foreclosure and lenders became more wary of filing faulty documentation. Suppose the duration of the eight most recent foreclosures filed in the city of Boston (from the beginning of foreclosure proceedings to the filing of the foreclosure deed, transferring the property) has been 230 days, 420 days, 340 days, 367 days, 295 days, 314 days, 385 days, and 311 days. Assume the duration is normally distributed. Construct a 90% confidence interval for the mean duration of the foreclosure process in Boston. A) [259.7400, 405.7600] B) [293.2229, 372.2771] C) [298.4296, 367.0704] D) [303.2282, 362.2718]
50) A company that produces computers recently tested the battery in its latest laptop in six separate trials. The battery lasted 8.23, 7.89, 8.14, 8.25, 8.30, and 7.95 hours before burning out in each of the tests. Assuming the battery duration is normally distributed, construct a 95% confidence interval for the mean battery life in the new model. A) [7.9490, 8.3044] B) [7.9575, 8.2959] C) [7.9873, 8.2661] D) [7.9912, 8.2622]
51) Professors of accountancy are in high demand at American universities. A random sample of 28 new accounting professors found the average salary was $135 thousand with a standard deviation of $16 thousand. Assume the distribution is normally distributed. Construct a 95% confidence interval for the salary of new accounting professors. Answers are in thousands of dollars. A) [102.1680, 167.832] B) [127.8247, 142.1753] C) [128.7958, 141.2042] D) [129.0735, 140.9265]
Version 1
15
52) Professors of accountancy are in high demand at American universities. A random sample of 28 new accounting professors found the average salary was $135 thousand with a standard deviation of $16 thousand. Assume the distribution is normally distributed. Construct a 90% confidence interval for the salary of new accounting professors. Answers are in thousands of dollars. A) [107.7520, 162.2480] B) [129.8497, 140.1503] C) [130.0260, 139.9740] D) [131.0268, 138.9732]
53) A basketball coach wants to know how many free throws an NBA player shoots during the course of an average practice. The coach takes a random sample of 43 players and finds the average number of free throws shot per practice was 225 with a standard deviation of 35. Construct a 99% confidence interval for the average number of free throws in practice. A) [210.5992, 239.4008] B) [210.6155, 239.3845] C) [211.2506, 238.7494] D) [214.2290, 235.7710]
54) The confidence intervals for the population proportion are generally based on ___________. A) the z distribution B) the t distribution C) the z distribution when the sample size is very small D) the t distribution when the population standard deviation is not known
55) When examining the possible outcome of an election, what type of confidence interval is most suitable for estimating the current support for a candidate? Version 1
16
A) The confidence interval for the sample mean B) The confidence interval for the population mean C) The confidence interval for the sample proportion D) The confidence interval for the population proportion
56) Which of the following is the correct formula for the margin of error in the interval estimation of p? A) B) C) D)
57) The sampling distribution of the population proportion is based on a binomial distribution. What condition must be met to use the normal approximation for the confidence interval? A) B) C) D)
and and
58) According to a report in USA Today, more and more parents are helping their young adult children get homes. Suppose eight persons in a random sample of 50 young adults who recently purchased a home in Kentucky received help from their parents. You have been asked to construct a 95% confidence interval for the population proportion of all young adults in Kentucky who received help from their parents. What is the margin of error for a 95% confidence interval for the population proportion?
Version 1
17
A) 1.96(0.0518) B) 1.645(0.005) C) 1.96(0.005) D) 1.645(0.0518)
59) According to a report in USA Today, more and more parents are helping their young adult children get homes. Suppose eight persons in a random sample of 40 young adults who recently purchased a home in Kentucky received help from their parents. You have been asked to construct a 95% confidence interval for the population proportion of all young adults in Kentucky who received help from their parents. What is the margin of error for a 95% confidence interval for the population proportion? A) 1.645(0.0040) B) 1.645(0.0632) C) 1.96(0.0040) D) 1.96(0.0632)
60) According to a report in USA Today, more and more parents are helping their young adult children get homes. Suppose eight persons in a random sample of 40 young adults who recently purchased a home in Kentucky received help from their parents. You have been asked to construct a 95% confidence interval for the population proportion of all young adults in Kentucky who received help from their parents. If a 99% confidence interval is constructed instead of a 95% confidence interval for the population proportion, then __________. A) the resulting margin of error will increase and the risk of reporting an incorrect interval will increase B) the resulting margin of error will decrease and the risk of reporting an incorrect interval will increase C) the resulting margin of error will increase and the risk of reporting an incorrect interval will decrease D) the resulting margin of error will decrease and the risk of reporting an incorrect interval will decrease
Version 1
18
61) Candidate A is facing two opposing candidates in a mayoral election. In a recent poll of 300 residents, she has garnered 51% support. Construct a 95% confidence interval on the population proportion for the support of candidate A in the following election. A) [0.4534, 0.5666] B) [0.4625, 0.5575] C) [0.5084, 0.5116] D) [0.5086, 0.5114]
62) Candidate A is facing two opposing candidates in a mayoral election. In a recent poll of 300 residents, 153 supported her. Construct a 90% confidence interval on the population proportion for the support of candidate A in the following election. A) [0.4534, 0.5666] B) [0.4625, 0.5575] C) [0.5086, 0.5114] D) [0.5090, 0.5110]
63) Candidate A is facing two opposing candidates in a mayoral election. In a recent poll of 300 residents, 98 supported candidate B and 58 supported candidate C. Construct a 99% confidence interval on the population proportion for the support of candidate A in the following election. A) [0.4057, 0.5543] B) [0.4130, 0.5470] C) [0.4779, 0.4821] D) [0.4781, 0.4819]
Version 1
19
64) A sample of 2,016 American adults was asked how they viewed China, with 20% of respondents calling the country "unfriendly" and 5% of respondents indicating the country was "an enemy." Construct a 95% confidence interval of the proportion of American adults who viewed China as either "unfriendly" or "an enemy." A) [0.2208, 0.2792] B) [0.2311, 0.2689] C) [0.2341, 0.2659] D) [0.2403, 0.2597]
65) A sample of 2,007 American adults was asked how they viewed China, with 17% of respondents calling the country "unfriendly" and 6% of respondents indicating the country was "an enemy." Construct a 95% confidence interval of the proportion of American adults who viewed China as either "unfriendly" or "an enemy." A) [0.2013, 0.2587] B) [0.2116, 0.2484] C) [0.2146, 0.2454] D) [0.2208, 0.2392]
66) A sample of 2,007 American adults was asked how they viewed China, with 17% of respondents calling the country "unfriendly" and 6% of respondents indicating the country was "an enemy." Construct a 99% confidence interval of the proportion of American adults who viewed China as "an enemy." A) [0.0463, 0.0737] B) [0.0476, 0.0724] C) [0.0512, 0.0688] D) [0.0597, 0.0603]
Version 1
20
67) A sample of 1,400 American households was asked if they planned to buy a new car next year. Of the respondents, 34% indicated they planned to buy a new car next year. Construct a 98% confidence interval of the proportion of American households who expect to buy a new car next year. A) [0.3074, 0.3726] B) [0.3105, 0.3695] C) [0.3140, 0.3660] D) [0.3392, 0.3408]
68) A sample of 1,400 American households was asked if they planned to buy a new car next year. Of the respondents, 34% indicated they planned to buy a new car next year. Construct a 90% confidence interval of the proportion of American households who expect to buy a new car next year. A) [0.3152, 0.3648] B) [0.3192, 0.3608] C) [0.3394, 0.3406] D) [0.3354, 0.3446]
69) Statisticians like precision in their interval estimates. A low margin of error is needed to achieve this. Which of the following supports this when selecting sample sizes? A) A larger sample size reduces the margin of error. B) A smaller sample size reduces the margin of error. C) A larger sample size increases the margin of error. D) A sample size has no impact on the margin of error.
70) When the required sample size calculated by using a formula is not a whole number, what is the best choice for the required sample size?
Version 1
21
A) Round the result of the calculation up to the nearest whole number plus one. B) Round the result of the calculation up to the nearest whole number. C) Round the result of the calculation down to the nearest whole number. D) Round the result of the calculation down to the nearest whole number minus one.
71) What is the minimum sample size required to estimate a population mean with 95% confidence when the desired margin of error is E = 1.9? The population standard deviation is known to be 10.54. A) n = 59 B) n = 60 C) n = 118 D) n = 119
72) What is the minimum sample size required to estimate a population mean with 95% confidence when the desired margin of error is E = 1.5? The population standard deviation is known to be 10.75. A) n = 138 B) n = 139 C) n = 197 D) n = 198
73) What is the minimum sample size required to estimate a population mean with 90% confidence when the desired margin of error is D = 1.25? The standard deviation in a preselected sample is 8.5. A) n = 76 B) n = 125 C) n = 126 D) n = 190
Version 1
22
74) The minimum sample size n required to estimate a population mean with 95% confidence and the desired margin of error 1.5 was found to be 198. Which of the following is the approximate value of the assumed estimate of the population standard deviation? A) 10.7690 B) 12.8309 C) 115.9671 D) 164.6326
75) The minimum sample size n required to estimate a population mean with 95% confidence and the assumed estimate of the population standard deviation 6.5 was found to be 124. Which of the following is the approximate value of the assumed desired margin of error? A) D = 0.9220 B) D = 0.9602 C) D = 1.1441 D) D = 1.3090
76) A marketing firm wants to estimate how much root beer the average teenager drinks per year. A previous study found a standard deviation of 1.09 liters. How many teenagers must the firm interview in order to have a margin of error of at most 0.4 liter when constructing a 99% confidence interval? A) 48.39 B) 49 C) 49.39 D) 50
Version 1
23
77) A marketing firm wants to estimate how much root beer the average teenager drinks per year. A previous study found a standard deviation of 1.12 liters. How many teenagers must the firm interview in order to have a margin of error of at most 0.1 liter when constructing a 99% confidence interval? A) 831.39 B) 832 C) 832.39 D) 833
78) A medical devices company wants to know the number of hours its MRI machines are used per day. A previous study found a standard deviation of four hours. How many MRI machines must the company find data for in order to have a margin of error of at most 0.40 hour when calculating a 98% confidence interval? A) 541.19 B) 542 C) 618.69 D) 619
79) A medical devices company wants to know the number of hours its MRI machines are used per day. A previous study found a standard deviation of four hours. How many MRI machines must the company find data for in order to have a margin of error of at most 0.5 hour when calculating a 98% confidence interval? A) 346.26 B) 347 C) 424.69 D) 425
Version 1
24
80)
<p>Suppose we wish to find the required sample size to find a 90% confidence interval
for the population proportion with the desired margin of error. If there is no rough estimate of the population proportion, what value should be assumed for ? A) 0.05 B) 0.10 C) 0.50 D) 0.90
81) Find the minimum sample size when we want to construct a 90% confidence interval on the population proportion for the support of candidate A in the following mayoral election. Candidate A is facing two opposing candidates. In a preselected poll of 100 residents, 22 supported her. The desired margin of error is 0.08. A) n = 44 B) n = 72 C) n = 73 D) n = 103
82) Find the minimum sample size when we want to construct a 95% confidence interval on the population proportion for the support of candidate A in the following mayoral election. Candidate A is facing two opposing candidates. In a preselected poll of 100 residents, 22 supported candidate B and 14 supported candidate C. The desired margin of error is 0.06. A) n = 173 B) n = 174 C) n = 245 D) n = 246
Version 1
25
83) According to a report in USAToday, more and more parents are helping their young adult children buy homes. You would like to construct a 90% confidence interval of the proportion of all young adults in Louisville, Kentucky, who received help from their parents in buying a home. How large a sample should you take so that the margin of error is no more than 0.06? A) n = 114 B) n = 188 C) n = 267 D) Cannot be determined.
84)
The minimum sample size n required to estimate a population proportion, with 95%
confidence and the assumed most conservative = 0.5, was found to be 122. Which of the following is the approximate value of the assumed desired margin of error? A) D = 0.0055 B) D = 0.0078 C) D = 0.0745 D) D = 0.0887
85) A researcher in campaign finance law wants to estimate the proportion of elementary, middle, and high school teachers who contributed to a candidate during a recent election cycle. Given that no prior estimate of the population proportion is available, what is the minimum sample size such that the margin of error is no more than 0.01 for a 99% confidence interval? A) 751.67 B) 752 C) 16,640.11 D) 16,588
Version 1
26
86) A researcher in campaign finance law wants to estimate the proportion of elementary, middle, and high school teachers who contributed to a candidate during a recent election cycle. Given that no prior estimate of the population proportion is available, what is the minimum sample size such that the margin of error is no more than 0.03 for a 95% confidence interval? A) 751.67 B) 752 C) 1067.11 D) 1068
87) A politician wants to estimate the percentage of people who like his new slogan. Given that no prior estimate of the population proportion is available, what is the minimum sample size such that the margin of error is no more than 0.08 for a 95% confidence interval? A) 105.70 B) 106 C) 150.06 D) 151
88) A machine that is programmed to package 1.30 pounds of cereal is being tested for its accuracy in a sample of 43 cereal boxes, the sample mean filling weight is calculated as 1.32 pounds. The population standard deviation is known to be 0.06 pounds. Find the 95% confidence interval for the mean. A) [1.30, 1.34] B) [1.28, 1.36] C) [1.26, 1.38] D) [1.24, 1.40]
89) A machine that is programmed to package 1.20 pounds of cereal is being tested for its accuracy in a sample of 36 cereal boxes, the sample mean filling weight is calculated as 1.22 pounds. The population standard deviation is known to be 0.06 pounds. Find the 95% confidence interval for the mean. Version 1
27
A) [1.20, 1.24] B) [1.18, 1.26] C) [1.16, 1.28] D) [1.14, 1.30]
90) A machine that is programmed to package 1.20 pounds of cereal is being tested for its accuracy in a sample of 36 cereal boxes: the sample mean filling weight is calculated as 1.22 pounds. The population standard deviation is known to be 0.06 pounds. Which of the following conclusions is correct with a 95% confidence level? A) The machine is operating improperly because the target is below the lower limit. B) The machine is operating properly because the interval contains the target. C) The machine is operating improperly because the target is above the upper limit. D) There is not enough information to make a conclusion.
91) Bob’s average golf score at his local course is 93.6 from a random sample of 34 rounds of golf. Assume that the population standard deviation for his golf score is 4.2. The 90% confidence interval around this sample mean is ______. A) [89.0, 98.3] B) [90.5, 96.7] C) [92.4, 94.8] D) [93.2, 94.0]
92) Recently, six single-family homes inSan Luis Obispo County inCalifornia sold at the following prices (in $1,000s): 535, 427, 683, 510, 640, 591. Find a 95% confidence interval for the mean sale price in San Luis Obispo County.
Version 1
28
A) [466.80, 661.87] B) [471.35, 657.32] C) [489.91, 638.76] D) [457.75, 670.92]
93) Recently, six single-family homes in San Luis Obispo County in California sold at the following prices (in $1,000s): 549,449,705,529,639,609. Find a 95% confidence interval for the mean sale price in San Luis Obispo County. A) [484.63, 674.37] B) [489.90, 670.10] C) [508.46, 651.54] D) [476.30, 683.70]
94) Students who graduated from college last year owed an average of $25,250 in student loans. An economist wants to determine if average debt has changed. She takes a sample of 40 recent graduates and finds that their average debt is $27,500 with a standard deviation of $9,120. Use 90% confidence interval. Which of the following conclusion is correct? A) The average debt decreased. B) The average debt increased. C) The average debt has not changed. D) There is not enough information.
95) The average natural gas bill for a random sample of 21 homes in 19810 zip code during the month of March was $311.90 with a sample standard deviation of $51.60. The 90% confidence interval around this sample mean is _______.
Version 1
29
A) [$285.98, $337.82] B) [$292.48, $331.32] C) [$278.09. $345.71] D) [$266.82, $356.98]
96) A Monster.com pool of 3,057 individuals asked, "What’s the longest vacation you plan to take this summer?" The following relative frequency distribution summarizes the results. Response
Relative Frequency
A few days
0.21
A few long weekends
0.18
One week
0.36
Two weeks
0.22
The 95% confidence interval for the proportion of people who plan a one-week vacation this summer is ______. A) [0.345, 0.375] B) [0.345, 0.382] C) [0.343, 0.377] D) [0.338, 0.382]
97) A Monster.com pool of 3,057 individuals asked: "What’s the longest vacation you plan to take this summer?" The following relative frequency distribution summarizes the results. Response
Relative Frequency
A few days
0.21
A few long weekends
0.18
One week
0.36
Two weeks
0.22
The 99% confidence interval for the proportion of people who plan a one-week or two-week vacation this summer is ______.
Version 1
30
A) [0.563, 0.603] B) [0.557, 0.597] C) [0.563, 0.597] D) [0.557, 0.603]
98) A random sample of 130 mortgages in the state of Maryland was randomly selected. From this sample, 14 were found to be delinquent on their current payment. The 98% confidence interval for the proportion based on this sample is _______. A) [0.044, 0.171] B) [0.036, 0.180] C) [0.029, 0.188] D) [0.015, 0.201]
99) The Retail Advertising and Marketing Association would like to estimate the average amount of money that a person spends for Mother’s Day with 99% confidence interval and a margin of error within plus or minus $6. Assuming the standard deviation for spending on Mother’s Day is $36, the required sample size is ______. A) 210 B) 239 C) 284 D) 313
100) An employee of the Bureau of Transportation Statistics has been given the task of estimating the proportion of on-time arrivals of a budget airline. A prior study has estimated this on-time arrival rate as 78.5%. What is the minimum number of arrivals this employee must include in the sample to ensure that the margin of error for a 95% confidence interval is no more than 0.05?
Version 1
31
A) 270 B) 259 C) 260 D) 269
101) The National Center for Education would like to estimate the proportion of students who defaulted on their student loans for the state of Arizona. The total sample size needed to construct a 95% confidence interval for the proportion of student loans in default with a margin of error equal to 4% is ______. A) 336 B) 455 C) 416 D) 601
102) The snowfall (in inches) during the month of January in a particular geographic region is normally distributed with a standard deviation of 16.75. In the last 12 years, the sample average of snowfall is computed as 122.50. a.Construct a 90% confidence interval of the average snowfall in this region. b.Construct a 95% confidence interval of the average snowfall in this region. c.Comment on the width of the preceding confidence intervals.
103) The height of high school basketball players is known to be normally distributed with a standard deviation of 1.75 inches. In a random sample of eight high school basketball players, the heights (in inches) are recorded as 75, 82, 68, 74, 78, 70, 77, and 76. Construct a 95% confidence interval on the average height of all high school basketball players.
Version 1
32
104) Each portion of the SAT exam is designed to be normally distributed such that it has a population standard deviation of 100 and a mean of 500. However, the mean has changed over the years as less selective schools began requiring the SAT, and because students later began to prepare more specifically for the exam. Construct a 90% confidence interval for the population mean from the following eight scores from the math portion, using the population standard deviation of 100: 450, 660, 760, 540, 420, 430, 640, and 580.
105) To estimate the mean earnings forecast for a large company, an investor looked up earnings forecasts from six financial analysts. The six forecasts he found were 200, 220, 300, 185, 210, and 210 (in millions). Suppose the investor knows the population standard deviation is 25 million. Calculate a 95% confidence interval of the population mean earnings forecast.
106) A sample of holiday shoppers is taken randomly from a local mall. Forty-nine shoppers were selected and asked what their average spending on gifts would be during the entire holiday season. The point estimate of the population mean was calculated as $550 and the sample standard deviation was calculated as $92. a.Construct a 95% confidence interval of the population mean spending. b.Explain how the central limit theorem is used in constructing this interval.
Version 1
33
107) Pure-bred dogs competing in shows must often meet specific height criteria to be within the breed standard. A new breed of dog, the pug-cock-a-poo, has been approved by a kennel club. To establish the breed standard, the organization of breeders has taken the height from a sample of 25 pug-cock-a-poos and found the sample mean to equal 14 inches. The variance of this sample is 2.25. a.Construct a 90% confidence interval for the population mean height of the pug-cock-a-poos. b.What assumption is necessary for constructing the aforementioned interval?
108) A sample of the weights of 35 babies is taken from the local hospital maternity ward. The point estimate for the mean weight of the babies is 110.2 ounces with a sample standard deviation of 23.4 ounces. Construct a 90% confidence interval for the population mean baby weight at this hospital. Interpret this interval.
109) The personnel department of a large corporation wants to estimate the family dental expenses of its employees to determine the feasibility of providing a dental insurance plan. A random sample of six employees reveals the following family dental expenses: $180, $260, $60, $40, $100, and $80. It is known that dental expenses follow a normal distribution. Construct a 90% confidence interval for the average family dental expenses for all employees of this corporation.
Version 1
34
110) A job candidate with an offer from a prominent investment bank wanted to estimate how many hours she would have to work per week during her first year at the bank. She took a sample of six first-year analysts, asking how many hours they worked in the last week. a. Construct a 95% confidence interval with her results: 64, 82, 74, 73, 78, and 87 hours. b. What will happen to the width of the confidence interval for the average hours of work if the job candidate uses a smaller confidence level than 95%? Explain.
111) Bobby does not want to be late to work again. He drives to work every morning from Oakland to San Francisco and crossing the Bay Bridge seems to take forever. He times himself crossing the bridge for seven consecutive mornings, with the resulting times of 22.34, 27.54, 15.26, 13.56, 18.57, 21.22, and 19.08 minutes. Construct a 99% confidence interval from those times.
112) A string quartet in Maui wants to better understand its income stream, which primarily comes from playing at weddings. The cellist records the number of weddings the quartet plays in six successive months. Construct a 90% confidence interval from her results: 3, 5, 7, 4, 4, and 5.
113) A medical engineering company creates x-ray machines. The machines the company sold in 1995 were expected to last six years before breaking. To test how long the machines actually lasted, the company took a simple random sample of six machines. The company got the following results (in years) for how long the x-ray machines lasted: 8, 6, 7, 9, 5, and 7. Assume the distribution of the longevity of x-ray machines is normally distributed. Construct a 98% confidence interval for the average longevity of x-ray machines.
Version 1
35
114) An environmentalist is measuring the number of spotted leopard lizards in central California per acre. The environmentalist surveys six acres and finds signs of 24, 26, 30, 27, 28, and 27 lizards per acre. Construct a 95% confidence interval for the average number of leopard lizards per acre.
115) A company manager thinks he is overpaying for fertilizer at $174 per ton. To find out, he collects fertilizer prices per ton from five distributors. Construct a 90% confidence interval from his results: 128, 140, 170, 166, and 156.
116) A large university is interested in the outcome of a course standardization process. They have taken a random sample of 100 student grades, across all instructors. The grades represent the proportion of problems answered correctly on a midterm exam. The sample proportion correct was calculated as 0.78. a.Construct a 90% confidence interval on the population proportion of correctly answered problems. b.Construct a 95% confidence interval on the population proportion of correctly answered problems.
Version 1
36
117) A large university is interested in the outcome of a course standardization process. They have learned that 150 students of the total 1,500 students failed to pass the course in the current semester. Construct a 99% confidence interval on the population proportion of students who failed to pass the course. Was the normality condition met for the validity of the confidence interval formula?
118) A large city recently surveyed 1,354 streetlights, finding that 4.2% of them had burned out. Construct a 99% confidence interval for the proportion of the city’s streetlights that are burned out.
119) A recent survey of 1,014 American adults found that 46% view environmental conditions as either "excellent" or "good" (Gallup, March 4–7). Construct a 95% confidence interval of the population proportion.
120) A large earthquake hit a city in South America. A survey of 2,000 houses found that 600 of them had collapsed. Construct a 95% confidence interval for the proportion of collapsed houses.
Version 1
37
121) A student interviewing at a major company wants to know his chances of getting a job offer. He surveyed 18 alumni that had interviewed with the company and found nine of them had received a job offer. Construct a 90% confidence interval for the population proportion.
122) Pure-bred dogs competing in shows must often meet specific height criteria to be considered to be within the breed standard. A new breed of dog, the pug-cock-a-poo, has been approved by a kennel club. The variance of pug-cock-a-poo height is known to be 2.25 inches squared. You would like to estimate the mean height of all pug-cock-a-poo breeds of dogs. a.Find the appropriate sample size to achieve a margin of error of 0.3 when maintaining a 90% level of confidence. b.Find the appropriate sample size to achieve a margin of error of 0.3 when maintaining a 95% level of confidence.
123) A sample of holiday shoppers is taken randomly from a local mall to determine the average daily spending on gifts. From a preselected sample, the standard deviation was determined to be $26. You would like to construct a 95% confidence interval for the mean daily spending on all holiday spending. a.Find the appropriate sample size necessary to achieve a margin of error of $5. b.Find the appropriate sample size necessary to achieve a margin of error of $8.
Version 1
38
124) The Department of Education wishes to estimate the average SAT score among U.S. high school students. How many students would they have to sample, given that the test has a known population standard deviation of 100, in order to ensure that the margin of error was no more than 20 for a 99% confidence interval?
125) A financial analyst would like to construct a 95% confidence interval for the mean earnings of a company. The company’s earnings have a standard deviation of $12 million. a. What is the minimum sample size required by the analyst if he wants to restrict the margin of error to $2 million? b. What happens to the minimum sample size required if the analyst wants to have a smaller margin of error than $2 million? Explain.
126) Newscasters wish to predict the outcome of the next presidential election with 95% confidence and a margin of error equal to 0.08. Find the appropriate sample size to accomplish this goal.
127) Newscasters wish to predict the outcome of the next presidential election with 99% confidence and a margin of error equal to 0.08. Find the appropriate sample size to accomplish this goal.
Version 1
39
128) A budget airline wants to estimate what proportion of customers would pay $10 for inflight wireless access. Given that the airline has no prior knowledge of the proportion, how many customers would it have to sample to ensure a margin of error of no more than 5% for a 95% confidence interval?
129) A car dealership wants to estimate the percentage of customers who are satisfied with their customer service. Given that the car dealership has no prior knowledge of the proportion, how many customers would it have to sample to ensure a margin of error of no more than 5% for a 99% confidence interval?
130) A confidence interval provides a value that, with a certain measure of confidence, is the population parameter of interest. ⊚ true ⊚ false
131) <p>For a given confidence level and sample size n, the width of the confidence interval for the population mean is narrower, the greater the population standard deviation σ. ⊚ true ⊚ false
132) <p>For a given confidence level and population standard deviation σ, the width of the confidence interval for the population mean is wider, the smaller the sample size n.
Version 1
40
⊚ ⊚
true false
133) For a given sample size n and population standard deviation σ, the width of the confidence interval for the population mean is wider, the smaller the confidence level . ⊚ true ⊚ false
134)
If a random sample of size n is taken from a normal population with a finite variance,
then the statistic ⊚ true ⊚ false
follows the tdf distribution with (n − 1) degrees of freedom, df.
135) Like the z distribution, the tdf distribution is symmetric around 0, bell-shaped, and with tails that approach the horizontal axis and eventually cross it. ⊚ true ⊚ false
136)
The tdf distribution has broader tails than the z distribution. ⊚ true ⊚ false
137) The tdf distribution consists of a family of distributions where the actual shape of each one depends on the degrees of freedom, df. For lower values of df, the tdf distribution is similar to the z distribution. ⊚ true ⊚ false
Version 1
41
138) The required sample size for the interval estimation of the population mean can be computed if we specify the population standard deviation σ, the value of based on the confidence level 100(1 − α)% and the desired margin of error E. ⊚ true ⊚ false
139) <p>If we want to find the required sample size for the interval estimation of the population proportion, and no reasonable estimate of this proportion is available, we assume the worst-case scenario under which ⊚ true ⊚ false
140) The main ingredient for developing a confidence interval is the sampling distribution of the underlying statistic. ⊚ true ⊚ false
141)
If a small segment of the population is sampled then an estimate will be less precise. ⊚ true ⊚ false
142) The t distribution table lists tdf values for selected lower-tail probabilities and degrees of freedom df. ⊚ true ⊚ false
143)
Excel does not have a special function that allows to calculate t values.
Version 1
42
⊚ ⊚
144) size.
true false
<p>The sampling distribution of is based on a normal distribution on samples of any ⊚ ⊚
Version 1
true false
43
Answer Key Test name: Chap 08_4e 1) margin of error 2) normal 3) 1 - α 4) two 5) degrees of freedom 6) uncertainty 7) proportion 8) round up 9) C 10) B 11) A 12) D 13) C 14) C 15) C 16) A 17) A 18) A 19) D 20) C 21) C 22) D 23) D 24) D 25) B 26) A Version 1
44
27) A 28) B 29) D 30) D 31) B 32) B 33) C 34) C 35) C 36) C 37) D 38) D 39) D 40) C 41) D 42) D 43) D 44) A 45) C 46) A 47) B 48) B 49) B 50) A 51) C 52) B 53) A 54) A 55) D 56) B Version 1
45
57) C 58) A 59) D 60) C 61) A 62) B 63) A 64) B 65) B 66) A 67) B 68) B 69) A 70) B 71) D 72) D 73) C 74) A 75) C 76) D 77) D 78) B 79) B 80) C 81) C 82) D 83) B 84) D 85) D 86) D Version 1
46
87) D 88) A 89) A 90) B 91) C 92) A 93) A 94) C 95) B 96) C 97) D 98) A 99) B 100) C 101) D 102) a.[114.5466, 130.4534] b.[113.0230, 131.9770] c.The 95% interval is wider. 103) [73.7873, 76.2127] 104) [501.8456, 618.1544] 105) [200.8295, 240.8371] 106) a. [523.5745, 576.4255] b. follows the normal distribution because the sample size n = 49 >30 (central limit theorem). 107) a.[13.4867, 14.5133] b.Because the sample size is less than 30, we must assume that the underlying population is normally distributed. 108) [103.5119, 116.8881] Ninety percent of similarly constructed intervals of size 35 would contain the population mean weight of babies in this hospital. 109) [50.9805, 189.0195]
Version 1
47
110) a. [67.9729, 84.6937] b. The width of the confidence interval becomes narrower. For example, a 90% confidence interval is [69.7797, 82.8869] as opposed to a 95% confidence interval of [67.9729, 84.6937]. 111) [13.1353, 26.1705] 112) [3.5427, 5.7907] 113) [5.0573, 8.9427] 114) [24.9011, 29.0989] 115) [135.1059, 168.8941] 116) a. [0.7119, 0.8481] b. [0.6988, 0.8612] 117) [0.0800, 0.1200]; Yes 118) [0.0280, 0.0560] 119) [0.4293, 0.4907] 120) [0.2799, 0.3201] 121) [0.3062, 0.6938] 125) a. 139 b. The minimum sample size required will increase. For example, if E = $1 million, n = 554.
130) FALSE 131) FALSE 132) TRUE 133) FALSE 134) TRUE 135) FALSE 136) TRUE 137) FALSE 138) TRUE 139) TRUE 140) TRUE 141) TRUE Version 1
48
142) FALSE 143) FALSE 144) FALSE
Version 1
49
CHAPTER 09 – 4e 1)
The hypothesis statement H: µ = 25 is an example of a(an) __________ hypothesis.
2)
The hypothesis statement H:
is an example of a(an) __________ hypothesis.
3) A(n) __________ error is committed when we reject the null hypothesis when the null hypothesis is true.
4) We define the allowed probability of making a Type I error as α, and we refer to 100α% as the __________.
5) If we reject a null hypothesis at the 1% significance level, then we have __________ evidence that the null hypothesis is false.
6) Given the __________ of the z distribution, the p-value for a two-tailed test is twice that of the p-value for a one-tailed test.
7) Excel’s function __________ returns the p-value for a right-tailed test of the population mean, when the raw data is available and is known.
8) To obtain the p-value for a right-tailed test of the population mean, when σ is unknown, we use the Excel’s function __________.
Version 1
1
9)
The null hypothesis in a hypothesis test refers to __________. A) the desired outcome B) the default state of nature C) the altered state of nature D) the desired state of nature
10)
In general, the null and alternative hypotheses are __________. A) additive B) correlated C) multiplicative D) mutually exclusive
11)
The alternative hypothesis typically __________. A) corresponds to the presumed default state of nature B) contests the status quo, for which a corrective action may be required C) states the probability of rejecting the null hypothesis when it is false D) states the probability of rejecting the null hypothesis when it is true
12)
Which of the following types of tests may be performed? A) Right-tailed and two-tailed tests B) Left-tailed and two-tailed tests C) Right-tailed and left-tailed tests D) Right-tailed, left-tailed, and two-tailed tests
Version 1
2
13) The national average for an eighth-grade reading comprehension test is 73. A school district claims that its eighth-graders outperform the national average. In testing the school district’s claim, how does one define the population parameter of interest? A) The mean score on the eighth-grade reading comprehension test B) The number of eighth graders who took the reading comprehension test C) The standard deviation of the score on the eighth-grade reading comprehension test D) The proportion of eighth graders who scored above 73 on the reading comprehension test
14) A local courier service advertises that its average delivery time is less than 6 hours for local deliveries. When testing the two hypotheses, H0: μ ≥ 6 and HA: μ < 6, μ stand for __________. A) the mean delivery time B) the standard deviation of the delivery time C) the number of deliveries that took less than 6 hours D) the proportion of deliveries that took less than 6 hours
15) It is generally believed that no more than 0.50 of all babies in a town in Texas are born out of wedlock. A politician claims that the proportion of babies born out of wedlock is increasing. In testing the politician’s claim, how does one define the population parameter of interest? A) The current proportion of babies born out of wedlock B) The mean number of babies born out of wedlock C) The number of babies born out of wedlock D) The general belief that the proportion of babies born out of wedlock is no more than 0.50
Version 1
3
16) It is generally believed that no more than 0.50 of all babies in a town in Texas are born out of wedlock. A politician claims that the proportion of babies born out of wedlock is increasing. When testing the two hypotheses, H0: p ≤ 0.50 and HA: p > 0.50, p stands for __________. A) the current proportion of babies born out of wedlock B) the mean number of babies born out of wedlock C) the number of babies born out of wedlock D) the general belief that the proportion of babies born out of wedlock is no more than 0.50
17) It is generally believed that no more than 0.50 of all babies in a town in Texas are born out of wedlock. A politician claims that the proportion of babies born out of wedlock is increasing. Identify the correct null and alternative hypotheses to test the politician’s claim. A) H0: p = 0.50 and HA: p ≠ 0.50 B) H0: p ≤ 0.50 and HA: p > 0.50 C) H0: p ≥ 0.50 and HA: p > 0.50 D) H0: p ≥ 0.50 and HA: p < 0.50
18) Many cities around the United States are installing LED streetlights, in part to combat crime by improving visibility after dusk. An urban police department claims that the proportion of crimes committed after dusk will fall below the current level of 0.84 if LED streetlights are installed. Specify the null and alternative hypotheses to test the police department’s claim. A) H0: p = 0.84 and HA: p ≠ 0.84 B) H0: p < 0.84 and HA: p ≥ 0.84 C) H0: p ≤ 0.84 and HA: p > 0.84 D) H0: p ≥ 0.84 and HA: p < 0.84
Version 1
4
19) A professional sports organization is going to implement a test for steroids. The test gives a positive reaction in 94% of the people who have taken the steroid. However, it erroneously gives a positive reaction in 4% of the people who have not taken the steroid. What is the probability of Type I and Type II errors giving the null hypothesis "the individual has not taken steroids." A) Type I: 4%, Type II: 6% B) Type I: 6%, Type II: 4% C) Type I: 94%, Type II: 4% D) Type I: 4%, Type II: 94%
20) A statistics professor works tirelessly to catch students cheating on his exams. He has particular routes for his teaching assistants to patrol, an elevated chair to ensure an unobstructed view of all students, and even a video recording of the exam in case additional evidence needs to be collected. He estimates that he catches 95% of students who cheat in his class, but 1% of the time that he accuses a student of cheating he is actually incorrect. Consider the null hypothesis, "the student is not cheating." What is the probability of a Type I error? A) 1% B) 5% C) 95% D) 99%
21)
A Type I error occurs when we __________. A) reject the null hypothesis when it is actually true B) reject the null hypothesis when it is actually false C) do not reject the null hypothesis when it is actually true D) do not reject the null hypothesis when it is actually false
22)
A Type II error occurs when we __________.
Version 1
5
A) reject the null hypothesis when it is actually true B) reject the null hypothesis when it is actually false C) do not reject the null hypothesis when it is actually true D) do not reject the null hypothesis when it is actually false
23) When we reject the null hypothesis when it is actually false, we have committed __________. A) no error B) a Type I error C) a Type II error D) a Type I error and a Type II error
24)
For a given sample size n, __________.
A) decreasing the probability of a Type I error α will increase the probability of a Type II error β B) decreasing the probability of a Type I error α will decrease the probability of a Type II error β C) changing the probability of a Type I error α will have no impact on the probability of a Type II error β D) increasing the probability of a Type I error α will increase the probability of a Type II error β as long as σ is known
25) Consider the following hypotheses that relate to the medical field: H0: A person is free of disease. HA: A person has disease. In this instance, a Type I error is often referred to as __________.
Version 1
6
A) a false positive B) a false negative C) a negative result D) the power of the test
26) Consider the following hypotheses that relate to the medical field: H0: A person is free of disease. HA: A person has disease. In this instance, a Type II error is often referred to as __________. A) a false positive B) a false negative C) a negative result D) the power of the test
27) A fast-food franchise is considering building a restaurant at a busy intersection. A financial advisor determines that the site is acceptable only if, on average, more than 300 automobiles pass the location per hour. The advisor tests the following hypotheses: H0: μ ≤ 300. HA: μ > 300. The consequences of committing a Type I error would be that __________. A) the franchiser builds on an acceptable site B) the franchiser builds on an unacceptable site C) the franchiser does not build on an acceptable site D) the franchiser does not build on an unacceptable site
Version 1
7
28) A fast-food franchise is considering building a restaurant at a busy intersection. A financial advisor determines that the site is acceptable only if, on average, more than 300 automobiles pass the location per hour. The advisor tests the following hypotheses: H0: μ ≤ 300. HA: μ > 300. The consequences of committing a Type II error would be that __________. A) the franchiser builds on an acceptable site B) the franchiser builds on an unacceptable site C) the franchiser does not build on an acceptable site D) the franchiser does not build on an unacceptable site
29) A fast-food franchise is considering building a restaurant at a busy intersection. A financial advisor determines that the site is acceptable only if, on average, more than 300 automobiles pass the location per hour. If the advisor tests the hypotheses H0: μ ≤ 300 versus HA: μ > 300, μ stands for __________. A) the average number of automobiles that pass the intersection per hour B) the number of automobiles that pass the intersection per hour C) the proportion of automobiles that pass the intersection per hour D) the standard deviation of the number of automobiles that pass the intersection per hour
30)
Which of the following answers represents the objective of a hypothesis test?
A) Rejecting the null hypothesis when it is true. B) Rejecting the null hypothesis when it is false. C) Not rejecting the null hypothesis when it is true. D) Rejecting the null hypothesis when it is false and not rejecting the null hypothesis when it is true.
31)
When conducting a hypothesis test, which of the following decisions represents an error?
Version 1
8
A) Rejecting the null hypothesis when it is true. B) Rejecting the null hypothesis when it is false. C) Not rejecting the null hypothesis when it is true. D) Rejecting the null hypothesis when it is false and not rejecting the null hypothesis when it is true.
32)
A hypothesis test regarding the population mean is based on __________. A) the sampling distribution of the sample mean B) the sampling distribution of the sample variance C) the sampling distribution of the sample proportion D) the sampling distribution of the sample standard deviation
33) When conducting a hypothesis test concerning the population mean, and the population standard deviation is known, the value of the test statistic is calculated as __________. A) B) C) D)
34) When conducting a hypothesis test concerning the population mean, and the population standard deviation is unknown, the value of the test statistic is calculated as __________.
Version 1
9
A) B) C) D)
35) When conducting a hypothesis test concerning the population proportion, the value of the test statistic is calculated as __________. A) B) C) D)
36)
Which of the following represents an appropriate set of hypotheses? A) B) C) D)
37)
If the chosen significance level is α = 0.05, then __________.
Version 1
10
A) there is a 5% probability of rejecting a true null hypothesis B) there is a 5% probability of accepting a true null hypothesis C) there is a 5% probability of rejecting a false null hypothesis D) there is a 5% probability of accepting a false null hypothesis
38) When conducting a hypothesis test for a given sample size, if α is increased from 0.05 to 0.10, then __________. A) the probability of incorrectly rejecting the null hypothesis decreases B) the probability of incorrectly failing to reject the null hypothesis increases C) the probability of Type I error decreases D) the probability of Type II error decreases
39) When conducting a hypothesis test for a given sample size, if the probability of a Type I error decreases, then the __________. A) probability of Type II error decreases B) probability of incorrectly rejecting the null hypothesis increases C) probability of incorrectly not rejecting the null hypothesis increases D) probability of incorrectlynot rejecting the null hypothesis decreases
40)
Which of the following are one-tailed tests? A) H0:μ ≤ 10, HA:μ > 10 B) H0:μ = 10, HA:μ ≠ 10 C) H0:μ ≥ 400, HA:μ < 400 D) Both H0:μ ≤ 10, HA:μ > 10 and H0:μ ≥ 400, HA:μ < 400
41)
Which of the following are two-tailed tests?
Version 1
11
A) H0:μ ≤ 10, HA:μ > 10 B) H0:μ = 10, HA:μ ≠ 10 C) H0:μ ≥ 400, HA:μ < 400 D) Both H0:μ ≤ 10, HA:μ > 10 and H0:μ ≥ 400, HA:μ < 400
42) If the p-value for a hypothesis test is 0.027 and the chosen level of significance is α = 0.05, then the correct conclusion is to __________. A) reject the null hypothesis B) not reject the null hypothesis C) reject the null hypothesis if σ = 10 D) not reject the null hypothesis if σ = 10
43) If the p-value for a hypothesis test is 0.07 and the chosen level of significance is α = 0.05, then the correct conclusion is to __________. A) reject the null hypothesis B) not reject the null hypothesis C) reject the null hypothesis if σ = 10 D) not reject the null hypothesis if σ = 10
44)
If the null hypothesis is rejected at a 1% significance level, then __________. A) the null hypothesis will be rejected at a 5% significance level B) the alternative hypothesis will be rejected at a 5% significance level C) the null hypothesis will not be rejected at a 5% significance level D) the alternative hypothesis will not be rejected at a 5% significance level
Version 1
12
45)
What is the decision rule when using the p-value approach to hypothesis testing? A) Reject H0 if the p-value > α. B) Reject H0 if the p-value < α. C) Do not reject H0 if the p-value < 1 − α. D) Do not reject H0 if the p-value > 1 − α.
46) Consider the following competing hypotheses: Ho: μ = 0, HA: μ ≠ 0. The value of the test statistic is z = −1.70. If we choose a 1% significance level, then we __________. A) reject the null hypothesis and conclude that the population mean is significantly different from zero B) do not reject the null hypothesis and conclude that the population mean is significantly different from zero C) reject the null hypothesis and conclude that the population mean is not significantly different from zero D) do not reject the null hypothesis and conclude that the population mean is not significantly different from zero
47) Consider the following competing hypotheses: H0:μ = 0, HA:μ ≠ 0. The value of the test statistic is z = −1.38. If we choose a 5% significance level, then we __________. A) reject the null hypothesis and conclude that the population mean is significantly different from zero B) reject the null hypothesis and conclude that the population mean is not significantly different from zero C) do not reject the null hypothesis and conclude that the population mean is significantly different from zero D) do not reject the null hypothesis and conclude that the population mean is not significantly different from zero
Version 1
13
48) A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.80. A sample of 37 honors students is taken and is found to have a mean GPA equal to 3.90. The population standard deviation is assumed to equal 0.40. The parameter to be tested is __________. A) the mean GPA of 3.9 for the 37 selected honors students B) the mean GPA of the university honors students C) the proportion of honors students with a GPA exceeding 3.8 D) the mean GPA of all university students
49) A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.50. A sample of 36 honors students is taken and is found to have a mean GPA equal to 3.60. The population standard deviation is assumed to equal 0.40. The parameter to be tested is __________. A) the mean GPA of all university students B) the mean GPA of the university honors students C) the mean GPA of 3.60 for the 36 selected honors students D) the proportion of honors students with a GPA exceeding 3.50
50) A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.50. A sample of 36 honors students is taken and is found to have a mean GPA equal to 3.60. The population standard deviation is assumed to equal 0.40. To establish whether the mean GPA exceeds 3.50, the appropriate hypotheses are __________. A) B) C) D)
Version 1
14
51) A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.50. A sample of 36 honors students is taken and is found to have a mean GPA equal to 3.60. The population standard deviation is assumed to equal 0.40. The value of the test statistic is __________. A) z = −1.50 B) t35 = −1.50 C) z = 1.50 D) t35 = 1.50
52) A university is interested in promoting graduates of its honors program by establishing that the mean GPA of these graduates exceeds 3.50. A sample of 36 honors students is taken and is found to have a mean GPA equal to 3.60. The population standard deviation is assumed to equal 0.40. At a 5% significance level, the decision is to __________. A) reject H0; we can conclude that the mean GPA is significantly greater than 3.50 B) reject H0; we cannot conclude that the mean GPA is significantly greater than 3.50 C) not reject H0; we can conclude that the mean GPA is significantly greater than 3.50 D) not reject H0; we cannot conclude that the mean GPA is significantly greater than 3.50
53) The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. The population parameter to be tested is __________. A) the average number of 750 customers per day B) the proportion of customers visiting the dealership per day C) the mean number of customers visiting the dealership per day D) the standard deviation of the number of customers visiting the dealership per day
Version 1
15
54) The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. To determine whether there has been a decrease in the average number of customers visiting the dealership daily, the appropriate hypotheses are __________. A) B) C) D)
55) The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 830 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 780. Assume that the population standard deviation is 330. The value of the test statistic is __________. A) t99 = 1.515 B) z = 1.515 C) t99 = −1.515 D) z = −1.515
56) The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. The value of the test statistic is __________. A) z = −1.429 B) t99 = −1.429 C) z = 1.429 D) t99 = 1.429
Version 1
16
57) The owner of a large car dealership believes that the financial crisis decreased the number of customers visiting her dealership. The dealership has historically had 800 customers per day. The owner takes a sample of 100 days and finds the average number of customers visiting the dealership per day was 750. Assume that the population standard deviation is 350. At the 5% significance level, the decision is to __________. A) reject H0; we can conclude that the mean number of customers visiting the dealership is significantly less than 800 B) reject H0; we cannot conclude that the mean number of customers visiting the dealership is significantly less than 800 C) do not reject H0; we can conclude that the mean number of customers visiting the dealership is significantly less than 800 D) do not reject H0; we cannot conclude that the mean number of customers visiting the dealership is significantly less than 800
58) The Boston public school district has had difficulty maintaining on-time bus service for its students ("A Year Later, School Buses Still Late," Boston Globe, October 5). Suppose the district develops a new bus schedule to help combat chronic lateness on a particularly woeful route. Historically, the bus service on the route has been, on average, 12 minutes late. After the schedule adjustment, the first 36 runs were an average of eight minutes late. As a result, the Boston public school district claimed that the schedule adjustment was an improvement— students were not as late. Assume a population standard deviation for bus arrival time of 12 minutes. Which of the following can be used to determine whether the schedule adjustment reduced the average lateness time of 12 minutes? A) B) C) D)
Version 1
17
59) The Boston public school district has had difficulty maintaining on-time bus service for its students ("A Year Later, School Buses Still Late," Boston Globe, October 5). Suppose the district develops a new bus schedule to help combat chronic lateness on a particularly woeful route. Historically, the bus service on the route has been, on average, 14 minutes late. After the schedule adjustment, the first 25 runs were an average of eightminutes late. As a result, the Boston public school district claimed that the schedule adjustment was an improvement— students were not as late. Assume a population standard deviation for bus arrival time of 14 minutes. The value of the test statistic is __________. A) z = −2.14 B) t35 = −0.33 C) z = −0.33 D) t35 = −2.14
60) The Boston public school district has had difficulty maintaining on-time bus service for its students ("A Year Later, School Buses Still Late," Boston Globe, October 5). Suppose the district develops a new bus schedule to help combat chronic lateness on a particularly woeful route. Historically, the bus service on the route has been, on average, 12 minutes late. After the schedule adjustment, the first 36 runs were an average of eight minutes late. As a result, the Boston public school district claimed that the schedule adjustment was an improvement— students were not as late. Assume a population standard deviation for bus arrival time of 12 minutes. The value of the test statistic is __________. A) t35 = −2.00 B) z = −2.00 C) t35 = −0.33 D) z = −0.33
61) A 99% confidence interval for the population mean yields the following results: [−3.79, 5.86]. At the 1% significance level, what decision should be made regarding the following hypothesis test with H0:μ = 0,HA:μ ≠ 0?
Version 1
18
A) Reject H0; we can conclude that the mean differs from zero. B) Reject H0; we cannot conclude that the mean differs from zero. C) Do not reject H0; we can conclude that the mean differs from zero. D) Do not reject H0; we cannot conclude that the mean differs from zero.
62) To test if the mean IQ of employees in an organization is greater than 100, a sample of 30 employees is taken and the value of the test statistic is computed as t29 = 2.42 If we choose a 5% significance level, we __________. A) reject the null hypothesis and conclude that the mean IQ is greater than 100 B) reject the null hypothesis and conclude that the mean IQ is not greater than 100 C) do not reject the null hypothesis and conclude that the mean IQ is greater than 100 D) do not reject the null hypothesis and conclude that the mean IQ is not greater than 100
63) A recent report claimed that Americans are retiring later in life (U.S. News & World Report, August 17). An economist wishes to determine if the mean retirement age has increased from 62. To conduct the relevant test, she takes a random sample of 38 Americans who have recently retired and computes the value of the test statistic as t37 = 1.92. With α = 0.05, she __________. A) rejects the null hypothesis and concludes that the mean retirement age has increased B) rejects the null hypothesis and concludes that the mean retirement age has not increased C) does not reject the null hypothesis and does not conclude that the mean retirement age has changed D) does not reject the null hypothesis and does not conclude that the mean retirement age has increased
Version 1
19
64) A schoolteacher is worried that the concentration of dangerous, cancer-causing radon gas in her classroom is greater than the safe level of 4pCi/L. The school samples the air for 36 days and finds an average concentration of 4.4pCi/L with a standard deviation of 1pCi/L. To test whether the average level of radon gas is greater than the safe level, the appropriate hypotheses are __________. A) B) C) D)
65) A schoolteacher is worried that the concentration of dangerous, cancer-causing radon gas in her classroom is greater than the safe level of 8 pCi/L. The school samples the air for 36 days and finds an average concentration of 8.4 pCi/L with a standard deviation of 1 pCi/L. The value of the test statistic is __________. A) t35 = −2.40 B) z = −2.40 C) t35 = 2.40 D) z = 2.40
66) A schoolteacher is worried that the concentration of dangerous, cancer-causing radon gas in her classroom is greater than the safe level of 4pCi/L. The school samples the air for 36 days and finds an average concentration of 4.4pCi/L with a standard deviation of 1pCi/L. The value of the test statistic is __________. A) t35 = −2.40 B) z = −2.40 C) t35 = 2.40 D) z = 2.40
Version 1
20
67) A schoolteacher is worried that the concentration of dangerous, cancer-causing radon gas in her classroom is greater than the safe level of 4pCi/L. The school samples the air for 36 days and finds an average concentration of 4.4pCi/L with a standard deviation of 1pCi/L. At a 5% significance level, the decision is to __________. A) reject H0; we can conclude that the mean concentration of radon gas is greater than the safe level B) reject H0; we cannot conclude that the mean concentration of radon gas is greater than the safe level C) not reject H0; we can conclude that the mean concentration of radon gas is greater than the safe level D) not reject H0; we cannot conclude that the mean concentration of radon gas is greater than the safe level
68) A car dealer who sells only late-model luxury cars recently hired a new salesperson and believes that this salesperson is selling at lower markups. He knows that the long-run average markup in his lot is $5,300. He takes a random sample of 25 of the new salesperson’s sales and finds an average markup of $4,700 and a standard deviation of $840. Assume the markups are normally distributed. What is the value of an appropriate test statistic for the car dealer to use to test his claim? A) z = −3.57 B) t24 = −3.57 C) z = −0.75 D) t24 = −0.75
69) A car dealer who sells only late-model luxury cars recently hired a new salesperson and believes that this salesperson is selling at lower markups. He knows that the long-run average markup in his lot is $5,600. He takes a random sample of 16 of the new salesperson’s sales and finds an average markup of $5,000 and a standard deviation of $800. Assume the markups are normally distributed. What is the value of an appropriate test statistic for the car dealer to use to test his claim?
Version 1
21
A) t15 = −3.00 B) z = −3.00 C) t15 = −0.75 D) z = −0.75
70) A newly hired basketball coach promised a high-paced attack that will put more points on the board than the team’s previously tepid offense historically managed. After a few months, the team owner looks at the data to test the coach’s claim. He takes a sample of 16 of the team’s games under the new coach and finds that they scored an average of 85 points with a standard deviation of 5 points. Over the past 10 years, the team had averaged 83 points. What is the value of the appropriate test statistic to test the new coach’s claim at the 1% significance level? A) t15 = 1.60 B) z = 1.60 C) z = 0.33 D) t15 = 0.33
71) A newly hired basketball coach promised a high-paced attack that will put more points on the board than the team’s previously tepid offense historically managed. After a few months, the team owner looks at the data to test the coach’s claim. He takes a sample of 36 of the team’s games under the new coach and finds that they scored an average of 101 points with a standard deviation of 6 points. Over the past 10 years, the team had averaged 99 points. What is the value of the appropriate test statistic to test the new coach’s claim at the 1% significance level? A) t35 = 0.33 B) z = 0.33 C) t35 = 2.00 D) z = 2.00
72) What type of variable would necessitate using a hypothesis test of the population proportion rather than a test of the population mean? Version 1
22
A) Ratio-scaled B) Interval-scaled C) Categorical D) Numerical
73) You want to test if more than 20% of homes in a neighborhood have recently been sold through a short sale, at a foreclosure auction, or by the bank following an unsuccessful foreclosure auction. You take a sample of 60 homes from this neighborhood and find that 14 fit your criteria. The appropriate null and alternative hypotheses are __________. A) B) C) D)
74) A university interested in tracking its honors program believes that the proportion of graduates with a GPA of 3.00 or below is less than 0.20. In a sample of 200 graduates, 30 students have a GPA of 3.00 or below. In testing the university’s belief, how does one define the population parameter of interest? A) It’s the proportion of honors graduates with a GPA of 3.00 or below. B) It’s the mean number of honors graduates with a GPA of 3.00 or below. C) It’s the number of honors graduates with a GPA of 3.00 or below. D) It’s the standard deviation of the number of honors graduates with a GPA of 3.00 or below.
75) A university interested in tracking its honors program believes that the proportion of graduates with a GPA of 3.00 or below is less than 0.20. In a sample of 200 graduates, 30 students have a GPA of 3.00 or below. In testing the university’s belief, the appropriate hypotheses are __________.
Version 1
23
A) B) C) D)
76) A university interested in tracking its honors program believes that the proportion of graduates with a GPA of 3.00 or below is less than 0.25. In a sample of 170 graduates, 35 students have a GPA of 3.00 or below. The value of the test statistic and its associated p-value are __________.
A) t169 = −1.33 and p-value = 0.0918 B) z = −1.33 and p-value = 0.0768 C) t169 = −1.33 and p-value = 0.0768 D) z = −1.33 and p-value = 0.0918
77) A university interested in tracking its honors program believes that the proportion of graduates with a GPA of 3.00 or below is less than 0.20. In a sample of 200 graduates, 30 students have a GPA of 3.00 or below. The value of the test statistic and its associated p-value are __________. A) t199 = −1.77 and p-value = 0.0385 B) z = −1.77 and p-value = 0.0385 C) t199 = −1.77 and p-value = 0.0768 D) z = −1.77 and p-value = 0.0768
78) A university interested in tracking its honors program believes that the proportion of graduates with a GPA of 3.00 or below is less than 0.20. In a sample of 200 graduates, 30 students have a GPA of 3.00 or below. At a 5% significance level, the decision is to __________.
Version 1
24
A) reject H0; we can conclude that the proportion of graduates with a GPA of 3.00 or below is significantly less than 0.20 B) reject H0; we cannot conclude that the proportion of graduates with a GPA of 3.00 or below is significantly less than 0.20 C) not reject H0; we can conclude that the proportion of graduates with a GPA of 3.00 or below is significantly less than 0.20 D) not reject H0; we cannot conclude that the proportion of graduates with a GPA of 3.00 or below is significantly less than 0.20
79) The Institute of Education Sciences measures the high school dropout rate as the percentage of 16- through 24-year-olds who are not enrolled in school and have not earned a high school credential. Last year, the high school dropout rate was 8.1%. A polling company recently took a survey of 1,000 people between the ages of 16 and 24 and found that 6.5% of them are high school dropouts. The polling company would like to determine whether the dropout rate has decreased. When testing whether the dropout rate has decreased, the appropriate hypotheses are __________. A) B) C) D)
80) The Institute of Education Sciences measures the high school dropout rate as the percentage of 16- through 24-year-olds who are not enrolled in school and have not earned a high school credential. Last year, the high school dropout rate was 8.1%. A polling company recently took a survey of 1,000 people between the ages of 16 and 24 and found that 6.5% of them are high school dropouts. The polling company would like to determine whether the dropout rate has decreased. At a 5% significance level, the test statistic is __________. A) z = −2.052 B) z = −1.8545 C) z = 1.8545 D) z = 2.052
Version 1
25
81) The Institute of Education Sciences measures the high school dropout rate as the percentage of 16- through 24-year-olds who are not enrolled in school and have not earned a high school credential. Last year, the high school dropout rate was 8.1%. A polling company recently took a survey of 1,000 people between the ages of 16 and 24 and found that 6.5% of them are high school dropouts. The polling company would like to determine whether the dropout rate has decreased. At a 5% significance level, the p-value and α are __________. A) p-value = 0.0210 and α = 0.05 B) p-value = 0.0318 and α = 0.05 C) p-value = 0.0210 and α = 0.025 D) p-value = 0.0318 and α = 0.025
82) The Institute of Education Sciences measures the high school dropout rate as the percentage of 16- through 24-year-olds who are not enrolled in school and have not earned a high school credential. Last year, the high school dropout rate was 8.1%. A polling company recently took a survey of 1,000 people between the ages of 16 and 24 and found that 6.5% of them are high school dropouts. The polling company would like to determine whether the dropout rate has decreased. At a 5% significance level, the decision is to __________. A) reject H0; we can conclude that the high school dropout rate has decreased B) reject H0; we cannot conclude that the high school dropout rate has decreased C) do not reject H0; we can conclude that the high school dropout rate has decreased D) do not reject H0; we cannot conclude that the high school dropout rate has decreased
Version 1
26
83) Vermont-based Green Mountain Coffee Roasters dominates the market for single-serve coffee in the United States, with its subsidiary Keurig accounting for approximately 70% of sales ("Rivals Try to Loosen Keurig’s Grip on Single-Serve Coffee Market," Chicago Tribune, February 26). But Keurig’s patent on K-cups, the plastic pods used to brew the coffee, is expected to expire this year, allowing other companies to better compete. Suppose a potential competitor has been conducting blind taste tests on its blend and finds that 47% of consumers strongly prefer its French Roast to that of Green Mountain Coffee Roasters. After tweaking its recipe, the competitor conducts a test with 144 tasters, of which 72 prefer its blend. The competitor claims that its new blend is preferred by more than 47% of consumers to Green Mountain Coffee Roasters’ French Roast. Which of the following should be used to develop the null and alternative hypotheses to test this claim? A) B) C) D)
84) Vermont-based Green Mountain Coffee Roasters dominates the market for single-serve coffee in the United States, with its subsidiary Keurig accounting for approximately 70% of sales ("Rivals Try to Loosen Keurig’s Grip on Single-Serve Coffee Market," Chicago Tribune, February 26). But Keurig’s patent on K-cups, the plastic pods used to brew the coffee, is expected to expire this year, allowing other companies to better compete. Suppose a potential competitor has been conducting blind taste tests on its blend and finds that 47% of consumers strongly prefer its French Roast to that of Green Mountain Coffee Roasters. After tweaking its recipe, the competitor conducts a test with 144 tasters, of which 72 prefer its blend. The competitor claims that its new blend is preferred by more than 47% of consumers to Green Mountain Coffee Roasters’ French Roast. What is the value of the appropriate test statistic to test this claim? A) t143 = 0.721 B) z = 0.721 C) t143 = 1.96 D) z = 1.96
Version 1
27
85) Expedia would like to test if the average round-trip airfare between Philadelphia and Dublin is less than $1,200. The correct hypothesis statement would be __________. A) B) C) D)
86) Expedia would like to test if the average round-trip airfare between Philadelphia and Dublin is less than $1,200. Which of the following hypothesis tests should be performed? A) Left-tailed B) Right-tailed C) Two-tailed D) There is not enough information to answer.
87) A company has developed a new diet that it claims will lower one’s weight by more than 10 pounds. Health officials decide to conduct a test to validate this claim. The manager of the company __________. A) is more concerned about Type I error B) is more concerned about Type II error C) is concerned about both Type I and Type II errors D) is not concerned at all
88) A company has developed a new diet that it claims will lower one’s weight by more than 10 pounds. Health officials decide to conduct a test to validate this claim. The consumers should be __________.
Version 1
28
A) more concerned about Type I error B) more concerned about Type II error C) concerned about both Type I and Type II errors D) not concerned at all
89) A company decided to test the hypothesis that the average time a company’s employees are spending to check their private e-mails at work is more than 10 minutes. A random sample of 60 employees were selected and they averaged 10.6 minutes. It is believed that the population standard deviation is 1.8 minutes. The α is set to 0.05. The p-value for this hypothesis test would be __________. A) 0.0049 B) 0.0315 C) 0.0188 D) 0.0564
90) A company decided to test the hypothesis that the average time a company’s employees are spending to check their private e-mails at work is more than 6 minutes. A random sample of 40 employees were selected and they averaged 6.6 minutes. It is believed that the population standard deviation is 1.7 minutes. The αis set to 0.05. The p-value for this hypothesis test would be __________. A) 0.0644 B) 0.0128 C) 0.0268 D) 0.0395
Version 1
29
91) The Department of Education would like to test the hypothesis that the average debt load of graduating students with a bachelor’s degree is equal to $16,800. A random sample of 25 students had an average debt load of $18,000. It is believed that the population standard deviation for student debt load is $4,100. The α is set to 0.05. The confidence interval for this hypothesis test would be __________. A) [$13,723.48, $19,876.62] B) [$14,443.68, $19,156.32] C) [$16,392.80, $19,607.20] D) [$15,884.28, $17,715.72]
92) The Department of Education would like to test the hypothesis that the average debt load of graduating students with a bachelor’s degree is equal to $17,000. A random sample of 34 students had an average debt load of $18,200. It is believed that the population standard deviation for student debt load is $4,200. The α is set to 0.05. The confidence interval for this hypothesis test would be __________. A) [$14,118.9, $19,881.2] B) [$14,839.1, $19,160.9] C) [$16,279.7, $17,720.3] D) [$16,788.22, $19,611.78]
93) The Department of Education would like to test the hypothesis that the average debt load of graduating students with a bachelor’s degree is equal to $15,600. A random sample of 36 students had an average debt load of $16,800. It is believed that the population standard deviation for student debt load is $4,140. The α is set to 0.05. The p-value for this hypothesis test would be __________. A) 0.041 B) 0.008 C) 0.082 D) 0.118
Version 1
30
94) The Department of Education would like to test the hypothesis that the average debt load of graduating students with a bachelor’s degree is equal to $17,000. A random sample of 34 students had an average debt load of $18,200. It is believed that the population standard deviation for student debt load is $4,200. The α is set to 0.05. The p-value for this hypothesis test would be __________. A) 0.1310 B) 0.0957 C) 0.0475 D) 0.0219
95) An advertisement for a popular weight-loss clinic suggests that participants in its new diet program lose, on average, more than 10 pounds. A consumer activist decides to test the authenticity of the claim. She follows the progress of 18 women who recently joined the weightreduction program. She calculates the mean weight loss of these participants as 10.8 pounds with a standard deviation of 2.4 pounds. Which of the following are appropriate hypotheses to test the advertisement’s claim? A) H0:μ < 10, HA:μ ≥ 10 B) H0:μ ≥ 10,HA:μ < 10 C) H0:μ ≤ 10,HA:μ > 10 D) H0:μ = 10,HA:μ ≠ 10
96) An advertisement for a popular weight-loss clinic suggests that participants in its new diet program lose, on average, more than 10 pounds. A consumer activist decides to test the authenticity of the claim. She follows the progress of 25 women who recently joined the weightreduction program. She calculates the mean weight loss of these participants as 10.8 pounds with a standard deviation of 1.7 pounds. The test statistic for this hypothesis would be __________.
Version 1
31
A) −2.68 B) −2.35 C) 2.68 D) 2.35
97) An advertisement for a popular weight-loss clinic suggests that participants in its new diet program lose, on average, more than 10 pounds. A consumer activist decides to test the authenticity of the claim. She follows the progress of 18 women who recently joined the weightreduction program. She calculates the mean weight loss of these participants as 10.8 pounds with a standard deviation of 2.4 pounds. The test statistic for this hypothesis would be __________. A) 1.41 B) −1.41 C) 1.74 D) −1.74
98) A television network is deciding whether or not to give its newest television show a spot during prime viewing time at night. If this is to happen, it will have to move one of its most viewed shows to another slot. The network conducts a survey asking its viewers which show they would rather watch. The network receives 827 responses, of which 428 indicate they would like to see the new show in the lineup. Which of the following is an appropriate hypotheses to test if the television network should give its newest show a spot during prime time at night? A) H0: p < 0.50, HA: p ≥ 0.50 B) H0: p ≥ 0.50, HA: p < 0.50 C) H0: p ≤ 0.50, HA: p > 0.50 D) H0: p = 0.50, HA: p ≠ 0.50
Version 1
32
99) A television network is deciding whether or not to give its newest television show a spot during prime viewing time at night. If this is to happen, it will have to move one of its most viewed shows to another slot. The network conducts a survey asking its viewers which show they would rather watch. The network receives 830 responses, of which 439 indicate that they would like to see the new show in the lineup. The test statistic for this hypothesis would be __________. A) 1.91 B) 1.71 C) 1.67 D) 2.01
100) A television network is deciding whether or not to give its newest television show a spot during prime viewing time at night. If this is to happen, it will have to move one of its most viewed shows to another slot. The network conducts a survey asking its viewers which show they would rather watch. The network receives 827 responses, of which 428 indicate that they would like to see the new show in the lineup. The test statistic for this hypothesis would be __________. A) 1.35 B) 1.05 C) 1.25 D) 1.01
101)
Which of the following statements concerning hypothesis testing is true?
A) If sample evidence is inconsistent with the null hypothesis, we do not reject the null hypothesis. B) If sample evidence is consistent with the null hypothesis, we prove that the null hypothesis is true. C) If we reject the null hypothesis, we establish that the evidence supports the status quo. D) If we are unable to reject the null hypothesis, then we maintain business as usual.
Version 1
33
102) The birth weight for babies is normally distributed with a mean of 7.5 lb and a standard deviation of 1.125 lb. Suppose a pediatrician claims that the average birth weight for babies under her care is greater than 7.5 lb. She randomly selects 30 newborns and produces an average birth weight of 8 lb. Which of the following is true if the pediatrician uses a p-value approach to implement a hypothesis test at a 5% significance level to support her claim? A) The null hypothesis is H0: µ = 7.5. B) A two-tailed hypothesis test should be used. C) The allowed probability of making a Type II error is 5%. D) The p-value = p( ≥ 7.5) = P(Z ≥ 2.434) = 0.0075
103) Construct the null and alternative hypotheses for the following claims. a. "The school’s mean GPA differs from 2.50 GPA." b. "The school’s mean GPA is less than 2.50 GPA."
104) Massachusetts Institute of Technology grants pirate certificates to those students who successfully complete courses in archery, fencing, sailing, and pistol shooting ("MIT Awards Pirate Certificates to Undergraduates," Boston Globe, March 3). Sheila claims that those students who go on to earn pirate certificates are able to hit a higher proportion of bull’s-eyes during the archery final exam than the course average of 0.15. Specify the null and alternative hypotheses to test her claim.
Version 1
34
105) An engineer is designing an experiment to test if airplane engines are faulty and unsafe to fly. The engineer expects 0.0001% of engines to be unsafe. The null hypothesis is that the probability of an unsafe engine is less than or equal to 0.0001% and the alternative hypothesis is that the probability the engine is unsafe is greater than 0.0001%. a. Describe the consequences of Type I and Type II errors. b. Should the engineer design the test with a higher alpha or a higher beta? Explain.
106) George W. Bush famously claimed in his 2003 State of the Union address that the United States had uncovered evidence that Iraq had developed weapons of mass destruction under Saddam Hussein. Consider the null hypothesis, "Iraq does not possess weapons of mass destruction." After deciding that Iraq did possess weapons of mass destruction and invading this country, the United States found no evidence of said weapons. Is this a Type I or a Type II error on the part of the United States? Explain.
107) A philanthropic organization helped a town in Africa dig several wells to gain access to clean water. Before the wells were in place, an average of 120 infants contracted typhoid each month. After the wells were installed, the philanthropic organization surveyed for nine months and found an average of 90 infants contracted typhoid per month. Assume that the population standard deviation is 40 and the number of infants that contract typhoid is normally distributed. a. Specify the null and alternative hypotheses to determine whether the average number of infants that contract typhoid has decreased because the wells were put in place. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, can you conclude that the number of babies falling ill due to typhoid has decreased? Explain.
Version 1
35
108) Billy wants to test whether the average speed of his favorite pitcher’s fastball differs from the league average of 92 miles per hour. He takes a sample of 36 of the pitcher’s fastballs and computes a sample mean of 94 miles per hour. Assume that the standard deviation of the population is 4 miles per hour. a. Specify the null and alternative hypotheses to test Billy’s claim. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, can you conclude that Billy’s favorite pitcher’s fastball differs in speed from the league average? d. At the 1% significance level, can you conclude that Billy’s favorite pitcher’s fastball differs in speed from the league average?
109) An engineer wants to know if the average amount of energy used in his factory per day has changed since last year. The factory used an average of 2,000 megawatt hours (mwh) per day prior to last year. Since last year, the engineer surveyed 400 days and found the average energy use was 2,040 mwh per day. Assume that the population standard deviation is 425 mwh per day. a. Specify the null and alternative hypotheses to determine whether the factory’s energy use has changed. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, can you conclude that the energy usage has changed? Explain.
110) A university wants to know if the average salary of its graduates has increased since 2015. The average salary of graduates prior to 2015 was $48,000. Since 2015, the university surveyed 256 graduates and found an average salary of $48,750. Assume that the standard deviation of all graduates’ salaries is $7,000. a. Specify the null and alternative hypotheses to determine whether the average salary of graduates has increased. b. Calculate the value of the test statistic and the p-value at a 5% significance level. c. At the 5% significance level, can you conclude that salaries have increased? Explain.
Version 1
36
111) Students who graduated from college last year with student loans owed an average of $25,250 (The New York Times, November 2). An economist wants to determine if the average debt has increased since last year. She takes a sample of 40 recent graduates and finds that their average debt was $28,275. Assume that the population standard deviation is $7,250. a. Specify the competing hypotheses to determine whether the average undergraduate debt has increased since last year. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, can you conclude that the average undergraduate debt has increased? Explain.
112) A professional baseball player changed his throwing motion to increase the velocity of his fastball. Before the change, the player threw his fastball at an average of 91 miles per hour. After the change in his throwing motion, the team watched him throw 64 fastballs with an average speed of 91.75 miles per hour and a standard deviation of 3 miles per hour. a. Specify the null and alternative hypotheses to determine whether the baseball player increased the speed of his fastball. b. Calculate the value of the test statistic and the p-value. c. At the 1% significance level, can you conclude that the baseball player has increased the speed of his fastball? Explain.
Version 1
37
113) A company that manufactures helmets wants to test their helmets to make sure they pass SNELL certification. One of the tests the helmets are required to undergo is a test that simulates real-world impacts. During the test, crash dummies must, on average, experience a force of nearly 300 g in order to pass. The company tests 36 helmets and finds the dummy experienced an average of 298 g with a standard deviation of 8 g. a. Specify the null and alternative hypotheses to determine whether the company’s helmets experience, on average, less than 300 g during the SNELL test. b. Calculate the value of the test statistic, and find the p-value at a 10% significance level. c. At the 10% significance level, can you conclude that the company’s helmets experience less than 300 g during the SNELL test? Explain.
114) A city is considering widening a busy intersection in town. Last year, the city reported 16,000 cars passed through the intersection per day. The city conducted a survey for 49 days this year and found an average of 17,000 cars passed through the intersection, with a standard deviation of 5,000. a. Specify the null and alternative hypotheses to determine whether the intersection has seen an increase in traffic. b. Calculate the value of the test statistic and the p-value. c. The city is going to widen the intersection if it believes traffic has increased. At the 5% significance level, can you conclude that the intersection has seen an increase in traffic? Should the city widen the intersection?
Version 1
38
115) A hairdresser believes that she is more profitable on Tuesdays, her lucky day of the week. She knows that, on average, she has a daily revenue of $250. She randomly samples the revenue from eight Tuesdays and finds she takes in $260, $245, $270, $260, $295, $235, $270, and $265. Assume that daily revenue is normally distributed. a. Specify the population parameter to be tested. b. Specify the null and alternative hypotheses to test the hairdresser’s claim. c. Calculate the sample mean revenue and the sample standard deviation. d. Compute the value of the appropriate test statistic. e. At the 10% significance level, calculate the p-value. f. At the 10% significance level, is the hairdresser’s claim supported by the data?
116) A portfolio manager claims that the mean annual return on one of the mutual funds he manages exceeds 8%. To substantiate his claim, he states that over the past 10 years, the mean annual return for the mutual fund has been 9.5% with a sample standard deviation of 1.5%. Assume annual returns are normally distributed. a. Specify the competing hypotheses to test the portfolio manager’s claim. b. Calculate the value of the test statistic. c. At the 5% significance level, use the p-value approach to state the decision rule. d. Is the portfolio manager’s claim substantiated by the data? Explain.
117) According to a Lawyers.com survey, only 35% of adult Americans had a will (The Wall Street Journal, December 12). Suppose a recent survey of 250 adult Americans found 100 adults with wills. a. Specify the competing hypotheses to determine whether the proportion of adult Americans who have a will differs from the proportion reported. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, is the proportion different? Explain.
Version 1
39
118) A real estate investor thinks the real estate market has bottomed out. One of the variables he examined to arrive at this conclusion was the proportion of houses sold at or above the asking price. Last year, the proportion of houses sold at or above the asking price was 14%. The real estate investor takes a random sample of 40 recently sold houses and finds that nine of them are selling at or above the asking price. a. Specify the population parameter to be tested. b. Specify the null and alternative hypotheses to determine whether the proportion of houses sold at or above the asking price has increased. c. Calculate the value of the test statistic and the p-value. d. At the 10% significance level, can you conclude that the proportion of houses sold at or above the asking price has increased?
119) A doctor thinks he has found a miraculous cure for rheumatoid arthritis. He thinks it will cure more than 50% of all cases within a year of first taking the drug. To test his drug, he runs a clinical trial on 400 patients with rheumatoid arthritis. He finds that 208 of the patients are symptom free within a year. a. Specify the null and alternative hypotheses to determine whether the proportion of patients that are cured by the drug exceeds 50%. b. Calculate the value of the test statistic, and find the p-value at a 5% significance level. c. At the 5% significance level, can you conclude that the percentage of patients who are cured by the drug exceeds 50%?
Version 1
40
120) The percentage of complete passes is an important measure for a quarterback in the NFL. A particular quarterback has a career completion percentage of 55%. The quarterback got a new coach and then played a few games where he completed 48 of 100 passes. a. Specify the null and alternative hypotheses to determine whether the proportion of completed passes has decreased with the new coach. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, can you conclude that the quarterback’s completion percentage has decreased?
121) A Massachusetts state police officer measured the speed of 100 motorists on the Massachusetts Turnpike and found that 70 exceeded the posted speed limit by more than 10 miles per hour. The police officer claims that more than 60% of motorists drive at least 10 miles per hour more than the posted speed on the turnpike. a. Specify the null and alternative hypotheses to test the officer’s claim. b. Calculate the value of the test statistic. c. At the 5% significance level, what is the p-value to test the officer’s claim? d. At the 5% significance level, does the evidence support the officer’s claim?
122) Customers complained that it has taken longer than they were told to get their prescriptions filled at ABC Pharmacy. The manager of the pharmacy wants to conduct a hypothesis test to determine if the average wait time for filling a prescription is within 10 minutes as promised. a. What are the null and alternative hypotheses? b. Is the manager conducting a one- or two-tailed test? c. Is the manager more concerned about a Type I or Type II error? Explain. d. Are customers more concerned about a Type I or Type II error? Explain.
Version 1
41
123) Suppose a pediatrician believes that the average birth weight for babies under her care differs from the national average of 7.5 lb. She randomly selects 35 newborns and records their weight as shown in the table below: Weight (lb) 7 ⋮ 8
{MISSING IMAGE}Click here for the Excel Data File a. What are the null and alternative hypotheses to test the pediatrician’s belief? b. Should a one- or two-tailed test be used? Explain. c. What is the sample mean and standard deviation? d. What is the value of the test statistic? e. What is the p-value? f. What is the conclusion to the hypothesis test at a 5% significance level? g. What happens to the conclusion of the hypothesis test at a significance level higher than 5%? Explain.
124) According to the National Center for Education Statistics, the 6-year graduation rate is 60% at public institutions. In order to determine if the 6-year graduation rate of private colleges in California is higher than 60%, the 6-year graduation rate for 50 private colleges in California is collected, as shown in the table below: 6-Year Graduation Rate 80 ⋮ 74
Version 1
42
{MISSING IMAGE}Click here for the Excel Data File a. State the null and alternative hypotheses to test the 6-year graduation rate of private colleges in California as claimed. b. Is the test one- or two-tailed? Explain. c. Is the normality condition satisfied? Explain. d. What is the value of the test statistic? e. What is the p-value? f. What is the conclusion if α = 0.1?
125)
The null hypothesis typically corresponds to a presumed default state of nature. ⊚ true ⊚ false
126)
The alternative hypothesis typically agrees with the status quo. ⊚ true ⊚ false
127) On the basis of sample information, we either "accept the null hypothesis" or "reject the null hypothesis." ⊚ true ⊚ false
128) As a general guideline, we use the alternative hypothesis as a vehicle to establish something new, or contest the status quo, for which a corrective action may be required. ⊚ true ⊚ false
Version 1
43
129) In a one-tailed test, the rejection region is located under one tail (left or right) of the corresponding probability distribution, while in a two-tailed test this region is located under both tails. ⊚ true ⊚ false
130)
A Type I error is committed when we reject the null hypothesis, which is actually true. ⊚ true ⊚ false
131) A Type II error is made when we reject the null hypothesis and the null hypothesis is actually false. ⊚ true ⊚ false
132) For a given sample size, any attempt to reduce the likelihood of making one type of error (Type I or Type II) will increase the likelihood of the other error. ⊚ true ⊚ false
133) A hypothesis test regarding the population mean µ is based on the sampling distribution of the sample mean ⊚ true ⊚ false
Version 1
44
134) Under the assumption that the null hypothesis is true as an equality, the p-value is the likelihood of observing a sample mean that is at least as extreme as the one derived from the given sample. ⊚ true ⊚ false
Version 1
45
Answer Key Test name: Chap 09_4e 1) null 2) alternative 3) Type I 4) significance level 5) very strong 6) symmetry 7) NORM.DIST 8) T.DIST.RT 9) B 10) D 11) B 12) D 13) A 14) A 15) A 16) A 17) B 18) D 19) A 20) A 21) A 22) D 23) A 24) A 25) A 26) B Version 1
46
27) B 28) C 29) A 30) D 31) A 32) A 33) A 34) B 35) C 36) D 37) A 38) D 39) C 40) D 41) B 42) A 43) B 44) A 45) B 46) D 47) D 48) B 49) B 50) A 51) C 52) D 53) C 54) C 55) D 56) A Version 1
47
57) D 58) C 59) A 60) B 61) D 62) A 63) A 64) A 65) C 66) C 67) A 68) B 69) A 70) A 71) C 72) C 73) C 74) A 75) D 76) D 77) B 78) A 79) D 80) B 81) B 82) A 83) C 84) B 85) C 86) A Version 1
48
87) B 88) A 89) A 90) B 91) C 92) D 93) C 94) B 95) C 96) D 97) A 98) C 99) C 100) D 101) D 102) D 103) a. H0: μ = 2,50, HA: μ ≠ 2.50 b. H0: μ ≥ 2,50, HA: μ < 2.50 104) H0: p ≤ 0.15, HA: p > 0.15. 105) a. Type I error—ground planes that are safe; Type II error—fly planes that are unsafe. b. Higher alpha 106) Type I; the United States appeared to have erroneously rejected the null hypothesis. 107) a. Ho: μ ≥ 120,HA: μ < 120 b. z = −2.250, p-value = 0.0122 c. Yes, reject H0 because 0.0122 <0.05 108) a. H0: μ = 92,HA: μ ≠ 92 b. z = 3.00, p-value = 0.0027 c. Yes d. Yes
Version 1
49
109) a. H0: μ = 2,000, HA: μ ≠ 2,000 b. z = 1.882, p-value = 0.0598 c. No, fail to reject H0 because 0.0598 > 0.05. 110) a. H0: μ ≤ 48,HA: μ > 48 b. z = 1.714, p-value = 0.0432 c. Yes, reject H0 because 0.0432 > 0.05. 111) a. H0: μ ≤ 25,250,HA: μ > 25,250 b. z = 2.64, p-value = 0.0042 c. Yes, reject H0 because 0.0042<0.05. 112) a. H0: μ ≤ 91, HA: μ > 91 b. t63 = 2.00, p-value = T.DIST.RT(2.00,64-1) = 0.0249 c. No, fail to reject H0 because 0.0249 > 0.01. 113) a. H0: μ ≥ 300,HA: μ < 300 b. t35 = −1.50, p-value = T.DIST(-1.5,36-1,TRUE) = 0.0713 c. Yes, reject H0 because 0.0713 < 0.10. 114) a. H0: μ ≤ 16,000, HA: μ > 16,000 b. t48 = 1.40, p-value = T.DIST.RT(1.4,49-1) = 0.0840 c. Yes, fail to reject H0 because the p-value > 0.05. 115) a. The mean revenue on Tuesdays. b. H0: μ ≤ 250, HA: μ > 250 c. d. t7 = 1.972 e. p-value = T.DIST.RT(1.972,8-1) = 0.0446 f. Yes, 0.0446 < 0.05 116) a. H0: μ ≤ 8,HA: μ > 8 b. t9 = 3.16 c. p-value = T.DIST.RT(3.16,10-1) = 0.0058; Reject H0 0.0058 < 0.05. d. Reject H0,yes he is. 117) a. H0:p = 0.35, HA:p ≠ 0.35 b. z = 1.66, p-value = 0.097 c. Because 0.097 > 0.05, do not reject H0; at the 5% significance level, we cannot conclude that the proportion differs from 0.35. 118) a. The current proportion of houses sold at or above the asking price. b. H0: p ≤ 0.14,HA: p > 0.14 c. z = 1.549, p-value = 0.0607. d. Yes, reject H0 because p-value 0.0607 < 0.10. 119) a. H0: p ≤ 0.50,HA: p > 0.50. b. z = 0.80, p-value=1-NORM.S.DIST(0.80,TRUE) = 0.2119. c. Fail to reject H0 because the 0.2119 > 0.05.
Version 1
50
120) a. H0: p ≥ 0.55, HA: p < 0.55 b. z = –1.407, p-value = 0.0797 c. Fail to reject H0 because 0.0797 > 0.05. 121) a. H0: p ≤ 0.60, HA: p > 0.60 b. z = 2.041 c. p-value=1-NORM.S.DIST(2.041,TRUE) = 0.0206 d. Yes
122) a. H0: the average wait time is ≤ 10 minutes. H1: the average wait time is > 10 minutes. b. One-tailed test. c. Type I. The manager does not want to reject H0 when it is true. d. Type II. The customers do not want to accept H0 when it is false. 123) a. H0: µ = 7.5; HA: µ ≠ 7.5 b. A two-tailed test should be used because the test is about finding evidence to support that the average birth weight differs from 7.5 lb, i.e., it may be higher or lower than 7.5 lb. c. = 7.3429, s = 1.7480 d. t34 = −0.5319 e. p-value = 0.5983 f. The average birth weight for babies under the pediatrician’s care does not differ from the national average of 7.5 lb. g. The conclusion is the same unless the significance level is raised to 60% because at 60%, the p-value of 0.5983 is less than α=0.6. 124) a. H0: p ≤ 0.6; HA: p > 0.6 b. This is a one-tailed test because we want to determine if there is substantial evidence that the 6-year graduation rate of private colleges in California is higher than 60%. c. np = 35 and n(1 − p) = 15 both are ≥ 5. The normality condition is satisfied. d. z = 1.5473 e. p-value = 0.0609 f. The 6-year graduation rate of private colleges in California is greater than 60%.
125) TRUE 126) FALSE 127) FALSE 128) TRUE 129) TRUE 130) TRUE 131) FALSE Version 1
51
132) TRUE 133) TRUE 134) TRUE
Version 1
52
CHAPTER 10 – 4e 1) Statistical inference about the differences between two population means is based on _________ random samples.
2)
A left-tailed test determines whether μ1 is _____ than μ2.
3) To use R for solving hypothesis tests for the difference between two means, use the function _________ .
4)
A common case of dependent sampling is usually referred to as __________ sampling.
5) <p>When constructing a confidence interval for the mean difference same general format of point estimate ± _________ .
, we follow the
6) To use Excel for solving hypothesis tests for the mean difference you should choose: Data >__________> t-Test: Paired Two Sample for Means options.
7)
<p>We estimate the unknown standard deviation of
by the _________ .
8)
If the hypothesized difference d0 is ______, then the value of the test statistic is
computed as
Version 1
1
9)
Two or more random samples are considered independent if _________ .
A) the process that generates one sample is the same as the process that generates the other sample B) the process that generates one sample is related to the process that generates the other sample C) the process that generates one sample is influenced by the process that generates the other sample D) the process that generates one sample is completely separate from the process that generates the other sample
10) The choice of an appropriate test for comparing two population means depends on whether we deal with the following except: A) categorical or numerical data B) independent or matched-pairs sampling C) the equality or lack of equality of population variances D) the sum or product of a single parameter
11)
Which of the following is not a restriction for comparing two population means? A) <p>A normally distributed sampling distribution of B) Matched-pairs sampling C) Independent random samples D) Equal or unequal variances
12)
When comparing two population means, their hypothesized difference _________ . A) must be negative B) must be positive C) must be zero D) may assume any value
Version 1
2
13) Suppose you want to perform a test to compare the mean GPA of all freshmen with the mean GPA of all sophomores in a college? What type of sampling is required for this test? A) Independent sampling with categorical data B) Independent sampling with numerical data C) Matched-pairs sampling with categorical data D) Matched-pairs sampling with numerical data
14) <p>If the underlying populations cannot be assumed to be normal, then by the central limit theorem, the sampling distribution of is approximately normal only if both sample sizes are sufficiently large—that is, when _________ . A) n1 + n2 = 30 B) n1 + n2 ≥ 30 C) n1 = 30 and n2 = 30 D) n1 ≥ 30 and n2 ≥ 30
15) Which of the following pairs of hypotheses are used to test if the mean of the first population is smaller than the mean of the second population, using independent random sampling? A) H0: µD ≤ 0, HA: µD > 0 B) H0: µD ≥ 0, HA: µD < 0 C) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 D) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0
16) <p>A demographer wants to measure life expectancy in countries 1 and 2. Let and denote the mean life expectancy in countries 1 and 2, respectively. Specify the hypothesis to determine if life expectancy in country 1 is more than 10 years lower than in country 2.
Version 1
3
A) H0: µ1 – µ2 ≤ 10, HA: µ1 – µ2 > 10 B) H0: µ1 – µ2 ≥ 10, HA: µ1 – µ2 < 10 C) H0: µ1 – µ2 ≤ –10, HA: µ1 – µ2 > −10 D) H0: µ1 – µ2 ≥ –10, HA: µ1 – µ2 < −10
17) <p>When calculating the standard error of the sample variances and ?
, under what assumption do you pool
A) Known population variances. B) Unknown population variances that are assumed equal. C) Unknown population variances that are assumed unequal. D) The two sample means are derived from two dependent populations.
18) What formula is used to calculate the degrees of freedom for the t test for comparing two population means when the population variances are unknown and unequal? A) df = n − 1 B) df = n1 + n2 − 2 C) df = n1 + n2 − 1
D)
19) When testing the difference between two population means and the population variances are unknown and unequal, the degrees of freedom is calculated as 34.7. What degrees of freedom should be used to find the p-value of the test?
Version 1
4
A) 34 B) 34.7 C) 35 D) 33
20) When testing the difference between two population means under independent sampling, we use the z distribution if _________ . A) the population variances are known B) the population variances are unknown, but assumed to be equal C) the population variances are unknown and cannot be assumed equal D) Both the population variances are known and the population variances are unknown, but assumed to be equal
21)
<p>If the sampling distribution of
cannot be assumed normal, we _________ .
A) are unable to compute a confidence interval B) are still able to use the z distribution to compute a confidence interval C) must use the tdf distribution with n1 + n2 − 2 degrees of freedom to compute a confidence interval D) must use the tdf distribution and calculate the degrees of freedom using n1, n2, s1, and s2 to compute a confidence interval
22) Assume the competing hypotheses take the following form H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0, where µ1 is the population mean for population 1 and µ2 is the population mean for population 2. Also assume that the populations are normally distributed, the variances are known, and independent sampling is used. Which of the following expressions is the appropriate test statistic?
Version 1
5
A) B) C) D)
23) Assume the competing hypotheses take the following form: H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0, where µ1 is the population mean for population 1 and µ2 is the population mean for population 2. Also assume that the populations are normally distributed and we use independent sampling. The population variances are not known and assumed to be unequal. Which of the following expressions is the appropriate test statistic? A) B) C) D)
24) Assume the competing hypotheses take the following form: H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0, where µ1 is the population mean for population 1 and µ2 is the population mean for population 2. Also assume that the populations are normally distributed and that the observations in the two samples are independent. The population variances are not known but are assumed equal. Which of the following expressions is appropriate test statistic?
Version 1
6
A) B) C) D)
25) What type of test for population means should be performed when employees are first tested, trained, and then retested? A) A z test under independent sampling with known population variances. B) A t test under independent sampling with unknown but equal population variances. C) A t test under dependent sampling. D) A t test under independent sampling with unknown and unequal population variances.
26) A particular personal trainer works primarily with track and field athletes. She believes that her clients run faster after going through her program for six weeks. How might she test that claim? A) <p>A hypothesis test for B) <p>A hypothesis test for C) A matched-pairs hypothesis test for μD. D) We are unable to conduct a hypothesis test because the samples would not be independent.
27) What type of data is required to compare prices of the same textbooks sold by two different vendors?
Version 1
7
A) Dependent random samples with categorical data. B) Dependent random samples with numerical data. C) Independent random samples with categorical data. D) Independent random samples with numerical data.
28) Which of the following set of hypotheses is used to test if the mean of the first population is smaller than the mean of the second population, using matched-paired sampling? A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: µD ≤ 0, HA: µD > 0 D) H0: µD ≥ 0,HA: µD < 0
29) What type of data should be collected to compare the likelihoods of two candidates winning their elections when the candidates are running in different elections? A) Matched-pairs sampling with categorical data. B) Matched-pairs sampling with numerical data. C) Independent sampling with categorical data. D) Independent sampling with numerical data.
30) You would like to determine if there is a higher incidence of smoking among women than among men in a neighborhood. Let women and men be represented by populations 1 and 2, respectively. Which of the following hypotheses is relevant to this claim? A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 D) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0
Version 1
8
31) You would like to determine if there is a higher incidence of smoking among women than among men in a neighborhood. Let men and women be represented by populations 1 and 2, respectively. Which of the following hypotheses is relevant to this claim? A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 D) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0
32) Which of the following is appropriate to conduct a hypothesis test for the difference between two population proportions under independent sampling. A) In every case. B) <p>Only if C) <p>Only if D) <p>Only if
and
.</p> , and , and
33) When the hypothesized difference of the population proportions is equal to 0, we _________ . A) use a matched-pairs approach B) <p>are able to estimate the standard error of using the pooled C) can use the confidence interval to implement the test if the difference of the population proportions is equal to 0 D) <p>Both estimate the standard error of using the pooled and use the confidence interval to implement the test if the difference of the population proportions is equal to 0</p>
Version 1
9
34) A particular bank has two loan modification programs for distressed borrowers: Home Affordable Modification Program (HAMP) modifications, where the federal government pays the bank $1,000 for each successful modification, and non-HAMP modifications, where the bank does not receive a bonus from the federal government. To qualify for a HAMP modification, borrowers must meet a set of financial suitability criteria. What type of hypothesis test should we use to test whether borrowers from this particular bank who receive HAMP modifications are more likely to re-default than those who receive non-HAMP modifications? A) A hypothesis test for p1 − p2. B) A hypothesis test for µ1 − µ2. C) A matched-pairs hypothesis test. D) We are unable to conduct a hypothesis test because independent random samples of each group could not be collected.
35) A particular bank has two loan modification programs for distressed borrowers: Home Affordable Modification Program (HAMP) modifications, where the federal government pays the bank $1,000 for each successful modification, and non-HAMP modifications, where the bank does not receive a bonus from the federal government. To qualify for a HAMP modification, borrowers must meet a set of financial suitability criteria. Define the null and alternative hypotheses to test whether borrowers who receive HAMP modifications default less than borrowers who receive non-HAMP modifications. Let p1 and p2 represent the proportion of borrowers who received HAMP and non-HAMP modifications that did not re-default, respectively. A) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 B) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0 C) H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 D) H0: p1 − p2 > 0, HA: p1 − p2 ≤ 0
Version 1
10
36) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 23 pounds per batch and fertilizer from distributor B contained 18 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is four pounds per batch and five pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizer’s A and B, respectively. Specify the competing hypotheses to determine if fertilizer A contains more nitrogen per batch than fertilizer B. A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 D) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0
37) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 31 pounds per batch and fertilizer from distributor B contained 21 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is seven pounds per batch and eight pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizer’s A and B, respectively. Which of the following is the correct value of the test statistic? A) t6 = 1.8814 B) z = –1.8814 C) t6 = –1.8814 D) z = 1.8814
Version 1
11
38) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 23 pounds per batch and fertilizer from distributor B contained 18 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is four pounds per batch and five pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizer’s A and B, respectively. Which of the following is the correct value of the test statistic? A) t6 = −1.5617 B) t6 = 1.5617 C) z = −1.5617 D) z = 1.5617
39) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 35 pounds per batch and fertilizer from distributor B contained 24 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is four pounds per batch and five pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizers A and B, respectively. Which of the following is an appropriate p-value? A) 0.0003 B) 0.0258 C) 0.0307 D) 0.0595
Version 1
12
40) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 23 pounds per batch and fertilizer from distributor B contained 18 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is four pounds per batch and five pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizers A and B, respectively. Which of the following is an appropriate p-value? A) 0.0592 B) 0.0847 C) 0.0896 D) 0.1184
41) A farmer uses a lot of fertilizer to grow his crops. The farmer’s manager thinks fertilizer products from distributor A contain more of the nitrogen that his plants need than distributor B’s fertilizer does. He takes two independent samples of four batches of fertilizer from each distributor and measures the amount of nitrogen in each batch. Fertilizer from distributor A contained 23 pounds per batch and fertilizer from distributor B contained 18 pounds per batch. Suppose the population standard deviation for distributor A and distributor B is four pounds per batch and five pounds per batch, respectively. Assume the distribution of nitrogen in fertilizer is normally distributed. Let µ1 and µ2 represent the average amount of nitrogen per batch for fertilizers A and B, respectively. Which of the following is the appropriate conclusion at the 5% significance level? A) Reject H0; we can conclude that the mean amount of fertilizer per batch for distributor A is greater than the amount of fertilizer per batch for distributor B. B) Reject H0; we cannot conclude that the mean amount of fertilizer per batch for distributor A is greater than the amount of fertilizer per batch for distributor B. C) Fail to reject H0; we can conclude that the mean amount of fertilizer per batch for distributor A is greater than the amount of fertilizer per batch for distributor B. D) Fail to reject H0; we cannot conclude that the mean amount of fertilizer per batch for distributor A is greater than the amount of fertilizer per batch for distributor B.
Version 1
13
42) Calcium is an essential nutrient for strong bones and for controlling blood pressure and heart beat. Because most of the body’s calcium is stored in bones and teeth, the body withdraws the calcium it needs from the bones. Over time, if more calcium is taken out of the bones than is put in, the result may be thin, weak bones. This is especially important for women who are often recommended a calcium supplement. A consumer group activist assumes that calcium content in two popular supplements are normally distributed with the same unknown population variance, and uses the following information obtained under independent sampling: Supplement 1
Supplement 2
Let μ1 and μ2 denote the corresponding population means. Construct a 95% confidence interval for the difference μ1 − μ2. A) [−26.1657, 3.8343] B) [−39.5741, 5.5741] C) [−29.0278, 4.5736] D) [−26.8157, −3.1843]
43) Calcium is an essential nutrient for strong bones and for controlling blood pressure and heart beat. Because most of the body’s calcium is stored in bones and teeth, the body withdraws the calcium it needs from the bones. Over time, if more calcium is taken out of the bones than is put in, the result may be thin, weak bones. This is especially important for women who are often recommended a calcium supplement. A consumer group activist assumes that calcium content in two popular supplements are normally distributed with the same unknown population variance, and uses the following information obtained under independent sampling: Supplement 1
Supplement 2
Let μ1 and μ2 denote the corresponding population means. Construct a 95% confidence interval for the difference μ1 − μ2. A) [−30.9386, 1.0614] B) [−31.5886, −0.4114] C) [−33.8007, 1.8007] D) [−34.8012, 2.8012]
Version 1
14
44) Calcium is an essential nutrient for strong bones and for controlling blood pressure and heart beat. Because most of the body’s calcium is stored in bones and teeth, the body withdraws the calcium it needs from the bones. Over time, if more calcium is taken out of the bones than is put in, the result may be thin, weak bones. This is especially important for women who are often recommended a calcium supplement. A consumer group activist assumes that calcium content in two popular supplements are normally distributed with the same unknown population variance, and uses the following information obtained under independent sampling: Supplement 1
Supplement 2
Let μ1 and μ2 denote the corresponding population means. Can we conclude that the average calcium content of the two supplements differs at the 95% confidence level? A) No, because the 95% confidence interval contains the hypothesized value of zero. B) Yes, because the 95% confidence interval contains the hypothesized value of zero. C) No, because the 95% confidence interval does not contain the hypothesized value of zero. D) Yes, because the 95% confidence interval does not contain the hypothesized value of zero.
45) A restaurant chain has two locations in a medium-sized town and, believing that it has oversaturated the market for its food, is considering closing one of the restaurants. The manager of the restaurant with a downtown location claims that his restaurant generates more revenue than the sister restaurant by the freeway. The CEO of this company, wishing to test this claim, randomly selects 36 monthly revenue totals for each restaurant. The revenue data from the downtown restaurant have a mean of $360,000 and a standard deviation of $50,000, while the data from the restaurant by the freeway have a mean of $340,000 and a standard deviation of $40,000. Assume there is no reason to believe the population standard deviations are equal, and let μ1 and μ2 denote the mean monthly revenue of the downtown restaurant and the restaurant by the freeway, respectively. Which of the following hypotheses should be used to test the manager’s claim?
Version 1
15
A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0 D) H0: µ1 − µ2 > 0, HA: µ1 − µ2 ≤ 0
46) A restaurant chain has two locations in a medium-sized town and, believing that it has oversaturated the market for its food, is considering closing one of the restaurants. The manager of the restaurant with a downtown location claims that his restaurant generates more revenue than the sister restaurant by the freeway. The CEO of this company, wishing to test this claim, randomly selects 18 monthly revenue totals for each restaurant. The revenue data from the downtown restaurant have a mean of $364,000 and a standard deviation of $40,000, while the data from the restaurant by the freeway have a mean of $350,000 and a standard deviation of $41,000. Assume there is no reason to believe the population standard deviations are equal, and let μ1 and μ2 denote the mean monthly revenue of the downtown restaurant and the restaurant by the freeway, respectively. Which of the following is the correct value of the test statistic to analyze the claim? A) t33 = 1.037 B) t31 = 1.037 C) t33 = 1.011 D) t31 = 1.011
47) A restaurant chain has two locations in a medium-sized town and, believing that it has oversaturated the market for its food, is considering closing one of the restaurants. The manager of the restaurant with a downtown location claims that his restaurant generates more revenue than the sister restaurant by the freeway. The CEO of this company, wishing to test this claim, randomly selects 36 monthly revenue totals for each restaurant. The revenue data from the downtown restaurant have a mean of $360,000 and a standard deviation of $50,000, while the data from the restaurant by the freeway have a mean of $340,000 and a standard deviation of $40,000. Assume there is no reason to believe the population standard deviations are equal, and let μ1 and μ2 denote the mean monthly revenue of the downtown restaurant and the restaurant by the freeway, respectively. Which of the following is the correct value of the test statistic to analyze the claim?
Version 1
16
A) t66 = 1.848 B) t67 = 1.848 C) t66 = 1.874 D) t67 = 1.874
48) A restaurant chain has two locations in a medium-sized town and, believing that it has oversaturated the market for its food, is considering closing one of the restaurants. The manager of the restaurant with a downtown location claims that his restaurant generates more revenue than the sister restaurant by the freeway. The CEO of this company, wishing to test this claim, randomly selects 36 monthly revenue totals for each restaurant. The revenue data from the downtown restaurant have a mean of $360,000 and a standard deviation of $50,000, while the data from the restaurant by the freeway have a mean of $340,000 and a standard deviation of $40,000. Assume there is no reason to believe the population standard deviations are equal, and let μ1 and μ2 denote the mean monthly revenue of the downtown restaurant and the restaurant by the freeway, respectively. Which of the following is the appropriate critical value(s) to test the manager’s claim at the 5% significance level? A) 1.645 B) 1.668 C) 1.960 D) 1.997
49) A restaurant chain has two locations in a medium-sized town and, believing that it has oversaturated the market for its food, is considering closing one of the restaurants. The manager of the restaurant with a downtown location claims that his restaurant generates more revenue than the sister restaurant by the freeway. The CEO of this company, wishing to test this claim, randomly selects 36 monthly revenue totals for each restaurant. The revenue data from the downtown restaurant have a mean of $360,000 and a standard deviation of $50,000, while the data from the restaurant by the freeway have a mean of $340,000 and a standard deviation of $40,000. Assume there is no reason to believe the population standard deviations are equal, and let μ1 and μ2 denote the mean monthly revenue of the downtown restaurant and the restaurant by the freeway, respectively. At the 5% significance level, does the evidence support the manager’s claim?
Version 1
17
A) No, because the test statistic value is less than the critical value. B) Yes, because the test statistic value is less than the critical value. C) No, because the test statistic value is greater than the critical value. D) Yes, because the test statistic value is greater than the critical value.
50) A 7,000-seat theater is interested in determining whether there is a difference in attendance between shows on Tuesday evening and those on Wednesday evening. A random sample of 25 weeks is collected for Tuesday; a different sample of 25 weeks is collected for Wednesday. The mean attendance on Tuesday evening is calculated as 5,500, while the mean attendance on Wednesday evening is calculated as 5,850. The known population standard deviation for attendance on Tuesday evening is 550 and the known population standard deviation for attendance on Wednesday evening is 445. Let μ1 be the population mean of Tuesday, μ2 be the population mean of Wednesday, and μD be the mean difference for a matched-pairs sampling. What are the appropriate hypotheses to determine whether there is a difference, on average, in attendance between shows on Tuesday evening and Wednesday evening? A) H0: µD ≥ 0, HA: µD < 0 B) H0: µD = 0, HA: µD ≠ 0 C) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0 D) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0
51) A 7,000-seat theater is interested in determining whether there is a difference in attendance between shows on Tuesday evening and those on Wednesday evening. Two independent samples of 32 weeks are collected for Tuesday and Wednesday. The mean attendance on Tuesday evening is calculated as 5,450, while the mean attendance on Wednesday evening is calculated as 5,740. The known population standard deviation for attendance on Tuesday evening is 620 and the known population standard deviation for attendance on Wednesday evening is 480. Which of the following is the value of the appropriate test statistic? A) tdf = 2.0922 B) tdf = −2.0922 C) z = −2.0922 D) z = 2.0922
Version 1
18
52) A 7,000-seat theater is interested in determining whether there is a difference in attendance between shows on Tuesday evening and those on Wednesday evening. Two independent samples of 25 weeks are collected for Tuesday and Wednesday. The mean attendance on Tuesday evening is calculated as 5,500, while the mean attendance on Wednesday evening is calculated as 5,850. The known population standard deviation for attendance on Tuesday evening is 550 and the known population standard deviation for attendance on Wednesday evening is 445. Which of the following is the value of the appropriate test statistic? A) z = −2.4736 B) z = 2.4736 C) tdf = −2.4736 D) tdf = 2.4736
53) A 7,000-seat theater is interested in determining whether there is a difference in attendance between shows on Tuesday evening and those on Wednesday evening. Two independent samples of 16 weeks are collected for Tuesday and Wednesday. The mean attendance on Tuesday evening is calculated as 5,420, while the mean attendance on Wednesday evening is calculated as 5,970. The known population standard deviation for attendance on Tuesday evening is 560 and the known population standard deviation for attendance on Wednesday evening is 470. Which of the following is the appropriate decision given a 5% level of significance? A) Do not conclude that the mean attendance differs because the p-value = 0.0013 < 0.05. B) Conclude that the mean attendance differs because the p-value = 0.0026 < 0.05. C) Do not conclude that the mean attendance differs because the p-value = 0.0026 < 0.05. D) Conclude that the mean attendance differs because the p-value = 0.0013 < 0.05.
Version 1
19
54) A 7,000-seat theater is interested in determining whether there is a difference in attendance between shows on Tuesday evening and those on Wednesday evening. Two independent samples of 25 weeks are collected for Tuesday and Wednesday. The mean attendance on Tuesday evening is calculated as 5,500, while the mean attendance on Wednesday evening is calculated as 5,850. The known population standard deviation for attendance on Tuesday evening is 550 and the known population standard deviation for attendance on Wednesday evening is 445. Which of the following is the appropriate decision given a 5% level of significance? A) Conclude that the mean attendance differs because the p-value = 0.0067 < 0.05. B) Conclude that the mean attendance differs because the p-value = 0.0134 < 0.05. C) Do not conclude that the mean attendance differs because the p-value = 0.0067 < 0.05. D) Do not conclude that the mean attendance differs because the p-value = 0.0134 < 0.05.
55) A producer of fine chocolates believes that the sales of two varieties of truffles differ significantly during the holiday season. The first variety is milk chocolate while the second is milk chocolate filled with mint. It is reasonable to assume that truffle sales are normally distributed with unknown but equal population variances. Two independent samples of 18 observations each are collected for the holiday period. A sample mean of 12 million milk chocolate truffles sold with a sample standard deviation of 2.5 million. A sample mean of 13.5 million truffles filled with mint sold with a sample standard deviation of 2.3 million. Use milk chocolate as population 1 and mint chocolate as population 2. Which of the following are the appropriate hypotheses to determine if the average sales of the two varieties of truffles differ significantly during the holiday season? A) H0: µD ≥ 0, HA: µD < 0 B) H0: µD = 0, HA: µD ≠ 0 C) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 D) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0
Version 1
20
56) A producer of fine chocolates believes that the sales of two varieties of truffles differ significantly during the holiday season. The first variety is milk chocolate while the second is milk chocolate filled with mint. It is reasonable to assume that truffle sales are normally distributed with unknown but equal population variances. Two independent samples of 18 observations each are collected for the holiday period. A sample mean of 12 million milk chocolate truffles sold with a sample standard deviation of 2.5 million. A sample mean of 13.5 million truffles filled with mint sold with a sample standard deviation of 2.3 million. Use milk chocolate as population 1 and mint chocolate as population 2. Assuming the population variances are equal, which of the following is the value of the appropriate test statistic? A) z = −1.8734 B) z = 1.8734 C) t34 = −1.8734 D) t34 = 1.8734
57) A producer of fine chocolates believes that the sales of two varieties of truffles differ significantly during the holiday season. The first variety is milk chocolate while the second is milk chocolate filled with mint. It is reasonable to assume that truffle sales are normally distributed with unknown but equal population variances. Two independent samples of 18 observations each are collected for the holiday period. A sample mean of 12 million milk chocolate truffles sold with a sample standard deviation of 2.5 million. A sample mean of 13.5 million truffles filled with mint sold with a sample standard deviation of 2.3 million. Use milk chocolate as population 1 and mint chocolate as population 2. Which of the following is the appropriate decision given a 5% level of significance? A) Conclude that the average milk chocolate and mint chocolate sales differ because the p-value is greater than 0.05. B) Conclude that the average milk chocolate and mint chocolate sales do not differ because the p-value is less than 0.05. C) Do not conclude that the average milk chocolate and mint chocolate sales differ because the p-value is greater than 0.05. D) Do not conclude that the average milk chocolate and mint chocolate sales do not differ because the p-value is less than 0.05.
Version 1
21
58) A university wants to compare out-of-state applicants’ mean SAT math scores (μ1) to instate applicants; mean SAT math scores (μ2). The university looks at 35 in-state applicants and 35 out-of-state applicants. The mean SAT math score for in-state applicants was 540, with a standard deviation of 20. The mean SAT math score for out-of-state applicants was 555, with a standard deviation of 25. It is reasonable to assume the corresponding population standard deviations are equal. To calculate the confidence interval for the difference μ1 − μ2, what is the number of degrees of freedom of the appropriate probability distribution? A) 64 B) 64.87 C) 68 D) 69
59) A university wants to compare out-of-state applicants’ mean SAT math scores (μ1) to instate applicants’ mean SAT math scores (μ2). The university looks at 35 in-state applicants and 35 out-of-state applicants. The mean SAT math score for in-state applicants was 540, with a standard deviation of 20. The mean SAT math score for out-of-state applicants was 555, with a standard deviation of 25. It is reasonable to assume the corresponding population standard deviations are equal. Calculate a 95% confidence interval for the difference μ1 – μ2. A) [−25.6067,−4.3933] B) [−25.7961, −4.2039] C) [−25.8124, −4.1876] D) [−33.6105, 3.6105]
60) A university wants to compare out-of-state applicants’ mean SAT math scores (μ1) to instate applicants’ mean SAT math scores (μ2). The university looks at 35 in-state applicants and 35 out-of-state applicants. The mean SAT math score for in-state applicants was 540, with a standard deviation of 20. The mean SAT math score for out-of-state applicants was 555, with a standard deviation of 25. It is reasonable to assume the corresponding population standard deviations are equal. At the 5% significance level, can the university conclude that the mean SAT math score for in-state students and out-of-state students differ?
Version 1
22
A) No, because the confidence interval contains zero. B) Yes, because the confidence interval contains zero. C) No, because the confidence interval does not contain zero. D) Yes, because the confidence interval does not contain zero.
61) A statistics professor at a large university hypothesizes that students who take statistics in the morning typically do better than those who take it in the afternoon. He takes a random sample of 36 students who took a morning class and, independently, another random sample of 36 students who took an afternoon class. He finds that the morning group scored an average of 74 with a standard deviation of 8, while the evening group scored an average of 68 with a standard deviation of 10. The population standard deviation of scores is unknown but is assumed to be equal for morning and evening classes. Let µ1 and µ2 represent the population mean final exam scores of statistics’ courses offered in the morning and the afternoon, respectively. Which of the following hypotheses will test the professor’s claim? A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: µ1 − µ2 > 0, HA: µ1 − µ2 ≤ 0 D) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0
62) A statistics professor at a large university hypothesizes that students who take statistics in the morning typically do better than those who take it in the afternoon. He takes a random sample of 36 students who took a morning class and, independently, another random sample of 36 students who took an afternoon class. He finds that the morning group scored an average of 74 with a standard deviation of 8, while the evening group scored an average of 68 with a standard deviation of 10. The population standard deviation of scores is unknown but is assumed to be equal for morning and evening classes. Let µ1 and µ2 represent the population mean final exam scores of statistics’ courses offered in the morning and the afternoon, respectively. Compute the appropriate test statistic to analyze the claim at the 1% significance level.
Version 1
23
A) t70 = −2.811 B) t70 = 2.811 C) z = −2.811 D) z = 2.811
63) A statistics professor at a large university hypothesizes that students who take statistics in the morning typically do better than those who take it in the afternoon. He takes a random sample of 36 students who took a morning class and, independently, another random sample of 36 students who took an afternoon class. He finds that the morning group scored an average of 74 with a standard deviation of 8, while the evening group scored an average of 68 with a standard deviation of 10. The population standard deviation of scores is unknown but is assumed to be equal for morning and evening classes. Let µ1 and µ2 represent the population mean final exam scores of statistics’ courses offered in the morning and the afternoon, respectively. Which of the following is(are) the appropriate critical value(s) to test the professor’s claim at the 1% significance level? A) −2.381 and 2.381 B) −2.326 and 2.326 C) 2.326 D) 2.381
64) A statistics professor at a large university hypothesizes that students who take statistics in the morning typically do better than those who take it in the afternoon. He takes a random sample of 36 students who took a morning class and, independently, another random sample of 36 students who took an afternoon class. He finds that the morning group scored an average of 74 with a standard deviation of 8, while the evening group scored an average of 68 with a standard deviation of 10. The population standard deviation of scores is unknown but is assumed to be equal for morning and evening classes. Let µ1 and µ2 represent the population mean final exam scores of statistics’ courses offered in the morning and the afternoon, respectively. At the 1% significance level, does the evidence support the professor’s claim?
Version 1
24
A) No, because the test statistic is less than the critical value. B) Yes, because the test statistic is less than the critical value. C) No, because the test statistic is greater than the critical value. D) Yes, because the test statistic is greater than the critical value.
65) A new sales training program has been instituted at a rent-to-own company. Prior to the training, 10 employees were tested on their knowledge of products offered by the company. Once the training was completed, the employees were tested again in an effort to determine whether the training program was effective. Scores are known to be normally distributed. The sample scores on the tests are listed next. Use pretest score as µ1 for population 1 and posttest score as µ2 for population 2, or µD as the mean of the difference calculated as pretest score minus posttest score. Pretest Score
Posttest Score
66
75
94
100
87
93
84
85
76
75
88
90
Which of the following are the appropriate hypotheses to determine if the training increases scores? A) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 B) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 C) H0: µD ≥ 0, HA: µD < 0 D) H0: µD ≤ 0, HA: µD > 0
Version 1
25
66) A new sales training program has been instituted at a rent-to-own company. Prior to the training, 6 employees were tested on their knowledge of products offered by the company. Once the training was completed, the employees were tested again in an effort to determine whether the training program was effective. Scores are known to be normally distributed. The sample scores on the tests are listed next. Use pretest score as µ1 for population 1 and posttest score as µ2 for population 2, or µD as the difference calculated as pretest score minus posttest score. Pretest Score
Posttest Score
54
62
81
83
107
115
81
90
86
77
70
74
Which of the following is the value of the appropriate test statistic? A) t5 = –1.3262 B) z = 1.3262 C) z = –1.3262 D) t5 = 1.3262
67) A new sales training program has been instituted at a rent-to-own company. Prior to the training, 6 employees were tested on their knowledge of products offered by the company. Once the training was completed, the employees were tested again in an effort to determine whether the training program was effective. Scores are known to be normally distributed. The sample scores on the tests are listed next. Use pretest score as µ1 for population 1 and posttest score as µ2 for population 2, or µD as the difference calculated as pretest score minus posttest score. Pretest Score
Posttest Score
66
75
94
100
87
93
84
85
76
75
88
90
Which of the following is the value of the appropriate test statistic?
Version 1
26
A) t5 = −2.4947 B) t5 = 2.4947 C) z = −2.4947 D) z = 2.4947
68) A new sales training program has been instituted at a rent-to-own company. Prior to the training, 10 employees were tested on their knowledge of products offered by the company. Once the training was completed, the employees were tested again in an effort to determine whether the training program was effective. Scores are known to be normally distributed. The sample scores on the tests are listed next. Use pretest score as µ1 for population 1 and posttest score as µ2 for population 2, or µD as the difference calculated as pretest score minus posttest score. Pretest Score
Posttest Score
66
75
94
100
87
93
84
85
76
75
88
90
Which of the following is the correct conclusion for the test at α = 0.05? A) Given the critical value 1.645, we conclude that training increases scores on average. B) Given the critical value 2.015, we conclude that training increases scores on average. C) Given the critical value −2.015, we conclude that training increases scores on average. D) Given the critical value −1.645, we conclude that training increases scores on average.
Version 1
27
69) A tutor promises to improve GMAT scores of students by more than 50 points after three lessons. To see if this is true, the tutor takes a sample of 49 students’ test scores after and before they received tutoring. The mean difference was 53 points better after tutoring, with a standard deviation of the difference equal to 12 points. Let µD denote the mean of the difference: score after tutoring minus score before tutoring. Which of the following hypotheses will determine if the students improved their test scores by more than 50 points after being tutored? A) H0: µD ≤ 0, HA: µD > 0 B) H0: µD ≥ 0, HA: µD < 0 C) H0: µD ≤ 50, HA: µD > 50 D) H0: µD ≥ 50, HA: µD < 50
70) A tutor promises to improve GMAT scores of students by more than 56 points after three lessons. To see if this is true, the tutor takes a sample of 49 students’ test scores after and before they received tutoring. The mean difference was 59 points better after tutoring, with a standard deviation of the difference equal to 13 points. Let µD denote the mean of the difference: score after tutoring minus score before tutoring. Which of the following is the correct value of the test statistic? A) t48 = 1.6154 B) t48 = 1.5797 C) z = 1.5797 D) z = 1.6154
71) A tutor promises to improve GMAT scores of students by more than 50 points after three lessons. To see if this is true, the tutor takes a sample of 49 students’ test scores after and before they received tutoring. The mean difference was 53 points better after tutoring, with a standard deviation of the difference equal to 12 points. Let µD denote the mean of the difference: score after tutoring minus score before tutoring. Which of the following is the correct value of the test statistic?
Version 1
28
A) z = 1.7143 B) z = 1.7500 C) t48 = 1.7143 D) t48 = 1.7500
72) A tutor promises to improve GMAT scores of students by more than 50 points after three lessons. To see if this is true, the tutor takes a sample of 49 students’ test scores after and before they received tutoring. The mean difference was 53 points better after tutoring, with a standard deviation of the difference equal to 12 points. Let µD denote the mean of the difference: score after tutoring minus score before tutoring. Assuming α = 0.05, which of the following is the appropriate critical value? A) z = 1.677 B) z = 2.011 C) t0.05,48 = 1.677 D) t0.05,48 = 2.011
73) A tutor promises to improve GMAT scores of students by more than 50 points after three lessons. To see if this is true, the tutor takes a sample of 49 students’ test scores after and before they received tutoring. The mean difference was 53 points better after tutoring, with a standard deviation of the difference equal to 12 points. Let µD denote the mean of the difference: score after tutoring minus score before tutoring. Which of the following is the appropriate conclusion at the 5% significance level? A) Reject H0; we can conclude that the mean difference in test scores after and before tutoring is greater than 50 points. B) Reject H0; we cannot conclude that the mean difference in test scores after and before tutoring is greater than 50 points. C) Fail to reject H0; we can conclude that the mean difference in test scores after and before tutoring is greater than 50 points. D) Fail to reject H0; we cannot conclude that the mean difference in test scores after and before tutoring is greater than 50 points.
Version 1
29
74) A bank is trying to determine which model of safe to install. The bank manager believes that each model is equally resistant to safe crackers but sets up a test to be sure. He hires nine safe experts to break into each of the models, timing each endeavor. The results (in seconds) are given next, paired by expert. Let D be the difference: Time to break Safe 1 minus Time to break Safe 2. Safe 1
Safe 2
103
101
90
94
64
58
120
112
104
103
92
90
145
140
106
110
76
74
Which of the following hypotheses will determine if the two safes take, on average, the same amount of time to crack? A) H0: µD ≤ 0, HA: µD > 0 B) H0: µD ≥ 0, HA: µD < 0 C) H0: µD ≠ 0, HA: µD = 0 D) H0: µD = 0, HA: µD ≠ 0
75) A bank is trying to determine which model of safe to install. The bank manager believes that each model is equally resistant to safe crackers but sets up a test to be sure. He hires nine safe experts to break into each of the models, timing each endeavor. The results (in seconds) are given next, paired by expert. Let D be the difference: Time to break Safe 1 minus Time to break Safe 2. Safe 1
Safe 2
103
101
90
94
64
58
120
112
Version 1
30
104
103
92
90
145
140
106
110
76
74
Assuming the difference D is normally distributed, which of the following is the 95% confidence interval for μD? A) [−1.7537, 5.7537] B) [−1.1459, 5.1459] C) [−0.6739, 4.6739] D) [−0.5368, 4.5368]
76) A bank is trying to determine which model of safe to install. The bank manager believes that each model is equally resistant to safe crackers but sets up a test to be sure. He hires nine safe experts to break into each of the models, timing each endeavor. The results (in seconds) are given next, paired by expert. Let D be the difference: Time to break Safe 1 minus Time to break Safe 2. Safe 1
Safe 2
103
101
90
94
64
58
120
112
104
103
92
90
145
140
106
110
76
74
Should we conclude that the average time it takes experts to crack the safes does not differ by model at the 5% significance level?
Version 1
31
A) Yes, the 95% confidence interval for μD contains 0. B) No, the 95% confidence interval for μDdoes not contain 0. C) No, the 95% confidence interval for μD contains 0, but the absolute value of the upper bound is larger than the absolute value of the lower bound. D) There is not enough information to determine the answer.
77) Paul owns a mobile wood-fired pizza oven operation. A couple of his clients complained about his dough at a recent catering, so he changed his dough to a newer product. Using the old dough, there were 6 complaints out of 385 pizzas. With the new dough, there were 16 complaints out of 340 pizzas. Let p1 be the proportion of customer complains with the old dough and p2 be the proportion of customer complaints with the new dough. State the competing hypotheses to determine if the proportion of customer complaints using the old dough is less than the proportion of customer complaints using the new dough. A) H0:p1 − p2 ≤ 0, HA: p1 − p2 > 0 B) H0:p1 − p2 ≥ 0, HA: p1 − p2 < 0 C) H0:p1 − p2 = 0, HA: p1 − p2 ≠ 0 D) H0:p1 − p2 > 0, HA: p1 − p2 ≤ 0
78) Paul owns a mobile wood-fired pizza oven operation. A couple of his clients complained about his dough at a recent catering, so he changed his dough to a newer product. Using the old dough, there were 6 complaints out of 385 pizzas. With the new dough, there were 16 complaints out of 340 pizzas. Let p1 be the proportion of customer complaints with the old dough and p2 be the proportion of customer complains with the new dough. Based on a 95% confidence for the difference of the proportions, what can be concluded?
Version 1
32
A) Reject H0; we can conclude the proportion of customer complaints is more for the old dough. B) Do not reject H0; we cannot conclude the proportion of customer complaints is more for the old dough. C) Do not reject H0; we can conclude the proportion of customer complaints is more for the old dough. D) Reject H0; we cannot conclude the proportion of customer complaints is more for the old dough.
79)
<p>If testing the difference between two population proportions under independent
sampling, you pool the sample proportions and compute as
_________ .
A) if the population proportions are assumed equal B) if the difference in the population proportions is hypothesized to be zero C) if the difference in the population proportions is hypothesized to be different than zero D) if the population proportions are not equal
80) A coach is examining two basketball players’ free-throw percentages over the last few games. Bob made 34 out of 85 shots and Joe made 72 out of 125 shots. The coach wants to know if Bob is a 10% worse free-throw shooter than Joe. Which of the following would be the p-value of the corresponding test? A) 0.1364 B) 0.2728 C) 0.4322 D) 0.4544
Version 1
33
81) The student senate at a local university is about to hold elections. A representative from the women’s sports program and a representative from the men’s sports program must both be elected. Two candidates, an incumbent and a challenger, are vying for each position. A hypothesis test must be performed to determine whether the percentages of supporting votes are different between the two incumbent candidates. In a sample of 100 voters, 67 said that they would vote for the women’s incumbent candidate. In a separate sample of 100 voters, 55 said they would vote for the men’s incumbent candidate. Let p1 and p2 be the proportions of supporting votes for the incumbent candidates representing women’s and men’s sports programs, respectively. Which of the following hypothesis tests must be performed to determine whether the percentages of supporting votes are different between the two incumbent candidates? A) A z test of two sample proportions with a zero hypothesized difference. B) A t test of two population proportions with a zero hypothesized difference. C) A z test of two population proportions with a zero hypothesized difference. D) A t test of two population proportions with a nonzero hypothesized difference.
82) The student senate at a local university is about to hold elections. A representative from the women’s sports program and a representative from the men’s sports program must both be elected. Two candidates, an incumbent and a challenger, are vying for each position and early polling results are presented next. A hypothesis test must be performed to determine whether the percentages of supporting votes are different between the two incumbent candidates. In a sample of 100 voters, 67 said that they would vote for the women’s incumbent candidate. In a separate sample of 100 voters, 55 said they would vote for the men’s incumbent candidate. Let p1 and p2 be the proportions of supporting votes for the incumbent candidates representing women’s and men’s sports programs, respectively. Which of the following are the competing hypotheses to determine whether the percentages of supporting votes are different between the two incumbent candidates? A) H0: p = p0 , HA: p ≠ p0 B) H0: p1 = p2 , HA: p1 ≠ p2 C) H0: p1 ≤ p2 , HA: p1 > p2 D) H0: p1 ≥ p2 , HA: p1 < p2
Version 1
34
83) The student senate at a local university is about to hold elections. A representative from the women’s sports program and a representative from the men’s sports program must both be elected. Two candidates, an incumbent and a challenger, are vying for each position and early polling results are presented next. A hypothesis test must be performed to determine whether the percentages of supporting votes are different between the two incumbent candidates. In a sample of 100 voters, 67 said that they would vote for the women’s incumbent candidate. In a separate sample of 100 voters, 55 said they would vote for the men’s incumbent candidate. Let p1 and p2 be the proportions of supporting votes for the incumbent candidates representing women’s and men’s sports programs, respectively. Use a 5% level of significance. Which of the following is(are) critical value(s) of the appropriate test? A) 1.645 B) 1.960 C) −1.645 and +1.645 D) −1.960 and ±1.960
84) The student senate at a local university is about to hold elections. A representative from the women’s sports program and a representative from the men’s sports program must both be elected. Two candidates, an incumbent and a challenger, are vying for each position and early polling results are presented next. A hypothesis test must be performed to determine whether the percentages of supporting votes are different between the two incumbent candidates. In a sample of 100 voters, 67 said that they would vote for the women’s incumbent candidate. In a separate sample of 100 voters, 55 said they would vote for the men’s incumbent candidate. Let p1 and p2 be the proportions of supporting votes for the incumbent candidates representing women’s and men’s sports programs, respectively. Which of the following is the correct conclusion for the test at = 0.05? A) Reject H0: p1 = p2 and therefore conclude that the proportions differ. B) Reject H0: p1 = p2 and therefore do not conclude that the proportions differ. C) Do not reject H0: p1 = p2 and therefore conclude that the proportions differ. D) Do not reject H0: p1 = p2 and therefore do not conclude that the proportions differ.
Version 1
35
85) A veterinarian wants to know if pit bulls or golden retrievers have a higher incidence of tooth decay at the age of three. The vet surveys 120 three-year-old pit bulls and finds 30 of them have tooth decay. The vet then surveys 160 three-year-old golden retrievers and finds 32 of them have tooth decay. Number the population of pit bulls and golden retrievers by 1 and 2, respectively. Calculate a 90% confidence interval for the difference in the population proportion of pit bulls and golden retrievers that have tooth decay. Which of the following is correct? A) [−0.0492, 0.1492] B) [−0.0332, 0.1332] C) [0.0199. 0.0801] D) [0.0428, 0.0572]
86) A veterinarian wants to know if pit bulls or golden retrievers have a higher incidence of tooth decay at the age of three. The veterinarian surveys 120 three-year-old pit bulls and finds 30 of them have tooth decay. The veterinarian then surveys 160 three-year-old golden retrievers and finds 32 of them have tooth decay. Number the population of pit bulls and golden retrievers by 1 and 2, respectively. At the 10% significance level, can the veterinarian conclude the proportion of pit bulls that have tooth decay is different than the proportion of golden retrievers that have tooth decay? A) No, because the confidence interval contains zero. B) Yes, because the confidence interval contains zero. C) No, because the confidence interval does not contain zero. D) Yes, because the confidence interval does not contain zero.
87) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 45 out of 600 bags. Down Under airlines lost 30 of 500 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the correct competing hypotheses?
Version 1
36
A) H0: p = p0 , HA: p ≠ p0 B) H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 C) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 D) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0
88) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 35 out of 640 bags. Down Under airlines lost 16 of 430 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the correct value of the test statistic? A) z = −1.3142 B) t1068 = 1.3142 C) t1068 = −1.3142 D) z = 1.3142
89) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 45 out of 600 bags. Down Under airlines lost 30 of 500 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the correct value of the test statistic? A) z = −0.9804 B) t1098 = −0.9804 C) z = 0.9804 D) t1098 = 0.9804
Version 1
37
90) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 65 out of 670 bags. Down Under airlines lost 36 of 460 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the appropriate p-value? A) 0.8941 B) 0.1392 C) 0.8124 D) 0.0696
91) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 45 out of 600 bags. Down Under airlines lost 30 of 500 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the appropriate p-value? A) 0.0817 B) 0.1634 C) 0.8366 D) 0.9183
92) A consumer magazine wants to figure out which of two major airlines lost a higher proportion of luggage on international flights. The magazine surveyed Standard Air (population 1) and Down Under airlines (population 2). Standard Air lost 45 out of 600 bags. Down Under airlines lost 30 of 500 bags. Does Standard Air have a higher population proportion of lost bags on international flights? Which of the following is the appropriate conclusion at the 5% significance level?
Version 1
38
A) Reject H0; we can conclude that the proportion of lost luggage is higher for Standard Air than for Down Under airlines. B) Reject H0; we cannot conclude that the proportion of lost luggage is higher for Standard Air than for Down Under airlines. C) Fail to reject H0; we can conclude that the proportion of lost luggage is higher for Standard Air than for Down Under airlines. D) Fail to reject H0; we cannot conclude that the proportion of lost luggage is higher for Standard Air than for Down Under airlines.
93) In August 2010, Massachusetts enacted a 150-day right-to-cure period that mandates that lenders give homeowners who fall behind on their mortgage an extra five months to become current before beginning foreclosure proceedings. Policymakers claimed that the policy would result in a higher proportion of delinquent borrowers becoming current on their mortgages. To test this claim, researchers took a sample of 244 homeowners in danger of foreclosure in the time period surrounding the enactment of this law. Of the 100 who fell behind just before the law was enacted, 30 were able to avoid foreclosure, and of 144 who fell behind just after the law was enacted, 48 were able to avoid foreclosure. Let p1 and p2 represent the proportion of delinquent borrowers who avoid foreclosure just before and just after the right-to-cure law is enacted, respectively. Which of the following is the correct competing hypotheses that will test the policymakers’ claim? A) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 B) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0 C) H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 D) H0: p1 − p2 > 0, HA: p1 − p2 ≤ 0
Version 1
39
94) In August 2010, Massachusetts enacted a 150-day right-to-cure period that mandates that lenders give homeowners who fall behind on their mortgage an extra five months to become current before beginning foreclosure proceedings. Policymakers claimed that the policy would result in a higher proportion of delinquent borrowers becoming current on their mortgages. To test this claim, researchers took a sample of 244 homeowners in danger of foreclosure in the time period surrounding the enactment of this law. Of the 100 who fell behind just before the law was enacted, 30 were able to avoid foreclosure, and of 144 who fell behind just after the law was enacted, 48 were able to avoid foreclosure. Let p1 and p2 represent the proportion of delinquent borrowers who avoid foreclosure just before and just after the right-to-cure law is enacted, respectively. Which of the following is the appropriate test statistic value to analyze the claim at the 10% significance level? A) t243 = −0.549 B) t243 = 0.549 C) z = −0.549 D) z = 0.549
95) In August 2010, Massachusetts enacted a 150-day right-to-cure period that mandates that lenders give homeowners who fall behind on their mortgage an extra five months to become current before beginning foreclosure proceedings. Policymakers claimed that the policy would result in a higher proportion of delinquent borrowers becoming current on their mortgages. To test this claim, researchers took a sample of 300 homeowners in danger of foreclosure in the time period surrounding the enactment of this law. Of the 140 who fell behind just before the law was enacted, 15 were able to avoid foreclosure, and of 160 who fell behind just after the law was enacted, 23 were able to avoid foreclosure. Let p1 and p2 represent the proportion of delinquent borrowers who avoid foreclosure just before and just after the right-to-cure law is enacted, respectively. Which of the following is the appropriate p-value to verify the claim? A) 0.5877 B) 0.4363 C) 0.4625 D) 0.1708
Version 1
40
96) In August 2010, Massachusetts enacted a 150-day right-to-cure period that mandates that lenders give homeowners who fall behind on their mortgage an extra five months to become current before beginning foreclosure proceedings. Policymakers claimed that the policy would result in a higher proportion of delinquent borrowers becoming current on their mortgages. To test this claim, researchers took a sample of 244 homeowners in danger of foreclosure in the time period surrounding the enactment of this law. Of the 100 who fell behind just before the law was enacted, 30 were able to avoid foreclosure, and of 144 who fell behind just after the law was enacted, 48 were able to avoid foreclosure. Let p1 and p2 represent the proportion of delinquent borrowers who avoid foreclosure just before and just after the right-to-cure law is enacted, respectively. Which of the following is the appropriate p-value to verify the claim? A) 0.2915 B) 0.5570 C) 0.5832 D) 0.7084
97) In August 2010, Massachusetts enacted a 150-day right-to-cure period that mandates that lenders give homeowners who fall behind on their mortgage an extra five months to become current before beginning foreclosure proceedings. Policymakers claimed that the policy would result in a higher proportion of delinquent borrowers becoming current on their mortgages. To test this claim, researchers took a sample of 244 homeowners in danger of foreclosure in the time period surrounding the enactment of this law. Of the 100 who fell behind just before the law was enacted, 30 were able to avoid foreclosure, and of 144 who fell behind just after the law was enacted, 48 were able to avoid foreclosure. Let p1 and p2 represent the proportion of delinquent borrowers who avoid foreclosure just before and just after the right-to-cure law is enacted, respectively. Assuming α = 0.10, does the evidence support the policymakers’ claim? A) Yes, because the p-value is less than the critical value. B) No, because the test statistic value is less than the critical value. C) No, because the p-value is greater than the significance level. D) Yes, because the test statistic value is greater than the critical value.
Version 1
41
98) A new study has found that, on average, 6- to 12-year-old children are spending less time on household chores today compared to 1981 levels. Suppose two samples representative of the study’s results report the following summary statistics for the two periods. 1981 Levels
2021 Levels
Which of the following are the appropriate competing hypotheses? A) H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 B) H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 C) H0: µ1 − µ2 > 0, HA: µ1 − µ2 ≤ 0 D) H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0
99) A new study has found that, on average, 6- to 12-year-old children are spending less time on household chores today compared to 1981 levels. Suppose two samples representative of the study’s results report the following summary statistics for the two periods. 1981 Levels
2021 Levels
Which of the following is the correct value of the test statistic assuming that the unknown population variances are equal? A) t59 = −7.83 B) t58 = 7.83 C) z = 7.83 D) z = −7.83
100) A new study has found that, on average, 6- to 12-year-old children are spending less time on household chores today compared to 1981 levels. Suppose two samples representative of the study’s results report the following summary statistics for the two periods.
Version 1
42
1981 Levels
2021 Levels
Which of the following is the correct value of the test statistic assuming that the unknown population variances are equal? A) z = −5.73 B) t58 = 5.73 C) z = 5.73 D) t59 = −5.73
101) A new study has found that, on average, 6- to 12-year-old children are spending less time on household chores today compared to 1981 levels. Suppose two samples representative of the study’s results report the following summary statistics for the two periods. 1981 Levels
2021 Levels
Using a 5% confidence level, which of the following is correct? A) Reject H0: µ1 − µ2 ≤ 0 and therefore do not conclude that relative to the 1981 levels, 6- to 12-year-old children spent less time in 2021 on household chores. B) Do not reject H0: µ1 − µ2 ≤ 0 and therefore conclude that relative to the 1981 levels, 6- to 12-year-old children spent less time in 2021 on household chores. C) Reject H0: µ1 − µ2 ≤ 0 and therefore conclude that relative to the 1981 levels, 6- to 12-year-old children spent less time in 2021 on household chores. D) Do not reject H0: µ1 − µ2 ≤ 0 and therefore do not conclude that relative to the 1981 levels, 6- to 12-year-old children spent less time in 2021 on household chores.
Version 1
43
102) Do men really spend more money on St. Patrick’s Day as compared to women? A recent survey found that men spend an average of $43.95 while women spend an average of $27.47. Assume that these data were based on a sample of 100 men and 100 women and the population standard deviations of spending for men and women are $33 and $24, respectively. Which of the following is the correct value of test statistic? A) tdf = −4.04 B) z = −4.04 C) z = 4.04 D) tdf = 4.04
103) Do men really spend more money on St. Patrick’s Day as compared to women? A recent survey found that men spend an average of $43.87 while women spend an average of $29.54. Assume that these data were based on a sample of 100 men and 100 women and the population standard deviations of spending for men and women are $32 and $25, respectively. Which of the following is the correct value of test statistic? A) z = −3.53 B) tdf = 3.53 C) tdf = −3.53 D) z = 3.53
104) Do men really spend more money on St. Patrick’s Day as compared to women? A recent survey found that men spend an average of $43.87 while women spend an average of $29.54. Assume that these data were based on a sample of 100 men and 100 women and the population standard deviations of spending for men and women are $32 and $25, respectively. Using 1% confidence level, which of the following is the correct conclusion for this test?
Version 1
44
A) Reject H0: µ1 − µ2 ≤ 0 and therefore do not conclude that on average men spend more money than women on St. Patrick’s Day. B) Do not reject H0: µ1 − µ2 ≤ 0 and therefore conclude that on average men spend more money than women on St. Patrick’s Day. C) Reject H0: µ1 − µ2 ≤ 0 and therefore conclude that on average men spend more money than women on St. Patrick’s Day. D) Do not reject H0: µ1 − µ2 ≤ 0 and therefore do not conclude that on average men spend more money than women on St. Patrick’s Day.
105) A farmer is concerned that a change in fertilizer to an organic variant might change his crop yield. He subdivides his six lots and uses the old fertilizer on one half of each lot and the new fertilizer on the other half. The following table shows the results. Lot
Crop Yield Using Old Fertilizer
Crop Yield Using New Fertilizer
1
10
12
2
11
10
3
10
13
4
9
9
5
12
11
6
11
12
Which of the following are the appropriate competing hypotheses? A) H0: µD ≤ 0, HA: µD > 0 B) H0: µD ≥ 0, HA: µD < 0 C) H0: µD ≠ 0, HA: µD = 0 D) H0: µD = 0, HA: µD ≠ 0
106) A farmer is concerned that a change in fertilizer to an organic variant might change his crop yield. He subdivides his six lots and uses the old fertilizer on one half of each lot and the new fertilizer on the other half. The following table shows the results.
Version 1
45
Lot
Crop Yield Using Old Fertilizer Crop Yield Using New Fertilizer
1
12
13
2
12
11
3
9
10
4
12
9
5
10
11
6
10
12
Assume the crop yields are normally distributed. Which of the following is the correct value of test statistic? A) z = −0.22 B) z = 0.22 C) t5 = 0.22 D) t5 = −0.22
107) A farmer is concerned that a change in fertilizer to an organic variant might change his crop yield. He subdivides his six lots and uses the old fertilizer on one half of each lot and the new fertilizer on the other half. The following table shows the results. Lot
Crop Yield Using Old Fertilizer
Crop Yield Using New Fertilizer
1
10
12
2
11
10
3
10
13
4
9
9
5
12
11
6
11
12
Assume the crop yields are normally distributed. Which of the following is the correct value of test statistic? A) t5 = −1.01 B) z = 1.01 C) z = −1.01 D) t5 = 1.01
Version 1
46
108) A farmer is concerned that a change in fertilizer to an organic variant might change his crop yield. He subdivides his six lots and uses the old fertilizer on one half of each lot and the new fertilizer on the other half. The following table shows the results. Lot
Crop Yield Using Old Fertilizer
Crop Yield Using New Fertilizer
1
10
12
2
11
10
3
10
13
4
9
9
5
12
11
6
11
12
Using 5% significance level, which of the following is an appropriate conclusion that supports or does not support the farmer’s claim? A) Reject H0: µD = 0 and therefore conclude that the crop yield with the new fertilizer is not significantly different from the crop yield with the old fertilizer. B) Reject H0: µD = 0 and therefore do not conclude that the crop yield with the new fertilizer is not significantly different from the crop yield with the old fertilizer. C) Do not reject H0: µD = 0 and therefore conclude that the crop yield with the new fertilizer is not significantly different from the crop yield with the old fertilizer. D) Do not reject H0: µD = 0 and therefore do not conclude that the crop yield with the new fertilizer is not significantly different from the crop yield with the old fertilizer.
109) A recent Health of Boston report suggests that 14% of female residents suffer from asthma as opposed to 6% of males. Suppose 250 females and 200 males responded to the study. Which of the following are an appropriate null and alternative hypotheses to test whether the proportion of females suffering from asthma is greater than the proportion of males? A) H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 B) H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0 C) H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 D) H0: p1 − p2 > 0, HA: p1 − p2 ≤ 0
Version 1
47
110) A recent Health of Boston report suggests that 14% of female residents suffer from asthma as opposed to 6% of males. Suppose 250 females and 200 males responded to the study. Which of the following is the correct value of the test statistics? A) −1.76 B) 2.76 C) −2.76 D) 1.76
111) <p>A recent Health of Boston report suggests that 14% of female residents suffer from asthma as opposed to 6% of males. Suppose 250 females and 200 males responded to the study. Which of the following is the correct conclusion for the test at ? A) Reject H0: p1 − p2 ≤ 0 and therefore conclude that females suffer more from asthma than males. B) Reject H0: p1 − p2 ≤ 0 and therefore do not conclude that females suffer more from asthma than males. C) Do not reject H0: p1 − p2 ≤ 0 and therefore conclude that females suffer more from asthma than males. D) Do not reject H0: p1 − p2 ≤ 0 and therefore do not conclude that females suffer more from asthma than males.
112) Which of the following is an example about analyzing the difference between two population means? A) Compare the average six-year graduation rate between private and public colleges. B) Compare the average checkout time at ABC Supermarket against what is being advertised. C) Compare the proportion of defects found in a sample of 10 chips versus a sample of 100 chips. D) Compare the average number of defects against the control limits of the control chart.
Version 1
48
113) Which of the following is NOT an example of analyzing the mean difference of two populations based on matched-pairs sampling? A) Compare the mean wait time of customers being served at a bank before and after the remodeling. B) Compare the mean wait time of customers being served at a bank before and after the use of chatbots. C) Compare the mean wait time of customers being served at a bank before and after the weekend. D) Compare the mean wait time of customers being served at a bank in a branch that is fully automated versus one that is 10% automated.
114) Which of the following is an example of analyzing the difference between two populations’ proportions? A) Compare the relative fatality rate of COVID-19 between the young and old adults in the United States. B) Compare the 7-day moving average of positive COVID-19 cases in Texas to see if there is a positive or negative trend. C) Compare the 7-day moving average of positive COVID-19 cases before and after school reopening in Quincy, Illinois. D) Compare the 7-day moving average of fatality rate of COVID in Texas to see if there is a positive or negative trend.
Version 1
49
115) Weather forecasters would like to report on temperatures in Washington, DC. In particular, they would like to compare the temperatures in December to temperatures in January to help potential tourists understand whether there is a difference. The average December temperature in a random sample of 16 days is 35 and the average January temperature in a random sample of 20 days is 33. The standard deviations based on historic information are 4 for December and 5 for January. a. Construct a 90% confidence interval for the difference between the population mean temperatures in December and January. b. Can we conclude with 90% confidence that the mean temperatures in December and January are different?
116) Weather forecasters would like to report on temperatures in Washington, DC. In particular, they would like to compare the temperatures in December to temperatures in January to help potential tourists understand whether there is a difference. They believe it is colder in January but want to conduct a test to support their assertion. The average December temperature in a random sample of 16 days is 35 and the average January temperature in a random sample of 20 days is 33. The standard deviations based on historic information are 4 for December and 5 for January. a. Specify the competing hypotheses to determine if it is colder in January than in December. b. Find the critical value(s) of the appropriate test at the 5% significance level. c. Calculate the value of the test statistic. d. Make a conclusion at the 5% significance level.
Version 1
50
117) According to a recent survey, women chat on their mobile phones more than do men (CNN.com, August 25, 2010). Irina wants to estimate the difference between the average chat time for female and male college students. She takes a random sample of 40 male students and 40 female students in her college. She finds that female students chatted for a sample average of 820 minutes per month with a sample standard deviation of 160 minutes. Male students, on the other hand, chatted for a sample average of 760 minutes per month with a sample standard deviation of 240 minutes. You may assume that the population variances are equal. a. Construct a 95% confidence interval for the difference between the population means. b. Can we conclude that average chat time is different between female and male college students with 95% confidence?
118) A student has to decide whether he wants to major in electrical engineering or computer science. To find out which job pays more, he interviews eight alumni in each profession and asks them what their starting salary was. The electrical engineering alumni had a mean starting salary of $54,000 and a standard deviation of $8,000. The computer science alumni had a mean starting salary of $62,000 and a standard deviation of $10,000. It is reasonable to assume equal population variances, and assume starting salaries are normally distributed. a. Specify the competing hypotheses to determine if electrical engineering and computer science majors mean starting salaries are equal. b. Calculate the value of the test statistic and find the critical value at a 5% significance level. c. Make a conclusion at the 5% significance level.
Version 1
51
119) A sports commentator believes that a particular basketball player is just awful in road games but relatively effective at home. To test this claim, the commentator takes a random sample of 36 road games and 49 home games and collects the player’s scoring output in each game. He finds that in the 49 home games, the player averaged 12.3 points and his scoring output had a standard deviation of 2.6 points. Alternatively, in the 36 road games, the player averaged 8.9 points with a standard deviation of 5.6 points. a. Specify the competing hypotheses to determine if the player has a higher scoring average in home games. b. Calculate the value of the test statistic and the critical value at a 1% significance level. There is no reason to assume that the population variance of scoring output is equal for home and road games. c. Does the evidence support the commentator’s claim at the 1% significance level?
120) A shoe company designed a low-top and high-top version of a particular shoe. One of the shoe-making engineers claims that the low-top version does not last as long as the high-top version before wearing out. To test this claim, she takes a sample of 36 low-top and 64 high-top wearers and asks each the length of time before their shoes wore out. She finds that the low-tops wore out in an average of 140 days, and the high-tops wore out in an average of 150 days. Suppose the population standard deviation for both is 20 days. a. Specify the competing hypotheses to determine whether high-tops last longer than low-tops for this particular shoe. b. Calculate the value of the relevant test statistic. c. Compute the p-value. Does the evidence support the commentator’s claim at the 1% significance level?
Version 1
52
121) According to a recent survey, women chat on their mobile phones more than do men (CNN.com, August 25, 2010). To determine if the same patterns also exist in colleges, Irina takes a random sample of 40 male students and 40 female students in her college. She finds that female students chatted for a sample average of 820 minutes per month, with a sample standard deviation of 160 minutes. Male students, on the other hand, chatted for a sample average of 760 minutes per month, with a sample standard deviation of 240 minutes. It is not reasonable to assume equal population variances. a. Specify the competing hypotheses to determine if female students chat more, on average, than male students. b. Calculate the value of the test statistic and its p-value. c. Make a conclusion at the 5% significance level.
122) According to a recent survey, women chat on their mobile phones more than men (CNN.com, August 25, 2010). To determine if the same patterns also exist in colleges, Irina takes a random sample of 40 male students and 40 female students in her college. She finds that female students chatted for a sample average of 820 minutes per month, with a sample standard deviation of 160 minutes. Male students, on the other hand, chatted for a sample average of 760 minutes per month, with a sample standard deviation of 240 minutes. It is not reasonable to assume equal population variances. a. Obtain a 95% confidence interval for the difference between the average amount of chat time for males and the average amount of chat time for females. b. Make a conclusion at the 5% significance level.
Version 1
53
123) <p>Students are planning a bake sale to raise money to purchase new uniforms for the volleyball team at a local high school. They would like to generate the best possible results and have therefore been experimenting with cupcake recipes. Nine different recipes are chosen and baked. The same recipes with 50% additional baking powder are then baked, leaving all other aspects of the recipe unchanged. They then measured the height of the cupcakes paired by recipe. The sample mean of the differences is and the sample standard deviation of the differences is . Perform a hypothesis test at the 1% level of significance to determine if the height of cupcakes rises with the additional baking powder. Assume the distribution of cupcake heights is normally distributed.
124) A bicycle manufacturer makes downhill mountain bikes out of two different types of material: carbon fiber and aluminum. One test the company uses to measure the durability of their frames is called a stress test. The company tested 10 aluminum and 10 carbon frames to see how many stress cycles each type of material could withstand before breaking. The results are listed next in thousands. It is not reasonable to assume equal population variances. Assume the number of stress cycles a bicycle frame can withstand before breaking is normally distributed. Material Type
a. Specify the competing hypotheses to determine if a carbon frame can withstand at least 10,000 more stress cycles than an aluminum frame. b. Calculate the value of the test statistic and its p-value. c. Make a conclusion at the 5% significance level.
Version 1
54
125) A company claims that you can expect your car to get one mpg better gas mileage while using their gasoline additive. A magazine did a study to find out how much a car’s gas mileage improved while using the gasoline additive. The study used 36 cars and recorded the average mpg with and without the additive for each car in the study. The cars with the additive averaged 1.20 mpg better than without and had a variance of 0.36 (mpg)2. a. Specify the competing hypotheses to determine if the gasoline additive improved gas mileage by at least one mpg. Use the matched-pairs sampling. b. Calculate the value of the test statistic and the critical value. c. Make a conclusion at the 5% significance level.
126) A large city installed LED streetlights, in part, to help reduce crime in particular areas. The city collected the number of property crimes along nine randomly selected streets in the month before and after the LED light installation, and claims that crime has been reduced. The crime data is shown in the table. Before LED
After LED
17
14
25
20
10
14
14
12
16
8
34
35
32
27
21
18
14
12
a. Specify the competing hypotheses to test the city’s claim. b. Calculate the value of the relevant test statistic and find the critical value at the 5% significance level. c. Does the evidence support the city’s claim at the 5% significance level?
Version 1
55
127) A recent study from the University of California, Berkeley, suggests that, compared to white Americans, black Americans are more likely to cross the race barrier in online dating. In a sample of 120 white men, 24 initiated contact with someone outside their race, whereas in a similar sample of 90 black men, 36 initiated contact with someone outside their race. a. Specify the hypotheses to justify the suggestion. b. Find the critical value(s) of the appropriate test at the 1% significance level. c. Calculate the value of the test statistic. d. Make a conclusion at the 1% significance level.
128) A recent study from the University of California, Berkeley, suggests that compared to black Americans, white Americans are less likely to cross the race barrier in online dating. In a sample of 120 white men, 24 initiated contact with someone outside of their race, whereas in a similar sample of 90 black men, 36 initiated contact with someone outside of their race. a. Specify the competing hypotheses to determine if the proportion of white Americans initiating a contact outside their race is exactly 10 percentage points lower than that of black Americans. b. Find the critical value(s) of the test. c. Calculate the value of the test statistic. d. Make a conclusion at the 5% significance level.
Version 1
56
129) The serious delinquency rate measures the proportion of homes that are well behind on their mortgage payments. A real estate investor is trying to decide in which of two neighborhoods in Phoenix to buy an investment property. He is trying to decide if the serious delinquency rate differs between the two neighborhoods. He took a sample of 70 homes in neighborhood 1 and found 14 homes were seriously delinquent. He took a sample of 80 homes in neighborhood 2 and 10 homes were seriously delinquent. a. Specify the competing hypotheses to determine if the delinquency rates in the two neighborhoods are equal. b. Calculate the value of the test statistic and the p-value. c. Make a conclusion at the 10% significance level. How should the delinquency rate influence the investor’s decision?
130) A candidate in a local election thinks seniors are by far her best supporters. She believes that the difference in the support between seniors and nonseniors is more than 15%. To verify this belief, the team samples 100 seniors and 150 nonseniors to find what proportion of each group supports the candidate. It is found that 50 of the seniors support her and 36 of the nonseniors support her. a. Specify the hypotheses to determine if the difference in the support between seniors and nonseniors is more than 15%. b. Calculate the value of the test statistic and the p-value. c. Make a conclusion at the 5% significance level.
131) A Catholic priest believes that a higher proportion of women in his congregation regularly attend services than the men in his congregation. He randomly selects 100 men and 100 women from his congregation and finds that 32 men and 48 women regularly attend. a. Specify the competing hypotheses to test the priest’s claim. b. Calculate the value of the relevant test statistic. c. Compute the p-value. Does the evidence support the priest’s claim at the 1% significance level?
Version 1
57
132) A computer manufacturer believes that the proportion of hardware malfunctions is different in humid climates than in dry climates. To test this claim, the manufacturer takes a random sample of 400 machines sold in Florida (humid) and 400 machines sold in Arizona (dry). The manufacturer finds that 44 of the machines in Florida and 24 of the machines in Arizona had hardware malfunctions. a. Specify the competing hypotheses to test the manufacturer’s claim. b. Calculate the value of the relevant test statistic. c. Compute the p-value. Does the evidence support the computer manufacturer’s claim at the 10% significance level?
133) ABC College is concerned about students’ performance gaps between the online and inperson sections of an Introduction to Business Statistics course. A random sample of 90 exam scores is collected from the online and in-person sections. A portion of the data showing the Exam Score and Section (1 means in-person and 2 means online) is given below: Exam Score 68 ⋮ 75 87
Section (1: in-person; 2: online) 1 ⋮ 2 1
{MISSING IMAGE}Click here for the Excel Data File a. Create two subsamples consisting of students in the in-person and online sections. What is the average exam score for each subsample? b. Specify the competing hypotheses to test whether the average exam scores differ between the two sections. c. Calculate the value of the test statistic and the p-value. Assume that the population variances are equal. d. What is the conclusion of the test using a 5% level of significance? Explain.
Version 1
58
134) A researcher wants to examine the impacts of COVID-19 lockdown on the number of meetings an average worker attended by videoconferencing. He expects that the number of such meetings is higher after the lockdown. He collects data on the number of such meetings an average worker attended in the month before and after the lockdown in 16 cities. A portion of data is shown below: City 1 ⋮ 16
Before 17 ⋮ 9
After 20 ⋮ 6
{MISSING IMAGE}Click here for the Excel Data File a. State the null and alterative hypotheses to test the researcher’s expectation. b. Assuming the difference in number of meetings is normally distributed, calculate the value of the test statistic and the p-value. c. At the 5% significance level, do the sample data support the researcher’s expectation? Explain.
135) According to the Centers for Disease Control and Prevention, some racial and ethnic minority groups are being disproportionately affected by COVID-19. To determine if African American adults are under more financial stress than Caucasian adults, you randomly select 100 African American adults and 200 Caucasian adults and ask whether they are stressed financially by COVID-19. The following table shows the information collected: Race African American Caucasian
Version 1
Stressed 80 125
59
a. State the null and alternative hypotheses to test whether the proportion of African Americans under more financial stress is higher than that of the Caucasians. b. Calculate the value of the test statistic and the p-value. c. At the 5% significance level, what is the conclusion? Explain.
136) Two random samples are considered independent if the observations in the first sample are related to the observations in the second sample. ⊚ true ⊚ false
137) <p>The difference between the two sample means difference between two population means . ⊚ true ⊚ false
138) <p>For a statistical inference regarding distribution of is normally distributed. ⊚ true ⊚ false
is an interval estimator of the
, it is imperative that the sampling
139) <p>If the underlying populations cannot be assumed to be normal, then by the central limit theorem, the sampling distribution of is approximately normal only if the sum of the sample observations is sufficiently large—that is, when ⊚ true ⊚ false
Version 1
60
140) <p>The confidence interval for the difference in the case of one sample: Point Estimate ± Standard Error. ⊚ true ⊚ false
is based on the same approach used
141) <p>The margin of error in the confidence interval for the difference equals the standard error multiplied by either or depending on whether or not the population variances are known. ⊚ true ⊚ false
142) <p>In the case when and are unknown and can be assumed equal, we can calculate a pooled estimate of the population variance. ⊚ true ⊚ false
143) <p>We convert the estimate into the corresponding value of the z or t test statistic by dividing the difference between and the hypothesized difference by the standard error of the estimator . ⊚ true ⊚ false
144)
There is only one type of matched-pairs sampling. ⊚ true ⊚ false
145) We always deal with matched-pairs sampling if two samples have the same number of observations. ⊚ true ⊚ false Version 1
61
146) The statistical inference concerning the difference between two population proportions is used for categorical data. ⊚ true ⊚ false
147) <p>We use the difference between the sample proportions of the difference between two population proportions . ⊚ true ⊚ false
as the point estimator
148) For a hypothesis test about the difference of two proportions, the standard error can be improved on when the hypothesized difference does not equal zero. ⊚ true ⊚ false
149)
The t statistic is used to estimate the difference between two population proportions. ⊚ true ⊚ false
150)
The hypothesis test H0:p1 – p2 ≤ d0; HA: p1 – p2 > d0 is a left-tailed test. ⊚ true ⊚ false
Version 1
62
Answer Key Test name: Chap 10_4e 1) independent 2) less 3) t-test 4) matched-pairs 5) margin of error 6) Data Analysis 7) standard error 8) not zero 9) D 10) D 11) B 12) D 13) B 14) D 15) D 16) D 17) B 18) D 19) A 20) A 21) A 22) C 23) B 24) D 25) C 26) C Version 1
63
27) B 28) D 29) C 30) C 31) D 32) C 33) D 34) A 35) B 36) A 37) D 38) D 39) A 40) A 41) D 42) B 43) D 44) A 45) A 46) A 47) C 48) B 49) D 50) C 51) C 52) A 53) B 54) B 55) D 56) C Version 1
64
57) C 58) C 59) B 60) D 61) A 62) B 63) D 64) D 65) C 66) A 67) A 68) C 69) C 70) A 71) D 72) C 73) A 74) D 75) B 76) A 77) B 78) B 79) B 80) B 81) C 82) B 83) D 84) D 85) B 86) A Version 1
65
87) C 88) D 89) C 90) B 91) B 92) D 93) B 94) C 95) D 96) A 97) C 98) A 99) B 100) B 101) C 102) C 103) D 104) C 105) D 106) D 107) A 108) C 109) A 110) B 111) A 112) A 113) C 114) A 115) a. [−0.4675, 4.4675] b. No
Version 1
66
116) a. H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 b. z0.05 = 1.645 c. z = 1.3333 d. Do not reject H0. We cannot conclude the average temperature in January is less than the average temperature in December. 117) a. [−30.8035, 150.8035] b. No 118) a. H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0 b.tdf = 1.7669, critical values: −2.145 and 2.145 c. Do not reject H0. We cannot conclude the average starting salary of electrical engineering students differs from the average starting salary of computer science students. 119) a. H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 b. t46 = 3.385, critical value = 2.410 c. Reject H0; the commentator’s claim is supported. 120) a. H0: µ1 − µ2 ≥ 0, HA: µ1 − µ2 < 0 b. z = −2.400, p-value = 0.0082 c. Reject H0; the engineer’s claim is supported. 121) a. H0: µ1 − µ2 ≤ 0, HA: µ1 − µ2 > 0 b. tdf = 1.3156; p-value = 0.0964 c. Do not reject H0. We cannot conclude the average amount of chat time for females is more than the average amount of chat time for males. 122) a. [−151.03, 31.03] b.Since the confidence interval contains zero, we cannot conclude the average amount of chat time for males is different from the average amount of chat time for females. 123) Students cannot conclude that the additional baking powder does not raise the height of the cupcakes. 124) a. H0: µ1 − µ2 ≤ 10,000, HA: µ1 − µ2 > 10,000 b. tdf = 13; p-value = 0.0268 c. Reject H0. We can conclude the average durability of the carbon frame is more than 10,000 stress tests more than the average durability of the aluminum frame. 125) a. H0: µD ≤ 1, HA: µD > 1 b. t35 = 2.00, the critical value t0.05,35 = 1.69 c. Reject H0. We can conclude that, on average, the gasoline additive improves gas mileage by more than 1 mpg. 126) a. H0: µD ≤ 0, HA: µD > 0 b. t8 = 2.188; the critical value t0.05,8 = 1.860 c. Reject H0. The city’s claim is supported.
Version 1
67
127) a. H0: p1 − p2 ≥ 0, HA: p1 − p2 < 0 b. z0.01 = −2.326 c. z = −3.1749 d. Reject H0. We can conclude the proportion of black men that initiate contact with someone outside their race is less than the proportion of white men that initiate contact with someone outside their race. 128) a. H0: p1 − p2 = −0.10, HA: p1 − p2 ≠ −0.10 b. z0.025 = −1.96 and z0.025 = 1.96 c. z = −1.5811 d. Do not reject H0. We cannot conclude the percentage of white Americans that initiate contact with someone outside their race is 10% less than black American that initiate contact with someone outside their race. 129) a. H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 b. z = 1.25, p-value = 0.2112 c. Do not reject H0. We cannot conclude the delinquency rates of the two neighborhoods differ. 130) a. H0: p1 − p2 ≤ 0.15, HA: p1 − p2 > 0.15 b. z = 1.8033, p-value −0.0356 c. Do not reject H0. We cannot conclude the percentage of senior supporters is more than 15% more than the percentage of nonsenior supporters. 131) a. H0: p1 − p2 ≤ 0, HA: p1 − p2 > 0 b. z = 2.309 c. p-value = 0.0105. Do not reject H0; the priest’s claim is not supported. 132) a. H0: p1 − p2 = 0, HA: p1 − p2 ≠ 0 b. z = 2.538 c. p-value = 0.0112. Reject H0; the manufacturer’s claim is supported. 133) a. The average exam score for the in-person section is 70.04. The average exam score for the online section is 67.57. b. H0: µ1 − µ2 = 0, HA: µ1 − µ2 ≠ 0 c. The test statistic is 1.13. The p-value is 0.26. d. At the 5% significance level, we cannot reject H0 because the p-value is greater than 0.05. This means that while the exam score for the in-person section (70.04) is slightly higher than that of the online section (67.57), the difference is not statistically significant. 134) a. H0: µD ≥ 0; HA: µD < 0 b. Test statistic = −3.5295; p-value = 0.0015 c. The data support the researcher’s expectation that the number of meetings is higher after the lockdown. A p-value of 0.0015 is less than the significance level of 0.05. Therefore, we can reject the null hypothesis.
Version 1
68
135) a. H0: p1-p2 ≤ 0; H1: p1-p2 > 0. b. Test statistic = 3.0717; p-value = 0.0011 c. The p-value of 0.0011 is less than the 5% significance level. Therefore, we reject H0 which means the proportion of African American adults under financial stress is higher than their Caucasian counterparts.
136) FALSE 137) FALSE 138) TRUE 139) FALSE 140) FALSE 141) TRUE 142) TRUE 143) TRUE 144) FALSE 145) FALSE 146) TRUE 147) TRUE 148) FALSE 149) FALSE 150) FALSE
Version 1
69
CHAPTER 11 – 4e 1) <p>In general, the distribution is the probability distribution of the sum of several independent squared standard _______ random variables.
2)
<p>The relevant value in the ____ tail of the
distribution is
3) <p>The formula for the confidence interval of the population variance when the random sample is drawn from a _______ distributed population.
.
is valid only
4) Use the R function ________ to obtain left-tail probabilities of the chi-square distribution.
5) We can use Excel’s function ___________ that returns the right-tailed probability of the chi-square distribution.
6)
The F distribution depends on _____ degrees of freedom.
7)
<p>All
distributions are __________ skewed.
8) The specification of the confidence interval for the ratio of two population variances is based on the assumption that the sample variances are computed from ____________ drawn samples from two normally distributed populations.
Version 1
1
9) Use the R function ________ to obtain a p-value for a test about the ratio of two population variances.
10) The Excel’s function ________ returns the p-value for a right-tailed test for the ratio of two population variances.
11)
As the df grow larger, the
distribution approaches the
A) F distribution B) uniform distribution C) student’s t distribution D) normal distribution
12)
If a sample of size n is taken from a normal population with a finite variance, then the
statistic
follows the
distribution with degrees of freedom.
A) (n + 1)(n − 1) B) n + 1 C) n − 1 D) n
13)
If s2 is computed from a random sample of n observations drawn from an underlying
normal population with a finite variance, then the
Version 1
variable is defined as
2
A) B) C) D)
.</p>
14) The _____ is the probability distribution of the sum of several independent squared standard normal random variables. A) F distribution B) distribution</p> C) student’s t distribution D) uniform distribution
15) Which of the following is the formula for the sample variance s2 when used as an estimate of σ2 for a random sample of n observations from a population? A) B) C) D)
16)
Statistical inference about σ2 is based on which of the following distributions?
Version 1
3
A) The F distribution B) The student’s t distribution C) The chi-square distribution D) The normal distribution
17)
If P(
≥ x) = 0.05, then the value of x is
A) 14.449. B) 10.645. C) 12.592. D) 1.6350.
18) The values taken from a normally distributed population are 21 23 25 27 28 35 30 32 33. Which of the following is a 95% confidence interval for the population variance? A) [2.03, 16.30] B) [10.12, 81.43] C) [9.00, 72.41] D) [11.39, 91.64]
19) <p>Which of the following is the value of of freedom equal 6?
for a 99% confidence level and degrees
A) 16.812 B) 18.548 C) 0.872 D) 0.676
Version 1
4
20) Which of the following is a 98% confidence interval for the population variance when the sample variance is 20 for a sample of 10 items from a normal population? A) [8.308, 86.207] B) [7.476, 77.512] C) [8.617, 78.125] D) [7.755, 70.313]
21) Which of the following is used to conduct a hypothesis test about the population variance? A) Sample mean B) Population mean C) Sample proportion D) Sample variance
22)
Which of the following hypotheses is a right-tail test about the population variance? A) B) C) D)
23) You want to test whether the population variance differs from 50. From a sample of 25 observations drawn from a normally distributed population, you calculate s2 = 80. When conducting this test at the 5% significance level, the
critical value is
A) 5.625. B) 12.401. C) 14.400. D) 39.364.
Version 1
5
24) <p>For a sample of 10 observations drawn from a normally distributed population, we obtain the sample mean and the sample variance as 50 and 75, respectively. We want to determine whether the population variance is greater than 70. The
test statistic is
A) 1.645. B) 3.325. C) 9.642. D) 10.332.
25)
Which of the following is a feature of the F distribution?
A) The F distribution depends on one degree of freedom. B) The F distribution is bell-shaped with values ranging from negative infinity to infinity. C) The F distribution becomes increasingly symmetric when the degrees of freedom increase. D) The F distribution is negatively skewed.
26) <p>We conduct the following hypothesis test . For a random sample of 15 observations, the sample standard deviation is 12. Which of the following is the correct approximation of the p-value used to conduct this test? A) p-value lies between 0.025 and 0.05 B) p-value lies between 0.01 and 0.025 C) p-value lies between 0.05 and 0.10 D) p-value is greater than 0.10
Version 1
6
27) Students of two sections of a history course took a common final examination. The course instructor examines the variance in scores between the two sections. He selects random samples of n1 = 11 and n2 = 16 with sample variances of and , respectively. Assuming that the population distributions are normal, construct a 90% confidence interval for the ratio of the population variance. A) [0.90, 2.41] B) [0.50, 2.00] C) [0.25, 4.00] D) [0.79, 5.70]
28) Students of two sections of a history course took a common final examination. The course instructor examines the variance in scores between the two sections. He selects random samples of n1 = 11 and n2 = 16 with sample variances of and , respectively. Suppose you obtain a 95% confidence for the ratio of the population variances. Which of the below allows you to conclude the first variance is smaller than the second variance? A) The entire interval is less than 1. B) The interval captures 1. C) The entire interval is more than 1. D) The entire interval is equal to 1.
29) Becky owns a diner and is concerned about sustaining the business. She wants to ascertain if the standard deviation of the profits for each week is greater than $250. The details of the profits for the week are (in dollars) 1,743 1,438 1,212 1,705 1,985 1,857 1,916 Assume that profits are normally distributed. Which of the following are appropriate hypotheses to test Becky's concern? A) B) C) D)
Version 1
7
30) Becky owns a diner and is concerned about sustaining the business. She wants to ascertain if the standard deviation of the profits for each week is greater than $250. The details of the profits for the week are (in dollars) 1,743 1,438 1,212 1,705 1,985 1,857 1,916 Assume that profits are normally distributed. Which of the following is the correct value of the test statistic? A) 6.146 B) 8.604 C) 6.652 D) 7.375
31) Becky owns a diner and is concerned about sustaining the business. She wants to ascertain if the standard deviation of the profits for each week is greater than $250. The details of the profits for the week are (in dollars) 1,743 1,438 1,212 1,705 1,985 1,857 1,916 Assume that profits are normally distributed. Using the critical value approach at α = 0.05, which of the following is the correct conclusion for Becky’s concern? A) We reject H0 because the value of the test statistic is greater than .</p> B) We do not reject H0 because the value of the test statistic is greater than .</p> C) We reject H0 because the value of the test statistic is less than D) We do not reject H0 because the value of the test statistic is less than
32) A random sample of 18 observations is taken from a normal population. The sample mean and sample standard deviation are 76.4 and 4.2, respectively. What is an 80% interval estimate of the population variance?
Version 1
8
A) [12.107, 29.735] B) [10.870, 34.581] C) [12.819, 31.484] D) [14.636, 23.443]
33)
How does the width of the interval respond to the changes in the confidence level? A) The width of the interval decreases with an increase in the confidence level. B) The width of the interval increases with an increase in the confidence level. C) The width of the interval is halved with the increase in the confidence level. D) The width of the interval is doubled with the decrease in the confidence level.
34) The manager of a video library would like the variance of the waiting times of the customers not to exceed 2.30 minutes-squared. He would like to add an additional billing counter if the variance exceeds the cut-off. He checks the recent sample data. For a random sample of 24 customer waiting times, he arrives at a sample variance of 3.8 minutes-squared. The manager assumes the waiting times to be normally distributed. Which of the following would be null and the alternate hypothesis to test if the cut-off is surpassed? A) B) C) D)
35) The manager of a video library would like the variance of the waiting times of the customers not to exceed 2.30 minutes-squared. He would like to add an additional billing counter if the variance exceeds the cut-off. He checks the recent sample data. For a random sample of 24 customer waiting times, he arrives at a sample variance of 3.8 minutes-squared. The manager assumes the waiting times to be normally distributed. Which of the following is the correct approximation of the p-value used to conduct this test?
Version 1
9
A) p-value lies between 0.005 and 0.010 B) p-value lies between 0.010 and 0.025 C) p-value lies between 0.050 and 0.10 D) p-value lies between 0.025 and 0.05
36) The manager of a video library would like the variance of the waiting times of the customers not to exceed 2.30 minutes-squared. He would like to add an additional billing counter if the variance exceeds the cut-off. He checks the recent sample data. For a random sample of 24 customer waiting times, he arrives at a sample variance of 3.8 minutes-squared. The manager assumes the waiting times to be normally distributed. At α = 0.05, which of the following is the critical value
?
A) 13.091 B) 32.007 C) 35.172 D) 38.076
37)
<p>Which of the following Excel functions is used to calculate the exact left-tail
probability for a
distribution?
A) CHISQ.DIST(x, Deg_freedom, Cumulative) B) CHISQ.DIST(x, n−2) C) CHISQ.DIST(x, n/2) D) CHISQ.DIST(x/2, Deg_freedom, Cumulative)
38)
<p>Which of the following R functions is used to calculate the exact left-tail probability
for a
distribution?
Version 1
10
A) 1-pchisq(x, df) B) 1-pchisq(x, df, lower.tail=TRUE) C) pchisq(x, df) D) pchisq(x, df, upper.tail=TRUE)
39)
Which of the following characteristics is true regarding the F distribution?
A) <p>The distribution is negatively skewed.</p> B) <p>The values of the distribution range from negative infinity to infinity.</p> C) <p>The distribution is the probability distribution of the ratio of two independent chi-square variables.</p> D) <p>The shape of the distribution is independent of the degrees of freedom.</p>
40) If independent samples of size n1 and n2 are drawn from normal populations with equal variances, then the value of the statistic is calculated as A) B) C) D)
.</p> .</p> .</p> .</p>
41) Which of the following are the degrees of freedom df1 and df2 for an distribution? A) (n1 − 2); (n2 − 2) B) n2(n1 − 2); n1(n2 − 2) C) (n1 − 1); (n2 − 1) D) n(n1 − 1); n(n2 − 1)
Version 1
11
42)
Which of the following is the value of x for which
?
A) 4.07 B) 5.46 C) 3.22 D) 5.39
43) <p>Which of the following is the formula for a confidence interval for the ratio of the population variances
A) B) C) D)
44) <p>A professor wants to compare the variances of scores between two sections of classes. The students in each section took the same test. The random samples yield sample variances of = 203.15 and = 474.42 for samples of n1 = 13 and n2 = 16, respectively. Which of the following is a 99% confidence interval for the ratio of the population variances? A) [0.1540, 2.7809] B) [0.1008, 2.0217] C) [0.1386, 3.0895] D) [0.0907, 1.8198]
45)
For a two-tailed test about two population variances, the null hypothesis is given by
Version 1
12
A) B) C) D)
.</p> .</p> .</p> .</p>
46) <p>The result of placing a larger sample variance in the numerator of the statistic allows us to
test
A) focus only on the right tail of the distribution. B) <p>arrive at a more accurate statistic value.</p> C) focus only on the left tail of the distribution. D) determine if the distribution is symmetric.
47) Which of the following Excel’s functions is used to calculate the right-tailed probability for a value x on the distribution? A) F.DIST.RT(x, Deg_freedom1, Deg_freedom2, Cumulative) B) F.DIST.RT(x, n1, n2) C) F.DIST.RT(x, n1−1, n2−2) D) F.DIST.RT(x, Deg_freedom1, Deg_freedom2)
48) Which of the following R functions is used to obtain a right-tail probability for a value x of the distribution? A) pf(x, df1, df2) B) 1-pf(x, df1, df2) C) pf(x, df2, df1) D) 1-pf(x, df2, df1)
Version 1
13
49) Construct a 95% confidence interval for the ratios of two population variances. The random samples of n1= 9 and n2= 11 with sample variances of = 500 and = 250, respectively. Assume that the samples were drawn from a normal population. A) [0.50, 2.00] B) [0.52, 8.60] C) [0.25, 1.41] D) [0.44, 4.30]
50)
The following are the competing hypotheses and the relevant summary statistics . Sample 1
Sample 2
Which of the following statements is correct, regarding the assumptions for conducting the hypothesis test? A) The samples are drawn from populations that are not normally distributed. B) The values in one group are related to the values in the other group. C) The difference of the sample variances is used to test the hypotheses. D) The samples are independent and drawn from normally distributed populations.
51)
The following are the competing hypotheses and the relevant summary statistics: . Sample 1
Sample 2
Which of the following is the critical value at the 5% significance level?
Version 1
14
A) 3.02 B) 3.14 C) 3.23 D) 3.39
52)
The following are the competing hypotheses and the relevant summary statistics:
Sample 1
Sample 2
The p-value associated with the value of the test statistic is 0.3692. At the 5% significance level, which of the following conclusions is correct? A) We reject the null hypothesis and conclude the first variance is larger than the second. B) We do not reject the null hypothesis and conclude the first variance is larger than the second. C) We reject the null hypothesis and cannot conclude the first variance is larger than the second. D) We do not reject the null hypothesis and cannot conclude the first variance is larger than the second.
53) Consider the expected returns (in percent) from two investment options. Beth wants to determine if investment 1 has a lower variance. Use the following summary statistics. Investment 1
Investment 2
Which of the following are the competing hypotheses for this test?
Version 1
15
A) B) C) D)
54) Consider the expected returns (in percent) from the two investment options. Beth claims that the variances of the returns for the two investments differ. Use the following data to arrive at the results. Investment 1
2
−6
−4
10
8
7
5
6.5
Investment 2
5.2
−8
6
−2
5
9.5
−7
−4.5
Which of the following is the correct p-value? A) 0.2873 B) 0.7127 C) 0.3564 D) 0.6436
55) Consider the expected returns (in percent) from the two investment options. Beth claims that the variances of the returns for the two investments differ. Use the following data to arrive at the results. Investment 1
2
−6
−4
10
8
7
5
6.5
Investment 2
5.2
−8
6
−2
5
9.5
−7
−4.5
Test Beth’s claim at the 5% significance level. Which of the following is the correct conclusion?
Version 1
16
A) p-value = 0.7127 > α = 0.05; Beth’s claim is correct. B) p-value = 0.7127 > α = 0.05; Beth’s claim is wrong. C) p-value = 0.7127 < α = 0.05; Beth’s claim is wrong. D) p-value = 0.7127 < α = 0.05; Beth’s claim is correct.
56) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m. and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds) Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
99
Which of the following is the 95% confidence interval for the ratio of the population variances? A) [1.02, 8.55] B) [1.00, 8.73] C) [0.99, 8.83] D) [1.19, 7.34]
57) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m. and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds)
Version 1
17
Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
99
A 90% confidence interval is found to be [1.19, 7.36], where the morning is the first group and the evening is the second group. Which of the following is the correct conclusion? A) We cannot conclude the variance of the wait times for the morning hours is different from the variance of the wait times for the evening hours. B) We can conclude the variance of the wait times for the morning hours is more than the variance of the wait times for the evening hours. C) We can conclude the variance of the wait times for the evening hours is more than the variance of the wait times for the morning hours. D) We can conclude the variance of the wait times for the evening hours is equal to the variance of the wait times for the morning hours.
58) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m., and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds) Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
99
Which of the following is the correct hypotheses to determine if the variance of wait time during morning peak hours (population 1) differs from that during the evening peak hours (population 2)? A) B) C) D)
Version 1
18
59) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m. and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds) Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
99
Which of the following is the correct value of the test statistic? A) 1.72 B) 2.96 C) 1.66 D) 0.34
60) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m. and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds) Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
Which of the following is the correct approximation of the p-value?
Version 1
19
99
A) p-value is greater than 0.1. B) p-value lies between 0.025 and 0.05. C) p-value lies between 0.05 and 0.10. D) p-value is greater than 0.2.
61) Amie Jackson, a manager at Sigma travel services, makes every effort to ensure that customers attempting to make online reservations do not have to wait too long to complete the reservation process. The travel website is open for reservations 24 hours a day, and Amie regularly checks the website for the waiting time to maintain consistency in service. She uses the following independently drawn samples of wait time during two peak hours, morning 8 a.m. to 10 a.m. and evening 7 p.m. to 9 p.m., for the analysis. Assume that wait times are normally distributed. Wait Time (in seconds) Morning hours 8 97 101 115 107 129 98 96 132 118 104 123 128 95 127 112 a.m. to 10 a.m. Evening hours 7 95 92 p.m. to 9 p.m.
89
90 102 96 85 81
84 100 97
80 98 79
99
At the 10% significance level, which of the following is the correct conclusion? A) Do not reject H0. We cannot conclude that the variance of wait time during morning peak hours differs from that during the evening peak hours. B) Reject H0. We conclude that the variance of wait time during morning peak hours differs from that during the evening peak hours. C) Do not reject H0. We conclude that the variance of wait time during morning peak hours differs from that during the evening peak hours. D) Reject H0. We cannot conclude that the variance of wait time during morning peak hours differs from that during the evening peak hours.
62) A financial analyst examines the performance of two mutual funds and claims that the variances of the annual returns for the bond funds differ. To support his claim, he collects data on the annual returns (in percent) for the years 2001 through 2010. The analyst assumes that the annual returns for the two emerging market bond funds are normally distributed. Use the following summary statistics.
Version 1
20
Mutual Fund 1
Mutual Fund 2
<p>For the competing hypotheses correct approximation of the p-value?
which of the following is the
A) Less than 0.01 B) Between 0.01 and 0.025 C) Between 0.02 and 0.05 D) Between 0.05 and 0.10
63) A financial analyst examines the performance of two mutual funds and claims that the variances of the annual returns for the bond funds differ. To support his claim, he collects data on the annual returns (in percent) for the years 2001 through 2010. The analyst assumes that the annual returns for the two emerging market bond funds are normally distributed. Use the following summary statistics. Mutual Fund 1
Mutual Fund 2
At α = 0.10, is the analyst’s
<p>The competing hypotheses are claim supported by the data? A) No, the p-value < α = 0.10. B) Yes, the p-value > α = 0.10. C) No, the p-value > α = 0.10. D) Yes, the p-value < α = 0.10.
64) A financial analyst examines the performance of two mutual funds and claims that the variances of the annual returns for the bond funds differ. To support his claim, he collects data on the annual returns (in percent) for the years 2001 through 2010. The analyst assumes that the annual returns for the two emerging market bond funds are normally distributed. Use the following summary statistics. Mutual Fund 1
Version 1
Mutual Fund 2
21
<p>The competing hypotheses are critical F value at the 10% significance level?
Which of the following is the
A) F0.10,(9,9) = 2.44 B) F0.05,(10,10) = 2.98 C) F0.05,(9,9) = 3.18 D) F0.10,(10,10) = 2.32
65) A financial analyst examines the performance of two mutual funds and claims that the variances of the annual returns for the bond funds differ. To support his claim, he collects data on the annual returns (in percent) for the years 2001 through 2010. The analyst assumes that the annual returns for the two emerging market bond funds are normally distributed. Use the following summary statistics. Mutual Fund 1
Mutual Fund 2
<p>The competing hypotheses are claim supported by the data using the critical value approach?
At α = 0.10, is the analyst’s
A) No, because the value of the test statistic is less than the critical F value. B) Yes, because the value of the test statistic is less than the critical F value. C) Yes, because the value of the test statistic is greater than the critical F value. D) No, because the value of the test statistic is greater than the critical F value.
66) A financial analyst maintains that the risk, measured by the variance, of investing in emerging markets is more than 280(%)2. Data on 20 stocks from emerging markets revealed the following sample results: = 12.1% and s2 = 361(%)2. Assume that the returns are normally distributed. Which of the following are appropriate hypotheses to test the analyst’s claim? A) B) C) D)
Version 1
22
67) A financial analyst maintains that the risk, measured by the variance, of investing in emerging markets is more than 280(%)2. Data on 20 stocks from emerging markets revealed the following sample results: = 12.1% and s2 = 361(%)2. Assume that the returns are normally distributed. Which of the following is the correct value of the test statistic? A) B) C) D)
= 36.19</p> = 24.50</p> = 24.50</p> = 36.19</p>
68) A financial analyst maintains that the risk, measured by the variance, of investing in emerging markets is more than 280(%)2. Data on 20 stocks from emerging markets revealed the following sample results: = 12.1% and s2 = 361(%)2. Assume that the returns are normally distributed. Which of the following is the correct conclusion of the financial analyst’s claim? A) Do not reject H0. The financial analyst’s claim is supported by the sample data at 1% significance level. B) Reject H0. The financial analyst’s claim is not supported by the sample data at 1% significance level. C) Do not reject H0. The financial analyst’s claim is not supported by the sample data at 1% significance level. D) Reject H0. The financial analyst’s claim is supported by the sample data at 1% significance level.
69) Suppose Apple would like to test the hypothesis that the standard deviation for the webbrowsing battery life of the iPad is different from 3.0 hours. The following data represents the battery life, in hours, experienced by a random sample of six users: 10 14 12 16 10 6. Which of the following would be the correct hypothesis test?
Version 1
23
A) H0:σ2 ≤ 3.0, HA:σ2 > 3.0 B) H0:σ = 3.0, HA:σ ≠ 3.0 C) H0:σ2 ≥ 9.0, HA:σ2 < 9.0 D) H0:σ2 = 9.0, HA:σ2 ≠ 9.0
70) Suppose Apple would like to test the hypothesis that the standard deviation for the webbrowsing battery life of the iPad is different from 3.0 hours. The following data represents the battery life, in hours, experienced by a random sample of six users: 10 14 12 16 10 6. A 95% confidence interval for variance of the battery life is given by [4.78, 73.79]. What is the decision and conclusion? A) Reject the null hypothesis, we can conclude the standard deviation is different from 3 hours. B) Reject the null hypothesis, we cannot conclude the standard deviation is different from 3 hours. C) Do not reject the null hypothesis, we can conclude the standard deviation is different from 3 hours. D) Do not reject the null hypothesis, we cannot conclude the standard deviation is different from 3 hours.
71) Annual growth rates for individual firms in the toy industry tend to fluctuate dramatically, depending on consumers’ tastes and current fads. Consider the following growth rates (in percent) for two companies in this industry, Hasbro and Mattel. Year
Hasbro
Mattel
2005
3.0
1.5
2006
2.1
9.1
2007
21.8
5.7
2008
4.8
−0.1
2009
1.2
−8.2
Which of the following are the null and alternative hypotheses in order to determine if the variance of growth rates differs for the two firms?
Version 1
24
A) B) C) D)
72) Annual growth rates for individual firms in the toy industry tend to fluctuate dramatically, depending on consumers’ tastes and current fads. Consider the following growth rates (in percent) for two companies in this industry, Hasbro and Mattel. Year
Hasbro
Mattel
2005
3.0
1.5
2006
2.1
9.1
2007
21.8
5.7
2008
4.8
−0.1
2009
1.2
−8.2
Which of the following is(are) the critical value(s) at α = 0.05? A) 9.60 and 0.10 B) 0.10 C) 9.60 D) 2.60 and 9.60
73) Annual growth rates for individual firms in the toy industry tend to fluctuate dramatically, depending on consumers' tastes and current fads. Consider the following growth rates (in percent) for two companies in this industry, Hasbro and Mattel. Year
Hasbro
Mattel
2005
3.0
1.5
2006
2.1
9.1
2007
21.8
5.7
2008
4.8
−0.1
2009
1.2
−8.2
Which of the following is the correct conclusion?
Version 1
25
A) Do not reject H0. We can conclude the growth rate variance of Hasbro is different from the growth rate variance of Mattel. B) Reject H0. We can conclude the growth rate variance of Hasbro is different from the growth rate variance of Mattel. C) Do not reject H0. We cannot conclude the growth rate variance of Hasbro is different from the growth rate variance of Mattel. D) Reject H0. We cannot conclude the growth rate variance of Hasbro is different from the growth rate variance of Mattel.
74) Edmunds.com would like to test the hypothesis that the standard deviation for the age of an imported car on the road is greater than the standard deviation for the age of a domestic car. The following data shows the ages of a random sample of import and domestic cars. Import
1
3
4
5
4
5
10
Domestic
4
5
4
6
5
4
6
Which of the following would be the correct hypothesis test? A) B) C) D)
75) Edmunds.com would like to test the hypothesis that the standard deviation for the age of an imported car on the road is greater than the standard deviation for the age of a domestic car. The following data shows the ages of a random sample of import and domestic cars. Import
1
3
4
5
4
5
10
Domestic
4
5
4
6
5
4
6
Which of the following would be the correct value of the test statistic? A) 9.41 B) 5.23 C) 6.75 D) 10.56
Version 1
26
76) Which of the following is NOT an example of making inferences concerning the population variance? A) Evaluate the variability of the cost of insurance premiums. B) Evaluate whether the standard deviation of returns for the mutual fund exceeds 10%. C) Evaluate whether the average diameter of pizza made is within the 12” specification. D) Evaluate the dispersion of exam performance in a Business Statistics class.
77)
Which of the following is FALSE regarding the given chi-square distributions?
A) The green curve has the highest degree of freedom. B) The red curve has a higher degree of freedom than the blue curve. C) The red curve is less positively skewed than the blue curve. D) The blue curve has the highest degree of freedom.
78) Which of the following is an example of making inferences concerning the ratio of two population variances?
Version 1
27
A) Compare two sections of a Business Statistics class based on the average exam performance B) Compare the relative variability of fatality rate of COVID-19 between males and females. C) Compare the amount of online purchases before and after the training program. D) Compare the effectiveness of a drug between the treatment and control groups.
79)
Which of the following is TRUE regarding the given F distributions?
A) The red curve is more negatively skewed than the blue curve. B) The two degrees of freedom of the blue curve is greater than that of the brown curve. C) The brown curve has the largest degrees of freedom among the three curves. D) The red curve has the largest degrees of freedom among the three curves.
80) Find the value x for which a. P( ≥ x) = 0.005. b. P( > x) = 0.025. c. P( < x) = 0.005. d. P( < x) = 0.025.
Version 1
28
81) Find the value x for which a. P( ≥ x) = 0.10. b. P( < x) = 0.10. c. P( ≥ x) = 0.05. d. P( < x) = 0.05.
82) <p>Find and under the following scenarios. a. A 90% confidence level with n = 10. b. A 95% confidence level with n = 15. c. A 99% confidence level with n = 20.
83) A random sample of 21 pages is used to estimate the population variance of the number of typographical errors in a book. The sample mean and sample standard deviation are calculated as 7.34 and 5.11, respectively. Assume that the population is normally distributed. a. Construct a 95% interval estimate of the population variance. b. Construct a 99% interval estimate of the population variance. c. Use your answers to discuss the impact of the confidence level on the width of the interval.
Version 1
29
84) The following data, drawn from a normal population, are the waiting times (in minutes) of 12 randomly selected customers at a travel desk. 1.8 2.4 2.7 2.1 1.3 1.7 1.5 1.5 2.3 2.8 2.5 1.9 a. Calculate the point estimates of the population variance and the population standard deviation. b. Compute a 90% confidence intervals for the population variance and the population standard deviation.
85) The following table shows the annual returns (in percent) for the Fidelity Utilities fund from 2006 through 2010. Year
Utilities
2006
30.52
2007
10.83
2008
−34.55
2009
11.08
2010
17.33
a. Calculate the point estimate of σ2. b. Construct a 95% confidence interval of σ2.
86) The sample mean and the sample standard deviation of annual returns on a stock over a 12-year period as computed by a stock analyst were 18% and 12%, respectively. The analyst wants to know if the risk, as measured by the standard deviation, differs from 15%.
a. Construct a 90% confidence interval of the population variance and the population standard deviation. b. What assumption is required in constructing the confidence interval? c. Based on the above confidence interval, does the risk differ from 15%?
Version 1
30
87) Consider the following hypotheses: H0:σ2 = 169, HA:σ2 ≠ 169 A sample variance of 98 is obtained from a random sample of 10 observations drawn from a normally distributed population. At 1% significance level, conduct a test for the population variance using the critical value approach.
88) Consider the hypotheses H0: σ2 = 169, HA:σ2 ≠ 169. A sample variance of 98 is obtained from a random sample of 10 observations drawn from a normally distributed population. At 1% significance level, conduct a test for the population variance using the p-value approach.
89) Consider the hypotheses H0:σ2 ≤ 450, HA:σ2 > 450. Approximate the p-value based on the following sample information, where the sample is drawn from a normally distributed population. a. s2 = 500; n = 15 b. s2= 600; n = 10 c. Which of the above sample information enables us to reject the null hypothesis at α = 0.10?
Version 1
31
90) A random sample of 10 homes sold during last month in the city of Fontana had the following number of bedrooms in each house: 4, 2, 3, 5, 4, 3, 3, 4, 5, 3. At α = 0.01, test the claim that the variance of a normally distributed population is less than 1 using a. the p-value approach. b. the critical value approach.
91) The sample standard deviation of the monthly sales (in thousands of dollars) of a telecommunications firm in the United States for two years, 2010 and 2011, is computed as 6.7. Assuming that the sales data are drawn from a normally distributed population, conduct the following hypothesis tests for the population variance. Use the critical value approach at α = 0.05. a. H0:σ2 ≤ 25, HA:σ2 > 25 b. H0:σ2 = 36, HA:σ2 ≠ 36
92) The supervisor of an automobile sales and service industry would like the variance of the number of automobiles serviced not to exceed five units-squared. He would recruit one more service executive if the variance exceeds this threshold. In a recent random sample of 14 services, he computed the sample standard deviation as 3.3. He believes that the automobile services are normally distributed. a. State the null and the alternative hypotheses to test if the threshold has been crossed. b. Use the critical value approach to conduct the test at α = 0.01. c. What should the supervisor do?
Version 1
32
93) The administrator of a college is concerned about the students using cell phones for texting during the class hours. She randomly selects 25 students to check if the variance of the number of students in the college using cell phones during the class hours is less than 12. She arrives at a sample variance of 7. She assumes the distribution to be normal. a. State the appropriate null and alternative hypotheses for the test. b. Compute the value of the test statistic. c. Use the critical value approach to test the administrator’s concern at α = 0.10. d. Repeat the test at α = 0.01.
94) (Use Excel) The following are the prices (in $1,000s) of 20 houses sold recently in Vancouver, Washington. A real estate agent believes that the standard deviation of house prices is less than 70 units, where each unit equals $1,000. Assume house prices are normally distributed. 138,9
197.1
205
214
180
174.9
143.5
265
210
170
139.9
204.9
299.9
124.9
114
134.9
249.9
159
120
164.9
a. State the null and the alternative hypotheses for the test. b. Calculate the value of the test statistic. c. Use Excel’s function (either CHISQ.DIST.RT or CHISQ.DIST) to calculate the p-value. d. At α = 0.10, what is the conclusion? Is the agent’s claim supported by the data?
95) Find the value x for which a. P(F(6.4) ≥ x) = 0.01. b. P(F(6.4) < x) = 0.01. c. P(F(7.9) ≤ x) = 0.10. d. P(F(6.4) < x) = 0.10.
Version 1
33
96) Use the F table to approximate these probabilities. a. P(F(8,12) ≥ 2.85) b. P(F(8,12) < 0.4) c. P(F(5,5) ≥ 7.15) d. P(F(5,5) < 0.09)
97) Construct a 95% interval estimate of the ratio of the population variances using the following results from two independently drawn samples from normally distributed populations Sample 1
Sample 2
98) The following are the measures based on independently drawn samples from normally distributed populations. Sample 1
Sample 2
a. Construct a 90% interval estimate of the ratio of the population variances. b. Test if the ratio of the population variances differs from Sample 1, using the computed confidence interval, at the 10% significance level.
Version 1
34
99) To estimate the ratio of the population variances of the monthly closing stock prices (in $) of two emerging market funds, following summary measures are obtained using the sample data for eight months. The two samples are drawn independently from normally distributed populations. Emerging Market Fund 1
Emerging Market Fund 2
Construct a 90% interval estimate of the ratio of the population variances.
100) The following table shows the heights (in inches) of individuals recorded recently for two families, the Williamses and the Thomases, in Chicago, Illinois: Williams Family
Thomas Family
71
72
60
62
62
68
73
61
63
73 65
a. Construct a 95% interval estimate of the ratio of the population variances. b. Using the computed confidence interval, test if the ratio of the population variances differs from one at the 5% significance level. Explain.
101) <p>Using the critical value approach, conduct the following hypothesis test at the 1% significance level: Use the following results from two independently drawn samples from normally distributed populations. Sample 1
Version 1
Sample 2
35
102)
<p>Consider the following competing hypotheses and relevant summary statistics: Use the following results from two independently drawn samples from normally distributed populations. Sample 1
Sample 2
a. Using the critical value approach, conduct this hypothesis test at the 5% significance level. What are the assumptions? b. Confirm the conclusions by using the p-value approach.
103) A researcher compares the returns for two mutual funds, Smith, Incorporated and Campbell, Incorporated to determine which of the two has a higher risk rate. She assumes that the returns are normally distributed. Implement the critical value approach at a significance level of 10%. The sample descriptive measures are given below. Smith, Incorporated
Campbell, Incorporated
104) Use the p-value approach to conduct the following left-tailed hypothesis test at the 10% significance level. Use the following results from two independently drawn samples from normally distributed populations: Sample 1
Version 1
Sample 2
36
105) The following table shows the annual returns (in percent) for Fidelity’s Technology and Communications funds for the years 2006 through 2010. Year
Technology
Communications
2006
9.45
2.25
2007
22.44
9.78
2008
−48.52
−48.49
2009
83.17
80.7
2010
23.76
27.7
Use the critical value approach to test if the population variances differ at the 10% significance level.
106) <p>A supermarket has just added a new cash register to reduce the waiting times of the customers during weekends. Because a new cash register has been added, the customers expect their waiting time to be less than the waiting time had been before the cash register was added. Suppose the sample variance of the waiting times for 25 customers before adding the new cash register is minutes, while the sample variance of the waiting times for 25 customers after adding the cash register is minutes. The manager of the supermarket believes that the waiting times are normally distributed and that the two samples are drawn independently. a. Develop the hypotheses to test whether the ratio of two population variances is less than 1. b. Calculate the appropriate test statistic. c. Using the p-value approach, what is the decision at the 5% significance level? d. What is the conclusion? Given that all other criteria are satisfied, should the supermarket continue with the new cash register?
Version 1
37
107) Two students, Mary and Joanna, are in a statistics class working hard on the consistency of their performance in the weekly class tests. In particular, they are hoping to bring down the variance of their test scores. Their professor believes that the students are not equally consistent in their tests. Over a 12-week period, the test scores of these two students are shown below. Assume that the two samples are drawn independently from normally distributed populations. Mary
32
48
46
30
47
32
33
32
46
28
35
37
Joanna
39
44
37
45
42
44
32
33
31
35
41
40
a. Develop the hypotheses to test whether the two students differ in consistency. b. Use the critical value approach to test the professor’s claim at α = 0.05.
108) The monthly adjusted (for dividends and splits) closing stock prices (rounded to the nearest dollar) for Gladstone Commercial Corporated and Kimco Realty Corporated for the last six months of 2010 are reported in the table below. Month
Gladstone Commercial Corporated
Kimco Realty Corporated
July
$16
$15
August
$15
$14
September
$16
$15
October
$17
$17
November
$17
$16
December
$18
$18
a. State the null and the alternative hypotheses to determine if the variance of price differs for the two firms. b. What assumption regarding the population is necessary to implement this step? c. Use Excel’s F.TEST function to calculate the p-value. d. At α = 0.05 what is your conclusion?
Version 1
38
109) The sales price (in $1,000) of three-bedroom apartments in two cities—Houston, Texas, and Orlando, Florida—are given in the following table. Houston is known to have lower house prices than Orlando; however, it is not clear if it also has lesser variability in its prices. Houston
189
145
220
205
225
275
248
310
150
318
Orlando
301
279
224
400
249
205
289
350
218
300
a. State the null and the alternative hypotheses to determine if the variance of house price in Houston is less than that in Orlando. b. What assumption regarding the population is necessary to implement this step? c. Test the hypothesis at α = 0.05. What is your conclusion?
110) ABC Bakery receives complaints from customers that their bagels are inconsistent in size. The complaints would be valid if the standard deviation of the diameter of the bagels differs from 0.5”. The baker measures the diameter of 50 bagels. Assume that the diameters are normally distributed. A portion of the data is shown below: Diameter (inches) 6.25 : 5.84
Version 1
39
{MISSING IMAGE}Click here for the Excel Data File a. State the appropriate hypotheses for the variance test. b. Calculate the value of the test statistic. c. Calculate the p-value. d. At the 10% significance level, should the bakery be concerned? Explain.
111) To test the theory that students in face-to-face classes have a lower variability of exam performance than their online counterparts, you collect students’ exam scores in face-to-face and online sections of a Business Statistics class. One hundred scores from the face-to-face section and seventy scores from the online section are drawn independently from normally distributed populations. A portion of the data is shown below: Exam Score
Sectio (1-face-toface; 2-online)
80.17
1
:
:
79.42
2
:
:
85.17
1
{MISSING IMAGE}Click here for the Excel Data File a. State the appropriate hypotheses for the variance test. b. Calculate the value of the test statistic. c. Calculate the p-value. d. At the 10% significance level, what is your conclusion? Explain.
112)
Statistical inference for σ2 is based on the F distribution. ⊚ true ⊚ false
Version 1
40
113) The population variance is one of the most widely used quantitative measures of risk in investments. ⊚ true ⊚ false
114)
The skewness of the chi-square distribution depends on the degrees of freedom. ⊚ true ⊚ false
115)
<p>The values of the ⊚ true ⊚ false
distribution range from negative infinity to infinity.
116) The formula for the confidence interval of the population variance σ2 is valid for the random samples drawn from any population. ⊚ true ⊚ false
The value of the test statistic for the hypothesis test of the population variance, σ2 is
117)
computed as = ⊚ true ⊚ false
118)
.
<p>The null hypothesis
is rejected if the value of the test statistic exceeds
.
Version 1
41
⊚ ⊚
119)
true false
<p>Inference for two population variances is done through their difference ⊚ true ⊚ false
120) <p>The estimator of variances is . ⊚ true ⊚ false
.
used in the inference regarding the ratio of two population
121)
<p>The sampling distribution of ⊚ true ⊚ false
122)
<p>The ⊚ true ⊚ false
is the<em>
distribution.
distribution is negatively skewed.
123) The formula for constructing the confidence interval for the ratio of two population variances is based on the assumption that the sample variances are computed from independently drawn samples from two non-normally distributed populations. ⊚ true ⊚ false
124) <p>A right-tailed test for the ratio of two population variances whether is greater than . Version 1
examines
42
⊚ ⊚
true false
125)
<p>It is preferable to place the smaller sample variance in the numerator of the statistic. ⊚ true ⊚ false
126)
<p>For a test about the ratio of two population variances, the test statistic is given by . ⊚ ⊚
Version 1
true false
43
Answer Key Test name: Chap 11_4e 1) normal 2) left 3) normally 4) pchisq 5) CHISQ.DIST.RT 6) two 7) positively 8) independently 9) pf 10) F.DIST.RT 11) D 12) C 13) B 14) B 15) A 16) C 17) C 18) B 19) B 20) A 21) D 22) B 23) C 24) D 25) C 26) C Version 1
44
27) D 28) A 29) C 30) D 31) D 32) A 33) B 34) A 35) D 36) C 37) A 38) C 39) C 40) B 41) C 42) A 43) D 44) B 45) D 46) A 47) D 48) B 49) B 50) D 51) D 52) D 53) A 54) B 55) B 56) C Version 1
45
57) B 58) C 59) B 60) C 61) B 62) C 63) D 64) C 65) C 66) A 67) B 68) C 69) D 70) D 71) C 72) A 73) C 74) B 75) A 76) C 77) D 78) B 79) C 80) a. 12.838 b. 9.348 c. 0.072 d. 0.216 81) a. 24.769 b. 10.085 c. 35.172 d. 13.091
Version 1
46
82) a. 16.919 and 3.325 b. 26.119 and 5.629 c. 38.582 and 6.844 83) a. [15.28, 54.45] b. [13.06, 70.25] c. The width of the interval increases with the confidence level. 84) a. s2 = 0.25, s = 0.5 b. [0.14, 0.60] and [0.37, 0.77] 85) a. s2 = 604.40 b. [216.96, 4990.74] 86) a. [80.51, 346.24] and [8.97, 18.61] b. The annual return is normally distributed. c. The risk does not differ from 15%. 87) The test statistic is 5.21 and the critical values are 2.70 and 19.02. Do not reject the null hypothesis. We cannot conclude the population variance is different from 169. 88) The test statistic is 5.21 and the approximate p-value is 0.2479. Do not reject the null hypothesis. We cannot conclude the population variance is different from 169. 89) a. p-value is greater than 0.10. b. p-value is greater than 0.10. c. Neither. 90) a. The test statistic is 8.4 and the critical value is 2.09. Do not reject the null hypothesis. We cannot conclude the variance of the number of bedrooms is less than 1. b. The test statistic is 8.4 and the p-value is 0.5056. Do not reject the null hypothesis. We cannot conclude the variance of the number of bedrooms is less than 1. 91) a. The test statistic is 41.30 and the critical value is 35.17. Reject the null hypothesis. We can conclude the variance of the monthly sales is greater than 25. b. The test statistic is 28.70 and the p-value is approximately 0.3823. Do not reject the null hypothesis. We cannot conclude the variance of the monthly sales is different from 36. 92) a. H0:σ2 ≤ 5, HA:σ2 > 5 b.The test statistic is 28.31 and the critical value is 22.36. Reject H0. We can conclude the variance of the number of automobiles serviced is greater than 5. c. Since the null hypothesis is rejected, the supervisor should recruit one more service executive. 93) a.H0:σ2 ≥ 12,HA:σ2 < 12 b. c. The critical value is 33.20. RejectH0; we can conclude the variance of the number of students in the college using cell phones during the class hours is less than 12. d. The critical value is 4.98. Do not reject H0; we cannot conclude the variance of the number of students in the college using cell phones during the class hours is less than 12.
Version 1
47
94) a. H0:σ2 ≥ 4,900, HA:σ2 < 4,900 b. c. 0.0447 d. Reject H0. The agent’s claim is supported by the data. 95) a. 15.21 b. 0.11 c. 2.51 d. 0.37 96) a. 0.05 b. 0.10 c. 0.025 d. 0.10 97) [0.39, 3.68]; we are 95% confidence the ratio of the population variances is between [0.39, 3.68]. Since the interval contains 1, we cannot conclude the variances are different. 98) a. [0.60, 2.54] b. Ratio of the population variances does not differ from Sample 1. 99) [0.52, 7.42] 100) a. [0.18, 12.44] b. Since the confidence interval contains 1, do not reject the null hypothesis. We cannot conclude the population variances are different. 101) The test statistic is 2.23 and the p-value is 0.1737. Do not reject : we cannot conclude the first variance is larger than the second variance. 102) a. We assume that the samples are independently drawn from normally distributed populations. The test statistic is 1.36 and the critical value is 2.836. Do not reject H0: we cannot conclude the variances are different. b. The p-value is 0.5568. Do not reject H0: we cannot conclude the variances are different. 103) We reject the null hypothesis since F(11,11) = 8.66 exceeds the critical value F0.1,(11,11) = 2.23. Reject the null hypothesis: we can conclude the risk rate is higher for Smith, Inc. than Campbell, Inc. 104) The test statistic is 29.55 and the p-value is < 0.0001. Reject the null hypothesis; we can conclude the first variance is larger than the second variance. 105) <p>The competing hypotheses are The p-value is 0.9907. Do not reject the null hypothesis; we cannot conclude that population variances differ. 106) <p>a. b. Fα,(df1,df2) = 2.18 c. The p-value is 0.0309. The decision Reject H0 if the p-value < α = 0.05 d. Yes, the supermarket should keep its new cash register.
Version 1
48
107) <p>a. b. The p-value is 0.2664. Do not reject the null hypothesis. We cannot conclude the variances of the test scores are different. 108) <p>a. ; because b. Monthly adjusted closing stock prices are normally distributed, the stock prices of Gladstone are independent of Kimco. c. The p-value = 0.4747 d. Do not reject the null hypothesis. We cannot conclude the variances of the closing stock price are different. 109) <p>a. b. House prices are normally distributed. The home prices in Houston are independent of the home prices in Orlando. c. The p-value is 0.5049. Do not reject H0. We cannot conclude that the variance of the house prices in Houston is less than that of Orlando. 110) a. H0: = 0.25; HA: ≠ 0.25 b. The test statistic = 4.4584 c. The p-value = 2 d. The bakery should not be concerned because a p-value of 2 is greater than the significance level of 0.1. The null hypothesis is not rejected. 111) a. H0: / ≤ 1, HA: / >1 b. The test statistic = 1.9577 c. The p-value = 0.0011 d. We can reject the null hypotheses because the p-value = 0.0011 is less than the 10% significance level. This means that the variance of the online exam scores is greater than the face-to-face scores.
112) FALSE 113) TRUE 114) TRUE 115) FALSE 116) FALSE 117) TRUE 118) TRUE 119) FALSE 120) TRUE
Version 1
49
121) FALSE 122) FALSE 123) FALSE 124) TRUE 125) FALSE 126) TRUE
Version 1
50
CHAPTER 12 – 4e 1)
For a multinomial experiment, which of the following is not true? A) The number of categories is at least two, k ≥ 2. B) The trials are dependent. C) The sum of the cell probabilities is P1 + P2 + ... + Pk = 1. D) The category probabilities are the same for each trial.
2) For the goodness-of-fit test, the expected category frequencies are found using the _________________________. A) sample proportions B) proportions specified under the null hypothesis C) average of the hypothesized and sample proportions D) proportions specified under the alternative hypothesis
3) Which of the following null hypotheses is used to test if five population proportions are the same? A) H0:p1 = p2 = p3 = p4 = p5 = 0.25 B) H0:p1 = p2 = p3 = p4 = 0.25 C) H0:p1 = p2 = p3 = p4 = 0.20 D) H0:p1 = p2 = p3 = p4 = p5 = 0.20
4)
For the goodness-of-fit test, the sum of the expected frequencies must equal ___. A) 1 B) n C) k D) k−1
Version 1
1
5)
For the goodness-of-fit test, the chi-square test statistic will __________________. A) always equal zero B) always be negative C) be at least zero D) always be equal to n
6) The chi-square test of a contingency table is a test of independence for __________________. A) a single qualitative variable B) two qualitative variables C) two quantitative variables D) three or more quantitative variables
7) For the chi-square test of a contingency table, the expected cell frequencies are found as ________________. A) the row total multiplied by the column total divided by the sample size B) the observed cell frequency C) (r−1)(c−1) D) rc
8) The chi-square test of a contingency table is valid when the expected cell frequencies are _______________. A) equal to 0 B) more than 0 but less than 5 C) at least 5 D) negative
Version 1
2
9)
<p>For the chi-square test of a contingency table, the expected cell frequencies are found
as
which is the same as ___________________________. A) the observed cell frequencies B) the cell probability multiplied by the sample size C) the row total D) the column total
10)
What are the degrees of freedom for the goodness-of-fit test for normality? A) 2 B) k−3 C) k−2 D) k−1
11) For the chi-square test for normality, the expected frequencies for each interval must be __________. A) exactly 2 B) k−3 C) at least 5 D) k−1
12) For the goodness-of-fit test for normality, the null and alternative hypotheses are _____________________.
Version 1
3
A) H0: Data does not follow a normal distribution, HA: Data follows a normal distribution B) H0: Data follows a normal distribution, HA: Data does not follow a normal distribution C) H0: Data follows a normal distribution, HA: Data are skewed right D) H0: Data follows a normal distribution, HA: Data are skewed left
13) For the goodness-of-fit test for normality to be applied, what is the minimum number of qualitative intervals the quantitative data can be converted to? A) 2 B) 4 C) 5 D) 10
14)
The calculation of the Jarque-Bera test statistic involves ___________. A) only the sample size B) the sample size, standard deviation, and average C) the sample size, skewness coefficient, and kurtosis coefficient D) the sample average, skewness coefficient, and kurtosis coefficient
15) For the Jarque-Bera test for normality, the null and alternative hypotheses are __________________________. A) H0:S < 0 and K > 0, HA:S < 0 or K < 0 B) H0:S < 0 and K = 0, HA:S > 0 or K ≠ 0 C) H0:S = 0 and K = 0, HA:S ≠ 0 or K ≠ 0 D) H0:S = 0 and K > 0, HA:S ≠ 0 or K < 0
Version 1
4
16) Packaged candies have three different types of colors. Suppose you want to determine if the population proportion of each color is the same. The most appropriate test is the ______________________________________. A) goodness-of-fit test for a multinomial experiment B) chi-square test for independence C) goodness-of-fit test for normality D) Jarque-Bera test for normality
17) Suppose you want to determine if gender and major are independent. Which of the following tests should you use? A) Goodness-of-fit test for a multinomial experiment B) Chi-square test for independence C) Goodness-of-fit test for normality D) Jarque-Bera test for normality
18) Suppose you want to determine if the quarterly returns for mutual funds have a normal distribution when your available data is partitioned into some non-overlapping. The most appropriate test is the _____________________________________. A) goodness-of-fit test for a multinomial experiment B) chi-square test for independence C) goodness-of-fit test for normality D) Jarque-Bera test for normality
19) Suppose you want to determine if the quarterly returns for mutual funds have a normal distribution using the skewness and kurtosis coefficients. The most appropriate test is the _________________________________.
Version 1
5
A) goodness-of-fit test for a multinomial experiment B) chi-square test for independence C) goodness-of-fit test for normality D) Jarque-Bera test for normality
20) If a test statistic has a value of X and is assumed to be distributed with df degrees of freedom, then the p-value for a right-tailed test found by using Excel’s command ________________________. A) ‘=CHISQ.DIST.RT(X, Deg_freedom)’ B) ‘=CHISQ.DIST.RT(Deg_freedom, X)’ C) ‘=1-CHISQ.DIST.RT(X, Deg_freedom)’ D) ‘=1-CHISQ.DIST.RT(Deg_freedom, X)’
21) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
To test if the poker-dealing machine deals cards at random, the null and alternative hypotheses are ________________________________________. A) H0:p1 = p2 = p3 = p4 = 0, HA: Not all population proportions are equal to 0,25 B) H0:p1 = p2 = p3 = p4 = 0.25,HA: Not all population proportions are equal to 0,25 C) H0:p1 = p2 = p3 = p4 = 1, HA: Not all population proportions are equal to 0,25 D) H0:p1 = p2 = p3 = p4 = 0.20, HA: Not all population proportions are equal to 0,20
Version 1
6
22) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
For the goodness-of-fit test, the degrees of freedom are ____. A) 2 B) 3 C) 4 D) 5
23) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
395
Clubs
370
Diamonds
425
For the goodness-of-fit test, the value of the test statistic is _____. A) 3.25 B) 4.125 C) 7.45 D) 8.815
Version 1
7
24) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
For the goodness-of-fit test, the value of the test statistic is _____. A) 2.25 B) 3.125 C) 6.45 D) 7.815
25) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
The p-value is _______________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
Version 1
8
26) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
Using the p-value approach and α = 0.05, the decision and conclusion are _________________________________. A) do not reject the null hypothesis; all of the population proportions are the same B) reject the null hypothesis; conclude that not all proportions are equal to 0.20 C) reject the null hypothesis; conclude that not all proportions are equal to 0.25 D) do not reject the null hypothesis; we cannot conclude that not all of the proportions are equal to 0.25
27) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
At the 5% significance level, the critical value is ______. A) 6.251 B) 7.815 C) 9.348 D) 11.345
Version 1
9
28) A card-dealing machine deals spades (1), hearts (2), clubs (3), and diamonds (4) at random as if from an infinite deck. In a randomness check, 1,600 cards were dealt and counted. The results are shown below. Suit
Observed
Spades
410
Hearts
405
Clubs
370
Diamonds
415
Using the critical value approach, the decision and conclusion are _________________________________________. A) do not reject the null hypothesis; we cannot conclude that not all of the proportions are equal to 0.25 B) do not reject the null hypothesis; all of the population proportions are the same C) reject the null hypothesis; conclude that not all proportions are equal to 0.25 D) reject the null hypothesis; conclude that not all proportions are equal to 0.20
29) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
For the goodness-of-fit test, the assumed degrees of freedom are _____.
Version 1
10
A) 2 B) 3 C) 4 D) 5
30) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
For the goodness-of-fit test, the alternative hypothesis states that __________________________________________. A) HA: Not all population proportions are equal to 0,20 B) HA: At least one of the population proportions is different from its hypothesized value C) HA: Not all population proportions are the same D) HA: Not all population proportions are equal to 0.15
31) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Version 1
Observed
Proportion
11
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
Which of the following is the value of the goodness-of-fit test statistic? A) 3.08 B) 15.09 C) 15.64 D) 16.75
32) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
The p-value is ______________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
Version 1
12
33) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
Using the p-value approach and α = 0.01, the decision and conclusion are ___________________________________. A) do not reject the null hypothesis; all proportions are equal to 0.20 B) do not reject the null hypothesis; we cannot conclude that not all of the proportions are the same C) reject the null hypothesis; at least one of the proportions is different from its hypothesized value D) reject the null hypothesis; all of the proportions are not the same
34) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
392
0.15
2
375
0.14
3
405
0.15
4
787
0.29
5
346
0.13
6
430
0.14
At the 5% significance level, the critical value is _______. Version 1
13
A) 9.236 B) 12.833 C) 15.086 D) 11.07
35) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students who participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
At the 1% significance level, the critical value is _______. A) 9.236 B) 11.070 C) 12.833 D) 15.086
36) A university has six colleges and takes a poll to gauge student support for a tuition increase. The university wants to ensure each college is represented fairly. The below table shows the observed number of students that participate in the poll from each college and the actual proportion of students in each college. College
Observed
Proportion
1
457
0.20
2
206
0.08
Version 1
14
3
301
0.13
4
792
0.29
5
336
0.15
6
373
0.15
Using the critical value approach, the decision and conclusion are _________________________________________. A) reject the null hypothesis; at least one of the proportions is different from its hypothesized value B) reject the null hypothesis; all of the proportions are not the same C) do not reject the null hypothesis; all proportions are equal to 0.20 D) do not reject the null hypothesis; we cannot conclude not all of the proportions are the same
37) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
190
Tuesday
200
Wednesday
198
Thursday
183
Friday
229
For the goodness-of-fit test, what are the degrees of freedom for the chi-squared test statistic? A) 4 B) 5 C) 6 D) 7
Version 1
15
38) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
For the goodness-of-fit test, what are the degrees of freedom for the chi-squared test statistic? A) 4 B) 5 C) 6 D) 7
39) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
For the goodness-of-fit test, the null and alternative hypotheses are ________________________________________. A) H0:p1 = p2 = p3 = p4 = 1/4,HA: Not all population proportions are equal to 1/4 B) H0:p1 = p2 = p3 = p4 = p5 = 1/5,HA: Not all population proportions are equal to 1/5 C) H0:p1 = p2 = p3 = p4 = p5 = 1/4,HA: Not all population proportions are equal to 1/4 D) H0:p1 = p2 = p3 = p4 = 1/5,HA: Not all population proportions are equal to 1/5
Version 1
16
40) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
200
Tuesday
188
Wednesday
215
Thursday
188
Friday
209
Which of the following is the value of goodness-of-fit chi-square test statistic? A) 1.005 B) 1.032 C) 2.020 D) 2.970
41) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
Which of the following is the value of goodness-of-fit chi-square test statistic?
Version 1
17
A) 0.605 B) 0.632 C) 1.62 D) 2.57
42) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
The p-value is ________________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
43) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
Version 1
18
Using the p-value approach and α = 0.05, the decision and conclusion are ____________________________________. A) do not reject the null hypothesis; all of the proportions are the same B) do not reject the null hypothesis; we cannot conclude that not all of the proportions are the same C) reject the null hypothesis; not all of the proportions are the same D) reject the null hypothesis; all of the proportions are not the same
44) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week
Observed
Monday
192
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
At the 5% significance level, the critical value is _______. A) 7.779 B) 9.488 C) 11.143 D) 13.277
45) A fund manager wants to know if it is equally likely that the Dow Jones Industrial Average will go up each day of the week. For each day of the week, the fund manager observes the following number of days when the Dow Jones Industrial Average goes up. Day of the Week Monday
Version 1
Observed 192
19
Tuesday
189
Wednesday
202
Thursday
199
Friday
218
Using the critical value approach, the decision and conclusion are __________. A) reject the null hypothesis; not all of the proportions are the same B) reject the null hypothesis; all of the proportions are not the same C) do not reject the null hypothesis; all of the proportions are the same D) do not reject the null hypothesis; we cannot conclude that not all of the proportions are the same
46) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
To test that gender and candidate preference are independent, the null hypothesis is ___________________________. A) H0: Gender and candidate preference are independent B) H0: Gender and candidate preference are mutually exclusive C) H0: Gender and candidate preference are not mutually exclusive D) H0: Gender and candidate preference are dependent
47) In the following table, likely voters’ preferences of two candidates are cross-classified by gender.
Candidate A
Version 1
Male
Female
150
130
20
Candidate B
100
120
For the chi-square test of independence, the assumed degrees of freedom are ____. A) 1 B) 2 C) 3 D) 4
48) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
For the chi-square test of independence, the value of the test statistic is _______. A) 2.34 B) 1.62 C) 3.25 D) 4
49) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
The p-value is _______________.
Version 1
21
A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
50) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
Using the p-value approach and α = 0.10, the decision and conclusion are ___________________________________. A) reject the null hypothesis; gender and candidate preference are dependent B) do not reject the null hypothesis; gender and candidate preference are independent C) reject the null hypothesis; gender and candidate preference are independent D) do not reject null hypothesis; gender and candidate preference are dependent
51) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
At the 10% significance level, the critical value is _______.
Version 1
22
A) 6.635 B) 5.024 C) 3.841 D) 2.706
52) In the following table, likely voters’ preferences of two candidates are cross-classified by gender. Male
Female
Candidate A
150
130
Candidate B
100
120
Using the critical value approach, the decision and conclusion are ________________________________________. A) reject the null hypothesis; gender and candidate preference are dependent B) do not reject the null hypothesis; gender and candidate preference are independent C) reject the null hypothesis; gender and candidate preference are independent D) do not reject the null hypothesis; gender and candidate preference are dependent
53) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
Which of the following is the estimated joint probability for the "low income and 21-35 age group" cell?
Version 1
23
A) 0.0830 B) 0.0874 C) 0.0996 D) 0.1328
54) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
Which of the following is the expected joint probability for the "low income and 21-35 age group" cell assuming age group and income are independent? A) 0.0830 B) 0.0874 C) 0.0996 D) 0.1328
55) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
Version 1
24
Assuming age group and income are independent, the expected "low income and 21-35 age group" cell frequency is ________. A) 105.27 B) 107.72 C) 146.31 D) 178.42
56) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
To test that age group and income are independent, the null and alternative hypothesis are _________________________________________________________________________. A) H0: Age group and income are dependent; HA: Age group and income are independent B) H0: Age group and income are mutually exclusive; HA: Age group and income are not mutually exclusive C) H0: Age group and income are not mutually exclusive; HA: Age group and income are mutually exclusive D) H0: Age group and income are independent; HA: Age group and income are dependent
57) In the following table, individuals are cross-classified by their age group and income level.
Version 1
25
Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
For the chi-square test of independence, the degrees of freedom are __________. A) 2 B) 4 C) 9 D) 8
58) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
For the chi-square test of independence, the value of the test statistic is ________. A) 8.779 B) 10.840 C) 13.243 D) 16.159
59) In the following table, individuals are cross-classified by their age group and income level.
Version 1
26
Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
The p-value is A) Less than 0.01 B) Between 0.01 and 0.05 C) Between 0.05 and 0.10 D) Greater than 0.10
60) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
Using the p-value approach and α = 0.05, the decision and conclusion are __________________________________. A) do not reject the null hypothesis; age and income are dependent B) do not reject the null hypothesis; age and income are independent C) reject the null hypothesis; age and income are dependent D) reject the null hypothesis; age and income are independent
61) In the following table, individuals are cross-classified by their age group and income level.
Version 1
27
Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
At the 5% significance level, the critical value is _______. A) 13.277 B) 11.143 C) 9.488 D) 7.779
62) In the following table, individuals are cross-classified by their age group and income level. Income Age
Low
Medium
High
21–35
120
100
75
36–50
150
160
100
51–65
160
180
160
Using the critical value approach, the decision and conclusion are _________________________________________. A) do not reject the null hypothesis; age and income are dependent B) do not reject the null hypothesis; age and income are independent C) reject the null hypothesis; age and income are dependent D) reject the null hypothesis; age and income are independent
Version 1
28
63) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
The row total for Asians is _____. A) 86 B) 75 C) 62 D) 31
64) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
The column total for directors is _____. A) 16 B) 56 C) 73 D) 109
Version 1
29
65) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
25
19
19
11
Black
28
35
32
12
Hispanic
23
14
18
1
Asian
13
21
9
3
Assuming that race and seniority are independent, which of the following is the expected frequency of Asian directors? A) 0 B) 4.39 C) 5.34 D) 7.06
66) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
Assuming that race and seniority are independent, which of the following is the expected frequency of Asian directors?
Version 1
30
A) 0 B) 1.95 C) 3.91 D) 5.42
67) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
How many expected frequencies in each cell of the above table do not satisfy the condition for the test for independence? A) 0 B) 4 C) 3 D) 5
68) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Version 1
31
Hispanic
32
15
13
2
Asian
10
11
10
0
Which of the following is a way to ensure all expected frequencies in each cell of the above table are five or more? A) Combine the expected frequencies for races Black and Hispanic. B) Combine the expected frequencies for races Hispanic and Asian. C) Combine the expected frequencies for seniorities Manager and Director. D) Remove the expected frequencies for the Director seniority.
69) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
To test that race and seniority are independent, the null and alternative hypothesis are _______________________________________________________________________. A) H0: Race and seniority are independent; HA: Race and seniority are dependent B) H0: Race and seniority are mutually exclusive; HA: Race and seniority are not mutually exclusive C) H0: Race and seniority are not mutually exclusive; HA: Race and seniority are mutually exclusive D) H0: Race and seniority are dependent; HA: Race and seniority are independent
Version 1
32
70) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test for independence to be valid, Martha combines the seniorities Manager and Director. As a result, the degrees of freedom used are _____. A) 2 B) 16 C) 3 D) 6
71) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test of independence to be valid, Martha combines the seniorities Manager and Director. The resulting value of the test statistic is _______.
Version 1
33
A) 12.221 B) 11.293 C) 17.853 D) 20.154
72) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test of independence to be valid, Martha combines the seniorities Manager and Director. The resulting p-value is ________________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
73) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race White
Version 1
Coordinator
Analyst
Manager
Director
32
20
25
9
34
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test of independence to be valid, Martha combines the seniorities Manager and Director. Using the p-value approach and α = 0.05, the decision and conclusion are ________________________________________________________________________. A) reject the null hypothesis; conclude race and seniority are dependent B) reject the null hypothesis; conclude race and seniority are independent C) do not reject the null hypothesis; cannot conclude race and seniority are dependent D) do not reject the null hypothesis; conclude race and seniority are independent
74) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test of independence to be valid, Martha combines the seniorities Manager and Director. At the 5% significance level, the critical value is ________. A) 12.59 B) 16.92 C) 19.68 D) 21.03
Version 1
35
75) The following table shows the distribution of employees in an organization. Martha Foreman, an analyst, wants to see if race has a bearing on the position a person holds with this company. Seniority Race
Coordinator
Analyst
Manager
Director
White
32
20
25
9
Black
35
10
25
5
Hispanic
32
15
13
2
Asian
10
11
10
0
For the chi-square test of independence to be valid, Martha combines the seniorities Manager and Director. Using the critical value approach, the decision and conclusion are _______________________________________________________________. A) reject the hypothesis; conclude race and seniority are dependent B) reject the null hypothesis; conclude race and seniority are independent C) do not reject the null hypothesis; conclude race and seniority are dependent D) do not reject the null hypothesis; conclude race and seniority are independent
76) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
For the goodness-of-fit test for normality, the null and alternative hypotheses are ______________________________________________________________________.
Version 1
36
A) H0: Heights follow a normal distribution with mean 166.55 and standard deviation 12.46, HA: Heights do not follow a normal distribution with mean 166.55 and standard deviation 12.46 B) H0: Heights do not follow a normal distribution with mean 166.55 and standard deviation 12.46, HA: Heights follow a normal distribution with mean 166.55 and standard deviation 12.46 C) H0: Heights follow a normal distribution with mean 166.55 and standard deviation 12.57, HA: Heights do not follow a normal distribution with mean 166.55 and standard deviation 12.57 D) H0: Heights do not follow a normal distribution with mean 166.55 and standard deviation 12.57, HA: Heights follow a normal distribution with mean 166.55 and standard deviation 12.57
77) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expectedpi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
The heights are subdivided into five intervals. The degrees of freedom for the goodness-of-fit test for normality is ____. A) 2 B) 3 C) 4 D) 5
Version 1
37
78) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
For the heights subdivided into five intervals, the expected frequency of males that weigh less than 150 is _____. A) 5.4 B) 12.6 C) 18.6 D) 8.4
79) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
For the goodness-of-fit test for normality, suppose the value of the test statistic is 7.71. The pvalue is _____________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
Version 1
38
80) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
Using the p-value approach and α = 0.05, the decision and conclusion are ___________________________________________________________________. A) reject the null hypothesis; conclude that heights are normally distributed B) reject the null hypothesis; conclude that heights are not normally distributed C) do not reject the null hypothesis; conclude that heights are normally distributed D) do not reject the null hypothesis; conclude that heights are not normally distributed
81) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
At the 5% significance level, the critical value is ______.
Version 1
39
A) 4.605 B) 5.991 C) 7.378 D) 9.210
82) The heights (in cm) for a random sample of 60 males were measured. The sample mean is 166.55, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The following table shows the heights subdivided into non-overlapping intervals. Class
Observed
Expected pi
Height < 150
10
0.09
150 ≤ Height < 160
6
0.21
160 ≤ Height < 170
18
0.31
170 ≤ Height < 180
17
0.25
Height ≥ 180
9
0.14
Suppose the value of the test statistic is 7.71. Using the critical value approach, decision and conclusion are ____________________________________________________________________. A) reject the null hypothesis; conclude that heights are normally distributed B) reject the null hypothesis; conclude that heights are not normally distributed C) do not reject the null hypothesis; conclude that heights are normally distributed D) do not reject the null hypothesis; conclude that heights are not normally distributed
83) The heights (in cm) for a random sample of 60 male employees of S&M Construction Company were measured. The sample mean is 166.5, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The null and alternative hypotheses for the Jarque-Bera test for normality are _________________________________.
Version 1
40
A) H0:S < 0. and K > 0, HA:S > 0 and K < 0 B) H0:S < 0 and K = 0, HA:S > 0 or K ≠ 0 C) H0:S = 0 and K = 0, HA:S ≠ 0 or K ≠ 0 D) H0:S = 0 and K > 0, HA:S ≠ 0 and K < 0
84) The heights (in cm) for a random sample of 60 male employees of S&M Construction Company were measured. The sample mean is 166.5, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The Jarque-Bera test statistic for normality is assumed to follow a following degrees of freedom _____.
distribution with the
A) k B) 2 C) k − 1 D) (r − 1)(c − 1)
85) The heights (in cm) for a random sample of 60 male employees of S&M Construction Company were measured. The sample mean is 166.5, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The value of the Jarque-Bera test statistic is ______. A) 0.28 B) −0.493 C) 0.57 D) 2
86) The heights (in cm) for a random sample of 60 male employees of S&M Construction Company were measured. The sample mean is 166.5, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. The p-value is ______________.
Version 1
41
A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
87) The heights (in cm) for a random sample of 60 male employees of S&M Construction Company were measured. The sample mean is 166.5, the standard deviation is 12.57, the sample kurtosis is 0.12, and the sample skewness is −0.23. Suppose the null hypothesis of normality is rejected. Which of the following is a valid conclusion? A) The heights follow a normal distribution. B) Not being able to determine if the heights do not follow a normal distribution. C) The heights do not follow a normal distribution. D) Not being able to determine the heights follow a normal distribution.
88) The airline industry defines "no-shows" as passengers who have purchased a ticket but fail to arrive at the gate on time for departure. United Airlines operates many flights from Philadelphia to Dallas. The table below represents the observed number of no-shows that occurred on a random sample of 120 flights between Philadelphia and Dallas this year, and experienced the no-show proportions last year. No-shows
Observed
Proportion
0
10
0.05
1
18
0.15
2
26
0.20
3
32
0.30
4
20
0.20
5
14
0.10
United Airlines would like to know if the proportions of no-shows has changed between last year and this year on flights between these two cities using α = 0.05. The expected frequency of five no-shows for this sample is ______.
Version 1
42
A) 9 B) 12 C) 14 D) 19
89) The airline industry defines "no-shows" as passengers who have purchased a ticket but fail to arrive at the gate on time for departure. United Airlines operates many flights from Philadelphia to Dallas. The table below represents the observed number of no-shows that occurred on a random sample of 120 flights between Philadelphia and Dallas this year, and experienced the no-show proportions last year. No-shows
Observed
Proportion
0
10
0.05
1
18
0.15
2
26
0.20
3
32
0.30
4
20
0.20
5
14
0.10
United Airlines would like to know if the proportions of no-shows has changed between last year and this year on flights between these two cities using α = 0.05. The test statistic for this sample is ______. A) 4.28 B) 6.19 C) 7.50 D) 9.33
90) The airline industry defines "no-shows" as passengers who have purchased a ticket but fail to arrive at the gate on time for departure. United Airlines operates many flights from Philadelphia to Dallas. The table below represents the observed number of no-shows that occurred on a random sample of 120 flights between Philadelphia and Dallas this year, and experienced the no-show proportions last year.
Version 1
43
No-shows
Observed
Proportion
0
10
0.05
1
18
0.15
2
26
0.20
3
32
0.30
4
20
0.20
5
14
0.10
United Airlines would like to know if the proportions of no-shows has changed between last year and this year on flights between these two cities using α = 0.05. The critical value for this hypothesis is ______. A) 5.991 B) 7.815 C) 9.448 D) 11.070
91) The airline industry defines "no-shows" as passengers who have purchased a ticket but fail to arrive at the gate on time for departure. United Airlines operates many flights from Philadelphia to Dallas. The table below represents the observed number of no-shows that occurred on a random sample of 120 flights between Philadelphia and Dallas this year, and experienced the no-show proportions last year. No-shows
Observed
Proportion
0
10
0.05
1
18
0.15
2
26
0.20
3
32
0.30
4
20
0.20
5
14
0.10
United Airlines would like to know if the proportions of no-shows has changed between last year and this year on flights between these two cities using α = 0.05. The conclusion for this hypothesis would be that because the test statistic is __________________________________________________________________________.
Version 1
44
A) Reject the null hypothesis; we cannot conclude that the proportions for no-shows has changed from last year to this year B) Reject the null hypothesis; we can conclude that the proportions for no-shows has changed from last year to this year C) Do not rejectthe null hypothesis; we cannot conclude that the proportions for noshows has changed from last year to this year D) Do not reject the null hypothesis; we can conclude that the proportions for no-shows has changed from last year to this year
92) A manufacturer of flash drives for data storage operates a production facility that runs on three 8-hour shifts per day. The following contingency table shows the number of flash drives that were defective and not defective from each shift. Shift
Defective
Not Defective
7 a.m.–3 p.m.
5
123
3 p.m.–11 p.m.
16
84
11 p.m.–7 p.m.
7
65
The expected number of not defective flash drives manufactured between 3 p.m. and 11 p.m. is ______. A) 95.7 B) 90.7 C) 86.7 D) 72.1
93) A manufacturer of flash drives for data storage operates a production facility that runs on three 8-hour shifts per day. The following contingency table shows the number of flash drives that were defective and not defective from each shift. Shift
Defective
Not Defective
7 a.m.–3 p.m.
7
113
3 p.m.–11 p.m.
10
90
11 p.m.–7 p.m.
4
76
Version 1
45
The expected number of not defective flash drives manufactured between 3 p.m. and 11 p.m. is ______. A) 98.0 B) 93.0 C) 89.0 D) 74.4
94) A manufacturer of flash drives for data storage operates a production facility that runs on three 8-hour shifts per day. The following contingency table shows the number of flash drives that were defective and not defective from each shift. Shift
Defective
Not Defective
7 a.m.–3 p.m.
7
113
3 p.m.–11 p.m.
10
90
11 p.m.–7 p.m.
4
76
The test statistic for this test of independence is ______. A) 2.125 B) 1.225 C) 2.152 D) 2.521
95) A manufacturer of flash drives for data storage operates a production facility that runs on three 8-hour shifts per day. The following contingency table shows the number of flash drives that were defective and not defective from each shift. Shift
Defective
Not Defective
7 a.m.–3 p.m.
7
113
3 p.m.–11 p.m.
10
90
11 p.m.–7 p.m.
4
76
Version 1
46
Based on the critical value approach and using α = 0.05, which of the following decisions and conclusions for this hypothesis test would be accurate? A) The test statistic is greater than the critical value; we reject the null hypothesis and cannot conclude that the quality of the flash drive production and the production shift are independent of one another. B) The test statistic is greater than the critical value; we fail to reject the null hypothesis and can conclude that the quality of the flash drive production and the production shift are independent of one another. C) The test statistic is less than the critical value; we fail to reject the null hypothesis and cannot conclude that the quality of the flash drive production and the production shift are not independent of one another. D) The test statistic is less than the critical value; we reject the null hypothesis and cannot conclude that the quality of the flash drive production and the production shift are independent of one another.
96) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
26
30
24
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
The expected number of individuals with income less than $50,000 and a credit score between 650 and 750 is ______. A) 24.0 B) 26.0 C) 32.5 D) 52.0
Version 1
47
97) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
21
31
27
$50,000 ≤ Income < $100,000
46
45
40
$100,000 ≤ Income < $150,000
48
33
28
Income ≥ $150,000
31
21
29
The expected number of individuals with income between $50,000 and $100,000 and a credit score less than 650 is ________. A) 23.9 B) 26.4 C) 31.4 D) 47.8
98) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
26
30
24
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
The expected number of individuals with income between $50,000 and $100,000 and a credit score less than 650 is ________.
Version 1
48
A) 30.0 B) 32.5 C) 37.5 D) 60.0
99) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
20
25
29
$50,000 ≤ Income < $100,000
62
48
50
$100,000 ≤ Income < $150,000
51
25
20
Income ≥ $150,000
27
16
27
The test statistic for this sample is ________. A) 13.600 B) 15.154 C) 18.885 D) 22.410
100) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class Income < $50,000
Version 1
Less than 650
650–750
More than 750
26
30
24
49
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
The test statistic for this sample is ________. A) 1.766 B) 3.320 C) 7.051 D) 10.576
101) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
26
30
24
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
The degrees of freedom for the critical value is ________. A) 3 B) 4 C) 5 D) 6
Version 1
50
102) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
26
30
24
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
Using α = 0.01, the critical value for this hypothesis test is ________. A) 9.488 B) 12.592 C) 16.812 D) 13.277
103) Suppose Bank of America would like to investigate if the credit score and income level of an individual are independent of one another. Bank of America selected a random sample of 400 adults and asked them to report their credit score range and their income range. The following contingency table presents these results. Credit Score Class
Less than 650
650–750
More than 750
Income < $50,000
26
30
24
$50,000 ≤ Income < $100,000
63
53
44
$100,000 ≤ Income < $150,000
40
30
30
Income ≥ $150,000
21
17
22
Based on the critical value approach and using α = 0.01, which of the following would be the decision and conclusion for this hypothesis test?
Version 1
51
A) Reject the null hypothesis; we cannot conclude that credit score and income are independent of one another. B) Do notreject the null hypothesis; we can conclude that credit score and income are independent of one another. C) Do notreject the null hypothesis; we cannot conclude that credit score and income are not independent of one another. D) Reject the null hypothesis; we can conclude that credit score and income are independent of one another.
104) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. For the goodness-of-fit test for normality, the null and alternative hypotheses are ______________________________________________________________________________ ___________. A) H0: The returns follow a normal distribution with mean 6.49% and standard deviation 0.31%; HA: The returns do not follow a normal distribution with mean 6.49% and standard deviation 0.31%. B) H0: The returns do not follow a normal distribution with mean 6.49% and standard deviation 0.31%; HA: The returns follow a normal distribution with mean 6.49% and standard deviation 0.31%. C) H0: The returns follow a normal distribution with mean 0.31% and standard deviation 6.49%; HA: The returns do not follow a normal distribution with mean 0.31% and standard deviation 6.49%. D) H0: The returns do not follow a normal distribution with mean 0.31% and standard deviation 6.49%; HA: The returns follow a normal distribution with mean 0.31% and standard deviation 6.49%.
Version 1
52
105) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. The probability that the return is less than −5% if the return is normally distributed is _______. A) 0.2061 B) 0.2740 C) 0.2841 D) 0.2358
106) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. The expected frequency for the class 0 ≤ Return < 5 if the return is normally distributed is _______.
Version 1
53
A) 12.37 B) 16.44 C) 17.05 D) 14.15
107) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. At the 5% confidence level, which of the following is the correct conclusion for this goodness-of-fit test for normality? A) Reject the null hypothesis; conclude that monthly stock returns are normally distributed with mean 0.31% and standard deviation 6.49%. B) Reject the null hypothesis; conclude that monthly stock returns are not normally distributed with mean 0.31% and standard deviation 6.49%. C) Do not reject the null hypothesis; conclude that monthly stock returns are normally distributed with mean 0.31% and standard deviation 6.49%. D) Do not reject the null hypothesis; conclude that monthly stock returns are not normally distributed with mean 0.31% and standard deviation 6.49%.
108) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
Version 1
54
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. The test statistic for the Jarque-Bera test for normality is ______. A) 0.59 B) 0.95 C) 0.43 D) 0.34
109) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Observed Frequency
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over the time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. Using the Jarque-Bera test for normality, the p-value is __________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
110) The following frequency distribution shows the monthly stock returns for Home Depot for the years 2003 through 2007. Class (in percent)
Version 1
Observed Frequency
55
Return <−5
13
−5 ≤ Return < 0
16
0 ≤ Return < 5
20
Return ≥ 5
11
Over time period, the following summary statistics are provided: Mean = 0.31%, Standard deviation = 6.49%, Skewness = 0.15, and Kurtosis = 0.38. At the 5% confidence level, which of the following is the correct conclusion for the Jarque-Bera test for normality? A) Reject the null hypothesis; conclude that monthly stock returns are normally distributed with mean 0.31% and standard deviation 6.49%. B) Reject the null hypothesis; conclude that monthly stock returns are not normally distributed with mean 0.31% and standard deviation 6.49%. C) Do not reject the null hypothesis; conclude that monthly stock returns are normally distributed with mean 0.31% and standard deviation 6.49%. D) Do not reject the null hypothesis; conclude that monthly stock returns are not normally distributed with mean 0.31% and standard deviation 6.49%.
111) Which of the following is an example of a goodness-of-fit test for a multinomial experiment? A) To determine whether the public’s opinions about self-driving cars have changed in a year. B) To determine whether the public’s opinions about self-driving cars differ between men and women. C) To determine whether the public’s opinions about self-driving cars and energy conservation are related. D) To determine whether the public’s opinions about self-driving cars differ between developed and developing countries.
112)
Which of the following is an example of a chi-square test for independence?
Version 1
56
A) To determine whether there is a relationship between advertising expenditure and sales volume. B) To determine whether there is a change in media literacy after the training program. C) To determine whether there is a decrease in lung cancer after the smoking cessation program. D) To determine whether the likelihood of college admission depends on the race of the applicants.
113) A travel agent wants to determine if clients have a preference of four different airlines. For a sample of 200 flight reservations, there were 70, 45, 48, and 37 reservations for the four different airlines. a. Set up the competing hypotheses to test if the population proportions are not equal to each other. b. Calculate the value of the test statistic. c. Specify the critical value at the 5% significance level. d. What is the conclusion of the hypothesis test?
114) A researcher wants to determine if the distribution of races hired at government agencies is reflective of the overall U.S. population demographics. The researcher uses census data on demographics to obtain proportions for different races. The following table shows these proportions and the number of each race hired for a particular government agency. Race
Observed
Proportion
White
253
0.60
Black
87
0.15
Hispanic
75
0.20
Asian
27
0.05
Version 1
57
a. Set up the competing hypotheses to test if at least one proportion is different from the population demographics. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Compute the p-value. Does the evidence suggest at least one proportion is different from the population demographics at the 1% significance level?
115) MARS claims that Skittles candies should be comprised of 30% purple, 10% orange, and 20% should be red, yellow, and green. Business students count a large bag of Skittles and find 265 red, 279 purple, 303 yellow, 271 green, and 212 orange candies. a. Set up the competing hypotheses to test if at least one proportion is different than the value claimed by MARS. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Compute the p-value. Does the evidence suggest that at least one proportion is different from the claimed value at the 5% significance level?
116) A researcher wants to verify his belief that smoking and drinking go together. The following table shows individuals cross-classified by drinking and smoking habits. Smoke
Not Smoke
Drink
328
101
Not Drink
141
98
a. Set up the competing hypotheses to determine if drinking and smoking are dependent. b. Calculate the value of the test statistic. c. Specify the critical value at the 5% significance level. d. Can you conclude smoking and drinking are dependent?
Version 1
58
117) The following table shows the cross-classification of hedge funds by market capitalization (either small, mid, or large cap) and objective (either growth or value). Growth
Value
Large Cap
23
19
Mid Cap
9
10
Small Cap
17
72
a. Set up the competing hypotheses to determine if market capitalization and objective are dependent. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Compute the p-value. Does the evidence suggest market capitalization and objective are dependent at the 5% significance level?
118) The following table shows the cross-classification of accounting practices (either straight line, declining balance, or both) and country (either France, Germany, or United Kingdom). France
Germany
United Kingdom
Straight Line
20
11
30
Declining Balance
14
16
15
Both
15
23
13
a. Set up the competing hypotheses to determine if accounting practice and country are dependent. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Compute the p-value. Does the evidence suggest market capitalization and objective are dependent at the 1% significance level?
Version 1
59
119) The following table shows the observed frequencies of the quarterly returns for a sample of 60 hedge funds. The table also contains the hypothesized proportions of each class assuming the quarterly returns have a normal distribution. The sample mean and standard deviation are 3.6% and 7.4% respectively. Class
Observed
Expected pi
Return <−5
14
0.122
−5 ≤ Return < 0
11
0.191
0 ≤ Return < 5
10
0.262
5 ≤ Return < 10
11
0.231
Return ≥ 10
14
0.194
a. Set up the competing hypotheses for the goodness-of-fit test of normality for the quarterly returns. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Compute the p-value. Does the evidence suggest that the quarterly returns do not have a normal distribution at the 10% significance level?
120) The following table shows the observed frequencies of the amount of assets under management for a sample of 134 hedge funds. The table also contains the hypothesized proportion of each class assuming the amount of assets under management has a normal distribution. The sample mean and standard deviation are 15 billion and 11 billion respectively.
Version 1
Class
Observed
Expected pi
Assets < 9
45
0.30
9 ≤ Assets < 18
40
0.31
18 ≤ Assets < 27
29
0.25
27 ≤ Assets < 36
13
0.11
60
Assets ≥ 36
7
0.03
a. Set up the competing hypotheses for the goodness-of-fit test of normality for amount of assets under management. b. Calculate the value of the test statistic and determine the degrees of freedom. c. Specify the critical value at the 5% significance level. d. Is there evidence to suggest the amount of assets under management do not have a normal distribution? e. Are there any conditions that may not be satisfied?
121) The following table shows numerical summaries of the best quarter returns (in percentages) for a sample of 121 hedge funds. Mean
28.91
Standard Deviation
12.90
Skewness
0.55
Kurtosis
0.57
a. Set up the competing hypotheses for the Jarque-Bera test for normality for the best quarter returns. b. Calculate the value of the test statistic. c. Specify the critical value at the 1% significance level. d. Does the evidence suggest that the best quarter returns do not have a normal distribution?
122) The following table shows numerical summaries of the worst quarter returns (in percentages) for a sample of 121 hedge funds. Mean
−18.47
Standard Deviation
4.44
Skewness
−1.02
Kurtosis
1.2
Version 1
61
a. Set up the competing hypotheses for the Jarque-Bera test for normality for the worst quarter returns. b. Calculate the value of the test statistic and find the p-value. c. Does the evidence suggest the worst quarter returns do not have a normal distribution at the 5% significance level?
123) A researcher wants to determine if there has been a change in reasons for using social media. In 2014, a survey found that 38% used social media to share photos or videos, 27% to share details of daily life, 24% to research products to buy, and 11% to keep up with news or sport. A recent sample of 500 respondents produces the following results: Share photos/videos 165
Share details of daily life
Research products to buy
100
140
Keep up with news/sport 95
a. What are the competing hypotheses to test whether the reasons for using social media have changed? b. What is the value of the test statistic? c. What is the value of the p-value? d. At the 5% significance level, what is your conclusion?
124) The manager of a theater wants to determine whether the age of its patrons is normally distributed or not. He randomly sampled 261 of his patrons and recorded their ages. A portion of the data is shown below: Age 18 : 21
Version 1
62
{MISSING IMAGE}Click here for the Excel Data File Divide the observations of Age into 4 non-overlapping intervals: < 20, 20 to 29, 30 to 39, and 40 or older. a. What is the mean and standard deviation of the sample data? b. Use the goodness-of-fit test for normality, state the competing hypotheses to determine whether the age of the theater patrons is normally distributed or not. c. What is the condition that must be satisfied before the goodness-of-fit test for normality can be applied? Is the condition satisfied in this case? d. What is the value of the test statistic? e. What is the value of the p-value? f. At the 5% significance level, what is the conclusion of the test? g. Would your conclusion change at the 10% significance level?
125) The manager of a theater wants to determine whether the age of its patrons is normally distributed or not. He randomly sampled 261 of his patrons and recorded their ages. A portion of the data is shown below: Age 18 : 21
{MISSING IMAGE}Click here for the Excel Data File a.What is the value of the skewness and kurtosis coefficients? b. Use the Jarque-Bera test; what are the competing hypotheses to determine whether the age of the theater patrons is normally distributed? c. What is the value of the test statistic? d. What is the value of the p-value? e. At the 1% significance level, what is your conclusion?
Version 1
63
126) For a multinomial experiment with k categories, the goodness-of-fit test statistic is assumed to follow a chi-square distribution with k degrees of freedom. ⊚ true ⊚ false
127) If the null hypothesis is rejected by the goodness-of-fit test, the alternative hypothesis specifies which of the population proportions differ from their hypothesized values. ⊚ true ⊚ false
128)
The chi-square goodness-of-fit test is a right-tailed test. ⊚ true ⊚ false
129) For a chi-square test of a contingency table, the expected frequencies for each cell are calculated assuming the two events are independent of one another. ⊚ true ⊚ false
130) For a chi-square test of a contingency table, the degrees of freedom are calculated as (r−1)(c−1) where r and c are the number of rows and columns in the contingency table. ⊚ true ⊚ false
131)
For a chi-square test of a contingency table, each expected frequency must be at least 3. ⊚ true ⊚ false
Version 1
64
132) The chi-square test statistic measures the difference between the observed frequencies and the expected frequencies assuming the null hypothesis is true. ⊚ true ⊚ false
133) A goodness-of-fit test analyzes for two qualitative variables whereas a chi-square test of a contingency table is for a single qualitative variable. ⊚ true ⊚ false
134) When applying the goodness-of-fit test for normality, the data are divided into k nonoverlapping intervals. ⊚ true ⊚ false
135) For the Jarque-Bera test for normality, the test statistic is assumed to have a chi-square distribution with two degrees of freedom. ⊚ true ⊚ false
Version 1
65
Answer Key Test name: Chap 12_4e 1) B 2) B 3) D 4) B 5) C 6) B 7) A 8) C 9) B 10) B 11) C 12) B 13) B 14) C 15) C 16) A 17) B 18) C 19) D 20) A 21) B 22) B 23) B 24) B 25) D 26) D Version 1
66
27) B 28) A 29) D 30) B 31) C 32) A 33) C 34) D 35) D 36) A 37) A 38) A 39) B 40) D 41) D 42) D 43) B 44) B 45) D 46) A 47) A 48) C 49) C 50) A 51) D 52) A 53) C 54) B 55) A 56) D Version 1
67
57) B 58) B 59) B 60) C 61) C 62) C 63) D 64) A 65) B 66) B 67) C 68) C 69) A 70) D 71) B 72) C 73) C 74) A 75) D 76) C 77) A 78) A 79) B 80) B 81) B 82) B 83) C 84) B 85) C 86) D Version 1
68
87) C 88) B 89) A 90) D 91) D 92) B 93) B 94) A 95) C 96) B 97) D 98) D 99) B 100) B 101) D 102) C 103) C 104) C 105) A 106) C 107) C 108) A 109) D 110) C 111) A 112) D 113) a. H0:p1 = p2 = p3 = p4 = 0.25;HA: Not all proportions equal to 0.25 b. 11.96 c. 7.815 d. Because the test statistic value is greater than the critical value, reject the null hypothesis. Conclude that not all population proportions are 0.25.
Version 1
69
114) a.H0: p1 = 0.60, p2 = 0.15, p3 = 0.20, p4 = 0.05, HA: At least one of the proportions is different from its hypothesized value b. 10.141 and the degrees of freedom are 3 c. The p-value is between 0.01 and 0.05. Because the p-value is less than the significance level, reject the null hypothesis. Conclude at least one population proportion is different from its hypothesized value. 115) a. H0:pp = 0.30, po = 0.10, pr = 0.20, py = 0.20, pg = 0.20 HA: At least one of the proportions is different from its hypothesized value. b. 88.26 and the degrees of freedom are 4. c. The p-value is less than 0.05. Because the p-value is less than the significance level, reject the null hypothesis. Conclude that at least one population proportion is different from the claimed value by MARS. 116) a. H0: Drinking and smoking are independent HA: Drinking and smoking are dependent. b. 22.37 and the degree of freedom is 1. c. 3.841 d. Because the test statistic value is greater than the critical value, reject the null hypothesis. Conclude that smoking and drinking are dependent. 117) a. H0: Objective and market cap are independent, HA: Objective and market cap are dependent. b. 18.635 and the degrees of freedom are 2. c. The p-value is less than 0.01. Because the p-value is less than the significance level, reject the null hypothesis. Conclude that market capitalization and objective are dependent. 118) a. H0: Accounting practice and country are independent; HA: Accounting practice and country are dependent. b. 11.205 and the degrees of freedom are 4. c. The p-value is between 0.01 and 0.05. Because the p-value is less than the significance level, reject the null hypothesis. Conclude the accounting practice and country are dependent. 119) a. H0: Quarterly returns follow a normal distribution with mean 3.36% and standard deviation 7.4%; HA: Quarterly returns do not follow a normal distribution with mean 3.36% and standard deviation 7.4%. b. 9.26 and the degrees of freedom are 2. c. The p-value is less than 0.01. Because the p-value is less than the significance level, reject the null hypothesis. Conclude the quarterly returns do not have a normal distribution.
Version 1
70
120) a. H0: Assets follow a normal distribution with mean 15 and standard deviation 11; HA: Assets do not follow a normal distribution with mean 15 and standard deviation 11. b. 3.649 and the degrees of freedom are 2. c. 5.991. d. Because the test statistic value is less than the critical value, do not reject the null hypothesis. We cannot conclude the amount of assets under management does not have a normal distribution. e. For the at least 36 billion under management class, the expected frequency is 4.02, which is less than 5. 121) a. H0:S = 0 and K = 0, HA:S ≠ 0 or K ≠ 0 b. 7.738 and the degrees of freedom are 2. c. 9.210 d. Because the test statistic is less than the critical value, do not reject the null hypothesis. We cannot conclude that the best quarter returns do not have a normal distribution. 122) a. H0:S = 0 and K = 0, HA:S ≠ 0 or K ≠ 0 b. 28.24 and the p-value is close to zero. c. Because the p-value is less than the significance level, reject the null hypothesis. Conclude the worst quarter returns do not have a normal distribution. 123) a.H0: p1 = 0.38, p2 = 0.27, p3 = 0.24, and p4 = 0.11. HA: not all reasons equal their hypothesized values. b. The test statistic = 44.7878 c. The p-value = 0.0000 d. The p-value = 0 is less than the 5% significance level. We can reject H0 and conclude that there has been a change in reasons for using social media since 2014. 124) a. The mean = 25.8774. The standard deviation = 9.1159 b. H0: Age follows a normal distribution with a mean of 25.8774 and standard deviation of 9.1159; HA: Age does not follow a normal distribution with a mean of 25.8774 and standard deviation of 9.1159. c. The condition of the expected frequency is five or more in each interval. This condition is satisfied in this case. d. The test statistic = 3.4625 e. The p-value = 0.0628 f. We conclude that the age of the theater patrons is normally distributed because the p-value = 0.0628 is greater than α = 0.05. We cannot reject H0. g. We conclude that the age of the theater patrons is not normally distributed because at the 10% significance level, the p-value = 0.0628 is less than α = 0.1.
Version 1
71
125) a. Skewness = 0.3726, Kurtosis = −0.5191 b. H0: S = 0 and K = 0; HA: S ≠ 0 and K ≠ 0. c. The test statistic = 8.9705 d. The p-value = 0.0113 e. The conclusion is the age of the theater patrons is normally distributed because the p-value = 0.0113 is greater than the significance level of 1%. H0 cannot be rejected.
126) FALSE 127) FALSE 128) TRUE 129) TRUE 130) TRUE 131) FALSE 132) TRUE 133) FALSE 134) TRUE 135) TRUE
Version 1
72
CHAPTER 13 – 4e 1) test?
Which of the following is the assumption that is not applicable for a one-way ANOVA
A) The populations are normally distributed. B) The population standard deviations are not all equal. C) The samples are selected independently. D) The sample is drawn at random from each population.
2)
Using R, which of the below functions is used to generate an ANOVA table? A) aov B) anova C) Tukey's HSD D) colnames
3)
If there are five treatments under study, the number of pairwise comparisons is _______. A) 15 B) 5 C) 20 D) 10
4) The variability due to chance, also known as the within-treatments variance, is the estimate of σ2 which is not based on the variability _______. A) between the sample means B) due to random chance C) within each sample D) due to the common population variance
Version 1
1
5)
The one-way ANOVA null hypothesis is rejected when the _______.
A) two estimates of the variance are relatively close together B) variability in the sample means can be explained by chance C) ratio of the within-treatments variance and the between-treatments variance is 1 D) ratio of the within-treatments variance and the between-treatments variance is significantly greater than 1
6) The between-treatments variance is based on a weighted sum of squared differences between the _______. A) population variances and the overall mean of the data set B) sample means and the overall mean of the data set C) sample variances and the overall mean of the data set D) population means and the overall mean of the data set
7)
Fisher’s LSD method is applied when _______.
A) the ANOVA null hypothesis is rejected B) the ANOVA null hypothesis is not rejected C) either the ANOVA null hypothesis is rejected or the ANOVA null hypothesis is not rejected D) None of these answers is correct.
8) When using Fisher’s LSD method with a given significance level for each interval, the probability of committing a Type I error for at least one comparison increases as the number of _______.
Version 1
2
A) pairwise comparisons decreases B) pairwise comparisons increases C) sample size increases D) treatments decreases
9) Which of the following is the Fisher’s 100(1 − α)% confidence interval for the difference between two population means μi− μj? A) B) C) D)
10) Which of the following is the correct interpretation of Fisher’s 100(1− α)% confidence interval for μi− μj? A) If the interval includes zero, the null hypothesis H0: μi− μj = 0, is rejected at a significance level α. B) If the interval does not include zero, the null hypothesis H0: μi− μj = 0, is rejected at a significance level 100(1− α)%. C) If the interval does not include zero, the null hypothesis H0: μi− μj = 0, is rejected at a significance level α. D) If the interval includes zero, the null hypothesis H0: μi− μj = 0, is rejected at a significance level 100(1− α)%.
11) One of the disadvantages of Fisher’s LSD method is that the probability of committing at least one_______.
Version 1
3
A) Type II error increases as the number of pairwise comparisons increases B) Type I error increases as the number of pairwise comparisons decreases C) Type II error increases as the number of pairwise comparisons decreases D) Type I error increases as the number of pairwise comparisons increases
12) Tukey’s Honestly Significant Differences (HSD) method ensures that the probability of at least one Type I error remains fixed irrespective of the number of _______. A) pairwise comparisons B) treatments C) replications within each treatment D) replications for each combination of factor A and factor B
13) Tukey’s HSD method uses _______ instead of _______ when compared to Fisher’s LSD method for pairwise comparisons. A) t values; studentized range values B) studentized range values; F values C) F values; t values D) studentized range values; t values
14) Tukey’s 100(1− α)% confidence interval for the difference between two population means μi− μj for balanced data is given by _______. A) B) C) D)
Version 1
4
15) Which of the below is not true about Tukey’s 100(1−α)% confidence interval for the difference between two population means? A) Tukey’s intervals are narrower than those based on Fisher’s method. B) Tukey’s intervals are narrower than those based on the two-sample T method. C) Tukey’s intervals use a reduced significance level in comparison to Fisher’s method. D) Tukey’s intervals use an increased significance level in comparison to Fisher’s method.
16)
In a two-way ANOVA test, how many null hypotheses are tested? A) 1 B) 1 or 2 C) 2 or 3 D) More than 3
17) If the interaction between two factors is not significant, the next tests to be done are _______. A) none, the analysis is complete B) none, gather more data C) tests about the population means of factor A or factor B using two-way ANOVA without interaction D) Tukey’s confidence intervals
18) Which of these null hypotheses is applicable for a two-way ANOVA test with interaction?
Version 1
5
A) There is interaction between factors A and B. B) Factor A and factor B means differ. C) There is no interaction between factors A and B. D) Factor A and factor B means do not differ.
19)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
For the within groups category, the degrees of freedom are _______. A) 6 B) 7 C) 8 D) 9
20)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
The sum of squares due to treatments is _______.
Version 1
6
A) 10 B) 25 C) 75 D) 100
21)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
The mean square error is _______. A) 1.333 B) 9.375 C) 25 D) 75
22)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
The value of the test statistic is _______.
Version 1
7
A) 1.333 B) 9.375 C) 12.5 D) 100
23)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
At the 5% significance level, the critical value is _______. A) 3.11 B) 4.46 C) 6.06 D) 8.65
24)
The following is an incomplete ANOVA table. Source of Variation
SS
Between groups
df
MS
2
12.5
F
Within groups Total
100
10
The p-value of the test is _______.
Version 1
8
A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
25) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
Which of the following is the sum of squares due to treatments? A) 5,285.83 B) 13,281.79 C) 18,567.63 D) 4,427.26
Version 1
9
26) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
Which of the following is the mean square for treatments? A) 18,567.63 B) 13,281.79 C) 5,285.83 D) 4,427.26
27) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Version 1
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
10
90
70
35
10
Which of the following is the sum of squared errors? A) 264.29 B) 5,285.83 C) 18,567.63 D) 13,281.79
28) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
The competing hypotheses about the mean commute times are _______. A) H0: μ1 = μ2 = μ3, HA: Not all population means are equal B) H0: Not all population means are equal, HA: μ1 = μ2 = μ3 C) H0: μ1 = μ2 = μ3 = μ4, HA: Not all population means are equal D) H0: Not all population means are equal, HA: μ1 = μ2 = μ3 = μ4
Version 1
11
29) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
The value of the test statistic is _______. A) 0.06 B) 0.40 C) 2.51 D) 16.75
30) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Version 1
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
12
90
70
35
10
At the 5% significance level, the critical value is _______. A) 2.38 B) 3.10 C) 3.86 D) 4.94
31) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
The p-value for the test is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
Version 1
13
32) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston
Charlotte
Tucson
Akron
45
25
25
10
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
The conclusion for the hypothesis test is _______. A) to reject the null hypothesis; we cannot conclude that not all mean commute times are equal B) do not reject the null hypothesis; we cannot conclude that not all mean commute times are equal C) reject the null hypothesis; we can conclude that not all mean commute times are equal D) do not reject the null hypothesis; we can conclude that not all mean commute times are equal
33) A researcher with the Ministry of Transportation is commissioned to study the drive times to work (one-way) for U.S. cities. The underlying hypothesis is that average commute times are different across cities. To test the hypothesis, the researcher randomly selects six people from each of the four cities and records their one-way commute times to work. Refer to the below data on one-way commute times (in minutes) to work. Note that the grand mean is 36.625. Houston 45
Version 1
Charlotte 25
Tucson
Akron
25
10
14
65
30
30
15
105
35
19
15
55
10
30
10
85
50
10
5
90
70
35
10
Based on the sample standard deviation, the one-way ANOVA assumption that is likely not met is _______. A) the populations are normally distributed. B) the population standard deviations are assumed to be equal. C) the samples are independent. D) the samples are not normally distributed.
34) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Fisher 95% confidence intervals are shown below.
Fisher’s 95% confidence intervals
Lower
Upper
Houston – Charlotte
17.92
57.08
Houston– Tucson
29.75
68.91
Houston– Akron
43.75
82.91
Charlotte– Tucson
−7.75
31.41
Charlotte– Akron
6.25
45.41
Tucson – Akron
−5.58
33.58
How many pairs of cities show a significant difference in average commute times to work? A) 2 B) 3 C) 4 D) 6
Version 1
15
35) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Fisher 95% confidence intervals are shown below.
Fisher’s 95% confidence intervals
Which of the following is the intervals?
Lower
Upper
Houston– Charlotte
17.92
57.08
Houston– Tucson
29.75
68.91
Houston– Akron
43.75
82.91
Charlotte– Tucson
−7.75
31.41
Charlotte– Akron
6.25
45.41
Tucson – Akron
−5.58
33.58
value used to calculate the Fisher’s 95% confidence
A) 1.725 B) 2.086 C) 2.080 D) 2.090
36) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Fisher’s 95% confidence intervals are shown below.
Fisher’s 95% confidence intervals
Version 1
Lower
Upper
Houston– Charlotte
17.92
57.08
Houston– Tucson
29.75
68.91
Houston– Akron
43.75
82.91
Charlotte– Tucson
−7.75
31.41
16
Charlotte– Akron
6.25
45.41
Tucson – Akron
−5.58
33.58
Which of these pair of cities show no significant difference between the average commute times to work? A) Houston, Akron B) Charlotte, Akron C) Charlotte, Tucson D) Houston, Tucson
37) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Tukey’s 95% confidence intervals are shown below.
Tukey’s 95% confidence intervals
Lower
Upper
Houston– Charlotte
11.22
63.78
Houston– Tucson
23.05
75.62
Houston– Akron
37.05
89.62
Charlotte– Tucson
−14.45
38.12
Charlotte– Akron
−0.45
52.12
Tucson– Akron
−12.28
40.28
How many pairs of cities show a significant difference between the average commute times to work? A) 2 B) 3 C) 4 D) 6
Version 1
17
38) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Tukey 95% confidence intervals are shown below.
Tukey’s 95% confidence intervals
Lower
Upper
Houston– Charlotte
11.22
63.78
Houston– Tucson
23.05
75.62
Houston– Akron
37.05
89.62
Charlotte– Tucson
−14.45
38.12
Charlotte– Akron
−0.45
52.12
Tucson– Akron
−12.28
40.28
Which of the following is the studentized range value with α = 0.05 for Tukey’s HSD method? A) 5.02 B) 3.58 C) 3.96 D) 4.64
39) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Tukey 95% confidence intervals are shown below.
Tukey’s 95% confidence intervals
Version 1
Lower
Upper
Houston – Charlotte
11.22
63.78
Houston– Tucson
23.05
75.62
Houston– Akron
37.05
89.62
Charlotte– Tucson
−14.45
38.12
Charlotte– Akron
−0.45
52.12
Tucson– Akron
−12.28
40.28
18
Which of these pairs of cities show a significant difference between the average commute times to work? A) Charlotte − Tucson B) Charlotte− Akron C) Houston − Charlotte D) Tucson− Akron
40) The ANOVA test performed for determined that not all mean commute times across the four cities are equal. However, it did not indicate which means differed. To find out which population means differ requires further analysis of the direction and the statistical significance of the difference between paired population means. Tukey’s 95% confidence intervals are shown below.
Tukey’s 95% confidence intervals
Lower
Upper
Houston – Charlotte
11.22
63.78
Houston– Tucson
23.05
75.62
Houston– Akron
37.05
89.62
Charlotte– Tucson
−14.45
38.12
Charlotte– Akron
−0.45
52.12
Tucson– Akron
−12.28
40.28
The conclusion of the Tukey confidence intervals is _______. A) the mean commute time in Houston is different from the mean commute time in Charlotte, Tucson, and Akron B) the mean commute time in Charlotte is different from the mean commute time in Houston, Tucson, and Akron C) the mean commute time in Tucson is different from the mean commute time in Houston, Charlotte, and Akron D) the mean commute time in Akron is different from the mean time in Houston, Charlotte, and Tucson
Version 1
19
41) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South(4) sides), and obtains data on the number of crimes per day in each area. The one-way ANOVA table is shown below. Source of Variation
SS
df
MS
F 9.12
Between groups
87.79
3
29.26
Within groups
64.17
20
3.21
Total
151.96
23
The competing hypotheses about the mean crime rates are _______. A) H0: μ1 = μ2 = μ3, HA: Not all population means are equal B) H0: Not all population means are equal, HA: μ1 = μ2 = μ3 C) H0: μ1 = μ2 = μ3 = μ4, HA: Not all population means are equal D) H0: Not all population means are equal, HA: μ1 = μ2 = μ3 = μ4
42) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South sides), and obtains data on the number of crimes per day in each area. The one-way ANOVA table is shown below. Source of Variation
SS
df
MS
F 9.12
Between groups
87.79
3
29.26
Within groups
64.17
20
3.21
Total
151.96
23
The degrees of freedom for the hypothesis test are _______. A) 4, 20 B) 3, 23 C) 3, 20 D) 4, 23
Version 1
20
43) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South(4) sides), and obtains data on the number of crimes per day in each area. The one-way ANOVA table is shown below. Source of Variation
SS
df
MS
F 9.12
Between groups
87.79
3
29.26
Within groups
64.17
20
3.21
Total
151.96
23
At the 1% significance level, the critical value is _______. A) 2.38 B) 3.10 C) 3.86 D) 4.94
44) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South(4) sides), and obtains data on the number of crimes per day in each area. The one-way ANOVA table is shown below. Source of Variation
SS
df
MS
F 9.12
Between groups
87.79
3
29.26
Within groups
64.17
20
3.21
Total
151.96
23
The p-value for the test is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
Version 1
21
45) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South(4) sides), and obtains data on the number of crimes per day in each area. The one-way ANOVA table is shown below. Source of Variation
SS
df
MS
F 9.12
Between groups
87.79
3
29.26
Within groups
64.17
20
3.21
Total
151.96
23
At the 1% significance level, the conclusion for the hypothesis test is _______. A) reject the null hypothesis; we can conclude that not all mean number of crimes are equal B) do not reject the null hypothesis; we can conclude not all mean number of crimes are equal C) reject the null hypothesis; we cannot conclude that not all mean number of crimes are equal D) do not reject the null hypothesis; we cannot conclude that not all mean number of crimes are equal
46) A police chief wants to determine if crime rates are different for four different areas of the city (East(1), West(2), North(3), and South(4) sides), and obtains data on the number of crimes per day in each area. The Tukey’s confidence intervals are shown below.
Tukey’s 95% confidence intervals
Lower
Upper
North – East
0.771
6.562
South– East
−3.729
2.062
West– East
0.104
5.896
South– North
−7.396
−1.604
West– North
−3.562
2.229
West– South
0.938
6.729
At the 1% significance level, the conclusion from Tukey’s confidence intervals is we _______.
Version 1
22
A) cannot conclude the mean number of crimes differs for West and East B) cannot conclude the mean number of crimes differs for West and South C) cannot conclude the mean number of crimes differs for South and North D) cannot conclude the mean number of crimes differs for West and North
47) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
How many degrees of freedom are there for factors A and B? A) 2, 3 B) 3, 4 C) 3, 6 D) 2, 4
Version 1
23
48) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
Which of the following are the total degrees of freedom? A) 10 B) 11 C) 12 D) 6
49) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation
Version 1
High
MS
F 24
Rows (Factor B) Columns (Factor A)
1,985.42 1,137.50
Error Total
7,456.25
Which of the following is the value of the test statistic for factor A? A) 4.76 B) 5.14 C) 9.41 D) 32.86
50) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
Which of the following is the value of the test statistic for factor B?
Version 1
25
A) 4.76 B) 5.14 C) 9.41 D) 32.86
51) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
At the 5% significance level, the critical value for the hypothesis test about factor A is _______. A) 3.46 B) 5.14 C) 7.26 D) 10.92
Version 1
26
52) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
The p-value for the hypothesis test about factor A is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
53) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation
Version 1
High
MS
F 27
Rows (Factor B) Columns (Factor A)
1,985.42 1,137.50
Error Total
7,456.25
At the 5% significance level, the conclusion for the hypothesis test about factor A is _______. A) do not reject the null hypothesis; we can conclude the average mortgage payments differ by income level B) do not reject the null hypothesis; we cannot conclude the average mortgage payments differ by income level C) reject the null hypothesis; we can conclude the average mortgage payments differ by income level D) reject the null hypothesis; we cannot conclude the average mortgage payments differ by income level
54) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
Version 1
7,456.25
28
At the 1% significance level, the critical value for the hypothesis test about factor B is _____. A) 3.29 B) 4.76 C) 6.60 D) 9.78
55) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation Rows (Factor B) Columns (Factor A)
High
MS
F
1,985.42 1,137.50
Error Total
7,456.25
The p-value for the hypothesis test about factor B is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
Version 1
29
56) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location using a two-way ANOVA without interaction. The data and an incomplete ANOVA table are shown below. Zonal Location (Factor B)
Income (Factor A) Low
Medium
Downtown
25
55
60
Suburbs
70
75
100
Midtown
30
35
40
Uptown
10
25
30
SS
df
Source of Variation
MS
Rows (Factor B) Columns (Factor A)
High
F
1,985.42 1,137.50
Error Total
7,456.25
At the 1% significance level, the conclusion for the hypothesis test about factor B is _______. A) do not reject the null hypothesis; we can conclude the average mortgage payments differ by zonal location B) do not reject the null hypothesis; we cannot conclude the average mortgage payments differ by zonal location C) reject the null hypothesis; we can conclude the average mortgage payments differ by zonal location D) reject the null hypothesis; we cannot conclude the average mortgage payments differ by zonal location
57) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
Version 1
SS
df
MS
F
30
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
How many degrees of freedom are there for factors A and B? A) 2, 3 B) 3, 4 C) 3, 6 D) 2, 4
58) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
Which of the following is the value of SSA? A) 46,869 B) 159,860 C) 1,115,101 D) 1,321,831
Version 1
31
59) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
Which of the following is the value of MSB? A) 15,623 B) 79,930 C) 37,170 D) 1,321,831
60) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
Version 1
1,321,831
35
32
Which of the following is the value of MSE? A) 15,623 B) 79,930 C) 37,170 D) 1,321,831
61) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
For factor A, the value of the test statistic is _______. A) 3.21 B) 2.15 C) 16.43 D) 3
62) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
Version 1
SS
df
MS
F 33
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
At the 5% significance level, the critical value for the hypothesis test about factor A is _______. A) 2.49 B) 3.32 C) 4.18 D) 5.39
63) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
The p-value for the hypothesis test about factor A is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
Version 1
34
64) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
At the 5% significance level, the conclusion of the hypothesis test for factor A is _______. A) do not reject the null hypothesis; we cannot conclude the average amount spent differs by spending category B) do not reject the null hypothesis; we can conclude the average amount spent differs by spending category C) reject the null hypothesis; we cannot conclude the average amount spent differs by spending category D) reject the null hypothesis; we can conclude the average amount spent differs by spending category
65) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
Version 1
1,321,831
35
35
At the 5% significance level, the critical value for the hypothesis test about factor B is _______. A) 2.28 B) 2.92 C) 3.59 D) 4.51
66) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below Source of Variation
SS
df
MS
F
Rows (Factor B) Columns (Factor A)
46,869
79,930
Error Total
1,321,831
35
The p-value for the hypothesis test about factor B is _______. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
67) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) – Factor A, and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers) – Factor B. An incomplete ANOVA table is shown below. Source of Variation Rows (Factor B)
Version 1
SS
df
MS
F
46,869
36
Columns (Factor A)
79,930
Error Total
1,321,831
11
At the 5% significance level, the conclusion of the hypothesis test for Factor B is ______________________________.
A) do not reject the null hypothesis; we cannot conclude the average amount spent differs by generation B) do not reject the null hypothesis; we can conclude the average amount spent differs by generation C) reject the null hypothesis; we cannot conclude the average amount spent differs by generation D) reject the null hypothesis; we can conclude the average amount spent differs by generation
68) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Gen-X
Gen-Y
Gen-Z
Version 1
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
37
Source of Variation
SS
df
MS
F
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Interaction Error
4,865
Total
1,321,831
35
For the analysis with an interaction between spending category and generation, the first hypothesis test to conduct should be about ______. A) the average spending across spending B) the interaction between spending and generation C) the average spending across generation D) both the average spending across spending and generation
69) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Gen-X
Gen-Y
Gen-Z
Source of Variation
Version 1
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
SS
df
MS
F
38
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Interaction Error
4,865
Total
1,321,831
35
The degrees of freedom for the interaction and the error are ________. A) 6, 24 B) 6, 30 C) 24, 6 D) 30, 6
70) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
Gen-X
Gen-Y
Gen-Z
Version 1
3
F
15,623
39
Columns (Factor A)
159,860
2
79,930
Interaction Error
4,865
Total
1,321,831
35
The value of SSE is _______. A) 46,869 B) 159,860 C) 116,767 D) 1,321,831
71) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Gen-X
Gen-Y
Gen-Z
Source of Variation
Shopping
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
MS
F
SS
df
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Version 1
Electronics
40
Interaction Error
4,865
Total
1,321,831
35
The value of MSAB is ________. A) 983,335 B) 116,767 C) 159,860 D) 166,389
72) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Gen-X
Gen-Y
Gen-Z
F
Interaction
Version 1
41
Error
4,865
Total
1,321,831
35
For the interaction, the value of the test statistic is ______. A) 34.20 B) 16.43 C) 3.21 D) 24
73) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Gen-X
Gen-Y
Gen-Z
F
Interaction Error
Version 1
4,865
42
Total
1,321,831
35
Which of the following are the competing hypotheses for the hypothesis test about the interaction? A) H0: There is interaction between spending category and generation,HA: There is no interaction between spending category and generation. B) H0: μ1 = μ2 = μ3,HA: Not all population means are equal. C) H0: There is no interaction between spending category and generation,HA: There is interaction between spending category and generation. D) H0: μ1 = μ2 = μ3 = μ4,HA: Not all population means are equal.
74) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Gen-X
Gen-Y
Gen-Z
F
Interaction
Version 1
43
Error
4,865
Total
1,321,831
35
At the 5% significance level, the critical value for the test about the interaction is ______. A) 2.04 B) 2.51 C) 2.99 D) 3.67
75) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Gen-X
Gen-Y
Gen-Z
F
Interaction Error
Version 1
4,865
44
Total
1,321,831
35
The p-value for the test about the interaction term is ________________. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
76) A market researcher is studying the spending habits of people across age groups. The amount of money spent by each individual is classified by spending category (Dining out, Shopping, or Electronics) and generation (Gen-X, Gen-Y, Gen-Z, or Baby Boomers). The data and an incomplete ANOVA table are shown below. Generation (Factor B)
Monthly Spending ($) (Factor A) Dining out
Baby Boomers
Shopping
Electronics
50
675
25
75
500
50
0
750
80
150
400
100
200
440
75
225
380
95
100
200
300
300
225
400
400
245
450
500
100
200
550
180
300
600
190
260
Source of Variation
SS
df
MS
Rows (Factor B)
46,869
3
15,623
Columns (Factor A)
159,860
2
79,930
Gen-X
Gen-Y
Gen-Z
F
Interaction Error Total
Version 1
4,865 1,321,831
35
45
At the 5% significance level, the conclusion for the hypothesis test about the interaction term is ________________. A) do not reject the null hypothesis; there is evidence of an interaction effect between spending category and generation B) reject the null hypothesis; there is evidence of an interaction effect between spending category and generation C) reject the null hypothesis; there is no evidence of an interaction effect between spending category and generation D) do not reject the null hypothesis; there is no evidence of an interaction effect between spending category and generation
77) Psychology students want to determine if there are differences between the ability of business majors and science majors to solve various types of analytic problems. They conduct an experiment and record the amount of time it takes to complete each analytic problem. The following two-Way ANOVA table summarizes their findings. Source of Variation
SS
df
MS
F
Major
16.82
1
16.82
11.21
Problem
102.92
4
25.73
17.15
Interaction
157.48
4
39.37
26.25
Error
60.00
40
1.5
Total
337.22
49
The number of different analytic problems the students solved is _____. A) 1 B) 2 C) 4 D) 5
78) Psychology students want to determine if there are differences between the ability of business majors and science majors to solve various types of analytic problems. They conduct an experiment and record the amount of time it takes to complete each analytic problem. The following two-way ANOVA table summarizes their findings.
Version 1
46
Source of Variation
SS
df
MS
F
Major
16.82
1
16.82
11.21
Problem
102.92
4
25.73
17.15
Interaction
157.48
4
39.37
26.25
Error
60.00
40
1.5
Total
337.22
49
For the hypothesis test about the interaction term, the p-value is ______________. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
79) Psychology students want to determine if there are differences between the ability of business majors and science majors to solve various types of analytic problems. They conduct an experiment and record the amount of time it takes to complete each analytic problem. The following two-way ANOVA table summarizes their findings. Source of Variation
SS
df
MS
F
Major
16.82
1
16.82
11.21
Problem
102.92
4
25.73
17.15
Interaction
157.48
4
39.37
26.25
Error
60.00
40
1.5
Total
337.22
49
At the 5% significance level, the critical value for the test about the interaction is less than ______. A) 2.14 B) 2.61 C) 2.25 D) 2.51
Version 1
47
80) Psychology students want to determine if there are differences between the ability of business majors and science majors to solve various types of analytic problems. They conduct an experiment and record the amount of time it takes to complete each analytic problem. The following two-way ANOVA table summarizes their findings. Source of Variation
SS
df
MS
F
Major
16.82
1
16.82
11.21
Problem
102.92
4
25.73
17.15
Interaction
157.48
4
39.37
26.25
Error
60.00
40
1.5
Total
337.22
49
At the 5% significance level, the conclusion for the hypothesis test about the interaction term is ______. A) do not reject the null hypothesis; we cannot conclude there exists an interaction effect between major and problem type B) reject the null hypothesis; we cannot conclude there exists an interaction effect between major and problem type C) reject the null hypothesis; we can conclude there exists an interaction effect between major and problem type D) do not reject the null hypothesis; we can conclude there exists an interaction effect between major and problem type
81) For a one-way ANOVA, which of the following formulas is used to compute the grand mean?
A) B) C) D)
Version 1
48
82) For a one-way ANOVA, which of the following is a correct formula for the sum of squares due to treatments? A) SSTR = B) SSTR = C) SSTR = D) SSTR =
83) For a one-way ANOVA, which of the following is a correct formula for the error sum of squares? A) SSE = B) SSE = C) SSE = D) SSE =
84) For a one-way ANOVA, which of the following is a correct formula for the test statistic for a one-way ANOVA test? A) B) C) D)
85)
Consider the partially completed one-way ANOVA summary table. Source of Variation
Between groups
Version 1
SS
df
MS
F
270
49
Within groups
18
Total
810
21
The value of the test statistic is ________. A) 3.0 B) 2.2 C) 5.5 D) 7.4
86) AutoTrader.com would like to test if a difference exists in the age of three different types of vehicles currently on the road: trucks, cars, and vans. The following data represent the age of a random sample of trucks, cars, and vans. Trucks
Cars
Vans
12
8
3
8
7
7
9
10
6
11
7
8
The grand mean for these observations is ________. A) 7.2 B) 8.0 C) 8.9 D) 9.3
87) AutoTrader.com would like to test if a difference exists in the age of three different types of vehicles currently on the road: trucks, cars, and vans. The following data represent the age of a random sample of trucks, cars, and vans. Trucks
Cars
Vans
12
8
3
Version 1
50
8
7
7
9
10
6
11
7
8
The mean square for treatments MSTR for these observations is ________. A) 2.0 B) 4.7 C) 16.0 D) 12.5
88) AutoTrader.com would like to test if a difference exists in the age of three different types of vehicles currently on the road: trucks, cars, and vans. The following data represent the age of a random sample of trucks, cars, and vans. Trucks
Cars
Vans
12
8
3
8
7
7
9
10
6
11
7
8
The sum of squares within SSE for these observations is ________. A) 22.4 B) 20.5 C) 41.8 D) 30.0
89) Which of the following is the pooled estimate of the common population variance for Fisher’s LSD method?
Version 1
51
A) B) C) D)
90) Tukey’s HSD 100(1 − α)% confidence interval for the difference between two population means μi − μj for unbalanced data is given by ____________________. A) B) C) D)
91) For a two-way ANOVA with no interaction, which of the following is the correct formula for the sum of squares SSA for factor A ? A) SSA = B) SSA = C) SSA = D) SSA =
92) For a two-way ANOVA with no interaction, which of the following is the correct formula for the error sum of squares SSE?
Version 1
52
A) SSE = SST − (SSA + SSB + SSAB) B) SSE = SST − (SSA − SSB) C) SSE = SST + SSA + SSB D) SSE = SST − (SSA + SSB)
93) For a two-way ANOVA with no interaction, which of the following is the correct degrees of freedom when computing MSE? A) nT − c − r + 1 B) nT − 1 C) (c − 1)(r − 1) D) cr(w − 1)
94)
Consider the partially completed two-way ANOVA (without interaction) summary table. Source of Variation
Rows (Factor B)
SS 600
Columns (Factor A)
df
MS
F
3 5
Error
750
Total
2,150
23
The sum of squares for factor A is ______. A) 930 B) 800 C) 1,150 D) 980
Version 1
53
95)
Consider the partially completed two-way ANOVA (without interaction) summary table. Source of Variation
Rows (Factor B)
SS 600
Columns (Factor A)
df
MS
F
3 5
Error
750
Total
2,150
23
The mean square for factor A is ______. A) 200 B) 410 C) 160 D) 50
96)
Consider the partially completed two-way ANOVA (without interaction) summary table. Source of Variation
Rows (Factor B)
SS 600
Columns (Factor A)
df
MS
F
3 5
Error
750
Total
2,150
23
Which of the following is the value of the test statistic for factor B? A) 3.20 B) 6.70 C) 5.16 D) 4.00
Version 1
54
97)
Consider the partially completed two-way ANOVA (with interaction) summary table. Source of Variation
SS
df
Rows (Factor B)
MS
F
2
Columns (Factor A)
600
200
Interaction
144
Error
384
12
Total
1,288
23
The degrees of freedom for the interaction of factors A and B is _____. A) 6 B) 4 C) 3 D) 5
98)
Consider the partially completed two-way ANOVA (with interaction) summary table. Source of Variation
SS
df
Rows (Factor B)
MS
F
2
Columns (Factor A)
600
200
Interaction
144
Error
384
12
Total
1,288
23
The mean square for interaction MSAB is _____.
Version 1
55
A) 50 B) 24 C) 75 D) 65
99)
Consider the partially completed two-way ANOVA (with interaction) summary table. Source of Variation
SS
df
Rows (Factor B)
MS
F
2
Columns (Factor A)
600
200
Interaction
144
Error
384
12
Total
1,288
23
The mean square error (MSE) is _____. A) 21 B) 43 C) 32 D) 26
100)
Consider the partially completed two-way ANOVA (with interaction) summary table. Source of Variation
SS
df
Rows (Factor B)
F
2
Columns (Factor A)
600
Interaction
144
Version 1
MS
200
56
Error
384
12
Total
1,288
23
The interaction test statistic is ____. A) 3.16 B) 4.80 C) 2.56 D) 0.75
101)
Which of the following is an example in which a one-way ANOVA test is appropriate? A) To determine if variability of test scores differs between male and female students. B) To determine if test scores depend on homework scores and gender. C) To determine if mean test scores differ between the four sections. D) To determine if test scores are normally distributed.
102) Which of the following statements is FALSE regarding the methods covered in this chapter to determine which population means differ? A) Both methods construct confidence intervals for all pairwise differences for the population means to identify which means significantly differ from one another. B) Both methods use the mean square error from the one-way ANOVA test to estimate the population variance. C) Both methods require an equal number of observations in each sample. D) Both methods use a significance level to limit the probability of making a Type I error.
103)
Which of the following situations is appropriate for a two-way ANOVA test?
Version 1
57
A) To determine if age and income influence average credit ratings. B) To determine if income influences average credit rating. C) To determine if average credit rating differs by income level. D) To determine if age influences average credit ratings.
104) A business statistics instructor wants to determine whether the average exam score is the same for two different sections and three levels of GPA. The following table shows the average exam scores according to sections and GPA levels. Assume that scores are normally distributed. Section (Factor B)
GPA Level (Factor A) High
Average
Low
Hybrid
96.4
86.4
79.8
Online
90.3
85.2
65.2
Conduct a two-way ANOVA test with no interaction. Which of the following conclusions can be made? A) At the 5% significance level, the average exam score differs across sections. B) At the 5% significance level, the average exam score differs across GPA. C) At the 10% significance level, the average exam score differs across GPA. D) At the 10% significance level, the average exam score is higher in the hybrid than the online section.
105) Which of the following statements is TRUE about a two-way ANOVA test with interaction? A) The total variation is partitioned into three components. B) We cannot use a randomized block design. C) The test is always a two-tailed test. D) If the interaction effect is significant, we can use another technique such as regression analysis to examine the main effects.
Version 1
58
106) A researcher wants to measure the effect of gender and income on an individual’s sense of belonging, measured as a score on a scale of 0 to 200. A randomized block design experiment with interaction is performed. A portion of the results is shown below. Assume that sense of belonging scores are normally distributed. Gender
Upper-Income
Middle-Income
Lower-Income
Female
145
100
77
⋮
⋮ 102
⋮ 85
⋮ 73
Male
{MISSING IMAGE}Click here for the Excel Data File Which of the following conclusions can be drawn from a two-way ANOVA test with interaction? A) At the 5% significance level, there is interaction between gender and income. B) At the 5% significance level, there is interaction between gender and sense of belonging. C) At the 1% significance level, there is interaction between gender and income. D) At the 5% significance level, the main effect of income is significant.
107) A farmer plants tomato seeds into four different plots. In each plot, there is a different fertilizer treatment that is applied to the soil. After three weeks, he measures the height of each tomato plant from each of the four plots. The data he collects is given below. Plot 1
Plot 2
Plot 3
Plot 4
15
12
65
43
19
19
45
22
29
18
43
24
35
6
26
40
29
23
44
22
10
67
Version 1
59
a. Construct an ANOVA table. b. Set up the competing hypothesis to test whether there are some differences in the mean heights between the different plots/fertilizers. c. At the 5% significance level, what is the conclusion to the test? d. Consider the sample standard deviations for each plot. Which of the assumptions might be violated?
108) A farmer plants tomato seeds into four different plots. In each plot, there is a different fertilizer treatment that is applied to the soil. After conducting the hypothesis test for the mean heights, he obtains Tukey’s confidence intervals which are below. Lower
Upper
Plots 1–2
−14.5
30.2
Plots 1–3
−43.5
−4.4
Plots 1–4
−27.6
12.8
Plots 2–3
−52.7
−10.9
Plots 2–4
−36.8
6.3
Plots 3–4
−2.0
35.1
Use Tukey’s confidence intervals to determine which plots/fertilizers have different mean heights.
109) A regional sales manager of a bank wants to know if the number of ATM transactions per day is different for four cities of equal size in the executive’s region. The executive collects relevant data and obtains the following incomplete one-way ANOVA table. Source of Variation Between Cities
Version 1
SS
df
MS
F
32
3
10.67
0.562
60
Within Cities Total
31
a. Complete the ANOVA table. b. Set up the competing hypothesis to test whether there are differences in the mean number of ATM transactions per day between the different cities. c. At the 5% significance level, what is the conclusion to the test?
110) A career counselor wants to understand if the average annual earnings differ by educational attainment. The counselor uses three education levels: high school or less, undergraduate degree, and master’s degree or more. After collecting data, the resulting incomplete one-way ANOVA table and Tukey’s confidence intervals are shown below. Source of Variation
SS
df
Between Degrees
MS
F
1,936
Within Degrees Total
8,054
24
Lower
Upper
Undergraduate–High School
8.23
43.11
Master’s–High School
11.89
46.77
Master’s–Undergraduate
−12.65
19.98
a. Complete the ANOVA table. b. Set up the competing hypothesis to test whether there are differences in mean annual salary between the different education levels. c. At the 5% significance level, what is the conclusion to the test? d. If significant differences exist, use Tukey’s confidence intervals to determine which educational levels have different mean annual earnings.
Version 1
61
111) A golf instructor wants to determine if the drive distance is different for three golf ball brands. The instructor has several of his students drive the different brands and collects the following data. Golfer (Factor B)
Golf Ball Brands (Factor A) Miller
Sapphire
Cruiser
Brad
275
350
295
Will
300
525
380
Huang
245
286
300
Singh
220
256
286
Woods
400
415
475
Fisher
356
385
396
a. Construct a two-way (without interaction) ANOVA table for the data. b. At the 5% significance level, can you conclude the average drive distance differs by brand? c. Explain why the golf instructor is not interested in the hypothesis test about the golfers. d. Explain why the golf instructor uses a two-way ANOVA rather than a one-way ANOVA for brand.
112) A research firm wants to compare the miles per gallon (mpg) for three different types of gasoline. To control for the performance of different types of cars, seven types of cars were used as the blocks. The study yielded the below as the following incomplete two-way ANOVA table. Source of Variation
SS
df
Car Type 44.10
Error
9.91
Version 1
F
12.87
Gas Type
Total
MS
20
62
a. Complete the ANOVA table. b. At the 5% significance level, can you conclude if the average mpg differs by gas type? c. At the 5% significance level, can you conclude if the average mpg differs by car type?
113) A career counselor wants to understand if the average salary differs by educational attainment blocking on industry. The counselor uses four degree levels (high school, undergraduate, master’s, and doctoral) and three industries. The study resulted in the following incomplete ANOVA table. Source of Variation Industry
SS
df
F
7.17
Education Error
MS
46.11 9.5
6
Total
a. Complete the ANOVA table. b. At the 5% significance level, can you conclude the average salary differs by educational attainment? c. At the 5% significance level, can you conclude the average salary differs by industry?
114) A researcher wants to understand how an annual mortgage payment (in dollars) depends on income level and zonal location allowing for an interaction. The data are shown below. Zonal Location (Factor B)
Income ($) (Factor A) Low
Downtown
Version 1
Medium
High
25
55
60
40
60
65
63
Suburbs
Midtown
Uptown
22
5
63
30
100
60
70
75
100
85
50
115
74
54
108
100
45
160
75
75
75
55
70
60
41
55
79
40
70
65
5
40
59
10
25
30
10
40
45
15
35
40
a. Construct a two-way (with interaction) ANOVA table for these data. b. At the 5% significance level, can you conclude there is interaction between income and zonal location? c. Are you able to conduct tests on the main effects? If yes, conduct these tests at the 5% significance level.
115) A fund manager suspects there is an interaction between fund market cap (either small, mid, or large) and the fund objective (either growth or value) on the average annual fund performance. The fund manager obtains data and produces the following incomplete two-way ANOVA table. Source of Variation Market Cap
SS
df
MS
F
345.1
Objective
18.39
Interaction Error Total
Version 1
18
14.26
1,167.4
64
a. Complete the ANOVA table. b. At the 5% significance level, can you conclude there is interaction between market cap and objective? c. Are you able to conduct tests on the main effects? If yes, conduct these tests at the 5% significance level.
116) A career counselor wants to determine if there is an interaction between gender (male or female) and sector (private or public) on average weekly income. The counselor obtains data and produces the following incomplete two-way ANOVA table. Source of Variation
SS
df
MS
Gender Sector
F 11.44
156,468
Interaction Error Total
3,853 277,045
19
a. Complete the ANOVA table. b. At the 5% significance level, can you conclude there is interaction between gender and sector? c. Are you able to conduct tests on the main effects? If yes, conduct these tests at the 5% significance level. d. Why is a two-way ANOVA (with or without interaction) preferable to two two-sample T-tests in this situation?
Version 1
65
117) The manager of a retail chain wants to determine whether mean monthly store traffic is dependent on one of the four regions of the country. The manager collects monthly store traffic data from 10 randomly selected stores in each of the four regions. Assume store traffic is normally distributed. A portion of the data is shown below: North Region
South Region
East Region
West Region
3676
7123
2718
5654
⋮ 7667
⋮ 3383
⋮ 3781
⋮ 3763
{MISSING IMAGE}Click here for the Excel Data File a.What are the competing hypotheses to test whether there are differences in the mean monthly store traffic of the four regions? b.What is the value of the test statistic? c.What is the p-value? d.At the 5% significance level, what is the conclusion to the test? e.Can you further determine which pairs of the four regions’ mean monthly store traffic are different? If you can, use the Tukey’s Honestly Significant Different method at the 5% significance level to determine which regions differ.
118) One-way ANOVA assumes the population standard deviations are unknown and assumed unequal. ⊚ true ⊚ false
119) One-way ANOVA is used to determine if differences exist between the means of three or more populations under dependent sampling. ⊚ true ⊚ false
Version 1
66
120) The between-treatments variance is the estimate of σ2 based on the variability due to chance. ⊚ true ⊚ false
121) We use ANOVA to test for differences between population means by examining the amount of variability between the samples relative to the amount of variability within the samples. ⊚ true ⊚ false
122) When the null hypothesis is rejected in an ANOVA test, Fisher’s least significant difference method is superior to Tukey’s honestly significant differences method to determine which population means differ. ⊚ true ⊚ false
123) When using Fisher’s Least Significant Difference (LSD) where each interval has an error of α, the probability of committing a Type I error for at least one of the individual confidence intervals increases as the number of pairwise comparisons increases. ⊚ true ⊚ false
124) In general, a blocking variable is used to eliminate the variability in the response due to the levels of the blocking variable. ⊚ ⊚
Version 1
true false
67
125) If the units within each block are randomly assigned to each of the treatments, then the design of the experiment is referred to as a randomized block design. ⊚ ⊚
true false
126) The interaction test is performed before making any conclusions based on the tests for the main effects. ⊚ true ⊚ false
127) When two factors interact, the effect of one factor on the mean depends upon the specific value or level for the other factor. ⊚ true ⊚ false
Version 1
68
Answer Key Test name: Chap 13_4e 1) B 2) B 3) D 4) A 5) D 6) B 7) A 8) B 9) B 10) C 11) D 12) A 13) D 14) B 15) C 16) C 17) C 18) C 19) C 20) B 21) B 22) A 23) B 24) D 25) B 26) D Version 1
69
27) B 28) C 29) D 30) B 31) A 32) C 33) B 34) C 35) B 36) C 37) B 38) C 39) C 40) A 41) C 42) C 43) D 44) A 45) A 46) D 47) A 48) B 49) C 50) D 51) B 52) B 53) C 54) D 55) A 56) C Version 1
70
57) A 58) B 59) A 60) B 61) B 62) B 63) D 64) D 65) B 66) D 67) A 68) B 69) A 70) C 71) D 72) A 73) C 74) B 75) A 76) B 77) D 78) A 79) B 80) C 81) A 82) B 83) C 84) D 85) A 86) B Version 1
71
87) C 88) D 89) A 90) B 91) C 92) D 93) A 94) B 95) C 96) D 97) A 98) B 99) C 100) D 101) C 102) C 103) A 104) C 105) D 106) A 107) a. Source of Variation
SS
df
MS
F
Between Plots/Fertilizers
3,111.61
3
1,037.20
7.45
Within Plots/Fertilizers
2,507.66
18
139.31
Total
5,619.27
21
b. Let μi be the mean height for the i-th fertilizer treatment. The competing hypotheses are H0: μ1 = μ2 = μ3 = μ4, HA: Not all population means are equal. c. The critical value is F0.05, (3, 18) = 3.16 and the p-value, P(F(3.18) ≥ 7.45), is less than 0.01. Reject the null hypothesis; we can conclude the mean height is not the same for each plot/fertilizer. d. The sample standard deviations are different. This might suggest the population standard deviations are not equal. Version 1
72
108) The mean heights are different for plots 1–3 and 2–3. We cannot conclude the mean heights are different for plots 1–2, 1–4, 2–4, and 3–4. 109) a. Source of Variation
SS
df
MS
F
Between Cities
32
3
10.67
Within Cities
532
28
19
Total
564
31
0.562
b. Let μi be the mean number of ATM transactions per day for the i-th city. The competing hypotheses are H0: μ1 = μ2 = μ3 = μ4, HA: Not all population means are equal. c. The critical value is F0.05,(3,28) = 2.95 and the p-value, P(F(3,28) ≥ 0.562), is greater than 0.10. Do not reject the null hypothesis; we cannot conclude the mean number of ATM transactions per day is different between the four cities. 110) a. Source of Variation
SS
df
MS
F
Between Degrees
3,872
2
1,936
10.18
Within Degrees
4,182
22
190
Total
8,054
24
b. Let where µi is the mean annual earnings for the i-th graduation level. The competing hypotheses are H0: μ1 = μ2 = μ3 = μ4, HA: Not all population means are equal. c. The critical value is F0.05, (2, 22) = 3.44 and the p-value, P(F(2, 22) ≥ 10.18) = 0.0007, is less than 0.01. Reject the null hypothesis; we can conclude that the mean annual earnings are not the same for each degree level. d. We cannot conclude the mean annual earnings is different for the undergraduate and master’s group. We can conclude that the mean annual earnings is different for the undergraduate and high school degree levels and the master’s and high school degree levels. 111) a. Source of Variation
SS
df
MS
F
Rows
77,665
5
15,533
7.64
Columns
16,520
2
8,260.1
4.06
Error
20,335
10
2,033.5
Total
114,520
17
Version 1
73
b. Let µi be the mean drive distance for the i-th golf ball brand. The competing hypotheses are H 0 : μ1 = μ2 = μ3 , HA: Not all population means are equal. c. The critical value is F(0.05,(2,10)) = 4.10 and the p-value, P(F(2,10) ≥ 4.06), is between 0.05 and 0.10. Do not reject the null hypothesis; we cannot conclude the average drive distance differs by the golf ball brand. The instructor is not interested in differences between the players. d. To reduce the variations in drive distances, the instructor used the randomized block design, in which the golfers form blocks. 112) a. Source of Variation
SS
Car Type
77,22
Gas Type
df
MS
F
6
12.87
15.58
44.10
2
22.05
26.70
Error
9.91
12
0.8258
Total
131.23
20
b. Let µi be the mean mileage per gallon for the i-th type of gasoline. The competing hypotheses are H 0 : μ1 = μ2 = μ3 , HA: Not all population means are equal. The critical value is F0.05,(2,12) = 3.89 and the p-value, P(F(2,12) ≥ 26.70), is less than 0.01. Reject the null hypothesis; we can conclude that the average mpg differs by gas type. c. Let µi be the mean mileage per gallon for the i-th type of car. The competing hypotheses are H 0 : μ1 = μ2 = μ3 , HA: Not all population means are equal The critical value is F0.05, (6, 12) = 3.00 and the p-value, P(F(6, 12) ≥ 15.58, is less than 0.01. Reject the null hypothesis and conclude that the average mpg differs by car type. 113) a. Source of Variation
SS
df
MS
F
Industry
7.17
2
3.59
2.27
Education
183.33
3
46.11
29.18
Error
9.5
6
1.58
Total
200.0
11
Version 1
74
b. Let µi be the mean salary for the i-th educational attainment. The competing hypotheses are H 0 : μ1 = μ2 = μ3 = μ4 , HA: Not all population means are equal The critical value is F0.05, (3, 6) = 4.76 and the p-value, P(F(3, 6) ≥ 29.18), is less than 0.01. Reject the null hypothesis; we can conclude the average salary differs by educational attainment. c. Let µi be the mean salary for the i-th industry. The competing hypotheses are H 0 : μ1 = μ2 = μ3 , HA: Not all population means are equal. The critical value is F0.05, (2, 6) = 5.14 and the p-value, P(F(2, 6) ≥ 2.27), is greater than 0.10. Do not reject the null hypothesis; we cannot conclude the average salary differs by industry. 114) a. Source of Variation
SS
df
MS
F
Zonal Location (Factor B)
20,698
3
6,899
25.15
Income (Factor A)
7,723
2
3,861.5
14.07
Interaction
6,250
6
1,042
3.80
Error
9,878
36
274.4
Total
44,549
47
b. The competing hypotheses are H0: There is no interaction between income and zonal location, HA: There is interaction between income and zonal location. The F table does not show critical values for df2 = 36 degrees of freedom, so use df2 = 30 degrees of freedom, and conclude that the test statistic is greater than critical value F0.05,(6,36)= 2.42. The p-value, P(F(6,36) ≥ 3.80), is less than 0.01. Reject the null hypothesis; we can conclude there exists an interaction effect between income and zonal location. c. Because the interaction effect is significant, we cannot test the main effects. 115) a. Source of Variation
SS
df
MS
F
Market Cap
345.1
2
172.55
12.10
Objective
262.3
1
262.3
18.39
Interaction
303.3
2
151.65
14.27
Error
256.7
18
14.26
Total
1,167.4
23
Version 1
75
b. The competing hypotheses are H0: There is no interaction between market cap and objective, HA: There is interaction between market cap and objective. The critical value is F0.05,(2,18) = 3.55 and the p-value, P(F(2,18) ≥ 14.27), is less than 0.01. Reject the null hypothesis; we can conclude there exists an interaction effect between market cap and category. c. Because the interaction effect is significant, we cannot test the main effects. 116) a. Source of Variation
SS
df
MS
F
Gender
44,086
1
44,806
11.44
Sector
156,468
1
156,468
40.36
Interaction
14,851
1
14,851
3.85
Error
61,640
16
3,853
Total
277,045
19
b. The competing hypotheses are H0: There is no interaction between gender and sector, HA: There is interaction between gender and sector. The critical value isF0.05, (1, 16) = 4.49 and the p-value, P(F(2, 18) ≥ 3.85), is between 0.05 and 0.10. Do not reject the null; we cannot conclude there exists and interaction effect between gender and sector. c. Because the interaction is not significant, we can test the main effects. Let the means below be for the average weekly income for males and females. The competing hypotheses are H 0 : μ1 = μ2 H 0 : μ1 ≠ μ2 The critical value is F0.05,(1,16) = 4.49 and the p-value, P(F0.05,(1,16) ≥ 11.44), is less than 0.01. Reject the null hypothesis; we conclude the average weekly income is different for males and females. Let the means below be for the average weekly income for public and private sectors. The competing hypotheses are H 0 : μ1 = μ2 , H 0 : μ1 ≠ μ2 The critical value is F0.05,(1,16) = 4.49 and the p-value, P(F0.05,(1,16) ≥ 40.36), is less than 0.01. Reject the null hypothesis; we can conclude the average weekly income is different for the public and private sector. d. Using two two-sample T-tests does not investigate the possible interaction or control for the other variable not being tested.
Version 1
76
117) a. H0: μ1 = μ2 = μ3 = μ4; HA: Not all population means are equal. b. The test statistic = 6.3129 c. The p-value = 0.0015 d. The p-value of 0.0015 is less than the 5% significance level, we reject H0 and conclude that there are some differences in the mean monthly store traffic of the four regions. e. Since significant differences exist, we can further determine which regions differ. Using the Tukey’s Honestly Significant Different method, mean monthly store traffic in the South Region is different from the North Region and East Region at the 5% significance level.
118) FALSE 119) FALSE 120) FALSE 121) TRUE 122) FALSE 123) TRUE 124) TRUE 125) TRUE 126) TRUE 127) TRUE
Version 1
77
CHAPTER 14 – 4e 1)
The correlation coefficient captures only a ________ relationship.
2) ________ correlation can make two variables appear closely related when no casual relation exists.
3) The actual value y may differ from the expected value E(y). Therefore, we add a ________ ________ term ε to develop a simple linear regression model.
4) With regression analysis, we explicitly assume that one variable, called the ________ variable, is influenced by another variable, called the explanatory variable.
5) The relationship between the response variable and the explanatory variables is ________ if the value of the response variable is uniquely determined by the explanatory variables.
6) The ________ regression model allows for multiple explanatory variables to predict the response variable.
7) The numerical measure that gauges dispersion from the sample regression equation is the sample ________ of the residuals.
8)
We use ________ to derive the coefficient of determination.
Version 1
1
9) If the ________ is substantially greater than zero and the number of explanatory variables is large compared with sample size, then the adjusted R2 will differ substantially from R2. Hint: use the acronym.
10)
Which of the following identifies the range for a correlation coefficient? A) Any value less than 1. B) Any value greater than 0. C) Any value between 0 and 1. D) Any value between −1 and 1.
11)
A correlation coefficient r = −0.85 could indicate a ________. A) very weak positive linear relationship B) very strong negative linear relationship C) very weak negative linear relationship D) very strong positive linear relationship
Version 1
2
12) The following scatterplot indicates that the relationship between the two variables x and y is ________.
A) strong and positive B) strong and negative C) weak and negative D) no relationship
13) The following scatterplot indicates that the relationship between the two variables x and y is ________.
Version 1
3
A) weak and positive B) strong and positive C) strong and negative D) no relationship
14)
Which of the following statements is the least accurate concerning correlation analysis?
A) Correlation does not imply causation. B) The correlation coefficient captures only a linear relationship. C) The correlation coefficient may not be a reliable measure when outliers are present in one or both of the variables. D) The correlation coefficient describes both the direction and strength of the relationship between two variables only if the two variables have the same units of measurement.
15) When testing whether the correlation coefficient differs from zero, the value of the test statistic is t20 = 1.95 with a corresponding p-value of 0.0653. At the 5% significance level, can you conclude that the correlation coefficient differs from zero? A) Yes, since the p-value exceeds 0.05. B) Yes, since the test statistic value of 1.95 exceeds 0.05. C) No, since the p-value exceeds 0.05. D) No, since the test statistic value of 1.95 exceeds 0.05.
16) When testing whether the correlation coefficient differs from zero, the value of the test statistic is t20 = −2.95 with a corresponding p-value of 0.0061. At the 5% significance level, can you conclude that the correlation coefficient differs from zero? A) Yes, since the p-value is less than 0.05. B) Yes, since the absolute value of the test statistic exceeds 0.05. C) No, since the p-value is less than 0.05. D) No, since the absolute value of the test statistic exceeds 0.05.
Version 1
4
17) The sample standard deviations for x and y are 10 and 15, respectively. The covariance between x and y is −120. The correlation coefficient between x and y is ________. A) −0.8 B) −0.5 C) 0.5 D) 0.8
18) The variance of the rates of return is 0.25 for stock X and 0.01 for stock Y. The covariance between the returns of X and Y is −0.01. The correlation of the rates of return between X and Y is ________. A) −0.25 B) −0.20 C) 0.20 D) 0.25
19) Simple linear regression analysis differs from multiple regression analysis in that ________. A) simple linear regression uses only one explanatory variable B) the coefficient of correlation is meaningless in simple linear regression C) goodness-of-fit measures cannot be calculated with simple linear regression D) the coefficient of determination is always higher in simple linear regression
20) <p>A sample regression equation is given by = −100 + 0.5x. If x = 20, the predicted value of y is ________.
Version 1
5
A) −80 B) −90 C) 110 D) 120
21) <p>A sample regression equation is given by = 155 − 34x1 − 12x2. If x1 = 3 and x2 = 2, the predicted value of y would be ________. A) 29 B) 77 C) 233 D) 281
22)
What is the name of the variable that is used to predict another variable? A) Response B) Explanatory C) Coefficient of determination D) Standard error of the estimate
23) Consider the following simple linear regression model: y = β0 + β1 response variable is ________.
+
ε . The
A) y B) x C) ε D) β0
Version 1
6
24) Consider the following simple linear regression model: y = β0 + β1x + ε. The explanatory variable is ________. A) y B) x C) ε D) β0
25) Consider the following simple linear regression model: y = β0 + β1x + ε. The random error term is ________. A) y B) x C) ε D) β0
26) Consider the following simple linear regression model: y = β0 + β1x + ε.β0 and β1 are ________. A) the response variables B) the random error terms C) the unknown parameters D) the explanatory variables
27)
<p>In the sample regression equation = b0 + b1x, What is ? A) The y-intercept B) The slope of the equation C) The value of y when x = 0 D) The predicted value of y, given a specific x value
Version 1
7
28) <p>Consider the following data: = 20, sx = 2, = −5, sy = 4, and b1 = −0.8. Which of the following is the sample regression equation? A) = −21 − 0.80 x B) = −21 + 0.80x C) = 11 − 0.80x D) = 11 + 0.80x
29) <p>Suppose a sample regression equation is given by = 11 − 0.80x. What is the predicted value of y and residual when x is 2 and y is observed to be 9? A) B) C) D)
30) <p>Consider the following data: = 20, sx = 2, = −5, sy = 4, and b1 = 0.40. Which of the following is the sample regression equation? A) = −13 − 0.40x B) = −13 + 0.40x C) = 3 − 0.40x D) = 3 + 0.40x
31) <p>Suppose a sample regression equation is given by = 3 + 0.40x. Suppose when x is 10 y is observed to be 8. What is the residual of the model prediction, and does the model under or overpredict the value of y?
Version 1
8
A) = −1, the model overpredicts y B) = 1, the model overpredicts y C) = 1, the model underpredicts y D) = −1, the model underpredicts y
32) Given the augmented Phillips model: y = β0 + β1x1 + β2x2 + ε, where y = actual rate of inflation (%), x1 = unemployment rate (%), and x2 = anticipated inflation rate (%). The response variable or variables in this model is (are) the ________. A) unemployment rate B) actual inflation rate C) anticipated inflation rate D) unemployment rate and anticipated inflation rate
33) Given the augmented Phillips model: y = β0 + β1x1 + β2x2 + ε, where y = actual rate of inflation (%), x1 = unemployment rate (%), and x2 = anticipated inflation rate (%). The explanatory variable or variables in this model is (are) the ________. A) unemployment rate B) actual inflation rate C) anticipated inflation rate D) unemployment rate and anticipated inflation rate
34) The capital asset pricing model is given by R − Rf = α + β(RM − Rf) + ε, where RM = expected return on the market, Rf = risk-free market return, and R = expected return on a stock or portfolio of interest. The response variable in this model is ________.
Version 1
9
A) R B) R − Rf C) RM D) RM− Rf
35) The capital asset pricing model is given byR − Rf = α + β(RM− Rf) + ε, where RM = expected return on the market, Rf = risk-free market return, and R = expected return on a stock or portfolio of interest. The explanatory variable in this model is ________. A) R B) R− Rf C) RM D) RM− Rf
36)
<p>Consider the sample regression equation = 10− 5x, with an R2 value of 0.65. Which
of the following is the value of the sample correlation between y and ? A) −0.81 B) −0.65 C) 0.81 D) 0.65
37)
<p>Consider the sample regression equation
100 + 10x, with an R2 value of 0.81.
Which of the following is the value of the sample correlation between y and ? A) −0.90 B) −0.81 C) 0.81 D) 0.90
Version 1
10
38) In a simple linear regression model, if the plots on a scatter diagram lie on a straight line, which of the following is the standard error of the estimate? A) −1 B) 0 C) +1 D) Infinity
39) In a simple linear regression model, if the points on a scatter diagram lie on a straight line with a negative slope, which of the following is the coefficient of determination? A) −1 B) 0 C) +1 D) Infinity
40)
Calculate the value of R2 given the ANOVA portion of the following regression output:
Source of variance
SS
df
MS
F
p-value
Regression
2,562
1
2,562
6.58
0.0145
Residual
14,395
37
389
Total
16,957
38
A) 0.151 B) 0.515 C) 0.849 D) 1.000
41)
Which of the following is not true of the standard error of the estimate?
Version 1
11
A) It can take on negative values. B) Theoretically, its value has no predefined upper limit. C) It is a measure of the accuracy of the regression model. D) It is based on the squared deviations between the actual and predicted values of the response variable.
42)
The standard error of the estimate measures ________. A) the variability of the explanatory variables B) the variability of the values of the sample regression coefficients C) the variability of the observed y-values around the predicted y-values D) the variability of the predicted y-values around the mean of the observed y-values
43)
The standard error of the estimate measures ________. A) the standard deviation of the random error B) the standard deviation of the response variable C) the standard deviation of the explanatory variable D) the standard deviation of the correlation coefficient
44) <p>Consider the sample regression equation = 12 + 3x1 − 5x2 + 7x3 − 2x4. When x1 increases by 1 unit and x2 increases by 2 units, while x3 and x4 remain unchanged, what change would you expect in the predicted y? A) Decrease by 2 B) Decrease by 4 C) Decrease by 7 D) No change in predicted y
Version 1
12
45) <p>Consider the sample regression equation: = 12 + 2x1− 6x2 + 6x3 + 2x4. When x1 increases 1 unit and x2 increases 2 units, while x3 and x4 remain unchanged, what change would you expect in the predicted y? A) Decrease by 2 B) Decrease by 10 C) Increase by 2 D) No change in the predicted y
46)
The R2 of a multiple regression of y on x1 and x2 measures the ________. A) percentage of the variation in y that is explained by the variability of x1 B) percentage of the variation in y that is explained by the variability of x2 C) statistical significance of the coefficients in the sample regression equation D) percentage of the variation in y that is explained by the sample regression equation
47) Unlike the coefficient of determination, the sample correlation coefficient in a simple linear regression ________. A) can be greater than 1 B) measures the percentage of variation explained by the regression line C) indicates whether the slope of the regression line is positive or negative D) is a proportion
48) A simple linear regression of the return of firm A, RA, on the return of firm B, RB, based on 18 observations, is = 2.2 + 0.4 RB. If the coefficient of determination from this regression is 0.09, calculate the correlation between RA and .
Version 1
13
A) 0.03 B) 0.04 C) 0.09 D) 0.30
49) When two regression models applied on the same data set have the same response variable but a different number of explanatory variables, the model that would evidently provide the better fit is the one with a ________. A) lower standard error of the estimate and a higher coefficient of determination B) higher standard error of the estimate and a higher coefficient of determination C) higher coefficient of determination and a lower adjusted coefficient of determination D) lower standard error of the estimate and a higher adjusted coefficient of determination
50)
The coefficient of determination R2 is ________. A) sometimes negative B) always lower than adjusted R2 C) usually higher than adjusted R2 D) always equal to adjusted R2
51) Using the same data set, four models are estimated using the same response variable, however, the number of explanatory variables differs. Which of the following models provides the best fit? Model 1
Model 2
Model 3
Model 4
Multiple R
0.993
0.991
0.936
0.746
R Square
0.987
0.982
0.877
0.557
Adjusted R Square
0.982
0.978
0.849
0.513
Standard Error
4,043
4,463
11,615
20,878
Version 1
14
Observations
12
12
12
12
A) Model 1 B) Model 2 C) Model 3 D) Model 4
52) A multiple regression model with two explanatory variables is estimated using 20 observations resulting in SSE = 550 and SST = 1000. Which of the following is the correct value of R2? A) 0.10 B) 0.45 C) 0.55 D) 0.90
53) A multiple regression model with two explanatory variables is estimated using 20 observations resulting in SSE = 550 and SST = 1000. The value of adjusted R2 is the closest to ________. A) 0.39 B) 0.45 C) 0.55 D) 0.61
54) A multiple regression model with four explanatory variables is estimated using 25 observations resulting in SSE = 660 and SST = 1,000. Which of the following is the correct value of R2?
Version 1
15
A) 0.16 B) 0.34 C) 0.66 D) 0.84
55) A multiple regression model with four explanatory variables is estimated using 25 observations resulting in SSE = 660 and SST = 1,000. The value of adjusted R2 is the closest to ________. A) 0.21 B) 0.34 C) 0.66 D) 0.79
56) Over the past 30 years, the sample standard deviations of the rates of return for stock X and stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. The correlation of the rates of return between X and Y is the closest to ________. A) 0.20 B) 0.24 C) 0.36 D) 0.40
57) Over the past 30 years, the sample standard deviations of the rates of return for stock X and stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. To determine whether the correlation coefficient is significantly different from zero, the appropriate hypotheses are ________.
Version 1
16
A) B) C) D)
58) Over the past 30 years, the sample standard deviations of the rates of return for stock X and stock Y were 0.20 and 0.12, respectively. The sample covariance between the returns of X and Y is 0.0096. When testing whether the correlation coefficient differs from zero, the value of the test statistic is t28 = 2.31. At the 5% significance level, the critical value is t0.025,28 = 2.048. The conclusion to the hypothesis test is ________. A) to reject H0; we can conclude that the correlation coefficient differs from zero B) to reject H0; we cannot conclude that the correlation coefficient differs from zero C) do not reject H0; we can conclude that the correlation coefficient differs from zero D) do not reject H0; we cannot conclude that the correlation coefficient differs from zero
59) A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: = 440, The value of the slope b1 is ____.
,
= 1,120, n = 11.
A) −1.29 B) −0.51 C) 0.51 D) 1.29
60) A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: ∑ xi = 224, ∑( xi − ) 2 = 402, ∑( xi− ) ( yi − ) = −552, ∑ yi = 403, ∑( y i − ) 2 = 1,139, n = 11. Which of the following is the sample regression equation?
Version 1
17
A) y^ = 12.93− 1.37x B) y^ = 64.53 − 1.37x C) y^ = 64.53 + 1.37x D) y^ = 12.93 + 1.37x
61) A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following values: = 440, , Which of the following is the sample regression equation?
= 1,120, n = 11.
A) B) C) D)
62)
A statistics student is asked to estimate y = β0 + β1x + ε. She calculates the following
values: = 440, , n = 11. Which of the following is the value of y if x equals 2?
= 1,120,
A) 6.62 B) 11.78 C) 58.22 D) 63.38
63) <p>Consider the following sample regression equation = 150− 20x, where y is the demand for Product A (in 1,000s) and x is the price of the product (in $). The slope coefficient indicates that if ________.
Version 1
18
A) the price of Product A increases by $1, then we predict the demand to decrease 20 B) the price of Product A increases by $1, then we predict the demand to increase by 20 C) the price of Product A increases by $1, then we predict the demand to decrease by 20,000 D) the price of Product A increases by $1, then we predict the demand to increase by 20,000
64) <p>Consider the following sample regression equation = 150− 20x, where y is the demand for Product A (in 1,000s) and x is the price of the product (in $). If the price of Product A is $5, then we expect demand to be ________. A) 50 B) 500 C) 5,000 D) 50,000
65) <p>Consider the following sample regression equation = 150− 20x, where y is the demand for Product A (in 1,000s) and x is the price of the product (in $). If the price of the good increases by $3, then we expect demand for Product A to ________. A) increase by 60 B) decrease by 60 C) decrease by 60,000 D) increase by 60,000
66) <p>Consider the following sample regression equation = 200 + 10x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). The slope coefficient indicates that if ________.
Version 1
19
A) the price of Product A increases by $1, then we predict the supply to decrease by 10 B) the price of Product A increases by $1, then we predict the supply to increase by 10 C) the price of Product A increases by $1, then we predict the supply to decrease by 10,000 D) the price of Product A increases by $1, then we predict the supply to increase by 10,000
67) <p>Consider the following sample regression equation = 200 + 10x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). If the price of Product A is $5, then we expect supply to be ________. A) 250 B) 2,500 C) 25,000 D) 250,000
68) <p>Consider the following sample regression equation = 244 + 9x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). If the price of Product A increases by $1, then we expect the supply for Product A to ________. A) increase by 9 B) decrease by 9 C) decrease by 9,000 D) increase by 9,000
69) <p>Consider the following sample regression equation = 200 + 10x, where y is the supply for Product A (in 1,000s) and x is the price of Product A (in $). If the price of Product A increases by $3, then we expect the supply for Product A to ________.
Version 1
20
A) increase by 30 B) decrease by 30 C) increase by 30,000 D) decrease by 30,000
70) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 +β1 Advertising + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
1
78.78
78.78
3.59
Residual
23
504.27
21.92
Total
24
583.05
Coefficients
Standard Error
t-stat
p-value
Intercept
40.7
14.27
2.852
0.0058
Advertising
2.91
1.59
−1.83
0.0602
Which of the following is the predicted Sales for a firm with Advertising of $430? A) $40,700 B) $1292.0 C) $55,100 D) $148,600
71) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 +β1 Advertising + ε. The following ANOVA table shows a portion of the regression results.
Version 1
21
df
SS
MS
F
Regression
1
78.53
78.53
3.58
Residual
23
504.02
21.91
Total
24
582.55
Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following is the predicted Sales for a firm with Advertising of $500? A) $1,480 B) $40,100 C) $54,500 D) $148,000
72) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 +β1 Advertising + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
1
78.53
78.53
3.58
Residual
23
504.02
21.91
Total
24
582.55
Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following is true?
Version 1
22
A) If Advertising goes up by $100, then we predict Sales to go up by $2,880. B) If Sales go up by $100, then we predict Advertising to go up by $2,880. C) If Advertising goes up by $100, then we predict Sales to go up by $4,298. D) If Sales go up by $100, then we predict Advertising to go up by $4,298.
73) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
1
78.53
78.53
3.58
Residual
23
504.02
21.91
Total
24
582.55
Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following is the standard error of the estimate? A) 4.68 B) 8.86 C) 21.91 D) 78.53
74) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results.
Version 1
23
df
SS
MS
F
Regression
1
78.42
78.42
3.58
Residual
23
504.05
21.92
Total
24
582.47
Coefficients
Standard Error
t-stat
p-value
Intercept
39.2
13.89
2.822
0.0058
Advertising
2.98
1.6
−1.863
0.0587
Which of the following is the coefficient of determination? A) 0.1556 B) 0.8440 C) 0.8654 D) 0.1346
75) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
1
78.53
78.53
3.58
Residual
23
504.02
21.91
Total
24
582.55
Coefficients
Standard Error
t-stat
p-value
Intercept
40.1
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following is the coefficient of determination?
Version 1
24
A) 0.1348 B) 0.1558 C) 0.8442 D) 0.8652
76) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
1
555,420
555,420
7.64
Residual
27
1,962,873
72,699
Total
28
2,518,293
Coefficients
Standard Error
t-stat
p-value
Intercept
784.92
322.25
2.44
0.02
Service
9.19
3.20
2.87
0.01
Which of the following is the monthly salary of an employee that has worked for 48 months at the bank? A) $441 B) $785 C) $1,050 D) $1,226
77) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table summarizes a portion of the regression results.
Regression
Version 1
df
SS
1
555,420
MS
F
555,420
7.64
25
Residual
27
1,962,873
72,699
Total
28
2,518,293
Coefficients
Standard Error
t-stat
p-value
Intercept
784.92
322.25
2.44
0.02
Service
9.19
3.20
2.87
0.01
Which of the following is the standard error of the estimate? A) 3.20 B) 269.63 C) 322.25 D) 72699
78) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
1
555,420
555,420
7.64
Residual
27
1,962,873
72,699
Total
28
2,518,293
Coefficients
Standard Error
t-stat
p-value
Intercept
784.92
322.25
2.44
0.02
Service
9.19
3.20
2.87
0.01
The coefficient of determination indicates that ________. A) 22.06% of the variation in Salary is explained by the sample regression equation B) 22.06% of the variation in Service is explained by the sample regression equation C) 81.61% of the variation in Salary is explained by the sample regression equation D) 81.61% of the variation in Service is explained by the sample regression equation
Version 1
26
79) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the model: Salary = β0 + β1 Service + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
1
555,420
555,420
7.64
Residual
27
1,962,873
72,699
Total
28
2,518,293
Coefficients
Standard Error
t-stat
p-value
Intercept
784.92
322.25
2.44
0.02
Service
9.19
3.20
2.87
0.01
How much of the variation in Salary is unexplained by the sample regression equation? A) 1% B) 2% C) 18.39% D) 77.94%
80) Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales (in $100,000s) as the response variable with housing starts (in 1,000s) and commercial construction (in 1,000s) as the explanatory variables. The estimated model is Lumber Sales = β0 + β1 Housing Starts + β2 Commercial Constructions + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
2
180,770
90,385
103.3
Residual
45
39,375
875
Total
47
220,145
Coefficients
Standard Error
Version 1
t-stat
p-value
27
Intercept
5.37
1.71
3.14
0.0030
Housing Starts
0.76
0.09
8.44
0.0000
Commercial Construction
1.25
0.33
3.78
0.0005
If Housing Starts were 17,000 and Commercial Construction was 3,200, the best estimate of Lumber Sales would be ________.
A) 1,692,000 B) 1,692,537 C) 2,201,400 D) 2,229,000
81) Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales (in $100,000s) as the response variable with housing starts (in 1,000s) and commercial construction (in 1,000s) as the explanatory variables.The estimated model is Lumber Sales = β0 + β1 Housing Starts + β2 Commercial Constructions + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
2
180,770
90,385
103.3
Residual
45
39,375
875
Total
47
220,145
Coefficients
Standard Error
t-stat
p-value
Intercept
5.37
1.71
3.14
0.0030
Housing Starts
0.76
0.09
8.44
0.0000
Commercial Construction
1.25
0.33
3.78
0.0005
The sample regression equation explains approximately ________% of the variation in the response Lumber Sales.
Version 1
28
A) 18 B) 22 C) 78 D) 82
82) Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales (in $100,000s) as the response variable with housing starts (in 1,000s) and commercial construction (in 1,000s) as the explanatory variables. The estimated model is Lumber Sales = β0 + β1 Housing Starts + β2 Commercial Constructions + ε. The following ANOVA table summarizes a portion of the regression results. df
SS
MS
F
Regression
2
180,770
90,385
103.3
Residual
45
39,375
875
Total
47
220,145
Coefficients
Standard Error
t-stat
p-value
Intercept
5.37
1.71
3.14
0.0030
Housing Starts
0.76
0.09
8.44
0.0000
Commercial Construction
1.25
0.33
3.78
0.0005
The standard deviation of the difference between actual lumber sales and the estimate of those sales is ________. A) 29.58 B) 48 C) 103.3 D) 875
Version 1
29
83) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
5,694,717
1,898,239
50.88
Residual
36
1,343,176
37,310
Total
39
7,037,893
Coefficients
Standard Error
t-stat
p-value
Intercept
300
84.0
3.57
0.0010
Bedroom
226
60.3
3.75
0.0006
Bath
89
55.9
1.59
0.1195
Sqft
0.2
0.09
2.22
0.0276
Which of the following would be the rent for a 1,000-square-foot apartment that has two bedrooms and two bathrooms? A) $840 B) $1,130 C) $1,260 D) $1,335
84) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df Regression
Version 1
3
SS 5,694,736
MS
F
1,898,245
50.88
30
Residual
36
1,343,175
Total
39
7,037,911
37,310
Coefficients
Standard Error
t-stat
p-value
Intercept
319
75.0
4.25
0.0040
Bedroom
221
57.7
3.83
0.0007
Bath
110
56.1
1.96
0.1201
Sqft
0.3
0.07
4.29
0.0264
The slope coefficient for Bedroom indicates that, holding other explanatory variables constant, ________. A) for each additional bedroom the rent is predicted to increase by $110 B) for each additional bedroom the rent is predicted to increase by $319 C) for each additional bedroom the rent is predicted to increase by $221 D) for each additional bedroom the rent is predicted to increase by $0.30
85) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
5,694,717
1,898,239
50.88
Residual
36
1,343,176
37,310
Total
39
7,037,893
Coefficients
Standard Error
t-stat
p-value
Intercept
300
84.0
3.57
0.0010
Bedroom
226
60.3
3.75
0.0006
Bath
89
55.9
1.59
0.1195
Sqft
0.2
0.09
2.22
0.0276
The slope coefficient for Bedroom indicates that, holding other explanatory variables constant, ________. Version 1
31
A) for each additional bedroom the rent is predicted to increase by $226 B) for each additional bedroom the rent is predicted to increase by $89 C) for each additional bedroom the rent is predicted to increase by $300 D) for each additional bedroom the rent is predicted to increase by $0.20
86) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
5,694,717
1,898,239
50.88
Residual
36
1,343,176
37,310
Total
39
7,037,893
Coefficients
Standard Error
t-stat
p-value
Intercept
300
84.0
3.57
0.0010
Bedroom
226
60.3
3.75
0.0006
Bath
89
55.9
1.59
0.1195
Sqft
0.2
0.09
2.22
0.0276
The coefficient of determination indicates that ________. A) 19.08% of the variation in Rent is explained by the sample regression equation B) 19.08% of the variation in square footage is explained by the sample regression equation C) 80.92% of the variation in Rent is explained by the sample regression equation D) 80.92% of the variation in square footage is explained by the sample regression equation
Version 1
32
87) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
5,694,771
1,898,257.0
50.87
Residual
36
1,343,265
37,313
Total
39
7,038,036
Coefficients
Standard Error
t-stat
p-value
Intercept
271
99
2.74
0.0040
Bedroom
203
57.5
3.53
0.0008
Bath
82
57.3
1.43
0.1187
Sqft
0.3
0.04
7.5
0.0281
The standard deviation of the difference between actual rent and the estimated rent is ________. A) 37,313 B) 40.13 C) 50.87 D) 193.17
88) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model as Rent = β0 + β1 Bedroom + β2 Bath + β3 Sqft + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
5,694,717
1,898,239
50.88
Residual
36
1,343,176
37,310
Version 1
33
Total
39
7,037,893
Coefficients
Standard Error
t-stat
p-value
Intercept
300
84.0
3.57
0.0010
Bedroom
226
60.3
3.75
0.0006
Bath
89
55.9
1.59
0.1195
Sqft
0.2
0.09
2.22
0.0276
The standard deviation of the difference between actual rent and the estimate of rent is ________. A) 40 B) 50.88 C) 193 D) 37,310
89) <p>When estimating = b0 + b1x1 + b2x2, the following regression results using ANOVA were obtained. df
SS
MS
F
Regression
2
210.9
105.5
114.7
Residual
17
15.6
0.92
Total
19
226.5
Coefficients
Standard Error
t-stat
p-value
Intercept
−1.6
0.57
−2.77
0.0132
x1
−0.5
0.04
−15.11
2.77E-11
x2
0.1
0.07
1.89
0.0753
<p>Which of the following is the prediction of , if x1 = 1 and x2 = 2? A) −1.9 B) −0.3 C) 1.3 D) 1.9
Version 1
34
90) <p>When estimating = β0 + β1x1 + β2x2 + ε, the following regression results using ANOVA were obtained. df
SS
MS
F
Regression
2
210.9
105.5
114.7
Residual
17
15.6
0.92
Total
19
226.5
Coefficients
Standard Error
t-stat
p-value
Intercept
−1.6
0.57
−2.77
0.0132
x1
−0.5
0.04
−15.11
2.77E-11
x2
0.1
0.07
1.89
0.0753
Which of the following is the coefficient of determination? A) 0.07 B) 0.10 C) 0.90 D) 0.93
91) <p>When estimating = β0 + β1x1 + β2x2 + ε, the following regression results using ANOVA were obtained. df
SS
MS
F
Regression
2
210.9
105.5
114.7
Residual
17
15.6
0.92
Total
19
226.5
Coefficients
Standard Error
t-stat
p-value
Intercept
−1.6
0.57
−2.77
0.0132
x1
−0.5
0.04
−15.11
2.77E-11
x2
0.1
0.07
1.89
0.0753
Which of the following is the standard error of the estimate?
Version 1
35
A) 0.82 B) 0.86 C) 0.92 D) 0.96
92) <p>When estimating = β0 + β1x1 + β2x2 + ε, the following regression results using ANOVA were obtained. df
SS
MS
F
Regression
2
210.9
105.5
114.7
Residual
17
15.6
0.92
Total
19
226.5
Coefficients
Standard Error
t-stat
p-value
Intercept
−1.6
0.57
−2.77
0.0132
x1
−0.5
0.04
−15.11
2.77E-11
x2
0.1
0.07
1.89
0.0753
2
Which of the following is the adjusted R ? A) 0.82 B) 0.86 C) 0.92 D) 0.96
93) A sociologist examines the relationship between the poverty rate and several socioeconomic factors. For the 50 states and the District of Columbia (n = 51), he collects data on the poverty rate (y, in %), the percent of the population with at least a high school education (x1), median income (x2, in $1000s), and the mortality rate per 1,000 residents (x3). He estimates the following model as = β0 + β1 Education + β2 Income + β3 Mortality + ε. The following ANOVA table shows a portion of the regression results. df Regression
Version 1
3
SS 415.3
MS 138.43
F 92.9 36
Residual
47
70.2
Total
50
485.5
Coefficients
1.49
Standard Error
t-stat
p-value
Intercept
60.6
4
15.15
1.65E-16
Education
−0.42
0.09
−4.67
5.45E-10
Income
−0.4
0.02
−20
6.02E-10
Morality
0.14
0.15
0.93
0.6438
What is the poverty rate for a state where 60% of the population has at least a high school education, the median income is $49,000, and the mortality rate is 11 per 1,000 residents? A) 21.54% B) 19.44% C) 17.34% D) 15.44%
94) A sociologist examines the relationship between the poverty rate and several socioeconomic factors. For the 50 states and the District of Columbia (n = 51), he collects data on the poverty rate (y, in %), the percent of the population with at least a high school education (x1), median income (x2, in $1000s), and the mortality rate per 1,000 residents (x3). He estimates the following model as = β0 + β1 Education + β2 Income + β3 Mortality + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
417.3
139.1
94.6
Residual
47
69.1
?
Total
50
486.4
Coefficients
Standard Error
t-stat
p-value
Intercept
60.3
4.8
12.47
1.65E-16
Education
−0.43
0.05
−7.78
5.45E-10
Income
−0.20
0.03
−7.75
6.02E-10
Morality
0.08
0.17
0.47
0.6438
What is the poverty rate for a state where 85% of the population has at least a high school education, the median income is $50,000, and the mortality rate is 10 per 1,000 residents? Version 1
37
A) 12.6% B) 14.6% C) 16.6% D) 18.6%
95) A sociologist examines the relationship between the poverty rate and several socioeconomic factors. For the 50 states and the District of Columbia (n = 51), he collects data on the poverty rate (y, in %), the percent of the population with at least a high school education (x1), median income (x2, in $1000s), and the mortality rate per 1,000 residents (x3). He estimates the following model as y = β0 + β1 Education + β2 Income + β3 Mortality + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
417.3
139.1
94.6
Residual
47
69.1
1.47
Total
50
486.4
Coefficients
Standard Error
t-stat
p-value
Intercept
60.3
4.8
12.47
1.65E-16
Education
−0.43
0.05
−7.78
5.45E-10
Income
−0.20
0.03
−7.75
6.02E-10
Morality
0.08
0.17
0.47
0.6438
The coefficient of determination indicates that ________. A) 14% of the variation in the poverty rate is explained by the regression model B) 14% of the variation in median incomes is explained by the sample regression equation C) 86% of the variation in the poverty rate is explained by the sample regression equation D) 86% of the variation in median incomes is explained by the sample regression equation
Version 1
38
96) A sociologist examines the relationship between the poverty rate and several socioeconomic factors. For the 50 states and the District of Columbia (n = 51), he collects data on the poverty rate (y, in %), the percent of the population with at least a high school education (x1), median income (x2, in $1000s), and the mortality rate per 1,000 residents (x3). He estimates the following model asy = β0 + β1 Education + β2 Income + β3 Mortality + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
Regression
3
418.1
139.37
Residual
47
67.7
1.44
Total
50
485.8
Coefficients Intercept
62.7
Education
−0.31
Income Morality
Standard Error 5.6
t-stat
F 96.8
p-value
11.2
1.65E-16
0.08
−3.88
5.45E-10
−0.1
0.06
−1.67
6.02E-10
0.1
0.27
0.37
0.6438
Which of the following is the value of the standard error of the estimate? A) 11.81 B) 139.37 C) 1.44 D) 1.2
97) A sociologist examines the relationship between the poverty rate and several socioeconomic factors. For the 50 states and the District of Columbia (n = 51), he collects data on the poverty rate (y, in %), the percent of the population with at least a high school education (x1), median income (x2, in $1000s), and the mortality rate per 1,000 residents (x3). He estimates the following model asy = β0 + β1 Education + β2 Income + β3 Mortality + ε. The following ANOVA table shows a portion of the regression results. df
SS
MS
F
Regression
3
417.3
139.1
94.6
Residual
47
69.1
1.47
Total
50
486.4
Version 1
39
Coefficients
Standard Error
t-stat
p-value
Intercept
60.3
4.8
12.47
1.65E-16
Education
−0.43
0.05
−7.78
5.45E-10
Income
−0.20
0.03
−7.75
6.02E-10
Morality
0.08
0.17
0.47
0.6438
Which of the following is the value of the standard error of the estimate? A) 1.47 B) 1.21 C) 139.1 D) 4.8
98) The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience. Years of Experience
Sales
2
8
3
8
3
6
3
9
8
9
5
14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The slope of the regression equation is equal to ________. A) 5.61 B) 0.46 C) 0.17 D) −2.79
99) The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience. Years of Experience 1
Version 1
Sales 7
40
2
9
2
9
4
8
5
14
8
14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The slope of the regression equation is equal to ________. A) −3.33 B) 0.71 C) 1.00 D) 6.15
100) The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience. Years of Experience
Sales
1
7
2
9
2
9
4
8
5
14
8
14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The coefficient of determination for this sample is equal to ________. A) 0.456 B) 0.519 C) 0.667 D) 0.712
101) The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience. Years of Experience
Version 1
Sales
41
8
14
5
11
4
7
8
18
8
17
2
14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The standard error of the estimate is equal to ________. A) 4.41 B) 6.81 C) 3.65 D) 4.65
102) The following table shows the number of cars sold last month by six dealers at Centreville Nissan dealership and their number of years of sales experience. Years of Experience
Sales
1
7
2
9
2
9
4
8
5
14
8
14
Management would like to use simple regression analysis to estimate monthly car sales using the number of years of sales experience. The standard error of the estimate is equal to ________. A) 1.84 B) 2.60 C) 2.84 D) 5.00
Version 1
42
103) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales
Price
7
$ 8
2
$ 11
6
$ 13
1
$ 11
7
$ 17
7
$ 7
4
$ 12
The predicted weekly sales for the novel when priced at $10 is equal to ________. A) 3.49 B) 4.87 C) 3.02 D) 5.75
104) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales 7 4 5 9 8 8 7
Price $ 12 $ 11 $ 10 $ 9 $ 8 $ 7 $ 7
The predicted weekly sales for the novel when priced at $10 is equal to ________.
Version 1
43
A) 4.60 B) 5.07 C) 6.45 D) 7.33
105) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales 7 4 5 9 8 8 7
Price $ 12 $ 11 $ 10 $ 9 $ 8 $ 7 $ 7
The total sum of squares for this sample is equal to ________. A) 18.86 B) 21.56 C) 22.86 D) 26.93
106) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales 7 4 5 9 8
Version 1
Price $ 12 $ 11 $ 10 $ 9 $ 8
44
8 7
$ 7 $ 7
The error sum of squares for this sample is equal to ________. A) 10.19 B) 13.70 C) 16.61 D) 20.33
107) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales 7 4 5 9 8 8 7
Price $ 12 $ 11 $ 10 $ 9 $ 8 $ 7 $ 7
The regression sum of squares for this sample is equal to ________. A) 4.17 B) 4.61 C) 5.16 D) 6.25
108) Costco sells paperback books in their retail stores and wants to examine the relationship between the prices and sales. The price of a particular novel was adjusted each week and the weekly sales are in the following table. Management would like to use a simple linear regression model that uses prices to predict sales. Sales 7
Version 1
Price $ 12
45
4 5 9 8 8 7
$ 11 $ 10 $ 9 $ 8 $ 7 $ 7
The coefficient of determination for this sample is equal to ________. A) 0.273 B) 0.311 C) 0.392 D) 0.450
109) The following data for five years of the annual returns for two of Vanguard’s mutual funds, the Vanguard Energy Fund (x) and the Vanguard Healthcare Fund (y), were given as sx = 35.77, sy = 13.34, sxy = 447.68. Which of the following is the value of the sample correlation coefficient? A) 0.88 B) 0.94 C) 0.78 D) −0.78
110) The following data for five years of the annual returns for two of Vanguard’s mutual funds, the Vanguard Energy Fund (x) and the Vanguard Healthcare Fund (y), were given as sx = 35.77, sy = 13.34, sxy = 447.68. The competing hypotheses are . Which of the following is the value of the test statistic? A) 7.47 B) 5.74 C) 4.77 D) 4.27
Version 1
46
111) The following data for five years of the annual returns for two of Vanguard’s mutual funds, the Vanguard Energy Fund (x) and the Vanguard Healthcare Fund (y), were given as sx = 35.77, sy = 13.34, sxy = 447.68. The competing hypotheses are . At the 5% significance level, which of the following is the critical value of the test statistic? A) 2.353 B) 2.571 C) 2.776 D) 3.182
112) The following data for five years of the annual returns for two of Vanguard’s mutual funds, the Vanguard Energy Fund (x) and the Vanguard Healthcare Fund (y), were given as sx = 35.77, sy = 13.34, sxy = 447.68. The competing hypotheses are . At the 5% significance level which of the following is the correct conclusion to the test? A) Reject the null hypothesis; we can conclude that the returns of these two mutual funds are correlated. B) Reject the null hypothesis; we cannot conclude that the returns of these two mutual funds are not correlated. C) Do not reject the null hypothesis; we can conclude that the returns of these two mutual funds are correlated. D) Do not reject the null hypothesis; we cannot conclude that the returns of these two mutual funds are correlated
113)
Which of the following Excel functions returns the sample correlation coefficient? A) CORRELATION B) CORREL C) CORREL.S D) COVARIANCE.S
Version 1
47
114)
Which of the following R functions returns the sample correlation coefficient? A) cor.test B) lm C) summary D) cor
115)
Which of the following R functions creates a linear model object? A) cor.test B) lm C) summary D) cor
116)
Which of the following statements about correlation analysis is FALSE?
A) It can be used to determine the strength of the linear relationship between two variables. B) It can be used to determine the direction of the linear relationship between two variables. C) It can be used to determine the strength of the linear relationship between three or more variables. D) It involves measures that quantify the linear relationship between two variables.
117)
Which of the following statements about regression analysis is FALSE?
Version 1
48
A) It uses information on the explanatory variables to predict the value of the response variable. B) It can establish cause-and-effect relationships. C) It is related to correlation analysis. D) It can have multiple explanatory variables.
118)
Which of the following statements is TRUE about goodness-of-fit measures? A) Goodness-of-fit measures are used to gauge the predictive power of a regression
model. B) Goodness-of-fit measures are always improved by increasing the number of explanatory variables in a regression model. C) The higher the value of any goodness-of-fit measure, the better the model fit. D) Goodness-of-fit measures are used to assess how well the explanatory variables explain the variation in the response variable.
119)
Consider the following sample data.
x
3
13
9
6
20
y
50
83
20
80
160
a. Construct and interpret a scatterplot. b. Calculate and interpret the sample covariance. c. Calculate and interpret the sample correlation coefficient.
Version 1
49
120) A sample of 30 observations provides the following statistics: sx = 14;sy = 19; sxy = −150. a. Calculate and interpret the sample correlation coefficient rxy. b. Specify the hypotheses to determine whether or not x and y are significantly correlated. c. At the 5% significance level, what is the conclusion to the test using the critical-value approach? Explain. d. At the 5% significance level, what is the conclusion to the test using the p-value approach? Explain.
121) A statistics instructor wants to examine the relationship between the hours a student spends studying for the final exam (Hours) and a student’s grade on the final exam (Grade). She takes a sample of five students. Student
x (Hours)
y (Grade)
1
8
75
2
2
47
3
3
50
4
15
88
5
25
93
= 10.6 = 70.6 sx = 9.6 sy = 21.2 sxy = 186.8 a. Compute the sample correlation coefficient. b. Specify the competing hypotheses to determine whether the hours spent studying and the final grade are correlated. c. Calculate the value of the test statistic and approximate the corresponding p-value. d. At the 10% significance level, what is the conclusion to the test? Explain.
Version 1
50
122) A portfolio manager is interested in reducing the risk of a particular portfolio by including assets that have little, if any, correlation. He wonders whether the stock prices for the firms Apple and Google are correlated. As a very preliminary step, he collects the monthly closing stock price for each firm from January 2012 to April 2012. Month 1 2 3 4
Apple $ 428.58 $ 497.57 $ 577.51 $ 606.00
Google $ 614.89 $ 606.40 $ 627.27 $ 619.43
527 617 sx = 80.30 sy = 8.72 sxy = 410.11 a. Compute the sample correlation coefficient. b. Specify the competing hypotheses to determine whether the stock prices are correlated. c. Calculate the value of the test statistic and approximate the corresponding p-value. d. At the 5% significance level, what is the conclusion to the test? Explain.
123) Consider the following information regarding a response variable y and an explanatory variable x.
a. Calculate b0 and b1. b. What is the sample regression equation? Predict y if x equals 10. c. Calculate the standard error of the estimate. d. Calculate and interpret the coefficient of determination.
124)
Consider the following sample data: x
Version 1
22
35
14
10
51
y
26
55
30
12
a. Use software to construct a scatterplot. What kind of relationship does it indicate? b. Use software to obtain b1 and b0. What is the sample regression equation? c. Find the predicted value for y if x equals 10, 15, and 20.
125)
Consider the following sample data: x
12
23
11
20
14
y
20
12
17
13
18
a. Use software to construct a scatterplot and verify that estimating a simple linear regression is appropriate. b. Use software to calculate b0 and b1. What is the sample regression equation? Predict y if x equals 15. c. Interpret the value of the slope coefficient, b1. d. Use software to obtain and se. e. Use software to obtain R2.
126) John is an undergraduate business major studying at a local university. He wonders how his grade point average (GPA) can affect his future earnings. He asks five recent business school graduates information on their GPA and income (in $1,000s). The following table shows this information. x: GPA
3.0
3.9
3.2
3.7
3.5
y: Income (in $1,000s)
39
58
36
45
48
a. Use software to construct a scatterplot, Verify that estimating a simple linear regression is appropriate in this case. b. Use software to calculate b0 and b1. What is the sample regression equation? c. Interpret the coefficient for GPA. d. Find the predicted income earned if a GPA equals 3.0, 3.3, and 3.6.
Version 1
52
127) The following portion of regression results was obtained when estimating a simple linear regression model. df
SS
MS
F
Regression
1
725.56
725.56
751.68
Residual
23
22.20
B
Total
24
A
Coefficients
Standard Error
t-stat
p-value
Intercept
80.30
2.08
38.68
1.95E-22
x
−0.28
0.01
−27.42
4.54E-19
a. What is the sample regression equation? b. Interpret the slope coefficient for x1. c. Find the predicted value for y if x1 equals 200. d. Fill in the missing values A and B in the ANOVA table. e. Calculate the standard error of the estimate. f. Calculate R2.
128)
When estimating a multiple regression model, the following portion of output is obtained: Coefficient
Standard Error
t-stat
p-value
Intercept
10.66
6.78
1.57
0.1770
x1
2.53
0.40
6.30
0.0015
x2
0.29
0.18
1.68
0.1534
a. What is the sample regression equation? b. Interpret the slope coefficient for x1. c. Find the predicted value for y if x1 equals 22 and x2 equals 41.
Version 1
53
129)
The following ANOVA table was obtained when estimating a multiple regression model. df
SS
MS
F
Regression
2
699
349.5
Residual
17
52
3
Total
19
751
116.5
a. Calculate the standard error of the estimate. b. Calculate the coefficient of determination. c. Calculate adjusted R2.
130) The following portion of regression results was obtained when estimating a multiple regression model. df
SS
MS
F
Regression
A
8,693.11
4,346.56
375.16
Residual
22
254.89
C
Total
24
B
Coefficients
Standard Error
t-stat
p-value
Intercept
257.74
22.82
11.29
1.27E-10
x1
−2.97
0.47
−6.31
2.37E-06
x1
0.23
0.24
0.96
0.3454
Version 1
54
a. What is the sample regression equation? b. Interpret the slope coefficient for x1. c. Find the predicted value for y if x1 equals 25 and x2 equals 50. d. Fill in the missing values in the ANOVA table. e. Calculate the standard error of the estimate. f. Calculate R2.
131) <p>A researcher analyzes the relationship between amusement park attendance and the price of admission. She estimates the following model: , where Attendance is the daily attendance (in 1,000s) and Price is the gate price (in $). A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
1
19,744.99
19,744.99
44.79
Residual
28
12,343.78
440.85
Total
29
32,088.78
Coefficients
Standard Error
t-stat
p-value
Intercept
334.60
38.84
8.61
2.32E-09
Price
−3.10
0.46
−6.69
2.9E-07
a. Predict the Attendance for an amusement park that charges $80 for admission. b. Interpret the slope coefficient for Price. c. Calculate the standard error of the estimate. d. Calculate and interpret the coefficient of determination. How much of the variability in Attendance is unexplained?
Version 1
55
132) <p>An admissions officer wants to examine the cumulative GPA of new students, and has data on 224 first-year students at the end of their first two semesters. The admissions officer estimates the following model: , where HSM, HSS, and MSE are their average high school math, science, and English grades (as proportions). The regression results are shown in the accompanying table. df
SS
MS
F
Regression
3
27.71
9.24
Residual
220
107.75
0.48977
Total
223
135.46
Coefficients
Standard Error
t-stat
p-value
Intercept
3.01
0.2942
2.01
0.0462
HSM
0.17
0.0354
4.75
0.0001
HSS
0.03
0.0376
0.091
0.3619
HSE
0.05
0.0387
1.17
0.2451
18.61
a. Predict the GPA when the average math grade is 90%, the average science grade is 85%, and the average English grade is 85%. b. If a student had a GPA of 3.0. What is the residual and does the sample regression equation under- or overpredict the GPA? c. Calculate the standard error of the estimate. d. Interpret the coefficient of determination.
133) An investment analyst wants to examine the relationship between a mutual fund’s return, its turnover rate, and its expense ratio. She randomly selects 10 mutual funds and estimates: , where Return is the average five-year return (in %), Turnover is the annual holdings turnover (in %), Expense is the annual expense ratio (in %), and is the random error component. A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
2
93.33
46.67
4.90
Residual
7
66.69
9.53
Version 1
56
Total
9
160.02
Coefficients
Standard Error
t-stat
p-value
Intercept
30.60
4.30
7.12
0.000
Turnover
0.13
0.06
2.23
0.061
Expense
0.90
4.08
0.22
0.831
a. Predict the return for a mutual fund that has an annual holdings turnover of 60% and an annual expense ratio of 1.5%. b. Interpret the slope coefficient for the variable Expense. c. Calculate the standard error of the estimate. d. Calculate and interpret the coefficient of determination.
134) Data was collected for 30 professional tennis players regarding their performance in Grand Slams (the four major tennis tournaments in the world). The response variable Win, expressed as a proportion ranging from 0 to 1, is believed to depend on two explanatory variables: the percentage of the Double Faults and the number of Aces. The following model is estimated: . A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
2
1.24
0.620
41.85
Residual
27
0.40
0.015
Total
29
1.64
Coefficients
Standard Error
t-stat
p-value
Intercept
0.451
0.080
5.646
5.4E-06
DoubleFaults
−0.007
0.0024
−2.875
0.0078
Aces
0.015
0.003
4.65
7.8E-05
a. Predict the winning percentage for a player who had 20 double faults and five aces. b. Interpret the slope coefficient for the variable DoubleFaults. c. Calculate the standard error of the estimate. d. Calculate and interpret the coefficient of determination. e. Calculate the adjusted R2.
Version 1
57
135) A manager at a ski resort in Vermont wanted to determine the effect that weather had on its sales of lift tickets. The manager of the resort collected data over the last 20 years on the number of lift tickets sold during Christmas week (y), the total snowfalls in inches (x1), and the average temperature in degrees Fahrenheit (x2). The following model is estimated: . A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
2
32,516
16,250
6.49
Residual
17
42,539
2,502
Total
19
75,055
Coefficients
Standard Error
t-stat
p-value
Intercept
8,308
903.7
9.19
0.0001
Snowfall
74.59
31.57
2.36
0.0305
Temperature
−8.75
19.70
−0.44
0.6625
a. Predict the number of lift tickets sold during Christmas week, when the total snowfall was 25 inches and the average temperature was 35 degrees Fahrenheit. b. Interpret the slope coefficient for Snowfall. c. Calculate the standard deviation of the difference between the actual number of tickets sold and the estimate of the number of tickets sold. d. Calculate and interpret the coefficient of determination. e. Calculate the adjusted R2.
Version 1
58
136) A sociologist studies the relationship between a district’s average score on a standardized test for 10th-grade students (y), the average school expenditures per student (x1 in $1,000s), and an index of the socioeconomic status of the district (x2). The following model is estimated: . A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
2
840
420
21.47
Residual
23
450
19.57
Total
25
1,290
Coefficients
Standard Error
t-stat
p-value
Intercept
10.16
11.9
0.85
0.4041
Expenditures
8.50
2.23
3.81
0.0009
Index
4.30
1.90
2.26
0.0336
a. Predict a district’s average test score if average expenditures are $4,500 and the district’s social index is 8. b. Interpret the slope coefficient for Expenditures. c. Calculate the standard error of the estimate. d. Calculate and interpret the coefficient of determination. e. Calculate the adjusted R2.
137)
<p>An analyst examines the effect that various variables have on crop yield. He
estimates the following model: , where y is the average yield in bushels per acre, x1 is the amount of summer rainfall, x2 is the average daily use in machine hours of tractors on the farm, and x3 is the amount of fertilizer used per acre. A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
3
12,000
4000
10
Residual
6
2,400
400
Total
9
14,400
Version 1
59
Coefficients
Standard Error
t-stat
p-value
Intercept
1.6
1
1.6
0.1232
x1
7.5
2.5
3
0.0064
x2
6
4
1.5
0.1472
x3
1
0.5
2
0.0574
a. Predict the crop yield per acre if x1 is 5, x2 is 4, and x3 is the 0.5. b. Calculate the standard deviation of the difference between the actual crop yield and the estimate of the crop yield. c. How much of the variation in crop yield is unexplained by the model?
138) A realtor studies the relationship between price and total square footage of commercial properties in the United States. She believes that total square footage is a key factor in determining the price of the property. She collects data on price and square footage of 100 commercial properties throughout the country. A portion of the data is shown below: Price
Total Square Footage
4400000
11900
:
:
279000
3124
{MISSING IMAGE}Click here for the Excel Data File a. Calculate the sample correlation coefficient between price and total square footage. b. What are the competing hypotheses to determine whether the population correlation coefficient between price and total square footage is positive? c. Calculate the value of the test statistic. d. Calculate the p-value. e. At the 10% significance level, what is the conclusion to the test?
Version 1
60
139) A realtor studies the relationship between price and total square footage of commercial properties in the United States. She believes that total square footage is a key factor in determining the price of the property. She collects data on price and square footage of 100 commercial properties throughout the country. A portion of the data is shown below: Price
Total Square Footage
4400000
11900
:
:
279000
3124
{MISSING IMAGE}Click here for the Excel Data File a. What is the sample regression equation? b. Predict the price of a commercial property if its total square footage is 8000.
140) A realtor believes that square footage and age are key factors in determining the price of commercial properties in the United States. She collects data on price, age, and square footage of 100 commercial properties throughout the country. A portion of the data is shown below: Price
Total Square Footage
Age
4400000
11900
12
:
:
:
279000
3124
55
{MISSING IMAGE}Click here for the Excel Data File a. What is the sample regression equation? b. Predict the price of a commercial property if its total square footage is 8000 and age is 50.
141) An analyst at a credit card company wants to find the best fit linear regression model to predict an individual’s credit score. He has data linking credit scores for 55 individuals with two measures: age and income. A portion of the data is shown below: Version 1
61
Credit Score
Income
Age
676
50505
40
:
:
:
458
30985
39
Click here for the Excel Data File a. Compare three linear regression models where Model 1 predicts the credit score based on Age, Model 2 uses Income, and Model 3 uses both Age and Income. b. Which of the three models is the preferred model? Explain your choice with two goodnessof-fit measures. c. What percentage of the sample variation in credit scores is explained by the preferred model? d. What percentage of the sample variation in credit scores is unexplained by the preferred model?
142)
A scatterplot graphically shows the relationship between two variables. ⊚ true ⊚ false
143) The covariance and correlation coefficient are measures that quantify the nonlinear relationship between two variables. ⊚ true ⊚ false
144) Covariance can be used to determine if a linear relationship between two variables is positive or negative. ⊚ true ⊚ false
Version 1
62
145) The covariance can be used to determine the strength of a linear relationship between two variables. ⊚ true ⊚ false
146)
The correlation coefficient can only range between 0 and 1. ⊚ true ⊚ false
147)
The correlation coefficient is unit-free. ⊚ true ⊚ false
148)
The sample correlation coefficient cannot equal zero. ⊚ true ⊚ false
149) The value 0.75 of a sample correlation coefficient indicates a stronger linear relationship than that of 0.60. ⊚ true ⊚ false
150)
A residual is the difference between the predicted and observed values of y. ⊚ true ⊚ false
Version 1
63
151)
Another name for an explanatory variable is the dependent variable. ⊚ true ⊚ false
152)
Simple linear regression includes more than one explanatory variable. ⊚ ⊚
true false
153) The deterministic component of a linear regression model is due to the omission of relevant factors that influence the response variable. ⊚ true ⊚ false
154)
<p>The deterministic component of the simple linear regression model is given by . ⊚ true ⊚ false
155)
A residual is the difference between the observed and the predicted values of y. ⊚ true ⊚ false
156) The method of least squares picks the slope and intercept of the sample regression equation by minimizing SSE. ⊚ true ⊚ false
Version 1
64
157) The coefficients of each explanatory variable in a multiple linear regression model have the same interpretation as the coefficient of the explanatory variable in a simple linear regression model. ⊚ ⊚
true false
158) If two linear regression models have the same number of explanatory variables, a model with an R2 value of 0.45 is a better prediction model than a model with an R2 value of 0.65. ⊚ true ⊚ false
Version 1
65
Answer Key Test name: Chap 14_4e 1) linear 2) Spurious 3) random error 4) response 5) deterministic 6) multiple 7) variance 8) analysis of variance (ANOVA) 9) SSE 10) D 11) B 12) A 13) C 14) D 15) C 16) A 17) A 18) B 19) A 20) B 21) A 22) B 23) A 24) B 25) C 26) C Version 1
66
27) D 28) C 29) D 30) B 31) C 32) B 33) D 34) B 35) D 36) A 37) D 38) B 39) C 40) A 41) A 42) C 43) A 44) C 45) B 46) D 47) C 48) D 49) D 50) C 51) A 52) B 53) A 54) B 55) A 56) D Version 1
67
57) B 58) A 59) A 60) B 61) C 62) C 63) C 64) D 65) C 66) D 67) D 68) D 69) C 70) B 71) A 72) A 73) A 74) D 75) A 76) D 77) B 78) A 79) D 80) D 81) D 82) A 83) B 84) C 85) A 86) C Version 1
68
87) D 88) C 89) A 90) D 91) D 92) D 93) C 94) B 95) C 96) D 97) B 98) B 99) C 100) D 101) C 102) A 103) B 104) C 105) A 106) B 107) C 108) A 109) B 110) C 111) D 112) A 113) B 114) A 115) B 116) C Version 1
69
117) B 118) D 119) a. The scatterplot indicates that there is a positive linear relationship between x and y. [MISSING IMAGE: A graph plots x versus y. The horizontal axis ranges from 0 to 25 in increments of 5. The vertical axis ranges from 0 to 180 in increments of 20. Dots are plotted at (3, 50), (6, 80), (9, 20), (14, 80), and (20, 160)., A graph plots x versus y. The horizontal axis ranges from 0 to 25 in increments of 5. The vertical axis ranges from 0 to 180 in increments of 20. Dots are plotted at (3, 50), (6, 80), (9, 20), (14, 80), and (20, 160). ] b. sxy = 270.10; positive linear relationship c. sxy = 0.78; relatively strong positive linear relationship 120) a. rxy = −0.56; moderate negative linear relationship b. c. t28 = −3.58; Reject H0 if t28 < −2.048 or t28 > 2.048. Reject the null hypothesis; we can conclude the variables are correlated. d. Using the t-table, the p-value < 0.005; Since the p-value < α = 0.05, reject H0; We can conclude the variables are correlated. 121) a. rxy = 0.92 b. c. t3 = 4.07 using the t table; 0.02 < p-value < 0.05 d. Because the p-value < α = 0.10, we reject H0; we can conclude that the hours spent studying and the final grade are correlated. 122) a. rxy = 0.59 b. c. t2 = 1.03 using the t table, the p-value > 0.20 d. Because the p-value > α = 0.05, we do not reject H0; we cannot conclude that the stock prices are significantly correlated. Including these two stocks may increase the diversification effect; however, keep in mind that this was a preliminary step and the test was based on a very small sample. 123) a. b1 = −0.5, b0 = −1 b. = −1 − 0.5 x, −6 c. se = 0.75 d. R2 = 0.93, 93% of the variation in y is explained by the sample regression equation.
Version 1
70
124) a. The scatterplot indicates a positive linear relationship between x and y. [MISSING IMAGE: A graph plots x versus y. The horizontal axis ranges from 0 to 40 in increments of 5. The vertical axis ranges from 0 to 60 in increments of 10. Dots are plotted at (10, 15), (14, 31), (22, 28), and (35, 55)., A graph plots x versus y. The horizontal axis ranges from 0 to 40 in increments of 5. The vertical axis ranges from 0 to 60 in increments of 10. Dots are plotted at (10, 15), (14, 31), (22, 28), and (35, 55). ] b. b1 = 1.4976, b0 = 0.4236 , = 0.4236 + 1.4976 x c. If x = 10, = 15.3996, if x = 15, = 22.8876, if x = 20 , = 30.3756 125) a. The variables x and y appear to have a negative linear relationship. [MISSING IMAGE: A graph plots x versus y. The horizontal axis ranges from 10 to 22 in increments of 4. The vertical axis ranges from 0 to 60 in increments of 10. Dots are plotted at (11, 16), (12, 21), (14, 19), (20, 13), and (23, 12)., A graph plots x versus y. The horizontal axis ranges from 10 to 22 in increments of 4. The vertical axis ranges from 0 to 60 in increments of 10. Dots are plotted at (11, 16), (12, 21), (14, 19), (20, 13), and (23, 12). ] b. b1 = −0.59, b0 = 25.46, = 25.46 − 0.59 x c. If x increases by 1 unit, then we predict y to decrease by 0.59 d. = 2.53, se = 1.59 e. R2 = 0.8350 126) a. Because GPA and Income appear to have a positive linear relationship, the simple linear regression model is appropriate in this case. [MISSING IMAGE: A graph plots income in $1,000s versus G P A. The horizontal axis is labeled G P A. It ranges from 2.5 to 4 in increments of 0.5. The vertical axis is labeled income in $1000s. It ranges from 0 to 70 in increments of 10. Dots are plotted at (3, 40), (3.2, 35), (3.5, 50), (3.7, 45), and (3.8, 60)., A graph plots income in $1,000s versus G P A. The horizontal axis is labeled G P A. It ranges from 2.5 to 4 in increments of 0.5. The vertical axis is labeled income in $1000s. It ranges from 0 to 70 in increments of 10. Dots are plotted at (3, 40), (3.2, 35), (3.5, 50), (3.7, 45), and (3.8, 60). ] b. b1 = 20.56, b0 = − 25.95, =− 25.95 + 20.56 GPA c. For each 0.1 increase in GPA, income is predicted to increase by $2,056. d. If GPA = 3.0, = $35,726, if GPA = 3.3, = $41,894, if GPA = 3.6, $48,062
Version 1
=
71
127) <p>a. b. As x1 increases by 1 unit, y is predicted to decrease by 0.28 c. d. A(SST) = 747.76, B(MSE) = 0.965 e. f. 128) <p>a. b. As x1 increases by 1 unit, y is predicted to increase by 2.53 holding x2 constant. c. 129) a. b. c. Adjusted 130) a. b. As x1 increases by 1 unit, y is predicted to decrease by 2.97 holding x2 constant. c. d. A = 2, B = 8,948, C = 11.59 e. se = 3.40 f. R2 = 0.97 131) a. 86,600 people b. If the Price goes up by $1, then we predict Attendance to decrease by 3,100 people. c. or 34,000 people d. , 61.5% of the variation in Attendance is explained by the sample regression equation; this infers that 38.5% of the variability in Attendance is unexplained. 132) a. We predict a GPA of 3.23. b. The residual is 3.23 − 3.0 = 0.23; the model overpredicts the GPA. c. d. , so 20.5% of the variation in GPAs is explained by the sample regression equation. 133) a. 39.75% b. If the expense ratio goes up by 1%, then we predict the return to increase by 0.90% holding the turnover rate constant. c. . d. , 58% of the variation in Return is explained by the sample regression equation.
Version 1
72
134) a. 38.6% b. If DoubleFaults go up by 1%, then we predict the winning proportion goes down by 0.007 holding the number of Aces constant. c. . d. , 76% of the variation in Win is explained by the sample regression equation. e. Adjusted 135) a. 9,867 b. If Snowfall increases by 1 inch, then we predict Sales of lift tickets to increase by approximately 75 holding Temperature constant. c. d. , 43% of the variation in ticket Sales is explained by the variation in the sample regression equation. e. Adjusted 136) a. 82.8 b. If Expenditures increase by $1,000, then we predict a district’s test Score to increase by 8.5 holding the socioeconomic Index constant. c. d. , 65% of the variation in test Scores is explained by the variation in the explanatory variables. e. Adjusted 137) a. 63.6 b. c. Approximately 17% of the variation in crop yield is unexplained by the sample regression equation. 138) a. 0.1395 b. H0 : ρxy ≤ 0; HA : ρxy > 0 c. The test statistic = 1.3947 d. The p-value = 0.0831 e. A p-value of 0.0831 is less than the 10% significance level. The conclusion is the population correlation coefficient between Price and Total Square Footage is greater than zero. 139) <p>a. = 5257441.76 + 4.78 × Total Square Footage b. 5295685.41 140) <p>a. = 4798080.11 + 4.85 × Total Square Footage + 13189.31 × Age b. 5496357.97
Version 1
73
141) a. Model 1:
= 311.0952 + 5.3644 × Age
Model 2: = 429.5474 + 0.0023 × Income Model 3: = 319.5833 + 0.0015 × Income + 3.6038 × Age b. Model 3 is the preferred model because it has the lowest Standard Error and highest Adjusted R Square. c. 54% d. 46%
142) TRUE 143) FALSE 144) TRUE 145) FALSE 146) FALSE 147) TRUE 148) FALSE 149) TRUE 150) FALSE 151) FALSE 152) FALSE 153) FALSE 154) TRUE 155) TRUE 156) TRUE 157) FALSE 158) FALSE
Version 1
74
CHAPTER 15 – 4e 1) The F test can be applied for any number of linear restrictions; the resulting test is often referred to as the ________ F test.
2)
The ________ model is a complete model that imposes no restrictions on the coefficients.
3) If the confidence interval does not contain the value zero, then the ________ variable associated with the regression coefficient is significant.
4) It is common to refer to the interval estimate for an individual value of y as the ________ interval.
5) One of the required assumptions of regression analysis states that the error term has a(an) ________ value of zero.
6)
Conditional on x1, x2, ..., xk, the error term ε is ________ distributed.
7) ________ plots can be used to detect common violations, and they can be used to detect outliers.
8) Consider the following simple linear regression model: y = β0 + β1x + ε. When determining whether x significantly influences y, the null hypothesis takes the form ________.
Version 1
1
A) H0:β1 = 0 B) H0:β1 = 1 C) H0:b1 = 0 D) H0:b1 = 1
9) Consider the following simple linear regression model: y = β0 + β1x + ε. When determining whether there is a one-to-one relationship between x and y, the null hypothesis takes the form ________. A) H0:β1 = 0 B) H0:β1 = 1 C) H0:b1 = 0 D) H0:b1 = 1
10) Consider the following simple linear regression model: y = β0 + β1x + ε. When determining whether there is a positive linear relationship between x and y, the alternative hypothesis takes the form ________. A) HA:β1 = 0 B) HA:β1 > 0 C) HA:β1 < 0 D) HA:b1 > 1
11) Consider the following simple linear regression model:y = β0 + β1x + ε. When determining whether there is a negative linear relationship between x and y, the alternative hypothesis takes the form ________.
Version 1
2
A) HA:β1 = 0 B) HA:β1 > 0 C) HA:β1 < 0 D) HA:b1 > 0
12)
Excel and virtually all other statistical packages report the p-value ________. A) for a two-tailed test that assesses whether the regression coefficient differs from one B) for a right-tailed test that assesses whether the regression coefficient is greater than
zero C) for a two-tailed test that assesses whether the regression coefficient differs from zero D) for a left-tailed test that assesses whether the regression coefficient is less than zero
13) Which of the following is the 95% confidence interval for the regression coefficient β1, if df = 30, b1 = −2 and A) [−5.00, −1.00] B) [−7.88, 3.88] C) [−7.09, 3.09] D) [−8.13, 4.13]
14)
The accompanying table shows the regression results when estimating y = β0 + β1x + ε Coefficients
Standard Error
t-stat
p-value
Intercept
0.083
3.56
0.9822
x
1.417
0.63
0.0745
Which of the following is the value of the test statistic when testing whether x significantly influences y? Version 1
3
A) 0.66 B) 1.42 C) 1.96 D) 2.25
15)
The accompanying table shows the regression results when estimatingy = β0 + β1x + ε. Coefficients
Standard Error
t-stat
p-value
Intercept
0.083
3.56
0.02
0.9822
x
1.417
0.63
2.25
0.0745
Is x significantly related to y at the 5% significance level? A) Yes, because the p-value of 0.0745 is greater than 0.05. B) No, because the p-value of 0.0745 is greater than 0.05. C) Yes, because the slope coefficient of 1.417 is less than the test statistic of 2.25. D) No, because the slope coefficient of 1.417 is less than the test statistic of 2.25.
16)
The accompanying table shows the regression results when estimatingy = β0 + β1x + ε. Coefficients
Standard Error
t-stat
p-value
Intercept
0.083
3.56
0.9822
x
1.417
0.63
0.0745
When testing whether the slope coefficient differs from 1, the value of the test statistic is ____. A) 0.66 B) 1.42 C) 1.96 D) 2.25
Version 1
4
17) Given the following portion of regression results, which of the following conclusions is true with regard to the F test at the 5% significance level? df
SS
MS
F 8.04
Regression
2
7,562
3,781
Residual
20
9,395
470
Total
22
16,957
Significance F 0.0028
A) Neither of the explanatory variables is related to the response variable. B) Both of the explanatory variables are related to the response variable. C) Exactly one of the explanatory variables is related to the response variable. D) At least one of the explanatory variables is significantly related to the response variable.
18) Given the following portion of regression results, which of the following is the value of the F2,20 test statistic? df
SS
MS
Regression
2
7,562
3,781
Residual
20
9,395
470
Total
22
16,957
F
Significance F 0.0028
A) 1.96 B) 3.49 C) 8.04 D) 10
Version 1
5
19)
Refer to below regression results. df
SS
MS
F
Regression
2
7,562
3,781
Residual
20
9,395
470
Total
22
16,957
Significance F 0.0028
When testing the overall significance of the regression model at the 5% level given a critical value of F0.05,(2.20) = 3.49, the decision is to ________. A) reject H0; we can conclude that the explanatory variables are jointly significant. B) not reject H0; we can conclude that the explanatory variables are jointly significant. C) reject H0; we cannot conclude that the explanatory variables are jointly significant. D) not reject H0; we cannot conclude the explanatory variables are jointly significant.
20)
Refer to below regression results. df
SS
MS
F
Regression
2
3,500
1,750
Residual
20
13,500
675
Total
22
17,000
Significance F 0.1000
When testing the overall significance of the regression model at the 5%, the decision is to ________. A) reject H0; we can conclude that the explanatory variables are jointly significant. B) not reject H0; we can conclude that the explanatory variables are jointly significant. C) reject H0; we cannot conclude that the explanatory variables are jointly significant. D) not reject H0; we cannot conclude the explanatory variables are jointly significant.
Version 1
6
21) Given the following portion of regression results, which of the following is the value of the F(2,20) test statistic? df
SS
MS
F
Regression
2
3,500
1,750
Residual
20
13,500
675
Total
22
17,000
Significance F 0.1000
A) 1.96 B) 2.59 C) 3.49 D) 10
22)
Refer to the portion of regression results in the accompanying table. df
SS
MS
F
Regression
2
3,500
1,750
Residual
20
13,500
675
Total
22
17,000
Significance F 0.1000
When testing the overall significance of the regression model at the 5% level given a critical value of F0.05,(2,20) = 3.49, the decision is to ________. A) reject H0; we can conclude that the explanatory variables are jointly significant. B) not reject H0; we can conclude that the explanatory variables are jointly significant. C) reject H0; we cannot conclude that the explanatory variables are jointly significant. D) not reject H0; we cannot conclude the explanatory variables are jointly significant.
Version 1
7
23) The accompanying table shows the regression results when estimating y = β0 + β1x1 + β2x2 + β3x3 + ε. df
SS
MS
F
Significance F
Regression
3
453
151
5.03
0.0030
Residual
85
2,521
30
Total
88
2,974
Coefficients Standard Error
t-stat
p-value
Intercept
14.96
3.08
4.80
0.0000
x1
0.87
0.29
3.00
0.0035
x2
0.46
0.22
2.09
0.0400
x3
0.04
0.34
0.12
0.9066
At the 5% significance level, which of the following explanatory variable(s) is(are) individually significant? A) Only x1 B) Only x3 C) x1 and x2 D) x2 and x3
24) A sample of 200 monthly observations is used to run a simple linear regression: Returns = β0 + β1 Leverage + ε. A 5% level of significance is used to study if leverage has a significant influence on returns. The value of the test statistic for the regression coefficient of Leverage is calculated as t198 = −1.09, with an associated p-value of 0.2770. The correct decision is to ________.
Version 1
8
A) reject the null hypothesis; we can conclude that leverage significantly explains returns B) reject the null hypothesis; we cannot conclude that leverage significantly explains returns C) not reject the null hypothesis; we cannot conclude that leverage significantly explains returns D) not reject the null hypothesis; we can conclude that leverage significantly explains returns
25) When estimating y = β0 + β1x1 + β2x2 + β3x3 + ε, you wish to test H0: β1 = β2 = 0 versus HA: At least one βi ≠ 0. The value of the test statistic is F(2,20) = 2.50 and its associated p-value is 0.1073. At the 5% significance level, the conclusion is to ________. A) reject the null hypothesis; we can conclude that x1 and x2 are jointly significant B) reject the null hypothesis; we cannot conclude that x1 and x2 are jointly significant C) not reject the null hypothesis; we can conclude that x1 and x2 are jointly significant D) not reject the null hypothesis; we cannot conclude that x1 and x2 are jointly significant
26) When testing r linear restrictions imposed on the model y = β0 + β1x1 + ... + βkxk + ε, the test statistic is assumed to follow the F(df1, df2) distribution with ________. A) df1 = k and df2 = n − k − 1 B) df1 = k − 1 and df2 = n − k − 1 C) df1 = r and df2 = n − k D) df1 = r and df2 = n − k − 1
27) When estimating y = β0 + β1x1 + β2x2 + β3x3 + ε, you wish to test H0: β1 = β2 versus HA: β1 ≠ β2 The value of the test statistic is F(1,30)= 7.75 and its associated p-value is 0.0092. At the 1% significance level, the conclusion is to ________.
Version 1
9
A) reject the null hypothesis; we cannot conclude that the influence of x1 on y is different from the influence of x2 on y. B) reject the null hypothesis; we can conclude that the influence of x1 on y is different from the influence of x2 on y. C) not reject the null hypothesis; we can conclude that the influence of x1 on y is different from the influence of x2 on y. D) not reject the null hypothesis; we cannot conclude that the influence of x1 on y is different from the influence of x2 on y.
28)
In regression, the predicted values concerning y are subject to ________. A) kurtosis B) skewness C) mean deviation D) sampling variation
29)
In regression, the two types of interval estimates concerning y are called ________. A) significance interval and prediction interval B) confidence interval and significance interval C) confidence interval and prediction interval D) significance interval and frequency interval
30) For a given confidence level, the prediction interval is always wider than the confidence interval because ________.
Version 1
10
A) the confidence interval is for a particular value of y rather than for the expected value E(y) B) the prediction interval is for a particular value of y rather than for the expected value E(y) C) the confidence interval is for a particular value of x rather than for the expected value E(x) D) the prediction interval is for a particular value of x rather than for the expected value E(x)
31) In regression, multicollinearity is considered problematic when two or more explanatory variables are ________. A) not correlated B) rarely correlated C) highly correlated D) moderately correlated
32)
Multicollinearity is suspected when ________. A) there is a low R2 coupled with significant explanatory variables B) there is a high R2 coupled with significant explanatory variables C) there is a low R2 coupled with insignificant explanatory variables D) there is a high R2 coupled with insignificant explanatory variables
33)
One of the assumptions of regression analysis is ________. A) there is perfect multicollinearity among some explanatory variables B) there is no perfect multicollinearity among any explanatory variables C) there is some degree of multicollinearity among some explanatory variables D) there is some degree of multicollinearity among all explanatory variables.
Version 1
11
34)
Which of the following violates the assumptions of regression analysis? A) The error term is normally distributed. B) The error term has a zero mean. C) The error term is correlated with an explanatory variable. D) The error term has a constant variance.
35) When confronted with multicollinearity, a good remedy is to ________ if we can justify its redundancy. A) add one more collinear variable B) drop one of the collinear variables C) remove both the collinear variables D) add as many collinear variables as possible
36)
If the variance of the error term is not the same for all observations, we ________.
A) get estimators that are biased B) can perform tests of significance C) cannot conduct tests of significance D) Both get estimators that are biased and cannot conduct tests of significance are correct.
37)
Serial correlation is typically observed in ________. A) outliers B) sparse data C) time series data D) get estimators that are biased and cannot conduct tests of significance
Version 1
12
38) A researcher gathers data on 25 households and estimates the following model: Expenditure = β0 + β1 Income + ε. A residual plot of the estimated model is shown in the accompanying graph.
Which of the following can be inferred from the residual plot? A) The residuals have little variation. B) The residuals do not exhibit any pattern. C) The residuals seem to fan out across the horizontal axis. D) The residuals are evenly distributed across the horizontal axis.
39)
Serial correlation occurs when the error term is ________. A) dispersed evenly B) randomly distributed C) correlated across observations D) uncorrelated across observations
Version 1
13
40) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and so collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following table shows a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following are the competing hypotheses used to test whether Advertising is significant in predicting Sales? A) H0:b1 = 0; HA:b1 ≠ 0. B) H0:b1 = 2.88; HA:b1 ≠ 2.88. C) H0:β1 = 0; HA:β1 ≠ 0. D) H0:β1 ≠ 0; HA:β1 = 0.
41) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and so collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following table shows a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
When testing whether Advertising is significant at the 10% significance level, the conclusion is to ________. A) reject H0; we can conclude advertising is significant B) not reject H0; we cannot conclude advertising is significant C) reject H0; we cannot conclude advertising is significant D) not reject H0; we can conclude advertising is significant
Version 1
14
42) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following table shows a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
Which of the following are the competing hypotheses used to test whether the slope coefficient differs from 3? A) H0:b1 = 3; HA:b1 ≠ 3 B) H0:b1 = 2.88; HA:b1 ≠ 2.88 C) H0:β1 = 3; HA:β1 ≠ 3 D) H0:β1 = 2.88; HA:β1 ≠ 2.88
43) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and so collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following table shows a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
When testing whether the slope coefficient differs from 3, the value of the test statistic is ________. A) t23 = −1.895 B) t23 = −0.079 C) t23 = 0.079 D) t23 = 1.895
Version 1
15
44) A marketing analyst wants to examine the relationship between sales (in $1,000s) and advertising (in $100s) for firms in the food and beverage industry and so collects monthly data for 25 firms. He estimates the model: Sales = β0 + β1 Advertising + ε. The following table shows a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
40.10
14.08
2.848
0.0052
Advertising
2.88
1.52
−1.895
0.0608
When testing whether the slope coefficient differs from 3, the critical values at the 5% significance level are −2.069 and 2.069. The conclusion to the test is to ________. A) reject H0; we can conclude the slope coefficient differs from 3 B) not reject H0; we can conclude the slope coefficient differs from 3 C) reject H0; we cannot conclude the slope coefficient differs from 3 D) not reject H0; we cannot conclude the slope coefficient differs from 3
45) A sports analyst wants to exam the factors that may influence a tennis player’s chances of winning. Over four tournaments, he collects data on 30 tennis players and estimates the following model: Win = β0 + β1 Double Faults + β2Aces + ε, where Win is the proportion of winning, Double Faults is the percentage of double faults made, and Aces is the number of aces. A portion of the regression results are shown in the accompanying table. df
SS
MS
F
Significance F
Regression
2
1.24
0.620
41.85
5.34E-09
Residual
27
0.40
0.015
Total
29
1.64 t-stat
p-value
Lower 95%
Upper 95%
5.646
5.4E06
0.287
0.614
Coefficients Standard Error Intercept
Version 1
0.451
0.080
16
Double Faults
−0.007
0.0024
−2.875
0.0078
−0.012
−0.002
Aces
0.015
0.003
4.65
7.8E05
0.008
0.023
Is the relationship between Win and Aces significant at the 5% significance level? A) No, because the relevant p-value is less than 0.05. B) Yes, because the relevant p-value is less than 0.05. C) No, because the relevant p-value is greater than 0.05. D) Yes, because the relevant p-value is greater than 0.05.
46) A sports analyst wants to exam the factors that may influence a tennis player’s chances of winning. Over four tournaments, he collects data on 30 tennis players and estimates the following model: Win = β0 + β1 Double Faults + β2Aces + ε, where Win is the proportion of winning, Double Faults is the percentage of double faults made, and Aces is the number of aces. A portion of the regression results are shown in the accompanying table. df
SS
MS
F
Significance F
Regression
2
1.24
0.620
41.85
5.34E-09
Residual
27
0.40
0.015
Total
29
1.64 t-stat
p-value
Lower 95%
Upper 95%
Coefficients Standard Error Intercept
0.451
0.080
5.646
5.4E06
0.287
0.614
Double Faults
−0.007
0.0024
−2.875
0.0078
−0.012
−0.002
Aces
0.015
0.003
4.65
7.8E05
0.008
0.023
Excel shows that the 95% confidence interval for β1 is [−0.12, −0.002]. When determining whether or not Double Faults is significant at the 5% significance level, he ________.
Version 1
17
A) rejects H0:β1 = 0, and concludes that DoubleFaults is significant B) does not reject H0:β1 = 0, and concludes that DoubleFaults is significant C) rejects H0:β1 = 0, and cannot conclude that DoubleFaults is significant D) does not reject H0:β1 = 0, and cannot conclude DoubleFaults is significant
47) A sports analyst wants to exam the factors that may influence a tennis player’s chances of winning. Over four tournaments, he collects data on 30 tennis players and estimates the following model: Win = β0 + β1 Double Faults + β2 Aces + ε, where Win is the proportion of winning, Double Faults is the percentage of double faults made, and Aces is the number of aces. A portion of the regression results are shown in the accompanying table. df
SS
MS
F
Significance F
Regression
2
1.24
0.620
41.85
5.34E-09
Residual
27
0.40
0.015
Total
29
1.64 t-stat
p-value
Lower 95%
Upper 95%
Coefficients Standard Error Intercept
0.451
0.080
5.646
5.4E06
0.287
0.614
Double Faults
−0.007
0.0024
−2.875
0.0078
−0.012
−0.002
Aces
0.015
0.003
4.65
7.8E05
0.008
0.023
When testing whether the explanatory variables jointly influence the response variable, the null hypothesis is ________. A) H0:β1 = β2 = 0 B) H0:β1 + β2 = 0 C) H0:β0 = β1 = β2 = 0 D) H0:β0 + β1 + β2 + 0
Version 1
18
48) A sports analyst wants to exam the factors that may influence a tennis player’s chances of winning. Over four tournaments, he collects data on 30 tennis players and estimates the following model: Win = β0 + β1 Double Faults + β2Aces + ε, where Win is the proportion of winning, Double Faults is the percentage of double faults made, and Aces is the number of aces. A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Significance F
Regression
2
1.24
0.620
41.85
5.34E-09
Residual
27
0.40
0.015
Total
29
1.64 t-stat
p-value
Lower 95%
Upper 95%
Coefficients Standard Error Intercept
0.451
0.080
5.646
5.4E06
0.287
0.614
Double Faults
−0.007
0.0024
−2.875
0.0078
−0.012
−0.002
Aces
0.015
0.003
4.65
7.8E05
0.008
0.023
When testing whether the explanatory variables are jointly significant at the 5% significance level, he ________. A) rejects H0, and concludes that the explanatory variables are jointly significant B) does not reject H0, and concludes that the explanatory variables are jointly significant C) rejects H0, and cannot conclude that the explanatory variables are jointly significant D) does not reject H0, and cannot conclude that the explanatory variables are jointly significant
Version 1
19
49) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model: Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
0.008
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
When testing whether Price is significant in explaining Attendance, the value of the test statistic is ________. A) −4.085. B) −1.201. C) 1.201. D) 4.085.
50) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model:Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table.
Version 1
20
df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
When testing whether Temperature is significant at the 5% significance level, she ________. A) rejects H0:β2 = 0, and concludes that Temperature is significant B) does not reject H0:β2 = 0, and concludes that Temperature is significant C) rejects H0:β2 = 0, and cannot conclude that Temperature is significant D) does not reject H0:β2 = 0, and cannot conclude that Temperature is significant
51) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model:Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Version 1
F
Significanc e F 2.18E-14
21
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
When testing whether Rides is significant at the 1% significance level, she ________. A) rejects H0:β3 = 0, and concludes that Rides is significant B) does not reject H0:β3 = 0, and concludes that Rides is significant C) rejects H0:β3 = 0, and cannot conclude that Rides is significant D) does not reject H0:β3 = 0, and cannot conclude that Rides is significant
52) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model: Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
27.328
40.254
Intercept
Version 1
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
0.503 2
−55.415
110.07 1
22
Price
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
Which of the following is the value of the test statistic for testing the joint significance of the linear regression model? A) −4.09 B) 0.04 C) 9.95 D) 99.78
53) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model: Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
Version 1
t-stat
F
23
When testing whether the explanatory variables are jointly significant at the 5% significance level, the researcher________. A) rejects H0, and concludes that the explanatory variables are jointly significant B) does not reject H0, and concludes that the explanatory variables are jointly significant C) rejects H0, and cannot conclude that the explanatory variables are jointly significant D) does not reject H0, and cannot conclude that the explanatory variables are jointly significant
54) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model:Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
Which of the following are the hypotheses to test if the explanatory variables Temperature and Rides are jointly significant in explaining Attendance?
Version 1
24
A) H0:β2 = β3 = 0; HA: At least one of the coefficients is nonzero. B) H0:β2 = β3 = 0; HA: At least one of the coefficients is zero. C) H0:β0 = β1 = 0; HA: At least one of the coefficients is nonzero. D) H0:β0 = β1 = 0; HA: At least one of the coefficients is zero.
55) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model: Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
When testing whether the explanatory variables Temperature and Rides are jointly significant, the error sum of squares for the restricted model is SSER = 12,343.78. Which of the following is the value of the test statistic when conducting this test?
Version 1
25
A) −4.09 B) 25.33 C) 49.58 D) 99.78
56) A researcher analyzes the factors that may influence amusement park attendance. She estimates the following model:Attendance = β0 + β1 Price + β2 Temperature + β3 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), Temperature is the average daily temperature (in oF), and Rides is the number of rides at the amusement park. A portion of the regression results is shown in the accompanying table. df
SS
MS
Regression
3
29,524.4 1
9,841.4 7
Residual
26
2,564.37
98.63
Total
29
32,088.7 8
Coefficient s
Standard Error
Intercept
27.328
Price
t-stat
F
Significanc e F 2.18E-14
p-value
Lower 95% Upper 95%
40.254
0.503 2
−55.415
110.07 1
−1.201
0.294
0.000 4
−1.805
−0.598
Temperatur e
0.008
0.208
0.969 3
−0.419
0.435
Rides
3.621
0.364
2.32E -10
2.874
4.369
When testing whether the explanatory variables Temperature and Rides are jointly significant, the p-value associated with the test is 0.0000. At the 1% significance level, the researcher ________.
Version 1
26
A) rejects H0, and concludes that at least one of the explanatory variables, Temperature and Rides, is significant in explaining Attendance B) does not reject H0, and concludes that at least one of the explanatory variables, Temperature and Rides, is significant in explaining Attendance C) rejects H0, and cannot conclude that at least one of the explanatory variables, Temperature and Rides, is significant in explaining Attendance D) does not reject H0, and cannot conclude at least one of the explanatory variables, Temperature and Rides, is significant in explaining Attendance
57) A researcher analyzes the factors that may influence amusement park attendance and estimates the following model:Attendance = β0 + β1 Price + β2 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), and Rides is the number of rides at the amusement park. The researcher would like to construct interval estimates for Attendance when Price and Rides equal $85 and 30, respectively. The researcher estimates a modified model where Attendance is the response variable and the explanatory variables are now defined as Price* = Price − 85 and Rides* = Rides − 30. A portion of the regression results is shown in the accompanying table. Coefficients
Standard Error
t-stat
p-value Lower 95% Upper 95%
Intercept
34.41
4.06
8.48
4.33E09
26.08
42.74
Price*
−1.20
0.28
−4.23
0.0002
−1.79
−0.62
Rides*
3.62
0.36
10.15
1.04E10
2.89
4.35
According to the modified model, which of the following is the predicted value for Attendance when Price and Rides equal $85 and 30, respectively? A) 25,670 B) 34,410 C) 40,910 D) 55,600
Version 1
27
58) A researcher analyzes the factors that may influence amusement park attendance and estimates the following model: Attendance = β0 + β1 Price + β2 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), and Rides is the number of rides at the amusement park. The researcher would like to construct interval estimates for Attendance when Price and Rides equal $85 and 30, respectively. The researcher estimates a modified model where Attendance is the response variable and the explanatory variables are now defined as Price* = Price − 85 and Rides* = Rides − 30. A portion of the regression results is shown in the accompanying table. Coefficients
Standard Error
t-stat
p-value Lower 95% Upper 95%
Intercept
34.41
4.06
8.48
4.33E09
26.08
42.74
Price*
−1.20
0.28
−4.23
0.0002
−1.79
−0.62
Rides*
3.62
0.36
10.15
1.04E10
2.89
4.35
According to the modified model, which of the following is a 95% confidence interval for expected Attendance when Price and Rides equal $85 and 30, respectively? (Note that t0,025,27 = 2.052.) A) [12,740, 56,780] B) [16,330, 53,450] C) [26,080, 42,740] D) [28,900, 41,500]
59) A researcher analyzes the factors that may influence amusement park attendance and estimates the following model:Attendance = β0 + β1 Price + β2 Rides + ε, where Attendance is the daily attendance (in 1,000s), Price is the gate price (in $), and Rides is the number of rides at the amusement park. The researcher would like to construct interval estimates for Attendance when Price and Rides equal $85 and 30, respectively. The researcher estimates a modified model where Attendance is the response variable and the explanatory variables are now defined as Price* = Price − 85 and Rides* = Rides − 30. A portion of the regression results is shown in the accompanying table. Regression Statistics Multiple R
0.96
R Square
0.92
Version 1
28
Adjusted R Square
0.91
Standard Error
9.75
Observations
30 Coefficients Standard Error
t-stat
p-value
Lower 95%
Upper 95%
Intercept
34.41
4.06
8.48
4.33E09
26.08
42.74
Price*
−1.20
0.28
−4.23
0.0002
−1.79
−0.62
Rides*
3.62
0.36
10.15
1.04E10
2.89
4.35
According to the modified model, which of the following is a 95% prediction interval for Attendance when Price and Rides equal $85 and 30, respectively? (Note that t0.025,27 = 2.052.) A) [12,740, 56,080] B) [16,330, 53,450] C) [26,080, 42,740] D) [28,900, 41,500]
60) The accompanying table shows the regression results when estimating y = β0+ β1x1 + β2x2 + β3x3 + ε. df
SS
MS
F
Significance F
Regression
3
453
151
5.03
0.0030
Residual
85
2,521
30
Total
88
2,974
Coefficients
Standard Error
t-stat
p-value
Intercept
14.96
3.08
4.86
0.0000
x1
0.87
0.29
3.00
0.0035
x2
0.46
0.22
2.09
0.0400
Version 1
29
x3
0.04
0.34
0.12
0.9066
When testing whether the explanatory variables are jointly significant at the 5% significance level, the conclusion is to ________. A) reject H0, and conclude that the explanatory variables are jointly significant B) not reject H0, and conclude that the explanatory variables are jointly significant C) reject H0, and cannot conclude that the explanatory variables are jointly significant D) not reject H0, and cannot conclude that the explanatory variables are jointly significant
61) The accompanying table shows the regression results when estimatingy = β0 + β1x1 + β2x2 + β3x3 + ε. df
SS
MS
F
Significance F
Regression
3
453
151
5.03
0.0030
Residual
85
2,521
30
Total
88
2,974
Coefficients
Standard Error
t-stat
p-value
Intercept
14.96
3.08
4.86
0.0000
x1
0.87
0.29
3.00
0.0035
x2
0.46
0.22
2.09
0.0400
x3
0.04
0.34
0.12
0.9066
At the 5% significance level, which of the following explanatory variable(s) is(are) individually significant?
Version 1
30
A) Only x1 B) Only x3 C) x1 and x2 D) x2 and x3
62) The accompanying table shows the regression results when estimatingy = β0 + β1x1 + β2x2 + β3x3 + ε. df
SS
MS
F 5.03
Regression
3
453
151
Residual
85
2,521
30
Total
88
2,974
Coefficients
Standard Error
t-stat
p-value
Intercept
14.96
3.08
4.86
0.0000
x1
0.87
0.29
3.00
0.0035
x2
0.46
0.22
2.09
0.0400
x3
0.04
0.34
0.12
0.9066
When testing whether or not x1 and x2 have the same influence on y, the null hypothesis is ________. A) H0: β1 = β2 B) H0: β1 ≠ β2 C) H0: β1 + β2 = β3 D) H0: β1 + β2 = 0
63) The accompanying table shows the regression results when estimatingy = β0 + β1x1 + β2x2 + β3x3 + ε. df Regression
Version 1
3
SS
MS
F
453
151
5.03
31
Residual
85
2,521
30
Total
88
2,974
Coefficients
Standard Error
t-stat
p-value
Intercept
14.96
3.08
4.86
0.0000
x1
0.87
0.29
3.00
0.0035
x2
0.46
0.22
2.09
0.0400
x3
0.04
0.34
0.12
0.9066
When testing whether or not x1 and x2 are jointly significant, the null hypothesis is ________. A) H0:β1 = β2 B) H0:β1 ≠ β2 C) H0:β1 + β2= 0 D) H0:β1 = β2 = 0
64) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
You would like to determine whether an investment in Tiffany’s is riskier than the market. When conducting this test, you set up the following competing hypotheses: ________. A) H0:α = 0; HA:α ≠ 0 B) H0:β = 0; HA:β ≠ 0 C) H0:α ≤ 1; HA:α > 1 D) H0:β ≤ 1; HA:β > 1
Version 1
32
65) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
When testing whether the beta coefficient is significantly greater than one, the value of the test statistic is ________. A) −1.98 B) 1.98 C) 4.33 D) 9.58
66) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
When testing whether the beta coefficient is significantly greater than one, the relevant critical value at the 5% significance level is t0.05,58 = 1.672. The conclusion to the test is to ________.
Version 1
33
A) reject H0; we can conclude that the return on Tiffany stock is riskier than the return on the market B) not reject H0; we can conclude that the return on Tiffany stock is riskier than the return on the market C) reject H0; we cannot conclude that the return on Tiffany stock is riskier than the return on the market D) not reject H0; we cannot conclude that the return on Tiffany stock is riskier than the return on the market
67) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
When testing whether there are abnormal returns, or whether the alpha coefficient is significantly different from zero, the value of the test statistic is ________. A) H0:α = 0; HA:α ≠ 0 B) H0:β = 0; HA:β ≠ 0 C) H0:α ≤ 1; HA:α > 1 D) H0:β ≤ 1; HA:β > 1
Version 1
34
68) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
When testing whether there are abnormal returns, or whether the alpha coefficient is significantly different from zero, the value of the test statistic is ________. A) −1.98 B) 1.98 C) 4.33 D) 9.58
69) Tiffany & Company has been the world’s premier jeweler since 1837. The performance of Tiffany’s stock is likely to be strongly influenced by the economy. Monthly data for Tiffany’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Tiffany’s return. Coefficients
Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.0198
0.010
1.98
0.0598
−0.0008
0.0405
RM − Rf
1.827
0.191
9.58
1.494E13
1.4456
2.2094
When testing whether there are abnormal returns, the conclusion to the test is at the 5% significance level is to ________.
Version 1
35
A) reject H0; we can conclude there are abnormal returns B) not reject H0; we can conclude there are abnormal returns C) reject H0; we cannot conclude there are abnormal returns D) not reject H0, we cannot conclude there are abnormal returns
70) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the following model: Salary = β0 + β1 Service + ε. The following table summarizes a portion of the regression results. Coefficients Intercept Service
Standard Error
tpLower 95% stat value
Upper 95%
784.92
322.25
124.95
1,444.89
9.19
3.20
2.64
15.74
Which of the following hypotheses will determine whether the intercept differs from zero? A) H0:β0 = 0; HA:β0 ≠ 0 B) H0:β1 = 0; HA:β1 ≠ 0 C) H0:b0 = 0; HA:b0 ≠ 0 D) H0:b1 = 0; HA:b1 ≠ 0
71) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the following model:Salary = β0 + β1 Service + ε. The following table summarizes a portion of the regression results. Coefficients Intercept Service
Version 1
Standard Error
tpLower 95% stat value
Upper 95%
784.92
322.25
124.95
1,444.89
9.19
3.20
2.64
15.74
36
Which of the hypotheses will determine whether the slope differs from zero? A) H0:β0 = 0; HA: β0 ≠ 0 B) H0:β1 = 0; HA: β1 ≠ 0 C) H0:b0 = 0; HA: b0 ≠ 0 D) H0:b1 = 0; HA: b1 ≠ 0
72) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the following model:Salary = β0 + β1 Service + ε. The following table summarizes a portion of the regression results. Coefficients Intercept Service
Standard Error
tpLower 95% stat value
Upper 95%
784.92
322.25
124.95
1,444.89
9.19
3.20
2.64
15.74
Using the 95% confidence interval, which of the following is the conclusion to the following hypothesis test: H0:β0 = 0; HA: β0 ≠ 0? A) At the 5% significance level, reject H0 because the interval contains 0. B) At the 5% significance level, do not reject H0 because the interval contains 0. C) At the 5% significance level, reject H0 because the interval does not contain 0. D) At the 5% significance level, do not reject H0 because the interval does not contain 0.
73) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. She estimates the following model:Salary = β0 + β1 Service + ε. The following table summarizes a portion of the regression results.
Version 1
37
Coefficients Intercept Service
Standard Error
tpLower 95% stat value
Upper 95%
784.92
322.25
124.95
1,444.89
9.19
3.20
2.64
15.74
Using the 95% confidence interval, what is the conclusion to the following hypothesis test: H0:b1 = 0; HA:b1 ≠ 0? A) At the 5% significance level, reject H0 and conclude that service is significant. B) At the 5% significance level, do not reject H0 and conclude that service is significant. C) At the 5% significance level, reject H0 and cannot conclude that service is significant. D) At the 5% significance level, do not reject H0 and cannot conclude that service is significant.
74) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119
−24.24
202.7
Version 1
38
5 Sq ft
0.2
0.09
2.22
0.027 6
7 0.024
0.39
When testing whether the explanatory variables jointly influence the response variable, the null hypothesis is ________. A) H0:β0 = β1 = β2 = β3 = 0 B) H0:β1 = β2 = β3 = 0 C) H0:β0 + β1 + β2 + β3 = 0 D) H0:β1 + β2 + β3 + 0
75) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sq ft
0.2
0.09
2.22
0.027 6
0.024
0.39
Version 1
39
Which of the following is the value of the test statistic for testing the joint significance of the linear regression model? A) 1.59 B) 2.22 C) 3.75 D) 50.88
76) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3 Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sq ft
0.2
0.09
2.22
0.027 6
0.024
0.39
When testing whether the explanatory variables are jointly significant at the 5% level, she ________.
Version 1
40
A) rejects H0, and concludes that the explanatory variables are jointly significant B) does not reject H0, and concludes that the explanatory variables are jointly significant C) rejects H0, and cannot conclude that the explanatory variables are jointly significant. D) does not reject H0, and cannot conclude that the explanatory variables are jointly significant
77) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms(x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3 Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sq ft
0.2
0.09
2.22
0.027 6
0.024
0.39
When testing whether Bedroom is significant at the 5% significance level, she ________.
Version 1
41
A) rejects H0:β1 = 0, and concludes that the number of bedrooms significantly influence rent B) does not reject H0:β1 = 0, and concludes that the number of bedrooms significantly influence rent C) rejects H0:β1 = 0, and cannot conclude that the number of bedrooms significantly influence rent D) does not reject H0:β1 = 0, and cannot conclude that the number of bedrooms significantly influence rent
78) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3 Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sq ft
0.2
0.09
2.22
0.027 6
0.024
0.39
When testing whether Bath is significant at the 5% significance level, she ________.
Version 1
42
A) rejects H0:β2 = 0, and concludes that the number of bathrooms significantly influences rent B) does not reject H0:β2 = 0, and concludes that the number of bathrooms significantly influences rent C) rejects H0:β2 = 0, and cannot conclude that the number of bathrooms significantly influence rent D) does not reject H0:β2 = 0, and cannot conclude that the number of bathrooms significantly influence rent.
79) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent (y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3 Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sq ft
0.2
0.09
2.22
0.027 6
0.024
0.39
Which of the following are the hypotheses to test if the explanatory variables Bath and Sqft are jointly significant in explaining Rent?
Version 1
43
A) H0:β2 = β3 = 0; HA: both of the coefficients are nonzero. B) H0:β2 = β3 = 0; HA: at least one of the coefficients is zero. C) H0:β2 = β3 = 0; HA: at least one of the coefficients is nonzero. D) H0:β2 = β3 = 0; HA: both of the coefficients are zero.
80) A real estate analyst believes that the three main factors that influence an apartment’s rent in a college town are the number of bedrooms, the number of bathrooms, and the apartment’s square footage. For 40 apartments, she collects data on the rent(y, in $), the number of bedrooms (x1), the number of bathrooms (x2), and its square footage (x3). She estimates the following model: Rent = β0 + β1 Bedroom +β2 Bath + β3 Sqft + ε. The following table shows a portion of the regression results. df
SS
MS
F
Significanc e F
Regressio n
3
5,694,71 7
1,898,23 9
50.88
4.99E-13
Residual
36
1,343,17 6
37,310
Total
39
7,037,89 3 t-stat
p-value
Lower 95%
Upper 95%
Coefficient Standard s Error Intercept
300
84.0
3.57
0.001 0
130.03
470.7 9
Bedroom
226
60.3
3.75
0.000 6
103.45
348.1 7
Bath
89
55.9
1.59
0.119 5
−24.24
202.7 7
Sqft
0.2
0.09
2.22
0.027 6
0.024
0.39
When testing whether the explanatory variables Bath and Sqft are jointly significant, the p-value associated with the test is 0.0039. At the 5% significance level, she ________.
Version 1
44
A) rejects H0, and cannot conclude that at least one of the explanatory variables, Bath and/or Sqft, is significant in explaining Rent B) does not reject H0, and concludes that at least one of the explanatory variables, Bath and/or Sqft, is significant in explaining Rent C) rejects H0, and concludes that at least one of the explanatory variables, Bath and/or Sqft, is significant in explaining Rent D) does not reject H0, and cannot conclude that at least one of the explanatory variables, Bath and/or Sqft, is significant in explaining Rent
81) An economist estimates the following model: y = β0 + β1x + ε. She would like to construct interval estimates for y when x equals 2. She estimates a modified model where y is the response variable and the explanatory variable is now defined as x+ = x − 2. A portion of the regression results is shown in the accompanying table. Regression Statistics R Square
0.42
Standard Error
3.86
Observations
12 Coefficients Standard t-stat Error
p-value Lower 95% Upper 95%
Intercept
24.78
2.76
8.98
4.2E06
18.63
30.93
Bedroom
2.52
0.95
2.66
0.0237
0.41
4.63
According to the modified model, which of the following is the predicted value of y when x equals 2? A) 2.52 B) 24.78 C) 27.30 D) 29.82
Version 1
45
82) An economist estimates the following model: y = β0 + β1x + ε. She would like to construct interval estimates for y when x equals 2. She estimates a modified model where y is the response variable and the explanatory variable is now defined asx+ = x – 2. A portion of the regression results is shown in the accompanying table. Regression Statistics R Square
0.42
Standard Error
3.86
Observations
12 Coefficients Standard t-stat Error
p-value Lower 95% Upper 95%
Intercept
24.78
2.76
8.98
4.2E06
18.63
30.93
Bedroom
2.52
0.95
2.66
0.0237
0.41
4.63
According to the modified model, which of the following is a 95% confidence interval for E(y) when x equals 2? (Note that t0.025,10 = 2.228.) A) [0.41, 4.63] B) [14.21, 35.35] C) [18.63, 30.93] D) [20.35, 25.65]
83) An economist estimates the following model:y = β0 + β1x + ε. She would like to construct interval estimates for y when x equals 2. She estimates a modified model where y is the response variable and the explanatory variable is now defined asx* = x − 2. A portion of the regression results is shown in the accompanying table. Regression Statistics R Square
0.42
Standard Error
3.86
Observations
12 Coefficients Standard t-stat p-value Lower 95% Upper 95% Error
Intercept
Version 1
24.78
2.76
8.98
4.2E06
18.63
30.93
46
x*
2.52
0.95
2.66
0.0237
0.41
4.63
According to the modified model, which of the following is a 95% prediction interval for y when x equals 2? (Note that t0.025,10 = 2.228.) A) [0.41, 4.63] B) [14.21, 35.35] C) [18.63, 30.93] D) [20.35, 25.65]
84) A sociologist wishes to study the relationship between happiness and age. He interviews 24 individuals and collects data on age and happiness, measured on a scale from 0 to 100. He estimates the following model:Happiness = β0 + β1Age + ε. The following table summarizes a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
56.1772
5.2145
10.7732
0.0000
Age
0.2845
0.0871
3.2671
0.0035
Which of the following is the estimate of Happiness for the person who is 65 years old? A) 75 B) 62 C) 78 D) 68
85) A sociologist wishes to study the relationship between happiness and age. He interviews 24 individuals and collects data on age and happiness, measured on a scale from 0 to 100. He estimates the following model: Happiness = β0 + β1Age + ε. The following table summarizes a portion of the regression results.
Intercept
Version 1
Coefficients
Standard Error
t-stat
p-value
56.1772
5.2145
10.7732
0.0000 47
Age
0.2845
0.0871
3.2671
0.0035
When defining whether age is significant in explaining happiness, the competing hypotheses are ________. A) H0:β1 = 1; HA:β1 ≠ 1 B) H0:β1 = 0; HA:β1 ≠ 0 C) H0:β1 ≤ 0; HA:β1 > 0 D) H0:b1 ≥ 1; HA:β1 < 1
86) A sociologist wishes to study the relationship between happiness and age. He interviews 24 individuals and collects data on age and happiness, measured on a scale from 0 to 100. He estimates the following model:Happiness = β0 + β1Age + ε. The following table summarizes a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
56.1772
5.2145
10.7732
0.0000
Age
0.2845
0.0871
3.2671
0.0035
At the 1% significance level, which of the following is correct for the test about the population slope? A) Do not reject the null hypothesis. At 1% significance level, we cannot conclude that Age is significant in explaining Happiness. B) Do not reject the null hypothesis. At 1% significance level, we conclude that Age is significant in explaining Happiness. C) Reject the null hypothesis. At 1% significance level, we conclude that Age is significant in explaining Happiness. D) Reject the null hypothesis. At 1% significance level, we cannot conclude that Age is significant in explaining Happiness.
Version 1
48
87) A sociologist wishes to study the relationship between happiness and age. He interviews 24 individuals and collects data on age and happiness, measured on a scale from 0 to 100. He estimates the following model:Happiness = β0 + β1Age + ε. The following table summarizes a portion of the regression results. Coefficients
Standard Error
t-stat
p-value
Intercept
56.1772
5.2145
10.7732
0.0000
Age
0.2845
0.0871
3.2671
0.0035
At the 1% significance level, which of the following is the correct confidence interval of the regression coefficient β1? A) [0.0390, 0.0530] B) [0.0190, 0.5300] C) [0.0190, 0,0530] D) [0.0390, 0.5300]
88) A researcher studies the relationship between SAT scores, the test-taker’s family income, and his or her grade point average (GPA). Data are collected from 24 students. He estimates the following model: SAT = β0 + β1 GPA + β2 Income + ε. The following table summarizes a portion of the regression results. df
SS
MS
Regression
2
141,927.9571
70,963.98
Residual
21
22,167.8762
1,055.613
Total
23
164,095.8333
Coefficients
Standard Error
t-stat
p-value
1,104.2580
54.7524
20.1682
0.0000
150.9920
15.0931
10.0040
0.0000
0.0017
0.0003
6.7696
0.0000
Intercept GPA Income
F
At the 5% significance level, which of the following explanatory variable(s) is(are) individually significant?
Version 1
49
A) Only Income. B) Both, GPA and Income. C) Only GPA. D) Neither GPA nor Income.
89) A researcher studies the relationship between SAT scores, the test-taker’s family income, and his or her grade point average (GPA). Data are collected from 24 students. He estimates the following model:SAT = β0 + β1 GPA + β2 Income + ε. The following table summarizes a portion of the regression results. df
SS
MS
Regression
2
141,927.9571
70,963.98
Residual
21
22,167.8762
1,055.613
Total
23
164,095.8333
Coefficients
Standard Error
t-stat
p-value
1,104.2580
54.7524
20.1682
0.0000
150.9920
15.0931
10.0040
0.0000
0.0017
0.0003
6.7696
0.0000
Intercept GPA Income
F
Which of the following is the correct hypotheses for a test about joint significance? A) H0:β0 = β2 = 0; HA: At least β0 ≠ 0 B) H0:β1 = β2 = 0; HA: At least one βj ≠ 0 C) H0:β0 = β1 = β2 = 0; HA: All βj ≠ 0 D) H0:β1 = β2 = 0; HA: Both βj ≠ 0
90) A researcher studies the relationship between SAT scores, the test-taker’s family income, and his or her grade point average (GPA). Data are collected from 24 students. He estimates the following model: SAT = β0 + β1 GPA + β2 Income + ε. The following table summarizes a portion of the regression results.
Version 1
50
df
SS
MS
Regression
2
141,927.9571
70,963.98
Residual
21
22,167.8762
1,055.613
Total
23
164,095.8333
Coefficients
Standard Error
t-stat
p-value
1,104.2580
54.7524
20.1682
0.0000
150.9920
15.0931
10.0040
0.0000
0.0017
0.0003
6.7696
0.0000
Intercept GPA Income
F
Which of the following is the value of the test statistic for testing the joint significance of the linear regression model? A) 6.40 B) 134.45 C) 67.23 D) 45.13
91) Which of the following can be used to conduct the test for individual significance of explanatory variables? A) Only p-value approach. B) Only confidence intervals of the regression coefficients. C) Neither p-value approach nor confidence intervals. D) Both, p-value approach and confidence intervals.
92)
Test statistic for the test of linear restrictions uses all of the following, except ________.
Version 1
51
A) total sum of squares of the restricted model B) error sum of squares of the restricted model C) number of linear restrictions D) number of explanatory variables in the unrestricted model
93) Which of the following is the correct expression for computing a 100(1 − α)% prediction interval for an individual value of y?
A) B) C) D)
94) Which of the following is the correct expression for computing a 100(1 − α)% confidence interval of the expected value of y? A) B) C) D)
95) Which of the following statements in statistical terminology is equivalent to the statement: "There is no exact linear relationship among the explanatory variables"?
Version 1
52
A) The error term is homoscedastic. B) There is no endogeneity. C) There is no autocorrelation. D) There is no perfect multicollinearity.
96) Which of the following statements is TRUE about tests of significance to determine whether there is evidence of a linear relationship between the response variable and the explanatory variables? A) The test of individual significance is about whether there is evidence that at least one explanatory variable influences the response variable. B) The test of individual significance uses a right-tailed F test. C) The test of individual significance is the same as the test of joint significance for a simple linear regression model. D) The test of joint significance uses a two-tailed test.
97) A financial analyst considers three company-specific factors, number of employees, dividend rate, and earnings, as explanatory variables in a regression model to explain the company’s stock prices. Which of the following is an example of a test of one restriction? A) The influence of dividend rate is identical to that of earnings. B) The influence of number of employees, dividend rate, and earnings are jointly significant. C) The influence of dividend rate and earnings are jointly significant. D) The influence of number of employees and earnings are jointly significant.
98) Which of the following statements is FALSE regarding interval estimates for the response variable?
Version 1
53
A) The interval estimate for the expected value of the response variable is called the confidence interval. B) The interval estimate for the individual value of the response variable is called the prediction interval. C) The confidence interval incorporates the variability of the random error term. D) The prediction interval is always wider than the corresponding confidence interval.
99)
The assumptions of the classical linear regression model include the following except: A) Explanatory variables are not perfectly correlated. B) The variables in the model are linear. C) The observations are uncorrelated. D) The error term and the explanatory variables are uncorrelated.
100)
The accompanying table shows the regression results when estimating y = β0 + β1x + ε. df
SS
MS
F
Significance F
Regression
1
78.5900
78.5900
15.82
0.0106
Residual
5
24.8385
4.9677
Total
6
103.4286
Coefficients
Standard Error
t-stat
p-value
Intercept
15.6615
2.4020
6.520
0.0013
x
−1.6927
0.4256
−3.977
0.0106
a. Specify the competing hypotheses to determine whether x is significant in explaining y. b. At the 5% significance level, is x significant in explaining y? Explain.
Version 1
54
101)
Consider the following regression results based on 40 observations. Coefficients
Standard Error
t-stat
p-value
Intercept
427.93
66.29
6.46
0.0000
x
0.37
0.17
2.21
0.0335
a. Specify the hypotheses to determine if the slope differs from one. b. Calculate the value of the test statistic. c. At the 5% significance level, find the critical value(s). d. Does the slope differ from one? Explain.
102) When estimating a multiple regression model based on 30 observations, the following results were obtained. Coefficient Standard t-stat s Error
p-value
Lower 95%
Upper 95%
Intercep t
395.96
76.9 5
5.15
0.000 0
238.0 7
553.8 5
x1
−0.45
0.19
−2.3 5
0.026 4
−0.84
−0.06
x2
1.26
0.65
1.94
0.063 1
−0.07
2.59
a. Specify the hypotheses to determine whether x1 is linearly related to y. At the 5% significance level, use the p-value approach to complete the test. Are x1 and y linearly related? b. Construct the 95% confidence interval for β2. Using this confidence interval, is x2 significant in explaining y? Explain. c. At the 5% significance level, can you conclude that β1differs from −1? Show the relevant steps of the appropriate hypothesis test.
Version 1
55
103) The accompanying table shows the regression results when estimating y = β0 + β1x1 + β2x2 + β3x3 + ε. df
SS
MS
F
Significance F
Regression
3
878
293
13.31
0.0003
Residual
13
286
22
Total
16
1,164
Coefficients
Standard Error
t-stat
p-value
Intercept
32.85
5.10
6.45
2.18E-05
x1
0.75
0.45
1.65
0.1225
x2
0.23
0.43
0.54
0.5972
x3
−2.84
0.49
−5.77
0.0001
a. Specify the competing hypotheses to determine whether the explanatory variables are jointly significant. b. At the 5% significance level, are the explanatory variables jointly significant? Explain. c. At the 5% significance level, is x2 significant in explaining y? Explain. d. At the 5% significance level, is the slope coefficient attached to x3 different from −2?
104)
Consider the following regression results based on 30 observations.
Variable
Restricted
Unrestricted
Intercept
412.42×(0.0000)
535.47×(0.0000)
x1
−2.41×(0.0000)
−2.26×(0.0000)
Version 1
56
x2
NA
4.75×(0.0002)
x3
NA
−0.53×(0.0000)
11,013
4,355
SSE
Notes: Parameter estimates are in the main body of the table with the p-values in parentheses; * represents significance at 5% level. The last row presents the error sum of squares. a. Formulate the hypotheses to determine whether x2 and x3 are jointly significant in explaining y. b. Define the restricted and the unrestricted models needed to conduct the test. c. Calculate the value of the test statistic. d. At the 5% significance level, find the critical value(s). e. What is your conclusion to the test?
105)
Consider the following regression results based on 30 observations.
= 238.33 − 0.95x1 + 7.13x2 + 4.76x3; SSE = 3,439 = 209.56 − 1.03x1 + 5.24(x2 + x3); SSE = 3,559 a. Formulate the hypotheses to determine whether the influences of x2 and x3 differ in explaining y. b. Calculate the value of the test statistic. c. At the 5% significance level, find the critical value(s). d. What is your conclusion to the test?
106) In a simple linear regression based on 30 observations, the following information is provided: = 32.3 + 1.45x and se = 3.2. Also, evaluated at x = 20is 1.3. a. Construct a 95% confidence interval for E(y) if x = 20. b. Construct a 95% prediction interval for y if x = 20.
Version 1
57
107)
In a multiple regression based on 30 observations, the following information is provided:
= 38.6 − 0.1x1 + 0.2x2 and se = 5.4. Also, evaluated at x1 = 40 and x2 = 30 is 2.8. a. Construct a 95% confidence interval for E(y) if x1 equals 40 and x2 equals 30. b. Construct a 95% prediction interval for y if x1 equals 40 and x2 equals 30. c. Which interval is narrower? Explain.
108)
In a multiple regression based on 30 observations, the following information is provided:
= 469 − 2x1 + 6x2 − 0.5x3 and se = 16. Also, when x1 = 30, x2 = 10, and x3 = 65 a. Construct a 90% confidence interval for E(y) if x1 = 30, x2 = 10, and x3 = 65. b. Construct a 90% prediction interval for y if x1 = 30, x2 = 10, and x3 = 65. c. Which interval is narrower? Explain.
= 37.
109) A manager at a local bank analyzed the relationship between monthly salary (y, in $) and length of service (x, measured in months) for 30 employees. The following table summarizes a portion of the regression results: Coefficients
Standard Error
t-stat
p-value
Intercept
784.92
322.25
2.44
0.02
Service
9.19
3.20
2.87
0.01
Version 1
58
a. Specify the competing hypotheses to determine whether Service is significant in explaining Salary. b. At the 5% significance level, is Service significant? Explain.
110) An investment analyst wants to examine the relationship between a mutual fund’s return, its turnover rate, and its expense ratio. She randomly selects 10 mutual funds and estimates: Return = β0 + β1Turnover + β2Expense + ε, where Return is the average five-year return (in %), Turnover is the annual holdings turnover (in %), Expense is the annual expense ratio (in %), and ε is the random error component. A portion of the regression results is shown in the accompanying table. df
SS
MS
F
Regression
2
93.33
46.67
4.90
Residual
7
66.69
9.53
Total
9
160.02
Coefficients
Standard Error
t-stat
p-value
Intercept
30.60
4.30
7.12
0.000
Turnover
0.13
0.06
2.23
0.061
Expense
0.90
4.08
0.22
0.831
Significance F 0.047
a. At the 10% significance level, are the explanatory variables jointly significant in explaining Return? Explain. b. At the 10% significance level, is each explanatory variable individually significant in explaining Return? Explain.
Version 1
59
111) Pfizer Incorporated is the world’s largest research-based pharmaceutical company. Monthly data for Pfizer’s risk-adjusted return and the risk-adjusted market return are collected for a five-year period (n = 60). The accompanying table shows the regression results when estimating the Capital Asset Pricing Model (CAPM) model for Pfizer’s return. Coefficients Standard Error
t-stat
p-value
Lower 95% Upper 95%
Intercept
0.004
0.006
0.62
0.05364
−0.009
0.017
RM − Rf
0.716
0.119
6.00
1.35E07
1.477
0.955
a. At the 5% significance level, is the beta coefficient less than one? Show the relevant steps of the appropriate hypothesis test. b. At the 5% significance level, are there abnormal returns? Show the relevant steps of the appropriate hypothesis test.
112) Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses LumberSales (in $100,000s) as the response variable with HousingStarts (in 1,000s) and CommercialConstruction (in 1,000s) as the explanatory variables. The results of the regression are as follows: df
SS
MS
F
Regression
2
180,770
90,385
103.3
Residual
45
39,375
875
Total
47
220,145
Coefficients
Standard Error
t-stat
p-value
Intercept
5.37
1.71
3.14
0.0030
Housing Starts
0.76
0.09
8.44
0.0000
Commercial Construction
1.25
0.33
3.78
0.0005
Version 1
Significance F 0.0000
60
a. At the 5% significance level, are the explanatory variables jointly significant in explaining LumberSales? Explain. b. At the 5% significance level, is CommercialConstruction significant in explaining LumberSales? Explain. c. At the 5% significance level, can you conclude that the slope coefficient attached to HousingStarts differs from 1? Explain.
113) An analyst examines the effect that various variables have on crop yield. He estimatesy = β0 + β1x1 + β2x2 + β3x3 + ε, where y is the average yield in bushels per acre, x1 is the amount of summer rainfall, x2 is the average daily use in machine hours of tractors on the farm, and x3 is the amount of fertilizer used per acre. The results of the regression are as follows: df
SS
MS
F
Significance F
Regression
3
12,000
4,000
10
0.0095
Residual
6
2,400
400
Total
9
14,400
Coefficients
Standard Error
t-stat
p-value
Intercept
1.6
1.0
1.6
0.1232
x1
7.5
2.5
3.0
0.0064
x2
6.0
4.0
1.5
0.1472
x3
1.0
0.5
2.0
0.0574
a. At the 10% significance level, are the explanatory variables jointly significant in explaining crop yield? Explain. b. At the 10% significance level, is fertilizer significant in explaining crop yield? Explain. c. At the 10% significance level, can you conclude that the slope coefficient attached to rainfall differs from 9? Explain.
Version 1
61
114) A sociologist estimates the regression relating Poverty (y) to Education (x1). The estimated regression equation is: = 71.68 − 0.69x1; n = 39; SSE = 152. In an attempt to improve the results, he adds two more explanatory variables: Median Income (x2, in $1,000s) and the Mortality Rate (x3, per 1,000 residents). The estimated regression equation is: = 59.80 − 0.44x1− 0.18x2 + 0.17x3; n = 39; SSE = 60. a. Formulate the hypotheses to determine whether Median Income and the Mortality Rate are jointly significant in explaining Poverty. b. Calculate the value of the test statistic. c. At the 5% significance level, find the approximate value of the critical value(s). d. What is the conclusion to the test?
115) A marketing manager examines the relationship between the attendance at amusement parks and the price of admission. He estimates the following model: Attendance = β0 + β1 price + ε, where Attendance is the average daily number of people who attend an amusement park in July (in 1,000s) and Price is the price of admission. The marketing manager would like to construct interval estimates for Attendance when Price equals $80. The researcher estimates a modified model where Attendance is the response variable and the Price is now defined as Price* = Price − 80. A portion of the regression results is shown in the accompanying table. Regression Statistics R Square
0.62
Standard Error
21
Observations
30 Coefficients Standard Error
t-stat
p-value
Lower 95%
Upper 95%
Intercept
86.8
4.2
20.85
1.4E18
78.2
95.4
Price*
−3.1
0.5
−6.69
2.9E-
−4.0
−2.1
Version 1
62
07
a. According to the modified model, what is the point estimate for Attendance when Price equals $80? b. According to the modified model, what is a 95% confidence interval for Attendance when Price equals $80? (Note that t0.025,28 = 2.048.) c. According to the modified model, what is a 95% prediction interval for Attendance when Price equals $80? (Note that t0.025,28 = 2.048.)
116) A researcher analyzes the factors that may influence the poverty rate and estimates the following model:y = β0 + β1x1 + β2x2 + β3x3 + ε, where y is the poverty rate (y, in %), x1 is the percent of the population with at least a high school education, x2 is the median income (in $1,000s), and x3 is the mortality rate (per 1,000 residents). The researcher would like to construct interval estimates for y when x1, x2, and x3 equal 85%, $50,000, and 10, respectively. The researcher estimates a modified model where poverty rate is the response variable and the explanatory variables are now defined as portion of the regression results is shown in the accompanying table.
A
Regression Statistics R Square
0.86
Standard Error
1.30
Observations
39 Coefficients Standard t-stat Error
Intercept
14.89
0.37
39.73
p-value Lower 95% Upper 95% 1.04E30
14.14
15.64
a. According to the modified model, what is the point estimate for the poverty rate when x1, x2, and x3 equal 85%, $50,000, and 10, respectively. b. According to the modified model, what is a 95% confidence interval for the expected poverty rate when x1, x2, and x3 equal 85%, $50,000, and 10, respectively? ? (Note that t0.025,35 = 2.030.) c. According to the modified model, what is a 95% prediction interval for the poverty rate when x1, x2, and x3 equal 85%, $50,000, and 10, respectively? (Note that t0.025,35 = 2.030.)
Version 1
63
117) A sociologist studies the relationship between a district’s average score on a standardized test for 10th grade students (y), the average school expenditures per student (x1 in $1,000s), and an index of socioeconomic status of the district (x2). The results of the regression are = 10.16 + 8.50x1 + 4.30x2; n = 25; SSE = 450; He would like to determine whether the influence of expenditures on test scores differs from the influence of the index on test scores, or β1 ≠ β2. The results of the restricted model for this test are = 12.25 + 6.05(x1 + x2); n = 25; SSE = 600. a. Formulate the hypotheses to determine whether Expenditures and Index Rate have the same influence on Scores. b. Calculate the value of the test statistic. c. At the 5% significance level, find the critical value(s). d. What is the conclusion to the test?
118) A simple linear regression, Sales =β0 + β1Advertising + ε, is estimated using crosssectional data from 10 firms. The resulting residuals e, along with the values of the explanatory variable Advertising (x, in $100s), are shown in the accompanying table. x
2
e −3
3
6
8
11
15
16
21
25
30
2 −4
3
5
−6
−7
9
11 −11
a. Graph the residuals e against Advertising (x) and look for any discernible pattern. Which assumption is being violated? Discuss its consequences and suggest a possible remedy.
Version 1
64
119) A simple linear regression,Sales =β0 + β1Advertising + ε, is estimated using time-series data over the last 10 years. The residuals, e, and the time variable, t, are shown in the accompanying table. x
1
2
3
4
5
6
7
8
9
10
e −3
−2
−1
3
5
6
3
−3
−4
−3
a. Graph the residuals e against time and look for any discernible pattern. Which assumption is being violated? Discuss its consequences and suggest a possible remedy.
120) A marketing manager uses data from 30 sales agents to examine an agent’s sales performance (measured by average deal size in $1,000s) as a function of number of Facebook friends and number of months with the company. A portion of the data is shown below. Sales
Friends
Months
2800
10
60
:
:
:
5200
43
19
{MISSING IMAGE}Click here for the Excel Data File a.Estimate the model: Sales = β0 + β1 Friends + β2 Months + ε. Specify the sample regression equation. b. Conduct a hypothesis test to determine if Friends and Month are jointly significant in explaining Sales at α = 0.05. c. Conduct a hypothesis test to determine whether Friends influences Sales at α = 0.05. d. Specify the competing hypotheses to determine whether Months is significant in explaining Sales. e. Construct the 95% confidence interval for β2. Use the confidence interval to determine if Months is significant in explaining Sales.
121) If in the multiple linear model the slope coefficient βi is negative, it suggests an inverse (negative) relationship between the explanatory variable xi and the response variable y holding the other explanatory variables constant.
Version 1
65
⊚ ⊚
true false
122) The test statistic for a test of individual significance is assumed to have a t distribution with n − k − 2 degrees of freedom, where n is the sample size and k is the number of explanatory variables. ⊚ true ⊚ false
123) The restricted model is a reduced model where the coefficients that are restricted under the null hypothesis are estimated. ⊚ true ⊚ false
124)
The term BLUE stands for Best Linear Unbiased Estimator. ⊚ true ⊚ false
125) When some explanatory variables of a regression model are strongly correlated, this phenomenon is called serial correlation. ⊚ true ⊚ false
126) A crucial assumption in a regression model is that the error term is not correlated with any of the explanatory variables. ⊚ ⊚
Version 1
true false
66
127) If the residuals are correlated, then the residuals should show no pattern around the horizontal axis. ⊚ true ⊚ false
128) The term multicollinearity refers to the condition when the variance of the error term, conditional on x1, x2, …, xn, is the same for all observations. ⊚ true ⊚ false
129) The alternative hypothesis for the test of joint significance is specified as: HA: At least two βj ≠ 0. ⊚ true ⊚ false
Version 1
67
Answer Key Test name: Chap 15_4e 1) partial 2) unrestricted 3) explanatory 4) prediction 5) expected 6) normally 7) Residual 8) A 9) B 10) B 11) C 12) C 13) D 14) D 15) B 16) A 17) D 18) C 19) A 20) D 21) B 22) D 23) C 24) C 25) D 26) D Version 1
68
27) B 28) D 29) C 30) B 31) C 32) D 33) B 34) C 35) B 36) C 37) C 38) C 39) C 40) C 41) A 42) C 43) B 44) D 45) B 46) A 47) A 48) A 49) A 50) D 51) A 52) D 53) A 54) A 55) C 56) A Version 1
69
57) B 58) C 59) A 60) A 61) C 62) A 63) D 64) D 65) C 66) A 67) A 68) B 69) D 70) A 71) B 72) C 73) A 74) B 75) D 76) A 77) A 78) D 79) C 80) C 81) B 82) C 83) B 84) A 85) B 86) C Version 1
70
87) D 88) A 89) B 90) C 91) D 92) A 93) B 94) C 95) D 96) C 97) A 98) C 99) B 100) a. H0:β1 = 0; HA:β1 ≠ 0. b. Yes, since the p-value of 0.0106 is less than 0.05, we reject H0: β1 = 0 and conclude that x is significant in explaining y. 101) a. H0:β1 = 1; HA:β1 ≠ 1. b. t38 = −3.71. c. −t0.025,38 = −2.024 and t0.025,38 = 2.024. d. Because −3.71 <−2.024, we reject H0 and conclude that the slope differs from one. 102) a. H0:β1 = 0; HA:β1 ≠ 0; because the p-value = 0.0264 < 0.05 = α, we reject H0; we can conclude that x1 and y are linearly related. b. The confidence interval for β2 is reported as [−0.07, 2.59]; we cannot conclude that x2 is significant in explaining y because the interval contains 0. c. H0:β1 = −1; HA:β1 ≠ −1; t27 = 2.89; critical value t0.025,27 = 2.052. Because t27 > t0.025,27 we reject H0 and conclude that β1 differs from −1. 103) a. H0: β1 = β2 = β3 = 0; HA: At least one βj ≠ 0. b. Yes, because the p-value of 0.0003 is less than 0.05, we reject H0 and conclude that the explanatory variables are jointly significant. c. H0:β2 = 0; HA:β2 ≠ 0; because the p-value of 0.5972 is greater than 0.05, we do not reject H0 and conclude that x2 is not significant in explaining y. d. H0: β3 = 2; HA:β3 ≠ − 2. We calculate the value of the test statistic as t13 = −1.71. At the 5% significance level, the critical values are −t0.025,13 = −2.160 andt0.025,13 = 2.160. Because −2.160 < −1.71 < 2.160, we do not reject H0 and conclude that β3 is not significantly different from −2.
Version 1
71
104) a. H0:β2 = β3 = 0; HA: At least one βj ≠ 0. b. Restricted model: y =β0 + β1x + ε; unrestricted model: y = β0 + β1x1 + β2x2 + β3x3 + ε. c. F(2,26) = 19.87. d. F0.05,(2,26) = 3.37. e. We reject H0 because 19.87 is greater than 3.37. At the 5% level, we conclude that x2 and x3 are jointly significant in explaining y. 105) a. H0:β2 = β3; HA:β2 ≠ β3. b. F(1,26) = 0.91. c. F0.05(1,26) = 4.23. d. We do not reject H0 because 0.91 is less than 3.37. At the 5% level, we cannot conclude that the influence of x2 is different from the influence of x3. 106) a. [58.54, 63.86] b. [54.13, 68.27] 107) a. [34.85, 46.35] b. [28.12, 53.08] c. The confidence interval is narrower because it assumes that the expected value of the error term is zero, whereas the prediction interval incorporates the nonzero error term. 108) a. [373.38, 499.62] b. [367.73, 505.27] c. The confidence interval is narrower because it assumes that the expected value of the error term is zero, whereas the prediction interval incorporates the nonzero error term.
109) a. H0:β1 = 0; HA:β1 ≠ 0. b. Yes, since the p-value of 0.01 is less than 0.05, we reject H0:β1 = 0 and conclude that Service is significant in explaining Salary.
Version 1
72
110) a. Here the competing hypotheses for a joint significance test of Turnover and Expense take the form: H0:β1 = β2 = 0; HA: At least one βj ≠ 0; Because the p-value associated with F(2,7)= 4.90 is 0.047 and this value is less than 0.10, we reject H0 and conclude that at least one of the explanatory variables is significant in explaining Return. b. The competing hypotheses for an individual significance test of Turnover take the form: H0:β1 = 0; HA:β1 ≠ 0. Because the p-value associated with t7 = 2.23is 0.061 and this value is less than 0.10, we reject H0 and conclude that Turnover is significant. The competing hypotheses for an individual significance test of Expense take the form: H0:β2 = 0; HA:β2 ≠ 0. Because the p-value associated with t7 = 0.22is 0.831 and this value is greater than 0.10, we do not reject H0 and cannot conclude that Expense is significant in explaining Return. 111) a. H0:β ≥ 1; HA:β < 1. The value of the test statistic is t58 = −2.387. The critical value at the 5% significance level is −t0.025,58 = −1.672 and t0.025,58 = 1.672. Since −2.387 < −1.672, we reject H0:β ≥ 1 in favor of HA:β < 1; we can conclude that the return on Pfizer stock is less risky than the return on the market. b.H0:α = 0; HA: α ≠ 0. Because the p-value of 0.5364 is greater than the significance level of 0.05, we cannot reject H0:α = 0; we cannot conclude that there are abnormal returns. 112) a. Here the competing hypotheses for a joint significance test of HousingStarts and LumberSales take the form: H0:β1 = β2 = 0; HA: At least one βj ≠ 0. Because the p-value associated with F(2,45) = 103.3 is 0.000 and this value is less than 0.05, we reject H0 and conclude that at least one of the explanatory variables is significant in explaining LumberSales. b. The competing hypotheses for an individual significance test of CommercialConstruction take the form: H0:β2 = 0; HA:β2 ≠ 0. Because the p-value associated with t45 = 3.78 is 0.0005 and this value is less than 0.05, we reject H0 and conclude that CommercialConstruction is significant in explaining housing starts. c. H0:β1 = 1; HA:β1 ≠ 1. The value of the test statistic is t45 = −2.67. The critical values at the 5% significance level are −t0.025,45 = −2.014 and t0.025,45 = 2.014. Because −2.67 < −2.014, we rejectH0:β1 = 1 and conclude that the slope coefficient attached to HousingStarts differs from 1. Version 1
73
113) a. The competing hypotheses for a joint significance test of the three explanatory variables take the form: H0:β1 = β2 = β3 = 0; HA: At least one βj ≠ 0. Because the p-value associated with F(3,6) = 10 is 0.0095 and this value is less than 0.10, we reject H0 and conclude that at least one of the explanatory variables is significant in explaining crop yield. b. The competing hypotheses for an individual significance test of fertilizer take the form: H0:β3 = 0; HA:β3 ≠ 0. Because the p-value associated with t6 = 2 is 0.0574 and this value is less than 0.10, we reject H0 and conclude that fertilizer is significant in explaining crop yield. c. H0:β1 = 9; HA:β1 ≠ 9. The value of the test statistic is t6 = −0.6. The critical values at the 10% significance level are −t0.05,6 = −1.943 and t0.05,6 = 1.943. Because −1.943 < −0.6 < 1.943, we do not reject H0:β1 = 9 and cannot conclude that the slope coefficient attached to rainfall differs from 9. 114) a.H0:β2 = β3 = 0; HA: At least one βj ≠ 0 b. F(2,35) = 26.83 c. Using the F table, the critical value approximately is F0.05,(2,35) ≈ 3.32 d. Because 26.83 > 3.32, we reject the null hypothesis and conclude that Median Income and the Mortality Rate are jointly significant in explaining the Poverty. 115) <p>a. = 86.8 or 86,800 people; in the modified model, the point estimate of the intercept is the predicted value for Attendance when Price equals $80. b. In the modified model, we have estimates of the intercept, its standard error, and the confidence interval which is [78.2,95.4] in thousands of people. Converting to the correct units of measurement, we have [78,200, 95,400]. Excel and R also provide the 95% confidence interval directly. c. For prediction interval we have estimates of the intercept, its standard error, and the standard error of estimate. The prediction interval given by [42.9,130.7] in thousands of people or [42,900, 130,700]. 116) <p>a. = 14.859; in the modified model, the point estimate is the intercept for the modified modelwhen x1, x2, and x3 equal 85%, $50,000, and 10, respectively. b. In the modified model, we have estimates of the intercept, standard error, and the confidence interval given by [14.14,15.64]. Excel and R also provide the 95% confidence interval directly. c. For prediction interval we have estimates of the intercept, its standard error, and the standard error of estimate. We then find the prediction interval to be [12.15,17.63]. 117) a. H0:β1 = β2; HA:β1≠ β2 b. F(1,22) = 7.33 c. Using the F table, the critical value is F0.05,(1.22) = 4.30 d. Because 7.33 > 4.30, we reject the null hypothesis and conclude that β1 ≠ β2; we can conclude the influence of the two variables on test scores is different.
Version 1
74
118) a. [MISSING IMAGE: A scatter plot shows e versus x. The horizontal axis is labeled x. It ranges from 0 to 35 in increments of 5. The vertical axis is labeled e. It ranges from negative 15 to 15 in increments of 5. Dots are plotted at (0, negative 1), (6, negative 4), (15, negative 6), (16, negative 7), (30, negative 10), (3, 2), (7, 3), (11, 5), (21, 10), and (25, 11)., A scatter plot shows e versus x. The horizontal axis is labeled x. It ranges from 0 to 35 in increments of 5. The vertical axis is labeled e. It ranges from negative 15 to 15 in increments of 5. Dots are plotted at (0, negative 1), (6, negative 4), (15, negative 6), (16, negative 7), (30, negative 10), (3, 2), (7, 3), (11, 5), (21, 10), and (25, 11). ]
Heroskedasticity is likely a problem in this application in that the residuals seem to fan out across the horizontal axis. The standard errors are biased and so the t and F tests are not reliable. We can correct the biased standard errors using R. 119) a. [MISSING IMAGE: A scatter plot shows e versus x. The horizontal axis is labeled t. It ranges from 0 to 12 in increments of 2. The vertical axis is labeled e. It ranges from negative 6 to 8 in increments of 2. Dots are plotted at (1, negative 3), (2, negative 2), (3, negative 1), (4, 3), (5, 5), (6, 6), (7, 3), (8, negative 3), (9, negative 4), and (10, negative 3)., A scatter plot shows e versus x. The horizontal axis is labeled t. It ranges from 0 to 12 in increments of 2. The vertical axis is labeled e. It ranges from negative 6 to 8 in increments of 2. Dots are plotted at (1, negative 3), (2, negative 2), (3, negative 1), (4, 3), (5, 5), (6, 6), (7, 3), (8, negative 3), (9, negative 4), and (10, negative 3). ]
Serial correlation is likely a problem in this situation because the residuals show a pattern suggesting the residuals are correlated. As a result, the standard errors may be biased, and the t methods may not be reliable. We can correct the biased standard errors using the NeweyWest procedure. 120) <p>a. = −1872.24 + 160.18 × Friends + 57.36 × Months b. H0: β1 = β2 = 0, HA: At least one βj ≠ 0. The p-value is approximately zero ( ). H0 is rejected. At the 5% significance level, Friends and Months are jointly significant in explaining Sales. c. H0: β1 = 0, HA: β1 ≠ 0. The p-value is approximately zero. ( ). H0 is rejected. At the 5% significance level, Friends is significant in explaining Sales. d. H0: β2 = 0, HA: β2 ≠ 0. e. The 95% confidence interval for β2 is [−38.50, 153.22] which contains the value zero. This means that Months is not significant in explaining Sales.
121) TRUE 122) FALSE Version 1
75
123) FALSE 124) TRUE 125) FALSE 126) TRUE 127) FALSE 128) FALSE 129) FALSE
Version 1
76
CHAPTER 15 – 4e Additional Questions Test bank 1) If no assumptionin a regression model is violated, then the OLS estimators have desirable properties. In particular, the OLS estimators of the regression coefficients (the βi’s) are A) correlated and heteroskedastic. B) multicollinear and endogenous. C) nonlinear and inconsistent. D) unbiased and efficient.
2) Suppose you estimate the following model y = β0 +β1x1 + β2x2+ β3x3 + ε and you find that the correlation coefficient betweenx2and x3is −0.85.You choose to ignore the strong, negative correlation between the two explanatory variables.What are the consequences with respect to the OLS estimators and their standard errors? A) The OLS estimators and their standard errors are biased. B) Neither the OLS estimators nor their standard errors are biased. C) The OLS estimators are biased, but their standard errors are appropriate. D) The OLS estimators are unbiased, but their standard errors are inappropriate.
Version 1
1
3) Suppose you have cross-section data and estimate the following model Consumption = β0 +β1Income + ε.When you plot the Residuals against Income, you obtain the following residual plot.
What is the assumption that is likely being violated? A) Perfect multicollinearity B) Correlated observations (serial correlation) C) Changing variability (heteroskedasticity) D) Failure to incorporate nonlinear relationship between Consumption and Income
4) Suppose you have time series data and estimate the following model y = β0 +β1x1+ ε. When you estimate the model, you find that there are correlated observations (serial correlation). You choose to ignore this issue. What are the consequences? A) The predictions are biased, but the hypothesis tests are valid. B) The predictions are not biased, but the hypothesis tests are invalid. C) Both the predictions and the hypothesis tests are valid. D) Both the predictions and the hypothesis tests are not valid.
Version 1
2
5) Suppose you have time series data and estimate the following model: y = β0 +β1x1+ ε.When you plot the residuals against time, you get the following residual plot.
What assumption is likely being violated? A) Perfect Multicollinearity B) Changing variability (heteroskedasticity) C) Correlated observations (serial correlation) D) Failure to incorporate the nonlinear relationship between x and y in regression model
6) If you find that the regression model suffers from correlated observations (serial correlation), then one of the most common ways to remedy the problem is to A) do nothing. B) compute robust standard errors. C) drop one of the correlated explanatory variables. D) incorporate the nonlinearity between the response variable and the suspect explanatory variable.
Version 1
3
Version 1
4
Answer Key Test name: Chap 15_4e_Additional TB Questions 1) D 2) B 3) C 4) B 5) C 6) B
Version 1
5
CHAPTER 16 – 4e 1)
The quadratic regression model allows for one change in ________.
2)
The cubic regression model allows for ______ change(s) in slope.
3)
The log-log regression model is ________ in the variables.
4) When not all variables are transformed with logarithms, the models are called ________ models.
5)
The log-log and the ________ models can have similar shapes.
6) The logarithmic model is especially attractive when only the ________ variable is better captured in percentages.
7)
For a quadratic regression model , it is important to evaluate the estimated ________
effect of the explanatory variable x on the predicted value of the response variable .
8)
To compute the coefficient of determination R2 we have to use Excel’s ________ function
first to compute the correlation between y and .
9)
To compute the coefficient of determination
we have to use R’s ________ function
first to compute the correlation between and . Version 1
1
10)
Which of the following regression models isa first-order polynomial? A) y = β0 + ε B) y = β0 + β1x + ε C) y = β0 + β1x +β2x2+ ε D) y = β0 + β1x + β2x2 + β3x3 + ε
11)
Which of the following is a quadratic regression equation? A) B) C) D)
12)
How many coefficients need to be estimated in the quadratic regression model? A) 4 B) 3 C) 2 D) 1
13)
Although allowing for nonlinear trends, polynomials are still linear in the________. A) reponse B) explanatory variables C) coefficients D) error term
Version 1
2
14)
An inverted U-shaped curve is also known as ________. A) concave B) convex C) opaque D) hyperbola
15)
What is the effect of b2 < 0 in the case of the quadratic equation
= b0 + b1x + b2x2?
A) The curve is U-shaped. B) The curve is inverted U-shaped. C) The curve is a straight line. D) The curve is not a parabola.
16) For the quadratic equation = b0 + b1x + b2x2, which of the following expressions must be zero in order to minimize or maximize the predicted y? A) b1 + 2b2x B) 2b1 + b2x C) −b1/2b2 D) −b2/2b2
17) For the quadratic regression equation = b0 + b1x + b2x2, the predicted y achieves its optimum (maximum or minimum) when x is ________. A) −2b2/b1 B) −b1/2b2 C) b1/2b2 D) 2b1/b2
Version 1
3
18)
For the quadratic regression equation
= b0 + b1x + b2x2, the optimum (maximum or
minimum) value of is ________. A) −b1/2b2 B) b1/2b2 C) D)
19) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
The quadratic regression model is ________. A)
= 35.086 + 6.0523Hires − 0.1023Hires2
B)
= 6.0523 + 35.086Hires − 0.1023Hires2
C)
= 6.0523 − 35.086Hires + 0.1023Hires2
D)
= −0.1023 + 6.0523Hires + 35.086Hires2
Version 1
4
20) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
What percentage of the variation in productivity is explained by the quadratic regression model? A) 85.69% B) 0.7342% C) 90.54% D) 73.42%
21) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
Which of the following is the predicted productivity when 32 workers are hired? Version 1
5
A) 124.00 B) 122.46 C) 121.60 D) 113.50
22) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
For which value of Hires is the predicted Productivity maximized? Note: Do not round to the nearest integer. A) 29.58 B) 124.60 C) 35.086 D) 27.34
Version 1
6
23) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
Assuming that the values of Hires can be nonintegers, what is the maximum value of Productivity predicted by the model? A) 29.58 B) 124.603 C) 35.086 D) 127.50
Version 1
7
24) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
Assuming that the number of hired workers must be integer, how many workers should be hired to achieve the highest productivity according to the model? A) 26 B) 28 C) 30 D) 32
Version 1
8
25) The following scatterplot shows productivity and number of hired workers with a fitted quadratic regression model.
Assuming that the number of hired workers must be an integer, what is the maximum productivity predicted by the model? A) 29.58 B) 30.00 C) 124.603 D) 124.585
26)
What R function is used to fit a quadratic regression model? A) lm B) predict C) summary D) cor
27) The coefficient of determination R2 cannot be used to compare the linear and quadratic models, because
Version 1
9
A) the quadratic model has one parameter more to estimate. B) the quadratic model has two parameters more to estimate. C) the quadratic model always has a lower R2. D) R2 is not defined for the quadratic model.
28) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
What can be said about the linear relationship between Price and Sales? A) The relationship is negatively moderate. B) There is no relationship. C) The relationship is positively strong. D) The relationship is negatively strong.
Version 1
10
29) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
For the considered range of the price, the relationship between Price and Sales should be described by a ________. A) concave function B) hyperbola C) convex function D) linear function
Version 1
11
30) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
What is the number of estimated coefficients of the cubic regression model? A) 1 B) 2 C) 3 D) 4
Version 1
12
31) Typically, the sales volume declines with an increase of a product’s price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
Using the quadratic equation, predict the sales if the luxury good is priced at $100. A) 1191.87 B) 1157.64 C) 1160.79 D) 1168.00
Version 1
13
32) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
Using the cubic regression equation, predict the sales if the luxury good is priced at $100. A) 1171.85 B) 1133.10 C) 1106.61 D) 1092.91
Version 1
14
33) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
Which of the following models is most likely to be chosen in order to describe the relationship between Price and Sales? A) Linear B) Quadratic C) Cubic D) Exponential
Version 1
15
34) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
For which of the following prices do sales predicted by the quadratic regression equation reach their minimum? A) 106.33 B) 1157.16 C) 100.41 D) 1166.64
Version 1
16
35) Typically, the sales volume declines with an increase of a product price. It has been observed, however, that for some luxury goods the sales volume may increase when the price increases. The following scatterplot illustrates this rather unusual relationship.
For which of the following two prices are the sales predicted by the quadratic regression equation equal 1700 units? A) 60.51 and 150.15 B) 61.51 and 151.15 C) 62.51 and 152.15 D) 63.51 and 153.15
36) What does a positive value for price elasticity indicate if y represents the quantity demanded of a particular good and x is its unit price in a log-log regression model? A) As price increases, the expected sales decreases. B) As price decreases, the expected sales increases. C) As price increases, the expected sales increases. D) As price decreases, the expected sales remain the same.
Version 1
17
37) For the exponential model ln(y) = β0 + β1x + ε , if x increases by one unit, then E(y) changes by approximately A) β1 × 100% B) β1 × 100 units C) β1% D) β1 units
38) For the logarithmic model ln(y) = β0 + β1ln(x) + ε, the predicted value of y is computed as ________. A) B) C) D)
39) For the log-log model ln(y) = β0 + β1ln(x) + ε, the predicted value of y is computed as ________. A) B) C) D)
40) <p>For which of the following models is predicted value of y ?
Version 1
used to find the
18
A) B) C) D)
41) When the predicted value of the response variable has to be found, in which of the following two models, is there a need for the standard error correction? A) Linear and log-log B) Log-log and logarithmic C) Logarithmic and linear D) Log-log and exponential
42) A model with one explanatory variable that has been log transformed is called a(n) ________. A) log-log model B) logarithmic model C) exponential model D) linear model
43) A model in which the response variable has been log transformed is called a(n) ________. A) log-log model B) logarithmic model C) exponential model D) linear model
Version 1
19
44) A model in which both the response variable and the explanatory variable have been log transformed is called a(n) ________. A) exponential model B) logarithmic model C) linear model D) log-log model
45)
In the model ln(y) = β0 + β1 ln(x) + ε, the coefficient β1 is the approximate ________. A) change in E(y) when x increases by one unit B) percentage change in E(y) when x increases by 1% C) percentage change in E(y) when x increases by one unit D) change in E(y) when x increases by 1%
46) The linear and logarithmic models, y = β0 + β1x + ε and y = β0 + β1 ln(x) + ε, were fit given data on y and x, and the following table summarizes the regression results. Which of the two models provides a better fit? Variable
Linear Model
Logarithmic Model
Intercept
1.17
−0.29
x
2.09
N/A
ln(x)
N/A
8.57
0.88
0.87
0.86
0.85
R2 Adjusted
R2
A) The linear model. B) The logarithmic model. C) The models are not comparable. D) The provided information is not sufficient to make the conclusion.
Version 1
20
47) The log-log and exponential models, ln(x) = β0 + β1ln(x) + ε and ln(y) = β0 + β1x + ε, were fit given data on y and x, and the following table summarizes the regression results. Which of the two models provides a better fit? Variable
Log-LogModel
ExponentialModel
Intercept
0.75
1.04
x
N/A
0.24
ln(x)
1.08
N/A
R2
0.85
0.87
Adjusted R2
0.84
0.86
A) The log-log model. B) The exponential model. C) The models are not comparable. D) The provided information is not sufficient to make the conclusion.
48) The quadratic and logarithmic models,y = β0 + β1x + β2x2 + ε and y = β0 + β1 ln(x) + ε, were fit given data on y and x, and the following table summarizes the regression results. Which of the two models provides a better fit? Variable
Quadratic Model
Logarithmic Model
Intercept
2.28
5.16
x
4.40
N/A
x2
−0.22
N/A
ln(x)
N/A
8.54
R2
0.78
0.77
Adjusted R2
0.75
0.76
A) The quadratic model. B) The logarithmic model. C) The models are not comparable. D) The provided information is not sufficient to make the conclusion.
49) The logarithmic and log-log models,y = β0 + β1ln(x) + ε and ln(y) = β0 + β1 ln(x) + ε, were fit given data on y and x, and the following table summarizes the regression results. Which of the two models provides a better fit? Version 1
21
Variable
Logarithmic Model
Log-Log Model
Intercept
−1.38
0.75
ln(x)
9.79
1.08
0.88
0.85
0.87
0.84
R2 Adjusted
R2
A) The logarithmic model. B) The log-log model. C) The models are comparable. D) The provided information is not sufficient to make the conclusion.
50) Which of the following regression models is most likely to provide the best fit for the data represented by the following scatterplot?
A) Exponential model B) Logarithmic model C) Linear model D) Log-log model
Version 1
22
51) Which of the following regression models is most likely to provide the best fit for the data represented by the following scatterplot?
A) Exponential model B) Logarithmic model C) Linear model D) Log-log model
52) The following data, with the corresponding Excel scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
5
6
7
8
9
10
11
Height (feet)
6
9.5
13
15
16.5
17.5
18.5
19
19.5
19.7
19.8
Version 1
23
What is the regression model used to describe the relationship between Height and Age? A) Exponential model B) Logarithmic model C) Linear model D) Log-log model
53) The following data, with the corresponding scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
5
6
7
8
9
10
11
Height (feet)
6
9.5
13
15
16.5
17.5
18.5
19
19.5
19.7
19.8
Version 1
24
If the age of a tree increases by 1%, then its predicted height increases by approximately ________. A) 6.1082% B) 0.06108% C) 6.1082 feet D) 0.061082 feet
54) The following data, with the corresponding Excel scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
Height (feet)
6
9.5
13
15
Version 1
5
6
7
16.5 17.5 18.5
8 19
9
10
11
19.5 19.7 19.8
25
What percent of the variation in heights is explained by the model? ________. A) 6.09 B) 6.10 C) 98.63 D) Can’t determine from the given information
55) The following data, with the corresponding Excel scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
Height (feet)
6
9.5
13
15
Version 1
5
6
7
16.5 17.5 18.5
8 19
9
10
11
19.5 19.7 19.8
26
Which of the following is the correlation coefficient between Height and ln(Age)? A) −0.9863 B) 0.9863 C) −0.9931 D) 0.9931
56) The following data, with the corresponding Excel scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
Height (feet)
6
9.5
13
15
Version 1
5
6
7
16.5 17.5 18.5
8 19
9
10
11
19.5 19.7 19.8
27
Which of the following is the predicted height of an eight-year-old cherry tree that was planted as a one-year-old and six-foot-tall tree? A) 54.96 B) 42.66 C) 17.04 D) 18.80
57) The following data, with the corresponding Excel scatterplot, show the average growth rate of Weeping Higan cherry trees planted in Washington, DC. At the time of planting, the trees were one year old and were all six feet in height. Age (years)
1
2
3
4
Height (feet)
6
9.5
13
15
Version 1
5
6
7
16.5 17.5 18.5
8 19
9
10
11
19.5 19.7 19.8
28
If a cherry tree is planted as a one-year-old and six-foot-tall tree, which of the following is the estimated time needed by the tree to reach 16.5 feet in height? A) About 4 years B) About 4.5 years C) About 5 years D) About 5.5 years
58) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 189. 159. 167. 146. 144. 135. 118. 127. 126. 109. 102. 112. 107. 102. (ºF) 2 4 9 4 3 1 3 4 1 6 9 0 1 8
The output for an exponential model ln(Temp) = β0 + β1Time + ε, t is below. df
SS
MS
Regression
1
0.45576
0.45576
Residual
12
0.016
0.00133
Total
13
0.46796
Coefficients
Standard Error
Version 1
t-stat
29
Intercept
5.1241
0.0157
326.38
Time
-0.0107
0.0006
-17.83
Which of the following is the percentage of variation in ln(Temp) explained bythe model? A) 45.86% B) 47.26% C) 1.78% D) 97.39%
59) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model ln(Temp) = β0 + β1Time + ε, is below. df
SS
MS
Regression
1
0.45476
0.45476
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
Which of the following is the percentage of variation in ln(Temp) explained bythe model? A) 45.48% B) 97.01% C) 1.40% D) 46.88%
Version 1
30
60) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, is below. df
SS
MS
Regression
1
0.45476
0.45476
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
Which of the following is the sample correlation coefficient between ln(Temp) and Time? A) −0.9701 B) 0.9701 C) −0.9849 D) 0.9849
61) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, is below.
Regression
Version 1
df
SS
MS
1
0.45476
0.45476
31
Residual
12
0.01400
Total
13
0.46876
Coefficients
Standard Error
0.00117
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
Which of the following is the standard error of the estimate? A) 0.03421 B) 0.45476 C) 0.00117 D) 0.67436
62) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, is below. df
SS
MS
Regression
1
0.45476
0.45476
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
Which of the following is the regression equation for making predictions concerning the coffee temperature?
Version 1
32
A) B) C) D)
63) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, is below. df
SS
MS
Regression
1
0.45476
0.45476
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
During one minute, the predicted temperature decreases by approximately ________. A) 0.0118° F B) 1.18° F C) 1.18% D) 11.8%
Version 1
33
64) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, is below. df
SS
MS
Regression
1
0.45476
0.45476
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
What is the predicted coffee temperature in half an hour after the brewing? A) 164.72 B) −4.7904 C) 164.74 D) 120.42
65) The following data show the cooling temperatures of a freshly brewed cup of coffee after it is poured from the brewing pot into a serving cup. The brewing pot temperature is approximately 180º F. Time (min )
0
5
8
11
15
18
22
25
30
34
38
42
45
50
Temp 179. 168. 158. 149. 141. 134. 125. 123. 116. 113. 109. 105. 102. 100. (ºF) 5 7 1 2 7 6 4 5 3 2 1 7 2 5
The output for an exponential model, ln(Temp) = β0 + β1Time + ε, t is below.
Regression
Version 1
df
SS
MS
1
0.45476
0.45476
34
Residual
12
0.01400
0.00117
Total
13
0.46876
Coefficients
Standard Error
t-stat
Intercept
5.1444
0.0173
297.74
Time
−0.0118
0.0006
−19.74
How many minutes must elapse after the brewing in order to cool the coffee to 158°F? A) About five minutes B) About six minutes C) About seven minutes D) About eight minutes
66) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
N/A
−3.2577
ln(x)
Version 1
35
R2
0.9793
0.9852
Adjusted R2
0.9781
0.9850
1,515.5720
0.2071
Standard Error se
Which of the following is the price elasticity of the demand found by the log-log model? A) 26.3660 B) −3.2577 C) 0.9852 D) 0.2071
67) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
ln(x)
N/A
−3.2577
R2
0.9793
0.9852
Adjusted R2
0.9781
0.9850
1,515.5720
0.2071
Standard Error se
Version 1
36
<p>Which of the following does the slope of the obtained log-log regression equation = 26.3660 – 3.2577 ln(Price) signify? A) For every 1% increase in the price, the predicted demand declines by approximately 3.2577%. B) For every 1% increase in the demand, the expected price increases by approximately 3.2577%. C) For every 1% increase in the demand, the expected price decreases by approximately 3.2577%. D) For every 1% increase in the price, the predicted demand increases by approximately 3.2577%.
68) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
N/A
−3.2577
0.9793
0.9852
ln(x) R2
Version 1
37
Adjusted R2 Standard Error se
0.9781
0.9850
1,515.5720
0.2071
Using the log-log model, which of the following is the predicted demand when the price is $200? A) 10,874.92 B) 9,201.45 C) 7,849.25 D) 12,499.98
69) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
N/A
−3.2577
0.9793
0.9852
0.9781
0.9850
1,515.5720
0.2071
ln(x) R2 Adjusted
R2
Standard Error se
Using the cubic model, which of the following is the predicted demand when the price is $200?
Version 1
38
A) 14,378.72 B) 9,201.45 C) 10,764.66 D) 12,499.98
70) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
ln(x)
N/A
−3.2577
R2
0.9793
0.9852
Adjusted R2
0.9781
0.9850
1,515.5720
0.2071
Standard Error se
Which of the following is the percentage of variation in ln(Demand) explained by the log-log regression model?
Version 1
39
A) 98.52% B) 98.50% C) 91.39% D) 97.93%
71) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
ln(x)
N/A
−3.2577
R2
0.9793
0.9852
Adjusted R2
0.9781
0.9850
1,515.5720
0.2071
Standard Error se
Assuming that the sample correlation coefficient between Demand and = exp(26.3660 – 2 3.2577 ln(Price) + (0.2071) /2) is 0.956, what is the percentage of variations in Demand explained by the log-log regression model?
Version 1
40
A) 98.52% B) 98.50% C) 91.39% D) 97.93%
72) The following data show the demand for an airline ticket dependent on the price of this ticket. Pric e
100
110
120
130
140
150
160
170
180
190
200 210 220 230
Dema 42,3 39,5 32,0 28,5 25,0 22,7 20,5 17,5 15,4 13,2 11,2 8,1 7,4 6,0 nd 22 22 72 72 72 72 52 42 22 62 52 02 42 72 Pric e
240
250
260
270
280
290
300
310
320
330
340 350 360 370
Dema 5,45 4,93 4,47 4,04 3,66 3,31 3,00 2,72 2,46 2,38 2,02 1,8 1,6 1,5 nd 2 2 2 2 2 2 2 2 2 2 2 22 52 02 Pric e
380
390
400
410
420
430
440
450
460
470
480 490 500 510
Dema 1,36 1,23 1,12 1,01 nd 2 2 2 2
922
832
762
682
622
562
512 472 422 382
Pric e
520
530
540
550
560
570
580
590
600
610
620 630 640 650
Dema nd
352
342
332
322
302
282
272
262
242
222
202 182 182 172
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable
Cubic Model
Log-Log Model
Intercept
85,361.3
26.3040
x
−587.4483
N/A
x2
1.3073
N/A
x3
−.0010
N/A
N/A
−3.2693
R2
0.9841
0.9907
Adjusted R2
0.9829
0.9905
1,515.5940
0.2098
ln(x)
Standard Error se
Version 1
41
Assuming that the sample correlation coefficient between Demand and = exp(26.3040 − 3.2693 ln(Price) + (0.2098)2/2) is 0.959, what is the predicted demand for a price of $250 found by the model with better fit? A) 3,434.76 B) 4,580.47 C) 3,853.26 D) 3,319.76
73) The following data show the demand for an airline ticket dependent on the price of this ticket. Price 100
110
120
130
140
150
160
170
180
190
200 210 220 230
Deman 4230 3950 3205 2855 2505 2275 2053 1752 1540 1324 1123 808 742 605 d 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Price 240
250
260
270
280
290
300
310
320
330
340 350 360 370
Deman 5430 4910 4450 4020 3640 3290 2980 2700 2440 2360 2000 180 163 148 d 0 0 0 Price 380
390
400
410
420
430
440
450
460
470
480 490 500 510
Deman 1340 1210 1100 d
990
900
810
740
660
600
540
490 450 400 360
Price 520
530
540
550
560
570
580
590
600
610
620 630 640 650
Deman 330 d
320
310
300
280
260
250
240
220
200
180 160 160 150
For the assumed cubic and log-log regression models, Demand = β0 + β1Price + β2Price2 + β3Price3 + ε and ln(Demand) = β0 + β1ln(Price) + ε, the following regression results are available. Variable Intercept
Cubic Model
Log-Log Model
85,304.4991
26.3660
x
−587.4426
N/A
x2
1.3177
N/A
x3
−0.0010
N/A
N/A
−3.2577
R2
0.9793
0.9852
Adjusted R2
0.9781
0.9850
1,515.5720
0.2071
ln(x)
Standard Error se
Version 1
42
Assuming that the sample correlation coefficient between Demand and = exp(26.3660 – 2 3.2577 ln(Price) + (0.2071) /2) is 0.956, what is the predicted demand for a price of $250 found by the model with better fit? A) 4,447.88 B) 3,914.38 C) 4,029.38 D) 5,175.09
74)
Which of the following is a semi-log regression model? A) Log-log B) Exponential C) Quadratic D) Cubic
75)
In which of the following models does the slope coefficient b1 measure the approximate
percentage change in when x increases by 1%? A) Linear regression model B) Exponential regression model C) Logarithmic regression model D) Log-log regression model
76) In which of the following models does the slope coefficient b1 measure the change in when x increases by one unit? A) Linear regression model B) Exponential regression model C) Logarithmic regression model D) Log-log regression model
Version 1
43
77)
In which of the following models does the slope coefficient b1 × 100 measure the
approximate percentage change in when x increases by one unit? A) Linear regression model B) Exponential regression model C) Logarithmic regression model D) Log-log regression model
78)
In which of the following models does the slope coefficient b1/100 measure the
approximate percentage change in when x increases by 1%? A) Linear regression model B) Exponential regression model C) Logarithmic regression model D) Log-log regression model
79) Which of the following nonlinear regression models is the polynomial regression model of order three? A) Cubic regression model B) Linear regression model C) Logarithmic regression model D) Quadratic regression model
Version 1
44
80) Which of the following regression models is most likely to provide the best fit for the data represented by the following scatterplot?
A) Quadratic model B) Logarithmic model C) Cubic model D) Log-log model
81)
Which of the following is a typical application for the cubic regression model? A) Average cost against output B) Company’s total cost curve C) Rate of growth D) Company’s revenue
Version 1
45
82) The scatterplot shown below represents a typical shape of a cubic regression model y = β0 + β1x + β2x2 + β3x3 + ε.
Which of the following is true about values of the regression coefficients? A) β1 > 0; β2 > 0; β3 > 0 B) β1 < 0; β2 > 0; β3 > 0 C) β1 < 0; β2 < 0; β3 > 0 D) β1 > 0; β2 < 0; β3 > 0
83) The scatterplot shown below represents a typical shape of a cubic regression model y = β0 + β1x + β2x2 + β3x3 + ε.
Which of the following is a predicted value if x is equal to 12?
Version 1
46
A) 31.873 B) 38.173 C) 37.183 D) 33.871
84) Which of the following statements is FALSE regarding the given scatterplot of y against x with a superimposed trend line? {MISSING IMAGE} A) A quadratic regression model y = β0+ β1x + β2x2+ ε can be used to describe the relationship between y and x. B) The predicted value of y, reaches its maximum when x = −b1/(2b2span), where b1and b2are coefficient estimates in the regression equation .</p> 2 C) The coefficient β2 in y = β0+ β1x + β2x + ε is positive. D) <p>The estimated regression equation for the quadratic regression model is equation .</p>
85) Which of the following statements is FALSE regarding the trendlines of three regression models as shown in the plot below? {MISSING IMAGE} A) The red line is depicting a logarithmic regression model. B) The blue line is depicting a linear regression model. C) The green line is depicting an exponential model. D) The red and green lines squared and cubed the explanatory variable respectively to capture nonlinearities between the response variable and explanatory variable.
86) Which of the following is a valid comparison of linear and log-transformed models using the computer-generated R 2?
Version 1
47
A) When the models have a different number of explanatory variables. B) When the models have a different response variable. C) When the models have the same number of explanatory variables and the same response variable. D) When the models have the same response variables and a different number of explanatory variables.
87) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
What is the sample correlation coefficient between Age and Debt?
88) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
Version 1
48
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
Using the quadratic regression equation, find the age of an employed single person with the highest predicted percentage debt.
89) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
Using the quadratic regression equation, find the predicted maximum percentage debt.
90) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Version 1
49
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
0.0824
0.8127
0.8136
0.0496
0.7988
0.7920
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
R2 Adjusted
R2
Standard Error se SSE
What is the regression equation that provides the best fit?
91) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
0.0824
0.8127
0.8136
0.0496
0.7988
0.7920
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
R2 Adjusted
R2
Standard Error se SSE
What is the percentage of variations in Debt explained by the regression model that provides the best fit?
92) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Version 1
Linear Model
Quadratic Model
Cubic Model
50
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
0.0824
0.8127
0.8136
0.0496
0.7988
0.7920
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
R2 Adjusted
R2
Standard Error se SSE
What is the estimate of the variance of the random error ε provided by the regression equation with the best fit?
93) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
What is the predicted percentage debt of a 45-year-old employed single person determined by the model with the best fit?
Version 1
51
94) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
If you impose the restrictions β2 = β3 = 0 on the model Debt = β0 + β1Age + β2Age2+ β3Age3 + ε, what will be the sum of the squared errors (SSER) computed for the restricted model?
95) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
0.0824
0.8127
0.8136
0.0496
0.7988
0.7920
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
R2 Adjusted
R2
Standard Error se SSE
What is the value of the test statistic for testing H0: β2 = β3 = 0 against HA: β2 ≠ 0 or β3 ≠ 0 in the model Debt = β0 + β1Age + β2Age2+ β3Age3 + ε ?
Version 1
52
96) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
367.8559
366.2017
SSE
1,802.2938
For the cubic model, Debt = β0+ β1Age + β2Age + β3Age + ε, the following Excel partial output is available. What is the conclusion when testing the individual significance of Age3? 2
3
Coefficients
Standard Error
t-stat
p-value
−31.9711
15.0714
−2.1213
0.0436
Age
2.1651
1.0622
2.0382
0.0518
Age2
−0.0167
0.0230
−0.7260
0.4743
Age3
−0.0001
0.0002
−0.3427
0.7346
Intercept
97) Thirty employed single individuals were randomly selected to examine the relationship between their age (Age) and their credit card debt (Debt) expressed as a percentage of their annual income. Three polynomial models were applied and the following table summarizes Excel’s regression results. Variable
Linear Model
Quadratic Model
Cubic Model
Intercept
13.7936
−36.7936
−31.9711
Age
0.1340
2.5197
2.1651
Age2
N/A
−0.0245
−0.0167
Age3
N/A
N/A
−0.0001
Version 1
53
R2
0.0824
0.8127
0.8136
Adjusted R2
0.0496
0.7988
0.7920
Standard Error se
8.0229
3.6911
3.7530
1,802.2938
367.8559
366.2017
SSE
Suppose the restriction β3 = 0 is imposed on the cubic regression model Debt = β0+ β1Age + β2Age2+ β3Age3+ ε. What regression equation is obtained under this restriction?
98) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
R2
Version 1
54
Adjusted R2 Standard Error se
0.9531
0.9895
594.2299
0.0434
What is the estimated linear regression model?
99) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
R2
0.9568
0.9904
Adjusted R2
0.9531
0.9895
594.2299
0.0434
Standard Error se
What is the estimated log-log regression model?
Version 1
55
100) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1 ln(Pepsi Price) + β2 ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
Using the linear model and holding Cola Price constant, what is the predicted change in the Pepsi Sales if the Pepsi Price increases by 10 cents?
Version 1
56
101) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
For the log-log model, interpret the estimated coefficient for ln(Pepsi Price).
Version 1
57
102) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1 ln(Pepsi Price) + β2 ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
Using the linear model and holding Cola Price constant, what is the predicted change in the Pepsi Sales if the Pepsi Price increases by 10 cents?
103) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Version 1
58
Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
For log-log model, interpret the estimated coefficient for ln(Cola Price).
104) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
Version 1
59
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
R2
0.9568
0.9904
Adjusted R2
0.9531
0.9895
594.2299
0.0434
Standard Error se
Using the linear model, calculate the predicted Pepsi Sales when the Pepsi Price is $1.50 and the Cola Price is $1.25.
105) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
Version 1
60
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
Using the log-log model, calculate the predicted Pepsi Sales when the Pepsi Price is $1.50 and the Cola Price is $1.25.
106) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
Version 1
61
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
What is the percentage of variations in the Pepsi sales explained by the linear model?
107) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
Version 1
62
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
R2
0.9568
0.9904
Adjusted R2
0.9531
0.9895
594.2299
0.0434
Standard Error se
What is the percentage of variations in the log of the Pepsi sales explained by the log-log model?
108) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
Version 1
63
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
Discuss the choice between the linear model and the log-log model.
109) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
Version 1
64
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows: Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
0.9568
0.9904
0.9531
0.9895
594.2299
0.0434
R2 Adjusted
R2
Standard Error se
Using Excel or R, what is the percentage of variations in the Pepsi Sales as explained by the loglog model?
110) It is believed that the sales volume of one-liter Pepsi bottles depends on the price of the bottle and the price of a one-liter bottle of Coca-Cola. The following data have been collected for a certain sales region. Pepsi Sales Pepsi Price Cola Price Pepsi Sales Pepsi Price Cola Price 11,453
0.79
1.59
6,003
1.34
1.14
10,982
0.84
1.69
5,597
1.39
1.19
11,130
0.84
1.59
5,405
1.44
1.14
9,980
0.89
1.64
4,960
1.49
1.14
9,850
0.89
1.49
4,610
1.54
1.09
9,810
0.94
1.79
4,234
1.59
0.99
9,650
0.99
1.54
4,083
1.64
1.09
8,010
1.04
1.39
3,889
1.69
0.94
7,864
1.09
1.64
3,605
1.79
1.04
7,791
1.14
1.59
3,710
1.84
0.99
7,332
1.14
1.24
3,533
1.84
0.99
7,508
1.19
1.29
3,395
1.89
0.94
6,634
1.19
1.29
3,229
1.94
0.89
The linear model Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and the log-log model ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε have been estimated as follows:
Version 1
65
Variable
Linear Model
Log-Log Model
Intercept
10,982.5223
8.9703
Pepsi Price
−5,528.7557
N/A
Cola Price
2,364.5870
N/A
ln(Pepsi Price)
N/A
−1.2709
ln(Cola Price)
N/A
0.2692
R2
0.9568
0.9904
Adjusted R2
0.9531
0.9895
594.2299
0.0434
Standard Error se
Using Excel or R, which of the two models provides a better fit?
111) A manager wants to evaluate how time (measured in weeks) affects the demand of a new product (measured in thousands of units) using different regression models. A portion of data he collected is shown below. Demand
Week
148
10
⋮ 72.5
⋮ 100
{MISSING IMAGE}Click here for the Excel Data File a.Estimate the linear, quadratic, and cubic regression models. b.Identify the best fitting models. c.Use the best model to predict demand in week 55.
112) An epidemiologist wants to predict the number of cases of a contagious disease based on number of days since monitoring begins. A portion of the relevant data she collected is shown below: Days
Version 1
Cases 66
10
640
⋮ 60
⋮ 3520
{MISSING IMAGE}Click here for the Excel Data File a.Estimate the linear and relevant log-transformed models. b.Comment on the estimated slope coefficient for each model. c.Which model provides the best prediction? Explain. d.Use the best fitted model to predict the number of cases after 70 days since monitoring begins.
113)
A quadratic regression model is a polynomial regression model of order 2. ⊚ true ⊚ false
114)
A cubic regression model is a polynomial regression model of order 2. ⊚ true ⊚ false
115) <p>The quadratic regression model the slope capturing the influence of x on y. ⊚ true ⊚ false
116) <p>The cubic regression model in the slope capturing the influence of x on y.
Version 1
allows for one sign change in
allows for one sign change
67
⊚ ⊚
true false
117) <p>For the quadratic regression model the change in the predicted value of y when x increases by 1 unit. ⊚ true ⊚ false
can be interpreted as
118) <p>The curve representing the regression equation style="color:black;">has a U-shape if . ⊚ true ⊚ false
119) 0.
<p>The quadratic regression model ⊚ ⊚
<span
reaches a maximum when
true false
120) <p>The fit of the regression equations = b0 + b1x + b2x2 and = b0 + b1x + b2x2 + b3x3 can be compared using the coefficient of determination R2. ⊚ true ⊚ false
121)
The regression model ln(y) = β0 + β1x + εis called a logarithmic model. ⊚ true ⊚ false
Version 1
68
<
122)
<p>The regression model ⊚ true ⊚ false
123)
The regression model ⊚ ⊚
is called an exponential model.
is called a log-log model.
true false
124) <p>For the log-log model change in E(y) when x increases by 1%. ⊚ true ⊚ false
is the approximate percent
125) <p>For the exponential model percentage change in E(y) when x increases by 1%. ⊚ true ⊚ false
,β1× 100% is the approximate
126) <p>For thelogarithmic model percentage change in E(y) when x increases by 1%. ⊚ true ⊚ false
is the approximate
127) <p>The fit of the models using the coefficient of determination ⊚ true ⊚ false
Version 1
and
can be compared
.
69
Version 1
70
Answer Key Test name: Chap 16_4e 1) slope 2) two 3) nonlinear 4) semi-log 5) logarithmic 6) explanatory 7) marginal 8) CORREL 9) cor 10) B 11) C 12) B 13) C 14) A 15) B 16) A 17) B 18) C 19) A 20) D 21) A 22) A 23) B 24) C 25) D 26) A Version 1
71
27) A 28) B 29) C 30) D 31) D 32) B 33) B 34) A 35) B 36) C 37) A 38) B 39) B 40) D 41) D 42) B 43) C 44) D 45) B 46) A 47) B 48) B 49) D 50) B 51) A 52) B 53) D 54) C 55) D 56) D Version 1
72
57) B 58) D 59) B 60) C 61) A 62) B 63) C 64) D 65) A 66) B 67) A 68) B 69) D 70) A 71) C 72) B 73) D 74) B 75) D 76) A 77) B 78) C 79) A 80) C 81) B 82) D 83) A 84) C 85) D 86) C Version 1
73
87) 0.2871 88) 51.4 years old 89) 27.99% 90) = −36.7936 + 2.5197 Age − 0.0245 Age2
91) 81.27% 92) 13.6242 (%)2
93) 26.98% 94) SSER = 1802.2938 95) 50.98
96) Do not reject H0: β3 = 0; we cannot conclude Age3 is significant. We cannot conclude that the cubic model is better than the quadratic model. 97) The quadratic equation: Debt = −36.7936 + 2.5197Age − 0.0245Age2 98)
= 10,982.5223 − 5,528.7557 Pepsi Price + 2,364.5870 Cola Price
99) <p>In = =8.9703 − 1.2709 ln(Pepsi Price) − 0.2692 ln(Cola Price) 100) Holding Cola Price constant, the predicted Pepsi sales decrease by about 553 bottles when the Pepsi price increases by 10 cents. 101) Holding Cola Price constant, the predicted Pepsi sales decrease by about 1.27% when the Pepsi price increases by 1%. 102) Holding Pepsi Price constant, the predicted Pepsi sales increase by about 236 bottles when the Cola price increases by 10 cents. 103) Holding the Pepsi price constant, the predicted Pepsi sales increase by approximately 0.27% when the Cola price increases by 1%. 104) About 5,645 bottles 105) About 4,994 bottles 106) 95.68% 107) 99.04%
Version 1
74
108) Both models, Pepsi Sales = β0 + β1Pepsi Price + β2Cola Price + ε and ln(Pepsi Sales) = β0 + β1ln(Pepsi Price) + β2ln(Cola Price) + ε, have the same number of coefficients to estimate. Therefore, the coefficient of determination R2 should be used in their comparison. The estimated linear regression model is: = 10,982.5223 − 5,528.7557 Pepsi Price + 2,364.5870 Cola Price with the coefficient of determination R2 = 0.9568. On the other hand, the estimated log-log regression model is: ln( ) = 8.9703 − 1.2709 ln(Pepsi Price) + 0.2692 ln(Cola Price) with the coefficient of determination R2 = 0.9904. However, the two coefficients are not comparable because the first one measures the explained variations in variations in ln(
while the second one measures the explained
). To choose between the two models, for the log-log model, we have
to compute R2 for the log-log model using the formula coefficient of correlation between yand
where
is the sample
. Then the comparison of
R2 = 0.9568 for the linear model and for the log-log model will allow us to choose the better model. 109) 98.80% 110) The log-log model. 111) <p>a. The estimated regression equation for the linear model is: = 94.3907 − 0.4269 × Week. The estimated regression equation for the quadratic model is: Week + 0.0262 × Week2. The estimated regression equation for the cubic model is: + 0.1276 × Week2 − 0.0006 × Week3
= 152.0398 − 3.3093 × = 204.7740 − 7.9866 × Week
b. The best fitting model is the cubic regression model. c. $49.2956
Version 1
75
112) a. Model 1 – Linear: = −327.6840 + 61.0996 × Days Model 2 – Log-Log: exp(3.8726 + 1.0169 × In(Days) + 0.15222/2) Model 3 – Logarithmic:
= −3813.58 + 1642.6790 × In(Days)
Model 4 – Exponential: b. Model 1 – Linear:
= exp(6.1102 + 0.0357 × Days + 0.0591 2/2) increases by 61 when Days increases by 1.
Model 2 – Log-Log:
increases by 1.0169% when Days increases by 1%.
Model 3 – Logarithmic: Model 4 – Exponential:
increases by 16 when Days increases by 1%. increases by 4 when Days increases by 1.
=
c. Model 4 because its R 2 based on Cases is highest. d. 5480
113) TRUE 114) FALSE 115) TRUE 116) FALSE 117) FALSE 118) FALSE 119) TRUE 120) FALSE 121) FALSE 122) TRUE 123) TRUE 124) TRUE 125) FALSE 126) FALSE 127) TRUE
Version 1
76
CHAPTER 16 – 4e – Additional Test bank questions 1)
The model fit for y = β0+ β1x + ε and ln(y) = β0+ β1x + ε
A) cannot be compared using the computer-generated R2 values. B) can be compared using the computer-generated R2 values. C) can be compared using the computer-generated adjusted R2 values. D) can be compared using the p-values associated with the values of the F(df1,df2) test statistic.
2)
The model fit for ln(y) = β0+ β1x1+ ε and ln(y) = β0+ β1x1+ β2ln(x2) + ε
A) cannot be compared using any of the computer-generated output. B) can be compared using the computer-generated R2 values. C) can be compared using the computer-generated adjusted R2 values. D) can be compared using the p-values associated with the values of the F(df1,df2) test statistic.
3)
The following scatterplot shows y against xwith a superimposed trend line:
Which of the following regression models best fits the data?
Version 1
1
A) y =β0 +β1x + ε B) y =β0 +β1x +β2x2 + ε C) ln(y) =β0 +β1x + ε D) ln(y) =β0 +β1ln(x) + ε
4)
The following scatterplot shows y against x with a superimposed trend line:
Which of the following regression models best fits the data? A) y =β0 +β1x +ε B) y =β0 +β1x +β2x2+ε C) y =β0 +β1x +β2x2+β3x3+ε D) y =β0 +β1ln(x) +ε
5) Suppose that you estimate ln(y) =β0 +β1ln(x) +ε and obtain an estimate for β1 of 1.65. This means that if x increases by A) 1 unit, then y is expected to increase by 1.65%. B) 1%, then y is expected to increase by 1.65%. C) 1 unit, then y is expected to increase by 1.65 units. D) 1%, then y is expected to increase by 1.65 units.
Version 1
2
6) Suppose that you estimate y =β0+β1ln(x) +ε and obtain an estimate forβ1of −500. This means that if x increases by
A) 1 unit, then y is expected to decline by 500 units. B) 1%, then y is expected to decline by 500 units. C) 1 unit, then y is expected to decline by 50%. D) 1%, then y is expected to decline by 5 units.
Version 1
3
Answer Key Test name: Chap 16_4e_Additional TB Questions 1) A 2) C 3) B 4) D 5) B 6) D
Version 1
4
CHAPTER 17 – 4e 1)
Gender is an example of __________ variable.
2)
A dummy variable is also referred to as a(n) __________ variable.
3) Including as many dummy variables as there are categories is referred to as the dummy variable ________.
4) Using a ________ we can determine if a particular dummy variable is statistically significant.
5) To avoid the dummy variable ________, the number of dummy variables should be one less than the number of categories.
6) A dummy variable can be used to create a(n) ________ variable between the dummy variable and a numerical variable x, which allows the predicted y to differ between the two categories by a varying amount across the values of x.
7) In a model y = β0 + β1x + β2d + β3xd + ε, the ________ F test for the joint significance of d and xd can be performed.
8) Regression models that use a dummy variable as the response variable are called binary or discrete ________ models.
Version 1
1
9) A model formulated as y = β0 + β1x + ε = P(y = 1) + ε is called a(n) ________ probability model.
10)
The logit model cannot be estimated with standard ________ least squares procedures.
11) The method of maximum likelihood estimation (MLE) is used to ________ the parameters of a logit model.
12) A logit model ensures that the predicted probability of the binary response variable falls between ________.
13)
Which of the following variables is not categorical?
A) Gender of a person B) Religious affiliation C) Number of dependents claimed on a tax return D) Class level of a student (freshman, sophomore, etc.)
14) Numerical variables assume meaningful ________, whereas categorical variables represent some ________. A) categories, numeric values B) numeric values, categories C) categories, responses D) responses, categories
Version 1
2
15) Consider the model y = β0 + β1x + β2d + ε, where x is a numerical variable and d is a dummy variable. When d = 0, the predicted value of y is ________. A) <p><span style="font-family:monospace;"> = b0+ b1x + b2d B) <p><span style="font-family:monospace;"> = b0 + b1x C) <p><span style="font-family:monospace;"> = b0 + b2d D) <p><span style="font-family:monospace;"> = b0 + b1(x + d)
16) Consider the model y = β0 + β1x + β2d + ε, where x is a numerical variable and d is a dummy variable. When d = 1, the predicted value of y ________. A) <p><span style="font-family:monospace;"> = b0 + b1x + b2x B) <p><span style="font-family:monospace;"> = b0 + b1x C) <p><span style="font-family:monospace;"> = (b0 + b1)x + b2 D) <p><span style="font-family:monospace;"> = (b0 + b2) + b1x
17) <p>Consider the regression equation = b0 + b1x + b2d with a dummy variable d. If d changes from 0 to 1, the intercept changes by: A) b0 B) b0 + b1 C) b2 D) b0 + b2
18) For the model y = β0 + β1x + β2d + ε, which test is used for testing the significance of a dummy variable d?
Version 1
3
A) F test B) chi-square test C) z test D) t test
19) A researcher has developed the following regression equation to predict the prices of luxurious Oceanside condominium units, = 38 + 0.25 Size + 48 View, where Price = the price of a unit (in $1,000s), Size is the square footage (in sq. feet), and View is a dummy variable taking on 1 for an ocean view unit and 0 for a bay view unit. Which of the following is the predicted price of an ocean view unit with 1,700 square feet? A) $48,000 B) $511,000 C) $463,000 D) $38,000
20) <p>A researcher has developed the following regression equation to predict the prices of luxurious Oceanside condominium units, = 40 + 0.15 Size + 50 View, where Price = the price of a unit (in $1,000s), Size is the square footage (in square feet), and View is a dummy variable taking on 1 for an ocean view unit and 0 for a bay view unit. Which of the following is the predicted price of an ocean view unit with 1,500 square feet? A) $315,000 B) $50,000 C) $265,000 D) $40,000
Version 1
4
21) <p>A researcher has developed the following regression equation to predict the prices of luxurious Oceanside condominium units, = 40 + 0.15 Size + 50 View, where Price is the price of a unit (in $1,000s), Size is the square footage (in square feet), and View is a dummy variable taking on 1 for an ocean view unit and 0 for a bay view unit. Which of the following is the predicted price of a bay view unit measuring 1,500 square feet? A) $315,000 B) $50,000 C) $265,000 D) $40,000
22) <p>A researcher has developed the following regression equation to predict the prices of luxurious Oceanside condominium units, = 40 + 0.15 Size + 50View, where Price is the price of a unit (in $1,000s), Size is the square footage (in square feet), and View is a dummy variable taking on 1 for an ocean view unit and 0 for a bay view unit. Which of the following is the difference in predicted prices of the ocean view and bay view units with the same square footage? A) $315,000 B) $50,000 C) $265,000 D) $40,000
23) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε
Version 1
5
df Regression
SS
4
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
6.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε df Regression
SS
3
28,029,969.88
Residual
86
13,803,720.12
Total
86
41,833,690.00
Coefficients
Standard Error
9,343,323.29 58.21 162,508.37
t-stat
p-value
Intercept 4,713.25663.31
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
Which of the following is the regression equation found by Excel for Model A?
Version 1
6
A)
= 4663.31 + 140.66Educ + 3.36Exper + 1.17Train + 615.15Gender
B)
= 365.37 + 20.16Educ + 0.47Exper + 3.72Train + 97.33Gender
C)
=12.76 + 6.98Educ + 7.15Exper + 0.31Train + 6.32Gender
D)
= 140.66Educ + 3.36Exper + 1.17Train + 615.15Gender
24) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε df Regression
4
SS
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
6.32
1.16E08
Intercept
p-value
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε Version 1
7
df Regression
SS
3
28,029,969.88
Residual
86
13,803,720.12
Total
86
41,833,690.00
Coefficients
Standard Error
MS
F
9,343,323.29 58.21 162,508.37
t-stat
p-value
Intercept 4,713.25663.31
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
Using Model A, which of the following is the estimated average difference between the salaries of male and female employees with the same years of education, months of experience, and weeks of training? A) About $5,423 B) About $619 C) About $5,278 D) About $615
25) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3 Train + β4Gender + ε
Version 1
8
df Regression
SS
4
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε df Regression
SS
3
28,029,969.88
Residual
86
13,803,720.12
Total
89
41,833,690.00
Coefficients
Standard Error
9,343,323.29 58.21 162,508.37
t-stat
p-value
Intercept 4,713.25663.31
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
Which of the following explanatory variables in Model A is not significant at the 5% level?
Version 1
9
A) Educ B) Exper C) Train D) Gender
26) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε df Regression
4
SS
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
6.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε df
Version 1
SS
10
Regression
3
28,029,969.88
Residual
86
13,803,720.12
Total
89
41,833,690.00
Coefficients
Standard Error
9,343,323.29 58.21 162,508.37
t-stat
p-value
Intercept 4,713.25663.31
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
For a test of individual significance about gender, what is the conclusion at 5% significance level? A) Do not reject H0; we can conclude gender is significant. B) Reject H0; we can conclude gender is significant. C) Reject H0; we cannot conclude gender is significant. D) Do not reject H0; we cannot conclude gender is significant.
27) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε Regression Statistics Multiple R
0.8188
R2
0.6704
Version 1
11
Adjusted R2
0.6549
Standard Error
402.7513
Observations
90 df
Regression
SS
4
28,045,960.12
Residual
85
13,787,729.88
Total
89
41,833,690.00
MS
F
7,011,490.03 43.23 162,208.59
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε Regression Statistics Multiple R
0.8186
R2
0.6700
Adjusted
R2
Standard Error
0.6585 400.4350
Observation s
90 df
SS
Regression
3
28,029,969.8 8
9,343,323.2 9
Residual
86
13,803,720.1 2
162,508.37
Total
89
41,833,690.0 0
Coefficient s
Version 1
Standard Error
t-stat
58.2 1
p-value
12
Intercept
4,713.26
327.196 7
14.40
1.21E -24
Educ
139.54
19.7253
7.07
3.78E -10
Exper
3.35
0.4649
7.20
2.10E -10
Gender
609.25
94.9920
6.41
7.39E -09
The variable Train is deleted from Model A which results in Model B. Which of the following justifies this choice? A) The adjusted R2 Model B is lower than the adjusted R2 of Model A, and the variable is individually significant in Model A. B) The adjusted R2 of Model B is higher than the adjusted R2 of Model A, and the variable is individually significant in Model A. C) The adjusted R2 of Model B is lower than the adjusted R2 of Model A, and the variable is not individually significant in Model A. D) The adjusted R2 of Model B is higher than the adjusted R2 of Model A, and the variable is not individually significant in Model A.
28) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε df Regressi on Residual
Version 1
4 85
SS
MS
28,045,960. 12
7,011,490. 03
13,787,729. 88
162,208.59
F 43.2 3
13
Total
89 Coefficients
41,833,690. 00 Standard Error
t-stat
p-value
Intercep 4,663.3 t 1
365.37
12.76
1.85E -21
Educ
140.66
20.15
6.98
6.07E -10
Exper
3.36
0.47
7.17
2.54E -10
Train
1.17
3.72
0.31
0.754 3
Gender
615.15
97.33
0.32
1.16E -08
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε df Regressi on
3
SS 28,029,969. 88
9,343,323. 29 162,508.37
Residual
86
13,803,720. 12
Total
89
41,833,690. 00
Coefficients
MS
F 58.2 1
Standard Error
t-stat
p-value
Intercep 4,713.2 t 6
327.1967
14.40
1.21E -24
Educ
139.54
19.7253
7.07
3.78E -10
Exper
3.35
0.4649
7.20
2.10E -10
Gender
609.25
94.9920
6.41
7.39E -09
Using Model B, what is the regression equation for males? A)
= 4,713.26 + 139.5366Educ + 3.3488Exper + 609.25Gender
B)
= 5,322.51 + 139.5366Educ + 3.3488Exper
C)
= 4,713.26 + 139.5366Educ + 3.3488Exper
D)
= 4,663.31 + 140.6634Educ + 3.3566Exper
Version 1
14
29) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0+ β1Educ + β2Exper + β3Train + β4Gender + ε df Regressi on
4
SS 28,045,960. 12
7,011,490. 03 162,208.59
Residual
85
13,787,729. 88
Total
89
41,833,690. 00
Coefficients
MS
Standard Error
t-stat
F 43.2 3
p-value
Intercep 4,663.3 t 1
365.37
12.76
1.85E -21
Educ
140.66
20.15
6.98
6.07E -10
Exper
3.36
0.47
7.17
2.54E -10
Train
1.17
3.72
0.31
0.754 3
Gender
615.15
97.33
0.32
1.16E -08
Model B: Salary = β0+ β1Educ + β2Exper + β3Gender + ε df Regressi on Residual
Version 1
3 86
SS
MS
28,029,969. 88
9,343,323. 29
13,803,720. 12
162,508.37
F 58.2 1
15
Total
89 Coefficients
41,833,690. 00 Standard Error
t-stat
p-value
Intercep 4,713.2 t 6
327.1967
14.40
1.21E -24
Educ
139.54
19.7253
7.07
3.78E -10
Exper
3.35
0.4649
7.20
2.10E -10
Gender
609.25
94.9920
6.41
7.39E -09
Using Model B, what is the regression equation found by Excel for females? A)
= 4,713.26 + 139.5366Educ + 3.3488Exper + 609.25Gender
B)
= 5,322.51 + 139.5366Educ + 3.3488Exper
C)
= 4,713.26 + 139.5366Educ + 3.3488Exper
D)
= 4,663.31 + 140.6634Educ + 3.3566Exper
30) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0 + β1Educ + β2Exper + β3Train + β4Gender + ε df Regressio n Residual
Version 1
4 85
SS
MS
28,045,960. 12
7,011,490. 03
13,787,729. 88
162,208.59
F 43.2 3
16
Total
89 Coefficients
41,833,690. 00 Standard Error
t-stat
p-value
Intercept 4,663.3 1
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Model B: Salary = β0 + β1Educ + β2Exper + β3Gender + ε df Regressio n
3
SS 28,029,969.8 8
9,343,323.2 9 162,508.37
Residual
86
13,803,720.1 2
Total
89
41,833,690.0 0
Coefficients
58.2 1
Standard Error
t-stat
p-value
Intercept 4,713.2 6
327.1967
14.40
1.21E -24
Educ
139.54
19.7253
7.07
3.78E -10
Exper
3.35
0.4649
7.20
2.10E -10
Gender
609.25
94.9920
6.41
7.39E -09
Assuming the same years of education and months of experience, what is the null hypothesis for testing whether the mean salary of males is greater than the mean salary of females using Model B? A) H0: β3 ≤ 0 B) H0: β3 ≥ 0 C) H0: β3 > 0 D) H0: β3 = 0
Version 1
17
31) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0 + β1Educ + β2Exper + β3Train + β4Gender + ε df Regression
4
SS
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0 + β1Educ + β2Exper + β3Gender + ε df Regression
3
SS 28,029,969.88
9,343,323.29 162,508.37
Residual
86
13,803,720.12
Total
89
41,833,690.00
Coefficients
Version 1
Standard Error
t-stat
58.21
p-value
18
Intercept
4,713.26
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
Assuming the same years of education and months of experience, what is the p-value for testing whether the mean salary of males is greater than the mean salary of females using Model B? A) At least 0.025 B) Less than 0.025 but at least 0.01 C) Less than 0.01 but at least 0.005 D) Less than 0.005
32) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0 + β1Educ + β2Exper + β3Train + β4Gender + ε df Regression
4
SS 28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
Intercept
Version 1
MS
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
F 43.23
p-value 1.85E21 19
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Model B: Salary = β0 + β1Educ + β2Exper + β3Gender + ε df Regression
3
SS 28,029,969.88
9,343,323.29 162,508.37
Residual
86
13,803,720.12
Total
89
41,833,690.00
58.21
Coefficients
Standard Error
t-stat
4,713.26
327.1967
14.40
1.21E24
Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
Intercept
p-value
A group of female managers considers a discrimination lawsuit if on average their salaries can be statistically proven to be lower by more than $500 than the salaries of their male peers with the same level of education and experience. Using Model B, what is the alternative hypothesis for testing the lawsuit condition? A) HA:β3 ≤ 500 B) HA:β3 < 500 C) HA:β3 ≠ 500 D) HA:β3 > 500
Version 1
20
33) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected, and two models were created with the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses), Educ = the number of years of education, Exper = the number of months of experience, Train = the number of weeks of training, Gender = the gender of an individual; 1 for males, and 0 for females. The Excel partial outputs corresponding to these models are available and shown below. Model A: Salary = β0 + β1Educ + β2Exper + β3Train + β4Gender + ε df Regression
4
SS
MS
28,045,960.12
7,011,490.03 162,208.59
Residual
85
13,787,729.88
Total
89
41,833,690.00
F 43.23
Coefficients
Standard Error
t-stat
4,663.31
365.37
12.76
1.85E21
Educ
140.66
20.15
6.98
6.07E10
Exper
3.36
0.47
7.17
2.54E10
Train
1.17
3.72
0.31
0.7543
Gender
615.15
97.33
0.32
1.16E08
MS
F
Intercept
p-value
Model B: Salary = β0 + β1Educ + β2Exper + β3Gender + ε df Regression
3
SS 28,029,969.88
9,343,323.29 162,508.37
Residual
86
13,803,720.12
Total
89
41,833,690.00
Intercept
Version 1
Coefficients
Standard Error
t-stat
4,713.26
327.1967
14.40
58.21
p-value 1.21E-
21
24 Educ
139.54
19.7253
7.07
3.78E10
Exper
3.35
0.4649
7.20
2.10E10
Gender
609.25
94.9920
6.41
7.39E09
A group of female managers considers a discrimination lawsuit if on average their salaries could be statistically proven to be lower by more than $500 than the salaries of their male peers with the same level of education and experience. Using Model B, what is the conclusion of the appropriate test at 10% significance level? A) Do not reject H0; the salaries of female managers cannot be proven to be lower on average by more than $500. B) Reject H0; the salaries of female managers cannot be proven to be lower on average by more than $500. C) Do not reject H0; the salaries of female managers are lower on average by more than $500. D) Reject H0; the salaries of female managers are lower on average by more than $500.
34)
The number of dummy variables representing a categorical variable should be ________. A) one less than the number of categories of the variable B) two less than the number of categories of the variable C) the same number as the number of categories of the variable D) one more than the number of categories of the variable
35) Suppose that we have a categorical variable Month with categories: January, February, etc. How many dummy variables are needed to describe Month? A) 12 B) 11 C) 10 D) 9
Version 1
22
36) For the model y = β0 + β1x + β2d1 + β3d2 + ε, which of the following tests is used for testing the joint significance of the dummy variables d1 and d2? A) F test B) t test C) chi-square test D) z test
37) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. What is the regression equation for the summer days? A)
= (b0 + b3) + b1Temperature + b5Rain
B)
= b0 + b1Temperature + b5Rain
C)
= b0 + b1Temperature + b2Spring + b5Rain
D)
= b0 + b1Temperature + b2Spring + b4Fall + b5Rain
38) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. What is the regression equation for the summer rainy days?
Version 1
23
A)
= (b0 + b3) + b1Temperature
B)
= (b0 + b5) + b1Temperature
C)
= b0 + b1Temperature + b2Spring + b4Fall
D)
= (b0 + b3 + b5) + b1Temperature
39) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. What is the regression equation for the winter days? A)
= (b0 + b3) + b1Temperature + b5Rain
B)
= (b0 + b2 + b3 + b4) + b1Temperature + b5Rain
C)
= b0 + b1Temperature + b5Rain
D)
= (b0 + b5) + b1Temperature + b5Rain
40) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. What is the regression equation for the winter rainy days? A)
= (b0 + b3) + b1Temperature + b5Rain
B)
= (b0 + b5) + b1Temperature
C)
= (b0 + b2 + b3 + b4 + b5) + b1Temperature
D)
= (b0 + b2 + b3 + b4) + b1Temperature
Version 1
24
41) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. Assuming the same temperature and precipitation condition, what is the difference between the predicted humidity for summer and winter days? A) b0 + b1 + b5 B) b0 + b3 + b5 C) b3 D) b0 + b5
42) Consider the following regression model: Humidity = β0 + β1Temperature + β2Spring + β3Summer + β4Fall + β5Rain + ε, where the dummy variables Spring, Summer, and Fall represent the categorical variable Season (spring, summer, fall, winter), and the dummy variable Rain is defined as Rain = 1 if rainy day, Rain = 0 otherwise. Assuming the same temperature and precipitation condition, what is the difference between the predicted humidity for summer and fall days? A) b0 + b3 − b4 B) b3 − b4 C) b3 + b4 D) b0 + b4 − b3
43)
Which of the following regression models does not include an interaction variable? A) y = β0 + β1x + β2xd + ε B) y = β0 + β1x + β2d + ε C) y = β0 + β1d + β2xd + ε D) y = β0 + β1x + β2d + β3xd + ε
Version 1
25
44) <p>Consider the regression equation = b0 + b1xd with b1 > 0 and a dummy variable d. If d changes from 0 to 1, which of the following is true? A) The intercept increases by b0 + b1. B) The intercept increases by b1. C) The slope increases by b0 + b1. D) The slope increases by b1.
45)
The model y = β0 + β1x + β2d + β3xd + ε> is an example of a ________.
A) simple linear regression model B) linear regression model with only dummy variable C) linear regression model with dummy variable and numerical variable D) linear regression model with dummy variable, numerical variable, and interaction variable
46) In the model y = β0 + β1x + β2d + β3xd + ε, the dummy variable and the interaction variable cause ________. A) a change in just the intercept B) a change in just the slope C) a change in both the intercept as well as the slope D) a change in either the intercept or the slope
47) <p>In the regression equation = b0 + b1x + b2dx with a dummy variable d, when d changes from 0 to 1, the change in the slope of the corresponding lines is given by ________.
Version 1
26
A) b0 B) b0 + b1 C) b2 D) b0 + b2
48) In the model y = β0 + β1x + β2d + β3xd + ε, for a given x and d = 1, the predicted value of y is given bythe following EXCEPT ________. A) = b0 + b1x + b2 + b3x B) = b0 + b2 + b1x + b3x C) = (b0 + b2) + (b1 + b3)x D) = (b0+ b1) + (b2+ b3)x.
49) In the model y = β0 + β1x + β2d + β3xd + ε, when d changes from 0 to 1 how does the intercept of the corresponding lines change? A) From b0 to b0 + b1 B) From b0 to b0 + b2 C) From b0 to b0 + b3 D) From b0 to b0 + b1 + b2
50) For the model y = β0 + β1x + β2xd + ε, which of the following are the hypotheses for testing the individual significance of the interaction variable xd? A) H0:xd = 0,HA:xd ≠ 0 B) H0:b2 = 0,HA:b2 ≠ 0 C) H0:β2 = 0,HA:β2 ≠ 0 D) H0:β2 ≠ 0,HA:β2 = 0
Version 1
27
51) For a linear regression model with a dummy variable d and an interaction variable xd, we ________. A) cannot conduct a partial F test for the joint significance of d and xd B) can conduct a partial F test for the joint significance of d and xd C) cannot conduct t tests for the individual significance of d and xd D) can conduct a t test for the joint significance of d and xd
52) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, what is the null hypothesis for testing the joint significance of the variable Time and the interaction variable Time ×Prime?
Version 1
28
A) H0:β1 = β2 = β3 = 0 B) H0:β1 = 0 and β3 = 0 C) H0:β1 = 0 or β3 = 0 D) H0:β1 ≠ 0 or β3 ≠ 0
53) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, what is the value of the test statistic for testing the joint significance of the variable Time and the interaction variable Time ×Prime?
Version 1
29
A) −0.64 B) −5.36 C) 2.03 D) 2.74
54) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, what is the conclusion for testing the joint significance of the variable Time and the interaction variable Time × Prime at 5% significance level?
Version 1
30
A) Reject H0; we can conclude Time and Time × Prime are jointly significant. B) Do not reject H0; we can conclude Time and Time × Prime are jointly significant. C) Reject H0; we cannot conclude Time and Time × Prime arejointly insignificant. D) Do not reject H0; we cannot conclude Time and Time × Prime are jointly significant.
55) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Suppose that at the 5% significance level you do not reject the null hypothesis, H0:β1 = β3 = 0, when testing the joint significance of Time and Time × Prime in Model B. Should you delete both Time and Time × Prime from Model B?
Version 1
31
A) No, because Model B has the highest R2. B) No, because Model B reduces than to Model A with a lower adjusted R2. C) No, because Model B reduces than to Model A with lower R2. D) Yes, because Model B reduces than to Model C with higher adjusted R2.
56) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, which of the following is the alternative hypothesis for testing the significance of Time?
Version 1
32
A) HA:β1 = 0 B) HA:β1 = 0 and β3 = 0 C) HA:β1 ≠ 0 D) HA:β1 ≠ 0 or β3 ≠ 0
57) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Which of the following is the p-value for testing the individual significance of Time in Model B?
Version 1
33
A) Less than 0.10 B) Less than 0.20 but at least 0.10 C) Less than 0.40 but at least 0.20 D) More than 0.40
58) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Suppose that at a 10% significance level, you do not reject the null hypothesis, H0: β1 = 0, when testing the individual significance of Time in Model B. Would you delete Time from Model B?
Version 1
34
A) Yes, removing Time from Model B results in Model C which has a higher adjusted 2
R. B) No, removing Time from Model B results in Model C which has a with lower R2. C) No, Model B has the highest R2, so it should be used for making predictions. D) Yes, Time should be deleted because we could not prove its significance even for α = 0.10.
59) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable Intercept
Model A
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
SSE
1,532,480,000
1,369,126,091
1,381,128,299
R2
0.7254
0.7547
0.7526
Adjusted R2
0.7198
0.7388
0.7421
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Which of the following three models would you choose to make the predictions of the remaining loan balance?
Version 1
35
A) Model A B) Model B C) Model C D) Any model
60) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Model B
Model C
Intercept
87,919 (t = 77.75)
90,140 (t = 24.50)
87,954 (t = 80.84)
Time
N/A
−199 (t = −0.61)
N/A
Prime
−15,700 (t = −11.42)
−28,486 (t = −5.49)
−26,385 (t = −6.64)
N/A
699 (t = 2.14)
540 (t = 2.23)
1,533,250,000
1,369,128,243
1,381,127,339
0.7372
0.7447
0.7625
0.7316
0.7288
0.752
Time ×Prime SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, which of the following is the predicted balance on a $100,000 prime loan taken 15 years ago?
Version 1
36
A) $87,954 B) $69,669 C) $74,525 D) $82,117
61) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable
Model A
Intercept
Model B
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
−148 (t = −0.64)
N/A
Prime
−18,000 (t = −11.26)
−28,493 (t = −5.36)
−26,244 (t = −6.66)
Time × Prime
N/A
662 (t = 2.03)
514 (t = 2.27)
1,532,480,000
1,369,126,091
1,381,128,299
0.7254
0.7547
0.7526
0.7198
0.7388
0.7421
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, which of the following is the predicted balance on a $100,000 prime loan taken 15 years ago?
Version 1
37
A) $88,020 B) $69,486 C) $74,591 D) $82,183
62) A researcher wants to examine how the remaining balance on $100,000 loans taken 10 to 20 years ago depends on whether the loan was a prime or subprime loan. He collected a sample of 25 prime loans and 25 subprime loans and recorded the data in the following variables: Balance = the remaining amount of loan to be paid off (in $), Time = the time elapsed from taking the loan, Prime = a dummy variable assuming 1 for prime loans, and 0 for subprime loans. The regression results obtained for the models: Model A: Balance = β0 + β1Prime + ε Model B: Balance = β0 + β1Time + β2Prime + β3Time × Prime + ε Model C: Balance = β0 + β1Prime + β2Time × Prime + ε, are summarized in the following table. Variable Intercept
Model A
Model C
88,020 (t = 77.89) 90,269 (t = 24.35) 88,020 (t = 81.19)
Time
N/A
Prime
−18,000 (t = −11.26)
Time ×Prime
Model B −148 (t = −0.64)
N/A
−28,493 (t = −5.36) −26,244 (t = −6.66)
N/A
662 (t = 2.03)
514 (t = 2.27)
SSE
1,532,480,000
1,369,126,091
1,381,128,299
R2
0.7254
0.7547
0.7526
Adjusted R2
0.7198
0.7388
0.7421
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, what is the predicted balance on a $100,000 subprime loan taken 15 years ago?
Version 1
38
A) $88,020 B) $69,486 C) $74,591 D) $82,183
63) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a binary variable with 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the estimated regression model? A)
= 0.34Age − 40.86Drug + 0.70Age × Drug
B)
=3.15 + 0.07Age − 4.08Drug + 0.09Age × Drug
C)
= 14.94 + 5.06Age − 10.02Drug + 7.94Age × Drug
D)
=47.04 + 0.34Age − 40.86Drug + 0.70Age × Drug
Version 1
39
64) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Regression Statistics Multiple R
0.9761
R2
0.9527
Adjusted
R2
0.9463
Standard Error
3.2346
Observations
26 Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the percentage of variation in Effectiveness explained by the model? A) 97.61% B) 95.27% C) 94.63% D) 32.35%
Version 1
40
65) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug, The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the estimated regression model for the old drug? A)
= 47.04 + 0.34Age
B)
= 6.18 + 1.04Age
C)
= 47.04 + 0.34Age + 0.70Age × Drug
D)
= 47.04 + 0.34Age − 40.86Drug + 0.70Age × Drug
Version 1
41
66) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the estimated regression model for the new drug? A)
= 47.04 + 0.34Age
B)
= 6.18 + 1.04Age
C)
= 47.04 + 0.34Age + 0.70Age × Drug
D)
= 47.04 + 0.34Age − 40.86Drug + 0.70Age × Drug
67) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected.
Version 1
42
New Drug: Old Drug:
Patient Age
19 23 67 56 45 37 27 47 29 59 51 63 28
Effectiveness
32 10 76 62 50 60 27 63 40 77 66 75 24
Patient Age
21 28 33 33 38 43 48 53 53 58 63 67 52
Effectiveness
55 57 66 36 61 63 63 68 78 78 52 54 70
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.57
3.1513
15.0954
6.73E-13
Age
0.3
0.0665
4.5113
1.46E-03
Drug
−39.81
4.0816
−9.7535
1.81E-09
Age × Drug
0.70
0.0825
8.4848
7.12E-08
For which age is the predicted effectiveness of the old and new drug about the same? A) 42 years B) 49 years C) 57 years D) The predicted effectiveness of the new drug is higher for any age.
68) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
Version 1
43
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
For which age is the predicted effectiveness of the old and new drug about the same? A) 43 years B) 50 years C) 58 years D) The predicted effectiveness of the new drug is higher for any age.
69) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, , is estimated and the following results are obtained.
Version 1
44
Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is true? A) The predicted effectiveness of the new drug is not higher for any age. B) The predicted effectiveness of the new drug is higher only for young patients. C) The predicted effectiveness of the new drug is higher only for old patients. D) The predicted effectiveness of the new drug is higher for any age.
70) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Old Drug:
Patient Age
19 23 67 56 45 37 27 47 29 59 51 63 28
Effectiveness
34 28 80 59 44 46 34 49 43 79 60 76 19
Patient Age
21 28 33 33 38 43 48 53 53 58 63 67 52
Effectiveness
53 63 73 31 59 49 55 56 92 76 51 42 65
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
46.55
3.1387
14.831
4.65E-13
Age
0.33
0.0755
4.3709
1.79E-03
Drug
−41.95
4.0609
−10.3302
2.36E-09
Age × Drug
0.70
0.0857
8.168
3.74E-08
Which of the following is the predicted effectiveness of the new drug for a 63-year-old patient?
Version 1
45
A) 64.80 B) 61.16 C) 17.76 D) 67.34
71) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, , is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the predicted effectiveness of the old drug for a 62-year-old patient? A) 21.08 B) 64.48 C) 68.12 D) 70.66
Version 1
46
72) An over-the-counter drug manufacturer wants to examine the effectiveness of a new drug in curing an illness most commonly found in older patients. Thirteen patients are given the new drug and 13 patients are given the old drug. To avoid bias in the experiment, they are not told which drug is given to them. To check how the effectiveness depends on the age of patients, the following data have been collected. New Drug: Patient Age
19
23
67
56
45
37
27
47
29
59
51
63
28
Effectiveness
28
25
71
62
50
46
34
59
36
71
62
71
35
Patient Age
21
28
33
33
38
43
48
53
53
58
63
67
52
Effectiveness
56
55
61
54
58
65
64
61
69
71
66
70
60
Old Drug:
The variables are Effectiveness = the response variable measured on a scale from 0 to 100, Age = the age of a patient (in years), Drug = a dummy variable that is 1 for the new drug and 0 for the old drug. The regression model, Effectiveness = β0 + β1Age + β2Drug + β3Age × Drug, is estimated and the following results are obtained. Coefficients
Standard Error
t-stat
p-value
Intercept
47.04
3.1479
14.9423
5.31E-13
Age
0.34
0.0665
5.0610
4.55E-05
Drug
−40.86
4.0798
−10.016
1.17E-09
Age × Drug
0.70
0.0879
7.9385
6.71E-08
Which of the following is the predicted effectiveness of the new drug for a 62-year-old patient? A) 21.08 B) 64.48 C) 68.12 D) 70.66
73)
Which of the following predictions can be described by a binary choice model?
Version 1
47
A) Predict the day’s temperature in degrees Fahrenheit. B) Predict whether a student will pass or fail a test. C) Predict today’s price of gasoline per gallon in dollars. D) Predict the rainfall in millimeters today.
74)
Which of the following predictions cannot be described by a binary choice model? A) Predict the ability to swim through the English Channel. B) Predict the chances of a candidate winning the next presidential election. C) Predict today’s number of cars crossing the Golden Gate Bridge. D) Predict the occurrence of a hurricane in Florida next year.
75) For the linear probability model y = β0 + β1x + ε, the predicted value of y is always constrained to be ________. A) between 0 and 1. B) positive. C) negative. D) between minus infinity to infinity.
76) The major shortcoming of the general linear probability model y = β0 + β1x1 + β2x2 + … + βkxk + ε, is that the predicted values of y can be sometimes ________. A) greater than 0 and less than 1 B) at least 0 and no more than 1 C) less than 1 but more than 0 D) less than 0 or greater than 1
77)
Which of the following represents a logit regression model?
Version 1
48
A) P = β0 + β1x B) C) P = β0 + β1 ln(x) D) P = exp(β0 + β1x)
78)
Which of the following is an estimated logit model? A) B) C) D)
79) The major advantage of a logit model over a linear probability model is that the predicted values of y are always between ________. A) −1 and 1 B) 0 and 2 C) 0 and 1 D) −1 and 0
80)
A logit model can be estimated with the method of ________. A) ordinary standard least squares B) maximum likelihood estimation C) nonparametric estimation D) ordinary least squares
Version 1
49
81) A medical researcher is interested in assessing the probability of some symptoms of coronary heart disease (CHD) being present at a given age. For this reason, he uses the following data collected on 100 individuals. Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
20
0
30
1
37
0
43
0
48
1
55
1
60
23
0
32
0
37
1
43
0
48
1
56
1
60
24
0
32
0
37
0
43
1
49
0
56
1
61
25
0
33
0
38
0
44
0
49
0
56
1
62
25
1
33
0
38
0
44
0
49
1
57
0
62
26
0
34
0
39
0
44
1
50
0
57
0
63
26
0
34
0
39
1
44
1
50
1
57
1
64
28
0
34
1
40
0
45
0
51
0
57
1
64
28
0
34
0
40
1
45
1
52
0
57
1
65
29
0
34
0
41
0
46
0
52
1
57
1
69
30
0
35
0
41
0
46
1
53
1
58
30
0
35
0
42
0
47
0
53
1
58
30
0
36
0
42
0
47
0
54
1
58
30
0
36
1
42
0
47
1
55
0
59
30
0
36
0
42
1
48
0
55
1
59
The response variable is CHD which is 1 if coronary heart disease is present and 0 otherwise. The following linear probability model and logit model have been estimated:
Linear probability:
= − 0.53796 + 0.02181Age
Logit: Using the linear probability model, what is the estimated probability of coronary heart disease for a 60-year-old individual?
Version 1
50
A) 0.5437 B) 0.2231 C) 0.7706 D) 0.7946
82) A medical researcher is interested in assessing the probability of some symptoms of coronary heart disease being present at a given age. For this reason, he uses the following data collected on 100 individuals. Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
20
0
30
1
37
0
43
0
48
1
55
1
60
23
0
32
0
37
1
43
0
48
1
56
1
60
24
0
32
0
37
0
43
1
49
0
56
1
61
25
0
33
0
38
0
44
0
49
0
56
1
62
25
1
33
0
38
0
44
0
49
1
57
0
62
26
0
34
0
39
0
44
1
50
0
57
0
63
26
0
34
0
39
1
44
1
50
1
57
1
64
28
0
34
1
40
0
45
0
51
0
57
1
64
28
0
34
0
40
1
45
1
52
0
57
1
65
29
0
34
0
41
0
46
0
52
1
57
1
69
30
0
35
0
41
0
46
1
53
1
58
30
0
35
0
42
0
47
0
53
1
58
30
0
36
0
42
0
47
0
54
1
58
30
0
36
1
42
0
47
1
55
0
59
30
0
36
0
42
1
48
0
55
1
59
Version 1
51
The response variable is CHD which is 1 if coronary heart disease is present and 0 otherwise. The following linear probability model and logit model have been estimated: Linear probability:
= − 0.53796 + 0.02181Age
Logit: Using the logit model, what is the estimated probability of coronary heart disease for a 60-yearold individual? A) 0.4581 B) 0.3219 C) 0.7706 D) 0.7946
83) A medical researcher is interested in assessing the probability of some symptoms of coronary heart disease being present at a given age. For this reason, he uses the following data collected on 100 individuals. Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
20
0
30
1
37
0
43
0
48
1
55
1
60
23
0
32
0
37
1
43
0
48
1
56
1
60
24
0
32
0
37
0
43
1
49
0
56
1
61
25
0
33
0
38
0
44
0
49
0
56
1
62
25
1
33
0
38
0
44
0
49
1
57
0
62
26
0
34
0
39
0
44
1
50
0
57
0
63
26
0
34
0
39
1
44
1
50
1
57
1
64
28
0
34
1
40
0
45
0
51
0
57
1
64
28
0
34
0
40
1
45
1
52
0
57
1
65
29
0
34
0
41
0
46
0
52
1
57
1
69
30
0
35
0
41
0
46
1
53
1
58
30
0
35
0
42
0
47
0
53
1
58
30
0
36
0
42
0
47
0
54
1
58
30
0
36
1
42
0
47
1
55
0
59
Version 1
52
30
0
36
0
42
1
48
0
55
1
59
The response variable is CHD which is 1 if coronary heart disease is present and 0 otherwise. The following linear probability model and logit model have been estimated:
Linear probability:
= − 0.53796 + 0.02181Age
Logit: For the linear probability model, what is the range of values of Age for which regarded as the estimated probability of coronary heart disease?
can be
A) From 30.5 to 60.3 B) From 24.7 to 70.5 C) From 20.6 to 60.9 D) From 28.2 to 79.2
84) A medical researcher is interested in assessing the probability of some symptoms of coronary heart disease being present at a given age. For this reason, he uses the following data collected on 100 individuals. Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
20
0
30
1
37
0
43
0
48
1
55
1
60
23
0
32
0
37
1
43
0
48
1
56
1
60
24
0
32
0
37
0
43
1
49
0
56
1
61
25
0
33
0
38
0
44
0
49
0
56
1
62
25
1
33
0
38
0
44
0
49
1
57
0
62
26
0
34
0
39
0
44
1
50
0
57
0
63
26
0
34
0
39
1
44
1
50
1
57
1
64
28
0
34
1
40
0
45
0
51
0
57
1
64
28
0
34
0
40
1
45
1
52
0
57
1
65
29
0
34
0
41
0
46
0
52
1
57
1
69
30
0
35
0
41
0
46
1
53
1
58
30
0
35
0
42
0
47
0
53
1
58
Version 1
53
30
0
36
0
42
0
47
0
54
1
58
30
0
36
1
42
0
47
1
55
0
59
30
0
36
0
42
1
48
0
55
1
59
The response variable is CHD which is 1 if coronary heart disease is present and 0 otherwise. The following linear probability model and logit model have been estimated:
= − 0.53796 + 0.02181Age
Linear probability: Logit :
Using the linear probability model, find the age of a person for which the estimated chances of having coronary heart disease are 50%? A) 47.8 years old B) 56.3 years old C) 45.5 years old D) 47.6 years old
85) A medical researcher is interested in assessing the probability of some symptoms of coronary heart disease being present at a given age. For this reason, he uses the following data collected on 100 individuals. Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
CHD
Age
20
0
30
1
37
0
43
0
48
1
55
1
60
23
0
32
0
37
1
43
0
48
1
56
1
60
24
0
32
0
37
0
43
1
49
0
56
1
61
25
0
33
0
38
0
44
0
49
0
56
1
62
25
1
33
0
38
0
44
0
49
1
57
0
62
26
0
34
0
39
0
44
1
50
0
57
0
63
26
0
34
0
39
1
44
1
50
1
57
1
64
28
0
34
1
40
0
45
0
51
0
57
1
64
28
0
34
0
40
1
45
1
52
0
57
1
65
29
0
34
0
41
0
46
0
52
1
57
1
69
Version 1
54
30
0
35
0
41
0
46
1
53
1
58
30
0
35
0
42
0
47
0
53
1
58
30
0
36
0
42
0
47
0
54
1
58
30
0
36
1
42
0
47
1
55
0
59
30
0
36
0
42
1
48
0
55
1
59
The response variable is CHD which is 1 if coronary heart disease is present and 0 otherwise. The following linear probability model and logit model have been estimated:
= − 0.53796 + 0.02181Age
Linear probability: Logit :
Using the logit model, find the age of a person for which the estimated chances of having coronary heart disease are 50%? A) 47.8 years old B) 56.3 years old C) 45.5 years old D) 47.6 years old
86) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with an increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data also include: average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. The Excel partial output corresponding to this model is shown below. Coefficients Intercept
Version 1
17.65
Standard Error
t-stat
p-value
1.45
12.19
0.00
55
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Woman
3.20
0.74
4.35
0.00
Which of the following is the estimated regression equation for the life expectancy? A)
=17.65 + 0.02Income − 0.69Drinks + 3.20Woman
B)
= 20.85 + 0.02Income − 0.69Drinks
C)
= 12.19 + 0.94Income − 2.62Drinks + 4.35Woman
D)
= 16.54 + 0.94Income − 2.62Drinks
87) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with an increase in life expectancy. Others have also linked longevity with income and gender. Data were collected related to the length of life after 65 (Life) were collected including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.65
1.45
12.19
0.00
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Woman
3.20
0.74
4.35
0.00
Which of the following is the predicted life expectancy for a 65-year-old man with an income of $50,000 and a consumption of three drinks per day? A) 19.78 B) 16.58 C) 13.38 D) 18.70
Version 1
56
88) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with the increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data were collected related to the length of life after 65 (Life) including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.65
1.45
12.19
0.00
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Woman
3.20
0.74
4.35
0.00
Which of the following is the predicted life expectancy for a 65-year-old woman with an income of $60,000 and a consumption of one drink per day? A) 19.78 B) 15.21 C) 21.36 D) 16.58
89) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with the increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data were collected related to the length of life after 65 (Life) including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.95
1.42
12.64
0.00
Income
0.04
0.04
0.94
0.35
Drinks
−0.65
0.27
−2.41
0.01
Version 1
57
Woman
3.30
0.87
3.79
0.00
Which of the following is the difference in life expectancy for 65-year-old women consuming two drinks per day and men consuming one drink per day, and having the same income? A) 3.03 B) 4.03 C) 3.34 D) 2.65
90) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with the increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data were collected related to the length of life after 65 (Life) including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.65
1.45
12.19
0.00
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Woman
3.20
0.74
4.35
0.00
Wh ich of the following is the difference in life expectancy for 65-year-old women consuming two drinks per day and men consuming one drink per day, and having the same income? A) 2.89 B) 3.89 C) 3.20 D) 2.51
Version 1
58
91) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with an increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data were collected related to the length of life after 65 (Life) including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.65
1.45
12.19
0.00
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Woman
3.20
0.74
4.35
0.00
Which of the following are the competing hypotheses to determine if women live longer than men? A) H0: B3 ≤ 0; HA: B3 > 0 B) H0: B3 ≥ 0; HA: B3 < 0 C) H0: B3 = 0; HA: B3 ≠ 0? D) H0: B2 = b3; HA: B2 ≠ b3
92) According to the Center for Disease Control and Prevention, life expectancy at age 65 in America is about 18.7 years. Medical researchers have argued that while excessive drinking is detrimental to health, drinking a little alcohol every day, especially wine, may be associated with the increase in life expectancy. Others have also linked longevity with income and gender. The data relating to the length of life after 65 (Life) were collected. Data were collected related to the length of life after 65 (Life) including the average income (in $1,000s) (Income), the average number of alcoholic drinks consumed per day (Drinks), and a dummy variable Woman which is 1 when the individual is a woman and 0 otherwise. The following model was created: Life = β0 + β1Income + β2Drinks + β3Woman + ε. Output corresponding to this model is shown below. Coefficients
Standard Error
t-stat
p-value
Intercept
17.65
1.45
12.19
0.00
Income
0.02
0.02
0.94
0.35
Drinks
−0.69
0.26
−2.62
0.01
Version 1
59
Woman
3.20
0.74
4.35
0.00
Using a p-value approach, which of the following conclusions is correct at 5% significance level? A) Do not reject H0; conclude that women live longer than men on average. B) Reject H0; conclude that women live longer than men on average. C) Do not reject H0; cannot conclude that women live longer than men on average. D) Reject H0; cannot conclude that women live longer than men on average.
93) To encourage performance, loyalty, and continuing education, the human resources department at a large company wants to develop a regression-based compensation model for mid-level managers based on three variables: business unit-profitability in $1,000s (Profit), years with company (Years), and a dummy variable (Grad) which takes on 1 when the individual holds a graduate degree in a relevant field and 0 otherwise. Data have been collected for 36 managers. The sample regression equation is = − 13,264.3 + 14.13Profit + 1,868.53Years + 45,121.94Grad − 0.1539Profit × Grad − 198.34 Years × Grad. Which of the following is the predicted value of Compensation for a manager having 15 years with the company, a graduate degree, and a business-unit profit of $4,800,000? A) $121,492.60 B) $108,324.80 C) $101,749.70 D) $112,101.60
Version 1
60
94) To encourage performance, loyalty, and continuing education, the human resources department at a large company wants to develop a regression-based compensation model for mid-level managers based on three variables: business unit-profitability in $1,000s (Profit), years with company (Years), and a dummy variable (Grad) which takes on 1 when the individual holds a graduate degree in a relevant field and 0 otherwise. Data have been collected for 36 managers. The sample regression equation is = −13,277.00 + 14.86Profit + 1,867.47Years + 45,122.46Grad − 0.1508Profit × Grad − 197.70Years × Grad. Which of the following is the difference in the predicted compensations for two managers, one having a graduate degree, the other one not, both having 10 years with the company, and having the same business-unit profit of $3,000,000? A) $426,798.01 B) $422,273.80 C) $420,019.00 D) $42,693.06
95) To encourage performance, loyalty, and continuing education, the human resources department at a large company wants to develop a regression-based compensation model for mid-level managers based on three variables: business unit-profitability in $1,000s (Profit), years with company (Years), and a dummy variable (Grad) which takes on 1 when the individual holds a graduate degree in a relevant field and 0 otherwise. Data have been collected for 36 managers. The sample regression equation is = − 13,264.3 + 14.13Profit + 1,868.53Years + 45,121.94Grad − 0.1539Profit × Grad − 198.34Years × Grad. Which of the following is the difference in the predicted compensations for two managers, one having a graduate degree, the other one not, both having 10 years with the company, and having the same business-unit profit of $3,000,000? A) $34,456.51 B) $38,980.72 C) $41,235.52 D) $47,566.51
Version 1
61
96) Like any other university, Seton Hall University uses SAT scores and high school GPAs as a primary criteria for admission. Data for 30 students who had recently applied to Seton Hall were collected including admission (1 for admission and 0 otherwise), SAT score, and GPA. Output for the model is shown below. Predictor
Coefficients
Standard Error
p-value
−19.4786
8.2917
0.019
SAT
0.0065
0.0025
0.010
GPA
2.7800
1.4081
0.048
Constant
Which of the following is the estimated logistic model? A) B) = exp(–19.4786 + 0.0065x1 + 2.78x2) C) D) = −19.4786 + 0.0065x1 + 2.78x2
97) Like any other university, Seton Hall University uses SAT scores and high school GPAs as a primary criteria for admission. Data for 30 students who had recently applied to Seton Hall were collected including admission (1 for admission and 0 otherwise), SAT score, and GPA. Output for the model is shown below. Predictor
Coefficients
Standard Error
p-value
−19.4786
8.2917
0.019
SAT
0.0065
0.0025
0.010
GPA
2.7800
1.4081
0.048
Constant
Which of the following is the predicted probability of admission for an individual with a GPA of 3.5 and an SAT score of 1,700? A) 0.88 B) 0.79 C) 0.97 D) 0.82
Version 1
62
98) Like any other university, Seton Hall University uses SAT scores and high school GPAs as a primary criteria for admission. Data for 30 students who had recently applied to Seton Hall were collected including admission (1 for admission and 0 otherwise), SAT score, and GPA. Output for the model is shown below. Predictor
Coefficients
Standard Error
p-value
−19.4786
8.2917
0.019
SAT
0.0065
0.0025
0.010
GPA
2.7800
1.4081
0.048
Constant
Which of the following statements is correct about the individual significance of the variables at the 5% significance level? A) SAT is individually significant and GPA is not. B) GPA is individually significant and SAT is not. C) Both SAT and GPA are individually significant. D) Neither SAT nor GPA is individually significant.
99) A dummy variable can be used in a regression model in the following manner EXCEPT __________. A) as a categorical response variable only B) as a categorical response variable and a categorical explanatory variable C) as a numerical explanatory variable D) as a categorical explanatory variable only
Version 1
63
100) Consider the regression equation = b0+ b1x + b2d + b3xd to estimate a regression model y = β0 + β1x + β2d + β3xd + ε with a dummy variable d, a numerical variable x, and an interaction variable xd. We plot (on the vertical axis) against x (on the horizontal axis) using two lines to represent d = 0 and d = 1, given b2 > 0 and b3 > 0, as shown in the figure below:
Which of the following statements regarding the given figure above is FALSE? A) The red line represents d = 1. B) The green line’s intercept is b0. C) The red line’s intercept is b0 + b2. D) The green line’s slope is b2.
101)
Which of the following is not an appropriate application of a linear probability model? A) Whether to answer with a Yes, No, or Maybe. B) Whether or not to vote for Trump. C) Whether or not to retire. D) Whether or not to buy local.
Version 1
64
102)
Consider the given plot of Predicted Probability ( y-axis) against x ( x-axis), which of
the following statements is false? A) The blue line is a depiction of the linear probability model. B) The brown line is a depiction of the logistic regression model. C) The regression coefficient of x, i.e., b 1 of both lines is positive. D) The regression coefficient of x, i.e., b 1 of the brown line implies that for any 1unit increase in x, the predicted probability increases by b 1.
103) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected and the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses) Educ = the number of years of education Exper = the number of months of experience Gender = the gender of an individual; 1 for males, and 0 for females The regression results obtained for the models are as follows: Model A: Salary = β0 + β1 Educ + β2 Exper + β3 Gender + β4 Exper × Gender + ε, Model B: Salary= β0 + β1 Educ + β2 Exper + ε, are summarized in the following table. Variable
Model A
Model B
4,708.31 (t = 14.63)
94,256.90 (t = 11.03)
Educ
135.13 (t = 6.92)
178.53 (t = 7.87)
Exper
4.10 (t = 6.87)
3.46 (t = 6.17)
Gender
799.34 (t = 5.94)
N/A
Exper ×Gender
−1.82 (t = −1.96)
N/A
SSE
13,204,523.17
20,406,331.06
R2
0.6844
0.5122
Intercept
Version 1
65
Adjusted R2
0.6695
0.5010
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Define the alternative hypothesis for testing the joint significance of Exper and Exper × Gender in Model A.
104) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected and the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses) Educ = the number of years of education Exper = the number of months of experience Gender = the gender of an individual; 1 for males, and 0 for females The regression results obtained for the models are as follows: Model A: Salary = β0 + β1 Educ + β2 Exper + β3 Gender + β4 Exper × Gender + ε, Model B: Salary= β0 + β1 Educ + β2 Exper + ε, are summarized in the following table. Variable
Model A
Model B
4,708.31 (t = 14.63)
94,256.90 (t = 11.03)
Educ
135.13 (t = 6.92)
178.53 (t = 7.87)
Exper
4.10 (t = 6.87)
3.46 (t = 6.17)
Gender
799.34 (t = 5.94)
N/A
Exper ×Gender
−1.82 (t = −1.96)
N/A
SSE
13,204,523.17
20,406,331.06
R2
0.6844
0.5122
Adjusted R2
0.6695
0.5010
Intercept
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. What is the value of the test statistic for testing the joint significance of Exper and Exper × Gender in Model A?
Version 1
66
105) To examine the differences between salaries of male and female middle managers of a large bank, 90 individuals were randomly selected and the following variables considered: Salary = the monthly salary (excluding fringe benefits and bonuses) Educ = the number of years of education Exper = the number of months of experience Gender = the gender of an individual; 1 for males, and 0 for females The regression results obtained for the models are as follows: Model A: Salary = β0 + β1 Educ + β2 Exper + β3 Gender + β4 Exper × Gender + ε, Model B: Salary= β0 + β1 Educ + β2 Exper + ε, are summarized in the following table. Variable
Model A
Model B
4,708.31 (t = 14.63)
94,256.90 (t = 11.03)
Educ
135.13 (t = 6.92)
178.53 (t = 7.87)
Exper
4.10 (t = 6.87)
3.46 (t = 6.17)
Gender
799.34 (t = 5.94)
N/A
Exper ×Gender
−1.82 (t = −1.96)
N/A
13,204,523.17
20,406,331.06
0.6844
0.5122
0.6695
0.5010
Intercept
SSE R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. At 1% significance level, what is the conclusion of testing the joint significance of Exper and Exper × Gender in Model A?
Version 1
67
106) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, define the null hypothesis for testing the joint significance of the two dummy variables.
Version 1
68
107) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, compute the test statistic for testing the joint significance of the two dummy variables Loc1 and Loc2.
Version 1
69
108) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, what is the conclusion for testing the joint significance of the two dummy variables at the 1% significance level?
Version 1
70
109) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model C, define the alternative hypothesis for testing the individual significance of Age.
Version 1
71
110) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Obtain the p-value for testing the individual significance of Age in Model C.
Version 1
72
111) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Suppose that at the 10% significance level, you do not reject the null hypothesis, H0: β2 = 0, when testing the significance of Age in Model C. Would you delete Age from Model C?
Version 1
73
112) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Which of these three models would you choose to make the predictions of the home prices?
Version 1
74
113) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. What is the regression equation for Model B?
Version 1
75
114) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, compute the predicted difference between the price of homes with the same square footage located in Location 2 and Location 3.
Version 1
76
115) A realtor wants to predict and compare the prices of homes in three neighboring locations. She considers the following linear models: Model A:Price = β0 + β1 Size + β2 Age + ε Model B:Price = β0 + β1 Size + β3 Loc1 + β4 Loc2 + ε Model C:Price = β0 + β1 Size + β2 Age + β3 Loc1 + β4 Loc2 + ε where, Price = the price of a home (in $1,000s) Size = the square footage (in square feet) Loc1 = a dummy variable taking on 1 for Location 1, and 0 otherwise Loc2 = a dummy variable taking on 1 for Location 2, and 0 otherwise After collecting data on 52 sales and applying regression, her findings were summarized in the following table. Variable
Model A
Model B
Model C
Intercept
52.91 (t = 1.31)
61.63 (t = 2.35)
67.74 (t = 2.35)
Size
0.08 (t = 5.24)
0.06 (t = 5.83)
0.06 (t = 5.49)
Age
−1.15 (t = −1.48)
N/A
−0.32 (t = −0.53)
Loc1
N/A
56.12 (t = 7.30)
54.65 (t = 6.65)
Loc2
N/A
38.34 (t = 4.93)
39.01 (t = 4.92)
SSE
55,640.86
25,916.01
25,761.21
0.4212
0.7304
0.7320
0.3976
0.7136
0.7092
R2 Adjusted
R2
Note: The values of relevant test statistics are shown in parentheses below the estimated coefficients. Using Model B, compute the predicted price of a 2,500-square-foot home in Location 1.
116) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Version 1
77
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. What is the estimated logit model?
117) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
Version 1
78
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. What are the conclusions of the tests of individual significance for the balance ratio and the debt ratio at the 1% level?
118) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Version 1
79
Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 – Balance Ratio
0.13 (p-value = 0.0054)
x2 – Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. Compute the predicted probability of default for a person whose balance ratio is 5% and debt ratio is 30%.
119) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Version 1
80
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. Assuming the debt ratio 30%, compute the increase in the probability of defaulting when the balance ratio goes up from 5% to 15%.
120) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. What will be the rating for a person with a balance ratio of 15% and a debt ratio of 30%?
Version 1
81
121) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. Assuming the debt ratio is 30%, what is the range of values for the balance ratio that yields the fair rating?
Version 1
82
122) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. If only applicants with excellent and good ratings are qualified for a loan, find a linear relation between their balance ratio and their debt ratio that must be satisfied to be qualified.
123) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
Version 1
83
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. Bob has a balance ratio of 10%, an annual income of $80,000, and $15,000 in total debt. Only applicants with excellent and good ratings qualify for a loan. Find the maximum amount of loan Bob can get if he is required to maintain his excellent or good rating after getting this amount.
124) A bank manager is interested in assigning a rating to the holders of credit cards issued by her bank. The rating is based on the probability of defaulting on credit cards and is as follows. Rating
Probability of Defaulting
Excellent
p ≤ 0.05
Good
0.05 < p ≤ 0.10
Fair
0.10 < p ≤ 0.25
Poor
p > 0.25
Version 1
84
<p>To estimate this probability, she decided to use the logit model, where y = a binary response variable which is 1 if the credit card is in default and 0 otherwise x1 = the ratio of the credit card balance to the credit card limit (in %) x2 = the ratio of the total debt to the annual income (in %) The following output is obtained. Variable
Logistic Model
Intercept
−8.98 (p-value = 0.0089)
x1 − Balance Ratio
0.13 (p-value = 0.0054)
x2 − Debt Ratio
0.17 (p-value = 0.0067)
Note: The p-values of the corresponding tests are shown in parentheses below the estimated coefficients. (Using Excel or R) Suppose that only applicants with excellent and good ratings are qualified for a loan. Assume that the balance ratio, x1, of those who apply is normally distributed with μ1 = 18% and σ2 = 6%, while their debt ratio, x2, is normally distributed with μ2 = 30% and σ2 = 8%. Assume also that x1 and x2 are independent. Simulate 1,000 applications to estimate the percent of those that are qualified for a loan.
125) A realtor wants to better understand the factors that explain differences in home values in Detroit area. He decides to estimate two models: Model 1 uses crime rate per capita as the explanatory variable. Model 2 uses crime rate per capita and accessibility index (1-low; 2medium; 3-high) as explanatory variables. A portion of the data he collects is shown below. Home Value (in $1,000s)
Accessibility Index
Crime Rate per Capita
397.528
2
1.35472
:
:
:
547.095
3
0.06905
Version 1
85
{MISSING IMAGE}Click here for the Excel Data File a. Estimate Model 1. b. Estimate Model 2 with accessibility index 3-high as the reference category. c. Determine whether home values differ between an accessibility index of 1 and 3 as well as 2 and 3 at the 5% significance level. d. How do you reformulate Model 2 to determine whether home values differ between an accessibility index of 1 and 2 at the 5% significance level? e. Select the best fit model. f. Use the selected model to predict home values for a 1.0 crime rate per capita over various accessibility indices.
126) A financial analyst wants to test whether a company’s assets (in $ millions) depend on its size (small = 0 and big = 1) and # of employees. A portion of the relevant data on 25 companies is shown below: Assets
Size
# of employees
615.2
small
7523
:
:
:
10751
big
82300
{MISSING IMAGE}Click here for the Excel Data File a. Estimate Model 1 with # of employees and size as the explanatory variables. b. Estimate Model 2 with # of employees, size, and an interaction variable (# of employees × size) as the explanatory variables. c. Which of the models is preferred? d. Use the preferred model to estimate assets for a small company with 5,000 employees.
127)
Variables employed in a regression model can be numerical or categorical. ⊚ true ⊚ false
Version 1
86
128)
A dummy variable is a variable that takes on the values of 0 and 1. ⊚ true ⊚ false
129)
A dummy variable is also called an interaction variable. ⊚ true ⊚ false
130) <p>In the regression equation <span style="font-family:monospace;"> = b0 + b1x + b2d, a dummy variable d affects the slope of the line. ⊚ true ⊚ false
131) The number of dummy variables representing a categorical variable should be one less than the number of categories of the variable. ⊚ true ⊚ false
132) If the number of dummy variables representing a categorical variable equals the number of categories of this variable, one deals with the problem of perfect multicollinearity. ⊚ true ⊚ false
133) Consider the regression model y = β0 + β1x + β2d + β3xd + ε. If the dummy variable d changes from 0 to 1, the estimated changes in the intercept and the slope are b0 + b2 and b2, respectively.
Version 1
87
⊚ ⊚
true false
134) For the model y = β0 + β1x + β2d + β3xd + ε, the dummy variable d causes only a shift in intercept. ⊚ true ⊚ false
135) For the model y = β0 + β1x + β2d + β3xd + ε, in which d is a dummy variable, we cannot perform the F test for the joint significance of the dummy variable d and the interaction variable xd. ⊚ true ⊚ false
136) For the model y = β0 + β1x + β2d + β3xd + ε, in which d is a dummy variable, we can perform standard t tests for the individual significance of x, d, and xd. ⊚ true ⊚ false
137) Regression models that use a binary variable as the response variable are called binary choice models. ⊚ true ⊚ false
138)
A binary choice model is also referred to as a discrete choice model. ⊚ true ⊚ false
Version 1
88
139) A model y = β0 + β1x + ε, in which y is a binary variable, is called a linear probability model. ⊚ true ⊚ false
140) For the linear probability model y = β0 + β1x + ε, the predictions made by = b0 + b1x can be always interpreted as probabilities. ⊚ true ⊚ false
141)
The logit model can be estimated with standard ordinary least squares. ⊚ true ⊚ false
142) For the logit model, the predicted values of the response variables can always be interpreted as probabilities. ⊚ true ⊚ false
Version 1
89
Answer Key Test name: Chap 17_4e 1) categorical 2) indicator 3) trap 4) t test 5) trap 6) interaction 7) partial 8) choice 9) linear 10) ordinary 11) estimate 12) zero and one 13) C 14) B 15) B 16) D 17) C 18) D 19) B 20) C 21) C 22) B 23) A 24) D 25) C 26) D Version 1
90
27) D 28) B 29) C 30) A 31) D 32) D 33) A 34) A 35) B 36) A 37) A 38) D 39) C 40) B 41) C 42) B 43) B 44) D 45) D 46) C 47) C 48) D 49) B 50) C 51) B 52) B 53) D 54) D 55) B 56) C Version 1
91
57) D 58) A 59) C 60) A 61) B 62) A 63) D 64) B 65) A 66) B 67) C 68) C 69) C 70) D 71) C 72) D 73) B 74) C 75) D 76) D 77) B 78) D 79) C 80) B 81) C 82) D 83) B 84) D 85) A 86) A Version 1
92
87) B 88) C 89) D 90) D 91) A 92) B 93) C 94) D 95) D 96) A 97) B 98) C 99) C 100) D 101) A 102) D 103) HA: β2 ≠ 0 or β4 ≠ 0 104) F(2, 85) = 23.18. 105) Reject H0; for α = 0.01, Exper and Exper × Gender are jointly significant.
106) H0: β3 = 0 and β4 = 0 107) F(2, 47) = 27.26 108) Reject H0; Loc1 and Loc2 are jointly significant.
109) HA: β2 ≠ 0 110) More than 0.40 111) Yes; Model C reduces to Model B which has higher adjusted R2. 112) Model B 113) = 61.63 + 0.06Size + 56.12Loc1 + 38.34Loc2. 114) $38,340 115) $267,250 116) 117) Both Balance Ratio and Debt Ratio are significant at the 1% level. 118) 0.0381 Version 1
93
119) 0.0887 120) Fair 121) 122) 123) 124)
From about 12.94% to 21.39% 0.13x1 + 0.17x2 ≤ 6.7827 $10,800 34%
125) <p>a. = 531.4036 − 77.6490x where is the home value, x is the crime rate per capita. b. = 537.9753 − 94.4968x − 102.0030d1 − 53.2450d2, where is the home value, x is the crime rate per capita, and d1and d2represent accessibility index 1-low and 2-medium respectively. c. The p-value for the coefficients of the two dummy variables in Model 2 is less than α (0.00 < 0.05). Therefore, home values differ between an accessibility index of 1 and 3 as well as 2 and 3. d. Model 2 is reformulated to use either 1-low or 2-medium as the reference category. e. Model 2 f. The predicted home value for a 1.0 crime rate per capita and accessibility index 1-low = 377.4754. The predicted home value for a 1.0 crime rate per capita and accessibility index 2-medium = 425.9035. The predicted home value for a 1.0 crime rate per capita and accessibility index 3-high = 479.4784. 126) <p>a. −475.9051 + 0.2122x − 4765.24d where is assets, x is # of employees, and d is the big dummy variable. b. = 459.3204 + 0.1038x − 5718.0280d + 0.1086xd where is assets, x is # of employees, and d is the big dummy variable. c. Model 1 d. 585.0400.
127) TRUE 128) TRUE 129) FALSE 130) FALSE 131) TRUE 132) TRUE 133) FALSE 134) FALSE 135) FALSE Version 1
94
136) TRUE 137) TRUE 138) TRUE 139) TRUE 140) FALSE 141) FALSE 142) TRUE
Version 1
95
CHAPTER 18 – 4e 1)
The mean of the absolute residuals defines the __________.
2) Ideally, the chosen model is best in terms of its in-sample predictability and its __________ forecasting ability.
3)
__________ patterns are caused by the presence of a random error term.
4) The optimal value of the speed of decline in the exponential smoothing procedure is determined by a(n) __________ method.
5)
A linear trend can be estimated using __________ technique.
6) When comparing polynomial trend models, we use adjusted R2, which imposes a penalty for __________.
7)
The seasonal component typically represents repetitions over a __________ period.
8)
A __________ time series is one that has its seasonal variation removed.
9) Trend forecasting models are appropriate when the time series does not exhibit __________ variations.
Version 1
1
10) After we have estimated the trend and the seasonal components, we __________ them to make a forecast.
11) With the method of seasonal dummy variables, we estimate a trend forecasting model that includes __________ variables to capture seasonal variations.
12) When the increase in the series gets larger over time, it is attractive to use a(an) __________ trend model.
13) In the model yt = β0 + β1yt−1 + εt, the parameter β1 represents the __________ of the lagged response variable y.
14)
The regression yt = β0 + β1yt−1 + εt is referred to as a(n) __________ model of order one.
15)
A time series is __________. A) any set of data recorded at the same point in time B) a set of sequential observations of a variable over time C) a set of randomly measured data points of multiple variables at the same point in
time D) any set of data collected without regard to differences in time
16)
Which of the following is an example of a time series?
Version 1
2
A) The number of daily visitors to the Niagara Falls during the month of April. B) The recorded exam scores of students in a class. C) The sales prices of single family homes sold last month in Florida. D) The current average prices of regular gasoline in different states of the U.S.
17)
Which of the following is not an example of a time series?
A) Hourly volume of stocks traded on the New York Stock Exchange (NYSE) on the five last trading days. B) The number of daily visitors that frequent the Statue of Liberty during the month of July. C) The monthly sales for a retailer over a five-year period. D) The current temperature in the 49 state capitals.
18)
Which of the following factors refers to quantitative forecasting methods? A) Judgment of the forecaster. B) A formal model for analyzing historical data. C) Prior experience of the forecaster. D) Expertise of the forecaster.
19)
Under which of the following conditions is qualitative forecasting considered attractive? A) When the forecasts have to be documented. B) When the forecasts have to be independent of the forecaster’s judgment. C) When past data are either unavailable or are misleading. D) When the forecasts can be based on reliable data.
20)
Which of the following is a criticism made of qualitative forecasts?
Version 1
3
A) They become the only possible forecast when past data are either not available or are misleading. B) They provide no guidance on the likely effects of changes in explanatory variables. C) They are prone to biases such as optimism and overconfidence. D) They are difficult to document.
21)
In which of the following situations is the use of qualitative forecasts most appropriate?
A) A marketing manager has to forecast monthly sales for the coming financial year based on the past monthly sales figures. B) A TV network executive has to forecast viewership figures for a daily talk show based on historical data from the past on a similar show on a rival network. C) An economist has to forecast credit flow resulting from a newly introduced stimulus package by the federal government. D) A country’s annual rate of growth for the upcoming year has to be estimated based on the annual GDP data from the last 20 years.
22)
Which of the following is a smoothing technique? A) Moving average methods B) Exponential trend models C) Polynomial trend models D) Autoregressive models
23)
Which of the following is a causal forecasting method?
Version 1
4
A) Moving average methods B) Exponential trend models C) Polynomial trend models D) Autoregressive models
24)
Which of the following is not based on a regression model? A) Moving average methods B) Exponential trend models C) Polynomial trend models D) Autoregressive models
25) A time series with observed long-term upward movements in its values is said to have __________. A) a cyclical component B) an increasing trend component C) a seasonally increasing component D) a decreasing trend component
26)
Which of the following is not true of a time series with a cyclical component? A) It has wavelike fluctuations lasting from several months to several years. B) It is difficult to analyze because cycles vary in length and amplitude. C) It typically coincides with business cycles in the economy. D) It has wavelike fluctuations always lasting less than a year.
Version 1
5
27) The __________ method is a smoothing technique based on computing the average from a fixed number of the most recent observations. A) exponential smoothing B) moving average C) linear regression D) causal regression
28) In a moving average method, when a new observation becomes available, the new average is computed by including the new observation and __________. A) dropping the oldest observation B) keeping the previous m observations C) dropping the youngest previous observation D) keeping any previous m − 1 observations
29) The past monthly demands are shown below. The naïve method, that is, the one-period moving average method, is applied to make forecasts. Month
Demand
January
40
February
45
March
50
April
35
What is the mean square error of the forecasts? A) −1.67 B) 275.00 C) 91.67 D) 8.33
Version 1
6
30) The past monthly demands are shown below. The naïve method, that is, the one-period moving average method, is applied to make forecasts. Month
Demand
January
40
February
45
March
50
April
35
What is the mean absolute deviation (MAD) of the forecasts? A) −1.67 B) 25.00 C) 91.67 D) 8.33
31) The past monthly demands are shown below. The naïve method, that is, the one-period moving average method, is applied to make forecasts. Month
Demand
January
40
February
45
March
50
April
35
If the demand in May appears to be 35, what is the residual (error) for May? A) 35 B) 0 C) 68.75 D) 6.25
32)
Which of the following is true of the exponential smoothing method?
Version 1
7
A) It does not incorporate new observations into existing forecasts. B) It weighs all recent observations equally. C) It assigns weights that decrease exponentially as observations get older. D) It does not use all available observations in making a forecast.
33) Which of the following is a similarity between the exponential smoothing method and the moving average method? A) Both methods give different weights to most recent observations. B) Both methods assign weights to all available observations. C) Both methods give equal weight to every observation. D) Both methods continually revise a forecast when a new observation becomes available.
34) In the exponential smoothing formula for updating the level of the series, At = αyt + (1 − α)At − 1, what does α represent? A) The weighted average of the exponentially declining weights. B) The speed of decline in the weights of older observations. C) The initial value of the time series. D) The last value of the time series.
35)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
What is the forecast for May when the three-month moving average method is applied? Version 1
8
A) 19 B) 5 C) 20 D) 21
36)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When a forecast is made by the three-month moving average method, all three monthly observations used to make this forecast are treated equally in the sense that each of them has the same weight of 1/3. What is the forecast for May when the three-month weighted moving average method is applied with the weights: 1/6, 2/6, and 3/6? Assign the smallest weight to the oldest data and the largest weight to the most recent data. A) 19.00 B) 24.00 C) 18.67 D) 21.17
37)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
What is the forecast for May using the exponential smoothing method with α = 0.1?
Version 1
9
A) 20.796 B) 21.000 C) 20.600 D) 20.440
38)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.1 is applied, what is the MSE? A) 20.796 B) 31.2336 C) 12.6736 D) 10.41
39)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.1 is applied, what is the MAD?
Version 1
10
A) 20.796 B) 9.16 C) 3.05 D) 3.56
40)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
What is the forecast for May using the exponential smoothing method with α = 0.5? A) 21 B) 21.5 C) 19 D) 19.5
41)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.5 is applied, what is the MSE?
Version 1
11
A) 21.5 B) 25 C) 41 D) 13.67
42)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.1 is applied, what is the MAD?
A) 21.5 B) 3.00 C) 5.00 D) 9.00
43)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.1 and α = 0.5 is applied, what is the speed of decline for which the mean square error is better? What is this mean?
Version 1
12
A) α = 0.5 and MSE = 13.67 B) α = 0.1 and MSE = 13.67 C) α = 0.5 and MSE = 10.41 D) α = 0.1 and MSE = 10.41
44)
The following table includes the information about a monthly time series. Month
Time Series
January
21
February
17
March
19
April
24
When the exponential smoothing method with α = 0.1 and α = 0.5 is applied, what is the speed of decline for which the mean absolute deviation is better? What is this mean? A) α = 0.1 and MAD = 3.05 B) α = 0.1 and MAD = 3.00 C) α = 0.5 and MAD = 3.00 D) α = 0.5 and MAD = 3.05
45)
Which of the following is true of the linear trend model? A) It assigns exponentially decreasing weights to older observations. B) It is a causal forecasting model. C) It can extract a long-term upward or downward steady movement in a series. D) It is used when a series involves only random fluctuations.
46) Which of the following types of trend models will best suit a series where the value of the series changes by a fixed amount for each period?
Version 1
13
A) Cubic trend B) Linear trend C) Exponential trend D) Quadratic trend
47) Which of the following types of trend models will best suit a series where the increase in value of the series gets larger over time? A) Exponential trend B) Linear trend C) Quadratic trend D) Polynomial trend
48) Which of the following formulas is used to make forecasts using the exponential trend model? A) B) C) D)
49) In the quadratic trend model, yt = β0 +β1t2 + εt, which coefficient determines if the trend is going to be U-shaped or inverted U-shaped? A) β0 B) β1 C) β2 D) εt
Version 1
14
50) A polynomial trend model that only allows one change in the direction of a series is known as a(n) __________. A) exponential trend model B) linear trend model C) cubic trend model D) quadratic trend model
51)
The __________ is a trend model that allows for one change in the direction of a series. A) linear trend model B) exponential trend model C) quadratic trend model D) cubic trend model
52) In comparison with the linear trend model, which of the following is not true of the cubic trend model? A) It always has better MSE. B) Two additional variables, t2 and t3, are defined in the cubic model. C) Only one change in the direction of a series can be modeled. D) It may have better or worse adjusted R2.
53)
When comparing which of the following trend models is the adjusted R2 not used? A) Linear versus quadratic B) Linear versus cubic C) Quadratic versus cubic D) Linear versus exponential
Version 1
15
54) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999
2000
2010
2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
Version 1
16
R2
0.8984
0.9502
0.9205
0.9317
Adjusted R2
0.8933
0.9477
0.9122
0.9203
Which of the following is a linear trend equation? A) = 642.7t + 60.496 B) = 642.7 + 60.496t C) = 642.7 + 60.496t2 D) yt = 642.7 + 60.496t + ε
55) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated.
Version 1
17
Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
R2
0.8984
0.9502
0.9205
0.9317
Adjusted R2
0.8933
0.9477
0.9122
0.9203
Using the linear trend equation, one can say that the predicted revenue increases by __________. A) $642,792,000 a year B) $604,960,000 a year C) $60,496,000 a year D) $6,049,600 a year
56) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
Version 1
18
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
R2
Which of the following is the revenue forecast for 2012 found by the linear trend equation? A) About 2 billion 149 million dollars B) About 2 billion and 189 million dollars C) About 2 billion and 334 million dollars D) About 2 billion and 34 million dollars
Version 1
19
57) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999
2000
2010
2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
R2
0.8984
0.9502
0.9205
0.9317
Adjusted R2
0.8933
0.9477
0.9122
0.9203
Version 1
20
Which of the following is an exponential trend equation? A) = exp(6.632 + 0.045t + (0.069)2⁄2) B) = 6.632 + 0.045t + 0.069 ⁄ 2 C) = exp(6.632 + 0.045t + 0.069 ⁄ 2) D) = 6.632 + 0.045t + (0.069)2⁄2
58) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Version 1
Linear Trend
Exponential
Quadratic
Cubic Trend 21
Trend
Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
R2
Which of the following is a revenue forecast for 2012 found by the exponential trend equation? A) About 2 billion and 334 million dollars B) About 2 billion and 189 million dollars C) About 2 billion and 141 million dollars D) About 2 billion and 34 million dollars
59) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
Version 1
22
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
R2
When three polynomial trend equations are compared, which of them provides the best fit? A) Linear B) Exponential C) Quadratic D) Cubic
Version 1
23
60) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999
2000
2010
2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
Version 1
R2
24
Which of the following is a revenue forecast for 2012 found by the polynomial trend equation with the best fit? A) About 2 billion and 149 million dollars B) About 2 billion and 189 million dollars C) About 2 billion and 337 million dollars D) About 2 billion and 34 million dollars
61) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Version 1
Linear Trend
Exponential
Quadratic
Cubic Trend 25
Trend
Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
R2
When all four trend regression equations are compared, which of them provides the best fit? A) Linear B) Exponential C) Quadratic D) Cubic
62) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990 to 2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
Version 1
26
The scatterplot indicates that the annual revenues have an increasing trend. Linear, exponential, quadratic, and cubic models were fit to the data starting with t = 1, and the following output was generated. Variable
Linear Trend
Exponential Trend
Quadratic Trend
Cubic Trend
Intercept
642.7
6.632
797.312
648.7662
t
60.496
0.045
21.8665
91.726
t2
N/A
N/A
1.6796
−5.7477
t3
N/A
N/A
N/A
0.2153
MSE
18333.16
14487.87
15079.84
13689
se
135.4
0.069
122.8
117
0.8984
0.9502
0.9205
0.9317
0.8933
0.9477
0.9122
0.9203
R2 Adjusted
R2
Which of the following is the revenue forecast for 2013 found by the trend regression equation with the best fit? A) About 2 billion and 337 million dollars B) About 2 billion and 95 million dollars C) About 2 billion and 248 million dollars D) About 2 billion and 290 million dollars
Version 1
27
63)
Which of the following is a two-period lagged regression model? A) yt = β0 + β1t + εt B) b0 + b1xt−1 C) yt = β0 + β1xt−1 + εt D) yt = β0 + β1xt−1 + β2xt−2 + εt
64) Given the linear trend model with seasonal dummy variables: yt= β0+ β1d1+ β2d2+ β3d3+ β4t + εtwhere yt is the value of the response variable at time t and d1, d2, and d3 are the dummy variables representing the first three quarters. How should the model be modified to account for monthly data? A) yt = β0 + β1d1 + β2d2 + β3d3 + β4d4 + β5d5 + β6d6 + β7d7 + β8d8 + β9d9 + β10d10 + β11d11 + β12t + εt B) yt = β0 + β1t + εt C) <p><span style="font-family:monospace;"><em> b0 + b1t D) yt = β0 + β1d1 + β2d2 + β3d3 + β4d4 + β5d5 + β6d6 + β7d7 + β8d8 + β9d9 + β10d10 + β11d11 + β12d12 + εt
65)
The types of lagged variables in a lagged regression model include: A) trend B) causal C) response D) seasonal
66)
<p>With quarterly data, an estimated linear trend model with seasonal dummy variables
can be specified as b0+ b1d1+ b2d2+ b3d3+ b4t. Which of the following is the way to make forecasts for quarter 4?
Version 1
28
A) <p><span style="font-family:monospace;">
b0 +b1+b4t
B) <p><span style="font-family:monospace;">
b0 +b2 +b4t
C) <p><span style="font-family:monospace;">
b0 +b3 +b4t
D) <p><span style="font-family:monospace;">
b0 + b4t
67) You have a time series of quarterly data starting from thefirst quarter of 2010 and concluding at thefourth quarter of 2018. How many time periods are in the time series? A) 8 B) 9 C) 32 D) 36
68)
<p>With quarterly data, an estimated linear trend model with seasonal dummy variables
can be specified as b0+ b1d1+ b2d2+ b3d3+ b4t. Given a time series starting from thefirst quarter of 2010 and concluding at thefourth quarter of 2019, what is the value of t if you are to make a forecast for thesecond quarter of 2020? A) 41 B) 38 C) 40 D) 42
69)
Which of the following is a cubic trend model? A) yt= β0+ β1t + εt B) yt= β0+ β1t + β2t2 + εt C) ln(yt) = β0+ β1t + εt D) yt= β0+ β1t + β2t2 + β3t3 + εt
Version 1
29
70)
Which of the following is a quadratic trend model? A) yt= β0+ β1t + εt B) yt= β0+ β1t + β2t2 + εt C) ln(yt) = β0+ β1t + εt D) yt= β0+ β1t + β2t2 + β3t3 + εt
71)
Which of the following is an exponential trend model? A) yt = β0 + β1t + εt B) yt = β0 + β1t + β2t2 + εt C) ln(yt) = β0 + β1t + εt D) yt = β0 + β1t + β2t2 + β3t3 + εt
72) A seasonal component differs from a cyclical component in that the seasonal component __________. A) is difficult to capture with historical data B) refers to fluctuations that may last more than a year C) represents wavelike fluctuations caused by business cycles D) represents some regular repetitions within a one-year period
Version 1
30
73) Which of the following statements is false regarding the following scatterplot of a time series with superimposed trend line?
A) The green trend line can be specified as b0+ b1t + b2t2 B) b2 < 0 C) A linear trend line will under-forecast future values. D) There is one change in direction of the time series.
74)
A portion of the quarterly sales data for the last 4 years is shown below: Year
Quarter
Sales (in $ million)
1
1
80
:
:
:
4
4
180
{MISSING IMAGE}Click here for the Excel Data File Which of the following is the Sales forecast for thefourth quarter of next year according to the exponential trend model with seasonality? A) 191.4193 B) 185.5422 C) 203.7379 D) 197.4826
Version 1
31
75)
Yearly sales data for the last 16 years is shown below:
Year
1
2
3
4
Sales
80
102
109
138
Year
5
6
7
8
Sales
96
113
128
144
Year
9
10
11
12
Sales
109
125
130
137
Year
13
14
15
16
Sales
128
155
170
180
Which of the following is the Sales forecast of next year according to the quadratic trend model? A) 142.7500 B) 179.7369 C) 166.8500 D) 176.3143
76)
Yearly sales data for the last 16 years is shown below:
Year
1
2
3
4
Sales
80
102
109
138
Year
5
6
7
8
Sales
96
113
128
144
Year
9
10
11
12
Sales
109
125
130
137
Year
13
14
15
16
Sales
128
155
170
180
Which of the following is the Sales forecast of next year according to the cubic trend model?
Version 1
32
A) 179.7369 B) 176.3143 C) 203.7379 D) 208.6951
77)
Which of the following is not true of seasonal dummy variables?
A) Their values are either 0 or 1. B) Their number should be one less than the number of seasons. C) They are used to describe the seasonality in quarterly, monthly, or other increment time series. D) Their number should equal the number of seasons.
78) If the regression framework is used to describe monthly seasonal data, how many dummy variables are needed? A) 13 B) 12 C) 11 D) 10
79) Which of the following models is considered for a quarterly time series that seems to change on average by a fixed amount and seems to have seasonality? A) ln(yt) = β0 + β1d1 + β2d2 + β3d3 + β4t + εt B) yt = β0 + β1d1 + β2d2 + β3d3 + β4t + εt C) ln(yt) = β0 + β1d1 + β2d2 + β3d3 + β4d4 +β5t + εt D) yt = β0 + β1d1 + β2d2 + β3d3 + β4d4 +β5t + εt
Version 1
33
80) Which of the following models is assumed for a quarterly time series that seems to grow on average by an increasing amount (or decline by a decreasing amount) and seems to have seasonality? A) ln(yt) = β0 + β1d1 + β2d2 + β3d3 + β4t + εt B) yt = β0 + β1d1 + β2d2 + β3d3 + β4t + εt C) ln(yt) = β0 + β1d1 + β2d2 + β3d3 + β4d4 +β5t + εt D) yt = β0 + β1d1 + β2d2 + β3d3 + β4d4 +β5t + εt
81) <p>Based on quarterly data collected over the last four years, the following regression equation was found to forecast the quarterly demand for the number of new copies of an economics textbook: = 3,305 − 665Qtr1 − 1,335Qtr2 + 305Qtr3, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2, and 3. Which of the following is not true? A) The demand in Quarter 1 is on average 665 copies lower than in Quarter 4. B) The demand in Quarter 3 is on average 305 copies higher than in Quarter 4. C) The demand in Quarter 2 is on average 1,335 copies higher than in Quarter 4. D) The demand in Quarter 4 is on average 3,305 copies.
82) <p>Based on quarterly data collected over the last four years, the following regression equation was found to forecast the quarterly demand for the number of new copies of an economics textbook: = 3,305 − 665Qtr1 − 1,335Qtr2 + 305Qtr3, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2, and 3. By what percent is the demand in Quarter 3 higher on average than the average quarterly demand? A) 25.29% B) 8.37% C) 31.63% D) 14.71%
Version 1
34
83) <p>Based on quarterly data collected over the last four years, the following regression equation was found to forecast the quarterly demand for the number of new copies of an economics textbook: = 3,305 − 665Qtr1 − 1,335Qtr2 + 305Qtr3, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2, and 3. The demand forecast for Quarter 2 of the next year is __________. A) 2,640 B) 1,970 C) 3,610 D) 3,305
84) <p>Based on quarterly data collected over the last five years, the following regression equation was found to forecast the quarterly demand for the number of new copies of a business statistics textbook: = 2,298 − 6,635Qtr1 − 1,446Qtr2 + 303Qtr3 + 26t, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2, and 3, and t = time period starting with t = 1. For a given year, the demand in Quarter 3 is on average __________. A) 1,749 copies lower than in Quarter 2 B) 303 copies higher than in Quarter 2 C) 1,749 copies higher than in Quarter 2 D) 1,446 copies higher than in Quarter 2
85) <p>Based on quarterly data collected over the last five years, the following regression equation was found to forecast the quarterly demand for the number of new copies of a business statistics textbook: = 2,298 − 335Qtr1 − 1,446Qtr2 + 303Qtr3 + 26t, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2 and 3, and t = time period starting with t = 1. For a given year, the demand in Quarter 4 is on average __________.
Version 1
35
A) 2,298 copies B) 1,993 copies C) 852 copies D) 2,601 copies
86) <p>Based on quarterly data collected over the last five years, the following regression equation was found to forecast the quarterly demand for the number of new copies of a business statistics textbook: = 2,298 − 6,635Qtr1 − 1,446Qtr2 + 303Qtr3 + 26t, where Qtr1, Qtr2, and Qtr3 are dummy variables corresponding to Quarters 1, 2, and 3, and t = time period starting with t = 1. The demand forecast for the second quarter of the next year is __________. A) 2,181 B) 3,199 C) 2,922 D) 1,424
87)
Which of the following models is not called a causal forecasting model? A) yt = β0 + β1t + β1yt−1 + εt B) yt = β0 + β1yt−1 + εt C) yt = β0 + β1t + εt D) yt = β0 + β1xt−1 + εt
88) <p>Given the estimated model = b0 + b1t, the one-step-ahead and multistep-ahead forecasts are possible, if we know or can predict, future values of the __________.
Version 1
36
A) dummy variable B) explanatory variable C) response variable D) dependent variable
89) If T denotes the number of observations, which of the following equations represents the one-step-ahead forecast for the model yt = β0 + β1xt + εt?
90)
A)
= b0 + b1xT − 1
B)
= b0 + b1xT + 1
C)
= b0 + b1xT+1
D)
= b0 − b1xT+1
<p>The model = β0 + β1yt − 1 + εt is known as a(n) __________. A) quadratic trend model B) autoregressive model C) linear trend model D) exponential trend model
91) Which of the following equations is a one-period-ahead forecast of the autoregressive model AR(1)? A)
= b0 + b1(T+1)
B)
= b0 + b1xT
C)
= b0 + b1yT + b2yT − 1
D)
= b0 + b1yT
Version 1
37
92) If there are T observations to estimate the lagged regression model yt = β0 + β1xt − 1 + εt, what is the actual number of observations used to make the forecast for time period T? A) T − 1 B) T + 1 C) 1 − T D) T
93) Prices of crude oil have been steadily rising over the last two years (The Wall Street Journal, December 14, 2010). The monthly data on price per gallon of unleaded regular gasoline in the United States from January 2009 to December 2010 were available. Three trend models were created starting with t = 1, and the following output was generated. Variable
Linear Trend
Quadratic Trend
Cubic Trend
Intercept
2.0717
1.7700
1.5175
t
0.0391
0.1087
0.2188
t2
N/A
−0.0028
−0.0136
N/A
N/A
0.0003
0.7240
0.8668
0.9238
t3 Adjusted
R2
Based on adjusted R2, which of the following models is the most appropriate for making a forecast for the price of regular unleaded gasoline? A) Linear B) Quadratic C) Cubic D) Cannot be assessed with the given information
94) Prices of crude oil have been steadily rising over the last two years (The Wall Street Journal, December 14, 2010). The monthly data on price per gallon of unleaded regular gasoline in the United States from January 2009 to December 2010 were available. Three trend models were created starting with t = 1 and the following output was generated.
Version 1
38
Variable
Linear Trend
Quadratic Trend
Cubic Trend
Intercept
2.0717
1.7700
1.5175
t
0.0391
0.1087
0.2188
t2
N/A
−0.0028
−0.0136
N/A
N/A
0.0003
0.7240
0.8668
0.9238
t3 Adjusted
R2
Which of the following is a cubic trend equation used to forecast for the price of regular unleaded gasoline? A) = 1.5175 + 0.2188t3. B) = exp(1.5175 + 0.2188t − 0.0136t2 + 0.0003t3) C) = 1.5175 − 0.0136t2 + 0.0003t3 D) = 1.5175 + 0.2188t − 0.0136t2 + 0.0003t3
95) Prices of crude oil have been steadily rising over the last two years (The Wall Street Journal, December 14, 2010). The monthly data on price per gallon of unleaded regular gasoline in the United States from January 2009 to December 2010 were available. Three trend models were created starting with t = 1 and the following output was generated. Variable
Linear Trend
Quadratic Trend
Cubic Trend
Intercept
2.0717
1.7700
1.5175
t
0.0391
0.1087
0.2188
t2
N/A
−0.0028
−0.0136
N/A
N/A
0.0003
0.7240
0.8668
0.9238
t3 Adjusted
R2
Using the cubic trend equation, which of the following would be the forecast for the price of regular unleaded gasoline for January 2011? A) $3.00 B) $3.09 C) $3.175 D) $3.22
Version 1
39
96) Prices of crude oil have been steadily rising over the last two years (The Wall Street Journal, December 14, 2010). The monthly data on price per gallon of unleaded regular gasoline in the United States from January 2009 to December 2010 were available. Three trend models were created starting with t = 1 and the following output was generated. Variable
Linear Trend
Quadratic Trend
Cubic Trend
Intercept
2.0717
1.7700
1.5175
t
0.0391
0.1087
0.2188
t2
N/A
−0.0028
−0.0136
N/A
N/A
0.0003
0.7240
0.8668
0.9238
t3 Adjusted
R2
Which of the following would be a difference in the forecasts for the price of regular unleaded gasoline for February 2011 comparing two trend models: cubic and quadratic? A) $0.01 B) $0.02 C) $0.03 D) $0.04
97)
Which of the following components does not cause the systematic patterns? A) Trend B) Seasonal C) Random D) Cyclical
98) When using Excel for calculating moving averages, the Moving Average dialog box should be activated. The value of m − the number of periods should be entered in the box named __________.
Version 1
40
A) Periods B) Number C) Input Range D) Interval
99)
Which of the following R functions is used to do the exponential smoothing method? A) EMA B) SMA C) DLM D) LM
100)
Which of the following is a unique feature of the cyclical component of time series?
A) A downward movement of the series. B) A predictable magnitude and duration of up-and-down swings of the series. C) A wavelike pattern with different duration and magnitude of up-and-down swings of the series. D) An unexplained movement of the series.
101) The following statements about model selection criteria to compare quantitative forecasting models is TRUE except:
Version 1
41
A) Model selection in-sample criteria are based on the residuals. B) If relatively large residuals are undesirable, then Mean Square Error is the preferred performance measure for model selection. C) We can use residuals-based performance measure for selecting models that employ the same number of explanatory variables. D) A large Mean Absolute Percentage Error relative to Mean Absolute Error is indicative of relatively large residuals.
102) Which of the following is a situation when using smoothing techniques for forecasting is most appropriate? A) Number of positive cases of COVID-19 on a daily basis. B) Number of technical support calls after the recent operating systems update. C) Number of online toy purchases before and after the bankruptcy of Toys”R”Us. D) Number of ski-equipment sales throughout the year.
103) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990–2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The autoregressive models of order 1 and 2, yt = β0 + β1yt − 1 + εt, and yt = β0 + β1yt − 1 + β2yt − 2 + εt, were applied on the time series to make revenue forecasts. The relevant parts of Excel regression outputs are given below. Model AR(1): Regression Statistics Multiple R
0.93
R Square
0.87
Version 1
42
Adjusted R Square
0.86
Standard Error
151.85
Observations
21 Coefficients
Standard Error
t-stat
p-value
Intercept
−6.5999
126.5478
−0.0522
0.9590
yt − 1
1.0604
0.0946
11.2117
0.0000
Model AR(2): Regression Statistics Multiple R
0.93
R Square
0.86
Adjusted R Square
0.84
Standard Error
158.63
Observations
20 Coefficients
Standard Error
t-stat
p-value
Intercept
2.6651
150.7574
0.0177
0.9861
yt − 1
0.9748
0.2420
4.0279
0.0009
yt − 2
0.0860
0.2741
0.3136
0.7576
Using the AR(1) model, find the company revenue forecast for 2012.
104) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990–2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
Version 1
43
The autoregressive models of order 1 and 2, yt = β0 + β1yt − 1 + εt, and yt = β0 + β1yt − 1 + β2yt − 2 + εt, were applied on the time series to make revenue forecasts. The relevant parts of Excel regression outputs are given below. Model AR(1): Regression Statistics Multiple R
0.93
R Square
0.87
Adjusted R Square
0.86
Standard Error
151.85
Observations
21 Coefficients
Standard Error
t-stat
p-value
Intercept
−6.5999
126.5478
−0.0522
0.9590
yt − 1
1.0604
0.0946
11.2117
0.0000
Model AR(2): Regression Statistics Multiple R
0.93
R Square
0.86
Adjusted R Square
0.84
Standard Error
158.63
Observations
20 Coefficients
Standard Error
t-stat
p-value
Intercept
2.6651
150.7574
0.0177
0.9861
yt − 1
0.9748
0.2420
4.0279
0.0009
yt − 2
0.0860
0.2741
0.3136
0.7576
Using the AR(2) model, find the company revenue forecast for 2012.
Version 1
44
105) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990–2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The autoregressive models of order 1 and 2, yt = β0 + β1yt − 1 + εt, and yt = β0 + β1yt − 1 + β2yt − 2 + εt, were applied on the time series to make revenue forecasts. The relevant parts of Excel regression outputs are given below. Model AR(1): Regression Statistics Multiple R
0.93
R Square
0.87
Adjusted R Square
0.86
Standard Error
151.85
Observations
21 Coefficients
Standard Error
t-stat
p-value
Intercept
−6.5999
126.5478
−0.0522
0.9590
yt − 1
1.0604
0.0946
11.2117
0.0000
Model AR(2): Regression Statistics Multiple R
0.93
R Square
0.86
Adjusted R Square
0.84
Standard Error
158.63
Observations
20 Coefficients
Standard Error
t-stat
p-value
Intercept
2.6651
150.7574
0.0177
0.9861
yt − 1
0.9748
0.2420
4.0279
0.0009
yt − 2
0.0860
0.2741
0.3136
0.7576
Version 1
45
Compare Excel outputs for AR(1) and AR(2) and choose the forecasting model that seems to be better.
106) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990–2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
The autoregressive models of order 1 and 2, yt = β0 + β1yt − 1 + εt, and yt = β0 + β1yt − 1 + β2yt − 2 + εt, were applied on the time series to make revenue forecasts. The relevant parts of Excel regression outputs are given below. Model AR(1): Regression Statistics Multiple R
0.93
R Square
0.87
Adjusted R Square
0.86
Standard Error
151.85
Observations
21 Coefficients
Standard Error
t-stat
p-value
Intercept
−6.5999
126.5478
−0.0522
0.9590
yt − 1
1.0604
0.0946
11.2117
0.0000
Model AR(2): Regression Statistics Multiple R
0.93
R Square
0.86
Version 1
46
Adjusted R Square
0.84
Standard Error
158.63
Observations
20 Coefficients
Standard Error
t-stat
p-value
Intercept
2.6651
150.7574
0.0177
0.9861
yt − 1
0.9748
0.2420
4.0279
0.0009
yt − 2
0.0860
0.2741
0.3136
0.7576
When for AR(1), H0: β0 = 0 is tested against HA: β0 ≠ 0, the p-value of this t test shown the output as 0.9590. This could suggest that the model yt = β1yt−1 + εt might be an alternative to the AR(1) model yt = β0 + β1yt−1 + εt. Partial output for this simplified model is as follows: Regression Statistics Multiple R
0.99
R Square
0.99
Adjusted R Square
0.94
Standard Error
148.01
Observations
21 Coefficients
Standard Error
t-stat
p-value
0
N/A
N/A
N/A
1.0557
0.0241
43.7304
2.5E-21
Intercept yt − 1
Find the revenue forecast for 2012 through the use of yt = β1yt−1 + εt.
107) The following table shows the annual revenues (in millions of dollars) of a pharmaceutical company over the period 1990–2011. Year
1990
1991
1992
1993
1994
1995
1996
Revenue
830
791
842
891
953
1,036 1,050 1,070 1,127 1,272 1,386
Year
2001
2002
2003
2004
2005
2006
2007
1997 2008
1998 2009
1999 2010
2000 2011
Revenue 1,304 1,379 1,417 1,441 1,525 1,613 1,870 1,593 1,595 2,132 2,330
Version 1
47
The autoregressive models of order 1 and 2, yt = β0 + β1yt − 1 + εt, and yt = β0 + β1yt − 1 + β2yt − 2 + εt, were applied on the time series to make revenue forecasts. The relevant parts of Excel regression outputs are given below. Model AR(1): Regression Statistics Multiple R
0.93
R Square
0.87
Adjusted R Square
0.86
Standard Error
151.85
Observations
21 Coefficients
Standard Error
t-stat
p-value
Intercept
−6.5999
126.5478
−0.0522
0.9590
yt − 1
1.0604
0.0946
11.2117
0.0000
Model AR(2): Regression Statistics Multiple R
0.93
R Square
0.86
Adjusted R Square
0.84
Standard Error
158.63
Observations
20 Coefficients
Standard Error
t-stat
p-value
Intercept
2.6651
150.7574
0.0177
0.9861
yt − 1
0.9748
0.2420
4.0279
0.0009
yt − 2
0.0860
0.2741
0.3136
0.7576
When for AR(1), H0: β0 = 0 is tested against HA: β0 ≠ 0, the p-value of this t test shown by the output as 0.9590. This could suggest that the model yt = β1yt−1 + εt might be an alternative to the AR(1) model yt = β0 + β1yt−1 + εt. Excel partial output for this simplified model is as follows. Regression Statistics Multiple R
0.99
R Square
0.99
Version 1
48
Adjusted R Square
0.94
Standard Error
148.01
Observations
21 Coefficients
Standard Error
t-stat
p-value
0
N/A
N/A
N/A
1.0557
0.0241
43.7304
2.5E-21
Intercept yt − 1
Compare the autoregressive models yt = β0 + β1yt−1 + εt; yt = β0 + β1yt−1 + β2yt−2 + εt, and yt = β1yt−1 + εt, through the use of MSE and MAD.
108) Quarterly sales of a department store for the last seven years are given in the following table. Yea r
1
1
1
1
2
2
2
2
3
3
3
3
4
4
Qtr
1
2
3
4
1
2
3
4
1
2
3
4
1
2
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
4
4
5
5
5
5
6
6
6
6
7
7
7
7
Qtr
3
4
1
2
3
4
1
2
3
4
1
2
3
4
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
Version 1
49
The scatterplot shows that the quarterly sales have an increasing trend and seasonality. A linear regression model Sales = β0 + β1Qtr1 + β2Qtr2 + β3Qtr3 + β4t + ε, with dummy variables Qtr1, Qtr2, and Qtr3, is used to make forecasts. For the regression model, the following partial output is available. Coefficients
Standard Error
t stat
p-value
Intercept
31.9261
1.1028
28.9511
0.0000
Qtr1
−7.8555
1.1124
−7.0617
0.0000
Qtr2
−4.7362
1.1071
−4.2781
0.0003
Qtr3
−7.1656
1.1038
−6.4916
0.0000
t
1.0749
0.0487
22.0563
0.0000
What is the regression equation for the linear trend model with seasonal dummy variables?
Version 1
50
109) Quarterly sales of a department store for the last seven years are given in the following table. Yea r
1
1
1
1
2
2
2
2
3
3
3
3
4
4
Qtr
1
2
3
4
1
2
3
4
1
2
3
4
1
2
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
4
4
5
5
5
5
6
6
6
6
7
7
7
7
Qtr
3
4
1
2
3
4
1
2
3
4
1
2
3
4
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
The scatterplot shows that the quarterly sales have an increasing trend and seasonality. A linear regression model given by Sales = β0 + β1Qtr1 + β2Qtr2 + β3Qtr3 + β4t + ε, where t is the time period (t = 1, ..., 28) and Qtr1, Qtr2, and Qtr3 are quarter dummies, is estimated and then used to make forecasts. For the regression model, the following partial output is available. Coefficients
Standard Error
t stat
p-value
Intercept
31.9261
1.1028
28.9511
0.0000
Qtr1
−7.8555
1.1124
−7.0617
0.0000
Version 1
51
Qtr2
−4.7362
1.1071
−4.2781
0.0003
Qtr3
−7.1656
1.1038
−6.4916
0.0000
t
1.0749
0.0487
22.0563
0.0000
Using the regression equation for the linear trend model with seasonal dummy variables, what can be said about the sales in Quarter 4 compared to the sales in Quarter 1?
110) Quarterly sales of a department store for the last seven years are given in the following table. Yea r
1
1
1
1
2
2
2
2
3
3
3
3
4
4
Qtr
1
2
3
4
1
2
3
4
1
2
3
4
1
2
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
4
4
5
5
5
5
6
6
6
6
7
7
7
7
Qtr
3
4
1
2
3
4
1
2
3
4
1
2
3
4
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
Version 1
52
The scatterplot shows that the quarterly sales have an increasing trend and seasonality. A linear regression model given by Sales = β0 + β1Qtr1 + β2Qtr2 + β3Qtr3 + β4t + ε, where t is the time period (t = 1, ..., 28) and Qtr1, Qtr2, and Qtr3 are quarter dummies, is estimated and then used to make forecasts. For the regression model, the following partial output is available. Coefficients
Standard Error
t stat
p-value
Intercept
31.9261
1.1028
28.9511
0.0000
Qtr1
−7.8555
1.1124
−7.0617
0.0000
Qtr2
−4.7362
1.1071
−4.2781
0.0003
Qtr3
−7.1656
1.1038
−6.4916
0.0000
t
1.0749
0.0487
22.0563
0.0000
Using the regression equation for the linear trend model with seasonal dummy variables, what is the sales forecast for the first quarter of Year 8?
Version 1
53
111) Quarterly sales of a department store for the last seven years are given in the following table. Yea r
1
1
1
1
2
2
2
2
3
3
3
3
4
4
Qtr
1
2
3
4
1
2
3
4
1
2
3
4
1
2
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
4
4
5
5
5
5
6
6
6
6
7
7
7
7
Qtr
3
4
1
2
3
4
1
2
3
4
1
2
3
4
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
The scatterplot shows that the quarterly sales have an increasing trend and seasonality. A linear regression model given by Sales = β0 + β1Qtr1 + β2Qtr2 + β3Qtr3 + β4t + ε, where t is the time period (t = 1, ..., 28) and Qtr1, Qtr2, and Qtr3 are quarter dummies, is estimated and then used to make forecasts. For the regression model, the following partial output is available. Coefficients
Standard Error
t stat
p-value
Intercept
31.9261
1.1028
28.9511
0.0000
Qtr1
−7.8555
1.1124
−7.0617
0.0000
Version 1
54
Qtr2
−4.7362
1.1071
−4.2781
0.0003
Qtr3
−7.1656
1.1038
−6.4916
0.0000
t
1.0749
0.0487
22.0563
0.0000
Using the regression equation for the linear trend model with seasonal dummy variables, what is the sales forecast for the fourth quarter of Year 8?
112) Quarterly sales of a department store for the last seven years are given in the following table. Yea r
1
1
1
1
2
2
2
2
3
3
3
3
4
4
Qtr
1
2
3
4
1
2
3
4
1
2
3
4
1
2
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
4
4
5
5
5
5
6
6
6
6
7
7
7
7
Qtr
3
4
1
2
3
4
1
2
3
4
1
2
3
4
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
a.Estimate the linear trend model with no seasonality. b.Estimate the linear trend model with seasonal dummy variables, Qtr1, Qtr2, and Qtr3 where Qtr1 = 1 for quarter 1, 0 otherwise; other dummy variables are defined similarly. c.Which is the preferred model for forecasting? d.Use the preferred model to forecast quarterly sales in the next two quarters.
113)
Yearly sales of a department store for the last 28 years are given in the following table.
Version 1
55
Yea r
1
2
3
4
5
6
7
8
9
10
11
12
13
14
Sal 26.3 29.8 29.4 37.5 31.3 33.6 33.3 41.3 35.4 35.9 33.8 42.8 36.1 43.5 es 31 86 92 22 90 53 61 18 78 09 14 66 50 81 Yea r
15
16
17
18
19
20
21
22
23
24
25
26
27
28
Sal 38.9 47.9 37.3 45.6 45.5 49.5 47.3 50.8 50.0 60.0 52.2 56.0 55.0 64.6 es 24 03 91 98 52 81 04 81 05 06 68 64 42 78
a.Estimate the exponential trend model. b.Use the estimated model to forecast sales for the next two years.
114) A portion of the number of positive cases of COVID-19 worldwide in the month of April 2020 is shown below: Date 04/01/20 : 04/30/20
Cases 851587 : 3131487
{MISSING IMAGE}Click here for the Excel Data File a.Use a three-period moving average to forecast the number of positive cases on May 1, 2020. b.Use simple exponential smoothing with α = 0.8 and L1 = y1 to forecast the number of positive cases on May 1, 2020. c.Which is the preferred technique for making the forecast based on MSE, MAD, and MAPE?
115) A portion of quarterly data on the demand of ABC resort rentals (in thousands) for the past five years is shown below: Year
Version 1
Quarter
Demand
56
1 : 5
1 : 4
480 : 1243
{MISSING IMAGE}Click here for the Excel Data File a.Estimate the linear trend model with seasonal dummy variables, d1, d2, and d3 where d1 = 1 for quarter 1, 0 otherwise; other dummy variables are defined similarly. b.Estimate the exponential trend model with seasonal dummy variables, d1, d2, and d3 where d1 = 1 for quarter 1, 0 otherwise; other dummy variables are defined similarly. c.Which is the preferred model? d.Use the preferred model to forecast the quarterly demand of ABC resort rentals in thefirst andfourth quarters of year 7.
116) A portion of quarterly data on the demand of ABC resort rentals (in thousands) for the past five years is shown below: Year 1 : 5
Quarter 1 : 4
Demand 480 : 1243
{MISSING IMAGE}Click here for the Excel Data File a.Estimate the linear trend model with seasonal dummy variables, d1, d2, and d3 where d1 = 1 for quarter 1, 0 otherwise; other dummy variables are defined similarly. b.Estimate the quadratic trend model with seasonal dummy variables, d1, d2, and d3 where d1 = 1 for quarter 1, 0 otherwise; other dummy variables are defined similarly. c.Which is the preferred model? d.Use the preferred model to forecast the quarterly demand of ABC resort rentals in thefirst and fourth quarters of year 7.
Version 1
57
117) A portion of data on ridership (in thousands) and on-time performance (in %) of DC bus fleet for the past two years is shown below: Year 1 : 2
Month 1 : 2
Ridership 5879 : 7893
On-Time 80.4 : 78.4
{MISSING IMAGE}Click here for the Excel Data File Estimate the following three models and use the preferred model to make a forecast for ridership in January of Year 3. Model 1: Ridership t = β0 + β1On-Timet−1 + εt Model 2: Ridership t = β0 + β1Ridershipt−1 + εt Model 3: Ridership t = β0 + β1Ridershipt−1 β2On-Timet−1 + εt
118) Quantitative forecasting procedures are based on the judgment of the forecaster, who uses prior experience and expertise to make forecasts. ⊚ true ⊚ false
119) Causal forecasting models are based on a regression model framework, where the variable of interest depends on one or more explanatory variables. ⊚ ⊚
true false
120) In forecasting methods, the mean square error (MSE) is computed by dividing the sum of squared residuals (errors) by the number of observations n for which the residuals are available.
Version 1
58
⊚ ⊚
true false
121) Smoothing techniques are suitable for use when forecasts need to be updated frequently due to new observations that become available. ⊚ true ⊚ false
122) The moving average method is one of the most complex smoothing techniques used for processing time series. ⊚ true ⊚ false
123) The exponential smoothing method weighs all available observations in a time series equally. ⊚ true ⊚ false
124) The exponential trend model is attractive when the increase in the series gets larger over time. ⊚ true ⊚ false
125) Although we use the MSE to compare the linear and the exponential trend models, we cannot use it to compare the linear, quadratic, and cubic trend models. ⊚ true ⊚ false
Version 1
59
126) The cyclical component of a time series typically represents repetitions within a one-year period. ⊚ true ⊚ false
127)
<p>When the exponential trend model is used to make forecasts, the forecasts are given
by ⊚ ⊚
true false
128) When a time series has both trend and seasonality, it is necessary to model seasonality along with trend. ⊚ true ⊚ false
129) Seasonality can be incorporated in nonlinear trend models by using seasonal dummy variables. ⊚ true ⊚ false
130) R uses the fourth quarter as the reference season when estimating a linear trend model with seasonality. ⊚ true ⊚ false
131) When the forecasting method of seasonal dummy variables is applied on a quarterly time series, four dummy variables are needed. ⊚ true ⊚ false
Version 1
60
132) Noncausal forecasting models are purely time series models in the sense that the forecasts are made based only on historical data concerning the variable of interest. ⊚ true ⊚ false
Version 1
61
Answer Key Test name: Chap 18_4e 1) mean absolute deviation 2) out-of-sample 3) Unsystematic 4) trial-and-error 5) regression 6) over-parameterization 7) one-year 8) deseasonalized 9) seasonal 10) recompose 11) dummy 12) exponential 13) slope 14) autoregressive 15) B 16) A 17) D 18) B 19) C 20) C 21) C 22) A 23) D 24) A 25) B 26) D Version 1
62
27) B 28) A 29) C 30) D 31) B 32) C 33) D 34) B 35) C 36) D 37) A 38) D 39) C 40) B 41) D 42) B 43) D 44) C 45) C 46) B 47) A 48) B 49) C 50) D 51) C 52) C 53) D 54) B 55) C 56) D Version 1
63
57) A 58) C 59) D 60) C 61) D 62) A 63) D 64) A 65) C 66) D 67) D 68) D 69) D 70) B 71) C 72) D 73) C 74) C 75) D 76) D 77) D 78) C 79) B 80) A 81) C 82) A 83) B 84) C 85) A 86) D Version 1
64
87) C 88) B 89) C 90) B 91) D 92) A 93) C 94) D 95) C 96) C 97) C 98) D 99) A 100) C 101) D 102) A 103)
≈ 2,464,000,000.
104)
≈ 2,457,000,000.
105) Model AR(1) 106) ≈ 2,459,000,000. 107) Model yt = β0 + β1yt−1 + εt has the lowest MSE; Model yt = β1yt−1 + εt has the lowest MAD. 108) 109) For a given year, the sales in Quarter 4 are on average higher by 7.855 than the sales in Quarter 1. 110) 55.243 111) 66.323
Version 1
65
112) a.The linear trend model without seasonality is: = 26.3996 + 1.1154t b.The linear trend model with seasonality is: = 31.9261 − 7.8555Qtr1 − 4.7362Qtr2 − 7.1656Qtr3 + 1.0749t c.The linear trend model with seasonality. d.Quarterly sales in thefirst quarter of year 8 = 55.2434. Quarterly sales in the second quarter of year 8 = 59.4376. 113) <p>a.
=exp(3.3429 +0.0264 t + 0.08432/2)
b. = 61.0292; = 62.6604 114) a.3055811 b.3112460 c.Exponential smoothing 115) a.The estimated linear trend model with seasonal dummy variables is: = 398.2 − 27.7d1 − 113.6d2 + 82.3d3 + 34.7t. b. The estimated exponential trend model with seasonal dummy variables is: = exp(6.1465 − 0.058d1 − 0.1904d2 + 0.1033d3 + 0.0436t + 0.13992/2). c.The preferred model is the exponential trend model with seasonal dummy variables. d.Quarterly demand for thefirst quarter of year 7 is 1323.9886. Quarterly demand for thefourth quarter of year 7 is 1599.3934. 116) a.The estimated linear trend model with seasonal dummy variables is: = 398.2 − 27.7d1 − 113.6d2 + 82.3d3 + 34.7t. b.The estimated quadratic trend model with seasonal dummy variables is: = 610.5285 − 27.7d1 − 108.012d2 + 87.8876d3 − 23.9697t + 2.7938t2. c.The preferred model is quadratic trend model with seasonal dummy variables. d.Quarterly demand for the first quarter of year 7 is 1729.7080. Quarterly demand for thefourth quarter of year 7 is 2129.7124.
117) Model 1:
= 19100.32 − 151.08 × On-Timet−1
Model 2:
= 6411.81 + 0.13 × Ridershipt−1
Model 3:
= 18016.68 + 0.06 × Ridershipt−1 − 143.13 × On-Timet−1
Use the preferred Model 1, the forecast for Ridership in January of Year 3 = 7256.04.
Version 1
66
118) FALSE 119) TRUE 120) TRUE 121) TRUE 122) FALSE 123) FALSE 124) TRUE 125) TRUE 126) FALSE 127) TRUE 128) TRUE 129) TRUE 130) FALSE 131) FALSE 132) TRUE
Version 1
67
CHAPTER 18 – 4e – Additional Test bank Questions 1) Based on quarterly data collected over the last five years, the following regression equation was found to forecast the quarterly demand for the number of new copies of a business statistics textbook: Demand = 2,298 − 6,635Qtr1 − 1,446Qtr2 + 303Qtr3 + 26t, whereQtr1,Qtr2, andQtr3 are dummy variables corresponding to Quarters 1, 2, and 3, respectively,andt= time period starting witht= 1 through t = 20.The forecast for the second quarter of next year is closest to A) 852 B) 878 C) 1,424 D) 2,870
2) The controller of a large construction company is trying to forecast annual expenses for the next year.He collects data on Expenses (in $1,000s) over the past 20 years (Trend = 1, 2, ..., 20) and obtains the following sample regression equation: log(Expenses) = 4.13 + 0.15Trend He concludes that Expenses increase by A) 0.15% each year. B) 15% each year. C) $15,000 each year. D) $445,000 each year.
3) The controller of a large construction company is trying to forecast annual expenses for the next year.He collects data on Expenses (in $1,000s) over the past 20 years (Trend = 1, 2, ..., 20) and obtains the following sample regression equation: log(Expenses) = 4.13 + 0.15Trend If the standard error of the estimate is 0.51, then predicted expenses (in $1,000s)for next year are closest to A) 728 B) 741 C) 1,451 D) 1,653
Version 1
1
4) The following figure shows the annual revenues (in millions of dollars) of a pharmaceutical company over the past ten years.
Given only this information, which of the following forecasting methods would be most appropriate to forecast annual revenues for the next year? A) The linear trend model B) Autoregressive Model of Order 1 (AR1) C) Exponential smoothing with alpha = 0.50 D) The linear trend model with seasonal dummy variables
Version 1
2
5) The following figure shows the monthly beer sales for a beer manufacturer over the past four years:
Which systematic component of the time series is most prevalent over this time period? A) A trend B) A cyclical component C) A seasonal component D) A random component
6)
Which of the following is true of the moving average method? A) It is a causal forecasting model. B) It is used when a series involves only random fluctuations. C) It assigns exponentially decreasing weights to older observations. D) It can extract a long-term upward or downward steady movement in a series.
Version 1
3
7) Based on quarterly data collected over the last five years, the following regression equation was found to forecast the quarterly demand for the number of new copies of a business statistics textbook: Demand = 2,298 − 6,635Qtr1 − 1,446Qtr2 + 303Qtr3 + 26t, whereQtr1,Qtr2, andQtr3 are dummy variables corresponding to Quarters 1, 2, and 3, respectively,andt= time period starting witht= 1 through t = 20.The forecast for the third quarter of next year is closest to A) −5,454 B) 2,601 C) 2,627 D) 3,199
Version 1
4
Answer Key Test name: Chap 18_4e_Additional TB Questions 1) C 2) B 3) D 4) A 5) C 6) B 7) D
Version 1
5
CHAPTER 19 – 4e 1)
Consider the given prices ($ per gallon) of whole milk for the past 5 years:
Year 2015 2016 2017 2018 2019
Price ($ per gallon) 3.76 4.11 3.34 3.99 3.09
a. Using 2015 as the base year, compute the simple price index for whole milk from 2016 to 2019. b. Comment on the whole milk price indices for 2016 and 2019. c. Update the whole milk price indices with a revised base of 2017. d. Explain the reason for changing the base year from 2015 to 2017.
2) Consider the given prices ($ per gallon) and quantity sold (in gallon) information on whole milk and 2% milk for the past 5 years: Year 2015 2016 2017 2018 2019
Whole_Price 3.76 4.11 3.34 3.99 3.09
Whole_#_Sold 14451 15242 15621 15901 16119
2%_Price 3.09 3.49 3.65 4.16 2.89
2%_#_Sold 16759 16556 16100 15665 15281
a. Compute the unweighted aggregate price index for milk using 2015 as the base year. b. Comment on the unweighted aggregate price indices for 2018 and 2019. c. Compute the Laspeyres price index for milk using 2015 as the base year. d. Compute the Paasche price index for milk using 2015 as the base year. e. Comment on the Laspeyres and Paasche price indices.
Version 1
1
3) The ________ component is the capital gain or loss resulting from an increase or decrease in the value of the asset.
4) The income component is ________ for stocks, interest for bonds, and rental income for a real estate investments.
5) The ________ equation is a theoretical relationship between nominal returns, real returns, and the expected inflation rate.
6) When the expected inflation rate is relatively low, a reasonable ________ to the Fisher equation is r = R − i.
7) An index number is an easy-to-interpret numerical value that reflects a ________ change in price or quantity from a base price.
8) If the average price for gasoline in 2001 and 2006 was $1.46 per gallon and $2.59, respectively, the simple price index for 2006 is ________%.
9) The ________ price index is used to represent relative price movements for a group of items.
Version 1
2
10) The ________ price index uses quantities evaluated in the base period to compute the weighted aggregate price index.
11)
The Paasche index uses the ________ period quantities in deriving the weights.
12) A ________ series is obtained by adjusting the given time series for changes in prices, or inflation.
13)
The ________ rate is the percentage rate of change of a price index over time.
14)
Which of the following statements is true of investment returns?
A) A change in the price of the underlying asset usually has no impact on returns to investment. B) Income yield always accounts for the bulk of the returns from an investment. C) The higher the price of the asset during the time of purchase, the lower the returns from that investment. D) The higher the income yield, the lower the overall returns from that investment.
15) Donna Warne purchased a share of company X at $700 one year ago. The share is now trading at $850. Calculate the percent capital gain if Donna sells her share at the current price. A) 17.6% B) 150% C) 6.7% D) 21.4%
Version 1
3
16) Allen Gibbs purchased 50 shares of ABC Corporation, each worth $110. If the price of the share increases to $135 after a year, which of the following will be the capital gain if Allen sells all his shares? A) $1,250 B) $25 C) $6,750 D) $5,500
17) Emily Myers purchased a share of company X at $700 one year ago and earned a dividend of $50 during the year. The share is trading now at $850. Calculate Emily’s return from investment if she sells her share at $850. A) 7.14% B) 21.43% C) 28.57% D) 14.29%
18)
Which of the below is not required to compute an investment return? A) Interest rate B) Price of the asset at purchase C) Price of the asset at sale D) Distributed income
19) Joanna Robertson bought a share of XYZ Corporation at $200 last year. She was paid a dividend of $30 during the same year. However, the same share is trading at $280 this year. Calculate Joanna’s capital gains percent yield if she sells the shares at $280.
Version 1
4
A) 80% B) 15% C) 40% D) 28.57%
20) Joanna Robertson bought a share of XYZ Corporation at $200 last year. She was paid a dividend of $30 during the same year. However, the same share is trading at $280 this year. Calculate Joanna’s income percent yield from holding the share for a year. A) 15% B) 40% C) 55% D) 30%
21) Joanna Robertson bought a share of XYZ Corporation at $200 last year. She was paid a dividend of $30 during the same year. However, the same share is trading at $280 this year. Calculate the investment percent yield if Joanna sells the share at $280. A) 70% B) 110% C) 15% D) 55%
22)
Which of the following investment options is the most profitable in percentage terms?
A) A government bond with a coupon payment of $10, purchased at $100 and sold at $90 after a year. B) A government bond with a coupon payment of $15, purchased at $175 and sold at $155 after a year. C) A share with a dividend payment of $5, purchased at $50 and sold at $60 after a year. D) A Treasury bill purchased at $160 and sold at $175 after a year.
Version 1
5
23) The adjusted close price of a share is now $67. A year back, the adjusted close price of the same share was $50. What is the investment percent return for someone who holds 10 of these shares today? A) 34% B) 17% C) 20% D) 25.37%
24) Calculate the nominal percent return from an investment of $300 in a share whose price increases to $375 after a year. A) 75% B) 20% C) 25% D) 16.67%
25)
Which of the following types of returns capture the changes in purchasing power? A) Nominal returns B) Real returns C) Income yield D) Capital gains yield
26)
Which of the following correctly identifies the Fisher equation? A) r = (1 + R) / (1 + i) B) r = R / (1 + i) C) 1 + r = R / (1 + i) D) 1 + r = (1 + R) / (1 + i)
Version 1
6
27) Jack Simmons is expecting to earn a 0.34% return on his investment in a government bond that matures next year. If the inflation rate during the time of maturity is expected to be 7%, what will be the real rate of return from his investment? A) 6.64% B) −3.36% C) −6.22% D) 25%
28) Timothy Keating invested $120 in buying a share of company X in June last year. Although the company had announced that it will not pay any dividend, the price of this share has increased over the last few months and is expected to go up to $140 by June this year. If the expected annual inflation rate in June this year is 5%, calculate Timothy’s real rate of return from this investment if he sells the share at $140. A) 11.11% B) 14.29% C) −22.22% D) −10%
29) Hugh Wallace has the following information regarding three investment options. Each investment option involves the same one-year period. r Investment option 1
R 2%
Investment option 2 Investment option 3
i 1.2% 5%
0.5%
4%
4%
What nominal rate of return will Hugh earn if he invested in option 1?
Version 1
7
A) 0.79% B) 3.22% C) 0.32% D) 6.67%
30) Hugh Wallace has the following information regarding three investment options. Each investment option involves the same one-year period. r Investment option 1
R 2%
1.2%
Investment option 2 Investment option 3
i
5% 0.5%
4%
4%
What real percent rate of return will Hugh earn if he invests in option 2? A) −2.5% B) 4.58% C) 0.96% D) −3.37%
31) Hugh Wallace has the following information regarding three investment options. Each investment option involves the same one-year period. r Investment option 1
R 2%
Investment option 2 Investment option 3
i 1.2% 5%
0.5%
4%
4%
Which of the following statements regarding these investment options is true?
Version 1
8
A) Investment option 1 has the highest nominal rate of return. B) The expected percent rate of interest during the time of maturity of investment option 3 is 1.48%. C) Investment option 3 has the highest expected inflation rate and the lowest real rate of return. D) Adjusted for inflation, investment option 1 is the best choice for Hugh.
32)
If the price index for a particular year is 110, it implies that ________. A) prices have increased by 11% over the base year B) prices have increased by 10% over the base year C) prices have increased by 1.1% over the base year D) prices have increased by 0.10% over the base year
33)
If the price index for a particular year is 75, it implies that ________. A) prices have increased by 7.5% over the base year B) prices have increased by 0.75% over the base year C) prices have decreased by 0.25% compared to the base year D) prices have decreased by 25% compared to the base year
34) The price of a basket of goods in 2000 was $500. If the same basket cost $600 in 2002, identify the correct statement from the following. A) The price index in 2002 must have been 100 points higher than in 2000. B) The price increase between 2000 and 2002 was 10%. C) If 2000 is considered the base year, then the price index in 2002 must have been 120. D) If 2002 is considered the base year, then the price index in 2000 must have been 70.
Version 1
9
35) If pt is the price of good X in period t, and p0 is the base period price, which of the following is the equation for a simple price index for good X? A) B) C) D)
36) Consider the following information about the price and the price index of a popular book over three years. 2009 Price
$9
Price Index
100
2010
2011
$9.72 111.11
What was the price index for this book in 2010? A) 72 B) 92.59 C) 107.2 D) 108
37) Consider the following information about the price and the price index of a popular book over three years. 2009 Price
$9
Price Index
100
2010
2011
$9.72 111.11
What was the approximate price of the book in 2011?
Version 1
10
A) $10 B) $10.80 C) $8.10 D) $10.69
38) Consider the following information about the price and the price index of a popular book over three years. 2009 Price
$9
Price Index
100
2010
2011
$9.72 111.11
Which of the following is the correct statement? A) The price of the book increased by 3.11% between 2010 and 2011. B) If 2011 was the base year, the price index for the book in 2009 would have been 90. C) Between 2009 and 2011, the price of the book went up by 11.11%. D) If 2010 was the base year, the price index in 2009 would have been 99.28.
39) Consider the following information about the price and the price index of a popular book over three years. 2009 Price
$9
Price Index
100
2010
2011
$9.72 111.11
If the price index in 2011 was 95 instead of 111.11, it would imply that ________. A) the price of the book in 2011 was 16.11% lower than its price in 2009 B) 2011 would be considered as the base year C) the price of the book in 2011 was 5% lower than its price in 2009 D) the price of the book in 2011 would be $4
Version 1
11
40) Consider the following information about the price and the price index of a popular book over three years. 2009 Price
$9
Price Index
100
2010
2011
$9.72 111.11
If the price index in 2011 was 95, which of the following is the percent change in the price of the book between 2010 and 2011? A) −12.04% B) −5% C) −2.53% D) −13%
41) The following table provides the values of the simple price index for watches in a country.
Price Index
2006
2007
2008
100
105
108.5
If the base year is revised from 2006 to 2007, which of the following is the updated index for 2008? A) 113.5 B) 96.77 C) 95.24 D) 103.33
42) Consider the following data on the prices of three items, A, B, and C, from 2000 through 2002. 2000
Version 1
Item A
Item B
Item C
$ 200
$ 215
$ 202
12
2001 2002
$ 203 $ 210
$210 $ 209
$ 203.50 $ 207
Which of the following is the unweighted aggregate price index for the three items for 2002, using 2001 as the base year? A) 98.48 B) 101.54 C) 109.5 D) 102.58
43) Consider the following data on the prices of three items, A, B, and C, from 2000 through 2002. 2000 2001 2002
Item A
Item B
Item C
$ 200 $ 203 $ 210
$ 215 $210 $ 209
$ 202 $ 203.50 $ 207
Compute the unweighted price index for the three items for 2002, using 2000 as the base year. A) 98.26 B) 98.48 C) 109 D) 101.46
44) Consider the following data on the prices of three items, A, B, and C, from 2000 through 2002. 2000 2001 2002
Item A
Item B
Item C
$ 200 $ 203 $ 210
$ 215 $210 $ 209
$ 202 $ 203.50 $ 207
Compute the unweighted price index for the three items for 2000, using 2001 as the base year.
Version 1
13
A) 100.08 B) 98.56 C) 100.85 D) 99.92
45) Consider the following data on the prices of three items, A, B, and C, from 2000 through 2002. 2000 2001 2002
Item A
Item B
Item C
$ 200 $ 203 $ 210
$ 215 $210 $ 209
$ 202 $ 203.50 $ 207
If the price of item C in 2002 was 210 instead of 207, which of the following is the unweighted aggregate price index for 2002 for the three items, using 2000 as the base year? A) 98.09 B) 100.45 C) 102.03 D) 101.94
46) Firms A, B, and C operate in the market for computers. Each firm’s product is popular in the market and is distinct from what the other firms offer. The following table shows the prices of laptops from 2005 through 2007. 2005 2006 2007
Firm A
Firm B
Firm C
$ 3,500 $ 3,200 $ 2,850
$ 4,000 $ 3,850 $ 3,500
$ 2,900 $ 3,000 $ 3,200
Which of the following is the unweighted price index for the total price of three types of laptops in 2006, using 2005 as the base year?
Version 1
14
A) 96.63 B) 103.48 C) 91.83 D) 100.96
47) Firms A, B, and C operate in the market for computers. Each firm’s product is popular in the market and distinct from what the other firms offer. The following table shows the prices of laptops from 2005 through 2007. 2005 2006 2007
Firm A
Firm B
Firm C
$ 3,500 $ 3,200 $ 2,850
$ 4,000 $ 3,850 $ 3,500
$ 2,900 $ 3,000 $ 3,200
If 2005 is used as the base year, which of the following statements about the prices of these laptops is true? A) The prices increased by about 4.8% between 2006 and 2007. B) The prices were about 8.17% lower in 2007 than in 2005. C) The prices increased by about 3.36% from 2005 to 2006. D) The increase in the price of firm C’s laptop outweighed the fall in the prices of the others so that the overall index for 2006 was greater than 100.
48) Firms A, B, and C operate in the market for computers. Each firm’s product is popular in the market and distinct from what the other firms offer. The following table shows the prices of laptops from 2005 through 2007. 2005 2006 2007
Firm A
Firm B
Firm C
$ 3,500 $ 3,200 $ 2,850
$ 4,000 $ 3,850 $ 3,500
$ 2,900 $ 3,000 $ 3,200
If 2006 is used as the base year, which of the following statements is true?
Version 1
15
A) The prices in 2007 were 5.02% lower than the prices in 2005. B) The price index for 2005 will be 100, as per the unweighted price index. C) The unweighted price index for 2007 suggests that laptop prices were 8.17% lower than the prices in 2006, as per the unweighted price index. D) The laptop prices in 2005 were 3.48% higher than the prices in 2006, as per the unweighted price index.
49) Firms A, B, and C operate in the market for computers. Each firm’s product is popular in the market and distinct from what the other firms offer. The following table shows the prices of laptops from 2005 through 2007. 2005 2006 2007
Firm A
Firm B
Firm C
$ 3,500 $ 3,200 $ 2,850
$ 4,000 $ 3,850 $ 3,500
$ 2,900 $ 3,000 $ 3,200
Suppose this data was collected from a particular town, where people have a preference for firm C’s laptops. If 2005 is used as the base year, which of the following statements will be true? A) The unweighted price index for 2007 understates the fall in the prices of laptops. B) The unweighted price index for 2006 overstates the fall in the prices of laptops. C) The unweighted price index should only use the data for laptops produced by firms A and B to provide an accurate picture of the change in prices. D) The unweighted price index will better reflect the change in prices over these years than a weighted price index.
50)
Which of the following is the correct expression for the Laspeyres price index? A) B) C) D)
Version 1
16
51) Three firms, X, Y, and Z, operate in the same industry, although their products have different features and are priced differently. The following table provides the prices and the quantities sold (in units) during the period 2000 to 2002. Year
Firm X
Firm Y
Price ($) Quantity
Firm Z
Price ($)
Quantity
Price ($)
Quantity
2000
400
100
600
145
350
120
2001
450
120
612
150
358
129
2002
435
150
630
160
370
145
Which of the following is the Laspeyres price index for 2001, using 2000 as the base year? A) 95.64 B) 113.60 C) 104.56 D) 105.19
52) Three firms, X, Y, and Z, operate in the same industry, although their products have different features and are priced differently. The following table provides the prices and the quantities sold (in units) during the period 2000–2002. Year
Firm X
Firm Y
Price ($) Quantity
Firm Z
Price ($)
Quantity
Price ($)
Quantity
2000
400
100
600
145
350
120
2001
450
120
612
150
358
129
2002
435
150
630
160
370
145
Which of the following is the Laspeyres price index for 2002, using 2000 as the base year? A) 106.07 B) 106.3 C) 94.28 D) 94.08
Version 1
17
53) Three firms, X, Y, and Z, operate in the same industry, although their products have different features and are priced differently. The following table provides the prices and the quantities sold (in units) during the period 2000 to 2002. Year
Firm X
Firm Y
Price ($) Quantity
Firm Z
Price ($)
Quantity
Price ($)
Quantity
2000
400
100
600
145
350
120
2001
450
120
612
150
358
129
2002
435
150
630
160
370
145
Which of the following is the Paasche price index for 2002 if the base period is 2000 and the current period is 2002? A) 94.11 B) 106.26 C) 130 D) 106.07
54) Three firms, X, Y, and Z, operate in the same industry, although their products have different features and are priced differently. The following table provides the prices and the quantities sold (in units) during the period 2000 to 2002. Year
Firm X
Firm Y
Price ($) Quantity
Firm Z
Price ($)
Quantity
Price ($)
Quantity
2000
400
100
600
145
350
120
2001
450
120
612
150
358
129
2002
435
150
630
160
370
145
Compute the Paasche price index for 2001 if the base period is 2000 and the current period is 2002. A) 105.12 B) 94.11 C) 92.85 D) 98.92
55)
Which of the following statements is true about the Laspeyres and Paasche indices?
Version 1
18
A) As long as the evaluation period is the same, the two indices usually yield the same value. B) The difference in the values of the two indices, if any, can usually account for the varying price levels across periods. C) The Laspeyres price index uses the current period quantities as weights, as opposed to the base period quantities used by the Paasche price index. D) The greater the length of time between periods, the higher the chances of variation between the values of the two indices.
56)
A Paasche index with updated weights ________.
A) is likely to produce a lower estimate than a Laspeyres index when prices are increasing B) uses the current period quantities as weights as opposed to the base period quantities used in the Paasche index, which does not use updated weights C) is likely to produce a lower estimate than a Laspeyres index when prices are decreasing D) accounts for inflation but fails to account for changing consumption patterns over time
57) Which of the following statements is true about the comparison between the Laspeyres and Paasche indices? A) The Laspeyres index is more expensive to compute than the Paasche index. B) The data requirements are more stringent for the Laspeyres index than for the Paache index. C) Unlike the Laspeyres index, the Paasche index incorporates current expenditure patterns of consumers. D) Unlike the Paasche index, the Laspeyres index requires that the weights be updated each year and the index numbers be recomputed for all of the previous years.
Version 1
19
58) When Dana Roberts started her job as a librarian two years ago, her annual salary was $40,000. Although she is very good at her job and is now earning 10% more than what she started with, she notices that new hires in the same field are being offered $43,000 as an annual salary this year. Which of the following statements is most likely to be true? A) Dana is earning less than $43,000 this year. B) Real income for librarians in this country has increased, on an average, by $3,000 during the last two years. C) If inflation in the current year is very high, then it is possible that Dana was better off with $40,000 two years back than the new hires are with a salary of $43,000 today. D) Adjusted for inflation in the current year, it is likely that Dana’s income this year is lower than the income of the new hires.
59)
When a given time series is adjusted for changes in prices, it is called a(n) ________. A) deflated series B) unweighted series C) nominal series D) cyclical series
60)
Which of the following statements is true of the CPI? A) It uses the same basket of goods and services as the PPI. B) The quantities of items in the current year are used for computing the weights for the
index. C) It is an unweighted price index used to measure inflation. D) It is based on the prices paid by urban consumers for a representative basket of goods and services.
61)
Which of the following is true about the comparison between the CPI and the PPI?
Version 1
20
A) Both indices are based on the same basket of goods and services. B) Sales and excise taxes are included in the PPI, but they are not included in the CPI. C) The PPI is more commonly used to adjust wages for changes in the cost of living than the CPI. D) Unlike the CPI that uses prices people pay, the PPI uses the prices measured at the wholesale level.
62) Katie Jones started her career with an annual salary of $30,000 in 2006. Five years later, in 2011, her salary had increased to $55,000 a year. If the values of the CPI (with a base of 1982−1984) for 2006 and 2011 are 201.6 and 224.9, respectively, compute the increase in Katie’s real income. A) 64.34% B) 83.33% C) 105.31% D) 164.34%
63) Gulten started her career with an annual salary of $82,000 in 2006. Five years later, in 2011, her salary had increased to $85,000 a year. If the values of the CPI (with a base of 1982−1984) for 2006 and 2011 are 201.6 and 224.9, respectively, compute the increase in Gulten’s real income. A) −7.08% B) 3.66% C) 15.68% D) 92.92%
64) Jones Incorporation, a company manufacturing drills, earned a revenue of $5 million in 2011. This was 6% more than their revenue in 2009. The CPI for 2009 and 2011 was 115 and 122 respectively, while the PPI for 2009 and 2011 was 110 and 114 respectively. Which of the following accurately reflects the real growth in revenues between 2009 and 2011? Version 1
21
A) 6% B) 2.65% C) 0.28% D) 10.25%
65) Which of the following correctly identifies the expression for calculating the inflation rate for period t, based on the CPI? A) B) C) D)
66)
The following table shows the value of CPI for three years in a country.
Price Index
2003
2004
2005
119
123
124.8
Which of the following is the annual inflation rate for 2004? A) 4% B) 1.46% C) 3.36% D) 4.87%
67)
The following table shows the value of CPI for three years in a country.
Price Index
2003
2004
2005
119
123
124.8
Which of the following is the annual inflation rate for 2005?
Version 1
22
A) 1.46% B) 3.36% C) 4.87% D) 1.8%
68) Toyota Motor Corporation, once considered a company synonymous with reliability and customer satisfaction, has been engulfed in a perfect storm with millions of cars recalled. The following table shows the monthly adjusted close price per share of Toyota from October 2009 to March 2010. Date
Adjusted Close Price ($)
October 2009
78.89
November 2009
78.54
December 2009
84.16
January 2010
77.00
February 2010
74.83
March 2010
79.56
Which of the following is the simple price index for March 2010 with November 2009 as the base? A) 101.30 B) 100.85 C) 98.72 D) 99.16
69) Toyota Motor Corporation, once considered a company synonymous with reliability and customer satisfaction, has been engulfed in a perfect storm with millions of cars recalled. The following table shows the monthly adjusted close price per share of Toyota from October 2009 to March 2010. Date
Adjusted Close Price ($)
October 2009
78.89
November 2009
78.54
December 2009
84.16
January 2010
77.00
Version 1
23
February 2010
74.83
March 2010
79.56
Which of the following is the simple price index for February 2010 with December 2009 as the base? A) 112.47 B) 88.91 C) 94.85 D) 105.43
70) Toyota Motor Corporation, once considered a company synonymous with reliability and customer satisfaction, has been engulfed in a perfect storm with millions of cars recalled. The following table shows the monthly adjusted close price per share of Toyota from October 2009 to March 2010. Date
Adjusted Close Price ($)
October 2009
78.89
November 2009
78.54
December 2009
84.16
January 2010
77.00
February 2010
74.83
March 2010
79.56
Which of the following is the percentage price change from October 2009 to December 2009? A) 3.33% B) −6.26% C) 6.68% D) 5.48%
71) Toyota Motor Corporation, once considered a company synonymous with reliability and customer satisfaction, has been engulfed in a perfect storm with millions of cars recalled. The following table shows the monthly adjusted close price per share of Toyota from October 2009 to March 2010. Date October 2009
Version 1
Adjusted Close Price ($) 78.89 24
November 2009
78.54
December 2009
84.16
January 2010
77.00
February 2010
74.83
March 2010
79.56
If the price index in April 2010 was 96.84, using December 2009 as the base, which of the following is the missing price per share for April 2010? A) $86.94 B) $80.84 C) $86.21 D) $81.50
72) An investor bought 1,000 shares of Citigroup in January 2009 for $3.55 a share. She sold all of her shares in December 2009 for $3.31 a share. The inflation rate for 2009 was −0.35%. Which of the following is the investor’s real rate of return? A) −6.43% B) −6.76% C) −5.12% D) 6.43%
73) Kim Back invested $20,000 one year in corporate bonds. Each bond sold for $1,000 and earned a coupon payment of $80 each during the year. The price of the bond at the end of the year dropped to $980. Which of the following is Kim’s investment percent return on each bond? A) −2% B) 6% C) 4% D) −6%
Version 1
25
74) Kim Back invested $20,000 one year in corporate bonds. Each bond sold for $1,000 and earned a coupon payment of $80 each during the year. The price of the bond at the end of the year dropped to $980. Which of the following is Kim’s total dollar gain or loss on her investment? A) $1,600 B) −$400 C) $1,200 D) $1,000
75) Lindsay Kelly bought 100 shares of Google, 300 shares of Microsoft, and 500 shares of Nokia in January 2005. The adjusted close prices of these stocks over the next three years are shown in the accompanying table. 2005 2006 2007
Microsoft
Nokia
$ 195.62 $ 432.66 $ 505.00
$ 24.11 $ 26.14 $ 28.83
$ 13.36 $ 16.54 $ 19.83
Which of the following is the unweighted aggregate price index for Lindsay’s portfolio for 2007, using 2005 as the base year? A) 258.15 B) 116.48 C) 203.93 D) 237.53
76) Lindsay Kelly bought 100 shares of Google, 300 shares of Microsoft, and 500 shares of Nokia in January 2005. The adjusted close prices of these stocks over the next three years are shown in the accompanying table. 2005 2006 2007
Microsoft
Nokia
$ 195.62 $ 432.66 $ 505.00
$ 24.11 $ 26.14 $ 28.83
$ 13.36 $ 16.54 $ 19.83
Which of the following is the weighted aggregate price index for Lindsay’s portfolio for 2007, using 2005 as the base year? Version 1
26
A) 206.32 B) 177.38 C) 116.31 D) 237.53
77) Which of the following are the most commonly used price indices used to deflate economic time series? A) Customer price index and producer price index. B) Consumer price index and simple price index. C) Aggregate price index and simple price index. D) Consumer price index and producer price index.
78)
Which of the following statements about investment return is false? A) It can be distinguished into nominal and real return. B) Nominal return is the sum of capital gains yield and income yield. C) Nominal return for any asset can be computed by using its adjusted closing prices. D) Real return is nominal return adjusted for inflation.
79)
Which of the following statements about a simple price index is false?
A) The index number for the base period is 100. B) A simple price index of less than 100 means there is a price drop with respect to the base year. C) It is expressed as a percentage. D) It provides direct comparisons between non-base years.
Version 1
27
80)
Which of the following is not an example of an aggregate price index? A) S&P 500 index B) Producer price index C) Consumer price index D) Happiness index
81) You bought a single-family house in 2000 for $250,000, which grew to $395,000 in 2019. The consumer price indices for the years 2000 and 2019 are reported as 172.2 and 251.1, respectively. What is the nominal and real increase in house value? A) 0.58, 0.37 B) 0.58, 0.08 C) 0.08, 0.58 D) 0.37, 0.58
82) Nicole Watson purchased a share of McAllister Incorporated, a year back when it was trading at $75. She also earned a dividend of $4 on this. The price of the same share has now increased to $87. a. Compute the annual capital gain from this share. b. Compute the annual income yield from this share.
Version 1
28
83) Rhea Anderson purchased a corporate bond at $500 a year ago, on which she received a coupon payment of $35. The price of the bond has now gone up to $525. a. Compute Rhea’s capital gains yield. b. Compute Rhea’s income yield. c. Compute Rhea’s investment return.
84) Rita Jacob purchased a corporate bond at $650 a year ago. The price of the bond has now increased to $665. a. What is Rita’s capital gains yield from this investment? b. If her investment return is 7%, what coupon payment must she have received?
85) Jake Morris invested $150 in buying a share of ABC Corporation a year back, on which he received a dividend of $20. Now when the share is trading at a different price, Jake computes his investment return to be 10%. a. Compute Jake’s income yield from this investment. b. What is the current price of the share? c. Interpret the result.
Version 1
29
86) The following table provides the adjusted close prices of the shares of firm X for three consecutive months of the last year. Date
Adjusted Close Prices ($)
April
75
May
79
June
83.50
a. Compute the monthly return for May. b. Compute the monthly return for June.
87) The income yield from a one-year infrastructure bond purchased today is 0.5%. Compute the real return from this bond if the inflation rate a year later is expected to be 4% and the capital gains yield is zero.
88) Almas Mohammed paid $50 for a stock of Jones Incorporation and expects to earn an annual dividend of $5. If the price of the stock increases to $55 in a year’s time, compute the real rates of return from this investment if: a. the expected inflation rate a year from now is 1.25%. b. the expected inflation rate a year from now is 3%.
Version 1
30
89) The real return from investing $100 in a stock for a year is computed to be 2%. If the nominal return from this investment is 3.5%, what is the expected inflation rate?
90) Joanna Williams purchased a one-year Treasury bill at $100. This bond does not have any coupon payment, but Joanna will get $105 at maturity. The inflation rate in a year’s time is expected to be 1.3%. a. What is the income yield from this investment? b. What is the nominal rate of return from this investment? c. What is the real rate of return from this investment?
Version 1
31
91) Tammy Welsh has two investment options to choose from. The stocks of both ABC Incorporated and XYZ Corporation are currently trading at $200. ABC Incorporated is paying a dividend of $10, and is expected to trade at $215 in a year’s time. XYZ Corporation is paying a higher dividend of $25. However, the price of XYZ’s stock is only expected to go up to $203 a year from now. Experts are estimating inflation to be about 2% in a year’s time. a. In which stock should Tammy invest and why? b. If the expected inflation rate was revised to 2.5%, would it alter Tammy’s choice? Explain your answer.
92) Suppose the price of a slice of pizza increased from $2 in 2009 to $2.25 in 2010, and then to $3 in 2011. a. Compute the percentage change in the price of a slice of pizza between 2009 and 2011. b. Compute the simple price index for 2011 if 2009 is the base year. c. Compute the simple price index for 2009 if 2010 is the base year.
93) Using 2008 as the base year, the simple price index for an algebra book in 2009 was computed to be 105. If the price of the book was $50 in 2009, compute the price in the base year.
Version 1
32
94) Suppose the simple price index for a good was 112 in 2008, using 2007 as the base year. In 2009 however, the simple price index for the same good was 98. a. What does this information imply? b. If the price of the good was $20 in 2008, what was the price in 2009?
95)
The following table provides the prices of a good over three consecutive years.
Year Price
2009 $5
2010 $ 6.50
2011 $ 7
a. Compute the simple price index for 2010 using 2009 as the base year. b. If the base year is changed to 2010, compute the simple price index for 2011.
96) Consider the following table that provides the simple price indices for an item from 2006 through 2008. Year Simple Price Index
Version 1
2006
2007
2008
100
102
103.5
33
a. If the price of the good in 2006 was $8, compute the price in 2008. b. If the base year is changed to 2007, compute the value of the updated simple price index in 2008.
97) Consider the following table that provides the prices for three different brands of chocolate chip cookies during 2008 to 2009 in a particular country. (Prices are in the country’s local currency.) Year
Brand A
Brand B
Brand C
2008
4
4.50
3.50
2009
5
4.75
3.50
a. Compute the unweighted aggregate price index for 2009, using 2008 as the base year. b. Interpret the result.
98) Consider the following table providing the prices for three different brands of chocolate chip cookies during 2009 to 2010 in a particular country. (Prices are in the country’s local currency.) Year
Brand A
Brand B
Brand C
2009
5
4.75
3.50
2010
3
4
3
Version 1
34
a. Compute the unweighted aggregate price index for 2010, using 2009 as the base year. b. Interpret the result.
99)
The following table provides the prices of three brands of a good during 2009 and 2010. Year 2009 2010
Brand X $ 10 $ 10.20
Brand Y $ 9.45 $ 9.35
Brand Z $ 11 $ 10.90
a. Compute the unweighted aggregate price index for 2010, with 2009 as the base year. b. Interpret the result. c. Is the result an accurate representation of the changes?
100) The following table provides the prices of the tickets for three types of water rides at an amusement park. Year 2005 2006 2007
Ride 1 $ 4 $ 4 $ 4.25
Ride 2 $ 3 $ 3 $ 3.20
Ride 3 $ 4 $ 4.50 $ 4.25
a. Compute the unweighted aggregate price index for 2010, using 2009 as the base year. b. Compute the unweighted aggregate price index for 2011, using 2010 as the base year.
Version 1
35
101) The following table provides the price and quantity data for three brands of a particular good during 2010 and 2011. Year
Brand X Price ($)
Brand Y
Brand Z
Quantity
Price ($)
Quantity
Price ($)
Quantity
2010
10
100
9.45
120
11
90
2011
10.20
90
9.05
135
10.98
105
a. Use 2010 as the base year and compute the weighted aggregate price index for 2011 using the Laspeyres method. b. Interpret the result.
102) The following table provides the price and quantity data for three brands of a particular good during 2010 and 2011. Year
Brand X Price ($)
Brand Y
Brand Z
Quantity
Price ($)
Quantity
Price ($)
Quantity
2010
10
100
9.45
120
11
90
2011
10.20
90
9.05
135
10.98
105
a. Use 2010 as the base year and compute the weighted aggregate price index for 2011 using the Paasche method. b. Interpret the result.
Version 1
36
103) Consider the following table. It provides the price and quantity data for three brands of a good. Year
Brand A Price ($)
Brand B
Brand C
Quantity
Price ($)
Quantity
Price ($)
Quantity
2008
11
120
9.45
120
10
110
2009
11.45
90
9.90
110
10.90
100
a. Compute the weighted aggregate price index using the Laspeyres method with 2008 as the base year. b. Compute the weighted aggregate price index using the Paasche method with 2008 as the base year. c. Suppose the new price and quantity data was available for the base year of 2000 instead of 2008. Would this lead to a substantial difference between the two indices? Explain your answer.
104) Consider the following information regarding the price and quantity of three brands of a good sold in 1999 and 2010. Year
Brand 1
Brand 2
Price ($) Quantity Price ($) Quantity
Brand 3 Price ($)
Quantity
1999
6
100
5.50
120
4.75
110
2010
11
60
6.75
110
7
120
Version 1
37
a. Compute the weighted aggregate price index using the Laspeyres method with 1999 as the base year. b. Compute the weighted aggregate price index using the Paasche method with 1999 as the base year. c. Interpret the result.
105) Consider the following price (in dollars) and quantity data for two items during 2010 and 2011. Item 1
Item 2
Year
Price ($)
Quantity
2010
12
130
11.50
120
150
10
140
2011
Price ($)
Quantity
If the value of the Laspeyres price index for 2011, using 2010 as the base year, is 89.46, find the price of item 1 in 2011.
106) Consider the following price (in dollars) and quantity data for two items during 2010 and 2011. Item 1
Item 2
Year
Price ($)
Quantity
2010
11.52
120
2011
Version 1
1450
Price ($)
Quantity 150
8.50
170
38
If the value of the Paasche price index for 2011 is 90.61 (base year 2010), find the price of item 2 in 2010.
107) Amy Peterson’s annual salary when she started working in 2006 was $35,000. In 2011, her annual salary had increased to $44,000. Using 2006 as the base year, the consumer price index for 2011 was computed to be 112. Use this information to determine if Amy was better off in 2006 or in 2011.
108) Alex Aliyev is a primary school teacher. When he started working in 2004, he was earning an annual salary of $36,000. In 2011, his salary was $48,000. The consumer price indices for 2004 and 2011 were 108 and 120, respectively. a. Compute the nominal increase in Alex’s salary between 2004 and 2011. b. Compute the real increase in Alex’s salary between 2004 and 2011. c. Interpret the result.
109) Sales revenue of firm X was $10 million in 2007. In 2010, the same firm reported higher sales revenue of $15 million. The producer price indices for 2007 and 2010 were 130 and 135. a. Compute the nominal increase in sales revenue between 2007 and 2010. b. Compute the real increase in sales revenue between 2007 and 2010.
Version 1
39
110)
Consider the following data on the consumer price index in a country.
Year
2004
2005
2006
CPI
190
193
200
a. Compute the annual inflation for 2005. b. Compute the annual inflation for 2006.
111) Boris Arshavin invested $1 million in buying a house in 2002, and sold it for $2.5 million in 2005. The consumer price index for 2002 and 2005 were 180 and 210, respectively. Compute the real return Boris earned from selling the house.
112) You bought Apple, Incorporation’s stock for $55 a share a year ago and it is selling for $115 a year later. During this year, you received a dividend of $0.1925 for the first two quarters and $0.205 for the last two quarters. a. What is your capital gains yield? b. What is your income yield? c. What is your investment return? d. What is the real rate of return if the inflation rate is expected to be 0.62%?
113) The only possible income from an investment is the direct cash payment from the underlying asset. ⊚ true ⊚ false
Version 1
40
114)
As long as an investor does not sell an asset, there is no capital gain or loss involved. ⊚ true ⊚ false
115) Rates of return expressed in nominal terms do not represent a true picture because they are not adjusted for inflation. ⊚ true ⊚ false
116)
If the nominal rate of return is positive, then the real rate of return must also be positive. ⊚ true ⊚ false
117)
There is no need to have a constant base value for a price index. ⊚ true ⊚ false
118) If the year 2000 is used as a base year and the value of the price index in 2001 is 105, it implies that there was a 5% increase in the price level in 2001. ⊚ true ⊚ false
119) Index numbers provide direct comparisons of prices not only with respect to the base year, but also between non-base years. ⊚ true ⊚ false
Version 1
41
120) The aggregate price index is used to represent relative price movements for a group of items. ⊚ true ⊚ false
121) An unweighted price index for different types of properties will unfairly treat all property prices equally. ⊚ true ⊚ false
122) The weighted aggregate price index assigns a lower weight to the items that are sold in higher quantities. ⊚ true ⊚ false
123) The Laspeyres price index uses the quantities evaluated in the base period to compute the weighted aggregate price index. ⊚ true ⊚ false
124) Because both the Laspeyres and Paasche indices are weighted aggregate price indices, they will yield the same value for a given evaluation period. ⊚ true ⊚ false
Version 1
42
125) The Laspeyres and Paasche indices tend to differ when the length of time between the periods increases since the relative quantities of items adjust to the changes in consumer demand over time. ⊚ true ⊚ false
126) Price indices are used to remove the effect of inflation so that business and economic time series can be evaluated in a more meaningful way. ⊚ true ⊚ false
127) The Consumer Price Index (CPI) and the Producer Price Index (PPI) must always move in sync with each other. ⊚ true ⊚ false
Version 1
43
Answer Key Test name: Chap 19_4e 1) a. Year Price Index
2015 100
2016 109.3085
2017 88.8298
2018 106.1170
2019 82.1809
b. The price of whole milk in 2016 was 109.3085% of what it was in 2015 or 9.3085% higher. The price of whole milk in 2019 was 82.1809% of what it was in 2015, or 17.8192% lower. c. Year Price Index
2015 112.5749
2016 123.0539
2017 100
2018 119.4611
2019 92.5150
d. Changing the base year from 2015 to 2017 allows one to use the price indices with the revised base of 2017 to directly compare prices of whole milk between 2017 and other non-base years.
Version 1
44
2) a. Using 2015 as the base year, the unweighted aggregate price index for 2016 = aggregate price of milk for 2016/aggregate price of milk for 2015 = (4.11 + 3.49)/(3.76 + 3.09) × 100 = 110.9489. The unweighted aggregate price index for the other years is computed in the same way. b. Given 2015 as the base year, the unweighted aggregate price index for milk in 2018 was 118.9781, which means that the overall price of milk in 2018 was 118.9781% of what it was in 2015, or 8.9781% higher. Similarly, the unweighted aggregate price index for milk in 2019 was 87.2993%, which means that the overall price of milk in 2019 was 87.2993% of what it was in 2015, or 12.7007% lower. c. The Laspeyres price index evaluates the quantities in the base period which is the number of gallons of milk sold in 2015. The weighted price in 2015 is calculated as 3.76 × 14451 + 3.09 × 16759 = 106121. The Laspeyres price index for 2016 is = weighted price in 2016/weighted price in 2015 = (4.11 × 14451 + 3.49 × 16759)/106121 = 111.0830. The Laspeyres price index for the other years is computed in the same way. 2015 100
2016 110.0830
2017 103.1244
2018 120.0298
2019 87.7178
d. The Paasche price index evaluates the quantities in the current period, which is the number of gallons of milk sold in 2019. The weighted price in 2015 is calculated as 3.76 × 16119 + 3.09 × 15281 = 107826. The Paasche price index for 2016 is = weighted price in 2016/weighted price in 2015 = (4.11 × 16119 + 3.49 × 15281)/107826 = 110.9010. The Paasche price index for the other years is computed in the same way. 2015 100
Version 1
2016 110.9010
2017 101.6577
2018 118.6023
2019 87.1497
45
e. The Laspeyres and Paasche price indices provide similar results. The overall price of milk is higher between 2015 and other years except 2019. 3) price change 4) dividends 5) Fisher 6) approximation 7) percentage 8) 177.4 9) aggregate 10) Laspeyres 11) current 12) deflated 13) inflation 14) C 15) D 16) A 17) C 18) A 19) C 20) A 21) D 22) C 23) A 24) C 25) B 26) D 27) C 28) A Version 1
46
29) B 30) C 31) D 32) B 33) D 34) C 35) A 36) D 37) A 38) C 39) C 40) A 41) D 42) B 43) D 44) A 45) D 46) A 47) B 48) D 49) B 50) C 51) C 52) A 53) B 54) A 55) D 56) A 57) C 58) C Version 1
47
59) A 60) D 61) D 62) A 63) A 64) B 65) C 66) C 67) A 68) A 69) B 70) C 71) D 72) A 73) B 74) C 75) D 76) A 77) D 78) C 79) D 80) D 81) B 82) a. 16% b. 5.33% 83) a. 5% b. 7% c. 12%
Version 1
48
84) a. 2.31% b. Approximately $30.50 85) a. 13.33% b. $145 c. Because income yield is 13.33% and the investment return is 10%, capital gains yield must be −3.33%. This implies that, although the price of the share fell to $145 this year, the decline in the share price was lower than the dividend paid to Jake. He therefore earned a positive investment return in spite of a negative capital gains yield. 86) a. 5.33% b. 5.70% 87) −3.37% 88) a. 18.52% b. 16.50% 89) 1.47% 90) a. 0% b. 5% c. 3.65% 91) a. Tammy should invest in XYZ Corporation because it has a higher real return. b. No, an increase in the expected inflation rate will not affect Tammy’s choice between these two options. A higher expected inflation rate will reduce the real return on both investments, but because the nominal returns will remain unchanged, the real return from XYZ will still be higher. 92) a. 50% b. 150 c. 88.89 93) $47.62 94) a. It implies that although the price of the good increased by 12% in 2008, as compared to 2007, the price of the good in 2009 was 2% lower than the price in 2007. b. $17.50.
Version 1
49
95) a. 130 b. 107.69 96) a. $8.28 b. 101.47 97) a. 110.42 b. Since the unweighted aggregate price index for 2009 is 110.42, it implies that the aggregate price of these three brands of cookies was 10.42% higher than the prices in 2008, which is the base year. 98) a. 75.47 b. The table shows that the prices of all three brands declined in 2010, although the decline was not uniform across the three brands. The unweighted aggregate price index for 2010, using 2009 as the base year, tells us that the aggregate price of these three brands of chocolate chip cookies in 2010 was 75.47% of the price in 2009. 99) a. 100 b. An unweighted aggregate price index of 100 implies that, in comparison to 2009, the aggregate prices have not changed in 2010. c. No. The index suggests that, on average, there have been no changes in the price. This might be true only if the three market shares were the same, which occurs quite rarely in practice. 100) a. 104.55 b. 101.74 101) a. 98.82 b. The value of the Laspeyres price index for 2011 is 98.82, which implies that the aggregate prices of this particular good have declined by 1.18% in 2011 as compared to 2010. 102) a. 98.60 b. The value of the Paasche price index for 2011 is 98.60, which implies that the aggregate prices of this particular good have declined by 1.4% in 2011 as compared to 2010.
Version 1
50
103) a. 105.82 b. 105.94 c. Yes, the difference between the two indices is likely to have been higher if the comparison was between 2000 and 2009, instead of between 2008 and 2009. In general, both indices are similar if the periods being compared are not too far apart. However, the longer the time gap, the greater the chances of the quantity consumed being markedly different in the two periods because consumers get the time to adjust their consumption patterns. This causes the two indices to vary if they use substantially different quantities as weights. 104) a. 150.35 b. 146.09 c. The two indices tend to differ when the length of time between the periods increases since the relative quantities of items (weights) adjust to the changes in consumer demand over time. Consumers tend to adjust their consumption patterns by decreasing (increasing) the quantity of items that undergo a larger relative price increase (decrease). For instance, a sharp increase in the price of an item is typically accompanied by a decrease in the quantity demanded, making its relative weight go down in value. Similarly, a sharp decrease in the price of an item will make its relative weight go up. Therefore, a Paasche index that uses the updated weights theoretically produces a lower estimate than a Laspeyres index when prices are increasing and a higher estimate when prices are decreasing. This is consistent with our result here—the value of the Paasche index is lower than the value of the Laspeyres index because it reflects the adjustment in consumption patterns in response to the increasing prices over the years. 105) $11 106) $5.26
107) Amy was better off in 2011 because her price-adjusted salary in 2011 was $39,285.71, which was higher than what she received in 2006.
Version 1
51
108) a. 33.33% b. 20% c. Although Alex’s nominal salary increased by 33.33% between 2004 and 2011, this increase did not take into account the changes in price levels between these two years. When adjusted for changes in the price level, it can be seen that the increase in his purchasing power was lower because his real salary in 2011 was higher than 2004 only by 20%, not 33.33%. 109) a. 50% b. 44.44% 110) a. 1.58% b. 3.63%
111) 114% 112) a. 109.0909% b. 1.4455% c. 110.5364% d. 109.2391% 113) FALSE 114) FALSE 115) TRUE 116) FALSE 117) FALSE 118) TRUE 119) FALSE 120) TRUE 121) TRUE 122) FALSE 123) TRUE 124) FALSE 125) TRUE Version 1
52
126) TRUE 127) FALSE
Version 1
53
CHAPTER 20 – 4e 1) Nonparametric tests use fewer and weaker ________ than those associated with parametric tests.
2) When conducting a hypothesis test for the _____________ m, we want to determine whether m is not equal to, greater than, or less than m0, the value of the population median postulated in the null hypothesis.
3) If we wish to compare central tendencies from nonnormal populations, then the Wilcoxon signed-rank test serves as the nonparametric counterpart to the __________ test.
4) The Wilcoxon rank-sum test is based completely on the ________ of the observations from the two independent samples.
5) The Kruskal-Wallis test is a nonparametric alternative to the ________ test that can be used when the assumptions of normality and/or equal population variances cannot be validated.
6) For the Spearman rank correlation test we can use the normal distribution approximation to implement a z test only if n is greater or equal to ________.
7)
The Kruskal-Wallis test is always a ________-tailed test.
8) If we have a matched-pairs sample of ordinal data, we can use the sign test to determine whether there are significant differences between the _______.
Version 1
1
9) In the Wald-Wolfowitz test a ________ is an uninterrupted sequence of one letter, symbol, or attribute.
10) The method of runs above and below the median is especially useful in detecting trends and cyclical patterns in ________ data.
11) For the Wilcoxon signed-rank test with n ≥ 10, the sampling distribution of the test statistic can be approximated by the normal distribution with mean ________. A) μT = 2n(2n + 1)/2 B) μT = n(n + 1)/4 C) μT = n(n – 1)/4 D) μT = n(2n + 1)/3
12) For the Wilcoxon signed-rank test with n ≥ 10, the sampling distribution of the test statistic can be approximated by the normal distribution with standard deviation ________. A) B) C) D)
13) For the Wilcoxon rank-sum test when the sample sizes are at least 10, the sampling distribution of the test statistic can be approximated by the normal distribution with mean ________.
Version 1
2
A) B) C) D)
14) For the Wilcoxon rank-sum test when the sample sizes are at least 10, the sampling distribution of the test statistic can be approximated by the normal distribution with standard deviation ________. A) B) C) D)
15) For the Kruskal-Wallis test with ni ≥ 5, the sampling distribution of the test statistic, H, can be approximated by the ________. A) normal distribution B) F distribution C) chi-square distribution D) t distribution
16)
Which of the following is the test statistic for the Kruskal-Wallis test?
Version 1
3
A)
B) C)
D)
17)
Which of the following is the sample Spearman rank correlation coefficient?
A) B) C) D)
18) If n ≥ 10, the sampling distribution of the Spearman rank correlation coefficient can be approximated by the normal distribution with ________. A) µ = 0 and B) µ = 1 and C) µ = 0 and D) µ = 0 and
Version 1
4
19)
<p>The test statistic for the sign test is
where ________.
A) is the sample proportion of plus signs</p> B) is the sample proportion of minus signs</p> C) is the sample mean of plus signs</p> D) is the sample mean of minus signs</p>
20)
Which of the following is the test statistic for the sign test?
A) B) C) D)
21)
The nonparametric test for a single population median is known as the ________. A) Wilcoxon signed-rank test B) Wilcoxon rank-sum test C) Kruskal-Wallis test D) Wald-Wolfowitz runs test
22) The nonparametric test for two population medians under independent sampling is known as ________.
Version 1
5
A) Wilcoxon signed-rank test B) Wilcoxon rank-sum test C) Kruskal-Wallis test D) Wald-Wolfowitz runs test
23) The nonparametric test for the correlation between two variables is known as the ________. A) Wilcoxon signed-rank test B) Wilcoxon rank-sum test C) Kruskal-Wallis test D) Spearman’s rank correlation test
24) The nonparametric test for ordinal data based on matched-pairs sampling is the ________. A) Wilcoxon signed-rank test B) Wald-Wolfowitz runs test C) Sign test D) Spearman’s rank correlation test
25)
The nonparametric test to determine if a sequence is random is ________. A) Wilcoxon signed-rank test B) Wald-Wolfowitz runs test C) the sign test D) Spearman’s rank correlation test
Version 1
6
26) For the Wald-Wolfowitz runs test with and , the sampling distribution of the test statistic R can be approximated by the normal distribution with mean of ________. A) B) C) D)
27) For the Wald-Wolfowitz runs test with and , the sampling distribution of the test statistic R can be approximated by the normal distribution with standard deviation ________. A) B) C) D)
28) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175 130 225 240
Version 1
7
Which of the following are the competing hypotheses to determine if the median sale price is greater than $150? A) H0:m ≤ 150; HA:m > 150 B) H0:m ≥ 150; HA:m < 150 C) H0:m ≠ 150; HA:m = 150 D) H0:m = 150; HA:m ≠ 150
29) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175 130 225 240
Which of the following is the value of the test statistic? A) 6 B) 14 C) 20 D) 1
30) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175
Version 1
8
130 225 240
At the 5% significance level, the critical value is ________. A) 2 B) 19 C) 0 D) 28
31) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175 130 225 240
Using the critical value approach and
, the appropriate conclusion is ________.
A) to reject the null hypothesis; conclude the median sale price is greater than $150 B) do not reject the null hypothesis; we cannot conclude the median sale price is greater than $150 C) to reject the null hypothesis; we cannot conclude the median sale price is greater than $150 D) do not reject the null hypothesis; conclude the median sale price is greater than $150
Version 1
9
32) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175 130 225 240
The p-value for the test is ________. A) less than 0.01 B) between 0.025 and 0.01 C) between 0.05 and 0.025 D) greater than 0.05
33) A pawn shop claims to sell used Kindles for about the same price as Amazon or eBay. Both online retailers sell used Kindles for around $150. Below are recent Kindle sales prices for the pawn shop. Price ($) 260 250 175 130 225 240
Using the p-value approach and α = 0.05, the appropriate conclusion is ________.
Version 1
10
A) to reject the null hypothesis; we cannot conclude the median sale price is greater than $150 B) do not reject the null hypothesis; we cannot conclude the median sale price is greater than $150 C) do not reject the null hypothesis; we can conclude the median sale price is greater than $150 D) to reject the null hypothesis; we can conclude the median sale price is greater than $150
34) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median numbers worked per week differ from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic is T = T+ = 153. Which of the following are the competing hypotheses to determine if the median number of hours worked per week differs from 100 hours? A) H0:m ≤ 100; HA:m > 100 B) H0:m ≥ 100; HA:m < 100 C) H0:m ≠ 100; HA:m = 100 D) H0:m = 100; HA:m ≠ 100
35) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median numbers worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic is T = T+ = 153. Given the sample size, the sampling distribution of T can be approximated by the normal distribution with mean and standard deviation ________, respectively. A) 85.50, 527.25 B) 22.96, 527.25 C) 22.96, 85.50 D) 85.50, 22.96
Version 1
11
36) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median number of hours worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic is T = T+ = 153. If the sampling distribution of T can be approximated by the normal distribution, the value of the z test statistic is ________. A) 153 B) 2.94 C) 0.128 D) −2.94
37) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median number of hours worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic is T = T+ = 153.At the 1% significance level, the critical value is ________. A) 1.645 B) 1.96 C) 2.33 D) 2.575
38) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median number of hours worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic isT = T+ = 153. Using the critical value approach and α = 0.01, the appropriate conclusion is ________.
Version 1
12
A) to reject the null hypothesis; we cannot conclude the median number of hours worked per week differs from 100 B) do not reject the null hypothesis; we can conclude the median number of hours worked per week differs from 100 C) do not reject the null hypothesis; we cannot conclude the median number of hours worked per week differs from 100 D) to reject the null hypothesis; we can conclude the median number of hours worked per week differs from 100
39) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median number of hours worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic is T = T+ = 153. The p-value for the test is _______. A) 0.0016 B) 0.0032 C) 0.9984 D) 0
40) A trading magazine wants to determine the number of hours stockbrokers work each week. In particular, the trading magazine wants to determine if the median number of hours worked per week differs from 100 hours. The magazine samples 18 traders. For the Wilcoxon signed-rank test, the value of the test statistic isT = T+ = 153. Using the p-value approach and α = 0.01, the appropriate conclusion is ________.
Version 1
13
A) to reject the null hypothesis; we can conclude the median number of hours worked per week differs from 100 B) to reject the null hypothesis; we cannot conclude the median number of hours worked per week differs from 100 C) do not reject the null hypothesis; we can conclude the median number of hours worked per week differs from 100 D) do not reject the null hypothesis; we cannot conclude the median number of hours worked per week differs from 100
41) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training. Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
For the Wilcoxon signed-rank test for the matched-pairs sample, the competing hypotheses are ________. A) H0:mD ≤ 0;HA:mD > 0 B) H0:mD = 0;HA:mD ≠ 0 C) H0:mD ≠ 0;HA:mD = 0 D) H0:mD ≥ 0;HA:mD < 0
Version 1
14
42) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training. Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
The value of the test statistic is ________. A) 5 B) 23 C) 7 D) 13
43) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training. Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
At the 5% significance level, the critical value is ______.
Version 1
15
A) 25 B) 3 C) 2 D) 26
44) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training. Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
Using the critical value approach and α = 0.05, the appropriate conclusion is ________. A) reject the null hypothesis; we cannot conclude the median difference of the times is greater than zero B) reject the null hypothesis; we can conclude the median difference of the times is greater than zero C) do not reject the null hypothesis; we cannot conclude the median difference of the times is greater than zero D) do not reject the null hypothesis; we can conclude the median difference of the times is greater than zero
45) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training.
Version 1
16
Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
The p-value for the test is ________. A) less than 0.01 B) between 0.01 and 0.025 C) between 0.025 and 0.05 D) greater than 0.05
46) A company that produces financial accounting software wants to offer better training to its customers. The training is intended to decrease the amount of time required to do complicated accounting calculations. For seven individuals, the amount of time (in minutes) to complete a complicated calculation is determined before and after completing the new training. Before
After
24
11
20
18
19
23
20
15
13
16
28
22
15
8
Using the p-value approach and α = 0.05, the appropriate conclusion is ________.
Version 1
17
A) do not reject the null hypothesis; we cannot conclude the median difference of the times is greater than zero B) do not reject the null hypothesis; we can conclude the median difference of the times is greater than zero C) reject the null hypothesis; we cannot conclude the median difference of the times is greater than zero D) reject the null hypothesis; we can conclude the median difference of the times is greater than zero
47) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds are greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. For the Wilcoxon signed-rank test for the matchedpairs, the competing hypotheses are ________. A) H0:mD = 0; HA:mD ≠ 0 B) H0:mD ≤ 0; HA:mD > 0 C) H0:mD ≠ 0; HA:mD = 0 D) H0:mD ≥ 0; HA:mD < 0
48) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds are greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. Given the sample size, the sampling distribution of T can be approximated by the normal disribution as the sample size is greater than 10, T can be assumed to follow the normal distribution with mean and standard deviation ________, respectively.
Version 1
18
A) 126.5 and 30.80 B) 30.80 and 126.5 C) 125.5 and 948.75 D) 30.80 and 984.75
49) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds are greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. Assuming that the sampling distribution of T can be approximated by the normal distribution, the value of the z test statistic is ________. A) 0.062 B) 22 C) 185 D) 1.899
50) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds is greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic isT = T+ = 185. At the 10% significance level, the critical value is ________. A) 1.645 B) 1.96 C) 2.33 D) 2.575
Version 1
19
51) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds is greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. Using the critical value approach and α = 0.10, the appropriate conclusion is _______. A) to reject the null hypothesis; we cannot conclude the median difference of the returns is greater than zero B) do not reject the null hypothesis; we cannot conclude the median difference of the returns is greater than zero C) to reject the null hypothesis; we can conclude the median difference of the returns is greater than zero D) do not reject the null hypothesis; we can conclude the median difference of the returns is greater than zero
52) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds is greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. The p-value for the test is ________. A) greater than 0.10 B) between 0.10 and 0.05 C) between 0.05 and 0.01 D) less than 0.01
53) Investment institutions usually have funds with different risk versus reward prospectuses. A trading magazine wants to determine if the returns of high-risk funds is greater than low-risk funds. The magazine records the return of high- and low-risk funds for a sample of 22 institutions. For the Wilcoxon signed-rank test, where D = high-risk return − low-risk return, the value of the test statistic is T = T+ = 185. Using the p-value approach and α = 0.10, the appropriate conclusion is ________.
Version 1
20
A) to reject the null hypothesis; we can conclude the median difference of the returns is greater than zero B) to reject the null hypothesis; we cannot conclude the median difference of the returns is greater than zero C) do not reject the null hypothesis; we cannot conclude the median difference of the returns is greater than zero D) do not reject the null hypothesis; we can conclude the median difference of the returns is greater than zero
54) An accountant wants to know if the property taxes paid by clients who live in the city are different from those who live in the county. The property taxes paid by five clients from the city (sample 1) and five clients from the county (sample 2) are shown below. City $2,570 $3,000 $2,600 $3,500 $2,900
County $2,700 $2,500 $3,100 $2,400 $1,750
To determine whether the medians of the taxes paid differ using the Wilcoxon rank-sum test, the competing hypotheses are ________. A) H0: m1 − m2 ≥ 0; HA: m1 − m2 < 0 B) H0: m1 − m2 ≤ 0; HA: m1 − m2 > 0 C) H0: m1 − m2 ≠ 0; HA: m1 − m2 = 0 D) H0: m1 − m2 = 0; HA: m1 − m2 ≠ 0
55) An accountant wants to know if the property taxes paid by clients who live in the city are different from those who live in the county. The property taxes paid by five clients from the city (sample 1) and five clients from the county (sample 2) are shown below. City $2,570 $3,000 $2,600 $3,500
Version 1
County $2,700 $2,500 $3,100 $2,400
21
$2,900
$1,750
The value of the test statistic is ________. A) 10 B) 21 C) 34 D) 55
56) An accountant wants to know if the property taxes paid by clients who live in the city are different from those who live in the county. The property taxes paid by five clients from the city (sample 1) and five clients from the county (sample 2) are shown below. City $2,570 $3,000 $2,600 $3,500 $2,900
County $2,700 $2,500 $3,100 $2,400 $1,750
At the 5% significance level, the critical values are ________. A) 18 and 37 B) 19 and 36 C) 18 and 36 D) 19 and 37
57) An accountant wants to know if the property taxes paid by clients who live in the city are different from those who live in the country. The property taxes paid by five clients from the city (sample 1) and five clients from the country (sample 2) are shown below. City $2,570 $3,000 $2,600 $3,500 $2,900
County $2,700 $2,500 $3,100 $2,400 $1,750
Using the critical value approach and α = 0.05, the appropriate conclusion is ________.
Version 1
22
A) reject the null hypothesis; we cannot conclude the median taxes paid in the city differs from the median taxes paid in the country B) do not reject the null hypothesis; conclude the median taxes paid in the city differs from the median taxes paid in the country C) reject the null hypothesis; we can conclude the median taxes paid in the city differs from the median taxes paid in the country D) do not reject the null hypothesis; we cannot conclude the median taxes paid in the city differs from the median taxes paid in the country
58) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
To determine whether the median return of growth funds is larger than the median return of value funds, the competing hypotheses are ________. A) H0: m1 − m2 ≥ 0; HA: m1 − m2 < 0 B) H0: m1 − m2 ≤ 0; HA: m1 − m2 > 0 C) H0: m1 − m2 ≠ 0; HA: m1 − m2 = 0 D) H0: m1 − m2 = 0; HA: m1 − m2 ≠ 0
59) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
Version 1
23
The value of the test statistic W is ________. A) 215 B) 9,870 C) 13,350 D) 23,200
60) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
Because both sample sizes are at least 10, the sampling distribution of W can be approximated by the normal distribution with a mean and standard deviation of ________, respectively. A) 434.74 and 8100 B) 8100 and 434.74 C) 8100 and 189000 D) 189000 and 8100
61) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
Version 1
24
Assuming that the sampling distribution of W can be approximated by the normal distribution, the value of the test statistic is _____. A) 4.07 B) −4.07 C) −12.07 D) 12.07
62) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
At the 1% significance level, the critical value is ________. A) 1.645 B) 1.96 C) 2.33 D) 2.575
63) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
Using the critical value approach and α = 0.01, the appropriate conclusion is ________.
Version 1
25
A) reject the null hypothesis; we can conclude the median return of growth stocks is greater than the median return of value stocks B) do not reject the null hypothesis; we cannot conclude the median return of growth stocks is greater than the median return of value stocks C) do not reject the null hypothesis; we can conclude the median return of growth stocks is greater than the median return of value stocks D) reject the null hypothesis; we cannot conclude the median return of growth stocks is greater than the median return of value stocks
64) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
The p-value for the test is ________. A) greater than 0.10 B) between 0.10 and 0.05 C) between 0.05 and 0.01 D) less than 0.01
65) A fund manager wants to know if the annual rate of return is greater for growth stocks (sample 1) than for value stocks (sample 2). The fund manager collects data on the returns of growth and value funds. Below are the sample sizes and rank sums for the Wilcoxon rank-sum test. Growth
Value
Sample Size
75
140
Sum of Ranks
9,870
13,350
Version 1
26
Using the p-value approach and α = 0.01, the appropriate conclusion is ________. A) do not reject the null hypothesis; conclude the median return of growth stock is greater than the median return of value B) do not reject the null hypothesis; we cannot conclude the median return of growth stock differs from the median return of value stocks C) reject the null hypothesis; conclude the median return of growth stock is greater than the median return of value stocks D) reject the null hypothesis; we cannot conclude the median return of growth stock is greater than the median return of value stocks
66) A marketing firm needs to replace its existing network provider and is considering three different providers. A primary consideration is the amount of system downtime. The following table contains the amount of network downtime (in hours) for the last five months for each network provider. A
B
C
4.0
6.9
0.5
3.7
11.3
1.4
5.1
21.7
1.0
2.0
9.2
1.7
4.6
6.5
3.6
For the Kruskal-Wallis test, the competing hypotheses are ________. A) H0:mA = mB = mC;HA: Not all population medians are equal B) H0:mA = mB = mC;HA:mA ≠ mC C) H0:mA = mB = mC;HA:mA ≠ mB D) H0:mA = mB = mC;HA:mB ≠ mC
67) A marketing firm needs to replace its existing network provider and is considering three different providers. A primary consideration is the amount of system downtime. The following table contains the amount of network downtime (in hours) for the last five months for each network provider.
Version 1
27
A
B
C
4.0
6.9
0.5
3.7
11.3
1.4
5.1
21.7
1.0
2.0
9.2
1.7
4.6
6.5
3.6
The test statistic value is ________. A) 7.8 B) 13.0 C) 3.2 D) 12.02
68) A marketing firm needs to replace its existing network provider and is considering three different providers. A primary consideration is the amount of system downtime. The following table contains the amount of network downtime (in hours) for the last five months for each network provider. A
B
C
4.0
6.9
0.5
3.7
11.3
1.4
5.1
21.7
1.0
2.0
9.2
1.7
4.6
6.5
3.6
At the 5% significance level, the critical value is ________. A) 4.605 B) 5.991 C) 7.378 D) 9.21
Version 1
28
69) A marketing firm needs to replace its existing network provider and is considering three different providers. A primary consideration is the amount of system downtime. The following table contains the amount of network downtime (in hours) for the last five months for each network provider. A
B
C
4.0
6.9
0.5
3.7
11.3
1.4
5.1
21.7
1.0
2.0
9.2
1.7
4.6
6.5
3.6
The p-value for the test is ________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
70) A marketing firm needs to replace its existing network provider and is considering three different providers. A primary consideration is the amount of system downtime. The following table contains the amount of network downtime (in hours) for the last five months for each network provider. A
B
C
4.0
6.9
0.5
3.7
11.3
1.4
5.1
21.7
1.0
2.0
9.2
1.7
4.6
6.5
3.6
Using the critical value approach and α = 0.05, the appropriate conclusion is ________.
Version 1
29
A) reject the null hypothesis; we cannot conclude not all median network downtimes are equal B) do not reject the null hypothesis; we cannot conclude not all median network downtimes are equal C) reject the null hypothesis; we can conclude not all median network downtimes are equal D) do not reject the null hypothesis; we can conclude not all median network downtimes are equal
71) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table. Football
Hockey
Basketball
Baseball
1.2
11
4.0
4.0
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
For the Kruskal-Wallis test, the competing hypotheses are ________.
Version 1
30
A) H0:mF = mH = mBask = mBase, HA:mH ≠ mBask B) H0:mF = mH = mBask = mBase, HA:mF = mBask C) H0:mF = mH = mBask = mBase, HA: Not all population medians are equal D) H0:mF = mH = mBask = mBase, HA:mBask = mBask
72) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table. Football
Hockey
Basketball
Baseball
1.2
11.0
4.0
4.0
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
The test statistic value is ________.
Version 1
31
A) 18.29 B) 21.0 C) 23.3 D) 40.6
73) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table. Football
Hockey
Basketball
Baseball
1.2
11.0
4.0
4.0
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
At the 1% significance level, the critical value is ________. A) 11.345 B) 9.348 C) 7.815 D) 6.251
Version 1
32
74) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table. Football
Hockey
Basketball
Baseball
1.2
11.0
4.0
4.0
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
The p-value for the test is ________. A) greater than 0.10 B) between 0.05 and 0.10 C) between 0.01 and 0.05 D) less than 0.01
75) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table.
Version 1
Football
Hockey
Basketball
Baseball
1.2
11.0
4.0
4.0 33
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
Using the critical value approach and α = 0.01, the appropriate conclusion is ________. A) do not reject the null hypothesis; we cannot conclude not all median annual earnings are equal B) reject the null hypothesis; we can conclude not all median annual earnings are equal C) reject the null hypothesis; we cannot conclude not all median annual earnings are the equal D) do not reject the null hypothesis; we can conclude that not all median annual earnings are equal
76) A sports agent wants to understand the differences in the annual earnings of players in different sports. The following table shows the annual earnings (in million $) of 52 athletes. For the Kruskal-Wallis test, the rank sums for each group are at the bottom of the table.
Version 1
Football
Hockey
Basketball
Baseball
1.2
11.0
4.0
4.0
1.6
7.7
2.3
3.3
2.2
6.7
1.1
2.2
3.4
8.4
0.4
3.5
34
1.4
5.6
0.9
4.4
1.6
3.4
2.3
3.1
5.0
5.9
1.5
1.3
4.0
3.4
3.4
1.2
3.0
2.3
5.6
1.7
2.9
4.5
3.9
1.6
7.6
3.3
5.6
1.8
4.8
6.7
1.9
9.7
2.1
12.3
2.2
14.0 Ri
276
606.5
230.5
264
Using the p-value approach and α = 0.01, the appropriate conclusion is ________. A) do not reject the null hypothesis; we can conclude that not all median annual salaries are the same B) do not reject the null hypothesis; we cannot conclude that not all median annual salaries are the same C) reject the null hypothesis; we can conclude that not all median annual salaries are the same D) reject the null hypothesis; we cannot conclude not all median annual salaries are the same
77) A shipping company believes there is a linear association between the weight of packages shipped and the cost. The following table shows the weight (in pounds) and cost (in dollars) of the last seven packages shipped. Weight
Cost
12
17
9
11
17
27
13
16
8
9
18
25
20
21
Version 1
35
The Spearman rank correlation coefficient between weight and cost is ________. A) 0.960 B) 0.811 C) 0.820 D) 0.897
78) A shipping company believes there is a linear association between the weight of packages shipped and the cost. The following table shows the weight (in pounds) and cost (in dollars) of the last seven packages shipped. Weight
Cost
12
17
9
11
17
27
13
16
8
9
18
25
20
21
To determine if the weights and costs are significantly correlated, the competing hypothesis is ________. A) H0: ρs ≤ 0; HA: ρs > 0 B) H0: ρs ≥ 0; HA: ρs < 0 C) H0: ρs ≠ 0; HA: ρs = 0 D) H0: ρs = 0; HA: ρs ≠ 0
79) A shipping company believes there is a linear association between the weight of packages shipped and the cost. The following table shows the weight (in pounds) and cost (in dollars) of the last seven packages shipped. Weight
Cost
12
17
9
11
Version 1
36
17
27
13
16
8
9
18
25
20
21
At the 10% significance level, the positive critical value is ________. A) 0.893 B) 0.786 C) 0.714 D) 0.881
80) A shipping company believes there is a linear association between the weight of packages shipped and the cost. The following table shows the weight (in pounds) and cost (in dollars) of the last seven packages shipped. Weight
Cost
12
17
9
11
17
27
13
16
8
9
18
25
20
21
Using the critical value approach and α = 0.10, the appropriate conclusion is ________. A) do not reject the null hypothesis; we can conclude weight and cost are correlated B) do not reject the null hypothesis; we cannot conclude weight and cost are correlated C) reject the null hypothesis; we cannot conclude weight and cost are correlated D) reject the null hypothesis; we can conclude weight and cost are correlated
Version 1
37
81) SHY (NYSEARCA: SHY) is a 1−3 year Treasury bond fund that is considered to be a market-neutral position. Using the S&P 500 as a benchmark and five years of monthly log-return data, the rank correlation coefficient of SHY with the S&P 500 is found to be rs = −0.284. To determine if the SHY is not a market-neutral position, the competing hypotheses are ________. A) H0: ρs = 0; HA: ρs ≠ 0 B) H0: ρs ≤ 0; HA: ρs > 0 C) H0: ρs ≥ 0; HA: ρs < 0 D) H0: ρs ≠ 0; HA: ρs = 0
82) SHY (NYSEARCA: SHY) is a 1−3 year Treasury bond fund that is considered to be a market-neutral position. Using the S&P 500 as a benchmark and five years of monthly log-return data, the rank correlation coefficient of SHY with the S&P 500 is found to be rs = −0.284. Given the sample size, the sampling distribution of rs can be approximated by the normal distribution with standard deviation ________. A) 0 B) 0.129 C) 0.130 D) 1
83) SHY (NYSEARCA: SHY) is a 1−3 year Treasury bond fund that is considered to be a market-neutral position. Using the S&P 500 as a benchmark and five years of monthly log-return data, the rank correlation coefficient of SHY with the S&P 500 is found to be rs = −0.284. Given the sample size, the sampling distribution of rs can be approximated by the normal distribution and the value of the test statistic is ________. A) −2.20 B) −2.18 C) 2.18 D) 2.20
Version 1
38
84) SHY (NYSEARCA: SHY) is a 1−3 year Treasury bond fund that is considered to be a market-neutral position. Using the S&P 500 as a benchmark and five years of monthly log-return data, the rank correlation coefficient of SHY with the S&P 500 is found to be rs = −0.284. The pvalue for the test is ________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
85) SHY (NYSEARCA: SHY) is a 1−3 year Treasury bond fund that is considered to be a market-neutral position. Using the S&P 500 as a benchmark and five years of monthly log-return data, the rank correlation coefficient of SHY with the S&P 500 is found to be rs = −0.284. Using the p-value approach and α = 0.01, the appropriate conclusion is ________. A) reject the null hypothesis; we cannot conclude SHY is not a market-neutral position B) do not reject the null hypothesis; we can conclude SHY is not a market-neutral position C) reject the null hypothesis; we can conclude SHY is not a market-neutral position D) do not reject the null hypothesis; we cannot conclude SHY is not a market-neutral position
86) A wine magazine wants to know if chefs can tell the difference between duck liver pâté and wet dog food. Eighteen chefs were asked to rate both the pâté and dog food on a scale from 1 to 5, with 1 corresponding to “inedible” and 5 to “very tasty.” The results are shown in the following table. Chef
Pâté
Dog Food
1
1
3
2
1
5
3
5
1
4
1
5
5
2
5
6
1
4
7
2
5
Version 1
39
8
1
5
9
1
5
10
2
5
11
3
5
12
4
5
13
1
5
14
1
4
15
1
5
16
3
1
17
3
5
18
1
5
To determine whether chefs have a preference between pâté and dog food using the sign test, the competing hypotheses are________. A) H0: p = 0.5; HA: p ≠ 0.5 B) H0: p ≥ 0.5; HA: p < 0.5 C) H0: p ≤ 0.5; HA: p > 0.5 D) H0: p ≠ 0.5; HA: p = 0.5
87) A wine magazine wants to know if chefs can tell the difference between duck liver pâté and wet dog food. Eighteen chefs were asked to rate both the pâté and dog food on a scale from 1 to 5, with 1 corresponding to "inedible" and 5 to "very tasty." The results are shown in the following table. Chef
Pâté
Dog Food
1
1
3
2
1
5
3
5
1
4
1
5
5
2
5
6
1
4
7
2
5
8
1
5
9
1
5
10
2
5
11
3
5
Version 1
40
12
4
5
13
1
5
14
1
4
15
1
5
16
3
1
17
3
5
18
1
5
The estimate of the population proportion of plus signs is ________. A) 0.944 B) 0.50 C) 0.111 D) 0.095
88) A wine magazine wants to know if chefs can tell the difference between duck liver pâté and wet dog food. Eighteen chefs were asked to rate both the pâté and dog food on a scale from 1 to 5, with 1 corresponding to “inedible” and 5 to “very tasty.” The results are shown in the following table. Chef
Pâté
Dog Food
1
1
3
2
1
5
3
5
1
4
1
5
5
2
5
6
1
4
7
2
5
8
1
5
9
1
5
10
2
5
11
3
5
12
4
5
13
1
5
14
1
4
15
1
5
Version 1
41
16
3
1
17
3
5
18
1
5
<p>Assuming the sampling distribution of can be approximated by the standard normal distribution, the value of the test statistic is ________. A) 3.300 B) −3.300 C) 0 D) 0.5
89) A wine magazine wants to know if chefs can tell the difference between duck liver pâté and wet dog food. Eighteen chefs were asked to rate both the pâté and dog food on a scale from 1 to 5, with 1 corresponding to "inedible" and 5 to "very tasty." The results are shown in the following table. Chef
Pâté
Dog Food
1
1
3
2
1
5
3
5
1
4
1
5
5
2
5
6
1
4
7
2
5
8
1
5
9
1
5
10
2
5
11
3
5
12
4
5
13
1
5
14
1
4
15
1
5
16
3
1
17
3
5
18
1
5
Version 1
42
<p>Assuming the sampling distribution of can be approximated by the standard normal distribution, the p-value for the test is ________. A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
90) A wine magazine wants to know if chefs can tell the difference between duck liver pâté and wet dog food. Eighteen chefs were asked to rate both the pâté and dog food on a scale from 1 to 5, with 1 corresponding to "inedible" and 5 to "very tasty." The results are shown in the following table. Chef
Pâté
Dog Food
1
1
3
2
1
5
3
5
1
4
1
5
5
2
5
6
1
4
7
2
5
8
1
5
9
1
5
10
2
5
11
3
5
12
4
5
13
1
5
14
1
4
15
1
5
16
3
1
17
3
5
18
1
5
Using the p-value approach and α = 0.05, the appropriate conclusion is ________.
Version 1
43
A) do not reject the null hypothesis; we cannot conclude chefs have a preference between pâté and dog food B) reject the null hypothesis; we can conclude chefs have a preference between pâté and dog food C) reject the null hypothesis; we cannot conclude chefs have a preference between pâté and dog food D) do not reject the null hypothesis; we can conclude chefs have a preference between pâté and dog food
91) A magician has a coin that may or may not be fair. The results of 30 flips of the coin are H H T H T H H T H H T H H T H T H H H T T T T T H H H H H T. To determine if the coin is fair using the Wald-Wolfowitz runs test, the competing hypotheses are ________. A) H0: Heads and tails do not occur randomly; HA: Heads and tails occur randomly B) H0: Heads and tails occur randomly; HA: Heads and tails do not occur randomly C) H0: Heads and tails occur randomly; HA: Heads are more likely D) H0: Heads and tails occur randomly; HA: Tails are more likely
92) A magician has a coin that may or may not be fair. The results of 30 flips of the coin are H H T H T H H T H H T H H T H T H H H T T T T T H H H H H T. For the Wald-Wolfowitz runs test, the value of the total number of runs R is ________. A) 8 B) 11 C) 16 D) 19
93) A magician has a coin that may or may not be fair. The results of 30 flips of the coin are H H T H T H H T H H T H H T H T H H H T T T T T H H H H H T. Since the number of heads and tails is larger than 10, the sampling distribution of the test statistic R can be approximated by the normal distribution. The value of the test statistic is _____.
Version 1
44
A) −3.07 B) 14.93 C) 0.3484 D) 3.07
94) A magician has a coin that may or not be fair. The results of 30 flips of the coin are H H T H T H H T H H T H H T H T H H H T T T T T H H H H H T. At the 10% significance level, the critical value for the test is ________. A) 1.645 B) 1.96 C) 2.33 D) 2.575
95) A magician has a coin that may or may not be fair. The results of 30 flips of the coin are H H T H T H H T H H T H H T H T H H H T T T T T H H H H H T. Using the critical value approach, appropriate conclusion is ________. A) do not reject the null hypothesis; we cannot conclude the coin is not fair B) reject the null hypothesis; we can conclude the coin is not fair C) reject the null hypothesis; we cannot conclude the coin is not fair D) do not reject the null hypothesis; we can conclude the coin is not fair
96) An energy analyst wants to test if U.S. oil production is random over time. The analyst has monthly production values for the two years. The analyst finds 12 months are above the median, 12 months are below the median, six runs are below the median, and five runs are above the median. To test the random-walk hypothesis about oil production, the competing hypothesis are ________.
Version 1
45
A) H0: Oil production is not random; HA: Oil production is random B) H0: Oil production is random; HA: Oil production increases over time C) H0: Oil production is random; HA: Oil production decreases over time D) H0: Oil production is random; HA: Oil production is not random
97) An energy analyst wants to test if U.S. oil production is random over time. The analyst has monthly production values for the two years. The analyst finds 12 months are above the median, 12 months are below the median, six runs are below the median, and five runs are above the median. For the Wald-Wolfowitz runs test, the value of R is ________. A) 5 B) 6 C) 11 D) 12
98) An energy analyst wants to test if U.S. oil production is random over time. The analyst has monthly production values for the two years. The analyst finds 12 months are above the median, 12 months are below the median, six runs are below the median, and five runs are above the median. Assume the R follows a normal distribution. The value of the test statistic is ________. A) −2.231 B) −6.868 C) 6.868 D) 2.231
99) An energy analyst wants to test if U.S. oil production is random over time. The analyst has monthly production values for the two years. The analyst finds 12 months are above the median, 12 months are below the median, six runs are below the median, and five runs are above the median. Assuming the sampling distribution of R can be approximated by the normal distribution, the p-value for the test is ________.
Version 1
46
A) less than 0.01 B) between 0.01 and 0.05 C) between 0.05 and 0.10 D) greater than 0.10
100) An energy analyst wants to test if U.S. oil production is random over time. The analyst has monthly production values for the two years. The analyst finds 12 months are above the median, 12 months are below the median, six runs are below the median, and five runs are above the median. Using the p-value approach and α = 0.01, the appropriate conclusion is ________. A) do not reject the null hypothesis; we can conclude oil production is not random B) reject the null hypothesis; we can conclude oil production is not random C) reject the null hypothesis; we cannot conclude oil production is not random D) do not reject the null hypothesis; we cannot conclude oil production is not random
101) A research analyst believes that a positive relationship exists between a firm’s advertising expenditures and its sales. For 65 firms, she collects data on each firm’s yearly advertising expenditures and its sales. She calculates a Spearman rank correlation coefficient of 0.45. Which of the following are the competing hypotheses to determine whether the Spearman rank correlation coefficient differs from the zero? A) H0: ρs = 0; HA: ρs ≠ 0 B) H0: ρs ≥ 0; HA: ρs < 0 C) H0: ρs ≠ 0; HA: ρs = 0 D) H0: ρs ≤ 0; HA: ρs > 0
102) A research analyst believes that a positive relationship exists between a firm’s advertising expenditures and its sales. For 65 firms, she collects data on each firm’s yearly advertising expenditures and its sales. She calculates a Spearman rank correlation coefficient of 0.45. The value of the test statistic and p-value are ________, respectively.
Version 1
47
A) 3.60 and 0.0002 B) 3.60 and 0.0004 C) 3.63 and 0.0002 D) 3.63 and 0.0004
103) A research analyst believes that a positive relationship exists between a firm’s advertising expenditures and its sales. For 65 firms, she collects data on each firm’s yearly advertising expenditures and its sales. She calculates a Spearman rank correlation coefficient of 0.45. At the 5% confidence level, which of the following is the correct conclusion about the relationship between advertising and sales? A) Reject the null hypothesis; we cannot conclude advertising and sales are correlated. B) Do not reject the null hypothesis; we can conclude advertising and sales are correlated. C) Reject the null hypothesis; we can conclude advertising and sales are correlated. D) Do not reject the null hypothesis; we cannot conclude advertising and sales are correlated.
104)
Which of the following statements about the Wilcoxon signed-rank test is false? A) It requires the sample size to be at least 10. B) It assumes that the distribution of the population is symmetric. C) It can be used to test whether the population median differs from some hypothesized
value. D) It is prone to Type I error.
Version 1
48
105) An airline claims that the median price of a round-trip ticket is less than $503. For a random sample of 400 tickets, the value of the Wilcoxon sign-ranks test is T = T+ = 37,230. a. Specify the competing hypothesis to determine whether the median ticket price is less than $503. b. Assume the normal approximation for the distribution of T, and calculate the value of the corresponding test statistic. c. At the 10% significance level, what is the decision and conclusion?
106) A real estate agent wants to know how the home size (in square feet) of parents compares to the home size of their children. For a sample of seven parents and their children, the value of the test statistic for the Wilcoxon signed-rank test for a matched pairs sample is T = T+ = 27. The differences are calculated as the parent’s home size subtracted by the child’s home size. a. Specify the competing hypothesis to determine if the median difference in home size between parents and children is greater than zero. b. At the 5% significance level, what is the critical value? c. At the 5% significance level, what is the decision and conclusion?
107) A university advisor wants to determine if there is a difference in the total amount of time spent studying between students taking an online version and a traditional class version of a certain course. The following table contains the sample sizes of each group, and the rank sums for the Wilcoxon rank-sum test. Online
Traditional
Sample Size
10
12
Sum of Ranks
94.5
158.5
Version 1
49
a. Specify the competing hypothesis to determine whether the median study time for online students differs from the median study time for traditional students. b. What is the value of the Wilcoxon rank-sum test statistic W? c. Assuming W follows a normal distribution, what is the value of the corresponding test statistic? d. At the 1% significance level, what is the decision and conclusion?
108) A career counselor is comparing the annual salary of executives in the manufacturing, finance, and engineering industries. The following table shows the sample size for each industry and the rank sum for the Kruskal-Wallis test. Manufacturing
Finance
Engineering
Sample Size
5
6
5
Sum of Ranks
64
53
19
a. Specify the competing hypotheses to determine whether differences exist in the median annual salary of the three industries. b. What is the value of the Kruskal-Wallis test statistic H? c. At the 5% significance level, what is the decision and conclusion?
109) A study was conducted to determine if husbands and wives like the same TV shows. Married couples ranked 15 TV shows, and the Spearman rank correlation coefficient was found to be rs = 0.47. a. Specify the competing hypotheses to determine if there is a positive correlation between the two rankings. b. Assuming the Spearman rank correlation coefficient follows a normal distribution, what are the mean and standard deviation of the normal distribution? c. Assuming the Spearman rank correlation coefficient follows a normal distribution, what is the value of the test statistic? d. At the 5% significance level, what is the decision and conclusion?
Version 1
50
110) A trading magazine wants to know if a seminar on self-confidence helps stockbrokers to obtain new clients. Before and after the self-confidence seminar, 14 stockbrokers rated their selfconfidence as low, neutral, or high. In comparing the before and after rankings, there was one plus sign and 13 negative signs for the sign test. a. Specify the competing hypotheses to determine if there are differences in self-confidence before and after the seminar. b. Assuming the proportion of plus signs follows a normal distribution, what is the value of the test statistic? c. At the 1% significance level, what is the decision and conclusion?
111) Let positive daily S&P 500 returns define up days, and negative daily S&P 500 returns define down days. A stockbroker wants to know if the up and down days of the S&P 500 occur randomly. For 30 trading days, there were 17 up days, 13 down days, four runs of up days, and four runs of down days. a. Specify the competing hypotheses to determine whether the S&P 500 up and down days occur at random. b. What is the value of the Wald-Wolfowitz runs test statistic? c. At the 5% significance level, what is the decision and conclusion?
112)
A portion of the demand data for the new iPhone at 10 stores is shown below: Demand 2160 :
Version 1
51
3140
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the median demand is greater than 2000. b. At the 5% significance level, can you conclude that the median demand is greater than 2000?
113)
A portion of the demand and supply data for the new iPhone at 10 stores is shown below: Demand
Supply
2160
2124
:
:
3140
3153
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the median difference between Demand and Supply differs from zero. b. At the 5% significance level, what is your conclusion?
114)
A portion of the demand data for the iPhone and Galaxy at 10 stores is shown below: iPhone
Galaxy
2156
1944
:
:
3126
2981
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the median demand of iPhone differs from the median demand of Galaxy. b. At the 5% significance level, what is your conclusion?
Version 1
52
115) A portion of the demand data for the iPhone, Galaxy, and Android at 10 stores is shown below: iPhone
Galaxy
Android
2156
1944
2109
:
:
:
3126
2981
3196
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether some differences exist between the median demand by types of smartphone. b. At the 5% significance level, what is your conclusion?
116)
A portion of the demand and supply data for the new iPhone at 10 stores is shown below: Demand
Supply
2160
2124
:
:
3140
3153
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the Spearman rank correlation coefficient between Demand and Supply differs from zero. b. At the 5% significance level, what is your conclusion?
117) Twenty customers are asked to rate the likelihood of purchasing meatless burgers on a 5point scale, where 1 = unlikely and 5 = likely before and after tasting a sample. A portion of data is given below: Before
After
2
4
Version 1
53
:
:
4
2
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the customers are more likely to purchase meatless burgers after the sample tasting. b. At the 5% significance level, what is your conclusion?
118) The table below shows a portion of the interest rates (in percent) for the United States from 1999 to 2019. Year
Interest Rate
1999
5.30
:
:
2019
1.55
{MISSING IMAGE}Click here for the Excel Data File a. Specify the competing hypotheses to determine whether the interest rate is random. b. At the 5% significance level, what is your conclusion?
119)
Parametric tests are distribution-free tests. ⊚ true ⊚ false
120)
Parametric tests typically assume the underlying population to be normally distributed. ⊚ true ⊚ false
121)
Nonparametric tests do not require data of interval or ratio scale.
Version 1
54
⊚ ⊚
true false
122) If the distributional assumptions of a parametric test are valid yet we choose to use a nonparametric test, the nonparametric test is more powerful. ⊚ true ⊚ false
123) The Wilcoxon signed-rank test for a population median does not assume anything about the distribution of the population. ⊚ true ⊚ false
124) test.
The Wilcoxon rank-sum test is used as a nonparametric counterpart to the matched-pairs t ⊚ ⊚
125)
true false
The Kruskal-Wallis test is the nonparametric alternative to one-way ANOVA. ⊚ true ⊚ false
126) Pearson’s correlation coefficient is used as the nonparametric counterpart to the regular correlation coefficient. ⊚ ⊚
Version 1
true false
55
127) The sign test for a matched-pairs sample is similar to the Wilcoxon signed-rank test but applied to ordinal data. ⊚ ⊚
true false
128) The null hypothesis of the Wald-Wolfowitz runs test is the elements do not occur randomly. ⊚ true ⊚ false
Version 1
56
Answer Key Test name: Chap 20_4e 1) assumptions 2) population median 3) matched-pairs 4) ranks 5) one-way ANOVA 6) 10 7) right 8) populations 9) run 10) economic 11) B 12) A 13) C 14) C 15) C 16) A 17) B 18) A 19) A 20) C 21) A 22) B 23) D 24) C 25) B 26) B Version 1
57
27) B 28) A 29) C 30) B 31) A 32) C 33) D 34) D 35) D 36) B 37) D 38) D 39) B 40) A 41) A 42) B 43) A 44) C 45) D 46) A 47) B 48) A 49) D 50) A 51) C 52) C 53) A 54) D 55) B 56) A Version 1
58
57) D 58) B 59) B 60) B 61) A 62) C 63) A 64) D 65) C 66) A 67) D 68) B 69) A 70) C 71) C 72) A 73) A 74) D 75) B 76) C 77) C 78) D 79) C 80) D 81) A 82) C 83) B 84) B 85) D 86) A Version 1
59
87) C 88) B 89) A 90) B 91) B 92) C 93) D 94) A 95) B 96) D 97) C 98) D 99) B 100) D 101) A 102) B 103) C 104) D 105) <p>a. The competing hypotheses are . b. c. At the 10% significance level, the critical value is −1.28. The p-value is 0.1074. Do not reject the null hypothesis. We cannot conclude the median ticket price is less $503. 106) a. The competing hypotheses are H0:mD ≤ 0; HA:mD > 0. b. For a sample size of n = 7 and a 5% significance level for a one-tailed test, using the table for the critical values for the Wilcoxon signed-rank test we find the critical value TU = 25. Note that the p-value is between 0.01 and 0.025. c. Since T > TU reject the null hypothesis. We conclude that the median difference in home size between parents and children is greater than zero.
Version 1
60
107) a. The competing hypotheses are H0:mO − mT = 0; HA:mO − mT ≠ 0 b. Because n1 ≤ n2, W = W1 c. μW = 115; σW = 15.166; z = −1.352 d. At the 1% significance level, the critical value is −2.576. Because z = −1.352 > −2.576, do not reject the null hypothesis. Similarly, because the p-value is 2P(Z ≤ −1.352) = 2 × NORM.S.DIST (−1.352, True) ≈ 0.176, and this is more than 0.01, we do not reject H0. We cannot conclude the median study time differs for the online and traditional class students. 108) a. The competing hypotheses are H0: mM = mF = mE; HA: Not all population medians are equal. b. The test statistic value: H = 8.91 c. Because the sample sizes are at least five, the sampling distribution of H can be approximated by the chi-square distribution with two degrees of freedom. At the 5% significance level, the critical value is 2.920. The p-value is . Reject the null hypothesis. Conclude not all mean annual salaries across the industry are the same. 109) a. The competing hypotheses are H0:ρs ≤ 0; HA:ρs > 0 b. Assuming the sampling distribution of the Spearman rank correlation coefficient can be approximated by the normal distribution, the mean is zero and the standard deviation is ≈ 0.2673. c. The value of the test statistic is z ≈ 1.7586. d. At the 5% significance level, the critical value is 1.645. The p-value is P(Z ≤ −1.7586) = 0.0393. Reject the null hypothesis. We can conclude there is a positive correlation between husband’s and wife’s ratings. 110) a. The competing hypotheses are H0: p ≥ 0.5; HA: p < 0.5 b. The proportion of plus signs is . The test statistic value is z = −3.207. c. At the 1% significance level, the critical value is −2.576. The p-value is P(Z ≤ −3.207) = 0.0007. Reject the null hypothesis. We can conclude that stockbroker self-confidence is improved by the seminar. 111) a. The competing hypotheses are H0: Up and down days occur randomly;HA: Up and down days do not occur randomly. b. We have RD = 4, RU = 4. R = 8. μR = 8.367; σR = 1.868; z = −0.196 c. At the 5% significance level, the critical value is 1.96. The p-value is 2P(Z ≤ −0.196) = 0.8446. Do not reject the null hypothesis; we cannot conclude the up and down days do not occur randomly. 112) a. H0: m ≤ 2000 and HA: m > 2000 b. Yes, the median demand is greater than 2000 because the p-value is 0.0010, which is less than the 5% significance level. 113) a. H0: mD = 0 and HA: mD ≠ 0 b. The p-value is greater than the 5% significance level; we cannot conclude that the median difference between the iPhone’s Demand and Supply differs from zero.
Version 1
61
114) a. H0: m1 − m2 = 0 and HA: m1 − m2 ≠ 0 b. The p-value is greater than the 5% significance level; we cannot conclude that the median demand of iPhone differs from the median demand of Galaxy. 115) a. H0: m1 = m2 = m3 and HA: Not all population medians are equal b. The p-value is greater than the 5% significance level, we cannot conclude that some differences exist between the median demand by types of smartphone. 116) a. H0: ρs = 0 and HA: ρs ≠ 0 b. The p-value is less than the 5% significance level; we conclude that the Spearman rank correlation coefficient between the iPhone’s Demand and Supply differs from zero. 117) a. H0: p ≥ 0.5; HA: p < 0.5 b. The p-value is less than the 5% significance level. We conclude that customers are more likely to purchase the meatless burgers after the tasting. 118) a. H0: the interest rate is random; HA; the interest rate is not random b. The p-value is less than the 5% significance level. We conclude that the interest rate is not random.
119) FALSE 120) TRUE 121) TRUE 122) FALSE 123) FALSE 124) FALSE 125) TRUE 126) FALSE 127) TRUE 128) FALSE
Version 1
62