SOLUTIONS MANUAL FOR Christine Verity
BASIC BUSINESS STATISTICS FIFTEENTH EDITION
Mark L. Berenson Montclair State University
David M. Levine Baruch College, City University of New York
Kathryn A. Szabat La Salle University
David F. Stephan Two Bridges Instructional Technology PPID: A103000240340
Copyright ©2024 Pearson Education, Inc.
Table of Contents
Teaching Tips
Chapter 1   Defining and Collecting Data
Chapter 2   Tabular and Visual Summarization of Variables
Chapter 3   Numerical Descriptive Measures
Chapter 4   Basic Probability
Chapter 5   Discrete Probability Distributions
Chapter 6   The Normal Distribution and Other Continuous Distributions
Chapter 7   Sampling Distributions
Chapter 8   Confidence Interval Estimation
Chapter 9   Fundamentals of Hypothesis Testing: One-Sample Tests
Chapter 10  Two-Sample Tests
Chapter 11  Analysis of Variance
Chapter 12  Chi-Square and Nonparametric Tests
Chapter 13  Simple Linear Regression
Chapter 14  Introduction to Multiple Regression
Chapter 15  More Complex Multiple Regression Models
Chapter 16  Time-Series Forecasting
Chapter 17  Business Analytics
Chapter 18  Getting Ready to Analyze Data in the Future
Chapter 19  Statistical Applications in Quality Management (Online)
Chapter 20  Decision Making (Online)
Online Sections
Instructional Tips and Solutions for Digital Cases
The Craybill Instrumentation Company Case
The Mountain States Potato Company Case
The O. Hara Performance Consulting Case
The Sure Value Convenience Stores Case
The Choice Is Yours/More Descriptive Choices Follow-up Case
The Claro Mountain State Student Surveys Case
The Shelter Bay Lifestyles Case
The Tri-Cities Times Case
Chapter 1
1.1
(a) The menu items represent a categorical variable. Each menu item represents a separate category.
(b) The variable that contains the prices is a numerical variable.
(c) One menu item and the price of that item would be one instance or occurrence of data.
1.2
(a) Business size represents a categorical variable because each size represents a particular category.
(b) The measurement scale is ordinal, because of the different sizes.
1.3
The variable speed trials is a continuous numerical variable because time can have any value from 0 to any reasonable unit of time.
1.4
(a) The telephone number assigned to the smartphone is a categorical variable.
(b) The data usage for a current month (in GB) is a numerical variable that is continuous because any value within a range of values can occur.
(c) The length (in minutes and seconds) of the last voice call made using the smartphone is a numerical variable that is continuous because time can have any value from 0 to any reasonable unit of time.
(d) The number of apps installed on the smartphone is a numerical variable that is discrete because the outcome is a count.
(e) Whether a device protection plan exists is a categorical variable because the answer can be only yes or no.
1.5
(a) numerical, ratio
(b) numerical, ratio
(c) categorical, nominal
(d) categorical, nominal
(e) numerical, ratio
1.6
(a) numerical, continuous
(b) categorical
(c) numerical, discrete
(d) categorical
(e) categorical
1.7
(a) numerical, ratio scale, continuous
(b) numerical, ratio scale
(c) numerical, ratio scale, discrete
1.8
(a) numerical, continuous
(b) numerical, discrete
(c) numerical, continuous
(d) categorical
(e) categorical
1.9
(a) Income may be considered discrete if we "count" our money. It may be considered continuous if we "measure" our money; we are only limited by the way a country’s monetary system treats its currency.
(b) The first format would provide more information because it includes a ratio value while the second measure would only include a range of values for each choice category.
1.10
The variable test score would be numerical, and presumably, in the range of 0 through 100. If fractional credit for an answer is possible, the variable would need to be continuous and not discrete.
1.11
(a) The population is "members of the retailer’s rewards program from the metropolitan area." A systematic or random sample could be taken of members from the rewards program from the metropolitan area.
(b) The director might wish to collect both numerical and categorical data. Three categorical questions might be occupation, marital status, type of clothing. Numerical questions might be age, average monthly hours shopping for clothing, income.
1.12
(a) 0001
(b) 0040
(c) 0902
1.13
(a) Sample without replacement: Start at row 29. Read from left to right in 3-digit sequences and continue unfinished sequences from end of row to beginning of next row.
Row 29: 124 783 762 299 659 310 658 361 369 889 588 692 957
Rows 29-30: 157
Row 30: 175 555 646 541 142 547 704 570 342 672 937 837
Rows 30-31: 929
Row 31: 161 611 075 801 030 783 159 309 132 762 671 073 000
Row 32: 780 257 353 914 621 390 444 745 003 197 127 874 770
Rows 32-33: 927
Row 33: 587 672 288 014 510 175 128 228 668 765 530 493
Rows 33-34: 251
Row 34: 669 020 427 042 516 447 773 709 739 459 239 668 263
Row 35: 701 835 806 565 489 318 338 209 316 747 103 865 929
Rows 35-36: 390
Row 36: 730 353 851 567 999 742 508 667 802 875 573 672
Rows 36-37: 571
Row 37: 093 493 242 134 312 459 002 770 485 820 090 658 595
Row 38: 824 623 016
Note: All sequences above 127 and all repeating sequences are discarded.
(b) Use the same technique as in part (a). Note: All sequences above 127 are discarded. There were no repeating sequences.
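The discard rule described in 1.13 can be sketched in a few lines of Python. This is only an illustration, not the manual's procedure: the function name `select_codes` is hypothetical, and it assumes valid codes run from 001 through the population size, so that 000 is discarded along with sequences above 127.

```python
def select_codes(sequences, max_code, with_replacement=False):
    """Keep the usable codes from a stream of random-number-table sequences."""
    selected = []
    for seq in sequences:
        code = int(seq)
        # Assumption: valid codes run from 1 to max_code; anything else is discarded.
        if code < 1 or code > max_code:
            continue
        # Without replacement, a code that was already selected is discarded too.
        if not with_replacement and code in selected:
            continue
        selected.append(code)
    return selected


# Sequences for rows 29 through 31, in reading order, as listed in the solution.
rows_29_to_31 = (
    "124 783 762 299 659 310 658 361 369 889 588 692 957 "
    "157 "
    "175 555 646 541 142 547 704 570 342 672 937 837 "
    "929 "
    "161 611 075 801 030 783 159 309 132 762 671 073 000"
).split()

picked = select_codes(rows_29_to_31, max_code=127)
print(picked)
```

Passing `with_replacement=True` keeps repeated codes, which corresponds to the variation in part (b).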
1.14
A simple random sample would be less practical for personal interviews because of travel costs, unless interviewees are paid to attend a central interviewing location.
1.15
This is a probability sample because the selection is based on chance. It is not a simple random sample because A is more likely to be selected than B or C.
1.16
Here all members of the population are equally likely to be selected and the sample selection mechanism is based on chance. But selection of two elements is not independent; for example if A is in the sample, we know that B is also, and that C and D are not.
1.17
(a) Since a complete roster of registered students exists, a simple random sample of 200 students could be taken. If student satisfaction with the quality of campus life randomly fluctuates across the student body, a systematic 1-in-20 sample could also be taken from the population frame. If student satisfaction with the quality of life may differ by status and by experience/class level, a stratified sample using eight strata, full-time freshmen through full-time seniors and part-time freshmen through part-time seniors, could be selected. If student satisfaction with the quality of life is thought to fluctuate as much within clusters as between them, a cluster sample could be taken.
(b) A simple random sample is one of the simplest to select. The population frame is the registrar’s file of 3,000 student names.
(c) A systematic sample is easier to select by hand from the registrar’s records than a simple random sample, since an initial person at random is selected and then every 20th person thereafter would be sampled. The systematic sample would have the additional benefit that the alphabetic distribution of sampled students’ names would be more comparable to the alphabetic distribution of student names in the campus population.
(d) If rosters by status and class designations are readily available, a stratified sample should be taken. Since student satisfaction with the quality of life may indeed differ by status and class level, the use of a stratified sampling design will not only ensure all strata are represented in the sample, it will also generate a more representative sample and produce estimates of the population parameter that have greater precision.
(e) If all 3,000 registered students reside in one of 10 on-campus residence halls which fully integrate students by status and by class, a cluster sample should be taken. A cluster could be defined as an entire study house, and the students of a single randomly selected study house could be sampled. Since each study house has 300 students, a systematic sample of 150 students can then be selected from the chosen cluster of 300 students. Alternatively, a cluster could be defined as a floor of one of the 10 study houses. Suppose there are six floors in each dormitory with 50 students on each floor. Three floors could be randomly sampled to produce the required 150 student sample. Selection of an entire study house may make distribution and collection of the survey easier to accomplish. In contrast, if there is some variable other than status or class that differs across study houses, sampling by floor may produce a more representative sample.
1.18
(a) Row 16: 2323 6737 5131 8888 1718 0654 6832 4647 6510 4877
Row 17: 4579 4269 2615 1308 2455 7830 5550 5852 5514 7182
Row 18: 0989 3205 0514 2256 8514 4642 7567 8896 2977 8822
Row 19: 5438 2745 9891 4991 4523 6847 9276 8646 1628 3554
Row 20: 9475 0899 2337 0892 0048 8033 6945 9826 9403 6858
Row 21: 7029 7341 3553 1403 3340 4205 0823 4144 1048 2949
Row 22: 8515 7479 5432 9792 6575 5760 0408 8112 2507 3742
Row 23: 1110 0023 4012 8607 4697 9664 4894 3928 7072 5815
Row 24: 3687 1507 7530 5925 7143 1738 1688 5625 8533 5041
Row 25: 2391 3483 5763 3081 6090 5169 0546
Note: All sequences above 5000 are discarded. There were no repeating sequences.
(b) 0089 0189 0289 0389 0489 0589 0689 0789 0889 0989 1089 1189 1289 1389 1489 1589 1689 1789 1889 1989 2089 2189 2289 2389 2489 2589 2689 2789 2889 2989 3089 3189 3289 3389 3489 3589 3689 3789 3889 3989 4089 4189 4289 4389 4489 4589 4689 4789 4889 4989
(c) With the single exception of invoice 0989, the invoices selected in the simple random sample are not the same as those selected in the systematic sample. It would be highly unlikely that a random process would select the same units as a systematic process.
1.19
(a) A stratified sample should be taken so that each of the three strata will be proportionately represented.
(b) The number of observations in each of the three strata out of the total of 100 should reflect the proportion of the three categories in the customer database. For example, 500/1000 = 50% so 50% of 100 = 50 customers should be selected from the potential customers; similarly, 300/1000 = 30% so 30 customers should be selected from those who have purchased once, and 200/1000 = 20% so 20 customers from the repeat buyers.
(c) It is not simple random sampling because, unlike simple random sampling, it ensures proportionate representation across the entire population.
1.20
Before accepting the results of a survey of college students, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were the questions clear, accurate, unbiased, valid? What operational definition of immediately and effortlessly was used? What was the response rate?
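Two of the calculations in 1.18 and 1.19 are mechanical enough to sketch in code: the systematic sample that starts at invoice 0089 and takes every 100th invoice up to 5000, and the proportional allocation of 100 customers across three strata. The function names and the stratum labels below are illustrative assumptions; the counts come from the solutions.

```python
def systematic_sample(first, step, population_size):
    """Return every step-th unit number starting at `first`, up to population_size."""
    return list(range(first, population_size + 1, step))


def proportional_allocation(strata_sizes, sample_size):
    """Split sample_size across strata in proportion to each stratum's size.

    Rounding each share independently is an assumption of this sketch; with
    awkward proportions the rounded shares may not add up to sample_size.
    """
    total = sum(strata_sizes.values())
    return {name: round(sample_size * size / total)
            for name, size in strata_sizes.items()}


# 1.18 (b): start at invoice 0089 and take every 100th invoice out of 5,000.
invoices = systematic_sample(first=89, step=100, population_size=5000)
print(len(invoices), invoices[:3], invoices[-1])

# 1.19 (b): 100 customers allocated across the three strata (labels are illustrative).
strata = {"potential": 500, "purchased once": 300, "repeat buyers": 200}
allocation = proportional_allocation(strata, sample_size=100)
print(allocation)
```

The allocation reproduces the 50, 30, and 20 customers given in 1.19 because each stratum's share of 1,000 is an exact percentage.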
1.21
(a) Possible coverage error: Only employees in a specific division of the company were sampled.
(b) Possible nonresponse error: No attempt is made to contact nonrespondents to urge them to complete the evaluation of job satisfaction.
(c) Possible sampling error: The sample statistics obtained from the sample will not be equal to the parameters of interest in the population.
(d) Possible measurement error: Ambiguous wording in questions asked on the questionnaire.
1.22
The results are based on a survey of bank executives. If the frame is supposed to be banking institutions, how is the population defined? There is no information about the response rate, so there is an undefined nonresponse error.
1.23
Before accepting the results of the survey, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were they clear, accurate, unbiased, valid? What was the response rate? What was the margin of error? What was the sample size? What frame was used?
1.24
Before accepting the results of the survey, you might want to know, for example: Who funded the survey? Why was it conducted? What was the population from which the sample was selected? What sampling design was used? What mode of response was used: a personal interview, a telephone interview, or a mail survey? Were interviewers trained? Were survey questions field-tested? What questions were asked? Were the questions clear, accurate, unbiased, valid? What was the response rate? What was the margin of error? What was the sample size? What frame was used?
1.25
Only the second value, 2.7GB, contains units.
1.26
(a) Invalid values include Appel, Samsun, APPLE, Apple iPhone, and mOTOROLA.
(b) Appel should be Apple. Samsun should be Samsung. APPLE should be Apple. Apple iPhone should be Apple. mOTOROLA should be Motorola.
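A domain check like the one in 1.26 can be sketched as a comparison against a list of allowed values, followed by a correction lookup. The allowed set and the sample column below are assumptions made for illustration; only the five invalid values and their corrections come from the solution.

```python
# Assumed domain for the brand variable (illustrative, not from the data file).
VALID_BRANDS = {"Apple", "Samsung", "Motorola"}

# Corrections listed in the solution to part (b).
CORRECTIONS = {
    "Appel": "Apple",
    "Samsun": "Samsung",
    "APPLE": "Apple",
    "Apple iPhone": "Apple",
    "mOTOROLA": "Motorola",
}


def find_invalid(values, domain):
    """Return the values that fall outside the allowed domain, in order."""
    return [value for value in values if value not in domain]


def apply_corrections(values, corrections):
    """Replace each known bad value with its corrected form; keep the rest."""
    return [corrections.get(value, value) for value in values]


# Hypothetical column mixing valid and invalid entries.
raw = ["Apple", "Appel", "Samsung", "Samsun", "APPLE", "Apple iPhone", "mOTOROLA"]

invalid = find_invalid(raw, VALID_BRANDS)
cleaned = apply_corrections(raw, CORRECTIONS)
print(invalid)
print(cleaned)
```

Running the domain check again on the cleaned column is a quick way to confirm that no invalid values remain.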
1.27
(a) Employee ID has a data integration error. Payment Method needs a domain check. Age needs missing values marked, a domain check, and an outlier check. Division needs a domain check and a check for data integration errors. Department has data integration errors and needs a domain check.
(b) In Employee ID, the first and eleventh values are identical, and the sixteenth value, "EPM16", should be "EMP16". In Payment Method, the values "Male", "F", and "EMP12" should be corrected, and the missing ninth value should be identified. In Age, "k20" should be "20", "221" should be corrected, and the missing seventeenth value should be identified. In Division, "EAST" should be "East", "N" should be "North", and "Nroth" should be "North". In Department, "Customer Rel." should be "Customer Relations", "Brand Mgt." should be "Brand Mgt", "Operati0ns" should be "Operations", "Human capital" should be "Human Capital", and "FIN" should be "Finance".
1.28
(a) Fund Number contains the wrong data. Category needs a domain check. 5-Yr Return should be formatted as a percentage. 10-Yr Return needs missing values marked. Net Expense Ratio needs a domain check and an outlier check. Rating needs a domain check and an outlier check. Assets needs a domain check and missing values marked.
(b) The last three values in Net Expense Ratio; the eleventh value in Assets.
(c) For both Net Expense Ratio and Assets, a maximum value could be defined.
1.29
Data cleaning improves the quality of the data while data wrangling changes the organization of the data.
1.30
No, because the categories do not seem to be mutually exclusive. The categories need specific ranges, such as younger than 21, 21 to 34, 35 to 54, and 55 or older.
1.31
Stack the data by Housekeeping Requested values.
1.32
(a) The times for each of the hotels would be arranged in separate columns.
(b) The hotel names would be in one column and the times would be in a second column.
1.33
For data cleaning, 10-year return and Rating should mark missing values. For data wrangling, the data could be stacked by market cap, fund type, YTD return, 1-year return, 10-year return, life of fund, expense ratio, or fund rating.
1.34
A population contains all the items of interest whereas a sample contains only a portion of the items in the population.
1.35
A statistic is a summary measure describing a sample whereas a parameter is a summary measure describing an entire population.
1.36
Categorical random variables yield categorical responses such as yes or no answers. Numerical random variables yield numerical responses such as your height in inches.
1.37
Discrete random variables produce numerical responses that arise from a counting process. Continuous random variables produce numerical responses that arise from a measuring process.
1.38
Both nominal and ordinal variables are categorical variables, but no ranking is implied in a nominal variable such as male or female, while ranking is implied in an ordinal variable such as a student’s grade of A, B, C, D, or F.
1.39
Both interval and ratio variables are numerical variables in which the difference between measurements is meaningful, but an interval variable, such as standardized exam scores, does not involve a true zero, while a ratio variable, such as height, does.
1.40
A list of values defines a domain for a categorical variable, whereas a range defines a domain for a numerical variable.
1.41
Items or individuals in a probability sample are selected based on known probabilities, while items or individuals in a nonprobability sample are selected without knowing their probabilities of selection.
1.42
Missing values are values that were not collected for a variable. Outliers are values that seem excessively different from most of the other values.
1.43
In unstacked arrangements, separate numerical variables are created for each group in the data. For example, you might create a variable for the weights of men and a second variable for the weights of women. In stacked arrangements, a single numerical variable is paired with a categorical variable that represents the categories. For example, all weights would be in one variable, with a categorical variable indicating male or female.
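The two arrangements can be illustrated with a small sketch that converts between them. The weights below are made-up values, and the function names `stack` and `unstack` are chosen for this example only.

```python
def stack(unstacked):
    """Turn {group: [values]} into a list of (group, value) rows."""
    return [(group, value)
            for group, values in unstacked.items()
            for value in values]


def unstack(stacked):
    """Turn (group, value) rows back into one list of values per group."""
    result = {}
    for group, value in stacked:
        result.setdefault(group, []).append(value)
    return result


# Unstacked: one numerical variable per group (made-up weights in pounds).
weights_unstacked = {"Male": [180, 165], "Female": [130, 142]}

# Stacked: one categorical variable paired with a single numerical variable.
weights_stacked = stack(weights_unstacked)
print(weights_stacked)
print(unstack(weights_stacked))
```

Converting to the stacked form and back returns the original groups, which shows that the two arrangements hold the same information.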
1.44
Coverage error is error generated due to an improperly or inappropriately framed population which can result in a sample that may not be representative of the population that one wishes to study. Non-response error is error generated due to members of a chosen sample not being contacted even after repeated attempts so that information that should be provided is missing.
1.45
Sampling error results from the variability of outcomes of different samples. This sample to sample variation is inevitably connected to the sampling process. Measurement error is error that results from either self-reported data or data that is collected in an inconsistent manner by those who are responsible for collecting and summarizing the desired information.
1.46
Microsoft Excel: This product features a spreadsheet-based interface that allows users to organize, calculate, and analyze data. Excel also contains many statistical functions to assist in the description of a dataset. Excel can be used to develop worksheets and workbooks to calculate a variety of statistics including introductory and advanced statistics. Excel also includes interactive tools to create graphs, charts, and pivot tables. Excel can be used to summarize data to better understand a population of interest, compare across groups, predict outcomes, and develop forecasting models. These capabilities represent those that are generally relevant to the current course. Excel also includes many other statistical capabilities that can be further explored on the Microsoft Office Excel official website.
1.47
(a) The population of interest includes banking executives representing institutions of various sizes and U.S. geographic locations.
(b) The collected sample includes 163 banking executives from institutions of various sizes and U.S. geographic locations.
(c) A parameter of interest is the percentage of the population of banking executives that identify customer experience initiatives as an area where increased spending is expected.
(d) A statistic used to estimate the parameter in (c) is the percentage of the 163 banking executives included in the sample who identify customer experience initiatives as an area where increased spending is expected. In this case, the statistic is 55%.
1.48
The answers are based on an article titled "U.S. Satisfaction Still Running at Improved Level" and written by Lydia Saad (August 15, 2018). The article is located on the following site: https://news.gallup.com/poll/240911/satisfaction-running-improvedlevel.aspx?g_source=link_NEWSV9&g_medium=NEWSFEED&g_campaign=item_&g_content=U.S.%2520Satisfaction%2520Still%2520Running%2520at%2520Improved%2520Level
(a) The population of interest includes all individuals aged 18 and older who live within the 50 U.S. states and the District of Columbia.
(b) The collected sample includes a random sample of 1,024 individuals aged 18 and older who live within the 50 U.S. states and the District of Columbia.
(c) A parameter of interest is the percentage of the population of individuals aged 18 and older living within the 50 U.S. states and the District of Columbia who are satisfied with the direction of the U.S.
(d) A statistic used to estimate the parameter in (c) is the percentage of the 1,024 individuals included in the sample who are satisfied with the direction of the U.S. In this case, the statistic is 36%.
1.49
The answers were based on information obtained from the following site:
(a) The population of interest is U.S. CEOs.
(b) The sample included 1,000 U.S. CEOs.
(c) A parameter of interest would be the percentage of CEOs among the population of interest that believe that AI will significantly change the way they will do business in the next five years.
(d) The statistic used to estimate the parameter in (c) is the percentage of CEOs among the 1,000 CEOs included in the sample who believe that AI will significantly change the way they will do business in the next five years. In this case, the statistic is 80%.
1.50
(a) One variable collected with the American Community Survey is marital status with the following possible responses: now married, widowed, divorced, separated, and never married.
(b) The variable in (a) represents a categorical variable.
(c) Because the variable in (a) is categorical, this question is not applicable. If one had chosen age in years from the American Community Survey as the variable, the answer to (c) would be discrete.
1.51
Answers will vary depending on the specific sample survey used. The answers below were based on the sample survey located at: bit.ly/21qjI6F
(a) An example of a categorical variable included in the survey is gender with male or female as possible answers.
(b) An example of a numerical variable included in the survey would be the number of phone calls made to or received from one’s direct supervisor in an average week.
1.52
(a) The population of interest consisted of 10,000 benefited employees of the University of Utah.
(b) The sample consisted of 3,095 employees of the University of Utah.
(c) Gender, marital status, and employment category represent categorical variables. Age in years, education level in years completed, and household income represent numerical variables.
1.53
(a) Key social media platforms used represents a categorical variable. The frequency of social media usage represents a discrete numerical variable. Demographics of key social media platform users represent categorical variables.
(b)
1. Which of the following is your preferred social media platform: YouTube, Facebook, or Twitter?
2. What time of the day do you spend the most amount of time using social media: morning, afternoon, or evening?
3. Please indicate your ethnicity.
4. Which of the following do you most often use to access social media: mobile device, laptop computer, desktop computer, other device?
5. Please indicate whether you are a home owner: Yes or No?
(c)
1. For the past week, how many hours did you spend using social media?
2. Please indicate your current age in years.
3. What was your annual income this past year?
4. Currently, how many friends have you accepted on Facebook?
5. Currently, how many Twitter followers do you have?
Chapter 2
2.1
(a)
Category   Frequency   Percentage
A          13          26%
B          28          56%
C          9           18%
(b) Category "B" is the majority.
2.2
(a) Table frequencies for all student responses
           Student Major Categories
Status     A        B        M        Totals
F/T        14       9        2        25
P/T        6        6        3        15
Totals     20       15       5        40
(b) Table percentages based on overall student responses
           Student Major Categories
Status     A        B        M        Totals
F/T        35.0%    22.5%    5.0%     62.5%
P/T        15.0%    15.0%    7.5%     37.5%
Totals     50.0%    37.5%    12.5%    100.0%
Table based on row percentages
           Student Major Categories
Status     A        B        M        Totals
F/T        56.0%    36.0%    8.0%     100.0%
P/T        40.0%    40.0%    20.0%    100.0%
Totals     50.0%    37.5%    12.5%    100.0%
Table based on column percentages
           Student Major Categories
Status     A        B        M        Totals
F/T        70.0%    60.0%    40.0%    62.5%
P/T        30.0%    40.0%    60.0%    37.5%
Totals     100.0%   100.0%   100.0%   100.0%
2.3
(a) You can conclude Apple, Samsung, and Others dominated the market from the third quarter of 2020 through the third quarter of 2021. Others had the largest market share in Q3 2020 but decreased from 38% to 31% in the third quarter of 2021. Samsung also decreased from 22% in Q3 2020 to 18% in Q3 2021. However, Apple gained in market share, increasing from 11% in the third quarter of 2020 to 15% in the third quarter of 2021.
(b) Apple, OPPO, vivo, and Xiaomi increased market share while Samsung and Others decreased market share.
2.4
(a)
Category                          Total       Percentages
Credit reporting, credit repair   685,189     63.35%
Debt collection                   163,512     15.12%
Credit card or prepaid card       88,175      8.15%
Checking or savings account       72,555      6.71%
Mortgage                          72,241      6.68%
Total                             1,081,672
(b) There are more complaints for credit reporting, debt collection, and credit card or prepaid card than the other categories. These categories account for about 87% of all the complaints.
(c)
Category                                      Total    Percentage
General purpose credit card or charge card    5836     78.36%
General purpose prepaid card                  240      3.22%
Gift card                                     24       0.32%
Government benefit card                       253      3.40%
Payroll card                                  26       0.35%
Store credit card                             1065     14.30%
Student prepaid card                          4        0.05%
Total                                         7,448
(d) The bulk of the complaints were for general purpose credit card or charge card, and for store credit card.
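The percentage column of a summary table such as the one in 2.4 (c) is each category's count divided by the total, times 100. A minimal sketch using the card complaint counts from part (c) follows; the function name is illustrative, and rounding to two decimals is an assumption chosen to match the table.

```python
def percentage_table(frequencies):
    """Return each category's share of the total as a percentage (2 decimals)."""
    total = sum(frequencies.values())
    return {category: round(100 * count / total, 2)
            for category, count in frequencies.items()}


# Counts from the table in 2.4 (c).
card_complaints = {
    "General purpose credit card or charge card": 5836,
    "General purpose prepaid card": 240,
    "Gift card": 24,
    "Government benefit card": 253,
    "Payroll card": 26,
    "Store credit card": 1065,
    "Student prepaid card": 4,
}

percentages = percentage_table(card_complaints)
for category, pct in percentages.items():
    print(f"{category}: {pct:.2f}%")
```

Summing the counts first also serves as a check that the stated total of 7,448 agrees with the rows.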
2.5
The respondents from the sample indicated that about half expect increases in budget for the upcoming year in all three categories: data visualization tools, 48%, advanced analytics, 53%, and applied AI solutions, 52%. The respondents indicated little support for decrease or no investment, with responses to each of those categories at 6% or less.
2.6
The largest sources of electricity in the United States are natural gas followed equally by coal, nuclear, and renewables.
2.7
(a)
Cloud Value Measure                                                    Frequency   Percentage
Cost savings                                                           74          14.12%
Increased revenue                                                      10          1.91%
Faster innovation and delivery of new digital products and services   127         24.24%
Ability to execute on strategy to fundamentally change the business   90          17.18%
Improved operation resilience, safety, and soundness                   105         20.04%
We are not specifically measuring value because cloud is seen as a necessary business foundation   21   4.01%
Unsure                                                                 6           1.15%
Total                                                                  524
(b) Executives expect the greatest value of the cloud to be in faster innovation and delivery of new digital products and services, followed by improved operation resilience, safety, and soundness. Few executives selected increased revenue, "we are not specifically measuring value because cloud is seen as a necessary business foundation," or unsure.
2.8
(a) Table of row percentages:
                   Student Status
Pizza Preference   Full-time   Part-time   Total
Local              48.97%      51.03%      100.00%
National chain     74.67%      25.33%      100.00%
Total              57.73%      42.27%      100.00%
Table of column percentages:
                   Student Status
Pizza Preference   Full-time   Part-time   Total
Local              55.91%      79.57%      65.91%
National chain     44.09%      20.43%      34.09%
Total              100.00%     100.00%     100.00%
Table of total percentages:
                   Student Status
Pizza Preference   Full-time   Part-time   Total
Local              32.27%      33.64%      65.91%
National chain     25.45%      8.64%       34.09%
Total              57.73%      42.27%      100.00%
(b) The full-time students were more likely to prefer the local pizza. The part-time students were much more likely to prefer the local pizza.
2.9
(a) Table of row percentages:
           Location
Churned    Ashland    Springville   Total
Yes        49.72%     50.28%        100.00%
No         50.60%     49.40%        100.00%
Total      50.32%     49.68%        100.00%
Table of column percentages:
           Location
Churned    Ashland    Springville   Total
Yes        31.45%     32.21%        31.83%
No         68.55%     67.79%        68.17%
Total      100.00%    100.00%       100.00%
Table of total percentages:
           Location
Churned    Ashland    Springville   Total
Yes        15.82%     16.00%        31.83%
No         34.49%     33.68%        68.17%
Total      50.32%     49.68%        100.00%
(b) The customers in Ashland are not likely to churn. The customers in Springville are not likely to churn.
2.10
(a)
Summary of results:
           Paperless Billing
Churned    Yes       No        Total
Yes        1,358     398       1,756
No         2,367     1,394     3,761
Total      3,725     1,792     5,517
Table of row percentages:
           Paperless Billing
Churned    Yes       No        Total
Yes        77.33%    22.67%    100.00%
No         62.94%    37.06%    100.00%
Total      67.52%    32.48%    100.00%
Table of column percentages:
           Paperless Billing
Churned    Yes       No        Total
Yes        36.46%    22.21%    31.83%
No         63.54%    77.79%    68.17%
Total      100.00%   100.00%   100.00%
Table of total percentages:
           Paperless Billing
Churned    Yes       No        Total
Yes        24.61%    7.21%     31.83%
No         42.90%    25.27%    68.17%
Total      67.52%    32.48%    100.00%
(b)
The customers who had paperless billing were more likely to churn.
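As a check on the three percentage tables in 2.10, the following minimal Python sketch recomputes row, column, and total percentages from the summary counts. The function name `contingency_percentages` and the dictionary layout are illustrative assumptions, not part of the textbook's Excel workflow.

```python
# Counts from the 2.10 summary table.
# Outer key = Churned (row), inner key = Paperless Billing (column).
counts = {
    "Yes": {"Yes": 1358, "No": 398},
    "No": {"Yes": 2367, "No": 1394},
}


def contingency_percentages(counts):
    """Return (row_pct, col_pct, total_pct); every value is a percentage."""
    columns = list(next(iter(counts.values())))
    row_totals = {r: sum(cells.values()) for r, cells in counts.items()}
    col_totals = {c: sum(counts[r][c] for r in counts) for c in columns}
    grand_total = sum(row_totals.values())

    # Each cell divided by its row total, its column total, or the grand total.
    row_pct = {r: {c: 100 * counts[r][c] / row_totals[r] for c in columns} for r in counts}
    col_pct = {r: {c: 100 * counts[r][c] / col_totals[c] for c in columns} for r in counts}
    total_pct = {r: {c: 100 * counts[r][c] / grand_total for c in columns} for r in counts}
    return row_pct, col_pct, total_pct


row_pct, col_pct, total_pct = contingency_percentages(counts)

# Churn rate within each billing group comes from the column percentages.
print(f"Churned, paperless billing:    {col_pct['Yes']['Yes']:.2f}%")
print(f"Churned, no paperless billing: {col_pct['Yes']['No']:.2f}%")
```

The column percentages are the ones that support the conclusion in (b), because they compare the churn rate within each billing group.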
2.11
Ordered array: 64 68 71 75 81 88 94
2.12
Ordered array: 73 78 78 78 85 88 91
2.13
(a)
(166 + 100)/591 * 100 = 45.01%
(b)
(124 + 77)/591 * 100 = 34.01%
(c)
(59 + 65)/591 * 100 = 20.98%
(d)
45% of the incidents took fewer than 2 days and 66% of the incidents were detected in less than 8 days. 79% of the incidents were detected in less than 31 days.
2.14
Interval width = (261,000 - 61,000)/6 = 33,333.33, so choose 40,000 as the interval width.
(a)
$60,000 – under $100,000; $100,000 – under $140,000; $140,000 – under $180,000; $180,000 – under $220,000; $220,000 – under $260,000; $260,000 – under $300,000
(b)
$40,000
(c)
(60,000 + 100,000)/2 = $80,000; similarly, the remaining class midpoints are $120,000; $160,000; $200,000; $240,000; $280,000
2.15
(a)
Franchise valuations ordered array: 0.990 1.100 1.110 1.180 1.190 1.280 1.300 1.320 1.375 1.380 1.385 1.390 1.400 1.575 1.700 1.760 1.780 1.980 2.000 2.050 2.100 2.200 2.300 2.450 2.650 3.500 3.800 3.900 4.075 6.000
(b)
The valuations range from 0.990 to 6.000.
Franchise Valuations         Frequency   Percentage
0.990 but less than 1.700    15          50.0%
1.700 but less than 2.400    8           26.7%
2.400 but less than 3.100    2           6.7%
3.100 but less than 3.800    2           6.7%
3.800 but less than 4.500    2           6.7%
4.500 but less than 5.200    0           0%
5.200 to 6.000               1           3.3%
Half of the valuations are less than 1.700.
(c)
Payrolls ordered array: 48.06 58.08 60.39 61.66 79.78 83.92 93.25 94.93 104.64 113.02 129.54 132.19 133.14 135.29 136.91 146.20 151.48 157.87 158.20 170.74 174.36 181.89 184.63 190.37 211.87 211.97 224.45 236.84 266.00 284.73
(d)
The payrolls range from 48.06 to 284.73.
Payroll                        Frequency   Percentage
48.06 but less than 82.06      5           16.7%
82.06 but less than 116.06     5           16.7%
116.06 but less than 150.06    6           20.0%
150.06 but less than 184.06    6           20.0%
184.06 but less than 218.06    4           13.3%
218.06 but less than 252.06    2           6.7%
252.06 to 284.73               2           6.7%
Payroll seems centered around 150.06.
2.16
(a)
Total Housing Cost         Frequency   Percentage
$250 but less than $300    4           7.84%
$300 but less than $350    17          33.33%
$350 but less than $400    15          29.41%
$400 but less than $450    12          23.53%
$450 but less than $500    1           1.96%
$500 but less than $550    2           3.92%
(b)
Total Housing Cost    Frequency   Percentage   Cumulative %
$250                  0           0.00%        0.00%
$300                  4           7.84%        7.84%
$350                  17          33.33%       41.17%
$400                  15          29.41%       70.59%
$450                  12          23.53%       94.12%
$500                  1           1.96%        96.08%
$550                  2           3.92%        100.00%
(c)
The apartment costs are clustered between $300 and $450.
2.17
(a)
Commuting Time         Frequency   Percentage
19 but less than 22    15          15%
22 but less than 25    38          38%
25 but less than 28    27          27%
28 but less than 31    13          13%
31 but less than 34    5           5%
34 but less than 37    2           2%
(b)
Commuting Time         Frequency   Percentage   Cumulative %
19 but less than 22    15          15%          15%
22 but less than 25    38          38%          53%
25 but less than 28    27          27%          80%
28 but less than 31    13          13%          93%
31 but less than 34    5           5%           98%
34 but less than 37    2           2%           100%
(c)
More than half of commuters spend from 19 up to 25 minutes commuting each week. 93% of commuters spend from 19 up to 31 minutes commuting each week.
2.18
(a), (b)
Credit Score       Frequency   Percent (%)   Cumulative Percent (%)
670 – under 680    1           1.96          1.96
680 – under 690    4           7.84          9.80
690 – under 700    8           15.69         25.49
700 – under 710    5           9.80          35.29
710 – under 720    12          23.53         58.82
720 – under 730    14          27.45         86.27
730 – under 740    7           13.73         100.00
(c)
The average credit scores are concentrated between 710 and 730.
2.19
(a), (b)
Bin                                Frequency   Percentage   Cumulative %
–0.00350 but less than –0.00201    13          13.00%       13.00%
–0.00200 but less than –0.00051    26          26.00%       39.00%
–0.00050 but less than 0.00099     32          32.00%       71.00%
0.00100 but less than 0.00249      20          20.00%       91.00%
0.00250 but less than 0.00399      8           8.00%        99.00%
0.004 but less than 0.00549        1           1.00%        100.00%
(c)
Yes, the steel mill is doing a good job at meeting the requirement as there is only one steel part out of a sample of 100 that is as much as 0.005 inches longer than the specified requirement.
2.20
(a), (b)
Time in Seconds    Frequency   Percent (%)
5 – under 10       8           16%
10 – under 15      15          30%
15 – under 20      18          36%
20 – under 25      6           12%
25 – under 30      3           6%
(b)
Time in Seconds    Percentage Less Than
5                  0
10                 16
15                 46
20                 82
25                 94
30                 100
(c)
The target is being met since 82% of the calls are being answered in less than 20 seconds.
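The "Percentage Less Than" column in 2.20 is a running total of the class frequencies divided by the sample size. A minimal Python sketch of that calculation follows; the helper name `cumulative_percentages` is an illustrative assumption.

```python
from itertools import accumulate

# Class upper boundaries (seconds) and frequencies from the 2.20 table.
upper_bounds = [10, 15, 20, 25, 30]
frequencies = [8, 15, 18, 6, 3]


def cumulative_percentages(freqs):
    """Percentage of observations falling below each class upper boundary."""
    total = sum(freqs)
    return [100 * running / total for running in accumulate(freqs)]


cum_pct = cumulative_percentages(frequencies)
for bound, pct in zip(upper_bounds, cum_pct):
    print(f"less than {bound} seconds: {pct:.0f}%")
```

Reading the entry for 20 seconds gives the share of calls answered within the target time.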
2.21
(a)
Call Duration (seconds)   Frequency   Percentage
60 up to 119              7           14%
120 up to 179             12          24%
180 up to 239             11          22%
240 up to 299             11          22%
300 up to 359             4           8%
360 up to 419             3           6%
420 and longer            2           4%
Total                     50          100%
(b)
Call Duration (seconds)   Frequency   Percentage   Cumulative %
60 up to 119              7           14%          14%
120 up to 179             12          24%          38%
180 up to 239             11          22%          60%
240 up to 299             11          22%          82%
300 up to 359             4           8%           90%
360 up to 419             3           6%           96%
420 and longer            2           4%           100%
Total                     50          100%
(c)
The call center’s target of call duration less than 240 seconds is only met for 60% of the calls in this data set.
2.22
(a)
(b)
(c)
Manufacturer B produces bulbs with longer lives than Manufacturer A. The cumulative percentage for Manufacturer B shows 65% of its bulbs lasted less than 50,500 hours, contrasted with 70% of Manufacturer A’s bulbs, which lasted less than 49,500 hours. None of Manufacturer A’s bulbs lasted more than 51,500 hours, but 12.5% of Manufacturer B’s bulbs lasted between 51,500 and 52,500 hours. At the same time, 7.5% of Manufacturer A’s bulbs lasted less than 47,500 hours, whereas all of Manufacturer B’s bulbs lasted at least 47,500 hours.
2.23
(a)
Amount of Soft Drink   Frequency   Percentage
1.850 – 1.899          1           2%
1.900 – 1.949          5           10%
1.950 – 1.999          18          36%
2.000 – 2.049          19          38%
2.050 – 2.099          6           12%
2.100 – 2.149          1           2%
(b)
Amount of Soft Drink   Frequency Less Than   Percentage Less Than
1.899                  1                     2%
1.949                  6                     12%
1.999                  24                    48%
2.049                  43                    86%
2.099                  49                    98%
2.149                  50                    100%
The amount of soft drink filled in the two-liter bottles is most concentrated in two intervals on either side of the two-liter mark, from 1.950 to 1.999 and from 2.000 to 2.049 liters. Almost three-fourths of the 50 bottles sampled contained between 1.950 liters and 2.049 liters.
2.24
(a)
Purchase Method    Average per Month
Mail order         0.1
Prepaid            0.8
Online bill pay    1.9
Check              2.3
Bank account       2.3
Cash               6.5
Credit             9.4
Debit              9.8
[Bar chart, pie chart, and Pareto chart of Average per Month by Purchase Method]
(b) (c)
The Pareto chart is best for portraying these data because it not only sorts the frequencies in descending order but also provides the cumulative line on the same chart. You can conclude that debit, credit, and cash are the "vital few" transaction methods.
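The ordering and cumulative line behind a Pareto chart can be sketched in a few lines of Python. The values below are the Average per Month figures from 2.24(a). Treating each value as a share of their sum is an assumption made here for illustration, and `pareto_table` is a hypothetical helper name.

```python
# Average transactions per month by purchase method, from 2.24(a).
averages = {
    "Mail order": 0.1,
    "Prepaid": 0.8,
    "Online bill pay": 1.9,
    "Check": 2.3,
    "Bank account": 2.3,
    "Cash": 6.5,
    "Credit": 9.4,
    "Debit": 9.8,
}


def pareto_table(values):
    """Sort categories in descending order and attach cumulative percentages."""
    total = sum(values.values())
    ordered = sorted(values.items(), key=lambda item: item[1], reverse=True)
    rows = []
    running = 0.0
    for category, value in ordered:
        running += value
        rows.append((category, value, 100 * value / total, 100 * running / total))
    return rows


rows = pareto_table(averages)
for category, value, pct, cum_pct in rows:
    print(f"{category:<16}{value:>5.1f}{pct:>8.1f}%{cum_pct:>8.1f}%")
```

The first three rows of the output are the "vital few" categories; the cumulative column shows how much of the total they cover.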
2.25
(a)
Activity                          Hours Spent
Grooming                          0.8
Eating and drinking               1.0
Traveling                         1.4
Others                            2.2
Working and related activities    2.3
Educational activities            3.5
Leisure and sports                4.0
Sleeping                          8.8
[Bar chart, pie chart, and Pareto chart of Hours Spent by Activity]
(b), (c)
The Pareto diagram is better than the pie chart or the bar chart because it not only sorts the frequencies in descending order, it also provides the cumulative polygon on the same scale. From the Pareto diagram it is obvious that more than 50% of their day is spent sleeping, taking part in leisure and sports, and educational activities.
2.26
(a)
(b)
19% + 20% + 40% = 79%
(c)
(d)
The Pareto diagram is better than the pie chart because it not only sorts the frequencies in descending order, it also provides the cumulative polygon on the same scale.
2.27
(a)
(b)
The "vital few" reasons for the categories of complaints are "Credit Reporting, credit repair" and "Debt Collection," which account for more than 78% of the complaints. The remaining are the "trivial many," which make up less than 22% of the complaints.
(c)
(d)
The Pareto diagram is better than the pie chart and bar chart because it allows you to see which categories account for most of the complaints.
2.28
(a)
(b), (c)
Because energy use is spread over many types of appliances, a bar chart may be best in showing which types of appliances used the most energy. Air conditioning, space heating, water heating, and lighting accounted for over one-half (55.7%) of the residential energy use in the United States.
2.29
(a)
(b)
Two-thirds of the market share percentage (66%) is for Starbucks and Dunkin.
2.30
(a)
(b)
Part-time students are more likely to order from a local restaurant than full-time students.
2.31
(a)
(b)
Both Ashland and Springville have about the same amount of churning.
2.32
(a)
(b)
More paperless billing customers churned than customers who do not have paperless billing.
2.33
Stem-and-leaf of Finance Scores
5 | 34
6 | 9
7 | 4
9 | 38
2.34
Ordered array: 50 74 74 76 81 89 92
2.35
(a)
Ordered array: 9.1 9.4 9.7 10.0 10.2 10.2 10.3 10.8 11.1 11.2 11.5 11.5 11.6 11.6 11.7 11.7 11.7 12.2 12.2 12.3 12.4 12.8 12.9 13.0 13.2
(b)
(c)
The stem-and-leaf display conveys more information than the ordered array. We can more readily determine the arrangement of the data from the stem-and-leaf display than we can from the ordered array. We can also obtain a sense of the distribution of the data from the stem-and-leaf display.
(d)
The most likely gasoline purchase is between 11 and 11.7 gallons. Yes, the third row is the most frequently occurring stem in the display and it is located in the center of the distribution.
2.36
(a)
Stem unit: 1 ($billions)
1 | 01122333444446788
2 | 00012357
3 | 589
4 | 1
5 |
6 | 0
(b)
The values are concentrated between $1 billion and $2 billion with one $6 billion (the New York Yankees).
(c)
Stem unit: 10 ($millions)
4 | 8
5 | 8
6 | 02
7 |
8 | 04
9 | 35
10 | 5
11 | 3
12 |
13 | 02357
14 | 6
15 | 188
16 |
17 | 14
18 | 25
19 | 0
20 |
21 | 22
22 | 4
23 | 7
24 |
25 |
26 | 6
27 |
28 | 5
(d)
The payrolls are spread out between $48 million and $285 million with some concentration between $130 million and $150 million.
2.37
(a)
Download Speed: 6.5 29.4 31.1 32.5 32.8 36.3 37.1 53.3
Upload Speed: 3.7 4.0 5.8 12.9 13.0 15.6 16.9 17.5
(b)
Download Speeds: Stem unit: 10, leaf rounded to nearest integer
Upload Speeds: Stem unit: 1
(c)
The stem-and-leaf display conveys more information than the ordered array. We can more readily determine the arrangement of the data from the stem-and-leaf display than we can from the ordered array. We can also obtain a sense of the distribution of the data from the stem-and-leaf display.
(d)
Download speeds are concentrated around 30 Mbps and upload speeds are varied, with a group around 3 to 5 Mbps and a group around 13 to 17 Mbps.
2.38
(a)
(b)
(c)
The majority of electricity charges are clustered between $90 and $130.
2.39
The cost of attending a baseball game is concentrated around $65 with twelve teams at that cost. Five teams have costs of $85 and one team has the highest cost of $115.
2.40
Property taxes on a $176K home seem concentrated between $700 and $2,200 and also between $3,200 and $3,700.
2.41
(a)
(b)
(c)
The majority (79%) of commuters living in cities spend from 28 but less than 28 minutes commuting each week.
2.42
(a)
(b)
(c)
The average credit scores are concentrated between 710 and 740.
2.43
(a)
(b)
Yes, the steel mill is doing a good job at meeting the requirement as there is only one steel part out of a sample of 100 that is as much as 0.005 inches longer than the specified requirement.
2.44
(a)
(b)
(c)
The target is being met since 82% of the calls are being answered in less than 20 seconds.
2.45
(a)
(b)
(c)
The call center’s target of call duration less than 240 seconds is only met for 60% of the calls in this data set.
2.46
(a)
(b)
(c)
Manufacturer B produces bulbs with longer lives than Manufacturer A.
2.47
(a)
(b)
Amount of Soft Drink   Frequency Less Than   Percentage Less Than
1.899                  1                     2%
1.949                  6                     12%
1.999                  24                    48%
2.049                  43                    86%
2.099                  49                    98%
2.149                  50                    100%
(c)
The amount of soft drink filled in the two-liter bottles is most concentrated in two intervals on either side of the two-liter mark, from 1.950 to 1.999 and from 2.000 to 2.049 liters. Almost three-fourths of the 50 bottles sampled contained between 1.950 liters and 2.049 liters.
2.48
(a)
[Scatter plot of Y versus X]
(b)
There is no relationship between X and Y.
2.49
(a)
(b)
Annual sales appear to be increasing in the earlier years before 2014, remain flat from 2015 to 2017, and then start to decline after 2018.
2.50
(a)
(b)
(c)
There appears to be a linear relationship between the first weekend gross and both the U.S. gross and the worldwide gross of Wizarding World movies. However, this relationship is greatly affected by the results of the Deathly Hallows, Part II movie.
2.51
(a)
(b)
(c)
(d)
There appears to be a positive relationship between Cable and Internet. There does not seem to be a relationship between Electricity and Cable, nor between Electricity and Internet.
2.52
(a)
There appears to be little relationship between the download speed and the upload speed, although the carrier with the highest download speed also has the highest upload speed, as one might guess.
(b)
(c)
Yes, this is borne out by the data.
2.53
(a)
(b)
(c)
(d)
There does appear to be some relationship between Value and Payroll, and also between Wins and Payroll.
2.54
(a)
Excel output:
(b)
There is a great deal of variation in the returns from decade to decade. Most of the returns are between 5% and 15%. The 1950s, 1980s, and 1990s had exceptionally high returns, and only the 1930s had negative returns.
2.55
(a)
(b)
There is an upward trend in home sales prices until 2006. Prices decline or remain flat from 2006 – 2011. From 2011 – 2016 there is an upward trend in median price of new home sales. Prices decline or remain flat from 2017 – 2019. From 2020 to 2022, there is an upward trend in home sales prices.
2.56
(a)
(b)
There was a decline in movie attendance from 2001 to 2021. During that time movie attendance first increased in 2002 before starting a long, slow decline through 2019. In 2020, movie attendance suffered a sharp drop, but then significantly increased in 2021.
2.57
(a)
(b)
From 2004 to 2008 the cost of a 30-second ad was constant at 2.7 million dollars. Since 2010, the cost has increased to its highest level of 5.6 million dollars in 2020 and then decreased to 5.5 million dollars in 2021.
2.58
(a)
Pivot Table in terms of %
Count of Type    Column Labels
Row Labels       One      Two       Three     Four      Five     Grand Total
Growth           3.11%    13.94%    22.60%    11.77%    4.47%    55.89%
  Small          0.41%    2.71%     6.22%     1.62%     1.49%    12.45%
  Mid-Cap        0.95%    3.52%     4.74%     3.92%     0.68%    13.80%
  Large          1.76%    7.71%     11.64%    6.22%     2.30%    29.63%
Value            2.57%    11.37%    16.78%    9.07%     4.33%    44.11%
  Small          0.54%    1.62%     3.11%     2.03%     0.68%    7.98%
  Mid-Cap        0.68%    2.30%     2.84%     1.62%     0.54%    7.98%
  Large          1.35%    7.44%     10.83%    5.41%     3.11%    28.15%
Grand Total      5.68%    25.30%    39.38%    20.84%    8.80%    100.00%
(b)
The growth and value funds have similar patterns in terms of star rating and type. Both growth and value funds have more funds with a rating of three. Very few funds have ratings of five. The growth and value funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.
(c)
Pivot Table in terms of Average Three-Year Return
Average of 3Yr Return   Column Labels
Row Labels       One      Two      Three    Four     Five     Grand Total
Growth           8.94     14.29    16.93    18.32    24.90    16.76
  Small          7.86     10.79    14.35    15.10    22.75    14.47
  Mid-Cap        9.41     13.03    15.69    16.69    20.92    15.12
  Large          8.94     16.10    18.82    20.18    27.45    18.48
Value            8.71     11.24    12.27    13.94    16.63    12.57
  Small          6.06     10.37    11.62    12.54    14.16    11.44
  Mid-Cap        9.15     11.33    12.04    14.23    16.10    12.31
  Large          9.55     11.40    12.52    14.38    17.26    12.97
Grand Total      8.84     12.92    14.95    16.41    20.83    14.91
(d)
The average 3-year return is directly related to star rating for both growth and value funds, with higher-rated funds having a higher 3-year average rate of return. For value funds, the average three-year rate of return is approximately the same for all market cap levels.
2.59
(a)
Pivot table of tallies in terms of counts:
Count of Star Rating   Column Labels
Row Labels       One    Two    Three   Four   Five   Grand Total
Small            7      32     69      27     16     151
  Low            1      5      14      6      3      29
  Average        2      9      20      6      6      43
  High           4      18     35      15     7      79
Mid-Cap          12     43     56      41     9      161
  Low            3      12     20      9      1      45
  Average        3      17     26      22     3      71
  High           6      14     10      10     5      45
Large            23     112    166     86     40     427
  Low            7      29     58      28     16     138
  Average        11     43     80      39     19     192
  High           5      40     28      19     5      97
Grand Total      42     187    291     154    65     739
Pivot table of tallies in terms of % of grand total:
Count of Star Rating   Column Labels
Row Labels       One      Two       Three     Four      Five     Grand Total
Small            0.95%    4.33%     9.34%     3.65%     2.17%    20.43%
  Low            0.14%    0.68%     1.89%     0.81%     0.41%    3.92%
  Average        0.27%    1.22%     2.71%     0.81%     0.81%    5.82%
  High           0.54%    2.44%     4.74%     2.03%     0.95%    10.69%
Mid-Cap          1.62%    5.82%     7.58%     5.55%     1.22%    21.79%
  Low            0.41%    1.62%     2.71%     1.22%     0.14%    6.09%
  Average        0.41%    2.30%     3.52%     2.98%     0.41%    9.61%
  High           0.81%    1.89%     1.35%     1.35%     0.68%    6.09%
Large            3.11%    15.16%    22.46%    11.64%    5.41%    57.78%
  Low            0.95%    3.92%     7.85%     3.79%     2.17%    18.67%
  Average        1.49%    5.82%     10.83%    5.28%     2.57%    25.98%
  High           0.68%    5.41%     3.79%     2.57%     0.68%    13.13%
Grand Total      5.68%    25.30%    39.38%    20.84%    8.80%    100.00%
(b)
For the large-cap funds, the three-star rating category had the highest percentage of funds, followed by two-star, four-star, five-star, and one-star. Very few large-cap funds had ratings of five. This pattern was also seen with the mid-cap funds as a group. The same pattern was observed with the small-cap funds. However, the pattern was more subtle in that the differences in percentage were less in many cases. Within the large-cap fund category, the highest percentage of funds were in the average-risk category followed by the low-risk and high-risk categories. Within the mid-cap category, the highest percentage of funds were in the average-risk category followed by the high-risk and low-risk categories. Within the small-cap category, the highest percentage of funds were in the high-risk category followed by the average-risk and low-risk categories.
(c)
Average of 3Yr Return   Column Labels
Row Labels       One      Two      Three    Four     Five     Grand Total
Small            6.83     10.63    13.44    13.68    20.07    13.28
  Low            4.00     8.85     12.82    13.56    20.90    12.82
  Average        8.49     10.73    13.90    14.97    20.39    14.04
  High           6.71     11.08    13.43    13.21    19.44    13.04
Mid-Cap          9.30     12.35    14.32    15.97    18.78    14.09
  Low            11.46    10.43    14.39    16.43    39.17    14.10
  Average        8.49     12.89    14.67    15.86    16.35    14.42
  High           8.62     13.35    13.29    15.81    16.15    13.56
Large            9.21     13.80    15.78    17.48    21.59    15.79
  Low            10.97    14.61    15.80    17.38    23.22    16.49
  Average        8.77     12.98    15.69    18.10    20.98    15.70
  High           7.69     14.08    16.00    16.39    18.73    15.00
Grand Total      8.84     12.92    14.95    16.41    20.83    14.91
(d)
There are 28 high-risk large-cap funds with a three-star rating. Their average three-year return is 16.00.
2.60
(a)
Count of Type    Column Labels
Row Labels       One      Two       Three     Four      Five     Grand Total
Growth           3.11%    13.94%    22.60%    11.77%    4.47%    55.89%
  Low            1.08%    3.38%     6.63%     3.38%     1.62%    16.10%
  Average        1.22%    4.74%     10.01%    5.82%     1.62%    23.41%
  High           0.81%    5.82%     5.95%     2.57%     1.22%    16.37%
Value            2.57%    11.37%    16.78%    9.07%     4.33%    44.11%
  Low            0.41%    2.84%     5.82%     2.44%     1.08%    12.58%
  Average        0.95%    4.60%     7.04%     3.25%     2.17%    18.00%
  High           1.22%    3.92%     3.92%     3.38%     1.08%    13.53%
Grand Total      5.68%    25.30%    39.38%    20.84%    8.80%    100.00%
(b)
Both the growth and value funds are most likely to have a star rating of three followed by a rating of two and then four. Both the growth and value funds are most likely to have average risk followed equally by low and high risk.
(c)
Average of 3Yr Return   Column Labels
Row Labels       One      Two      Three    Four     Five     Grand Total
Growth           8.94     14.29    16.93    18.32    24.90    16.76
  Low            11.76    14.51    17.34    18.38    27.95    17.66
  Average        8.21     13.81    17.19    18.55    24.95    16.91
  High           6.28     14.56    16.04    17.71    20.74    15.64
Value            8.71     11.24    12.27    13.94    16.63    12.57
  Low            7.03     10.98    12.42    14.24    17.25    12.69
  Average        9.30     11.48    12.36    14.45    16.91    12.90
  High           8.81     11.15    11.90    13.24    15.47    12.03
Grand Total      8.84     12.92    14.95    16.41    20.83    14.91
(d)
The growth and value funds have similar patterns in terms of star rating and risk. Both growth and value funds have more funds with a rating of three. Very few funds have ratings of five. The growth and value funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.
2.61
(a)
Pivot table of tallies in terms of % of grand total:
Count of Star Rating   Column Labels
Row Labels       One      Two       Three     Four      Five     Grand Total
Growth           3.11%    13.94%    22.60%    11.77%    4.47%    55.89%
  Small          0.41%    2.71%     6.22%     1.62%     1.49%    12.45%
    Low          0.00%    0.41%     0.81%     0.41%     0.27%    1.89%
    Average      0.14%    0.68%     2.03%     0.27%     0.54%    3.65%
    High         0.27%    1.62%     3.38%     0.95%     0.68%    6.90%
  Mid-Cap        0.95%    3.52%     4.74%     3.92%     0.68%    13.80%
    Low          0.41%    0.81%     1.62%     0.95%     0.14%    3.92%
    Average      0.27%    1.49%     2.57%     2.30%     0.14%    6.77%
    High         0.27%    1.22%     0.54%     0.68%     0.41%    3.11%
  Large          1.76%    7.71%     11.64%    6.22%     2.30%    29.63%
    Low          0.68%    2.17%     4.19%     2.03%     1.22%    10.28%
    Average      0.81%    2.57%     5.41%     3.25%     0.95%    12.99%
    High         0.27%    2.98%     2.03%     0.95%     0.14%    6.36%
Value            2.57%    11.37%    16.78%    9.07%     4.33%    44.11%
  Small          0.54%    1.62%     3.11%     2.03%     0.68%    7.98%
    Low          0.14%    0.27%     1.08%     0.41%     0.14%    2.03%
    Average      0.14%    0.54%     0.68%     0.54%     0.27%    2.17%
    High         0.27%    0.81%     1.35%     1.08%     0.27%    3.79%
  Mid-Cap        0.68%    2.30%     2.84%     1.62%     0.54%    7.98%
    Low          0.00%    0.81%     1.08%     0.27%     0.00%    2.17%
    Average      0.14%    0.81%     0.95%     0.68%     0.27%    2.84%
    High         0.54%    0.68%     0.81%     0.68%     0.27%    2.98%
  Large          1.35%    7.44%     10.83%    5.41%     3.11%    28.15%
    Low          0.27%    1.76%     3.65%     1.76%     0.95%    8.39%
    Average      0.68%    3.25%     5.41%     2.03%     1.62%    12.99%
    High         0.41%    2.44%     1.76%     1.62%     0.54%    6.77%
Grand Total      5.68%    25.30%    39.38%    20.84%    8.80%    100.00%
(b)
For growth funds, most are rated as three-star followed by two-star, four-star, five-star, and one-star. Among the growth funds, large-cap and small-cap funds had the same pattern of star rating as observed for growth funds in general. Among the mid-cap growth funds, most were rated as three-star followed by four-star, two-star, one-star, and five-star. The pattern of star rating is different among the various risk levels within the large-cap, mid-cap, and small-cap growth funds. For value funds, most are rated as three-star followed by two-star, four-star, five-star, and one-star. Among the value funds, the pattern is the same for small-cap and large-cap funds. Mid-cap value funds have a different pattern. The pattern of star rating is different among the various risk levels within the large-cap, mid-cap, and small-cap funds.
(c)
The tables in 2.58 through 2.60 are easier to interpret because they contain fewer fields. The table in 2.61 tallies star rating across three fields: market type, market cap, and risk level. Problems 2.58 through 2.60 tally star rating across two fields. Problem 2.60 reveals that most value funds are rated as low-risk followed by average-risk and high-risk. Problem 2.61 reveals that this is only the case among large-cap value funds. Most mid-cap value funds are rated as average-risk followed by low-risk and high-risk. Most small-cap value funds are rated as average-risk followed by high-risk and low-risk. Problem 2.61 also reveals that among small-cap funds rated as average-risk, most are rated as four-star, followed by three-star and two-star. Because Problem 2.61 includes four fields compared to three fields included in problems 2.58 through 2.60, additional patterns can be observed.
(d)
2.62
(a)
Pivot Table in terms of %
Count of Type    Column Labels
Row Labels       One      Two       Three     Four      Five     Grand Total
Growth           2.08%    9.30%     15.09%    7.86%     2.98%    37.31%
  Small          0.27%    1.81%     4.16%     1.08%     0.99%    8.31%
  Mid-Cap        0.63%    2.35%     3.16%     2.62%     0.45%    9.21%
  Large          1.17%    5.15%     7.77%     4.16%     1.54%    19.78%
Value            1.72%    7.59%     11.20%    6.05%     2.89%    29.45%
  Small          0.36%    1.08%     2.08%     1.36%     0.45%    5.33%
  Mid-Cap        0.45%    1.54%     1.90%     1.08%     0.36%    5.33%
  Large          0.90%    4.97%     7.23%     3.61%     2.08%    18.79%
Blend            2.80%    8.31%     12.01%    7.77%     2.35%    33.24%
  Small          0.63%    2.98%     2.89%     2.26%     0.36%    9.12%
  Mid-Cap        0.54%    2.17%     2.17%     1.08%     0.63%    6.59%
  Large          1.63%    3.16%     6.96%     4.43%     1.36%    17.52%
Grand Total      6.59%    25.20%    38.30%    21.68%    8.22%    100.00%
(b)
The growth and value funds have similar patterns in terms of star rating and type. Each of the categories of funds have more funds with a star rating of three. Very few funds have ratings of five. Each of the categories of funds have similar patterns in terms of market cap with many more large cap funds than mid-cap or small funds.
(c)
Pivot Table in terms of Average Three-Year Return
Average of 3Yr Return   Column Labels
Row Labels       One      Two      Three    Four     Five     Grand Total
Growth           8.94     14.29    16.93    18.32    24.90    16.76
  Small          7.86     10.79    14.35    15.10    22.75    14.47
  Mid-Cap        9.41     13.03    15.69    16.69    20.92    15.12
  Large          8.94     16.10    18.82    20.18    27.45    18.48
Value            8.71     11.24    12.27    13.94    16.63    12.57
  Small          6.06     10.37    11.62    12.54    14.16    11.44
  Mid-Cap        9.15     11.33    12.04    14.23    16.10    12.31
  Large          9.55     11.40    12.52    14.38    17.26    12.97
Blend            8.37     12.11    14.69    15.99    18.07    14.05
  Small          -0.35    9.53     11.61    12.77    12.81    10.44
  Mid-Cap        9.84     11.53    12.53    13.72    16.24    12.53
  Large          11.27    14.94    16.63    18.19    20.33    16.51
Grand Total      8.64     12.66    14.86    16.26    20.04    14.63
(d)
The average 3-year return is directly related to star rating for all categories of funds, with higher-rated funds having a higher 3-year average rate of return. For blend and growth funds, the average three-year rate of return is higher for large funds.
2.63
(a)
Row Labels     Average of Assets   Average of SD
Low            6588.208679         21.21665094
Average        5318.661634         21.21418301
High           3310.876063         22.17891403
Grand Total    5082.428024         21.50339648
(b)
Row Labels     Average of SD   Average of Assets
Growth         21.67723971     5346.211598
Value          21.28315951     4748.248221
Grand Total    21.50339648     5082.428024
(c)
The results from (a) reveal that the average of SD increases as the risk level increases while average of assets decreases as risk level increases. The results from (b) reveal that the average of SD is higher for growth funds compared to value funds. The patterns suggest that value funds are likely to be associated with less risk because the average of SD was lower among value funds and low risk funds.
2.64
Funds 1092, 1107, 1101, 782, and 259 have the lowest five-year return.
2.65
(a)
Row Labels     Average of YTD Return   Average of 10Yr Return
Small          -8.356092715            11.58589404
Mid-Cap        -8.220496894            11.99987578
Large          -5.908758782            12.99564403
Grand Total    -6.912462788            12.49064953
(b)
Row Labels     Average of YTD Return   Average of 5Yr Return
Small          -8.356092715            11.94039735
Mid-Cap        -8.220496894            12.47677019
Large          -5.908758782            13.85990632
Grand Total    -6.912462788            13.16635995
(c)
(d)
For the 1-year versus 10-year return chart, the 10-year returns are much higher than the 1-year returns, with similar 10-year returns near 12 percent for all three market cap categories. For the 1-year versus 5-year chart, the 5-year returns are all higher than the 1-year returns. The 5-year returns are slightly higher than the 10-year returns. Because the average 5-year returns were all higher than the 10-year returns for all market cap categories, one can conclude that the returns were lower in years 6 through 10. Without annual data, one cannot conclude whether this was due to consistently lower returns across the years or the result of one or two years with lower returns.
2.66
The five funds with the lowest five-year return have (1) Large cap, Growth, High risk, One-star rating, (2) Large cap, Growth, Average risk, One-star rating, (3) Large cap, Growth, Average risk, One-star rating, (4) Small cap, Value, High risk, One-star rating, and (5) Small cap, Value, Low risk, Two-star rating.
2.67
(a)
Year 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
(b)
The sparklines reveal a general upward trend in home prices during the years 2000, 2003-2005, 2012-2017, and 2019-2021. There was a general downward trend during the years 2007, 2008, and 2011.
(c)
In the time-series plot one can see an upward trend in home sales price until 2006. Prices decline or remain flat from 2006 – 2011. From 2011 – 2016 there is an upward trend in median price of new home sales.
2.68
(a)
Year 2011 2012 2013 2014 2015 2016 2017 2018 2019 2020 2021 2022
(b)
There has been a slight decline in the price of natural gas over time until 2020, when prices greatly increased. Generally, the price is highest in the middle of the year.
2.69
Student project answers will vary
2.70
Student project answers will vary
2.71
(a)
There is a title.
(b)
None of the axes are labeled.
(c)
2.72
(a)
There is a title.
(b)
The simplest possible visualization is not used.
(c)
2.73
(a) (b) (c)
2.74
Answers will vary depending on selection of source.
None. The use of a 3D graph.
2.75
(a)
"Flexibility Breeds Productivity Increases"
(b)
The bar chart and the pie chart should be preferred over the exploded pie chart, the cone chart, and the pyramid chart since the former set is simpler and easier to interpret.
2.76
(a)
(b)
The bar chart and the pie chart should be preferred over the exploded pie chart, the cone chart, and the doughnut chart since the former set is simpler and easier to interpret.
2.77
A histogram uses bars to represent each class while a polygon uses a single point. The histogram should be used for only one group, while several polygons can be plotted on a single graph.
2.78
A summary table allows one to determine the frequency or percentage of occurrences in each category.
2.79
A bar chart is useful for comparing categories. A pie chart is useful when examining the portion of the whole that is in each category. A Pareto diagram is useful in focusing on the categories that make up most of the frequencies or percentages.
2.80
The bar chart for categorical data is plotted with the categories on the vertical axis and the frequencies or percentages on the horizontal axis. In addition, there is a separation between categories. The histogram is plotted with the class grouping on the horizontal axis and the frequencies or percentages on the vertical axis. This allows one to more easily determine the distribution of the data. In addition, there are no gaps between classes in the histogram.
2.81
A time-series plot is a type of scatter diagram with time on the x-axis.
2.82
Because the categories are arranged according to frequency or importance, it allows the user to focus attention on the categories that have the greatest frequency or importance.
2.83
Percentage breakdowns according to the total percentage, the row percentage, and/or the column percentage allow the interpretation of data in a two-way contingency table from several different perspectives.
2.84
A contingency table contains information on two categorical variables whereas a multidimensional table can display information on more than two categorical variables.
2.85
The multidimensional PivotTable can reveal additional patterns that cannot be seen in the contingency table. One can also change the statistic displayed and compute descriptive statistics which can add insight into the data.
2.86
In a PivotTable in Excel, double-clicking a cell drills down and causes Excel to display the underlying data in a new worksheet to enable you to then observe the data for patterns. In Excel, a slicer is a panel of clickable buttons that appears superimposed over a worksheet to enable you to work with many variables at once in a way that avoids creating an overly complex multidimensional contingency table that would be hard to comprehend and interpret.
2.87
Sparklines are compact time-series visualizations of numerical variables. Sparklines can also be used to plot time-series data using smaller time units than a time-series plot to reveal patterns that the time-series plot may not.
2.88
(a)
(b)
(c)
The publisher has the largest portion (66.06%) of the cost. 24.93% is editorial production manufacturing costs. The publisher’s marketing accounts for the next largest share of the revenue, at 11.60%. Author and bookstore personnel each account for around 11 to 12% of the cost, whereas the publisher profit accounts for more than 22% of the cost. Yes, the publisher’s profit is almost twice the authors’ share of the cost.
2.89
(a)
Pareto charts: Number of Movies, Gross, Tickets Sold
(b)
Based on the Pareto chart for the number of movies, Drama, Comedy, Action, and Thriller/Suspense are the "vital few" and capture about 72% of the market share. According to the Pareto chart for gross (in $millions) and the Pareto chart for number of tickets sold (in millions), Adventure, Action, Drama, and Thriller/Suspense are the "vital few" and capture about 82% of the market share.
2.90
(a)
Cybersecurity 1
Cybersecurity 2
(b)
A bar chart would be best for both tables because each table contains a pair of categories that have similar percentages.
(c)
Almost all small business owners have a concern about cybersecurity and most assign someone the primary responsibility for online security.
2.91
(a)
Type of Entrée   Percentage   Number Ordered
Beef             29.68%       187
Chicken          16.35%       103
Mixed            4.76%        30
Duck             3.97%        25
Fish             19.37%       122
Pasta            10.00%       63
Vegan            11.75%       74
Veal             4.13%        26
Total            100.00%      630
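The percentage column in the entrée table, and the cumulative percentages a Pareto diagram would display, can be recomputed from the counts. The following is a minimal sketch, not the textbook's Excel procedure; the names `orders` and `pareto_table` are illustrative only.

```python
# Counts copied from the entrée table; the dictionary name is hypothetical.
orders = {"Beef": 187, "Chicken": 103, "Mixed": 30, "Duck": 25,
          "Fish": 122, "Pasta": 63, "Vegan": 74, "Veal": 26}


def pareto_table(counts):
    """Return (category, count, percent, cumulative percent) rows, largest first."""
    total = sum(counts.values())
    rows, running = [], 0
    for name, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += n
        rows.append((name, n, 100 * n / total, 100 * running / total))
    return rows


for name, n, pct, cum in pareto_table(orders):
    print(f"{name:<8}{n:>5}{pct:>8.2f}%{cum:>9.2f}%")
```

Sorting by count before accumulating is what distinguishes the Pareto ordering from the order in which the categories happen to be listed.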
(b)
(c)
The Pareto diagram has the advantage of offering the cumulative percentage view of the categories and, hence, enables the viewer to separate the "vital few" from the "trivial many".
(d)
Beef and fish account for nearly 50% of all entrees ordered by weekend patrons of a continental restaurant. When chicken is included, nearly two-thirds of the entrees are accounted for.
2.92
(a)
                 Dessert Ordered
Reservation      Yes     No      Total
Yes              66%     48%     52%
No               34%     52%     48%
Total            100%    100%    100%
                 Dessert Ordered
Reservation      Yes     No      Total
Yes              29%     71%     100%
No               17%     83%     100%
Total            23%     77%     100%
                 Dessert Ordered
Reservation      Yes     No      Total
Yes              15%     37%     52%
No               8%      40%     48%
Total            23%     77%     100%
                 Dessert Ordered
Beef Entree      Yes     No      Total
Yes              52%     23%     30%
No               48%     77%     70%
Total            100%    100%    100%

                 Dessert Ordered
Beef Entree      Yes     No      Total
Yes              40%     60%     100%
No               15%     85%     100%
Total            23%     77%     100%

                 Dessert Ordered
Beef Entree      Yes       No        Total
Yes              11.75%    19.52%    31.27%
No               10.79%    57.94%    68.73%
Total            22.54%    77.46%    100%
(b)
If the owner is interested in finding out the percentage of those with a reservation who order dessert or the percentage of those who order a beef entrée and a dessert among all patrons, the table of total percentages is most informative. If the owner is interested in the effect of reservation on ordering of dessert or the effect of ordering a beef entrée on the ordering of dessert, the table of column percentages will be most informative. Because dessert is usually ordered after the main entrée, and the owner has no direct control over the reservation planning of patrons, the table of row percentages is not very useful here.
(c)
29% of those with reservations ordered desserts, compared to 17% of those without. Almost 40% of the patrons ordering a beef entrée ordered dessert, compared to 16% of patrons ordering all other entrées. Patrons ordering beef are more than 2.5 times as likely to order dessert as patrons ordering any other entrée.
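The 29% and 17% figures for reservations can be recovered from the table of total percentages in part (a) by dividing each joint percentage by the total for its reservation group. A minimal sketch, with the dictionary layout and the function name `dessert_rate` chosen here purely for illustration:

```python
# Total percentages (of all patrons) for reservation status by dessert ordered.
joint = {
    ("reservation", "dessert"): 15,
    ("reservation", "no dessert"): 37,
    ("no reservation", "dessert"): 8,
    ("no reservation", "no dessert"): 40,
}


def dessert_rate(group):
    """Percentage of patrons in `group` who ordered dessert."""
    yes = joint[(group, "dessert")]
    no = joint[(group, "no dessert")]
    return 100 * yes / (yes + no)


print(round(dessert_rate("reservation")))     # 29
print(round(dessert_rate("no reservation")))  # 17
```

Conditioning on the reservation group rather than on all patrons is what turns a total percentage into the comparison quoted in the answer.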
2.93
(a)
United States Fresh Food Consumed:
Japan Fresh Food Consumed:
Russia Fresh Food Consumed:
(b)
United States Packaged Food Consumed:
Japan Packaged Food Consumed:
Russian Packaged Food Consumed:
(c)
The fresh food consumption patterns of Japanese and Russians are quite similar, with vegetables taking up the largest share followed by meats and seafood, while Americans consume about the same amount of meats and seafood as of vegetables. Among the three countries, vegetables, and meats and seafood constitute more than 60% of the fresh food consumption. For Americans, dairy products, and processed, frozen, dried and chilled food and ready-to-eat meals make up slightly more than 60% of the packaged food consumption. For Japanese, processed, frozen, dried and chilled food, and ready-to-eat meals, and dairy products constitute more than 60% of their packaged food consumption. For the Russians, bakery goods and dairy products take up 60% of the share of their packaged food consumption.
2.94
(a)
Most complaints were against U.S. airlines.
(b)
Most of the complaints were due to refunds and flight problems.
2.95
(a)
Range                    Frequency   Percentage
0 but less than 25       17          34%
25 but less than 50      19          38%
50 but less than 75      5           10%
75 but less than 100     2           4%
100 but less than 125    3           6%
125 but less than 150    2           4%
150 but less than 175    2           4%
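The percentage and cumulative percentage columns reported for this problem follow directly from the class frequencies. A short sketch, assuming only the seven frequencies listed in part (a); the variable names are illustrative:

```python
from itertools import accumulate

# Class frequencies for days to resolve a complaint, in class order.
freq = [17, 19, 5, 2, 3, 2, 2]

total = sum(freq)                          # number of complaints
pct = [100 * f / total for f in freq]      # percentage in each class
cum = list(accumulate(pct))                # running (cumulative) percentage

for f, p, c in zip(freq, pct, cum):
    print(f"{f:>3}{p:>7.0f}%{c:>7.0f}%")
```

The final cumulative value should always be 100%, which is a quick check that no class was dropped.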
(b) Histogram (horizontal axis: Days; vertical axis: Frequency)
Percentage Polygon
(c)
Range                    Cumulative %
0 but less than 25       34%
25 but less than 50      72%
50 but less than 75      82%
75 but less than 100     86%
100 but less than 125    92%
125 but less than 150    96%
150 but less than 175    100%
Cumulative Percentage Polygon
(d)
You should tell the president of the company that over half of the complaints are resolved within a month, but point out that some complaints take as long as three or four months to settle.
2.96
(a)
(b)
(c)
The alcohol percentage is concentrated between 4% and 6%, with more between 4% and 5%. The calories are concentrated between 140 and 160. The carbohydrates are concentrated between 12 and 15. There are outliers in the percentage of alcohol in both tails. There are a few beers with alcohol content as high as around 11.5%. There are a few beers with calorie content as high as around 330 and carbohydrates as high as 32.1. There is a positive relationship between percentage of alcohol and calories and between calories and carbohydrates, and there is a moderately positive relationship between percentage alcohol and carbohydrates.
2.97
(a)
Ordered array of ratings of Super Bowl ads that ran before halftime
4.5 4.7 5.0 5.0 5.2 5.2 5.3 5.3 5.3 5.3
5.3 5.4 5.5 5.8 5.8 5.9 6.0 6.1 6.2 6.2
6.3 6.4 6.4 6.5 6.7 6.7 6.7 7.4
Ordered array of ratings of Super Bowl ads that ran at or after halftime
4.0 4.3 4.7 4.8 4.8 4.9 4.9 5.2 5.2 5.3
5.3 5.3 5.4 5.5 5.6 5.6 5.7 5.7 5.8 5.8
5.9 5.9 6.0 6.0 6.1 6.3 6.5 6.7 7.3
(b)
Stem-and-leaf display of ratings of Super Bowl ads that ran before halftime
Stem unit: 1
4 | 57
5 | 00223333345889
6 | 01223445777
7 | 4
Stem-and-leaf display of Super Bowl ads that ran at or after halftime
Stem unit: 1
4 | 0378899
5 | 223334566778899
6 | 001357
7 | 3
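A stem-and-leaf display with stem unit 1 can be built by splitting each rating into its whole-number stem and its tenths-digit leaf. The sketch below does this for the before-halftime ratings from part (a); the function name `stem_and_leaf` is illustrative and this is not the Excel add-in the manual used.

```python
from collections import defaultdict

# Ratings of ads that ran before halftime, from the ordered array in part (a).
before = [4.5, 4.7, 5.0, 5.0, 5.2, 5.2, 5.3, 5.3, 5.3, 5.3,
          5.3, 5.4, 5.5, 5.8, 5.8, 5.9, 6.0, 6.1, 6.2, 6.2,
          6.3, 6.4, 6.4, 6.5, 6.7, 6.7, 6.7, 7.4]


def stem_and_leaf(values):
    """Map each integer stem to a string of tenths-digit leaves (stem unit 1)."""
    leaves = defaultdict(str)
    for v in sorted(values):
        tenths = round(v * 10)            # 5.3 -> 53; rounding avoids float noise
        stem, leaf = divmod(tenths, 10)   # 53 -> stem 5, leaf 3
        leaves[stem] += str(leaf)
    return dict(leaves)


for stem, leaf in stem_and_leaf(before).items():
    print(f"{stem} | {leaf}")
```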
(c)
(d)
(e)
(f)
There was just one more ad run at or after halftime than before halftime. Approximately 61% of the ads run before halftime had a rating of 6 or less, while 83% of the ads run at or after halftime had a rating of 6 or lower. Seven ads run at or after halftime had a rating of less than 5, while only two ads run before halftime had a rating less than 5. One ad run before halftime had a rating of more than 7, and one ad run at or after halftime had a rating of more than 7.
2.98
(a)
Stem-and-leaf of One-Year CD
Stem unit: 0.1
0 | 2333355555
1 | 0000000035555
2 | 00055557
3 | 5
4 |
5 | 5555
Stem-and-leaf of Five-Year CD
Stem unit: 0.1
0 | 233335
1 | 555
2 | 000055
3 | 0000
4 | 0078
5 | 00005
6 | 000
7 |
8 | 005
9 |
10 | 00
(b)
The yield of one-year CDs shows that most values are less than 0.30. The yield of five-year CDs shows that most values are at least 0.30.
(c)
(d)
There appears to be a positive relationship between the yield of the one-year CD and the five-year CD.
2.99
(a)
Download Speed (Mbps)         Frequency   Percentage
600 but less than 700         13          13.00%
700 but less than 800         49          49.00%
800 but less than 900         28          28.00%
900 but less than 1000        9           9.00%
1000 but less than 1100       0           0%
1100 but less than 1200       0           0%
1200 but less than 1300       0           0%
1300 but less than 1400       0           0%
1400 but less than 1500       0           0%
1500 but less than 1600       0           0%
1600 but less than 1700       0           0%
1700 but less than 1800       0           0%
1800 but less than 1900       0           0%
1900 but less than 2000       1           1.00%
(b)
(c)
Download Speed (Mbps)         Frequency   Percentage   Cumulative Percentage
600 but less than 700         13          13.00%       13.00%
700 but less than 800         49          49.00%       62.00%
800 but less than 900         28          28.00%       90.00%
900 but less than 1000        9           9.00%        99.00%
1000 but less than 1100       0           0%           99.00%
1100 but less than 1200       0           0%           99.00%
1200 but less than 1300       0           0%           99.00%
1300 but less than 1400       0           0%           99.00%
1400 but less than 1500       0           0%           99.00%
1500 but less than 1600       0           0%           99.00%
1600 but less than 1700       0           0%           99.00%
1700 but less than 1800       0           0%           99.00%
1800 but less than 1900       0           0%           99.00%
1900 but less than 2000       1           1.00%        100.00%
(d)
Approximately 90% of the cities have download speeds from 600 to 900 Mbps.
(e)
(f)
There does not appear to be any relationship between download speed and number of providers.
2.100
(a) Frequencies (Boston)
Weight (Boston)            Frequency   Percentage
3015 but less than 3050    2           0.54%
3050 but less than 3085    44          11.96%
3085 but less than 3120    122         33.15%
3120 but less than 3155    131         35.60%
3155 but less than 3190    58          15.76%
3190 but less than 3225    7           1.90%
3225 but less than 3260    3           0.82%
3260 but less than 3295    1           0.27%
(b) Frequencies (Vermont)
Weight (Vermont)           Frequency   Percentage
3550 but less than 3600    4           1.21%
3600 but less than 3650    31          9.39%
3650 but less than 3700    115         34.85%
3700 but less than 3750    131         39.70%
3750 but less than 3800    36          10.91%
3800 but less than 3850    12          3.64%
3850 but less than 3900    1           0.30%
(c)
(d)
0.54% of the "Boston" shingles pallets are underweight while 0.27% are overweight. 1.21% of the "Vermont" shingles pallets are underweight while 3.94% are overweight.
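These underweight and overweight percentages can be reproduced from the frequency distributions in parts (a) and (b). The sketch below assumes weight limits of 3,050 and 3,260 pounds for Boston and 3,600 and 3,800 pounds for Vermont; the upper limits are not stated in this solution and are inferred here only because they reproduce the reported percentages. The function name `tail_percentages` is illustrative.

```python
# (lower class boundary, frequency) pairs from the frequency distributions.
boston = [(3015, 2), (3050, 44), (3085, 122), (3120, 131),
          (3155, 58), (3190, 7), (3225, 3), (3260, 1)]
vermont = [(3550, 4), (3600, 31), (3650, 115), (3700, 131),
           (3750, 36), (3800, 12), (3850, 1)]


def tail_percentages(classes, low, high):
    """Percent of pallets in classes below `low` and in classes at or above `high`.

    Assumes `low` and `high` coincide with class boundaries.
    """
    total = sum(n for _, n in classes)
    under = sum(n for lower, n in classes if lower < low)
    over = sum(n for lower, n in classes if lower >= high)
    return round(100 * under / total, 2), round(100 * over / total, 2)


print(tail_percentages(boston, 3050, 3260))   # (0.54, 0.27)
print(tail_percentages(vermont, 3600, 3800))  # (1.21, 3.94)
```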
2.101
(a)
Not member of major conference
Total Pay ($000)             Frequency   Percentage
400 but less than 500        4           8.00%
500 but less than 1000       27          54.00%
1000 but less than 1500      6           12.00%
1500 but less than 2000      7           14.00%
2000 but less than 2500      4           8.00%
2500 but less than 3000      1           2.00%
3000 but less than 3500      1           2.00%
Total                        50          100.00%
Member of major conference
Total Pay ($000)             Frequency   Percentage
700 but less than 1000       5           7.69%
1000 but less than 2000      4           6.15%
2000 but less than 3000      9           13.85%
3000 but less than 4000      13          20.00%
4000 but less than 5000      16          24.62%
5000 but less than 6000      11          16.92%
6000 but less than 7000      2           3.08%
7000 but less than 8000      2           3.08%
8000 but less than 9000      2           3.08%
9000 but less than 10,000    1           1.54%
Total                        65          100.00%
(b)
(c)
Not member of major conference
Total Pay ($000)             Frequency   Percentage   Cumulative Percentage
400 but less than 500        4           8.00%        8.00%
500 but less than 1000       27          54.00%       62.00%
1000 but less than 1500      6           12.00%       74.00%
1500 but less than 2000      7           14.00%       88.00%
2000 but less than 2500      4           8.00%        96.00%
2500 but less than 3000      1           2.00%        98.00%
3000 but less than 3500      1           2.00%        100.00%
Total                        50          100.00%
Member of major conference
Total Pay ($000)             Frequency   Percentage   Cumulative Percentage
700 but less than 1000       5           7.69%        7.69%
1000 but less than 2000      4           6.15%        13.85%
2000 but less than 3000      9           13.85%       27.69%
3000 but less than 4000      13          20.00%       47.69%
4000 but less than 5000      16          24.62%       72.31%
5000 but less than 6000      11          16.92%       89.23%
6000 but less than 7000      2           3.08%        92.31%
7000 but less than 8000      2           3.08%        95.38%
8000 but less than 9000      2           3.08%        98.46%
9000 but less than 10,000    1           1.54%        100.00%
Total                        65          100.00%
(d)
Coaches Pay – Yes Major Conference Member
Stem unit: 1000
0 | 78889
1 | 3369
2 | 4788999
3 | 000001224455889
4 | 0000002233444488
5 | 00002355667
6 | 15
7 | 67
8 | 49
9 | 0
Coaches Pay – Not a Major Conference Member
Stem unit: 100
4 | 348
5 | 01444
6 | 388
7 | 001355578889
8 | 0003347
9 | 1
10 | 059
11 |
12 |
13 | 058
14 |
15 | 00459
16 |
17 |
18 | 05
19 |
20 | 0
21 | 0
22 |
23 | 12
24 |
25 |
26 |
27 |
28 | 8
29 |
30 |
31 |
32 |
33 |
34 | 0
(e)
The majority of coaches in the non-major conference earn under 1 million dollars while the majority of coaches in a major conference earn between 3 and 6 million dollars per year. The highest paid coach in a major conference earns $9,013,000 while the highest paid coach in the non-major conference earns $3,400,000.
2.102
(a)
Calories        Frequency   Percentage   Percentage Less Than
50 up to 100    3           12%          12%
100 up to 150   3           12           24
150 up to 200   9           36           60
200 up to 250   6           24           84
250 up to 300   3           12           96
300 up to 350   0           0            96
350 up to 400   1           4            100

Cholesterol     Frequency   Percentage   Percentage Less Than
0 up to 50      2           8            8%
50 up to 100    17          68           76
100 up to 150   4           16           92
150 up to 200   1           4            96
200 up to 250   0           0            96
250 up to 300   0           0            96
300 up to 350   0           0            96
350 up to 400   0           0            96
400 up to 450   0           0            96
450 up to 500   1           4            100
(b)
(c)
The sampled fresh red meats, poultry, and fish vary from 98 to 397 calories per serving, with the highest concentration between 150 and 200 calories. One protein source, spareribs, with 397 calories, is more than 100 calories above the next highest caloric food. The protein content of the sampled foods varies from 16 to 33 grams, with 68% of the data values falling between 24 and 32 grams. Spareribs and fried liver are both very different from other foods sampled—the former on calories and the latter on cholesterol content.
2.103
(a)
(b)
The commercial average price was highest in the fall of 2021. The residential average price of natural gas in the United States is higher in the summer in general and the highest in summer of 2021.
(c)
(d)
There appears to be a slight positive relationship between the commercial price and residential price.
2.104
(a)
Plot of the amount filled against time (vertical axis: Amount)
(b)
There is a downward trend in the amount filled.
(c)
The amount filled in the next bottle will most likely be below 1.894 liters.
(d)
The scatter plot of the amount of soft drink filled against time reveals the trend of the data, whereas a histogram only provides information on the distribution of the data.
2.105
(a)
(b)
(c)
The Japanese yen has depreciated against the U.S. dollar since 1985 while the Canadian dollar appreciated gradually from 1980 to 1987 and from 1991 to 2002 and then started to depreciate until 2011. The English pound to U.S. dollar exchange rate has been quite stable since 1983. The U.S. dollar has appreciated against the Japanese yen since 1980 and has in general appreciated against the Canadian dollar since 2002, while the exchange rate against the English pound has been stable in general.
(d)
(e)
There is not any obvious relationship between the Canadian dollar and Japanese yen in terms of the U.S. dollar nor any relationship between the Japanese yen and English pound. There is a slightly positive relationship between the Canadian dollar and English pound.
2.106
(a)
Variations                       Percentage of Download
Original Call to Action Button   9.64%
New Call to Action Button        13.64%
(b) Bar Chart of Percentage of Download by Variations
(c)
The New Call to Action Button has a higher percentage of downloads, at 13.64%, than the Original Call to Action Button, at 9.64%.
(d)
Variations            Percentage of Download
Original web design   8.90%
New web design        9.41%
(e) Bar Chart of Percentage of Download by Variations
(f)
The New web design has only a slightly higher percentage of downloads, at 9.41%, than the Original web design, at 8.90%.
(g)
The New web design is only slightly more successful than the Original web design while the New Call to Action Button is much more successful than the Original Call to Action Button with about 41% higher percentage of downloads.
(h)
Call to Action Button   Web Design   Percentage of Download
Old                     Old          8.30%
New                     Old          13.70%
Old                     New          9.50%
New                     New          17.00%
(i)
The combination of the New Call to Action Button and the New web design results in slightly more than twice as high a percentage of downloads as the combination of the Old Call to Action Button and Old web design.
(j)
The New web design is only slightly more successful than the Original web design while the New Call to Action Button is much more successful than the Original Call to Action Button with about 41% higher percentage of downloads. However, the combination of the New Call to Action Button and New web design results in more than twice as high a percentage of downloads as the combination of the Old Call to Action Button and Old web design.
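The "about 41% higher" and "more than twice as high" comparisons in this problem are relative changes between download percentages. A minimal sketch of that arithmetic, using only the percentages reported in this solution; `relative_lift` is a name chosen here for illustration:

```python
def relative_lift(old, new):
    """Percentage by which `new` exceeds `old`, relative to `old`."""
    return 100 * (new - old) / old


button_lift = relative_lift(9.64, 13.64)   # call to action button, about 41%
design_lift = relative_lift(8.90, 9.41)    # web design, under 6%
combined_ratio = 17.00 / 8.30              # new button + new design vs old + old

print(f"Button lift: {button_lift:.1f}%")
print(f"Design lift: {design_lift:.1f}%")
print(f"Combined ratio: {combined_ratio:.2f}")
```

Note that a lift of about 41% is a relative change; the absolute difference for the button is 4 percentage points.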
2.107
Class project – answers will vary depending on student responses.
2.108
Class project – answers will vary depending on student responses.
2.109
A descriptive analysis of the weight of the pallets of the Boston shingles revealed that the average weight was 3124.2 pounds with a standard deviation of 34.7. The average weight of 3124.2 pounds was 74.2 pounds above the expected minimum weight of 3,050 pounds. An analysis of the Vermont shingles revealed that the average weight was 3704.0 pounds with a standard deviation of 46.7. The average weight of 3704.0 pounds was 104 pounds above the expected minimum weight of 3,600 pounds. The below table includes a number of descriptive statistics for the two shingle types.
A frequency distribution of the Boston shingles revealed that 0.54% of the pallets were underweight and 0.27% were overweight. A frequency distribution of the Vermont shingles revealed that 1.21% of the shingles were underweight and 3.94% were overweight. The complete results are provided in the below frequency distributions.
Histogram graphs of the Boston shingles and the Vermont shingles, shown below, revealed that the weights of the pallets appeared to be consistent with a normal distribution. In both cases, there was slight right skewness with the Boston shingles having slightly more right skewness than the Vermont shingles.
The results of the above analyses reveal that both shingle types generally met pallet weight expectations with less than 1% of the Boston shingles weighing outside of the expected parameters and just over 5% of the Vermont shingles weighing outside of the expected parameters. The results suggest that the manufacturer should consider implementation of parameter compliance strategies for the Vermont shingles.
Chapter 3
3.1
Excel output:
X
Mean                       6
Median                     7
Mode                       #N/A
Standard Deviation         2.915476
Sample Variance            8.5
Range                      7
Minimum                    2
Maximum                    9
Sum                        30
Count                      5
First Quartile             3
Third Quartile             8.5
Interquartile Range        5.5
Coefficient of Variation   48.5913%
(a)
Mean = 6 Median = 7 There is no mode.
(b)
Range = 7 Variance = 8.5 Standard deviation = 2.9 Coefficient of variation = (2.915/6) • 100% = 48.6%
(c)
Z scores: 0.343, –0.686, 1.029, 0.686, –1.372 None of the Z scores is larger than 3.0 or smaller than –3.0. There is no outlier.
(d)
Since the mean is less than the median, the distribution is left-skewed.
3.2
Excel output:
X
Mean                       7
Median                     7
Mode                       7
Standard Deviation         3.286335
Sample Variance            10.8
Range                      9
Minimum                    3
Maximum                    12
Sum                        42
Count                      6
First Quartile             4
Third Quartile             9
Interquartile Range        5
Coefficient of Variation   46.9476%
(a)
Mean = 7 Median = 7 Mode = 7
(b)
Range = 9 Variance = 10.8 Standard deviation = 3.286 Coefficient of variation = (3.286/7) • 100% = 46.948%
(c)
Z scores: 0, –0.913, 0.609, 0, –1.217, 1.522 None of the Z scores is larger than 3.0 or smaller than –3.0. There is no outlier.
(d)
Since the mean equals the median, the distribution is symmetrical.
3.3
Excel output:
X
Mean                       5.85714
Median                     7
Mode                       7
Minimum                    0
Maximum                    11
Range                      11
Variance                   14.1429
Standard Deviation         3.7607
Coefficient of Variation   64.21%
Skewness                   –0.2659
Kurtosis                   –0.6032
Count                      7
Standard Error             1.4214
First Quartile             3
Third Quartile             9
(a)
Mean = 5.85714 Median = 7 Mode = 7
(b)
Range = 11 Variance = 14.1429 Standard deviation = 3.7607 Coefficient of variation = 64.21%
(c)
Z scores: 1.37, 0.30, –0.49, 0.84, –1.56, 0.30, –0.76. There is no outlier.
(d)
Since the mean is less than the median, the distribution is left-skewed.
3.4
Excel output:
X
Mean                       2
Median                     7
Mode                       7
Standard Deviation         7.874007874
Sample Variance            62
Range                      17
Minimum                    –8
Maximum                    9
Sum                        10
Count                      5
First Quartile             –6.5
Third Quartile             8
Interquartile Range        14.5
Coefficient of Variation   393.7004%
(a)
Mean = 2 Median = 7 Mode = 7
(b)
Range = 17 Variance = 62 Standard deviation = 7.874 Coefficient of variation = (7.874/2) • 100% = 393.7%
(c)
Z scores: 0.635, –0.889, –1.270, 0.635, 0.889. No outliers.
(d)
Since the mean is less than the median, the distribution is left-skewed.
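The calculations in Problems 3.1 to 3.4 follow one pattern: compute the mean and sample standard deviation, then derive the coefficient of variation and the Z scores from them. The sketch below uses Python's standard `statistics` module rather than Excel. The five data values are an assumption: they are not listed in this solution and were chosen because they reproduce the summary statistics and Z scores reported for Problem 3.1.

```python
import statistics

# Assumed data: these values reproduce the Problem 3.1 output
# (sum 30, minimum 2, maximum 9) but are not printed in the solution.
data = [7, 4, 9, 8, 2]

mean = statistics.mean(data)
median = statistics.median(data)
variance = statistics.variance(data)   # sample variance (n - 1 denominator)
sd = statistics.stdev(data)            # sample standard deviation
cv = 100 * sd / mean                   # coefficient of variation, in percent
z_scores = [(x - mean) / sd for x in data]

print(mean, median, variance, round(sd, 3), round(cv, 1))
print([round(z, 3) for z in z_scores])

# An observation is flagged as an outlier when its Z score is beyond +/- 3.
outliers = [x for x, z in zip(data, z_scores) if abs(z) > 3]
print(outliers)
```

`statistics.variance` and `statistics.stdev` use the sample (n − 1) formulas, which is what the "Sample Variance" row in the Excel output reports.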
3.5
$\bar{R}_G = [(1 + 0.1)(1 + 0.3)]^{1/2} - 1 = 19.58\%$
3.6
$\bar{R}_G = [(1 + 0.2)(1 - 0.3)]^{1/2} - 1 = -8.348\%$
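The geometric mean rate of return in Problems 3.5 and 3.6 multiplies the yearly growth factors, takes the n-th root, and subtracts 1. The sketch below assumes yearly rates of +10% and +30% for Problem 3.5 and +20% and −30% for Problem 3.6; the signs in 3.6 are an inference, chosen because they reproduce the reported magnitude of 8.348%. The function name is illustrative.

```python
import math


def geometric_mean_return(rates):
    """Geometric mean rate of return for a list of per-period rates (as decimals)."""
    growth = math.prod(1 + r for r in rates)   # compound growth factor
    return growth ** (1 / len(rates)) - 1


r_35 = geometric_mean_return([0.10, 0.30])    # assumed rates for Problem 3.5
r_36 = geometric_mean_return([0.20, -0.30])   # assumed rates for Problem 3.6

print(f"{100 * r_35:.2f}%")
print(f"{100 * r_36:.3f}%")
```

`math.prod` requires Python 3.8 or later.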
3.7
Half of high school graduates with no college have a weekly income of no more than $838 while half of the workers with at least a bachelor’s degree have weekly income of no more than $1,547.
3.8
(a)
                     Grade X   Grade Y
Mean                 575       575.4
Median               575       575
Standard deviation   6.4       2.1
(b)
If quality is measured by central tendency, Grade X tires provide slightly better quality because X’s mean and median are both equal to the expected value, 575 mm. If, however, quality is measured by consistency, Grade Y provides better quality because, even though Y’s mean is only slightly larger than the mean for Grade X, Y’s standard deviation is much smaller. The range in values for Grade Y is 5 mm compared to the range in values for Grade X, which is 16 mm.
(c)
Excel output:
                     Grade X    Grade Y
Mean                 575        577.4
Median               575        575
Mode                 #N/A       #N/A
Standard Deviation   6.403124   6.107373
Sample Variance      41         37.3
Range                16         15
Minimum              568        573
Maximum              584        588
Sum                  2875       2887
Count                5          5

                     Grade X   Grade Y, Altered
Mean                 575       577.4
Median               575       575
Standard deviation   6.4       6.1
When the fifth Y tire measures 588 mm rather than 578 mm, Y’s mean inner diameter becomes 577.4 mm, which is larger than X’s mean inner diameter, and Y’s standard deviation increases from 2.1 mm to 6.1 mm. In this case, X’s tires are providing better quality in terms of the mean inner diameter, with only slightly more variation among the tires than Y’s.
3.9
(a) (b) (c)
Half of the new houses were sold at a price no higher than $428,700. On average, the sales price of houses was $507,800. The sales price of new houses in 2022 is right skewed.
3.10
(a), (b)
                     Download Speed   Upload Speed
Mean                 820.5125         105.925
Median               733.85           107.8
Mode                 #N/A             #N/A
Minimum              675.4            80.9
Maximum              1171.3           131.7
Range                495.9            50.8
Variance             31266.4727       278.2250
Standard Deviation   176.8233         16.6801
Coeff. of Variation  21.55%           15.75%
Skewness             1.4728           –0.0029
Kurtosis             1.1581           –0.5582
Count                8                8
Standard Error       62.5165          5.8973
(c)
The mean download speed is much higher than the median indicating right skewness, whereas the mean and median upload speeds are about the same indicating symmetry.
(d)
The mean download speed is much higher than the mean upload speed. The download speeds are about 8 times faster than the upload speeds. The variation in the download speeds between cities has a coefficient of variation of 21.55%, whereas the variation in the upload speeds has a coefficient of variation of 15.75%. The kurtosis in the download speeds indicates more lack of normality than for the upload speeds.
3.11
(a), (b)
                     First Half    After Halftime
Mean                 5.789285714   5.534482759
Median               5.8           5.6
Mode                 5.3           5.3
Minimum              4.5           4
Maximum              7.4           7.3
Range                2.9           3.3
Variance             0.4928        0.5031
Standard Deviation   0.7020        0.7093
Coeff. of Variation  12.13%        12.82%
Skewness             0.2392        0.1578
Kurtosis             –0.5132       0.5515
Count                28            29
Standard Error       0.1327        0.1317
(a), (b)
First Half 4.5 4.7 5.0 5.0 5.2 5.2 5.3 5.3 5.3 5.3 5.3 5.4 5.5 5.8 5.8 5.9 6.0 6.1 6.2 6.2 6.3 6.4 6.4 6.5 6.7 6.7 6.7 7.4
Z score –1.83652 –1.55163 –1.12429 –1.12429 –0.8394 –0.8394 –0.69696 –0.69696 –0.69696 –0.69696 –0.69696 –0.55452 –0.41207 0.015262 0.015262 0.157706 0.300151 0.442595 0.585039 0.585039 0.727484 0.869928 0.869928 1.012373 1.297261 1.297261 1.297261 2.294372
Halftime or After 4.0 4.3 4.7 4.8 4.8 4.9 4.9 5.2 5.2 5.3 5.3 5.3 5.4 5.5 5.6 5.6 5.7 5.7 5.8 5.8 5.9 5.9 6.0 6.0 6.1 6.3 6.5 6.7 7.3
Z score –2.16349 –1.74051 –1.17655 –1.03556 –1.03556 –0.89457 –0.89457 –0.47159 –0.47159 –0.3306 –0.3306 –0.3306 –0.18961 –0.04862 0.092374 0.092374 0.233365 0.233365 0.374356 0.374356 0.515348 0.515348 0.656339 0.656339 0.797331 1.079313 1.361296 1.643279 2.489227
(c)
(d)
The mean and median are approximately equal for the before Halftime ratings indicating the data is symmetric. The median is more than the mean for the ratings on ads running at halftime or after indicating a left or negatively skewed distribution. The mean and median are both greater for the ads running before halftime compared to the ads running at or after halftime. There is about the same amount of variation in the ratings of the ads running at or after halftime as those running in the first half.
3.12
(a), (b)
                     Team Value   Payroll
Mean                 2.074        147.2133333
Median               1.73         141.555
Mode                 #N/A         #N/A
Minimum              0.99         48.06
Maximum              6            284.73
Range                5.01         236.67
Variance             1.2997       3869.0060
Standard Deviation   1.1401       62.2013
Coeff. of Variation  54.97%       42.25%
Skewness             1.8739       –0.4260
Kurtosis             3.8080       11.3564
Count                30           30
Standard Error       0.2081       11.3564
(c)
The team value is right skewed with the mean about 0.3 billion dollars higher than the median. The payroll is only slightly skewed with a difference of only six million dollars between the mean and median.
(d)
The mean team value is more than two billion dollars. There is a large amount of variation in the team value around the mean with a coefficient of variation of 54.97%. The kurtosis statistic for team value of 3.8 indicates departure from normality. The coefficient of variation for payroll is 42.25%. There is some negative kurtosis in payroll indicating some lack of normality.
3.13
(a), (b)
                     Number of Partners
Mean                 19.51111111
Median               18
Mode                 14
Minimum              5
Maximum              41
Range                36
Variance             59.3919
Standard Deviation   7.7066
Coeff. of Variation  39.50%
Skewness             0.8041
Kurtosis             0.5737
Count                45
Standard Error       1.1488
There are no Z-Scores greater than 3 or less than –3, which indicates there are no outliers.
(a), (b) Number of Partners 37 41 26 14 22 29 36 11 16 29 30 20 20 20 26 21 17 21 14 28 24 14 15 19 14 11 18 9 10 13 14 24 25 13 5 13 20 15 17 16 26 18 20 16 11
Z score 2.269335 2.788369 0.84199 –0.71511 0.322955 1.231265 2.139576 –1.10439 –0.4556 1.231265 1.361024 0.063438 0.063438 0.063438 0.84199 0.193196 –0.32584 0.193196 –0.71511 1.101507 0.582472 –0.71511 –0.58536 –0.06632 –0.71511 –1.10439 –0.19608 –1.36391 –1.23415 –0.84487 –0.71511 0.582472 0.712231 –0.84487 –1.88294 –0.84487 0.063438 –0.58536 –0.32584 –0.4556 0.84199 –0.19608 0.063438 –0.4556 –1.10439
3.13 cont.
(c) (d)
Because the mean is larger than the median, the data is skewed to the right. The mean number of partners in rising accounting firms is 19.5 and half of the rising accounting firms have 18 or more partners. The average scatter around the mean is 7.71. The lowest number of partners is 5 (Krost) and the greatest number of partners is 41 (Brady, Martz & Associates).
3.14
(a), (b)
                     Mobile Commerce Penetration
Mean                 59.30714286
Median               63.95
Mode                 #N/A
Minimum              32.1
Maximum              79.1
Range                47
Variance             199.9299
Standard Deviation   14.1397
Coeff. of Variation  23.84%
Skewness             –0.8638
Kurtosis             –0.1033
Count                14
Standard Error       3.7790

Country        Mobile Commerce Penetration   Z score
Australia      36.4                          –1.6197
China          64.3                          0.3535
India          57.3                          –0.1416
Indonesia      79.1                          1.4002
Japan          32.1                          –1.9238
Malaysia       68.6                          0.6576
New Zealand    38.9                          –1.4429
Philippines    69.6                          0.7283
Saudi Arabia   65.1                          0.4101
Singapore      56.9                          –0.1699
South Korea    59.9                          0.0423
Taiwan         63.8                          0.3181
Thailand       74.2                          1.0537
Vietnam        64.1                          0.3393
There are no Z-Scores greater than 3 or less than –3, which indicates there are no outliers.
(c) (d)
Because the mean is smaller than the median, the data is skewed to the left. The mean Mobile Commerce Penetration is 59.307% and half the countries have values greater than or equal to 63.95%. The average scatter around the mean is 14.1397%. The lowest value is 32.1% (Japan), and the highest value is 79.1% (Indonesia).
3.15
(a)
                     One-Year CD   Five-Year CD
Mean                 0.18          0.38
Median               0.12          0.30
Mode                 0.1           0.2
Minimum              0.02          0.02
Maximum              0.55          1.00
Range                0.53          0.98
Variance             0.0246        0.0763
Standard Deviation   0.1570        0.2762
Coeff. of Variation  89.12%        72.62%
Skewness             1.4615        0.6486
Kurtosis             1.3764        –0.2609
Count                36            36
Standard Error       0.0262        0.0460
(b)
The standard deviation, variance, and range are all greater for five-year CDs than for one-year CDs, so five-year CDs have more variation in absolute terms. Relative to the mean, however, one-year CDs vary more: their coefficient of variation is 89.12% compared with 72.62% for five-year CDs.
3.16
(a),(b)
(c)
The mean time is 232.78 seconds, and half the calls last greater than or equal to 228 seconds, so call duration is slightly right-skewed. The average scatter around the mean is 158.6866 seconds. The shortest call lasted 65 seconds, and the longest call lasted 1141 seconds.
3.17
Excel output:
                          Waiting Time
Mean                      4.286667
Median                    4.5
Mode                      #N/A
Standard Deviation        1.637985
Sample Variance           2.682995
Range                     6.08
Minimum                   0.38
Maximum                   6.46
Sum                       64.3
Count                     15
First Quartile            3.2
Third Quartile            5.55
Interquartile Range       2.35
Coefficient of Variation  38.2112%
(a) Mean = 4.287, Median = 4.5
(b) Variance = 2.683, Standard deviation = 1.638, Range = 6.08, Coefficient of variation = 38.21%
Z scores: –0.05, 0.77, –0.77, 0.51, 0.30, –1.19, –0.46, –0.66, 0.13, 1.11, –2.39, 0.51, 1.33, 1.16, –0.30
There are no outliers.
(c) Since the mean is less than the median, the distribution is left-skewed.
(d) The mean and median are both under 5 minutes and the distribution is left-skewed, meaning that there are more unusually low observations than there are high observations. But six of the 15 bank customers sampled (or 40%) had wait times in excess of 5 minutes. So, although the customer is more likely to be served in less than 5 minutes, the manager may have been overconfident in responding that the customer would "almost certainly" not wait longer than 5 minutes for service.
3.18
(a) Mean = 7.11, Median = 6.68
(b) Variance = 4.336, Standard Deviation = 2.082, Range = 6.67, Coefficient of variation = 29.27%
Waiting Time   Z Score
9.66           1.222431
5.90           –0.583360
8.02           0.434799
5.79           –0.636190
8.73           0.775786
3.82           –1.582310
8.01           0.429996
8.35           0.593286
10.49          1.621050
6.68           –0.208750
5.64           –0.708230
4.08           –1.457440
6.17           –0.453690
9.91           1.342497
5.47           –0.789870
Since there are no Z values below –3.0 or above 3.0, there are no outliers.
(c) Because the mean is greater than the median, the distribution is right-skewed.
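The Z-score check in 3.18 can be reproduced with a short script. This is a minimal sketch, not the manual's own method: it assumes the 15 waiting times listed above, uses the sample standard deviation, and flags a value as an outlier when its Z score is below –3 or above 3. The variable names are illustrative.

```python
from statistics import mean, stdev

# Waiting times (minutes) for the 15 customers in 3.18, as listed above.
waiting_times = [9.66, 5.90, 8.02, 5.79, 8.73, 3.82, 8.01, 8.35,
                 10.49, 6.68, 5.64, 4.08, 6.17, 9.91, 5.47]

sample_mean = mean(waiting_times)
sample_sd = stdev(waiting_times)  # sample standard deviation (n - 1 divisor)

# Z score: number of standard deviations each value lies from the mean.
z_scores = [(x - sample_mean) / sample_sd for x in waiting_times]

# Outlier rule used in the solutions: |Z| greater than 3.
outliers = [x for x, z in zip(waiting_times, z_scores) if abs(z) > 3]

for x, z in zip(waiting_times, z_scores):
    print(f"{x:6.2f}  {z:9.6f}")
print("outliers:", outliers)
```

With these data the list of outliers is empty, which agrees with the conclusion stated in the solution.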
3.18 cont.
(d) The mean and median are both greater than five minutes. The distribution is right-skewed, meaning that there are some unusually high values. Further, 13 of the 15 bank customers sampled (or 86.7%) had waiting times greater than five minutes. So the customer is likely to experience a waiting time in excess of five minutes. The manager overstated the bank's service record in responding that the customer would "almost certainly" not wait longer than five minutes for service.
3.19
(a) R_G = [(1 + (–0.03231))(1 + 0.0934)]^(1/2) – 1 = 2.86%
(b) First year: 1,000(1 + 0.0286) = $1,028.60 Second year: 1,028.60(1 + 0.0286) = $1,058.02
(c) The rate of return for Microsoft was much better than that of GE.
3.20
(a) R_G = [(1 + 0.42531)(1 + 0.5504)]^(1/2) – 1 = 48.65%
(b) First year: 1,000(1 + 0.4865) = $1,486.53 Second year: 1,486.53(1 + 0.4865) = $2,209.79
(c) The rate of return for Microsoft was much better than that of GE.
3.21
(a)
Year             DJIA     S&P 500   NASDAQ
2018             -5.63    -6.24     -5.98
2019             22.24    28.88     37.96
2020             18.73    16.26     47.58
2021             7.25     26.89     26.63
Geometric mean   10.09%   15.55%    24.78%
Excel formula for DJIA: =((1+B2/100)*(1+B3/100)*(1+B4/100)*(1+B5/100))^(1/4)-1
(b) (c)
NASDAQ had the best rate of return followed by S&P 500. All of the indices had positive returns. The return on the metals was mostly lower than the rate of the stock indices.
3.22
(a)
Year             Platinum   Gold     Silver
2019             22.12      18.83    15.36
2020             10.44      24.60    47.44
2021             -10.44     -3.51    -11.55
Geometric mean   6.50%      12.63%   14.58%
Excel formula for Platinum: =((1+B2/100)*(1+B3/100)*(1+B4/100))^(1/3)-1
(b) (c)
Silver had the best rate of return followed by gold. The return on the metals was mostly lower than the rate of the stock indices.
3.23
(a) Mean of 3YrReturn% by Risk Level and Type
Risk Level    Growth   Small   Mid-Cap   Large   Value   Small   Mid-Cap   Large   Grand Total
Low           17.66    14.42   15.58     19.05   12.69   11.33   11.42     13.34   15.48
Average       16.91    15.16   14.85     18.48   12.90   12.15   13.40     12.91   15.17
High          15.64    14.11   15.14     17.55   12.03   11.09   11.92     12.60   14.01
Grand Total   16.76    14.47   15.12     18.48   12.57   11.44   12.31     12.97   14.91
(b) StdDev of 3YrReturn% by Risk Level and Type
Risk Level    Growth   Small   Mid-Cap   Large   Value   Small   Mid-Cap   Large   Grand Total
Low           5.36     5.82    5.44      4.75    3.10    2.88    2.47      3.13    5.14
Average       5.13     5.52    2.59      5.50    2.84    2.65    2.87      2.86    4.72
High          4.42     4.03    3.71      4.50    2.66    3.23    2.14      2.40    4.13
Grand Total   5.05     4.76    3.82      5.05    2.88    2.98    2.60      2.84    4.71
(c)
The mean three-year return is higher for growth funds than for value funds. For growth funds, the mean three-year rate of return is higher for large cap funds than for mid-cap and small cap funds. For value funds, the mean three-year rate of return is also higher for large cap funds than for mid-cap and small cap funds. Overall, the mean three-year return is highest for low risk funds and is lower for average and high risk level funds. This pattern holds for growth funds; for value funds, average risk funds have a slightly higher mean (12.90) than low risk funds (12.69). The standard deviation of the three-year return is higher for growth funds than for value funds. For the growth funds, the standard deviation is higher for large and small cap funds than for mid-cap funds. The standard deviation for value funds does not vary much among different sized funds.
3.24
(a) Mean of 3YrReturn% Type Growth Small Mid-Cap Large Value Small Mid-Cap Large Grand Total
One 8.94 7.86 9.41 8.94 8.71 6.06 9.15 9.55 8.84
Two 14.29 10.79 13.03 16.10 11.24 10.37 11.33 11.40 12.92
Rating Three 16.93 14.35 15.69 18.82 12.27 11.62 12.04 12.52 14.95
Four 18.32 15.10 16.69 20.18 13.94 12.54 14.23 14.38 16.41
Five 24.90 22.75 20.92 27.45 16.63 14.16 16.10 17.26 20.83
Grand Total 16.76 14.47 15.12 18.48 12.57 11.44 12.31 12.97 14.92
3.24 cont.
(b) StdDev of 3Yr Return% Type Growth Small Mid-Cap Large Value Small Mid-Cap Large Grand Total
(c)
One 6.17 0.87 2.47 8.13 2.24 2.38 1.37 1.80 4.76
Two 3.39 6.66 2.80 2.10 2.93 3.34 2.74 2.92 3.53
Rating Three 2.92 3.02 2.08 1.51 1.83 2.47 1.32 1.69 3.41
Four 2.70 1.99 1.69 1.78 1.72 1.39 2.09 1.46 3.18
Five 7.95 4.38 10.23 8.55 2.46 2.64 2.08 2.20 7.20
Grand Total 5.05 4.76 3.82 5.05 2.88 2.98 2.60 2.84 4.71
The mean three-year return is higher for growth funds than for value funds. For growth funds, the mean three-year rate of return is higher for large cap funds than for mid-cap and small cap funds. For value funds, the mean three-year return is similar for all sized funds. The mean three-year return is highest for four and five star rated funds and is lower for one and two star rated funds. This occurs for both growth and value funds. The standard deviation of the three-year return varies much more for growth funds than for value funds. For the growth funds, the standard deviation is higher for small and large cap funds than mid-cap funds. The standard deviation for value funds does not vary much among different sized funds.
3.25
(a) Mean of 3YrReturn% Type Small Low Average High Mid-Cap Low Average High Large Low Average High Grand Total
One 6.83 4.00 8.49 6.71 9.30 11.46 8.49 8.62 9.21 10.97 8.77 7.69 8.84
Two 10.63 8.85 10.73 11.08 12.35 10.43 12.89 13.35 13.80 14.61 12.98 14.08 12.92
Rating Three 13.44 12.82 13.90 13.43 14.32 14.39 14.67 13.29 15.78 15.80 15.69 16.00 14.95
Four 13.68 13.56 14.97 13.21 15.97 16.43 15.86 15.81 17.48 17.38 18.10 16.39 16.41
Five 20.07 20.90 20.39 19.44 18.78 39.17 1.83 16.15 21.59 23.22 20.98 18.73 20.83
Grand Total 13.28 12.82 14.04 13.04 14.09 14.10 14.42 13.56 15.79 16.49 15.70 15.00 14.91
3.25 cont.
(b) StdDev of 3YrReturn% Type Small Low Average High Mid-Cap Low Average High Large Low Average High Grand Total
(c)
3.26
One 2.01 -- 1.43 1.69 2.01 1.51 1.76 1.73 6.12 1.81 7.31 7.68 4.76
Two 3.49 2.52 1.57 4.30 2.87 2.73 2.04 3.20 3.45 3.90 3.04 3.42 3.53
Rating Three 3.11 1.94 4.27 2.73 2.55 2.49 2.49 2.77 3.53 3.45 3.61 3.61 3.41
Four 2.10 2.00 2.33 1.97 2.12 2.05 1.97 2.63 3.33 3.01 3.39 3.53 3.18
Five 5.62 9.35 6.12 4.19 7.77 -- 2.41 0.91 7.67 7.95 7.99 5.07 7.20
Grand Total 4.40 4.72 4.86 4.01 3.67 5.0 2.74 3.42 4.96 4.99 5.19 4.34 4.71
The mean three-year return for large-cap funds is much higher than for mid-cap or small-cap funds. In all risk categories except mid-cap average-risk funds, five-star funds have the highest mean three-year return. Large-cap, five-star, low-risk funds have the highest mean three-year return and the small-cap, one-star, average-risk funds have the lowest standard deviation. The highest standard deviation is found in small-cap, low-risk, five-star funds.
(a) Mean of 3Yr Return% Type Growth Low Average High Value Low Average High Grand Total
One 8.94 11.76 8.21 6.28 8.71 7.03 9.30 8.81 8.84
Two 14.29 14.51 13.81 14.56 11.24 10.98 11.48 11.15 12.92
Rating Three 16.93 17.34 17.19 16.04 12.27 12.42 12.36 11.90 14.95
Four 18.32 18.38 18.55 17.71 13.94 14.24 14.45 13.24 16.41
Five 24.90 27.95 24.95 20.71 16.63 17.25 16.91 15.47 20.83
Grand Total 16.76 17.66 16.91 15.64 12.57 12.69 12.90 12.03 14.91
One 6.17 1.06 7.99 6.45 2.24 2.65 2.15 2.15 4.76
Two 3.39 3.81 2.41 3.82 2.93 3.76 2.60 2.69 3.53
Rating Three 2.92 2.40 3.19 2.84 1.83 1.73 1.63 2.27 3.41
Four 2.70 2.40 2.71 3.08 1.72 1.77 1.86 1.34 3.18
Five 7.95 8.53 8.80 3.59 2.46 2.40 2.69 1.85 7.20
Grand Total 5.05 5.36 5.13 4.42 2.88 3.10 2.84 2.66 4.71
(b) StdDev of 3Yr Return% Growth Low Average High Value Low Average High Grand Total
3.26
(c)
The mean three-year return is higher for growth funds than for value funds. For both growth and value funds, the mean three-year return does not differ much depending on the risk level of the funds. The mean three-year return is highest for four and five star rated funds and is lower for one and two star rated funds. This occurs for both the growth and value funds.
(c)
The standard deviation of the three-year return varies much more for growth funds than for value funds. For the growth funds, the standard deviation does not vary much among different sized funds.
3.27
(a) (b) (c)
Q1 = 3, Q3 = 9, interquartile range = 6 Five-number summary: 0 3 7 9 12
Box-and-whisker Plot
(d)
The distribution is left-skewed. Since one of the data points is different, 12 here in 3.27 and 11 in 3.3, the answers are not the same. The maximum for 3.27 is 12 and the maximum in 3.3 is 11. The rest of the five-number summary is the same.
3.28
(a) (b)
Q1 = 4, Q3 = 9, interquartile range = 5 Five-number summary: 3 4 7 9 12
(c)
Box-and-whisker Plot
(d)
The distances between the median and the extremes are close, 4 and 5, but the differences in the tails are different (1 on the left and 3 on the right), so this distribution is slightly right-skewed. In 3.2 (d), because the mean and median are equal, the distribution is symmetric. The box part of the graph is symmetric, but the tails show right-skewness.
3.29
(a) (b)
Q1 = 3, Q3 = 8.5, interquartile range = 5.5 Five-number summary: 2 3 7 8.5 9
(c)
Box-and-whisker Plot
(d)
The distribution is left-skewed. Answers are the same.
3.30
(a) (b)
Q1 = –6.5, Q3 = 8, interquartile range = 14.5 Five-number summary: –8 –6.5 7 8 9
3.30 cont.
(c) Box-and-whisker Plot
(d)
The distribution is left-skewed. This is consistent with the answer in 3.4 (d).
3.31
(a) (b)
Q1 = 14, Q3 = 24.5, interquartile range = 10.5
Five-Number Summary
Minimum   First Quartile   Median   Third Quartile   Maximum
5         14               18       24.5             41
(c)
The distribution is right-skewed
3.32
(a) (b)
Q1 = 56.9, Q3 = 68.6, interquartile range = 11.7 Five-Number Summary
Minimum First Quartile Median Third Quartile Maximum
32.1 56.9 63.95 68.6 79.1
(c)
The distribution is left-skewed: the lower whisker (32.1 to 56.9) is much longer than the upper whisker (68.6 to 79.1).
3.33
(a) (b)
Q1 = 139, Q3 = 273, interquartile range = 134
3.33 cont.
(c)
The distribution is right-skewed.
3.34
(a),(b) Five-Number Summary
                      Before Halftime   Halftime or After
Minimum               4.5               4
First Quartile        5.3               5.05
Median                5.8               5.6
Third Quartile        6.4               5.95
Maximum               7.4               7.3
Interquartile Range   1.1               0.9
3.34 cont.
(c)
The boxplot for halftime or after is approximately symmetric, and the boxplot for the first or second quarter is also approximately symmetric.
3.35
(a), (b) Five-Number Summary
               Minimum   First Quartile   Median   Third Quartile   Maximum   Interquartile Range
One-Year CD    0.02      0.05             0.115    0.25             0.55      0.2
Five-Year CD   0.02      0.15             0.3      0.55             1         0.4
3.35 cont.
(c)
Both of the distributions are right-skewed.
3.36
(a)
(b)
3.36 cont.
(b)
The waiting time for bank 1 in the commercial district of a city is skewed to the left. The waiting time for bank 2 in the residential area is skewed right.
(c)
The central tendency of the waiting times for the bank branch located in the commercial district (bank 1) of a city is lower than that of the branch located in the residential area (bank 2). There are a few longer than normal waiting times for the branch located in the residential area whereas there are a few exceptionally short waiting times for the branch located in the commercial area.
3.37
(a) (b)
Population mean = 6, σ² = 9.4, σ = 3.07
3.38
(a) (b)
Population mean = 6, σ² = 2.8, σ = 1.67
3.39
(a)
Population mean = 4.362, Population variance = 0.5944, Population standard deviation = 0.7709
(b)
The interval within one standard deviation of the mean is (3.59, 5.13); counting the data reveals 31 states, or 31/50*100 = 62% of the states, within this range. The interval within two standard deviations of the mean is (2.82, 5.90); counting the data reveals 50 states, or 50/50*100 = 100% of the states, within this range. The interval within three standard deviations of the mean is (2.05, 6.67); counting the data reveals 50 states, or 100% of the states, within this range.
(c)
This is slightly different from 68%, 95% and 99.7% of the empirical rule.
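The intervals in 3.39 (b) follow directly from the population mean and standard deviation in part (a). The sketch below is illustrative only: the function names are hypothetical, and because the 50 state values are not reproduced in this manual, the counting helper is shown but not applied to them.

```python
def sigma_intervals(mean, sd, max_k=3):
    """Return (k, lower, upper) for mean ± k standard deviations, k = 1..max_k."""
    return [(k, mean - k * sd, mean + k * sd) for k in range(1, max_k + 1)]


def percent_within(values, lower, upper):
    """Percentage of values that fall inside [lower, upper]."""
    inside = sum(1 for v in values if lower <= v <= upper)
    return 100.0 * inside / len(values)


# Population mean and standard deviation reported in 3.39 (a).
intervals = sigma_intervals(4.362, 0.7709)
for k, lower, upper in intervals:
    print(f"within {k} standard deviation(s): ({lower:.2f}, {upper:.2f})")
```

Passing the full list of state values to `percent_within` with each pair of limits would give the percentages that are compared against 68%, 95%, and 99.7%.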
3.40
(a) 68%
(b) 95%
(c) at least 0%, 75%, 88.89%
(d) μ – 4σ to μ + 4σ, or –2.8 to 19.2
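The percentages in 3.40 (c) are the Chebyshev lower bounds 1 – 1/k² for k = 1, 2, 3. A minimal sketch of that arithmetic follows; the function name is hypothetical, and the empirical-rule figures are included only for comparison with parts (a) and (b).

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of values within k standard deviations of the mean."""
    if k <= 1:
        return 0.0  # the bound gives no information for k <= 1
    return 1.0 - 1.0 / k ** 2


# Empirical-rule percentages for bell-shaped data, for comparison.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (1, 2, 3):
    print(f"k = {k}: Chebyshev at least {chebyshev_lower_bound(k):.2%}, "
          f"empirical rule about {EMPIRICAL_RULE[k]:.1%}")
```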
3.41
(a)
Population mean = 1.9226 Population standard deviation = 1.1886
(b)
On average, the cigarette tax is $1.92. The typical distance between the cigarette tax in each of the 50 states and the District of Columbia and the population mean cigarette tax is $1.19.
3.42
(a) Population mean = 14.4267 Population variance = 19.022 Population standard deviation = 4.3614
(b) The interval within one standard deviation of the mean is (10.065, 18.788); counting the data reveals 42 states, or 42/51*100 = 82.35% of the states and DC, within this range. The interval within two standard deviations of the mean is (5.704, 23.150); counting the data reveals 49 states, or 49/51*100 = 96.08% of the states and DC, within this range. The interval within three standard deviations of the mean is (1.342, 27.511); counting the data reveals 50 states, or 50/51*100 = 98.04% of the states and DC, within this range.
(c) This is slightly different from 68%, 95% and 99.7% of the empirical rule.
3.43
(a) Population mean = 397.1617 Population standard deviation = 638.721
(b) On average, the market capitalization for this population of 30 companies is $397.2 billion. The typical distance between the market capitalization and the mean market capitalization for this population of 30 companies is $638.7 billion.
3.44
(a) (b)
cov(X, Y) = 65.2909
S_X² = 21.7636, S_Y² = 195.8727
r = cov(X, Y) / (S_X S_Y) = 65.2909 / (√21.7636 × √195.8727) = +1.0
(c) There is a perfect positive linear relationship between X and Y; all the points lie exactly on a straight line with a positive slope.
3.45
(a) The study suggests that the perceived usefulness of smartphones in an educational setting and the number of times students used their smartphone to send or read email for class purposes are positively correlated.
(b) There could be a cause and effect relationship between perceived usefulness of smartphones and the number of times students used their smartphone to send or read email for class purposes. The more a student uses their smartphone for class, the more they may feel it is useful in an educational setting.
3.46
(a) cov(X, Y) = 133.3333
(b) S_X² = 2200, S_Y² = 11.4762
r = cov(X, Y) / (S_X S_Y) = 133.3333 / (√2200 × √11.4762) = 0.8391
(c) The correlation coefficient is more valuable for expressing the relationship between calories and sugar because it does not depend on the units used to measure calories and sugar.
(d) There is a strong positive linear relationship between calories and sugar.
3.47
(a)
Covariance sample First Weekend & US Gross = 1378.794 Covariance sample First Weekend & Worldwide Gross = 4781.008 Covariance sample US Gross & Worldwide Gross = 9937.710
(b)
Coefficient of correlation First Weekend & US Gross = 0.7600 Coefficient of correlation First Weekend & Worldwide Gross = 0.8596 Coefficient of correlation US Gross &Worldwide Gross = 0.9448
(c)
The correlation coefficient is more valuable for expressing the relationship because it does not depend on the units used.
(d)
There is a strong positive linear relationship between U.S. gross and worldwide gross, first weekend gross and worldwide gross and first weekend gross and U.S. gross.
3.48
Excel Output:
City            Download Speed   Upload Speed
Austin          726.3            80.9
Baltimore       740.2            89.0
Charlotte       1171.3           110.6
New Orleans     696.9            97.2
Oklahoma City   727.5            111.6
San Diego       675.4            131.7
San Francisco   1008.9           121.4
Tampa           817.6            105.0
Covariance      726.4682143      =COVARIANCE.S(B2:B9, C2:C9)
r               0.246308328      =CORREL(B2:B9, C2:C9)
(a) cov(X, Y) = 726.5
(b) Correlation = r = 0.2463
(c) There is a weak positive linear relationship between download speed and upload speed.
3.49
(a)
Covariance sample between franchise value and payroll = 47.98 Coefficient of correlation between franchise value and payroll = 0.6766
(b)
Covariance of sample between payroll and number of wins = 396.57
(c)
Coefficient of correlation between payroll and number of wins = 0.4406
(d)
The covariance between payroll and number of wins is 396.57 which is much higher than the covariance between franchise value and payroll, which is only 47.98. There is more of a positive linear relationship between franchise value and payroll, with correlation value 0.6766. The positive linear relationship is not very strong between payroll and number of wins, with correlation value 0.4406.
3.50
We should look for ways to describe the typical value, the variation, and the distribution of the data within a range.
3.51
Central tendency or location refers to the fact that most sets of data show a distinct tendency to group or cluster about a certain central point.
3.52
The arithmetic mean is a simple average of all the values, but is subject to the effect of extreme values. The median is the middle ranked value, but varies more from sample to sample than the arithmetic mean, although it is less susceptible to extreme values. The mode is the most common value, but is extremely variable from sample to sample.
3.53
The first quartile is the value below which 25% of the total ranked observations will fall, the median is the value that divides the total ranked observations into two equal halves and the third quartile is the observation above which 25% of the total ranked observations will fall.
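As an illustration of how the quartiles are located, the sketch below applies the ranked-value positions (n + 1)/4 and 3(n + 1)/4 with the rounding used in 3.68 (b), where the 12.75 ranked value becomes the 13th and the 38.25 ranked value becomes the 38th. Averaging the two neighboring values when the position ends in .5 is an assumption added here, and the function names are hypothetical. The data are the 14 mobile commerce penetration values from 3.14.

```python
import math


def ranked_value(sorted_values, position):
    """Value at a 1-based rank that may be fractional."""
    whole = math.floor(position)
    fraction = position - whole
    if fraction == 0:
        return sorted_values[whole - 1]
    if fraction == 0.5:
        # Assumed rule: average the two neighboring ranked values.
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Otherwise round the position to the nearest whole rank.
    return sorted_values[round(position) - 1]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    ordered = sorted(values)
    n = len(ordered)
    q1 = ranked_value(ordered, (n + 1) / 4)
    median = ranked_value(ordered, (n + 1) / 2)
    q3 = ranked_value(ordered, 3 * (n + 1) / 4)
    return ordered[0], q1, median, q3, ordered[-1]


# Mobile commerce penetration (%) for the 14 countries in 3.14.
penetration = [36.4, 64.3, 57.3, 79.1, 32.1, 68.6, 38.9,
               69.6, 65.1, 56.9, 59.9, 63.8, 74.2, 64.1]
summary = five_number_summary(penetration)
print(summary)
```

The result agrees with the five-number summary reported for these data in 3.32 (32.1, 56.9, 63.95, 68.6, 79.1). Spreadsheet and statistical packages may use different interpolation formulas and return slightly different quartiles.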
3.54
Variation is the amount of dispersion, or "spread," in the data.
3.55
The Z score measures how many standard deviations an observation in a data set is away from the mean.
3.56
The range is a simple measure, but only measures the difference between the extremes. The interquartile range measures the range of the center fifty percent of the data. The standard deviation measures variation around the mean while the variance measures the squared variation around the mean, and these are the only measures that take into account each observation. The coefficient of variation measures the variation around the mean relative to the mean. The range, standard deviation, variance and coefficient of variation are all sensitive to outliers while the interquartile range is not.
3.57
The empirical rule relates the mean and standard deviation to the percentage of values that will fall within a certain number of standard deviations of the mean.
3.58
Chebyshev’s theorem applies to any type of distribution while the empirical rule applies only to data sets that are approximately bell-shaped. The empirical rule is more accurate than the Chebyshev rule in approximating the concentration of data around the mean.
3.59
Shape is the manner in which the data are distributed. The shape of a data set can be symmetrical or asymmetrical (skewed).
3.60
Skewness measures the extent to which the data values are not symmetrical around the mean. Kurtosis measures the extent to which values that are very different from the mean affect the shape of the distribution of a set of data.
3.61
For symmetrical distributions, the boxplot is also symmetrical, with the median splitting the box in half and whiskers of equal length. For left-skewed distributions, the boxplot's left whisker will be longer and the median will be located in the right half of the box. For right-skewed distributions, the boxplot's right whisker will be longer and the median will be located in the left half of the box.
3.62
The covariance measures the strength of the linear relationship between two numerical variables while the coefficient of correlation measures the relative strength of the linear relationship. The value of the covariance depends very much on the units used to measure the two numerical variables while the value of the coefficient of correlation is totally free from the units used.
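The unit dependence described above can be seen numerically. This sketch uses the download and upload speeds from 3.48 and computes the sample covariance and coefficient of correlation by hand; the function names are illustrative, and dividing the download speeds by 1,000 is an arbitrary change of units chosen only for the demonstration.

```python
from math import sqrt


def sample_covariance(xs, ys):
    """Sample covariance with an n - 1 divisor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Coefficient of correlation: cov(X, Y) / (S_X * S_Y)."""
    s_x = sqrt(sample_covariance(xs, xs))
    s_y = sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Download and upload speeds for the eight cities in 3.48.
download = [726.3, 740.2, 1171.3, 696.9, 727.5, 675.4, 1008.9, 817.6]
upload = [80.9, 89.0, 110.6, 97.2, 111.6, 131.7, 121.4, 105.0]

cov = sample_covariance(download, upload)
r = correlation(download, upload)

# Re-express download speed in units 1,000 times larger.
download_rescaled = [x / 1000 for x in download]
cov_rescaled = sample_covariance(download_rescaled, upload)
r_rescaled = correlation(download_rescaled, upload)

print(f"cov = {cov:.4f}, r = {r:.4f}")
print(f"after rescaling: cov = {cov_rescaled:.6f}, r = {r_rescaled:.4f}")
```

The covariance shrinks by the same factor of 1,000 while r stays the same, which is why the coefficient of correlation is the better summary of the strength of the relationship.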
3.63
The arithmetic mean is the most common measure of central tendency and is calculated by dividing the sum of the values in the data set by the number of data values in the set. The geometric mean is used to measure the rate of change of a variable over time. It is calculated by taking the nth root of the product of the n data values, where n is the number of data values.
3.64
The geometric mean is used to measure the rate of change of a variable over time. It is calculated by taking the nth root of the product of the n data values, where n is the number of data values. The geometric rate of return measures the mean percentage return of an investment per time period.
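A short sketch of the geometric mean rate of return, mirroring the Excel formula shown in 3.21 and using the DJIA and S&P 500 annual returns listed there. The function name is hypothetical.

```python
import math


def geometric_mean_return(percent_returns):
    """Geometric mean rate of return per period, as a proportion.

    Each yearly return is given in percent, e.g. -5.63 for -5.63%.
    """
    growth = math.prod(1 + r / 100 for r in percent_returns)
    return growth ** (1 / len(percent_returns)) - 1


# Annual returns (%) for 2018 through 2021 from 3.21.
djia = [-5.63, 22.24, 18.73, 7.25]
sp500 = [-6.24, 28.88, 16.26, 26.89]

djia_rate = geometric_mean_return(djia)
sp500_rate = geometric_mean_return(sp500)
print(f"DJIA: {djia_rate:.2%}, S&P 500: {sp500_rate:.2%}")
```

Unlike the arithmetic mean of the yearly percentages, this rate compounds: growing an investment at the geometric mean rate every year reproduces the ending value actually reached.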
3.65
Excel Output
                      City Download Speeds
Mean                  797.77
Median                777.75
Mode                  #N/A
Minimum               672.72
Maximum               1916.28
Range                 1243.56
Variance              18038.1432
Standard Deviation    134.3062
Coeff. of Variation   16.84%
Skewness              5.9260
Kurtosis              48.7143
Count                 100
Standard Error        13.4306

                      City Download Speeds
Minimum               672.72
First Quartile        730.99
Median                777.75
Third Quartile        848.77
Maximum               1916.28
IQR                   177.78
(a)
Download Speed: mean = 797.77, median = 777.75, first quartile = 730.99, third quartile = 848.77
(b)
Download Speed: range = 1243.56, interquartile range = 177.78, variance = 18,038.14, standard deviation = 134.31, coefficient of variation = 16.84%
(c)
(d)
The mean download speed is 797.77 Mbps with 50% of the cities having a download speed less than 777.75 Mbps and 50% of the cities having download speed between 730.99 Mbps and 848.77 Mbps.
3.66
Minitab Output:
(a) (b)
Mean = 45.22, Median = 45, first quartile = 25, third quartile = 63
Range = 83, Interquartile range = 38, Variance = 535.79, Standard Deviation = 23.15, Coefficient of variation = 51.19%
(c)
(d)
The distribution is approximately symmetric. The mean approval process takes 45.22 days with 50% of the policies being approved in less than 45 days. 50% of the applications are approved between 25 and 63 days. About 25% of the applicants are approved in no more than 25 days.
3.67
Excel output:
                      Days
Mean                  43.04
Median                28.5
Mode                  5
Standard Deviation    41.92606
Sample Variance       1757.794
Range                 164
Minimum               1
Maximum               165
First Quartile        14
Third Quartile        54
Interquartile Range   40
CV                    97.41%
(a) Mean = 43.04, Median = 28.5, Q1 = 14, Q3 = 54
(b) Range = 164, Interquartile range = 40, Variance = 1,757.79, Standard deviation = 41.926, Coefficient of variation = 97.41%
3.67 cont.
(c)
Box-and-whisker plot for Days to Resolve Complaints
(d)
The distribution is right-skewed. Half of all customer complaints that year were resolved in less than a month (median = 28.5 days), 75% of them within 54 days. There were five complaints that were particularly difficult to settle which brought the overall mean up to 43 days. No complaint took longer than 165 days to resolve.
3.68
(a)
Excel output:
3.68 cont.
(a)
Minitab Output:
(b)
Using the formulas in the text with n = 50, Q1 = (50+1)/4 ranked value = 12.75 ranked value, so choose the 13th ranked value, which is 12. Q3 = 3(50+1)/4 ranked value = 38.25 ranked value, so choose the 38th ranked value, which is 18. Therefore the five-number summary (min, Q1, median, Q3, max) is 5, 12, 15, 18, 28.
* Note: Minitab uses a slightly different formula to calculate the quartiles.
(c) (d)
The distribution is symmetric. The service level is met because 75% of the calls are answered in less than 18 seconds.
3.69
(a) and (b) Excel Output
                      Electricity   Gas        Water      Sewer      Cable     Internet
Mean                  110.255       82.412     38.902     55.294     46.490    37.157
Median                108           82         32         47         47        30
Mode                  95            87         26         33         47        30
Minimum               73            39         18         19         23        20
Maximum               160           138        91         122        55        70
Range                 87            99         73         103        32        50
Variance              321.9937      596.6871   317.4502   688.5718   21.5349   111.2549
Standard Deviation    17.9442       24.4272    17.8171    26.2407    4.6406    10.5477
Coeff. of Variation   16.28%        29.64%     45.80%     47.46%     9.98%     28.39%
Skewness              0.4685        0.5942     1.2992     0.6390     -2.6055   0.9179
Kurtosis              0.2788        -0.1853    0.7754     -0.3136    13.3315   -0.0055
Count                 51            51         51         51         51        51
Standard Error        2.5127        3.4205     2.4949     3.6744     0.6498    1.4770

                      Electricity   Gas        Water      Sewer      Cable     Internet
Minimum               73            39         18         19         23        20
First Quartile        95            64         27         33         45        30
Median                108           82         32         47         47        30
Third Quartile        123           93         46         72         47        50
Maximum               160           138        91         122        55        70
IQR                   28            29         19         39         2         20
3.69 cont.
(c)
For Electricity, the data are slightly skewed right. For Gas, the data are slightly skewed left. For Water, the data are skewed right. For Sewer, the data are skewed right. For Cable, the median and third quartile are the same, 47, and the data are skewed left. For Internet, the median and first quartile are the same, 30, and the data are skewed right.
(d)
Coefficient of correlation between Electricity and Cable = 0.0763
Coefficient of correlation between Electricity and Internet = 0.1122
3.70
(a), (b)
                      Bundle Score    Typical Cost ($)
Mean                  54.775          24.175
Standard Error        4.367344951     2.866224064
Median                62              20
Mode                  75              8
Standard Deviation    27.62151475     18.12759265
Sample Variance       762.9480769     328.6096154
Kurtosis              -0.845357193    2.766393511
Skewness              -0.48041728     1.541239625
Range                 98              83
Minimum               2               5
Maximum               100             88
Sum                   2191            967
Count                 40              40
First Quartile        34              9
Third Quartile        75              31
Interquartile Range   41              22
CV                    50.43%          74.98%
3.70 cont.
(c) Boxplot of Typical Cost ($) and Bundle Score
The typical cost is right-skewed, while the bundle score is left-skewed.
(d) r = cov(X, Y) / (S_X S_Y) = 0.3465
(e) The mean typical cost is $24.18, with an average spread around the mean equaling $18.13. The spread between the lowest and highest costs is $83. The middle 50% of the typical costs fall over a range of $22 from $9 to $31, while half of the typical costs are below $20. The mean bundle score is 54.775, with an average spread around the mean equaling 27.6215. The spread between the lowest and highest scores is 98. The middle 50% of the scores fall over a range of 41 from 34 to 75, while half of the scores are below 62. The typical cost is right-skewed, while the bundle score is left-skewed. There is a weak positive linear relationship between typical cost and bundle score.
3.71
Excel output:
                      Teabags
Mean                  5.5014
Standard Error        0.014967
Median                5.515
Mode                  5.53
Standard Deviation    0.10583
Sample Variance       0.0112
Kurtosis              0.127022
Skewness              –0.15249
Range                 0.52
Minimum               5.25
Maximum               5.77
Sum                   275.07
Count                 50
First Quartile        5.44
Third Quartile        5.57
Interquartile Range   0.13
CV                    1.9237%
(a) mean = 5.5014, median = 5.515, first quartile = 5.44, third quartile = 5.57
(b) Range = 0.52, Interquartile range = 0.13, Variance = 0.0112, Standard Deviation = 0.10583, Coefficient of Variation = 1.924%
(c) The mean weight of the tea bags in the sample is 5.5014 grams while the middle ranked weight is 5.515. The company should be concerned about the central tendency because that is where the majority of the weights will cluster. The average of the squared differences between the weights in the sample and the sample mean is 0.0112, whereas the square root of it is 0.106 gram. The difference between the lightest and the heaviest tea bags in the sample is 0.52. 50% of the tea bags in the sample weigh between 5.44 and 5.57 grams. According to the empirical rule, about 68% of the tea bags produced will have weight that falls within 0.106 grams around 5.5014 grams. The company producing the tea bags should be concerned about the variation because tea bags will not weigh exactly the same due to various factors in the production process, e.g. temperature and humidity inside the factory, differences in the density of the tea, etc. Having some idea about the amount of variation will enable the company to adjust the production process accordingly.
(d) Box-and-whisker Plot of Teabags
The data is slightly left skewed.
(e) On average, the weight of the teabags is quite close to the target of 5.5 grams. Even though the mean weight is close to the target weight of 5.5 grams, the standard deviation of 0.106 indicates that about 75% of the teabags will fall within 0.212 grams around the target weight of 5.5 grams. The interquartile range of 0.13 also indicates that half of the teabags in the sample fall in an interval 0.13 grams around the median weight of 5.515 grams. The process can be adjusted to reduce the variation of the weight around the target mean.
3.72
(a) Excel output: Five-number Summary
                 Boston   Vermont
Minimum          0.04     0.02
First Quartile   0.17     0.13
Median           0.23     0.2
Third Quartile   0.32     0.28
Maximum          0.98     0.83
(b) [Box-and-whisker plots: Vermont and Boston, horizontal axis from 0 to 1]
Both distributions are right-skewed.
(c) Both sets of shingles did quite well in achieving a granule loss of 0.8 gram or less. The Boston shingles had only two data points greater than 0.8 gram. The next highest to these was 0.6 gram. These two data points can be considered outliers. Only 1.176% of the shingles failed the specification. In the Vermont shingles, only one data point was greater than 0.8 gram. The next highest was 0.58 gram. Thus, only 0.714% of the shingles failed to meet the specification.
3.73
(a) Five-Number Summary
Minimum First Quartile Median Third Quartile Maximum
Center City 13 41 64 75 109
Metro Area 14 35 48 59 70
(b) The Metro Area is slightly left-skewed. The Center City is left-skewed.
(c) Correlation coefficient of summated rating and cost of a meal for Center City = 0.6842
Correlation coefficient of summated rating and cost of a meal for Metro Area = 0.6867
There is a positive correlation between the cost of a meal and the summated rating. The higher-priced restaurants tend to receive higher ratings than the lower-priced restaurants.
(d) The median cost of a meal in the center city is $64, while the median cost of a meal in the metro area is $48. The range in costs of meals in the center city is greater than the range in costs of meals in the metro area.
3.74
(a), (b), (c)
              Calories   Protein    Cholesterol
Calories      1
Protein       0.464411   1
Cholesterol   0.177665   0.141673   1
(d) There is a rather weak positive linear relationship between calories and protein, with a correlation coefficient of 0.46. The positive linear relationship between calories and cholesterol is quite weak at 0.178.
3.75
(a), (b), (d) Excel output:
Five-Number Summary
                      No             Yes
Minimum               430            713.6
First Quartile        700            2941
Median                800            4000
Third Quartile        1500           5000
Maximum               3400           9013
IQR                   800            2059

Descriptive Statistics
                      No             Yes
Mean                  1102.36        4000.309231
Median                800            4000
Mode                  800            4000
Minimum               430            713.6
Maximum               3400           9013
Range                 2970           8299.4
Variance              429003.5004    3476664.9999
Standard Deviation    654.9836       1864.5817
Coeff. of Variation   59.42%         46.61%
Skewness              1.6263         0.5675
Kurtosis              2.5864         0.7561
Count                 50             65
Standard Error        92.6287        231.2729

The pay of coaches not in a major conference is right-skewed. The pay of coaches in a major conference is nearly symmetrical.
(c) The mean pay of coaches not in a major conference is $1102.36 million with standard deviation $654.98 million, whereas the mean pay of coaches in a major conference is $4000.31 million with standard deviation of $1864.58 million. The middle-ranked pay of coaches not in a major conference is $800 million. The middle-ranked pay of coaches in a major conference is $4000 million. The difference between the highest and lowest pay not in a major conference is $2970 million. The difference between the highest and lowest pay in a major conference is $8299.40 million.
(e) On average, major conference coaches are paid more than non-major conference coaches. There is much more variation in pay among major conference coaches compared to non-major conference coaches.
3.76
(a), (b) Excel output: Five-Number Summary
                      Real Estate Tax Rate   Average Home Price   Annual Property Tax
Minimum               0.0028                 119000               606
First Quartile        0.0066                 157200               1446
Median                0.0092                 194500               2006
Third Quartile        0.0156                 273100               3390
Maximum               0.0249                 615300               5419
IQR                   0.009                  115900               1944

Descriptive Statistics
                      Real Estate Tax Rate   Average Home Price   Annual Property Tax
Mean                  0.011060784            233176.4706          2405.352941
Median                0.0092                 194500               2006
Mode                  0.0057                 172500               #N/A
Minimum               0.0028                 119000               606
Maximum               0.0249                 615300               5419
Range                 0.0221                 496300               4813
Variance              0.0000                 11925884235.2941     1379255.4729
Standard Deviation    0.0054                 109205.6969          1174.4171
Coeff. of Variation   48.81%                 46.83%               48.83%
Skewness              0.8005                 1.9417               0.8025
Kurtosis              -0.2385                4.3227               -0.2323
Count                 51                     51                   51
Standard Error        0.0008                 15291.8562           164.4513

(c) All three variables are highly right-skewed, especially average property value.
(d) Correlation coefficient between property taxes and home price = –0.1240.
(e) There is a large variation in each of the variables from state to state.
3.77
(a), (b) Excel output: Five-Number Summary
                      Mobile Connection Speed   Broadband Connection Speed
Minimum               6.2                       35.32
First Quartile        13.7                      53.73
Median                21.8                      92.63
Third Quartile        31.3                      153.88
Maximum               59.6                      245.5
IQR                   17.6                      100.15

Descriptive Statistics
                      Mobile Connection Speed   Broadband Connection Speed
Mean                  23.98450704               108.3716901
Median                21.8                      92.63
Mode                  6.8                       77.88
Minimum               6.2                       35.32
Maximum               59.6                      245.5
Range                 53.4                      210.18
Variance              165.2156                  3561.3746
Standard Deviation    12.8536                   59.6773
Coeff. of Variation   53.59%                    55.07%
Skewness              0.8854                    0.6167
Kurtosis              0.4848                    -0.7173
Count                 71                        71
Standard Error        1.5254                    7.0824

(c), (d) Both Mobile Connection Speed and Broadband Connection Speed are right-skewed. Correlation coefficient between mobile and broadband = 0.5513.
(e) The average mobile connection speed for the various countries surveyed is 23.98 Mbps. Half of the countries surveyed had a mobile connection speed less than 21.8 Mbps. One-quarter of the countries surveyed had a mobile connection speed less than 13.7 Mbps, while another quarter had a mobile connection speed greater than 31.3 Mbps. The range for mobile connection speed is 53.4 Mbps, with a standard deviation of 12.85 Mbps. The average broadband connection speed for the various countries surveyed is 108 Mbps. Half of the countries surveyed had a broadband connection speed less than 92.6 Mbps. One-quarter of the countries surveyed had a broadband connection speed less than 53.7 Mbps, while another quarter had a broadband connection speed greater than 153.9 Mbps. The range for broadband connection speed is 210.2 Mbps, with a standard deviation of 59.7 Mbps.
(f) There is a positive linear relationship between mobile connection speed and broadband connection speed.
3.78
(a), (b) Abandonment rate in % (7:00AM-3:00PM)
Mean                  13.86363636
Standard Error        1.625414306
Median                10
Mode                  9
Standard Deviation    7.623868875
Sample Variance       58.12337662
Kurtosis              0.723568739
Skewness              1.180708144
Range                 29
Minimum               5
Maximum               34
Sum                   305
Count                 22
First Quartile        9
Third Quartile        20
Interquartile Range   11
CV                    54.99%
(c) [Boxplot: Abandonment rate in % (7:00AM-3:00PM), horizontal axis from 0 to 35]
The data are right-skewed.
(d) r = 0.7575
(e) The average abandonment rate is 13.86%. Half of the abandonment rates are less than 10%. One-quarter of the abandonment rates are less than 9%, while another one-quarter are more than 20%. The overall spread of the abandonment rates is 29%. The middle 50% of the abandonment rates are spread over 11%. The average spread of abandonment rates around the mean is 7.62%. The abandonment rates are right-skewed.
3.79
(a), (b) Excel Output: Average Commuting Time
Five-Number Summary
Minimum               19.5
First Quartile        23
Median                24.75
Third Quartile        27.1
Maximum               36.3
IQR                   4.1

Descriptive Statistics
Mean                  25.385
Median                24.75
Mode                  23.4
Minimum               19.5
Maximum               36.3
Range                 16.8
Variance              11.0869
Standard Deviation    3.3297
Coeff. of Variation   13.12%
Skewness              0.8592
Kurtosis              0.5924
Count                 100
Standard Error        0.3330
(c) The data are skewed right.
(d) The average weekly commuting time is 25.385 minutes. Half of the average weekly commuting times are less than 24.75 minutes. One-quarter of the average weekly commuting times are less than 23 minutes, while another one-quarter are more than 27.1 minutes. The range of average weekly commuting time is 16.8 minutes. The middle 50% of the average weekly commuting times are spread over 4.1 minutes. The typical spread of average weekly commuting time around the mean is 3.33 minutes.
3.80
(a), (b) Excel Output: Average FICO Score
Five-Number Summary
Minimum               675
First Quartile        699
Median                717
Third Quartile        726
Maximum               739
IQR                   27

Descriptive Statistics
Mean                  712.745098
Median                717
Mode                  690
Minimum               675
Maximum               739
Range                 64
Variance              242.0337
Standard Deviation    15.5574
Coeff. of Variation   2.18%
Skewness              -0.5566
Kurtosis              -0.6947
Count                 51
Standard Error        2.1785
(c) Since the mean is less than the median, the data are left-skewed.
(d) The mean of the average credit scores is 712.7451. Half of the average credit scores are less than 717. One-quarter of the average credit scores are less than 699, while another one-quarter are more than 726. The range of the average credit scores is 64. The middle 50% of the average credit scores are spread over 27. The typical spread of the average credit scores around the mean is 15.557.
3.81
The variables "gender" and "major" are categorical and cannot be summarized with boxplots because boxplots are created using the data from numerical variables. Similarly, the mean is a statistic computed on numerical variables, so it is not appropriate for the categorical variables "gender" or "major." Pie charts are used for categorical variables, so they should not be created using data from the numerical variables "grade point average" and "height."
3.82
Excel output:
Five-Number Summary
                      Alcohol        Calories      Carbohydrates
Minimum               2.4            55            1.9
First Quartile        4.4            130.5         8.65
Median                4.92           151           12
Third Quartile        5.65           170.5         14.7
Maximum               11.5           330           32.1
IQR                   1.25           40            6.05

Descriptive Statistics
                      Alcohol        Calories      Carbohydrates
Mean                  5.269490446    155.656051    12.05171975
Median                4.92           151           12
Mode                  4.2            110           12
Minimum               2.4            55            1.9
Maximum               11.5           330           32.1
Range                 9.1            275           30.2
Variance              1.8344         1917.4707     24.8094
Standard Deviation    1.3544         43.7889       4.9809
Coeff. of Variation   25.70%         28.13%        41.33%
Skewness              1.8405         1.2061        0.4912
Kurtosis              4.5814         2.9620        1.0811
Count                 157            157           157
Standard Error        0.1081         3.4947        0.3975

The % alcohol is right-skewed with an average of 5.269%. Half of the beers have % alcohol below 4.92%. The middle 50% of the beers have alcohol content spread over a range of 1.25%. The highest alcohol content is 11.5%, while the lowest is 2.4%. The typical spread of alcohol content around the mean is 1.354%. The number of calories is right-skewed with an average of 155.66. Half of the beers have calories below 151. The middle 50% of the beers have calories spread over a range of 40. The highest number of calories is 330, while the lowest is 55. The typical spread of calories around the mean is 43.79. The number of carbohydrates is nearly symmetric with an average of 12.052, which is almost identical to the median of 12.000. Half of the beers have carbohydrates below 12.000. The middle 50% of the beers have carbohydrates spread over a range of 6.05. The highest number of carbohydrates is 32.10, while the lowest is 1.9. The typical spread of carbohydrates around the mean is 4.98.
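The coefficient of variation and the standard error in the 3.82 output follow directly from the reported mean, standard deviation, and count. The sketch below recomputes them as a check; the dictionary layout and function names are illustrative choices, not part of the Excel output.

```python
import math

# Reported summary statistics for the three beer variables (3.82 Excel output).
reported = {
    "Alcohol":       {"mean": 5.269490446, "sd": 1.3544,  "n": 157},
    "Calories":      {"mean": 155.656051,  "sd": 43.7889, "n": 157},
    "Carbohydrates": {"mean": 12.05171975, "sd": 4.9809,  "n": 157},
}

def coefficient_of_variation(mean, sd):
    """Coefficient of variation as a percentage: 100 * sd / mean."""
    return 100 * sd / mean

def standard_error(sd, n):
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)

for name, s in reported.items():
    cv = coefficient_of_variation(s["mean"], s["sd"])
    se = standard_error(s["sd"], s["n"])
    print(f"{name}: CV = {cv:.2f}%, SE = {se:.4f}")
```

The printed values should agree with the Coeff. of Variation and Standard Error rows of the output to the displayed precision.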
Chapter 4
4.1
(a) Simple events include tossing a head or tossing a tail.
(b) Joint events include tossing three heads (HHH), a head followed by two tails (HTT), a tail followed by two heads (THH), and three tails (TTT).
(c) Tossing a tail on the first toss.
(d) The sample space is the collection of (HHH), (HHT), (HTH), (THH), (TTH), (THT), (HTT), and (TTT).
4.2
(a) Simple events include selecting a red ball.
(b) Selecting a white ball.
(c) The sample space consists of the 12 red balls and the 8 white balls.
4.3
(a) 30/90 = 1/3 = 0.33
(b) 60/90 = 2/3 = 0.67
(c) 10/90 = 1/9 = 0.11
(d) 30/90 + 30/90 − 10/90 = 50/90 = 5/9 = 0.556
4.4
(a) 60/100 = 3/5 = 0.6
(b) 10/100 = 1/10 = 0.1
(c) 35/100 = 7/20 = 0.35
(d) 60/100 + 65/100 − 35/100 = 90/100 = 9/10 = 0.9
4.5
(a) a priori
(b) Subjective
(c) a priori
(d) Empirical
4.6
(a) Mutually exclusive, not collectively exhaustive.
(b) Not mutually exclusive, not collectively exhaustive.
(c) Mutually exclusive, not collectively exhaustive.
(d) Mutually exclusive, collectively exhaustive.
4.7
(a) The joint probability of mutually exclusive events (being listed on the New York Stock Exchange and NASDAQ) is zero.
(b) The joint probability of the events (owning a smartphone and a tablet) is not zero because a consumer can own both a smartphone and a tablet at the same time.
(c) The joint probability of mutually exclusive events (being an Apple cellphone and a Samsung cellphone) is zero.
(d) The joint probability of the events (an automobile that is a Toyota and was manufactured in the U.S.) is not zero because a Toyota can be manufactured in the U.S.
4.8
(a) A respondent in the 50–59 age group.
(b) A respondent in the 50–59 age group who has more than $100,000 in retirement savings.
(c) A retirement account that has more than $100,000 in savings.
(d) Because age group and amount of retirement savings are two different characteristics.
4.9
(a) P(retirement account less than $100,000) = (530 + 380)/2,000 = 0.455
(b) P(60–69 and less than $100,000) = 380/2,000 = 0.19
(c) P(60–69 or less than $100,000) = (1,000 + 910 − 380)/2,000 = 0.765
(d) Question (b) asks for the intersection: both conditions being satisfied, the "and" composition. Question (c) asks for the "or" composition: at least one of the two conditions being satisfied. The difference between the two answers is the set of outcomes where one of the two conditions is satisfied, but not the other.
4.10
Answers will vary.
(a) A marketer who uses LinkedIn.
(b) A B2B marketer who uses LinkedIn.
(c) A marketer who uses B2C.
(d) A marketer who uses Facebook and is a B2C marketer is a joint event because it consists of two characteristics: use of Facebook and being a B2C marketer.
4.11
(a) P(chosen Facebook) = 1,030/2,000 = 0.5150
(b) P(B2B and chosen LinkedIn) = 350/2,000 = 0.1750
(c) P(B2B or chosen LinkedIn) = (1,000 + 400 − 50)/2,000 = 0.6750
(d) The probability of "is B2B or has chosen LinkedIn" includes the probability of "is B2B" and the probability of "chosen LinkedIn" minus the probability of "is B2B and chosen LinkedIn."
4.12
(a) P(fully supports increased use of educational technologies in higher ed) = 686/1,803 = 0.3805
(b) P(is a digital learning leader) = 206/1,803 = 0.1143
(c) P(fully supports increased use of educational technologies or is a digital learning leader) = (686 + 206 − 175)/1,803 = 717/1,803 = 0.3977
(d) The probability in (c) includes those who fully support increased use of educational technologies in higher education and those who are digital learning leaders.
4.13
(a) P(has a relationship) = 89/127 = 0.7008
(b) P(has a relationship or is Latino) = (89 + 52 − 35)/127 = 0.8346
(c) P(has a relationship and is Latino) = 35/127 = 0.2756
(d) The probability of "has a relationship or is Latino" includes the probability of "has a relationship" and the probability of "is Latino" minus the joint probability of "has a relationship and is Latino."
4.14
Important to Understand
Privacy Policy            Older adults   Younger adults   Total
Yes                       911            195              1106
No                        90             65               155
Total                     1001           260              1261
(a) P(is important to have a clear understanding of a company's privacy policy before signing up for its service online) = 1,106/1,261 = 0.8771
(b) P(is an older adult and indicates it is important to have a clear understanding of a company's privacy policy before signing up for its service online) = 911/1,261 = 0.7224
(c) P(is an older adult or indicates it is important to have a clear understanding of a company's privacy policy before signing up for its service online) = (1,001 + 1,106 − 911)/1,261 = 1,196/1,261 = 0.9485
(d) P(is an older adult or a younger adult) = 1,261/1,261 = 1.00
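The marginal, joint, and "or" probabilities in 4.14 can be read off the contingency table mechanically. The following is a minimal sketch using the counts above; the variable names and the nested-dictionary layout are assumptions made for illustration.

```python
# Counts from the 4.14 contingency table: rows = response, columns = age group.
table = {
    "Yes": {"Older": 911, "Younger": 195},
    "No":  {"Older": 90,  "Younger": 65},
}

# Grand total of respondents (1,261).
total = sum(sum(row.values()) for row in table.values())

# Marginal probabilities: sum a whole row or a whole column.
p_yes = sum(table["Yes"].values()) / total
p_older = sum(row["Older"] for row in table.values()) / total

# Joint probability: a single cell of the table.
p_older_and_yes = table["Yes"]["Older"] / total

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_older_or_yes = p_older + p_yes - p_older_and_yes

print(round(p_yes, 4), round(p_older_and_yes, 4), round(p_older_or_yes, 4))
```

Subtracting the joint probability avoids counting the 911 older adults who answered "Yes" twice.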
4.15
Needs Warranty-
Related Repair     U.S.     Non-U.S.   Total
Yes                0.025    0.015      0.04
No                 0.575    0.385      0.96
Total              0.600    0.400      1.00
(a) P(needs warranty repair) = 0.04
(b) P(needs warranty repair and manufacturer based in U.S.) = 0.025
(c) P(needs warranty repair or manufacturer based in U.S.) = 0.615
(d) P(needs warranty repair or manufacturer not based in U.S.) = 0.425
4.16
(a) P(A | B) = 10/30 = 1/3 = 0.33
(b) P(A | B′) = 20/60 = 1/3 = 0.33
(c) P(A′ | B′) = 40/60 = 2/3 = 0.67
(d) Since P(A | B) = P(A) = 1/3, events A and B are statistically independent.
4.17
(a) P(A | B) = 10/35 = 2/7 = 0.2857
(b) P(A′ | B′) = 35/65 = 7/13 = 0.5385
(c) P(A | B′) = 30/65 = 6/13 = 0.4615
(d) Since P(A | B) = 0.2857 and P(A) = 0.40, events A and B are not statistically independent.
4.18
P(A | B) = P(A and B)/P(B) = 0.4/0.8 = 1/2 = 0.5
4.19
P(A and B) = P(A) P(B) = (0.7)(0.6) = 0.42
4.20
Since P(A and B) = 0.20 and P(A) P(B) = 0.12, events A and B are not statistically independent.
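The independence checks in 4.16, 4.17, and 4.20 all compare two numbers: either P(A | B) with P(A), or P(A and B) with P(A)P(B). A small sketch of both comparisons follows; exact fractions are used so that equality is not affected by rounding, and the function name is an illustrative choice.

```python
from fractions import Fraction
import math

def independent(p_a_given_b, p_a):
    """A and B are statistically independent exactly when P(A | B) equals P(A)."""
    return p_a_given_b == p_a

# 4.16: P(A | B) = 10/30 and P(A) = 1/3, so the events are independent.
check_416 = independent(Fraction(10, 30), Fraction(1, 3))

# 4.17: P(A | B) = 10/35 and P(A) = 0.40 = 2/5, so the events are not independent.
check_417 = independent(Fraction(10, 35), Fraction(2, 5))

# 4.20: compare the joint probability 0.20 with the product P(A)P(B) = 0.12.
check_420 = math.isclose(0.20, 0.12)

print(check_416, check_417, check_420)
```

With decimal inputs, `math.isclose` is the safer comparison because products of floating-point numbers rarely match a stated probability bit for bit.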
4.21
(a) P(less than $100k | 50–59) = 530/1,000 = 0.5300
(b) P(50–59 | less than $100k) = 530/910 = 0.5824
(c) The conditional events are reversed.
(d) Since P(less than $100k | 50–59) = 530/1,000 = 0.5300 is not equal to P(less than $100k) = 910/2,000 = 0.4550, having a retirement savings of less than $100,000 and age group are not independent.
4.22
(a) P(Chosen Facebook | B2B) = 400/1,000 = 0.4000
(b) P(B2B | Chosen Facebook) = 400/1,030 = 0.3883
(c) The conditional events are reversed.
(d) P(Chosen Facebook | B2B) = 400/1,000 = 0.4000 and P(Chosen Facebook) = 1,030/2,000 = 0.5150 are not equal. Therefore, business focus and social media used are not independent.
4.23
(a) P(Latino | has a relationship) = 35/89 = 0.3933
(b) P(has a relationship | Latino) = 35/52 = 0.6731
(c) The conditional events are reversed.
(d) P(Latino | has a relationship) = 35/89 = 0.3933 and P(Latino) = 52/127 = 0.4094 are not equal. Therefore, having a business relationship with a bank or credit union and ethnicity of the small business owner are not independent.
4.24
(a) P(fully supports increased use of educational technologies in higher ed | faculty member) = 511/1,597 = 0.3200
(b) P(does not fully support increased use of educational technologies in higher ed | faculty member) = 1,086/1,597 = 0.6800
(c) P(fully supports increased use of educational technologies in higher ed | digital learning leader) = 175/206 = 0.8495
(d) P(does not fully support increased use of educational technologies in higher ed | digital learning leader) = 31/206 = 0.1505
4.25
Important to Understand
Privacy Policy            Older adults   Younger adults   Total
Yes                       911            195              1106
No                        90             65               155
Total                     1001           260              1261
(a) P(important to understand privacy policy | older adult) = 911/1,001 = 0.9101
(b) P(younger adult | does not indicate that it is important to understand privacy policy) = 65/155 = 0.4194
(c) Since P(younger adult | does not indicate that it is important to understand privacy policy) = 0.4194 is not equal to P(younger adult) = 260/1,261 = 0.2062, indicating that it is important to understand privacy policy and adult age are not independent.
4.26
Needs Warranty-
Related Repair     U.S.     Non-U.S.   Total
Yes                0.025    0.015      0.04
No                 0.575    0.385      0.96
Total              0.600    0.400      1.00
(a) P(needs warranty repair | manufacturer based in U.S.) = 0.025/0.6 = 0.0417
(b) P(needs warranty repair | manufacturer not based in U.S.) = 0.015/0.4 = 0.0375
(c) Since P(needs warranty repair | manufacturer based in U.S.) = 0.025/0.6 = 0.0417 is not equal to P(needs warranty repair) = 0.04, the two events are not independent.
4.27
(a) P(higher for the year) = (39 + 8)/(39 + 8 + 12 + 12) = 0.6620
(b) P(higher for the year | higher first week) = 39/(39 + 8) = 0.8298
(c) Since P(higher for the year) = 0.6620 is not equal to P(higher for the year | higher first week) = 0.8298, the two events, first-week performance and annual performance, are not statistically independent.
(d) Answers will vary.
4.28
(a) P(both queens) = (4/52)(3/51) = 12/2,652 = 1/221 = 0.0045
(b) P(10 followed by 5 or 6) = (4/52)(8/51) = 32/2,652 = 8/663 = 0.012
(c) P(both queens) = (4/52)(4/52) = 16/2,704 = 1/169 = 0.0059
(d) P(blackjack) = (16/52)(4/51) + (4/52)(16/51) = 128/2,652 = 32/663 = 0.0483
4.29
(a) P(2 red cellphones) = (2/9)(1/8) = 2/72 = 1/36 = 0.0278
(b) P(1 red cellphone and 1 black cellphone) = (7/9)(2/8) + (2/9)(7/8) = 28/72 = 7/18 = 0.3889
(c) P(3 red cellphones) = (2/9)^3 = 0.01097
(d) (a) P(2 red cellphones) = (2/9)(2/9) = 4/81 = 0.0494
    (b) P(1 red cellphone and 1 black cellphone) = 2(2/9)(7/9) = 0.3457
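The multiplication-rule answers in 4.28 and 4.29 are products of simple fractions, so they can be checked exactly. The sketch below uses Python's `fractions` module; the variable names are illustrative, and the counts (2 red and 7 black cellphones among 9) are inferred from the fractions in the answers rather than stated separately.

```python
from fractions import Fraction as F

# 4.28(a): two queens, drawing without replacement.
both_queens = F(4, 52) * F(3, 51)

# 4.28(c): two queens, drawing with replacement.
both_queens_repl = F(4, 52) * F(4, 52)

# 4.28(d): blackjack = (ten-valued card, then ace) or (ace, then ten-valued card).
blackjack = F(16, 52) * F(4, 51) + F(4, 52) * F(16, 51)

# 4.29(b): one red and one black cellphone, without replacement.
one_red_one_black = F(7, 9) * F(2, 8) + F(2, 9) * F(7, 8)

# 4.29(d)(b): the same event with replacement (two orderings of red/black).
one_red_one_black_repl = 2 * F(2, 9) * F(7, 9)

for label, p in [("both queens", both_queens),
                 ("both queens, with replacement", both_queens_repl),
                 ("blackjack", blackjack),
                 ("1 red, 1 black", one_red_one_black),
                 ("1 red, 1 black, with replacement", one_red_one_black_repl)]:
    print(f"{label}: {p} = {float(p):.4f}")
```

Without replacement, the second factor's denominator drops by one; with replacement, it stays the same.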
4.30
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B′)P(B′)] = (0.8)(0.05) / [(0.8)(0.05) + (0.4)(0.95)] = 0.04/0.42 = 0.095
4.31
P(B | A) = P(A | B)P(B) / [P(A | B)P(B) + P(A | B′)P(B′)] = (0.6)(0.3) / [(0.6)(0.3) + (0.5)(0.7)] = 0.18/0.53 = 0.340
4.32
D = has disease, T = tests positive
(a) P(D | T) = P(T | D)P(D) / [P(T | D)P(D) + P(T | D′)P(D′)] = (0.9)(0.03) / [(0.9)(0.03) + (0.01)(0.97)] = 0.027/0.0367 = 0.736
(b) P(D′ | T′) = P(T′ | D′)P(D′) / [P(T′ | D′)P(D′) + P(T′ | D)P(D)] = (0.99)(0.97) / [(0.99)(0.97) + (0.10)(0.03)] = 0.9603/0.9633 = 0.997
4.33
S = shops online in office, M = shopper is male
(a) P(S | M) = P(M | S)P(S) / [P(M | S)P(S) + P(M | S′)P(S′)] = (0.57)(0.23) / [(0.57)(0.23) + (0.48)(0.77)] = 0.1311/0.5007 = 0.2618
(b) P(M) = (0.57)(0.23) + (0.48)(0.77) = 0.5007
4.34
B = Base Construction Co. enters a bid, O = Olive Construction Co. wins the contract
(a) P(B | O) = P(O | B)P(B) / [P(O | B)P(B) + P(O | B′)P(B′)] = (0.5)(0.3) / [(0.5)(0.3) + (0.25)(0.7)] = 0.15/0.325 = 0.4615
(b) P(O) = 0.175 + 0.15 = 0.325
4.35
W = women started business, A = above $50,000 revenues
(a) P(W | A) = P(A | W)P(W) / [P(A | W)P(W) + P(A | W′)P(W′)] = (0.2)(0.35) / [(0.2)(0.35) + (0.24)(0.65)] = 0.07/0.226 = 0.3097
(b) P(A) = (0.2)(0.35) + (0.24)(0.65) = 0.226
4.36
(a) P(huge success | favorable review) = 0.099/0.459 = 0.2157
    P(moderate success | favorable review) = 0.14/0.459 = 0.3050
    P(break even | favorable review) = 0.16/0.459 = 0.3486
    P(loser | favorable review) = 0.06/0.459 = 0.1307
(b) P(favorable review) = 0.99(0.1) + 0.7(0.2) + 0.4(0.4) + 0.2(0.3) = 0.459
4.37
(a) P(A rating | issued by city) = 0.35/0.56 = 0.625
(b) P(issued by city) = 0.5(0.7) + 0.6(0.2) + 0.9(0.1) = 0.56
(c) P(issued by suburb) = 0.4(0.7) + 0.2(0.2) + 0.05(0.1) = 0.325
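The two-event Bayes' theorem calculations in 4.30 through 4.35 all share one pattern: a prior P(B), a likelihood P(A | B), and a likelihood under the complement P(A | B′). A minimal sketch of that pattern follows; the function and parameter names are illustrative.

```python
def bayes(prior, likelihood, likelihood_given_complement):
    """Return P(B | A) from P(B), P(A | B), and P(A | B')."""
    numerator = likelihood * prior
    # Denominator is the total probability of A over B and its complement.
    evidence = numerator + likelihood_given_complement * (1 - prior)
    return numerator / evidence

# 4.30: P(B) = 0.05, P(A | B) = 0.8, P(A | B') = 0.4
p_430 = bayes(prior=0.05, likelihood=0.8, likelihood_given_complement=0.4)

# 4.32(a): P(D) = 0.03, P(T | D) = 0.9, P(T | D') = 0.01
p_432a = bayes(prior=0.03, likelihood=0.9, likelihood_given_complement=0.01)

# 4.33(a): P(S) = 0.23, P(M | S) = 0.57, P(M | S') = 0.48
p_433a = bayes(prior=0.23, likelihood=0.57, likelihood_given_complement=0.48)

print(round(p_430, 3), round(p_432a, 3), round(p_433a, 4))
```

The denominator computed inside the function is the same quantity reported separately in parts such as 4.33(b) and 4.35(b).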
4.38
3^10 = 59,049
4.39
(a) 2^7 = 128
(b) 6^7 = 279,936
(c) There are two mutually exclusive and collectively exhaustive outcomes in (a) and six in (b).
4.40
(a) (30)(30)(30) = 27,000
(b) (1/30)(1/30)(1/30) = 0.000037
(c) In "dial combination," the order of the combination is important, while order is irrelevant in the mathematical combination expressed by equation (4.14).
4.41
(7)(4)(3) = 84
4.42
(5)(4)(9)(5)(6) = 5,400
4.43
n! = 4! = (4)(3)(2)(1) = 24
4.44
5! = (5)(4)(3)(2)(1) = 120; not all the orders are equally likely because the teams have a different probability of finishing first through fifth.
4.45
5P4 = 5!/(5 − 4)! = 5!/1! = 120
4.46
n! = 6! = 720
4.47
8C2 = 8!/(2! 6!) = 28
4.48
10C4 = 10!/(4! 6!) = 210
4.49
7C4 = 7!/(4! 3!) = (7)(6)(5)/[(3)(2)(1)] = 35
4.50
100C2 = 100!/(2! 98!) = (100)(99)/2 = 4,950
4.51
20C3 = 20!/(3! 17!) = (20)(19)(18)/[(3)(2)(1)] = 1,140
4.52
With a priori probability, the probability of success is based on prior knowledge of the process involved. With empirical probability, outcomes are based on observed data. Subjective probability refers to the chance of occurrence assigned to an event by a particular individual.
4.53
A simple event can be described by a single characteristic. Joint probability refers to phenomena containing two or more events.
4.54
The general addition rule is used by adding the probability of A and the probability of B and then subtracting the joint probability of A and B.
4.55
Events are mutually exclusive if both cannot occur at the same time. Events are collectively exhaustive if one of the events must occur.
4.56
If events A and B are statistically independent, the conditional probability of event A given B is equal to the probability of A.
4.57
When events A and B are independent, the probability of A and B is the product of the probability of event A and the probability of event B. When events A and B are not independent, the probability of A and B is the product of the conditional probability of event A given event B and the probability of event B.
4.58
Bayes’ theorem uses conditional probabilities to revise the probability of an event in the light of new information.
4.59
In Bayes’ theorem, the prior probability is an unconditioned probability while the revised probability is the probability of the original event updated in light of some new information.
4.60
For Counting Rule 1, the number of possible events is the same for each trial. Counting Rule 2 allows for the number of possible events to differ for each trial.
4.61
In combinations, the order of the elements in the arrangement does not matter, whereas in permutations, the order of the arrangement of the elements does matter.
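The counting-rule answers in this chapter (4.38 through 4.51) can be verified with the standard library. The sketch below assumes Python 3.8 or later, where `math.perm` and `math.comb` are available; the variable names are illustrative.

```python
import math

# Counting Rule 1: k outcomes on each of n trials gives k**n arrangements.
rule_1 = 3 ** 10                    # 4.38

# Factorial: number of ways to order n distinct items.
orderings = math.factorial(4)       # 4.43

# Permutations: ordered selections of x items from n.
perm_5_4 = math.perm(5, 4)          # 4.45

# Combinations: unordered selections of x items from n.
comb_8_2 = math.comb(8, 2)          # 4.47
comb_10_4 = math.comb(10, 4)        # 4.48
comb_100_2 = math.comb(100, 2)      # 4.50
comb_20_3 = math.comb(20, 3)        # 4.51

print(rule_1, orderings, perm_5_4, comb_8_2, comb_10_4, comb_100_2, comb_20_3)
```

For the same n and x, the permutation count is never smaller than the combination count, because each unordered selection can be arranged in x! orders.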
4.62
(a)
Interested in                Generation
Investment Learning     Z       X       Total
Yes                     390     305     695
No                      110     195     305
Total                   500     500     1,000
(b) Answers may vary. A simple event is "Generation Z." A joint event is "Generation Z and interested in investment learning."
(c) Answers may vary. A joint event is "Generation Z and interested in investment learning."
(d) P(interested in learning) = 695/1,000 = 0.6950
(e) P(interested in learning and generation Z) = 390/1,000 = 0.3900
(f) P(interested in learning or generation Z) = (695 + 500 − 390)/1,000 = 0.8050
(g) They are not independent because generation Z and generation X have different probabilities of interest in investment learning.
4.63
(a) P(is an HR employee) = 132/400 = 0.33
(b) P(is an HR employee or indicates that absenteeism is an important metric) = (132 + 126 − 54)/400 = 204/400 = 0.51
(c) P(does not indicate that presenteeism is an important metric and is a non-HR employee) = 225/400 = 0.5625
(d) P(does indicate that presenteeism is an important metric or is a non-HR employee) = (304 + 268 − 225)/400 = 347/400 = 0.8675
(e) P(a non-HR employee given they indicate that presenteeism is an important metric) = 43/96 = 0.4479
(f) They are not independent.
(g) They are not independent.
4.64
(a) P(at least once a week) = 132/1,002 = 0.1317
(b) P(at least once a week or at least once a month) = (132 + 73)/1,002 = 205/1,002 = 0.2046
(c) P(Gen Z or several times a day) = (140 + 252 − 50)/1,002 = 342/1,002 = 0.3413
(d) P(Gen Z and several times a day) = 50/1,002 = 0.0499
(e) P(never | Gen Boomer) = 157/341 = 0.4604
4.65
(a) P(B2B service) = 1,000/4,000 = 0.25
(b) P(budget based on previous year) = [(0.521)(1,000) + (0.39)(1,000) + (0.30)(1,000) + (0.278)(1,000)]/4,000 = 1,489/4,000 = 0.3723
(c) P(B2B service and budget based on previous year) = (0.521)(1,000)/4,000 = 0.1303
(d) P(B2B service or budget based on previous year) = [1,000 + (0.521)(1,000)]/4,000 = 0.3803
(e) P(B2B service | budget based on previous year) = (0.39)(1,000)/1,000 = 0.39
(f) P(budget based on previous year | B2B product or services) = [(0.521)(1,000) + (0.39)(1,000)]/2,000 = 0.4555
(g) They are not independent.
Chapter 5
5.1
PHStat output for Distribution A:
Probabilities & Outcomes:
P       X
0.5     0
0.2     1
0.15    2
0.1     3
0.05    4
Statistics
E(X)                       1
E(Y)                       0
Variance(X)                1.5
Standard Deviation(X)      1.224745
Variance(Y)                0
Standard Deviation(Y)      0
Covariance(XY)             0
Variance(X+Y)              1.5
Standard Deviation(X+Y)    1.224745

PHStat output for Distribution B:
Probabilities & Outcomes:
P       X
0.05    0
0.1     1
0.15    2
0.2     3
0.5     4
Statistics
E(X)                       3
E(Y)                       0
Variance(X)                1.5
Standard Deviation(X)      1.224745
Variance(Y)                0
Standard Deviation(Y)      0
Covariance(XY)             0
Variance(X+Y)              1.5
Standard Deviation(X+Y)    1.224745

(a)
Distribution A                 Distribution B
X    P(X)    X*P(X)            X    P(X)    X*P(X)
0    0.50    0.00              0    0.05    0.00
1    0.20    0.20              1    0.10    0.10
2    0.15    0.30              2    0.15    0.30
3    0.10    0.30              3    0.20    0.60
4    0.05    0.20              4    0.50    2.00
        μ =  1.00                      μ =  3.00

(b)
Distribution A
X    (X – μ)^2    P(X)    (X – μ)^2*P(X)
0    (–1)^2       0.50    0.50
1    (0)^2        0.20    0.00
2    (1)^2        0.15    0.15
3    (2)^2        0.10    0.40
4    (3)^2        0.05    0.45
                    σ^2 = 1.50
σ = sqrt(Σ (X – μ)^2 P(X)) = 1.22

Distribution B
X    (X – μ)^2    P(X)    (X – μ)^2*P(X)
0    (–3)^2       0.05    0.45
1    (–2)^2       0.10    0.40
2    (–1)^2       0.15    0.15
3    (0)^2        0.20    0.00
4    (1)^2        0.50    0.50
                    σ^2 = 1.50
σ = sqrt(Σ (X – μ)^2 P(X)) = 1.22

(c) For distribution A, P(X ≥ 3) = 0.10 + 0.05 = 0.15
    For distribution B, P(X ≥ 3) = 0.20 + 0.50 = 0.70
(d) The means are different, but the variances are the same.
5.2
(a)–(b)
X    P(X)    X*P(X)    (X – μ)^2    (X – μ)^2*P(X)
0    0.10    0.00      4            0.40
1    0.20    0.20      1            0.20
2    0.45    0.90      0            0.00
3    0.15    0.45      1            0.15
4    0.05    0.20      4            0.20
5    0.05    0.25      9            0.45
(a) Mean = 2.00
    Variance = 1.40
(b) Stdev = 1.18321596
(c) P(X ≥ 2) = 0.45 + 0.15 + 0.05 + 0.05 = 0.70
5.3
(a) Based on the fact that the odds of winning are expressed with a base of 31,478, you would think that the automobile dealership sent out 31,478 fliers.
(b) μ = Σ X_i P(X_i) = 5,000(1/31,478) + 60(1/31,478) + 5(31,476/31,478) = $5.16
(c) σ = sqrt(Σ [X_i – E(X)]^2 P(X_i)) = $28.15
(d) The total cost of the prizes is $5,000 + $60 + 31,476 * $5 = $162,440. Assuming that the cost of producing the fliers is negligible, the cost of reaching a single customer is $162,440/31,478 = $5.16. The effectiveness of the promotion will depend on how many customers show up in the showroom.
5.4
(a)
X      P(X)
$–1    21/36
$+1    15/36
(b)
X      P(X)
$–1    21/36
$+1    15/36
(c)
X      P(X)
$–1    30/36
$+4    6/36
(d) $–0.167 for each method of play
5.5
Excel Output:
Arrivals = X   Frequency   Probability = Frequency/n   [X-E(X)]^2   Formula
0              15          0.075                       8.0089       =(A2-B13)^2
1              31          0.155                       3.3489       =(A3-B13)^2
2              47          0.235                       0.6889       =(A4-B13)^2
3              41          0.205                       0.0289       =(A5-B13)^2
4              29          0.145                       1.3689       =(A6-B13)^2
5              24          0.12                        4.7089       =(A7-B13)^2
6              10          0.05                        10.0489      =(A8-B13)^2
7              2           0.01                        17.3889      =(A9-B13)^2
8              1           0.005                       26.7289      =(A10-B13)^2
n              200         =SUM(B2:B10)
E(X)           2.83        =SUMPRODUCT(A2:A10, C2:C10)
Variance(X)    2.8611      =SUMPRODUCT(C2:C10, D2:D10)
Std Dev (X)    1.691479    =SQRT(B14)
(a) E(X) = 2.83
(b) σ = 1.69
(c) P(X < 2) = 0.075 + 0.155 = 0.230
5.6
Excel Output:
Approved = X   Frequency   Probability = Frequency/n   [X-E(X)]^2   Formula
0              13          0.125                       4.393861     =(A2-B12)^2
1              29          0.278846                    1.201553     =(A3-B12)^2
2              28          0.269231                    0.009246     =(A4-B12)^2
3              15          0.144231                    0.816938     =(A5-B12)^2
4              11          0.105769                    3.62463      =(A6-B12)^2
5              5           0.048077                    8.432322     =(A7-B12)^2
6              2           0.019231                    15.24001     =(A8-B12)^2
7              1           0.009615                    24.04771     =(A9-B12)^2
n              104         =SUM(B2:B9)
E(X)           2.096154    =SUMPRODUCT(A2:A9, C2:C9)
Variance(X)    2.317678    =SUMPRODUCT(C2:C9, D2:D9)
Std Dev (X)    1.522392    =SQRT(B13)
(a) E(X) = 2.0962
(b) σ = 1.5224
(c) P(X > 1) = 28/104 + 15/104 + 11/104 + 5/104 + 2/104 + 1/104 = 0.5962
5.7
Excel output:
Probability   Stock X   Stock Y   [X-E(X)]^2   [Y-E(Y)]^2
0.1           -5        -100      13340.25     62001
0.3           10        50        10100.25     9801
0.4           170       210       3540.25      3721
0.2           200       300       8010.25      22801
E(X)          110.5            E(Y)          149
Variance(X)   7382.25          Variance(Y)   15189
Std Dev(X)    85.92002095      Std Dev(Y)    123.2436611
(a) E(X) = $110.5, E(Y) = $149
(b) σX = $85.92, σY = $123.24
(c) Stock Y gives the investor a higher expected return than stock X, but also has a higher standard deviation. Risk-averse investors would invest in stock X, whereas risk takers would invest in stock Y.
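The expected values and standard deviations in the 5.7 output are probability-weighted sums, the same computation the Excel SUMPRODUCT formulas in 5.5 and 5.6 perform. The sketch below reproduces them from the probability table above; the function names are illustrative.

```python
import math

# Probability distribution of returns from the 5.7 Excel output.
probs   = [0.1, 0.3, 0.4, 0.2]
stock_x = [-5, 10, 170, 200]
stock_y = [-100, 50, 210, 300]

def expected_value(values, probs):
    """E(X) = sum of x * P(x) over all outcomes."""
    return sum(v * p for v, p in zip(values, probs))

def std_dev(values, probs):
    """Square root of the variance, where variance = sum of (x - E(X))^2 * P(x)."""
    mu = expected_value(values, probs)
    variance = sum((v - mu) ** 2 * p for v, p in zip(values, probs))
    return math.sqrt(variance)

for name, returns in [("Stock X", stock_x), ("Stock Y", stock_y)]:
    print(f"{name}: E = {expected_value(returns, probs):.2f}, "
          f"SD = {std_dev(returns, probs):.2f}")
```

Comparing the two printed lines shows the trade-off described in part (c): the stock with the higher expected return also has the larger standard deviation.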
Copyright ©2024 Pearson Education, Inc.
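The expected return and risk comparison in Problem 5.7 can be reproduced with a short Python sketch. This is an illustration only; the helper name `summarize` is hypothetical, and the outcomes and probabilities are taken from the Excel output above.

```python
from math import sqrt

def summarize(outcomes, probs):
    """Return (expected value, standard deviation) of a discrete distribution."""
    mean = sum(x * p for x, p in zip(outcomes, probs))
    var = sum((x - mean) ** 2 * p for x, p in zip(outcomes, probs))
    return mean, sqrt(var)

probs = [0.1, 0.3, 0.4, 0.2]        # probability of each economic condition
stock_x = [-5, 10, 170, 200]        # returns for stock X
stock_y = [-100, 50, 210, 300]      # returns for stock Y

mean_x, sd_x = summarize(stock_x, probs)
mean_y, sd_y = summarize(stock_y, probs)

print(f"X: E = {mean_x:.1f}, sd = {sd_x:.2f}")
print(f"Y: E = {mean_y:.1f}, sd = {sd_y:.2f}")
```

A higher expected value paired with a higher standard deviation is the trade-off described in part (c).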
5.8
Excel output:
Probability   Bond X   Stock Y   [X-E(X)]^2   [Y-E(Y)]^2
0.01          -200     -700      63731        568516
0.15          -75      -300      16243.5      125316
0.09          30       -100      504.0025     23716
0.35          60       100       57.0025      2116
0.3           100      150       2261.003     9216
0.1           120      350       4563.003     87616
E(X)          52.45       E(Y)          54
Variance(X)   4273.748    Variance(Y)   38884
Std Dev(X)    65.37391    Std Dev(Y)    197.1903
(a) E(Bond fund, X) = $52.45, E(Stock fund, Y) = $54
(b) σ_X = $65.37, σ_Y = $197.19
(c) Based on the expected value criterion, you would choose the common stock fund. However, the common stock fund also has a standard deviation more than three times higher than that for the corporate bond fund. An investor should carefully weigh the increased risk.
(d) If you chose the common stock fund, you would need to assess your reaction to the small possibility that you could lose virtually all of your investment.
5.9
(a) 0.5997 (b) 0.0016 (c) 0.0439 (d) 0.4018
PHStat output for part (d):
Binomial Probabilities
Data
Sample size                           6
Probability of an event of interest   0.83
Statistics
Mean                 4.98
Variance             0.8466
Standard deviation   0.920109
Binomial Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   2.41E-05   2.41E-05   0          0.999976   1
1   0.000707   0.000731   2.41E-05   0.999269   0.999976
2   0.008631   0.009362   0.000731   0.990638   0.999269
3   0.056184   0.065546   0.009362   0.934454   0.990638
4   0.205732   0.271277   0.065546   0.728723   0.934454
5   0.401782   0.67306    0.271277   0.32694    0.728723
6   0.32694    1          0.67306    0          0.32694
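The binomial table for part (d) follows directly from the binomial formula P(X = x) = C(n, x) π^x (1 − π)^(n − x). The sketch below recomputes it in Python rather than PHStat; the function name `binom_pmf` is illustrative, not something defined in the manual.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for a binomial distribution with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.83
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

cumulative = 0.0
for x, px in enumerate(pmf):
    cumulative += px                      # running total gives P(<=X)
    print(f"{x}  P(X)={px:.6f}  P(<=X)={cumulative:.6f}")
```

The remaining columns are simple rearrangements: P(<X) = P(<=X) − P(X), P(>X) = 1 − P(<=X), and P(>=X) = 1 − P(<X).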
5.10
(a) μ = nπ = 4(0.10) = 0.40, σ = √(nπ(1 – π)) = √(4(0.1)(0.9)) = 0.60
(b) μ = nπ = 4(0.40) = 1.60, σ = √(4(0.4)(0.6)) = 0.98
(c) μ = nπ = 5(0.80) = 4.00, σ = √(5(0.8)(0.2)) = 0.894
(d) μ = nπ = 3(0.5) = 1.50, σ = √(3(0.5)(0.5)) = 0.866
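The mean and standard deviation of a binomial distribution, μ = nπ and σ = √(nπ(1 − π)), can be checked for the four (n, π) pairs in Problem 5.10 with a few lines of Python. This is a sketch for verification; the helper name `binom_mean_sd` is hypothetical.

```python
from math import sqrt

def binom_mean_sd(n, p):
    """Return (mean, standard deviation) of a binomial(n, p) distribution."""
    return n * p, sqrt(n * p * (1 - p))

cases = {"a": (4, 0.10), "b": (4, 0.40), "c": (5, 0.80), "d": (3, 0.5)}
for part, (n, p) in cases.items():
    mean, sd = binom_mean_sd(n, p)
    print(f"({part}) mean = {mean:.2f}, sd = {sd:.3f}")
```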
5.11
Given π = 0.5 and n = 5, P(X = 5) = 0.0312.
5.12
PHStat Output:
Binomial Probabilities
Data
Sample size                           6
Probability of an event of interest   0.469
Statistics
Mean                 2.814
Variance             1.4942
Standard deviation   1.2224
Binomial Probabilities Table
X   P(X)
0   0.0224
1   0.1188
2   0.2623
3   0.3089
4   0.2046
5   0.0723
6   0.0106
(a) P(X = 4) = 0.2046
(b) P(X = 6) = 0.0106
(c) P(X ≥ 4) = 0.2046 + 0.0723 + 0.0106 = 0.2876
(d) μ = 2.814, σ = 1.2224
(e) The assumptions are that each American adult either owns an iPhone or does not own an iPhone, and that the next six adults selected are independent.
5.13
PHStat output:
Binomial Probabilities
Data
Sample size                           5
Probability of an event of interest   0.25
Statistics
Mean                 1.25
Variance             0.9375
Standard deviation   0.968246
Binomial Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.237305   0.237305   0          0.762695   1
1   0.395508   0.632813   0.237305   0.367188   0.762695
2   0.263672   0.896484   0.632813   0.103516   0.367188
3   0.087891   0.984375   0.896484   0.015625   0.103516
4   0.014648   0.999023   0.984375   0.000977   0.015625
5   0.000977   1          0.999023   0          0.000977
If π = 0.25 and n = 5,
(a) P(X = 5) = 0.0010
(b) P(X ≥ 4) = P(X = 4) + P(X = 5) = 0.0146 + 0.0010 = 0.0156
(c) P(X = 0) = 0.2373
(d) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.2373 + 0.3955 + 0.2637 = 0.8965
5.14
PHStat Output:
Binomial Probabilities
Data
Sample size                           10
Probability of an event of interest   0.02
Statistics
Mean                 0.2
Variance             0.1960
Standard deviation   0.4427
Binomial Probabilities Table
X    P(X)
0    0.8171
1    0.1667
2    0.0153
3    0.0008
4    0.0000
5    0.0000
6    0.0000
7    0.0000
8    0.0000
9    0.0000
10   0.0000
P(X<=2)   0.9991
P(X>=3)   0.0009
(a) P(X = 0) = 0.8171
(b) P(X = 1) = 0.1667
(c) P(X ≤ 2) = 0.8171 + 0.1667 + 0.0153 = 0.9991
(d) P(X ≥ 3) = 0.0009
5.15
Partial PHStat output:
Binomial Probabilities
Data
Sample size                           20
Probability of an event of interest   0.07
Statistics
Mean                 1.4
Variance             1.3020
Standard deviation   1.1411
Binomial Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.2342   0.2342   0.0000   0.7658   1.0000
1   0.3526   0.5869   0.2342   0.4131   0.7658
2   0.2521   0.8390   0.5869   0.1610   0.4131
(a) μ = 1.4, σ = 1.1411
(b) P(X = 0) = 0.2342
(c) P(X = 1) = 0.3526
(d) P(X ≥ 2) = 0.4131
5.16
Partial PHStat output:
Binomial Probabilities
Data
Sample size                           3
Probability of an event of interest   0.905
Statistics
Mean                 2.715
Variance             0.2579
Standard deviation   0.5079
Binomial Probabilities Table
X   P(X)
0   0.0009
1   0.0245
2   0.2334
3   0.7412
(a) P(X = 3) = 0.7412
(b) P(X = 0) = 0.0009
(c) P(X ≥ 2) = 0.2334 + 0.7412 = 0.9746
(d) μ = 2.715, σ = 0.5079. On the average, over the long run, you theoretically expect 2.715 orders to be filled correctly in a sample of 3 orders with a standard deviation of 0.5079.
(e) McDonald’s has a slightly higher probability of filling orders correctly, and Wendy’s has a slightly lower probability.
5.17
Partial PHStat output:
Binomial Probabilities
Data
Sample size                           3
Probability of an event of interest   0.929
Statistics
Mean                 2.787
Variance             0.1979
Standard deviation   0.4448
Binomial Probabilities Table
X   P(X)
0   0.0004
1   0.0140
2   0.1838
3   0.8018
(a) P(X = 3) = 0.8018
(b) P(X = 0) = 0.0004
(c) P(X ≥ 2) = 0.1838 + 0.8018 = 0.9856
(d) μ = 2.787, σ = 0.4448. On the average, over the long run, you theoretically expect 2.787 orders to be filled correctly in a sample of 3 orders with a standard deviation of 0.4448.
(e) Out of all three fast-food restaurants, McDonald’s has the highest probability of filling orders correctly.
5.18
(a) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   2.5
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
2   0.256516   0.543813   0.287297   0.456187   0.712703
Using the equation, if λ = 2.5, P(X = 2) = e^(–2.5)(2.5)^2 / 2! = 0.2565
(b) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   8
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
8   0.139587   0.592547   0.452961   0.407453   0.547039
If λ = 8.0, P(X = 8) = 0.1396
(c) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   0.5
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.606531   0.606531   0.000000   0.393469   1.000000
1   0.303265   0.909796   0.606531   0.090204   0.393469
If λ = 0.5, P(X = 1) = 0.3033
(d) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   3.7
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.024724   0.024724   0.000000   0.975276   1.000000
If λ = 3.7, P(X = 0) = 0.0247
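Each part of Problem 5.18 applies the Poisson formula P(X = x) = e^(−λ) λ^x / x!. A minimal Python sketch of that formula is shown here as a cross-check on the PHStat values; the function name `poisson_pmf` is illustrative.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

# (x, lambda) pairs for parts (a) through (d)
parts = {"a": (2, 2.5), "b": (8, 8.0), "c": (1, 0.5), "d": (0, 3.7)}
for part, (x, lam) in parts.items():
    print(f"({part}) P(X = {x}) = {poisson_pmf(x, lam):.4f}")
```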
5.19
(a) Partial PHStat output:
Poisson Probabilities
Data
Mean/Expected number of events of interest:   2
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.135335   0.135335   0.000000   0.864665   1.000000
1   0.270671   0.406006   0.135335   0.593994   0.864665
2   0.270671   0.676676   0.406006   0.323324   0.593994
If λ = 2.0, P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – [0.1353 + 0.2707] = 0.5940
(b) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   8
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.000335   0.000335   0.000000   0.999665   1.000000
1   0.002684   0.003019   0.000335   0.996981   0.999665
2   0.010735   0.013754   0.003019   0.986246   0.996981
3   0.028626   0.042380   0.013754   0.957620   0.986246
If λ = 8.0, P(X ≥ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2)] = 1 – [0.0003 + 0.0027 + 0.0107] = 1 – 0.0137 = 0.9863
(c) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   0.5
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.606531   0.606531   0.000000   0.393469   1.000000
1   0.303265   0.909796   0.606531   0.090204   0.393469
If λ = 0.5, P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.6065 + 0.3033 = 0.9098
(d) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   4
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.018316   0.018316   0.000000   0.981684   1.000000
1   0.073263   0.091578   0.018316   0.908422   0.981684
If λ = 4.0, P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.0183 = 0.9817
(e) Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   5
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.006738   0.006738   0.000000   0.993262   1.000000
1   0.033690   0.040428   0.006738   0.959572   0.993262
2   0.084224   0.124652   0.040428   0.875348   0.959572
3   0.140374   0.265026   0.124652   0.734974   0.875348
If λ = 5.0, P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.0067 + 0.0337 + 0.0842 + 0.1404 = 0.2650
5.20
PHStat output for (a) – (d)
Poisson Probabilities Table
X    P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0    0.006738   0.006738   0.000000   0.993262   1.000000
1    0.033690   0.040428   0.006738   0.959572   0.993262
2    0.084224   0.124652   0.040428   0.875348   0.959572
3    0.140374   0.265026   0.124652   0.734974   0.875348
4    0.175467   0.440493   0.265026   0.559507   0.734974
5    0.175467   0.615961   0.440493   0.384039   0.559507
6    0.146223   0.762183   0.615961   0.237817   0.384039
7    0.104445   0.866628   0.762183   0.133372   0.237817
8    0.065278   0.931906   0.866628   0.068094   0.133372
9    0.036266   0.968172   0.931906   0.031828   0.068094
10   0.018133   0.986305   0.968172   0.013695   0.031828
11   0.008242   0.994547   0.986305   0.005453   0.013695
12   0.003434   0.997981   0.994547   0.002019   0.005453
13   0.001321   0.999302   0.997981   0.000698   0.002019
14   0.000472   0.999774   0.999302   0.000226   0.000698
15   0.000157   0.999931   0.999774   0.000069   0.000226
16   0.000049   0.999980   0.999931   0.000020   0.000069
17   0.000014   0.999995   0.999980   0.000005   0.000020
18   0.000004   0.999999   0.999995   0.000001   0.000005
19   0.000001   1.000000   0.999999   0.000000   0.000001
20   0.000000   1.000000   1.000000   0.000000   0.000000
Given λ = 5.0,
(a) P(X = 1) = 0.0337
(b) P(X < 1) = 0.0067
(c) P(X > 1) = 0.9596
(d) P(X ≤ 1) = 0.0404
5.21
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   12
POISSON.DIST Probabilities Table
X    P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0    0.0000   0.0000   0.0000   1.0000   1.0000
1    0.0001   0.0001   0.0000   0.9999   1.0000
2    0.0004   0.0005   0.0001   0.9995   0.9999
3    0.0018   0.0023   0.0005   0.9977   0.9995
4    0.0053   0.0076   0.0023   0.9924   0.9977
5    0.0127   0.0203   0.0076   0.9797   0.9924
6    0.0255   0.0458   0.0203   0.9542   0.9797
7    0.0437   0.0895   0.0458   0.9105   0.9542
8    0.0655   0.1550   0.0895   0.8450   0.9105
9    0.0874   0.2424   0.1550   0.7576   0.8450
10   0.1048   0.3472   0.2424   0.6528   0.7576
11   0.1144   0.4616   0.3472   0.5384   0.6528
12   0.1144   0.5760   0.4616   0.4240   0.5384
13   0.1056   0.6815   0.5760   0.3185   0.4240
14   0.0905   0.7720   0.6815   0.2280   0.3185
(a) λ = 12, P(X = 0) = e^(–12)(12)^0 / 0! = 0.000006 ≈ 0
(b) λ = 12, P(X = 10) = e^(–12)(12)^10 / 10! = 0.1048
(c) P(X ≥ 12) = 1 – P(X < 12) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + ... + P(X = 11)]
    = 1 – [e^(–12)(12)^0 / 0! + e^(–12)(12)^1 / 1! + e^(–12)(12)^2 / 2! + ... + e^(–12)(12)^11 / 11!]
    = 1 – [0 + 0.0001 + 0.0004 + ... + 0.1144] = 1 – 0.4616 = 0.5384
(d) P(X < 13) = P(X = 0) + P(X = 1) + P(X = 2) + ... + P(X = 12)
    = e^(–12)(12)^0 / 0! + e^(–12)(12)^1 / 1! + e^(–12)(12)^2 / 2! + ... + e^(–12)(12)^12 / 12!
    = 0 + 0.0001 + 0.0004 + ... + 0.1144 = 0.5760
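The complement step used in part (c), P(X ≥ 12) = 1 − P(X ≤ 11), can be verified by summing the Poisson formula term by term. The Python sketch below is an illustration; the function names are hypothetical, and λ = 12 is taken from the PHStat output above.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with mean lam."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x), summed term by term from 0 through x."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

lam = 12
p_at_most_11 = poisson_cdf(11, lam)     # P(X <= 11)
p_at_least_12 = 1 - p_at_most_11        # complement rule: P(X >= 12)

print(f"P(X <= 11) = {p_at_most_11:.4f}")
print(f"P(X >= 12) = {p_at_least_12:.4f}")
```

Summing 12 terms and subtracting from 1 is much shorter than summing the upper tail directly, which has no finite last term.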
5.22
(a)–(c) Portion of PHStat output
Data
Average/Expected number of successes:   6
Poisson Probabilities Table
X    P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0    0.002479   0.002479   0.000000   0.997521   1.000000
1    0.014873   0.017351   0.002479   0.982649   0.997521
2    0.044618   0.061969   0.017351   0.938031   0.982649
3    0.089235   0.151204   0.061969   0.848796   0.938031
4    0.133853   0.285057   0.151204   0.714943   0.848796
5    0.160623   0.445680   0.285057   0.554320   0.714943
6    0.160623   0.606303   0.445680   0.393697   0.554320
7    0.137677   0.743980   0.606303   0.256020   0.393697
8    0.103258   0.847237   0.743980   0.152763   0.256020
9    0.068838   0.916076   0.847237   0.083924   0.152763
10   0.041303   0.957379   0.916076   0.042621   0.083924
11   0.022529   0.979908   0.957379   0.020092   0.042621
12   0.011264   0.991173   0.979908   0.008827   0.020092
13   0.005199   0.996372   0.991173   0.003628   0.008827
14   0.002228   0.998600   0.996372   0.001400   0.003628
15   0.000891   0.999491   0.998600   0.000509   0.001400
16   0.000334   0.999825   0.999491   0.000175   0.000509
17   0.000118   0.999943   0.999825   0.000057   0.000175
In the row for X = 5, P(<X) gives the answer to (a), P(X) the answer to (b), and P(>=X) the answer to (c).
(a) P(X < 5) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) + P(X = 4)
    = e^(–6)(6)^0 / 0! + e^(–6)(6)^1 / 1! + e^(–6)(6)^2 / 2! + e^(–6)(6)^3 / 3! + e^(–6)(6)^4 / 4!
    = 0.002479 + 0.014873 + 0.044618 + 0.089235 + 0.133853 = 0.2851
(b) P(X = 5) = e^(–6)(6)^5 / 5! = 0.1606
(c) P(X ≥ 5) = 1 – P(X < 5) = 1 – 0.2851 = 0.7149
(d) P(X = 4 or X = 5) = P(X = 4) + P(X = 5) = e^(–6)(6)^4 / 4! + e^(–6)(6)^5 / 5! = 0.2945
5.23
Partial PHStat output:
Poisson Probabilities
Data
Average/Expected number of successes:   6
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
3   0.089235   0.151204   0.061969   0.848796   0.938031
If λ = 6.0, P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) = 0.0025 + 0.0149 + 0.0446 + 0.0892 = 0.1512
n · P(X ≤ 3) = 100(0.1512) = 15.12, so 15 or 16 cookies will probably be discarded.
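The last step of Problem 5.23 multiplies a Poisson probability by the batch size to get an expected count. A short Python sketch of that arithmetic follows; the variable names are illustrative, and the batch size of 100 and λ = 6 are taken from the solution above.

```python
from math import exp, factorial

lam = 6.0        # mean from the PHStat output above
batch_size = 100 # number of cookies in the batch

# P(X <= 3): sum the Poisson formula for x = 0, 1, 2, 3
p_at_most_3 = sum(exp(-lam) * lam**x / factorial(x) for x in range(4))

# Expected number of cookies in the batch with X <= 3
expected_discarded = batch_size * p_at_most_3

print(f"P(X <= 3) = {p_at_most_3:.4f}")
print(f"Expected discarded = {expected_discarded:.2f}")
```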
5.24
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   0.45
POISSON.DIST Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.6376   0.6376   0.0000   0.3624   1.0000
1   0.2869   0.9246   0.6376   0.0754   0.3624
2   0.0646   0.9891   0.9246   0.0109   0.0754
3   0.0097   0.9988   0.9891   0.0012   0.0109
(a) P(X = 0) = 0.6376
(b) P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.6376 = 0.3624
(c) P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – [0.6376 + 0.2869] = 0.0754
5.25
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   0.99
POISSON.DIST Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.3716   0.3716   0.0000   0.6284   1.0000
1   0.3679   0.7394   0.3716   0.2606   0.6284
2   0.1821   0.9215   0.7394   0.0785   0.2606
3   0.0601   0.9816   0.9215   0.0184   0.0785
(a) P(X = 0) = 0.3716
(b) P(X ≥ 1) = 1 – P(X = 0) = 1 – 0.3716 = 0.6284
(c) P(X ≥ 2) = 1 – [P(X = 0) + P(X = 1)] = 1 – [0.3716 + 0.3679] = 0.2606
5.26
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   4.5
POISSON.DIST Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.0111   0.0111   0.0000   0.9889   1.0000
1   0.0500   0.0611   0.0111   0.9389   0.9889
2   0.1125   0.1736   0.0611   0.8264   0.9389
3   0.1687   0.3423   0.1736   0.6577   0.8264
4   0.1898   0.5321   0.3423   0.4679   0.6577
(a) P(X = 0) = 0.0111
(b) P(X = 1) = 0.0500
(c) P(X > 1) = 1 – [P(X = 0) + P(X = 1)] = 1 – [0.0111 + 0.0500] = 0.9389
(d) P(X < 2) = P(X = 0) + P(X = 1) = 0.0111 + 0.0500 = 0.0611
5.27
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   1.88
POISSON.DIST Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.1526   0.1526   0.0000   0.8474   1.0000
1   0.2869   0.4395   0.1526   0.5605   0.8474
2   0.2697   0.7091   0.4395   0.2909   0.5605
3   0.1690   0.8781   0.7091   0.1219   0.2909
(a) For the number of problems with a 2019 model Ford to be distributed as a Poisson random variable, we need to assume that (i) the probability that a problem occurs in a given Ford is the same for any other new Ford, (ii) the number of problems that a Ford has is independent of the number of problems any other Ford has, and (iii) the probability that two or more problems will occur in some area of a Ford approaches zero as the area becomes smaller. Yes, these assumptions are reasonable in this problem.
(b) P(X = 0) = 0.1526
(c) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.7091
(d) An operational definition for problem can be “a specific feature in the car that is not performing according to its intended designed function.” The operational definition is important in interpreting the initial quality score because different customers can have different expectations of what function a feature is supposed to perform.
5.28
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   1.48
POISSON.DIST Probabilities Table
X   P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0   0.2276   0.2276   0.0000   0.7724   1.0000
1   0.3369   0.5645   0.2276   0.4355   0.7724
2   0.2493   0.8139   0.5645   0.1861   0.4355
3   0.1230   0.9368   0.8139   0.0632   0.1861
(a) P(X = 0) = 0.2276
(b) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.8139
(c) Because Ford had a higher mean rate of problems per car than Hyundai, the probability of a randomly selected Ford having zero problems and the probability of no more than two problems are both lower than for Hyundai.
5.29
Partial PHStat output:
Poisson Probabilities
Data
Mean/Expected number of events of interest:   0.8
Poisson Probabilities Table
X   P(X)       P(<=X)     P(<X)      P(>X)      P(>=X)
0   0.449329   0.449329   0.000000   0.550671   1.000000
1   0.359463   0.808792   0.449329   0.191208   0.550671
2   0.143785   0.952577   0.808792   0.047423   0.191208
3   0.038343   0.990920   0.952577   0.009080   0.047423
4   0.007669   0.998589   0.990920   0.001411   0.009080
5   0.001227   0.999816   0.998589   0.000184   0.001411
6   0.000164   0.999979   0.999816   0.000021   0.000184
7   0.000019   0.999998   0.999979   0.000002   0.000021
8   0.000002   1.000000   0.999998   0.000000   0.000002
(a) For the number of phone calls received in a 1-minute period to be distributed as a Poisson random variable, we need to assume that (i) the probability that a phone call is received in a given 1-minute period is the same for all the other 1-minute periods, (ii) the number of phone calls received in a given 1-minute period is independent of the number of phone calls received in any other 1-minute period, and (iii) the probability that two or more phone calls are received in a time period approaches zero as the length of the time period becomes smaller.
(b) λ = 0.8, P(X = 0) = 0.4493
(c) λ = 0.8, P(X ≥ 3) = 0.0474
(d) λ = 0.8, P(X ≤ 6) = 0.999979. A maximum of 6 phone calls will be received in a 1-minute period 99.99% of the time.
5.30
The expected value is the average of a probability distribution. It is the value that can be expected to occur on the average, in the long run.
5.31
The four properties of a situation that must be present in order to use the binomial distribution are (i) the sample consists of a fixed number of observations, n, (ii) each observation can be classified into one of two mutually exclusive and collectively exhaustive categories, usually called “an event of interest” and “not an event of interest”, (iii) the probability of an observation being classified as “an event of interest”, π, is constant from observation to observation and (iv) the outcome (i.e., “an event of interest” or “not an event of interest”) of any observation is independent of the outcome of any other observation.
5.32
The four properties of a situation that must be present in order to use the Poisson distribution are (i) you are interested in counting the number of times a particular event occurs in a given area of opportunity (defined by time, length, surface area, and so forth), (ii) the probability that an event occurs in a given area of opportunity is the same for all of the areas of opportunity, (iii) the number of events that occur in one area of opportunity is independent of the number of events that occur in other areas of opportunity and (iv) the probability that two or more events will occur in an area of opportunity approaches zero as the area of opportunity becomes smaller.
5.33
(a) PHStat output:
Covariance Analysis
Probabilities & Outcomes:
P       X          Y
0.001   -1000000
0.999   4000
Statistics
E(X)                      2996
E(Y)                      0
Variance(X)               1.01E+09
Standard Deviation(X)     31733.39
Variance(Y)               0
Standard Deviation(Y)     0
Covariance(XY)            0
Variance(X+Y)             1.01E+09
Standard Deviation(X+Y)   31733.39
The expected value of the profit made by the insurance company is $2996.
(b) On average, the promoter will have to pay $4000 while the insurance company will make a profit of $2996. This is not a win-win opportunity.
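The expected profit in Problem 5.33 is a two-outcome expected value. The Python sketch below recomputes E(X) and the standard deviation from the probabilities and outcomes in the PHStat output; the variable names are illustrative.

```python
from math import sqrt

# (profit to the insurance company, probability) for each outcome
outcomes = [(-1_000_000, 0.001), (4_000, 0.999)]

expected_profit = sum(x * p for x, p in outcomes)
variance = sum((x - expected_profit) ** 2 * p for x, p in outcomes)
std_dev = sqrt(variance)

print(f"E(X) = {expected_profit:.0f}")
print(f"Standard Deviation(X) = {std_dev:.2f}")
```

The standard deviation is more than ten times the expected profit, which reflects how much of the result depends on the rare large loss.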
5.34
(a) 0.675
(b) 0.675
Excel Output
Binomial Probabilities
Data
Sample size                           5
Probability of an event of interest   0.675
Statistics
Mean                 3.375
Variance             1.0969
Standard deviation   1.0473
Binomial Probabilities Table
X   P(X)
0   0.0036
1   0.0377
2   0.1564
3   0.3248
4   0.3373
5   0.1401
π = 0.675, n = 5
(c) P(X = 4) = 0.3373
(d) P(X = 0) = 0.0036
(e) Stock prices tend to rise in the years when the economy is expanding and fall in the years of recession or contraction. Hence, the probability that the price will rise in one year is not independent from year to year.
5.35
Excel Output
Binomial Probabilities
Data
Sample size                           10
Probability of an event of interest   0.81
Statistics
Mean                 8.1
Variance             1.5390
Standard deviation   1.2406
Binomial Probabilities Table
X    P(X)
0    0.0000
1    0.0000
2    0.0001
3    0.0006
4    0.0043
5    0.0218
6    0.0773
7    0.1883
8    0.3010
9    0.2852
10   0.1216
(a) P(X = 8) = 0.3010
(b) P(X ≥ 8) = P(X = 8) + P(X = 9) + P(X = 10) = 0.7078
(c) P(X ≤ 6) = P(X = 0) + P(X = 1) + ... + P(X = 6) = 0.1039
(d) The probability that only three respondents use two or more social media channels is approximately 0. If the probability that a retail brand uses two or more social media channels for business is 0.81, it is essentially impossible that only three businesses in 10 would use two or more social media channels. We might conclude that this geographical area has very limited internet access and that it is not appropriate to use the model in this area.
5.36
(a) Excel Output
Binomial Probabilities
Data
Sample size                           15
Probability of an event of interest   0.5
Statistics
Mean                 7.5
Variance             3.7500
Standard deviation   1.9365
Binomial Probabilities Table
X    P(X)
0    0.0000
1    0.0005
2    0.0032
3    0.0139
4    0.0417
5    0.0916
6    0.1527
7    0.1964
8    0.1964
9    0.1527
10   0.0916
11   0.0417
12   0.0139
13   0.0032
14   0.0005
P(X ≥ 12) = P(X = 12) + P(X = 13) + P(X = 14) = 0.0175
(b) Excel Output
Binomial Probabilities
Data
Sample size                           15
Probability of an event of interest   0.75
Statistics
Mean                 11.25
Variance             2.8125
Standard deviation   1.6771
Binomial Probabilities Table
X    P(X)
0    0.0000
1    0.0000
2    0.0000
3    0.0000
4    0.0001
5    0.0007
6    0.0034
7    0.0131
8    0.0393
9    0.0917
10   0.1651
11   0.2252
12   0.2252
13   0.1559
14   0.0668
15   0.0134
P(X ≥ 12) = P(X = 12) + P(X = 13) + P(X = 14) + P(X = 15) = 0.4613
5.37
Excel Output:
Binomial Probabilities
Data
Sample size                           10
Probability of an event of interest   0.8
Statistics
Mean                 8
Variance             1.6000
Standard deviation   1.2649
Binomial Probabilities Table
X    P(X)
0    0.0000
1    0.0000
2    0.0001
3    0.0008
4    0.0055
5    0.0264
6    0.0881
7    0.2013
8    0.3020
9    0.2684
10   0.1074
(a) P(X = 0) = 0.0000
(b) P(X = 5) = 0.0264
(c) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.9672
(d) μ = 8, σ = 1.2649
5.38
Excel Output:
Binomial Probabilities
Data
Sample size                           10
Probability of an event of interest   0.3
Statistics
Mean                 3
Variance             2.1000
Standard deviation   1.4491
Binomial Probabilities Table
X    P(X)
0    0.0282
1    0.1211
2    0.2335
3    0.2668
4    0.2001
5    0.1029
6    0.0368
7    0.0090
8    0.0014
9    0.0001
10   0.0000
(a) P(X = 0) = 0.0282
(b) P(X = 5) = 0.1029
(c) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.0473
(d) μ = 3, σ = 1.4491
(e) Since the percentage of bills containing an error is lower in this problem, the probability is higher in (a) and (b) of this problem and lower in (c).
5.39
Excel Output:
Binomial Probabilities
Data
Sample size                           10
Probability of an event of interest   0.28
Statistics
Mean                 2.8
Variance             2.0160
Standard deviation   1.4199
Binomial Probabilities Table
X    P(X)
0    0.0374
1    0.1456
2    0.2548
3    0.2642
4    0.1798
5    0.0839
6    0.0272
7    0.0060
8    0.0009
9    0.0001
10   0.0000
(a) P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = 0.0342
(b) P(X < 2) = P(X = 0) + P(X = 1) = 0.1830
(c) P(X = 0) = 0.0374
(d) The assumptions needed are (i) there are only two mutually exclusive and collectively exhaustive outcomes, “one-word searches” or “not one-word searches,” (ii) the probabilities are constant, and (iii) the outcomes are independent.
5.40
Excel Output:
Binomial Probabilities
Data
Sample size                           20
Probability of an event of interest   0.62
Statistics
Mean                 12.4
Variance             4.7120
Standard deviation   2.1707
Binomial Probabilities Table
X    P(X)
0    0.0000
1    0.0000
2    0.0000
3    0.0000
4    0.0001
5    0.0007
6    0.0029
7    0.0094
8    0.0249
9    0.0542
10   0.0974
11   0.1444
12   0.1767
13   0.1774
14   0.1447
15   0.0945
16   0.0482
17   0.0185
18   0.0050
19   0.0009
20   0.0001
(a) E(X) = 12.4
(b) σ = 2.1707
(c) P(X = 10) = 0.0974
(d) P(X ≤ 5) = P(X = 0) + P(X = 1) + ... + P(X = 5) = 0.0009
(e) P(X ≥ 5) = P(X = 5) + P(X = 6) + ... + P(X = 20) = 0.9998
5.41
Excel Output
Binomial Probabilities
Data
Sample size                           20
Probability of an event of interest   0.11
Statistics
Mean                 2.2
Variance             1.9580
Standard deviation   1.3993
Binomial Probabilities Table
X    P(X)
0    0.0972
1    0.2403
2    0.2822
3    0.2093
4    0.1099
5    0.0435
6    0.0134
7    0.0033
8    0.0007
9    0.0001
10   0.0000
11   0.0000
12   0.0000
13   0.0000
14   0.0000
15   0.0000
16   0.0000
17   0.0000
18   0.0000
19   0.0000
20   0.0000
(a) E(X) = 2.2
(b) σ = 1.3993
(c) P(X = 0) = 0.0972
(d) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.6198
(e) P(X ≥ 3) = P(X = 3) + P(X = 4) + ... + P(X = 20) = 0.3802
Alternatively, P(X ≥ 3) = 1 – P(X ≤ 2) = 1 – 0.6198 = 0.3802
5.42
(a) Partial Excel Output:
Binomial Probabilities
Data
Sample size                           47
Probability of an event of interest   0.5
Statistics
Mean                 23.5
Variance             11.7500
Standard deviation   3.4278
Binomial Probabilities Table
X    P(X)
38   0.0000
39   0.0000
40   0.0000
41   0.0000
42   0.0000
43   0.0000
44   0.0000
45   0.0000
46   0.0000
47   0.0000
π = 0.50, P(X ≥ 39) = 0.0000
(b) Partial Excel Output:
Binomial Probabilities
Data
Sample size                           47
Probability of an event of interest   0.7
Statistics
Mean                 32.9
Variance             9.8700
Standard deviation   3.1417
Binomial Probabilities Table
X    P(X)
38   0.0348
39   0.0188
40   0.0088
41   0.0035
42   0.0012
43   0.0003
44   0.0001
45   0.0000
46   0.0000
47   0.0000
π = 0.70, P(X ≥ 39) = 0.0326
(c) Partial Excel Output:
Binomial Probabilities
Data
Sample size                           47
Probability of an event of interest   0.9
Statistics
Mean                 42.3
Variance             4.2300
Standard deviation   2.0567
Binomial Probabilities Table
X    P(X)
38   0.0249
39   0.0516
40   0.0930
41   0.1428
42   0.1837
43   0.1922
44   0.1572
45   0.0943
46   0.0369
47   0.0071
π = 0.90, P(X ≥ 39) = 0.9589
(d) Based on the results in (a)–(c), the probability that the Standard & Poor’s 500 index will increase if there is an early gain in the first five trading days of the year is very likely to be close to 0.90, because that value yields a probability of 95.89% that the Standard & Poor’s 500 index will increase for the entire year in at least 39 of the 47 years.
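The comparison in Problem 5.42 evaluates the same upper-tail probability, P(X ≥ 39) with n = 47, under three candidate values of π. The Python sketch below is a cross-check on the Excel results, not the manual's method; the function name `binom_upper_tail` is hypothetical.

```python
from math import comb

def binom_upper_tail(k, n, p):
    """P(X >= k) for a binomial(n, p) distribution, summed term by term."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

n, k = 47, 39
tails = {p: binom_upper_tail(k, n, p) for p in (0.50, 0.70, 0.90)}

for p, tail in tails.items():
    print(f"pi = {p:.2f}: P(X >= {k}) = {tail:.4f}")
```

Only π = 0.90 makes an outcome of 39 or more increases in 47 years likely, which is the reasoning behind part (d).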
5.43
Excel Output:
Binomial Probabilities
Data
Sample size                           55
Probability of an event of interest   0.5
Statistics
Mean                 27.5
Variance             13.7500
Standard deviation   3.7081
Binomial Probabilities Table
X    P(X)
37   0.0040
38   0.0019
39   0.0008
40   0.0003
41   0.0001
42   0.0000
43   0.0000
44   0.0000
45   0.0000
46   0.0000
47   0.0000
48   0.0000
49   0.0000
50   0.0000
51   0.0000
52   0.0000
53   0.0000
54   0.0000
55   0.0000
(a) π = 0.50, P(X ≥ 38) = 0.0032
(b) It is ludicrous to believe that there is a correlation between the performance of the stock market and the winner of a Super Bowl. If the indicator is a random event, the probability of making a correct prediction 38 or more times out of 55 trials is nearly zero.
5.44
Portion of PHStat Output:
POISSON.DIST Probabilities
Data
Mean/Expected number of events of interest:   3
POISSON.DIST Probabilities Table
X    P(X)     P(<=X)   P(<X)    P(>X)    P(>=X)
0    0.0498   0.0498   0.0000   0.9502   1.0000
1    0.1494   0.1991   0.0498   0.8009   0.9502
2    0.2240   0.4232   0.1991   0.5768   0.8009
3    0.2240   0.6472   0.4232   0.3528   0.5768
4    0.1680   0.8153   0.6472   0.1847   0.3528
5    0.1008   0.9161   0.8153   0.0839   0.1847
6    0.0504   0.9665   0.9161   0.0335   0.0839
7    0.0216   0.9881   0.9665   0.0119   0.0335
8    0.0081   0.9962   0.9881   0.0038   0.0119
9    0.0027   0.9989   0.9962   0.0011   0.0038
10   0.0008   0.9997   0.9989   0.0003   0.0011
11   0.0002   0.9999   0.9997   0.0001   0.0003
12   0.0001   1.0000   0.9999   0.0000   0.0001
13   0.0000   1.0000   1.0000   0.0000   0.0000
(a) The assumptions needed are (i) the probability that a questionable claim is referred by an investigator is constant, (ii) the probability that a questionable claim is referred by an investigator approaches 0 as the interval gets smaller, and (iii) the probability that a questionable claim is referred by an investigator is independent from interval to interval.
(b) P(X = 5) = 0.1008
(c) P(X ≤ 10) = P(X = 0) + P(X = 1) + ... + P(X = 10) = 0.9997
(d) P(X ≥ 11) = 1 – P(X ≤ 10) = 1 – 0.9997 = 0.0003
Chapter 6
6.1
PHStat output:
Normal Probabilities
Common Data
Mean                 0
Standard Deviation   1
Probability for X <=
X Value      1.47
Z Value      1.47
P(X<=1.47)   0.9292
Probability for X >
X Value     1.94
Z Value     1.94
P(X>1.94)   0.0262
Probability for X<1.47 or X >1.94
P(X<1.47 or X >1.94)   0.9554
Probability for a Range
From X Value       1.47
To X Value         1.94
Z Value for 1.47   1.47
Z Value for 1.94   1.94
P(X<=1.47)         0.9292
P(X<=1.94)         0.9738
P(1.47<=X<=1.94)   0.0446
(a) P(Z < 1.47) = 0.9292
(b) P(Z > 1.94) = 0.0262
(c) P(1.47 < Z < 1.94) = 0.9738 – 0.9292 = 0.0446
(d) P(Z < 1.47) + P(Z > 1.94) = 0.9292 + (1 – 0.9738) = 0.9554
6.2
PHStat output: Normal Probabilities
Common Data
Mean                      0
Standard Deviation        1

Probability for X <=
X Value                   –1.57
Z Value                   –1.57
P(X<=–1.57)               0.0582076

Probability for X >
X Value                   1.84
Z Value                   1.84
P(X>1.84)                 0.0329

Probability for X<–1.57 or X >1.84
P(X<–1.57 or X >1.84)     0.0911

Probability for a Range
From X Value              1.57
To X Value                1.84
Z Value for 1.57          1.57
Z Value for 1.84          1.84
P(X<=1.57)                0.9418
P(X<=1.84)                0.9671
P(1.57<=X<=1.84)          0.0253

Find X and Z Given Cum. Pctage.
Cumulative Percentage     84.13%
Z Value                   0.999815
X Value                   0.999815

(a) P(–1.57 < Z < 1.84) = 0.9671 – 0.0582 = 0.9089
(b) P(Z < –1.57) + P(Z > 1.84) = 0.0582 + 0.0329 = 0.0911
(c) If P(Z > A) = 0.025, P(Z < A) = 0.975. A = +1.96.
(d) If P(–A < Z < A) = 0.6826, P(Z < A) = 0.8413. So 68.26% of the area is captured between –A = –1.00 and A = +1.00.

6.3
PHStat output: Normal Probabilities
Common Data
Mean                      0
Standard Deviation        1

Probability for X <=
X Value                   1.18
Z Value                   1.18
P(X<=1.18)                0.8810

Probability for X >
X Value                   -0.31
Z Value                   -0.31
P(X>-0.31)                0.6217

Probability for X<1.18 or X >-0.31
P(X<1.18 or X >-0.31)     1.5027

Probability for a Range
From X Value              -0.31
To X Value                0
Z Value for -0.31         -0.31
Z Value for 0             0
P(X<=-0.31)               0.3783
P(X<=0)                   0.5000
P(-0.31<=X<=0)            0.1217

(a) P(Z < 1.18) = 0.8810
(b) P(Z > –0.31) = 0.6217
(c) P(–0.31 < Z < 0) = 0.5000 – 0.3783 = 0.1217
(d) P(Z < –0.31) + P(Z > 1.18) = (1 – 0.6217) + (1 – 0.8810) = 0.4973

6.4
PHStat output: Normal Probabilities
Common Data
Mean                      0
Standard Deviation        1

Probability for X <=
X Value                   –0.21
Z Value                   –0.21
P(X<=–0.21)               0.4168338

Probability for X >
X Value                   1.08
Z Value                   1.08
P(X>1.08)                 0.1401

Probability for X<–0.21 or X >1.08
P(X<–0.21 or X >1.08)     0.5569

Probability for a Range
From X Value              –1.96
To X Value                –0.21
Z Value for –1.96         –1.96
Z Value for –0.21         –0.21
P(X<=–1.96)               0.0250
P(X<=–0.21)               0.4168
P(–1.96<=X<=–0.21)        0.3918

Find X and Z Given Cum. Pctage.
Cumulative Percentage     84.13%
Z Value                   0.999815
X Value                   0.999815

(a) P(Z > 1.08) = 1 – 0.8599 = 0.1401
(b) P(Z < –0.21) = 0.4168
(c) P(–1.96 < Z < –0.21) = 0.4168 – 0.0250 = 0.3918
(d) P(Z > A) = 0.1587, P(Z < A) = 0.8413. A = +1.00.

6.5
Partial PHStat output:
Normal Probabilities
Common Data
Mean                      100
Standard Deviation        10

Probability for X <=
X Value                   68
Z Value                   -3.2
P(X<=68)                  0.0007

Probability for X >
X Value                   78
Z Value                   -2.2
P(X>78)                   0.9861

(a) Z = (X – μ)/σ = (78 – 100)/10 = –2.20
    P(X > 78) = P(Z > –2.20) = 0.9861
(b) Z = (X – μ)/σ = (68 – 100)/10 = –3.20
    P(X < 68) = P(Z < –3.20) = 0.0007
(c) Partial PHStat output:
Normal Probabilities
Common Data
Mean                      100
Standard Deviation        10

Probability for X <=
X Value                   78
Z Value                   -2.2
P(X<=78)                  0.0139

Probability for X >
X Value                   100
Z Value                   0
P(X>100)                  0.5000

Probability for X<78 or X >100
P(X<78 or X >100)         0.5139

P(X < 78) + P(X > 100) = P(Z < –2.20) + P(Z > 0) = 0.0139 + 0.5 = 0.5139
(d) Partial PHStat output:
Find X Values Given a Percentage
Percentage                80.00%
Z Value                   -1.28
Lower X Value             87.18
Upper X Value             112.82

P(Xlower < X < Xupper) = 0.80
P(Z < –1.28) = 0.10 and P(Z < 1.28) = 0.90
Z = –1.28 = (Xlower – 100)/10 and Z = +1.28 = (Xupper – 100)/10
Xlower = 100 – 1.28(10) = 87.20 and Xupper = 100 + 1.28(10) = 112.80
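The symmetric-interval calculation in part (d) can be sketched in Python with the standard library's statistics.NormalDist. The function name central_interval is a hypothetical helper introduced only for illustration. Because it uses the unrounded Z value (about 1.2816) rather than 1.28, it should land on the PHStat bounds (87.18 and 112.82) rather than the hand-rounded 87.20 and 112.80.

```python
from statistics import NormalDist


def central_interval(mean: float, sd: float, coverage: float) -> tuple[float, float]:
    """Return (lower, upper) X values that bound the middle `coverage`
    share of a normal distribution, symmetric around the mean."""
    tail = (1.0 - coverage) / 2.0          # area left in each tail
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.inv_cdf(tail), dist.inv_cdf(1.0 - tail)


# Mean 100 and standard deviation 10, middle 80% of the values.
lower, upper = central_interval(100, 10, 0.80)
print(f"Lower X = {lower:.2f}, Upper X = {upper:.2f}")
```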
6.6
(a) Partial PHStat output:
Common Data
Mean                      50
Standard Deviation        4

Probability for X <=
X Value                   42
Z Value                   –2
P(X<=42)                  0.0227501

Probability for X >
X Value                   43
Z Value                   –1.75
P(X>43)                   0.9599

Probability for X<42 or X >43
P(X<42 or X >43)          0.9827

Probability for a Range
From X Value              42
To X Value                43
Z Value for 42            –2
Z Value for 43            –1.75
P(X<=42)                  0.0228
P(X<=43)                  0.0401
P(42<=X<=43)              0.0173

Find X and Z Given Cum. Pctage.
Cumulative Percentage     5.00%
Z Value                   –1.644854
X Value                   43.42059

P(X > 43) = P(Z > –1.75) = 1 – 0.0401 = 0.9599
(b) P(X < 42) = P(Z < –2.00) = 0.0228
(c) P(X < A) = 0.05, Z = –1.645 = (A – 50)/4, A = 50 – 1.645(4) = 43.42
(d) Partial PHStat output:
Find X and Z Given Cum. Pctage.
Cumulative Percentage     80.00%
Z Value                   0.841621
X Value                   53.36648

P(Xlower < X < Xupper) = 0.60
P(Z < –0.84) = 0.20 and P(Z < 0.84) = 0.80
Z = –0.84 = (Xlower – 50)/4 and Z = +0.84 = (Xupper – 50)/4
Xlower = 50 – 0.84(4) = 46.64 and Xupper = 50 + 0.84(4) = 53.36

6.7
μ = 45.2, σ = 10
(a) P(X > 33) = P(Z > –1.22) = 0.8888
Probability for X >
X Value                   33
Z Value                   -1.22
P(X>33)                   0.8888
(b) P(10 < X < 20) = P(–3.52 < Z < –2.52) = 0.0057
Probability for a Range
From X Value              10
To X Value                20
Z Value for 10            -3.52
Z Value for 20            -2.52
P(X<=10)                  0.0002
P(X<=20)                  0.0059
P(10<=X<=20)              0.0057
(c) P(X < 10) = P(Z < –3.52) = 0.0002
Probability for X <=
X Value                   10
Z Value                   -3.52
P(X<=10)                  0.0002
(d) P(X < A) = 0.99, Z = 2.3263 = (A – 45.2)/10, A = 68.4635
Find X and Z Given Cum. Pctage.
Cumulative Percentage     99.00%
Z Value                   2.3263
X Value                   68.4635

6.8
μ = 43,647, σ = 10,000
(a) P(34,000 < X < 50,000) = P(–0.9647 < Z < 0.6353) = 0.7374 – 0.1673 = 0.5700
Probability for a Range
From X Value              34000
To X Value                50000
Z Value for 34000         -0.9647
Z Value for 50000         0.6353
P(X<=34000)               0.1673
P(X<=50000)               0.7374
P(34000<=X<=50000)        0.5700
(b) P(X < 30,000) + P(X > 60,000) = P(Z < –1.3647) + P(Z > 1.6353) = 0.0862 + 0.0510 = 0.1372
Probability for X <=
X Value                   30000
Z Value                   -1.3647
P(X<=30000)               0.0862
Probability for X >
X Value                   60000
Z Value                   1.6353
P(X>60000)                0.0510
Probability for X<30000 or X >60000
P(X<30000 or X >60000)    0.1372
(c) P(X < A) = 0.80, Z = 0.8416 = (A – 43,647)/10,000, A = 52,063.2123
Find X and Z Given Cum. Pctage.
Cumulative Percentage     80.00%
Z Value                   0.8416
X Value                   52063.2123
(d) μ = 43,647, σ = 12,000
(a) P(34,000 < X < 50,000) = P(–0.8039 < Z < 0.5294) = 0.7017 – 0.2107 = 0.4910
Probability for a Range
From X Value              34000
To X Value                50000
Z Value for 34000         -0.803917
Z Value for 50000         0.5294167
P(X<=34000)               0.2107
P(X<=50000)               0.7017
P(34000<=X<=50000)        0.4910
(b) P(X < 30,000) + P(X > 60,000) = P(Z < –1.137) + P(Z > 1.363) = 0.1277 + 0.0865 = 0.2142
Probability for X <=
X Value                   30000
Z Value                   -1.13725
P(X<=30000)               0.1277
Probability for X >
X Value                   60000
Z Value                   1.36275
P(X>60000)                0.0865
Probability for X<30000 or X >60000
P(X<30000 or X >60000)    0.2142
(c) P(X < A) = 0.80, Z = 0.8416 = (A – 43,647)/12,000, A = 53,746.4548
Find X and Z Given Cum. Pctage.
Cumulative Percentage     80.00%
Z Value                   0.8416
X Value                   53746.4548

6.9
μ = 139.33, σ = 25
(a) P(X > 100) = P(Z > –1.5732) = 0.9422
Probability for X >
X Value                   100
Z Value                   -1.5732
P(X>100)                  0.9422
(b) P(100 < X < 200) = P(–1.5732 < Z < 2.4268) = 0.9924 – 0.0578 = 0.9345
Probability for a Range
From X Value              100
To X Value                200
Z Value for 100           -1.5732
Z Value for 200           2.4268
P(X<=100)                 0.0578
P(X<=200)                 0.9924
P(100<=X<=200)            0.9345
(c) P(Xlower < X < Xupper) = 0.95
P(Z < –1.96) = 0.0250 and P(Z < 1.96) = 0.975
Z = –1.96 = (Xlower – 139.33)/25 and Z = +1.96 = (Xupper – 139.33)/25
Xlower = 139.33 – 1.96(25) = 90.33 and Xupper = 139.33 + 1.96(25) = 188.33
Find X Values Given a Percentage
Percentage                95.00%
Z Value                   -1.96
Lower X Value             90.33
Upper X Value             188.33

6.10
PHStat output:
Common Data
Mean                      73
Standard Deviation        8

Probability for X <=
X Value                   91
Z Value                   2.25
P(X<=91)                  0.9877755

Probability for X >
X Value                   81
Z Value                   1
P(X>81)                   0.1587

Probability for X<91 or X >81
P(X<91 or X >81)          1.1464

Probability for a Range
From X Value              65
To X Value                89
Z Value for 65            –1
Z Value for 89            2
P(X<=65)                  0.1587
P(X<=89)                  0.9772
P(65<=X<=89)              0.8186

Find X and Z Given Cum. Pctage.
Cumulative Percentage     95.00%
Z Value                   1.644854
X Value                   86.15883

(a) P(X < 91) = P(Z < 2.25) = 0.9878
(b) P(65 < X < 89) = P(–1.00 < Z < 2.00) = 0.9772 – 0.1587 = 0.8185
(c) P(X > A) = 0.05, P(Z < 1.645) = 0.9500, Z = 1.645 = (A – 73)/8, A = 73 + 1.645(8) = 86.16%
(d) Option 1: P(X > A) = 0.10, P(Z < 1.28) ≈ 0.9000, Z = (81 – 73)/8 = 1.00. Since your score of 81% on this exam represents a Z-score of 1.00, which is below the minimum Z-score of 1.28, you will not earn an "A" grade on the exam under this grading option.
Option 2: Z = (68 – 62)/3 = 2.00. Since your score of 68% on this exam represents a Z-score of 2.00, which is well above the minimum Z-score of 1.28, you will earn an "A" grade on the exam under this grading option. You should prefer Option 2.
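The comparison of the two grading options in part (d) comes down to converting each score to a Z-score and comparing it with the Z value that cuts off the top 10%. A minimal sketch, assuming the exam means and standard deviations shown above (73 and 8 for Option 1, 62 and 3 for Option 2); the function name z_score is hypothetical:

```python
from statistics import NormalDist


def z_score(x: float, mean: float, sd: float) -> float:
    """Standardize a score: Z = (X - mean) / sd."""
    return (x - mean) / sd


# Z value with 90% of the area below it (the top-10% cutoff), about 1.28.
z_cutoff = NormalDist().inv_cdf(0.90)

z_option1 = z_score(81, 73, 8)   # score of 81% on the first exam
z_option2 = z_score(68, 62, 3)   # score of 68% on the second exam

earns_a_option1 = z_option1 >= z_cutoff
earns_a_option2 = z_option2 >= z_cutoff

print(f"cutoff Z = {z_cutoff:.4f}")
print(f"Option 1: Z = {z_option1:.2f}, earns an A: {earns_a_option1}")
print(f"Option 2: Z = {z_option2:.2f}, earns an A: {earns_a_option2}")
```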
6.11
μ = 37.1, σ = 12
(a) P(X > 33) = P(Z > –0.3417) = 0.6337
Probability for X >
X Value                   33
Z Value                   -0.341667
P(X>33)                   0.6337
(b) P(20 < X < 30) = P(–1.425 < Z < –0.592) = 0.2770 – 0.0771 = 0.2000
Probability for a Range
From X Value              20
To X Value                30
Z Value for 20            -1.425
Z Value for 30            -0.591667
P(X<=20)                  0.0771
P(X<=30)                  0.2770
P(20<=X<=30)              0.2000
(c) P(X < 20) = P(Z < –1.425) = 0.0771
Probability for X <=
X Value                   20
Z Value                   -1.425
P(X<=20)                  0.0771
(d) P(X < A) = 0.99, Z = 2.3263 = (A – 37.1)/12, A = 65.0162
Find X and Z Given Cum. Pctage.
Cumulative Percentage     99.00%
Z Value                   2.3263
X Value                   65.0162
(e) The per capita consumption of bottled water in Germany is much lower than the per capita consumption of bottled water in the United States.
6.12
μ = 39.6, σ = 8
(a) P(X > 50) = P(Z > 1.3) = 0.0968
Probability for X >
X Value                   50
Z Value                   1.3
P(X>50)                   0.0968
(b) P(25 < X < 40) = P(–1.825 < Z < 0.05) = 0.5199 – 0.0340 = 0.4859
Probability for a Range
From X Value              25
To X Value                40
Z Value for 25            -1.825
Z Value for 40            0.05
P(X<=25)                  0.0340
P(X<=40)                  0.5199
P(25<=X<=40)              0.4859
(c) P(X < 10) = P(Z < –3.7) = 0.0001
Probability for X <=
X Value                   10
Z Value                   -3.7
P(X<=10)                  0.0001
(d) P(X < A) = 0.99, Z = 2.3263 = (A – 39.6)/8, A = 58.2108
Find X and Z Given Cum. Pctage.
Cumulative Percentage     99.00%
Z Value                   2.3263
X Value                   58.2108
6.13
(a) Partial PHStat output:
Probability for a Range
From X Value              21.99
To X Value                22
Z Value for 21.99         –2.4
Z Value for 22            –0.4
P(X<=21.99)               0.0082
P(X<=22)                  0.3446
P(21.99<=X<=22)           0.3364
P(21.99 < X < 22.00) = P(–2.4 < Z < –0.4) = 0.3364
(b) Partial PHStat output:
Probability for a Range
From X Value              21.99
To X Value                22.01
Z Value for 21.99         –2.4
Z Value for 22.01         1.6
P(X<=21.99)               0.0082
P(X<=22.01)               0.9452
P(21.99<=X<=22.01)        0.9370
P(21.99 < X < 22.01) = P(–2.4 < Z < 1.6) = 0.9370
(c) Partial PHStat output:
Find X and Z Given Cum. Pctage.
Cumulative Percentage     98.00%
Z Value                   2.05375
X Value                   22.0123
P(X > A) = 0.02, Z = 2.05, A = 22.0123
(d) (a) Partial PHStat output:
Probability for a Range
From X Value              21.99
To X Value                22
Z Value for 21.99         –3
Z Value for 22            –0.5
P(X<=21.99)               0.0013
P(X<=22)                  0.3085
P(21.99<=X<=22)           0.3072
P(21.99 < X < 22.00) = P(–3.0 < Z < –0.5) = 0.3072
(b) Partial PHStat output:
Probability for a Range
From X Value              21.99
To X Value                22.01
Z Value for 21.99         –3
Z Value for 22.01         2
P(X<=21.99)               0.0013
P(X<=22.01)               0.9772
P(21.99<=X<=22.01)        0.9759
P(21.99 < X < 22.01) = P(–3.0 < Z < 2) = 0.9759
(c) Partial PHStat output:
Find X and Z Given Cum. Pctage.
Cumulative Percentage     98.00%
Z Value                   2.05375
X Value                   22.0102
P(X > A) = 0.02, Z = 2.05, A = 22.0102
6.14
With 39 values, the smallest of the standard normal quantile values covers an area under the normal curve of 0.025. The corresponding Z value is –1.96. The middle (20th) value has a cumulative area of 0.50 and a corresponding Z value of 0.0. The largest of the standard normal quantile values covers an area under the normal curve of 0.975, and its corresponding Z value is +1.96.
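The quantile values described above follow from giving the i-th smallest of n values a cumulative area of i/(n + 1) and inverting the standard normal distribution at that area. A short sketch with the Python standard library; the function name normal_quantile_values is hypothetical:

```python
from statistics import NormalDist


def normal_quantile_values(n: int) -> list[float]:
    """Standard normal quantile values for a sample of size n.

    The i-th smallest value (i = 1..n) is assigned the cumulative
    area i / (n + 1), which is then converted to a Z value.
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(i / (n + 1)) for i in range(1, n + 1)]


q39 = normal_quantile_values(39)
print(f"smallest: area = {1 / 40:.3f}, Z = {q39[0]:.2f}")
print(f"middle:   area = {20 / 40:.3f}, Z = {q39[19]:.2f}")
print(f"largest:  area = {39 / 40:.3f}, Z = {q39[-1]:.2f}")
```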
6.15
Area under normal curve covered:     0.1429  0.2857  0.4286  0.5714  0.7143  0.8571
Standardized normal quantile value:  –1.07   –0.57   –0.18   +0.18   +0.57   +1.07
6.16
(a) Excel output: Before Halftime Ad Ratings
Descriptive Summary       Rating
Mean                      5.789286
Median                    5.8
Mode                      5.3
Minimum                   4.5
Maximum                   7.4
Range                     2.9
Variance                  0.4928
Standard Deviation        0.7020
Coeff. of Variation       12.13%
Skewness                  0.2392
Kurtosis                  -0.5132
Count                     28
Standard Error            0.1327

Five-Number Summary
Minimum                   4.5
First Quartile            5.3
Median                    5.8
Third Quartile            6.4
Maximum                   7.4
IQR                       1.1
1.33S                     0.933698
6*Std dev                 4.212171

Super Bowl Ad Ratings First and Second Period: Mean = 5.7893, median = 5.8, S = 0.7020, range = 2.9, 6S = 4.212, interquartile range = 1.1, 1.33(0.7020) = 0.9337. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is more than 1.33S. The skewness statistic is 0.2392, indicating a symmetric distribution, and the kurtosis statistic is –0.5132, indicating a platykurtic distribution.
(a) Excel output: Halftime and Afterwards Ad Ratings
Descriptive Summary       Rating
Mean                      5.534483
Median                    5.6
Mode                      5.3
Minimum                   4
Maximum                   7.3
Range                     3.3
Variance                  0.5031
Standard Deviation        0.7093
Coeff. of Variation       12.82%
Skewness                  0.1578
Kurtosis                  0.5515
Count                     29
Standard Error            0.1317

Five-Number Summary
Minimum                   4
First Quartile            5.05
Median                    5.6
Third Quartile            5.95
Maximum                   7.3
IQR                       0.9
1.33S                     0.94332
6*Std dev                 4.255579
Super Bowl Ad Ratings Halftime and Afterward: (a) Mean = 5.5345, median = 5.6, S = 0.7093, range = 3.3, 6S = 4.2556, interquartile range = 0.9, 1.33(0.7093) = 0.9433. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is close to 1.33S. The skewness statistic is 0.1578, indicating an approximately symmetric distribution, and the kurtosis statistic is 0.5515, indicating a platykurtic distribution.
(b) The data for the halftime and afterwards appear to follow a normal distribution.
6.17
(a) Excel output:
Descriptive Summary       Team Value    Payroll        Wins
Mean                      2.074         147.2133333    80.96666667
Median                    1.73          141.555        81
Mode                      #N/A          #N/A           77
Minimum                   0.99          48.06          52
Maximum                   6             284.73         107
Range                     5.01          236.67         55
Variance                  1.2997        3869.0060      209.3437
Standard Deviation        1.1401        62.2013        14.4687
Coeff. of Variation       54.97%        42.25%         17.87%
Skewness                  1.8739        0.3460         -0.2659
Kurtosis                  3.8080        -0.4260        -0.4135
Count                     30            30             30
Standard Error            0.2081        11.3564        2.6416
6*std dev                 6.8404        373.2080       86.8123

Five-Number Summary       Team Value    Payroll        Wins
Minimum                   0.99          48.06          52
First Quartile            1.32          94.93          73
Median                    1.73          141.555        81
Third Quartile            2.3           184.63         92
Maximum                   6             284.73         107
IQR                       0.98          89.7           19
1.33*S                    1.5163        82.7278        19.2434
Team Value: Mean = 2.074, median = 1.73, S = 1.1401, range = 5.01, 6S = 6.8404, interquartile range = 0.98, 1.33(1.1401) = 1.5163. The mean is greater than the median. The range is less than 6S, and the interquartile range is much less than 1.33S. The skewness statistic is 1.8739, and the kurtosis statistic is 3.8080.
Payroll: Mean = 147.2133, median = 141.555, S = 62.2013, range = 236.67, 6S = 373.2080, interquartile range = 89.7, 1.33(62.2013) = 82.7278. The mean is greater than the median. The range is much less than 6S, and the interquartile range is more than 1.33S. The skewness statistic is 0.3460, indicating a symmetric distribution, and the kurtosis statistic is –0.4260, indicating a platykurtic distribution.
Wins: Mean = 80.967, median = 81, S = 14.4687, range = 55, 6S = 86.8123, interquartile range = 19, 1.33(14.4687) = 19.2434. The mean is approximately the same as the median. The range is much less than 6S, and the interquartile range is approximately the same as 1.33S. The skewness statistic is –0.2659, and the kurtosis statistic is –0.4135, indicating a platykurtic distribution.
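The comparisons used above (mean versus median, range versus 6S, interquartile range versus 1.33S) can be collected into one small helper. This is an illustrative sketch only; the function name normality_checks and the choice to report ratios are assumptions, and the inputs are the summary statistics already reported in the Excel output, not the raw data.

```python
def normality_checks(mean: float, median: float, sd: float,
                     value_range: float, iqr: float) -> dict[str, float]:
    """Rule-of-thumb comparisons for approximate normality.

    For a normal distribution the mean equals the median, the range is
    roughly 6 standard deviations, and the IQR is roughly 1.33 standard
    deviations, so ratios near 1 (and a difference near 0) are consistent
    with normality.
    """
    return {
        "mean_minus_median": mean - median,
        "range_over_6s": value_range / (6 * sd),
        "iqr_over_1_33s": iqr / (1.33 * sd),
    }


# Summary statistics for the three variables, taken from the output above.
team_value = normality_checks(2.074, 1.73, 1.1401, 5.01, 0.98)
payroll = normality_checks(147.2133, 141.555, 62.2013, 236.67, 89.7)
wins = normality_checks(80.967, 81, 14.4687, 55, 19)

for name, result in [("Team Value", team_value), ("Payroll", payroll), ("Wins", wins)]:
    print(name, {key: round(value, 3) for key, value in result.items()})
```

These ratios only summarize the comparisons; the skewness and kurtosis statistics and a normal probability plot still need to be examined.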
(b)
6.18
(a), (b) Excel output:
The mean is greater than the median. The range is much less than 6S, and the IQR is more than 1.33S. The box plot is right skewed. The normal probability plot along with the skewness and kurtosis statistics indicate a departure from the normal distribution.
6.19
(a)
Descriptive Summary       Market Cap
Mean                      397.16
Median                    217.48
Mode                      #N/A
Minimum                   38.96
Maximum                   2941.00
Range                     2902.04
Variance                  422032.2302
Standard Deviation        649.6401
Coeff. of Variation       163.57%
Skewness                  3.4450
Kurtosis                  11.4249
Count                     30
Standard Error            118.6075

Five-Number Summary
Minimum                   38.96
First Quartile            123.49
Median                    217.475
Third Quartile            404.34
Maximum                   2941
IQR                       280.85
1.33S                     864.0213
6*Std dev                 3897.84
The range is much less than 6S, and the IQR is less than 1.33S. The mean is larger than the median, the normal probability plot and the histogram both appear right-skewed, and both the skewness and kurtosis statistics indicate a departure from a normal distribution.
(b)
(c)
6.20
Excel output:
Error
Mean                      –0.00023
Median                    0
Mode                      0
Standard Deviation        0.001696
Sample Variance           2.88E-06
Range                     0.008
Minimum                   –0.003
Maximum                   0.005
First Quartile            –0.0015
Third Quartile            0.001
1.33 Std Dev              0.002255
Interquartile Range       0.0025
6 Std Dev                 0.010175
(a) Because the interquartile range is close to 1.33S and the range is also close to 6S, the data appear to be approximately normally distributed.
(b) Normal Probability Plot (Error versus Z Value)
The normal probability plot suggests that the data appear to be approximately normally distributed.
6.21
Excel output:
Descriptive Summary       One-Year CD   Five-Year CD
Mean                      0.18          0.38
Median                    0.12          0.30
Mode                      0.1           0.2
Minimum                   0.02          0.02
Maximum                   0.55          1.00
Range                     0.53          0.98
Variance                  0.0246        0.0763
Standard Deviation        0.1570        0.2762
Coeff. of Variation       89.12%        72.62%
Skewness                  1.4615        0.6486
Kurtosis                  1.3764        –0.2609
Count                     36            36
Standard Error            0.0262        0.0460
6*Std dev                 0.94174913    1.65697401
1.33*S                    0.20875439    0.3672959

Five-Number Summary       One-Year CD   Five-Year CD
Minimum                   0.02          0.02
First Quartile            0.05          0.15
Median                    0.115         0.3
Third Quartile            0.25          0.55
Maximum                   0.55          1
Interquartile Range       0.2           0.4

(a) For the One-Year CD, the mean is larger than the median; the range is smaller than 6 times the standard deviation, and the interquartile range is smaller than 1.33 times the standard deviation. The data do not appear to be normally distributed.
For the Five-Year CD, the mean is larger than the median; the range is smaller than 6 times the standard deviation, and the interquartile range is larger than 1.33 times the standard deviation. The data appear to deviate from the normal distribution.
(b)
6.22
(a) Excel output:
Descriptive Summary       Electricity
Mean                      110.254902
Median                    108
Mode                      95
Minimum                   73
Maximum                   160
Range                     87
Variance                  321.9937
Standard Deviation        17.9442
Coeff. of Variation       16.28%
Skewness                  0.4685
Kurtosis                  0.2788
Count                     51
Standard Error            2.5127

Five-Number Summary
Minimum                   73
First Quartile            95
Median                    108
Third Quartile            123
Maximum                   160
IQR                       28
6*Std deviation           107.6651
1.33*S                    23.8658

The mean is close to the median. The five-number summary suggests that the distribution is quite symmetrical around the median. The interquartile range is much more than 1.33 times the standard deviation. The range is about $20 below 6 times the standard deviation. In general, the distribution of the data appears to closely resemble a normal distribution.
(b) The normal probability plot confirms that the data appear to be approximately normally distributed.
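6.23
(a) P(5 < X < 7) = (7 – 5)/(10 – 0) = 0.2
(b) P(2 < X < 3) = (3 – 2)/(10 – 0) = 0.1
(c) μ = (0 + 10)/2 = 5
(d) σ = sqrt((10 – 0)²/12) = 2.8868
6.24
(a) P(0 < X < 20) = (20 – 0)/100 = 0.20
(b) P(10 < X < 30) = (30 – 10)/100 = 0.20
(c) P(35 < X < 100) = (100 – 35)/100 = 0.65
(d) μ = (0 + 100)/2 = 50, σ = sqrt((100 – 0)²/12) = 28.8675
6.25
(a) P(25 < X < 30) = (30 – 25)/(60 – 20) = 0.125
(b) P(20 < X < 35) = (35 – 20)/(60 – 20) = 0.375
(c) μ = (20 + 60)/2 = 40
(d) σ = sqrt((60 – 20)²/12) = 11.5470
6.26
(a) P(23 < X < 30) = (30 – 23)/(30 – 20) = 0.70
(b) P(25 < X < 30) = (30 – 25)/(30 – 20) = 0.50
(c) P(20 < X < 25) = (25 – 20)/(30 – 20) = 0.50
(d) μ = (20 + 30)/2 = 25, σ = sqrt((30 – 20)²/12) = 2.8868
6.27
(a) P(59 < X < 70) = (70 – 59)/(75 – 59) = 0.6875
(b) P(65 < X < 70) = (70 – 65)/(75 – 59) = 0.3125
(c) P(65 < X < 75) = (75 – 65)/(75 – 59) = 0.6250
(d) μ = (75 + 59)/2 = 67, σ = sqrt((75 – 59)²/12) = 4.6188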
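The uniform-distribution answers in 6.23 through 6.27 all use the same three formulas: P(c < X < d) = (d – c)/(b – a), mean = (a + b)/2, and standard deviation = sqrt((b – a)²/12). A small sketch, with hypothetical function names, that can be used to recheck any of them:

```python
import math


def uniform_prob(a: float, b: float, lo: float, hi: float) -> float:
    """P(lo < X < hi) for X uniform on [a, b]; the interval is clipped to [a, b]."""
    lo, hi = max(lo, a), min(hi, b)
    return max(hi - lo, 0.0) / (b - a)


def uniform_mean(a: float, b: float) -> float:
    """Mean of a uniform distribution on [a, b]."""
    return (a + b) / 2


def uniform_sd(a: float, b: float) -> float:
    """Standard deviation of a uniform distribution on [a, b]."""
    return math.sqrt((b - a) ** 2 / 12)


# A uniform distribution from 0 to 10.
print(uniform_prob(0, 10, 5, 7), uniform_mean(0, 10), round(uniform_sd(0, 10), 4))
# A uniform distribution from 59 to 75.
print(uniform_prob(59, 75, 59, 70), uniform_mean(59, 75), round(uniform_sd(59, 75), 4))
```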
6.28
Using Table E.2, first find the cumulative area up to the larger value, and then subtract the cumulative area up to the smaller value.
6.29
Find the Z value corresponding to the given percentile and then use the equation X = μ + Zσ.
6.30
The normal distribution is bell-shaped; its measures of central tendency are all equal; its interquartile range equals 1.33 standard deviations, so its middle 50% lies within about 0.67 standard deviations of its mean; and 99.7% of its values are contained within three standard deviations of its mean.
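Two of these properties can be checked numerically with the standard normal distribution. A minimal sketch using the Python standard library (the variable names are illustrative only):

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Width of the middle 50% (the interquartile range), in standard deviations.
iqr_in_sd = std_normal.inv_cdf(0.75) - std_normal.inv_cdf(0.25)

# Share of values within three standard deviations of the mean.
within_3sd = std_normal.cdf(3) - std_normal.cdf(-3)

print(f"IQR = {iqr_in_sd:.3f} standard deviations")
print(f"P(-3 < Z < 3) = {within_3sd:.4f}")
```

The interquartile range comes out near 1.35 standard deviations, which the text rounds to 1.33.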
6.31
Both the normal distribution and the uniform distribution are symmetric but the uniform distribution has a bounded range while the normal distribution ranges from negative infinity to positive infinity. The exponential distribution is right-skewed and ranges from zero to infinity.
6.32
If the distribution is normal, the plot of the Z values on the horizontal axis and the original values on the vertical axis will be a straight line.
6.33
(a)
Partial PHStat output:
Probability for a Range From X Value
0.75
To X Value
0.753
Z Value for 0.75
–0.75
Z Value for 0.753
0
P(X<=0.75)
0.2266
P(X<=0.753)
0.5000
P(0.75<=X<=0.753)
0.2734
P(0.75 < X < 0.753) = P(–0.75 < Z < 0) = 0.2734
(b) Partial PHStat output:
Probability for a Range
From X Value
0.74
To X Value
0.75
Z Value for 0.74
–3.25
Z Value for 0.75
–0.75
P(X<=0.74)
0.0006
P(X<=0.75)
0.2266
P(0.74<=X<=0.75)
0.2261
P(0.74 < X < 0.75) = P(–3.25 < Z < –0.75) = 0.2266 – 0.00058 = 0.2260
(c) Partial PHStat output:
Probability for X >
X Value
0.76
Z Value
1.75
P(X>0.76)
0.0401
P(X > 0.76) = P(Z > 1.75) = 1.0 – 0.9599 = 0.0401
(d) Partial PHStat output:
Probability for X <= X Value
0.74
Z Value
–3.25
P(X<=0.74)
0.000577
P(X < 0.74) = P(Z < –3.25) = 0.00058
(e) Partial PHStat output:
Find X and Z Given Cum. Pctage. Cumulative Percentage
7.00%
Z Value
–1.475791
X Value
0.747097
P(X < A) = P(Z < –1.48) = 0.07
A = 0.753 – 1.48(0.004) = 0.7471
6.34
(a) Partial PHStat output:
Probability for a Range
From X Value
1.9
To X Value
2
Z Value for 1.9
–2
Z Value for 2
0
P(X<=1.9)
0.0228
P(X<=2)
0.5000
P(1.9<=X<=2)
0.4772
P(1.90 < X < 2.00) = P(–2.00 < Z < 0) = 0.4772
(b) Partial PHStat output:
Probability for a Range From X Value
1.9
To X Value
2.1
Z Value for 1.9
–2
Z Value for 2.1
2
P(X<=1.9)
0.0228
P(X<=2.1)
0.9772
P(1.9<=X<=2.1)
0.9545
P(1.90 < X < 2.10) = P(–2.00 < Z < 2.00) = 0.9772 – 0.0228 = 0.9544
(c) Partial PHStat output:
Probability for X<1.9 or X >2.1 P(X<1.9 or X >2.1)
0.0455
P(X < 1.90) + P(X > 2.10) = 1 – P(1.90 < X < 2.10) = 0.0456
(d) Partial PHStat output:
Find X and Z Given Cum. Pctage.
Cumulative Percentage
1.00%
Z Value
–2.326348
X Value
1.883683
P(X > A) = P(Z > –2.33) = 0.99
A = 2.00 – 2.33(0.05) = 1.8835
(e) Partial PHStat output:
Find X and Z Given Cum. Pctage. Cumulative Percentage
99.50%
Z Value
2.575829
X Value
2.128791
P(A < X < B) = P(–2.58 < Z < 2.58) = 0.99
A = 2.00 – 2.58(0.05) = 1.8710
B = 2.00 + 2.58(0.05) = 2.1290
6.35
(a) Partial PHStat output:
Probability for a Range From X Value
1.9
To X Value
2
Z Value for 1.9
–2.4
Z Value for 2
–0.4
P(X<=1.9)
0.0082
P(X<=2)
0.3446
P(1.9<=X<=2)
0.3364
P(1.90 < X < 2.00) = P(–2.40 < Z < –0.40) = 0.3446 – 0.0082 = 0.3364
(b) Partial PHStat output:
Probability for a Range From X Value
1.9
To X Value
2.1
Z Value for 1.9
–2.4
Z Value for 2.1
1.6
P(X<=1.9)
0.0082
P(X<=2.1)
0.9452
P(1.9<=X<=2.1)
0.9370
P(1.90 < X < 2.10) = P(–2.40 < Z < 1.60) = 0.9452 – 0.0082 = 0.9370
(c) Partial PHStat output:
Probability for a Range From X Value
1.9
To X Value
2.1
Z Value for 1.9
–2.4
Z Value for 2.1
1.6
P(X<=1.9)
0.0082
P(X<=2.1)
0.9452
P(1.9<=X<=2.1)
0.9370
P(X < 1.90) + P(X > 2.10) = 1 – P(1.90 < X < 2.10) = 0.0630
(d) Partial PHStat output:
Find X and Z Given Cum. Pctage.
Cumulative Percentage
1.00%
Z Value
–2.326348
X Value
1.903683
P(X > A) = P(Z > –2.33) = 0.99
A = 2.02 – 2.33(0.05) = 1.9035
(e) Partial PHStat output:
Find X and Z Given Cum. Pctage. Cumulative Percentage
99.50%
Z Value
2.575829
X Value
2.148791
P(A < X < B) = P(–2.58 < Z < 2.58) = 0.99
A = 2.02 – 2.58(0.05) = 1.8910
B = 2.02 + 2.58(0.05) = 2.1490
6.36
(a) Partial PHStat output:
Probability for X <=
X Value                   210
Z Value                   -2
P(X<=210)                 0.0228
P(X < 210) = P(Z < –2) = 0.0228
(b)
Probability for a Range
From X Value              270
To X Value                300
Z Value for 270           1
Z Value for 300           2.5
P(X<=270)                 0.8413
P(X<=300)                 0.9938
P(270<=X<=300)            0.1524
P(270 < X < 300) = P(1.0 < Z < 2.5) = 0.1524
(c)
Find X and Z Given Cum. Pctage.
Cumulative Percentage     90.00%
Z Value                   1.2816
X Value                   275.6310
P(X < A) = P(Z < 1.2816) = 0.90
A = 250 + 20(1.2816) = $275.63
(d)
Find X Values Given a Percentage
Percentage                80.00%
Z Value                   -1.28
Lower X Value             224.37
Upper X Value             275.63
P(A < X < B) = P(–1.2816 < Z < 1.2816) = 0.80
A = 250 – 1.2816(20) = $224.37
B = 250 + 1.2816(20) = $275.63
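Parts (a) through (c) can be reproduced with the Python standard library, assuming the mean of 250 and standard deviation of 20 implied by the Z values in the output above (for example, Z = –2 at X = 210). This is a sketch for checking the arithmetic, not the PHStat procedure itself, and the variable names are illustrative.

```python
from statistics import NormalDist

# Mean 250 and standard deviation 20, inferred from the output above.
dist = NormalDist(mu=250, sigma=20)

p_below_210 = dist.cdf(210)                       # part (a): P(X < 210)
p_270_to_300 = dist.cdf(300) - dist.cdf(270)      # part (b): P(270 < X < 300)
x_90th_percentile = dist.inv_cdf(0.90)            # part (c): X with 90% of values below it

print(f"P(X < 210)       = {p_below_210:.4f}")
print(f"P(270 < X < 300) = {p_270_to_300:.4f}")
print(f"90th percentile  = {x_90th_percentile:.2f}")
```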
6.37
Excel output:
Descriptive Summary       Alcohol        Calories      Carbohydrates
Mean                      5.269490446    155.656051    12.05171975
Median                    4.92           151           12
Mode                      4.2            110           12
Minimum                   2.4            55            1.9
Maximum                   11.5           330           32.1
Range                     9.1            275           30.2
Variance                  1.8344         1917.4707     24.8094
Standard Deviation        1.3544         43.7889       4.9809
Coeff. of Variation       25.70%         28.13%        41.33%
Skewness                  1.8405         1.2061        0.4912
Kurtosis                  4.5814         2.9620        1.0811
Count                     157            157           157
Standard Error            0.1081         3.4947        0.3975
6*Std deviation           8.126298607    262.7336      29.88544064
1.33*S                    1.801329525    58.2392814    6.624606008

Five-Number Summary       Alcohol        Calories      Carbohydrates
Minimum                   2.4            55            1.9
First Quartile            4.4            130.5         8.65
Median                    4.92           151           12
Third Quartile            5.65           170.5         14.7
Maximum                   11.5           330           32.1
IQR                       1.25           40            6.05
Alcohol %: The mean is greater than the median; the range is larger than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation. The data appear to deviate from the normal distribution.
The normal probability plot suggests that the data are not normally distributed. The kurtosis is 4.5814, indicating a distribution that is more peaked than a normal distribution, with more values in the tails. The skewness of 1.8405 suggests that the distribution is right-skewed.
Calories: The mean is greater than the median; the range is greater than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation. The data appear to deviate from the normal distribution.
The normal probability plot suggests that the data are somewhat right-skewed. The kurtosis is 2.9620, indicating a distribution that is more peaked than a normal distribution, with more values in the tails. The skewness of 1.2061 suggests that the distribution is right-skewed.
Carbohydrates: The mean is approximately equal to the median; the range is approximately equal to 6 times the standard deviation and the interquartile range is slightly smaller than 1.33 times the standard deviation. The data appear to be normally distributed.
The normal probability plot suggests that the data are approximately normally distributed. The kurtosis is 1.0811, indicating a distribution that is slightly more peaked than a normal distribution, with more values in the tails. The skewness of 0.4912 indicates that the distribution deviates slightly from the normal distribution.
6.38
(a) Waiting time will more closely resemble an exponential distribution.
(b) Seating time will more closely resemble a normal distribution.
(c) Histogram (Frequency and Cumulative % by Midpoints: 6, 14, 22, 30, 38) and Normal Probability Plot (Waiting versus Z Value)
Both the histogram and normal probability plot suggest that waiting time more closely resembles an exponential distribution.
(d) Histogram (Frequency and Cumulative % by Midpoints: 35, 43, 51, 59, 67) and Normal Probability Plot (Seating versus Z Value)
Both the histogram and normal probability plot suggest that seating time more closely resembles a normal distribution.
6.39
S&P 500: μ = 26.9, σ = 20
(a) P(X > 0) = P(Z > –1.345) = 0.9107
Probability for X >: X Value 0; Z Value -1.345; P(X>0) 0.9107
(b) P(X > 10) = P(Z > –0.845) = 0.8009
Probability for X >: X Value 10; Z Value -0.845; P(X>10) 0.8009
(c) P(X < 20) = P(Z < –0.345) = 0.3650
Probability for X <=: X Value 20; Z Value -0.345; P(X<=20) 0.3650
(d) P(X < 30) = P(Z < 0.155) = 0.5616
Probability for X <=: X Value 30; Z Value 0.155; P(X<=30) 0.5616
(e) NASDAQ: μ = 0.6, σ = 30
(a) P(X > 0) = P(Z > –0.02) = 0.5080
Probability for X >: X Value 0; Z Value -0.02; P(X>0) 0.5080
(b) P(X > 10) = P(Z > 0.3133) = 0.3770
Probability for X >: X Value 10; Z Value 0.3133333; P(X>10) 0.3770
(c) P(X < 20) = P(Z < 0.6467) = 0.7411
Probability for X <=: X Value 20; Z Value 0.6466667; P(X<=20) 0.7411
(d) P(X < 30) = P(Z < 0.98) = 0.8365
Probability for X <=: X Value 30; Z Value 0.98; P(X<=30) 0.8365
(f)
The probability that an S&P 500 stock gained value in 2021 is 0.9107. The probability that a NASDAQ stock gained value in 2021 is 0.5080. The probability that an S&P 500 stock gained 10% or more value in 2021 is 0.8009. The probability that a NASDAQ stock gained 10% or more value in 2021 is 0.3770. The probability that an S&P 500 stock lost 20% or more value in 2021 is 0.3650. The probability that a NASDAQ stock lost 20% or more value in 2021 is 0.7411. The probability that an S&P 500 stock lost 30% or more value in 2021 is 0.5616. The probability that a NASDAQ stock lost 30% or more value in 2021 is 0.8365. The larger standard deviation of the NASDAQ is associated with higher risk.
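The "gained value" probabilities in 6.39 can be checked without PHStat. The following is a minimal sketch in Python, assuming normally distributed returns with the parameters used above (26.9 and 20 for the S&P 500, 0.6 and 30 for the NASDAQ). The helper name `prob_greater` is hypothetical, and only the standard library is used.

```python
from statistics import NormalDist

# Assumed models, using the means and standard deviations from 6.39.
sp500 = NormalDist(mu=26.9, sigma=20)
nasdaq = NormalDist(mu=0.6, sigma=30)


def prob_greater(dist, x):
    """Return P(X > x) for a normal distribution (hypothetical helper)."""
    return 1 - dist.cdf(x)


p_sp500_gain = prob_greater(sp500, 0)        # P(X > 0), S&P 500
p_nasdaq_gain = prob_greater(nasdaq, 0)      # P(X > 0), NASDAQ
p_sp500_gain10 = prob_greater(sp500, 10)     # P(X > 10), S&P 500
p_nasdaq_gain10 = prob_greater(nasdaq, 10)   # P(X > 10), NASDAQ

print(round(p_sp500_gain, 4), round(p_nasdaq_gain, 4))
print(round(p_sp500_gain10, 4), round(p_nasdaq_gain10, 4))
```

The same helper applies to any of the "Probability for X >" outputs in this chapter once the mean and standard deviation are swapped in.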
6.40
μ = 33,100, σ = 5,000
(a) P(X < 25,000) = P(Z < –1.62) = 0.0526
Probability for X <=: X Value 25000; Z Value -1.62; P(X<=25000) 0.0526
(b) P(25,000 < X < 40,000) = P(–1.62 < Z < 1.38) = 0.9162 – 0.0526 = 0.8636
Probability for a Range: From X Value 25000; To X Value 40000; Z Value for 25000 -1.62; Z Value for 40000 1.38; P(X<=25000) 0.0526; P(X<=40000) 0.9162; P(25000<=X<=40000) 0.8636
(c) P(X > 40,000) = P(Z > 1.38) = 0.0838
Probability for X >: X Value 40000; Z Value 1.38; P(X>40000) 0.0838
(d) P(X > A) = 0.99, Z = –2.3263 = (A – 33,100)/5,000, A = $21,468.2606
Find X and Z Given Cum. Pctage.: Cumulative Percentage 1.00%; Z Value -2.3263; X Value 21468.2606
(e) P(X_lower < X < X_upper) = 0.95
P(Z < –1.96) = 0.025 and P(Z < 1.96) = 0.975
Z = –1.96 = (X_lower – 33,100)/5,000 and Z = 1.96 = (X_upper – 33,100)/5,000
X_lower = 33,100 – 1.96(5,000) = $23,300.18 and X_upper = 33,100 + 1.96(5,000) = $42,899.82
Find X Values Given a Percentage: Percentage 95.00%; Z Value -1.96; Lower X Value 23300.18; Upper X Value 42899.82
(f) P(20,000 < X < 25,000) = (25,000 – 20,000)/(50,000 – 20,000) = 0.1667
(g) P(25,000 < X < 40,000) = (40,000 – 25,000)/(50,000 – 20,000) = 0.50
(h) P(40,000 < X < 50,000) = (50,000 – 40,000)/(50,000 – 20,000) = 0.3333
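Parts (d) through (h) of 6.40 combine an inverse normal lookup with uniform-distribution areas. A minimal sketch in Python follows, assuming the normal parameters 33,100 and 5,000 and a uniform range of 20,000 to 50,000 as used in the solution. The names `dist`, `x_1pct`, and `uniform_prob` are hypothetical.

```python
from statistics import NormalDist

dist = NormalDist(mu=33_100, sigma=5_000)  # parameters from 6.40

# (d) The value A with P(X > A) = 0.99 is the 1st percentile.
x_1pct = dist.inv_cdf(0.01)

# (e) Symmetric limits that contain the middle 95% of values.
x_lower = dist.inv_cdf(0.025)
x_upper = dist.inv_cdf(0.975)


def uniform_prob(lo, hi, a, b):
    """P(lo < X < hi) for X uniform on [a, b], assuming a <= lo <= hi <= b."""
    return (hi - lo) / (b - a)


# (f)-(h) Areas under the uniform distribution on [20,000, 50,000].
p_f = uniform_prob(20_000, 25_000, 20_000, 50_000)
p_g = uniform_prob(25_000, 40_000, 20_000, 50_000)
p_h = uniform_prob(40_000, 50_000, 20_000, 50_000)

print(round(x_1pct, 2), round(x_lower, 2), round(x_upper, 2))
print(round(p_f, 4), round(p_g, 4), round(p_h, 4))
```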
6.41
μ = 47,793, σ = 5,000
(a) P(X < 40,000) = P(Z < –1.5586) = 0.0595
Probability for X <=: X Value 40000; Z Value -1.5586; P(X<=40000) 0.0595
(b) P(40,000 < X < 60,000) = P(–1.5586 < Z < 2.4414) = 0.9927 – 0.0595 = 0.9331
Probability for a Range: From X Value 40000; To X Value 60000; Z Value for 40000 -1.5586; Z Value for 60000 2.4414; P(X<=40000) 0.0595; P(X<=60000) 0.9927; P(40000<=X<=60000) 0.9331
(c) P(X > 60,000) = P(Z > 2.4414) = 0.0073
Probability for X >: X Value 60000; Z Value 2.4414; P(X>60000) 0.0073
(d) P(X < A) = 0.01, Z = –2.3263 = (A – 47,793)/5,000, A = $36,161.2606
Find X and Z Given Cum. Pctage.: Cumulative Percentage 1.00%; Z Value -2.3263; X Value 36161.2606
(e) P(X_lower < X < X_upper) = 0.95
P(Z < –1.96) = 0.025 and P(Z < 1.96) = 0.975
Z = –1.96 = (X_lower – 47,793)/5,000 and Z = 1.96 = (X_upper – 47,793)/5,000
X_lower = 47,793 – 1.96(5,000) = $37,993.18 and X_upper = 47,793 + 1.96(5,000) = $57,592.82
Find X Values Given a Percentage: Percentage 95.00%; Z Value -1.96; Lower X Value 37993.18; Upper X Value 57592.82
(f) P(40,000 < X < 45,000) = (45,000 – 40,000)/(60,000 – 40,000) = 0.25
(g) P(40,000 < X < 50,000) = (50,000 – 40,000)/(60,000 – 40,000) = 0.50
(h) P(45,000 < X < 60,000) = (60,000 – 45,000)/(60,000 – 40,000) = 0.75
6.42
Class project solutions may vary.
Chapter 7
7.1
σ_X̄ = σ/√n = 10/√25 = 2
PHstat output:
Common Data: Mean 100; Standard Deviation 2
Probability for X <=: X Value 97.5; Z Value –1.25; P(X<=97.5) 0.1056
Probability for a Range: From X Value 95; To X Value 97.5; Z Value for 95 –2.5; Z Value for 97.5 –1.25; P(X<=95) 0.0062; P(X<=97.5) 0.1056; P(95<=X<=97.5) 0.0994
Probability for X >: X Value 101.7; Z Value 0.85; P(X>101.7) 0.1977
Find X and Z Given Cum. Pctage.: Cumulative Percentage 25.00%; Z Value –0.6745; X Value 98.6510
Probability for X<97.5 or X >101.7: P(X<97.5 or X>101.7) 0.3033
(a) P(X̄ < 97.5) = P(Z < –1.25) = 0.1056
(b) P(95 < X̄ < 97.5) = P(–2.5 < Z < –1.25) = 0.1056 – 0.0062 = 0.0994
(c) P(X̄ > 101.7) = P(Z > 0.85) = 1.0 – 0.8023 = 0.1977
(d) P(X̄ > A) = P(Z > –0.675) = 0.75, X̄ = 100 – 0.675(10/√25) = 98.65
7.2
σ_X̄ = σ/√n = 5/√100 = 0.5
PHStat output:
Common Data
Mean 50; Standard Deviation 0.5
Probability for X <=: X Value 47; Z Value –6; P(X<=47) 9.866E-10
Probability for a Range: From X Value 47; To X Value 49.5; Z Value for 47 –6; Z Value for 49.5 –1; P(X<=47) 0.0000; P(X<=49.5) 0.1587; P(47<=X<=49.5) 0.1587
Probability for X >: X Value 51.5; Z Value 3; P(X>51.5) 0.0013
Find X and Z Given Cum. Pctage.: Cumulative Percentage 65.00%; Z Value 0.38532; X Value 50.19266
Probability for X<47 or X >51.5: P(X<47 or X >51.5) 0.0013
Probability for X >: X Value 51.1; Z Value 2.2; P(X>51.1) 0.0139
(a) P(X̄ < 47) = P(Z < –6.00) = virtually zero
(b) P(47 < X̄ < 49.5) = P(–6.00 < Z < –1.00) = 0.1587 – 0.00 = 0.1587
(c) P(X̄ > 51.1) = P(Z > 2.20) = 1.0 – 0.9861 = 0.0139
(d) P(X̄ > A) = P(Z > 0.39) = 0.35, X̄ = 50 + 0.39(0.5) = 50.195
7.3
(a) For samples of 25 customer receipts for a supermarket for a year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 customer receipts for a supermarket for that year.
(b) For samples of 25 insurance payouts in a particular geographical area in a year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 insurance payouts in that particular geographical area in that year.
(c) For samples of 25 Call Center logs of inbound calls tracking handling time for a credit card company during the year, the sampling distribution of sample means is the distribution of means from all possible samples of 25 Call Center logs of inbound calls tracking handling time for a credit card company during that year.
7.4
(a)
Sampling Distribution of the Mean for n = 2 (without replacement)
Sample Number   Outcomes   Sample Mean X̄_i
1               1, 3       2
2               1, 6       3.5
3               1, 7       4
4               1, 9       5
5               1, 10      5.5
6               3, 6       4.5
7               3, 7       5
8               3, 9       6
9               3, 10      6.5
10              6, 7       6.5
11              6, 9       7.5
12              6, 10      8
13              7, 9       8
14              7, 10      8.5
15              9, 10      9.5
Mean of All Possible Sample Means: μ_X̄ = 90/15 = 6
Mean of All Population Elements: μ = (1 + 3 + 6 + 7 + 9 + 10)/6 = 6
Both means are equal to 6. This property is called unbiasedness.
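The unbiasedness check in 7.4(a) can be reproduced by enumerating every sample. This is a minimal sketch in Python, assuming the population 1, 3, 6, 7, 9, 10 shown in the Outcomes column. The variable names are illustrative.

```python
from itertools import combinations
from statistics import mean

population = [1, 3, 6, 7, 9, 10]

# All C(6, 2) = 15 samples of size 2 drawn without replacement.
samples = list(combinations(population, 2))
sample_means = [mean(s) for s in samples]

mean_of_sample_means = mean(sample_means)
population_mean = mean(population)

print(len(samples), mean_of_sample_means, population_mean)
```

Because every population element appears in the same number of samples, the mean of the sample means has to match the population mean.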
7.4 cont.
(b)
Sampling Distribution of the Mean for n = 3 (without replacement)
Sample Number   Outcomes    Sample Mean X̄_i
1               1, 3, 6     3 1/3
2               1, 3, 7     3 2/3
3               1, 3, 9     4 1/3
4               1, 3, 10    4 2/3
5               1, 6, 7     4 2/3
6               1, 6, 9     5 1/3
7               1, 6, 10    5 2/3
8               3, 6, 7     5 1/3
9               3, 6, 9     6
10              3, 6, 10    6 1/3
11              6, 7, 9     7 1/3
12              6, 7, 10    7 2/3
13              6, 9, 10    8 1/3
14              7, 9, 10    8 2/3
15              1, 7, 9     5 2/3
16              1, 7, 10    6
17              1, 9, 10    6 2/3
18              3, 7, 9     6 1/3
19              3, 7, 10    6 2/3
20              3, 9, 10    7 1/3
μ_X̄ = 120/20 = 6. This is equal to μ, the population mean.
(c) The distribution for n = 3 has less variability. The larger sample size has resulted in sample means being closer to μ.
(d) (a) Sampling Distribution of the Mean for n = 2 (with replacement)
Sample Number   Outcomes   Sample Mean X̄_i
1               1, 1       1
2               1, 3       2
3               1, 6       3.5
4               1, 7       4
5               1, 9       5
6               1, 10      5.5
7               3, 1       2
8               3, 3       3
9               3, 6       4.5
10              3, 7       5
11              3, 9       6
12              3, 10      6.5
13              6, 1       3.5
14              6, 3       4.5
15              6, 6       6
16              6, 7       6.5
17              6, 9       7.5
18              6, 10      8
19              7, 1       4
20              7, 3       5
21              7, 6       6.5
22              7, 7       7
23              7, 9       8
24              7, 10      8.5
25              9, 1       5
26              9, 3       6
27              9, 6       7.5
28              9, 7       8
29              9, 9       9
30              9, 10      9.5
31              10, 1      5.5
32              10, 3      6.5
33              10, 6      8
34              10, 7      8.5
35              10, 9      9.5
36              10, 10     10
Mean of All Possible Sample Means: μ_X̄ = 216/36 = 6
Mean of All Population Elements: μ = (1 + 3 + 6 + 7 + 9 + 10)/6 = 6
Both means are equal to 6. This property is called unbiasedness.
(b) Repeat the same process for the sampling distribution of the mean for n = 3 (with replacement). There will be 6^3 = 216 different samples. μ_X̄ = 6. This is equal to μ, the population mean.
(c) The distribution for n = 3 has less variability. The larger sample size has resulted in more sample means being close to μ.
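The with-replacement case in 7.4(d) can be enumerated the same way, which also shows the drop in variability from n = 2 to n = 3. This is a minimal sketch in Python, assuming the population 1, 3, 6, 7, 9, 10. The function name `sample_means` is hypothetical.

```python
from itertools import product
from statistics import mean, pvariance

population = [1, 3, 6, 7, 9, 10]


def sample_means(n):
    """Means of all len(population)**n ordered samples drawn with replacement."""
    return [sum(s) / n for s in product(population, repeat=n)]


means_n2 = sample_means(2)   # 6**2 = 36 samples
means_n3 = sample_means(3)   # 6**3 = 216 samples

# Variance of the sampling distribution for each sample size.
var_n2 = pvariance(means_n2)
var_n3 = pvariance(means_n3)

print(len(means_n2), mean(means_n2), var_n2)
print(len(means_n3), mean(means_n3), var_n3)
```

With replacement, the variance of the sample mean equals the population variance divided by n, so the n = 3 distribution is tighter around the population mean.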
x Chapter 7: Sampling Distributions 7.5
(a)
P(X < 2.03) = P(Z < –0.4) = 0.3446 Excel Output: Mean
2.04
Standard Deviation
0.025
Probability for X <= X Value
2.03
Z Value
-0.4
P(X<=2.03) (b)
0.3446
Because the amount of water in a two-liter bottle is approximately normally distributed, the sampling distribution of samples of 4 will also be approximately normal with a mean of μ_X̄ = 2.04 and σ_X̄ = σ/√n = 0.025/√4 = 0.0125.
P(X̄ < 2.03) = P(Z < –0.8) = 0.2119 Excel Output: Mean
2.04
Standard Deviation
0.0125
Probability for X <= X Value
2.03
Z Value
-0.8
P(X<=2.03) (c)
0.2119
Because the amount of water in a two-liter bottle is approximately normally distributed, the sampling distribution of samples of 25 will also be approximately normal with a mean of μ_X̄ = 2.04 and σ_X̄ = σ/√n = 0.025/√25 = 0.005.
P(X̄ < 2.03) = P(Z < –2) = 0.0228 Excel Output: Mean
2.04
Standard Deviation
0.005
Probability for X <= X Value
2.03
Z Value
-2
P(X<=2.03)
0.0228
(d) (a) refers to the amount of water in an individual two-liter bottle while (c) refers to the mean amount of water in a sample of 25 two-liter water bottles. There is a 34.46% chance that an individual water bottle will contain less than 2.03 liters but a 2.28% chance that the mean amount of water in 25 water bottles will be less than 2.03 liters.
(e) Increasing the sample size from four to 25 reduced the probability that the mean amount of water will be less than 2.03 liters from 21.19% to 2.28%.
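The effect of sample size in 7.5 comes entirely from the standard error σ/√n. This is a minimal sketch in Python, assuming the mean of 2.04 liters and standard deviation of 0.025 liters from the Excel output. The function name `prob_mean_below` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

MU, SIGMA = 2.04, 0.025  # liters, as used in 7.5


def prob_mean_below(x, n):
    """P(sample mean < x) for samples of size n from a normal population."""
    standard_error = SIGMA / sqrt(n)
    return NormalDist(MU, standard_error).cdf(x)


# n = 1 is an individual bottle; n = 4 and n = 25 are sample means.
results = {n: prob_mean_below(2.03, n) for n in (1, 4, 25)}

for n, p in results.items():
    print(n, round(p, 4))
```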
xii Chapter 7: Sampling Distributions 7.6
(a)
P(X < 42.035) = P(Z < –0.6) = 0.2743 Excel Output:
(b)
Because the weight of an energy bar is approximately normally distributed, the sampling distribution of samples of 4 will also be approximately normal with a mean of μ_X̄ = 42.05 and σ_X̄ = σ/√n = 0.0125.
P(X̄ < 42.035) = P(Z < –1.2) = 0.1151 Excel Output:
(c)
Because the weight of an energy bar is approximately normally distributed, the sampling distribution of samples of 25 will also be approximately normal with a mean of μ_X̄ = 42.05 and σ_X̄ = σ/√n = 0.005.
P(X̄ < 42.035) = P(Z < –3) = 0.0013 Excel Output:
(d)
(a) refers to an individual energy bar while (c) refers to the mean of a sample of 25 energy bars. There is a 27.43% chance that an individual energy bar will have a weight below 42.035 grams but only a 0.135% chance that the mean weight of 25 energy bars will be below 42.035 grams.
(e)
Increasing the sample size from four to 25 reduced the probability that the mean weight will be below 42.035 grams from 11.51% to 0.135%.
7.7
(a) Because the population diameter of tennis balls is approximately normally distributed, the sampling distribution of samples of 9 will also be approximately normal with a mean of μ_X̄ = 2.63 and σ_X̄ = σ/√n = 0.04/√9 = 0.01333.
(b) P(X̄ < 2.61) = P(Z < –1.50) = 0.0668
Probability for X <=: X Value 2.61; Z Value –1.500375; P(X<=2.61) 0.0668
(c) P(2.62 < X̄ < 2.64) = P(–0.75 < Z < 0.75) = 0.5469
Probability for a Range: From X Value 2.62; To X Value 2.64; Z Value for 2.62 –0.750188; Z Value for 2.64 0.7501875; P(X<=2.62) 0.2266; P(X<=2.64) 0.7734; P(2.62<=X<=2.64) 0.5469
(d) P(A < X̄ < B) = P(–0.8416 < Z < 0.8416) = 0.60
Find X and Z Given Cum. Pctage.: Cumulative Percentage 20.00%; Z Value –0.8416; X Value 2.6188
Find X and Z Given Cum. Pctage.: Cumulative Percentage 80.00%; Z Value 0.8416; X Value 2.6412
Lower bound: X̄ = 2.6188 Upper bound: X̄ = 2.6412
xiv Chapter 7: Sampling Distributions 7.8
(a) When n = 4, the shape of the sampling distribution of X̄ should closely resemble the shape of the distribution of the population from which the sample is selected. Because the mean is larger than the median, the distribution of the sales price of new houses is skewed to the right, and so is the sampling distribution of X̄, although it will be less skewed than the population.
(b) If you select samples of n = 100, the shape of the sampling distribution of the sample mean will be very close to a normal distribution with a mean of $423,300 and a standard error of the mean of σ_X̄ = σ/√n = $90,000/√100 = $9,000.
(c)
P(X̄ < 510,000) = P(Z < 9.6333) = 1.0000 Excel Output: Mean
423300
Standard Deviation
9000
Probability for X <= X Value
510000
Z Value
9.6333333
P(X<=510000)
1.0000
(d)
P(470,000 < X̄ < 480,000) = P(5.1889 < Z < 6.3) = 0.0000 Excel Output: Probability for a Range
From X Value
470000
To X Value
480000
Z Value for 470000
5.1888889
Z Value for 480000
6.3
P(X<=470000)
1.0000
P(X<=480000)
1.0000
P(470000<=X<=480000)
0.0000
7.9
(a)
Because the number of apps used per month by smartphone owners is assumed to be normally distributed, the sampling distribution of samples of 25 will also be approximately normal with a mean of μ_X̄ = 30 and σ_X̄ = σ/√n = 8/√25 = 1.6.
P(29 < X̄ < 31) = P(–0.625 < Z < 0.625) = 0.4680 Excel Output: Common Data Mean
30
Standard Deviation
1.6
Probability for a Range From X Value
29
To X Value
31
Z Value for 29
-0.625
Z Value for 31
0.625
P(X<=29)
0.2660
P(X<=31)
0.7340
P(29<=X<=31)
0.4680
Solutions to End-of-Section and Chapter Review Problems xvii 7.9 cont.
(b)
P(28 < X < 32) = P(–1.25 < Z < 1.25) = 0.7887 Excel Output: Mean
30
Standard Deviation
1.6
Probability for a Range
(c)
From X Value
28
To X Value
32
Z Value for 28
-1.25
Z Value for 32
1.25
P(X<=28)
0.1056
P(X<=32)
0.8944
P(28<=X<=32)
0.7887
Because the number of apps used per month by smartphone owners is assumed to be normally distributed, the sampling distribution of samples of 100 will also be approximately normal with a mean of μ_X̄ = 30 and σ_X̄ = σ/√n = 8/√100 = 0.8.
P(29 < X̄ < 31) = P(–1.25 < Z < 1.25) = 0.7887 Excel Output: Mean
30
Standard Deviation
0.8
Probability for a Range From X Value
29
To X Value
31
Z Value for 29
-1.25
Z Value for 31
1.25
P(X<=29)
0.1056
(d)
P(X<=31)
0.8944
P(29<=X<=31)
0.7887
With the sample size increasing from n = 25 to n = 100, more sample means will be closer to the distribution mean. The standard error of the sampling distribution for samples of size 100 is much smaller than that for samples of size 25, so the likelihood that the sample mean will fall within 1 app of the mean is much higher for samples of size 100 (probability = 0.7887) than for samples of size 25 (probability = 0.4680).
Solutions to End-of-Section and Chapter Review Problems xix 7.10
(a)
μ_X̄ = 149 and σ_X̄ = σ/√n = 30/√16 = 7.5
P(X̄ > 130) = P(Z > –2.53) = 0.9944 Excel Output: Mean
149
Standard Deviation
7.5
Probability for X >
(b)
X Value
130
Z Value
-2.533333
P(X>130)
0.9944
P( X < A) = P(Z < 1.0364) = 0.85 X = 149 + 1.0364 (7.5) = 156.773 Excel Output: Mean
149
Standard Deviation
7.5
Find X and Z Given Cum. Pctage.
(c) (d)
Cumulative Percentage
85.00%
Z Value
1.0364
X Value
156.7733
(c) To be able to use the standardized normal distribution as an approximation for the area under the curve, you must assume that the population is approximately symmetrical.
(d) P(X̄ < A) = P(Z < 1.0364) = 0.85, X̄ = 149 + 1.0364(3.75) = 152.8866 Excel Output: Mean
149
Standard Deviation
3.75
Find X and Z Given Cum. Pctage. Cumulative Percentage
85.00%
Z Value
1.0364
X Value
152.8866
7.11
(a) p = 55/80 = 0.6875
(b) σ_p = √(0.70(0.30)/80) = 0.0512
7.12
(a) p = 20/50 = 0.40
(b) σ_p = √(0.45(0.55)/50) = 0.0704
7.13
(a) p = 16/40 = 0.4
(b) σ_p = √(0.30(0.70)/40) = 0.0725
7.14
(a) μ_p = 0.501, σ_p = √(π(1 – π)/n) = √(0.501(1 – 0.501)/100) = 0.05
Partial PHstat output: Probability for X > X Value
0.55
Z Value
0.98
P(X>0.55)
0.1635
P(p > 0.55) = P(Z > 0.98) = 1 – 0.8365 = 0.1635
(b) μ_p = 0.60, σ_p = √(π(1 – π)/n) = √(0.6(1 – 0.6)/100) = 0.04899
Partial PHstat output: Probability for X > X Value
0.55
Z Value
–1.020621
P(X>0.55)
0.8463
P(p > 0.55) = P(Z > –1.021) = 1 – 0.1539 = 0.8461
(c) μ_p = 0.49, σ_p = √(π(1 – π)/n) = √(0.49(1 – 0.49)/100) = 0.05
Partial PHstat output: Probability for X > X Value
0.55
Z Value
1.2002401
P(X>0.55)
0.1150
P(p > 0.55) = P(Z > 1.20) = 1 – 0.8849 = 0.1151
(d) Increasing the sample size by a factor of 4 decreases the standard error by a factor of 2.
(a) Partial PHstat output: Probability for X > X Value
0.55
Z Value
1.9600039
P(X>0.55)
0.0250
P(p > 0.55) = P (Z > 1.96) = 1 – 0.9750 = 0.0250 (b) Partial PHstat output: Probability for X > X Value
0.55
Z Value
–2.041241
P(X>0.55)
0.9794
P(p > 0.55) = P (Z > – 2.04) = 1 – 0.0207 = 0.9793
(c) Partial PHstat output: Probability for X > X Value
0.55
Z Value
2.4004801
P(X>0.55)
0.0082
P(p > 0.55) = P(Z > 2.40) = 1 – 0.9918 = 0.0082
If the sample size is increased to 400, the probabilities in (a), (b), and (c) are smaller, larger, and smaller, respectively, because the standard error of the sampling distribution of the sample proportion becomes smaller and, hence, the sampling distribution is more concentrated around the true population proportion.
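The comparison in 7.14 between n = 100 and n = 400 can be tabulated directly from the standard error of the proportion, √(π(1 – π)/n). This is a minimal sketch in Python, assuming the three population proportions 0.501, 0.60, and 0.49 used above. The function name `prob_proportion_above` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def prob_proportion_above(p, pi, n):
    """P(sample proportion > p) under the normal approximation."""
    standard_error = sqrt(pi * (1 - pi) / n)
    return 1 - NormalDist(pi, standard_error).cdf(p)


# Probability that the sample proportion exceeds 0.55 for each case.
probs = {
    (pi, n): prob_proportion_above(0.55, pi, n)
    for pi in (0.501, 0.60, 0.49)
    for n in (100, 400)
}

for (pi, n), prob in probs.items():
    print(pi, n, round(prob, 4))
```

Quadrupling n halves the standard error, which pushes each probability further toward 0 or 1 depending on which side of 0.55 the population proportion lies.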
7.15
(a)
μ_p = 0.50, σ_p = √(π(1 – π)/n) = √(0.50(1 – 0.50)/200) = 0.035355339
Partial PHstat output: Probability for a Range
(b)
From X Value
0.5
To X Value
0.55
Z Value for 0.5
0
Z Value for 0.55
1.4142136
P(X<=0.5)
0.5000
P(X<=0.55)
0.9214
P(0.5<=X<=0.55)
0.4214
P(0.50 < p < 0.55) = P(0 < Z < 1.41) = 0.4214 Partial PHstat output: Find X and Z Given Cum. Pctage. Cumulative Percentage
90.00%
Z Value
1.2816
X Value
0.5453
P(–1.2816 < Z < 1.2816) = 0.90
p = 0.50 – 1.2816(0.03536) = 0.4547
p = 0.50 + 1.2816(0.03536) = 0.5453
(c)
Partial PHstat output: Probability for X > X Value
0.70
Z Value
5.6568543
P(X>0.70)
(d)
0.0000
P(p > 0.70) = P (Z > 5.66) = virtually zero Partial PHstat output: Probability for X > X Value
0.6
Z Value
2.8284271
P(X>0.6)
0.0023
xxiv Chapter 7: Sampling Distributions 7.15 cont.
(d)
If n = 200, P(p > 0.60) = P (Z > 2.83) = 1.0 – 0.9977 = 0.0023 Probability for X > X Value
0.55
Z Value
3.1622777
P(X>0.55)
0.00078
If n = 1000, P(p > 0.55) = P(Z > 3.16) = 1.0 – 0.99921 = 0.00079
More than 60% correct in a sample of 200 is more likely than more than 55% correct in a sample of 1000.
7.16
(a)
μ_p = 0.33, σ_p = √(π(1 – π)/n) = √(0.33(1 – 0.33)/100) = 0.047
P(p < 0.3) = P(Z < –0.638) = 0.2616 Excel Output:
Mean
0.33
Standard Deviation
0.047
Probability for X <=
(b)
X Value
0.3
Z Value
-0.638298
P(X<=0.3)
0.2616
P(0.3 < p < 0.4) = P(–0.638 < Z < 1.489) = 0.6702 Excel Output: Probability for a Range From X Value
0.3
To X Value
0.4
Z Value for 0.3
-0.638298
Z Value for 0.4
1.4893617
P(X<=0.3)
0.2616
P(X<=0.4)
0.9318
P(0.3<=X<=0.4) (c)
0.6702
P(p > 0.4) = P(Z > 1.489) = 0.0682 Excel Output: Probability for X >
(d)
X Value
0.4
Z Value
1.4893617
P(X>0.4)
0.0682
μ_p = 0.33, σ_p = √(π(1 – π)/n) = √(0.33(1 – 0.33)/400) = 0.0235
(a) P(p < 0.3) = P(Z < –1.2766) = 0.1009
(b) P(0.3 < p < 0.4) = P(–1.2766 < Z < 2.9787) = 0.8977
(c) P(p > 0.4) = P(Z > 2.9787) = 0.0014
xxvi Chapter 7: Sampling Distributions 7.16 cont.
(d)
Excel Output: Mean
0.33
Standard Deviation
0.0235
Probability for X <= X Value
0.3
Z Value
-1.276596
P(X<=0.3)
0.1009
Probability for X > X Value
0.4
Z Value
2.9787234
P(X>0.4)
0.0014
Probability for X<0.3 or X >0.4 P(X<0.3 or X >0.4)
0.1023
Probability for a Range From X Value
0.3
To X Value
0.4
Z Value for 0.3
-1.276596
Z Value for 0.4
2.9787234
P(X<=0.3)
0.1009
P(X<=0.4)
0.9986
P(0.3<=X<=0.4)
0.8977
Solutions to End-of-Section and Chapter Review Problems xxvii 7.17
μ_p = 0.66, σ_p = √(π(1 – π)/n) = √(0.66(1 – 0.66)/100) = 0.0474
Excel Output: Mean
0.66
Standard Deviation
0.0474
Probability for X <= X Value
0.6
Z Value
-1.265823
P(X<=0.6)
0.1028
Probability for X > X Value
0.7
Z Value
0.8438819
P(X>0.7)
0.1994
xxviii Chapter 7: Sampling Distributions 7.17 cont.
Excel Output: Probability for a Range From X Value
0.6
To X Value
0.7
Z Value for 0.6
-1.265823
Z Value for 0.7
0.8438819
P(X<=0.6)
0.1028
P(X<=0.7)
0.8006
P(0.6<=X<=0.7)
0.6978
(a) P(p < 0.60) = P(Z < –1.266) = 0.1028
(b) P(0.60 < p < 0.70) = P(–1.266 < Z < 0.844) = 0.6978
(c) P(p > 0.70) = P(Z > 0.844) = 0.1994
(d) μ_p = 0.66, σ_p = √(π(1 – π)/n) = √(0.66(1 – 0.66)/400) = 0.0237
(a) P(p < 0.60) = P(Z < –2.532) = 0.0057
(b) P(0.60 < p < 0.70) = P(–2.532 < Z < 1.688) = 0.9486
(c) P(p > 0.70) = P(Z > 1.688) = 0.0457
Excel Output: Mean
0.66
Standard Deviation
0.0237
Probability for X <= X Value
0.6
Z Value
-2.531646
P(X<=0.6)
0.0057
Probability for X > X Value
0.7
Z Value
1.6877637
P(X>0.7)
0.0457
Probability for a Range From X Value
0.6
To X Value
0.7
Z Value for 0.6
-2.531646
Z Value for 0.7
1.6877637
P(X<=0.6)
0.0057
P(X<=0.7)
0.9543
P(0.6<=X<=0.7)
0.9486
xxx Chapter 7: Sampling Distributions 7.18
(a)
μ_p = 0.75, σ_p = √(π(1 – π)/n) = √(0.75(1 – 0.75)/200) = 0.0306
P(0.70 < p < 0.80) = P(–1.634 < Z < 1.634) = 0.8977 Excel Output:
Mean
0.75
Standard Deviation
0.0306
Probability for a Range
(b)
From X Value
0.7
To X Value
0.8
Z Value for 0.7
-1.633987
Z Value for 0.8
1.6339869
P(X<=0.7)
0.0511
P(X<=0.8)
0.9489
P(0.7<=X<=0.8)
0.8977
The probability is 90% that the sample percentage will be contained between 0.70 and 0.80. Excel Output: Find X Values Given a Percentage Percentage
(c)
90.00%
Z Value
-1.64
Lower X Value
0.70
Upper X Value
0.80
The probability is 95% that the sample percentage will be contained between 0.69 and 0.81. Excel Output: Mean
0.75
Standard Deviation
0.0306
Find X Values Given a Percentage Percentage
95.00%
Z Value
-1.96
Lower X Value
0.69
Upper X Value
0.81
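The symmetric limits in 7.18(b) and (c) come from adding and subtracting a Z multiple of the standard error around the population proportion. This is a minimal sketch in Python, assuming π = 0.75 and n = 200 as in the solution. The function name `symmetric_limits` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def symmetric_limits(pi, n, coverage):
    """Limits around pi that contain `coverage` of the sample proportions."""
    standard_error = sqrt(pi * (1 - pi) / n)
    # Upper-tail cutoff of the standard normal for the requested coverage.
    z = NormalDist().inv_cdf(0.5 + coverage / 2)
    return pi - z * standard_error, pi + z * standard_error


lower_90, upper_90 = symmetric_limits(0.75, 200, 0.90)
lower_95, upper_95 = symmetric_limits(0.75, 200, 0.95)

print(round(lower_90, 2), round(upper_90, 2))
print(round(lower_95, 2), round(upper_95, 2))
```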
xxxii Chapter 7: Sampling Distributions 7.19
(a)
μ_p = 0.45, σ_p = √(π(1 – π)/n) = √(0.45(1 – 0.45)/100) = 0.0497
P(0.40 < p < 0.50) = P(–1.006036 < Z < 1.0060362) = 0.6856 Excel Output:
Mean
0.45
Standard Deviation
0.0497
Probability for a Range
(b)
From X Value
0.4
To X Value
0.5
Z Value for 0.4
-1.006036
Z Value for 0.5
1.0060362
P(X<=0.4)
0.1572
P(X<=0.5)
0.8428
P(0.4<=X<=0.5)
0.6856
The probability is 90% that the sample percentage will be contained between 0.37 and 0.53. Excel Output: Find X Values Given a Percentage Percentage
(c)
90.00%
Z Value
-1.64
Lower X Value
0.37
Upper X Value
0.53
The probability is 95% that the sample percentage will be contained between 0.35 and 0.55. Excel Output: Find X Values Given a Percentage Percentage Z Value
95.00% -1.96
Lower X Value
0.35
Upper X Value
0.55
xxxiv Chapter 7: Sampling Distributions 7.19 cont.
(d)
μ_p = 0.45, σ_p = √(π(1 – π)/n) = √(0.45(1 – 0.45)/400) = 0.0249
P(0.40 < p < 0.50) = P(–2.008032 < Z < 2.0080321) = 0.9554 Excel Output:
Mean
0.45
Standard Deviation
0.0249
Probability for a Range From X Value
0.4
To X Value
0.5
Z Value for 0.4
-2.008032
Z Value for 0.5
2.0080321
P(X<=0.4)
0.0223
P(X<=0.5)
0.9777
P(0.4<=X<=0.5)
0.9554
The probability is 90% that the sample percentage will be contained between 0.41 and 0.49. Excel Output: Find X Values Given a Percentage Percentage
90.00%
Z Value
-1.64
Lower X Value
0.41
Upper X Value
0.49
The probability is 95% that the sample percentage will be contained between 0.40 and 0.50. Find X Values Given a Percentage Percentage Z Value
95.00% -1.96
Lower X Value
0.40
Upper X Value
0.50
xxxvi Chapter 7: Sampling Distributions 7.20
(a)
μ_p = 0.8(0.776) = 0.6208, σ_p = √(π(1 – π)/n) = √(0.6208(1 – 0.6208)/200) = 0.0343
P(0.70 < p < 0.78) = P(2.3090 < Z < 4.6414) = 0.0105
Mean
0.6208
Standard Deviation
0.0343
Probability for a Range
(b)
From X Value
0.7
To X Value
0.78
Z Value for 0.7
2.3090379
Z Value for 0.78
4.6413994
P(X<=0.7)
0.9895
P(X<=0.78)
1.0000
P(0.7<=X<=0.78)
0.0105
The probability that more than 90% of the sample have three or more women on the board of directors is nearly 0. Probability for X >
(c)
X Value
0.9
Z Value
8.1399417
P(X>0.9)
0.0000
The probability that more than 95% of the sample have three or more women on the board of directors is nearly 0. Probability for X > X Value
0.95
Z Value
9.5976676
P(X>0.95)
7.21
0.0000
Because the average of all the possible sample means of size n is equal to the population mean. Copyright ©2024 Pearson Education, Inc.
Solutions to End-of-Section and Chapter Review Problems xxxvii 7.22
The standard error of the sample means becomes smaller as larger sample sizes are taken. This is due to the fact that an extreme observation will have a smaller effect on the mean in a larger sample than in a small sample. Thus, the sample means will tend to be closer to the population mean as the sample size increases.
7.23
As larger sample sizes are taken, the effect of extreme values on the sample mean becomes smaller and smaller. With large enough samples, even though the population is not normally distributed, the sampling distribution of the mean will be approximately normally distributed.
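The behavior described in 7.22 and 7.23 can be illustrated with a small simulation. This is a minimal sketch in Python, assuming an exponential population with mean 1 as a convenient right-skewed example (the choice of population, the seed, and the names `simulate_sample_means` and `skewness` are illustrative, not taken from the text).

```python
import random
from statistics import mean, pstdev

random.seed(0)  # fixed seed so repeated runs give the same samples


def simulate_sample_means(n, reps=2000):
    """Means of `reps` samples of size n from an exponential population (mean 1)."""
    return [mean([random.expovariate(1.0) for _ in range(n)]) for _ in range(reps)]


def skewness(values):
    """Population skewness: the average cubed standardized deviation."""
    m, s = mean(values), pstdev(values)
    return mean([((v - m) / s) ** 3 for v in values])


means_n2 = simulate_sample_means(2)
means_n30 = simulate_sample_means(30)

# Larger samples: smaller spread and a more symmetric sampling distribution.
print(pstdev(means_n2), skewness(means_n2))
print(pstdev(means_n30), skewness(means_n30))
```

The spread and skewness of the simulated sample means both shrink as n grows, which is the pattern the Central Limit Theorem predicts for a non-normal population.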
7.24
The population distribution is the distribution of a particular variable of interest, while the sampling distribution represents the distribution of a statistic.
7.25
When the number of items of interest and the number of items not of interest are each at least 5, the normal distribution can be used to approximate the binomial distribution.
7.26
μ_X̄ = 0.753, σ_X̄ = σ/√n = 0.004/5 = 0.0008
PHStat output: Common Data Mean
0.753
Standard Deviation
0.0008 Probability for a Range
Probability for X <=
From X Value
0.75
To X Value
0.753
X Value
0.74
Z Value
–16.25
Z Value for 0.75
–3.75
1.117E-59
Z Value for 0.753
0
P(X<=0.74)
Probability for X > X Value
0.76
Z Value
8.75
P(X>0.76)
0.0000
P(X<=0.75)
0.0001
P(X<=0.753)
0.5000
P(0.75<=X<=0.753)
0.4999
Find X and Z Given Cum. Pctage. Cumulative Percentage
7.00%
Probability for X<0.74 or X >0.76
Z Value
–1.475791
P(X<0.74 or X >0.76)
X Value
0.751819
0.0000
xxxviii Chapter 7: Sampling Distributions Probability for a Range From X Value
0.74
To X Value
0.75
Z Value for 0.74
–16.25
Z Value for 0.75
–3.75
P(X<=0.74)
0.0000
P(X<=0.75)
0.0001
P(0.74<=X<=0.75)
0.00009
(a) P(0.75 < X̄ < 0.753) = P(–3.75 < Z < 0) = 0.5 – 0.00009 = 0.4999
(b) P(0.74 < X̄ < 0.75) = P(–16.25 < Z < –3.75) = 0.00009
(c) P(X̄ > 0.76) = P(Z > 8.75) = virtually zero
(d) P(X̄ < 0.74) = P(Z < –16.25) = virtually zero
(e) P(X̄ < A) = P(Z < –1.48) = 0.07, X̄ = 0.753 – 1.48(0.0008) = 0.7518
Solutions to End-of-Section and Chapter Review Problems xxxix 7.27
μ_X̄ = 2.0, σ_X̄ = σ/√n = 0.045/√25 = 0.009
PHStat output:
Common Data: Mean 2; Standard Deviation 0.009
Probability for X <=: X Value 1.98; Z Value –2.22; P(X<=1.98) 0.0131
Probability for a Range: From X Value 1.99; To X Value 2.0; Z Value for 1.99 –1.11; Z Value for 2 0; P(X<=1.99) 0.1333; P(X<=2) 0.5000; P(1.99<=X<=2) 0.3667
Probability for X >: X Value 2.01; Z Value 1.11; P(X>2.01) 0.1333
Probability for X<1.98 or X >2.01: P(X<1.98 or X >2.01) 0.1464
Find X and Z Given Cum. Pctage.: Cumulative Percentage 1.00%; Z Value –2.3263; X Value 1.9791
Find X and Z Given Cum. Pctage.: Cumulative Percentage 99.50%; Z Value 2.5758; X Value 2.0232
(a) P(1.99 < X̄ < 2.00) = P(–1.11 < Z < 0) = 0.5 – 0.1333 = 0.3667
(b) P(X̄ < 1.98) = P(Z < –2.22) = 0.0131
(c) P(X̄ > 2.01) = P(Z > 1.11) = 1.0 – 0.8667 = 0.1333
(d) P(X̄ > A) = P(Z > –2.33) = 0.99, A = 2.00 – 2.33(0.009) = 1.9791
(e) P(A < X̄ < B) = P(–2.58 < Z < 2.58) = 0.99, A = 2.00 – 2.58(0.009) = 1.9768, B = 2.00 + 2.58(0.009) = 2.0232
7.28
μ_X̄ = 4.7, σ_X̄ = σ/√n = 0.40/5 = 0.08
PHstat output:
Common Data: Mean 4.7; Standard Deviation 0.08
Probability for X >: X Value 4.6; Z Value –1.25; P(X>4.6) 0.8944
Find X and Z Given Cum. Pctage.: Cumulative Percentage 15.00%; Z Value –1.036433; X Value 4.6170853
Find X and Z Given Cum. Pctage.: Cumulative Percentage 85.00%; Z Value 1.036433; X Value 4.782915
Find X and Z Given Cum. Pctage.: Cumulative Percentage 23.00%; Z Value –0.738847; X Value 4.640892
(a) P(4.60 < X̄) = P(–1.25 < Z) = 1 – 0.1056 = 0.8944
(b) P(A < X̄ < B) = P(–1.04 < Z < 1.04) = 0.70, A = 4.70 – 1.04(0.08) = 4.6168 ounces, B = 4.70 + 1.04(0.08) = 4.7832 ounces
(c) P(X̄ > A) = P(Z > –0.74) = 0.77, A = 4.70 – 0.74(0.08) = 4.6408
7.29
(a)
X
0.40 0.08 n 25 Partial PHStat output:
X 4.9
X
Probability for X > X Value Z Value P(X>4.60)
4.6 –3.75 0.9999
P(4.60 < X ) = P(–5 < Z) = 0.9999 (b)
Partial PHStat output: Find X and Z Given Cum. Pctage. Cumulative Percentage 15.00% Z Value –1.0364 X Value 4.8171 Find X and Z Given Cum. Pctage. Cumulative Percentage 85.00% Z Value 1.0364 X Value 4.9829
P(A < X < B) = P(–1.0364 < Z < 1.0364) = 0.70 A = 4.9 – 1.0364(0.08) = 4.8171 ounces X = 4.9 + 1.0364(0.08) = 4.9829 ounces (c)
Partial PHStat output: Find X and Z Given Cum. Pctage. Cumulative Percentage 23.00% Z Value –0.7388 X Value 4.8409
P( X > A) = P(Z > –0.7388) = 0.77A = 4.9 – 0.7388(0.08) = 4.8409
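The PHStat figures for Problem 7.29 can be cross-checked with a short script. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative, and the mean (4.9) and standard error (0.08) are the values stated in the solution.

```python
from statistics import NormalDist

# Sampling distribution of the mean in Problem 7.29:
# mean 4.9 and standard error 0.40 / sqrt(25) = 0.08.
sampling_dist = NormalDist(mu=4.9, sigma=0.08)

z = (4.6 - 4.9) / 0.08                   # Z value for a sample mean of 4.6
p_above = 1 - sampling_dist.cdf(4.6)     # (a) P(sample mean > 4.6)
lower = sampling_dist.inv_cdf(0.15)      # (b) 15th percentile (A)
upper = sampling_dist.inv_cdf(0.85)      # (b) 85th percentile (B)
cutoff = sampling_dist.inv_cdf(0.23)     # (c) value exceeded 77% of the time

print(round(z, 2), round(p_above, 4))
print(round(lower, 4), round(upper, 4), round(cutoff, 4))
```

The percentiles come from `inv_cdf`, which plays the role of PHStat's "Find X and Z Given Cum. Pctage." procedure.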
7.30
μ_X̄ = μ = 21.08, σ_X̄ = σ/√n = 20/4 = 5
Excel Output:
Mean                   21.08
Standard Deviation     5
Probability for X <=
X Value                0
Z Value                -4.216
P(X<=0)                0.0000
Probability for a Range
From X Value           0
To X Value             10
Z Value for 0          -4.216
Z Value for 10         -2.216
P(X<=0)                0.0000
P(X<=10)               0.0133
P(0<=X<=10)            0.0133
Probability for X >
X Value                10
Z Value                -2.216
P(X>10)                0.9867
(a) P(X̄ < 0) = P(Z < –4.216) = 0.0000
(b) P(0 < X̄ < 10) = P(–4.216 < Z < –2.216) = 0.0133
(c) P(X̄ > 10) = P(Z > –2.216) = 0.9867
7.31
Excel Output:
Mean                   -10.87
Standard Deviation     10
Probability for X <=
X Value                0
Z Value                1.087
P(X<=0)                0.8615
Probability for a Range
From X Value           -10
To X Value             -20
Z Value for -10        0.087
Z Value for -20        -0.913
P(X<=-10)              0.5347
P(X<=-20)              0.1806
P(-10<=X<=-20)         0.3540
Probability for X >
X Value                -5
Z Value                0.587
P(X>-5)                0.2786
(a) P(X < 0) = P(Z < 1.087) = 0.8615
(b) P(–20 < X < –10) = P(–0.913 < Z < 0.087) = 0.3540
(c) P(X > –5) = P(Z > 0.587) = 0.2786
7.31 cont.
μ_X̄ = μ = –10.87, σ_X̄ = σ/√n = 10/2 = 5
Excel Output:
Mean                   -10.87
Standard Deviation     5
Probability for X <=
X Value                0
Z Value                2.174
P(X<=0)                0.9851
Probability for a Range
From X Value           -10
To X Value             -20
Z Value for -10        0.174
Z Value for -20        -1.826
P(X<=-10)              0.5691
P(X<=-20)              0.0339
P(-10<=X<=-20)         0.5351
Probability for X >
X Value                -5
Z Value                1.174
P(X>-5)                0.1202
(d) P(X̄ < 0) = P(Z < 2.174) = 0.9851
(e) P(–20 < X̄ < –10) = P(–1.826 < Z < 0.174) = 0.5351
(f) P(X̄ > –5) = P(Z > 1.174) = 0.1202
(g) Since the sample mean of returns of a sample of stocks is distributed closer to the population mean than the return of a single stock, the probabilities in (a) and (b) are lower than those in (d) and (e), while the probability in (c) is higher than that in (f).
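The comparison in (g) can be reproduced numerically. The sketch below assumes the values shown in the solution (mean –10.87, standard deviation 10) and a sample size of n = 4, which is inferred from the standard error 10/2 = 5; the helper name `summary` is illustrative.

```python
from statistics import NormalDist

mu, sigma, n = -10.87, 10.0, 4                 # n = 4 inferred from 10/2 = 5
single = NormalDist(mu, sigma)                  # return of one stock
mean_of_n = NormalDist(mu, sigma / n ** 0.5)    # mean return of n stocks

def summary(dist):
    """Return P(X < 0), P(-20 < X < -10) and P(X > -5) for a distribution."""
    return (
        dist.cdf(0),
        dist.cdf(-10) - dist.cdf(-20),
        1 - dist.cdf(-5),
    )

single_probs = summary(single)      # parts (a), (b), (c)
mean_probs = summary(mean_of_n)     # parts (d), (e), (f)

for label, probs in (("single stock", single_probs), ("mean of 4", mean_probs)):
    print(label, [round(p, 4) for p in probs])
```

Because the sample mean has the smaller standard deviation, more of its probability sits near –10.87, which is why the first two probabilities rise and the third falls.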
7.32
Class Project answers will vary. The mean of the uniform distribution is (a + b)/2; since the random numbers in the table range from 0 to 9, the mean is 4.5. When n = 2, the frequency distribution of the sample means for the class should be centered around 4.5 and have a shape similar to column B with n = 2 in Figure 7.4 on page 232 of the text. As the sample size increases, the frequency distribution of the sample means should take a shape similar to a normal distribution centered around 4.5.
7.33
Class Project answers will vary. This scenario simulates a binomial distribution with π = 0.5; the mean is nπ = 10(0.5) = 5. The frequency distribution for the entire class should have a shape similar to a normal distribution centered around 5.
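The behavior described for Problem 7.32 can be previewed with a small simulation. This is only a sketch: it draws random digits 0 to 9 with Python's `random` module instead of a printed random-number table, and the seed, trial count, and function name are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(0)  # arbitrary seed so the run is repeatable

def sample_means(n, trials=2000):
    """Means of `trials` samples, each of n random digits from 0 to 9."""
    return [mean(random.randint(0, 9) for _ in range(n)) for _ in range(trials)]

means_by_n = {n: sample_means(n) for n in (2, 5, 30)}

for n, means in means_by_n.items():
    # The center stays near 4.5 while the spread shrinks as n grows.
    print(n, round(mean(means), 2), round(stdev(means), 2))
```

A histogram of each list would show the progression from a roughly triangular shape at n = 2 toward a bell shape at n = 30.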
7.34
Class Project answers will vary. Population mean is 1.310 and population standard deviation is 1.13
Depending on class results, one should expect similar results to example 7.5 on page 279 of text. The frequency distributions of the sample means for each sample size should progress from a skewed population toward a bell-shaped distribution as the sample size increases.
7.35
(a)–(b) Class Project answers will vary. (c) Since the population histogram of average credit score is fairly symmetrical, one can expect the class created frequency distributions of the sample means (for each sample size) to be approximately normal for n = 5, n = 15, and n = 30.
Chapter 8
8.1
X̄ ± Z σ/√n = 85 ± 1.96 (6/√64)
83.53 ≤ μ ≤ 86.47
8.2
X̄ ± Z σ/√n = 125 ± 2.58 (24/√36)
114.68 ≤ μ ≤ 135.32
8.3
Since the results of only one sample are used to indicate whether something has gone wrong in the production process, the manufacturer can never know with 100% certainty that the specific interval obtained from the sample includes the true population mean. In order to have 100% confidence, the entire population (sample size N) would have to be selected.
8.4
Yes, it is true since 5% of intervals will not include the population mean.
8.5
If all possible samples of the same size n = 100 are taken, 95% of the confidence intervals constructed from them will include the true population mean time spent on Twitter per day. Thus, you are 95 percent confident that this sample is one that does correctly estimate the true mean time spent on Twitter per day.
8.6
(a) You would compute the mean first because you need the mean to compute the standard deviation. If you had a sample, you would compute the sample mean. If you had the population mean, you would compute the population standard deviation.
(b) If you have a sample, you are computing the sample standard deviation, not the population standard deviation needed in Equation 8.1. If you have a population, and have computed the population mean and population standard deviation, you don't need a confidence interval estimate of the population mean since you have already computed it.
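The intervals in Problems 8.1 and 8.2 follow directly from X̄ ± Z σ/√n. A minimal sketch, with a hypothetical helper name `z_interval` and the Z values (1.96 and 2.58) taken as given in the solutions:

```python
from math import sqrt

def z_interval(xbar, sigma, n, z):
    """Confidence interval for the mean when sigma is known."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

interval_8_1 = z_interval(xbar=85, sigma=6, n=64, z=1.96)
interval_8_2 = z_interval(xbar=125, sigma=24, n=36, z=2.58)

print([round(limit, 2) for limit in interval_8_1])
print([round(limit, 2) for limit in interval_8_2])
```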
8.7
If the population mean time spent on Twitter is 56 minutes a day, the confidence interval estimate stated in Problem 8.5 is correct because it contains the value 56 minutes.
8.8
Equation (8.1) assumes that you know the population standard deviation. Because you are selecting a sample of 100 from the population, you are computing a sample standard deviation, not the population standard deviation.
8.9
(a) X̄ ± Z σ/√n = 0.985 ± 2.58 (0.02/√50)
0.9777 ≤ μ ≤ 0.9923
(b) Since the value of 1.0 is not included in the interval, there is reason to believe that the mean is different from 1.0 gallon, and the distributor has a right to complain.
(c) No. Since σ is known and n = 50, from the Central Limit Theorem, we may assume that the sampling distribution of X̄ is approximately normal.
(d) The reduced confidence level narrows the width of the confidence interval.
X̄ ± Z σ/√n = 0.985 ± 1.96 (0.02/√50)
0.9795 ≤ μ ≤ 0.9905
Since the value of 1.0 is still not included in the interval, the distributor does have a right to complain.
8.10
(a) X̄ ± Z σ/√n = 49,875 ± 1.96 (1500/√64)
49,507.51 ≤ μ ≤ 50,242.49
(b) Yes, because the confidence interval includes 50,000 hours, the manufacturer can support a claim that the bulbs have a mean life of 50,000 hours.
(c) No. Because σ is known and n = 64, from the Central Limit Theorem, you know that the sampling distribution of X̄ is approximately normal.
(d) X̄ ± Z σ/√n = 49,875 ± 1.96 (500/√64)
49,752.50 ≤ μ ≤ 49,997.50
The confidence interval is narrower, based on the population standard deviation of 500 hours, and it no longer includes 50,000, so the manufacturer could not state that the LED bulbs have a mean life of 50,000 hours.
8.11
X̄ ± t S/√n = 75 ± 2.0301 (20/√36)
68.2330 ≤ μ ≤ 81.7670
8.12
(a) df = 9, α = 0.05, tα/2 = 2.2622
(b) df = 9, α = 0.01, tα/2 = 3.2498
(c) df = 31, α = 0.05, tα/2 = 2.0395
(d) df = 64, α = 0.05, tα/2 = 1.9977
(e) df = 15, α = 0.1, tα/2 = 1.7531
8.13
Set 1: 4.5 ± 2.3646 (3.7417/√8); 1.3719 ≤ μ ≤ 7.6281
Set 2: 4.5 ± 2.3646 (2.4495/√8); 2.4522 ≤ μ ≤ 6.5478
The data sets have different confidence interval widths because they have different values for the standard deviation.
8.14
Original data: 5.8571 ± 2.4469 (6.4660/√7); –0.1229 ≤ μ ≤ 11.8371
Altered data: 4.00 ± 2.4469 (2.1602/√7); 2.0022 ≤ μ ≤ 5.9978
The presence of an outlier in the original data increases the value of the sample mean and greatly inflates the sample standard deviation.
8.15
(a) PHStat output:
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation      200
Sample Mean                    1500
Sample Size                    100
Confidence Level               95%
Intermediate Calculations
Standard Error of the Mean     20
Degrees of Freedom             99
t Value                        1.9842
Interval Half Width            39.6843
Confidence Interval
Interval Lower Limit           1460.32
Interval Upper Limit           1539.68
X̄ ± t S/√n = 1500 ± 1.9842 (200/√100)
$1460.32 ≤ μ ≤ $1539.68
(b) You can be 95% confident that the population mean spending for all Amazon Prime member shoppers is somewhere between $1460.32 and $1539.68.
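The intermediate rows of the PHStat output in Problem 8.15 (standard error, degrees of freedom, half width) can be reproduced as follows. This is a sketch with a hypothetical helper `t_interval`; the t value is passed in from the solution because the Python standard library does not provide t-distribution quantiles.

```python
from math import sqrt

def t_interval(xbar, s, n, t_value):
    """Return PHStat-style intermediate values and the interval limits.

    t_value must be supplied by the caller (from a t table or software).
    """
    std_error = s / sqrt(n)
    half_width = t_value * std_error
    return {
        "std_error": std_error,
        "df": n - 1,
        "half_width": half_width,
        "lower": xbar - half_width,
        "upper": xbar + half_width,
    }

result = t_interval(xbar=1500, s=200, n=100, t_value=1.9842)
print({key: round(value, 2) for key, value in result.items()})
```

The same call with xbar=75, s=20, n=36, t_value=2.0301 gives the limits shown for Problem 8.11.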
8.16
(a) X̄ ± t S/√n = 87 ± 1.9781 (9/√133)
85.46 ≤ μ ≤ 88.54
(b) You can be 95% confident that the population mean one-time gift donation is somewhere between $85.46 and $88.54.
8.17
(a) X̄ ± t S/√n = 197.8 ± 2.1098 (21.4/√18)
187.1580 ≤ μ ≤ 208.4420
(b) No, a grade of 200 is in the interval.
(c) It is not unusual to have an observed tread wear index of 210, which is outside the 95% confidence interval for the population mean tread wear index, because the standard error of the sample mean, σ/√n, is smaller than the standard deviation of the population of the tread wear index for a single observed tread wear. Hence, the value of a single observed tread wear index varies around the population mean more than a sample mean does.
8.18
(a) 6.32 ≤ μ ≤ 7.87
(b) You can be 95% confident that the population mean amount spent for lunch at a fast-food restaurant is between $6.32 and $7.87.
(c) That the population distribution is normally distributed.
(d) The assumption of normality is not seriously violated and, with a sample of 15, the validity of the confidence interval is not seriously impacted.
8.19
(a) 24.72 ≤ μ ≤ 26.05
Commuting Time
Data
Sample Standard Deviation      3.3297
Sample Mean                    25.385
Sample Size                    100
Confidence Level               95%
Intermediate Calculations
Standard Error of the Mean     0.33297
Degrees of Freedom             99
t Value                        1.9842
Interval Half Width            0.6607
Confidence Interval
Interval Lower Limit           24.72
Interval Upper Limit           26.05
(b) You can be 95% confident that the population mean commuting time is somewhere between 24.72 minutes and 26.05 minutes.
(c) That the population distribution is normally distributed.
(d) The assumption of normality is not seriously violated with sample sizes of 30. The validity of the confidence interval is not seriously impacted.
8.20
(a) For First and Second Quarter ads: 5.52 ≤ μ ≤ 6.06
For Halftime and Second half ads: 5.26 ≤ μ ≤ 5.80
Excel Output:
                               BEFORE HALFTIME    HALFTIME AND AFTERWARDS
Data
Sample Standard Deviation      0.702028           0.709263
Sample Mean                    5.789286           5.534483
Sample Size                    28                 29
Confidence Level               95%                95%
Intermediate Calculations
Standard Error of the Mean     0.132671           0.131707
Degrees of Freedom             27                 28
t Value                        2.0518             2.0484
Interval Half Width            0.2722             0.2698
Confidence Interval
Interval Lower Limit           5.52               5.26
Interval Upper Limit           6.06               5.80
(b) You are 95% confident that the mean rating for first and second quarter ads is between 5.52 and 6.06. You are 95% confident that the mean rating for halftime and second half ads is between 5.26 and 5.80.
(c) The confidence intervals for the two groups of ads are similar.
(d) You need to assume that the distributions of the rating for the two groups of ads are normally distributed.
(e) The distribution of each group of ads appears approximately normally distributed.
8.21
Excel Output:
                               One-Year CD    Five-Year CD
Data
Sample Standard Deviation      0.157          0.2762
Sample Mean                    0.1761         0.3803
Sample Size                    36             36
Confidence Level               95%            95%
Intermediate Calculations
Standard Error of the Mean     0.026167       0.046033
Degrees of Freedom             35             35
t Value                        2.0301         2.0301
Interval Half Width            0.0531         0.0935
Confidence Interval
Interval Lower Limit           0.12           0.29
Interval Upper Limit           0.23           0.47
(a) One Year CD: 0.12 ≤ μ ≤ 0.23
(b) Five Year CD: 0.29 ≤ μ ≤ 0.47
(c) The mean yield for a one-year CD is somewhere between 0.12 and 0.23 with 95% confidence, and the mean yield for a five-year CD is somewhere between 0.29 and 0.47 with 95% confidence.
8.22
(a) X̄ ± t S/√n = 43.04 ± 2.0096 (41.9261/√50)
31.12 ≤ μ ≤ 54.96
(b) The population distribution needs to be normally distributed.
(c) [Normal probability plot of Days against Z Value; box-and-whisker plot of Days]
(d) Both the normal probability plot and the boxplot suggest that the distribution is skewed to the right. Even though the population distribution is not normally distributed, with a sample of 50, the t distribution can still be used due to the Central Limit Theorem.
8.23
(a) X̄ ± t S/√n = 1724.4 ± 2.0452 (87.3651/√30)
1691.78 ≤ μ ≤ 1757.02
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation      87.3651
Sample Mean                    1724.4
Sample Size                    30
Confidence Level               95%
Intermediate Calculations
Standard Error of the Mean     15.950612
Degrees of Freedom             29
t Value                        2.0452
Interval Half Width            32.6227
Confidence Interval
Interval Lower Limit           1691.78
Interval Upper Limit           1757.02
(b) The population distribution needs to be normally distributed.
(c) The normal probability plot indicates that the population distribution is normally distributed.
[Normal probability plot of Force against Z Value]
8.24
Excel Output:
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation      14.1397
Sample Mean                    59.30714286
Sample Size                    14
Confidence Level               95%
Intermediate Calculations
Standard Error of the Mean     3.778993782
Degrees of Freedom             13
t Value                        2.1604
Interval Half Width            8.1640
Confidence Interval
Interval Lower Limit           51.14
Interval Upper Limit           67.47
(a) 51.14 ≤ μ ≤ 67.47
(b) The population distribution is normally distributed.
(c) The boxplot appears to be left skewed, so the validity of the confidence interval is questionable.
8.25
(a) X̄ ± t S/√n = –0.00023 ± 1.9842 (0.0017/√100)
–0.000566 ≤ μ ≤ 0.000106
(b) The population distribution needs to be normally distributed. However, with a sample of 100, the t distribution can still be used as a result of the Central Limit Theorem even if the population distribution is not normal.
8.25 cont.
(c) [Normal probability plot of Error against Z Value; box-and-whisker plot of Error]
Both the normal probability plot and the boxplot suggest that the distribution is skewed to the right.
(d) We are 95% confident that the mean difference between the actual length of the steel part and the specified length of the steel part is between –0.000566 and 0.000106 inch, which is narrower than the plus or minus 0.005 inch requirement. The steel mill is doing a good job at meeting the requirement. This is consistent with the finding in Problem 2.43.
8.26
p = X/n = 50/200 = 0.25
p ± Z√(p(1 – p)/n) = 0.25 ± 1.96√(0.25(0.75)/200)
0.19 ≤ π ≤ 0.31
8.27
p = X/n = 20/400 = 0.05
p ± Z√(p(1 – p)/n) = 0.05 ± 2.58√(0.05(0.95)/400)
0.0219 ≤ π ≤ 0.0781
8.28
(a) p = X/n = 135/500 = 0.27
p ± Z√(p(1 – p)/n) = 0.27 ± 2.5758√(0.27(1 – 0.27)/500)
0.22 ≤ π ≤ 0.32
(b) The manager in charge of promotional programs concerning residential customers can infer that the proportion of households that would purchase a new cellphone if it were made available at a substantially reduced installation cost is between 0.22 and 0.32 with a 99% level of confidence.
8.29
(a)
Excel Output: Number of successes = 400(0.73) = 292
Confidence Interval Estimate for the Proportion
Data Sample Size
400
Number of Successes
292
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.73
Z Value
-1.9600
Standard Error of the Proportion
0.0222
Interval Half Width
0.0435
Confidence Interval
Interval Lower Limit
0.6865
Interval Upper Limit
0.7735
p = 0.73
p ± Z√(p(1 – p)/n) = 0.73 ± 1.96√(0.73(1 – 0.73)/400)
0.6865 ≤ π ≤ 0.7735
The 95% confidence interval for the proportion of adults who responded somewhat agree or strongly agree that flexibility in work scheduling increases productivity is somewhere between 68.65% and 77.35%.
(b) One can infer that a large proportion of U.S. adults want flexibility in work scheduling.
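The proportion interval in Problem 8.29 can be checked with the formula p ± Z√(p(1 – p)/n). The sketch below uses a hypothetical helper `proportion_interval` and obtains Z from `statistics.NormalDist` instead of a table, so it uses 1.959964 where the solution shows 1.96.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence=0.95):
    """Confidence interval for a population proportion (normal approximation)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

lower, upper = proportion_interval(successes=292, n=400)
print(round(lower, 4), round(upper, 4))
```

Passing confidence=0.99 with successes=135 and n=500 reproduces the limits of Problem 8.28 to two decimals.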
8.30
(a)
Excel Output: Number of successes = 2000(0.798) = 1596
Confidence Interval Estimate for the Proportion
Data
Sample Size
2000
Number of Successes
1596
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.798
Z Value
-1.9600
Standard Error of the Proportion
0.0090
Interval Half Width
0.0176
Confidence Interval
Interval Lower Limit
0.7804
Interval Upper Limit
0.8156
p = 0.798
p ± Z√(p(1 – p)/n) = 0.798 ± 1.96√(0.798(1 – 0.798)/2000)
0.7804 ≤ π ≤ 0.8156
The 95% confidence interval estimate for the population proportion of U.S. Internet shoppers aged 18 and above who said they shopped at Amazon because of free shipping is between 78.04% and 81.56%.
(b)
Excel Output: Number of successes = 2000(0.689) = 1378
Data
Sample Size
2000
Number of Successes
1378
Confidence Level
95%
Intermediate Calculations
Sample Proportion
0.689
Z Value
-1.9600
Standard Error of the Proportion
0.0104
Interval Half Width
0.0203
Confidence Interval
Interval Lower Limit
0.6687
Interval Upper Limit
0.7093
p = 0.689
p ± Z√(p(1 – p)/n) = 0.689 ± 1.96√(0.689(1 – 0.689)/2000)
0.6687 ≤ π ≤ 0.7093
The 95% confidence interval estimate for the population proportion of U.S. Internet shoppers aged 18 and above who said they shopped at Amazon because of broad selection is between 66.87% and 70.93%.
(c) A large proportion of Amazon shoppers purchase from Amazon because of free shipping and because of its broad selection.
8.31
(a)
Excel Output: Number of successes = 1000(0.85) = 850
Confidence Interval Estimate for the Proportion
Data
Sample Size
1000
Number of Successes
850
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.85
Z Value
-1.9600
Standard Error of the Proportion
0.0113
Interval Half Width
0.0221
Confidence Interval
Interval Lower Limit
0.8279
Interval Upper Limit
0.8721
p = 0.85
p ± Z√(p(1 – p)/n) = 0.85 ± 1.96√(0.85(1 – 0.85)/1000)
0.8279 ≤ π ≤ 0.8721
The 95% confidence interval estimate for the population proportion of U.S. adults who now say that they go online on a daily basis is between 82.79% and 87.21%.
(b)
Excel Output: Number of successes = 1000(0.07) = 70
Data
Sample Size
1000
Number of Successes
70
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.07
Z Value
-1.9600
Standard Error of the Proportion
0.0081
Interval Half Width
0.0158
Confidence Interval
Interval Lower Limit
0.0542
Interval Upper Limit
0.0858
p = 0.07
p ± Z√(p(1 – p)/n) = 0.07 ± 1.96√(0.07(1 – 0.07)/1000)
0.0542 ≤ π ≤ 0.0858
The 95% confidence interval estimate for the population proportion of U.S. adults who say that they do not use the Internet at all is between 5.42% and 8.58%.
(c) A large proportion of U.S. adults say they use the Internet on a daily basis, while a very small proportion say that they do not use the Internet at all.
8.32
(a)
Excel Output: Number of successes = 500(0.40) = 200
Confidence Interval Estimate for the Proportion
Data
Sample Size
500
Number of Successes
200
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.4
Z Value
-1.9600
Standard Error of the Proportion
0.0219
Interval Half Width
0.0429
Confidence Interval
Interval Lower Limit
0.3571
Interval Upper Limit
0.4429
p = 0.40
p ± Z√(p(1 – p)/n) = 0.40 ± 1.96√(0.40(1 – 0.40)/500)
0.3571 ≤ π ≤ 0.4429
The 95% confidence interval estimate for the population proportion of Americans who drink their coffee at coffee shops and drink at Starbucks is between 35.71% and 44.29%.
(b)
Excel Output: Number of successes = 500(0.26) = 130
Data
Sample Size
500
Number of Successes
130
Confidence Level
95%
Intermediate Calculations
Sample Proportion
0.26
Z Value
-1.9600
Standard Error of the Proportion
0.0196
Interval Half Width
0.0384
Confidence Interval
Interval Lower Limit
0.2216
Interval Upper Limit
0.2984
p = 0.26
p ± Z√(p(1 – p)/n) = 0.26 ± 1.96√(0.26(1 – 0.26)/500)
0.2216 ≤ π ≤ 0.2984
The 95% confidence interval estimate for the population proportion of Americans who drink their coffee at coffee shops and drink at Dunkin is between 22.16% and 29.84%.
(c) A large proportion of Americans who drink their coffee at coffee shops drink at Starbucks or at Dunkin.
8.33
(a)
Excel Output: Number of successes = 1,358
Confidence Interval Estimate for the Proportion
Data
Sample Size
3725
Number of Successes
1358
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.364563758
Z Value
-1.9600
Standard Error of the Proportion
0.0079
Interval Half Width
0.0155
Confidence Interval
Interval Lower Limit
0.3491
Interval Upper Limit
0.3800
p = X/n = 1358/3725 = 0.3646
p ± Z√(p(1 – p)/n) = 0.3646 ± 1.96√(0.3646(1 – 0.3646)/3725)
0.3491 ≤ π ≤ 0.3800
The 95% confidence interval estimate for the population proportion of customers who had paperless billing and who churned in the last month is between 34.91% and 38.00%.
(b)
Excel Output: Number of successes = 398
Data
Sample Size
1792
Number of Successes
398
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.222098214
Z Value
-1.9600
Standard Error of the Proportion
0.0098
Interval Half Width
0.0192
Confidence Interval
Interval Lower Limit
0.2029
Interval Upper Limit
0.2413
p = X/n = 398/1792 = 0.2221
p ± Z√(p(1 – p)/n) = 0.2221 ± 1.96√(0.2221(1 – 0.2221)/1792)
0.2029 ≤ π ≤ 0.2413
The 95% confidence interval estimate for the population proportion of customers who did not have paperless billing and who churned in the last month is between 20.29% and 24.13%.
(c) A small proportion of telecom customers churned last month.
8.34
n = Z²σ²/e² = (1.96²)(15²)/5² = 34.57   Use n = 35
8.35
n = Z²σ²/e² = (2.58²)(90²)/20² = 134.7921   Use n = 135
8.36
n = Z²π(1 – π)/e² = (2.58²)(0.5)(0.5)/(0.04)² = 1,040.06   Use n = 1,041
8.37
n = Z²π(1 – π)/e² = (1.96²)(0.38)(0.62)/(0.03)² = 1,005.6455   Use n = 1,006
8.38
(a) n = Z²σ²/e² = (1.96²)(400²)/50² = 245.86   Use n = 246
(b) n = Z²σ²/e² = (1.96²)(400²)/25² = 983.41   Use n = 984
8.39
n = Z²σ²/e² = (1.96²)(0.025)²/(0.0035)² = 196   Use n = 196
8.40
n = Z²σ²/e² = (1.96²)(1500²)/400² = 54.0225   Use n = 55
8.41
n = Z²σ²/e² = (1.96²)(0.045)²/(0.015)² = 34.5744   Use n = 35
8.42
(a) n = Z²σ²/e² = (2.5758²)(20²)/5² = 106.1583   Use n = 107
(b) n = Z²σ²/e² = (1.96²)(20²)/5² = 61.4633   Use n = 62
8.43
(a) n = Z²σ²/e² = (1.645²)(30²)/4² = 152.2   Use n = 153
(b) n = Z²σ²/e² = (2.5758²)(30²)/4² = 373.2   Use n = 374
8.44
Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.
(a) n = Z²σ²/e² = (1.96²)(2²)/0.25² = 245.85   Use n = 246
(b) n = Z²σ²/e² = (1.96²)(2.5²)/0.25² = 384.15   Use n = 385
(c) n = Z²σ²/e² = (1.96²)(3.0²)/0.25² = 553.17   Use n = 554
(d) When there is more variability in the population, a larger sample is needed to accurately estimate the mean.
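The sample sizes in Problem 8.44 illustrate how n = Z²σ²/e² grows with σ. A minimal sketch follows, assuming a hypothetical helper `sample_size_mean` that takes Z from `statistics.NormalDist` (as PHStat does) and always rounds up:

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(sigma, e, confidence=0.95):
    """Sample size needed to estimate a mean within sampling error e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return ceil((z * sigma / e) ** 2)                   # always round up

sizes = [sample_size_mean(sigma, e=0.25) for sigma in (2, 2.5, 3.0)]
print(sizes)
```

Rounding up rather than to the nearest integer guarantees that the sampling error does not exceed e.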
8.45
(a) Excel Output:
Data
Estimate of True Proportion    0.36
Sampling Error                 0.04
Confidence Level               95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         553.1701
Result
Sample Size Needed             554.0000
n = Z²π(1 – π)/e² = (1.96²)(0.36)(1 – 0.36)/0.04² = 553.19   Use n = 554
(b) Excel Output:
Data
Estimate of True Proportion    0.36
Sampling Error                 0.04
Confidence Level               99%
Intermediate Calculations
Z Value                        -2.5758
Calculated Sample Size         955.4251
Result
Sample Size Needed             956.0000
n = Z²π(1 – π)/e² = (2.5758²)(0.36)(1 – 0.36)/0.04² = 955.4   Use n = 956
(c) Excel Output:
Data
Estimate of True Proportion    0.36
Sampling Error                 0.02
Confidence Level               95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         2212.6803
Result
Sample Size Needed             2213.0000
n = Z²π(1 – π)/e² = (1.96²)(0.36)(1 – 0.36)/0.02² = 2212.76   Use n = 2213
(d) Excel Output:
Data
Estimate of True Proportion    0.36
Sampling Error                 0.02
Confidence Level               99%
Intermediate Calculations
Z Value                        -2.5758
Calculated Sample Size         3821.7004
Result
Sample Size Needed             3822.0000
n = Z²π(1 – π)/e² = (2.5758²)(0.36)(1 – 0.36)/0.02² = 3821.6   Use n = 3822
(e) The higher the level of confidence desired, the larger is the sample size required. The smaller the sampling error desired, the larger is the sample size required.
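The four scenarios of Problem 8.45 can be generated in one pass to make the pattern in (e) visible. This is a sketch with a hypothetical helper `sample_size_proportion`; as in PHStat, Z is computed to full precision, so intermediate values match the Excel output more closely than the hand calculations with 1.96.

```python
from math import ceil
from statistics import NormalDist

def sample_size_proportion(pi, e, confidence=0.95):
    """Sample size needed to estimate a proportion within sampling error e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    return ceil(z ** 2 * pi * (1 - pi) / e ** 2)        # always round up

scenarios = [(0.04, 0.95), (0.04, 0.99), (0.02, 0.95), (0.02, 0.99)]
needed = [sample_size_proportion(0.36, e, conf) for e, conf in scenarios]
print(needed)
```

Halving the sampling error roughly quadruples the required sample size, because e enters the formula squared.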
8.46
(a)
Excel Output: pricing expectations of potential targets
Confidence Interval Estimate for the Proportion
Data
Sample Size
229
Number of Successes
167
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.729257642
Z Value
-1.9600
Standard Error of the Proportion
0.0294
Interval Half Width
0.0576
Confidence Interval Interval Lower Limit
0.6717
Interval Upper Limit
0.7868
p = X/n = 167/229 = 0.729
p ± Z√(p(1 – p)/n) = 0.729 ± 1.96√(0.729(1 – 0.729)/229)
0.6717 ≤ π ≤ 0.7868
(b)
Excel Output: culture/integration of personnel
Data
Sample Size                         229
Number of Successes                 128
Confidence Level                    95%
Intermediate Calculations
Sample Proportion                   0.558951965
Z Value                             -1.9600
Standard Error of the Proportion    0.0328
Interval Half Width                 0.0643
Confidence Interval
Interval Lower Limit                0.4946
Interval Upper Limit                0.6233
p = X/n = 128/229 = 0.559
p ± Z√(p(1 – p)/n) = 0.559 ± 1.96√(0.559(1 – 0.559)/229)
0.4946 ≤ π ≤ 0.6233
(c)
Excel Output: technology integration Data Sample Size
229
Number of Successes
48
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.209606987
Z Value
-1.9600
Standard Error of the Proportion
0.0269
Interval Half Width
0.0527
Confidence Interval Interval Lower Limit
0.1569
Interval Upper Limit
0.2623
p = X/n = 48/229 = 0.210
p ± Z√(p(1 – p)/n) = 0.210 ± 1.96√(0.210(1 – 0.210)/229)
0.1569 ≤ π ≤ 0.2623
(d)
Excel Output: pricing expectations of potential targets (a)
Data
Estimate of True Proportion    0.729257642
Sampling Error                 0.02
Confidence Level               95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         1896.1530
Result
Sample Size Needed             1897.0000
n = Z²π(1 – π)/e² = (1.96²)(0.729)(1 – 0.729)/0.02² = 1,896.2   Use n = 1,897
8.46 cont.
(d)
Excel Output: culture/integration of personnel (b)
Data
Estimate of True Proportion    0.558951965
Sampling Error                 0.02
Confidence Level               95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         2367.5359
Result
Sample Size Needed             2368.0000
n = Z²π(1 – π)/e² = (1.96²)(0.559)(1 – 0.559)/0.02² = 2,367.5   Use n = 2,368
Excel Output: technology integration (c)
Data
Estimate of True Proportion    0.209606987
Sampling Error                 0.02
Confidence Level               95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         1591.0544
Result
Sample Size Needed             1592.0000
n = Z²π(1 – π)/e² = (1.96²)(0.210)(1 – 0.210)/0.02² = 1,591.1   Use n = 1,592
8.47
(a)
Excel Output: Data Sample Size
400
Number of Successes
196
Confidence Level
95%
Intermediate Calculations Sample Proportion
0.49
Z Value
-1.9600
Standard Error of the Proportion
0.0250
Interval Half Width
0.0490
Confidence Interval Interval Lower Limit
0.4410
Interval Upper Limit
0.5390
p = X/n = 196/400 = 0.49
p ± Z√(p(1 – p)/n) = 0.49 ± 1.96√(0.49(1 – 0.49)/400)
0.4410 ≤ π ≤ 0.5390
(b)
You are 95% confident that the population proportion of nonprofit professionals who indicate that ensuring employees are properly trained and serving their mission are their most important goals for the coming year is somewhere between 44.10% and 53.90%.
(c)
Excel Output: Data Estimate of True Proportion
0.49
Sampling Error
0.01
Confidence Level
95%
Intermediate Calculations
Z Value                        -1.9600
Calculated Sample Size         9599.8056
Result
Sample Size Needed             9600.0000
n = Z²π(1 – π)/e² = (1.96²)(0.49)(1 – 0.49)/0.01² = 9,599.8   Use n = 9,600
8.48
(a) If you conducted a follow-up study, you would use π = 0.32 in the sample size formula because it is based on past information on the proportion.
(b)
Excel Output: Data Estimate of True Proportion
0.32
Sampling Error
0.03
Confidence Level
95%
Intermediate Calculations Z Value
-1.9600
Calculated Sample Size
928.7794
Result
Sample Size Needed             929.0000
n = Z²π(1 – π)/e² = (1.96²)(0.32)(1 – 0.32)/0.03² = 928.8   Use n = 929
8.49
(a) Excel Output:
Data
Estimate of True Proportion    0.54
Sampling Error                 0.03
Confidence Level               99%
Intermediate Calculations
Z Value                        -2.5758
Calculated Sample Size         1831.2315
Result
Sample Size Needed             1832.0000
n = Z²π(1 – π)/e² = (2.5758²)(0.54)(1 – 0.54)/0.03² = 1,831.2   Use n = 1,832
(b) Excel Output:
Data
Estimate of True Proportion    0.54
Sampling Error                 0.05
Confidence Level               99%
Intermediate Calculations
Z Value                        -2.5758
Calculated Sample Size         659.2433
Result
Sample Size Needed             660.0000
n = Z²π(1 – π)/e² = (2.5758²)(0.54)(1 – 0.54)/0.05² = 659.2   Use n = 660
(c) A smaller sampling error requires a larger sample size.
8.50
The only way to have 100% confidence is to obtain the parameter of interest, rather than a sample statistic. From another perspective, the range of the normal and t distribution is infinite, so a Z or t value that contains 100% of the area cannot be obtained.
8.51
The t distribution is used for obtaining a confidence interval for the mean when σ is unknown.
8.52
If the confidence level is increased, a greater area under the normal or t distribution needs to be included. This leads to an increased value of Z or t, and thus a wider interval.
8.53
The term π(1 – π) reaches its largest value when the population proportion is at 0.5. Hence, the sample size n = Z²π(1 – π)/e² needed to determine the proportion is smaller when the population proportion is 0.20 than when the population proportion is 0.50.
8.54
(a)
PC/laptop Excel Output:
Copyright ©2024 Pearson Education, Inc.
Solutions to End-of-Section and Chapter Review Problems xli
p 0.84
pZ
p 1 p
0.8173 0.8627
n
0.84 1.96
0.84 1 0.84 1000
Copyright ©2024 Pearson Education, Inc.
xlii Chapter 8: Confidence Interval Estimation 8.54 cont.
(a)
Smartphone Excel Output:
p 0.91 p Z
p 1 p
0.8923 0.9277
n
0.91 1.96
0.911 0.91 1000
Tablet Excel Output:
p 0.5
pZ
p 1 p
0.469 0.5310
n
0.5 1.96
0.5 1 0.5 1000
Smart Watch Excel Output:
p = 0.1
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.1 \pm 1.96\sqrt{\dfrac{0.1(1-0.1)}{1000}}$
0.0814 ≤ π ≤ 0.1186
(b) Most adults have a PC/laptop and a smartphone. Some adults have a tablet computer, and very few have a smart watch.
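The four intervals in (a) all apply the same formula, so they can be checked with a short Python sketch. The function name `proportion_ci` is hypothetical, and the critical value comes from the standard library's normal distribution rather than from a Z table, so the last decimal place may occasionally differ from hand calculations.

```python
import math
from statistics import NormalDist


def proportion_ci(p, n, confidence=0.95):
    """Return (lower, upper) limits of a confidence interval for a proportion."""
    # Two-tailed critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


if __name__ == "__main__":
    devices = {"PC/laptop": 0.84, "Smartphone": 0.91, "Tablet": 0.5, "Smart watch": 0.1}
    for device, p in devices.items():
        lower, upper = proportion_ci(p, n=1000)
        print(f"{device}: {lower:.4f} to {upper:.4f}")
```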
8.55
(a)
Digital Coupons Excel Output:
p = 0.49
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.49 \pm 1.96\sqrt{\dfrac{0.49(1-0.49)}{731}}$
0.4535 ≤ π ≤ 0.5260
Look up recipes Excel Output:
p = 0.485
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.485 \pm 1.96\sqrt{\dfrac{0.485(1-0.485)}{731}}$
0.4494 ≤ π ≤ 0.5219
Read product reviews Excel Output:
p = 0.32
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.32 \pm 1.96\sqrt{\dfrac{0.32(1-0.32)}{731}}$
0.2863 ≤ π ≤ 0.3539
Locate in-store items Excel Output:
p = 0.21
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.21 \pm 1.96\sqrt{\dfrac{0.21(1-0.21)}{731}}$
0.1811 ≤ π ≤ 0.2402
(b) About half of smartphone owners use their phone to access digital coupons or look up recipes while shopping in a grocery store. Fewer smartphone owners use their phone to read product reviews or locate items in the store while shopping in a grocery store.
8.56
Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.
(a) PHStat output:
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation          3.5
Sample Mean                        51
Sample Size                        40
Confidence Level                   95%
Intermediate Calculations
Standard Error of the Mean         0.553398591
Degrees of Freedom                 39
t Value                            2.0227
Interval Half Width                1.1194
Confidence Interval
Interval Lower Limit               49.88
Interval Upper Limit               52.12

49.88 ≤ μ ≤ 52.12
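The intermediate calculations in the PHStat output can be reproduced with a few lines of Python. This is only a sketch: the function name `mean_ci` is hypothetical, and because the Python standard library has no t distribution, the t value (2.0227 for 39 degrees of freedom) is passed in from the output above rather than computed.

```python
import math


def mean_ci(xbar, s, n, t_value):
    """Return (lower, upper) limits of a t-based confidence interval for the mean.

    t_value is the two-tailed critical value for n - 1 degrees of freedom,
    taken from a t table or from software output.
    """
    std_error = s / math.sqrt(n)        # standard error of the mean
    half_width = t_value * std_error    # interval half width
    return xbar - half_width, xbar + half_width


if __name__ == "__main__":
    lower, upper = mean_ci(xbar=51, s=3.5, n=40, t_value=2.0227)
    print(f"{lower:.2f} to {upper:.2f}")
```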
(b) PHStat output:
Confidence Interval Estimate for the Proportion
Data
Sample Size                        40
Number of Successes                32
Confidence Level                   95%
Intermediate Calculations
Sample Proportion                  0.8000
Z Value                            -1.9600
Standard Error of the Proportion   0.0632
Interval Half Width                0.1240
Confidence Interval
Interval Lower Limit               0.6760
Interval Upper Limit               0.9240

0.6760 ≤ π ≤ 0.9240
(c) $n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 5^2}{2^2} = 24.01$   Use n = 25
(d) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.96^2(0.5)(0.5)}{(0.06)^2} = 266.7680$   Use n = 267
(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 267) should be used.
8.57
(a) Excel Output:
$\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 42 \pm 2.680\dfrac{8}{\sqrt{50}}$
38.97 ≤ μ ≤ 45.03
(b) Excel Output:
p = 0.26
$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.26 \pm 1.96\sqrt{\dfrac{0.26(1-0.26)}{50}}$
0.1384 ≤ π ≤ 0.3816
8.58
(a)
PHStat output:
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation          7.3
Sample Mean                        6.2
Sample Size                        25
Confidence Level                   95%
Intermediate Calculations
Standard Error of the Mean         1.46
Degrees of Freedom                 24
t Value                            2.0639
Interval Half Width                3.0133
Confidence Interval
Interval Lower Limit               3.19
Interval Upper Limit               9.21

3.19 ≤ μ ≤ 9.21
(b) PHStat output:
Confidence Interval Estimate for the Proportion
Data
Sample Size                        25
Number of Successes                13
Confidence Level                   95%
Intermediate Calculations
Sample Proportion                  0.52
Z Value                            -1.9600
Standard Error of the Proportion   0.0999
Interval Half Width                0.1958
Confidence Interval
Interval Lower Limit               0.3242
Interval Upper Limit               0.7158

0.3242 ≤ π ≤ 0.7158
(c) $n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 8^2}{1.5^2} = 109.2682$   Use n = 110
(d) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.645^2(0.5)(0.5)}{(0.075)^2} = 120.268$   Use n = 121
(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 121) should be used.
8.59
Note: All the answers are computed using PHStat. Answers computed otherwise may be slightly different due to rounding.
(a) PHStat output:
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation          1000
Sample Mean                        12500
Sample Size                        100
Confidence Level                   95%
Intermediate Calculations
Standard Error of the Mean         100
Degrees of Freedom                 99
t Value                            1.9842
Interval Half Width                198.4217
Confidence Interval
Interval Lower Limit               12301.58
Interval Upper Limit               12698.42

$12,301.58 ≤ μ ≤ $12,698.42
(b) PHStat output:
Confidence Interval Estimate for the Proportion
Data
Sample Size                        100
Number of Successes                30
Confidence Level                   95%
Intermediate Calculations
Sample Proportion                  0.3000
Z Value                            -1.9600
Standard Error of the Proportion   0.0458
Interval Half Width                0.0898
Confidence Interval
Interval Lower Limit               0.2102
Interval Upper Limit               0.3898

0.2102 ≤ π ≤ 0.3898
(c) $n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{2.58^2 \cdot 1000^2}{250^2} = 106.1583$ (computed with the unrounded Z value of 2.5758)   Use n = 107
(d) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.645^2(0.3)(1-0.3)}{(0.045)^2} = 280.5749$   Use n = 281
If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 281) should be used.
8.60
(a) $p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.31 \pm 1.645\sqrt{\dfrac{0.31(0.69)}{200}}$   0.2562 ≤ π ≤ 0.3638
(b) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 3.5 \pm 1.9720\dfrac{2}{\sqrt{200}}$   3.22 ≤ μ ≤ 3.78
(c) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 18000 \pm 1.9720\dfrac{3000}{\sqrt{200}}$   $17,581.68 ≤ μ ≤ $18,418.32
8.61
(a) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 21.34 \pm 1.9949\dfrac{9.22}{\sqrt{70}}$   $19.14 ≤ μ ≤ $23.54
(b) $p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.3714 \pm 1.645\sqrt{\dfrac{0.3714(0.6286)}{70}}$   0.2764 ≤ π ≤ 0.4664
(c) $n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 10^2}{1.5^2} = 170.74$   Use n = 171
(d) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.645^2(0.5)(0.5)}{(0.045)^2} = 334.08$   Use n = 335
(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 335) should be used.
8.62
(a) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 38.54 \pm 2.0010\dfrac{7.26}{\sqrt{60}}$   $36.66 ≤ μ ≤ $40.42
(b) $p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.30 \pm 1.645\sqrt{\dfrac{0.30(0.70)}{60}}$   0.2027 ≤ π ≤ 0.3973
(c) $n = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 8^2}{1.5^2} = 109.27$   Use n = 110
(d) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.645^2(0.5)(0.5)}{(0.04)^2} = 422.82$   Use n = 423
(e) If a single sample were to be selected for both purposes, the larger of the two sample sizes (n = 423) should be used.
8.63
(a) $n = \dfrac{Z^2\pi(1-\pi)}{e^2} = \dfrac{1.96^2(0.5)(0.5)}{(0.05)^2} = 384.16$   Use n = 385
If we assume that the population proportion is only 0.50, then a sample of 385 would be required. If the population proportion is 0.90, the sample size required is cut to 139 (138.30 rounded up).
(b) $p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.84 \pm 1.96\sqrt{\dfrac{0.84(0.16)}{50}}$   0.7384 ≤ π ≤ 0.9416
(c) The representative can be 95% confident that the actual proportion of bags that will do the job is between 73.84% and 94.16%. He/she can accordingly perform a cost-benefit analysis to decide if he/she wants to sell the Ice Melt product.
8.64
(a)
Confidence Interval Estimate for the Proportion
Data
Sample Size                        90
Number of Successes                51
Confidence Level                   95%
Intermediate Calculations
Sample Proportion                  0.566666667
Z Value                            -1.9600
Standard Error of the Proportion   0.0522
Interval Half Width                0.1024
Confidence Interval
Interval Lower Limit               0.4643
Interval Upper Limit               0.6690

$p \pm Z\sqrt{\dfrac{p(1-p)}{n}} = 0.5667 \pm 1.96\sqrt{\dfrac{0.5667(1-0.5667)}{90}}$
0.4643 ≤ π ≤ 0.6690
(b)
Confidence Interval Estimate for the Mean
Data
Sample Standard Deviation          1103.6491
Sample Mean                        563.38
Sample Size                        51
Confidence Level                   95%
Intermediate Calculations
Standard Error of the Mean         154.5417855
Degrees of Freedom                 50
t Value                            2.0086
Interval Half Width                310.4063
Confidence Interval
Interval Lower Limit               252.97
Interval Upper Limit               873.79

$\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 563.38 \pm 2.0086\dfrac{1103.6491}{\sqrt{51}}$
$252.97 ≤ μ ≤ $873.79
8.65
(a) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 5.5014 \pm 2.6800\dfrac{0.1058}{\sqrt{50}}$   5.46 ≤ μ ≤ 5.54
(b) Since 5.5 grams is within the 99% confidence interval, the company can claim that the mean weight of tea in a bag is 5.5 grams with a 99% level of confidence.
(c) The assumption is valid as the weight of the tea bags is approximately normally distributed.
8.66
(a) MiniTab Output
13.4001 ≤ μ ≤ 16.559
With 95% confidence, the population mean answer time is somewhere between 13.40 and 16.56 seconds.
(c) The assumption is valid as the answer time is approximately normally distributed.
8.67
(a) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 3124.2147 \pm 1.9665\dfrac{34.713}{\sqrt{368}}$   3120.66 ≤ μ ≤ 3127.77
(b) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 3704.0424 \pm 1.9672\dfrac{46.7443}{\sqrt{330}}$   3698.98 ≤ μ ≤ 3709.10
(c)
Normal Probability Plot: Boston (horizontal axis: Z Value)
Normal Probability Plot: Vermont (horizontal axis: Z Value)
(d) The weight for Boston shingles is slightly skewed to the right while the weight for Vermont shingles appears to be slightly skewed to the left. Since the two confidence intervals do not overlap, the mean weight of Vermont shingles is greater than the mean weight of Boston shingles.
8.68
(a) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 0.2641 \pm 1.9741\dfrac{0.1424}{\sqrt{170}}$   0.2425 ≤ μ ≤ 0.2856
(b) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 0.218 \pm 1.9772\dfrac{0.1227}{\sqrt{140}}$   0.1975 ≤ μ ≤ 0.2385
(c)
Normal Probability Plot: Vermont (horizontal axis: Z Value)
Normal Probability Plot: Boston (horizontal axis: Z Value)
(d) The amount of granule loss for both brands is skewed to the right, but the sample sizes are large enough that the violation of the normality assumption is not critical. Because the two confidence intervals do not overlap, you can conclude that the mean granule loss of Boston shingles is higher than that of Vermont shingles.
8.69
Report Writing Exercise answers will vary. An example of a report would be as follows: One can conclude with 95% confidence that the mean time for a human agent to answer a call to the financial service center is between 13.401 and 16.559 seconds. The validity of this confidence interval estimate depends on the assumption that the answer time is normally distributed. From the box plot, the answer time appears approximately symmetric, so the validity of the confidence interval is not in serious doubt.
Chapter 9
9.1
Decision rule: Reject H0 if Z STAT < –1.65 or Z STAT > +1.65. Decision: Since ZSTAT = –1.76 is less than the lower critical value of –1.65, reject H0.
9.2
Decision rule: Reject H0 if Z STAT < –1.96 or Z STAT > +1.96. Decision: Since ZSTAT = +2.21 is greater than the upper critical value of + 1.96, reject H0.
9.3
Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58.
9.4
Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58.
9.5
Decision rule: Reject H0 if Z STAT < –2.58 or Z STAT > +2.58. Decision: Since ZSTAT = –2.51 is between the two critical values, do not reject H0.
9.6
For a two-tail hypothesis test where ZSTAT = +2.00, p-value = 2(1 – 0.9772) = 0.0456
9.7
Since the p-value of 0.0456 is less than the 0.05 level of significance, the statistical decision is to reject the null hypothesis.
9.8
For two-tail hypothesis test where ZSTAT = –1.38, p-value = 0.0838 + (1 – 0.9162) = 0.1676
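The two-tail p-values in Problems 9.6 and 9.8 can be checked with a short Python sketch. The function name `two_tail_p_value` is hypothetical. The solutions above use cumulative areas rounded to four decimals from the Z table, so software values can differ in the fourth decimal place (for ZSTAT = +2.00 the unrounded p-value is about 0.0455 rather than 0.0456).

```python
from statistics import NormalDist


def two_tail_p_value(z_stat):
    """Return the two-tail p-value for a Z test statistic."""
    # Area beyond |z_stat| in the upper tail, doubled for the two tails.
    upper_tail = 1 - NormalDist().cdf(abs(z_stat))
    return 2 * upper_tail


if __name__ == "__main__":
    for z in (2.00, -1.38):
        print(f"ZSTAT = {z:+.2f}: p-value = {two_tail_p_value(z):.4f}")
```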
9.9
α is the probability of incorrectly convicting the defendant when he is innocent. β is the probability of incorrectly failing to convict the defendant when he is guilty.
9.10
Under the French judicial system, unlike that of the United States, the null hypothesis assumes the defendant is guilty and the alternative hypothesis assumes the defendant is innocent. A Type I error would be not convicting a guilty person, and a Type II error would be convicting an innocent person.
9.11
(a)
A Type I error is the mistake of approving an unsafe drug. A Type II error is not approving a safe drug.
(b)
The consumer groups are trying to avoid a Type I error.
(c)
The industry lobbyists are trying to avoid a Type II error.
(d)
To lower both Type I and Type II errors, the FDA can require more information and evidence in the form of more rigorous testing. This can easily translate into a longer time to approve a new drug.
9.12
H0: μ = 20 minutes. 20 minutes is adequate travel time between classes.
H1: μ ≠ 20 minutes. 20 minutes is not adequate travel time between classes.
9.13
(a)
H0: μ = 13 hours
H1: μ ≠ 13 hours
(b)
A Type I error is the mistake of concluding that the mean number of hours spent by business seniors at your school is different from the 13-hour-per-week benchmark reported by The National Survey of Student Engagement when in fact it is not any different.
(c)
A Type II error is the mistake of not concluding that the mean number of hours spent by business seniors at your school is different from the 13-hour-per-week benchmark reported by The National Survey of Student Engagement when it is in fact different.
9.14
(a)
Minitab Output:
H0: μ = 50,000. The mean life of a large shipment of LEDs is equal to 50,000 hours.
H1: μ ≠ 50,000. The mean life of a large shipment of LEDs differs from 50,000 hours.
Decision rule: Reject H0 if |ZSTAT| > 1.96
Test statistic: $Z_{STAT} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = -0.67$
Decision: Since –1.96 < ZSTAT = –0.67 < 1.96, do not reject H0. There is not enough evidence to conclude that the mean life of a large shipment of LEDs differs from 50,000 hours.
(b)
p-value = 0.505. If the population mean life of a large shipment of LEDs is indeed equal to 50,000 hours, there is a 50.5% chance of observing a test statistic at least as contradictory to the null hypothesis as the sample result.
(c)
$\bar{X} \pm Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 49875 \pm 1.96\dfrac{1500}{\sqrt{64}}$
49508 ≤ μ ≤ 50242
(d)
Because the interval includes the hypothesized value of 50,000 hours, you do not reject the null hypothesis. There is insufficient evidence that the mean life of a large shipment of LEDs differs from 50,000 hours. The same decision was reached using the two-tailed hypothesis test.
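Parts (a) through (d) of Problem 9.14 illustrate that a two-tail Z test at level α and the matching (1 – α) confidence interval lead to the same decision. A minimal Python sketch of that link follows; the function name `z_test_mean` is hypothetical, and the inputs are the values shown in part (c).

```python
import math
from statistics import NormalDist


def z_test_mean(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tail Z test for the mean (sigma known) plus the matching interval.

    Returns (z_stat, p_value, (lower, upper), reject).
    """
    std_error = sigma / math.sqrt(n)
    z_stat = (xbar - mu0) / std_error
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)       # about 1.96 for alpha = 0.05
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    interval = (xbar - z_crit * std_error, xbar + z_crit * std_error)
    reject = abs(z_stat) > z_crit
    return z_stat, p_value, interval, reject


if __name__ == "__main__":
    z_stat, p_value, (lower, upper), reject = z_test_mean(49875, 50000, 1500, 64)
    print(f"ZSTAT = {z_stat:.2f}, p-value = {p_value:.3f}")
    print(f"{lower:.0f} to {upper:.0f}, reject H0: {reject}")
    # The test rejects H0 exactly when the interval excludes the hypothesized mean.
    print("decisions agree:", reject == (not lower <= 50000 <= upper))
```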
9.15
(a)
PHStat Output:
Z Test of Hypothesis for the Mean
Data
Null Hypothesis μ=                 50000
Level of Significance              0.05
Population Standard Deviation      1000
Sample Size                        64
Sample Mean                        49875
Intermediate Calculations
Standard Error of the Mean         125.0000
Z Test Statistic                   -1.0000
Two-Tail Test
Lower Critical Value               -1.9600
Upper Critical Value               1.9600
p-Value                            0.3173
Do not reject the null hypothesis

Confidence Interval Estimate for the Mean
Data
Population Standard Deviation      1000
Sample Mean                        49875
Sample Size                        64
Confidence Level                   95%
Intermediate Calculations
Standard Error of the Mean         125.0000
Z Value                            -1.9600
Interval Half Width                244.9955
Confidence Interval
Interval Lower Limit               49630.00
Interval Upper Limit               50120.00

(a) H0: μ = 50,000. The mean life of a large shipment of LEDs is equal to 50,000 hours.
H1: μ ≠ 50,000. The mean life of a large shipment of LEDs differs from 50,000 hours.
Decision rule: Reject H0 if |ZSTAT| > 1.96
Test statistic: $Z_{STAT} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{49{,}875 - 50{,}000}{1{,}000/\sqrt{64}} = -1.00$
Decision: Since –1.96 < ZSTAT = –1.00 < 1.96, do not reject H0. There is not enough evidence to conclude that the mean life of a large shipment of LEDs differs from 50,000 hours.
(b) p-value = 0.3173. If the population mean life of a large shipment of LEDs is indeed equal to 50,000 hours, there is a 31.73% chance of observing a test statistic at least as contradictory to the null hypothesis as the sample result.
(c) $\bar{X} \pm Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 49{,}875 \pm 1.96\dfrac{1{,}000}{\sqrt{64}}$
49,630 ≤ μ ≤ 50,120
(d) Because the interval includes the hypothesized value of 50,000 hours, you do not reject the null hypothesis. There is insufficient evidence that the mean life of a large shipment of LEDs differs from 50,000 hours. The same decision was reached using the two-tailed hypothesis test.
(b)
Comparing the results to (a) of problem 9.14, the smaller standard deviation does not result in a rejection of the null hypothesis.
9.16
(a)
PHStat output:
Data
Null Hypothesis μ=                 1
Level of Significance              0.01
Population Standard Deviation      0.02
Sample Size                        50
Sample Mean                        0.995
Intermediate Calculations
Standard Error of the Mean         0.002828427
Z Test Statistic                   –1.767766953
Two-Tail Test
Lower Critical Value               –2.575829304
Upper Critical Value               2.575829304
p-Value                            0.077099872
Do not reject the null hypothesis

H0: μ = 1. The mean amount of water is 1 gallon.
H1: μ ≠ 1. The mean amount of water differs from 1 gallon.
Decision rule: Reject H0 if |ZSTAT| > 2.5758
Test statistic: $Z_{STAT} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{0.995 - 1}{0.02/\sqrt{50}} = -1.7678$
Decision: Since |ZSTAT| < 2.5758, do not reject H0. There is not enough evidence to conclude that the mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is different from 1 gallon.
(b)
p-value = 0.0771. If the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is actually 1 gallon, the probability of obtaining a test statistic that is more than 1.7678 standard error units away from 0 is 0.0771.
(c)
PHStat output:
Data
Population Standard Deviation      0.02
Sample Mean                        0.995
Sample Size                        50
Confidence Level                   99%
Intermediate Calculations
Standard Error of the Mean         0.002828427
Z Value                            –2.5758293
Interval Half Width                0.007285545
Confidence Interval
Interval Lower Limit               0.987714455
Interval Upper Limit               1.002285545

$\bar{X} \pm Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 0.995 \pm 2.5758\dfrac{0.02}{\sqrt{50}}$
0.9877 ≤ μ ≤ 1.0023
You are 99% confident that the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is somewhere between 0.9877 and 1.0023 gallons.
(d)
Since the 99% confidence interval does contain the hypothesized value of 1, you will not reject H 0 . The conclusions are the same.
9.17
(a)
PHStat output:
Z Test of Hypothesis for the Mean
Data
Null Hypothesis μ=                 1
Level of Significance              0.01
Population Standard Deviation      0.015
Sample Size                        50
Sample Mean                        0.995
Intermediate Calculations
Standard Error of the Mean         0.0021
Z Test Statistic                   -2.3570
Two-Tail Test
Lower Critical Value               -2.5758
Upper Critical Value               2.5758
p-Value                            0.0184
Do not reject the null hypothesis

Confidence Interval Estimate for the Mean
Data
Population Standard Deviation      0.015
Sample Mean                        0.995
Sample Size                        50
Confidence Level                   99%
Intermediate Calculations
Standard Error of the Mean         0.0021
Z Value                            -2.5758
Interval Half Width                0.0055
Confidence Interval
Interval Lower Limit               0.9895
Interval Upper Limit               1.0005

(a) H0: μ = 1. The mean amount of water is 1 gallon.
H1: μ ≠ 1. The mean amount of water differs from 1 gallon.
Decision rule: Reject H0 if |ZSTAT| > 2.5758
Test statistic: $Z_{STAT} = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \dfrac{0.995 - 1}{0.015/\sqrt{50}} = -2.3570$
Decision: Since |ZSTAT| < 2.5758, do not reject H0. There is not enough evidence to conclude that the mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is different from 1 gallon.
(b) p-value = 0.0184. If the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is actually 1 gallon, the probability of obtaining a test statistic that is more than 2.3570 standard error units away from 0 is 0.0184.
(c) $\bar{X} \pm Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} = 0.995 \pm 2.5758\dfrac{0.015}{\sqrt{50}}$
0.9895 ≤ μ ≤ 1.0005
You are 99% confident that the population mean amount of water contained in 1-gallon bottles purchased from a nationally known water bottling company is somewhere between 0.9895 and 1.0005 gallons.
(d) Since the 99% confidence interval contains the hypothesized value of 1 gallon, you will not reject H0. The conclusions are the same.
(b)
The smaller population standard deviation results in a smaller standard error of the Z test and, hence, a smaller p-value. You do not reject H0 in either Problem 9.16 or Problem 9.17.
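The effect of the smaller population standard deviation can be seen by running the same calculation with σ = 0.02 (Problem 9.16) and σ = 0.015 (Problem 9.17). The sketch below uses a hypothetical helper, `z_stat_and_p_value`, built on the Python standard library.

```python
import math
from statistics import NormalDist


def z_stat_and_p_value(xbar, mu0, sigma, n):
    """Return (ZSTAT, two-tail p-value) for a Z test of the mean."""
    z_stat = (xbar - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * NormalDist().cdf(-abs(z_stat))
    return z_stat, p_value


if __name__ == "__main__":
    z_crit = NormalDist().inv_cdf(1 - 0.01 / 2)  # about 2.5758 at the 0.01 level
    for sigma in (0.02, 0.015):
        z_stat, p_value = z_stat_and_p_value(0.995, 1, sigma, 50)
        decision = "reject" if abs(z_stat) > z_crit else "do not reject"
        print(f"sigma = {sigma}: ZSTAT = {z_stat:.4f}, p-value = {p_value:.4f}, {decision} H0")
```

With the same sample mean and sample size, shrinking σ moves ZSTAT further from 0 and lowers the p-value, but at the 0.01 level neither value crosses the critical value.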
9.18
$t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{56 - 50}{12/\sqrt{20}} = 2.2361$
9.19
d.f. = n – 1 = 20 – 1 = 19
9.20
For a two-tailed t test with a 0.05 level of significance and 19 degrees of freedom, the critical values are ±2.0930.
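Problems 9.18 through 9.20 supply the three pieces of a two-tail t test: the test statistic, the degrees of freedom, and the critical values. A short Python sketch ties them together. The function name `t_statistic` is hypothetical, and the critical value 2.0930 is taken from the answer to Problem 9.20 rather than computed, because the Python standard library has no t distribution.

```python
import math


def t_statistic(xbar, mu0, s, n):
    """Return (tSTAT, degrees of freedom) for a one-sample t test of the mean."""
    std_error = s / math.sqrt(n)
    return (xbar - mu0) / std_error, n - 1


if __name__ == "__main__":
    t_stat, df = t_statistic(xbar=56, mu0=50, s=12, n=20)
    t_crit = 2.0930  # two-tail critical value for 19 d.f. at the 0.05 level
    decision = "reject H0" if abs(t_stat) > t_crit else "do not reject H0"
    print(f"tSTAT = {t_stat:.4f}, d.f. = {df}, {decision}")
```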
9.21
Excel Output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 50
Level of Significance              0.05
Sample Size                        20
Sample Mean                        56
Sample Standard Deviation          12
Intermediate Calculations
Standard Error of the Mean         2.6833
Degrees of Freedom                 19
t Test Statistic                   2.2361
Two-Tail Test
Lower Critical Value               -2.0930
Upper Critical Value               2.0930
p-Value                            0.0375
Reject the null hypothesis

H0: μ = 50
H1: μ ≠ 50
Decision rule: Reject H0 if tSTAT > 2.0930 or tSTAT < –2.0930 or p-value < 0.05   d.f. = 19
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{56 - 50}{12/\sqrt{20}} = 2.2361$
p-value = 0.0375
Decision: Since tSTAT = 2.2361 > 2.0930 and the p-value of 0.0375 < 0.05, reject H0. There is enough evidence to conclude that the mean amount is different from 50.
9.22
No, you should not use the t test to test the null hypothesis that = 60 on a population that is left-skewed because the sample size (n = 16) is less than 30. The t test assumes that, if the underlying population is not normally distributed, the sample size is sufficiently large to enable the test to be valid. If sample sizes are small (n < 30), the t test should not be used because the sampling distribution does not meet the requirements of the Central Limit Theorem.
9.23
Yes, you may use the t test to test the null hypothesis that = 60 even though the population is left-skewed because the sample size is sufficiently large (n = 160). The t test assumes that, if the underlying population is not normally distributed, the sample size is sufficiently large to enable the test statistic t to be influenced by the Central Limit Theorem.
9.24
PHStat output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 3.7
Level of Significance              0.05
Sample Size                        64
Sample Mean                        3.57
Sample Standard Deviation          0.8
Intermediate Calculations
Standard Error of the Mean         0.1
Degrees of Freedom                 63
t Test Statistic                   –1.3
Two-Tail Test
Lower Critical Value               –1.9983405
Upper Critical Value               1.9983405
p-Value                            0.1983372
Do not reject the null hypothesis

(a)
H0: μ = 3.7
H1: μ ≠ 3.7
Decision rule: Reject H0 if |tSTAT| > 1.9983   d.f. = 63
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3.57 - 3.7}{0.8/\sqrt{64}} = -1.3$
Decision: Since |tSTAT| < 1.9983, do not reject H0. There is not enough evidence to conclude that the population mean waiting time is different from 3.7 minutes at the 0.05 level of significance.
(b)
The sample size of 64 is large enough to apply the Central Limit Theorem; hence, you do not need to be concerned about the shape of the population distribution when conducting the t test in (a). In general, the t test is appropriate for this sample size except for the case where the population is extremely skewed or bimodal.
9.25
Excel Output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 8.2
Level of Significance              0.05
Sample Size                        50
Sample Mean                        8.159
Sample Standard Deviation          0.051
Intermediate Calculations
Standard Error of the Mean         0.0072
Degrees of Freedom                 49
t Test Statistic                   -5.6846
Two-Tail Test
Lower Critical Value               -2.0096
Upper Critical Value               2.0096
p-Value                            0.0000
Reject the null hypothesis

(a)
H0: μ = 8.20
H1: μ ≠ 8.20
Decision rule: Reject H0 if |tSTAT| > 2.0096 or p-value < 0.05   d.f. = 49
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{8.159 - 8.20}{0.051/\sqrt{50}} = -5.6846$
p-value = 0.0000
Decision: Since |tSTAT| > 2.0096 and the p-value of 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the mean amount is different from 8.20 ounces.
(b)
The p-value is 0.0000 to four decimal places. If the population mean is indeed 8.20 ounces, the probability of obtaining a sample mean that is more than 0.041 ounces away from 8.20 ounces is virtually 0.
9.26
PHStat output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 1475
Level of Significance              0.05
Sample Size                        100
Sample Mean                        1500
Sample Standard Deviation          200
Intermediate Calculations
Standard Error of the Mean         20.0000
Degrees of Freedom                 99
t Test Statistic                   1.2500
Two-Tail Test
Lower Critical Value               -1.9842
Upper Critical Value               1.9842
p-Value                            0.2142
Do not reject the null hypothesis

(a)
H0: μ = $1,475
H1: μ ≠ $1,475
Decision rule: Reject H0 if p-value < 0.05
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = 1.2500$
p-value = 0.2142
Decision: Since the p-value of 0.2142 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean amount spent on Amazon.com by Amazon Prime member shoppers is different from $1,475.
(b)
The p-value is 0.2142. If the population mean is indeed $1,475, the probability of obtaining a test statistic that is more than 1.25 standard error units away from 0 in either direction is 0.2142.
9.27
Excel Output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 200
Level of Significance              0.05
Sample Size                        28
Sample Mean                        198.8
Sample Standard Deviation          21.4
Intermediate Calculations
Standard Error of the Mean         4.0442
Degrees of Freedom                 27
t Test Statistic                   -0.2967
Two-Tail Test
Lower Critical Value               -2.0518
Upper Critical Value               2.0518
p-Value                            0.7690
Do not reject the null hypothesis

(a)
H0: μ = 200
H1: μ ≠ 200
Decision rule: Reject H0 if |tSTAT| > 2.0518 or p-value < 0.05
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{198.8 - 200}{21.4/\sqrt{28}} = -0.2967$
p-value = 0.7690
Decision: Since |tSTAT| < 2.0518 and the p-value of 0.7690 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean tread wear index is different from 200.
(b)
The p-value is 0.7690. If the population mean is indeed 200, the probability of obtaining a test statistic that is more than 0.2967 standard error units away from 0 in either direction is 0.7690.
9.28
PHStat output:
t Test for Hypothesis of the Mean
Data
Null Hypothesis μ=                 6.5
Level of Significance              0.05
Sample Size                        15
Sample Mean                        7.09
Sample Standard Deviation          1.406031226
Intermediate Calculations
Standard Error of the Mean         0.3630
Degrees of Freedom                 14
t Test Statistic                   1.6344
Two-Tail Test
Lower Critical Value               -2.1448
Upper Critical Value               2.1448
p-Value                            0.1245
Do not reject the null hypothesis

(a)
H0: μ = $6.50
H1: μ ≠ $6.50
Decision rule: Reject H0 if |tSTAT| > 2.1448 or p-value < 0.05
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = 1.6344$
Decision: Since |tSTAT| < 2.1448, do not reject H0. There is not enough evidence to conclude that the mean amount spent for lunch is different from $6.50.
(b)
The p-value is 0.1245. If the population mean is indeed $6.50, the probability of obtaining a test statistic that is more than 1.6344 standard error units away from 0 in either direction is 0.1245.
(c)
The assumption is that the amount spent on lunch is normally distributed.
(d)
With a sample size of 15, it is difficult to evaluate the assumption of normality. However, the distribution may be fairly symmetric because the mean and the median are close in value. Also, the boxplot appears only slightly skewed, so the normality assumption does not appear to be seriously violated.
9.29
(a)
Minitab Output
H0: μ = 45
H1: μ ≠ 45
Decision rule: Reject H0 if |tSTAT| > 2.0555   d.f. = 26
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{45.22 - 45}{23.15/\sqrt{27}} = 0.05$
Decision: Since |tSTAT| < 2.0555, do not reject H0. There is not enough evidence to conclude that the mean processing time has changed from 45 days.
(b)
The population distribution needs to be normal.
(c)
(d)
The mean is close to the median, and the points on the normal probability plot appear to be increasing approximately in a straight line. The boxplot appears to be approximately symmetrical. Thus, you can assume that the population of processing times is approximately normally distributed. The assumption needed to conduct the t test is valid.
9.30
(a)
H0: μ = 2
H1: μ ≠ 2
Decision rule: Reject H0 if |tSTAT| > 2.0096   d.f. = 49
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{2.0007 - 2}{0.0446/\sqrt{50}} = 0.1143$
Decision: Since |tSTAT| < 2.0096, do not reject H0. There is not enough evidence to conclude that the mean amount of soft drink filled is different from 2.0 liters.
(b)
p-value = 0.9095. If the population mean amount of soft drink filled is indeed 2.0 liters, the probability of observing a sample of 50 soft drinks whose sample mean differs from 2.0 liters by at least as much as the observed sample mean is 0.9095.
(c)
Normal Probability Plot: Amount (horizontal axis: Z Value)
(d)
The normal probability plot suggests that the data are approximately normally distributed. Hence, the results in (a) are valid in terms of the normality assumption.
(e)
Time Series Plot: Amount
The time series plot of the data reveals that there is a downward trend in the amount of soft drink filled. This violates the assumption that data are drawn independently from a normal population distribution because the amount of fill in consecutive bottles appears to be closely related. As a result, the t test in (a) becomes invalid.
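A trend of the kind described in (e) can also be checked numerically by fitting a least-squares line to the fill amounts in the order they were produced. The sketch below is illustrative only: the function name `trend_slope` is hypothetical, and the data are made-up values with a built-in downward drift, not the 50 observations used in this problem.

```python
def trend_slope(values):
    """Return the least-squares slope of values against observation order 1..n."""
    n = len(values)
    order = range(1, n + 1)
    x_mean = (n + 1) / 2
    y_mean = sum(values) / n
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(order, values))
    sxx = sum((x - x_mean) ** 2 for x in order)
    return sxy / sxx


if __name__ == "__main__":
    # Hypothetical fill amounts (liters) that drift downward by 0.005 per bottle.
    fills = [2.10 - 0.005 * i for i in range(50)]
    print(f"slope per bottle: {trend_slope(fills):.4f}")
```

A slope clearly different from 0 suggests that consecutive observations are related, which is the independence problem the time series plot reveals. It is a descriptive check, not a formal test.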
9.31
(a)
PHStat output:
Data
Null Hypothesis μ=                 20
Level of Significance              0.05
Sample Size                        50
Sample Mean                        43.04
Sample Standard Deviation          41.92605736
Intermediate Calculations
Standard Error of the Mean         5.929239893
Degrees of Freedom                 49
t Test Statistic                   3.885826921
Two-Tail Test
Lower Critical Value               –2.009575199
Upper Critical Value               2.009575199
p-Value                            0.000306263
Reject the null hypothesis

H0: μ = 20
H1: μ ≠ 20
Decision rule: Reject H0 if |tSTAT| > 2.0096   d.f. = 49
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{43.04 - 20}{41.9261/\sqrt{50}} = 3.8858$
Decision: Since tSTAT > 2.0096, reject H0. There is enough evidence to conclude that the mean number of days is different from 20.
(b)
The population distribution needs to be normal.
(c)
Normal Probability Plot: Days (horizontal axis: Z Value)
(d)
The normal probability plot indicates that the distribution is skewed to the right. Even though the population distribution is probably not normally distributed, the result obtained in (a) should still be valid due to the Central Limit Theorem as a result of the relatively large sample size of 50.
9.32
(a)
H0: μ = 8.46
H1: μ ≠ 8.46
Decision rule: Reject H0 if |tSTAT| > 2.0106   d.f. = 48
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{8.4209 - 8.46}{0.0461/\sqrt{49}} = -5.9355$
Decision: Since |tSTAT| > 2.0106, reject H0. There is enough evidence to conclude that the mean width of the troughs is different from 8.46 inches.
(b)
The population distribution needs to be normal.
(c)
Normal Probability Plot: Width (horizontal axis: Z Value)
Box-and-whisker Plot: Width
(d)
The normal probability plot and the boxplot indicate that the distribution is skewed to the left. Even though the population distribution is not normally distributed, the result obtained in (a) should still be valid due to the Central Limit Theorem as a result of the relatively large sample size of 49.
9.33
(a)
H0: μ = 0
H1: μ ≠ 0
Decision rule: Reject H0 if |tSTAT| > 1.9842   d.f. = 99
Test statistic: $t_{STAT} = \dfrac{\bar{X} - 0}{S/\sqrt{n}} = \dfrac{-0.00023}{0.00170/\sqrt{100}} = -1.3563$
Decision: Since |tSTAT| < 1.9842, do not reject H0. There is not enough evidence to conclude that the mean difference is different from 0.0 inches.
(b)
$\bar{X} \pm t\dfrac{S}{\sqrt{n}} = -0.00023 \pm 1.9842\dfrac{0.001696}{\sqrt{100}}$
–0.0005665 ≤ μ ≤ 0.0001065
You are 95% confident that the mean difference is somewhere between –0.0005665 and 0.0001065 inches. Since the 95% confidence interval contains 0, you do not reject the null hypothesis in part (a). Hence, you will make the same decision and arrive at the same conclusion as in (a).
(c)
In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample size is 100, which is considered quite large, the t distribution will provide a good approximation to the sampling distribution of the mean as long as the population distribution is not very skewed.
(d)
Box-and-whisker Plot: Error
The boxplot suggests that the data have a distribution that is skewed slightly to the right. Given the relatively large sample size of 100 observations, the t distribution should still provide a good approximation to the sampling distribution of the mean.
9.34
(a) H0: μ = 5.5
H1: μ ≠ 5.5
Decision rule: Reject H0 if |tSTAT| > 2.680, d.f. = 49.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{5.5014 - 5.5}{0.1058/\sqrt{50}} = 0.0935$
Decision: Since |tSTAT| < 2.680, do not reject H0. There is not enough evidence to conclude that the mean amount of tea per bag is different from 5.5 grams.
(b) $\bar{X} \pm t\dfrac{S}{\sqrt{n}} = 5.5014 \pm 2.6800\left(\dfrac{0.1058}{\sqrt{50}}\right)$, so $5.46 < \mu < 5.54$
With 99% confidence, you can conclude that the population mean amount of tea per bag is somewhere between 5.46 and 5.54 grams.
(c) The conclusions are the same.
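The arithmetic in 9.34 can be checked with a short Python sketch. It is only an illustration: the function names are hypothetical, and the critical value 2.6800 is copied from the solution rather than computed, since the Python standard library has no t distribution.

```python
import math


def one_sample_t(xbar, mu0, s, n):
    """Return (standard error, t statistic) from summary statistics."""
    se = s / math.sqrt(n)
    return se, (xbar - mu0) / se


def mean_confidence_interval(xbar, se, t_crit):
    """Return the (lower, upper) limits xbar -/+ t_crit * se."""
    half_width = t_crit * se
    return xbar - half_width, xbar + half_width


# Summary values taken from the 9.34 solution
xbar, mu0, s, n = 5.5014, 5.5, 0.1058, 50
t_crit = 2.6800  # two-tail critical value for alpha = 0.01, d.f. = 49 (from the solution)

se, t_stat = one_sample_t(xbar, mu0, s, n)
lower, upper = mean_confidence_interval(xbar, se, t_crit)
reject_h0 = abs(t_stat) > t_crit

print(f"tSTAT = {t_stat:.4f}, reject H0: {reject_h0}")
print(f"99% CI: {lower:.2f} to {upper:.2f}")
```

The hypothesized value 5.5 lies inside the interval exactly when |tSTAT| does not exceed the critical value, which is why parts (a) and (b) lead to the same conclusion.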
9.35
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             121
Level of Significance           0.05
Sample Size                     30
Sample Mean                     115.4
Sample Standard Deviation       56.7089

Intermediate Calculations
Standard Error of the Mean      10.3536
Degrees of Freedom              29
t Test Statistic                -0.5409

Two-Tail Test
Lower Critical Value            -2.0452
Upper Critical Value            2.0452
p-Value                         0.5927
Do not reject the null hypothesis

H0: μ = 121
H1: μ ≠ 121
Decision rule: Reject H0 if p-value < 0.05.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{115.4 - 121}{56.7089/\sqrt{30}} = -0.5409$
Decision: Since p-value = 0.5927 > 0.05, do not reject H0. There is no evidence to conclude that the population mean time spent per day accessing the internet is different from 121 minutes.
(b) The population distribution needs to be normal.
(c) You could construct charts and observe their appearance. For small- or moderate-sized data sets, construct a stem-and-leaf display or a boxplot. For large data sets, plot a histogram or polygon. You could also compute descriptive numerical measures and compare the characteristics of the data with the theoretical properties of the normal distribution. Compare the mean and median and see if the interquartile range is approximately 1.33 times the standard deviation and if the range is approximately 6 times the standard deviation. Evaluate how the values in the data are distributed. Determine whether approximately two-thirds of the values lie within ±1 standard deviation of the mean. Determine whether approximately four-fifths of the values lie within ±1.28 standard deviations of the mean. Determine whether approximately 19 out of every 20 values lie within ±2 standard deviations of the mean.
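The numerical checks described in (c) can be sketched in Python using the summary statistics reported for part (d). This is an informal illustration, not a formal normality test; the variable names are hypothetical.

```python
# Summary statistics reported for the Minutes variable in part (d)
mean, median = 115.4, 137.5
std_dev = 56.7089
q1, q3 = 59, 166
minimum, maximum = 4, 189

iqr = q3 - q1                    # interquartile range
data_range = maximum - minimum   # range

# For a normal distribution, IQR is about 1.33 S and the range is about 6 S
iqr_ratio = iqr / std_dev
range_ratio = data_range / std_dev

# A mean well below the median is consistent with left skewness
mean_below_median = mean < median

print(f"IQR / S   = {iqr_ratio:.3f} (normal: about 1.33)")
print(f"range / S = {range_ratio:.3f} (normal: about 6)")
print(f"mean < median: {mean_below_median}")
```

Both ratios are far from the normal benchmarks and the mean is well below the median, which agrees with the left-skewed shape reported in (d).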
(d)
Minutes
Mean                            115.4
Standard Deviation              56.7089
Count                           30

Five-Number Summary
Minimum                         4
First Quartile                  59
Median                          137.5
Third Quartile                  166
Maximum                         189

Both the boxplot and the normal probability plot indicate that the distribution is left-skewed. Hence, the t test in (a) might not be valid.
9.36
p-value = 1 – 0.9772 = 0.0228
9.37
Since the p-value = 0.0228 is less than α = 0.05, reject H0.
9.38
p-value = 0.0838
9.39
Since the p-value = 0.0838 is greater than α = 0.01, do not reject H0.
9.40
p-value = P(Z < 2.38) = 0.9913
9.41
Since the p-value = 0.9913 > 0.05, do not reject the null hypothesis.
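The table lookups in 9.36 and 9.40 can be reproduced with the standard normal distribution in Python's `statistics` module. In this sketch the helper names are hypothetical, and the value Z = 2.00 for 9.36 is an assumption inferred from the cumulative area 0.9772 shown in the solution.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def upper_tail_p(z):
    """p-value for an upper-tail Z test: P(Z > z)."""
    return 1 - std_normal.cdf(z)


def lower_tail_p(z):
    """p-value for a lower-tail Z test: P(Z < z)."""
    return std_normal.cdf(z)


p_936 = upper_tail_p(2.00)  # assumed Z = 2.00, upper tail
p_940 = lower_tail_p(2.38)  # lower tail, as in 9.40

print(f"9.36: p-value = {p_936:.4f}")
print(f"9.40: p-value = {p_940:.4f}")
```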
9.42
t = 2.7638
9.43
Since tSTAT = 1.79 < 2.7638, do not reject H0.
9.44
t = –2.5280
9.45
Since tSTAT = –1.15 > –2.5280, do not reject H0.
9.46
(a) H0: μ ≤ 8000
H1: μ > 8000
Decision rule: Reject H0 if p-value < 0.05, d.f. = 63.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = 2.69$, p-value = 0.005
Decision: tSTAT = 2.69 > 1.6694, reject H0. There is evidence to conclude that the population mean bus miles is more than 8000 bus miles.
(b) The p-value is 0.005 < 0.05. The probability of getting a tSTAT statistic greater than 2.69, given that the null hypothesis is true, is 0.005.
9.47
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             98
Level of Significance           0.05
Sample Size                     100
Sample Mean                     117
Sample Standard Deviation       25

Intermediate Calculations
Standard Error of the Mean      2.5000
Degrees of Freedom              99
t Test Statistic                7.6000

Upper-Tail Test
Upper Critical Value            1.6604
p-Value                         0.0000
Reject the null hypothesis

H0: μ ≤ 98
H1: μ > 98
Decision rule: Reject H0 if p-value < 0.05, d.f. = 99.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{117 - 98}{25/\sqrt{100}} = 7.6000$, p-value = 0.000
Decision: tSTAT = 7.6 > 1.6604, reject H0. There is evidence to conclude that the population mean cost to repair is more than $98.
(b) The p-value is 0.0000 < 0.05. The probability of getting a tSTAT statistic greater than 7.6000, given that the null hypothesis is true, is 0.000.
9.48
(a) Minitab Output:
H0: μ ≥ 30
H1: μ < 30
Decision rule: Reject H0 if p-value < 0.01, d.f. = 859.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -10.58$, p-value = 0.000
Decision: tSTAT = –10.58 < –2.3307, reject H0. p-value = 0.000 < 0.01, reject H0. There is evidence to conclude that the population mean wait time is less than 30 minutes.
(b) The probability of getting a sample mean of 24.05 minutes or less if the population mean is 30 minutes is 0.000.
9.49
H0: μ ≥ 23 min. The mean delivery time is not less than 23 minutes.
H1: μ < 23 min. The mean delivery time is less than 23 minutes.
(a) Decision rule: If tSTAT < –1.6896, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{19.6 - 23}{6/\sqrt{36}} = -3.4$
Decision: Since tSTAT = –3.4 is less than –1.6896, reject H0. There is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 23 minutes.
(b) Decision rule: If p-value < 0.05, reject H0. p-value = 0.0008
Decision: Since p-value = 0.0008 is less than α = 0.05, reject H0. There is enough evidence to conclude the population mean delivery time has been reduced below the previous value of 23 minutes.
(c) The probability of obtaining a sample whose mean is 19.6 minutes or less when the null hypothesis is true is 0.0008.
(d) The conclusions are the same.
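The lower-tail p-value in 9.49 can be approximated without statistical tables. The sketch below is one possible approach, not the method used in the manual: it integrates the Student t density numerically with Simpson's rule, using only the Python standard library. The function names are hypothetical.

```python
import math


def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x, df, steps=2000):
    """Approximate P(T <= x): Simpson's rule on [0, |x|] plus symmetry about 0."""
    b = abs(x)
    h = b / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


# Summary values from the 9.49 solution
xbar, mu0, s, n = 19.6, 23, 6, 36
t_stat = (xbar - mu0) / (s / math.sqrt(n))
p_value = t_cdf(t_stat, df=n - 1)  # lower-tail test

print(f"tSTAT = {t_stat:.4f}, p-value = {p_value:.4f}")
```

The result is well below 0.05, in line with the decision to reject H0 in both (a) and (b).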
9.50
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             85.5
Level of Significance           0.01
Sample Size                     133
Sample Mean                     98
Sample Standard Deviation       9

Intermediate Calculations
Standard Error of the Mean      0.7804
Degrees of Freedom              132
t Test Statistic                16.0174

Upper-Tail Test
Upper Critical Value            2.3549
p-Value                         0.0000
Reject the null hypothesis

H0: μ ≤ 85.5
H1: μ > 85.5
Decision rule: Reject H0 if p-value < 0.01, d.f. = 132.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{98 - 85.5}{9/\sqrt{133}} = 16.0174$, p-value = 0.000
Decision: tSTAT = 16.0174 > 2.3549, reject H0. p-value = 0.000 < 0.01, reject H0. There is enough evidence to conclude that the population mean one-time gift donation is greater than $85.50.
(b) The probability of getting a sample mean of $98 or more if the population mean is $85.50 is 0.0000.
9.51
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             4
Level of Significance           0.05
Sample Size                     100
Sample Mean                     3.7
Sample Standard Deviation       2.5

Intermediate Calculations
Standard Error of the Mean      0.2500
Degrees of Freedom              99
t Test Statistic                -1.2000

Lower-Tail Test
Lower Critical Value            -1.6604
p-Value                         0.1165
Do not reject the null hypothesis

H0: μ ≥ 4
H1: μ < 4
Decision rule: Reject H0 if p-value < 0.05, d.f. = 99.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3.7 - 4}{2.5/\sqrt{100}} = -1.2$, p-value = 0.1165
Decision: tSTAT = –1.2 > –1.6604, do not reject H0. There is not enough evidence to conclude that the population mean wait time to check out is less than 4 minutes.
(b) Decision: p-value = 0.1165 > 0.05, do not reject H0. There is not enough evidence to conclude that the population mean wait time to check out is less than 4 minutes.
(c) The probability of getting a sample mean of 3.70 minutes or less if the population mean is 4 minutes is 0.1165.
(d) The conclusions are the same.
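9.52
$p = \dfrac{X}{n} = \dfrac{88}{400} = 0.22$
9.53
$Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{0.22 - 0.25}{\sqrt{\dfrac{0.25(0.75)}{400}}} = -1.3856$
or
$Z_{STAT} = \dfrac{X - n\pi}{\sqrt{n\pi(1 - \pi)}} = \dfrac{88 - 400(0.25)}{\sqrt{400(0.25)(0.75)}} = -1.3856$
9.54
H0: π = 0.20
H1: π ≠ 0.20
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{0.22 - 0.20}{\sqrt{\dfrac{0.20(0.8)}{400}}} = 1.00$
Decision: Since ZSTAT = 1.00 is between the critical bounds of ±1.96, do not reject H0.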
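The two equivalent forms of the Z test statistic for a proportion used in 9.53, and the two-tail decision in 9.54, can be verified with the following Python sketch. The function names are hypothetical; the critical value is computed from the standard normal distribution rather than read from a table.

```python
import math
from statistics import NormalDist


def z_from_proportion(p, pi0, n):
    """Z statistic in terms of the sample proportion p."""
    return (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)


def z_from_count(x, pi0, n):
    """Equivalent Z statistic in terms of the number of items of interest x."""
    return (x - n * pi0) / math.sqrt(n * pi0 * (1 - pi0))


# 9.53: X = 88, n = 400, hypothesized proportion 0.25
z_953_p = z_from_proportion(88 / 400, 0.25, 400)
z_953_x = z_from_count(88, 0.25, 400)

# 9.54: same sample, hypothesized proportion 0.20, two-tail test at alpha = 0.05
z_954 = z_from_proportion(88 / 400, 0.20, 400)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)
reject_954 = abs(z_954) > z_crit

print(f"9.53: Z = {z_953_p:.4f} (proportion form), {z_953_x:.4f} (count form)")
print(f"9.54: Z = {z_954:.2f}, critical value = {z_crit:.2f}, reject H0: {reject_954}")
```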
9.55
(a) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.7
Level of Significance           0.05
Number of Items of Interest     78
Sample Size                     125

Intermediate Calculations
Sample Proportion               0.624
Standard Error                  0.0410
Z Test Statistic                -1.8542

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.0637
Do not reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{78}{125} - 0.70}{\sqrt{\dfrac{0.70(1 - 0.70)}{125}}} = -1.8542$, p-value = 0.0637
Decision: Since p-value = 0.0637 > 0.05, do not reject H0. There is not enough evidence that the proportion of college unpaid interns that received full-time job offers post-graduation is different from 0.70.
(b) Minitab Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.7
Level of Significance           0.05
Number of Items of Interest     60
Sample Size                     125

Intermediate Calculations
Sample Proportion               0.48
Standard Error                  0.0410
Z Test Statistic                -5.3675

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.0000
Reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{60}{125} - 0.70}{\sqrt{\dfrac{0.70(1 - 0.70)}{125}}} = -5.3675$, p-value = 0.0000
Decision: Since p-value = 0.0000 < 0.05, reject H0. There is evidence that the proportion of college unpaid interns that received full-time job offers post-graduation is different from 0.70. The conclusions are not the same.
9.56
(a) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.651
Level of Significance           0.05
Number of Items of Interest     60
Sample Size                     100

Intermediate Calculations
Sample Proportion               0.6
Standard Error                  0.0477
Z Test Statistic                -1.0700

Lower-Tail Test
Lower Critical Value            -1.6449
p-Value                         0.1423
Do not reject the null hypothesis

H0: π ≥ 0.651
H1: π < 0.651
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{60}{100} - 0.651}{\sqrt{\dfrac{0.651(1 - 0.651)}{100}}} = -1.0700$, p-value = 0.1423
Decision: Since ZSTAT = –1.0700 > –1.6449 or p-value = 0.1423 > 0.05, do not reject H0. There is no evidence to show that less than 65.1% of students at your university use the Chrome web browser.
(b) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.651
Level of Significance           0.05
Number of Items of Interest     360
Sample Size                     600

Intermediate Calculations
Sample Proportion               0.6
Standard Error                  0.0195
Z Test Statistic                -2.6209

Lower-Tail Test
Lower Critical Value            -1.6449
p-Value                         0.0044
Reject the null hypothesis

H0: π ≥ 0.651
H1: π < 0.651
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{360}{600} - 0.651}{\sqrt{\dfrac{0.651(1 - 0.651)}{600}}} = -2.6209$, p-value = 0.0044
Decision: Since ZSTAT = –2.6209 < –1.6449 or p-value = 0.0044 < 0.05, reject H0. There is evidence to show that less than 65.1% of students at your university use the Chrome web browser.
(c) The sample size had an effect on being able to reject the null hypothesis.
(d) You would be very unlikely to reject the null hypothesis with a sample of 20.
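The effect of sample size noted in (c) and (d) can be illustrated with a short Python sketch. It assumes, for illustration only, that the sample proportion stays at 0.60 for every sample size, including the hypothetical n = 20; the function name is also hypothetical.

```python
import math
from statistics import NormalDist


def z_stat_proportion(p, pi0, n):
    """Z statistic for a one-sample test of a proportion."""
    return (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)


pi0, p, alpha = 0.651, 0.60, 0.05
z_crit = NormalDist().inv_cdf(alpha)  # lower-tail critical value, about -1.6449

results = {}
for n in (20, 100, 600):
    z = z_stat_proportion(p, pi0, n)
    results[n] = (z, z < z_crit)  # (test statistic, reject H0?)
    print(f"n = {n:3d}: Z = {z:.4f}, reject H0: {z < z_crit}")
```

With the same sample proportion, the standard error shrinks as n grows, so the same difference of 0.051 becomes statistically significant only for the largest sample.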
9.57
PHStat output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.55
Level of Significance           0.05
Number of Items of Interest     25
Sample Size                     45

Intermediate Calculations
Sample Proportion               0.555555556
Standard Error                  0.0742
Z Test Statistic                0.0749

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.9403
Do not reject the null hypothesis

H0: π = 0.55
H1: π ≠ 0.55
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{25}{45} - 0.55}{\sqrt{\dfrac{0.55(1 - 0.55)}{45}}} = 0.0749$
Decision: Since ZSTAT = 0.0749 is between the two critical bounds, do not reject H0. There is not enough evidence that the proportion of females in this position at this medical center is different from what would be expected in the general workforce.
9.58
Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.7
Level of Significance           0.05
Number of Items of Interest     144
Sample Size                     200

Intermediate Calculations
Sample Proportion               0.72
Standard Error                  0.0324
Z Test Statistic                0.6172

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.5371
Do not reject the null hypothesis

H0: π = 0.70
H1: π ≠ 0.70
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{144}{200} - 0.7}{\sqrt{\dfrac{0.7(1 - 0.7)}{200}}} = 0.6172$, p-value = 0.5371
Decision: Since ZSTAT = 0.6172 < 1.96 or p-value = 0.5371 > 0.05, do not reject H0. There is no evidence that the proportion of workers who find it easy or somewhat easy to find adequate workspace to do work at home is different from 0.70.
9.59
PHStat output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.2
Level of Significance           0.05
Number of Items of Interest     155
Sample Size                     500

Intermediate Calculations
Sample Proportion               0.31
Standard Error                  0.0179
Z Test Statistic                6.1492

Upper-Tail Test
Upper Critical Value            1.6449
p-Value                         0.0000
Reject the null hypothesis

(a) H0: π ≤ 0.2
H1: π > 0.2
Decision rule: If ZSTAT > 1.6449 or p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{155}{500} - 0.2}{\sqrt{\dfrac{0.2(1 - 0.2)}{500}}} = 6.1492$
Decision: Since ZSTAT = 6.1492 is larger than the critical bound of 1.6449, reject H0. There is enough evidence to conclude that more than 20% of the customers would upgrade to a new cellphone.
(b) The manager in charge of promotional programs concerning residential customers can use the results in (a) to try to convince potential customers to upgrade to a new cellphone since more than 20% of all potential customers will do so based on the conclusion in (a).
9.60
Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.2
Level of Significance           0.05
Number of Items of Interest     1095
Sample Size                     4277

Intermediate Calculations
Sample Proportion               0.256020575
Standard Error                  0.0061
Z Test Statistic                9.1592

Upper-Tail Test
Upper Critical Value            1.6449
p-Value                         0.0000
Reject the null hypothesis

H0: π ≤ 0.20
H1: π > 0.20
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{1095}{4277} - 0.20}{\sqrt{\dfrac{0.20(1 - 0.20)}{4277}}} = 9.1592$, p-value = 0.0000
Decision: Since ZSTAT = 9.1592 > 1.6449 or p-value = 0.0000 < 0.05, reject H0. There is evidence that the percentage is greater than 20%.
9.61
The null hypothesis represents the status quo or the hypothesis that is to be disproved. The null hypothesis includes an equal sign in its definition of a parameter of interest. The alternative hypothesis is the opposite of the null hypothesis and usually represents taking an action. The alternative hypothesis includes either a less than sign, a not equal to sign, or a greater than sign in its definition of a parameter of interest.
9.62
A Type I error represents rejecting a true null hypothesis, while a Type II error represents not rejecting a false null hypothesis.
9.63
The power of a test is the probability that the null hypothesis will be rejected when the null hypothesis is false.
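To make the definition concrete, the sketch below computes the power of an upper-tail Z test for a proportion under the normal approximation. The setting (hypothesized proportion 0.20, n = 500, α = 0.05) is borrowed from 9.59; the true proportion of 0.25 is purely a hypothetical value chosen for illustration, and the function name is also hypothetical.

```python
import math
from statistics import NormalDist


def power_upper_tail_proportion(pi0, pi_true, n, alpha):
    """P(reject H0: pi <= pi0) when the true proportion is pi_true."""
    z = NormalDist()
    se_null = math.sqrt(pi0 * (1 - pi0) / n)
    # Smallest sample proportion that falls in the rejection region
    p_crit = pi0 + z.inv_cdf(1 - alpha) * se_null
    se_true = math.sqrt(pi_true * (1 - pi_true) / n)
    return 1 - z.cdf((p_crit - pi_true) / se_true)


power = power_upper_tail_proportion(0.20, 0.25, 500, 0.05)  # H0 false
beta = 1 - power                                            # P(Type II error)
type_i = power_upper_tail_proportion(0.20, 0.20, 500, 0.05)  # H0 true at the boundary

print(f"power = {power:.3f}, beta = {beta:.3f}, P(Type I error) = {type_i:.3f}")
```

When the true proportion equals the hypothesized value, the rejection probability reduces to α, the probability of a Type I error.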
9.64
In a one-tailed test for a mean or proportion, the entire rejection region is contained in one tail of the distribution. In a two-tailed test, the rejection region is split into two equal parts, one in the lower tail of the distribution, and the other in the upper tail.
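The difference between the two kinds of rejection region shows up in the critical values. The following sketch, using the standard normal distribution from Python's `statistics` module, reproduces the Z critical values that appear throughout this chapter's solutions. The function name and the `tail` labels are hypothetical.

```python
from statistics import NormalDist


def z_critical_values(alpha, tail):
    """Return (lower, upper) critical values; None marks a tail with no rejection region."""
    z = NormalDist()
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)  # alpha is split equally between the tails
        return -upper, upper
    if tail == "upper":
        return None, z.inv_cdf(1 - alpha)  # all of alpha in the upper tail
    if tail == "lower":
        return z.inv_cdf(alpha), None      # all of alpha in the lower tail
    raise ValueError("tail must be 'two', 'upper', or 'lower'")


two_tail = z_critical_values(0.05, "two")
upper_tail = z_critical_values(0.05, "upper")
lower_tail = z_critical_values(0.05, "lower")

print(f"two-tail:   {two_tail[0]:.4f}, {two_tail[1]:.4f}")
print(f"upper-tail: {upper_tail[1]:.4f}")
print(f"lower-tail: {lower_tail[0]:.4f}")
```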
9.65
The p-value is the probability of obtaining a test statistic equal to or more extreme than the result obtained from the sample data, given that the null hypothesis is true.
9.66
Assuming a two-tailed test is used, if the hypothesized value for the parameter does not fall into the confidence interval, then the null hypothesis can be rejected.
9.67
The six steps of the critical value approach to hypothesis testing are:
(1) State the null hypothesis H0. State the alternative hypothesis H1.
(2) Choose the level of significance α. Choose the sample size n.
(3) Determine the appropriate statistical technique and corresponding test statistic to use.
(4) Set up the critical values that divide the rejection and nonrejection regions.
(5) Collect the data and compute the sample value of the appropriate test statistic.
(6) Determine whether the test statistic has fallen into the rejection or the nonrejection region. The computed value of the test statistic is compared with the critical values for the appropriate sampling distribution to determine whether it falls into the rejection or nonrejection region. Make the statistical decision. If the test statistic falls into the nonrejection region, the null hypothesis H0 cannot be rejected. If the test statistic falls into the rejection region, the null hypothesis is rejected. Express the statistical decision in terms of a particular situation.
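The six steps can be traced in code. The sketch below walks through them for the lower-tail t test of 9.51; the critical value –1.6604 is copied from that problem's Excel output rather than computed, and the variable names are hypothetical.

```python
import math

# Step 1: state the hypotheses (H0: mu >= 4, H1: mu < 4)
mu0 = 4.0

# Step 2: choose the level of significance and the sample size
alpha = 0.05
n = 100

# Step 3: sigma is unknown, so use the one-sample t test with n - 1 d.f.
df = n - 1

# Step 4: lower-tail critical value for alpha = 0.05, d.f. = 99 (from the 9.51 output)
t_crit = -1.6604

# Step 5: collect the data and compute the test statistic
xbar, s = 3.7, 2.5
t_stat = (xbar - mu0) / (s / math.sqrt(n))

# Step 6: compare with the critical value and state the decision
decision = "reject H0" if t_stat < t_crit else "do not reject H0"

print(f"tSTAT = {t_stat:.4f}, critical value = {t_crit}, decision: {decision}")
```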
9.68
The six steps of the p-value approach to hypothesis testing are:
(1) State the null hypothesis, H0, and the alternative hypothesis, H1.
(2) Choose the level of significance, α, and the sample size, n.
(3) Determine the appropriate test statistic and the sampling distribution.
(4) Collect the sample data and compute the value of the test statistic.
(5) Compute the p-value.
(6) Make the statistical decision and state the managerial conclusion. If the p-value is greater than or equal to α, you do not reject the null hypothesis, H0. If the p-value is less than α, you reject the null hypothesis.
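The p-value approach can be sketched the same way. The example below follows the steps for the two-tail Z test for a proportion in 9.55 (a), computing the p-value from the standard normal distribution; the variable names are hypothetical.

```python
import math
from statistics import NormalDist

# Step 1: H0: pi = 0.70, H1: pi != 0.70
pi0 = 0.70

# Step 2: level of significance and sample size
alpha = 0.05
n = 125

# Step 3: Z test for the proportion, standard normal sampling distribution
std_normal = NormalDist()

# Step 4: collect the data and compute the test statistic
x = 78
p = x / n
z_stat = (p - pi0) / math.sqrt(pi0 * (1 - pi0) / n)

# Step 5: two-tail p-value
p_value = 2 * (1 - std_normal.cdf(abs(z_stat)))

# Step 6: reject H0 only if the p-value is less than alpha
decision = "reject H0" if p_value < alpha else "do not reject H0"

print(f"ZSTAT = {z_stat:.4f}, p-value = {p_value:.4f}, decision: {decision}")
```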
9.69
(a) H0: π = 0.6
H1: π ≠ 0.6
(b) The level of significance is the probability of committing a Type I error, which is the probability of concluding that the population proportion of web page visitors preferring the new design is not 0.60 when in fact 60% of web page visitors prefer the new design. The risk associated with a Type II error is the probability of not rejecting the claim that 60% of web page visitors prefer the new design when it should be rejected.
(c) If you reject the null hypothesis for a p-value of 0.20, there is a 20% probability that you may have incorrectly concluded that the population proportion of web page visitors preferring the new design is not 0.60 when in fact 60% of web page visitors prefer the new design.
(d) The argument for raising the level of significance might be that the consequences of incorrectly concluding the proportion is not 60% are deemed not very severe.
(e) Before raising the level of significance of a test, you have to genuinely evaluate whether the cost of committing a Type I error is really not as bad as you have thought.
(f) If the p-value is actually 0.12, you will be more confident about rejecting the null hypothesis. If the p-value is 0.01, you will be even more confident that a Type I error is much less likely to occur.
9.70
(a) A Type I error occurs when a firm is predicted to be a bankrupt firm when it will not go bankrupt.
(b) A Type II error occurs when a firm is predicted to be a non-bankrupt firm when it will go bankrupt.
(c) The executives are trying to avoid a Type I error by adopting a very stringent decision criterion. Only firms that show significant evidence of being in financial stress will be predicted to go bankrupt within the next two years at the chosen level of the possibility of making a Type I error.
(d) If the revised model results in more moderate or large Z scores, the probability of committing a Type I error will increase. Many more of the firms will be predicted to go bankrupt than will actually go bankrupt. On the other hand, the revised model that results in more moderate or large Z scores will lower the probability of committing a Type II error because fewer firms will be predicted to go bankrupt than will actually go bankrupt.
9.71
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             9.48
Level of Significance           0.05
Sample Size                     50
Sample Mean                     10.12
Sample Standard Deviation       3.2

Intermediate Calculations
Standard Error of the Mean      0.4525
Degrees of Freedom              49
t Test Statistic                1.4142

Upper-Tail Test
Upper Critical Value            1.6766
p-Value                         0.0818
Do not reject the null hypothesis

H0: μ ≤ 9.48
H1: μ > 9.48
Decision rule: Reject H0 if p-value < 0.05, d.f. = 49.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{10.12 - 9.48}{3.2/\sqrt{50}} = 1.4142$, p-value = 0.0818
Decision: tSTAT = 1.4142 < 1.6766, do not reject H0.
(b) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.16
Level of Significance           0.05
Number of Items of Interest     19
Sample Size                     50

Intermediate Calculations
Sample Proportion               0.38
Standard Error                  0.0518
Z Test Statistic                4.2433

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.0000
Reject the null hypothesis

H0: π = 0.16
H1: π ≠ 0.16
Decision rule: If |ZSTAT| > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{19}{50} - 0.16}{\sqrt{\dfrac{0.16(1 - 0.16)}{50}}} = 4.2433$, p-value = 0.0000
Decision: Since ZSTAT = 4.2433 > 1.96, reject H0.
9.72
PHStat output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             8
Level of Significance           0.05
Sample Size                     60
Sample Mean                     8.55
Sample Standard Deviation       1.75

Intermediate Calculations
Standard Error of the Mean      0.2259
Degrees of Freedom              59
t Test Statistic                2.4344

Two-Tail Test
Lower Critical Value            -2.0010
Upper Critical Value            2.0010
p-Value                         0.0180
Reject the null hypothesis

(a) H0: μ = $8.00
H1: μ ≠ $8.00
Decision rule: d.f. = 59. If p-value < 0.05, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{8.55 - 8}{1.75/\sqrt{60}} = 2.4344$
Decision: Because tSTAT = 2.4344 > 2.0010, reject H0. There is enough evidence to conclude that the mean amount spent differs from $8.00.
(b) p-value = 0.0180
(c) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.6
Level of Significance           0.05
Number of Items of Interest     41
Sample Size                     60

Intermediate Calculations
Sample Proportion               0.683333333
Standard Error                  0.0632
Z Test Statistic                1.3176

Upper-Tail Test
Upper Critical Value            1.6449
p-Value                         0.0938
Do not reject the null hypothesis

H0: π ≤ 0.60
H1: π > 0.60
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{41}{60} - 0.6}{\sqrt{\dfrac{0.6(1 - 0.6)}{60}}} = 1.3176$, p-value = 0.0938
Decision: Since ZSTAT = 1.3176 < 1.6449 and p-value > 0.05, do not reject H0. There is not sufficient evidence to conclude that more than 60% of customers say they "definitely will" recommend the specialty coffee shop to family and friends.
(d) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             8
Level of Significance           0.05
Sample Size                     60
Sample Mean                     9.2
Sample Standard Deviation       1.75

Intermediate Calculations
Standard Error of the Mean      0.2259
Degrees of Freedom              59
t Test Statistic                5.3115

Two-Tail Test
Lower Critical Value            -2.0010
Upper Critical Value            2.0010
p-Value                         0.0000
Reject the null hypothesis

H0: μ = $8.00
H1: μ ≠ $8.00
Decision rule: d.f. = 59. If p-value < 0.05, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{9.20 - 8}{1.75/\sqrt{60}} = 5.3115$
Decision: Because tSTAT = 5.3115 > 2.0010, reject H0. There is enough evidence to conclude that the mean amount spent differs from $8.00.
(e) PHStat output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.6
Level of Significance           0.05
Number of Items of Interest     26
Sample Size                     60

Intermediate Calculations
Sample Proportion               0.433333333
Standard Error                  0.0632
Z Test Statistic                -2.6352

Upper-Tail Test
Upper Critical Value            1.6449
p-Value                         0.9958
Do not reject the null hypothesis

H0: π ≤ 0.60
H1: π > 0.60
Decision rule: If p-value < 0.05, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{26}{60} - 0.6}{\sqrt{\dfrac{0.6(1 - 0.6)}{60}}} = -2.6352$, p-value = 0.9958
Decision: Since ZSTAT = –2.6352 < 1.6449 and p-value > 0.05, do not reject H0.
9.73
(a) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             100
Level of Significance           0.05
Sample Size                     75
Sample Mean                     133.7
Sample Standard Deviation       37.11

Intermediate Calculations
Standard Error of the Mean      4.2851
Degrees of Freedom              74
t Test Statistic                7.8645

Upper-Tail Test
Upper Critical Value            1.6657
p-Value                         0.0000
Reject the null hypothesis

H0: μ ≤ $100
H1: μ > $100
Decision rule: d.f. = 74. If tSTAT > 1.6657, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{133.70 - 100}{37.11/\sqrt{75}} = 7.8645$
Decision: Since the test statistic of tSTAT = 7.8645 is greater than the critical bound of 1.6657, reject H0. There is evidence to conclude that the mean reimbursement for office visits to doctors paid by Medicare was more than $100.
(b) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.1
Level of Significance           0.05
Number of Items of Interest     12
Sample Size                     75

Intermediate Calculations
Sample Proportion               0.16
Standard Error                  0.0346
Z Test Statistic                1.7321

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.0833
Do not reject the null hypothesis

H0: π ≤ 0.10. At most 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
H1: π > 0.10. More than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
Decision rule: If ZSTAT > 1.9600, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{12}{75} - 0.10}{\sqrt{\dfrac{0.10(1 - 0.10)}{75}}} = 1.7321$
Decision: Since ZSTAT = 1.7321 is less than the critical bound of 1.9600, do not reject H0. There is not enough evidence to conclude that more than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
(c) To perform the t-test on the population mean, you must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed.
(d) Excel Output:
t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =             100
Level of Significance           0.05
Sample Size                     75
Sample Mean                     90
Sample Standard Deviation       37.11

Intermediate Calculations
Standard Error of the Mean      4.2851
Degrees of Freedom              74
t Test Statistic                -2.3337

Upper-Tail Test
Upper Critical Value            1.6657
p-Value                         0.9888
Do not reject the null hypothesis

H0: μ ≤ $100. The mean reimbursement for office visits to doctors paid by Medicare is at most $100.
H1: μ > $100. The mean reimbursement for office visits to doctors paid by Medicare is greater than $100.
Decision rule: d.f. = 74. If tSTAT > 1.6657, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{90 - 100}{37.11/\sqrt{75}} = -2.3337$
Decision: Since tSTAT = –2.3337 is less than the critical bound of 1.6657, do not reject H0.
(e) Excel Output:
Z Test of Hypothesis for the Proportion

Data
Null Hypothesis π =             0.1
Level of Significance           0.05
Number of Items of Interest     8
Sample Size                     75

Intermediate Calculations
Sample Proportion               0.106666667
Standard Error                  0.0346
Z Test Statistic                0.1925

Two-Tail Test
Lower Critical Value            -1.9600
Upper Critical Value            1.9600
p-Value                         0.8474
Do not reject the null hypothesis

H0: π ≤ 0.10. At most 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
H1: π > 0.10. More than 10% of all reimbursements for office visits to doctors paid by Medicare are incorrect.
Decision rule: If ZSTAT > 1.9600, reject H0.
Test statistic: $Z_{STAT} = \dfrac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} = \dfrac{\dfrac{8}{75} - 0.10}{\sqrt{\dfrac{0.10(1 - 0.10)}{75}}} = 0.1925$
Decision: Since ZSTAT = 0.1925 is less than the critical bound of 1.9600, do not reject H0.
9.74
(a) H0: μ ≥ 5 minutes. The mean waiting time at a bank branch in a commercial district of the city is at least 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.
H1: μ < 5 minutes. The mean waiting time at a bank branch in a commercial district of the city is less than 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.
Decision rule: d.f. = 14. If tSTAT < –1.7613, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{4.2866 - 5.0}{1.637985/\sqrt{15}} = -1.6867$
Decision: Since tSTAT = –1.6867 is greater than the critical bound of –1.7613, do not reject H0. There is not enough evidence to conclude that the mean waiting time at a bank branch in a commercial district of the city is less than 5 minutes during the 12:00 p.m. to 1 p.m. peak lunch period.
(b) To perform the t-test on the population mean, you must assume that the observed sequence in which the data were collected is random and that the data are approximately normally distributed.
(c) Normal Probability Plot (vertical axis: Waiting Time; horizontal axis: Z Value)
(d) With the exception of one extreme point, the data are approximately normally distributed.
(e) Based on the results of (a), the manager does not have enough evidence to make that statement.
9.75
(a) Minitab Output:
Excel Output:
H 0 : 20
H1 : 20 Decision rule: Reject H 0 if p-value < 0.05 d.f. = 49 Test statistic: tSTAT
(b)
X –6.39 S n
p-value = 0.000 Decision: tSTAT = –6.39 < –1.6766, reject H 0 or p-value = 0.000 < 0.05, reject H 0 There is evidence to conclude that the population mean answer time is less than 20 minutes. The population distribution needs to be normal.
Copyright ©2024 Pearson Education, Inc.
lxx Chapter 10: Two-Sample Tests 9.75 cont.
9.76
(c)
(d)
The mean is close to the median, and the points on the normal probability plot appear to be increasing approximately in a straight line. The boxplot appears to be approximately symmetrical. Thus, you can assume that the population of processing times is approximately normally distributed. The assumption needed to conduct the t test is valid.
(a) $H_0: \mu \ge 0.35$
$H_1: \mu < 0.35$
Decision rule: Reject H0 if tSTAT < –1.690. d.f. = 35
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{0.3167 - 0.35}{0.1357/\sqrt{36}} = -1.4735$
Decision: Since tSTAT = –1.4735 > –1.690, do not reject H0. There is not enough evidence to conclude that the mean moisture content for Boston shingles is less than 0.35 pounds per 100 square feet.
(b) p-value = 0.0748. If the population mean moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of 36 shingles that will result in a sample mean moisture content of 0.3167 pounds per 100 square feet or less is 0.0748.
(c) $H_0: \mu \ge 0.35$
$H_1: \mu < 0.35$
Decision rule: Reject H0 if tSTAT < –1.6973. d.f. = 30
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{0.2735 - 0.35}{0.1373/\sqrt{31}} = -3.1003$
Decision: Since tSTAT = –3.1003 < –1.6973, reject H0. There is enough evidence to conclude that the mean moisture content for Vermont shingles is less than 0.35 pounds per 100 square feet.
(d) p-value = 0.0021. If the population mean moisture content is in fact no less than 0.35 pounds per 100 square feet, the probability of observing a sample of 31 shingles that will result in a sample mean moisture content of 0.2735 pounds per 100 square feet or less is 0.0021.
(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample sizes are 36 and 31, respectively, which are considered quite large, the t distribution will provide a good approximation to the sampling distribution of the mean as long as the population distribution is not very skewed.
(f) Box-and-whisker Plot (Boston)
[Figure: box-and-whisker plot for Boston shingles; horizontal axis from 0 to 1]
Box-and-whisker Plot (Vermont)
[Figure: box-and-whisker plot for Vermont shingles; horizontal axis from 0 to 1]
(g) Both boxplots suggest that the data are skewed slightly to the right, more so for the Boston shingles. However, the very large sample sizes mean that the results of the t test are relatively insensitive to the departure from normality.
9.77
(a) $H_0: \mu = 3150$
$H_1: \mu \ne 3150$
Decision rule: Reject H0 if |tSTAT| > 1.9665. d.f. = 367
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3124.2147 - 3150}{34.713/\sqrt{368}} = -14.2497$
Decision: Since tSTAT = –14.2497 < –1.9665, reject H0. There is enough evidence to conclude that the mean weight for Boston shingles is different from 3150 pounds.
(b) p-value is virtually zero. If the population mean weight is in fact 3150 pounds, the probability of observing a sample of 368 shingles that will yield a test statistic more extreme than –14.2497 is virtually zero.
(c) $H_0: \mu = 3700$
$H_1: \mu \ne 3700$
Decision rule: Reject H0 if |tSTAT| > 1.967. d.f. = 329
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{3704.0424 - 3700}{46.7443/\sqrt{330}} = 1.571$
Decision: Since |tSTAT| = 1.571 < 1.967, do not reject H0. There is not enough evidence to conclude that the mean weight for Vermont shingles is different from 3700 pounds.
(d) p-value = 0.1171. The probability of observing a sample of 330 shingles that will yield a test statistic more extreme than 1.571 is 0.1171 if the population mean weight is in fact 3700 pounds.
(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Since the sample sizes are 368 and 330, respectively, which are considered large enough, the t distribution will provide a good approximation to the sampling distribution of the mean even if the population is not normally distributed.
9.78
(a) $H_0: \mu = 0.3$
$H_1: \mu \ne 0.3$

t Test for Hypothesis of the Mean
Data
  Null Hypothesis μ =                0.3
  Level of Significance              0.05
  Sample Size                        170
  Sample Mean                        0.26
  Sample Standard Deviation          0.142382504
Intermediate Calculations
  Standard Error of the Mean         0.0109
  Degrees of Freedom                 169
  t Test Statistic                   -3.2912
Two-Tail Test
  Lower Critical Value               -1.9741
  Upper Critical Value               1.9741
  p-Value                            0.0012
  Reject the null hypothesis

Decision rule: Reject H0 if |tSTAT| > 1.9741. d.f. = 169
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -3.2912$, p-value = 0.0012
Decision: Since tSTAT = –3.2912 < –1.9741, reject H0. There is enough evidence to conclude that the mean granule loss for Boston shingles is different from 0.3 grams.
(b) p-value is 0.0012. If the population mean granule loss is in fact 0.3 grams, the probability of observing a sample of 170 shingles that will yield a test statistic more extreme than –3.2912 is 0.0012.
9.78 cont.
(c) $H_0: \mu = 0.3$
$H_1: \mu \ne 0.3$

t Test for Hypothesis of the Mean
Data
  Null Hypothesis μ =                0.3
  Level of Significance              0.05
  Sample Size                        140
  Sample Mean                        0.22
  Sample Standard Deviation          0.122698672
Intermediate Calculations
  Standard Error of the Mean         0.0104
  Degrees of Freedom                 139
  t Test Statistic                   -7.9075
Two-Tail Test
  Lower Critical Value               -1.9772
  Upper Critical Value               1.9772
  p-Value                            0.0000
  Reject the null hypothesis

Decision rule: Reject H0 if |tSTAT| > 1.9772. d.f. = 139
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = -7.9075$
Decision: Since tSTAT = –7.9075 < –1.9772, reject H0. There is enough evidence to conclude that the mean granule loss for Vermont shingles is different from 0.3 grams.
(d) p-value is virtually zero. The probability of observing a sample of 140 shingles that will yield a test statistic more extreme than –7.9075 is virtually zero if the population mean granule loss is in fact 0.3 grams.
(e) In order for the t test to be valid, the data are assumed to be independently drawn from a population that is normally distributed. Both normal probability plots indicate that the data are slightly right-skewed. Since the sample sizes are 170 and 140, respectively, which are considered large enough, the t distribution will provide a good approximation to the sampling distribution of the mean even if the population is not normally distributed.
9.79
Answers will vary
Chapter 10
10.1
$df = n_1 + n_2 - 2 = 10 + 12 - 2 = 20$
10.2
(a) $S_p^2 = \dfrac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \dfrac{(7)(4^2) + (14)(5^2)}{7 + 14} = 22$
$t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(42 - 34) - 0}{\sqrt{22\left(\dfrac{1}{8} + \dfrac{1}{15}\right)}} = 3.8959$
(b) d.f. = (n1 – 1) + (n2 – 1) = 7 + 14 = 21
(c) Decision rule: d.f. = 21. If tSTAT > 2.5177, reject H0.
(d) Decision: Since tSTAT = 3.8959 is greater than the critical bound of 2.5177, reject H0. There is enough evidence to conclude that the first population mean is larger than the second population mean.
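The pooled-variance calculation in 10.2 can be reproduced from the summary statistics alone. The following Python sketch is illustrative only; the function name `pooled_variance_t` and its parameter names are assumptions, not anything defined in the text.

```python
import math

def pooled_variance_t(x1, s1, n1, x2, s2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic for the difference between two means.

    Returns (pooled variance, t statistic, degrees of freedom).
    """
    df = (n1 - 1) + (n2 - 1)
    # Weighted average of the two sample variances
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((x1 - x2) - hypothesized_diff) / se
    return sp2, t_stat, df

# Summary values used in 10.2: means 42 and 34, standard deviations 4 and 5, n = 8 and 15
sp2, t_stat, df = pooled_variance_t(42, 4, 8, 34, 5, 15)
print(sp2, round(t_stat, 4), df)
```

The statistic is then compared with the upper-tail critical value 2.5177 quoted in the decision rule.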
10.3
Assume that you are sampling from two independent normal distributions having equal variances.
10.4
$(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = (42 - 34) \pm 2.0796\sqrt{22\left(\dfrac{1}{8} + \dfrac{1}{15}\right)}$
$3.7296 \le \mu_1 - \mu_2 \le 12.2704$
10.5
$df = n_1 + n_2 - 2 = 7 + 6 - 2 = 11$
10.6
Decision: Since tSTAT = 2.6762 is smaller than the upper critical bound of 2.9979, do not reject H0. There is not enough evidence of a difference in the means of the two populations.
10.7
(a) $H_0: \mu_1 \ge \mu_2$ The mean estimated amount of calories in the cheeseburger is not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
$H_1: \mu_1 < \mu_2$ The mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
(b) Type I error is the error made in concluding that the mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first when the mean estimated amount of calories in the cheeseburger is in fact not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
(c) Type II error is the error made in concluding that the mean estimated amount of calories in the cheeseburger is not lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first when the mean estimated amount of calories in the cheeseburger is in fact lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
(d) Decision: Since tSTAT = –6.1532 is smaller than the critical bound of –2.4286, reject H0. There is evidence that the mean estimated amount of calories in the cheeseburger is lower for the people who thought about the cheesecake first than for the people who thought about the organic fruit salad first.
(e) The commercial would feature foods high in calories such as peanut butter and chocolate. Based on the results from (d), presentation of foods high in calories would decrease estimates of the amount of calories associated with a cheeseburger.
10.8
(a) Because tSTAT = 2.8990 > 1.6620 or p-value = 0.0024 < 0.05, reject H0. There is evidence that the mean amount of Walker Crisps eaten by children who watched a commercial featuring a long-standing sports celebrity endorser is higher than for those who watched a commercial for an alternative food snack.
(b) $3.4616 \le \mu_1 - \mu_2 \le 18.5384$
(c) The results cannot be compared because (a) is a one-tail test and (b) is a confidence interval that is comparable only to the results of a two-tail test.
(d) You would choose the commercial featuring a long-standing celebrity endorser.
10.9
From PHStat, Population 1 = Traditional, Population 2 = PrePaid

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                7
  Sample Mean                                79.85714286
  Sample Standard Deviation                  7.537209285
Population 2 Sample
  Sample Size                                12
  Sample Mean                                82.58333333
  Sample Standard Deviation                  3.918680978
Intermediate Calculations
  Population 1 Sample Degrees of Freedom     6
  Population 2 Sample Degrees of Freedom     11
  Total Degrees of Freedom                   17
  Pooled Variance                            29.9867
  Standard Error                             2.6044
  Difference in Sample Means                 -2.7262
  t Test Statistic                           -1.0468
Two-Tail Test
  Lower Critical Value                       -2.1098
  Upper Critical Value                       2.1098
  p-Value                                    0.3099
  Do not reject the null hypothesis

(a) A pooled-variance t test revealed no significant difference in mean rating between the two types of cellular providers. Because tSTAT = –1.0468 lies between the critical values –2.1098 and 2.1098, or equivalently because p-value = 0.3099 > 0.05, one would not reject H0.
(b) The p-value of 0.3099 is well above the 0.05 significance level. This means that the probability of obtaining a test statistic at least as extreme as the one observed, in either direction, when H0 is true, is 30.99%.
(c) It is necessary to assume both populations associated with the ratings data from the two types of cellular providers are normally distributed.
10.9 cont.
(d) From PHStat

Confidence Interval Estimate for the Difference Between Two Means
Data
  Confidence Level                           95%
Intermediate Calculations
  Degrees of Freedom                         17
  t Value                                    2.1098
  Interval Half Width                        5.4947
Confidence Interval
  Interval Lower Limit                       -8.2209
  Interval Upper Limit                       2.7685

Using a 95% confidence interval, the lower limit of the average difference between the two providers is –8.2209 and the upper limit is 2.7685.
(e) Based on the results from a pooled-variance t test, one can conclude that there is no significant difference in satisfaction ratings between traditional cellular providers and prepaid cellular providers.
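The interval limits in 10.9 (d) follow from the same pooled standard error used in the t test. The sketch below recomputes them in Python under stated assumptions: the helper name `pooled_ci` is hypothetical, and the t value 2.1098 is copied from the PHStat output rather than computed, since the Python standard library has no t quantile function.

```python
import math

def pooled_ci(x1, s1, n1, x2, s2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using the pooled variance."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    half_width = t_crit * se
    diff = x1 - x2
    return diff - half_width, diff + half_width

# Sample values from the 10.9 PHStat output (Traditional vs. PrePaid)
lower, upper = pooled_ci(79.85714286, 7.537209285, 7,
                         82.58333333, 3.918680978, 12,
                         t_crit=2.1098)  # t value for d.f. = 17, taken from the output

# An interval that contains 0 agrees with not rejecting H0 in the two-tail test
contains_zero = lower < 0 < upper
print(round(lower, 4), round(upper, 4), contains_zero)
```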
10.10
(a) The results from a pooled-variance t test revealed that there is no evidence at the 0.05 level of significance that there is a difference between the Southeast region accounting firms and the Gulf Coast accounting firms with respect to the mean number of partners. Because tSTAT = –0.24 and p-value = 0.808 > 0.05, do not reject H0.
(b) The p-value was 0.808, well above the 0.05 level. This means that the probability of obtaining a test statistic at least as extreme as the one observed, in either direction, when H0 is true, is 80.8%.
(c) The pooled-variance t test assumes that the variances of the two populations of accounting firms are equal and that the number of partners data are approximately normally distributed for the two populations.
10.11
(a) From PHStat, Population 1 = Before Halftime, Population 2 = During or after halftime

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                28
  Sample Mean                                5.789285714
  Sample Standard Deviation                  0.702028429
Population 2 Sample
  Sample Size                                29
  Sample Mean                                5.534482759
  Sample Standard Deviation                  0.70926313
Intermediate Calculations
  Population 1 Sample Degrees of Freedom     27
  Population 2 Sample Degrees of Freedom     28
  Total Degrees of Freedom                   55
  Pooled Variance                            0.4980
  Standard Error                             0.1870
  Difference in Sample Means                 0.2548
  t Test Statistic                           1.3627
Two-Tail Test
  Lower Critical Value                       -2.0040
  Upper Critical Value                       2.0040
  p-Value                                    0.1785
  Do not reject the null hypothesis

Confidence Interval Estimate for the Difference Between Two Means
Data
  Confidence Level                           95%
Intermediate Calculations
  Degrees of Freedom                         55
  t Value                                    2.0040
  Interval Half Width                        0.3747
Confidence Interval
  Interval Lower Limit                       -0.1199
  Interval Upper Limit                       0.6295

The results from a pooled-variance t test revealed that there was no significant difference between the mean rating of the ads that ran before halftime and the ads that ran at halftime or after. Because tSTAT = 1.3627 < 2.0040, or equivalently because p-value = 0.1785 > 0.05, do not reject H0.
(b) The p-value of 0.1785 is well above the 0.05 level. This means that the probability of obtaining a test statistic at least as extreme as the one observed, in either direction, when H0 is true, is 17.85%.
(c) Using a 95% confidence interval, the lower limit of the difference in mean rating between the ads that ran before halftime and the ads that ran during or after halftime is –0.1199 and the upper limit is 0.6295. This result means that one can be 95% confident that the difference in mean rating between the ads that ran before halftime and the ads that ran during or after halftime is between –0.1199 and 0.6295.
10.12
(a) $H_0: \mu_1 = \mu_2$ Mean waiting times of Bank 1 and Bank 2 are the same.
$H_1: \mu_1 \ne \mu_2$ Mean waiting times of Bank 1 and Bank 2 are different.
Since the p-value of 0.000 is less than the 5% level of significance, reject the null hypothesis. There is enough evidence to conclude that the mean waiting time is different in the two banks.
(b) p-value = 0.000. The probability of obtaining a sample that will yield a t test statistic more extreme than –4.13 is 0.000 if, in fact, the mean waiting times of Bank 1 and Bank 2 are the same.
(c) We need to assume that the two populations are normally distributed.
(d) $(\bar{X}_1 - \bar{X}_2) \pm t\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = (4.2867 - 7.1147) \pm 2.0484\sqrt{3.5093\left(\dfrac{1}{15} + \dfrac{1}{15}\right)}$
$-4.2292 \le \mu_1 - \mu_2 \le -1.4268$
You are 95% confident that the difference in mean waiting time between Bank 1 and Bank 2 is between –4.2292 and –1.4268 minutes.
10.13
$H_0: \mu_1 = \mu_2$ Mean waiting times of Bank 1 and Bank 2 are the same.
$H_1: \mu_1 \ne \mu_2$ Mean waiting times of Bank 1 and Bank 2 are different.
Since the p-value of 0.000 is less than the 5% level of significance, reject the null hypothesis. There is enough evidence to conclude that the mean waiting times are different in the two banks. Both t tests yield the same conclusion.
10.14
From PHStat

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                20
  Sample Mean                                42.9135
  Sample Standard Deviation                  14.10057269
Population 2 Sample
  Sample Size                                11
  Sample Mean                                29.21
  Sample Standard Deviation                  16.15989109
Intermediate Calculations
  Population 1 Sample Degrees of Freedom     19
  Population 2 Sample Degrees of Freedom     10
  Total Degrees of Freedom                   29
  Pooled Variance                            220.3144
  Standard Error                             5.5717
  Difference in Sample Means                 13.7035
  t Test Statistic                           2.4595
Two-Tail Test
  Lower Critical Value                       -2.0452
  Upper Critical Value                       2.0452
  p-Value                                    0.0201
  Reject the null hypothesis

(a) Because tSTAT = 2.4595 > 2.0452, reject H0. There is evidence of a difference in the mean time to start a business between developed and emerging countries.
(b) p-value = 0.0201. The probability that two samples have a difference in sample means of 13.7035 or more in either direction is 0.0201 if there is no difference in the mean time to start a business between developed and emerging countries.
(c) You need to assume that the population distribution of the time to start a business of both developed and emerging countries is normally distributed.
10.14 cont.
(d) From PHStat

Confidence Interval Estimate for the Difference Between Two Means
Data
  Confidence Level                           95%
Intermediate Calculations
  Degrees of Freedom                         29
  t Value                                    2.0452
  Interval Half Width                        11.3955
Confidence Interval
  Interval Lower Limit                       2.3080
  Interval Upper Limit                       25.0990

$2.3080 \le \mu_1 - \mu_2 \le 25.0990$
10.15
From PHStat

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                20
  Sample Mean                                42.9135
  Sample Standard Deviation                  14.1006
Population 2 Sample
  Sample Size                                11
  Sample Mean                                29.21
  Sample Standard Deviation                  16.1599
Intermediate Calculations
  Numerator of Degrees of Freedom            1134.4432
  Denominator of Degrees of Freedom          61.5612
  Total Degrees of Freedom                   18.4279
  Degrees of Freedom                         18
  Standard Error                             5.8036
  Difference in Sample Means                 13.7035
  Separate-Variance t Test Statistic         2.3612
Two-Tail Test
  Lower Critical Value                       -2.1009
  Upper Critical Value                       2.1009
  p-Value                                    0.0297
  Reject the null hypothesis

The results of the two analyses were approximately equal. The t test analysis without assuming equal variances also revealed a significant difference in mean time required to start a business between developed and emerging countries.
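The "Numerator" and "Denominator of Degrees of Freedom" rows in the 10.15 output correspond to the usual Satterthwaite approximation for the separate-variance t test. A minimal Python sketch of that calculation follows; the function name `separate_variance_t` is hypothetical, and rounding the degrees of freedom down to an integer is an assumption made to match the value 18 shown in the output.

```python
import math

def separate_variance_t(x1, s1, n1, x2, s2, n2):
    """Separate-variance (Welch) t statistic with Satterthwaite degrees of freedom.

    Returns (t statistic, standard error, fractional d.f., d.f. rounded down).
    """
    v1 = s1**2 / n1  # estimated variance of the first sample mean
    v2 = s2**2 / n2  # estimated variance of the second sample mean
    se = math.sqrt(v1 + v2)
    t_stat = (x1 - x2) / se
    df_numerator = (v1 + v2) ** 2
    df_denominator = v1**2 / (n1 - 1) + v2**2 / (n2 - 1)
    df_exact = df_numerator / df_denominator
    return t_stat, se, df_exact, math.floor(df_exact)

# Sample values from 10.14/10.15 (full-precision standard deviations from the 10.14 output)
t_stat, se, df_exact, df_used = separate_variance_t(
    42.9135, 14.10057269, 20,
    29.21, 16.15989109, 11,
)
print(round(t_stat, 4), round(se, 4), round(df_exact, 4), df_used)
```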
10.16
From PHStat, Population 1 = IOS, Population 2 = Android

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                30
  Sample Mean                                175.2666667
  Sample Standard Deviation                  139.8368343
Population 2 Sample
  Sample Size                                30
  Sample Mean                                377.2
  Sample Standard Deviation                  493.7347047
Intermediate Calculations
  Population 1 Sample Degrees of Freedom     29
  Population 2 Sample Degrees of Freedom     29
  Total Degrees of Freedom                   58
  Pooled Variance                            131664.1494
  Standard Error                             93.6889
  Difference in Sample Means                 -201.9333
  t Test Statistic                           -2.1554
Two-Tail Test
  Lower Critical Value                       -2.0017
  Upper Critical Value                       2.0017
  p-Value                                    0.0353
  Reject the null hypothesis

(a) Because tSTAT = –2.1554 < –2.0017 or p-value = 0.0353 < 0.05, reject H0. There is evidence of a difference in the mean time per day accessing the Internet via a mobile device between IOS users and Android users.
(b) You must assume that each of the two independent populations is normally distributed.
10.17
(a) From PHStat, Population 1 = Technology, Population 2 = Financial Institutions

Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                21
  Sample Mean                                128721.3333
  Sample Standard Deviation                  224301.7528
Population 2 Sample
  Sample Size                                15
  Sample Mean                                52231.73333
  Sample Standard Deviation                  45687.56407
Intermediate Calculations
  Population 1 Sample Degrees of Freedom     20
  Population 2 Sample Degrees of Freedom     14
  Total Degrees of Freedom                   34
  Pooled Variance                            30454366914.2824
  Standard Error                             58995.7547
  Difference in Sample Means                 76489.6000
  t Test Statistic                           1.2965
Two-Tail Test
  Lower Critical Value                       -2.0322
  Upper Critical Value                       2.0322
  p-Value                                    0.2035
  Do not reject the null hypothesis

There is insufficient evidence at the 0.05 significance level of a difference in mean brand value between the technology and financial sectors. Because tSTAT = 1.2965 < 2.0322, or equivalently because p-value = 0.2035 > 0.05, do not reject H0.
10.17 cont.
(b) From PHStat, Population 1 = Technology, Population 2 = Financial Institutions

Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)
Data
  Hypothesized Difference                    0
  Level of Significance                      0.05
Population 1 Sample
  Sample Size                                21
  Sample Mean                                128721.3333
  Sample Standard Deviation                  224301.7528
Population 2 Sample
  Sample Size                                15
  Sample Mean                                52231.73333
  Sample Standard Deviation                  45687.5641
Intermediate Calculations
  Numerator of Degrees of Freedom            6425880054317410000.0000
  Denominator of Degrees of Freedom          288370096112897000.0000
  Total Degrees of Freedom                   22.2834
  Degrees of Freedom                         22
  Standard Error                             50348.1078
  Difference in Sample Means                 76489.6000
  Separate-Variance t Test Statistic         1.5192
Two-Tail Test
  Lower Critical Value                       -2.0739
  Upper Critical Value                       2.0739
  p-Value                                    0.1430
  Do not reject the null hypothesis

There is insufficient evidence at the 0.05 significance level of a difference in mean brand value between the technology and financial sectors. Because tSTAT = 1.5192 < 2.0739, or equivalently because p-value = 0.1430 > 0.05, do not reject H0.
(c) Both t tests led to the same conclusion to not reject H0. For the t test not assuming equal variances, tSTAT = 1.5192 with a p-value of 0.1430. This p-value was slightly lower than the p-value associated with the pooled-variance t test from 10.17 (a). Both p-values were above the 0.05 significance level.
10.18
The degrees of freedom is 20 – 1, or 19.
10.19
The degrees of freedom is 20 – 1, or 19.
10.20
(a) $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = \dfrac{-1.5566}{1.424/\sqrt{9}} = -3.2772$
Because tSTAT = –3.2772 < –2.306 or p-value = 0.0112 < 0.05, reject H0. There is enough evidence of a difference in the mean summated ratings between the two brands.
(b) You must assume that the distribution of the differences between the two ratings is approximately normal.
(c) p-value = 0.0112. The probability of obtaining a mean difference in ratings that results in a test statistic that deviates from 0 by 3.2772 or more in either direction is 0.0112 if there is no difference in the mean summated ratings between the two brands.
(d) $-2.6501 \le \mu_D \le -0.4610$. You are 95% confident that the mean difference in summated ratings between brand A and brand B is somewhere between –2.6501 and –0.4610.
10.21
(a) Paired t Test
Data
  Hypothesized Mean Difference       0
  Level of significance              0.05
Intermediate Calculations
  Sample Size                        20
  DBar                               5.1500
  Degrees of Freedom                 19
  SD                                 3.0826
  Standard Error                     0.6893
  t Test Statistic                   7.4714
Two-Tail Test
  Lower Critical Value               -2.0930
  Upper Critical Value               2.0930
  p-Value                            0.0000
  Reject the null hypothesis

At the 0.05 level, there is sufficient evidence of a difference in the mean ratings between TV and Internet services. A paired-samples t test gave tSTAT = 7.4714, which is above the upper critical value of 2.0930. Because tSTAT = 7.4714 > 2.0930, or equivalently because p-value = 0.000 < 0.05, reject H0.
(b) The paired-samples t test assumes that the difference scores are normally distributed.
(c) The differences appear to be left skewed. However, the sample size is very small, which makes it difficult to interpret the histogram for normality. The data also contain one outlier. Removal of this one outlier may lead to a different conclusion.
(d) Using the complete dataset, the confidence interval for the mean difference between TV and Internet ratings is
$\bar{D} - t_{\alpha/2}\dfrac{S_D}{\sqrt{n}} = 5.1500 - 2.0930\,\dfrac{3.0826}{\sqrt{20}} = 3.7073$
$\bar{D} + t_{\alpha/2}\dfrac{S_D}{\sqrt{n}} = 5.1500 + 2.0930\,\dfrac{3.0826}{\sqrt{20}} = 6.5927$
$3.7073 \le \mu_D \le 6.5927$.
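The paired t statistic and the confidence interval in 10.21 use the same standard error, $S_D/\sqrt{n}$, so both can be obtained in one pass. The Python sketch below is illustrative: `paired_t_summary` is a hypothetical helper, and the critical value 2.0930 (d.f. = 19) is copied from the output rather than computed.

```python
import math

def paired_t_summary(d_bar, s_d, n, t_crit, mu_d0=0.0):
    """Paired t statistic and confidence interval from summary statistics.

    d_bar: mean of the paired differences; s_d: their standard deviation.
    """
    se = s_d / math.sqrt(n)
    t_stat = (d_bar - mu_d0) / se
    half_width = t_crit * se
    return t_stat, (d_bar - half_width, d_bar + half_width)

# Values from the 10.21 Paired t Test output
t_stat, (lower, upper) = paired_t_summary(d_bar=5.15, s_d=3.0826, n=20, t_crit=2.0930)
print(round(t_stat, 4), round(lower, 4), round(upper, 4))
```

Because the interval excludes 0, it agrees with the two-tail decision to reject H0.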
10.22
From PHStat
Paired t Test
Data
  Hypothesized Mean Difference       0
  Level of significance              0.05
Intermediate Calculations
  Sample Size                        25
  DBar                               3.0988
  Degrees of Freedom                 24
  SD                                 4.6887
  Standard Error                     0.9377
  t Test Statistic                   3.3045
Two-Tail Test
  Lower Critical Value               -2.0639
  Upper Critical Value               2.0639
  p-Value                            0.0030
  Reject the null hypothesis

Because tSTAT = 3.3045 > 2.0639 or p-value = 0.0030 < 0.05, reject H0. There is evidence to conclude that the mean meal cost is higher at an inexpensive restaurant than at McDonald’s. You must assume that the distribution of the differences between the meal costs is approximately normal.
$\bar{D} - t_{\alpha/2}\dfrac{S_D}{\sqrt{n}} = 3.0988 - 2.0639\,\dfrac{4.6887}{\sqrt{25}} = 1.1634$
$\bar{D} + t_{\alpha/2}\dfrac{S_D}{\sqrt{n}} = 3.0988 + 2.0639\,\dfrac{4.6887}{\sqrt{25}} = 5.0342$
The confidence interval is from 1.1634 to 5.0342.
10.23
From PHStat
Paired t Test
Data
  Hypothesized Mean Difference       0
  Level of significance              0.05
Intermediate Calculations
  Sample Size                        10
  DBar                               -0.1000
  Degrees of Freedom                 9
  SD                                 1.7288
  Standard Error                     0.5467
  t Test Statistic                   -0.1829
Two-Tail Test
  Lower Critical Value               -2.2622
  Upper Critical Value               2.2622
  p-Value                            0.8589
  Do not reject the null hypothesis

Because tSTAT = –0.1829 lies between –2.2622 and 2.2622, or equivalently because p-value = 0.8589 > 0.05, do not reject H0. There is insufficient evidence to conclude that the mean score of coffeepot-brewed coffee is higher than the mean score of K-cup-brewed coffee.
10.24
(a) Define the difference in bone marrow microvessel density as the density before the transplant minus the density after the transplant and assume that the difference in density is normally distributed.
$H_0: \mu_D \le 0$ vs. $H_1: \mu_D > 0$

t-Test: Paired Two Sample for Means
                                   Before        After
  Mean                             312.1429      226
  Variance                         15513.14      4971
  Observations                     7             7
  Pearson Correlation              0.295069
  Hypothesized Mean Difference     0
  df                               6
  t Stat                           1.842455
  P(T<=t) one-tail                 0.057493
  t Critical one-tail              1.943181
  P(T<=t) two-tail                 0.114986
  t Critical two-tail              2.446914

Test statistic: $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = 1.8425$
Decision: Since tSTAT = 1.8425 is less than the critical value of 1.943, do not reject H0. There is not enough evidence to conclude that the mean bone marrow microvessel density is higher before the stem cell transplant than after the stem cell transplant.
(b) p-value = 0.0575. The probability of obtaining a mean difference in density that gives rise to a t test statistic of 1.8425 or more is 0.0575 if the mean density is not higher before the stem cell transplant than after the stem cell transplant.
(c) $\bar{D} \pm t\,\dfrac{S_D}{\sqrt{n}} = 86.1429 \pm 2.4469\,\dfrac{123.7005}{\sqrt{7}}$
$-28.26 \le \mu_D \le 200.55$
You are 95% confident that the mean difference in bone marrow microvessel density before and after the stem cell transplant is somewhere between –28.26 and 200.55.
(d) You must assume that the distribution of differences between the density before and after the stem cell transplant is approximately normal.
10.25
                         Cola A Adindex    Cola B (Test Cola) Adindex
  Mean                   18.55263158       21.31578947
  Standard Error         0.978937044       0.822086011
  Median                 18                21
  Mode                   24                21
  Standard Deviation     6.034573222       5.067678519
  Sample Variance        36.41607397       25.68136558
  Kurtosis               -0.640865482      -0.294923931
  Skewness               -0.077015645      -0.173917096
  Range                  24                21
  Minimum                6                 9
  Maximum                30                30
  Sum                    705               810
  Count                  38                38
  First Quartile         15                18
  Third Quartile         24                24
  Interquartile Range    9                 6
  1.33 * Std Dev         8.025982385       6.740012431
  5 * Std Dev            30.17286611       25.3383926

From the descriptive statistics provided in the Microsoft Excel output there does not seem to be any violation of the assumption of normality. The mean and median are similar and the skewness value is near 0. Without observing other graphical devices such as a stem-and-leaf display, boxplot, or normal probability plot, the fact that the sample size (n = 38) is not very small enables us to assume that the paired t test is appropriate here.
10.25 cont.
At the 0.05 level, there is sufficient evidence of a difference between the mean Adindex values for Cola A and Cola B (Test Cola). The tSTAT is –2.57 with a p-value of 0.014. Because p-value = 0.014 < 0.05, reject H0. These findings suggest that the likeability of the Cola A video ad differs from the likeability of the Cola B video ad.
10.26
(a) $H_0: \mu_D \ge 0$
$H_1: \mu_D < 0$
Decision rule: d.f. = 39. If tSTAT < –2.4258, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{D} - \mu_D}{S_D/\sqrt{n}} = -9.372$
Decision: Since tSTAT = –9.372 is less than the critical bound of –2.4258, reject H0. There is enough evidence to conclude that the mean strength is lower at two days than at seven days.
(b) You must assume that the distribution of the differences between the strength of the concrete at two days and at seven days is approximately normal.
(c) p-value is virtually 0. The probability of obtaining a mean difference that gives rise to a test statistic that is –9.372 or less when the null hypothesis is true is virtually 0.
10.27
(a) $p_1 = \dfrac{X_1}{n_1} = \dfrac{40}{100} = 0.40$, $p_2 = \dfrac{X_2}{n_2} = \dfrac{25}{100} = 0.25$, and $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{40 + 25}{100 + 100} = 0.325$
$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \ne \pi_2$
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(0.40 - 0.25) - 0}{\sqrt{0.325(1 - 0.325)\left(\dfrac{1}{100} + \dfrac{1}{100}\right)}} = 2.2646$
Decision: Since ZSTAT = 2.2646 is above the critical bound of 1.96, reject H0. There is sufficient evidence to conclude that the population proportions differ for group 1 and group 2.
(b) $(p_1 - p_2) \pm Z\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = 0.15 \pm 1.96\sqrt{\dfrac{0.4(0.6)}{100} + \dfrac{0.25(0.75)}{100}}$
$0.0218 \le \pi_1 - \pi_2 \le 0.2782$
10.28
(a) $p_1 = \dfrac{X_1}{n_1} = \dfrac{45}{100} = 0.45$, $p_2 = \dfrac{X_2}{n_2} = \dfrac{25}{50} = 0.50$, and $\bar{p} = \dfrac{X_1 + X_2}{n_1 + n_2} = \dfrac{45 + 25}{100 + 50} = 0.467$
$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \ne \pi_2$
Decision rule: If ZSTAT < –2.58 or ZSTAT > 2.58, reject H0.
Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{(0.45 - 0.50) - 0}{\sqrt{0.467(1 - 0.467)\left(\dfrac{1}{100} + \dfrac{1}{50}\right)}} = -0.58$
Decision: Since ZSTAT = –0.58 is between the critical bounds of –2.58 and 2.58, do not reject H0. There is insufficient evidence to conclude that the population proportions differ for group 1 and group 2.
(b) $(p_1 - p_2) \pm Z\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}} = -0.05 \pm 2.5758\sqrt{\dfrac{0.45(0.55)}{100} + \dfrac{0.5(0.5)}{50}}$
$-0.2727 \le \pi_1 - \pi_2 \le 0.1727$
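Problems 10.27 and 10.28 use two different standard errors: the Z test pools the two samples under H0, while the confidence interval uses the unpooled sample proportions. The Python sketch below keeps the two apart. It is an illustration under stated assumptions: the function names are hypothetical, and the critical Z value comes from `statistics.NormalDist` in the standard library rather than from a table.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for H0: pi1 = pi2, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se_pooled

def two_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2, using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    half_width = z * se
    return (p1 - p2) - half_width, (p1 - p2) + half_width

# 10.27: X1 = 40 of 100, X2 = 25 of 100, 95% confidence
z_27 = two_proportion_z(40, 100, 25, 100)
ci_27 = two_proportion_ci(40, 100, 25, 100, confidence=0.95)

# 10.28: X1 = 45 of 100, X2 = 25 of 50, 99% confidence
z_28 = two_proportion_z(45, 100, 25, 50)
ci_28 = two_proportion_ci(45, 100, 25, 50, confidence=0.99)

print(round(z_27, 4), ci_27)
print(round(z_28, 4), ci_28)
```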
10.29
(a) From PHStat

Z Test for Differences in Two Proportions
Data
  Hypothesized Difference            0
  Level of Significance              0.05
Group 1
  Number of Items of Interest        883
  Sample Size                        2741
Group 2
  Number of Items of Interest        873
  Sample Size                        2776
Intermediate Calculations
  Group 1 Proportion                 0.322145202
  Group 2 Proportion                 0.314481268
  Difference in Two Proportions      0.007663934
  Average Proportion                 0.3183
  Z Test Statistic                   0.6110
Two-Tail Test
  Lower Critical Value               -1.9600
  Upper Critical Value               1.9600
  p-Value                            0.5412
  Do not reject the null hypothesis

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \ne \pi_2$
where Populations: 1 = Basic subscribers, 2 = Premium subscribers
Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.
Test statistic: ZSTAT = 0.6110
Decision: Since ZSTAT = 0.6110 is between the two critical bounds, do not reject H0. There is insufficient evidence of a difference between basic and premium subscribers in the proportion who churn at the 0.05 level of significance.
p-value = 0.5412. The probability of obtaining a difference in two sample proportions of 0.007663934 or more in either direction when the null hypothesis is true is 0.5412.
(b)
From PHStat
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     16
  Sample Size                                     50
Group 2
  Number of Items of Interest                     15
  Sample Size                                     50
Intermediate Calculations
Group 1 Proportion                                0.32
Group 2 Proportion                                0.3
Difference in Two Proportions                     0.02
Average Proportion                                0.3100
Z Test Statistic                                  0.2162
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.8288
Do not reject the null hypothesis
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = Basic subscribers, 2 = Premium subscribers
Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.
Test statistic: ZSTAT = 0.2162
Decision: Since ZSTAT = 0.2162 is between the two critical bounds, do not reject H0. There is insufficient evidence of a difference between basic and premium subscribers in the proportion who churn at the 0.05 level of significance.
p-value = 0.8288. The probability of obtaining a difference in two sample proportions of 0.02 or more in either direction when the null hypothesis is true is 0.8288.
(c)
The conclusion is the same as in (a): despite the smaller sample size in part (b), H0 is not rejected.
10.30
(a)
H0: $\pi_1 \le \pi_2$   H1: $\pi_1 > \pi_2$
where Population 1 = Caffeinated, Population 2 = Decaffeinated
(b)
From PHStat
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     40
  Sample Size                                     100
Group 2
  Number of Items of Interest                     10
  Sample Size                                     100
Intermediate Calculations
Group 1 Proportion                                0.4
Group 2 Proportion                                0.1
Difference in Two Proportions                     0.3
Average Proportion                                0.2500
Z Test Statistic                                  4.8990
Upper-Tail Test
Upper Critical Value                              1.6449
p-Value                                           0.0000
Reject the null hypothesis
Decision rule: If ZSTAT > 1.6449, reject H0.
Test statistic: $Z_{STAT} = \dfrac{(p_1 - p_2) - (\pi_1 - \pi_2)}{\sqrt{\bar{p}(1 - \bar{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = 4.899$
The p-value is essentially 0.
Decision: Since ZSTAT = 4.899 > 1.6449 or p-value = 0.0000 < 0.05, reject H0. There is evidence to conclude that those who had caffeinated coffee were more likely to do impulse buying.
(c)
Yes, the result in (b) makes it appropriate to claim that those who had caffeinated coffee were more likely to do impulse buying than those who did not have caffeinated coffee.
10.31
(a)
From PHStat
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     464
  Sample Size                                     1030
Group 2
  Number of Items of Interest                     350
  Sample Size                                     1030
Intermediate Calculations
Group 1 Proportion                                0.450485437
Group 2 Proportion                                0.339805825
Difference in Two Proportions                     0.110679612
Average Proportion                                0.3951
Z Test Statistic                                  5.1377
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.0000
Reject the null hypothesis
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = U.S. workers, 2 = Canadian workers
Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.
Test statistic: ZSTAT = 5.1377
Decision: Since ZSTAT = 5.1377 > 1.960, reject H0. There is evidence of a difference between U.S. and Canadian workers in the proportion who indicate that their organization provides explicit training on empathy for all people managers at the 0.05 level of significance.
(b)
p-value = 0.0000. The probability of obtaining a difference in two sample proportions of 0.110679612 or more in either direction when the null hypothesis is true is 0.0000.
10.32
(a)
From PHStat
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.01
Group 1
  Number of Items of Interest                     1112
  Sample Size                                     1737
Group 2
  Number of Items of Interest                     302
  Sample Size                                     642
Intermediate Calculations
Group 1 Proportion                                0.640184226
Group 2 Proportion                                0.470404984
Difference in Two Proportions                     0.169779241
Average Proportion                                0.5944
Z Test Statistic                                  7.4862
Two-Tail Test
Lower Critical Value                              -2.5758
Upper Critical Value                              2.5758
p-Value                                           0.0000
Reject the null hypothesis
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = HR Professionals, 2 = U.S. workers
Decision rule: If ZSTAT < –2.58 or ZSTAT > 2.58, reject H0.
Test statistic: ZSTAT = 7.4862
Decision: Since ZSTAT = 7.4862 > 2.58, reject H0. There is evidence of a difference in the proportion between HR professionals and U.S. workers at the 0.01 level of significance.
(b)
p-value = 0.0000. The probability of obtaining a difference in proportions that gives a test statistic below –7.4862 or above 7.4862 is 0.0000 if there is no difference in the proportion based on the two groups.
(c)
From PHStat
Confidence Interval Estimate of the Difference Between Two Proportions
Data
Confidence Level                                  99%
Intermediate Calculations
Z Value                                           -2.5758
Std. Error of the Diff. between two Proportions   0.0228
Interval Half Width                               0.0588
Confidence Interval
Interval Lower Limit                              0.1110
Interval Upper Limit                              0.2286
$0.1110 \le \pi_1 - \pi_2 \le 0.2286$
You are 99% confident that the difference in the proportion between the two groups is between 11.10% and 22.86%.
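The interval in (c) uses the unpooled standard error, unlike the Z test in (a). A minimal Python sketch of that calculation follows; the function name `diff_proportion_ci` is hypothetical, and the counts are those shown in the PHStat output for part (a).

```python
from math import sqrt
from statistics import NormalDist


def diff_proportion_ci(x1, n1, x2, n2, confidence=0.95):
    """Confidence interval for pi1 - pi2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # e.g. 2.5758 for 99%
    half_width = z * se
    diff = p1 - p2
    return diff - half_width, diff + half_width, se, half_width


# Problem 10.32(c): 1112 of 1737 HR professionals, 302 of 642 U.S. workers
lower, upper, se, half_width = diff_proportion_ci(1112, 1737, 302, 642,
                                                  confidence=0.99)
print(f"SE = {se:.4f}, half width = {half_width:.4f}")
print(f"{lower:.4f} <= pi1 - pi2 <= {upper:.4f}")
```

The printed limits should match the PHStat values of 0.1110 and 0.2286 to four decimal places.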
10.33
(a)
From PHStat, LinkedIn
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     1246
  Sample Size                                     1538
Group 2
  Number of Items of Interest                     1514
  Sample Size                                     2857
Intermediate Calculations
Group 1 Proportion                                0.810143043
Group 2 Proportion                                0.529926496
Difference in Two Proportions                     0.280216547
Average Proportion                                0.6280
Z Test Statistic                                  18.3313
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.0000
Reject the null hypothesis
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = B2B marketers, 2 = B2C marketers
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: ZSTAT = 18.3313
Decision: Since ZSTAT = 18.3313 > 1.96, reject H0. There is evidence of a difference in the proportion of B2B and B2C marketers who use LinkedIn as a social media tool at the 0.05 level of significance.
(b)
p-value = 0.0000. The probability of obtaining a difference in proportions that gives a test statistic below –18.3313 or above 18.3313 is 0.0000 if there is no difference in the proportion based on the two groups.
(c)
From PHStat, YouTube
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     877
  Sample Size                                     1539
Group 2
  Number of Items of Interest                     1571
  Sample Size                                     2856
Intermediate Calculations
Group 1 Proportion                                0.569850552
Group 2 Proportion                                0.550070028
Difference in Two Proportions                     0.019780524
Average Proportion                                0.5570
Z Test Statistic                                  1.2593
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.2079
Do not reject the null hypothesis
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = B2B marketers, 2 = B2C marketers
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: ZSTAT = 1.2593
Decision: Since ZSTAT = 1.2593 is between the critical bounds of –1.96 and 1.96, do not reject H0. There is insufficient evidence of a difference in the proportion of B2B and B2C marketers who use YouTube as a social media tool at the 0.05 level of significance.
(d)
p-value = 0.2079. The probability of obtaining a difference in proportions that gives a test statistic below –1.2593 or above 1.2593 is 0.2079 if there is no difference in the proportion based on the two groups.
10.34
(a)
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = business leaders, 2 = knowledge workers
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     213
  Sample Size                                     251
Group 2
  Number of Items of Interest                     161
  Sample Size                                     251
Intermediate Calculations
Group 1 Proportion                                0.848605578
Group 2 Proportion                                0.641434263
Difference in Two Proportions                     0.207171315
Average Proportion                                0.7450
Z Test Statistic                                  5.3249
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.0000
Reject the null hypothesis
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: ZSTAT = 5.3249
Decision: Since ZSTAT = 5.3249 > 1.96, reject H0. There is evidence of a difference in the proportion of business leaders and knowledge workers who indicate that their company exemplifies effective communication at the 0.05 level of significance.
(b)
p-value = 0.0000. The probability of obtaining a difference in proportions that is 0.2071 or more in either direction is 0.0000 if there is no difference between the proportion of business leaders and knowledge workers who indicate that their company exemplifies effective communication.
10.35
(a)
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
where Populations: 1 = Northeast region, 2 = Midwest region
Z Test for Differences in Two Proportions
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Group 1
  Number of Items of Interest                     63
  Sample Size                                     196
Group 2
  Number of Items of Interest                     44
  Sample Size                                     208
Intermediate Calculations
Group 1 Proportion                                0.321428571
Group 2 Proportion                                0.211538462
Difference in Two Proportions                     0.10989011
Average Proportion                                0.2649
Z Test Statistic                                  2.5017
Two-Tail Test
Lower Critical Value                              -1.9600
Upper Critical Value                              1.9600
p-Value                                           0.0124
Reject the null hypothesis
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: ZSTAT = 2.5017
Decision: Since ZSTAT = 2.5017 > 1.96, reject H0. There is evidence of a difference between the Northeast and Midwest regions in the proportion that indicate a preference for open-air shopping at the 0.05 level of significance.
(b)
p-value = 0.0124. The probability of obtaining a difference in proportions that is 0.10989 or more in either direction is 0.0124 if there is no difference between the Northeast and Midwest regions in the proportion that indicate a preference for open-air shopping.
(c)
From PHStat
Confidence Interval Estimate of the Difference Between Two Proportions
Data
Confidence Level                                  95%
Intermediate Calculations
Z Value                                           -1.9600
Std. Error of the Diff. between two Proportions   0.0438
Interval Half Width                               0.0858
Confidence Interval
Interval Lower Limit                              0.0241
Interval Upper Limit                              0.1957
You are 95% confident that the difference in the proportion between the two groups is between 2.41% and 19.57%.
10.36
(a) 2.20
(b) 2.57
(c) 3.50
10.37
(a) $\alpha = 0.05$, n1 = 16, n2 = 20, $F_{0.05/2} = 2.62$
(b) $\alpha = 0.01$, n1 = 16, n2 = 20, $F_{0.01/2} = 3.59$
10.38
(a) Population B: $S^2 = 25$
(b) 1.5625
10.39
$F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{155.3}{127.8} = 1.2152$
10.40
df numerator = 24, df denominator = 24
10.41
$\alpha = 0.05$, n1 = 25, n2 = 25, $F_{0.05/2} = 2.27$
10.42
Because FSTAT = 1.2152 < 2.27, do not reject H0.
10.43
The F test for the ratio of two variances is sensitive to departures from normality. The F test should not be used because both populations are skewed and do not meet the assumption of normality. The Levene test or a nonparametric test should be used in this situation.
10.44
(a) Because $F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{45.6}{38.7} = 1.1783 < 3.67$, do not reject H0.
(b) Because FSTAT = 1.1783 < 2.945, do not reject H0.
10.45
(a)
From PHStat, Larger-variance sample: Traditional. Smaller-variance sample: PrePaid.
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance                             0.05
Larger-Variance Sample
  Sample Size                                     7
  Sample Variance                                 56.80952381
Smaller-Variance Sample
  Sample Size                                     12
  Sample Variance                                 15.35606061
Intermediate Calculations
F Test Statistic                                  3.6995
Population 1 Sample Degrees of Freedom            6
Population 2 Sample Degrees of Freedom            11
Two-Tail Test
Upper Critical Value                              3.8807
p-Value                                           0.0583
Do not reject the null hypothesis
Decision rule: If FSTAT > 3.8807, reject H0.
At the 0.05 significance level, there is insufficient evidence of a difference between the variances of the two types of cell providers. The FSTAT of 3.6995 is below the upper critical value of 3.8807. Because FSTAT = 3.6995 < 3.8807 or p-value = 0.0583 > 0.05, do not reject H0.
(b)
The p-value = 0.0583, which means that the probability of obtaining a test statistic at least as extreme as the observed value, when H0 is true, is 5.83%.
(c)
To justify the use of the F test, it is assumed that the rating data from both groups are normally distributed.
(d)
Because the results from (a) and (b) revealed that there was no significant difference between the variances of the two types of cellular providers, one would use the pooled-variance t test.
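The F test statistic and its degrees of freedom in (a) follow directly from the two sample variances. The sketch below, in Python, places the larger sample variance in the numerator as PHStat does; the function name `f_test_statistic` is hypothetical, and the upper critical value is copied from the PHStat output rather than computed, because the standard library has no F distribution.

```python
def f_test_statistic(var_a, n_a, var_b, n_b):
    """Return (F_STAT, df_numerator, df_denominator).

    The larger sample variance is always placed in the numerator.
    """
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1


# Problem 10.45(a): PrePaid (n = 12) and Traditional (n = 7) sample variances
f_stat, df_num, df_den = f_test_statistic(15.35606061, 12, 56.80952381, 7)

upper_critical = 3.8807  # assumption: taken from the PHStat output above
reject_h0 = f_stat > upper_critical
print(f"FSTAT = {f_stat:.4f}, df = ({df_num}, {df_den}), reject H0: {reject_h0}")
```

The argument order does not matter here, since the function swaps the samples when needed so that FSTAT is at least 1.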
10.46
(a)
At the 0.05 level, there is insufficient evidence of a difference between the variances of the Southeast and Gulf Coast regions. One would fail to reject the null hypothesis. The FSTAT of 1.44 is below the upper critical value. Because the p-value = 0.410 > 0.05, do not reject H0.
(b)
The p-value is 0.410, which means that the probability of obtaining a test statistic at least as extreme as the observed value, when H0 is true, is 41%.
(c)
To justify the use of the F test, it was assumed that the rating data from both populations were normally distributed.
(d)
Because the results from (a) and (b) revealed that there was no significant difference between the variances of the Southeast and Gulf Coast regions, one would use the pooled-variance t test.
10.47
(a)
At the 0.05 level, there is insufficient evidence of a difference between the variances in waiting times at the two bank branches. One would fail to reject the null hypothesis. The FSTAT of 1.62 is below the upper critical value. Because the p-value = 0.380 > 0.05, do not reject H0.
(b)
The p-value is 0.380, which means that the probability of obtaining a test statistic at least as extreme as the observed value, when H0 is true, is 38.0%. The p-value is above 0.05, which indicates that the observed difference is not significant.
(c)
To justify the use of the F test, it was assumed that the waiting-time data from both groups were normally distributed. Waiting times for Bank1 appear to be skewed to the left, while times for Bank2 are slightly skewed to the right. Because the F test for the ratio of two variances is sensitive to the normality assumption, other tests for differences between two variances should be considered.
(d)
Because the results from (a) revealed that there was no significant difference between the variances of the two bank branches, one would use the pooled-variance t test.
10.48
From PHStat, Larger-variance sample: Halftime or after. Smaller-variance sample: Before halftime.
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance                             0.05
Larger-Variance Sample
  Sample Size                                     29
  Sample Variance                                 0.503054187
Smaller-Variance Sample
  Sample Size                                     28
  Sample Variance                                 0.492843915
Intermediate Calculations
F Test Statistic                                  1.0207
Population 1 Sample Degrees of Freedom            28
Population 2 Sample Degrees of Freedom            27
Two-Tail Test
Upper Critical Value                              2.1512
p-Value                                           0.9594
Do not reject the null hypothesis
(a)
At the 0.05 level, there is insufficient evidence of a difference in the variability of the rating scores between ads shown before halftime and ads shown at halftime or after. The FSTAT of 1.0207 is below the upper critical value. Because FSTAT = 1.0207 < 2.1512 or p-value = 0.9594 > 0.05, do not reject H0.
(b)
The p-value is 0.9594, which means that, if there is no difference in the two population variances, the probability of obtaining a test statistic at least as extreme as the observed value of 1.0207 is 95.94%.
(c)
To justify the use of the F test, it was assumed that the rating data from both groups were normally distributed.
(d)
Based on (a) and (b), a pooled-variance t test should be used.
10.49
From PHStat
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
(a)
At the 0.05 level, there is evidence of a difference between the variances in the time spent accessing the Internet for iOS users and Android users. The FSTAT of 12.4665 is well above the critical value. Because FSTAT = 12.4665 > 2.1010 or p-value = 0.0000 < 0.05, reject H0.
(b)
On the basis of the results in (a), the separate-variance t test would be the appropriate choice for these data.
10.50
(a)
Because FSTAT = 69.50001 > 1.9811 or p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference in the variance of the delay times between the two drivers.
(b)
You assume that the delay times are normally distributed.
(c)
From the boxplots and the normal probability plots, the delay times appear to be approximately normally distributed.
(d)
Because there is a difference in the variance of the delay times between the two drivers, you should use the separate-variance t test to determine whether there is evidence of a difference in the mean delay time between the two drivers.
10.51
Among the criteria to be used in selecting a particular hypothesis test are the type of data, whether the samples are independent or paired, whether the test involves central tendency or variation, whether the assumption of normality is valid, and whether the variances in the two populations are equal.
10.52
The pooled-variance t test should be used when the populations are approximately normally distributed and the variances of the two populations are assumed equal.
10.53
The F test can be used to examine differences in two variances when each of the two populations is assumed to be normally distributed.
10.54
With independent populations, the outcomes in one population do not depend on the outcomes in the second population. With two related populations, either repeated measurements are obtained on the same set of items or individuals, or items or individuals are paired or matched according to some characteristic.
10.55
Repeated measurements represent two measurements on the same items or individuals, while paired measurements involve matching items according to a characteristic of interest.
10.56
The hypothesis test for the difference between two means provides a single test statistic upon which a decision is made to reject or fail to reject the null hypothesis. The confidence interval estimate provides the lower and upper limits of the difference between the two means at a given confidence level, such as 95%. The confidence interval estimate can also be used to decide whether to reject the null hypothesis. If the hypothesized value of 0 for the difference in two population means is not in the confidence interval, then, assuming a two-tailed test is used, the null hypothesis of no difference in the two population means can be rejected.
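The decision rule described in 10.56 can be written as a one-line check. The Python sketch below is illustrative only: the function name `rejects_no_difference` is hypothetical, and the example limits are the 95% interval reported in Problem 10.35(c), which is an interval for a difference in proportions rather than means. For proportions the interval and the Z test use different standard errors, so the agreement between the two approaches is approximate rather than exact.

```python
def rejects_no_difference(lower, upper, hypothesized=0.0):
    """Two-tail decision from a confidence interval.

    Reject H0 when the hypothesized difference lies outside [lower, upper].
    """
    return not (lower <= hypothesized <= upper)


# 95% interval from Problem 10.35(c): 0.0241 to 0.1957 does not contain 0
decision_35 = rejects_no_difference(0.0241, 0.1957)
# 99% interval from Problem 10.28(b): -0.2727 to 0.1727 contains 0
decision_28 = rejects_no_difference(-0.2727, 0.1727)
print(decision_35, decision_28)
```

Both decisions agree with the corresponding Z tests: H0 is rejected in 10.35(a) and not rejected in 10.28(a).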
10.57
When you have paired data or data obtained from repeated measurements.
10.58
(a)
From PHStat, Black Belt variance = $(30{,}000)^2$ = 900,000,000; Green Belt variance = $(25{,}000)^2$ = 625,000,000
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance                             0.05
Larger-Variance Sample
  Sample Size                                     47
  Sample Variance                                 900000000
Smaller-Variance Sample
  Sample Size                                     56
  Sample Variance                                 625000000
Intermediate Calculations
F Test Statistic                                  1.4400
Population 1 Sample Degrees of Freedom            46
Population 2 Sample Degrees of Freedom            55
Two-Tail Test
Upper Critical Value                              1.7387
p-Value                                           0.1950
Do not reject the null hypothesis
Because FSTAT = 1.44 < 1.7387 or p-value = 0.1950 > 0.05, do not reject H0. There is insufficient evidence of a difference in the variance of the salary of Black Belts and Green Belts.
(b)
Based on the results from (a), one would not reject the null hypothesis and would choose the pooled-variance t test.
(c)
From PHStat
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Population 1 Sample
  Sample Size                                     47
  Sample Mean                                     126551
  Sample Standard Deviation                       30000
Population 2 Sample
  Sample Size                                     56
  Sample Mean                                     95261
  Sample Standard Deviation                       25000
Intermediate Calculations
Population 1 Sample Degrees of Freedom            46
Population 2 Sample Degrees of Freedom            55
Total Degrees of Freedom                          101
Pooled Variance                                   750247524.7525
Standard Error                                    5418.4860
Difference in Sample Means                        31290.0000
t Test Statistic                                  5.7747
Upper-Tail Test
Upper Critical Value                              1.6601
p-Value                                           0.0000
Reject the null hypothesis
At the 0.05 level, there is evidence that the mean salary for Black Belt jobs is significantly higher than the mean salary for Green Belt jobs. The tSTAT of 5.7747 is above the critical value. The p-value is 0.000. Because tSTAT = 5.7747 > 1.6601 or p-value = 0.000 < 0.05, reject H0.
10.59
(a)
From PHStat, Private and Public
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance                             0.05
Larger-Variance Sample
  Sample Size                                     226
  Sample Variance                                 51703673.46
Smaller-Variance Sample
  Sample Size                                     266
  Sample Variance                                 5310845.239
Intermediate Calculations
F Test Statistic                                  9.7355
Population 1 Sample Degrees of Freedom            225
Population 2 Sample Degrees of Freedom            265
Two-Tail Test
Upper Critical Value                              1.2847
p-Value                                           0.0000
Reject the null hypothesis
At the 0.05 level, there is sufficient evidence of a difference between the variances of the debt amounts of private and public colleges. The FSTAT of 9.7355 is above the upper critical value. Because FSTAT = 9.7355 > 1.2847 or p-value = 0.0000 < 0.05, reject H0.
(b)
Based on the results from (a), one would choose the separate-variance t test because the variances of the two groups differed significantly at the 0.05 level.
(c)
From PHStat, Population 1 = Private, Population 2 = Public
Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Population 1 Sample
  Sample Size                                     266
  Sample Mean                                     8400.590226
  Sample Standard Deviation                       2304.5271
Population 2 Sample
  Sample Size                                     226
  Sample Mean                                     7587.349558
  Sample Standard Deviation                       7190.5266
Intermediate Calculations
Numerator of Degrees of Freedom                   61873030209.5447
Denominator of Degrees of Freedom                 234122289.7854
Total Degrees of Freedom                          264.2765
Degrees of Freedom                                264
Standard Error                                    498.7413
Difference in Sample Means                        813.2407
Separate-Variance t Test Statistic                1.6306
Two-Tail Test
Lower Critical Value                              -1.9690
Upper Critical Value                              1.9690
p-Value                                           0.1042
Do not reject the null hypothesis
(d)
A separate-variance t test revealed that the mean loan debt for students attending private colleges was not significantly different from that of students attending public colleges. The tSTAT of 1.6306 is between the critical values. Because tSTAT = 1.6306 < 1.969 or p-value = 0.1042 > 0.05, do not reject H0. One would conclude that there is not a significant difference between private and public colleges in the mean amount of student loan debt incurred.
10.60
(a)
From PHStat, Boys variance = $(45)^2$ = 2025; Girls variance = $(40)^2$ = 1600
H0: $\sigma_1^2 = \sigma_2^2$   H1: $\sigma_1^2 \ne \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance                             0.01
Larger-Variance Sample
  Sample Size                                     100
  Sample Variance                                 2025
Smaller-Variance Sample
  Sample Size                                     100
  Sample Variance                                 1600
Intermediate Calculations
F Test Statistic                                  1.2656
Population 1 Sample Degrees of Freedom            99
Population 2 Sample Degrees of Freedom            99
Two-Tail Test
Upper Critical Value                              1.6854
p-Value                                           0.2430
Do not reject the null hypothesis
Using a 0.01 level of significance, there is insufficient evidence of a difference in the variances in time spent online between boys and girls. Because FSTAT = 1.2656 < 1.6854 or p-value = 0.2430 > 0.01, do not reject H0.
(b)
It is most appropriate to use the pooled-variance t test to test for differences in mean online time between boys and girls.
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.01
Population 1 Sample
  Sample Size                                     100
  Sample Mean                                     556
  Sample Standard Deviation                       45
Population 2 Sample
  Sample Size                                     100
  Sample Mean                                     482
  Sample Standard Deviation                       40
Intermediate Calculations
Population 1 Sample Degrees of Freedom            99
Population 2 Sample Degrees of Freedom            99
Total Degrees of Freedom                          198
Pooled Variance                                   1812.5000
Standard Error                                    6.0208
Difference in Sample Means                        74.0000
t Test Statistic                                  12.2907
Two-Tail Test
Lower Critical Value                              -2.6009
Upper Critical Value                              2.6009
p-Value                                           0.0000
Reject the null hypothesis
At the 0.01 significance level, the tSTAT of 12.2907 is well above the upper critical value. Because tSTAT = 12.2907 > 2.6009 or p-value = 0.0000 < 0.01, one would reject the null hypothesis of no difference in the mean entertainment screen use time per day between boys and girls. There is evidence of a difference in the mean entertainment screen use time per day of boys and girls.
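The intermediate calculations in the pooled-variance output can be reproduced from the summary statistics alone. The following Python sketch assumes equal population variances, as the test does; the function name `pooled_variance_t` is hypothetical, and the critical value is copied from the PHStat output because the standard library has no t distribution.

```python
from math import sqrt


def pooled_variance_t(mean1, s1, n1, mean2, s2, n2):
    """Pooled-variance t statistic for H0: mu1 = mu2.

    Returns (t_stat, total_df, pooled_variance, standard_error).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, pooled_var, se


# Problem 10.60(b): boys (mean 556, S = 45) and girls (mean 482, S = 40), n = 100 each
t_stat, df, pooled_var, se = pooled_variance_t(556, 45, 100, 482, 40, 100)

upper_critical = 2.6009  # assumption: taken from the PHStat output above
reject_h0 = abs(t_stat) > upper_critical
print(f"Sp^2 = {pooled_var:.4f}, SE = {se:.4f}, tSTAT = {t_stat:.4f}, df = {df}")
```

The same function applied to the summary statistics in Problem 10.58(c) should give a standard error of about 5418.49 and a tSTAT of about 5.7747.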
10.61
Because the F test for the ratio of two variances revealed a significant difference between city and outlying restaurants on the cost variable, a separate-variance t test was used for this variable. Because no significant differences in variance were observed at the 0.05 level for the food, décor, and service ratings, a pooled-variance t test will be used for these variables.
From PHStat, Cost, Location: Center City, Metro Area
Separate-Variances t Test for the Difference Between Two Means
(assumes unequal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Population 1 Sample
  Sample Size                                     50
  Sample Mean                                     60.68
  Sample Standard Deviation                       23.5973
Population 2 Sample
  Sample Size                                     50
  Sample Mean                                     46.18
  Sample Standard Deviation                       14.7284
Intermediate Calculations
Numerator of Degrees of Freedom                   239.4821
Denominator of Degrees of Freedom                 2.9153
Total Degrees of Freedom                          82.1473
Degrees of Freedom                                82
Standard Error                                    3.9339
Difference in Sample Means                        14.5000
Separate-Variance t Test Statistic                3.6860
Two-Tail Test
Lower Critical Value                              -1.9893
Upper Critical Value                              1.9893
p-Value                                           0.0004
Reject the null hypothesis
The mean cost rating is significantly higher for Center City restaurants compared to Metro Area restaurants. Because tSTAT = 3.686 > 1.9893 or p-value = 0.0004 < 0.05, one would reject the null hypothesis that the mean cost ratings were the same for the Center City and Metro Area restaurants.
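The degrees of freedom in the separate-variance output come from the Welch-Satterthwaite approximation, which is then rounded down to an integer. A minimal Python sketch of that calculation is given below; the function name `separate_variance_t` is hypothetical, and small differences from the PHStat figures are expected because the standard deviations shown are rounded to four decimals.

```python
from math import floor, sqrt


def separate_variance_t(mean1, s1, n1, mean2, s2, n2):
    """Separate-variance (Welch) t statistic for H0: mu1 = mu2.

    Returns (t_stat, fractional_df, integer_df, standard_error).
    """
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se = sqrt(v1 + v2)
    numerator = (v1 + v2) ** 2
    denominator = v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)
    frac_df = numerator / denominator
    return (mean1 - mean2) / se, frac_df, floor(frac_df), se


# Problem 10.61, cost: Center City (n = 50) versus Metro Area (n = 50)
t_stat, frac_df, int_df, se = separate_variance_t(60.68, 23.5973, 50,
                                                  46.18, 14.7284, 50)
print(f"SE = {se:.4f}, tSTAT = {t_stat:.4f}, df = {frac_df:.4f} -> {int_df}")
```

Rounding the degrees of freedom down, as the output above does, gives a slightly larger critical value and therefore a slightly more conservative test.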
From PHStat, Food, Location: Center City, Metro Area
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Population 1 Sample
  Sample Size                                     50
  Sample Mean                                     22.88
  Sample Standard Deviation                       2.412890552
Population 2 Sample
  Sample Size                                     50
  Sample Mean                                     24.32
  Sample Standard Deviation                       2.132738012
Intermediate Calculations
Population 1 Sample Degrees of Freedom            49
Population 2 Sample Degrees of Freedom            49
Total Degrees of Freedom                          98
Pooled Variance                                   5.1853
Standard Error                                    0.4554
Difference in Sample Means                        -1.4400
t Test Statistic                                  -3.1619
Two-Tail Test
Lower Critical Value                              -1.9845
Upper Critical Value                              1.9845
p-Value                                           0.0021
Reject the null hypothesis
The mean food rating is significantly higher for Metro Area restaurants compared to Center City restaurants. Because tSTAT = –3.1619 < –1.9845 or p-value = 0.0021 < 0.05, one would reject the null hypothesis that the mean food ratings were the same for the Center City and Metro Area restaurants.
From PHStat, Decor, Location: Center City, Metro Area
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)
Data
Hypothesized Difference                           0
Level of Significance                             0.05
Population 1 Sample
  Sample Size                                     50
  Sample Mean                                     19.02
  Sample Standard Deviation                       4.515234982
Population 2 Sample
  Sample Size                                     50
  Sample Mean                                     18.76
  Sample Standard Deviation                       3.825585189
Intermediate Calculations
Population 1 Sample Degrees of Freedom            49
Population 2 Sample Degrees of Freedom            49
Total Degrees of Freedom                          98
Pooled Variance                                   17.5112
Standard Error                                    0.8369
Difference in Sample Means                        0.2600
t Test Statistic                                  0.3107
Two-Tail Test
Lower Critical Value                              -1.9845
Upper Critical Value                              1.9845
p-Value                                           0.7567
Do not reject the null hypothesis
There is no significant difference in mean decor rating for Center City restaurants compared to Metro Area restaurants. Because tSTAT = 0.3107 or p-value = 0.7567, one would not reject the null hypothesis that the mean decor ratings were the same for the Center City and Metro Area restaurants.
From PHStat, Service, Location: Center City, Metro Area
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Population 1 Sample
  Sample Size: 50
  Sample Mean: 20.62
  Sample Standard Deviation: 3.421957907
Population 2 Sample
  Sample Size: 50
  Sample Mean: 21.34
  Sample Standard Deviation: 2.599921506
Intermediate Calculations
  Population 1 Sample Degrees of Freedom: 49
  Population 2 Sample Degrees of Freedom: 49
  Total Degrees of Freedom: 98
  Pooled Variance: 9.2347
  Standard Error: 0.6078
  Difference in Sample Means: -0.7200
  t Test Statistic: -1.1847
Two-Tail Test
  Lower Critical Value: -1.9845
  Upper Critical Value: 1.9845
  p-Value: 0.2390
  Do not reject the null hypothesis
There is no significant difference in mean service rating for Center City restaurants compared to Metro Area restaurants. Because tSTAT = –1.1847 or p-value = 0.2390, one would not reject the null hypothesis that the mean service ratings were the same for the Center City and Metro Area restaurants.
10.62
(a)
H0: $\mu \le 10$ minutes. Introductory computer students required no more than a mean of 10 minutes to write and run a program in Python.
H1: $\mu > 10$ minutes. Introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
Decision rule: d.f. = 8. If tSTAT > 1.8595, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{12 - 10}{1.8028 / \sqrt{9}} = 3.3282$
Decision: Since tSTAT = 3.3282 is greater than the critical bound of 1.8595, reject H0. There is enough evidence to conclude that the introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
(b)
(c)
(d)
H0: $\mu \le 10$ minutes. Introductory computer students required no more than a mean of 10 minutes to write and run a program in Python.
H1: $\mu > 10$ minutes. Introductory computer students required more than a mean of 10 minutes to write and run a program in Python.
Decision rule: d.f. = 8. If tSTAT > 1.8595, reject H0.
Test statistic: $t_{STAT} = \dfrac{\bar{X} - \mu}{S / \sqrt{n}} = \dfrac{16 - 10}{13.2004 / \sqrt{9}} = 1.3636$
Decision: Since tSTAT = 1.3636 is less than the critical bound of 1.8595, do not reject H0. There is not enough evidence to conclude that the introductory computer students required more than a mean of 10 minutes to write and run a program in Python. Although the mean time necessary to complete the assignment increased from 12 to 16 minutes as a result of the increase in one data value, the standard deviation went from 1.8 to 13.2, which in turn brought the t-value down because of the increased denominator.
H0: $\sigma_{IC}^2 = \sigma_{CS}^2$
H1: $\sigma_{IC}^2 \ne \sigma_{CS}^2$
Decision rule: If FSTAT > 3.8549, reject H0.
Test statistic: $F_{STAT} = \dfrac{S_{CS}^2}{S_{IC}^2} = \dfrac{2.0^2}{1.8028^2} = 1.2307$
Decision: Since FSTAT = 1.2307 is lower than the critical bound 3.8549, do not reject H0. There is not enough evidence to conclude that the population variances are different for the Introduction to Computers students and computer majors. Hence, the pooled-variance t test is a valid test to see whether computer majors can write a Python program (on average) in less time than introductory students, assuming that the distributions of the time needed to write a Python program for both the Introduction to Computers students and the computer majors are approximately normal.
H0: $\mu_{IC} \le \mu_{CS}$ The mean amount of time needed by Introduction to Computers students is not greater than the mean amount of time needed by computer majors.
H1: $\mu_{IC} > \mu_{CS}$ The mean amount of time needed by Introduction to Computers students is greater than the mean amount of time needed by computer majors.
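The effect of the single changed data value on the one-sample t statistic can be checked directly from the two sets of summary statistics. This is a small sketch under the assumption that only the sample mean and standard deviation change (n = 9 in both cases); `one_sample_t` is a hypothetical helper name, and the critical value 1.8595 is taken from the decision rule above.

```python
import math

def one_sample_t(sample_mean, hypothesized_mean, sample_sd, n):
    """t statistic for a one-sample test of the mean (hypothetical helper)."""
    return (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))

CRITICAL_VALUE = 1.8595  # upper-tail critical value for d.f. = 8, alpha = 0.05

# Original sample: mean 12, standard deviation 1.8028
t_original = one_sample_t(12, 10, 1.8028, 9)
# Sample with one inflated value: mean 16, standard deviation 13.2004
t_changed = one_sample_t(16, 10, 13.2004, 9)

print(round(t_original, 4), t_original > CRITICAL_VALUE)
print(round(t_changed, 4), t_changed > CRITICAL_VALUE)
```

The larger standard deviation inflates the denominator faster than the larger mean inflates the numerator, which is why the second statistic falls below the critical value.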
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Population 1 Sample
  Sample Size: 9
  Sample Mean: 12
  Sample Standard Deviation: 1.802776
Population 2 Sample
  Sample Size: 11
  Sample Mean: 8.5
  Sample Standard Deviation: 2
Intermediate Calculations
  Population 1 Sample Degrees of Freedom: 8
  Population 2 Sample Degrees of Freedom: 10
  Total Degrees of Freedom: 18
  Pooled Variance: 3.666667
  Difference in Sample Means: 3.5
  t Test Statistic: 4.066633
Upper-Tail Test
  Upper Critical Value: 1.734064
  p-Value: 0.000362
  Reject the null hypothesis
Decision rule: d.f. = 18. If tSTAT > 1.7341, reject H0.
Test statistic:
$S_p^2 = \dfrac{(n_{IC} - 1) S_{IC}^2 + (n_{CS} - 1) S_{CS}^2}{(n_{IC} - 1) + (n_{CS} - 1)} = \dfrac{(9 - 1)(1.8028^2) + (11 - 1)(2.0^2)}{8 + 10} = 3.6667$
$t_{STAT} = \dfrac{(\bar{X}_{IC} - \bar{X}_{CS}) - (\mu_{IC} - \mu_{CS})}{\sqrt{S_p^2 \left( \dfrac{1}{n_{IC}} + \dfrac{1}{n_{CS}} \right)}} = \dfrac{12.0 - 8.5}{\sqrt{3.6667 \left( \dfrac{1}{9} + \dfrac{1}{11} \right)}} = 4.0666$
Decision: Since tSTAT = 4.0666 is greater than 1.7341, reject H0. There is enough evidence to support a conclusion that the mean time is higher for Introduction to Computers students than for computer majors.
p-value = 0.0052. If the true population mean amount of time needed for Introduction to Computers students to write a Python program is indeed no more than 10 minutes, the probability of observing a sample mean greater than the 12 minutes in the current sample is 0.0052, which makes it a quite unlikely event. Hence, at a 95% level of confidence, you can conclude that the population mean amount of time needed for Introduction to Computers students to write a Python program is more than 10 minutes.
(e) As illustrated in part (d), in which there is not enough evidence to conclude that the population variances are different for the Introduction to Computers students and computer majors, the pooled-variance t test performed is a valid test to determine whether computer majors can write a Python program in less time than introductory students, assuming that the distributions of the time needed to write a Python program for both the Introduction to Computers students and the computer majors are approximately normal.
10.63
(a)
F Test for Differences in Two Variances
Data
  Level of Significance: 0.05
Larger-Variance Sample
  Sample Size: 232
  Sample Variance: 1600
Smaller-Variance Sample
  Sample Size: 257
  Sample Variance: 400
Intermediate Calculations
  F Test Statistic: 4.0000
  Population 1 Sample Degrees of Freedom: 231
  Population 2 Sample Degrees of Freedom: 256
Two-Tail Test
  Upper Critical Value: 1.2857
  p-Value: 0.0000
  Reject the null hypothesis
An F test for the ratio of two variances revealed a significant difference between the variances of the consumers with elementary school children and the consumers with middle school children. This difference was significant at the 0.05 significance level.
(b)
Population 1 = Middle School, Population 2 = Elementary School
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Population 1 Sample
  Sample Size: 232
  Sample Mean: 345.9
  Sample Standard Deviation: 40.0000
Population 2 Sample
  Sample Size: 257
  Sample Mean: 318.4
  Sample Standard Deviation: 20.0000
Intermediate Calculations
  Numerator of Degrees of Freedom: 71.4527
  Denominator of Degrees of Freedom: 0.2154
  Total Degrees of Freedom: 331.7818
  Degrees of Freedom: 331
  Standard Error: 2.9074
  Difference in Sample Means: 27.5000
  Separate-Variance t Test Statistic: 9.4586
Two-Tail Test
  Lower Critical Value: -1.9672
  Upper Critical Value: 1.9672
  p-Value: 0.0000
  Reject the null hypothesis
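The intermediate values in the separate-variance output (the numerator and denominator of the degrees of freedom, the standard error, and the test statistic) can be reproduced from the summary statistics. The sketch below is an illustration, not PHStat's implementation; `separate_variance_t` is a hypothetical name.

```python
import math

def separate_variance_t(n1, mean1, sd1, n2, mean2, sd2):
    """Separate-variance t statistic with approximate degrees of freedom."""
    v1 = sd1 ** 2 / n1  # variance of the first sample mean
    v2 = sd2 ** 2 / n2  # variance of the second sample mean
    df_numerator = (v1 + v2) ** 2
    df_denominator = v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1)
    df_total = df_numerator / df_denominator
    std_err = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_err
    return df_numerator, df_denominator, df_total, std_err, t_stat

# Population 1 = Middle School, Population 2 = Elementary School
df_num, df_den, df_total, std_err, t_stat = separate_variance_t(
    232, 345.9, 40.0, 257, 318.4, 20.0)

# The output truncates the fractional degrees of freedom to an integer
df_used = math.floor(df_total)
print(round(df_num, 4), round(df_den, 4), round(df_total, 4), df_used)
print(round(std_err, 4), round(t_stat, 4))
```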
A separate-variance t test revealed a significant difference at the 0.05 level between the mean amount spent by consumers with elementary school children and consumers with middle school children. Consumers with middle school children spent significantly more than consumers with elementary school children. Because tSTAT = 9.4586 or p-value = 0.000, one would reject the null hypothesis that there is no significant difference in mean amount spent between consumers with elementary school children and consumers with middle school children.
(c) The confidence interval for the mean difference in amount spent is $21.9604 \le \mu_1 - \mu_2 \le 33.0396$.
10.64
Because an F test for the ratio of two variances revealed no significant differences between the variances of the two manufacturers, a pooled-variance t test is appropriate for these data.
The mean length of life was significantly longer for Manufacturer 2 compared to Manufacturer 1. Because tSTAT = –5.08 or p-value = 0.000, one would reject the null hypothesis that the mean length of bulb life is the same for the two manufacturers.
10.65
Population 1 = Wing A, 2 = Wing B
H0: $\sigma_1^2 = \sigma_2^2$ The population variances are the same.
H1: $\sigma_1^2 \ne \sigma_2^2$ The population variances are different.
Decision rule: If FSTAT > 2.5265, reject H0.
Test statistic: $F_{STAT} = \dfrac{S_1^2}{S_2^2} = \dfrac{1.4172^2}{1.3700^2} = 1.0701$
Decision: Since FSTAT = 1.0701 is lower than the critical bound of $F_{\alpha/2}$ = 2.5265, do not reject H0. There is not enough evidence to conclude that there is a difference between the variances in Wing A and Wing B. Hence, a pooled-variance t test is more appropriate for determining whether there is a difference in the mean delivery time in the two wings of the hotel.
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \ne \mu_2$
Decision rule: d.f. = 38. If tSTAT < –2.0244 or tSTAT > 2.0244, reject H0.
Test statistic:
$S_p^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{(n_1 - 1) + (n_2 - 1)} = \dfrac{(20 - 1)(1.3700^2) + (20 - 1)(1.4172^2)}{(20 - 1) + (20 - 1)} = 1.9427$
$t_{STAT} = \dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} = \dfrac{(10.40 - 8.12) - 0}{\sqrt{1.9427 \left( \dfrac{1}{20} + \dfrac{1}{20} \right)}} = 5.1615$
Decision: Since tSTAT = 5.1615 is greater than the upper critical bound of 2.0244, reject H0. There is enough evidence of a difference in the mean delivery time in the two wings of the hotel.
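The variance-ratio step that decides between the pooled-variance and separate-variance tests can be sketched as follows. The helper `f_ratio` is a hypothetical name; it places the larger sample variance in the numerator, which matches the figures shown for the two wings, and the critical value 2.5265 is copied from the decision rule rather than computed.

```python
def f_ratio(sd_a, sd_b):
    """Ratio of the larger sample variance to the smaller one."""
    var_a, var_b = sd_a ** 2, sd_b ** 2
    return max(var_a, var_b) / min(var_a, var_b)

F_CRITICAL = 2.5265  # upper critical value quoted in the solution

f_stat = f_ratio(1.3700, 1.4172)
variances_differ = f_stat > F_CRITICAL

# When the variances are not found to differ, pool them for the t test
n1 = n2 = 20
pooled_var = ((n1 - 1) * 1.3700 ** 2 + (n2 - 1) * 1.4172 ** 2) / (n1 + n2 - 2)
print(round(f_stat, 4), variances_differ, round(pooled_var, 4))
```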
10.66
H0: $\pi_1 = \pi_2$
H1: $\pi_1 \ne \pi_2$
where Populations: 1 = Males, 2 = Females
Decision rule: If p-value < 0.05, reject H0.
Gender: Z Test for Differences in Two Proportions
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Group 1
  Number of Items of Interest: 50
  Sample Size: 300
Group 2
  Number of Items of Interest: 96
  Sample Size: 330
Intermediate Calculations
  Group 1 Proportion: 0.166666667
  Group 2 Proportion: 0.290909091
  Difference in Two Proportions: -0.12424242
  Average Proportion: 0.2317
  Z Test Statistic: -3.6911
Two-Tail Test
  Lower Critical Value: -1.9600
  Upper Critical Value: 1.9600
  p-Value: 0.0002
  Reject the null hypothesis
Decision: Since the p-value is smaller than 0.05, reject H0. There is enough evidence of a difference between males and females in the proportion who order dessert.
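The Z statistic and two-tail p-value for a difference in two proportions can be recomputed with the standard library alone. The sketch below is illustrative; `two_proportion_z` is a hypothetical helper, and the beef entrée counts (74 of 197 and 68 of 433) are the ones reported in the second output for this problem.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z test for the difference between two proportions (pooled estimate)."""
    p1, p2 = x1 / n1, x2 / n2
    p_bar = (x1 + x2) / (n1 + n2)  # pooled (average) proportion
    std_err = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / std_err
    # Two-tail p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

z_gender, p_gender = two_proportion_z(50, 300, 96, 330)  # males vs. females
z_beef, p_beef = two_proportion_z(74, 197, 68, 433)      # beef entree vs. not

print(round(z_gender, 4), round(p_gender, 4))
print(round(z_beef, 4), round(p_beef, 4))
```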
Beef Entrée: Z Test for Differences in Two Proportions
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Group 1
  Number of Items of Interest: 74
  Sample Size: 197
Group 2
  Number of Items of Interest: 68
  Sample Size: 433
Intermediate Calculations
  Group 1 Proportion: 0.375634518
  Group 2 Proportion: 0.15704388
  Difference in Two Proportions: 0.218590638
  Average Proportion: 0.2254
  Z Test Statistic: 6.0873
Two-Tail Test
  Lower Critical Value: -1.9600
  Upper Critical Value: 1.9600
  p-Value: 0.0000
  Reject the null hypothesis
Decision: Since the p-value = 0.0000 is smaller than 0.05, reject H0. There is enough evidence of a difference in the proportion who order dessert based on whether a beef entrée has been ordered.
10.67
[Figure: Normal Probability Plot, Vermont versus Z Value]
[Figure: Normal Probability Plot, Boston versus Z Value]
Because the normal probability plots suggest that the two populations are not normally distributed, an F test is inappropriate for testing the difference in two variances. The sample variances for Boston and Vermont shingles are 1204.992 and 2185.032, respectively. It appears that a separate-variance t test is more appropriate for testing the difference in means.
H0: $\mu_B = \mu_V$ Mean weights of Boston and Vermont shingles are the same.
H1: $\mu_B \ne \mu_V$ Mean weights of Boston and Vermont shingles are different.
Since the p-value is essentially zero, reject H0. There is sufficient evidence to conclude that the mean weights of Boston and Vermont shingles are different.
10.68
The normal probability plots suggest that the two populations are not normally distributed. An F test is inappropriate for testing the difference in the two variances. The sample variances for Boston and Vermont shingles are 0.0203 and 0.015, respectively. Because tSTAT = 3.015 > 1.967 or p-value = 0.0028 < $\alpha$ = 0.05, reject H0. There is sufficient evidence to conclude that there is a difference in the mean granule loss of Boston and Vermont shingles.
10.69
Because an F test for the ratio of two variances revealed an insignificant difference at the 0.05 level between the variances of the two types of smartphone batteries, a pooled-variance t test is appropriate for these data.
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
  Hypothesized Difference: 0
  Level of Significance: 0.05
Population 1 Sample
  Sample Size: 30
  Sample Mean: 30
  Sample Standard Deviation: 43.2
Population 2 Sample
  Sample Size: 30
  Sample Mean: 35
  Sample Standard Deviation: 34.2
Intermediate Calculations
  Population 1 Sample Degrees of Freedom: 29
  Population 2 Sample Degrees of Freedom: 29
  Total Degrees of Freedom: 58
  Pooled Variance: 1517.9400
  Standard Error: 10.0596
  Difference in Sample Means: -5.0000
  t Test Statistic: -0.4970
Two-Tail Test
  Lower Critical Value: -2.0017
  Upper Critical Value: 2.0017
  p-Value: 0.6210
  Do not reject the null hypothesis
A pooled-variance t test revealed no significant difference in mean charging time between the two types of iPhones. Because tSTAT = –0.4970 or p-value = 0.6210, one would not reject the null hypothesis that there is no difference in mean time to charge to 50% capacity between the iPhone 14 Pro and iPhone 14 Pro Max.
10.70
An analysis of the data in 10.67 revealed that a pallet of Vermont shingles weighed significantly more than a pallet of Boston shingles. Assuming that a heavier shingle is associated with higher quality, the Vermont shingle might be perceived as a higher quality shingle compared to the Boston shingle. The figure below shows the results from a separate-variance t test.
An analysis of the data in 10.68 revealed that the Vermont shingles were associated with less granule loss compared to the Boston shingles following accelerated-life testing. Shingles with less weight loss are assumed to have a longer life expectancy. Both shingles would be expected to outlast the warranty period because their weight losses were well below the 0.8 gram threshold. However, the Vermont shingles would be expected to have a longer life expectancy given that they lost less weight relative to the Boston shingles. The figure below shows the results from a pooled-variance t test.
Taken together, the results from 10.67 and 10.68 suggest that the Vermont shingle may be a higher quality shingle based on pallet weight and life expectancy as determined by granule loss associated with accelerated-life testing. These conclusions suggest that the manufacturer may be able to charge more for the Vermont shingle compared to the Boston shingle.
Chapter 11
11.1
(a) df A = c – 1 = 5 – 1 = 4
(b) df W = n – c = 30 – 5 = 25
(c) df T = n – 1 = 30 – 1 = 29
11.2
(a) SSW = SST – SSA = 210 – 60 = 150
(b) $MSA = \dfrac{SSA}{c - 1} = \dfrac{60}{5 - 1} = 15$
(c) $MSW = \dfrac{SSW}{n - c} = \dfrac{150}{30 - 5} = 6$
(d) $F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{15}{6} = 2.5$
11.3
(a)
Source          df    SS    MS    F
Among groups     4    60    15    2.5
Within groups   25   150     6
Total           29   210
(b) F(4, 25) = 2.76
(c) Decision rule: If F > 2.76, reject H0.
(d) Decision: Since F = 2.5 is less than the critical bound 2.76, do not reject H0.
11.4
(a) df A = c – 1 = 3 – 1 = 2
(b) df W = n – c = 15 – 3 = 12
(c) df T = n – 1 = 15 – 1 = 14
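The bookkeeping in 11.1 through 11.3 follows one fixed sequence: degrees of freedom, then sums of squares, then mean squares, then the F ratio. A minimal sketch of that sequence, using the values from 11.2 (c = 5 groups, n = 30 observations, SSA = 60, SST = 210); the function name `anova_summary` is an illustrative assumption.

```python
def anova_summary(ssa, sst, c, n):
    """Fill in a one-way ANOVA summary table from SSA, SST, c groups, n values."""
    df_among, df_within, df_total = c - 1, n - c, n - 1
    ssw = sst - ssa           # within-group sum of squares
    msa = ssa / df_among      # mean square among groups
    msw = ssw / df_within     # mean square within groups
    f_stat = msa / msw
    return {
        "df": (df_among, df_within, df_total),
        "SS": (ssa, ssw, sst),
        "MS": (msa, msw),
        "F": f_stat,
    }

table = anova_summary(ssa=60, sst=210, c=5, n=30)
print(table)
```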
11.5
Source          df            SS               MS            F
Among groups    4 – 1 = 3     (80)(3) = 240    80            80/24 = 3.33
Within groups   24 – 4 = 20   480              480/20 = 24
Total           24 – 1 = 23   240 + 480 = 720
11.6
(a) Decision rule: If FSTAT > 3.10, reject H0.
(b) Since FSTAT = 3.33 is greater than the critical bound of 3.10, reject H0.
(c) There are c = 4 degrees of freedom in the numerator and n – c = 24 – 4 = 20 degrees of freedom in the denominator. For 4 degrees of freedom in the numerator and 20 degrees of freedom in the denominator, the critical value is Q = 3.96.
(d) To perform the Tukey-Kramer procedure, the critical range is
$Q \sqrt{\dfrac{MSW}{2} \left( \dfrac{1}{n_j} + \dfrac{1}{n_{j'}} \right)} = 3.96 \sqrt{\dfrac{24}{2} \left( \dfrac{1}{6} + \dfrac{1}{6} \right)} = 7.92$
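The Tukey-Kramer critical range is a single formula in Q, MSW, and the two group sizes. As a small sketch (the function name `tukey_kramer_range` is hypothetical, and Q is read from the Studentized range table rather than computed):

```python
import math

def tukey_kramer_range(q, msw, n_j, n_k):
    """Critical range for comparing two group means with sizes n_j and n_k."""
    return q * math.sqrt((msw / 2) * (1 / n_j + 1 / n_k))

# 11.6: Q = 3.96, MSW = 24, six observations in each group
critical_range = tukey_kramer_range(3.96, 24, 6, 6)
print(round(critical_range, 2))
```

Any pair of sample means whose absolute difference exceeds this range is declared different.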
11.7
(a)
At the 0.05 level of significance, there is no evidence that there are differences among the four toasting times. Because FSTAT = 1.14 or p-value = 0.389, do not reject H0. Because of the small sample size within each group (3), caution should be taken when interpreting these results. Further analysis with larger sample sizes is recommended.
(b)
Because the FSTAT statistic was not significant, it would not be appropriate to perform post-hoc comparisons to determine differences among the groups.
(c) Because the Levene test FSTAT = 0.13 or p-value = 0.939, there is no evidence that the variances are different among the 20 sec, 40 sec, 60 sec, and 80 sec groups.
(d) Although the FSTAT was not significant, the four toasting time groups had slightly different means. The means are 0.967, 1.367, 1.300, and 0.667 for the 20 sec, 40 sec, 60 sec, and 80 sec groups, respectively. There is no evidence that these differences could be attributed to toasting times.
11.8
(a) Null hypothesis, H0: All means are equal, $\mu_A = \mu_B = \mu_C = \mu_D$
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 4 – 1 = 3, n = 4(8) = 32, n – c = 32 – 4 = 28
From PHStat
ANOVA: Single Factor
SUMMARY
Groups      Count   Sum       Average     Variance
Europe      8       1309      163.625     14513.9821
Americas    8       1236.77   154.59625   1986.5641
Asia        8       1050      131.25      901.6429
Africa      8       1115      139.375     855.9821
ANOVA
Source of Variation   SS            df   MS          F        P-value   F crit
Between Groups        5120.9418     3    1706.9806   0.3740   0.7724    2.9467
Within Groups         127807.1988   28   4564.5428
Total                 132928.1406   31
Level of significance: 0.05
$MSA = \dfrac{SSA}{c - 1} = \dfrac{5{,}120.9418}{4 - 1} = 1{,}706.9806$
$MSW = \dfrac{SSW}{n - c} = \dfrac{127{,}807.1988}{32 - 4} = 4{,}564.5428$
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{1{,}706.9806}{4{,}564.5428} = 0.3740$
Because the p-value is 0.7724 and FSTAT = 0.3740 < 2.95, do not reject H0. There is insufficient evidence of a difference in the mean export price across the four global regions.
(b) There are c = 4 degrees of freedom in the numerator and n – c = 32 – 4 = 28 degrees of freedom in the denominator. The table does not have 28 degrees of freedom in the denominator, so use the next larger critical value, Q = 3.90.
To perform the Tukey-Kramer procedure, the critical range is
$Q \sqrt{\dfrac{MSW}{2} \left( \dfrac{1}{n_j} + \dfrac{1}{n_{j'}} \right)} = 3.90 \sqrt{\dfrac{4{,}564.5428}{2} \left( \dfrac{1}{8} + \dfrac{1}{8} \right)} = 93.158$
From the Tukey-Kramer procedure, there is no difference in the export prices of the four regions.
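The single-factor ANOVA in 11.8(a) can be rebuilt from the SUMMARY block alone, because SSA depends only on the group counts and means and SSW only on the counts and variances. The sketch below assumes the rounded summary values printed in the output; `anova_from_summaries` is a hypothetical helper name.

```python
def anova_from_summaries(groups):
    """One-way ANOVA from (count, mean, variance) tuples, one per group."""
    n_total = sum(n for n, _, _ in groups)
    c = len(groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Among-group variation: weighted squared deviations of group means
    ssa = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group variation: (n - 1) times each sample variance
    ssw = sum((n - 1) * var for n, _, var in groups)
    msa = ssa / (c - 1)
    msw = ssw / (n_total - c)
    return ssa, ssw, msa, msw, msa / msw

export_prices = [
    (8, 163.625, 14513.9821),   # Europe
    (8, 154.59625, 1986.5641),  # Americas
    (8, 131.25, 901.6429),      # Asia
    (8, 139.375, 855.9821),     # Africa
]
ssa, ssw, msa, msw, f_stat = anova_from_summaries(export_prices)
print(round(ssa, 4), round(ssw, 4), round(msa, 4), round(msw, 4), round(f_stat, 4))
```

Because the variances are rounded to four decimals in the output, SSW agrees with the table only to about that precision.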
(c) ANOVA output for Levene’s test for homogeneity of variance: From PHStat
ANOVA: Levene Test
SUMMARY
Groups      Count   Sum      Average    Variance
Europe      8       409      51.125     13603.2679
Americas    8       251.23   31.40375   925.4298
Asia        8       152      19         586.8571
Africa      8       163      20.375     435.5536
ANOVA
Source of Variation   SS            df   MS          F        P-value   F crit
Between Groups        5287.7656     3    1762.5885   0.4534   0.7170    2.9467
Within Groups         108857.7588   28   3887.7771
Total                 114145.5244   31
Level of significance: 0.05
$MSA = \dfrac{SSA}{c - 1} = \dfrac{5{,}287.7656}{4 - 1} = 1{,}762.5885$
$MSW = \dfrac{SSW}{n - c} = \dfrac{108{,}857.7588}{32 - 4} = 3{,}887.7771$
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{1{,}762.5885}{3{,}887.7771} = 0.4534$
Because the p-value is 0.7170 > 0.05 and FSTAT = 0.4534 < 2.9467, do not reject H0. There is insufficient evidence to conclude that the variances in the export prices across the four global regions are different.
(d) From the results in (a) and (c), there is no evidence of a difference in the mean or the variance of the export prices in the different regions.
11.9
(a) H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4$, where 1 = Main, 2 = Satellite 1, 3 = Satellite 2, 4 = Satellite 3
H1: Not all $\mu_j$ are equal, where j = 1, 2, 3, 4
Decision Rule: If p-value < 0.05, reject H0. Since p-value = 0.0009 < 0.05, reject the null hypothesis. There is enough evidence to conclude that there is a significant difference in the mean waiting time in the four locations.
(b) From the Tukey Pairwise Comparison procedure, there is a difference in mean waiting time between the main campus and Satellite 1, and the main campus and Satellite 3.
(c) H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2$
H1: At least one variance is different.
Source of Variation   SS         df   MS         F        P-value   F crit
Between Groups        310.979    3    103.6597   0.8201   0.4883    2.7694
Within Groups         7078.435   56   126.4006
Total                 7389.414   59
Since the p-value = 0.4883 > 0.05, do not reject H0. There is not enough evidence to conclude there is a significant difference in the variation in waiting time among the four locations.
11.10
(a) H0: $\mu_A = \mu_B = \mu_C = \mu_D = \mu_E$
H1: At least one mean is different.
Since the p-value is essentially zero, reject H0. There is evidence of a difference in the mean rating of the five advertisements.
(b) There is a difference in the mean rating between advertisements A and C, between A and D, between B and C, between B and D, and between D and E.
(c) H0: $\sigma_A^2 = \sigma_B^2 = \sigma_C^2 = \sigma_D^2 = \sigma_E^2$
H1: At least one variance is different.
ANOVA output for Levene’s test for homogeneity of variance:
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Between Groups        14.13333   4    3.533333   1.927273   0.137107   2.758711
Within Groups         45.83333   25   1.833333
Total                 59.96667   29
Since the p-value = 0.137 > 0.05, do not reject H0. There is not enough evidence to conclude there is a difference in the variation in rating among the five advertisements.
(d) There is no significant difference between advertisements A and B, and they have the highest mean rating among the five and should be used. There is no significant difference between advertisements C and D, and they are among the lowest in mean rating and should be avoided.
11.11
(a) Null hypothesis, H0: All means are equal
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 6 – 1 = 5, n = 50, n – c = 50 – 6 = 44
From PHStat
ANOVA: Single Factor
SUMMARY
Groups     Count   Sum     Average       Variance
Burger     14      27970   1997.857143   897397.3626
Snack      6       4713    785.5         165132.3000
Chicken    9       18012   2001.333333   1379336.5000
Global     6       9361    1560.166667   152194.9667
Sandwich   9       11928   1325.333333   658107.7500
Pizza      6       4905    817.5         45417.9000
ANOVA
Source of Variation   SS              df   MS             F        P-value   F crit
Between Groups        11814909.0324   5    2362981.8065   3.4914   0.0096    2.4270
Within Groups         29779445.5476   44   676805.5806
Total                 41594354.5800   49
Level of significance: 0.05
$MSA = \dfrac{SSA}{c - 1} = \dfrac{11{,}814{,}909.0324}{6 - 1} = 2{,}362{,}981.8065$
$MSW = \dfrac{SSW}{n - c} = \dfrac{29{,}779{,}445.5476}{50 - 6} = 676{,}805.5806$
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{2{,}362{,}981.8065}{676{,}805.5806} = 3.4914$
Because the p-value is 0.0096 < 0.05 and FSTAT = 3.4914 > 2.4270, reject H0. At the 0.05 level of significance, there is sufficient evidence that there are differences in U.S. mean sales per unit among the six food segments.
(b) c – 1 = 6 – 1 = 5, n = 50, n – c = 50 – 6 = 44
From PHStat
ANOVA: Levene Test
SUMMARY
Groups     Count   Sum    Average       Variance
Burger     14      9754   696.7142857   532090.1813
Snack      6       1669   278.1666667   74656.5667
Chicken    9       6945   771.6666667   800384.5000
Global     6       1667   277.8333333   60806.9667
Sandwich   9       5319   591           294086.7500
Pizza      6       923    153.8333333   18575.4667
ANOVA
Source of Variation   SS              df   MS            F        P-value   F crit
Between Groups        2558287.0629    5    511657.4126   1.3691   0.2543    2.4270
Within Groups         16443137.3571   44   373707.6672
Total                 19001424.4200   49
Level of significance: 0.05
$MSA = \dfrac{SSA}{c - 1} = \dfrac{2{,}558{,}287.0629}{6 - 1} = 511{,}657.4126$
$MSW = \dfrac{SSW}{n - c} = \dfrac{16{,}443{,}137.3571}{50 - 6} = 373{,}707.6672$
$F_{STAT} = \dfrac{MSA}{MSW} = \dfrac{511{,}657.4126}{373{,}707.6672} = 1.3691$
Because the p-value is 0.2543 > 0.05 and FSTAT = 1.3691 < 2.4270, do not reject H0. There is insufficient evidence that the variances are different among the six food segments.
(c) Because the Levene test did not reveal significant differences in variation in U.S. average sales per unit, the results in (a) would be valid.
(d) There are c = 6 degrees of freedom in the numerator and n – c = 44 degrees of freedom in the denominator. The table does not have 44 degrees of freedom in the denominator, so use the next larger critical value, Q = 4.23.
From PHStat
Tukey-Kramer Multiple Comparisons
Group         Sample Mean   Sample Size
1: Burger     1997.857      14
2: Snack      785.5         6
3: Chicken    2001.333      9
4: Global     1560.167      6
5: Sandwich   1325.333      9
6: Pizza      817.5         6
Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   1212.357              283.8522378                1201             Means are different
Group 1 to Group 3   3.47619               248.5396104                1051             Means are not different
Group 1 to Group 4   437.6905              283.8522378                1201             Means are not different
Group 1 to Group 5   672.5238              248.5396104                1051             Means are not different
Group 1 to Group 6   1180.357              283.8522378                1201             Means are not different
Group 2 to Group 3   1215.833              306.5954584                1297             Means are not different
Group 2 to Group 4   774.6667              335.8584971                1421             Means are not different
Group 2 to Group 5   539.8333              306.5954584                1297             Means are not different
Group 2 to Group 6   32                    335.8584971                1421             Means are not different
Group 3 to Group 4   441.1667              306.5954584                1297             Means are not different
Group 3 to Group 5   676                   274.2273146                1160             Means are not different
Group 3 to Group 6   1183.833              306.5954584                1297             Means are not different
Group 4 to Group 5   234.8333              306.5954584                1297             Means are not different
Group 4 to Group 6   742.6667              335.8584971                1421             Means are not different
Group 5 to Group 6   507.8333              306.5954584                1297             Means are not different
Other Data
  Level of significance: 0.05
  Numerator d.f.: 6
  Denominator d.f.: 44
  MSW: 676805.6
  Q Statistic: 4.23
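With unequal sample sizes, each pair of groups gets its own Tukey-Kramer critical range. The comparisons in the output can be retraced with a short loop; this is a sketch using the rounded means, MSW, and Q printed in the output, and the variable names are illustrative.

```python
import math
from itertools import combinations

Q = 4.23            # Q statistic from the output
MSW = 676805.5806   # mean square within from the ANOVA table

# (segment, sample mean, sample size) as listed in the output
segments = [
    ("Burger", 1997.857, 14), ("Snack", 785.5, 6), ("Chicken", 2001.333, 9),
    ("Global", 1560.167, 6), ("Sandwich", 1325.333, 9), ("Pizza", 817.5, 6),
]

different_pairs = []
for (name_j, mean_j, n_j), (name_k, mean_k, n_k) in combinations(segments, 2):
    # Critical range depends on the two sample sizes being compared
    critical_range = Q * math.sqrt((MSW / 2) * (1 / n_j + 1 / n_k))
    if abs(mean_j - mean_k) > critical_range:
        different_pairs.append((name_j, name_k))

print(different_pairs)
```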
Although the ANOVA FSTAT value for the overall F test was significant at the 0.05 significance level, the Tukey-Kramer procedure revealed only one pairwise difference among the six food segments at the 0.05 significance level, between the Burger and Snack segments. The results should be interpreted with caution due to the relatively small sample sizes.
11.12
(a)
Source          Degrees of Freedom   Sum of Squares         Mean Squares         F
Among Groups    2                    62,160,064,576.58      31,080,032,288.29    1.2239
Within Groups   41                   1,041,172,839,201.60   25,394,459,492.722
Total           43                   1,103,332,903,778.18
(b) Because FSTAT = 1.2239 < 3.23, do not reject H0. There is insufficient evidence that there are significant differences in mean brand value among the financial institution, technology, and telecom sectors.
(c) Because the results in (b) indicated that there were no differences in mean brand value among the three sectors, it would not be appropriate to use the Tukey-Kramer procedure.
11.13
(a) H0: $\mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$
H1: At least one mean is different.
Population: 1 = Kidney, 2 = Shrimp, 3 = Chicken Liver, 4 = Salmon, 5 = Beef
Decision rule: df: 4, 45. If p-value < 0.05, reject H0.
ANOVA
Source of Variation   SS        df   MS        F          P-value    F crit
Between Groups        3.65896   4    0.91474   20.80541   9.15E-10   2.578739
Within Groups         1.97849   45   0.04397
Total                 5.63745   49
Test statistic: FSTAT = 20.8054
p-value is essentially 0
Decision: Since p-value < 0.05, reject H0. There is evidence of a significant difference in the mean amount of food eaten among the various products. To determine which of the means are significantly different from one another, you use the Tukey-Kramer procedure.
(b)
Tukey-Kramer Multiple Comparisons
Group   Sample Mean   Sample Size
1       2.456         10
2       2.409         10
3       2.368         10
4       2.028         10
5       1.754         10
Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   0.047                 0.0663072                  0.2513           Means are not different
Group 1 to Group 3   0.088                 0.0663072                  0.2513           Means are not different
Group 1 to Group 4   0.428                 0.0663072                  0.2513           Means are different
Group 1 to Group 5   0.702                 0.0663072                  0.2513           Means are different
Group 2 to Group 3   0.041                 0.0663072                  0.2513           Means are not different
Group 2 to Group 4   0.381                 0.0663072                  0.2513           Means are different
Group 2 to Group 5   0.655                 0.0663072                  0.2513           Means are different
Group 3 to Group 4   0.34                  0.0663072                  0.2513           Means are different
Group 3 to Group 5   0.614                 0.0663072                  0.2513           Means are different
Group 4 to Group 5   0.274                 0.0663072                  0.2513           Means are different
Other Data
  Level of significance: 0.05
  Numerator d.f.: 5
  Denominator d.f.: 45
  MSW: 0.043966
  Q Statistic: 3.79
The kidney-based, shrimp-based, and chicken-liver-based products are not significantly different from one another, while the salmon-based and beef-based products are significantly different from the others and from each other.
(c) H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2 = \sigma_5^2$
H1: At least one variance is different.
Excel output for Levene’s test for homogeneity of variance:
ANOVA
Source of Variation   SS        df   MS         F          P-value    F crit
Between Groups        0.13412   4    0.03353    2.114035   0.094673   2.578739
Within Groups         0.71373   45   0.015861
Total                 0.84785   49
Since the p-value = 0.0947 > 0.05, do not reject H0. There is not enough evidence to conclude there is a significant difference in the variation in the amount of food eaten among the various products.
(d) The pet food company should conclude that the mean amounts of cat food eaten for the kidney-based, shrimp-based, and chicken-liver-based products are not significantly different from each other but are significantly higher than for salmon-based products, and that the mean amount eaten for salmon-based products is significantly higher than for beef-based products.
11.14
(a) Null hypothesis, H0: All means are equal, μ1 = μ2 = μ3 = μ4
Alternative hypothesis, H1: At least one mean is different.
c – 1 = 4 – 1 = 3, n = 4(10) = 40, n – c = 40 – 4 = 36
From PHStat
ANOVA: Single Factor
SUMMARY
Groups          Count   Sum   Average   Variance
Asia            10      345   34.5      39.3889
Europe          10      421   42.1      65.8778
North America   10      275   27.5      21.6111
South America   10      301   30.1      72.9889
ANOVA
Source of Variation   SS          df   MS         F        P-value   F crit
Between Groups        1225.1000    3   408.3667   8.1728   0.0003    2.8663
Within Groups         1798.8000   36   49.9667
Total                 3023.9000   39
Level of significance 0.05
MSA = SSA/(c – 1) = 1225.1000/(4 – 1) = 408.3667
MSW = SSW/(n – c) = 1798.8000/(40 – 4) = 49.9667
FSTAT = MSA/MSW = 408.3667/49.9667 = 8.1728
Because p-value = 0.0003 < 0.05 and FSTAT = 8.1728 > 2.8663, reject H0. At the 0.05 level of significance, there is evidence of differences in congestion levels among the four continents. However, the ANOVA FSTAT does not indicate which groups differ from one another.
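The arithmetic in 11.14 (a) can be checked from the SUMMARY table alone. The sketch below is only an illustration, not PHStat's implementation: the function name `one_way_anova_from_summary` is hypothetical, and it assumes the variances reported are sample variances (divisor n – 1).

```python
def one_way_anova_from_summary(counts, means, variances):
    """Return (SSA, SSW, MSA, MSW, F) from per-group counts, means, and sample variances."""
    n = sum(counts)                      # total sample size
    c = len(counts)                      # number of groups
    grand_mean = sum(k * m for k, m in zip(counts, means)) / n
    # Among-group sum of squares: weighted squared distance of each mean from the grand mean
    ssa = sum(k * (m - grand_mean) ** 2 for k, m in zip(counts, means))
    # Within-group sum of squares: each sample variance times its degrees of freedom
    ssw = sum((k - 1) * v for k, v in zip(counts, variances))
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return ssa, ssw, msa, msw, msa / msw


# Values copied from the 11.14 SUMMARY table (Asia, Europe, North America, South America)
counts = [10, 10, 10, 10]
means = [34.5, 42.1, 27.5, 30.1]
variances = [39.3889, 65.8778, 21.6111, 72.9889]

ssa, ssw, msa, msw, f_stat = one_way_anova_from_summary(counts, means, variances)
print(round(ssa, 4), round(ssw, 4), round(msa, 4), round(msw, 4), round(f_stat, 4))
```

Because the variances are rounded to four decimals in the table, SSW and FSTAT agree with the PHStat output only up to rounding.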
(b)
There are c = 4 df for the numerator and n – c = 36 df for the denominator. The table does not have 36 degrees of freedom in the denominator, so use the next larger critical value, Q = 3.84. From PHStat:
Tukey-Kramer Multiple Comparisons
Group              Sample Mean   Sample Size
1: Asia            34.5          10
2: Europe          42.1          10
3: North America   27.5          10
4: South America   30.1          10
Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   7.6                   2.2353225                  8.584            Means are not different
Group 1 to Group 3   7                     2.2353225                  8.584            Means are not different
Group 1 to Group 4   4.4                   2.2353225                  8.584            Means are not different
Group 2 to Group 3   14.6                  2.2353225                  8.584            Means are different
Group 2 to Group 4   12                    2.2353225                  8.584            Means are different
Group 3 to Group 4   2.6                   2.2353225                  8.584            Means are not different
Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        36
MSW                     49.96667
Q Statistic             3.84
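The critical range in the output above follows from Q, MSW, and the two sample sizes. A minimal sketch, assuming the usual Tukey-Kramer form: critical range = Q × sqrt((MSW/2)(1/nj + 1/nj′)). The function name `tukey_kramer` is hypothetical, and Q = 3.84 is taken from the table lookup rather than computed.

```python
import math
from itertools import combinations


def tukey_kramer(means, sizes, msw, q):
    """Return (group_i, group_j, abs_difference, critical_range, differs) for every pair."""
    results = []
    for i, j in combinations(range(len(means)), 2):
        diff = abs(means[i] - means[j])
        # Standard error of the difference for this pair of sample sizes
        std_err = math.sqrt((msw / 2) * (1 / sizes[i] + 1 / sizes[j]))
        crit = q * std_err
        results.append((i + 1, j + 1, diff, crit, diff > crit))
    return results


# 11.14 (b): 1 = Asia, 2 = Europe, 3 = North America, 4 = South America
comparisons = tukey_kramer([34.5, 42.1, 27.5, 30.1], [10, 10, 10, 10], 49.96667, 3.84)
different_pairs = [(i, j) for i, j, _, _, differs in comparisons if differs]
print(different_pairs)
```

Only the pairs whose absolute difference exceeds the critical range are flagged, which matches the "Results" column of the output.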
Because the results from (a) revealed a significant difference in congestion levels among the four continents, the Tukey-Kramer procedure is appropriate. Europe is different from North America and South America. No other pairs of continents differ significantly.
(c) The ANOVA F test assumes randomness and independence of samples, normally distributed data in all groups, and equal variances among groups.
(d) Null hypothesis, H0: All variances are equal. Alternative hypothesis, H1: At least one variance is different.
c – 1 = 4 – 1 = 3, n = 4(10) = 40, n – c = 40 – 4 = 36
From PHStat
ANOVA: Levene Test
SUMMARY
Groups          Count   Sum   Average   Variance
Asia            10      43    4.3       21.3444
Europe          10      67    6.7       23.5111
North America   10      37    3.7       7.5111
South America   10      65    6.5       26.0556
ANOVA
Source of Variation   SS         df   MS        F        P-value   F crit
Between Groups        69.6000     3   23.2000   1.1833   0.3297    2.8663
Within Groups         705.8000   36   19.6056
Total                 775.4000   39
Level of significance 0.05
Because the Levene test FSTAT = 1.1833 < 2.8663 or p-value = 0.3297 > 0.05, do not reject H0. There is insufficient evidence that the variances are different among the four regions.
11.15
(a) df A = r – 1 = 3 – 1 = 2
    df B = c – 1 = 3 – 1 = 2
(b) df AB = (r – 1)(c – 1) = (3 – 1)(3 – 1) = 4
(c) df E = rc(n′ – 1) = (3)(3)(3 – 1) = 18
(d) df T = n – 1 = rcn′ – 1 = (3)(3)(3) – 1 = 26
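The degrees-of-freedom formulas in 11.15 can be sketched as a small helper. The function name `two_factor_df` is hypothetical; `r`, `c`, and `n_rep` stand for the number of levels of factor A, the number of levels of factor B, and the number of replicates n′.

```python
def two_factor_df(r, c, n_rep):
    """Degrees of freedom for a two-factor factorial design with n_rep replicates per cell."""
    return {
        "A": r - 1,
        "B": c - 1,
        "AB": (r - 1) * (c - 1),
        "E": r * c * (n_rep - 1),
        "T": r * c * n_rep - 1,
    }


df = two_factor_df(3, 3, 3)          # the design in 11.15
print(df)
# The four components always add up to the total degrees of freedom
assert df["A"] + df["B"] + df["AB"] + df["E"] == df["T"]
```

The same helper reproduces the df column of 11.19 with `two_factor_df(3, 4, 12)`.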
11.16
Source            df   SS                               MS
Factor A           2   120                              (b) 120 ÷ 2 = 60
Factor B           2   110                              (b) 110 ÷ 2 = 55
Interaction, AB    4   (a) 540 – 120 – 110 – 270 = 40   (c) 40 ÷ 4 = 10
Error, E          18   270                              (d) 270 ÷ 18 = 15
Total, T          26   540
(a) 540 – 120 – 110 – 270 = 40
(b) 120 ÷ 2 = 60 and 110 ÷ 2 = 55
(c) 40 ÷ 4 = 10
(d) 270 ÷ 18 = 15
11.17
(a) FSTAT = MSAB/MSE = 10/15 = 0.67
(b) FSTAT = MSA/MSE = 60/15 = 4
(c) FSTAT = MSB/MSE = 55/15 = 3.67
(d)
Source            df   SS                           MS             F
Factor A           2   120                          120 ÷ 2 = 60   (b) 60 ÷ 15 = 4
Factor B           2   110                          110 ÷ 2 = 55   (c) 55 ÷ 15 = 3.67
Interaction, AB    4   540 – 120 – 110 – 270 = 40   40 ÷ 4 = 10    (a) 10 ÷ 15 = 0.67
Error, E          18   270                          270 ÷ 18 = 15
Total, T          26   540
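The table in 11.16 and 11.17 is filled in by subtraction and division only. A minimal sketch of those steps, with a hypothetical function name `two_way_table` and the sums of squares and degrees of freedom passed in as given:

```python
def two_way_table(ssa, ssb, sse, sst, df_a, df_b, df_ab, df_e):
    """Complete a two-way ANOVA summary table from the given sums of squares and df."""
    ssab = sst - ssa - ssb - sse                 # interaction SS by subtraction
    msa, msb = ssa / df_a, ssb / df_b
    msab, mse = ssab / df_ab, sse / df_e
    return {
        "SSAB": ssab,
        "MSA": msa, "MSB": msb, "MSAB": msab, "MSE": mse,
        "F_A": msa / mse, "F_B": msb / mse, "F_AB": msab / mse,
    }


table = two_way_table(ssa=120, ssb=110, sse=270, sst=540, df_a=2, df_b=2, df_ab=4, df_e=18)
for name, value in table.items():
    print(name, round(value, 2))
```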
11.18
F(2, 18) = 3.55; F(4, 18) = 2.93
(a) Factor A Decision: Since FSTAT = 4.00 is greater than the critical bound of 3.55, reject H0. There is evidence of a difference among factor A means.
(b) Factor B Decision: Since FSTAT = 3.67 is greater than the critical bound of 3.55, reject H0. There is evidence of a difference among factor B means.
(c) Interaction, AB Decision: Since FSTAT = 0.67 is less than the critical bound of 2.93, do not reject H0. There is insufficient evidence to conclude there is an interaction effect.
11.19
(a) r = 3, c = 4, n′ = 12
Source            df               SS                       MS               F
Factor A          3 – 1 = 2        18                       18 ÷ 2 = 9       9 ÷ 0.45 = 19.8
Factor B          4 – 1 = 3        64                       64 ÷ 3 = 21.33   21.33 ÷ 0.45 = 46.93
Interaction, AB   2(3) = 6         150 – 18 – 64 – 60 = 8   8 ÷ 6 = 1.33     1.33 ÷ 0.45 = 2.93
Error, E          3(4)(11) = 132   60                       60 ÷ 132 = 0.45
Total, T          143              150
The F values are computed with the unrounded MSE = 60 ÷ 132 = 0.4545.
The table does not have 132 degrees of freedom in the denominator, so use the next larger critical value, with 120 degrees of freedom.
(b) F(2, 120) = 3.07
Factor A Decision: Since FSTAT = 19.8 is greater than the critical bound of 3.07, reject H0. There is evidence of a difference among factor A means.
(c) F(3, 120) = 2.68
Factor B Decision: Since FSTAT = 46.93 is greater than the critical bound of 2.68, reject H0. There is evidence of a difference among factor B means.
(d) F(6, 120) = 2.17
Interaction, AB Decision: Since FSTAT = 2.93 is greater than the critical bound of 2.17, reject H0. There is evidence to conclude there is an interaction effect.
11.20
Source            df          SS                           MS             F
Factor A          2           2 × 80 = 160                 80             80 ÷ 5 = 16
Factor B          8 ÷ 2 = 4   220                          220 ÷ 4 = 55   11
Interaction, AB   8           8 × 10 = 80                  10             10 ÷ 5 = 2
Error, E          30          30 × 5 = 150                 55 ÷ 11 = 5
Total, T          44          160 + 220 + 80 + 150 = 610
11.21
F(2, 30) = 3.32; F(4, 30) = 2.69; F(8, 30) = 2.27
(a) Decision: Since FSTAT = 16 is greater than the critical bound of 3.32, reject H0. There is evidence of a difference among factor A means.
(b) Decision: Since FSTAT = 11 is greater than the critical bound of 2.69, reject H0. There is evidence of a difference among factor B means.
(c) Decision: Since FSTAT = 2 is less than the critical bound of 2.27, do not reject H0. There is insufficient evidence to conclude there is an interaction effect.
11.22
Two-way ANOVA output:
(a) H0: There is no interaction between die temperature and die diameter.
H1: There is an interaction between die temperature and die diameter.
Decision: Since FSTAT = 3.4032 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is any interaction between die temperature and die diameter.
(b) H0: μ145.. = μ155..   H1: μ145.. ≠ μ155..
Decision: Since FSTAT = 1.85 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect due to die temperature.
(c) H0: μ.3mm. = μ.4mm.   H1: μ.3mm. ≠ μ.4mm.
Decision: Since FSTAT = 9.45 > 4.3512, reject H0. There is sufficient evidence to conclude that there is an effect due to die diameter.
(d)
(e) At the 5% level of significance, die diameter has an effect on the density while the die temperature does not have any impact on the density. There is no significant interaction between die diameter and die temperature.
11.23
Excel Two-way ANOVA output:
ANOVA
Source of Variation   SS        df   MS       F        P-value   F crit
Sample                2.7812     1   2.7812   1.2927   0.2690    4.3512
Columns               1.8760     1   1.8760   0.8720   0.3615    4.3512
Interaction           0.7526     1   0.7526   0.3498   0.5608    4.3512
Within                43.0284   20   2.1514
Total                 48.4382   23
Level of significance 0.05
(a) H0: There is no interaction between die temperature and die diameter.
H1: There is an interaction between die temperature and die diameter.
Decision: Since FSTAT = 0.35 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is any interaction between die temperature and die diameter.
(b) H0: μ145.. = μ155..   H1: μ145.. ≠ μ155..
Decision: Since FSTAT = 1.29 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect due to die temperature.
(c) H0: μ.3mm. = μ.4mm.   H1: μ.3mm. ≠ μ.4mm.
Decision: Since FSTAT = 0.87 < 4.3512, do not reject H0. There is insufficient evidence to conclude that there is an effect due to die diameter.
(d)
(e) At the 5% level of significance, neither die diameter nor die temperature has a significant effect on the foam diameter. There is no significant interaction between die diameter and die temperature.
11.24
(a) H0: There is no interaction between filling time and mold temperature. H1: There is an interaction between filling time and mold temperature.
Because FSTAT = 0.1136/0.05 = 2.27 < 2.9277 or the p-value = 0.102 > 0.05, do not reject H0. There is insufficient evidence of interaction between filling time and mold temperature.
(b) FSTAT = 9.02 > 3.5546, reject H0. There is evidence of a difference in the warpage due to the filling time.
(c) FSTAT = 4.23 > 3.5546, reject H0. There is evidence of a difference in the warpage due to the mold temperature.
(d)
(e) The Tukey procedure revealed that the 3 sec filling time group was significantly higher than the 2 sec and 1 sec groups, which were not significantly different from each other. The Tukey procedure also revealed that the 60°C group was different from the 85°C group.
(f) Both filling time and temperature had main effects on warpage. Although the interaction plot appeared to show an interaction between filling time and temperature, it was not significant at the 0.05 significance level. The warpage for a three-second filling time seems to be much higher at 60°C and 72.5°C but not at 85°C. Caution should be used in interpreting this non-significant interaction due to relatively small sample sizes.
11.25
(a) At the 0.05 significance level, there is insufficient evidence of an interaction between breakoff pressure and stopper height. Because FSTAT = 0.23 or p-value = 0.640, do not reject H0.
(b) At the 0.05 significance level, there is no evidence of an effect of breakoff pressure. Because FSTAT = 1.56 or p-value = 0.235, do not reject H0.
(c) At the 0.05 significance level, there is no evidence of an effect of stopper height. Because FSTAT = 1.56 or p-value = 0.235, do not reject H0.
(d) The results from (a) through (d) indicate that neither stopper height nor breakoff pressure had an effect on the percentage of breakoff chips. There was no interaction between breakoff pressure and stopper height. The mean breakoff percentage for both the two and three breakoff pressure categories was higher for the twenty stopper height. The interaction plot revealed that the lines representing the means for the two and three breakoff pressures were parallel across the two heights. An interaction between the two variables would have been reflected by non-parallel lines. This did not occur. The interaction plot confirmed the FSTAT = 0.23 for the interaction between breakoff pressure and height.
11.26
(a) FSTAT = 0.83, p-value = 0.38 > 0.05, do not reject H0. There is not enough evidence to conclude that there is an interaction between zone 1 lower and zone 3 upper.
(b) FSTAT = 0.38, p-value = 0.548 > 0.05, do not reject H0. There is insufficient evidence to conclude that there is an effect due to zone 1 lower.
(c) FSTAT = 0.10, p-value = 0.752 > 0.05, do not reject H0. There is inadequate evidence to conclude that there is an effect due to zone 3 upper.
(d)
(e) The cell means plot shows a large difference at a zone 3 upper of 695°C but only a small difference at a zone 3 upper of 715°C. Although this difference appeared on the cell means plot, the interaction was not statistically significant because of the large MSE. Caution should be used in interpreting these findings due to small sample sizes. Further testing should be done with larger sample sizes.
11.27
The among-groups variance MSA represents variation among the means of the different groups. The within-groups variance MSW measures variation within each group.
11.28
The completely randomized design evaluates one factor of interest, in which sample observations are randomly and independently drawn. The randomized block design also evaluates one factor of interest, but sample observations are divided into blocks according to common characteristics to reduce within group variation. The two-factor factorial design evaluates two factors of interest and the interaction between these two factors.
11.29
The major assumptions of ANOVA are randomness and independence, normality, and homogeneity of variance.
11.30
If the populations are approximately normally distributed and the variances of the groups are approximately equal, you select the one-way ANOVA F test to examine possible differences among the means of c independent populations.
11.31
When the ANOVA has indicated that at least one of the groups has a different population mean than the others, you should use multiple comparison procedures for evaluating pairwise combinations of the group means. In such cases, the Tukey-Kramer procedure should be used to compare all pairs of means.
11.32
The one-way ANOVA F test for a completely randomized design is used to test for the existence of treatment effect of the treatment variable on the mean level of the dependent variable, while the Levene test is used to test whether the amounts of variation of the dependent variable are the same across the different categories of the treatment variable.
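The Levene test described in 11.32 can be sketched as a one-way ANOVA on absolute deviations. This is a minimal illustration that assumes the median-based version of the test (absolute deviations from each group median); the function name `levene_f` and both small data sets are hypothetical and are not taken from any problem in this chapter.

```python
from statistics import mean, median


def levene_f(groups):
    """F statistic of a one-way ANOVA on absolute deviations from each group median."""
    abs_dev = [[abs(x - median(g)) for x in g] for g in groups]
    n = sum(len(g) for g in abs_dev)
    c = len(abs_dev)
    grand = sum(sum(g) for g in abs_dev) / n
    # Among-group and within-group sums of squares of the absolute deviations
    ssa = sum(len(g) * (mean(g) - grand) ** 2 for g in abs_dev)
    ssw = sum((x - mean(g)) ** 2 for g in abs_dev for x in g)
    return (ssa / (c - 1)) / (ssw / (n - c))


# Hypothetical data: same spread in every group, very different means
same_spread = [[1, 2, 3], [11, 12, 13], [21, 22, 23]]
# Hypothetical data: spread grows from group to group
growing_spread = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]

f_same = levene_f(same_spread)
f_growing = levene_f(growing_spread)
print(f_same, f_growing)
```

The first data set shows the distinction drawn in 11.32: the group means differ greatly, yet the Levene statistic is zero because the variation within each group is identical.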
11.33
You should use the two-way ANOVA F test to examine possible differences among the means of each factor in a factorial design when there are two factors of interest that are to be studied and more than one observation can be obtained for each treatment combination (to measure the interaction of the two factors).
11.34
Interaction measures the difference in the effect of one variable for the different levels of the second factor. If there is no interaction, any difference between levels of one factor will be the same at each level of the second factor.
11.35
You can obtain the interaction effect and carry out an F test for its significance. In addition, you can develop a plot of the response for each level of one factor at each level of a second factor.
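The idea in 11.34 and 11.35 can be illustrated numerically: with no interaction, the difference between two levels of one factor is the same at every level of the other factor, so the lines on a cell means plot are parallel. The sketch below uses the cell averages reported in the 11.43 summary output; the function name `level_differences` is hypothetical.

```python
def level_differences(cell_means, level_a, level_b):
    """Difference between two levels of one factor at each level of the other factor."""
    return [b - a for a, b in zip(cell_means[level_a], cell_means[level_b])]


# Cell averages from 11.43: columns are Below, At, and Above Eye Level
cell_means = {
    "5 feet": [22.0, 23.5, 20.5],
    "6 feet": [27.0, 23.5, 26.0],
    "7 feet": [32.0, 31.5, 33.5],
}

diffs = level_differences(cell_means, "5 feet", "6 feet")
spread = max(diffs) - min(diffs)     # 0 would mean perfectly parallel lines
print(diffs, spread)
```

A nonzero spread only describes the sample cell means. Whether it reflects a real interaction is decided by the F test for interaction, which in 11.43 does not reject H0.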
11.36
(a)
H0: There is no interaction H1: There is an interaction Decision: Since FSTAT = 0.01 < 2.9011, do not reject H0. There is not enough evidence to conclude that there is an interaction between supplier and loom.
(b) H0: μjetta.. = μturk..   H1: At least one mean differs.
Decision: Since FSTAT = 0.81 < 4.1491, do not reject H0. There is insufficient evidence to conclude that there is an effect due to loom.
(c) H0: μ.1. = μ.2. = μ.3. = μ.4.   H1: At least one mean differs.
Decision: Since FSTAT = 5.20 > 2.9011, reject H0. There is adequate evidence to conclude that there is an effect due to suppliers.
(d) [Cell Means Plot: one line for each loom (jetta, turk) across Supplier 1 through Supplier 4; vertical axis from 0 to 30]
(e)
Output of the Tukey Procedure: For different suppliers, Q = 3.84 with numerator d.f. = 4 and denominator d.f. = 32.
Tukey-Kramer Multiple Comparisons
Group           Sample Mean   Sample Size
1: Supplier 1   18.97         10
2: Supplier 2   23.9          10
3: Supplier 3   22.41         10
4: Supplier 4   20.83         10
Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        32
MSE                     8.61225
Q Statistic             3.84
Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   4.93                  0.92802209                 3.5636           Means are different
Group 1 to Group 3   3.44                  0.92802209                 3.5636           Means are not different
Group 1 to Group 4   1.86                  0.92802209                 3.5636           Means are not different
Group 2 to Group 3   1.49                  0.92802209                 3.5636           Means are not different
Group 2 to Group 4   3.07                  0.92802209                 3.5636           Means are not different
Group 3 to Group 4   1.58                  0.92802209                 3.5636           Means are not different
There is a difference in mean strength between supplier 1 and supplier 2 only.
(f) H0: μ1 = μ2 = μ3 = μ4   H1: At least one mean differs.
Decision: Since FSTAT = 5.70 > 2.8663, reject H0. There is adequate evidence to conclude that there is an effect due to suppliers.
Output of Tukey-Kramer Procedure:
Tukey-Kramer Multiple Comparisons
Group           Sample Mean   Sample Size
1: Supplier 1   18.97         10
2: Supplier 2   23.9          10
3: Supplier 3   22.41         10
4: Supplier 4   20.83         10
Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        36
MSW                     7.856972
Q Statistic             3.79
Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   4.93                  0.88639564                 3.3594           Means are different
Group 1 to Group 3   3.44                  0.88639564                 3.3594           Means are different
Group 1 to Group 4   1.86                  0.88639564                 3.3594           Means are not different
Group 2 to Group 3   1.49                  0.88639564                 3.3594           Means are not different
Group 2 to Group 4   3.07                  0.88639564                 3.3594           Means are not different
Group 3 to Group 4   1.58                  0.88639564                 3.3594           Means are not different
The result is consistent with that in (b) and (e), except that the Tukey-Kramer procedure in the one-way ANOVA concludes that there is a difference in mean strength not only between supplier 1 and supplier 2 but also between supplier 1 and supplier 3.
11.37
(a) H0: There is no interaction. H1: There is an interaction.
Decision: Since FSTAT = 23.79 > 4.3512, reject H0. There is enough evidence to conclude that there is an interaction between machine type and reduction angle.
(b) Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to machine type.
(c) Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to reduction angle.
(d) [Cell Means Plot: one line for each machine type (W95, W96) across the Narrow and Wide reduction angles; vertical axis from 91.5 to 94.5]
(e) There is enough evidence to conclude that there is an interaction between machine type and reduction angle. Since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to machine type. Likewise, since there is an interaction between machine type and reduction angle, it is inappropriate to test whether there is an effect due to reduction angle.
(f) H0: μnarrow = μwide   H1: At least one mean differs.
Decision: Since FSTAT = 16.70 > 4.3009, reject H0. There is adequate evidence to conclude that there is an effect due to reduction angle. You conclude that there is adequate evidence of an effect due to reduction angle here while in (c) and (e), it is inappropriate to test whether there is an effect due to reduction angle since there is an interaction between machine type and reduction angle.
11.38
(a) To test the homogeneity of variance, you perform a Levene’s Test.
H0: σ1² = σ2² = σ3²
H1: Not all σj² are the same
Excel output:
ANOVA
Source of Variation   SS     df   MS         F         P-value    F crit
Between Groups        0.07    2   0.035      0.07468   0.928383   3.682317
Within Groups         7.03   15   0.468667
Total                 7.1    17
Since the p-value = 0.928 > 0.05, do not reject H0. There is not enough evidence of a significant difference in the variances of the breaking strengths for the three air-jet pressures.
(b) H0: μ1 = μ2 = μ3   H1: At least one of the means differs.
Decision rule: If FSTAT > 3.68, reject H0.
Decision: Since FSTAT = 4.09 is greater than the critical bound of 3.68, reject H0. There is enough evidence to conclude that the mean breaking strengths differ for the three air-jet pressures.
(c) Breaking strength scores under 30 psi are significantly higher than those under 50 psi. No other pairwise comparisons were significant at the 0.05 significance level.
(d) Other things being equal, use 30 psi.
11.39
(a) H0: There is no interaction between side-to-side aspect and air-jet pressure. H1: There is an interaction between side-to-side aspect and air-jet pressure.
Decision rule: If FSTAT > 3.89, reject H0.
Test statistic: FSTAT = 1.9719
Decision: Since FSTAT = 1.97 is less than the critical bound of 3.89, do not reject H0. There is insufficient evidence to conclude there is an interaction between side-to-side aspect and air-jet pressure.
(b) H0: μ1 = μ2   H1: μ1 ≠ μ2
Decision rule: If FSTAT > 4.75, reject H0.
Test statistic: FSTAT = 4.87
Decision: Since FSTAT = 4.87 is greater than the critical bound of 4.75, reject H0. There is sufficient evidence to conclude that mean breaking strength does differ between the two levels of side-to-side aspect.
(c) H0: μ1 = μ2 = μ3   H1: At least one of the means differs.
Decision rule: If FSTAT > 3.89, reject H0.
Test statistic: FSTAT = 5.67
Decision: Since FSTAT = 5.67 is greater than the critical bound of 3.89, reject H0. There is enough evidence to conclude that the mean breaking strengths differ for the three air-jet pressures.
(d)
(e)
(f) (g)
Mean breaking strengths under 30 psi are higher than those under 40 psi or 50 psi. The mean breaking strength is highest under 30 psi. The two-factor experiment gave a more complete, refined set of results than the one-factor experiment. Not only was the side-to-side aspect factor significant, but the application of the Tukey procedure on the air-jet pressure factor determined that breaking strength scores are highest under 30 psi.
11.40
ANOVA
Source of Variation   SS         df   MS         F         P-value    F crit
Sample                112.5603    1   112.5603   30.4434   3.07E-06   4.113165
Columns               46.01025    1   46.01025   12.4441   0.001165   4.113165
Interaction           0.70225     1   0.70225    0.1899    0.665575   4.113165
Within                133.105    36   3.697361
Total                 292.3778   39
(a) H0: There is no interaction between type of breakfast and desired time. H1: There is an interaction between type of breakfast and desired time.
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 0.1899.
Decision: Since FSTAT = 0.1899 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is any interaction between type of breakfast and desired time.
(b) H0: μ1 = μ2   H1: μ1 ≠ μ2   Population 1 = Continental, 2 = American
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 30.4434.
Decision: Since FSTAT = 30.4434 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to type of breakfast.
(c) H0: μ1 = μ2   H1: μ1 ≠ μ2   Population 1 = Early, 2 = Late
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 12.4441.
Decision: Since FSTAT = 12.4441 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to desired time.
(d) [Plot: Mean Delivery Time Difference, Time (minutes) by Desired Time Period (Early Time Period, Late Time Period), one line for each type of breakfast (Continental, American)]
(e) At the 5% level of significance, both the type of breakfast ordered and the desired time have an effect on delivery time difference. There is no interaction between the type of breakfast ordered and the desired time.
11.41
Two-way ANOVA output from Excel:
ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Sample                55.46025    1   55.46025   14.99995   0.000436   4.113165
Columns               13.11025    1   13.11025   3.54584    0.067795   4.113165
Interaction           5.40225     1   5.40225    1.46111    0.234633   4.113165
Within                133.105    36   3.697361
Total                 207.0778   39
(a) H0: There is no interaction between type of breakfast and desired time. H1: There is an interaction between type of breakfast and desired time.
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 1.4611.
Decision: Since FSTAT = 1.4611 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is any interaction between type of breakfast and desired time.
(b) H0: μ1 = μ2   H1: μ1 ≠ μ2   Population 1 = Continental, 2 = American
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 15
Decision: Since FSTAT = 15 is greater than the critical bound of 4.1132, reject H0. There is sufficient evidence to conclude that there is an effect that is due to type of breakfast.
(c) H0: μ1 = μ2   H1: μ1 ≠ μ2   Population 1 = Early, 2 = Late
Decision rule: If FSTAT > 4.1132, reject H0. Test statistic: FSTAT = 3.5458.
Decision: Since FSTAT = 3.5458 is less than the critical bound of 4.1132, do not reject H0. There is insufficient evidence to conclude that there is an effect that is due to desired time.
(d)
(e) At the 5% level of significance, only the type of breakfast ordered has an effect on delivery time difference. There is no interaction between the type of breakfast ordered and the desired time.
11.42
H0: There is no interaction between the size of the pieces and the can fill height. H1: There is an interaction between the size of the pieces and the can fill height.
Decision rule: If p-value < 0.05, reject H0.
Test statistic: FSTAT = 0.2169
Decision: Since p-value is 0.6428, do not reject H0. There is not sufficient evidence to conclude there is an interaction between the size of the pieces and the can fill height.
Both piece size and fill height had significant main effects. The FSTAT was much higher for piece size, which suggests that this factor may be of more importance relative to fill height. Filling cans with fine cut size led to more accuracy in relation to the weight of the can compared to the label weight. Filling cans with fine cut size and lower fill height led to the most accuracy in relation to the weight of the can compared to the label weight. Based on these results, one would recommend that the pet food company use the fine cut size and the lower fill height.
A one-way ANOVA on fill height produced FSTAT = 18.44. Because FSTAT = 18.44 or p-value = 0.000, reject H0.
A one-way ANOVA on cut size produced FSTAT = 223.98. Because FSTAT = 223.98 or p-value = 0.000, reject H0.
11.43
H0: There is no interaction between the linear widths and height placement.
H1: There is an interaction between the linear widths and height placement.
PHStat Two-Way ANOVA with replication
SUMMARY      Below Eye Level   At Eye Level   Above Eye Level   Total
5 feet
Count        2                 2              2                 6
Sum          44                47             41                132
Average      22                23.5           20.5              22
Variance     2.0000            12.5000        4.5000            5.6000
6 feet
Count        2                 2              2                 6
Sum          54                47             52                153
Average      27                23.5           26                25.5
Variance     2.0000            0.5000         2.0000            3.5000
7 feet
Count        2                 2              2                 6
Sum          64                63             67                194
Average      32                31.5           33.5              32.33333333
Variance     8.0000            4.5000         4.5000            4.2667
Total
Count        6                 6              6
Sum          162               157            160
Average      27                26.16666667    26.66666667
Variance     22.4000           20.5667        36.2667
PHStat Two-way ANOVA output:
ANOVA
Source of Variation   SS         df   MS         F         P-value   F crit
Sample                331.4444    2   165.7222   36.8272   0.0000    4.2565
Columns               2.1111      2   1.0556     0.2346    0.7956    4.2565
Interaction           24.2222     4   6.0556     1.3457    0.3255    3.6331
Within                40.5000     9   4.5000
Total                 398.2778   17
Level of significance 0.05
Sample (linear width): Decision rule: If FSTAT > 4.2565, reject H0. Test statistic: FSTAT = 36.8272. Decision: Since FSTAT = 36.8272 is greater than the critical bound of 4.2565, reject H0.
Columns (height placement): Decision rule: If FSTAT > 4.2565, reject H0. Test statistic: FSTAT = 0.2346. Decision: Since FSTAT = 0.2346 is less than the critical bound of 4.2565, do not reject H0.
Interaction: Decision rule: If FSTAT > 3.6331, reject H0. Test statistic: FSTAT = 1.3457. Decision: Since FSTAT = 1.3457 is less than the critical bound of 3.6331, do not reject H0.
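The sums of squares in the 11.43 ANOVA table can be recovered from the SUMMARY output alone, because every cell has the same number of replicates. The following is only a sketch under that balanced-design assumption, not a description of how PHStat computes its output; the function name `two_way_from_cells` is hypothetical.

```python
def two_way_from_cells(cell_sums, cell_vars, n_rep):
    """Two-way ANOVA quantities from cell sums and cell sample variances (balanced design)."""
    r, c = len(cell_sums), len(cell_sums[0])
    n = r * c * n_rep
    grand_total = sum(sum(row) for row in cell_sums)
    correction = grand_total ** 2 / n
    row_totals = [sum(row) for row in cell_sums]
    col_totals = [sum(col) for col in zip(*cell_sums)]
    ssa = sum(t ** 2 for t in row_totals) / (c * n_rep) - correction
    ssb = sum(t ** 2 for t in col_totals) / (r * n_rep) - correction
    ss_cells = sum(x ** 2 for row in cell_sums for x in row) / n_rep - correction
    ssab = ss_cells - ssa - ssb
    # Each cell variance carries n_rep - 1 degrees of freedom
    sse = (n_rep - 1) * sum(v for row in cell_vars for v in row)
    mse = sse / (r * c * (n_rep - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab, "SSE": sse,
        "F_A": (ssa / (r - 1)) / mse,
        "F_B": (ssb / (c - 1)) / mse,
        "F_AB": (ssab / ((r - 1) * (c - 1))) / mse,
    }


# Rows: 5, 6, 7 feet; columns: Below, At, Above Eye Level (from the SUMMARY output)
cell_sums = [[44, 47, 41], [54, 47, 52], [64, 63, 67]]
cell_vars = [[2.0, 12.5, 4.5], [2.0, 0.5, 2.0], [8.0, 4.5, 4.5]]
anova = two_way_from_cells(cell_sums, cell_vars, n_rep=2)
for name, value in anova.items():
    print(name, round(value, 4))
```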
Chapter 12
12.1
(a) For df = 1 and α = 0.01, χ² = 6.635.
(b) For df = 1 and α = 0.005, χ² = 7.879.
(c) For df = 1 and α = 0.10, χ² = 2.706.
12.2
(a) For df = 1 and α = 0.05, χ² = 3.841.
(b) For df = 1 and α = 0.025, χ² = 5.024.
(c) For df = 1 and α = 0.01, χ² = 6.635.
12.3
(a)–(b)
p̄ = (X1 + X2)/(n1 + n2) = (20 + 40)/(50 + 85) = 60/135 = 4/9 = 0.4444
1A: p̄ = 0.4444 and n1 = 50, so fe = 22.22
1B: p̄ = 0.4444 and n2 = 85, so fe = 37.78
2A: 1 – p̄ = 0.5556 and n1 = 50, so fe = 27.78
2B: 1 – p̄ = 0.5556 and n2 = 85, so fe = 47.22
          Col A                         Col B                         Total
          Observed Freq  Expected Freq  Observed Freq  Expected Freq
Row 1     20             22.22          40             37.78          60
Row 2     30             27.78          45             47.22          75
Total     50                            85                            135
(c)
χ²STAT = Σ (fo – fe)²/fe, summed over all cells
       = (20 – 22.22)²/22.22 + (40 – 37.78)²/37.78 + (30 – 27.78)²/27.78 + (45 – 47.22)²/47.22
       = 0.634
Since χ²STAT = 0.634 < 3.841, it is not significant at the 5% level of significance.
12.4
(a)
p̄ = (X1 + X2)/(n1 + n2) = (20 + 30)/(50 + 50) = 1/2 = 0.5
          Col 1                                           Col 2                                           Total
          Observed Freq  Expected Freq  chi-sq contrib    Observed Freq  Expected Freq  chi-sq contrib
Row 1     20             25             1.00              30             25             1.00              50
Row 2     30             25             1.00              20             25             1.00              50
Total     50                                              50                                              100
(b)
Decision rule: If χ² > 3.841, reject H0.
Test statistic: χ²STAT = Σ (fo – fe)²/fe, summed over all cells, = 1.00 + 1.00 + 1.00 + 1.00 = 4
Decision: Since χ²STAT = 4 is greater than the critical value of 3.841, it is significant at the 5% level of significance.
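The expected frequencies and the test statistic in 12.3 and 12.4 follow the same recipe: each expected frequency is (row total × column total)/grand total, and the statistic sums (fo – fe)²/fe over all cells. A minimal sketch, with a hypothetical function name `chi_square_stat`:

```python
def chi_square_stat(observed):
    """Chi-square statistic for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / n   # expected frequency for cell (i, j)
            stat += (f_o - f_e) ** 2 / f_e
    return stat


chi2_12_3 = chi_square_stat([[20, 40], [30, 45]])   # Problem 12.3
chi2_12_4 = chi_square_stat([[20, 30], [30, 20]])   # Problem 12.4
print(round(chi2_12_3, 3), round(chi2_12_4, 3))
```

For 12.3 the unrounded expected frequencies give a statistic slightly different from 0.634 in the third decimal place, since the worked solution rounds each expected frequency to two decimals. The decision is unaffected.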
12.5
(a) Chi-Square Test
Observed Frequencies
               Subscribers
Churn          Basic      Premium    Total
Yes            883        873        1756
No             1858       1903       3761
Total          2741       2776       5517
Expected Frequencies
               Subscribers
Churn          Basic      Premium    Total
Yes            872.4299   883.5701   1756
No             1868.57    1892.43    3761
Total          2741       2776       5517
Data
Level of Significance      0.05
Number of Rows             2
Number of Columns          2
Degrees of Freedom         1
Results
Critical Value             3.841459
Chi-Square Test Statistic  0.373342
p-Value                    0.541188
Do not reject the null hypothesis
H0: π1 = π2   H1: π1 ≠ π2
The χ²STAT = 0.373342 < 3.841459 for α = 0.05. Do not reject H0.
(b) Chi-Square Test
Observed Frequencies
               Subscriber
Churn          Basic      Premium    Total
Yes            16         15         31
No             34         35         69
Total          50         50         100
Expected Frequencies
               Subscriber
Churn          Basic      Premium    Total
Yes            15.5       15.5       31
No             34.5       34.5       69
Total          50         50         100
Data
Level of Significance      0.05
Number of Rows             2
Number of Columns          2
Degrees of Freedom         1
Results
Critical Value             3.841459
Chi-Square Test Statistic  0.046751
p-Value                    0.828817
Do not reject the null hypothesis
H0: π1 = π2   H1: π1 ≠ π2
The χ²STAT = 0.046751 < 3.841459 for α = 0.05. Do not reject H0.
(c)
The results in parts (a) and (b) are the same: do not reject H0. The p-values are different. A p-value of 0.541188 indicates that the probability of obtaining a $\chi^2_{STAT}$ of 0.373342 or larger is 54.1188% when the null hypothesis is true. A p-value of 0.828817 indicates that the probability of obtaining a $\chi^2_{STAT}$ of 0.046751 or larger is 82.8817% when the null hypothesis is true.
(d)
The results of (a) and (b) are exactly the same as those of Problem 10.29. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.29 (a) satisfy the relationship $\chi^2_{STAT} = 0.3733 = Z^2 = (0.6110)^2$, and the p-value in (b) is exactly the same as the p-value in Problem 10.29 (b), 0.8288.
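The relationship between the chi-square statistic and the pooled Z statistic for two proportions can be verified numerically. The sketch below assumes the standard pooled-proportion Z test and uses the counts from 12.5 (a); the function names are hypothetical, and the p-value helper is valid only for 1 degree of freedom.

```python
import math


def z_two_proportions(x1, n1, x2, n2):
    """Pooled Z statistic for the difference between two proportions."""
    p_bar = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_bar * (1 - p_bar) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / std_error


def chi_square_2x2(x1, n1, x2, n2):
    """Chi-square statistic for the same 2 x 2 table of successes and failures."""
    p_bar = (x1 + x2) / (n1 + n2)
    observed = [x1, x2, n1 - x1, n2 - x2]
    expected = [n1 * p_bar, n2 * p_bar, n1 * (1 - p_bar), n2 * (1 - p_bar)]
    return sum((f_o - f_e) ** 2 / f_e for f_o, f_e in zip(observed, expected))


def p_value_df1(chi_square):
    """Upper-tail area of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2))


# Problem 12.5 (a): churned subscribers out of Basic and Premium totals.
z = z_two_proportions(883, 2741, 873, 2776)
chi_square = chi_square_2x2(883, 2741, 873, 2776)
print(z, z ** 2, chi_square, p_value_df1(chi_square))
```

Because a chi-square variable with 1 degree of freedom is the square of a standard normal variable, the two-tail Z test and the chi-square test give the same p-value.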
12.6
(a)
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
Chi-Square Test

Observed Frequencies
                      Purchases
Consumed Coffee     Yes      No     Total
Caffeinated          40      60       100
Decaffeinated        10      90       100
Total                50     150       200

Expected Frequencies
                      Purchases
Consumed Coffee     Yes      No     Total
Caffeinated          25      75       100
Decaffeinated        25      75       100
Total                50     150       200

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    24
p-Value                      9.63E-07
Reject the null hypothesis
(b)
Decision rule: df = 1. If $\chi^2_{STAT} > 3.841$ or p-value < 0.05, reject H0.
Test statistic: $\chi^2_{STAT} = 24$
Decision: Since $\chi^2_{STAT} = 24$ is larger than the upper critical bound of 3.841, reject H0. There is evidence to conclude that the population proportion of those who drink caffeinated coffee is different from those who do not drink caffeinated coffee.
(c)
The p-value is 9.63E-07, which is 0.0000 to four decimal places. The probability of obtaining a test statistic of 24.0 or larger when the null hypothesis is true is essentially 0.
(d)
You should not compare the results in (b) to those of Problem 10.30 because Problem 10.30 was a one-tail test.
12.7
From PHStat Chi-Square Test

Observed Frequencies
                 Workers
Training     U.S.     Canada     Total
Yes         463.5        350     813.5
No          566.5        680    1246.5
Total        1030       1030      2060

Expected Frequencies
                 Workers
Training     U.S.     Canada     Total
Yes        406.75     406.75     813.5
No         623.25     623.25    1246.5
Total        1030       1030      2060

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    26.17032
p-Value                      3.13E-07
Reject the null hypothesis
(a)
At the 0.05 significance level, there is evidence of a difference in the proportion of U.S. and Canadian workers whose organization provides explicit training for all people managers. Because $\chi^2_{STAT} = 26.17032 > 3.841$ or p-value = 0.0000 < 0.05, reject H0.
(b)
The p-value of 0.0000 is below the 0.05 significance level, so you reject the H0 that the proportions of U.S. and Canadian workers whose organization provides explicit training for all people managers are equal. A p-value of 0.0000 (3.13E-07) implies that the probability of obtaining a $\chi^2_{STAT}$ of 26.17032 or larger is essentially 0 when H0 is true.
(c)
The results of (a) and (b) are exactly the same as those of Problem 10.31. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.31 (a) satisfy the relationship $\chi^2_{STAT} = 26.17032 = Z^2 = (5.1157)^2$, and the p-value is exactly the same as the p-value in Problem 10.31, 0.0000.
12.8
(a)
H0: $\pi_1 = \pi_2$   H1: $\pi_1 \ne \pi_2$
Chi-Square Test

Observed Frequencies
                         U.S. Organizations
Providing Benefits       HR      Workers     Total
Effective              1112          302      1414
Not Effective           625          340       965
Total                  1737          642      2379

Expected Frequencies
                         U.S. Organizations
Providing Benefits       HR      Workers     Total
Effective          1032.416     381.5839      1414
Not Effective      704.5839     260.4161       965
Total                  1737          642      2379

Data
Level of Significance        0.01
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               6.634897
Chi-Square Test Statistic    56.04305
p-Value                      7.09E-14
Reject the null hypothesis

Because $\chi^2_{STAT} = 56.0431 > 6.635$, reject H0. There is evidence of a difference between HR and workers in the proportion that rated their organizations as effective in providing affordable and comprehensive healthcare benefits.
(b)
The p-value = 0.0000. The probability of obtaining a difference in proportions that gives rise to a test statistic above 56.0431 is 0.0000 if there is no difference in the proportion in the two groups.
(c) and (d)
The results of (a) and (b) are exactly the same as those of Problem 10.32. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.32 (a) satisfy the relationship $\chi^2_{STAT} = 56.0431 = Z^2 = (7.4862)^2$, and the p-value in (b) is exactly the same as the p-value computed in Problem 10.32 (b).
12.9
From PHStat, LinkedIn Chi-Square Test

Observed Frequencies
                   Marketers
Use LinkedIn      B2B        B2C      Total
Yes              1246       1514       2760
No                292       1343       1635
Total            1538       2857       4395

Expected Frequencies
                   Marketers
Use LinkedIn      B2B        B2C      Total
Yes           965.843   1794.157       2760
No            572.157   1062.843       1635
Total            1538       2857       4395

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    336.0363
p-Value                      4.66E-75
Reject the null hypothesis

(a)
At the 0.05 significance level, there is evidence of a significant difference in the percentage of B2B and B2C marketers that use LinkedIn. Because $\chi^2_{STAT} = 336.0363 > 3.841$ or p-value = 0.0000 < 0.05, reject H0. The results indicate that a significantly higher proportion (81%) of B2B marketers used LinkedIn than the proportion (53%) of B2C marketers who used LinkedIn.
(b)
The p-value of 0.0000 is well below the 0.05 significance level, so you reject the H0 that the percentage of B2B and B2C marketers who use LinkedIn is equal.
From PHStat, YouTube Chi-Square Test

Observed Frequencies
                   Marketers
Use YouTube       B2B        B2C      Total
Yes               877       1571       2448
No                662       1099       1761
Total            1539       2670       4209

Expected Frequencies
                   Marketers
Use YouTube       B2B        B2C      Total
Yes          895.0991   1552.901       2448
No           643.9009   1117.099       1761
Total            1539       2670       4209

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    1.378887
p-Value                      0.240291
Do not reject the null hypothesis

(c)
At the 0.05 significance level, there is insufficient evidence of a significant difference in the percentage of B2B and B2C marketers that use YouTube. Because $\chi^2_{STAT} = 1.378887 < 3.841$ or p-value = 0.240291 > 0.05, do not reject H0.
(d)
The p-value of 0.240291 is well above the 0.05 significance level, so you do not reject the H0 that the percentage of B2B and B2C marketers who use YouTube is equal.
(e)
The results of (a) and (c) are exactly the same as those of Problem 10.33. The $\chi^2_{STAT}$ in (a) and the Z in Problem 10.33 (a) satisfy the relationship $\chi^2_{STAT} = 336.0363 = Z^2 = (18.3313)^2$, and the p-value is exactly the same as the p-value in Problem 10.33, 0.0000.
12.10
(a) From PHStat Chi-Square Test

Observed Frequencies
                         Region
Prefer Open-air    Northeast     Midwest     Total
Yes                       63          44       107
No                       133         164       297
Total                    196         208       404

Expected Frequencies
                         Region
Prefer Open-air    Northeast     Midwest     Total
Yes                 51.91089    55.08911       107
No                  144.0891    152.9109       297
Total                    196         208       404

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            2
Degrees of Freedom           1

Results
Critical Value               3.841459
Chi-Square Test Statistic    6.258608
p-Value                      0.012359
Reject the null hypothesis

(b)
Because $\chi^2_{STAT} = 6.258608 > 3.841$, reject H0. There is evidence of a significant difference between the proportion of shoppers in the Northeast and Midwest who prefer open-air markets.
(c)
The p-value = 0.012359. The probability of obtaining a test statistic of 6.258608 or larger when the null hypothesis is true is 0.012359.
(d)
The confidence interval is from 0.0119 to 0.1865. The results are identical because $(2.5017)^2 = 6.2586$.
12.11
(a)
df = (r – 1)(c – 1) = (2 – 1)(4 – 1) = 3
(b)
$\chi^2 = 7.815$
(c)
$\chi^2 = 11.345$
12.12
(a)
$\bar{p} = \frac{X_1 + X_2 + X_3}{n_1 + n_2 + n_3} = \frac{10 + 30 + 50}{50 + 75 + 100} = \frac{90}{225} = 0.4$
Row 1 A: $\bar{p} = 0.4$ and $n_1 = 50$, so $f_e = 20$
Row 1 B: $\bar{p} = 0.4$ and $n_2 = 75$, so $f_e = 30$
Row 1 C: $\bar{p} = 0.4$ and $n_3 = 100$, so $f_e = 40$
Thus, the expected frequencies in the first row are 20, 30, and 40.
Row 2 A: $1 - \bar{p} = 0.6$ and $n_1 = 50$, so $f_e = 30$
Row 2 B: $1 - \bar{p} = 0.6$ and $n_2 = 75$, so $f_e = 45$
Row 2 C: $1 - \bar{p} = 0.6$ and $n_3 = 100$, so $f_e = 60$
Thus, the expected frequencies in the second row are 30, 45, and 60.
(b)
$\chi^2_{STAT} = 12.500$. The critical value with 2 degrees of freedom and α = 0.05 is 5.991. The result is deemed significant.
12.13
(a)
$\bar{p} = \frac{X_1 + X_2 + X_3}{n_1 + n_2 + n_3} = \frac{20 + 25 + 25}{50 + 50 + 50} = \frac{70}{150} = 0.4667$
The calculations for A, B, and C of Row 1 are identical.
Row 1 A: $\bar{p} = 0.4667$ and $n_1 = 50$, so $f_e = 23.3333$
Thus, the expected frequencies in the first row are 23.3333, 23.3333, and 23.3333.
The calculations for A, B, and C of Row 2 are identical.
Row 2 A: $1 - \bar{p} = 0.5333$ and $n_1 = 50$, so $f_e = 26.6667$
Thus, the expected frequencies in the second row are 26.6667, 26.6667, and 26.6667.
(b)
$\chi^2_{STAT} = 1.339286$. The critical value with 2 degrees of freedom and α = 0.05 is 5.991.
The result is not deemed significant.
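The expected frequencies and test statistics in 12.12 and 12.13 follow from the pooled proportion. The sketch below is one way to reproduce them, under the assumption that the first-row counts in 12.12 are 10, 30, and 50 (they sum to the 90 shown in the solution); the function names are illustrative, and the closed-form p-value applies only when there are 2 degrees of freedom.

```python
import math


def expected_from_pooled(successes, sizes):
    """Pooled proportion and the two rows of expected frequencies."""
    p_bar = sum(successes) / sum(sizes)
    row_1 = [p_bar * n for n in sizes]
    row_2 = [(1 - p_bar) * n for n in sizes]
    return p_bar, row_1, row_2


def chi_square_2xc(successes, sizes):
    """Chi-square statistic for a 2 x c table of successes and failures."""
    _, row_1, row_2 = expected_from_pooled(successes, sizes)
    stat = 0.0
    for x, n, e_1, e_2 in zip(successes, sizes, row_1, row_2):
        stat += (x - e_1) ** 2 / e_1          # first-row cell
        stat += ((n - x) - e_2) ** 2 / e_2    # second-row cell
    return stat


# Problem 12.12: three groups, so df = (2 - 1)(3 - 1) = 2.
stat_12_12 = chi_square_2xc([10, 30, 50], [50, 75, 100])
# Problem 12.13: equal group sizes of 50.
stat_12_13 = chi_square_2xc([20, 25, 25], [50, 50, 50])

# For df = 2 only, the upper-tail area is exp(-x / 2).
print(stat_12_12, math.exp(-stat_12_12 / 2))
print(stat_12_13, math.exp(-stat_12_13 / 2))
```

Both statistics are compared with the same critical value, 5.991, because both tables have 2 degrees of freedom.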
12.14
(a) Chi-Square Test

Observed Frequencies
                            Column variable
Owns Smartphone    18-29    30-49    50-64      65+    Total
Yes                  192      190      166      122      670
No                     8       10       34       78      130
Total                200      200      200      200      800

Expected Frequencies
                            Column variable
Owns Smartphone    18-29    30-49    50-64      65+    Total
Yes                167.5    167.5    167.5    167.5      670
No                  32.5     32.5     32.5     32.5      130
Total                200      200      200      200      800

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            4
Degrees of Freedom           3

Results
Critical Value               7.814728
Chi-Square Test Statistic    116.7945
p-Value                      3.78E-25
Reject the null hypothesis

Because the calculated test statistic 116.7945 is greater than the critical value of 7.8147, you reject H0 and conclude that there is evidence of a difference among the age groups in the proportion of smartphone owners.
(b)
p-value = 0.0000. The probability of obtaining a data set that gives rise to a test statistic of 116.7945 or more is 0.0000 if there is no difference in the proportion of smartphone owners.
(c) Marascuilo Procedure

Level of Significance             0.05
Square Root of Critical Value     2.795483483

Sample Proportions
Group 1 (18-29)    0.96
Group 2 (30-49)    0.95
Group 3 (50-64)    0.83
Group 4 (65+)      0.61

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |           0.01              0.057934667     Not significant
| Group 1 - Group 3 |           0.13              0.083747945     Significant
| Group 1 - Group 4 |           0.35              0.103904026     Significant
| Group 2 - Group 3 |           0.12              0.08584456      Significant
| Group 2 - Group 4 |           0.34              0.105601216     Significant
| Group 3 - Group 4 |           0.22              0.121691862     Significant

There is a significant difference between 18- to 29-year-olds and both 50- to 64-year-olds and those 65 and older. There is a significant difference between 30- to 49-year-olds and both 50- to 64-year-olds and those 65 and older. There is a significant difference between those who are between 50 and 64 years old and those 65 or older.
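The critical ranges in the Marascuilo table can be recomputed from the sample proportions and sample sizes. This is a minimal sketch under the assumption that each critical range equals the square root of the chi-square critical value times the square root of the sum of p(1 - p)/n for the two groups; the function name `marascuilo` is hypothetical, and the square root of the critical value is taken from the PHStat output above rather than computed.

```python
import math
from itertools import combinations


def marascuilo(proportions, sizes, sqrt_critical_value):
    """Return (group j, group k, |p_j - p_k|, critical range, significant?) per pair."""
    rows = []
    for j, k in combinations(range(len(proportions)), 2):
        p_j, p_k = proportions[j], proportions[k]
        abs_diff = abs(p_j - p_k)
        critical_range = sqrt_critical_value * math.sqrt(
            p_j * (1 - p_j) / sizes[j] + p_k * (1 - p_k) / sizes[k]
        )
        rows.append((j + 1, k + 1, abs_diff, critical_range, abs_diff > critical_range))
    return rows


# Problem 12.14 (c): four age groups of 200 respondents each.
proportions = [0.96, 0.95, 0.83, 0.61]
sizes = [200, 200, 200, 200]
for j, k, diff, crit, significant in marascuilo(proportions, sizes, 2.795483483):
    label = "Significant" if significant else "Not significant"
    print(f"| Group {j} - Group {k} |  {diff:.2f}  {crit:.9f}  {label}")
```

A pair is declared significantly different only when its absolute difference exceeds its own critical range, so each comparison has a different threshold when the proportions or sample sizes differ.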
12.15
From PHStat Chi-Square Test

Observed Frequencies
                                 Education
Owns Smartphone    HS Grad    Some College    College Grad    Total
Yes                    150             356             372      878
No                      50              44              28      122
Total                  200             400             400     1000

Expected Frequencies
                                 Education
Owns Smartphone    HS Grad    Some College    College Grad    Total
Yes                  175.6           351.2           351.2      878
No                    24.4            48.8            48.8      122
Total                  200             400             400     1000

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    41.22633
p-Value                      1.12E-09
Reject the null hypothesis

(a)
At the 0.05 significance level, because the calculated test statistic 41.22633 is greater than the critical value of 5.991, you reject H0 and conclude that there is evidence of a difference among the education groups in the proportion of U.S. adult smartphone owners.
(b)
p-value = 0.0000. The probability of obtaining a data set that gives rise to a test statistic of 41.22633 or more is 0.0000 if there is no difference in the proportion of U.S. adult smartphone owners.
(c) Marascuilo Procedure

Level of Significance             0.05
Square Root of Critical Value     2.447746831

Sample Proportions
Group 1 (HS Grad)         0.75
Group 2 (Some College)    0.89
Group 3 (College Grad)    0.93

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |           0.14              0.08416299      Significant
| Group 1 - Group 3 |           0.18              0.081191803     Significant
| Group 2 - Group 3 |           0.04              0.049411758     Not significant

There is a significant difference between High School Graduates and both those with Some College Education and College Graduates. There is no significant difference between those with Some College Education and College Graduates.
12.16
(a)
H0: $\pi_1 = \pi_2 = \pi_3$   H1: At least one proportion differs.
Chi-Square Test

Observed Frequencies
                          Residence
Smartphone Owner    Rural    Suburb    Urban    Total
Yes                   160       336      356      852
No                     40        64       44      148
Total                 200       400      400     1000

Expected Frequencies
                          Residence
Smartphone Owner    Rural    Suburb    Urban    Total
Yes                 170.4     340.8    340.8      852
No                   29.6      59.2     59.2      148
Total                 200       400      400     1000

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    9.326228
p-Value                      0.009437
Reject the null hypothesis

Because 9.326228 > 5.9915, reject H0. There is a significant difference among areas of residence with respect to the proportion who own smartphones.
(b)
p-value = 0.0094. The probability of a test statistic greater than 9.3262 is 0.0094.
(c)
Level of Significance             0.05
Square Root of Critical Value     2.447746831

Sample Proportions
Group 1 (Rural)     0.8
Group 2 (Suburb)    0.84
Group 3 (Urban)     0.89

Marascuilo Table
Proportions              Absolute Differences    Critical Range
| Group 1 – Group 2 |           0.04              0.082500326     Not significant
| Group 1 – Group 3 |           0.09              0.079117524     Significant
| Group 2 – Group 3 |           0.05              0.058987652     Not significant

Ownership of a smartphone is different between those who live in rural areas and those who live in urban areas.
12.17
H0: $\pi_1 = \pi_2 = \pi_3$   H1: At least one proportion differs.
Chi-Square Test

Observed Frequencies
                          Residence
Smartphone Owner       Rural      Suburb       Urban    Total
Yes                       80          84          89      253
No                        20          16          11       47
Total                    100         100         100      300

Expected Frequencies
                          Residence
Smartphone Owner       Rural      Suburb       Urban    Total
Yes                 84.33333    84.33333    84.33333      253
No                  15.66667    15.66667    15.66667       47
Total                    100         100         100      300

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991465
Chi-Square Test Statistic    3.077958
p-Value                      0.2146
Do not reject the null hypothesis

(a)
Because 3.077958 < 5.9915, do not reject H0. There is not enough evidence of a significant difference among areas of residence with respect to the proportion who own smartphones. p-value = 0.2146. The probability of a test statistic greater than 3.077958 is 0.2146.
(b)
The $\chi^2_{STAT}$ is sensitive to sample size. The same proportions of smartphone owners (80%, 84%, 89%) were associated with a much larger $\chi^2_{STAT}$ (9.3262) when the group sample sizes were n = 200, 400, and 400. The $\chi^2_{STAT}$ was 3.077958 when n = 100 for each group with the same proportions (80%, 84%, 89%). The chi-square test for differences among more than two populations requires that each table cell have a sufficient expected frequency. Each cell should have an expected frequency of at least 1.
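The effect of sample size on the statistic can be seen directly by recomputing it for the tables in 12.16 and 12.17. The sketch below also illustrates a general property that is not stated in the solution: multiplying every cell count by a constant k multiplies the chi-square statistic by k. The function name and the scaling factor k = 4 are illustrative assumptions.

```python
def chi_square_stat(observed):
    """Chi-square statistic for an r x c table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, f_o in enumerate(row):
            f_e = row_totals[i] * col_totals[j] / grand_total
            stat += (f_o - f_e) ** 2 / f_e
    return stat


table_12_16 = [[160, 336, 356], [40, 64, 44]]   # n = 200, 400, 400
table_12_17 = [[80, 84, 89], [20, 16, 11]]      # n = 100, 100, 100

print(chi_square_stat(table_12_16))
print(chi_square_stat(table_12_17))

# Same proportions, every count multiplied by a hypothetical factor k.
k = 4
scaled = [[k * count for count in row] for row in table_12_17]
print(chi_square_stat(scaled), k * chi_square_stat(table_12_17))
```

Because the critical value depends only on the degrees of freedom, a fixed set of sample proportions that is not significant at one sample size can become significant at a larger one.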
12.18
(a)
From PHStat Chi-Square Test

Observed Frequencies
                          Activities
Used Device    Smartphone    Computer    Tablet    Total
Yes                   225         156       174      555
No                     75         144       126      345
Total                 300         300       300      900

Expected Frequencies
                          Activities
Used Device    Smartphone    Computer    Tablet    Total
Yes                   185         185       185      555
No                    115         115       115      345
Total                 300         300       300      900

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            3
Degrees of Freedom           2

Results
Critical Value               5.991464547
Chi-Square Test Statistic    36.12690952
p-Value                      1.42936E-08
Reject the null hypothesis

Because $\chi^2_{STAT} = 36.1269 > 5.9915$, reject H0. There is evidence of a difference in the percentage who use their device to check social media while watching TV between the groups.
(b)
p-value = 0.0000.
(c) Marascuilo Procedure

Level of Significance             0.05
Square Root of Critical Value     2.447746831

Sample Proportions
Group 1 (Smartphone)    0.75
Group 2 (Computer)      0.52
Group 3 (Tablet)        0.58

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |           0.23              0.093432135     Significant
| Group 1 - Group 3 |           0.17              0.092788655     Significant
| Group 2 - Group 3 |           0.06              0.099247004     Not significant

Smartphone versus computer: 0.23 > 0.0934. Significant.
Smartphone versus tablet: 0.17 > 0.0928. Significant.
Computer versus tablet: 0.06 < 0.0992. Not significant.
The smartphone group is different from the computer and tablet groups.
12.19
(a)
From PHStat Chi-Square Test

Observed Frequencies
                                             Country
At least 30% women    Australia     Germany     Ireland       Spain       Swiss    Total
Yes                          48          47          11          16          21      143
No                           14          10           9           3          22       58
Total                        62          57          20          19          43      201

Expected Frequencies
                                             Country
At least 30% women    Australia     Germany     Ireland       Spain       Swiss    Total
Yes                    44.10945    40.55224    14.22886    13.51741    30.59204      143
No                     17.89055    16.44776    5.771144    5.482587    12.40796       58
Total                        62          57          20          19          43      201

Data
Level of Significance        0.05
Number of Rows               2
Number of Columns            5
Degrees of Freedom           4

Results
Critical Value               9.487729
Chi-Square Test Statistic    19.28403
p-Value                      0.000691
Reject the null hypothesis

At the 0.05 significance level, there is evidence that there are differences among countries in the proportion of companies that have at least 30% women directors on their boards. Because $\chi^2_{STAT} = 19.28403 > 9.487729$ or p-value = 0.0007 < 0.05, reject H0.
(b)
The p-value of 0.0007 is well below the 0.05 significance level, so you reject the H0 that there are no differences among countries in the proportion of companies that have at least 30% women directors on their boards. The probability that a $\chi^2_{STAT}$ of 19.28403 or larger would be observed if H0 is true is 0.0007.
(c) Marascuilo Procedure

Level of Significance             0.05
Square Root of Critical Value     3.080215745

Sample Proportions
Group 1 (Australia)    0.774193548
Group 2 (Germany)      0.824561404
Group 3 (Ireland)      0.55
Group 4 (Spain)        0.842105263
Group 5 (Swiss)        0.488372093

MARASCUILO TABLE
Proportions              Absolute Differences    Critical Range
| Group 1 - Group 2 |       0.050367855           0.225456989     Not significant
| Group 1 - Group 3 |       0.224193548           0.379687583     Not significant
| Group 1 - Group 4 |       0.067911715           0.305201793     Not significant
| Group 1 - Group 5 |       0.285821455           0.286152749     Not significant
| Group 2 - Group 3 |       0.274561404           0.376150883     Not significant
| Group 2 - Group 4 |       0.01754386            0.30079056      Not significant
| Group 2 - Group 5 |       0.33618931            0.281443107     Significant
| Group 3 - Group 4 |       0.292105263           0.428726915     Not significant
| Group 3 - Group 5 |       0.061627907           0.415381787     Not significant
| Group 4 - Group 5 |       0.35373317            0.348607951     Significant

Because $\chi^2_{STAT}$ was significant at the 0.05 level, it is appropriate to perform the Marascuilo procedure. At the 0.05 level of significance, the Marascuilo procedure revealed that a significantly higher proportion of German companies have at least 30% women directors compared to Swiss companies. The procedure also revealed that a significantly higher proportion of Spanish companies have at least 30% women board directors compared to Swiss companies.
12.20
df = (r – 1)(c – 1) = (3 – 1)(4 – 1) = 6
12.21
(a)
df = 9, $\chi^2 = 16.919$
(b)
df = 9, $\chi^2 = 21.666$
(c)
df = 15, $\chi^2 = 30.578$
(d)
df = 10, $\chi^2 = 18.307$
(e)
df = 10, $\chi^2 = 23.209$
12.22
H0: There is no relationship between type of dessert and type of entrée.
H1: There is a relationship between type of dessert and type of entrée.
Test statistic: $\chi^2_{STAT} = \sum_{\text{all cells}} \frac{(f_o - f_e)^2}{f_e} = 92.1028$
Decision: Since the calculated test statistic 92.1028 is larger than the critical value of 16.9190, you reject H0 and conclude that there is enough evidence of a relationship between type of dessert and type of entrée.
12.23
From PHStat Chi-Square Test

Observed Frequencies
                                       Generation
Employment Status       Gen Z    Millennials       Gen X     Boomers    Total
Working Full-time         210            280         291         290     1071
Working Part-time         170             98          85          97      450
Freelancer                110            129         136          69      444
Project-based              11             15          10           5       41
Total                     501            522         522         461     2006

Expected Frequencies
                                       Generation
Employment Status       Gen Z    Millennials       Gen X     Boomers    Total
Working Full-time    267.4831       278.6949    278.6949    246.1271     1071
Working Part-time    112.3878       117.0987    117.0987    103.4148      450
Freelancer           110.8893       115.5374    115.5374    102.0359      444
Project-based        10.23978       10.66899    10.66899    9.422233       41
Total                     501            522         522         461     2006

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            4
Degrees of Freedom           9

Results
Critical Value               16.91898
Chi-Square Test Statistic    82.39589
p-Value                      5.39E-14
Reject the null hypothesis

At the 0.05 level of significance, there is sufficient evidence of a significant relationship between generation and employment status. Because $\chi^2_{STAT} = 82.39589 > 16.919$ or p-value = 0.0000 < 0.05, reject H0.
12.24
From PHStat Chi-Square Test
Observed Frequencies
                                        Age Group
Support Quiet Quitting       18-29       30-44       45-64         65+    Total
Strongly Support                53          36          58          28      175
Somewhat support                57          56         107          73      293
Somewhat oppose                 29          30          64          45      168
Strongly Oppose                 27          12          39          26      104
Not sure                        56          63          89          41      249
Total                          222         197         357         213      989

Expected Frequencies
                                        Age Group
Support Quiet Quitting       18-29       30-44       45-64         65+    Total
Strongly Support           39.2821    34.85844    63.16987    37.68959      175
Somewhat support          65.76946    58.36299    105.7644    63.10313      293
Somewhat oppose           37.71082    33.46411    60.64307      36.182      168
Strongly Oppose           23.34479    20.71587    37.54095    22.39838      104
Not sure                  55.89282    49.59858     89.8817     53.6269      249
Total                          222         197         357         213      989

Data
Level of Significance        0.01
Number of Rows               5
Number of Columns            4
Degrees of Freedom           12

Results
Critical Value               26.21697
Chi-Square Test Statistic    26.75745
p-Value                      0.008373
Reject the null hypothesis

Decision: Because $\chi^2_{STAT} = 26.757 > 26.217$ or p-value = 0.008373 < 0.01, reject H0. There is evidence to conclude that there is a relationship between age and supporting quiet quitting.
12.25
From PHStat Chi-Square Test

Observed Frequencies
                                  Geographic Region
Cloud Deployment         Americas    Asia-Pacific        EMEA    Total
Hybrid Cloud                  595             294         190     1079
Private Cloud                 646             154         185      985
Public Cloud                  306             196          85      587
No Cloud Deployment           153              56          40      249
Total                        1700             700         500     2900

Expected Frequencies
                                  Geographic Region
Cloud Deployment         Americas    Asia-Pacific        EMEA    Total
Hybrid Cloud             632.5172     260.4482759    186.0345     1079
Private Cloud            577.4138     237.7586207    169.8276      985
Public Cloud             344.1034     141.6896552    101.2069      587
No Cloud Deployment      145.9655     60.10344828    42.93103      249
Total                        1700             700         500     2900

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            3
Degrees of Freedom           6

Results
Critical Value               12.59159
Chi-Square Test Statistic    74.09251
p-Value                      5.9E-14
Reject the null hypothesis

At the 0.05 level of significance, there is evidence of a significant relationship between geographic region and cloud deployment. Because $\chi^2_{STAT} = 74.093 > 12.592$ or p-value = 0.000 < 0.05, reject H0. The probability of observing a $\chi^2_{STAT}$ of 74.093 or larger, when H0 is true, is close to 0.
12.26
From PHStat Chi-Square Test

Observed Frequencies
                                         Geographic Region
Number of Tech M&A    North America    Europe    Asia-Pacific    Latin America    Total
1                                14        20              15               10       59
2-3                              44        44              50               17      155
4-6                              24        21              22                3       70
7 or more                         8         5               3                0       16
Total                            90        90              90               30      300

Expected Frequencies
                                         Geographic Region
Number of Tech M&A    North America    Europe    Asia-Pacific    Latin America    Total
1                              17.7      17.7            17.7              5.9       59
2-3                            46.5      46.5            46.5             15.5      155
4-6                              21        21              21                7       70
7 or more                       4.8       4.8             4.8              1.6       16
Total                            90        90              90               30      300

Data
Level of Significance        0.05
Number of Rows               4
Number of Columns            4
Degrees of Freedom           9

Results
Critical Value               16.9189776
Chi-Square Test Statistic    12.18932412
p-Value                      0.202845358
Do not reject the null hypothesis

Because $\chi^2_{STAT} = 12.189 < 16.919$, do not reject H0. There is insufficient evidence of a relationship between number of Tech M&A and geographic region.
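For an r x c test of independence, the degrees of freedom, expected frequencies, and smallest expected frequency can be checked together. The sketch below uses the observed counts from 12.26; the function name `independence_test` and the returned dictionary keys are illustrative assumptions, not part of PHStat.

```python
def independence_test(observed):
    """Chi-square statistic, degrees of freedom, and smallest expected count."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    stat = sum(
        (f_o - f_e) ** 2 / f_e
        for obs_row, exp_row in zip(observed, expected)
        for f_o, f_e in zip(obs_row, exp_row)
    )
    return {
        "stat": stat,
        "df": (len(row_totals) - 1) * (len(col_totals) - 1),
        "min_expected": min(min(row) for row in expected),
    }


# Problem 12.26: rows are number of Tech M&A (1, 2-3, 4-6, 7 or more);
# columns are North America, Europe, Asia-Pacific, Latin America.
observed = [
    [14, 20, 15, 10],
    [44, 44, 50, 17],
    [24, 21, 22, 3],
    [8, 5, 3, 0],
]
result = independence_test(observed)
critical_value = 16.9189776  # from the PHStat output for df = 9, alpha = 0.05
print(result, result["stat"] > critical_value)
```

The smallest expected frequency here is 1.6 (7 or more, Latin America), which is above the minimum of 1 that the solution to 12.17 (b) cites for each cell.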
12.27
(a)
The lower and upper critical values are 29 and 55, respectively.
(b)
The lower and upper critical values are 27 and 57, respectively.
(c)
The lower and upper critical values are 25 and 59, respectively.
(d)
As the level of significance α gets smaller, the width of the nonrejection region gets wider.
12.28
(a)
The lower critical value is 31.
(b)
The lower critical value is 29.
(c)
The lower critical value is 27.
(d)
The lower critical value is 25.
12.29
T1 = 4 + 1 + 8 + 2 + 5 + 16 + 11 = 47
12.30
The lower and upper critical values are 40 and 79, respectively.
12.31
Decision: Since T1 = 47 is between the critical bounds of 40 and 79, do not reject H0.
12.32
(a)
The ranks for Sample 1 are 1, 2, 4, 5, and 10, respectively. The ranks for Sample 2 are 3, 6.5, 6.5, 8, 9, and 11, respectively.
(b)
T1 = 1 + 2 + 4 + 5 + 10 = 22
(c)
T2 = 3 + 6.5 + 6.5 + 8 + 9 + 11 = 44
(d)
$T_1 + T_2 = \frac{n(n+1)}{2} = \frac{11(12)}{2} = 66$
$T_1 + T_2 = 22 + 44 = 66$
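The check in part (d) works for any pair of samples, including tied values that receive average ranks. The sketch below assigns midranks to a combined sample and confirms the identity; the data values are hypothetical, chosen only so that they reproduce the ranks listed in part (a), and `midranks` is an illustrative helper name.

```python
def midranks(values):
    """Rank values from smallest to largest, giving ties their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1  # positions are 1-based
        for position in range(start, end + 1):
            ranks[order[position]] = average_rank
        start = end + 1
    return ranks


# Hypothetical data (not from the textbook) that yield the ranks in 12.32 (a).
sample_1 = [1, 2, 4, 5, 10]
sample_2 = [3, 7, 7, 8, 9, 11]

ranks = midranks(sample_1 + sample_2)
t1 = sum(ranks[: len(sample_1)])
t2 = sum(ranks[len(sample_1):])
n = len(ranks)
print(t1, t2, t1 + t2, n * (n + 1) / 2)
```

If the two rank sums do not add up to n(n + 1)/2, at least one rank was assigned incorrectly.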
12.33
The lower critical value is 20.
12.34
Decision: Since T1 = 22 is greater than the lower critical bound of 20, do not reject H0.
12.35
H0: $M_1 = M_2$, where Populations: 1 = traditional, 2 = experimental. There is no difference in performance between the traditional and the experimental training methods.
H1: $M_1 \ne M_2$. There is a difference in performance between the traditional and the experimental training methods.
Decision rule: If T1 < 78 or T1 > 132, reject H0.
Test statistic: T1 = 1 + 2 + 3 + 5 + 9 + 10 + 12 + 13 + 14 + 15 = 84
Decision: Since T1 = 84 is between the critical bounds of 78 and 132, do not reject H0. There is not enough evidence to conclude that there is a difference in median performance between the traditional and the experimental training methods.
12.36
(a)
The data are ordinal.
(b)
The two-sample t test is inappropriate because the data are ordinal, the sample size is small, and the ordinal data are not normally distributed.
(c)
H0: $M_1 = M_2$   H1: $M_1 \ne M_2$
where Populations: 1 = California, 2 = Washington

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  8
Sum of Ranks                 47

Population 2 Sample
Sample Size                  8
Sum of Ranks                 89

Intermediate Calculations
Total Sample Size n          16
T1 Test Statistic            47
T1 Mean                      68
Standard Error of T1         9.521905
Z Test Statistic             –2.20544

Two-Tail Test
Lower Critical Value         –1.95996
Upper Critical Value         1.959964
p-value                      0.027423
Reject the null hypothesis

$\mu_{T_1} = \frac{n_1(n + 1)}{2} = \frac{8(16 + 1)}{2} = 68$
$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{8(8)(16 + 1)}{12}} = 9.5219$
$Z_{STAT} = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{47 - 68}{9.5219} = -2.2054$

Decision: Since ZSTAT = –2.2054 is lower than the lower critical bound of –1.96, reject H0. There is enough evidence of a significant difference in the median rating of California Cabernets and Washington Cabernets.
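The large-sample calculation above can be reproduced with the standard library. This is a minimal sketch of the normal approximation for the Wilcoxon rank sum test, assuming T1 is the rank sum for the sample of size n1; the function name `wilcoxon_rank_sum_z` is hypothetical, and the two-tail p-value comes from the standard normal distribution.

```python
import math


def wilcoxon_rank_sum_z(t1, n1, n2):
    """Mean and standard error of T1, the Z statistic, and the two-tail p-value."""
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2
    std_error_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (t1 - mean_t1) / std_error_t1
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tail area under N(0, 1)
    return mean_t1, std_error_t1, z, p_value


# Problem 12.36 (c): T1 = 47 with n1 = n2 = 8.
mean_t1, std_error_t1, z, p_value = wilcoxon_rank_sum_z(47, 8, 8)
print(mean_t1, std_error_t1, z, p_value)
```

The same function applies to the other rank sum problems in this section, for example T1 = 561 with n1 = n2 = 20 in Problem 12.38.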
12.37
From PHStat Wilcoxon Rank Sum Test

Data
Level of Significance        0.05

Population 1 Sample
Sample Size                  12
Sum of Ranks                 131

Population 2 Sample
Sample Size                  7
Sum of Ranks                 59

Intermediate Calculations
Total Sample Size n          19
T1 Test Statistic            59
T1 Mean                      70
Standard Error of T1         11.8322
Z Test Statistic             -0.92967

Two-Tail Test
Lower Critical Value         -1.9600
Upper Critical Value         1.9600
p-Value                      0.3525
Do not reject the null hypothesis

(a)
Using α = 0.05, there is insufficient evidence of a difference in the median satisfaction rating for traditional and prepaid providers. Because one of the sample sizes is greater than 10, it is appropriate to use the standardized Z test statistic for the Wilcoxon rank sum test. Because ZSTAT = –0.9297 > –1.96 or p-value = 0.3525 > 0.05, do not reject H0.
(b)
The Wilcoxon rank sum test assumes the two samples are independent and that the dependent variable is at the ordinal level or higher. However, the Wilcoxon rank sum test does not require that the sample data meet the normality assumption. The Wilcoxon rank sum test is an alternative to the pooled-variance and separate-variance t tests when the assumptions of these procedures are violated.
(c)
The results from (a) were similar to the results observed in 10.9 (a), which were associated with a pooled-variance t test on the same cell phone data used in 12.37 (a). The tSTAT of –1.0468 or p-value = 0.3099 led to the conclusion not to reject H0. With the Wilcoxon rank sum test, the ZSTAT = –0.92967 or p-value = 0.3525 also led to the conclusion not to reject H0. The p-value was slightly higher for the Wilcoxon rank sum test compared to the p-value associated with the pooled-variance t test from 10.9 (a).
12.38
(a)
H0: $M_1 = M_2$, where Populations: 1 = Wing A, 2 = Wing B. H1: $M_1 \ne M_2$.
Population 1 sample: Sample size 20, sum of ranks 561
Population 2 sample: Sample size 20, sum of ranks 259
$\mu_{T_1} = \frac{n_1(n + 1)}{2} = \frac{20(40 + 1)}{2} = 410$
$\sigma_{T_1} = \sqrt{\frac{n_1 n_2 (n + 1)}{12}} = \sqrt{\frac{20(20)(40 + 1)}{12}} = 36.9685$
$Z_{STAT} = \frac{T_1 - \mu_{T_1}}{\sigma_{T_1}} = \frac{561 - 410}{36.9685} = 4.0846$
Decision: Because ZSTAT = 4.0846 > 1.96 (or p-value = 0.0000 < 0.05), reject H0. There is sufficient evidence of a difference in the median delivery time in the two wings of the hotel.
(b)
The results of (a) are consistent with the results of Problem 10.65.
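The large-sample calculation in 12.38 (a) can be checked with a short script. This is a minimal sketch using only the Python standard library, not PHStat's implementation; the function name `wilcoxon_rank_sum_z` is an illustrative assumption, and the two-tail p-value is taken from the standard normal distribution via `math.erfc`.

```python
import math

def wilcoxon_rank_sum_z(t1, n1, n2):
    """Large-sample Z approximation for the Wilcoxon rank sum test.

    t1 is the sum of ranks for the sample of size n1 (hypothetical helper).
    """
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2                  # mean of T1
    se_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)   # standard error of T1
    z = (t1 - mean_t1) / se_t1
    # Two-tail p-value: 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return mean_t1, se_t1, z, p_value

# Summary values from 12.38 (a): T1 = 561, n1 = n2 = 20
mean_t1, se_t1, z, p_value = wilcoxon_rank_sum_z(t1=561, n1=20, n2=20)
print(round(mean_t1, 1), round(se_t1, 4), round(z, 4), p_value < 0.05)
```

The same function can be applied to the other rank sum problems in this section by passing the T1 test statistic and the two sample sizes from the corresponding output.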
12.39
(a)
Using α = 0.05, there is evidence of a difference in the median life of bulbs between the two manufacturers. Because both sample sizes are greater than 10, it is appropriate to use the standardized Z test statistic for the Wilcoxon rank sum test. Because ZSTAT = –4.4215 or p-value = 0.0000, reject H0.
(b)
The Wilcoxon rank sum test assumes the two samples are independent and that the dependent variable is at the ordinal level or higher. However, the Wilcoxon rank sum test does not require that the sample data meet the normality assumption. The Wilcoxon rank sum test is an alternative to the pooled-variance and separate-variance t tests when one or more of the assumptions of these procedures are violated.
(c)
A pooled-variance t test performed on the same data analyzed in 12.39 (a) produced the results for Problem 10.64. Because tSTAT = –5.08 or p-value = 0.000, one would reject the null hypothesis that the mean length of bulb life is the same for the two manufacturers. The Wilcoxon rank sum test produced similar results with ZSTAT = –4.4215 or p-value = 0.0000. The conclusion to reject H0 was the same for both the pooled-variance t test and the Wilcoxon rank sum test.
12.40
(a)
From PHStat: Wilcoxon Rank Sum Test
Data
Level of Significance: 0.05
Population 1 Sample
Sample Size: 21
Sum of Ranks: 464.5
Population 2 Sample
Sample Size: 15
Sum of Ranks: 201.5
Intermediate Calculations
Total Sample Size n: 36
T1 Test Statistic: 201.5
T1 Mean: 277.5
Standard Error of T1: 31.1649
Z Test Statistic: -2.438642
Two-Tail Test
Lower Critical Value: -1.9600
Upper Critical Value: 1.9600
p-Value: 0.0147
Reject the null hypothesis
Because ZSTAT = –2.4386 < –1.96, reject H0. There is evidence to conclude that there is a difference in the median brand value between the two sectors.
(b)
You must assume approximately equal variability in the two populations.
(c)
Using the pooled-variance t test and the separate-variance t test, the decision was to not reject the null hypothesis. Because the data is very right skewed and the variances are different, the Wilcoxon rank sum test is most appropriate.
12.41
(a)
H0: M1 = M2, where Populations: 1 = Bank 1, 2 = Bank 2; H1: M1 ≠ M2
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
15
Sum of Ranks
153
Population 2 Sample Sample Size
15
Sum of Ranks
312
Intermediate Calculations Total Sample Size n
30
T1 Test Statistic
153
T1 Mean
232.5
Standard Error of T1
24.10913
Z Test Statistic
–3.29751
Two-Tailed Test Lower Critical Value
–1.95996
Upper Critical Value
1.959961
p-value
0.000976
Reject the null hypothesis
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Decision: Since ZSTAT = –3.2975 is less than the lower critical bound of –1.96, reject H0. There is enough evidence to conclude that the median waiting time between the two branches is different.
(b)
You must assume approximately equal variability in the two populations.
(c)
Using both the pooled-variance t test and the separate-variance t test allowed you to reject the null hypothesis and conclude in Problem 10.12 that the mean waiting time between the two branches is different. Here, the Wilcoxon rank sum test with the large-sample Z approximation also allowed you to reject the null hypothesis and conclude that the median waiting time between the two branches is different.
12.42
(a)
From PHStat, Use the Wilcoxon Rank Sum Test. Level of Significance
0.05
Population 1 Sample Sample Size
28
Sum of Ranks
891
Population 2 Sample Sample Size
29
Sum of Ranks
762
Intermediate Calculations Total Sample Size n
57
T1 Test Statistic
891
T1 Mean
812
Standard Error of T1
62.6472
Z Test Statistic
1.2610308
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.2073
Do not reject the null hypothesis
Because –1.96 < ZSTAT = 1.261 < 1.96 (or the p-value = 0.2073 > 0.05), do not reject H0. There is not enough evidence to conclude that there is a difference in the median rating of ads that play before and after halftime.
(b)
You must assume approximately equal variability in the two populations.
(c)
Using the pooled-variance t test, you do not reject the null hypothesis (t = –2.004 < tSTAT = –1.3627 < 2.004; p-value = 0.1785 > 0.05) and conclude that there is insufficient evidence of a difference in the mean rating of ads before and after halftime in Problem 10.11 (a).
12.43
For the 0.01 level of significance and 5 degrees of freedom, $\chi^2_U = 15.086$.
12.44
(a)
Decision rule: If $H > \chi^2_U = 15.086$, reject H0.
(b)
Decision: Since Hcalc = 13.77 is less than the critical bound of 15.086, do not reject H0.
12.45 Kruskal-Wallis Rank Test for Differences in Medians
Data Level of Significance
0.05
Intermediate Calculations Sum of Squared Ranks/Sample Size
39586.65
Sum of Sample Sizes
50
Number of Groups
5
Test Result H Test Statistic
33.29012
Critical Value
9.487729
p-Value
1.04E-06
Reject the null hypothesis
(a)
H0: M1 = M2 = M3 = M4= M5 H1: At least one of the medians differs. Since the p-value is virtually 0, reject H0. There is enough evidence of a significant difference in the median amount of food eaten among the various products.
(b)
In (a), you conclude that there is enough evidence of a significant difference in the median amount of food eaten among the various products, while in problem 11.13(a) you also conclude that there is evidence of a significant difference in the mean amount of food eaten among the various products.
(c)
The normal probability plot suggests that the kidney-based data appear to deviate from the normal distribution. Hence, the Kruskal-Wallis rank test is more appropriate.
12.46
Kruskal-Wallis rank test: Data Level of Significance
0.05 Group 1
Sum of Ranks
640
Sample Size
15 Group 2
Sum of Ranks
291
Sample Size
15 Group 3
Sum of Ranks
468
Sample Size
15 Group 4
Sum of Ranks
431
Sample Size
15
Intermediate Calculations Sum of Squared Ranks/Sample Size
59937.73
Sum of Sample Sizes
60
Number of groups
4
H Test Statistic
13.51716 Test Result
Critical Value
7.814728
p-Value
0.003642 Reject the null hypothesis
(a)
H0: Mmain = MSat1 = MSat2 = MSat3 H1: At least one of the medians differs. Since the p-value = 0.0036 is lower than 0.05, reject H0. There is sufficient evidence of a difference in the median waiting time in the four locations.
(b)
The results are consistent with those of Problem 11.9.
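The H test statistic in the 12.46 output can be reproduced from the group rank sums alone. The sketch below is a hedged illustration, not PHStat's code: the helper name `kruskal_wallis_h` is an assumption, no tie correction is applied, and the critical value is copied from the output above rather than computed.

```python
def kruskal_wallis_h(rank_sums, sample_sizes):
    """Kruskal-Wallis H from per-group rank sums (no tie correction).

    Returns the intermediate sum of squared rank sums over sample sizes
    and the H statistic (hypothetical helper for illustration).
    """
    n = sum(sample_sizes)
    sum_sq_over_n = sum(t ** 2 / nj for t, nj in zip(rank_sums, sample_sizes))
    h = 12 / (n * (n + 1)) * sum_sq_over_n - 3 * (n + 1)
    return sum_sq_over_n, h

# Rank sums and sample sizes from the 12.46 output
sum_sq_over_n, h = kruskal_wallis_h([640, 291, 468, 431], [15, 15, 15, 15])
critical_value = 7.814728  # chi-square, 3 d.f., alpha = 0.05, copied from the output
print(round(sum_sq_over_n, 2), round(h, 5), h > critical_value)
```

Passing the rank sums and sample sizes from the other Kruskal-Wallis problems in this section should reproduce their reported H values in the same way.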
12.47
From PHStat: Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Group      Sample Size   Sum of Ranks   Mean Ranks
Burger     14            473            33.7857143
Snack      6             66             11
Chicken    9             295.5          32.8333333
Global     6             181.5          30.25
Sandwich   9             194            21.5555556
Pizza      6             65             10.8333333
Sum of Squared Ranks/Sample Size: 36785.21
Sum of Sample Sizes: 50
Number of Groups: 6
Test Result
H Test Statistic: 20.1069
Critical Value: 11.0705
p-Value: 0.0012
Reject the null hypothesis
(a)
Because H = 20.1069 > 11.0705 or p-value = 0.0012, reject H0. At the 0.05 significance level, there is evidence of a difference in median U.S. average sales per unit among the food segments.
(b)
The results from Problem 11.11 (a) were produced by a one-way ANOVA F test. At the 0.05 level of significance, there is insufficient evidence of differences in U.S. mean sales per unit among the food segments. Because FSTAT = 1.3691 or p-value = 0.2543, do not reject H0. In contrast, the Kruskal-Wallis test revealed a significant difference among the food segments.
12.48
Kruskal-Wallis rank test:
Data
Level of Significance: 0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size: 8705.333
Sum of Sample Sizes: 30
Number of Groups: 5
Test Result
H Test Statistic: 19.32688
Critical Value: 9.487729
p-Value: 0.000678
Reject the null hypothesis
(a)
H0: MA = MB = MC = MD = ME H1: At least one of the medians differs. Since the p-value = 0.0007 < 0.05, reject H0. There is sufficient evidence of a difference in the median rating of the five advertisements.
(b)
In (a), you conclude that there is evidence of a difference in the median rating of the five advertisements, while in problem 11.10 (a), you conclude that there is evidence of a difference in the mean rating of the five advertisements.
(c)
Since the combined scores are not true continuous variables, the nonparametric Kruskal-Wallis rank test is more appropriate because it does not require the scores to be normally distributed.
12.49
From PHStat: Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Group           Sample Size   Sum of Ranks   Mean Ranks
Asia            10            223            22.3
Europe          10            318            31.8
North America   10            114.5          11.45
South America   10            164.5          16.45
Sum of Squared Ranks/Sample Size: 19102.35
Sum of Sample Sizes: 40
Number of Groups: 4
Test Result
H Test Statistic: 16.7733
Critical Value: 7.8147
p-Value: 0.0008
Reject the null hypothesis
(a)
Because H = 16.7733 > 7.815 or p-value = 0.0008, reject H0. At the 0.05 significance level, there is evidence of a difference in median congestion levels across the four continents.
(b)
The results from Problem 11.14 (a) were produced by a one-way ANOVA F test. At the 0.05 level of significance, there was evidence of differences in congestion levels among the four continents. Because FSTAT = 8.172 and p-value = 0.003, the conclusion was to reject H0. Similarly, the Kruskal-Wallis test revealed a significant difference among the continents. Because H = 16.7733 > 7.815 or p-value = 0.0008, the conclusion was to reject H0.
12.50
From PHStat: Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Group      Sample Size   Sum of Ranks   Mean Ranks
Europe     8             114            14.25
Americas   8             172.5          21.5625
Asia       8             109            13.625
Africa     8             132.5          16.5625
Sum of Squared Ranks/Sample Size: 9023.688
Sum of Sample Sizes: 32
Number of Groups: 4
Test Result
H Test Statistic: 3.5419
Critical Value: 7.8147
p-Value: 0.3154
Do not reject the null hypothesis
(a)
Because H = 3.5419 < 7.815 or the p-value is 0.3154, do not reject H0. There is insufficient evidence of a difference in the median export price across the global regions.
(b)
The results are the same. The ANOVA FSTAT also indicates that there is no difference in export costs among the four regions.
12.51
The Chi-square test for the difference between two proportions can be used as long as all expected frequencies are at least 5.
12.52
The Chi-square test can be used for c populations as long as all expected frequencies are at least one.
12.53
The Chi-square test for independence can be used as long as all expected frequencies are at least one.
12.54
The Wilcoxon rank sum test should be used when you are unable to assume that each of two independent populations are normally distributed.
12.55
The Kruskal-Wallis rank test should be used if you cannot assume that the populations are normally distributed.
12.56
(a)
H0: There is no relationship between a student's gender and his/her pizzeria selection.
H1: There is a relationship between a student's gender and his/her pizzeria selection.
Decision rule: d.f. = 1. If $\chi^2_{STAT} > 3.841$, reject H0.
Test statistic: $\chi^2_{STAT} = 0.412$
Decision: Since $\chi^2_{STAT} = 0.412$ is smaller than the critical bound of 3.841, do not reject H0. There is not enough evidence to conclude that there is a relationship between a student's gender and his/her pizzeria selection.
(b)
Test statistic: $\chi^2_{STAT} = 2.624$
Decision: Since $\chi^2_{STAT} = 2.624$ is less than the critical bound of 3.841, do not reject H0. There is not enough evidence to conclude that there is a relationship between a student's gender and his/her pizzeria selection.
(c)
H0: There is no relationship between price and pizzeria selection.
H1: There is a relationship between price and pizzeria selection.
Decision rule: d.f. = 2. If $\chi^2_{STAT} > 5.991$, reject H0.
Test statistic: $\chi^2_{STAT} = 4.956$
Decision: Since $\chi^2_{STAT} = 4.956$ is smaller than the critical bound of 5.991, do not reject H0. There is not enough evidence to conclude that there is a relationship between price and pizzeria selection.
(d)
p-value = 0.0839. The probability of obtaining a sample that gives a test statistic equal to or greater than 4.956 is 0.0839 if the null hypothesis of no relationship between price and pizzeria selection is true.
12.57
PHStat Results for Facebook: Chi-Square Test
Observed Frequencies
Facebook   B2B    B2C    Total
Yes        1369   2742   4111
No         169    114    283
Total      1538   2856   4394
Expected Frequencies
Facebook   B2B        B2C        Total
Yes        1438.944   2672.056   4111
No         99.05644   183.9436   283
Total      1538       2856       4394
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 2
Degrees of Freedom: 1
Results
Critical Value: 3.841459
Chi-Square Test Statistic: 81.2133
p-Value: 2.03E-19
Reject the null hypothesis
At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use Facebook. Because $\chi^2_{STAT}$ = 81.2133 > 3.841 or p-value = 0.000, reject H0. The results indicate that a greater proportion of B2C marketers (96%) utilize Facebook compared to B2B marketers (89%).
PHStat Results for Twitter: Chi-Square Test
Observed Frequencies
Twitter   B2B    B2C    Total
Yes       831    1314   2145
No        707    1542   2249
Total     1538   2856   4394
Expected Frequencies
Twitter   B2B        B2C        Total
Yes       750.7988   1394.201   2145
No        787.2012   1461.799   2249
Total     1538       2856       4394
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 2
Degrees of Freedom: 1
Results
Critical Value: 3.841459
Chi-Square Test Statistic: 25.75197
p-Value: 3.88E-07
Reject the null hypothesis
At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use Twitter. Because $\chi^2_{STAT}$ = 25.75197 > 3.841 or p-value = 0.0000, reject H0. The results indicate that a greater proportion of B2B marketers (54%) utilize Twitter compared to B2C marketers (46%).
PHStat Results for LinkedIn: Chi-Square Test
Observed Frequencies
LinkedIn   B2B    B2C    Total
Yes        1246   1514   2760
No         292    1342   1634
Total      1538   2856   4394
Expected Frequencies
LinkedIn   B2B        B2C        Total
Yes        966.0628   1793.937   2760
No         571.9372   1062.063   1634
Total      1538       2856       4394
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 2
Degrees of Freedom: 1
Results
Critical Value: 3.841459
Chi-Square Test Statistic: 335.6029
p-Value: 5.79E-75
Reject the null hypothesis
At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use LinkedIn. Because $\chi^2_{STAT}$ = 335.6029 > 3.841 or p-value = 0.0000, reject H0. The results indicate that a greater proportion of B2B marketers (81%) utilize LinkedIn compared to B2C marketers (53%).
PHStat Results for Pinterest: Chi-Square Test
Observed Frequencies
TikTok   B2B    B2C    Total
Yes      108    286    394
No       1430   2750   4180
Total    1538   3036   4574
Expected Frequencies
TikTok   B2B        B2C        Total
Yes      132.4819   261.5181   394
No       1405.518   2774.482   4180
Total    1538       3036       4574
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 2
Degrees of Freedom: 1
Results
Critical Value: 3.841459
Chi-Square Test Statistic: 7.458414
p-Value: 0.006314
Reject the null hypothesis
At the 0.05 significance level, there is evidence of a difference between B2B and B2C marketers in the proportion who use Pinterest. Because $\chi^2_{STAT}$ = 7.458414 > 3.841 or p-value = 0.0063, reject H0. The results indicate that a greater proportion of B2C marketers (9%) utilize TikTok compared to B2B marketers (7%).
12.58
From PHStat: Chi-Square Test
Observed Frequencies (Industry Sector)
Leaders Level   Universal Bank   Insurance Company   Private Equity   Private Debt Firms   Total
Yes             33               12                  10               55                   110
No              50               71                  73               194                  388
Total           83               83                  83               249                  498
Expected Frequencies (Industry Sector)
Leaders Level   Universal Bank   Insurance Company   Private Equity   Private Debt Firms   Total
Yes             18.33333         18.33333            18.33333         55                   110
No              64.66667         64.66667            64.66667         194                  388
Total           83               83                  83               249                  498
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 4
Degrees of Freedom: 3
Results
Critical Value: 7.814728
Chi-Square Test Statistic: 22.72971
p-Value: 4.6E-05
Reject the null hypothesis
Because $\chi^2_{STAT}$ = 22.7297 > 7.815 (p-value = 0.000 < 0.05), reject H0. There is evidence of a difference among the proportions of financial sub-sector companies that operate at the highest level (Leaders Level) of the digital maturity scale.
12.59
From PHStat: Chi-Square Test
Observed Frequencies (Geographic Region)
Use Social Media   Central Europe   Western Europe   Nordic   Total
Yes                62               22               52       136
No                 75               44               38       157
Total              137              66               90       293
Expected Frequencies (Geographic Region)
Use Social Media   Central Europe   Western Europe   Nordic     Total
Yes                63.59044         30.63481         41.77474   136
No                 73.40956         35.36519         48.22526   157
Total              137              66               90         293
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 3
Degrees of Freedom: 2
Results
Critical Value: 5.991465
Chi-Square Test Statistic: 9.287276
p-Value: 0.009623
Reject the null hypothesis
(a)
At the 0.05 significance level, there is evidence of a difference in the proportion of customer service leaders who use social media in customer service among the three geographic regions. Because $\chi^2_{STAT}$ = 9.287276 > 5.991 or p-value = 0.009, reject H0.
From PHStat: Chi-Square Test
Observed Frequencies (Geographic Region)
Offer Self-Service   Central Europe   Western Europe   Nordic   Total
Yes                  64               46               68       178
No                   73               20               22       115
Total                137              66               90       293
Expected Frequencies (Geographic Region)
Offer Self-Service   Central Europe   Western Europe   Nordic     Total
Yes                  83.22867         40.09556         54.67577   178
No                   53.77133         25.90444         35.32423   115
Total                137              66               90         293
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 3
Degrees of Freedom: 2
Results
Critical Value: 5.991465
Chi-Square Test Statistic: 21.80688
p-Value: 1.84E-05
Reject the null hypothesis
(b)
At the 0.05 significance level, there is evidence of a difference in the proportion of customer service leaders who offer self-service options for customers among the three geographic regions. Because $\chi^2_{STAT}$ = 21.80688 > 5.991 or p-value = 0.000, reject H0.
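The expected frequencies and chi-square statistic reported for 12.59 (a) can be recomputed from the observed counts. This is a minimal sketch with a hypothetical helper, `chi_square_stat`, using only the standard library; the p-value line relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability is exp(-x/2), so it does not generalize to other tables.

```python
import math

def chi_square_stat(observed):
    """Chi-square statistic, degrees of freedom, and expected frequencies.

    observed is a list of rows of cell counts, without totals
    (hypothetical helper for illustration).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency = row total * column total / overall total
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected

# Observed counts from 12.59 (a): rows Yes/No; columns Central Europe, Western Europe, Nordic
stat, df, expected = chi_square_stat([[62, 22, 52], [75, 44, 38]])
# Closed form valid only when df == 2
p_value = math.exp(-stat / 2) if df == 2 else None
print(round(stat, 4), df, p_value)
```

Every expected frequency here is well above 5, which is consistent with the conditions for using the chi-square test stated in 12.51 through 12.53.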
Chapter 13
13.1
(a) When X = 0, the estimated expected value of Y is 3.
(b) For each increase in the value of X by 1 unit, you can expect an increase by an estimated 6 units in the value of Y.
(c) Ŷ = 3 + 6X = 3 + 6(3) = 21
(d) Ŷ = 3 + 6X = 3 + 6(1) = 9
13.2
(a) (b) (c) (d)
yes no no yes
13.3
(a) When X = 0, the estimated expected value of Y is 16.
(b) For each increase in the value of X by 1 unit, you can expect a decrease of an estimated 0.5 units in the value of Y.
(c) Ŷ = 16 – 0.5X = 16 – 0.5(6) = 13
13.4
(a) The scatter plot shows a positive linear relationship.
(b) For each % increase in alcohol content, there is an expected increase in quality of an estimated 0.5624.
(c) Ŷ = –0.3529 + 0.5624X = –0.3529 + 0.5624(10) = 5.2715
(d) There appears to be a positive linear relationship between quality and % alcohol content. For each % increase in alcohol content, there is an expected increase in quality of an estimated 0.5624.
13.5
(a)
The scatterplot depicts a positive linear relationship between summary rating and cost. (b)
b0 = –47.8091, b1 = 1.5951.
(c)
The Y intercept, b0, would be the mean cost per person when the summary rating is equal to zero. The literal interpretation is not meaningful in this case because a negative price and a mean summary rating of zero would not be possible. The sample slope, b1, indicates that for each unit of change in summary rating, the predicted mean cost per person increases by $1.60.
(d)
Ŷ = –47.8091 + 1.5951X = –47.8091 + 1.5951(55) = 39.92. The predicted mean cost per person for a restaurant with a summary rating of 55 would be $39.92.
(e)
One could inform the owners that there is a positive relationship between summary rating for food, décor, and service and the average cost of a meal per person. This relationship suggests that higher cost per person is associated with higher ratings with food, décor, and service.
13.6
(a)
(b) b0 = 42,989.39135680, b1 = 1.60307520.
(c) For each increase of $1,000 in tuition, the mean starting salary is predicted to increase by $1,603.10.
(d) Ŷ = 42,989.39135680 + 1.60307520(50,450) = $123,864.54. A program that has a per-year tuition cost of $50,450 is predicted to have a mean starting salary of $123,864.54.
(e) Starting salary seems higher for those schools that have a higher tuition.
13.7
(a)
(b) Ŷ = 0.7500 + 0.5000X
(c) For each increase of one additional plate gap, the estimated mean tear rating will increase by 0.5.
(d) Ŷ = 0.7500 + 0.5000X = 0.7500 + 0.5000(0) = 0.7500
(e) Bag tear rating increases as the plate gap on the bag-sealing equipment increases.
13.8
(a)
(b) b0 = –1,410.6988, b1 = 10.9927.
(c) For each additional million dollar increase in revenue, the mean value is predicted to increase by an estimated $10.9927 million. Literal interpretation of b0 is not meaningful because an operating franchise cannot have zero revenue.
(d) Ŷ = –1,410.6998 + 10.9927(250) = $1,337.47 million.
(e) The value of the team can be expected to increase as revenue increases.
13.9
(a)
(b) b0 = 782 and b1 = 0.978.
(c) The Y intercept, b0, would be the mean monthly rental cost when the square footage of an apartment is equal to zero. The literal interpretation is not meaningful in this case because the square footage cannot be equal to zero. The sample slope, b1, indicates that for each unit of change in apartment square footage, the mean monthly rental cost is predicted to increase by $0.978.
(d) Ŷ = 782 + 0.978(800) = $1,564.55. The predicted mean monthly rent for an 800 square foot apartment would be $1,564.55.
(e) It would not be appropriate to use the model in 13.9 (d) to predict the monthly rent for a 1,500 square foot apartment because the model was based on an independent square footage variable that ranged from 434 square feet to 955 square feet. A 1,500 square foot apartment would fall outside of this range. One should only use the relevant range of the independent variable.
(f) The 800 square foot apartment for $1,130 would represent the better deal. Based on the equation in 13.9 (d), an 800 square foot apartment would have a predicted monthly rent of $1,564.55. An 800 square foot apartment renting at $1,130 per month would be $434.55 below the predicted rent of $1,564.55. Based on the equation in 13.9 (d), an 830 square foot apartment would have a predicted rent of $1,593.90. An 830 square foot apartment renting at $1,410 would only be $183.90 below the predicted amount of $1,593.90.
13.10
(a)
(b) b0 = –5.6263, b1 = 1.3712.
(c) For each increase of one million YouTube trailer views, the predicted weekend box office gross is estimated to increase by $1.3712 million.
(d) Ŷ = –5.6263 + 1.3712(20) = $21.7969 million.
(e) You can conclude that the mean predicted increase in weekend box office gross is $1.3712 million for each one million increase in YouTube trailer views.
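The relevant-range caution in 13.9 (e) and the comparison in 13.9 (f) can be sketched in code. This is an illustrative sketch, not the manual's method: the function name `predict_rent` and its parameter names are assumptions, and it uses the rounded coefficients b0 = 782 and b1 = 0.978, so its predictions differ by a few cents from the manual's $1,564.55 and $1,593.90, which presumably come from unrounded coefficients.

```python
def predict_rent(square_feet, b0=782.0, b1=0.978, x_min=434, x_max=955):
    """Predicted mean monthly rent from the 13.9 regression line.

    Refuses to extrapolate outside the relevant range of X
    (434 to 955 square feet, as stated in 13.9 (e)).
    """
    if not x_min <= square_feet <= x_max:
        raise ValueError(
            f"{square_feet} sq ft is outside the relevant range "
            f"{x_min} to {x_max}; the model should not be used here"
        )
    return b0 + b1 * square_feet

# Compare each asking rent with its predicted rent, as in 13.9 (f)
for size, asking in [(800, 1130), (830, 1410)]:
    predicted = predict_rent(size)
    print(size, round(predicted, 2), round(predicted - asking, 2))
```

Calling `predict_rent(1500)` raises a `ValueError`, mirroring the advice to use only the relevant range of the independent variable.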
13.11
83% of the variation in the dependent variable can be explained by the variation in the independent variable.
13.12
SST = 40 and r2 = SSR/SST = 36/40 = 0.90. So, 90% of the variation in the dependent variable can be explained by the variation in the independent variable.
13.13
r2 = SSR/SST = 77/99 = 0.7778. So, 77.78% of the variation in the dependent variable can be explained by the variation in the independent variable.
13.14
SST = SSR + SSE = 30 + 10 = 40 and r2 = SSR/SST = 30/40 = 0.75. So, 75% of the variation in the dependent variable can be explained by the variation in the independent variable.
13.15
Since SST = SSR + SSE and since SSE cannot be a negative number, SST must be at least as large as SSR.
13.16
(a) $r^2 = \frac{SSR}{SST} = 0.3417$. So, 34.17% of the variation in wine quality can be explained by the variation in alcohol content.
(b) $S_{YX} = \sqrt{\frac{SSE}{n-2}} = \sqrt{\frac{42.1323}{48}} = 0.9369$
(c) Based on (a) and (b), the model should be moderately useful for predicting wine quality.
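The two measures used in 13.11 through 13.16 are simple ratios of sums of squares. The sketch below is a hedged illustration with hypothetical function names; the sample size n = 50 for 13.16 is inferred from the n – 2 = 48 denominator rather than stated in the solution.

```python
import math

def r_squared(ssr, sse):
    """Coefficient of determination: r^2 = SSR / SST, where SST = SSR + SSE."""
    sst = ssr + sse
    return ssr / sst

def std_error_of_estimate(sse, n):
    """Standard error of the estimate: S_YX = sqrt(SSE / (n - 2))."""
    return math.sqrt(sse / (n - 2))

# 13.14: SSR = 30, SSE = 10
print(r_squared(ssr=30, sse=10))
# 13.16 (b): SSE = 42.1323 with n - 2 = 48 (n = 50 is inferred, not stated)
print(round(std_error_of_estimate(sse=42.1323, n=50), 4))
```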
13.17
From PHStat, Restaurant: Cost vs Summary Rating Regression Statistics Multiple R
0.5915
R Square
0.3499
Adjusted R Square
0.3433
Standard Error
16.9229
Observations
100
(a) r2 = 0.3499, which means that 34.99% of the variation in the dependent variable can be explained by the variation in the independent variable. In this case, 34.99% of the variation in the cost of a meal per person can be explained by the variation in summary customer rating of food, décor, and service.
(b) SYX = 16.9229.
(c) Based on 13.17 (a) and (b), the typical difference between the actual meal cost per person and the cost predicted by the regression model using summary rating as the independent variable is approximately $16.92. The model is weak to moderate in its usefulness in predicting the cost of a restaurant meal.
13.18
From PHStat, FTMBA: Mean Starting Salary and Bonus vs Pre-year Tuition Regression Statistics Multiple R
0.6809
R Square
0.4637
Adjusted R Square
0.4474
Standard Error
25753.6529
Observations
35
(a) r2 = 0.4637. 46.37% of the variation in starting salary can be explained by the variation in tuition.
(b) SYX = 25,753.6529.
(c) Based on (a) and (b), the model should be very useful for predicting the starting salary.
13.19
(a) r2 = 0.3811. So, 38.11% of the variation in tear rating can be explained by the variation in the plate gap.
(b) SYX = 1.0241
(c) Based on (a) and (b), the model should be somewhat useful for predicting the tear rating.
13.20
From PHStat, MLB Values: Current Value vs Annual Revenue Regression Statistics Multiple R
0.8293
R Square
0.6877
Adjusted R Square
0.6766
Standard Error
654.9821
Observations
30
(a) r2 = 0.6877. 68.77% of the variation in the value of an MLB baseball team can be explained by the variation in its annual revenue.
(b) SYX = 654.9821
(c) Based on (a) and (b), the model should be very useful for predicting the value of a baseball team.
13.21
(a) r2 = 0.2500, which means that 25% of the variation in the dependent variable can be explained by the variation in the independent variable. In this case, 25% of the variation in the monthly rent can be explained by the variation in square footage of an apartment.
(b) SYX = 198.486.
(c) Based on 13.21 (a) and (b), the typical difference between the actual monthly rent and the amount predicted by the regression model using apartment square footage as the independent variable is approximately $198.49. The model is relatively weak in its usefulness in predicting the monthly rent of an apartment.
(d) Other variables that might explain the variation in monthly rent could include the following: the age of the apartment, condition of the apartment, crime rate, amenities, and the average income level in the local area.
13.22
From PHStat, Movie: YouTube Trailer Views vs Opening Weekend Gross Regression Statistics Multiple R
0.9139
R Square
0.8352
Adjusted R Square
0.8328
Standard Error
15.9010
Observations
71
(a) r2 = 0.8352. 83.52% of the variation in weekend box office gross can be explained by the variation in YouTube trailer views.
(b) SYX = 15.9010.
(c) Based on (a) and (b), the model should be useful for predicting weekend box office gross.
(d) Other variables that might explain the variation in weekend box office gross could be the amount spent on advertising, the timing of the release of the movie, and the type of movie.
13.23
The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of X. There appears to be no violation of the linearity and equal variance assumptions.
13.24
A residual analysis of the data indicates a pattern, with sizable clusters of consecutive residuals that are either all positive or all negative. This pattern indicates a violation of the assumption of linearity. A curvilinear model should be investigated.
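The pattern described in 13.24, runs of consecutive residuals with the same sign, can be illustrated with a small example. The data below are hypothetical (Y = X² for X = 1 to 10) and are not taken from the problem; the helper names are also assumptions. The sketch fits a least-squares line and prints the sign of each residual in order of X.

```python
def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for a simple linear regression."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def residual_signs(x, y):
    """Sign of each residual (observed Y minus predicted Y), in the order given."""
    b0, b1 = fit_line(x, y)
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    return "".join("+" if e > 0 else "-" for e in residuals)

# Hypothetical curvilinear data, for illustration only
x = list(range(1, 11))
y = [xi ** 2 for xi in x]
print(residual_signs(x, y))
```

With these illustrative data the positive residuals sit at both ends of the X range and the negative residuals cluster in the middle, which is the kind of clustering that signals a violation of the linearity assumption.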
13.25
The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of summated rating. There appears to be no violation of the linearity and equal variance assumptions.
13.26
Based on the residual plot, there does not appear to be a pattern in the residual plot. There is no apparent violation of the linearity and equal variance assumptions.
The normal probability plot suggests possible departure from the normality assumption.
13.27
The residual plot reveals that the equal variance assumption is most likely violated. The linearity assumption may also have been violated. A linear fit does not appear to be adequate.
The normal probability plot suggests possible departure from the normality assumption.
13.28
Based on the residual plot, the assumption of equal variance may be violated since there are two large positive residuals for lower salaries.
13.29
The plot of residuals versus the independent variable, square footage, reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of square footage. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot reveals no evidence of substantial departure from the normality assumption.
13.30
Based on the residual plot, there is no evidence of a pattern. There is no evidence of a violation of the linearity and equal variance assumptions.
13.31
The plot of residuals versus the independent variable, YouTube trailer views, reveals evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are tightly grouped around 0 for lower values of X, but the variability of the residuals increases substantially as values of X increase. The equal-variance assumption is not valid, indicating a need to use alternative statistical approaches.
13.32
(a) [Residual plot: residuals (vertical axis) versus time period (horizontal axis)]
(b) An increasing linear relationship exists. There appears to be strong positive autocorrelation among the residuals.
13.33
(a) There is no apparent pattern in the residuals over time.
(b) D = 1.574 > 1.36. There is no evidence of positive autocorrelation among the residuals.
(c) The data are not positively autocorrelated.
13.34
(a) No, it is not necessary to compute the Durbin-Watson statistic since the data have been collected for a single period for a set of bags.
(b) If a single piece of bag-sealing equipment was studied over a period of time and the amount of plate gap varied over time, computation of the Durbin-Watson statistic would be necessary.
13.35
(a)
(b)
From PHStat, Oil and Gasoline, Gasoline vs Crude Oil
Calculations
b1, b0 Coefficients: b1 = 0.0332, b0 = -0.1248
Ŷi = -0.1248 + 0.0332Xi
(c)
The sample slope, b1, indicates that for each one-unit increase in the price of a barrel of crude oil, the predicted mean price of a gallon of gas increases by $0.0332.
(d)
The residual plot versus observation order reveals a cyclical pattern across the order of observations.
(e) Durbin-Watson statistic: D = 0.2678
(f) Because D = 0.2678 < 1.69, one can conclude there is evidence of positive autocorrelation among the residuals.
(g) Based on the results of (d) through (f), there is substantial reason to question the validity of the model. Because of the violation of the independence-of-errors assumption, alternative approaches should be used.
(h) The regression equation reveals a strong positive relationship between the price of a barrel of crude oil and the price of a gallon of gas. Because the data were collected over 306 weeks and there was a clear cyclical pattern in residuals across observation order, alternative statistical approaches should be used.
13.36
(a) b1 = SSXY/SSX = 201399.05/12495626 = 0.0161
b0 = Ȳ - b1X̄ = 71.2621 - 0.0161(4393) = 0.458
Ŷ = 0.458 + 0.0161X
(b) Ŷ = 0.458 + 0.0161(4500) = 72.908 or $72,908
(c)
(d) D = [Σ(i = 2 to n) (ei - ei-1)²] / [Σ(i = 1 to n) ei²] = 1243.2244/599.0683 = 2.08 > 1.45. There is no evidence of positive autocorrelation among the residuals.
(e) Based on a residual analysis, the model appears to be adequate.
(f) It appears that the number of orders affects the monthly distribution costs.
13.37
(a) Ŷ = 17.0833 - 5X
(b) Ŷ = 17.0833 - 5(0.5) = 14.5833 seconds
(c) There is no noticeable pattern in the plot.
(d) H0: There is no autocorrelation. H1: There is positive autocorrelation.
PHStat output: Durbin-Watson Calculations
Sum of Squared Difference of Residuals    238.4375
Sum of Squared Residuals                  138.3333333
Durbin-Watson Statistic                   1.723644578
dL = 1.27, dU = 1.45. Since D = 1.7236 > 1.45, do not reject H0. There is no evidence of autocorrelation.
(e) Based on the results of (c) and (d), there is no reason to question the validity of the model.
(f) It appears that the larger the tamping distance, the shorter the time of separation.
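The Durbin-Watson computation and decision rule used in 13.35 through 13.37 can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, the four-value residual series is made up to show the arithmetic, and the critical bounds dL and dU must still be looked up in a Durbin-Watson table for the given n and number of independent variables.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_decision(d, d_lower, d_upper):
    """One-sided test for positive autocorrelation using table bounds dL and dU."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


# Hypothetical residuals that alternate in sign (negative autocorrelation, D above 2).
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))

# Values quoted in 13.37: the two sums from the PHStat output and dL = 1.27, dU = 1.45.
d_13_37 = 238.4375 / 138.3333333
print(round(d_13_37, 4), dw_decision(d_13_37, 1.27, 1.45))
```

Values of D near 2 indicate no autocorrelation, values well below 2 point toward positive autocorrelation, which is why only the lower tail is tested here.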
13.38
(a) b0 = -2.535, b1 = 0.060728
(b) Ŷ = -2.535 + 0.060728X = -2.535 + 0.060728(83) = 2.5054
(c)
(d) D = 1.64 > 1.42. There is no evidence of positive autocorrelation among the residuals.
(e) The plot of the residuals versus time period shows some clustering of positive and negative residuals for intervals in the domain, suggesting a nonlinear model might be better. Otherwise, the model appears to be adequate.
(f) There appears to be a positive relationship between sales and atmospheric temperature.
13.39
(a) H0: ρ = 0, H1: ρ ≠ 0
tSTAT = (r - 0) / √((1 - r²)/(n - 2)) = 0.75 / √((1 - 0.75²)/(18 - 2)) = 4.5356
(b) d.f. = 16, lower critical value = -2.1199, upper critical value = 2.1199.
(c) Since tSTAT = 4.5356 is greater than the upper critical value of 2.1199, reject H0.
13.40
(a) H0: β1 = 0, H1: β1 ≠ 0
Test statistic: tSTAT = (b1 - 0)/sb1 = 4.5/1.5 = 3.00
(b) With n = 18, df = 18 - 2 = 16, t0.05/2 = ±2.1199.
(c) Reject H0. There is evidence that the fitted linear regression model is useful.
(d) b1 - t0.05/2 sb1 ≤ β1 ≤ b1 + t0.05/2 sb1
4.5 - 2.1199(1.5) ≤ β1 ≤ 4.5 + 2.1199(1.5)
1.32 ≤ β1 ≤ 7.68
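The arithmetic in 13.39 and 13.40 can be checked with a short script. This is a sketch under stated assumptions: the function names are illustrative, and the critical value 2.1199 is taken from the solution text rather than computed, since the standard library has no t distribution.

```python
import math


def t_stat_correlation(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r / math.sqrt((1 - r ** 2) / (n - 2))


def slope_test_and_interval(b1: float, s_b1: float, t_crit: float):
    """Return the t statistic for H0: beta1 = 0 and the interval b1 +/- t_crit * s_b1."""
    t_stat = b1 / s_b1
    half_width = t_crit * s_b1
    return t_stat, (b1 - half_width, b1 + half_width)


# 13.39: r = 0.75, n = 18
print(round(t_stat_correlation(0.75, 18), 4))

# 13.40: b1 = 4.5, sb1 = 1.5, t critical = 2.1199 (16 degrees of freedom, from the text)
t_stat, (lower, upper) = slope_test_and_interval(4.5, 1.5, 2.1199)
print(t_stat, round(lower, 2), round(upper, 2))
```

Because the interval in 13.40 (d) does not contain 0, it leads to the same decision as the t test in (c).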
13.41
(a) MSR = SSR/k = 72/1 = 72
MSE = SSE/(n - k - 1) = 44/18 = 2.444
FSTAT = MSR/MSE = 72/2.444 = 29.4545
(b) F0.05 = 4.41
(c) Reject H0. There is evidence that the fitted linear regression model is useful.
(d) r² = SSR/SST = 72/116 = 0.6207
r = -√0.6207 = -0.7878
(e) H0: ρ = 0 (There is no correlation between X and Y.)
H1: ρ ≠ 0 (There is correlation between X and Y.)
d.f. = 18. Decision rule: Reject H0 if |tSTAT| > 2.1009.
Test statistic: tSTAT = r / √((1 - r²)/(n - 2)) = -0.7878 / √((1 - 0.6207)/18) = -5.4272
Since tSTAT = -5.4272 is less than the lower critical bound of -2.1009, reject H0. There is enough evidence to conclude that the correlation between X and Y is significant.
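The F test in (a) and the t test in (e) are two views of the same evidence: in simple linear regression, FSTAT equals the square of tSTAT. The sketch below recomputes both from SSR = 72 and SSE = 44; it assumes n = 20 (so that n - k - 1 = 18, as in the solution) and takes the negative sign of r from the solution text. The function name is illustrative.

```python
import math


def f_test_from_sums(ssr: float, sse: float, n: int, k: int = 1):
    """Return (MSR, MSE, FSTAT) for a regression with k independent variables."""
    msr = ssr / k
    mse = sse / (n - k - 1)
    return msr, mse, msr / mse


ssr, sse, n = 72.0, 44.0, 20            # n = 20 is implied by 18 degrees of freedom
msr, mse, f_stat = f_test_from_sums(ssr, sse, n)

r_squared = ssr / (ssr + sse)            # coefficient of determination
r = -math.sqrt(r_squared)                # sign follows the sign of the slope (negative here)
t_stat = r / math.sqrt((1 - r_squared) / (n - 2))

print(round(f_stat, 4), round(r_squared, 4), round(t_stat, 4))
print(math.isclose(t_stat ** 2, f_stat))  # F equals t squared when k = 1
```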
13.42
(a) H0: β1 = 0
H1: β1 ≠ 0
             Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept    -0.3529        1.2000           -0.2941   0.7700    -2.7656     2.0599
alcohol      0.5624         0.1127           4.9913    0.0000    0.3359      0.7890
tSTAT = b1/Sb1 = 4.9913 with a p-value = 0.0000 < 0.05. Reject H0. There is enough evidence to conclude that the fitted linear regression model is useful.
(b) b1 ± tα/2Sb1: 0.3359 ≤ β1 ≤ 0.7890
13.43
(a) H0: β1 = 0
H1: β1 ≠ 0
From PHStat
b1, b0 Coefficients: b1 = 1.5951, b0 = -47.8091
ANOVA
             df    SS            MS            F         Significance F
Regression   1     15104.9827    15104.9827    52.7440   0.0000
Residual     98    28065.5273    286.3829
Total        99    43170.5100

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept        -47.8091       14.0423          -3.4046   0.0010    -75.6756    -19.9426
Summary Rating   1.5951         0.2196           7.2625    0.0000    1.1592      2.0309
At the 0.05 significance level, there is evidence of a linear relationship between summated rating and the cost of a meal. Because tSTAT = 7.2625 > 1.9845 or p-value = 0.0000 < 0.05, reject H0.
(b) b1 ± tα/2Sb1 = 1.5951 ± 1.9845(0.2196). Thus, 1.1592 ≤ β1 ≤ 2.0309.
13.44
(a) H0: β1 = 0
H1: β1 ≠ 0
From PHStat
b1, b0 Coefficients: b1 = 1.6031, b0 = 42989.3914
ANOVA
             df    SS                   MS                   F         Significance F
Regression   1     18922315884.4426     18922315884.4426     28.5297   0.0000
Residual     33    21887271048.2431     663250637.8255
Total        34    40809586932.6857

                   Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept          42989.3914     18770.3449       2.2903   0.0285    4800.8375   81177.9453
Per-Year Tuition   1.6031         0.3001           5.3413   0.0000    0.9925      2.2137
At the 0.05 significance level, there is evidence of a linear relationship between tuition and starting salary. Because tSTAT = 5.3413 > 2.0345 or p-value = 0.0000 < 0.05, reject H0.
(b) b1 ± tα/2Sb1 = 1.6031 ± 2.0345(0.3001). Thus, 0.9925 ≤ β1 ≤ 2.2137.
13.45
(a)
             Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept    0.7500         0.2349           3.1922   0.0053    0.2543      1.2457
Plate Gap    0.5000         0.1545           3.2356   0.0049    0.1740      0.8260
p-value = 0.0049 < 0.05. Reject H0. There is evidence that the fitted linear regression model is useful.
(b) b1 ± tα/2Sb1: 0.1740 ≤ β1 ≤ 0.8260
13.46
(a) From PHStat
H0: β1 = 0
H1: β1 ≠ 0
b1, b0 Coefficients: b1 = 10.9927, b0 = -1410.6988
ANOVA
             df    SS               MS               F         Significance F
Regression   1     26454210.9591    26454210.9591    61.6646   0.0000
Residual     28    12012043.2076    429001.5431
Total        29    38466254.1667

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept        -1410.6988     461.6593         -3.0557   0.0049    -2356.3650   -465.0326
Annual Revenue   10.9927        1.3999           7.8527    0.0000    8.1252       13.8602
At the 0.05 significance level, there is evidence of a linear relationship between annual revenue and current value. Because tSTAT = 7.8527 > 2.0484 or p-value = 0.0000 < 0.05, reject H0.
(b) b1 ± tα/2Sb1 = 10.9927 ± 2.0484(1.3999). Thus, 8.1252 ≤ β1 ≤ 13.8602.
13.47
(a) At the 0.05 significance level, there is evidence of a linear relationship between summated rating and the cost of a meal. Because tSTAT = 4.28 > 2.0040 or p-value = 0.000 < 0.05, reject H0.
(b) b1 ± tα/2Sb1 = 0.978 ± 2.0040(0.228). Thus, 0.521088 ≤ β1 ≤ 1.434912.
13.48
(a) From PHStat
H0: β1 = 0
H1: β1 ≠ 0
b1, b0 Coefficients: b1 = 1.3712, b0 = -5.6263
ANOVA
             df    SS             MS            F          Significance F
Regression   1     88415.4136     88415.4136    349.6847   0.0000
Residual     69    17446.1839     252.8432
Total        70    105861.5975

                        Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept               -5.6263        2.4916           -2.2581   0.0271    -10.5968    -0.6558
YouTube Trailer Views   1.3712         0.0733           18.6999   0.0000    1.2249      1.5174
At the 0.05 significance level, there is evidence of a linear relationship between YouTube trailer views and opening weekend box office gross. Because tSTAT = 18.6999 > 1.9949 or p-value = 0.0000 < 0.05, reject H0.
(b) b1 ± tα/2Sb1 = 1.3712 ± 1.9949(0.0733). Thus, 1.2249 ≤ β1 ≤ 1.5174.
13.49
(a)
Alphabet, Inc. moves 5% more than the overall market and is more volatile than the market.
Amazon.com, Inc. moves 23% more than the overall market and is more volatile than the market.
Apple moves 25% more than the overall market and is more volatile than the market.
Hilton Worldwide moves 21% more than the overall market and is more volatile than the market.
Marriott Intl. moves 58% more than the overall market and is more volatile than the market.
Microsoft moves only 92% as much as the overall market and is less volatile than the market.
Pfizer moves only 64% as much as the overall market and is less volatile than the market.
Tesla, Inc. moves 97% more than the overall market and is more volatile than the market.
TORM moves only 19% as much as the overall market and is less volatile than the market.
Walt Disney Co. moves 25% more than the overall market and is more volatile than the market.
(b)
Investors can use the beta value to assess the risk of a stock relative to the overall market. From the list, an investor looking for growth should probably avoid TORM and Pfizer.
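The wording in 13.49 (a) follows mechanically from each stock's beta, the slope of the regression of the stock's return on the market's return. The sketch below shows that mapping. The beta values used are assumptions inferred from the percentages in the text (for example, "5% more" implies a beta of 1.05), not values read from the problem data, and the function name is illustrative.

```python
def describe_beta(name: str, beta: float) -> str:
    """Turn a beta (slope versus the overall market) into the interpretation used above."""
    if beta > 1:
        pct = round((beta - 1) * 100)
        return (f"{name} moves {pct}% more than the overall market "
                "and is more volatile than the market.")
    if beta < 1:
        pct = round(beta * 100)
        return (f"{name} moves only {pct}% as much as the overall market "
                "and is less volatile than the market.")
    return f"{name} moves in step with the overall market."


# Betas inferred from the solution text (assumed values, for illustration only).
inferred_betas = {"Alphabet, Inc.": 1.05, "Microsoft": 0.92, "TORM": 0.19}
for stock, beta in inferred_betas.items():
    print(describe_beta(stock, beta))
```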
13.50
(a) (% daily change in DXRLX) = b0 + 1.75 (% daily change in Russell 2000 index).
(b) If the Russell 2000 index gains 10% in a year, DXRLX is expected to gain an estimated 17.5%.
(c) If the Russell 2000 index loses 20% in a year, DXRLX is expected to lose an estimated 35%.
(d) Risk takers will be attracted to leveraged funds, and risk-averse investors will stay away.
13.51
(a) r = 0.8391. There appears to be a strong positive linear relationship between calories and sugar (in grams).
(b) t = 3.4496, p-value = 0.0183 < 0.05. Reject H0. At the 0.05 level of significance, there is enough evidence of a significant linear relationship between calories and sugar (in grams).
13.52
(a) First weekend gross and U.S. gross: r = 0.7601. There appears to be a strong positive linear relationship.
First weekend gross and worldwide gross: r = 0.8596. There appears to be a strong positive linear relationship.
U.S. gross and worldwide gross: r = 0.9448. There appears to be a very strong positive linear relationship.
(b) First weekend gross and U.S. gross: Since tSTAT = 3.3083 > 2.306 and p-value = 0.0107 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between first weekend sales and U.S. gross.
First weekend gross and worldwide gross: Since tSTAT = 4.7574 > 2.306 and p-value = 0.0014 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between first weekend sales and worldwide gross.
U.S. gross and worldwide gross: Since tSTAT = 8.1552 > 2.306 and p-value = 0.0000 < 0.05, reject H0. At the 0.05 level of significance, there is evidence of a linear relationship between U.S. gross and worldwide gross.
13.53
(a) From PHStat, Download speed and Upload speed
Regression Statistics
Multiple R    0.2463

               Coefficients   Standard Error   t Stat   P-value
Intercept      543.9337       449.0916         1.2112   0.2714
Upload Speed   2.6111         4.1945           0.6225   0.5565
For download and upload speeds, r = 0.2463. There appears to be a positive linear relationship.
(b) At the 0.05 significance level, there is insufficient evidence of a significant linear relationship between download and upload speed. Because tSTAT = 0.6225 < 2.4469 or p-value = 0.5565 > 0.05, do not reject H0.
13.54
(a) From PHStat, Value and Payroll
Regression Statistics
Multiple R    0.6766

            Coefficients   Standard Error   t Stat   P-value
Intercept   0.2485         0.4066           0.6111   0.5460
Payroll     0.0124         0.0026           4.8616   0.0000
Value and payroll: r = 0.6766. There appears to be a strong positive linear relationship.

From PHStat, Payroll and Wins
Regression Statistics
Multiple R    0.4406

            Coefficients   Standard Error   t Stat    P-value
Intercept   -6.1646        59.9545          -0.1028   0.9188
Wins        1.8943         0.7293           2.5974    0.0148
Payroll and wins: r = 0.4406. There appears to be a moderate positive linear relationship.
(b) Value and payroll: Because tSTAT = 4.8616 > 2.0484 or p-value = 0.0000 < 0.05, reject H0. At the 0.05 level of significance, there is significant evidence of a linear relationship between team value and payroll.
(c) Payroll and wins: Because tSTAT = 2.5974 > 2.0484 or p-value = 0.0148 < 0.05, reject H0. At the 0.05 level of significance, there is significant evidence of a linear relationship between payroll and wins.
13.55
(a) When X = 2, Ŷ = 5 + 3X = 5 + 3(2) = 11
h = 1/n + (Xi - X̄)² / Σ(Xi - X̄)² = 1/20 + (2 - 2)²/20 = 0.05
95% confidence interval: Ŷ ± t0.05/2 SYX √h = 11 ± 2.1009(1)√0.05
10.53 ≤ μY|X=2 ≤ 11.47
(b) 95% prediction interval: Ŷ ± t0.05/2 SYX √(1 + h) = 11 ± 2.1009(1)√1.05
8.847 ≤ YX=2 ≤ 13.153
13.56
(a)
When X = 4, Ŷ = 5 + 3X = 5 + 3(4) = 17
h = 1/n + (Xi - X̄)² / Σ(Xi - X̄)² = 1/20 + (4 - 2)²/20 = 0.25
95% confidence interval: Ŷ ± t0.05/2 SYX √h = 17 ± 2.1009(1)√0.25
15.95 ≤ μY|X=4 ≤ 18.05
(b) 95% prediction interval: Ŷ ± t0.05/2 SYX √(1 + h) = 17 ± 2.1009(1)√1.25
14.651 ≤ YX=4 ≤ 19.349
(c) The intervals in this problem are wider than the intervals in Exercise 13.55 because the value of X is farther from X̄.
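The interval calculations in 13.55 and 13.56 differ only in the h statistic, which grows as X moves away from X̄. The sketch below reproduces both problems; the function names are illustrative, SYX = 1 and the critical value 2.1009 are taken from the solutions, and the sum of squared differences from X̄ is 20 as in the h calculation.

```python
import math


def h_statistic(x: float, x_bar: float, n: int, ssx: float) -> float:
    """h = 1/n + (x - x_bar)^2 / SSX, where SSX is the sum of squared differences from x_bar."""
    return 1 / n + (x - x_bar) ** 2 / ssx


def intervals(y_hat: float, t_crit: float, s_yx: float, h: float):
    """Return (confidence interval for the mean of Y, prediction interval for an individual Y)."""
    ci_half = t_crit * s_yx * math.sqrt(h)
    pi_half = t_crit * s_yx * math.sqrt(1 + h)
    return (y_hat - ci_half, y_hat + ci_half), (y_hat - pi_half, y_hat + pi_half)


for x in (2, 4):                                   # 13.55 uses X = 2, 13.56 uses X = 4
    y_hat = 5 + 3 * x
    h = h_statistic(x, x_bar=2, n=20, ssx=20)
    ci, pi = intervals(y_hat, t_crit=2.1009, s_yx=1.0, h=h)
    print(x, h, [round(v, 3) for v in ci], [round(v, 3) for v in pi])
```

The prediction interval is always wider than the confidence interval at the same X because of the extra 1 under the square root, which accounts for the variation of an individual value around its mean.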
13.57
From PHStat, Predict cost of a meal
Confidence Interval Estimate
Data
X Value                                  50
Confidence Level                         95%
Intermediate Calculations
Sample Size                              100
Degrees of Freedom                       98
t Value                                  1.984467
XBar, Sample Mean of X                   63.47
Sum of Squared Differences from XBar     5936.91
Standard Error of the Estimate           16.92285
h Statistic                              0.040562
Predicted Y (YHat)                       31.9444
For Average Y
Interval Half Width                      6.7635
Confidence Interval Lower Limit          25.1809
Confidence Interval Upper Limit          38.70795
For Individual Response Y
Interval Half Width                      34.2572
Prediction Interval Lower Limit          -2.3128
Prediction Interval Upper Limit          66.20157
(a) 25.1809 ≤ μY|X=50 ≤ 38.70795. The 95% confidence interval estimate is that the population mean cost of a meal is between $25.1809 and $38.70795 for restaurants that have a summary rating of 50.
(b) -2.3128 ≤ YX=50 ≤ 66.20157. The 95% prediction interval estimate is that the cost of a meal for an individual restaurant with a summary rating of 50 is between $0 and $66.20157 (the negative lower limit is reported as $0 because a cost cannot be negative).
(c) 13.57 (a) represents a confidence interval estimate for the mean value among all restaurants in the study while 13.57 (b) represents a prediction interval for an individual restaurant. Because there is much more variation in predicting an individual value compared with estimating a mean value, the prediction interval for the individual restaurant is much wider than the confidence interval estimate for the mean.
13.58
(a) 4.9741 ≤ μY|X=10 ≤ 5.568984094
(b) 3.3645 ≤ YX=10 ≤ 7.178609919
(c) Part (b) provides a prediction interval for the individual response given a specific value of the independent variable, and part (a) provides an interval estimate for the mean value, given a specific value of the independent variable. Because there is much more variation in predicting an individual value than in estimating a mean value, a prediction interval is wider than a confidence interval estimate.
13.59
(a) 0.2543 ≤ μY|X=0 ≤ 1.2457
(b) -1.4668 ≤ YX=0 ≤ 2.9668
(c) Part (b) provides an interval prediction for the individual response given a specific value of the independent variable, and part (a) provides an interval estimate for the mean value given a specific value of the independent variable. Since there is much more variation in predicting an individual value than in estimating a mean value, a prediction interval is wider than a confidence interval estimate, holding everything else fixed.
13.60
From PHStat, Predict the starting salary
Confidence Interval Estimate
Data
X Value                                  50450
Confidence Level                         95%
Intermediate Calculations
Sample Size                              35
Degrees of Freedom                       33
t Value                                  2.0345153
XBar, Sample Mean of X                   60836.1143
Sum of Squared Differences from XBar     7363198352
Standard Error of the Estimate           25753.6529
h Statistic                              0.0432215
Predicted Y (YHat)                       123864.535
For Average Y
Interval Half Width                      10893.0553
Confidence Interval Lower Limit          112971.4797
Confidence Interval Upper Limit          134757.59
For Individual Response Y
Interval Half Width                      53516.5443
Prediction Interval Lower Limit          70347.9907
Prediction Interval Upper Limit          177381.079
(a) $112,971.48 ≤ μY|X=50,450 ≤ $134,757.59
(b) $70,347.99 ≤ YX=50,450 ≤ $177,381.08
(c) You can estimate a mean more precisely than you can predict a single observation.
13.61
(a) $1,499.31 ≤ μY|X=800 ≤ $1,629.79. The 95% confidence interval estimate is that the population mean cost for all one-bedroom apartments that are 800 square feet is between $1,499.31 and $1,629.79.
(b) $1,161.46 ≤ YX=800 ≤ $1,967.64. The 95% prediction interval estimate is that the cost of an individual one-bedroom 800 square foot apartment is between $1,161.46 and $1,967.64.
(c) 13.61 (a) represents a confidence interval estimate for the mean value among all 800 square foot apartments while 13.61 (b) represents a prediction interval for an individual 800 square foot apartment. Because there is much more variation in predicting an individual value compared with estimating a mean value, the prediction interval for the individual apartment is much wider than the confidence interval estimate for the mean.
13.62
From PHStat, Predict the mean value
Confidence Interval Estimate
Data
X Value                                  250
Confidence Level                         95%
Intermediate Calculations
Sample Size                              30
Degrees of Freedom                       28
t Value                                  2.048407
XBar, Sample Mean of X                   318.5333
Sum of Squared Differences from XBar     218921.5
Standard Error of the Estimate           654.9821
h Statistic                              0.054788
Predicted Y (YHat)                       1337.469
For Average Y
Interval Half Width                      314.0416
Confidence Interval Lower Limit          1023.4273
Confidence Interval Upper Limit          1651.511
For Individual Response Y
Interval Half Width                      1377.9334
Prediction Interval Lower Limit          -40.4645
Prediction Interval Upper Limit          2715.402
(a) 1,023.4273 ≤ μY|X=250 ≤ 1,651.511
(b) -40.4645 ≤ YX=250 ≤ 2,715.402
(c) Because there is much more variation in predicting an individual value than in estimating a mean, the prediction interval is wider than the confidence interval.
13.63
From PHStat
Calculations
b1, b0 Coefficients: b1 = 1.3712, b0 = -5.6263
(a) Ŷ = -5.6263 + 1.3712(50) = $62.93 million. The predicted weekend gross for the movie with 50 million views would be $62.93 million.
(b) Because this example focuses on one individual movie, the prediction interval for an individual response is more appropriate.
(c) From PHStat
Confidence Interval Estimate
Data
X Value                                  50
Confidence Level                         95%
For Average Y
Interval Half Width                      5.5430
Confidence Interval Lower Limit          57.3887
Confidence Interval Upper Limit          68.47471
For Individual Response Y
Interval Half Width                      32.2024
Prediction Interval Lower Limit          30.7294
Prediction Interval Upper Limit          95.13409
$30.7294 million ≤ YX=50 ≤ $95.13409 million. The 95% prediction interval estimate of the weekend box office gross for an individual movie that had 50 million YouTube trailer views is between $30.7294 million and $95.13409 million. Because there is much more variation in predicting an individual value compared with estimating a mean value, the prediction interval for the weekend box office gross for an individual movie is much wider than the confidence interval estimate for the mean of all movies that had 50 million YouTube trailer views.
13.64
The slope of the line, b1, represents the estimated expected change in Y per unit change in X. It represents the estimated mean amount that Y changes (either positively or negatively) for a particular unit change in X. The Y intercept b0 represents the estimated mean value of Y when X equals 0.
13.65
The coefficient of determination measures the proportion of variation in Y that is explained by the independent variable X in the regression model.
13.66
The unexplained variation or error sum of squares (SSE) will be equal to zero only when the regression line fits the data perfectly and the coefficient of determination equals 1.
13.67
The explained variation or regression sum of squares (SSR) will be equal to zero only when there is no relationship between the Y and X variables, and the coefficient of determination equals 0.
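The two extreme cases described in 13.66 and 13.67 can be seen numerically. The sketch below fits a least-squares line and splits the total variation into SSR and SSE; both small data sets are hypothetical, chosen so that one lies exactly on a line (SSE = 0, r2 = 1) and the other has a slope of exactly zero (SSR = 0, r2 = 0). The function name is illustrative.

```python
def simple_regression(x, y):
    """Least-squares fit of y on x with the decomposition SST = SSR + SSE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = ssxy / ssx                      # sample slope
    b0 = y_bar - b1 * x_bar              # sample Y intercept
    y_hat = [b0 + b1 * xi for xi in x]
    ssr = sum((yh - y_bar) ** 2 for yh in y_hat)          # explained variation
    sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained variation
    return {"b0": b0, "b1": b1, "ssr": ssr, "sse": sse, "r2": ssr / (ssr + sse)}


# Hypothetical data lying exactly on Y = 1 + 2X: perfect fit.
perfect = simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
# Hypothetical data with no linear relationship: the fitted slope is zero.
unrelated = simple_regression([1, 2, 3, 4], [2, 4, 4, 2])
print(perfect)
print(unrelated)
```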
13.68
Unless a residual analysis is undertaken, you will not know whether the model fit is appropriate for the data. In addition, residual analysis can be used to check whether the assumptions of regression have been seriously violated.
13.69
The assumptions of regression are linearity, normality of error, homoscedasticity (equal variance), and independence of errors.
13.70
The normality of error assumption can be evaluated by obtaining a histogram, boxplot, and/or normal probability plot of the residuals. The homoscedasticity assumption can be evaluated by plotting the residuals on the vertical axis and the X variable on the horizontal axis. The independence of errors assumption can be evaluated by plotting the residuals on the vertical axis and the time order variable on the horizontal axis. This assumption can also be evaluated by computing the Durbin-Watson statistic.
13.71
If the data in a regression analysis have been collected over time, then the assumption of independence of errors needs to be evaluated using the Durbin-Watson statistic.
13.72
The confidence interval for the mean response estimates the mean response for a given X value. The prediction interval estimates the value for a single item or individual.
13.73
(a) From PHStat
                     Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept            1.3759         10.8281          0.1271   0.9012    -22.4566    25.2084
Tomatometer Rating   0.0363         0.1363           0.2665   0.7948    -0.2636     0.3362
b0 = 1.3759, b1 = 0.0363
(b) For each one-unit increase in Tomatometer rating, the predicted mean receipts per theater increase by 0.0363 ($thousands), or about $36.30. The Y intercept, b0, would be the mean receipts when the Tomatometer rating is equal to zero.
(c) Ŷ = 1.3759 + 0.0363(55) = 3.37 ($thousands). The mean receipt per theater for a movie that has a Tomatometer rating of 55% would be $3,370.
(d) Ŷ = 1.3759 + 0.0363(5) = 1.557 ($thousands). The mean receipt per theater for a movie that has a Tomatometer rating of 5% would be $1,557.
(e) From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.0801
R Square      0.0064
r2 = 0.0064. So 0.64% of the variation in movie receipts can be explained by the variation in Tomatometer rating.
(f)
The residual plot reveals no evidence of a violation of linearity and equal variance assumptions. The normal probability plot reveals no substantial departure from the normality assumption.
(g)
                     Coefficients   Standard Error   t Stat   P-value   Lower 95%
Intercept            1.3759         10.8281          0.1271   0.9012    -22.4566
Tomatometer Rating   0.0363         0.1363           0.2665   0.7948    -0.2636
At the 0.05 significance level, there is insufficient evidence of a significant linear relationship between Tomatometer rating and receipts. Because tSTAT = 0.2665 < 2.2010 or p-value = 0.7948 > 0.05, do not reject H0.
(h)
From PHStat
Confidence Interval Estimate
Data
X Value                                  55
Confidence Level                         95%
Intermediate Calculations
Sample Size                              13
Degrees of Freedom                       11
t Value                                  2.200985
XBar, Sample Mean of X                   75.84615
Sum of Squared Differences from XBar     7297.692
Standard Error of the Estimate           11.64107
h Statistic                              0.136471
Predicted Y (YHat)                       3.37361
For Average Y
Interval Half Width                      9.4652
Confidence Interval Lower Limit          -6.0916
Confidence Interval Upper Limit          12.83882
For Individual Response Y
Interval Half Width                      27.3143
Prediction Interval Lower Limit          -23.9406
Prediction Interval Upper Limit          30.68786
-6.0916 ≤ μY|X=55 ≤ 12.83882; -23.9406 ≤ YX=55 ≤ 30.68786
Note: receipts per theater are in $thousands.
(i) Based on the results from (a) – (h), Tomatometer rating would not be useful in predicting receipts on the first weekend a movie opens. There was no significant relationship between Tomatometer rating and receipts, with the model accounting for only 0.64% of the variation in movie receipts. One should be hesitant to use a Tomatometer rating that falls outside the range of values included in the dataset. The dataset also contained a small sample, which can make it difficult to detect violations of assumptions such as normality.
13.74
(a) b0 = 24.84, b1 = 0.14
(b) 24.84 is the portion of estimated mean delivery time that is not affected by the number of cases delivered. For each additional case, the estimated mean delivery time increases by 0.14 minutes.
(c) Ŷ = 24.84 + 0.14X = 24.84 + 0.14(150) = 45.84
(d) No, 500 cases is outside the relevant range of the data used to fit the regression equation.
(e) r2 = 0.972. So, 97.2% of the variation in delivery time can be explained by the variation in the number of cases.
(f) Based on a visual inspection of the graphs of the distribution of residuals and the residuals versus the number of cases, there is no pattern. The model appears to be adequate.
(g) t = 24.88 > t0.05/2 = 2.1009 with 18 degrees of freedom for α = 0.05. Reject H0. There is evidence that the fitted linear regression model is useful.
(h) 44.88 ≤ μY|X=150 ≤ 46.80
41.56 ≤ YX=150 ≤ 50.12
13.75
(a) Partial PHStat output:
                            Coefficients   Standard Error   t Stat        P-value
Intercept                   78.79634012    12.21480794      6.450886538   3.49317E-06
Diameter at breast height   2.673214402    0.374109159      7.145546532   8.59802E-07
b0 = 78.7963, b1 = 2.6732
(b) The estimated mean height of a redwood tree will increase by 2.6732 feet for each additional inch increase in diameter at breast height.
(c) Ŷ = 78.7963 + 2.6732X = 78.7963 + 2.6732(25) = 145.6267
(d) r2 = 0.7288. So 72.88% of the variation in the height of the redwood trees can be explained by the variation in diameter at breast height.
(e) [Residual plot: residuals versus diameter at breast height]
There are clusters of negative residuals at the low and high end of the diameter values. There appears to be some non-linear relationship between height and diameter.
[Normal probability plot: residuals versus Z value]
The normal probability plot does not suggest any possible departure from the normality assumption.
(f) H0: β1 = 0 vs. H1: β1 ≠ 0. Since tSTAT = 7.1455 with a p-value that is virtually 0, reject H0. There is a significant relationship between the height of redwood trees and the diameter at breast height at the 0.05 level of significance.
(g) 1.8902 ≤ β1 ≤ 3.4562
13.76
(a) Independent variable is living space. Dependent variable is asking price.
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.6307
R Square      0.3978

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept      408.2614       33.0407          12.3563   0.0000    342.1471    474.3758
Living Space   0.1044         0.0167           6.2433    0.0000    0.0710      0.1379
b0 = 408.2614, b1 = 0.1044.
(b) For each additional square foot of living space in the house, the mean asking price is predicted to increase by $104.40. The estimated asking price of a house with 0 living space is 408.2614 thousand dollars. However, this interpretation is not meaningful because the living space of the house cannot be 0.
(c) Ŷ = 408.2614 + 0.1044(2,000) = 617.1 thousand dollars.
(d)
r2 = 0.3978. So 39.78% of the variation in asking price is explained by the variation in living space.
(e)
Neither the residual plot nor the normal probability plot reveals any potential violation of the linearity, equal variance, and normality assumptions.
(f)
tSTAT = 6.2433 > 2.0010, p-value is 0.0000. Because p-value < 0.05, reject H0. There is evidence of a linear relationship between asking price and living space.
(g)
0.0710 ≤ β1 ≤ 0.1379
(h)
The living space in the house is somewhat useful in predicting the asking price, but because only 39.78% of the variation in asking price is explained by variation in living space, other variables should be considered.
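The arithmetic behind 13.76 (c) and (g) can be checked with a short Python sketch. The function names are hypothetical; the standard error (0.0167) and the critical value (2.0010) are taken from the PHStat output and part (f) above. Because those inputs are rounded, the computed limits may differ from the printed ones in the last decimal place.

```python
def predict(b0, b1, x):
    """Predicted mean of Y at X = x for a simple linear regression."""
    return b0 + b1 * x


def slope_confidence_interval(b1, se_b1, t_crit):
    """Return (lower, upper) limits of b1 +/- t_crit * se_b1."""
    half_width = t_crit * se_b1
    return b1 - half_width, b1 + half_width


# Values copied from the 13.76 output (asking price in thousands of dollars).
b0, b1, se_b1, t_crit = 408.2614, 0.1044, 0.0167, 2.0010

price = predict(b0, b1, 2000)
lower, upper = slope_confidence_interval(b1, se_b1, t_crit)

print(f"Predicted asking price: {price:.1f} thousand dollars")
print(f"95% CI for the slope: {lower:.4f} to {upper:.4f}")
```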
13.77
(a) Independent variable is asking price. Dependent variable is taxes.
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.9903
R Square      0.9807

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept      156.6320       88.6771          1.7663    0.0825    -20.8105    334.0745
Asking Price   7.9345         0.1448           54.7961   0.0000    7.6447      8.2242
b0 = 156.6320, b1 = 7.9345.
(b) The Y intercept, b0, would be the mean yearly taxes when the asking price is equal to zero. The literal interpretation is not meaningful in this case because an asking price of zero dollars is not realistic. The sample slope, b1, indicates that for each unit of change in asking price, the predicted mean yearly taxes increase by 7.9345. For each additional thousand dollars in asking price, the predicted mean yearly taxes increase by $7.9345.
(c) Ŷ = 156.6320 + 7.9345(400) = $3,330.43. The predicted mean yearly taxes for a $400,000 home would be $3,330.43.
(d)
r2 = 0.9807, which means that 98.07% of the variation in yearly taxes can be explained by the variation in asking price.
(e)
The residual plot indicates that there is very little difference between the predicted Y and the observed value of Y. This is not surprising because r2 = 0.9807. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot appears to reveal evidence of a violation of the normality assumption. However, both plots show two substantial residual outliers.
(f) At the 0.05 significance level, there is evidence of a significant linear relationship between yearly taxes and home asking price. Because tSTAT = 54.7961 > 2.0010 or p-value = 0.000 < 0.05, reject H0.
(g) One can conclude that there is a very strong positive significant relationship between home asking price and yearly taxes, but that the normality assumption of linear regression is not satisfied, due to the presence of outlier(s).
13.78
(a) Independent variable is efficiency ratio. Dependent variable is ROE (return on equity).
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.3237
R Square      0.1048

                   Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept          24.6488        2.4836           9.9248    0.0000    19.7203     29.5774
Efficiency Ratio   -0.1447        0.0427           -3.3867   0.0010    -0.2295     -0.0599

b0 = 24.6488, b1 = –0.1447.
(b)
For each additional point on the efficiency ratio, the predicted mean ROE is estimated to decrease by 0.1447. For an efficiency of 0, the predicted mean ROE is 24.6488.
(c)
Ŷ = 24.6488 – 0.1447(60) = 15.9681.
(d)
r2 = 0.1048. So 10.48% of the variation in ROE is explained by the variation in efficiency ratio.
(e) There is no obvious pattern in the residuals, so the assumptions of regression are met. The model appears to be adequate.
(f)
tSTAT = –3.3867 < –1.9845; reject H0. There is evidence of a linear relationship between efficiency ratio and ROE.
(g)
From PHStat, Confidence Interval Estimate
Data
X Value                            60
Confidence Level                   95%
For Average Y
Interval Half Width                0.8153
Confidence Interval Lower Limit    15.1528
Confidence Interval Upper Limit    16.78338
For Individual Response Y
Interval Half Width                7.8902
Prediction Interval Lower Limit    8.0780
Prediction Interval Upper Limit    23.85826

15.1528 ≤ μY|X=60 ≤ 16.7834 and 8.0780 ≤ YX=60 ≤ 23.8583
(h) –0.2295 ≤ β1 ≤ –0.0599
(i) There is a small relationship between efficiency ratio and ROE.
13.79
(a)
b0 = 0.4872, b1 = 0.0123
(b) 0.4872 is the portion of estimated mean completion time that is not affected by the number of invoices processed. When there is no invoice to process, the mean completion time is estimated to be 0.4872 hours. Of course, this is not a very meaningful interpretation in the context of the problem. For each additional invoice processed, the estimated mean completion time increases by 0.0123 hours.
(c) Ŷ = 0.4872 + 0.0123X = 0.4872 + 0.0123(150) = 2.3304
(d)
r2 = 0.8623. 86.23% of the variation in completion time can be explained by the variation in the number of invoices processed.
(e)
(f)
(g)
Based on a visual inspection of the graphs of the distribution of residuals and the residuals versus the number of invoices and time, there appears to be autocorrelation in the residuals. D = 0.69 < 1.37 = dL. There is evidence of positive autocorrelation. The model does not appear to be adequate. The number of invoices and, hence, the time needed to process them, tend to be high for a few days in a row during historically heavier shopping days or during advertised sales days. These could be possible causes of the positive autocorrelation. Due to the violation of the independence of errors assumption, the prediction made in (c) is very likely to be erroneous.
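The Durbin-Watson comparison used in 13.79 (D = 0.69 < dL = 1.37) can be sketched in Python as follows. This is a minimal illustration, not the PHStat procedure: the function names and the residual values are hypothetical, and the upper critical value of 1.50 is an assumed table value that does not appear in the solution above.

```python
def durbin_watson(residuals):
    """D = sum of squared successive differences / sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def positive_autocorrelation_decision(d, d_lower, d_upper):
    """Compare D with the lower and upper critical values."""
    if d < d_lower:
        return "evidence of positive autocorrelation"
    if d > d_upper:
        return "no evidence of positive autocorrelation"
    return "inconclusive"


# Hypothetical residuals in time order; runs of same-sign values push D toward 0.
example_residuals = [0.5, 0.4, 0.6, 0.3, -0.2, -0.5, -0.4, -0.6]
d_example = durbin_watson(example_residuals)

# D and dL from 13.79; dU = 1.50 is an assumption made for this sketch.
decision = positive_autocorrelation_decision(0.69, 1.37, 1.50)
print(round(d_example, 3), decision)
```

Because D is far below dL in 13.79, the conclusion does not depend on the assumed value of dU.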
13.80
(a) [Scatter plot: O-ring damage index versus temperature (degrees F)]
There is not any clear relationship between atmospheric temperature and O-ring damage from the scatter plot.
(b) [Scatter plot: O-ring damage index versus temperature (degrees F)]
(c) In (b), there are 16 observations with an O-ring damage index of 0 for a variety of temperatures. If one concentrates on these observations with no O-ring damage, there is obviously no relationship between O-ring damage index and temperature. If all observations are used, the observations with no O-ring damage will bias the estimated relationship. If the intention is to investigate the relationship between the degrees of O-ring damage and atmospheric temperature, it makes sense to focus only on the flights in which there was O-ring damage.
(d) Prediction should not be made for an atmospheric temperature of 31 °F because it is outside the range of the temperature variable in the data. Such prediction will involve extrapolation, which assumes that any relationship between two variables will continue to hold outside the domain of the temperature variable.
(e) Ŷ = 18.036 – 0.240X
(g) A nonlinear model is more appropriate for these data.
(h) [Residual plot: residuals versus temperature]
The string of negative residuals and positive residuals that lie on a straight line with a positive slope in the lower-right corner of the plot is a strong indication that a nonlinear model should be used if all 23 observations are to be used in the fit.
13.81
(a) Independent variable is ERA (earned run average). Dependent variable is wins.
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.8762
R Square      0.7677

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   172.2106       9.5731           17.9889   0.0000    152.6009    191.8203
ERA         -23.0039       2.3915           -9.6189   0.0000    -27.9028    -18.1051

b0 = 172.2106 and b1 = –23.0039.
(b)
The Y intercept, b0, would be the mean number of wins when the team’s ERA is equal to zero. The literal interpretation is not meaningful in this case because a team ERA of zero is not realistic. The sample slope, b1, indicates that for each unit increase in ERA, the predicted mean number of wins decreases by 23.0039.
(c) Ŷ = 172.2106 – 23.0039(4.5) = 68.693 wins.
(d) r2 = 0.7677, which means that 76.77% of the variation in season wins can be explained by the variation in a team’s ERA.
(e) The residual plot reveals no evidence of a pattern in the residuals. The plot reveals that the residuals, the difference between the observed value of Y and the predicted value of Y, are spread above and below 0 for different values of team ERA. There appears to be no violation of the linearity and equal variance assumptions. The normal probability plot reveals no evidence of departure from the normality assumption.
(f) At the 0.05 significance level, there is evidence of a negative linear relationship between the number of wins and team ERA. Because tSTAT = –9.6189 < –2.0484 or p-value = 0.000 < 0.05, reject H0.
(g) From PHStat, Confidence Interval Estimate
Data
X Value                            4.5
Confidence Level                   95%
For Average Y
Interval Half Width                3.7582
Confidence Interval Lower Limit    64.9347
Confidence Interval Upper Limit    72.45112

64.9347 ≤ μY|X=4.5 ≤ 72.4511. The 95% confidence interval estimate is that the population mean number of wins for all teams with an ERA of 4.5 is between 64.93 and 72.45.
(h)
From PHStat, Confidence Interval Estimate
Data
X Value                            4.5
Confidence Level                   95%
For Individual Response Y
Interval Half Width                15.2245
Prediction Interval Lower Limit    53.4684
Prediction Interval Upper Limit    83.91734

53.4684 ≤ YX=4.5 ≤ 83.9173. The 95% prediction interval estimate is that the number of wins for an individual team with an ERA of 4.5 is between 53.47 and 83.92.
(i) b1 ± tα/2 Sb1 = –23.0039 ± 2.0484(2.3915); –27.90 ≤ β1 ≤ –18.11.
(j) The population in this case could include all games played by all teams for the last five years.
(k) Other variables that one might consider for inclusion in the model would be saves, runs scored per game, batting average, and the number of home runs.
(l) One can conclude there is a significant negative relationship between team ERA and the number of wins. As the ERA decreases, the expected number of wins increases. The regression equation revealed that 76.77% of the variation in wins can be explained by team ERA.
13.82
(a) Independent variable is revenue. Dependent variable is market value.
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.9220
R Square      0.8501

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   -1.9291        0.3284           -5.8748   0.0000    -2.6018     -1.2565
Revenue     0.0139         0.0011           12.6025   0.0000    0.0116      0.0161

b0 = –1.9291, b1 = 0.0139.
(b)
For each additional million-dollar increase in revenue, the current value will increase by an estimated 0.0139 billion. Literal interpretation of b0 is not meaningful because an operating team cannot have negative revenue.
(c) Ŷ = –1.9291 + 0.0139(250) = 1.5406 billion
(d) r2 = 0.8501. 85.01% of the variation in the value of an NBA basketball team can be explained by the variation in its annual revenue.
(e) There does not appear to be a pattern in the residual plot. The assumptions of regression do not appear to be seriously violated.
(f)
tSTAT = 12.6025 > 2.0484 or because the p-value is 0.0000 < 0.05, reject H0 at the 5% level of significance. There is evidence of a linear relationship between annual revenue and value.
(g) From PHStat, Confidence Interval Estimate
Data
X Value                            250
Confidence Level                   95%
For Average Y
Interval Half Width                0.1662
Confidence Interval Lower Limit    1.3743
Confidence Interval Upper Limit    1.706757

1.3743 ≤ μY|X=250 ≤ 1.7068 billion
(h) From PHStat, Confidence Interval Estimate
Data
X Value                            250
Confidence Level                   95%
For Individual Response Y
Interval Half Width                0.7665
Prediction Interval Lower Limit    0.7741
Prediction Interval Upper Limit    2.307016

0.7741 ≤ YX=250 ≤ 2.3070 billion
(i) The strength of the relationship between revenue and value is approximately the same for NBA basketball teams and for European soccer teams but lower than for MLB baseball teams.
13.83
The textbook asks the student for 13.83 (a) to repeat (a) through (h) shown in 13.82. However, the student is to use data from a different file titled “Soccer Values.”
(a)
(a) Independent variable is revenue. Dependent variable is value.
From PHStat, Simple Linear Regression Analysis
Regression Statistics
Multiple R    0.9666
R Square      0.9342

            Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept   -748.4724      174.1255         -4.2985   0.0006    -1117.6020   -379.3428
Revenue     5.0568         0.3354           15.0761   0.0000    4.3458       5.7679

b0 = –748.4724, b1 = 5.0568.
(b) For each additional million-dollar increase in revenue, the current value will increase by an estimated 5.0568 billion. Literal interpretation of b0 is not meaningful because an operating team cannot have negative revenue.
(c) Ŷ = –748.4724 + 5.0568(250) = 515.73 billion dollars
(d) r2 = 0.9342. 93.42% of the variation in the value of a European soccer team can be explained by the variation in its annual revenue.
(e) There does not appear to be a pattern in the residual plot. The assumptions of regression do not appear to be seriously violated.
(f) tSTAT = 15.0761 > 2.1199 or because the p-value is 0.0000 < 0.05, reject H0 at the 5% level of significance. There is evidence of a linear relationship between annual revenue and value.
(g) From PHStat, Confidence Interval Estimate
Data
X Value                            250
Confidence Level                   95%
For Average Y
Interval Half Width                211.5163
Confidence Interval Lower Limit    304.2134
Confidence Interval Upper Limit    727.2461

304.2134 ≤ μY|X=250 ≤ 727.2461 billion
(h) From PHStat, Confidence Interval Estimate
Data
X Value                            250
Confidence Level                   95%
For Individual Response Y
Interval Half Width                582.0576
Prediction Interval Lower Limit    -66.3278
Prediction Interval Upper Limit    1097.787

–66.3278 ≤ YX=250 ≤ 1,097.787 billion
(b) For 13.83 (b), the student is to compare the results from 13.83 (a) to similar problems in the chapter.
Soccer Values dataset analysis: Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9666
R Square             0.9342
Adjusted R Square    0.9301
Standard Error       255.7971
Observations         18

ANOVA
             df   SS              MS              F          Significance F
Regression   1    14872071.4841   14872071.4841   227.2900   0.0000
Residual     16   1046914.1270    65432.1329
Total        17   15918985.6111

            Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept   -748.4724      174.1255         -4.2985   0.0006    -1117.6020   -379.3428
Revenue     5.0568         0.3354           15.0761   0.0000    4.3458       5.7679
MLB Values dataset analysis: Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.8293
R Square             0.6877
Adjusted R Square    0.6766
Standard Error       654.9821
Observations         30

ANOVA
             df   SS              MS              F         Significance F
Regression   1    26454210.9591   26454210.9591   61.6646   0.0000
Residual     28   12012043.2076   429001.5431
Total        29   38466254.1667

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%    Upper 95%
Intercept        -1410.6988     461.6593         -3.0557   0.0049    -2356.3650   -465.0326
Annual Revenue   10.9927        1.3999           7.8527    0.0000    8.1252       13.8602
NBA Financial dataset analysis: Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9220
R Square             0.8501
Adjusted R Square    0.8448
Standard Error       0.3653
Observations         30

ANOVA
             df   SS        MS        F          Significance F
Regression   1    21.1908   21.1908   158.8236   0.0000
Residual     28   3.7359    0.1334
Total        29   24.9267

            Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept   -1.9291        0.3284           -5.8748   0.0000    -2.6018     -1.2565
Revenue     0.0139         0.0011           12.6025   0.0000    0.0116      0.0161
Among the three sports, annual revenue explains the most variation in franchise value for soccer, with r2 = 0.9342. Basketball had r2 = 0.8501 and baseball had r2 = 0.6877.
13.84
(a) b0 = –2,629.222, b1 = 82.472.
(b) For each additional centimeter in circumference, the weight is estimated to increase by 82.472 grams.
(c) 2,319.08 grams.
(d) Yes, because circumference is a very strong predictor of weight.
(e) r2 = 0.937.
(f) There appears to be a nonlinear relationship between circumference and weight.
(g) p-value is virtually 0 < 0.05; reject H0.
(h) 72.7875 ≤ β1 ≤ 92.156.
13.85
Solution located in 13.83 (b) of the present solutions.
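The r2 comparison in 13.83 (b) follows directly from the ANOVA tables, since r2 = SSR/SST. A minimal Python sketch is shown below; the dictionary layout and function name are hypothetical, while the sums of squares are copied from the three PHStat outputs above.

```python
def r_squared(ssr, sst):
    """Coefficient of determination: regression sum of squares over total."""
    return ssr / sst


# (SSR, SST) pairs copied from the ANOVA tables in 13.83 (b).
anova = {
    "Soccer Values": (14872071.4841, 15918985.6111),
    "MLB Values": (26454210.9591, 38466254.1667),
    "NBA Financial": (21.1908, 24.9267),
}

r2_by_dataset = {name: r_squared(ssr, sst) for name, (ssr, sst) in anova.items()}

# Rank the datasets from the strongest to the weakest fit.
ranking = sorted(r2_by_dataset, key=r2_by_dataset.get, reverse=True)
for name in ranking:
    print(f"{name}: r2 = {r2_by_dataset[name]:.4f}")
```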
Chapter 14
14.1
(a) Holding constant the effect of X2, for each increase of one unit in X1, the response variable Y is estimated to increase a mean of 4 units. Holding constant the effect of X1, for each increase of one unit in X2, the response variable Y is estimated to increase an average of 5 units.
(b) The Y-intercept 8 is the estimate of the mean value of Y if X1 and X2 are both 0.
14.2
(a) Holding constant the effect of X2, for each increase of one unit in X1, the response variable Y is estimated to decrease an average of 2 units. Holding constant the effect of X1, for each increase of one unit in X2, the response variable Y is estimated to increase an average of 7 units.
(b) The Y-intercept 50 is the estimate of the mean value of Y if X1 and X2 are both 0.
14.3
(a)
Y 11.002079 0.6684 X1 0.8317 X 2
(b) For each one unit increase in revenue, one would estimate that the predicted mean commitment would increase 0.6683647 units, while holding efficiency constant. For each one unit increase in efficiency, one would estimate that the predicted mean commitment would increase 0.831739 units, while holding revenue constant.
(c) The model uses both total revenue and efficiency, the percent of private donations remaining after fundraising expenses, to predict commitment, the percent of total expenses that are directly allocated to charitable services. The model may be more effective in predicting commitment than a model that includes only one of these variables.
14.4
(a)
From PHStat
                     Coefficients
Intercept            1.3847
Efficiency Ratio     -0.0072
Risk-Based Capital   0.0181

Ŷ = 1.3847 – 0.0072X1 + 0.0181X2, where X1 = Efficiency Ratio, X2 = growth rate (Risk-Based Capital)
(b) For a given growth rate, for each increase of 1% in efficiency ratio, ROAA decreases by 0.0072%. For a given efficiency ratio, for each increase of 1% in growth rate, ROAA increases by 0.0181%.
(c) Ŷ = 1.3847 – 0.0072(60) + 0.0181(15) = 1.2231
From PHStat, Confidence Interval Estimate and Prediction Interval
Data
Confidence Level                   95%
Efficiency Ratio given value       60
Risk-Based Capital given value     15
For Average Predicted Y (YHat)
Confidence Interval Lower Limit    1.121979
Confidence Interval Upper Limit    1.324142
For Individual Response Y
Prediction Interval Lower Limit    -0.1988
Prediction Interval Upper Limit    2.64492

(d) 1.1220 ≤ μY|X ≤ 1.3241
(e) –0.1988 ≤ YX ≤ 2.6449
(f) The interval in (d) is narrower because it estimates the mean value, not an individual value.
(g) The model uses both the efficiency ratio and growth rate to predict ROAA. This may produce a better model than if only one of these independent variables is included.
14.5
(a)
            Coefficients   Standard Error   t Stat    P-value
Intercept   1.1592         1.2719           0.9114    0.3667
alcohol     0.4962         0.1094           4.5378    0.0000
chlorides   -9.6331        3.6818           -2.6164   0.0119

Ŷ = 1.1592 + 0.4962X1 – 9.6331X2
(b) For a given amount of chlorides, each increase of one percent in alcohol is estimated to result in a mean increase in quality rating of 0.4962. For a given alcohol content, each increase of one unit in chlorides is estimated to result in a mean decrease in quality rating of 9.6331.
(c) The interpretation of b0 has no practical meaning here because it would represent the estimated mean quality rating when a wine has 0 alcohol content and 0 chlorides.
(d) Ŷ = 1.1592 + 0.4962(10) – 9.6331(0.08) = 5.3510.
(e) 5.0635 ≤ μY|X ≤ 5.6386
(f) 3.5484 ≤ YX ≤ 7.1536
(g) The model uses both alcohol content (%) and the amount of chlorides to predict wine quality. This may produce a better model than if only one of these independent variables is included.
14.6
(a) From PHStat
                                Coefficients
Intercept                       654.7054
Worldwide Revenues              24.6789
Number of New Graduates Hired   1.0579

Ŷ = 654.7054 + 24.6789X1 + 1.0579X2, where X1 = Revenues, X2 = New Hires
(b) For a given number of new graduates hired, for each increase of $1 billion in worldwide revenue, the mean number of full-time jobs added is predicted to increase by 24.6789. For a given worldwide revenue, for each additional new graduate hired, the mean number of full-time jobs added is predicted to increase by 1.0579.
(c) The Y intercept has no meaning in this problem.
(d) Holding the other independent variable constant, number of new graduates has a higher slope in terms of the t statistic than worldwide revenue (6.4385 versus 3.4783 in the 14.22 output).
14.7
(a) Ŷ = –330.675 + 1.764865X1 – 0.13897X2
(b) For a given amount of remote hours, each increase of one unit in total staff present is estimated to result in a mean increase in standby hours of 1.764865. For a given amount of total staff present, each increase of one unit in remote hours is estimated to result in a mean decrease in standby hours of 0.13897.
(c) The interpretation of b0 has no practical meaning here because it provides an estimate of the mean standby hours when there was no total staff present and no remote hours.
(d) Ŷi = –330.675 + 1.764865(310) – 0.13897(400) = 160.845
(e) 141.7856 ≤ μY|X ≤ 179.9074
(f) 85.2014 ≤ YX ≤ 236.4915
(g) The model uses both the total staff present and the remote hours to predict standby hours. This may produce a better model than if only one of these independent variables is included.
14.8
(a)
From PHStat
             Coefficients
Intercept    1023.3446
House Size   0.0229
Age          -6.3465

Ŷ = 1,023.3446 + 0.0229X1 – 6.3465X2, where X1 = House Size, X2 = Age
(b) For a given age, each increase by one square foot in house size is estimated to result in an increase in the mean asking price of $0.0229 thousands. For a given house size, each increase of one year in age is estimated to result in a decrease in mean asking price of $6.3465 thousands.
(c) The interpretation of b0 has no practical meaning here because it would represent the estimated asking price of a new house that has zero square feet.
(d) Ŷ = 1,023.3446 + 0.0229(25,000) – 6.3465(55) = 1,247.311 thousands.
From PHStat, Confidence Interval Estimate and Prediction Interval
Data
Confidence Level                   95%
House Size given value             25000
Age given value                    55
For Average Predicted Y (YHat)
Confidence Interval Lower Limit    1165.016
Confidence Interval Upper Limit    1329.606
For Individual Response Y
Prediction Interval Lower Limit    740.8534
Prediction Interval Upper Limit    1753.769

(e) 1,165.016 ≤ μY|X ≤ 1,329.606
(f) 740.8534 ≤ YX ≤ 1,753.769
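The point predictions in 14.7 (d) and 14.8 (d) are the intercept plus the sum of each slope times its X value. The Python sketch below illustrates this; the function name is hypothetical and the coefficients are the rounded values printed in the solutions. With rounded coefficients the 14.8 prediction comes out near 1,246.79 rather than the reported 1,247.311, presumably because PHStat carries more decimal places.

```python
def predict(intercept, slopes, x_values):
    """Multiple regression prediction: b0 + b1*X1 + b2*X2 + ..."""
    if len(slopes) != len(x_values):
        raise ValueError("need one X value per slope")
    return intercept + sum(b * x for b, x in zip(slopes, x_values))


# 14.7 (d): total staff present = 310, remote hours = 400.
standby_hours = predict(-330.675, [1.764865, -0.13897], [310, 400])

# 14.8 (d): house size = 25,000 square feet, age = 55 years (price in $ thousands).
asking_price = predict(1023.3446, [0.0229, -6.3465], [25000, 55])

print(f"14.7 predicted standby hours: {standby_hours:.3f}")
print(f"14.8 predicted asking price: {asking_price:.2f} thousands")
```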
14.9
There is no evidence of a violation of the assumptions of regression.
14.10
(a)
(b)
(c)
(d)
Based on a residual analysis of (a) to (c), there is no evidence of a violation of the assumptions of regression.
14.11
(a)
(b)
(c)
(d)
The residual plots do not reveal any specific pattern.
(e)
Since the data set is cross-sectional, it is inappropriate to compute the Durbin-Watson statistic.
14.12
(a) There is no evidence of a violation of the assumptions.
(b) Because the data are not collected over time, the Durbin-Watson test is not appropriate.
(c) They are valid.
14.13
(a)
Based upon a residual analysis, the model appears adequate.
(b) There is no evidence of a pattern over time.
(c) D = 1.79
(d) D = 1.79 > 1.55. There is no evidence of positive autocorrelation in the residuals.
14.14
(a)
The residual analysis reveals no patterns.
(b) Because the data are not collected over time, the Durbin-Watson test is not appropriate.
(c) There are no apparent violations of the assumptions.
14.15
(a) MSR = SSR/k = 60/2 = 30
MSE = SSE/(n – k – 1) = 120/18 = 6.67
(b) FSTAT = MSR/MSE = 30/6.67 = 4.5
(c) FSTAT = 4.5 > FU(2, 21 – 2 – 1) = 3.555. Reject H0. There is evidence of a significant linear relationship.
(d) r2 = SSR/SST = 60/180 = 0.3333
(e) adjusted r2 = 1 – [(1 – r2)(n – 1)/(n – k – 1)] = 0.2592
14.16
(a) MSR = SSR/k = 30/2 = 15
MSE = SSE/(n – k – 1) = 120/10 = 12
(b) FSTAT = MSR/MSE = 15/12 = 1.25
(c) FSTAT = 1.25 < FU(2, 13 – 2 – 1) = 4.103. Do not reject H0. There is not sufficient evidence of a significant linear relationship.
(d) r2 = SSR/SST = 30/150 = 0.2
(e) adjusted r2 = 1 – [(1 – r2)(n – 1)/(n – k – 1)] = 0.04
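The steps in 14.15 and 14.16 can be reproduced with a few lines of Python. The function name and the returned dictionary are hypothetical; the inputs (SSR, SSE, n, and k) are those used in the two solutions. The critical value FU still has to be looked up separately, for example in an F table.

```python
def overall_f_summary(ssr, sse, n, k):
    """Overall F test quantities for a regression with k independent variables."""
    sst = ssr + sse
    msr = ssr / k                      # mean square regression
    mse = sse / (n - k - 1)            # mean square error
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return {"msr": msr, "mse": mse, "f_stat": msr / mse, "r2": r2, "adj_r2": adj_r2}


p14_15 = overall_f_summary(ssr=60, sse=120, n=21, k=2)
p14_16 = overall_f_summary(ssr=30, sse=120, n=13, k=2)

for label, result in (("14.15", p14_15), ("14.16", p14_16)):
    print(label, {key: round(value, 4) for key, value in result.items()})
```

Comparing `f_stat` with the tabled critical value (3.555 for 14.15 and 4.103 for 14.16) gives the reject or do-not-reject decision stated in part (c).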
14.17
(a)
(b)
(c)
(d)
(e)
(a) 0.00% of the variation in price-to-book-value ratio is explained by the variation in return on equity after adjusting for number of independent variables and sample size.
(b) 2.63% of the variation in price-to-book-value ratio is explained by the variation in growth after adjusting for number of independent variables and sample size.
(c) 1.80% of the variation in price-to-book-value ratio is explained by the variation in return on equity and growth after adjusting for number of independent variables and sample size.
The second model with growth as the only independent variable has the highest adjusted r2. However, only 2.63% of the variation in price-to-book-value ratio is explained by the variation in growth after adjusting for the number of independent variables and sample size. Because all three models are rejected, there is no best model in this case.
14.18
p-value for revenue is 0.0395 < 0.05 and the p-value for efficiency is less than 0.0001 < 0.05. Reject H0 for each of the independent variables. There is evidence of a significant linear relationship with each of the independent variables.
14.19
(a)
             df   SS        MS        F         Significance F
Regression   2    27.2241   13.6120   17.3963   0.0000
Residual     47   36.7759   0.7825
Total        49   64.0000

FSTAT = MSR/MSE = 17.3963. Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a significant linear relationship.
(b) p-value = 0.0000. The probability of obtaining an F test statistic of 17.3963 or larger is 0.0000 if H0 is true.
(c) r2 = SSR/SST = 0.4254. So, 42.54% of the variation in quality rating can be explained by variation in the percentage of alcohol and variation in chlorides.
(d) adjusted r2 = 1 – [(1 – r2)(n – 1)/(n – k – 1)] = 0.4009
14.20
From PHStat, Regression Analysis
Regression Statistics
Multiple R           0.5307
R Square             0.2816
Adjusted R Square    0.2743
Standard Error       0.7192
Observations         200

ANOVA
             df    SS         MS        F
Regression   2     39.9437    19.9719   38.6148
Residual     197   101.8898   0.5172
Total        199   141.8335

                     Coefficients   Standard Error   t Stat    P-value
Intercept            1.3847         0.3699           3.7438    0.0002
Efficiency Ratio     -0.0072        0.0060           -1.1984   0.2322
Risk-Based Capital   0.0181         0.0021           8.6824    0.0000

(a) FSTAT = 38.6148 > 3.00; reject H0.
(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 38.6148 if the null hypothesis is true is 0.0000.
(c) r2 = 0.2816. 28.16% of the variation in ROAA can be explained by variation in efficiency ratio and variation in growth.
(d) adjusted r2 = 0.2743
14.21
(a) MSR = SSR/k = 27,662.54/2 = 13,831
MSE = SSE/(n – k – 1) = 28,802.07/23 = 1,252
FSTAT = MSR/MSE = 13,831/1,252 = 11.05
FSTAT = 11.05 > FU(2, 26 – 2 – 1) = 3.422. Reject H0. There is evidence of a significant linear relationship.
(b) p-value < 0.001. The probability of obtaining an F test statistic of 11.05 or larger is less than 0.001 if H0 is true.
(c) r2 = SSR/SST = 27,662.54/56,464.62 = 0.4899. So, 48.99% of the variation in standby hours can be explained by variation in the total staff present and remote hours.
(d) adjusted r2 = 1 – [(1 – r2)(n – 1)/(n – k – 1)] = 1 – [(1 – 0.4899)(26 – 1)/(26 – 2 – 1)] = 0.4456
14.22
From PHStat, Regression Analysis
Regression Statistics
Multiple R           0.7123
R Square             0.5074
Adjusted R Square    0.4939
Standard Error       1829.4342
Observations         76

ANOVA
             df   SS               MS               F
Regression   2    251630055.3805   125815027.6902   37.5923
Residual     73   244318546.0274   3346829.3976
Total        75   495948601.4079

                                Coefficients   Standard Error   t Stat   P-value
Intercept                       654.7054       257.1081         2.5464   0.0130
Worldwide Revenues              24.6789        7.0951           3.4783   0.0009
Number of New Graduates Hired   1.0579         0.1643           6.4385   0.0000

(a) FSTAT = 37.5923 > 3.13; reject H0. There is evidence of a significant linear relationship.
(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 37.5923 if the null hypothesis is true is 0.0000.
(c) r2 = 0.5074. 50.74% of the variation in full-time jobs added can be explained by variation in worldwide revenue and variation in number of new graduates.
(d) adjusted r2 = 0.4939
14.23
From PHStat, Regression Analysis
Regression Statistics
Multiple R           0.8379
R Square             0.7020
Adjusted R Square    0.6919
Standard Error       249.7391
Observations         62

ANOVA
             df   SS              MS             F
Regression   2    8670214.3248    4335107.1624   69.5067
Residual     59   3679806.9172    62369.6088
Total        61   12350021.2419

             Coefficients   Standard Error   t Stat    P-value
Intercept    1023.3446      92.9492          11.0097   0.0000
House Size   0.0229         0.0024           9.4642    0.0000
Age          -6.3465        1.1018           -5.7603   0.0000

(a) FSTAT = 69.5067 > 3.15; reject H0.
(b) p-value = 0.0000. The probability of obtaining an FSTAT value > 69.5067 if the null hypothesis is true is 0.0000.
(c) r2 = 0.7020. 70.20% of the variation in asking price of a house can be explained by variation in house size and age of the house.
(d) adjusted r2 = 0.6919
14.24
(a) The slope of X2 in terms of the t statistic is 3.75, which is larger than the slope of X1 in terms of the t statistic, which is 3.33.
(b) 95% confidence interval on β1: b1 ± t(n – k – 1) Sb1 = 4 ± 2.1098(1.2)
1.46824 ≤ β1 ≤ 6.53176
(c) For $X_1$: $t_{STAT} = \frac{b_1}{S_{b_1}} = \frac{4}{1.2} = 3.33 > t_{17} = 2.1098$ with 17 degrees of freedom for $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $t_{STAT} = \frac{b_2}{S_{b_2}} = \frac{3}{0.8} = 3.75 > t_{17} = 2.1098$ with 17 degrees of freedom for $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Both variables $X_1$ and $X_2$ should be included in the model.
14.25
(a) 95% confidence interval estimate of the population slope between commitment and revenue: $0.6683647 \pm 1.9853(0.320077)$, $0.032916 \le \beta_1 \le 1.303814$.
95% confidence interval estimate of the population slope between commitment and efficiency: $0.8317339 \pm 1.9853(0.077736)$, $0.677405 \le \beta_2 \le 0.986063$.
(b) Note: $X_1$ = revenue and $X_2$ = efficiency.
For $X_1$: $t_{STAT} = \frac{b_1}{S_{b_1}} = \frac{0.6683647}{0.320077} = 2.09 > 1.9853$ and p-value = 0.0395. Because p-value < 0.05, reject $H_0$.
For $X_2$: $t_{STAT} = \frac{b_2}{S_{b_2}} = \frac{0.8317339}{0.077736} = 10.7 > 1.9853$ and p-value < 0.0001. Because p-value < 0.05, reject $H_0$.
Because $H_0$ is rejected for both independent variables, both should be included in the model.
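The confidence intervals and t tests in 14.25 follow one pattern: $b_j \pm t\,S_{b_j}$ for the interval and $b_j / S_{b_j}$ for the test statistic. The sketch below, in Python, reproduces both from the coefficients, standard errors, and critical value 1.9853 quoted above. The helper name `slope_inference` is hypothetical, and the critical value is supplied by hand rather than computed from the t distribution.

```python
def slope_inference(b, s_b, t_crit):
    """Confidence interval and two-tail t test for one regression slope."""
    margin = t_crit * s_b            # half-width of the interval
    t_stat = b / s_b                 # test statistic for H0: beta = 0
    return {
        "lower": b - margin,
        "upper": b + margin,
        "t_stat": t_stat,
        "reject_h0": abs(t_stat) > t_crit,
    }


# Values quoted in 14.25 (critical t = 1.9853)
revenue = slope_inference(0.6683647, 0.320077, 1.9853)
efficiency = slope_inference(0.8317339, 0.077736, 1.9853)

for label, res in (("revenue", revenue), ("efficiency", efficiency)):
    print(f"{label}: {res['lower']:.6f} <= beta <= {res['upper']:.6f}, "
          f"t = {res['t_stat']:.2f}, reject H0: {res['reject_h0']}")
```

An interval that excludes zero and a rejected t test are the same decision at the same significance level, which is why both parts of the problem lead to the same conclusion.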
14.26
From PHStat

                     Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept            1.3847         0.3699           3.7438    0.0002    0.6553      2.1141
Efficiency Ratio     -0.0072        0.0060           -1.1984   0.2322    -0.0191     0.0047
Risk-Based Capital   0.0181         0.0021           8.6824    0.0000    0.0140      0.0222

(a) 95% confidence interval on $\beta_1$: $b_1 \pm t S_{b_1}$, $-0.0072 \pm 1.98(0.0060)$,
$-0.0191 \le \beta_1 \le 0.0047$
(b) For $X_1$: $t_{STAT} = \frac{b_1}{S_{b_1}} = \frac{-0.0072}{0.0060} = -1.1984 > -1.96$. Do not reject $H_0$. There is no evidence that $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $t_{STAT} = \frac{b_2}{S_{b_2}} = \frac{0.0181}{0.0021} = 8.6824 > 1.96$. Reject $H_0$. There is evidence that $X_2$ contributes to a model already containing $X_1$.
$X_2$ (risk-based capital) should be included in the model.
14.27
            Coefficients   Standard Error   t Stat    P-value
Intercept   1.1592         1.2719           0.9114    0.3667
alcohol     0.4962         0.1094           4.5378    0.0000
chlorides   -9.6331        3.6818           -2.6164   0.0119

(a) $0.2762 \le \beta_1 \le 0.7162$
(b) For $X_1$: $t_{STAT} = \frac{b_1}{S_{b_1}} = 4.5378$, p-value = 0.0000. Since p-value < 0.05, reject $H_0$. There is enough evidence that the variable percentage of alcohol contributes to a model already containing chlorides.
For $X_2$: $t_{STAT} = \frac{b_2}{S_{b_2}} = -2.6164$, p-value = 0.0119. Since p-value < 0.05, reject $H_0$. There is enough evidence that the variable chlorides contributes to a model already containing percentage of alcohol.
Both percentage alcohol and chlorides should be included in the model.
14.28
From PHStat

                                Coefficients   Standard Error   t Stat   P-value   Lower 95%   Upper 95%
Intercept                       654.7054       257.1081         2.5464   0.0130    142.2898    1167.1210
Worldwide Revenues              24.6789        7.0951           3.4783   0.0009    10.5383     38.8196
Number of New Graduates Hired   1.0579         0.1643           6.4385   0.0000    0.7304      1.3853

(a) $10.5383 \le \beta_1 \le 38.8196$
(b) For $X_1$: $t_{STAT} = 3.4783 > 1.993$. Reject $H_0$. There is evidence that $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $t_{STAT} = 6.4385 > 1.993$. Reject $H_0$. There is evidence that $X_2$ contributes to a model already containing $X_1$.
Both variables contribute to a model that includes the other variable. You should consider using both in the model.
14.29
(a) 95% confidence interval on $\beta_1$: $b_1 \pm t_{n-k-1} S_{b_1}$, $1.7649 \pm 2.0687(0.379)$
$0.9809 \le \beta_1 \le 2.5489$
(b) For $X_1$: $t_{STAT} = \frac{b_1}{S_{b_1}} = \frac{1.7649}{0.379} = 4.66 > t_{23} = 2.0687$ with 23 degrees of freedom for $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $t_{STAT} = \frac{b_2}{S_{b_2}} = \frac{0.1390}{0.0588} = 2.36 > t_{23} = 2.0687$ with 23 degrees of freedom for $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Both variables $X_1$ and $X_2$ should be included in the model.
14.30
From PHStat

             Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept    1023.3446      92.9492          11.0097   0.0000    837.3537    1209.3356
House Size   0.0229         0.0024           9.4642    0.0000    0.0181      0.0278
Age          -6.3465        1.1018           -5.7603   0.0000    -8.5512     -4.1419

(a) $0.0181 \le \beta_1 \le 0.0278$
(b) For $X_1$: $t_{STAT} = 9.4642$ and p-value = 0.0000. Since p-value < 0.05, reject $H_0$. There is evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $t_{STAT} = -5.7603$ and p-value = 0.0000. Since p-value < 0.05, reject $H_0$. There is evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Both $X_1$ (house size) and $X_2$ (age) should be included in the model.

14.31
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 60 - 25 = 35$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{35}{120/13} = 3.79 < F_U(1,13) = 4.67$ with 1 and 13 degrees of freedom and $\alpha = 0.05$. Do not reject $H_0$. There is not sufficient evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 60 - 45 = 15$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{15}{120/13} = 1.625 < F_U(1,13) = 4.67$ with 1 and 13 degrees of freedom and $\alpha = 0.05$. Do not reject $H_0$. There is not sufficient evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Neither independent variable $X_1$ nor $X_2$ makes a significant contribution to the model in the presence of the other variable.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{35}{180 - 60 + 35} = 0.2258$
Holding constant the effect of variable $X_2$, 22.58% of the variation in Y can be explained by the variation in variable $X_1$.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{15}{180 - 60 + 15} = 0.1111$
Holding constant the effect of variable $X_1$, 11.11% of the variation in Y can be explained by the variation in variable $X_2$.
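The partial F tests in 14.31 (a) subtract one regression sum of squares from another and divide by MSE. A minimal sketch in Python of that calculation follows, using only the sums of squares and degrees of freedom given in the solution. The function name `partial_f` and the constant `F_CRITICAL` are illustrative; the critical value 4.67 is copied from the solution, not computed.

```python
def partial_f(ssr_full, ssr_other, sse, df_error):
    """Contribution of one variable given the other, and its partial F statistic."""
    contribution = ssr_full - ssr_other   # e.g. SSR(X1 | X2) = SSR(X1 and X2) - SSR(X2)
    mse = sse / df_error                  # mean square error of the full model
    return contribution, contribution / mse


F_CRITICAL = 4.67  # F_U(1, 13) at alpha = 0.05, as quoted in 14.31

# SSR(X1 and X2) = 60, SSR(X2) = 25, SSR(X1) = 45, SSE = 120 with 13 error df
ssr_x1_given_x2, f_x1 = partial_f(60, 25, 120, 13)
ssr_x2_given_x1, f_x2 = partial_f(60, 45, 120, 13)

print(f"SSR(X1|X2) = {ssr_x1_given_x2}, F = {f_x1:.2f}, reject H0: {f_x1 > F_CRITICAL}")
print(f"SSR(X2|X1) = {ssr_x2_given_x1}, F = {f_x2:.3f}, reject H0: {f_x2 > F_CRITICAL}")
```

With one numerator degree of freedom, each partial F statistic equals the square of the corresponding t statistic, so this test and the t test on the slope always agree.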
14.32
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 30 - 15 = 15$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{15}{120/10} = 1.25 < F_U(1,10) = 4.965$ with 1 and 10 degrees of freedom and $\alpha = 0.05$. Do not reject $H_0$. There is not sufficient evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 30 - 20 = 10$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{10}{120/10} = 0.833 < F_U(1,10) = 4.965$ with 1 and 10 degrees of freedom and $\alpha = 0.05$. Do not reject $H_0$. There is not sufficient evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Neither independent variable $X_1$ nor $X_2$ makes a significant contribution to the model in the presence of the other variable. Also, the overall regression equation involving both independent variables is not significant: $F_{STAT} = \frac{MSR}{MSE} = \frac{30/2}{120/10} = 1.25 < F_U(2,10) = 4.103$.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{15}{150 - 30 + 15} = 0.1111$
Holding constant the effect of variable $X_2$, 11.11% of the variation in Y can be explained by the variation in variable $X_1$.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{10}{150 - 30 + 10} = 0.0769$
Holding constant the effect of variable $X_1$, 7.69% of the variation in Y can be explained by the variation in variable $X_2$.
14.33
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 27.2241 - 11.1119 = 16.1122$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{16.1122}{0.7825} = 20.5916$ with 1 and 47 degrees of freedom, and p-value = 0.0000. Reject $H_0$. There is sufficient evidence that the variable percentage alcohol contributes to a model already containing chlorides.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 27.2241 - 21.8677 = 5.3564$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{5.3564}{0.7825} = 6.8455$ with 1 and 47 degrees of freedom, and p-value = 0.0119. Reject $H_0$. There is enough evidence that the variable chlorides contributes to a model already containing percentage alcohol.
Since both percentage alcohol and chlorides make a significant contribution to the model in the presence of the other, the most appropriate regression model for this data set should include both percentage alcohol and chlorides.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{16.1122}{64 - 27.2241 + 16.1122} = 0.3046$
Holding constant the effect of chlorides, 30.46% of the variation in quality rating can be explained by the variation in percentage alcohol.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{5.3564}{64 - 27.2241 + 5.3564} = 0.1271$
Holding constant the effect of percentage alcohol, 12.71% of the variation in quality rating can be explained by the variation in chlorides.
14.34
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 39.9437 - 39.2009 = 0.7428$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{0.7428}{101.8898/197} = 1.44 < 3.84$
Do not reject $H_0$. There is insufficient evidence that $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 39.9437 - 0.9542 = 38.9895$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{38.9895}{101.8898/197} = 75.38 > 3.84$
Reject $H_0$. There is evidence that $X_2$ contributes to a model already containing $X_1$.
Because only $X_2$ makes a significant contribution to the model in the presence of the other variable, only that variable should be included in the model.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{0.7428}{141.8335 - 39.9437 + 0.7428} = 0.0072$
Holding constant the effect of the risk-based capital, 0.72% of the variation in ROAA can be explained by the variation in efficiency ratio.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{38.9895}{141.8335 - 39.9437 + 38.9895} = 0.2768$
Holding constant the effect of efficiency ratio, 27.68% of the variation in ROAA can be explained by the variation in the risk-based capital.
14.35
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 27,662.54 - 513.2846 = 27,149.255$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{27,149.255}{28,802.07/23} = 21.68 > F_U(1,23) = 4.279$ with 1 and 23 degrees of freedom and $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_1$ contributes to a model already containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 27,662.54 - 20,667.4 = 6,995.14$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{6,995.14}{28,802.07/23} = 5.586 > F_U(1,23) = 4.279$ with 1 and 23 degrees of freedom and $\alpha = 0.05$. Reject $H_0$. There is evidence that the variable $X_2$ contributes to a model already containing $X_1$.
Since each independent variable, $X_1$ and $X_2$, makes a significant contribution to the model in the presence of the other variable, the most appropriate regression model for this data set should include both variables.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{27,149.255}{56,464.62 - 27,662.54 + 27,149.255} = 0.4852$
Holding constant the effect of remote hours, 48.52% of the variation in Y can be explained by the variation in total staff present.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{6,995.14}{56,464.62 - 27,662.54 + 6,995.14} = 0.1954$
Holding constant the effect of total staff present, 19.54% of the variation in Y can be explained by the variation in remote hours.
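The coefficients of partial determination in 14.35 (b) compare one variable's added sum of squares with the variation left unexplained by the other variable. The sketch below, in Python, evaluates that ratio with the sums of squares quoted in 14.35; the function name `partial_r2` is an illustrative choice and not a PHStat function.

```python
def partial_r2(sst, ssr_full, contribution):
    """Coefficient of partial determination for one variable given the other.

    contribution is SSR(Xj | other variable) from the partial F test.
    """
    return contribution / (sst - ssr_full + contribution)


# Values quoted in 14.35: SST = 56,464.62 and SSR(X1 and X2) = 27,662.54
SST = 56464.62
SSR_FULL = 27662.54

r2_y1_2 = partial_r2(SST, SSR_FULL, 27149.255)   # SSR(X1 | X2)
r2_y2_1 = partial_r2(SST, SSR_FULL, 6995.14)     # SSR(X2 | X1)

print(f"r2_Y1.2 = {r2_y1_2:.4f}")
print(f"r2_Y2.1 = {r2_y2_1:.4f}")
```

The denominator is SSE plus the variable's own contribution, so the two partial coefficients do not need to sum to the overall $r^2$.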
14.36
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 251,630,055.3805 - 211,138,564.1929 = 40,491,491.19$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{40,491,491.19}{244,318,546.0274/73} = 12.098 > 3.98$
Reject $H_0$. There is sufficient evidence that $X_1$ contributes to a model containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 251,630,055.3805 - 112,888,778.5259 = 138,741,276.9$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{138,741,276.9}{244,318,546.0274/73} = 41.45 > 3.98$
Reject $H_0$. There is sufficient evidence that $X_2$ contributes to a model containing $X_1$.
Because both variables make a significant contribution to the model in the presence of the other variable, both variables should be included in the model.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{40,491,491.19}{495,948,601.4079 - 251,630,055.3805 + 40,491,491.19} = 0.142$
Holding constant the effect of the number of new graduates, 14.2% of the variation in full-time jobs can be explained by the variation in total worldwide revenue.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{138,741,276.9}{495,948,601.4079 - 251,630,055.3805 + 138,741,276.9} = 0.3622$
Holding constant the effect of total worldwide revenue, 36.22% of the variation in full-time jobs can be explained by the variation in the number of new graduates.
14.37
(a) For $X_1$: $SSR(X_1 \mid X_2) = SSR(X_1 \text{ and } X_2) - SSR(X_2) = 8,670,214.3248 - 3,083,694.6869 = 5,586,519.638$
$F_{STAT} = \frac{SSR(X_1 \mid X_2)}{MSE} = \frac{5,586,519.638}{3,679,806.9172/59} = 89.571 > 4.00$
Reject $H_0$. There is sufficient evidence that $X_1$ contributes to a model containing $X_2$.
For $X_2$: $SSR(X_2 \mid X_1) = SSR(X_1 \text{ and } X_2) - SSR(X_1) = 8,670,214.3248 - 6,600,743.6307 = 2,069,470.694$
$F_{STAT} = \frac{SSR(X_2 \mid X_1)}{MSE} = \frac{2,069,470.694}{3,679,806.9172/59} = 33.18 > 4.00$
Reject $H_0$. There is evidence that $X_2$ contributes to a model already containing $X_1$.
Because both variables make a significant contribution to the model in the presence of the other variable, both variables should be included in the model.
(b) $r^2_{Y1.2} = \frac{SSR(X_1 \mid X_2)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_1 \mid X_2)} = \frac{5,586,519.638}{12,350,021.2419 - 8,670,214.3248 + 5,586,519.638} = 0.6029$
Holding constant the effect of age, 60.29% of the variation in asking price can be explained by the variation in house size.
$r^2_{Y2.1} = \frac{SSR(X_2 \mid X_1)}{SST - SSR(X_1 \text{ and } X_2) + SSR(X_2 \mid X_1)} = \frac{2,069,470.694}{12,350,021.2419 - 8,670,214.3248 + 2,069,470.694} = 0.3600$
Holding constant the effect of house size, 36.00% of the variation in asking price can be explained by the variation in age.
14.38
(a) Holding constant the effect of $X_2$, the estimated mean value of the dependent variable will increase by 4 units for each increase of one unit of $X_1$.
(b) Holding constant the effects of $X_1$, the presence of the condition represented by $X_2 = 1$ is estimated to increase the mean value of the dependent variable by 2 units.
(c) $t_{STAT} = 3.27 > t_{17} = 2.1098$. Reject $H_0$. The presence of $X_2$ makes a significant contribution to the model.

14.39
(a) First develop a multiple regression model using $X_1$ as the variable for the SAT score and $X_2$ a dummy variable with $X_2 = 1$ if a student had a grade of B or better in the introductory statistics course. If the dummy variable coefficient is significantly different from zero, you need to develop a model with the interaction term $X_1 X_2$ to make sure that the coefficient of $X_1$ is not significantly different if $X_2 = 0$ or $X_2 = 1$.
(b) If a student received a grade of B or better in the introductory statistics course, the student would be estimated to have a grade point average in accounting that is 0.30 greater than a student who had the same SAT score, but did not get a grade of B or better in the introductory statistics course.

14.40
(a) $\hat{Y} = 243.7371 + 9.2189X_1 + 12.6967X_2$, where $X_1$ = number of rooms and $X_2$ = neighborhood (east = 0).
(b) Holding constant the effect of neighborhood, for each additional room, the selling price is estimated to increase by a mean of 9.2189 thousands of dollars, or $9,218.90. For a given number of rooms, a west neighborhood is estimated to increase mean selling price over an east neighborhood by 12.6967 thousands of dollars, or $12,696.70.
(c) $\hat{Y} = 243.7371 + 9.2189(9) + 12.6967(0) = 326.70758$ or $326,707.58
$309,560.04 < $Y_{X=X_i}$ < $343,855.11
$321,471.44 < $\mu_{Y|X=X_i}$ < $331,943.71
(d) [Figure: Normal Probability Plot (Residuals vs. Z Value)]
[Figure: Rooms Residual Plot (Residuals vs. Rooms)]
Based on a residual analysis, the model appears adequate.
(e) $F_{STAT} = 55.39$, p-value is virtually 0. Since p-value < 0.05, reject $H_0$. There is evidence of a significant relationship between selling price and the two independent variables (rooms and neighborhood).
(f) For $X_1$: $t_{STAT} = 8.9537$, p-value is virtually 0. Reject $H_0$. Number of rooms makes a significant contribution and should be included in the model.
For $X_2$: $t_{STAT} = 3.5913$, p-value = 0.0023 < 0.05. Reject $H_0$. Neighborhood makes a significant contribution and should be included in the model.
Based on these results, the regression model with the two independent variables should be used.
(g) $7.0466 \le \beta_1 \le 11.3913$, $5.2378 \le \beta_2 \le 20.1557$
(i) $r^2_{adj} = 0.851$
(j) $r^2_{Y1.2} = 0.825$. Holding constant the effect of neighborhood, 82.5% of the variation in selling price can be explained by variation in number of rooms.
$r^2_{Y2.1} = 0.431$. Holding constant the effect of number of rooms, 43.1% of the variation in selling price can be explained by variation in neighborhood.
(k) The slope of selling price with number of rooms is the same regardless of whether the house is located in an east or west neighborhood.
(l) $\hat{Y} = 253.95 + 8.032X_1 - 5.90X_2 + 2.089X_1X_2$. For $X_1X_2$: the p-value is 0.330. Do not reject $H_0$. There is no evidence that the interaction term makes a contribution to the model.
(m) The two-variable model in (f) should be used.
(n) The real estate association can conclude that the number of rooms and the neighborhood both significantly affect the selling price, but the number of rooms has a greater effect.

14.41
PHStat output:

Regression Statistics
Multiple R           0.5068
R Square             0.2568
Adjusted R Square    0.2415
Standard Error       1.0509
Observations         100

ANOVA
             df   SS         MS        F         Significance F
Regression    2   37.0257    18.5129   16.7617   0.0000
Residual     97   107.1343   1.1045
Total        99   144.1600

               Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept      0.9342         0.8770           1.0652    0.2894    -0.8064     2.6747
alcohol        0.4652         0.0820           5.6762    0.0000    0.3025      0.6278
Type of Wine   -0.2577        0.2102           -1.2258   0.2232    -0.6749     0.1595

(a) $\hat{Y} = 0.9342 + 0.4652X_1 - 0.2577X_2$
(b) Holding constant the effect of the type of wine, for each additional % increase in alcohol content, wine quality is estimated to increase by a mean of 0.4652. For a given amount of alcohol content, a white wine is estimated to have a 0.2577 higher mean quality than a red wine.
(c) $\hat{Y} = 0.9342 + 0.4652(10) - 0.2577(1) = 5.3283$
$3.2196 \le Y_{X=X_i} \le 7.4371$
$5.0184 \le \mu_{Y|X=X_i} \le 5.6382$
(d) PHStat output:
[Figure: Residual Plot for alcohol (Residuals vs. alcohol)]
[Figure: Residual Plot for Type of Wine (Residuals vs. Type of Wine)]
[Figure: Normal Probability Plot (Residual vs. Z Value)]
Based on a residual analysis, there is not any obvious pattern in the residual plots but the normal probability plot indicates departure from the normality assumption.
(e) $F_{STAT} = 16.7617$ with a p-value = 0.0000. Reject $H_0$. There is evidence of a relationship between quality and percentage of alcohol and the type of wine.
(f) For $X_1$: $t_{STAT} = 5.6762$ with a p-value = 0.0000. Reject $H_0$. Alcohol content makes a significant contribution and should be included in the model.
For $X_2$: $t_{STAT} = -1.2258$ with a p-value = 0.2232. Do not reject $H_0$. The type of wine does not make a significant contribution and should not be included in the model.
Only alcohol content should be kept in the model.
(g) $0.3025 \le \beta_1 \le 0.6278$, $-0.6749 \le \beta_2 \le 0.1595$
(h) The slope here takes into account the effect of the other predictor variable, type of wine, while the solution for Problem 13.4 did not.
(i) $r^2 = 0.2568$. So, 25.68% of the variation in quality can be explained by variation in alcohol content and variation in the type of wine.
(j) $r^2_{adj} = 0.2415$
(k) $r^2 = 0.2568$ while $r^2 = 0.3417$ in Problem 13.16 (a).
(l) $r^2_{Y1.2} = 0.2493$. Holding constant the effect of wine type, 24.93% of the variation in quality can be explained by variation in alcohol content.
$r^2_{Y2.1} = 0.0153$. Holding constant the effect of alcohol content, 1.53% of the variation in quality can be explained by variation in wine type.
(m) The slope of alcohol content is the same regardless of whether the wine is red or white.
(n) PHStat output:

                         Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept                1.6780         1.1448           1.4658    0.1460    -0.5944     3.9503
alcohol                  0.3947         0.1076           3.6667    0.0004    0.1810      0.6083
Type of Wine             -2.0309        1.7669           -1.1494   0.2533    -5.5382     1.4764
alcohol X Type of Wine   0.1678         0.1660           1.0107    0.3147    -0.1617     0.4973

Since the $t_{STAT}$ for the significance of $X_1X_2$ has a p-value = 0.3147, do not reject $H_0$. There is not evidence that the interaction term makes a contribution to the model.
(o) The one-variable model should be used.
(p) Only the alcohol content is significant in predicting the wine quality.

14.42
(a) $\hat{Y} = 8.0100 + 0.0052X_1 - 2.1052X_2$, where $X_1$ = depth (in feet) and $X_2$ = type of drilling (wet = 0, dry = 1).
(b) Holding constant the effect of type of drilling, for each foot increase in depth of the hole, the additional drilling time is estimated to increase by a mean of 0.0052 minute. For a given depth, a dry drilling is estimated to reduce mean additional drilling time over wet drilling by 2.1052 minutes.
(c) Dry drilling: $\hat{Y} = 8.0101 + 0.0052(100) - 2.1052 = 6.4276$ minutes.
$6.2096 \le \mu_{Y|X=X_i} \le 6.6457$, $4.9230 \le Y_{X=X_i} \le 7.9322$
(d) [Figure: Depth Residual Plot (Residuals vs. Depth)]
Based on a residual analysis, the model appears adequate.
(e) $F_{STAT} = 111.109$ with 2 and 97 degrees of freedom, $F_{2,97} = 3.09$ using Excel. p-value is virtually 0. Reject $H_0$ at 5% level of significance. There is evidence of a relationship between additional drilling time and the two independent variables.
(f) For $X_1$: $t_{STAT} = 5.0289 > t_{97} = 1.9847$. Reject $H_0$. Depth of the hole makes a significant contribution and should be included in the model.
For $X_2$: $t_{STAT} = -14.0331 < -t_{97} = -1.9847$. Reject $H_0$. Type of drilling makes a significant contribution and should be included in the model.
Based on these results, the regression model with the two independent variables should be used.
(g) $0.0032 \le \beta_1 \le 0.0073$, $-2.4029 \le \beta_2 \le -1.8075$
(i) $r^2_{adj} = 0.6899$
(j) $r^2_{Y1.2} = 0.2068$. Holding constant the effect of type of drilling, 20.68% of the variation in additional drilling time can be explained by variation in depth of the hole.
$r^2_{Y2.1} = 0.6700$. Holding constant the effect of the depth of the hole, 67% of the variation in additional drilling time can be explained by variation in type of drilling.
(k) The slope of additional drilling time with depth of the hole is the same regardless of whether it is a dry drilling hole or a wet drilling hole.
(l) $\hat{Y} = 7.9120 + 0.0060X_1 - 1.9091X_2 - 0.0015X_1X_2$. For $X_1X_2$: the p-value is 0.4624 > 0.05. Do not reject $H_0$. There is not evidence that the interaction term makes a contribution to the model.
(m) The two-variable model in (a) should be used.
(n) Both variables affect the drilling time. Dry drilling holes should be used to reduce the drilling time.

14.43
(a) $\hat{Y} = 2.4512 + 0.0482X_1 - 4.5283X_2$, where $X_1$ = amount of cubic feet moved and $X_2$ = is there an elevator in the apartment (yes = 1, no = 0)?
(b) Holding constant the effect of elevator in the building, for each cubic foot increase in amount moved, the labor hours are estimated to increase by a mean of 0.0482. For a given amount of cubic feet moved, a building with an elevator is estimated to have mean labor hours 4.5283 below those of a building without an elevator.
(c) $\hat{Y} = 2.4512 + 0.0482(500) - 4.5283(1) = 22.0254$
$20.1431 \le \mu_{Y|X=X_i} \le 23.9078$
$12.1150 \le Y_{X=X_i} \le 31.9359$
(d) [Figure: Normal Probability Plot (Residuals vs. Z Value)]
[Figure: Feet Residual Plot (Residuals vs. Feet)]
Based on a residual analysis, the errors appear to be normally distributed. The equal variance assumption does not appear to have been violated. The linearity assumption also appears to be intact.
(e) $F_{STAT} = 153.3884$, p-value is virtually 0. Since p-value < 0.05, reject $H_0$. There is evidence of a significant relationship between labor hours and the two independent variables (the amount of cubic feet moved and whether there is an elevator in the building).
(f) For $X_1$: $t_{STAT} = 16.015$, p-value is virtually 0. Reject $H_0$. The amount of cubic feet moved makes a significant contribution and should be included in the model.
For $X_2$: $t_{STAT} = -2.1521$, p-value = 0.0388 < 0.05. Reject $H_0$. The presence of an elevator makes a significant contribution and should be included in the model.
Based on these results, the regression model with the two independent variables should be used.
(g) $0.0421 \le \beta_1 \le 0.0543$, $-8.8091 \le \beta_2 \le -0.2475$
(h) $r^2 = 0.9029$. So 90.29% of the variation in labor hours can be explained by variation in the amount of cubic feet moved and whether there is an elevator in the building.
(i) $r^2_{adj} = 0.8970$
(j) $r^2_{Y1.2} = 0.8860$. Holding constant the effect of the presence of an elevator, 88.6% of the variation in labor hours can be explained by variation in the amount of cubic feet moved.
$r^2_{Y2.1} = 0.1231$. Holding constant the effect of the amount of cubic feet moved, 12.31% of the variation in labor hours can be explained by whether there is an elevator in the building.
(k) The slope of labor hours with the amount of cubic feet moved is the same regardless of whether there is an elevator in the building.
(l) $\hat{Y} = -4.7260 + 0.0573X_1 + 5.4614X_2 - 0.0139X_1X_2$. For $X_1X_2$: the p-value is 0.0257 < 0.05. Reject $H_0$. There is evidence that the interaction term makes a contribution to the model.
(m) The interaction model in (l) should be used.
(n) Both the amount of cubic feet moved and the presence of an elevator affect labor hours.

14.44
(a) $\hat{Y} = 2.1698 - 0.0201X_1 - 0.0156X_2 + 0.0006X_1X_2$, where $X_1$ = efficiency ratio and $X_2$ = total risk-based capital.
(b) For the interaction term, p-value = 0.0072 < 0.05. Reject $H_0$. There is evidence that the interaction term makes a contribution to the model. Because there is evidence of an interaction effect between efficiency ratio and total risk-based capital, the model in (a) should be used.

14.45
From PHStat

Regression Analysis

Regression Statistics
Multiple R           0.7284
R Square             0.5306
Adjusted R Square    0.5209
Standard Error       14.4536
Observations         100

ANOVA
             df   SS           MS           F         Significance F
Regression    2   22906.6581   11453.3290   54.8254   0.0000
Residual     97   20263.8519   208.9057
Total        99   43170.5100

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept        -47.9482       11.9934          -3.9979   0.0001    -71.7517    -24.1447
Summed Rating    1.7375         0.1890           9.1918    0.0000    1.3623      2.1127
Coded Location   -17.8012       2.9129           -6.1111   0.0000    -23.5826    -12.0199
(a) $\hat{Y} = -47.9482 + 1.7375X_1 - 17.8012X_2$.
(b) The Y intercept, $b_0$, would be the mean cost per person for a meal when the summary rating and the location are both zero. The literal interpretation is not meaningful in this case because a summary rating of zero would not be logical. For each one unit increase in summary rating, one would estimate that the predicted mean cost per person would increase by $1.7375, while holding presence of metro area location constant. When there is a presence of metro area location, the predicted mean cost per person will decrease by $17.8012, while holding summary rating constant.
(c) $\hat{Y} = -47.9482 + 1.7375(60) - 17.8012(0) = 56.3018$.
From PHStat

Confidence Interval Estimate and Prediction Interval

Data
Confidence Level                   95%
Summated Rating given value        60
Coded Location given value         0

For Average Predicted Y (YHat)
Confidence Interval Lower Limit    52.13595
Confidence Interval Upper Limit    60.46708

For Individual Response Y
Prediction Interval Lower Limit    27.31431
Prediction Interval Upper Limit    85.28871

$52.13595 \le \mu_{Y|X=X_i} \le 60.46708$
$27.31431 \le Y_{X=X_i} \le 85.28871$
(d) There is no pattern in the relationship between residuals and the predicted value of Y, the value of the summary rating, or the value of the location. The regression assumptions are satisfied.
(e) At the 0.05 significance level, there is evidence of a significant linear relationship between price per person and the independent variables, summary rating and location. Because $F_{STAT} = 54.8254$ or p-value = 0.0000, reject $H_0$.
(f) From PHStat

                 Coefficients   Standard Error   t Stat    P-value   Lower 95%   Upper 95%
Intercept        -47.9482       11.9934          -3.9979   0.0001    -71.7517    -24.1447
Summary Rating   1.7375         0.1890           9.1918    0.0000    1.3623      2.1127
Coded Location   -17.8012       2.9129           -6.1111   0.0000    -23.5826    -12.0199

At the 0.05 significance level, there is evidence of a linear relationship between summary rating and the cost of a meal. Because $t_{STAT} = 9.19$ or p-value = 0.000, reject $H_0$ and include summary rating in the model. At the 0.05 significance level, there is evidence of a linear relationship between presence of a restaurant in the metro area and the cost of a meal. Because $t_{STAT} = -6.11$ or p-value = 0.0000, reject $H_0$ and include location in the model. Based on these results both summary rating and location should be included in the model.
(g) $1.3623 \le \beta_1 \le 2.1127$
(h) For Problem 13.5, the sample slope is 1.5951. For (b) of 14.45, the sample slope for summary rating is 1.7375 and –17.8012 for location. The slope from 14.45 (b) is higher due to the addition of the location dummy variable. Because the two variables are not independent, removal of the location dummy variable results in a reduction of the summated rating slope for the model in 13.5.
(i) From PHStat

Regression Analysis

Regression Statistics
Multiple R           0.7284
R Square             0.5306
Adjusted R Square    0.5209

$r^2 = 0.5306$. Thus, 53.06% of the variation in price per person can be explained by the variation in summary rating and the location of the restaurant.
(j) $r^2_{adj} = 0.5209$. The adjusted $r^2$ takes into account the number of independent variables and the sample size.
(k) For Problem 13.17, r2 = 0.3499, which means that 34.49% of the variation in the dependent variable can be explained by the independent variable. For 14.45, $r^2 = 0.5306$, which means that 53.06% of the variation in the dependent variable can be explained by the variation in summated rating and location.
(l) From PHStat:

Regression Analysis
Coefficients of Partial Determination

Coefficients
r2 Y1.2   0.465534817
r2 Y2.1   0.277980717

$r^2_{Y1.2} = 0.4655$. Holding the effect of location constant, 46.55% of variation in price per person can be explained by the variation in summary rating.
$r^2_{Y2.1} = 0.2780$. Holding the effect of summary rating constant, 27.80% of variation in price per person can be explained by the variation in restaurant location.
(m) The slope of price per person on summary rating is the same irrespective of the location of the restaurant.
(n) From PHStat, with Summary Rating * Location

Regression Analysis

Regression Statistics
Multiple R           0.7306
R Square             0.5337
Adjusted R Square    0.5192
Standard Error       14.4801
Observations         100

ANOVA
             df   SS           MS          F
Regression    3   23041.8093   7680.6031   36.6312
Residual     96   20128.7007   209.6740
Total        99   43170.5100

                  Coefficients   Standard Error   t Stat    P-value
Intercept         -55.1287       14.9786          -3.6805   0.0004
Summary Rating    1.8523         0.2373           7.8049    0.0000
Coded Location    2.3450         25.2623          0.0928    0.9262
Rating*Location   -0.3161        0.3937           -0.8029   0.4240
Solutions to End-of-Section and Chapter Review Problems ccv For the interaction term, summary rating*location, tSTAT = –0.8029 with a p-value of 0.4240. Because p-value > 0.05, do not reject H0. There is not sufficient evidence that the interaction term makes a significant contribution. (o)
(p)
14.46
(a)
(b)
On the basis of (f) and (n), the most appropriate model would include both summary rating and location as independent variables. The interaction term would not be included in this model because it did not make a significant contribution to the model at the 0.05 significance level. The most appropriate model: Y 47.9482 1.7375 X1 17.8012 X 2 . Both the summary rating and the location of the restaurant contribute to the cost of a meal. The model developed in 14.45 indicates that 53.06% of the variation in price per person can be explained by the variation in summary rating and the variation in location of the restaurant.
Y 368.2348 36.4269 X1 1.8282 X 2 0.0186 X1 X 2 , where X1 = worldwide revenue, X2 = number of new graduates, where p-value =0.01832 < 0.05. Reject H0. So the term in (a) is significant and should be included in the model. Because there is evidence of an interaction effect, the model in (a) should be used.
14.47
(a) Coefficients Standard Error t Stat P-value Intercept 7.5904 3.5598 2.1323 0.0384 alcohol -0.1321 0.3430 -0.3850 0.7020 chlorides -99.9904 47.0361 -2.1258 0.0389 alcohol X chlorides 8.8772 4.6077 1.9266 0.0602
(b)
14.48
(a)
(b)
14.49
For X1X2: the p-value is 0.0602 > 0.05. Do not reject H0. There is not enough evidence that the interaction term makes a contribution to the model. Since there is not enough evidence of an interaction effect between percentage alcohol and chlorides, the model in problem 14.5 should be used.
Yˆ 250.4237 0.0127X 1 1.4785X 2 0.004X 3 . where X1 = staff present, X2 = remote hours, X3 = X1 X2 For X1X2: the p-value is 0.2353 > 0.05. Do not reject H0. There is not enough evidence that the interaction term makes a contribution to the model. Since there is not enough evidence of an interaction effect between total staff present and remote hours, the model in problem 14.7 should be used.
(a)
               Coefficients   Standard Error   t Stat    P-value
Intercept      -63.9813       16.7997          -3.8085   0.0008
Proficiency      1.1258        0.1589           7.0868   0.0000
Classroom      -22.2887        4.3154          -5.1649   0.0000
Online           8.0880        4.3103           1.8765   0.0719

where X1 = proficiency exam, X2 = classroom dummy, X3 = online dummy
(b) Holding constant the effect of training method, for each point increase in proficiency exam score, the end-of-training exam score is estimated to increase by a mean of 1.1258 points. For a given proficiency exam score, the end-of-training exam score of a trainee who has been trained by the classroom method will have an estimated mean score that is 22.2887 points below a trainee that has been trained using the courseware app method. For a given proficiency exam score, the end-of-training exam score of a trainee who has been trained by the online method will have an estimated mean score that is 8.0880 points above a trainee that has been trained using the courseware app method.
(c) $\hat{Y} = -63.9813 + 1.1258(100) = 48.5969$
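The dummy-variable prediction in 14.49 (c) can be checked with a short sketch. The function name `predict_score` and the method labels are illustrative choices, not part of the solution; the coefficients are the rounded values from the 14.49 output, so the courseware result (48.5987) differs slightly from the reported 48.5969, which was presumably computed from unrounded coefficients.

```python
# Coefficients copied from the 14.49 regression output (rounded to four decimals).
INTERCEPT = -63.9813
B_PROFICIENCY = 1.1258
B_CLASSROOM = -22.2887   # classroom dummy (X2)
B_ONLINE = 8.0880        # online dummy (X3)


def predict_score(proficiency, method):
    """Predicted end-of-training score; 'courseware' is the baseline (both dummies 0)."""
    x2 = 1 if method == "classroom" else 0
    x3 = 1 if method == "online" else 0
    return INTERCEPT + B_PROFICIENCY * proficiency + B_CLASSROOM * x2 + B_ONLINE * x3


# Predictions for a proficiency exam score of 100 under each training method.
predictions = {m: predict_score(100, m) for m in ("courseware", "classroom", "online")}
for method, score in predictions.items():
    print(f"{method:10s} {score:8.4f}")
```

Because the model has no interaction terms, the gap between any two methods is the same at every proficiency score; it is simply the difference between the dummy coefficients.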
(d) [Residual plots not reproduced: residuals versus Proficiency and residuals versus Predicted Y.]
There appears to be a quadratic effect from the residual plots.
[Normal probability plot not reproduced: residuals versus Z value.]
There is no severe departure from the normality assumption from the normal probability plot.
(e) FSTAT = 31.77 with 3 and 26 degrees of freedom. The p-value is virtually 0. Reject H0 at the 5% level of significance. There is evidence of a relationship between end-of-training exam score and the independent variables.
(f) For X1: tSTAT = 7.0868 and the p-value is virtually 0. Reject H0. Proficiency exam score makes a significant contribution and should be included in the model. For X2: tSTAT = –5.1649 and the p-value is virtually 0. Reject H0. The classroom dummy makes a significant contribution and should be included in the model. For X3: tSTAT = 1.8765 and the p-value = 0.07186. Do not reject H0. There is not sufficient evidence to conclude that there is a difference between the online method and the courseware app method in the mean end-of-training exam scores. Based on the above results, the regression model should use the proficiency exam score and the classroom dummy variable.
(g) $0.7992 \le \beta_1 \le 1.4523$, $-31.1591 \le \beta_2 \le -13.4182$, $-0.7719 \le \beta_3 \le 16.9480$
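The confidence intervals for the slopes in 14.49 follow the usual form of estimate plus or minus critical value times standard error. The sketch below is a minimal check, assuming a two-tailed 5% critical value of 2.0555 for 26 degrees of freedom (the value is supplied by hand here rather than taken from the source); `slope_interval` is an illustrative name.

```python
# Assumed critical value: t with 26 degrees of freedom, alpha/2 = 0.025.
T_CRIT = 2.0555

# (coefficient, standard error) pairs copied from the 14.49 output.
estimates = {
    "proficiency": (1.1258, 0.1589),
    "classroom": (-22.2887, 4.3154),
    "online": (8.0880, 4.3103),
}


def slope_interval(b, se, t_crit=T_CRIT):
    """Return (lower, upper) limits of b +/- t_crit * se."""
    half_width = t_crit * se
    return b - half_width, b + half_width


intervals = {name: slope_interval(b, se) for name, (b, se) in estimates.items()}
for name, (lower, upper) in intervals.items():
    print(f"{name:12s} {lower:9.4f} {upper:9.4f}")
```

The interval for the online dummy contains 0, which agrees with the t test that does not reject H0 for X3.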
(h) $r^2 = 0.7857$. 78.57% of the variation in the end-of-training exam score can be explained by the proficiency exam score and the various training methods.
(i) $r^2_{adj} = 0.7610$
(j) $r^2_{Y1.23} = 0.6589$. Holding constant the effect of training method, 65.89% of the variation in end-of-training exam score can be explained by variation in the proficiency exam score.
$r^2_{Y2.13} = 0.5064$. Holding constant the effect of proficiency exam score, 50.64% of the variation in end-of-training exam score can be explained by the difference between classroom and courseware app methods.
$r^2_{Y3.12} = 0.1193$. Holding constant the effect of proficiency exam score, 11.93% of the variation in end-of-training exam score can be explained by the difference between online and courseware app methods.
(k) The slope of end-of-training exam score with proficiency score is the same regardless of the training method.
(l) Let X4 = X1X2, X5 = X1X3.
$H_0: \beta_4 = \beta_5 = 0$. There is no interaction among X1, X2 and X3.
$H_1$: At least one of $\beta_4$ and $\beta_5$ is not zero. There is interaction among at least a pair of X1, X2 and X3.
$F_{STAT} = \frac{SSR(X_4, X_5 \mid X_1, X_2, X_3)/2}{MSE(X_1, X_2, X_3, X_4, X_5)} = \frac{[SSR(X_1, X_2, X_3, X_4, X_5) - SSR(X_1, X_2, X_3)]/2}{MSE(X_1, X_2, X_3, X_4, X_5)} = 0.8122$
The p-value = 0.46 > 0.05. Do not reject H0. The interaction terms do not make a significant contribution to the model.
(m) The regression model should use the proficiency exam score and the classroom dummy variable.
14.50
(a) Predicted $\hat{Y} = 7 + 2X_{1i} + 3X_{1i}^2 = 7 + 2(2) + 3(2^2) = 23$.
(b) $t_{STAT} = 2.35 > t_{\alpha/2} = 2.0518$ with 27 degrees of freedom. Reject H0. The quadratic term is significant. The quadratic model is better than the linear model.
(c) $t_{STAT} = 1.17 < t_{\alpha/2} = 2.0518$ with 27 degrees of freedom. Do not reject H0. The quadratic term is not significant. The quadratic model is not better than the linear model.
(d) Predicted $\hat{Y} = 7 - 3.0X_{1i} + 3X_{1i}^2 = 7 - 3.0(2) + 3(2^2) = 13$.
14.51
(a) Predicted $\hat{Y} = 7 + 2X_{1i} + 1.5X_{1i}^2 = 7 + 2(3) + 1.5(3^2) = 26.5$.
(b) $t_{STAT} = 2.35 > t_{\alpha/2} = 2.0518$ with 27 degrees of freedom. Reject H0. The quadratic term is significant. The quadratic model is better than the linear model.
(c) $t_{STAT} = 1.17 < t_{\alpha/2} = 2.0518$ with 27 degrees of freedom. Do not reject H0. The quadratic term is not significant. The quadratic model is not better than the linear model.
(d) Predicted $\hat{Y} = 7 - 3X_{1i} + 1.5X_{1i}^2 = 7 - 3(2) + 1.5(2^2) = 7$.
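The arithmetic in 14.50 and 14.51 can be reproduced with a small sketch. The helper names are illustrative, and the negative linear coefficient in part (d) of each problem is inferred from the printed results (13 and 7), since the signs were lost in the extracted text.

```python
def quadratic_prediction(b0, b1, b2, x):
    """Predicted Y for the quadratic model b0 + b1*x + b2*x^2."""
    return b0 + b1 * x + b2 * x ** 2


def quadratic_term_significant(t_stat, t_crit=2.0518):
    """Two-tailed test of the quadratic term; 2.0518 is the critical value for 27 df."""
    return abs(t_stat) > t_crit


# 14.50 (a) and (d), then 14.51 (a) and (d).
print(quadratic_prediction(7, 2, 3, 2))      # 14.50 (a)
print(quadratic_prediction(7, -3, 3, 2))     # 14.50 (d)
print(quadratic_prediction(7, 2, 1.5, 3))    # 14.51 (a)
print(quadratic_prediction(7, -3, 1.5, 2))   # 14.51 (d)

# Parts (b) and (c): compare the test statistic with the critical value.
print(quadratic_term_significant(2.35), quadratic_term_significant(1.17))
```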
14.52
(a)
GPA    Predicted HOCS
2.0    2.8600
2.1    3.0342
2.2    3.1948
2.3    3.3418
2.4    3.4752
2.5    3.5950
2.6    3.7012
2.7    3.7938
2.8    3.8728
2.9    3.9382
3.0    3.9900
3.1    4.0282
3.2    4.0528
3.3    4.0638
3.4    4.0612
3.5    4.0450
3.6    4.0152
3.7    3.9718
3.8    3.9148
3.9    3.8442
4.0    3.7600

(b) [Line chart not reproduced: Predicted HOCS (vertical axis) versus GPA (horizontal axis).]
(c) The curvilinear relationship suggests that HOCS increases at a decreasing rate. It reaches its maximum value of 4.0638 at GPA = 3.3 and declines after that as GPA continues to increase.
(d) An r2 of 0.07 and an adjusted r2 of 0.06 tell you that GPA has very low explanatory power in explaining the variation in HOCS. You can tell that the individual HOCS scores will be scattered quite widely around the curvilinear relationship plotted in (b) and discussed in (c).
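The statement in 14.52 (c) about where the predicted values peak can be read straight off the table in (a). The following sketch uses only the tabulated predictions; the variable names are illustrative.

```python
# Predicted HOCS values copied from the 14.52 (a) table, for GPA = 2.0, 2.1, ..., 4.0.
predicted_hocs = [
    2.8600, 3.0342, 3.1948, 3.3418, 3.4752, 3.5950, 3.7012,
    3.7938, 3.8728, 3.9382, 3.9900, 4.0282, 4.0528, 4.0638,
    4.0612, 4.0450, 4.0152, 3.9718, 3.9148, 3.8442, 3.7600,
]
gpa = [round(2.0 + 0.1 * i, 1) for i in range(len(predicted_hocs))]

# Largest tabulated prediction and the GPA at which it occurs.
peak_hocs, peak_gpa = max(zip(predicted_hocs, gpa))

# Step-to-step changes: they shrink as GPA rises and eventually turn negative.
increments = [round(b - a, 4) for a, b in zip(predicted_hocs, predicted_hocs[1:])]

print(peak_gpa, peak_hocs)
print(increments[:3], increments[-3:])
```

The shrinking increments are what "increases at a decreasing rate" means in practice for a quadratic curve with a negative squared term.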
14.53
(a)
(b) $\hat{Y} = 59.6691 + 0.0328X - 0.000005029X^2$
(c) $\hat{Y} = 59.6691 + 0.0328(3000) - 0.000005029(3000)^2 = 112.6782$
(d) The residuals plot does not reveal any non-linearity. The normal probability plot does not indicate any severe departure from the normality assumption.
(e) $H_0: \beta_1 = \beta_2 = 0$; $H_1$: At least one $\beta_j$ is not 0. FSTAT = 3192.8738 with a p-value of virtually zero. Reject H0. The overall quadratic relationship is significant.
(f) $H_0: \beta_2 = 0$; $H_1: \beta_2 \ne 0$. tSTAT = –54.9089 with a p-value = 0.0000 < 0.05. Reject H0. At the 0.05 level of significance, the quadratic model is better than the linear model.
(g) $r^2 = 0.9584$. So, 95.84% of the variation in torque can be explained by the quadratic relationship between torque and RPM.
(h) $r^2_{adj} = 0.9581$.
(i) Torque depends quadratically on RPM, and 95.84% of the variation in torque can be explained by the quadratic relationship between torque and RPM.
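The prediction in 14.53 (c) can be recomputed as a quick check. This is a sketch using the rounded coefficients printed in part (b); it gives about 112.81 rather than the reported 112.6782, a gap that is most likely due to the linear coefficient being rounded to 0.0328 before being multiplied by 3000. The function name is illustrative.

```python
# Rounded coefficients from 14.53 (b): intercept, RPM, RPM squared.
B0, B1, B2 = 59.6691, 0.0328, -0.000005029


def predict_torque(rpm):
    """Predicted torque from the fitted quadratic model."""
    return B0 + B1 * rpm + B2 * rpm ** 2


torque_at_3000 = predict_torque(3000)
print(round(torque_at_3000, 4))
```

Small rounding in a coefficient is magnified when the predictor takes large values such as 3000, so predictions are best computed from the unrounded software output when it is available.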
14.54
(a) From PHStat: Domestic Beer, Calories vs Alcohol and Carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9835
R Square             0.9673
Adjusted R Square    0.9668
Standard Error       7.9735
Observations         157

ANOVA
              df     SS             MS             F
Regression      2    289334.6109    144667.3054    2275.4758
Residual      154      9790.8159        63.5767
Total         156    299125.4268

                 Coefficients   Standard Error   t Stat    P-value
Intercept        -5.1828        2.5746           -2.0130   0.0459
Alcohol          21.5146        0.5613           38.3308   0.0000
Carbohydrates     3.9387        0.1526           25.8068   0.0000

$\hat{Y} = -5.1828 + 21.5146X_1 + 3.9387X_2$, where X1 = alcohol % and X2 = carbohydrates.
FSTAT = 2,275.4758, p-value = 0.0000 < 0.05, so reject H0. At the 5% level of significance, the linear terms are significant together.
(b)
From PHStat: Domestic Beer, Calories vs Alcohol, Carbohydrates, Alcohol Squared, and Carbohydrates Squared

Regression Analysis
Regression Statistics
Multiple R           0.9842
R Square             0.9686
Adjusted R Square    0.9678
Standard Error       7.8617
Observations         157

ANOVA
              df     SS             MS            F
Regression      4    289730.9290    72432.7322    1171.9387
Residual      152      9394.4978       61.8059
Total         156    299125.4268

                 Coefficients   Standard Error   t Stat    P-value
Intercept        10.9881        8.2218            1.3365   0.1834
Alcohol          14.5795        2.9323            4.9721   0.0000
Carbohydrates     4.7076        0.4700           10.0171   0.0000
Alcohol Sq        0.5227        0.2163            2.4166   0.0169
CarbSq           -0.0257        0.0164           -1.5667   0.1193

$\hat{Y} = 10.9881 + 14.5795X_1 + 4.7076X_2 + 0.5227X_1^2 - 0.0257X_2^2$, where X1 = alcohol % and X2 = carbohydrates.
(c)
For the model in (b) that includes the quadratic terms, FSTAT = 1,171.9387 and the p-value = 0.0000 < 0.05, so reject H0. At the 5% level of significance, the model with quadratic terms is significant. For the quadratic alcohol % term, tSTAT = 2.4166 and the p-value = 0.0169 < 0.05. Reject H0. There is enough evidence that the quadratic term for alcohol % is significant at the 5% level of significance. For the quadratic carbohydrate term, tSTAT = –1.5667 and the p-value = 0.1193 > 0.05. Do not reject H0. There is insufficient evidence that the quadratic term for carbohydrates is significant at the 5% level of significance. Hence, because the quadratic term for alcohol % is significant, the model in (b) that includes this term is better.
(d)
The number of calories in a beer depends quadratically on the alcohol percentage but linearly on the number of carbohydrates. The alcohol percentage and number of carbohydrates explain about 96.73% of the variation in the number of calories in a beer.
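The t tests on the two quadratic terms in 14.54 (c) can be retraced from the coefficient table in (b): each test statistic is the coefficient divided by its standard error, and the decision compares the p-value with 0.05. The sketch below copies the values from the PHStat output; the dictionary layout is an illustrative choice.

```python
ALPHA = 0.05

# (coefficient, standard error, p-value) for the quadratic terms in 14.54 (b).
quadratic_terms = {
    "Alcohol Sq": (0.5227, 0.2163, 0.0169),
    "CarbSq": (-0.0257, 0.0164, 0.1193),
}

results = {}
for name, (coef, std_err, p_value) in quadratic_terms.items():
    t_stat = coef / std_err          # t statistic for H0: coefficient = 0
    reject = p_value < ALPHA         # reject H0 when the p-value is below alpha
    results[name] = (t_stat, reject)
    print(f"{name:10s} t = {t_stat:7.4f}  reject H0: {reject}")
```

The recomputed statistics differ from the printed ones only in the third or fourth decimal place, because the table values are themselves rounded.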
14.55
(a)
(b) $\hat{Y} = -1003.9000 + 6.2937X_1 - 0.0098X_1^2$

                  Coefficients   Standard Error   t Stat    P-value
Intercept         -1003.9000     473.7367         -2.1191   0.1014
Temperature           6.2937       3.1635          1.9895   0.1175
Temperature Sq       -0.0098       0.0053         -1.8587   0.1366

(c) There is no obvious pattern in the residual plot and normal probability plot. The model appears to be adequate.
(d) $H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \ne 0$. Since the p-value = 0.1366 > 0.05, do not reject H0. There is not a significant quadratic relationship between temperature and registration error.
(e) Since the quadratic term is not significant at the 5% level, the linear model is a better fit than the quadratic regression model.
(f) r2 = 0.8216. So, 82.16% of the variation in registration error can be explained by the variation in temperature.
(g) Adjusted r2 = 0.7859.
(h) There is a strong linear relationship between registration error and temperature. Registration error depends linearly on temperature and 82.16% of the variation in registration error can be explained by the variation in temperature.
14.56
(a)
(b) $\hat{Y} = 18030 - 1813X + 63.2X^2$
(c) $\hat{Y} = 18030 - 1813(5) + 63.2(5)^2 = 10{,}545.6$ (dollars)
(d) There are no patterns in the residual plots. There does not appear to be any violation of assumptions.
(e) Because FSTAT = 243.51 or p-value = 0.000, reject H0.
(f) p-value = 0.000. The probability of FSTAT = 243.51 or higher is 0.000, given the null hypothesis is true.
(g) Because tSTAT = 4.86 or p-value = 0.000, reject H0.
(h) The probability of tSTAT < –4.86 or > 4.86 is 0.000, given the null hypothesis is true.
(i) r2 = 0.9312. 93.12% of the variation in price can be explained by the quadratic relationship between age and price.
(j) Adjusted r2 = 0.9273.
(k) There is a strong quadratic relationship between age and price.
14.57
(a)
(b) From PHStat

Regression Analysis
Regression Statistics
Multiple R           0.5882
R Square             0.3460
Adjusted R Square    0.2975
Standard Error       7989.3501
Observations         30

ANOVA
              df    SS                 MS                F
Regression     2     911627846.6386    455813923.3193    7.1411
Residual      27    1723402298.5881     63829714.7625
Total         29    2635030145.2267

                          Coefficients   Standard Error   t Stat    P-value
Intercept                 2704.3781      1848.3824         1.4631   0.1550
Tourism Establishments     335.4417       108.9968         3.0775   0.0047
Establishments^2            -1.1854         0.5293        -2.2396   0.0335

$\hat{Y} = 2{,}704.3781 + 335.4417X - 1.1854X^2$
(c) When X = 3, $\hat{Y} = 2{,}704.3781 + 335.4417(3) - 1.1854(3)^2 = 3{,}700.035$ thousands. For a country with 3,000 tourist establishments, the predicted mean number of jobs generated in the travel and tourism industry is 3,700,035.
(d) A plot of the residuals against the values of the independent variable, number of tourism establishments, reveals a potential violation of the equal variance assumption. A normal probability plot reveals a potential violation of the normality assumption.
(e) At the 0.05 significance level there is evidence of a significant overall relationship between the number of jobs generated in the travel industry and the number of tourist establishments. Because FSTAT = 7.1411 or p-value = 0.0047, reject H0.
(f) The p-value of 0.0047 indicates that the probability of observing an FSTAT of 7.1411 or greater is 0.0047 when H0 is true.
(g) There is sufficient evidence that the quadratic model is significant at the 0.05 level. Because tSTAT = –2.2396 or p-value = 0.0335, reject H0.
(h) r2 = 0.3460. Thus, 34.60% of the variation in the number of jobs generated in the travel and tourism industry can be explained by the quadratic relationship between the number of jobs and the number of tourism establishments.
(i) r2adj = 0.2975. The adjusted r2 takes into account the number of independent variables and the sample size.
(j) There is a significant quadratic relationship between the number of jobs generated in the travel and tourism industry in 2021 and the number of establishments that provide overnight accommodation for tourists. However, the results should be interpreted with caution given the potential violations in the equal variance and normality assumptions.
14.58
(a) $\log \hat{Y} = \log(3.07) + 0.9\log(8.5) + 1.41\log(5.2) = 2.33318$, so $\hat{Y} = 10^{2.33318} = 215.37$
(b) Holding constant the effects of X2, for each additional unit of the logarithm of X1, the logarithm of Y is estimated to increase by a mean of 0.9. Holding constant the effects of X1, for each additional unit of the logarithm of X2, the logarithm of Y is estimated to increase by a mean of 1.41.
14.59
(a) $\ln \hat{Y} = 4.62 + 0.5(8.5) + 0.7(5.2) = 12.51$, so $\hat{Y} = e^{12.51} = 271{,}034.12$
(b) Holding constant the effects of X2, for each additional unit of X1, the natural logarithm of Y is estimated to increase by a mean of 0.5. Holding constant the effects of X1, for each additional unit of X2, the natural logarithm of Y is estimated to increase by a mean of 0.7.
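The two transformed-model predictions in 14.58 (a) and 14.59 (a) can be verified with the standard library. This is a minimal sketch: the variable names are illustrative, the first model is the multiplicative model in base-10 logarithms, and the second is the exponential model in natural logarithms.

```python
import math

x1, x2 = 8.5, 5.2

# 14.58: log10(Y) = log10(3.07) + 0.9*log10(X1) + 1.41*log10(X2)
log10_y = math.log10(3.07) + 0.9 * math.log10(x1) + 1.41 * math.log10(x2)
y_multiplicative = 10 ** log10_y          # back-transform from the log10 scale

# 14.59: ln(Y) = 4.62 + 0.5*X1 + 0.7*X2
ln_y = 4.62 + 0.5 * x1 + 0.7 * x2
y_exponential = math.exp(ln_y)            # back-transform from the natural-log scale

print(round(log10_y, 5), round(y_multiplicative, 2))
print(round(ln_y, 2), round(y_exponential, 2))
```

In both cases the regression equation predicts the logarithm of Y, so the last step is always to undo the logarithm with the matching base.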
14.60
(a) From PHStat: sqrt (Calories) vs Alcohol and Carbohydrates
$\widehat{\sqrt{Y}} = 6.2596 + 0.7755X_1 + 0.1672X_2$, where X1 = alcohol %, and X2 = carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9784
R Square             0.9573
Adjusted R Square    0.9568
Standard Error       0.3523
Observations         157

ANOVA
              df     SS          MS          F
Regression      2    428.5147    214.2574    1726.7142
Residual      154     19.1089      0.1241
Total         156    447.6236

                 Coefficients   Standard Error   t Stat    P-value
Intercept        6.2596         0.1137           55.0329   0.0000
Alcohol          0.7755         0.0248           31.2762   0.0000
Carbohydrates    0.1672         0.0067           24.7987   0.0000

The normal probability plot of the linear model showed departure from a normal distribution, so a square-root transformation of calories was done.
(b) FSTAT = 1,726.7142. Because the p-value = 0.000, reject H0 at the 5% level of significance. There is evidence of a significant linear relationship between the square root of calories and the percentage of alcohol and the number of carbohydrates.
(c)
r2 = 0.9573. This indicates that 95.73% of the variation in the square root of calories can be explained by the variation in the percentage of alcohol and the variation in the number of carbohydrates.
(d)
Adjusted r2 = 0.9568.
(e)
The model in 14.60 is better because the residual plot is not right skewed.
14.61
(a) From PHStat: natlog (Calories) vs Alcohol and Carbohydrates
$\ln \hat{Y} = 4.0613 + 0.1143X_1 + 0.0288X_2$, where X1 = alcohol %, and X2 = carbohydrates

Regression Analysis
Regression Statistics
Multiple R           0.9609
R Square             0.9233
Adjusted R Square    0.9223
Standard Error       0.0760
Observations         157

ANOVA
              df     SS         MS        F
Regression      2    10.7108    5.3554    926.9479
Residual      154     0.8897    0.0058
Total         156    11.6005

                 Coefficients   Standard Error   t Stat     P-value
Intercept        4.0613         0.0245           165.4771   0.0000
Alcohol          0.1143         0.0054            21.3612   0.0000
Carbohydrates    0.0288         0.0015            19.7964   0.0000

(b) The residual plots for percentage of alcohol and number of carbohydrates reveal some potential remaining non-linearity in the transformed dependent variable. The normal probability plot reveals no evidence of a violation of the normality assumption, except for three outlying residuals on the negative side.
(c) At the 0.05 level of significance, there is evidence of a significant overall relationship between the natural logarithm of calories and the percentage of alcohol and the number of carbohydrates. Because FSTAT = 926.9479 or p-value = 0.0000, reject H0.
(d) r2 = 0.9233. This indicates that 92.33% of the variation in the natural logarithmic transformation of calories can be explained by the variation in the percentage of alcohol and the variation in the number of carbohydrates.
(e) Adjusted r2 = 0.9223.
(f) The models in 14.54 (r2 = 0.9673) and 14.60 (r2 = 0.9573) can explain more variation in the dependent variable because they had slightly higher r2 values compared to the present model. The model in 14.54 would be the best model because it had the highest r2 value.
14.62
(a) Predicted ln(Price) = 9.7771 – 0.10218 Age.
(b) $10,574.92
(c) There is no evidence of violations of assumptions. The model is adequate.
(d) tSTAT = 19.48 or p-value = 0.000, reject H0.
(e) 0.9112. 91.12% of the variation in the natural log of price can be explained by the age of the auto.
(f) 0.9088.
(g) Choose the model from Problem 15.6. That model has a higher adjusted r2 of 92.73%.
14.63
(a) $\sqrt{\hat{Y}} = 128.29 - 4.798X_1$
(b) $\hat{Y} = (128.29 - 4.798(5))^2 = 10{,}878.49$ (dollars)
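The price predictions in 14.62 (b) and 14.63 (b) both require undoing a transformation of the dependent variable. The sketch below assumes, as in 14.63 (b), that the vehicle is 5 years old; the function names are illustrative. With the rounded coefficients the natural-log model gives about 10,574.5, slightly below the reported $10,574.92.

```python
import math

AGE = 5  # age used in 14.63 (b); assumed to be the same age used in 14.62 (b)


def price_from_log_model(age):
    """14.62: predicted ln(Price) = 9.7771 - 0.10218*Age, back-transformed with exp."""
    return math.exp(9.7771 - 0.10218 * age)


def price_from_sqrt_model(age):
    """14.63: predicted sqrt(Price) = 128.29 - 4.798*Age, back-transformed by squaring."""
    return (128.29 - 4.798 * age) ** 2


log_model_price = price_from_log_model(AGE)
sqrt_model_price = price_from_sqrt_model(AGE)
print(round(log_model_price, 2), round(sqrt_model_price, 2))
```

The two models give predictions a few hundred dollars apart for the same age, which is one practical reason the solutions go on to compare the models.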
(c) The residual plot of residuals versus age appears to reveal a quadratic pattern, which indicates remaining nonlinearity after the square root transformation. The normal probability plot reveals no sufficient evidence of a violation of the normality assumption.
(d) At the 0.05 significance level, there is evidence of a significant overall relationship between the square root of price and age of vehicle. Because FSTAT = 392.73 or p-value = 0.000, reject H0.
(e) r2 = 0.9139. This indicates that 91.39% of the variation in the square root transformation of price can be explained by the variation in the age of a vehicle.
(f) Adjusted r2 = 0.9116. This indicates that 91.16% of the variation in the square root transformation of price can be explained by the variation in the age of a vehicle after adjusting for the number of independent variables and the sample size.
(g) The models in 15.6 and 15.12 can explain more variation in the dependent variable because they had slightly higher r2 values compared to the present model. The models in 15.6 and 15.12 had the same adjusted r2 value of 0.9273. Both the 15.6 model and the 15.12 model would be better than the present model.
14.64
r2 represents the proportion of the variation in Y that is explained by the set of explanatory variables selected. Adjusted r2 takes into account both the number of explanatory variables in the model and the sample size.
14.65
In the case of the simple linear regression model, the slope b1 represents the change in the estimated mean of Y per unit change in X and does not take into account any other variables. In the multiple linear regression model, the slope b1 represents the change in the estimated mean of Y per unit change in X1, taking into account the effect of all the other independent variables.
14.66
Testing the significance of the entire regression model involves a simultaneous test of whether any of the independent variables are significant. Testing the contribution of each independent variable tests the contribution of that independent variable after accounting for the effect of the other independent variables in the model.
14.67
The coefficient of partial determination measures the proportion of variation in Y explained by a particular X variable holding constant the effect of the other independent variables in the model. The coefficient of multiple determination measures the proportion of variation in Y explained by all the X variables included in the model.
14.68
Dummy variables are used to represent categorical independent variables in a regression model. One category is coded as 0 and the other category of the variable is coded as 1.
14.69
You test whether the interaction of the dummy variable and each of the independent variables in the model make a significant contribution to the regression model.
14.70
You will want to include an interaction term in a regression model if the effect of an independent variable on the response variable is dependent on the value of a second independent variable.
14.71
It is assumed that the slope of the dependent variable Y with an independent variable X is the same for each of the two levels of the dummy variable.
14.72
When a regression analysis fails to yield a suitable linear model, a nonlinear model that expresses a curvilinear relationship, such as a quadratic regression model, may be suitable.
14.73
(a) $\hat{Y} = -3.888 + 1.449X_1 + 1.462X_2 - 0.190X_1X_2 = -3.888 + 1.449(2) + 1.462(2) - 0.190(2)(2) = 1.174$
(b) $\hat{Y} = -3.888 + 1.449(2) + 1.462(7) - 0.190(2)(7) = 6.584$
(c) $\hat{Y} = -3.888 + 1.449(7) + 1.462(2) - 0.190(7)(2) = 6.519$
(d) $\hat{Y} = -3.888 + 1.449(7) + 1.462(7) - 0.190(7)(7) = 7.179$
(e) $\hat{Y} = -3.888 + 1.449X_1 + 1.462(2) - 0.190X_1(2) = -0.964 + 1.069X_1$. The slope of X1 is 1.069.
(f) $\hat{Y} = -3.888 + 1.449X_1 + 1.462(7) - 0.190X_1(7) = 6.346 + 0.119X_1$. The slope of X1 is 0.119.
(g) $\hat{Y} = -3.888 + 1.449(2) + 1.462X_2 - 0.190(2)X_2 = -0.99 + 1.082X_2$. The slope of X2 is 1.082.
(h) $\hat{Y} = -3.888 + 1.449(7) + 1.462X_2 - 0.190(7)X_2 = 6.255 + 0.132X_2$. The slope of X2 is 0.132.
(i) Since the interaction between X1 and X2 is negative, a higher value of the perceived quality of the product, X1, will attenuate the effect of the perceived value of the product, X2, on the predicted value of purchasing behavior. Likewise, a higher value of the perceived value of the product, X2, will attenuate the effect of the perceived quality of the product, X1, on the predicted value of purchasing behavior.
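The predictions and conditional slopes in 14.73 all come from one interaction model, so they can be generated together. In the sketch below the signs of the coefficients are the ones that reproduce every printed result (for example 1.174 in part (a) and the slope 1.069 in part (e)); the function names are illustrative.

```python
# Interaction model from 14.73: Y = B0 + B1*X1 + B2*X2 + B12*X1*X2
B0, B1, B2, B12 = -3.888, 1.449, 1.462, -0.190


def predict(x1, x2):
    """Predicted purchasing behavior for given perceived quality (x1) and value (x2)."""
    return B0 + B1 * x1 + B2 * x2 + B12 * x1 * x2


def slope_of_x1(x2):
    """Slope of X1 when X2 is held at a given level."""
    return B1 + B12 * x2


def slope_of_x2(x1):
    """Slope of X2 when X1 is held at a given level."""
    return B2 + B12 * x1


for x1, x2 in [(2, 2), (2, 7), (7, 2), (7, 7)]:
    print(x1, x2, round(predict(x1, x2), 3))        # parts (a) to (d)
print(round(slope_of_x1(2), 3), round(slope_of_x1(7), 3))   # parts (e) and (f)
print(round(slope_of_x2(2), 3), round(slope_of_x2(7), 3))   # parts (g) and (h)
```

Both conditional slopes shrink as the other variable increases, which is the attenuation described in part (i).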
14.74
(a) $\hat{Y} = -3.9152 + 0.0319X_1 + 4.2228X_2$, where X1 = number of cubic feet moved and X2 = number of pieces of large furniture.
(b) Holding constant the number of pieces of large furniture, for each additional cubic foot moved, the mean labor hours are estimated to increase by 0.0319. Holding constant the number of cubic feet moved, for each additional piece of large furniture, the mean labor hours are estimated to increase by 4.2228.
(c) $\hat{Y} = 20.4926$
(d) Based on a residual analysis, the errors appear to be normally distributed. The equal-variance assumption might be violated because the variances appear to be larger around the center region of both independent variables. There might also be a violation of the linearity assumption. A model with quadratic terms for both independent variables might be fitted.
(e) FSTAT = 228.80, p-value is virtually 0 < 0.05, reject H0. There is evidence of a significant relationship between labor hours and the two independent variables (the number of cubic feet moved and the number of pieces of large furniture).
(f) The p-value is virtually 0. The probability of obtaining a test statistic of 228.80 or greater is virtually 0 if there is no significant relationship between labor hours and the two independent variables (the number of cubic feet moved and the number of pieces of large furniture).
(g) r2 = 0.9327. 93.27% of the variation in labor hours can be explained by variation in the number of cubic feet moved and the number of pieces of large furniture.
(h) $r^2_{adj} = 0.9287$
(i) For X1: tSTAT = 6.9339, the p-value is virtually 0. Reject H0. The number of cubic feet moved makes a significant contribution and should be included in the model. For X2: tSTAT = 4.6192, the p-value is virtually 0. Reject H0. The number of pieces of large furniture makes a significant contribution and should be included in the model. Based on these results, the regression model with the two independent variables should be used.
(j) For X1: tSTAT = 6.9339, the p-value is virtually 0. The probability of obtaining a sample that will yield a test statistic greater than 6.9339 is virtually 0 if the number of cubic feet moved does not make a significant contribution, holding the effect of the number of pieces of large furniture constant. For X2: tSTAT = 4.6192, the p-value is virtually 0. The probability of obtaining a sample that will yield a test statistic greater than 4.6192 is virtually 0 if the number of pieces of large furniture does not make a significant contribution, holding the effect of the number of cubic feet moved constant.
(k) $0.0226 \le \beta_1 \le 0.0413$
(l) $r^2_{Y1.2} = 0.5930$. Holding constant the effect of the number of pieces of large furniture, 59.30% of the variation in labor hours can be explained by variation in the number of cubic feet moved. $r^2_{Y2.1} = 0.3927$. Holding constant the effect of the number of cubic feet moved, 39.27% of the variation in labor hours can be explained by variation in the number of pieces of large furniture.
(m) Both the number of cubic feet moved and the number of large pieces of furniture are useful in predicting the labor hours, but the cubic feet moved is more important.
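The coefficients of partial determination in 14.74 (l) are tied to the t statistics in part (i) by the identity r2 = t^2 / (t^2 + residual degrees of freedom). The sketch below is a check under one loud assumption: the residual degrees of freedom are taken to be 33 (n = 36 with two independent variables), a value the solution does not state but which is consistent with the reported r2 and adjusted r2. The function name is illustrative.

```python
# ASSUMPTION: 33 residual degrees of freedom (n = 36, k = 2); not stated in the solution.
DF_RESIDUAL = 33


def partial_r2_from_t(t_stat, df_residual):
    """Coefficient of partial determination implied by a slope's t statistic."""
    return t_stat ** 2 / (t_stat ** 2 + df_residual)


r2_y1_given_2 = partial_r2_from_t(6.9339, DF_RESIDUAL)   # cubic feet moved
r2_y2_given_1 = partial_r2_from_t(4.6192, DF_RESIDUAL)   # pieces of large furniture
print(round(r2_y1_given_2, 4), round(r2_y2_given_1, 4))
```

Under that assumption the identity reproduces the reported 0.5930 and 0.3927, and it shows why the variable with the larger absolute t statistic also has the larger partial r2.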
14.75
From PHStat, Wins vs field goal percentage and three-point percentage

                          Coefficients   Standard Error   t Stat    P-value
Intercept                 -216.1443       49.1568         -4.3970   0.0002
Field Goal Percentage      237.7929      133.8594          1.7764   0.0869
Three-Point Percentage     416.9878      140.5164          2.9675   0.0062

(a) $\hat{Y} = -216.1443 + 237.7929X_1 + 416.9878X_2$, where X1 = field goal % and X2 = three-point field goal %, both expressed as proportions.
(b) For a given three-point field goal %, each increase of 1% (0.01) in field goal % increases the estimated mean number of wins by 2.38. For a given field goal %, each increase of 1% (0.01) in three-point field goal % increases the estimated mean number of wins by 4.17.
(c) $\hat{Y} = -216.1443 + 237.7929(0.45) + 416.9878(0.35) = 36.8082$ wins
Copyright ©2024 Pearson Education, Inc.
Solutions to End-of-Section and Chapter Review Problems ccxxxv 14.75 cont.
(d)
Residual analysis does not reveal any potential violation of the regression assumptions.
Copyright ©2024 Pearson Education, Inc.
ccxxxvi Chapter 16: Time-Series Forecasting 14.75
(e)
\(H_0: \beta_1 = \beta_2 = 0\)
\(H_1\): Not all \(\beta_j = 0\) for \(j = 1, 2\)

Regression Analysis

Regression Statistics
Multiple R           0.7303
R Square             0.5334
Adjusted R Square    0.4988
Standard Error       8.1729
Observations         30

ANOVA
              df    SS           MS           F
Regression    2     2061.4855    1030.7427    15.4313
Residual      27    1803.4812    66.7956
Total         29    3864.9667

                          Coefficients    Standard Error    t Stat     P-value
Intercept                 -216.1443       49.1568           -4.3970    0.0002
Field Goal Percentage     237.7929        133.8594          1.7764     0.0869
Three-Point Percentage    416.9878        140.5164          2.9675     0.0062

\(F_{STAT} = 15.4313\) with p-value = 0.0002. Since the p-value < 0.05, reject \(H_0\) at the 5% level of significance. There is evidence of a significant linear relationship between number of wins and the two explanatory variables.
(f)
The p-value is 0.0002. The probability of obtaining an F test statistic equal to or larger than 15.4313 is 0.0002 if \(H_0\) is true.
(g)
\(r^2 = SSR/SST = 0.5334\). So, 53.34% of the variation in number of wins can be explained by variation in field goal % and three-point field goal %.
(h)
Adjusted \(r^2 = 0.4988\).
(i)
For \(X_1\): \(t_{STAT} = b_1/S_{b_1} = 1.7764\) and p-value = 0.0869 > 0.05; do not reject \(H_0\). There is insufficient evidence that the variable \(X_1\) contributes to a model already containing \(X_2\).
For \(X_2\): \(t_{STAT} = b_2/S_{b_2} = 2.9675\) and p-value = 0.0062 < 0.05; reject \(H_0\). There is evidence that the variable \(X_2\) contributes to a model already containing \(X_1\). \(X_2\) should be included in the model.
(j)
For \(X_1\): p-value = 0.0869. The probability of obtaining a t test statistic that differs from 0 by 1.7764 or more in either direction is 8.69% if \(X_1\) is insignificant.
For \(X_2\): p-value = 0.0062. The probability of obtaining a t test statistic that differs from 0 by 2.9675 or more in either direction is 0.62% if \(X_2\) is insignificant.
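The F statistic, \(r^2\), and adjusted \(r^2\) reported for this problem all follow from the ANOVA sums of squares. The following is a minimal sketch of that arithmetic; the function name `regression_summary` and its argument names are hypothetical, with n = 30 observations and k = 2 independent variables taken from the output.

```python
def regression_summary(ssr, sse, n, k):
    """Derive F, r-squared, and adjusted r-squared from ANOVA sums of squares.

    ssr: regression sum of squares, sse: residual (error) sum of squares,
    n: number of observations, k: number of independent variables.
    """
    sst = ssr + sse                      # total sum of squares
    msr = ssr / k                        # mean square regression
    mse = sse / (n - k - 1)              # mean square error
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return {"F": msr / mse, "r2": r2, "adj_r2": adj_r2}

summary = regression_summary(ssr=2061.4855, sse=1803.4812, n=30, k=2)
print({name: round(value, 4) for name, value in summary.items()})
```

The same function can be applied to the ANOVA tables of the other problems in this section by substituting their SSR, SSE, n, and k.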
(k)
From PHStat, Regression Analysis, Coefficients of Partial Determination:
\(r^2_{Y1.2}\)    0.104647873
\(r^2_{Y2.1}\)    0.24594251
\(r^2_{Y1.2} = 0.1046\). Holding constant three-point field goal %, 10.46% of the variation in number of wins can be explained by variation in field goal % for the team.
\(r^2_{Y2.1} = 0.2459\). Holding constant the effect of field goal % for the team, 24.59% of the variation in number of wins can be explained by variation in three-point field goal %.
(l)
Both field goal % and three-point field goal % for the team are useful in predicting the number of wins.
14.76
(a)
From PHStat, Asking price vs living space and age
\(\hat{Y} = 450.2780 + 0.0969X_1 - 0.5151X_2\), where \(X_1\) = house size and \(X_2\) = age.

Regression Analysis

Regression Statistics
Multiple R           0.6340
R Square             0.4019
Adjusted R Square    0.3813
Standard Error       88.6341
Observations         61

ANOVA
              df    SS             MS             F
Regression    2     306209.3804    153104.6902    19.4889
Residual      58    455648.6622    7856.0114
Total         60    761858.0426

                Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept       450.2780        74.4900           6.0448     0.0000     301.1702     599.3859
Living Space    0.0969          0.0207            4.6903     0.0000     0.0555       0.1382
Age             -0.5151         0.8174            -0.6302    0.5311     -2.1513      1.1211
(b)
Holding constant the age, for each additional square foot in the size of the house, the mean asking price is estimated to increase by 0.0969 thousand dollars. Holding constant the living space of the house, for each additional year in age, the asking price is estimated to decrease by 0.5151 thousand dollars.
(c)
\(\hat{Y} = 450.2780 + 0.0969(2{,}000) - 0.5151(55) = 615.686\) thousand dollars.
(d)
Based on a residual analysis, the model appears to be adequate.
(e)
\(F_{STAT} = 19.4889\), p-value = 0.0000 < 0.05; reject \(H_0\). There is evidence of a significant relationship between asking price and the two independent variables (size of the house and age).
(f)
The p-value is 0.0000. The probability of obtaining a test statistic of 19.4889 or greater is virtually 0 if there is no significant relationship between asking price and the two independent variables (living space of the house and age).
(g)
\(r^2 = 0.4019\). 40.19% of the variation in asking price can be explained by variation in the size of the house and age.
(h)
\(r^2_{adj} = 0.3813\).
(i)
For \(X_1\): \(t_{STAT} = 4.6903\), p-value = 0.0000. Reject \(H_0\). The living space of the house makes a significant contribution and should be included in the model.
For \(X_2\): \(t_{STAT} = -0.6302\), p-value = 0.5311 > 0.05. Do not reject \(H_0\). Age does not make a significant contribution and should not be included in the model. Based on these results, the regression model with only the size of the house should be used.
(j)
For \(X_1\): \(t_{STAT} = 4.6903\). The probability of obtaining a sample that will yield a test statistic farther away from 0 than 4.6903 is 0.0000 if the living space does not make a significant contribution, holding age constant.
For \(X_2\): \(t_{STAT} = -0.6302\). The probability of obtaining a sample that will yield a test statistic farther away from 0 than 0.6302 is 0.5311 if the age does not make a significant contribution, holding the effect of the living space constant.
(k)
\(0.0555 \le \beta_1 \le 0.1382\). You are 95% confident that the asking price will increase by an amount somewhere between $55.50 thousand and $138.20 thousand for each additional 1,000 square feet of living space, holding constant the age of the house. In Problem 13.76, you are 95% confident that the assessed value will increase by an amount somewhere between $71.0 thousand and $137.9 thousand for each additional 1,000 square feet of living space, regardless of the age of the house.
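The confidence interval for the living-space slope is the coefficient plus or minus the critical t value times its standard error. A hedged sketch follows: the helper name `slope_confidence_interval` is hypothetical, and the critical value 2.0017 (two-tailed, 95%, 58 degrees of freedom) is assumed to be read from a t table rather than computed here.

```python
def slope_confidence_interval(b, se, t_crit):
    """Return (lower, upper) limits of b +/- t_crit * se."""
    margin = t_crit * se
    return b - margin, b + margin

# Assumed table value: t critical for 95% confidence with n - k - 1 = 58 df
T_CRIT_58 = 2.0017

# Living Space coefficient and standard error from the regression output
lower, upper = slope_confidence_interval(b=0.0969, se=0.0207, t_crit=T_CRIT_58)
print(round(lower, 4), round(upper, 4))
```

Because the inputs are rounded to four decimal places, the limits can differ from the PHStat output in the last digit.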
(l)
From PHStat, Regression Analysis, Coefficients of Partial Determination:
\(r^2_{Y1.2}\)    0.274988878
\(r^2_{Y2.1}\)    0.006799874
\(r^2_{Y1.2} = 0.2750\). Holding constant the effect of the age of the house, 27.50% of the variation in asking price can be explained by variation in the living space of the house.
\(r^2_{Y2.1} = 0.0068\). Holding constant the effect of the size of the house, 0.68% of the variation in asking price can be explained by variation in the age of the house.
(m)
Only the living space of the house should be used to predict asking price.
14.77
(a)
\(\hat{Y} = 62.1411 + 2.0567X_1 + 15.6418X_2\), where \(X_1\) = diameter of the tree at breast height of a person (in inches) and \(X_2\) = thickness of the bark (in inches).
(b)
Holding constant the effects of the thickness of the bark, for each additional inch of increase in the diameter of the tree at breast height of a person, the height of the tree is estimated to increase by a mean of 2.0567 feet. Holding constant the effects of the diameter of the tree at breast height of a person, for each additional inch of increase in the thickness of the bark, the height of the tree is estimated to increase by a mean of 15.6418 feet.
(c)
\(\hat{Y} = 62.1411 + 2.0567(25) + 15.6418(2) = 144.84\) feet.
(d)
\(r^2 = 0.7858\). So 78.58% of the total variation in the height of the tree can be explained by the variations of both the diameter of the tree at breast height of a person and the thickness of the bark of the tree.
(e)
[Residual plot: residuals versus diameter at breast height]
[Residual plot: residuals versus bark thickness]
The plot of the residuals against bark thickness indicates a potential pattern that may require the addition of nonlinear terms. One value appears to be an outlier in both plots.
(f)
F = 33.0134 with 2 and 18 degrees of freedom. p-value = 9.49912E-07 < 0.05. Reject \(H_0\). At least one of the independent variables is linearly related to the dependent variable.
(g)
\(1.1264 \le \beta_1 \le 2.9870\)
\(0.6238 \le \beta_2 \le 30.6598\)
(h)
Since 0 is not included in either of the 95% confidence intervals in (g), both explanatory variables should be included in this model.
(i)
\(134.0091 \le \mu_{Y|X} \le 155.6760\)
(j)
\(96.1452 \le Y_X \le 193.5399\)
(k)
\(r^2_{Y1.2} = 0.5452\). For a given bark thickness of the tree, 54.52% of the variation in height can be explained by variation in the diameter of the tree at the breast height of a person.
\(r^2_{Y2.1} = 0.2101\). For a given diameter of the tree at the breast height of a person, 21.01% of the variation in height can be explained by variation in bark thickness.
(l)
None of the observations have a Cook's \(D_i > F_{\alpha} = 0.8194\) with d.f. = 3 and 18. Hence, using the Studentized deleted residuals, hat matrix diagonal elements, and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model.
Both the diameter of the tree and the thickness of the bark affect the height of the tree, but the diameter is more important.
14.78
(a)
From PHStat, Taxes vs asking price and age
\(\hat{Y} = -99.0443 + 8.1105X_1 + 2.7558X_2\), where \(X_1\) = asking price and \(X_2\) = age.

Regression Analysis

Regression Statistics
Multiple R           0.9915
R Square             0.9830
Adjusted R Square    0.9824
Standard Error       119.7195
Observations         61

ANOVA
              df    SS               MS               F
Regression    2     48074668.2757    24037334.1378    1677.0894
Residual      58    831300.5637      14332.7683
Total         60    48905968.8393

                Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept       -99.0443        124.4368          -0.7959    0.4293     -348.1315    150.0430
Asking Price    8.1105          0.1510            53.7060    0.0000     7.8082       8.4127
Age             2.7558          0.9896            2.7849     0.0072     0.7750       4.7366
(b)
Holding age constant, for each additional $1,000 in asking price, the taxes are estimated to increase by a mean of $8.1105. Holding asking price constant, for each additional year of age, the taxes are estimated to increase by a mean of $2.7558.
(c)
\(\hat{Y} = -99.0443 + 8.1105(400) + 2.7558(50) = 3{,}282.928\), or $3,282.928.
(d)
Based on a residual analysis, the errors appear to be normally distributed. The equal-variance assumption appears to be valid. However, there is one very large residual, from the house that is 107 years old. Removing this point still leaves a large residual for the house that has an asking price of $550,000 and is 52 years old. However, because this model is an almost perfect fit, you may want to use this model. In this model, age is no longer significant.
(e)
\(F_{STAT} = 1{,}677.0894\), p-value = 0.0000 < 0.05; reject \(H_0\). There is evidence of a significant relationship between taxes and the two independent variables (asking price and age).
(f)
p-value = 0.0000. The probability of obtaining an \(F_{STAT}\) test statistic of 1,677.0894 or greater is virtually 0 if there is no significant relationship between taxes and the two independent variables (asking price and age).
(g)
\(r^2 = 0.9830\). 98.30% of the variation in taxes can be explained by variation in asking price and age.
(h)
\(r^2_{adj} = 0.9824\)
(i)
For \(X_1\): \(t_{STAT} = 53.7060\), p-value = 0.0000 < 0.05. Reject \(H_0\). The asking price makes a significant contribution and should be included in the model.
For \(X_2\): \(t_{STAT} = 2.7849\), p-value = 0.0072 < 0.05. Reject \(H_0\). The age of a house makes a significant contribution and should be included in the model. Based on these results, the regression model with asking price and age should be used.
(j)
For \(X_1\): p-value = 0.0000. The probability of obtaining a sample that will yield a test statistic farther from 0 than 53.7060 is 0.0000 if the asking price does not make a significant contribution, holding age constant.
For \(X_2\): p-value = 0.0072. The probability of obtaining a sample that will yield a test statistic farther from 0 than 2.7849 is 0.0072 if the age of a house does not make a significant contribution, holding the effect of the asking price constant.
(k)
\(7.8082 \le \beta_1 \le 8.4127\). You are 95% confident that the mean taxes will increase by an amount somewhere between $7.81 and $8.41 for each additional $1,000 increase in the asking price, holding constant the age. In Problem 13.77, you are 95% confident that the mean taxes will increase by an amount somewhere between $7.6447 and $8.2242 for each additional $1,000 increase in asking price, regardless of the age.
(l)
From PHStat, Regression Analysis, Coefficients of Partial Determination:
\(r^2_{Y1.2}\)    0.980287764
\(r^2_{Y2.1}\)    0.117946139
\(r^2_{Y1.2} = 0.9803\). Holding constant the effect of age, 98.03% of the variation in taxes can be explained by variation in the asking price.
\(r^2_{Y2.1} = 0.1179\). Holding constant the effect of the asking price, 11.79% of the variation in taxes can be explained by variation in the age.
(m)
Based on your answers to (b) through (k), the age of a house has an effect on its taxes.
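PHStat reports the coefficients of partial determination directly, but they can also be recovered from each variable's t statistic through the identity \(r^2_{Yj.(\text{others})} = t_j^2 / (t_j^2 + n - k - 1)\). The sketch below applies that identity to the Problem 14.78 output (n = 61, k = 2); the function name `partial_r2` is hypothetical, and this is a cross-check rather than the procedure PHStat itself uses.

```python
def partial_r2(t_stat, n, k):
    """Coefficient of partial determination from a slope's t statistic.

    n: number of observations, k: number of independent variables in the model.
    """
    df_error = n - k - 1
    return t_stat ** 2 / (t_stat ** 2 + df_error)

# t statistics for asking price (X1) and age (X2) from the regression output
r2_y1_2 = partial_r2(53.7060, n=61, k=2)
r2_y2_1 = partial_r2(2.7849, n=61, k=2)
print(round(r2_y1_2, 4), round(r2_y2_1, 4))
```

The results agree with the PHStat values to the precision of the rounded t statistics.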
14.79
(a)
From PHStat, Wins vs ERA and runs per game
\(\hat{Y} = 69.1966 + 16.9662X_1 - 15.3543X_2\), where \(X_1\) = runs per game and \(X_2\) = ERA.

Regression Analysis

Regression Statistics
Multiple R           0.9724
R Square             0.9455
Adjusted R Square    0.9415
Standard Error       3.5512
Observations         30

ANOVA
              df    SS           MS           F
Regression    2     5911.5086    2955.7543    234.3829
Residual      27    340.4914     12.6108
Total         29    6252.0000

                 Coefficients    Standard Error    t Stat      P-value    Lower 95%    Upper 95%
Intercept        69.1966         11.9427           5.7940      0.0000     44.6922      93.7011
Runs per game    16.9662         1.8068            9.3902      0.0000     13.2590      20.6735
ERA              -15.3543        1.4332            -10.7133    0.0000     -18.2950     -12.4137
(b)
The Y intercept, \(b_0\), would be the mean number of wins when the ERA and runs per game are zero. This would not be meaningful in this case because a zero ERA and zero runs per game would not be reasonable. For each one-unit increase in runs per game, the predicted mean number of wins would increase by 16.9662, while holding ERA constant. For each one-unit increase in ERA, the predicted mean number of wins would decrease by 15.3543, while holding runs per game constant.
(c)
\(\hat{Y} = 69.1966 + 16.9662(4.50) - 15.3543(4.6) = 74.9147\) wins.
(d)
There is no pattern in the relationship between residuals and the predicted value of Y, the value of ERA, or the value of runs scored per game. The regression assumptions are satisfied.
(e)
At the 0.05 significance level, there is evidence of a significant linear relationship between number of wins and the independent variables, ERA and runs scored per game. Because \(F_{STAT} = 234.3829\) and p-value = 0.0000, reject \(H_0\).
(f)
The p-value from (e) indicates that the probability of obtaining an \(F_{STAT}\) of 234.3829 or larger is 0.000 when the null hypothesis, \(\beta_1 = \beta_2 = 0\), is true.
(g)
\(r^2 = 0.9455\). Thus, 94.55% of the variation in wins can be explained by the variation in ERA and the variation in runs per game.
(h)
\(r^2_{adj} = 0.9415\). The adjusted \(r^2\) takes into account the number of independent variables and the sample size.
(i)
At the 0.05 significance level, there is evidence of a linear relationship between runs scored per game and number of wins. For runs scored per game: because \(t_{STAT} = 9.3902\) and p-value = 0.000, reject \(H_0\) and include runs scored per game in the model. At the 0.05 significance level, there is evidence of a linear relationship between ERA and the number of wins. For ERA: because \(t_{STAT} = -10.7133\) and p-value = 0.0000, reject \(H_0\) and include ERA in the model. Based on these results, both ERA and runs scored per game should be included in the model.
(j)
The p-value for the runs scored per game variable indicates that the probability of obtaining a \(t_{STAT}\) of 9.3902 or larger is 0.000 when the null hypothesis is true. The p-value for the ERA variable indicates that the probability of obtaining a \(t_{STAT}\) of -10.7133 or less is 0.000 when the null hypothesis is true.
(k)
\(13.2590 \le \beta_1 \le 20.6735\), where \(\beta_1\) is the slope for runs scored per game.
(l)
From PHStat, Regression Analysis, Coefficients of Partial Determination:
\(r^2_{Y1.2}\)    0.765575645
\(r^2_{Y2.1}\)    0.809558207
\(r^2_{Y1.2} = 0.7656\). Holding the effect of ERA constant, 76.56% of the variation in number of wins can be explained by the variation in runs scored per game.
\(r^2_{Y2.1} = 0.8096\). Holding the effect of runs scored per game constant, 80.96% of the variation in number of wins can be explained by the variation in ERA.
(m)
Pitching contributes more to the number of wins because ERA accounts for more of the variation in number of wins when holding runs per game constant (80.96%) than runs per game does when holding ERA constant (76.56%).
14.80
(a)
From PHStat, Wins vs ERA and league
\(\hat{Y} = 172.4619 - 23.5464X_1 + 3.7990X_2\), where \(X_1\) = ERA and \(X_2\) = league (American = 0, National = 1).

Regression Analysis

Regression Statistics
Multiple R           0.8858
R Square             0.7846
Adjusted R Square    0.7686
Standard Error       7.0629
Observations         30

ANOVA
              df    SS           MS           F
Regression    2     4905.1172    2452.5586    49.1647
Residual      27    1346.8828    49.8845
Total         29    6252.0000

             Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept    172.4619        9.3894            18.3677    0.0000     153.1964     191.7273
ERA          -23.5464        2.3747            -9.9156    0.0000     -28.4188     -18.6739
League       3.7990          2.6114            1.4548     0.1573     -1.5591      9.1572
(b)
Holding constant the effect of the league, for each one-unit increase in ERA, the mean number of wins is estimated to decrease by 23.5464. For a given ERA, a team in the National League is estimated to have 3.7990 more wins than a team in the American League.
(c)
\(\hat{Y} = 172.4619 - 23.5464(4.50) + 3.7990(0) = 66.503\) wins.
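The role of the league dummy variable can be illustrated by evaluating the fitted equation for both leagues at the same ERA. This is a sketch only: the function name `predict_wins_by_league` is hypothetical, and the National League prediction is an added illustration that is not part of the original solution.

```python
# Coefficients from the Problem 14.80 regression output
B0, B_ERA, B_LEAGUE = 172.4619, -23.5464, 3.7990

def predict_wins_by_league(era, national_league):
    """Estimated mean wins; the dummy variable is 0 for American, 1 for National."""
    league = 1 if national_league else 0
    return B0 + B_ERA * era + B_LEAGUE * league

american = predict_wins_by_league(4.50, national_league=False)
national = predict_wins_by_league(4.50, national_league=True)

# The two predictions differ by exactly the dummy-variable coefficient
print(round(american, 3), round(national, 3), round(national - american, 4))
```

Because the model has no interaction term, the dummy variable shifts the intercept only; the ERA slope is the same for both leagues.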
(d)
Based on a residual analysis, there is no pattern in the errors. There is no apparent violation of other assumptions.
(e)
\(F_{STAT} = 49.1647 > 3.35\), p-value = 0.0000 < 0.05; reject \(H_0\). There is evidence of a significant relationship between wins and the two independent variables (ERA and league).
(f)
For \(X_1\): \(t_{STAT} = -9.9156 < -2.0518\), p-value = 0.0000. Reject \(H_0\). ERA makes a significant contribution and should be included in the model.
For \(X_2\): \(t_{STAT} = 1.4548 < 2.0518\), p-value = 0.1573 > 0.05. Do not reject \(H_0\). The league does not make a significant contribution and should not be included in the model.
(g) (h)
Based on these results, the regression model with only the ERA as the independent variable should be used.
\(-28.4188 \le \beta_1 \le -18.6739\)
\(-1.5591 \le \beta_2 \le 9.1572\)
(i)
\(r^2_{adj} = 0.7686\). 76.86% of the variation in wins can be explained by the variation in ERA and league after adjusting for the number of independent variables and the sample size.
(j)
From PHStat, Regression Analysis, Coefficients of Partial Determination:
\(r^2_{Y1.2}\)    0.78454931
\(r^2_{Y2.1}\)    0.072687011
\(r^2_{Y1.2} = 0.7845\). Holding constant the effect of league, 78.45% of the variation in number of wins can be explained by the variation in ERA.
\(r^2_{Y2.1} = 0.0727\). Holding constant the effect of ERA, 7.27% of the variation in number of wins can be explained by the variation in league.
(k)
The slope of the number of wins with ERA is the same, regardless of whether the team belongs to the American League or the National League.
(l)
For \(X_1X_2\): \(t_{STAT} = -0.2083 > -2.0555\), p-value = 0.8366 > 0.05. Do not reject \(H_0\). There is no evidence that the interaction term makes a contribution to the model.
(m)
The model with one independent variable (ERA) should be used.
14.81
Model with interaction terms:
\(\hat{Y} = 983.4037 + 0.0249X_1 - 5.8443X_2 - 1.9300X_3 - 0.0000X_1X_2 + 0.0205X_1X_3 - 0.4307X_2X_3\)
where \(X_1\) = House size, \(X_2\) = Age, \(X_3\) = 0 if Glen Cove, 1 if Merrick

PHStat output:

Regression Statistics
Multiple R           0.7279
R Square             0.5298
Adjusted R Square    0.5106
Standard Error       271.5296
Observations         154

ANOVA
              df     SS               MS              F
Regression    6      12210914.1616    2035152.3603    27.6034
Residual      147    10838062.6760    73728.3175
Total         153    23048976.8377

                 Coefficients    Standard Error    t Stat     P-value
Intercept        983.4037        148.7121          6.6128     0.0000
House Size       0.0249          0.0059            4.2033     0.0000
Age              -5.8443         1.8211            -3.2092    0.0016
Location         -1.9300         158.0724          -0.0122    0.9903
Size*Age         0.0000          0.0001            -0.3661    0.7148
Size*Location    0.0205          0.0089            2.3108     0.0222
Age*Location     -0.4307         1.9446            -0.2215    0.8250
Model without interaction terms:
\(\hat{Y} = 986.1017 + 0.0247X_1 - 6.2045X_2 + 133.0510X_3\)
where \(X_1\) = House size, \(X_2\) = Age, \(X_3\) = 0 if Glen Cove, 1 if Merrick

PHStat output:

Regression Statistics
Multiple R           0.7157
R Square             0.5122
Adjusted R Square    0.5024
Standard Error       273.7885
Observations         154

ANOVA
              df     SS               MS              F
Regression    3      11804955.7623    3934985.2541    52.4944
Residual      150    11244021.0753    74960.1405
Total         153    23048976.8377

              Coefficients    Standard Error    t Stat     P-value
Intercept     986.1017        80.3543           12.2719    0.0000
House Size    0.0247          0.0025            9.8176     0.0000
Age           -6.2045         0.8707            -7.1260    0.0000
Location      133.0510        49.4102           2.6928     0.0079
Partial F test for the interaction effects:
\(H_0: \beta_4 = \beta_5 = \beta_6 = 0\)
\(H_1\): Not all \(\beta_j = 0\) for \(j = 4, 5, 6\)
\(F_{STAT} = \dfrac{[SSR(X_1, X_2, X_3, X_4, X_5, X_6) - SSR(X_1, X_2, X_3)]/3}{MSE(X_1, X_2, X_3, X_4, X_5, X_6)} = 13.3268\) with 3 numerator and 53 denominator degrees of freedom. The p-value is 0.0000. At the 5% level of significance, the interaction terms are significant together.
Individual t test of the slope parameters:
\(H_0: \beta_j = 0\)
\(H_1: \beta_j \ne 0\)
Using the 5% level of significance, the interaction between land and age, and the interaction between age and the Glen Cove dummy variable are significant in explaining the variation of fair market value.
Model with land, land and age interaction and land and Glen Cove dummy interaction:

PHStat output:

Regression Statistics
Multiple R           0.7160
R Square             0.5127
Adjusted R Square    0.4962
Standard Error       275.4819
Observations         154

ANOVA
              df     SS               MS              F
Regression    5      11817218.7218    2363443.7444    31.1429
Residual      148    11231758.1158    75890.2575
Total         153    23048976.8377

                Coefficients    Standard Error    t Stat     P-value
Intercept       944.6140        149.9124          6.3011     0.0000
House Size      0.0269          0.0059            4.5285     0.0000
Age             -5.6850         1.8463            -3.0791    0.0025
Location        156.2574        144.5545          1.0810     0.2815
Size*Age        0.0000          0.0001            -0.3997    0.6900
Age*Location    -0.2788         1.9718            -0.1414    0.8877

\(\hat{Y} = 944.6140 + 0.0269X_1 - 5.6850X_2 + 156.2574X_3 - 0.0X_1X_2 - 0.2788X_2X_3\)
\(H_0: \beta_j = 0\)
\(H_1: \beta_j \ne 0\)
All the slope parameters are significant individually at the 5% level of significance. The final model should use land, age, Glen Cove dummy variable, land and age interaction, and age and Glen Cove dummy variable interaction.
14.82
The multiple regression model is Predicted base salary = 48,091.7853 + 8,249.2156 (gender) + 1,061.4521 (age). Holding constant the age of the person, the mean base salary is predicted to be $8,249.22 higher for males than for females. Holding constant the gender of the person, for each additional year of age, the mean base salary is predicted to be $1,061.45 higher. The regression model with the two independent variables has F = 118.0925 and a p-value = 0.0000. So, you can conclude that at least one of the independent variables makes a significant contribution to the model to predict base pay. Each independent variable makes a significant contribution to the regression model given that the other variable is included (\(t_{STAT} = 3.9937\), p-value = 0.0001 for gender and \(t_{STAT} = 14.8592\), p-value = 0.0000 for age). Both independent variables should be included in the model. 37.01% of the variation in base salary can be explained by gender and age.
There is no pattern in the residuals and no other violations of the assumptions, so the model appears to be appropriate. Including an interaction term of gender and age does not significantly improve the model (\(t_{STAT} = -0.2371\), p-value = 0.8127 > 0.05). You can conclude that females are paid less than males holding constant the age of the person. Perhaps other variables such as department, seniority, and score on a performance evaluation can be included in the model to see if the model is improved.
14.83
Excel output:

Regression Statistics
Multiple R           0.7520
R Square             0.5655
Adjusted R Square    0.4785
Standard Error       0.9136
Observations         19

ANOVA
              df    SS         MS        F         Significance F
Regression    3     16.2908    5.4303    6.5063    0.0049
Residual      15    12.5192    0.8346
Total         18    28.8100

             Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept    -18.6915        7.9789            -2.3426    0.0334     -35.6982     -1.6848
Viscosity    0.0121          0.0082            1.4817     0.1591     -0.0053      0.0296
Pressure     0.0844          0.0414            2.0415     0.0592     -0.0037      0.1726
Plate Gap    0.5000          0.1379            3.6271     0.0025     0.2062       0.7938

The \(r^2\) of the multiple regression is 0.5655. So 56.55% of the variation in tear rating can be explained by the variation of viscosity, pressure, and plate gap on the bag-sealing equipment. The F test statistic for the combined significance of viscosity, pressure, and plate gap on the bag-sealing equipment is 6.5063 with a p-value of 0.0049. Hence, at a 5% level of significance, there is enough evidence to conclude that viscosity, pressure, and plate gap on the bag-sealing equipment affect tear rating. The p-value of the t test for the significance of viscosity is 0.1591, which is larger than 5%. Hence, there is not sufficient evidence to conclude that viscosity affects tear rating holding constant the effect of pressure and plate gap on the bag-sealing equipment. The p-value of the t test for the significance of pressure is 0.0592, which is also larger than 5%. There is not enough evidence to conclude that pressure affects tear rating at the 5% level of significance holding constant the effect of viscosity and plate gap on the bag-sealing equipment. The p-value of the t test for the significance of plate gap is 0.0025, which is smaller than 5%. There is enough evidence to conclude that plate gap affects tear rating at the 5% level of significance holding constant the effect of viscosity and pressure.
Excel output after dropping viscosity and pressure:

Regression Statistics
Multiple R           0.6173
R Square             0.3811
Adjusted R Square    0.3447
Standard Error       1.0241
Observations         19

ANOVA
              df    SS         MS         F          Significance F
Regression    1     10.9800    10.9800    10.4689    0.0049
Residual      17    17.8300    1.0488
Total         18    28.8100

             Coefficients    Standard Error    t Stat    P-value    Lower 95%    Upper 95%
Intercept    0.7500          0.2349            3.1922    0.0053     0.2543       1.2457
Plate Gap    0.5000          0.1545            3.2356    0.0049     0.1740       0.8260

Plate gap still remains statistically significant at the 5% level of significance. Hence, only plate gap on the bag-sealing equipment needs to be used in the model.
[Residual plot: residuals versus X]
The residual plot suggests that the equal variance assumption is likely violated.
[Normal probability plot: residuals versus Z value]
[Boxplot of residuals]
The normal probability plot and the boxplot both suggest that the normal distribution assumption is also likely violated.
14.84
\(b_0 = 18.2892\), (die temperature) \(b_1 = 0.5976\), (die diameter) \(b_2 = -13.5108\). The \(r^2\) of the multiple regression model is 0.3257, so 32.57% of the variation in unit density can be explained by the variation of die temperature and die diameter. The F test statistic for the combined significance of die temperature and die diameter is 5.0718 with a p-value of 0.0160. Hence, at a 5% level of significance, there is enough evidence to conclude that die temperature and die diameter affect unit density. The p-value of the t test for the significance of die temperature is 0.2117, which is greater than 5%. Hence, there is insufficient evidence to conclude that die temperature affects unit density holding constant the effect of die diameter. The p-value of the t test for the significance of die diameter is 0.0083, which is less than 5%. There is enough evidence to conclude that die diameter affects unit density at the 5% level of significance holding constant the effect of die temperature. After removing die temperature from the model, \(b_0 = 107.9267\), (die diameter) \(b_1 = -13.5108\). The \(r^2\) of the regression is 0.2724. So 27.24% of the variation in unit density can be explained by the variation of die diameter. The p-value of the t test for the significance of die diameter is 0.0087, which is less than 5%. There is enough evidence to conclude that die diameter affects unit density at the 5% level of significance. There is some lack of equality in the variance of the residuals and some departure from normality.
14.85
Excel output:

Regression Statistics
Multiple R           0.3101
R Square             0.0961
Adjusted R Square    0.0101
Standard Error       1.4439
Observations         24

ANOVA
              df    SS         MS        F         Significance F
Regression    2     4.6572     2.3286    1.1169    0.3460
Residual      21    43.7810    2.0848
Total         23    48.4382

                   Coefficients    Standard Error    t Stat     P-value    Lower 95%    Upper 95%
Intercept          1.6308          9.0843            0.1795     0.8592     -17.2609     20.5226
Die Temperature    0.0681          0.0589            1.1550     0.2611     -0.0545      0.1907
Die Diameter       -0.5592         0.5895            -0.9486    0.3536     -1.7850      0.6667

The \(r^2\) of the multiple regression is 0.0961. So 9.61% of the variation in foam diameter can be explained by the variation of die temperature and die diameter. The F test statistic for the combined significance of die temperature and die diameter is 1.1169 with a p-value of 0.3460. Hence, at a 5% level of significance, there is not enough evidence to conclude that die temperature and die diameter affect foam diameter. The p-value of the t test for the significance of die temperature is 0.2611, which is larger than 5%. Hence, there is not sufficient evidence to conclude that die temperature affects foam diameter holding constant the effect of die diameter. The p-value of the t test for the significance of die diameter is 0.3536, which is also larger than 5%. There is not enough evidence to conclude that die diameter affects foam diameter at the 5% level of significance holding constant the effect of die temperature. Neither of the two independent variables should be kept in the model.
Chapter 15
15.1
Multicollinearity exists when the correlation among the supposedly independent variables used to estimate a dependent variable is so high that the contribution of each independent variable to variation in the dependent variable cannot be determined.
15.2
\(VIF = \dfrac{1}{1 - 0.7} = 3.33\)
15.3
\(VIF = \dfrac{1}{1 - 0.2} = 1.25\)
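The variance inflationary factor used in Problems 15.2 and 15.3 is \(VIF_j = 1/(1 - R_j^2)\), where \(R_j^2\) is the coefficient of determination from regressing \(X_j\) on all the other X variables. A minimal sketch of the calculation follows; the function name `vif` is hypothetical.

```python
def vif(r_squared_j):
    """Variance inflationary factor for an X variable.

    r_squared_j: r-squared from regressing X_j on all other X variables.
    """
    if not 0 <= r_squared_j < 1:
        raise ValueError("r_squared_j must be in the interval [0, 1)")
    return 1 / (1 - r_squared_j)

vif_15_2 = vif(0.7)   # Problem 15.2
vif_15_3 = vif(0.2)   # Problem 15.3
print(round(vif_15_2, 2), round(vif_15_3, 2))
```

The same function reproduces the values in Problems 15.4 through 15.8 when given the R Square reported in each PHStat output.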
15.4
From PHStat:

Regression Analysis    Efficiency Ratio and all other X    Risk-Based Capital and all other X
Regression Statistics
Multiple R             0.0184                              0.0184
R Square               0.0003                              0.0003
Adjusted R Square      -0.0047                             -0.0047
Standard Error         8.4810                              24.4994
Observations           200                                 200
VIF                    1.0003                              1.0003

\(R_1^2 = 0.0003\), \(VIF_1 = \dfrac{1}{1 - 0.0003} = 1.0003\)
\(R_2^2 = 0.0003\), \(VIF_2 = \dfrac{1}{1 - 0.0003} = 1.0003\)
There is no evidence of collinearity because both VIFs are < 5.
15.5
\(VIF = \dfrac{1}{1 - 0.0535} = 1.0565\)
There is no reason to suspect the existence of collinearity.
15.6
From PHStat:

Regression Analysis    Worldwide Revenues and all other X    Number of New Graduates Hired and all other X
Regression Statistics
Multiple R             0.3157                                0.3157
R Square               0.0997                                0.0997
Adjusted R Square      0.0875                                0.0875
Standard Error         29.9736                               1294.3388
Observations           76                                    76
VIF                    1.1107                                1.1107

\(R_1^2 = 0.0997\), \(VIF_1 = \dfrac{1}{1 - 0.0997} = 1.1107\)
\(R_2^2 = 0.0997\), \(VIF_2 = \dfrac{1}{1 - 0.0997} = 1.1107\)
There is no evidence of collinearity because both VIFs are < 5.
15.7
\(R_1^2 = 0.1444\), \(VIF_1 = \dfrac{1}{1 - 0.1444} = 1.169\)
\(R_2^2 = 0.1444\), \(VIF_2 = \dfrac{1}{1 - 0.1444} = 1.169\)
There is no reason to suspect the existence of collinearity.
15.8
From PHStat Regression Analysis
Regression Analysis

Regression Statistics    House Size and all other X    Age and all other X
Multiple R               0.1282                        0.1282
R Square                 0.0164                        0.0164
Adjusted R Square        0.0000                        0.0000
Standard Error           13312.5581                    29.2630
Observations             62                            62
VIF                      1.0167                        1.0167

VIF = 1/(1 – 0.0164) = 1.0167
There is no evidence of collinearity.
15.9
In feature selection, the independent variables to include in the model are selected from the list of candidate variables.
15.10
The principle of parsimony allows you to choose the minimum number of independent variables to include in a model.
15.11
(a)
For the model that includes independent variables A and B, the value of Cp exceeds 3, the number of parameters, so this model does not meet the criterion for further consideration. For the model that includes independent variables A and C, the value of Cp is less than or equal to 3, the number of parameters, so this model does meet the criterion for further consideration. For the model that includes independent variables A, B, and C, the value of Cp exceeds 4, the number of parameters, so this model does not meet the criterion for further consideration. The inclusion of variable C in the model does not appear to improve the model’s ability to explain variation in the dependent variable sufficiently to justify its inclusion in a model that contains only variables A and B.
15.12
Stepwise regression uses the t or F statistics to determine whether a variable should be entered into or deleted from a model, while best subsets regression uses the Cp statistic to determine the best models to consider. Stepwise regression attempts to find the best regression model without examining all possible regressions by adding and subtracting X variables at each step of the process. Best subsets regression examines each possible regression model and uses the Cp statistic to determine which models can be considered to be good fitting models.
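To make "each possible regression model" concrete, the following Python sketch enumerates every nonempty subset of a set of candidate variables, which is the list of models a best subsets procedure would have to fit. The variable labels X1 to X4 and the helper name all_subsets are illustrative assumptions; no regression is fitted here.

```python
from itertools import combinations

# Hypothetical labels for four candidate independent variables
candidates = ["X1", "X2", "X3", "X4"]


def all_subsets(variables):
    """Return every nonempty subset of the candidate variables."""
    subsets = []
    for size in range(1, len(variables) + 1):
        subsets.extend(combinations(variables, size))
    return subsets


models = all_subsets(candidates)
print(len(models), "candidate models")
for model in models:
    print("".join(model))
```

With k candidate variables there are 2^k – 1 such models, so the count grows quickly as candidates are added. Stepwise regression avoids fitting all of them, at the cost of possibly missing a good model.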
15.13
From PHStat, after Multiple Regression Analysis, including VIF, where Y = mean starting salary, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Regression Analysis

Regression Statistics
Multiple R           0.9145
R Square             0.8364
Adjusted R Square    0.8145
Standard Error       14920.0070
Observations         35

ANOVA
              df    SS                  MS                 F
Regression    4     34131388662.4270    8532847165.6067    38.3315
Residual      30    6678198270.2588     222606609.0086
Total         34    40809586932.6857

                                           Coefficients    Standard Error    t Stat     P-value    R Square    VIF
Intercept                                  44742.2327      40286.3146        1.1106     0.2756
Per-Year Tuition                           0.5324          0.2417            2.2030     0.0354     0.4823      1.9316
Average GMAT Score                         30.6930         59.8615           0.5127     0.6119     0.5586      2.2653
Acceptance Percentage                      -547.3644       190.2569          -2.8770    0.0073     0.5942      2.4643
Graduates Employed at Graduation Pctage    885.9678        202.7716          4.3693     0.0001     0.4980      1.9920
Based on a full regression model involving all of the variables, all the VIF values (1.9316, 2.2653, 2.4643, 1.9920, respectively) are less than 5. There is no reason to suspect the existence of collinearity.
From PHStat, Best Subsets Analysis, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Best Subsets Analysis

Intermediate Calculations
R2T        0.836357
1 - R2T    0.163643
n          35
T          5
n-T        30

Model       Cp         k+1    R Square    Adj. R Square    Std. Error
X1          67.3226    2      0.4637      0.4474           25753.6529
X2          62.7813    2      0.4884      0.4729           25151.8681
X3          30.8211    2      0.6628      0.6526           20421.1629
X4          30.6912    2      0.6635      0.6533           20399.6963
X1X2        47.2291    3      0.5842      0.5582           23027.9093
X1X3        24.2435    3      0.7096      0.6914           19245.4138
X1X4        10.7179    3      0.7833      0.7698           16622.1443
X2X3        23.9291    3      0.7113      0.6932           19188.5066
X2X4        21.3111    3      0.7256      0.7084           18707.9312
X3X4        7.8969     3      0.7987      0.7862           16020.9679
X1X2X3      22.0907    4      0.7322      0.7063           18775.3386
X1X2X4      11.2770    4      0.7912      0.7710           16578.9516
X1X3X4      3.2629     4      0.8349      0.8189           14741.5590
X2X3X4      7.8533     4      0.8099      0.7915           15820.1386
X1X2X3X4    5.0000     5      0.8364      0.8145           14920.0070

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X1, X3 and X4, which has Cp = 3.2629. Models that add other variables do not change the results very much.
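The Cp values in the best subsets output can be approximately reproduced from the intermediate calculations. The sketch below assumes the usual textbook form of the statistic, Cp = (1 – R2k)(n – T)/(1 – R2T) – [n – 2(k + 1)], where k is the number of independent variables in the model and T is the number of parameters in the full model. The function names are hypothetical, and because the R Square column is rounded to four decimals the results agree with PHStat only to about two decimals.

```python
def mallows_cp(r2_k: float, r2_full: float, n: int, k: int, total_params: int) -> float:
    """Cp statistic for a model with k independent variables.

    r2_k         R-square of the candidate model
    r2_full      R-square of the full model (R2T)
    n            number of observations
    total_params number of parameters in the full model, including the intercept (T)
    """
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))


def meets_cp_criterion(cp: float, k: int) -> bool:
    """A model is considered further when Cp <= k + 1."""
    return cp <= k + 1


# Values from the 15.13 best subsets output: n = 35, T = 5, R2T = 0.836357
n, T, r2_full = 35, 5, 0.836357
for name, k, r2_k in [("X3X4", 2, 0.7987), ("X1X3X4", 3, 0.8349), ("X1X2X3X4", 4, 0.836357)]:
    cp = mallows_cp(r2_k, r2_full, n, k, T)
    print(f"{name}: Cp = {cp:.2f}, considered further: {meets_cp_criterion(cp, k)}")
```

The full model always has Cp equal to T, which is why X1X2X3X4 shows Cp = 5.0000 in the table.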
The residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot indicates a slight deviation from normality.
From PHStat, Stepwise Regression Analysis, X1 = tuition, X2 = average GMAT score, X3 = acceptance percentage, X4 = percent with job offers

Stepwise Regression Analysis
Table of Results for General Stepwise

Graduates Employed at Graduation Pctage entered.

              df    SS                  MS                  F
Regression    1     27076715818.2036    27076715818.2036    65.0652
Residual      33    13732871114.4821    416147609.5298
Total         34    40809586932.6857

                                           Coefficients    Standard Error    t Stat    P-value
Intercept                                  26392.8575      14562.0657        1.8124    0.0790
Graduates Employed at Graduation Pctage    1584.5164       196.4366          8.0663    0.0000

Acceptance Percentage entered.

              df    SS                  MS                  F
Regression    2     32596101726.9610    16298050863.4805    63.4977
Residual      32    8213485205.7247     256671412.6789
Total         34    40809586932.6857

                                           Coefficients    Standard Error    t Stat     P-value
Intercept                                  102703.4711     20039.8461        5.1250     0.0000
Graduates Employed at Graduation Pctage    955.2549        205.4603          4.6493     0.0001
Acceptance Percentage                      -803.7266       173.3212          -4.6372    0.0001

Per-Year Tuition entered.

              df    SS                  MS                  F
Regression    3     34072866482.0294    11357622160.6765    52.2638
Residual      31    6736720450.6563     217313562.9244
Total         34    40809586932.6857

                                           Coefficients    Standard Error    t Stat     P-value
Intercept                                  61063.9936      24395.8895        2.5030     0.0178
Graduates Employed at Graduation Pctage    919.6437        189.5455          4.8518     0.0000
Acceptance Percentage                      -569.6082       183.0291          -3.1121    0.0040
Per-Year Tuition                           0.5782          0.2218            2.6068     0.0139

No other variables could be entered into the model. Stepwise ends.
Based on a stepwise regression analysis with all the original variables, only X1, X3 and X4 make a significant contribution to the model at the 0.05 level. Thus, the best model is the model using tuition (X1), acceptance percentage (X3), and percent with job offers (X4).
15.14
From PHStat, after Multiple Regression Analysis, including VIF, where Y = asking price, X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace (0 = No, 1 = Yes).

Regression Analysis

Regression Statistics
Multiple R           0.6970
R Square             0.4859
Adjusted R Square    0.4287
Standard Error       85.1687
Observations         61

ANOVA
              df    SS             MS            F
Regression    6     370157.9011    61692.9835    8.5050
Residual      54    391700.1416    7253.7063
Total         60    761858.0426

                Coefficients    Standard Error    t Stat     P-value    R Square    VIF
Intercept       372.6087        95.1280           3.9169     0.0003
Lot Size        69.3349         81.5135           0.8506     0.3988     0.2833      1.3953
Living Space    0.0740          0.0235            3.1459     0.0027     0.5278      2.1175
Bedrooms        8.8378          17.2349           0.5128     0.6102     0.5210      2.0878
Bathrooms       3.1427          22.1538           0.1419     0.8877     0.5751      2.3537
Age             -0.3351         0.8532            -0.3927    0.6961     0.4384      1.7807
Fireplace       66.8626         25.0072           2.6737     0.0099     0.0858      1.0939

Based on a full regression model involving all of the variables, all the VIF values (1.3953, 2.1175, 2.0878, 2.3537, 1.7807, 1.0939, respectively) are less than 5. There is no reason to suspect the existence of collinearity.
From PHStat, Best Subsets Analysis (partial display), X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace

Best Subsets Analysis

Intermediate Calculations
R2T        0.485862
1 - R2T    0.514138
n          61
T          7
n-T        54

Model    Cp         k+1    R Square    Adj. R Square    Std. Error
X1       28.9997    2      0.1812      0.1673           102.8259
X2       6.2460     2      0.3978      0.3876           88.1801
X3       32.4034    2      0.1488      0.1344           104.8410
X4       31.0924    2      0.1613      0.1471           104.0694
X5       29.6414    2      0.1751      0.1611           103.2088
X6       32.6028    2      0.1469      0.1324           104.9578
X1X2     6.3800     3      0.4156      0.3954           87.6152
X1X3     24.0514    3      0.2473      0.2214           99.4307
X1X4     23.8039    3      0.2497      0.2238           99.2750
X1X5     23.4593    3      0.2530      0.2272           99.0577
X1X6     21.3329    3      0.2732      0.2482           97.7062
X2X3     8.0988     3      0.3992      0.3785           88.8334
X2X4     8.0059     3      0.4001      0.3794           88.7680
X2X5     7.8160     3      0.4019      0.3813           88.6341
X2X6     0.8703     3      0.4681      0.4497           83.5904

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X2 and X6, which has Cp = 0.8703. Models that add other variables do not change the results very much.
From PHStat, Stepwise Regression Analysis, X1 = lot size, X2 = living space, X3 = number of bedrooms, X4 = number of bathrooms, X5 = age, X6 = fireplace

Stepwise Regression Analysis
Table of Results for General Stepwise

Living Space entered.

              df    SS             MS             F
Regression    1     303089.8144    303089.8144    38.9789
Residual      59    458768.2282    7775.7327
Total         60    761858.0426

                Coefficients    Standard Error    t Stat     P-value
Intercept       408.2614        33.0407           12.3563    0.0000
Living Space    0.1044          0.0167            6.2433     0.0000

Fireplace entered.

              df    SS             MS             F
Regression    2     356591.2010    178295.6005    25.5169
Residual      58    405266.8416    6987.3593
Total         60    761858.0426

                Coefficients    Standard Error    t Stat     P-value
Intercept       377.8322        33.1954           11.3821    0.0000
Living Space    0.0957          0.0162            5.9176     0.0000
Fireplace       66.2138         23.9289           2.7671     0.0076

No other variables could be entered into the model. Stepwise ends.
Based on a stepwise regression analysis with all the original variables, only X2 and X6 make a significant contribution to the model at the 0.05 level. Thus, the best model is the model using the living area of the house (X2) and fireplace (X6). This was the model developed in Section 14.6.
15.15
From PHStat, after Multiple Regression Analysis, including VIF, where Y = revenue, X1 = number of partners, X2 = number of offices, X3 = Southeast Region (1 = Yes, 0 = No), X4 = Gulf Coast Region (1 = Yes, 0 = No).

Regression Analysis

Regression Statistics
Multiple R           0.9148
R Square             0.8369
Adjusted R Square    0.8259
Standard Error       29.1776
Observations         64

ANOVA
              df    SS             MS            F
Regression    4     257811.6969    64452.9242    75.7081
Residual      59    50228.7157     851.3342
Total         63    308040.4126

                      Coefficients    Standard Error    t Stat     P-value    R Square    VIF
Intercept             4.7384          6.9717            0.6797     0.4994
Number of Partners    1.3007          0.1463            8.8900     0.0000     0.7267      3.6584
Number of Offices     0.2458          1.2073            0.2036     0.8394     0.7278      3.6743
Southeast Region      4.5279          9.1767            0.4934     0.6236     0.2998      1.4281
Gulf Coast Region     -5.1724         9.2033            -0.5620    0.5762     0.3038      1.4364

Based on a full regression model involving all of the variables, all the VIF values (3.6584, 3.6743, 1.4281, and 1.4364, respectively) are less than 5. There is no reason to suspect the existence of collinearity.
From PHStat, Best Subsets Analysis (partial display), X1 = number of partners, X2 = number of offices, X3 = Southeast Region, X4 = Gulf Coast Region.

Best Subsets Analysis

Intermediate Calculations
R2T        0.836941
1 - R2T    0.163059
n          64
T          5
n-T        59

Model       Cp          k+1    R Square    Adj. R Square    Std. Error
X1          0.1917      2      0.8336      0.8310           28.7490
X2          83.8278     2      0.6025      0.5961           44.4402
X3          292.1319    2      0.0268      0.0111           69.5355
X4          301.8157    2      0.0000      -0.0161          70.4852
X1X2        2.1899      3      0.8337      0.8282           28.9832
X1X3        1.3327      3      0.8360      0.8306           28.7761
X1X4        1.2836      3      0.8362      0.8308           28.7642
X2X3        80.7616     3      0.6165      0.6039           44.0068
X2X4        82.0991     3      0.6128      0.6001           44.2184
X3X4        289.8578    3      0.0386      0.0071           69.6764
X1X2X3      3.3159      4      0.8361      0.8279           29.0108
X1X2X4      3.2435      4      0.8363      0.8281           28.9931
X1X3X4      3.0415      4      0.8368      0.8287           28.9436
X2X3X4      82.0321     4      0.6185      0.5994           44.2552
X1X2X3X4    5.0000      5      0.8369      0.8259           29.1776

Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with only variable X1, which has Cp = 0.1917. Models that add other variables do not change the results very much.
Residuals
The residual plot versus the number of partners reveals some evidence for potential deviation from the equal variance assumption. The normal probability plot reveals evidence of potential departure from the normality assumption.
From PHStat, Stepwise Regression Analysis, X1 = number of partners, X2 = number of offices, X3 = Southeast Region, X4 = Gulf Coast Region.

Stepwise Regression Analysis
Table of Results for General Stepwise

Number of Partners entered.

              df    SS             MS             F
Regression    1     256797.1392    256797.1392    310.7027
Residual      62    51243.2734     826.5044
Total         63    308040.4126

                      Coefficients    Standard Error    t Stat     P-value
Intercept             4.8429          4.5358            1.0677     0.2898
Number of Partners    1.3285          0.0754            17.6268    0.0000

No other variables could be entered into the model. Stepwise ends.
Based on a stepwise regression analysis with all the original variables, only X1 makes a significant contribution to the model at the 0.05 level. Thus, the best model is the model using the number of partners (X1).
15.16
Leave-one-out cross-validation (LOOCV) divides the data into a number of parts equal to the sample size, and is best used for smaller data sets, not for a sample of n = 10,000.
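As a rough illustration of why LOOCV suits small samples, the sketch below generates the training and test indices for each round. The generator name loocv_splits is a hypothetical helper and no model is fitted; the point is only that n observations produce n separate splits, each requiring its own model fit.

```python
def loocv_splits(n):
    """Yield (training_indices, test_index) pairs for leave-one-out cross-validation."""
    for i in range(n):
        training = [j for j in range(n) if j != i]
        yield training, i


# With 5 observations there are 5 rounds, each holding out one observation
splits = list(loocv_splits(5))
for training, test in splits:
    print(f"train on {training}, test on observation {test}")
```

For n = 10,000 the same procedure would require 10,000 model fits, which is why another cross-validation method is preferred for a sample of that size.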
15.17
An overfit model is one that is too specific to a sample to be of best use for all possible samples.
15.18
The holdout method divides the data into two parts, the training and test sets, and then holds out the test set in the initial analysis.
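A minimal sketch of the holdout method follows. The 30% test fraction, the fixed seed, and the function name holdout_split are assumptions chosen for illustration; the source does not prescribe a particular split.

```python
import random


def holdout_split(rows, test_fraction=0.3, seed=0):
    """Shuffle the rows and split them into a training set and a test set."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


training, test = holdout_split(range(10))
print("training set:", training)
print("test set (held out of the initial analysis):", test)
```

The model would be developed using only the training set, and the test set would then be used to evaluate how well that model performs on data it has not seen.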
15.19
Answers may vary.
15.20
Answers may vary.
15.21
Answers may vary.
15.22
Estimated Probability of Success = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.75/(1 + 0.75) = 0.4286
15.23
(a)
Holding constant the effects of X2, for each additional unit of X1 the natural logarithm of the odds ratio is estimated to increase by a mean of 0.5. Holding constant the effects of X1, for each additional unit of X2 the natural logarithm of the odds ratio is estimated to increase by a mean of 0.2.
(b)
ln(estimated odds ratio) = 0.1 + 0.5 X1 + 0.2 X2 = 0.1 + 0.5(2) + 0.2(1.5) = 1.4
Estimated odds ratio = e^1.4 = 4.055. The estimated odds of "success" to failure are 4.055 to 1.
(c)
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 4.055/(1 + 4.055) = 0.8022
15.24
(a)
ln(estimated odds ratio) = –6.94 + 0.13947X1 + 2.774X2 = –6.94 + 0.13947(36) + 2.774(0) = –1.91908
Estimated odds ratio = e^(–1.91908) = 0.1467
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.1467/(1 + 0.1467) = 0.1280
(b)
From the text discussion of the example, 70.16% of the individuals who charge $36,000 per annum and possess additional cards can be expected to purchase the premium card. Only 12.80% of the individuals who charge $36,000 per annum and do not possess additional cards can be expected to purchase the premium card. For a given amount of money charged per annum, the likelihood of purchasing a premium card is substantially higher among individuals who already possess additional cards than for those who do not possess additional cards.
(c)
ln(estimated odds ratio) = –6.94 + 0.13947X1 + 2.774X2 = –6.94 + 0.13947(18) + 2.774(0) = –4.42954
Estimated odds ratio = e^(–4.42954) = 0.0119
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.0119/(1 + 0.0119) = 0.01178
(d)
Among individuals who do not purchase additional cards, the likelihood of purchasing a premium card diminishes dramatically with a substantial decrease in the amount charged per annum.
15.25
(a)
Let X1 = distance traveled to rehabilitation in km, X2 = whether the person had a car (0 = no, 1 = yes), X3 = age of the person in years.
ln(estimated odds) = 5.7765 – 0.0675 X1 + 1.9369 X2 – 0.0599 X3
(b)
ln(estimated odds) = 5.7765 – 0.0675(20) + 1.9369(1) – 0.0599(65) = 2.4699
Estimated odds ratio = e^2.4699 = 11.8213
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.9220
(c)
ln(estimated odds) = 5.7765 – 0.0675(20) + 1.9369(0) – 0.0599(65) = 0.533
Estimated odds ratio = e^0.533 = 1.7040
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.6301
(d)
Holding everything else the same, a person who has a car has a much higher probability of participating in rehabilitation.
(e)
X1: test statistic = –6.113. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that the distance traveled makes a significant contribution to the model.
X2: test statistic = 7.121. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that whether a person has a car makes a significant contribution to the model.
X3: test statistic = –5.037. p-value = 0.0000 < 0.05. Reject H0. There is sufficient evidence to conclude that the age of a person makes a significant contribution to the model.
(f)
Holding everything else constant, the farther the distance traveled to rehabilitation, the less likely a person is to participate in rehabilitation. Holding everything else constant, a person who has a car is more likely to participate in rehabilitation. Holding everything else constant, the older a person is, the less likely that person is to participate in rehabilitation.
15.26
(a)
Binary Logistic Regression

Predictor        Coefficients    SE Coef    Z          p-Value
Intercept        -47.4821        12.0173    -3.9512    0.0001
fixed acidity    1.310179398     0.4139     3.1656     0.0015
chlorides        90.57937563     22.643     4.0003     0.0001
pH               9.779258829     2.9743     3.288      0.0010

Deviance         54.45564087
(b)
Holding constant the effects of chlorides and pH, for each increase of one unit of fixed acidity, ln(odds) increases by an estimate of 1.3102. Holding constant the effects of fixed acidity and pH, for each increase of one unit in chlorides, ln(odds) increases by an estimate of 90.5794. Holding constant the effects of fixed acidity and chlorides, for each increase of one unit in pH, ln(odds) increases by an estimate of 9.7793.
(c)
ln(estimated odds ratio) = –47.4821 + 1.3102 X1 + 90.5794 X2 + 9.7793 X3 = –0.4603
Estimated odds ratio = e^(–0.4603) = 0.6311
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.6311/(1 + 0.6311) = 0.3869
(d)
The deviance statistic is 54.4556, which has a p-value of 0.9998. Do not reject H0. The model is a good fitting model.
(e)
For fixed acidity: ZSTAT = 3.1656 with a p-value = 0.0015. Reject H0. There is sufficient evidence that fixed acidity makes a significant contribution to the model.
For chlorides: ZSTAT = 4.0003 with a p-value = 0.0001. Reject H0. There is sufficient evidence that the amount of chlorides makes a significant contribution to the model.
For pH: ZSTAT = 3.2880 with a p-value = 0.0010. Reject H0. There is sufficient evidence that pH makes a significant contribution to the model.
(f)
Based on the p-values corresponding to the Z-values for the variable coefficients in the logistic regression equation and corresponding to the deviance statistics, the model that includes fixed acidity, chlorides and pH should be used to predict whether the wine is red.
15.27
(a)
Let X1 = price of the pizza. Using PHStat, ln(estimated odds) = 1.243 – 0.25034 X1
(b)
For X1: Z = –2.68 < –1.96. Reject H0. There is sufficient evidence that price of the pizza makes a significant contribution to the model.
(c)
Let X1 = price of the pizza, X2 = status. Using PHStat, ln(estimated odds) = 1.220 – 0.25019 X1 + 0.0377 X2
(d)
For X1: ZSTAT = –2.68 < –1.96. Reject H0. There is sufficient evidence that price of the pizza makes a significant contribution to the model.
For X2: ZSTAT = 0.10 < 1.96. Do not reject H0. There is not sufficient evidence to conclude that status makes a significant contribution to the model.
(e)
Model (a): Deviance statistic = 0.258. p-value = 0.998 > 0.05. Do not reject H0. There is insufficient evidence to conclude that model (a) is not a good fit.
Model (b): Deviance statistic = 7.804. p-value = 0.731 > 0.05. Do not reject H0. There is insufficient evidence to conclude that model (b) is not a good fit. However, the Z test in (b) suggests that there is not sufficient evidence to conclude that status makes a significant contribution to the model. Using the parsimony principle, the model in (a) is preferred to the model in (b).
(f)
ln(estimated odds ratio) = 1.243 – 0.25034 X1 = 1.243 – 0.25034(8.99) = –1.0076
Estimated odds ratio = e^(–1.0076) = 0.3651
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.3651/(1 + 0.3651) = 0.2675
ln(estimated odds ratio) = 1.243 – 0.25034 X1 = 1.243 – 0.25034(11.49) = –1.6334
Estimated odds ratio = e^(–1.6334) = 0.1953
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.1953/(1 + 0.1953) = 0.1634
ln(estimated odds ratio) = 1.243 – 0.25034 X1 = 1.243 – 0.25034(13.99) = –2.2593
Estimated odds ratio = e^(–2.2593) = 0.1044
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.1044/(1 + 0.1044) = 0.0946
15.28
(a)
Predictor                         Coefficients    SE Coef    Z          p-Value
Intercept                         -0.6048         0.4194     -1.4421    0.1493
claims/year                       0.093769442     0.5029     0.1865     0.8521
new business (1=yes, 0=no):1      1.810770296     0.8134     2.2261     0.0260

Deviance                          119.4353239     p-value    0.0457
(b)
Holding constant the effects of whether the policy is new, for each increase of one in the number of claims submitted per year by the policy holder, ln(odds) increases by an estimate of 0.0938. Holding constant the number of claims submitted per year by the policy holder, ln(odds) is estimated to be 1.8108 higher when the policy is new as compared to when the policy is not new.
(c)
ln(estimated odds ratio) = –0.6048 + 0.0938(1) + 1.8108(1) = 1.2998
Estimated odds ratio = e^1.2998 = 3.6684
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.7858
(d)
The deviance statistic is 119.4353 with a χ2 distribution with 95 d.f. and p-value = 0.0457 < 0.05. Reject H0. The model is not a good fitting model.
(e)
For claims/year: ZSTAT = 0.1865, p-value = 0.8521 > 0.05. Do not reject H0. There is not sufficient evidence that the number of claims submitted per year by the policy holder makes a significant contribution to the logistic model.
For new business: ZSTAT = 2.2261, p-value = 0.0260 < 0.05. Reject H0. There is sufficient evidence that whether the policy is new makes a significant contribution to the logistic model.
(f)
PHStat output:
Predictor      Coefficients    SE Coef    Z          p-Value
Intercept      -1.0125         0.3888     -2.6042    0.0092
claims/year    0.992742206     0.3367     2.9481     0.0032

Deviance       125.0102452     p-value    0.0250
(g)
PHStat output:
Predictor                         Coefficients    SE Coef    Z          p-Value
Intercept                         -0.5423         0.2515     -2.1563    0.0311
new business (1=yes, 0=no):1      1.928618927     0.5211     3.7008     0.0002

Deviance                          119.4701921     p-value    0.0526
(h)
The deviance statistic for (f) is 125.0102 with a χ2 distribution with 96 d.f. and p-value = 0.0250 < 0.05. Reject H0. The model is not a good fitting model. The deviance statistic for (g) is 119.4702 with a χ2 distribution with 96 d.f. and p-value = 0.0526 > 0.05. Do not reject H0. The model is a good fitting model. The model in (g) should be used to predict a fraudulent claim.
15.29
(a)
Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z          p-Value
Intercept    -1.6023         0.9884     -1.6211    0.1050
calls        0.061953028     0.0291     2.1316     0.0330
visits       0.094175407     0.5214     0.1806     0.8567

Deviance     30.13249138
(b)
Holding constant the effects of the number of visits the customer makes to the local service center, for each increase of one in the number of calls the customer makes to the company call center, ln(odds) increases by an estimate of 0.0620. Holding constant the number of calls the customer makes to the company call center, for each increase of one in the number of visits the customer makes to the local service center, ln(odds) increases by an estimate of 0.0942.
(c)
ln(estimated odds ratio) = –1.6023 + 0.0620(10) + 0.0942(1) = –0.8886
Estimated odds ratio = e^(–0.8886) = 0.4112
Estimated Probability of the Event of Interest = Estimated Odds Ratio/(1 + Estimated Odds Ratio) = 0.2914
(d)
The deviance statistic is 30.1325 with a p-value = 0.3082 > 0.05. Do not reject H0. The model is a good fitting model.
(e)
For calls: ZSTAT = 2.1316, p-value = 0.0330 < 0.05. Reject H0. There is sufficient evidence that the number of calls the customer makes to the company call center makes a significant contribution to the logistic model.
For visits: ZSTAT = 0.1806, p-value = 0.8567 > 0.05. Do not reject H0. There is not sufficient evidence that the number of visits the customer makes to the local service center makes a significant contribution to the logistic model.
(f)
Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z          p-Value
Intercept    -1.4702         0.6496     -2.2633    0.0236
calls        0.060084224     0.0267     2.249      0.0245

Deviance     30.16532266
(g)
PHStat output: Binary Logistic Regression

Predictor    Coefficients    SE Coef    Z         p-Value
Intercept    0.4786          0.5425     0.8822    0.3777
visits       -0.51392294     0.4231     -1.215    0.2245

Deviance     40.06397737
(h)
Since there is not sufficient evidence that the number of visits the customer makes to the local service center makes a significant contribution to the logistic model in (e), the model in (f) should be used.
15.30
(a)
ln(estimated odds) = 1.252 – 0.0323 Age + 2.2165 subscribes to the wellness newsletters.
(b)
Holding constant the effect of subscribes to the wellness newsletters, for each increase of one year in age, ln(estimated odds) decreases by an estimate of 0.0323. Holding constant the effect of age, for a customer who subscribes to the wellness newsletters, ln(estimated odds) increases by an estimate of 2.2165.
(c)
0.912
(d)
Deviance = 102.8762, p-value = 0.3264. Do not reject H0 so model is adequate.
(e)
For Age: Z = –1.8053 > –1.96, Do not reject H0. For subscribes to the wellness newsletters: Z = 4.3286 > 1.96, Reject H0.
(f)
Only subscribes to wellness newsletters is useful in predicting whether a customer will purchase organic food.
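The Z tests in (e) compare each test statistic with the critical values ±1.96. An equivalent check uses the two-sided p-value from the standard normal distribution, as in the following sketch. The function names are hypothetical helpers, and the p-values it prints are computed here rather than taken from PHStat output.

```python
import math


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal (Z) test statistic."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|))
    return math.erfc(abs(z) / math.sqrt(2))


def reject_h0(z: float, alpha: float = 0.05) -> bool:
    """Reject H0 (coefficient equals zero) when the p-value is below alpha."""
    return two_sided_p_value(z) < alpha


# Z statistics reported in 15.30 (e)
for name, z in [("Age", -1.8053), ("subscribes to the wellness newsletters", 4.3286)]:
    print(f"{name}: Z = {z}, p-value = {two_sided_p_value(z):.4f}, reject H0: {reject_h0(z)}")
```

Both approaches lead to the same decisions at the 0.05 level: age is not significant, while subscribing to the wellness newsletters is.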
15.31
In least squares regression the dependent variable is numerical. Using a categorical dependent variable in least squares regression would violate the normality assumption and would not be appropriate with this method. Logistic regression allows one to predict a categorical dependent variable by using the odds ratio.
15.32
In order to evaluate whether independent variables are intercorrelated, you can compute the Variance Inflationary Factor (VIF).
15.33
You use logistic regression when the dependent variable is a categorical variable.
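To show how a fitted logistic regression model produces an estimated probability for a categorical (yes or no) outcome, the sketch below uses the coefficients reported in the solution to 15.25. The function names are hypothetical; the steps are ln(odds), then odds = e^ln(odds), then probability = odds/(1 + odds).

```python
import math


def log_odds_rehab(distance_km: float, has_car: int, age: float) -> float:
    """ln(estimated odds) of participating, using the 15.25 coefficients."""
    return 5.7765 - 0.0675 * distance_km + 1.9369 * has_car - 0.0599 * age


def probability_from_log_odds(log_odds: float) -> float:
    """Convert ln(odds) to an estimated probability of the event of interest."""
    odds = math.exp(log_odds)
    return odds / (1 + odds)


# A 65-year-old who travels 20 km, with and without a car
for has_car in (1, 0):
    ln_odds = log_odds_rehab(20, has_car, 65)
    prob = probability_from_log_odds(ln_odds)
    print(f"car = {has_car}: ln(odds) = {ln_odds:.4f}, probability = {prob:.4f}")
```

Unlike a least squares prediction, the estimated probability always falls between 0 and 1, which is what makes the method suitable for a categorical dependent variable.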
15.34
One way to choose among models that meet these criteria is to determine whether the models contain a subset of variables that are common, and then test whether the contribution of the additional variables is significant.
15.35
From PHStat, Y = wins, X1 = runs per game, X2 = batting average, X3 = home runs, X4 = ERA, X5 = Saves, X6 = WHIP, X7 = OBS.

                   Coefficients    Standard Error    t Stat     P-value    R Square    VIF
Intercept          17.7302         26.6511           0.6653     0.5128
Runs per game      9.8491          4.6436            2.1210     0.0454     0.9257      13.4578
Batting Average    -223.0903       150.0442          -1.4868    0.1513     0.9018      10.1857
Home Runs          -0.1099         0.0468            -2.3497    0.0282     0.8815      8.4366
ERA                -11.1796        3.8947            -2.8704    0.0089     0.9335      15.0460
Saves              0.2735          0.1056            2.5913     0.0167     0.4565      1.8400
WHIP               -14.9478        21.0550           -0.7099    0.4852     0.9348      15.3361
OBS                207.2519        94.8418           2.1852     0.0398     0.9741      38.5477

As a first step in the model building process, a review of the VIFs for the seven independent variables reveals that only one variable, saves, had a VIF less than 5, which indicates that it is free from collinearity problems. The variable with the largest VIF, OBS, was removed before running a second regression analysis.
Intercept
Coefficients
Standard Error
58.8155
20.3803
t Stat
P-value
2.8859
0.0083
Copyright ©2024 Pearson Education, Inc.
R Square
VIF
ccxcvi Chapter 16: Time-Series Forecasting Runs per game
16.6449
3.7208
4.4735
0.0002
0.8653
7.4220
Batting Average
49.1552
90.2209
0.5448
0.5911
0.6839
3.1634
Home Runs
-0.0326
0.0330
-0.9870
0.3339
0.7228
3.6074
ERA
-11.5174
4.1989
-2.7429
0.0116
0.9334
15.0222
Saves
0.2446
0.1130
2.1648
0.0410
0.4478
1.8111
WHIP
-15.6068
22.7151
-0.6871
0.4989
0.9348
15.3330
After removing the variable with the largest VIF, OBS, a second regression analysis reveals only two variables, ERA (earned run average) and WHIP, with a VIF above five. The variable with the largest VIF, WHIP, was removed before running a third regression analysis. 15.35 cont. Coefficients
Standard Error
t Stat
P-value
R Square
VIF
Intercept
50.2879
15.9862
3.1457
0.0044
Runs per game
17.3811
3.5237
4.9326
0.0000
0.8531
6.8064
Batting Average
36.1803
87.2467
0.4147
0.6821
0.6694
3.0248
Home Runs
-0.0369
0.0320
-1.1505
0.2613
0.7125
3.4779
ERA
-14.1834
1.5868
-8.9386
0.0000
0.5441
2.1935
Saves
0.2464
0.1117
2.2055
0.0372
0.4476
1.8101
After removing the variables with the largest VIFs, OBS and WHIP, a third regression analysis reveals only one variable, runs per game, with a VIF above five. Runs per game was removed before running a fourth regression analysis. In the fourth regression analysis, all of the remaining variables, batting average (X1), home runs (X2), ERA (X3), and saves (X4), have VIF values below 5.

Regression Analysis
Regression Statistics
Multiple R          0.9589
R Square            0.9195
Adjusted R Square   0.9066
Standard Error      4.4862
Observations        30

                  Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept         31.6338       21.5962         1.4648   0.1554
Batting Average   376.4944      74.2517         5.0705   0.0000   0.1176    1.1333
Home Runs         0.0849        0.0284          2.9892   0.0062   0.2926    1.4135
ERA               -16.5869      2.0996          -7.8999  0.0000   0.4967    1.9867
Saves             0.2198        0.1552          1.4165   0.1690   0.4463    1.8059
The best-subset approach yielded: Cp = 5.0000 with X1X2X3X4 and adjusted r2 = 0.9066. (display of 3 smallest Cp values from PHStat)

Model      Cp       k+1  R Square  Adj. R Square  Std. Error
X1X2X3     5.0064   4    0.9131    0.9030         4.5722
X1X3X4     11.9352  4    0.8908    0.8782         5.1253
X1X2X3X4   5.0000   5    0.9195    0.9066         4.4862
The most appropriate multiple regression model for predicting wins is: Ŷ = 31.6338 + 376.4944X1 + 0.0849X2 – 16.5869X3 + 0.2198X4, for X1 = batting average, X2 = home runs, X3 = ERA, X4 = saves.
Residual plots against each of the four independent variables reveal no obvious pattern, which indicates insufficient evidence of violations of the equal-variance and linearity assumptions.
The normal probability plot indicates no evidence of a violation of the normality assumption.
15.36
A regression analysis with all three variables reveals that all had VIF values well below five with the largest VIF = 1.06.
A best subsets regression reveals that the model with pressure and cost had the lowest Cp value and the highest adjusted r2.
A regression analysis on the two variable model with pressure and cost reveals that the pressure variable was not significant at the 0.05 significance level.
A regression analysis with cost as the only variable reveals a significant FSTAT for the overall model. The cost variable is significant at the 0.05 significance level. Because tSTAT = –2.81 or p-value = 0.016, reject H0. The r2 of 0.3971 indicates that 39.71% of variation in registration errors can be explained by the cost of the material.
Residual analyses reveal no clear patterns, which indicates that there appear to be no violations of the assumptions. The best linear model, which contains only cost as the predictor variable, is: Ŷ = 1.77 – 14.23X, where X = cost (low = 1).
15.37
Since the variable Rooms is the sum of Bathrooms, Bedrooms, Loft/Den and Finished Basement, it is removed from the list of potential independent variables. Including it will introduce perfect collinearity.
An analysis of the linear regression model with all of the remaining seven possible independent variables revealed that none of the variables has a VIF value in excess of 5.0.
A best subsets regression produces the following potential models that have Cp values less than or equal to k+1.

Model             Cp        k+1  R Square  Adj. R Square  Consider This Model?
X1X2X3X4X5        5.657125  6    0.532645  0.490157971    yes
X1X2X3X5X6        5.78991   6    0.531509  0.488919347    yes
X1X2X3X4X5X6      6.074978  7    0.546173  0.49574803     yes
X1X2X3X4X5X6X7    8         8    0.546814  0.486959627    yes

where X1 = Hot tub (0 = No and 1 = Yes), X2 = Lake View (0 = No and 1 = Yes), X3 = Bathrooms, X4 = Bedrooms, X5 = Loft/Den (0 = No and 1 = Yes), X6 = Finished Basement (0 = No and 1 = Yes), X7 = Acres.
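The screening rule used here (keep models whose Cp is at most k+1, then compare adjusted r2) can be sketched as follows. This is only an illustration, assuming the four rows shown in the display above; the tuple layout and variable names are hypothetical.

```python
# (model, Cp, k+1, adjusted r2) copied from the best-subsets display above
candidates = [
    ("X1X2X3X4X5", 5.657125, 6, 0.490157971),
    ("X1X2X3X5X6", 5.78991, 6, 0.488919347),
    ("X1X2X3X4X5X6", 6.074978, 7, 0.49574803),
    ("X1X2X3X4X5X6X7", 8.0, 8, 0.486959627),
]

# Keep only models with Cp <= k+1
eligible = [m for m in candidates if m[1] <= m[2]]

# Among the eligible models, find the one with the highest adjusted r2
best_by_adj_r2 = max(eligible, key=lambda m: m[3])

print("Eligible models:", [m[0] for m in eligible])
print("Highest adjusted r2:", best_by_adj_r2[0], best_by_adj_r2[3])
```

A model that passes this screen is only a candidate; the solution still goes on to check the significance of each slope before settling on a final model.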
Looking at the p-values of the t statistics for each slope coefficient of the model that includes X1 through X7 reveals that Bathrooms, Bedrooms, Loft/Den, Finished Basement and Acres are not significant at the 5% level of significance.
                    Coefficients  Standard Error  t Stat       P-value
Intercept           56.86025281   75.83635651     0.749775641  0.456705242
Hot Tub             83.33704829   39.54169465     2.107574019  0.039815205
Lake View           188.1459004   46.55629756     4.041255648  0.000172817
Bathrooms           44.97723054   28.05985879     1.602902954  0.114899805
Bedrooms            31.7825021    24.30126221     1.307853963  0.196567371
Loft/Den            68.68786861   34.62218517     1.983926441  0.052453764
Finished Basement   42.40047126   35.03765953     1.210139942  0.231594833
Acres               8.214876036   30.00092343     0.273820773  0.7852867
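The elimination step applied next (drop the variable with the largest p-value if that p-value exceeds 0.05, then refit) can be sketched as below. This is an illustrative sketch that only selects the variable to drop from the p-values printed above; it does not refit the regression, and the helper name next_to_drop is hypothetical.

```python
ALPHA = 0.05

# P-values of the slope coefficients in the seven-variable model above
p_values = {
    "Hot Tub": 0.039815205,
    "Lake View": 0.000172817,
    "Bathrooms": 0.114899805,
    "Bedrooms": 0.196567371,
    "Loft/Den": 0.052453764,
    "Finished Basement": 0.231594833,
    "Acres": 0.7852867,
}

def next_to_drop(p_values, alpha=ALPHA):
    """Return the variable with the largest p-value if it exceeds alpha, else None."""
    name, p = max(p_values.items(), key=lambda item: item[1])
    return name if p > alpha else None

not_significant = sorted(name for name, p in p_values.items() if p > ALPHA)

print("Not significant at 0.05:", not_significant)
print("Drop next:", next_to_drop(p_values))
```

After each removal the model must be re-estimated, because the remaining p-values change, as the subsequent outputs show.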
Dropping Acres, which has the highest p-value, the new regression indicates that Bathrooms, Bedrooms and Finished Basement are still not significant.

                    Coefficients  Standard Error  t Stat       P-value
Intercept           63.04035916   71.77716439     0.878278763  0.383683708
Hot Tub             82.87369204   39.16564269     2.115979372  0.038973223
Lake View           190.4252197   45.41206354     4.193273877  0.000102798
Bathrooms           44.42625163   27.74686825     1.601126702  0.115183479
Bedrooms            31.82322517   24.09177116     1.320916796  0.192099589
Loft/Den            68.8302055    34.3204956      2.005513157  0.049930143
Finished Basement   43.67867801   34.42659968     1.268747957  0.209972962
Dropping Finished Basement, which has the largest p-value, the new regression indicates that Bathrooms and Bedrooms are still not significant.

             Coefficients  Standard Error  t Stat       P-value
Intercept    33.50312632   68.27210075     0.490729389  0.625570141
Hot Tub      98.1802218    37.46721666     2.620430087  0.011330891
Lake View    181.7931221   45.14769931     4.026630921  0.000174862
Bathrooms    52.45485906   27.1649841      1.930973303  0.058647073
Bedrooms     36.99281947   23.87596476     1.549374856  0.127027169
Loft/Den     79.72730905   33.41209396     2.386181158  0.020493138
Dropping Bedrooms next, which has the largest p-value, the new regression indicates that all the remaining variables become significant at the 5% level of significance.

             Coefficients  Standard Error  t Stat       P-value
Intercept    102.0014472   52.67086383     1.936582007  0.057848524
Hot Tub      89.12922256   37.4689494      2.378748911  0.020806217
Lake View    183.8349004   45.68930995     4.023586712  0.00017363
Bathrooms    77.69078822   22.01055013     3.529706789  0.000839855
Loft/Den     76.80439638   33.77336974     2.274111141  0.026811687
The best linear model is determined to be:
Ŷ = 102.0014 + 89.1292X1 + 183.8349X2 + 77.6908X3 + 76.8044X5. The overall model has FSTAT = 14.7030 (4 and 56 degrees of freedom) with a p-value that is virtually 0. r2 = 0.5122, adjusted r2 = 0.4774.
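The overall F statistic and the adjusted r2 reported for this model can both be recovered from r2. The sketch below assumes n = 61, which is inferred from the 56 residual degrees of freedom with k = 4 independent variables (n – k – 1 = 56); the function names are illustrative.

```python
def overall_f(r2: float, n: int, k: int) -> float:
    """Overall F statistic: (r2 / k) / ((1 - r2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))

def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r2: 1 - (1 - r2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

R2 = 0.5122   # reported r2
K = 4         # Hot Tub, Lake View, Bathrooms, Loft/Den
N = 61        # inferred from 4 and 56 degrees of freedom

f_stat = overall_f(R2, N, K)
adj = adjusted_r2(R2, N, K)

print(f"F = {f_stat:.2f}")
print(f"adjusted r2 = {adj:.4f}")
```

Because r2 is rounded to four decimals, the computed F matches the reported 14.7030 only to about two decimal places.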
A residual analysis does not reveal any strong patterns and the normal probability plot does not suggest any departure from the normality assumption.
15.38
(a)
From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.

Regression Analysis
Regression Statistics
Multiple R          0.9062
R Square            0.8211
Adjusted R Square   0.8086
Standard Error      196.8748
Observations        62

             Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept    326.0868      142.2280        2.2927   0.0256
House Size   0.0154        0.0023          6.7138   0.0000   0.3151    1.4600
Bedrooms     51.0930       28.5164         1.7917   0.0785   0.3707    1.5891
Bathrooms    163.9062      33.8509         4.8420   0.0000   0.4930    1.9723
Age          -3.4459       0.9907          -3.4782  0.0010   0.2441    1.3229
Glen Cove: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 5.0000 with X1X2X3X4 and adjusted r2 = 0.8086.

Model      Cp        k+1  R Square  Adj. R Square  Std. Error
X1         90.3313   2    0.5345    0.5267         309.5501
X2         137.9638  2    0.3850    0.3747         355.7972
X3         65.1831   2    0.6134    0.6070         282.0916
X4         181.0711  2    0.2497    0.2372         392.9870
X1X2       61.2776   3    0.6319    0.6195         277.5695
X1X3       17.2209   3    0.7702    0.7624         219.3218
X1X4       38.9390   3    0.7020    0.6919         249.7391
X2X3       52.2616   3    0.6602    0.6487         266.6867
X2X4       104.0881  3    0.4976    0.4805         324.2977
X3X4       60.4780   3    0.6344    0.6220         276.6216
X1X2X3     15.0977   4    0.7831    0.7719         214.8859
X1X2X4     26.4450   4    0.7475    0.7345         231.8600
X1X3X4     6.2102    4    0.8110    0.8013         200.5909
X2X3X4     48.0757   4    0.6796    0.6631         261.1785
X1X2X3X4   5.0000    5    0.8211    0.8086         196.8748
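The Cp values in this display follow from the R Square column through Cp = (1 – R2k)(n – T) / (1 – R2T) – [n – 2(k + 1)], where T is the number of parameters (including the intercept) in the full model. The following sketch checks three rows, assuming n = 62 and T = 5 from the Glen Cove output; the function name and dictionary layout are illustrative.

```python
def mallows_cp(r2_k: float, r2_full: float, n: int, k: int, total_params: int) -> float:
    """Cp for a model with k predictors, relative to a full model with total_params parameters."""
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))

N = 62            # observations in the Glen Cove sample
T = 5             # four slopes plus the intercept in the full model
R2_FULL = 0.8211  # R Square of X1X2X3X4

# model -> (R Square, number of predictors k), taken from the display above
models = {
    "X3": (0.6134, 1),
    "X1X3X4": (0.8110, 3),
    "X1X2X3X4": (0.8211, 4),
}

cp = {name: mallows_cp(r2, R2_FULL, N, k, T) for name, (r2, k) in models.items()}

for name, value in cp.items():
    print(f"{name:10s} Cp = {value:.2f}")
```

The full model always has Cp equal to k + 1 by construction, so its Cp of 5.0000 carries no information by itself; the comparison is only informative for the smaller models.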
The most appropriate multiple regression model for predicting asking price in Glen Cove is: Ŷ = 326.0868 + 0.0154X1 + 51.0930X2 + 163.9062X3 – 3.4459X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.
(b)
The adjusted r2 for the best model in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.
15.39
(a)
From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.
Regression Analysis
Regression Statistics
Multiple R          0.6558
R Square            0.4301
Adjusted R Square   0.4039
Standard Error      264.5673
Observations        92

             Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept    469.9116      194.4919        2.4161   0.0178
House Size   0.0336        0.0087          3.8755   0.0002   0.1080    1.1211
Bedrooms     37.5697       44.8910         0.8369   0.4049   0.2910    1.4104
Bathrooms    145.4441      49.9937         2.9092   0.0046   0.3579    1.5574
Age          -4.9885       1.2826          -3.8894  0.0002   0.0964    1.1067
Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 3.7004 with X1X3X4 and adjusted r2 = 0.4059.

Model      Cp       k+1  R Square  Adj. R Square  Std. Error
X1         42.1513  2    0.1474    0.1379         318.1553
X2         47.6019  2    0.1117    0.1018         324.7489
X3         25.5951  2    0.2559    0.2476         297.2314
X4         42.1917  2    0.1472    0.1377         318.2047
X1X2       30.0341  3    0.2399    0.2228         302.0883
X1X3       17.6092  3    0.3213    0.3060         285.4567
X1X4       16.4081  3    0.3292    0.3141         283.7972
X2X3       26.6559  3    0.2620    0.2454         297.6584
X2X4       34.0587  3    0.2135    0.1959         307.2825
X3X4       16.3915  3    0.3293    0.3142         283.7742
X1X2X3     18.1278  4    0.3310    0.3082         285.0144
X1X2X4     11.4637  4    0.3746    0.3533         275.5586
X1X3X4     3.7004   4    0.4255    0.4059         264.1165
X2X3X4     18.0199  4    0.3317    0.3089         284.8637
X1X2X3X4   5.0000   5    0.4301    0.4039         264.5673
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

             Coefficients  Standard Error  t Stat   P-value
Intercept    578.0637      145.1043        3.9838   0.0001
Bathrooms    166.2756      43.2829         3.8416   0.0002
Age          -5.0919       1.2744          -3.9954  0.0001
House Size   0.0332        0.0087          3.8394   0.0002
The most appropriate multiple regression model for predicting asking price in Merrick is: Ŷ = 578.0637 + 0.0332X1 + 0X2 + 166.2756X3 – 5.0919X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.
(b)
The adjusted r2 for the best model in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.
15.40
(a)
From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.
Regression Analysis
Regression Statistics
Multiple R          0.7514
R Square            0.5646
Adjusted R Square   0.5162
Standard Error      190.3234
Observations        41

             Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept    541.8804      238.8438        2.2688   0.0294
House Size   0.0121        0.0171          0.7077   0.4837   0.1198    1.1360
Bedrooms     -47.0501      42.0582         -1.1187  0.2707   0.4830    1.9343
Bathrooms    247.7834      55.1422         4.4935   0.0001   0.5049    2.0197
Age          -3.7219       1.6632          -2.2378  0.0315   0.1559    1.1847
Bellmore: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables.
The best-subset approach yielded: Cp = 3.1084 with X3X4 and adjusted r2 = 0.5148.

Model      Cp       k+1  R Square  Adj. R Square  Std. Error
X1         38.8899  2    0.0821    0.0586         265.4922
X2         34.0716  2    0.1404    0.1183         256.9258
X3         7.0201   2    0.4676    0.4539         202.2018
X4         27.7709  2    0.2166    0.1965         245.2730
X1X2       29.4430  3    0.2205    0.1795         247.8496
X1X3       6.8207   3    0.4942    0.4675         199.6622
X1X4       27.6016  3    0.2428    0.2030         244.2829
X2X3       7.7427   3    0.4830    0.4558         201.8511
X2X4       23.8691  3    0.2880    0.2505         236.8886
X3X4       3.1084   3    0.5391    0.5148         190.5946
X1X2X3     8.0077   4    0.5040    0.4638         200.3659
X1X2X4     23.1919  4    0.3203    0.2652         234.5460
X1X3X4     4.2515   4    0.5494    0.5129         190.9690
X2X3X4     3.5009   4    0.5585    0.5227         189.0353
X1X2X3X4   5.0000   5    0.5646    0.5162         190.3234
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

             Coefficients  Standard Error  t Stat   P-value
Intercept    535.3201      166.3062        3.2189   0.0026
Bathrooms    210.6304      40.8497         5.1562   0.0000
Age          -3.9061       1.6088          -2.4279  0.0200
The most appropriate multiple regression model for predicting asking price in Bellmore is: Ŷ = 535.3201 + 0X1 + 0X2 + 210.6304X3 – 3.9061X4, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age.
(b)
The adjusted r2 for the best model in 15.38(a), 15.39(a), and 15.40(a) are 0.8086, 0.4059, and 0.5227, respectively. The model in 15.38(a) has the highest explanatory power after adjusting for the number of independent variables and sample size.
15.41
(a)
From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = Merrick).
Regression Analysis
Regression Statistics
Multiple R          0.7914
R Square            0.6264
Adjusted R Square   0.6138
Standard Error      241.2153
Observations        154

             Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept    488.3431      122.9849        3.9708   0.0001
House Size   0.0172        0.0025          6.8999   0.0000   0.3350    1.5038
Bedrooms     40.0918       26.3980         1.5187   0.1310   0.3312    1.4951
Bathrooms    159.7503      30.0709         5.3125   0.0000   0.4176    1.7169
Age          -4.0414       0.8318          -4.8587  0.0000   0.1650    1.1976
Glen Cove    -96.2654      44.0796         -2.1839  0.0305   0.1915    1.2369
Glen Cove and Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 6.0000 with X1X2X3X4X5 and adjusted r2 = 0.6138.

Model        Cp       k+1  R Square  Adj. R Square  Std. Error
X1X2X3X4     8.7694   5    0.6143    0.6040         244.2473
X1X2X3X5     27.6072  5    0.5668    0.5552         258.8686
X1X2X4X5     32.2221  5    0.5551    0.5432         262.3263
X1X3X4X5     6.3066   5    0.6206    0.6104         242.2706
X2X3X4X5     51.6084  5    0.5062    0.4930         276.3792
X1X2X3X4X5   6.0000   6    0.6264    0.6138         241.2153
The most appropriate multiple regression model for predicting asking price in Glen Cove and Merrick is: Ŷ = 488.3431 + 0.0172X1 + 40.0918X2 + 159.7503X3 – 4.0414X4 – 96.2654X5, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = Merrick).
(b)
The asking price in Glen Cove is estimated to be $96.2654 thousand below that in Merrick for two otherwise identical properties.
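The interpretation of the dummy-variable coefficient can be illustrated by predicting the asking price of the same house in each location. The sketch below uses the coefficients from the model above; the house characteristics (size 2,000, 3 bedrooms, 2 bathrooms, age 40) are hypothetical values chosen only for illustration, and the function name is an assumption.

```python
# Coefficients from the fitted model (asking price in thousands of dollars)
COEF = {
    "intercept": 488.3431,
    "house_size": 0.0172,
    "bedrooms": 40.0918,
    "bathrooms": 159.7503,
    "age": -4.0414,
    "glen_cove": -96.2654,  # dummy: 1 = Glen Cove, 0 = Merrick
}

def predict_price(house_size, bedrooms, bathrooms, age, glen_cove):
    """Predicted asking price for one property."""
    return (COEF["intercept"]
            + COEF["house_size"] * house_size
            + COEF["bedrooms"] * bedrooms
            + COEF["bathrooms"] * bathrooms
            + COEF["age"] * age
            + COEF["glen_cove"] * glen_cove)

# Hypothetical house, identical in every respect except location
house = dict(house_size=2000, bedrooms=3, bathrooms=2, age=40)
in_glen_cove = predict_price(**house, glen_cove=1)
in_merrick = predict_price(**house, glen_cove=0)
difference = in_glen_cove - in_merrick

print(f"Glen Cove: {in_glen_cove:.4f}")
print(f"Merrick:   {in_merrick:.4f}")
print(f"Difference: {difference:.4f}")
```

Whatever values are chosen for the other variables, the difference between the two predictions equals the dummy coefficient, which is what "two otherwise identical properties" means here.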
15.42
(a)
From PHStat, Y = asking price, X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = otherwise), X6 = Merrick (1 = Merrick, 0 = otherwise).
Regression Analysis
Regression Statistics
Multiple R          0.7914
R Square            0.6264
Adjusted R Square   0.6138
Standard Error      241.2153
Observations        154

             Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept    488.3431      122.9849        3.9708   0.0001
House Size   0.0172        0.0025          6.8999   0.0000   0.3350    1.5038
Bedrooms     40.0918       26.3980         1.5187   0.1310   0.3312    1.4951
Bathrooms    159.7503      30.0709         5.3125   0.0000   0.4176    1.7169
Age          -4.0414       0.8318          -4.8587  0.0000   0.1650    1.1976
Glen Cove    -96.2654      44.0796         -2.1839  0.0305   0.1915    1.2369
Glen Cove and Merrick: Based on a full regression model involving all of the variables: All VIFs are less than 5. So there is no reason to suspect collinearity between any pair of variables. The best-subset approach yielded: Cp = 4.1575 with X1X3X4X6 and adjusted r2 = 0.6196. (display of 3 smallest Cp values from PHStat)

Model        Cp      k+1  R Square  Adj. R Square  Std. Error
X1X3X4X6     4.1575  5    0.6274    0.6196         230.6767
X1X2X3X4X6   5.0000  6    0.6297    0.6199         230.5774
X1X3X4X5X6   6.1444  6    0.6275    0.6176         231.2781
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

             Coefficients  Standard Error  t Stat   P-value
Intercept    482.1616      86.1384         5.5975   0.0000
Bathrooms    184.9559      23.1061         8.0046   0.0000
House Size   0.0176        0.0022          8.1890   0.0000
Age          -4.0378       0.7319          -5.5171  0.0000
Merrick      100.9783      34.6406         2.9150   0.0040

The most appropriate multiple regression model for predicting asking price in Glen Cove, Merrick, and Bellmore is: Ŷ = 482.1616 + 0.0176X1 + 0X2 + 184.9559X3 – 4.0378X4 + 0X5 + 100.9783X6, for X1 = house size, X2 = number of bedrooms, X3 = number of bathrooms, X4 = age, X5 = Glen Cove (1 = Glen Cove, 0 = otherwise), X6 = Merrick (1 = Merrick, 0 = otherwise).
(b)
The asking price in Merrick is estimated to be $100.9783 thousand above that in Glen Cove or Bellmore for two otherwise identical properties.
15.43
As a first step in the model building process, a review of the VIFs for the seven independent variables reveals that all variables had VIF values less than 5, which indicates that these variables are free from collinearity problems.
A Best Subsets analysis reveals that all models had Cp values below or at k+1, where k represents the number of independent variables. The model with the highest adjusted r2 is the model with growth as the only independent variable. A stepwise regression analysis also produced a model with growth as the only independent variable.
There was insufficient evidence for a relationship between growth and price-to-book value ratio. Because FSTAT = 2.59 or p-value = 0.113, do not reject H0.
A normal probability plot reveals deviation from the normality assumption. To correct for the deviation from the normality assumption, a natural log transformation was performed on the price-to-book value ratio dependent variable. However, one of the rows contained a negative price-to-book value ratio. This case, row 60, was removed before performing the transformation.
A review of the VIFs for the seven independent variables reveals that all variables had VIF values less than 5, which indicates that these variables are free from collinearity problems.
A Best Subsets analysis with the natural log transformation of the dependent variable reveals that most of the models had Cp values below or at k+1, where k represents the number of independent variables. The model with the highest adjusted r2 is the model with growth as the only independent variable.
A regression analysis with the transformed price-to-book value ratio as the dependent variable and growth as the independent variable reveals a significant FSTAT for the overall model. The growth coefficient had a significant tSTAT at the 0.05 significance level. The r2 of 0.0995 indicates that 9.95% of the variation in the natural log of the price-to-book value ratio can be explained by the variation in growth. The regression equation for this model is ln(Ŷ) = 1.590 – 0.0128(X1), where X1 = growth.
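Because the dependent variable was log-transformed, a prediction on the original scale requires exponentiating the fitted value. The sketch below applies the reported equation; the growth value of 10 is a hypothetical input used only for illustration, and the function name is an assumption.

```python
import math

B0 = 1.590    # intercept of ln(Y-hat)
B1 = -0.0128  # slope for growth

def predict_ratio(growth: float) -> float:
    """Back-transform the fitted log value to a price-to-book value ratio."""
    return math.exp(B0 + B1 * growth)

# Hypothetical growth value, for illustration only
example = predict_ratio(10)

# Each one-unit increase in growth multiplies the prediction by exp(B1)
multiplier = math.exp(B1)

print(f"Predicted ratio at growth = 10: {example:.4f}")
print(f"Multiplier per unit of growth: {multiplier:.4f}")
```

With a negative slope the multiplier is below 1, so the predicted ratio shrinks by roughly 1.3% for each additional unit of growth under this model.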
Residual plots against the growth independent variable reveal no obvious pattern, which indicates insufficient evidence of violations of the equal-variance and linearity assumptions. The normal probability plot reveals no evidence of deviation from the normality assumption with this model. The present model, using the transformed price-to-book value ratio as the dependent variable and growth as the single independent variable, is the one model associated with a significant FSTAT for the overall model. However, it is relatively weak in that it can account for less than 10% of the variation in the dependent variable. In general, none of the variables in the dataset appeared to be very predictive of the price-to-book value ratio.
15.44
In the multiple regression model with catalyst, pH, pressure, temperature and voltage as independent variables, none of the variables have a VIF value of 5 or larger.
The best-subset approach yielded only the following model to be considered:

Model        Cp  k+1  R Square  Adj. R Square  Std. Error
X1X2X3X4X5   6   6    0.875922  0.861822068    1.293575
where X1 = catalyst, X2 = pH, X3 = pressure, X4 = temp, and X5 = voltage. Looking at the p-values of the t statistics for each slope coefficient of the model that includes X1 through X5 reveals that pH level is not significant at the 5% level of significance.

            Coefficients   Standard Error  t Stat        P-value
Intercept   4.454255233    8.222983547     0.541683588   0.590769119
Catalyst    0.162669323    0.036277562     4.484020293   5.18724E-05
pH          0.086375011    0.080013101     1.079510851   0.286242198
Pressure    –0.043059299   0.013464369     –3.198018263  0.002564899
Temp        –0.402556214   0.069704281     –5.775200729  7.21416E-07
Voltage     0.422370024    0.028413318     14.86521277   9.13658E-19
The multiple regression model with pH level deleted shows that all coefficients are significant individually at the 5% level of significance.

            Coefficients   Standard Error  t Stat        P-value
Intercept   3.683340948    8.206951065     0.44880747    0.655724457
Catalyst    0.154754083    0.035594069     4.347749199   7.77444E-05
Pressure    –0.041971526   0.013451255     –3.120268445  0.003150939
Temp        –0.4035674     0.069825915     –5.779622062  6.62469E-07
Voltage     0.428756573    0.027841579     15.3998654    1.47975E-19
The best linear model is determined to be:
Ŷ = 3.6833 + 0.1548X1 – 0.04197X3 – 0.4036X4 + 0.4288X5. The overall model has F = 77.0793 (4 and 45 degrees of freedom) with a p-value that is virtually 0. r2 = 0.8726, adjusted r2 = 0.8613. The normal probability plot does not suggest possible violation of the normality assumption. A residual analysis reveals a potential non-linear relationship in temperature.
The p-value of the squared term for temperature in the following quadratic transformation of temperature does not support the need for a quadratic transformation at the 5% level of significance.

               Coefficients   Standard Error  t Stat        P-value
Intercept      –322.0541757   209.7341228     –1.535535426  0.131812685
Catalyst       0.16474942     0.035632189     4.623612047   3.30582E-05
Pressure       –0.044452151   0.01334035      –3.332157868  0.001753581
Temp           7.27648367     4.9417966       1.472436901   0.148020497
Temp Squared   –0.04508917    0.029010216     –1.554251433  0.127288634
Voltage        0.424662847    0.027539942     15.41988889   2.34994E-19
The p-value of the interaction term between pressure and temperature below indicates that there is not enough evidence of an interaction at the 5% level of significance.

                  Coefficients   Standard Error  t Stat        P-value
Intercept         103.5523674    55.92763384     1.851542078   0.070809822
Catalyst          0.144935857    0.035157935     4.122422311   0.000163315
Pressure          –0.859424944   0.453254189     –1.896121349  0.064522645
Temp              –1.586885548   0.659370559     –2.406667277  0.020363651
Pressure x Temp   0.009640768    0.005343284     1.804277709   0.078035623
Voltage           0.431941042    0.027226309     15.86484039   8.10114E-20
The best model is still the one that includes catalyst, pressure, temperature and voltage, which explains 87.26% of the variation in thickness.
15.45
(a)
In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger. The best-subset approach yielded only the following models to be considered:

Model        Cp      k+1  R Square  Adj. R Square  Std. Error  Consider This Model?
X1X2         2.4647  3    0.6124    0.5668         0.0365      Yes
X1X2X3       3.6599  4    0.6313    0.5622         0.0367      Yes
X1X2X4       2.9741  4    0.6475    0.5814         0.0359      Yes
X1X2X3X4     4.1693  5    0.6664    0.5775         0.0360      Yes
X1X2X4X5     4.8048  5    0.6515    0.5585         0.0368      Yes
X1X2X3X4X5   6.0000  6    0.6704    0.5527         0.0371      Yes
where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity. The model with the highest adjusted r-square contains X1, X2, X4 and yields the following results:

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept   0.0884        0.0816          1.0835  0.2946   -0.0845    0.2613
Time        0.0022        0.0004          4.8180  0.0002   0.0012     0.0031
Pressure    0.0013        0.0006          2.1406  0.0481   0.0000     0.0025
Density     0.2263        0.1793          1.2620  0.2250   -0.1538    0.6063
Looking at the p-value of the t-test statistic for each slope coefficient reveals that X4 is not significant at the 5% level of significance. The results after dropping X4 follow:

Regression Statistics
Multiple R          0.7826
R Square            0.6124
Adjusted R Square   0.5668
Standard Error      0.0365
Observations        20

ANOVA
             df  SS      MS      F        Significance F
Regression   2   0.0357  0.0179  13.4294  0.0003
Residual     17  0.0226  0.0013
Total        19  0.0584

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept   0.1789        0.0395          4.5242  0.0003   0.0955     0.2623
Time        0.0022        0.0005          4.7362  0.0002   0.0012     0.0031
Pressure    0.0013        0.0006          2.1042  0.0505   0.0000     0.0026

The p-value of the t-test statistic for the slope coefficient for pressure is only slightly above 5%. For parsimony, you might drop pressure from the model. The best model is Ŷ = 0.1789 + 0.0022X1 + 0.0013X2. The normal probability plot suggests possible departure from the normality assumption. The residual plots do not reveal any specific pattern.
[Figure: Normal Probability Plot (Residual versus Z Value)]
[Figure: Residual Plot for Time]
[Figure: Residual Plot for Pressure]
(b)
In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger. The best-subset approach yielded only the following models to be considered:

Model        Cp      k+1  R Square  Adj. R Square  Std. Error  Consider This Model?
X1X4         2.8943  3    0.5574    0.5053         0.0393      Yes
X1X2X4       2.2854  4    0.6258    0.5556         0.0373      Yes
X1X2X3X4     4.0090  5    0.6330    0.5351         0.0381      Yes
X1X2X4X5     4.2764  5    0.6260    0.5263         0.0385      Yes
X1X2X3X4X5   6.0000  6    0.6332    0.5023         0.0395      Yes
where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity.
The model with the highest adjusted r-square contains X1, X2, X4 and yields the following results:

            Coefficients  Standard Error  t Stat  P-value
Intercept   0.0146        0.0848          0.1715  0.8660
Time        0.0020        0.0005          4.2165  0.0007
Pressure    0.0011        0.0006          1.7094  0.1067
Density     0.4588        0.1865          2.4602  0.0256
Looking at the p-value of the t-test statistic for each slope coefficient reveals that X2 is not significant at the 5% level of significance. The results after dropping X2 follow:

Regression Statistics
Multiple R          0.7466
R Square            0.5574
Adjusted R Square   0.5053
Standard Error      0.0393
Observations        20

ANOVA
             df  SS      MS      F        Significance F
Regression   2   0.0331  0.0166  10.7053  0.0010
Residual     17  0.0263  0.0015
Total        19  0.0595

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept   0.0624        0.0845          0.7380  0.4706   -0.1159    0.2406
Time        0.0020        0.0005          3.9966  0.0009   0.0009     0.0030
Density     0.4588        0.1967          2.3319  0.0323   0.0437     0.8738

The p-values of the t-test statistics for all the slope coefficients are below 5%. The best model is Ŷ = 0.0624 + 0.0020X1 + 0.4588X4. The normal probability plot suggests possible slight departure from the normality assumption. The residual plots suggest possible violation of the equal variances assumption.
[Figure: Normal Probability Plot (Residual versus Z Value)]
[Figure: Residual Plot for Time]
[Figure: Residual Plot for Density]
(c)
The most appropriate model to predict the product length in cavity 1 includes time and pressure while the most appropriate model to predict the product length in cavity 2 includes time and density.
(d)
In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger.
The best-subset approach yielded only the following models to be considered:

Model        Cp      k+1  R Square  Adj. R Square  Std. Error  Consider This Model?
X1           0.1360  2    0.2444    0.2025         0.1737      Yes
X1X2         1.9472  3    0.2533    0.1654         0.1777      Yes
X1X3         2.1345  3    0.2445    0.1556         0.1788      Yes
X1X4         1.3053  3    0.2833    0.1990         0.1741      Yes
X1X5         1.0212  3    0.2966    0.2139         0.1725      Yes
X1X2X3       3.9456  4    0.2533    0.1134         0.1832      Yes
X1X2X4       3.1164  4    0.2922    0.1595         0.1784      Yes
X1X2X5       2.8323  4    0.3055    0.1753         0.1767      Yes
X1X3X4       3.3037  4    0.2834    0.1490         0.1795      Yes
X1X3X5       3.0196  4    0.2967    0.1648         0.1778      Yes
X1X4X5       2.1904  4    0.3355    0.2109         0.1728      Yes
X1X2X3X5     4.8307  5    0.3056    0.1204         0.1825      Yes
X1X2X4X5     4.0016  5    0.3444    0.1695         0.1773      Yes
X1X3X4X5     4.1889  5    0.3356    0.1584         0.1785      Yes
X1X2X3X4X5   6.0000  6    0.3445    0.1103         0.1835      Yes
where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity. The model with the highest adjusted r-square contains X1 and X5, and yields the following results:

            Coefficients  Standard Error  t Stat  P-value
Intercept   0.3724        0.2025          1.8392  0.0834
Time        0.0052        0.0022          2.4306  0.0264
Quantity    0.0484        0.0431          1.1233  0.2769
Looking at the p-value of the t-test statistic for each slope coefficient reveals that X5 is not significant at 5% level of significance. The results after dropping X5 follow:
Regression Statistics
Multiple R          0.4944
R Square            0.2444
Adjusted R Square   0.2025
Standard Error      0.1737
Observations        20

ANOVA
             df  SS      MS      F       Significance F
Regression   1   0.1758  0.1758  5.8231  0.0267
Residual     18  0.5433  0.0302
Total        19  0.7191

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept   0.5420        0.1360          3.9858  0.0009   0.2563     0.8276
Time        0.0052        0.0022          2.4131  0.0267   0.0007     0.0098
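The entries of this ANOVA table are linked: each mean square is its sum of squares divided by its degrees of freedom, F is the ratio of the two mean squares, and r2 is the regression sum of squares divided by the total. The sketch below recomputes them from the rounded sums of squares printed above, so small rounding differences are expected; the variable names are illustrative.

```python
# Values from the ANOVA table for the time-only model
SSR, DF_REG = 0.1758, 1    # regression sum of squares and its degrees of freedom
SSE, DF_RES = 0.5433, 18   # residual sum of squares and its degrees of freedom

msr = SSR / DF_REG          # mean square regression
mse = SSE / DF_RES          # mean square error
f_stat = msr / mse          # overall F statistic
r_squared = SSR / (SSR + SSE)
std_error = mse ** 0.5      # standard error of the estimate

print(f"MSR = {msr:.4f}, MSE = {mse:.4f}")
print(f"F = {f_stat:.2f}")
print(f"r2 = {r_squared:.4f}")
print(f"Standard error = {std_error:.4f}")
```

With a single independent variable, the overall F statistic is also the square of the slope's t statistic, which is why both tests report the same p-value of 0.0267.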
The p-value of the t-test statistic for the significance of time is < 5%. The best model is Ŷ = 0.5420 + 0.0052X1. None of the observations has a Cook's Di greater than the critical value F = 0.8212 with d.f. = 3 and 17. Hence, using the Studentized deleted residuals, hat matrix diagonal elements and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model. The normal probability plot suggests possible slight departure from the normality assumption. The residual plots suggest possible violation of the equal variances assumption.
[Figure: Normal Probability Plot, Residuals vs. Z Value]
[Figure: Residual Plot, Residuals vs. X]
(e)
In the multiple regression model with all the possible independent variables, none of the variables have a VIF value of 5 or larger.
The best-subset approach yielded the following models to be considered:
Model       Cp       k+1  R Square  Adj. R Square  Std. Error  Consider This Model?
X1          –0.6667  2    0.2023    0.1580         0.1831      Yes
X1X2        1.1217   3    0.2133    0.1208         0.1871      Yes
X1X3        1.3164   3    0.2032    0.1094         0.1883      Yes
X1X4        1.0492   3    0.2171    0.1250         0.1867      Yes
X1X5        0.5125   3    0.2450    0.1562         0.1833      Yes
X1X2X3      3.1049   4    0.2142    0.0668         0.1928      Yes
X1X2X4      2.8376   4    0.2281    0.0834         0.1911      Yes
X1X2X5      2.3009   4    0.2560    0.1165         0.1876      Yes
X1X3X4      3.0323   4    0.2180    0.0713         0.1923      Yes
X1X3X5      2.4956   4    0.2459    0.1045         0.1888      Yes
X1X4X5      2.2284   4    0.2598    0.1210         0.1871      Yes
X1X2X3X4    4.8208   5    0.2290    0.0234         0.1972      Yes
X1X2X3X5    4.2841   5    0.2569    0.0587         0.1936      Yes
X1X2X4X5    4.0168   5    0.2708    0.0763         0.1918      Yes
X1X3X4X5    4.2115   5    0.2607    0.0635         0.1931      Yes
X1X2X3X4X5  6.0000   6    0.2717    0.0115         0.1984      Yes
where X1 = Time, X2 = Pressure, X3 = Amplitude, X4 = Density, and X5 = Quantity. The model with the highest adjusted r-square contains X1 and yields the following results:
Regression Statistics
Multiple R           0.4498
R Square             0.2023
Adjusted R Square    0.1580
Standard Error       0.1831
Observations         20

ANOVA
            df   SS      MS      F       Significance F
Regression  1    0.1531  0.1531  4.5650  0.0466
Residual    18   0.6036  0.0335
Total       19   0.7567

            Coefficients  Standard Error  t Stat  P-value  Lower 95%  Upper 95%
Intercept   0.5673        0.1433          3.9586  0.0009   0.2662     0.8684
Time        0.0049        0.0023          2.1366  0.0466   0.0001     0.0097
Looking at the p-value of the t-test statistic for the slope coefficient reveals that X1 is significant at the 5% level of significance. The best model is $\hat{Y} = 0.5673 + 0.0049X_1$.
None of the observations have a Cook's $D_i > F = 0.8212$ with d.f. = 3 and 17. Hence, using the Studentized deleted residuals, hat matrix diagonal elements, and Cook's distance statistic together, there is insufficient evidence for removal of any observation from the model. The normal probability plot does not suggest any departure from the normality assumption. The residual plots suggest a possible violation of the equal variances assumption.
[Figure: Normal Probability Plot, Residuals vs. Z Value]
[Figure: Residual Plot, Residuals vs. X]
(f)
The most appropriate model to predict the product weight in both cavity 1 and cavity 2 contains only the variable time. A slightly higher percentage of the variation in product weight is explained by the variation in time for cavity 1 than for cavity 2: r2 = 0.2444 for cavity 1, while r2 = 0.2023 for cavity 2.
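The adjusted r2 values reported for the two single-variable models in 15.45 can be checked from r2, the sample size, and the number of independent variables. The following is a minimal sketch, not PHStat output; the function name `adjusted_r2` is an illustrative choice, and small differences in the last digit come from using the rounded r2 values.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted r^2 for a model with k independent variables and n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


# Time-only models from 15.45 (n = 20 observations, k = 1 independent variable).
cavity1 = adjusted_r2(0.2444, n=20, k=1)  # reported as 0.2025
cavity2 = adjusted_r2(0.2023, n=20, k=1)  # reported as 0.1580

print(f"cavity 1 adjusted r2: {cavity1:.4f}")
print(f"cavity 2 adjusted r2: {cavity2:.4f}")
```

The same function reproduces the other rows of the best-subsets tables when k is set to the number of variables in the model (k + 1 minus one).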
15.46
From PHStat, Y = average annual salary, X1 = unemployment rate, X2 = median home price, X3 = violent crime per 100,000 residents, X4 = average commute time, X5 = livability score.

                                     Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept                            -8258.7031    15018.7217      -0.5499  0.5888
Unemployment Rate                    -92.5228      853.7659        -0.1084  0.9148   0.3720    1.5925
Median Home Price                    0.0274        0.0044          6.2698   0.0000   0.4337    1.7658
Violent Crime per 100,000 residents  0.3855        9.0627          0.0425   0.9665   0.4667    1.8752
Average Commute Time                 763.1739      449.6812        1.6971   0.1060   0.6725    3.0534
Livability Score                     608.8310      207.4193        2.9353   0.0085   0.1030    1.1148
A regression analysis revealed that all five variables had VIF values below 5, so there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a three-variable model with median home price, average commute time, and livability score has the lowest Cp value and the highest adjusted r2. The best-subset approach yielded: Cp = 2.0157 with X2X4X5 and adjusted r2 = 0.8543. (PHStat display of 3 smallest Cp)

Model     Cp      k+1  R Square  Adj. R Square  Std. Error
X2X4X5    2.0157  4    0.8725    0.8543         4473.4675
X1X2X4X5  4.0018  5    0.8726    0.8471         4582.2602
X2X3X4X5  4.0117  5    0.8725    0.8470         4583.4579
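The VIF column in the 15.46 coefficient table follows directly from the R Square column, since VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other independent variables. A short sketch of that check is given here; the dictionary layout and the name `vif` are illustrative assumptions, and the R Square inputs are the rounded values from the table.

```python
def vif(r2_j: float) -> float:
    """Variance inflationary factor from the R^2 of X_j regressed on the other X's."""
    return 1.0 / (1.0 - r2_j)


# R Square values from the 15.46 table (rounded to four decimals).
r2_by_variable = {
    "Unemployment Rate": 0.3720,
    "Median Home Price": 0.4337,
    "Violent Crime per 100,000 residents": 0.4667,
    "Average Commute Time": 0.6725,
    "Livability Score": 0.1030,
}

vifs = {name: vif(r2) for name, r2 in r2_by_variable.items()}
for name, value in vifs.items():
    flag = "possible collinearity" if value >= 5 else "below 5"
    print(f"{name}: VIF = {value:.4f} ({flag})")
```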
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

                      Coefficients  Standard Error  t Stat   P-value
Intercept             -8905.7247    13150.0706      -0.6772  0.5056
Median Home Price     0.0272        0.0039          6.9839   0.0000
Livability Score      613.1958      194.1231        3.1588   0.0047
Average Commute Time  760.2396      295.8934        2.5693   0.0179
The most appropriate multiple regression model for predicting average annual salary is: $\hat{Y} = -8905.7247 + 0X_1 + 0.0272X_2 + 0X_3 + 760.2396X_4 + 613.1958X_5$, for X1 = unemployment rate, X2 = median home price, X3 = violent crime per 100,000 residents, X4 = average commute time, X5 = livability score. The adjusted r2 for the best model is 0.8543 and the r2 for the model is 0.8725, so 87.25% of the variation in average annual salary can be explained by variation in median home price, average commute time, and livability score.
Residual analyses reveal no potential violations in assumptions.
15.47
From PHStat, Y = wins, X1 = field goal percentage, X2 = three-point percentage, X3 = free throw percentage, X4 = rebounds, X5 = assists, X6 = turnovers.

                        Coefficients  Standard Error  t Stat   P-value  R Square  VIF
Intercept               -338.4853     82.2815         -4.1137  0.0004
Field Goal Percentage   223.6901      136.5881        1.6377   0.1151   0.6174    2.6134
Three-Point Percentage  322.2388      118.8202        2.7120   0.0124   0.4428    1.7947
Free Throw Percentage   79.4471       53.5478         1.4837   0.1515   0.2857    1.4000
Rebounds                2.8376        0.8781          3.2315   0.0037   0.1763    1.2140
Assists                 0.0694        0.9365          0.0741   0.9416   0.4282    1.7488
Turnovers               -1.9662       1.4960          -1.3143  0.2017   0.2357    1.3084
A regression analysis revealed that all six variables had VIF values below 5. So, there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with field goal percentage, three-point percentage, free throw percentage, and rebounds has the lowest Cp value. The five-variable model X1X2X3X4X6 has a slightly higher adjusted r2 (0.6825).
The best-subset approach yielded: Cp = 4.7442 with X1X2X3X4 and adjusted r2 = 0.6722. (PHStat display of 3 smallest Cp)

Model       Cp      k+1  R Square  Adj. R Square  Std. Error
X1X2X3X4    4.7442  5    0.7174    0.6722         6.6098
X1X2X4X6    5.2054  5    0.7121    0.6661         6.6711
X1X2X3X4X6  5.0055  6    0.7373    0.6825         6.5048
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model:

                        Coefficients  Standard Error  t Stat   P-value
Intercept               -360.9347     61.7701         -5.8432  0.0000
Three-Point Percentage  481.6958      99.9288         4.8204   0.0001
Rebounds                3.1252        0.8689          3.5967   0.0013
Free Throw Percentage   119.5796      52.2259         2.2897   0.0304
The most appropriate multiple regression model for predicting wins is: $\hat{Y} = -360.9347 + 0X_1 + 481.6958X_2 + 119.5796X_3 + 3.1252X_4 + 0X_5 + 0X_6$, for X1 = field goal percentage, X2 = three-point percentage, X3 = free throw percentage, X4 = rebounds, X5 = assists, X6 = turnovers.
Residual analyses reveal no potential violations in the equal variance assumption. However, there appears to be some deviation from the normality assumption.
15.48
All analyses and associated summaries are provided with problems 15.38–15.42.
Chapter 16
16.1
Use the smoothed value for this year: 41.3 million of constant this year dollars.
16.2
(a)
Since you need data from four prior years to obtain the centered 9-year moving average for any given year and since the first recorded value is for 1984, the first centered moving average value you can calculate is for 1988.
(b)
You would lose four years for the period 1984–1987 since you do not have enough past values to compute a centered moving average. You will also lose the final four years of the recorded time series since you do not have enough later values to compute a centered moving average. Therefore, you will lose a total of eight years in computing a series of 9-year moving averages.
16.3
(a)
$E_{2022} = (0.20)(12.3) + (0.80)(9.5) = 10.06$
(b)
$E_{2023} = 0.20Y_{2023} + 0.80E_{2022} = 0.20(12.6) + 0.80(10.06) = 10.57$
16.4
(a)
Time Series Plot: Enrollment vs Year
(b)
Three-year moving average

Year  Enrollment  MA(3)
2011  956         #N/A
2012  933         935.3333
2013  917         936.6667
2014  960         927.6667
2015  906         955.0000
2016  999         960.0000
2017  975         958.6667
2018  902         933.6667
2019  924         930.6667
2020  966         959.3333
2021  988         973.0000
2022  965         976.5000

[Figure: Three-year moving average for Enrollment]
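The MA(3) column for 16.4 can be reproduced with a centered moving average. The sketch below is an illustration, not the PHStat or Excel procedure; the function name `centered_moving_average` is hypothetical and it assumes an odd window length. It leaves both endpoints undefined, whereas the table's final entry (976.5000) equals the mean of only the last two observations, (988 + 965) / 2.

```python
def centered_moving_average(values, length=3):
    """Centered moving average for an odd window length; endpoints are None."""
    half = length // 2
    result = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            # Not enough earlier or later values to center the window here.
            result.append(None)
        else:
            window = values[i - half : i + half + 1]
            result.append(sum(window) / length)
    return result


# Enrollment, 2011 through 2022, from the 16.4 table.
enrollment = [956, 933, 917, 960, 906, 999, 975, 902, 924, 966, 988, 965]
ma3 = centered_moving_average(enrollment, length=3)

for year, value in zip(range(2011, 2023), ma3):
    print(year, "#N/A" if value is None else f"{value:.4f}")
```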
(c)
W = 0.50, exponentially smooth the series

Year  Enrollment  ES(0.50)
2011  956         956.0000
2012  933         944.5000
2013  917         930.7500
2014  960         945.3750
2015  906         925.6875
2016  999         962.3438
2017  975         968.6719
2018  902         935.3359
2019  924         929.6680
2020  966         947.8340
2021  988         967.9170
2022  965         966.4585
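The exponentially smoothed columns in 16.4 follow the recursion E_1 = Y_1 and E_i = W Y_i + (1 - W) E_{i-1}, with the last smoothed value serving as the forecast for the next period. A minimal sketch of that recursion is shown here; the function name `exponential_smooth` is an illustrative choice rather than anything from PHStat.

```python
def exponential_smooth(values, w):
    """Exponentially smoothed series: E_1 = Y_1, E_i = w*Y_i + (1 - w)*E_{i-1}."""
    smoothed = [float(values[0])]
    for y in values[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


# Enrollment, 2011 through 2022, from the 16.4 table.
enrollment = [956, 933, 917, 960, 906, 999, 975, 902, 924, 966, 988, 965]

es50 = exponential_smooth(enrollment, w=0.50)
es25 = exponential_smooth(enrollment, w=0.25)

# The forecast for 2023 is the smoothed value for 2022.
print(f"W = 0.50 forecast for 2023: {es50[-1]:.4f}")
print(f"W = 0.25 forecast for 2023: {es25[-1]:.4f}")
```

The same function applies unchanged to the series in 16.5 through 16.8.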
(d)
$\hat{Y}_{2023} = E_{2022} = 966.4585$
(e)
W = 0.25, exponentially smooth the series

Year  Enrollment  ES(0.25)
2011  956         956.0000
2012  933         950.2500
2013  917         941.9375
2014  960         946.4531
2015  906         936.3398
2016  999         952.0049
2017  975         957.7537
2018  902         943.8152
2019  924         938.8614
2020  966         945.6461
2021  988         956.2346
2022  965         958.4259

$\hat{Y}_{2023} = E_{2022} = 958.4259$
(f)
The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.
(g)
There is no perceptible trend in the annual enrollment in the introductory business statistics courses from 2011 to 2022. Enrollment has remained fairly consistent at about 950 students.
16.5
(a)
Time Series Plot: Accounting Majors: Majors vs Year
(b)
Three-year moving average for Majors

Year  Majors  MA(3)
2011  283     #N/A
2012  290     294.0000
2013  309     321.3333
2014  365     330.0000
2015  316     323.0000
2016  288     310.0000
2017  326     303.3333
2018  296     330.6667
2019  370     326.0000
2020  312     340.0000
2021  338     309.6667
2022  279     308.5000
(c)
W = 0.50, exponentially smooth the series

Year  Majors  ES(0.50)
2011  283     283.0000
2012  290     286.5000
2013  309     297.7500
2014  365     331.3750
2015  316     323.6875
2016  288     305.8438
2017  326     315.9219
2018  296     305.9609
2019  370     337.9805
2020  312     324.9902
2021  338     331.4951
2022  279     305.2476
[Figure: Exponentially Smoothed Majors, ES(0.50), W = 0.50; Majors and ES(0.50) vs. Year]
(d)
$\hat{Y}_{2023} = E_{2022} = 305.2476$
(e)
W = 0.25, exponentially smooth the series

Year  Majors  ES(0.25)
2011  283     283.0000
2012  290     284.7500
2013  309     290.8125
2014  365     309.3594
2015  316     311.0195
2016  288     305.2646
2017  326     310.4485
2018  296     306.8364
2019  370     322.6273
2020  312     319.9705
2021  338     324.4778
2022  279     313.1084

$\hat{Y}_{2023} = E_{2022} = 313.1084$
(f)
The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.
(g)
There is no perceptible trend in the annual number of declared accounting majors from 2011 to 2022. The number of majors has remained fairly consistent at about 313 students.
16.6
(a)
Time Series Plot: Stock Performance, Decade vs Performance (%)
(b)
Three-period moving average

Decade  Performance  MA(3)
1830s   2.8          #N/A
1840s   12.8         7.4000
1850s   6.6          10.6333
1860s   12.5         8.8667
1870s   7.5          8.6667
1880s   6.0          6.3333
1890s   5.5          7.4667
1900s   10.9         6.2000
1910s   2.2          8.8000
1920s   13.3         4.4333
1930s   -2.2         6.9000
1940s   9.6          8.5333
1950s   18.2         12.0333
1960s   8.3          11.0333
1970s   6.6          10.5000
1980s   16.6         13.6000
1990s   17.6         11.8000
2000s   1.2          10.8000
2010s   13.6         #N/A

[Figure: Three-period moving average for Stock Performance]
(c)
W = 0.50, exponentially smooth the series

Decade  Performance  ES(0.50)
1830s   2.8          2.8000
1840s   12.8         7.8000
1850s   6.6          7.2000
1860s   12.5         9.8500
1870s   7.5          8.6750
1880s   6.0          7.3375
1890s   5.5          6.4188
1900s   10.9         8.6594
1910s   2.2          5.4297
1920s   13.3         9.3648
1930s   -2.2         3.5824
1940s   9.6          6.5912
1950s   18.2         12.3956
1960s   8.3          10.3478
1970s   6.6          8.4739
1980s   16.6         12.5370
1990s   17.6         15.0685
2000s   1.2          8.1342
2010s   13.6         10.8671

[Figure: Exponentially smoothed Stock Performance, W = 0.50]
(d)
$\hat{Y}_{2020s} = E_{2010s} = 10.8671$
(e)
W = 0.25, exponentially smooth the series

Decade  Performance  ES(0.25)
1830s   2.8          2.8000
1840s   12.8         5.3000
1850s   6.6          5.6250
1860s   12.5         7.3438
1870s   7.5          7.3828
1880s   6.0          7.0371
1890s   5.5          6.6528
1900s   10.9         7.7146
1910s   2.2          6.3360
1920s   13.3         8.0770
1930s   -2.2         5.5077
1940s   9.6          6.5308
1950s   18.2         9.4481
1960s   8.3          9.1611
1970s   6.6          8.5208
1980s   16.6         10.5406
1990s   17.6         12.3055
2000s   1.2          9.5291
2010s   13.6         10.5468

[Figure: Exponentially smoothed Stock Performance, W = 0.25]
$\hat{Y}_{2020s} = E_{2010s} = 10.5468$
(f)
The exponentially smoothed forecast for the 2020s is lower with a W of 0.25 (10.5468) than with a W of 0.50 (10.8671). The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.
(g)
Exponential smoothing with a W of 0.25 reveals a general upward trend in the performance of stocks over the last several decades.
16.7
(a)
Time Series Plot: Coffee Exports (thousands 60 kg bag) vs Year
(b)
Three-year moving average for Coffee Exports

Year  Exports (thousands 60 kg bag)  MA(3)
2004  10,194                         #N/A
2005  10,871                         10670.14
2006  10,945                         11038.84
2007  11,300                         11110.15
2008  11,085                         10093.17
2009  7,894                          8933.58
2010  7,822                          7816.40
2011  7,734                          7575.15
2012  7,170                          8191.25
2013  9,670                          9264.84
2014  10,954                         11113.57
2015  12,716                         12167.26
2016  12,831                         12844.13
2017  12,985                         12845.33
2018  12,720                         12405.00
2019  11,510                         11842.67
2020  11,298                         11436.33
2021  11,501                         #N/A
(c)
W = 0.50, exponentially smooth the series

Year  Exports (thousands 60 kg bag)  ES(0.50)
2004  10,194                         10194.00
2005  10,871                         10532.62
2006  10,945                         10738.74
2007  11,300                         11019.58
2008  11,085                         11052.37
2009  7,894                          9473.15
2010  7,822                          8647.39
2011  7,734                          8190.51
2012  7,170                          7680.36
2013  9,670                          8675.13
2014  10,954                         9814.77
2015  12,716                         11265.58
2016  12,831                         12048.29
2017  12,985                         12516.64
2018  12,720                         12618.32
2019  11,510                         12064.16
2020  11,298                         11681.08
2021  11,501                         11591.04
(d)
$\hat{Y}_{2022} = E_{2021} = 11{,}591.04$ in thousands of 60 kg bags
(e)
W = 0.25, exponentially smooth the series

Year  Exports (thousands 60 kg bag)  ES(0.25)
2004  10,194                         10194.00
2005  10,871                         10363.31
2006  10,945                         10508.70
2007  11,300                         10706.63
2008  11,085                         10801.26
2009  7,894                          10074.43
2010  7,822                          9511.23
2011  7,734                          9066.83
2012  7,170                          8592.67
2013  9,670                          8861.98
2014  10,954                         9385.09
2015  12,716                         10217.91
2016  12,831                         10871.18
2017  12,985                         11399.64
2018  12,720                         11729.73
2019  11,510                         11674.80
2020  11,298                         11580.60
2021  11,501                         11560.70
$\hat{Y}_{2022} = E_{2021} = 11{,}560.7$ in thousands of 60 kg bags
(f)
The exponentially smoothed 2022 forecast for Costa Rica coffee exports is lower with a W of 0.25 than with a W of 0.50. The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.
(g)
Exponential smoothing with a W of 0.25 reveals an upward trend from 2004 to 2008 in Costa Rica coffee exports, which was followed by a decline in exports from 2009 until 2013. An increase in coffee exports was observed from 2014 to 2021.
16.8
(a)
Time Series Plot: IPOs vs Year
(b)
Three-year moving average for IPOs

Year  Number of IPOs  MA(3)
2000  397             #N/A
2001  141             240.3333
2002  183             157.3333
2003  148             215.0000
2004  314             249.3333
2005  286             273.3333
2006  220             258.0000
2007  268             183.3333
2008  62              136.3333
2009  79              110.3333
2010  190             146.6667
2011  171             172.6667
2012  157             193.0000
2013  251             237.3333
2014  304             253.6667
2015  206             214.3333
2016  133             185.3333
2017  217             201.6667
2018  255             234.6667
2019  232             322.3333
2020  480             #N/A

[Figure: Three-year moving average for IPOs]
(c)
W = 0.50, exponentially smooth the series

Year  Number of IPOs  ES(0.50)
2000  397             397.0000
2001  141             269.0000
2002  183             226.0000
2003  148             187.0000
2004  314             250.5000
2005  286             268.2500
2006  220             244.1250
2007  268             256.0625
2008  62              159.0313
2009  79              119.0156
2010  190             154.5078
2011  171             162.7539
2012  157             159.8770
2013  251             205.4385
2014  304             254.7192
2015  206             230.3596
2016  133             181.6798
2017  217             199.3399
2018  255             227.1700
2019  232             229.5850
2020  480             354.7925

[Figure: Exponentially smoothed IPOs, W = 0.50]
(d)
$\hat{Y}_{2021} = E_{2020} = 354.7925$
(e)
W = 0.25, exponentially smooth the series

Year  Number of IPOs  ES(0.25)
2000  397             397.0000
2001  141             333.0000
2002  183             295.5000
2003  148             258.6250
2004  314             272.4688
2005  286             275.8516
2006  220             261.8887
2007  268             263.4165
2008  62              213.0624
2009  79              179.5468
2010  190             182.1601
2011  171             179.3701
2012  157             173.7775
2013  251             193.0832
2014  304             220.8124
2015  206             217.1093
2016  133             196.0820
2017  217             201.3115
2018  255             214.7336
2019  232             219.0502
2020  480             284.2877

[Figure: Exponentially smoothed IPOs, W = 0.25]
$\hat{Y}_{2021} = E_{2020} = 284.2877$
(f)
The exponentially smoothed 2021 forecast for IPOs is lower with a W of 0.25 than with a W of 0.50. The exponential smoothing with W = 0.50 assigns more weight to the more recent values and is better for short-term forecasting, while the exponential smoothing with W = 0.25, which assigns more weight to more distant values, is better suited for identifying long-term tendencies.
(g)
There appears to be a cyclical component every several years, with up and down cycles in the number of IPOs. The fact that the forecast was so incorrect suggests using other approaches for short-term forecasting.
16.9
(a) X = 0
(b) X = 4
(c) X = 30
(d) X = 35
16.10
(a)
The Y-intercept b0 = 4.0 is the fitted trend value reflecting the real total revenues (in millions of dollars) during the origin or base year 2001.
(b)
The slope b1 = 1.5 indicates that the real total revenues are increasing at an estimated rate of 1.5 million dollars per year.
(c)
Year is 2005, X = 2005 – 2001 = 4, $\hat{Y}_{2005} = 4.0 + 1.5(4) = 10.0$ million dollars
(d)
Year is 2022, X = 2022 – 2001 = 21, $\hat{Y}_{2022} = 4.0 + 1.5(21) = 35.5$ million dollars
(e)
Year is 2025, X = 2025 – 2001 = 24, $\hat{Y}_{2025} = 4.0 + 1.5(24) = 40.0$ million dollars
16.11
(a)
The Y-intercept b0 of 1.7 is the predicted mean sales in billions of dollars during the base year, 2001.
(b)
The slope b1 of 0.4 indicates that mean sales are predicted to increase at an estimated rate of 0.4 billion dollars per year.
(c)
$\hat{Y} = 1.7 + 0.4(9) = 5.3$ billion dollars
(d)
Most recent year is 2022, X = 2022 – 2001 = 21, $\hat{Y} = 1.7 + 0.4(21) = 10.1$ billion dollars
(e)
Year is 2024, X = 2024 – 2001 = 23, $\hat{Y} = 1.7 + 0.4(23) = 10.9$ billion dollars
16.12
(a)
(b)
Bonus($thousands) vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.7641
R Square             0.5838
Adjusted R Square    0.5630
Standard Error       30.8973
Observations         22

ANOVA
            df   SS          MS          F
Regression  1    26780.8753  26780.8753  28.0532
Residual    20   19092.9034  954.6452
Total       21   45873.7786

            Coefficients  Standard Error  t Stat  P-value
Intercept   89.1332       12.7378         6.9975  0.0000
Coded Year  5.4994        1.0383          5.2965  0.0000

Linear model: Predicted Bonus = $\hat{Y} = 89.1332 + 5.4994X$, where X = years relative to 2000. t = 5.2965, p-value = 0.0000 < 0.05; coded year is significant; r2 = 0.5838, so 58.38% of the variation in bonuses is explained by year.
[Figure: Fitted Line Plot of Bonus($thousands) vs Coded Year]
(c)
Regression Analysis: Bonus($thousands) vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.7688
R Square             0.5911
Adjusted R Square    0.5481
Standard Error       31.4203
Observations         22

ANOVA
            df   SS          MS          F
Regression  2    27116.3539  13558.1770  13.7335
Residual    19   18757.4247  987.2329
Total       21   45873.7786

               Coefficients  Standard Error  t Stat  P-value
Intercept      96.7498       18.3986         5.2585  0.0000
Coded Year     3.2145        4.0595          0.7918  0.4382
Coded Year Sq  0.1088        0.1867          0.5829  0.5668

Quadratic model: Predicted Bonus = $\hat{Y} = 96.7498 + 3.2145X + 0.1088X^2$, where X = years relative to 2000. For the full model: F = 13.7335, p-value = 0.0002 < 0.05, so at least one X variable is significant. For Coded Year Sq: t = 0.5829, p-value = 0.5668 > 0.05, so Coded Year Sq is not significant; r2 = 0.5911, so 59.11% of the variation in bonuses is explained by year.
[Figure: Fitted Line Plot of Bonus($thousands) vs Coded Year, Coded Year Sq]
(d)
log(Bonus($thousands)) vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.7616
R Square             0.5800
Adjusted R Square    0.5590
Standard Error       0.0995
Observations         22

ANOVA
            df   SS      MS      F
Regression  1    0.2732  0.2732  27.6224
Residual    20   0.1978  0.0099
Total       21   0.4711

            Coefficients  Standard Error  t Stat   P-value
Intercept   1.9595        0.0410          47.7890  0.0000
Coded Year  0.0176        0.0033          5.2557   0.0000

Exponential model: log (predicted bonus) = $\log_{10}\hat{Y} = 1.9595 + 0.0176X$, where X = years relative to 2000. t = 5.2557, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.5800, so 58% of the variation in the log of bonuses is explained by year.
[Figure: Fitted Line Plot of log(Bonus($thousands)) vs Coded Year]
(e)
Linear:
$\hat{Y}_{2022} = 89.1332 + 5.4994(22) = 210.1208$
$\hat{Y}_{2023} = 89.1332 + 5.4994(23) = 215.6202$
Quadratic: $\hat{Y} = 96.7498 + 3.2145X + 0.1088X^2$
$\hat{Y}_{2022} = 96.7498 + 3.2145(22) + 0.1088(22)^2 = 220.1312$
$\hat{Y}_{2023} = 96.7498 + 3.2145(23) + 0.1088(23)^2 = 228.2420$
Exponential:
$\log_{10}\hat{Y}_{2022} = 1.9595 + 0.0176(22) = 2.3460$, $\hat{Y}_{2022} = 10^{2.3460} = 221.7999$
$\log_{10}\hat{Y}_{2023} = 1.9595 + 0.0176(23) = 2.3635$, $\hat{Y}_{2023} = 10^{2.3635} = 230.9551$
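The three sets of forecasts in 16.12 (e) can be recomputed from the fitted equations. The sketch below hard-codes the coefficients as printed (rounded to four decimals), so its results differ slightly from the values above, which were presumably computed from unrounded coefficients; the function names are illustrative.

```python
BASE_YEAR = 2000  # X = years relative to 2000


def linear(x: float) -> float:
    """Linear trend: Y-hat = 89.1332 + 5.4994 X."""
    return 89.1332 + 5.4994 * x


def quadratic(x: float) -> float:
    """Quadratic trend: Y-hat = 96.7498 + 3.2145 X + 0.1088 X^2."""
    return 96.7498 + 3.2145 * x + 0.1088 * x ** 2


def exponential(x: float) -> float:
    """Exponential trend: log10(Y-hat) = 1.9595 + 0.0176 X."""
    return 10 ** (1.9595 + 0.0176 * x)


forecasts = {}
for year in (2022, 2023):
    x = year - BASE_YEAR
    forecasts[year] = (linear(x), quadratic(x), exponential(x))
    lin, quad, expo = forecasts[year]
    print(f"{year}: linear {lin:.2f}, quadratic {quad:.2f}, exponential {expo:.2f}")
```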
(f)
Because the r2 values are similar, you would evaluate the residual plots to see if there were any patterns before choosing. Based on the principle of parsimony, one might choose the linear model.
16.13
(a)
(b)
GDP vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9912
R Square             0.9825
Adjusted R Square    0.9821
Standard Error       787.0082
Observations         42

ANOVA
            df   SS               MS               F
Regression  1    1390018070.5883  1390018070.5883  2244.2020
Residual    40   24775275.9815    619381.8995
Total       41   1414793346.5698

            Coefficients  Standard Error  t Stat   P-value
Intercept   1389.3978     238.6022        5.8231   0.0000
Coded Year  474.6244      10.0189         47.3730  0.0000

$\hat{Y} = 1{,}389.3978 + 474.6244X$, where X = years relative to 1980.
[Figure: Linear Model: Fitted Line Plot, GDP = 1,389.3978 + 474.6244 Coded Year]
(c)
$\hat{Y}_{2022} = 1{,}389.3978 + 474.6244(42) = 21{,}323.6218$ billion dollars.
$\hat{Y}_{2023} = 1{,}389.3978 + 474.6244(43) = 21{,}798.2462$ billion dollars.
(d)
Based on the regression equation, one would conclude that there is an upward linear trend in GDP.
16.14
(a)
(b)
Receipts vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9824
R Square             0.9651
Adjusted R Square    0.9642
Standard Error       177.2781
Observations         44

ANOVA
            df   SS             MS             F
Regression  1    36448964.8010  36448964.8010  1159.7783
Residual    42   1319956.1581   31427.5276
Total       43   37768920.9591

            Coefficients  Standard Error  t Stat   P-value
Intercept   238.9876      52.5530         4.5476   0.0000
Coded Year  71.6748       2.1046          34.0555  0.0000

$\hat{Y} = 238.9876 + 71.6748X$, where X = years relative to 1978. t = 34.0555, p-value = 0.0000 < 0.05. Coded year is significant. r2 = 0.9651, so 96.51% of the variation in federal receipts is explained by year.
(c)
$\hat{Y}_{2022} = 238.9876 + 71.6748(44) = 3{,}392.678$ billion
$\hat{Y}_{2023} = 238.9876 + 71.6748(45) = 3{,}464.353$ billion
(d)
There is a strong upward trend in federal receipts from 1978 through 2021, which appears to be linear.
16.15
(a)
(b)
Total Sales (thousands) vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.3960
R Square             0.1568
Adjusted R Square    0.1267
Standard Error       239.2122
Observations         30

ANOVA
            df   SS            MS           F
Regression  1    298008.6318   298008.6318  5.2079
Residual    28   1602229.2349  57222.4727
Total       29   1900237.8667

            Coefficients  Standard Error  t Stat   P-value
Intercept   868.7011      85.2085         10.1950  0.0000
Coded Year  -11.5150      5.0458          -2.2821  0.0303

Linear Model: Predicted House Sales = $\hat{Y} = 868.7011 - 11.515X$, where X = years relative to 1992. t = –2.2821, p-value = 0.0303 < 0.05. Coded year is significant. r2 = 0.1568, so 15.68% of the variation in house sales is explained by year.
[Figure: Linear Model: Fitted Line Plot, Total Sales (thousands) = 868.7011 – 11.515 Coded Year]
(c)
Total Sales (thousands) vs Coded Year, Coded Year Sq Regression Analysis

Regression Statistics
Multiple R           0.4392
R Square             0.1929
Adjusted R Square    0.1331
Standard Error       238.3291
Observations         30

ANOVA
            df   SS            MS           F
Regression  2    366617.2501   183308.6251  3.2272
Residual    27   1533620.6165  56800.7636
Total       29   1900237.8667

               Coefficients  Standard Error  t Stat   P-value
Intercept      771.9544      122.2948        6.3122   0.0000
Coded Year     9.2164        19.5217         0.4721   0.6406
Coded Year Sq  -0.7149       0.6505          -1.0990  0.2815

Quadratic Model: Predicted House Sales = $\hat{Y} = 771.9544 + 9.2164X - 0.7149X^2$, where X = years relative to 1992. For the full model, F = 3.2272 with 2 and 27 degrees of freedom, which gives a p-value of about 0.055 > 0.05, so the overall model is not significant at the 0.05 level. For Coded Year Sq, t = –1.0990, p-value = 0.2815 > 0.05. Coded Year Sq is not significant. r2 = 0.1929, so 19.29% of the variation in house sales is explained by year.
[Figure: Quadratic Model: Fitted Line Plot, Total Sales (thousands) = 771.9544 + 9.2164 Coded Year – 0.7149 Coded Year Sq]
(d)
log(Total Sales(thousands)) vs Coded Year Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.4133
R Square             0.1708
Adjusted R Square    0.1412
Standard Error       0.1531
Observations         30

ANOVA
            df   SS      MS      F
Regression  1    0.1352  0.1352  5.7694
Residual    28   0.6561  0.0234
Total       29   0.7913

            Coefficients  Standard Error  t Stat   P-value
Intercept   2.9295        0.0545          53.7250  0.0000
Coded Year  -0.0078       0.0032          -2.4020  0.0232

Exponential model: log (predicted house sales) = $\log_{10}\hat{Y} = 2.9295 - 0.0078X$, where X = years relative to 1992. t = –2.4020, p-value = 0.0232 < 0.05; Coded Year is significant; r2 = 0.1708, so 17.08% of the variation in the log of house sales is explained by year.
[Figure: Fitted Line Plot of log(House Sales(thousands)) vs Coded Year]
(e)
Year  Total Sales  First Difference  Second Difference  Percentage Difference
1992  610          #N/A              #N/A               #N/A
1993  666          56                #N/A               9.18%
1994  670          4                 -52                0.60%
1995  667          -3                -7                 -0.45%
1996  757          90                93                 13.49%
1997  804          47                -43                6.21%
1998  886          82                35                 10.20%
1999  880          -6                -88                -0.68%
2000  877          -3                3                  -0.34%
2001  908          31                34                 3.53%
2002  973          65                34                 7.16%
2003  1,086        113               48                 11.61%
2004  1,203        117               4                  10.77%
2005  1,283        80                -37                6.65%
2006  1,051        -232              -312               -18.08%
2007  776          -275              -43                -26.17%
2008  485          -291              -16                -37.50%
2009  375          -110              181                -22.68%
2010  323          -52               58                 -13.87%
2011  306          -17               35                 -5.26%
2012  368          62                79                 20.26%
2013  429          61                -1                 16.58%
2014  437          8                 -53                1.86%
2015  501          64                56                 14.65%
2016  561          60                -4                 11.98%
2017  613          52                -8                 9.27%
2018  617          4                 -48                0.65%
2019  600          -17               -21                -2.76%
2020  650          50                67                 8.33%
2021  690          40                -10                6.15%
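The difference columns used for model selection in 16.15 (e) can be generated as follows. This is a sketch under the usual definitions (first difference Y_i - Y_{i-1}, second difference as the change in first differences, percentage difference relative to the prior value); the function name `differences` is illustrative and only the first six years of the series are included to keep the example short.

```python
def differences(values):
    """Return first, second, and percentage differences; undefined entries are None."""
    first = [None] + [b - a for a, b in zip(values, values[1:])]
    second = [None, None] + [
        first[i] - first[i - 1] for i in range(2, len(values))
    ]
    pct = [None] + [100.0 * (b - a) / a for a, b in zip(values, values[1:])]
    return first, second, pct


# Total sales (thousands) for 1992 through 1997, from the table above.
sales = [610, 666, 670, 667, 757, 804]
first, second, pct = differences(sales)

for year, y, d1, d2, p in zip(range(1992, 1998), sales, first, second, pct):
    d1_text = "#N/A" if d1 is None else str(d1)
    d2_text = "#N/A" if d2 is None else str(d2)
    p_text = "#N/A" if p is None else f"{p:.2f}%"
    print(year, y, d1_text, d2_text, p_text)
```

Roughly constant first differences point toward a linear trend, constant second differences toward a quadratic trend, and constant percentage differences toward an exponential trend.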
(f)
A review of the first, second, and percentage differences reveals that no particular model is more appropriate than the others. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.
Linear forecast with X = 30: \( \hat{Y} = 868.7011 - 11.515(30) = 523.2506 \) (thousands)
16.16
(a)
(b)
Linear: Solar Power Generated (gigawatts) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9700
R Square             0.9409
Adjusted R Square    0.9350
Standard Error       9641.9264
Observations         12
ANOVA
              df    SS                   MS                   F
Regression    1     14791878580.3916     14791878580.3916     159.1094
Residual      10    929667449.2751       92966744.9275
Total         11    15721546029.6667
              Coefficients    Standard Error    t Stat     P-value
Intercept     -14810.0897     5235.7684         -2.8286    0.0179
Coded Year    10170.5315      806.2984          12.6139    0.0000
Predicted solar power generated = \( \hat{Y} = -14,810.0897 + 10,170.5315X \), where X = years relative to 2010. t = 12.6139, p-value = 0.0000 < 0.05; coded year is significant; r2 = 0.9409. 94.09% of the variation in solar power generated is explained by year.
Fitted Line Plot of Solar Power Generated (gigawatts) vs Coded Year
(c)
Quadratic Regression Analysis: Solar Power Generated vs Coded Year, Coded Year Sq
Regression Statistics
Multiple R           0.9968
R Square             0.9936
Adjusted R Square    0.9922
Standard Error       3340.8379
Observations         12
ANOVA
              df    SS                   MS                  F
Regression    2     15621095247.8205     7810547623.9103     699.7947
Residual      9     100450781.8462       11161197.9829
Total         11    15721546029.6667
                 Coefficients    Standard Error    t Stat     P-value
Intercept        -359.3846       2470.1951         -0.1455    0.8875
Coded Year       1500.1084       1043.9910         1.4369     0.1846
Coded Year Sq    788.2203        91.4469           8.6194     0.0000
Predicted solar power generated = \( \hat{Y} = -359.3846 + 1,500.1084X + 788.2203X^2 \), where X = years relative to 2010. For the full model: F = 699.7947, p-value = 0.0002, at least one X variable is significant. For Coded Year^2: t = 8.6194, p-value = 0.0000 < 0.05. Coded Year^2 is significant; r2 = 0.9936. 99.36% of the variation in solar power generated is explained by year.
(d)
Exponential: log(Solar Power (gigawatts)) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9628
R Square             0.9270
Adjusted R Square    0.9197
Standard Error       0.1904
Observations         12
ANOVA
              df    SS        MS        F
Regression    1     4.6026    4.6026    127.0262
Residual      10    0.3623    0.0362
Total         11    4.9649
              Coefficients    Standard Error    t Stat     P-value
Intercept     3.3142          0.1034            32.0632    0.0000
Coded Year    0.1794          0.0159            11.2706    0.0000
Log(predicted solar power generated) = \( \log_{10}\hat{Y} = 3.3142 + 0.1794X \), where X = years relative to 2010. t = 11.2706, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.9270. 92.70% of the variation in the log of solar power generated is explained by year.
Fitted Line Plot of log(Solar Power Generated (gigawatts)) vs Coded Year
(e)
Linear:
\( \hat{Y}_{2022} = -14,810.0897 + 10,170.5315(12) = 107,236.29 \) gigawatts
\( \hat{Y}_{2023} = -14,810.0897 + 10,170.5315(13) = 117,406.82 \) gigawatts
Quadratic:
\( \hat{Y}_{2022} = -359.3846 + 1500.1084(12) + 788.2203(12)^2 = 131,145.64 \) gigawatts
\( \hat{Y}_{2023} = -359.3846 + 1500.1084(13) + 788.2203(13)^2 = 152,351.25 \) gigawatts
Exponential:
\( \log_{10}\hat{Y}_{2022} = 3.3142 + 0.1794(12) = 5.4670 \), so \( \hat{Y}_{2022} = 10^{5.4670} = 293,109.51 \) gigawatts
\( \log_{10}\hat{Y}_{2023} = 3.3142 + 0.1794(13) = 5.6464 \), so \( \hat{Y}_{2023} = 10^{5.6464} = 443,030.78 \) gigawatts
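The three trend forecasts can be checked with a few lines of code. This is a minimal sketch, not the software used to produce the printed output: the function names are hypothetical, and the coefficients are the rounded values from the tables, so the exponential forecasts differ slightly from the printed ones (which appear to use unrounded coefficients).

```python
def linear_forecast(b0, b1, x):
    """Linear trend: Y = b0 + b1*X."""
    return b0 + b1 * x

def quadratic_forecast(b0, b1, b2, x):
    """Quadratic trend: Y = b0 + b1*X + b2*X^2."""
    return b0 + b1 * x + b2 * x ** 2

def exponential_forecast(b0, b1, x):
    """Exponential trend fitted as log10(Y) = b0 + b1*X, back-transformed."""
    return 10 ** (b0 + b1 * x)

for year in (2022, 2023):
    x = year - 2010  # coded year: years relative to 2010
    print(year,
          round(linear_forecast(-14810.0897, 10170.5315, x), 2),
          round(quadratic_forecast(-359.3846, 1500.1084, 788.2203, x), 2),
          round(exponential_forecast(3.3142, 0.1794, x), 2))
```

The back-transformation step matters for the exponential model: the regression predicts log10 of the response, so the forecast in original units is 10 raised to the fitted value.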
16.17
(a)
Time Series Plot of Auto Production
(b)
Linear: Units Produced (thousands) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.7781
R Square             0.6055
Adjusted R Square    0.5867
Standard Error       702.1920
Observations         23
ANOVA
              df    SS                MS                F
Regression    1     15892756.8289     15892756.8289     32.2320
Residual      21    10354544.5846     493073.5516
Total         22    26247301.4135
              Coefficients    Standard Error    t Stat     P-value
Intercept     5123.8344       283.5356          18.0712    0.0000
Coded Year    -125.3168       22.0732           -5.6773    0.0000
Predicted Production = \( \hat{Y} = 5,123.8344 - 125.3168X \), where X = years relative to 1999. t = –5.6773, p-value = 0.0000 < 0.05; coded year is significant; r2 = 0.6055. 60.55% of the variation in units produced is explained by year.
(c)
Regression Analysis: Unit Production (thousands) vs Coded Year, Coded Year Sq
Regression Statistics
Multiple R           0.7784
R Square             0.6058
Adjusted R Square    0.5664
Standard Error       719.2189
Observations         23
ANOVA
              df    SS                MS               F
Regression    2     15901785.1296     7950892.5648     15.3707
Residual      20    10345516.2839     517275.8142
Total         22    26247301.4135
                 Coefficients    Standard Error    t Stat     P-value
Intercept        5162.7093       413.4319          12.4874    0.0000
Coded Year       -136.4239       87.0604           -1.5670    0.1328
Coded Year Sq    0.5049          3.8215            0.1321     0.8962
Quadratic model: Predicted Unit Production = \( \hat{Y} = 5162.7093 - 136.4239X + 0.5049X^2 \), where X = years relative to 1999. For the full model: F = 15.3707, p-value = 0.0000, at least one X variable is significant. For Coded Year^2: t = 0.1321, p-value = 0.8962 > 0.05. Coded Year^2 is not significant; r2 = 0.6058. 60.58% of the variation in unit production is explained by year.
Fitted Line Plot of Units Produced (thousands) vs Coded Year, Coded Year Sq
(d)
Exponential: log(Units Produced) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.7461
R Square             0.5567
Adjusted R Square    0.5355
Standard Error       0.0991
Observations         23
ANOVA
              df    SS        MS        F
Regression    1     0.2589    0.2589    26.3676
Residual      21    0.2062    0.0098
Total         22    0.4652
              Coefficients    Standard Error    t Stat     P-value
Intercept     3.7284          0.0400            93.1778    0.0000
Coded Year    -0.0160         0.0031            -5.1349    0.0000
Exponential model: log (predicted units produced) = \( \log_{10}\hat{Y} = 3.7284 - 0.0160X \), where X = years relative to 1999. t = –5.1349, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.5567. 55.67% of the variation in the log of units produced is explained by year.
Fitted Line Plot of log(Units Produced (thousands)) vs Coded Year
(e)
Year    Units Produced    First Difference    Second Difference    Percentage Difference
1999    5,577.749         #N/A                #N/A                 #N/A
2000    5,470.917         -107                #N/A                 -1.92%
2001    4,808.019         -663                -556                 -12.12%
2002    4,957.377         149                 812                  3.11%
2003    4,453.369         -504                -653                 -10.17%
2004    4,165.925         -287                217                  -6.45%
2005    4,265.872         100                 387                  2.40%
2006    4,311.696         46                  -54                  1.07%
2007    3,867.268         -444                -490                 -10.31%
2008    3,731.383         -136                309                  -3.51%
2009    2,196.446         -1,535              -1,399               -41.14%
2010    2,731.759         535                 2,070                24.37%
2011    2,977.711         246                 -289                 9.00%
2012    4,109.013         1,131               885                  37.99%
2013    4,368.835         260                 -871                 6.32%
2014    4,253.098         -116                -376                 -2.65%
2015    4,162.808         -90                 25                   -2.12%
2016    3,916.584         -246                -156                 -5.91%
2017    3,033.216         -883                -637                 -22.55%
2018    2,785.164         -248                635                  -8.18%
2019    2,511.711         -273                -25                  -9.82%
2020    1,924.398         -587                -314                 -23.38%
2021    1,562.717         -362                226                  -18.79%
(f)
A review of the first, second, and percentage differences reveals that no particular model is more appropriate than the others. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.
Linear: \( \hat{Y}_{2022} = 5123.8344 - 125.3168(23) = 2241.5475 \) units (thousands)
Quadratic: \( \hat{Y}_{2022} = 5162.7093 - 136.4239(23) + 0.5049(23)^2 = 2292.03 \) units (thousands)
Exponential: \( \log_{10}\hat{Y}_{2022} = 3.7284 - 0.0160(23) = 3.3605 \), so \( \hat{Y}_{2022} = 10^{3.3605} = 2293.46 \) units (thousands)
16.18
(a)
Time Series Plot of MLB Salaries (in $millions)
(b)
Linear: Salary ($millions) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9643
R Square             0.9298
Adjusted R Square    0.9259
Standard Error       0.2118
Observations         20
ANOVA
              df    SS         MS         F
Regression    1     10.6991    10.6991    238.4980
Residual      18    0.8075     0.0449
Total         19    11.5066
              Coefficients    Standard Error    t Stat     P-value
Intercept     2.2680          0.0913            24.8478    0.0000
Coded Year    0.1268          0.0082            15.4434    0.0000
Predicted MLB Salary = \( \hat{Y} = 2.2680 + 0.1268X \), where X = years relative to 2003. t = 15.4434, p-value = 0.0000 < 0.05; coded year is significant; r2 = 0.9298. 92.98% of the variation in MLB salaries is explained by year.
Fitted Line Plot of MLB Salaries ($millions) vs Coded Year
(c)
Regression Analysis: MLB Salary ($mil) vs Coded Year, Coded Year Sq
Regression Statistics
Multiple R           0.9658
R Square             0.9328
Adjusted R Square    0.9249
Standard Error       0.2133
Observations         20
ANOVA
              df    SS         MS        F
Regression    2     10.7333    5.3666    117.9703
Residual      17    0.7734     0.0455
Total         19    11.5066
                 Coefficients    Standard Error    t Stat     P-value
Intercept        2.1885          0.1299            16.8511    0.0000
Coded Year       0.1533          0.0317            4.8396     0.0002
Coded Year Sq    -0.0014         0.0016            -0.8662    0.3984
Quadratic model: Predicted MLB Salary = \( \hat{Y} = 2.1885 + 0.1533X - 0.0014X^2 \), where X = years relative to 2003. For the full model: F = 117.9703, p-value = 0.0000, at least one X variable is significant. For Coded Year^2: t = –0.8662, p-value = 0.3984 > 0.05. Coded Year^2 is not significant; r2 = 0.9328. 93.28% of the variation in MLB salaries is explained by year.
Fitted Line Plot of MLB Salary ($mil) vs Coded Year, Coded Year Sq
(d)
log(MLB Salary ($millions)) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9681
R Square             0.9371
Adjusted R Square    0.9336
Standard Error       0.0257
Observations         20
ANOVA
              df    SS        MS        F
Regression    1     0.1778    0.1778    268.2773
Residual      18    0.0119    0.0007
Total         19    0.1897
              Coefficients    Standard Error    t Stat     P-value
Intercept     0.3746          0.0111            33.7678    0.0000
Coded Year    0.0164          0.0010            16.3792    0.0000
Exponential model: log (predicted Salary) = \( \log_{10}\hat{Y} = 0.3746 + 0.0164X \), where X = years relative to 2003. t = 16.3792, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.9371. 93.71% of the variation in the log of MLB salaries is explained by year.
Fitted Line Plot of log(MLB Salary ($millions)) vs Coded Year
(e)
Year    Salary    First Difference    Second Difference    Percentage Difference
2003    2.37      #N/A                #N/A                 #N/A
2004    2.31      -0.06               #N/A                 -2.53%
2005    2.46      0.15                0.21                 6.49%
2006    2.70      0.24                0.09                 9.76%
2007    2.82      0.12                -0.12                4.44%
2008    2.93      0.11                -0.01                3.90%
2009    3.00      0.07                -0.04                2.39%
2010    3.01      0.01                -0.06                0.33%
2011    3.10      0.09                0.08                 2.99%
2012    3.21      0.11                0.02                 3.55%
2013    3.39      0.18                0.07                 5.61%
2014    3.69      0.30                0.12                 8.85%
2015    3.84      0.15                -0.15                4.07%
2016    4.38      0.54                0.39                 14.06%
2017    4.45      0.07                -0.47                1.60%
2018    4.41      -0.04               -0.11                -0.90%
2019    4.38      -0.03               0.01                 -0.68%
2020    4.43      0.05                0.08                 1.14%
2021    4.17      -0.26               -0.31                -5.87%
2022    4.41      0.24                0.50                 5.76%
The first and second differences are relatively consistent across the series; the percentage differences are not. Because the r2 values of the linear and exponential models are similar and the linear model is simpler, the principle of parsimony favors the linear model.
(f)
Linear forecast: \( \hat{Y}_{2023} = 2.2680 + 0.1268(20) = 4.8048 \) ($ millions)
16.19
(a)
Time Series Plot of Silver (US$/ounce)
(b)
Linear: Silver Price (US$/ounce) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.7073
R Square             0.5003
Adjusted R Square    0.4765
Standard Error       6.0311
Observations         23
ANOVA
              df    SS           MS          F
Regression    1     764.6610     764.6610    21.0218
Residual      21    763.8697     36.3747
Total         22    1528.5307
              Coefficients    Standard Error    t Stat    P-value
Intercept     5.7811          2.4353            2.3739    0.0272
Coded Year    0.8692          0.1896            4.5849    0.0002
Predicted Silver Price = \( \hat{Y} = 5.7811 + 0.8692X \), where X = years relative to 1999. t = 4.5849, p-value = 0.0002 < 0.05; coded year is significant; r2 = 0.5003. 50.03% of the variation in Silver Price is explained by year.
Fitted Line Plot of Silver Price (US$/ounce) vs Coded Year
(c)
Regression Analysis: Silver Price (US$/ounce) vs Coded Year, Coded Year Sq
Regression Statistics
Multiple R           0.7657
R Square             0.5864
Adjusted R Square    0.5450
Standard Error       5.6225
Observations         23
ANOVA
              df    SS           MS          F
Regression    2     896.2734     448.1367    14.1758
Residual      20    632.2573     31.6129
Total         22    1528.5307
                Coefficients    Standard Error    t Stat     P-value
Intercept       1.0874          3.2320            0.3364     0.7400
Coded Year      2.2103          0.6806            3.2476     0.0040
Coded Year^2    -0.0610         0.0299            -2.0404    0.0547
Quadratic model: Predicted Silver Price = \( \hat{Y} = 1.0874 + 2.2103X - 0.0610X^2 \), where X = years relative to 1999. For the full model: F = 14.1758, p-value = 0.0000, at least one X variable is significant. For Coded Year^2: t = –2.0404, p-value = 0.0547 > 0.05. Coded Year^2 is not significant; r2 = 0.5864. 58.64% of the variation in Silver Price is explained by year.
Fitted Line Plot of Silver Price (US$/ounce) vs Coded Year, Coded Year Sq: Price = 1.0874 + 2.2103X - 0.0610X^2
(d)
log(Silver Price (US$/ounce)) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.8048
R Square             0.6477
Adjusted R Square    0.6309
Standard Error       0.1667
Observations         23
ANOVA
              df    SS        MS        F
Regression    1     1.0733    1.0733    38.6085
Residual      21    0.5838    0.0278
Total         22    1.6570
              Coefficients    Standard Error    t Stat     P-value
Intercept     0.7540          0.0673            11.1991    0.0000
Coded Year    0.0326          0.0052            6.2136     0.0000
Exponential model: log (predicted silver price) = \( \log_{10}\hat{Y} = 0.7540 + 0.0326X \), where X = years relative to 1999. t = 6.2136, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.6477. 64.77% of the variation in the log of Silver Price is explained by year.
Fitted Line Plot of log(Silver Price(US$/ounce)) vs Coded Year
(e)
Year    Price     First Difference    Second Difference    Percentage Difference
1999    5.330     #N/A                #N/A                 #N/A
2000    4.570     -1                  #N/A                 -14.26%
2001    4.520     0                   1                    -1.09%
2002    4.670     0                   0                    3.32%
2003    5.965     1                   1                    27.73%
2004    6.815     1                   0                    14.25%
2005    8.830     2                   1                    29.57%
2006    12.900    4                   2                    46.09%
2007    14.760    2                   -2                   14.42%
2008    10.790    -4                  -6                   -26.90%
2009    16.990    6                   10                   57.46%
2010    30.630    14                  7                    80.28%
2011    28.180    -2                  -16                  -8.00%
2012    29.950    2                   4                    6.28%
2013    19.500    -10                 -12                  -34.89%
2014    15.970    -4                  7                    -18.10%
2015    13.820    -2                  1                    -13.46%
2016    15.990    2                   4                    15.70%
2017    16.865    1                   -1                   5.47%
2018    15.490    -1                  -2                   -8.15%
2019    20.770    5                   7                    34.09%
2020    26.490    6                   0                    27.54%
2021    23.090    -3                  -9                   -12.84%
(f)
Neither the first differences, second differences, nor percentage differences are constant across years. Based on the principle of parsimony, one might choose the linear model. However, none of the models fits the data well because of an irregular component that occurred during the time series. Other models covered in Chapter 16 may be more appropriate.
Linear: \( \hat{Y}_{2022} = 5.7811 + 0.8692(23) = 25.77 \) (US$/ounce)
16.20
(a)
Time Series Plot of CPI-U (consumer price index for all urban consumers)
(b)
There has been an upward trend in the CPI-U in the United States from 1965 through 2021.
(c)
Linear: CPI-U (consumer price index for all urban consumers) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9979
R Square             0.9957
Adjusted R Square    0.9956
Standard Error       4.9277
Observations         57
ANOVA
              df    SS             MS             F
Regression    1     309941.5232    309941.5232    12763.9405
Residual      55    1335.5424      24.2826
Total         56    311277.0656
              Coefficients    Standard Error    t Stat      P-value
Intercept     16.3914         1.2884            12.7223     0.0000
Coded Year    4.4821          0.0397            112.9776    0.0000
Linear: Predicted CPI-U = \( \hat{Y} = 16.3914 + 4.4821X \), where X = years relative to 1965. t = 112.9776, p-value = 0.0000 < 0.05; coded year is significant; r2 = 0.9957. 99.57% of the variation in CPI-U is explained by year.
Fitted Line Plot of CPI-U vs Coded Year
(d)
Regression Analysis: CPI-U vs Coded Year, Coded Year Sq
Regression Statistics
Multiple R           0.9979
R Square             0.9959
Adjusted R Square    0.9957
Standard Error       4.8759
Observations         57
ANOVA
              df    SS             MS             F
Regression    2     309993.2230    154996.6115    6519.3482
Residual      54    1283.8426      23.7749
Total         56    311277.0656
                 Coefficients    Standard Error    t Stat     P-value
Intercept        18.4118         1.8715            9.8382     0.0000
Coded Year       4.2617          0.1545            27.5785    0.0000
Coded Year Sq    0.0039          0.0027            1.4746     0.1461
Quadratic model: Predicted CPI-U = \( \hat{Y} = 18.4118 + 4.2617X + 0.0039X^2 \), where X = years relative to 1965. For the full model: F = 6519.3482, p-value = 0.0000, at least one X variable is significant. For Coded Year^2: t = 1.4746, p-value = 0.1461 > 0.05. Coded Year^2 is not significant; r2 = 0.9959. 99.59% of the variation in CPI-U is explained by year.
Fitted Line Plot of CPI-U vs Coded Year, Coded Year Sq
(e)
log(CPI-U) vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.9664
R Square             0.9338
Adjusted R Square    0.9326
Standard Error       0.0751
Observations         57
ANOVA
              df    SS        MS        F
Regression    1     4.3820    4.3820    776.3896
Residual      55    0.3104    0.0056
Total         56    4.6924
              Coefficients    Standard Error    t Stat     P-value
Intercept     1.6004          0.0196            81.4767    0.0000
Coded Year    0.0169          0.0006            27.8638    0.0000
Exponential model: log (CPI-U) = \( \log_{10}\hat{Y} = 1.6004 + 0.0169X \), where X = years relative to 1965. t = 27.8638, p-value = 0.0000 < 0.05; Coded Year is significant; r2 = 0.9338. 93.38% of the variation in the log of CPI-U is explained by year.
Fitted Line of log(CPI-U) vs Coded Year
(f)
Because the quadratic term was not significant and the r2 value is lower for the exponential model, the linear model is chosen. A review of the first, second, and percentage differences revealed similar levels of variation across the time series, so the principle of parsimony also favors the linear model.
(g)
Linear forecast:
\( \hat{Y}_{2022} = 16.3914 + 4.4821(57) = 271.8732 \)
\( \hat{Y}_{2023} = 16.3914 + 4.4821(58) = 276.3553 \)
16.21
(a)
Time Series I:
Year    Series I    First Difference    Second Difference    Percentage Difference
2010    10.0        #N/A                #N/A                 #N/A
2011    15.1        5.1                 #N/A                 51.00%
2012    24.0        8.9                 3.8                  58.94%
2013    36.7        12.7                3.8                  52.92%
2014    53.8        17.1                4.4                  46.59%
2015    74.8        21.0                3.9                  39.03%
2016    100.0       25.2                4.2                  33.69%
2017    129.2       29.2                4.0                  29.20%
2018    162.4       33.2                4.0                  25.70%
2019    199.0       36.6                3.4                  22.54%
2020    239.3       40.3                3.7                  20.25%
2021    283.5       44.2                3.9                  18.47%
For Time Series I, second differences are the most stable with values staying near 4. The quadratic model appears to be the most appropriate model.
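The difference columns used for model selection are simple to compute. The sketch below is illustrative (the function name `differences` is hypothetical) and uses the Series I values: roughly constant first differences point to a linear trend, roughly constant second differences to a quadratic trend, and roughly constant percentage differences to an exponential trend.

```python
def differences(values):
    """Return first, second, and percentage differences (None where undefined)."""
    first = [None] + [b - a for a, b in zip(values, values[1:])]
    second = [None, None] + [b - a for a, b in zip(first[1:], first[2:])]
    percent = [None] + [100 * (b - a) / a for a, b in zip(values, values[1:])]
    return first, second, percent

series_i = [10.0, 15.1, 24.0, 36.7, 53.8, 74.8,
            100.0, 129.2, 162.4, 199.0, 239.3, 283.5]
first, second, percent = differences(series_i)

for year, y, d1, d2, pct in zip(range(2010, 2022), series_i, first, second, percent):
    d1_txt = "#N/A" if d1 is None else f"{d1:.1f}"
    d2_txt = "#N/A" if d2 is None else f"{d2:.1f}"
    pct_txt = "#N/A" if pct is None else f"{pct:.2f}%"
    print(year, y, d1_txt, d2_txt, pct_txt)
```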
Time Series II:
Year    Series II    First Difference    Second Difference    Percentage Difference
2010    30.0         #N/A                #N/A                 #N/A
2011    33.1         3.1                 #N/A                 10.33%
2012    36.4         3.3                 0.2                  9.97%
2013    39.9         3.5                 0.2                  9.62%
2014    43.9         4.0                 0.5                  10.03%
2015    48.2         4.3                 0.3                  9.79%
2016    53.2         5.0                 0.7                  10.37%
2017    58.2         5.0                 0.0                  9.40%
2018    64.5         6.3                 1.3                  10.82%
2019    70.7         6.2                 -0.1                 9.61%
2020    77.1         6.4                 0.2                  9.05%
2021    83.9         6.8                 0.4                  8.82%
For Time Series II, the second differences and percentage differences both appear to be stable. The quadratic model and the exponential model appear to be appropriate models.
Time Series III:
Year    Series III    First Difference    Second Difference    Percentage Difference
2010    60.0          #N/A                #N/A                 #N/A
2011    67.9          7.9                 #N/A                 13.17%
2012    76.1          8.2                 0.3                  12.08%
2013    84.0          7.9                 -0.3                 10.38%
2014    92.2          8.2                 0.3                  9.76%
2015    100.0         7.8                 -0.4                 8.46%
2016    108.0         8.0                 0.2                  8.00%
2017    115.8         7.8                 -0.2                 7.22%
2018    124.1         8.3                 0.5                  7.17%
2019    132.0         7.9                 -0.4                 6.37%
2020    140.0         8.0                 0.1                  6.06%
2021    147.8         7.8                 -0.2                 5.57%
For Time Series III, the first differences are slightly more stable than the second differences. The linear model appears to be the most appropriate model.
(b)
Time Series I: Series I vs Coded Year, Coded Year Sq
Regression Analysis
Regression Statistics
Multiple R           1.0000
R Square             1.0000
Adjusted R Square    1.0000
Standard Error       0.3931
Observations         12
ANOVA
              df    SS            MS            F
Regression    2     94122.0589    47061.0295    304477.5537
Residual      9     1.3911        0.1546
Total         11    94123.4500
                Coefficients    Standard Error    t Stat      P-value
Intercept       9.7560          0.2907            33.5618     0.0000
Coded Year      3.1876          0.1229            25.9456     0.0000
Coded Year^2    1.9770          0.0108            183.7105    0.0000
\( \hat{Y} = 9.756 + 3.1876X + 1.9770X^2 \), where X = years relative to 2010.
Time Series II: Series II vs Coded Year, Coded Year Sq
Regression Analysis
Regression Statistics
Multiple R           0.9999
R Square             0.9999
Adjusted R Square    0.9999
Standard Error       0.2049
Observations         12
ANOVA
              df    SS           MS           F
Regression    2     3485.0912    1742.5456    41495.8162
Residual      9     0.3779       0.0420
Total         11    3485.4692
                Coefficients    Standard Error    t Stat      P-value
Intercept       30.1920         0.1515            199.2630    0.0000
Coded Year      2.5818          0.0640            40.3180     0.0000
Coded Year^2    0.2103          0.0056            37.4855     0.0000
\( \hat{Y} = 30.192 + 2.5818X + 0.2103X^2 \), where X = years relative to 2010.
Time Series III: Series III vs Coded Year
Simple Linear Regression Analysis
Regression Statistics
Multiple R           1.0000
R Square             1.0000
Adjusted R Square    1.0000
Standard Error       0.1168
Observations         12
ANOVA
              df    SS           MS           F
Regression    1     9130.4127    9130.4127    669277.5852
Residual      10    0.1364       0.0136
Total         11    9130.5492
              Coefficients    Standard Error    t Stat      P-value
Intercept     60.0436         0.0634            946.6904    0.0000
Coded Year    7.9906          0.0098            818.0939    0.0000
\( \hat{Y} = 60.0436 + 7.99056X \), where X = years relative to 2010.
(c)
Forecasts, where X = 12 for year 2022 in all models:
Time Series I: \( \hat{Y}_{2022} = 9.756 + 3.1876(12) + 1.9770(12^2) = 332.691 \)
Time Series II: \( \hat{Y}_{2022} = 30.192 + 2.5818(12) + 0.21026(12^2) = 91.4523 \)
Time Series III: \( \hat{Y}_{2022} = 60.0436 + 7.99056(12) = 155.930 \)
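The quadratic trend coefficients in (b) come from ordinary least squares on X and X^2. The sketch below is a plain-Python illustration that solves the normal equations directly; the function name `fit_polynomial` is hypothetical, and it is not the procedure of any particular statistics package. Applied to the Series I values, it should give coefficients close to the printed 9.7560, 3.1876, and 1.9770, assuming the tabulated data are the values used in the regression.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations (X'X)b = X'y.

    Returns coefficients [b0, b1, ..., b_degree].
    """
    n_coef = degree + 1
    # Augmented normal-equation matrix [X'X | X'y].
    rows = []
    for i in range(n_coef):
        row = [sum(x ** (i + j) for x in xs) for j in range(n_coef)]
        row.append(sum((x ** i) * y for x, y in zip(xs, ys)))
        rows.append(row)
    # Gaussian elimination with partial pivoting.
    for col in range(n_coef):
        pivot = max(range(col, n_coef), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n_coef):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n_coef + 1):
                rows[r][c] -= factor * rows[col][c]
    # Back substitution.
    coefs = [0.0] * n_coef
    for i in reversed(range(n_coef)):
        tail = sum(rows[i][j] * coefs[j] for j in range(i + 1, n_coef))
        coefs[i] = (rows[i][n_coef] - tail) / rows[i][i]
    return coefs

coded_year = list(range(12))  # 2010 through 2021, coded 0 through 11
series_i = [10.0, 15.1, 24.0, 36.7, 53.8, 74.8,
            100.0, 129.2, 162.4, 199.0, 239.3, 283.5]
b0, b1, b2 = fit_polynomial(coded_year, series_i, degree=2)
print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, b2 = {b2:.4f}")
print(f"Forecast for X = 12: {b0 + b1 * 12 + b2 * 12 ** 2:.3f}")
```

Solving the normal equations directly is adequate for a low-degree polynomial over a short coded-year range; for higher degrees or larger X values, a QR-based solver is numerically safer.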
16.22
(a)
Time Series I: Data Y over Time X
Time Series I: Data log (Y) over Time X
For Time Series I, the first differences are slightly more stable than second differences. The linear model appears to be the most appropriate model.
Time Series II: Data Y over Time X
Time Series II: Data log (Y) over Time X
For Time Series II, the graph of log (Y) versus X appears to be more linear than the graph of Y versus X, so an exponential model appears to be more appropriate.
(b)
Time Series I: \( \hat{Y} = 100.0731 + 14.9776X \), where X = years relative to 2010
Simple Linear Regression Analysis
              Coefficients    Standard Error    t Stat       P-value
Intercept     100.0731        0.0539            1857.8969    0.0000
Coded Year    14.9776         0.0083            1805.6429    0.0000
Time Series II: \( \hat{Y} = 10^{1.9982 + 0.0609X} \), where X = years relative to 2010
Simple Linear Regression Analysis
              Coefficients    Standard Error    t Stat       P-value
Intercept     1.9982          0.0010            2003.1699    0.0000
Coded Year    0.0609          0.0002            396.3726     0.0000
(c)
X = 12 for year 2022 in all models. Forecasts for the year 2022:
Time Series I: \( \hat{Y} = 100.0731 + 14.9776(12) = 279.8045 \)
Time Series II: \( \hat{Y} = 10^{1.9982 + 0.0609(12)} = 535.6886 \)
16.23
(a) Four. Comparisons cannot be made for the first four observations.
(b) Five. One for each of the four variables and one for the intercept.
(c) The final four observations are needed to generate forecasts.
(d) \( \hat{Y}_i = a_0 + a_1Y_{i-1} + a_2Y_{i-2} + a_3Y_{i-3} + a_4Y_{i-4} \)
(e) \( \hat{Y}_{n+j} = a_0 + a_1\hat{Y}_{n+j-1} + a_2\hat{Y}_{n+j-2} + a_3\hat{Y}_{n+j-3} + a_4\hat{Y}_{n+j-4} \)
16.24
\( t_{STAT} = \frac{a_3}{S_{a_3}} = \frac{0.24}{0.10} = 2.4 \) is greater than the critical bound of 2.2281. Reject H0. There is sufficient evidence that the third-order regression parameter is significantly different from zero. A third-order autoregressive model is appropriate.
16.25
\( \hat{Y}_{18} = a_0 + a_1Y_{17} + a_2Y_{16} + a_3Y_{15} = 4.50 + (1.80)(36) + (0.80)(31) + (0.24)(25) = 100.1 \)
\( \hat{Y}_{19} = a_0 + a_1\hat{Y}_{18} + a_2Y_{17} + a_3Y_{16} = 4.50 + (1.80)(100.1) + (0.80)(36) + (0.24)(31) = 220.92 \)
16.26
(a)
\( t_{STAT} = \frac{a_3}{S_{a_3}} = \frac{0.24}{0.15} = 1.6 \) is less than the critical bound of 2.2281. Do not reject H0. There is not sufficient evidence that the third-order regression parameter is significantly different from zero. A third-order autoregressive model is not appropriate.
(b)
Fit a second-order autoregressive model and test to see if it is appropriate.
16.27
(a)
Regression Analysis: Total Sales (thousands) vs Lag1, Lag2, Lag3
Rows unused: 3
Regression Statistics
Multiple R           0.9691
R Square             0.9392
Adjusted R Square    0.9313
Standard Error       70.6486
Observations         27
ANOVA
              df    SS              MS             F
Regression    3     1773802.0158    591267.3386    118.4613
Residual      23    114798.2805     4991.2296
Total         26    1888600.2963
              Coefficients    Standard Error    t Stat     P-value
Intercept     88.4888         44.6247           1.9830     0.0594
Lag1          1.7458          0.2044            8.5391     0.0000
Lag2          -1.0162         0.3571            -2.8454    0.0092
Lag3          0.1469          0.2043            0.7187     0.4796
For the third-order term, tSTAT = 0.7187 with a p-value of 0.4796. The third-order term can be dropped because it is not significant at the 0.05 significance level.
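The test on the highest-order term divides the coefficient by its standard error and compares the p-value with the significance level. The sketch below is illustrative only: the helper names are hypothetical, and the inputs are the rounded Lag3 values from the table above, so the computed statistic agrees with the printed 0.7187 only up to rounding.

```python
def t_stat(coefficient, standard_error):
    """t statistic for H0: the autoregressive parameter equals zero."""
    return coefficient / standard_error

def is_significant(p_value, alpha=0.05):
    """True when the term should be retained at significance level alpha."""
    return p_value < alpha

t3 = t_stat(0.1469, 0.2043)        # Lag3 coefficient and standard error
keep_lag3 = is_significant(0.4796)  # Lag3 p-value from the output
print(f"tSTAT = {t3:.4f}, retain third-order term: {keep_lag3}")
```

When the highest-order term is not retained, the procedure refits the model with one fewer lag and repeats the test, as in part (b).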
(b)
Regression Analysis: Total Sales (thousands) vs Lag1, Lag2
Rows unused: 2
Regression Statistics
Multiple R           0.9679
R Square             0.9368
Adjusted R Square    0.9317
Standard Error       69.1224
Observations         28
ANOVA
              df    SS              MS             F
Regression    2     1770518.2193    885259.1096    185.2821
Residual      25    119447.4950     4777.8998
Total         27    1889965.7143
              Coefficients    Standard Error    t Stat     P-value
Intercept     100.5300        38.4250           2.6163     0.0149
Lag1          1.6253          0.1272            12.7771    0.0000
Lag2          -0.7682         0.1270            -6.0483    0.0000
For the second-order term, tSTAT = –6.0483 with a p-value of 0.0000. The second-order term is significant at the 0.05 significance level and should be retained.
(c)
A first-order autoregression is not necessary.
(d)
\( \hat{Y}_{2022} = 100.5300 + 1.6253Y_{2021} - 0.7682Y_{2020} = 100.5300 + 1.6253(690) - 0.7682(650) = 722.6907 \) thousand houses sold
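An autoregressive forecast plugs the most recent observations into the fitted equation, and forecasts further ahead reuse earlier forecasts in place of observations that do not yet exist. The sketch below is a minimal illustration with a hypothetical function name, `ar_forecast`. With the rounded coefficients from the table it gives about 722.66, slightly below the printed 722.6907, which appears to use unrounded coefficients.

```python
def ar_forecast(intercept, coefs, history, steps=1):
    """Forecast `steps` periods ahead from a fitted autoregressive model.

    coefs[k] multiplies the value k+1 periods back; history is ordered
    oldest to newest and must hold at least len(coefs) values.
    """
    values = list(history)
    forecasts = []
    for _ in range(steps):
        lags = values[::-1][:len(coefs)]  # most recent value first
        y_hat = intercept + sum(a * y for a, y in zip(coefs, lags))
        forecasts.append(y_hat)
        values.append(y_hat)              # later steps reuse this forecast
    return forecasts

# Second-order model for house sales: Y2020 = 650, Y2021 = 690.
forecast_2022 = ar_forecast(100.5300, [1.6253, -0.7682], [650, 690])[0]
print(f"Forecast for 2022: {forecast_2022:.4f} thousand houses")
```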
16.28
(a)
Regression Analysis: Bonus ($ thousands) vs Lag1, Lag2, Lag3
Rows unused: 3
Regression Statistics
Multiple R           0.6335
R Square             0.4014
Adjusted R Square    0.2816
Standard Error       33.9143
Observations         19
ANOVA
              df    SS            MS           F
Regression    3     11566.8807    3855.6269    3.3522
Residual      15    17252.7361    1150.1824
Total         18    28819.6168
              Coefficients    Standard Error    t Stat     P-value
Intercept     50.4300         37.7256           1.3368     0.2012
Lag1          0.6323          0.2727            2.3182     0.0350
Lag2          -0.0376         0.3163            -0.1190    0.9069
Lag3          0.1436          0.2740            0.5239     0.6080
For the third-order term, tSTAT = 0.5239 with a p-value of 0.6080. The third-order term can be dropped because it is not significant at the 0.05 significance level.
(b)
Regression Analysis: Bonus ($ thousands) vs Lag1, Lag2
Rows unused: 2
Regression Statistics
Multiple R           0.6951
R Square             0.4832
Adjusted R Square    0.4224
Standard Error       33.8617
Observations         20
ANOVA
              df    SS            MS           F
Regression    2     18224.0096    9112.0048    7.9469
Residual      17    19492.4959    1146.6174
Total         19    37716.5055
              Coefficients    Standard Error    t Stat    P-value
Intercept     41.6635         31.2113           1.3349    0.1995
Lag1          0.7421          0.2548            2.9125    0.0097
Lag2          0.0329          0.2660            0.1236    0.9031
For the second-order term, tSTAT = 0.1236 with a p-value of 0.9031. The second-order term can be dropped because it is not significant at the 0.05 significance level.
(c)
Regression Analysis: Bonus ($ thousands) vs Lag1
Rows unused: 1
Simple Linear Regression Analysis
Regression Statistics
Multiple R           0.7137
R Square             0.5094
Adjusted R Square    0.4836
Standard Error       33.5612
Observations         21
ANOVA
              df    SS            MS            F
Regression    1     22219.7257    22219.7257    19.7271
Residual      19    21400.7800    1126.3568
Total         20    43620.5057
              Coefficients    Standard Error    t Stat    P-value
Intercept     32.9804         27.1474           1.2149    0.2393
Lag1          0.8199          0.1846            4.4415    0.0003
For the first-order term, tSTAT = 4.4415 with a p-value of 0.0003. The first-order term should be retained because it is significant at the 0.05 significance level.
(d)
\( \hat{Y}_{2022} = 32.9804 + 0.8199Y_{2021} = 32.9804 + 0.8199(257.5) = 244.1040 \) ($ thousands)
16.29
(a)
Regression Analysis: Units Produced (thousands) vs Lag1, Lag2, Lag3
Rows unused: 3
Regression Statistics
Multiple R           0.8477
R Square             0.7187
Adjusted R Square    0.6659
Standard Error       558.1040
Observations         20
ANOVA
              df    SS               MS              F
Regression    3     12731566.1591    4243855.3864    13.6248
Residual      16    4983681.9630     311480.1227
Total         19    17715248.1221
              Coefficients    Standard Error    t Stat     P-value
Intercept     555.6187        629.1767          0.8831     0.3903
Lag1          0.9722          0.2446            3.9753     0.0011
Lag2          0.1195          0.3388            0.3526     0.7290
Lag3          -0.2685         0.2470            -1.0871    0.2931
For the third-order term, tSTAT = –1.0871 with a p-value of 0.2931. The third-order term can be dropped because it is not significant at the 0.05 significance level.
(b)
Regression Analysis: Units Produced (thousands) vs Lag1, Lag2
Rows unused: 2

Regression Statistics
Multiple R           0.8483
R Square             0.7195
Adjusted R Square    0.6884
Standard Error       548.5111
Observations         21

ANOVA
              df    SS               MS              F
Regression     2    13893654.1967    6946827.0984    23.0896
Residual      18     5415559.2312     300864.4017
Total         20    19309213.4279

              Coefficients   Standard Error   t Stat     P-value
Intercept     419.7450       539.4739          0.7781    0.4466
Lag1            0.9925         0.2355          4.2146    0.0005
Lag2           -0.1467         0.2403         -0.6107    0.5491
For the second-order term, tSTAT = –0.6107 with a p-value of 0.5491. The second-order term can be dropped because it is not significant at the 0.05 significance level.
(c)
Regression Analysis: Units Produced (thousands) vs Lag1 (Simple Linear Regression Analysis)
Rows unused: 1

Regression Statistics
Multiple R           0.8680
R Square             0.7534
Adjusted R Square    0.7411
Standard Error       529.4649
Observations         22

ANOVA
              df    SS               MS               F
Regression     1    17130328.9577    17130328.9577    61.1071
Residual      20     5606661.7666      280333.0883
Total         21    22736990.7243

              Coefficients   Standard Error   t Stat    P-value
Intercept     211.5935       455.6051         0.4644    0.6474
Lag1            0.8975         0.1148         7.8171    0.0000
For the first-order term, tSTAT = 7.8171 with a p-value of 0.0000. The first-order term should be retained because it is significant at the 0.05 significance level.
(d)
The most appropriate forecasting model is the first-order autoregressive model:
\( \hat{Y}_{2022} = 211.5935 + 0.8975\,Y_{2021} = 211.5935 + 0.8975(1562.717) = 1614.1208 \) units produced (thousands)
16.30
(a)
Regression Analysis: MLB Salary ($ millions) vs Lag1, Lag2, Lag3
Rows unused: 3

Regression Statistics
Multiple R           0.9722
R Square             0.9451
Adjusted R Square    0.9324
Standard Error       0.1754
Observations         17

ANOVA
              df    SS        MS        F
Regression     3    6.8791    2.2930    74.5682
Residual      13    0.3998    0.0308
Total         16    7.2788

              Coefficients   Standard Error   t Stat     P-value
Intercept      0.3679        0.2333            1.5769    0.1388
Lag1           0.9285        0.2936            3.1628    0.0075
Lag2           0.2200        0.4707            0.4675    0.6479
Lag3          -0.2278        0.3194           -0.7134    0.4882
For the third-order term, tSTAT = –0.7134 with a p-value of 0.4882. The third-order term can be dropped because it is not significant at the 0.05 significance level.
(b)
Regression Analysis: MLB Salary ($ millions) vs Lag1, Lag2
Rows unused: 2

Regression Statistics
Multiple R           0.9756
R Square             0.9518
Adjusted R Square    0.9454
Standard Error       0.1667
Observations         18

ANOVA
              df    SS        MS        F
Regression     2    8.2355    4.1178    148.2401
Residual      15    0.4167    0.0278
Total         17    8.6522

              Coefficients   Standard Error   t Stat     P-value
Intercept      0.3331        0.1955            1.7042    0.1090
Lag1           1.0074        0.2508            4.0169    0.0011
Lag2          -0.0716        0.2434           -0.2943    0.7725
For the second-order term, tSTAT = –0.2943 with a p-value of 0.7725. The second-order term can be dropped because it is not significant at the 0.05 significance level.
(c)
Regression Analysis: MLB Salary ($ millions) vs Lag1 (Simple Linear Regression Analysis)
Rows unused: 1

Regression Statistics
Multiple R           0.9767
R Square             0.9539
Adjusted R Square    0.9512
Standard Error       0.1665
Observations         19

ANOVA
              df    SS         MS        F
Regression     1     9.7549    9.7549    351.9942
Residual      17     0.4711    0.0277
Total         18    10.2260

              Coefficients   Standard Error   t Stat     P-value
Intercept     0.2440         0.1793            1.3605    0.1914
Lag1          0.9601         0.0512           18.7615    0.0000

For the first-order term, tSTAT = 18.7615 with a p-value of 0.0000. The first-order term should be retained because it is significant at the 0.05 significance level.
(d)
\( \hat{Y}_{2023} = 0.2440 + 0.9601\,Y_{2022} = 0.2440 + 0.9601(4.41) = 4.4780 \) ($ millions)
16.31
(a)
Regression Analysis: Solar Power Generated (gigawatts) vs Lag1, Lag2, Lag3
Rows unused: 3

Regression Statistics
Multiple R           0.9949
R Square             0.9899
Adjusted R Square    0.9838
Standard Error       4436.3453
Observations         9

ANOVA
              df    SS                 MS                 F
Regression     3    9634507851.3840    3211502617.1280    163.1765
Residual       5      98405796.6160      19681159.3232
Total          8    9732913648.0000

              Coefficients   Standard Error   t Stat     P-value
Intercept     8785.1072      3652.7954         2.4050    0.0612
Lag1             1.1944         0.3776         3.1635    0.0250
Lag2            -0.7640         0.5581        -1.3689    0.2293
Lag3             0.8203         0.4051         2.0249    0.0987
For the third-order term, tSTAT = 2.0249 with a p-value of 0.0987. The third-order term can be dropped because it is not significant at the 0.05 significance level.
(b)
Regression Analysis: Solar Power Generated (gigawatts) vs Lag1, Lag2
Rows unused: 2

Regression Statistics
Multiple R           0.9921
R Square             0.9844
Adjusted R Square    0.9799
Standard Error       5169.1234
Observations         10

ANOVA
              df    SS                  MS                 F
Regression     2    11768299798.4597    5884149899.2299    220.2165
Residual       7      187038857.9403      26719836.8486
Total          9    11955338656.4000

              Coefficients   Standard Error   t Stat     P-value
Intercept     4453.9291      3147.9158         1.4149    0.2000
Lag1             1.2742         0.4055         3.1425    0.0163
Lag2            -0.1216         0.4658        -0.2611    0.8015
For the second-order term, tSTAT = –0.2611 with a p-value of 0.8015. The second-order term can be dropped because it is not significant at the 0.05 significance level.
(c)
Regression Analysis: Solar Power Generated (gigawatts) vs Lag1 (Simple Linear Regression Analysis)
Rows unused: 1

Regression Statistics
Multiple R           0.9926
R Square             0.9853
Adjusted R Square    0.9837
Standard Error       4771.7587
Observations         11

ANOVA
              df    SS                  MS                  F
Regression     1    13778502076.9679    13778502076.9679    605.1249
Residual       9      204927133.7594       22769681.5288
Total         10    13983429210.7273

              Coefficients   Standard Error   t Stat     P-value
Intercept     3959.9648      2195.5439         1.8036    0.1048
Lag1             1.1845         0.0482        24.5993    0.0000
For the first-order term, tSTAT = 24.5993 with a p-value of 0.0000. The first-order term should be retained because it is significant at the 0.05 significance level.
(d)
\( \hat{Y}_{2022} = 3{,}959.9648 + 1.1845\,Y_{2021} = 3{,}959.9648 + 1.1845(114{,}678) = 139{,}798.3124 \) gigawatts
16.32
(a)
\( S_{YX} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - p - 1}} = \sqrt{\dfrac{45}{12 - 1 - 1}} = 2.121 \). The standard error of the estimate is 2.121.
(b)
\( MAD = \dfrac{\sum_{i=1}^{n} |Y_i - \hat{Y}_i|}{n} = \dfrac{18}{12} = 1.5 \). The mean absolute deviation is 1.5.
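16.33
(a)
\( S_{YX} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n - p - 1}} = \sqrt{\dfrac{335.24}{12 - 1 - 1}} = 5.790 \). The standard error of the estimate is 5.790.
(b)
\( MAD = \dfrac{\sum_{i=1}^{n} |Y_i - \hat{Y}_i|}{n} = \dfrac{39.2}{12} = 3.267 \). The mean absolute deviation is 3.267.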
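The arithmetic in 16.32 and 16.33 can be checked with a short Python sketch. The function names are illustrative assumptions; the inputs are the sums of squared and absolute residuals quoted in those solutions, with n = 12 observations and p = 1 explanatory variable.

```python
import math


def standard_error_of_estimate(sum_squared_residuals, n, p):
    """S_YX: square root of the sum of squared residuals over (n - p - 1)."""
    return math.sqrt(sum_squared_residuals / (n - p - 1))


def mean_absolute_deviation(sum_absolute_residuals, n):
    """MAD: sum of absolute residuals divided by the number of observations."""
    return sum_absolute_residuals / n


# 16.32: sum of squared residuals 45, sum of absolute residuals 18.
syx_32 = standard_error_of_estimate(45, n=12, p=1)
mad_32 = mean_absolute_deviation(18, n=12)

# 16.33: sum of squared residuals 335.24, sum of absolute residuals 39.2.
syx_33 = standard_error_of_estimate(335.24, n=12, p=1)
mad_33 = mean_absolute_deviation(39.2, n=12)

print(round(syx_32, 3), mad_32, round(syx_33, 3), round(mad_33, 3))
```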
16.34
(a)
Linear Displays large curvilinear pattern. Do not consider this model.
Quadratic Does not show a pattern. Consider this model.
Exponential Displays large curvilinear pattern. Do not consider this model.
First-order Autoregressive Does not show a pattern. Consider this model.
(b–c)
Solar Power   Linear       Quadratic    Exponential   AR-First Order
r2            0.9409       0.9936       0.927         0.9853
SYX           9,641.926    3,340.838    29,413.36     4,771.759
MAD           7,074.251    2,251.934    16,012.19     2,975.726
(d)
The quadratic trend model had the highest r2 of the regression models, the first-order autoregressive model had a significant t statistic, and neither model showed a pattern in its residuals, so both should be considered. Quadratic model: SYX = 3,340.838, MAD = 2,251.934. First-order autoregressive model: SYX = 4,771.759, MAD = 2,975.726. Because the quadratic model has the lower SYX and MAD and a similar r2, that model should be selected.
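The selection logic in (d) can be sketched in Python: screen out models whose residual plots show a pattern, then compare SYX and MAD among the remaining candidates. The dictionary layout and the `residual_pattern` flag are illustrative assumptions; the numbers are those reported in (b–c).

```python
# Summary measures from 16.34 (b-c); residual_pattern records the verdict from (a).
models = {
    "Linear": {"syx": 9641.926, "mad": 7074.251, "residual_pattern": True},
    "Quadratic": {"syx": 3340.838, "mad": 2251.934, "residual_pattern": False},
    "Exponential": {"syx": 29413.36, "mad": 16012.19, "residual_pattern": True},
    "AR-First Order": {"syx": 4771.759, "mad": 2975.726, "residual_pattern": False},
}

# Keep only models whose residual plots show no pattern.
candidates = {name: m for name, m in models.items() if not m["residual_pattern"]}

# Among the candidates, find the model with the smallest value of each measure.
best_by_syx = min(candidates, key=lambda name: candidates[name]["syx"])
best_by_mad = min(candidates, key=lambda name: candidates[name]["mad"])

print(sorted(candidates), best_by_syx, best_by_mad)
```

When the two measures disagree, or the values are close, the principle of parsimony still has to be applied by judgment; this sketch only automates the numerical comparison.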
16.35
(a)
Linear Displays cyclical pattern. Do not consider this model.
Quadratic Displays cyclical pattern. Do not consider this model.
Exponential Displays cyclical pattern. Do not consider this model.
Second-Order Autoregressive Does not show a pattern. Consider this model.
(b–c)
House Sales   Linear      Quadratic   Exponential   AR-Second Order
r2            0.1568      0.1929      0.1708        0.9368
SYX           239.2122    238.3291    244.195       69.1224
MAD           188.4059    184.3723    191.9446      51.3408
(d)
Based on the results from (a) through (c), the residuals associated with the linear, quadratic, and exponential models have cyclical patterns, whereas the second-order autoregressive model has no clear pattern. The second-order autoregressive model also has the smallest SYX and MAD values and the highest r2. Based on these results and the principle of parsimony, the second-order autoregressive model would be the best option.
16.36
(a)
Linear Displays slight cyclic pattern. Do not consider this model.
Quadratic Displays slight cyclic pattern. Do not consider this model.
Exponential Displays slight cyclic pattern. Do not consider this model.
First-Order Autoregressive Does not show a pattern. Consider this model.
(b–c)
Bonus   Linear     Quadratic   Exponential   AR-First Order
r2      0.5838     0.5911      0.58          0.5094
SYX     30.8973    31.4203     30.6493       33.5612
MAD     22.6094    21.6598     21.5919       26.6907
(d)
The residual plots for the linear, quadratic, and exponential models reveal a slight cyclical pattern for the first part of the time series followed by no clear pattern for the remainder of the series. The first-order autoregressive model revealed no clear pattern throughout the time series. The models had similar r2, SYX, and MAD values. On the basis of the residual plots and the principle of parsimony, the first-order autoregressive model might be the best choice for forecasting.
16.37
(a)
Linear Displays cyclic pattern. Do not consider this model.
Quadratic Displays cyclic pattern. Do not consider this model.
Exponential Displays cyclic pattern. Do not consider this model.
First-Order Autoregressive Does not show a pattern. Consider this model.
(b–c)
Units Produced   Linear      Quadratic   Exponential   AR-First Order
r2               0.6055      0.6058      0.5567        0.7534
SYX              702.192     719.2189    708.3523      529.4649
MAD              521.5868    520.7087    511.2103      375.6391
(d)
The residual plots for the linear, quadratic, and exponential models reveal cyclical patterns across coded year. The first-order autoregressive model has no clear pattern but does have one outlier. The first-order autoregressive model has the smallest SYX and MAD values. On the basis of (a) through (c) and the principle of parsimony, the first-order autoregressive model would be the best model for forecasting.
16.38
(a)
Linear Displays cyclic pattern. Do not consider this model.
Quadratic Displays cyclic pattern. Do not consider this model.
Exponential Displays cyclic pattern. Do not consider this model.
First-Order Autoregressive Does not show a pattern. Consider this model.
(b–c)
MLB Salary   Linear    Quadratic   Exponential   AR-First Order
r2           0.9298    0.9328      0.9371        0.9539
SYX          0.2118    0.2133      0.2424        0.1665
MAD          0.1499    0.1498      0.1637        0.1069
(d)
The residual plots for the linear, quadratic, and exponential models reveal cyclical patterns across coded year. The first-order autoregressive model has no clear pattern. The SYX and MAD values are similar across the models but lowest for the first-order autoregressive model. On the basis of (a) through (c) and the principle of parsimony, the first-order autoregressive model might be the best model for forecasting.
16.39
(a)
(b)
SYX = 787.0082
(c)
MAD = 629.8621
(d)
On the basis of (a) through (c), the linear model does not appear to be an adequate option because it does not account for cyclical variations. Other models would be more appropriate. One would not be satisfied with the linear trend forecasts in 16.13.
16.40
(a)
\( b_0 = \log \hat{\beta}_0 = 2 \), so \( \hat{\beta}_0 = 10^{2} = 100 \). This is the unadjusted forecast.
(b)
\( b_1 = \log \hat{\beta}_1 = 0.01 \), so \( \hat{\beta}_1 = 10^{0.01} = 1.0233 \). The estimated monthly compound growth rate is 2.33%.
(c)
\( b_2 = \log \hat{\beta}_2 = 0.10 \), so \( \hat{\beta}_2 = 10^{0.10} = 1.2589 \). The January values in the time series are estimated to have a mean 25.89% higher than the December values.
16.41
To account for the seasonal component day of the week, six dummy variables would be needed.
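Seven days of the week need only six dummy variables because one day serves as the baseline, with all six dummies equal to zero. A minimal Python sketch of that coding follows; the choice of Sunday as the baseline and the name `day_dummies` are assumptions made for illustration only.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def day_dummies(day, baseline="Sun"):
    """Return the six 0/1 dummy variables for a day; the baseline day is all zeros."""
    if day not in DAYS:
        raise ValueError(f"unknown day: {day}")
    return [1 if day == d else 0 for d in DAYS if d != baseline]


print(day_dummies("Mon"))  # Monday switches on the first dummy only
print(day_dummies("Sun"))  # the baseline day has every dummy equal to 0
```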
16.42
(a)
\( b_0 = \log \hat{\beta}_0 = 3 \), so \( \hat{\beta}_0 = 10^{3} = 1{,}000 \). This is the unadjusted forecast.
(b)
\( b_1 = \log \hat{\beta}_1 = 0.10 \), so \( \hat{\beta}_1 = 10^{0.10} = 1.2589 \). The estimated quarterly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 25.89\% \).
(c)
\( b_3 = \log \hat{\beta}_3 = 0.20 \), so \( \hat{\beta}_3 = 10^{0.20} = 1.5849 \). \( \hat{\beta}_3 = 1.5849 \) is the seasonal multiplier relative to the fourth quarter. This multiplier indicates that second quarter values are 58.49% greater than the fourth quarter values.
16.43
(a)
Fitted value for Q4 of 2022: \( \log \hat{Y}_{20} = 3.0 + 0.10(19) = 4.9 \); \( \hat{Y}_{20} = 10^{4.9} = 79{,}432.82 \)
(b)
Fitted value for Q1 of 2022: \( \log \hat{Y}_{17} = 3.0 + 0.10(17) - 0.25 = 4.45 \); \( \hat{Y}_{17} = 10^{4.45} = 28{,}183.83 \)
(c)
Forecast for Q4 of 2023: \( \log \hat{Y}_{24} = 3.0 + 0.10(23) = 5.30 \); \( \hat{Y}_{24} = 10^{5.30} = 199{,}526.2315 \)
(d)
Forecast for Q1 of 2023: \( \log \hat{Y}_{21} = 3.0 + 0.10(21) - 0.25 = 4.85 \); \( \hat{Y}_{21} = 10^{4.85} = 70{,}794.58 \)
16.44
(a)
The revenues for Target appear to be subject to seasonal variation given that revenues are consistently higher in the fourth quarter, which includes several substantial holidays.
(b)
The plot confirms the answer for (a) by clearly revealing a seasonal component to revenues.
(c)
                 Coefficients   Standard Error   t Stat     P-value
Intercept         1.0974        0.0120           91.3610    0.0000
Coded Quarter     0.0044        0.0002           25.4733    0.0000
Q1               -0.1275        0.0128           -9.9209    0.0000
Q2               -0.1157        0.0128           -9.0051    0.0000
Q3               -0.1161        0.0128           -9.0392    0.0000

Predict log(Revenue) = 1.0974 + 0.0044 Coded Quarter – 0.1275Q1 – 0.1157Q2 – 0.1161Q3
(d)
\( \log_{10} \hat{\beta}_1 = 0.0044;\ \hat{\beta}_1 = 10^{0.0044} = 1.0101 \)
The estimated quarterly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 1.01\% \).
(e)
Quarter   \( b_i = \log_{10} \hat{\beta}_i \)   \( \hat{\beta}_i = 10^{b_i} \)   \( (\hat{\beta}_i - 1)100\% \)
First     –0.1275                               0.7456                           –25.44%
Second    –0.1157                               0.7661                           –23.39%
Third     –0.1161                               0.7653                           –23.47%
First, second, and third quarter values are estimated to be 25.44%, 23.39%, and 23.47% below fourth quarter values, respectively.
(f)
log(Revenue) = 1.0974 + 0.0044 Coded Quarter – 0.1275Q1 – 0.1157Q2 – 0.1161Q3
Predicted 2022 Q4 Revenue: log(Revenue) = 1.0974 + 0.0044(91) = 1.4961; \( 10^{1.4961} = 31.3422 \) ($ million)
Predicted 2023 Q1 Revenue: log(Revenue) = 1.0974 + 0.0044(92) – 0.1275 = 1.3730; \( 10^{1.3730} = 23.6065 \) ($ million)
Predicted 2023 Q2 Revenue: log(Revenue) = 1.0974 + 0.0044(93) – 0.1157 = 1.3892; \( 10^{1.3892} = 24.5014 \) ($ million)
Predicted 2023 Q3 Revenue: log(Revenue) = 1.0974 + 0.0044(94) – 0.1161 = 1.3931; \( 10^{1.3931} = 24.7243 \) ($ million)
Predicted 2023 Q4 Revenue: log(Revenue) = 1.0974 + 0.0044(95) = 1.5137; \( 10^{1.5137} = 32.6329 \) ($ million)
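Parts (d) through (f) all rest on the same step: a base-10 log coefficient b is converted back with 10^b. The Python sketch below applies that step to the printed coefficients. The names `seasonal_multiplier` and `forecast_revenue` are illustrative assumptions, and because the printed coefficients are rounded to four decimals, the forecasts it produces differ slightly from the manual's values, which appear to use unrounded coefficients.

```python
INTERCEPT = 1.0974
TREND = 0.0044  # coefficient of Coded Quarter
SEASONAL = {"Q1": -0.1275, "Q2": -0.1157, "Q3": -0.1161}  # Q4 is the baseline


def seasonal_multiplier(b):
    """Convert a base-10 log coefficient into a multiplier relative to Q4."""
    return 10 ** b


def forecast_revenue(coded_quarter, quarter):
    """Forecast revenue by evaluating the log equation, then undoing the log."""
    log_y = INTERCEPT + TREND * coded_quarter + SEASONAL.get(quarter, 0.0)
    return 10 ** log_y


for name, b in SEASONAL.items():
    m = seasonal_multiplier(b)
    print(f"{name}: multiplier {m:.4f}, {100 * (m - 1):.2f}% relative to Q4")

growth_rate = (10 ** TREND - 1) * 100  # quarterly compound growth rate in percent
print(f"growth rate {growth_rate:.2f}%")
print(f"2023 Q1 forecast {forecast_revenue(92, 'Q1'):.4f}")
```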
16.45
(a)
(b)
               Coefficients   Standard Error   t Stat     P-value
Intercept       0.4322        0.0251           17.2338    0.0000
Coded Month    -0.0001        0.0001           -0.6549    0.5133
Jan            -0.0031        0.0314           -0.0983    0.9218
Feb             0.0044        0.0314            0.1397    0.8891
Mar             0.0311        0.0314            0.9906    0.3231
Apr             0.0445        0.0314            1.4171    0.1581
May             0.0616        0.0314            1.9607    0.0514
June            0.0692        0.0314            2.2031    0.0288
July            0.0550        0.0314            1.7520    0.0814
Aug             0.0567        0.0314            1.8062    0.0725
Sept            0.0509        0.0314            1.6199    0.1069
Oct             0.0391        0.0314            1.2465    0.2141
Nov             0.0136        0.0319            0.4264    0.6703

Predict log(Price) = 0.4322 – 0.0001 Coded Month – 0.0031Jan + 0.0044Feb + 0.0311Mar + 0.0445Apr + 0.0616May + 0.0692June + 0.0550July + 0.0567Aug + 0.0509Sept + 0.0391Oct + 0.0136Nov
(c)
\( \log_{10} \hat{\beta}_1 = -0.0001;\ \hat{\beta}_1 = 10^{-0.0001} = 0.9998 \)
The estimated monthly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = -0.02\% \) after adjusting for the seasonal component.
(d)
Month   \( b_i = \log_{10} \hat{\beta}_i \)   \( \hat{\beta}_i = 10^{b_i} \)   \( (\hat{\beta}_i - 1)100\% \)
Jan     -0.0031                               0.9929                           -0.71%
Feb      0.0044                               1.0102                            1.02%
Mar      0.0311                               1.0743                            7.43%
Apr      0.0445                               1.1079                           10.79%
May      0.0616                               1.1523                           15.23%
June     0.0692                               1.1727                           17.27%
July     0.0550                               1.1350                           13.50%
Aug      0.0567                               1.1395                           13.95%
Sept     0.0509                               1.1243                           12.43%
Oct      0.0391                               1.0943                            9.43%
Nov      0.0136                               1.0318                            3.18%
Relative to the December values, the January through November values are estimated to differ by –0.71%, 1.02%, 7.43%, 10.79%, 15.23%, 17.27%, 13.50%, 13.95%, 12.43%, 9.43%, and 3.18%, respectively.
(e)
Gasoline prices are lower from November to March and higher from April to October, with the highest prices occurring in June.
16.46
(a)
(b)
               Coefficients   Standard Error   t Stat     P-value
Intercept       5.0230        0.0870           57.7213    0.0000
Coded Month     0.0033        0.0006            5.5910    0.0000
Jan            -0.1241        0.1081           -1.1473    0.2536
Feb            -0.1728        0.1081           -1.5981    0.1127
Mar            -0.1021        0.1081           -0.9442    0.3470
Apr            -0.1400        0.1081           -1.2949    0.1979
May            -0.1793        0.1081           -1.6585    0.0999
June           -0.2193        0.1081           -2.0286    0.0448
July           -0.2015        0.1081           -1.8643    0.0648
Aug            -0.1626        0.1081           -1.5041    0.1353
Sept           -0.1131        0.1081           -1.0465    0.2975
Oct            -0.0819        0.1107           -0.7399    0.4608
Nov            -0.0624        0.1106           -0.5643    0.5736

Predict log(Volume) = 5.0230 + 0.0033 Coded Month – 0.1241Jan – 0.1728Feb – 0.1021Mar – 0.1400Apr – 0.1793May – 0.2193June – 0.2015July – 0.1626Aug – 0.1131Sept – 0.0819Oct – 0.0624Nov
(c)
Sept 2022 Fitted value: \( \log \hat{Y}_{128} = 5.0230 + 0.0033(128) - 0.1133 = 5.3293 \)
(d)
Sept 2022 Forecast: \( \log \hat{Y}_{128} = 5.0230 + 0.0033(128) - 0.1133 = 5.3293 \); \( \hat{Y}_{128} = 10^{5.3293} = 213{,}445.7622 \) barrels
Oct 2022 Forecast: \( \log \hat{Y}_{129} = 5.0230 + 0.0033(129) - 0.0819 = 5.3638 \); \( \hat{Y}_{129} = 10^{5.3638} = 231{,}120.5189 \) barrels
Nov 2022 Forecast: \( \log \hat{Y}_{130} = 5.0230 + 0.0033(130) - 0.0624 = 5.3866 \); \( \hat{Y}_{130} = 10^{5.3866} = 243{,}530.12 \) barrels
Dec 2022 Forecast: \( \log \hat{Y}_{131} = 5.0230 + 0.0033(131) = 5.4523 \); \( \hat{Y}_{131} = 10^{5.4523} = 283{,}313.3103 \) barrels
(e)
\( \log_{10} \hat{\beta}_1 = 0.0033;\ \hat{\beta}_1 = 10^{0.0033} = 1.0076 \)
The estimated monthly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 0.76\% \) after adjusting for the seasonal component.
(f)
\( \log \hat{\beta}_{July} = -0.2015,\ \hat{\beta}_{July} = 0.6287 \)
The multiplier for July is 0.6287, which means the volume is 37.12% lower in July than in December.
16.47
(a)
(b)
The time series plot reveals a strong monthly seasonal pattern with high call volume that peaks between December and February and drops from March to lows in the summer months before rising again from October through February.
(c)
While the call volume varies seasonally, the overall volume remains fairly steady.
(d)
(e)
(f)
\( \log_{10} \hat{\beta}_1 = -0.000523;\ \hat{\beta}_1 = 10^{-0.000523} = 0.9988 \)
The estimated monthly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = -0.1204\% \) after adjusting for the seasonal component.
\( \log \hat{\beta}_2 = -0.0539;\ \hat{\beta}_2 = 10^{-0.0539} = 0.8833625;\ (\hat{\beta}_2 - 1)100\% = -11.6717\% \)
The January values are estimated to have a mean 11.67% below the December values.
(g)
Month 60: X = 59, M1 = M2 = M3 = M4 = M5 = M6 = M7 = M8 = M9 = M10 = M11 = 0. \( \hat{Y}_{60} = 27919.65195 \)
(h)
Month 61: X = 60, M1 = 1; M2 = M3 = M4 = M5 = M6 = M7 = M8 = M9 = M10 = M11 = 0. \( \hat{Y}_{61} = 24633.71996 \)
(i)
The call center can more accurately predict call volume by month, which will allow the center to allocate resources more effectively to account for seasonal variation in call volume.
16.48
(a)
(b)
                 Coefficients   Standard Error   t Stat     P-value
Intercept         1.0654        0.0454           23.4866    0.0000
Coded Quarter     0.0043        0.0007            5.7455    0.0000
Q1                0.0073        0.0485            0.1502    0.8810
Q2               -0.0069        0.0485           -0.1417    0.8877
Q3               -0.0014        0.0485           -0.0288    0.9771

Predict log(Price) = 1.0654 + 0.0043 Coded Quarter + 0.0073Q1 – 0.0069Q2 – 0.0014Q3
(c)
\( \log_{10} \hat{\beta}_1 = 0.0043;\ \hat{\beta}_1 = 10^{0.0043} = 1.0099 \)
The estimated quarterly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 0.99\% \).
(d)
\( \log_{10} \hat{\beta}_2 = 0.0073;\ \hat{\beta}_2 = 10^{0.0073} = 1.0169 \)
The 1st quarter values are estimated to have a mean 1.69% above the 4th quarter values. A review of the p-values associated with the t tests on the slope coefficients reveals that the slope coefficients for Quarter 1, Quarter 2, and Quarter 3 are not significant at the 0.05 significance level.
(e)
2022 Q4, X = 79: \( \log \hat{Y}_{79} = 1.0654 + 0.0043(79) = 1.4048 \); \( \hat{Y}_{79} = 10^{1.4048} = 25.3951 \) (US$)
(f)
2023 Q1 Forecast: \( \log \hat{Y}_{80} = 1.0654 + 0.0043(80) + 0.0073 = 1.4163 \); \( \hat{Y}_{80} = 10^{1.4163} = 26.0815 \) (US$)
2023 Q2 Forecast: \( \log \hat{Y}_{81} = 1.0654 + 0.0043(81) - 0.0069 = 1.4065 \); \( \hat{Y}_{81} = 10^{1.4065} = 25.4957 \) (US$)
2023 Q3 Forecast: \( \log \hat{Y}_{82} = 1.0654 + 0.0043(82) - 0.0014 = 1.4162 \); \( \hat{Y}_{82} = 10^{1.4162} = 26.0780 \) (US$)
2023 Q4 Forecast: \( \log \hat{Y}_{83} = 1.0654 + 0.0043(83) = 1.4219 \); \( \hat{Y}_{83} = 10^{1.4219} = 26.4200 \) (US$)
(g)
The forecasts are not likely to be accurate given that the quarterly exponential trend model did not fit the data particularly well. The adjusted r2 = 0.2715. In addition, the time series contained an irregular component from 2010 through 2013.
16.49
(a)
(b)
                 Coefficients   Standard Error   t Stat     P-value
Intercept         2.7778        0.0330           84.1762    0.0000
Coded Quarter     0.0072        0.0006           12.5343    0.0000
Q1                0.0027        0.0353            0.0763    0.9394
Q2               -0.0043        0.0353           -0.1212    0.9039
Q3               -0.0012        0.0353           -0.0354    0.9719

Predict log(Price) = 2.7778 + 0.0072 Coded Quarter + 0.0027Q1 – 0.0043Q2 – 0.0012Q3
(c)
\( \log_{10} \hat{\beta}_1 = 0.0072;\ \hat{\beta}_1 = 10^{0.0072} = 1.016676 \)
The estimated quarterly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 1.67\% \).
(d)
\( \log_{10} \hat{\beta}_2 = 0.0027;\ \hat{\beta}_2 = 10^{0.0027} = 1.006218 \)
The 1st quarter values are estimated to have a mean 0.6218% above the 4th quarter values. A review of the p-values associated with the t tests on the slope coefficients reveals that the slope coefficients for Quarter 1, Quarter 2, and Quarter 3 are not significant at the 0.05 significance level.
(e)
2022 Q4, X = 75: \( \log \hat{Y}_{75} = 2.7778 + 0.0072(75) = 3.3162 \); \( \hat{Y}_{75} = 10^{3.3162} = 2071.10 \) (US$)
(f)
2023 Q1 Forecast: \( \log \hat{Y}_{76} = 2.7778 + 0.0072(76) + 0.0027 = 3.3261 \); \( \hat{Y}_{76} = 10^{3.3261} = 2118.71 \) (US$)
2023 Q2 Forecast: \( \log \hat{Y}_{77} = 2.7778 + 0.0072(77) - 0.0043 = 3.3268 \); \( \hat{Y}_{77} = 10^{3.3268} = 2119.73 \) (US$)
2023 Q3 Forecast: \( \log \hat{Y}_{78} = 2.7778 + 0.0072(78) - 0.0012 = 3.3365 \); \( \hat{Y}_{78} = 10^{3.3365} = 2170.14 \) (US$)
2023 Q4 Forecast: \( \log \hat{Y}_{79} = 2.7778 + 0.0072(79) = 3.3449 \); \( \hat{Y}_{79} = 10^{3.3449} = 2212.67 \) (US$)
(g)
The forecasts in (f) are not accurate because of a downward shift in the price in the 2nd quarter of 2013 followed by a flattening in the price over the remaining quarters.
16.50
A time series is a set of numerical data obtained at regular periods over time.
16.51
A trend is the overall long-term tendency or impression of upward or downward movements. The cyclical component depicts the up-and-down swings or movements through the series. Any observed data that do not follow the trend curve modified by the cyclical component are indicative of the irregular component. When data are recorded monthly or quarterly, an additional component called the seasonal factor is considered.
16.52
Moving averages take into account the results of a limited number of periods of time. Exponential smoothing takes into account all the time periods but gives increased weight to more recent time periods.
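The contrast between the two smoothing methods can be sketched in Python. The series used here is hypothetical, and the function names are illustrative assumptions; the moving average is centered with an odd length, and exponential smoothing uses the usual recursion E1 = Y1, Ei = W·Yi + (1 − W)·Ei−1.

```python
def moving_average(series, length):
    """Centered moving average of odd length; positions without a full window are None."""
    if length % 2 == 0:
        raise ValueError("length must be odd for a centered moving average")
    half = length // 2
    smoothed = [None] * len(series)
    for i in range(half, len(series) - half):
        window = series[i - half : i + half + 1]  # only `length` periods are used
        smoothed[i] = sum(window) / length
    return smoothed


def exponential_smoothing(series, w):
    """Each smoothed value blends the current observation with all earlier ones."""
    smoothed = [series[0]]
    for y in series[1:]:
        smoothed.append(w * y + (1 - w) * smoothed[-1])
    return smoothed


data = [4, 6, 5, 9, 7]  # hypothetical values, not from the text
ma3 = moving_average(data, 3)
es = exponential_smoothing(data, 0.5)
print(ma3)
print(es)
```

Each moving-average value depends only on the observations inside its window, whereas every exponentially smoothed value carries some weight from the whole history, with the weight shrinking for older periods.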
16.53
The exponential trend model is appropriate when the percentage difference from observation to observation is constant.
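A short Python sketch illustrates why a constant percentage change points to the exponential trend model. The values 100 and 1.0233 are borrowed from the 16.40 solution purely as an illustration, and the function name is an assumption.

```python
def exponential_trend(b0, b1, periods):
    """Values of Y = b0 * b1**X for X = 0, 1, ..., periods - 1."""
    return [b0 * b1**x for x in range(periods)]


values = exponential_trend(100.0, 1.0233, 5)

# Percentage change from each observation to the next.
growth = [(later / earlier - 1) * 100 for earlier, later in zip(values, values[1:])]
print([round(g, 2) for g in growth])
```

Every period-to-period change equals (b1 − 1)100%, which is the compound growth rate reported throughout these solutions.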
16.54
The linear trend model in this chapter has the time period as the X variable.
16.55
Autoregressive models have independent variables that are the dependent variable lagged by a given number of time periods.
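Building the lagged predictors can be sketched in Python as follows. The series is hypothetical and `lagged_rows` is an illustrative name; the point is that a p-th order model loses the first p observations, which corresponds to the "Rows unused" counts shown in the regression outputs above.

```python
def lagged_rows(series, order):
    """Pair each response value with its `order` preceding values (Lag1 first)."""
    rows = []
    for t in range(order, len(series)):
        lags = [series[t - k] for k in range(1, order + 1)]
        rows.append((series[t], lags))
    return rows


series = [10, 12, 15, 14, 18]  # hypothetical values, not from the text
rows = lagged_rows(series, order=2)
for response, lags in rows:
    print(response, lags)
print("rows unused:", len(series) - len(rows))
```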
16.56
The different methods for choosing an appropriate forecasting model are residual analysis, the standard error of the estimate, the mean absolute deviation, and parsimony.
16.57
The standard error of the estimate relies on the sum of the squared deviations between the observed value and the predicted value. This measure gives increased weight to large differences. The mean absolute deviation is the mean of the absolute value of the deviations between the observed value and predicted value.
16.58
Forecasting for monthly or quarterly data uses an exponential trend model with dummy variables to represent either months or quarters.
16.59
(a)
(b)
\( \hat{Y} = 0.8267 + 0.4253X \), where X = years since 1915
(c)
1960: \( \hat{Y} = 0.8267 + 0.4253(45) = 19.97 \)
1965: \( \hat{Y} = 0.8267 + 0.4253(50) = 22.09 \)
1970: \( \hat{Y} = 0.8267 + 0.4253(55) = 24.22 \)
(d)
The actual rates, which varied across sources located on the Internet, were extremely low and almost nonexistent for the years 1965 and 1970.
(e)
The forecasts made in (c) are not useful because the linear equation could not anticipate the discovery of a polio vaccine.
16.60
(a)
Workforce:
(b)
Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9501
R Square             0.9028
Adjusted R Square    0.9002
Standard Error       4555.6473
Observations         39

ANOVA
              df    SS                 MS                 F
Regression     1    7130650204.4534    7130650204.4534    343.5808
Residual      37     767895120.9825      20753922.1887
Total         38    7898545325.4359

              Coefficients    Standard Error   t Stat     P-value
Intercept     119451.9487     1431.3576        83.4536    0.0000
Coded Year      1201.4372       64.8167        18.5359    0.0000
(c)
\( \hat{Y} = 119{,}451.9487 + 1{,}201.4372X \), where X = years relative to 1984
\( \hat{Y}_{2023} = 119{,}451.9487 + 1{,}201.4372(39) = 166{,}308.001 \) (thousands)
\( \hat{Y}_{2024} = 119{,}451.9487 + 1{,}201.4372(40) = 167{,}509.439 \) (thousands)
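The linear trend forecasts in (c) only require coding the year relative to the base year. A minimal Python sketch, assuming the rounded coefficients printed above (so the last decimals can differ slightly from the manual's values); the function name is illustrative.

```python
def linear_trend_forecast(intercept, slope, year, base_year):
    """Forecast from a linear trend model with X coded as years since base_year."""
    coded_year = year - base_year
    return intercept + slope * coded_year


workforce_2023 = linear_trend_forecast(119451.9487, 1201.4372, 2023, 1984)
workforce_2024 = linear_trend_forecast(119451.9487, 1201.4372, 2024, 1984)
print(f"{workforce_2023:.3f} {workforce_2024:.3f}")  # in thousands
```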
16.61
(a)
It would be reasonable to expect the price of natural gas would have a seasonal component which reflects the variation in the use of gas across seasonal temperature changes.
(b)
The time series plot for Commercial Price does appear to support the answer in (a) that there is a seasonal component. An overall downward adjustment in price in 2008 is followed by seasonal variation from 2011 through May of 2021.
The time series plot for Residential does appear to support the answer in (a) that there is a seasonal component from 2011 through 2022.
(c)
Commercial Price:
               Coefficients   Standard Error   t Stat     P-value
Intercept       0.9042        0.0134           67.6441    0.0000
Coded Month    -0.0001        0.0001           -1.0721    0.2858
Jan            -0.0031        0.0164           -0.1917    0.8483
Feb             0.0000        0.0164            0.0026    0.9979
Mar             0.0092        0.0164            0.5615    0.5755
Apr             0.0070        0.0168            0.4167    0.6776
May             0.0264        0.0168            1.5734    0.1182
June            0.0447        0.0168            2.6675    0.0087
July            0.0523        0.0168            3.1197    0.0023
Aug             0.0531        0.0168            3.1654    0.0020
Sept            0.0460        0.0168            2.7415    0.0070
Oct             0.0196        0.0168            1.1702    0.2442
Nov             0.0008        0.0168            0.0464    0.9631

Predict Commercial Price: log(C Price) = 0.9042 – 0.0001 Coded Month – 0.0031Jan + 0.00004Feb + 0.0092Mar + 0.0070Apr + 0.0264May + 0.0447June + 0.0523July + 0.0531Aug + 0.0460Sept + 0.0196Oct + 0.0008Nov

Residential Price:
               Coefficients   Standard Error   t Stat     P-value
Intercept       0.9613        0.0112           85.5486    0.0000
Coded Month     0.0004        0.0001            5.6424    0.0000
Jan            -0.0107        0.0138           -0.7734    0.4408
Feb            -0.0062        0.0138           -0.4481    0.6549
Mar             0.0149        0.0138            1.0787    0.2829
Apr             0.0474        0.0141            3.3578    0.0010
May             0.1199        0.0141            8.4989    0.0000
June            0.2020        0.0141           14.3247    0.0000
July            0.2444        0.0141           17.3376    0.0000
Aug             0.2591        0.0141           18.3780    0.0000
Sept            0.2336        0.0141           16.5723    0.0000
Oct             0.1255        0.0141            8.9025    0.0000
Nov             0.0271        0.0141            1.9258    0.0565

Predict Residential Price: log(R Price) = 0.9613 + 0.0004 Coded Month – 0.0107Jan – 0.0062Feb + 0.0149Mar + 0.0474Apr + 0.1199May + 0.2020June + 0.2444July + 0.2591Aug + 0.2336Sept + 0.1255Oct + 0.0271Nov
(d)
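Commercial: \( \log_{10} \hat{\beta}_1 = -0.0001;\ \hat{\beta}_1 = 10^{-0.0001} = 0.9997853 \). The estimated monthly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = -0.021\% \).
Residential: \( \log_{10} \hat{\beta}_1 = 0.0004;\ \hat{\beta}_1 = 10^{0.0004} = 1.0009504 \). The estimated monthly compound growth rate is \( (\hat{\beta}_1 - 1)100\% = 0.095\% \).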
(e)
Commercial:
Month   \( b_i = \log_{10} \hat{\beta}_i \)   \( \hat{\beta}_i = 10^{b_i} \)   \( (\hat{\beta}_i - 1)100\% \)
Jan     -0.0031                               0.9928                           -0.72%
Feb      0.0000                               1.0001                            0.01%
Mar      0.0092                               1.0214                            2.14%
Apr      0.0070                               1.0162                            1.62%
May      0.0264                               1.0627                            6.27%
June     0.0447                               1.1085                           10.85%
July     0.0523                               1.1280                           12.80%
Aug      0.0531                               1.1300                           13.00%
Sept     0.0460                               1.1116                           11.16%
Oct      0.0196                               1.0462                            4.62%
Nov      0.0008                               1.0018                            0.18%
January, February, March, April, and November values are estimated to be very close to the December values. May, June, July, August, September, and October values are estimated to have a mean above the December values. A review of the p-values associated with the t tests on the slope coefficients reveals that the coefficients for January, February, March, April, May, October, and November are not significant, whereas the coefficients for June, July, August, and September are significant at the 0.05 significance level. The multipliers indicate that the monthly commercial prices for natural gas are highest in the summer months. The multipliers support the answers in (a) and (b).

Residential:
Month   \( b_i = \log_{10} \hat{\beta}_i \)   \( \hat{\beta}_i = 10^{b_i} \)   \( (\hat{\beta}_i - 1)100\% \)
Jan     -0.0107                               0.9757                           -2.43%
Feb     -0.0062                               0.9859                           -1.41%
Mar      0.0149                               1.0349                            3.49%
Apr      0.0474                               1.1152                           11.52%
May      0.1199                               1.3178                           31.78%
June     0.2020                               1.5921                           59.21%
July     0.2444                               1.7556                           75.56%
Aug      0.2591                               1.8158                           81.58%
Sept     0.2336                               1.7123                           71.23%
Oct      0.1255                               1.3350                           33.50%
Nov      0.0271                               1.0645                            6.45%
January, February, and March values are estimated to be very close to the December values. April, May, June, July, August, September, October, and November values are estimated to have a mean above the December values. A review of the p-values associated with the t tests on the slope coefficients reveals that the coefficients for January, February, and March are not significant. The coefficients for April, May, June, July, August, September, and October are significant at the 0.05 significance level. The November coefficient is not significant at the 0.05 level. The multipliers indicate that the monthly residential prices for natural gas are higher in the spring, summer, and fall months, with the highest prices occurring in the summer months. The multipliers support the answers in (a) and (b).
(f)
Both the residential and commercial prices for natural gas appear to be highest in the summer months. The results also reveal that the seasonal pattern appears to be stronger for residential prices than for commercial prices.
16.62
(a)
(b)
Linear (Simple Linear Regression Analysis): Revenues vs Coded Year

Regression Statistics
Multiple R           0.9507
R Square             0.9038
Adjusted R Square    0.9016
Standard Error       2.8492
Observations         47

ANOVA
              df    SS           MS           F
Regression    1     3430.8592    3430.8592    422.6250
Residual      45    365.3089     8.1180
Total         46    3796.1681

              Coefficients    Standard Error    t Stat     P-value
Intercept     -1.2784         0.8181            -1.5626    0.1252
Coded Year    0.6299          0.0306            20.5578    0.0000

Linear Predict: $\hat{Y} = -1.2784 + 0.6299X$, where X = years relative to 1975. tSTAT = 20.5578, p-value = 0.0000 < 0.05, so coded year is significant; $r^2$ = 0.9038. 90.38% of the variation in revenue (in billions of dollars) is explained by year.
(c)
Quadratic (Regression Analysis): Revenues vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.9515
R Square             0.9053
Adjusted R Square    0.9010
Standard Error       2.8582
Observations         47

ANOVA
              df    SS           MS           F
Regression    2     3436.7307    1718.3653    210.3511
Residual      44    359.4374     8.1690
Total         46    3796.1681

                 Coefficients    Standard Error    t Stat     P-value
Intercept        -2.0198         1.1993            -1.6841    0.0992
Coded Year       0.7287          0.1206            6.0429     0.0000
Coded Year Sq    -0.0021         0.0025            -0.8478    0.4011

Quadratic Predict: $\hat{Y} = -2.0198 + 0.7287X - 0.0021X^2$, where X = years relative to 1975. For the full model, F = 210.3511, p-value = 0.0002, so at least one X variable is significant. For the Coded Year Sq term, tSTAT = -0.8478, p-value = 0.4011 > 0.05, so Coded Year Sq is not significant; $r^2$ = 0.9053. 90.53% of the variation in revenue (in billions of dollars) is explained by year.
(d)
Exponential (Regression Analysis): Log(Revenues) vs Coded Year
Regression Statistics
Multiple R           0.9485
R Square             0.8997
Adjusted R Square    0.8975
Standard Error       0.1372
Observations         47

ANOVA
              df    SS        MS        F
Regression    1     7.5963    7.5963    403.7741
Residual      45    0.8466    0.0188
Total         46    8.4428

              Coefficients    Standard Error    t Stat     P-value
Intercept     0.2813          0.0394            7.1437     0.0000
Coded Year    0.0296          0.0015            20.0941    0.0000

Exponential Predict: $\log_{10}\hat{Y} = 0.2813 + 0.0296X$, where X = years relative to 1975. tSTAT = 20.0941, p-value = 0.0000 < 0.05, so coded year is significant; $r^2$ = 0.8997. 89.97% of the variation in the log of revenue is explained by year.
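Because the exponential model is fitted to base-10 logarithms, predictions in the original units require raising 10 to the fitted value. The sketch below uses the two coefficients reported above to back-transform a prediction and to express the slope as an estimated annual compound growth rate; the function names are illustrative only.

```python
# Coefficients of the fitted exponential trend model, log10(Y-hat) = b0 + b1 * X,
# taken from the regression output in this solution.
B0 = 0.2813
B1 = 0.0296


def predict_revenue(coded_year):
    """Back-transform the log10 prediction to revenue in billions of dollars."""
    return 10 ** (B0 + B1 * coded_year)


def annual_growth_rate_percent():
    """Estimated annual compound growth rate implied by the slope: (10**b1 - 1) * 100."""
    return (10 ** B1 - 1) * 100


print(f"Predicted revenue at X = 0: {predict_revenue(0):.3f}")
print(f"Estimated annual growth rate: {annual_growth_rate_percent():.2f}%")
```

Under these rounded coefficients the implied growth rate is roughly 7% per year.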
(e)
Autoregressive: Regression Analysis for Revenue vs Lag1, Lag2, Lag3 (Rows unused: 3)

             Coefficients    Standard Error    t Stat     P-value
Intercept    0.5271          0.3205            1.6444     0.1079
Lag1         1.0714          0.2021            5.3006     0.0000
Lag2         0.2729          0.3637            0.7504     0.4574
Lag3         -0.3617         0.2154            -1.6792    0.1009

For the third-order term, tSTAT = -1.6792 with a p-value of 0.1009. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Regression Analysis for Revenue vs Lag1, Lag2 (Rows unused: 2)

             Coefficients    Standard Error    t Stat     P-value
Intercept    0.5582          0.3115            1.7923     0.0803
Lag1         1.2587          0.1718            7.3271     0.0000
Lag2         -0.2722         0.1692            -1.6086    0.1152

For the second-order term, tSTAT = -1.6086 with a p-value of 0.1152. The second-order term can be dropped because it is not significant at the 0.05 significance level.

Regression Analysis for Revenue vs Lag1 (Rows unused: 1)

Regression Statistics
Multiple R           0.9923
R Square             0.9847
Adjusted R Square    0.9844
Standard Error       1.1241
Observations         46

ANOVA
              df    SS           MS           F
Regression    1     3588.2881    3588.2881    2839.8938
Residual      44    55.5953      1.2635
Total         45    3643.8834

             Coefficients    Standard Error    t Stat     P-value
Intercept    0.6699          0.2919            2.2949     0.0266
Lag1         0.9856          0.0185            53.2907    0.0000

For the first-order term, tSTAT = 53.2907 with a p-value of 0.0000. The first-order term cannot be dropped because it is significant at the 0.05 significance level. The first-order model is appropriate.
Autoregressive Predict: $\hat{Y}_i = 0.6699 + 0.9856Y_{i-1}$
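The order-selection steps above follow a simple backward rule: start with the highest-order model, test its highest-order term, and drop down one order whenever that term is not significant. A minimal sketch of that rule, using the p-values reported in this solution (the function name and the dictionary layout are assumptions made for illustration):

```python
def select_ar_order(highest_term_p_values, alpha=0.05):
    """Return the largest order whose highest-order term is significant.

    highest_term_p_values maps each order p to the p-value of the p-th order
    term in the fitted p-th order autoregressive model. Returns 0 if no
    highest-order term is significant at the given level.
    """
    for order in sorted(highest_term_p_values, reverse=True):
        if highest_term_p_values[order] < alpha:
            return order
    return 0


# p-values of the highest-order term in the third-, second-, and first-order
# models for revenue, as reported in the output above.
revenue_p_values = {3: 0.1009, 2: 0.1152, 1: 0.0000}
print(select_ar_order(revenue_p_values))
```

The rule stops at the first significant highest-order term, so lower-order models do not need to be fitted once a term is retained.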
(f)
Linear: Displays large curvilinear pattern. Do not consider this model.
Quadratic: Displays large curvilinear pattern. Do not consider this model.
Exponential: Displays large curvilinear pattern. Do not consider this model.
First-Order Autoregressive: Does not show a pattern. Consider this model.
(g)
       Linear    Quadratic    Exponential    Autoregressive-1st
Syx    2.8492    2.8582       6.5339         1.1241
MAD    2.1531    2.2484       3.8816         0.7533
(h)
The residual plots reveal clear cyclical patterns for the linear, quadratic, and exponential models. The residual plot for the first-order autoregressive model reveals no cyclical pattern. Based on the results from (f), (g), and the principle of parsimony, the first-order autoregressive model would be best suited for forecasting.
(i)
$\hat{Y}_i = 0.6699 + 0.9856Y_{i-1}$, where $Y_{i-1} = 23.2$: $\hat{Y}_{2022} = 0.6699 + 0.9856(23.2) = 23.536$ billion dollars
16.63
(a)
(b)
Diversified: Linear Diversified Equity vs Coded Year, Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.9143
R Square             0.8359
Adjusted R Square    0.8315
Standard Error       12.0538
Observations         39

ANOVA
              df    SS            MS            F
Regression    1     27387.6853    27387.6853    188.4984
Residual      37    5375.8791     145.2940
Total         38    32763.5644

              Coefficients    Standard Error    t Stat     P-value
Intercept     10.8112         3.7872            2.8546     0.0070
Coded Year    2.3546          0.1715            13.7295    0.0000

Linear Model Diversified Equity: Predicted Diversified Equity = $\hat{Y} = 10.8112 + 2.3546X$, where X = years relative to 1984. t = 13.7295, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.8359. 83.59% of the variation in Diversified Equity is explained by year.

Balanced: Linear Balanced vs Coded Year, Simple Linear Regression Analysis

Regression Statistics
Multiple R           0.5978
R Square             0.3573
Adjusted R Square    0.3399
Standard Error       2.0963
Observations         39

ANOVA
              df    SS          MS         F
Regression    1     90.3961     90.3961    20.5711
Residual      37    162.5900    4.3943
Total         38    252.9861

              Coefficients    Standard Error    t Stat     P-value
Intercept     14.6039         0.6586            22.1730    0.0000
Coded Year    0.1353          0.0298            4.5355     0.0001

Linear Model Balanced: Predicted Balanced = $\hat{Y} = 14.6039 + 0.1353X$, where X = years relative to 1984. t = 4.5355, p-value = 0.0001 < 0.05. Coded year is significant. $r^2$ = 0.3573. 35.73% of the variation in Balanced is explained by year.
(c)
Diversified: Quadratic Diversified Equity vs Coded Year, Coded Year Sq, Regression Analysis

Regression Statistics
Multiple R           0.9176
R Square             0.8420
Adjusted R Square    0.8332
Standard Error       11.9932
Observations         39

ANOVA
              df    SS            MS            F
Regression    2     27585.4383    13792.7192    95.8914
Residual      36    5178.1261     143.8368
Total         38    32763.5644

                 Coefficients    Standard Error    t Stat    P-value
Intercept        15.4733         5.4780            2.8246    0.0077
Coded Year       1.5986          0.6670            2.3967    0.0219
Coded Year Sq    0.0199          0.0170            1.1725    0.2487

Quadratic Model: Predicted Diversified Equity = $\hat{Y} = 15.4733 + 1.5986X + 0.0199X^2$, where X = years relative to 1984. For Coded Year Sq, t = 1.1725, p-value = 0.2487 > 0.05. Coded Year Sq is not significant. $r^2$ = 0.8420. 84.20% of the variation in Diversified Equity is explained by year.

Balanced: Quadratic Balanced vs Coded Year, Coded Year Sq, Regression Analysis

Regression Statistics
Multiple R           0.9847
R Square             0.9696
Adjusted R Square    0.9679
Standard Error       0.4624
Observations         39

ANOVA
              df    SS          MS          F
Regression    2     245.2894    122.6447    573.6492
Residual      36    7.6967      0.2138
Total         38    252.9861

                 Coefficients    Standard Error    t Stat      P-value
Intercept        10.4778         0.2112            49.6111     0.0000
Coded Year       0.8044          0.0257            31.2811     0.0000
Coded Year Sq    -0.0176         0.0007            -26.9163    0.0000

Quadratic Model: Predicted Balanced = $\hat{Y} = 10.4778 + 0.8044X - 0.0176X^2$, where X = years relative to 1984. For Coded Year Sq, t = -26.9163, p-value = 0.0000 < 0.05. Coded Year Sq is significant. $r^2$ = 0.9696. 96.96% of the variation in Balanced is explained by year.
(d)
Diversified: Exponential log(Diversified Equity) vs Coded Year, Regression Analysis

Regression Statistics
Multiple R           0.9202
R Square             0.8469
Adjusted R Square    0.8427
Standard Error       0.1068
Observations         39

ANOVA
              df    SS        MS        F
Regression    1     2.3337    2.3337    204.6059
Residual      37    0.4220    0.0114
Total         38    2.7557

              Coefficients    Standard Error    t Stat     P-value
Intercept     1.2606          0.0336            37.5681    0.0000
Coded Year    0.0217          0.0015            14.3041    0.0000

Exponential model: log(predicted Diversified Equity) = $\log_{10}\hat{Y} = 1.2606 + 0.0217X$, where X = years relative to 1984. t = 14.3041, p-value = 0.0000 < 0.05; Coded Year is significant; $r^2$ = 0.8469. 84.69% of the variation in the log of Diversified Equity is explained by year.

Balanced: Exponential log(Balanced) vs Coded Year, Regression Analysis

Regression Statistics
Multiple R           0.6128
R Square             0.3755
Adjusted R Square    0.3587
Standard Error       0.0585
Observations         39

ANOVA
              df    SS        MS        F
Regression    1     0.0763    0.0763    22.2505
Residual      37    0.1268    0.0034
Total         38    0.2031

              Coefficients    Standard Error    t Stat     P-value
Intercept     1.1547          0.0184            62.7733    0.0000
Coded Year    0.0039          0.0008            4.7170     0.0000

Exponential model: log(predicted Balanced) = $\log_{10}\hat{Y} = 1.1547 + 0.0039X$, where X = years relative to 1984. t = 4.7170, p-value = 0.0000 < 0.05; Coded Year is significant; $r^2$ = 0.3755. 37.55% of the variation in the log of Balanced is explained by year.
(e)
Diversified Equity: Third-Order Autoregressive Diversified Equity vs Lag1, Lag2, Lag3, Regression Analysis

             Coefficients    Standard Error    t Stat     P-value
Intercept    1.2631          3.6890            0.3424     0.7343
Lag1         1.0847          0.1833            5.9187     0.0000
Lag2         -0.2515         0.2627            -0.9572    0.3456
Lag3         0.2098          0.1913            1.0967     0.2810

For the third-order term, tSTAT = 1.0967 with a p-value of 0.2810. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Diversified Equity: Second-Order Autoregressive Diversified Equity vs Lag1, Lag2, Regression Analysis

             Coefficients    Standard Error    t Stat     P-value
Intercept    1.8183          3.4520            0.5267     0.6018
Lag1         1.0777          0.1807            5.9635     0.0000
Lag2         -0.0540         0.1885            -0.2865    0.7762

For the second-order term, tSTAT = -0.2865 with a p-value of 0.7762. The second-order term can be dropped because it is not significant at the 0.05 significance level.

Diversified Equity: First-Order Autoregressive Diversified Equity vs Lag1, Regression Analysis

Regression Statistics
Multiple R           0.9549
R Square             0.9118
Adjusted R Square    0.9094
Standard Error       8.7009
Observations         38

ANOVA
              df    SS            MS            F
Regression    1     28189.8623    28189.8623    372.3640
Residual      36    2725.3843     75.7051
Total         37    30915.2465

             Coefficients    Standard Error    t Stat     P-value
Intercept    1.4834          3.1890            0.4652     0.6446
Lag1         1.0316          0.0535            19.2967    0.0000

For the first-order term, tSTAT = 19.2967 with a p-value of 0.0000. The first-order term should be retained because it is significant at the 0.05 significance level.
Autoregression model (First Order): Predicted Diversified Equity = $\hat{Y}_i = 1.4834 + 1.0316Y_{i-1}$
Balanced: Third-Order Autoregressive Balanced vs Lag1, Lag2, Lag3, Regression Analysis

             Coefficients    Standard Error    t Stat     P-value
Intercept    0.8444          0.3726            2.2661     0.0303
Lag1         1.4867          0.1765            8.4222     0.0000
Lag2         -0.3684         0.3128            -1.1777    0.2476
Lag3         -0.1646         0.1610            -1.0223    0.3143

For the third-order term, tSTAT = -1.0223 with a p-value of 0.3143. The third-order term can be dropped because it is not significant at the 0.05 significance level.

Balanced: Second-Order Autoregressive Balanced vs Lag1, Lag2, Regression Analysis

Regression Statistics
Multiple R           0.9949
R Square             0.9897
Adjusted R Square    0.9891
Standard Error       0.2228
Observations         37

ANOVA
              df    SS          MS         F
Regression    2     162.8774    81.4387    1640.0175
Residual      34    1.6883      0.0497
Total         36    164.5657

             Coefficients    Standard Error    t Stat     P-value
Intercept    0.8740          0.3354            2.6060     0.0135
Lag1         1.6222          0.1133            14.3230    0.0000
Lag2         -0.6698         0.1020            -6.5683    0.0000

For the second-order term, tSTAT = -6.5683 with a p-value of 0.0000. The second-order term should be retained because it is significant at the 0.05 significance level.
Autoregression model (Second Order): Predicted Balanced = $\hat{Y}_i = 0.8740 + 1.6222Y_{i-1} - 0.6698Y_{i-2}$
(f)
[Residual plots for the linear, quadratic, exponential, and autoregressive models; figures not reproduced.]
(g)
Diversified Equity    Linear     Quadratic    Exponential    AR-First Order
r2                    0.8359     0.8420       0.8469         0.9118
SYX                   12.0538    11.9932      12.5313        8.7009
MAD                   8.0378     8.3620       9.5213         5.9918

Balanced              Linear     Quadratic    Exponential    AR-Second Order
r2                    0.3573     0.9696       0.3755         0.9897
SYX                   2.0963     0.4624       2.1897         0.2228
MAD                   1.8259     0.2891       1.9171         0.1556
(h)
For the Diversified Equity Fund, the first-order autoregressive model appears to be the best model based on the results from (f) and (g) and the principle of parsimony. The residual plots for the other models revealed clear patterns while the residual plot for the first-order autoregressive model revealed no clear pattern. The first-order autoregressive model also had the lowest Syx and MAD values, and highest r2. For the Balanced Fund, the second-order autoregressive model appears to be the best model based on the results from (f) and (g) and the principle of parsimony. The residual plots for linear, quadratic, and exponential models had strong patterns. The residual plot for the second-order autoregressive model revealed less of a pattern relative to the other models. The second-order autoregressive model had the lowest Syx and MAD values, and highest r2.
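The numerical part of the comparison in (g) and (h) can be reproduced with a short script. The sketch below copies the SYX, MAD, and r2 values from the tables in (g) and picks the best model on each measure; the dictionary layout and the helper name `best_by` are illustrative assumptions, and the residual-plot evidence from (f) still has to be judged separately.

```python
# Summary measures copied from the comparison tables in part (g).
diversified = {
    "Linear": {"r2": 0.8359, "syx": 12.0538, "mad": 8.0378},
    "Quadratic": {"r2": 0.8420, "syx": 11.9932, "mad": 8.3620},
    "Exponential": {"r2": 0.8469, "syx": 12.5313, "mad": 9.5213},
    "AR-First Order": {"r2": 0.9118, "syx": 8.7009, "mad": 5.9918},
}
balanced = {
    "Linear": {"r2": 0.3573, "syx": 2.0963, "mad": 1.8259},
    "Quadratic": {"r2": 0.9696, "syx": 0.4624, "mad": 0.2891},
    "Exponential": {"r2": 0.3755, "syx": 2.1897, "mad": 1.9171},
    "AR-Second Order": {"r2": 0.9897, "syx": 0.2228, "mad": 0.1556},
}


def best_by(models, measure, larger_is_better=False):
    """Return the model name with the best value of the given measure."""
    chooser = max if larger_is_better else min
    return chooser(models, key=lambda name: models[name][measure])


for fund_name, models in (("Diversified Equity", diversified), ("Balanced", balanced)):
    print(fund_name,
          "| lowest SYX:", best_by(models, "syx"),
          "| lowest MAD:", best_by(models, "mad"),
          "| highest r2:", best_by(models, "r2", larger_is_better=True))
```

The r2 values are not strictly comparable across all four models (the exponential fit explains variation in the log of the series, and the autoregressive fits use fewer observations), so SYX and MAD in the original units carry most of the weight.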
(i)
Diversified (First-Order Autoregression model): $\hat{Y}_{2023} = 1.4834 + 1.0316Y_{2022} = 1.4834 + 1.0316(133.741) = 139.4525$
Balanced (Second-Order Autoregression model): $\hat{Y}_{2023} = 0.8740 + 1.6222Y_{2022} - 0.6698Y_{2021} = 0.8740 + 1.6222(17.271) - 0.6698(17.000) = 17.5036$
(j)
Based on the results from (a) through (i), one would recommend that a member of the Teachers' Retirement System of the City of New York invest most of their retirement funds in the Diversified Equity Fund, with possibly a small percentage in the Balanced Fund. The member should be advised that the Diversified Equity Fund carries more risk than the Balanced Fund and that this should be considered. If the member prefers to have almost no risk, most of the retirement funds should be invested in the Balanced Fund. However, the member should be aware that the value of the Diversified Equity Fund increased by 920% from 1984 to 2022, compared to only a 67% increase in value for the Balanced Fund. A second-order autoregressive model was able to account for 99% of the variation in the Balanced Fund price, while a first-order model was able to account for 91% of the variation in the Diversified Equity Fund price. For most individuals willing to take some risk, the Diversified Equity Fund would clearly be the fund in which to invest most of a member's retirement funds.
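The two forecasts in (i) are direct substitutions into the fitted autoregressive equations. A minimal sketch, using the rounded coefficients and the 2021 and 2022 values quoted in (i) (the function names are illustrative):

```python
def ar1_forecast(a0, a1, y_prev):
    """One-step forecast from a first-order autoregressive model."""
    return a0 + a1 * y_prev


def ar2_forecast(a0, a1, a2, y_prev, y_prev2):
    """One-step forecast from a second-order autoregressive model."""
    return a0 + a1 * y_prev + a2 * y_prev2


# Coefficients and most recent observed values as quoted in part (i).
diversified_2023 = ar1_forecast(1.4834, 1.0316, 133.741)
balanced_2023 = ar2_forecast(0.8740, 1.6222, -0.6698, 17.271, 17.000)

print(f"Diversified Equity 2023 forecast: {diversified_2023:.4f}")
print(f"Balanced 2023 forecast: {balanced_2023:.4f}")
```

The results agree with the reported 139.4525 and 17.5036 to about two decimal places; the remaining gap is consistent with the solution having used unrounded coefficients.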
16.64
Each of the currencies, the Canadian dollar (CAD), Japanese yen (JPY), and British pound (BPD), is expressed in units per U.S. dollar (USD).

A time-series analysis of the Canadian dollar (CAD) reveals a moderate cyclical component with up and down cycles of varying durations. Although the currency rate varies in cycles, the 1980 exchange rate is fairly similar to the 2022 exchange rate. A review of residual plots reveals that the linear, quadratic, and exponential models may be problematic due to cyclical variation in the residuals. In contrast, a first-order autoregressive model revealed a random pattern of residuals. In addition, the first-order autoregressive model had the smallest standard error of the estimate and MAD values, and the highest r2. The first-order autoregressive model was the most appropriate model to use for forecasting. Using this model, the forecasted exchange rate is 1.2573 (units per $ U.S.) and 1.2606 (units per $ U.S.) for 2023 and 2024, respectively.

A time-series analysis of the Japanese yen exchange rate revealed a steep drop in the rate beginning in 1986. The rate dropped from 238.47 (units per $ U.S.) in 1985 to 128.17 (units per $ U.S.) in 1988. The exchange rate had a cyclical component from that point forward with an overall declining trend. A review of the residual plots, standard error of the estimate, MAD values, and r2 revealed that the first-order autoregressive model was the most appropriate for forecasting. Using this model, the forecasted exchange rate is 109.5270 (units per $ U.S.) and 109.2481 (units per $ U.S.) for 2023 and 2024, respectively.

A time-series analysis of the British pound reveals a moderate cyclical component with up and down cycles of varying durations. Although the currency rate varies in cycles, the exchange rate increased from 0.4302 (units per $ U.S.) in 1980 to 0.7847 (units per $ U.S.) in 2019. A review of the residual plots and the standard error of the estimate and MAD values revealed that the first-order autoregressive model was the most appropriate for forecasting. Using this model, the forecasted exchange rate is 0.7076 (units per $ U.S.) and 0.6941 (units per $ U.S.) for 2023 and 2024, respectively.

An unexpected irregular component in the future could not be anticipated by the autoregressive models used for each of the currencies.
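Forecasting two years ahead with a first-order autoregressive model means feeding the first forecast back in as the lagged value. The sketch below illustrates this with the first-order coefficients reported in the regression output later in this solution, starting from each reported 2023 forecast to reproduce the 2024 forecast; the function name `ar1_path` is an illustrative assumption.

```python
def ar1_path(a0, a1, start, steps):
    """Iterate Y-hat(t+1) = a0 + a1 * Y-hat(t) for the requested number of steps."""
    values = []
    y = start
    for _ in range(steps):
        y = a0 + a1 * y
        values.append(y)
    return values


# (intercept, Lag1 coefficient, reported 2023 forecast) for each currency,
# all taken from this solution.
models = {
    "CAD": (0.2391, 0.8124, 1.2573),
    "JPY": (11.6679, 0.8909, 109.5270),
    "BPD": (0.1899, 0.7125, 0.7076),
}

forecast_2024 = {
    name: ar1_path(a0, a1, y_2023, steps=1)[0]
    for name, (a0, a1, y_2023) in models.items()
}
for name, value in forecast_2024.items():
    print(f"{name} 2024 forecast: {value:.4f}")
```

Each iterated value matches the reported 2024 forecast up to rounding of the printed coefficients.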
CAD Linear
Simple Linear Regression Analysis: CAD vs Coded Year

Regression Statistics
Multiple R           0.1067
R Square             0.0114
Adjusted R Square    -0.0127
Standard Error       0.1489
Observations         43

ANOVA
              df    SS        MS        F
Regression    1     0.0105    0.0105    0.4724
Residual      41    0.9088    0.0222
Total         42    0.9193

              Coefficients    Standard Error    t Stat     P-value
Intercept     1.2902          0.0446            28.9090    0.0000
Coded Year    -0.0013         0.0018            -0.6873    0.4958

Predicted CAD = $\hat{Y} = 1.2902 - 0.0013X$, where X = years relative to 1980. t = -0.6873, p-value = 0.4958 > 0.05. Coded year is not significant. $r^2$ = 0.0114. 1.14% of the variation in CAD is explained by year.
CAD Quadratic
Regression Analysis: CAD vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.1282
R Square             0.0164
Adjusted R Square    -0.0328
Standard Error       0.1503
Observations         43

ANOVA
              df    SS        MS        F
Regression    2     0.0151    0.0075    0.3340
Residual      40    0.9042    0.0226
Total         42    0.9193

                 Coefficients    Standard Error    t Stat     P-value
Intercept        1.2685          0.0657            19.3067    0.0000
Coded Year       0.0019          0.0072            0.2637     0.7934
Coded Year Sq    -0.0001         0.0002            -0.4524    0.6534

Predicted CAD = $\hat{Y} = 1.2685 + 0.0019X - 0.0001X^2$, where X = years relative to 1980. For Coded Year Sq, t = -0.4524, p-value = 0.6534 > 0.05. Coded Year Sq is not significant. $r^2$ = 0.0164. 1.64% of the variation in CAD is explained by year.
CAD Exponential
Simple Linear Regression Analysis: log(CAD) vs Coded Year

Regression Statistics
Multiple R           0.1223
R Square             0.0150
Adjusted R Square    -0.0091
Standard Error       0.0520
Observations         43

ANOVA
              df    SS        MS        F
Regression    1     0.0017    0.0017    0.6227
Residual      41    0.1109    0.0027
Total         42    0.1126

              Coefficients    Standard Error    t Stat     P-value
Intercept     0.1093          0.0156            7.0085     0.0000
Coded Year    -0.0005         0.0006            -0.7891    0.4346

log(predicted CAD) = $\log_{10}\hat{Y} = 0.1093 - 0.0005X$, where X = years relative to 1980. t = -0.7891, p-value = 0.4346 > 0.05. Coded year is not significant. $r^2$ = 0.0150. 1.50% of the variation in the log of CAD is explained by year.
CAD Autoregressive
The third-order autoregressive regression had a third-order term (Lag3) p-value = 0.2025 > 0.05. The third-order term can be dropped. The second-order autoregressive regression had a second-order term (Lag2) p-value = 0.5148 > 0.05. The second-order term can be dropped.
Simple Linear Regression Analysis: CAD vs Lag1

Regression Statistics
Multiple R           0.8164
R Square             0.6666
Adjusted R Square    0.6582
Standard Error       0.0871
Observations         42

ANOVA
              df    SS        MS        F
Regression    1     0.6067    0.6067    79.9618
Residual      40    0.3035    0.0076
Total         41    0.9101

             Coefficients    Standard Error    t Stat    P-value
Intercept    0.2391          0.1156            2.0680    0.0451
Lag1         0.8124          0.0909            8.9421    0.0000

Predicted CAD = $\hat{Y}_i = 0.2391 + 0.8124Y_{i-1}$. For the first-order term, tSTAT = 8.9421, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.6666. 66.66% of the variation in CAD is explained by the prior year's value.

CAD    Linear    Quadratic    Exponential    AR-First Order
r2     0.0114    0.0164       0.0150         0.6666
SYX    0.1489    0.1503       0.1492         0.0871
MAD    0.1200    0.1204       0.1210         0.0694
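The SYX and MAD rows in the comparison table are both computed from each model's residuals. The sketch below shows the two calculations under the usual definitions, SYX as the square root of SSE divided by n - p - 1 and MAD as the mean absolute residual; the residuals used here are hypothetical values chosen for illustration, not the residuals from the exchange-rate data.

```python
import math


def standard_error_of_estimate(residuals, p):
    """S_YX = sqrt(SSE / (n - p - 1)), where p is the number of explanatory variables."""
    n = len(residuals)
    sse = sum(e ** 2 for e in residuals)
    return math.sqrt(sse / (n - p - 1))


def mean_absolute_deviation(residuals):
    """MAD = sum(|Y - Y-hat|) / n."""
    return sum(abs(e) for e in residuals) / len(residuals)


# Hypothetical residuals (observed minus predicted) for a one-variable model.
example_residuals = [1.0, -1.0, 2.0, -2.0]
print(standard_error_of_estimate(example_residuals, p=1))
print(mean_absolute_deviation(example_residuals))
```

For models fitted on a transformed scale, such as the exponential model, the residuals would first need to be expressed in the original units before the measures are comparable across models.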
Japanese yen (JPY)
JPY Linear
Simple Linear Regression Analysis: JPY vs Coded Year

Regression Statistics
Multiple R           0.7218
R Square             0.5210
Adjusted R Square    0.5093
Standard Error       31.7874
Observations         43

ANOVA
              df    SS            MS            F
Regression    1     45061.4687    45061.4687    44.5961
Residual      41    41427.8707    1010.4359
Total         42    86489.3394

              Coefficients    Standard Error    t Stat     P-value
Intercept     186.7181        9.5284            19.5960    0.0000
Coded Year    -2.6086         0.3906            -6.6780    0.0000

Predicted JPY = $\hat{Y} = 186.7181 - 2.6086X$, where X = years relative to 1980. t = -6.6780, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.5210. 52.10% of the variation in JPY is explained by year.
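The summary statistics in each regression output are tied together by the ANOVA table, which gives a quick way to check a reconstructed table. The sketch below recomputes r2, adjusted r2, F, and the standard error of the estimate from the JPY linear model's sums of squares reported above; the function name and the returned dictionary keys are illustrative.

```python
import math


def regression_summary(ssr, sse, n, k):
    """Recompute summary measures from the regression and residual sums of squares.

    n is the number of observations and k the number of explanatory variables.
    """
    sst = ssr + sse
    r2 = ssr / sst
    mse = sse / (n - k - 1)
    return {
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (n - 1) / (n - k - 1),
        "f": (ssr / k) / mse,
        "syx": math.sqrt(mse),
    }


# Sums of squares from the JPY linear trend output (n = 43, one explanatory variable).
jpy_linear = regression_summary(ssr=45061.4687, sse=41427.8707, n=43, k=1)
for name, value in jpy_linear.items():
    print(f"{name}: {value:.4f}")
```

For a model with a single explanatory variable, the F statistic also equals the square of the slope's t statistic, which offers one more consistency check.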
JPY Quadratic
Regression Analysis: JPY vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.8989
R Square             0.8080
Adjusted R Square    0.7984
Standard Error       20.3744
Observations         43

ANOVA
              df    SS            MS            F
Regression    2     69884.6962    34942.3481    84.1749
Residual      40    16604.6432    415.1161
Total         42    86489.3394

                 Coefficients    Standard Error    t Stat      P-value
Intercept        236.8211        8.9039            26.5976     0.0000
Coded Year       -9.9408         0.9807            -10.1367    0.0000
Coded Year Sq    0.1746          0.0226            7.7329      0.0000

Predicted JPY = $\hat{Y} = 236.8211 - 9.9408X + 0.1746X^2$, where X = years relative to 1980. For Coded Year Sq, t = 7.7329, p-value = 0.0000 < 0.05. Coded Year Sq is significant. $r^2$ = 0.8080. 80.80% of the variation in JPY is explained by year.
JPY Exponential
Simple Linear Regression Analysis: log(JPY) vs Coded Year

Regression Statistics
Multiple R           0.7320
R Square             0.5359
Adjusted R Square    0.5245
Standard Error       0.0880
Observations         43

ANOVA
              df    SS        MS        F
Regression    1     0.3670    0.3670    47.3342
Residual      41    0.3179    0.0078
Total         42    0.6848

              Coefficients    Standard Error    t Stat     P-value
Intercept     2.2565          0.0264            85.4939    0.0000
Coded Year    -0.0074         0.0011            -6.8800    0.0000

log(predicted JPY) = $\log_{10}\hat{Y} = 2.2565 - 0.0074X$, where X = years relative to 1980. t = -6.8800, p-value = 0.0000 < 0.05. Coded year is significant. $r^2$ = 0.5359. 53.59% of the variation in the log of JPY is explained by year.
JPY Autoregressive
The third-order autoregressive regression had a third-order term (Lag3) p-value = 0.6463 > 0.05. The third-order term can be dropped. The second-order autoregressive regression had a second-order term (Lag2) p-value = 0.0841 > 0.05. The second-order term can be dropped.
Simple Linear Regression Analysis: JPY vs Lag1

Regression Statistics
Multiple R           0.9396
R Square             0.8829
Adjusted R Square    0.8799
Standard Error       15.0461
Observations         42

ANOVA
              df    SS            MS            F
Regression    1     68253.7451    68253.7451    301.4936
Residual      40    9055.4160     226.3854
Total         41    77309.1610

             Coefficients    Standard Error    t Stat     P-value
Intercept    11.6679         7.1823            1.6245     0.1121
Lag1         0.8909          0.0513            17.3636    0.0000

Predicted JPY = $\hat{Y}_i = 11.6679 + 0.8909Y_{i-1}$. For the first-order term, tSTAT = 17.3636, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.8829. 88.29% of the variation in JPY is explained by the prior year's value.

JPY    Linear     Quadratic    Exponential    AR-First Order
r2     0.5210     0.8080       0.5359         0.8829
SYX    31.7874    20.3744      30.3981        15.0461
MAD    26.4410    16.3532      22.9516        10.6744
British pound (BPD)
BPD Linear
Simple Linear Regression Analysis: BPD vs Coded Year

Regression Statistics
Multiple R           0.4775
R Square             0.2280
Adjusted R Square    0.2091
Standard Error       0.0785
Observations         43

ANOVA
              df    SS        MS        F
Regression    1     0.0746    0.0746    12.1074
Residual      41    0.2528    0.0062
Total         42    0.3274

              Coefficients    Standard Error    t Stat     P-value
Intercept     0.5677          0.0235            24.1201    0.0000
Coded Year    0.0034          0.0010            3.4796     0.0012

Predicted BPD = $\hat{Y} = 0.5677 + 0.0034X$, where X = years relative to 1980. t = 3.4796, p-value = 0.0012 < 0.05. Coded year is significant. $r^2$ = 0.2280. 22.80% of the variation in BPD is explained by year.
BPD Quadratic
Regression Analysis: BPD vs Coded Year, Coded Year Sq

Regression Statistics
Multiple R           0.5661
R Square             0.3205
Adjusted R Square    0.2865
Standard Error       0.0746
Observations         43

ANOVA
              df    SS        MS        F
Regression    2     0.1049    0.0525    9.4330
Residual      40    0.2225    0.0056
Total         42    0.3274

                 Coefficients    Standard Error    t Stat     P-value
Intercept        0.6230          0.0326            19.1164    0.0000
Coded Year       -0.0047         0.0036            -1.3210    0.1940
Coded Year Sq    0.0002          0.0001            2.3336     0.0247

Predicted BPD = $\hat{Y} = 0.6230 - 0.0047X + 0.0002X^2$, where X = years relative to 1980. For Coded Year Sq, t = 2.3336, p-value = 0.0247 < 0.05. Coded Year Sq is significant. $r^2$ = 0.3205. 32.05% of the variation in BPD is explained by year.
BPD Exponential
Simple Linear Regression Analysis: log(BPD) vs Coded Year

Regression Statistics
Multiple R           0.4722
R Square             0.2230
Adjusted R Square    0.2041
Standard Error       0.0546
Observations         43

ANOVA
              df    SS        MS        F
Regression    1     0.0350    0.0350    11.7672
Residual      41    0.1221    0.0030
Total         42    0.1572

              Coefficients    Standard Error    t Stat      P-value
Intercept     -0.2475         0.0164            -15.1303    0.0000
Coded Year    0.0023          0.0007            3.4303      0.0014

log(predicted BPD) = $\log_{10}\hat{Y} = -0.2475 + 0.0023X$, where X = years relative to 1980. t = 3.4303, p-value = 0.0014 < 0.05. Coded year is significant. $r^2$ = 0.2230. 22.30% of the variation in the log of BPD is explained by year.
BPD Autoregressive
The third-order autoregressive regression had a third-order term (Lag3) p-value = 0.5215 > 0.05. The third-order term can be dropped. The second-order autoregressive regression had a second-order term (Lag2) p-value = 0.1946 > 0.05. The second-order term can be dropped.
Simple Linear Regression Analysis: BPD vs Lag1

Regression Statistics
Multiple R           0.7569
R Square             0.5728
Adjusted R Square    0.5622
Standard Error       0.0550
Observations         42

ANOVA
              df    SS        MS        F
Regression    1     0.1622    0.1622    53.6421
Residual      40    0.1209    0.0030
Total         41    0.2831

             Coefficients    Standard Error    t Stat    P-value
Intercept    0.1899          0.0625            3.0402    0.0042
Lag1         0.7125          0.0973            7.3241    0.0000

Predicted BPD = $\hat{Y}_i = 0.1899 + 0.7125Y_{i-1}$. For the first-order term, tSTAT = 7.3241, p-value = 0.0000 < 0.05. The first-order term is significant. $r^2$ = 0.5728. 57.28% of the variation in BPD is explained by the prior year's value.

BPD    Linear    Quadratic    Exponential    AR-First Order
r2     0.2280    0.3205       0.2230         0.5728
SYX    0.0785    0.0746       0.0782         0.0550
MAD    0.0617    0.0570       0.0610         0.0433
Chapter 17
17.1
The three major categories of business analytics are descriptive, predictive, and prescriptive. Descriptive analytics summarize historical data to facilitate the identification of potential patterns or trends that could lead to more in-depth analyses. Predictive analytics utilize historical data to understand relationships among variables or to predict values of a dependent variable. Prescriptive analytics evaluate business models to facilitate the identification of operational improvement strategies.
17.2
Limited information technology and data management systems prevented the widespread adoption of business analytics in the past.
17.3
Data mining involves the extraction of valuable data through model building and/or descriptive or predictive analytics.
17.4
Artificial intelligence is a branch of computer science that utilizes software solutions to simulate human expertise, reasoning, or knowledge. Machine learning is one of the artificial intelligence tools used to automate model building.
17.5
Exploratory models facilitate the understanding of the relationship among variables. Predictive models seek to predict individual cases rather than an estimate of the general case. Dashboards typically allow users to drill down to various levels of detail.
17.6
Decision rules compare decision criteria to facilitate prediction.
17.7
Dashboards facilitate the exploration of data by displaying critical pieces of information in a visual format that allows users to quickly interpret the overall status of an activity or event.
17.8
Data dimensionality represents the number of variables associated with visualizing a data item. Color, size, and motion represent additional dimensions that can be used to describe a data item.
17.9
Classification trees and regression trees are decision trees that split data into groups based on the values of independent or explanatory variables. Classification trees use a categorical dependent variable, while regression trees use a numerical dependent variable.
17.10
Clustering involves the grouping of items into sets based on the similarity of items. A calculated distance is used to determine the similarity of items. Association analysis assesses the similarity of the values that comprise one item.
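As an illustration of the "calculated distance" mentioned in 17.10, the sketch below uses Euclidean distance. That choice is an assumption made for illustration, since the solution does not name a specific distance measure, and the item values and function names are hypothetical.

```python
import math


def euclidean_distance(item_a, item_b):
    """Straight-line distance between two items described by numeric variables."""
    if len(item_a) != len(item_b):
        raise ValueError("items must have the same number of variables")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(item_a, item_b)))


def nearest_item(target, candidates):
    """Return the candidate that is most similar (smallest distance) to target."""
    return min(candidates, key=lambda c: euclidean_distance(target, c))


# Hypothetical items measured on two variables
print(euclidean_distance((0, 0), (3, 4)))
print(nearest_item((1, 1), [(5, 5), (2, 1), (9, 0)]))
```

Items with a smaller distance are treated as more similar, which is the idea a clustering method builds on when it forms groups.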
17.11
Text analytics represents a blend of descriptive and prescriptive analytics. Text analytics utilize clustering and association methods to automate analysis and interpretation of text.
17.12
A large language model (LLM) is a type of artificial intelligence algorithm using deep learning techniques on massive amounts of data.
17.13
The two primary approaches to prescriptive analytics are optimization and simulation. Optimization involves the setting of constraints to assist in the development of a decision model that represents the optimal way of managing a business process. Simulation involves the repeating of a predictive analytics model by varying the assumptions or the data associated with the model. Decision criteria are used to select a particular version of the model, which may or may not be optimal. Simulation can be used when a business process is not well understood.
Chapter 18
18.1
Output for the summary statistics for the processing time:
If one assumes that the 20 books from the two production plants represent independent events, one can use either the pooled-variance t test or the separate-variance t test to determine whether there is a significant difference between the two means, provided the normality assumption is met for both plants.
Histogram plots reveal that processing time for both plants is somewhat right skewed. A Wilcoxon rank sum test may be appropriate in this situation. However, because both t tests are robust to the departure from normality, one of these tests will be performed.
Since the p-value = 0.287 is larger than 0.05, do not reject H0. There is not sufficient evidence of a difference in the two population variances.
Since the p-value = 0.183 is greater than 0.05, do not reject H0. There is not sufficient evidence of a difference in the population mean processing time in the two plants. A Wilcoxon nonparametric rank sum test also failed to reveal a significant difference in the median processing time between the two plants. All tests performed revealed insufficient evidence of a difference in processing time between the two plants.
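The pooled-variance t test used in 18.1 can be sketched as follows. This is a minimal illustration of the test statistic only; the processing times are hypothetical placeholders, not the problem's data, and the p-value would still have to be obtained from a t distribution with n1 + n2 - 2 degrees of freedom.

```python
import math
from statistics import mean, variance


def pooled_variance_t(sample_1, sample_2):
    """Return (t statistic, degrees of freedom) for the pooled-variance t test."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    std_error = math.sqrt(pooled * (1 / n1 + 1 / n2))
    t_stat = (mean(sample_1) - mean(sample_2)) / std_error
    return t_stat, df


# Hypothetical processing times for two plants (not the data from the problem)
plant_a = [5.6, 9.1, 4.2, 12.5, 7.0]
plant_b = [8.3, 10.4, 6.9, 15.2, 9.8]
t_stat, df = pooled_variance_t(plant_a, plant_b)
print(f"t = {t_stat:.4f} with {df} degrees of freedom")
```

The pooled version is appropriate here because the F test for the two variances did not reject H0; otherwise the separate-variance t test would be the safer choice.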
18.2
Please note that for this problem:
"Employees" represents "Travel & Tourism Employees (thousands)"
"Nights" represents "Nights Spent at Tourism Establishments (millions)"
"Expenditures" represents "Accommodation Expenditures at Tourism Establishments (thousands of euros)"
"Establishments" represents "Tourism Establishments (thousands)"

Descriptive Summary of Travel and Tourism

                      Employees        Nights         Expenditures            Establishments
Mean                  6820.533333      62.0709685     5410744.926             20.2177
Median                2988.3           22.356157      1067998.545             5.1515
Mode                  #N/A             #N/A           #N/A                    #N/A
Minimum               269.6            1.322284       30805.56                0.268
Maximum               41500            324.389046     59606179.21             220.457
Range                 41230.4          323.066762     59575373.65             220.189
Variance              90863108.4561    8619.6867      129583421076066.0000    1903.4944
Standard Deviation    9532.2142        92.8423        11383471.3983           43.6291
Coeff. of Variation   139.76%          149.57%        210.39%                 215.80%
Skewness              2.3608           2.0121         4.0772                  3.8119
Kurtosis              5.6568           2.8184         18.7172                 16.0905
Count                 30               30             30                      30
Standard Error        1740.3363        16.9506        2078328.0225            7.9655
Multiple Regression Analysis: Employees vs Nights, Expenditures, Establishments

                  Coefficients   Standard Error   t Stat    P-value   VIF
Intercept         1125.6632      603.7475         1.8645    0.0736
Nights            56.8394        10.8787          5.2248    0.0000    3.9597
Expenditures      0.0004         0.0001           5.9620    0.0000    2.4254
Establishments    -3.6231        16.7572          -0.2162   0.8305    2.0748
Because the goal of 18.2 is to predict the value of a dependent variable, travel and tourism jobs (Employees), one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all three independent variables have VIF values below five.
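For reference, the VIF screen relies on the relationship VIF_j = 1 / (1 - R²_j), where R²_j is the coefficient of determination from regressing independent variable j on the other independent variables. The sketch below, with hypothetical function names, inverts that relationship to show the R²_j implied by each VIF reported for 18.2.

```python
def vif(r_squared_j):
    """Variance inflationary factor for a variable with auxiliary R-squared r_squared_j."""
    if not 0 <= r_squared_j < 1:
        raise ValueError("r_squared_j must be in [0, 1)")
    return 1.0 / (1.0 - r_squared_j)


def implied_r_squared(vif_value):
    """Auxiliary R-squared implied by a reported VIF."""
    return 1.0 - 1.0 / vif_value


# VIF values reported in the regression output for problem 18.2
reported_vifs = {"Nights": 3.9597, "Expenditures": 2.4254, "Establishments": 2.0748}
for name, value in reported_vifs.items():
    print(f"{name}: VIF = {value}, implied R-squared = {implied_r_squared(value):.4f}")
```

A VIF of 5 corresponds to an auxiliary R² of 0.80, which is why values below five are read as showing no serious collinearity.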
Best Subsets Analysis: Employees vs Nights, Expenditures, Establishments

Model     Cp         k+1   R Square   Adj. R Square   Std. Error
X1        43.4125    2     0.8032     0.7962          4303.5715
X2        49.8956    2     0.7848     0.7771          4500.0600
X3        247.5345   2     0.2245     0.1968          8543.1137
X1X2      2.0467     3     0.9262     0.9207          2684.6272
X1X3      37.5458    3     0.8255     0.8126          4126.7352
X2X3      29.2989    3     0.8489     0.8377          3840.3121
X1X2X3    4.0000     4     0.9263     0.9178          2733.3115
A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of nights (X1) and expenditures (X2) had the lowest Cp value (2.0467) and an adjusted r² = 0.9207. Although one other model had a similar adjusted r² (0.9178), it included three variables and had a higher Cp value. Based on the principle of parsimony, the two-variable model of Nights and Expenditures appears to be the preferred model in this case.

Multiple Regression Analysis: Employees vs Nights, Expenditures

Regression Statistics
Multiple R            0.9624
R Square              0.9262
Adjusted R Square     0.9207
Standard Error        2684.6272
Observations          30

ANOVA
              df    SS                 MS                 F
Regression    2     2440435118.7175    1220217559.3588    169.3048
Residual      27    194595026.5091     7207223.2040
Total         29    2635030145.2267

                Coefficients   Standard Error   t Stat   P-value
Intercept       1121.7726      592.7305         1.8926   0.0692
Nights          55.2040        7.6796           7.1884   0.0000
Expenditures    0.0004         0.0001           6.7047   0.0000
Predicted Employees = 1,121.7726 + 55.2040 Nights + 0.0004 Expenditures
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 169.3048 or p-value = 0.0000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that the two independent variables in the model should be included. The adjusted r² of this model is 0.9207. The r² of 0.9262 indicates that 92.62% of the variation in Employees can be explained by the variation in Nights and Expenditures.
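The summary statistics quoted above can be checked directly from the sums of squares in the ANOVA output. The following is a minimal sketch with hypothetical function names; it uses r² = SSR / SST, adjusted r² = 1 - (1 - r²)(n - 1)/(n - k - 1), and F = MSR / MSE.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjusted r-squared for a model with k independent variables and n observations."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)


def overall_f_statistic(ss_regression, ss_residual, n, k):
    """F = MSR / MSE, with k and n - k - 1 degrees of freedom."""
    ms_regression = ss_regression / k
    ms_residual = ss_residual / (n - k - 1)
    return ms_regression / ms_residual


# Sums of squares reported in the ANOVA output for Employees vs Nights, Expenditures
SS_REGRESSION = 2440435118.7175
SS_RESIDUAL = 194595026.5091
N_OBSERVATIONS, K_VARIABLES = 30, 2

r_squared = SS_REGRESSION / (SS_REGRESSION + SS_RESIDUAL)
adj_r_squared = adjusted_r_squared(r_squared, N_OBSERVATIONS, K_VARIABLES)
f_stat = overall_f_statistic(SS_REGRESSION, SS_RESIDUAL, N_OBSERVATIONS, K_VARIABLES)
print(f"r2 = {r_squared:.4f}, adjusted r2 = {adj_r_squared:.4f}, F = {f_stat:.4f}")
```

The recomputed values agree with the reported r² of 0.9262, adjusted r² of 0.9207, and FSTAT of 169.3048 to the precision shown in the output.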
For both of the two variables, residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot revealed no evidence of significant departure from the normality assumption.
18.3
Descriptive Summary of Best Cities

                      Median Home Price    Median Age   Average Annual Salary ($)   Unemployment Rate   Average Commute Time
Mean                  330824.66            39.027       59820.5                     7.672               25.01
Median                302596               38.35        52875                       7.6                 24.4
Mode                  #N/A                 37.2         #N/A                        8.1                 22.8
Minimum               92450                30.7         39250                       4.3                 19.4
Maximum               1455741              52.9         625560                      11.9                34.8
Range                 1363291              22.2         586310                      7.6                 15.4
Variance              36266084291.2772     14.9470      3335481948.3939             2.1115              10.1213
Standard Deviation    190436.5624          3.8661       57753.6315                  1.4531              3.1814
Coeff. of Variation   57.56%               9.91%        96.54%                      18.94%              12.72%
Skewness              2.9281               1.4034       9.6868                      0.2309              0.7925
Kurtosis              13.4183              2.4918       95.7280                     0.3287              0.4608
Count                 100                  100          100                         100                 100
Standard Error        19043.6562           0.3866       5775.3631                   0.1453              0.3181
Multiple Regression Analysis: Median Home Price vs Median Age, Average Annual Salary, Unemployment Rate, Average Commute Time

                             Coefficients   Standard Error   t Stat    P-value   VIF
Intercept                    -77791.7592    198959.2076      -0.3910   0.6967
Median Age                   -10192.7152    4180.6788        -2.4381   0.0166    1.0562
Average Annual Salary ($)    -0.0471        0.2753           -0.1711   0.8645    1.0218
Unemployment Rate            -6961.3447     11388.8321       -0.6112   0.5425    1.1072
Average Commute Time         34491.5294     5113.5102        6.7452    0.0000    1.0699
Because the goal of 18.3 is to predict the value of a dependent variable, median sales price of homes, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all four independent variables have a VIF below 5.
Best Subsets Analysis: Median Home Price vs Median Age, Average Annual Salary, Unemployment Rate, Average Commute Time

Intermediate Calculations
R2T        0.352051
1 - R2T    0.647949
n          100
T          5
n - T      95

Model       Cp        k+1   R Square   Adj. R Square   Std. Error
X1          45.7184   2     0.0334     0.0235          188181.4015
X2          50.0472   2     0.0039     -0.0063         191033.7864
X3          50.4230   2     0.0013     -0.0089         191279.4235
X4          6.2276    2     0.3028     0.2956          159826.0705
X1X2        47.4087   3     0.0355     0.0156          188942.1382
X1X3        46.8380   3     0.0394     0.0196          188560.4962
X1X4        1.3929    3     0.3494     0.3360          155184.6726
X2X3        51.8206   3     0.0054     -0.0151         191866.9556
X2X4        8.2144    3     0.3028     0.2885          160637.4282
X3X4        6.9452    3     0.3115     0.2973          159637.0537
X1X2X3      48.4974   4     0.0417     0.0118          189310.7168
X1X2X4      3.3736    4     0.3495     0.3292          155975.0643
X1X3X4      3.0293    4     0.3519     0.3316          155693.2517
X2X3X4      8.9441    4     0.3115     0.2900          160465.4419
X1X2X3X4    5.0000    5     0.3521     0.3248          156486.4220
A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of median age (X1) and average commute time (X4) had the lowest Cp value (1.3929) and an adjusted r² = 0.3360. The two-variable model of Median Age and Average Commute Time appears to be the preferred model in this case.
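The Cp values can be reproduced from the intermediate calculations listed with the Best Subsets output (R2T, n, and T). The sketch below assumes the usual textbook form Cp = (1 - R²k)(n - T)/(1 - R²T) - [n - 2(k + 1)], where T is the number of parameters, including the intercept, in the full model; the function name is hypothetical.

```python
def cp_statistic(r_squared_k, r_squared_full, n, k, total_parameters):
    """Cp for a model with k independent variables, judged against the full model."""
    scaled_error = (1 - r_squared_k) * (n - total_parameters) / (1 - r_squared_full)
    return scaled_error - (n - 2 * (k + 1))


# Intermediate calculations reported for problem 18.3
R2_FULL, N, T = 0.352051, 100, 5

# Two-variable model X1X4 (Median Age, Average Commute Time), reported r2 = 0.3494
cp_x1x4 = cp_statistic(0.3494, R2_FULL, N, k=2, total_parameters=T)
print(f"Cp for X1X4 is approximately {cp_x1x4:.2f}")

# For the full model, Cp equals k + 1 by construction
cp_full = cp_statistic(R2_FULL, R2_FULL, N, k=4, total_parameters=T)
print(f"Cp for the full model is {cp_full:.1f}")
```

Because the inputs are the four-decimal r² values from the output, the recomputed Cp for X1X4 differs slightly from the reported 1.3929; the difference is rounding only.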
Multiple Regression Analysis: Median Home Price vs Median Age, Average Commute Time

Regression Statistics
Multiple R            0.5911
R Square              0.3494
Adjusted R Square     0.3360
Standard Error        155184.6726
Observations          100

ANOVA
              df    SS                     MS                    F
Regression    2     1254360932349.1600     627180466174.5810     26.0432
Residual      97    2335981412487.2800     24082282602.9617
Total         99    3590342344836.4400

                        Coefficients   Standard Error   t Stat    P-value
Intercept               -96397.8894    194672.5492      -0.4952   0.6216
Median Age              -10653.9446    4041.3288        -2.6362   0.0098
Average Commute Time    33707.0790     4911.1519        6.8634    0.0000
Predicted Median Home Price = –96,397.8894 – 10,653.9446 Median Age + 33,707.0790 Average Commute Time
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 26.0432 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that both of the independent variables should be included. The model including two independent variables, median age and average commute time, represents the most appropriate model. The adjusted r² of this model is 0.3360. The r² of 0.3494 indicates that 34.94% of the variation in median home price can be explained by the variation in median age and average commute time.
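As a quick check on the fitted equation, the sketch below evaluates it at the sample means from the descriptive summary (median age 39.027, average commute time 25.01). A least-squares plane passes through the point of means, so the prediction should land on the mean median home price of 330,824.66. The function name is hypothetical.

```python
def predict_median_home_price(median_age, average_commute_time):
    """Prediction from the two-variable model reported for problem 18.3."""
    return (-96397.8894
            - 10653.9446 * median_age
            + 33707.0790 * average_commute_time)


# Evaluate at the sample means reported in the descriptive summary
at_means = predict_median_home_price(39.027, 25.01)
print(f"Prediction at the sample means: {at_means:,.2f}")

# Effect of one additional minute of commute time, holding median age constant
effect = predict_median_home_price(40, 26) - predict_median_home_price(40, 25)
print(f"Change per additional commute minute: {effect:,.2f}")
```

With an r² of only 0.3494, individual predictions from this equation carry considerable uncertainty even though both slopes are statistically significant.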
For both of the two variables, residual plots reveal no clear pattern, indicating no evidence for violation of the equal variance or linearity assumptions. The normal probability plot revealed no evidence of significant departure from the normality assumption.
18.4
Descriptive Summary of MLB Attendance Study

                      Season Attendance   Wins          Mean Attendance per Game   Stadium Capacity
Mean                  2149.786233         81.03333333   26.8081                    43293.23333
Median                2282.0015           81            28.5245                    42120.5
Mode                  #N/A                74            #N/A                       #N/A
Minimum               787.902             55            9.973                      31042
Maximum               3861.408            111           47.671                     56000
Range                 3073.506            56            37.698                     24958
Variance              598646.8575         216.9989      91.4334                    30593431.4954
Standard Deviation    773.7227            14.7309       9.5621                     5531.1329
Coeff. of Variation   35.99%              18.18%        35.67%                     12.78%
Skewness              0.0815              0.1825        0.0709                     0.2755
Kurtosis              -0.6432             -0.7822       -0.6627                    0.2322
Count                 30                  30            30                         30
Standard Error        141.2618            2.6895        1.7458                     1009.8421

                      Mean Capacity Percentage   Team Payroll ($millions)   Team Value ($billions)
Mean                  61.43215526                132.1365634                2.074
Median                61.02860178                135.75455                  1.73
Mode                  #N/A                       #N/A                       #N/A
Minimum               21.14267543                46.011667                  0.99
Maximum               93.90507667                222.205                    6
Range                 72.76240124                176.193333                 5.01
Variance              362.5600                   1921.2932                  1.2997
Standard Deviation    19.0410                    43.8326                    1.1401
Coeff. of Variation   31.00%                     33.17%                     54.97%
Skewness              -0.1768                    0.1613                     1.8739
Kurtosis              -0.8514                    -0.6412                    3.8080
Count                 30                         30                         30
Standard Error        3.4764                     8.0027                     0.2081
Multiple Regression Analysis: Season Attendance vs Wins, Mean Attendance per Game, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

                            Coefficients   Standard Error   t Stat    P-value   VIF
Intercept                   133.9804       188.9378         0.7091    0.4854
Wins                        -1.2155        0.6121           -1.9858   0.0591    2.0950
Mean Attendance per Game    86.1712        6.9257           12.4423   0.0000    113.0619
Stadium Capacity            -0.0021        0.0042           -0.5019   0.6205    14.1711
Mean Capacity Percentage    -1.2117        2.9529           -0.4103   0.6854    81.4715
Team Payroll                0.0185         0.1984           0.0934    0.9264    1.9479
Team Value                  -15.2767       8.1984           -1.8634   0.0752    2.2513
Because the goal of 18.4 is to predict the value of a dependent variable, season attendance, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that mean attendance per game has the largest value and is well above five.

Multiple Regression Analysis: Season Attendance vs Wins, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

                            Coefficients   Standard Error   t Stat     P-value   VIF
Intercept                   -2091.5977     165.6186         -12.6290   0.0000
Wins                        0.6654         1.6144           0.4122     0.6839    1.9672
Stadium Capacity            0.0482         0.0035           13.8867    0.0000    1.2811
Mean Capacity Percentage    34.9331        1.4426           24.2149    0.0000    2.6246
Team Payroll                -0.5543        0.5251           -1.0556    0.3017    1.8430
Team Value                  13.8482        21.3864          0.6475     0.5234    2.0678
A second regression analysis was performed after eliminating the mean attendance per game variable because it had the highest VIF value, which was above 5. The second regression analysis reveals that all five of the remaining independent variables in this model have VIFs lower than 5.
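The screening rule applied here (drop the variable with the largest VIF if it exceeds 5, then refit) can be sketched as a small helper. The function name is hypothetical, and the VIF values are the ones reported in the two regression outputs; in practice the VIFs must be recomputed from the data after each removal.

```python
def variable_to_drop(vifs, threshold=5.0):
    """Return the variable with the largest VIF if it exceeds threshold, else None."""
    name, value = max(vifs.items(), key=lambda item: item[1])
    return name if value > threshold else None


# VIFs reported for the first regression (all six independent variables)
first_pass = {
    "Wins": 2.0950,
    "Mean Attendance per Game": 113.0619,
    "Stadium Capacity": 14.1711,
    "Mean Capacity Percentage": 81.4715,
    "Team Payroll": 1.9479,
    "Team Value": 2.2513,
}
# VIFs reported after removing Mean Attendance per Game and refitting
second_pass = {
    "Wins": 1.9672,
    "Stadium Capacity": 1.2811,
    "Mean Capacity Percentage": 2.6246,
    "Team Payroll": 1.8430,
    "Team Value": 2.0678,
}

print(variable_to_drop(first_pass))
print(variable_to_drop(second_pass))
```

Only one variable is removed per pass because dropping it changes every remaining VIF, as the fall in the stadium capacity and mean capacity percentage values shows.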
Best Subsets Analysis: Season Attendance vs Wins, Stadium Capacity, Mean Capacity Percentage, Team Payroll ($millions), Team Value ($billions)

Model         Cp          k+1   R Square   Adj. R Square   Std. Error
X1            1136.7460   2     0.4416     0.4217          588.4075
X2            1478.7172   2     0.2774     0.2516          669.3653
X3            239.7515    2     0.8724     0.8678          281.3025
X4            1399.3913   2     0.3155     0.2910          651.4826
X5            1167.5012   2     0.4268     0.4064          596.1385
X1X2          836.8057    3     0.5866     0.5560          515.5674
X1X3          233.8780    3     0.8762     0.8670          282.1889
X1X4          789.8166    3     0.6092     0.5802          501.2982
X1X5          814.4472    3     0.5973     0.5675          508.8277
X2X3          1.7550      3     0.9876     0.9867          89.1793
X2X4          1100.0511   3     0.4602     0.4202          589.1500
X2X5          1028.3004   3     0.4946     0.4572          570.0366
X3X4          236.4575    3     0.8749     0.8657          283.5968
X3X5          196.8519    3     0.8939     0.8861          261.1460
X4X5          1064.9480   3     0.4770     0.4383          579.8778
X1X2X3        3.1952      4     0.9879     0.9865          89.8849
X1X2X4        617.5854    4     0.6928     0.6574          452.8740
X1X2X5        692.0896    4     0.6571     0.6175          478.5248
X1X3X4        227.4929    4     0.8802     0.8664          282.8507
X1X3X5        194.8909    4     0.8958     0.8838          263.7235
X1X4X5        703.7411    4     0.6515     0.6113          482.4129
X2X3X4        2.6974      4     0.9881     0.9868          88.9925
X2X3X5        3.6388      4     0.9877     0.9863          90.6728
X2X4X5        934.7652    4     0.5405     0.4875          553.8993
X3X4X5        198.7588    4     0.8940     0.8817          266.0646
X1X2X3X4      4.4193      5     0.9883     0.9864          90.2425
X1X2X3X5      5.1142      5     0.9879     0.9860          91.5176
X1X2X4X5      590.3603    5     0.7069     0.6600          451.1676
X1X3X4X5      196.8400    5     0.8959     0.8792          268.9147
X2X3X4X5      4.1699      5     0.9884     0.9865          89.7805
X1X2X3X4X5    6.0000      6     0.9885     0.9861          91.3092
A Best Subsets analysis reveals a number of models that had Cp values equal to k + 1 or less. The two-variable model of stadium capacity (X2) and mean capacity percentage (X3) had the lowest Cp value (1.7550) and an adjusted r² = 0.9867. Although the three-variable model X2X3X4 had a marginally higher adjusted r² (0.9868), it included an additional variable and had a higher Cp value (2.6974). Based on the principle of parsimony, the two-variable model of Stadium Capacity and Mean Capacity Percentage appears to be the preferred model in this case.
Multiple Regression Analysis: Season Attendance vs Stadium Capacity, Mean Capacity Percentage

Regression Statistics
Multiple R            0.9938
R Square              0.9876
Adjusted R Square     0.9867
Standard Error        89.1793
Observations          30

ANOVA
              df    SS               MS              F
Regression    2     17146029.3009    8573014.6505    1077.9670
Residual      27    214729.5678      7952.9470
Total         29    17360758.8687

                            Coefficients   Standard Error   t Stat     P-value
Intercept                   -2103.3507     133.4018         -15.7670   0.0000
Stadium Capacity            0.0486         0.0031           15.8618    0.0000
Mean Capacity Percentage    35.0141        0.8892           39.3758    0.0000
Predicted Season Attendance = –2,103.3507 + 0.0486 Stadium Capacity + 35.0141 Mean Capacity Percentage
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 1077.967 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that the two independent variables in the model should be included. The adjusted r² of this model is 0.9867. The r² of 0.9876 indicates that 98.76% of the variation in season attendance can be explained by the variation in stadium capacity and mean capacity percentage.
The residual plots reveal no clear pattern in the residuals. However, the normal probability plot suggests that there is potential evidence of deviation from the normality assumption. One should consider models that use appropriate transformations.
18.5
Because the goal of 18.5 is to predict the value of a dependent variable, price of used cars, one would use multiple regression. As a first step in the model building process, a review of the VIFs reveals that all variables have a VIF of less than 5.
A Best Subsets analysis reveals that all but one of the models have Cp values greater than k + 1, where k represents the number of independent variables. The model including all three independent variables has a Cp value equal to k + 1. This model, which also has the highest adjusted r², represents the preferred model for further analyses.
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 29.52 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all three of the independent variables should be included. The adjusted r² of this model is 0.3428. The r² of 0.3549 indicates that 35.49% of the variation in used car price can be explained by the variation in age, mileage, and fuel mileage.
The residual plots reveal little to no pattern. However, the plots reveal several outliers. The normal probability plot suggests that there may be evidence of potential deviation from the normality assumption.
18.6
The focus of the question in this problem was on whether there is evidence of gender bias in relationship to the evaluation of candidates. A multiple regression analysis was chosen in this case to determine whether the rater to candidate gender relationship is significantly predictive of recommended salary relative to other potential predictors. Let Y = Salary, X1 = Competence Rating, X2 = M to M (1 if Rater = M and Candidate = M, 0 otherwise), X3 = F to M (1 if Rater = F and Candidate = M, 0 otherwise), X4 = M to F (1 if Rater = M and Candidate = F, 0 otherwise), X5 = Public (1 if public, 0 otherwise), X6 = Biology (1 if Biology, 0 otherwise), X7 = Chemistry (1 if Chemistry, 0 otherwise), X8 = Age-Rater. Dummy variables were created for X2 through X7. For each of the factors, rater to candidate gender and department, the total number of categories minus one represented the number of dummy variables needed.
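The dummy coding for the rater to candidate gender relationship can be sketched as below. The function name and the "M"/"F" input codes are assumptions for illustration; the coding itself follows the definitions of X2, X3, and X4 given above, with a female rater evaluating a female candidate as the baseline category (all three dummies equal to 0).

```python
def gender_pair_dummies(rater, candidate):
    """Return (M to M, F to M, M to F) dummy variables; F to F is the baseline."""
    m_to_m = int(rater == "M" and candidate == "M")
    f_to_m = int(rater == "F" and candidate == "M")
    m_to_f = int(rater == "M" and candidate == "F")
    return m_to_m, f_to_m, m_to_f


# Four rater/candidate combinations require only three dummy variables
for rater in ("M", "F"):
    for candidate in ("M", "F"):
        print(rater, "to", candidate, gender_pair_dummies(rater, candidate))
```

Each coefficient on these dummies is then interpreted relative to the baseline category, which is why a factor with c categories needs only c - 1 dummy variables.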
A Best Subsets analysis reveals that several models have Cp values greater than k + 1, where k represents the number of independent variables. A number of models have Cp values equal to or less than k + 1. The model including competence rating, the male to male rater to candidate relationship variable, the female to male rater to candidate relationship variable, and the male to female rater to candidate relationship variable had the lowest Cp value and the highest adjusted r². This model represents a reasonable option for further analyses.
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 196.68 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level for all but the male to female rater to candidate relationship variable. For this variable, one would not reject H0 because the p-value = 0.069. This variable should be excluded from the model.
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 255.88 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all three of the independent variables should be included. The adjusted r² of this model is 0.8653. The r² of 0.8687 indicates that 86.87% of the variation in salary recommendation can be explained by the variation in competence rating, the male to male rater to candidate relationship, and the female to male rater to candidate relationship. The best model appears to be the following:
Predicted Salary = 17.314 + 2.982 Competence Rating + 1.690 M to M + 2.127 F to M
The above model was chosen based on the Best Subsets Approach. However, the same model would have been chosen by running a comprehensive regression analysis on all eight predictor variables and then choosing the coefficients that were significant at the 0.05 significance level. The Best Subsets Approach was chosen to illustrate the many different possible models that could be considered relative to the Cp value and the adjusted r².
The residual plots suggest a possible violation of the equal variance assumption. The conclusions from the regression results may not be reliable because of this possible violation. The normal probability plot suggests that the residuals are normally distributed except for three outliers in the right tail.
After removing the three outliers, the normal probability plot suggests that the residuals are normally distributed.
The best regression model suggests that after taking into consideration several factors, the mean salary of a female candidate is estimated to be the same regardless of the gender of the rater. This conclusion is based on the finding that the gender of the rater was not predictive of the recommended salary for female candidates. In contrast, the mean salary of a male candidate is estimated to be $1.690 thousand higher than his female counterpart when rated by a male and estimated to be $2.127 thousand higher than his female counterpart when rated by a female. The mean salary recommendation for female candidates was $27.60 thousand compared to $31.20 thousand for male candidates. Based on a two-sample t test, the difference is significant at the 0.05 significance level. The regression analyses go beyond understanding whether a difference exists in mean recommended salaries. These analyses directly address the question of whether there is gender bias in the evaluations. The above regression model indicates that there appears to be gender bias, with both male and female raters recommending higher salaries for male candidates compared to female candidates.
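The estimated salary gaps quoted above follow directly from the dummy-variable coefficients. The sketch below uses the reported coefficients (17.314, 2.982, 1.690, and 2.127, in $thousands) and assumes all terms enter additively; the positive sign on competence rating is an assumption, since the printed equation does not show its operators. The function name and the competence rating of 4 are hypothetical.

```python
def predicted_salary(competence_rating, rater, candidate):
    """Predicted recommended salary ($thousands); additive signs are assumed."""
    m_to_m = int(rater == "M" and candidate == "M")
    f_to_m = int(rater == "F" and candidate == "M")
    return 17.314 + 2.982 * competence_rating + 1.690 * m_to_m + 2.127 * f_to_m


# Gap between male and female candidates at the same (hypothetical) competence rating
rating = 4
gap_male_rater = predicted_salary(rating, "M", "M") - predicted_salary(rating, "M", "F")
gap_female_rater = predicted_salary(rating, "F", "M") - predicted_salary(rating, "F", "F")
print(f"Male rater gap: {gap_male_rater:.3f} thousand")
print(f"Female rater gap: {gap_female_rater:.3f} thousand")
```

The gaps do not depend on the competence rating chosen, because the model contains no interaction between competence and the gender relationship variables.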
18.7
The first part of the problem focuses on analyzing the data based on differences in cost among the various types of cuisines. Descriptive statistics followed by a One-Way ANOVA were performed to address this part of the problem. The descriptive statistics associated with cost for each of the different types of restaurants reveals that French restaurants have the highest mean cost and Mexican restaurants have the lowest mean cost. Although there appears to be some skewness for some of the restaurant categories, the normality assumption is difficult to assess given the relatively small sample sizes. The Levene test for difference in variances did not reveal any evidence of a significant violation of the equality of variance assumption across the cost of different types of cuisines.
A one-way ANOVA F test for the equality of the means reveals that there is sufficient evidence of differences in the cost of a meal for the different types of cuisines at the 5% level of significance. Because FSTAT = 14.21 or p-value = 0.000, reject H0.
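The one-way ANOVA F statistic compares the variation among group means with the variation within groups. The sketch below shows the computation on small hypothetical groups; it does not use the restaurant data, and the function name is an assumption.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, df among groups, df within groups)."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    c, n = len(groups), len(all_values)
    # Sum of squares among groups (SSA) and within groups (SSW)
    ssa = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - mean(g)) ** 2 for g in groups for x in g)
    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return msa / msw, c - 1, n - c


# Hypothetical meal costs for three cuisine types (not the data from the problem)
f_stat, df_among, df_within = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F = {f_stat:.2f} with {df_among} and {df_within} degrees of freedom")
```

A large F, such as the FSTAT = 14.21 reported for the cuisine data, indicates that the differences among group means are large relative to the variation within groups.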
The Tukey multiple comparisons show the pairwise differences among the seven different types of cuisines. In the higher cost category, the French, Japanese, and Italian restaurants are not significantly different from each other at the 5% level of significance. French cuisine is significantly higher in cost compared to American, Chinese, Indian, and Mexican cuisines. The Japanese and Italian cuisines are higher in cost compared to Chinese, Indian, and Mexican cuisines. In the lower cost category, the Chinese, Indian, and Mexican cuisines are not significantly different from each other at the 5% level of significance.
The second part of the problem focuses on developing a regression model to predict the cost of a meal based on the variables included in the dataset. This procedure will require dummy variables for all but one of the cuisine categories. The following variables represent potential predictors in the model: X1 = food rating, X2 = décor rating, X3 = service rating, X4 = popularity index, X5 = 1 if American and 0 otherwise, X6 = 1 if Chinese and 0 otherwise, X7 = 1 if French and 0 otherwise, X8 = 1 if Indian and 0 otherwise, X9 = 1 if Italian and 0 otherwise, and X10 = 1 if Japanese and 0 otherwise.
Among all of the variables, service rating had the highest VIF above 5.
After dropping service rating from the model, food rating was the only variable with a VIF value above 5.
After removing the service rating and food rating variables, none of the remaining variables had VIF values above 5.
A Best Subsets analysis reveals that several models have Cp values greater than k + 1, where k represents the number of independent variables. A number of models have Cp values equal to or less than k + 1. The six-variable model including décor, American cuisine, Chinese cuisine, French cuisine, Italian cuisine, and Japanese cuisine had the lowest Cp value and the highest adjusted r². This model represents a reasonable option for further analyses.
This model included one variable, Chinese cuisine, that was not significant at the 0.05 significance level.
After removing the Chinese cuisine variable, all remaining variables are significant at the 0.05 significance level.
The F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 28.97 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all five of the independent variables should be included. The adjusted r² of this model is 0.6860. The r² of 0.7105 indicates that 71.05% of the variation in the cost of a meal can be explained by the variation in décor rating and the American, French, Italian, and Japanese cuisine variables. The most appropriate model for predicting the cost of a meal is:
Predicted Cost = 8.1 + 2.924 Décor + 18.65 American + 38.55 French + 22.96 Italian + 28.11 Japanese
The various residual plots against the independent variables suggest a possible violation of the equality of variance assumption. The normal probability plot reveals a possible departure from normality. The five-factor model including décor, American, French, Italian, and Japanese cuisines represents the best model for predicting the cost of a meal. This model is consistent with the ANOVA post hoc comparisons, which showed significant differences between the higher cost restaurants and the lower cost restaurants.
18.8
Please note that for Problem 18.8: "male" represents "M" or "men" and "female" represents "W" or "women" in "Gender"
Descriptive Statistics for Bank Churn Study
A descriptive analysis of all variables is provided above. The problem focuses on determining the likelihood that a customer will leave the bank. In this case, 260 out of 1,000 customers had left the bank. The next step is to develop a model that predicts the likelihood that a customer would leave the bank.
Because the dependent variable in this case is categorical (whether a customer has left the bank), logistic regression is the appropriate regression procedure. The above coefficients table was created from an initial analysis of all potential predictor variables. It was necessary to create dummy variables for gender and domicile location. A series of regression analyses was performed by removing variables that did not significantly predict whether a customer left the bank.
The above model predicts the likelihood a customer will leave the bank based on the following four variables: gender, age, whether a customer is active, and whether a customer lives in Germany. All other variables were not significant at the 0.05 significance level. Holding constant the effects of age, active membership, and whether one lives in Germany, ln(odds) decreases by 0.515 if the customer is male. Holding constant gender, whether a customer is active, and whether a customer lives in Germany, ln(odds) increases by 0.07009 for each additional year of age. Holding constant the effects of gender, age, and whether a customer lives in Germany, ln(odds) decreases by 0.843 if a customer is an active member. Holding constant the effects of gender, age, and active membership, ln(odds) increases by 0.572 if a customer lives in Germany. The deviance statistic of 144.72 is well below the critical value of $\chi^2$. In this case, one would not reject H0. There is insufficient evidence that the model is not a good-fitting model. However, the results should be interpreted with caution given the large number of degrees of freedom. Using the above four-variable model, one could predict the likelihood that a customer would leave the bank based on specific values for each of the four variables. For example, suppose one wanted to determine the estimated probability of leaving the bank for customers who were female, 50 years of age, non-active bank members, and residents of Germany.
For this example, the estimated probability = 0.667854. The model would predict that 66.8% of customers who are women (W), age 50, non-active bank members, and residents of Germany would exit the bank. In contrast to the above example, suppose one wanted to determine the estimated probability of leaving the bank for customers who were men (M), 25 years of age, active bank members, and not residents of Germany.
For this example, the estimated probability = 0.0481653. The model would predict that 4.8% of customers who are men (M), age 25, active bank members, and not living in Germany would exit the bank. Results from other statistical tests of the data are consistent with the above logistic regression analysis. For example, chi-square tests for association revealed the following: a significantly higher percentage of women (W) had exited the bank, a significantly higher percentage of non-active bank members exited the bank, and a significantly higher percentage of customers from Germany exited the bank. A two-sample t test revealed that the mean age of customers was significantly higher among individuals who had exited the bank. Examining the data in various ways can be helpful in assessing the interpretation of the logistic regression model above.
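The two worked examples can be cross-checked against the reported coefficients without knowing the intercept, which the text does not state. The sketch below is only an illustrative consistency check (the function names are assumptions, not part of the solution): the difference in ln(odds) between the two customer profiles depends only on the four slopes, and it should match the difference between the logits of the two reported values.

```python
import math

# Slope coefficients as reported in the solution (change in ln(odds)).
B_MALE, B_AGE, B_ACTIVE, B_GERMANY = -0.515, 0.07009, -0.843, 0.572

def partial_log_odds(male, age, active, germany):
    """ln(odds) without the intercept, which is not given in the text."""
    return B_MALE * male + B_AGE * age + B_ACTIVE * active + B_GERMANY * germany

def logit(p):
    """Convert a probability to ln(odds)."""
    return math.log(p / (1.0 - p))

# Profile 1: female, age 50, non-active, Germany.
# Profile 2: male, age 25, active, not Germany.
diff_from_coefficients = partial_log_odds(0, 50, 0, 1) - partial_log_odds(1, 25, 1, 0)
diff_from_probabilities = logit(0.667854) - logit(0.0481653)

print(round(diff_from_coefficients, 4), round(diff_from_probabilities, 4))
```

The two differences agree closely, which supports reading 0.667854 and 0.0481653 as estimated probabilities.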
18.9
A descriptive analysis of the various variables revealed considerable right skewness for a number of potential predictor variables. This should be taken into consideration when interpreting the results from subsequent regression analyses. The first part of the problem focuses on presenting conclusions in regard to the relationship between the amount stacked and the various potential types of downtime causes. Two approaches to model building were used to identify an appropriate multiple regression model to understand the variation in amount stacked in relation to the various downtime factors.
For the above analysis, the following variables were utilized: Y = Amount Stacked, X1 = Mechanical, X2 = Electrical, X3 = Tonnage Restriction, X4 = Operator, and X5 = No Feed. None of the independent variables has a VIF > 5.0. Hence, there is no concern about collinearity among the independent variables. Because the t test for the significance of each individual independent variable reveals that tonnage restriction has a tSTAT = 0.01 with a p-value = 0.992, do not reject H0. There is not enough evidence that tonnage restriction is significant at the 5% level, and it should be removed.
After removing the tonnage restriction variable, the F test of the overall model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 76.08 or p-value = 0.000, reject H0. Individual t tests on the independent variables are significant at the 0.05 level, which suggests that all 4 of the independent variables should be included. The adjusted r2 of this model is 0.8983. The r2 of 0.9103 indicates that 91.03% of the variation in amount stacked can be explained by the variation in the mechanical, electrical, operator, and no feed variables. The same model is identified as the preferred model for predicting amount stacked using the Best Subsets approach.
The Best Subsets approach also revealed the four variable model consisting of the mechanical, electrical, operator, and no feed variables. This model had the lowest Cp value and the highest adjusted r2. Both model building approaches led to the same 4 variable model.
The residual plots revealed no clear pattern, suggesting that there is insufficient evidence for violations in equal variance and linear assumptions. The normal probability plot revealed no evidence for a violation of the normality assumption. The second part of the problem focuses on developing a model to predict amount stacked based on total downtime.
As expected, the daily amount stacked and downtime appear to be negatively related, with a correlation coefficient r = -0.9410 and a p-value of 0.000. There is strong evidence that the daily amount stacked and downtime are negatively related.
The F test of the model reveals that the model is significant at the 0.05 significance level. Because FSTAT = 255.15 or p-value = 0.000, reject H0. For downtime, tSTAT = -15.97 with a p-value of 0.000. In this case, one would reject H0. The r2 of 0.8855 indicates that 88.55% of the variation in amount stacked can be explained by the variation in downtime.
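In a simple linear regression with one predictor, the F statistic equals the square of the slope's t statistic, and the correlation coefficient is the square root of r2 carrying the sign of the slope. The short sketch below (variable names are illustrative) checks the reported figures against those two identities; small gaps are expected because the printed values are rounded.

```python
import math

t_stat = -15.97     # tSTAT for downtime, as reported
f_stat = 255.15     # FSTAT for the model, as reported
r_squared = 0.8855  # r2, as reported

# With a single predictor, F = t^2 (up to rounding of the printed values).
f_from_t = t_stat ** 2

# r takes the sign of the slope, which is negative here.
r = math.copysign(math.sqrt(r_squared), t_stat)

print(round(f_from_t, 2), round(r, 4))
```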
To predict daily amount stacked, the model to use is:
$\hat{Y} = 36760 - 28.24X$, where X is the downtime.
18.10
Please note that for Problem 18.10: "male" represents "M" or "man" and "female" represents "W" or "woman" in "Gender".
The above descriptive analysis represents a starting point for understanding the data associated with customers of Wally’s Discount Stores. One could conduct detailed descriptive analyses to examine the characteristics and demographics of these customers. Although the problem did not provide a specific direction for an analysis, it would be reasonable to assume that the owners may want to understand the factors that are predictive of the amount of its private-label, ShowGo, purchases. The below analysis will identify the preferred model for predicting ShowGo purchases among customers of Wally’s Discount Stores.
A multiple regression analysis including all potential predictor variables revealed that all variables had VIF values well below five. However, several predictor variables were not significant at the 0.05 significance level.
After removing all non-significant predictor variables, a two-variable model was identified that consisted of whether one owned a Wally’s Card and age of customer. Although both variables had tSTAT values that were significant, the overall model was very weak in predicting ShowGo purchases. The r2 of 0.0299 indicates that only 2.99% of the variation in ShowGo purchases can be explained by the variation in whether one owns a Wally’s Card and age of customer. Although the model had very little predictive value, the findings do suggest that the owners of Wally’s Discount Stores may want to consider collecting other data if they desire to identify predictors of ShowGo label purchases. This finding is a reminder that it can be helpful to understand which variables are not predictive as well as which variables are predictive. As an example, scatterplots of the numeric variables healthy eating rating and active lifestyle rating clearly show there is no relationship between these variables and ShowGo purchases.
Other variables would need to be identified to develop an appropriate model for predicting ShowGo purchases among customers of Wally’s Discount Stores.
18.11
Because the problem focuses on predicting the number of domestic and imported hybrid vehicles sold in 2019 and 2020, it is necessary to build an appropriate regression model. The time series plot reveals an overall upward trend with cyclical components in the number of domestic and imported hybrids sold in the United States from 1999 to 2018. Linear Model:
A linear trend model with coded year as the independent variable reveals an upward linear trend with an r2 of 0.7521, which indicates that 75.21% of the variation in hybrid sales can be explained by the linear trend of the time series. Because tSTAT = 7.39 or p-value = 0.000, reject H0. Quadratic Model:
Because tSTAT = -3.61 or p-value = 0.002, reject H0. The quadratic term adds significantly to the prediction of hybrid sales. The r2 of 0.8597 indicates that 85.97% of the variation in hybrid sales can be explained by the linear and quadratic trends of the time series.
Exponential Model:
Because tSTAT = 4.10 or p-value = 0.001, reject H0. The exponential model is useful in predicting the sales of hybrids. However, its r2 value of 0.4826 is much lower than the r2 associated with the linear and quadratic models.
Autoregressive Third-Order Model:
Because tSTAT = 0.33 or p-value = 0.745, do not reject H0. The third-order term can be removed from the model.
Autoregressive Second-Order Model:
Because tSTAT = -1.13 or p-value = 0.275, do not reject H0. The second-order term can be removed from the model.
Autoregressive First-Order Model:
Because tSTAT = 10.10 or p-value = 0.000, reject H0. The first-order autoregressive model is appropriate. The adjusted r2 of 0.8487 for this model was higher than all other models. Linear
Quadratic
Exponential
First-Order Autoregressive
The residual plots for the linear, quadratic, and exponential models show clear patterns. The first-order autoregressive model residual plot shows no clear pattern, which indicates the model fits adequately. Based on regression results and the residual plots, the first-order autoregressive model represents the best model to predict hybrid sales. The first-order autoregressive model also had the smallest values for the standard error of the estimate and MAD. Utilizing the first-order autoregressive model, the predicted number of domestic and imported hybrid vehicles sold in the U.S. in 2019 and 2020 would be as follows:
$\hat{Y}_{2019} = 49388 + 0.8676(323912) = 370981$
$\hat{Y}_{2020} = 49388 + 0.8676(370981) = 371238$
Chapter 19 (Online)
19.1
(a) Proportion of nonconformances largest on Day 5, smallest on Day 3.
[p chart: Proportion versus Day]
(b) n = 100, $\bar{p}$ = 1.48/10 = 0.148,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.148 - 3\sqrt{\frac{0.148(1-0.148)}{100}} = 0.04147$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.148 + 3\sqrt{\frac{0.148(1-0.148)}{100}} = 0.25453$
(c) Proportions are within control limits, so there do not appear to be any special causes of variation.
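The p chart limits used in Problem 19.1 and the problems that follow can be sketched in a few lines of Python. This is an illustrative helper, not PHStat output; the function name and the convention of returning None when the lower limit is not positive are assumptions made for the example.

```python
import math

def p_chart_limits(p_bar, n):
    """Return (LCL, UCL) for a p chart; LCL is None when it would not be positive."""
    half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
    lcl = p_bar - half_width
    ucl = p_bar + half_width
    return (lcl if lcl > 0 else None), ucl

# Values from Problem 19.1: n = 100, average proportion 0.148.
lcl, ucl = p_chart_limits(0.148, 100)
print(lcl, ucl)
```

For samples of unequal size, the same function can be called with the average sample size, as the solutions do in Problems 19.2 and 19.6.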
19.2
(a) Proportion of nonconformances largest on Day 4, smallest on Day 3.
[p chart: Proportion versus Day]
(b) $\bar{n}$ = 1036/10 = 103.6, $\bar{p}$ = 148/1036 = 0.142857,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.142857 - 3\sqrt{\frac{0.142857(1-0.142857)}{103.6}} = 0.039719$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.142857 + 3\sqrt{\frac{0.142857(1-0.142857)}{103.6}} = 0.245995$
(c) Proportions are within control limits, so there do not appear to be any special causes of variation.
19.3
(a) n = 125, $\bar{p}$ = 146/3875 = 0.0377,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0377 - 3\sqrt{\frac{0.0377(1-0.0377)}{125}} = -0.0134 < 0$, so the lower control limit does not exist.
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0377 + 3\sqrt{\frac{0.0377(1-0.0377)}{125}} = 0.0888$
(b) The proportion of transmissions with errors on Day 23 is substantially out of control. Possible causes of this value should be investigated.
19.4
(a) n = 500, $\bar{p}$ = 761/16000 = 0.0476
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0476 - 3\sqrt{\frac{0.0476(1-0.0476)}{500}} = 0.0190 > 0$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{n}} = 0.0476 + 3\sqrt{\frac{0.0476(1-0.0476)}{500}} = 0.0761$
(b) Since the individual points are distributed around $\bar{p}$ without any pattern and all the points are within the control limits, the process is in a state of statistical control.
19.5
(a) $\bar{n}$ = 102.5667, $\bar{p}$ = 0.308742, LCL = 0.171895, UCL = 0.44559
PHStat output:
(b) Yes, the process gives an out-of-control signal because the proportions fall outside of the control limits on four of the 30 days.
(c) $\bar{n}$ = 103.1923, $\bar{p}$ = 0.297428, LCL = 0.162428, UCL = 0.432428
19.6
(a) $\bar{n}$ = 113345/22 = 5152.0455, $\bar{p}$ = 1460/113345 = 0.01288,
$LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.01288 - 3\sqrt{\frac{0.01288(1-0.01288)}{5152.0455}} = 0.00817$
$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{\bar{n}}} = 0.01288 + 3\sqrt{\frac{0.01288(1-0.01288)}{5152.0455}} = 0.01759$
PHStat output:
The proportion of unacceptable cans is below the LCL on Day 4. There is evidence of a pattern over time, since the last eight points are all above the mean and most of the earlier points are below the mean. Thus, the special causes that might be contributing to this pattern should be investigated before any change in the system of operation is contemplated.
(b) Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system. They might also look at Day 4 to see if they could identify and exploit the special cause that led to such a low proportion of defects on that day.
19.7
(a) $\bar{p}$ = 0.042, UCL = 0.085, LCL does not exist. Points 9, 16, 20, 31, and 36 are above the UCL.
(b) First, the reasons for the special cause variation would need to be determined and local corrective action taken. Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system.
19.8
(a) $\bar{p}$ = 0.1091, LCL = 0.0751, UCL = 0.1431. Points 9, 26, and 30 are above the UCL.
(b) First, the reasons for the special cause variation would need to be determined and local corrective action taken. Once special causes have been eliminated and the process is stable, Deming’s fourteen points should be implemented to improve the system.
19.11
(a) $\bar{c}$ = 38/10 = 3.8,
$LCL = \bar{c} - 3\sqrt{\bar{c}} = 3.8 - 3\sqrt{3.8} < 0$, LCL does not exist.
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 3.8 + 3\sqrt{3.8} = 9.648077$
[c chart: Nonconformances versus Time]
(b) There do not appear to be special causes of variation, as there are no points outside the control limits and no discernible pattern.
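The c chart limits used in Problems 19.11 through 19.16 follow the same pattern each time, so a small helper makes the arithmetic easy to reproduce. The sketch below is illustrative only; the function name and the use of None for a nonexistent lower limit are assumptions for the example.

```python
import math

def c_chart_limits(c_bar):
    """Return (LCL, UCL) for a c chart; LCL is None when it would not be positive."""
    half_width = 3 * math.sqrt(c_bar)
    lcl = c_bar - half_width
    ucl = c_bar + half_width
    return (lcl if lcl > 0 else None), ucl

# Problem 19.11: average of 3.8 nonconformances, so the LCL does not exist.
print(c_chart_limits(38 / 10))

# Problem 19.12: average of 11.5 nonconformances, so both limits exist.
print(c_chart_limits(115 / 10))
```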
19.12
(a) $\bar{c}$ = 115/10 = 11.5,
$LCL = \bar{c} - 3\sqrt{\bar{c}} = 11.5 - 3\sqrt{11.5} = 1.32651$
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 11.5 + 3\sqrt{11.5} = 21.67349$
[c chart: Nonconformances versus Time]
(b) Yes, the number of nonconformances per unit for Time Period 1 is above the upper control limit.
19.13
(a) $\bar{c}$ = 155/24 = 6.458,
$LCL = \bar{c} - 3\sqrt{\bar{c}} = 6.458 - 3\sqrt{6.458} < 0$, LCL does not exist.
$UCL = \bar{c} + 3\sqrt{\bar{c}} = 6.458 + 3\sqrt{6.458} = 14.082$
[c chart: Nonconformances versus Day]
The process appears to be in control since there are no points outside the upper control limit and there is no pattern in the results over time.
(b) The value of 12 is within the control limits, so that it should be identified as a source of common cause variation. Thus, no action should be taken concerning this value.
(c) If the value were 20 instead of 12, $\bar{c}$ would be 6.792 and UCL would be 14.61. In this situation, a value of 20 would be substantially above the UCL and action should be taken to explain the special cause of variation. The process needs to be studied and potentially changed using principles of Six Sigma® management and/or Deming’s 14 points for management.
19.14
(a) The twelve errors committed by Gina appear to be much higher than all others, and Gina would need to explain her performance.
(b) $\bar{c}$ = 5.5, UCL = 12.56, LCL does not exist. The number of errors is in a state of statistical control since none of the tellers are outside the UCL.
(c) Since Gina is within the control limits, she is operating within the system, and should not be singled out for further scrutiny.
(d) The process needs to be studied and potentially changed using principles of Six Sigma® management and/or Deming’s 14 points for management.
19.15
(a) $\bar{c}$ = 5.07, UCL = 11.83, LCL does not exist. There are no points above the UCL. However, the first nine points are all below the center line.
(b) Data collection would be delayed until the startup period for the unit had passed.
(c) For example, the severity of the illness or the age of the patients in the unit from month to month could be contributing factors.
19.16
(a), (b) $\bar{c}$ = 3.057
(c) There is evidence of a pattern over time, since the first eight points are all below the mean. Thus, the special causes that might be contributing to this pattern should be investigated before any change in the system of operation is contemplated.
(d) Even though weeks 15 and 41 experienced seven fire runs each, they are both below the upper control limit. They can, therefore, be explained by chance causes.
(e) After having identified the special causes that might have contributed to the first eight points that are below the average, the fire department can use the c chart to monitor the process in future weeks in real time and identify any potential special causes of variation that might have arisen and could be attributed to increased arson, severe drought, or holiday-related activities.
19.17
(a)
(b) (c)
The number of unsafe acts observed during the ninth tour is above the upper control limit. The process is out of control. The special causes that might have contributed to the extraordinary number of unsafe acts during the ninth tour should be investigated and corrected to improve the process.
19.18
(a) d2 = 2.059
(b) d3 = 0.88
(c) D3 = 0
(d) D4 = 2.282
(e) A2 = 0.729
19.19
(a) d2 = 1.693
(b) d3 = 0.888
(c) D3 = 0
(d) D4 = 2.575
(e) A2 = 1.023
19.20
(a) $\bar{R}$ = 0.247, R chart: UCL = 0.636; LCL does not exist
(b) According to the R chart, the process appears to be in control with all points lying inside the control limits without any pattern and no evidence of special cause variation.
(c) $\bar{\bar{X}}$ = 47.998, $\bar{X}$ chart: UCL = 48.2507; LCL = 47.7453
(d) According to the $\bar{X}$ chart, the process appears to be in control with all points lying inside the control limits without any pattern and no evidence of special cause variation.
19.21
(a) $\bar{R}$ = 3.97
LCL = $D_3\bar{R}$ = 0(3.97) = 0. LCL does not exist.
UCL = $D_4\bar{R}$ = (2.282)(3.97) = 9.05954
(b) There are no sample ranges outside the control limits and there does not appear to be a pattern.
(c) $\bar{\bar{X}}$ = 13.95
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 13.95 – (0.729)(3.97) = 11.05587
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 13.95 + (0.729)(3.97) = 16.84413
(d) The sample mean on Day 7 is above the UCL, which indicates that there is evidence of special cause variation.
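The R chart and $\bar{X}$ chart limits in Problem 19.21 can be reproduced with a short helper that takes the control chart factors as arguments. This is a sketch under the assumption that the factors D3 = 0, D4 = 2.282, and A2 = 0.729 quoted in the solutions apply; the function name is illustrative.

```python
def r_and_xbar_limits(r_bar, x_bar_bar, d3, d4, a2):
    """Return ((R LCL, R UCL), (XBar LCL, XBar UCL)).

    The R chart LCL is None when D3 * RBar is zero, i.e. it does not exist.
    """
    r_lcl = d3 * r_bar
    r_ucl = d4 * r_bar
    x_lcl = x_bar_bar - a2 * r_bar
    x_ucl = x_bar_bar + a2 * r_bar
    return ((r_lcl if r_lcl > 0 else None), r_ucl), (x_lcl, x_ucl)

# Problem 19.21: RBar = 3.97, grand mean = 13.95, factors as quoted in the solution.
r_limits, x_limits = r_and_xbar_limits(3.97, 13.95, d3=0, d4=2.282, a2=0.729)
print(r_limits, x_limits)
```

Keeping the factors as parameters lets the same function serve subgroups of other sizes, for example D4 = 2.114 and A2 = 0.577 as used in Problem 19.23.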
19.22
(a) $\bar{R} = \frac{\sum_{i=1}^{k} R_i}{k} = 3.275$, $\bar{\bar{X}} = \frac{\sum_{i=1}^{k} \bar{X}_i}{k} = 5.9413$.
R chart: UCL = $D_4\bar{R}$ = 2.282(3.275) = 7.4736. LCL does not exist.
$\bar{X}$ chart: UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 5.9413 + 0.729(3.275) = 8.3287
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 5.9413 – 0.729(3.275) = 3.5538
PHStat R Chart output:
PHStat X Chart output:
(b) The process appears to be in control since there are no points outside the control limits and there is no evidence of a pattern in the range chart, and there are no points outside the control limits and there is no evidence of a pattern in the $\bar{X}$ chart.
19.23
(a)
$\bar{R}$ = 271.57, UCL = $D_4\bar{R}$ = (2.114)(271.57) = 574.09
LCL = $D_3\bar{R}$ = 0(271.57) = 0. LCL does not exist.
$\bar{\bar{X}}$ = 198.67
LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 198.67 – (0.577)(271.57) = 41.97
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 198.67 + (0.577)(271.57) = 355.36
(b) The process appears to be in control since there are no points outside the control limits and no evidence of a pattern in the range chart, and there are no points outside the control limits and no evidence of a pattern in the $\bar{X}$ chart.
19.24
(a)
$\bar{R}$ = 0.8794, R chart: UCL = 2.0068; LCL does not exist
[R chart with UCL, RBar, and LCL lines]
$\bar{\bar{X}}$ = 20.1065, $\bar{X}$ chart: UCL = 20.7476; LCL = 19.4654
[XBar chart with UCL, XBar, and LCL lines]
(c) The process appears to be in control since there are no points outside the lower and upper control limits of either the R chart or the $\bar{X}$ chart, and there is no pattern in the results over time.
19.25
(a)
(b)
(c) According to both charts, the process appears to be in control with all points lying inside the control limits and no evidence of any pattern.
19.26
(a)
$\bar{R}$ = 8.145, $\bar{\bar{X}}$ = 18.12.
R chart: LCL = $D_3\bar{R}$ = 0(8.145) = 0. LCL does not exist. UCL = $D_4\bar{R}$ = (2.282)(8.145) = 18.58689.
$\bar{X}$ chart: LCL = $\bar{\bar{X}} - A_2\bar{R}$ = 18.12 – (0.729)(8.145) = 12.1823
UCL = $\bar{\bar{X}} + A_2\bar{R}$ = 18.12 + (0.729)(8.145) = 24.0577
(b) There are no sample ranges outside the control limits and there does not appear to be a pattern in the range chart. The sample mean on Day 15 is above the UCL and the sample mean on Day 16 is below the LCL, which indicates that there is evidence of special cause variation in the sample means.
19.27
(a) Some possible sources of common cause variation are fluctuations in temperature and humidity and geographical disturbances of the environment in which the machines operate. A machine that operates in an earthquake zone could experience chance variation during even undetectable quakes.
(b) If a machine operates in a location near a subway line, the vibration from the transit trains can be a systematic, assignable cause of variation.
(c) [R chart with UCL, RBar, and LCL lines]
(d) The process appears to be in control since none of the points fall outside the control limits and there is no evidence of any pattern.
19.28
(a)
$\bar{R}$ = 0.3022, R chart: UCL = 0.6389; LCL does not exist
$\bar{\bar{X}}$ = 90.1317, $\bar{X}$ chart: UCL = 90.3060; LCL = 89.9573
[R chart and XBar chart, each with UCL, center line, and LCL]
(b) The R chart is out of control because the 5th and 6th data points fall above the upper control limit. There is also a downward trend in the right tail of the R chart, which signifies that special causes of variation must be identified and corrected. Even though the $\bar{X}$ chart also appears to be out of control because a majority of the data points fall above or below the control limits, any interpretation will be misleading because the R chart has indicated the presence of out-of-control conditions. There is also a downward trend in both control charts. Special causes of variation should be investigated and eliminated.
19.29
(a) Estimate of the population mean of all X values = $\bar{\bar{X}}$ = 20
(b) Estimate of the population standard deviation of all X values = $\bar{R}/d_2$ = 2/2.059 = 0.9713
19.30
(a) Estimate of the population mean = $\bar{\bar{X}}$ = 100
Estimate of the population standard deviation = $\bar{R}/d_2$ = 3.386/1.693 = 2
(b) $P(98 < X < 102) = P\left(\frac{98-100}{2} < Z < \frac{102-100}{2}\right) = 0.6827$
(c) $P(93 < X < 107.5) = P\left(\frac{93-100}{2} < Z < \frac{107.5-100}{2}\right) = 0.9997$
(d) $P(X > 93.8) = P\left(Z > \frac{93.8-100}{2}\right) = 0.9990$
$P(X < 110) = P\left(Z < \frac{110-100}{2}\right) \approx 1$
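The normal probabilities in Problem 19.30 can be checked with the standard library alone. The sketch below is illustrative: the helper names are assumptions, and the standard deviation is estimated as RBar/d2 = 3.386/1.693, as in the solution.

```python
import math

def normal_cdf(z):
    """Cumulative probability of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mean = 100.0
sigma = 3.386 / 1.693  # RBar / d2, the estimated process standard deviation

def prob_between(low, high):
    """P(low < X < high) for a normal X with the mean and sigma above."""
    return normal_cdf((high - mean) / sigma) - normal_cdf((low - mean) / sigma)

def prob_above(value):
    """P(X > value) for the same normal X."""
    return 1.0 - normal_cdf((value - mean) / sigma)

print(round(prob_between(98, 102), 4))
print(round(prob_between(93, 107.5), 4))
print(round(prob_above(93.8), 4))
```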
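19.31
(a) $C_p = \frac{USL - LSL}{6(\bar{R}/d_2)} = \frac{102 - 98}{6(2)} = 0.3333$
$CPL = \frac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \frac{100 - 98}{3(2)} = 0.3333$
$CPU = \frac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \frac{102 - 100}{3(2)} = 0.3333$
$C_{pk} = \min(CPL, CPU) = 0.3333$
(b) $C_p = \frac{USL - LSL}{6(\bar{R}/d_2)} = \frac{107.5 - 93}{6(2)} = 1.2083$
$CPL = \frac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \frac{100 - 93}{3(2)} = 1.1667$
$CPU = \frac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \frac{107.5 - 100}{3(2)} = 1.25$
$C_{pk} = \min(CPL, CPU) = 1.1667$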
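The capability indices in Problem 19.31 all derive from the same four inputs, so they can be computed together. The following is a minimal sketch with an illustrative function name; sigma stands for the estimated standard deviation RBar/d2, which equals 2 in this problem.

```python
def capability_indices(usl, lsl, x_bar_bar, sigma):
    """Return (Cp, CPL, CPU, Cpk) for given specification limits.

    sigma is the estimated process standard deviation, RBar / d2.
    """
    cp = (usl - lsl) / (6 * sigma)
    cpl = (x_bar_bar - lsl) / (3 * sigma)
    cpu = (usl - x_bar_bar) / (3 * sigma)
    return cp, cpl, cpu, min(cpl, cpu)

# Problem 19.31(a): specification limits 98 and 102.
print(capability_indices(102, 98, 100, 2))

# Problem 19.31(b): specification limits 93 and 107.5.
print(capability_indices(107.5, 93, 100, 2))
```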
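19.32
(a) $P(18 < X < 22) = P\left(\frac{18 - 20.1065}{0.8794/2.059} < Z < \frac{22 - 20.1065}{0.8794/2.059}\right) = P(-4.932 < Z < 4.4335) = 0.9999$
(b) $C_p = \frac{USL - LSL}{6(\bar{R}/d_2)} = \frac{22 - 18}{6(0.8794/2.059)} = 1.56$
$CPL = \frac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \frac{20.1065 - 18}{3(0.8794/2.059)} = 1.644$
$CPU = \frac{USL - \bar{\bar{X}}}{3(\bar{R}/d_2)} = \frac{22 - 20.1065}{3(0.8794/2.059)} = 1.4778$
$C_{pk} = \min(CPL, CPU) = 1.4778$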
19.33
(a) Estimate of the population mean = $\bar{\bar{X}}$ = 15.85
Estimate of the population standard deviation = $\bar{R}/d_2$ = 2.272/1.693 = 1.342
Percentage within specification = $P(13 < X) = P\left(\frac{13 - 15.85}{1.342} < Z\right) = 0.9832$
(b) $CPL = \frac{\bar{\bar{X}} - LSL}{3(\bar{R}/d_2)} = \frac{15.85 - 13}{3(2.272/1.693)} = 0.7073$
$C_{pk} = \min(CPL, CPU) = 0.7073$
19.34
(a) $P(5.2 < X < 5.8) = P\left(\frac{5.2 - 5.509}{0.2248/2.059} < Z < \frac{5.8 - 5.509}{0.2248/2.059}\right) = P(-2.830 < Z < 2.665) = 0.9938$
(b) According to the estimate in (a), only 99.38% of the tea bags will have a weight that falls between 5.2 grams and 5.8 grams. The process is, therefore, incapable of meeting the 99.7% goal.
19.35
(a) The estimated percentage of the waiting times that are inside the specification limits is
$P(X < 5) = P\left(Z < \frac{5 - 5.9413}{3.275/2.059}\right) = 0.2773$
(b) The process is not capable of reaching the goal with a 99% requirement and will not be capable with an even more stringent criterion of 99.7%.
19.36
Chance or common causes of variation represent the inherent variability that exists in a system. These consist of the numerous small causes of variability that operate randomly or by chance. Special or assignable causes of variation represent large fluctuations or patterns in the data that are not inherent to a process. These fluctuations are often caused by changes in a system that represent either problems to be fixed or opportunities to exploit.
19.37
Find the reasons for the special causes and take corrective action to prevent their occurrence in the future or exploit them if they improve the process.
19.38
When only common causes of variation are present, it is up to management to change the system.
19.39
The p chart is an attribute control chart. It can be used when sampled items are classified according to whether they conform or do not conform to operationally defined requirements. It is based on the proportion of nonconforming items in a sample.
19.40
Attribute control charts are used for categorical or discrete data such as the number of nonconformances. Variables control charts are used for numerical variables and are based on statistics such as the mean and standard deviation.
19.41
Since the range is used to obtain the control limits of the chart for the mean, the range needs to be in a state of statistical control. Thus, the range and mean charts are used together.
19.42
From the red bead experiment you learned that variation is an inherent part of any process, that workers work within a system over which they have little control, that it is the system that primarily determines their performance, and that only management can change the system.
19.43
Process potential measures the potential of a process in satisfying production specification limits or customer satisfaction but does not take into account the actual performance of the process; process performance refers to the actual performance of the process in satisfying production specification limits.
19.44
If a process has a Cp = 1.5 and a Cpk = 0.8, it indicates that the process has the potential of meeting production specification limits but fails to meet the specification limits in actual performance. The process should be investigated and adjusted to increase either the CPU or CPL or both.
19.45
Capability analysis is not performed on out-of-control processes because out-of-control processes do not allow one to predict their capability. They are considered incapable of meeting specifications and, therefore, incapable of satisfying the production requirement.
19.46
(a) One of the main reasons that service quality is lower than product quality is that the former involves human interaction, which is prone to variation. Also, the most critical aspects of a service are often timeliness and professionalism, and customers can always perceive that the service could be done quicker and with greater professionalism. For products, customers often cannot perceive a better or more ideal product than the one they are getting. For example, a new laptop may be better and contain more interesting features than any laptop that the customer has ever imagined.
(b) Both services and products are the results of processes. However, measuring services is often harder because of the dynamic variation due to the human interaction between the service provider and the customer. Product quality is often a straightforward measurement of a static physical characteristic like the amount of sugar in a can of soda. Categorical data are also more common in service quality.
(c) Yes.
(d) Yes.
19.47
(a) A question like "Do you find the restrooms to be clean?" will provide responses that will allow you to construct a p chart.
(b) An example of common-cause variation is the inherent fluctuation in the proportion of customers who view the restroom to be clean due to natural fluctuation in the number of visitors during different periods of the day. An example of special cause variation is a large fluctuation in the proportion that might be caused by malfunctioning of the plumbing system in some restrooms.
(c) If the control chart is in control, you need to determine if the amount of common-cause variation is small enough to satisfy the customers. If the common-cause variation is small enough to satisfy the customers, you can use the control chart to monitor the process on a continuing basis to make sure that the process remains in control. If the common-cause variation is too large, you need to alter the process to reduce the size of the common-cause variation.
(d) If the control chart is out of control, you need to identify the special causes of variation that are producing the out-of-control conditions. If the special cause of variation is detrimental to the quality, you need to implement plans to eliminate this source of variation. If the special cause of variation increases quality, you should change the process so that this special cause is incorporated into the process design.
(e) A question like "Do you intend to return for another visit in the future?" will provide responses that will allow you to construct a p chart. Answers to (b) – (d) are the same.
(f) Continue to chart daily responses from random sampling.
19.48
(a) R-bar = 0.1284. R chart: UCL = 0.33063; LCL does not exist.
    [Figure: R chart, Boston shingles]
(b) X-double-bar = 1.2133. Xbar chart: UCL = 1.3447; LCL = 1.0820.
    [Figure: Xbar chart, Boston shingles]
(c) There is one point above the UCL and one point below the LCL in the control chart for the mean sealant strength. So the process is out of control and should be investigated for special causes.
(d) (a) [Figure: R chart, Vermont shingles]
    (b) [Figure: Xbar chart, Vermont shingles]
    (c) Since no point falls outside the upper and lower control limits of either chart, the process appears to be in control.
19.49
(a)
(b) (c)
p = 0.75175, LCL = 0.62215, UCL = 0.88135. Although none of the points are outside either the LCL or UCL, there is a clear pattern over time with lower values occurring in the first half of the sequence and higher values occurring toward the end of the sequence. This would explain the pattern in the results over time. The control chart would have been developed using the first 20 days and then, using those limits, the additional proportions could have been plotted.
19.50
(a) p-bar = 0.391, LCL = 0.301, UCL = 0.480.
(b) The process is out of statistical control. The proportion of investigations that are closed is below the LCL on Days 2 and 16 and above the UCL on Days 22 and 23.
(c) Special causes of variation should be investigated and eliminated. Next, process knowledge should be improved to increase the proportion of investigations closed the same day. Proportions above the UCL are in the direction the process needs to go. Therefore, when investigating the special causes for Days 22 and 23, consideration should be given to exploiting them.
19.51
(a) p-bar = 0.1198, LCL = 0.0205, UCL = 0.2191.
(b) The process is out of statistical control. The proportion of trades that are undesirable is below the LCL on Day 24 and above the UCL on Day 4.
(c) Special causes of variation should be investigated and eliminated. Next, process knowledge should be improved to decrease the proportion of trades that are undesirable.
19.52
Processing time:
Control chart for the range: R-bar = 3.597, LCL = 0.802, UCL = 6.392
Control chart for the mean: X-double-bar = 2.2653, LCL = 1.1575, UCL = 3.3732
Processing time can be considered to be in control in terms of the mean and the range since there is no strong pattern in either chart and there are no points that are outside the control limits.
Proportion of rework in the laboratory:
p-bar = 0.04737, LCL = 0.02721, UCL = 0.06752
Days 6 and 29 are above the UCL. Thus, the special causes that might be contributing to these values should be investigated before any change in the system of operation is contemplated. Then Deming's fourteen points can be applied to improve the system.
Number of daily admissions:
c-bar = 26.8, LCL = 11.2694, UCL = 42.3306
The number of daily admissions can be considered to be in control since there is no strong pattern in the chart and there are no points that are outside the control limits.
19.53
Kidney - Shift 1
PHStat output: [control chart]
Although there are no points outside the control limits, there is a strong increasing trend in nonconformances over time.
Kidney - Shift 2
Although there are no points outside the control limits, there is a strong increasing trend in nonconformances over time.
Shrimp - Shift 1
There are no points outside the control limits and there is no pattern over time.
Shrimp - Shift 2
There are no points outside the control limits and there is no pattern over time.
The team needs to determine the reasons for the increase in nonconformances for the kidney product. The production volume for kidney is clearly decreasing for both shifts. This can be observed from a plot of the production volume over time. The team needs to investigate the reasons for this.
19.54
Shift 1 - Kidney
The R-chart shows a process in a state of statistical control, with the individual points distributed around the average range of weight without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.
From the Xbar-chart, you can see that the eighth interval is below the LCL. Management needs to determine the root cause for this special cause variation and take corrective action. Also, during the first half of the sequence almost all the 15-minute intervals had less than the mean subgroup average weight, and all the 15-minute intervals in the second half had more than the mean subgroup average weight. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. When identified, action will be needed to correct this special cause.
Shift 2 – Kidney
The R-chart reveals that the 25th interval is above the UCL. Also, during the first half of the sequence almost all the 15-minute intervals had less than the mean subgroup weight range, and all the 15-minute intervals in the second half had more than the mean subgroup weight range. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. Management needs to determine the root cause for this special cause variation and take corrective action. Since the R-chart indicates an out-of-control process, the interpretation of the chart for the mean will be misleading.
Shift 1 – Shrimp
The R-chart shows a process in a state of statistical control, with the individual points distributed around the average range of weight without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.
The Xbar-chart shows a process in a state of statistical control, with the individual points distributed around the average of the subgroup means without any pattern and all the points are within the control limits. Thus, any improvement in the process must come from the reduction of common cause variation. Such reductions require changes in the process. These changes are the responsibility of management since improvements in quality cannot occur until changes to the process itself are successfully implemented.
Shift 2 – Shrimp
The R-chart reveals that the 20th, 23rd and 28th intervals are above the UCL. Also, there is an upward trend in the range. This is a degradation in the production capability which is due to some special cause of variation. So the special cause that produced this pattern needs to be determined. Management needs to determine the root cause for this special cause variation and take corrective action. Since the R-chart indicates an out-of-control process, the interpretation of the chart for the mean will be misleading.
Chapter 20 (Online)
20.1
(a) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A                 B
                         Action
1       B                100                   100 – 50 = 50     100 – 100 = 0
2       A                200                   200 – 200 = 0     200 – 125 = 75

(b) Decision tree (payoff at the end of each branch):
    Action A: Event 1 = 50; Event 2 = 200
    Action B: Event 1 = 100; Event 2 = 125
20.2
(a) Decision tree (payoff at the end of each branch):
    Action A: Event 1 = 50; Event 2 = 300; Event 3 = 500
    Action B: Event 1 = 10; Event 2 = 100; Event 3 = 200
(b) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A                 B
                         Action
1       A                50                    50 – 50 = 0       50 – 10 = 40
2       A                300                   300 – 300 = 0     300 – 100 = 200
3       A                500                   500 – 500 = 0     500 – 200 = 300

20.3
(a)-(c) Payoff table:

Event   A = Build large factory                B = Build small factory
1       10,000 × 10 – 400,000 = –300,000       10,000 × 10 – 200,000 = –100,000
2       20,000 × 10 – 400,000 = –200,000       20,000 × 10 – 200,000 = 0
3       50,000 × 10 – 400,000 = 100,000        50,000 × 10 – 200,000 = 300,000
4       100,000 × 10 – 400,000 = 600,000       50,000 × 10 – 200,000 = 300,000

(d) Decision tree (payoff at the end of each branch, Events 1 to 4):
    Action A: –300,000; –200,000; 100,000; 600,000
    Action B: –100,000; 0; 300,000; 300,000
(e) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A                                    B
                         Action
1       B                –100,000              –100,000 – (–300,000) = 200,000      –100,000 – (–100,000) = 0
2       B                0                     0 – (–200,000) = 200,000             0 – 0 = 0
3       B                300,000               300,000 – 100,000 = 200,000          300,000 – 300,000 = 0
4       A                600,000               600,000 – 600,000 = 0                600,000 – 300,000 = 300,000

20.4
(a)-(b) Payoff table:

Event   Company A                              Company B
1       $10,000 + $2 × 1,000 = $12,000         $2,000 + $4 × 1,000 = $6,000
2       $10,000 + $2 × 2,000 = $14,000         $2,000 + $4 × 2,000 = $10,000
3       $10,000 + $2 × 5,000 = $20,000         $2,000 + $4 × 5,000 = $22,000
4       $10,000 + $2 × 10,000 = $30,000        $2,000 + $4 × 10,000 = $42,000
5       $10,000 + $2 × 50,000 = $110,000       $2,000 + $4 × 50,000 = $202,000

(c) Decision tree (payoff at the end of each branch, Events 1 to 5):
    Action A: $12,000; $14,000; $20,000; $30,000; $110,000
    Action B: $6,000; $10,000; $22,000; $42,000; $202,000
(d) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A          B
                         Action
1       A                12,000                0          6,000
2       A                14,000                0          4,000
3       B                22,000                2,000      0
4       B                42,000                12,000     0
5       B                202,000               92,000     0

20.5
(a)-(b) Payoff table:

                   Action
Event              A: Buy 100   B: Buy 200   C: Buy 500   D: Buy 1,000
1: Sell 100        1,000        200          –2,200       –6,200
2: Sell 200        1,000        2,000        –400         –4,400
3: Sell 500        1,000        2,000        5,000        1,000
4: Sell 1,000      1,000        2,000        5,000        10,000

(c) Decision tree (payoff at the end of each branch, Events 1 to 4):
    Action A: 1,000; 1,000; 1,000; 1,000
    Action B: 200; 2,000; 2,000; 2,000
    Action C: –2,200; –400; 5,000; 5,000
    Action D: –6,200; –4,400; 1,000; 10,000
(d) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A        B        C        D
                         Action
1       A                1,000                 0        800      3,200    7,200
2       B                2,000                 1,000    0        2,400    6,400
3       C                5,000                 4,000    3,000    0        4,000
4       D                10,000                9,000    8,000    5,000    0
20.6
Excel output:
Probabilities & Payoffs Table:
                             P       A          B
E1                           0.5     50         100
E2                           0.5     200        125
Max                                  200        125
Min                                  50         100
Maximax (a)                  200
Maximin (b)                  100

Statistics for:                      A          B
Expected Monetary Value (c)          125        112.5
Variance                             5625       156.25
Standard Deviation                   75         12.5
Coefficient of Variation             0.6        0.111111
Return to Risk Ratio                 1.666667   9

Opportunity Loss Table:
        Optimum Action   Optimum Profit   A     B
E1      B                100              50    0
E2      A                200              0     75
Expected Opportunity Loss (d)             25    37.5
EVPI = 25 (the smaller expected opportunity loss, action A)
(a) The optimal action based on the maximax criterion is Action A.
(b) The optimal action based on the maximin criterion is Action B.
(c) EMV(A) = 50(0.5) + 200(0.5) = 125
    EMV(B) = 100(0.5) + 125(0.5) = 112.50
(d) EOL(A) = 50(0.5) + 0(0.5) = 25
    EOL(B) = 0(0.5) + 75(0.5) = 37.50
(e) Perfect information would correctly forecast which event, 1 or 2, will occur. The value of perfect information is the increase in the expected value if you knew which of the events 1 or 2 would occur prior to making a decision between actions. It allows us to select the optimum action given a correct forecast.
    EMV with perfect information = 100(0.5) + 200(0.5) = 150
    EVPI = EMV with perfect information – EMV(A) = 150 – 125 = 25
(f) Based on (c) and (d) above, select action A because it has a higher expected monetary value and a lower expected opportunity loss than action B.
(g) σ²(A) = (50 – 125)²(0.5) + (200 – 125)²(0.5) = 5,625; σ(A) = 75
    CV(A) = (75/125) × 100% = 60%
    σ²(B) = (100 – 112.5)²(0.5) + (125 – 112.5)²(0.5) = 156.25; σ(B) = 12.5
    CV(B) = (12.5/112.5) × 100% = 11.11%
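The EMV, variance, standard deviation, and coefficient of variation in (c) and (g) can be checked with a short script. This is a minimal sketch and not part of the original solution; the function name `risk_stats` and the dictionary keys are illustrative choices, and the payoffs and probabilities are those of Problem 20.6.

```python
from math import sqrt


def risk_stats(payoffs, probs):
    """Return EMV, variance, standard deviation, and CV for one action."""
    # Expected monetary value: probability-weighted mean payoff.
    emv = sum(x * p for x, p in zip(payoffs, probs))
    # Variance: probability-weighted squared deviation from the EMV.
    variance = sum((x - emv) ** 2 * p for x, p in zip(payoffs, probs))
    sd = sqrt(variance)
    # Coefficient of variation as a fraction (multiply by 100 for a percentage).
    return {"emv": emv, "variance": variance, "sd": sd, "cv": sd / emv}


probs = [0.5, 0.5]
stats_a = risk_stats([50, 200], probs)   # action A
stats_b = risk_stats([100, 125], probs)  # action B
print(stats_a)
print(stats_b)
```

The CV is returned as a fraction, matching the spreadsheet output (0.6 and 0.111111), rather than as a percentage.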
(h) Return-to-risk ratio for A = 125/75 = 1.667
    Return-to-risk ratio for B = 112.5/12.5 = 9.0
(i) Based on (g) and (h), select action B because it has a lower coefficient of variation and a higher return-to-risk ratio.
(j) The best decision depends on the decision criteria. In this case, expected monetary value leads to a different decision than the return-to-risk ratio.
20.7
PHStat output: Expected Monetary Value
Probabilities & Payoffs Table:
                             P       A          B
E1                           0.8     50         10
E2                           0.1     300        100
E3                           0.1     500        200
Max                                  500        200
Min                                  50         10
Maximax                      500
Maximin                      50

Statistics for:                      A          B
Expected Monetary Value              120        38
Variance                             21600      3636
Standard Deviation                   146.9694   60.29925
Coefficient of Variation             1.224745   1.586822
Return to Risk Ratio                 0.816497   0.63019

Opportunity Loss Table:
        Optimum Action   Optimum Profit   A     B
E1      A                50               0     40
E2      A                300              0     200
E3      A                500              0     300
Expected Opportunity Loss                 0     82
EVPI = 0 (the smaller expected opportunity loss, action A)
(a) The optimal action based on the maximax criterion is Action A.
(b) The optimal action based on the maximin criterion is Action A.
(c) EMV(A) = 50(0.8) + 300(0.1) + 500(0.1) = 120
    EMV(B) = 10(0.8) + 100(0.1) + 200(0.1) = 38
(d) EOL(A) = 0(0.8) + 0(0.1) + 0(0.1) = 0
    EOL(B) = 40(0.8) + 200(0.1) + 300(0.1) = 82
(e) EVPI = 0. The expected value of perfect information is zero because the optimum decision is action A across all three event states.
(f) Based on the results of (c) and (d), select action A because it has a higher expected monetary value than action B and an expected opportunity loss of zero.
(g) σ²(A) = (50 – 120)²(0.8) + (300 – 120)²(0.1) + (500 – 120)²(0.1) = 21,600; σ(A) = 146.97
    CV(A) = (146.97/120) × 100% = 122.47%
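The opportunity loss and EVPI results in (d) and (e) can be reproduced from the payoff table alone. The sketch below is illustrative only; the helper names `expected` and `opportunity_losses` are assumptions, not anything from PHStat, and the data are the Problem 20.7 payoffs.

```python
def expected(values, probs):
    """Probability-weighted sum of a list of values."""
    return sum(v * p for v, p in zip(values, probs))


def opportunity_losses(payoffs):
    """Return the best payoff per event and each action's loss per event."""
    events = range(len(next(iter(payoffs.values()))))
    # Best achievable payoff for each event across all actions.
    best = [max(row[i] for row in payoffs.values()) for i in events]
    # Opportunity loss = best payoff for the event minus the action's payoff.
    losses = {a: [best[i] - row[i] for i in events] for a, row in payoffs.items()}
    return best, losses


payoffs = {"A": [50, 300, 500], "B": [10, 100, 200]}
probs = [0.8, 0.1, 0.1]

best, losses = opportunity_losses(payoffs)
eol = {a: expected(row, probs) for a, row in losses.items()}
emv = {a: expected(row, probs) for a, row in payoffs.items()}
# EVPI = EMV with perfect information minus the best EMV without it.
evpi = expected(best, probs) - max(emv.values())
print(eol, evpi)
```

Because action A is optimal for every event, its opportunity losses are all zero, which is why the EVPI is zero as well.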
20.7 cont.
(g) σ²(B) = (10 – 38)²(0.8) + (100 – 38)²(0.1) + (200 – 38)²(0.1) = 3,636; σ(B) = 60.299
    CV(B) = (60.299/38) × 100% = 158.68%
(h) Return-to-risk ratio for A = 120/146.97 = 0.816
    Return-to-risk ratio for B = 38/60.299 = 0.630
(i) Based on the results of (g) and (h), select action A because it has a lower coefficient of variation and a higher return-to-risk ratio than action B.
(j) The recommendation to select action A is made consistently across both parts (f) and (i).
(k) No, the recommendation to select action A is independent of the probabilities whenever action A is the preferred choice across all event states.
20.8
(a) Rate of return = ($100/$1,000) × 100% = 10%
(b) CV = ($25/$100) × 100% = 25%
(c) Return-to-risk ratio = $100/$25 = 4.0
20.9
(a) EMV = 50(0.3) + 100(0.3) + 120(0.3) + 200(0.1) = 101
(b) σ² = (50 – 101)²(0.3) + (100 – 101)²(0.3) + (120 – 101)²(0.3) + (200 – 101)²(0.1) = 1,869; σ = 43.23
(c) CV = 42.80%
(d) Return-to-risk ratio = 2.336
20.10
Select stock A because it has a higher expected monetary value while it has the same standard deviation as stock B.
20.11
Select stock B because it has the same expected monetary value as stock A but a smaller standard deviation.
20.12
PHStat output: Expected Monetary Value
Probabilities & Payoffs Table:
                             P       Sell Soft Drinks   Sell Ice Cream
Cool weather                 0.4     50                 30
Warm weather                 0.6     60                 90
Max                                  60                 90
Min                                  50                 30
Maximax                      90
Maximin                      50

Statistics for:                      Sell Soft Drinks   Sell Ice Cream
Expected Monetary Value              56                 66
Variance                             24                 864
Standard Deviation                   4.898979           29.39388
Coefficient of Variation             0.087482           0.445362
Return to Risk Ratio                 11.43095           2.245366

Opportunity Loss Table:
                Optimum Action     Optimum Profit   Sell Soft Drinks   Sell Ice Cream
Cool weather    Sell Soft Drinks   50               0                  20
Warm weather    Sell Ice Cream     90               30                 0
Expected Opportunity Loss                           18                 8
EVPI = 8 (the smaller expected opportunity loss, Sell Ice Cream)
(a) The optimal action based on the maximax criterion is to sell ice cream.
(b) The optimal action based on the maximin criterion is to sell soft drinks.
(c) EMV(Soft drinks) = 50(0.4) + 60(0.6) = 56
    EMV(Ice cream) = 30(0.4) + 90(0.6) = 66
(d) EOL(Soft drinks) = 0(0.4) + 30(0.6) = 18
    EOL(Ice cream) = 20(0.4) + 0(0.6) = 8
(e) EVPI is the maximum amount of money the vendor is willing to pay for the information about which event will occur.
(f) Based on (c) and (d), choose to sell ice cream because you will earn a higher expected monetary value and incur a lower expected opportunity loss than choosing to sell soft drinks.
(g) CV(Soft drinks) = (4.899/56) × 100% = 8.748%
    CV(Ice cream) = (29.394/66) × 100% = 44.536%
(h) Return-to-risk ratio for soft drinks = 56/4.899 = 11.431
    Return-to-risk ratio for ice cream = 66/29.394 = 2.245
(i) To maximize return and minimize risk, you will choose to sell soft drinks because it has the smaller coefficient of variation and the larger return-to-risk ratio.
(j) Ignoring the variability of the payoff in (f), you will choose to sell ice cream. However, when risk, which is measured by standard deviation, is taken into consideration as in the coefficient of variation or the return-to-risk ratio, you will choose to sell soft drinks because it has the lower variability per unit of expected return and the higher expected return per unit of variability.
20.13
PHStat output: Expected Monetary Value
Probabilities & Payoffs Table:
                             P       Buy 500    Buy 1000   Buy 2000
Sell 500                     0.2     500        0          -1000
Sell 1000                    0.4     500        1000       0
Sell 2000                    0.4     500        1000       2000
Max                                  500        1000       2000
Min                                  500        0          -1000
Maximax                      2000
Maximin                      500

Statistics for:                      Buy 500    Buy 1000   Buy 2000
Expected Monetary Value              500        800        600
Variance                             0          160000     1440000
Standard Deviation                   0          400        1200
Coefficient of Variation             0          0.5        2
Return to Risk Ratio                 #DIV/0!    2          0.5

Opportunity Loss Table:
             Optimum Action   Optimum Profit   Buy 500   Buy 1000   Buy 2000
Sell 500     Buy 500          500              0         500        1500
Sell 1000    Buy 1000         1000             500       0          1000
Sell 2000    Buy 2000         2000             1500      1000       0
Expected Opportunity Loss                      800       500        700
EVPI = 500 (the smallest expected opportunity loss, Buy 1000)
(a) See the table above.
(b) The optimal action based on the maximax criterion is to buy 2,000 pounds.
(c) The optimal action based on the maximin criterion is to buy 500 pounds.
(d) EMV(A) = 500(0.2) + 500(0.4) + 500(0.4) = 500
    EMV(B) = 0(0.2) + 1,000(0.4) + 1,000(0.4) = 800
    EMV(C) = –1,000(0.2) + 0(0.4) + 2,000(0.4) = 600
    Based on the expected monetary value, the company should purchase 1,000 pounds of clams and will expect to net $800 for the activity.
(e) σ²(A) = (500 – 500)²(0.2) + (500 – 500)²(0.4) + (500 – 500)²(0.4) = 0; σ(A) = 0
    σ²(B) = (0 – 800)²(0.2) + (1,000 – 800)²(0.4) + (1,000 – 800)²(0.4) = 160,000; σ(B) = 400
    σ²(C) = (–1,000 – 600)²(0.2) + (0 – 600)²(0.4) + (2,000 – 600)²(0.4) = 1,440,000; σ(C) = 1,200
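The maximax and maximin choices in (b) and (c) depend only on the payoffs, not on the probabilities. A minimal sketch follows; the function names `maximax` and `maximin` are illustrative, and the dictionary holds the Problem 20.13 payoffs for selling 500, 1,000, and 2,000 pounds.

```python
def maximax(payoffs):
    """Action whose best-case payoff is largest (optimistic criterion)."""
    return max(payoffs, key=lambda action: max(payoffs[action]))


def maximin(payoffs):
    """Action whose worst-case payoff is largest (pessimistic criterion)."""
    return max(payoffs, key=lambda action: min(payoffs[action]))


payoffs = {
    "Buy 500": [500, 500, 500],
    "Buy 1000": [0, 1000, 1000],
    "Buy 2000": [-1000, 0, 2000],
}
print(maximax(payoffs), maximin(payoffs))
```

If two actions tie, `max` returns the first one in dictionary order, so a fuller version would need an explicit tie rule.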
20.13 cont.
(f) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A        B        C
                         Action
1       A                500                   0        500      1,500
2       B                1,000                 500      0        1,000
3       C                2,000                 1,500    1,000    0

    EOL(A) = 0(0.2) + 500(0.4) + 1,500(0.4) = 800
    EOL(B) = 500(0.2) + 0(0.4) + 1,000(0.4) = 500
    EOL(C) = 1,500(0.2) + 1,000(0.4) + 0(0.4) = 700
(g) EMV with perfect information = 500(0.2) + 1,000(0.4) + 2,000(0.4) = 1,300
    EVPI = EMV with perfect information – EMV(B) = 1,300 – 800 = 500
    The company should not be willing to pay more than $500 for a perfect forecast.
(h) CV(A) = (0/500) × 100% = 0%
    CV(B) = (400/800) × 100% = 50%
    CV(C) = (1,200/600) × 100% = 200%
(i) Return-to-risk ratio for A = 500/0 = undefined
    Return-to-risk ratio for B = 800/400 = 2.0
    Return-to-risk ratio for C = 600/1,200 = 0.5
(j) Choose to buy 1,000 pounds of clams. Buying 1,000 pounds has the highest expected monetary value ($800), the lowest expected opportunity loss ($500), and the larger of the two return-to-risk ratios with defined solutions.
(k) There is no discrepancy.
(l) PHStat output:
Probabilities & Payoffs Table:
                             P       Buy 500    Buy 1000   Buy 2000
Sell 500                     0.2     750        250        -750
Sell 1000                    0.4     750        1500       500
Sell 2000                    0.4     750        1500       3000
Max                                  750        1500       3000
Min                                  750        250        -750
Maximax                      3000
Maximin                      750

Statistics for:                      Buy 500    Buy 1000   Buy 2000
Expected Monetary Value              750        1250       1250
Variance                             0          250000     2250000
Standard Deviation                   0          500        1500
Coefficient of Variation             0          0.4        1.2
Return to Risk Ratio                 #DIV/0!    2.5        0.833333

Opportunity Loss Table:
             Optimum Action   Optimum Profit   Buy 500   Buy 1000   Buy 2000
Sell 500     Buy 500          750              0         500        1500
Sell 1000    Buy 1000         1500             750       0          1000
Sell 2000    Buy 2000         3000             2250      1500       0
Expected Opportunity Loss                      1200      700        700
EVPI = 700 (the smallest expected opportunity loss, Buy 1000 or Buy 2000)
    (a) See the table above.
    (b) The optimal action based on the maximax criterion is to buy 2,000 pounds.
    (c) The optimal action based on the maximin criterion is to buy 500 pounds.
    (d) EMV(A) = 750(0.2) + 750(0.4) + 750(0.4) = 750
        EMV(B) = 250(0.2) + 1,500(0.4) + 1,500(0.4) = 1,250
        EMV(C) = –750(0.2) + 500(0.4) + 3,000(0.4) = 1,250
        Based solely on the expected monetary value, the company should purchase 1,000 or 2,000 pounds of clams and will expect to net $1,250 for the activity.
    (e) σ²(A) = (750 – 750)²(0.2) + (750 – 750)²(0.4) + (750 – 750)²(0.4) = 0; σ(A) = 0
        σ²(B) = (250 – 1,250)²(0.2) + (1,500 – 1,250)²(0.4) + (1,500 – 1,250)²(0.4) = 250,000; σ(B) = 500
        σ²(C) = (–750 – 1,250)²(0.2) + (500 – 1,250)²(0.4) + (3,000 – 1,250)²(0.4) = 2,250,000; σ(C) = 1,500
    (f) EOL(A) = 0(0.2) + 750(0.4) + 2,250(0.4) = 1,200
        EOL(B) = 500(0.2) + 0(0.4) + 1,500(0.4) = 700
        EOL(C) = 1,500(0.2) + 1,000(0.4) + 0(0.4) = 700
    (g) EMV with perfect information = 750(0.2) + 1,500(0.4) + 3,000(0.4) = 1,950
        EVPI = EMV with perfect information – EMV(B or C) = 1,950 – 1,250 = 700
        The company should not be willing to pay more than $700 for a perfect forecast.
    (h) CV(A) = (0/750) × 100% = 0%
        CV(B) = (500/1,250) × 100% = 40%
        CV(C) = (1,500/1,250) × 100% = 120%
    (i) Return-to-risk ratio for A = 750/0 = undefined
        Return-to-risk ratio for B = 1,250/500 = 2.5
        Return-to-risk ratio for C = 1,250/1,500 = 0.833
    (j) Buy 1,000 or 2,000 pounds of clams, actions B or C. Buying 1,000 or 2,000 pounds has the highest expected monetary value ($1,250) and the lowest expected opportunity loss ($700). But action B has the higher return-to-risk ratio and is the best choice with respect to the return-to-risk.
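Both payoff tables for this problem (profit of $1.00 and $1.50 per pound) can be generated from one rule. The sketch below is a reconstruction, not the textbook's stated model: it assumes each pound sold earns the stated profit and each unsold pound loses $1.00, an assumption inferred from the table values, and the names `payoff` and `payoff_table` are hypothetical.

```python
def payoff(bought, demanded, profit_per_lb, loss_per_unsold_lb=1.0):
    """Payoff for buying `bought` pounds when `demanded` pounds can be sold.

    loss_per_unsold_lb is an assumption inferred from the payoff tables.
    """
    sold = min(bought, demanded)
    return profit_per_lb * sold - loss_per_unsold_lb * (bought - sold)


def payoff_table(quantities, profit_per_lb):
    """Map each purchase quantity to its payoffs across all demand levels."""
    return {b: [payoff(b, d, profit_per_lb) for d in quantities] for b in quantities}


quantities = [500, 1000, 2000]
base_table = payoff_table(quantities, 1.0)    # original profit per pound
raised_table = payoff_table(quantities, 1.5)  # profit raised by $0.50 per pound
print(base_table)
print(raised_table)
```

Regenerating the table this way makes it easy to rerun the EMV and opportunity loss calculations for other profit levels.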
20.13 cont.
(m) PHStat output:
Probabilities & Payoffs Table:
                             P       Buy 500    Buy 1000   Buy 2000
Sell 500                     0.4     500        0          -1000
Sell 1000                    0.4     500        1000       0
Sell 2000                    0.2     500        1000       2000
Max                                  500        1000       2000
Min                                  500        0          -1000
Maximax                      2000
Maximin                      500

Statistics for:                      Buy 500    Buy 1000   Buy 2000
Expected Monetary Value              500        600        0
Variance                             0          240000     1200000
Standard Deviation                   0          489.8979   1095.445
Coefficient of Variation             0          0.816497   #DIV/0!
Return to Risk Ratio                 #DIV/0!    1.224745   0

Opportunity Loss Table:
             Optimum Action   Optimum Profit   Buy 500   Buy 1000   Buy 2000
Sell 500     Buy 500          500              0         500        1500
Sell 1000    Buy 1000         1000             500       0          1000
Sell 2000    Buy 2000         2000             1500      1000       0
Expected Opportunity Loss                      500       400        1000
EVPI = 400 (the smallest expected opportunity loss, Buy 1000)

    (a) See the table above.
    (b) The optimal action based on the maximax criterion is to buy 2,000 pounds.
    (c) The optimal action based on the maximin criterion is to buy 500 pounds.
    (d) EMV(A) = 500(0.4) + 500(0.4) + 500(0.2) = 500
        EMV(B) = 0(0.4) + 1,000(0.4) + 1,000(0.2) = 600
        EMV(C) = –1,000(0.4) + 0(0.4) + 2,000(0.2) = 0
        Based solely on the expected monetary value, the company should purchase 1,000 pounds of clams and will expect to net $600 for the activity.
    (e) σ²(A) = (500 – 500)²(0.4) + (500 – 500)²(0.4) + (500 – 500)²(0.2) = 0; σ(A) = 0
        σ²(B) = (0 – 600)²(0.4) + (1,000 – 600)²(0.4) + (1,000 – 600)²(0.2) = 240,000; σ(B) = 489.90
        σ²(C) = (–1,000 – 0)²(0.4) + (0 – 0)²(0.4) + (2,000 – 0)²(0.2) = 1,200,000; σ(C) = 1,095.45
20.13 cont.
(m) (f) Opportunity loss table:

                                               Alternative Courses of Action
Event   Optimum Action   Profit of Optimum     A        B        C
                         Action
1       A                500                   0        500      1,500
2       B                1,000                 500      0        1,000
3       C                2,000                 1,500    1,000    0

        EOL(A) = 0(0.4) + 500(0.4) + 1,500(0.2) = 500
        EOL(B) = 500(0.4) + 0(0.4) + 1,000(0.2) = 400
        EOL(C) = 1,500(0.4) + 1,000(0.4) + 0(0.2) = 1,000
    (g) EMV with perfect information = 500(0.4) + 1,000(0.4) + 2,000(0.2) = 1,000
        EVPI = EMV with perfect information – EMV(B) = 1,000 – 600 = 400
        The company should not be willing to pay more than $400 for a perfect forecast.
    (h) CV(A) = (0/500) × 100% = 0%
        CV(B) = (489.90/600) × 100% = 81.65%
        CV(C) = (1,095.45/0) × 100% = undefined
    (i) Return-to-risk ratio for A = 500/0 = undefined
        Return-to-risk ratio for B = 600/489.90 = 1.22
        Return-to-risk ratio for C = 0/1,095.45 = 0
    (j) Buy 1,000 pounds of clams, action B. Buying 1,000 pounds has the highest expected monetary value ($600) and the lowest expected opportunity loss ($400). Action B has the higher return-to-risk ratio and is the best choice with respect to the return-to-risk.
    (k) Although the values of EMV, EOL, and σ are affected by $0.50/pound changes in the profit and by shifts in the probability with which events occur, the recommendation for action B remains unaffected.
20.14
PHStat output:
Probabilities & Payoffs Table:
                             P       A          B          C
Economy declines             0.3     500        -2000      -7000
No change                    0.5     1000       2000       -1000
Economy expands              0.2     2000       5000       20000
Max                                  2000       5000       20000
Min                                  500        -2000      -7000
Maximax                      20000
Maximin                      500

Statistics for:                      A          B          C
Expected Monetary Value              1050       1400       1400
Variance                             272500     6240000    93240000
Standard Deviation                   522.0153   2497.999   9656.086
Coefficient of Variation             0.497157   1.784285   6.897204
Return to Risk Ratio                 2.011435   0.560449   0.144986

Opportunity Loss Table:
                   Optimum Action   Optimum Profit   A        B        C
Economy declines   A                500              0        2500     7500
No change          B                2000             1000     0        3000
Economy expands    C                20000            18000    15000    0
Expected Opportunity Loss                            4100     3750     3750
EVPI = 3750 (the smallest expected opportunity loss, B or C)
(a) The optimal action based on the maximax criterion is to choose investment C.
(b) The optimal action based on the maximin criterion is to choose investment A.
(c) EMV(A) = 500(0.3) + 1,000(0.5) + 2,000(0.2) = 1,050
    EMV(B) = –2,000(0.3) + 2,000(0.5) + 5,000(0.2) = 1,400
    EMV(C) = –7,000(0.3) – 1,000(0.5) + 20,000(0.2) = 1,400
(d) See the table above.
    EOL(A) = 0(0.3) + 1,000(0.5) + 18,000(0.2) = 4,100
    EOL(B) = 2,500(0.3) + 0(0.5) + 15,000(0.2) = 3,750
    EOL(C) = 7,500(0.3) + 3,000(0.5) + 0(0.2) = 3,750
(e) EMV with perfect information = 500(0.3) + 2,000(0.5) + 20,000(0.2) = 5,150
    EVPI = EMV with perfect information – EMV(B or C) = 5,150 – 1,400 = 3,750
    The investor should not be willing to pay more than $3,750 for a perfect forecast.
(f) Actions B and C maximize the expected monetary value and have the lower expected opportunity loss.
(g) σ²(A) = (500 – 1,050)²(0.3) + (1,000 – 1,050)²(0.5) + (2,000 – 1,050)²(0.2) = 272,500; σ(A) = 522.02
    σ²(B) = (–2,000 – 1,400)²(0.3) + (2,000 – 1,400)²(0.5) + (5,000 – 1,400)²(0.2) = 6,240,000; σ(B) = 2,498.00
    σ²(C) = (–7,000 – 1,400)²(0.3) + (–1,000 – 1,400)²(0.5) + (20,000 – 1,400)²(0.2) = 93,240,000; σ(C) = 9,656.09
    CV(A) = (522.02/1,050) × 100% = 49.72%
    CV(B) = (2,498.00/1,400) × 100% = 178.43%
    CV(C) = (9,656.09/1,400) × 100% = 689.72%
(h) Return-to-risk ratio for A = 1,050/522.02 = 2.01
    Return-to-risk ratio for B = 1,400/2,498 = 0.56
    Return-to-risk ratio for C = 1,400/9,656.09 = 0.14
(i)-(j) Action A minimizes the coefficient of variation and maximizes the investor's return-to-risk.
(k)
                             (1)              (2)              (3)                    (4)
Probabilities                0.1, 0.6, 0.3    0.1, 0.3, 0.6    0.4, 0.4, 0.2          0.6, 0.3, 0.1
(c) Max EMV                  C: 4,700         C: 11,000        A or B: 1,000          A: 800
    σ of max-EMV action      C: 10,169        C: 11,145        A: 548; B: 2,683       A: 458
(d) Min EOL & (e) EVPI       C: 2,550         C: 1,650         A: 4,000 or B: 4,000   A: 2,100
(g) Min CV                   A: 40.99%        A: 36.64%        A: 54.77%              A: 57.28%
(h) Max return-to-risk       A: 2.4398        A: 2.7294        A: 1.8257              A: 1.7457
(i) Choice on (g), (h)       Choose A         Choose A         Choose A               Choose A
(j) Compare (c) and (i)      Different:       Different:       Different:             Same: A
                             (c) C, (j) A     (c) C, (j) A     (c) A or B, (j) A
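The maximum-EMV row of the sensitivity comparison in (k) can be recomputed directly from the 20.14 payoff table. This is a rough sketch under the assumption that the four probability sets are ordered as (economy declines, no change, economy expands); the names `emv`, `best_by_emv`, and `scenarios` are illustrative.

```python
def emv(row, probs):
    """Expected monetary value of one action."""
    return sum(x * p for x, p in zip(row, probs))


def best_by_emv(payoffs, probs, tol=1e-9):
    """Return the action(s) with the largest EMV (ties kept) and that EMV."""
    values = {action: emv(row, probs) for action, row in payoffs.items()}
    top = max(values.values())
    winners = sorted(a for a, v in values.items() if abs(v - top) <= tol)
    return winners, top


payoffs = {
    "A": [500, 1000, 2000],
    "B": [-2000, 2000, 5000],
    "C": [-7000, -1000, 20000],
}
scenarios = {
    1: [0.1, 0.6, 0.3],
    2: [0.1, 0.3, 0.6],
    3: [0.4, 0.4, 0.2],
    4: [0.6, 0.3, 0.1],
}
results = {k: best_by_emv(payoffs, p) for k, p in scenarios.items()}
for k, (winners, value) in results.items():
    print(k, winners, round(value, 2))
```

Ties are kept with a small tolerance so that a scenario with two equally good actions reports both rather than whichever comes first.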
20.15
PHStat output:
Probabilities & Payoffs Table:
                             P       Large factory   Small factory
Sell 10000                   0.1     -300000         -100000
Sell 20000                   0.4     -200000         0
Sell 50000                   0.2     100000          300000
Sell 100000                  0.3     600000          300000
Max                                  600000          300000
Min                                  -300000         -100000
Maximax                      600000
Maximin                      -100000

Statistics for:                      Large factory   Small factory
Expected Monetary Value              90000           140000
Variance                             1.27E+11        2.64E+10
Standard Deviation                   356230.3        162480.8
Coefficient of Variation             3.958114        1.160577
Return to Risk Ratio                 0.252646        0.86164

Opportunity Loss Table:
              Optimum Action   Optimum Profit   Large factory   Small factory
Sell 10000    Small factory    -100000          200000          0
Sell 20000    Small factory    0                200000          0
Sell 50000    Small factory    300000           200000          0
Sell 100000   Large factory    600000           0               300000
Expected Opportunity Loss                       140000          90000
EVPI = 90000 (the smaller expected opportunity loss, Small factory)
20.15 cont.
(a) The optimal action based on the maximax criterion is to build a large factory.
(b) The optimal action based on the maximin criterion is to build a small factory.
(c) See the table above.
    EMVA = –300,000(0.1) + –200,000(0.4) + 100,000(0.2) + 600,000(0.3) = 90,000
    EMVB = –100,000(0.1) + 0(0.4) + 300,000(0.2) + 300,000(0.3) = 140,000
(d) See the table above.
    EOLA = 200,000(0.1) + 200,000(0.4) + 200,000(0.2) + 0(0.3) = 140,000
    EOLB = 0(0.1) + 0(0.4) + 0(0.2) + 300,000(0.3) = 90,000
(e) EMV with perfect information = –100,000(0.1) + 0(0.4) + 300,000(0.2) + 600,000(0.3) = 230,000
    EVPI = EMV, perfect information – EMVB = 230,000 – 140,000 = 90,000
    The company should not be willing to pay more than $90,000 for a perfect forecast.
(f) The company should build a small factory to maximize expected monetary value ($140,000) and minimize expected opportunity loss ($90,000).
(g) CVA = (356,230 / 90,000)(100%) = 395.81%
    CVB = (162,481 / 140,000)(100%) = 116.06%
(h) Return-to-risk ratio for A = 90,000 / 356,230 = 0.2526
    Return-to-risk ratio for B = 140,000 / 162,481 = 0.8616
(i) To minimize risk and maximize the return-to-risk, the company should decide to build a small plant.
(j) There are no discrepancies.
(k) PHStat output:
Probabilities & Payoffs Table:
              P      Large factory   Small factory
Sell 10000    0.4          -300000         -100000
Sell 20000    0.2          -200000               0
Sell 50000    0.2           100000          300000
Sell 100000   0.2           600000          300000
Max                         600000          300000
Min                        -300000         -100000
Maximax                     600000
Maximin                    -100000
Statistics for:             Large factory   Small factory
Expected Monetary Value            -20000           80000
Variance                         1.18E+11        3.36E+10
Standard Deviation               342928.6          183303
Coefficient of Variation         -17.1464        2.291288
Return to Risk Ratio             -0.05832        0.436436
Opportunity Loss Table:
              Optimum         Optimum    Alternatives
              Action          Profit     Large factory   Small factory
Sell 10000    Small factory   -100000           200000               0
Sell 20000    Small factory         0           200000               0
Sell 50000    Small factory    300000           200000               0
Sell 100000   Large factory    600000                0          300000
Expected Opportunity Loss                       160000           60000
EVPI                                                             60000
20.15 cont.
(k) (c), (d), (g), (h) See the table above.
    (e) EMV with perfect information = –100,000(0.4) + 0(0.2) + 300,000(0.2) + 600,000(0.2) = 140,000
        EVPI = EMV, perfect information – EMVB = 140,000 – 80,000 = 60,000
        Under these conditions, the company should not be willing to pay more than $60,000 for a perfect forecast.
    (f) The company should build a small factory to maximize expected monetary value ($80,000) and minimize expected opportunity loss ($60,000).
    (i) To minimize risk and maximize the return-to-risk, the company should decide to build a small plant.
    (j) There are no discrepancies. The company’s decision is not affected by the changed probabilities.
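The EMV, EOL, and EVPI arithmetic in 20.15 can be checked with a short script. This is only a sketch: the probabilities and payoffs are copied from the original (0.1, 0.4, 0.2, 0.3) calculations above, and the names `probs`, `payoffs`, and `expected` are illustrative, not part of PHStat.

```python
# Probabilities and payoffs copied from the original 20.15 calculations.
probs = [0.1, 0.4, 0.2, 0.3]
payoffs = {
    "Large factory": [-300_000, -200_000, 100_000, 600_000],
    "Small factory": [-100_000, 0, 300_000, 300_000],
}


def expected(values, probs):
    """Probability-weighted mean of one value per event."""
    return sum(v * p for v, p in zip(values, probs))


# Best payoff attainable for each event (the "optimum profit" column).
best = [max(column) for column in zip(*payoffs.values())]

emv = {action: expected(row, probs) for action, row in payoffs.items()}
eol = {
    action: expected([b - x for b, x in zip(best, row)], probs)
    for action, row in payoffs.items()
}
emv_perfect = expected(best, probs)     # expected profit under certainty
evpi = emv_perfect - max(emv.values())  # should match the smallest EOL

for action in payoffs:
    print(f"{action}: EMV = {emv[action]:,.0f}, EOL = {eol[action]:,.0f}")
print(f"EVPI = {evpi:,.0f}")
```

Replacing `probs` with 0.4, 0.2, 0.2, 0.2 gives the part (k) figures.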
20.16
PHStat output:
Probabilities & Payoffs Table:
                P            A         B
Demand 1000     0.45     12000      6000
Demand 2000     0.2      14000     10000
Demand 5000     0.15     20000     22000
Demand 10000    0.1      30000     42000
Demand 50000    0.1     110000    202000
Max                     110000    202000
Min                      12000      6000
Maximax                 202000
Maximin                  12000
Statistics for:                    A           B
Expected Monetary Value        25200       32400
Variance                    8.29E+08    3.32E+09
Standard Deviation          28791.67    57583.33
Coefficient of Variation    1.142526    1.777263
Return to Risk Ratio        0.875253    0.562663
Opportunity Loss Table:
                Optimum   Optimum    Alternatives
                Action    Profit          A         B
Demand 1000     A          12000          0      6000
Demand 2000     A          14000          0      4000
Demand 5000     B          22000       2000         0
Demand 10000    B          42000      12000         0
Demand 50000    B         202000      92000         0
Expected Opportunity Loss             10700      3500
EVPI                                             3500
(a) The optimal action based on the maximax criterion is to sign with company B.
(b) The optimal action based on the maximin criterion is to sign with company A.
(c) EMVA = 12,000(0.45) + 14,000(0.2) + 20,000(0.15) + 30,000(0.1) + 110,000(0.1) = 25,200
    EMVB = 6,000(0.45) + 10,000(0.2) + 22,000(0.15) + 42,000(0.1) + 202,000(0.1) = 32,400
(d) EOLA = 0(0.45) + 0(0.2) + 2,000(0.15) + 12,000(0.1) + 92,000(0.1) = 10,700
    EOLB = 6,000(0.45) + 4,000(0.2) + 0(0.15) + 0(0.1) + 0(0.1) = 3,500
(e) EMV with perfect information = 12,000(0.45) + 14,000(0.2) + 22,000(0.15) + 42,000(0.1) + 202,000(0.1) = 35,900
    EVPI = EMV, perfect information – EMVB = 35,900 – 32,400 = 3,500
    The author should not be willing to pay more than $3,500 for a perfect forecast.
(f) Sign with company B to maximize the expected monetary value ($32,400) and minimize the expected opportunity loss ($3,500).
(g) CVA = (28,792 / 25,200)(100%) = 114.25%
    CVB = (57,583 / 32,400)(100%) = 177.73%
(h) Return-to-risk ratio for A = 25,200 / 28,792 = 0.8752
    Return-to-risk ratio for B = 32,400 / 57,583 = 0.5627
(i) Signing with company A will minimize the author’s risk and yield the higher return-to-risk.
(j) Company B has a higher EMV than A, but choosing company B also entails more risk and has a lower return-to-risk ratio than A.
(k) (c)-(j) See the table below.
Probabilities & Payoffs Table:
                P            A         B
Demand 1000     0.3      12000      6000
Demand 2000     0.2      14000     10000
Demand 5000     0.2      20000     22000
Demand 10000    0.1      30000     42000
Demand 50000    0.2     110000    202000
Max                     110000    202000
Min                      12000      6000
Maximax                 202000
Maximin                  12000
Statistics for:                    A           B
Expected Monetary Value        35400       52800
Variance                    1.42E+09    5.68E+09
Standard Deviation           37672.8     75345.6
Coefficient of Variation    1.064203       1.427
Return to Risk Ratio         0.93967    0.700771
Opportunity Loss Table:
                Optimum   Optimum    Alternatives
                Action    Profit          A         B
Demand 1000     A          12000          0      6000
Demand 2000     A          14000          0      4000
Demand 5000     B          22000       2000         0
Demand 10000    B          42000      12000         0
Demand 50000    B         202000      92000         0
Expected Opportunity Loss             20000      2600
EVPI                                             2600
The author’s decision is not affected by the changed probabilities.
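The variance, standard deviation, coefficient of variation, and return-to-risk figures reported for 20.16 follow from the payoff table. A minimal sketch, using the original probabilities (0.45, 0.2, 0.15, 0.1, 0.1); `risk_summary` is a hypothetical helper name, not a PHStat function:

```python
import math

# Probabilities and payoffs copied from the original 20.16 payoff table.
probs = [0.45, 0.20, 0.15, 0.10, 0.10]
payoffs = {
    "A": [12_000, 14_000, 20_000, 30_000, 110_000],
    "B": [6_000, 10_000, 22_000, 42_000, 202_000],
}


def risk_summary(values, probs):
    """Return EMV, standard deviation, CV, and return-to-risk ratio for one action."""
    emv = sum(v * p for v, p in zip(values, probs))
    variance = sum(p * (v - emv) ** 2 for v, p in zip(values, probs))
    sd = math.sqrt(variance)
    return {"emv": emv, "sd": sd, "cv": sd / emv, "rtrr": emv / sd}


summary = {action: risk_summary(row, probs) for action, row in payoffs.items()}
for action, s in summary.items():
    print(f"{action}: EMV = {s['emv']:,.0f}, SD = {s['sd']:,.2f}, "
          f"CV = {s['cv']:.2%}, return-to-risk = {s['rtrr']:.4f}")
```

The helper divides by EMV and by the standard deviation, so it would need a guard for an action with zero EMV or constant payoffs (the #DIV/0! cases in the PHStat output).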
20.17
PHStat output:
Probabilities & Payoffs Table:
             P      Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Sell 100     0.2            1000            200          -2200           -6200
Sell 200     0.5            1000           2000           -400           -4400
Sell 500     0.2            1000           2000           5000            1000
Sell 1000    0.1            1000           2000           5000           10000
Max                         1000           2000           5000           10000
Min                         1000            200          -2200           -6200
Maximax                    10000
Maximin                     1000
Statistics for:             Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Expected Monetary Value             1000           1640            860           -2240
Variance                               0         518400        7808400        22550400
Standard Deviation                     0            720       2794.351        4748.726
Coefficient of Variation               0       0.439024       3.249246        -2.11997
Return to Risk Ratio             #DIV/0!       2.277778       0.307764        -0.47171
Opportunity Loss Table:
             Optimum         Optimum   Alternatives
             Action          Profit    Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Sell 100     Purchase 100      1000               0            800           3200            7200
Sell 200     Purchase 200      2000            1000              0           2400            6400
Sell 500     Purchase 500      5000            4000           3000              0            4000
Sell 1000    Purchase 1000    10000            9000           8000           5000               0
Expected Opportunity Loss                      2200           1560           2340            5440
EVPI                                                          1560
(a) The optimal action based on the maximax criterion is to purchase 1000 trees.
(b) The optimal action based on the maximin criterion is to purchase 100 trees.
(c) Buy 100: EMVA = 1,000(0.2) + 1,000(0.5) + 1,000(0.2) + 1,000(0.1) = 1,000
    Buy 200: EMVB = 200(0.2) + 2,000(0.5) + 2,000(0.2) + 2,000(0.1) = 1,640
    Buy 500: EMVC = –2,200(0.2) – 400(0.5) + 5,000(0.2) + 5,000(0.1) = 860
    Buy 1,000: EMVD = –6,200(0.2) – 4,400(0.5) + 1,000(0.2) + 10,000(0.1) = –2,240
(d) EOLA = 0(0.2) + 1,000(0.5) + 4,000(0.2) + 9,000(0.1) = 2,200
    EOLB = 800(0.2) + 0(0.5) + 3,000(0.2) + 8,000(0.1) = 1,560
    EOLC = 3,200(0.2) + 2,400(0.5) + 0(0.2) + 5,000(0.1) = 2,340
    EOLD = 7,200(0.2) + 6,400(0.5) + 4,000(0.2) + 0(0.1) = 5,440
(e) EMV with perfect information = 1,000(0.2) + 2,000(0.5) + 5,000(0.2) + 10,000(0.1) = 3,200
    EVPI = EMV, perfect information – EMVB = 3,200 – 1,640 = 1,560
    The garden center management should not be willing to pay more than $1,560 for a perfect forecast.
(f) The garden center management should buy 200 trees, action B, to maximize expected monetary value ($1,640) and minimize expected opportunity loss ($1,560).
(g) CVA = (0 / 1,000)(100%) = 0%
    CVB = (720 / 1,640)(100%) = 43.90%
    CVC = (2,794.35 / 860)(100%) = 324.92%
    CVD = (4,748.73 / –2,240)(100%) = –212.00%
(h) Return-to-risk ratio for A = 1,000 / 0 = undefined
    Return-to-risk ratio for B = 1,640 / 720 = 2.2778
    Return-to-risk ratio for C = 860 / 2,794 = 0.3078
    Return-to-risk ratio for D = –2,240 / 4,749 = –0.4717
(i) To minimize risk and maximize the return-to-risk, management should decide to buy 200 trees, action B.
(j) There are no discrepancies.
(k) (c)-(j) See the table below.
PHStat output:
Probabilities & Payoffs Table:
             P      Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Sell 100     0.4            1000            200          -2200           -6200
Sell 200     0.2            1000           2000           -400           -4400
Sell 500     0.2            1000           2000           5000            1000
Sell 1000    0.2            1000           2000           5000           10000
Max                         1000           2000           5000           10000
Min                         1000            200          -2200           -6200
Maximax                    10000
Maximin                     1000
Statistics for:             Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Expected Monetary Value             1000           1280           1040           -1160
Variance                               0         777600       10886400        38102400
Standard Deviation                     0       881.8163       3299.455        6172.714
Coefficient of Variation               0       0.688919       3.172552        -5.32131
Return to Risk Ratio             #DIV/0!       1.451549       0.315204        -0.18792
20.17 cont.
(k) Opportunity Loss Table:
             Optimum   Optimum   Alternatives
             Action    Profit    Purchase 100   Purchase 200   Purchase 500   Purchase 1000
Sell 100     A           1000               0            800           3200            7200
Sell 200     B           2000            1000              0           2400            6400
Sell 500     C           5000            4000           3000              0            4000
Sell 1000    D          10000            9000           8000           5000               0
Expected Opportunity Loss                2800           2520           2760            4960
EVPI                                                    2520
The garden center management should not be willing to pay more than $2,520 for a perfect forecast. The change in probabilities did not result in a different decision. Under these conditions, the garden center should buy 200 trees.
20.18
(a) P(E1 | F) = P(F | E1)P(E1) / [P(F | E1)P(E1) + P(F | E2)P(E2)] = 0.6(0.5) / [0.6(0.5) + 0.4(0.5)] = 0.6
    P(E2 | F) = 1 – P(E1 | F) = 1 – 0.6 = 0.4
(b) EMVA = (0.6)(50) + (0.4)(200) = 110
    EMVB = (0.6)(100) + (0.4)(125) = 110
(c) EOLA = (0.6)(50) + (0.4)(0) = 30
    EOLB = (0.6)(0) + (0.4)(75) = 30
(d) EMV with perfect information = (0.6)(100) + (0.4)(200) = 140
    EVPI = 140 – 110 = 30
    You should not be willing to pay more than $30 for a perfect forecast.
(e) Both have the same EMV and the same EOL.
(f) σA² = (0.6)(–60)² + (0.4)(90)² = 5400, so σA = 73.4847
    σB² = (0.6)(–10)² + (0.4)(15)² = 150, so σB = 12.2474
    CVA = (73.4847 / 110)(100%) = 66.8%
    CVB = (12.2474 / 110)(100%) = 11.1%
(g) Return-to-risk ratio for A = 110 / 73.4847 = 1.497
    Return-to-risk ratio for B = 110 / 12.2474 = 8.981
(h) Action B has a better return-to-risk ratio.
(i) Both have the same EMV, but action B has a better return-to-risk ratio.
20.19
(a) P(E1 | F) = P(F | E1)P(E1) / [P(F | E1)P(E1) + P(F | E2)P(E2) + P(F | E3)P(E3)]
              = 0.2(0.8) / [0.2(0.8) + 0.4(0.1) + 0.4(0.1)] = 0.667 or 2/3
    P(E2 | F) = 0.4(0.1) / [0.2(0.8) + 0.4(0.1) + 0.4(0.1)] = 0.167 or 1/6
    P(E3 | F) = 1 – P(E1 | F) – P(E2 | F) = 1 – 0.667 – 0.167 = 0.167 or 1/6
(b) EMVA = (0.667)(50) + (0.167)(300) + (0.167)(500) = 166.95
    EMVB = (0.667)(10) + (0.167)(100) + (0.167)(200) = 56.77
(c) EOLA = (0.667)(0) + (0.167)(0) + (0.167)(0) = 0
    EOLB = (0.667)(40) + (0.167)(200) + (0.167)(300) = 110.18
(d) EVPI = 0. You should not be willing to pay any money for a perfect forecast.
(e) Action A has a higher EMV and is better for all events.
(f) σA² = (0.667)(13677.3025) + (0.167)(17702.3025) + (0.167)(110922.3025) = 30603.0698, so σA = 174.937
    σB² = (0.667)(2187.4329) + (0.167)(1868.8329) + (0.167)(20514.8329) = 5197.090, so σB = 72.091
    CVA = (174.937 / 166.95)(100%) = 104.78%
    CVB = (72.091 / 56.77)(100%) = 126.99%
(g) Return-to-risk ratio for A = 166.95 / 174.937 = 0.954
    Return-to-risk ratio for B = 56.77 / 72.091 = 0.787
(h) Action A has a better return-to-risk ratio.
(i) Both support action A.
20.20
(a) P(forecast cool | cool weather) = 0.80
    P(forecast warm | warm weather) = 0.70
Probability tree:
Cool (0.4)
    Forecast cool (0.8): 0.32
    Forecast warm (0.2): 0.08
Warm (0.6)
    Forecast cool (0.3): 0.18
    Forecast warm (0.7): 0.42
Joint probability table:
            Forecast cool   Forecast warm   Totals
Cool                 0.32            0.08      0.4
Warm                 0.18            0.42      0.6
Totals               0.5             0.5
Revised probabilities:
P(cool | forecast cool) = 0.32 / 0.5 = 0.64
P(warm | forecast cool) = 0.18 / 0.5 = 0.36
Revised probability tree:
Forecast cool (0.5)
    Cool (0.64): 0.32
    Warm (0.36): 0.18
Forecast warm (0.5)
    Cool (0.16): 0.08
    Warm (0.84): 0.42
(b) EMV(Soft drinks) = 50(0.64) + 60(0.36) = 53.6
    EMV(Ice cream) = 30(0.64) + 90(0.36) = 51.6
    EOL(Soft drinks) = 0(0.64) + 30(0.36) = 10.8
    EOL(Ice cream) = 20(0.64) + 0(0.36) = 12.8
    EMV with perfect information = 50(0.64) + 90(0.36) = 64.4
    EVPI = EMV, perfect information – EMVA = 64.4 – 53.6 = 10.8
    The vendor should not be willing to pay more than $10.80 for a perfect forecast of the weather. The vendor should sell soft drinks to maximize value and minimize loss.
    CV(Soft drinks) = (4.8 / 53.6)(100%) = 8.96%
    CV(Ice cream) = (28.8 / 51.6)(100%) = 55.81%
    Return-to-risk ratio for soft drinks = 53.6 / 4.8 = 11.1667
    Return-to-risk ratio for ice cream = 51.6 / 28.8 = 1.7917
    Based on these revised probabilities, the vendor’s decision changes because of the increased likelihood of cool weather given a forecast for cool. Under these conditions, she should sell soft drinks to maximize the expected monetary value and also to minimize her expected opportunity loss, as well as minimizing risk and maximizing return.
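The Bayes revision in 20.20 can be retraced in a few lines. A sketch under the stated inputs: the priors, forecast accuracies, and payoffs are taken from the solution above, P(forecast cool | warm) = 0.30 is the complement of the given 0.70, and all variable names are illustrative.

```python
# Prior probabilities of the weather and forecast accuracy, as given in 20.20.
priors = {"cool": 0.4, "warm": 0.6}
# P(forecast cool | weather); 0.30 is 1 - P(forecast warm | warm) = 1 - 0.70.
p_forecast_cool = {"cool": 0.80, "warm": 0.30}

# Joint probabilities P(weather and forecast cool), then their total P(forecast cool).
joint = {w: priors[w] * p_forecast_cool[w] for w in priors}
p_cool_forecast = sum(joint.values())

# Revised (posterior) probabilities given a forecast of cool weather.
revised = {w: joint[w] / p_cool_forecast for w in joint}

# Payoffs per event (cool, warm), as used in the EMV lines of part (b).
payoffs = {"Soft drinks": {"cool": 50, "warm": 60},
           "Ice cream": {"cool": 30, "warm": 90}}
emv = {a: sum(revised[w] * x for w, x in row.items()) for a, row in payoffs.items()}

print({w: round(p, 2) for w, p in revised.items()})
print({a: round(v, 1) for a, v in emv.items()})
```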
20.21
(a) P(rosy | decline) = 0.2
    P(rosy | no change) = 0.4
    P(rosy | expanding) = 0.7
Joint probability table:
             Forecast rosy   Forecast gloomy   Totals
Decline               0.06              0.24      0.3
No Change             0.20              0.30      0.5
Expanding             0.14              0.06      0.2
Totals                0.4               0.60
Probability tree:
Decline (0.3)
    Rosy (0.2): 0.06
    Gloomy (0.8): 0.24
No Change (0.5)
    Rosy (0.4): 0.20
    Gloomy (0.6): 0.30
Expanding (0.2)
    Rosy (0.7): 0.14
    Gloomy (0.3): 0.06
(b)
Given a gloomy forecast, revised conditional probabilities are:
P(decline | gloomy) = 0.24 / 0.60 = 0.40
P(no change | gloomy) = 0.30 / 0.60 = 0.50
P(expanding | gloomy) = 0.06 / 0.60 = 0.10
Payoff table, given gloomy forecast:
                  Pr           A            B             C
Decline           0.4        500       –2,000        –7,000
No Change         0.5      1,000        2,000        –1,000
Expanding         0.1      2,000        5,000        20,000
Max                        2,000        5,000        20,000
Min                          500       –2,000        –7,000
Maximax                                               20,000
Maximin                      500
EMV                          900          700        –1,300
σ²                       190,000    5,610,000    58,410,000
σ                         435.89     2,368.54      7,642.64
CV                        48.43%      338.36%      –587.90%
Return-to-risk            2.0647       0.2955       –0.1701
20.21 cont.
(b) Opportunity loss table:
         Optimum   Profit of          Alternative Courses of Action
Event    Action    Optimum Action          A          B          C
1        A                    500          0      2,500      7,500
2        B                  2,000      1,000          0      3,000
3        C                 20,000     18,000     15,000          0
    The optimal action based on the maximax criterion is to choose investment C.
    The optimal action based on the maximin criterion is to choose investment A.
    EOLA = 0(0.4) + 1,000(0.5) + 18,000(0.1) = 2,300
    EOLB = 2,500(0.4) + 0(0.5) + 15,000(0.1) = 2,500
    EOLC = 7,500(0.4) + 3,000(0.5) + 0(0.1) = 4,500
    EMV with perfect information = 500(0.4) + 2,000(0.5) + 20,000(0.1) = 3,200
    EVPI = 3,200 – 900 = 2,300
    The investor should not be willing to pay more than $2,300 for a perfect forecast.
(c) Under the new conditions, action A optimizes the expected monetary value, minimizes the coefficient of variation, and maximizes the investor’s return-to-risk. The probability of decline has increased, which lowered the expected monetary value of actions B and C.
20.22
(a) P(favorable | 1,000) = 0.01
    P(favorable | 2,000) = 0.01
    P(favorable | 5,000) = 0.25
    P(favorable | 10,000) = 0.60
    P(favorable | 50,000) = 0.99
    P(favorable and 1,000) = 0.01(0.45) = 0.0045
    P(favorable and 2,000) = 0.01(0.20) = 0.0020
    P(favorable and 5,000) = 0.25(0.15) = 0.0375
    P(favorable and 10,000) = 0.60(0.10) = 0.0600
    P(favorable and 50,000) = 0.99(0.10) = 0.0990
Joint probability table:
           Favorable   Unfavorable   Totals
1,000         0.0045        0.4455     0.45
2,000         0.0020        0.1980     0.20
5,000         0.0375        0.1125     0.15
10,000        0.0600        0.0400     0.10
50,000        0.0990        0.0010     0.10
Totals        0.2030        0.7970
Given an unfavorable review, the revised conditional probabilities are:
P(1,000 | unfavorable) = 0.4455/0.7970 = 0.5590
P(2,000 | unfavorable) = 0.1980/0.7970 = 0.2484
P(5,000 | unfavorable) = 0.1125/0.7970 = 0.1412
P(10,000 | unfavorable) = 0.0400/0.7970 = 0.0502
P(50,000 | unfavorable) = 0.0010/0.7970 = 0.0013
(b) Payoff table, given unfavorable review:
                  Pr                 A                B
1,000             0.5590        12,000            6,000
2,000             0.2484        14,000           10,000
5,000             0.1412        20,000           22,000
10,000            0.0502        30,000           42,000
50,000            0.0013       110,000          202,000
EMV                          14,658.60         11,315.4
σ²                       31,719,333.50     126877326.67
σ                             5,631.99         11263.98
CV                              38.42%           99.55%
Return-to-risk                  2.6027           1.0046
Opportunity loss table:
                  Pr                 A                B
Event 1           0.5590             0            6,000
Event 2           0.2484             0            4,000
Event 3           0.1412         2,000                0
Event 4           0.0502        12,000                0
Event 5           0.0013        92,000                0
EOL                           1,004.40         4,347.60
(c) The author’s decision is affected by the changed probabilities. Under the new circumstances, signing with company A maximizes the expected monetary value ($14,658.60), minimizes the expected opportunity loss ($1,004.40), minimizes risk with a smaller coefficient of variation and yields a higher return-to-risk than choosing company B.
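The five-event revision in 20.22 follows the same pattern as a two-event one. The sketch below recomputes the revised probabilities and EMVs from the priors and review accuracies listed in part (a); it carries full precision, so its EMVs differ slightly from the printed values, which use probabilities rounded to four decimals. Variable names are illustrative.

```python
# Priors for demand levels 1,000 / 2,000 / 5,000 / 10,000 / 50,000 and
# P(favorable review | demand), copied from part (a).
priors = [0.45, 0.20, 0.15, 0.10, 0.10]
p_favorable = [0.01, 0.01, 0.25, 0.60, 0.99]

# Joint probabilities with an unfavorable review, and their total.
joint_unfavorable = [p * (1 - f) for p, f in zip(priors, p_favorable)]
p_unfavorable = sum(joint_unfavorable)

# Revised probabilities of each demand level given an unfavorable review.
revised = [j / p_unfavorable for j in joint_unfavorable]

payoffs = {
    "A": [12_000, 14_000, 20_000, 30_000, 110_000],
    "B": [6_000, 10_000, 22_000, 42_000, 202_000],
}
emv = {a: sum(r * x for r, x in zip(revised, row)) for a, row in payoffs.items()}

print([round(r, 4) for r in revised])
print({a: round(v, 2) for a, v in emv.items()})
```

With either rounded or unrounded probabilities, company A has the higher EMV after an unfavorable review, which is the reversal noted in part (c).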
20.25
Alternative courses of action represent the choices of the decision-maker. Events are the actual states of the world that can occur.
20.26
A payoff table presents the alternatives in a tabular format, while the decision tree organizes the alternatives and events visually.
20.27
The opportunity loss is the difference between the highest possible profit for an event and the actual profit obtained for an action taken.
20.28
Because the opportunity loss is the difference between the highest possible profit for an event and the actual profit obtained for the action taken, it can never be negative.
20.29
Expected monetary value represents the mean profit of an alternative course of action. Expected opportunity loss represents the mean opportunity loss of the alternative course of action as compared to the action that would be taken if you knew the event that was going to occur.
20.30
The expected value of perfect information represents the maximum amount you would pay to obtain perfect information. It equals the expected opportunity loss of the best alternative course of action, that is, the smallest expected opportunity loss. It is also equal to the expected profit under certainty minus the expected monetary value of the best alternative course of action.
20.31
The expected value of perfect information equals the expected profit under certainty minus the expected monetary value of the best alternative course of action.
20.32
Expected monetary value measures the mean return or profit of an alternative course of action over the long run without regard for the variability in the payoffs under different events. The return-to-risk ratio considers the variability in the payoffs in evaluating which alternative course of action should be chosen.
20.33
Bayes’ theorem uses conditional probabilities to revise the probability of an event in the light of new information.
20.34
A risk averter attempts to reduce risk, while a risk seeker looks for increased return usually associated with greater risk.
20.35
Under many circumstances in the business world, the assumption that each incremental change of profit or loss has the same value as the previous amount of profits attained or losses incurred is not valid. Utilities should be used instead of payoffs under such differential evaluation of incremental profits or losses.
20.36
(a), (c), (g), (h) Payoff table:
Probabilities & Payoffs Table:
               P      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     0.1        6,840       6,340        5,840        5,340
Sell 8,000     0.5        6,840       9,120        8,620        8,120
Sell 10,000    0.3        6,840       9,120       11,400       10,900
Sell 12,000    0.1        6,840       9,120       11,400       13,680
Statistics for:             Buy 6,000     Buy 8,000    Buy 10,000    Buy 12,000
Expected Monetary Value          6840          8842          9454          9232
Variance                            0        695556       3168644       4946176
Standard Deviation                  0           834   1780.068538          2224
Coefficient of Variation            0   0.094322551   0.188287343   0.240901213
Return to Risk Ratio          #DIV/0!   10.60191847   5.311031456   4.151079137
(d) Opportunity loss table:
Opportunity Loss Table:
               Optimum      Optimum   Alternatives
               Action       Profit    Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     Buy 6,000      6840            0         500         1000         1500
Sell 8,000     Buy 8,000      9120         2280           0          500         1000
Sell 10,000    Buy 10,000    11400         4560        2280            0          500
Sell 12,000    Buy 12,000    13680         6840        4560         2280            0
Expected Opportunity Loss                  3192        1190          578          800
EVPI                                                                 578
20.36 cont.
Decision tree payoffs (actions A through D, events 1 through 4):
       Event 1   Event 2   Event 3   Event 4
A        6,840     6,840     6,840     6,840
B        6,340     9,120     9,120     9,120
C        5,840     8,620    11,400    11,400
D        5,340     8,120    10,900    13,680
(e) EVPI = $578. The management of Shop-Quick Supermarkets should not be willing to pay more than $578 for a perfect forecast.
(f) To maximize the expected monetary value and minimize expected opportunity loss, the management should buy 10,000 loaves.
(i) Action B (buying 8,000 loaves) maximizes the return-to-risk and, while buying 6,000 loaves reduces the coefficient of variation to zero, action B has a smaller coefficient of variation than C or D.
(j) The results depend on what your objective is.
(k) (a), (c), (g), (h) Payoff table:
Probabilities & Payoffs Table:
               P      Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     0.3        6,840       6,340        5,840        5,340
Sell 8,000     0.4        6,840       9,120        8,620        8,120
Sell 10,000    0.2        6,840       9,120       11,400       10,900
Sell 12,000    0.1        6,840       9,120       11,400       13,680
Statistics for:             Buy 6,000     Buy 8,000    Buy 10,000    Buy 12,000
Expected Monetary Value          6840          8286          8620          8398
Variance                            0       1622964       4637040       6878276
Standard Deviation                  0   1273.956043    2153.37874   2622.646755
Coefficient of Variation            0   0.153748014   0.249811919   0.312294208
Return to Risk Ratio          #DIV/0!   6.504149059   4.003011564   3.202108704
20.36 cont.
(k) (d) Opportunity loss table:
Opportunity Loss Table:
               Optimum      Optimum   Alternatives
               Action       Profit    Buy 6,000   Buy 8,000   Buy 10,000   Buy 12,000
Sell 6,000     Buy 6,000      6840            0         500         1000         1500
Sell 8,000     Buy 8,000      9120         2280           0          500         1000
Sell 10,000    Buy 10,000    11400         4560        2280            0          500
Sell 12,000    Buy 12,000    13680         6840        4560         2280            0
Expected Opportunity Loss                  2508        1062          728          950
EVPI                                                                 728
    (e) EVPI = $728. The management of Shop-Quick Supermarkets should not be willing to pay more than $728 for a perfect forecast.
    (f) To maximize the expected monetary value and minimize expected opportunity loss, the management should buy 10,000 loaves.
    (i) Action B (buying 8,000 loaves) maximizes the return-to-risk and, while buying 6,000 loaves reduces the coefficient of variation to zero, action B has a smaller coefficient of variation than C or D.
    (j) The results depend on what your objective is.
20.37
(a), (d), (g) Payoff table:
                                  A:          B:
Event            Pr          Install   Do Not Install
1     50         0.40       –50,000                0
2     100        0.30        50,000                0
3     200        0.30       250,000                0
EMV                          70,000                0
σ                           124,900                0
CV                          178.43%        undefined
Return-to-risk               0.5604        undefined
(b) Decision tree:
A (Install)
    Event 1: –50,000
    Event 2: 50,000
    Event 3: 250,000
B (Do Not Install)
    Event 1: 0
    Event 2: 0
    Event 3: 0
(c), (e) Opportunity loss table:
                                  A:          B:
Event            Pr          Install   Do Not Install
1     50         0.40        50,000                0
2     100        0.30             0           50,000
3     200        0.30             0          250,000
EOL                          20,000           90,000
20.37 cont.
(f) EVPI = $20,000. The owner of the home heating-oil delivery company should not be willing to pay more than $20,000 for a perfect forecast.
(h) To maximize the expected monetary value and minimize the expected opportunity loss, the owner should offer solar heating.
(i) Payoff table:
                                  A:          B:
                 Pr          Install   Do Not Install
50               0.40      –100,000                0
100              0.30             0                0
200              0.30       200,000                0
EMV                          20,000                0
σ                           124,900                0
CV                          624.50%        undefined
Return-to-risk               0.1601        undefined
Opportunity loss table:
                                  A:          B:
                 Pr          Install   Do Not Install
50               0.40       100,000                0
100              0.30             0                0
200              0.30             0          200,000
EOL                          40,000           60,000
EVPI = $40,000. The owner of the home heating oil delivery company should not be willing to pay more than $40,000 for a perfect forecast. Although individual values are different, the owner’s decision is not affected by the altered start-up costs.
20.38
(a) Decision tree:
A (new packaging)
    Event 1: –4,000,000
    Event 2: 1,000,000
    Event 3: 5,000,000
B (old packaging)
    Event 1: 0
    Event 2: 0
    Event 3: 0
(c), (f) Payoff table:
Event               Pr              New          Old
1   Weak            0.3      –4,000,000            0
2   Moderate        0.6       1,000,000            0
3   Strong          0.1       5,000,000            0
EMV                            –100,000            0
σ                             2,808,914            0
CV                           –2,808.91%    undefined
Return-to-risk                  –0.0356    undefined
20.38 cont.
(b), (d), (e) Opportunity loss table:
                    Pr              New          Old
Weak                0.3       4,000,000            0
Moderate            0.6               0    1,000,000
Strong              0.1               0    5,000,000
EOL                           1,200,000    1,100,000
EVPI = $1,100,000. The product manager should not be willing to pay more than $1,100,000 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and to minimize expected opportunity loss and risk.
(h) (c), (f) Payoff table:
                    Pr              New          Old
Weak                0.6      –4,000,000            0
Moderate            0.3       1,000,000            0
Strong              0.1       5,000,000            0
EMV                          –1,600,000            0
σ                             3,136,877            0
CV                             –196.05%    undefined
Return-to-risk                  –0.5101    undefined
(b), (d), (e) Opportunity loss table:
                    Pr              New          Old
Weak                0.6       4,000,000            0
Moderate            0.3               0    1,000,000
Strong              0.1               0    5,000,000
EOL                           2,400,000      800,000
EVPI = $800,000. The product manager should not be willing to pay more than $800,000 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and to minimize expected opportunity loss and risk.
(i) (c), (f) Payoff table:
                    Pr              New          Old
Weak                0.1      –4,000,000            0
Moderate            0.3       1,000,000            0
Strong              0.6       5,000,000            0
EMV                           2,900,000            0
σ                         2,913,760.457            0
CV                              100.47%    undefined
Return-to-risk                   0.9953    undefined
(b), (d), (e) Opportunity loss table:
                    Pr              New          Old
Weak                0.1       4,000,000            0
Moderate            0.3               0    1,000,000
Strong              0.6               0    5,000,000
EOL                             400,000    3,300,000
EVPI = $400,000. The product manager should not be willing to pay more than $400,000 for a perfect forecast.
(g) The product manager should use the new packaging to maximize expected monetary value and to minimize expected opportunity loss.
(j) P(Sales decreased | weak response) = 0.6
    P(Sales stayed same | weak response) = 0.3
    P(Sales increased | weak response) = 0.1
    P(Sales decreased | moderate response) = 0.2
    P(Sales stayed same | moderate response) = 0.4
    P(Sales increased | moderate response) = 0.4
    P(Sales decreased | strong response) = 0.05
    P(Sales stayed same | strong response) = 0.35
    P(Sales increased | strong response) = 0.6
    P(Sales decreased and weak response) = 0.6(0.3) = 0.18
    P(Sales stayed same and weak response) = 0.3(0.3) = 0.09
    P(Sales increased and weak response) = 0.1(0.3) = 0.03
    P(Sales decreased and moderate response) = 0.2(0.6) = 0.12
    P(Sales stayed same and moderate response) = 0.4(0.6) = 0.24
    P(Sales increased and moderate response) = 0.4(0.6) = 0.24
    P(Sales decreased and strong response) = 0.05(0.1) = 0.005
    P(Sales stayed same and strong response) = 0.35(0.1) = 0.035
    P(Sales increased and strong response) = 0.6(0.1) = 0.06
Joint probability table:
                       Sales       Sales        Sales
              Pr       Decrease    Stay Same    Increase
Weak          0.3      0.180       0.090        0.030
Moderate      0.6      0.120       0.240        0.240
Strong        0.1      0.005       0.035        0.060
Total                  0.305       0.365        0.330
Given the sales stayed the same, the revised conditional probabilities are:
P(weak response | sales stayed same) = 0.09 / 0.365 = 0.2466
P(moderate response | sales stayed same) = 0.24 / 0.365 = 0.6575
P(strong response | sales stayed same) = 0.035 / 0.365 = 0.0959
(k) (c), (f) Payoff table:
                    Pr              New          Old
Weak                0.2466   –4,000,000            0
Moderate            0.6575    1,000,000            0
Strong              0.0959    5,000,000            0
EMV                             150,600            0
σ                         2,641,575.219            0
CV                            1,754.03%    undefined
Return-to-risk                   0.0570    undefined
20.38 cont.
(k) (b), (d), (e) Opportunity loss table:
                    Pr              New          Old
Weak                0.2466    4,000,000            0
Moderate            0.6575            0    1,000,000
Strong              0.0959            0    5,000,000
EOL                             986,400    1,137,000
EVPI = $986,400. The product manager should not be willing to pay more than $986,400 for a perfect forecast.
(g) The product manager should use the new packaging to maximize expected monetary value and to minimize expected opportunity loss.
(l) Given the sales decreased, the revised conditional probabilities are:
P(weak response | sales decreased) = 0.18 / 0.305 = 0.5902
P(moderate response | sales decreased) = 0.12 / 0.305 = 0.3934
P(strong response | sales decreased) = 0.005 / 0.305 = 0.0164
(m) (c), (f) Payoff table:
                    Pr              New          Old
Weak                0.5902   –4,000,000            0
Moderate            0.3934    1,000,000            0
Strong              0.0164    5,000,000            0
EMV                          –1,885,400            0
σ                         2,586,864.287            0
CV                             –137.21%    undefined
Return-to-risk                  –0.7288    undefined
(b), (d), (e) Opportunity loss table:
                    Pr              New          Old
Weak                0.5902    4,000,000            0
Moderate            0.3934            0    1,000,000
Strong              0.0164            0    5,000,000
EOL                           2,360,800      475,400
EVPI = $475,400. The product manager should not be willing to pay more than $475,400 for a perfect forecast.
(g) The product manager should continue to use the old packaging to maximize expected monetary value and minimize expected opportunity loss.
20.39
(a) Decision tree:
A (Garden Service)
    Event 1: –50,000
    Event 2: 60,000
    Event 3: 130,000
    Event 4: 300,000
B (No Garden Service)
    Event 1: 0
    Event 2: 0
    Event 3: 0
    Event 4: 0
(c), (f) Payoff table:
                                    A:                B: No
Event              Pr        Garden Service    Garden Service
1   Very low       0.2             –50,000                 0
2   Low            0.5              60,000                 0
3   Moderate       0.2             130,000                 0
4   High           0.1             300,000                 0
EMV                                 76,000                 0
σ                                94,361.01                 0
CV                                 124.16%         undefined
Return-to-risk                      0.8054         undefined
(b), (d) Opportunity loss table:
                                    A:                B: No
Event              Pr        Garden Service    Garden Service
1   Very low       0.2              50,000                 0
2   Low            0.5                   0            60,000
3   Moderate       0.2                   0           130,000
4   High           0.1                   0           300,000
EOL                                 10,000            86,000
(e) EVPI = $10,000. The entrepreneur should not be willing to pay more than $10,000 for a perfect forecast.
(g) To maximize the expected monetary value and minimize expected opportunity loss, the entrepreneur should provide gardening service.
(h) Given 3 events of interest out of 20, the binomial probabilities and their related revised conditional probabilities are:
                        Binomial         Revised Conditional
              Pr        Probabilities    Probabilities
Very low      0.2       0.2054           0.2054/0.6019 = 0.3412
Low           0.5       0.0011           0.0011/0.6019 = 0.0018
Moderate      0.2       0.2054           0.2054/0.6019 = 0.3412
High          0.1       0.1901           0.1901/0.6019 = 0.3158
Total                   0.6019
20.39 cont.
(i) (c), (f) Payoff table:
                                    A:                B: No
                   Pr        Garden Service    Garden Service
Very low           0.3412          –50,000                 0
Low                0.0018           60,000                 0
Moderate           0.3412          130,000                 0
High               0.3158          300,000                 0
EMV                              122,159.8                 0
σ                                141,884.9                 0
CV                                 116.15%         undefined
Return-to-risk                      0.8610         undefined
(b), (d) Opportunity loss table:
                                    A:                B: No
                   Pr        Garden Service    Garden Service
Very low           0.3412           50,000                 0
Low                0.0018                0            60,000
Moderate           0.3412                0           130,000
High               0.3158                0           300,000
EOL                              17,062.63         139,222.5
(e) EVPI = $17,062.63. The entrepreneur should not be willing to pay more than $17,062.63 for a perfect forecast.
(g) To maximize the expected monetary value and minimize expected opportunity loss, the entrepreneur should provide gardening service.
20.40
(a) Decision tree:
A (Do Not Call Mechanic)
    Event 1: 20
    Event 2: 100
    Event 3: 200
    Event 4: 400
B (Call Mechanic)
    Event 1: 100
    Event 2: 100
    Event 3: 100
    Event 4: 100
(c), (e), (f) Payoff table:*
                               A: Do Not         B:
Event              Pr        Call Mechanic   Call Mechanic
1   Very low       0.25                 20             100
2   Low            0.25                100             100
3   Moderate       0.25                200             100
4   High           0.25                400             100
EMV                                    180             100
σ                                      142               0
CV                                  78.96%               0
Return-to-risk                      1.2665       undefined
*Note: The payoff here is cost and not profit. The opportunity cost is therefore calculated as the difference between the payoff and the minimum in the same row.
(b), (d) Opportunity loss table:
                               A: Do Not         B:
                   Pr        Call Mechanic   Call Mechanic
Very low           0.25                  0              80
Low                0.25                  0               0
Moderate           0.25                100               0
High               0.25                300               0
EOL                                    100              20
(e) EVPI = $20. The manufacturer should not be willing to pay more than $20 for the information about which event will occur.
(g) We want to minimize the expected monetary value because it is a cost. To minimize the expected monetary value, call the mechanic.
(h) Given 2 defectives out of 15, the binomial probabilities and their related revised conditional probabilities are:
                        Binomial         Revised Conditional
              Pr        Probabilities    Probabilities
Very low      0.01      0.0092           0.0092/0.6418 = 0.0143
Low           0.05      0.1348           0.1348/0.6418 = 0.2100
Moderate      0.10      0.2669           0.2669/0.6418 = 0.4159
High          0.20      0.2309           0.2309/0.6418 = 0.3598
Total                   0.6418
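The binomial probabilities and revised probabilities in the 20.40 table can be reproduced as follows. This sketch reads the Pr column (0.01, 0.05, 0.10, 0.20) as the defect rate under each event and uses the equal prior of 0.25 per event from the payoff table; both readings are assumptions drawn from the tables above, and the names are illustrative. With equal priors, the prior cancels, which is why the table simply divides each binomial probability by their sum.

```python
from math import comb

n, defectives = 15, 2
# Assumed defect rate under each event (the Pr column of the table above).
defect_rates = {"Very low": 0.01, "Low": 0.05, "Moderate": 0.10, "High": 0.20}
prior = 0.25  # equal prior probability for each event, as in the payoff table


def binomial_pmf(k, n, p):
    """P(X = k) for a binomial(n, p) random variable."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


likelihood = {e: binomial_pmf(defectives, n, p) for e, p in defect_rates.items()}
joint = {e: prior * lk for e, lk in likelihood.items()}
total = sum(joint.values())
revised = {e: j / total for e, j in joint.items()}

for event in defect_rates:
    print(f"{event}: binomial = {likelihood[event]:.4f}, revised = {revised[event]:.4f}")
```

Results can differ from the printed table in the fourth decimal place because the table divides already-rounded binomial probabilities.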
(i) (c), (e), (f) Payoff table:
                               A: Do Not         B:
                   Pr        Call Mechanic   Call Mechanic
Very low           0.0143               20             100
Low                0.2100              100             100
Moderate           0.4159              200             100
High               0.3598              400             100
EMV                               248.3860             100
σ                                      121               0
CV                                  48.68%               0
Return-to-risk                      2.0544       undefined
20.40 cont.
(i)
(b), (d) Opportunity loss table:
Pr
A: Do Not
B:
Call Mechanic
Call Mechanic
Very low
0.0143
80
0
Low
0.2100
0
0
Moderate
0.4159
0
100
High
0.3598
0
300
1.1440
149.53
EOL
(e) EVPI = $1.14. The manufacturer should not be willing to pay more than $1.14 for the information about which event will occur.
(g) We want to minimize the expected monetary value because it is a cost. To minimize the expected monetary value, call the mechanic.
Online Sections
Chapter 5
5.45
PHstat output: Probabilities & Outcomes:
Weight Assigned to X
P
X
Y
0.4
100
200
0.6
200
100
0.5
Statistics E(X)
160
E(Y)
140
Variance(X)
2400
Standard Deviation(X)
48.98979
Variance(Y)
2400
Standard Deviation(Y)
48.98979
Covariance(XY)
-2400
Variance(X+Y)
0
Standard Deviation(X+Y)
0
Portfolio Management Weight Assigned to X
0.5
Weight Assigned to Y
0.5
Portfolio Expected Return
150
Portfolio Risk
0
(a) E(X) = (0.4)($100) + (0.6)($200) = $160
E(Y) = (0.4)($200) + (0.6)($100) = $140
(b) \( \sigma_X = \sqrt{(0.4)(100 - 160)^2 + (0.6)(200 - 160)^2} = \sqrt{2400} = \$48.99 \)
\( \sigma_Y = \sqrt{(0.4)(200 - 140)^2 + (0.6)(100 - 140)^2} = \sqrt{2400} = \$48.99 \)
(c) \( \sigma_{XY} = (0.4)(100 - 160)(200 - 140) + (0.6)(200 - 160)(100 - 140) = -2400 \)
(d) E(X + Y) = E(X) + E(Y) = $160 + $140 = $300
5.46
PHStat output: Probabilities & Outcomes:
Weight Assigned to X
P
X
Y
0.2
-100
50
0.4
50
30
0.3
200
20
0.1
300
20
0.5
Statistics E(X)
90
E(Y)
30
Variance(X)
15900
Standard Deviation(X)
126.0952
Variance(Y)
120
Standard Deviation(Y)
10.95445
Covariance(XY)
-1300
Variance(X+Y)
13420
Standard Deviation(X+Y)
115.8447
Portfolio Management Weight Assigned to X
0.5
Weight Assigned to Y
0.5
Portfolio Expected Return
60
Portfolio Risk
57.92236
(a) E(X) = (0.2)(–$100) + (0.4)($50) + (0.3)($200) + (0.1)($300) = $90
E(Y) = (0.2)($50) + (0.4)($30) + (0.3)($20) + (0.1)($20) = $30
(b) \( \sigma_X = \sqrt{(0.2)(-100 - 90)^2 + (0.4)(50 - 90)^2 + (0.3)(200 - 90)^2 + (0.1)(300 - 90)^2} = \sqrt{15900} = 126.10 \)
\( \sigma_Y = \sqrt{(0.2)(50 - 30)^2 + (0.4)(30 - 30)^2 + (0.3)(20 - 30)^2 + (0.1)(20 - 30)^2} = \sqrt{120} = 10.95 \)
(c) \( \sigma_{XY} = (0.2)(-100 - 90)(50 - 30) + (0.4)(50 - 90)(30 - 30) + (0.3)(200 - 90)(20 - 30) + (0.1)(300 - 90)(20 - 30) = -1300 \)
(d) E(X + Y) = E(X) + E(Y) = $90 + $30 = $120
5.47
(a) E(P) = (0.4)($50) + (0.6)($100) = $80
(b) \( \sigma_P = \sqrt{(0.4)^2(9000) + (0.6)^2(15000) + 2(0.4)(0.6)(7500)} = 102.18 \)
5.48
(a) E(total time) = E(time waiting) + E(time served) = 4 + 5.5 = 9.5 minutes
(b) \( \sigma(\text{total time}) = \sqrt{1.2^2 + 1.5^2} = 1.9209 \) minutes
(c)
5.49
(a) E(P) = 0.3(65) + 0.7(35) = $44
\( \sigma_P = \sqrt{(0.3)^2(37{,}525) + (0.7)^2(11{,}025) + 2(0.3)(0.7)(-19{,}275)} = \$26.15 \)
\( CV = \dfrac{\sigma_P}{E(P)}(100\%) = \dfrac{26.15}{44}(100\%) = 59.44\% \)
(b) E(P) = 0.7(65) + 0.3(35) = $56
\( \sigma_P = \sqrt{(0.7)^2(37{,}525) + (0.3)^2(11{,}025) + 2(0.7)(0.3)(-19{,}275)} = \$106.23 \)
\( CV = \dfrac{\sigma_P}{E(P)}(100\%) = \dfrac{106.23}{56}(100\%) = 189.69\% \)
(c) Investing 30% in the Dow Jones index and 70% in the weak-economy fund will yield the lowest risk per unit average return at 59.44%. This will be the investment recommendation if you are a risk-averse investor.
5.50
PHStat output for (a)-(c): Covariance Analysis Probabilities & Outcomes:
P
X
Y
0.1
-100
50
0.3
0
150
0.3
80
-20
0.3
150
-100
Statistics E(X)
59
E(Y)
14
Variance(X)
6189
Standard Deviation(X)
78.6702
Variance(Y)
9924
Standard Deviation(Y)
99.61928
Covariance(XY)
-6306
Variance(X+Y)
3501
Standard Deviation(X+Y)
59.16925
(a) \( E(X) = \sum_{i=1}^{N} x_i P(x_i) = 59 \)
\( E(Y) = \sum_{i=1}^{N} y_i P(y_i) = 14 \)
(b) \( \sigma_X = \sqrt{\sum_{i=1}^{N} [x_i - E(X)]^2 P(x_i)} = 78.6702 \)
\( \sigma_Y = \sqrt{\sum_{i=1}^{N} [y_i - E(Y)]^2 P(y_i)} = 99.62 \)
(c) \( \sigma_{XY} = \sum_{i=1}^{N} [x_i - E(X)][y_i - E(Y)] P(x_i, y_i) = -6306 \)
(d)
Stock X gives the investor a lower standard deviation while yielding a higher expected return so the investor should select stock X.
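The statistics in the PHStat output for 5.50 can be reproduced with a short Python sketch. The function name `summarize` and the dictionary keys are illustrative choices, not part of PHStat.

```python
from math import sqrt

def summarize(probs, xs, ys):
    """Expected values, variances, and covariance of two discrete variables
    that share one set of outcome probabilities."""
    ex = sum(p * x for p, x in zip(probs, xs))
    ey = sum(p * y for p, y in zip(probs, ys))
    var_x = sum(p * (x - ex) ** 2 for p, x in zip(probs, xs))
    var_y = sum(p * (y - ey) ** 2 for p, y in zip(probs, ys))
    cov_xy = sum(p * (x - ex) * (y - ey) for p, x, y in zip(probs, xs, ys))
    return {
        "E(X)": ex, "E(Y)": ey,
        "Var(X)": var_x, "Var(Y)": var_y,
        "SD(X)": sqrt(var_x), "SD(Y)": sqrt(var_y),
        "Cov(XY)": cov_xy,
        "Var(X+Y)": var_x + var_y + 2 * cov_xy,
    }

# Probabilities and outcomes from the 5.50 table
stats = summarize([0.1, 0.3, 0.3, 0.3], [-100, 0, 80, 150], [50, 150, -20, -100])
```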
5.51
(a)
PHStat output: Probabilities & Outcomes:
P
Weight Assigned to X
X
Y
0.1
-100
50
0.3
0
150
0.3
80
-20
0.3
150
-100
0.3
Statistics E(X)
59
E(Y)
14
Variance(X)
6189
Standard Deviation(X) Variance(Y)
78.6702 9924
Standard Deviation(Y)
99.61928
Covariance(XY)
-6306
Variance(X+Y)
3501
Standard Deviation(X+Y)
59.16925
Portfolio Management Weight Assigned to X
0.3
Weight Assigned to Y
0.7
Portfolio Expected Return
27.5
Portfolio Risk
52.64266
E(P) = $27.5, \( \sigma_P = 52.64 \)
\( CV = \dfrac{\sigma_P}{E(P)}(100\%) = \dfrac{52.64}{27.5}(100\%) = 191.42\% \)
5.51 cont.
(b)
PHStat output: Probabilities & Outcomes:
P
Weight Assigned to X
X
Y
0.1
-100
50
0.3
0
150
0.3
80
-20
0.3
150
-100
0.5
Statistics E(X)
59
E(Y)
14
Variance(X)
6189
Standard Deviation(X) Variance(Y)
78.6702 9924
Standard Deviation(Y)
99.61928
Covariance(XY)
-6306
Variance(X+Y)
3501
Standard Deviation(X+Y)
59.16925
Portfolio Management Weight Assigned to X
0.5
Weight Assigned to Y
0.5
Portfolio Expected Return
36.5
Portfolio Risk
29.58462
E(P) = $36.5, \( \sigma_P = 29.59 \)
\( CV = \dfrac{\sigma_P}{E(P)}(100\%) = \dfrac{29.59}{36.5}(100\%) = 81.07\% \)
5.51 cont.
(c)
PHStat output: Probabilities & Outcomes:
P
Weight Assigned to X
X
Y
0.1
-100
50
0.3
0
150
0.3
80
-20
0.3
150
-100
0.7
Statistics E(X)
59
E(Y)
14
Variance(X)
6189
Standard Deviation(X)
78.6702
Variance(Y)
9924
Standard Deviation(Y)
99.61928
Covariance(XY)
-6306
Variance(X+Y)
3501
Standard Deviation(X+Y)
59.16925
Portfolio Management Weight Assigned to X
0.7
Weight Assigned to Y
0.3
Portfolio Expected Return
45.5
Portfolio Risk
35.73863
E(P) = $45.5, \( \sigma_P = 35.74 \)
\( CV = \dfrac{\sigma_P}{E(P)}(100\%) = \dfrac{35.74}{45.5}(100\%) = 78.55\% \)
(d)
Based on the results of (a)-(c), you should recommend a portfolio with 70% of stock X and 30% of stock Y because it has the lowest risk per unit average return.
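The comparison in (a)-(c) can be checked with a small Python sketch. The inputs are the statistics reported in the PHStat output above; the function name `portfolio` is a hypothetical helper.

```python
from math import sqrt

def portfolio(w, ex, ey, var_x, var_y, cov_xy):
    """Expected return, risk, and coefficient of variation (in percent)
    for weight w on X and 1 - w on Y."""
    e_p = w * ex + (1 - w) * ey
    sd_p = sqrt(w**2 * var_x + (1 - w)**2 * var_y + 2 * w * (1 - w) * cov_xy)
    return e_p, sd_p, sd_p / e_p * 100

# E(X) = 59, E(Y) = 14, Var(X) = 6189, Var(Y) = 9924, Cov(XY) = -6306
results = {w: portfolio(w, 59, 14, 6189, 9924, -6306) for w in (0.3, 0.5, 0.7)}

# The recommended weight is the one with the smallest coefficient of variation
best_w = min(results, key=lambda w: results[w][2])
```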
5.52
PHStat output: Probabilities & Outcomes:
P
X
Y
0.1
-50
-100
0.3
20
50
0.4
100
130
0.2
150
200
Statistics E(X)
71
E(Y)
97
Variance(X)
3829
Standard Deviation(X)
61.87891
Variance(Y)
7101
Standard Deviation(Y)
84.26743
Covariance(XY)
5113
Variance(X+Y)
21156
Standard Deviation(X+Y)
145.451
(a) E(X) = $71, E(Y) = $97
(b) \( \sigma_X = 61.88 \), \( \sigma_Y = 84.27 \)
5.53
(a) PHStat output:
E(P) = $106.84, \( \sigma_P = \$87.7145 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 82.10\% \)
(b) PHStat output:
E(P) = $94.4, \( \sigma_P = \$73.1575 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 77.50\% \)
(c) PHStat output:
E(P) = $81.96, \( \sigma_P = \$61.1439 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 74.60\% \)
(d) Based on the results of (a)-(c), you should recommend a portfolio with 70% of Black Swan fund and 30% of Good Times fund because it has the lowest risk per unit average return as measured by the coefficient of variation.
5.54
PHStat output:
Let X = corporate bond fund, Y = common stock fund.
(a) E(X) = $66.2, E(Y) = $63.01
(b) \( \sigma_X = \$57.2150 \), \( \sigma_Y = \$195.2172 \)
(c) According to the probability of 0.01, it is highly unlikely that you will lose $999 of every $1,000 invested.
5.55
(a) PHStat output:
E(P) = $63.967, \( \sigma_P = 153.2659 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 239.60\% \)
(b) PHStat output:
E(P) = $64.61, \( \sigma_P = \$125.4161 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 194.13\% \)
(c) PHStat output:
E(P) = $65.24, \( \sigma_P = \$97.75455 \), \( CV = \dfrac{\sigma_P}{E(P)}(100\%) = 149.83\% \)
(d) Since investing $700 in the corporate bond fund and $300 in the common stock fund has the lowest coefficient of variation at 149.83%, you should recommend this portfolio.
5.56
(a) PHStat output: Hypergeometric Probabilities
Data
Sample size
4
No. of successes in population
5
Population size
10
Hypergeometric Probabilities Table X
P(X)
3
0.238095
\( P(X = 3) = \dfrac{\binom{5}{3}\binom{10-5}{4-3}}{\binom{10}{4}} = \dfrac{\frac{5!}{3!\,2!}\cdot\frac{5!}{1!\,4!}}{\frac{10!}{4!\,6!}} = \dfrac{(10)(5)}{210} = \dfrac{5}{21} = 0.2381 \)
5.56 cont.
(b)
PHStat output: Hypergeometric Probabilities Data Sample size
4
No. of successes in population
3
Population size
6
Hypergeometric Probabilities Table X
P(X)
1
0.2
2
0.6
3
0.2
\( P(X = 1) = \dfrac{\binom{3}{1}\binom{6-3}{4-1}}{\binom{6}{4}} = \dfrac{(3)(1)}{15} = \dfrac{1}{5} = 0.2 \)
(c)
Partial PHStat output: Hypergeometric Probabilities Data Sample size
5
No. of successes in population
3
Population size
12
Hypergeometric Probabilities Table X
P(X)
0
0.159091
\( P(X = 0) = \dfrac{\binom{3}{0}\binom{12-3}{5-0}}{\binom{12}{5}} = \dfrac{(1)(126)}{792} = \dfrac{7}{44} = 0.1591 \)
5.56 cont.
(d)
Partial PHStat output:
Hypergeometric Probabilities
Data Sample size
3
No. of successes in population
3
Population size
10
Hypergeometric Probabilities Table X
P(X)
3
0.008333
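\( P(X = 3) = \dfrac{\binom{3}{3}\binom{10-3}{3-3}}{\binom{10}{3}} = \dfrac{(1)(1)}{120} = 0.0083 \)
5.57
(a) \( \mu = \dfrac{nE}{N} = \dfrac{4(5)}{10} = 2 \), \( \sigma = \sqrt{\dfrac{nE(N-E)}{N^2}}\sqrt{\dfrac{N-n}{N-1}} = 0.8165 \)
(b) \( \mu = \dfrac{nE}{N} = \dfrac{4(3)}{6} = 2 \), \( \sigma = \sqrt{\dfrac{nE(N-E)}{N^2}}\sqrt{\dfrac{N-n}{N-1}} = 0.6325 \)
(c) \( \mu = \dfrac{nE}{N} = \dfrac{5(3)}{12} = 1.25 \), \( \sigma = \sqrt{\dfrac{nE(N-E)}{N^2}}\sqrt{\dfrac{N-n}{N-1}} = 0.7724 \)
(d) \( \mu = \dfrac{nE}{N} = \dfrac{3(3)}{10} = 0.9 \), \( \sigma = \sqrt{\dfrac{nE(N-E)}{N^2}}\sqrt{\dfrac{N-n}{N-1}} = 0.7 \)
5.58
(a) Partial PHStat output: Hypergeometric Probabilities Data Sample size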
6
No. of successes in population
25
Population size
100
Hypergeometric Probabilities Table X
P(X)
0
0.168918
1
0.361968
2
0.305888
3
0.130286
4
0.029448
5
0.003343
6
0.000149
5.58 cont.
(a) If n = 6, E = 25, and N = 100,
\( P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \dfrac{\binom{25}{0}\binom{100-25}{6-0}}{\binom{100}{6}} + \dfrac{\binom{25}{1}\binom{100-25}{6-1}}{\binom{100}{6}} \right] = 1 - [0.1689 + 0.3620] = 0.4691 \)
(b) Partial PHStat output: Hypergeometric Probabilities Data Sample size
6
No. of successes in population
30
Population size
100
Hypergeometric Probabilities Table X
P(X)
0
0.109992
1
0.304593
2
0.33459
3
0.186438
4
0.05552
5
0.008368
6
0.000498
If n = 6, E = 30, and N = 100,
\( P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \dfrac{\binom{30}{0}\binom{100-30}{6-0}}{\binom{100}{6}} + \dfrac{\binom{30}{1}\binom{100-30}{6-1}}{\binom{100}{6}} \right] = 1 - [0.1100 + 0.3046] = 0.5854 \)
(c) Partial PHStat output: Hypergeometric Probabilities
Data Sample size
6
No. of successes in population
5
Population size
100
Hypergeometric Probabilities Table X
P(X)
0
0.729085
1
0.243028
2
0.026706
3
0.001161
4
1.87E-05
5
7.97E-08
5.58 cont.
(c) If n = 6, E = 5, and N = 100,
\( P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \dfrac{\binom{5}{0}\binom{100-5}{6-0}}{\binom{100}{6}} + \dfrac{\binom{5}{1}\binom{100-5}{6-1}}{\binom{100}{6}} \right] = 1 - [0.7291 + 0.2430] = 0.0279 \)
(d) Partial PHStat output: Hypergeometric Probabilities Data Sample size
6
No. of successes in population
10
Population size
100
Hypergeometric Probabilities Table X
P(X)
0
0.522305
1
0.368686
2
0.096458
3
0.011826
4
0.000706
5
1.9E-05
6
1.76E-07
If n = 6, E = 10, and N = 100,
\( P(X \ge 2) = 1 - [P(X = 0) + P(X = 1)] = 1 - \left[ \dfrac{\binom{10}{0}\binom{100-10}{6-0}}{\binom{100}{6}} + \dfrac{\binom{10}{1}\binom{100-10}{6-1}}{\binom{100}{6}} \right] = 1 - [0.5223 + 0.3687] = 0.1090 \)
(e) The probability that the entire group will be audited is very sensitive to the true number of improper returns in the population. If the true number is very low (E = 5), the probability is very low (0.0279). When the true number is increased by a factor of six (E = 30), the probability the group will be audited increases by a factor of almost 21 (0.5854).
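The sensitivity noted in (e) can be reproduced with a few lines of Python. This is a sketch using only the standard library; `hypergeom_pmf` and `prob_at_least_two` are hypothetical helper names.

```python
from math import comb

def hypergeom_pmf(x, n, E, N):
    """P(X = x) for a sample of n drawn without replacement from a population
    of N that contains E events of interest."""
    return comb(E, x) * comb(N - E, n - x) / comb(N, n)

def prob_at_least_two(n, E, N):
    """P(X >= 2) = 1 - [P(X = 0) + P(X = 1)]."""
    return 1 - (hypergeom_pmf(0, n, E, N) + hypergeom_pmf(1, n, E, N))

# Probability that the group is audited for each assumed number of improper returns
audit_prob = {E: prob_at_least_two(6, E, 100) for E in (5, 10, 25, 30)}
ratio = audit_prob[30] / audit_prob[5]  # "a factor of almost 21"
```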
Copyright ©2024 Pearson Education, Inc.
5.59
Partial PHStat output:
(a) P(X = 0) = 0.4015
(b) P(X ≥ 1) = 1 – 0.4015 = 0.5985
(c) P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 0.9758
(d) Partial PHStat output:
P(X = 0) = 0.2701
5.60
PHStat output: Data Sample size
4
No. of successes in population
4
Population size
30
Hypergeometric Probabilities Table
X    P(X)
0    0.545521
1    0.379493
2    0.071155
3    0.003795
4    3.65E-05
(a) \( P(X = 4) = 3.6490 \times 10^{-5} \)
(b) P(X = 0) = 0.5455
(c) P(X ≥ 1) = 0.4545
(d) E = 6: (a) P(X = 4) = 0.0005 (b) P(X = 0) = 0.3877 (c) P(X ≥ 1) = 0.6123
5.61
Partial PHStat output:
(a) P(X = 0) = 0.0404
(b) P(X ≥ 1) = 1 – P(X = 0) = 0.9596
(c) P(X = 4) = 0.1318
(d) P(X < 4) = 0.8296
5.62
Partial PHStat output:
(a) P(X = 1) = 0.2424
(b) P(X ≥ 1) = 1 – P(X = 0) = 0.9697
(c) P(X = 3) = 0.2424
(d) Because the number of events of interest in the population is a smaller fraction of the population size in (c), the probability in (c) is smaller than that in Example 5.7.
Copyright ©2024 Pearson Education, Inc.
Chapter 6
6.43
(a)
PHStat output: Exponential Probabilities
Data Mean
10
X Value
0.1
Results P(<=X)
0.6321
P(arrival time < 0.1) = \( 1 - e^{-\lambda x} = 1 - e^{-(10)(0.1)} \) = 0.6321
(b) P(arrival time > 0.1) = 1 – P(arrival time ≤ 0.1) = 1 – 0.6321 = 0.3679
(c) PHStat output: Exponential Probabilities
Data Mean
10
X Value
0.2
Results P(<=X)
0.8647
P(0.1 < arrival time < 0.2) = P(arrival time < 0.2) – P(arrival time < 0.1) = 0.8647 – 0.6321 = 0.2326
(d) P(arrival time < 0.1) + P(arrival time > 0.2) = 0.6321 + 0.1353 = 0.7674
6.44
(a) PHStat output: Exponential Probabilities
Data Mean
30
X Value
0.1
Results P(<=X)
0.9502
P(arrival time < 0.1) = \( 1 - e^{-\lambda x} = 1 - e^{-(30)(0.1)} \) = 0.9502
(b) P(arrival time > 0.1) = 1 – P(arrival time ≤ 0.1) = 1 – 0.9502 = 0.0498
6.44 cont.
(c)
PHStat output: Exponential Probabilities
Data Mean
30
X Value
0.2
Results P(<=X)
0.9975
P(0.1 < arrival time < 0.2) = P(arrival time < 0.2) – P(arrival time < 0.1) = 0.9975 – 0.9502 = 0.0473
(d) P(arrival time < 0.1) + P(arrival time > 0.2) = 0.9502 + 0.0025 = 0.9527
6.45
(a)
PHStat output: Data Mean
5
X Value
0.3
Results P(<=X)
0.7769
P(arrival time ≤ 0.3) = \( 1 - e^{-(5)(0.3)} \) = 0.7769
(b) P(arrival time > 0.3) = 1 – P(arrival time < 0.3) = 0.2231
(c) PHStat output: Data Mean
5
X Value
0.5
Results P(<=X)
0.9179
P(0.3 < arrival time < 0.5) = P(arrival time < 0.5) – P(arrival time < 0.3) = 0.9179 – 0.7769 = 0.1410
(d) P(arrival time < 0.3 or > 0.5) = 1 – P(0.3 < arrival time < 0.5) = 0.8590
6.46
(a)
PHStat output: Exponential Probabilities
Data Mean
50
X Value
0.05
Results P(<=X)
0.9179
P(arrival time ≤ 0.05) = \( 1 - e^{-(50)(0.05)} \) = 0.9179
6.46 cont.
(b)
PHStat output: Exponential Probabilities
Data Mean
50
X Value
0.0167
Results P(<=X)
0.5661
P(arrival time ≤ 0.0167) = 1 – 0.4339 = 0.5661
(c) PHStat output: Exponential Probabilities
Data Mean
60
X Value
0.05
Results P(<=X)
0.9502
Exponential Probabilities
Data Mean X Value
60 0.0167
Results P(<=X)
0.6329
If λ = 60, P(arrival time ≤ 0.05) = 0.9502, P(arrival time ≤ 0.0167) = 0.6329
6.46 cont.
(d)
PHStat output: Exponential Probabilities
Data Mean
30
X Value
0.05
Results P(<=X)
0.7769
Exponential Probabilities
Data Mean
30
X Value
0.0167
Results P(<=X)
0.3941
If λ = 30, P(arrival time ≤ 0.05) = 0.7769, P(arrival time ≤ 0.0167) = 0.3941
6.47
(a)
PHStat output: Exponential Probabilities
Data Mean
2
X Value
1
Results P(<=X)
0.8647
P(arrival time ≤ 1) = 0.8647
(b) PHStat output: Exponential Probabilities
Data Mean
2
X Value
5
Results P(<=X)
0.999955
P(arrival time ≤ 5) = 0.99996
6.47 cont.
(c)
PHStat output: Exponential Probabilities
Data Mean
1
X Value
1
Results P(<=X)
0.6321
Exponential Probabilities
Data Mean
1
X Value
5
Results P(<=X)
0.993262
If λ = 1, P(arrival time ≤ 1) = 0.6321, P(arrival time ≤ 5) = 0.9933
6.48
(a)
PHStat output: Exponential Probabilities
Data Mean
15
X Value
0.05
Results P(<=X)
0.5276
P(arrival time ≤ 0.05) = \( 1 - e^{-(15)(0.05)} \) = 0.5276
(b) PHStat output: Exponential Probabilities
Data Mean
15
X Value
0.25
Results P(<=X)
0.9765
P(arrival time ≤ 0.25) = 0.9765
6.48 cont.
(c)
PHStat output: Exponential Probabilities
Data Mean
25
X Value
0.05
Results P(<=X)
0.7135
Exponential Probabilities
Data Mean
25
X Value
0.25
Results P(<=X)
0.9981
If λ = 25, P(arrival time ≤ 0.05) = 0.7135, P(arrival time ≤ 0.25) = 0.9981
6.49
(a)
PHStat output:
P(next call arrives in 3) = 0.4512
(b)
PHStat output:
P(next call arrives in 6) = 1 − 0.6988 = 0.3012
Copyright ©2024 Pearson Education, Inc.
6.49 cont.
(c)
PHStat output:
P(next call arrives in 1) = 0.1813 6.50
(a)
PHStat output: Exponential Probabilities
Data Mean
0.05
X Value
14
Results P(<=X)
0.5034
P(X ≤ 14) = \( 1 - e^{-(1/20)(14)} \) = 0.5034
(b) PHStat output: Exponential Probabilities
Data Mean
0.05
X Value
21
Results P(<=X)
0.6501
P(X > 21) = \( 1 - [1 - e^{-(1/20)(21)}] \) = 0.3499
(c)
PHStat output: Exponential Probabilities
Data Mean
0.05
X Value
7
Results P(<=X)
0.2953
P(X ≤ 7) = \( 1 - e^{-(1/20)(7)} \) = 0.2953
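The exponential probabilities in 6.50 follow from the cumulative form 1 – e^(–λx). A minimal Python sketch is shown below; note that the value PHStat labels "Mean" (0.05) is used here as the rate λ = 1/20, and `exponential_cdf` is a hypothetical helper name.

```python
from math import exp

def exponential_cdf(x, rate):
    """P(X <= x) for an exponential variable with rate lambda (events per unit)."""
    return 1 - exp(-rate * x)

rate = 1 / 20  # lambda = 0.05, the "Mean" entry in the PHStat output

p_a = exponential_cdf(14, rate)      # (a) P(X <= 14)
p_b = 1 - exponential_cdf(21, rate)  # (b) P(X > 21)
p_c = exponential_cdf(7, rate)       # (c) P(X <= 7)
```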
Copyright ©2024 Pearson Education, Inc.
6.51
(a)
PHStat output: Exponential Probabilities
Data Mean
8
X Value
0.25
Results P(<=X)
0.8647
P(arrival time ≤ 0.25) = 0.8647
(b) PHStat output: Exponential Probabilities
Data Mean
8
X Value
0.05
Results P(<=X)
0.3297
P(arrival time ≤ 0.05) = 0.3297
(c) PHStat output: Exponential Probabilities
Data Mean
15
X Value
0.25
Results P(<=X)
0.9765
Exponential Probabilities
Data Mean
15
X Value
0.05
Results P(<=X)
0.5276
If λ = 15, P(arrival time ≤ 0.25) = 0.9765, P(arrival time ≤ 0.05) = 0.5276
Copyright ©2024 Pearson Education, Inc.
6.52
(a)
PHStat output: Exponential Probabilities
Data Mean
0.6944
X Value
1
Results P(<=X)
0.5006
P(X < 1) = \( 1 - e^{-(0.6944)(1)} \) = 0.5006
(b) PHStat output: Exponential Probabilities
Data Mean
0.6944
X Value
2
Results P(<=X)
0.7506
P(X < 2) = \( 1 - e^{-(0.6944)(2)} \) = 0.7506
(c) PHStat output: Exponential Probabilities
Data Mean
0.6944
X Value
3
Results P(<=X)
0.8755
P(X > 3) = \( 1 - [1 - e^{-(0.6944)(3)}] \) = 0.1245
(d) The time between visitors resembles a waiting-line (queuing) process, for which the exponential distribution is most appropriate.
6.53
n = 100, π = 0.20. nπ = 20 ≥ 5 and n(1 – π) = 80 ≥ 5
\( \mu = n\pi = 20 \), \( \sigma = \sqrt{n\pi(1 - \pi)} = 4 \)
(a) Partial PHStat output:
Probability for a Range From X Value
24.5
To X Value
25.5
Z Value for 24.5
1.125
Z Value for 25.5
1.375
P(X<=24.5)
0.8697
P(X<=25.5)
0.9154
P(24.5<=X<=25.5)
0.0457
P(X = 25) ≈ P(24.5 ≤ X ≤ 25.5) = P(1.125 ≤ Z ≤ 1.375) = 0.0457
6.53 cont.
(b)
Partial PHStat output: Probability for X >
X Value
25.5
Z Value
1.375
P(X>25.5)
0.0846
P(X > 25) = P(X ≥ 26) ≈ P(X ≥ 25.5) = P(Z ≥ 1.375) = 0.0846
(c) Partial PHStat output: Probability for X <=
X Value
25.5
Z Value
1.375
P(X<=25.5)
0.9154343
P(X ≤ 25) ≈ P(X ≤ 25.5) = P(Z ≤ 1.375) = 0.9154
(d) Common Data
Mean
20
Standard Deviation
4
Probability for X <= X Value
24.5
Z Value
1.125
P(X<=24.5)
0.8697055
P(X < 25) = P(X ≤ 24) ≈ P(X ≤ 24.5) = P(Z ≤ 1.125) = 0.8697
6.54
n = 100, π = 0.40. nπ = 40 ≥ 5 and n(1 – π) = 60 ≥ 5
\( \mu = n\pi = 40 \), \( \sigma = \sqrt{n\pi(1 - \pi)} = 4.8990 \)
(a) Probability for a Range
From X Value
39.5
To X Value
40.5
Z Value for 39.5
-0.102062
Z Value for 40.5
0.102062
P(X<=39.5)
0.4594
P(X<=40.5)
0.5406
P(39.5<=X<=40.5)
0.0813
P(X = 40) ≈ P(39.5 ≤ X ≤ 40.5) = P(–0.1021 ≤ Z ≤ 0.1021) = 0.0813
(b) Probability for X >
X Value
40.5
Z Value
0.1020616
P(X>40.5)
0.4594
P(X > 40) = P(X ≥ 41) ≈ P(X ≥ 40.5) = P(Z ≥ 0.1021) = 0.4594
6.54 cont.
(c) Probability for X <=
X Value
40.5
Z Value
0.1020616
P(X<=40.5)
0.5406461
P(X ≤ 40) ≈ P(X ≤ 40.5) = P(Z ≤ 0.1021) = 0.5406
(d) Probability for X <=
X Value
39.5
Z Value
-0.102062
P(X<=39.5)
0.4593539
P(X < 40) = P(X ≤ 39) ≈ P(X ≤ 39.5) = P(Z ≤ –0.1021) = 0.4594
6.55
n = 10, π = 0.50. nπ = 5 ≥ 5 and n(1 – π) = 5 ≥ 5
\( \mu = n\pi = 5 \), \( \sigma = \sqrt{n\pi(1 - \pi)} = 1.5811 \)
PHStat output:
X    P(X)      P(<=X)    P(<X)     P(>X)     P(>=X)
0    0.000977  0.000977  0         0.999023  1
1    0.009766  0.010742  0.000977  0.989258  0.999023
2    0.043945  0.054688  0.010742  0.945313  0.989258
3    0.117188  0.171875  0.054688  0.828125  0.945313
4    0.205078  0.376953  0.171875  0.623047  0.828125
5    0.246094  0.623047  0.376953  0.376953  0.623047
6    0.205078  0.828125  0.623047  0.171875  0.376953
7    0.117188  0.945313  0.828125  0.054687  0.171875
8    0.043945  0.989258  0.945313  0.010742  0.054687
9    0.009766  0.999023  0.989258  0.000977  0.010742
10   0.000977  1         0.999023  0         0.000977
(a) P(X = 4) = 0.2051
(b) P(X ≥ 4) = 0.8281
(c) P(4 ≤ X ≤ 7) = 0.9453 – 0.1719 = 0.7734
(d) (a) Probability for a Range
From X Value
3.5
To X Value
4.5
Z Value for 3.5
-0.948707
Z Value for 4.5
-0.316236
P(X<=3.5)
0.1714
P(X<=4.5)
0.3759
P(3.5<=X<=4.5)
0.2045
P(X = 4) ≈ P(3.5 ≤ X ≤ 4.5) = P(–0.9487 ≤ Z ≤ –0.3162) = 0.2045
6.55 cont.
(d) (b) Probability for X >
X Value
3.5
Z Value
-0.948707
P(X>3.5)
0.8286
P(X ≥ 4) ≈ P(X ≥ 3.5) = P(Z ≥ –0.9487) = 0.8286
(c) Probability for a Range
From X Value
3.5
To X Value
7.5
Z Value for 3.5
-0.948707
Z Value for 7.5
1.581178
P(X<=3.5)
0.1714
P(X<=7.5)
0.9431
P(3.5<=X<=7.5)
0.7717
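P(4 ≤ X ≤ 7) ≈ P(3.5 ≤ X ≤ 7.5) = P(–0.9487 ≤ Z ≤ 1.5812) = 0.7717
6.56
\( \pi = \frac{1}{3} = 0.3333 \), n = 150, nπ = 50 > 5, n(1 – π) = 100 > 5
(a) \( P(X \ge 60) = P\left(Z > \dfrac{X - n\pi}{\sqrt{n\pi(1-\pi)}}\right) = P\left(Z > \dfrac{59.5 - 150(0.3333)}{\sqrt{150(0.3333)(1 - 0.3333)}}\right) = P(Z > 1.6454) = 0.0499 \)
(b) \( P(X = 60) = P(59.5 \le X_a \le 60.5) = P(1.6454 \le Z \le 1.8187) = 0.0155 \)
(c) \( P(X < 60) = P(X_a \le 59.5) = P(Z \le 1.6454) = 0.9501 \)
(d) \( P(X = 71) = P(70.5 \le X_a \le 71.5) = P(3.5507 \le Z \le 3.7239) = 0.0001 \)
6.57
\( \pi = 0.5 \), n = 55, nπ = 27.5 > 5, n(1 – π) = 27.5 > 5
(a) \( P(X \ge 38) = P\left(Z > \dfrac{X - n\pi}{\sqrt{n\pi(1-\pi)}}\right) = P\left(Z > \dfrac{37.5 - 27.5}{3.7081}\right) = P(Z > 2.6968) = 0.0035 \)
(b) The results are virtually the same as those in Problem 5.43 (a).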
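The continuity-corrected normal approximation used in 6.53 through 6.57 can be sketched with Python's standard library. The helper names below are hypothetical; `statistics.NormalDist` requires Python 3.8 or later.

```python
from math import sqrt
from statistics import NormalDist

def normal_approx(n, pi):
    """Normal distribution with the binomial's mean and standard deviation."""
    return NormalDist(mu=n * pi, sigma=sqrt(n * pi * (1 - pi)))

def prob_at_least(k, n, pi):
    """P(X >= k), approximated by the area to the right of k - 0.5."""
    return 1 - normal_approx(n, pi).cdf(k - 0.5)

def prob_exactly(k, n, pi):
    """P(X = k), approximated by the area between k - 0.5 and k + 0.5."""
    dist = normal_approx(n, pi)
    return dist.cdf(k + 0.5) - dist.cdf(k - 0.5)

p_656a = prob_at_least(60, 150, 1 / 3)  # 6.56 (a)
p_656b = prob_exactly(60, 150, 1 / 3)   # 6.56 (b)
p_657a = prob_at_least(38, 55, 0.5)     # 6.57 (a)
```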
Chapter 7
7.36
\( \sqrt{\dfrac{N-n}{N-1}} = \sqrt{\dfrac{80-10}{80-1}} = 0.9413 \)
7.37
\( \sqrt{\dfrac{N-n}{N-1}} = \sqrt{\dfrac{400-100}{400-1}} = 0.8671 \), \( \sqrt{\dfrac{N-n}{N-1}} = \sqrt{\dfrac{900-200}{900-1}} = 0.8824 \)
A sample of size 100 selected without replacement from a population of size 400 has a greater effect in reducing the standard error.
7.38
Whenever a sample is obtained with replacement, no finite population correction factor is needed.
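The correction factors in 7.36 and 7.37 can be verified with a one-line function. This is a minimal sketch; `fpc` is a hypothetical helper name.

```python
from math import sqrt

def fpc(N, n):
    """Finite population correction factor for a sample of n drawn
    without replacement from a population of N."""
    return sqrt((N - n) / (N - 1))

factor_736 = fpc(80, 10)
factor_737_small = fpc(400, 100)  # smaller factor, so a greater reduction in standard error
factor_737_large = fpc(900, 200)
```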
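7.39
\( \mu = 1.30 \), \( \sigma = 0.04 \)
Since \( \frac{n}{N} = \frac{16}{200} > 0.05 \) and the sample is selected without replacement, we need to perform the finite population correction.
\( \mu_{\bar{X}} = \mu = 1.3 \), \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.0096 \)
PHstat output: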
Common Data Mean
1.3
Standard Deviation
0.0096
Probability for a Range From X Value
1.31
To X Value
1.33
Z Value for 1.31
1.041667
Z Value for 1.33
3.125
P(X<=1.31)
0.8512
P(X<=1.33)
0.9991
P(1.31<=X<=1.33)
0.1479
\( P(1.31 < \bar{X} < 1.33) = P(1.0417 < Z < 3.125) = 0.1479 \)
7.40
\( \mu = 3.1 \), \( \sigma = 0.40 \)
Even though the sample is selected without replacement, we do not need to perform the finite population correction since \( \frac{n}{N} = \frac{16}{500} < 0.05 \). However, if the finite population correction is performed, the answer will be:
\( \mu_{\bar{X}} = \mu = 3.1 \), \( \sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.0985 \)
PHstat output:
Common Data
Mean    3.1
Standard Deviation    0.0985
Find X and Z Given Cum. Pctage.
Cumulative Percentage
85.00%
Z Value
1.036433
X Value
3.202089
7.40 cont.
Probability for X >
X Value
3
Z Value
-1.015228
P(X>3)
0.8450
(a) \( P(\bar{X} > 3) = P(Z > -1.0152) = 0.8450 \)
(b) \( P(\bar{X} < A) = P(Z < 1.0364) = 0.85 \), A = 1.0364(0.0985) + 3.1 = 3.20 minutes
7.41
\( \pi = 0.10 \)
Since \( \frac{n}{N} = \frac{400}{5000} > 0.05 \) and the sample is selected without replacement, we need to perform the finite population correction.
\( \mu_p = \pi = 0.1 \), \( \sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.0144 \)
PHstat output:
Common Data
Mean    0.1
Standard Deviation    0.0144
Probability for a Range
From X Value    0.09
To X Value    0.1
Z Value for 0.09    -0.694444
Z Value for 0.1    0
P(X<=0.09)    0.2437
P(X<=0.1)    0.5000
P(0.09<=X<=0.1)    0.2563
Probability for X <=
X Value    0.08
Z Value    -1.388889
P(X<=0.08)    0.0824333
(a) P(0.09 < p < 0.10) = P(–0.6944 < Z < 0) = 0.2563
(b) P(p < 0.08) = P(Z < –1.3889) = 0.0824
7.42
\( \pi = 0.93 \)
Since \( \frac{n}{N} = \frac{500}{10000} \ge 0.05 \) and the sample is selected without replacement, we will perform the finite population correction.
\( \mu_p = \pi = 0.93 \), \( \sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.0111 \)
PHstat output:
Common Data
Mean    0.93
Standard Deviation    0.0111
Probability for a Range
From X Value    0.93
To X Value    0.95
Z Value for 0.93    0
Z Value for 0.95    1.801802
P(X<=0.93)    0.5000
P(X<=0.95)    0.9642
P(0.93<=X<=0.95)    0.4642
Probability for X >
X Value    0.95
Z Value    1.8018018
P(X>0.95)    0.0358
(a) P(0.93 < p < 0.95) = P(0 < Z < 1.8018) = 0.4642
(b) P(p > .95) = P(Z > 1.8018) = 0.0358
Chapter 8
8.70
\( N\bar{X} \pm N\,t\,\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 500(25.7) \pm 500(2.7969)\dfrac{7.8}{\sqrt{25}}\sqrt{\dfrac{500-25}{500-1}} \)
$10,721.53 ≤ Population Total ≤ $14,978.47
8.71
Using PHStat Confidence Interval Estimate for the Total Difference
Data Population Size
10000
Sample Size
200
Confidence Level
95%
Intermediate Calculations Sum of Differences
200.63
Average Difference in Sample
1.00315
Total Difference
10031.5
Standard Deviation of Differences
4.998502
FPC Factor
0.989999
Standard Error of the Total Diff.
3499.126
Degrees of Freedom
199
t Value
1.971957
Interval Half Width
6900.128
Confidence Interval
Copyright ©2024 Pearson Education, Inc.
Interval Lower Limit
3131.37
Interval Upper Limit
16931.63
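\( N\bar{D} \pm N\,t\,\dfrac{S_D}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 10{,}000(1.00315) \pm 10{,}000(1.972)\dfrac{4.9985}{\sqrt{200}}\sqrt{\dfrac{10{,}000-200}{10{,}000-1}} \)
3131.37 ≤ Total Difference in the Population ≤ 16931.63
8.72
(a) \( p + Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.04 + 1.2816\sqrt{\dfrac{0.04(1-0.04)}{300}}\sqrt{\dfrac{5000-300}{5000-1}} \), \( \pi < 0.05406 \)
(b) \( p + Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.04 + 1.645\sqrt{\dfrac{0.04(1-0.04)}{300}}\sqrt{\dfrac{5000-300}{5000-1}} \), \( \pi < 0.05804 \)
(c) \( p + Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.04 + 2.3263\sqrt{\dfrac{0.04(1-0.04)}{300}}\sqrt{\dfrac{5000-300}{5000-1}} \), \( \pi < 0.06552 \)
8.73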
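\( N\bar{X} \pm N\,t\,\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 1000(\$2.55) \pm 1000(1.9842)\dfrac{\$0.44}{\sqrt{100}}\sqrt{\dfrac{1000-100}{1000-1}} \)
$2,467.13 ≤ Population Total ≤ $2,632.87
8.74
\( N\bar{X} \pm N\,t\,\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 3000(\$261.40) \pm 3000(1.8331)\dfrac{\$138.8046}{\sqrt{10}}\sqrt{\dfrac{3000-10}{3000-1}} \)
$543,176.96 ≤ Population Total ≤ $1,025,223.04
8.75
\( N\bar{X} \pm N\,t\,\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 1546(\$252.28) \pm 1546(2.0096)\dfrac{\$93.67}{\sqrt{50}}\sqrt{\dfrac{1546-50}{1546-1}} \)
$349,526.64 ≤ Population Total ≤ $430,523.12
8.76
\( N\bar{D} \pm N\,t\,\dfrac{S_D}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 4000(\$7.45907) \pm 4000(2.6092)\dfrac{\$29.5523}{\sqrt{150}}\sqrt{\dfrac{4000-150}{4000-1}} \)
$5,126.26 ≤ Total Difference in the Population ≤ $54,546.28
Note: The t-value of 2.6092 for 99% confidence and d.f. = 149 was derived on Excel.
8.77
\( N\bar{D} \pm N\,t\,\dfrac{S_D}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 1200(\$0.9583) \pm 1200(1.9801)\dfrac{\$25.2448}{\sqrt{120}}\sqrt{\dfrac{1200-120}{1200-1}} \)
–$4,046.99 ≤ Total Difference in the Population ≤ $6,346.99
8.78
(a) \( p + Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.0367 + 1.645\sqrt{\dfrac{0.0367(1-0.0367)}{300}}\sqrt{\dfrac{10000-300}{10000-1}} \), \( \pi < 0.0542 \)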
(b) Since the upper bound is higher than the tolerable exception rate of 0.04, the auditor should request a larger sample.
8.79
(a) \( p + Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.024 + 1.645\sqrt{\dfrac{0.024(1-0.024)}{500}}\sqrt{\dfrac{5000-500}{5000-1}} = 0.0347 \)
(b) With 95% level of confidence, the auditor can conclude that the rate of noncompliance is less than 0.0347, which is less than the 0.05 tolerable exception rate for internal control, and, hence, the internal control compliance is adequate.
8.80
\( \bar{X} \pm t_{n-1}\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 75 \pm 2.0301\dfrac{24}{\sqrt{36}}\sqrt{\dfrac{200-36}{200-1}} \)
67.63 ≤ μ ≤ 82.37
8.81
\( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 20^2}{5^2} = 61.4633 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{61.4633(1000)}{61.4633 + (1000-1)} = 57.9589 \)
Use n = 58
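The two-step sample size calculation in 8.81 can be sketched as follows. The function name `sample_size_mean` is hypothetical, and the rounded Z = 1.96 is used, so n0 differs from the reported 61.4633 (which reflects an unrounded Z) in the third decimal.

```python
from math import ceil

def sample_size_mean(z, sigma, e, N):
    """Sample size for estimating a mean, adjusted for a finite population.
    Returns (n0, n, n rounded up)."""
    n0 = (z * sigma / e) ** 2       # size ignoring the finite population
    n = n0 * N / (n0 + (N - 1))     # finite population adjustment
    return n0, n, ceil(n)

n0, n, n_used = sample_size_mean(z=1.96, sigma=20, e=5, N=1000)
```

Rounding up rather than to the nearest integer keeps the sampling error within the stated tolerance.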
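8.82
(a) \( \bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 350 \pm 1.96\dfrac{100}{\sqrt{50}}\sqrt{\dfrac{2000-50}{2000-1}} \)
322.6238 ≤ μ ≤ 377.3762
(b) \( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 100^2}{20^2} = 96.0364 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{96.0364(2000)}{96.0364 + (2000-1)} = 91.6799 \)
Use n = 92
(c) (a) \( \bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 350 \pm 1.96\dfrac{100}{\sqrt{50}}\sqrt{\dfrac{1000-50}{1000-1}} \)
322.9703 ≤ μ ≤ 377.0297
(b) \( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 100^2}{20^2} = 96.0364 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{96.0364(1000)}{96.0364 + (1000-1)} = 87.7015 \)
Use n = 88
8.83
\( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 400^2}{50^2} = 245.8531 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{245.8531(3000)}{245.8531 + (3000-1)} = 227.3013 \)
Use n = 228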
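8.84
(a) \( p \pm Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.3 \pm 1.6449\sqrt{\dfrac{0.3(1-0.3)}{100}}\sqrt{\dfrac{1000-100}{1000-1}} \)
0.2285 ≤ π ≤ 0.3715
(b) \( n_0 = \dfrac{Z^2 p(1-p)}{e^2} = \dfrac{1.6449^2(0.3)(1-0.3)}{0.05^2} = 227.2656 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{227.2656(1000)}{227.2656 + (1000-1)} = 185.3315 \)
Use n = 186
(c) (a) \( p \pm Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.3 \pm 1.6449\sqrt{\dfrac{0.3(1-0.3)}{100}}\sqrt{\dfrac{2000-100}{2000-1}} \)
0.2265 ≤ π ≤ 0.3735
(b) \( n_0 = \dfrac{Z^2 p(1-p)}{e^2} = \dfrac{1.6449^2(0.3)(1-0.3)}{0.05^2} = 227.2656 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{227.2656(2000)}{227.2656 + (2000-1)} = 204.1676 \)
Use n = 205
8.85
(a) \( p \pm Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.41 \pm 1.96\sqrt{\dfrac{0.41(1-0.41)}{200}}\sqrt{\dfrac{4000-200}{4000-1}} \)
0.3436 ≤ π ≤ 0.4764
(b) \( n_0 = \dfrac{Z^2 p(1-p)}{e^2} = \dfrac{1.96^2(0.41)(1-0.41)}{0.025^2} = 1486.7964 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{1486.7964(4000)}{1486.7964 + (4000-1)} = 1084.1062 \)
Use n = 1085
(c) (a) \( p \pm Z\sqrt{\dfrac{p(1-p)}{n}}\sqrt{\dfrac{N-n}{N-1}} = 0.41 \pm 1.96\sqrt{\dfrac{0.41(1-0.41)}{200}}\sqrt{\dfrac{6000-200}{6000-1}} \)
0.3430 ≤ π ≤ 0.4770
(b) \( n_0 = \dfrac{Z^2 p(1-p)}{e^2} = \dfrac{1.96^2(0.41)(1-0.41)}{0.025^2} = 1486.7964 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{1486.7964(6000)}{1486.7964 + (6000-1)} = 1191.6940 \)
Use n = 1192
8.86
(a) \( \bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 1.99 \pm 1.96\dfrac{0.05}{\sqrt{100}}\sqrt{\dfrac{2000-100}{2000-1}} \)
1.9804 ≤ μ ≤ 1.9996
(b) \( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 0.05^2}{0.01^2} = 96.0364 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{96.0364(2000)}{96.0364 + (2000-1)} = 91.6799 \)
Use n = 92
(c) (a) \( \bar{X} \pm Z\dfrac{\sigma}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 1.99 \pm 1.96\dfrac{0.05}{\sqrt{100}}\sqrt{\dfrac{1000-100}{1000-1}} \)
1.9807 ≤ μ ≤ 1.9993
(b) \( n_0 = \dfrac{Z^2\sigma^2}{e^2} = \dfrac{1.96^2 \cdot 0.05^2}{0.01^2} = 96.0364 \)
\( n = \dfrac{n_0 N}{n_0 + (N-1)} = \dfrac{96.0364(1000)}{96.0364 + (1000-1)} = 87.7015 \)
Use n = 88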
8.87
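(a) \( \bar{X} \pm t_{n-1}\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 2.55 \pm 2.0930\dfrac{0.44}{\sqrt{20}}\sqrt{\dfrac{300-20}{300-1}} \)
$2.35 ≤ μ ≤ $2.75
(b) \( \bar{X} \pm t_{n-1}\dfrac{S}{\sqrt{n}}\sqrt{\dfrac{N-n}{N-1}} = 2.55 \pm 2.0930\dfrac{0.44}{\sqrt{20}}\sqrt{\dfrac{500-20}{500-1}} \)
$2.35 ≤ μ ≤ $2.75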
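The interval in 8.87 can be reproduced with a short Python sketch. The critical value t = 2.0930 is passed in as given in the solution rather than computed, and `mean_ci_fpc` is a hypothetical helper name.

```python
from math import sqrt

def mean_ci_fpc(xbar, s, n, N, t):
    """Confidence interval for a mean when sampling without replacement
    from a finite population of size N; t is the critical value supplied by the caller."""
    half_width = t * s / sqrt(n) * sqrt((N - n) / (N - 1))
    return xbar - half_width, xbar + half_width

lo_300, hi_300 = mean_ci_fpc(2.55, 0.44, 20, 300, 2.0930)  # (a) N = 300
lo_500, hi_500 = mean_ci_fpc(2.55, 0.44, 20, 500, 2.0930)  # (b) N = 500
```

Multiplying both limits by N gives the corresponding interval for the population total, as in 8.70 and 8.73 through 8.75.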
Chapter 9
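9.80
\( H_0: \mu \ge 7 \), \( H_1: \mu < 7 \), \( \alpha = 0.05 \), n = 16, \( \sigma = 0.2 \)
Lower critical value: \( Z_L = -1.6449 \), \( \bar{X}_L = \mu + Z_L\dfrac{\sigma}{\sqrt{n}} = 7 - 1.6449\left(\dfrac{0.2}{\sqrt{16}}\right) = 6.9178 \)
(a) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.9178 - 6.9}{0.2/\sqrt{16}} = 0.3551 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.3551) = 0.6388 \)
\( \beta = 1 - 0.6388 = 0.3612 \)
(b) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.9178 - 6.8}{0.2/\sqrt{16}} = 2.3551 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 2.3551) = 0.9907 \)
\( \beta = 1 - 0.9907 = 0.0093 \)
9.81
\( H_0: \mu \ge 7 \), \( H_1: \mu < 7 \), \( \alpha = 0.01 \), n = 16, \( \sigma = 0.2 \)
Lower critical value: \( Z_L = -2.3263 \), \( \bar{X}_L = \mu + Z_L\dfrac{\sigma}{\sqrt{n}} = 7 - 2.3263\left(\dfrac{0.2}{\sqrt{16}}\right) = 6.8837 \)
(a) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.8837 - 6.9}{0.2/\sqrt{16}} = -0.3263 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -0.3263) = 0.3721 \)
\( \beta = 1 - 0.3721 = 0.6279 \)
(b) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.8837 - 6.8}{0.2/\sqrt{16}} = 1.6737 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 1.6737) = 0.9529 \)
\( \beta = 1 - 0.9529 = 0.0471 \)
(c) Holding everything else constant, the greater the distance between the true mean and the hypothesized mean, the higher the power of the test will be and the lower the probability of committing a Type II error will be. Holding everything else constant, the smaller the level of significance, the lower the power of the test will be and the higher the probability of committing a Type II error will be.
9.82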
\( H_0: \mu \ge 7 \), \( H_1: \mu < 7 \), \( \alpha = 0.05 \), n = 25, \( \sigma = 0.2 \)
Lower critical value: \( Z_L = -1.6449 \), \( \bar{X}_L = \mu + Z_L\dfrac{\sigma}{\sqrt{n}} = 7 - 1.6449\left(\dfrac{0.2}{\sqrt{25}}\right) = 6.9342 \)
(a) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.9342 - 6.9}{0.2/\sqrt{25}} = 0.8551 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.8551) = 0.8038 \)
\( \beta = 1 - 0.8038 = 0.1962 \)
(b) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{6.9342 - 6.8}{0.2/\sqrt{25}} = 3.3551 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 3.3551) = 0.9996 \)
\( \beta = 1 - 0.9996 = 0.0004 \)
(c) Holding everything else constant, the larger the sample size, the higher the power of the test will be and the lower the probability of committing a Type II error will be.
9.83
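\( H_0: \mu \ge 25{,}000 \), \( H_1: \mu < 25{,}000 \), \( \alpha = 0.05 \), n = 100, \( \sigma = 3500 \)
Lower critical value: \( Z_L = -1.6449 \), \( \bar{X}_L = \mu + Z_L\dfrac{\sigma}{\sqrt{n}} = 25{,}000 - 1.6449\left(\dfrac{3{,}500}{\sqrt{100}}\right) = 24{,}424.3013 \)
(a) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{24{,}424.3012 - 24{,}000}{3500/\sqrt{100}} = 1.2123 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 1.2123) = 0.8873 \)
\( \beta = 1 - 0.8873 = 0.1127 \)
(b) \( Z_{STAT} = \dfrac{\bar{X}_L - \mu_1}{\sigma/\sqrt{n}} = \dfrac{24{,}424.3012 - 24{,}900}{3500/\sqrt{100}} = -1.3591 \)
power = \( 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -1.3591) = 0.0871 \)
\( \beta = 1 - 0.0871 = 0.9129 \)
9.84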
$H_0: \mu \ge 25,000$ vs. $H_1: \mu < 25,000$, $\alpha = 0.01$, $n = 100$, $\sigma = 3,500$
Lower critical value: $Z_L = -2.3263$, $\bar{X}_L = \mu + Z_L \frac{\sigma}{\sqrt{n}} = 25,000 - 2.3263 \left( \frac{3,500}{\sqrt{100}} \right) = 24,185.7786$
(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma / \sqrt{n}} = \frac{24,185.7786 - 24,000}{3,500 / \sqrt{100}} = 0.5308$
power $= 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < 0.5308) = 0.7022$
$\beta = 1 - 0.7022 = 0.2978$
(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma / \sqrt{n}} = \frac{24,185.7786 - 24,900}{3,500 / \sqrt{100}} = -2.0406$
power $= 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -2.0406) = 0.0206$
$\beta = 1 - 0.0206 = 0.9794$
(c) Holding everything else constant, the greater the distance between the true mean and the hypothesized mean, the higher the power of the test and the lower the probability of committing a Type II error. Holding everything else constant, the smaller the level of significance, the lower the power of the test and the higher the probability of committing a Type II error.
9.85
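$H_0: \mu \ge 25,000$, $H_1: \mu < 25,000$, $\alpha = 0.05$, $n = 25$, $\sigma = 3,500$
Lower critical value: $Z_L = -1.6449$, $\bar{X}_L = \mu + Z_L \frac{\sigma}{\sqrt{n}} = 25,000 - 1.6449 \left( \frac{3,500}{\sqrt{25}} \right) = 23,848.6026$
(a) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma / \sqrt{n}} = \frac{23,848.6026 - 24,000}{3,500 / \sqrt{25}} = -0.2163$
power $= 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -0.2163) = 0.4144$
$\beta = 1 - 0.4144 = 0.5856$
(b) $Z_{STAT} = \frac{\bar{X}_L - \mu_1}{\sigma / \sqrt{n}} = \frac{23,848.6026 - 24,900}{3,500 / \sqrt{25}} = -1.5020$
power $= 1 - \beta = P(\bar{X} < \bar{X}_L) = P(Z < -1.5020) = 0.0665$
$\beta = 1 - 0.0665 = 0.9335$
(c) Holding everything else constant, the larger the sample size, the higher the power of the test and the lower the probability of committing a Type II error.
9.86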
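$H_0: \mu = 25,000$, $H_1: \mu \ne 25,000$, $\alpha = 0.05$, $n = 100$, $\sigma = 3,500$
Critical values: $Z_L = -1.960$, $Z_U = 1.960$
$\bar{X}_L = \mu + Z_L \frac{\sigma}{\sqrt{n}} = 25,000 - 1.960 \left( \frac{3,500}{\sqrt{100}} \right) = 24,314.0130$
$\bar{X}_U = \mu + Z_U \frac{\sigma}{\sqrt{n}} = 25,000 + 1.960 \left( \frac{3,500}{\sqrt{100}} \right) = 25,685.9870$
(a) $\beta = P(\bar{X}_L < \bar{X} < \bar{X}_U) = P(0.8972 < Z < 4.8171) = 0.1848$
power $= 1 - \beta = 1 - 0.1848 = 0.8152$
(b) $\beta = P(\bar{X}_L < \bar{X} < \bar{X}_U) = P(-1.6742 < Z < 2.2457) = 0.9406$
power $= 1 - \beta = 1 - 0.9406 = 0.0594$
(c) A one-tail test is more powerful than a two-tail test, holding everything else constant.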
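The power calculations in Problems 9.83 and 9.86 can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names `power_lower_tail` and `power_two_tail` are illustrative and do not come from the text. It assumes a known population standard deviation, as the problems do.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power_lower_tail(mu0, mu1, sigma, n, alpha):
    """Power of a lower-tail Z test of H0: mu >= mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)
    x_low = mu0 + STD_NORMAL.inv_cdf(alpha) * se  # critical value of the sample mean
    return STD_NORMAL.cdf((x_low - mu1) / se)


def power_two_tail(mu0, mu1, sigma, n, alpha):
    """Power of a two-tail Z test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)
    z_upper = STD_NORMAL.inv_cdf(1 - alpha / 2)
    x_low, x_up = mu0 - z_upper * se, mu0 + z_upper * se
    # beta is the probability that the sample mean lands in the nonrejection region
    beta = STD_NORMAL.cdf((x_up - mu1) / se) - STD_NORMAL.cdf((x_low - mu1) / se)
    return 1 - beta


one_tail = power_lower_tail(25_000, 24_000, 3_500, 100, 0.05)  # Problem 9.83(a)
two_tail = power_two_tail(25_000, 24_000, 3_500, 100, 0.05)    # Problem 9.86(a)
print(f"one-tail power = {one_tail:.4f}, two-tail power = {two_tail:.4f}")
```

With the same true mean, sample size, and level of significance, the one-tail power exceeds the two-tail power, which is the comparison made in 9.86(c).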
Chapter 11
11.43
(a) df A = c – 1 = 5 – 1 = 4
(b) df BL = r – 1 = 7 – 1 = 6
(c) df E = (r – 1)(c – 1) = (7 – 1)(5 – 1) = 24
(d) df T = rc – 1 = (7)(5) – 1 = 34
11.44
(a) SSE = SST – SSA – SSBL = 210 – 60 – 75 = 75
(b) $MSA = \frac{SSA}{c - 1} = \frac{60}{4} = 15$
$MSBL = \frac{SSBL}{r - 1} = \frac{75}{6} = 12.5$
$MSE = \frac{SSE}{(r - 1)(c - 1)} = \frac{75}{(6)(4)} = 3.125$
(c) $F_{STAT} = \frac{MSA}{MSE} = \frac{15}{3.125} = 4.80$
(d) $F_{STAT} = \frac{MSBL}{MSE} = \frac{12.5}{3.125} = 4.00$
11.45
(a)
Source          df    SS     MS       F
Among groups     4    60     15       4.80
Among blocks     6    75     12.5     4.00
Error           24    75     3.125
Total           34    210
(b) For testing the treatment means: Decision rule: If FSTAT > 2.78, reject H0. Decision: Since FSTAT = 4.80 is greater than the upper critical bound of 2.78, reject H0. There is enough evidence of a difference in the treatment means.
(c) For testing the block means: Decision rule: If FSTAT > 2.51, reject H0. Decision: Since FSTAT = 4.00 > 2.51, reject H0. There is enough evidence of a difference due to blocks.
11.46
(a) There are 5 degrees of freedom in the numerator and 24 degrees of freedom in the denominator.
(b) Q = 4.17
(c) critical range $= Q \sqrt{\frac{MSE}{r}} = 4.17 \sqrt{\frac{3.125}{7}} = 2.786$
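The arithmetic in Problems 11.44 through 11.46 follows directly from the sums of squares and the numbers of blocks (r) and groups (c). The sketch below reproduces it; the function names are illustrative, and the Studentized range value Q = 4.17 is taken from the solution above rather than computed.

```python
from math import sqrt


def randomized_block_anova(ssa, ssbl, sst, r, c):
    """Mean squares and F statistics for a randomized block design
    with r blocks and c groups, given SSA, SSBL and SST."""
    sse = sst - ssa - ssbl
    msa = ssa / (c - 1)
    msbl = ssbl / (r - 1)
    mse = sse / ((r - 1) * (c - 1))
    return {
        "SSE": sse,
        "MSA": msa,
        "MSBL": msbl,
        "MSE": mse,
        "F_treatments": msa / mse,
        "F_blocks": msbl / mse,
    }


def tukey_critical_range(q, mse, r):
    """Critical range for the Tukey procedure in a randomized block design."""
    return q * sqrt(mse / r)


table = randomized_block_anova(ssa=60, ssbl=75, sst=210, r=7, c=5)
crit_range = tukey_critical_range(q=4.17, mse=table["MSE"], r=7)
print(table)
print(round(crit_range, 3))
```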
11.47
(a) df A = c – 1 = 3 – 1 = 2
(b) df BL = r – 1 = 7 – 1 = 6
(c) df E = (r – 1)(c – 1) = (7 – 1)(3 – 1) = 12
(d) df T = rc – 1 = (7)(3) – 1 = 20
11.48
(a) $MSE = \frac{MSA}{F} = \frac{18}{6} = 3$, so SSE = (MSE)(df E) = (3)(12) = 36
(b) SSBL = (F)(MSE)(df BL) = (4)(3)(6) = 72
(c) SST = SSA + SSBL + SSE = 36 + 72 + 36 = 144
(d) Since FSTAT = 6 < F0.01,2,12 = 6.9266, do not reject the null hypothesis of no treatment effect. There is not enough evidence to conclude there is a treatment effect. Since FSTAT = 4.0 < F0.01,6,12 = 4.821, do not reject the null hypothesis of no block effect. There is not enough evidence to conclude there is a block effect.
11.49
Source          df             SS                       MS                     F
Among groups    4 – 1 = 3      3 × 80 = 240             80                     80/15.4286 = 5.185
Among blocks    8 – 1 = 7      540                      540/7 = 77.1429        5.000
Error           3 × 7 = 21     15.4286 × 21 = 324       77.1429/5 = 15.4286
Total           32 – 1 = 31    240 + 540 + 324 = 1104
11.50
(a) Decision rule: If FSTAT > 3.07, reject H0. Decision: Since FSTAT = 5.185 is greater than the critical bound 3.07, reject H0. There is enough evidence to conclude that the treatment means are not all equal.
(b) Decision rule: If FSTAT > 2.49, reject H0. Decision: Since FSTAT = 5.000 is greater than the critical bound 2.49, reject H0. There is enough evidence to conclude that the block means are not all equal.
11.51
$H_0: \mu_A = \mu_B = \mu_C = \mu_D$   $H_1$: At least one mean differs.
Decision rule: If FSTAT > 4.718, reject H0.
Anova: Two-Factor Without Replication
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  153.2222    8   19.15278   19.06452   1.21E-08   3.362857
Columns               79.63889    3   26.5463    26.42396   8.86E-08   4.718061
Error                 24.11111   24   1.00463
Total                 256.9722   35
Test statistic: FSTAT = 26.42
Decision: Since FSTAT = 26.42 is greater than the critical bound 4.718, reject H0. There is adequate evidence to conclude that there is a difference in the mean summed ratings of the four brands of Colombian coffee.
From Table E.10, Q = 3.9. Critical range $= Q \sqrt{\frac{MSE}{r}} = 3.9 \sqrt{\frac{1.0046}{9}} = 1.303$
Pairs of means that differ at the 0.05 level are marked with * below.
$|\bar{X}_A - \bar{X}_B| = 1.56$*   $|\bar{X}_A - \bar{X}_C| = 0.89$   $|\bar{X}_A - \bar{X}_D| = 2.56$*
$|\bar{X}_B - \bar{X}_C| = 2.45$*   $|\bar{X}_B - \bar{X}_D| = 4.12$*   $|\bar{X}_C - \bar{X}_D| = 1.67$*
Brand B is rated highest with a sample mean rating of 25.56.
11.52
(a) $H_0: \mu_{.1} = \mu_{.2}$ where 1 = Internet Service, 2 = TV Service
$H_1$: Not all $\mu_{.j}$ are equal where j = 1, 2
From PHStat
ANOVA
Source of Variation   SS          df   MS         F         P-value   F crit
Rows                  1067.4750   19   56.1829    11.8247   0.0000    2.1683
Columns               265.2250     1   265.2250   55.8214   0.0000    4.3807
Error                 90.2750     19   4.7513
Total                 1422.9750   39
Level of significance 0.05
FSTAT = 55.8214. Since the p-value is virtually 0 < 0.05, reject H0. There is evidence of a difference in the mean rating between Internet Service and TV Service.
(b) PHStat output for the Tukey procedure:
The mean ratings for the two services are significantly different from each other, with TV Service the lower of the two and Internet Service the higher.
11.53
(a) $H_0: \mu_{.1} = \mu_{.2} = \mu_{.3} = \mu_{.4}$ where 1 = Publix, 2 = Winn-Dixie, 3 = Target, 4 = Walmart
$H_1$: Not all $\mu_{.j}$ are equal where j = 1, 2, 3, 4
Excel output:
FSTAT = 3.6831. Since p-value = 0.0147 < 0.05, reject H0. There is evidence of a difference between the mean price of these items at the four supermarkets.
(b) The assumptions needed are: (i) samples are randomly and independently drawn, (ii) populations are normally distributed, (iii) populations have equal variances and (iv) no interaction effect between treatments and blocks.
(c) Excel output for the Tukey procedure:
Using Q = 3.69 for numerator d.f. = 4 and denominator d.f. = 120, the mean price of items at Walmart differs from that of Winn-Dixie at the 5% level of significance.
(d) $H_0: \mu_{1.} = \mu_{2.} = \cdots = \mu_{33.}$   $H_1$: Not all $\mu_{i.}$ are equal where i = 1, 2, …, 33
FSTAT = 18.5723 and the p-value is essentially 0. Reject H0. There is evidence of a significant block effect in this experiment. The blocking has been advantageous in reducing the experimental error.
11.54
(a) $H_0: \mu_{.1} = \mu_{.2} = \mu_{.3}$ where 1 = One Year CD, 2 = Two Year CD, 3 = Five Year CD
$H_1$: Not all $\mu_{.j}$ are equal where j = 1, 2, 3
Excel output:
ANOVA
Source of Variation   SS        df   MS       F         P-value   F crit
Rows                  39.4541   37   1.0663   34.1775   0.0000    1.7295
Columns               1.1210     1   1.1210   35.9285   0.0000    4.1055
Error                 1.1544    37   0.0312
Total                 41.7295   75
FSTAT = 35.9285. Since the p-value is virtually 0, reject H0. There is evidence of a difference in the mean rates for these investments.
(b) The assumptions needed are: (i) samples are randomly and independently drawn, (ii) populations are normally distributed, (iii) populations have equal variances and (iv) no interaction effect between treatments and blocks.
(c) Excel output of the Tukey procedure:
Using Q = 3.40 for numerator d.f. = 3 and denominator d.f. = 60, the mean rates of these investments are not all different with One-Year CD being the lowest, followed by Two-Year CD and finally Five-Year CD.
(d) $H_0: \mu_{1.} = \mu_{2.} = \cdots = \mu_{16.}$   $H_1$: Not all $\mu_{i.}$ are equal where i = 1, 2, …, 16
FSTAT = 34.1775. Since the p-value is virtually 0, reject H0. There is enough evidence of a significant block effect in this experiment. The blocking has been advantageous in reducing the experimental error.
11.55
To test at the 0.01 level of significance whether there is any difference in the mean thickness of the wafers for the five positions, you conduct an F test:
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 = \mu_5$ where 1 = position 1, 2 = position 2, 3 = position 18, 4 = position 19, 5 = position 28
$H_1$: At least one mean is different.
Decision rule: df: 4, 116. If FSTAT > 3.4852, reject H0.
ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Rows                  601.5       29   20.74138   5.922219   1.93E-12   1.878497
Columns               1417.733     4   354.4333   101.2002   6.84E-37   3.485212
Error                 406.2667   116   3.502299
Total                 2425.5     149
Test statistic: FSTAT = 101.2
Decision: Since FSTAT = 101.2 is greater than the critical bound of 3.4852, reject H0. There is enough evidence to conclude that the means of the thickness of the wafers are different across the five positions.
To determine which of the means are significantly different from one another, you use the Tukey-Kramer procedure to establish the critical range:
Q = 4.71
critical range $= Q \sqrt{\frac{MSE}{r}} = 4.71 \sqrt{\frac{3.5023}{30}} = 1.609$
Pairs of means that differ at the 0.01 level are marked with * below.
$|\bar{X}_1 - \bar{X}_2| = 2.2$*   $|\bar{X}_1 - \bar{X}_3| = 5.533$*   $|\bar{X}_1 - \bar{X}_4| = 8.567$*   $|\bar{X}_1 - \bar{X}_5| = 6.533$*
$|\bar{X}_2 - \bar{X}_3| = 3.333$*   $|\bar{X}_2 - \bar{X}_4| = 6.367$*   $|\bar{X}_2 - \bar{X}_5| = 4.333$*
$|\bar{X}_3 - \bar{X}_4| = 3.033$*   $|\bar{X}_3 - \bar{X}_5| = 1$   $|\bar{X}_4 - \bar{X}_5| = 2.033$*
At the 1% level of significance, the F test concludes that there are significant differences in the mean thickness of the wafers among the 5 positions. The Tukey-Kramer multiple comparison test reveals that the mean thicknesses of all pairs are significantly different, with the sole exception of the pair formed by position 18 and position 28.
11.56
(a) $H_0: \mu_1 = \mu_2 = \mu_3$ where 1 = 2 days, 2 = 7 days, 3 = 28 days
$H_1$: At least one mean differs.
Decision rule: If FSTAT > 3.114, reject H0.
ANOVA
Source of Variation   SS         df    MS         F          P-value    F crit
Rows                  21.17006    39   0.542822   5.752312   2.92E-11   1.553239
Columns               50.62835     2   25.31417   268.2556   1.09E-35   3.113797
Error                 7.360538    78   0.094366
Total                 79.15894   119
Test statistic: FSTAT = 268.26
Decision: Since FSTAT = 268.26 is greater than the critical bound 3.114, reject H0. There is enough evidence to conclude that there is a difference in the mean compressive strength after 2, 7 and 28 days.
(b) From Table E.10, Q = 3.4. critical range $= Q \sqrt{\frac{MSE}{r}} = 3.4 \sqrt{\frac{0.0944}{40}} = 0.1651$
$|\bar{X}_1 - \bar{X}_2| = 0.5531$*   $|\bar{X}_1 - \bar{X}_3| = 1.5685$*   $|\bar{X}_2 - \bar{X}_3| = 1.0154$*
At the 0.05 level of significance, all of the comparisons are significant. This is consistent with the results of the F-test indicating that there is a significant difference in the mean compressive strength after 2, 7 and 28 days.
(c) $RE = \frac{(r - 1) MSBL + r(c - 1) MSE}{(rc - 1) MSE} = \frac{(39)(0.5428) + (40)(2)(0.0943)}{(119)(0.0943)} = 2.558$
(d) Box-and-whisker plot of compressive strength for the Two days, Seven days and 28 days groups (horizontal axis from 0 to 6).
(e) The compressive strength of the concrete increases over the 3 time periods.
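The relative efficiency in Problem 11.56(c) can be recomputed from the rounded mean squares quoted in that solution. This is a small sketch; `relative_efficiency` is an illustrative name, and the result depends slightly on how many decimals of MSBL and MSE are carried.

```python
def relative_efficiency(msbl, mse, r, c):
    """Estimated relative efficiency of a randomized block design
    with r blocks and c groups, compared with a completely randomized design."""
    return ((r - 1) * msbl + r * (c - 1) * mse) / ((r * c - 1) * mse)


# Rounded values used in Problem 11.56(c): r = 40 samples (blocks), c = 3 time periods
re_estimate = relative_efficiency(msbl=0.5428, mse=0.0943, r=40, c=3)
print(f"RE = {re_estimate:.2f}")
```

A value above 1 indicates that blocking reduced the experimental error; here a completely randomized design would need roughly two and a half times as many observations per group to reach the same precision.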
Chapter 12
12.60
(a) $H_0: \pi_1 = \pi_2$   $H_1: \pi_1 \ne \pi_2$ where 1 = group 1, 2 = group 2
Decision rule: If ZSTAT < –1.96 or ZSTAT > 1.96, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{25 - 16}{\sqrt{25 + 16}} = 1.4056$
Decision: Since ZSTAT = 1.4056 is between the critical bounds of –1.96 and 1.96, do not reject H0. There is not enough evidence of a difference between group 1 and group 2.
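The McNemar statistic used in Problems 12.60 through 12.65 depends only on the two discordant counts B and C. The sketch below recomputes the statistic for Problems 12.60 and 12.61 and the lower-tail p-value for 12.61; `mcnemar_z` is an illustrative name, and the p-value relies on the normal approximation that the solutions use.

```python
from math import sqrt
from statistics import NormalDist


def mcnemar_z(b, c):
    """McNemar Z statistic from the discordant cell counts B and C."""
    return (b - c) / sqrt(b + c)


z_12_60 = mcnemar_z(25, 16)                 # two-tail test, compare with +/-1.96
z_12_61 = mcnemar_z(9, 22)                  # lower-tail test, compare with -1.645
p_value_12_61 = NormalDist().cdf(z_12_61)   # lower-tail p-value

print(round(z_12_60, 4), round(z_12_61, 4), round(p_value_12_61, 4))
```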
12.61
(a) $H_0: \pi_1 \ge \pi_2$   $H_1: \pi_1 < \pi_2$ where 1 = beginning, 2 = end
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{9 - 22}{\sqrt{9 + 22}} = -2.3349$
Decision: Since ZSTAT = –2.3349 < –1.645, reject H0. There is enough evidence to conclude that the proportion of coffee drinkers who prefer Brand A is lower at the beginning of the advertising campaign than at the end of the advertising campaign.
(b) p-value = 0.0098. The probability of obtaining a data set which gives rise to a test statistic smaller than –2.3349 is 0.0098 if the proportion of coffee drinkers who prefer Brand A is not lower at the beginning of the advertising campaign than at the end of the advertising campaign.
12.62
(a) $H_0: \pi_1 = \pi_2$   $H_1: \pi_1 \ne \pi_2$ where 1 = prior, 2 = after
Decision rule: If ZSTAT < –2.5758 or ZSTAT > 2.5758, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{21 - 36}{\sqrt{21 + 36}} = -1.9868$
Decision: Since ZSTAT = –1.9868 is between the two critical bounds, do not reject H0. There is not enough evidence to conclude there is a difference in the proportion of voters who favored Candidate A prior to and after the debate.
(b) p-value = 0.0469. The probability of obtaining a sample which gives rise to a test statistic that differs from 0 by 1.9868 or more in either direction is 0.0469 if there is not a difference in the proportion of voters who favor Candidate A prior to and after the debate.
12.63
(a) $H_0: \pi_1 \ge \pi_2$   $H_1: \pi_1 < \pi_2$ where 1 = before, 2 = after
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{5 - 15}{\sqrt{5 + 15}} = -2.2361$
Decision: Since ZSTAT = –2.2361 < –1.645, reject H0. There is enough evidence to conclude that the proportion who prefer Brand A is lower before the advertising than after the advertising.
(b) p-value = 0.0127. The probability of obtaining a data set which gives rise to a test statistic smaller than –2.2361 is 0.0127 if the proportion who prefer Brand A is not lower before the advertising than after the advertising.
12.64
(a) $H_0: \pi_1 \ge \pi_2$   $H_1: \pi_1 < \pi_2$ where 1 = last year, 2 = now
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{5 - 20}{\sqrt{5 + 20}} = -3$
Decision: Since ZSTAT = –3 < –1.645, reject H0. There is enough evidence to conclude that satisfaction was lower last year prior to introduction of Six Sigma management.
(b) p-value = 0.0014. The probability of obtaining a data set which gives rise to a test statistic smaller than –3 is 0.0014 if the satisfaction was not lower last year prior to introduction of Six Sigma management.
12.65
(a) $H_0: \pi_1 \ge \pi_2$   $H_1: \pi_1 < \pi_2$ where 1 = year 1, 2 = year 2
Decision rule: If ZSTAT < –1.645, reject H0.
Test statistic: $Z_{STAT} = \frac{B - C}{\sqrt{B + C}} = \frac{4 - 25}{\sqrt{4 + 25}} = -3.8996$
Decision: Since ZSTAT = –3.8996 < –1.645, reject H0. There is enough evidence to conclude that the proportion of employees absent less than 5 days was lower in year 1 than in year 2.
(b) p-value is virtually zero. The probability of obtaining a data set which gives rise to a test statistic smaller than –3.8996 is virtually zero if the proportion of employees absent less than 5 days was not lower in year 1 than in year 2.
12.66
(a) For df = 25 and $\alpha = 0.01$, $\chi^2_{\alpha/2} = 10.520$ and $\chi^2_{1-\alpha/2} = 46.928$.
(b) For df = 16 and $\alpha = 0.05$, $\chi^2_{\alpha/2} = 6.908$ and $\chi^2_{1-\alpha/2} = 28.845$.
(c) For df = 13 and $\alpha = 0.10$, $\chi^2_{\alpha/2} = 5.892$ and $\chi^2_{1-\alpha/2} = 22.362$.
12.67
(a) For df = 23 and $\alpha = 0.01$, $\chi^2_{\alpha/2} = 9.2604$ and $\chi^2_{1-\alpha/2} = 44.1814$.
(b) For df = 19 and $\alpha = 0.05$, $\chi^2_{\alpha/2} = 8.9065$ and $\chi^2_{1-\alpha/2} = 32.8523$.
(c) For df = 15 and $\alpha = 0.10$, $\chi^2_{\alpha/2} = 7.2609$ and $\chi^2_{1-\alpha/2} = 24.9958$.
12.68
$\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{24 (150)^2}{100^2} = 54$
12.69
$\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{15 (10)^2}{12^2} = 10.417$
12.70
df = n – 1 = 16 – 1 = 15
12.71
(a) For df = 15 and $\alpha = 0.05$, $\chi^2_{\alpha/2} = 6.262$ and $\chi^2_{1-\alpha/2} = 27.488$.
(b) For df = 15 and $\alpha = 0.05$, the lower-tail critical value is $\chi^2_{\alpha} = 7.261$.
12.72
(a) If $H_1: \sigma \ne 12$, do not reject H0 since the test statistic $\chi^2_{STAT} = 10.417$ falls between the two critical bounds, $\chi^2_{\alpha/2} = 6.262$ and $\chi^2_{1-\alpha/2} = 27.488$.
(b) If $H_1: \sigma < 12$, do not reject H0 since the test statistic $\chi^2_{STAT} = 10.417$ is greater than the critical bound 7.261.
12.73
You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation. If the data selected do not come from an approximately normally distributed population, particularly for small sample sizes, the accuracy of the test can be seriously affected.
12.74
(a) $H_0: \sigma \le 1.2^\circ$F. The standard deviation of the oven temperature has not increased above 1.2°F.
$H_1: \sigma > 1.2^\circ$F. The standard deviation of the oven temperature has increased above 1.2°F.
Decision rule: df = 29. If $\chi^2_{STAT} > 42.557$, reject H0.
Test statistic: $\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{29 (2.1)^2}{1.2^2} = 88.813$
Decision: Since the test statistic of $\chi^2_{STAT} = 88.813$ is greater than the critical boundary of 42.557, reject H0. There is sufficient evidence to conclude that the standard deviation of the oven temperature has increased above 1.2°F.
(b) You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.
(c) p-value = 5.53 × 10^–8 or 0.00000005. The probability that a sample is obtained whose standard deviation is equal to or larger than 2.1°F when the null hypothesis is true is 5.53 × 10^–8, a very small probability. Note: The p-value was found using Excel.
12.75
(a) $H_0: \sigma = \$200$. The standard deviation of the amount of auto repairs is equal to $200.
$H_1: \sigma \ne \$200$. The standard deviation of the amount of auto repairs is not equal to $200.
Decision rule: df = 24. If $\chi^2_{STAT} < 12.401$ or $\chi^2_{STAT} > 39.364$, reject H0.
Test statistic: $\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{24 (237.52)^2}{200^2} = 33.849$
Decision: Since the test statistic of $\chi^2_{STAT} = 33.849$ is between the critical boundaries of 12.401 and 39.364, do not reject H0. There is insufficient evidence to conclude that the standard deviation of the amount of auto repairs is not equal to $200.
(b) You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.
(c) p-value = 2(0.0874) = 0.1748. The probability of obtaining a sample whose standard deviation will give rise to a test statistic equal to or more extreme than 33.849 is 0.1748 when the null hypothesis is true. Note: The p-value was found using Excel.
12.76
(a) $H_0: \sigma = 12$.   $H_1: \sigma \ne 12$.
Decision rule: df = 14. If $\chi^2_{STAT} < 6.571$ or $\chi^2_{STAT} > 23.685$, reject H0.
Test statistic: $\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{14 (9.25)^2}{12^2} = 8.319$
Decision: Since the test statistic of $\chi^2_{STAT} = 8.319$ is between the critical boundaries of 6.571 and 23.685, do not reject H0. There is insufficient evidence that the population standard deviation is different from 12.
(b) You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.
(c) p-value = 2(1 – 0.8721) = 0.2558. The probability of obtaining a test statistic equal to or more extreme than the result obtained from this sample data is 0.2558 if the standard deviation is 12. Note: Excel returns an upper-tail area of 0.8721 for $\chi^2_{STAT} = 8.319$. But since the sample standard deviation is smaller than the hypothesized value, the amount of area in the lower tail is (1 – 0.8721). That value is doubled to accommodate the two-tail hypotheses.
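The test statistics in Problems 12.74 through 12.76 all come from the same formula, so they can be checked with a few lines. This is a sketch only: the function names are illustrative, and the critical values are copied from the solutions above rather than computed, because the Python standard library has no chi-square distribution.

```python
def chi_square_variance_stat(n, s, sigma0):
    """Chi-square test statistic (n - 1) S^2 / sigma0^2 for a population variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2


def two_tail_decision(stat, lower, upper):
    """Two-tail decision given lower and upper critical values from a table."""
    return "reject H0" if stat < lower or stat > upper else "do not reject H0"


stat_12_74 = chi_square_variance_stat(n=30, s=2.1, sigma0=1.2)
stat_12_75 = chi_square_variance_stat(n=25, s=237.52, sigma0=200)
stat_12_76 = chi_square_variance_stat(n=15, s=9.25, sigma0=12)

print(round(stat_12_74, 3), "reject H0" if stat_12_74 > 42.557 else "do not reject H0")
print(round(stat_12_75, 3), two_tail_decision(stat_12_75, 12.401, 39.364))
print(round(stat_12_76, 3), two_tail_decision(stat_12_76, 6.571, 23.685))
```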
12.77
(a) $H_0: \sigma \ge 0.035$ inch. The standard deviation of the diameter of doorknobs is greater than or equal to 0.035 inch in the redesigned production process.
$H_1: \sigma < 0.035$ inch. The standard deviation of the diameter of doorknobs is less than 0.035 inch in the redesigned production process.
Decision rule: df = 24. If $\chi^2_{STAT} < 13.848$, reject H0.
Test statistic: $\chi^2_{STAT} = \frac{(n - 1) S^2}{\sigma^2} = \frac{24 (0.025)^2}{0.035^2} = 12.245$
Decision: Since the test statistic of $\chi^2_{STAT} = 12.245$ is less than the critical boundary of 13.848, reject H0. There is sufficient evidence to conclude that the standard deviation of the diameter of doorknobs is less than 0.035 inch in the redesigned production process.
(b) You must assume that the data in the population are normally distributed to be able to use the chi-square test of a population variance or standard deviation.
(c) p-value = (1 – 0.9770) = 0.0230. The probability of obtaining a test statistic equal to or more extreme than the result obtained from this sample data is 0.0230 if the population standard deviation is indeed no less than 0.035 inch.
12.78
(a) WL = 13, WU = 53
(b) WL = 10, WU = 56
(c) WL = 7, WU = 59
(d) WL = 5, WU = 61
12.79
(a) WU = 53
(b) WU = 56
(c) WU = 59
(d) WU = 61
12.80
(a) WL = 13
(b) WL = 10
(c) WL = 7
(d) WL = 5
12.81
Observation   Di      abs(Di)   Sign of Di   R      signed R   R(+)
1             3.2     3.2       +            6      6          6
2             1.7     1.7       +            2.5    2.5        2.5
3             4.5     4.5       +            7      7          7
4             0       0         Discard      -      -          -
5             11.1    11.1      +            9      9          9
6             -0.8    0.8       -            1      -1         0
7             2.3     2.3       +            5      5          5
8             -2      2         -            4      -4         0
9             0       0         Discard      -      -          -
10            14.8    14.8      +            10     10         10
11            5.6     5.6       +            8      8          8
12            1.7     1.7       +            2.5    2.5        2.5
$W = \sum_{i=1}^{n} R_i^{(+)} = 50$
12.82
n = 10, $\alpha = 0.05$, WL = 8, WU = 47
12.83
Since W = 50 > WU = 47, reject H0.
12.84
$W = \sum_{i=1}^{n} R_i^{(+)} = 67.5$
12.85
n = 12, $\alpha = 0.05$, WU = 61
12.86
Since W = 67.5 > WU = 61, reject H0.
12.87
(a) $H_0: M_D = 0$   $H_1: M_D \ne 0$ where Populations: 1 = A, 2 = B
n = 9, $W = \sum_{i=1}^{n} R_i^{(+)} = 2$
Decision rule: Reject H0 if W < 3 or W > 33. Since W = 2 is smaller than 3, reject H0. There is enough evidence of a difference in the median summated rating between Brand A and Brand B.
(b) In Problem 10.22, you conclude that there is enough evidence of a difference in the mean summated ratings between the two brands. Here, you conclude that there is enough evidence of a difference in the median summated rating between Brand A and Brand B.
12.88
(a)
$H_0: M_D = 0$   $H_1: M_D \ne 0$ where Populations: 1 = Internet, 2 = TV and $D_i = X_{Internet} - X_{TV}$
Using Excel
Internet TV Di abs(Di) Sign of Di
R
signed R
R+
65
60
5
5
+
2.57
2.57
2.57
73
66
7
7
+
14.5
14.5
14.5
61
53
8
8
+
17
17
17
65
60
5
5
+
2.57
2.57
2.57
71
66
5
5
+
2.57
2.57
2.57
62
65
-3
3
-
1.5
-1.5
64
59
5
5
+
2.57
2.57
2.57
61
56
5
5
+
2.57
2.57
2.57
64
64
0 Discard
+
74
68
6
6
+
13
13
13
56
53
3
3
+
1.5
1.5
1.5
73
65
8
8
+
17
17
17
71
59
12
12
+
19
19
19
53
49
4
4
+
4
4
4
65
60
5
5
+
2.57
2.57
2.57
70
66
4
4
+
4
4
4
71
64
7
7
+
14.5
14.5
14.5
71
63
8
8
+
17
17
17
65
61
4
4
+
4
4
4
61
56
5
5
+
2.57
2.57
2.57
n = 19, $W = \sum_{i=1}^{n} R_i^{(+)} = 143.5$
Decision rule: Reject H0 if W < 46 or W > 144. Since W = 143.5 is between 46 and 144, do not reject H0. There is insufficient evidence of a difference in the median service rating between Internet and TV service.
$\mu_W = \frac{n(n + 1)}{4} = \frac{19(20)}{4} = 95$
$\sigma_W = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{19(20)(39)}{24}} = 24.85$
$Z_{STAT} = \frac{W - \mu_W}{\sigma_W} = \frac{143.5 - 95}{24.85} = 1.95$
Since ZSTAT = 1.95 < 1.96, do not reject H0.
(b) Using the paired-sample t-test in Problem 10.21, you reject the null hypothesis; you conclude that there is evidence of a difference in the mean service rating between Internet and TV. Using the Wilcoxon signed rank test, you do not reject the null hypothesis; you conclude that there is not enough evidence of a difference in the median service rating between Internet and TV service.
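The ranking steps behind the Wilcoxon signed rank statistic (discard zero differences, rank the absolute differences with tied values sharing the average rank, then sum the ranks of the positive differences) can be sketched as follows. The function name is illustrative, and the data are the differences from Problem 12.81. The large-sample Z value is shown only to illustrate the formula; for small n the solutions compare W with table critical values instead.

```python
from math import sqrt


def wilcoxon_signed_rank(differences):
    """Return (n, W, Z): number of nonzero differences, sum of the positive
    ranks, and the large-sample Z approximation."""
    nonzero = [d for d in differences if d != 0]
    ordered = sorted(abs(d) for d in nonzero)
    # Tied absolute differences share the average of the positions they occupy.
    midrank = {}
    for value in set(ordered):
        first_position = ordered.index(value) + 1
        count = ordered.count(value)
        midrank[value] = first_position + (count - 1) / 2
    w = sum(midrank[abs(d)] for d in nonzero if d > 0)
    n = len(nonzero)
    mu_w = n * (n + 1) / 4
    sigma_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return n, w, (w - mu_w) / sigma_w


diffs_12_81 = [3.2, 1.7, 4.5, 0, 11.1, -0.8, 2.3, -2, 0, 14.8, 5.6, 1.7]
n, w, z = wilcoxon_signed_rank(diffs_12_81)
print(n, w, round(z, 4))
```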
12.89
(a)
$H_0: M_D = 0$   $H_1: M_D \ne 0$ where Populations: 1 = Restaurant, 2 = McDonald’s and $D_i = X_{Restaurant} - X_{McDonalds}$
Using Excel
Restaurant McDonalds Di abs(Di) Sign of Di R
signed R
R+
2.70
4.05
-1.35
1.35
-
5
-5
2.07
5.92
-3.85
3.85
-
14
-14
4.64
5.41
-0.77
0.77
-
4
-4
11.60
9.28
2.32
2.32
+
10.5
10.5 10.5
7.95
5.63
2.32
2.32
+
10.5
10.5 10.5
7.68
6.94
0.74
0.74
+
3
3
3
9.53
7.62
1.91
1.91
+
9
9
9
7.71
5.14
2.57
2.57
+
12
12
12
4.51
3.95
0.56
0.56
+
2
2
2
10.06
4.02
6.04
6.04
+
17
17
17
1.70
7.78
-6.08
6.08
-
18
-18
-18
3.15
4.84
-1.69
1.69
-
7
-7
20.32
8.13
12.19
12.19
+
25
25
25
14.54
8.73
5.81
5.81
+
16
16
16
7.33
5.87
1.46
1.46
+
6
6
6
10.99
4.81
6.18
6.18
+
19
19
19
4.52
6.33
-1.81
1.81
-
8
-8
20.00
9.00
11.00
11
+
24
24
24
17.39
10.44
6.95
6.95
+
21
21
21
17.39
9.28
8.11
8.11
+
22
22
22
5.59
5.96
-0.37
0.37
-
1
-1
13.71
9.14
4.57
4.57
+
15
15
15
15.85
9.23
6.62
6.62
+
20
20
20
8.86
5.57
3.29
3.29
+
13
13
13
26.87
16.12
10.75
10.75
+
23
23
23
n = 25, $W = \sum_{i=1}^{n} R_i^{(+)} = 250$
$\mu_W = \frac{n(n + 1)}{4} = \frac{25(26)}{4} = 162.5$
$\sigma_W = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{25(26)(51)}{24}} = 37.1652$
$Z_{STAT} = \frac{W - \mu_W}{\sigma_W} = \frac{250 - 162.5}{37.1652} = 2.35$
Decision rule: Reject H0 if ZSTAT < –1.96 or ZSTAT > 1.96. Since ZSTAT = 2.35 > 1.96, reject H0. There is evidence of a difference in the median meal cost between an inexpensive restaurant and McDonald’s.
(b) Using the paired-sample t-test in Problem 10.22, you reject the null hypothesis; you conclude that there is evidence that the mean meal cost is higher at an inexpensive restaurant than at McDonald’s. Using the Wilcoxon signed rank test, you reject the null hypothesis; you conclude that there is enough evidence that the median price is different at an inexpensive restaurant than at McDonald’s.
12.90
(a) $H_0: M_D = 0$   $H_1: M_D \ne 0$ where Populations: 1 = Coffeepot, 2 = K-Cup and $D_i = X_{Coffeepot} - X_{K\text{-}Cup}$
Using Excel
Coffeepot   K-Cup   Di   abs(Di)   Sign of Di   R      signed R   R+
22          23      -1   1         -            3.5    -3.5
24          21      3    3         +            10     10         10
23          22      1    1         +            3.5    3.5        3.5
25          24      1    1         +            3.5    3.5        3.5
20          22      -2   2         -            8      -8
19          20      -1   1         -            3.5    -3.5
24          25      -1   1         -            3.5    -3.5
25          26      -1   1         -            3.5    -3.5
20          18      2    2         +            8      8          8
19          21      -2   2         -            8      -8
n = 10, $W = \sum_{i=1}^{n} R_i^{(+)} = 25$
Decision rule: Reject H0 if W < 8 or W > 47. Since W = 25 is between 8 and 47, do not reject H0. There is insufficient evidence of a difference in the median overall scores between coffeepot-brewed and K-cup-brewed coffee.
$\mu_W = \frac{n(n + 1)}{4} = \frac{10(11)}{4} = 27.5$
$\sigma_W = \sqrt{\frac{n(n + 1)(2n + 1)}{24}} = \sqrt{\frac{10(11)(21)}{24}} = 9.8107$
$Z_{STAT} = \frac{W - \mu_W}{\sigma_W} = \frac{25 - 27.5}{9.8107} = -0.2548$
Since ZSTAT = –0.2548 is between –1.96 and 1.96, do not reject H0.
(b) Using the paired-sample t-test in Problem 10.23, you do not reject the null hypothesis; you conclude that there is insufficient evidence of a difference in the mean scores of coffeepot-brewed and K-cup-brewed coffee. Using the Wilcoxon signed rank test, you do not reject the null hypothesis; you conclude that there is insufficient evidence of a difference in the median overall scores between coffeepot-brewed and K-cup-brewed coffee.
12.91
(a) $H_0: M_D \ge 0$   $H_1: M_D < 0$ where Populations: 1 = two days, 2 = seven days
Minitab Output:
Wilcoxon Signed Rank Test: Differences
Test of median = 0.000000 versus median < 0.000000
            N    N for Test   Wilcoxon Statistic   P       Estimated Median
Differen    40   40           0.0                  0.000   -0.5100
Since the p-value is smaller than the 0.01 level of significance, reject H0. There is sufficient evidence that the median strength is less at two days than at seven days.
(b) Using the paired-sample t-test in Problem 10.26, you reject the null hypothesis and conclude that there is enough evidence that the mean strength is less at two days than at seven days. Using the Wilcoxon signed rank test, you reject the null hypothesis and conclude that there is enough evidence that the median strength is less at two days than at seven days.
12.92
d.f. = 5, $\alpha = 0.1$, $\chi^2_U = 9.2363$
12.93
(a) $H_0: M_1 = M_2 = M_3 = M_4 = M_5 = M_6$   $H_1$: At least one of the medians differs. Reject H0 if FR > 9.2363.
(b) Since FR = 11.56 > 9.2363, reject H0. There is enough evidence that the medians are different.
12.94
Minitab output:
Friedman Test: Rating versus Brand, Expert
Friedman test for Rating by Brand blocked by Expert
S = 20.03   DF = 3   P = 0.000
S = 20.72   DF = 3   P = 0.000 (adjusted for ties)
Brand   N   Est Median   Sum of Ranks
A       9   25.000       25.0
B       9   26.750       34.5
C       9   24.000       20.0
D       9   22.250       10.5
Grand median = 24.500
(a) $H_0: M_A = M_B = M_C = M_D$   $H_1$: Not all medians are equal. Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median summated ratings of the four brands of Colombian coffee.
(b) In (a), you conclude that there is evidence of a difference in the median summated ratings of the four brands of Colombian coffee while in problem 11.23, you conclude that there is evidence of a difference in the mean summated ratings of the four brands of Colombian coffee.
12.95
(a)
$H_0: M_{Internet} = M_{TV}$   $H_1$: The medians are not equal.
From Excel
Company                        Internet   Rank   TV   Rank
AT&T                           65         2      60   1
Armstrong                      73         2      66   1
Atlantic Broadband             61         2      53   1
Charter (Spectrum)             65         2      60   1
Cincinnati Bell                71         2      66   1
Consolidated Communications    62         1      65   2
Cox                            64         2      59   1
Frontier                       61         2      56   1
Lumen (Century Link)           64         1.5    64   1.5
Midco                          74         2      68   1
Optimum                        56         2      53   1
Astound (RCN)                  73         2      65   1
Sparklight                     71         2      59   1
SuddenLink                     53         2      49   1
TDS                            65         2      60   1
Verizon                        70         2      66   1
Wave                           71         2      64   1
WOW!                           71         2      63   1
xfinity (Comcast)              65         2      61   1
Xtream (Mediacom)              61         2      56   1
Rank Totals                               38.5        21.5
Rank Totals Squared                       1482.25     462.25
r = 20, c = 2
Sum of Rankings = 38.5 + 21.5 = 60
Check the Rankings: $\frac{rc(c + 1)}{2} = \frac{20(2)(3)}{2} = 60$
$F_R = \frac{12}{rc(c + 1)} \sum_{j=1}^{c} R_{.j}^2 - 3r(c + 1) = \frac{12}{20(2)(3)} \left( 38.5^2 + 21.5^2 \right) - 3(20)(3) = 14.45$
d.f. = c – 1 = 1. For α = 0.05, critical value = 3.841.
Since FR = 14.45 > 3.841, reject H0. There is evidence of a difference in the median service rating between Internet and TV Service.
(b) In (a), you conclude that there is evidence of a difference in the median service rating between Internet and TV Service while in problem 11.52, you conclude that there is evidence of a difference in the mean rating between Internet and TV Service.
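The Friedman statistic in Problem 12.95 needs only the rank totals per group and the number of blocks. The sketch below recomputes it and also checks the unadjusted S value that Minitab reports for Problem 12.94; `friedman_statistic` is an illustrative name, and no correction for ties is applied.

```python
def friedman_statistic(rank_totals, r):
    """Friedman rank test statistic F_R from the c group rank totals and
    the number of blocks r (no adjustment for ties)."""
    c = len(rank_totals)
    # Sanity check: the rank totals must add up to rc(c + 1)/2.
    expected_sum = r * c * (c + 1) / 2
    if abs(sum(rank_totals) - expected_sum) > 1e-9:
        raise ValueError("rank totals do not sum to rc(c + 1)/2")
    sum_of_squares = sum(total ** 2 for total in rank_totals)
    return 12 / (r * c * (c + 1)) * sum_of_squares - 3 * r * (c + 1)


fr_12_95 = friedman_statistic([38.5, 21.5], r=20)            # Internet vs. TV
fr_12_94 = friedman_statistic([25.0, 34.5, 20.0, 10.5], r=9)  # four coffee brands

print(round(fr_12_95, 2), round(fr_12_94, 2))
```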
12.96
Minitab output:
(a) $H_0: M_A = M_B = M_C = M_D$   $H_1$: Not all medians are equal. Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median prices for these items at the four supermarkets.
(b) In (a), you conclude that there is evidence of a difference in the median prices for these items at the four supermarkets while in problem 11.25, you conclude that there is evidence of a difference between the mean price of these items at the four supermarkets.
12.97
Minitab output:
Friedman Test: Thickness versus Position, Batch1
Friedman test for Thickness by Position blocked by Batch1
S = 97.97   DF = 4   P = 0.000
S = 99.63   DF = 4   P = 0.000 (adjusted for ties)
Position   N    Est Median   Sum of Ranks
1          30   240.45       32.0
2          30   242.55       64.0
18         30   245.25       97.5
19         30   249.15       141.0
28         30   246.85       115.5
Grand median = 244.85
(a) $H_0: M_1 = M_2 = M_{18} = M_{19} = M_{28}$   $H_1$: Not all medians are equal. Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median thickness of the wafers for the five positions.
(b) In (a), you conclude that there is evidence of a difference in the median thickness of the wafers for the five positions, and in problem 11.27, you conclude that there is evidence of a difference in the mean thickness of the wafers for the five positions.
12.98
Minitab output:
Friedman Test: Strength versus Days, Samples
Friedman test for Strength by Days blocked by Samples
S = 80.00   DF = 2   P = 0.000
Days   N    Est Median   Sum of Ranks
2      40   3.0863       40.0
7      40   3.5888       80.0
28     40   4.5838       120.0
Grand median = 3.7529
(a) $H_0: M_2 = M_7 = M_{28}$   $H_1$: Not all medians are equal. Since the p-value is virtually zero, reject H0 at the 0.05 level of significance. There is evidence of a difference in the median compressive strength after 2, 7 and 28 days.
(b) In (a), you conclude that there is evidence of a difference in the median compressive strength after 2, 7 and 28 days, and in problem 11.28, you conclude that there is evidence of a difference in the mean compressive strength after 2, 7 and 28 days.
Chapter 16
16.65
The price of the commodity in 2018 was 75% higher than in 1995.
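16.66
(a) 2016 as the base year:
$I_{2016} = \frac{P_{2016}}{P_{2016}}(100) = \frac{\$5}{\$5}(100) = 100$
$I_{2017} = \frac{P_{2017}}{P_{2016}}(100) = \frac{\$8}{\$5}(100) = 160$
$I_{2018} = \frac{P_{2018}}{P_{2016}}(100) = \frac{\$7}{\$5}(100) = 140$
(b) 2017 as the base year:
$I_{2016} = \frac{P_{2016}}{P_{2017}}(100) = \frac{\$5}{\$8}(100) = 62.5$
$I_{2017} = \frac{P_{2017}}{P_{2017}}(100) = \frac{\$8}{\$8}(100) = 100$
$I_{2018} = \frac{P_{2018}}{P_{2017}}(100) = \frac{\$7}{\$8}(100) = 87.5$
16.67
(a) $I_U^{2018} = \frac{\sum_{i=1}^{3} P_i^{2018}}{\sum_{i=1}^{3} P_i^{1995}}(100) = \frac{43}{23}(100) = 186.96$
(b) $I_L^{2018} = \frac{\sum_{i=1}^{3} P_i^{2018} Q_i^{1995}}{\sum_{i=1}^{3} P_i^{1995} Q_i^{1995}}(100) = \frac{240}{148}(100) = 162.16$
(c) $I_P^{2018} = \frac{\sum_{i=1}^{3} P_i^{2018} Q_i^{2018}}{\sum_{i=1}^{3} P_i^{1995} Q_i^{2018}}(100) = \frac{227}{147}(100) = 154.42$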
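The index calculations in 16.66 and 16.67 can be checked with a short Python sketch. The function names below are illustrative only (they do not come from the text), and the aggregate totals 43, 23, 240, 148, 227, and 147 are taken directly from the solution above rather than rebuilt from the underlying price and quantity data.

```python
def simple_price_index(price, base_price):
    """Simple price index: price relative to the base-period price, times 100."""
    return price * 100 / base_price


def aggregate_index(current_total, base_total):
    """Aggregate index: ratio of two already-summed totals, times 100."""
    return current_total * 100 / base_total


# Problem 16.66: prices of the commodity by year
prices = {2016: 5, 2017: 8, 2018: 7}
index_base_2016 = {year: simple_price_index(p, prices[2016]) for year, p in prices.items()}
index_base_2017 = {year: simple_price_index(p, prices[2017]) for year, p in prices.items()}

# Problem 16.67: totals as reported in the solution
unweighted_2018 = aggregate_index(43, 23)    # unweighted aggregate index
laspeyres_2018 = aggregate_index(240, 148)   # base-period (1995) quantities as weights
paasche_2018 = aggregate_index(227, 147)     # current-period (2018) quantities as weights

print(index_base_2016, index_base_2017)
print(round(unweighted_2018, 2), round(laspeyres_2018, 2), round(paasche_2018, 2))
```

Changing the base year rescales every index by the same factor, which is why the 2017-based values are the 2016-based values divided by 1.6.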
16.68
(a),(b)
(c) The price index using 2000 as the base year is more useful because it is closer to the present and the DJIA has grown more than 200% over the period.
16.69
(a), (c) For MLB salaries (mean salary of Major League Baseball players on opening day):
(b) The MLB salary in 2022 is 86.08% higher than it was in 2003.
(d) The MLB salary in 2022 is 37.38% higher than it was in 2012.
(e) Using 2012 as the base year is more useful because it is closer to the present.
(f) There is an upward trend in MLB salaries from 2004 to 2017, followed by a leveling off from 2017 to 2020, with a dip in 2021.
16.70
(a), (c)
(b) The average price per pound of fresh tomatoes in 2018 in the U.S. is 210.10% higher than it was in 1980.
(d) The average price per pound of fresh tomatoes in 2014 in the U.S. is 25.65% higher than it was in 1990.
(e) There is an upward trend in the cost of fresh tomatoes from 1980 to 2018 with a prominent cyclical component.
16.71
(a) Price indexes (base = 1992)

Year    Electricity Price Index    Natural Gas Price Index    Fuel Oil Price Index
1992    100                        100                        100
1993    38.10925597                108.9968153                98.37563452
1994    108.3121728                114.6345162                93.29949239
1995    109.8267455                113.2544738                92.69035533
1996    109.0717063                112.1094935                102.2335025
1997    110.6604346                124.7497725                115.3299492
1998    104.269567                 119.1916894                98.07106599
1999    101.2583987                116.3898999                84.67005076
2000    101.5864812                120.048529                 120.7106599
2001    106.6762545                188.5577798                153.1979695
2002    107.5661221                137.6857749                114.0101523
2003    107.1054583                152.5136488                141.7258883
2004    110.4671805                174.5829542                153.0964467
2005    114.2603537                193.0997877                188.7309645
2006    128.5881216                251.7515924                245.4822335
2007    135.5992                   213.4402487                240.4060914
2008    144.203501                 205.4898392                338.7817259
2009    141.5698524                167.5765848                254.7208122
2010    139.3227118                163.4819533                301.2182741
2011    140.4462821                156.8092205                346.7005076
2012    133.4464394                149.4388838                375.3299492
2013    144.9405631                151.0464058                389.9492386
2014    150.5584144                155.6718229                396.3451777
2015    155.0526954                157.1125265                285.3807107
2016    150.5584144                136.1844101                200
2017    150.5584144                151.6530179                252.1827411
2018    152.8055549                158.9323628                294.6192893
16.71 cont.
(b) Price indexes (base = 1996)

Year    Electricity Price Index    Natural Gas Price Index    Fuel Oil Price Index
1992    91.68280522                89.19851201                97.81529295
1993    34.93963493                97.22353737                96.22641509
1994    99.30363839                102.2522827                91.2611718
1995    100.6922411                101.0213054                90.6653426
1996    100                        100                        100
1997    101.4565907                111.2749408                112.8103277
1998    95.597264                  106.3172134                95.9285005
1999    92.83654044                103.8180588                82.82025819
2000    93.1373357                 107.0815015                118.0734856
2001    97.8037826                 168.1907339                149.8510427
2002    98.61963822                122.8136625                111.5193644
2003    98.19728872                136.0399053                138.6295929
2004    101.2794099                155.7253974                149.7517378
2005    104.7570975                172.2421373                184.6077458
2006    117.8931971                224.5586743                240.1191658
2007    124.3211504                190.3855259                235.1539225
2008    132.209815                 183.2938789                331.3803376
2009    129.795212                 149.4758201                249.1559086
2010    127.7349705                145.8234697                294.6375372
2011    128.7650913                139.8714914                339.1261172
2012    122.3474391                133.2972607                367.1300894
2013    132.8855742                134.7311464                381.4299901
2014    138.0361778                138.8569496                387.6861966
2015    142.1566608                140.1420358                279.1459782
2016    138.0361778                121.4744674                195.6305859
2017    138.0361778                135.2722354                246.673287
2018    140.0964193                141.7653027                288.182721
16.71 cont.
(c) For 2018, using 1992 as the base period:
$I_U^{2018} = \frac{68.000 + 41.920 + 2.902}{44.501 + 26.376 + 0.985}(100) = 156.998$

Year    Electricity    Natural Gas    Fuel Oil    Unweighted
1992    44.501         26.376         0.985       100.000
1993    16.959         28.749         0.969       64.954
1994    48.200         30.236         0.919       110.427
1995    48.874         29.872         0.913       110.850
1996    48.538         29.570         1.007       110.093
1997    49.245         32.904         1.136       115.896
1998    46.401         31.438         0.966       109.662
1999    45.061         30.699         0.834       106.585
2000    45.207         31.664         1.189       108.625
2001    47.472         49.734         1.509       137.367
2002    47.868         36.316         1.123       118.709
2003    47.663         40.227         1.396       124.246
2004    49.159         46.048         1.508       134.584
2005    50.847         50.932         1.859       144.218
2006    57.223         66.402         2.418       175.396
2007    60.343         56.297         2.368       165.606
2008    64.172         54.200         3.337       169.365
2009    63.000         44.200         2.509       152.666
2010    62.000         43.120         2.967       150.409
2011    62.500         41.360         3.415       149.279
2012    59.385         39.416         3.697       142.632
2013    64.500         39.840         3.841       150.540
2014    67.000         41.060         3.904       155.804
2015    69.000         41.440         2.811       157.595
2016    67.000         35.920         1.970       145.960
2017    67.000         40.000         2.484       152.353
2018    68.000         41.920         2.902       156.998

(d) Base year = 1992:
$I_L^{2018} = \frac{\sum_{i=1}^{3} P_i^{2018} Q_i^{1992}}{\sum_{i=1}^{3} P_i^{1992} Q_i^{1992}}(100) = \frac{68.000(10) + 41.920(24) + 2.902(400)}{44.501(10) + 26.376(24) + 0.985(400)}(100) = 193.3977$
(e) 6,500 kWh = 13 units; 1,040 therms = 26 units; 235 gal = 235 units. Base year = 1992:
$I_L^{2018} = \frac{\sum_{i=1}^{3} P_i^{2018} Q_i^{1992}}{\sum_{i=1}^{3} P_i^{1992} Q_i^{1992}}(100) = \frac{68.000(13) + 41.920(26) + 2.902(235)}{44.501(13) + 26.376(26) + 0.985(235)}(100) = 177.5608$
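The unweighted and Laspeyres calculations in 16.71 (c) through (e) can be reproduced with the following Python sketch. The function and variable names are illustrative assumptions, not part of the text; the prices and quantity weights are the 1992 and 2018 values shown in the solution.

```python
def unweighted_aggregate_index(current_prices, base_prices):
    """Sum of current prices over sum of base-period prices, times 100."""
    return sum(current_prices) / sum(base_prices) * 100


def laspeyres_index(current_prices, base_prices, base_quantities):
    """Laspeyres index: both price vectors are weighted by base-period quantities."""
    numerator = sum(p * q for p, q in zip(current_prices, base_quantities))
    denominator = sum(p * q for p, q in zip(base_prices, base_quantities))
    return numerator / denominator * 100


# Order of items: electricity, natural gas, fuel oil
prices_1992 = [44.501, 26.376, 0.985]
prices_2018 = [68.000, 41.920, 2.902]

unweighted_2018 = unweighted_aggregate_index(prices_2018, prices_1992)
laspeyres_d = laspeyres_index(prices_2018, prices_1992, [10, 24, 400])  # part (d) weights
laspeyres_e = laspeyres_index(prices_2018, prices_1992, [13, 26, 235])  # part (e) weights

print(round(unweighted_2018, 3), round(laspeyres_d, 4), round(laspeyres_e, 4))
```

Comparing parts (d) and (e) shows how sensitive the weighted index is to the consumption quantities: lowering the fuel oil weight from 400 to 235 units reduces the index noticeably because fuel oil had the largest relative price increase.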
Instructional Tips and Solutions for Digital Cases
Chapter 2
Instructional Tips
1. Students should develop a frequency distribution of the More Winners data along with at least one graph such as a histogram, polygon, or cumulative percentage polygon.
2. One objective is to have students look beyond the actual statistical results generated to evaluate the claims presented. For the More Winners data, this might include a comparison with tables and charts developed for the entire Mutual Funds data set. Such a comparison would lead to the realization that all eight funds in the “Big Eight” are high-risk funds that may have a great deal of variation in their return.
3. The presentation of information can lead to different perceptions of a business. This can be seen in the aggressive approach taken in the home page.
Solutions
1. Yes. There is a breathless, exaggerated style to the writing and the illustrations are very busy and colorful without conveying much information. There is also a certain aggressiveness in exclamations to “show me the data.” Claims are made, but supporting evidence is scant. The style is reminiscent of a misleading infomercial. The graphs on pages 5 and 6 have poor design that obscures their meaning, if any. Also, nowhere in the document does EndRun disclose its principals and the address of its operations, something that a reputable business would surely do. And a testimonial page at the end is more suitable for an infomercial selling a consumer product and not something one would expect to see from a reputable financial services firm.
2. Frequencies (Return(%))

Bins      Frequency    Percentage    Cumulative %    Midpts
–50       0            0.00%         0.00%           --
–40       1            3.45%         3.45%           –45
–30       3            10.34%        13.79%          –35
–20       4            13.79%        27.59%          –25
–10       2            6.90%         34.48%          –15
–0.01     1            3.45%         37.93%          –5
9.99      9            31.03%        68.97%          5
20        3            10.34%        79.31%          15
30        2            6.90%         86.21%          25
40        3            10.34%        96.55%          35
50        1            3.45%         100.00%         45

Although the claim is literally true, the data show a wide range of returns for the 29 mutual funds selected by EndRun investors. Although 18 funds had positive returns, 11 had negative returns for the five-year period. Of the funds having negative returns, many had large losses, with 27.59% having annualized losses of 20% or more. Many of the positive returns were small, with 31.03% having an annualized return between 0 and 10%. All of this raises questions about the effectiveness of the EndRun investment service.
3. Since mutual funds are rated by risk, it would be important to know the “risk” of the funds EndRun chooses. “High” risk funds, as all eight turn out to be, are not a wise choice for certain types of investors. An in-depth analysis would also see if the eight funds were representative of the performance of that group (no, the eight are among the weakest performers, as it turns out). In addition, examining summary measures (discussed in Chapter 3) would also be helpful in evaluating the “Big Eight” funds.
4. You would hope that one’s investment “grew” over time. Whether this is reason to be truly proud would again be based on a comparison to a similar group of funds. You would also like to know such things as whether the gain in value is greater than any inflation that might have occurred during that period. Even more sophisticated reasoning would look at financial planning analysis to see if an investment in the “Big Eight” was a worthy one or one that showed a real gain after tax considerations. A warning flag, however, is that the business feels the need to state that it is “proud” even as it does not state a comparative (such as “we are proud to have outperformed all of the leading national investment services”). Such an emotional claim suggests a lack of rational data that could otherwise be used to make a more persuasive case for using EndRun’s service.
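As a rough check on the frequency table in Solution 2, the percentage and cumulative percentage columns can be rebuilt from the frequency counts alone. This is a minimal Python sketch; the variable names are illustrative, and the counts are copied from the table rather than derived from the raw More Winners data.

```python
# Upper bin limits and frequency counts as listed in the table
bins = [-50, -40, -30, -20, -10, -0.01, 9.99, 20, 30, 40, 50]
frequencies = [0, 1, 3, 4, 2, 1, 9, 3, 2, 3, 1]

total = sum(frequencies)  # number of funds in the sample

percentages = [f / total * 100 for f in frequencies]

# Build cumulative percentages from running counts to avoid accumulating rounding error
cumulative = []
running_count = 0
for f in frequencies:
    running_count += f
    cumulative.append(running_count / total * 100)

for upper, f, pct, cum in zip(bins, frequencies, percentages, cumulative):
    print(f"{upper:>7} {f:>3} {pct:6.2f}% {cum:7.2f}%")
```

The cumulative value at the –20 bin (27.59%) is the share of funds with annualized losses of 20% or more cited in the solution.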
Chapter 3
Instructional Tips
1. Students should compute descriptive statistics and develop a boxplot for the More Winners sample. They should compare the measures of central tendency and take note of the measures of variation. The boxplot can be used to evaluate the symmetry of the data.
2. All too often, means and standard deviations are computed on data from a scale (usually 5 or 7 points) that is ordinal at best. Students should be cautioned that such statistics are of questionable value.
Solutions
1. Return(%)

Mean                  –0.61724
Standard Error        4.533863
Median                1.1
Mode                  1.1
Standard Deviation    24.4156
Sample Variance       596.1215
Range                 85
Minimum               –41.9
Maximum               43.1
Sum                   –17.9
Count                 29
Largest(1)            43.1
Smallest(1)           –41.9

[Boxplot: Returns(%) for More Winners; horizontal axis Return(%), from –50 to 40]

For the sample of 29 investors, the average annualized rate of return is –0.62% and the median annualized rate of return is only 1.1%. Thus, half the investors are either losing money or have a very small return. In addition, there is a very large amount of variability, with a standard deviation of over 24% in the annualized return. The data appear fairly symmetric since the distance between the minimum return and the median is about the same as the distance between the median and the largest return. However, the first quartile is more distant from the median than is the third quartile.
2. Calculating mean responses for a categorical variable is a naïve error at best. No methodology for collecting this survey is offered. For several questions, the neutral response dominates, surely not an enthusiastic endorsement of EndRun! Strangely, for the question “How satisfied do you expect to be when using EndRun's services in the coming year?” only 19 responses appear, compared with 26 or 27 responses for the other questions (see the next question). Eliminating the means and considering the questions as categorical variables and then developing a bar chart for each question would be more appropriate.
3. As proposed, the question expects that the person being surveyed will be using EndRun. Most likely, the missing responses reflect persons who had already planned not to use EndRun and therefore could not answer the question as posed. Survey questions that would uncover reasons for planning to use or not use would be more insightful.
Chapter 4
Instructional Tips
The main goal of the Digital Case for this chapter is to have students be able to distinguish between what is a simple probability, a joint probability, and a conditional probability.
Solutions
1.
                     Return not less than 20%    Return less than 20%
Best 10 Customers    8                           2
Other Customers      0                           19

The claim “four-out-of-five chance of getting annualized rates of return of no less than 20%” is literally accurate, but it applies only to EndRun’s best 10 customers. A more accurate probability would consider all customers (8/25, or about 32%). In fact, none of EndRun’s other customers achieved a return of not less than 20%. Another issue is that you do not know the actual return rates for each customer, so you cannot calculate any meaningful descriptive statistics.
2.
                 Invested at EndRun
Made money?      Yes      No
Yes              15       98
No               10       41

The 6% probability calculated (10/164 = 6.10%) is actually the joint probability of investing at EndRun and losing money. The probability that an EndRun investor lost money is the conditional probability of losing money given an investment in EndRun, which is equal to 10/25 = 40%.
3. Since the patterns of security markets are somewhat unpredictable by their nature, any probabilities based on past performance are not necessarily indicative of future events. Even if EndRun had the “best” probability for “success,” that would be no guarantee that their investment strategy would work in tomorrow’s market.
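The distinction between the joint and the conditional probability in Solution 2 can be illustrated with a few lines of Python. The dictionary layout and names below are assumptions made for this sketch; the counts come from the contingency table in the solution.

```python
# Counts keyed by (made_money, invested_at_endrun)
counts = {
    ("yes", "yes"): 15,
    ("yes", "no"): 98,
    ("no", "yes"): 10,
    ("no", "no"): 41,
}

grand_total = sum(counts.values())
endrun_total = counts[("yes", "yes")] + counts[("no", "yes")]

# Joint probability: invested at EndRun AND lost money, out of everyone surveyed
p_joint = counts[("no", "yes")] / grand_total

# Conditional probability: lost money GIVEN an investment at EndRun
p_conditional = counts[("no", "yes")] / endrun_total

print(f"joint = {p_joint:.4f}, conditional = {p_conditional:.4f}")
```

The numerator is the same in both cases; only the denominator changes, from all 164 investors to the 25 who invested at EndRun.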
Chapter 5
Instructional Tips
This digital case involves computing expected values and standard deviations of probability distributions and then using portfolio risk to obtain a good expected return with a lower risk than what would be involved if an entire investment was made in one fund.
1. Students need to realize that a very good return may occur only under certain circumstances.
2. Students need to realize that how the probabilities of the various events are obtained is of crucial importance to the results.
3. Using PHStat2, students can determine the expected portfolio return and portfolio risk of different combinations of two different funds.
Solutions
1. Yes! “With EndRun's Worried Bear Fund, you can get a four hundred percent rate of return in times of recession!” However, EndRun itself estimates the probability of recession at only 20% in its own calculations. “With EndRun's Happy Bull Fund, you can make twelve times your initial investment (that's a 1200 percent rate of return!) in a fast expanding, booming economy.” In this case, EndRun itself estimates the probability of a fast-expanding economy at only 10%.
2. Estimating the probabilities of the outcomes is very subjective. It is never made clear how the value of the outcomes was determined.
3. There are several factors to consider. Most obviously, if an investor believed in a different set of probabilities, then the Worried Bear fund would not necessarily have the better expected return. An investor more concerned about risk would want to examine other measures (such as the standard deviation of each investment, the expected portfolio return, and the portfolio risk of different combinations of investments). Investors who hedge might also invest in a lower expected return fund if the pattern of outcomes is radically different (as it is in the case of the two EndRun funds).
3. cont. EndRun Portfolio Analysis

Outcomes                  P      Happy Bull    Worried Bear
fast expanding economy    0.1    1200          –300
expanding economy         0.2    600           –200
weak economy              0.5    –100          100
recession                 0.2    –900          400

Weight Assigned to X       0.5

Statistics
E(X)                       10
E(Y)                       60
Variance(X)                382900
Standard Deviation(X)      618.7891
Variance(Y)                50400
Standard Deviation(Y)      224.4994
Covariance(XY)             –137600
Variance(X+Y)              158100
Standard Deviation(X+Y)    397.6179

Portfolio Management
Weight Assigned to X    Weight Assigned to Y    Portfolio Expected Return    Portfolio Risk
0.5                     0.5                     35                           198.809
0.3                     0.7                     45                           36.94591
0.2                     0.8                     50                           59.4979
0.1                     0.9                     55                           141.0142
0.7                     0.3                     25                           366.5583
0.9                     0.1                     15                           534.6821
Note that of the two funds, Worried Bear has both a higher expected return and a lower standard deviation. From the results above, it appears that a good approach is to invest more in the Worried Bear fund than the Happy Bull fund to achieve a higher expected portfolio return while minimizing the risk. A reasonable choice is to invest 30% in the Happy Bull fund and 70% in the Worried Bear fund to achieve an expected portfolio return of 45 with a portfolio risk of 36.95. This risk is substantially below the standard deviation of 618 for the Happy Bull fund and 224 for the Worried Bear fund. The expected portfolio return of 45 is much higher than the expected return for investing in only the Happy Bull fund and is somewhat below the expected return for investing completely in the Worried Bear fund. Of course, with the knowledge about EndRun accumulated through the Digital Cases in Chapters 2–5, a reasonable course of action would be not to invest any money with EndRun!
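The PHStat portfolio results can be reproduced from the outcome table with a short Python sketch. It assumes, consistent with E(X) = 10 and E(Y) = 60 in the output, that X is the Happy Bull fund and Y is the Worried Bear fund; the function and variable names are illustrative.

```python
from math import sqrt

# (probability, Happy Bull return X, Worried Bear return Y) for each economic outcome
outcomes = [
    (0.1, 1200, -300),  # fast expanding economy
    (0.2, 600, -200),   # expanding economy
    (0.5, -100, 100),   # weak economy
    (0.2, -900, 400),   # recession
]

exp_x = sum(p * x for p, x, _ in outcomes)
exp_y = sum(p * y for p, _, y in outcomes)
var_x = sum(p * (x - exp_x) ** 2 for p, x, _ in outcomes)
var_y = sum(p * (y - exp_y) ** 2 for p, _, y in outcomes)
cov_xy = sum(p * (x - exp_x) * (y - exp_y) for p, x, y in outcomes)


def portfolio(weight_x):
    """Return (expected return, risk) for a portfolio with weight_x in X and the rest in Y."""
    weight_y = 1 - weight_x
    expected_return = weight_x * exp_x + weight_y * exp_y
    variance = (weight_x ** 2 * var_x
                + weight_y ** 2 * var_y
                + 2 * weight_x * weight_y * cov_xy)
    return expected_return, sqrt(variance)


for w in (0.5, 0.3, 0.2, 0.1, 0.7, 0.9):
    expected_return, risk = portfolio(w)
    print(f"w_X = {w:.1f}: E(P) = {expected_return:.0f}, risk = {risk:.4f}")
```

The large negative covariance is what drives the risk down so sharply near a 30/70 split: the two funds move in opposite directions across the four outcomes.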
Chapter 6
Instructional Tips
This digital case consists of two parts: determining whether the download times are approximately normally distributed, and then evaluating the validity of various statements made concerning the download times that relate to understanding the meaning of probabilities from the normal distribution.
Solutions
1. Statistics

Sample Size       300
Mean              13.3472
Median            13.535
Std. Deviation    3.137250
Minimum           5.15
Maximum           21.31

From the normal probability plot, the data appear to be approximately normally distributed. In addition, the distance from the minimum value to the median is approximately the same as the distance from the median to the maximum value.
2. “Standard deviation of 3.1272” This is false because the standard deviation is 3.137250.
“• One time out of every 10 times, an individual user will experience a download time that is greater than 17.37 seconds.” The probability of a download time above 17.37 seconds is 9.3%. The probability of a download time of 17.37 seconds or greater is 9.67%. However, this does not mean that one of every ten downloads will take more than 17.37 seconds. It means that if the data are normally distributed with $\mu$ = 13.3472 seconds and the standard deviation equal to 3.137250 seconds, 10% of all downloads will take more than 17.37 seconds.
“• An 18-second download time has a probability of only 0.069.” This is false since the probability of an exact download time is zero. Statements should be made concerning the likelihood that the download time is less than a specific value. For example, the probability of a download time less than 18 seconds is 0.9316 or 93.16%.

Normal Probabilities
Common Data
Mean                    13.3472
Standard Deviation      3.1272
Probability for X <=
X Value                 18
Z Value                 1.4878486
P(X<=18)                0.9316

“• Because all the download times fall within plus or minus 3 standard deviations, the movie download process meets the Six Sigma benchmark for industrial quality. (Recall that senior management held a meeting last month on the importance of the Six Sigma methodology.)” Note: Six Sigma is discussed in online Chapter 19 of the text. This statement is “double talk.” In a normal distribution, 99.7% of all measurements fall within plus or minus 3 standard deviations. Six Sigma is a managerial approach designed to create processes that result in no more than 3.4 defects per million. The QRT needs to determine the requirements of the customers and then determine the capability of the current process (see Section 19.6) before embarking on quality improvement efforts.
3. If the standard deviation is lowered to 2 seconds keeping the mean at 13.3472 seconds, the probability of a download taking less than 18 seconds would change from 0.9316 to 0.9900.

Normal Probabilities
Common Data
Mean                    13.3472
Standard Deviation      2
Probability for X <=
X Value                 18
Z Value                 2.3264
P(X<=18)                0.9900

4. If the standard deviation was assumed to be the same as it was previously at 3.1272, the probability of obtaining a download time below a specific number of seconds would increase. For example, the probability of having a download time below 18 seconds with a mean of 11.3472 seconds instead of a mean of 13.3472 seconds is 98.33% instead of 93.16%.

Normal Probabilities
Common Data
Mean                    11.3472
Standard Deviation      3.1272
Probability for X <=
X Value                 18
Z Value                 2.1274
P(X<=18)                0.9833
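The three normal probabilities in Solutions 2 through 4 can be checked without PHStat by using `statistics.NormalDist` from the Python standard library. This is a minimal sketch; the helper name `prob_below` is an assumption made for illustration.

```python
from statistics import NormalDist


def prob_below(x, mean, std_dev):
    """P(X <= x) for a normal distribution with the given mean and standard deviation."""
    return NormalDist(mu=mean, sigma=std_dev).cdf(x)


p_current = prob_below(18, 13.3472, 3.1272)       # Solution 2
p_smaller_sd = prob_below(18, 13.3472, 2)         # Solution 3: standard deviation cut to 2
p_smaller_mean = prob_below(18, 11.3472, 3.1272)  # Solution 4: mean lowered by 2 seconds

z_current = (18 - 13.3472) / 3.1272

print(round(z_current, 4), round(p_current, 4),
      round(p_smaller_sd, 4), round(p_smaller_mean, 4))
```

Either reducing the variation or reducing the mean raises the Z value for 18 seconds, which is why both changes increase the probability of a download finishing in under 18 seconds.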
Chapter 7
Instructional Tips
This digital case focuses on two concepts: the need for random sampling and the application of the sampling distribution of the mean.
Solutions
1. “For our investigation, members of our group went to their favorite stores …One member thought her box of Oxford’s Pennsylvania Dutch-Style Chocolate Brownie Morning Squares was short, but her son opened the box and starting eating its contents before we could weigh the box...” These comments suggest that a non-random, informal collection procedure was used. When the data are examined, you discover that the sample size is only 5 for each of the two snacks. Drawing a random sample and using a larger sample size would add rigor by reducing the variability in the sample means.
2. (a)
          Oxford Cheez Squares    Alpine Granola Frosted Pretzels
          360.4                   366.1
          361.8                   367.2
          362.3                   365.6
          364.2                   367.8
          371.4                   373.5
Mean      364.02                  368.04

(b) If $\sigma = 15$, then $\sigma_{\bar{X}} = \frac{15}{\sqrt{5}} = 6.7082$, and with an expected population mean of 368 grams,
Normal Probabilities
Box Weight for Oxford Cheez Squares
Mean                    368
Standard Deviation      6.7082
Probability for X <=
X Value                 364.02
Z Value                 –0.593304
P(X<=364.02)            0.2764889

The likelihood of obtaining a sample average weight of no more than 364.02 grams if the population weight is 368 grams is 27.65%.

Normal Probabilities
Box Weights for Alpine Granola Frosted Pretzels
Mean                    368
Standard Deviation      6.7082
Probability for X <=
X Value                 368.04
Z Value                 0.005962851
P(X<=368.04)            0.502378833

The likelihood of obtaining a sample average weight of no more than 368.04 grams if the population weight is 368 grams is 50.24%.

(c)
Normal Probabilities
Box Weight for Oxford Cheez Squares
Mean                    368
Standard Deviation      15
Probability for X <=
X Value                 364.02
Z Value                 –0.265333
P(X<=364.02)            0.3953764

The likelihood of obtaining an individual weight of no more than 364.02 grams if the population weight is 368 grams is 39.54%.

Normal Probabilities
Box Weights for Alpine Granola Frosted Pretzels
Mean                    368
Standard Deviation      15
Probability for X <=
X Value                 368.04
Z Value                 0.002666667
P(X<=368.04)            0.501063851

The likelihood of obtaining an individual weight of no more than 368.04 grams if the population weight is 368 grams is 50.11%.
3. There is a fairly high chance that an individual box of Oxford Cheez Squares or the mean of a sample of five boxes will have a weight below 364.02 grams. There is more than a 50% chance that an individual box of Alpine Granola Frosted Pretzels or the average of a sample of five boxes will have a weight below 368.04 grams. This is true even though four of the five boxes in each sample contain less than 368 grams.
4. Arguments for being reasonable: Statistical procedure used is invalid. The mean of the one group actually exceeds 368. Confusion over conclusions that can be drawn from a sample. Possibility of investigator bias.
Arguments against: Data are available for independent review. Oxford is producing some boxes of snacks that had less cereal than claimed on their boxes. Right of individuals to freely express non-libelous opinions.
5. Even for the Oxford Cheez Squares sample, you cannot prove cheating without using statistical inference. When the techniques of the next two chapters are applied, it will turn out that with these samples, there is insufficient evidence that the population mean is less than 368 grams.
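The contrast between the sampling distribution of the mean (part b) and the distribution of individual boxes (part c) can be sketched in Python with the standard library. The variable names are illustrative; the population mean of 368 grams, the standard deviation of 15 grams, and the sample size of 5 are the values assumed in the solution.

```python
from math import sqrt
from statistics import NormalDist

population_mean = 368.0
population_sd = 15.0
sample_size = 5

# Standard error of the mean for samples of five boxes
standard_error = population_sd / sqrt(sample_size)

# Part (b): probability that a SAMPLE MEAN is at most 364.02 grams
p_sample_mean = NormalDist(population_mean, standard_error).cdf(364.02)

# Part (c): probability that an INDIVIDUAL box weighs at most 364.02 grams
p_individual = NormalDist(population_mean, population_sd).cdf(364.02)

print(round(standard_error, 4), round(p_sample_mean, 4), round(p_individual, 4))
```

Because the standard error is smaller than the population standard deviation, the same weight of 364.02 grams lies further out in the tail for a sample mean than for a single box, so the probability is lower (about 27.65% versus 39.54%).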
Chapter 8
Instructional Tips
This digital case focuses on two concepts: the need to develop confidence interval estimates rather than point estimates, and using statistical methods to determine sample size.
Solutions
1. Using PHStat
Confidence Interval Estimate for the Proportion

                                    Conbanco    Pay a Friend (PAF)
Data
Sample Size                         200         200
Number of Successes                 90          110
Confidence Level                    95%         95%
Intermediate Calculations
Sample Proportion                   0.45        0.55
Z Value                             -1.9600     -1.9600
Standard Error of the Proportion    0.0352      0.0352
Interval Half Width                 0.0689      0.0689
Confidence Interval
Interval Lower Limit                0.3811      0.4811
Interval Upper Limit                0.5189      0.6189

The confidence interval estimate for each of the two groups includes 0.50 or 50%. The proportion in the population using Pay a Friend (PAF) is estimated to be between 48.1% and 61.9%, while the proportion using Conbanco is estimated to be between 38.1% and 51.9%.
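The two proportion intervals can be reproduced with the following Python sketch, which uses the usual normal-approximation interval $p \pm Z\sqrt{p(1-p)/n}$. The function name is an assumption made for illustration, and `statistics.NormalDist` supplies the critical value in place of PHStat.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(successes, sample_size, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / sample_size
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # upper critical value
    half_width = z * sqrt(p * (1 - p) / sample_size)
    return p - half_width, p + half_width


conbanco_low, conbanco_high = proportion_confidence_interval(90, 200)
paf_low, paf_high = proportion_confidence_interval(110, 200)

print(f"Conbanco: {conbanco_low:.4f} to {conbanco_high:.4f}")
print(f"PAF:      {paf_low:.4f} to {paf_high:.4f}")
```

PHStat reports the Z value as -1.9600 (the lower critical value); the sketch uses the positive value, which gives the same half width.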
2. Using PHStat
Confidence Interval Estimate for the Mean

                               Conbanco       PAF
Data
Sample Standard Deviation      10.70568735    7.479171836
Sample Mean                    32.39          25.25
Sample Size                    90             110
Confidence Level               95%            95%
Intermediate Calculations
Standard Error of the Mean     1.128478532    0.713111054
Degrees of Freedom             89             109
t Value                        1.9870         1.9820
Interval Half Width            2.2423         1.4134
Confidence Interval
Interval Lower Limit           30.15          23.84
Interval Upper Limit           34.63          26.66
The 95% confidence interval estimate for the mean payment amount is $23.84 to $26.66 for Pay a Friend and $30.15 to $34.63 for Conbanco. Since the confidence intervals for the proportion for both Pay a Friend and Conbanco include 0.50 or 50%, there is no evidence that customers use the two forms of payment in unequal numbers. The two confidence intervals for the mean do not overlap, which suggests a difference in the mean purchase amounts for the two forms of payment. However, these data are useful in pointing out that, when comparing the means of two groups, the confidence interval estimates for each group should not be compared. In fact, the correct procedure is to use the t test for the difference between the means and the confidence interval estimate for the difference between two means (to be covered in Chapter 10). The results of this test indicate a significant difference in the mean purchase amount between the two forms of payment.
2. cont.
Pooled-Variance t Test for the Difference Between Two Means
(assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
Sample Size                               110
Sample Mean                               25.25154545
Sample Standard Deviation                 7.479171836
Population 2 Sample
Sample Size                               90
Sample Mean                               32.38766667
Sample Standard Deviation                 10.70568735
Intermediate Calculations
Population 1 Sample Degrees of Freedom    109
Population 2 Sample Degrees of Freedom    89
Total Degrees of Freedom                  198
Pooled Variance                           82.3116
Standard Error                            1.2895
Difference in Sample Means                -7.1361
t Test Statistic                          -5.5339
Two-Tail Test
Lower Critical Value                      -1.9720
Upper Critical Value                      1.9720

3. Using the range of the data divided by 6 as an estimate of the population standard deviation, [(56.84 – 3.32)/6] = 8.92, the sample size necessary for 95% confidence with a sampling error of ± $3 is 34. Thus, a sample size of 200 is large enough.

Sample Size Determination
Data
Population Standard Deviation    8.92
Sampling Error                   3
Confidence Level                 95%
Intermediate Calculations
Z Value                          -1.9600
Calculated Sample Size           33.9612
Result
Sample Size Needed               34.0000
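The sample size calculation in Solution 3 follows $n = (Z\sigma/e)^2$, rounded up to the next whole number. A minimal Python sketch is shown below; the function name is illustrative, and the standard deviation estimate is the range divided by 6, as in the solution.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(std_dev, sampling_error, confidence=0.95):
    """Unrounded sample size needed to estimate a mean within +/- sampling_error."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return (z * std_dev / sampling_error) ** 2


# Estimate sigma as range / 6, using the largest and smallest payment amounts
sigma_estimate = (56.84 - 3.32) / 6

calculated_n = sample_size_for_mean(sigma_estimate, sampling_error=3)
needed_n = ceil(calculated_n)  # always round up so the error bound is met

print(round(sigma_estimate, 2), round(calculated_n, 4), needed_n)
```

Rounding up rather than to the nearest integer guarantees the stated sampling error is not exceeded.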
Solutions to End-of-Section and Chapter Review Problems cxcv Chapter 9
Instructional Tips There are several objectives involved in this digital case. 1. Have students question the validity of data collected. 2. Have students looking for hidden issues that could invalidate a set of conclusions. 3. Have students use hypothesis testing to draw conclusions about a claimed value. 4. Increase students’ understanding of the effect of sampling on a conclusion. Solutions 1. Issues that could be raised about the testing process – the size of the sample, how the sample was selected, the selection of only two brands of cereals, the identity of the independent testers (not disclosed), whether, as discussed in subsequent chapters, there is a single sample or in fact, samples of two different snacks. Also, if you read all of the materials related to the television station, you could raise issues about the independence of the consumer reporter and wonder why only one out of four plants was chosen for this analysis. 2. t Test for Hypothesis of the Mean
Data Null Hypothesis
=
Level of Significance
368 0.05
Sample Size
80
Sample Mean
370.433375
Sample Standard Deviation
14.70776355
Intermediate Calculations Standard Error of the Mean
1.644377955
Degrees of Freedom
79
t Test Statistic
1.479814901
Lower-Tail Test Lower Critical Value
–1.664370757
p-Value
0.928550208
Do not reject the null hypothesis
Copyright ©2024 Pearson Education, Inc.
3.
The mean weight is actually above the hypothesized weight of 368 grams by 1.48 standard deviation units. Clearly, with a p-value of 0.929, there is no reason to believe that the mean weight is below 368 grams. However, as noted in the press release, samples of two different snacks were selected, so the question can be raised as to whether separate analyses should have been done on each snack. The claim is true since 42 boxes contain more than 368 grams. However, if the mean were equal to 368, you would expect that approximately half of the boxes would contain more than 368 grams, so the result is certainly not surprising. Of course, the Oxford CEO does not mention that 38 boxes contained less than 368 grams.
4. Sample statistics will vary from sample to sample. It is possible that a sample with a mean below 368 grams and a sample with a mean above 368 grams will both lead to the conclusion that there is insufficient evidence that the population mean is below 368 grams. In fact, if you use the CCACC sample of 10 snack boxes discussed in Chapter 7, the results of the test for whether the population mean is below 368 are not significant.

t Test for Hypothesis of the Mean

Data
Null Hypothesis μ =            368
Level of Significance          0.05
Sample Size                    10
Sample Mean                    366.03
Sample Standard Deviation      4.165746565

Intermediate Calculations
Standard Error of the Mean     1.31732473
Degrees of Freedom             9
t Test Statistic               -1.495455111

Lower-Tail Test
Lower Critical Value           -1.833113856
p-Value                        0.08450497
Do not reject the null hypothesis
Chapter 10
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand that just having sample statistics does not mean that claims can be made about differences between groups without using hypothesis testing.
2. Use two-sample tests of hypothesis to determine whether there are significant differences between two groups.
Solutions
1. Although the means of the two samples are different, without the necessary tests of hypothesis, you cannot infer that the two processes are statistically different. This, of course, assumes that CCACC has drawn random samples, something that is unclear in their posting.
2. Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference        0
Level of Significance          0.05

Population 1 Sample
Sample Size                    10
Sample Mean                    372.485
Sample Standard Deviation      13.45716104

Population 2 Sample
Sample Size                    10
Sample Mean                    365.549
Sample Standard Deviation      10.07565432

Intermediate Calculations
Population 1 Sample Degrees of Freedom   9
Population 2 Sample Degrees of Freedom   9
Total Degrees of Freedom                 18
Pooled Variance                          141.3070
Standard Error                           5.3161
Difference in Sample Means               6.9360
t Test Statistic                         1.3047

Two-Tail Test
Lower Critical Value           -2.1009
Upper Critical Value           2.1009

Upper-Tail Test
Upper Critical Value           1.7341
p-Value                        0.1042
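The pooled variance, standard error, and t statistic shown above follow directly from the two sets of sample statistics. As a minimal sketch (the function name `pooled_t` is hypothetical and this is not the PHStat code), the arithmetic is:

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled-variance t statistic for H0: mu1 - mu2 = 0, from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, se, (mean1 - mean2) / se, df

pooled_var, se, t_stat, df = pooled_t(372.485, 13.45716104, 10,
                                      365.549, 10.07565432, 10)
print(round(pooled_var, 4), round(se, 4), round(t_stat, 4), df)
```

The same function applies to any equal-variance comparison reported as means, standard deviations, and sample sizes.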
2. cont.
For Plant 1 and Plant 2
F Test for Differences in Two Variances

Data
Level of Significance          0.05

Larger-Variance Sample
Sample Size                    10
Sample Variance                181.0951833

Smaller-Variance Sample
Sample Size                    10
Sample Variance                101.51881

Intermediate Calculations
F Test Statistic                         1.7839
Population 1 Sample Degrees of Freedom   9
Population 2 Sample Degrees of Freedom   9

Upper-Tail Test
Upper Critical Value           3.1789
p-Value                        0.2008

Two-Tail Test
Upper Critical Value           4.0260
p-Value                        0.4016
2. cont.
For Plant 1 and Plant 2
Wilcoxon Rank Sum Test

Data
Level of Significance          0.05

Population 1 Sample
Sample Size                    10
Sum of Ranks                   120

Population 2 Sample
Sample Size                    10
Sum of Ranks                   90

Intermediate Calculations
Total Sample Size n            20
T1 Test Statistic              120
T1 Mean                        105
Standard Error of T1           13.2288
Z Test Statistic               1.1338934

Upper-Tail Test
Upper Critical Value           1.6449
p-Value                        0.1284
Do not reject the null hypothesis
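The large-sample normal approximation behind the Wilcoxon output above can be checked by hand. This is a minimal sketch using only the rank sum and sample sizes from the output; the function name `wilcoxon_rank_sum_z` is hypothetical, and the upper-tail p-value comes from the standard normal distribution via `math.erfc`.

```python
import math

def wilcoxon_rank_sum_z(t1, n1, n2):
    """Normal approximation for the Wilcoxon rank sum statistic T1."""
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2
    se_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (t1 - mean_t1) / se_t1
    # Upper-tail area of the standard normal distribution.
    upper_p = 0.5 * math.erfc(z / math.sqrt(2))
    return mean_t1, se_t1, z, upper_p

mean_t1, se_t1, z, upper_p = wilcoxon_rank_sum_z(120, 10, 10)
print(mean_t1, round(se_t1, 4), round(z, 4), round(upper_p, 4))
```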
The t test for the difference between the means indicates a test statistic of tSTAT = 1.30 and a one-tail p-value of 0.104. The F test for the equality of variances indicates a test statistic FSTAT = 1.784 and a two-tail p-value of 0.40. The Wilcoxon rank sum test (covered in Section 12.6) indicates a test statistic of ZSTAT = 1.134 and a one-tail p-value of 0.128. Thus, there is insufficient statistical evidence to indicate any difference in the mean, median, or variability between Plant 1 and Plant 2.
Chapter 11
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand that just having sample statistics does not mean that claims can be made about differences between groups without using hypothesis testing.
2. Use the one-factor Analysis of Variance to determine whether there are significant differences among more than two groups.
3. See that anomalies can occur when analyzing data, in which one analysis can lead to a certain conclusion and a different analysis might lead to another conclusion.
Solutions
1. Yes, because Oxford Snacks operates four plants, a careful examination would explore if there are differences among the four plants. A proper sample of the population of snack boxes would include boxes from all four plants. In addition, as in an earlier case, it is unclear if the CCACC sample is randomly drawn from all snack boxes available. From their posting, it seems as if their members actively excluded boxes from plants other than #1 and #2.
2. In order to determine whether there is a difference in the weights among the four plants, a one-factor analysis of variance needs to be done.

Anova: Single Factor

SUMMARY
Groups     Count   Sum       Average    Variance
Plant 1    20      7448      372.4      132.1037
Plant 2    20      7324.07   366.2035   218.1177
Plant 3    20      7393.12   369.656    222.0002
Plant 4    20      7531.72   376.586    131.1284

ANOVA
Source of Variation   SS         df   MS         F         P-value    F crit
Between Groups        1155.949   3    385.3162   2.19132   0.095938   2.724946
Within Groups         13363.65   76   175.8375
Total                 14519.6    79
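The ANOVA table can be rebuilt from the group counts, averages, and variances in the SUMMARY block, which is a useful check when only summary output is available. The sketch below is illustrative only; the function name `one_way_anova` is hypothetical.

```python
def one_way_anova(counts, means, variances):
    """Return (SSA, SSW, F) for a one-factor ANOVA from group summary statistics."""
    n = sum(counts)
    c = len(counts)
    grand_mean = sum(k * m for k, m in zip(counts, means)) / n
    # Between-group variation: spread of group means around the grand mean.
    ssa = sum(k * (m - grand_mean) ** 2 for k, m in zip(counts, means))
    # Within-group variation: pooled from the group variances.
    ssw = sum((k - 1) * v for k, v in zip(counts, variances))
    f_stat = (ssa / (c - 1)) / (ssw / (n - c))
    return ssa, ssw, f_stat

counts = [20, 20, 20, 20]
means = [372.4, 366.2035, 369.656, 376.586]
variances = [132.1037, 218.1177, 222.0002, 131.1284]
ssa, ssw, f_stat = one_way_anova(counts, means, variances)
print(round(ssa, 3), round(ssw, 2), round(f_stat, 4))
```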
Kruskal-Wallis Test of Snack Box Weights

Data
Level of Significance                 0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size      134728.4
Sum of Sample Sizes                   80
Number of groups                      4
H Test Statistic                      6.496991

Test Result
Critical Value                        7.814725
p-Value                               0.089781
Do not reject the null hypothesis
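The H statistic above follows from the intermediate quantities in the output, and with 4 groups its reference distribution is chi-square with 3 degrees of freedom. The sketch below is an illustration under those assumptions: both function names are hypothetical, and the closed-form upper-tail probability used here holds only for 3 degrees of freedom. Because the rank total in the output is rounded, the result agrees with the reported H only to about three decimals.

```python
import math

def kruskal_wallis_h(sum_sq_ranks_over_n, n):
    """H statistic from the sum over groups of (rank sum squared / group size)."""
    return 12 / (n * (n + 1)) * sum_sq_ranks_over_n - 3 * (n + 1)

def chi2_upper_tail_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

h_stat = kruskal_wallis_h(134728.4, 80)
p_value = chi2_upper_tail_3df(h_stat)
print(round(h_stat, 3), round(p_value, 4))
```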
2. cont. The ANOVA results, with an FSTAT test statistic equal to 2.19 < 2.72 or a p-value = 0.0959 > 0.05, indicate that there is insufficient evidence to conclude that there is a difference in the means of the four plants. The nonparametric Kruskal-Wallis test (covered in Section 12.5) provides similar results, with a χ²STAT test statistic = 6.497 < 7.815 or a p-value = 0.0898 > 0.05. Interestingly, had CCACC argued that something was amiss only in Plant 2, but not in Plants 1, 3, and 4, there is some evidence that this is the case. Using an a priori research hypothesis that focused on testing differences between plants 1, 3, and 4 as compared to plant 2, the following results are obtained.

t-Test: Two-Sample Assuming Equal Variances
                               Plant 1, 3, 4   Plant 2
Mean                           372.880667      366.2035
Variance                       164.518528      218.1177
Observations                   60              20
Pooled Variance                177.574747
Hypothesized Mean Difference   0
df                             78
t Stat                         1.94065012
P(T<=t) one-tail               0.02795698
t Critical one-tail            1.66462542
P(T<=t) two-tail               0.05591397
t Critical two-tail            1.99084752

Since tSTAT = 1.94 > 1.664 or the p-value = 0.028 < 0.05, there is evidence that the mean weight of snack boxes in plants 1, 3, and 4 is greater than the mean weight in plant 2.
3. The one-way ANOVA shows that the null hypothesis cannot be rejected, so you cannot claim a statistical difference among the four plants. The mean weight of the 80 boxes in the sample is 371.2 grams, consistent with a claim that boxes average 368 grams. Interestingly, an analysis that pits Plant #2 against the other plants indicates that a statistically significant difference does occur. There may be something different happening in Plant #2, after all. That said, if the source of snack boxes for sale were randomly distributed, consumers would, over time, be unlikely to be "cheated." Quantifiable claims must be substantiated by the proper statistical analysis. While the CCACC may, in fact, have at least one valid point, the group cannot offer any legitimate evidence to support their claims. So, at least at this point, you should not testify on the group’s behalf.
Chapter 12
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Understand the difference between the results from a one-way table and a two-way contingency table.
2. Be able to use the chi-square test to determine whether a relationship exists between two categorical variables.
3. Be able to see the importance of examining differences between groups in their response to a categorical variable.
Solutions
1. They are literally true since 181 of the respondents prefer the Sun Low Concierge Class program as compared to 119 who prefer the T.C. Resorts TCRewards Plus program. However, since the program is described as aimed at business travelers, other interpretations of the data can be made.
2. By examining the preferences of business travelers, the target for the program, especially those business travelers who use travel programs, or by examining the resort last visited by type of traveler.
3. Program Preference by Travel Program

Observed Frequencies
                      Program Preference
Uses Travel Program   TCRewards Plus   Concierge Class   Total
Yes                   55               20                75
No                    64               161               225
Total                 119              181               300

Expected Frequencies
                      Program Preference
Uses Travel Program   TCRewards Plus   Concierge Class   Total
Yes                   29.75            45.25             75
No                    89.25            135.75            225
Total                 119              181               300

Data
Level of Significance          0.05
Number of Rows                 2
Number of Columns              2
Degrees of Freedom             1

Results
Critical Value                 3.841455338
Chi-Square Test Statistic      47.3606017
p-Value                        5.90578E-12
Reject the null hypothesis
Expected frequency assumption is met.
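The expected frequencies and the chi-square statistic above can be recomputed from the observed table. The following minimal sketch assumes a 2 x 2 table, so the p-value uses the closed form for 1 degree of freedom; the function name `chi_square_2x2` is hypothetical.

```python
import math

def chi_square_2x2(observed):
    """Expected counts, chi-square statistic, and p-value (1 df) for a 2 x 2 table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected count = row total * column total / overall total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    p_value = math.erfc(math.sqrt(stat / 2))  # valid only for 1 degree of freedom
    return expected, stat, p_value

# Rows: uses a travel program (Yes, No); columns: TCRewards Plus, Concierge Class.
expected, stat, p_value = chi_square_2x2([[55, 20], [64, 161]])
print(expected, round(stat, 4), p_value)
```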
3. cont. There is a significant difference in preference for TCRewards Plus versus Concierge Class based on whether the respondent uses a travel rewards program (χ²STAT = 47.361 > 3.841, p-value = 0.000 < 0.05). Those who use travel rewards programs clearly prefer TCRewards Plus (73.3%) over Concierge Class, while those who do not use travel rewards programs prefer Concierge Class (71.6%).

Program Preference by Customer Type

Observed Frequencies
                Program Preference
Customer Type   TCRewards Plus   Concierge Class   Total
Business        34               16                50
Leisure         85               165               250
Total           119              181               300

Expected Frequencies
                Program Preference
Customer Type   TCRewards Plus   Concierge Class   Total
Business        19.83333333      30.16666667       50
Leisure         99.16666667      150.8333333       250
Total           119              181               300

Data
Level of Significance          0.05
Number of Rows                 2
Number of Columns              2
Degrees of Freedom             1

Results
Critical Value                 3.841455338
Chi-Square Test Statistic      20.12628256
p-Value                        7.24936E-06
Reject the null hypothesis
Expected frequency assumption is met.

There is a significant difference in preference for TCRewards Plus versus Concierge Class based on whether the respondent is a business or leisure traveler (χ²STAT = 20.126 > 3.841, p-value = 0.000 < 0.05). Business travelers clearly prefer TCRewards Plus (68%) over Concierge Class, while leisure travelers prefer Concierge Class (66%). Further analysis indicates that of 41 business travelers who use travel reward programs, 31 prefer TCRewards Plus. Of 34 leisure travelers who use travel reward programs, 24 prefer TCRewards Plus. Thus, it is reasonable to conclude that TCRewards Plus is preferred by the target audience of business travelers and also by those who use travel reward programs.
4. Among other factors that might be included in future surveys are whether the travel program influences the choice of accommodation, what attributes of a resort chain are desirable for business travelers, and the reasons for the attractiveness of Concierge Class for leisure travelers.
Chapter 13
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Perform a simple linear regression analysis to determine the usefulness of an independent variable in predicting a dependent variable.
2. Understand the danger in making predictions that extrapolate beyond the range of the independent variable.
Solutions
1. Regression Analysis

Regression Statistics
Multiple R                     0.698234618
R Square                       0.487531581
Adjusted R Square              0.44482588
Standard Error                 2.234863491
Observations                   14

ANOVA
             df   SS            MS            F             Significance F
Regression   1    57.01890785   57.01890785   11.41607709   0.005480622
Residual     12   59.93537787   4.994614822
Total        13   116.9542857

                                  Coefficients   Standard Error   t Stat         P-value
Intercept                         -1.941218839   2.379988792      -0.815642009   0.430597414
Average Disposable Income($000)   0.192948059    0.05710603       3.378768576    0.005480622
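The regression statistics above are tied together by the sums of squares in the ANOVA block. As a minimal sketch (the function name `regression_summary` is hypothetical), r2, adjusted r2, the F statistic, and the standard error of the estimate can all be recovered from SSR and SSE:

```python
import math

def regression_summary(ssr, sse, n, k):
    """Return (r2, adjusted r2, F, standard error) for a model with k predictors."""
    sst = ssr + sse
    r2 = ssr / sst
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    mse = sse / (n - k - 1)
    f_stat = (ssr / k) / mse
    return r2, adj_r2, f_stat, math.sqrt(mse)

# SSR and SSE from the ANOVA block above; 14 stores, 1 independent variable.
r2, adj_r2, f_stat, std_err = regression_summary(57.01890785, 59.93537787, 14, 1)
print(round(r2, 4), round(adj_r2, 4), round(f_stat, 4), round(std_err, 4))
```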
Yes, there is a correlation between the variables, but not a very strong one, given the r2 value of only 0.49. The sales projection claim should be discarded as Triangle is attempting to extrapolate sales outside the range of the X values. This raises a related point: Sunflowers clearly has not done business in areas of "exceptional affluence," so there is no track record on which to base a decision to accept or reject Triangle’s proposal.
2. No, because the r2 value of mean disposable income with sales is only 0.49 as compared to an r2 value of 0.904 for store size. In fact, a multiple regression analysis reveals that given that store size is included in the regression model, adding mean disposable income does not significantly improve the model.
3. Yes, given the r2 value of only 0.49, it is less significant than other single factors such as store size.
4. However, opening a new retail location would be based on a number of factors (some of these factors, such as competitive retail analysis, demographic and geographic profiles, regional economic analysis, and sales potential forecast analysis, are actually mentioned by Triangle in its proposal). The Sunflowers brand perception and merchandise mix would be important as well. For example, a store selling hip junior swimsuit fashions would not do well in a community of senior citizens in wintry Minnesota. The financial health of the Sunflowers chain would be another factor, since many retail chains have gone out of business due to unwise overexpansion.
Chapter 14
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Evaluate the contribution of dummy variables to a multiple regression model.
2. Determine whether an interaction term needs to be included in a regression model that has a dummy variable.
Solutions
1. Multiple Regression Analysis: Sales vs Price, Promotion, Location, Digital Coupon (Location: 0 = Food, 1 = New Arrivals, Digital Coupon: 0 = No, 1 = Yes)

Regression Statistics
Multiple R                     0.9370
R Square                       0.8780
Adjusted R Square              0.8623
Standard Error                 454.2762
Observations                   36

ANOVA
             df   SS              MS              F         Significance F
Regression   4    46061222.6158   11515305.6539   55.8002   0.0000
Residual     31   6397374.1342    206366.9076
Total        35   52458596.7500

                 Coefficients   Standard Error   t Stat     P-value
Intercept        14910.2954     1159.9239        12.8545    0.0000
Price            -3460.5908     310.6291         -11.1406   0.0000
Promotion        6.3938         0.9835           6.5010     0.0000
Location         -843.0954      155.0591         -5.4373    0.0000
Digital Coupon   149.3491       158.0277         0.9451     0.3519
The presence of Digital Coupon does not make a significant contribution to the multiple regression model since the p-value = 0.3519 > 0.05. Therefore, it should be eliminated from consideration in the model.
1. cont.
Multiple Regression Analysis: Sales vs Price, Promotion, Location (0 = Food, 1 = New Arrivals)

Regression Statistics
Multiple R                     0.9352
R Square                       0.8745
Adjusted R Square              0.8628
Standard Error                 453.5174
Observations                   36

ANOVA
             df   SS              MS              F         Significance F
Regression   3    45876899.8186   15292299.9395   74.3507   0.0000
Residual     32   6581696.9314    205678.0291
Total        35   52458596.7500

            Coefficients   Standard Error   t Stat     P-value
Intercept   14962.5323     1156.6709        12.9359    0.0000
Price       -3439.7472     309.3276         -11.1201   0.0000
Promotion   6.1440         0.9457           6.4964     0.0000
Location    -843.8204      154.7982         -5.4511    0.0000

Multiple Regression Analysis with Interaction Terms: Sales vs Price, Promotion, Location, Price*Location, Promotion*Location

Regression Statistics
Multiple R                     0.9443
R Square                       0.8916
Adjusted R Square              0.8736
Standard Error                 435.2978
Observations                   36

ANOVA
             df   SS              MS             F
Regression   5    46774071.3700   9354814.2740   49.3699
Residual     30   5684525.3800    189484.1793
Total        35   52458596.7500

                     Coefficients   Standard Error   t Stat    P-value
Intercept            14455.9223     1490.7939        9.6968    0.0000
Price                -3197.0210     405.0027         -7.8938   0.0000
Promotion            4.3670         1.2364           3.5320    0.0014
Location             -60.0741       2239.5882        -0.0268   0.9788
Price*Location       -416.0789      597.9686         -0.6958   0.4919
Promotion*Location   3.7702         1.8287           2.0616    0.0480
For the interaction term, Price*Location, tSTAT = –0.6958 with a p-value of 0.4919. Because the p-value > 0.05, do not reject H0. There is not sufficient evidence that the interaction term makes a significant contribution.
1. cont.
Multiple Regression Analysis with Interaction Term: Sales vs Price, Promotion, Location, Promotion*Location

Regression Statistics
Multiple R                     0.9433
R Square                       0.8899
Adjusted R Square              0.8757
Standard Error                 431.6610
Observations                   36

ANOVA
             df   SS              MS              F
Regression   4    46682329.5186   11670582.3797   62.6335
Residual     31   5776267.2314    186331.2010
Total        35   52458596.7500

                     Coefficients   Standard Error   t Stat     P-value
Intercept            15145.4672     1104.4378        13.7133    0.0000
Price                -3387.8897     295.4748         -11.4659   0.0000
Promotion            4.4205         1.2237           3.6123     0.0011
Location             -1594.2287     389.8476         -4.0894    0.0003
Promotion*Location   3.7703         1.8135           2.0791     0.0460
Thus, there is a significant effect of location on sales with location having a positive effect on sales. However, the effect of the location is not the same across different levels of promotion with a slight decrease in its effect with increasing levels of promotion expenses. In addition, there is no evidence of any patterns in the residual plots.
2. You would recommend using the location but not using digital coupons.
3. Actual sales by linear display feet (the linear size of the product stock area), the number of QV digital coupons used per store, the number of stores using digital coupons, and the amount or existence of special in-store signage or advertising panels.
Chapter 15
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to determine which one of a set of competing claims concerning regression results is correct.
2. Use a model-building approach to determine the best-fitting model.
3. Evaluate the contribution of dummy variables to a multiple regression model.
4. Determine whether an interaction term needs to be included in a regression model that has a dummy variable.
5. Use the coefficient of partial determination to evaluate the importance of each independent variable.
Solutions
1. Regression Analysis: Sales vs Price, Promotion, Location, Digital Coupon with VIF (Location: 0 = Food, 1 = New Arrivals, Digital Coupon: 0 = No, 1 = Yes)

Regression Statistics
Multiple R                     0.9370
R Square                       0.8780
Adjusted R Square              0.8623
Standard Error                 454.2762
Observations                   36

ANOVA
             df   SS              MS              F
Regression   4    46061222.6158   11515305.6539   55.8002
Residual     31   6397374.1342    206366.9076
Total        35   52458596.7500

                 Coefficients   Standard Error   t Stat     P-value   VIF
Intercept        14910.2954     1159.9239        12.8545    0.0000
Price            -3460.5908     310.6291         -11.1406   0.0000    1.0099
Promotion        6.3938         0.9835           6.5010     0.0000    1.1250
Location         -843.0954      155.0591         -5.4373    0.0000    1.0486
Digital Coupon   149.3491       158.0277         0.9451     0.3519    1.0857
Multicollinearity is not an issue (all of the VIFs are small and less than 2). Digital coupon is not significant with p-value = 0.3519 > 0.05.
1. cont. Best Subsets Regression Analysis: Sales vs Price (X1), Promotion (X2), Location (X3), Digital Coupon (X4)

Best Subsets Analysis

Intermediate Calculations
R2T        0.878049
1 - R2T    0.121951
n          36
T          5
n-T        31

Model      Cp         k+1   R Square   Adj. R Square   Std. Error
X1         89.7763    2     0.5209     0.5069          859.7297
X2         161.9328   2     0.2371     0.2146          1084.9413
X3         163.1847   2     0.2322     0.2096          1088.4374
X4         218.3295   2     0.0152     -0.0137         1232.6409
X1X2       31.5085    3     0.7580     0.7434          620.1982
X1X3       43.9560    3     0.7091     0.6914          680.0641
X1X4       90.3695    3     0.5265     0.4978          867.6035
X2X3       125.1366   3     0.3897     0.3527          984.9635
X2X4       163.9090   3     0.2372     0.1910          1101.1893
X3X4       162.8055   3     0.2415     0.1956          1098.0517
X1X2X3     3.8932     4     0.8745     0.8628          453.5174
X1X2X4     32.5637    4     0.7617     0.7394          624.9586
X1X3X4     45.2625    4     0.7118     0.6848          687.3624
X2X3X4     127.1127   4     0.3898     0.3326          1000.1582
X1X2X3X4   5.0000     5     0.8780     0.8623          454.2762
Based on a best subsets regression and examination of the resulting Cp values, the best model appears to be a model with variables X1, X2 and X3, which has Cp = 3.8932. Models that add other variables do not change the results very much.
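The Cp values in the best subsets table follow from the R Square of each candidate model and the R Square of the full model. As a minimal sketch (the function name `mallows_cp` is hypothetical), the statistic for a model with k independent variables, where T is the number of parameters in the full model, can be computed as follows. The R Square for the X1X2X3 model is taken here as SSR/SST from the ANOVA output for that model, an assumption made to avoid rounding error.

```python
def mallows_cp(r2_k, r2_full, n, k, total_params):
    """Cp = (1 - R2_k)(n - T) / (1 - R2_T) - (n - 2(k + 1))."""
    return (1 - r2_k) * (n - total_params) / (1 - r2_full) - (n - 2 * (k + 1))

r2_full = 0.878049                          # R2T from the intermediate calculations
r2_x1x2x3 = 45876899.8186 / 52458596.7500   # SSR / SST for the X1, X2, X3 model

cp_x1x2x3 = mallows_cp(r2_x1x2x3, r2_full, n=36, k=3, total_params=5)
cp_full = mallows_cp(r2_full, r2_full, n=36, k=4, total_params=5)
print(round(cp_x1x2x3, 4), round(cp_full, 4))
```

For the full model, Cp always equals k + 1, which is why the X1X2X3X4 row shows exactly 5.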
1. cont. Stepwise Regression: Sales vs Price (X1), Promotion (X2), Location (X3), Digital Coupon (X4)

Stepwise Regression Analysis
Table of Results for General Stepwise

Price entered.
             df   SS              MS              F
Regression   1    27328004.1667   27328004.1667   36.9729
Residual     34   25130592.5833   739135.0760
Total        35   52458596.7500

            Coefficients   Standard Error   t Stat    P-value
Intercept   16201.8750     2163.2971        7.4894    0.0000
Price       -3556.9444     584.9719         -6.0805   0.0000

Promotion entered.
             df   SS              MS              F
Regression   2    39765284.5417   19882642.2708   51.6908
Residual     33   12693312.2083   384645.8245
Total        35   52458596.7500

            Coefficients   Standard Error   t Stat    P-value
Intercept   14762.1250     1580.9818        9.3373    0.0000
Price       -3556.9444     421.9914         -8.4289   0.0000
Promotion   7.1988         1.2660           5.6863    0.0000

Location entered.
             df   SS              MS              F
Regression   3    45876899.8186   15292299.9395   74.3507
Residual     32   6581696.9314    205678.0291
Total        35   52458596.7500

            Coefficients   Standard Error   t Stat     P-value
Intercept   14962.5323     1156.6709        12.9359    0.0000
Price       -3439.7472     309.3276         -11.1201   0.0000
Promotion   6.1440         0.9457           6.4964     0.0000
Location    -843.8204      154.7982         -5.4511    0.0000
Based on a stepwise regression analysis with all the original variables, only X1, X2, and X3 make a significant contribution to the model at the 0.05 level. Thus, the best model includes price (X1), promotion (X2), and location (X3). It appears that the predicted increase in sales from the Promotion is approximately $6 for each promotion.
1. cont. If only one independent variable could be used, the coefficient of partial determination would be helpful in determining which independent variable explained the most variation in sales, holding constant the effect of the other independent variables.

Coefficients of Partial Determination
r2 Y1.234    0.800145312
r2 Y2.134    0.576863749
r2 Y3.124    0.488142201
r2 Y4.123    0.028005361
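Each coefficient of partial determination listed above compares the full four-variable model with the three-variable model that omits the variable in question. The sketch below uses an algebraically equivalent form based on R Square values; the function name `partial_r2` is hypothetical, and because the inputs are the rounded R Square values from the best subsets table, the results match the reported coefficients only to about three decimals.

```python
def partial_r2(r2_full, r2_without):
    """Share of the variation left unexplained by the other variables
    that is explained by adding the omitted variable."""
    return (r2_full - r2_without) / (1 - r2_without)

r2_full = 0.878049
# R Square of each three-variable model that leaves one variable out.
r2_without = {
    "Price (X1)": 0.3898,           # model X2X3X4
    "Promotion (X2)": 0.7118,       # model X1X3X4
    "Location (X3)": 0.7617,        # model X1X2X4
    "Digital Coupon (X4)": 0.8745,  # model X1X2X3
}
partials = {name: partial_r2(r2_full, r2) for name, r2 in r2_without.items()}
for name, value in partials.items():
    print(name, round(value, 3))
```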
Price has the highest coefficient of partial determination, followed by promotion expenses, shelf location, and digital coupon. Another approach would be to perform a cost-benefit analysis on each variable and use the results as a basis for selection.
2. Stepwise and best subsets models both suggest that a model of Sales vs Price, Promotion, Location is best.
3. Sales vs Price, Promotion, Location Regression Analysis
Regression Statistics
Multiple R                     0.9352
R Square                       0.8745
Adjusted R Square              0.8628
Standard Error                 453.5174
Observations                   36

ANOVA
             df   SS              MS              F
Regression   3    45876899.8186   15292299.9395   74.3507
Residual     32   6581696.9314    205678.0291
Total        35   52458596.7500

            Coefficients   Standard Error   t Stat     P-value
Intercept   14962.5323     1156.6709        12.9359    0.0000
Price       -3439.7472     309.3276         -11.1201   0.0000
Promotion   6.1440         0.9457           6.4964     0.0000
Location    -843.8204      154.7982         -5.4511    0.0000
The best linear model is: Ŷ = 14,962.53 – 3,439.7472 Price + 6.144 Promotion – 843.82 Location, with r2 = 0.8745.
3. cont. Residual Analysis (residual plots not reproduced here)
4. Deborah Clair stated "the most striking thing is the only store with over 4000 unit sales and only 100 in promotional dollars was a store with in-store digital coupons." From the analysis, it is clear that in-store digital coupons do not have a significant effect on sales.
Chapter 16
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to develop a time-series forecasting model using quarterly data.
2. Be able to compare the results of two forecasts and plot the raw time-series data on a graph.
3. Interpret the results of the time-series forecasting model including the compound growth rate and the seasonal multiplier.
Solutions
1. Oxford Glen Remodeling: Regression Analysis: log(OG) vs Coded Quarter, Q1, Q2, Q3

Regression Statistics
Multiple R                     0.9964
R Square                       0.9927
Adjusted R Square              0.9901
Standard Error                 0.0016
Observations                   16

ANOVA
             df   SS       MS       F
Regression   4    0.0039   0.0010   376.4251
Residual     11   0.0000   0.0000
Total        15   0.0039

                Coefficients   Standard Error   t Stat      P-value
Intercept       4.9983         0.0011           4378.0460   0.0000
Coded Quarter   0.0030         0.0001           33.5374     0.0000
Q1              -0.0126        0.0012           -10.7820    0.0000
Q2              -0.0089        0.0012           -7.7671     0.0000
Q3              -0.0101        0.0011           -8.8098     0.0000
The regression model for Oxford Glen Remodeling is
Log(revenue) = 4.9983 + 0.0030 Coded Quarter – 0.0126 Quarter 1 – 0.0089 Quarter 2 – 0.0101 Quarter 3
$\log_{10}\hat{\beta}_1 = 0.0030$; $\hat{\beta}_1 = 10^{0.0030} = 1.00697$, then $(\hat{\beta}_1 - 1)100\% = 0.697\%$
$\log_{10}\hat{\beta}_2 = -0.0126$; $\hat{\beta}_2 = 10^{-0.0126} = 0.9714$, then $(\hat{\beta}_2 - 1)100\% = -2.862\%$
$\log_{10}\hat{\beta}_3 = -0.0089$; $\hat{\beta}_3 = 10^{-0.0089} = 0.9796$, then $(\hat{\beta}_3 - 1)100\% = -2.040\%$
$\log_{10}\hat{\beta}_4 = -0.0101$; $\hat{\beta}_4 = 10^{-0.0101} = 0.9771$, then $(\hat{\beta}_4 - 1)100\% = -2.289\%$
The interpretation of the slopes is as follows: The estimated quarterly compound growth rate in revenue is 0.697%. 0.9714 is the seasonal multiplier for the first quarter as compared to the fourth quarter. Sales are 2.86% lower for the first quarter as compared to the fourth quarter. 0.9796 is the seasonal multiplier for the second quarter as compared to the fourth quarter. Sales are 2.04% lower for the second quarter as compared to the fourth quarter. 0.9771 is the seasonal multiplier for the third quarter as compared to the fourth quarter. Sales are 2.29% lower for the third quarter as compared to the fourth quarter.

Sycamore Homes Remodelers: Regression Analysis: log(OG) vs Coded Quarter, Q1, Q2, Q3

Regression Statistics
Multiple R                     0.9724
R Square                       0.9456
Adjusted R Square              0.9259
Standard Error                 0.0178
Observations                   16

ANOVA
             df   SS       MS       F
Regression   4    0.0608   0.0152   47.8242
Residual     11   0.0035   0.0003
Total        15   0.0643

                Coefficients   Standard Error   t Stat     P-value
Intercept       4.7546         0.0126           376.0441   0.0000
Coded Quarter   -0.0085        0.0010           -8.4887    0.0000
Q1              -0.1607        0.0130           -12.4079   0.0000
Q2              -0.1035        0.0128           -8.1077    0.0000
Q3              -0.0923        0.0126           -7.3031    0.0000
The regression model for Sycamore Homes Remodelers is
Log(revenue) = 4.7546 – 0.0085 Coded Quarter – 0.1607 Quarter 1 – 0.1035 Quarter 2 – 0.0923 Quarter 3
$\log_{10}\hat{\beta}_1 = -0.0085$; $\hat{\beta}_1 = 10^{-0.0085} = 0.9807$, then $(\hat{\beta}_1 - 1)100\% = -1.929\%$
$\log_{10}\hat{\beta}_2 = -0.1607$; $\hat{\beta}_2 = 10^{-0.1607} = 0.6907$, then $(\hat{\beta}_2 - 1)100\% = -30.934\%$
$\log_{10}\hat{\beta}_3 = -0.1035$; $\hat{\beta}_3 = 10^{-0.1035} = 0.7880$, then $(\hat{\beta}_3 - 1)100\% = -21.198\%$
$\log_{10}\hat{\beta}_4 = -0.0923$; $\hat{\beta}_4 = 10^{-0.0923} = 0.80846$, then $(\hat{\beta}_4 - 1)100\% = -19.154\%$
The interpretation of the slopes is as follows: The estimated quarterly compound growth rate in sales is –1.93%. 0.6907 is the seasonal multiplier for the first quarter as compared to the fourth quarter. Sales are 30.93% lower for the first quarter as compared to the fourth quarter. 0.7880 is the seasonal multiplier for the second quarter as compared to the fourth quarter. Sales are 21.2% lower for the second quarter as compared to the fourth quarter. 0.8085 is the seasonal multiplier for the third quarter as compared to the fourth quarter. Sales are 19.15% lower for the third quarter as compared to the fourth quarter.
These results refute the claims of the Sycamore Homes Remodelers. First, it is more appropriate to examine the data from four years than just the last year. Second, examining the data from the four years, the quarterly growth rate for the Oxford Glen Remodeling is +0.7% as compared to a negative growth rate of almost 2% for the Sycamore Homes Remodelers. Finally, the Oxford Glen Remodeling has small seasonal effects of 2 – 3% as compared to the fourth quarter, while the Sycamore Homes Remodelers has large seasonal effects of between 19 and 31% as compared to the fourth quarter.
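The conversions used for both firms, from base-10 log coefficients to a quarterly compound growth rate and seasonal multipliers, can be sketched as follows. The function name `growth_and_multipliers` is hypothetical, and because the inputs are the four-decimal coefficients from the output, the results differ slightly from the figures above, which appear to be based on unrounded coefficients.

```python
def growth_and_multipliers(b_trend, b_seasonal):
    """Convert log10-scale coefficients to a growth rate (%) and seasonal multipliers."""
    growth_pct = (10 ** b_trend - 1) * 100
    multipliers = {quarter: 10 ** b for quarter, b in b_seasonal.items()}
    return growth_pct, multipliers

# Sycamore Homes Remodelers coefficients from the regression output.
growth_pct, multipliers = growth_and_multipliers(
    -0.0085, {"Q1": -0.1607, "Q2": -0.1035, "Q3": -0.0923})

print(round(growth_pct, 2))
for quarter, m in multipliers.items():
    # A multiplier below 1 means the quarter runs below the fourth quarter.
    print(quarter, round(m, 4), round((m - 1) * 100, 2))
```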
2. Oxford Glen Remodeling: Its steady continuous growth with little variability from season to season. Sycamore Homes Remodelers: The decline that occurred in year 3 did not continue. Sales in year 4 stabilized at about the same level as year 3.
3. Among other variables might be the actual number of homes remodeled, the demographics of the homeowners, and the number of repeat customers.
Chapter 20
Instructional Tips
The objectives for the digital case in this chapter are to have students:
1. Be able to use several criteria to determine a chosen course of action.
2. Revise probabilities in light of new information and determine if a previous course of action selected has changed.
3. Realize that better is a subjective word in making decisions.
Solutions
1. Probabilities & Payoffs Table:

                 P     StraightDeal   Happy Bull   Worried Bear
fast expanding   0.1   150            1200         -300
expanding        0.2   100            600          -200
stable           0.5   95             -100         100
recession        0.2   80             -900         400

Statistics for:             StraightDeal   Happy Bull    Worried Bear
Expected Monetary Value     98.5           10            60
Variance                    340.25         382900        50400
Standard Deviation          18.44586675    618.7891402   224.4994432
Coefficient of Variation    0.187267683    61.87891402   3.741657387
Return to Risk Ratio        5.339949668    0.016160594   0.267261242

                            StraightDeal   Happy Bull    Worried Bear
Expected Opportunity Loss   271.5          360           310
                            EVPI
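The statistics in the table above can be recomputed from the probabilities and payoffs. This is a minimal sketch with hypothetical function names (`decision_stats`, `expected_opportunity_loss`); the opportunity loss for each event is measured against the best payoff available for that event.

```python
import math

def decision_stats(probs, payoffs):
    """Return (EMV, variance, standard deviation, return-to-risk ratio)."""
    emv = sum(p * x for p, x in zip(probs, payoffs))
    var = sum(p * (x - emv) ** 2 for p, x in zip(probs, payoffs))
    sd = math.sqrt(var)
    return emv, var, sd, emv / sd

def expected_opportunity_loss(probs, payoff_table):
    """Expected opportunity loss per action; payoff_table maps action -> payoffs by event."""
    best = [max(column) for column in zip(*payoff_table.values())]
    return {action: sum(p * (b - x) for p, b, x in zip(probs, best, payoffs))
            for action, payoffs in payoff_table.items()}

probs = [0.1, 0.2, 0.5, 0.2]  # fast expanding, expanding, stable, recession
payoff_table = {
    "StraightDeal": [150, 100, 95, 80],
    "Happy Bull": [1200, 600, -100, -900],
    "Worried Bear": [-300, -200, 100, 400],
}

emv, var, sd, return_to_risk = decision_stats(probs, payoff_table["StraightDeal"])
eol = expected_opportunity_loss(probs, payoff_table)
evpi = min(eol.values())  # EVPI equals the smallest expected opportunity loss
print(round(emv, 2), round(var, 2), round(return_to_risk, 4), eol, evpi)
```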
Better is a subjective term that cannot be solely determined by a statistical analysis. If you accept the probabilities of the various events, StraightDeal should be selected since it has the highest expected monetary value ($98.50), the highest return-to-risk ratio (5.34), and the lowest expected opportunity loss ($271.50), which is the expected value of perfect information.
2.
Bayes’ Theorem Calculations

                 Probabilities
Event            Prior   Conditional   Joint   Revised
fast expanding   0.1     0.9           0.09    0.1765
expanding        0.2     0.75          0.15    0.2941
stable           0.5     0.5           0.25    0.4902
recession        0.2     0.1           0.02    0.0392
Total:                                 0.51
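The revised probabilities in the table are each joint probability divided by the total of the joint probabilities. A minimal sketch of that calculation follows; the function name `revise_probabilities` is hypothetical.

```python
def revise_probabilities(priors, conditionals):
    """Bayes' theorem: joint = prior * conditional, revised = joint / sum of joints."""
    joints = [p * c for p, c in zip(priors, conditionals)]
    total = sum(joints)
    return joints, total, [j / total for j in joints]

priors = [0.1, 0.2, 0.5, 0.2]         # fast expanding, expanding, stable, recession
conditionals = [0.9, 0.75, 0.5, 0.1]  # conditional probabilities from the table
joints, total, revised = revise_probabilities(priors, conditionals)
print([round(j, 2) for j in joints], round(total, 2), [round(r, 4) for r in revised])
```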
2. cont. Probabilities & Payoffs Table:

                 P        StraightDeal   Happy Bull   Worried Bear
fast expanding   0.1765   150            1200         -300
expanding        0.2941   100            600          -200
stable           0.4902   95             -100         100
recession        0.0392   80             -900         400

Statistics for:             StraightDeal   Happy Bull    Worried Bear
Expected Monetary Value     105.59         303.96        -47.07
Variance                    437.9369       304298.3184   36607.4151
Standard Deviation          20.92694196    551.6324124   191.3306434
Coefficient of Variation    0.198190567    1.814819096   -4.06481077
Return to Risk Ratio        5.045648819    0.551019108   -0.24601391

                            StraightDeal   Happy Bull    Worried Bear
Expected Opportunity Loss   347.37         149           500.03
                                           EVPI
Now the choice of which fund to invest in is much more difficult. Although Happy Bull has a higher expected monetary value than StraightDeal and a lower expected value of perfect information, it also has a much lower return-to-risk ratio. Perhaps a better approach would be to use the portfolio management approach covered in Section 5.2 to invest a proportion of assets in StraightDeal and a proportion in Happy Bull. For example, investing 70% in StraightDeal and 30% in Happy Bull would provide a portfolio expected return of $165.10 and a portfolio risk of $178.14, substantially below the standard deviation of Happy Bull of $551.63.
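The decision criteria used in this case can be checked with a short Python sketch. It assumes only the payoffs and the prior and revised probabilities reported in the tables; the helper names `emv_and_sd` and `expected_opportunity_loss` are illustrative, and the 70/30 mix is the example portfolio mentioned above.

```python
import math

# Payoffs per event, in the order: fast expanding, expanding, stable, recession.
payoffs = {
    "StraightDeal": [150, 100, 95, 80],
    "Happy Bull": [1200, 600, -100, -900],
    "Worried Bear": [-300, -200, 100, 400],
}
prior = [0.1, 0.2, 0.5, 0.2]
revised = [0.1765, 0.2941, 0.4902, 0.0392]


def emv_and_sd(probs, values):
    """Expected monetary value and standard deviation of one alternative."""
    emv = sum(p * v for p, v in zip(probs, values))
    variance = sum(p * (v - emv) ** 2 for p, v in zip(probs, values))
    return emv, math.sqrt(variance)


def expected_opportunity_loss(probs, payoffs):
    """Expected opportunity loss of every alternative; the minimum is the EVPI."""
    best = [max(column) for column in zip(*payoffs.values())]
    return {
        name: sum(p * (b - v) for p, b, v in zip(probs, best, values))
        for name, values in payoffs.items()
    }


for label, probs in (("prior", prior), ("revised", revised)):
    eol = expected_opportunity_loss(probs, payoffs)
    for name, values in payoffs.items():
        emv, sd = emv_and_sd(probs, values)
        print(f"{label:8s} {name:13s} EMV={emv:8.2f} SD={sd:8.2f} "
              f"RTR={emv / sd:6.3f} EOL={eol[name]:7.2f}")

# Portfolio of 70% StraightDeal and 30% Happy Bull under the revised probabilities.
mix = [0.7 * s + 0.3 * h
       for s, h in zip(payoffs["StraightDeal"], payoffs["Happy Bull"])]
mix_emv, mix_sd = emv_and_sd(revised, mix)
print(f"70/30 mix: expected return={mix_emv:.2f}, risk={mix_sd:.2f}")
```

Computing the portfolio risk from the mixed payoffs per event is equivalent to using the variance and covariance formula for a two-asset portfolio.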
The Craybill Instrumentation Company Case
Chapter 15
1. (a) Let Y = Sales, X1 = Wonderlic Personnel Test score, X2 = Strong-Campbell Interest Inventory Test score, X3 = experience, X4 = 1 with a degree in electrical engineering; 0 otherwise.
Based on a full regression model involving all of the variables, all VIFs are less than 5, so there is no reason to suspect collinearity between any pair of variables. The best-subsets approach yields the following models to be considered.

Partial PHStat output from the best-subsets selection:

Model      Cp         k   R Square   Adj. R Square   Std. Error   Consider This Model?
X1X2X3X4   5          5   0.593228   0.552551222     11.74203     Yes
X1X2X4     3.101155   4   0.5922     0.562360664     11.6126      Yes
X2X3X4     3.001097   4   0.593217   0.563452639     11.59811     Yes
X2X4       1.101172   3   0.5922     0.572780469     11.47353     Yes
Partial PHStat output of the full regression model:

                 Coefficients   Standard Error   t Stat    P-value
Intercept        25.7683        13.9537          1.8467    0.0722
Wonder           -0.0134        0.4050           -0.0331   0.9737
SC               1.3514         0.1947           6.9407    0.0000
Experience       0.1682         0.5287           0.3180    0.7521
Engineer Dummy   7.2747         4.1011           1.7738    0.0837
Since the p-values for X1 and X3 are considerably larger than 0.05, neither variable has a significant individual effect on sales. The best model should include both X2 and X4.

PHStat output of the model with only X2 and X4:

Regression Statistics
Multiple R          0.7695
R Square            0.5922
Adjusted R Square   0.5728
Standard Error      11.4735
Observations        45

ANOVA
             df   SS          MS          F         Significance F
Regression   2    8029.0413   4014.5207   30.4958   6.59784E-09
Residual     42   5528.9587   131.6419
Total        44   13558

                 Coefficients   Standard Error   t Stat   P-value
Intercept        26.8910        9.7718           2.7519   0.0087
SC               1.3408         0.1792           7.4824   0.0000
Engineer Dummy   7.2869         3.9857           1.8282   0.0746
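The summary statistics in this output follow directly from the ANOVA sums of squares. The sketch below recomputes them from the reported values as a consistency check; it is not a refit of the model, and the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table for the model with X2 and X4.
n = 45          # observations
k = 2           # independent variables in the model
ssr = 8029.0413  # regression sum of squares
sse = 5528.9587  # residual sum of squares

sst = ssr + sse
r_square = ssr / sst
adj_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
msr = ssr / k
mse = sse / (n - k - 1)
f_stat = msr / mse
std_error = math.sqrt(mse)  # standard error of the estimate

print(f"R Square={r_square:.4f}, Adjusted R Square={adj_r_square:.4f}")
print(f"F={f_stat:.4f}, Standard Error={std_error:.4f}")
```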
Although the p-value for the Engineer Dummy variable is also > .05, it is close enough to .05 to retain the variable in the model because the managers consider it important. Therefore, the most appropriate model to predict sales is
\(\hat{Y} = 26.8910 + 1.3408X_2 + 7.2869X_4\)
With the exception of one single residual point at a value of -44.58 when SC = 54, there is no specific pattern in the residual plot.
(b) According to the finding in (a), the company only needs to administer the Strong-Campbell test.
(c) According to the model in (a), the variable X4 helps predict sales and, hence, the idea of only hiring electrical engineers should be supported.
(d) Prior selling experience (X3) does not help predict sales according to the model chosen in (a).
(e) The company only needs to administer the Strong-Campbell test to save time and money. It should consider giving hiring preference to sales managers with an electrical engineering degree.
The Mountain States Potato Company Case
Chapter 15
The independent variables involved are the pH of the filter cake (PH), the pressure in the vacuum line below the fluid line on the rotating drum (LOWER), the pressure of the vacuum line above the fluid line on the rotating drum (UPPER), cake thickness measured on the drum (THICK), setting used to control the drum speed (VARIDRIV), and the speed at which the drum was rotated when collecting filter cake (DRUMSPD). These data are contained in the POTATO file.
We begin our analysis of the potato processing data by first measuring the amount of collinearity that exists between the explanatory variables through the use of the variance inflationary factor. The following figure represents partial PHStat output for a multiple linear regression model in which the percent of solids is predicted from the six explanatory variables. We observe that four of the VIFs are above 5.0, ranging from 8.4 for Upper to 9.9 for Varidriv. Thus, based on the criterion developed by Snee, there is evidence of collinearity among at least some of the explanatory variables. A reasonable strategy is to remove the independent variable with the largest VIF above 5 and determine what effect this has on the VIFs of the remaining independent variables.
Regression Analysis (each explanatory variable regressed on all other X)

Variable             Multiple R   R Square   Adjusted R Square   Standard Error   Observations   VIF
PH                   0.561502     0.315284   0.243959            0.232255         54             1.460459
Lower Pressure       0.939267     0.882222   0.869954            0.716466         54             8.490574
Upper Pressure       0.93823      0.880276   0.867805            0.766658         54             8.352558
Cake Thickness       0.616598     0.380193   0.31563             0.108131         54             1.613407
Varidriv speed       0.948243     0.899165   0.888661            0.179475         54             9.917201
Drum speed setting   0.946259     0.895406   0.884511            2.150324         54             9.560793

Figure 1 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Six Explanatory Variables

The following figure represents the regression model obtained from PHStat with only the Varidriv variable removed from the model.
Regression Analysis (each explanatory variable regressed on all other X)

Variable             Multiple R   R Square   Adjusted R Square   Standard Error   Observations   VIF
PH                   0.527979     0.278761   0.219885            0.235924         54             1.386504
Lower Pressure       0.938614     0.880996   0.871281            0.7128           54             8.403063
Upper Pressure       0.937879     0.879617   0.869789            0.760882         54             8.306799
Cake Thickness       0.614269     0.377327   0.326496            0.10727          54             1.605978
Drum speed setting   0.330551     0.109264   0.036551            6.210805         54             1.122667

Figure 2 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Five Explanatory Variables excluding the Varidriv independent variable

From the figure above, we see that the VIF for the Drumspeed independent variable has been reduced from 9.6 to 1.1. This indicates that Varidriv and Drumspeed were highly correlated with each other, but uncorrelated with the other independent variables. However, we also observe that the VIF values for Lower and Upper are still above 5, being equal to 8.4 and 8.3 respectively. Using the criterion of removing the independent variable with the highest VIF above 5, we can remove the Lower independent variable from the model. The following figure represents PHStat output for a model that has excluded the Lower and Varidriv independent variables.
Regression Analysis (each explanatory variable regressed on all other X)

Variable             Multiple R    R Square      Adjusted R Square   Standard Error   Observations   VIF
PH                   0.516068032   0.266326213   0.222305786         0.235557799      54             1.363003583
Upper Pressure       0.530366      0.281288      0.238165            1.840452         54             1.391378
Cake Thickness       0.603839337   0.364621944   0.326499261         0.10726935       54             1.573866128
Drum speed setting   0.29969       0.089814      0.035203            6.215147         54             1.098677

Figure 3 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Four Explanatory Variables excluding the Lower and Varidriv independent variables

From the figure above, we see that none of the remaining four independent variables has a VIF value above 1.6. The Lower independent variable was undoubtedly highly correlated with the Upper independent variable, and its removal left us with four relatively uncorrelated independent variables: pH, Upper, Thick, and Drumspeed.
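Each VIF in Figures 1 through 3 is 1/(1 - R²), where R² comes from regressing one explanatory variable on all the others. The sketch below applies that formula to the R Square values reported for the six-variable model and flags the variables above the threshold of 5 used in this solution; the dictionary keys are shortened labels chosen for illustration.

```python
# R Square of each explanatory variable regressed on the other five (Figure 1).
r_square_on_others = {
    "PH": 0.315284,
    "Lower": 0.882222,
    "Upper": 0.880276,
    "Thick": 0.380193,
    "Varidriv": 0.899165,
    "Drumspd": 0.895406,
}


def vif(r_square):
    """Variance inflationary factor for one explanatory variable."""
    return 1.0 / (1.0 - r_square)


vifs = {name: vif(r2) for name, r2 in r_square_on_others.items()}
flagged = sorted(name for name, value in vifs.items() if value > 5)
to_remove = max(vifs, key=vifs.get)  # largest VIF is removed first

for name, value in vifs.items():
    print(f"{name:9s} VIF={value:.4f}")
print("Above 5:", flagged, "| remove first:", to_remove)
```

Repeating the same calculation after each removal, with the R Square values refit on the remaining variables, gives the VIFs shown in Figures 2 and 3.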
The Stepwise Regression Approach to Model Building
We now continue our analysis of these data by attempting to determine the explanatory variables that might be deleted from the complete model. We shall first utilize stepwise regression. The figure below represents a partial output obtained from the PHStat add-in for Microsoft Excel for the potato processing data.
Stepwise Analysis
Table of Results for General Stepwise

PH entered.

             df   SS            MS            F             Significance F
Regression   1    25.35907426   25.35907426   17.67231123   0.000103538
Residual     52   74.61796278   1.434960823
Total        53   99.97703704

            Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept   0.782076396    2.45805091       0.318169324   0.751630817   -4.150360268   5.714513059
PH          2.589618022    0.616011802      4.203844815   0.000103538   1.353500744    3.825735299

Upper Pressure entered.

             df   SS            MS            F             Significance F
Regression   2    41.46414438   20.73207219   18.07013179   1.16804E-06
Residual     51   58.51289266   1.147311621
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept        3.839576804    2.344527788      1.637675963   0.107645513   -0.86725551    8.546409118
PH               2.834310878    0.554678372      5.109827637   4.87591E-06   1.720748436    3.947873319
Upper Pressure   -0.263257541   0.070265187      -3.74662834   0.00045751    -0.404320681   -0.1221944

No other variables could be entered into the model. Stepwise ends.
Figure 4 Stepwise regression output obtained from the PHStat2 add-in for Microsoft Excel for the potato processing data

For this example, a significance level of .05 was utilized either to enter a variable into the model or to delete a variable from the model. The first variable entered into the model is pH. Since the p-value of .0001 is less than .05, pH is included in the regression model.
The next step involves the evaluation of the second variable to be included in this model. The variable to be chosen is the one that will make the largest contribution to the model, given that the first explanatory variable has already been selected. For this model, the second variable is Upper pressure. Since the p-value of .00046 for Upper pressure is less than .05, Upper pressure is included in the regression model.
Now that Upper pressure has been entered into the model, we determine whether pH is still an important contributing variable or whether it may be eliminated from the model. Since the p-value of .000004876 (4.87591E-06 in scientific notation) for pH is also less than .05, pH should remain in the regression model.
The next step involves the determination of whether any of the remaining variables should be added to the model. Since none of the other variables meet the .05 criterion for entry into the model, the stepwise procedure terminates with a model that includes pH and Upper pressure.
The Best Subset Approach to Model Building
The best subset approach evaluates either all possible regression models for a given set of independent variables or at least the best subset of models for a given number of independent variables. The figure below represents partial output obtained from the PHStat2 add-in for Microsoft Excel in which all regression models for a given number of parameters were evaluated according to two widely used criteria, the adjusted r² and the Cp statistic.

Best Subsets Analysis
Intermediate Calculations
R2T       0.428728
1 - R2T   0.571272
n         54
T         5
n - T     49

Model      Cp         k   R Square   Adj. R Square   Std. Error   Consider This Model?
X1         14.0171    2   0.253649   0.239296084     1.197899     No
X1X2       2.200053   3   0.414737   0.391785177     1.071126     Yes
X1X2X3     3.038685   4   0.428277   0.393973229     1.069198     Yes
X1X2X3X4   5          5   0.428728   0.382093161     1.079627     Yes
X1X2X4     4.136941   4   0.415472   0.380400828     1.081104     No
X1X3       15.01919   3   0.265283   0.236470785     1.200121     No
X1X3X4     16.79688   4   0.267875   0.223947575     1.209923     No
X1X4       15.80859   3   0.25608    0.226906562     1.207614     No
X2         25.90084   2   0.115101   0.098083637     1.304354     No
X2X3       25.97922   3   0.137504   0.103681097     1.3003       No
X2X3X4     27.03666   4   0.148493   0.097402952     1.304846     No
X2X4       26.46649   3   0.131823   0.097777342     1.304575     No
X3         28.92168   2   0.079882   0.062187562     1.330057     No
X3X4       30.54398   3   0.084286   0.048375212     1.339816     No
X4         34.92666   2   0.009872   -0.009168586    1.37973      No
Figure 5 Best subsets regression output obtained from the PHStat2 add-in for Microsoft Excel for the potato processing data

The first criterion that is often used is the adjusted r², which adjusts the r² of each model to account for the number of variables in the model. Since models with different numbers of independent variables are to be compared, the adjusted r² is the appropriate criterion here rather than r². Referring to the figure above, we observe that the adjusted r² reaches a maximum value of .39397 for the model that includes the independent variables pH, Upper, and Thick plus the intercept term (for a total of four terms). We note that the model selected by using stepwise regression, which includes pH and Upper, has an adjusted r² of .39179. Thus, the best subset approach, unlike stepwise regression, has provided us with several alternative models to evaluate in greater depth using other criteria such as parsimony, interpretability, and departure from model assumptions (as evaluated by residual analysis).
A second criterion often used in the evaluation of competing models is based on the Cp statistic developed by Mallows. When a regression model with k independent variables contains only random differences from a true model, the average value of Cp is k + 1, the number of parameters. (The k column in the figure above already counts the intercept, so it reports this number of parameters directly.) Thus, in evaluating many alternative regression models, our goal is to find models whose Cp is close to or below k + 1. From the previous figure, we observe that there are three models that have a Cp value equal to or below k + 1. These are the models with X1 and X2, with X1, X2, and X3, and with X1, X2, X3, and X4. Since the models with X1 and X2 and with X1, X2, and X3 have fewer variables and also have Cp less than k + 1, we will focus on these two models.
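The two criteria above can be recomputed from the R Square values in Figure 5. The sketch below assumes the usual textbook form of Mallows' statistic, Cp = (1 - R²k)(n - T)/(1 - R²T) - [n - 2(k + 1)], with k the number of independent variables in the candidate model and T the number of parameters in the full model; this form reproduces the reported Cp values. The function names are illustrative.

```python
# Intermediate calculations from Figure 5.
n = 54             # observations
t = 5              # parameters in the full model (4 variables + intercept)
r2_full = 0.428728  # R Square of the full model (R2T)


def mallows_cp(r2_k, k):
    """Cp for a model with k independent variables and R Square r2_k."""
    return (1 - r2_k) * (n - t) / (1 - r2_full) - (n - 2 * (k + 1))


def adjusted_r2(r2_k, k):
    """Adjusted r-squared for a model with k independent variables."""
    return 1 - (1 - r2_k) * (n - 1) / (n - k - 1)


# (R Square, number of independent variables) for three candidate models.
models = {
    "X1X2": (0.414737, 2),
    "X1X2X3": (0.428277, 3),
    "X1X2X4": (0.415472, 3),
}

results = {}
for name, (r2, k) in models.items():
    cp = mallows_cp(r2, k)
    results[name] = (cp, adjusted_r2(r2, k), cp <= k + 1)
    print(f"{name:7s} Cp={cp:.4f} adj r2={results[name][1]:.6f} "
          f"Cp <= k+1: {results[name][2]}")
```

The X1X2X4 model has Cp above k + 1 = 4, which is why it is marked "No" in the figure while the other two are marked "Yes".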
One approach for choosing between models that meet the criterion of Cp less than k + 1 is to determine whether the models contain a subset of variables that are common, and then test whether the contribution of the additional variables is significant. In this case, that would mean testing whether variable X3 made a significant contribution to the regression model given that variables X1 and X2 were already included in the model. If the contribution was statistically significant, then variable X3 would be included in the regression model. If variable X3 did not make a statistically significant contribution, variable X3 would not be included in the model. The following figure represents a regression model that includes variables X1, X2, and X3 (pH, Upper, and Thick).

Regression Analysis

Regression Statistics
Multiple R          0.654428477
R Square            0.428276632
Adjusted R Square   0.393973229
Standard Error      1.069197909
Observations        54

ANOVA
             df   SS            MS            F             Significance F
Regression   3    42.81782866   14.27260955   12.48496083   3.26679E-06
Residual     50   57.15920838   1.143184168
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat         P-value       Lower 95%      Upper 95%
Intercept        2.625587479    2.592611079      1.012719378    0.316070006   -2.581827253   7.833002211
PH               3.148146196    0.624290082      5.042761826    6.41161E-06   1.89422215     4.402070241
Upper Pressure   -0.309346256   0.081934686      -3.775522569   0.000424913   -0.473916983   -0.144775529
Cake Thickness   1.531919658    1.407781964      1.088179631    0.281733384   -1.295694788   4.359534103

Figure 6 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Three Explanatory Variables including the pH, Upper, and Thick independent variables

From this figure, we observe that Thick (X3) has a t value of 1.09 and a p-value of .282. Since the p-value of .282 > .05, we can conclude that Thick (X3) does not make a significant contribution to the regression model given that pH (X1) and Upper pressure (X2) are included. Therefore, a reasonable approach is to eliminate Thick (X3) from the model and fit the regression model that includes pH (X1) and Upper pressure (X2). The following figure represents PHStat output for this model.
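The contribution of Thick (X3) can also be expressed as a partial F statistic, which for a single added variable equals the square of its t statistic. The sketch below is a consistency check using the regression sums of squares reported in this solution (42.81782866 for the three-variable model and 41.46414438 for the model with pH and Upper only); the variable names are illustrative.

```python
# Regression sums of squares reported in the PHStat output.
ssr_full = 42.81782866     # model with pH, Upper, and Thick
ssr_reduced = 41.46414438  # model with pH and Upper only
mse_full = 1.143184168     # residual mean square of the three-variable model
t_thick = 1.088179631      # reported t statistic for Cake Thickness

# Partial F for adding Thick given pH and Upper (1 numerator degree of freedom).
partial_f = (ssr_full - ssr_reduced) / mse_full

print(f"partial F = {partial_f:.6f}, t squared = {t_thick ** 2:.6f}")
```

Because the partial F and the squared t statistic agree, the t test in Figure 6 and the partial F test lead to the same conclusion about Thick.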
Regression Analysis

Regression Statistics
Multiple R          0.644000528
R Square            0.41473668
Adjusted R Square   0.391785177
Standard Error      1.071126333
Observations        54

ANOVA
             df   SS            MS            F             Significance F
Regression   2    41.46414438   20.73207219   18.07013179   1.16804E-06
Residual     51   58.51289266   1.147311621
Total        53   99.97703704

                 Coefficients   Standard Error   t Stat        P-value       Lower 95%      Upper 95%
Intercept        3.839576804    2.344527788      1.637675963   0.107645513   -0.86725551    8.546409118
PH               2.834310878    0.554678372      5.109827637   4.87591E-06   1.720748436    3.947873319
Upper Pressure   -0.263257541   0.070265187      -3.74662834   0.00045751    -0.404320681   -0.1221944
Figure 7 Regression Model obtained from PHStat2 to Predict Percent of Solids based on Two Explanatory Variables including the pH and Upper independent variables

The following residual plots do not suggest any need for a non-linear transformation. The Durbin-Watson statistic of 1.5509 is greater than dU = 1.47 at the 10% level of significance, so there is not sufficient evidence to conclude that there is positive autocorrelation in the residuals. Thus, we can conclude that raising the pH and/or reducing the Upper pressure should result in an increased percentage of solids.
Chapter 18
JMP output of the regression tree:
According to the partition of the regression tree for solids, the first split occurs at pH level = 4.2. When pH level 4.2, the percentage of solids in the filter cakes is higher with a mean of 12.28%. Among those with pH level 4.2, the next split occurs at drum speed = 37.45. Those with drum speed 37.45 have a higher mean percentage of solids at 13.92%. All the other splits produce lower mean percentage of solids. Hence, to maintain the highest percentage of solids in the filter cakes, it is recommended that the pH level be set at 4.2 with a drum speed setting at 37.45.
The O. Hara Performance Consulting Case
Chapter 13
1. Simple Regression Analysis: Internal Rating vs WGCTA Score

Regression Statistics
Multiple R          0.8961
R Square            0.8030
Adjusted R Square   0.7960
Standard Error      0.3651
Observations        30

ANOVA
             df   SS        MS        F
Regression   1    15.2150   15.2150   114.1619
Residual     28   3.7317    0.1333
Total        29   18.9467

              Coefficients   Standard Error   t Stat    P-value
Intercept     0.0934         0.7221           0.1293    0.8981
WGCTA Score   0.0965         0.0090           10.6847   0.0000
\(\widehat{\text{Rating}} = b_0 + b_1\,\text{WGCTA} = 0.0934 + 0.0965\,\text{WGCTA}\)
The p-value = 0.0000 < 0.05 for the t test for the significance of the slope coefficient. Reject H0 and conclude that WGCTA score is significant in predicting job performance.
2. \(\widehat{\text{Rating}} = b_0 + b_1\,\text{WGCTA} = 0.0934 + 0.0965(89) = 8.6836\)
\(8.4625 \le \mu_{\text{Rating}\mid\text{WGCTA}=89} \le 8.9047\)
\(7.9038 \le \text{Rating}_{\text{WGCTA}=89} \le 9.4634\)
3. Since a WGCTA score of 89 falls outside the domain of the WGCTA scores, you should be concerned that the linear relationship that exists between rating and WGCTA scores might not continue to hold true outside the domain.
4. The normality assumption of the errors might have been slightly violated.
The Sure Value Convenience Stores Case
Chapter 8
1. \(\bar{X} \pm t\,\dfrac{S}{\sqrt{n}} = 978 \pm 2.0345\,\dfrac{90}{\sqrt{43}}\), so \(950.08 \le \mu \le 1005.92\)
2. Based on the evidence gathered from the sample of 43 stores, the 95% confidence interval for the mean per-store count in all of the franchise’s stores is from 950.08 to 1005.92. With a 95% level of confidence, the franchise can conclude that the mean per-store count in all of its stores is somewhere between 950.08 and 1005.92. The entire interval lies above 925, the mean per-store count before the price reduction. Hence, reducing coffee prices is a good strategy to increase the mean customer count.
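The interval limits can be reproduced from the summary statistics. This is a minimal sketch that takes the critical value 2.0345 exactly as reported in the solution rather than looking it up; the variable names are illustrative.

```python
import math

n = 43           # stores in the sample
x_bar = 978      # sample mean per-store count
s = 90           # sample standard deviation
t_crit = 2.0345  # critical value as reported in the solution

standard_error = s / math.sqrt(n)
margin = t_crit * standard_error
lower, upper = x_bar - margin, x_bar + margin

print(f"{lower:.2f} <= mu <= {upper:.2f}")
print("Interval lies entirely above 925:", lower > 925)
```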
Chapter 9
1. PHStat output:

t Test for Hypothesis of the Mean

Data
Null Hypothesis                μ = 925
Level of Significance          0.01
Sample Size                    43
Sample Mean                    978
Sample Standard Deviation      90

Intermediate Calculations
Standard Error of the Mean     13.7249
Degrees of Freedom             42
t Test Statistic               3.8616

Upper-Tail Test
Upper Critical Value           2.4185
p-Value                        0.0000
Reject the null hypothesis

\(H_0\colon \mu \le 925\) The mean customer count is not more than 925.
\(H_1\colon \mu > 925\) The mean customer count is more than 925.
A Type I error occurs when you conclude the mean customer count is more than 925 when in fact the mean number is not more than 925. A Type II error occurs when you conclude the mean customer count is not more than 925 when in fact the mean number is more than 925.
Decision rule: If tSTAT > 2.4185 or when the p-value < 0.01, reject H0.
Test statistic: \(t_{STAT} = \dfrac{\bar{X} - \mu}{S/\sqrt{n}} = \dfrac{978 - 925}{90/\sqrt{43}} = 3.8616\); the p-value is virtually 0.
Decision: Since tSTAT = 3.8616 is greater than 2.4185 or the p-value is less than 0.01, reject H0. There is enough evidence to conclude that reducing coffee prices is a good strategy for increasing the mean customer count. When the null hypothesis is true, the probability of obtaining a sample whose mean is 978 or more is virtually 0.
Chapter 10
1. Stores that priced the “short” coffee at $0.99
\(H_0\colon \mu \le 925\) vs. \(H_1\colon \mu > 925\)

t Test for Hypothesis of the Mean

Data
Null Hypothesis                μ = 925
Level of Significance          0.05
Sample Size                    15
Sample Mean                    972
Sample Standard Deviation      85

Intermediate Calculations
Standard Error of the Mean     21.9469
Degrees of Freedom             14
t Test Statistic               2.1415

Upper-Tail Test
Upper Critical Value           1.7613
p-Value                        0.0252
Reject the null hypothesis

Since the p-value = 0.0252 < 0.05, reject H0. There is evidence that reducing the price of a “short” coffee to $0.99 increases per store average daily customer count.

Stores that priced the small coffee at $1.09
\(H_0\colon \mu \le 925\) vs. \(H_1\colon \mu > 925\)

t Test for Hypothesis of the Mean

Data
Null Hypothesis                μ = 925
Level of Significance          0.05
Sample Size                    15
Sample Mean                    951
Sample Standard Deviation      64

Intermediate Calculations
Standard Error of the Mean     16.5247
Degrees of Freedom             14
t Test Statistic               1.5734

Upper-Tail Test
Upper Critical Value           1.7613
p-Value                        0.0690
Do not reject the null hypothesis

Since the p-value = 0.0690 > 0.05, do not reject H0. There is insufficient evidence that reducing the price of a “short” coffee to $1.09 increases per store average daily customer count.
2. \(H_0\colon \sigma_1^2 = \sigma_2^2\) vs. \(H_1\colon \sigma_1^2 \ne \sigma_2^2\)

F Test for Difference in Two Variances

Data
Level of Significance                     0.05
Large-Variance Sample
  Sample Size                             15
  Sample Standard Deviation               85
Small-Variance Sample
  Sample Size                             15
  Sample Standard Deviation               64

Intermediate Calculations
F Test Statistic                          1.3281
Population 1 Sample Degrees of Freedom    14
Population 2 Sample Degrees of Freedom    14

Two-Tail Test
Upper Critical Value                      2.9786
p-Value                                   0.6026
Do not reject the null hypothesis

Since the p-value = 0.6026 > 0.05, do not reject H0. There is not enough evidence that the two variances are different. Hence, a pooled-variance t test is appropriate.
\(H_0\colon \mu_1 = \mu_2\) vs. \(H_1\colon \mu_1 \ne \mu_2\)

Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)

Data
Hypothesized Difference                   0
Level of Significance                     0.05
Population 1 Sample
  Sample Size                             15
  Sample Mean                             972
  Sample Standard Deviation               85
Population 2 Sample
  Sample Size                             15
  Sample Mean                             951
  Sample Standard Deviation               64

Intermediate Calculations
Population 1 Sample Degrees of Freedom    14
Population 2 Sample Degrees of Freedom    14
Total Degrees of Freedom                  28
Pooled Variance                           5660.5000
Difference in Sample Means                21.0000
t Test Statistic                          0.7644

Two-Tail Test
Lower Critical Value                      -2.0484
Upper Critical Value                      2.0484
p-Value                                   0.4510
Do not reject the null hypothesis

Since the p-value = 0.4510 > 0.05, do not reject H0. There is not enough evidence of a difference in the per store daily customer count between stores in which a small coffee was priced at $0.99 and stores in which a “short” coffee was priced at $1.09 for an eight-ounce cup.
3. Since there is not enough evidence of a difference in the per store mean daily customer count between stores in which a small coffee was priced at $0.99 and stores in which a “short” coffee was priced at $1.09 for an eight-ounce cup, you will recommend that a “short” coffee should be priced at $1.09 since that will bring in more profit per cup.
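The intermediate calculations of the pooled-variance t test can be verified from the two sets of summary statistics. This sketch recomputes only the pooled variance and the test statistic; the critical values and p-value are taken as reported in the output, and the variable names are illustrative.

```python
import math

# Summary statistics for the $0.99 stores (sample 1) and $1.09 stores (sample 2).
n1, mean1, sd1 = 15, 972, 85
n2, mean2, sd2 = 15, 951, 64

df = n1 + n2 - 2
pooled_variance = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
t_stat = (mean1 - mean2) / math.sqrt(pooled_variance * (1 / n1 + 1 / n2))

upper_critical = 2.0484  # two-tail critical value reported in the output
reject = abs(t_stat) > upper_critical

print(f"df={df}, pooled variance={pooled_variance:.4f}, t={t_stat:.4f}")
print("Reject H0:", reject)
```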
Chapter 11
1. \(H_0\colon \sigma_1^2 = \sigma_2^2 = \sigma_3^2 = \sigma_4^2\)
H1: At least one variance is different.

Excel output for Levene’s test for homogeneity of variance:

Source of Variation   SS            df   MS          F        P-value   F crit
Between Groups        22848.6136    3    7616.2045   0.8574   0.4710    2.8387
Within Groups         355296.5455   40   8882.4136
Total                 378145.1591   43

Level of significance   0.05

Since the p-value = 0.4710 > 0.05, do not reject H0. There is not enough evidence of a difference in the variation in daily customer count among the different prices.
You can perform the one-way ANOVA F test for the difference in means.
\(H_0\colon \mu_1 = \mu_2 = \mu_3 = \mu_4\) where 1 = $0.99, 2 = $1.09, 3 = $1.19, 4 = $1.29
H1: At least one mean is different.
Decision rule: df: 3, 40. If F > 2.84, reject H0.

Excel output:

ANOVA
Source of Variation   SS             df   MS            F         P-value   F crit
Between Groups        1544831.7273   3    514943.9091   25.0813   0.0000    2.8387
Within Groups         821239.4545    40   20530.9864
Total                 2366071.1818   43

Level of significance   0.05

Test statistic: FSTAT = 25.0813
Decision: Since FSTAT = 25.0813 is greater than the critical bound of 2.84, reject H0. There is evidence of a difference in the mean daily customer count based on the price of a short coffee.
2. To determine which of the means are significantly different from one another, you perform the Tukey-Kramer procedure. PHStat output:
Tukey-Kramer Multiple Comparisons

Group            Sample Mean   Sample Size
1: $0.99 Sales   1630.273      11
2: $1.09 Sales   1973.364      11
3: $1.19 Sales   1599.636      11
4: $1.29 Sales   1465.273      11

Other Data
Level of significance   0.05
Numerator d.f.          4
Denominator d.f.        40
MSW                     20530.99
Q Statistic             3.79

Comparison           Absolute Difference   Std. Error of Difference   Critical Range   Results
Group 1 to Group 2   343.0909              43.20246875                163.7            Means are different
Group 1 to Group 3   30.63636              43.20246875                163.7            Means are not different
Group 1 to Group 4   165                   43.20246875                163.7            Means are different
Group 2 to Group 3   373.7273              43.20246875                163.7            Means are different
Group 2 to Group 4   508.0909              43.20246875                163.7            Means are different
Group 3 to Group 4   134.3636              43.20246875                163.7            Means are not different

Four of the six pairwise comparisons show significantly different means; only the $0.99 and $1.19 prices and the $1.19 and $1.29 prices do not differ significantly. In ascending order of mean daily customer count, the prices are $1.29, $1.19, $0.99, and $1.09.
3. If the objective is to maximize the mean daily customer counts, the store should sell the short coffee at $1.09. Even though the mean daily customer counts are highest when the short coffee price is at $1.09, to determine the optimal price to maximize profit, you will need to know the cost of the coffee.
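The critical range and the pairwise decisions of the Tukey-Kramer procedure can be recomputed from the group means, MSW, and the Q statistic reported in the output. This is a sketch under those reported values (Q = 3.79 is taken as given, not looked up), with illustrative variable names.

```python
import math
from itertools import combinations

means = {"$0.99": 1630.273, "$1.09": 1973.364, "$1.19": 1599.636, "$1.29": 1465.273}
n_per_group = 11
msw = 20530.99  # mean square within from the one-way ANOVA
q_stat = 3.79   # Studentized range value as reported in the output

# With equal sample sizes the standard error and critical range are the same
# for every pair of groups.
std_error = math.sqrt(msw / 2 * (1 / n_per_group + 1 / n_per_group))
critical_range = q_stat * std_error

different = []
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    if diff > critical_range:
        different.append((a, b))
    print(f"{a} vs {b}: |diff|={diff:9.4f} different={diff > critical_range}")

print(f"critical range = {critical_range:.1f}")
```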
Chapter 12
PHStat output:

Kruskal-Wallis Rank Test for Differences in Medians

Data
Level of Significance                 0.05

Intermediate Calculations
Sum of Squared Ranks/Sample Size      26700.41
Sum of Sample Sizes                   44
Number of Groups                      4

Test Result
H Test Statistic                      26.8207
Critical Value                        7.8147
p-Value                               0.0000
Reject the null hypothesis

(a) H0: M1 = M2 = M3 = M4
H1: At least one of the medians differs.
Since the p-value is virtually 0 < 0.05, reject H0. There is enough evidence of a difference in the median daily customer count based on the price of a cup of short coffee.
(b) Even though you can conclude that there is enough evidence of a difference in the median daily customer count based on the price of a cup of short coffee, you cannot determine which price is optimal to maximize the median daily customer count. From Chapter 11, you have found out that the price to maximize mean daily customer count is $1.09.
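The H test statistic follows from the intermediate calculations in the output through H = [12 / (n(n + 1))] Σ(Tj² / nj) - 3(n + 1). The sketch below applies that formula to the reported values; the critical value 7.8147 is taken from the output rather than computed, and the variable names are illustrative.

```python
# Intermediate calculations reported in the PHStat output.
sum_sq_ranks_over_n = 26700.41  # sum over groups of (rank sum squared / group size)
n = 44                          # total sample size
critical_value = 7.8147         # chi-square critical value as reported (3 d.f.)

h_stat = 12 / (n * (n + 1)) * sum_sq_ranks_over_n - 3 * (n + 1)
reject = h_stat > critical_value

print(f"H = {h_stat:.4f}, reject H0: {reject}")
```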
The Choice Is Yours/More Descriptive Choices Follow-up Case
Chapter 2
1. A complete answer should compile descriptive statistics for each type and compare. (This should be a pivot table.) “Blend” funds are a blend of growth and value funds and would be expected to have returns somewhere between the other two. A pivot using the mean returns confirms this:

Row Labels    Mean of 1Yr Return   Mean of 3Yr Return   Mean of 5Yr Return   Mean of 10Yr Return
Growth        -1.99                16.76                16.09                13.75
Value         16.18                12.57                9.46                 10.89
Blend         10.21                14.05                11.43                11.82
Grand Total   7.42                 14.63                12.59                12.27

The only other strong pattern is that growth funds have greater returns than the other categories over the longer periods (3, 5, and 10 years).
2. There are two types of answers possible: (1) What constitutes a “better offering” would depend on information about individual investors that is not available. (2) Blend is a more balanced approach which likely will appeal to conservative investors.
Chapter 3
More Descriptive Choices Follow-up
Redo Example 3.5 for 3-Yr Return by Type
Mean 3Yr Return
                       Risk Level
Type / Market Cap      Low    Average     High   Grand Total
Growth               17.66      16.91    15.64         16.76
  Small              14.42      15.16    14.11         14.47
  Mid-Cap            15.58      14.85    15.14         15.12
  Large              19.05      18.48    17.55         18.48
Value                12.69      12.90    12.03         12.57
  Small              11.33      12.15    11.09         11.44
  Mid-Cap            11.42      13.40    11.92         12.31
  Large              13.34      12.91    12.60         12.97
Blend                13.99      14.47    13.42         14.05
  Small               9.58      10.66    10.74         10.44
  Mid-Cap            12.06      12.90    12.11         12.53
  Large              16.35      16.59    16.53         16.51
Grand Total          15.01      14.92    13.82         14.63
The results are very similar to the results from Example 3.5 for the growth and value funds. Overall, the three-year return for the blend fund is between those of the value and growth funds. However, there are exceptions. For small-cap funds at all risk levels, and for mid-cap funds at the average risk level, the blend fund has a lower mean than both the value and growth funds. For mid-cap funds at the low and high risk levels, the blend fund is between the value and growth funds. For large-cap funds at all risk levels, the blend fund is between the value and growth funds.
Redo Example 3.10 for 3-Yr Return by Type: Descriptive Summary
Value
Blend
Growth
Mean
12.57
14.05
16.76
Median
12.46
14.40
16.79
Mode
13.39
17.96
16.37
Minimum
4.00
-11.54
-5.66
Maximum
24.78
24.10
49.84
Range
20.78
35.64
55.5
Variance
8.2900
18.1555
25.4926
Standard Deviation
2.8792
4.2609
5.0490
Coeff. of Variation
22.90%
30.32%
30.13%
Skewness
0.4545
-1.6355
0.9852
Kurtosis
1.1577
8.1970
10.7005
Count
326
368
413
Standard Error
0.1595
0.2221
0.2484
The blend fund has a median of 14.40%, which is between the value fund median of 12.46% and the growth fund median of 16.79%. The blend fund's standard deviation is also between those of the value and growth funds: 4.2609% for blend compared with 2.8792% and 5.0490% for value and growth, respectively. While the growth and value funds each show right (positive) skewness, the blend fund shows left (negative) skewness. The value, blend, and growth funds all show positive kurtosis, with the blend fund between the value and growth funds.
Redo Example 3.14 for 3-Yr Return by Type: Five-Number Summary and Boxplot
Five-Number Summary
                    Value     Blend    Growth
Minimum                 4    -11.54     -5.66
First Quartile      10.65     11.48    14.085
Median             12.455      14.4     16.79
Third Quartile      13.91     17.17    19.195
Maximum             24.78      24.1     49.84
From the five-number summary, the blend fund is not between the value and growth funds for either the minimum or the maximum. The blend fund is between the value and growth funds for the first quartile (10.65%, 11.48%, 14.085%), the median (12.455%, 14.4%, 16.79%), and the third quartile (13.91%, 17.17%, 19.195%).
The boxplots for the blend, growth, and value funds confirm the results here and in the redo of Example 3.10.
Chapter 3
The Choice Is Yours Follow-up
1.
Descriptive Summary for 1-Yr Return Percentage
Value
Blend
Growth
Mean
16.18
10.21
-1.99
Median
16.29
11.38
0.14
Mode
17.35
10.45
10.76
Minimum
-11.26
-23.80
-47.37
Maximum
31.16
40.49
17.41
Range
42.42
64.29
64.78
Variance
20.7292
55.7301
153.4057
Standard Deviation
4.5529
7.4653
12.3857
Coeff. of Variation
28.15%
73.11%
-623.43%
Skewness
-0.5865
-0.4418
-0.8948
Kurtosis
4.2388
3.5711
0.8148
Count
326
368
413
Standard Error
0.2522
0.3892
0.6095
First Quartile
13.76
5.86
-9.915
Third Quartile
18.71
15.25
7.7
Relative to the mean, the growth fund has much more variation than the blend or value funds. In absolute value, the CV for growth is 623.43% (reported as -623.43% because the mean return is negative), while the CV for blend and value is 73.11% and 28.15%, respectively.
Descriptive Summary for 5-Yr Return Percentage
Value
Blend
Growth
Mean
9.46
11.43
16.09
Median
9.41
11.61
16.09
Mode
8
9.47
15.1
Minimum
3.15
-4.17
0.42
Maximum
16.04
18.87
36.54
Range
12.89
23.04
36.12
Variance
4.8062
10.8459
16.0104
Standard Deviation
2.1923
3.2933
4.0013
Coeff. of Variation
23.18%
28.81%
24.86%
Skewness
0.2396
-0.8801
0.4714
Kurtosis
0.2366
2.1827
4.0251
Count
326
368
413
Standard Error
0.1214
0.1717
0.1969
First Quartile
7.9
9.16
13.615
Third Quartile
10.85
14.16
18.18
Relative to the mean, the blend fund has somewhat more variation than the growth or value funds. The CV for blend is 28.81%, while the CV for value and growth is 23.18% and 24.86%, respectively.
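The coefficient-of-variation comparison can be checked with a short sketch. It recomputes CV = (S / mean) × 100% from the rounded means and standard deviations in the 5-year summary table, so the results may differ from the table in the second decimal place. The function and variable names are illustrative only and do not come from the text.

```python
def coefficient_of_variation(std_dev, mean):
    """Return the coefficient of variation as a percentage: (S / mean) * 100."""
    return std_dev / mean * 100.0

# (standard deviation, mean) pairs copied from the 5-year summary table
five_year = {
    "Value": (2.1923, 9.46),
    "Blend": (3.2933, 11.43),
    "Growth": (4.0013, 16.09),
}

cv = {fund: coefficient_of_variation(s, m) for fund, (s, m) in five_year.items()}

# Compare magnitudes, since a negative mean produces a negative CV
most_variable = max(cv, key=lambda fund: abs(cv[fund]))

for fund, value in cv.items():
    print(f"{fund}: CV = {value:.2f}%")
print("Largest relative variation:", most_variable)
```

Comparing absolute values matters for the 1-year returns, where the growth mean is negative and the CV is therefore reported with a negative sign.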
Descriptive Summary for 10-Yr Return Percentage
Value
Blend
Growth
Mean
10.89
11.82
13.75
Median
10.87
12.07
13.80
Mode
11.16
11.21
14.11
Minimum
5.51
3.87
5.02
Maximum
16.86
15.59
24.83
Range
11.35
11.72
19.81
Variance
2.1725
4.3122
5.0269
Standard Deviation
1.4739
2.0766
2.2421
Coeff. of Variation
13.53%
17.56%
16.30%
Skewness
0.1148
-0.8496
0.2490
Kurtosis
1.9090
0.8581
3.0728
Count
326
368
413
Standard Error
0.0816
0.1082
0.1103
First Quartile
10.07
10.74
12.315
Third Quartile
11.7
13.46
14.95
Relative to the mean, the blend fund has slightly more variation than the growth or value funds. The CV for blend is 17.56%, while the CV for value and growth is 13.53% and 16.30%, respectively.
4.
For Low risk funds only.
1YR Low risk
Value
Blend
Growth
Mean
16.41
11.41
0.17
Median
16.73
11.67
3.06
Mode
18.45
13.84
9.53
Minimum
2.38
-9.42
-46.79
Maximum
31.16
40.49
17.41
Range
28.78
49.91
64.2
Variance
22.1795
60.1060
156.8410
Standard Deviation
4.7095
7.7528
12.5236
Coeff. of Variation
28.69%
67.96%
7496.53%
Skewness
-0.0084
0.6161
-1.1566
Kurtosis
1.1931
3.3840
1.4601
Count
93
96
119
Standard Error
0.4884
0.7913
1.1480
For low-risk funds only, for the one-year fund, relative to the mean, the growth fund has much more variation than the blend or value funds. The CV for growth is 7,496.53%, while the CV for value and blend is 28.69% and 67.96%, respectively.
5YR Low risk
Value
Blend
Growth
Mean
9.66
11.47
16.77
Median
9.47
11.59
16.84
Mode
11.35
9.47
17.01
Minimum
3.30
-3.46
8.16
Maximum
15.03
18.87
36.54
Range
11.73
22.33
28.38
Variance
4.6819
12.6689
18.0637
Standard Deviation
2.1638
3.5593
4.2501
Coeff. of Variation
22.40%
31.04%
25.35%
Skewness
0.1174
-1.4485
1.1005
Kurtosis
0.4531
4.6346
3.9165
Count
93
96
119
Standard Error
0.2244
0.3633
0.3896
For low-risk funds only, for the five-year return, relative to the mean, the blend fund has more variation than the growth or value funds. The CV for blend is 31.04%, while the CV for value and growth is 22.40% and 25.35%, respectively.
10YR Low risk
Value
Blend
Growth
Mean
10.93
11.89
14.22
Median
10.94
12.34
14.13
Mode
11.5
10.26
11.72
Minimum
5.51
4.65
8.24
Maximum
15.65
15.19
24.83
Range
10.14
10.54
16.59
Variance
2.2753
4.1087
5.9623
Standard Deviation
1.5084
2.0270
2.4418
Coeff. of Variation
13.81%
17.06%
17.17%
Skewness
-0.1621
-1.3353
0.5057
Kurtosis
2.1607
2.6052
2.3954
Count
93
96
119
Standard Error
0.1564
0.2069
0.2238
For low-risk funds only, for the ten-year return, relative to the mean, the growth fund has marginally more variation than the blend fund, and both have more variation than the value fund. The CV for growth is 17.17%, while the CV for value and blend is 13.81% and 17.06%, respectively.
Chapter 4
1.
Market cap by Type: Count of Market Cap
Type
Market Cap
Growth
Value
Grand Total
Small
92
59
151
Mid-Cap
102
59
161
Large
219
208
427
Grand Total
413
326
739
Market cap by Risk: Count of Market Cap
Risk Level
Market Cap
Low
Average High Grand Total
Small
29
43
79
151
Mid-Cap
45
71
45
161
Large
138
192
97
427
Grand Total
212
306
221
739
Market cap by Rating: Count of Market Cap
Star Rating
Market Cap
One
Two Three Four Five Grand Total
Small
7
32
69
27
16
151
Mid-Cap
12
43
56
41
9
161
Large
23
112
166
86
40
427
Grand Total
42
187
291
154
65
739
Type by Risk:
Count of Type
Risk Level
Type
Low
Average High
Grand Total
Growth
119
173
121
413
Value
93
133
100
326
Grand Total
212
306
221
739
Type by Rating: Count of Type
Star Rating
Type
One
Two
Three
Four Five Grand Total
Growth
23
103
167
87
33
413
Value
19
84
124
67
32
326
Grand Total
42
187
291
154
65
739
Risk by Rating: Count of Risk Level
                        Star Rating
Risk Level      One    Two   Three   Four   Five   Grand Total
Low              11     46      92     43     20           212
Average          16     69     126     67     28           306
High             15     72      73     44     17           221
Grand Total      42    187     291    154     65           739
2.
Market cap by Type: Marginal probabilities table: Joint and Marginal Probabilities P(Market Cap, Type)
                      Type
Market Cap       Growth     Value   P(Market Cap)
Small            0.1245    0.0798          0.2043
Mid-Cap          0.1380    0.0798          0.2179
Large            0.2963    0.2815          0.5778
P(Type)          0.5589    0.4411          1.0000
Conditional probabilities: Conditional Probabilities P(Type|Market Cap)
Type
Market Cap
Growth
Value
Small
0.6093
0.3907
Mid-Cap
0.6335
0.3665
Large
0.5129
0.4871
Conditional Probabilities P(Market Cap|Type)
Type
Market Cap
Growth
Value
Small
0.2228
0.1810
Mid-Cap
0.2470
0.1810
Large
0.5303
0.6380
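The joint, marginal, and conditional probabilities for Market Cap by Type follow directly from the counts in problem 1. The sketch below is one way to reproduce them; the dictionary layout and variable names are illustrative choices, not part of the original solution.

```python
# Counts of funds by market cap and type, copied from the problem 1 table
counts = {
    "Small": {"Growth": 92, "Value": 59},
    "Mid-Cap": {"Growth": 102, "Value": 59},
    "Large": {"Growth": 219, "Value": 208},
}
types = ["Growth", "Value"]

total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: P(Market Cap and Type) = count / grand total
joint = {cap: {t: row[t] / total for t in types} for cap, row in counts.items()}

# Marginal probabilities: row totals and column totals over the grand total
p_cap = {cap: sum(row.values()) / total for cap, row in counts.items()}
p_type = {t: sum(counts[cap][t] for cap in counts) / total for t in types}

# Conditional probabilities: joint divided by the marginal of the given event
type_given_cap = {cap: {t: joint[cap][t] / p_cap[cap] for t in types} for cap in counts}
cap_given_type = {cap: {t: joint[cap][t] / p_type[t] for t in types} for cap in counts}

print(round(joint["Small"]["Growth"], 4))
print(round(type_given_cap["Small"]["Growth"], 4))
print(round(cap_given_type["Large"]["Value"], 4))
```

The same pattern applies to the other five pairings (Market Cap by Risk, Market Cap by Rating, and so on) by swapping in the corresponding count table.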
Market cap by Risk: Marginal probabilities table: Joint and Marginal Probabilities P(Market Cap, Risk)
                         Risk Level
Market Cap          Low   Average     High   P(Market Cap)
Small            0.0392    0.0582   0.1069          0.2043
Mid-Cap          0.0609    0.0961   0.0609          0.2179
Large            0.1867    0.2598   0.1313          0.5778
P(Risk)          0.2869    0.4141   0.2991          1.0000
Conditional probabilities: Conditional Probabilities P(Risk|Market Cap)
Risk Level
Market Cap
Low
Average
High
Small
0.1921
0.2848
0.5232
Mid-Cap
0.2795
0.4410
0.2795
Large
0.3232
0.4496
0.2272
Conditional probabilities: Conditional Probabilities P(Market Cap|Risk)
Risk Level
Market Cap
Low
Average
High
Small
0.1368
0.1405
0.3575
Mid-Cap
0.2123
0.2320
0.2036
Large
0.6509
0.6275
0.4389
Market cap by Rating: Marginal probabilities table: Joint and Marginal Probabilities P(Market Cap, Rating)
                              Rating
Market Cap          One      Two    Three     Four     Five   P(Market Cap)
Small            0.0095   0.0433   0.0934   0.0365   0.0217          0.2043
Mid-Cap          0.0162   0.0582   0.0758   0.0555   0.0122          0.2179
Large            0.0311   0.1516   0.2246   0.1164   0.0541          0.5778
P(Rating)        0.0568   0.2530   0.3938   0.2084   0.0880          1.0000
Conditional probabilities: Conditional Probabilities P(Rating|Market Cap)
                              Rating
Market Cap          One      Two    Three     Four     Five
Small            0.0464   0.2119   0.4570   0.1788   0.1060
Mid-Cap          0.0745   0.2671   0.3478   0.2547   0.0559
Large            0.0539   0.2623   0.3888   0.2014   0.0937
Conditional probabilities: Conditional Probabilities P(Market Cap|Rating)
                              Rating
Market Cap          One      Two    Three     Four     Five
Small            0.1667   0.1711   0.2371   0.1753   0.2462
Mid-Cap          0.2857   0.2299   0.1924   0.2662   0.1385
Large            0.5476   0.5989   0.5704   0.5584   0.6154
Type by Risk: Marginal probabilities table: Joint and Marginal Probabilities P(Type, Risk)
                    Risk Level
Type             Low   Average     High   P(Type)
Growth        0.1610    0.2341   0.1637    0.5589
Value         0.1258    0.1800   0.1353    0.4411
P(Risk)       0.2869    0.4141   0.2991    1.0000
Conditional probabilities: Conditional Probabilities P(Risk|Type)
Risk Level
Type
Low
Average
High
Growth
0.2881
0.4189
0.2930
Value
0.2853
0.4080
0.3067
Conditional probabilities: Conditional Probabilities P(Type|Risk)
Risk Level
Type
Low
Average
High
Growth
0.5613
0.5654
0.5475
Value
0.4387
0.4346
0.4525
Type by Rating: Marginal probabilities table: Joint and Marginal Probabilities P(Type, Rating)
                           Rating
Type             One      Two    Three     Four     Five   P(Type)
Growth        0.0311   0.1394   0.2260   0.1177   0.0447    0.5589
Value         0.0257   0.1137   0.1678   0.0907   0.0433    0.4411
P(Rating)     0.0568   0.2530   0.3938   0.2084   0.0880    1.0000
Conditional probabilities: Conditional Probabilities P(Rating|Type)
                           Rating
Type             One      Two    Three     Four     Five
Growth        0.0557   0.2494   0.4044   0.2107   0.0799
Value         0.0583   0.2577   0.3804   0.2055   0.0982
Conditional probabilities: Conditional Probabilities P(Type|Rating)
                           Rating
Type             One      Two    Three     Four     Five
Growth        0.5476   0.5508   0.5739   0.5649   0.5077
Value         0.4524   0.4492   0.4261   0.4351   0.4923
Risk by Rating: Marginal probabilities table: Joint and Marginal Probabilities P(Risk, Rating)
                           Rating
Risk Level       One      Two    Three     Four     Five   P(Risk)
Low           0.0149   0.0622   0.1245   0.0582   0.0271    0.2869
Average       0.0217   0.0934   0.1705   0.0907   0.0379    0.4141
High          0.0203   0.0974   0.0988   0.0595   0.0230    0.2991
P(Rating)     0.0568   0.2530   0.3938   0.2084   0.0880    1.0000
Conditional probabilities: Conditional Probabilities P(Rating|Risk)
                           Rating
Risk Level       One      Two    Three     Four     Five
Low           0.0519   0.2170   0.4340   0.2028   0.0943
Average       0.0523   0.2255   0.4118   0.2190   0.0915
High          0.0679   0.3258   0.3303   0.1991   0.0769
Conditional probabilities conditioned on Rating: Conditional Probabilities P(Risk|Rating)
                           Rating
Risk Level       One      Two    Three     Four     Five
Low           0.2619   0.2460   0.3162   0.2792   0.3077
Average       0.3810   0.3690   0.4330   0.4351   0.4308
High          0.3571   0.3850   0.2509   0.2857   0.2615
Chapter 6
3 Yr, 5 Yr, 10 Yr
According to the boxplots and normal probability plots, the 3-year and 10-year return % are quite normally distributed while the 5-year return % is slightly right-skewed.
3-Yr Return% by Type
According to the boxplots and normal probability plots, the 3-year return % for both growth and value funds is slightly right-skewed.
3-year Return % by Market cap:
According to the boxplots and normal probability plots, the 3-year return % for small-cap funds is right-skewed, while that for the mid-cap and large-cap funds is approximately normal.
5-Yr Return% by Type
According to the boxplots and normal probability plots, the 5-year return % for growth funds is right-skewed, while that for value funds is approximately normal.
5-year Return % by Market cap:
According to the boxplots and normal probability plots, the 5-year return % for small-cap, mid-cap, and large-cap funds is right-skewed.
10-Yr Return% by Type
According to the boxplots and normal probability plots, the 10-year return % for growth funds is left-skewed, while that of the value funds is roughly normally distributed.
10-year Return % by Market cap:
According to the boxplots and normal probability plots, the 10-year return % for small-cap is quite normally distributed but left-skewed for the midcap and large-cap funds.
Chapter 8
95% confidence interval

3-year return %
Growth: 16.27 to 17.25
Value: 12.26 to 12.88
Since the 95% confidence intervals do not overlap, the mean 3-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

5-year return %
Growth: 15.71 to 16.48
Value: 9.22 to 9.70
Since the 95% confidence intervals do not overlap, the mean 5-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

10-year return %
Growth: 13.54 to 13.97
Value: 10.73 to 11.05
Since the 95% confidence intervals do not overlap, the mean 10-year return % of the growth funds is significantly different from that of the value funds at the 95% level of confidence.

3-year return %
Small: 12.58 to 13.99
Mid-Cap: 13.52 to 14.66
Large: 15.32 to 16.27
Since the 95% confidence interval for the large-cap funds lies entirely above the upper limits of the intervals for the mid-cap and small-cap funds, the mean 3-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for the mid-cap and small-cap funds overlap, the mean 3-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.

5-year return %
Small: 11.22 to 12.66
Mid-Cap: 11.84 to 13.12
Large: 13.40 to 14.32
Since the 95% confidence interval for the large-cap funds lies entirely above the upper limits of the intervals for the mid-cap and small-cap funds, the mean 5-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for the mid-cap and small-cap funds overlap, the mean 5-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.

10-year return %
Small: 11.23 to 11.95
Mid-Cap: 11.74 to 12.26
Large: 12.75 to 13.24
Since the 95% confidence interval for the large-cap funds lies entirely above the upper limits of the intervals for the mid-cap and small-cap funds, the mean 10-year return % of the large-cap funds is significantly higher than that of the mid-cap and small-cap funds at the 95% level of confidence. Since the 95% confidence intervals for the mid-cap and small-cap funds overlap, the mean 10-year return % of the mid-cap and small-cap funds are not significantly different from each other at the 95% level of confidence.
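As a rough check on these intervals, the sketch below rebuilds the 3-year growth and value intervals from the Chapter 3 summary statistics (mean, standard deviation, count). It assumes a critical value of 1.96 from the normal approximation; the original solution presumably used the t distribution, which gives nearly the same value for samples this large. The function names are illustrative only.

```python
import math

def mean_confidence_interval(mean, std_dev, n, critical_value=1.96):
    """Return (lower, upper) for mean +/- critical_value * S / sqrt(n).

    critical_value=1.96 is an assumed large-sample approximation.
    """
    std_error = std_dev / math.sqrt(n)
    margin = critical_value * std_error
    return mean - margin, mean + margin

def intervals_overlap(a, b):
    """True when two (lower, upper) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]

# 3-year return %: mean, standard deviation, and count from the Chapter 3 summary
growth = mean_confidence_interval(16.76, 5.0490, 413)
value = mean_confidence_interval(12.57, 2.8792, 326)

print("Growth:", [round(x, 2) for x in growth])
print("Value:", [round(x, 2) for x in value])
print("Overlap:", intervals_overlap(growth, value))
```

Checking for overlap is the informal comparison used in this solution; a formal comparison of two means is carried out with the two-sample tests in Chapter 10.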
Chapter 10
Year-to-date Return %: Population 1 = growth (413), 2 = value (326)
H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
PHStat output: F Test for Differences in Two Variances
Data Level of Significance
0.05
Larger-Variance Sample Sample Size
413
Sample Variance
21.8834059
Smaller-Variance Sample Sample Size
326
Sample Variance
9.904962372
Intermediate Calculations F Test Statistic
2.2093
Population 1 Sample Degrees of Freedom
412
Population 2 Sample Degrees of Freedom
325
Two-Tail Test Upper Critical Value
1.2303
p-Value
0.0000
Reject the null hypothesis
Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: FSTAT = S1²/S2² = 2.2093
Decision: Since FSTAT = 2.2093 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
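The F test statistic is simply the ratio of the larger sample variance to the smaller one. A minimal sketch, using the two sample variances and the upper critical value quoted in the PHStat output (the critical value is copied from the output, not computed here, and the names are illustrative):

```python
def f_statistic(var_a, var_b):
    """Ratio of the larger sample variance to the smaller sample variance."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller

# Upper critical value as reported in the PHStat output (df = 412, 325; alpha = 0.05, two-tail)
F_UPPER_CRITICAL = 1.2303

# Year-to-date sample variances for growth and value funds
f_stat = f_statistic(21.8834059, 9.904962372)
reject_h0 = f_stat > F_UPPER_CRITICAL

print(f"FSTAT = {f_stat:.4f}, reject H0: {reject_h0}")
```

Substituting the 5-year or 10-year sample variances reproduces the other two F statistics in this chapter.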
Year-to-date Return %: Population 1 = growth (413), 2 = value (326)
H0: μ1 = μ2
H1: μ1 ≠ μ2
PHStat output: Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
413
Sample Mean
-11.98745763
Sample Standard Deviation
4.6780
Population 2 Sample Sample Size
326
Sample Mean
-0.48309816
Sample Standard Deviation
3.1472
Intermediate Calculations Numerator of Degrees of Freedom
0.0070
Denominator of Degrees of Freedom
0.0000
Total Degrees of Freedom
719.8936
Degrees of Freedom
719
Standard Error
0.2887
Difference in Sample Means
-11.5044
Separate-Variance t Test Statistic
-39.8436
Two-Tail Test Lower Critical Value
-1.9633
Upper Critical Value
1.9633
p-Value
0.0000
Reject the null hypothesis
Decision: Since tSTAT = –39.8436 < –1.9633 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean year-to-date return percentage is different between growth and value funds.
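The intermediate calculations in the PHStat output (standard error, test statistic, and degrees of freedom) can be reproduced from the sample summary statistics. The sketch below is one way to do so; `welch_t` is a hypothetical helper name, and the inputs are the rounded values shown in the output, so the last decimal place may differ slightly.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance t statistic, approximate df, and standard error."""
    v1 = sd1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = sd2 ** 2 / n2  # estimated variance of the second sample mean
    std_error = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / std_error
    # Degrees of freedom: numerator (v1 + v2)^2 over the denominator below;
    # PHStat reports the integer part of this value
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df, std_error

# Year-to-date return %: growth (population 1) versus value (population 2)
t_stat, df, std_error = welch_t(-11.98745763, 4.6780, 413,
                                -0.48309816, 3.1472, 326)

print(f"standard error = {std_error:.4f}")
print(f"tSTAT = {t_stat:.4f}")
print(f"df = {df:.4f} (use {math.floor(df)})")
```

The critical values (±1.9633) and the p-value come from the t distribution with the truncated degrees of freedom and are taken from the output rather than computed here.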
5-year Return %: Population 1 = growth (413), 2 = value (326)
H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
PHStat output: F Test for Differences in Two Variances
Data Level of Significance
0.05
Larger-Variance Sample Sample Size
413
Sample Variance
16.0104397
Smaller-Variance Sample Sample Size
326
Sample Variance
4.806191672
Intermediate Calculations F Test Statistic
3.3312
Population 1 Sample Degrees of Freedom
412
Population 2 Sample Degrees of Freedom
325
Two-Tail Test Upper Critical Value
1.2303
p-Value
0.0000
Reject the null hypothesis
Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: FSTAT = S1²/S2² = 3.3312
Decision: Since FSTAT = 3.3312 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
5-year Return %: Population 1 = growth (413), 2 = value (326)
H0: μ1 = μ2
H1: μ1 ≠ μ2
PHStat output: Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
413
Sample Mean
16.09435835
Sample Standard Deviation
4.0013
Population 2 Sample Sample Size
326
Sample Mean
9.45696319
Sample Standard Deviation
2.1923
Intermediate Calculations Numerator of Degrees of Freedom
0.0029
Denominator of Degrees of Freedom
0.0000
Total Degrees of Freedom
663.3369
Degrees of Freedom
663
Standard Error
0.2313
Difference in Sample Means
6.6374
Separate-Variance t Test Statistic
28.6935
Two-Tail Test Lower Critical Value
-1.9635
Upper Critical Value
1.9635
p-Value
0.0000
Reject the null hypothesis
Decision: Since tSTAT = 28.6935 > 1.9635 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean five-year return percentage is different between growth and value funds.
10-year Return %: Population 1 = growth (413), 2 = value (326)
H0: σ1² = σ2² The population variances are the same.
H1: σ1² ≠ σ2² The population variances are different.
PHStat output: F Test for Differences in Two Variances
Data Level of Significance
0.05
Larger-Variance Sample Sample Size
413
Sample Variance
5.026899222
Smaller-Variance Sample Sample Size
326
Sample Variance
2.172481672
Intermediate Calculations F Test Statistic
2.3139
Population 1 Sample Degrees of Freedom
412
Population 2 Sample Degrees of Freedom
325
Two-Tail Test Upper Critical Value
1.2303
p-Value
0.0000
Reject the null hypothesis
Decision rule: If FSTAT > 1.2303, reject H0.
Test statistic: FSTAT = S1²/S2² = 2.3139
Decision: Since FSTAT = 2.3139 > 1.2303 and the p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
10-year Return %: Population 1 = growth (413), 2 = value (326)
H0: μ1 = μ2
H1: μ1 ≠ μ2
PHStat output: Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
413
Sample Mean
13.75256659
Sample Standard Deviation
2.2421
Population 2 Sample Sample Size
326
Sample Mean
10.89196319
Sample Standard Deviation
1.4739
Intermediate Calculations Numerator of Degrees of Freedom
0.0004
Denominator of Degrees of Freedom
0.0000
Total Degrees of Freedom
714.9580
Degrees of Freedom
714
Standard Error
0.1372
Difference in Sample Means
2.8606
Separate-Variance t Test Statistic
20.8433
Two-Tail Test Lower Critical Value
-1.9633
Upper Critical Value
1.9633
p-Value
0.0000
Reject the null hypothesis
Decision: Since tSTAT = 20.8433 > 1.9633 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean ten-year return percentage is different between growth and value funds.
Chapter 11
Year to date return percentages, based on the fund market caps
H0: σ1² = σ2² = σ3²
H1: At least one variance is different.
Levene’s test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Large
427 2489.61 5.830468384
13.1594
Mid-Cap
161
969.67 6.022795031
14.6381
Small
151
866.84 5.740662252
15.5679
ANOVA Source of Variation
SS
df
MS
Between Groups
6.7633
2
3.3816
Within Groups
10283.1965
736
13.9717
Total
10289.9598
738
F 0.2420
P-value
F crit
0.7851 3.0080
Level of significance
0.05
Since p-value = 0.7851 > 0.05, do not reject H0. There is not enough evidence that the equal-variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: μ1 = μ2 = μ3
H1: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups      Count        Sum         Average   Variance
Large         427   -2523.04    -5.908758782    47.2274
Mid-Cap       161    -1323.5    -8.220496894    50.4465
Small         151   -1261.77    -8.356092715    47.5818
ANOVA
Source of Variation           SS    df         MS         F   P-value   F crit
Between Groups         1020.3267     2   510.1634   10.6285    0.0000   3.0080
Within Groups         35327.5746   736    47.9994
Total                 36347.9013   738
Level of significance
0.05
Test statistic: FSTAT = 10.6285 Since p-value = 0.0000 < 0.05, and FSTAT = 10.6285 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in year-to-date return percentages across the funds (Small, Mid-cap, Large).
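The F statistic in the ANOVA table is the ratio of the two mean squares, each of which is a sum of squares divided by its degrees of freedom. A brief sketch using the year-to-date values from the table (the helper name is illustrative, and the critical value is copied from the output rather than computed):

```python
def one_way_anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) from sums of squares and degrees of freedom."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Critical value reported in the ANOVA output for df = 2, 736 at the 0.05 level
F_CRITICAL = 3.0080

# Year-to-date return percentages by market cap
ms_between, ms_within, f_stat = one_way_anova_f(1020.3267, 2, 35327.5746, 736)

print(f"MS between = {ms_between:.4f}, MS within = {ms_within:.4f}")
print(f"FSTAT = {f_stat:.4f}, reject H0: {f_stat > F_CRITICAL}")
```

The same function reproduces the Levene test statistic, since that test is an ANOVA carried out on the absolute deviations from each group's median.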
From the Tukey Pairwise Comparison procedure, there is a difference in year-to-date return percentages between the funds of Large and Mid-Cap, and Large and Small. There is no difference between Mid-Cap and Small.
Five-year return percentages, based on the fund market caps
H0: σ1² = σ2² = σ3²
H1: At least one variance is different.
Levene’s test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Large
427
1730.4 4.052459016
7.0773
Mid-Cap
161
534.01 3.316832298
5.8330
Small
151
542.29 3.591324503
7.5656
ANOVA Source of Variation
SS
df
MS
Between Groups
71.3730
2
35.6865
Within Groups
5083.0683
736
6.9063
Total
5154.4414
738
F 5.1672
P-value
F crit
0.0059 3.0080
Level of significance
0.05
Since p-value = 0.0059 < 0.05, reject H0. There is enough evidence to conclude that the variances in five-year return percentages across the funds (Small, Mid-Cap, Large) are different.
H0: μ1 = μ2 = μ3
H1: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Large
427
5918.18
13.85990632
23.3936
Mid-Cap
161
2008.76
12.47677019
16.8730
Small
151
1803
11.94039735
20.0694
ANOVA
Source of Variation
SS
df
MS
Between Groups
508.9014
2
254.4507
Within Groups
15675.7707
736
21.2986
Total
16184.6721
738
F 11.9468
P-value
F crit
0.0000 3.0080
Level of significance
0.05
Test statistic: FSTAT = 11.9468 Since p-value = 0.0000 < 0.05, and FSTAT = 11.9468 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in five-year return percentages across the funds (Small, Mid-cap, Large).
From the Tukey Pairwise Comparison procedure, there is a difference in five-year return percentages between the funds of Large and Mid-Cap, and Large and Small. There is no difference between Mid-Cap and Small.
Ten-year return percentages, based on the fund market caps
H0: σ1² = σ2² = σ3²
H1: At least one variance is different.
Levene’s test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Large
427
882.24 2.066135831
2.3130
Mid-Cap
161
216.92 1.347329193
0.9866
Small
151
264.28 1.750198675
1.9114
ANOVA Source of Variation
SS
df
MS
Between Groups
62.1137
2
31.0569
Within Groups
1429.8890
736
1.9428
Total
1492.0027
738
F 15.9857
P-value
F crit
0.0000 3.0080
Level of significance
0.05
Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the variances in ten-year return percentages across the funds (Small, Mid-Cap, Large) are different.
H0: μ1 = μ2 = μ3
H1: At least one mean is different.
Decision rule: df: 2, 736. If F > 3.0080, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Large
427
5549.14
12.99564403
6.5676
Mid-Cap
161
1931.98
11.99987578
2.8116
Small
151
1749.47
11.58589404
4.9938
ANOVA
Source of Variation
SS
df
MS
Between Groups
271.2775
2
135.6388
Within Groups
3996.7271
736
5.4303
Total
4268.0047
738
F 24.9780
P-value
F crit
0.0000 3.0080
Level of significance
0.05
Test statistic: FSTAT = 24.9780 Since p-value = 0.0000 < 0.05, and FSTAT = 24.9780 > 3.0080, reject H0. There is enough evidence to conclude that there is a significant difference in ten-year return percentages across the funds (Small, Mid-cap, Large).
From the Tukey Pairwise Comparison procedure, there is a difference in ten-year return percentages between the funds of Large and Mid-Cap, and Large and Small.
Chapter 12
1.
Year-to-date Return%: Population 1 = Blend, 2 = Growth, 3 = Value
H0: M1 = M2 = M3
H1: Not all Mj are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance
0.05
Intermediate Calculations Sum of Squared Ranks/Sample Size
4.19E+08
Sum of Sample Sizes
1107
Number of Groups
3
Group     Sample Size   Sum of Ranks   Mean Ranks
Blend             368       222092.5   603.512228
Growth            413        99223.5   240.250605
Value             326         291962   895.588957
Test Result H Test Statistic
778.7265
Critical Value
5.9915
p-Value
0.0000
Reject the null hypothesis
Because H = 778.7265 > 5.9915 and the p-value = 0.0000 < 0.05, reject H0. At the 0.05 significance level, there is evidence of a difference in median year-to-date returns among the fund types (blend, growth, value).
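The H test statistic can be rebuilt from the rank sums and sample sizes in the output using H = [12 / (n(n + 1))] Σ(Tj² / nj) − 3(n + 1). The sketch below applies that formula without any correction for tied ranks, which is an assumption about how the output was produced, although the result agrees with the reported value. The function name is illustrative.

```python
def kruskal_wallis_h(rank_sums, sample_sizes):
    """Kruskal-Wallis H from group rank sums and sizes (no tie correction applied)."""
    n = sum(sample_sizes)
    # "Sum of Squared Ranks/Sample Size" in the PHStat output
    weighted = sum(t * t / size for t, size in zip(rank_sums, sample_sizes))
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)

# Critical value reported in the output (chi-square, 2 df, alpha = 0.05)
CHI_SQUARE_CRITICAL = 5.9915

# Year-to-date return %: Blend, Growth, Value
rank_sums = [222092.5, 99223.5, 291962.0]
sample_sizes = [368, 413, 326]

h_stat = kruskal_wallis_h(rank_sums, sample_sizes)
print(f"H = {h_stat:.4f}, reject H0: {h_stat > CHI_SQUARE_CRITICAL}")
```

A useful sanity check before computing H is that the rank sums add up to n(n + 1)/2, which is 613,278 for n = 1,107.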
1. cont. Five-year Return %: Population 1 = Blend, 2 = Growth, 3 = Value
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance               0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size    3.93E+08
Sum of Sample Sizes                 1107
Number of Groups                    3
Group    Sample Size   Sum of Ranks   Mean Ranks
Blend    368           176964.5       480.881793
Growth   413           339124         821.123487
Value    326           97189.5        298.127301
Test Result
H Test Statistic                    516.3777
Critical Value                      5.9915
p-Value                             0.0000
Reject the null hypothesis
Because H = 516.3777 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median five-year returns among the fund types (blend, growth, value).
Ten-year Return %: Population 1 = Blend, 2 = Growth, 3 = Value
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance               0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size    3.75E+08
Sum of Sample Sizes                 1107
Number of Groups                    3
Group    Sample Size   Sum of Ranks   Mean Ranks
Blend    368           188260.5       511.577446
Growth   413           316103         765.382567
Value    326           108914.5       334.093558
Test Result
H Test Statistic                    341.2596
Critical Value                      5.9915
p-Value                             0.0000
Reject the null hypothesis
Because H = 341.2596 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median ten-year returns among the fund types (blend, growth, value).
2.
Year-to-date Return %: Population 1 = Large, 2 = Mid-Cap, 3 = Small
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance               0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size    3.42E+08
Sum of Sample Sizes                 1107
Number of Groups                    3
Group     Sample Size   Sum of Ranks   Mean Ranks
Large     621           366578         590.302738
Mid-Cap   234           122198.5       522.215812
Small     252           124501.5       494.053571
Test Result
H Test Statistic                    19.1794
Critical Value                      5.9915
p-Value                             0.0001
Reject the null hypothesis
Because H = 19.1794 > 5.9915 or p-value = 0.0001, reject H0. At the 0.05 significance level, there is evidence of a difference in median year-to-date returns among the market caps (small, mid-cap, large).
2. cont. Five-year Return %: Population 1 = Large, 2 = Mid-Cap, 3 = Small
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance               0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size    3.52E+08
Sum of Sample Sizes                 1107
Number of Groups                    3
Group     Sample Size   Sum of Ranks   Mean Ranks
Large     621           400166         644.389694
Mid-Cap   234           113809.5       486.365385
Small     252           99302.5        394.05754
Test Result
H Test Statistic                    123.1813
Critical Value                      5.9915
p-Value                             0.0000
Reject the null hypothesis
Because H = 123.1813 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median five-year returns among the market caps (small, mid-cap, large).
2. cont. Ten-year Return %: Population 1 = Large, 2 = Mid-Cap, 3 = Small
$H_0: M_1 = M_2 = M_3$
$H_1$: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance               0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size    3.54E+08
Sum of Sample Sizes                 1107
Number of Groups                    3
Group     Sample Size   Sum of Ranks   Mean Ranks
Large     621           404495.5       651.361514
Mid-Cap   234           109863         469.5
Small     252           98919.5        392.537698
Test Result
H Test Statistic                    138.2124
Critical Value                      5.9915
p-Value                             0.0000
Reject the null hypothesis
Because H = 138.2124 > 5.9915 or p-value = 0.0000, reject H0. At the 0.05 significance level, there is evidence of a difference in median ten-year returns among the market caps (small, mid-cap, large).
3.
Risk based on market cap:
H0: There is no relationship between risk and market cap
H1: There is a relationship between risk and market cap
PHStat output of the chi-square test:
Chi-Square Test
Observed Frequencies
             Risk Level
Market Cap   Low   Average   High   Total
Mid-Cap      60    111       63     234
Small        53    79        120    252
Large        195   286       140    621
Total        308   476       323    1107
Expected Frequencies
             Risk Level
Market Cap   Low        Average    High       Total
Mid-Cap      65.10569   100.6179   68.27642   234
Small        70.11382   108.3577   73.52846   252
Large        172.7805   267.0244   181.1951   621
Total        308        476        323        1107
Data
Level of Significance       0.05
Number of Rows              3
Number of Columns           3
Degrees of Freedom          4
Results
Critical Value              9.487729
Chi-Square Test Statistic   56.95336
p-Value                     1.27E-11
Reject the null hypothesis
Expected frequency assumption is met.
Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence that risk is related to market cap and, hence, that risk differs by market cap.
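The expected frequencies and the chi-square statistic above follow directly from the observed counts. The sketch below is one way to reproduce them; the function names are illustrative assumptions and this is not the PHStat implementation. The p-value helper uses a closed form that holds only for an even number of degrees of freedom, which is the case here (df = 4).

```python
import math

# Observed counts (rows: Mid-Cap, Small, Large; columns: Low, Average, High).
observed = [
    [60, 111, 63],
    [53, 79, 120],
    [195, 286, 140],
]


def chi_square_independence(table):
    """Return (expected frequencies, chi-square statistic, degrees of freedom)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for a cell = row total * column total / n.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


def chi_square_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable; this closed form needs an even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires a positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


expected, chi2_stat, df = chi_square_independence(observed)
p_value = chi_square_upper_tail_even_df(chi2_stat, df)
print(f"chi-square={chi2_stat:.5f}  df={df}  p-value={p_value:.3g}")
```

Every expected frequency is well above 1, which is consistent with the statement that the expected frequency assumption is met.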
3. cont. Rating based on market cap:
H0: There is no relationship between rating and market cap
H1: There is a relationship between rating and market cap
PHStat output of the chi-square test:
Chi-Square Test
Observed Frequencies
             Star Rating
Market Cap   One   Two   Three   Four   Five   Total
Mid-Cap      18    67    80      53     16     234
Small        14    65    101     52     20     252
Large        41    147   243     135    55     621
Total        73    279   424     240    91     1107
Expected Frequencies
             Star Rating
Market Cap   One        Two        Three      Four       Five       Total
Mid-Cap      15.43089   58.97561   89.62602   50.73171   19.23577   234
Small        16.61789   63.5122    96.52033   54.63415   20.71545   252
Large        40.95122   156.5122   237.8537   134.6341   51.04878   621
Total        73         279        424        240        91         1107
Data
Level of Significance       0.05
Number of Rows              3
Number of Columns           5
Degrees of Freedom          8
Results
Critical Value              15.50731
Chi-Square Test Statistic   5.002362
p-Value                     0.757324
Do not reject the null hypothesis
Expected frequency assumption is met.
Since p-value = 0.7573 > 0.05, do not reject H0. There is not enough evidence that rating is related to market cap and, hence, not enough evidence of a difference in rating based on market cap.
3. cont. Risk based on type:
H0: There is no relationship between risk and type
H1: There is a relationship between risk and type
PHStat output of the chi-square test:
Chi-Square Test
Observed Frequencies
         Risk Level
Type     Low   Average   High   Total
Growth   119   173       121    413
Value    93    133       100    326
Blend    96    170       102    368
Total    308   476       323    1107
Expected Frequencies
         Risk Level
Type     Low        Average    High       Total
Growth   114.9088   177.5863   120.505    413
Value    90.7028    140.1771   95.12014   326
Blend    102.3884   158.2367   107.3749   368
Total    308        476        323        1107
Data
Level of Significance       0.05
Number of Rows              3
Number of Columns           3
Degrees of Freedom          4
Results
Critical Value              9.487729
Chi-Square Test Statistic   2.484273
p-Value                     0.647454
Do not reject the null hypothesis
Expected frequency assumption is met.
Since p-value = 0.6475 > 0.05, do not reject H0. There is not enough evidence that risk is related to type and, hence, not enough evidence of a difference in risk based on type.
3. cont. Rating based on type:
H0: There is no relationship between rating and type
H1: There is a relationship between rating and type
PHStat output of the chi-square test:
Chi-Square Test
Observed Frequencies
         Star Rating
Type     One   Two   Three   Four   Five   Total
Growth   23    103   167     87     33     413
Value    19    84    124     67     32     326
Blend    31    92    133     86     26     368
Total    73    279   424     240    91     1107
Expected Frequencies
         Star Rating
Type     One        Two        Three      Four       Five       Total
Growth   27.23487   104.0894   158.1861   89.5393    33.95032   413
Value    21.49774   82.1626    124.8636   70.67751   26.79855   326
Blend    24.26739   92.74797   140.9503   79.7832    30.25113   368
Total    73         279        424        240        91         1107
Data
Level of Significance       0.05
Number of Rows              3
Number of Columns           5
Degrees of Freedom          8
Results
Critical Value              15.50731
Chi-Square Test Statistic   6.201951
p-Value                     0.624622
Do not reject the null hypothesis
Expected frequency assumption is met.
Since p-value = 0.6246 > 0.05, do not reject H0. There is not enough evidence that rating is related to type and, hence, not enough evidence of a difference in rating based on type.
4.
Refer to the conclusions of the various hypothesis tests in parts (1) to (3).
Chapter 15
3-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)
                 Coefficients   Standard Error   t Stat    P-value   R Square   VIF
Intercept        14.7528        0.5127           28.7734   0.0000
Assets           0.0000         0.0000           0.7769    0.4375    0.0605     1.0645
Turnover Ratio   -0.0073        0.0030           -2.4050   0.0164    0.0349     1.0362
Expense Ratio    -1.2833        0.3711           -3.4578   0.0006    0.0663     1.0710
Type             4.2002         0.3077           13.6507   0.0000    0.0057     1.0057
Risk Level       -1.1058        0.3368           -3.2835   0.0011    0.0239     1.0244
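The VIF column in the coefficient table above is tied to the R Square column by VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing X_j on the other independent variables. The short sketch below recomputes the VIFs from the printed R Square values; the variable names are illustrative, and the small discrepancies in the fourth decimal place are due to the rounding of R Square in the output.

```python
# R Square of each independent variable regressed on the others,
# and the VIF printed beside it, both copied from the coefficient table.
r_square = {
    "Assets": 0.0605,
    "Turnover Ratio": 0.0349,
    "Expense Ratio": 0.0663,
    "Type": 0.0057,
    "Risk Level": 0.0239,
}
reported_vif = {
    "Assets": 1.0645,
    "Turnover Ratio": 1.0362,
    "Expense Ratio": 1.0710,
    "Type": 1.0057,
    "Risk Level": 1.0244,
}


def variance_inflation_factor(r2):
    """VIF = 1 / (1 - R^2) for one independent variable."""
    return 1.0 / (1.0 - r2)


computed_vif = {name: variance_inflation_factor(r2) for name, r2 in r_square.items()}

for name, vif in computed_vif.items():
    flag = "possible collinearity" if vif > 5 else "ok"
    print(f"{name:15s} computed VIF={vif:.4f}  reported={reported_vif[name]:.4f}  {flag}")
```

All five values are close to 1, far below the cutoff of 5 used in this solution.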
A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with turnover ratio, expense ratio, type, and risk level has the lowest Cp value and the highest adjusted r2. The best-subset approach yielded: Cp = 4.6035 with X2X3X4X5 and adjusted r2 = 0.2282. (PHStat display of 4 smallest Cp)
Model        Cp       k+1   R Square   Adj. R Square   Std. Error
X3X4X5       8.9597   4     0.2257     0.2225          4.1553
X1X3X4X5     9.7842   5     0.2269     0.2227          4.1548
X2X3X4X5     4.6035   5     0.2324     0.2282          4.1402
X1X2X3X4X5   6.0000   6     0.2330     0.2278          4.1413
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model
                 Coefficients   Standard Error   t Stat    P-value
Intercept        14.8718        0.4892           30.4008   0.0000
Type             4.2060         0.3075           13.6774   0.0000
Expense Ratio    -1.3371        0.3645           -3.6683   0.0003
Risk Level       -1.1209        0.3361           -3.3348   0.0009
Turnover Ratio   -0.0076        0.0030           -2.5218   0.0119
The most appropriate multiple regression model for predicting three-year return is: $\hat{Y} = 14.8718 + 0X_1 - 0.0076X_2 - 1.3371X_3 + 4.2060X_4 - 1.1209X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The adjusted r2 for the best model is 0.2282 and the r2 for the model is 0.2324, so 23.24% of the variation in three-year return can be explained by variation in turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.
3-year Return %: The residual plots: [figures not reproduced]
The residual plots do not reveal any specific pattern.
Normal probability plot: [figure not reproduced]
The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 3-year return % is $\hat{Y} = 14.8718 + 0X_1 - 0.0076X_2 - 1.3371X_3 + 4.2060X_4 - 1.1209X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).
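To illustrate how the fitted 3-year model is used, the sketch below evaluates the equation for one fund. The coefficients are taken from the stepwise output; the function name and the input values (turnover ratio 50, expense ratio 1.0, a growth fund that is not high risk) are hypothetical and chosen only for illustration.

```python
# Coefficients of the four-variable model from the stepwise output.
# Assets (X1) has a zero coefficient because it is not in the model.
COEFFICIENTS = {
    "intercept": 14.8718,
    "turnover_ratio": -0.0076,
    "expense_ratio": -1.3371,
    "type_growth": 4.2060,
    "risk_high": -1.1209,
}


def predict_three_year_return(turnover_ratio, expense_ratio, is_growth, is_high_risk):
    """Predicted 3-year return % from the fitted equation."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["turnover_ratio"] * turnover_ratio
        + COEFFICIENTS["expense_ratio"] * expense_ratio
        + COEFFICIENTS["type_growth"] * (1 if is_growth else 0)
        + COEFFICIENTS["risk_high"] * (1 if is_high_risk else 0)
    )


# Hypothetical fund, not taken from the data set.
example = predict_three_year_return(50.0, 1.0, is_growth=True, is_high_risk=False)
print(f"Predicted 3-year return: {example:.4f}%")
```

Because r2 is only 0.2324, predictions from this model leave most of the variation in three-year return unexplained and should be read with that in mind.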
5-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)
                 Coefficients   Standard Error   t Stat    P-value   R Square   VIF
Intercept        11.5498        0.3988           28.9636   0.0000
Assets           0.0000         0.0000           1.2028    0.2295    0.0605     1.0645
Turnover Ratio   -0.0044        0.0024           -1.8650   0.0626    0.0349     1.0362
Expense Ratio    -1.3985        0.2886           -4.8451   0.0000    0.0663     1.0710
Type             6.6679         0.2393           27.8633   0.0000    0.0057     1.0057
Risk Level       -0.9499        0.2619           -3.6265   0.0003    0.0239     1.0244
A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a four-variable model with turnover ratio, expense ratio, type, and risk level has the lowest Cp value and an adjusted r2 (0.5267) nearly equal to the highest value in the display (0.5269, for the five-variable model). The best-subset approach yielded: Cp = 5.4466 with X2X3X4X5 and adjusted r2 = 0.5267. (PHStat display of 4 smallest Cp)
Model        Cp       k+1   R Square   Adj. R Square   Std. Error
X3X4X5       7.5684   4     0.5266     0.5246          3.2287
X1X3X4X5     7.4781   5     0.5279     0.5253          3.2263
X2X3X4X5     5.4466   5     0.5292     0.5267          3.2219
X1X2X3X4X5   6.0000   6     0.5302     0.5269          3.2209
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model
                 Coefficients   Standard Error   t Stat    P-value
Intercept        11.6931        0.3807           30.7158   0.0000
Type             6.6749         0.2393           27.8924   0.0000
Expense Ratio    -1.4633        0.2837           -5.1588   0.0000
Risk Level       -0.9681        0.2616           -3.7010   0.0002
Turnover Ratio   -0.0048        0.0023           -2.0296   0.0428
The most appropriate multiple regression model for predicting five-year return is: $\hat{Y} = 11.6931 + 0X_1 - 0.0048X_2 - 1.4633X_3 + 6.6749X_4 - 0.9681X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The adjusted r2 for the best model is 0.5267 and the r2 for the model is 0.5292, so 52.92% of the variation in five-year return can be explained by variation in turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.
5-year Return %: The residual plots: [figures not reproduced]
The residual plots do not reveal any specific pattern.
Normal probability plot: [figure not reproduced]
The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 5-year return % is $\hat{Y} = 11.6931 + 0X_1 - 0.0048X_2 - 1.4633X_3 + 6.6749X_4 - 0.9681X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).
10-year Return %:
All hypothesis tests are performed at the 5% level of significance.
Let X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise)
                 Coefficients   Standard Error   t Stat    P-value   R Square   VIF
Intercept        12.6075        0.2239           56.3114   0.0000
Assets           0.0000         0.0000           2.9701    0.0031    0.0605     1.0645
Turnover Ratio   -0.0035        0.0013           -2.6058   0.0094    0.0349     1.0362
Expense Ratio    -1.2513        0.1621           -7.7216   0.0000    0.0663     1.0710
Type             2.8917         0.1344           21.5223   0.0000    0.0057     1.0057
Risk Level       -0.4980        0.1471           -3.3864   0.0007    0.0239     1.0244
A regression analysis revealed that all five variables had VIF values below 5. So there is no reason to suspect collinearity between any pair of variables. A best subsets regression analysis reveals that a five-variable model with assets, turnover ratio, expense ratio, type, and risk level has the lowest Cp value and the highest adjusted r2. The best-subset approach yielded: Cp = 6.0000 with X1X2X3X4X5 and adjusted r2 = 0.4345. (PHStat display of 4 smallest Cp)
Model        Cp        k+1   R Square   Adj. R Square   Std. Error
X1X2X3X4     15.4679   5     0.4296     0.4265          1.8212
X1X3X4X5     10.7903   5     0.4332     0.4301          1.8155
X2X3X4X5     12.8214   5     0.4316     0.4285          1.8180
X1X2X3X4X5   6.0000    6     0.4384     0.4345          1.8084
Using Stepwise Regression Analysis in PHStat (partial display) reveals the best model
                 Coefficients   Standard Error   t Stat    P-value
Intercept        12.6075        0.2239           56.3114   0.0000
Type             2.8917         0.1344           21.5223   0.0000
Expense Ratio    -1.2513        0.1621           -7.7216   0.0000
Assets           0.0000         0.0000           2.9701    0.0031
Risk Level       -0.4980        0.1471           -3.3864   0.0007
Turnover Ratio   -0.0035        0.0013           -2.6058   0.0094
The most appropriate multiple regression model for predicting ten-year return is: $\hat{Y} = 12.6075 + 0.0000X_1 - 0.0035X_2 - 1.2513X_3 + 2.8917X_4 - 0.4980X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise). The coefficient of X1 is positive but rounds to 0.0000 at four decimal places. The adjusted r2 for the best model is 0.4345 and the r2 for the model is 0.4384, so 43.84% of the variation in ten-year return can be explained by variation in assets, turnover ratio, expense ratio, type, and risk level. All the remaining independent variables have a p-value for the individual t test statistic < 0.05.
10-year Return %: The residual plots: [figures not reproduced]
The residual plots do not reveal any specific pattern.
10-year Return %: Normal probability plot: [figure not reproduced]
The normal probability plot indicates that the distribution of the residuals deviates slightly from the normal distribution in the tails. The best model for predicting the 10-year return % is $\hat{Y} = 12.6075 + 0.0000X_1 - 0.0035X_2 - 1.2513X_3 + 2.8917X_4 - 0.4980X_5$, for X1 = Assets, X2 = Turnover Ratio, X3 = Expense Ratio, X4 = Type (1 for growth and 0 otherwise), X5 = Risk Level (1 for high and 0 otherwise).
The Claro Mountain State Student Surveys Case
Chapter 1
1.
2.
Question 1: categorical, nominal; Question 2: categorical, nominal; Question 3: categorical, nominal; Question 4: numerical, discrete, interval; Question 5: categorical, nominal; Question 6: categorical, nominal; Question 7: numerical, discrete, interval; Question 8: numerical, discrete, ratio; Question 9: categorical, nominal; Question 10: categorical, nominal; Question 11: categorical, nominal; Question 12: categorical, nominal; Question 13: numerical, discrete, ratio; Question 14: numerical, discrete, ratio; Question 15: numerical, discrete, interval; Question 16: numerical, discrete, ratio; Question 17: categorical, nominal; Question 18: numerical, discrete, interval; Question 19: numerical, discrete, interval; Question 20: numerical, discrete, interval; Question 21: numerical, discrete, interval; Question 22: numerical, discrete, interval; Question 23: categorical, nominal. ZIP or postal code, to be consistent with the data in the file, "the first five characters of such a code with the third, fourth, and fifth characters changed to X." Annual household income might need more specification such as thousands of dollars. The free response gender might also be considered
3.
4. 5.
6. 7.
even though "free response" is an acceptable definition (although one that may not be amenable to data analysis). Questions in which the domain is listed as a set of choices. Question 1 asks for full-time or part-time and Question 3 asks for transfer status of yes or no. Neither of these responses would need data wrangling. No. This is an open response question. Invite discussion of how gender can be represented. Interested students should be referred to websites such as https://williamsinstitute.law.uclea.edu/quick-facts/survey-measures/ An alternate survey question and response is: Your current gender identity: Man, Woman, Neither, Both, Other. In the data cleaning process, a response of "10" for Questions 18, 19, or 20 could be changed to 7, but recoding the answer as missing is reasonable too. There are many errors. Look at the data for typographical errors, consistency among values for categorical variables, numerical values that are invalid or seem irregular, and non-numerical values for a numeric variable. For example, cell B21 "PT" is an entry error, which is inconsistent with the coding of part-time, which should be "P/T." In cell D11, "N" should be "No", and in cell D17, "Y" should be "Yes." Other specific errors may include cells P16, N18, F19, R14, S3, H12, T4, and V9. The error in the Major column is the most subtle. There is a certain amount of ambiguity because none of the columns come with a formal definition. That’s the reason operational definitions are needed. For example, is the postal code 60XXX an error (and should it be something such as 60601), or is it a valid value created as a less-specific identifier? One would not know unless one had the operational definition of postal code.
8.
For instructor’s use: Question 8 invites learners to reflect on Column K. Coding gender identity is an open question that is still being discussed and which raises some non-statistical concerns. There is no one correct way to code gender, but the student survey document uses one model approach described by the Williams Institute at UCLA. This question can be omitted without any loss of comprehension of chapter concepts or later learning. However, for those wanting to include some DEI-related content, this question opens the door to a broad discussion that might include how many categories could/should be defined and whether a category "Other" would be an act of inclusion.
Chapter 2
1.
Tabular and visual summaries (figures not reproduced here) are given for each of the following variables: Status, Class, Transfer, Expected Year of Graduation, Major, Grad School Intention, Age, Height, Assigned Sex, Gender Identity, Postal Code, Employment, Loans, GPA, Current Credit Hours, Course Materials, Delivery Mode Preference, Food-Dining Satisfaction, Athletic Satisfaction, Student Support Satisfaction, Development Center Visits, Expected Starting Salary, Recommended Course.
2.
About half of the students are full-time and half are part-time. Only 13% of the students are first-year, while 64% are upper level, and 23% are second-year students. Most of the students are not transfer students. Nearly 56% of students are expecting to graduate in 2026 or 2027. No major has more than 16% of the responses. More students intend to attend graduate school than fall into either of the other categories. Nearly 75% of the students are between 17 and 21 years old. The majority of students are between 64 and 68 inches tall. There are about the same number of females and males. The majority of the students are currently employed.
Chapter 3
Implicit in this assignment is the determination of which variables are numerical variables. In turn, that creates an opportunity to discuss an issue about summary statistics. The data set contains a set of ordinal-scaled attitudinal variables that measure satisfaction. In a strict sense, these are categorical variables with numeric values and, therefore, would not be part of the report. That said, some sources treat such variables as quasi-numerical, and students will have seen many online review sites that report means of ordinal-scaled ratings, such as 4.3, whether the rating is for a particular product, a hotel, or even a professor’s performance. Using means for such data is questionable, but reporting the mode, the median, and the range would not be. This assignment can be extended by asking students to examine group data by calculating summary statistics by a categorical variable such as status. Explored that way, there are differences in means which could be further explored if hypothesis testing is taught.
Descriptive Summary for Age, Height, Loans, GPA, Current Credit Hours, Course Materials
                     Age       Height    Loans     GPA       Current Credit Hours   Course Materials
Mean                 21.9304   69.7739   32.8583   3.3171    9.7739                 157.9826
Median               22        70        33.3      3.45      12                     142
Mode                 21        69        37        3.55      12                     289
Minimum              17        64        3.4       1.91      1                      31
Maximum              30        76        59.7      4.53      18                     291
Range                13        12        56.3      2.62      17                     260
Variance             5.8548    8.1941    75.2518   0.3157    29.2291                5220.7716
Standard Deviation   2.4197    2.8625    8.6748    0.5619    5.4064                 72.2549
Coeff. of Variation  11.03%    4.10%     26.40%    16.94%    55.31%                 45.74%
Skewness             0.8503    -0.0286   -0.2290   -0.4800   -0.2737                0.3316
Kurtosis             1.1478    -0.6865   0.8657    -0.2074   -1.3071                -0.9278
Count                115       115       115       115       115                    115
Standard Error       0.2256    0.2669    0.8089    0.0524    0.5041                 6.7378
First Quartile       20        67        27.3      2.96      5                      107
Third Quartile       23        72        38.1      3.67      14                     216
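Two rows of the descriptive summary above are derived from the others: the coefficient of variation is the standard deviation divided by the mean (as a percentage), and the standard error is the standard deviation divided by the square root of the count. The sketch below recomputes both for four of the variables; the dictionary layout and function names are illustrative assumptions.

```python
import math

# (mean, standard deviation) copied from the descriptive summary; n = 115.
summary = {
    "Age": (21.9304, 2.4197),
    "Height": (69.7739, 2.8625),
    "Loans": (32.8583, 8.6748),
    "GPA": (3.3171, 0.5619),
}
n = 115


def coefficient_of_variation(mean, sd):
    """Standard deviation as a percentage of the mean."""
    return sd / mean * 100.0


def standard_error(sd, count):
    """Standard error of the mean."""
    return sd / math.sqrt(count)


cv = {name: coefficient_of_variation(mean, sd) for name, (mean, sd) in summary.items()}
se = {name: standard_error(sd, n) for name, (_, sd) in summary.items()}

for name in summary:
    print(f"{name:7s} CV={cv[name]:.2f}%  SE={se[name]:.4f}")
```

Because the coefficient of variation is unit-free, it allows the spread of variables measured on different scales, such as Height and Loans, to be compared directly.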
Descriptive Summary for Food-Dining Satisfaction, Athletic Satisfaction, Student Support Satisfaction, Devel Center Visits, Expected Starting Salary (cont.)
                     Food-Dining    Athletic       Student Support   Devel Ctr   Expected
                     Satisfaction   Satisfaction   Satisfaction      Visits      Starting Salary
Mean                 4.1565         4.5913         4.5478            3.1652      62.4957
Median               4              5              4                 4           63
Mode                 4              4              4                 4           59
Minimum              1              1              1                 0           12
Maximum              7              7              7                 22          98
Range                6              6              6                 22          86
Variance             2.7121         1.8227         2.2323            7.7005      358.1118
Standard Deviation   1.6469         1.3501         1.4941            2.7750      18.9238
Coeff. of Variation  39.62%         29.41%         32.85%            87.67%      30.28%
Skewness             -0.3270        -0.4802        -0.0058           2.7286      -0.3706
Kurtosis             -0.4448        0.1007         -0.0654           17.2127     -0.3672
Count                115            115            115               115         115
Standard Error       0.1536         0.1259         0.1393            0.2588      1.7647
First Quartile       3              4              4                 1           50
Third Quartile       5              6              6                 5           77
Chapter 4
This case naturally follows the Chapter 3 case and the authors recommend this case be assigned only if the Chapter 3 case was assigned first. There are no implicit concepts in this case, just practice in the computation and presentation of conditional and marginal probabilities.
1.
Student status and current class
Count of Class
              Status
Class         F/T   P/T   Grand Total
First-year    9     6     15
Second-year   8     18    26
Upper-level   41    33    74
Grand Total   58    57    115
Joint and Marginal Probabilities P(Class, Status)
              Status
Class         F/T      P/T      P(Class)
First-year    0.0783   0.0522   0.1304
Second-year   0.0696   0.1565   0.2261
Upper-level   0.3565   0.2870   0.6435
P(Status)     0.5043   0.4957   1.0000
Conditional Probabilities P(Status|Class)
              Status
Class         F/T      P/T
First-year    0.6000   0.4000
Second-year   0.3077   0.6923
Upper-level   0.5541   0.4459
Conditional Probabilities P(Class|Status)
              Status
Class         F/T      P/T
First-year    0.1552   0.1053
Second-year   0.1379   0.3158
Upper-level   0.7069   0.5789
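Each of the probability tables in this case is obtained from a table of counts in the same way: joint probabilities divide each cell by the grand total, marginal probabilities divide row or column totals by the grand total, and conditional probabilities divide each cell by its row or column total. The sketch below does this for the status and class counts; the variable names are illustrative assumptions.

```python
# Counts of students by class (rows) and status (columns), from the count table.
counts = {
    "First-year": {"F/T": 9, "P/T": 6},
    "Second-year": {"F/T": 8, "P/T": 18},
    "Upper-level": {"F/T": 41, "P/T": 33},
}
statuses = ["F/T", "P/T"]

total = sum(sum(row.values()) for row in counts.values())
class_totals = {cls: sum(row.values()) for cls, row in counts.items()}
status_totals = {st: sum(counts[cls][st] for cls in counts) for st in statuses}

# Joint probabilities P(Class, Status): cell count / grand total.
joint = {cls: {st: counts[cls][st] / total for st in statuses} for cls in counts}

# Marginal probabilities: row totals give P(Class), column totals give P(Status).
p_class = {cls: class_totals[cls] / total for cls in counts}
p_status = {st: status_totals[st] / total for st in statuses}

# Conditional probabilities: divide each cell by the total of the given event.
p_status_given_class = {
    cls: {st: counts[cls][st] / class_totals[cls] for st in statuses} for cls in counts
}
p_class_given_status = {
    cls: {st: counts[cls][st] / status_totals[st] for st in statuses} for cls in counts
}

for cls in counts:
    row = "  ".join(f"{p_status_given_class[cls][st]:.4f}" for st in statuses)
    print(f"P(Status | {cls}): {row}")
```

Replacing the counts dictionary with any of the other cross-classification tables in this chapter reproduces the corresponding joint, marginal, and conditional tables.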
1. cont. Student status and graduate school intentions
Count of Grad School Intention
              Status
Intention     F/T   P/T   Grand Total
No            9     5     14
Not sure      23    22    45
Yes           26    30    56
Grand Total   58    57    115
Joint and Marginal Probabilities P(Intention, Status)
              Status
Intention     F/T      P/T      P(Intention)
No            0.0783   0.0435   0.1217
Not sure      0.2000   0.1913   0.3913
Yes           0.2261   0.2609   0.4870
P(Status)     0.5043   0.4957   1.0000
Conditional Probabilities P(Status|Intention)
              Status
Intention     F/T      P/T
No            0.6429   0.3571
Not sure      0.5111   0.4889
Yes           0.4643   0.5357
Conditional Probabilities P(Intention|Status)
              Status
Intention     F/T      P/T
No            0.1552   0.0877
Not sure      0.3966   0.3860
Yes           0.4483   0.5263
1. cont. Student status and employment status
Count of Employment
              Status
Employment    F/T   P/T   Grand Total
F/T           13    7     20
Not           7     12    19
P/T           38    38    76
Grand Total   58    57    115
Joint and Marginal Probabilities P(Employment, Status)
              Status
Employment    F/T      P/T      P(Employment)
F/T           0.1130   0.0609   0.1739
Not           0.0609   0.1043   0.1652
P/T           0.3304   0.3304   0.6609
P(Status)     0.5043   0.4957   1.0000
Conditional Probabilities P(Status|Employment)
              Status
Employment    F/T      P/T
F/T           0.6500   0.3500
Not           0.3684   0.6316
P/T           0.5000   0.5000
Conditional Probabilities P(Employment|Status)
              Status
Employment    F/T      P/T
F/T           0.2241   0.1228
Not           0.1207   0.2105
P/T           0.6552   0.6667
1. cont. Student status and preferred instructional delivery mode
Count of Delivery Mode Preference
                Status
Preference      F/T   P/T   Grand Total
Hybrid          17    12    29
None            11    7     18
Online asynch   5     12    17
Physical        16    19    35
Virtual         9     7     16
Grand Total     58    57    115
Joint and Marginal Probabilities P(Preference, Status)
                Status
Preference      F/T      P/T      P(Preference)
Hybrid          0.1478   0.1043   0.2522
None            0.0957   0.0609   0.1565
Online asynch   0.0435   0.1043   0.1478
Physical        0.1391   0.1652   0.3043
Virtual         0.0783   0.0609   0.1391
P(Status)       0.5043   0.4957   1.0000
Conditional Probabilities P(Status|Preference)
                Status
Preference      F/T      P/T
Hybrid          0.5862   0.4138
None            0.6111   0.3889
Online asynch   0.2941   0.7059
Physical        0.4571   0.5429
Virtual         0.5625   0.4375
Conditional Probabilities P(Preference|Status)
                Status
Preference      F/T      P/T
Hybrid          0.2931   0.2105
None            0.1897   0.1228
Online asynch   0.0862   0.2105
Physical        0.2759   0.3333
Virtual         0.1552   0.1228
1. cont. Current class and graduate school intentions
Count of Grad School Intention
              Class
Intention     First-year   Second-year   Upper-level   Grand Total
No            2            4             8             14
Not sure      7            10            28            45
Yes           6            12            38            56
Grand Total   15           26            74            115
Joint and Marginal Probabilities P(Intention, Class)
              Class
Intention     First-year   Second-year   Upper-level   P(Intention)
No            0.0174       0.0348        0.0696        0.1217
Not sure      0.0609       0.0870        0.2435        0.3913
Yes           0.0522       0.1043        0.3304        0.4870
P(Class)      0.1304       0.2261        0.6435        1.0000
Conditional Probabilities P(Class|Intention)
              Class
Intention     First-year   Second-year   Upper-level
No            0.1429       0.2857        0.5714
Not sure      0.1556       0.2222        0.6222
Yes           0.1071       0.2143        0.6786
Conditional Probabilities P(Intention|Class)
              Class
Intention     First-year   Second-year   Upper-level
No            0.1333       0.1538        0.1081
Not sure      0.4667       0.3846        0.3784
Yes           0.4000       0.4615        0.5135
Copyright ©2024 Pearson Education, Inc.
1. Current class and employment status (cont.)
Count of Employment
                     Class
Employment           First-year   Second-year   Upper-level   Grand Total
F/T                               3             17            20
Not                  7            7             5             19
P/T                  8            16            52            76
Grand Total          15           26            74            115

Joint and Marginal Probabilities P(Employment, Class)
                     Class
Employment           First-year   Second-year   Upper-level   P(Class)
F/T                  0.0000       0.0261        0.1478        0.1739
Not                  0.0609       0.0609        0.0435        0.1652
P/T                  0.0696       0.1391        0.4522        0.6609
P(Employment)        0.1304       0.2261        0.6435        1.0000

Conditional Probabilities P(Class|Employment)
                     Class
Employment           First-year   Second-year   Upper-level
F/T                  0.0000       0.1500        0.8500
Not                  0.3684       0.3684        0.2632
P/T                  0.1053       0.2105        0.6842

Conditional Probabilities P(Employment|Class)
                     Class
Employment           First-year   Second-year   Upper-level
F/T                  0.0000       0.1154        0.2297
Not                  0.4667       0.2692        0.0676
P/T                  0.5333       0.6154        0.7027
1. Major and graduate school intentions (cont.)
Count of Grad School Intention
                           Intention
Major                      No      Not sure   Yes     Grand Total
Accounting                 1       1          14      16
Computing                  1       1          3       5
Finance                    1       3          3       7
Hospitality management             1          4       5
International Business     2       6          6       14
Marketing                  2       8          7       17
OR/Management science              2          5       7
Other                              2          4       6
Retail management          1       1          5       7
Statistics or Analytics    3       5          5       13
Undecided/No major         3       15                 18
Grand Total                14      45         56      115
Major and employment status Count of Employment
Employment
Major
F/T
Not
Accounting
5
Computing
1
Finance
2
Hospitality management
P/T 1
Grand Total
10
16
4
5
1
4
7
1
1
3
5
International Business
1
2
11
14
Marketing
3
6
8
17
OR/Management science
1
2
4
7
Other
1
1
4
6
1
6
7
Retail management Statistics or Analytics
2
2
9
13
Undecided/No major
3
2
13
18
Grand Total
20
19
76
115
1. Major and preferred instructional delivery mode cont. Count of Delivery Mode Preference Preference
Major
Hybrid
None
Accounting
3
Computing
2
Finance
2
3
Hospitality management
2.
Online asynch
Grand Physical Virtual Total 4
4
2
16
2
1
5
1
2
2
7
1
2
2
5
International Business
4
2
1
5
2
14
Marketing
4
3
3
5
2
17
OR/Management science
3
1
1
2
7
Other
2
Retail management
1
Statistics or Analytics
1
3
1
3
1
1
7
2
3
1
5
2
13
Undecided/No major
6
5
2
5
Grand Total
29
18
17
35
6
18 16
115
For each pair, the two variables are not statistically independent.
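The conclusion that the paired variables are not statistically independent can be illustrated by comparing each joint probability with the product of its marginal probabilities. The sketch below uses the status and delivery mode preference counts from this case; the names are illustrative. It is a descriptive comparison of sample proportions, not a formal hypothesis test.

```python
# Counts from the delivery-mode-by-status table (columns: F/T, P/T).
counts = {
    "Hybrid": (17, 12),
    "None": (11, 7),
    "Online asynch": (5, 12),
    "Physical": (16, 19),
    "Virtual": (9, 7),
}
statuses = ("F/T", "P/T")

n = sum(sum(row) for row in counts.values())
p_status = [sum(row[j] for row in counts.values()) / n for j in range(2)]

# Under independence, P(A and B) would equal P(A) * P(B) in every cell.
gaps = {}
for pref, row in counts.items():
    p_pref = sum(row) / n
    for j, status in enumerate(statuses):
        joint = row[j] / n
        product = p_pref * p_status[j]
        gaps[(pref, status)] = joint - product

hybrid_ft_joint = counts["Hybrid"][0] / n
hybrid_ft_product = (sum(counts["Hybrid"]) / n) * p_status[0]
max_gap = max(abs(g) for g in gaps.values())

print(round(hybrid_ft_joint, 4), round(hybrid_ft_product, 4), round(max_gap, 4))
```

Because the joint probabilities differ from the products of the marginals, the sample proportions are not consistent with exact independence.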
Chapter 6
This case provides practice in determining whether the values of a numerical variable are normally distributed. Reports should, of course, exclude the attitudinal variables that may have been included in the Chapter 3 report. Note that values for height were generated by a random normal function and are the least ambiguous. Values for loans and GPA were each generated from a pair of normal distributions. For loans, values for first-year students used one distribution, while all other classes used a second distribution. For GPA, values for accounting majors used one distribution, while all other majors used a second distribution. At a minimum, look for students to report that current credit hours is not normally distributed, based on a nonlinear normal probability plot.
1.
                      Age        Height     Loans      GPA        Current    Course     Expected
                                                                  Credit     Materials  Starting
                                                                  Hours                 Salary
Mean                  21.9304    69.7739    32.8583    3.3171     9.7739     157.9826   62.4957
Median                22         70         33.3       3.45       12         142        63
Mode                  21         69         37         3.55       12         289        59
Minimum               17         64         3.4        1.91       1          31         12
Maximum               30         76         59.7       4.53       18         291        98
Range                 13         12         56.3       2.62       17         260        86
Variance              5.8548     8.1941     75.2518    0.3157     29.2291    5220.7716  358.1118
Standard Deviation    2.4197     2.8625     8.6748     0.5619     5.4064     72.2549    18.9238
Coeff. of Variation   11.03%     4.10%      26.40%     16.94%     55.31%     45.74%     30.28%
Skewness              0.8503     -0.0286    -0.2290    -0.4800    -0.2737    0.3316     -0.3706
Kurtosis              1.1478     -0.6865    0.8657     -0.2074    -1.3071    -0.9278    -0.3672
Count                 115        115        115        115        115        115        115
Standard Error        0.2256     0.2669     0.8089     0.0524     0.5041     6.7378     1.7647
First Quartile        20         67         27.3       2.96       5          107        50
Third Quartile        23         72         38.1       3.67       14         216        77
Interquartile Range   3          5          10.8       0.71       9          109        27
6*std dev             14.5180    17.1752    52.0487    3.3713     32.4384    433.5294   113.5431
1.33*std dev          3.2182     3.8072     11.5375    0.7473     7.1905     96.0990    25.1687
Age:
The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is slightly smaller than 1.33 times the standard deviation.
Height:
The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.
Loans:
The mean is smaller than the median; the range is larger than 6 times the standard deviation and the interquartile range is smaller than 1.33 times the standard deviation.
GPA:
The mean is approximately equal to the median; the range is smaller than 6 times the standard deviation and the interquartile range is approximately equal to 1.33 times the standard deviation.
Hours:
The mean is smaller than the median; the range is much smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.
Materials:
The mean is larger than the median; the range is much smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.
Salary:
The mean is slightly smaller than the median; the range is smaller than 6 times the standard deviation and the interquartile range is larger than 1.33 times the standard deviation.
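The comparisons above rest on two rules of thumb for normally distributed data: the range is approximately 6 standard deviations and the interquartile range is approximately 1.33 standard deviations. A minimal Python sketch of that check follows, using the rounded range, interquartile range, and standard deviation from the summary table; the dictionary layout and names are illustrative only.

```python
# (range, interquartile range, standard deviation) from the summary table.
summary = {
    "Age": (13, 3, 2.4197),
    "Height": (12, 5, 2.8625),
    "Loans": (56.3, 10.8, 8.6748),
    "GPA": (2.62, 0.71, 0.5619),
    "Hours": (17, 9, 5.4064),
    "Materials": (260, 109, 72.2549),
    "Salary": (86, 27, 18.9238),
}

# Ratios near 1 are consistent with the normal-distribution rules of thumb.
checks = {}
for name, (rng, iqr, sd) in summary.items():
    checks[name] = (rng / (6 * sd), iqr / (1.33 * sd))

for name, (range_ratio, iqr_ratio) in checks.items():
    print(f"{name:10s} range/(6s) = {range_ratio:.2f}   IQR/(1.33s) = {iqr_ratio:.2f}")
```

Ratios well below or well above 1, such as the range ratio for Hours, point away from normality; the normal probability plots remain the primary evidence.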
2.
Normal Probability Plots (one plot each for Age, Height, Loans, GPA, Hours, Materials, and Salary; plots not reproduced)
3.
Age and height appear to be roughly normally distributed. Loans and GPA appear to be normally distributed. Current credit hours and course materials are not normally distributed. Expected starting salary appears to be left-skewed.
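The coordinates behind a normal probability plot can be sketched as follows. The data values here are hypothetical (the survey data are not reproduced in this manual), and the plotting position i/(n + 1) is one common convention that is assumed for illustration; statistical software may use a slightly different one.

```python
from statistics import NormalDist

def normal_plot_points(values):
    """Pair each ordered value with the standard normal quantile for i/(n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    z = [NormalDist().inv_cdf(i / (n + 1)) for i in range(1, n + 1)]
    return list(zip(z, ordered))

# Hypothetical sample, not taken from the case data.
sample = [64, 67, 69, 70, 72]
points = normal_plot_points(sample)

for z_value, x_value in points:
    print(f"{z_value:8.4f}  {x_value}")
```

If the plotted points fall close to a straight line, the values are approximately normally distributed; a curved pattern suggests skewness.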
Chapter 8
This case provides practice in computing confidence interval estimates. Note that the (undisclosed) true mean for the normal distribution from which values for height were chosen is 70, which is contained in the confidence interval estimate for height, 69.25 ≤ μ ≤ 70.30.

95% confidence interval estimates of the means:
Age:                         21.48     22.38
Height:                      69.25     70.30
Loans:                       31.26     34.46
GPA:                         3.21      3.42
Current Credit Hours:        8.78      10.77
Course Materials:            144.64    171.33
Expected Starting Salary:    59.00     65.99

95% Confidence Interval Estimate for the Proportion:
Status (sample size = 115)
F/T (58)                     0.4130    0.5957
P/T (57)                     0.4043    0.5870
Class (sample size = 115)
First-year (15)              0.0689    0.1920
Second-year (26)             0.1496    0.3025
Upper-level (74)             0.5559    0.7310
Transfer (sample size = 115)
No (102)                     0.8291    0.9448
Yes (13)                     0.0552    0.1709
Major (sample size = 115)
Accounting (16)              0.0759    0.2024
Computing (5)                0.0062    0.0808
Finance (7)                  0.0172    0.1046
Hospitality Management (5)   0.0062    0.0808
International Business (14)  0.0620    0.1815
Marketing (17)               0.0830    0.2127
OR/Management science (7)    0.0172    0.1046
Other (6)                    0.0115    0.0928
Retail management (7)        0.0172    0.1046
Statistics or Analytics (13) 0.0552    0.1709
Undecided/No major (18)      0.0901    0.2229
Graduate School Intentions (sample size = 115)
No (14)                      0.0620    0.1815
Not sure (45)                0.3021    0.4805
Yes (56)                     0.3956    0.5783
Assigned Sex (sample size = 115)
Female (57)                  0.4043    0.5870
Male (58)                    0.4130    0.5957
95% Confidence Interval Estimate for the Proportion (cont.):
Gender Identity (sample size = 115)
Man (51)                     0.3527    0.5343
Non-binary (9)               0.0292    0.1273
Prefer self-description (7)  0.0172    0.1046
Woman (48)                   0.3273    0.5075
Employment (sample size = 115)
F/T (20)                     0.1046    0.2432
Not (19)                     0.0973    0.2331
P/T (76)                     0.5743    0.7474
Delivery Mode Preference (sample size = 115)
Hybrid (29)                  0.1728    0.3315
None (18)                    0.0901    0.2229
Online asynch (17)           0.0830    0.2127
Physical (35)                0.2203    0.3884
Virtual (16)                 0.0759    0.2024
Food-Dining Satisfaction (sample size = 115)
1 (12)                       0.0485    0.1602
2 (7)                        0.0172    0.1046
3 (13)                       0.0552    0.1709
4 (35)                       0.2203    0.3884
5 (23)                       0.1269    0.2731
6 (17)                       0.0830    0.2127
7 (8)                        0.0231    0.1161
Athletic Satisfaction (sample size = 115)
1 (3)                        –0.0030   0.0552
2 (6)                        0.0115    0.0928
3 (10)                       0.0355    0.1385
4 (35)                       0.2203    0.3884
5 (29)                       0.1728    0.3315
6 (26)                       0.1496    0.3025
7 (6)                        0.0115    0.0928
Student Support Satisfaction (sample size = 115)
1 (5)                        0.0062    0.0808
2 (1)                        –0.0083   0.0257
3 (15)                       0.0689    0.1920
4 (46)                       0.3105    0.4895
5 (19)                       0.0973    0.2331
6 (11)                       0.0419    0.1494
7 (18)                       0.0901    0.2229
95% Confidence Interval Estimate for the Proportion (cont.):
Development Center Visits (sample size = 115)
0 (17)                       0.0830    0.2127
1 (24)                       0.1344    0.2830
2 (9)                        0.0292    0.1273
3 (6)                        0.0115    0.0928
4 (29)                       0.1728    0.3315
5 (14)                       0.0620    0.1815
6 (9)                        0.0292    0.1273
7 (6)                        0.0115    0.0928
22 (1)                       –0.0083   0.0257
Recommended Course (sample size = 115)
BUS 1000 (32)                0.1964    0.3602
COM 2150 (17)                0.0830    0.2127
INB 2700 (26)                0.1496    0.3025
INB 2800 (2)                 –0.0065   0.0413
(blank) (38)                 0.2445    0.4164
You are 95% confident that the mean age is between 21.48 and 22.38. Similar statements can be made for the other confidence intervals of the means.
You are 95% confident that the population proportion of F/T students is between 0.4130 and 0.5957, and that the population proportion of P/T students is between 0.4043 and 0.5870. Similar statements can be made for the other confidence intervals of the proportions.
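The proportion intervals above are consistent with the large-sample formula p ± Z·sqrt(p(1 − p)/n). The following Python sketch reproduces a few of them; the function name is illustrative and not part of PHStat or the textbook's worksheets.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p * (1 - p) / n)
    return p - margin, p + margin

ft_low, ft_high = proportion_ci(58, 115)       # Status: F/T
other_low, other_high = proportion_ci(6, 115)  # Major: Other
rare_low, rare_high = proportion_ci(3, 115)    # Athletic Satisfaction: 1

print(round(ft_low, 4), round(ft_high, 4))
print(round(other_low, 4), round(other_high, 4))
print(round(rare_low, 4), round(rare_high, 4))
```

For categories with very few responses, the formula can return a lower limit below zero, as several of the intervals above show. Because a proportion cannot be negative, such limits are a sign that the normal approximation is being stretched.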
Chapter 10
Question 1 requires some data preparation by students. Perhaps the simplest method is to extract data for full-time students and part-time students separately, using filtering or sorting techniques. With the two data sets (two worksheets or two data tables) in hand, for each variable to be analyzed, copy the corresponding columns to a third worksheet or data table. Students using PHStat can (repeatedly) use Data Preparation → Unstack Data to prepare the two columns needed for each variable. (More advanced students using a specific data analysis program may find equivalent ways that exploit special features of the program being used.) In Question 2, the "no" and "not sure" categories should be combined to form the category "do not have current plans to attend graduate school." This is best done by redefining the grad school intention column to have "Yes" and "Not Yes" values as a first data preparation step. Answers to both questions illustrate how proper data preparation simplifies statistical analysis, recalling observations the book makes in the earliest chapters. For courses that include group work, each group could be assigned to analyze one or a subset of the numerical variables that the questions use.
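For students working in a programming language rather than a worksheet, the two preparation steps (recoding the intention column and unstacking a numerical variable by group) might look like the sketch below. The rows, field names, and helper functions are hypothetical stand-ins and do not come from the case data file.

```python
# Hypothetical rows standing in for the student survey data.
records = [
    {"status": "F/T", "intention": "Yes", "age": 21},
    {"status": "P/T", "intention": "No", "age": 24},
    {"status": "F/T", "intention": "Not sure", "age": 19},
    {"status": "P/T", "intention": "Yes", "age": 22},
]

def recode_intention(value):
    """Collapse 'No' and 'Not sure' into a single 'Not Yes' category."""
    return "Yes" if value == "Yes" else "Not Yes"

def unstack(rows, group_key, value_key):
    """Split one numerical column into one list per group."""
    columns = {}
    for row in rows:
        columns.setdefault(row[group_key], []).append(row[value_key])
    return columns

# First preparation step for Question 2: add the recoded column.
for row in records:
    row["intention2"] = recode_intention(row["intention"])

age_by_status = unstack(records, "status", "age")         # Question 1
age_by_intention = unstack(records, "intention2", "age")  # Question 2
print(age_by_status, age_by_intention)
```

The same `unstack` call can be repeated for each numerical variable, mirroring the repeated use of the unstacking step described above.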
1. Age: Population with Larger Variance = P/T (57), Smaller Variance = F/T (58)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
57
Sample Variance
6.738721805
Smaller-Variance Sample Sample Size
58
Sample Variance
5.086509377
Intermediate Calculations F Test Statistic
1.3248
Population 1 Sample Degrees of Freedom
56
Population 2 Sample Degrees of Freedom
57
Two-Tail Test Upper Critical Value
1.6925
p-Value
0.2928
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.3248$
Decision: Since FSTAT = 1.3248 < 1.6925 and the p-value = 0.2928 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
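The F test statistic is the larger sample variance divided by the smaller one. A minimal Python sketch of the computation and the decision rule follows, using the age variances and the upper critical value reported in the worksheet above. The critical value is taken from the worksheet as given rather than computed, and the function name is illustrative.

```python
def f_stat(var_a, var_b):
    """F statistic with the larger sample variance in the numerator."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller

# Sample variances of age: F/T (n = 58) and P/T (n = 57).
age_f = f_stat(5.086509377, 6.738721805)

# Upper critical value for 56 and 57 degrees of freedom, as reported above.
upper_critical = 1.6925
reject_equal_variances = age_f > upper_critical

print(round(age_f, 4), reject_equal_variances)
```

When `reject_equal_variances` is false, the pooled-variance t test is used for the difference in means; otherwise the separate-variance t test is used.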
1. Age (cont.): Population 1 = F/T (58), 2 = P/T (57)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
58
Sample Mean
21.96551724
Sample Standard Deviation
2.255329106
Population 2 Sample Sample Size
57
Sample Mean
21.89473684
Sample Standard Deviation
2.595904814
Intermediate Calculations Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Total Degrees of Freedom
113
Pooled Variance
5.9053
Standard Error
0.4532
Difference in Sample Means
0.0708
t Test Statistic
0.1562
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.8762
Do not reject the null hypothesis
Decision: Since tSTAT = 0.1562 < 1.9812 and the p-value = 0.8762 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean age is different between part-time and full-time students.
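The intermediate calculations of the pooled-variance t test can be reproduced from the summary statistics alone. The sketch below uses the age figures from the worksheet above; the critical value 1.9812 is copied from the worksheet rather than computed, and the function name is illustrative.

```python
from math import sqrt

def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (pooled variance, standard error, t statistic, degrees of freedom)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / se
    return pooled_var, se, t_stat, df

# Age: population 1 = F/T (58), population 2 = P/T (57).
pooled_var, se, t_stat, df = pooled_t(
    58, 21.96551724, 2.255329106,
    57, 21.89473684, 2.595904814,
)

# Two-tail decision using the critical value reported in the worksheet.
reject = abs(t_stat) > 1.9812
print(df, round(pooled_var, 4), round(se, 4), round(t_stat, 4), reject)
```

The same function applies to the other pooled-variance tests in this case by substituting the corresponding sample sizes, means, and standard deviations.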
1. GPA (cont.): Population with Larger Variance = P/T (57), Smaller Variance = F/T (58)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
57
Sample Variance
0.325556955
Smaller-Variance Sample Sample Size
58
Sample Variance
0.311323291
Intermediate Calculations F Test Statistic
1.0457
Population 1 Sample Degrees of Freedom
56
Population 2 Sample Degrees of Freedom
57
Two-Tail Test Upper Critical Value
1.6925
p-Value
0.8665
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.0457$
Decision: Since FSTAT = 1.0457 < 1.6925 and the p-value = 0.8665 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
1. GPA (cont.): Population 1 = F/T (58), 2 = P/T (57)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
58
Sample Mean
3.328275862
Sample Standard Deviation
0.557963521
Population 2 Sample Sample Size
57
Sample Mean
3.305789474
Sample Standard Deviation
0.570575985
Intermediate Calculations Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Total Degrees of Freedom
113
Pooled Variance
0.3184
Standard Error
0.1052
Difference in Sample Means
0.0225
t Test Statistic
0.2137
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.8312
Do not reject the null hypothesis
Decision: Since tSTAT = 0.2137 < 1.9812 and the p-value = 0.8312 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean GPA is different between part-time and full-time students.
1. Amount of current outstanding student loans (cont.): Population with Larger Variance = P/T (57), Smaller Variance = F/T (58)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
57
Sample Variance
81.34800752
Smaller-Variance Sample Sample Size
58
Sample Variance
69.10416213
Intermediate Calculations F Test Statistic
1.1772
Population 1 Sample Degrees of Freedom
56
Population 2 Sample Degrees of Freedom
57
Two-Tail Test Upper Critical Value
1.6925
p-Value
0.5412
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6925, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.1772$
Decision: Since FSTAT = 1.1772 < 1.6925 and the p-value = 0.5412 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
1. Amount of current outstanding student loans (cont.): Population 1 = F/T (58), 2 = P/T (57)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
58
Sample Mean
33.70689655
Sample Standard Deviation
8.312891322
Population 2 Sample Sample Size
57
Sample Mean
31.99473684
Sample Standard Deviation
9.019313029
Intermediate Calculations Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Total Degrees of Freedom
113
Pooled Variance
75.1719
Standard Error
1.6171
Difference in Sample Means
1.7122
t Test Statistic
1.0588
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.2919
Do not reject the null hypothesis
Decision: Since tSTAT = 1.0588 < 1.9812 and the p-value = 0.2919 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount of current outstanding student loans is different between part-time and full-time students.
1. Amount spent on course materials (cont.): Population with Larger Variance = F/T (58), Smaller Variance = P/T (57)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
58
Sample Variance
2837.317604
Smaller-Variance Sample Sample Size
57
Sample Variance
1028.684211
Intermediate Calculations F Test Statistic
2.7582
Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Two-Tail Test Upper Critical Value
1.6946
p-Value
0.0002
Reject the null hypothesis
Decision rule: If FSTAT > 1.6946, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 2.7582$
Decision: Since FSTAT = 2.7582 > 1.6946 and the p-value = 0.0002 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
1. Amount spent on course materials (cont.): Population 1 = F/T (58), 2 = P/T (57)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
58
Sample Mean
214.6551724
Sample Standard Deviation
53.2665
Population 2 Sample Sample Size
57
Sample Mean
100.3157895
Sample Standard Deviation
32.0731
Intermediate Calculations Numerator of Degrees of Freedom
4484.4934
Denominator of Degrees of Freedom
47.8001
Total Degrees of Freedom
93.8176
Degrees of Freedom
93
Standard Error
8.1833
Difference in Sample Means
114.3394
Separate-Variance t Test Statistic
13.9723
Two-Tail Test Lower Critical Value
-1.9858
Upper Critical Value
1.9858
p-Value
0.0000
Reject the null hypothesis
Decision: Since tSTAT = 13.9723 > 1.9858 and the p-value = 0.0000 < 0.05, reject H0. There is sufficient evidence to conclude that the mean amount spent on course materials is different between part-time and full-time students.
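The separate-variance t test replaces the pooled variance with the two sample variances and uses an approximate degrees-of-freedom formula, whose numerator and denominator appear in the worksheet above. A minimal Python sketch with the course materials figures follows. The standard deviations are the rounded values shown above, so the last digits may differ slightly from the worksheet; the critical value 1.9858 is copied from the worksheet, and the function name is illustrative.

```python
from math import floor, sqrt

def separate_variance_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (standard error, t statistic, approximate degrees of freedom)."""
    v1 = sd1**2 / n1
    v2 = sd2**2 / n2
    se = sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    # Numerator and denominator of the degrees-of-freedom formula.
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return se, t_stat, df

# Course materials: population 1 = F/T (58), population 2 = P/T (57).
se, t_stat, df = separate_variance_t(
    58, 214.6551724, 53.2665,
    57, 100.3157895, 32.0731,
)

# The worksheet keeps only the integer part of the degrees of freedom.
df_used = floor(df)
reject = abs(t_stat) > 1.9858
print(round(se, 4), round(t_stat, 4), round(df, 4), df_used, reject)
```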
1. Expected annual starting salary (cont.): Population with Larger Variance = F/T (58), Smaller Variance = P/T (57)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
58
Sample Variance
370.03902
Smaller-Variance Sample Sample Size
57
Sample Variance
352.3552632
Intermediate Calculations F Test Statistic
1.0502
Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Two-Tail Test Upper Critical Value
1.6946
p-Value
0.8553
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6946, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.0502$
Decision: Since FSTAT = 1.0502 < 1.6946 and the p-value = 0.8553 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
1. Expected annual starting salary (cont.): Population 1 = F/T (58), 2 = P/T (57)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
58
Sample Mean
62.56896552
Sample Standard Deviation
19.23639831
Population 2 Sample Sample Size
57
Sample Mean
62.42105263
Sample Standard Deviation
18.77112845
Intermediate Calculations Population 1 Sample Degrees of Freedom
57
Population 2 Sample Degrees of Freedom
56
Total Degrees of Freedom
113
Pooled Variance
361.2754
Standard Error
3.5450
Difference in Sample Means
0.1479
t Test Statistic
0.0417
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.9668
Do not reject the null hypothesis
Decision: Since tSTAT = 0.0417 < 1.9812 and the p-value = 0.9668 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean expected annual starting salary is different between part-time and full-time students.
2. Age: Population with Larger Variance = Yes (56), Smaller Variance = Not yes (59)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
56
Sample Variance
7.361038961
Smaller-Variance Sample Sample Size
59
Sample Variance
4.184687317
Intermediate Calculations F Test Statistic
1.7590
Population 1 Sample Degrees of Freedom
55
Population 2 Sample Degrees of Freedom
58
Two-Tail Test Upper Critical Value
1.6907
p-Value
0.0352
Reject the null hypothesis
Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.7590$
Decision: Since FSTAT = 1.7590 > 1.6907 and the p-value = 0.0352 < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
2. Age (cont.): Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
59
Sample Mean
21.52542373
Sample Standard Deviation
2.0457
Population 2 Sample Sample Size
56
Sample Mean
22.35714286
Sample Standard Deviation
2.7131
Intermediate Calculations Numerator of Degrees of Freedom
0.0410
Denominator of Degrees of Freedom
0.0004
Total Degrees of Freedom
102.1617
Degrees of Freedom
102
Standard Error
0.4499
Difference in Sample Means
-0.8317
Separate-Variance t Test Statistic
-1.8488
Two-Tail Test Lower Critical Value
-1.9835
Upper Critical Value
1.9835
p-Value
0.0674
Do not reject the null hypothesis
Decision: Since tSTAT = –1.8488 > –1.9835 and the p-value = 0.0674 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean age is different between students who have current plans to attend graduate school and those who do not.
2. GPA (cont.): Population with Larger Variance = Not yes (59), Smaller Variance = Yes (56)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
59
Sample Variance
0.362404442
Smaller-Variance Sample Sample Size
56
Sample Variance
0.262032955
Intermediate Calculations F Test Statistic
1.3830
Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Two-Tail Test Upper Critical Value
1.6970
p-Value
0.2277
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6970, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.3830$
Decision: Since FSTAT = 1.3830 < 1.6970 and the p-value = 0.2277 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
2. GPA (cont.): Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
59
Sample Mean
3.249152542
Sample Standard Deviation
0.602000367
Population 2 Sample Sample Size
56
Sample Mean
3.38875
Sample Standard Deviation
0.511891546
Intermediate Calculations Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Total Degrees of Freedom
113
Pooled Variance
0.3136
Standard Error
0.1045
Difference in Sample Means
-0.1396
t Test Statistic
-1.3363
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.1841
Do not reject the null hypothesis
Decision: Since tSTAT = –1.3363 > –1.9812 and the p-value = 0.1841 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean GPA is different between students who have current plans to attend graduate school and those who do not.
2. Amount of current outstanding student loans (cont.): Population with Larger Variance = Not yes (59), Smaller Variance = Yes (56)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
59
Sample Variance
76.45322034
Smaller-Variance Sample Sample Size
56
Sample Variance
75.23178896
Intermediate Calculations F Test Statistic
1.0162
Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Two-Tail Test Upper Critical Value
1.6970
p-Value
0.9538
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6970, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.0162$
Decision: Since FSTAT = 1.0162 < 1.6970 and the p-value = 0.9538 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
2. Amount of current outstanding student loans (cont.): Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
59
Sample Mean
32.62372881
Sample Standard Deviation
8.743753218
Population 2 Sample Sample Size
56
Sample Mean
33.10535714
Sample Standard Deviation
8.673626056
Intermediate Calculations Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Total Degrees of Freedom
113
Pooled Variance
75.8587
Standard Error
1.6249
Difference in Sample Means
-0.4816
t Test Statistic
-0.2964
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.7675
Do not reject the null hypothesis
Decision: Since tSTAT = –0.2964 > –1.9812 and the p-value = 0.7675 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount of current outstanding student loans is different between students who have current plans to attend graduate school and those who do not.
2. Amount spent on course materials (cont.): Population with Larger Variance = Yes (56), Smaller Variance = Not yes (59)
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data
Level of Significance
0.05
Larger-Variance Sample Sample Size
56
Sample Variance
5581.506169
Smaller-Variance Sample Sample Size
59
Sample Variance
4902.552309
Intermediate Calculations F Test Statistic
1.1385
Population 1 Sample Degrees of Freedom
55
Population 2 Sample Degrees of Freedom
58
Two-Tail Test Upper Critical Value
1.6907
p-Value
0.6258
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: $F_{STAT} = S_1^2 / S_2^2 = 1.1385$
Decision: Since FSTAT = 1.1385 < 1.6907 and the p-value = 0.6258 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
2. Amount spent on course materials (cont.): Population 1 = Not yes (59), 2 = Yes (56)
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
59
Sample Mean
163.6101695
Sample Standard Deviation
70.0182284
Population 2 Sample Sample Size
56
Sample Mean
152.0535714
Sample Standard Deviation
74.70947844
Intermediate Calculations Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Total Degrees of Freedom
113
Pooled Variance
5233.0166
Standard Error
13.4960
Difference in Sample Means
11.5566
t Test Statistic
0.8563
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.3936
Do not reject the null hypothesis
Decision: Since tSTAT = 0.8563 < 1.9812 and the p-value = 0.3936 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean amount spent on course materials is different between students who have current plans to attend graduate school and those who do not.
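The intermediate calculations in the output (pooled variance, standard error, tSTAT) follow directly from the summary statistics. A minimal sketch, assuming only the sample sizes, means, and standard deviations listed above; the helper name `pooled_t` is hypothetical and the critical value 1.9812 is copied from the output.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Pooled-variance t statistic for H0: mu1 = mu2, from summary statistics."""
    df = n1 + n2 - 2
    # Weighted average of the two sample variances.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t_stat = (mean1 - mean2) / std_err
    return pooled_var, std_err, t_stat, df


# Population 1 = Not yes (59), Population 2 = Yes (56)
pooled_var, std_err, t_stat, df = pooled_t(
    59, 163.6101695, 70.0182284,
    56, 152.0535714, 74.70947844,
)

critical = 1.9812  # two-tail critical value copied from the output
reject_h0 = abs(t_stat) > critical

print(round(pooled_var, 4), round(std_err, 4), round(t_stat, 4), df, reject_h0)
```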
2. Expected annual starting salary: cont.
Population: Larger Variance = Yes (56), Smaller Variance = Not yes (59)
H0: $\sigma_1^2 = \sigma_2^2$ The population variances are the same.
H1: $\sigma_1^2 \neq \sigma_2^2$ The population variances are different.
F Test for Differences in Two Variances
Data Level of Significance
0.05
Larger-Variance Sample Sample Size
56
Sample Variance
380.5077922
Smaller-Variance Sample Sample Size
59
Sample Variance
342.635301
Intermediate Calculations F Test Statistic
1.1105
Population 1 Sample Degrees of Freedom
55
Population 2 Sample Degrees of Freedom
58
Two-Tail Test Upper Critical Value
1.6907
p-Value
0.6931
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.6907, reject H0.
Test statistic: FSTAT = $S_1^2 / S_2^2$ = 1.1105
Decision: Since FSTAT = 1.1105 < 1.6907 and the p-value = 0.6931 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
2. Expected annual starting salary: cont.
Population 1 = Not yes (59), 2 = Yes (56)
H0: $\mu_1 = \mu_2$
H1: $\mu_1 \neq \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference
0
Level of Significance
0.05
Population 1 Sample Sample Size
59
Sample Mean
62.05084746
Sample Standard Deviation
18.51041061
Population 2 Sample Sample Size
56
Sample Mean
62.96428571
Sample Standard Deviation
19.50660894
Intermediate Calculations Population 1 Sample Degrees of Freedom
58
Population 2 Sample Degrees of Freedom
55
Total Degrees of Freedom
113
Pooled Variance
361.0688
Standard Error
3.5451
Difference in Sample Means
-0.9134
t Test Statistic
-0.2577
Two-Tail Test Lower Critical Value
-1.9812
Upper Critical Value
1.9812
p-Value
0.7971
Do not reject the null hypothesis
Decision: Since tSTAT = –0.2577 > –1.9812 and the p-value = 0.7971 > 0.05, do not reject H0. There is not sufficient evidence to conclude that the mean expected annual starting salary is different between students who have current plans to attend graduate school and those who do not.
Chapter 11
This case extends the Chapter 10 case and requires the data preparation done for that case. Recall that the "no" and "not sure" categories should be combined to form the category "do not have current plans to attend graduate school." This is best done by redefining the grad school intention column to have "Yes" and "Not Yes" values as a first data preparation step.
NOTE: In initial printings, question 1 lists the variables spending on textbooks and supplies, text messages sent in a week, and the wealth needed to feel rich, an editing error. Those three variables should be replaced by these two: amount spent on course materials and total number of credit hours this semester, which will make the question consistent with the (correct) question 2.
1. Salary (expected starting annual salary), based on academic major
H0: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
218
13.625
175.9833
Computing
5
92
18.4
293.3000
Finance
7
70
10
79.0000
Hospitality management
5
47
9.4
71.3000
International Business
14
64
4.571428571
13.6484
Marketing
17
144
8.470588235
48.6397
OR/Management science
7
28
4
13.0000
Other
6
62
10.33333333
58.5667
Retail management
7
101
14.42857143
104.6190
Statistics or Analytics
13
158
12.15384615
66.6410
Undecided/No major
18
246
13.66666667
103.1765
ANOVA
Source of Variation      SS            df     MS          F         P-value   F crit
Between Groups           1653.7940     10     165.3794    1.8942    0.0540    1.9229
Within Groups            9080.0538     104    87.3082
Total                    10733.8478    114
Level of significance    0.05
Since p-value = 0.0540 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.
1. Salary (expected starting annual salary), based on academic major cont.
H0: $\mu_1 = \mu_2 = \cdots = \mu_{11}$; H1: At least one mean is different.
Decision rule: df: 10, 104. If F > 1.9229, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
786
49.125
359.9833
Computing
5
218
43.6
644.3000
Finance
7
576
82.28571429
157.5714
Hospitality management
5
357
71.4
167.3000
International Business
14
1170
83.57142857
33.4945
Marketing
17
1114
65.52941176
124.6397
OR/Management science
7
443
63.28571429
31.5714
Other
6
384
64
186.4000
Retail management
7
467
66.71428571
345.5714
Statistics or Analytics
13
798
61.38461538
226.2564
Undecided/No major
18
874
48.55555556
298.7320
ANOVA
Source of Variation      SS            df     MS           F         P-value   F crit
Between Groups           17815.1269    10     1781.5127    8.0522    0.0000    1.9229
Within Groups            23009.6209    104    221.2464
Total                    40824.7478    114
Level of significance    0.05
Test statistic: FSTAT = 8.0522
Since p-value = 0.0000 < 0.05, and FSTAT = 8.0522 > 1.9229, reject H0. There is enough evidence to conclude that there is a significant difference in the mean expected starting annual salary across the majors.
Using the Tukey-Kramer multiple comparisons procedure in PHStat, there is a difference in the mean expected starting annual salary between the following majors:
Accounting and Finance
Accounting and International Business
Computing and Finance
Computing and International Business
Finance and Statistics or Analytics
International Business and Marketing
International Business and Statistics or Analytics
International Business and Undecided/No major
Marketing and Undecided/No major
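The one-way ANOVA F statistic for expected starting salary by major can be reproduced from the SUMMARY table alone (count, average, variance per group). The following is a sketch under that assumption; `one_way_anova` is a hypothetical helper, and small differences from the printed SS values come from the rounded variances.

```python
def one_way_anova(groups):
    """groups: list of (count, mean, variance). Returns (ssa, ssw, df_a, df_w, f_stat)."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Among-group variation: group means around the grand mean.
    ssa = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-group variation: recovered from each sample variance.
    ssw = sum((n - 1) * v for n, _, v in groups)
    df_a = len(groups) - 1
    df_w = n_total - len(groups)
    f_stat = (ssa / df_a) / (ssw / df_w)
    return ssa, ssw, df_a, df_w, f_stat


# (count, average, variance) rows from the salary SUMMARY table above.
salary_by_major = [
    (16, 49.125, 359.9833),       # Accounting
    (5, 43.6, 644.3000),          # Computing
    (7, 82.28571429, 157.5714),   # Finance
    (5, 71.4, 167.3000),          # Hospitality management
    (14, 83.57142857, 33.4945),   # International Business
    (17, 65.52941176, 124.6397),  # Marketing
    (7, 63.28571429, 31.5714),    # OR/Management science
    (6, 64.0, 186.4000),          # Other
    (7, 66.71428571, 345.5714),   # Retail management
    (13, 61.38461538, 226.2564),  # Statistics or Analytics
    (18, 48.55555556, 298.7320),  # Undecided/No major
]

ssa, ssw, df_a, df_w, f_stat = one_way_anova(salary_by_major)
f_crit = 1.9229  # copied from the output above
print(df_a, df_w, round(f_stat, 4), f_stat > f_crit)
```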
1. Age, based on academic major cont.
H0: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
29
1.8125
7.2292
Computing
5
3
0.6
0.3000
Finance
7
15
2.142857143
2.4762
Hospitality management
5
8
1.6
1.8000
International Business
14
20
1.428571429
1.9560
Marketing
17
28
1.647058824
2.4926
OR/Management science
7
13
1.857142857
7.1429
Other
6
13
2.166666667
3.8667
Retail management
7
12
1.714285714
3.2381
Statistics or Analytics
13
16
1.230769231
1.1923
Undecided/No major
18
35
1.944444444
0.9673
ANOVA
Source of Variation      SS          df     MS        F         P-value   F crit
Between Groups           14.0667     10     1.4067    0.4729    0.9041    1.9229
Within Groups            309.3768    104    2.9748
Total                    323.4435    114
Level of significance    0.05
Since p-value = 0.9041 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.
1. Age, based on academic major cont.
H0: $\mu_1 = \mu_2 = \cdots = \mu_{11}$; H1: At least one mean is different.
Decision rule: df: 10, 104. If F > 1.9229, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
365
22.8125
10.0292
Computing
5
111
22.2
0.7000
Finance
7
161
23
6.6667
Hospitality management
5
115
23
5.0000
International Business
14
294
21
4.1538
Marketing
17
367
21.58823529
5.0074
OR/Management science
7
149
21.28571429
9.2381
Other
6
129
21.5
8.3000
Retail management
7
165
23.57142857
6.2857
Statistics or Analytics
13
279
21.46153846
2.6026
Undecided/No major
18
387
21.5
4.9706
ANOVA
Source of Variation      SS          df     MS        F         P-value   F crit
Between Groups           69.7147     10     6.9715    1.2130    0.2914    1.9229
Within Groups            597.7288    104    5.7474
Total                    667.4435    114
Level of significance    0.05
Test statistic: FSTAT = 1.2130
Since p-value = 0.2914 > 0.05, and FSTAT = 1.2130 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the mean age differs across the majors.
1. Materials (amount spent on course materials), based on academic major cont.
H0: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
954
59.625
2107.7167
Computing
5
348
69.6
1653.3000
Finance
7
412
58.85714286
3783.4762
Hospitality management
5
201
40.2
2738.2000
International Business
14
908
64.85714286
1628.5934
Marketing
17
991
58.29411765
2323.0956
OR/Management science
7
317
45.28571429
3304.2381
Other
6
398
66.33333333
3490.1667
Retail management
7
295
42.14285714
3236.8095
Statistics or Analytics
13
694
53.38461538
2409.4231
Undecided/No major
18
915
50.83333333
1996.5000
ANOVA
Source of Variation      SS             df     MS           F         P-value   F crit
Between Groups           6985.5271      10     698.5527     0.2909    0.9819    1.9229
Within Groups            249774.5468    104    2401.6783
Total                    256760.0739    114
Level of significance    0.05
Since p-value = 0.9819 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.
1. Materials (amount spent on course materials), based on academic major cont.
H0: $\mu_1 = \mu_2 = \cdots = \mu_{11}$; H1: At least one mean is different.
Decision rule: df: 10, 104. If F > 1.9229, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
2532
158.25
5635.2667
Computing
5
1016
203.2
7611.7000
Finance
7
1031
147.2857143
6891.5714
Hospitality management
5
871
174.2
2972.2000
International Business
14
2180
155.7142857
6010.8352
Marketing
17
2725
160.2941176
5651.5956
OR/Management science
7
1179
168.4285714
3869.9524
Other
6
1178
196.3333333
8180.6667
Retail management
7
819
117
3879.6667
Statistics or Analytics
13
1874
144.1538462
5056.8077
Undecided/No major
18
2763
153.5
4329.9118
ANOVA
Source of Variation      SS             df     MS           F         P-value   F crit
Between Groups           36696.3102     10     3669.6310    0.6834    0.7377    1.9229
Within Groups            558471.6551    104    5369.9198
Total                    595167.9652    114
Level of significance    0.05
Test statistic: FSTAT = 0.6834
Since p-value = 0.7377 > 0.05, and FSTAT = 0.6834 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the mean amount spent on course materials differs across the majors.
1. Hours (total number of credit hours this semester), based on academic major cont.
H0: $\sigma_1^2 = \sigma_2^2 = \cdots = \sigma_{11}^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
81
5.0625
4.4625
Computing
5
18
3.6
29.3000
Finance
7
29
4.142857143
15.1429
Hospitality management
5
11
2.2
7.7000
International Business
14
65
4.642857143
11.0165
Marketing
17
80
4.705882353
4.2206
OR/Management science
7
23
3.285714286
14.5714
Other
6
21
3.5
9.6000
Retail management
7
27
3.857142857
6.4762
Statistics or Analytics
13
60
4.615384615
15.7564
Undecided/No major
18
85
4.722222222
10.3007
ANOVA
Source of Variation      SS           df     MS         F         P-value   F crit
Between Groups           55.0749      10     5.5075     0.5429    0.8559    1.9229
Within Groups            1055.0121    104    10.1443
Total                    1110.0870    114
Level of significance    0.05
Since p-value = 0.8559 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the mean can be used.
1. Hours (total number of credit hours this semester), based on academic major cont.
H0: $\mu_1 = \mu_2 = \cdots = \mu_{11}$; H1: At least one mean is different.
Decision rule: df: 10, 104. If F > 1.9229, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups
Count
Sum
Average
Variance
Accounting
16
143
8.9375
31.7958
Computing
5
66
13.2
35.7000
Finance
7
67
9.571428571
28.2857
Hospitality management
5
70
14
12.5000
International Business
14
135
9.642857143
32.2473
Marketing
17
143
8.411764706
27.3824
OR/Management science
7
75
10.71428571
25.2381
Other
6
79
13.16666667
22.1667
Retail management
7
56
8
22.6667
Statistics or Analytics
13
103
7.923076923
28.5769
Undecided/No major
18
187
10.38888889
32.6046
ANOVA
Source of Variation      SS           df     MS         F         P-value   F crit
Between Groups           339.8753     10     33.9875    1.1813    0.3119    1.9229
Within Groups            2992.2465    104    28.7716
Total                    3332.1217    114
Level of significance    0.05
Test statistic: FSTAT = 1.1813
Since p-value = 0.3119 > 0.05, and FSTAT = 1.1813 < 1.9229, do not reject H0. There is insufficient evidence to conclude that the mean total number of credit hours this semester differs across the majors.
2. GPA, based on graduate school intention
H0: $\sigma_1^2 = \sigma_2^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups     Count    Sum      Average        Variance
Yes        56       22.25    0.397321429    0.1188
Not Yes    59       27.62    0.468135593    0.1479
ANOVA
Source of Variation      SS         df     MS        F         P-value   F crit
Between Groups           0.1441     1      0.1441    1.0773    0.3015    3.9251
Within Groups            15.1126    113    0.1337
Total                    15.2567    114
Level of significance    0.05
Since p-value = 0.3015 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2$; H1: At least one mean is different.
Decision rule: df: 1, 113. If F > 3.9251, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups     Count    Sum       Average        Variance
Yes        56       189.77    3.38875        0.2620
Not Yes    59       191.7     3.249152542    0.3624
ANOVA
Source of Variation      SS         df     MS        F         P-value   F crit
Between Groups           0.5599     1      0.5599    1.7856    0.1841    3.9251
Within Groups            35.4313    113    0.3136
Total                    35.9912    114
Level of significance    0.05
Test statistic: FSTAT = 1.7856
Since p-value = 0.1841 > 0.05, and FSTAT = 1.7856 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the mean GPA differs between the graduate school intention groups.
2. Salary (expected starting annual salary), based on graduate school intention cont.
H0: $\sigma_1^2 = \sigma_2^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups     Count    Sum    Average        Variance
Yes        56       884    15.78571429    129.1896
Not Yes    59       892    15.11864407    111.0374
ANOVA
Source of Variation      SS            df     MS          F         P-value   F crit
Between Groups           12.7845       1      12.7845     0.1067    0.7446    3.9251
Within Groups            13545.5981    113    119.8725
Total                    13558.3826    114
Level of significance    0.05
Since p-value = 0.7446 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2$; H1: At least one mean is different.
Decision rule: df: 1, 113. If F > 3.9251, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes        56       3526    62.96428571    380.5078
Not Yes    59       3661    62.05084746    342.6353
ANOVA
Source of Variation      SS            df     MS          F         P-value   F crit
Between Groups           23.9718       1      23.9718     0.0664    0.7971    3.9251
Within Groups            40800.7760    113    361.0688
Total                    40824.7478    114
Level of significance    0.05
Test statistic: FSTAT = 0.0664
Since p-value = 0.7971 > 0.05, and FSTAT = 0.0664 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the mean expected starting annual salary differs between the graduate school intention groups.
2. Age, based on graduate school intention cont.
H0: $\sigma_1^2 = \sigma_2^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups     Count    Sum    Average        Variance
Yes        56       112    2              3.4182
Not Yes    59       101    1.711864407    1.4845
ANOVA
Source of Variation      SS          df     MS        F         P-value   F crit
Between Groups           2.3853      1      2.3853    0.9833    0.3235    3.9251
Within Groups            274.1017    113    2.4257
Total                    276.4870    114
Level of significance    0.05
Since p-value = 0.3235 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2$; H1: At least one mean is different.
Decision rule: df: 1, 113. If F > 3.9251, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes        56       1252    22.35714286    7.3610
Not Yes    59       1270    21.52542373    4.1847
ANOVA
Source of Variation      SS          df     MS         F         P-value   F crit
Between Groups           19.8745     1      19.8745    3.4681    0.0652    3.9251
Within Groups            647.5690    113    5.7307
Total                    667.4435    114
Level of significance    0.05
Test statistic: FSTAT = 3.4681
Since p-value = 0.0652 > 0.05, and FSTAT = 3.4681 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the mean age differs between the graduate school intention groups.
2. Materials (amount spent on course materials), based on graduate school intention cont.
H0: $\sigma_1^2 = \sigma_2^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups     Count    Sum     Average        Variance
Yes        56       3517    62.80357143    2781.1607
Not Yes    59       3393    57.50847458    1786.1853
ANOVA
Source of Variation      SS             df     MS           F         P-value   F crit
Between Groups           805.5454       1      805.5454     0.3548    0.5526    3.9251
Within Groups            256562.5850    113    2270.4654
Total                    257368.1304    114
Level of significance    0.05
Since p-value = 0.5526 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2$; H1: At least one mean is different.
Decision rule: df: 1, 113. If F > 3.9251, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups     Count    Sum     Average        Variance
Yes        56       8515    152.0535714    5581.5062
Not Yes    59       9653    163.6101695    4902.5523
ANOVA
Source of Variation      SS             df     MS           F         P-value   F crit
Between Groups           3837.0920      1      3837.0920    0.7332    0.3936    3.9251
Within Groups            591330.8732    113    5233.0166
Total                    595167.9652    114
Level of significance    0.05
Test statistic: FSTAT = 0.7332
Since p-value = 0.3936 > 0.05, and FSTAT = 0.7332 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the mean amount spent on course materials differs between the graduate school intention groups.
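With only two groups, the one-way ANOVA F test is equivalent to the pooled-variance t test: FSTAT equals the square of tSTAT, and the within-groups mean square equals the pooled variance. The short check below is a sketch that uses the SS values from the ANOVA table for course materials and the tSTAT = 0.8563 reported for the same variable in the Chapter 10 case; the variable names are illustrative.

```python
import math

# Values copied from the ANOVA table for amount spent on course materials.
ss_between, df_between = 3837.0920, 1
ss_within, df_within = 591330.8732, 113

ms_between = ss_between / df_between
ms_within = ss_within / df_within  # same quantity as the pooled variance
f_stat = ms_between / ms_within

t_stat_chapter10 = 0.8563  # pooled-variance tSTAT from the Chapter 10 case

print(round(f_stat, 4), round(math.sqrt(f_stat), 4))
```

This also explains why the ANOVA p-value (0.3936) matches the two-tail p-value of the pooled-variance t test.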
2. Hours (total number of credit hours this semester), based on graduate school intention cont.
H0: $\sigma_1^2 = \sigma_2^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups     Count    Sum    Average        Variance
Yes        56       249    4.446428571    6.3607
Not Yes    59       279    4.728813559    14.6493
ANOVA
Source of Variation      SS           df     MS         F         P-value   F crit
Between Groups           2.2910       1      2.2910     0.2158    0.6431    3.9251
Within Groups            1199.5003    113    10.6150
Total                    1201.7913    114
Level of significance    0.05
Since p-value = 0.6431 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2$; H1: At least one mean is different.
Decision rule: df: 1, 113. If F > 3.9251, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups     Count    Sum    Average        Variance
Yes        56       547    9.767857143    26.4360
Not Yes    59       577    9.779661017    32.3816
ANOVA
Source of Variation      SS           df     MS         F         P-value   F crit
Between Groups           0.0040       1      0.0040     0.0001    0.9907    3.9251
Within Groups            3332.1177    113    29.4878
Total                    3332.1217    114
Level of significance    0.05
Test statistic: FSTAT = 0.0001
Since p-value = 0.9907 > 0.05, and FSTAT = 0.0001 < 3.9251, do not reject H0. There is insufficient evidence to conclude that the mean total number of credit hours this semester differs between the graduate school intention groups.
Chapter 12
This case provides practice in performing chi-square tests. As written, the three questions can be easily modified by eliminating one or more variables that the questions name.
1. Student Status and Major
H0: There is no relationship between student status and major.
H1: There is a relationship between student status and major.
Since $\chi^2_{STAT}$ = 6.8310 is lower than the critical value of 18.3070, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and major.
1. Student Status and Graduate school intention cont.
H0: There is no relationship between student status and graduate school intention.
H1: There is a relationship between student status and graduate school intention.
Chi-Square Test
Observed Frequencies
                         Student Status
Grad School Intention    F/T    P/T    Total
No                       9      5      14
Not sure                 23     22     45
Yes                      26     30     56
Total                    58     57     115
Expected Frequencies
                         Student Status
Grad School Intention    F/T         P/T         Total
No                       7.06087     6.93913     14
Not sure                 22.69565    22.30435    45
Yes                      28.24348    27.75652    56
Total                    58          57          115
Data
Level of Significance
0.05
Number of Rows
3
Number of Columns
2
Degrees of Freedom
2
Results
Critical Value
5.991465
Chi-Square Test Statistic
1.442207
p-Value
0.486215
Do not reject the null hypothesis
Expected frequency assumption is met.
Since $\chi^2_{STAT}$ = 1.442207 is lower than the critical value of 5.991465, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and graduate school intention.
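The expected frequencies and the chi-square statistic for student status by graduate school intention can be recomputed from the observed table. This is a minimal sketch: `chi_square_independence` is a hypothetical helper, and the closed-form p-value and critical value used here hold only because this table has 2 degrees of freedom.

```python
import math


def chi_square_independence(observed):
    """Return (expected, statistic, df) for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    # Expected frequency = (row total * column total) / overall total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, stat, df


# Rows: No, Not sure, Yes; columns: F/T, P/T.
observed = [[9, 5], [23, 22], [26, 30]]
expected, chi2_stat, df = chi_square_independence(observed)

# Special case for df = 2: upper-tail probability is exp(-x / 2).
p_value = math.exp(-chi2_stat / 2)
critical_value = -2 * math.log(0.05)

print(df, round(chi2_stat, 6), round(p_value, 6), round(critical_value, 6))
```

For tables with other degrees of freedom, the p-value would need a chi-square distribution function from a statistics library rather than this shortcut.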
1. Student status and employment status cont.
H0: There is no relationship between student status and employment status.
H1: There is a relationship between student status and employment status.
Chi-Square Test
Observed Frequencies
              Student Status
Employment    F/T    P/T    Total
F/T           13     7      20
Not           7      12     19
P/T           38     38     76
Total         58     57     115
Expected Frequencies
              Student Status
Employment    F/T         P/T         Total
F/T           10.08696    9.913043    20
Not           9.582609    9.417391    19
P/T           38.33043    37.66957    76
Total         58          57          115
Data
Level of Significance
0.05
Number of Rows
3
Number of Columns
2
Degrees of Freedom
2
Results
Critical Value
5.991465
Chi-Square Test Statistic
3.107329
p-Value
0.211472
Do not reject the null hypothesis
Expected frequency assumption is met.
Since $\chi^2_{STAT}$ = 3.107329 is lower than the critical value of 5.991465, do not reject H0. There is not enough evidence to conclude there is a relationship between student status and employment status.
1. Graduate school intention and Major cont.
H0: There is no relationship between graduate school intention and major.
H1: There is a relationship between graduate school intention and major.
The expected frequency assumption for the $\chi^2$ test is violated. The test results above might not be reliable.
Since $\chi^2_{STAT}$ = 37.98281 is higher than the critical value of 31.41043, reject H0. There is evidence to conclude there is a relationship between graduate school intention and major.
1. Employment status and Major cont.
H0: There is no relationship between employment status and major.
H1: There is a relationship between employment status and major.
The expected frequency assumption for the $\chi^2$ test is violated. The test results above might not be reliable.
Since $\chi^2_{STAT}$ = 13.15548 is lower than the critical value of 31.41043, do not reject H0. There is not enough evidence to conclude there is a relationship between employment status and major.
1. Graduate school intention and employment status cont.
H0: There is no relationship between graduate school intention and employment status.
H1: There is a relationship between graduate school intention and employment status.
Chi-Square Test
Observed Frequencies
                         Employment Status
Grad School Intention    F/T    Not    P/T    Total
No                       2      2      10     14
Not sure                 7      10     28     45
Yes                      11     7      38     56
Total                    20     19     76     115
Expected Frequencies
                         Employment Status
Grad School Intention    F/T         Not         P/T         Total
No                       2.434783    2.313043    9.252174    14
Not sure                 7.826087    7.434783    29.73913    45
Yes                      9.73913     9.252174    37.0087     56
Total                    20          19          76          115
Data
Level of Significance
0.05
Number of Rows
3
Number of Columns
3
Degrees of Freedom
4
Results
Critical Value
9.487729
Chi-Square Test Statistic
1.992445
p-Value
0.737149
Do not reject the null hypothesis
Expected frequency assumption is met.
Since $\chi^2_{STAT}$ = 1.992445 is lower than the critical value of 9.487729, do not reject H0. There is not enough evidence to conclude there is a relationship between graduate school intention and employment status.
2. GPA: Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
3216
Population 2 Sample Sample Size
58
Sum of Ranks
3454
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
3216
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
-0.503446
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.6147
Do not reject the null hypothesis
Since p-value = 0.6147 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median grade point average.
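The intermediate calculations in the Wilcoxon rank sum output (T1 mean, standard error of T1, Z, and the two-tail p-value) follow from the sample sizes and the rank sum T1. A sketch of the large-sample normal approximation, assuming no tie correction; the function name `wilcoxon_rank_sum_z` is hypothetical.

```python
import math


def wilcoxon_rank_sum_z(t1, n1, n2):
    """Normal approximation; t1 is the rank sum of the sample of size n1."""
    n = n1 + n2
    mean_t1 = n1 * (n + 1) / 2
    se_t1 = math.sqrt(n1 * n2 * (n + 1) / 12)
    z = (t1 - mean_t1) / se_t1
    # Two-tail p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_two_tail = math.erfc(abs(z) / math.sqrt(2))
    return mean_t1, se_t1, z, p_two_tail


# GPA: Population 1 = P/T (n1 = 57, T1 = 3216), Population 2 = F/T (n2 = 58)
mean_t1, se_t1, z, p_value = wilcoxon_rank_sum_z(3216, 57, 58)

print(mean_t1, round(se_t1, 4), round(z, 6), round(p_value, 4))
```

The same function applies to the remaining variables in questions 2 and 3 by substituting the T1 and sample sizes shown in each output.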
2. Expected starting salary: cont. Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
3312.5
Population 2 Sample Sample Size
58
Sum of Ranks
3357.5
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
3312.5
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
0.03636
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.9710
Do not reject the null hypothesis
Since p-value = 0.9710 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median expected starting salary.
2. Age: cont. Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
3122.5
Population 2 Sample Sample Size
58
Sum of Ranks
3547.5
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
3122.5
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
-1.02647
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.3047
Do not reject the null hypothesis
Since p-value = 0.3047 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median age.
2. Spending on course materials: cont. Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
1765
Population 2 Sample Sample Size
58
Sum of Ranks
4905
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
1765
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
-8.620111
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.0000
Reject the null hypothesis
Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference between full-time and part-time students in median spending on course materials.
2. Number of times visited the Student & Post-Graduate Development Center: cont. Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
1853
Population 2 Sample Sample Size
58
Sum of Ranks
4817
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
1853
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
-8.127853
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.0000
Reject the null hypothesis
Since p-value = 0.0000 < 0.05, reject H0. There is evidence of a difference between full-time and part-time students in the median number of times visited the Student & Post-Graduate Development Center.
2. Amount of current outstanding student loans: cont. Population 1 = P/T (57), 2 = F/T (58)
H0: $M_1 = M_2$
H1: $M_1 \neq M_2$
Wilcoxon Rank Sum Test
Data Level of Significance
0.05
Population 1 Sample Sample Size
57
Sum of Ranks
3168
Population 2 Sample Sample Size
58
Sum of Ranks
3502
Intermediate Calculations Total Sample Size n
115
T1 Test Statistic
3168
T1 Mean
3306
Standard Error of T1
178.7680
Z Test Statistic
-0.77195
Two-Tail Test Lower Critical Value
-1.9600
Upper Critical Value
1.9600
p-Value
0.4401
Do not reject the null hypothesis
Copyright ©2024 Pearson Education, Inc.
Solutions to End-of-Section and Chapter Review Problems cdxci Since p-value = 0.4401 > 0.05, do not reject H0. There is not enough evidence of any difference between full-time and part-time students in median amount of current outstanding student loans.
Copyright ©2024 Pearson Education, Inc.
3. GPA:
Population 1 = Not yes (59), 2 = Yes (56) (Graduate school intentions, where “Not yes” created from “No” and “Not sure”)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
  Level of Significance      0.05
Population 1 Sample
  Sample Size                59
  Sum of Ranks               3174
Population 2 Sample
  Sample Size                56
  Sum of Ranks               3496
Intermediate Calculations
  Total Sample Size n        115
  T1 Test Statistic          3496
  T1 Mean                    3248
  Standard Error of T1       178.7139
  Z Test Statistic           1.3876927
Two-Tail Test
  Lower Critical Value       -1.9600
  Upper Critical Value       1.9600
  p-Value                    0.1652
  Do not reject the null hypothesis

Since p-value = 0.1652 > 0.05, do not reject H0. There is not enough evidence of a difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median grade point average.
3. Expected starting salary: cont.
Population 1 = Not yes (59), 2 = Yes (56) (Graduate school intentions, where “Not yes” created from “No” and “Not sure”)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
  Level of Significance      0.05
Population 1 Sample
  Sample Size                59
  Sum of Ranks               3361
Population 2 Sample
  Sample Size                56
  Sum of Ranks               3309
Intermediate Calculations
  Total Sample Size n        115
  T1 Test Statistic          3309
  T1 Mean                    3248
  Standard Error of T1       178.7139
  Z Test Statistic           0.3413276
Two-Tail Test
  Lower Critical Value       -1.9600
  Upper Critical Value       1.9600
  p-Value                    0.7329
  Do not reject the null hypothesis

Since p-value = 0.7329 > 0.05, do not reject H0. There is not enough evidence of a difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median expected starting salary.
3. Age: cont.
Population 1 = Not yes (59), 2 = Yes (56) (Graduate school intentions, where “Not yes” created from “No” and “Not sure”)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
  Level of Significance      0.05
Population 1 Sample
  Sample Size                59
  Sum of Ranks               3202.5
Population 2 Sample
  Sample Size                56
  Sum of Ranks               3467.5
Intermediate Calculations
  Total Sample Size n        115
  T1 Test Statistic          3467.5
  T1 Mean                    3248
  Standard Error of T1       178.7139
  Z Test Statistic           1.2282199
Two-Tail Test
  Lower Critical Value       -1.9600
  Upper Critical Value       1.9600
  p-Value                    0.2194
  Do not reject the null hypothesis

Since p-value = 0.2194 > 0.05, do not reject H0. There is not enough evidence of a difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median age.
3. Spending for course materials: cont.
Population 1 = Not yes (59), 2 = Yes (56) (Graduate school intentions, where “Not yes” created from “No” and “Not sure”)
H0: M1 = M2
H1: M1 ≠ M2

Wilcoxon Rank Sum Test
Data
  Level of Significance      0.05
Population 1 Sample
  Sample Size                59
  Sum of Ranks               3610
Population 2 Sample
  Sample Size                56
  Sum of Ranks               3060
Intermediate Calculations
  Total Sample Size n        115
  T1 Test Statistic          3060
  T1 Mean                    3248
  Standard Error of T1       178.7139
  Z Test Statistic           -1.051961
Two-Tail Test
  Lower Critical Value       -1.9600
  Upper Critical Value       1.9600
  p-Value                    0.2928
  Do not reject the null hypothesis

Since p-value = 0.2928 > 0.05, do not reject H0. There is not enough evidence of a difference between students who plan to go to graduate school and those who do not plan to go to graduate school in median spending for course materials.
The Shelter Bay Lifestyles Case
Chapter 1
1.
2.
3.
4.
Gender is not well-defined. Suggestions could include: Man, Woman, Both, Neither, Other. A free-response gender might also be considered; “free response” is an acceptable definition, although one that may not be amenable to data analysis. Annual household income might need more specification, such as thousands of dollars.

This somewhat depends on how the survey is administered. If online, through choices offered on screens, then annual household income may need some recoding (because one respondent might write $30K while another might write 30,000), as might gender (to eliminate variations of the same category).

Product purchased: categorical, nominal, EX-10, EX-11, AIX-12; store: categorical, nominal, store identifier; years as customer: numerical, discrete, whole number; age: categorical, ordinal, under 18, 18-34, 35-54, 55 or older; gender: categorical, nominal, free response; education: categorical, ordinal, high school or lower, college graduate, master’s degree, doctorate; relationship status: categorical, nominal, single, partnered, separated; income: numerical, discrete, ratio, a nonnegative dollars-and-cents number; ZIP code: categorical, nominal, five digits; planned use: numerical, ratio, whole number; planned miles: numerical, ratio; self-rated fitness: categorical, ordinal, 1, 2, 3, 4, or 5.

The data source should be a probability sample.
Chapter 2
1.
[Charts not reproduced; one set per product.]
EX-10: Years as Customer; Annual Household Income; Weekly Usage; Elliptical Miles; Fitness
EX-11: Years as Customer; Annual Household Income; Weekly Usage; Elliptical Miles; Fitness
AIX-12: Years as Customer; Annual Household Income; Weekly Usage; Elliptical Miles; Fitness
2.
EX-10
Web store sold 32, followed by Oxford Glen with 19, Springville with 13, Ashland with 12, and Galleria with just 4. EX-10 was preferred by the 18-34 age group, women, people with a master’s degree, and those whose relationship status is partnered. The average years as an EX-10 customer is 3.26 with a median of 3 years. The average annual household income of an EX-10 customer is $46,418, with a median of $46,617, although the minimum is $29,600 and the maximum is $68,200. The average weekly usage of the EX-10 is 3.0875 with a median of 3. The average number of elliptical miles of the EX-10 is 82.71 with a median of 84.6. The average level of fitness is 2.96 with a median of 3.00.

EX-11
Oxford Glen store sold 21, followed by Web with 15, Ashland with 12, Springville with 7, and Galleria with just 5. EX-11 was preferred by the 18-34 age group, women, people with a master’s degree, and those whose relationship status is partnered. The average years as an EX-11 customer is 3.17 with a median of 2.95 years. The average annual household income of an EX-11 customer is $48,974, with a median of $49,460, although the minimum is $31,800 and the maximum is $67,100. The average weekly usage of the EX-11 is 3.0667 with a median of 3. The average number of elliptical miles of the EX-11 is 87.98 with a median of 84.8. The average level of fitness is 2.90 with a median of 3.00.

AIX-12
Web store sold 14, followed by Oxford Glen with 10, Ashland with 8, Springville with 5, and Galleria with just 3. AIX-12 was preferred by the 18-34 age group, men, people with a master’s degree, and those whose relationship status is partnered. The average years as an AIX-12 customer is 3.57 with a median of 3.25 years. The average annual household income of an AIX-12 customer is $75,442, with a median of $76,569, although the minimum is $49,000 and the maximum is $105,000. The average weekly usage of the AIX-12 is 4.775 with a median of 5. The average number of elliptical miles of the AIX-12 is 166.9 with a median of 160, although the minimum is 80 and the maximum is 360. The average level of fitness is 4.625 with a median of 5.000.
Chapter 3
1.
EX-10 descriptive summary:
                     Years As Customer  Annual Household Income  Weekly Usage  Elliptical Miles  Fitness
Mean                 3.26375            46418.025                3.0875        82.71             2.9625
Median               3                  46617                    3             84.6              3
Mode                 2.1                46617                    3             84.6              3
Minimum              1.4                29562                    2             37.6              1
Maximum              6.8                68220                    5             188               5
Range                5.4                38658                    3             150.4             4
Variance             2.0596             82369840.5057            0.6125        832.4434          0.4416
Standard Deviation   1.4351             9075.7832                0.7826        28.8521           0.6645
Coeff. of Variation  43.97%             19.55%                   25.35%        34.88%            22.43%
Skewness             0.5641             0.1766                   0.1691        1.0153            0.3065
Kurtosis             -0.6319            -0.6213                  -0.6205       1.8517            1.9068
Count                80                 80                       80            80                80
Standard Error       0.1605             1014.7034                0.0875        3.2258            0.0743
First Quartile       2                  38658                    3             65.8              3
Third Quartile       4.3                53439                    4             94                3

EX-11 descriptive summary:
                     Years As Customer  Annual Household Income  Weekly Usage  Elliptical Miles  Fitness
Mean                 3.17               48973.65                 3.066666667   87.98             2.9
Median               2.95               49459.5                  3             84.8              3
Mode                 1.8                45480                    3             95.4              3
Minimum              1.3                31836                    2             21.2              1
Maximum              8.6                67083                    5             212               4
Range                7.3                35247                    3             190.8             3
Variance             2.6428             74891532.3331            0.6395        1105.6986         0.3966
Standard Deviation   1.6257             8653.9894                0.7997        33.2520           0.6298
Coeff. of Variation  51.28%             17.67%                   26.08%        37.80%            21.72%
Skewness             1.1797             -0.0105                  0.4949        1.0859            -0.3454
Kurtosis             1.2182             -0.3250                  0.0132        2.7948            0.7326
Count                60                 60                       60            60                60
Standard Error       0.2099             1117.2252                0.1032        4.2928            0.0813
First Quartile       1.8                43206                    3             63.6              3
Third Quartile       4                  53439                    4             106               3
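Two rows of these summaries are derived from the others: the coefficient of variation is the standard deviation as a percentage of the mean, and the standard error is the standard deviation divided by the square root of the count. A minimal sketch, using the EX-10 annual household income column as input (the function names are illustrative, not taken from the manual):

```python
import math

def coefficient_of_variation(std_dev, mean):
    """Standard deviation expressed as a percentage of the mean."""
    return std_dev / mean * 100

def standard_error(std_dev, n):
    """Standard error of the mean for a sample of size n."""
    return std_dev / math.sqrt(n)

# EX-10 annual household income: mean, standard deviation, count from the table
cv = coefficient_of_variation(9075.7832, 46418.025)
se = standard_error(9075.7832, 80)
print(f"CV = {cv:.2f}%, standard error = {se:.4f}")
```

Because the table reports rounded standard deviations, recomputing other columns this way can differ from the printed value in the last decimal place.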
1. AIX-12 descriptive summary: cont.
                     Years As Customer  Annual Household Income  Weekly Usage  Elliptical Miles  Fitness
Mean                 3.5675             75441.575                4.775         166.9             4.625
Median               3.25               76568.5                  5             160               5
Mode                 3.5                90886                    4             100               5
Minimum              1.2                48556                    3             80                3
Maximum              8.1                104581                   7             360               5
Range                6.9                56025                    4             280               2
Variance             3.0058             342465992.7122           0.8968        3607.9897         0.4455
Standard Deviation   1.7337             18505.8367               0.9470        60.0665           0.6675
Coeff. of Variation  48.60%             24.53%                   19.83%        35.99%            14.43%
Skewness             0.8251             -0.0796                  0.6694        1.1340            -1.5742
Kurtosis             0.2614             -1.4521                  -0.2245       1.8213            1.2049
Count                40                 40                       40            40                40
Standard Error       0.2741             2926.0297                0.1497        9.4974            0.1055
First Quartile       2.3                57271                    4             120               4
Third Quartile       4.3                90886                    5             200               5

2.
EX-10
Half of the purchasers of EX-10 have been customers for 3 years or less, with a mean of 3.26 years. The standard deviation of years as a customer is 1.44, with the fewest years at 1.4 and the most at 6.8. The middle 50% of the customers fall between 2 and 4.3 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $46,617 or less, with a mean annual income of $46,418. The standard deviation of annual income is $9,076, with a minimum income of $29,562 and a maximum income of $68,220. The middle 50% of the customers have an annual income that falls between $38,658 and $53,439. The annual household income is almost symmetrical.

Half of the customers of EX-10 have a usage value of 3 or less, with a mean of 3.09. The standard deviation of usage is 0.783, with a lowest value of 2 and a highest of 5. The middle 50% of the usage values fall between 3 and 4. The usage value is nearly symmetrical.

Half of the customers of EX-10 expect to cover 84.6 elliptical miles or fewer each week, with a mean of 82.71 miles. The standard deviation of miles per week is 28.85, with a lowest value of 37.6 miles and a highest value of 188 miles. The middle 50% of the customers expect to use the elliptical between 65.8 miles and 94 miles per week. The number of miles the customer expects to use the elliptical each week is left-skewed.

Half of the fitness values are 3 or below, with a mean value of 2.96. The standard deviation of the fitness value is 0.6645, with a lowest value of 1 and a highest of 5. The middle 50% of the fitness values all equal 3. The fitness value is symmetrical.

EX-11
Half of the purchasers of EX-11 have been customers for 2.95 years or less, with a mean of 3.17 years. The standard deviation of years as a customer is 1.63, with the fewest years at 1.3 and the most at 8.6. The middle 50% of the customers fall between 1.8 and 4 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $49,459.50 or less, with a mean annual income of $48,974. The standard deviation of annual income is $8,654, with a minimum income of $31,836 and a maximum income of $67,083. The middle 50% of the customers have an annual income that falls between $43,206 and $53,439. The annual household income is left-skewed.

Half of the customers of EX-11 have a usage value of 3 or less, with a mean of 3.067. The standard deviation of usage is 0.80, with a lowest value of 2 and a highest of 5. The middle 50% of the usage values fall between 3 and 4. The usage value is nearly symmetrical.

Half of the customers of EX-11 expect to cover 84.8 elliptical miles or fewer each week, with a mean of 87.98 miles. The standard deviation of miles per week is 33.25, with a lowest value of 21.2 miles and a highest value of 212 miles. The middle 50% of the customers expect to use the elliptical between 63.6 miles and 106 miles per week. The number of miles the customer expects to use the elliptical each week is right-skewed.

Half of the fitness values are 3 or below, with a mean value of 2.9. The standard deviation of the fitness value is 0.63, with a lowest value of 1 and a highest of 4. The middle 50% of the fitness values all equal 3. The fitness value is left-skewed.

AIX-12
Half of the purchasers of AIX-12 have been customers for 3.25 years or less, with a mean of 3.57 years. The standard deviation of years as a customer is 1.73, with the fewest years at 1.2 and the most at 8.1. The middle 50% of the customers fall between 2.3 and 4.3 years as a customer. Years as a customer is right-skewed.

Half of the customers have an annual household income of $76,568.50 or less, with a mean annual income of $75,442. The standard deviation of annual income is $18,506, with a minimum income of $48,556 and a maximum income of $104,581. The middle 50% of the customers have an annual income that falls between $57,271 and $90,886. The annual household income is almost symmetrical.

Half of the customers of AIX-12 have a usage value of 5 or less, with a mean of 4.775. The standard deviation of usage is 0.947, with a lowest value of 3 and a highest of 7. The middle 50% of the usage values fall between 4 and 5. The usage value is left-skewed.

Half of the customers of AIX-12 expect to cover 160 elliptical miles or fewer each week, with a mean of 166.9 miles. The standard deviation of miles per week is 60.07, with a lowest value of 80 miles and a highest value of 360 miles. The middle 50% of the customers expect to use the elliptical between 120 miles and 200 miles per week. The number of miles the customer expects to use the elliptical each week is right-skewed.

Half of the fitness values are 5 or below, with a mean value of 4.625. The standard deviation of the fitness value is 0.6675, with a lowest value of 3 and a highest of 5. The middle 50% of the fitness values are between 4 and 5. The fitness value is left-skewed.
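The shape descriptions in the Chapter 3 summaries appear to follow a comparison of the mean with the median: mean above median read as right-skewed, mean below median read as left-skewed. That is an inference from the reported numbers, not a rule the manual states, and small differences are evidently judged "nearly symmetrical" by eye. A minimal sketch of the comparison, with a hypothetical function name:

```python
def skew_direction(mean, median):
    """Classify shape by comparing mean and median (a rule of thumb only)."""
    if mean > median:
        return "right-skewed"
    if mean < median:
        return "left-skewed"
    return "symmetrical"

# EX-10 years as customer: mean 3.26375, median 3
print(skew_direction(3.26375, 3))
# EX-10 elliptical miles: mean 82.71, median 84.6
print(skew_direction(82.71, 84.6))
```

This rule can disagree with the skewness statistic in the descriptive summary. For EX-10 elliptical miles the mean is below the median, yet the reported skewness is 1.0153 (positive), so the two measures point in opposite directions for that variable.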
Chapter 4
All the contingency tables are given here for EX-10. Similar steps can be made for EX-11 and AIX-12.
1.
EX-10 Gender, Education
Count of Highest Level of Education
                   Education
Gender             college graduate  high school or lower  master’s degree  Grand Total
Cis Man            0                 1                     2                3
Man                1                 1                     17               19
Non-binary         0                 4                     2                6
Prefer not to say  1                 3                     3                7
Trans              0                 2                     1                3
Trans Woman        0                 0                     1                1
Woman              2                 8                     14               24
(blank)            0                 11                    6                17
Grand Total        4                 30                    46               80
EX-10 Gender, Relationship
Count of Relationship Status
                   Relationship
Gender             Partnered  Single  Grand Total
Cis Man            2          1       3
Man                10         9       19
Non-binary         3          3       6
Prefer not to say  3          4       7
Trans              3          0       3
Trans Woman        1          0       1
Woman              16         8       24
(blank)            10         7       17
Grand Total        48         32      80
EX-10 Gender, Fitness
Count of Fitness
                   Fitness
Gender             1   2   3   4   5   Grand Total
Cis Man            0   0   3   0   0   3
Man                1   2   13  2   1   19
Non-binary         0   1   4   1   0   6
Prefer not to say  0   0   7   0   0   7
Trans              0   2   1   0   0   3
Trans Woman        0   0   0   1   0   1
Woman              0   6   16  1   1   24
(blank)            0   3   10  4   0   17
Grand Total        1   14  54  9   2   80

1. EX-10 Degree, Relationship cont.
Count of Highest Level of Education
                      Relationship
Education             Partnered  Single  Grand Total
college graduate      2          2       4
high school or lower  18         12      30
Master’s degree       28         18      46
Grand Total           48         32      80
EX-10 Degree, Fitness
Count of Highest Level of Education
                      Fitness
Education             1   2   3   4   5   Grand Total
college graduate      0   1   2   1   0   4
high school or lower  0   6   20  4   0   30
Master’s degree       1   7   32  4   2   46
Grand Total           1   14  54  9   2   80
EX-10 Relationship, Fitness
Count of Relationship Status
              Fitness
Relationship  1   2   3   4   5   Grand Total
Partnered     1   11  31  4   1   48
Single        0   3   23  5   1   32
Grand Total   1   14  54  9   2   80
NOTE: All the conditional and marginal probabilities are given here for EX-10. Similar steps can be made for EX-11 and AIX-12.
2.
EX-10 Gender, Education
Joint and Marginal Probabilities P(Gender, Education)
                   Education
Gender             college graduate  high school or lower  Master’s degree  P(Gender)
Cis Man            0.0000            0.0125                0.0250           0.0375
Man                0.0125            0.0125                0.2125           0.2375
Non-binary         0.0000            0.0500                0.0250           0.0750
Prefer not to say  0.0125            0.0375                0.0375           0.0875
Trans              0.0000            0.0250                0.0125           0.0375
Trans Woman        0.0000            0.0000                0.0125           0.0125
Woman              0.0250            0.1000                0.1750           0.3000
(blank)            0.0000            0.1375                0.0750           0.2125
P(Education)       0.0500            0.3750                0.5750           1.0000
EX-10 Gender, Education
Conditional Probabilities P(Education|Gender)
                   Education
Gender             college graduate  high school or lower  Master’s degree
Cis Man            0.0000            0.3333                0.6667
Man                0.0526            0.0526                0.8947
Non-binary         0.0000            0.6667                0.3333
Prefer not to say  0.1429            0.4286                0.4286
Trans              0.0000            0.6667                0.3333
Trans Woman        0.0000            0.0000                1.0000
Woman              0.0833            0.3333                0.5833
(blank)            0.0000            0.6471                0.3529

EX-10 Gender, Education
Conditional Probabilities P(Gender|Education)
                   Education
Gender             college graduate  high school or lower  Master’s degree
Cis Man            0.0000            0.0333                0.0435
Man                0.2500            0.0333                0.3696
Non-binary         0.0000            0.1333                0.0435
Prefer not to say  0.2500            0.1000                0.0652
Trans              0.0000            0.0667                0.0217
Trans Woman        0.0000            0.0000                0.0217
Woman              0.5000            0.2667                0.3043
(blank)            0.0000            0.3667                0.1304
2. EX-10 Gender, Relationship cont.
Joint and Marginal Probabilities P(Gender, Relationship)
                   Relationship
Gender             Partnered  Single  P(Gender)
Cis Man            0.0250     0.0125  0.0375
Man                0.1250     0.1125  0.2375
Non-binary         0.0375     0.0375  0.0750
Prefer not to say  0.0375     0.0500  0.0875
Trans              0.0375     0.0000  0.0375
Trans Woman        0.0125     0.0000  0.0125
Woman              0.2000     0.1000  0.3000
(blank)            0.1250     0.0875  0.2125
P(Relationship)    0.6000     0.4000  1.0000
EX-10 Gender, Relationship
Conditional Probabilities P(Relationship|Gender)
                   Relationship
Gender             Partnered  Single
Cis Man            0.6667     0.3333
Man                0.5263     0.4737
Non-binary         0.5000     0.5000
Prefer not to say  0.4286     0.5714
Trans              1.0000     0.0000
Trans Woman        1.0000     0.0000
Woman              0.6667     0.3333
(blank)            0.5882     0.4118

EX-10 Gender, Relationship
Conditional Probabilities P(Gender|Relationship)
                   Relationship
Gender             Partnered  Single
Cis Man            0.0417     0.0313
Man                0.2083     0.2813
Non-binary         0.0625     0.0938
Prefer not to say  0.0625     0.1250
Trans              0.0625     0.0000
Trans Woman        0.0208     0.0000
Woman              0.3333     0.2500
(blank)            0.2083     0.2188
2. EX-10 Gender, Fitness cont.
Joint and Marginal Probabilities P(Gender, Fitness)
                   Fitness
Gender             1       2       3       4       5       P(Gender)
Cis Man            0.0000  0.0000  0.0375  0.0000  0.0000  0.0375
Man                0.0125  0.0250  0.1625  0.0250  0.0125  0.2375
Non-binary         0.0000  0.0125  0.0500  0.0125  0.0000  0.0750
Prefer not to say  0.0000  0.0000  0.0875  0.0000  0.0000  0.0875
Trans              0.0000  0.0250  0.0125  0.0000  0.0000  0.0375
Trans Woman        0.0000  0.0000  0.0000  0.0125  0.0000  0.0125
Woman              0.0000  0.0750  0.2000  0.0125  0.0125  0.3000
(blank)            0.0000  0.0375  0.1250  0.0500  0.0000  0.2125
P(Fitness)         0.0125  0.1750  0.6750  0.1125  0.0250  1.0000
EX-10 Gender, Fitness
Conditional Probabilities P(Fitness|Gender)
                   Fitness
Gender             1       2       3       4       5
Cis Man            0.0000  0.0000  1.0000  0.0000  0.0000
Man                0.0526  0.1053  0.6842  0.1053  0.0526
Non-binary         0.0000  0.1667  0.6667  0.1667  0.0000
Prefer not to say  0.0000  0.0000  1.0000  0.0000  0.0000
Trans              0.0000  0.6667  0.3333  0.0000  0.0000
Trans Woman        0.0000  0.0000  0.0000  1.0000  0.0000
Woman              0.0000  0.2500  0.6667  0.0417  0.0417
(blank)            0.0000  0.1765  0.5882  0.2353  0.0000
EX-10 Gender, Fitness
Conditional Probabilities P(Gender|Fitness)
                   Fitness
Gender             1       2       3       4       5
Cis Man            0.0000  0.0000  0.0556  0.0000  0.0000
Man                1.0000  0.1429  0.2407  0.2222  0.5000
Non-binary         0.0000  0.0714  0.0741  0.1111  0.0000
Prefer not to say  0.0000  0.0000  0.1296  0.0000  0.0000
Trans              0.0000  0.1429  0.0185  0.0000  0.0000
Trans Woman        0.0000  0.0000  0.0000  0.1111  0.0000
Woman              0.0000  0.4286  0.2963  0.1111  0.5000
(blank)            0.0000  0.2143  0.1852  0.4444  0.0000
2. EX-10 Degree, Relationship cont.
Joint and Marginal Probabilities P(Education, Relationship)
                      Relationship
Education             Partnered  Single  P(Education)
college graduate      0.0250     0.0250  0.0500
high school or lower  0.2250     0.1500  0.3750
Master’s degree       0.3500     0.2250  0.5750
P(Relationship)       0.6000     0.4000  1.0000
EX-10 Degree, Relationship
Conditional Probabilities P(Relationship|Education)
                      Relationship
Education             Partnered  Single
college graduate      0.5000     0.5000
high school or lower  0.6000     0.4000
Master’s degree       0.6087     0.3913

EX-10 Degree, Relationship
Conditional Probabilities P(Education|Relationship)
                      Relationship
Education             Partnered  Single
college graduate      0.0417     0.0625
high school or lower  0.3750     0.3750
Master’s degree       0.5833     0.5625
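The joint, marginal, and conditional probabilities in this chapter all come from dividing contingency-table counts by the appropriate total. As a minimal sketch, the EX-10 Degree by Relationship counts can be turned into the three probability tables as follows (the variable names are illustrative, not from the manual):

```python
# EX-10 Degree x Relationship counts, taken from the contingency table
counts = {
    "college graduate": {"Partnered": 2, "Single": 2},
    "high school or lower": {"Partnered": 18, "Single": 12},
    "Master's degree": {"Partnered": 28, "Single": 18},
}

total = sum(sum(row.values()) for row in counts.values())

# Joint probabilities: each cell count divided by the grand total
joint = {e: {r: c / total for r, c in row.items()} for e, row in counts.items()}

# Marginal probabilities: row sums and column sums of the joint table
p_education = {e: sum(row.values()) for e, row in joint.items()}
p_relationship = {}
for row in joint.values():
    for r, p in row.items():
        p_relationship[r] = p_relationship.get(r, 0.0) + p

# Conditional probabilities: joint divided by the marginal of the given event
p_rel_given_edu = {e: {r: joint[e][r] / p_education[e] for r in joint[e]} for e in joint}
p_edu_given_rel = {e: {r: joint[e][r] / p_relationship[r] for r in joint[e]} for e in joint}

print(f"P(Partnered | Master's degree) = {p_rel_given_edu[chr(77) + 'aster' + chr(39) + 's degree']['Partnered']:.4f}")
print(f"P(Master's degree | Single) = {p_edu_given_rel[chr(77) + 'aster' + chr(39) + 's degree']['Single']:.4f}")
```

Note that the two conditional tables answer different questions: rows of P(Relationship|Education) sum to 1 across each education level, while columns of P(Education|Relationship) sum to 1 within each relationship status.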
2. EX-10 Degree, Fitness cont.
Joint and Marginal Probabilities P(Education, Fitness)
                      Fitness
Education             1       2       3       4       5       P(Education)
college graduate      0.0000  0.0125  0.0250  0.0125  0.0000  0.05
high school or lower  0.0000  0.0750  0.2500  0.0500  0.0000  0.375
Master’s degree       0.0125  0.0875  0.4000  0.0500  0.0250  0.575
P(Fitness)            0.0125  0.1750  0.6750  0.1125  0.0250  1.0000
EX-10 Degree, Fitness
Conditional Probabilities P(Fitness|Education)
                      Fitness
Education             1       2       3       4       5
college graduate      0.0000  0.2500  0.5000  0.2500  0.0000
high school or lower  0.0000  0.2000  0.6667  0.1333  0.0000
Master’s degree       0.0217  0.1522  0.6957  0.0870  0.0435
EX-10 Degree, Fitness
Conditional Probabilities P(Education|Fitness)
                      Fitness
Education             1       2       3       4       5
college graduate      0.0000  0.0714  0.0370  0.1111  0.0000
high school or lower  0.0000  0.4286  0.3704  0.4444  0.0000
Master’s degree       1.0000  0.5000  0.5926  0.4444  1.0000
2. EX-10 Relationship, Fitness cont.
Joint and Marginal Probabilities P(Relationship, Fitness)
              Fitness
Relationship  1       2       3       4       5       P(Relationship)
Partnered     0.0125  0.1375  0.3875  0.0500  0.0125  0.6
Single        0.0000  0.0375  0.2875  0.0625  0.0125  0.4
P(Fitness)    0.0125  0.1750  0.6750  0.1125  0.0250  1.0000
EX-10 Relationship, Fitness
Conditional Probabilities P(Fitness|Relationship)
              Fitness
Relationship  1       2       3       4       5
Partnered     0.0208  0.2292  0.6458  0.0833  0.0208
Single        0.0000  0.0938  0.7188  0.1563  0.0313

EX-10 Relationship, Fitness
Conditional Probabilities P(Relationship|Fitness)
              Fitness
Relationship  1       2       3       4       5
Partnered     1.0000  0.7857  0.5741  0.4444  0.5000
Single        0.0000  0.2143  0.4259  0.5556  0.5000
Chapter 6
1.
EX-10, Age
Age cannot be approximated by the normal distribution because age is defined as a categorical variable.
EX-10, Income; EX-10, Usage; EX-10, Miles [plots not reproduced]

EX-11, Age
Age cannot be approximated by the normal distribution because age is defined as a categorical variable.
EX-11, Income; EX-11, Usage; EX-11, Miles [plots not reproduced]

AIX-12, Age
Age cannot be approximated by the normal distribution because age is defined as a categorical variable.
AIX-12, Income; AIX-12, Usage; AIX-12, Miles [plots not reproduced]

2.
EX-10: Age is defined as a categorical variable. Income appears approximately normally distributed. Usage is a discrete variable with only 4 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.

EX-11: Age is defined as a categorical variable. Income appears approximately normally distributed. Usage is a discrete variable with only 4 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.

AIX-12: Age is defined as a categorical variable. Income appears to depart slightly from the normal distribution. Usage is a discrete variable with only 5 different values and, hence, is not normally distributed. The number of elliptical miles appears to be right-skewed.
Chapter 8
1.
EX-10
Confidence Interval Estimate for the Mean: 95% confidence interval
  Years:                    2.94       3.58
  Income:                   44,398.31  48,437.74
  Usage:                    2.91       3.26
  Miles:                    76.29      89.13
  Fitness:                  2.81       3.11
Confidence Interval Estimate for the Proportion: 95% confidence interval
Store (sample size = 80)
  Ashland (12):             0.0718     0.2282
  Galleria (4):             0.0022     0.0978
  Oxford Glen (19):         0.1442     0.3308
  Springville (13):         0.0817     0.2433
  Web (32):                 0.2926     0.5074
Age Group (sample size = 80)
  18-34 (63):               0.6979     0.8771
  35-55 (17):               0.1229     0.3021
Gender (sample size = 80)
  Cis Man (3):              –0.0041    0.0791
  Man (19):                 0.1442     0.3308
  Non-binary (6):           0.0173     0.1327
  Prefer not to say (7):    0.0256     0.1494
  Trans (3):                –0.0041    0.0791
  Trans Woman (1):          –0.0118    0.0368
  Woman (24):               0.1996     0.4004
  (blank) (17):             0.1229     0.3021
Highest Level of Education (sample size = 80)
  high school or lower (30): 0.2689    0.4811
  college graduate (4):     0.0022     0.0978
  master’s degree (46):     0.4667     0.6833
Relationship Status (sample size = 80)
  Single (32):              0.2926     0.5074
  Partnered (48):           0.4926     0.7074
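The proportion intervals listed for EX-10 are consistent with the usual normal-approximation interval, p ± Z·sqrt(p(1 − p)/n), with Z = 1.96 for 95% confidence. A minimal sketch under that assumption (the function name `proportion_ci` is hypothetical):

```python
import math

def proportion_ci(x, n, z=1.96):
    """Normal-approximation confidence interval for a proportion.

    x is the number of items with the characteristic, n the sample size,
    and z the critical value (1.96 for 95% confidence).
    """
    p = x / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# EX-10 (sample size = 80): Ashland store and the "Cis Man" gender category
for label, x in [("Ashland", 12), ("Cis Man", 3)]:
    lower, upper = proportion_ci(x, 80)
    print(f"{label} ({x}): {lower:.4f} {upper:.4f}")
```

When a category has very few observations, as with Cis Man (3 of 80), this approximation can return a lower limit below zero. Because a proportion cannot be negative, such a limit is read as 0 in practice, and the approximation itself is unreliable for counts that small.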
1. EX-11 cont.
Confidence Interval Estimate for the Mean: 95% confidence interval
  Years:                    2.75       3.59
  Income:                   46,738.09  51,209.21
  Usage:                    2.86       3.27
  Miles:                    79.39      96.57
  Fitness:                  2.74       3.06
Confidence Interval Estimate for the Proportion: 95% confidence interval
Store (sample size = 60)
  Ashland (12):             0.0988     0.3012
  Galleria (5):             0.0134     0.1533
  Oxford Glen (21):         0.2293     0.4707
  Springville (7):          0.0354     0.1979
  Web (15):                 0.1404     0.3596
Age Group (sample size = 60)
  18-34 (48):               0.6988     0.9012
  35-55 (12):               0.0988     0.3012
Gender (sample size = 60)
  Cis Man (3):              –0.0051    0.1051
  Man (14):                 0.1263     0.3404
  Non-binary (6):           0.0241     0.1759
  Prefer not to say (5):    0.0134     0.1533
  Trans (3):                –0.0051    0.1051
  Woman (21):               0.2293     0.4707
  (blank) (8):              0.0473     0.2193
Highest Level of Education (sample size = 60)
  high school or lower (23): 0.2603    0.5064
  college graduate (1):     –0.0157    0.0491
  master’s degree (36):     0.4760     0.7240
Relationship Status (sample size = 60)
  Single (24):              0.2760     0.5240
  Partnered (36):           0.4760     0.7240
1. AIX-12 cont.
Confidence Interval Estimate for the Mean: 95% confidence interval
  Years:                    3.01       4.12
  Income:                   69,523.12  81,360.03
  Usage:                    4.47       5.08
  Miles:                    147.69     186.11
  Fitness:                  4.41       4.84
Confidence Interval Estimate for the Proportion: 95% confidence interval
Store (sample size = 40)
  Ashland (8):              0.0760     0.3240
  Galleria (3):             –0.0066    0.1566
  Oxford Glen (10):         0.1158     0.3842
  Springville (5):          0.0225     0.2275
  Web (14):                 0.2022     0.4978
Age Group (sample size = 40)
  18-34 (33):               0.7072     0.9428
  35-55 (7):                0.0572     0.2928
Gender (sample size = 40)
  Cis Man (1):              –0.0234    0.0734
  Cis Woman (1):            –0.0234    0.0734
  Man (17):                 0.2718     0.5782
  Non-binary (3):           –0.0066    0.1566
  Prefer not to say (5):    0.0225     0.2275
  Trans (6):                0.0393     0.2607
  Woman (4):                0.0070     0.1930
  (blank) (3):              –0.0066    0.1566
Highest Level of Education (sample size = 40)
  high school or lower (2): –0.0175    0.1175
  doctorate (4):            0.0070     0.1930
  master’s degree (34):     0.7393     0.9607
Relationship Status (sample size = 40)
  Single (17):              0.2718     0.5782
  Partnered (23):           0.4218     0.7282
2.
EX-10 You are 95% confident that the mean years as customer is between 2.94 and 3.58. You are 95% confident that the mean annual household income of the customers is between $44,398.31 and $48,437.74. You are 95% confident that the mean number of times the customer plans to use the elliptical each week is between 2.91 and 3.26. You are 95% confident that the mean number of miles the customer expects to use the elliptical each week is between 76.29 and 89.13. You are 95% confident that the mean level of fitness is between 2.81 and 3.11.
You are 95% confident that the proportion of the store: Ashland is between 0.0718 and 0.2282, Galleria is between 0.0022 and 0.0978, Oxford Glen is between 0.1442 and 0.3308, Springville is between 0.0817 and 0.2433, and Web is between 0.2926 and 0.5074. You are 95% confident that the proportion of the age group: 18-34 is between 0.6979 and 0.8771, and 35-55 is between 0.1229 and 0.3021. You are 95% confident that the proportion of the gender: Cis Man is between -0.0041 and 0.0791, Man is between 0.1442 and 0.3308, Non-binary is between 0.0173 and 0.1327, Prefer not to say is between 0.0256 and 0.1494, Trans is between -0.0041 and 0.0791, Trans woman is between -0.0118 and 0.0368, Woman is between 0.1996 and 0.4004, and (blank) is between 0.1229 and 0.3021. You are 95% confident that the proportion of highest education: high school or lower is between 0.2689 and 0.4811, college graduate is between 0.0022 and 0.0978, and master's degree is between 0.4667 and 0.6833. You are 95% confident that the proportion of the relationship status: single is between 0.2926 and 0.5074, and partnered is between 0.4926 and 0.7074.
EX-11 You are 95% confident that the mean years as customer is between 2.75 and 3.59. You are 95% confident that the mean annual household income of the customers is between $46,738.09 and $51,209.21. You are 95% confident that the mean number of times the customer plans to use the elliptical each week is between 2.86 and 3.27. You are 95% confident that the mean number of miles the customer expects to use the elliptical each week is between 79.39 and 96.57. You are 95% confident that the mean level of fitness is between 2.74 and 3.06.
You are 95% confident that the proportion of the store: Ashland is between 0.0988 and 0.3012, Galleria is between 0.0134 and 0.1533, Oxford Glen is between 0.2293 and 0.4707, Springville is between 0.0354 and 0.1979, and Web is between 0.1404 and 0.3596. You are 95% confident that the proportion of the age group: 18-34 is between 0.6988 and 0.9012, and 35-55 is between 0.0988 and 0.3012. You are 95% confident that the proportion of the gender: Cis Man is between -0.0051 and 0.1051, Man is between 0.1263 and 0.3404, Non-binary is between 0.0241 and 0.1759, Prefer not to say is between 0.0134 and 0.1533, Trans is between -0.0051 and 0.1051, Woman is between 0.2293 and 0.4707, and (blank) is between 0.0473 and 0.2193. You are 95% confident that the proportion of highest education: high school or lower is between 0.2603 and 0.5064, college graduate is between -0.0157 and 0.0491, and master's degree is between 0.4760 and 0.7240. You are 95% confident that the proportion of the relationship status: single is between 0.2760 and 0.5240, and partnered is between 0.4760 and 0.7240.
2. AIX-12 cont. You are 95% confident that the mean years as customer is between 3.01 and 4.12. You are 95% confident that the mean annual household income of the customers is between $69,523.12 and $81,360.03. You are 95% confident that the mean number of times the customer plans to use the elliptical each week is between 4.47 and 5.08. You are 95% confident that the mean number of miles the customer expects to use the elliptical each week is between 147.69 and 186.11. You are 95% confident that the mean level of fitness is between 4.41 and 4.84.
You are 95% confident that the proportion of the store: Ashland is between 0.0760 and 0.3240, Galleria is between -0.0066 and 0.1566, Oxford Glen is between 0.1158 and 0.3842, Springville is between 0.0225 and 0.2275, and Web is between 0.2022 and 0.4978. You are 95% confident that the proportion of the age group: 18-34 is between 0.7072 and 0.9428, and 35-55 is between 0.0572 and 0.2928. You are 95% confident that the proportion of the gender: Cis Man is between -0.0234 and 0.0734, Cis Woman is between -0.0234 and 0.0734, Man is between 0.2718 and 0.5782, Non-binary is between -0.0066 and 0.1566, Prefer not to say is between 0.0225 and 0.2275, Trans is between 0.0393 and 0.2607, Woman is between 0.0070 and 0.1930, and (blank) is between -0.0066 and 0.1566. You are 95% confident that the proportion of highest education: high school or lower is between -0.0175 and 0.1175, doctorate is between 0.0070 and 0.1930, and master's degree is between 0.7393 and 0.9607. You are 95% confident that the proportion of the relationship status: single is between 0.2718 and 0.5782, and partnered is between 0.4218 and 0.7282.
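The proportion intervals reported for AIX-12 can be checked with a short calculation. The following is a minimal sketch, assuming the usual normal-approximation interval, p ± Z·sqrt(p(1 − p)/n), with Z = 1.96; the function name `proportion_ci` is hypothetical and this is not the worksheet that produced the output above. With small counts (for example, 3 of 40) this formula can return a negative lower limit, which is how values such as -0.0066 arise; since a proportion cannot be negative, such a limit is usually read as 0.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Normal-approximation confidence interval for a population proportion.

    z = 1.96 is an assumed critical value for 95% confidence.
    """
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# AIX-12 store counts quoted above (sample size = 40)
ashland = proportion_ci(8, 40)
galleria = proportion_ci(3, 40)

print(f"Ashland:  {ashland[0]:.4f} to {ashland[1]:.4f}")
print(f"Galleria: {galleria[0]:.4f} to {galleria[1]:.4f}")
```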
Chapter 10
1.
Years as a Customer:
Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance: 0.05
Larger-Variance Sample
Sample Size: 120
Sample Variance: 2.735394958
Smaller-Variance Sample
Sample Size: 60
Sample Variance: 1.459389831
Intermediate Calculations
F Test Statistic: 1.8743
Population 1 Sample Degrees of Freedom: 119
Population 2 Sample Degrees of Freedom: 59
Two-Tail Test
Upper Critical Value: 1.5867
p-Value: 0.0082
Reject the null hypothesis
Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: FSTAT = $S_1^2 / S_2^2$ = 1.8743
Decision: Since FSTAT = 1.8743 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
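The F test statistic in this worksheet is the larger sample variance divided by the smaller. A minimal sketch of that arithmetic follows; the function name `f_stat` is hypothetical, and the upper critical value 1.5867 is copied from the output above rather than computed.

```python
def f_stat(var_a, var_b):
    """Ratio of the larger sample variance to the smaller one."""
    larger, smaller = max(var_a, var_b), min(var_a, var_b)
    return larger / smaller


# Years as a customer: sample variances from the worksheet
F_UPPER_CRITICAL = 1.5867  # taken from the worksheet, not computed here

f_years = f_stat(2.735394958, 1.459389831)
reject_equal_variances = f_years > F_UPPER_CRITICAL

print(f"FSTAT = {f_years:.4f}, reject H0: {reject_equal_variances}")
```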
1. Years as a Customer: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Population 1 Sample
Sample Size: 120
Sample Mean: 3.58
Sample Standard Deviation: 1.6539
Population 2 Sample
Sample Size: 60
Sample Mean: 2.74
Sample Standard Deviation: 1.2081
Intermediate Calculations
Numerator of Degrees of Freedom: 0.0022
Denominator of Degrees of Freedom: 0.0000
Total Degrees of Freedom: 154.2405
Degrees of Freedom: 154
Standard Error: 0.2171
Difference in Sample Means: 0.8400
Separate-Variance t Test Statistic: 3.8698
Two-Tail Test
Lower Critical Value: -1.9755
Upper Critical Value: 1.9755
p-Value: 0.0002
Reject the null hypothesis
Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean years as a customer is different between graduate degree holding customers and customers without a graduate degree.
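The intermediate calculations in the separate-variance worksheet (standard error, t statistic, and fractional degrees of freedom) follow from the two sample summaries. A minimal sketch under that assumption is shown below; the function name `separate_variance_t` is hypothetical. Because the inputs here are the rounded means and standard deviations printed above, the results agree with the worksheet only to about three decimal places.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Separate-variance t statistic, degrees of freedom, and standard error."""
    v1 = sd1 ** 2 / n1          # variance of the first sample mean
    v2 = sd2 ** 2 / n2          # variance of the second sample mean
    se = math.sqrt(v1 + v2)     # standard error of the difference
    t = (mean1 - mean2) / se
    # Numerator and denominator of the degrees-of-freedom formula
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se


# Years as a customer: graduate degree (n = 120) vs. not (n = 60)
t_years, df_years, se_years = separate_variance_t(3.58, 1.6539, 120,
                                                  2.74, 1.2081, 60)
print(f"t = {t_years:.4f}, df = {math.floor(df_years)}, SE = {se_years:.4f}")
```

The worksheet rounds the degrees of freedom down to an integer (154) before looking up the critical value, which `math.floor` mirrors here.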
1. Annual Household Income: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance: 0.05
Larger-Variance Sample
Sample Size: 120
Sample Variance: 299586226.1
Smaller-Variance Sample
Sample Size: 60
Sample Variance: 93303111.51
Intermediate Calculations
F Test Statistic: 3.2109
Population 1 Sample Degrees of Freedom: 119
Population 2 Sample Degrees of Freedom: 59
Two-Tail Test
Upper Critical Value: 1.5867
p-Value: 0.0000
Reject the null hypothesis
Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: FSTAT = $S_1^2 / S_2^2$ = 3.2109
Decision: Since FSTAT = 3.2109 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
1. Annual Household Income: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Population 1 Sample
Sample Size: 120
Sample Mean: 58319.275
Sample Standard Deviation: 17308.5593
Population 2 Sample
Sample Size: 60
Sample Mean: 44520.18333
Sample Standard Deviation: 9659.3536
Intermediate Calculations
Numerator of Degrees of Freedom: 16415492889630.4000
Denominator of Degrees of Freedom: 93362437688.9342
Total Degrees of Freedom: 175.8255
Degrees of Freedom: 175
Standard Error: 2012.8596
Difference in Sample Means: 13799.0917
Separate-Variance t Test Statistic: 6.8555
Two-Tail Test
Lower Critical Value: -1.9736
Upper Critical Value: 1.9736
p-Value: 0.0000
Reject the null hypothesis
Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean annual household income is different between graduate degree holding customers and customers without a graduate degree.
1. Weekly Usage: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance: 0.05
Larger-Variance Sample
Sample Size: 120
Sample Variance: 1.162394958
Smaller-Variance Sample
Sample Size: 60
Sample Variance: 0.931920904
Intermediate Calculations
F Test Statistic: 1.2473
Population 1 Sample Degrees of Freedom: 119
Population 2 Sample Degrees of Freedom: 59
Two-Tail Test
Upper Critical Value: 1.5867
p-Value: 0.3471
Do not reject the null hypothesis
Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: FSTAT = $S_1^2 / S_2^2$ = 1.2473
Decision: Since FSTAT = 1.2473 < 1.5867 and the p-value = 0.3471 > 0.05, do not reject H0. There is not enough evidence of a difference in the two population variances. Hence, a pooled-variance t test for the difference in two population means can be used.
1. Weekly Usage: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$
Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances)
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Population 1 Sample
Sample Size: 120
Sample Mean: 3.675
Sample Standard Deviation: 1.078144219
Population 2 Sample
Sample Size: 60
Sample Mean: 3.016666667
Sample Standard Deviation: 0.965360505
Intermediate Calculations
Population 1 Sample Degrees of Freedom: 119
Population 2 Sample Degrees of Freedom: 59
Total Degrees of Freedom: 178
Pooled Variance: 1.0860
Standard Error: 0.1648
Difference in Sample Means: 0.6583
t Test Statistic: 3.9954
Two-Tail Test
Lower Critical Value: -1.9734
Upper Critical Value: 1.9734
p-Value: 0.0001
Reject the null hypothesis
Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean weekly usage is different between graduate degree holding customers and customers without a graduate degree.
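The pooled-variance worksheet values can be reproduced from the two sample summaries. The following is a minimal sketch, assuming the standard pooled-variance formulas; the function name `pooled_variance_t` is hypothetical, and the critical values are not computed here.

```python
import math


def pooled_variance_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled variance, standard error, t statistic, and degrees of freedom."""
    df = (n1 - 1) + (n2 - 1)
    # Weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return pooled_var, se, t, df


# Weekly usage: graduate degree (n = 120) vs. not (n = 60)
pooled_var, se_usage, t_usage, df_usage = pooled_variance_t(
    3.675, 1.078144219, 120, 3.016666667, 0.965360505, 60)

print(f"pooled variance = {pooled_var:.4f}, SE = {se_usage:.4f}, "
      f"t = {t_usage:.4f}, df = {df_usage}")
```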
1. Elliptical Miles: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\sigma_1^2 = \sigma_2^2$; H1: $\sigma_1^2 \neq \sigma_2^2$
F Test for Differences in Two Variances
Data
Level of Significance: 0.05
Larger-Variance Sample
Sample Size: 120
Sample Variance: 3005.576513
Smaller-Variance Sample
Sample Size: 60
Sample Variance: 1825.137751
Intermediate Calculations
F Test Statistic: 1.6468
Population 1 Sample Degrees of Freedom: 119
Population 2 Sample Degrees of Freedom: 59
Two-Tail Test
Upper Critical Value: 1.5867
p-Value: 0.0345
Reject the null hypothesis
Decision rule: If FSTAT > 1.5867, reject H0.
Test statistic: FSTAT = $S_1^2 / S_2^2$ = 1.6468
Decision: Since FSTAT = 1.6468 > 1.5867 and the p-value < 0.05, reject H0. There is enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the separate-variance t test.
1. Elliptical Miles: cont.
Population 1 = graduate degree (120), 2 = not (60)
H0: $\mu_1 = \mu_2$; H1: $\mu_1 \neq \mu_2$
Separate-Variances t Test for the Difference Between Two Means (assumes unequal population variances)
Data
Hypothesized Difference: 0
Level of Significance: 0.05
Population 1 Sample
Sample Size: 120
Sample Mean: 109.875
Sample Standard Deviation: 54.8231
Population 2 Sample
Sample Size: 60
Sample Mean: 89.77666667
Sample Standard Deviation: 42.7216
Intermediate Calculations
Numerator of Degrees of Freedom: 3076.4143
Denominator of Degrees of Freedom: 20.9549
Total Degrees of Freedom: 146.8111
Degrees of Freedom: 146
Standard Error: 7.4475
Difference in Sample Means: 20.0983
Separate-Variance t Test Statistic: 2.6987
Two-Tail Test
Lower Critical Value: -1.9763
Upper Critical Value: 1.9763
p-Value: 0.0078
Reject the null hypothesis
Decision: Since the p-value < 0.05, reject H0. There is sufficient evidence to conclude that the mean elliptical miles is different between graduate degree holding customers and customers without a graduate degree.
2.
At the 5% level of significance, there is enough evidence of a difference between graduate degree holding customers and customers without a graduate degree in their mean years as a customer, annual household income, weekly usage, and elliptical miles.
Chapter 11
1.
Years as a Customer, based on the elliptical purchased
H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups    Count    Sum     Average        Variance
EX-10     80       96.9    1.21125        0.6443
EX-11     60       76.6    1.276666667    1.0345
AIX-12    40       52.9    1.3225         1.3154
ANOVA
Source of Variation    SS          df     MS        F         P-value    F crit
Between Groups         0.3622      2      0.1811    0.1963    0.8219     3.0470
Within Groups          163.2370    177    0.9222
Total                  163.5991    179
Level of significance: 0.05
Since p-value = 0.8219 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups    Count    Sum      Average    Variance
EX-10     80       261.1    3.26375    2.0596
EX-11     60       190.2    3.17       2.6428
AIX-12    40       142.7    3.5675     3.0058
ANOVA
Source of Variation    SS          df     MS        F         P-value    F crit
Between Groups         3.9814      2      1.9907    0.8084    0.4472     3.0470
Within Groups          435.8586    177    2.4625
Total                  439.8400    179
Level of significance: 0.05
Test statistic: FSTAT = 0.8084 Since p-value = 0.4472 > 0.05, and FSTAT = 0.8084 < 3.047, do not reject H0. There is insufficient evidence to conclude that the years as a customer across the product purchased (EX-10, EX-11, and AIX-12) are different.
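The ANOVA table entries can be rebuilt from the SUMMARY rows alone (count, average, and variance per group). The following is a minimal sketch under that assumption; the function name `one_way_anova_from_summary` is hypothetical. Because the variances printed above are rounded to four decimals, the within-group sum of squares agrees with the worksheet only approximately.

```python
def one_way_anova_from_summary(groups):
    """One-way ANOVA quantities from (count, mean, variance) per group."""
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * mean for n, mean, _ in groups) / n_total
    # Between-group variation: group means around the grand mean
    ssb = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups)
    # Within-group variation: recovered from each sample variance
    ssw = sum((n - 1) * var for n, _, var in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ssb / df_between) / (ssw / df_within)
    return ssb, ssw, df_between, df_within, f_stat


# Years as a customer by product purchased (EX-10, EX-11, AIX-12)
years = [(80, 3.26375, 2.0596), (60, 3.17, 2.6428), (40, 3.5675, 3.0058)]
ssb, ssw, df_b, df_w, f_years = one_way_anova_from_summary(years)

F_CRITICAL = 3.047  # taken from the worksheet, not computed here
print(f"SSB = {ssb:.4f}, SSW = {ssw:.4f}, F = {f_years:.4f}, "
      f"reject H0: {f_years > F_CRITICAL}")
```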
1. Annual Household Income ($), based on the elliptical purchased, cont.
H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups    Count    Sum       Average      Variance
EX-10     80       598062    7475.775     25815287.7968
EX-11     60       405183    6753.05      28754955.3025
AIX-12    40       659391    16484.775    65052816.4609
ANOVA
Source of Variation    SS                  df     MS                 F          P-value    F crit
Between Groups         2719563231.0250     2      1359781615.5125    38.3678    0.0000     3.0470
Within Groups          6273009940.7750     177    35440734.1287
Total                  8992573171.8000     179
Level of significance: 0.05
Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence to conclude that the variances in annual household income across the products (EX-10, EX-11, AIX-12) are different.
H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups    Count    Sum        Average      Variance
EX-10     80       3713442    46418.025    82369840.5057
EX-11     60       2938419    48973.65     74891532.3331
AIX-12    40       3017663    75441.575    342465992.7122
ANOVA
Source of Variation    SS                   df     MS                  F          P-value    F crit
Between Groups         24490250198.5361     2      12245125099.2681    89.2590    0.0000     3.0470
Within Groups          24281991523.3750     177    137186392.7874
Total                  48772241721.9111     179
Level of significance: 0.05
Test statistic: FSTAT = 89.2590 Since p-value = 0.0000 < 0.05, and FSTAT = 89.2590 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in annual household income across the product purchased (EX-10, EX-11, AIX-12).
1. Annual Household Income ($), based on the elliptical purchased, cont.
From the Tukey Pairwise Comparison procedure, there is a difference in mean annual household income between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.
1. Usage (mean number of times the customer plans to use the elliptical each week), based on the elliptical purchased, cont.
H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups    Count    Sum    Average        Variance
EX-10     80       45     0.5625         0.2998
EX-11     60       32     0.533333333    0.3548
AIX-12    40       31     0.775          0.3327
ANOVA
Source of Variation    SS         df     MS        F         P-value    F crit
Between Groups         1.6042     2      0.8021    2.4649    0.0879     3.0470
Within Groups          57.5958    177    0.3254
Total                  59.2000    179
Level of significance: 0.05
Since p-value = 0.0879 > 0.05, do not reject H0. There is not enough evidence that the equal variance assumption needed for the one-way ANOVA F test for the difference in the means is violated. Hence, the one-way ANOVA F test for the difference in the means can be used.
H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups    Count    Sum    Average        Variance
EX-10     80       247    3.0875         0.6125
EX-11     60       184    3.066666667    0.6395
AIX-12    40       191    4.775          0.8968
ANOVA
Source of Variation    SS          df     MS         F          P-value    F crit
Between Groups         89.5486     2      44.7743    65.4445    0.0000     3.0470
Within Groups          121.0958    177    0.6842
Total                  210.6444    179
Level of significance: 0.05
Test statistic: FSTAT = 65.4445 Since p-value = 0.0000 < 0.05, and FSTAT = 65.4445 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in usage (mean number of times the customer plans to use the elliptical each week) across the product purchased (EX-10, EX-11, AIX-12).
1. Usage (mean number of times the customer plans to use the elliptical each week), based on the elliptical purchased, cont.
From the Tukey Pairwise Comparison procedure, there is a difference in usage (mean number of times the customer plans to use the elliptical each week) between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.
1. Miles (mean number of elliptical miles the customer expects to exercise each week), based on the elliptical purchased, cont.
H0: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2$; H1: At least one variance is different.
Levene's test for equality of variances:
SUMMARY
Groups    Count    Sum       Average        Variance
EX-10     80       1710      21.375         373.3867
EX-11     60       1420.4    23.67333333    546.0569
AIX-12    40       1744      43.6           1707.1179
ANOVA
Source of Variation    SS             df     MS           F         P-value    F crit
Between Groups         14216.5007     2      7108.2503    9.8070    0.0001     3.0470
Within Groups          128292.5073    177    724.8164
Total                  142509.0080    179
Level of significance: 0.05
Since p-value = 0.0001 < 0.05, reject H0. There is enough evidence to conclude that the variances in miles across the products (EX-10, EX-11, AIX-12) are different.
H0: $\mu_1 = \mu_2 = \mu_3$; H1: At least one mean is different.
Decision rule: df: 2, 177. If F > 3.047, reject H0.
One-way ANOVA F test for difference in the means:
ANOVA: Single Factor
SUMMARY
Groups    Count    Sum       Average    Variance
EX-10     80       6616.8    82.71      832.4434
EX-11     60       5278.8    87.98      1105.6986
AIX-12    40       6676      166.9      3607.9897
ANOVA
Source of Variation    SS             df     MS             F          P-value    F crit
Between Groups         209793.6044    2      104896.8022    68.3327    0.0000     3.0470
Within Groups          271710.8480    177    1535.0895
Total                  481504.4524    179
Level of significance: 0.05
Test statistic: FSTAT = 68.3327 Since p-value = 0.0000 < 0.05, and FSTAT = 68.3327 > 3.047, reject H0. There is enough evidence to conclude that there is a significant difference in miles (mean number of elliptical miles the customer expects to exercise each week) across the product purchased (EX-10, EX-11, AIX-12).
1. Miles (mean number of elliptical miles the customer expects to exercise each week), based on the elliptical purchased, cont.
From the Tukey Pairwise Comparison procedure, there is a difference in miles (mean number of elliptical miles the customer expects to exercise each week) between the customers of EX-10 and AIX-12, and EX-11 and AIX-12.
2.
There is not enough evidence of any difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the number of years as a customer. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in annual household income. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in usage (mean number of times the customer plans to use the elliptical each week). There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in miles (mean number of elliptical miles the customer expects to exercise each week).
Chapter 12
1.
Years as a customer:
Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size: 1478330
Sum of Sample Sizes: 180
Number of Groups: 3
Group     Sample Size    Sum of Ranks    Mean Ranks
EX-10     80             7301            91.2625
EX-11     60             5083.5          84.725
AIX-12    40             3905.5          97.6375
Test Result
H Test Statistic: 1.5047
Critical Value: 5.9915
p-Value: 0.4713
Do not reject the null hypothesis
Because H = 1.5047 < 5.9915 or p-value = 0.4713 > 0.05, do not reject H0. At the 0.05 significance level, there is not enough evidence of any difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in median years as a customer.
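The H test statistic can be recomputed from the rank sums and sample sizes in the worksheet. The following is a minimal sketch, assuming the form H = [12 / (n(n + 1))] Σ(T_j² / n_j) − 3(n + 1) with no correction for tied ranks; the function name `kruskal_wallis_h` is hypothetical, and the critical value 5.9915 is copied from the output above.

```python
def kruskal_wallis_h(rank_sums, sizes):
    """Kruskal-Wallis H statistic from group rank sums (no tie correction)."""
    n = sum(sizes)
    # "Sum of Squared Ranks/Sample Size" row of the worksheet
    sum_sq_over_n = sum(t ** 2 / m for t, m in zip(rank_sums, sizes))
    return 12 / (n * (n + 1)) * sum_sq_over_n - 3 * (n + 1)


# Years as a customer: EX-10, EX-11, AIX-12
h_years = kruskal_wallis_h([7301, 5083.5, 3905.5], [80, 60, 40])

CHI_SQUARE_CRITICAL = 5.9915  # taken from the worksheet, not computed here
print(f"H = {h_years:.4f}, reject H0: {h_years > CHI_SQUARE_CRITICAL}")
```

A quick sanity check on the inputs is that the rank sums add up to n(n + 1)/2, which is 16,290 for n = 180.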
1. Annual household income ($): cont.
Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size: 1640847
Sum of Sample Sizes: 180
Number of Groups: 3
Group     Sample Size    Sum of Ranks    Mean Ranks
EX-10     80             5571.5          69.64375
EX-11     60             4851.5          80.8583333
AIX-12    40             5867            146.675
Test Result
H Test Statistic: 61.3634
Critical Value: 5.9915
p-Value: 0.0000
Reject the null hypothesis
Because H = 61.3634 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in their median annual household income.
1. Number of times the customer plans to use the elliptical each week: cont.
Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size: 1644559
Sum of Sample Sizes: 180
Number of Groups: 3
Group     Sample Size    Sum of Ranks    Mean Ranks
EX-10     80             5992            74.9
EX-11     60             4377            72.95
AIX-12    40             5921            148.025
Test Result
H Test Statistic: 62.7307
Critical Value: 5.9915
p-Value: 0.0000
Reject the null hypothesis
Because H = 62.7307 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the median number of times the customer plans to use the elliptical each week.
1. Number of elliptical miles: cont.
Population 1 = EX-10, 2 = EX-11, 3 = AIX-12
H0: $M_1 = M_2 = M_3$; H1: Not all $M_j$ are the same.
Kruskal-Wallis Rank Test for Differences in Medians
Data
Level of Significance: 0.05
Intermediate Calculations
Sum of Squared Ranks/Sample Size: 1654611
Sum of Sample Sizes: 180
Number of Groups: 3
Group     Sample Size    Sum of Ranks    Mean Ranks
EX-10     80             5516            68.95
EX-11     60             4814            80.2333333
AIX-12    40             5960            149
Test Result
H Test Statistic: 66.4333
Critical Value: 5.9915
p-Value: 0.0000
Reject the null hypothesis
Because H = 66.4333 > 5.9915 or p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference between customers based on the product purchased (EX-10, EX-11, AIX-12) in the median number of elliptical miles.
2.
Risk based on market cap:
H0: There is no relationship between risk and market cap.
H1: There is a relationship between risk and market cap.
PHStat output of the chi-square test:
Chi-Square Test
Observed Frequencies (columns: Risk Level)
Market Cap    Low    Average    High    Total
Mid-Cap       60     111        63      234
Small         53     79         120     252
Large         195    286        140     621
Total         308    476        323     1107
Expected Frequencies (columns: Risk Level)
Market Cap    Low         Average     High        Total
Mid-Cap       65.10569    100.6179    68.27642    234
Small         70.11382    108.3577    73.52846    252
Large         172.7805    267.0244    181.1951    621
Total         308         476         323         1107
Data
Level of Significance: 0.05
Number of Rows: 3
Number of Columns: 3
Degrees of Freedom: 4
Results
Critical Value: 9.487729
Chi-Square Test Statistic: 56.95336
p-Value: 1.27E-11
Reject the null hypothesis
Expected frequency assumption is met. Since p-value = 0.0000 < 0.05, reject H0. There is enough evidence that risk is related to market cap and, hence, a difference in risk based on market cap.
2. Relationship Status (single or partnered): cont.
H0: There is no relationship between relationship status and product purchased.
H1: There is a relationship between relationship status and product purchased.
Chi-Square Test
Observed Frequencies (columns: Product Purchased)
Relationship Status    EX-11    EX-10    AIX-12    Total
Partnered              36       48       23        107
Single                 24       32       17        73
Total                  60       80       40        180
Expected Frequencies (columns: Product Purchased)
Relationship Status    EX-11       EX-10       AIX-12      Total
Partnered              35.66667    47.55556    23.77778    107
Single                 24.33333    32.44444    16.22222    73
Total                  60          80          40          180
Data
Level of Significance: 0.05
Number of Rows: 2
Number of Columns: 3
Degrees of Freedom: 2
Results
Critical Value: 5.991465
Chi-Square Test Statistic: 0.080655
p-Value: 0.960475
Do not reject the null hypothesis
Expected frequency assumption is met. Since the p-value = 0.9605 > 0.05, do not reject H0. There is not enough evidence of any difference in relationship status (single or partnered) based on the product purchased (EX-10, EX-11, AIX-12).
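The expected frequencies and the chi-square test statistic in this worksheet follow directly from the observed table. The following is a minimal sketch, assuming the usual definitions (expected = row total × column total / n, statistic = Σ (observed − expected)² / expected); the function name `chi_square_independence` is hypothetical, and the critical value 5.991465 is copied from the output above.

```python
def chi_square_independence(observed):
    """Chi-square statistic, degrees of freedom, and expected frequencies."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Relationship status (rows: Partnered, Single) by product
# (columns: EX-11, EX-10, AIX-12), as in the observed table above
observed = [[36, 48, 23],
            [24, 32, 17]]
chi_sq, df, expected = chi_square_independence(observed)

CRITICAL_VALUE = 5.991465  # taken from the worksheet, not computed here
print(f"chi-square = {chi_sq:.6f}, df = {df}, "
      f"reject H0: {chi_sq > CRITICAL_VALUE}")
# The smallest expected frequency is what the expected frequency
# assumption is checked against.
print(f"smallest expected frequency = {min(min(row) for row in expected):.5f}")
```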
2. Fitness: cont.
H0: There is no relationship between fitness and product purchased.
H1: There is a relationship between fitness and product purchased.
Chi-Square Test
Observed Frequencies (columns: Product Purchased)
Fitness    EX-11    EX-10    AIX-12    Total
1          1        1        0         2
2          12       14       0         26
3          39       54       4         97
4          8        9        7         24
5          0        2        29        31
Total      60       80       40        180
Expected Frequencies (columns: Product Purchased)
Fitness    EX-11       EX-10       AIX-12      Total
1          0.666667    0.888889    0.444444    2
2          8.666667    11.55556    5.777778    26
3          32.33333    43.11111    21.55556    97
4          8           10.66667    5.333333    24
5          10.33333    13.77778    6.888889    31
Total      60          80          40          180
Data
Level of Significance: 0.05
Number of Rows: 5
Number of Columns: 3
Degrees of Freedom: 8
Results
Critical Value: 15.50731
Chi-Square Test Statistic: 118.7768
p-Value: 5.93E-22
Reject the null hypothesis
Expected frequency assumption is violated.
Since the p-value = 0.0000 < 0.05, reject H0. There is enough evidence of a difference in the self-rated fitness based on the product purchased (EX-10, EX-11, AIX-12). However, the expected frequency assumption required for the chi-square test is violated. Hence, the conclusion might not be reliable.
3.
Refer to the conclusions of the various hypothesis tests in parts (1) and (2).
The Tri-Cities Times Case
Chapter 0
1.
Each company’s own historical data would be a primary data source. If available, this would include data about business operations, such as formal accounting statements and revenue, subscription, and advertising figures, including breakdowns and analyses. The research team might also use secondary data sources published by private market research companies or by governmental agencies, as well as a third party that analyzes advertising, market share, and intangibles such as the reputation and perception of the existing business.
2.
Using DCOSAC (define the data, collect the data, organize and summarize the data, analyze and reach conclusions about the data, and communicate the results of the analysis) would be essential to combining the two businesses. Start by defining a business goal to be achieved or a problem to be solved. Then take that goal and work through the DCOSAC framework, identifying data needs and the actionable information that would be needed. When combining operations, there is often talk about "operational efficiencies" to be had that will lower expenses. In the best situations, achieving such a goal can eliminate unnecessary duplication. In other situations, the goal may be just a euphemism for large-scale layoffs or business closures, which come as a surprise to interested parties who failed to require more than just the Define step when accepting terms. For example, in human resources, determine who is employed with each company and what job each employee has, and use that information to determine how many and which employees will be needed in the new company.
3.
(a)
One needs to know more to be able to determine if Staff&Save is an application of business statistics. A reasonable observation is that the software sounds more like an operations research/management science or AI application than business statistics. However, a description of software features does not explain how the software works, which would be important to know. Some points about FTF.1 might be mentioned. The best answers would question whether evaluation data could be defined in a meaningful way for some of the factors being considered. Software to minimize staffing has been used in the retail industry for some time, but Staff&Save is talking about company-wide staffing and hiring, a much broader concept. The phrase "to determine who gets retained and scheduled" suggests that this software may be supplying management decisions and not actionable information for decision-making. That software produces actionable information to assist decision-making, and is not a replacement for that decision-making, is a constant theme in this text.
(b)
The short answer is that every step of DCOSAC could be bypassed. Even if a company sought "minimal staffing," that goal would first need to be well-defined for that company. For example, by using only data related to employee availability, preference, and qualification, Staff&Save bypasses defining and collecting data with respect to other important areas, which could result in an analysis that is missing pertinent data. The apparent benefit of bypassing tasks is saved time and money, but there are no true benefits to bypassing tasks in the DCOSAC or a similar framework. The general risk is always an increased chance of error, which can include making poor or uninformed decisions.
Chapter 1
1.
This is a sampling technique issue. Answers may vary. For example, the intern suggests that an Internet poll could be posted on a social media site as well as on the new website for the Times. The advantages could include convenient sampling and avoiding coding errors and data integration errors. The disadvantages of using an Internet poll posted on a social media site and the new website include coverage error, nonresponse error, and sampling error. Other errors could include missing values. For a fact-based decision-making process, the disadvantages outweigh the advantages because an Internet poll may not represent new subscribers.
2.
This is an open response, so check all answers for plausibility of variable type. For example, useful demographic facts about subscribers could include age (years), gender, education level (years completed), marital status, household income ($), and employment status.
3.
One would expect a wider range of values; however, the survey responses may not include the data decision makers need. Answers may vary. For example, an Internet poll would be subject to coverage error by only including responders who use the Internet. These demographics typically leave out the older population. An Internet poll would also include only responders who don’t think that the poll is just a scam/spam. Increasingly, tech-savvy young people avoid scam/spam situations. In addition, Internet poll participants might be non-subscribers.
4.
For an ordinal or nominal variable, the domain should be a set of values. A scale of 1 to 5, or 1 to 7, or five categories from strongly dislike to strongly like would be best to use. For free response, one cannot quantify attitudes. “I like that change 79%” does not make sense, whereas “I strongly like that proposed change” does.
5.
Possible problems that could be listed: two differently named variables representing the same fact; the same variable but with a different domain (including different codings); each subscriber file may have variables unique to the file. Answers will vary based on what problems the student chooses. For example, separate subscriber processing systems could include different entries in the datasets: one company could use CC as a general credit card identifier, and the other company could use VISA, MC, or OTHER. Data cleaning could help with combining these data. Also, separate subscriber processing systems could include different categories of data. Combining different datasets would result in a final dataset with missing category data. For example, if one company included a category of “best contact method,” like cell phone number, email address, or street address, and the other company did not keep track of this category of data, then when the data are combined, that category would be empty for all of the second company’s subscribers.
Chapter 2
1.
No, not in the current form because the data need cleaning. There is variation in how categories are coded and some values need recoding. For example, “donate money” and “I will donate” both map to the same categorical value. Also, a “Venmo” response maps to, and should be recoded to, the categorical value “Other.” Because the variable is categorical, the solution could start with a tabular summary of the data and discover all of the unique values for the data set.
2.
This question requires that the data cleaning be done. There should be 5 categories in the summary, but what the values are for the five categories is a bit open. For example, “Donate money”, “donation”, or “I will donate” are all plausible for one of the categories, although given the frequencies of these values, “donation” is likely to be chosen. The most appropriate summaries would be a summarization of the cleaned data, so readers have a second chance to clean data here in case they missed the point of question 1.
3.
Presenting tabular form of the table data.

| Subscriptions | Frequency | Percentage |
| --- | --- | --- |
| Digital | 54 | 33.75% |
| | 47 | 29.38% |
| Print with Digital | 59 | 36.88% |
| Total | 160 | 100.00% |

4.
For Table 1, the chart is fine as is. For Table 2, the chart is passable. A change to make the chart better would be to eliminate the gradient effect, which is not modeled in this textbook. Reordering the bars here would not improve the chart because the variable of interest is operating system. A recoding of the data would lose information but would not necessarily be wrong. For Table 3, there are too many categories. The background should be white and a more distinctive coloring for the slices should be used. As is, the chart is not fully accessible and does not meet the WCAG standard mentioned in a sidenote on page 85. For Table 4, the chart is okay given the table data, but a better chart would have graphed the bounce rate itself as a time-series chart. For Table 5, the map chart is fine, but the 3D chart violates best practices. The second chart could be a bar chart, but given the data, a Pareto chart would better separate the “vital few” from the “trivial many.”
Chapter 3
1.
For the daily website users (Table 1), the following descriptive statistics are computed (Excel results):

| Statistic | Current Period | Prior Period |
| --- | --- | --- |
| Mean | 54.13333333 | 66 |
| Median | 51 | 69.5 |
| Mode | 38 | 73 |
| Minimum | 31 | 31 |
| Maximum | 98 | 97 |
| Range | 67 | 66 |
| Variance | 227.0851 | 271.8621 |
| Standard Deviation | 15.0693 | 16.4882 |
| Coeff. of Variation | 27.84% | 24.98% |
| Skewness | 1.0303 | -0.4107 |
| Kurtosis | 1.0622 | -0.4980 |
| Count | 30 | 30 |
| Standard Error | 2.7513 | 3.0103 |
| First Quartile | 46 | 54 |
| Third Quartile | 64 | 80 |
For the changes in bounce rate (Table 4), the following descriptive statistics are computed (Excel results):

| Statistic | Percentage Change |
| --- | --- |
| Mean | 0.108733333 |
| Median | -0.1215 |
| Mode | #N/A |
| Minimum | -0.563 |
| Maximum | 2.166 |
| Range | 2.729 |
| Variance | 0.4063 |
| Standard Deviation | 0.6374 |
| Coeff. of Variation | 586.22% |
| Skewness | 1.9030 |
| Kurtosis | 3.8387 |
| Count | 30 |
| Standard Error | 0.1164 |
| First Quartile | -0.301 |
| Third Quartile | 0.353 |
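Two of the derived rows in these Excel results, the coefficient of variation and the standard error, can be checked from the reported mean, standard deviation, and count. The following is a minimal sketch; the function names are illustrative and the inputs are the rounded values printed above, so the last digit may differ slightly from Excel.

```python
import math

def coefficient_of_variation(mean, std_dev):
    """Coefficient of variation expressed as a percentage."""
    return std_dev / mean * 100

def standard_error(std_dev, n):
    """Standard error of the mean."""
    return std_dev / math.sqrt(n)

# Rounded summary values copied from the Excel results.
cv_current = coefficient_of_variation(54.13333333, 15.0693)   # daily users, current period
se_current = standard_error(15.0693, 30)
cv_bounce = coefficient_of_variation(0.108733333, 0.6374)     # change in bounce rate
se_bounce = standard_error(0.6374, 30)

print(f"Current period: CV = {cv_current:.2f}%, SE = {se_current:.4f}")
print(f"Bounce rate:    CV = {cv_bounce:.2f}%, SE = {se_bounce:.4f}")
```

The very large coefficient of variation for the bounce-rate changes reflects a mean close to zero rather than an unusually large standard deviation.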
2.
For daily website users, refer to the chart associated with Table 1. For changes in bounce rate (Table 4), the following Time Series is included.
3.
Daily Website Users Current Period: In the current period, the mean of daily website users is 54.1 visitors with a median of 51 visitors. The data are right-skewed. Twenty-five percent of the daily website users are below 46 visitors, 50% are below 51 visitors, and 75% are below 64 visitors, with a minimum of 31 visitors and a maximum of 98 visitors, giving a range of 67.

Daily Website Users Prior Period: In the prior period, the mean of daily website users was 66 visitors with a median of 69.5 visitors. The data are left-skewed. Twenty-five percent of the prior period daily website users were below 54 visitors, 50% were below 69.5 visitors, and 75% were below 80 visitors, with a minimum of 31 visitors and a maximum of 97 visitors, giving a range of 66.

Changes in Bounce Rate: The mean of the changes in the bounce rate is 10.87% with a median of –12.15%. The data are right-skewed. Twenty-five percent of the changes in the bounce rate are below –30.1%, 50% are below –12.15%, and 75% are below 35.3%, with the smallest change of –56.3% and the largest change of 216.6%, giving a range of 272.9%.
Chapter 5
1.
Assume that the assumptions for using a binomial distribution are satisfied and let $\pi$ denote the probability that a customer will subscribe to the 3-At Large service, with a sample of n = 50.
(a) $\pi$ = 0.10, P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0052 + 0.0286 + 0.0779 = 0.1117
(b) $\pi$ = 0.10, P(X = 0) + P(X = 1) = 0.0052 + 0.0286 = 0.0338
(c) $\pi$ = 0.10, P(X > 3) = 1 – P(X ≤ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] = 1 – (0.0052 + 0.0286 + 0.0779 + 0.1386) = 1 – 0.2503 = 0.7497
(d) P(X = 4) = 0.1809. The likelihood that you would get 4 subscribers in a sample of 50 if the probability of a subscription is 0.10 is only 0.1809. Thus, you can conclude that it is more likely than 0.10 that you will get new subscribers when no free premium channels are included.
2.
(a) $\pi$ = 0.20, P(X < 3) = P(X = 0) + P(X = 1) + P(X = 2) = 0.0000 + 0.0002 + 0.0011 = 0.0013
(b) $\pi$ = 0.20, P(X = 0) + P(X = 1) = 0.0000 + 0.0002 = 0.0002
(c) $\pi$ = 0.20, P(X > 3) = 1 – P(X ≤ 3) = 1 – [P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3)] = 1 – (0.0000 + 0.0002 + 0.0011 + 0.0044) = 1 – 0.0057 = 0.9943
(d) The likelihood that you would get fewer than 3 subscribers when two complimentary channels are offered is 0.0013. This is much lower than when no free channels are offered, which is a probability of 0.1117. The likelihood that you would get 0 or 1 subscribers when two complimentary channels are offered is 0.0002. This is much lower than when no free channels are offered, which is a probability of 0.0338. The likelihood that you would get more than 3 subscribers when two complimentary channels are offered is 0.9943. This is much higher than when no free channels are offered, which is a probability of 0.7497.
(e) If no premium channels were offered, the probability of getting six or more subscriptions is relatively small (0.3839), assuming that the probability of a new subscription is 0.10. If two premium channels were offered, the probability of getting six or more subscriptions is large (0.9520), assuming that the probability of a new subscription is 0.20. Thus, you can conclude that it is more likely than 0.20 that you will get new subscribers when two free premium channels are included.
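The binomial probabilities used in questions 1 and 2 can be reproduced with a short script. This is a minimal sketch that assumes a sample of n = 50 (as stated in the discussion of four subscribers) and uses illustrative helper names; it relies only on the Python standard library.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def binom_cdf(k, n, p):
    """P(X <= k) for the same distribution."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))

n = 50  # sample size assumed from the text

p_fewer_than_3 = binom_cdf(2, n, 0.10)          # question 1(a)
p_more_than_3 = 1 - binom_cdf(3, n, 0.10)       # question 1(c)
p_six_or_more_10 = 1 - binom_cdf(5, n, 0.10)    # question 2(e), no free channels
p_six_or_more_20 = 1 - binom_cdf(5, n, 0.20)    # question 2(e), two free channels

for label, value in [
    ("P(X < 3 | 0.10)", p_fewer_than_3),
    ("P(X > 3 | 0.10)", p_more_than_3),
    ("P(X >= 6 | 0.10)", p_six_or_more_10),
    ("P(X >= 6 | 0.20)", p_six_or_more_20),
]:
    print(f"{label} = {value:.4f}")
```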
3.
There is no single answer here, but a lot can be discussed. The ultimate question is to determine the number of premium channels to offer free. On one hand you want to maximize the chance of a new subscription, on the other hand, you don’t want to give away more premium channels than necessary. A reasonable conclusion might be to offer one premium channel as an incentive so that you can advertise that there is something being given away for free. Additional free premium channels may not produce sufficiently more new subscriptions.
4.
For a sample of 100 people, how many customers are likely to skip the offers? $\mu = n\pi$ = 100(0.25) = 25
5.
(a) 25% + 20% = 45%
(b) 20%
(c) 15%
(d) 10%
6.
The part (a) group, because it accounts for 25% + 20% = 45% of all subscribers, which is the largest share of subscribers. A case could also be made for the part (c) group, due to the costs and effectiveness of the offer.
Chapter 6
1.
Subscribers: n = 1, µ = 14.6, σ = 1.3
(a) P(X < 6) = 0.0000
(b) P(X > 14) = 0.6778
Non-subscribers: n = 1, µ = 6.3, σ = 2.5
(a) P(X < 6) = 0.4522
(b) P(X > 14) = 0.0010
2.
The probability of a subscriber spending 15 minutes or more on the website is 0.3792. So, only 37.92% of subscribers spend 15 minutes or more on the website. The goal is met 37.92% of the time.
3.
The probability of a non-subscriber spending 8 minutes or more on the website is 0.2483. So, only 24.83% of non-subscribers spend 8 minutes or more on the website. The goal is met 24.83% of the time.
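The normal probabilities in questions 1 through 3 can be checked with Python's built-in `statistics.NormalDist`. This is a minimal sketch that takes the means and standard deviations stated in question 1 as given; the variable names are illustrative.

```python
from statistics import NormalDist

# Time spent on the website (minutes), using the parameters from question 1.
subscribers = NormalDist(mu=14.6, sigma=1.3)
non_subscribers = NormalDist(mu=6.3, sigma=2.5)

sub_over_14 = 1 - subscribers.cdf(14)           # question 1(b), subscribers
non_under_6 = non_subscribers.cdf(6)            # question 1(a), non-subscribers
sub_15_or_more = 1 - subscribers.cdf(15)        # question 2
non_8_or_more = 1 - non_subscribers.cdf(8)      # question 3

print(f"Subscribers, P(X > 14)      = {sub_over_14:.4f}")
print(f"Non-subscribers, P(X < 6)   = {non_under_6:.4f}")
print(f"Subscribers, P(X >= 15)     = {sub_15_or_more:.4f}")
print(f"Non-subscribers, P(X >= 8)  = {non_8_or_more:.4f}")
```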
Chapter 7
1.
n = 25, µ = 6.0, σ = 1.5
(a) P(< 6.0) = 0.5000
(b) P(between 5.25 and 6.75) = 0.3829
(c) P(between 6.0 and 6.75) = 0.1915
(d) P(less than 0.95 or greater than 1.05) = 0.9999
(e) $P(\bar{X} < 5.7) = P(Z < -1.00) = 0.1587$
If the time spent on the website is normally distributed with a mean of 6.0 and a standard deviation of 1.5, the probability is 0.1587 of obtaining a sample that will yield a sample mean time spent on the website of 5.7 or less, a rather unlikely event. The fact that today’s sample of 25 yields a sample mean time spent on the website of 5.7 indicates that the time spent on the website is most likely not normally distributed with a mean of 6.0 and a standard deviation of 1.5.
2.
The results here are based on a sample of 25. The standard error of the mean is 1.5/√25 = 0.3, and thus sample means are more tightly clustered around the population mean than individual times spent on the website are.
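The effect of averaging 25 observations can be sketched numerically. The example below assumes the population parameters given in question 1 (µ = 6.0, σ = 1.5) and reproduces the probability in part (e); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 6.0, 1.5, 25

# Standard error of the mean: the standard deviation of the sample mean.
standard_error = sigma / sqrt(n)

# Sampling distribution of the mean under the normality assumption.
sampling_distribution = NormalDist(mu, standard_error)
p_mean_below_5_7 = sampling_distribution.cdf(5.7)

print(f"Standard error = {standard_error:.2f}")
print(f"P(sample mean < 5.7) = {p_mean_below_5_7:.4f}")
```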
3.
$\pi = 0.14$, $\sigma_p = \sqrt{\dfrac{\pi(1-\pi)}{n}} = \sqrt{\dfrac{0.14(1-0.14)}{100}} = 0.0346987031$
(a) P(p < 16%) = 0.7178
(b) P(12% < p < 16%) = 0.4356
(c) P(p > 5%) = 0.9953
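The three probabilities for the sample proportion follow from the normal approximation to its sampling distribution. This is a minimal sketch assuming a population proportion of 0.14 and n = 100, as in the standard error calculation; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

pi, n = 0.14, 100

# Standard error of the sample proportion.
sigma_p = sqrt(pi * (1 - pi) / n)

# Normal approximation to the sampling distribution of the proportion.
proportion_dist = NormalDist(pi, sigma_p)

p_below_16 = proportion_dist.cdf(0.16)                                  # part (a)
p_between_12_16 = proportion_dist.cdf(0.16) - proportion_dist.cdf(0.12)  # part (b)
p_above_5 = 1 - proportion_dist.cdf(0.05)                               # part (c)

print(f"sigma_p = {sigma_p:.10f}")
print(f"(a) {p_below_16:.4f}  (b) {p_between_12_16:.4f}  (c) {p_above_5:.4f}")
```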
Chapter 8
1.
95% confidence interval for the mean constructed with n = 300: Time: 7.09 ≤ µ ≤ 7.95. You are 95% confident that the mean time of a live chat session is between 7.09 and 7.95 minutes.
2.
95% confidence interval for the proportion constructed with n = 300: Subscribers (X = 192): 0.5857 ≤ $\pi$ ≤ 0.6943. You are 95% confident that the proportion of subscribers is between 0.5857 and 0.6943.
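The interval for the proportion can be reproduced from the counts alone. The sketch below assumes 192 subscribers out of 300 sampled and uses the usual large-sample formula p ± Z·sqrt(p(1 − p)/n); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x, n = 192, 300          # subscribers in the sample, sample size
confidence = 0.95

p = x / n
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 1.96 for 95%
margin = z * sqrt(p * (1 - p) / n)

lower, upper = p - margin, p + margin
print(f"p = {p:.4f}, interval = ({lower:.4f}, {upper:.4f})")
```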
3.
For the sample, the sample mean is 7.5 and the sample standard deviation is 3.7834, yielding a half width of 0.5663 minutes. There are two different answers here. First answer: digital managers would first have to determine the sampling error to allow (and other business factors) in order to determine the best sample size. Second answer: there is no one best sample size. The best sample size would be one that yields an interval estimate that is useful for decision making. For the help chat sample, the interval half width is less than 0.6 minutes, which might be sufficient for a planning phase in which managers are using whole-number estimates of chat times.
Chapter 9
Since the population standard deviation is unknown and the sample size is large enough at 50, the t test for the mean can be used.
1.
$H_0: \mu \le 3$
$H_1: \mu > 3$
Use the t Test for Hypothesis of the Mean

| t Test for Hypothesis of the Mean | |
| --- | --- |
| Data | |
| Null Hypothesis µ = | 3 |
| Level of Significance | 0.1 |
| Sample Size | 50 |
| Sample Mean | 3.0995 |
| Sample Standard Deviation | 0.9526775 |
| Intermediate Calculations | |
| Standard Error of the Mean | 0.1347 |
| t Test Statistic | 0.7385 |
| Upper-Tail Test | |
| Upper Critical Value | 1.2991 |
| p-Value | 0.2319 |
| Do not reject the null hypothesis | |

Since tSTAT = 0.7385 < tCRITICAL = 1.2991 or the p-value = 0.2319 > 0.10, do not reject the null hypothesis. At the 0.10 level of significance, there is insufficient evidence to conclude that the mean response time is greater than 3 seconds.
2.
$H_0: \mu \le 3$
$H_1: \mu > 3$
Use the t Test for Hypothesis of the Mean

| t Test for Hypothesis of the Mean | |
| --- | --- |
| Data | |
| Null Hypothesis µ = | 3 |
| Level of Significance | 0.01 |
| Sample Size | 50 |
| Sample Mean | 3.0995 |
| Sample Standard Deviation | 0.9526775 |
| Intermediate Calculations | |
| Standard Error of the Mean | 0.1347 |
| t Test Statistic | 0.7385 |
| Upper-Tail Test | |
| Upper Critical Value | 2.4049 |
| p-Value | 0.2319 |
| Do not reject the null hypothesis | |

Since tSTAT = 0.7385 < tCRITICAL = 2.4049 or the p-value = 0.2319 > 0.01, do not reject the null hypothesis. At the 0.01 level of significance, there is insufficient evidence to conclude that the mean response time is greater than 3 seconds.
3.
There is insufficient evidence to conclude that the mean response time is greater than 3 seconds at both the 0.10 and 0.01 level of significance.
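The test statistic shared by questions 1 and 2 can be recomputed from the sample summary. The sketch below uses only the standard library, so it takes the two upper critical values from the PHStat output above rather than computing them from the t distribution; the variable names are illustrative.

```python
from math import sqrt

n, sample_mean, sample_sd, mu_0 = 50, 3.0995, 0.9526775, 3.0

standard_error = sample_sd / sqrt(n)
t_stat = (sample_mean - mu_0) / standard_error

# Upper-tail critical values (49 degrees of freedom) as reported in the PHStat output.
critical_values = {0.10: 1.2991, 0.01: 2.4049}

decisions = {
    alpha: "reject H0" if t_stat > critical else "do not reject H0"
    for alpha, critical in critical_values.items()
}

print(f"Standard error = {standard_error:.4f}, tSTAT = {t_stat:.4f}")
for alpha, decision in decisions.items():
    print(f"alpha = {alpha}: {decision}")
```

Because the test statistic does not depend on the level of significance, only the critical value changes between the two questions.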
Chapter 10
1.
(a)
To conduct the t test for the difference between two independent means, you first need to determine whether the population variances of the two groups are equal.
Populations: Larger Variance = Late Evening, Smaller Variance = Early Evening
$H_0: \sigma_1^2 = \sigma_2^2$ The population variances are the same.
$H_1: \sigma_1^2 \ne \sigma_2^2$ The population variances are different.
PHStat output:

| F Test for Differences in Two Variances | |
| --- | --- |
| Data | |
| Level of Significance | 0.05 |
| Larger-Variance Sample | |
| Sample Size | 15 |
| Sample Variance | 458.0845714 |
| Smaller-Variance Sample | |
| Sample Size | 15 |
| Sample Variance | 320.2868571 |
| Intermediate Calculations | |
| F Test Statistic | 1.4302 |
| Population 1 Sample Degrees of Freedom | 14 |
| Population 2 Sample Degrees of Freedom | 14 |
| Two-Tail Test | |
| Upper Critical Value | 2.9786 |
| p-Value | 0.5119 |
| Do not reject the null hypothesis | |

Decision rule: If FSTAT > 2.9786, reject H0.
Test statistic: $F_{STAT} = \dfrac{S_1^2}{S_2^2} = 1.4302$
Decision: Since FSTAT = 1.4302 < 2.9786 and the p-value = 0.5119 > 0.05, do not reject H0. There is not enough evidence to conclude that the two population variances are different. Hence, the appropriate test for the difference in two means is the pooled-variance t test.
The boxplots do not show serious departure from the normality assumption. Because the sample sizes are each 15, you assume that the two populations are normally distributed with roughly equal variances. Since the boxplots do not show serious departure from normality and the F test shows insufficient evidence of a difference in the variances between the two groups, you use the pooled-variance t test for the difference in means of the two independent samples.
Populations: 1 = Early Evening, 2 = Late Evening
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \ne \mu_2$
PHStat output:

| Pooled-Variance t Test for the Difference Between Two Means (assumes equal population variances) | |
| --- | --- |
| Data | |
| Hypothesized Difference | 0 |
| Level of Significance | 0.05 |
| Population 1 Sample | |
| Sample Size | 15 |
| Sample Mean | 280.44 |
| Sample Standard Deviation | 17.89655992 |
| Population 2 Sample | |
| Sample Size | 15 |
| Sample Mean | 304.32 |
| Sample Standard Deviation | 21.40291035 |
| Intermediate Calculations | |
| Population 1 Sample Degrees of Freedom | 14 |
| Population 2 Sample Degrees of Freedom | 14 |
| Total Degrees of Freedom | 28 |
| Pooled Variance | 389.1857 |
| Standard Error | 7.2036 |
| Difference in Sample Means | -23.8800 |
| t Test Statistic | -3.3150 |
| Two-Tail Test | |
| Lower Critical Value | -2.0484 |
| Upper Critical Value | 2.0484 |
| p-Value | 0.0025 |
| Reject the null hypothesis | |
(b)
Since tSTAT = –3.315 < –2.0484 or the p-value of 0.0025 < 0.05, you reject the null hypothesis at the 5% level of significance. There is enough evidence to conclude that the two population mean call times are different.
2.
For the answers to parts (a) and (b) of question 1, using a 0.01 level of significance changes only the critical value.
(a)
The F-test upper critical value at the 0.05 level of significance is 2.9786, and at 0.01 it is 4.2993. Since FSTAT = 1.4302 is less than both, there is no difference in the answer.
(b)
The t-test lower critical value at the 0.05 level of significance is –2.0484, and at 0.01 it is –2.7633. Since tSTAT = –3.3150 is less than both, there is no difference in the answer.
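The intermediate calculations behind the F test and the pooled-variance t test can be retraced from the sample summaries. This is a minimal sketch using the sample sizes, means, and variances reported in the PHStat output; the critical values are copied from that output rather than computed, and the variable names are illustrative.

```python
from math import sqrt

# Sample summaries from the PHStat output (call times).
n1, mean1, var1 = 15, 280.44, 320.2868571   # population 1: Early Evening
n2, mean2, var2 = 15, 304.32, 458.0845714   # population 2: Late Evening

# F test: larger sample variance over smaller sample variance.
f_stat = max(var1, var2) / min(var1, var2)

# Pooled-variance t test for the difference between two means.
pooled_variance = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
t_stat = (mean1 - mean2) / standard_error

# Critical values reported at the 0.05 and 0.01 levels of significance.
f_critical = {0.05: 2.9786, 0.01: 4.2993}
t_lower_critical = {0.05: -2.0484, 0.01: -2.7633}

print(f"FSTAT = {f_stat:.4f}, pooled variance = {pooled_variance:.4f}")
print(f"Standard error = {standard_error:.4f}, tSTAT = {t_stat:.4f}")
for alpha in (0.05, 0.01):
    print(alpha, "F rejects:", f_stat > f_critical[alpha],
          "| t rejects:", t_stat < t_lower_critical[alpha])
```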
3.
Using the two-tail test from PHStat:

| Z Test for Differences in Two Proportions | |
| --- | --- |
| Data | |
| Hypothesized Difference | 0 |
| Level of Significance | 0.05 |
| Group 1 | |
| Number of Items of Interest | 612 |
| Sample Size | 2218 |
| Group 2 | |
| Number of Items of Interest | 535 |
| Sample Size | 2112 |
| Intermediate Calculations | |
| Group 1 Proportion | 0.275924256 |
| Group 2 Proportion | 0.253314394 |
| Difference in Two Proportions | 0.022609862 |
| Average Proportion | 0.2649 |
| Z Test Statistic | 1.6853 |
| Two-Tail Test | |
| Lower Critical Value | -1.9600 |
| Upper Critical Value | 1.9600 |
| p-Value | 0.0919 |
| Do not reject the null hypothesis | |

$H_0: \pi_1 = \pi_2$
$H_1: \pi_1 \ne \pi_2$
where Populations: 1 = Early Evening, 2 = Late Evening
Decision rule: If ZSTAT < –1.960 or ZSTAT > 1.960, reject H0.
Test statistic: ZSTAT = 1.6853
Decision: Since ZSTAT = 1.6853 is between the two critical bounds, and p-value = 0.0919 > 0.05, do not reject H0. There is insufficient evidence of a difference in the proportion of billing or payment calls between early evening and late evening at the 0.05 level of significance.
4.
Using the upper-tail test from PHStat:

| Z Test for Differences in Two Proportions | |
| --- | --- |
| Data | |
| Hypothesized Difference | 0 |
| Level of Significance | 0.05 |
| Group 1 | |
| Number of Items of Interest | 612 |
| Sample Size | 2218 |
| Group 2 | |
| Number of Items of Interest | 535 |
| Sample Size | 2112 |
| Intermediate Calculations | |
| Group 1 Proportion | 0.275924256 |
| Group 2 Proportion | 0.253314394 |
| Difference in Two Proportions | 0.022609862 |
| Average Proportion | 0.2649 |
| Z Test Statistic | 1.6853 |
| Upper-Tail Test | |
| Upper Critical Value | 1.6449 |
| p-Value | 0.0460 |
| Reject the null hypothesis | |

$H_0: \pi_1 \le \pi_2$
$H_1: \pi_1 > \pi_2$
where Populations: 1 = Early Evening, 2 = Late Evening
Decision rule: If ZSTAT > 1.6449, reject H0.
Test statistic: ZSTAT = 1.6853
Decision: Since ZSTAT = 1.6853 > 1.6449, and p-value = 0.0460 < 0.05, reject H0. There is evidence that the proportion of billing or payment calls made in the early evening is greater than the proportion of such calls made in the late evening at the 0.05 level of significance.
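Questions 3 and 4 use the same Z statistic and differ only in how the p-value is read off. The sketch below recomputes the statistic from the counts with the pooled (average) proportion and then derives both p-values; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 612, 2218   # group 1: Early Evening
x2, n2 = 535, 2112   # group 2: Late Evening

p1, p2 = x1 / n1, x2 / n2
pooled = (x1 + x2) / (n1 + n2)                      # "Average Proportion"
standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
z_stat = (p1 - p2) / standard_error

standard_normal = NormalDist()
p_upper_tail = 1 - standard_normal.cdf(z_stat)      # question 4
p_two_tail = 2 * (1 - standard_normal.cdf(abs(z_stat)))   # question 3

print(f"ZSTAT = {z_stat:.4f}")
print(f"Two-tail p-value = {p_two_tail:.4f}, upper-tail p-value = {p_upper_tail:.4f}")
```

The same data lead to different decisions at the 0.05 level because the upper-tail p-value is half of the two-tail p-value.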
Chapter 11
1. First you need to determine whether there is a difference in the variance of the three home pages.
$H_0: \sigma_1^2 = \sigma_2^2 = \sigma_3^2$
$H_1$: Not all $\sigma_j^2$ are equal
PHStat output: Levene Test on Home Pages

SUMMARY

| Groups | Count | Sum | Average | Variance |
| --- | --- | --- | --- | --- |
| Headlines | 8 | 21.36 | 2.67 | 3.8179 |
| Subheads | 8 | 15.52 | 1.94 | 5.0885 |
| Excerpts | 8 | 20.96 | 2.62 | 3.2690 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
| --- | --- | --- | --- | --- | --- | --- |
| Between Groups | 2.6608 | 2 | 1.3304 | 0.3278 | 0.7241 | 3.4668 |
| Within Groups | 85.2280 | 21 | 4.0585 | | | |
| Total | 87.8888 | 23 | | | | |

Level of significance: 0.05

Since FSTAT = 0.3278 < 3.4668 or p-value = 0.7241 > 0.05, you do not reject H0. There is insufficient evidence of a difference in the variation of the time spent between the home pages. Now that you can assume that the home pages do not differ in their variances, you can test to determine whether there is a difference in their mean times.
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least one of the means differs
PHStat output:
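ANOVA: Single Factor

SUMMARY

| Groups | Count | Sum | Average | Variance |
| --- | --- | --- | --- | --- |
| Headlines | 8 | 265.36 | 33.17 | 11.9514 |
| Subheads | 8 | 230.4 | 28.8 | 9.3440 |
| Excerpts | 8 | 227.52 | 28.44 | 11.1067 |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
| --- | --- | --- | --- | --- | --- | --- |
| Between Groups | 110.9317 | 2 | 55.4659 | 5.1354 | 0.0153 | 3.4668 |
| Within Groups | 226.8152 | 21 | 10.8007 | | | |
| Total | 337.7469 | 23 | | | | |

Level of significance: 0.05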
Since FSTAT = 5.1354 > 3.4668 or p-value = 0.0153 < 0.05, you reject H0. There is evidence of a difference in the mean time spent between the home pages. Now, you can use the Tukey-Kramer multiple comparisons procedure to determine which home pages differ in their mean times.
PHStat output: Tukey-Kramer Multiple Comparisons

| Group | Sample Mean | Sample Size |
| --- | --- | --- |
| 1: Headlines | 33.17 | 8 |
| 2: Subheads | 28.8 | 8 |
| 3: Excerpts | 28.44 | 8 |

| Other Data | |
| --- | --- |
| Level of significance | 0.05 |
| Numerator d.f. | 3 |
| Denominator d.f. | 21 |
| MSW | 10.80072 |
| Q Statistic | 3.58 |

| Comparison | Absolute Difference | Std. Error of Difference | Critical Range | Results |
| --- | --- | --- | --- | --- |
| Group 1 to Group 2 | 4.37 | 1.161933938 | 4.16 | Means are different |
| Group 1 to Group 3 | 4.73 | 1.161933938 | 4.16 | Means are different |
| Group 2 to Group 3 | 0.36 | 1.161933938 | 4.16 | Means are not different |

Headlines has a higher mean time than Subheads or Excerpts.
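The one-way ANOVA F statistic and the Tukey-Kramer critical range can be rebuilt from the sums of squares, group means, and Q statistic in the PHStat output. The sketch below takes Q = 3.58 from that output rather than from a studentized range table; the variable names are illustrative.

```python
from math import sqrt
from itertools import combinations

# One-way ANOVA summary values from the PHStat output.
ss_between, df_between = 110.9317, 2
ss_within, df_within = 226.8152, 21

ms_between = ss_between / df_between
ms_within = ss_within / df_within          # MSW
f_stat = ms_between / ms_within

# Tukey-Kramer inputs: group means, common sample size, and Q from the output.
means = {"Headlines": 33.17, "Subheads": 28.8, "Excerpts": 28.44}
n_per_group = 8
q_statistic = 3.58

std_error = sqrt(ms_within / 2 * (1 / n_per_group + 1 / n_per_group))
critical_range = q_statistic * std_error

print(f"FSTAT = {f_stat:.4f}, critical range = {critical_range:.2f}")
results = {}
for a, b in combinations(means, 2):
    difference = abs(means[a] - means[b])
    results[(a, b)] = difference > critical_range
    verdict = "different" if results[(a, b)] else "not different"
    print(f"{a} vs {b}: |difference| = {difference:.2f} -> means are {verdict}")
```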
2.
PHStat output: Anova: Two-Factor With Replication

SUMMARY

| | Headlines | Subheads | Excerpts | Total |
| --- | --- | --- | --- | --- |
| Casual | | | | |
| Count | 6 | 6 | 6 | 18 |
| Sum | 259.51 | 239.9 | 207.9 | 707.31 |
| Average | 43.25167 | 39.98333 | 34.65 | 39.295 |
| Variance | 19.9944 | 5.3857 | 19.0270 | 26.3686 |
| Subscriber | | | | |
| Count | 6 | 6 | 6 | 18 |
| Sum | 279.6 | 242.41 | 237.7 | 759.71 |
| Average | 46.6 | 40.40167 | 39.6167 | 42.20611 |
| Variance | 14.2880 | 10.9748 | 13.6537 | 21.7757 |
| Total | | | | |
| Count | 12 | 12 | 12 | |
| Sum | 539.11 | 482.31 | 445.6 | |
| Average | 44.92583 | 40.1925 | 37.13333 | |
| Variance | 18.6406 | 7.4843 | 21.5824 | |

ANOVA

| Source of Variation | SS | df | MS | F | P-value | F crit |
| --- | --- | --- | --- | --- | --- | --- |
| Sample | 76.2711 | 1 | 76.2711 | 5.4922 | 0.0259 | 4.1709 |
| Columns | 369.9440 | 2 | 184.9720 | 13.3195 | 0.0001 | 3.3158 |
| Interaction | 31.8912 | 2 | 15.9456 | 1.1482 | 0.3307 | 3.3158 |
| Within | 416.6178 | 30 | 13.8873 | | | |
| Total | 894.7242 | 35 | | | | |
At the 5% level of significance, since FSTAT = 1.1482 < 3.3158 or p-value = 0.3307 > 0.05, there is insufficient evidence to conclude that there is an interaction between visitor type and home page. At the 5% level of significance, since FSTAT = 13.3195 > 3.3158 or p-value = 0.0001 < 0.05, there is enough evidence to conclude that the mean response differs among the home pages. At the 5% level of significance, since FSTAT = 5.4922 > 4.1709 or the p-value = 0.0259 < 0.05, there is enough evidence to conclude that the mean response differs between the two visitor types (casual and subscriber).
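The three F statistics in the two-factor table are each a mean square divided by the within-group mean square. The following sketch recomputes them from the sums of squares and degrees of freedom in the PHStat output and compares each with the reported critical value; the dictionary layout is illustrative.

```python
# (SS, df, F crit) for each effect, copied from the PHStat two-factor ANOVA table.
effects = {
    "Sample (visitor type)": (76.2711, 1, 4.1709),
    "Columns (home page)": (369.9440, 2, 3.3158),
    "Interaction": (31.8912, 2, 3.3158),
}
ss_within, df_within = 416.6178, 30

ms_within = ss_within / df_within

f_stats = {}
significant = {}
for name, (ss, df, f_crit) in effects.items():
    f_stats[name] = (ss / df) / ms_within
    significant[name] = f_stats[name] > f_crit
    print(f"{name}: FSTAT = {f_stats[name]:.4f}, F crit = {f_crit}, "
          f"significant at 0.05: {significant[name]}")
```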
Chapter 12
1.
Using Table 1: Chi-Square Test

Observed Frequencies (columns: First Subscription)

| Renewed? | Promotional | Direct | Upgrade | Total |
| --- | --- | --- | --- | --- |
| Yes | 45 | 162 | 100 | 307 |
| No | 202 | 105 | 40 | 347 |
| Total | 247 | 267 | 140 | 654 |

Expected Frequencies (columns: First Subscription)

| Renewed? | Promotional | Direct | Upgrade | Total |
| --- | --- | --- | --- | --- |
| Yes | 115.9465 | 125.3349 | 65.71865 | 307 |
| No | 131.0535 | 141.6651 | 74.28135 | 347 |
| Total | 247 | 267 | 140 | 654 |

| Data | |
| --- | --- |
| Level of Significance | 0.05 |
| Number of Rows | 2 |
| Number of Columns | 3 |
| Degrees of Freedom | 2 |
| Results | |
| Critical Value | 5.991465 |
| Chi-Square Test Statistic | 135.7376 |
| p-Value | 3.35E-30 |
| Reject the null hypothesis | |

Expected frequency assumption is met.
$\chi^2$ = 135.7376 > 5.991465. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription among the three types of first subscription.
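The expected frequencies and the chi-square statistic for Table 1 can be derived directly from the observed counts. This is a minimal sketch using only the standard library; the critical value is copied from the output above rather than computed, and the variable names are illustrative.

```python
# Observed frequencies: rows are Renewed? (Yes, No);
# columns are first subscription (Promotional, Direct, Upgrade).
observed = [
    [45, 162, 100],
    [202, 105, 40],
]

row_totals = [sum(row) for row in observed]
column_totals = [sum(column) for column in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total x column total) / grand total.
expected = [[r * c / grand_total for c in column_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for observed_row, expected_row in zip(observed, expected)
    for o, e in zip(observed_row, expected_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 5.991465   # chi-square critical value at 0.05 with 2 d.f., from the output

print(f"Chi-square = {chi_square:.4f} with {degrees_of_freedom} d.f.")
print("Reject H0" if chi_square > critical_value else "Do not reject H0")
```

The same steps apply to the other contingency tables in this chapter by swapping in their observed counts and critical values.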
Using Table 1: Marascuilo Procedure

| Marascuilo Procedure | |
| --- | --- |
| Level of Significance | 0.05 |
| Square Root of Critical Value | 2.447746831 |
| Sample Proportions | |
| Group 1 (Promotional) | 0.182186235 |
| Group 2 (Direct) | 0.606741573 |
| Group 3 (Upgrade) | 0.714285714 |

MARASCUILO TABLE

| Proportions | Absolute Differences | Critical Range | Result |
| --- | --- | --- | --- |
| \| Group 1 - Group 2 \| | 0.424555338 | 0.094701947 | Significant |
| \| Group 1 - Group 3 \| | 0.532099479 | 0.111121834 | Significant |
| \| Group 2 - Group 3 \| | 0.107544141 | 0.118693822 | Not significant |

A promotional first subscription results in a significantly lower proportion of subscribers who renew than a direct subscription or an upgrade to a subscription.
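Each critical range in the Marascuilo table is the square root of the chi-square critical value multiplied by the standard error of the difference between two sample proportions. The sketch below rebuilds the table from the Table 1 counts, taking the critical value 5.991465 from the chi-square output; the variable names are illustrative.

```python
from math import sqrt
from itertools import combinations

# (renewals, group size) for each type of first subscription, from Table 1.
groups = {
    "Promotional": (45, 247),
    "Direct": (162, 267),
    "Upgrade": (100, 140),
}
chi_square_critical = 5.991465   # 0.05 level, 2 degrees of freedom
multiplier = sqrt(chi_square_critical)

proportions = {name: x / n for name, (x, n) in groups.items()}

critical_ranges = {}
for a, b in combinations(groups, 2):
    pa, na = proportions[a], groups[a][1]
    pb, nb = proportions[b], groups[b][1]
    critical_ranges[(a, b)] = multiplier * sqrt(pa * (1 - pa) / na + pb * (1 - pb) / nb)
    difference = abs(pa - pb)
    verdict = "Significant" if difference > critical_ranges[(a, b)] else "Not significant"
    print(f"{a} vs {b}: |difference| = {difference:.4f}, "
          f"critical range = {critical_ranges[(a, b)]:.4f} -> {verdict}")
```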
Using Table 2: Chi-Square Test

Observed Frequencies (columns: Subscription Type)

| Recommend? | Basic | At Large | Total |
| --- | --- | --- | --- |
| Yes | 334 | 314 | 648 |
| No | 291 | 340 | 631 |
| Total | 625 | 654 | 1279 |

Expected Frequencies (columns: Subscription Type)

| Recommend? | Basic | At Large | Total |
| --- | --- | --- | --- |
| Yes | 316.6536 | 331.3464 | 648 |
| No | 308.3464 | 322.6536 | 631 |
| Total | 625 | 654 | 1279 |

| Data | |
| --- | --- |
| Level of Significance | 0.05 |
| Number of Rows | 2 |
| Number of Columns | 2 |
| Degrees of Freedom | 1 |
| Results | |
| Critical Value | 3.841459 |
| Chi-Square Test Statistic | 3.766747 |
| p-Value | 0.052281 |
| Do not reject the null hypothesis | |

Expected frequency assumption is met.
$\chi^2$ = 3.766747 < 3.841459. Do not reject H0. There is insufficient evidence of a difference in the proportion of subscribers who would recommend between the two subscription types.
2.
From Table 1, we combine “direct” and “upgrade” into “Not” and complete a Chi-Square Test.

Observed Frequencies (columns: First Subscription)

| Renewed? | Promotional | Not | Total |
| --- | --- | --- | --- |
| Yes | 45 | 262 | 307 |
| No | 202 | 145 | 347 |
| Total | 247 | 407 | 654 |

Expected Frequencies (columns: First Subscription)

| Renewed? | Promotional | Not | Total |
| --- | --- | --- | --- |
| Yes | 115.9465 | 191.0535 | 307 |
| No | 131.0535 | 215.9465 | 347 |
| Total | 247 | 407 | 654 |

| Data | |
| --- | --- |
| Level of Significance | 0.05 |
| Number of Rows | 2 |
| Number of Columns | 2 |
| Degrees of Freedom | 1 |
| Results | |
| Critical Value | 3.841459 |
| Chi-Square Test Statistic | 131.4728 |
| p-Value | 1.95E-30 |
| Reject the null hypothesis | |

Expected frequency assumption is met.
$\chi^2$ = 131.4728 > 3.841459. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription between the promotional first subscription and not promotional (direct and upgrade).
Marascuilo Procedure (Table 1 with “direct” and “upgrade” combined into “Not”)

| Marascuilo Procedure | |
| --- | --- |
| Level of Significance | 0.05 |
| Square Root of Critical Value | 1.959963985 |
| Sample Proportions | |
| Group 1 (Promotional) | 0.182186235 |
| Group 2 (Not) | 0.643734644 |

MARASCUILO TABLE

| Proportions | Absolute Differences | Critical Range | Result |
| --- | --- | --- | --- |
| \| Group 1 - Group 2 \| | 0.461548409 | 0.066946645 | Significant |

A promotional first subscription results in a significantly lower proportion of subscribers who renew than a subscription that is not promotional (a direct subscription or an upgrade to a subscription).
3.
Answers may vary.
4.
Answers may vary. The subscribers to each plan are as follows: Plan A 54 subscribers, Plan B 139 subscribers, Plan C 243 subscribers, and Plan D 67 subscribers. There are significantly fewer subscribers to Plan A and Plan D than Plan B or Plan C.
5.
Using Table 3: Chi-Square Test

Observed Frequencies (columns: Initial Subscription)

| Renewed? | Plan A | Plan B | Plan C | Plan D | Total |
| --- | --- | --- | --- | --- | --- |
| Yes | 13 | 50 | 189 | 52 | 304 |
| No | 41 | 89 | 54 | 15 | 199 |
| Total | 54 | 139 | 243 | 67 | 503 |

Expected Frequencies (columns: Initial Subscription)

| Renewed? | Plan A | Plan B | Plan C | Plan D | Total |
| --- | --- | --- | --- | --- | --- |
| Yes | 32.63618 | 84.00795 | 146.8628 | 40.49304 | 304 |
| No | 21.36382 | 54.99205 | 96.13718 | 26.50696 | 199 |
| Total | 54 | 139 | 243 | 67 | 503 |

| Data | |
| --- | --- |
| Level of Significance | 0.05 |
| Number of Rows | 2 |
| Number of Columns | 4 |
| Degrees of Freedom | 3 |
| Results | |
| Critical Value | 7.814728 |
| Chi-Square Test Statistic | 103.4847 |
| p-Value | 2.77E-22 |
| Reject the null hypothesis | |

Expected frequency assumption is met.
$\chi^2$ = 103.4847 > 7.814728. Reject H0. There is evidence of a difference in the proportion of subscribers who renewed their subscription based on the initial subscription plan (the four plans).
Using Table 3: Marascuilo Procedure

| Marascuilo Procedure | |
| --- | --- |
| Level of Significance | 0.05 |
| Square Root of Critical Value | 2.795483483 |
| Sample Proportions | |
| Group 1 | 0.240740741 |
| Group 2 | 0.35971223 |
| Group 3 | 0.777777778 |
| Group 4 | 0.776119403 |

MARASCUILO TABLE

| Proportions | Absolute Differences | Critical Range | Result |
| --- | --- | --- | --- |
| \| Group 1 - Group 2 \| | 0.118971489 | 0.19849654 | Not significant |
| \| Group 1 - Group 3 \| | 0.537037037 | 0.178914751 | Significant |
| \| Group 1 - Group 4 \| | 0.535378662 | 0.21614538 | Significant |
| \| Group 2 - Group 3 \| | 0.418065548 | 0.136041203 | Significant |
| \| Group 2 - Group 4 \| | 0.416407173 | 0.182251326 | Significant |
| \| Group 3 - Group 4 \| | 0.001658375 | 0.160702078 | Not significant |

There is a difference between those who signed up for Plan A and those who signed up for Plan C or Plan D. There is also a difference between those who signed up for Plan B and those who signed up for Plan C or Plan D. There is no significant difference between those who signed up for Plan A and Plan B, or between those who signed up for Plan C and Plan D.
6.
Thus, people who subscribed to Plan C or Plan D are more likely to renew.
7.
Using Table 4: Chi-Square Test

Observed Frequencies (columns: Subscriber Type)

| Selection | Basic | At Large | Add-on | Total |
| --- | --- | --- | --- | --- |
| Digital | 70 | 7 | 75 | 152 |
| Digital Ad-Free | 12 | 18 | 9 | 39 |
| Digital All-Access | 5 | 72 | 4 | 81 |
| Print+Digital | 13 | 3 | 12 | 28 |
| Total | 100 | 100 | 100 | 300 |

Expected Frequencies (columns: Subscriber Type)

| Selection | Basic | At Large | Add-on | Total |
| --- | --- | --- | --- | --- |
| Digital | 50.66667 | 50.66667 | 50.66667 | 152 |
| Digital Ad-Free | 13 | 13 | 13 | 39 |
| Digital All-Access | 27 | 27 | 27 | 81 |
| Print+Digital | 9.333333 | 9.333333 | 9.333333 | 28 |
| Total | 100 | 100 | 100 | 300 |

| Data | |
| --- | --- |
| Level of Significance | 0.05 |
| Number of Rows | 4 |
| Number of Columns | 3 |
| Degrees of Freedom | 6 |
| Results | |
| Critical Value | 12.59159 |
| Chi-Square Test Statistic | 178.9467 |
| p-Value | 5.68E-36 |
| Reject the null hypothesis | |

Expected frequency assumption is met.
$\chi^2$ = 178.9467 > 12.59159. Reject H0. There is evidence of a difference in the proportion of subscriber type based on the selection.
8.
Answers may vary.
Chapter 13
1.
Develop a regression model to predict revenue based on the number of clicks a digital ad generates.

Simple Linear Regression Analysis

| Regression Statistics | |
| --- | --- |
| Multiple R | 0.8527 |
| R Square | 0.7270 |
| Adjusted R Square | 0.7242 |
| Standard Error | 3.7656 |
| Observations | 100 |

ANOVA

| | df | SS | MS | F | Significance F |
| --- | --- | --- | --- | --- | --- |
| Regression | 1 | 3701.0064 | 3701.0064 | 261.0091 | 0.0000 |
| Residual | 98 | 1389.6012 | 14.1796 | | |
| Total | 99 | 5090.6076 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% |
| --- | --- | --- | --- | --- | --- |
| Intercept | 9.0847 | 1.3018 | 6.9783 | 0.0000 | 6.5013 |
| Clicks | 0.1656 | 0.0102 | 16.1558 | 0.0000 | 0.1452 |

$\hat{Y} = 9.0847 + 0.1656X$, where X = the number of clicks.
The residual plot reveals no evidence of a pattern in the residuals. There appears to be no violation of the linearity and equal variance assumptions. The regression model to predict revenue: Yˆ 9.0847 0.1656 X , where X = number of clicks. 2.
For 90 clicks, the predicted revenue is Yˆ 9.0847 0.1656(90) 23.98. Copyright ©2024 Pearson Education, Inc.
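The summary figures in the simple linear regression output follow from the ANOVA sums of squares, and the prediction is a direct substitution into the fitted line. The sketch below assumes only the values printed in the output; the helper name `predict_revenue` is illustrative. Because the printed coefficients are rounded to four decimals, the computed prediction (about 23.99) can differ in the last digit from the figure quoted above.

```python
# Values copied from the simple linear regression output.
SSR = 3701.0064      # regression sum of squares
SSE = 1389.6012      # residual sum of squares
N_OBS = 100          # observations
K = 1                # number of independent variables

sst = SSR + SSE
r_square = SSR / sst                      # coefficient of determination
mse = SSE / (N_OBS - K - 1)               # residual mean square
f_stat = (SSR / K) / mse                  # overall F statistic
std_error = mse ** 0.5                    # standard error of the estimate

B0, B1 = 9.0847, 0.1656                   # rounded coefficients from the output


def predict_revenue(clicks):
    """Predicted revenue for a given number of clicks (rounded coefficients)."""
    return B0 + B1 * clicks


print(round(r_square, 4), round(f_stat, 4), round(std_error, 4))
print(round(predict_revenue(90), 4))
```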
3.
There are many factors that might be considered.
4.
The value of 525 for X, the number of clicks, is beyond the range of our X values.
5.
Answers will vary. A visitor's engagement could be measured by, for example, the length of time the visitor spends at the website or by information about purchases from sponsored content.
6.
Answers will vary.
7.
A generalized prediction line for a simple linear model: $\hat{Y} = b_0 + b_1 X$, where X = time spent at website.
Chapter 14
Let Y = revenue, $X_1$ = number of clicks, $X_2$ = 1 if ad appears on website; 0 if ad does not appear on website, and $X_3 = X_1 X_2$.
Regression Analysis

Regression Statistics
Multiple R            0.8546
R Square              0.7304
Adjusted R Square     0.7220
Standard Error        3.7811
Observations             100

ANOVA
              df          SS          MS         F   Significance F
Regression     3   3718.1386   1239.3795   86.6908           0.0000
Residual      96   1372.4690     14.2966
Total         99   5090.6076

                    Coefficients   Standard Error    t Stat   P-value   Lower 95%   Upper 95%
Intercept                10.5607           1.9959    5.2912    0.0000      6.5989     14.5224
Clicks                    0.1566           0.0153   10.2120    0.0000      0.1261      0.1870
Home Page                -2.5013           2.6467   -0.9451    0.3470     -7.7549      2.7523
Clicks*Home Page          0.0154           0.0207    0.7442    0.4586     -0.0257      0.0565
Testing the significance of the interaction: $H_0: \beta_3 = 0$ vs. $H_1: \beta_3 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_3$ is 0.4586 > 0.05, do not reject the null hypothesis. There is insufficient evidence of an interaction between number of clicks and the ad appearing on the website.
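The t test on the interaction term is equivalent to a partial F test that compares the model with the interaction against the model without it. The sketch below checks that equivalence using the regression sums of squares printed in the two Excel outputs of this solution (the with-interaction output above and the without-interaction output reported for the same data); variable names are illustrative.

```python
# Sums of squares copied from the two regression outputs.
SSR_FULL = 3718.1386       # model with X1, X2, and X3 = X1*X2
SSR_REDUCED = 3710.2212    # model with X1 and X2 only
SSE_FULL = 1372.4690       # residual sum of squares, full model
DF_RESIDUAL_FULL = 96

T_STAT_INTERACTION = 0.7442  # t Stat for Clicks*Home Page in the full model

mse_full = SSE_FULL / DF_RESIDUAL_FULL
# Contribution of the interaction term given X1 and X2 (1 numerator df)
ssr_interaction = SSR_FULL - SSR_REDUCED
partial_f = ssr_interaction / mse_full

# With one numerator degree of freedom, partial F equals t squared
# (up to rounding in the printed values).
print(round(partial_f, 4), round(T_STAT_INTERACTION ** 2, 4))
```

Both quantities come out near 0.554, which is far too small to be significant at the 0.05 level and agrees with the p-value of 0.4586.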
The Excel results of the multiple regression without the interaction term are as follows:
Regression Analysis

Regression Statistics
Multiple R            0.8537
R Square              0.7288
Adjusted R Square     0.7232
Standard Error        3.7724
Observations             100

ANOVA
              df          SS          MS          F   Significance F
Regression     2   3710.2212   1855.1106   130.3590           0.0000
Residual      97   1380.3864     14.2308
Total         99   5090.6076

             Coefficients   Standard Error    t Stat   P-value   Lower 95%   Upper 95%
Intercept          9.5096           1.4070    6.7586    0.0000      6.7171     12.3022
Clicks             0.1650           0.0103   16.0367    0.0000      0.1446      0.1854
Home Page         -0.6164           0.7660   -0.8047    0.4230     -2.1368      0.9040
Durbin-Watson Calculations
Sum of Squared Difference of Residuals   3409.766552
Sum of Squared Residuals                 1380.386434
Durbin-Watson Statistic                  2.470153623
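The Durbin-Watson statistic is the ratio of the two sums shown above. As a minimal sketch, the function below computes it from a list of residuals in time order; since the residuals themselves are not printed here, the demonstration list is made up purely for illustration, and the reported statistic is checked from the two printed sums instead.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by sum of squared residuals."""
    squared_diffs = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    squared_resids = sum(e ** 2 for e in residuals)
    return squared_diffs / squared_resids


# Hypothetical residuals (not from the case data), only to exercise the function.
demo_residuals = [1.0, -1.0, 1.0, -1.0]
print(durbin_watson(demo_residuals))

# Reproducing the reported statistic from the printed sums.
SUM_SQ_DIFF = 3409.766552
SUM_SQ_RESID = 1380.386434
dw_reported = SUM_SQ_DIFF / SUM_SQ_RESID
print(round(dw_reported, 6))
```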
Regression Analysis
Coefficients of Partial Determination

Intermediate Calculations
SSR(X1,X2)     3710.221166
SST            5090.6076
SSR(X2)        50.41387258
SSR(X1 | X2)   3659.807293
SSR(X1)        3701.006381
SSR(X2 | X1)   9.214784733

Coefficients
r2 Y1.2        0.72612433
r2 Y2.1        0.006631244
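The intermediate calculations above can be chained together in a few lines. The sketch below assumes the usual two-variable definition, in which the contribution of one variable given the other is divided by SST minus SSR(X1,X2) plus that same contribution; the function name is illustrative.

```python
# Sums of squares copied from the intermediate calculations.
SSR_BOTH = 3710.221166   # SSR(X1, X2)
SST = 5090.6076
SSR_X1 = 3701.006381     # SSR from the model with X1 only
SSR_X2 = 50.41387258     # SSR from the model with X2 only


def partial_determination(ssr_both, ssr_other, sst):
    """Coefficient of partial determination for one variable given the other.

    ssr_other is the SSR of the model that contains only the other variable.
    """
    contribution = ssr_both - ssr_other          # e.g. SSR(X1 | X2)
    return contribution / (sst - ssr_both + contribution)


r2_y1_given_2 = partial_determination(SSR_BOTH, SSR_X2, SST)
r2_y2_given_1 = partial_determination(SSR_BOTH, SSR_X1, SST)
print(round(r2_y1_given_2, 6), round(r2_y2_given_1, 6))
```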
The 5% critical values of the Durbin-Watson statistic are $d_L = 1.65$ and $d_U = 1.69$. The Durbin-Watson test statistic = 2.47 > 1.69. There is no evidence of autocorrelation in the data.
Testing for the overall significance of the multiple regression: $H_0: \beta_1 = \beta_2 = 0$ vs. $H_1$: not all $\beta_j = 0$. Since the p-value of the overall F test statistic is essentially zero, reject the null hypothesis and conclude that there is evidence that revenue depends on the number of clicks and/or whether the ad appears on the website.
Testing for the effect of each individual independent variable on the revenue:
$H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_1$ is essentially zero, reject the null hypothesis and conclude that there is evidence that the number of clicks has a significant effect on the revenue.
$H_0: \beta_2 = 0$ vs. $H_1: \beta_2 \neq 0$. Since the p-value of the t-test statistic for the significance of $X_2$ is 0.4230 > 0.05, do not reject the null hypothesis and conclude that there is insufficient evidence that whether the ad appears on the website has a significant effect on the revenue.
There is no pattern in the residuals versus the number of clicks or whether the ad appears on the website.
The best model to predict the revenue is $\hat{Y} = 9.5096 + 0.1650X_1 - 0.6164X_2$.
72.88% of the variation in the revenue can be explained by variation in the clicks and whether the ad appears on the website. Holding constant whether the ad appears on the website, 72.61% of the variation in revenue can be explained by variation in clicks. Holding constant the number of clicks, 0.66% of the variation in revenue can be explained by variation in whether the ad appears on the website.
Since the regression coefficient for whether the ad appears on the website is negative, holding constant the number of clicks, having the ad on the website is predicted to decrease the revenue by a mean of 0.6164.
Holding constant whether the ad appears on the website, for each increase of one click, the mean revenue is predicted to increase by 0.1650.
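The two slope interpretations can be checked numerically from the fitted equation without the interaction term. This is a small sketch using the rounded coefficients from the output; the function name and the click count of 100 are illustrative choices, not values from the case.

```python
B0, B_CLICKS, B_HOME_PAGE = 9.5096, 0.1650, -0.6164  # rounded coefficients


def predict_revenue(clicks, ad_on_website):
    """Predicted mean revenue; ad_on_website is 1 if the ad appears, else 0."""
    return B0 + B_CLICKS * clicks + B_HOME_PAGE * ad_on_website


# Effect of one more click, holding the dummy variable constant
click_effect = predict_revenue(101, 1) - predict_revenue(100, 1)
# Effect of the ad appearing on the website, holding clicks constant
placement_effect = predict_revenue(100, 1) - predict_revenue(100, 0)
print(round(click_effect, 4), round(placement_effect, 4))
```

The differences recover the two coefficients, which is what "holding constant" means in the interpretation: only one input changes between the two predictions being compared.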
Chapter 15
1.
Before beginning, please note that even a math major who is a super-excellent solutions-preparer cannot execute a solution for this case. One reason is that the book does not discuss things such as VIFs in anything other than an OLS regression. But the concept of multicollinearity does apply. The marketing department clearly seeks a predictive model with a categorical dependent variable, like subscriber, so one is considering a model created by logistic regression analysis. One is given 40 candidate variables. What does one do with them? The case says, “the department would like to use those facts.” One’s question would be, “Why?” Perhaps these facts were selected because they were easily collectible. Establishing relevant facts would be the first thing to do. This invokes the D in DCOSAC. Note that Exhibit 15.1, which slightly oversimplifies things, invokes DCOSAC in step 1. A complete answer would use DCOSAC as a frame, illustrating what a framework does! As established early in the book (did students forget? It’s Chapter 15 and a long semester or two later!), some prior task may need to be redone. For example, while data has already been collected, additional “facts” may need to be collected based on the outcome of the design task. The requirement to “be specific about methodology” is a reference both to problem-solving methodology and to the inferential methods and techniques that Chapters 14 and 15 discuss. A higher-level student would walk through the things Sections 15.1 and 15.2 discuss. A lower-level answer should acknowledge that the “analyze” task is not a simple application of a particular method. Any answer should acknowledge the principle of parsimony in some way.
Chapter 16
1.
For Visitors vs Month time-series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.
Visitors vs Month, MA(3), ES(0.5), ES(0.25) time series, shows the exponential smoothing over the time series.
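The smoothed series named in these plots, MA(3), ES(0.5), and ES(0.25), can be sketched as follows. The sketch assumes a centered three-period moving average and exponential smoothing that starts at the first observed value; the monthly figures in `demo_visitors` are made up for illustration because the case data are not reproduced in this solution.

```python
def moving_average(values, length=3):
    """Centered moving average for an odd length; ends without a full window are None."""
    half = length // 2
    result = [None] * len(values)
    for i in range(half, len(values) - half):
        window = values[i - half : i + half + 1]
        result[i] = sum(window) / length
    return result


def exponential_smoothing(values, weight):
    """E_1 = Y_1; E_i = W * Y_i + (1 - W) * E_(i-1)."""
    smoothed = [values[0]]
    for y in values[1:]:
        smoothed.append(weight * y + (1 - weight) * smoothed[-1])
    return smoothed


# Hypothetical monthly counts (not the case data).
demo_visitors = [600, 480, 510, 450, 470]
print(moving_average(demo_visitors, 3))
print(exponential_smoothing(demo_visitors, 0.5))
print(exponential_smoothing(demo_visitors, 0.25))
```

A smaller weight such as 0.25 reacts more slowly to each new month, so it smooths out more of the irregular component than a weight of 0.5.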
For Page Impressions vs Month time-series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.
Page Impressions vs Month, MA(3), ES(0.5), ES(0.25) time series, shows the exponential smoothing over the time series.
For Bounce Rates vs Month time-series, there is a sharp downward trend (an irregular component) followed by a slight upward trend.
Bounce Rates vs Month, MA(3), ES(0.5), ES(0.25) time series, shows the exponential smoothing over the time series.
2.
It would be better to have more than just 18 months of data. There is not enough data to show trends. With more data, the sharp downward trend in each of the time series could be explained and smoothed away.
3.
Number of Visitors:

Linear model: Visitors vs Coded Month
$r^2 = 0.3308$, adjusted $r^2 = 0.2890$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             767.3918         103.5712    7.4093    0.0000
Coded Month           -29.2487          10.4005   -2.8122    0.0125

Quadratic model: Visitors vs Coded Month, Coded Month Sq
$r^2 = 0.7234$, adjusted $r^2 = 0.6865$
                  Coefficients   Standard Error    t Stat   P-value
Intercept            1080.1404          96.5607   11.1861    0.0000
Coded Month          -146.5294          26.3398   -5.5630    0.0001
Coded Month Sq          6.8989           1.4952    4.6140    0.0003

Exponential model: log(Visitors) vs Coded Month
$r^2 = 0.3376$, adjusted $r^2 = 0.2962$
                  Coefficients   Standard Error    t Stat   P-value
Intercept               2.8305           0.0621   45.5488    0.0000
Coded Month            -0.0178           0.0062   -2.8558    0.0114

Autoregressive model third order: Visitors vs Lag1, Lag2, Lag3 (rows unused: 3)
$r^2 = 0.0552$, adjusted $r^2 = -0.2025$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             410.5708          99.6986    4.1181    0.0017
Lag1                    0.1457           0.2729    0.5338    0.6041
Lag2                   -0.1391           0.2744   -0.5071    0.6221
Lag3                    0.0181           0.1471    0.1231    0.9043

Autoregressive model second order: Visitors vs Lag1, Lag2 (rows unused: 2)
$r^2 = 0.3969$, adjusted $r^2 = 0.3042$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             272.4606          64.9359    4.1958    0.0010
Lag1                    0.3848           0.2598    1.4813    0.1624
Lag2                   -0.0321           0.1423   -0.2256    0.8250

Autoregressive model first order: Visitors vs Lag1 (rows unused: 1)
$r^2 = 0.7965$, adjusted $r^2 = 0.7829$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             210.9523          37.4591    5.6315    0.0000
Lag1                    0.4900           0.0639    7.6625    0.0000

The Autoregressive model first order: Visitors vs Lag1 would be best suited for prediction.
$\hat{Y}_i = 210.9523 + 0.4900\,Y_{i-1}$
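One way to see why the first-order autoregressive model is retained for Visitors is to step down from the third-order model, testing the highest-order lag each time and dropping it when it is not significant. The sketch below applies that rule to the p-values of the highest-order term in each autoregressive output above. The 0.05 level of significance is an assumption here (it matches the level used elsewhere in these solutions), and the function name is illustrative.

```python
# P-value of the highest-order lag in each fitted autoregressive model
# (Lag3 in the third-order model, Lag2 in the second, Lag1 in the first).
highest_lag_p_values = {3: 0.9043, 2: 0.8250, 1: 0.0000}


def select_ar_order(p_values, alpha=0.05):
    """Return the largest order whose highest-order lag is significant, else 0."""
    for order in sorted(p_values, reverse=True):
        if p_values[order] < alpha:
            return order
    return 0


selected_order = select_ar_order(highest_lag_p_values)
print(selected_order)
```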
Page Impressions:

Linear model: Page Impressions vs Coded Month
$r^2 = 0.3192$, adjusted $r^2 = 0.2767$
                  Coefficients   Standard Error    t Stat   P-value
Intercept            1447.9357         213.5708    6.7797    0.0000
Coded Month           -58.7441          21.4466   -2.7391    0.0146

Quadratic model: Page Impressions vs Coded Month, Coded Month Sq
$r^2 = 0.8288$, adjusted $r^2 = 0.8059$
                  Coefficients   Standard Error    t Stat   P-value
Intercept            2176.3860         155.3194   14.0123    0.0000
Coded Month          -331.9129          42.3679   -7.8341    0.0000
Coded Month Sq         16.0688           2.4050    6.6813    0.0000

Exponential model: log(Page Impressions) vs Coded Month
$r^2 = 0.2610$, adjusted $r^2 = 0.2148$
                  Coefficients   Standard Error    t Stat   P-value
Intercept               3.0869           0.0783   39.4423    0.0000
Coded Month            -0.0187           0.0079   -2.3769    0.0303

Autoregressive model third order: Page Impressions vs Lag1, Lag2, Lag3 (rows unused: 3)
$r^2 = 0.2122$, adjusted $r^2 = -0.0027$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             572.8549         132.0586    4.3379    0.0012
Lag1                    0.1102           0.2546    0.4328    0.6735
Lag2                    0.3030           0.2355    1.2869    0.2246
Lag3                   -0.1948           0.1489   -1.3081    0.2175

Autoregressive model second order: Page Impressions vs Lag1, Lag2 (rows unused: 2)
$r^2 = 0.5780$, adjusted $r^2 = 0.5130$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             398.9237         104.6132    3.8133    0.0022
Lag1                    0.3909           0.2531    1.5446    0.1464
Lag2                    0.0502           0.1749    0.2869    0.7787

Autoregressive model first order: Page Impressions vs Lag1 (rows unused: 1)
$r^2 = 0.7977$, adjusted $r^2 = 0.7842$
                  Coefficients   Standard Error    t Stat   P-value
Intercept             269.0196          88.2314    3.0490    0.0081
Lag1                    0.6187           0.0805    7.6899    0.0000

The Quadratic model: Page Impressions vs Coded Month, Coded Month Sq would be best suited for prediction.
$\hat{Y} = 2176.3860 - 331.9129X + 16.0688X^2$, where X = coded month with month 1 = 0
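The choice of the quadratic model for Page Impressions appears consistent with picking the candidate with the highest adjusted $r^2$. The sketch below recomputes adjusted $r^2$ from each reported $r^2$, assuming 18 monthly observations and that a model of autoregressive order p loses p rows, as the "rows unused" notes indicate. The dictionary layout and names are illustrative.

```python
# (r-square, usable observations n, number of independent variables k)
page_impression_models = {
    "Linear": (0.3192, 18, 1),
    "Quadratic": (0.8288, 18, 2),
    "Exponential": (0.2610, 18, 1),
    "AR(3)": (0.2122, 15, 3),
    "AR(2)": (0.5780, 16, 2),
    "AR(1)": (0.7977, 17, 1),
}


def adjusted_r_square(r_square, n, k):
    """Adjusted r-square = 1 - (1 - r^2)(n - 1)/(n - k - 1)."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


adjusted = {
    name: adjusted_r_square(r2, n, k)
    for name, (r2, n, k) in page_impression_models.items()
}
best_model = max(adjusted, key=adjusted.get)
for name, value in adjusted.items():
    print(f"{name:12s} {value:8.4f}")
print("Highest adjusted r-square:", best_model)
```

Because the inputs are rounded to four decimals, the recomputed values can differ from the printed ones in the last digit.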
Bounce Rates:

Linear model: Bounce Rates vs Coded Month
$r^2 = 0.1788$, adjusted $r^2 = 0.1275$
                  Coefficients   Standard Error     t Stat   P-value
Intercept               0.0378           0.0102     3.7129    0.0019
Coded Month            -0.0019           0.0010    -1.8665    0.0804

Quadratic model: Bounce Rates vs Coded Month, Coded Month Sq
$r^2 = 0.3571$, adjusted $r^2 = 0.2714$
                  Coefficients   Standard Error     t Stat   P-value
Intercept               0.0565           0.0131     4.3257    0.0006
Coded Month            -0.0089           0.0036    -2.5040    0.0243
Coded Month Sq          0.0004           0.0002     2.0398    0.0594

Exponential model: log(Bounce Rates) vs Coded Month
$r^2 = 0.1108$, adjusted $r^2 = 0.0552$
                  Coefficients   Standard Error     t Stat   P-value
Intercept              -1.6299           0.1272   -12.8173    0.0000
Coded Month            -0.0180           0.0128    -1.4118    0.1772

Autoregressive model third order: Bounce Rates vs Lag1, Lag2, Lag3 (rows unused: 3)
$r^2 = 0.0478$, adjusted $r^2 = -0.2119$
                  Coefficients   Standard Error     t Stat   P-value
Intercept               0.0180           0.0070     2.5876    0.0252
Lag1                   -0.0925           0.3034    -0.3048    0.7662
Lag2                    0.0146           0.2756     0.0528    0.9588
Lag3                   -0.0514           0.0758    -0.6779    0.5118

Autoregressive model second order: Bounce Rates vs Lag1, Lag2 (rows unused: 2)
$r^2 = 0.0835$, adjusted $r^2 = -0.0575$
                  Coefficients   Standard Error     t Stat   P-value
Intercept               0.0171           0.0040     4.3264    0.0008
Lag1                   -0.0356           0.2566    -0.1387    0.8918
Lag2                   -0.0594           0.0689    -0.8624    0.4041

Autoregressive model first order: Bounce Rates vs Lag1 (rows unused: 1)
$r^2 = 0.2620$, adjusted $r^2 = 0.2128$
                  Coefficients   Standard Error     t Stat   P-value
Intercept               0.0130           0.0020     6.6254    0.0000
Lag1                    0.1390           0.0602     2.3076    0.0357

The Quadratic model: Bounce Rates vs Coded Month, Coded Month Sq would be best suited for prediction.
$\hat{Y} = 0.0565 - 0.0089X + 0.0004X^2$, where X = coded month with month 1 = 0
4.
Forecast of next month’s value:
Visitors: $\hat{Y}_{19} = 210.9523 + 0.4900(Y_{18}) = 210.9523 + 0.4900(514) = 462.8153$
Page Impressions: $\hat{Y} = 2176.3860 - 331.9129X + 16.0688X^2 = 2176.3860 - 331.9129(18) + 16.0688(18)^2 = 1408.2304$
Bounce Rates: $\hat{Y} = 0.0565 - 0.0089X + 0.0004X^2 = 0.0565 - 0.0089(18) + 0.0004(18)^2 = 0.0296 = 2.96\%$
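The month-19 forecasts are plain substitutions into the three selected models, with X = 18 because the coded month starts at 0. The sketch below uses the coefficients as printed, rounded to four decimals, so its results will not match the forecasts quoted above exactly; those were presumably produced from unrounded estimates. The Bounce Rates forecast is the most sensitive, since its squared-term coefficient is printed with a single significant digit. Function names are illustrative.

```python
def forecast_visitors(previous_value):
    """First-order autoregressive forecast from the prior month's visitors."""
    return 210.9523 + 0.4900 * previous_value


def forecast_page_impressions(coded_month):
    """Quadratic trend forecast; coded_month is 0 for month 1."""
    return 2176.3860 - 331.9129 * coded_month + 16.0688 * coded_month ** 2


def forecast_bounce_rate(coded_month):
    """Quadratic trend forecast; coded_month is 0 for month 1."""
    return 0.0565 - 0.0089 * coded_month + 0.0004 * coded_month ** 2


NEXT_CODED_MONTH = 18      # month 19 when month 1 is coded as 0
LAST_VISITORS = 514        # month-18 visitors, as used in the solution

visitors_19 = forecast_visitors(LAST_VISITORS)
impressions_19 = forecast_page_impressions(NEXT_CODED_MONTH)
bounce_19 = forecast_bounce_rate(NEXT_CODED_MONTH)
print(round(visitors_19, 4), round(impressions_19, 4), round(bounce_19, 4))
```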
Chapter 19
1.
[Figure: Xbar-R Chart of Upload Speed. Sample Mean panel: UCL = 1.1401, $\bar{\bar{X}}$ = 1.0003, LCL = 0.8605. Sample Range panel: UCL = 0.5126, $\bar{R}$ = 0.2424, LCL = 0. Horizontal axis: Sample (1 to 25).]
2.
Since there are five observations for each day, you should use the $\bar{X}$ chart in conjunction with the Range chart.
Mean and Range Charts: $\bar{\bar{X}} = 1.0003$, $\bar{R} = 0.2424$
Range Chart: UCL = 0.5126, LCL does not exist, $\bar{R} = 0.2424$. There are no points outside the control limits and no violations of the rules 1 - 5.
$\bar{X}$ Chart: UCL = 1.1401, LCL = 0.8605, $\bar{\bar{X}} = 1.0003$. There are no points outside the control limits and no violations of the rules 1 - 5.
3.
The process is stable, so any attempt to reduce the common cause variation in the upload speed or to improve the upload speed must be undertaken by management by changing the process.
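The control limits quoted above can be approximated from the overall mean, the mean range, and the subgroup size of five. The sketch below assumes the commonly tabulated control chart factors for subgroups of size 5, $d_2 = 2.326$ and $d_3 = 0.864$. These constants are not given in this solution, so treat them as an assumption, and expect the results to agree with the chart only up to rounding.

```python
import math

X_DOUBLE_BAR = 1.0003   # mean of the subgroup means
R_BAR = 0.2424          # mean of the subgroup ranges
SUBGROUP_SIZE = 5

# Assumed control chart factors for subgroups of size 5 (standard tables).
D2 = 2.326
D3 = 0.864

# X-bar chart: center line plus/minus 3 * R-bar / (d2 * sqrt(n))
half_width = 3 * R_BAR / (D2 * math.sqrt(SUBGROUP_SIZE))
ucl_mean = X_DOUBLE_BAR + half_width
lcl_mean = X_DOUBLE_BAR - half_width

# Range chart: R-bar * (1 +/- 3 * d3 / d2); a negative lower limit
# means the lower control limit does not exist, shown here as 0.
ucl_range = R_BAR * (1 + 3 * D3 / D2)
lcl_range = max(0.0, R_BAR * (1 - 3 * D3 / D2))

print(round(ucl_mean, 4), round(lcl_mean, 4))
print(round(ucl_range, 4), lcl_range)
```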